Using cu_mem_get_info gives results that are not very meaningful.
For example:
┌─────────────────────────────────────────────────────────────────────────────────────────┬───────────┬───────────────┬───────┬─────────┬───────────────────────────────────────────────────┐
│Benchmarks                                                                               │Time in sec│Memory in bytes│Speedup│Mem gain │init time in sec, min loss, last loss              │
├─────────────────────────────────────────────────────────────────────────────────────────┼───────────┼───────────────┼───────┼─────────┼───────────────────────────────────────────────────┤
│seed 7, inline 0, parallel 1, batch 240, backend cc, val prec single, grad prec single   │0.229846796│187036         │5.306  │18431.763│(0.602457722 62.728876709938049 62.728876709938049)│
│seed 7, inline 0, parallel 1, batch 240, backend cc, val prec half, grad prec half       │0.681410625│93522          │1.790  │36861.950│(0.830183092 62.6259765625 62.6259765625)          │
│seed 7, inline 0, parallel 1, batch 240, backend cuda, val prec single, grad prec single │0.796467672│3447403316     │1.531  │1.000    │(3.596758795 62.728905558586121 62.728905558586121)│
│seed 7, inline 0, parallel 1, batch 240, backend cuda, val prec half, grad prec half     │1.219598776│2061500416     │1.000  │1.672    │(3.974976031 62.93798828125 62.93798828125)        │
│seed 7, inline 3, parallel 1, batch 240, backend cc, val prec single, grad prec single   │0.251448531│187036         │4.850  │18431.763│(0.511715823 62.7288755774498 62.7288755774498)    │
│seed 7, inline 3, parallel 1, batch 240, backend cc, val prec half, grad prec half       │0.63360842 │93522          │1.925  │36861.950│(0.585796587 62.30078125 62.30078125)              │
│seed 7, inline 3, parallel 1, batch 240, backend cuda, val prec single, grad prec single │0.657724256│2210398208     │1.854  │1.560    │(0.996566334 62.728905558586121 62.728905558586121)│
│seed 7, inline 3, parallel 1, batch 240, backend cuda, val prec half, grad prec half     │0.779391164│1088421888     │1.565  │3.167    │(1.305761225 62.2236328125 62.2236328125)          │
│seed 7, inline 0, parallel 3, batch 240, backend cc, val prec single, grad prec single   │0.245330525│571884         │4.971  │6028.151 │(0.808980378 62.153002977371216 62.153002977371216)│
│seed 7, inline 0, parallel 3, batch 240, backend cc, val prec half, grad prec half       │0.459211186│285954         │2.656  │12055.797│(1.063122458 62.41552734375 62.41552734375)        │
│seed 7, inline 0, parallel 3, batch 240, backend cuda, val prec single, grad prec single │0.524303261│1352663040     │2.326  │2.549    │(3.233237763 63.376171588897705 63.376171588897705)│
│seed 7, inline 0, parallel 3, batch 240, backend cuda, val prec half, grad prec half     │0.750559389│612368384      │1.625  │5.630    │(5.178235428 62.83740234375 62.83740234375)        │
│seed 7, inline 3, parallel 3, batch 240, backend cc, val prec single, grad prec single   │0.246047198│571884         │4.957  │6028.151 │(0.72678405 62.152995347976685 62.152995347976685) │
│seed 7, inline 3, parallel 3, batch 240, backend cc, val prec half, grad prec half       │0.446806293│285954         │2.730  │12055.797│(0.838345553 62.47265625 62.47265625)              │
│seed 7, inline 3, parallel 3, batch 240, backend cuda, val prec single, grad prec single │0.558565954│715128832      │2.183  │4.821    │(1.419007865 63.376166462898254 63.376166462898254)│
│seed 7, inline 3, parallel 3, batch 240, backend cuda, val prec half, grad prec half     │0.662616926│341835776      │1.841  │10.085   │(2.182560358 62.17529296875 62.17529296875)        │
│seed 7, inline 0, parallel 6, batch 240, backend cc, val prec single, grad prec single   │0.324366117│1176096        │3.760  │2931.226 │(1.099047585 61.730027139186859 61.730027139186859)│
│seed 7, inline 0, parallel 6, batch 240, backend cc, val prec half, grad prec half       │0.537282895│588072         │2.270  │5862.213 │(1.315531069 62.76953125 62.76953125)              │
│seed 7, inline 0, parallel 6, batch 240, backend cuda, val prec single, grad prec single │0.557164894│580911104      │2.189  │5.934    │(2.652769076 63.376184284687042 63.376184284687042)│
│seed 7, inline 0, parallel 6, batch 240, backend cuda, val prec half, grad prec half     │0.659206927│297795584      │1.850  │11.576   │(4.897720286 62.7421875 62.7421875)                │
│seed 7, inline 3, parallel 6, batch 240, backend cc, val prec single, grad prec single   │0.327492657│1176096        │3.724  │2931.226 │(0.945304816 61.718904912471771 61.718904912471771)│
│seed 7, inline 3, parallel 6, batch 240, backend cc, val prec half, grad prec half       │0.496853717│588072         │2.455  │5862.213 │(1.055382175 60.982421875 60.982421875)            │
│seed 7, inline 3, parallel 6, batch 240, backend cuda, val prec single, grad prec single │0.484854294│337641472      │2.515  │10.210   │(1.661079693 63.376177906990051 63.376177906990051)│
│seed 7, inline 3, parallel 6, batch 240, backend cuda, val prec half, grad prec half     │0.637598667│153092096      │1.913  │22.518   │(2.544604816 62.099609375 62.099609375)            │
│seed 7, inline 0, parallel 12, batch 240, backend cc, val prec single, grad prec single  │0.374894618│2481504        │3.253  │1389.239 │(1.55095354 61.862113118171692 61.862113118171692) │
│seed 7, inline 0, parallel 12, batch 240, backend cc, val prec half, grad prec half      │0.565150972│1240800        │2.158  │2778.371 │(1.795796058 62.04931640625 62.04931640625)        │
│seed 7, inline 0, parallel 12, batch 240, backend cuda, val prec single, grad prec single│0.579294911│276824064      │2.105  │12.453   │(2.876144217 63.376185953617096 63.376185953617096)│
│seed 7, inline 0, parallel 12, batch 240, backend cuda, val prec half, grad prec half    │0.697255179│153092096      │1.749  │22.518   │(4.924106615 62.80078125 62.80078125)              │
│seed 7, inline 3, parallel 12, batch 240, backend cc, val prec single, grad prec single  │0.363463621│2481504        │3.355  │1389.239 │(1.313944785 61.862080454826355 61.862080454826355)│
│seed 7, inline 3, parallel 12, batch 240, backend cc, val prec half, grad prec half      │0.562140134│1240800        │2.170  │2778.371 │(1.499167458 61.90234375 61.90234375)              │
│seed 7, inline 3, parallel 12, batch 240, backend cuda, val prec single, grad prec single│0.596052431│180355072      │2.046  │19.115   │(2.841286029 63.376178562641144 63.376178562641144)│
│seed 7, inline 3, parallel 12, batch 240, backend cuda, val prec half, grad prec half    │0.663990027│67108864       │1.837  │51.370   │(4.696311311 61.94580078125 61.94580078125)        │
│seed 7, inline 0, parallel 16, batch 240, backend cc, val prec single, grad prec single  │0.474191279│3423616        │2.572  │1006.948 │(1.769277872 61.757832944393158 61.757832944393158)│
│seed 7, inline 0, parallel 16, batch 240, backend cc, val prec half, grad prec half      │0.577466576│1711872        │2.112  │2013.821 │(2.305998903 61.90576171875 61.90576171875)        │
│seed 7, inline 0, parallel 16, batch 240, backend cuda, val prec single, grad prec single│0.620073868│186646528      │1.967  │18.470   │(2.423241004 63.376178324222565 63.376178324222565)│
│seed 7, inline 0, parallel 16, batch 240, backend cuda, val prec half, grad prec half    │0.764845615│88080384       │1.595  │39.139   │(4.584916593 62.71826171875 62.71826171875)        │
│seed 7, inline 3, parallel 16, batch 240, backend cc, val prec single, grad prec single  │0.412470958│3423616        │2.957  │1006.948 │(1.665891434 61.757833182811737 61.757833182811737)│
│seed 7, inline 3, parallel 16, batch 240, backend cc, val prec half, grad prec half      │0.59185156 │1711872        │2.061  │2013.821 │(1.89430643 61.8232421875 61.8232421875)           │
│seed 7, inline 3, parallel 16, batch 240, backend cuda, val prec single, grad prec single│0.617704358│109051904      │1.974  │31.613   │(2.188090388 63.376178324222565 63.376178324222565)│
│seed 7, inline 3, parallel 16, batch 240, backend cuda, val prec half, grad prec half    │0.741150168│41943040       │1.646  │82.193   │(3.408914471 61.9169921875 61.9169921875)          │
└─────────────────────────────────────────────────────────────────────────────────────────┴───────────┴───────────────┴───────┴─────────┴───────────────────────────────────────────────────┘
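To make the problem concrete: the Speedup and Mem gain columns appear to be ratios against the worst row (slowest time, largest memory figure). That is inferred from the numbers in the table, not stated anywhere in this issue. A minimal Python sketch under that assumption, using a hypothetical helper `relative_columns` and a few rows copied from the table:

```python
# (label, time in sec, memory in bytes) for a few rows copied from the table.
rows = [
    ("cc, single, inline 0, parallel 1", 0.229846796, 187036),
    ("cuda, single, inline 0, parallel 1", 0.796467672, 3447403316),
    ("cuda, half, inline 0, parallel 1", 1.219598776, 2061500416),
    ("cuda, half, inline 3, parallel 16", 0.741150168, 41943040),
]


def relative_columns(rows):
    """Hypothetical helper: return (label, speedup, mem_gain) per row.

    Assumption (inferred from the table, not confirmed by the issue):
    speedup  = slowest time / this row's time
    mem_gain = largest memory figure / this row's memory figure
    """
    worst_time = max(time for _, time, _ in rows)
    worst_mem = max(mem for _, _, mem in rows)
    return [(label, worst_time / time, worst_mem / mem)
            for label, time, mem in rows]


if __name__ == "__main__":
    for label, speedup, gain in relative_columns(rows):
        print(f"{label:36s} speedup {speedup:7.3f}  mem gain {gain:10.3f}")
```

Under this reading, every Mem gain value is measured against a single CUDA figure of roughly 3.4 GB, so the cc rows show gains in the thousands. The CUDA driver's `cuMemGetInfo` reports free and total memory for the device rather than what one program allocated, so the column may say more about how each backend's memory is measured than about actual savings.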