MNIST single GPU example: GradScaler AssertionError #168
Comments
Can you share the detailed reproduction steps? It seems PyTorch 2.2 needs a higher version of NCCL, and currently we only support PyTorch 2.1 and 1.4.
This one works: #178 (comment)
I ran into the same problem. My torch version is 2.4.0 with CUDA 12.1:
The |
Hi @yatorho, PyTorch added a new assertion that checks whether each param is a torch.Tensor, but ScalingTensor in MS-AMP is not a torch.Tensor. A temporary solution is to comment out Line 256 in
Thanks! It works for me.
What's the issue, what's expected?:
`python mnist.py --enable-msamp --opt-level=O2` should work with the versions pinned in `pyproject.toml`. Specifically, it should work with `torch==2.2.1`, given that torch is unpinned.
How to reproduce it?:
Build MS-AMP with `torch==2.2.1`.
Log message or snapshot?:
Additional information:
This occurs because `optimizer.param_groups[:,'params']` contains `ScalingParameter`s. `ScalingParameter`s subclass `ScalingTensor`, which subclasses nothing, so the `isinstance` check fails.
Commenting out the assertion line manually fixes the issue. I do not know how to reasonably fix this without resorting to that.
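The failure mode described above can be reproduced in isolation. The following is a minimal sketch that uses stand-in classes only: `Tensor`, `ScalingTensor`, and `ScalingParameter` here are hypothetical placeholders for the real torch and MS-AMP types, and `check_params` is an assumed approximation of the assertion, not the actual PyTorch code.

```python
# Stand-in classes only; none of these are the real torch or MS-AMP types.
class Tensor:
    """Placeholder for torch.Tensor."""


class ScalingTensor:
    """Placeholder for MS-AMP's ScalingTensor: note it has no Tensor base."""


class ScalingParameter(ScalingTensor):
    """Placeholder for MS-AMP's ScalingParameter."""


def check_params(param_groups):
    """Assumed shape of the check: every param must be a Tensor instance."""
    for group in param_groups:
        for param in group["params"]:
            # Report the offending type name in the assertion message.
            assert isinstance(param, Tensor), type(param).__name__


# A group containing only Tensor instances passes the check.
check_params([{"params": [Tensor(), Tensor()]}])

# A group containing a ScalingParameter trips the assertion, because
# ScalingParameter -> ScalingTensor -> object never reaches Tensor.
try:
    check_params([{"params": [Tensor(), ScalingParameter()]}])
    failed_on = None
except AssertionError as exc:
    failed_on = str(exc)

print(failed_on)
```

Under these assumptions the check fails on the `ScalingParameter` entry only because of the class hierarchy, which is why commenting out the assertion makes the error go away without changing any other behavior.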