-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adding preliminary AMDGPU support #99
Conversation
Thanks for doing this. I think this is the right idea and from a quick look the implementation seems good. Let me know when it is ready for review. |
In principle, the PR runs now, but I am getting some scalar indexing in the tests somewhere that I need to fix. I've been noticing scalar indexing in other Molly runs for a while now, but haven't been able to pin down where they are happening or if the issue is amd-specific, so I'll keep playing around here and try to find the problem. |
It seems to be here with the |
I have noticed scalar indexing occasionally when using Molly too, though it is disallowed in the current tests so I think that case is AMD-specific. I am not against allowing scalar indexing in specified places if performance isn't too bad. The |
Since the code runs, but tests are not quite passing, I think it's best to keep this PR open and refactor it into a KernelAbstractions PR on top of the |
Sounds good. |
Closing as it is outdated and #147 supersedes it. |
I am still running the tests, so this PR is still a draft and might change in the next few days, but it adds preliminary support for AMDGPU devices.
There were a few small things to note:
CuArray
to use an Array TypeAT
value instead. There were a few places (such as inzygote.jl
with internal functions that were only used for the GPU. In these cases,AT
became a function argument. In most cases,AT
could be created without any change to the function calls, themselves.interactions/implicit_solver.jl
), I added a new conditional withROCArray
instead ofCuArray
System(...)
call was modified to try and figure out what device you wanted to use if you setgpu
to true, but will default toCUDA
if both GPUs are present and things are left unspecifiedrun_cuda_tests
andrun_rocm_tests
. It will then populate the arraygpu_array_types
withCuArray
,ROCArray
, or both. I tried to modify all the tests to use the array types from this array.Again, still messing around with things now, but thought I would put the draft PR up in the case that there are design decisions to talk about.