We get the following error when we run our code on Frontier (OLCF). We are not sure where and how the memory access is failing and will be glad if you provide any suggestions to mitigate this.
CFL = 2.828e-08; dt = 1.000e-01; Time = 0.0000000000000e+00
| Nonlinear | F 2-Norm | # Linear | R 2-Norm |
0 3.19e-03
Memory access fault by GPU node-4 (Agent handle: 0xa77bbf0) on address 0xffff00000000. Reason: Unknown.
Aborted
rocgdb report:
#0 0x00007ff2e28d9124 in PHX::MDField<Sacado::Fad::Exp::GeneralFad<Sacado::Fad::Exp::DynamicStorage<double, double> > const, panzer::Cell, panzer::Point, panzer::Dim>::operator()<int, int, int> (this=0x7ff2e28fbdb0 <kokkos_impl_hip_constant_memory_buffer+272>,
indices=<error reading variable: Cannot access memory at address 0x2000000000afc>,
indices=<error reading variable: Cannot access memory at address 0x2000000000afc>,
indices=<error reading variable: Cannot access memory at address 0x2000000000afc>)
at libs/Trilinos-install-16/include/Phalanx_MDField.hpp:461
461 return m_view(indices...);
Thank you,
Kalyan
I suspect the derivative dimension of the Fad object is not being set correctly. An MDField is a lightweight wrapper around a Kokkos::View, so an index outside the view's allocated extents reads invalid memory. You could configure your build to do array bounds checking with:
-D Kokkos_ENABLE_DEBUG_BOUNDS_CHECK=ON
If that doesn't help, try printing the derivative array dimensions of the MDFields in the failing functor.
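To illustrate why a wrong derivative dimension can show up as a memory fault, here is a minimal toy model in plain C++. It is not the Kokkos, Sacado, or Phalanx API: `ToyFadView` and `derivative_dimension_matches` are hypothetical names, and the flat layout (one value plus the derivatives stored contiguously per entry) is an assumption made only for the sketch. The idea is that each (cell, point, dim) entry owns `num_derivs + 1` scalars, so a kernel that assumes more derivatives than were allocated walks past the end of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-in for a rank-3 view of Fad values (NOT the Kokkos/Phalanx API).
// Each (cell, point, dim) entry owns fad_size scalars: 1 value + derivatives.
struct ToyFadView {
    std::size_t cells, points, dims, fad_size;
    std::vector<double> data;

    ToyFadView(std::size_t c, std::size_t p, std::size_t d, std::size_t num_derivs)
        : cells(c), points(p), dims(d), fad_size(num_derivs + 1),
          data(c * p * d * (num_derivs + 1), 0.0) {
        assert(c > 0 && p > 0 && d > 0);  // sanity check on the extents
    }

    // Bounds-checked access to scalar slot k of entry (c, p, d).
    double& at(std::size_t c, std::size_t p, std::size_t d, std::size_t k) {
        if (c >= cells || p >= points || d >= dims || k >= fad_size)
            throw std::out_of_range("ToyFadView index out of range");
        return data[((c * points + p) * dims + d) * fad_size + k];
    }
};

// True if a kernel that expects `expected_derivs` derivatives per entry
// agrees with the size the view was actually allocated with.
bool derivative_dimension_matches(const ToyFadView& v, std::size_t expected_derivs) {
    return v.fad_size == expected_derivs + 1;
}
```

In this model, a view built with 4 derivatives rejects an access to slot 8 with an exception. Without a bounds check the same access would silently read past the allocation, which on a GPU may surface as a memory access fault like the one reported.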