Feature detection is not thread-safe #560
Comments
Sorry for the late reply and thank you very much for the detailed explanation of the issue, providing a reproducer as well as suggesting a fix and PR. Will look into your PR shortly!
On S/390 I ran into a problem with the way the VXE2 hardware feature is currently detected in multi-threaded code. Detection installs a SIGILL signal handler and uses sigsetjmp/siglongjmp to find out whether a certain instruction is available. Although I have only looked into this for S/390, it should apply to other platforms using the same mechanism.
The problem is triggered by the way PyTorch is using Sleef:
pytorch/pytorch#128503
But it can also be reproduced with a small example like this:
Running the test like this:
gcc -DNUM_THREADS=4 t.c -O3 -mzvector -march=z15 -lsleef -fopenmp -lgomp -o t && ./t
results in either wrong results or crashes,
while the single-threaded version works fine:
gcc -DNUM_THREADS=1 t.c -O3 -mzvector -march=z15 -lsleef -fopenmp -lgomp -o t && ./t
The cpuSupportsExt function uses the file-scope variable sigjmp to store the saved execution context, which makes the function thread-unsafe: every thread that runs the detection writes to and jumps through the same buffer.
I will send a PR to check HWCAPs instead of using the signal handler. This fixes the problem for S/390. I think other platforms might need similar adjustments.
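A HWCAP-based check could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the PR: hwcap_has and cpu_has_vxe2 are hypothetical names, getauxval is the glibc interface on Linux, and the fallback value for HWCAP_S390_VXRS_EXT2 is my assumption of the s390 bit and should be checked against the system headers.

```c
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (glibc on Linux) */

/* Assumption: bit value of the s390 "vector enhancements 2" capability.
 * On s390x the system headers normally provide this macro already. */
#ifndef HWCAP_S390_VXRS_EXT2
#define HWCAP_S390_VXRS_EXT2 32768UL
#endif

/* Pure helper: is the given capability bit set in the hwcap word? */
int hwcap_has(unsigned long hwcap, unsigned long bit) {
    return (hwcap & bit) != 0;
}

/* Hypothetical replacement for the signal-based probe. It only reads the
 * auxiliary vector, so it touches no shared mutable state and installs no
 * signal handler. */
int cpu_has_vxe2(void) {
#if defined(__s390x__)
    return hwcap_has(getauxval(AT_HWCAP), HWCAP_S390_VXRS_EXT2);
#else
    return 0;   /* the bit is only meaningful on s390x */
#endif
}
```

Because the kernel fills in the auxiliary vector before the program starts, the result is the same no matter which thread asks or how many ask at once.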