Problem: epoll crashes for some Windows users #4730

Closed
minrk opened this issue Aug 20, 2024 · 2 comments · Fixed by #4734

minrk commented Aug 20, 2024

Issue description

This was originally reported in pyzmq. After updating to libzmq 4.3.5, it appears that the patch in #4422 did not fix the problem; it only moved the error to a different location.

Environment

  • libzmq version (commit hash if unreleased): 4.3.5
  • OS: Windows 11 and 10
  • Python 3.12.3 via Microsoft Store (though others may be relevant)

There seems to be some interaction with VPNs, firewalls, or something similar that is not yet understood, which makes this very hard to reproduce. For affected users it appears to happen most of the time, but I've never been able to see it myself.
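To compare affected and unaffected machines, the environment details listed above can be gathered with a short script. This is only a sketch: `collect_environment` is a hypothetical helper name, and it assumes pyzmq may or may not be importable in the interpreter being checked. It reports whether libzmq was built with ipc support, but it does not report which poller (epoll or otherwise) was selected at build time.

```python
import platform
import sys


def collect_environment():
    """Return version details useful for comparing affected and unaffected machines."""
    info = {
        "python": sys.version.split()[0],
        "executable": sys.executable,
        "os": platform.platform(),
    }
    try:
        import zmq
    except ImportError:
        # pyzmq is not installed in this interpreter; record that instead of failing.
        info["pyzmq"] = None
        return info
    info["pyzmq"] = zmq.pyzmq_version()
    info["libzmq"] = zmq.zmq_version()
    # zmq.has() reports whether libzmq was built with a given capability.
    info["ipc"] = zmq.has("ipc")
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

The interpreter path is included because the report involves a Microsoft Store install of Python, and the install location may turn out to matter.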

Minimal test code / Steps to reproduce the issue

Python:

import zmq
ctx = zmq.Context()
with ctx:
    with ctx.socket(zmq.PUSH) as s:
        s.bind("tcp://127.0.0.1:5555")

What's the actual result? (include assertion message & call stack if applicable)

If built with cmake defaults (epoll, ipc enabled), this crashes with:

Bad file descriptor (C:\Users\runneradmin\AppData\Local\Temp\tmppy2n81h4\build_deps\bundled_libzmq-src\src\epoll.cpp:73)

epoll.cpp:73
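Because the failure takes down the whole interpreter, it can be easier to collect reports by running the reproduction in a child process and capturing its exit code and stderr. The following is a minimal sketch, not part of the original report: `run_in_subprocess` and `REPRO` are hypothetical names, and the 60-second timeout is an arbitrary choice.

```python
import subprocess
import sys

# The reproduction from this issue, run in a separate interpreter.
REPRO = """
import zmq
ctx = zmq.Context()
with ctx:
    with ctx.socket(zmq.PUSH) as s:
        s.bind("tcp://127.0.0.1:5555")
"""


def run_in_subprocess(code, timeout=60):
    """Run `code` in a fresh interpreter and return (exit code, stderr text)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        # Paths in the error message may not decode cleanly; do not fail on them.
        errors="replace",
        timeout=timeout,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    returncode, stderr = run_in_subprocess(REPRO)
    print("exit code:", returncode)
    print(stderr.strip() or "(no stderr output)")
```

A non-zero exit code together with a `Bad file descriptor (...epoll.cpp:73)` line in stderr would match the crash described here; an exit code of 0 means the bind and teardown completed normally.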

What's the expected result?

It doesn't crash.

minrk commented Aug 21, 2024

I managed to reproduce this on the same system in the same env by adding a user with the username 日本語. So something in wepoll or libzmq (or Windows itself) is doing something weird that's sensitive to the username and/or home directory.
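For anyone triaging reports, a quick check like the one below may help tell whether a machine falls into this category. It is a sketch under assumptions: `has_non_ascii` and `user_path_report` are hypothetical helpers, and including the temp directory is a guess on my part, since only the username and home directory are implicated above.

```python
import getpass
import os
import tempfile


def has_non_ascii(text):
    """Return True if `text` contains any character outside the ASCII range."""
    return not text.isascii()


def user_path_report():
    """Map a few user-derived values to (value, contains_non_ascii)."""
    try:
        username = getpass.getuser()
    except Exception:
        # Some environments have no resolvable user name; treat it as empty.
        username = ""
    values = {
        "username": username,
        "home": os.path.expanduser("~"),
        # Assumption: the temp directory is included only because it usually
        # lives under the user profile on Windows.
        "tempdir": tempfile.gettempdir(),
    }
    return {name: (value, has_non_ascii(value)) for name, value in values.items()}


if __name__ == "__main__":
    for name, (value, flagged) in user_path_report().items():
        print(f"{name}: {value!r} non-ASCII={flagged}")
```

A `True` flag does not prove the crash will occur; it only marks the machine as similar to the 日本語 account used for the reproduction.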

minrk commented Aug 25, 2024

#4732 fixes one cause of this error, but not all of them. The error is indeed seen when _wmkdir fails, and #4732 fixes that case. The more mysterious case seen in pyzmq, however, appears to occur even after a successful bind of the ipc socket, which I can't explain.
