Req: make wait_socket_working more configurable instead of hard coding retry times
#263
Comments
Does the shim need to handle the deletion of the shim socket address (e.g. |
Yeah, it looks like it just exhausts all retries (which takes 1 second) and then removes the socket. IMO we can remove it. Pinging @abel-von and @Burning1020. Any chance you have more context here? |
I think this is for support of multi containers in one pod, containerd will call |
@abel-von that makes sense, but I think the retry timeout is unusually long - a whole second! What are the reasons for trying 200 times for 5ms each? Can we reduce the number of retries? |
@Mossaka maybe we can make it a configurable value, but I'd like to know why the connection is not established every time. |
I like that, and changed this issue's title to reflect the request.
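For illustration, here is a minimal sketch of what a configurable retry policy could look like. It assumes plain filesystem socket paths and uses only `std`; `RetryConfig` and `wait_socket_working_with` are hypothetical names, not the shim crate's API. The default simply mirrors the hard-coded `5` ms interval and `200` attempts discussed in this issue.

```rust
use std::io;
use std::os::unix::net::UnixStream;
use std::path::Path;
use std::thread;
use std::time::Duration;

/// Hypothetical retry settings; these names are not part of the shim crate.
#[derive(Debug, Clone, Copy)]
pub struct RetryConfig {
    /// Delay after each failed connection attempt.
    pub interval: Duration,
    /// Maximum number of connection attempts.
    pub attempts: u32,
}

impl Default for RetryConfig {
    /// Mirrors the hard-coded call in this issue: 200 attempts, 5 ms apart.
    fn default() -> Self {
        Self {
            interval: Duration::from_millis(5),
            attempts: 200,
        }
    }
}

impl RetryConfig {
    /// Upper bound on the time spent sleeping if every attempt fails.
    pub fn max_wait(&self) -> Duration {
        self.interval * self.attempts
    }
}

/// Tries to connect to the Unix socket at `path`, sleeping `cfg.interval`
/// after each failure. Returns the last connection error once all attempts
/// are used up (assumption: a plain path, not an abstract socket address).
pub fn wait_socket_working_with(path: &Path, cfg: RetryConfig) -> io::Result<()> {
    let mut last_err = io::Error::new(
        io::ErrorKind::TimedOut,
        "no connection attempt was made",
    );
    for _ in 0..cfg.attempts {
        match UnixStream::connect(path) {
            Ok(_) => return Ok(()),
            Err(e) => {
                last_err = e;
                thread::sleep(cfg.interval);
            }
        }
    }
    Err(last_err)
}
```

With something like this, a caller that cares about startup latency could pass, say, 10 attempts at 5 ms and cap the wait at 50 ms, while the default keeps today's one-second behaviour.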
There is a bug in runwasi that prevents it from deleting the socket address after the ttrpc server closes, so it hangs when connecting to the socket address. |
Title changed from "shim::spawn takes more than one second to finish" to "Req: make wait_socket_working more configurable instead of hard coding retry times"
I was looking into the slow startup time of runwasi shims, and by inspecting the traces I found that the `wait_socket_working(&address, 5, 200)` call always takes a full second to finish when the address exists but, for some reason, a connection to the ttrpc client cannot be established correctly.
Code ref: https://github.com/containerd/rust-extensions/blob/main/crates/shim/src/synchronous/mod.rs#L471-L476
I would appreciate it if you could clarify the motivation behind the call to `wait_socket_working`. Why couldn't we proceed to remove the socket if it's already in use and then call `start_listener` immediately? Also, what's the motivation behind the one-second wait time, which has a significant impact on the startup time of the shims?
Here is modified code that doesn't wait for sockets: