"Failed to run container" errors when cloud host is running Docker 20.10.6 #295

Open
wmorrell opened this issue Apr 29, 2021 · 1 comment

Comments

@wmorrell
Contributor

Primary Jenkins node is running:
Jenkins 2.277.3 (current LTS)
YAD 0.2.0
Clouds configured to launch containers on another host, and to connect to each container with Docker SSH Computer Launcher

Cloud hosts are running:
Docker 20.10.6

Provisioning works fine with prior Docker versions, up to and including 20.10.5. As soon as the cloud host is patched to 20.10.6, provisioning attempts fail. There is a work-around: re-configure the cloud's container launch settings, editing Create Container Settings to include 0.0.0.0::22 under "Port bindings".

When this error triggers, provisioning attempts will start, then repeatedly fail. They will show the following error in the Cloud Statistics listing:

java.lang.IllegalStateException: Failed to run container.
	at com.github.kostyasha.yad.DockerCloud.provisionWithWait(DockerCloud.java:257)
	at com.github.kostyasha.yad.DockerCloud.lambda$provision$0(DockerCloud.java:135)
	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

In the Jenkins logs, there's also this:

Apr 23, 2021 5:06:21 AM FINE com.github.kostyasha.yad.launcher.DockerComputerSSHLauncher waitUp
TCP connection attempt failed 60 retries with 2 second interval for [::]:49163

What I think is happening, per moby/moby#42313, is that Docker now returns both an IPv4 and an IPv6 binding when inspecting the port bindings on a container. In Docker 20.10.5 and earlier, only the IPv4 binding was listed. When YAD inspects the ports to build the SSH URL, it ends up grabbing the IPv6 address (the [::]:49163 in the log above). If IPv6 is not configured on the host, or IPv6 traffic is otherwise blocked, DockerComputerSSHLauncher keeps retrying until it hits the connection timeout, and the provisioning attempt fails. The work-around listed above works by forcing the container to publish the port on an IPv4 address only, so YAD never sees the IPv6 address.
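
To make the suspected failure mode concrete, here is a minimal Java sketch of the kind of selection logic that would avoid it: given every host binding reported for the container's SSH port, prefer an IPv4 entry over an IPv6 one. This is not YAD's actual code. `HostBinding`, `BindingChooser`, and `preferIpv4` are hypothetical names invented for illustration, and the IPv4 port value in the usage note is made up (only `[::]:49163` appears in the log above).

```java
import java.util.List;

/** Hypothetical stand-in for one host-side binding of a published container port. */
final class HostBinding {
    final String hostIp;
    final int hostPort;

    HostBinding(String hostIp, int hostPort) {
        this.hostIp = hostIp;
        this.hostPort = hostPort;
    }
}

/** Hypothetical helper; an assumption about one possible fix, not the plugin's API. */
final class BindingChooser {
    private BindingChooser() {}

    /** Treats any host IP containing ':' as IPv6, for example "::". */
    static boolean isIpv6(String hostIp) {
        return hostIp != null && hostIp.indexOf(':') >= 0;
    }

    /**
     * Returns the first non-IPv6 binding. If every binding is IPv6, falls back
     * to the first entry. Returns null when the list is empty.
     */
    static HostBinding preferIpv4(List<HostBinding> bindings) {
        for (HostBinding binding : bindings) {
            if (!isIpv6(binding.hostIp)) {
                return binding;
            }
        }
        return bindings.isEmpty() ? null : bindings.get(0);
    }
}
```

With a list holding `new HostBinding("::", 49163)` followed by `new HostBinding("0.0.0.0", 49163)`, `preferIpv4` returns the `0.0.0.0` entry regardless of the order in which the two bindings are reported. The fallback keeps IPv6-only hosts working instead of failing outright.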

@tghastings

Thank you for this. I spent a few hours trying to troubleshoot.
