SSHLauncher tries to launch Windows agent for more than 13 hours #359
Comments
Thanks for the report! I am not sure if this is a problem of
@olivergondza thanks for getting back to me so quickly
No, the node is N/A (highlighted in red) and the job run is cancelled after a timeout, but the node is never terminated. I have a reason why I believe it's an Also, I'd like to stress that this happens quite "randomly" (meaning I can't relate it to a cause): sometimes 20 nodes are provisioned correctly in a row, sometimes a couple of these failures occur in a row. On average, the "launch" phase is done within 14 seconds.
I can't disclose details, but we have quite a volume of Linux-based runs and I can't see it there.
I'll try that next time I see the problem and get back to you here. In short: ATM no, but I'll be watching it.
@olivergondza, @pematous and I have encountered this issue recently. We decided to work around it by not running the Jenkins agent on the Windows machines. Instead, we spawn Windows machines using the openstackMachine step, which works reliably, and then connect to the machine by its IP using SSH/Ansible, as needed. It is less convenient than having the Jenkins agent available, but it works!
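The "connect to the machine by its IP" part of this workaround usually needs a wait until the freshly spawned machine accepts SSH connections. A minimal sketch of such a wait follows; it is not taken from the plugin or from the commenters' setup, and the function name `wait_for_ssh`, the port, and the timeout values are assumptions chosen for illustration.

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=300.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that the port accepts connections,
    not that an SSH login would succeed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each probe is bounded by `interval` so a silent host cannot block forever.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # refused, unreachable, or timed out: try again until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Giving the wait an overall deadline means an unreachable machine is reported as failed after a bounded time instead of hanging for hours.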
Jenkins and plugins versions report
Environment
What Operating System are you using (both controller, and any agents involved in the problem)?
agent - Windows Server 2019 (CYGWIN_NT-10.0-17763)
Reproduction steps
Provision Windows nodes again and again until you hit the problem.
Expected Results
One of:
SSHLauncher should automatically re-launch
JCloudsLauncher on Windows (re-try mechanism)
Actual Results
Sometimes ("randomly") provisioning fails and the node is stuck in the "launch" state. In cloud stats I can see 13 hr and counting (I then terminated the node manually); however, there is no issue with the agent machine itself. I can access the node from my computer with ssh, and there are no logs in /remoting (the directory does not exist, but remoting.jar is there). Using the "relaunch agent" button does not help.
Here is the edited log:
Anything else?
No response
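The re-try mechanism requested under Expected Results could be sketched as a bounded retry loop around a single launch attempt. This is only an illustration of the idea, not how SSHLauncher or JCloudsLauncher are implemented; `launch_with_retry`, its parameters, and the `launch` callable are hypothetical names.

```python
import time


def launch_with_retry(launch, attempts=3, attempt_timeout=60.0, backoff=5.0,
                      sleep=time.sleep):
    """Call launch(attempt_timeout) until it succeeds or attempts run out.

    `launch` is a hypothetical callable that raises on failure; it stands in
    for one agent launch attempt and is not a Jenkins API.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return launch(attempt_timeout)
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                # Linear backoff: wait longer after each failed attempt.
                sleep(backoff * attempt)
    # Give up explicitly so the caller can terminate the node instead of waiting forever.
    raise RuntimeError(f"launch failed after {attempts} attempts") from last_error
```

The loop always ends, either with a launched agent or with an error, so a caller can clean up the node rather than leave it in the "launch" state indefinitely.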