Investigate worker (Celery?) issues #2697
Comments
AFAICT it only affects short-running workers.
This matches the period of time when no tasks were being processed on that particular worker. There was no such message on the other worker yesterday, but the "idle time" was very similar. 1799 seconds is almost exactly 30 minutes; it could be just a coincidence, but it seems a bit suspicious to me (though I suppose it could be the interval at which Celery checks for non-responsive workers).
I checked the images and there were no updates of
I was releasing Packit and had to tag both ogr and specfile in sidetags. Ogr was quick:
Specfile took almost 30 minutes to react:
So we are hitting this on stage as well?
Yes, exactly!
These are the most suspicious messages I found in the logs of the short-running worker that didn't process any task for 30 minutes.
And this is the strange log from the long-running worker not responding to the heartbeat check.
Shouldn't sync_from_downstream be disabled?
Since the previous week's redeployment (6th January), we have been hitting issues with job processing that leave tasks unprocessed for some time and cause delays:
Substantial drift from celery@packit-worker-long-running-0 may mean clocks are out of sync. Current drift is 1799 seconds. [orig: 2025-01-14 14:51:59.656603 recv: 2025-01-14 14:22:00.484181]
consumer: Connection to broker lost. Trying to re-establish the connection...
followed by a restart
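
For reference, the drift in the first message can be re-derived from the two timestamps it prints. This is only a quick sanity check: the variable names are made up for illustration, and it assumes the reported drift is the absolute difference between `orig` and `recv` truncated to whole seconds, which is a guess about how Celery computes it.

```python
from datetime import datetime

# Timestamps copied verbatim from the "Substantial drift" log line.
FMT = "%Y-%m-%d %H:%M:%S.%f"
orig = datetime.strptime("2025-01-14 14:51:59.656603", FMT)
recv = datetime.strptime("2025-01-14 14:22:00.484181", FMT)

# Assumption: the logged drift is the absolute difference in seconds.
drift_seconds = abs((orig - recv).total_seconds())

print(int(drift_seconds))        # whole seconds, as in the log message
print(30 * 60 - drift_seconds)   # how far the drift is from exactly 30 minutes
```

The difference comes out just under 30 minutes, which is consistent with the 1799 seconds in the log and with the roughly 30-minute idle window mentioned in the comments.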