[WIP/POC] Degraded NodePool Status Condition #1880
base: main
Conversation
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: rschalo. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
func (c *Controller) Reconcile(ctx context.Context, nodePool *v1.NodePool) (reconcile.Result, error) {
ctx = injection.WithControllerName(ctx, "nodepool.degraded")
stored := nodePool.DeepCopy()
if nodePool.Status.FailedLaunches >= 3 {
Consider combining these two statements: do the check for 0 and the check for greater than or equal to 3 together, then set the status condition and patch in the same call.
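A minimal sketch of what that combined check might look like, assuming the controller fields shown in the hunk above and the StatusConditions() helpers already used elsewhere in this PR (imports omitted; the reason/message strings are placeholders, not final wording):

```go
func (c *Controller) Reconcile(ctx context.Context, nodePool *v1.NodePool) (reconcile.Result, error) {
	ctx = injection.WithControllerName(ctx, "nodepool.degraded")
	stored := nodePool.DeepCopy()

	// Evaluate both thresholds in one place and set the condition accordingly.
	if nodePool.Status.FailedLaunches >= 3 {
		nodePool.StatusConditions().SetTrue(v1.ConditionTypeDegraded)
	} else {
		nodePool.StatusConditions().SetFalse(v1.ConditionTypeDegraded, "LaunchesSucceeding", "fewer than 3 consecutive failed launches")
	}

	// Patch once at the end, and only if something actually changed.
	if !equality.Semantic.DeepEqual(stored, nodePool) {
		if err := c.kubeClient.Status().Patch(ctx, nodePool, client.MergeFrom(stored)); err != nil {
			return reconcile.Result{}, client.IgnoreNotFound(err)
		}
	}
	return reconcile.Result{}, nil
}
```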
// If the Registered statusCondition hasn't gone True during the TTL since we first updated it, we should terminate the NodeClaim
// NOTE: ttl has to be stored and checked in the same place since l.clock can advance after the check causing a race
if ttl := registrationTTL - l.clock.Since(registered.LastTransitionTime.Time); ttl > 0 {
// If the nodepool is degraded, requeue for the remaining TTL.
if ttl := registrationTTL - l.clock.Since(registered.LastTransitionTime.Time); ttl > 0 || nodePool.StatusConditions().Get(v1.ConditionTypeDegraded).IsTrue() {
I'm not following what this check is intended to do. Is this supposed to make it so that we dynamically scale the requeue time?
return reconcile.Result{RequeueAfter: ttl}, nil
}
// Delete the NodeClaim if we believe the NodeClaim won't register since we haven't seen the node
// Here we delete the NodeClaim if the node failed to register; we want to retry against the NodeClaim's NodeClass/NodePool 3x.
// Store against a NodePool since the NodeClass is not available? NodeClass ref is on the NodePool, and a NodePool is 1:1 with a NodeClass anyway.
log.FromContext(ctx).V(1).WithValues("failures", nodePool.Status.FailedLaunches).Info("failed launches so far")
This might be here for debugging, but I don't think that we should merge this since it's going to be incredibly noisy
Definitely just in for debugging; agreed it should be removed.
log.FromContext(ctx).V(1).WithValues("failures", nodePool.Status.FailedLaunches).Info("failed launches so far")
nodePool.Status.FailedLaunches += 1
log.FromContext(ctx).V(1).WithValues("failures", nodePool.Status.FailedLaunches).Info("failed launches so far")
if err := l.kubeClient.Status().Update(ctx, nodePool); err != nil {
Why do we choose to do an Update in some places and do a patch with optimistic locking in others?
I'll make these consistent; I was just working around an issue where Patch didn't always work, and I wasn't yet sure why.
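For reference, a sketch of the patch-with-optimistic-locking shape used by other Karpenter status controllers, built on controller-runtime's MergeFromWithOptimisticLock; this only illustrates the consistent pattern being suggested (imports and surrounding function omitted), not the final code:

```go
// Only issue a status patch if the object actually changed.
if !equality.Semantic.DeepEqual(stored, nodePool) {
	// MergeFromWithOptimisticLock adds a resourceVersion precondition, so a
	// concurrent writer produces a conflict instead of a silent overwrite.
	if err := l.kubeClient.Status().Patch(ctx, nodePool, client.MergeFromWithOptions(stored, client.MergeFromWithOptimisticLock{})); err != nil {
		return reconcile.Result{}, client.IgnoreNotFound(err)
	}
}
```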
pkg/apis/v1/nodepool_status.go
@@ -27,13 +27,18 @@ const (
ConditionTypeValidationSucceeded = "ValidationSucceeded"
// ConditionTypeNodeClassReady = "NodeClassReady" condition indicates that underlying nodeClass was resolved and is reporting as Ready
ConditionTypeNodeClassReady = "NodeClassReady"
// TODO
ConditionTypeDegraded = "Degraded"
Do you have thoughts around how you are going to track the last failed launch and then extend the amount of time before we retry out?
For this, I was thinking we'd use the lastTransitionTime for when Degraded == true.
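A rough sketch of how that could look, reusing the StatusConditions().Get(...) and clock accessors already present in this diff; degradedRetryBackoff is a hypothetical constant for illustration, not an existing setting:

```go
// If the NodePool is Degraded, wait out a backoff window measured from the
// time the condition last transitioned before retrying a launch.
if degraded := nodePool.StatusConditions().Get(v1.ConditionTypeDegraded); degraded.IsTrue() {
	elapsed := l.clock.Since(degraded.LastTransitionTime.Time)
	if remaining := degradedRetryBackoff - elapsed; remaining > 0 {
		return reconcile.Result{RequeueAfter: remaining}, nil
	}
}
```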
@@ -498,6 +498,9 @@ spec:
- type
Should we propose this change as an RFC? There seems to be a bunch of detail around what this status condition means, how we are going to track failures, what it means with respect to scheduling, etc., and I think it would be good to let people see this and get cloud provider input as well.
I'm writing the RFC today and I should be able to post tomorrow.
cloudProvider cloudprovider.CloudProvider
}

func NewController(kubeClient client.Client, cloudProvider cloudprovider.CloudProvider) *Controller {
Have you thought about how this status condition affects the schedulability of the NodePool? Does it deprioritize the NodePool?
I think we don't want to skip over Degraded NodePools, but they should be deprioritized. The simplest way to do this is probably to treat the weight as 0 in scheduling if a NodePool is Degraded.
I think the downside of this is that if a general or fast-and-cheap NodePool is degraded, and a fallback NodePool exists which satisfies the pending pod but is more constrained and expensive, then cluster costs could unexpectedly increase.
Fixes #N/A
Description
How was this change tested?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.