Track node removal latency for empty and non-empty nodes #9377

Open
tetianakh wants to merge 1 commit into kubernetes:master from tetianakh:was_empty
@tetianakh (Contributor) commented Mar 18, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it:

This change adds a type label to the node_removal_latency_seconds metric, which makes it possible to track the scale-down latency of empty and non-empty nodes separately.

@k8s-ci-robot added the do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress) label on Mar 18, 2026
@k8s-ci-robot (Contributor) commented

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the kind/feature (Categorizes issue or PR as related to a new feature), do-not-merge/release-note-label-needed (Indicates that a PR should not merge because it's missing one of the release note labels), do-not-merge/needs-area, area/cluster-autoscaler, and needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test) labels on Mar 18, 2026
@k8s-ci-robot requested review from elmiko and x13n on March 18, 2026 15:58
@k8s-ci-robot (Contributor) commented

Hi @tetianakh. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tetianakh
Once this PR has been reviewed and has the lgtm label, please assign aleksandra-malinowska for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the size/M (Denotes a PR that changes 30-99 lines, ignoring generated files) and cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA) labels on Mar 18, 2026
@tetianakh marked this pull request as ready for review on March 18, 2026 16:04
@k8s-ci-robot removed the do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress) label on Mar 18, 2026
@k8s-ci-robot requested a review from feiskyer on March 18, 2026 16:04
@k8s-ci-robot added the size/L (Denotes a PR that changes 100-499 lines, ignoring generated files) label and removed the size/M (Denotes a PR that changes 30-99 lines, ignoring generated files) label on Mar 19, 2026
@tetianakh changed the title from "Add was_empty field to node_removal_latency_seconds" to "Track node removal latency for empty and non-empty nodes" on Mar 19, 2026
@tetianakh force-pushed the was_empty branch 4 times, most recently from 7f80e68 to 5d251ee, on March 24, 2026 14:42
@tetianakh (Contributor, Author) commented

/assign Choraden

@Choraden (Contributor) left a comment

Thanks @tetianakh! Left some minor comments.

      Help:    "Latency from when an unneeded node is eligible for scale down until it is removed (deleted=true) or it became needed again (deleted=false).",
      Buckets: k8smetrics.ExponentialBuckets(1, 1.5, 19), // ~1s → ~24min
-   }, []string{"deleted"},
+   }, []string{"deleted", "type"},
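
As a side note on the bucket comment in this hunk, the "~1s → ~24min" range can be checked with a small stdlib-only sketch. It assumes k8smetrics.ExponentialBuckets follows the usual start/factor/count semantics (first bound equals start, each later bound is the previous one multiplied by factor); the local exponentialBuckets function below is a hypothetical re-implementation for illustration, not the library code.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds: the first equals start,
// and every following bound is the previous one multiplied by factor.
// Hypothetical local helper, assumed to mirror the library semantics.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	bound := start
	for i := range buckets {
		buckets[i] = bound
		bound *= factor
	}
	return buckets
}

func main() {
	b := exponentialBuckets(1, 1.5, 19)
	last := b[len(b)-1] // equals 1.5^18 under the assumption above
	fmt.Printf("buckets=%d first=%.0fs last=%.1fs (%.1f min)\n",
		len(b), b[0], last, last/60)
}
```

Under that assumption the largest finite bound is 1.5^18, a little under 1480 seconds, which is consistent with the "~24min" comment.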

"type" is common and very generic. How about node_type or unneeded_type to avoid ambiguity?

Comment on lines +192 to +195
    nodeType := metrics.NonEmptyUnneededNode
    if len(v.ntbr.PodsToReschedule) == 0 {
        nodeType = metrics.EmptyUnneededNode
    }

How about extracting it to a helper function?
That would make the intent clearer and centralize the definition of emptiness. It would also simplify testing this particular behavior.


That helper function could also be reused in candidatesFromNames (from node_latency_tracker_test.go) to make sure that we are testing the right logic.
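
One possible shape for the helper suggested in this thread is sketched below. It is only an illustration: the pod and nodeToBeRemoved types, the unneededNodeType function name, and the string values of the two constants are all assumptions, since the real types and the metrics.EmptyUnneededNode / metrics.NonEmptyUnneededNode definitions are not shown in this conversation.

```go
package main

import "fmt"

// Assumed label values; the real constants live in the metrics package
// and their actual string values are not visible in this PR excerpt.
const (
	emptyUnneededNode    = "empty"
	nonEmptyUnneededNode = "non-empty"
)

// pod and nodeToBeRemoved are simplified, hypothetical stand-ins for the
// real types; only the PodsToReschedule field name comes from the diff.
type pod struct{ name string }

type nodeToBeRemoved struct {
	PodsToReschedule []pod
}

// unneededNodeType centralizes the definition of emptiness: a node counts
// as empty when removing it requires no pods to be rescheduled.
func unneededNodeType(n nodeToBeRemoved) string {
	if len(n.PodsToReschedule) == 0 {
		return emptyUnneededNode
	}
	return nonEmptyUnneededNode
}

func main() {
	empty := nodeToBeRemoved{}
	busy := nodeToBeRemoved{PodsToReschedule: []pod{{name: "web-0"}}}

	fmt.Println(unneededNodeType(empty))
	fmt.Println(unneededNodeType(busy))
}
```

A test fixture builder could call the same function when constructing candidates, so production code and tests would share a single definition of "empty".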

Labels

area/cluster-autoscaler
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
do-not-merge/release-note-label-needed (Indicates that a PR should not merge because it's missing one of the release note labels.)
kind/feature (Categorizes issue or PR as related to a new feature.)
needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.)
size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.)

3 participants