Skip to content

Remove duplicate bootstrap duration observation in taint removal path#285

Open
rawadhossain wants to merge 2 commits into
kubernetes-sigs:mainfrom
rawadhossain:cleanup-bootstrap-observation
Open

Remove duplicate bootstrap duration observation in taint removal path#285
rawadhossain wants to merge 2 commits into
kubernetes-sigs:mainfrom
rawadhossain:cleanup-bootstrap-observation

Conversation

@rawadhossain

Copy link
Copy Markdown
Contributor

Description

cleanup: duplicate observation of node_readiness_bootstrap_duration_seconds in the bootstrap-only taint removal path.
Two consecutive observation blocks recording the same bootstrap completion event.

This is important because upcoming metrics, dashboards, and SLOs rely on this histogram for accurate calculations. Duplicate samples would skew those results.

Files changed

File Change
internal/controller/nodereadinessrule_controller.go Remove duplicate Block
internal/controller/nodereadinessrule_controller_test.go Add regression test + histogramSampleCount helper

Type of Change

/kind bug

Testing

Added test to verify the histogram sample count increases by exactly one for a bootstrap completion event

  • Ran:
    • make test
    • make lint

Checklist

  • make test passes
  • make lint passes

Signed-off-by: Rawad Hossain <rawad.hossain00@gmail.com>
@kubernetes-prow kubernetes-prow Bot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 2, 2026
@netlify

netlify Bot commented Jul 2, 2026

Copy link
Copy Markdown

Deploy Preview for node-readiness-controller canceled.

Name Link
🔨 Latest commit 47a3593
🔍 Latest deploy log https://app.netlify.com/projects/node-readiness-controller/deploys/6a47873bb227d700085d976b

@kubernetes-prow

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rawadhossain
Once this PR has been reviewed and has the lgtm label, please assign tallclair for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubernetes-prow kubernetes-prow Bot requested review from ajaysundark and mrunalp July 2, 2026 10:42
@kubernetes-prow kubernetes-prow Bot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 2, 2026
@kubernetes-prow

Copy link
Copy Markdown

Hi @rawadhossain. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kubernetes-prow kubernetes-prow Bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 2, 2026
@ajaysundark

Copy link
Copy Markdown
Contributor

/ok-to-test

/cc @Karthik-K-N

@kubernetes-prow kubernetes-prow Bot requested a review from Karthik-K-N July 3, 2026 06:35
@kubernetes-prow kubernetes-prow Bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 3, 2026

@Karthik-K-N Karthik-K-N left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching this. If possible we can do small improvement in test from the readability and stability aspect apart from that LGTM. Thank you.

Comment on lines +1079 to +1088
Status: corev1.NodeStatus{
Conditions: []corev1.NodeCondition{
{
Type: "Ready",
Status: corev1.ConditionTrue,
LastTransitionTime: transitionTime,
},
},
},
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does not matter, wont the create ignore status?
Usually we create only with spec

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right. Removed the Status block, now node is created with spec only

Expect(k8sClient.Get(ctx, types.NamespacedName{Name: nodeName}, node)).To(Succeed())
node.Status.Conditions[0].LastTransitionTime = transitionTime
node.Status.Conditions[0].Status = corev1.ConditionTrue
Expect(k8sClient.Status().Update(ctx, node)).To(Succeed())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually Update() call should go inside a Eventually block to have some retry, How about something like this

    Eventually(func() error {
        if err := k8sClient.Get(ctx, types.NamespacedName{Name: nodeName}, node); err != nil {
            return err
        }
        node.Status.Conditions = []corev1.NodeCondition{
            {
                Type:               "Ready",
                Status:             corev1.ConditionTrue,
                LastTransitionTime: transitionTime,
            },
        }
        return k8sClient.Status().Update(ctx, node)
    }, "5s", "100ms").Should(Succeed())

@rawadhossain rawadhossain Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. Commiting the changes now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants