Bug: WaitTask waits for filtered resources #452

Closed
karlkfi opened this issue Oct 27, 2021 · 6 comments · Fixed by #456

karlkfi (Contributor) commented Oct 27, 2021

WaitTask currently ignores failed applies and failed prunes, but it does not ignore resources skipped by filters in the ApplyTask or PruneTask.

Because WaitTasks are only added when there are dependencies (depends-on, CRDs, namespaces, etc.), this hasn't been much of a problem yet. But with a sufficiently large or complex set of resources, it's feasible for the WaitTask to wait until timeout and fail because it's waiting for one of the following:

  • [deletion prevention, inventory policy, local namespace] Expecting a delete that was skipped to cause that object to become NotFound
  • [inventory policy] Expecting an apply that was skipped to cause the object (previously different or broken) to become Current

Suggested fix:

  • Add tracking of skipped deletions and applies to the TaskContext (similar to the existing deletion tracking)
  • Update WaitTask to ignore skipped deletions and applies (see the sketch after this list)
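
A minimal sketch of what that tracking could look like, reusing the existing object.ObjMetadata key type from cli-utils. The SkipTracker type and its method names (CaptureSkippedApply, CaptureSkippedDelete, IsSkipped) are hypothetical, not the current TaskContext API:

```go
// Hypothetical sketch of the suggested skip tracking; not the current
// cli-utils TaskContext API.
package taskcontext

import (
	"sync"

	"sigs.k8s.io/cli-utils/pkg/object"
)

// SkipTracker records objects whose apply or delete was skipped by a
// filter, so a later WaitTask can drop them from its wait set instead of
// waiting for a condition that will never be reached.
type SkipTracker struct {
	mu             sync.Mutex
	skippedApplies map[object.ObjMetadata]struct{}
	skippedDeletes map[object.ObjMetadata]struct{}
}

func NewSkipTracker() *SkipTracker {
	return &SkipTracker{
		skippedApplies: map[object.ObjMetadata]struct{}{},
		skippedDeletes: map[object.ObjMetadata]struct{}{},
	}
}

// CaptureSkippedApply is called by the apply task when a filter skips an apply.
func (t *SkipTracker) CaptureSkippedApply(id object.ObjMetadata) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.skippedApplies[id] = struct{}{}
}

// CaptureSkippedDelete is called by the prune task when a filter skips a delete.
func (t *SkipTracker) CaptureSkippedDelete(id object.ObjMetadata) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.skippedDeletes[id] = struct{}{}
}

// IsSkipped reports whether the object's apply or delete was skipped.
func (t *SkipTracker) IsSkipped(id object.ObjMetadata) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	_, apply := t.skippedApplies[id]
	_, del := t.skippedDeletes[id]
	return apply || del
}
```

The WaitTask would then consult IsSkipped alongside its existing failed-apply/failed-prune checks when deciding whether to keep waiting on an object.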
karlkfi (Contributor, Author) commented Oct 27, 2021

Blocks #438 & #449

karlkfi (Contributor, Author) commented Oct 27, 2021

It looks like objects skipped by apply filters are captured as failures. So that case may not actually be a problem (just a confusing naming overlap).
https://github.com/kubernetes-sigs/cli-utils/blob/master/pkg/apply/task/apply_task.go#L137

It looks like objects skipped by prune filters are also captured as failures, except in the PreventRemoveFilterName case where the inventory annotation is successfully removed. I don't know why that case is not captured; the logic around it is perplexingly complex, with no comment explaining why. Fixing it may cause a regression if the behavior was deliberate. A rough sketch of the flow follows the link below.
https://github.com/kubernetes-sigs/cli-utils/blob/master/pkg/apply/prune/prune.go#L134
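
For reference, a hand-wavy reconstruction of the flow described above, with invented helper signatures passed in as parameters; this is not the real prune code, see the linked prune.go for the actual logic:

```go
// Illustrative reconstruction only; helper signatures are invented.
package prunesketch

import "sigs.k8s.io/cli-utils/pkg/object"

// prunePass shows the asymmetry described above: objects skipped by prune
// filters are captured as prune "failures", except the prevent-remove case
// where the inventory annotation was successfully removed.
func prunePass(
	ids []object.ObjMetadata,
	runFilters func(object.ObjMetadata) (filterName string, skip bool),
	removeInventoryAnnotation func(object.ObjMetadata) error,
	capturePruneFailure func(object.ObjMetadata),
	deleteObject func(object.ObjMetadata) error,
) {
	for _, id := range ids {
		if name, skip := runFilters(id); skip {
			if name == "PreventRemoveFilter" && removeInventoryAnnotation(id) == nil {
				// Skipped, but deliberately NOT recorded as a prune
				// failure, so the object is left out of the set that
				// InvSetTask later writes back into the inventory.
				continue
			}
			capturePruneFailure(id)
			continue
		}
		_ = deleteObject(id)
	}
}
```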

karlkfi (Contributor, Author) commented Oct 27, 2021

The complex logic around whether to mark as a failed apply was added recently in #424

It looks like the failed applies are checked in the InvSetTask and kept in the inventory.
https://github.com/kubernetes-sigs/cli-utils/blob/master/pkg/apply/task/inv_set_task.go#L50

So changing it to track the deletion might change inventory behavior in the deletion prevention use case.

haiyanmeng (Contributor) commented

The reason for not calling taskContext.CapturePruneFailure when removeInventoryAnnotation succeeded is that TaskContext.pruneFailures are added into the inventory in the InvSetTask.

karlkfi (Contributor, Author) commented Oct 28, 2021

Gotcha, yeah. We're going to need another solution, because the wait still needs to be skipped, and taskContext.CapturePruneFailure is what was telling the WaitTask to skip that object. A sketch of one possible direction is below.
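
A hypothetical sketch of that direction: let the WaitTask consult skip records separately from failure records, so the prevent-remove prune path no longer has to be recorded as a "failure" just to be excluded from the wait. The waitContext interface and its method names are assumptions, not the real TaskContext API.

```go
// Hypothetical sketch; interface and method names are assumptions.
package waitsketch

import "sigs.k8s.io/cli-utils/pkg/object"

type waitContext interface {
	IsFailedApply(id object.ObjMetadata) bool
	IsFailedDelete(id object.ObjMetadata) bool
	// IsSkipped reports whether the object's apply or delete was skipped
	// by a filter (see the SkipTracker sketch earlier in this issue).
	IsSkipped(id object.ObjMetadata) bool
}

// waitSet returns only the IDs the WaitTask should actually wait for:
// anything whose apply/delete failed or was skipped will never reach the
// desired condition, so waiting on it just burns the timeout.
func waitSet(taskCtx waitContext, ids []object.ObjMetadata) []object.ObjMetadata {
	var out []object.ObjMetadata
	for _, id := range ids {
		if taskCtx.IsFailedApply(id) || taskCtx.IsFailedDelete(id) || taskCtx.IsSkipped(id) {
			continue
		}
		out = append(out, id)
	}
	return out
}
```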

karlkfi (Contributor, Author) commented Oct 29, 2021

Fix: #456
