Implement cylc remove proposal #6472

Open: wants to merge 25 commits into base: master

MetRonnie
Member

Closes #5643. Supersedes #6370.

Summary

This fully implements the "Cylc Remove Extension" proposal.

Flow numbers

cylc remove now has a --flow option for removing a task from specific flows.

If not used, it will remove the task from all flows that it belongs to.

If the removed task is active/waiting and is removed from only a subset of the flows that it belongs to, it will remain in the task pool; if it is removed from all flows that it belongs to, it will be removed from the task pool (as is the current behaviour).

If a task is removed from all flows that it belongs to, it will become a no-flow task (flow=None).
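
The flow rules above can be sketched with plain set arithmetic. This is only an illustration, not the code in this PR: the function name `remove_from_flows` is hypothetical, and an empty set stands in for a no-flow task.

```python
from typing import Optional, Set, Tuple


def remove_from_flows(
    task_flows: Set[int],
    flows_to_remove: Optional[Set[int]] = None,
) -> Tuple[Set[int], bool]:
    """Return (remaining flow numbers, whether an active task stays in the pool).

    Hypothetical helper for illustration only.
    """
    if flows_to_remove is None:
        # No --flow option given: remove from every flow the task belongs to
        flows_to_remove = set(task_flows)
    remaining = task_flows - flows_to_remove
    # An empty set represents a no-flow task, which leaves the pool
    return remaining, bool(remaining)
```

For example, `remove_from_flows({1, 2}, {1})` leaves the task in flow 2 and in the pool, while `remove_from_flows({1, 2})` leaves a no-flow task that drops out of the pool.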

For ease of reviewing, you can use my UI branch that displays flow numbers: https://github.com/MetRonnie/cylc-ui/tree/flow-nums [1]

Historical tasks

cylc remove can now remove tasks that are no longer active, making it look like they never ran. It does this by removing the task from the specified flows in the database (in the task_states and task_outputs tables) [2], and un-setting any prerequisites of active tasks that the removed task had naturally satisfied [3]. If a task is removed from all flows that it belongs to, a no-flow task is left in the DB for provenance.

The above also applies to active/waiting tasks that cylc remove is used on.
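
As a rough sketch of the prerequisite handling described above (an assumption-laden illustration, not this PR's implementation): the state strings, the dict layout and the function name `unset_prereq` are all hypothetical. The point is that only naturally satisfied prerequisites are un-set, while manually satisfied ones are left alone.

```python
from typing import Dict, Union

# Illustrative state labels (assumed for this sketch)
NATURAL = "satisfied naturally"
FORCED = "force satisfied"  # e.g. set manually with `cylc set --pre`


def unset_prereq(
    prereqs: Dict[str, Union[str, bool]], removed_task: str
) -> bool:
    """Un-set the prerequisite on removed_task if it was naturally satisfied.

    Returns True if the prerequisite was changed.
    """
    if prereqs.get(removed_task) == NATURAL:
        prereqs[removed_task] = False
        return True
    # Manually satisfied (or already unsatisfied) prerequisites are untouched
    return False
```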

Kill submitted/running tasks

Using cylc remove on a submitted/running task will now kill it if you are removing the task from all flows that it belongs to.

Unlike with cylc kill, downstream tasks will not spawn off the :fail or :submit-fail outputs, because the task is now in flow=none. The failed and submission failed handlers will not run either.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • No dependency changes
  • Tests are included
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

Footnotes

  1. Waiting tasks that are not yet in the pool have greyed out flow numbers at the moment.

  2. If removing flows would result in two rows in the DB no longer being unique, the SQLite UPDATE OR REPLACE statement is used, so the first entry will be removed and the most recent entry will remain.

  3. Prerequisites manually satisfied by cylc set --pre are not affected by cylc remove.
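
The `UPDATE OR REPLACE` behaviour mentioned in footnote 2 can be demonstrated with a toy table. The schema below is a simplified assumption for illustration (the real task_states table has more columns), and flow numbers are stored as JSON-style strings purely for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the task_states table (assumed schema)
conn.execute(
    "CREATE TABLE task_states ("
    "cycle TEXT, name TEXT, flow_nums TEXT, status TEXT, "
    "PRIMARY KEY (cycle, name, flow_nums))"
)
conn.executemany(
    "INSERT INTO task_states VALUES (?, ?, ?, ?)",
    [
        ("1", "foo", "[1]", "succeeded"),   # older entry
        ("1", "foo", "[1, 2]", "failed"),   # more recent entry
    ],
)

# Removing flow 2 from the second row would collide with the first row's
# primary key; OR REPLACE deletes the pre-existing conflicting row instead
# of raising an IntegrityError.
conn.execute(
    "UPDATE OR REPLACE task_states SET flow_nums = ? "
    "WHERE cycle = ? AND name = ? AND flow_nums = ?",
    ("[1]", "1", "foo", "[1, 2]"),
)

rows = conn.execute(
    "SELECT cycle, name, flow_nums, status FROM task_states"
).fetchall()
```

After the update, `rows` holds a single entry for `1/foo` in flow 1 with the status of the more recent row.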

`json.dumps()`/`json.loads()` are relatively slow (~1 µs), and these functions are likely to be called many times with `flow={1}`.
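
One common way to avoid repeating this work is to memoise the (de)serialisation. The sketch below assumes that approach and uses hypothetical function names; it is not necessarily how this commit does it. `frozenset` is used because `functools.lru_cache` needs hashable arguments.

```python
import json
from functools import lru_cache
from typing import FrozenSet


@lru_cache(maxsize=100)
def serialise_flow_nums(flow_nums: FrozenSet[int]) -> str:
    """Convert flow numbers to a JSON list string, caching repeated inputs."""
    return json.dumps(sorted(flow_nums))


@lru_cache(maxsize=100)
def deserialise_flow_nums(flow_str: str) -> FrozenSet[int]:
    """Convert a JSON list string back to flow numbers, caching repeated inputs."""
    return frozenset(json.loads(flow_str))
```

Repeated calls with the same argument, such as `frozenset({1})`, are then answered from the cache rather than going through `json` again.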
- Update data store with changed prereqs
- Don't un-queue downstream task if:
  - the task is already preparing
  - the task exists in flows other than that being removed
  - the task's prereqs are still satisfied overall
- Remove the downstream task from the pool if it no longer has any satisfied prerequisite tasks
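
The conditions in this list can be read as a pair of small predicates. The following is a hypothetical sketch only: the `Downstream` class and its fields are invented for illustration and do not mirror the scheduler's real task objects.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Downstream:
    """Hypothetical summary of a downstream task's state."""
    is_preparing: bool
    flow_nums: Set[int]
    prereqs_satisfied_overall: bool
    n_satisfied_prereq_tasks: int


def should_unqueue(task: Downstream, removed_flows: Set[int]) -> bool:
    """Decide whether a queued downstream task should be un-queued."""
    if task.is_preparing:
        return False  # already preparing
    if task.flow_nums - removed_flows:
        return False  # also exists in flows that are not being removed
    if task.prereqs_satisfied_overall:
        return False  # still satisfied without the removed task
    return True


def should_remove_from_pool(task: Downstream) -> bool:
    """A downstream task with no satisfied prerequisite tasks leaves the pool."""
    return task.n_satisfied_prereq_tasks == 0
```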
This will allow it to call the method to kill submitted/running tasks
Plus ensure traceback for internal errors when cleaning gets logged in verbose mode
MetRonnie added this to the 8.4.0 milestone on Nov 11, 2024
MetRonnie self-assigned this on Nov 11, 2024
Member

hjoliver left a comment

Still part way through, looks great so far.

Resolved review threads:

  • cylc/flow/task_state.py
  • cylc/flow/command_validation.py (3 threads)
  • cylc/flow/platforms.py
  • cylc/flow/scheduler.py (2 threads)
  • cylc/flow/scripts/remove.py (3 threads)

@hjoliver
Member

(Did some testing today, so far so good, but I'm not done yet.)

@oliver-sanders
Member

This PR removes flow nums from the task_outputs and task_states tables, which is functionally sufficient.

However, it doesn't remove flow nums from the task_jobs table. Removing them there doesn't appear to be functionally required, so no problem. However, I think this table is used to populate "cylc-review", so we might want to remove them there too, possibly as a follow-on?

@MetRonnie
Member Author

However, it doesn't remove flow nums from the task_jobs table.

(It does, but only for submitted/running tasks that get killed by cylc remove)

@wxtim
Member

wxtim commented Nov 14, 2024

Functional question: If I do cylc remove workflow//farfuture/task I get

2024-11-14T08:59:17Z WARNING - Task(s) not removable: 221/foo

in the scheduler log, which is great. Any possibility of a warning from command validation? Or is this sufficiently silly foot-shooting that we're not worrying about it?

Successfully merging this pull request may close these issues.

remove: extend beyond the task pool
4 participants