
Version Tracker complains because grapher://grapher steps are not found in the dag #3267

Open
pabloarosado opened this issue Sep 10, 2024 · 2 comments

pabloarosado commented Sep 10, 2024

Problem

Currently, running etl d version-tracker raises an error such as:

* Missing step
    grapher://grapher/energy/2024-06-20/primary_energy_consumption
  is a dependency of the following active steps:
    export://multidim/energy/latest/energy

Expected behaviour

We would expect the version tracker to automatically include the auto-generated grapher://grapher/... steps when sanity-checking the dependency graph, so no error should be raised here.

Why this is happening

This happens because the grapher://grapher/ dependency is not in the dag. However, the corresponding data://grapher/ step is in the dag, so the error should not be raised. This started happening recently, when we began adding export steps that depend on grapher://grapher steps.
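To make the failure mode concrete, here is a minimal sketch of this kind of sanity check. It assumes the dag is a plain dict mapping each step to the set of its dependencies, and the function name find_missing_dependencies is hypothetical; it is a toy illustration, not the actual version tracker code.

```python
from typing import Dict, List, Set


def find_missing_dependencies(dag: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each dependency that is not itself a step in the dag to the steps using it."""
    missing: Dict[str, List[str]] = {}
    for step, dependencies in dag.items():
        for dependency in sorted(dependencies):
            if dependency not in dag:
                missing.setdefault(dependency, []).append(step)
    return missing


# Toy dag: the data://grapher step is defined, but the export step depends on
# the auto-generated grapher://grapher step, which is not a key in the dag.
dag = {
    "data://grapher/energy/2024-06-20/primary_energy_consumption": set(),
    "export://multidim/energy/latest/energy": {
        "grapher://grapher/energy/2024-06-20/primary_energy_consumption",
    },
}

missing = find_missing_dependencies(dag)
for dependency, used_by in missing.items():
    print(f"Missing step {dependency} is a dependency of: {used_by}")
```

In this toy dag, the grapher://grapher step is flagged as missing even though its data://grapher counterpart is defined, which mirrors the error reported above.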

Technical notes

  • We note that we really do need to execute grapher://grapher/... steps before exporting
  • This means version tracker needs to build off a computed DAG that includes these steps
@pabloarosado

This issue also causes StepUpdater to fail when, for example, archiving steps. I can imagine that it may also fail occasionally when updating steps, and that the step status shown in the dashboard may be incorrect.
The main reason why this happens is that grapher://grapher steps are not defined in the dag.

The easiest fix would be to replace them with data://grapher, but I think that may break some of the new logic on export steps.

An alternative would be to add some logic when reading the dag, so that additional (hidden) steps are added, namely "grapher://grapher/[STEP]": "data://grapher/[STEP]". This doesn't need to happen for all grapher://grapher steps, only for those that explicitly appear in the dag.
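A rough sketch of what that alternative could look like, again assuming the dag is a dict mapping each step to the set of its dependencies. The function name add_hidden_grapher_steps and the prefix constants are hypothetical and only illustrate the idea; this is not existing ETL code.

```python
from typing import Dict, Set

GRAPHER_PREFIX = "grapher://grapher/"
DATA_PREFIX = "data://grapher/"


def add_hidden_grapher_steps(dag: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return a copy of the dag with a hidden grapher://grapher step for each
    grapher://grapher dependency that explicitly appears in the dag."""
    # Copy the dependency sets so the input dag is left untouched.
    augmented = {step: set(dependencies) for step, dependencies in dag.items()}
    for dependencies in dag.values():
        for dependency in dependencies:
            if dependency.startswith(GRAPHER_PREFIX) and dependency not in augmented:
                # "grapher://grapher/[STEP]": "data://grapher/[STEP]"
                step_path = dependency[len(GRAPHER_PREFIX):]
                augmented[dependency] = {DATA_PREFIX + step_path}
    return augmented


dag = {
    "data://grapher/energy/2024-06-20/primary_energy_consumption": set(),
    "export://multidim/energy/latest/energy": {
        "grapher://grapher/energy/2024-06-20/primary_energy_consumption",
    },
}

augmented = add_hidden_grapher_steps(dag)
```

With the hidden step added, every dependency of the export step is itself defined in the augmented dag, so a check for missing steps would no longer flag it. Only grapher://grapher steps that are actually referenced get an entry.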

Marigold commented Nov 5, 2024

There's a function construct_dag with a couple of arguments for adding various steps. If that doesn't help, we should start refactoring how the DAG is constructed (it's pretty messy right now).
