CWLProv: add option exclude raw copies of the data #1586

Open
3 tasks
mr-c opened this issue Jan 6, 2022 · 4 comments

@mr-c
Member

mr-c commented Jan 6, 2022

https://matrix.to/#/!RQMxrGNGkeDmWHOaEs:gitter.im/$AJGFCdt6jVAn3aR5lQ0PK3_0SGgvFrubf5SMClsOgGA (a.k.a https://gitter.im/common-workflow-language/common-workflow-language?at=61d6a7bfbfe2f54b2e04661d )

  • --prov-exclude-inputs Skips copying the input files into the CWLProv ResearchObject
  • --prov-exclude-intermediates Skips copying the intermediate files into the CWLProv ResearchObject
  • --prov-exclude-outputs Skips copying the output files into the CWLProv ResearchObject
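
For illustration only, the three proposed flags could be declared with `argparse` roughly as below. This is a minimal sketch, not cwltool's actual argument parser: the `build_parser` helper and the `prov-flags-sketch` program name are hypothetical, and only the flag names and help texts come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-alone parser holding only the three proposed flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="prov-flags-sketch")
    # Each flag is a simple boolean switch that defaults to False (copy as before).
    for flag, kind in (
        ("inputs", "input"),
        ("intermediates", "intermediate"),
        ("outputs", "output"),
    ):
        parser.add_argument(
            f"--prov-exclude-{flag}",
            action="store_true",
            help=f"Skips copying the {kind} files into the CWLProv ResearchObject",
        )
    return parser


# argparse maps "--prov-exclude-inputs" to the attribute "prov_exclude_inputs".
args = build_parser().parse_args(["--prov-exclude-inputs", "--prov-exclude-outputs"])
print(args.prov_exclude_inputs, args.prov_exclude_intermediates, args.prov_exclude_outputs)
```

Keeping the three switches independent would let a user drop only the inputs, for example, while still archiving the outputs.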
@jjkoehorst

Thanks for creating the ticket. For me personally, only the metadata (RDF files, workflow files) is needed, as the input and output files are preserved on a cloud store.

@mr-c
Member Author

mr-c commented Jan 6, 2022

Areas to investigate (add a flag to skip the copying, but still calculate and store the checksums):

def add_data_file(

def _relativise_files(

def generate_snapshot(self, prov_dep: CWLObjectType) -> None:
called from
research_obj.generate_snapshot(

def create_job(
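
The "skip the copy but keep the checksum" idea could look roughly like the sketch below. It is not the code of any of the functions listed above: `record_file`, its `copy` parameter, the SHA-1 choice, and the `data_dir/<first two hex chars>/<checksum>` layout are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Optional, Tuple


def record_file(source: Path, data_dir: Path, copy: bool = True) -> Tuple[str, Optional[Path]]:
    """Hash `source`; copy it under `data_dir` only when `copy` is true (hypothetical helper)."""
    hasher = hashlib.sha1()
    with source.open("rb") as handle:
        # Read in chunks so large inputs are never fully loaded into memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    checksum = hasher.hexdigest()
    if not copy:
        # The checksum is still available for the provenance record; no bytes are duplicated.
        return checksum, None
    # Assumed content-addressed layout, used here only to make the sketch concrete.
    target = data_dir / checksum[:2] / checksum
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return checksum, target
```

With `copy=False` the file is read once to compute the checksum and nothing is written, which is the behaviour the metadata-only use case needs.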

@jjkoehorst

To update this: when Directory or File objects are provided as input, the entire content is copied to /tmp. Solution for now is to use Strings instead of Directory when possible.

@mr-c
Member Author

mr-c commented Feb 8, 2022

Solution for now is to use Strings instead of Directory when possible.

FYI, while that may work for now, it will break multi-node execution of the workflow.
