Use Nextflow file() and toUriString() for cloud storage compatibility#139

Merged: jmuhlich merged 3 commits into dev from fix-xmlpath on Feb 3, 2026

@adamjtaylor
Contributor

@adamjtaylor adamjtaylor commented Jan 29, 2026

Changes the xmlPath input from val() to path() so that Nextflow stages the file locally before processing.
Uses Nextflow's file() function instead of Java's File class, and replaces toString() with toUriString() when reading the XML content. This fixes "No such file or directory" errors when running on cloud platforms (AWS S3, Azure, GCS) via Seqera Platform.

Noted in review for PR #120

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/mcmicro branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@github-actions

github-actions Bot commented Jan 29, 2026

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 044dcca

  • ✅ 223 tests passed
  • ❔ 5 tests were ignored
  • ❔ 1 tests had warnings
  • ❗ 5 tests had warnings

❗ Test warnings:

  • nextflow_config - Config manifest.version should end in dev: 2.0.0
  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • schema_lint - Parameter input not found in schema

❔ Tests ignored:

  • files_exist - File is ignored: conf/igenomes.config
  • files_exist - File is ignored: conf/igenomes_ignored.config
  • nextflow_config - Config variable ignored: params.input
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
  • template_strings - Ignoring Jinja template strings in file /home/runner/work/mcmicro/mcmicro/assets/mcmicro_metro.pdf

❔ Tests fixed:

✅ Tests passed:

Run details

  • nf-core/tools version 3.5.1
  • Run at 2026-01-29 15:59:49

@adamjtaylor adamjtaylor marked this pull request as draft January 29, 2026 14:55
@adamjtaylor
Contributor Author

Converting this to draft while I confirm that I can get this fix working on AWS Batch.

@adamjtaylor adamjtaylor changed the title Change xmlPath input type to path Use Nextflow file() for cloud storage compatibility Jan 29, 2026
@adamjtaylor
Contributor Author

adamjtaylor commented Jan 29, 2026

I am still running into issues getting this working on Seqera Platform with AWS Batch and the Fusion file system. The file TEST1_1.xml exists on S3 in the work dir of the process that generated it, but the exec block running on the head node can't access it because:

  • The path being passed is /mc2-project-tower-scratch/work/4c/... (a Fusion-style local mount path).
  • The exec block runs on the head node, which doesn't have the Fusion mount.
  • So file() tries to read a local path that doesn't exist.

I think the solution is to move this out of exec and into a script block, as planned in a comment on #120.

@adamjtaylor adamjtaylor changed the title Use Nextflow file() for cloud storage compatibility Use Nextflow file() and toUriString() for cloud storage compatibility Jan 29, 2026
@adamjtaylor
Contributor Author

OK. I have a working solution while we wait on moving this out of exec. We needed to use both the Nextflow file() function and toUriString(), which preserves the S3 URI of the work directory when bringing in the XML file. This passes local tests and also gets through this step on Sage's Seqera Platform with AWS Batch compute.
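
For readers following along, here is a rough Python model of why the string form matters. It is only an illustration, not the pipeline's code: the two helper functions are hypothetical stand-ins that mimic how a remote path is described as rendering with toString() (scheme dropped, bucket as the first path component) versus toUriString() (scheme kept), and the bucket name and work-dir hash are made up.

```python
from urllib.parse import urlsplit


def to_path_string(location: str) -> str:
    """Scheme-less form, modelled on the toString() behaviour discussed above."""
    parts = urlsplit(location)
    if not parts.scheme:
        return location  # already a plain local path
    # The bucket (netloc) becomes the first path component; the scheme is lost.
    return f"/{parts.netloc}{parts.path}"


def to_uri_string(location: str) -> str:
    """Scheme-preserving form, modelled on toUriString()."""
    parts = urlsplit(location)
    if not parts.scheme:
        return location  # local paths have no scheme to preserve
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


# Hypothetical work-dir location; the bucket and hash are made up.
xml_location = "s3://example-bucket/work/ab/cdef/TEST1_1.xml"

print(to_path_string(xml_location))  # looks like a local path
print(to_uri_string(xml_location))   # still addressable as a remote object
```

The first form is indistinguishable from a local path, so anything that re-resolves it on a machine without the matching mount would look on the local disk, which is consistent with the "No such file or directory" error reported above.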

@adamjtaylor adamjtaylor marked this pull request as ready for review January 29, 2026 16:42
@jmuhlich
Member

jmuhlich commented Feb 2, 2026

We are moving away from exec scripts entirely and implementing this functionality in a local module, so I don't think we need this anymore?

@adamjtaylor
Contributor Author

@jmuhlich, agreed. I will need to keep the branch alive, as without it I cannot test on our Seqera Platform. We might choose to merge into dev for now, ahead of dropping exec, which would allow me to keep testing against the dev branch. I'm happy to take your lead on whether to merge or not.

@jmuhlich
Member

jmuhlich commented Feb 3, 2026

OK, happy to merge now to ease your testing. The Python rewrite should be done this week.

@jmuhlich jmuhlich merged commit fe80bb9 into dev Feb 3, 2026
19 checks passed
@jmuhlich jmuhlich deleted the fix-xmlpath branch February 3, 2026 15:37
3 participants