Potential fix to allow QIIME2 with conda #835

Open
d4straub opened this issue Mar 4, 2025 · 12 comments
Labels
enhancement New feature or request

Comments

Collaborator

d4straub commented Mar 4, 2025

Description of feature

Running QIIME2 with conda isn't currently allowed, because no conda recipe was added to the pipeline. That might be a relic of the past, because the QIIME2 organization itself maintains and releases a conda recipe at https://anaconda.org/qiime2/qiime2-amplicon. Adding qiime2/label/r2024.10::qiime2-amplicon (or similar) as the conda package in the QIIME2 processes throughout the pipeline might make QIIME2 work with conda as well.

@d4straub d4straub added the enhancement New feature or request label Mar 4, 2025

chan-98 commented Mar 5, 2025

Hi, I tested this, and the label above doesn't work:

Channels:
 - conda-forge
 - qiime2/label/r2024.10
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed

LibMambaUnsatisfiableError: Encountered problems while solving:
  - nothing provides bioconductor-ancombc 2.4.0.* needed by qiime2-amplicon-2024.10.24.01.47.20-py310pl5321r43h6fd29f9_0

Could not solve for environment specs
The following package could not be installed
└─ qiime2-amplicon is not installable because it requires
   └─ bioconductor-ancombc 2.4.0.* , which does not exist (perhaps a missing channel).

We can add qiime2::qiime2-amplicon, though. I added this to my local copy of the pipeline, and it does run QIIME2 with conda now. However, there is no way to specify a version in this case.


chan-98 commented Mar 7, 2025

Of course, we could just add conda "https://data.qiime2.org/distro/amplicon/qiime2-amplicon-2024.10-py310-linux-conda.yml" to the process, or keep the yml file in ${projectDir}/envs/ and provide that path instead (this might then be the only process that uses a yml file). Is this something we could add to the pipeline?

Collaborator Author

d4straub commented Mar 7, 2025

We can add qiime2::qiime2-amplicon though.

That's unfortunately not acceptable, because it doesn't pin the version.

conda "https://data.qiime2.org/distro/amplicon/qiime2-amplicon-2024.10-py310-linux-conda.yml" in the process, or keep the yml file in ${projectDir}/envs/. and provide that path as well (this might be the only process then, which would use a yml file).

That is better, I think. I am not sure whether the https link or the envs dir would be better practice.

Collaborator Author

d4straub commented Mar 7, 2025

A yml file in ${projectDir}/envs/ seems safer, because the file at a random https address could change or move at any time and would make the pipeline unusable with conda again. With a copy, the pipeline relies only on bioconda, which seems a better bet.


SPPearce commented Mar 7, 2025

Hi, So I tested this and the above label doesn't work

Channels:
 - conda-forge
 - qiime2/label/r2024.10
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed

LibMambaUnsatisfiableError: Encountered problems while solving:
  - nothing provides bioconductor-ancombc 2.4.0.* needed by qiime2-amplicon-2024.10.24.01.47.20-py310pl5321r43h6fd29f9_0

Could not solve for environment specs
The following package could not be installed
└─ qiime2-amplicon is not installable because it requires
   └─ bioconductor-ancombc 2.4.0.* , which does not exist (perhaps a missing channel).

We can add qiime2::qiime2-amplicon though. Have added this to the local copy of my pipeline and it does run Qiime with conda now. However there is no way to specify a version in this case.

Did you try including bioconda as a channel when you tried this?
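
One way to check this is a dry-run solve with bioconda added to the channel list. The sketch below only assembles the arguments for `conda create --dry-run`; the environment name `qiime2-solve-test` and the helper names `solve_args` and `try_solve` are made up for illustration, and nothing here shows whether the solve actually succeeds.

```shell
#!/usr/bin/env bash
# Channel order mirrors the failed attempt above, with bioconda appended.
CHANNELS=(qiime2/label/r2024.10 conda-forge bioconda)

# Print the arguments for a dry-run solve (nothing gets installed).
solve_args() {
    local ch
    printf 'create --dry-run --name qiime2-solve-test'
    for ch in "${CHANNELS[@]}"; do
        printf ' --channel %s' "$ch"
    done
    printf ' qiime2-amplicon\n'
}

# Run the solve; needs conda on PATH and network access.
try_solve() {
    # Word splitting is intended here: no argument contains spaces.
    # shellcheck disable=SC2046
    conda $(solve_args)
}
```

Calling `try_solve` runs the solver without creating an environment, so the channel list can be adjusted and retried cheaply.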


chan-98 commented Mar 7, 2025

@SPPearce Yes, I did, by creating an env file as mentioned in my reply to @d4straub. But with that version (2024.10.24) of the qiime2 env, I hit a different issue while running the pipeline:

Command exit status:
  1

Command output:
  Saved FeatureTable[Frequency] to: lvl5-habitat.qza
  Exported lvl5-habitat.qza as BIOMV210DirFmt to directory exported/
  Running external command line application(s). This may print messages to stdout and/or stderr.
  The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
  
  Command: run_ancombc.R --inp_abundances_path /tmp/tmpia_tm0lc/input.biom.tsv --inp_metadata_path /tmp/tmpia_tm0lc/input.map.txt --md_column_types {"name": "categorical", "habitat": "categorical", "Riv_vs_Gro": "categorical", "Sed_vs_Soil": "categorical"} --formula habitat --p_adj_method holm --prv_cut 0.1 --lib_cut 500 --reference_levels ['habitat::Groundwater'] --tol 1e-05 --max_iter 100 --conserve True --alpha 0.05 --output_loaf /tmp/q2-DataLoafPackageDirFmt-7wsxkpmj

Command error:
         P <- if (!is.null(cc <- conditionCall(e))) 
             paste(" in", deparse(cc)[1L])
         else ""
         msg <- gettextf("package or namespace load failed for %s%s:\n %s", 
             sQuote(package), P, conditionMessage(e))
         if (logical.return && !quietly) 
             message(paste("Error:", msg), domain = NA)
         else stop(msg, call. = FALSE, domain = NA)
     })
  3: library(phyloseq)
  2: withCallingHandlers(expr, warning = function(w) if (inherits(w, 
         classes)) tryInvokeRestart("muffleWarning"))
  1: suppressWarnings(library(phyloseq))
  Traceback (most recent call last):
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/site-packages/q2_composition/_ancombc.py", line 255, in _ancombc
      run_commands([cmd])
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/site-packages/q2_composition/_ancombc.py", line 32, in run_commands
      subprocess.run(cmd, check=True)
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/subprocess.py", line 528, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['run_ancombc.R', '--inp_abundances_path', '/tmp/tmpia_tm0lc/input.biom.tsv', '--inp_metadata_path', '/tmp/tmpia_tm0lc/input.map.txt', '--md_column_types', '{"name": "categorical", "habitat": "categorical", "Riv_vs_Gro": "categorical", "Sed_vs_Soil": "categorical"}', '--formula', 'habitat', '--p_adj_method', 'holm', '--prv_cut', '0.1', '--lib_cut', '500', '--reference_levels', "['habitat::Groundwater']", '--tol', '1e-05', '--max_iter', '100', '--conserve', 'True', '--alpha', '0.05', '--output_loaf', '/tmp/q2-DataLoafPackageDirFmt-7wsxkpmj']' returned non-zero exit status 1.
  
  During handling of the above exception, another exception occurred:
  
  Traceback (most recent call last):
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/site-packages/q2cli/commands.py", line 520, in __call__
      results = self._execute_action(
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/site-packages/q2cli/commands.py", line 581, in _execute_action
      results = action(**arguments)
    File "<decorator-gen-19>", line 2, in ancombc
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
      outputs = self._callable_executor_(
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/site-packages/qiime2/sdk/action.py", line 576, in _callable_executor_
      output_views = self._callable(**view_args)
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/site-packages/q2_composition/_ancombc.py", line 41, in ancombc
      return _ancombc(
    File "/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/python3.9/site-packages/q2_composition/_ancombc.py", line 257, in _ancombc
      raise Exception('An error was encountered while running ANCOM-BC'
  Exception: An error was encountered while running ANCOM-BC in R (return code 1), please inspect stdout and stderr to learn more.
  
  Plugin error from composition:
  
    An error was encountered while running ANCOM-BC in R (return code 1), please inspect stdout and stderr to learn more.
  
  See above for debug info.
  Running external command line application(s). This may print messages to stdout and/or stderr.
  The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
  
  Command: run_ancombc.R --inp_abundances_path /tmp/tmpia_tm0lc/input.biom.tsv --inp_metadata_path /tmp/tmpia_tm0lc/input.map.txt --md_column_types {"name": "categorical", "habitat": "categorical", "Riv_vs_Gro": "categorical", "Sed_vs_Soil": "categorical"} --formula habitat --p_adj_method holm --prv_cut 0.1 --lib_cut 500 --reference_levels ['habitat::Groundwater'] --tol 1e-05 --max_iter 100 --conserve True --alpha 0.05 --output_loaf /tmp/q2-DataLoafPackageDirFmt-7wsxkpmj

Work dir:
  /data/scratch/chandini.v/nf-core/ampliseq/test_run/work/d8/7a52cad8b4fdee8178ccd57108269c

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

When I looked into the code, the issue was with loading phyloseq, even though the latest versions of biobase and phyloseq were in my env. I tried loading the phyloseq R library within the environment and got this error:

> library(phyloseq)
Error: package or namespace load failed for ‘phyloseq’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/R/library/ade4/libs/ade4.so':
  /data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/R/bin/exec/../../lib/../.././libstdc++.so.6: version `CXXABI_1.3.15' not found (required by /data/scratch/chandini.v/nf-core/ampliseq/test_run/work/conda/env-f67b03a2a8760bdf-dbf25e83f27c352a2073155655e6e7bf/lib/R/library/ade4/libs/ade4.so)

I'll try to first resolve this and if I'm unable to, I'll see which version of qiime2 works.
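
For this kind of error it can help to list which CXXABI symbol versions the environment's libstdc++.so.6 actually provides. A minimal sketch, assuming a grep that supports `-a`, `-o` and `-E`; the function names `list_cxxabi` and `has_cxxabi` are illustrative, and the library to inspect would be the lib/libstdc++.so.6 inside the conda env named in the error message.

```shell
#!/usr/bin/env bash
# List the distinct CXXABI version tags embedded in a shared library.
# grep -a treats the binary as text; -o prints only the matching tokens.
list_cxxabi() {
    grep -a -o -E 'CXXABI_[0-9]+(\.[0-9]+)*' "$1" | sort -u
}

# Succeed only if the library carries exactly the requested tag,
# e.g. CXXABI_1.3.15 (-x avoids matching CXXABI_1.3.1 as a prefix).
has_cxxabi() {
    list_cxxabi "$1" | grep -q -x -F "$2"
}
```

With the environment activated, `has_cxxabi "$CONDA_PREFIX/lib/libstdc++.so.6" CXXABI_1.3.15 || echo "missing"` would report whether the tag that ade4.so asks for is present.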

Collaborator Author

d4straub commented Mar 7, 2025

Currently we use the QIIME2 container "qiime2/amplicon:2024.10" in ampliseq, i.e. QIIME2 version 2024.10. In my opinion, we should have the same version in conda and in the container and avoid having different versions.


chan-98 commented Mar 7, 2025

Yeah. I am running the test with conda, and it works after downloading https://data.qiime2.org/distro/amplicon/qiime2-amplicon-2024.10-py310-linux-conda.yml into the nf-core/ampliseq/envs/ folder and adding the line conda "${projectDir}/envs/qiime2-amplicon-2024.10-py310-linux-conda.yml" to all the processes (and commenting out the error statements and if conditions in nf-core/ampliseq/subworkflows/ampliseq.nf, of course). Thanks everyone!
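
The steps described here could be scripted roughly as follows. This is a sketch, not the change that was made to the pipeline: the helper names and the pipeline-root argument are hypothetical, and the `qiime2_*.nf` glob assumes the QIIME2 processes live in modules/local under that naming, as in the listing in this thread.

```shell
#!/usr/bin/env bash
ENV_URL="https://data.qiime2.org/distro/amplicon/qiime2-amplicon-2024.10-py310-linux-conda.yml"
ENV_FILE="$(basename "$ENV_URL")"

# Download the env file into <pipeline root>/envs (needs network access).
fetch_env_yml() {
    mkdir -p "$1/envs" && curl -fsSL -o "$1/envs/$ENV_FILE" "$ENV_URL"
}

# Print the directive line to add to each QIIME2 process.
# Single quotes keep ${projectDir} literal so Nextflow, not bash, expands it.
conda_directive() {
    printf 'conda "${projectDir}/envs/%s"\n' "$ENV_FILE"
}

# List QIIME2 module files under <pipeline root> that have no conda line yet.
modules_without_conda() {
    grep -L -E '^[[:space:]]*conda[[:space:]]' "$1"/modules/local/qiime2_*.nf
}
```

Running `fetch_env_yml .` from the pipeline root, then `modules_without_conda .`, gives the list of files that still need the line printed by `conda_directive`.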


SPPearce commented Mar 7, 2025

Why not just use that yml as the environment.yml file for the module (rather than putting it in projectDir/envs)?


chan-98 commented Mar 7, 2025

@SPPearce Because QIIME2 is not a separate module. Since the env isn't required by every other file in modules/local (like the dada2, barrnap, etc. processes), I thought it wouldn't make sense 🥲

chandini.v@rBoardDev1:~/multiBoard$ ls public/nf-core/ampliseq/modules/local/
assignsh.nf                dada2_filtntrim.nf     filter_len.nf              format_taxresults.nf          phyloseq_inasv.nf           qiime2_classify.nf           qiime2_extract.nf             qiime2_train.nf           sidle_filttax.nf      summary_report.nf
barrnap.nf                 dada2_merge.nf         filter_ssu.nf              format_taxresults_kraken2.nf  phyloseq_intax.nf           qiime2_diversity_adonis.nf   qiime2_featuretable_group.nf  qiime2_tree.nf            sidle_in.nf           trunclen.nf
barrnapsummary.nf          dada2_quality.nf       filter_stats.nf            format_taxresults_sintax.nf   picrust.nf                  qiime2_diversity_alpha.nf    qiime2_filtersamples.nf       rename_raw_data_files.nf  sidle_indb.nf
combine_table.nf           dada2_rmchimera.nf     format_fastainput.nf       itsx_cutasv.nf                qiime2_alphararefaction.nf  qiime2_diversity_beta.nf     qiime2_inasv.nf               sbdiexport.nf             sidle_indbaligned.nf
cutadapt_summary.nf        dada2_splitregions.nf  format_pplacetax.nf        merge_stats.nf                qiime2_ancom_asv.nf         qiime2_diversity_betaord.nf  qiime2_inseq.nf               sbdiexportreannotate.nf   sidle_seqrecon.nf
cutadapt_summary_merge.nf  dada2_stats.nf         format_taxonomy.nf         metadata_all.nf               qiime2_ancom_tax.nf         qiime2_diversity_core.nf     qiime2_intax.nf               sidle_align.nf            sidle_tablerecon.nf
dada2_addspecies.nf        dada2_taxonomy.nf      format_taxonomy_qiime.nf   metadata_pairwise.nf          qiime2_ancombc_asv.nf       qiime2_export_absolute.nf    qiime2_intree.nf              sidle_dbextract.nf        sidle_taxrecon.nf
dada2_denoising.nf         filter_clusters.nf     format_taxonomy_sidle.nf   novaseq_err.nf                qiime2_ancombc_tax.nf       qiime2_export_relasv.nf      qiime2_seqfiltertable.nf      sidle_dbfilt.nf           sidle_treerecon.nf
dada2_err.nf               filter_codons.nf       format_taxonomy_sintax.nf  phyloseq.nf                   qiime2_barplot.nf           qiime2_export_reltax.nf      qiime2_tablefiltertaxa.nf     sidle_dbrecon.nf          sidle_trim.nf


SPPearce commented Mar 7, 2025

Ok, that is a lot of local modules! Yeah, then what you've done seems fine.


SPPearce commented Mar 7, 2025

I'd forgotten that it is common for local modules to just be a .nf file on its own.

3 participants