cached cwl.output.json contains docker file path references #1573

Open
@tschoonj

Hi all,

I am experimenting with cwl.output.json to get the results back from a CommandLineTool that runs in a Docker container. The first run works fine, but re-running the same workflow fails: cwltool correctly recognizes that the cached result can be used, but then chokes on the file path saved in cwl.output.json, which points into the randomly generated container directory used during the first run.
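To make the failure mode concrete, here is a minimal Python sketch that scans a cwl.output.json document for File paths lying outside a given output directory. It is only an illustration of the symptom, not cwltool's actual validation code; the function name `find_stale_paths` is hypothetical, and the two mount points `/cByOZc` and `/nnKPqR` are taken from the logs in this report.

```python
import json
from pathlib import PurePosixPath
from typing import List


def find_stale_paths(output_json: str, current_outdir: str) -> List[str]:
    """Return absolute File paths in a cwl.output.json document that are not under current_outdir."""
    outdir = PurePosixPath(current_outdir)
    stale: List[str] = []

    def visit(node):
        # Walk the JSON tree and inspect every object tagged as a CWL File.
        if isinstance(node, dict):
            if node.get("class") == "File" and "path" in node:
                path = PurePosixPath(node["path"])
                if path.is_absolute() and outdir not in path.parents:
                    stale.append(str(path))
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(json.loads(output_json))
    return stale


# Content written during the first run, when the cache directory was mounted at /cByOZc.
cached = '{"collection_file": {"path": "/cByOZc/ubuntu.sif", "class": "File"}}'

print(find_stale_paths(cached, "/nnKPqR"))  # second run: the old path is flagged
print(find_stale_paths(cached, "/cByOZc"))  # first run: nothing is flagged
```

The path was valid while the cache directory was mounted at `/cByOZc`, and only becomes a problem once a different random mount point is chosen for the cached re-run.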

Expected Behavior

Caching should work: the second run should reuse the cached output and complete successfully.

Actual Behavior

When retrying, I get the following error:

cwltool --outdir output --cachedir cache spike.cwl spike.yaml
INFO /usr/local/miniforge3/bin/cwltool 3.1.20211107152837
INFO Resolved 'spike.cwl' to 'file:///home/tom/gitlab/cwl-workflows/workflows/spike.cwl'
spike.cwl:8:3: Warning: checking item
                      Warning:   Field `class` contains undefined reference to
                      `http://commonwl.org/cwltool#Secrets`
INFO spike.cwl:8:3: Unknown hint http://commonwl.org/cwltool#Secrets
INFO [workflow ] start
INFO [workflow ] starting step arv_get
INFO [step arv_get] start
INFO [job arv_get] Using cached output in /home/tom/gitlab/cwl-workflows/workflows/cache/d04184a5b32119f4058d7e8fbc6ff511
ERROR Workflow error, try again with --debug for more information:
Output file path /cByOZc/ubuntu.sif must be within designated output directory (/nnKPqR) or an input file pass through.
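One conceivable way for a runner to handle this would be to rewrite the stale mount-point prefix so that it points at the cache directory before validating the outputs. The sketch below shows only that idea. It assumes the old mount point is known, for example because it was recorded next to the cache entry, which is a hypothetical arrangement; `remap_prefix` is not a cwltool function.

```python
from pathlib import PurePosixPath


def remap_prefix(path: str, old_root: str, new_root: str) -> str:
    """Rewrite path so that it is rooted at new_root instead of old_root."""
    # relative_to raises ValueError when path does not lie under old_root.
    relative = PurePosixPath(path).relative_to(old_root)
    return str(PurePosixPath(new_root) / relative)


# Cache directory from the log above; /cByOZc was its mount point in the first run.
cache_dir = "/home/tom/gitlab/cwl-workflows/workflows/cache/d04184a5b32119f4058d7e8fbc6ff511"
print(remap_prefix("/cByOZc/ubuntu.sif", "/cByOZc", cache_dir))
```

Raising on paths outside `old_root` keeps the sketch from silently rewriting unrelated paths, such as input files that are passed through.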

The initial run produced:

INFO /usr/local/miniforge3/bin/cwltool 3.1.20211107152837
INFO Resolved 'spike.cwl' to 'file:///home/tom/gitlab/cwl-workflows/workflows/spike.cwl'
spike.cwl:8:3: Warning: checking item
                      Warning:   Field `class` contains undefined reference to
                      `http://commonwl.org/cwltool#Secrets`
INFO spike.cwl:8:3: Unknown hint http://commonwl.org/cwltool#Secrets
INFO [workflow ] start
INFO [workflow ] starting step arv_get
INFO [step arv_get] start
INFO [job arv_get] Output of job will be cached in /home/tom/gitlab/cwl-workflows/workflows/cache/d04184a5b32119f4058d7e8fbc6ff511
INFO [job arv_get] /home/tom/gitlab/cwl-workflows/workflows/cache/d04184a5b32119f4058d7e8fbc6ff511$ docker \
    run \
    -i \
    --mount=type=bind,source=/home/tom/gitlab/cwl-workflows/workflows/cache/d04184a5b32119f4058d7e8fbc6ff511,target=/cByOZc \
    --mount=type=bind,source=/tmp/0n3iy0j2,target=/tmp \
    --workdir=/cByOZc \
    --read-only=true \
    --user=1002:1002 \
    --rm \
    --cidfile=/tmp/z5bv8n5s/20211207145621-197335.cid \
    --env=TMPDIR=/tmp \
    --env=HOME=/cByOZc \
    arv-cli:build-tar-fd145ede211e86f23f7aeab39e45de43 \
    arv-get-cwl
INFO [job arv_get] Max memory used: 47MiB
INFO [job arv_get] completed success
INFO [step arv_get] completed success
INFO [workflow ] completed success
{
    "collection_file": [
        {
            "class": "File",
            "basename": "ubuntu.sif",
            "location": "file:///home/tom/gitlab/cwl-workflows/workflows/output/ubuntu.sif",
            "checksum": "sha1$8a13313f5de5ace0d943ff7a3257fc83c0538829",
            "size": 27742208,
            "path": "/home/tom/gitlab/cwl-workflows/workflows/output/ubuntu.sif"
        }
    ]
}
INFO Final process status is success

Workflow Code

CommandLineTool arv-get.cwl:

cwlVersion: v1.2
class: CommandLineTool

requirements:
  DockerRequirement:
    dockerPull: arv-cli
  NetworkAccess:
    networkAccess: true
  InitialWorkDirRequirement:
    listing:
      - entryname: cwl.inputs.json
        entry: '{"inputs": $(inputs)}'

baseCommand:
  - arv-get-cwl

inputs:
  arvados_collection_locator: string
  arvados_api_token: string
  arvados_api_host: string

outputs:
  collection_file: File

The arv-get-cwl script within the container extracts the inputs from cwl.inputs.json and passes them to the arv-get command, then writes a cwl.output.json file containing the path of the downloaded file:

cat > ${outputfile} <<EOL
{
  "collection_file": {
    "path": "${download_destination}",
    "class": "File"
  }
}
EOL
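A possible workaround on the tool side might be to record the path relative to the working directory instead of the absolute in-container path, so that the file carries no reference to the random mount point. The sketch below builds such an output object in Python. The helper name `build_output_object` is hypothetical, and whether cwltool resolves a relative path correctly when reading the cache is an assumption I have not verified.

```python
import json
import posixpath


def build_output_object(download_destination: str, workdir: str) -> dict:
    """Build the cwl.output.json content with a path relative to workdir."""
    relative = posixpath.relpath(download_destination, start=workdir)
    return {"collection_file": {"class": "File", "path": relative}}


# /cByOZc stands in for the container working directory from the first run.
output = build_output_object("/cByOZc/ubuntu.sif", "/cByOZc")
print(json.dumps(output, indent=2))
```

In the real script the resulting JSON would be written to cwl.output.json in the working directory rather than printed.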

Workflow spike.cwl:

cwlVersion: v1.2
class: Workflow

$namespaces:
  cwltool: "http://commonwl.org/cwltool#"

hints:
  "cwltool:Secrets":
    secrets: [arvados_api_token]

requirements:
  InlineJavascriptRequirement: {}
  ScatterFeatureRequirement: {}
  StepInputExpressionRequirement: {}
  MultipleInputFeatureRequirement: {}

inputs:
  arvados_input_collection_locators: string[]
  arvados_output_collection_name: string
  arvados_api_host: string
  arvados_api_token: string

outputs:
  collection_file:
    type: File[]
    outputSource: arv_get/collection_file

steps:
  arv_get:
    run: arv-get.cwl
    scatter: arvados_collection_locator
    in:
      arvados_api_token: arvados_api_token
      arvados_api_host: arvados_api_host
      arvados_collection_locator: arvados_input_collection_locators
    out:
      - collection_file

Your Environment

  • cwltool version: 3.1.20211107152837

CC @jrandall
