add info on how outputBinding and glob works #353

Open
wants to merge 14 commits into base: main
71 changes: 71 additions & 0 deletions src/faq.md
@@ -447,6 +447,77 @@ The reference runner and several other CWL implementations support running
those Docker format containers using the Singularity engine. Directly
specifying a Singularity format container is not part of the CWL standards.

## How does glob work when describing output values?
The `outputBinding` field describes how to set the value of each output parameter. Its `glob` field specifies a pattern used to find files or directories relative to the output directory. The pattern used for `glob` must be either relative to the output directory, an absolute path to the output directory, or an absolute path to an input file.

CWL uses POSIX glob(3) pathname matching. Wildcards are allowed in the `glob` field and are useful when you don’t know the exact name of a file or directory in advance. The wildcard characters used for matching pathnames can be an asterisk `*`, a question mark `?`, or a range `[]`.
Contributor

The wildcard characters can either be an asterisk *, a question mark matching pathnames in directories, ?, or a range [].

I think this needs to be a comma.

Contributor

? does look like it works for a single character. Not sure about range, let me check.

Contributor

OK, so glob(3) documents the C library function, while glob(7) is the general Linux programming reference, so we can use the rules from glob(7): https://man7.org/linux/man-pages/man7/glob.7.html


If an array is used in the `glob` field, all files that match any pattern in the array are returned.
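
As a minimal sketch of the array form, the tool below creates three files and collects only two kinds of them. The file names (`report.txt`, `summary.csv`, `debug.log`) and the output name `reports` are hypothetical and chosen only for illustration.

```yaml
cwlVersion: v1.0
class: CommandLineTool

# Hypothetical command: creates three empty files in the output directory.
baseCommand: [touch, report.txt, summary.csv, debug.log]

inputs: []

outputs:
  reports:
    type: File[]
    outputBinding:
      # Each pattern is matched on its own; a file matching any of them is returned.
      glob: ["*.txt", "*.csv"]
```

With these patterns, `report.txt` and `summary.csv` should be collected, while `debug.log` matches neither pattern and is left out.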

In the example below, the `glob` field uses the `*` wildcard to return all outputs from the tool.

```
cwlVersion: v1.0
class: CommandLineTool
inputs:
  in1:
    type: File
    default:
      class: File
      path: /path/to/my/file
    inputBinding:
      position: 1

baseCommand: cat

outputs:
  my_output:
    # The output may contain both files and directories.
    type:
      type: array
      items: [Directory, File]
    outputBinding:
      # "*" matches every entry in the output directory.
      glob: "*"
```
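
The `?` and `[]` wildcards can narrow a match where `*` would be too broad. The fragment below is a sketch of an `outputs` section only, to be placed inside a `CommandLineTool`; it assumes the runner follows the glob(7) rules, and the `chunk_*.txt` file names and output names are hypothetical.

```yaml
outputs:
  single_character_chunks:
    type: File[]
    outputBinding:
      # "?" stands for exactly one character:
      # matches chunk_1.txt and chunk_a.txt, but not chunk_10.txt.
      glob: "chunk_?.txt"
  first_three_chunks:
    type: File[]
    outputBinding:
      # "[1-3]" stands for one character in the range 1 to 3:
      # matches chunk_1.txt, chunk_2.txt and chunk_3.txt only.
      glob: "chunk_[1-3].txt"
```

Quoting the patterns keeps YAML from interpreting `[` as the start of a list.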

Below is an example where the value of an input parameter (`output_name`) is used in `glob` to locate the output file.
```
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool

doc: |
  Merge multiple BAM files.

hints:
  ResourceRequirement:
    coresMin: 1
    ramMin: 20000
  DockerRequirement:
    dockerPull: quay.io/biocontainers/samtools:1.14--hb421002_0

baseCommand: ["samtools", "merge"]

inputs:
  - id: output_name
    doc: name of merged bam file
    type: string
    inputBinding:
      position: 1
  - id: bams
    doc: bam files to be merged
    type:
      type: array
      items: File
    inputBinding:
      position: 2

outputs:
  - id: bam_merged
    type: File
    outputBinding:
      # The expression evaluates to the file name passed in as output_name.
      glob: $(inputs.output_name)
```
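
To see how the expression resolves, consider a job file for the tool above. This is only a sketch: the file names `merged.bam`, `sample1.bam` and `sample2.bam` are hypothetical.

```yaml
# Hypothetical job input file for the samtools merge tool.
output_name: merged.bam
bams:
  - class: File
    path: sample1.bam
  - class: File
    path: sample2.bam
```

With these inputs the command line becomes `samtools merge merged.bam sample1.bam sample2.bam`, and `$(inputs.output_name)` evaluates to `merged.bam`, so `glob` looks for a file with that name in the output directory.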

## Debug JavaScript Expressions

You can use the <code>--js-console</code> option of <code>cwltool</code>, or you can try