MARS-seq v2 pre-processing pipeline with velocity

Version: 1.0.1

Introduction

nf-core/marsseq is a bioinformatics single-cell preprocessing pipeline for MARS-seq v2.0 experiments. MARS-seq is a plate-based technique that can be combined with FACS to study rare populations of cells. On top of the pre-existing pipeline, we have developed an RNA velocity workflow that can be used to study cell dynamics using STARsolo. To do so, we convert the raw FASTQ reads into 10X v2 format.

Workflow

Usage

Note: If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

To run the pipeline, you have to create the experiment metadata files (amplification batches, sequencing batches, and wells/cells) and a samplesheet (samplesheet.csv). We provide a test example here.

Next, you have to generate genome references that incorporate the ERCC spike-ins. References are downloaded from the GENCODE database.

nextflow run nf-core/marsseq \
 -profile <docker/singularity/.../institute> \
 --genome <mm10,mm9,GRCh38_v43> \
 --build_references \
 --input samplesheet.csv \
 --outdir <OUTDIR>

Now, you can run the pipeline using:

nextflow run nf-core/marsseq \
 -profile <docker/singularity/.../institute> \
 --genome <mm10,mm9,GRCh38_v43> \
 --input samplesheet.csv \
 --outdir <OUTDIR>

Warning: Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the -c Nextflow option, can be used to provide any configuration except for parameters; see docs.
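As a minimal sketch of the -params-file route, the parameters from the commands above can be kept in a YAML file. The file name params.yaml, the output directory "results", the genome choice and the docker profile are placeholder assumptions, and the launch command is only printed so the sketch runs without Nextflow installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write the pipeline parameters to a YAML file (all values are placeholders).
cat > params.yaml <<'EOF'
input: "samplesheet.csv"
outdir: "results"
genome: "mm10"
EOF

# Command that would launch the pipeline with that file; printed, not executed.
run_cmd="nextflow run nf-core/marsseq -profile docker -params-file params.yaml"
echo "$run_cmd"
```

Keeping parameters in a file like this makes a run easier to repeat than a long command line, while -c config files stay reserved for resource and executor settings.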

For more details and further functionality, please refer to the usage documentation and the parameter documentation.

Pipeline output

To see the results of an example test run with a full-size dataset, refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.

Credits

nf-core/marsseq was originally written by Martin Proks.

We thank the following people for their extensive assistance in the development of this pipeline:

Keren-Shaul, H., Kenigsberg, E., Jaitin, D.A. et al. MARS-seq2.0: an experimental and analytical pipeline for indexed sorting combined with single-cell RNA sequencing. Nat Protoc 14, 1841–1862 (2019). https://doi.org/10.1038/s41596-019-0164-4

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #marsseq channel (you can join with this invite).

Citations

If you use nf-core/marsseq for your analysis, please cite it using the following doi: 10.5281/zenodo.8063539

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

Code Snippets

"""
cut -f1-9,12- $read > $filename

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cut: \$( cut --version 2>&1 | sed -n 1p | sed 's/cut (GNU coreutils) //g' )
END_VERSIONS
"""
NextFlow From line 25 of sam/main.nf
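To illustrate what the field selection in the snippet above does: in a SAM record, columns 10 and 11 hold SEQ and QUAL, so cut -f1-9,12- keeps the alignment fields and the optional tags while dropping the read sequence and qualities. The record below is made up for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# A made-up SAM record: 11 mandatory fields followed by two optional tags.
sam_line=$(printf 'r1\t0\tchr1\t100\t42\t5M\t*\t0\t0\tACGTA\tIIIII\tNM:i:0\tAS:i:10')

# Keep fields 1-9 and 12 onward, i.e. drop SEQ (10) and QUAL (11).
trimmed=$(printf '%s\n' "$sam_line" | cut -f1-9,12-)

printf '%s\n' "$trimmed"
```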
"""
touch ${filename}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cut: \$( cut --version 2>&1 | sed -n 1p | sed 's/cut (GNU coreutils) //g' )
END_VERSIONS
"""
NextFlow From line 36 of sam/main.nf
"""
mkdir -p output/umi.tab/
mkdir -p output/offset.tab/
mkdir -p output/singleton_offset.tab/
mkdir -p output/QC/read_stats/
mkdir -p output/QC/read_stats_amp_batch/
mkdir -p output/QC/umi_stats/
mkdir -p output/QC/noffsets_per_umi_distrib/
mkdir -p output/QC/nreads_per_umi_distrib/
mkdir -p output/QC/umi_nuc_per_pos/
mkdir -p _debug/${meta.amp_batch}/

demultiplex.pl \\
    ${meta.amp_batch} \\
    ${meta.pool_barcode} \\
    $wells_cells \\
    $gene_intervals \\
    $spike_seq \\
    $oligos \\
    $read \\
    . \\
    $args

mv _debug output/
ln -s output output_tmp

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    demultiplex.pl: \$( demultiplex.pl --version )
END_VERSIONS
"""
"""
mkdir -p output/umi.tab/
mkdir -p output/offset.tab/
mkdir -p output/singleton_offset.tab/
mkdir -p output/QC/{read_stats,read_stats_amp_batch,umi_stats,noffsets_per_umi_distrib,nreads_per_umi_distrib,umi_nuc_per_pos}

touch output/umi.tab/${meta.amp_batch}.txt
touch output/offset.tab/${meta.amp_batch}.txt
touch output/singleton_offset.tab/${meta.amp_batch}.txt
touch output/QC/{read_stats,read_stats_amp_batch,umi_stats,noffsets_per_umi_distrib,nreads_per_umi_distrib,umi_nuc_per_pos}/${meta.amp_batch}.txt

mkdir -p output/_debug/${meta.amp_batch}/
touch output/_debug/${meta.amp_batch}/{offsets,UMIs}.txt

ln -s output output_tmp

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    demultiplex.pl: \$( demultiplex.pl --version )
END_VERSIONS
"""
"""
create_ercc_fasta.py --input $spikeins --output ercc.fa

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    ERCC_CREATE: \$( create_ercc_fasta.py --version )
END_VERSIONS
"""
NextFlow From line 22 of ercc/main.nf
"""
touch ercc.fa

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    ERCC_CREATE: \$( create_ercc_fasta.py --version )
END_VERSIONS
"""
NextFlow From line 32 of ercc/main.nf
"""
gunzip -f $reads
mkdir labeled_reads

extract_labels.pl \\
    $r1 \\
    $r2 \\
    $meta.id \\
    $seq_batches \\
    $oligos \\
    $amp_batches \\
    labeled_reads/$r1 \\
    labeled_reads/$qc \\
    .

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    extract_labels.pl: \$( extract_labels.pl --version )
END_VERSIONS
"""
NextFlow From line 30 of extract/main.nf
"""
mkdir labeled_reads
touch labeled_reads/${r1}
touch labeled_reads/${qc}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    extract_labels.pl: \$( extract_labels.pl --version )
END_VERSIONS
"""
NextFlow From line 55 of extract/main.nf
"""
mkdir raw_reads/
fastp \\
    -i ${reads[0]} \\
    -I ${reads[1]} \\
    -o raw_reads/${reads[0]} \\
    -O raw_reads/${reads[1]} \\
    --thread $task.cpus \\
    --disable_quality_filtering \\
    --disable_length_filtering \\
    --disable_adapter_trimming \\
    --disable_trim_poly_g \\
    --json ${meta.id}.fastp.json \\
    $args \\
    2> raw_reads/${meta.id}.fastp.log

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
END_VERSIONS
"""
"""
touch ${meta.id}.fastp.json
mkdir raw_reads/
touch raw_reads/000{1..3}.${reads[0]}
touch raw_reads/000{1..3}.${reads[1]}
touch raw_reads/${meta.id}.fastp.log

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
END_VERSIONS
"""
"""
prepare_pipeline.py \\
    --batch ${meta.id} \\
    --amp_batches $amp_batches \\
    --seq_batches $seq_batches \\
    --well_cells $well_cells \\
    --gtf $gtf \\
    --output .
cat $ercc_regions >> gene_intervals.txt
validate_data.py --input .

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    prepare_pipeline.py: \$( prepare_pipeline.py --version )
    validate_data.py: \$( validate_data.py --version )
END_VERSIONS
"""
NextFlow From line 38 of prepare/main.nf
"""
cat <<AMP_BATCH > amp_batches.txt
Amp_batch_ID\tSeq_batch_ID\tPool_barcode\tSpike_type\tSpike_dilution\tSpike_volume_ul\tExperiment_ID\tOwner\tDescription
AB339\tSB26\tTGAT\tERCC_mix1\t2.5e-05\t0.01\tTECH_ES\tHadas\tES#7_poolA
AMP_BATCH

cat <<SEQ_BATCHES > seq_batches.txt
Seq_batch_ID\tRun_name\tDate\tR1_design\tI5_design\tR2_design\tNotes
SB26\tsc_v3_Hadas_Diego_05042015\t150405\t5I.4P.51M\t7W.8R\t\tmm10
SEQ_BATCHES

cat <<WELLS_CELLS > wells_cells.txt
Well_ID\tWell_coordinates\tplate_ID\tSubject_ID\tAmp_batch_ID\tCell_barcode\tNumber_of_cells
TW1\tA1\t154\t35\tAB339\tCTATTCG\t1
WELLS_CELLS

cat <<GENE_INTERVALS > gene_intervals.txt
chrom\tstart\tend\tstrand\tgene_name
chr1\t3143476\t3144545\t1\t4933401J01Rik
GENE_INTERVALS

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    prepare_pipeline.py: \$( prepare_pipeline.py --version )
    validate_data.py: \$( validate_data.py --version )
END_VERSIONS
"""
NextFlow From line 57 of prepare/main.nf
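The stub tables above share the Amp_batch_ID column, which links each well to its amplification batch. The following sketch joins two of them with awk to show that relationship. It is only an illustration using the stub's example values and a reduced set of columns; it does not claim to reproduce what prepare_pipeline.py or validate_data.py do internally.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)

# Two of the stub tables, reduced to the columns needed here (tab-separated).
printf 'Amp_batch_ID\tSeq_batch_ID\tPool_barcode\n' >  "$workdir/amp_batches.txt"
printf 'AB339\tSB26\tTGAT\n'                       >> "$workdir/amp_batches.txt"
printf 'Well_ID\tAmp_batch_ID\tCell_barcode\n'     >  "$workdir/wells_cells.txt"
printf 'TW1\tAB339\tCTATTCG\n'                     >> "$workdir/wells_cells.txt"

# First file: remember the pool barcode of every amplification batch.
# Second file: print each well with the pool barcode of its batch and its cell barcode.
joined=$(awk -F'\t' -v OFS='\t' '
    FNR == 1 { next }                          # skip header rows
    NR == FNR { pool[$1] = $3; next }          # rows of amp_batches.txt
    ($2 in pool) { print $1, pool[$2], $3 }    # rows of wells_cells.txt
' "$workdir/amp_batches.txt" "$workdir/wells_cells.txt")

printf '%s\n' "$joined"
rm -r "$workdir"
```

A well whose Amp_batch_ID has no match in amp_batches.txt is silently skipped by this sketch; a real validation step would more likely report it.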
"""
qc_align.r \\
    $sam \\
    $labeled_qc

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    qc_align.r: \$( qc_align.r --version )
END_VERSIONS
"""
NextFlow From line 26 of align/main.nf
"""
touch _${labeled_qc.baseName}.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    qc_align.r: \$( qc_align.r --version )
END_VERSIONS
"""
NextFlow From line 38 of align/main.nf
"""
mkdir report_per_amp_batch/ rd/

qc_batch.r \\
    $meta.amp_batch \\
    $wells_cells \\
    $amp_batches \\
    $seq_batches \\
    $spike_concentrations \\
    $folder

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    qc_batch.r: \$( qc_batch.r --version )
END_VERSIONS
"""
NextFlow From line 25 of batch/main.nf
"""
mkdir report_per_amp_batch/ rd/
touch report_per_amp_batch/${meta.amp_batch}.pdf
touch rd/${meta.amp_batch}.rd

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    qc_batch.r: \$( qc_batch.r --version )
END_VERSIONS
"""
NextFlow From line 43 of batch/main.nf
"""
mkdir -p output/QC_reports
mkdir _temp/

export TMPDIR=/tmp
qc_report.r \\
    $wells_cells \\
    $amp_batches \\
    .

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    qc_report.r: \$( qc_report.r --version )
END_VERSIONS
"""
NextFlow From line 29 of report/main.nf
"""
touch amp_batches_summary.txt
touch amp_batches_stats.txt
mkdir -p output/QC_reports/
touch output/QC_reports/qc_${meta.id}.pdf

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    qc_report.r: \$( qc_report.r --version )
END_VERSIONS
"""
NextFlow From line 46 of report/main.nf
"""
check_samplesheet.py \\
    $samplesheet \\
    samplesheet.valid.csv

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | sed 's/Python //g')
END_VERSIONS
"""
"""
velocity.py convert \\
    --input $reads \\
    --output _temp/ \\
    --threads $task.cpus

for f in _temp/*R1*.fastq.gz; do cat \$f >> Undetermined_S0_R1_001.fastq.gz; done
for f in _temp/*R2*.fastq.gz; do cat \$f >> Undetermined_S0_R2_001.fastq.gz; done

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    velocity.py: \$( velocity.py --version )
END_VERSIONS
"""
NextFlow From line 24 of convert/main.nf
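The loops in the conversion snippet above append compressed files to one another with cat. This works because a gzip file may consist of several concatenated members, and decompression yields their contents in order. A small sketch with made-up file names and reads:

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)

# Two tiny gzip files standing in for per-chunk FASTQ outputs.
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/chunk1_R1.fastq.gz"
printf '@read2\nTTGA\n+\nIIII\n' | gzip -c > "$workdir/chunk2_R1.fastq.gz"

# Append the compressed files byte for byte, as the loop in the snippet does.
for f in "$workdir"/chunk*_R1.fastq.gz; do
    cat "$f" >> "$workdir/merged_R1.fastq.gz"
done

# Decompressing the merged file yields all records in order.
merged=$(gzip -dc "$workdir/merged_R1.fastq.gz")
printf '%s\n' "$merged"
rm -r "$workdir"
```

Because the loop uses >>, rerunning it in the same directory would append the reads a second time, so the target file should not already exist.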
"""
touch Undetermined_S0_R{1,2}_001.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    velocity.py: \$( velocity.py --version )
END_VERSIONS
"""
NextFlow From line 40 of convert/main.nf
"""
velocity.py whitelist \\
    --batch $meta.id \\
    --amp_batches $amp_batches \\
    --well_cells $well_cells

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    velocity.py: \$( velocity.py --version )
END_VERSIONS
"""
"""
touch whitelist.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    velocity.py: \$( velocity.py --version )
END_VERSIONS
"""
"""
wget $args $url -O $outfile

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    wget: \$( wget -V 2>&1 | grep "GNU Wget" | cut -d" " -f3 )
END_VERSIONS
"""
NextFlow From line 27 of wget/main.nf
"""
touch ${outfile}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    wget: \$( wget -V 2>&1 | grep "GNU Wget" | cut -d" " -f3 )
END_VERSIONS
"""
NextFlow From line 42 of wget/main.nf
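The wget entry in versions.yml aims to record the tool's version number by filtering its banner with grep and cut. The sketch below shows that extraction on a hard-coded sample banner; the banner text is an assumption standing in for real wget -V output, so the example runs without wget installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sample banner text (an assumption standing in for real `wget -V` output).
banner=$(printf 'GNU Wget 1.21.3 built on linux-gnu.\nsecond line of the banner')

# Keep the line that names the tool, then take the third space-separated field.
version=$(printf '%s\n' "$banner" | grep "GNU Wget" | cut -d" " -f3)

echo "wget: $version"
```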
"""
INDEX=`find -L ./ -name "*.rev.1.bt2" | sed "s/\\.rev.1.bt2\$//"`
[ -z "\$INDEX" ] && INDEX=`find -L ./ -name "*.rev.1.bt2l" | sed "s/\\.rev.1.bt2l\$//"`
[ -z "\$INDEX" ] && echo "Bowtie2 index files not found" 1>&2 && exit 1

bowtie2 \\
    -x \$INDEX \\
    $reads_args \\
    --threads $task.cpus \\
    $unaligned \\
    $args \\
    2> ${prefix}.bowtie2.log \\
    | samtools $samtools_command $args2 --threads $task.cpus -o ${prefix}.${extension} -

if [ -f ${prefix}.unmapped.fastq.1.gz ]; then
    mv ${prefix}.unmapped.fastq.1.gz ${prefix}.unmapped_1.fastq.gz
fi

if [ -f ${prefix}.unmapped.fastq.2.gz ]; then
    mv ${prefix}.unmapped.fastq.2.gz ${prefix}.unmapped_2.fastq.gz
fi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""
"""
touch ${prefix}.${extension}
touch ${prefix}.bowtie2.log
touch ${prefix}.unmapped_1.fastq.gz
touch ${prefix}.unmapped_2.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""
NextFlow From line 80 of align/main.nf
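The Bowtie 2 alignment snippet above finds the index prefix by locating the *.rev.1.bt2 file and stripping that suffix, since bowtie2 -x expects the shared prefix rather than a file name. A sketch of the same lookup on empty placeholder files follows; the directory layout and the name "genome" are assumptions for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
mkdir "$workdir/bowtie2"

# Empty placeholder files named like a small Bowtie 2 index (no real index data).
touch "$workdir"/bowtie2/genome.{1,2,3,4}.bt2 "$workdir"/bowtie2/genome.rev.{1,2}.bt2

# Locate the reverse-index file and strip its suffix to get the index prefix.
INDEX=$(cd "$workdir" && find -L ./ -name "*.rev.1.bt2" | sed 's/\.rev.1.bt2$//')

echo "$INDEX"
rm -r "$workdir"
```

The snippet repeats the search with the .bt2l suffix, which Bowtie 2 uses for large indexes, before giving up with an error.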
"""
mkdir bowtie2
bowtie2-build $args --threads $task.cpus $fasta bowtie2/${fasta.baseName}
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
END_VERSIONS
"""
"""
mkdir bowtie2
touch bowtie2/${fasta.baseName}.{1..4}.bt2
touch bowtie2/${fasta.baseName}.rev.{1,2}.bt2

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
END_VERSIONS
"""
"""
$command1 \\
    $args \\
    ${file_list.join(' ')} \\
    $command2 \\
    > ${prefix}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""
NextFlow From line 38 of cat/main.nf
"""
touch $prefix

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""
NextFlow From line 54 of cat/main.nf
"""
cutadapt \\
    --cores $task.cpus \\
    $args \\
    $trimmed \\
    $reads \\
    > ${prefix}.cutadapt.log
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cutadapt: \$(cutadapt --version)
END_VERSIONS
"""
"""
touch ${prefix}.cutadapt.log
touch ${trimmed}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cutadapt: \$(cutadapt --version)
END_VERSIONS
"""
"""
printf "%s %s\\n" $rename_to | while read old_name new_name; do
    [ -f "\${new_name}" ] || ln -s \$old_name \$new_name
done

fastqc \\
    $args \\
    --threads $task.cpus \\
    $renamed_files

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
END_VERSIONS
"""
"""
touch ${prefix}.html
touch ${prefix}.zip

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
END_VERSIONS
"""
"""
gunzip \\
    -f \\
    $args \\
    $archive

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    gunzip: \$(echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//')
END_VERSIONS
"""
NextFlow From line 23 of gunzip/main.nf
"""
touch $gunzip
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    gunzip: \$(echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//')
END_VERSIONS
"""
NextFlow From line 37 of gunzip/main.nf
"""
multiqc \\
    --force \\
    $args \\
    $config \\
    $extra_config \\
    .

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" )
END_VERSIONS
"""
"""
touch multiqc_data
touch multiqc_plots
touch multiqc_report.html

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" )
END_VERSIONS
"""
"""
STAR \\
    --genomeDir $index \\
    --readFilesIn $in_reads \\
    --runThreadN $task.cpus \\
    --outFileNamePrefix $prefix. \\
    $out_sam_type \\
    $ignore_gtf \\
    $solo_whitelist \\
    $attrRG \\
    $args

$mv_unsorted_bam

if [ -f ${prefix}.Unmapped.out.mate1 ]; then
    mv ${prefix}.Unmapped.out.mate1 ${prefix}.unmapped_1.fastq
    gzip ${prefix}.unmapped_1.fastq
fi
if [ -f ${prefix}.Unmapped.out.mate2 ]; then
    mv ${prefix}.Unmapped.out.mate2 ${prefix}.unmapped_2.fastq
    gzip ${prefix}.unmapped_2.fastq
fi

if [ -d ${prefix}.Solo.out ]; then
    # Backslashes still need to be escaped (https://github.com/nextflow-io/nextflow/issues/67)
    find ${prefix}.Solo.out \\( -name "*.tsv" -o -name "*.mtx" \\) -exec gzip {} \\;
fi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    star: \$(STAR --version | sed -e "s/STAR_//g")
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    gawk: \$(echo \$(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*\$//')
END_VERSIONS
"""
"""
touch ${prefix}Xd.out.bam
touch ${prefix}.Log.final.out
touch ${prefix}.Log.out
touch ${prefix}.Log.progress.out
touch ${prefix}.sortedByCoord.out.bam
mkdir ${prefix}.Solo.out
touch ${prefix}.toTranscriptome.out.bam
touch ${prefix}.Aligned.unsort.out.bam
touch ${prefix}.Aligned.sortedByCoord.out.bam
touch ${prefix}.unmapped_1.fastq.gz
touch ${prefix}.unmapped_2.fastq.gz
touch ${prefix}.tab
touch ${prefix}.SJ.out.tab
touch ${prefix}.ReadsPerGene.out.tab
touch ${prefix}.Chimeric.out.junction
touch ${prefix}.out.sam
touch ${prefix}.Signal.UniqueMultiple.str1.out.wig
touch ${prefix}.Signal.UniqueMultiple.str1.out.bg

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    star: \$(STAR --version | sed -e "s/STAR_//g")
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    gawk: \$(echo \$(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*\$//')
END_VERSIONS
"""
NextFlow From line 95 of align/main.nf
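One of the STAR index-building snippets that follow derives --genomeSAindexNbases from the total sequence length in the .fai index, using min(14, log2(GenomeLength)/2 - 1), the scaling the STAR manual recommends for small genomes. The sketch below evaluates the same expression for a given total length. The function name sa_index_nbases is hypothetical, plain awk is used instead of gawk, and the length is passed in directly rather than summed from a .fai file.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Compute min(14, log2(genome_length)/2 - 1), rounded by printf "%.0f",
# mirroring the gawk expression in the pipeline snippet.
sa_index_nbases() {
    awk -v len="$1" 'BEGIN {
        n = (log(len) / log(2)) / 2 - 1
        if (n > 14) n = 14
        printf "%.0f", n
    }'
}

sa_index_nbases 100000        # a small 100 kb genome
echo
sa_index_nbases 3000000000    # a roughly human-sized genome
echo
```

For the 100 kb case the expression evaluates to about 7.3 and rounds to 7, while large genomes are capped at 14.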
"""
mkdir star
STAR \\
    --runMode genomeGenerate \\
    --genomeDir star/ \\
    --genomeFastaFiles $fasta \\
    --sjdbGTFfile $gtf \\
    --runThreadN $task.cpus \\
    $memory \\
    $args

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    star: \$(STAR --version | sed -e "s/STAR_//g")
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    gawk: \$(echo \$(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*\$//')
END_VERSIONS
"""
"""
samtools faidx $fasta
NUM_BASES=`gawk '{sum = sum + \$2}END{if ((log(sum)/log(2))/2 - 1 > 14) {printf "%.0f", 14} else {printf "%.0f", (log(sum)/log(2))/2 - 1}}' ${fasta}.fai`

mkdir star
STAR \\
    --runMode genomeGenerate \\
    --genomeDir star/ \\
    --genomeFastaFiles $fasta \\
    --sjdbGTFfile $gtf \\
    --runThreadN $task.cpus \\
    --genomeSAindexNbases \$NUM_BASES \\
    $memory \\
    $args

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    star: \$(STAR --version | sed -e "s/STAR_//g")
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    gawk: \$(echo \$(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*\$//')
END_VERSIONS
"""
"""
mkdir star
touch star/Genome
touch star/Log.out
touch star/SA
touch star/SAindex
touch star/chrLength.txt
touch star/chrName.txt
touch star/chrNameLength.txt
touch star/chrStart.txt
touch star/exonGeTrInfo.tab
touch star/exonInfo.tab
touch star/geneInfo.tab
touch star/genomeParameters.txt
touch star/sjdbInfo.txt
touch star/sjdbList.fromGTF.out.tab
touch star/sjdbList.out.tab
touch star/transcriptInfo.tab

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    star: \$(STAR --version | sed -e "s/STAR_//g")
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    gawk: \$(echo \$(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*\$//')
END_VERSIONS
"""
URL: https://nf-co.re/marsseq
Copyright: Public Domain
License: None