baseCommand:
  - picard
  - AddOrReplaceReadGroups

doc: |-
  Assigns all the reads in a file to a single new read-group.

  <h3>Summary</h3>
  Many tools (Picard and GATK for example) require or assume the presence of at least one <code>RG</code> tag, defining a "read-group"
  to which each read can be assigned (as specified in the <code>RG</code> tag in the SAM record).
  This tool enables the user to assign all the reads in the INPUT to a single new read-group.
  For more information about read-groups, see the <a href='https://www.broadinstitute.org/gatk/guide/article?id=6472'>GATK Dictionary entry.</a>
  <br />
  This tool accepts as INPUT BAM and SAM files or URLs from the
  <a href="http://ga4gh.org/#/documentation">Global Alliance for Genomics and Health (GA4GH)</a>.
  <h3>Caveats</h3>
  The value of the tags must adhere (according to the <a href="https://samtools.github.io/hts-specs/SAMv1.pdf">SAM-spec</a>)
  with the regex <pre>#READGROUP_ID_REGEX</pre> (one or more characters from the ASCII range 32 through 126). In
  particular <code><Space></code> is the only non-printing character allowed.
  <br/>
  The program enables only the wholesale assignment of all the reads in the INPUT to a single read-group. If your file
  already has reads assigned to multiple read-groups, the original <code>RG</code> value will be lost.

  Documentation: http://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups

requirements:
  ShellCommandRequirement: {}
  InlineJavascriptRequirement:
    expressionLib:
      - |
        function generateGATK4BooleanValue(){
            /**
             * Boolean types in GATK 4 are expressed on the command line as --<PREFIX> "true"/"false",
             * so patch here
             */
            if(self === true || self === false){
                return self.toString()
            }
            return self;
        }

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/picard:2.22.2--0

inputs:
  - doc: Input file (BAM or SAM or a GA4GH url). [synonymous with -I]
    id: INPUT
    type: File
    inputBinding:
      prefix: INPUT=
      separate: false
  - doc: Read-Group library [synonymous with -LB]
    id: RGLB
    type: string
    inputBinding:
      prefix: RGLB=
      separate: false
  - doc: Read-Group platform (e.g. ILLUMINA, SOLID) [synonymous with -PL]
    id: RGPL
    type: string
    inputBinding:
      prefix: RGPL=
      separate: false
  - doc: Read-Group platform unit (eg. run barcode) [synonymous with -PU]
    id: RGPU
    type: string
    inputBinding:
      prefix: RGPU=
      separate: false
  - doc: Read-Group sample name [synonymous with -SM]
    id: RGSM
    type: string
    inputBinding:
      prefix: RGSM=
      separate: false
  - doc: Output filename (BAM or SAM)
    id: OUTPUT
    type: string
    inputBinding:
      prefix: OUTPUT=
      separate: false
  - doc: Reference sequence file. [synonymous with -R]
    id: REFERENCE_SEQUENCE
    type: File?
    inputBinding:
      prefix: REFERENCE_SEQUENCE=
      separate: false
  - doc: Optional sort order to output in. If not supplied OUTPUT is in the same order
      as INPUT. [synonymous with -SO]
    id: SORT_ORDER
    type:
      - 'null'
      - type: enum
        symbols:
          - unsorted
          - queryname
          - coordinate
          - duplicate
          - unknown
    inputBinding:
      prefix: SORT_ORDER=
      separate: false
  - doc: Read-Group sequencing center name [synonymous with -CN]
    id: RGCN
    type: string?
    inputBinding:
      prefix: RGCN=
      separate: false
  - doc: Read-Group description [synonymous with -DS]
    id: RGDS
    type: string?
    inputBinding:
      prefix: RGDS=
      separate: false
  - doc: Read-Group run date in Iso8601Date format [synonymous with -DT]
    id: RGDT
    type: string?
    inputBinding:
      prefix: RGDT=
      separate: false
  - doc: Read-Group flow order [synonymous with -FO]
    id: RGFO
    type: string?
    inputBinding:
      prefix: RGFO=
      separate: false
  - doc: Read-Group ID [synonymous with -ID]
    id: RGID
    type: string?
    inputBinding:
      prefix: RGID=
      separate: false
  - doc: Read-Group key sequence [synonymous with -KS]
    id: RGKS
    type: string?
    inputBinding:
      prefix: RGKS=
      separate: false
  - doc: Read-Group program group [synonymous with -PG]
    id: RGPG
    type: string?
    inputBinding:
      prefix: RGPG=
      separate: false
  - doc: Read-Group predicted insert size [synonymous with -PI]
    id: RGPI
    type: int?
    inputBinding:
      prefix: RGPI=
      separate: false
  - doc: Read-Group platform model [synonymous with -PM]
    id: RGPM
    type: string?
    inputBinding:
      prefix: RGPM=
      separate: false
  - doc: Control verbosity of logging.
    id: VERBOSITY
    type:
      - 'null'
      - type: enum
        symbols:
          - ERROR
          - WARNING
          - INFO
          - DEBUG
    inputBinding:
      prefix: VERBOSITY=
      separate: false
  - doc: Whether to suppress job-summary info on System.err.
    id: QUIET
    type: boolean?
    inputBinding:
      prefix: QUIET=
      valueFrom: $(generateGATK4BooleanValue())
      separate: false
  - doc: Validation stringency for all SAM files read by this program. Setting stringency
      to SILENT can improve performance when processing a BAM file in which variable-length
      data (read, qualities, tags) do not otherwise need to be decoded.
    id: VALIDATION_STRINGENCY
    type:
      - 'null'
      - type: enum
        symbols:
          - STRICT
          - LENIENT
          - SILENT
    inputBinding:
      prefix: VALIDATION_STRINGENCY=
      separate: false
  - doc: Compression level for all compressed files created (e.g. BAM and VCF).
    id: COMPRESSION_LEVEL
    type: int?
    inputBinding:
      prefix: COMPRESSION_LEVEL=
      separate: false
  - doc: When writing files that need to be sorted, this will specify the number of
      records stored in RAM before spilling to disk. Increasing this number reduces
      the number of file handles needed to sort the file, and increases the amount of
      RAM needed.
    id: MAX_RECORDS_IN_RAM
    type: int?
    inputBinding:
      prefix: MAX_RECORDS_IN_RAM=
      separate: false
  - doc: Use the JDK Deflater instead of the Intel Deflater for writing compressed output
      [synonymous with -use_jdk_deflater]
    id: USE_JDK_DEFLATER
    type: boolean?
    inputBinding:
      prefix: USE_JDK_DEFLATER=
      separate: false
      valueFrom: $(generateGATK4BooleanValue())
  - doc: Use the JDK Inflater instead of the Intel Inflater for reading compressed input
      [synonymous with -use_jdk_inflater]
    id: USE_JDK_INFLATER
    type: boolean?
    inputBinding:
      prefix: USE_JDK_INFLATER=
      separate: false
      valueFrom: $(generateGATK4BooleanValue())
  - doc: Whether to create a BAM index when writing a coordinate-sorted BAM file.
    id: CREATE_INDEX
    type: boolean?
    inputBinding:
      prefix: CREATE_INDEX=
      valueFrom: $(generateGATK4BooleanValue())
      separate: false
  - doc: 'Whether to create an MD5 digest for any BAM or FASTQ files created.'
    id: CREATE_MD5_FILE
    type: boolean?
    inputBinding:
      prefix: CREATE_MD5_FILE=
      valueFrom: $(generateGATK4BooleanValue())
      separate: false
  - doc: Google Genomics API client_secrets.json file path.
    id: GA4GH_CLIENT_SECRETS
    type: File?
    inputBinding:
      prefix: GA4GH_CLIENT_SECRETS=
      separate: false

arguments:
  - TMP_DIR=$(runtime.tmpdir)

baseCommand:
  - picard
  - CreateSequenceDictionary

doc: |-
  Create a SAM/BAM file from a fasta containing reference sequence. The output SAM file contains a header but no
  SAMRecords, and the header contains only sequence records.

requirements:
  ShellCommandRequirement: {}
  InitialWorkDirRequirement:
    listing:
      - $(inputs.REFERENCE)
  InlineJavascriptRequirement:
    expressionLib:
      - |
        function generateGATK4BooleanValue(){
            /**
             * Boolean types in GATK 4 are expressed on the command line as --<PREFIX> "true"/"false",
             * so patch here
             */
            if(self === true || self === false){
                return self.toString()
            }
            return self;
        }

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/picard:2.22.2--0

inputs:
  - doc: Input reference fasta or fasta.gz [synonymous with -R]
    id: REFERENCE
    type: File
    inputBinding:
      valueFrom: REFERENCE=$(self.basename)
  - doc: Put into AS field of sequence dictionary entry if supplied [synonymous with
      -AS]
    id: GENOME_ASSEMBLY
    type: string?
    inputBinding:
      prefix: GENOME_ASSEMBLY=
      separate: false
  - doc: Put into UR field of sequence dictionary entry. If not supplied, input reference
      file is used [synonymous with -UR]
    id: URI
    type: string?
    inputBinding:
      prefix: URI=
      separate: false
  - doc: Put into SP field of sequence dictionary entry [synonymous with -SP]
    id: SPECIES
    type: string?
    inputBinding:
      prefix: SPECIES=
      separate: false
  - doc: Make sequence name the first word from the > line in the fasta file. By default
      the entire contents of the > line is used, excluding leading and trailing whitespace.
    id: TRUNCATE_NAMES_AT_WHITESPACE
    type: boolean?
    inputBinding:
      prefix: TRUNCATE_NAMES_AT_WHITESPACE=
      valueFrom: $(generateGATK4BooleanValue())
      separate: false
  - doc: Stop after writing this many sequences. For testing.
    id: NUM_SEQUENCES
    type: int?
    inputBinding:
      prefix: NUM_SEQUENCES=
      separate: false
  - doc: "Optional file containing the alternative names for the contigs. Tools may
      use this information to consider different contig notations as identical (e.g:
      'chr1' and '1'). The alternative names will be put into the appropriate @AN
      annotation for each contig. No header. First column is the original name, the
      second column is an alternative name. One contig may have more than one alternative
      name. [synonymous with -AN]"
    id: ALT_NAMES
    type: File?
    inputBinding:
      prefix: ALT_NAMES=
      separate: false
  - doc: Control verbosity of logging.
    id: VERBOSITY
    type:
      - 'null'
      - type: enum
        symbols:
          - ERROR
          - WARNING
          - INFO
          - DEBUG
    inputBinding:
      prefix: VERBOSITY=
      separate: false
  - doc: Whether to suppress job-summary info on System.err.
    id: QUIET
    type: boolean?
    inputBinding:
      prefix: QUIET=
      valueFrom: $(generateGATK4BooleanValue())
      separate: false
  - doc: Validation stringency for all SAM files read by this program. Setting stringency
      to SILENT can improve performance when processing a BAM file in which variable-length
      data (read, qualities, tags) do not otherwise need to be decoded.
    id: VALIDATION_STRINGENCY
    type:
      - 'null'
      - type: enum
        symbols:
          - STRICT
          - LENIENT
          - SILENT
    inputBinding:
      prefix: VALIDATION_STRINGENCY=
      separate: false
  - doc: Compression level for all compressed files created (e.g. BAM and VCF).
    id: COMPRESSION_LEVEL
    type: int?
    inputBinding:
      prefix: COMPRESSION_LEVEL=
      separate: false
  - doc: When writing files that need to be sorted, this will specify the number of
      records stored in RAM before spilling to disk. Increasing this number reduces
      the number of file handles needed to sort the file, and increases the amount of
      RAM needed.
    id: MAX_RECORDS_IN_RAM
    type: int?
    inputBinding:
      prefix: MAX_RECORDS_IN_RAM=
      separate: false
  - doc: Use the JDK Deflater instead of the Intel Deflater for writing compressed output
      [synonymous with -use_jdk_deflater]
    id: USE_JDK_DEFLATER
    type: boolean?
    inputBinding:
      prefix: USE_JDK_DEFLATER=
      separate: false
      valueFrom: $(generateGATK4BooleanValue())
  - doc: Use the JDK Inflater instead of the Intel Inflater for reading compressed input
      [synonymous with -use_jdk_inflater]
    id: USE_JDK_INFLATER
    type: boolean?
    inputBinding:
      prefix: USE_JDK_INFLATER=
      separate: false
      valueFrom: $(generateGATK4BooleanValue())
  - doc: Whether to create a BAM index when writing a coordinate-sorted BAM file.
    id: CREATE_INDEX
    type: boolean?
    inputBinding:
      prefix: CREATE_INDEX=
      valueFrom: $(generateGATK4BooleanValue())
      separate: false
  - doc: 'Whether to create an MD5 digest for any BAM or FASTQ files created.'
    id: CREATE_MD5_FILE
    type: boolean?
    inputBinding:
      prefix: CREATE_MD5_FILE=
      valueFrom: $(generateGATK4BooleanValue())
      separate: false
  - doc: Google Genomics API client_secrets.json file path.
    id: GA4GH_CLIENT_SECRETS
    type: File?
    inputBinding:
      prefix: GA4GH_CLIENT_SECRETS=
      separate: false

arguments:
  - TMP_DIR=$(runtime.tmpdir)
  - OUTPUT=$(inputs.REFERENCE.nameroot).dict

A set of command line tools for manipulating high-throughput sequencing (HTS) data in formats such as SAM/BAM/CRAM and VCF. Available as a standalone program or within the GATK4 program.
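
The tool descriptions above bind each input with a `KEY=` prefix and `separate: false`, so every supplied value becomes a single `KEY=VALUE` token, and booleans are turned into the strings "true"/"false". The Python sketch below imitates that assembly for illustration only. The function name `render_picard_args` and the sample job values are hypothetical; a real CWL runner also stages files, resolves `File` inputs to paths, and may order arguments differently from the plain dictionary order used here.

```python
from typing import Any, Dict, List, Optional


def render_picard_args(
    tool: str, job: Dict[str, Any], tmpdir: Optional[str] = None
) -> List[str]:
    """Build a `picard <tool> KEY=VALUE ...` argument list from job inputs.

    Hypothetical helper: it mirrors the `prefix: KEY=` / `separate: false`
    bindings, not the behaviour of any particular CWL runner.
    """
    args = ["picard", tool]
    for key, value in job.items():
        if value is None:
            # Optional inputs that are left unset add nothing to the command.
            continue
        if isinstance(value, bool):
            # Same idea as generateGATK4BooleanValue(): emit "true"/"false".
            value = "true" if value else "false"
        # Prefix and value are joined into one token (separate: false).
        args.append(f"{key}={value}")
    if tmpdir is not None:
        # Corresponds to the fixed TMP_DIR=$(runtime.tmpdir) argument.
        args.append(f"TMP_DIR={tmpdir}")
    return args


if __name__ == "__main__":
    # Sample values are made up for illustration.
    job = {
        "INPUT": "sample.bam",
        "RGLB": "lib1",
        "RGPL": "ILLUMINA",
        "RGPU": "unit1",
        "RGSM": "sample1",
        "OUTPUT": "sample.rg.bam",
        "CREATE_INDEX": True,
        "RGCN": None,
    }
    print(" ".join(render_picard_args("AddOrReplaceReadGroups", job, tmpdir="/tmp")))
```

Keeping the result as a list of tokens rather than one string avoids quoting problems if a value such as a read-group description contains spaces.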