Barcode Mapping and Read Extraction Workflow with bcmap

Version: v1.2.1

bcmap

Maps barcodes to a reference genome and returns genomic windows from which the barcoded reads most likely originate. Each window is assessed with a quality score representing the trustworthiness of the mapping. A barcode index is constructed alongside the mapping and can be used to quickly retrieve all reads belonging to a barcode. For ease of use we provide a Snakemake workflow to extract all reads from user-defined regions of interest.

Prerequisites

  • gcc version 7.2.0

Installation

git clone https://github.com/kehrlab/bcmap.git
cd bcmap
make

Data requirements

  • Paired-end Linked-reads

  • Barcodes are stored in the BX:Z: tag of the read IDs

  • Sorted by barcode (use e.g. bcctools (only for 10x Genomics linked-reads) or samtools)

To trim, correct, and sort barcodes with bcctools, use the following command in the bcctools folder:

./script/run_bcctools -f fastq first.fq.gz second.fq.gz
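The data requirements above can be checked before mapping. The following is a minimal Python sketch, not part of bcmap: it assumes FASTQ headers of the form `@read_id BX:Z:<barcode>` and tests only that reads sharing a barcode are contiguous, which any sort by barcode guarantees. The function names are hypothetical.

```python
import re
from typing import Iterable, Iterator, Optional

# Assumed header layout: "@read_id BX:Z:<barcode>", e.g. "@r1 BX:Z:AACATCGCAAACAGTA".
BX_TAG = re.compile(r"BX:Z:(\S+)")


def barcode_of(header: str) -> Optional[str]:
    """Return the barcode stored in the BX:Z: tag of a FASTQ header, or None."""
    match = BX_TAG.search(header)
    return match.group(1) if match else None


def fastq_headers(lines: Iterable[str]) -> Iterator[str]:
    """Yield the header line (every fourth line) of 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield line.rstrip("\n")


def is_grouped_by_barcode(headers: Iterable[str]) -> bool:
    """True if every barcode occupies exactly one contiguous run of reads."""
    seen = set()
    previous = object()  # sentinel that equals no barcode
    for header in headers:
        barcode = barcode_of(header)
        if barcode != previous:
            if barcode in seen:
                return False  # barcode reappears after a different one
            seen.add(barcode)
            previous = barcode
    return True
```

For an uncompressed file, `is_grouped_by_barcode(fastq_headers(open("first.fq")))` would run the check; gzipped input would first need to be opened with `gzip.open(path, "rt")`.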

Commands

For detailed information on arguments and options, run:

./bcmap [command] --help

index

Builds a minimized open-addressing k-mer index of the reference genome. The index is required to run "map".

./bcmap index reference.fa [options]

map

Maps the barcodes of the provided read files to the reference and creates a barcode index of the read files, which can be used to quickly retrieve all reads of a given barcode.

./bcmap map readfile1.fastq readfile2.fastq [options]

Content of the output BED file:

  • chromosome startposition endposition barcode mapping_score
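The five columns listed above can be read into typed records for downstream filtering. This is a minimal Python sketch under two assumptions: the columns are whitespace-separated, and the mapping score parses as a number. The `Window` type and the function names are hypothetical and not part of bcmap.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Window(NamedTuple):
    chromosome: str
    start: int
    end: int
    barcode: str
    score: float


def parse_windows(lines: Iterable[str]) -> Iterator[Window]:
    """Parse lines of the form: chromosome start end barcode mapping_score."""
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or incomplete lines
        chrom, start, end, barcode, score = fields[:5]
        yield Window(chrom, int(start), int(end), barcode, float(score))


def windows_above(windows: Iterable[Window], threshold: float) -> List[Window]:
    """Keep only windows whose mapping score reaches the threshold."""
    return [w for w in windows if w.score >= threshold]
```

With a results file on disk, `windows_above(parse_windows(open("results.bed")), threshold)` would return the windows that pass a chosen score threshold.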

plot

Bcmap returns an output.hist file that can be plotted using plot_score_histogram.py, resulting in a histogram of mapping scores. To create a set of mappings with very high precision (at the cost of some recall), set the score threshold at the local minimum between the two peaks of the histogram. A lower threshold yields better recall at the cost of precision; a higher threshold is not recommended.
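The threshold choice described above can also be approximated programmatically. The sketch below is illustrative only: the layout of output.hist is not described here, so the function works on a plain list of per-bin counts that the reader must extract, and it returns a bin index rather than a score. `valley_between_peaks` is a hypothetical helper, and a noisy histogram may need smoothing first.

```python
from typing import List, Sequence


def valley_between_peaks(counts: Sequence[int]) -> int:
    """Return the index of the lowest bin between the two highest local maxima."""
    last = len(counts) - 1
    peaks: List[int] = [
        i
        for i in range(len(counts))
        if (i == 0 or counts[i] > counts[i - 1])
        and (i == last or counts[i] >= counts[i + 1])
    ]
    if len(peaks) < 2:
        raise ValueError("histogram does not have two peaks")
    # Keep the two tallest peaks and order them from left to right.
    left, right = sorted(sorted(peaks, key=lambda i: counts[i], reverse=True)[:2])
    # Two local maxima are never adjacent, so this range is non-empty.
    return min(range(left + 1, right), key=lambda i: counts[i])
```

Which score a returned bin index corresponds to depends on how the histogram was binned, so the result should be compared against the plot before it is used as a threshold.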

get

Returns all reads of the given barcodes. Barcodes can be provided directly as an argument or in a file.

./bcmap get readfile1.fastq readfile2.fastq Barcodes [options]
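A barcode file for "get" can be derived from the mapping results, for example by selecting every barcode whose window overlaps a region of interest. The Python sketch below is one way to do this and is not the workflow's own implementation. It assumes whitespace-separated columns, half-open BED-style intervals, and a barcode file with one barcode per line; the function names are hypothetical.

```python
from typing import Iterable, List


def barcodes_in_region(
    bed_lines: Iterable[str], chromosome: str, start: int, end: int
) -> List[str]:
    """Barcodes whose window overlaps [start, end) on the given chromosome."""
    selected: List[str] = []
    seen = set()
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or incomplete lines
        chrom, w_start, w_end, barcode = fields[0], int(fields[1]), int(fields[2]), fields[3]
        overlaps = chrom == chromosome and w_start < end and start < w_end
        if overlaps and barcode not in seen:
            seen.add(barcode)  # report each barcode once, in file order
            selected.append(barcode)
    return selected


def write_barcode_file(barcodes: Iterable[str], path: str) -> None:
    """Write one barcode per line (assumed input layout for 'get')."""
    with open(path, "w") as handle:
        for barcode in barcodes:
            handle.write(barcode + "\n")
```

The written file would then take the place of the `Barcodes` argument in the command above.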

Example

This small example demonstrates how to use bcmap and allows you to check if it is properly installed. Navigate to the bcmap folder and run the commands listed below.

# building the index for chr21.fa
./bcmap index example/chr21.fa -o example/Index
# mapping the reads of readfile 1 and 2 to chromosome 21
./bcmap map example/readfile.1.fq example/readfile.2.fq -i example/Index -r example/ReadIndex -o example/results.bed
# extracting the first barcode from the results
awk '{if(NR==1) print($4)}' example/results.bed > example/FirstBarcode.txt
# extracting all reads belonging to the first barcode
./bcmap get example/readfile.1.fq example/readfile.2.fq example/FirstBarcode.txt -r example/ReadIndex -o example/readsOfFirstBarcode
# extracting reads of barcode AACATCGCAAACAGTA
./bcmap get example/readfile.1.fq example/readfile.2.fq AACATCGCAAACAGTA -r example/ReadIndex -o example/readsOfAACATCGCAAACAGTA
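The same example can be scripted, which is convenient as an installation smoke test. This Python sketch only assembles the commands shown above, using the flags that appear in them, and prints them by default. The helper names are hypothetical, and the commands run only when `dry_run=False` is passed from inside the bcmap folder.

```python
import subprocess
from typing import List


def example_commands(bcmap: str = "./bcmap", folder: str = "example") -> List[List[str]]:
    """Build the index, map, and get commands of the example as argument lists."""
    reads = [f"{folder}/readfile.1.fq", f"{folder}/readfile.2.fq"]
    index = f"{folder}/Index"
    read_index = f"{folder}/ReadIndex"
    barcode = "AACATCGCAAACAGTA"
    return [
        [bcmap, "index", f"{folder}/chr21.fa", "-o", index],
        [bcmap, "map", *reads, "-i", index, "-r", read_index, "-o", f"{folder}/results.bed"],
        [bcmap, "get", *reads, barcode, "-r", read_index, "-o", f"{folder}/readsOf{barcode}"],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failure
```

Calling `run_all(example_commands())` prints the three commands without executing anything.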

Code Snippets

shell:
    "../bcmap index {input.reference} -o {params.index_name}"

shell:
    "../bcmap map {input.readfile1} {input.readfile2} -i {input.index} -o {output.barcode_index} -r {output.readfile_index} -t {threads}"

shell:
    """awk '{{if({params.conditions}) print($4)}}' {input.barcode_index} > {output}"""

shell:
    "../bcmap get {input.readfile1} {input.readfile2} {input.barcodes} -o {params.output_prefix} -r {input.readfile_index}"