Snakemake pipeline for quantitation of single-cell data using salmon alevin
This is the template for a new Snakemake workflow. Replace this text with a comprehensive description covering the purpose and domain.

Insert your code into the respective folders, i.e. `scripts`, `rules`, and `envs`. Define the entry point of the workflow in the `Snakefile` and the main configuration in the `config.yaml` file.
Authors
- Kevin Rue-Albrecht (@kevinrue)
Usage
If you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this (original) repository and, if available, its DOI (see above).
Step 1: Obtain a copy of this workflow
- Create a new GitHub repository using this workflow as a template.
- Clone the newly created repository to your local system, into the place where you want to perform the data analysis.
Step 2: Configure workflow
Configure the workflow according to your needs by editing the files in the `config/` folder. Adjust `config.yaml` to configure the workflow execution, and `samples.tsv` to specify your sample setup.
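As a rough illustration, the two files might look like the sketch below. All field names here are assumptions for illustration only; the authoritative schema is whatever ships in this repository's `config/` folder:

```yaml
# config.yaml (hypothetical sketch, not the workflow's actual schema)
samples: config/samples.tsv        # path to the sample sheet
salmon_index: resources/index      # prebuilt salmon index directory
tgmap: resources/txp2gene.tsv      # transcript-to-gene mapping for --tgMap
```

`samples.tsv` would typically be a tab-separated sample sheet with one row per sample, pointing to the paired FASTQ files (e.g. columns `sample`, `fastq1`, `fastq2`).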
Step 3: Install Snakemake
Install Snakemake using conda:

```
conda create -c bioconda -c conda-forge -n snakemake snakemake
```

For installation details, see the instructions in the Snakemake documentation.
Step 4: Execute workflow
Activate the conda environment:

```
conda activate snakemake
```

Test your configuration by performing a dry-run via

```
snakemake --use-conda -n
```

Execute the workflow locally via

```
snakemake --use-conda --cores $N
```

using `$N` cores, or run it in a cluster environment via

```
snakemake --use-conda --cluster qsub --jobs 100
```

or

```
snakemake --use-conda --drmaa --jobs 100
```

If you want to fix not only the software stack but also the underlying OS, use

```
snakemake --use-conda --use-singularity
```

in combination with any of the modes above. See the Snakemake documentation for further details.
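Instead of repeating these flags on every invocation, they can also be collected in a Snakemake profile, where each command-line option becomes a key in a `config.yaml`. A minimal sketch (the profile name and values are assumptions, not part of this repository):

```yaml
# ~/.config/snakemake/qsub/config.yaml  (hypothetical profile)
use-conda: true
cluster: qsub
jobs: 100
```

which would then be invoked via `snakemake --profile qsub`.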
Step 5: Investigate results
After successful execution, you can create a self-contained interactive HTML report with all results via:

```
snakemake --report report.html
```

This report can, e.g., be forwarded to your collaborators. An example (using some trivial test data) can be seen here.
Step 6: Commit changes
Whenever you change something, don't forget to commit the changes back to your GitHub copy of the repository:

```
git commit -a
git push
```
Step 7: Obtain updates from upstream
Whenever you want to synchronize your workflow copy with new developments from upstream, do the following.

- Once, register the upstream repository in your local copy:

  ```
  git remote add -f upstream git@github.com:snakemake-workflows/snakemake_alevin_quant.git
  ```

  or

  ```
  git remote add -f upstream https://github.com/snakemake-workflows/snakemake_alevin_quant.git
  ```

  if you have not set up ssh keys.

- Update the upstream version: `git fetch upstream`.
- Create a diff with the current version: `git diff HEAD upstream/master workflow > upstream-changes.diff`.
- Investigate the changes: `vim upstream-changes.diff`.
- Apply the modified diff via: `git apply upstream-changes.diff`.
- Carefully check whether you need to update the config files: `git diff HEAD upstream/master config`. If so, do it manually, and only where necessary, since you would otherwise likely overwrite your settings and samples.
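The sync steps above can be rehearsed end to end with two local repositories, so you can see what the diff-and-apply flow does before touching your real analysis copy. This is a self-contained sketch: the temporary paths, the stand-in "upstream" repository, and the `master` branch name are all assumptions for the illustration.

```shell
set -eu
tmp=$(mktemp -d)

# A stand-in "upstream" repository containing a workflow/ directory
git init -q -b master "$tmp/upstream"
cd "$tmp/upstream"
mkdir workflow
echo 'rule all: ...' > workflow/Snakefile
git add workflow
git -c user.email=demo@example.org -c user.name=demo commit -qm "add workflow"

# Your analysis copy, cloned before upstream moved on
git clone -q "$tmp/upstream" "$tmp/copy"

# Upstream gains a new commit after your clone
echo '# new upstream rule' >> workflow/Snakefile
git add workflow
git -c user.email=demo@example.org -c user.name=demo commit -qm "upstream update"

# Now run the steps from this section inside your copy
cd "$tmp/copy"
git remote add -f upstream "$tmp/upstream" > /dev/null 2>&1
git diff HEAD upstream/master workflow > upstream-changes.diff
git apply upstream-changes.diff
grep "new upstream rule" workflow/Snakefile
```

After `git apply`, the upstream change is present in your working tree but not committed, which is exactly the point: you review and commit it yourself.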
Step 8: Contribute back
In case you have also changed or added steps, please consider contributing them back to the original repository:

- Fork the original repo to a personal or lab account.
- Clone the fork to your local system, to a different place than where you ran your analysis.
- Copy the modified files from your analysis to the clone of your fork, e.g., `cp -r workflow path/to/fork`. Make sure to not accidentally copy config file contents or sample sheets. Instead, manually update the example config files if necessary.
- Commit and push your changes to your fork.
- Create a pull request against the original repository.
Testing
Test cases are in the subfolder `.test`. They are automatically executed via continuous integration with GitHub Actions.
Code Snippets
```
shell:
    """
    {DATETIME} > {log.time} &&
    rm -rf results/alevin/{wildcards.sample} &&
    salmon alevin -l ISR -i {input.index} \
        -1 {input.fastq1} -2 {input.fastq2} \
        -o results/alevin/{wildcards.sample} -p {params.threads} --tgMap {input.tgmap} \
        --chromium --dumpFeatures \
        {params.cells_option} \
        2> {log.err} > {log.out} &&
    {DATETIME} >> {log.time}
    """
```
```
script:
    "../scripts/barcode_rank.R"
```
`scripts/barcode_rank.R`:

```
message("Started")

#
# Manage script inputs
#

quants_mat_file <- snakemake@input[["quants"]]
plot_file <- snakemake@output[[1]]

#
# Manage R packages
#

library(tidyverse)
library(tximport)
library(DelayedMatrixStats)
library(cowplot)
library(sessioninfo)

#
# Main script
#

txi <- tximport::tximport(files = quants_mat_file, type = "alevin")

plot <- tibble(
  barcode = colnames(txi$counts),
  total = DelayedMatrixStats::colSums2(txi$counts)
) %>%
  arrange(desc(total)) %>%
  mutate(rank = row_number()) %>%
  ggplot() +
  geom_point(aes(rank, total)) +
  scale_y_log10() +
  scale_x_log10() +
  theme_cowplot()

ggsave(filename = plot_file, plot = plot)

#
# session info
#

sessioninfo::session_info()

message("Completed")
```