Snakemake pipeline to detect novel lncRNA

Snakemake Workflow for detection of lncRNA

This is a snakemake pipeline to differentiate lncRNAs from mRNAs.

The pipeline takes samples in either fasta format or fastq format as input.

If the samples are paired-end, the pipeline expects files with the suffixes 'r_1.fq.gz' and 'r_2.fq.gz'. If the samples are single-end reads, it expects the suffix 'fq.gz'. It also accepts '.fa' reads. Whether your samples are paired-end, single-end, or fasta, list the sample names in samples.tsv without the suffix.

You can change the name of the samples file (samples.tsv) by editing the config file. You will also need to set the PAIRED variable in the config file to either TRUE or FALSE.
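The naming convention above can be sketched in a few lines of Python. This is only an illustration, not code from the pipeline: the helper names are hypothetical, samples.tsv is assumed to hold one sample name per line, and the sample name is assumed to be joined to its suffix with a dot.

```python
from pathlib import Path


def read_sample_names(tsv_path):
    """Return the non-empty lines of a samples file as sample names.

    Assumption: the file lists one sample name per line.
    """
    lines = Path(tsv_path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def expected_inputs(sample, paired, fasta=False):
    """List the input file names expected for one sample name.

    Assumption: the sample name and the suffix are joined with a dot.
    """
    if fasta:
        return [f"{sample}.fa"]
    if paired:
        return [f"{sample}.r_1.fq.gz", f"{sample}.r_2.fq.gz"]
    return [f"{sample}.fq.gz"]


if __name__ == "__main__":
    # Hypothetical sample names, used only for illustration.
    for name in ["sampleA", "sampleB"]:
        print(name, expected_inputs(name, paired=True))
```

A check like this can help confirm that the names in samples.tsv match the files on disk before starting a run.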

Run the pipeline

snakemake -jn

where n is the number of cores. For example, for 10 cores use:

snakemake -j10

Use conda

To avoid installing the tools by hand, use conda:

snakemake -jn --use-conda

For example, for 10 cores use:

snakemake -j10 --use-conda

This will automatically pull the same versions of the tools we used. Conda must be installed on the system, in addition to snakemake.
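Since both snakemake and conda need to be available, a quick check before running can save a failed start. The sketch below is only an illustration; the function name is hypothetical, and it checks only that the executables are on PATH, not their versions.

```python
import shutil


def missing_tools(tools=("snakemake", "conda")):
    """Return the names from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("snakemake and conda are both available")
```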

Dry Run

For a dry run use:

snakemake -j1 -n

To print the commands during a dry run, use:

snakemake -j1 -n -p

Use the corresponding config file

Update your config file to include all your sample names, your paths, and so on, and edit your interval.list file to include your intervals of interest. Then pass the config file to snakemake, for example:

snakemake -j1 --configfile config-WES.yaml

or:

snakemake -j1 --configfile config-WGS.yaml
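The invocations shown in this README differ only in a handful of flags, so they can be assembled programmatically. The following is a minimal sketch, assuming you want to drive snakemake from a Python script; the function name and its parameters are hypothetical, and only the flags already used above (-j, --use-conda, -n, -p, --configfile) appear.

```python
import shlex


def snakemake_command(cores, use_conda=False, dry_run=False,
                      print_commands=False, configfile=None):
    """Build the argument list for one snakemake invocation."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    cmd = ["snakemake", f"-j{cores}"]
    if use_conda:
        cmd.append("--use-conda")   # let snakemake create the tool environments
    if dry_run:
        cmd.append("-n")            # dry run: plan the jobs without executing them
    if print_commands:
        cmd.append("-p")            # print the shell commands of each job
    if configfile is not None:
        cmd += ["--configfile", configfile]
    return cmd


if __name__ == "__main__":
    print(shlex.join(snakemake_command(10, use_conda=True)))
    print(shlex.join(snakemake_command(1, dry_run=True, print_commands=True)))
    print(shlex.join(snakemake_command(1, configfile="config-WES.yaml")))
```

The returned list can be passed to subprocess.run to launch the workflow; building it as a list rather than a string avoids shell quoting problems with config file paths.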

TODO

More tools will be included

References

  1. Li, A., Zhang, J., & Zhou, Z. (2014). PLEK: a tool for predicting long non-coding RNAs and messenger RNAs based on an improved k-mer scheme. BMC Bioinformatics, 15(1), 1-10.

Code Snippets

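# Paired-end reads: pass both mate files to reformat.sh
shell:
    """
    reformat.sh in={input.r1} in2={input.r2} out={output}
    """

# Single-end reads: pass the one read file to reformat.sh
shell:
    """
    reformat.sh in={input} out={output}
    """

# Run PLEK on the input sequences, writing results under the given prefix
shell:
    """
    PLEK -f {input} -o {params.prefix}
    """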
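To see what the reformat.sh snippets look like once the placeholders are filled in, the sketch below mimics the substitution with Python's str.format. It is only an approximation of what Snakemake does internally. The function name, the sample name, the dot before each suffix, and the '.fa' output name are all assumptions made for illustration.

```python
from types import SimpleNamespace

# Shell templates copied from the snippets above.
PAIRED_TEMPLATE = "reformat.sh in={input.r1} in2={input.r2} out={output}"
SINGLE_TEMPLATE = "reformat.sh in={input} out={output}"


def render_reformat(sample, paired):
    """Fill in the reformat.sh template for one sample (illustrative only)."""
    out = f"{sample}.fa"  # assumed output name, not taken from the pipeline
    if paired:
        inputs = SimpleNamespace(
            r1=f"{sample}.r_1.fq.gz",
            r2=f"{sample}.r_2.fq.gz",
        )
        return PAIRED_TEMPLATE.format(input=inputs, output=out)
    return SINGLE_TEMPLATE.format(input=f"{sample}.fq.gz", output=out)


if __name__ == "__main__":
    print(render_reformat("sampleA", paired=True))
    print(render_reformat("sampleA", paired=False))
```

For the real expanded commands, the dry run with -p described earlier prints exactly what Snakemake would execute.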

URL: https://github.com/SherineAwad/lncRNA
Copyright: Public Domain
License: None