ATAC-Seq Data Analysis Pipeline for Identifying Nuclear Sites in Ikaros Translocation Study


This is an ATAC-Seq Snakemake pipeline written for a High Performance Computing course. We first ran the ATAC-Seq data analysis with Slurm (https://github.com/dijashis/Projet_HPC). We use ATAC-Seq data from Gomez-Cabrero et al. (2019), obtained from a murine B3 cell line. One of the goals of the study is to identify new nuclear sites following translocation of the transcription factor Ikaros after exposure to the drug tamoxifen. The original data set has 50,000 cells collected per sample, 3 replicates per sample, and 2 cell stages: 0 h and 24 h (harvest time after drug treatment). For reproducibility, see the Snakemake deployment documentation (https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html).
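The design described above (2 cell stages with 3 replicates each) gives 6 samples. Below is a minimal sketch of enumerating them in Python, for example to build the target list of a Snakefile. The naming scheme (`0h_R1`, ...) is hypothetical; the actual file names under data/mydatalocal/atacseq/ may differ.

```python
from itertools import product

# Assumed labels, for illustration only: the README states 2 cell stages
# (0 h and 24 h) and 3 replicates per sample, but not how files are named.
STAGES = ["0h", "24h"]
REPLICATES = [1, 2, 3]


def sample_names(stages=STAGES, replicates=REPLICATES):
    """Return one label per (stage, replicate) combination."""
    return [f"{stage}_R{rep}" for stage, rep in product(stages, replicates)]


if __name__ == "__main__":
    for name in sample_names():
        print(name)
```

A list like this could be passed to Snakemake's `expand()` to generate the per-sample output paths.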

To launch the pipeline, you need:

  • Files that should be in data/mydatalocal/atacseq/ (subsets and Bowtie2 index)

  • One config file detailing all the options and data needed to launch the pipeline: config/config.yaml, in the config directory

  • Snakefile (entry point of the workflow; contains the rules and scripts)

  • env.yaml (we need conda with the bioconda and conda-forge channels, plus the following dependencies for all rules in the Snakefile)

Dependencies:

- FastQC==0.11.9
- Cutadapt==3.5 
- Bowtie2 
- samtools==1.14
- r-base==4.1.1
- openjdk==10.0.2
- picard==2.26.5
- deepTools 
- MACS2
- bedtools
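Before launching the workflow, it can help to confirm that the conda environment actually exposes these tools. The sketch below is not part of the original repository; it only checks that the executables called in the shell snippets of this page can be found on PATH, and does not verify versions.

```python
import shutil

# Executables invoked in the shell snippets shown on this page.
REQUIRED_TOOLS = [
    "gunzip", "fastqc", "cutadapt", "bowtie2", "samtools", "picard",
    "plotCoverage", "multiBamSummary", "plotCorrelation",
    "macs2", "bedtools",
]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.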
 

Code Snippets

shell:
    """
    mkdir -p tmp
    gunzip -c {input} > {output}
    """
SnakeMake From line 39 of main/Snakefile
shell:
    """
    mkdir -p results/fastqc_init
    fastqc {input} -o "results/fastqc_init" -t {threads}
    """
shell:
    """
    mkdir -p results/trimming
    cutadapt -j {threads} -a CTGTCTCTTATACACATCTCCGAGCCCACGAGAC -A CTGTCTCTTATACACATCTGACGCTGCCGACGA -o {output.clean_R1} -p {output.clean_R2} {input.read} {input.read2}
    """
shell:
    """
    mkdir -p results/fastqc_post_trim
    fastqc {input} -o "results/fastqc_post_trim" -t {threads}
    """
shell:
    """
    mkdir -p results/bowtie2
    bowtie2  --very-sensitive -p {threads} -x data/mydatalocal/atacseq/bowtie2/all -1 {input.R1} -2 {input.R2} |  samtools view -q 2 -bS  -  |  samtools sort - -o {output.bam}
    """
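In the mapping step above, `samtools view -q 2` discards alignments whose mapping quality (MAPQ) is below 2 before sorting. The sketch below illustrates that filter on plain-text SAM lines in pure Python. It is an illustration of the criterion only, not how samtools is implemented, and the example records are invented.

```python
def filter_by_mapq(sam_lines, min_mapq=2):
    """Keep header lines and alignments with MAPQ >= min_mapq.

    In the SAM format, MAPQ is the 5th tab-separated field of an
    alignment line; header lines start with '@'.
    """
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)  # headers pass through unchanged
            continue
        mapq = int(line.split("\t")[4])
        if mapq >= min_mapq:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Invented records: r1 is confidently mapped, r2 has MAPQ 0.
    example = [
        "@SQ\tSN:chr1\tLN:1000",
        "r1\t99\tchr1\t100\t42\t50M\t=\t300\t250\t*\t*",
        "r2\t99\tchr1\t400\t0\t50M\t=\t600\t250\t*\t*",
    ]
    for line in filter_by_mapq(example):
        print(line)
```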
shell:
    """
    mkdir -p results/picard
    picard MarkDuplicates -I {input.map} -O {output.bamnet} -M {output.net_txt} -REMOVE_DUPLICATES true
    samtools index -b {output.bamnet}
    """
shell:
    """
    mkdir -p results/deeptools
    plotCoverage --bamfiles {params.bam} --plotFile {output.plot_cov} --smartLabels --plotFileFormat pdf -p 6
    multiBamSummary bins --bamfiles {params.bam} -o {output.summary} -p 6
    plotCorrelation -in {output.summary} --corMethod spearman --skipZeros --whatToPlot heatmap --colorMap RdYlBu --plotNumbers -o {output.plot_corr}
    """
shell:
    """
    mkdir -p results/macs
    macs2 callpeak -t {input.pic} -f BAM -n {params.nom} --outdir {params.rep}
    """
shell:
    """
    mkdir -p results/bedtools
    bedtools intersect -a {input.zero} -b {input.V} > {output.commun}
    bedtools intersect -v -a {input.zero} -b {input.V} > {output.uniquezero}
    bedtools intersect -v -a {input.V} -b {input.zero} > {output.onlyV}
    """
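The three `bedtools intersect` calls above split the peaks into those shared by both conditions, those found only at 0 h, and those found only at 24 h. The sketch below mimics that logic on small in-memory lists to make the semantics explicit. It is a simplified illustration assuming BED-style half-open intervals given as (chrom, start, end) tuples, not a replacement for bedtools, and the example peaks are invented.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share >= 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect(a_peaks, b_peaks):
    """Like `bedtools intersect -a A -b B`: the overlapping portion of
    each A interval with each B interval it overlaps."""
    shared = []
    for a in a_peaks:
        for b in b_peaks:
            if overlaps(a, b):
                shared.append((a[0], max(a[1], b[1]), min(a[2], b[2])))
    return shared


def only_in_a(a_peaks, b_peaks):
    """Like `bedtools intersect -v -a A -b B`: A intervals that overlap
    nothing in B."""
    return [a for a in a_peaks if not any(overlaps(a, b) for b in b_peaks)]


if __name__ == "__main__":
    zero = [("chr1", 100, 200), ("chr1", 500, 600)]  # invented 0 h peaks
    v = [("chr1", 150, 250), ("chr2", 500, 600)]     # invented 24 h peaks
    print("common:", intersect(zero, v))
    print("unique to 0 h:", only_in_a(zero, v))
    print("unique to 24 h:", only_in_a(v, zero))
```

The nested loop is quadratic in the number of peaks, which is fine for an illustration; bedtools itself is the right tool for real peak files.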

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/dijashis/TP_WORKFLOW_SNAKEMAKE
Name: tp_workflow_snakemake
Version: 1
Copyright: Public Domain
License: None