Description of the microbiome of the Barfusser mummy of Basel

A Snakemake workflow for the analysis of the Franciscan church mummy of Basel (known as the Barfüsser mummy).

The analysis includes:

1- Ancient human DNA analysis

2- Microbiome general taxonomic classification

3- De novo assembly

Code Snippets

Snakemake From line 84 of main/Snakefile
shell:
    """
    fastp -i {input.reads1} -I {input.reads2} \
    -o {output.reads_qc1} -O {output.reads_qc2} \
    -h {output.report_PE} -j {output.json_PE} \
    --thread {threads} --cut_mean_quality 30
    """
shell:
    """
    bbcountunique.sh in={input.reads_1} in2={input.reads_2} out={output.hist}
    Rscript src/scripts/LibraryComplexityPlot.R {output.hist} {output.hist_plot}
    """
Snakemake From line 102 of main/Snakefile
Snakemake From line 125 of main/Snakefile
shell:
    """
    fastp -i {input.reads1} -I {input.reads2} \
    -o {output.reads_unmerged1} -O {output.reads_unmerged2} \
    --merge --merged_out {output.reads_merged} \
    --overlap_len_require 10 -h {output.report_merged} \
    -j {output.json_merged} --length_required 25 --thread {threads}
    """
Snakemake From line 150 of main/Snakefile
shell:
    """
    seqkit rmdup -s -j {threads} {input.reads_merged} | gzip > {output.dedup}
    seqkit rmdup -s -j {threads} {input.reads1_unmerged} | gzip > {output.reads1_unmerged}
    seqkit rmdup -s -j {threads} {input.reads2_unmerged} | gzip > {output.reads2_unmerged}
    """
Snakemake From line 184 of main/Snakefile
shell:
    """
    spades.py -1 {input.reads1} -2 {input.reads2} -m {resources.mem_mb} -t {threads} -o {output.assembly_dir} --meta
    rm -r out/assembly/meta_spades/{wildcards.sample}/{wildcards.sample}.contigs/tmp
    rm -r out/assembly/meta_spades/{wildcards.sample}/{wildcards.sample}.contigs/corrected
    rm -r out/assembly/meta_spades/{wildcards.sample}/{wildcards.sample}.contigs/K55
    rm -r out/assembly/meta_spades/{wildcards.sample}/{wildcards.sample}.contigs/K33
    rm -r out/assembly/meta_spades/{wildcards.sample}/{wildcards.sample}.contigs/K21
    rm -r out/assembly/meta_spades/{wildcards.sample}/{wildcards.sample}.contigs/misc
    rm out/assembly/meta_spades/{wildcards.sample}/{wildcards.sample}.contigs/params.txt
    """
Snakemake From line 209 of main/Snakefile
shell:
    """
    rm -rf {output.assembly_dir}
    megahit -1 {input.reads1} -2 {input.reads2} -t {threads} -o {output.assembly_dir}
    rm -r out/assembly/megahit/{wildcards.sample}/{wildcards.sample}.contigs/intermediate_contigs
    mv out/assembly/megahit/{wildcards.sample}/{wildcards.sample}.contigs/final.contigs.fa {output.contigs}
    """
Snakemake From line 234 of main/Snakefile
shell:
    """
    seqtk seq -L 1000 {input.contigs}/contigs.fasta > {output.contigs}
    seqtk seq -L 1000 {input.contigs}/contigs.fasta | gzip > {output.zip_contigs}
    bowtie2-build {output.contigs} out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000
    bowtie2 -x out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000 -1 {input.reads1} -2 {input.reads2} -S out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.sam --no-unal --threads {threads}
    samtools view -@ {threads} -Sbq 30 out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.sam > out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.bam
    samtools sort -@ {threads} out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.bam > {output.bam}
    samtools index -@ {threads} {output.bam}
    jgi_summarize_bam_contig_depths {output.bam} --outputDepth {output.depth_raw}
    cat {output.depth_raw} | cut -f1,3 > {output.depth}
    rm out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.sam
    rm out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.bam
    rm out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.rev.2.bt2
    rm out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.rev.1.bt2
    rm out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.2.bt2
    rm out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.1.bt2
    rm out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.4.bt2
    rm out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/scaffolds.1000.3.bt2
    """
Snakemake From line 268 of main/Snakefile
shell:
    """
    metabat -i {input.contigs} -a {input.depth} -o {output.base}/metabat -m 1500 --minClsSize 10000
    """
Snakemake From line 286 of main/Snakefile
shell:
    """
    run_MaxBin.pl -contig {input.contigs} -abund {input.depth} -out {output.base}
    mkdir out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/maxbin
    cd out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning
    for i in $(ls maxbin*.fasta | rev | cut -c 7- | rev); do mv $i.fasta maxbin/$i.fa; done
    """
Snakemake From line 308 of main/Snakefile
shell:
    """
    samtools index {input.bam}
    cut_up_fasta.py {input.contigs} -c 10000 -o 0 --merge_last -b out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/contigs_10K.bed > out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/contigs_10K.fna
    concoct_coverage_table.py out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/contigs_10K.bed {input.bam} > {output.cov_tab}
    concoct --composition_file out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/contigs_10K.fna --coverage_file {output.cov_tab} --threads {threads} -b out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/
    merge_cutup_clustering.py out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/clustering_gt1000.csv > out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/clustering_merged.csv
    mkdir {output.concoct_dir}
    extract_fasta_bins.py {input.contigs} out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/clustering_merged.csv --output_path {output.concoct_dir}
    cd {output.concoct_dir}
    for file in $(ls *.fa)
    do
    minimumsize=50000
    actualsize=$(wc -c < $file)
    if [ $actualsize -le $minimumsize ]; then
        rm $file
    fi
    done
    for i in $(ls * | rev | cut -c 4- | rev)
    do
    mv $i.fa concoct.$i.fa
    done
    """
Snakemake From line 351 of main/Snakefile
shell:
    """
    mkdir {output.DAS_Tool}
    src/scripts/Fasta_to_Scaffolds2Bin.sh -i {input.metabat} -e fa > {output.metabat}
    src/scripts/Fasta_to_Scaffolds2Bin.sh -i {input.maxbin} -e fa > {output.maxbin}
    src/scripts/Fasta_to_Scaffolds2Bin.sh -i {input.concoct} -e fa > {output.concoct}
    DAS_Tool -i {output.metabat},{output.maxbin},{output.concoct} -l metabat,maxbin,concoct  -c {input.contigs} -o out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.binning/{wildcards.sample} -t {threads} --write_bins 1 --score_threshold 0.0
    """
Snakemake From line 374 of main/Snakefile
shell:
    """
    checkm lineage_wf {input.bins} {output.checkm} -x ".fa" -t {threads} --pplacer_threads {threads} --tab_table > {output.summary}

    rm -rf {output.checkm}/storage
    rm -rf {output.checkm}/bins
    """
shell:
    """
    export PATH="/apps/augustus/3.4.0/bin:$PATH"
    export PATH="/apps/augustus/3.4.0/scripts:$PATH"
    export AUGUSTUS_CONFIG_PATH="/apps/augustus/3.4.0/config/"
    export BUSCO_CONFIG_FILE="busco_config.ini"
    ./src/scripts/BUSCO.sh {input.bins} {output.busco} {wildcards.sample}_busco
    #busco -i {input.bins} -f -o {wildcards.sample}_busco -m genome --auto-lineage --out_path {output.busco}
    rm -rf out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.busco/{wildcards.sample}_busco/*fa
    rm -rf out/assembly/{wildcards.assembler}/{wildcards.sample}/{wildcards.sample}.busco/{wildcards.sample}_busco/logs
    rm busco*log
    """
Snakemake From line 394 of main/Snakefile
Snakemake From line 419 of main/Snakefile
shell:
    """
    gtdbtk classify_wf --genome_dir {input.bins} --cpus {threads} -x fa --out_dir {output.GTDB} --force --min_perc_aa 5 --pplacer_cpus {threads}
    #rm -r {output.GTDB}/*/intermediate_results
    """
Snakemake From line 440 of main/Snakefile
shell:
    """
    bwa aln -t {threads} {input.ref} {input.reads} > out/human/{wildcards.sample}/{wildcards.sample}.mito.sai
    bwa samse {input.ref} out/human/{wildcards.sample}/{wildcards.sample}.mito.sai {input.reads} | samtools view -q 30 -bSh > out/human/{wildcards.sample}/{wildcards.sample}.mito.bam
    samtools sort -@ {threads} out/human/{wildcards.sample}/{wildcards.sample}.mito.bam -o {output.mito} 
    samtools index {output.mito}
    rm out/human/{wildcards.sample}/{wildcards.sample}.mito.sai out/human/{wildcards.sample}/{wildcards.sample}.mito.bam
    """
Snakemake From line 462 of main/Snakefile
shell:
    """
    bwa aln -t {threads} {input.ref} {input.reads} > out/human/{wildcards.sample}/{wildcards.sample}.auto.sai
    bwa samse {input.ref} out/human/{wildcards.sample}/{wildcards.sample}.auto.sai {input.reads} | samtools view -q 30 -bSh > out/human/{wildcards.sample}/{wildcards.sample}.auto.bam
    samtools sort out/human/{wildcards.sample}/{wildcards.sample}.auto.bam -o {output.human_reads}
    samtools index {output.human_reads}
    rm out/human/{wildcards.sample}/{wildcards.sample}.auto.sai out/human/{wildcards.sample}/{wildcards.sample}.auto.bam
    """
Snakemake From line 482 of main/Snakefile
shell:
    """
    unset DISPLAY
    qualimap bamqc -bam {input.human_mito} -outdir {output.mito_qualimap}
    qualimap bamqc -bam {input.human_auto} -outdir {output.auto_qualimap}
    """
Snakemake From line 509 of main/Snakefile
shell:
    """
    module load mapdamage
    module load R/3.5.1
    mapDamage -i {input.human_mito} -r {input.rCRS} -d {output.mito_damage} --rescale --rescale-out {output.rescaled_mito}
    samtools index {output.rescaled_mito}
    mapDamage -i {input.human_auto} -r {input.hg19} -d {output.auto_damage} --rescale --rescale-out {output.rescaled_auto}
    samtools index {output.rescaled_auto}
    """
Snakemake From line 528 of main/Snakefile
shell:
    """
    samtools index {input.bam}
    samtools view {input.bam} | python2 src/scripts/skoglund_xy.py > {output.skglnd_sex}
    samtools idxstats {input.bam} > out/human/{wildcards.sample}/{wildcards.sample}.auto.idxstats
    Rscript src/scripts/sex_determination_mittnik.R out/human/{wildcards.sample}/{wildcards.sample}.auto > {output.mtnk_sex}
    """
Snakemake From line 546 of main/Snakefile
shell:
    """
    samtools calmd -b {input.human_mito} {input.rCRS} > {output.md_bam}
    samtools index {output.md_bam}
    contDeam.pl --library double --length 10 --out {output.schmutzi} {output.md_bam}
    schmutzi.pl --notusepredC --uselength --ref {input.rCRS} {output.schmutzi} \
    /apps/schmutzi/20171024/alleleFreqMT/eurasian/freqs/ {output.md_bam}
    log2fasta -q 30 out/human/{wildcards.sample}/{wildcards.sample}.mito.schmutzi_final_endo.log > {output.fasta}
    """
Snakemake From line 568 of main/Snakefile
shell:
    """
    bcftools mpileup -Q 30 -q 30 -f {input.hg19} {input.bam} -r chrY | bcftools call -c -o {output.y_vcf}
    callHaplogroups.py -i {output.y_vcf} -c -hp -ds -dsd -as -asd -o {output.y_haplo}
    """
Snakemake From line 591 of main/Snakefile
shell:
    """
    angsd -i {input.bam} -r chrX:5000000-154900000 -doCounts 1  -iCounts 1 -minMapQ 30 -minQ 30 -out out/human/{wildcards.sample}/angsdput

    Rscript {input.angsd_path}/R/contamination.R mapFile={input.angsd_path}/RES/chrX.unique.gz hapFile={input.angsd_path}/RES/HapMapChrX.gz countFile=out/human/{wildcards.sample}/angsdput.icnts.gz mc.cores=4 > {output.angsd_R}
    {input.angsd_path}/misc/contamination -a out/human/{wildcards.sample}/angsdput.icnts.gz -h /apps/angsd/0.918/RES/HapMapChrX.gz -d 2 -e 100 2> {output.angsd_conta}
    """
shell:
    "cd src/scripts/; curl -sL haplogrep.now.sh | bash"
Snakemake From line 603 of main/Snakefile
Snakemake From line 614 of main/Snakefile
shell:
    """
    bcftools mpileup -B -Ou -d 1000 -q 30 -f {input.rCRS} {input.rescaled_mito} | bcftools call -mv --ploidy 1 -Ov -o {output.mito_vcf}

    ./src/scripts/haplogrep classify --in {output.mito_vcf} --format vcf --out {output.mito_hg} --extend-report
    """
Snakemake From line 643 of main/Snakefile
shell:
    """
    diamond blastx -p {threads} -d {input.diamond_db} \
    -q {input.merged} -o {output.diamond_tab} -b 12 -c 1

    zcat {input.reads1} {input.reads2} | diamond blastx -p {threads} -d {input.diamond_db} \
    -o {output.discarded_tab} -b 12 -c 1
    """
shell:
    """
    #unset DISPLAY
    blast2rma \
        --in {input.diamond_tab} \
        --format "BlastTab" \
        --blastMode "BlastX" \
        --out {output.diamond_rma6} \
        --minPercentIdentity 80 \
        --minSupportPercent 0.0001 \
        --threads {threads} \
        --acc2taxa {input.megan_prot_db}

    blast2rma \
        --in {input.discarded_tab} \
        --format "BlastTab" \
        --blastMode "BlastX" \
        --out {output.discarded_rma6} \
        --minPercentIdentity 80 \
        --minSupportPercent 0.0001 \
        --threads {threads} \
        --acc2taxa {input.megan_prot_db}
    """
Snakemake From line 665 of main/Snakefile
Snakemake From line 706 of main/Snakefile
shell:
    """
    metaphlan --bowtie2db {input.db} --input_type fastq --nproc {threads} --min_mapq_val 25 --no_map --add_viruses --read_min_len 25 {input.merged}  > {output.merged}

    metaphlan --bowtie2db {input.db} --input_type fastq --nproc {threads} --min_mapq_val 25 --no_map --add_viruses --read_min_len 25 {input.reads1}  > {output.reads1}

    metaphlan --bowtie2db {input.db} --input_type fastq --nproc {threads} --min_mapq_val 25 --no_map --add_viruses --read_min_len 25 {input.reads2}  > {output.reads2}

    merge_metaphlan_tables.py out/taxonomic_classifications/metaphlan/{wildcards.sample}/*def.metaphlan > out/taxonomic_classifications/metaphlan/{wildcards.sample}/merged_abundance_table.txt

    grep -E "s__|clade" out/taxonomic_classifications/metaphlan/{wildcards.sample}/merged_abundance_table.txt | sed 's/^.*s__//g' | cut -f1,3- | sed -e 's/clade_name/body_site/g' > {output.sp_table}

    python3 /apps/metaphlan/3.0.1/lib/python3.8/site-packages/metaphlan/utils/hclust2/hclust2.py \
    -i {output.sp_table} \
    -o {output.heatmap} \
    --f_dist_f correlation \
    --s_dist_f euclidean \
    --cell_aspect_ratio 0.5 \
    -l \
    --flabel_size 4 \
    --slabel_size 4 \
    --max_flabel_len 100 \
    --max_slabel_len 100 \
    --minv 0.1 \
    --dpi 300 --no_fclustering --no_sclustering
    """
Snakemake From line 753 of main/Snakefile
shell:
    """
    kraken2 --db {input.db} \
    --report {output.report} \
    --threads {threads} \
    --output {output.out} {input.merged}

    kraken2 --db {input.db} \
    --report {output.report_dis} \
    --threads {threads} \
    --output {output.out_dis} \
    --paired {input.reads1} {input.reads2}

    ktImportTaxonomy -q 2 -t 3 {output.out} -o {output.krona}
    rm -r out/taxonomic_classifications/kraken/{wildcards.sample}/{wildcards.sample}.krona.html.files

    ktImportTaxonomy -q 2 -t 3 {output.out_dis} -o {output.krona_dis}
    rm -r out/taxonomic_classifications/kraken/{wildcards.sample}/{wildcards.sample}.discarded.krona.html.files


    bracken -d {input.db} -i {output.report} -o {output.bracken} -l S

    bracken -d {input.db} -i {output.report_dis} -o {output.bracken_dis} -l S
    """
shell:
    """
    echo "---------------------General_stats----------------------------" >> {output.stats}
    count=$(zgrep -c "^+$" {input.reads1})
    echo -e "Total number of raw reads\t$count" >> {output.stats}
    merge=$(zgrep -c "^+$" {input.reads_merged})
    echo -e "Total number of merged reads\t$merge" >> {output.stats}
    echo -e "Percentage of merged reads\t$(echo "$merge*100/$count" | bc -l) %" >> {output.stats}
    echo "" >> {output.stats}
    echo "---------------------Deduplication_stats----------------------" >> {output.stats}
    dedup=$(zgrep -c "^+$" {input.dedup})
    echo -e "Total number of deduplicated reads\t$dedup" >> {output.stats}
    echo -e "Percentage of duplication\t$(echo "100-$dedup*100/$merge" | bc -l) %" >> {output.stats}
    echo -e "Total number of deduplicated discarded forward reads\t$(zgrep -c "^+$" {input.reads1_rmdup})" >> {output.stats}
    echo -e "Total number of deduplicated discarded reverse reads\t$(zgrep -c "^+$" {input.reads2_rmdup})" >> {output.stats}
    echo "" >> {output.stats}
    echo "---------------------Human_mitochondrial_DNA_stats------------" >> {output.stats}
    mito=$(samtools view -c -F 260 {input.mito})
    echo -e "Number of mapped reads\t$mito" >> {output.stats}
    echo -e "Percentage of mitochondrial reads\t$(echo "100*$mito/$dedup" | bc -l) %" >> {output.stats}
    echo -e "Mitochondrial genome coverage\t$(grep "mean coverageData" {input.mito_cov}/genome_results.txt | cut -d"=" -f2)" >> {output.stats}
    echo $(grep "std coverageData" {input.mito_cov}/genome_results.txt) >> {output.stats}
    echo "" >> {output.stats}
    echo "---------------------Human_autosomal_DNA_stats----------------" >> {output.stats}
    auto=$(samtools view -c -F 260 {input.auto})
    echo -e "Number of mapped reads\t$auto" >> {output.stats}
    echo -e "Human endogenous DNA content\t$(echo "100*$auto/$dedup" | bc -l) %" >> {output.stats}
    echo -e "Human genome coverage\t$(grep "mean coverageData" {input.auto_cov}/genome_results.txt | cut -d"=" -f2)" >> {output.stats}
    echo $(grep "std coverageData" {input.auto_cov}/genome_results.txt) >> {output.stats}
    echo "" >> {output.stats}
    echo "---------------------Human_sex_assignment----------------------" >> {output.stats}
    echo "Skoglund sex assignment: " >> {output.stats}
    head -2 {input.skglnd_sex} >> {output.stats}
    echo "" >> {output.stats}
    echo "Mittnik sex assignment: " >> {output.stats}
    echo $(grep "Sex assignment" {input.mtnk_sex} | cut -c 6- | rev | cut -c 2- | rev) >> {output.stats}
    echo "" >> {output.stats}
    Rscript src/scripts/PlotStats.R {output.stats}
    """
Snakemake From line 799 of main/Snakefile
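The percentages assembled above with `bc` are simple ratios; a Python sketch of the same arithmetic (function and argument names are illustrative, not part of the workflow):

```python
def pipeline_stats(raw, merged, dedup, mito_mapped, auto_mapped):
    """Summary percentages written to the stats file."""
    return {
        "merged_pct": 100.0 * merged / raw,            # merged / raw reads
        "duplication_pct": 100.0 - 100.0 * dedup / merged,
        "mito_pct": 100.0 * mito_mapped / dedup,       # mitochondrial fraction
        "endogenous_pct": 100.0 * auto_mapped / dedup, # human endogenous DNA
    }
```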
R From line 6 of scripts/LibraryComplexityPlot.R
args = commandArgs(trailingOnly=TRUE)

data_tab = read.table(args[1])

x = data_tab$V1

y = data_tab$V2

pdf(file = args[2], width = 5, height = 5)

plot(x, y, lwd=2, pch=16, xlab="", ylab="")

fit3 <- lm(y ~ poly(x, 3, raw = TRUE))
lines(x, predict(fit3, data.frame(x = x)), col = "red", lwd = 2)
title(main="Library complexity", xlab="Number of reads", ylab="Unique reads (%)") 

dev.off()
args = commandArgs(trailingOnly = TRUE)

data_tab = read.table(args[1], sep = ";", header = FALSE)

pdf(file = paste(args[1], "pdf", sep = "."), width = 7, height = 7)

plot.new()
for (i in 1:nrow(data_tab)){
  title(sub=data_tab[i, 1], line = -(nrow(data_tab)-i), adj = 0)
}

dev.off()
R From line 2 of scripts/PlotStats.R
R From line 7 of scripts/sex_determination_mittnik.R
        # samtools index Ajv52_q30.bam
        # samtools idxstats Ajv52_q30.bam > Ajv52.idxstats
        # Rscript Rx_identifier.r Ajv52 > Ajv52.Rx


args=(commandArgs(TRUE))
PREFIX=as.character(args[1])

idxstats <- read.table(paste(PREFIX, '.idxstats', sep = ''), header = F, nrows = 24, row.names = 1)
c1 <- as.numeric(idxstats[, 1])
c2 <- as.numeric(idxstats[, 2])
total_ref <- sum(c1)
total_map <- sum(c2)

LM <- lm(c1 ~ c2)
summary(LM)

# Per-chromosome mapping rate relative to its share of the reference
Rt <- (c2 / total_map) / (c1 / total_ref)

# Ratio of the chrX rate (row 23) to each autosomal rate (rows 1-22)
tot <- Rt[23] / Rt[1:22]
Rx <- mean(tot)
cat("Rx :", Rx, "\n")
confinterval <- 1.96 * (sd(tot) / sqrt(22))
CI1 <- Rx - confinterval
CI2 <- Rx + confinterval
cat("95% CI :", CI1, CI2, "\n")

if (CI1 > 0.8) {print ("Sex assignment:The sample should be assigned as Female")
} else if (CI2 < 0.6) {print ("Sex assignment:The sample should be assigned as Male")
} else if (CI1 > 0.6 & CI2 > 0.8) {print ("Sex assignment:The sample is consistent with XX but not XY")
} else if (CI1 < 0.6 & CI2 < 0.8) {print ("Sex assignment:The sample is consistent with XY but not XX")
} else print ("Sex assignment:The sample could not be assigned")

print ("***It is important to realize that the assignment is invalid, if there is no correlation between the number of reference reads and that of the mapped reads***")

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/msabrysarhan/Barfusser_microbiome
Name: barfusser_microbiome
Version: 1
Copyright: Public Domain
License: None