Alignment and Annotation Pipeline for Klebsiella pneumoniae


MBB659Termproject

MBB659 Final Project: Alignment and annotation pipeline for Klebsiella pneumoniae

Project background

Klebsiella pneumoniae is classified by the World Health Organization as a critical-priority pathogen that urgently needs new antibiotic treatments: it is highly resistant to most antibiotics and encodes a diverse set of antimicrobial resistance (AMR) genes that are easily transmitted between bacteria [1,2]. Of great concern is K. pneumoniae's role in trafficking AMR genes on a global scale [3]. In addition to this reservoir of AMR genes, carbapenemase genes carried on many large plasmids have recently further reduced the effectiveness of the last-line antibiotics used in treatment [4]. K. pneumoniae is also a natural inhabitant of the gastrointestinal microbiome and an important pathogen in nosocomial infections [2]. Hypervirulent strains of K. pneumoniae (hvkp) cause community-linked outbreaks [6], but they are often observed to be less antibiotic-resistant than AMR strains. hvkp strains have thicker capsules, which help them evade host immune mechanisms and proliferate within the host. However, detection of carbapenemase genes carried in hvkp has been rising [5,6]. I hypothesize that hvkp strains carrying plasmid-encoded carbapenemase genes have more mutations and genetic variants affecting capsule genes, such as rmpA and rmpA2, and that these variants contribute to a less effective capsule and thereby facilitate the uptake of carbapenemase-encoding plasmids.

The steps for this pipeline are:

  1. Download the reference genome and reads (SRA_toolkit)

  2. Align reads to the reference genome (bwa mem)

  3. Call and manipulate variants (bcftools)

  4. Annotate genes (snpEff)
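
As a rough sketch of how these four steps chain together, the Python snippet below assembles one shell command per step. The download, alignment, and annotation commands follow the Snakefile snippets in this repository. The bcftools command, the intermediate file names (aligned.bam, called.vcf), and the helper name pipeline_commands are assumptions made for illustration, so the actual variant-calling rule may differ.

```python
# Sketch of the pipeline steps as shell command strings.
# Assumption: the bcftools command and the intermediate file names are
# illustrative only; the repository's Snakefile may use different ones.

ACCESSION = "SRR10160941"
ASSEMBLY = "GCA_000009885.1_ASM988v1"
GENOME_URL = (
    "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/009/885/"
    f"{ASSEMBLY}/{ASSEMBLY}_genomic.fna.gz"
)
GENOME = f"results/{ASSEMBLY}_genomic.fna"
SNPEFF_DB = "Klebsiella_pneumoniae_subsp_pneumoniae_ntuh_k2044"


def pipeline_commands(threads=4):
    """Return (step number, shell command) pairs in execution order."""
    read1 = f"results/{ACCESSION}_1.fastq"
    read2 = f"results/{ACCESSION}_2.fastq"
    bam = "results/aligned.bam"   # hypothetical file name
    vcf = "results/called.vcf"    # hypothetical file name
    return [
        (1, f"wget -nc {GENOME_URL} -P results && gzip -d {GENOME}.gz"),
        (1, f"fastq-dump {ACCESSION} --split-files -O results"),
        (2, f"bwa mem -t {threads} {GENOME} {read1} {read2} "
            f"| samtools view -u -F 4 -q 30 "
            f"| samtools sort -O BAM -o {bam}"),
        # Assumed command, not taken from the Snakefile.
        (3, f"bcftools mpileup -f {GENOME} {bam} "
            f"| bcftools call -mv -o {vcf}"),
        (4, f"snpEff ann {SNPEFF_DB} {vcf} > results/annotatedVcf.vcf"),
    ]


if __name__ == "__main__":
    for step, command in pipeline_commands():
        print(f"step {step}: {command}")
```

Running the script only prints the commands. In the workflow itself, Snakemake decides which of these steps need to run, based on which output files already exist.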

Directed Acyclic Graph:

[image: directed acyclic graph of the workflow rules]

Here's how to access the Git repository:

Cloning the repository:

Create a folder where you want the repo.

Open a terminal in that folder.

In the terminal, enter:
git clone https://github.com/kluongni/MBB659Termproject.git

Change directory to workflow with:
cd MBB659Termproject/workflow

This assumes conda is installed. If it is not, see:
https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html

Create and activate the conda environment with:
conda env create --file environment.yml
(press y when prompted)
conda activate termProject

Run the Snakemake workflow from within this directory:

snakemake --cores N "results/annotatedVcf.vcf"

This runs the pipeline through to its final output. Replace N with the number of cores you would like to dedicate to the run.
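
To preview the jobs before committing cores to a full run, the same invocation can be made with Snakemake's --dry-run option. The sketch below builds both command lines in Python. The helper name snakemake_command is hypothetical, and defaulting to every available core is an assumption, not a recommendation from this project.

```python
import os
import shlex

TARGET = "results/annotatedVcf.vcf"  # final output of the pipeline


def snakemake_command(cores=None, dry_run=False, target=TARGET):
    """Return the argument list for running the workflow up to `target`."""
    if cores is None:
        cores = os.cpu_count() or 1  # assumption: use every available core
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    argv = ["snakemake", "--cores", str(cores)]
    if dry_run:
        argv.append("--dry-run")  # list the planned jobs without running them
    argv.append(target)
    return argv


if __name__ == "__main__":
    # Print the commands; pass the lists to subprocess.run to execute them.
    print(shlex.join(snakemake_command(cores=4, dry_run=True)))
    print(shlex.join(snakemake_command(cores=4)))
```

Passing the returned list to subprocess.run(argv, check=True) from inside the workflow directory would launch the run.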

Inputs:
GCA_000009885.1_ASM988v1 is an hvkp whole-genome assembly used as the reference genome for alignment.
SRR10160941 is the SRA accession for Illumina high-throughput sequencing reads from a carbapenemase-carrying hvkp.

Outputs:
snpEff_genes.txt is a text file containing the gene annotation.
annotatedVcf.vcf is an annotated VCF file of the variants found in the aligned sequence, together with the genes they affect.

The snpEff_genes.txt file provides valuable information on the genes returned from the annotation. The upstream and downstream genetic variants can be parsed from the file for further analysis. The graph shows that, contrary to my hypothesis, there was no significant number of genetic variants with a detrimental effect on the rmpA gene, which I had believed would lead to carbapenemase acquisition.

[image: graph of genetic variants affecting rmpA]
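
One way to parse the per-gene counts out of snpEff_genes.txt is sketched below. It assumes the usual snpEff layout: a tab-separated table whose header line starts with #GeneName and whose count columns are named variants_impact_HIGH, variants_impact_MODERATE, and so on. The function name impact_counts is hypothetical, and the SAMPLE rows are made up to show the format; they are not results from this project.

```python
import csv
import io


def impact_counts(lines, genes):
    """Return {gene: {impact: count}} for the requested gene names.

    `lines` is any iterable of text lines, such as an open file.
    Counts are summed when a gene appears on more than one row.
    """
    wanted = set(genes)
    header = None
    counts = {}
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue
        if row[0].startswith("#GeneName"):
            header = [row[0].lstrip("#")] + row[1:]
            continue
        if header is None or row[0].startswith("#"):
            continue  # skip comment lines and anything before the header
        record = dict(zip(header, row))
        if record["GeneName"] not in wanted:
            continue
        per_gene = counts.setdefault(record["GeneName"], {})
        for column, value in record.items():
            if column.startswith("variants_impact_"):
                impact = column[len("variants_impact_"):]
                per_gene[impact] = per_gene.get(impact, 0) + int(value or 0)
    return counts


# Made-up rows for illustration only (not data from this project).
SAMPLE = (
    "# The following table is formatted as tab separated values.\n"
    "#GeneName\tGeneId\tTranscriptId\tBioType"
    "\tvariants_impact_HIGH\tvariants_impact_MODIFIER\n"
    "geneA\tID_A\tT_A\tprotein_coding\t1\t3\n"
    "geneB\tID_B\tT_B\tprotein_coding\t0\t2\n"
)

if __name__ == "__main__":
    print(impact_counts(io.StringIO(SAMPLE), ["geneA"]))
```

For the real file, open snpEff_genes.txt and pass the file object with the gene names of interest, for example ["rmpA", "rmpA2"], provided those names match the GeneName column of the snpEff database.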

References:

  1. Navon-Venezia, S., Kondratyeva, K. & Carattoli, A. Klebsiella pneumoniae: a major worldwide source and shuttle for antibiotic resistance. FEMS Microbiol. Rev. 41, 252–275 (2017).

  2. Sands, K. et al. Characterization of antimicrobial-resistant Gram-negative bacteria that cause neonatal sepsis in seven low- and middle-income countries. Nat. Microbiol. 6, 512–523 (2021).

  3. Wyres, K. L. & Holt, K. E. Klebsiella pneumoniae as a key trafficker of drug resistance genes from environmental to clinically important bacteria. Curr. Opin. Microbiol. 45, 131–139 (2018).

  4. Chiu, S. K. et al. Carbapenem Nonsusceptible Klebsiella pneumoniae in Taiwan: Dissemination and Increasing Resistance of Carbapenemase Producers During 2012-2015. Sci. Rep. 8, 1–9 (2018).

  5. Lee, C. R. et al. Antimicrobial resistance of hypervirulent Klebsiella pneumoniae: Epidemiology, hypervirulence-associated determinants, and resistance mechanisms. Front. Cell. Infect. Microbiol. 7, (2017).

  6. Xie, M. et al. Clinical evolution of ST11 carbapenem resistant and hypervirulent Klebsiella pneumoniae. Commun. Biol. 4, 1–9 (2021).

Code Snippets

# Download and decompress the reference genome (-nc skips the download if the file already exists)
shell:
    """
    wget -nc https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/009/885/GCA_000009885.1_ASM988v1/GCA_000009885.1_ASM988v1_genomic.fna.gz -P results
    gzip -d results/GCA_000009885.1_ASM988v1_genomic.fna.gz
    """

# Download the reads, writing one FASTQ file per mate
shell:
    "fastq-dump SRR10160941 --split-files -O results"

# Index the reference FASTA
shell:
    """
    samtools faidx {input.genome}
    """

# Index the reference, align the reads, drop unmapped reads (-F 4) and alignments with MAPQ below 30 (-q 30), then sort
shell:
    """
    bwa index {input.genome}
    bwa mem -t {threads} {input.genome} {input.read1} {input.read2} | samtools view -u -F 4 -q 30 -@ {threads} | samtools sort -O BAM -o {output.alignedBAM} -@ {threads}
    """

# Annotate the called variants with the snpEff database for strain NTUH-K2044
shell:
    "snpEff ann Klebsiella_pneumoniae_subsp_pneumoniae_ntuh_k2044 {input.calledVcf} > {output.annotatedVcf}"
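
The annotated VCF written by the snpEff command can be filtered for variants in specific genes. snpEff stores its annotations in the INFO column under the ANN key, as comma-separated entries whose pipe-separated subfields begin with allele, annotation, impact, and gene name. The sketch below relies on that layout. The function name variants_in_genes is hypothetical, and the SAMPLE record is made up to show the format; it is not a variant found in this project.

```python
def variants_in_genes(vcf_lines, genes):
    """Return (chrom, pos, ref, alt, gene, effect, impact) tuples.

    One tuple is produced per ANN entry whose gene name is in `genes`.
    """
    wanted = set(genes)
    hits = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, info = (
            fields[0], int(fields[1]), fields[3], fields[4], fields[7]
        )
        ann = next(
            (item[4:] for item in info.split(";") if item.startswith("ANN=")),
            "",
        )
        for entry in ann.split(","):
            parts = entry.split("|")
            # parts: allele, annotation, impact, gene name, gene id, ...
            if len(parts) > 3 and parts[3] in wanted:
                hits.append((chrom, pos, ref, alt, parts[3], parts[1], parts[2]))
    return hits


# Made-up record for illustration only (not data from this project).
SAMPLE = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=20;ANN=G|missense_variant|MODERATE|rmpA"
    "|rmpA|transcript|T1|protein_coding|1/1|c.10A>G|p.Thr4Ala|10/600|10/600|4/200||\n",
]

if __name__ == "__main__":
    for hit in variants_in_genes(SAMPLE, ["rmpA", "rmpA2"]):
        print(hit)
```

For the pipeline output, pass an open handle on results/annotatedVcf.vcf in place of SAMPLE.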

URL: https://github.com/kluongni/MBB659Termproject
Copyright: Public Domain
License: None