tool / biotools

Workflow engine and language. It aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain-specific language (DSL) in Python style.

biotools edam URL URL pmid URL
tool / biotools

SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods.

biotools email email edam edam edam edam edam edam edam edam edam URL GitHub doi doi doi edam edam URL URL URL URL
tool / biotools

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

biotools URL URL URL URL URL email edam edam URL
tool / pypi

The fundamental package for scientific computing with Python

URL pypi URL email
tool / bioconda

The full Genome Analysis Toolkit (GATK) framework, v3

bioconda URL biotools
tool / biotools

This tool aims to provide a QC report which can spot problems or biases which originate either in the sequencer or in the starting library material. It can be run in one of two modes. It can either run as a stand alone interactive application for the immediate analysis of small numbers of FastQ files, or it can be run in a non-interactive mode where it would be suitable for integrating into a larger analysis pipeline for the systematic processing of large numbers of files.

edam edam URL email edam edam edam edam edam edam edam GitHub URL doi biotools edam
tool / biotools

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.

biotools email GitHub URL doi edam edam edam URL edam edam URL URL URL doi
tool / cran

Create Elegant Data Visualisations Using the Grammar of Graphics: A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

tool / biotools

MultiQC aggregates results from multiple bioinformatics analyses across many samples into a single report. It searches a given directory for analysis logs and compiles an HTML report. It's a general-use tool, perfect for summarising the output from numerous bioinformatics tools.

biotools edam edam edam GitHub URL bioconda doi email edam edam edam edam URL pypi URL
tool / biotools

BEDTools is an extensive suite of utilities for comparing genomic features in BED format.

biotools pmid
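As a rough illustration of what "comparing genomic features in BED format" can mean, the sketch below intersects two small lists of intervals. It assumes BED's 0-based, half-open coordinates; the function names and sample intervals are hypothetical, and the quadratic loop is for clarity only, not a reflection of how BEDTools is implemented.

```python
from typing import List, Tuple

# (chrom, start, end) with 0-based, half-open coordinates, as in BED
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect(a_feats: List[Interval], b_feats: List[Interval]) -> List[Interval]:
    """Return the overlapping portion for every overlapping pair (naive O(n*m) scan)."""
    out = []
    for a in a_feats:
        for b in b_feats:
            if overlaps(a, b):
                out.append((a[0], max(a[1], b[1]), min(a[2], b[2])))
    return out


# Hypothetical example data
genes = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
peaks = [("chr1", 150, 250), ("chr1", 600, 700), ("chr2", 50, 120)]
result = intersect(genes, peaks)
print(result)
```

Because the coordinates are half-open, the intervals `chr1:500-600` and `chr1:600-700` touch but do not overlap, so they contribute nothing to the result.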
tool / pypi

Matplotlib produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, Python/IPython shells, web application servers, and various graphical user interface toolkits.

URL email GitHub
tool / cran

Easily Install and Load the 'Tidyverse': The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.

cran email
tool / biotools

Fast, accurate, memory-efficient aligner for short and long sequencing reads

biotools URL email email edam edam edam edam edam edam edam edam edam edam edam edam edam URL doi doi doi doi URL URL URL bioconda doi doi
tool / biotools

A set of command line tools for manipulating high-throughput sequencing (HTS) data in formats such as SAM/BAM/CRAM and VCF. Available as a standalone program or within the GATK4 program.

biotools edam GitHub URL
tool / pypi

Snakemake is a workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern specification language in Python style. Snakemake workflows are essentially Python scripts extended by declarative code to define rules. Rules describe how to create output files from input files.

pypi URL email
tool / bioconda

C library and command line tools for high-throughput sequencing data formats.

bioconda GitHub
tool / cran

A Grammar of Data Manipulation: A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

email cran
tool / biotools
Bowtie 2

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.

edam edam edam edam bioconda doi doi doi doi biotools edam edam URL URL URL GitHub URL URL URL bioconda URL doi doi
tool / biotools

An integrated solution to management and visualization of sequencing data.

biotools email edam GitHub doi
tool / biotools

Biopython is a set of freely available tools for biological computation written in Python by an international team of developers.

biotools edam edam edam pmid edam URL URL
format / edam

JavaScript Object Notation format; a lightweight, text-based format to represent tree-structured data using key-value pairs.

edam URL
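To make "tree-structured data using key-value pairs" concrete, here is a minimal sketch using Python's standard `json` module. The record contents are made up for illustration.

```python
import json

# A hypothetical nested record: keys map to scalars, lists, or further objects
record = {
    "name": "sample1",
    "reads": 1200,
    "tags": ["qc", "trimmed"],
    "stats": {"gc": 0.41},
}

# Serialise to JSON text, then parse it back into Python objects
text = json.dumps(record, sort_keys=True)
parsed = json.loads(text)
print(text)
```

The round trip preserves the nesting: objects become dictionaries, arrays become lists, and numbers and strings keep their types.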
tool / biotools

Find and remove adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.

biotools email edam edam edam URL doi edam URL URL doi
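The idea of removing a 3' adapter from a read can be sketched as follows. This is a simplified, exact-match illustration only: the function name, the `min_overlap` parameter, and the sample sequences are assumptions, and the tool described above may use a different, error-tolerant matching strategy.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove an exact-match 3' adapter (full or partial) from a read."""
    # Case 1: the full adapter occurs in the read; cut at its first occurrence.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adapter; try the longest prefix first.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


# Hypothetical adapter and reads
ADAPTER = "AGATCGGAAGAGC"
full = trim_3prime_adapter("ACGTACGTAGATCGGAAGAGCTTT", ADAPTER)
partial = trim_3prime_adapter("ACGTACGTAGATC", ADAPTER)
clean = trim_3prime_adapter("ACGTACGT", ADAPTER)
print(full, partial, clean)
```

Requiring a minimum overlap for partial matches reduces the chance of trimming a few bases that match the start of the adapter purely by coincidence.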
tool / biotools

A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.

biotools edam edam GitHub doi
tool / biotools

QIIME 2™ is a next-generation microbiome bioinformatics platform that is extensible, free, open source, and community developed.

biotools email edam edam edam edam GitHub URL URL URL doi URL
tool / bioconda

A collection of utility functions and classes for Snakemake wrappers.

bioconda GitHub URL
tool / pypi

seaborn: statistical data visualization. Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.

pypi URL email
tool / cran

Extension of 'data.frame': Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.

email URL
tool / biotools

A flexible read trimming tool for Illumina NGS data

biotools email edam edam edam edam edam edam edam edam URL edam edam URL URL doi
tool / pypi

SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, li

URL URL email pypi
tool / biotools

Pairwise aligner for genomic and spliced nucleotide sequences

biotools edam GitHub URL doi
tool / cran

Tidy Messy Data: Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).

email cran
tool / biotools

A tool for processing sequences in the FASTA or FASTQ format. It parses both FASTA and FASTQ files which can also be optionally compressed by gzip.

biotools edam edam GitHub doi
tool / biotools

FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q files include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only implement some of these manipulations, and not particularly efficiently, and some are only available for certain operating systems. Furthermore, the complicated installation process of required packages and running environments can render these programs less user friendly. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations.

biotools URL doi edam edam edam edam edam edam edam
tool / biotools

R/Bioconductor package for differential gene expression analysis based on the negative binomial distribution. Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.

biotools doi email edam URL GitHub URL
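The "variance-mean dependence" mentioned above can be illustrated with the common negative binomial parameterisation in which the variance equals `mean + dispersion * mean**2`. The sketch below assumes that parameterisation and uses a plain method-of-moments estimate of the dispersion; the function names and counts are hypothetical, and this is not the estimation procedure the package itself uses.

```python
import statistics


def nb_variance(mean: float, dispersion: float) -> float:
    """Variance of a negative binomial with the given mean and dispersion (assumed parameterisation)."""
    return mean + dispersion * mean ** 2


def moments_dispersion(counts) -> float:
    """Naive method-of-moments dispersion: (sample variance - mean) / mean^2, floored at 0."""
    mean = statistics.mean(counts)
    var = statistics.variance(counts)  # sample variance (n - 1 denominator)
    return max((var - mean) / mean ** 2, 0.0)


# Hypothetical replicate counts for one gene
counts = [10, 20, 30]
alpha = moments_dispersion(counts)
print(alpha, nb_variance(statistics.mean(counts), alpha))
```

With a dispersion of zero the variance equals the mean, which is the Poisson case; larger dispersions let the variance grow quadratically with the mean.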
tool / pypi

pysam is a Python module for reading, manipulating and writing genomic data sets. It is a lightweight wrapper of the htslib C-API and provides facilities to read and write SAM/BAM/VCF/BCF/BED/GFF/GTF/FASTA/FASTQ files, as well as access to the command line functionality of the samtools and bcftools packages. The module supports compression and random access through indexing. It provides a low-level wrapper around the htslib C-API using Cython and a high-level API for convenient access to the data within standard genomic file formats.

GitHub URL
tool / pypi

pypi URL email
tool / cran

Simple, Consistent Wrappers for Common String Operations: A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another.

cran email
tool / pypi

Utils: Sometimes you write a function over and over again; sometimes you look up at the ceiling and ask "why, Guido, why isn't this included in the standard library?" Well, we perhaps can't answer that question. But we can collect those functions into a centralized place!

Provided things: Utils is broken up into broad swathes of functionality, to ease the task of remembering where exactly something lives.

enum: Python doesn't have a built-in way to define an enum, so this module provides what (I think) is a pretty clean way to go about them.

.. code-block:: python

    from utils import enum

    class Colors(enum.Enum):
        RED = 0
        GREEN = 1

        # Defining an Enum class allows you to specify a few
        # things about the way it's going to behave.
        class Options:
            frozen = True  # can't change attributes
            stric

email URL pypi URL
tool / biotools

A tool that finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance.

edam edam edam edam edam URL doi doi doi biotools email edam edam edam edam edam edam edam edam URL URL doi
tool / biotools

MATLAB program for protein quantitation by iTRAQ.

biotools email edam edam edam edam edam edam URL URL URL pmid edam edam edam edam edam URL URL
tool / biotools

featureCounts is a very efficient read quantifier. It can be used to summarize RNA-seq reads and gDNA-seq reads to a variety of genomic features such as genes, exons, promoters, gene bodies and genomic bins. It is included in the Bioconductor Rsubread package and also in the SourceForge Subread package.

biotools email pmid email email edam URL URL

Trim Galore! is a wrapper script to automate quality and adapter trimming as well as quality control, with some added functionality to remove biased methylation positions for RRBS sequence files (for directional, non-directional (or paired-end) sequencing).

tool / biotools

Kraken 2 is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer. The k-mer assignments inform the classification algorithm.

edam edam edam URL biotools email edam edam edam GitHub URL doi
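The k-mer-to-LCA mapping described above can be sketched with a toy taxonomy. Everything here is a simplified assumption for illustration: the taxonomy, genome strings, and function names are made up, the database is a plain dictionary, and the final step (choosing the hit taxon with the highest summed hit count along its path to the root) is one plausible way the k-mer assignments could inform classification, not a description of the tool's actual implementation.

```python
from collections import Counter

# Hypothetical taxonomy: child -> parent
PARENT = {"root": None, "Bacteria": "root", "E_coli": "Bacteria", "S_enterica": "Bacteria"}


def lineage(taxon):
    """Path from a taxon up to the root, inclusive."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def lca(a, b):
    """Lowest common ancestor of two taxa."""
    ancestors = set(lineage(a))
    for t in lineage(b):
        if t in ancestors:
            return t
    raise ValueError("taxa share no ancestor")


def kmers(seq, k):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_db(genomes, k):
    """Map each k-mer to the LCA of all genomes that contain it."""
    db = {}
    for taxon, seq in genomes.items():
        for km in kmers(seq, k):
            db[km] = lca(db[km], taxon) if km in db else taxon
    return db


def classify(read, db, k):
    """Pick the hit taxon whose root-to-taxon path collects the most k-mer hits."""
    hits = Counter(db[km] for km in kmers(read, k) if km in db)
    if not hits:
        return None  # unclassified
    return max(hits, key=lambda t: sum(hits[x] for x in lineage(t)))


# Hypothetical genomes sharing the k-mer "ACGT"
genomes = {"E_coli": "ACGTACGGA", "S_enterica": "ACGTTTGCA"}
db = build_db(genomes, k=4)
print(db["ACGT"], classify("ACGTACG", db, k=4))
```

A read consisting only of the shared k-mer is assigned to the common ancestor, while a read that also carries genome-specific k-mers is pushed down to the more specific taxon.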
tool / biotools

User-friendly tools for the normalization and visualization of deep-sequencing data.

biotools email edam edam URL URL pmid
tool / bioconda

Tools for dealing with SAM, BAM and CRAM files

GitHub biotools bioconda
tool / biotools

A high-speed search engine pLink 2 with systematic evaluation for proteome-scale identification of cross-linked peptides.

biotools email email edam edam URL doi
tool / biotools

The Consensus server aligns a sequence to a structural template using a consensus of 5 different alignment methods. A measure of reliability is produced for each alignment position in order to predict the suitability of regions for comparative modelling.

biotools email edam URL edam edam edam URL pmid
tool / biotools

Bowtie is an ultrafast, memory-efficient short read aligner.

email edam edam edam edam edam edam URL URL doi doi biotools edam edam edam edam edam URL doi