Keywords and Expertise
Use keywords to characterize workflows and forum posts, and reach out to sellers with relevant expertise
Link together a non-contiguous series of genomic sequences into a scaffold, consisting of sequences separated by gaps of known length. The sequences that are linked are typically typically contigs; contiguous sequences corresponding to read overlaps.|Scaffold may be positioned along a chromosome physical map to create a "golden path".
Detect, identify and map mutations, such as single nucleotide polymorphisms, short indels and structural variants, in multiple DNA sequences. Typically the alignment and comparison of the fluorescent traces produced by DNA sequencing hardware, to study genomic alterations.|Somatic variant calling is the detection of variations established in somatic cells and hence not inherited as a germ line variant.|Variant detection|Methods often utilise a database of aligned reads.
Identify putative protein-binding regions in a genome sequence from analysis of Chip-sequencing data or ChIP-on-chip data.|Chip-sequencing combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to generate a set of reads, which are aligned to a genome sequence. The enriched areas contain the binding sites of DNA-associated proteins. For example, a transcription factor binding site. ChIP-on-chip in contrast combines chromatin immunoprecipitation ('ChIP') with microarray ('chip'). "Peak-pair calling" is similar to "Peak calling" in the context of ChIP-exo.
Model protein-ligand (for example protein-peptide) binding using comparative modelling or other techniques.|Virtual screening is used in drug discovery to search libraries of small molecules in order to identify those molecules which are most likely to bind to a drug target (typically a protein receptor or enzyme).|Methods aim to predict the position and orientation of a ligand bound to a protein receptor or enzyme.
Combine (align and merge) overlapping fragments of a DNA sequence to reconstruct the original sequence.|For example, assemble overlapping reads from paired-end sequencers into contigs (a contiguous sequence corresponding to read overlaps). Or assemble contigs, for example ESTs and genomic DNA fragments, depending on the detected fragment overlaps.
Infer haplotypes, either alleles at multiple loci that are transmitted together on the same chromosome, or a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated.|Haplotype inference can help in population genetic studies and the identification of complex disease genes, , and is typically based on aligned single nucleotide polymorphism (SNP) fragments. Haplotype comparison is a useful way to characterize the genetic variation between individuals. An individual's haplotype describes which nucleotide base occurs at each position for a set of common SNPs. Tools might use combinatorial functions (for example parsimony) or a likelihood function or model with optimisation such as minimum error correction (MEC) model, expectation-maximisation algorithm (EM), genetic algorithm or Markov chain Monte Carlo (MCMC).
Analyse flexibility and motion in protein structure.|Use this concept for analysis of flexible and rigid residues, local chain deformability, regions undergoing conformational change, molecular vibrations or fluctuational dynamics, domain motions or other large-scale structural transitions in a protein structure.
Detect, predict and identify genes or components of genes in DNA sequences, including promoters, coding regions, splice sites, etc.|Methods for gene prediction might be ab initio, based on phylogenetic comparisons, use motifs, sequence features, support vector machine, alignment etc.
Analyse a genetic variation, for example to annotate its location, alleles, classification, and effects on individual transcripts predicted for a gene model.|Genetic variation annotation provides contextual interpretation of coding SNP consequences in transcripts. It allows comparisons to be made between variation data in different populations or strains for the same transcript.
Pre-process sequence reads to ensure (or improve) quality and reliability.|For example process paired end reads to trim low quality ends remove short sequences, identify sequence inserts, detect chimeric reads, or remove low quality sequnces including vector, adaptor, low complexity and contaminant sequences. Sequences might come from genomic DNA library, EST libraries, SSH library and so on.
Generate a genetic (linkage) map of a DNA sequence (typically a chromosome) showing the relative positions of genetic markers based on estimation of non-physical distances.|Mapping involves ordering genetic loci along a chromosome and estimating the physical distance between loci. A genetic map shows the relative (not physical) position of known genes and genetic markers.|This includes mapping of the genetic architecture of dynamic complex traits (functional mapping), e.g. by characterisation of the underlying quantitative trait loci (QTLs) or nucleotides (QTNs).