Genome-wide Comparative Alignment Tools

from Luo et al. (2011) in Microbial Population Genetics

Genome sequence comparison has been an important method for understanding gene function and genome evolution since the early days of gene sequencing. Alignment of DNA sequences is the core process in comparative genomics. In recent years, an important new sequence-analysis task has emerged: comparing an entire genome with another. Several powerful alignment algorithms have been developed to align two or more sequences.

MUMmer
MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form. It can handle thousands of contigs from a shotgun sequencing project and align them to another set of contigs or to a genome using the NUCmer program included with the system. If the species are too divergent for a DNA sequence alignment to detect similarity, the PROmer program included with the system can generate alignments based on the six-frame translations of both input sequences. The original MUMmer system, version 1.0, was described in a 1999 Nucleic Acids Research paper. Version 2.1 appeared a few years later and was described in a 2002 Nucleic Acids Research paper, and the most recent version, MUMmer 3.0, was described in a 2004 Genome Biology paper.
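
The six-frame translation that PROmer relies on can be illustrated with a short Python sketch. This is not PROmer's code; the function names (reverse_complement, translate, six_frame_translations) are hypothetical, and the sketch assumes the standard genetic code and an input containing only A, C, G and T (any other codon is rendered as "X").

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate three forward and three reverse-strand reading frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
print(frames["+1"])  # MA*
```

Aligning the translated frames rather than the nucleotides lets similarity at the protein level show through even when the underlying DNA has diverged.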

BLAT
BLAT (the BLAST-Like Alignment Tool) is a sequence alignment tool that is similar in many ways to BLAST. The program rapidly scans for relatively short matches (hits) and extends these into high-scoring pairs (HSPs). However, BLAT differs from BLAST in several significant ways. Where BLAST builds an index of the query sequence and then scans linearly through the database, BLAT builds an index of the database and then scans linearly through the query sequence. Where BLAST triggers an extension when one or two hits occur in proximity to each other, BLAT can trigger extensions on any number of perfect or near-perfect hits. Where BLAST returns each area of homology between two sequences as a separate alignment, BLAT stitches them together into a larger alignment. Both the client/server and the stand-alone versions can do comparisons at the nucleotide, protein, or translated nucleotide level.
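
The index-the-database, scan-the-query strategy described above can be sketched in a few lines of Python. This is a simplified illustration, not BLAT's implementation: the function names are hypothetical, only exact k-mer matches are found, and the choice to index non-overlapping database k-mers and to group hits by diagonal is an assumption made to keep the example small.

```python
from collections import defaultdict


def build_index(database, k):
    """Index non-overlapping k-mers of the database by start position."""
    index = defaultdict(list)
    for pos in range(0, len(database) - k + 1, k):
        index[database[pos:pos + k]].append(pos)
    return index


def find_hits(query, index, k):
    """Scan the query linearly; return (query_pos, db_pos) for each exact hit."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for dpos in index.get(query[qpos:qpos + k], ()):
            hits.append((qpos, dpos))
    return hits


def hits_by_diagonal(hits):
    """Group hits by diagonal (db_pos - query_pos).

    Several hits on one diagonal are candidates for a single extension.
    """
    diagonals = defaultdict(list)
    for qpos, dpos in hits:
        diagonals[dpos - qpos].append((qpos, dpos))
    return dict(diagonals)


database = "AAAACCCCGGGGTTTT"
index = build_index(database, k=4)      # built once, reused for every query
hits = find_hits("CCCCGGGG", index, k=4)
print(hits_by_diagonal(hits))           # {4: [(0, 4), (4, 8)]}
```

Because the index is built from the database, it can be constructed once and reused for every incoming query, which is what makes a client/server arrangement practical.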

MegaBLAST
MegaBLAST uses the greedy algorithm of Zhang et al. for nucleotide sequence alignment search and concatenates many queries to save time spent scanning the database. The program is optimized for aligning sequences that differ only slightly as a result of sequencing or other similar "errors". It is up to 10 times faster than more common sequence similarity search and alignment programs and can therefore be used to swiftly compare two large sets of sequences against each other.
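
The benefit of concatenating queries can be shown with a small Python sketch: many queries are joined into one string so that the database is scanned a single time, and each hit is then mapped back to the query it came from. This is only an illustration of the batching idea under simplifying assumptions; it uses exact k-mer matching rather than the greedy alignment of Zhang et al., and the function names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def concatenate_queries(queries):
    """Join queries into one string and record each query's start offset."""
    starts, offset = [], 0
    for query in queries:
        starts.append(offset)
        offset += len(query)
    return "".join(queries), starts


def scan_database_once(queries, database, k):
    """Return (query_index, query_pos, db_pos) hits from one database pass."""
    combined, starts = concatenate_queries(queries)

    # Index k-mers of the combined string, skipping words that would
    # straddle the boundary between two queries.
    word_positions = defaultdict(list)
    for i, start in enumerate(starts):
        end = start + len(queries[i])
        for pos in range(start, end - k + 1):
            word_positions[combined[pos:pos + k]].append(pos)

    hits = []
    for dpos in range(len(database) - k + 1):
        for cpos in word_positions.get(database[dpos:dpos + k], ()):
            # Map the combined coordinate back to its source query.
            qi = bisect_right(starts, cpos) - 1
            hits.append((qi, cpos - starts[qi], dpos))
    return hits


hits = scan_database_once(["ACGT", "TTGA"], "GGACGTCCTTGAGG", k=4)
print(hits)  # [(0, 0, 2), (1, 0, 8)]
```

With N queries, the database is read once instead of N times, so the saving grows with the size of the query set.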

Suggested reading:
1. Microbial Population Genetics
2. Genomics books