Genome comparison visualization tool

From Luo et al. (2011), in Microbial Population Genetics

Comparative analysis is an increasingly important step in the annotation and analysis of genome sequence data, allowing phenotypic differences between strains and species to be correlated with changes in their chromosomes. For example, comparative sequence analysis has enabled the identification of cis-regulatory regions and the location of coding exons by purely computational means. Visual front-ends make alignments intuitive and easy to view, which facilitates the discovery of conserved sequences that mark functionally significant regions. Below we describe a few visualization tools for genome comparison.

PipMaker and MultiPipMaker
PipMaker is a World-Wide Web site for comparing two long DNA sequences to identify conserved segments and for producing informative, high-resolution displays of the resulting alignments. One display is a percent identity plot (pip), which shows, in a compact and easily understandable form, both the position in one sequence and the degree of similarity for each aligning segment between the two sequences. The web site also provides a plot of the locations of those segments in both species. PipMaker is appropriate for comparing genomic sequences from any two related species, although the types of information that can be inferred (e.g., protein-coding regions and cis-regulatory elements) depend on the level of conservation and on the time and divergence rate since the separation of the species. PipMaker supports analysis of unfinished or working draft sequences by permitting one of the two sequences to be in unoriented and unordered contigs.

MultiPipMaker similarly allows the user to visualize relationships among more than two sequences. All pairwise alignments with the first sequence are computed and then returned as interleaved pips. In addition, MultiPipMaker can be asked to compute a true multiple alignment of the input sequences and return a nucleotide-level view of the results.
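
The data behind a percent identity plot can be sketched in a few lines of Python. This is only an illustration of the idea, not PipMaker's own code: the function name pip_segments is hypothetical, the input is assumed to be two rows of a pairwise alignment with "-" marking gaps, and a segment is assumed to end wherever either row has a gap.

```python
def pip_segments(aligned_a, aligned_b):
    """Return (start, end, percent_identity) for each gap-free segment.

    start and end are 1-based, inclusive positions in the first sequence.
    Assumption: "-" is the gap character in both alignment rows.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    segments = []
    pos_a = 0         # residues of the first sequence consumed so far
    seg_start = None  # position in the first sequence where the segment began
    seg_len = 0       # columns in the current gap-free segment
    matches = 0       # identical columns in the current segment

    def flush():
        # Record the current segment, if there is one.
        if seg_len:
            identity = 100.0 * matches / seg_len
            segments.append((seg_start, seg_start + seg_len - 1, identity))

    for a, b in zip(aligned_a, aligned_b):
        if a != "-":
            pos_a += 1
        if a == "-" or b == "-":
            # A gap in either row closes the current segment.
            flush()
            seg_start, seg_len, matches = None, 0, 0
            continue
        if seg_len == 0:
            seg_start = pos_a
        seg_len += 1
        if a.upper() == b.upper():
            matches += 1

    flush()
    return segments


print(pip_segments("ACGT-ACGTA", "ACCTTAC-TA"))
# [(1, 4, 75.0), (5, 6, 100.0), (8, 9, 100.0)]
```

Each tuple corresponds to one horizontal line in a pip: its extent along the first sequence and its height on the percent identity axis.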

ACT
ACT (Artemis Comparison Tool) is a DNA sequence comparison viewer based on Artemis, an annotation tool. It displays comparisons between sequences, such as parsed BLAST alignments. Like the other Artemis tools, ACT is written in Java and runs on Unix, GNU/Linux, Macintosh and MS Windows systems. It can read complete EMBL and GENBANK entries, or sequences in FASTA or raw format. Additional sequence features can be read from files in EMBL, GENBANK or GFF format. The sequence comparison displayed by ACT is usually the result of running a blastn or tblastx search.
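
As an illustration of what a parsed BLAST comparison contains, the sketch below reads BLAST tabular output into match records. It assumes the standard 12-column tabular layout produced by BLAST+ (query id, subject id, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). The names Hit and parse_blast_tabular are hypothetical and are not part of ACT; which comparison file formats a given ACT release accepts should be checked against its documentation.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One match between a query region and a subject region."""
    query: str
    subject: str
    percent_identity: float
    q_start: int
    q_end: int
    s_start: int
    s_end: int

    @property
    def inverted(self):
        # The match is inverted when the two regions run in opposite
        # directions, i.e. exactly one coordinate pair is descending.
        return (self.q_start <= self.q_end) != (self.s_start <= self.s_end)


def parse_blast_tabular(lines):
    """Parse 12-column BLAST tabular lines into Hit records."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 tab-separated fields: {line!r}")
        hits.append(Hit(
            query=fields[0],
            subject=fields[1],
            percent_identity=float(fields[2]),
            q_start=int(fields[6]),
            q_end=int(fields[7]),
            s_start=int(fields[8]),
            s_end=int(fields[9]),
        ))
    return hits
```

A comparison viewer needs little more than these coordinates: each hit links a region of one sequence to a region of the other, and the inverted flag distinguishes matches on the opposite strand.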

VISTA
VISTA (Visualization and Alignment Software for Comparative Genomics) is a visualization tool for alignments; it displays GLASS alignments. The program depicts long alignments of DNA sequences from two or more organisms, together with various types of annotation, in a clear and easily interpretable format. It was originally developed to locate conserved sequences in syntenic regions of different genomes. The key features of the VISTA program are:
1. Clean graphical output, allowing easy identification of sequence similarities and differences.
2. Easy configuration, enabling the visualization of alignments of up to several million bases at different levels of resolution.
3. Display of alignments of draft sequences.
4. Display of sequence annotations such as repeats, coding exons and UTRs.
The VISTA plot is based on moving a user-specified window over the entire alignment and calculating the percent identity over the window at each base pair.
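
The windowed calculation can be sketched as follows. This is a minimal illustration rather than VISTA's implementation: the function name windowed_identity is hypothetical, the score is computed per alignment column rather than per base of a reference sequence, a column containing a gap ("-") counts as a mismatch, and the window is truncated at the ends of the alignment. All of these are simplifying assumptions.

```python
def windowed_identity(aligned_a, aligned_b, window):
    """Percent identity in a window centred on each alignment column."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window < 1:
        raise ValueError("window must be at least 1")

    n = len(aligned_a)

    # prefix[i] = number of identical, non-gap columns among the first i.
    prefix = [0]
    for a, b in zip(aligned_a, aligned_b):
        is_match = a != "-" and a.upper() == b.upper()
        prefix.append(prefix[-1] + int(is_match))

    half = window // 2
    scores = []
    for i in range(n):
        lo = max(0, i - half)           # first column in the window
        hi = min(n, i - half + window)  # one past the last column
        scores.append(100.0 * (prefix[hi] - prefix[lo]) / (hi - lo))
    return scores
```

The prefix sums keep the cost linear in the alignment length regardless of the window width, which matters when the alignment spans millions of bases. A wider window smooths the curve; a narrower one shows finer detail but is noisier.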

SynPlot
SynPlot is an application program, written in Perl, for viewing global alignments of syntenic regions of genomic DNA sequence; it displays DIALIGN and GLASS alignments. The alignment is used to calculate the percentage identity within a sliding window, the width of which can be specified by the user. This information is used to draw a picture of the alignment in PostScript format. The sequences are rendered as lines interrupted by spaces corresponding to the gaps introduced by the alignment, with a plot of the percentage identity underneath. Features can also be drawn on the sequence lines. For this, the program uses a GFF file output by ACeDB from the annotated genomic sequence, and a configuration file that specifies the color, height and order in which the rectangles representing the features are drawn.
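
The rendering of a sequence as a line interrupted by gaps reduces to finding the runs of non-gap columns in each alignment row. The sketch below shows one way to do this in Python; it is not taken from SynPlot, which is written in Perl, and the function name drawn_segments and the "-" gap character are assumptions made for the example.

```python
def drawn_segments(aligned_seq, gap_char="-"):
    """Return half-open (start, end) column intervals that contain residues.

    Each interval would be drawn as a stretch of line; the columns between
    intervals are the gaps introduced by the alignment and are left blank.
    """
    segments = []
    start = None  # column where the current run of residues began
    for col, ch in enumerate(aligned_seq):
        if ch != gap_char:
            if start is None:
                start = col
        elif start is not None:
            segments.append((start, col))
            start = None
    if start is not None:
        segments.append((start, len(aligned_seq)))
    return segments


print(drawn_segments("AC--GT-A"))
# [(0, 2), (4, 6), (7, 8)]
```

Scaling these column intervals by a fixed number of drawing units per column gives the horizontal extent of each line segment on the page.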

Suggested reading:
1. Microbial Population Genetics
2. Genomics books