From Luo et al. (2011), in Microbial Population Genetics
A variety of specialized data resources manage the results of microbial genome data processing and interpretation at different stages. These stages correspond to different levels of microbial genome characterization. Draft and finished microbial genome data are continuously incorporated into these resources. Below are brief descriptions of the main microbial genome data resources.
GOLD Genomes Online Database
GOLD (Genomes Online Database) is a World Wide Web resource for comprehensive access to information regarding complete and ongoing genome projects, as well as metagenomes and metadata, around the world. GOLD was created in 1997 with the aim to (i) monitor all genome sequencing projects from instigation to completion and (ii) provide the community with a centralized database integrating diverse information related to those projects in the form of hypertext links to disparate web-based resources. Although several different types of statistics, related to each of the data fields, can be derived by the user at any point using the search engine, the database also provides readily available graphical overviews for specific data types.
Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of March 2008, GOLD contained information on more than 3613 sequencing projects, of which 731 had been completed and had their sequence data deposited in the public databases (GOLD V2.0). GOLD continues to expand with the goal of providing metadata related to the projects and the organisms/environments in line with the 'Minimum Information about a Genome Sequence' (MIGS) guideline.
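As a quick check on the March 2008 figures quoted above, the share of tracked projects that were complete can be worked out directly. The snippet is only an illustrative calculation: the function and variable names are ours, and because the text says "more than 3613" projects, the result is best read as an approximate upper bound.

```python
# Counts as quoted in the text for GOLD in March 2008.
# "More than 3613" projects means 3613 is a lower bound on the total.
total_projects = 3613
completed_projects = 731


def completed_share(completed: int, total: int) -> float:
    """Return the completed fraction of projects as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * completed / total


share = completed_share(completed_projects, total_projects)
print(f"About {share:.1f}% of tracked projects were complete")
```

Roughly one project in five had reached completion, which matches GOLD's stated role of tracking projects from instigation onward rather than only finished genomes.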
ASAP A systematic annotation package
ASAP (a systematic annotation package for community analysis of genomes) is a relational database with a web interface, developed to store, update and distribute genome sequence data and their functional characterizations. ASAP facilitates ongoing community annotation of genomes and tracking of information as genome projects move from preliminary data collection through post-sequencing functional analysis. The database includes multiple genome sequences at various stages of analysis, corresponding experimental data and access to collections of related genome resources. Its development was motivated by the need to involve a greater community of researchers more directly, with their collective expertise, in keeping the genome annotation current, and to provide a synergistic link between up-to-date annotation and functional genomic data. ASAP supports three levels of users: public viewers, annotators and curators. Public viewers can readily browse updated annotation information, such as that for Escherichia coli K-12 strain MG1655, genome-wide transcript profiles from more than 50 microarray experiments and an extensive collection of mutant strains and associated phenotypic data. Annotators worldwide are currently using ASAP to participate in a community annotation project for the Erwinia chrysanthemi strain 3937 genome. Curation of the E. chrysanthemi genome annotation, as well as of additional published enterobacterial genomes, is underway, and the results will be publicly accessible in the near future.
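The three user levels can be pictured as an ordered set of roles. The sketch below is an assumption made for illustration only: the text names the levels but does not spell out their permissions, so the action names (`browse`, `annotate`, `curate`), the strict ordering and the `is_allowed` helper are hypothetical and are not taken from ASAP itself.

```python
from enum import IntEnum


class UserLevel(IntEnum):
    """The three ASAP user levels named in the text, in assumed rank order."""
    PUBLIC_VIEWER = 1
    ANNOTATOR = 2
    CURATOR = 3


# Hypothetical mapping from an action to the lowest level allowed to do it.
REQUIRED_LEVEL = {
    "browse": UserLevel.PUBLIC_VIEWER,
    "annotate": UserLevel.ANNOTATOR,
    "curate": UserLevel.CURATOR,
}


def is_allowed(level: UserLevel, action: str) -> bool:
    """Return True if a user at `level` may perform `action`."""
    if action not in REQUIRED_LEVEL:
        raise KeyError(f"unknown action: {action}")
    return level >= REQUIRED_LEVEL[action]


print(is_allowed(UserLevel.ANNOTATOR, "browse"))
print(is_allowed(UserLevel.PUBLIC_VIEWER, "annotate"))
```

Under this assumed ordering each level inherits the abilities of the one below it, so a curator can also annotate and browse.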
CMR Comprehensive Microbial Resource
CMR (Comprehensive Microbial Resource) is a tool that allows researchers to access all the bacterial genome sequences completed to date. It contains robust annotation of all completed microbial genomes and allows for a wide variety of data retrievals. For each genome not sequenced at The Institute for Genomic Research (TIGR), two kinds of annotation are displayed: the Primary annotation taken from the genome sequencing center and the TIGR annotation generated by an automated annotation process at TIGR. CMR thus allows access to all the information on all of the bacterial genomes or any subset of them. Retrievals can be based on protein properties such as molecular role assignments and taxonomy. The CMR also has special web-based tools to allow data mining using pre-run homology searches, whole-genome dot-plots, batch downloading and traversal across genomes using a variety of data types.
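The retrieval model described above (annotation from two sources, with filtering on protein properties) can be pictured with a small in-memory sketch. Everything in it is hypothetical: the `ProteinRecord` fields, the placeholder sample values and the `retrieve` helper are invented for illustration and do not reflect CMR's actual schema or interface.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ProteinRecord:
    """One annotated protein; all fields are hypothetical placeholders."""
    locus: str
    taxon: str
    role: str
    source: str  # "primary" (sequencing center) or "tigr" (automated)


def retrieve(
    records: Iterable[ProteinRecord],
    role: Optional[str] = None,
    taxon: Optional[str] = None,
    source: Optional[str] = None,
) -> List[ProteinRecord]:
    """Return the records that match every criterion that is not None."""
    return [
        r for r in records
        if (role is None or r.role == role)
        and (taxon is None or r.taxon == taxon)
        and (source is None or r.source == source)
    ]


# Made-up sample data; the loci and taxa are placeholders, not real entries.
RECORDS = [
    ProteinRecord("locus_001", "Taxon A", "transport", "primary"),
    ProteinRecord("locus_001", "Taxon A", "transport", "tigr"),
    ProteinRecord("locus_002", "Taxon A", "energy metabolism", "tigr"),
    ProteinRecord("locus_101", "Taxon B", "transport", "primary"),
]

transporters = retrieve(RECORDS, role="transport")
tigr_taxon_a = retrieve(RECORDS, taxon="Taxon A", source="tigr")
print(len(transporters), len(tigr_taxon_a))
```

Keeping the annotation source as an ordinary field lets the same query run against either annotation, or against both at once for comparison.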
IMG Integrated Microbial Genomes
The IMG (Integrated Microbial Genomes) system serves as a community resource for comparative analysis and annotation of all publicly available genomes from the three domains of life, in a uniquely integrated context. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. An increasing number of eukaryotic genomes, viruses (including phages) and plasmids have also been added to IMG in order to increase its genomic context for comparative analysis. IMG's analytical tools have been gradually generalized and enhanced in terms of their usability, analysis flow and performance. These tools allow users to focus on a subset of genes, genomes and functions of interest, and conduct analysis using summary tables, graphical viewers and various methods for comparing genes, pathways and functions across genomes.
SEED Comparative genomics research
SEED is a software environment to support early phases in building design that has been adopted for comparative genomics research. Database support in SEED allows designers to store and retrieve different design versions, alternatives and past designs that can be reused and adapted in different contexts (case-based design in the terminology of Artificial Intelligence). In addition, the database stores recurring problem specifications and typical requirements for building types or functional areas common to many buildings. The database serves also as a main means of information exchange between modules, which do not communicate design decisions directly to each other. Current literature refers to this as information modeling or product and process modeling.