Anthony R. Kerlavage, Ph.D.
Senior Director, Gene Discovery
Celera Genomics
Rockville, MD USA
Abstract
The field of genomics was radically changed by the sequencing of the first complete microbial genome, Haemophilus influenzae, by The Institute for Genomic Research (TIGR) [1]. This project made it apparent that the DNA of entire complex organisms, many megabases in size, could be sequenced accurately and rapidly using a "shotgun" sequencing strategy. Since then, TIGR and other labs have together completely sequenced the genomes of more than 20 microbes. Knowing the complete genome sequences of the pathogens in this group will open up exciting opportunities to develop novel pharmaceuticals, biologics, and vaccines. The genomes of two important eukaryotic model organisms, S. cerevisiae [2] and C. elegans [3], have also been completed. In addition, several chromosomes from P. falciparum and A. thaliana are finished, and these entire genomes will soon be complete.
Across all of these species, nearly half of the candidate genes identified so far cannot be assigned a definitive biological role, leaving a tremendous opportunity for functional as well as computational genomics. At the same time, a combination of molecular sequence analysis techniques has yielded new insights into the metabolic pathways, cell-surface receptor and transporter complement, and phylogeny of these organisms. The availability of these complete genomes makes comparative genomic analysis possible, leading to the discovery of synteny among organisms as well as of the regulatory and developmental networks that control gene expression. The integration and semantic representation of this wealth of data will be critical to our ability to understand it.
At Celera Genomics, we have set the goal of becoming the definitive source of genomic and associated medical information, which scientists will use to develop a better understanding of the biological processes in humans and agriculturally important organisms and to deliver improved healthcare in the future. Using breakthrough DNA sequencing technology, we are operating a genomic sequencing facility with an expected capacity greater than the current combined world output [4]. The early focus at Celera will be on completing the genomes of human, mouse, Drosophila, and rice. While the size of these genomes and the speed with which they will be sequenced present enormous computational challenges for the discovery and characterization of genes, they also represent a major opportunity to advance the understanding of living systems.
References