ISMB99 - Tutorial 3

 

The challenge of annotating a complete eukaryotic genome:
A case study in Drosophila melanogaster


Martin Reese

Many of the technical issues involved in sequencing complete genomes are essentially solved. Technologies already exist that provide sufficient solutions for ascertaining sequencing error rates and for assembling sequence data. Standards and rules for annotating DNA sequences, however, remain an outstanding problem.

How should genomes be annotated, what should be annotated, which computational tools are most effective, how reliable are these annotations, how organism-specific do the tools have to be, and ultimately how should the computational results be presented to the community? All of these questions remain open. The tutorial will give an overview and assessment of the current state of annotation, based on experience gained at the Drosophila melanogaster genome project.

In the tutorial we will do three things. First, with the participation of computational biologists from the community, we will compare existing tools for sequence annotation. We will do this by providing a 3-megabase sequence that has been well characterized at our center as a test bed for evaluating different feature-finding algorithms. This is similar to what has been done for protein structure prediction at the CASP (Critical Assessment of Techniques for Protein Structure Prediction) conferences (see PredictionCenter). Second, we will break the annotation process down into its separate aspects. This will serve to clarify the term "annotation", which is often used to describe collectively a process that has a number of discrete steps. Third, we will discuss which annotation problems are essentially solved and which remain open.

In connection with this tutorial, its authors have performed a community-wide annotation experiment (announcement).
The results are available here:

-> Results of the Annotation Experiment

-> Tutorial Handouts
-> Tutorial Program
-> ISMB 99