Project Description
An integral part of the class is a semester-long project that will allow you to apply some of the algorithms we study in class to real biological data. More specifically, you will investigate the connection between the topology of protein-protein interaction networks and protein function.
You will work in groups of three or four, depending on the number of people in the class, with each group responsible for the analysis of one particular genome-wide protein interaction network for Saccharomyces cerevisiae. You will have four networks to choose from, as detailed below; each network was compiled from different experimental sources and/or using different computational methodologies. We therefore hope that each group will have a slightly different story to tell about the topology/function connection.
Throughout the semester we will post four analysis milestones. For each milestone you will have three to four weeks to complete the analysis and prepare a short (15-25 minute) oral presentation describing your findings.
After each milestone we will dedicate one class to a discussion of the results. Each group will have 15-25 minutes to present its findings, and the oral presentations will be followed by a discussion. At the end of the semester you will also prepare a written report that draws on your results across the milestones to shed light on the topology/function connection in your network.
Grading and Policies
We will take into account the following when grading your project: (i) quality of oral presentations, (ii) active participation in the discussion sessions, (iii) quality of the final report. We will not grade each milestone separately, but we can provide general feedback on your performance throughout the semester.
We expect each group member to be actively involved in all parts of the project: analysis, preparation of oral presentations, and the final report. In particular, we expect each group member to give an oral presentation for at least one milestone.
We strongly recommend Python for computational analysis tasks. We expect each group to write its own analysis code and may ask for source code to enforce this policy.
Biological Data
- Protein interaction networks
- Literature curation (unweighted)
- Primary reference: T. Reguly et al., Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae, Journal of Biology, 5:11, 2006 (PubMed).
- You can download the network from here.
- High-throughput complex purification experiments (weighted)
- Primary reference: S.R. Collins et al., Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae, Molecular & Cellular Proteomics, 6:439-450, 2007 (PubMed).
- You can download the network from here.
- High-throughput yeast-two-hybrid experiments (unweighted)
- Primary reference: H. Yu et al., High-quality binary protein interaction map of the yeast interactome network, Science, 322:104-110, 2008 (PubMed).
- You can download the network from here.
- Integration of all experimental evidence (weighted)
- Primary reference: L. Kiemer, S. Costa, M. Ueffing, and G. Cesareni, WI-PHI: a weighted yeast interactome enriched for direct physical interactions, Proteomics, 7:932-943, 2007 (PubMed).
- You can download the network from here.
- Functional annotation
- You can download the annotation from here.
- Optional reading
- B.A. Shoemaker and A.R. Panchenko, Deciphering protein-protein interactions. Part I. Experimental techniques and databases, PLoS Computational Biology, 3:e42, 2007 (PubMed).
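As a starting point for working with the networks listed above, here is a minimal sketch of reading an interaction network into an undirected adjacency map. The file layout is an assumption, not the documented format of any of the four downloads: one interaction per line, two whitespace-separated protein identifiers, and an optional third column holding the weight. The function names `read_network` and `degrees` and the toy identifiers are hypothetical, so check the actual files before reusing any of this.

```python
import io
from collections import defaultdict


def read_network(handle):
    """Parse lines of 'proteinA proteinB [weight]' into an undirected adjacency map.

    ASSUMPTION: whitespace-separated columns; unweighted networks omit the
    third column, in which case every interaction gets weight 1.0.
    """
    adjacency = defaultdict(dict)
    for line in handle:
        fields = line.split()
        # Skip blank lines, comment lines, and lines with fewer than two columns.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        a, b = fields[0], fields[1]
        if a == b:
            continue  # ignore self-interactions in this sketch
        weight = float(fields[2]) if len(fields) > 2 else 1.0
        # Store the interaction in both directions (undirected network).
        adjacency[a][b] = weight
        adjacency[b][a] = weight
    return dict(adjacency)


def degrees(adjacency):
    """Return the number of distinct interaction partners of each protein."""
    return {protein: len(partners) for protein, partners in adjacency.items()}


# Toy input with made-up identifiers and weights, for illustration only.
sample = io.StringIO("A B 0.9\nB C 0.4\nA C\n")
network = read_network(sample)
```

To use it on a downloaded file, pass an open file object in place of the `io.StringIO` sample. Keeping the weight in the adjacency map lets the same code serve both the weighted and the unweighted networks.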
Project Milestones
- Warm-Up Milestone: get to know the computational tools and biological data that you will be using throughout the semester.
- Completion date: May 15.
- You can get the detailed description of the milestone from here.
- Slides from group presentations:
- Links and additional reading:
- Function Prediction Milestone: implement an algorithm based on Linear Programming for function prediction.
- Completion date: June 10.
- You can get the detailed description of the milestone from here.
- Slides from group presentations:
- Links and additional reading:
- E. Nabieva, K. Jim, A. Agarwal, B. Chazelle, and M. Singh, Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps, Bioinformatics, 21:i302-10, Suppl 1, 2005 (PubMed).
- You can get the CVXOPT tutorial files from here.
- Network Distances Milestone: use various network distance measures to predict function.
- Completion date: July 17.
- You can get the detailed description of the milestone from here.
- Slides from group presentations:
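To give a flavor of the Network Distances Milestone, the sketch below uses one possible distance measure, the unweighted shortest-path distance computed by breadth-first search, and predicts a protein's function by majority vote among the closest annotated proteins. This is an illustrative assumption, not the method prescribed in the milestone description; the names `bfs_distances` and `predict_function`, the one-label-per-protein annotation map, and the toy network are all hypothetical.

```python
from collections import Counter, deque


def bfs_distances(adjacency, source):
    """Return shortest-path distances (in edges) from source to every reachable protein."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def predict_function(adjacency, annotations, protein):
    """Predict a function by majority vote among the nearest annotated proteins.

    ASSUMPTION: annotations maps each annotated protein to a single label.
    Returns None if no annotated protein is reachable from the query.
    """
    dist = bfs_distances(adjacency, protein)
    annotated = [(d, p) for p, d in dist.items() if p != protein and p in annotations]
    if not annotated:
        return None
    nearest = min(d for d, _ in annotated)
    votes = Counter(annotations[p] for d, p in annotated if d == nearest)
    return votes.most_common(1)[0][0]


# Toy network and labels, made up for illustration only.
toy_adjacency = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
toy_annotations = {"B": "transport", "C": "transport", "E": "metabolism"}
prediction = predict_function(toy_adjacency, toy_annotations, "A")
```

Other distance measures can be swapped in by replacing `bfs_distances` with any function that returns a protein-to-distance map, for example one that takes interaction weights into account.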