Information processing by cells and biologists
The core agenda of post-WWII molecular biology has been defined as the
molecular understanding of how genetic information is transmitted and read
out (see, for example, Stent 1968), and, by the 1950s, the analogy between the tape
in a Turing machine and the linear sequence of nucleotides in DNA was
apparent to both computer scientists and biologists.
In the early 21st century, it may be that molecular biology can fruitfully
return to these roots, by recasting part of its agenda in terms of the
need to understand how biological information is processed. In a somewhat
more modern formulation, cells can be thought of as machines that process and
make decisions on three kinds of information: 1) information stored in the
genome, 2) information about intracellular events (for example, from checkpoint
mechanisms), and 3) information external to the cell.
In many cases the machinery that cells use to make decisions is reasonably
well understood at a qualitative level. However, in no case do we possess a
corresponding quantitative understanding, nor, reflecting this, can we reliably
predict the outcomes of perturbations to the genome, the internal
workings of the cell, or its external environment.
One path to understanding the behavior of these ensembles of components
clearly lies in construction of mechanism-based quantitative models representing
cellular processes. Building such models requires solution of numerous
computational and experimental biological challenges. I will detail some of
these, and progr.
Another path may involve computation on the qualitative biological
knowledge that now exists. Expert biologists reason on this qualitative
information to make statements about the consequences of perturbations, but
expert systems that do the same, in the main, do not exist. Here, although the
need is clear, the relative opacity (to me) of much of the seemingly relevant
computer science literature has made it more difficult to figure out first steps.
Finally, note that information theory (Shannon 1948) has its roots in the 20th
century need to understand transmission of electrical signals through channels.
It is not immediately clear that the representations of biological processes used
by biologists map well to concepts that come from this theory. To give only one
example, one is hard pressed to define or find, inside a cell that is processing
signals from the outside, either the signal or the "bits" (Tukey, 1946) that might
make it up. There may thus be an opportunity here for new theory to guide
thinking and further experiment.
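
For orientation only, the sketch below shows what "bits" mean in Shannon's
setting: the mutual information between a discrete input and a discrete output.
It treats an external signal (absent or present) and a cell response (off or
on) as a two-state channel. The joint probability tables are hypothetical
numbers chosen for illustration, not measurements, and the function name
mutual_information_bits is an assumption of this sketch. Whether any real
cellular process can be reduced to such a table is exactly the open question
raised above.

```python
import math


def mutual_information_bits(joint):
    """Return I(X;Y) in bits for a joint probability table joint[x][y]."""
    # Marginal distributions of the input (rows) and the output (columns).
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    info = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with zero probability contribute nothing
                info += p * math.log2(p / (px[i] * py[j]))
    return info


# Hypothetical joint tables: rows = signal absent/present, columns = response off/on.
perfect = [[0.5, 0.0], [0.0, 0.5]]    # response always matches the signal
noisy = [[0.4, 0.1], [0.1, 0.4]]      # response matches the signal 80% of the time
blind = [[0.25, 0.25], [0.25, 0.25]]  # response is independent of the signal

for name, table in [("perfect", perfect), ("noisy", noisy), ("blind", blind)]:
    print(name, round(mutual_information_bits(table), 3))
```

In this toy setting a perfectly reliable response carries one bit about the
signal, an independent response carries none, and a noisy response falls in
between.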
Brent, R. (2000) Genomic biology. Cell 100, 169-183.
Endy, D. and Brent, R. (2001) Modeling cellular behavior. Nature (supplement),
in press.
Shannon, C. E. (1948) A mathematical theory of communication. Bell System
Technical Journal.
Stent, G. (1968) That was the molecular biology that was. Science 160, 390-394.
Tukey, J. W. (1946) Referenced at www.maa.org.