Viatcheslav R. Akmaev, Scott T. Kelley, Gary D. Stormo
Abstract
Methods based on the Mutual
Information statistic (MI methods) predict structure by looking
for statistical correlations between sequence positions in a set
of aligned sequences. Although MI methods are often quite effective,
these methods ignore the underlying phylogenetic relationships
of the sequences they analyze. Thus, they cannot distinguish between
correlations due to structural interactions and spurious correlations
resulting from phylogenetic history. In this paper, we introduce
a method analogous to MI that incorporates phylogenetic
information. We show that this method accurately recovers the
structures of well-known RNA molecules. We also demonstrate, with
both real and simulated data, that this phylogenetically based
method outperforms standard MI methods and improves the ability
to distinguish interacting from non-interacting positions in RNA.
This method is flexible and may be applied to the prediction
of protein structure given the appropriate evolutionary model.
Because this method incorporates phylogenetic data, it also has
the potential to be improved with the addition of more accurate
phylogenetic information, although we show that even approximate
phylogenies are helpful.
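
For orientation, the standard MI statistic that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration of plain mutual information between two alignment columns, not the phylogenetically based method introduced in the paper. The function names and the toy alignment are hypothetical, and the sketch assumes equal-length, gap-free sequences and treats every sequence as an independent observation, which is exactly the simplification the abstract criticizes.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Plain MI (in bits) between columns i and j of an alignment.

    Assumes `alignment` is a list of equal-length strings with no gaps.
    Each sequence counts as one independent observation, so shared
    phylogenetic history is ignored.
    """
    n = len(alignment)
    # Observed pairs of characters at positions i and j, one per sequence.
    pairs = [(seq[i], seq[j]) for seq in alignment]
    count_i = Counter(a for a, _ in pairs)
    count_j = Counter(b for _, b in pairs)
    count_ij = Counter(pairs)

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (hypothetical): columns 0 and 2 covary as
# Watson-Crick pairs, while column 1 is fully conserved.
toy_alignment = ["GAC", "CAG", "AAU", "UAA"]
mi_paired = mutual_information(toy_alignment, 0, 2)
mi_conserved = mutual_information(toy_alignment, 0, 1)
```

In the toy alignment the covarying columns score high and the conserved column scores zero. Because the statistic never looks at how the sequences are related, the same high score would arise if the covariation were inherited from a few common ancestors rather than maintained by a structural interaction.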