Probabilistic Graphical Models in Computational Molecular Biology

 

 

 

 

 

 

Pierre Baldi

University of California, Irvine

 

 

 

 

OUTLINE

 

 

 

 

 

 

  1. INTRODUCTION: BIOLOGICAL DATA AND PROBLEMS
  2. THE BAYESIAN STATISTICAL FRAMEWORK
  3. PROBABILISTIC GRAPHICAL MODELS
  4. APPLICATIONS

 

 

 

 

 

 

DATA COMPLEXITY AND COMPUTATIONAL PROBLEMS

 

 

 

 

 

 

 

 

 

 

 

 

MACHINE LEARNING

 

 

 

 

 

 

 

 

 

 

THREE KEY FACTORS

 

 

 

 

 

 

Data Mining/Machine Learning Expansion is fueled by:

 

 

 

 

INTUITIVE APPROACH

 

 

 

 

 

 

 

DEDUCTION AND INFERENCE

 

 

 

 

 

 

 

 

 

 

If A ⇒ B and A is true,

then B is true.

 

 

 

If A ⇒ B and B is true,

then A is more plausible.
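
The second pattern (plausible reasoning) can be checked numerically. This is a minimal sketch, not part of the original slides: the implication A ⇒ B is encoded as P(B|A) = 1, and the prior P(A) = 0.2 and the value P(B|non-A) = 0.5 are assumed purely for illustration. The function name `posterior_given_consequence` is hypothetical.

```python
def posterior_given_consequence(p_a, p_b_given_not_a):
    """Return P(A | B) when A implies B, i.e. when P(B | A) = 1."""
    # Total probability: P(B) = P(B|A) P(A) + P(B|non-A) P(non-A), with P(B|A) = 1.
    p_b = 1.0 * p_a + p_b_given_not_a * (1.0 - p_a)
    # Bayes' rule with P(B | A) = 1.
    return p_a / p_b


prior = 0.2  # assumed prior belief in A
post = posterior_given_consequence(prior, 0.5)
print(prior, post)
```

With these assumed numbers the belief in A rises from 0.2 to about 0.33 once B is observed. Since P(B) can never exceed 1, observing a consequence of A cannot make A less plausible.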

 

 

BAYESIAN STATISTICS

 

 

V(non-A) = f(V(A))

V(A,B) = F(V(A), V(B|A))

 

 

PROBABILITY AS DEGREE OF BELIEF

 

 

 

 

P(A|I) = 1-P(non-A|I)

P(A,B|I) = P(A|I) P(B|A,I)

P(A|B) = P(B|A) P(A) / P(B)

P(Model|Data) = P(Data|Model) P(Model) / P(Data)

P(Model|Data,I) = P(Data|Model,I) P(Model|I) / P(Data|I)

P(Model|D_1,D_2,…,D_{n+1}) = P(D_{n+1}|Model) P(Model|D_1,…,D_n) / P(D_{n+1}|D_1,…,D_n)
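
The sequential form of Bayes' rule can be illustrated with a toy model comparison. The sketch below is an assumption-laden example rather than anything taken from the slides: the two candidate models ("fair" and "biased" coins), their head probabilities, the uniform prior, and the data string are all hypothetical. Each observation is folded in one at a time, with the previous posterior acting as the new prior.

```python
def update(prior, likelihoods):
    """One Bayes step: posterior = likelihood * prior / P(data)."""
    unnorm = {m: likelihoods[m] * prior[m] for m in prior}
    evidence = sum(unnorm.values())  # P(D_{n+1} | D_1, ..., D_n)
    return {m: v / evidence for m, v in unnorm.items()}


# Hypothetical models: probability of heads under each one.
heads_prob = {"fair": 0.5, "biased": 0.8}
belief = {"fair": 0.5, "biased": 0.5}  # assumed uniform prior P(Model)
data = "HHTHH"  # made-up observations

for flip in data:
    lik = {m: (p if flip == "H" else 1.0 - p) for m, p in heads_prob.items()}
    belief = update(belief, lik)  # yesterday's posterior is today's prior

print(belief)
```

Because the flips are treated as independent given the model, updating one flip at a time gives the same posterior as applying Bayes' rule once to the whole data set.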

 

 

 

 

DIFFERENT LEVELS OF BAYESIAN INFERENCE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

A non-probabilistic model is NOT a scientific model.

 

 

 

 

 

EXAMPLES OF NON-SCIENTIFIC MODELS

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TO CHOOSE A SIMPLE MODEL BECAUSE DATA IS SCARCE IS LIKE SEARCHING FOR THE KEY UNDER THE LIGHT IN THE PARKING LOT.

 

 

 

 

 

 

 

 

MODEL CLASSES

 

 

 

 

 

 

 

 

 

 

 

LEARNING

 

 

 

 

 

 

 

 

 

 

 

PRIORS

 

 

 

 

 

 

 

 

 

 

 

LEARNING ALGORITHMS

 

 

 

 

 

 

 

 

 

OTHER ASPECTS

 

 

 

 

 

 

 

 

 

AXIOMATIC HIERARCHY

 

 

 

 

 

 

 

 

 

 

 

 

GRAPHICAL MODELS

 

 

 

 

BASIC NOTATION

 

 

 

 

 

 

P(X,Y|Z)=P(X|Z) P(Y|Z)
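
Conditional independence of X and Y given Z can be verified by enumeration on a small example. In the sketch below, all probability tables are invented for illustration, and the helper `marginal` is a hypothetical name. The joint distribution is built so that Z is a common cause of X and Y, then the defining identity is checked for every value of Z.

```python
from itertools import product

# Hypothetical tables: Z is a common cause of X and Y.
p_z = {0: 0.3, 1: 0.7}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_x_given_z[z][x]
p_y_given_z = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # p_y_given_z[z][y]

# Joint P(x, y, z), keyed by the tuple (x, y, z).
joint = {
    (x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
    for x, y, z in product((0, 1), repeat=3)
}


def marginal(keep):
    """Sum the joint over every variable whose position is not in `keep`."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


p_xz, p_yz, pz = marginal((0, 2)), marginal((1, 2)), marginal((2,))
p_xy, p_x, p_y = marginal((0, 1)), marginal((0,)), marginal((1,))

# P(x, y | z) == P(x | z) P(y | z) for all x, y, z?
cond_indep = all(
    abs(joint[x, y, z] / pz[(z,)]
        - (p_xz[x, z] / pz[(z,)]) * (p_yz[y, z] / pz[(z,)])) < 1e-12
    for x, y, z in joint
)
# P(x, y) == P(x) P(y) for all x, y?
marg_indep = all(
    abs(p_xy[x, y] - p_x[(x,)] * p_y[(y,)]) < 1e-12 for x, y in p_xy
)
print(cond_indep, marg_indep)
```

In this example X and Y are independent once Z is known but dependent when Z is summed out, which shows that conditional independence does not imply marginal independence.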

 

 

 

 

 

UNDIRECTED GRAPHICAL MODELS

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

MARKOV PROPERTIES

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

GLOBAL FACTORIZATION

 

 

 

 

 

P(X_1,…,X_n) = exp[-Σ_C f_C(X_C)] / Z
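
A distribution of this kind, a normalized exponential of a negative sum of clique functions, can be computed by brute force on a tiny graph. The sketch below assumes a three-node chain X1 - X2 - X3 with binary variables, whose cliques are the two edges. The clique function `f`, which penalizes disagreeing neighbours, is hypothetical and chosen only for illustration.

```python
from itertools import product
from math import exp

# Cliques of the chain X1 - X2 - X3, as index pairs into a state tuple.
cliques = [(0, 1), (1, 2)]


def f(a, b):
    """Hypothetical clique function: zero cost when neighbours agree."""
    return 0.0 if a == b else 1.0


def energy(x):
    """Sum of clique functions for the full assignment x."""
    return sum(f(x[i], x[j]) for i, j in cliques)


states = list(product((0, 1), repeat=3))
Z = sum(exp(-energy(x)) for x in states)          # normalizing constant
P = {x: exp(-energy(x)) / Z for x in states}      # P(x) = exp(-energy) / Z

print(Z, P[(0, 0, 0)], P[(0, 1, 0)])
```

Enumerating all states is only feasible for very small graphs because the number of terms in Z grows exponentially with the number of variables.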

 

 

 

 

 

 

 

 

DIRECTED GRAPHICAL MODELS

 

 

 

 

 

 

 

 

 

 

 

MARKOV PROPERTIES

 

 

 

 

The future is independent of the past given the present
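
For a chain of variables indexed by time, one common formal reading of this statement is the following, where the symbols X_t are notation assumed here and not taken from the slide:

```latex
P(X_{t+1} \mid X_1, \ldots, X_t) = P(X_{t+1} \mid X_t)
```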

 

 

 

 

 

 

GLOBAL FACTORIZATION

 

 

 

 

 

 

 

 

P(X_1,…,X_n) = Π_i P(X_i | X_j : j parent of i)
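
The directed factorization says the joint probability is a product of one local conditional distribution per node, each conditioned on that node's parents. The sketch below assumes a hypothetical three-node binary network with edges X1 → X2, X1 → X3, and X2 → X3. The conditional probability values and the names `parents`, `cpt`, and `joint` are illustrative only.

```python
from itertools import product

# Hypothetical DAG: each node lists its parents.
parents = {"X1": [], "X2": ["X1"], "X3": ["X1", "X2"]}

# cpt[node][parent values] = P(node = 1 | parents); values are made up.
cpt = {
    "X1": {(): 0.3},
    "X2": {(0,): 0.2, (1,): 0.7},
    "X3": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9},
}


def joint(assign):
    """P(assignment) as the product of local conditionals, one per node."""
    p = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assign[q] for q in pa)]
        p *= p_one if assign[node] == 1 else 1.0 - p_one
    return p


nodes = list(parents)
total = sum(
    joint(dict(zip(nodes, vals))) for vals in product((0, 1), repeat=len(nodes))
)
print(total, joint({"X1": 1, "X2": 1, "X3": 1}))
```

Because every local table is itself normalized, the product sums to one over all assignments without any global normalizing constant.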

 

 

 

 

 

BELIEF PROPAGATION OR INFERENCE

 

 

 

 

Essentially a repeated application of Bayes' rule.
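
On a simple chain, this repeated use of Bayes' rule takes the form of forward filtering in a hidden Markov chain, which is a special case of belief propagation. The sketch below is a minimal illustration under assumed inputs: the two-state prior, transition matrix, emission matrix, and observation sequence are all invented, and `forward_filter` is a hypothetical name.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | obs_1..obs_t) for the final t.

    Alternates a prediction step (push the belief through the transition
    matrix) with a Bayes update on the new observation.
    """
    belief = list(prior)
    n = len(belief)
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: P(state_t | obs_1..obs_{t-1}).
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Bayes update: multiply by the likelihood of obs, then normalize.
        unnorm = [emission[j][obs] * belief[j] for j in range(n)]
        evidence = sum(unnorm)
        belief = [u / evidence for u in unnorm]
    return belief


# Hypothetical two-state model with binary observations.
prior = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]   # transition[i][j] = P(next=j | current=i)
emission = [[0.8, 0.2], [0.3, 0.7]]     # emission[j][o] = P(obs=o | state=j)
obs = [0, 1, 1]

belief = forward_filter(prior, transition, emission, obs)
print(belief)
```

The recursion costs time linear in the sequence length, whereas summing over all hidden paths directly would be exponential.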

 

 

 

 

 

 

 

 

RELATIONSHIP TO OTHER MODELS

 

 

 

 

 

 

 

 

 

 

APPLICATIONS