Max-Planck-Institut für Informatik

Elements of Statistical Learning 2 (WS 2009/10)


  • 2010-04-01 Time slots for discussions before the exam are Thursday, April 8 from 11 to 12 and Monday, April 12 from 16 to 17. The place is Room 533.
  • 2010-03-03 We have made a preliminary exam schedule available in the password protected area.
  • 2010-02-09 The exam on February 11 is canceled.
  • 2010-01-13 The tutorial on January 21 will take place in Room 023.
  • 2010-01-06 The tutorial on January 7 is moved to Thursday, January 14, Room 533 (rotunda on fifth floor).
  • 2009-11-30 The tutorial on December 10 will take place in Room 533 (rotunda on fifth floor).

General information

Lecturer Thomas Lengauer
Tutor Yassen Assenov
Language English

Time and location

Lecture Wednesday, 10:00 - 12:00, Campus E1 4 (MPI building), Room 024
First lecture will be held on October 14, 2009
Tutorial Thursday, 10:00 - 12:00, Campus E1 4 (MPI building), Room 024
First tutorial will be held on October 29, 2009
Office hours Thomas Lengauer: after each lecture
Yassen Assenov: Monday 09:00 - 11:00 or by appointment, Campus E1.4, Room 522


This course covers a subject that is relevant for computer scientists in general as well as for other scientists involved in data analysis and modeling. It is not limited to the field of computational biology.

The course will be the second part of a two-semester course on Statistical Learning. The first part (SS 2009) concentrated on chapters 1-5 and 7-10 of the book The Elements of Statistical Learning, Springer 2009. The second part will present the other chapters in the book, focusing on advanced topics in supervised and unsupervised learning, such as kernels, SVMs, neural networks, and clustering. The theoretical models will be illustrated with interesting applications, many of which are challenging problems in Bioinformatics. As in the previous semester, there will be two hours of lecture per week and one hour of tutorial (V2/Ü1); however, the tutorial will actually be two hours every second week.

Both parts of this lecture fulfill the requirements for the curricula of computer science and bioinformatics as an optional course with 5 credit points (Spezialvorlesung, 5 Leistungspunkte).


The course is targeted at advanced students in math, computer science and general science with a mathematical background. Students should know linear algebra and have basic knowledge of statistics.

Requirements for the course certificate

You need a cumulative 50% of the points in the homework assignments to be admitted to the oral exam. A score of 50% in the exam is then considered a passing grade.


Hastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer 2009. Course participants are encouraged to acquire this book. You can download it as a PDF file from the dedicated page on Robert Tibshirani's web site. More information on this book, as well as a contents listing, can be found on the Springer web site.


The tutorials focus on homework assignments. A very brief reiteration of parts of the lecture is also given. Homework assignments will cover theoretical proofs and programming exercises with roughly equal weight. The programming exercises will be in the form of two projects presented during the semester.

The programming language that we use is R - a language for statistical computing. It is freely available for Windows and Linux and - as a vectorized programming language - is ideally suited for the problems we will encounter. There are also many freely available packages (or libraries) to perform a variety of classification and regression tasks, or to visualize the results of statistical analyses in a convenient way.

Course material

Lecture slides and tutorial handouts are available in the password protected area.