max planck institut

informatik

- **2013-10-13** The 2nd exam for ESL1 will take place on Oct 31st. Exam slots will be announced shortly.
- **2013-07-24** The exam date for the first exam has been changed to Aug 1st and Aug 6th. Please finalize your exam date preference by July 30.
- **2013-07-15** Doodle poll for first exam date closes on July 30. Exam slots will be announced shortly after that.
- **2013-07-14** Assignment 06 is now due on July 24th 10:00.

| | |
|---|---|
| Lecturer | Thomas Lengauer |
| Teaching Assistant | Prabhav Kalaghatgi |
| Language | English |
| Lecture | Wednesday, 10:00 c.t. - 12:00, Campus E2.1 (CBI building), room 001. First lecture will be held on April 17, 2013 in E2.1, room 001. |
| Tutorial | Slot 1: Monday, 14:15 - 16:00, MPI-INF, room 023; Slot 2: Tuesday, 14:15 - 16:00, MPI-INF, room 023 |
| Office hours | Thomas Lengauer: after each lecture; Prabhav Kalaghatgi: by appointment, Campus E1.4 (MPI-INF building), room 526 |

To participate successfully, you need to register for the lecture in the LSF/HISPOS system of Saarland University. Registration opens as soon as the exam date has been entered into the system, which usually happens a few weeks into the semester.

Lecture slides, tutorial handouts, and problem sets are available in the password-protected area.

This course covers a subject that is relevant for computer scientists in general as well as for other scientists involved in data analysis and modeling. It is not limited to the field of computational biology.

The course is the first part of a two-semester course on Statistical Learning. The first part (SS 2013) will concentrate on chapters 1-5 and 7-10 of the book The Elements of Statistical Learning, Springer (second edition, 2009). In both semesters, there will be two hours of lecture per week and one hour of tutorial (V2/Ü1). The slot for the tutorial will be set after the first lecture; a two-hour tutorial every other week is also possible.

Both parts of this lecture fulfill the requirements for the curricula of computer science and bioinformatics as a special lecture (Spezialvorlesung, 5 credit points).

The course is aimed at advanced students in bioinformatics, computer science, mathematics, and the sciences in general who have a mathematical background. Students should know linear algebra and have basic knowledge of statistics.

You need at least 50% of the total points across all problem sets to be admitted to the oral exam.

Hastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer (second edition, 2009). Course participants are encouraged to acquire this book.

More information on this book, as well as a table of contents, can be found on the Springer web site.

A free PDF copy of the book can be obtained at http://www-stat.stanford.edu/~tibs/ElemStatLearn/

Additional literature can be found in the library; see the library reserve list for 'Elements of Statistical Learning 1'.

Please keep in mind that only the book by Hastie, Tibshirani and Friedman will be covered in the lecture.

Problem sets will cover theoretical proofs and programming exercises with roughly equal weight. In general, they are due on Wednesday before the lecture (10:00 sharp); further details regarding the assignments will be announced in the first lecture.

The programming language used in this course is R, a language for statistical computing. It is freely available for Windows, Linux and Mac. As a vectorized programming language, it is ideally suited for the problems we will encounter. There are also many freely available packages (or libraries) to perform a variety of classification and regression tasks, or to visualize the results of statistical analyses in a convenient way.

The tutorials focus on the problem sets. They also include a very brief recap of parts of the lecture.

- Refresh your knowledge of basic statistics. Basic linear algebra will also be useful.
- Familiarize yourself with the R programming language. You might find the following tutorials useful:
  - R for Beginners by Emmanuel Paradis. Chapters 1, 2, 3 and 6 are especially relevant for us.
  - An Introduction to R, the standard R introduction. This is a very detailed manual and therefore quite lengthy.