Max-Planck-Institut für Informatik

The Elements of Statistical Learning I (SS 2011)

News

  • 2011-07-14 The exams will take place in the rotunda on the 5th floor of the MPII.
  • 2011-07-11 The exam schedule is now available in the password protected area. Please report any problems immediately to Fabian Müller.
  • 2011-07-08 The date of the second exam is now fixed to October 27.
  • 2011-07-07 As preparation for the exam, there will be an additional tutorial on Thursday, July 14, 16-18h. Please submit questions that you want to discuss to Fabian Müller.
  • 2011-06-27 If you want to participate in the exam, please send an email to Fabian Müller containing your full name, student ID, preferred exam date (end of this semester/beginning of next semester) and preferred exam language (English/German) by Tuesday, July 6.
  • 2011-06-10 Due to the holiday, the next tutorials will be on Tuesday, June 28, 16-18h and 18-20h.
  • 2011-05-11 Beginning this week, there will be an additional alternative tutorial, Thursdays 18-20h.
  • 2011-04-27 Problem set 2 has been updated to fix a typo in problem 4(c).
  • 2011-04-27 The room for the lecture has changed to E2.1 room 001.
  • 2011-04-20 Due to numerous objections to the change in tutorial time, the tutorials will stay in the original time slot (Thursday 16:00-18:00).
  • 2011-04-15 Send an email to Fabian Müller if you have objections to changing the time of the tutorial to Thursday 18:00-20:00.
  • 2011-04-15 The room for the tutorial has changed to E2.1 room 007 (same as for the lecture).
  • 2011-04-06 Exam date fixed to 2011-07-21, 10:00-12:00 and 14:00-16:00.
  • 2011-04-06 The first tutorial will take place on 2011-04-14 in E2.1, room 1.06.
  • 2011-03-21 The first lecture will take place on 2011-04-13 in E2.1, room 001.

General information

Lecturer: Thomas Lengauer
Teaching Assistant: Fabian Müller
Language: English

Time and location

Lecture: Wednesday, 10:00 - 12:00, Campus E2.1 (CBI building), room 001
  The first lecture will be held on April 13, 2011 in E2.1, room 001
Tutorial: Thursday (biweekly), 16:00 - 18:00, Campus E2.1 (CBI building), room 007
  Thursday (biweekly), 18:00 - 20:00, Campus E2.1 (CBI building), room 007
  The first tutorial will be held on April 14, 2011
Office hours: Thomas Lengauer: after each lecture
  Fabian Müller: Tuesday 17:00-18:00 or by appointment, Campus E1.4 (MPI building), room 509

Course material

Lecture slides, tutorial handouts and problem sets are available in the password-protected area.

Overview

This course covers a subject that is relevant for computer scientists in general as well as for other scientists involved in data analysis and modeling. It is not limited to the field of computational biology.

The course is the first part of a two-semester course on Statistical Learning. The first part (SS 2011) will concentrate on chapters 1-5 and 7-10 of the book The Elements of Statistical Learning, Springer (second edition, 2009). In both semesters, the course is scheduled as two hours of lecture and one hour of tutorial per week (V2/Ü1). In practice, the tutorial will be held for two hours every other week.

Both parts of this lecture fulfill the requirements of the computer science and bioinformatics curricula as a special lecture (Spezialvorlesung, 5 credit points).

Prerequisites

The course is targeted at advanced students in bioinformatics, computer science, mathematics and the general sciences who have a mathematical background. Students should know linear algebra and have basic knowledge of statistics.

Requirements for the course certificate

You need a cumulative 50% of the points in the problem sets to be admitted to the oral exam. A score of 50% in the exam is then considered a passing grade.

Literature

Hastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer (second edition, 2009). Participants of the course are encouraged to acquire this book.
More information on this book, as well as a contents listing, can be found on the Springer web site.

Problem Sets

Problem sets will cover theoretical proofs and programming exercises with roughly equal weight. They are due every other week on Wednesday before the lecture (10:00 sharp) and can be submitted in groups of two students. The first assignment will be handed out after the first lecture.

The programming language used is R, a language for statistical computing. It is freely available for Windows, Linux and Mac. As a vectorized programming language, it is ideally suited to the problems we will encounter. There are also many freely available packages (or libraries) for performing a variety of classification and regression tasks, or for visualizing the results of statistical analyses in a convenient way.

Tutorials

The tutorials focus on the problem sets. A very brief recap of parts of the lecture is also given.

What can I do to prepare for the lecture?