CS 2750 Machine Learning (ISSP 2170)


Time:  Monday, Wednesday 2:00-3:20pm
Location: Sennott Square, Room 5313


Instructor:  Milos Hauskrecht
Computer Science Department
5329 Sennott Square
phone: x4-8845
e-mail: milos@cs.pitt.edu
office hours: by appointment


TA:  Ali Alanjawi
Computer Science Department
5404 Sennott Square
phone: x4-1185
e-mail: alanjawi@cs.pitt.edu
office hours: Tuesday, Thursday: 1pm - 4pm


Announcements !!!!!

Quiz:

Final projects:

!!! CS 3750 Advanced Topics in Machine Learning (ISSP 3535) !!!

Additional readings for the course including topics still to be covered



Links

Course description
Lectures
Homeworks
Term projects
Matlab



Abstract

The goal of the field of machine learning is to build computer systems that learn from experience and are capable of adapting to their environments. Learning techniques and methods developed by researchers in this field have been successfully applied to a variety of learning tasks in a broad range of areas, including text classification, gene discovery, financial forecasting, credit card fraud detection, collaborative filtering, and the design of adaptive web agents, among others.

This introductory machine learning course will give an overview of many models and algorithms used in modern machine learning, including linear models, multi-layer neural networks, support vector machines, density estimation methods, Bayesian belief networks, mixture models, clustering, ensemble methods, and reinforcement learning. The course will give the student the basic ideas and intuition behind these methods, as well as a more formal understanding of how and why they work. Students will have an opportunity to experiment with machine learning techniques and apply them to a selected problem in the context of a term project.
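As a small taste of the methods listed above, here is a minimal sketch of fitting a linear model by least squares. The data, parameters, and variable names are all made up for illustration, and the example is written in Python/NumPy for convenience; course assignments are done in Matlab.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 2*x + 1 plus Gaussian noise (illustrative parameters).
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=100)

# Design matrix with a bias column; solve the least-squares problem.
X = np.column_stack([x, np.ones_like(x)])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w)  # fitted slope and intercept, close to 2.0 and 1.0
```

The same one-liner spirit carries over to Matlab, where the backslash operator (`w = X \ y`) solves the identical least-squares problem.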

Prerequisites

Knowledge of matrices and linear algebra (CS 0280), probability (CS 1151), statistics (CS 1000), programming (CS 1501) or equivalent, or the permission of the instructor.



Textbook:

Recommended book:

Other books we may use:

Lectures
 
 
Lectures  Topic(s)  Assignments
January 6 Course administration
January 8 Introduction.

Readings:

January 13 Designing a learning system.

Readings:

  • DHS textbook: Chapter 1.
  • Data preprocessing. Chapter 3 in Han, Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2001.
  • Optimization. Chapter 6 in Michael Heath. Scientific Computing, McGraw Hill, 1997.
  • Statistical tests: Z-test, T-test, and Chi-Square test for variance. In D. Sheskin. Handbook of Parametric and Nonparametric Statistical Procedures. CRC Press, 1997.
January 15 Matlab tutorial.

Running Matlab at the University of Pittsburgh.

January 22 Evaluation of predictors

Readings:

  • Review of confidence intervals, z-test & t-test for a directional hypothesis, using your favorite statistics textbook or the earlier handout
Homework 1
(Data files for HW-1)
January 27 Density estimation

Readings:

  • DHS book: Chapter 3. Sections 3.1-3.5.
January 29 Density estimation II

Readings:

  • DHS book: Chapter 3. Sections 3.1-3.6.
Homework 2
(Data files for HW-2)
February 3 Linear regression

Readings:

  • HTF book: Chapter 3
February 5 Linear regression (cont).
Classification with linear models.

Readings:

  • HTF book: Chapter 3
  • DHS book: Chapter 5 (5.1-5.4).
Homework 3
(Data files for HW-3)
February 10 Classification with linear models.

Readings:

February 12 Multiway classification.
Bayesian decision theory.

Readings:

  • DHS book: Chapter 2.
Homework 4
(Data files for HW-4)
February 17 Class cancelled due to snowstorm.
February 19 Multilayer neural networks

Readings:

  • DHS book: Chapter 6
Homework 5
(Data files for HW-5)
February 24 Support vector machines

Readings:

February 26 Bayesian belief networks

Readings:

Homework 6
(Data files for HW-6)
March 10 Bayesian belief networks. Inference

Readings:

March 12 Learning Bayesian belief networks.

Readings:

March 17 Midterm.
March 19 Density estimation with hidden variables. EM.

Readings:

Homework 7
(Data files for HW-7)
March 24 Project proposals.
March 26 Expectation maximization.
A naive Bayes model with hidden class and missing values. Mixture of Gaussians.

Readings:

  • DHS: Chapter 10.1-10.4.
Homework 8
(Data files for HW-8)
March 31 Clustering. Non-parametric density estimation.

Readings:

  • DHS: Chapter 10.5.-10.10. & Chapter 4
April 2 Dimensionality reduction.

Readings:

  • DHS: Chapter 3.7-3.8., Chapter 10.13.
April 7 Decision trees.

Readings:

  • DHS: Chapter 8 (8.1.-8.5.)
April 9 Ensemble methods. Mixture of experts. Bagging.

Readings:

April 14 Ensemble methods. Boosting.

Readings:

April 16 Reinforcement learning.

Readings:

April 21 Quiz.
Reinforcement Learning (cont.)
April 23 Term projects: reports due at 2pm. No class.
April 25 Term projects: presentations.



Homeworks

The homework assignments will mostly have the character of projects and will require you to implement some of the learning algorithms covered in lectures. Programming assignments will be implemented in Matlab. See the rules for the submission of programs.

The assignments (both written and programming parts) are due at the beginning of the class on the day specified on the assignment. In general, no extensions will be granted.

Collaborations: You may discuss material with your fellow students, but the report and programs should be written individually.
 



Term projects

The term project is due at the end of the semester and accounts for a significant portion of your grade. You can choose your own problem topic. You will be asked to write a short proposal for the purpose of approval and feedback. The project must have a distinctive and non-trivial learning or adaptive component. In general, a project may consist of a replication of previously published results, the design and testing of new learning methods, or an application of machine learning to a domain or problem of your interest.



Matlab

Matlab is a mathematical tool for numerical computation and manipulation, with excellent graphing capabilities. It provides a great deal of support for the things you will need to run machine learning experiments. Upitt has a number of Matlab licenses running on both Unix and Windows platforms. Click here to find out how to access Matlab at Upitt.
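A typical experiment in this course works with whole arrays at once rather than element-by-element loops. The sketch below shows this vectorized style on a density-estimation task from the syllabus (maximum-likelihood estimates of a Gaussian's mean and variance); it is written in Python/NumPy, whose array semantics closely mirror Matlab's, and all data here are synthetic and illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic sample from a Gaussian with mean 3.0 and std 2.0 (variance 4.0).
data = rng.normal(loc=3.0, scale=2.0, size=10_000)

# Maximum-likelihood estimates, computed with vectorized array operations.
mu_hat = data.mean()                      # ML estimate of the mean
var_hat = ((data - mu_hat) ** 2).mean()   # ML (biased) estimate of the variance

print(mu_hat, var_hat)  # close to 3.0 and 4.0
```

In Matlab the analogous code is equally short (`mu_hat = mean(data); var_hat = mean((data - mu_hat).^2);`), which is why vectorized tools are well suited to the course's programming assignments.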

Matlab tutorial files from 01/15/03.

Other Matlab resources on the web:

Online MATLAB  documentation
Online Mathworks documentation including MATLAB toolboxes


Course webpage from Spring 2002



Last updated by Milos on 11/18/2002