CS 3750: Advanced Topics in Machine Learning (ISSP 3535)

Time: Tuesday, Thursday 4:00pm-5:15pm
Location: Sennott Square, Room 5313

Instructor: Milos Hauskrecht
Computer Science Department
5329 Sennott Square
phone: x4-8845
e-mail: milos_at_cs_pitt_edu
office hours: by appointment

Announcements !!!!!

The class presentations for your final projects will be held on Thursday, December 11 from 4:00-5:30pm. The presentations should be 6 minutes long and highlight the problem, the data, the methods used to address the problem, the results, and the conclusions. Slides in PDF format should be submitted by email by 3:00pm on December 11, 2014.
Link to all presentation slides from December 12, 2014.
The project reports for the class are due at noon on Friday, December 12, 2014. Please submit your report electronically by emailing it to milos@pitt.edu. A typical report should include introduction, background, methodology, experiments, discussion of results, and conclusions/future work sections. References to existing work should be included. The final reports should be self-explanatory: the main ideas and methods should be clearly communicated in the report.
If you are interested, you may want to peruse the slides and reading material for the CS 3750 Machine Learning course offered in Fall 2011.
Topics to be covered next and tentative schedule (Readings can be found here)
- 09/23. PCA and SVD (Eric Strobl)
- 09/25. Applications of SVD (Daniel Steinberg)
- 09/30. Probabilistic Latent Semantic Analysis (pLSA) (Lingjia Deng)
- 10/02. Latent Dirichlet Allocation (LDA) (Mahdi Pakdaman)
- 10/07. Probabilistic PCA and extensions (Milos)
- 10/09. Probabilistic models of time-series and sequences (Zitao Liu)
- 10/16. Conditional Random Fields (CRF) (Patrick Luo)
- 10/21. Latent component analysis and variational methods (Milos)
- 10/23. Laplacian Eigenmaps for dimensionality reduction (Daniel Steinberg)
- 10/28. Spectral clustering (Salim Malakouti)
- 10/30. Label propagation on graphs. Semi-supervised learning (ChangSheng Liu)
- 11/04. Metric, kernel learning (Eric Heim)
- 11/06. Active learning (Nils Murrugara Llerena)
- 11/11. Multilabel learning (Charmgil Hong)
- 11/13. Transfer learning (Jaromir Savelka)
- 11/18. Learning from multiple annotators (Gaurav Trivedi)
- 11/20. One-shot, zero-shot learning (Jeya Balaji Balasubramanian)
- 11/25. Deep learning (Yoonjung Choi)
- 12/02. Anomaly detection (Yanbing Xue)
- 12/04. Compressed sensing (Ka Wai Yung)

Abstract

The goal of the field of machine learning is to build computer systems that learn from experience and that are capable of adapting to their environments. Learning techniques and methods developed by researchers in this field have been successfully applied to a variety of learning tasks in a broad range of areas, including, for example, text classification, gene discovery, financial forecasting, credit card fraud detection, collaborative filtering, and the design of adaptive web agents.

The objective of the Advanced Machine Learning course is to expand on the material covered in the introductory Machine Learning course (CS2750) and to focus on special topics in ML, such as latent variable and dimensionality reduction models, active, transfer, and multidimensional learning, learning with multiple annotators, and outlier detection. The course will consist of a mix of lectures, presentations and discussions. Students will be evaluated based on their participation in discussions, presentations and projects.

Prerequisites

CS 2750 Machine Learning, or the permission of the instructor.

Readings:

We will use readings from:

Chris Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

In addition, we will use conference and journal papers that will be distributed electronically or in hard copy.

Other (useful) books

R.O. Duda, P.E. Hart, D.G. Stork. Pattern Classification. Second edition. John Wiley and Sons, 2000.
J. Han, M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2001.
T. Mitchell. Machine Learning. McGraw-Hill, 1997.
B. Schölkopf and A. Smola. Learning with Kernels. MIT Press, 2002.

Lectures


Lectures	Topic(s)
August 26	Course Administration Readings: Basic concepts in probability, algebra, math Introduction to matrices Learning from Data: Mathematics Primer CS2750 Lecture notes
August 28	Review of CS 2750 material Readings: CS2750 Lecture notes
September 2	Markov Random Fields (MRFs) Readings: Bishop. Pattern Recognition and Machine Learning. Chapter 8.
September 4	Markov Random Fields (MRFs) II: inference, variable elimination, belief propagation Readings: Bishop. Pattern Recognition and Machine Learning. Chapter 8.
September 9	Markov Random Fields (MRFs) III: inference, learning Readings: Z. Ghahramani. Learning MRFs Jirousek, Kushmerick. Constructing probabilistic models Lecture notes on Learning MRFs by Sam Roweis
September 11	Markov Random Fields (MRFs) IV: learning Readings: Bishop. Pattern Recognition and Machine Learning. Chapter 20. Z. Ghahramani. Learning MRFs Jirousek, Kushmerick. Constructing probabilistic models Lecture notes on Learning MRFs by Sam Roweis
September 16	Monte Carlo methods Readings: David MacKay. Introduction to Monte Carlo methods: forward, rejection sampling, importance sampling Andrieu et al. An introduction to MCMC for Machine Learning. Machine Learning, vol. 50, pp.5-43, 2003.
September 18	Monte Carlo methods: MCMC Readings: David MacKay. Introduction to Monte Carlo methods. Andrieu et al. An introduction to MCMC for Machine Learning. Machine Learning, vol. 50, pp.5-43, 2003.
September 23	PCA and SVD (Eric Strobl) Readings: Lecture notes for CS2750 Chris Bishop. Chapter 12.1. Tutorial on PCA and SVD Tutorial on PCA Other resources: A book chapter on SVD
September 25	Applications of SVD: Latent semantic analysis, Link analysis (Daniel Steinberg) Readings: Applications of PCA: Information Retrieval. Michael W. Berry, Zlatko Drmac, Elizabeth R. Jessup. Matrices, Vector Spaces, and Information Retrieval, SIAM Review, 1999. Link analysis. Borodin et al. Finding Authorities and Hubs From Link Structures on the World Wide Web Jon M. Kleinberg. Authoritative Sources in a Hyperlinked Environment, Journal of ACM. 1999. Sergey Brin, Lawrence Page The Anatomy of a Large-Scale Hypertextual Web Search Engine A. Ng. Stable Algorithms for Link Analysis
September 30	Latent Variable Models for text analysis, information retrieval and link analysis: pLSA (Lingjia Deng) Readings: Probabilistic latent semantic analysis (pLSA) Thomas Hofmann. Probabilistic Latent Semantic Analysis. UAI-99, 1999. Thomas Hofmann. Probabilistic Latent Semantic Indexing. SIGIR-99, 1999. pLSA for link analysis David Cohn and Huan Chang. Learning to probabilistically identify Authoritative documents David Cohn and Thomas Hofmann. The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity
October 2	Latent Dirichlet Allocation (LDA) (Mahdi Pakdaman) Readings: Latent Dirichlet Allocation David M. Blei, Andrew Y. Ng, Michael I. Jordan. Latent Dirichlet Allocation. JMLR, 2003.
October 7	Probabilistic PCA, extensions (Milos) Readings: EM algorithms for probabilistic PCA Michael E. Tipping, Chris M. Bishop. Probabilistic Principal Component Analysis, 1999. (TR in 1997) Sam Roweis. EM Algorithms for PCA and SPCA. NIPS-1998. Tipping and Bishop 1998 Mixtures of Probabilistic Principal Component Analysers Extensions of probabilistic PCA Wray Buntine and Sami Perttu. Is multinomial PCA multi-faceted clustering or dimensionality reduction? AI in statistics, 2003. Michael Collins, Sanjoy Dasgupta, Robert E. Schapire. A Generalization of Principal Component Analysis to the Exponential Family
October 9	Probabilistic models of time-series and sequences (Zitao) Readings: Bishop. Chapter 13.
October 16	Conditional Random Fields (Patrick Luo) Readings: Charles Sutton and Andrew McCallum. An Introduction to Conditional Random Fields
October 21	Latent component analysis and variational methods. (Milos) Readings: Variational methods: Basics Jordan et al An Introduction to Variational Methods for Graphical Models , 1999. Z. Ghahramani. Variational methods (lecture notes) Tommi Jaakkola Tutorial on variational approximation methods Variational ML learning for component analysis Z. Ghahramani. Factorial Learning and the EM Algorithm 1996. X. Lu, M. Hauskrecht, R.S. Day. Variational Bayesian learning of the cooperative vector quantizer model. Part I: The Theory. Technical Report, Center for Biomedical Informatics, CBMI-02-181, 2002. T. Singliar and M. Hauskrecht. Noisy-or Component Analysis and its Application to Link Analysis. Journal of Machine Learning Research, 2006.
October 23	Laplacian Eigenmaps for dimensionality reduction (Daniel Steinberg) Readings: Tenenbaum, De Silva, Langford. A Global Geometric Framework for Nonlinear Dimensionality Reduction, 2000. Roweis, Saul. Locally Linear Embedding, 2000. Belkin, Niyogi. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, 2002.
October 28	Spectral clustering (Salim Malakouti) Readings: von Luxburg. Tutorial on Spectral Clustering. C. Ding. Spectral Clustering Tutorial, presented at NIPS 2004. Andrew Y. Ng, Michael I. Jordan, Yair Weiss. On Spectral Clustering: Analysis and an algorithm.
October 30	Label propagation on graphs. Semi-supervised learning (ChangSheng Liu) Readings: Zhu, Ghahramani. Learning from labeled and unlabeled data. Zhu, Ghahramani, Lafferty. Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions. Zhou et al. Learning with Local and Global Consistency. Zhu et al. Semi-supervised learning.
November 4	Metric, kernel learning (Eric Heim) Readings: B. Kulis. Metric learning: A Survey. S. Agarwal et al. Generalized Nonmetric Multidimensional Scaling. M. Varma, RB. Bodla. More generality in efficient multiple kernel learning.
November 6	Active Learning (Nils Murrugara Llerena) Readings: B. Settles Active learning literature survey
November 11	Multilabel learning (Charmgil Hong) Readings: Tsoumakas, G., Katakis, I., Vlahavas, I. Mining Multi-label Data Min-Ling Zhang; Zhi-Hua Zhou A Review on Multi-Label Learning Algorithms
November 13	Transfer learning (Jaromir Savelka) Readings: SJ. Pan, Q. Yang A Survey on Transfer Learning, 2009 T. Evgeniou, M. Pontil Regularized Multi–Task Learning, 2004.
November 18	Learning from multiple annotators (Gaurav Trivedi) Readings: Q. Nguyen. A short review of learning with multiple annotators (Section from Q. Nguyen's thesis proposal) Welinder, Perona. Online crowdsourcing: rating annotators and obtaining cost-effective labels, 2010. Raykar et al. Learning From Crowds, 2010. H. Valizadegan, Q. Nguyen, M. Hauskrecht. Learning from multiple experts, 2013.
November 20	Zero and one shot learning (Jeya Balaji Balasubramanian) Readings: Fei-Fei, Fergus, Perona. One-Shot Learning of Object Categories, 2006. Palatucci, Pomerleau, Hinton, Mitchell. Zero-Shot Learning with Semantic Output Codes, 2011.
November 25	Deep learning, representation learning (Yoonjung Choi) Readings: G. Hinton. Learning multiple layers of representation, 2007. Y. Bengio. Learning Deep Architectures for AI Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representation Learning: A Review and New Perspectives, 2012. Videolectures: G. Hinton. Recent Developments in Deep Learning A. Ng. Deep Learning, Self-Taught Learning and Unsupervised Feature Learning Other readings: Li Deng and Dong Yu. Deep Learning Methods and Applications, 2014
December 2	Outlier detection (Yanbing Xue) Readings: V. Chandola, A. Banerjee, and V. Kumar. Anomaly Detection: A Survey, 2009
December 4	Compressed sensing (Ka Wai Yung, Huichao Xu) Readings: Fill in the Blanks: Using Math to Turn Lo-Res Datasets Into Hi-Res Samples Candes, Wakin. An Introduction To Compressive Sampling Emmanuel Candes, Justin Romberg, Terence Tao. Stable Signal Recovery from Incomplete and Inaccurate Measurements Romberg, Wakin. Tutorial on Compressed Sensing Videolectures: An Overview of Compressed Sensing by Emmanuel Candes, 2009. Other: Candes, Recht. Exact Matrix Completion via Convex Optimization Matrix Completion via Convex Optimization: Theory and Algorithms by Emmanuel Candes, 2009. Low rank modeling lecture by Emmanuel Candes, 2011

Course webpage for CS2750, the introductory Machine Learning course from Spring 2014. It is the prerequisite of CS3750.

Readings

Readings will be assigned before the class at which the discussion on the topic takes place. Most of the readings will be electronic; however, some readings will be in paper form or from the books. See a summary list of Readings for different topics.

Paper discussions

Every student is expected to be in charge of at least one topic, present it, and lead the discussion on the topic during the class. The readings for each topic will be distributed electronically. The assignment of the papers will be discussed during the first two weeks of the course.

Projects

There are no homework assignments in this course. However, students will be asked to prepare, submit and present two projects. The first project will be assigned and due in the middle of the semester. The final project (due at the end of the semester) is more flexible: a student can choose from a set of topics/problems or propose his/her own topic to investigate. If you plan to propose your own project/topic, you will need to submit a short (one page) proposal for the purpose of approval and feedback. In general, the final project must have a distinctive and non-trivial learning or adaptive component.

Last updated by milos on 08/26/2014