CPSC 633: Machine Learning (Spring 2012)

Professor: Dr. Thomas R. Ioerger
Office: 322C HRBB
Phone: 845-0161
Email: ioerger@cs.tamu.edu
Office hours: Wed, 3:00-4:00, or by appointment

Class Time: Tues/Thurs, 9:35-10:50
Room: 113 H.R. Bright Building
Course WWW page: http://www.cs.tamu.edu/faculty/ioerger/cs633-spr12/index.html
Textbook: Introduction to Machine Learning. (2010). Ethem Alpaydin. MIT Press. 2nd edition.

Teaching Assistant: Jaewook Yoo, jwookyoo@neo.tamu.edu
Office hours: Tues/Thurs, 5-6pm, HRBB 344


Goals of the Course:

Machine learning is an important sub-area within AI, and is broadly applicable to many application areas within Computer Science. Machine learning can be viewed as a set of methods for making systems adaptive (improving performance with experience), or alternatively, for augmenting the intelligence of knowledge-based systems via rule acquisition. In this course, we will examine and compare several different abstract models of learning, from hypothesis-space search, to function approximation (such as by gradient descent), to statistical inference (e.g. Bayesian), to the minimum description-length principle. Both theoretical issues (e.g. algorithmic complexity, hypothesis space bias) and practical issues (e.g. feature selection, dealing with noise, and preventing overfitting) will be covered.

Topics to be Covered:


Prerequisites

CPSC 420/625 - Introduction to Artificial Intelligence

We will be relying on standard concepts in AI, especially heuristic search algorithms, propositional logic, and first-order predicate calculus. Either the graduate or undergraduate AI class (or a similar course at another university) will count as satisfying this prerequisite.

In addition, the course will require some background in analysis of algorithms (big-O notation), and some familiarity with probability and statistics (e.g. standard deviation, confidence intervals, linear regression, Binomial distribution).

Projects and Exams

There will probably be only a few homeworks and 2 exams. However, the main work for the class will consist of several programming projects in which you will implement and test your own versions of several learning algorithms. These will not be group projects, so you will be expected to do your own work. Several databases will be provided for testing your algorithms (e.g. for accuracy). A written report describing your implementation and results will be required for each project.

Your grade at the end of the course will be based on a weighted average of points accumulated during the semester. The weights will be distributed approximately as 40% exams, 50% projects, 10% other (homeworks, quizzes, participation in class discussions), but this might be adjusted slightly to reflect relative effort of each. The maximum cutoff for an A will be 90%, 80% for B, and 70% for C.
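As a rough illustration of the weighted average described above, here is a minimal Python sketch. The function names and the example scores are hypothetical, the weights are the approximate ones stated above (they may be adjusted), and the cutoffs used are the maximum cutoffs, so actual cutoffs could be lower. No cutoff below C is stated, so the sketch simply reports "below C".

```python
# Approximate category weights from the syllabus, in percent (assumed fixed here).
WEIGHTS = {"exams": 40, "projects": 50, "other": 10}

def course_percentage(scores):
    """Weighted average; scores maps each category to a percentage in [0, 100]."""
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS) / 100

def letter_grade(pct):
    """Letter grade using the maximum cutoffs (90/80/70); real cutoffs may be lower."""
    if pct >= 90:
        return "A"
    if pct >= 80:
        return "B"
    if pct >= 70:
        return "C"
    return "below C"  # the syllabus does not state cutoffs below C

if __name__ == "__main__":
    pct = course_percentage({"exams": 80, "projects": 90, "other": 100})
    print(pct, letter_grade(pct))
```

For example, 80% on exams, 90% on projects, and 100% on the remaining work gives 0.4*80 + 0.5*90 + 0.1*100 = 87%, which falls in the B range under these cutoffs.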

The late-assignment policy for homeworks and projects will be incremental: -5% per day, up to a maximum penalty of -50%. A project turned in at any time before the end of the semester can still earn up to 50% (minus points marked off).
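One way to read the late policy is sketched below in Python. This is an interpretation, not an official formula: it assumes the penalty is subtracted as percentage points from the graded score (out of 100), that lateness is counted in whole days, and that a score cannot go below zero. The function name is hypothetical.

```python
def late_adjusted_score(raw_score, days_late):
    """Apply the late penalty to a graded score out of 100.

    Assumptions (not stated in the syllabus): the penalty is in percentage
    points, days_late is a whole number of days, and the result is floored at 0.
    """
    penalty = min(5 * max(days_late, 0), 50)  # 5 points per day, capped at 50
    return max(raw_score - penalty, 0)

if __name__ == "__main__":
    print(late_adjusted_score(85, 3))   # 3 days late
    print(late_adjusted_score(85, 20))  # penalty capped at 50 points
```

Under these assumptions, a project graded at 85 and turned in 3 days late would receive 70, and the same project turned in 20 days late would receive 35, since the penalty stops growing after 10 days.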


Schedule

Tues, Jan 17: first day of class; Overview
Thurs, Jan 19: perspectives on machine learning; core concepts; terminology Ch. 1-3 notes
Tues, Jan 24:
Thurs, Jan 26: Decision Trees Ch. 9; notes
Tues, Jan 31: pruning notes, (Mingers, 1989)
Thurs, Feb 2: rule induction; Evaluating Classifiers Ch. 19, notes
Tues, Feb 7:
Thurs, Feb 9: Perceptrons Ch. 11; notes
Tues, Feb 14: Backprop notes
Thurs, Feb 16:
Tues, Feb 21: Nearest Neighbor Ch. 8, notes; Project #1 due
Thurs, Feb 23: feature weighting/selection read Sec 6.3 (on PCA)
Tues, Feb 28: Bayesian classification, Naive Bayes Ch. 4; notes
Thurs, Mar 1: regression; LDA; model selection Ch. 5, notes (slides 12-23)
Tues, Mar 6:
Thurs, Mar 8: Mid-term Exam
Mar 12-16: Spring Break
Tues, Mar 20: Support Vector Machines Ch. 13; notes
Thurs, Mar 22: kernel trick; Project #2 due
Tues, Mar 27: Clustering Sec 7.3, 7.7, 7.8; notes
Thurs, Mar 29: multi-dimensional scaling (6.5); association rules (3.6)
Tues, Apr 3: Ensemble classifiers; bagging; weighted majority Ch. 17 (skip 17.5) notes
Thurs, Apr 5: boosting, stacking, RBFs (12.3, 12.8)
Tues, Apr 10: Bayesian Inference Ch. 14, notes; Project #3 due
Thurs, Apr 12:
Tues, Apr 17: Expectation Maximization Sec 7.4; notes
Thurs, Apr 19: Sampling notes
Tues, Apr 24: Hidden Markov Models Ch. 15, notes
Thurs, Apr 26: last day of class; Project #4 due
Fri, May 4: Final Exam, 7:30-9:30am


Academic Integrity Statement and Policy

Aggie Code of Honor: An Aggie does not lie, cheat or steal, or tolerate those who do.
see: Honor Council Rules and Procedures


Americans with Disabilities Act (ADA) Policy Statement

The Americans with Disabilities Act (ADA) is a federal anti-discrimination statute that provides comprehensive civil rights protection for persons with disabilities. Among other things, this legislation requires that all students with disabilities be guaranteed a learning environment that provides for reasonable accommodation of their disabilities. If you believe you have a disability requiring an accommodation, please contact Disability Services, in Cain Hall, Room B118, or call 845-1637. For additional information visit http://disability.tamu.edu.

