CSL 864 - Special Topics in AI: Classification
This is an introductory course on machine learning focusing on
classification. The course has three major objectives. First, to
familiarize students with basic classification methods so that these
can be used as tools to tackle practical machine learning
problems. Second, to equip students with the mathematical skills
needed to theoretically analyze these methods and to modify and extend
them to tackle new problems. Finally, to introduce students to basic
optimization techniques so that they can start writing their own code
for these methods.
- Lecture 1: Introduction to supervised learning, overfitting,
probability theory and decision theory.
- Lecture 2: Generative Methods and Naïve Bayes.
- Lecture 3: Toy example.
- Lecture 4: Discriminative Methods and Logistic
Regression. Equivalence to Naïve Bayes.
- Lecture 5: Logistic Regression optimization and
extensions (a minimal sketch follows this list).
- Lecture 6: Support Vector Machines.
- Lecture 7: SVMs continued - Kernels.
- Lecture 8: Multi-Class SVMs. Digression on VC dimension.
- Lecture 9: SVM optimization.
- Lecture 10: Kernel learning and optimization.
- Lecture 11: Boosting.
- Lecture 12: Boosting continued.
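As a flavour of the discriminative methods covered in Lectures 4 and 5, here is a minimal MATLAB sketch of regularized logistic regression trained by plain gradient descent. This is not the distributed demo code: the toy data, learning rate and regularization strength below are made-up placeholders chosen only for illustration.

    % Minimal sketch: L2-regularized binary logistic regression trained by
    % plain gradient descent on made-up 2D toy data (not the course demos).
    rng(0);                                  % reproducible toy data
    n = 100;
    X = [randn(n,2) + 1; randn(n,2) - 1];    % two Gaussian clusters
    y = [ones(n,1); zeros(n,1)];             % labels in {0,1}
    X = [X, ones(2*n,1)];                    % append a bias feature

    w      = zeros(3,1);                     % weights (including bias)
    eta    = 0.1;                            % learning rate (placeholder)
    lambda = 0.01;                           % regularization strength (placeholder)
    for iter = 1:500
        p    = 1 ./ (1 + exp(-X*w));         % predicted probabilities
        grad = X' * (p - y) / (2*n) + lambda * w;
        w    = w - eta * grad;               % gradient descent step
    end

    acc = mean(((1 ./ (1 + exp(-X*w))) > 0.5) == y);
    fprintf('Training accuracy: %.2f\n', acc);

The actual demos rely on a toolbox solver for the optimization (see the fmincon.LBFGS note under Code below); the sketch above is only meant to illustrate the kind of objective being minimized.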
Slides
Code
Most code was written five minutes before the start of each lecture
and comes with no guarantees, comments or documentation. In
particular, no attempt has been made to bulletproof the code: for
instance, if there is no feasible solution for your parameter settings,
the figure-plotting subroutines will crash (the LR and SVM learning
routines should be stable). In any case, run the code at your own
peril.
- Some common MATLAB tools needed for the demos (the Optimization
Toolbox ver 4.2 or higher is needed for fmincon.LBFGS, which is used
by the Logistic Regression demo)
- MATLAB code for demos
- Naïve Bayes (see the sketch after this list)
- Regularized Logistic Regression (code is kernelized and
displays dual weights)
- Naïve Bayes vs Logistic Regression
- Multi-class Logistic Regression (Multinomial, 1-vs-All,
1-vs-1 DAG, 1-vs-1 majority vote)
- Linear SVMs vs Logistic Regression
- Non-linear SVMs
- Multi-class SVMs (multi-class hinge loss, Multinomial Logistic Regression, 1-vs-All SVM, 1-vs-1 DAG SVM, 1-vs-1 majority vote SVM)
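As a rough illustration of what the first demo above computes, here is a minimal Gaussian Naïve Bayes sketch with per-class diagonal covariances. It is not the distributed demo code: the toy data and the test point are made-up placeholders, and the real demos do considerably more (e.g. the figure plotting mentioned above).

    % Minimal sketch: Gaussian Naive Bayes with per-class diagonal
    % covariances on made-up toy data (not the distributed demo code).
    rng(1);
    Xtrain = [randn(50,2) + 2; randn(50,2) - 2];   % two Gaussian classes
    ytrain = [ones(50,1); 2*ones(50,1)];           % class labels 1 and 2

    classes = unique(ytrain);
    K       = numel(classes);
    prior   = zeros(K,1); mu = zeros(K,2); sig2 = zeros(K,2);
    for k = 1:K
        idx       = (ytrain == classes(k));
        prior(k)  = mean(idx);                     % class prior P(y = k)
        mu(k,:)   = mean(Xtrain(idx,:), 1);        % per-feature means
        sig2(k,:) = var(Xtrain(idx,:), 0, 1);      % per-feature variances
    end

    % Classify a made-up test point by the largest log posterior
    % log P(y = k) + sum_d log N(x_d | mu_kd, sig2_kd).
    xtest   = [1.5, 1.0];
    logpost = zeros(K,1);
    for k = 1:K
        loglik     = sum(-0.5*log(2*pi*sig2(k,:)) ...
                         - (xtest - mu(k,:)).^2 ./ (2*sig2(k,:)));
        logpost(k) = log(prior(k)) + loglik;
    end
    [~, khat] = max(logpost);
    fprintf('Predicted class: %d\n', classes(khat));

The "Naïve Bayes vs Logistic Regression" demo presumably contrasts a generative fit of this kind with a discriminative fit like the logistic regression sketch given earlier.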
Links to code by other people can be found in the slides.
Recommended Reading
- D. Bertsekas. Nonlinear Programming. Athena Scientific, 1999.
- C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
- S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
- N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.
- R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley and Sons, second edition, 2001.
- T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer, second edition, 2009.
- T. Mitchell. Machine Learning. McGraw Hill, 1997.
- B. Schölkopf and A. Smola. Learning with Kernels. MIT Press, 2002.
Please see the slides for links to relevant research papers.