COV 878 - Extreme Classification

Instructor	Manik Varma
Co-ordinator	Parag Singla
Teaching Assistant	Kunal Dahiya
Credits	1
Classroom	LHC 416
Timings	2:00 - 3:30 PM on Tuesdays and Fridays
Kaggle competition	Dataset 1 Dataset 2 Dataset 3
Mailing list	Piazza

Extreme classification is a rapidly growing research area focussing on multi-class and multi-label problems involving an extremely large number of labels. Applications span diverse areas: language modelling and document tagging in NLP, face recognition and learning universal feature representations in computer vision, and gene function prediction in bioinformatics. Extreme classification has also opened up a new paradigm for ranking and recommendation by reformulating them as multi-label learning tasks where each item to be ranked or recommended is treated as a separate label. Such reformulations have led to significant gains over traditional collaborative filtering and content-based recommendation techniques. Consequently, extreme classifiers have been deployed in many real-world applications in industry.
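
As a rough illustration of the reformulation described above, the sketch below turns a toy set of user-item interactions into multi-label targets, with one label per item. The data, the item and user names, and the helper functions `build_label_index` and `to_multilabel_targets` are all hypothetical and are not taken from the course material; real extreme classification datasets have far too many labels for dense vectors and would use a sparse representation instead.

```python
# Hypothetical toy data: the items each user has interacted with.
interactions = {
    "user_a": ["item_2", "item_0"],
    "user_b": ["item_1"],
    "user_c": ["item_0", "item_1", "item_2"],
}


def build_label_index(interactions):
    """Assign one label id to each distinct item, in sorted order."""
    all_items = sorted(
        {item for user_items in interactions.values() for item in user_items}
    )
    return {item: label_id for label_id, item in enumerate(all_items)}


def to_multilabel_targets(interactions, label_index):
    """Build one binary target vector per user, with a 1 for each relevant item."""
    targets = {}
    for user, user_items in interactions.items():
        row = [0] * len(label_index)
        for item in user_items:
            row[label_index[item]] = 1
        targets[user] = row
    return targets


label_index = build_label_index(interactions)
targets = to_multilabel_targets(interactions, label_index)
```

Under this view, recommending items to a user amounts to predicting which entries of that user's target vector should be 1 and ranking the labels by predicted score.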

This course will introduce the area of extreme classification to students and cover various facets of the topic, ranging from algorithms to applications to performance evaluation. Students are expected to be familiar with introductory machine learning, linear algebra, and probability and statistics. Some familiarity with optimization will be helpful.

This will be a discussion-based course with a significant self-study component. Students will be expected to read a research paper before each lecture and come to class prepared to discuss the paper and related topics. Students will be assessed based on how well their extreme classifiers perform on benchmark datasets.

Lectures

Lecture 1 (20-2-2018) Introduction	Slides Talk
Lectures 2 and 3 (23-2-2018 and 6-3-2018) Tree approaches and bid phrase recommendation for advertising	Multi-label Random Forests
Lecture 4 (9-3-2018) Tree approaches continued -- FastXML	FastXML Talk
Lectures 5 and 6 (13-3-2018 and 16-3-2018) Extreme loss functions, performance evaluation and PfastreXML	PfastreXML Talk
Lecture 7 (20-3-2018) Extreme embeddings	SLEEC AnnexML AnnexML Talk
Lecture 8 (13-4-2018) 1-vs-All approaches	DiSMEC PPDSparse PPDSparse Talk
Lecture 9 (17-4-2018) Deep learning for extreme classification	XML-CNN XML-CNN Talk Training neural networks in time independent of output layer size (Talk) Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets Deep Networks With Large Output Spaces FastText Tree Learning FastText Tree Learning Talk

Resources
