COV 878 - Extreme Classification


Instructor: Manik Varma
Co-ordinator: Parag Singla
Teaching Assistant: Kunal Dahiya
Credits: 1
Classroom: LHC 416
Timings: 2:00 - 3:30 PM on Tuesdays and Fridays
Kaggle competition: Dataset 1, Dataset 2, Dataset 3
Mailing list: Piazza


Extreme classification is a rapidly growing research area focussing on multi-class and multi-label problems involving an extremely large number of labels. Many applications have been found in diverse areas ranging from language modelling to document tagging in NLP, face recognition to learning universal feature representations in computer vision, gene function prediction in bioinformatics, etc. Extreme classification has also opened up a new paradigm for ranking and recommendation by reformulating them as multi-label learning tasks where each item to be ranked or recommended is treated as a separate label. Such reformulations have led to significant gains over traditional collaborative filtering and content-based recommendation techniques. Consequently, extreme classifiers have been deployed in many real-world applications in industry.
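
As a minimal sketch of the reformulation described above, the following Python snippet treats each item as a separate label and converts user interaction lists into binary label vectors. The function name, item names, and user names are hypothetical and chosen only for illustration; a real extreme classification dataset would have far more labels and would use a sparse representation.

```python
from typing import Dict, List


def interactions_to_label_vectors(
    interactions: Dict[str, List[str]], items: List[str]
) -> Dict[str, List[int]]:
    """Map each user's list of items to a binary multi-label vector."""
    # Each item to be recommended becomes one label (one column index).
    index = {item: j for j, item in enumerate(items)}
    vectors = {}
    for user, chosen in interactions.items():
        row = [0] * len(items)
        for item in chosen:
            row[index[item]] = 1  # label is relevant for this user
        vectors[user] = row
    return vectors


# Hypothetical toy data: four items, two users.
items = ["item_a", "item_b", "item_c", "item_d"]
interactions = {"user_1": ["item_a", "item_c"], "user_2": ["item_d"]}
label_vectors = interactions_to_label_vectors(interactions, items)
print(label_vectors)
```

A multi-label classifier trained on such vectors would then rank all items for a new user by predicted relevance, which is the sense in which recommendation becomes a multi-label learning task.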

This course will introduce the area of extreme classification to students and cover various facets of the topic ranging from algorithms to applications to performance evaluation. Students are expected to be familiar with introductory machine learning, linear algebra, and probability and statistics. Some familiarity with optimization will be helpful.

This will be a discussion-based course with a significant self-study component. Students will be expected to have read a research paper before each lecture and to come to class prepared to discuss the paper and related topics. Students will be assessed based on how well their extreme classifiers perform on benchmark datasets.

Lectures

Lecture 1 (20-2-2018)
Introduction  

Slides
Talk

Lectures 2 and 3 (23-2-2018 and 6-3-2018)
Tree approaches and bid phrase recommendation for advertising  
Multi-label Random Forests

Lecture 4 (9-3-2018)
Tree approaches continued -- FastXML  
FastXML
Talk

Lectures 5 and 6 (13-3-2018 and 16-3-2018)
Extreme loss functions, performance evaluation and PfastreXML  
PfastreXML
Talk

Lecture 7 (20-3-2018)
Extreme embeddings  
SLEEC
AnnexML
AnnexML Talk

Lecture 8 (13-4-2018)
1-vs-All approaches  
DiSMEC
PPDSparse
PPDSparse Talk

Lecture 9 (17-4-2018)
Deep learning for extreme classification  
XML-CNN
XML-CNN Talk
Training neural networks in time independent of output layer size (Talk)
Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets
Deep Networks With Large Output Spaces
FastText Tree Learning
FastText Tree Learning Talk

Resources

