Extreme multi-label learning aims to annotate each data point with the most relevant subset of labels from an extremely large label set. Slice is an efficient 1-vs-All based extreme classifier that is specially designed for low-dimensional dense features. Slice achieves almost the same prediction accuracy as leading 1-vs-All extreme classifiers such as DiSMEC [4], but can be orders of magnitude faster at training and prediction as it cuts both costs from linear to logarithmic in the number of labels. For example, Slice can efficiently scale to datasets with as many as 100 million labels and 240 million training points. Slice is also significantly more accurate than the other state-of-the-art extreme classifiers (PfastreXML [5], PPDSparse [3] and Parabel [2]) on low-dimensional dense deep learning features. Please refer to our WSDM-2019 paper [1] for more details.
This code is made available as is for non-commercial research purposes only. Please make sure that you have read the license agreement in LICENSE.doc/pdf. Please do not install or use Slice unless you agree to the terms of the license.
Download the Slice source code in C++ as well as precompiled Windows/Linux binaries. The code for Slice is written in C++ and should compile on 64-bit Windows/Linux machines using a C++11 enabled compiler. The code also uses the publicly available implementation of the HNSW algorithm for Approximate Nearest Neighbor Search (ANNS). Installation and usage instructions are provided below. Running Slice with the default parameter settings should give reasonable results on the benchmark datasets, but please verify that it works for you by running the sample_run.sh script provided with the toy example.
Please contact Himanshu Jain and Manik Varma if you have any questions or feedback.
The following datasets were used in our WSDM-2019 paper [1] to compare Slice's performance against other methods. All points are represented using XML-CNN [12] features, which were provided directly by the authors. Note that these dataset features differ from those available on The Extreme Classification Repository, where points are represented using pretrained fastText features. The raw text for these datasets is also available from the repository in case you would like to compute your own custom features.
Dataset | Download | Feature Dimensionality | Label Dimensionality | Number of Train Points | Number of Test Points | Avg. Points per Label | Avg. Labels per Point | Citations
---|---|---|---|---|---|---|---|---
EURLex-4K | Download | 1024 | 3,956 | 11,585 | 3,865 | 15.59 | 5.32 | [1] + [12]
Wikipedia-500K | Download | 512 | 501,070 | 1,646,302 | 711,542 | 16.03 | 4.87 | [1] + [12]
Amazon-670K | Download | 512 | 670,091 | 490,449 | 153,025 | 3.99 | 5.45 | [1] + [12]
These results were reported in our WSDM-2019 paper. Please let us know if you would like your algorithm's performance added to the table.
**EURLex-4K**

Method | P@1 | P@3 | P@5 | Training time (hr) | Test time/point (ms)
---|---|---|---|---|---
Slice [1] | 77.72 | 63.78 | 52.05 | 0.02 | 1.23
DiSMEC [4] | 76.12 | 62.91 | 51.51 | 0.13 | 4.36
Parabel [2] | 74.54 | 61.72 | 50.48 | 0.01 | 0.91
PPD-Sparse [3] | 76.32 | 62.79 | 51.40 | 0.013 | 4.36
PfastreXML [5] | 73.63 | 60.31 | 49.69 | 0.037 | 1.82
SLEEC [7] | 74.31 | 60.00 | 49.11 | 0.35 | 4.87
PD-Sparse [6] | 73.53 | 60.80 | 49.37 | 0.12 | 4.36
CS [11] | 58.01 | 44.85 | 35.50 | 0.01 | 140.10
LEML [8] | 60.34 | 47.45 | 37.96 | 0.67 | 2.24
WSABIE [10] | 76.09 | 61.69 | 49.11 | 0.13 | 2.24
CPLST [9] | 61.01 | 47.43 | 38.04 | 0.18 | 2.24

**Wikipedia-500K**

Method | P@1 | P@3 | P@5 | Training time (hr) | Test time/point (ms)
---|---|---|---|---|---
Slice (\|S\|=2000) [1] | 62.62 | 41.79 | 31.57 | 15.61 | 11.14
DiSMEC [4] | 63.70 | 42.49 | 32.26 | ~2133 | 316.29
Parabel [2] | 59.34 | 39.05 | 29.35 | 6.29 | 2.94
PPD-Sparse [3] | 50.40 | 33.15 | 25.54 | 5.85 | 316.29
PfastreXML [5] | 55.00 | 36.14 | 27.38 | 11.14 | 6.36

**Amazon-670K**

Method | P@1 | P@3 | P@5 | Training time (hr) | Test time/point (ms)
---|---|---|---|---|---
Slice [1] | 37.77 | 33.76 | 30.70 | 1.92 | 3.49
DiSMEC [4] | 37.60 | 33.62 | 30.64 | ~789 | 429
Parabel [2] | 33.93 | 30.38 | 27.49 | 1.54 | 2.85
PPD-Sparse [3] | 33.16 | 29.60 | 26.85 | 3.90 | 429
PfastreXML [5] | 28.51 | 26.06 | 24.17 | 2.85 | 19.35
SLEEC [7] | 18.77 | 16.50 | 14.97 | 7.12 | 22.54
Slice can be trained by executing the following command:

```
./slice_train [input feature file name] [input label file name] [output model folder name] -m 100 -c 300 -s 300 -k 300 -o 20 -t 1 -C 1 -f 0.000001 -siter 20 -stype 0 -q 0
```

where:

```
-m = params.M : HNSW M parameter. default=100
-c = params.efC : HNSW efConstruction parameter. default=300
-s = params.efS : HNSW efSearch parameter. default=300
-k = params.num_nbrs : Number of labels to be shortlisted per training point according to the generative model. default=300
-o = params.num_io_threads : Number of threads used to write the retrieved ANN points to file. default=20
-t = params.num_threads : Number of threads used to train the ANNS data structure and the discriminative classifiers. default=1
-C = params.classifier_cost : SVM cost coefficient for the linear classifiers. default=1.0
-f = params.classifier_threshold : Threshold value for sparsifying the linear classifiers' trained weights to reduce model size. default=1e-6
-siter = params.classifier_maxiter : Maximum number of iterations for training the linear classifiers. default=20
-stype = params.classifier_kind : Kind of linear classifier to use. 0=L2R_L2LOSS_SVC, 1=L2R_LR (refer to Liblinear). default=0
-q = params.quiet : Quiet option to restrict the output printed for progress reporting and debugging. 0=not quiet, 1=quiet. default=0
```

The feature file should be in dense matrix text format and the label file in sparse matrix text format (refer to the Miscellaneous section; a sketch of both formats follows below).
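For reference, here is a minimal sketch of what the two text formats are assumed to look like: a "num_points num_dims" header followed by one space-separated row of values per point for the dense format, and a "num_points num_labels" header followed by one line of space-separated index:value pairs per point for the sparse format. The file names and toy data below are hypothetical.

```cpp
// Illustrative writer for the assumed dense and sparse text formats.
// File names (trn_X_Xf.txt, trn_X_Y.txt) and the toy data are hypothetical.
#include <cstdio>
#include <utility>
#include <vector>

int main() {
    // Two toy points with 3-dimensional dense features.
    std::vector<std::vector<float>> X = {{0.1f, 0.2f, 0.3f},
                                         {0.4f, 0.5f, 0.6f}};
    // Their ground-truth labels out of 4 labels, as (label index, value) pairs.
    std::vector<std::vector<std::pair<int, float>>> Y = {{{0, 1.0f}, {2, 1.0f}},
                                                         {{3, 1.0f}}};

    // Dense feature file: header "num_points num_dims", then one row per point.
    FILE* ff = std::fopen("trn_X_Xf.txt", "w");
    std::fprintf(ff, "%zu %zu\n", X.size(), X[0].size());
    for (const auto& row : X) {
        for (std::size_t j = 0; j < row.size(); ++j)
            std::fprintf(ff, j ? " %g" : "%g", row[j]);
        std::fprintf(ff, "\n");
    }
    std::fclose(ff);

    // Sparse label file: header "num_points num_labels", then one line of
    // 0-indexed index:value pairs per point.
    FILE* fl = std::fopen("trn_X_Y.txt", "w");
    std::fprintf(fl, "%zu %d\n", Y.size(), 4);
    for (const auto& row : Y) {
        for (std::size_t j = 0; j < row.size(); ++j)
            std::fprintf(fl, j ? " %d:%g" : "%d:%g", row[j].first, row[j].second);
        std::fprintf(fl, "\n");
    }
    std::fclose(fl);
    return 0;
}
```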
Predictions can be made by executing the following command:

```
./slice_predict [feature file name] [model dir name] [output file name] -b 0 -t 1 -q 0
```

where:

```
-s = params.efS : HNSW efSearch parameter. default=[value saved in trained model]
-k = params.num_nbrs : Number of labels to be shortlisted per test point according to the generative model. default=[value saved in trained model]
-o = params.num_io_threads : Number of threads used to write the retrieved ANN points to file. default=[value saved in trained model]
-b = params.b_gen : Bias parameter for the generative model. default=0
-t = params.num_threads : Number of threads. default=[value saved in trained model]
-q = params.quiet : Quiet option to restrict the output printed for progress reporting and debugging. 0=not quiet, 1=quiet. default=[value saved in trained model]
```

The feature file should be in dense matrix text format (refer to the Miscellaneous section).
Performance in terms of precision@K and nDCG@K can be evaluated by executing the following commands:

```
./precision_k [test score matrix file name] [test label file name] [K]
./nDCG_k [test score matrix file name] [test label file name] [K]
```

You can also use the Matlab scripts provided in the Tools/metrics folder for calculating the propensity scored metrics. Please execute "make" in the Tools/Matlab folder from the Matlab terminal to compile these scripts. Both vanilla and propensity scored metrics can then be computed by executing the following Matlab command from the Tools/metrics folder:

```
[metrics] = get_all_metrics([test score matrix], [test label matrix], [inverse label propensity vector])
```
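To make the vanilla metric concrete, here is a minimal sketch of precision@k for a single test point: the fraction of the k highest-scoring predicted labels that are true labels. This is an illustration rather than the shipped precision_k tool; the function name and data layout are our own.

```cpp
// Illustrative precision@k for one test point. Not the shipped tool.
#include <algorithm>
#include <cstdio>
#include <unordered_set>
#include <utility>
#include <vector>

double precision_at_k(std::vector<std::pair<int, double>> scored, // (label, score)
                      const std::unordered_set<int>& true_labels,
                      std::size_t k) {
    k = std::min(k, scored.size());
    // Bring the k highest-scoring labels to the front.
    std::partial_sort(scored.begin(), scored.begin() + k, scored.end(),
                      [](const std::pair<int, double>& a,
                         const std::pair<int, double>& b) { return a.second > b.second; });
    std::size_t hits = 0;
    for (std::size_t i = 0; i < k; ++i)
        hits += true_labels.count(scored[i].first);
    return static_cast<double>(hits) / static_cast<double>(k);
}

int main() {
    std::vector<std::pair<int, double>> scored = {{0, 0.9}, {1, 0.7}, {2, 0.4}, {3, 0.1}};
    std::unordered_set<int> truth = {0, 2};
    std::printf("P@3 = %.4f\n", precision_at_k(scored, truth, 3)); // prints 0.6667
    return 0;
}
```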
Feature files can be converted between the sparse and dense text formats by executing the following commands:

```
./smat_to_dmat [sparse feature text file] [dense feature text file]
./dmat_to_smat [dense feature text file] [sparse feature text file]
```
Matlab functions are provided for reading and writing sparse text format matrices:

```
[sparse matrix] = read_text_mat([sparse text matrix file name]);
write_text_mat([Matlab sparse matrix], [sparse text matrix file name]);
```
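As a companion illustration, the sketch below parses the sparse text format assumed earlier (a "num_rows num_cols" header, then one line of index:value pairs per row) into an in-memory structure. It is not the repository's own parser; the function name mirrors the Matlab script only for readability.

```cpp
// Illustrative reader for the assumed sparse matrix text format.
#include <fstream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using SparseMatrix = std::vector<std::vector<std::pair<int, float>>>;

SparseMatrix read_text_mat(const std::string& path, int& num_cols) {
    std::ifstream in(path);
    std::size_t num_rows = 0;
    in >> num_rows >> num_cols;
    std::string line;
    std::getline(in, line);            // consume the rest of the header line
    SparseMatrix mat(num_rows);
    for (std::size_t r = 0; r < num_rows && std::getline(in, line); ++r) {
        std::istringstream row(line);
        std::string token;
        while (row >> token) {         // each token looks like "index:value"
            std::size_t colon = token.find(':');
            mat[r].emplace_back(std::stoi(token.substr(0, colon)),
                                std::stof(token.substr(colon + 1)));
        }
    }
    return mat;
}
```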
The inverse label propensity weights can be computed with the following Matlab function:

```
[weights vector] = inv_propensity([training label matrix],A,B);
```

A and B are the parameters of the inverse propensity model. The following values should be used for the benchmark datasets:

- Wikipedia-500K: A=0.5, B=0.4
- Amazon-670K: A=0.6, B=2.6
- Other: A=0.55, B=1.5
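For reference, here is a minimal sketch of the model these parameters plug into, assuming the propensity formulation of Jain et al. (KDD 2016) [5]: p_l = 1 / (1 + C(N_l + B)^(-A)) with C = (log N - 1)(B + 1)^A, where N is the number of training points and N_l is the number of training points tagged with label l. The function name and signature below are illustrative, not the repository's API.

```cpp
// Sketch of inverse propensity weights 1/p_l under the assumed model of
// Jain et al. (KDD 2016) [5]:
//   p_l = 1 / (1 + C * (N_l + B)^(-A)),  C = (log N - 1) * (B + 1)^A
#include <cmath>
#include <vector>

std::vector<double> inv_propensity(const std::vector<long>& label_counts, // N_l per label
                                   long num_train_points,                 // N
                                   double A, double B) {
    const double C = (std::log(static_cast<double>(num_train_points)) - 1.0) *
                     std::pow(B + 1.0, A);
    std::vector<double> weights;
    weights.reserve(label_counts.size());
    for (long Nl : label_counts)
        weights.push_back(1.0 + C * std::pow(static_cast<double>(Nl) + B, -A)); // 1/p_l
    return weights;
}
```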