The Extreme Classification Repository: Multi-label Datasets & Code
Kush Bhatia • Kunal Dahiya • Himanshu Jain • Purushottam Kar • Anshul Mittal • Yashoteja Prabhu • Manik Varma
The objective of extreme multi-label classification (XC) is to learn feature architectures and classifiers that can automatically tag a data point with the most relevant subset of labels from an extremely large label set. This repository provides resources for XC, including datasets, code for leading XC methods, and metrics to evaluate the performance of XC algorithms.
Citing the Repository
Please use the following citation if you use any of the datasets or results provided in this repository.
@Misc{Bhatia16,
author = {Bhatia, K. and Dahiya, K. and Jain, H. and Kar, P. and Mittal, A. and Prabhu, Y. and Varma, M.},
title = {The extreme classification repository: Multi-label datasets and code},
url = {http://manikvarma.org/downloads/XC/XMLRepository.html},
year = {2016}
}
Datasets
Useful Tools
Performance Metrics and Evaluation Protocols
Code for XC Methods
Benchmarked Results
Appendix
References
The datasets below consider various XC problems in webpage categorization, related-webpage recommendation and product-to-product recommendation tasks. These include multi-modal datasets and datasets where labels have textual features. Information about the dataset file format can be found in the README file available [here]. Python and Matlab scripts for reading the datasets are provided [below]. Please get in touch with Manik Varma if you would like to contribute a dataset.
Number of labels: The (rounded-off) number of labels in a dataset is appended to the dataset name to disambiguate various versions of datasets. Certain legacy datasets were renamed to ensure uniformity: the dataset previously referred to as DeliciousLarge was renamed to Delicious-200K, and RCV1-X was renamed to RCV1-2K.
Label features: Datasets that contain label features have the token "LF" prepended to their names. These are usually short textual descriptions of the labels.
Multi-modal features: Datasets that contain multi-modal features have the token "MM" prepended to their names. These usually correspond to short textual descriptions and one or more images for each data point and label.
Short-text datasets: Datasets with the phrase "Titles" in their names, such as AmazonTitles-670K, are short-text datasets whose data points are represented by a 3-5 word textual description, such as the name of a product or the title of a webpage. For full-text datasets such as Amazon-670K, data points are represented using a more detailed description. Short-text tasks abound in ranking and recommendation applications where data points are user queries or products/webpages represented using only their titles.
Item-to-item datasets: Datasets with the phrase "SeeAlso" in their names correspond to tasks requiring related Wikipedia articles to be predicted for a given Wikipedia article.
Datasets with/without Label Features
Note that there exist pairs of datasets whose names differ only in the "LF" prefix and the rounded label count (e.g. LF-WikiSeeAlsoTitles-320K and WikiSeeAlsoTitles-350K) but which contain different numbers of labels and data points. The reason for this variation is that the raw dumps from which these datasets were curated often contained labels for which label features were unavailable or could not be reliably retrieved. Such labels could exist in the non-LF dataset but were excluded from the LF version. These exclusions could also leave specific data points with zero labels; such data points were excluded from the dataset as well.
A special case in this respect is that of the Wikipedia-500K and LF-Wikipedia-500K datasets, which are identical and have the same labels and data points. Wikipedia articles are the data points and Wikipedia categories are the labels for these datasets. As a convention, methods that do not use label features could choose to report their results on the Wikipedia-500K dataset, whereas methods that do use label features could report results on the LF-Wikipedia-500K dataset. For this reason, the two datasets have not been released separately; only the LF-Wikipedia-500K dataset has been released (see links below). Methods that wish to work on the Wikipedia-500K dataset can download the LF version and disregard the label features.
Multi-modal Datasets
The MM-AmazonTitles-300K dataset was created by taking raw data dumps and extracting all data points and labels for which a short textual description and at least one image were available. The images were resized to fit within a 128 x 128-pixel region and padded with white pixels in a centered manner to ensure a 1:1 aspect ratio. White padding was used since the natural background in most images was white. Subsequent operations such as tokenization, train-test split creation and reciprocal-pair removal were done as explained below. The processed and unprocessed image sets are available upon request. To request them, please download the dataset using the links given in the table above, inspect the README file in the download for terms of usage and fill out the form available [here].
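For illustration, the following is a minimal sketch of the image preprocessing described above (resize to fit within 128 x 128 pixels, then centre on a white square canvas), using Pillow. The exact scripts used to create the dataset are not provided here, and the function name is purely illustrative.

```python
from PIL import Image  # Pillow

def pad_to_square_128(path_in, path_out):
    """Resize an image to fit within 128 x 128 pixels and centre it on a
    white square canvas (illustrative sketch of the preprocessing above)."""
    img = Image.open(path_in).convert("RGB")
    img.thumbnail((128, 128), Image.LANCZOS)                 # fit within 128 x 128, keep aspect ratio
    canvas = Image.new("RGB", (128, 128), (255, 255, 255))   # white background
    off_x = (128 - img.width) // 2
    off_y = (128 - img.height) // 2
    canvas.paste(img, (off_x, off_y))                        # centre the resized image
    canvas.save(path_out)
```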
Tables comparing various methods on the MM-AmazonTitles-300K
dataset are not provided on this webpage since most multi-modal
benchmarks are not XC methods and most XC methods work only with
textual features and not multi-modal features. Instead, please
refer to the publication
[65] for
benchmark comparisons.
Legacy Datasets
Benchmarked results on datasets formerly popular in XC research have been moved to the Appendix available [here]. Some of these datasets are tiny, such as the Bibtex dataset with 159 labels. For other datasets, the raw sources can no longer be reliably traced and only bag-of-words features are available. All such legacy datasets remain available via the links in the dataset table below.
| Dataset | Download | BoW Feature Dimensionality | Number of Labels | Number of Train Points | Number of Test Points | Avg. Points per Label | Avg. Labels per Point | Original Source |
|---|---|---|---|---|---|---|---|---|
| Multi-modal Datasets | | | | | | | | |
| MM-AmazonTitles-300K | BoW Features / Raw text | 40,000 | 303,296 | 586,781 | 260,536 | 15.73 | 8.13 | [64] |
| Datasets with Label Features | | | | | | | | |
| LF-AmazonTitles-131K | BoW Features / Raw text | 40,000 | 131,073 | 294,805 | 134,835 | 5.15 | 2.29 | [28] |
| LF-Amazon-131K | BoW Features / Raw text | 80,000 | 131,073 | 294,805 | 134,835 | 5.15 | 2.29 | [28] |
| LF-WikiSeeAlsoTitles-320K | BoW Features / Raw text | 40,000 | 312,330 | 693,082 | 177,515 | 4.67 | 2.11 | - |
| LF-WikiSeeAlso-320K | BoW Features / Raw text | 80,000 | 312,330 | 693,082 | 177,515 | 4.67 | 2.11 | - |
| LF-WikiTitles-500K | BoW Features / Raw text | 80,000 | 501,070 | 1,813,391 | 783,743 | 17.15 | 4.74 | - |
| LF-Wikipedia-500K | BoW Features / Raw text | 2,381,304 | 501,070 | 1,813,391 | 783,743 | 24.75 | 4.77 | - |
| ORCAS-800K | Dataset page | - | 797,322 | 7,360,881 | 2,547,702 | 16.13 | 1.75 | [70] |
| LF-AmazonTitles-1.3M | BoW Features / Raw text | 128,000 | 1,305,265 | 2,248,619 | 970,237 | 38.24 | 22.20 | [29] + [30] |
| Datasets without Label Features | | | | | | | | |
| AmazonCat-13K | BoW Features / Raw text | 203,882 | 13,330 | 1,186,239 | 306,782 | 448.57 | 5.04 | [28] |
| AmazonCat-14K | BoW Features / Raw text | 597,540 | 14,588 | 4,398,050 | 1,099,725 | 1330.1 | 3.53 | [29] + [30] |
| WikiSeeAlsoTitles-350K | BoW Features / Raw text | 91,414 | 352,072 | 629,418 | 162,491 | 5.24 | 2.33 | - |
| WikiTitles-500K | BoW Features / Raw text | 185,479 | 501,070 | 1,699,722 | 722,678 | 23.62 | 4.89 | - |
| Wikipedia-500K | (same as LF-Wikipedia-500K) | 2,381,304 | 501,070 | 1,813,391 | 783,743 | 24.75 | 4.77 | - |
| AmazonTitles-670K | BoW Features / Raw text | 66,666 | 670,091 | 485,176 | 150,875 | 5.11 | 5.39 | [28] |
| Amazon-670K | BoW Features / Raw text | 135,909 | 670,091 | 490,449 | 153,025 | 3.99 | 5.45 | [28] |
| AmazonTitles-3M | BoW Features / Raw text | 165,431 | 2,812,281 | 1,712,536 | 739,665 | 31.55 | 36.18 | [29] + [30] |
| Amazon-3M | BoW Features / Raw text | 337,067 | 2,812,281 | 1,717,899 | 742,507 | 31.64 | 36.17 | [29] + [30] |
| Legacy Datasets | | | | | | | | |
| Mediamill | BoW Features | 120 | 101 | 30,993 | 12,914 | 1902.15 | 4.38 | [19] |
| Bibtex | BoW Features | 1,836 | 159 | 4,880 | 2,515 | 111.71 | 2.40 | [20] |
| Delicious | BoW Features | 500 | 983 | 12,920 | 3,185 | 311.61 | 19.03 | [21] |
| RCV1-2K | BoW Features | 47,236 | 2,456 | 623,847 | 155,962 | 1218.56 | 4.79 | [26] |
| EURLex-4K | BoW Features | 5,000 | 3,993 | 15,539 | 3,809 | 25.73 | 5.31 | [27] + [47] |
| EURLex-4.3K | BoW Features | 200,000 | 4,271 | 45,000 | 6,000 | 60.57 | 5.07 | [47] + [48] |
| Wiki10-31K | BoW Features | 101,938 | 30,938 | 14,146 | 6,616 | 8.52 | 18.64 | [23] |
| Delicious-200K | BoW Features | 782,585 | 205,443 | 196,606 | 100,095 | 72.29 | 75.54 | [24] |
| WikiLSHTC-325K | BoW Features | 1,617,899 | 325,056 | 1,778,351 | 587,084 | 17.46 | 3.19 | [25] |
Dataset statistics & download
The table above provides links for downloading precomputed bag-of-words features or raw text for each dataset. The tokenization used to create the bag-of-words representation may differ across datasets (e.g. whitespace-separated tokens for legacy datasets vs. WordPiece for more recent datasets). It is recommended that XC methods that use a novel tokenizer conduct additional experiments to isolate improvements attributable to better tokenization from those attributable to the architecture or learning algorithm. One way to accomplish this is to re-run older XC methods with the novel tokenizer.
For each dataset, a single train/test split is offered. Splits were not created by uniform random sampling but in a way that ensured that every label had at least one training point. This yields more realistic train/test splits than uniform sampling, which could drop several infrequently occurring and hard-to-classify labels from the test set. For example, on the WikiLSHTC-325K dataset, uniformly random split creation could lose ninety thousand of the hardest-to-classify labels from the test set, whereas the adopted sampling procedure dropped only forty thousand labels. A minimal sketch of one such label-covering split procedure appears after the note below.
Note: Results computed on the train/test splits provided on this page are not comparable to results computed on splits created using uniform sampling.
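For illustration, the following is a minimal sketch of one way to create a split in which every label retains at least one training point. It is not the exact procedure used to create the splits provided on this page; the function name and the test fraction are illustrative.

```python
import random

def label_covering_split(point_labels, test_fraction=0.3, seed=0):
    """point_labels[i] is the set of label ids of data point i.
    Returns (train_ids, test_ids) such that every label has at least one
    training point (illustrative sketch only)."""
    rng = random.Random(seed)
    ids = list(range(len(point_labels)))
    rng.shuffle(ids)
    covered = set()            # labels already covered by the training set
    train, test = [], []
    for i in ids:
        labels = point_labels[i]
        if labels - covered:   # point carries a label not yet seen in train
            train.append(i)
            covered |= labels
        elif rng.random() < test_fraction:
            test.append(i)
        else:
            train.append(i)
            covered |= labels
    return train, test
```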
For the "LF" datasets that concern related
item prediction, additional care is required since introducing
label features allowed "reciprocal pairs" to emerge. Specifically,
these are pairs of items, say A and B, that are related to each
other such that two distinct data points exist, with A appearing
as a label for B in one data point and B appearing as a label for
A in the other. Such pairs were removed from the ground truth in
the test set to prevent algorithms from achieving artificially
high scores by memorizing such pairs without learning anything
meaningful. The recommended protocol for performing prediction
while avoiding such reciprocal pairs using filter files provided
with these datasets is described
[here] .
The following resources provide several useful tools for working with the datasets. These tools can be used to perform various useful operations, including:
reading and writing the datasets in the given file format (a minimal reading sketch is given after this list);
preprocessing raw text using various tokenizers to generate data point (and label) features, including bag-of-words features;
evaluating various performance measures such as precision, nDCG and their propensity-scored counterparts (see [here] for details).
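For illustration, the following is a minimal sketch of reading a dataset in the sparse text format, assuming the layout described in the README: a header line with the number of points, features and labels, followed by one line per data point listing its label indices and sparse feature values. The Python/Matlab scripts and the PyXCLib tools mentioned on this page should be preferred in practice.

```python
import numpy as np
from scipy.sparse import csr_matrix

def read_xc_data(path):
    """Read a dataset in the repository's sparse text format (sketch).
    Assumed layout (see the README for the authoritative description):
    header line  ->  num_points num_features num_labels
    data line    ->  lbl1,lbl2,... ft1:val1 ft2:val2 ..."""
    with open(path) as f:
        num_points, num_features, num_labels = map(int, f.readline().split())
        feat_rows, feat_cols, feat_vals = [], [], []
        lbl_rows, lbl_cols = [], []
        for row, line in enumerate(f):
            tokens = line.split()
            if tokens and ":" not in tokens[0]:          # first token is the label list (may be absent)
                for lbl in tokens[0].split(","):
                    lbl_rows.append(row)
                    lbl_cols.append(int(lbl))
                tokens = tokens[1:]
            for tok in tokens:                           # remaining tokens are feature:value pairs
                idx, val = tok.split(":")
                feat_rows.append(row)
                feat_cols.append(int(idx))
                feat_vals.append(float(val))
    X = csr_matrix((feat_vals, (feat_rows, feat_cols)), shape=(num_points, num_features))
    Y = csr_matrix((np.ones(len(lbl_rows)), (lbl_rows, lbl_cols)), shape=(num_points, num_labels))
    return X, Y
```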
The benchmarked results below compare various algorithms, with classification accuracy evaluated using several performance measures. The discussion below describes protocols for evaluating XC methods, especially in the presence of head/tail labels and reciprocal pairs (see [here]).
The precision$@k$ and nDCG$@k$ metrics are defined for a predicted score vector $\hat{\mathbf y} \in {\mathbb{R}}^{L}$ and ground truth label vector $\mathbf y \in \left\lbrace 0, 1 \right\rbrace^L$ as \[ \text{P}@k := \frac{1}{k} \sum_{l\in \text{rank}_k (\hat{\mathbf y})} \mathbf y_l \] \[ \text{DCG}@k := \sum_{l\in {\text{rank}}_k (\hat{\mathbf y})} \frac{\mathbf y_l}{\log(l+1)} \] \[ \text{nDCG}@k := \frac{{\text{DCG}}@k}{\sum_{l=1}^{\min(k, \|\mathbf y\|_0)} \frac{1}{\log(l+1)}}, \] where $\text{rank}_k(\hat{\mathbf y})$ returns the indices of the $k$ largest entries of $\hat{\mathbf y}$, ranked in descending order of score. A minimal sketch of computing these metrics is given below.
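The following is a minimal NumPy sketch of these metrics for dense score and ground-truth matrices. It applies the discount by rank position and uses base-2 logarithms (the choice of base is a convention); it is meant only to illustrate the definitions and not to replace the evaluation scripts provided on this page.

```python
import numpy as np

def precision_ndcg_at_k(scores, Y, k=5):
    """Mean P@k and nDCG@k over all test points for dense arrays.
    scores : (n, L) predicted scores; Y : (n, L) binary ground-truth matrix.
    The discount for rank position r = 1..k is 1/log2(r + 1)."""
    topk = np.argsort(-scores, axis=1)[:, :k]            # indices of the k highest scores
    rel = np.take_along_axis(Y, topk, axis=1)            # relevance of the top-k predictions
    prec = rel.mean(axis=1)                              # P@k for each test point
    disc = 1.0 / np.log2(np.arange(2, k + 2))            # 1/log2(r + 1), r = 1..k
    dcg = (rel * disc).sum(axis=1)
    ideal = np.array([disc[:min(k, int(n_pos))].sum() for n_pos in Y.sum(axis=1)])
    ndcg = np.where(ideal > 0, dcg / np.maximum(ideal, 1e-12), 0.0)
    return prec.mean(), ndcg.mean()
```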
For datasets that contain excessively popular labels (often referred to as "head" labels), high P$@k$ may be achieved by simply predicting head labels repeatedly, irrespective of their relevance to the data point. To check for such trivial behavior, it is recommended that XC methods also be evaluated with respect to the propensity-scored counterparts of the precision$@k$ and nDCG$@k$ metrics (PSP$@k$ and PSnDCG$@k$) described below. \[ \text{PSP}@k := \frac{1}{k} \sum_{l\in \text{rank}_k (\hat{\mathbf y})} \frac{\mathbf y_l}{p_l} \] \[ \text{PSDCG}@k := \sum_{l\in {\text{rank}}_k (\hat{\mathbf y})} \frac{\mathbf y_l}{p_l\log(l+1)} \] \[ \text{PSnDCG}@k := \frac{{\text{PSDCG}}@k}{\sum_{l=1}^{k} \frac{1}{\log(l+1)}}, \] where $p_l$ is the propensity score of label $l$, which helps make the metrics unbiased [31] with respect to missing labels. Propensity-scored metrics place particular emphasis on performing well on tail labels and give only small rewards for predicting popular or head labels. It is recommended that the scripts provided [here] be used to compute propensity-scored metrics in order to be consistent with the results reported below; a minimal sketch of PSP$@k$ follows.
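The following is a minimal NumPy sketch of PSP$@k$, computed literally from the formula above, together with the empirical propensity model of [31]; the constants A and B are dataset-specific and the values below are only the commonly used defaults. The official scripts linked above should be used to obtain numbers comparable to the tables on this page.

```python
import numpy as np

def propensities_from_frequencies(Y_train, A=0.55, B=1.5):
    """Empirical propensity model of Jain et al. [31] (sketch):
    p_l = 1 / (1 + C * exp(-A * log(N_l + B))), with C = (log N - 1) * (B + 1)**A,
    where N is the number of training points and N_l the training frequency of
    label l. A and B are dataset-specific constants (see [31] for suggested values)."""
    N = Y_train.shape[0]
    freqs = np.asarray(Y_train.sum(axis=0)).ravel()
    C = (np.log(N) - 1.0) * (B + 1.0) ** A
    return 1.0 / (1.0 + C * np.exp(-A * np.log(freqs + B)))

def psp_at_k(scores, Y, p, k=5):
    """Mean PSP@k over all test points, following the formula above literally.
    scores, Y : (n, L) dense arrays; p : (L,) vector of label propensities."""
    topk = np.argsort(-scores, axis=1)[:, :k]            # top-k predicted labels per point
    rel = np.take_along_axis(Y, topk, axis=1)            # ground-truth relevance of those labels
    return (rel / p[topk]).sum(axis=1).mean() / k
```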
As described [here], reciprocal pairs were removed from the ground truth in the test splits of the LF datasets to avoid rewarding trivial predictions. However, these reciprocal pairs must also be removed from the test predictions of XC methods to avoid unnecessarily penalizing them. It is recommended that the filter files provided along with the datasets, together with the tools provided in the PyXCLib library linked [here], be used to evaluate XC methods on LF datasets. Although reciprocal pairs were not removed from the train splits, a separate filter file enumerating the reciprocal pairs therein is provided for the train splits so that methods that wish to eliminate them from the train splits may do so. Note that these filter files are distinct from the ground truth files and only contain lists of reciprocal pairs. A minimal sketch of applying a filter file to a matrix of predicted scores is given below.
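The following is a minimal sketch of suppressing filtered (data point, label) pairs in a dense score matrix before ranking, so that reciprocal pairs can neither be rewarded nor penalized. The file name and the assumption that each line of the filter file holds a data-point index and a label index are illustrative; the PyXCLib utilities should be preferred in practice.

```python
import numpy as np

def apply_filter(scores, filter_pairs):
    """Set the scores of filtered (row, col) pairs to -inf so that they can
    never enter the top-k predictions used for evaluation."""
    scores = scores.copy()
    rows, cols = filter_pairs[:, 0], filter_pairs[:, 1]
    scores[rows, cols] = -np.inf
    return scores

# Illustrative usage (file name assumed; check the dataset README for the actual name):
# pairs = np.loadtxt("filter_labels_test.txt", dtype=int).reshape(-1, 2)
# scores = apply_filter(scores, pairs)
```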
The following table provides links to code for leading XC methods. The methods have been categorized based on the kind of classifier used (e.g. one-vs-all, trees, embeddings) for easy identification. Methods that learn deep representations for data points jointly with the classifier are listed as a separate category.
| Method | Category | Features | Language |
|---|---|---|---|
| Slice (Jain et al., WSDM 2019) | 1-vs-All | Pre-trained dense | C++ |
| Parabel (Prabhu et al., WWW 2018) | 1-vs-All | Sparse BoW | C++ |
| DiSMEC++ (Schultheis and Babbar, ECML-MLJ 2022) | 1-vs-All | Sparse BoW | C++ |
| DiSMEC (Babbar and Schölkopf, WSDM 2017) | 1-vs-All | Sparse BoW | Java |
| PPD-Sparse (Yen et al., KDD 2017) | 1-vs-All | Sparse BoW | C++ |
| Label Filters (Niculescu-Mizil and Abbasnejad, AISTATS 2017) | 1-vs-All | Sparse BoW | C |
| PD-Sparse (Yen et al., ICML 2016) | 1-vs-All | Sparse BoW | C++ |
| ProXML (Babbar and Schölkopf, Machine Learning 2019 & ECML 2019) | 1-vs-All | Sparse BoW | C++ |
| Bonsai (Khandagale et al., ArXiv 2019) | 1-vs-All | Sparse BoW | C++ |
| SwiftXML (Prabhu et al., WSDM 2018) | Trees | Sparse BoW | C++ |
| Probabilistic Label Trees (Jasinska et al., ICML 2017) | Trees | Sparse BoW | C++ |
| PfastreXML (Jain et al., KDD 2016) | Trees | Sparse BoW | C++ |
| FastXML (Prabhu & Varma, KDD 2014) | Trees | Sparse BoW | C++ |
| CRAFTML (Siblini et al., ICML 2018) | Trees | Sparse BoW | Rust |
| DEFRAG (Jalan and Kar, IJCAI 2019) | Embeddings | Sparse BoW | Rust |
| AnnexML (Tagami, KDD 2017) | Embeddings | Sparse BoW | C |
| Randomized embeddings for extreme learning (Mineiro and Karampatziakis, CoRR 2017) | Embeddings | Sparse BoW | Matlab |
| SLEEC (Bhatia et al., NIPS 2015) | Embeddings | Sparse BoW | Matlab |
| LEML (Yu et al., ICML 2014) | Embeddings | Sparse BoW | Matlab |
| W-LTLS (Evron et al., NeurIPS 2018) | Embeddings | Sparse BoW | Python |
| ExMLDS-(4,1) (Gupta et al., AAAI 2019) | Embeddings | Sparse BoW | C |
| fastTextLearnTree (Jernite et al., ICML 2017) | Deep learning | Sparse BoW | C |
| XML-CNN (Liu et al., SIGIR 2017) | Deep learning | Custom | Python |
| AttentionXML (You et al., NeurIPS 2019) | Deep learning | Custom | Python |
| X-Transformer (Chang et al., KDD 2020) | Deep learning | Custom | Python |
| MACH (Medini et al., ICML 2019) | Deep learning | Custom | Python |
| APLC-XLNet (Ye et al., ICML 2020) | Deep learning | Custom | Python |
| DeepXML/Astec (Dahiya et al., WSDM 2021) | Deep learning | Custom | Python |
| DECAF (Mittal et al., WSDM 2021) | Deep learning | Custom | Python |
| LightXML (Jiang et al., AAAI 2021) | Deep learning | Custom | Python |
| PWXMC (Qaraei et al., TheWebConf 2021) | Loss function | Custom | Python |
| GalaXC (Saini et al., TheWebConf 2021) | Deep learning | Custom | Python |
| ECLARE (Mittal et al., TheWebConf 2021) | Deep learning | Custom | Python |
| SiameseXML (Dahiya et al., ICML 2021) | Deep learning | Custom | Python |
| ZestXML (Gupta et al., KDD 2021) | Zero-shot learning | Sparse BoW | C++ |
| MUFIN (Mittal et al., CVPR 2022) | Deep learning | Custom | Python |
| InceptionXML (Kharbanda et al., SIGIR 2023) | Deep learning | Custom | Python |
| CascadeXML (Kharbanda et al., NeurIPS 2022) | Deep learning | Custom | Python |
| NGAME (Dahiya et al., WSDM 2023) | Deep learning | Custom | Python |
| Renee (Jain et al., MLSys 2023) | Deep learning | Custom | Python |
| MatchXML (Ye et al., TKDE 2024) | Deep learning | Custom | Python |
| DEXA (Dahiya et al., KDD 2023) | Deep learning | Custom | Python |
Please contact Manik Varma
if you would like us to provide a link to your code.
The tables below provide benchmarked results for various XC methods on several datasets. Rows corresponding to XC methods that use deep-learnt features, or that use label features on the LF datasets, have been highlighted in light orange. Training times are reported on a single GPU, except when noted otherwise for methods that necessarily require multiple GPUs to scale. The model sizes listed alongside XC methods are either as reported in the corresponding publication or on-disk sizes, which may be subject to compression; executions using different platforms/libraries may introduce variance in model sizes and affect reproducibility. The columns of the tables below can be sorted in ascending/descending order; please click on the name of a column to sort the data on that attribute.
Note 1: Deep learning methods were run on diverse machine configurations, e.g. CPU-only or CPU+GPU. The symbols *, †, and ‡ specify the machine configuration used for each method (see the legend below). AttentionXML and the X-Transformer could not be run on a single GPU; these methods were executed on a cluster with 8 GPUs and their training times were scaled accordingly before reporting.
Note 2: Results for methods marked with a ♦ symbol were taken directly from their respective publications. In some cases this was done because publicly available implementations of the method could not be scaled; in other cases, because a different version of the dataset was used in the publication. For instance, this website does not provide raw text for legacy datasets; consequently, results for deep learning methods on legacy datasets are always marked with a ♦ symbol, since those methods used raw text from alternate sources that resulted in different train-test splits.
Legend:
*: 24-core Intel Xeon 2.6GHz
†: 24-core Intel Xeon 2.6GHz with 1 Nvidia P40 GPU
‡: 24-core Intel Xeon 2.6GHz with 1 Nvidia V100 GPU
♦: Results as reported in publication
LF-AmazonTitles-131K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
30.05
21.25
16.02
30.05
31.58
34.05
19.23
26.09
32.26
19.23
23.64
26.60
1.95
0.08
Astec‡
37.12
25.20
18.24
37.12
38.17
40.16
29.22
34.64
39.49
29.22
32.73
35.03
3.24
1.83
AttentionXML‡
32.25
21.70
15.61
32.25
32.83
34.42
23.97
28.60
32.57
23.97
26.88
28.75
2.61
20.73
Bonsai*
34.11
23.06
16.63
34.11
34.81
36.57
24.75
30.35
34.86
24.75
28.32
30.47
0.24
0.10
DECAF‡
38.40
25.84
18.65
38.40
39.43
41.46
30.85
36.44
41.42
30.85
34.69
37.13
0.81
2.16
DEXA‡ 46.42 30.50 21.59 46.42 47.06 49.00 39.11 44.69 49.65 39.11 43.10 45.58 - 13.01
DiSMEC*
35.14
23.88
17.24
35.14
36.17
38.06
25.86
32.11
36.97
25.86
30.09
32.47
0.11
3.10
ECLARE‡
40.74
27.54
19.88
40.74
42.01
44.16
33.51
39.55
44.70
33.51
37.70
40.21
0.72
2.16
GalaXC‡
39.17
26.85
19.49
39.17
40.82
43.06
32.50
38.79
43.95
32.50
36.86
39.37
0.67
0.42
LightXML‡
35.60
24.15
17.45
35.60
36.33
38.17
25.67
31.66
36.44
25.67
29.43
31.68
2.25
71.40
MACH‡
33.49
22.71
16.45
33.49
34.36
36.16
24.97
30.23
34.72
24.97
28.41
30.54
2.35
3.30
NGAME‡ 46.01 30.28 21.47 46.01 46.69 48.67 38.81 44.40 49.43 38.81 42.79 45.31 1.20 12.59
Parabel*
32.60
21.80
15.61
32.60
32.96
34.47
23.27
28.21
32.14
23.27
26.36
28.21
0.34
0.03
PfastreXML*
32.56
22.25
16.05
32.56
33.62
35.26
26.81
30.61
34.24
26.81
29.02
30.67
3.02
0.26
Renee 46.05 30.81 22.04 46.05 47.46 49.68 39.08 45.12 50.48 39.08 43.56 46.24 - -
SiameseXML†
41.42
27.92
21.21
41.42
42.65
44.95
35.80
40.96
46.19
35.80
39.36
41.95
1.71
1.08
Slice+FastText*
30.43
20.50
14.84
30.43
31.07
32.76
23.08
27.74
31.89
23.08
26.11
28.13
0.39
0.08
X-Transformer‡
29.95
18.73
13.07
29.95
28.75
29.60
21.72
24.42
27.09
21.72
23.18
24.39
-
-
XR-Transformer‡
38.10
25.57
18.32
38.10
38.89
40.71
28.86
34.85
39.59
28.86
32.92
35.21
-
35.40
XT*
31.41
21.39
15.48
31.41
32.17
33.86
22.37
27.51
31.64
22.37
25.58
27.52
0.84
9.46
LF-Amazon-131K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
35.73
25.46
19.41
35.73
37.81
41.08
23.56
31.97
39.95
23.56
29.07
33.00
4.01
0.68
Astec‡
42.22
28.62
20.85
42.22
43.57
46.06
32.95
39.42
45.30
32.95
37.45
40.35
5.52
3.39
AttentionXML‡
42.90
28.96
20.97
42.90
44.07
46.44
32.92
39.51
45.24
32.92
37.49
40.33
5.04
50.17
Bonsai*
40.23
27.29
19.87
40.23
41.46
43.84
29.60
36.52
42.39
29.60
34.43
37.34
0.46
0.40
DECAF‡
42.94
28.79
21.00
42.94
44.25
46.84
34.52
41.14
47.33
34.52
39.35
42.48
1.86
1.80
DEXA‡ 47.16 31.45 22.42 47.16 48.20 50.36 38.70 45.43 50.97 38.70 43.44 46.19 - 41.41
DiSMEC*
41.68
28.32
20.58
41.68
43.22
45.69
31.61
38.96
45.07
31.61
36.97
40.05
0.45
7.12
ECLARE‡ 43.56 29.65 21.57 43.56 45.24 47.82 34.98 42.38 48.53 34.98 40.30 43.37 1,118.78 2.15
LightXML‡
41.49
28.32
20.75
41.49
42.70
45.23
30.27
37.71
44.10
30.27
35.20
38.28
2.03
56.03
MACH‡
34.52
23.39
17.00
34.52
35.53
37.51
25.27
30.71
35.42
25.27
29.02
31.33
4.57
13.91
NGAME‡ 46.53 30.89 22.02 46.53 47.44 49.58 38.53 44.95 50.45 38.53 43.07 45.81 1.20 39.99
Parabel*
39.57
26.64
19.26
39.57
40.48
42.61
28.99
35.36
40.69
28.99
33.36
35.97
0.62
0.10
PINA♦ 46.76 31.88 23.20 - - - - - - - - - - -
PfastreXML*
35.83
24.35
17.60
35.83
36.97
38.85
28.99
33.24
37.40
28.99
31.65
33.62
5.30
1.54
SiameseXML†
44.81
30.19
21.94
44.81
46.15
48.76
37.56
43.69
49.75
37.56
41.91
44.97
1.76
1.18
Renee 48.05 32.33 23.26 48.05 49.56 52.04 40.11 47.39 53.67 40.11 45.37 48.52 - -
Slice+FastText*
32.07
22.21
16.52
32.07
33.54
35.98
23.14
29.08
34.63
23.14
27.25
30.06
0.39
0.11
XR-Transformer‡
45.61
30.85
22.32
45.61
47.10
49.65
34.93
42.83
49.24
34.93
40.67
43.91
-
38.40
XT*
34.31
23.27
16.99
34.31
35.18
37.26
24.35
29.81
34.70
24.35
27.95
30.34
0.92
1.38
LF-WikiSeeAlsoTitles-320K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
16.30
11.24
8.84
16.30
16.19
17.14
7.24
9.63
11.75
7.24
9.06
10.43
4.22
0.21
Astec‡
22.72
15.12
11.43
22.72
22.16
22.87
13.69
15.81
17.50
13.69
15.56
16.75
7.30
4.17
AttentionXML‡
17.56
11.34
8.52
17.56
16.58
17.07
9.45
10.63
11.73
9.45
10.45
11.24
6.02
56.12
Bonsai*
19.31
12.71
9.55
19.31
18.74
19.32
10.69
12.44
13.79
10.69
12.29
13.29
0.37
0.37
DECAF‡
25.14
16.90
12.86
25.14
24.99
25.95
16.73
18.99
21.01
16.73
19.18
20.75
1.76
11.16
DiSMEC*
19.12
12.93
9.87
19.12
18.93
19.71
10.56
13.01
14.82
10.56
12.70
14.02
0.19
15.56
ECLARE‡
29.35
19.83
15.05
29.35
29.21
30.20
22.01
24.23
26.27
22.01
24.46
26.03
1.67
13.46
GalaXC‡
27.87
18.75
14.30
27.87
26.84
27.60
19.77
22.25
24.47
19.77
21.70
23.16
1.08
1.08
MACH‡
18.06
11.91
8.99
18.06
17.57
18.17
9.68
11.28
12.53
9.68
11.19
12.14
2.51
8.23
Parabel*
17.68
11.48
8.59
17.68
16.96
17.44
9.24
10.65
11.80
9.24
10.49
11.32
0.60
0.07
PfastreXML*
17.10
11.13
8.35
17.10
16.80
17.35
12.15
12.51
13.26
12.15
12.81
13.48
6.77
0.59
SiameseXML†
31.97
21.43
16.24
31.97
31.57
32.59
26.82
28.42
30.36
26.82
28.74
30.27
2.62
1.90
Slice+FastText*
18.55
12.62
9.68
18.55
18.29
19.07
11.24
13.45
15.20
11.24
13.03
14.23
0.94
0.20
XT*
17.04
11.31
8.60
17.04
16.61
17.24
8.99
10.52
11.82
8.99
10.33
11.26
1.90
5.28
LF-WikiSeeAlso-320K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
30.79
20.88
16.47
30.79
30.02
31.64
13.48
17.92
22.21
13.48
16.52
19.08
12.13
2.40
Astec‡
40.07
26.69
20.36
40.07
39.36
40.88
23.41
28.08
31.92
23.41
27.48
30.17
13.46
7.11
AttentionXML‡
40.50
26.43
19.87
40.50
39.13
40.26
22.67
26.66
29.83
22.67
26.13
28.38
7.12
90.37
Bonsai*
34.86
23.21
17.66
34.86
34.09
35.32
18.19
22.35
25.66
18.19
21.62
23.84
0.84
1.39
DECAF‡
41.36
28.04
21.38
41.36
41.55
43.32
25.72
30.93
34.89
25.72
30.69
33.69
4.84
13.40
DEXA‡ 47.11 30.48 22.71 47.10 46.31 47.62 31.81 35.50 38.78 31.81 38.94 78.61 - 78.61
DiSMEC*
34.59
23.58
18.26
34.59
34.43
36.11
18.95
23.92
27.90
18.95
23.04
25.76
1.28
58.79
ECLARE‡ 40.58 26.86 20.14 40.48 40.05 41.23 26.04 30.09 33.01 26.04 30.06 32.32 2.83 9.40
LightXML‡
34.50
22.31
16.83
34.50
33.21
34.24
17.85
21.26
24.16
17.85
20.81
22.80
-
249.00
MACH‡
27.18
17.38
12.89
27.18
26.09
26.80
13.11
15.28
16.93
13.11
15.17
16.48
11.41
50.22
NGAME‡ 47.65 31.56 23.68 47.65 47.50 48.99 33.83 37.79 41.03 33.83 38.36 41.01 2.51 75.39
PINA♦ 44.54 30.11 22.92 - - - - - - - - - - -
Parabel*
33.46
22.03
16.61
33.46
32.40
33.34
17.10
20.73
23.53
17.10
20.02
21.88
1.18
0.33
PfastreXML*
28.79
18.38
13.60
28.79
27.69
28.28
17.12
18.19
19.43
17.12
18.23
19.20
14.02
4.97
SiameseXML†
42.16
28.14
21.39
42.16
41.79
43.36
29.02
32.68
36.03
29.02
32.64
35.17
2.70
2.33
Renee 47.86 31.91 24.05 47.86 47.93 49.63 32.02 37.07 40.90 32.02 37.52 40.60 - -
Slice+FastText*
27.74
19.39
15.47
27.74
27.84
29.65
13.07
17.50
21.55
13.07
16.36
18.90
0.94
0.20
XR-Transformer‡
42.57
28.24
21.30
42.57
41.99
43.44
25.18
30.13
33.79
25.18
29.84
32.59
-
119.47
XT*
30.10
19.60
14.92
30.10
28.65
29.58
14.43
17.13
19.69
14.43
16.37
17.97
2.20
3.27
LF-WikiTitles-500K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
39.00
20.66
14.55
39.00
28.40
26.80
13.91
13.38
13.75
13.91
14.63
15.88
11.18
1.98
Astec‡
44.40
24.69
17.49
44.40
33.43
31.72
18.31
18.25
18.56
18.31
19.57
21.09
15.01
13.50
AttentionXML‡
40.90
21.55
15.05
40.90
29.38
27.45
14.80
13.97
13.88
14.80
15.24
16.22
14.01
133.94
Bonsai*
40.97
22.30
15.66
40.97
30.35
28.65
16.58
16.34
16.40
16.58
17.60
18.85
1.63
2.03
DECAF‡
44.21
24.64
17.36
44.21
33.55
31.92
19.29
19.82
19.96
19.29
21.26
22.95
4.53
42.26
DiSMEC*
39.42
21.10
14.85
39.42
28.87
27.29
15.88
15.54
15.89
15.88
16.76
18.13
0.68
48.27
ECLARE‡
44.36
24.29
16.91
44.36
33.33
31.46
21.58
20.39
19.84
21.58
22.39
23.61
4.24
39.34
MACH‡
37.74
19.11
13.26
37.74
26.63
24.94
13.71
12.14
12.00
13.71
13.63
14.54
4.73
22.46
Parabel*
40.41
21.98
15.42
40.41
29.89
28.15
15.55
15.32
15.35
15.55
16.50
17.66
2.70
0.42
PfastreXML*
35.71
19.27
13.64
35.71
26.45
25.15
18.23
15.42
15.08
18.23
17.34
18.24
20.41
3.79
Slice+FastText*
25.48
15.06
10.98
25.48
20.67
20.52
13.90
13.33
13.82
13.90
14.50
15.90
2.30
0.74
XT*
38.13
20.71
14.66
38.13
28.13
26.61
14.10
14.12
14.38
14.10
15.15
16.40
3.10
14.67
LF-AmazonTitles-1.3M
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
47.79
41.65
36.91
47.79
44.83
42.93
15.42
19.67
21.91
15.42
18.05
19.36
14.53
2.48
Astec‡
48.82
42.62
38.44
48.82
46.11
44.80
21.47
25.41
27.86
21.47
24.08
25.66
26.66
18.54
AttentionXML‡
45.04
39.71
36.25
45.04
42.42
41.23
15.97
19.90
22.54
15.97
18.23
19.60
28.84
380.02
Bonsai*
47.87
42.19
38.34
47.87
45.47
44.35
18.48
23.06
25.95
18.48
21.52
23.33
9.02
7.89
DECAF‡
50.67
44.49
40.35
50.67
48.05
46.85
22.07
26.54
29.30
22.07
25.06
26.85
9.62
74.47
DEXA‡ 56.63 49.05 43.90 56.60 53.81 52.37 29.12 32.69 34.86 29.12 32.02 33.86 - 103.13
ECLARE‡
50.14
44.09
40.00
50.14
47.75
46.68
23.43
27.90
30.56
23.43
26.67
28.61
9.15
70.59
GalaXC‡
49.81
44.23
40.12
49.81
47.64
46.47
25.22
29.12
31.44
25.22
27.81
29.36
2.69
9.55
MACH‡
35.68
31.22
28.35
35.68
33.42
32.27
9.32
11.65
13.26
9.32
10.79
11.65
7.68
60.39
NGAME‡ 56.75 49.19 44.09 56.75 53.84 52.41 29.18 33.01 35.36 29.18 32.07 33.91 9.71 97.75
Parabel*
46.79
41.36
37.65
46.79
44.39
43.25
16.94
21.31
24.13
16.94
19.70
21.34
11.75
1.50
PfastreXML*
37.08
33.77
31.43
37.08
36.61
36.61
28.71
30.98
32.51
28.71
29.92
30.73
29.59
9.66
SiameseXML†
49.02
42.72
38.52
49.02
46.38
45.15
27.12
30.43
32.52
27.12
29.41
30.90
14.58
9.89
Renee 56.04 49.91 45.32 56.04 54.21 53.15 28.54 33.38 36.14 28.54 32.15 34.18 - -
Slice*
34.80
30.58
27.71
34.80
32.72
31.69
13.96
17.08
19.14
13.96
15.83
16.97
5.98
0.79
XT*
40.60
35.74
32.01
40.60
38.18
36.68
13.67
17.11
19.06
13.67
15.64
16.65
7.90
82.18
XR-Transformer‡
50.14
44.07
39.98
50.14
47.71
46.59
20.06
24.85
27.79
20.06
23.44
25.41
-
132.00
WikiSeeAlsoTitles-350K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
14.96
10.20
8.11
14.96
14.20
14.76
5.63
7.04
8.59
5.63
6.79
7.76
3.59
0.20
Astec†
20.61
14.58
11.49
20.61
20.08
20.80
9.91
12.16
14.04
9.91
11.76
12.98
7.41
4.36
AttentionXML†
15.86
10.43
8.01
15.86
14.59
14.86
6.39
7.20
8.15
6.39
7.05
7.64
4.07
30.44
Bonsai*
17.95
12.27
9.56
17.95
17.13
17.66
8.16
9.68
11.07
8.16
9.49
10.43
0.25
0.46
DiSMEC*
16.61
11.57
9.14
16.61
16.09
16.72
7.48
9.19
10.74
7.48
8.95
9.99
0.09
6.62
InceptionXML♦ 21.87 15.48 12.20 - - - 11.13 13.31 15.20 - - - - -
MACH†
14.79
9.57
7.13
14.79
13.83
14.05
6.45
7.02
7.54
6.45
7.20
7.73
5.22
7.44
Parabel*
17.24
11.61
8.92
17.24
16.31
16.67
7.56
8.83
9.96
7.56
8.68
9.45
0.43
0.06
PfastreXML*
15.09
10.49
8.24
15.09
14.98
15.59
9.03
9.69
10.64
9.03
9.82
10.52
5.22
0.51
SLICE+FastText*
18.13
12.87
10.29
18.13
17.71
18.52
8.63
10.78
12.74
8.63
10.37
11.63
0.97
0.22
XML-CNN
17.75
12.34
9.73
17.75
16.93
17.48
8.24
9.72
11.15
8.24
9.40
10.31
0.78
14.25
XT*
16.55
11.37
8.93
16.55
15.88
16.47
7.38
8.75
10.05
7.38
8.57
9.46
2.00
3.25
WikiTitles-500K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
39.56
20.50
14.32
39.56
28.28
26.54
15.44
13.83
13.79
15.44
15.49
16.58
10.70
1.77
Astec†
46.60
26.03
18.50
46.60
35.10
33.34
18.89
18.90
19.30
18.89
20.33
22.00
15.15
13.04
AttentionXML†
42.89
22.71
15.89
42.89
30.92
28.93
15.12
14.32
14.22
15.12
15.69
16.75
9.21
102.43
Bonsai*
42.60
23.08
16.25
42.60
31.34
29.58
17.38
16.85
16.90
17.38
18.28
19.62
1.18
2.94
DiSMEC*
39.89
21.23
14.96
39.89
28.97
27.32
15.89
15.15
15.43
15.89
16.52
17.86
0.35
23.94
InceptionXML♦ 48.35 27.63 19.74 - - - 20.86 21.02 21.23 - - - - -
MACH†
33.74
15.62
10.41
33.74
22.61
20.80
11.43
8.98
8.35
11.43
10.77
11.28
10.48
23.65
Parabel*
42.50
23.04
16.21
42.50
31.24
29.45
16.55
16.12
16.16
16.55
17.49
18.77
2.15
0.34
PfastreXML*
30.99
18.07
13.09
30.99
24.54
23.88
17.87
15.40
15.15
17.87
17.38
18.46
16.85
3.07
SLICE+FastText*
28.07
16.78
12.28
28.07
22.97
22.87
15.10
14.69
15.33
15.10
16.02
17.67
1.50
0.54
XML-CNN†
43.45
23.24
16.53
43.45
31.69
29.95
15.64
14.74
14.98
15.64
16.17
17.45
1.17
55.21
XT*
39.44
21.57
15.31
39.44
29.17
27.65
15.23
15.00
15.25
15.23
16.23
17.59
3.30
12.13
AmazonTitles-670K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
35.31
30.90
27.83
35.31
32.76
31.26
17.94
20.69
23.30
17.94
19.57
20.88
2.99
0.17
Astec†
40.63
36.22
33.00
40.63
38.45
37.09
28.07
30.17
32.07
28.07
29.20
29.98
10.93
3.85
AttentionXML†
37.92
33.73
30.57
37.92
35.78
34.35
24.24
26.43
28.39
24.24
25.48
26.33
12.11
37.50
Bonsai*
38.46
33.91
30.53
38.46
36.05
34.48
23.62
26.19
28.41
23.62
25.16
26.21
0.66
0.53
DiSMEC*
38.12
34.03
31.15
38.12
36.07
34.88
22.26
25.46
28.67
22.26
24.30
26.00
0.29
11.74
InceptionXML♦ 42.45 38.04 34.68 - - - 28.70 31.48 33.83 - - - - -
LightXML‡ 43.10 38.70 35.50 - - - - - - - - - - -
MACH†
34.92
31.18
28.56
34.92
33.07
31.97
20.56
23.14
25.79
20.56
22.18
23.53
3.84
6.41
Parabel*
38.00
33.54
30.10
38.00
35.62
33.98
23.10
25.57
27.61
23.10
24.55
25.48
1.06
0.09
PfastreXML*
32.88
30.54
28.80
32.88
32.20
31.85
26.61
27.79
29.22
26.61
27.10
27.59
5.32
0.99
Renee 45.20 40.24 36.61 45.20 42.77 41.27 28.98 32.66 35.83 28.98 31.38 33.07 - -
SLICE+FastText*
33.85
30.07
26.97
33.85
31.97
30.56
21.91
24.15
25.81
21.91
23.26
24.03
2.01
0.22
XML-CNN†
35.02
31.37
28.45
35.02
33.24
31.94
21.99
24.93
26.84
21.99
23.83
24.67
1.36
23.52
XR-Transformers‡ 41.94 37.44 34.19 41.89 39.67 38.32 25.34 28.86 32.14 25.34 27.58 29.30 - -
XT*
36.57
32.73
29.79
36.57
34.64
33.35
22.11
24.81
27.18
22.11
23.73
24.87
4.00
4.65
AmazonTitles-3M
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
48.37
44.68
42.24
48.37
45.93
44.43
11.47
13.84
15.72
11.47
13.02
14.15
10.23
1.68
Astec†
48.74
45.70
43.31
48.74
46.96
45.67
16.10
18.89
20.94
16.10
18.00
19.33
40.60
13.04
AttentionXML†
46.00
42.81
40.59
46.00
43.94
42.61
12.81
15.03
16.71
12.80
14.23
15.25
44.40
273.10
Bonsai*
46.89
44.38
42.30
46.89
45.46
44.35
13.78
16.66
18.75
13.78
15.75
17.10
9.53
9.90
MACH†
37.10
33.57
31.33
37.10
34.67
33.17
7.51
8.61
9.46
7.51
8.23
8.76
9.77
40.48
Parabel*
46.42
43.81
41.71
46.42
44.86
43.70
12.94
15.58
17.55
12.94
14.70
15.94
13.20
1.54
PfastreXML*
31.16
31.35
31.10
31.16
31.78
32.08
22.37
24.59
26.16
22.37
23.72
24.65
22.97
10.47
Renee 51.81 48.84 46.54 51.81 50.08 48.86 14.49 17.43 19.66 14.49 16.50 17.95 - -
SLICE+FastText*
35.39
33.33
31.74
35.39
34.12
33.21
11.32
13.37
14.94
11.32
12.65
13.61
12.22
0.64
XR-Transformer‡ 50.50 47.41 45.00 50.50 48.79 47.57 15.81 19.03 21.34 15.81 18.14 19.75 - -
XT*
27.99
25.24
23.57
27.99
25.98
24.78
4.45
5.06
5.57
4.45
4.78
5.03
16.00
15.80
LF-Wikipedia-500K / Wikipedia-500K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
64.64
43.20
32.77
64.64
54.54
52.42
26.88
30.24
32.79
26.88
30.71
33.33
48.32
15.50
APLC-XLNet♦
72.83
50.50
38.55
72.83
62.06
59.27
30.03
35.25
38.27
30.03
35.01
37.86
1.40
-
Astec†
73.02
52.02
40.53
73.02
64.10
62.32
30.69
36.48
40.38
30.69
36.33
39.84
28.06
20.35
AttentionXML†
82.73
63.75
50.41
82.73
76.56
74.86
34.00
44.32
50.15
34.00
42.99
47.69
9.30
110.60
Bonsai*
69.20
49.80
38.80
-
-
-
-
-
-
-
-
-
-
-
CascadeXML♦ 81.13 62.43 49.12 - - - 32.12 43.15 49.37 - -
DEXA‡ 84.92 65.50 50.51 84.90 79.18 76.80 42.59 53.93 58.33 42.59 52.92 57.44 57.51
DiSMEC*
70.20
50.60
39.70
70.20
42.10
40.50
31.20
33.40
37.00
31.20
33.70
37.10
-
-
ECLARE‡ 68.04 46.44 35.74 68.04 58.15 56.37 31.02 35.39 38.29 31.02 35.66 34.50 7.40 86.57
LightXML‡
81.59
61.78
47.64
81.59
74.73
72.23
31.99
42.00
46.53
31.99
40.99
45.18
-
185.56
MACH‡ 52.78 32.39 23.75 52.78 42.05 39.70 17.65 18.06 18.66 17.64 19.18 45.18 4.50 31.20
MatchXML♦ 80.66 60.43 47.09 80.66 73.28 71.20 35.87 43.12 47.50 35.87 43.00 47.18 61 11.10
NGAME‡ 84.01 64.69 49.97 84.01 78.25 75.97 41.25 52.57 57.04 41.25 51.58 56.11 3.88 54.88
PINA♦ 82.83 63.14 50.11 - - - - - - - - - - -
Parabel*
68.70
49.57
38.64
68.70
60.51
58.62
26.88
31.96
35.26
26.88
31.73
34.61
5.65
2.72
PfastreXML*
59.50
40.20
30.70
59.50
30.10
28.70
29.20
27.60
27.70
29.20
28.70
28.30
-
63.59
ProXML*
68.80
48.90
37.90
68.80
39.10
38.00
33.10
35.00
39.40
33.10
35.20
39.00
-
-
SiameseXML†
67.26
44.82
33.73
67.26
56.64
54.29
33.95
35.46
37.07
33.95
36.58
38.93
5.73
7.31
Renee 84.95 66.25 51.68 84.95 79.79 77.83 39.89 51.77 56.70 39.89 50.73 55.57 - -
X-Transformer♦
76.95
58.42
46.14
-
-
-
-
-
-
-
-
-
-
-
XML-CNN♦
59.85
39.28
29.81
59.85
48.67
46.12
-
-
-
-
-
-
-
117.23
XR-Transformer‡
81.62
61.38
47.85
81.62
74.46
72.43
33.58
42.97
47.81
33.58
42.21
46.61
-
318.90
XT*
64.48
45.84
35.46
-
-
-
-
-
-
-
-
-
5.50
20.88
Amazon-670K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
42.39
36.89
32.98
42.39
39.07
37.04
21.56
24.78
27.66
21.56
23.38
24.76
50.00
1.56
APLC-XLNet♦
43.46
38.83
35.32
43.46
41.01
39.38
26.12
29.66
32.78
26.12
28.20
29.68
1.1
-
Astec†
47.77
42.79
39.10
47.77
45.28
43.74
32.13
35.14
37.82
32.13
33.80
35.01
18.79
7.32
AttentionXML♦
47.58
42.61
38.92
47.58
45.07
43.50
30.29
33.85
37.13
-
-
-
16.56
78.30
Bonsai*
45.58
40.39
36.60
45.58
42.79
41.05
27.08
30.79
34.11
-
-
-
-
-
CascadeXML♦ 52.15 46.54 42.44 - - - 30.77 35.78 40.52 - - - - -
DiSMEC*
44.70
39.70
36.10
44.70
42.10
40.50
27.80
30.60
34.20
27.80
28.80
30.70
3.75
56.02
FastXML*
36.99
33.28
30.53
36.99
35.11
33.86
19.37
23.26
26.85
19.37
22.25
24.69
-
-
LEML*
8.13
6.83
6.03
8.13
7.30
6.85
2.07
2.26
2.47
2.07
2.21
2.35
-
-
LPSR-NB*
28.65
24.88
22.37
28.65
26.40
25.03
16.68
18.07
19.43
16.68
17.70
18.63
-
-
LightXML♦
49.10
43.83
39.85
-
-
-
-
-
-
-
-
-
4.59
86.25
MatchXML♦ 51.64 46.17 42.05 51.64 48.81 47.04 30.30 35.28 39.78 30.30 33.46 35.87 18 3.30
PPD-Sparse*
45.32
40.37
36.92
-
-
-
26.64
30.65
34.65
-
-
-
-
-
Parabel*
44.89
39.80
36.00
44.89
42.14
40.36
25.43
29.43
32.85
25.43
28.38
30.71
2.41
0.41
PfastreXML*
39.46
35.81
33.05
39.46
37.78
36.69
29.30
30.80
32.43
29.30
30.40
31.49
-
-
ProXML*
43.50
38.70
35.30
43.50
41.10
39.70
30.80
32.80
35.10
30.80
31.70
32.70
-
-
Renee 54.23 48.22 43.83 54.23 51.23 49.41 34.16 39.14 43.39 34.16 37.48 39.83 - -
SLEEC*
35.05
31.25
28.56
34.77
32.74
31.53
20.62
23.32
25.98
20.62
22.63
24.43
-
-
SLICE+FastText*
33.15
29.76
26.93
33.15
31.51
30.27
20.20
22.69
24.70
20.20
21.71
22.72
2.01
0.21
XML-CNN♦
35.39
31.93
29.32
35.39
33.74
32.64
28.67
33.27
36.51
-
-
-
-
52.23
XR-Transformers‡ 50.13 44.60 40.69 50.13 47.28 45.60 29.90 34.35 38.63 29.90 32.75 35.03 - -
XT*
42.50
37.87
34.41
42.50
40.01
38.43
24.82
28.20
31.24
24.82
26.82
28.29
4.20
8.22
Amazon-3M
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
49.30
45.55
43.11
49.30
46.79
45.27
11.69
14.07
15.98
-
-
-
-
-
AttentionXML†
50.86
48.04
45.83
50.86
49.16
47.94
15.52
18.45
20.60
-
-
-
-
-
Bonsai*
48.45
45.65
43.49
48.45
46.78
45.59
13.79
16.71
18.87
-
-
-
-
-
CascadeXML♦ 53.91 51.24 49.52 - - - - - - - - - - -
DiSMEC*
47.34
44.96
42.80
47.36
-
-
-
-
-
-
-
-
-
-
FastXML*
44.24
40.83
38.59
44.24
41.92
40.47
9.77
11.69
13.25
9.77
11.20
12.29
-
-
MatchXML♦ 55.88 52.39 49.80 55.88 53.90 52.58 17.00 20.55 23.16 17.00 19.56 21.38 113 8.30
Parabel*
47.48
44.65
42.53
47.48
45.73
44.53
12.82
15.61
17.73
12.82
14.89
16.38
-
-
PfastreXML*
43.83
41.81
40.09
43.83
42.68
41.75
21.38
23.22
24.52
21.38
22.75
23.68
-
-
Renee 54.84 52.08 49.77 54.84 53.31 52.13 15.74 19.06 21.54 15.74 18.02 19.64 - -
XR-Transformers‡ 53.67 50.29 47.74 53.67 51.74 50.42 16.54 19.94 22.39 16.54 18.99 20.71 - -
AmazonCat-13K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
93.54
78.37
63.30
93.54
87.29
85.10
49.04
61.13
69.64
49.04
58.83
65.47
18.61
3.45
APLC-XLNet♦
94.56
79.82
64.61
94.56
88.74
86.66
52.22
65.08
71.40
52.22
62.57
67.92
0.50
-
AttentionXML♦
95.92
82.41
67.31
95.92
91.17
89.48
53.76
68.72
76.38
-
-
-
-
-
Bonsai*
92.98
79.13
64.46
92.98
87.68
85.92
51.30
64.60
72.48
-
-
-
0.55
1.26
CascadeXML♦ 96.71 84.07 68.69 - - - 51.39 66.81 77.58 - - - - -
DiSMEC*
93.40
79.10
64.10
93.40
87.70
85.80
59.10
67.10
71.20
59.10
65.20
68.80
-
-
FastXML*
93.11
78.20
63.41
93.11
87.07
85.16
48.31
60.26
69.30
48.31
56.90
62.75
-
-
LightXML♦
96.77
84.02
68.70
-
-
-
-
-
-
-
-
-
-
-
MatchXML♦ 96.83 83.83 68.20 96.83 92.59 90.62 48.02 64.26 75.65 48.02 60.85 69.30 2.2 6.60
PD-Sparse*
90.60
75.14
60.69
90.60
84.00
82.05
49.58
61.63
68.23
49.58
58.28
62.68
-
-
Parabel*
93.03
79.16
64.52
93.03
87.72
86.00
50.93
64.00
72.08
50.93
60.37
65.68
0.62
0.63
PfastreXML*
91.75
77.97
63.68
91.75
86.48
84.96
69.52
73.22
75.48
69.52
72.21
73.67
19.02
5.69
SLEEC*
90.53
76.33
61.52
90.53
84.96
82.77
46.75
58.46
65.96
46.75
55.19
60.08
-
-
XML-CNN♦
93.26
77.06
61.40
93.26
86.20
83.43
52.42
62.83
67.10
-
-
-
-
-
XT*
92.59
78.24
63.58
92.59
86.90
85.03
49.61
62.22
70.24
49.61
59.71
66.04
0.46
7.14
XTransformer♦
96.70
83.85
68.58
-
-
-
-
-
-
-
-
-
-
-
[01] K. Bhatia, H. Jain, P. Kar, M. Varma, and P. Jain, Sparse Local Embeddings for Extreme Multi-label Classification, in NeurIPS 2015.
[02] R. Agrawal, A. Gupta, Y. Prabhu and M. Varma, Multi-Label Learning with Millions of Labels: Recommending Advertiser Bid Phrases for Web Pages, in WWW 2013.
[03] Y. Prabhu and M. Varma, FastXML: A Fast, Accurate and Stable Tree-classifier for eXtreme Multi-label Learning, in KDD 2014.
[04] J. Weston, A. Makadia, and H. Yee, Label Partitioning For Sublinear Ranking, in ICML 2013.
[05] H. Yu, P. Jain, P. Kar, and I. Dhillon, Large-scale Multi-label Learning with Missing Labels, in ICML 2014.
[06] D. Hsu, S. Kakade, J. Langford, and T. Zhang, Multi-Label Prediction via Compressed Sensing, in NeurIPS 2009.
[07] F. Tai, and H. Lin, Multi-label Classification with Principal Label Space Transformation, in Neural Computation, 2012.
[08] W. Bi, and J. Kwok, Efficient Multi-label Classification with Many Labels, in ICML 2013.
[09] Y. Chen, and H. Lin, Feature-aware Label Space Dimension Reduction for Multi-label Classification, in NeurIPS 2012.
[10] C. Ferng, and H. Lin, Multi-label Classification with Error-correcting Codes, in ACML 2011.
[11] J. Weston, S. Bengio, and N. Usunier, WSABIE: Scaling Up To Large Vocabulary Image Annotation, in IJCAI 2011.
[12] S. Ji, L. Tang, S. Yu, and J. Ye, Extracting Shared Subspaces for Multi-label Classification, in KDD 2008.
[13] Z. Lin, G. Ding, M. Hu, and J. Wang, Multi-label Classification via Feature-aware Implicit Label Space Encoding, in ICML 2014.
[14] P. Mineiro, and N. Karampatziakis, Fast Label Embeddings via Randomized Linear Algebra, Preprint, 2015.
[15] N. Karampatziakis, and P. Mineiro, Scalable Multilabel Prediction via Randomized Methods, Preprint, 2015.
[16] K. Balasubramanian, and G. Lebanon, The Landmark Selection Method for Multiple Output Prediction, Preprint, 2012.
[17] M. Cisse, N. Usunier, T. Artieres, and P. Gallinari, Robust Bloom Filters for Large Multilabel Classification Tasks, in NIPS 2013.
[18] B. Hariharan, S. Vishwanathan, and M. Varma, Efficient max-margin multi-label classification with applications to zero-shot learning, in Machine Learning Journal, 2012.
[19] C. Snoek, M. Worring, J. van Gemert, J.-M. Geusebroek, and A. Smeulders, The challenge problem for automated detection of 101 semantic concepts in multimedia, in ACM Multimedia 2006.
[20] I. Katakis, G. Tsoumakas, and I. Vlahavas, Multilabel text classification for automated tag suggestion, in ECML/PKDD Discovery Challenge 2008.
[21] G. Tsoumakas, I. Katakis, and I. Vlahavas, Effective and efficient multilabel classification in domains with large number of labels, in ECML/PKDD 2008 Workshop on Mining Multidimensional Data, 2008.
[22] J. Leskovec and A. Krevl, SNAP Datasets: Stanford large network dataset collection, 2014.
[23] A. Zubiaga, Enhancing navigation on wikipedia with social tags, Preprint, 2009.
[24] R. Wetzker, C. Zimmermann, and C. Bauckhage, Analyzing social bookmarking systems: A del.icio.us cookbook, in Mining Social Data (MSoDa) Workshop Proceedings, ECAI 2008.
[25] I. Partalas, A. Kosmopoulos, N. Baskiotis, T. Artieres, G. Paliouras, E. Gaussier, I. Androutsopoulos, M.-R. Amini and P. Galinari, LSHTC: A Benchmark for Large-Scale Text Classification, Preprint, 2015.
[26] D. D. Lewis, Y. Yang, T. Rose, and F. Li, RCV1: A New Benchmark Collection for Text Categorization Research, in JMLR 2004.
[27] E. L. Mencia, and J. Furnkranz, Efficient pairwise multilabel classification for large-scale problems in the legal domain, in ECML/PKDD 2008.
[28] J. McAuley, and J. Leskovec, Hidden factors and hidden topics: understanding rating dimensions with review text, in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys), 2013.
[29] J. McAuley, C. Targett, Q. Shi, and A. v. d. Hengel, Image-based Recommendations on Styles and Substitutes, in International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015.
[30] J. McAuley, R. Pandey, and J. Leskovec, Inferring networks of substitutable and complementary products, in KDD 2015.
[31] H. Jain, Y. Prabhu, and M. Varma, Extreme Multi-label Loss Functions for Recommendation, Tagging, Ranking & Other Missing Label Applications, in KDD 2016.
[32] R. Babbar, and B. Schölkopf, DiSMEC - Distributed Sparse Machines for Extreme Multi-label Classification, in WSDM 2017.
[33] I. E. H. Yen, X. Huang, K. Zhong, P. Ravikumar and I. S. Dhillon, PD-Sparse: A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification, in ICML 2016.
[34] I. E. H. Yen, X. Huang, W. Dai, P. Ravikumar, I. S. Dhillon and E.-P. Xing, PPDSparse: A Parallel Primal-Dual Sparse Method for Extreme Classification, in KDD 2017.
[35] K. Jasinska, K. Dembczynski, R. Busa-Fekete, K. Pfannschmidt, T. Klerx and E. Hullermeier, Extreme F-Measure Maximization using Sparse Probability Estimates, in ICML 2017.
[36] J. Liu, W-C. Chang, Y. Wu and Y. Yang, Deep Learning for Extreme Multi-label Text Classification, in SIGIR 2017.
[37] Y. Jernite, A. Choromanska, and D. Sontag, Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation, in ICML 2017.
[38] Y. Prabhu, A. Kag, S. Harsola, R. Agrawal and M. Varma, Parabel: Partitioned Label Trees for Extreme Classification with Application to Dynamic Search Advertising, in WWW 2018.
[39] I. Evron, E. Moroshko and K. Crammer, Efficient Loss-Based Decoding on Graphs for Extreme Classification, in NeurIPS 2018.
[40] A. Niculescu-Mizil and E. Abbasnejad, Label Filters for Large Scale Multilabel Classification, in AISTATS 2017.
[41] H. Jain, V. Balasubramanian, B. Chunduri and M. Varma, Slice: Scalable linear extreme classifiers trained on 100 million labels for related searches, in WSDM 2019.
[42] A. Jalan and P. Kar, Accelerating Extreme Classification via Adaptive Feature Agglomeration, in IJCAI 2019.
[43] R. Babbar, and B. Schölkopf, Data Scarcity, Robustness and Extreme Multi-label Classification, in Machine Learning Journal and European Conference on Machine Learning, 2019.
[44] S. Khandagale, H. Xiao and R. Babbar, Bonsai - Diverse and Shallow Trees for Extreme Multi-label Classification, in ArXiv 2019.
[45] W. Siblini, F. Meyer and P. Kuntz, CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning, in ICML 2018.
[46] V. Gupta, R. Wadbude, N. Natarajan, H. Karnick, P. Jain and P. Rai, Distributional Semantics meets Multi-Label Learning, in AAAI 2019.
[47] I. Chalkidis, E. Fergadiotis, P. Malakasiotis, N. Aletras and I. Androutsopoulos, Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation, in Natural Legal Language Processing Workshop 2019.
[47b] I. Chalkidis, E. Fergadiotis, P. Malakasiotis, N. Aletras and I. Androutsopoulos, EURLEX57K Dataset.
[48] I. Chalkidis, E. Fergadiotis, P. Malakasiotis, and I. Androutsopoulos, Large-Scale Multi-Label Text Classification on EU Legislation, in ACL 2019.
[49] G. Tsoumakas, E. Spyromitros-Xioufis, J. Vilcek and I. Vlahavas, Mulan: A Java Library for Multi-Label Learning, in JMLR 2011.
[50] A. Jalan and P. Kar, Accelerating Extreme Classification via Adaptive Feature Agglomeration, in IJCAI 2019.
[51] R. You, S. Dai, Z. Zhang, H. Mamitsuka, and S. Zhu, AttentionXML: Extreme Multi-Label Text Classification with Multi-Label Attention Based Recurrent Neural Network, in NeurIPS 2019.
[52] T. K. R. Medini, Q. Huang, Y. Wang, V. Mohan, and A. Shrivastava, Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products, in NeurIPS 2019.
[53] W-C. Chang, H.-F. Yu, K. Zhong, Y. Yang, and I. Dhillon, Taming Pretrained Transformers for Extreme Multi-label Text Classification, in KDD 2020.
[54] T. Jiang, D. Wang, L. Sun, H. Yang, Z. Zhao, and F. Zhuang, LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification, in AAAI 2021.
[55] K. Dahiya, D. Saini, A. Mittal, A. Shaw, K. Dave, A. Soni, H. Jain, S. Agarwal and M. Varma, DeepXML: A deep extreme multi-label learning framework applied to short text documents, in WSDM 2021.
[56] A. Mittal, K. Dahiya, S. Agrawal, D. Saini, S. Agarwal, P. Kar and M. Varma, DECAF: Deep extreme classification with label features, in WSDM 2021.
[57] A. Mittal, N. Sachdeva, S. Agrawal, S. Agarwal, P. Kar and M. Varma, ECLARE: Extreme classification with label graph correlations, in TheWebConf 2021.
[58] D. Saini, A. K. Jain, K. Dave, J. Jiao, A. Singh, R. Zhang and M. Varma, GalaXC: Graph neural networks with labelwise attention for extreme classification, in TheWebConf 2021.
[59] H. Ye, Z. Chen, D.-H. Wang, and B.-D. Davison, Pretrained Generalized Autoregressive Model with Adaptive Probabilistic Label Clusters for Extreme Multi-label Text Classification, in ICML 2020.
[60] Y. Prabhu, A. Kag, S. Gopinath, K. Dahiya, S. Harsola, R. Agrawal and M. Varma, Extreme multi-label learning with label features for warm-start tagging, ranking and recommendation, in WSDM 2018.
[61] M. Qaraei, E. Schultheis, P. Gupta, and R. Babbar, Convex Surrogates for Unbiased Loss Functions in Extreme Classification With Missing Labels, in TheWebConf 2021.
[62] N. Gupta, S. Bohra, Y. Prabhu, S. Purohit and M. Varma, Generalized zero-shot extreme multi-label learning, in KDD 2021.
[63] K. Dahiya, A. Agarwal, D. Saini, K. Gururaj, J. Jiao, A. Singh, S. Agarwal, P. Kar and M. Varma, SiameseXML: Siamese networks meet extreme classifiers with 100M labels, in ICML 2021.
[64] J. Ni, J. Li and J. McAuley, Justifying recommendations using distantly-labeled reviews and fine-grained aspects, in EMNLP 2019.
[65] A. Mittal, K. Dahiya, S. Malani, J. Ramaswamy, S. Kuruvilla, J. Ajmera, K-h. Chang, S. Agarwal, P. Kar and M. Varma, Multimodal extreme classification, in CVPR 2022.
[66] K. Dahiya, N. Gupta, D. Saini, A. Soni, Y. Wang, K. Dave, J. Jiao, G. K, P. Dey, A. Singh, D. Hada, V. Jain, B. Paliwal, A. Mittal, S. Mehta, R. Ramjee, S. Agarwal, P. Kar and M. Varma, NGAME: Negative Mining-aware Mini-batching for Extreme Classification, in ArXiv 2022.
[67] E. Schultheis and R. Babbar, Speeding-up One-vs-All Training for Extreme Classification via Smart Initialization, in ECML-MLJ 2022.
[68] E. Chien, J. Zhang, C.-J. Hsieh, J.-Y. Jiang, W.-C. Chang, O. Milenkovic and H.-F. Yu, PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation, in ICML 2023.
[69] V. Jain, J. Prakash, D. Saini, J. Jiao, R. Ramjee and M. Varma, Renee: End-to-end training of extreme classification models, in MLSys 2023.
[70] K. Dahiya, S. Yadav, S. Sondhi, D. Saini, S. Mehta, J. Jiao, S. Agarwal, P. Kar and M. Varma, Deep encoders with auxiliary parameters for extreme classification, in KDD 2023.
[71] S. Kharbanda, A. Banerjee, R. Schultheis and R. Babbar, CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-Label Classification, in NeurIPS 2022.
[72] S. Kharbanda, A. Banerjee, D. Gupta, A. Palrecha, and R. Babbar, InceptionXML: A Lightweight Framework with Synchronized Negative Sampling for Short Text Extreme Classification, in SIGIR 2023.
[73] H. Ye, R. Sunderraman, and S. Ji, MatchXML: An Efficient Text-Label Matching Framework for Extreme Multi-Label Text Classification, in TKDE 2024.
Mediamill
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
87.82
73.45
59.17
87.82
81.50
79.22
70.14
72.76
74.02
70.14
72.31
73.13
-
-
CPLST*
83.82
67.32
52.80
83.82
75.29
71.92
66.23
65.28
63.70
66.23
65.89
64.77
-
-
CS*
78.95
60.93
44.27
78.95
68.97
62.88
62.53
58.97
53.23
62.53
60.33
56.50
-
-
DiSMEC*
81.86
62.52
45.11
81.86
70.21
63.71
62.23
59.85
54.03
62.25
61.05
57.26
-
-
FastXML*
83.57
65.78
49.97
83.57
74.06
69.34
66.06
63.83
61.11
66.06
64.83
62.94
-
-
LEML*
81.29
64.74
49.83
81.29
72.92
69.37
64.24
62.73
59.92
64.24
63.47
61.57
-
-
LPSR*
83.57
65.50
48.57
83.57
73.84
68.18
66.06
63.53
59.38
66.06
64.63
61.84
-
-
ML-CSSP
83.98
67.37
53.02
83.98
75.31
72.21
66.88
65.90
64.90
66.88
66.47
65.71
-
-
PD-Sparse*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
PPD-Sparse*
86.50
68.40
53.20
86.50
77.30
75.60
64.30
61.30
60.80
64.30
63.60
62.80
-
-
Parabel*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
PfastreXML*
84.22
67.33
53.04
84.22
75.41
72.37
66.67
65.43
64.30
66.08
66.08
65.24
-
-
SLEEC*
84.01
67.20
52.80
84.01
75.23
71.96
66.34
65.11
63.62
66.34
65.79
64.71
-
-
WSABIE
83.35
66.18
51.46
83.35
74.21
70.55
65.79
64.07
61.89
65.79
64.88
63.36
-
-
kNN*
83.91
67.12
52.99
83.91
75.22
72.21
66.51
65.21
64.30
66.51
65.91
65.20
-
-
Bibtex
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
1-vs-All
62.62
39.09
28.79
62.62
59.13
61.58
48.84
52.96
59.29
48.84
51.62
55.09
-
-
CPLST*
62.38
37.84
27.62
62.38
57.63
59.71
48.17
50.86
56.42
48.17
49.94
52.96
-
-
CS*
58.87
33.53
23.72
58.87
52.19
53.25
46.04
45.08
48.17
46.04
45.25
46.89
-
-
DiSMEC*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
FastXML*
63.42
39.23
28.86
63.42
59.51
61.70
48.54
52.30
58.28
48.54
51.11
54.38
-
-
LEML*
62.54
38.41
28.21
62.54
58.22
60.53
47.97
51.42
57.53
47.97
50.25
53.59
-
-
LPSR*
62.11
36.65
26.53
62.11
56.50
58.23
49.20
50.14
55.01
49.20
49.78
52.41
-
-
ML-CSSP
44.98
30.43
23.53
44.98
44.67
47.97
32.38
38.68
45.96
32.38
36.73
40.74
-
-
PD-Sparse*
61.29
35.82
25.74
61.29
55.83
57.35
48.34
48.77
52.93
48.34
48.49
50.72
-
-
PPD-Sparse*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
Parabel*
64.53
38.56
27.94
64.53
59.35
61.06
50.88
52.42
57.36
50.88
51.90
54.58
-
-
PfastreXML*
63.46
39.22
29.14
63.46
59.61
62.12
52.28
54.36
60.55
52.28
53.62
56.99
-
-
ProXML*
64.60
39.00
28.20
64.40
59.20
61.50
50.10
52.00
58.30
50.10
52.00
55.10
-
-
SLEEC*
65.08
39.64
28.87
65.08
60.47
62.64
51.12
53.95
59.56
51.12
52.99
56.04
-
-
WSABIE
54.78
32.39
23.98
54.78
50.11
52.39
43.39
44.00
49.30
43.39
43.64
46.50
-
-
kNN*
57.04
34.38
25.44
57.04
52.29
54.64
43.71
45.82
51.64
43.71
45.04
48.20
-
-
Delicious
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
CPLST*
65.31
59.95
55.31
65.31
61.16
57.80
31.10
32.40
33.02
31.10
32.07
32.55
-
-
CS*
61.36
56.46
52.07
61.36
57.66
54.44
30.60
31.84
32.26
30.60
31.54
31.89
-
-
DiSMEC*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
FastXML*
69.61
64.12
59.27
69.61
65.47
61.90
32.35
34.51
35.43
32.35
34.00
34.73
-
-
LEML*
65.67
60.55
56.08
65.67
61.77
58.47
30.73
32.43
33.26
30.73
32.01
32.66
-
-
LPSR*
65.01
58.96
53.49
65.01
60.45
56.38
31.34
32.57
32.77
31.34
32.29
32.50
-
-
ML-CSSP
63.04
56.26
50.16
63.04
57.91
53.36
29.48
30.27
30.02
29.48
30.10
29.98
-
-
PD-Sparse*
51.82
44.18
38.95
51.82
46.00
42.02
25.22
24.63
23.85
25.22
24.80
24.25
-
-
Parabel*
67.44
61.83
56.75
67.44
63.15
59.41
32.69
34.00
34.53
32.69
33.69
34.10
-
-
PfastreXML*
67.13
62.33
58.62
67.13
63.48
60.74
34.57
34.80
35.86
34.57
34.71
35.42
-
-
SLEEC*
67.59
61.38
56.56
67.59
62.87
59.28
32.11
33.21
33.83
32.11
32.93
33.41
-
-
WSABIE
64.13
58.13
53.64
64.13
59.59
56.25
31.25
32.02
32.47
31.25
31.84
32.18
-
-
kNN*
64.95
58.89
54.11
64.95
60.32
56.77
31.03
32.02
32.43
31.03
31.76
32.09
-
-
EURLex-4K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
79.26
64.30
52.33
79.26
68.13
61.60
34.25
39.83
42.76
34.25
38.35
40.30
0.09
0.06
APLC-XLNet♦
87.72
74.56
62.28
87.72
77.90
71.75
42.93
49.84
53.07
42.93
48.00
50.40
0.48
-
Bonsai*
82.96
69.76
58.31
82.96
73.15
67.41
37.08
45.13
49.57
37.08
42.94
46.10
0.02
0.03
CPLST*
58.52
45.51
32.47
58.52
48.67
40.79
24.97
27.46
25.04
24.97
26.82
25.57
-
-
CS*
62.09
48.39
40.11
62.09
51.63
47.11
24.94
27.19
28.90
25.94
26.56
27.67
-
-
DiSMEC*
82.40
68.50
57.70
82.40
72.50
66.70
41.20
45.40
49.30
41.20
44.30
46.90
-
-
FastXML*
76.37
63.36
52.03
76.37
66.63
60.61
33.17
39.68
41.99
33.17
37.92
39.55
0.26
0.07
LEML*
68.55
55.11
45.12
68.55
58.44
53.03
31.16
34.85
36.82
31.16
33.85
35.17
-
-
LPSR*
79.89
66.01
53.80
79.89
69.62
63.04
37.97
44.01
46.17
37.97
42.44
43.97
-
-
MatchXML♦ 88.85 76.02 63.30 88.85 79.50 73.26 46.73 54.23 58.19 46.73 52.33 55.29 0.6 0.20
ML-CSSP*
75.45
62.70
52.51
75.45
65.97
60.78
43.86
45.72
46.97
43.86
45.23
46.03
-
-
PD-Sparse*
83.83
70.72
59.21
-
-
-
37.61
46.05
50.79
-
-
-
-
-
PPD-Sparse*
83.40
70.90
59.10
83.40
74.40
68.20
45.20
48.50
51.00
45.20
47.50
49.10
-
-
Parabel*
82.25
68.71
57.53
82.25
72.17
66.54
36.44
44.08
48.46
36.44
41.99
44.91
0.03
0.02
PfastreXML*
71.36
59.90
50.39
71.36
62.87
58.06
26.62
34.16
38.96
26.62
32.07
35.23
-
-
SLEEC*
63.40
50.35
41.28
63.40
53.56
48.47
24.10
27.20
29.09
24.10
26.37
27.62
-
-
WSABIE*
72.28
58.16
47.73
72.28
61.64
55.92
28.60
32.49
34.46
28.60
31.45
32.77
-
-
XT*
78.97
65.64
54.44
78.97
69.05
63.23
33.52
40.35
44.02
33.52
38.50
41.09
0.03
0.10
kNN*
81.73
68.78
57.44
81.73
72.15
66.40
36.36
44.04
48.29
36.36
41.95
44.78
-
-
Wiki10-31K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
86.49
74.27
64.20
86.49
77.13
69.44
11.90
12.76
13.58
11.90
12.53
13.10
0.62
0.39
APLC-XLNet♦
89.44
78.93
69.73
89.44
81.38
74.41
14.84
15.85
17.04
14.84
15.58
16.40
0.54
-
CascadeXML♦ 89.18 79.71 71.19 - - - 13.32 15.35 17.45 - - - - -
AttentionXML♦
87.47
78.48
69.37
87.47
80.61
73.79
15.57
16.80
17.82
-
-
-
-
-
Bonsai*
84.69
73.69
64.39
84.69
76.25
69.17
11.78
13.27
14.28
11.78
12.89
13.61
0.13
0.64
DiSMEC*
85.20
74.60
65.90
84.10
77.10
70.40
13.60
13.10
13.80
13.60
13.20
13.60
-
-
FastXML*
83.03
67.47
57.76
84.31
75.35
63.36
9.80
10.17
10.54
9.80
10.08
10.33
-
-
LEML*
73.47
62.43
54.35
73.47
64.92
58.69
9.41
10.07
10.55
9.41
9.90
10.24
-
-
LPSR-NB*
72.72
58.51
49.50
72.72
61.71
54.63
12.79
12.26
12.13
12.79
12.38
12.27
-
-
LightXML♦
89.45
78.96
69.85
-
-
-
-
-
-
-
-
-
-
-
MatchXML♦ 89.74 81.51 72.18 89.74 83.46 76.53 16.92 19.29 20.93 16.92 18.70 19.91 2.9 0.22
Parabel*
84.17
72.46
63.37
84.17
75.22
68.22
11.68
12.73
13.69
11.68
12.47
13.14
0.18
0.20
PfastreXML*
83.57
68.61
59.10
83.57
72.00
64.54
19.02
18.34
18.43
19.02
18.49
18.52
-
-
SLEEC*
85.88
72.98
62.70
85.88
76.02
68.13
11.14
11.86
12.40
11.14
11.68
12.06
1.13
0.21
XML-CNN♦
81.42
66.23
56.11
81.42
69.78
61.83
9.39
10.00
10.20
-
-
-
-
-
XT*
86.15
75.18
65.41
86.15
77.76
70.35
11.87
13.08
13.89
11.87
12.78
13.36
0.37
0.39
XTransformer♦
88.51
78.71
69.62
-
-
-
-
-
-
-
-
-
-
-
Delicious-200K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
46.79
40.72
37.67
46.79
42.17
39.84
7.18
8.05
8.74
7.18
7.78
8.22
10.74
2.58
Bonsai*
46.69
39.88
36.38
46.69
41.51
38.84
7.26
7.97
8.53
7.26
7.75
8.10
3.91
64.42
DiSMEC*
45.50
38.70
35.50
45.50
40.90
37.80
6.50
7.60
8.40
6.50
7.50
7.90
-
-
FastXML*
43.07
38.66
36.19
43.07
39.70
37.83
6.48
7.52
8.31
6.51
7.26
7.79
-
LEML*
40.73
37.71
35.84
40.73
38.44
37.01
6.06
7.24
8.10
6.06
6.93
7.52
-
-
LPSR-NB
18.59
15.43
14.07
18.59
16.17
15.13
3.24
3.42
3.64
3.24
3.37
3.52
-
-
PD-Sparse*
34.37
29.48
27.04
34.37
30.60
28.65
5.29
5.80
6.24
5.29
5.66
5.96
-
-
PPD-Sparse*
-
-
-
-
-
-
-
-
-
-
-
-
-
-
Parabel*
46.86
40.08
36.70
46.86
41.69
39.10
7.22
7.94
8.54
7.22
7.71
8.09
6.36
9.58
Parabel*
46.97
40.08
36.63
46.97
41.72
39.07
7.25
7.94
8.52
7.25
7.75
8.15
-
-
PfastreXML*
41.72
37.83
35.58
41.72
38.76
37.08
3.15
3.87
4.43
3.15
3.68
4.06
15.34
3.60
SLEEC*
47.85
42.21
39.43
47.85
43.52
41.37
7.17
8.16
8.96
7.17
7.89
8.44
-
-
XT*
45.59
39.10
35.92
45.59
40.62
38.17
6.96
7.71
8.33
6.96
7.47
7.86
2.70
31.22
WikiLSHTC-325K
Method
P@1
P@3
P@5
N@1
N@3
N@5
PSP@1
PSP@3
PSP@5
PSN@1
PSN@3
PSN@5
Model size (GB)
Train time (hr)
AnnexML*
63.30
40.64
29.80
63.30
56.61
56.24
25.13
30.46
34.30
25.13
31.16
34.36
29.70
4.24
Bonsai*
66.41
44.40
32.92
66.41
60.69
60.53
28.11
35.36
39.73
28.11
35.42
38.94
2.43
3.04
DiSMEC*
64.40
42.50
31.50
64.40
58.50
58.40
29.10
35.60
39.50
29.10
35.90
39.40
-
-
FastXML*
49.75
33.10
24.45
49.75
45.23
44.75
16.35
20.99
23.56
16.35
19.56
21.02
-
-
LEML*
19.82
11.43
8.39
19.82
14.52
13.73
3.48
3.79
4.27
3.48
3.68
3.94
-
-
LPSR-NB
27.44
16.23
11.77
27.44
23.04
22.55
6.93
7.21
7.86
6.93
7.11
7.46
-
-
PD-Sparse*
61.26
39.48
28.79
61.26
55.08
54.67
28.34
33.50
36.62
28.34
31.92
33.68
-
-
PPD-Sparse*
64.08
41.26
30.12
-
-
-
27.47
33.00
36.29
-
-
-
-
-
Parabel*
65.04
43.23
32.05
65.04
59.15
58.93
26.76
33.27
37.36
26.76
31.26
33.57
3.10
0.75
PfastreXML*
56.05
36.79
27.09
56.05
50.59
50.13
30.66
31.55
33.12
30.66
31.24
32.09
14.23
6.34
ProXML*
63.60
41.50
30.80
63.80
57.40
57.10
34.80
37.70
41.00
34.80
38.70
41.50
-
-
SLEEC*
54.83
33.42
23.85
54.83
47.25
46.16
20.27
23.18
25.08
20.27
22.27
23.35
-
-
XT*
56.54
37.17
27.73
56.54
50.48
50.36
20.56
25.42
28.90
20.56
25.30
27.90
4.50
1.89