The objective in multi-label classification is to tag a data point with a set of relevant labels. Given a set of L labels, a data point can be tagged with any of the 2L possible subsets. The main challenge therefore lies in optimising over this exponentially large label space subject to label correlations.
The following is efficient code for learning a max-margin multi-label classifier. The code implements the M3L algorithm and the problem formulation, optimisation details and run times on benchmark datasets can be found in the paper. The code contains specialised algorithms for the kernelised and the linear cases. The code is very efficient at learning 1-vs-All SVMs as it re-utilises the kernel cache. We were able to train on the RCV1 dataset (23K training data points, 47K dimensional sparse features and 103 labels) using an RBF kernel in 817 seconds while training in a 1-vs-All fashion using LIBSVM took 5600 seconds. For the linear case, we were able to train on RCV1 with nearly a million training points in 18 minutes using a single core of an 2.66 GHz Xeon processor (we had more or less reached the global optimum within 6 minutes). The algorithm learns label correlations present in the training dataset but prior knowledge about how labels are correlated can also be incorporated.
The code is in C++ and should compile on 32/64 bit Windows/Linux machines. Please set "#define WINDOWS 1" in "include/definitions.h" when compiling on Windows machines (see the included README.txt). This code is made available as is for non-commercial research purposes.
Training: The command
M3L -train [options] training_file model_file reads in training data from training_file and saves the learnt M3L classifier in model_file subject to the following options
Testing: The command
M3L -test testing_file model_file output_file
applies the learnt M3L classifier in model_file to the test data in testing_file and saves the predictions in output_file.
The training, testing, model and output prediction files all follow the LIBSVM file format.
A toy example has been included with the source code and
binaries. Unpack the code and try
> cd toyexample
> ../bin/OS/M3L -train -k 1 example_data.svm your_example_model_kernel_withoutR.txt
where OS can be one of linux, linux64, windows or windowsx64. This trains the M3L classifier with an RBF kernel on the toy data. No a priori information about label correlations is provided. Compare your learnt model to the one provided to make sure that there were no errors.
To incorporate the correlation matrix given in example_R.txt
> ../bin/OS/M3L -train -k 1 -R example_R.txt example_data.svm your_example_model_kernel_withR.txt
Results for the linear kernel can be generated by
> ../bin/OS/M3L -train -a 1 example_data.svm your_example_model_linear_withoutR.txt
Outputs on the test file example_test.svm can be generated by
> ../bin/OS/M3L -test example_test.svm your_example_model_linear_withoutR.txt your_example_output_linear_withoutR.txt
The Hamming Loss that should be obtained is as follows:
Hamming Loss for label 0 : 14.000002%
Hamming Loss for label 1 : 1.000000%
Net Hamming Loss: 7.500000%
Please go through the included README and the reference below for details.