Prabhu, Yashoteja ; Kag, Anil ; Harsola, Shrutendra ; Agrawal, Rahul ; Varma, Manik (2018) Parabel: Partitioned Label Trees for Extreme Classification with Application to Dynamic Search Advertising In: WWW '18: Proceedings of the 2018 World Wide Web Conference, April 2018, Geneva Switzerland.
Full text not available from this repository.
Official URL: http://doi.org/10.1145/3178876.3185998
Related URL: http://dx.doi.org/10.1145/3178876.3185998
Abstract
This paper develops the Parabel algorithm for extreme multi-label learning where the objective is to learn classifiers that can annotate each data point with the most relevant subset of labels from an extremely large label set. The state-of-the-art 1-vs-All based DiSMEC and PPDSparse algorithms are the most accurate but can take upto months for training and prediction as they learn and apply an independent linear classifier per label. Consequently, they do not scale to large datasets with millions of labels. Parabel addresses both limitations by learning a balanced label hierarchy such that: (a) the 1-vs-All classifiers in the leaf nodes of the label hierarchy can be trained on a small subset of the training set thereby reducing the training time to a few hours on a single core of a standard desktop and (b) novel points can be classified by traversing the learned hierarchy in logarithmic time and applying the 1-vs-All classifiers present in just the leaf thereby reducing the prediction time to a few milliseconds per test point. This allows Parabel to scale to tasks considered infeasible for DiSMEC and PPDSparse such as predicting the subset of 7 million Bing queries that might lead to a click on a given ad-landing page for dynamic search advertising. Experiments on multiple benchmark datasets revealed that Parabel could be almost as accurate as PPDSparse and DiSMEC while being upto 1,000x faster at training and upto 40x-10,000x faster at prediction. Furthermore, Parabel was demonstrated to significantly improve dynamic search advertising on Bing by more than doubling the ad recall and improving the click-through rate by 20%. Source code for Parabel can be downloaded from [1].
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Source: | Copyright of this article belongs to Association for Computing Machinery. |
ID Code: | 119555 |
Deposited On: | 14 Jun 2021 09:43 |
Last Modified: | 14 Jun 2021 09:43 |
Repository Staff Only: item control page