Maximum Mean Discrepancy for Class Ratio Estimation: Convergence Bounds and Kernel Selection

Iyer, Arun; Nath, Saketh; Sarawagi, Sunita (2014) Maximum Mean Discrepancy for Class Ratio Estimation: Convergence Bounds and Kernel Selection. In: International Conference on Machine Learning (ICML).


Abstract

Many real-world applications require estimates of class ratios in an unlabeled instance collection, as opposed to labels of individual instances in the collection. In this paper we investigate the use of maximum mean discrepancy (MMD) in a reproducing kernel Hilbert space (RKHS) for estimating such ratios. First, we theoretically analyze the MMD-based estimates. Our analysis establishes that, under some mild conditions, the estimate is statistically consistent. More importantly, it provides an upper bound on the error of the estimate in terms of intuitive geometric quantities such as class separation and data spread. Next, we use the insights from the theoretical analysis to propose a novel convex formulation that automatically learns the kernel to be employed in the MMD-based estimation. We design an efficient cutting-plane algorithm for solving this formulation. Finally, we empirically compare our estimator with several existing methods and show significantly improved performance across varying datasets, class ratios, and training sizes.
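The core idea behind MMD-based class ratio estimation can be sketched as follows: choose mixing proportions so that the weighted combination of the per-class kernel mean embeddings is as close as possible (in RKHS norm) to the mean embedding of the unlabeled sample. The sketch below is an illustrative two-class instance of this matching idea with a fixed RBF kernel and a hypothetical `estimate_class_ratio` function; it does not implement the paper's kernel-learning formulation or its cutting-plane algorithm, and all names and the bandwidth parameter are assumptions. For two classes the minimizer has a closed form, so no solver is needed.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Pairwise RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2).
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def estimate_class_ratio(X_pos, X_neg, X_unlab, gamma=0.5):
    """Estimate the positive-class proportion theta in X_unlab by minimizing
    the squared MMD between theta*mu_pos + (1-theta)*mu_neg and the mean
    embedding mu_U of the unlabeled sample (illustrative sketch; fixed kernel).

    Setting the derivative of ||theta*mu_pos + (1-theta)*mu_neg - mu_U||^2
    to zero gives theta = <mu_pos - mu_neg, mu_U - mu_neg> / ||mu_pos - mu_neg||^2,
    where every inner product is an average of kernel evaluations.
    """
    Kpp = rbf_kernel(X_pos, X_pos, gamma).mean()
    Knn = rbf_kernel(X_neg, X_neg, gamma).mean()
    Kpn = rbf_kernel(X_pos, X_neg, gamma).mean()
    Kpu = rbf_kernel(X_pos, X_unlab, gamma).mean()
    Knu = rbf_kernel(X_neg, X_unlab, gamma).mean()
    denom = Kpp - 2 * Kpn + Knn          # ||mu_pos - mu_neg||^2: class separation
    theta = (Kpu - Knu - Kpn + Knn) / denom
    return float(np.clip(theta, 0.0, 1.0))  # project back onto the simplex

# Synthetic check: two well-separated Gaussian classes mixed at ratio 0.3.
rng = np.random.default_rng(0)
X_pos = rng.normal(3.0, 1.0, size=(200, 2))
X_neg = rng.normal(-3.0, 1.0, size=(200, 2))
X_unlab = np.vstack([rng.normal(3.0, 1.0, size=(60, 2)),
                     rng.normal(-3.0, 1.0, size=(140, 2))])
theta = estimate_class_ratio(X_pos, X_neg, X_unlab)
```

Note how the denominator is exactly the squared RKHS distance between the two class means, i.e. the class separation that the paper's error bound is stated in terms of: the smaller the separation under the chosen kernel, the less stable the estimate, which is what motivates learning the kernel.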

Item Type: Conference or Workshop Item (Paper)
Source: Copyright of this article belongs to ResearchGate GmbH
ID Code: 128353
Deposited On: 19 Oct 2022 10:10
Last Modified: 14 Nov 2022 10:43
