Privacy-preserving Class Ratio Estimation

Iyer, Arun Shankar ; Nath, J. Saketha ; Sarawagi, Sunita (2016) Privacy-preserving Class Ratio Estimation. In: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

Official URL: http://doi.org/10.1145/2939672.2939806

Related URL: http://dx.doi.org/10.1145/2939672.2939806

Abstract

In this paper we present learning models for the class ratio estimation problem, which takes as input an unlabeled set of instances and predicts the proportions of instances in the set belonging to the different classes. This problem has applications in social and commercial data analysis. Existing models for class ratio estimation, however, require instance-level supervision, whereas in domains like politics and demography, set-level supervision is more common. We present a new method for directly estimating class ratios using set-level supervision. Another serious limitation in applying these techniques to sensitive domains like health is data privacy. We propose a novel label privacy-preserving mechanism that is well suited to supervised class ratio estimation and has guarantees for achieving efficient differential privacy, provided the per-class counts are large enough. We derive learning bounds for the estimation with and without privacy constraints, which lead to important insights for the data publisher. Extensive empirical evaluation shows that our model is more accurate than existing methods and that the proposed privacy mechanism and learning model are well suited to each other.

Item Type:	Conference or Workshop Item (Paper)
Source:	Copyright of this article belongs to ACM, Inc
ID Code:	128345
Deposited On:	19 Oct 2022 09:47
Last Modified:	15 Nov 2022 09:10
