Domain Adaptation of Conditional Probability Models Via Feature Subsetting

Satpal, Sandeepkumar; Sarawagi, Sunita (2007). Domain Adaptation of Conditional Probability Models Via Feature Subsetting. In: Knowledge Discovery in Databases: PKDD 2007.

Official URL: http://doi.org/10.1007/978-3-540-74976-9_23

Related URL: http://dx.doi.org/10.1007/978-3-540-74976-9_23

Abstract

The goal in domain adaptation is to train a model using labeled data sampled from a domain different from the target domain on which the model will be deployed. We exploit unlabeled data from the target domain to train a model that maximizes likelihood over the training sample while minimizing the distance between the training and target distributions. Our focus is on conditional probability models used for predicting a label structure y given input x, based on features defined jointly over x and y. We propose practical measures of divergence between the two domains, based on which we penalize features with large divergence while improving the effectiveness of other, less deviant correlated features. Empirical evaluation on several real-life information extraction tasks using Conditional Random Fields (CRFs) shows that our method of domain adaptation leads to a significant reduction in error.
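
The abstract does not state the exact objective or divergence measure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It swaps the CRF for a binary logistic regression, assumes the divergence of a feature is the absolute gap between its mean value in the labeled source sample and in the unlabeled target sample, and assumes a penalty of the form lam * sum_k d[k] * w[k]**2. The names `feature_divergence`, `fit_penalized_logreg`, `lam`, `lr`, and `steps` are hypothetical.

```python
import numpy as np


def feature_divergence(X_src, X_tgt):
    """Assumed divergence: absolute gap between per-feature means in the
    labeled source sample and the unlabeled target sample."""
    return np.abs(X_src.mean(axis=0) - X_tgt.mean(axis=0))


def fit_penalized_logreg(X, y, d, lam=1.0, lr=0.1, steps=2000):
    """Gradient ascent on mean log-likelihood minus lam * sum_k d[k] * w[k]**2."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))  # P(y = 1 | x) under the current weights
        grad = X.T @ (y - p) / len(y) - 2.0 * lam * d * w
        w += lr * grad
    return w


# Toy data: column 0 is a bias term, column 1 is a binary feature.
X_src = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_src = np.array([0.0, 1.0, 0.0, 1.0])
X_tgt = np.array([[1.0, 1.0], [1.0, 1.0]])  # unlabeled target inputs

d = feature_divergence(X_src, X_tgt)  # [0.0, 0.5]
w_adapted = fit_penalized_logreg(X_src, y_src, d)
w_plain = fit_penalized_logreg(X_src, y_src, np.zeros_like(d))
```

In this toy setting the second feature behaves differently in the two domains, so its weight in `w_adapted` is held closer to zero than in `w_plain`, while the bias term, which has zero divergence, is left unpenalized.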

Item Type: Conference or Workshop Item (Paper)
Source: Copyright of this article belongs to Springer Nature Switzerland AG
Keywords: Target Domain; Domain Adaptation; Label Data; Unlabeled Data; Conditional Random Field
ID Code: 128389
Deposited On: 20 Oct 2022 04:20
Last Modified: 14 Nov 2022 11:10