Document Classification Through Interactive Supervision of Document and Term Labels

Godbole, Shantanu; Harpale, Abhay; Sarawagi, Sunita; Chakrabarti, Soumen (2004). Document Classification Through Interactive Supervision of Document and Term Labels. Knowledge Discovery in Databases: PKDD 2004, 3202, pp. 185-196. ISSN 0302-9743

Full text not available from this repository.

Official URL: http://doi.org/10.1007/978-3-540-30116-5_19

Related URL: http://dx.doi.org/10.1007/978-3-540-30116-5_19

Abstract

Effective incorporation of human expertise, while exerting a low cognitive load, is a critical aspect of real-life text classification applications that is not adequately addressed by batch-supervised high-accuracy learners. Standard text classifiers are supervised in only one way: assigning labels to whole documents. They are thus deprived of the enormous wisdom that humans carry about the significance of words and phrases in context. We present HIClass, an interactive and exploratory labeling package that actively collects user opinion on feature representations and choices, as well as whole-document labels, while minimizing redundancy in the input sought. Preliminary experience suggests that, starting with essentially an unlabeled corpus, very little cognitive labor suffices to set up a labeled collection on which standard classifiers perform well.
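
Illustrative sketch (not taken from the paper): the abstract describes collecting both term-level and document-level opinions while avoiding redundant queries. One common way to realize that idea, assumed here purely for illustration, is uncertainty sampling over a linear additive term-weight model: ask the user about the document the model is least sure of, and let a term-level opinion adjust that term's weight directly. The function names, the toy corpus, and the weight-update rule below are all hypothetical and are not HIClass's actual interface or algorithm.

```python
from collections import Counter


def score(doc_tokens, weights):
    """Linear additive score: sum over terms of (term weight * term count)."""
    counts = Counter(doc_tokens)
    return sum(weights.get(term, 0.0) * n for term, n in counts.items())


def most_uncertain(unlabeled, weights):
    """Index of the document whose score is closest to zero (least certain)."""
    return min(range(len(unlabeled)),
               key=lambda i: abs(score(unlabeled[i], weights)))


def apply_term_label(weights, term, polarity, step=1.0):
    """Return new weights after a user marks `term` as indicating the
    positive (+1) or negative (-1) class. The additive update is an
    assumption made for this sketch, not the paper's rule."""
    updated = dict(weights)  # copy so the caller's weights are not mutated
    updated[term] = updated.get(term, 0.0) + polarity * step
    return updated


# Toy corpus and hand-set starting weights (hypothetical data).
docs = [
    "goal match striker goal".split(),
    "election vote parliament".split(),
    "match vote".split(),
]
weights = {"goal": 1.0, "match": 0.5, "vote": -0.5, "election": -1.0}

# Document-level query: which document should the user label next?
query_index = most_uncertain(docs, weights)

# Term-level supervision: the user says "vote" signals the negative class.
weights_after = apply_term_label(weights, "vote", -1)
```

In this toy setup the third document scores exactly zero under the starting weights, so it is the one selected for a document label; a single term-level opinion on "vote" is then enough to move its score away from zero without labeling any document.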

Item Type: Article
Source: Copyright of this article belongs to Springer Nature Switzerland AG
Keywords: Support Vector Machine; Active Learning; Cognitive Load; Label Document; Linear Additive Model
ID Code: 130954
Deposited On: 01 Dec 2022 10:36
Last Modified: 01 Dec 2022 10:36
