Density-based multiscale data condensation

Mitra, P.; Murthy, C. A.; Pal, S. K. (2002) Density-based multiscale data condensation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24 (6). pp. 743-747. ISSN 0162-8828

Full text not available from this repository.

Official URL: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arn...

Related URL: http://dx.doi.org/10.1109/TPAMI.2002.1008381

Abstract

A problem gaining interest in pattern recognition applied to data mining is that of selecting a small representative subset from a very large data set. In this article, a nonparametric data reduction scheme is suggested. It attempts to represent the density underlying the data. The algorithm selects representative points in a multiscale fashion, which distinguishes it from existing density-based approaches. The accuracy of representation by the condensed set is measured in terms of the error between the density estimates of the original and reduced sets. Experimental studies on several real-life data sets show that the multiscale approach is superior to several related condensation methods in terms of both condensation ratio and estimation error. The condensed set obtained was also shown experimentally to be effective for some important data mining tasks, such as classification, clustering, and rule generation on large data sets. Moreover, it is found empirically that the algorithm is efficient in terms of sample complexity.
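
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one way a density-based, multiscale selection could work, not the authors' implementation. It assumes the local scale of a point is its distance to the k-th nearest neighbour, that the densest remaining point is picked first, and that points within twice that distance are then discarded. The function name `condense`, the parameter `k`, and the pruning factor 2 are illustrative assumptions.

```python
import numpy as np

def condense(X, k=5):
    """Greedy k-NN-radius condensation (illustrative sketch, assumed procedure).

    Returns the indices of the selected points and the k-NN radius
    recorded for each selected point at the time it was chosen.
    """
    X = np.asarray(X, dtype=float)
    remaining = np.arange(len(X))
    selected, radii = [], []
    while len(remaining) > 0:
        pts = X[remaining]
        # Pairwise Euclidean distances among the points still in play.
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        # Distance to the k-th nearest neighbour; after sorting, column 0
        # is the point itself, so column k is the k-th neighbour.
        kk = min(k, len(remaining) - 1)
        r = np.sort(d, axis=1)[:, kk]
        i = int(np.argmin(r))  # smallest radius = highest local density
        selected.append(int(remaining[i]))
        radii.append(float(r[i]))
        # Discard the chosen point and everything within twice its radius
        # (the factor 2 is an assumption of this sketch).
        keep = d[i] > 2.0 * r[i]
        keep[i] = False
        remaining = remaining[keep]
    return np.array(selected), np.array(radii)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 2))
    idx, rad = condense(data, k=5)
    print(len(idx), "of", len(data), "points kept")
```

Because the radius adapts to local density, dense regions are pruned with small discs and sparse regions with large ones, which is one reading of "multiscale" here. The sketch recomputes all pairwise distances in every round, so it is suitable only for small inputs.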

Item Type: Article
Source: Copyright of this article belongs to IEEE.
ID Code: 77689
Deposited On: 14 Jan 2012 06:05
Last Modified: 14 Jan 2012 06:05
