Maxdiff kd-trees for data condensation

Narayan, B. L.; Murthy, C. A.; Pal, Sankar K. (2006) Maxdiff kd-trees for data condensation. Pattern Recognition Letters, 27 (3). pp. 187-200. ISSN 0167-8655

Official URL: http://linkinghub.elsevier.com/retrieve/pii/S01678...

Related URL: http://dx.doi.org/10.1016/j.patrec.2005.08.015

Abstract

Prototype selection based on conventional clustering algorithms yields a good representation but is extremely time-consuming on large data sets. kd-trees, on the other hand, are exceptionally efficient in time and space for large data sets, but fail to produce a reasonable representation in certain situations. We propose a new algorithm, with speed comparable to existing kd-tree based algorithms, that overcomes these representation problems at high condensation ratios. It uses the Maxdiff criterion to separate distant clusters in the initial stages, before splitting them any further, thus improving the representation. Because the splits are axis-parallel, more nodes are required to represent a data set that has no regions where the points are well separated.
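The idea of separating distant clusters first can be illustrated with a small sketch. The abstract does not define the Maxdiff criterion, so the code below assumes one plausible reading: along each axis, sort the coordinates and cut at the widest gap between consecutive values, always splitting the leaf with the widest gap first. The function names (`maxdiff_split`, `condense`), the greedy leaf selection, and the use of leaf centroids as prototypes are all assumptions made for illustration, not the authors' published procedure.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def maxdiff_split(points: List[Point]) -> Tuple[int, float, float]:
    """Return (axis, cut, gap) for the widest gap between consecutive
    sorted coordinate values over all axes (assumed Maxdiff reading).
    Points with coordinate <= cut go left, the rest go right."""
    best = (0, 0.0, -1.0)
    for axis in range(len(points[0])):
        values = sorted(p[axis] for p in points)
        for lo, hi in zip(values, values[1:]):
            gap = hi - lo
            if gap > best[2]:
                best = (axis, lo, gap)
    return best


def centroid(points: List[Point]) -> Tuple[float, ...]:
    """Mean of the points, coordinate by coordinate."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def condense(points: List[Point], max_leaves: int) -> List[Tuple[float, ...]]:
    """Greedily apply axis-parallel splits until max_leaves leaves exist,
    then return one prototype (the centroid) per leaf."""
    leaves = [list(points)]
    while len(leaves) < max_leaves:
        # Consider only leaves that can still be divided.
        candidates = [
            (maxdiff_split(leaf), i)
            for i, leaf in enumerate(leaves)
            if len(leaf) > 1
        ]
        if not candidates:
            break
        (axis, cut, gap), idx = max(candidates, key=lambda c: c[0][2])
        if gap <= 0:
            break  # all remaining points coincide; nothing left to separate
        leaf = leaves.pop(idx)
        leaves.append([p for p in leaf if p[axis] <= cut])
        leaves.append([p for p in leaf if p[axis] > cut])
    return [centroid(leaf) for leaf in leaves]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for prototype in condense(data, max_leaves=2):
        print(prototype)
```

On two well-separated groups of points, the first cut in this sketch falls in the empty region between them, so each prototype summarizes a single group. When the data has no such gaps, every axis-parallel cut passes through a dense region, which is consistent with the abstract's remark that more nodes are then needed.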

Item Type: Article
Source: Copyright of this article belongs to International Association for Pattern Recognition.
Keywords: Prototype Selection; Data Representation; Multiresolution kd-trees
ID Code: 26121
Deposited On: 06 Dec 2010 13:04
Last Modified: 17 May 2016 09:27
