A linear time algorithm for approximate 2-means clustering

Sabharwal, Yogish ; Sen, Sandeep (2005) A linear time algorithm for approximate 2-means clustering Computational Geometry, 32 (2). pp. 159-172. ISSN 0925-7721

Full text not available from this repository.

Official URL: http://www.sciencedirect.com/science/article/pii/S...

Related URL: http://dx.doi.org/10.1016/j.comgeo.2005.01.003

Abstract

Matousek [Discrete Comput. Geom. 24 (1) (2000) 61-84] designed an O(nlogn) deterministic algorithm for the approximate 2-means clustering problem for points in fixed dimensional Euclidean space which had left open the possibility of a linear time algorithm. In this paper, we present a simple randomized algorithm to determine an approximate 2-means clustering of a given set of points in fixed dimensional Euclidean space, with constant probability, in linear time. We first approximate the mean of the larger cluster using random sampling. We then show that the problem can be reduced to a set of lines, on which it can be solved by carefully pruning away points of the larger cluster and randomly sampling on the remaining points to obtain an approximate to the mean of the smaller cluster.

Item Type:Article
Source:Copyright of this article belongs to Elsevier Science.
Keywords:k-means; Randomization
ID Code:53439
Deposited On:08 Aug 2011 12:13
Last Modified:08 Aug 2011 12:13

Repository Staff Only: item control page