Sign in
Author

Conference

Journal

Organization

Year

DOI
Look for results that meet for the following criteria:
since
equal to
before
between
and
Search in all fields of study
Limit my searches in the following fields of study
Agriculture Science
Arts & Humanities
Biology
Chemistry
Computer Science
Economics & Business
Engineering
Environmental Sciences
Geosciences
Material Science
Mathematics
Medicine
Physics
Social Science
Multidisciplinary
Keywords
(3)
Number of Clusters
Triangle Inequality
K Means
Related Publications
(13)
An Efficient kMeans Clustering Algorithm: Analysis and Implementation
Redesigning distance functions and distancebased applications for high dimensional data
Making TimeSeries Classification More Accurate Using Learned Constraints
Reducing the computational requirements of the minimumdistance classifier
Accelerating exact k means algorithms with geometric reasoning
Subscribe
Academic
Publications
Using the Triangle Inequality to Accelerate kMeans
Using the Triangle Inequality to Accelerate kMeans,Charles Elkan
Edit
Using the Triangle Inequality to Accelerate kMeans
(
Citations: 129
)
BibTex

RIS

RefWorks
Download
Charles Elkan
The means algorithm is by far the most widely used method for discovering clusters in data. We show how to accelerate it dramatically, while still always computing exactly the same result as the standard algorithm. The accelerated al gorithm avoids unnecessary distance calculations by applying the
triangle inequality
in two differ ent ways, and by keeping track of lower and up per bounds for distances between points and cen ters. Experiments show that the new algorithm is effective for datasets with up to 1000 dimen sions, and becomes more and more effective as the
number of clusters
increases. For it is many times faster than the best previously known accelerated means method.
Conference:
International Conference on Machine Learning  ICML
, pp. 147153, 2003
Cumulative
Annual
View Publication
The following links allow you to view full publications. These links are maintained by other sources not affiliated with Microsoft Academic Search.
(
www.aaai.org
)
(
www.hpl.hp.com
)
(
www.aaai.org
)
(
www.informatik.unitrier.de
)
(
users.cecs.anu.edu.au
)
More »
Citation Context
(89)
...An accelerated algorithmis proposedby using the triangle inequality [
3
], but requires O(k 2 ) extra storage, rendering it impractical for a large number of clusters...
Jing Wang
,
et al.
Fast Approximate kMeans via Cluster Closures
...For hosts whose (un)availability is truly random, we use two standard clustering methods, in particular kmeans [
6
] and hierarchical clustering, to cluster hosts by their distribution.Wetestedseveraldistancemetricsforclustering,eachof which measures the distance between two Cumulative Distribution Function (CDFs) [5]...
Bahman Javadi
,
et al.
Discovering Statistical Models of Availability in Large Distributed Sy...
...The partitioning approaches attempt to form clusters using similarity criterion based on the distance [2,7], [14,
8
,11]...
Salima Sabri
,
et al.
Evaluation of a clustering technique based on game theory
... variation algorithms of kmeans: The EM algorithm maintains probabilistic assignments to clusters, instead of the deterministic assignments in kmeans; Kmeans++ seeks to choose better starting clusters a way to avoid the sometimes poor clusters found by the standard kmeans algorithm; the filtering algorithm speeds up each kmeans step by using kdtrees [5], while some other methods adopt coresets [6] or the triangle inequality [
7
]; and ...
HaiGuang Li
,
et al.
KMeans Clustering with Bagging and MapReduce
...Other works focus on the convergence speed of the algorithm based on some geometric properties such as the trigonometric inequality [
9
], or the estimation of the number of clusters [10]...
Abbas Bradai
,
et al.
An efficient algorithm for selection and management of Island multicas...
References
(31)
Scalability for clustering algorithms revisited
(
Citations: 89
)
Fredrik Farnstrom
,
James Lewis
,
Charles Elkan
Journal:
Sigkdd Explorations
, vol. 2, no. 1, pp. 5157, 2000
Fast VQ encoding by an efficient kickout condition
(
Citations: 31
)
Kuangshyr Wu
,
Jachen Lin
Journal:
IEEE Transactions on Circuits and Systems for Video Technology  TCSV
, vol. 10, no. 1, pp. 5962, 2000
LargeScale Parallel Data Clustering
(
Citations: 67
)
Dan Judd
,
Philip K. Mckinley
,
Anil K. Jain
Journal:
IEEE Transactions on Pattern Analysis and Machine Intelligence  PAMI
, vol. 20, no. 8, pp. 871876, 1998
Accelerating exact k means algorithms with geometric reasoning
(
Citations: 130
)
Dan Pelleg
,
Andrew W. Moore
Conference:
Knowledge Discovery and Data Mining  KDD
, pp. 277281, 1999
Performance Guarantees for Hierarchical Clustering
(
Citations: 36
)
Sanjoy Dasgupta
Conference:
Computational Learning Theory  COLT
, pp. 351363, 2002
Sort by:
Citations
(129)
Fast Approximate kMeans via Cluster Closures
Jing Wang
,
Jingdong Wang
,
Qifa Ke
,
Gang Zeng
,
Shipeng Li
Published in 2012.
Discovering Statistical Models of Availability in Large Distributed Systems: An Empirical Study of SETI@home
(
Citations: 3
)
Bahman Javadi
,
Derrick Kondo
,
JeanMarc Vincent
,
David P. Anderson
Journal:
IEEE Transactions on Parallel and Distributed Systems  TPDS
, vol. 22, no. 11, pp. 18961903, 2011
Improving the Performance of KMeans for Color Quantization
(
Citations: 1
)
M. Emre Celebi
Journal:
Image and Vision Computing  IVC
, vol. abs/1101.0, no. 4, pp. 260271, 2011
Evaluation of a clustering technique based on game theory
Salima Sabri
,
Mohammed Said Radjef
,
Mohand Tahar Kechadi
Conference:
IEEE International Conference on Spatial Data Mining and Geographical Knowledge Services  ICSDM
, 2011
MicroFix: Using timing interpolation and delay sensors for power reduction
Guihai Yan
,
Yinhe Han
,
Hui Liu
,
Xiaoyao Liang
,
Xiaowei Li
Journal:
ACM Transactions on Design Automation of Electronic Systems  TODAES
, vol. 16, no. 2, pp. 121, 2011