Academic
Publications
Clustering with Constraints: Feasibility Issues and the k-Means Algorithm

Clustering with Constraints: Feasibility Issues and the k-Means Algorithm,Ian Davidson,S. S. Ravi

Clustering with Constraints: Feasibility Issues and the k-Means Algorithm   (Citations: 73)
BibTex | RIS | RefWorks Download
Recent work has looked at extending the k-Means al- gorithm to incorporate background information in the form of instance level must-link and cannot-link con- straints. We introduce two ways of specifying additional background information in the form of and con- straints that operate on all instances but which can be interpreted as conjunctions or disjunctions of instance level constraints and hence are easy to implement. We present complexity results for the feasibility of cluster- ing under each type of constraint individually and sev- eral types together. A key finding is that determining whether there is a feasible solution satisfying all con- straints is, in general, NP-complete. Thus, an iterative algorithm such as k-Means should not try to find a fea- sible partitioning at each iteration. This motivates our derivation of a new version of the k-Means algorithm that minimizes the constrained vector quantization er- ror but at each iteration does not attempt to satisfy all constraints. Using standard UCI datasets, we find that using constraints improves accuracy as others have reported, but we also show that our algorithm reduces the number of iterations until convergence. Finally, we illustrate these benefits and our new constraint types on a complex real world object identification problem using the infra-red detector on an Aibo robot.
Cumulative Annual
View Publication
The following links allow you to view full publications. These links are maintained by other sources not affiliated with Microsoft Academic Search.
Sort by: