## Abstract

We consider k-median clustering in finite metric spaces and k-means clustering in Euclidean spaces, in the setting where k is part of the input (not a constant). For the k-means problem, Ostrovsky et al. [18] show that if the optimal (k-1)-means clustering of the input is more expensive than the optimal k-means clustering by a factor of 1/ε^{2}, then one can achieve a (1 + f(ε))-approximation to the k-means optimal in time polynomial in n and k by using a variant of Lloyd's algorithm. In this work we substantially improve this approximation guarantee. We show that given only the condition that the (k-1)-means optimal is more expensive than the k-means optimal by a factor of 1+α for some constant α > 0, we can obtain a PTAS. In particular, under this assumption, for any ε > 0 we achieve a (1 + ε)-approximation to the k-means optimal in time polynomial in n and k, and exponential in 1/ε and 1/α. We thus decouple the strength of the assumption from the quality of the approximation ratio. We also give a PTAS for the k-median problem in finite metrics under the analogous assumption. For k-means, we additionally give a randomized algorithm with an improved running time of n^{O(1)}(k log n)^{poly(1/ε,1/α)}. Our technique also obtains a PTAS under the assumption of Balcan et al. [4] that all (1+α)-approximations are δ-close to a desired target clustering, in the case that all target clusters have size greater than δn and α > 0 is constant. Note that the motivation of Balcan et al. [4] is that for many clustering problems, the objective function is only a proxy for the true goal of getting close to the target. From this perspective, our improvement is that for k-means in Euclidean spaces we reduce the distance of the clustering found to the target from O(δ) to δ when all target clusters are large, and for k-median we improve the "largeness" condition needed in [4] to get exactly δ-close from O(δn) to δn. Our results are based on a new notion of clustering stability.
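The separation condition in the abstract compares the optimal (k-1)-means cost with the optimal k-means cost. As a rough illustration only, the sketch below computes that ratio exactly for one-dimensional inputs, where optimal clusters are contiguous in sorted order and a simple dynamic program suffices. This is not the algorithm of the paper; the restriction to one dimension and the names `optimal_kmeans_cost_1d` and `separation_ratio` are assumptions made for the example.

```python
from itertools import accumulate


def optimal_kmeans_cost_1d(points, k):
    """Exact optimal k-means cost for 1-D points (illustrative helper).

    Assumes 1-D input, where optimal clusters are contiguous blocks
    of the sorted points, so an O(k * n^2) dynamic program is exact.
    """
    xs = sorted(points)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(points)")
    # Prefix sums of x and x^2 give the cost of any block in O(1).
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(x * x for x in xs))

    def block_cost(i, j):
        # Sum of squared distances from xs[i:j] to their mean.
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # best[c][j]: optimal cost of splitting xs[:j] into c non-empty blocks.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            best[c][j] = min(
                best[c - 1][i] + block_cost(i, j) for i in range(c - 1, j)
            )
    return best[k][n]


def separation_ratio(points, k):
    """Return OPT_{k-1} / OPT_k for k >= 2 (hypothetical helper).

    The condition discussed in the abstract asks for this ratio to be
    at least 1 + alpha for some constant alpha > 0.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    opt_k = optimal_kmeans_cost_1d(points, k)
    opt_k_minus_1 = optimal_kmeans_cost_1d(points, k - 1)
    return float("inf") if opt_k == 0 else opt_k_minus_1 / opt_k


if __name__ == "__main__":
    data = [0.0, 1.0, 10.0, 11.0]
    alpha = 0.5  # example constant, chosen arbitrarily
    ratio = separation_ratio(data, 2)
    print(ratio, ratio >= 1 + alpha)
```

On two well-separated pairs such as `[0, 1, 10, 11]` with k = 2, the ratio is large, so the condition holds for any moderate α. On evenly spread data the ratio moves toward 1 and the condition may fail for the chosen α.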

Original language | English |
---|---|
Title of host publication | Proceedings - 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, FOCS 2010 |
Publisher | IEEE Computer Society |
Pages | 309-318 |
Number of pages | 10 |
ISBN (Print) | 9780769542447 |
State | Published - 2010 |
Externally published | Yes |
Event | 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, FOCS 2010 - Las Vegas, NV, United States. Duration: 23 Oct 2010 → 26 Oct 2010 |

### Publication series

Name | Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS |
---|---|
ISSN (Print) | 0272-5428 |

### Conference

Conference | 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, FOCS 2010 |
---|---|
Country/Territory | United States |
City | Las Vegas, NV |
Period | 23 Oct 2010 → 26 Oct 2010 |