Explainable k-Means and k-Medians Clustering

Jul 12, 2020

Sprecher:innen

Über

Clustering is a popular unsupervised learning method for geometric data. Unfortunately, many clustering algorithms use global properties of the data, and there are no simple explanations for cluster assignments. To improve interpretability, we consider using a small threshold tree to partition a dataset into clusters. This leads to cluster assignments that can be explained by very few feature values in a straightforward manner. We study this problem from a theoretical viewpoint, measuring the output quality by the k-means and k-medians objectives. In terms of negative results, we show that popular top-down decision tree algorithms may lead to clusterings with arbitrarily large cost, and we prove that any explainable clustering must incur an Ω(logk) approximation compared to the optimal clustering. On the upper bound side, we design efficient algorithms that produce explainable clusters using a tree with k leaves. For two means/medians, we show that a single threshold cut suffices to achieve a constant factor approximation, which is a surprising result that nearly matches our lower bounds. For general k ≥2, our algorithm is an O(k) approximation to the optimal k-medians and an O(k^2) approximation to the optimal k-means. Prior to our work, no algorithms were known with provable guarantees independent of the dimensionality and input size.

Organisator

Kategorien

Über ICML 2020

The International Conference on Machine Learning (ICML) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence known as machine learning. ICML is globally renowned for presenting and publishing cutting-edge research on all aspects of machine learning used in closely related areas like artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, and robotics. ICML is one of the fastest growing artificial intelligence conferences in the world. Participants at ICML span a wide range of backgrounds, from academic and industrial researchers, to entrepreneurs and engineers, to graduate students and postdocs.

Präsentation speichern

Soll diese Präsentation für 1000 Jahre gespeichert werden?

Wie speichern wir Präsentationen?

Ewigspeicher-Fortschrittswert: 0 = 0.0%

Freigeben

Empfohlene Videos

Präsentationen, deren Thema, Kategorie oder Sprecher:in ähnlich sind

Interessiert an Vorträgen wie diesem? ICML 2020 folgen