Pontificia Universidad Católica de Chile
Peralta B., Caro A. and Soto A. (2016)

A proposal for supervised clustering with Dirichlet Process using labels

Journal : Pattern Recognition Letters
Volume : 80
Pages : 52-57
Publication type : ISI

Abstract

Supervised clustering is an emerging area of machine learning, where the goal is to find class-uniform clusters. However, typical state-of-the-art algorithms use a fixed number of clusters. In this work, we propose a variation of non-parametric Bayesian modeling for supervised clustering. Our approach models the clusters as a mixture of Gaussians, with a constraint that encourages clusters of points with the same label. To estimate the number of clusters, we assume a priori a countably infinite number of clusters, using a variation of the Dirichlet Process model as the prior distribution. In our experiments, we show that our technique typically outperforms other clustering techniques.
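The two ideas in the abstract, a prior that does not fix the number of clusters and a preference for class-uniform clusters, can be illustrated with a small sketch. This is not the authors' model: it only samples assignments from a standard Chinese restaurant process (a common constructive view of the Dirichlet Process prior) and measures how class-uniform a clustering is. The function names `crp_assignments` and `cluster_purity`, the concentration parameter `alpha`, and the purity measure itself are assumptions made for illustration; the Gaussian likelihood and the label constraint of the paper are omitted.

```python
import random
from collections import Counter


def crp_assignments(n_points, alpha, seed=0):
    """Sample cluster assignments from a Chinese restaurant process prior.

    alpha > 0 is the concentration parameter: larger values tend to
    open more clusters. The number of clusters is not fixed in advance.
    """
    rng = random.Random(seed)
    assignments = []
    counts = []  # counts[k] = number of points currently in cluster k
    for _ in range(n_points):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments


def cluster_purity(assignments, labels):
    """Fraction of points whose label is the majority label of their cluster.

    A value of 1.0 means every cluster is class-uniform.
    """
    by_cluster = {}
    for k, y in zip(assignments, labels):
        by_cluster.setdefault(k, Counter())[y] += 1
    majority = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority / len(labels)
```

Calling `crp_assignments(100, alpha=1.0)` with different seeds yields different numbers of clusters, which is the property that lets a Dirichlet Process prior avoid a fixed cluster count. A supervised variant along the lines described above would additionally favor assignments with high `cluster_purity`; how the paper encodes that preference is not specified in the abstract.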