Semi-Supervised Clustering With Multiresolution Autoencoders

Contribution in conference proceedings

Publication date:

2018

Abstract:

In most real-world clustering scenarios, experts typically have only limited background information, but such knowledge is valuable and can guide the analysis process. Semi-supervised clustering can be used to steer the algorithmic process with prior knowledge and to enable the discovery of clusters that meet the analyst's expectations. Usually, in the semi-supervised clustering setting, the background knowledge is converted into some kind of constraint and, subsequently, metric learning or constrained clustering is adopted to obtain the final data partition. In contrast, we propose a new semi-supervised clustering algorithm that directly exploits prior knowledge, in the form of labeled examples, avoiding the need to derive constraints. Our algorithm employs a multiresolution strategy to generate an ensemble of semi-supervised autoencoders that fit the data together with the background knowledge. The network models are then used to produce a new embedding representation on which clustering is performed. The proposed strategy is evaluated on a set of real-world benchmarks and compared with well-known state-of-the-art semi-supervised clustering methods. The experimental results highlight the benefit of directly leveraging the prior knowledge and show the quality of the representation learnt by the multiresolution scheme.

CRIS type:

04A-Conference paper in volume

Keywords:

semi-supervised clustering, background knowledge, autoencoders, ensemble

Authors: