On Similarity Prediction and Pairwise Clustering

Abstract	We consider the problem of clustering a finite set of items from pairwise similarity information. Unlike what is done in the literature on this subject, we do so in a passive learning setting, and with no specific constraints on the cluster shapes other than their size. We investigate the problem in two settings: (i) an online setting, where we provide a tight characterization of the prediction complexity in the mistake bound model, and (ii) a standard stochastic batch setting, where we give tight upper and lower bounds on the achievable generalization error. Prediction performance is measured both in terms of the ability to recover the similarity function encoding the hidden clustering and in terms of how well we classify each item within the set. The proposed algorithms are time-efficient.
Authors	Stephen Pasteris (UCL), Fabio Vitale, Claudio Gentile, Mark Herbster (UCL)
Date	Apr-2018
Venue	JMLR: Workshop and Conference Proceedings 83:1–28, 2018. 29th International Conference on Algorithmic Learning Theory
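
The abstract's notion of a similarity function encoding a hidden clustering can be illustrated with a minimal sketch. It assumes a binary similarity (1 if two items share a cluster, 0 otherwise) and measures prediction quality as the number of item pairs whose similarity is predicted incorrectly. The function names and the toy labels are hypothetical and are not taken from the paper; this is not the authors' algorithm, only an illustration of the quantity being predicted.

```python
from itertools import combinations


def similarity(labels, i, j):
    """Return 1 if items i and j are in the same cluster, else 0 (assumed encoding)."""
    return 1 if labels[i] == labels[j] else 0


def pairwise_mistakes(true_labels, predicted_labels):
    """Count unordered item pairs whose predicted similarity differs from the true one."""
    n = len(true_labels)
    return sum(
        similarity(true_labels, i, j) != similarity(predicted_labels, i, j)
        for i, j in combinations(range(n), 2)
    )


# Hypothetical toy data: five items with a hidden clustering into two clusters.
true_labels = [0, 0, 1, 1, 1]
predicted_labels = [0, 0, 0, 1, 1]  # item 2 is placed in the wrong cluster

mistakes = pairwise_mistakes(true_labels, predicted_labels)
print(mistakes)
```

Because the similarity function depends only on which items are grouped together, the mistake count is unchanged if cluster names are permuted; this is one reason to evaluate a clustering through its induced pairwise similarities rather than through raw labels.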