On Similarity Prediction and Pairwise Clustering

Abstract We consider the problem of clustering a finite set of items from pairwise similarity information. Unlike what is done in the literature on this subject, we do so in a passive learning setting, and with no specific constraints on the cluster shapes other than their size. We investigate the problem in different settings: i. an online setting, where we provide a tight characterization of the prediction complexity in the mistake bound model, and ii. a standard stochastic batch setting, where we give tight upper and lower bounds on the achievable generalization error. Prediction performance is measured both in terms of the ability to recover the similarity function encoding the hidden clustering and in terms of how well we classify each item within the set. The proposed algorithms are time efficient.
Authors
  • Stephen Pasteris (UCL)
  • Fabio Vitale
  • Claudio Gentile
  • Mark Herbster (UCL)
Date Apr-2018
Venue JMLR: Workshop and Conference Proceedings 83:1–28, 2018; 29th International Conference on Algorithmic Learning Theory