A Fast Distributed Implementation of Machine Learning Algorithms that work via Online-to-Batch Conversion

Abstract In this paper we consider batch classification: we have an input space X and an unknown labelling of the points in X. We assume there is a probability distribution over X, from which N points are sampled i.i.d. The N points are given to the learner, along with their labels, and the goal of the learner is to build a function that labels new points in X correctly (most of the time). We consider algorithms for the learner that work via online-to-batch conversion, described as follows. Online classification is a game played over a number of trials. On each trial: 1) a point in X is revealed to the learner; 2) the learner predicts the label of the point; 3) the true label is revealed to the learner. Online-to-batch conversion is the process of converting an online classification algorithm into a batch classification algorithm. We consider the case in which the given point/label pairs are distributed across different machines. We give a distributed implementation of the online-to-batch conversion process that requires only a small amount of data to be broadcast.
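
As an illustration of the online protocol and the conversion step described in the abstract, the following is a minimal single-machine sketch. It is not the distributed implementation from the paper. It assumes a perceptron as the online learner, labels in {+1, -1}, and hypothesis averaging as the conversion rule; these choices and all function names (perceptron_trial, online_to_batch, predict) are hypothetical and chosen only for illustration.

```python
def perceptron_trial(w, x, y):
    """One online trial: predict the label of x, see the true label y,
    and update the weights only if the prediction was wrong."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    y_hat = 1 if score >= 0 else -1
    if y_hat != y:
        w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, y_hat


def online_to_batch(samples, dim):
    """Run the online learner once over the labelled sample and return
    the average of the hypotheses used on each trial (an assumed
    conversion rule), together with the number of online mistakes."""
    w = [0.0] * dim
    w_sum = [0.0] * dim
    mistakes = 0
    for x, y in samples:
        # Accumulate the hypothesis that is about to be used on this trial.
        w_sum = [s + wi for s, wi in zip(w_sum, w)]
        w, y_hat = perceptron_trial(w, x, y)
        mistakes += int(y_hat != y)
    n = len(samples)
    w_avg = [s / n for s in w_sum]
    return w_avg, mistakes


def predict(w, x):
    """Batch classifier: label a new point with the converted hypothesis."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Toy sample: points on the positive x-axis are +1, on the negative x-axis -1.
samples = [((1.0, 0.0), 1), ((-1.0, 0.0), -1),
           ((1.0, 0.0), 1), ((-1.0, 0.0), -1)]
w_avg, mistakes = online_to_batch(samples, dim=2)
```

In this sketch the batch classifier is simply predict applied with w_avg. A distributed variant would have to decide what each machine broadcasts, which is the question the paper addresses and which this sketch leaves out.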
Authors
  • Stephen Pasteris (UCL)
  • Mark Herbster (UCL)
  • Richard Tomsett (IBM UK)
  • Shiqiang Wang (IBM US)
Date Sep-2020
Venue 4th Annual Fall Meeting of the DAIS ITA, 2020