Abstract 
We consider the problem of a group of s agents learning collectively in a nonstationary environment. Each agent asynchronously receives an online data stream. At each point in time an agent performs one of n actions based on the data and incurs a loss based on that action. In this setting each data stream is partitioned into a sequence of segments that are unknown to the agents. Each such segment is also associated with an unknown best action. In [1] an algorithm was given which was designed to exploit the scenario where there are many such segments but all the best actions come from a small subset of the total action set. In this work we allow this subset of to change over time. We give efficient algorithms, for this scenario, with guarantees that that significantly improve on previous algorithms. In terms of efficiency on trial τ our algorithm requires O(nτ) time and in the special case of a single agent s = 1 we give an algorithm that requires only O(n) time per trial. Finally, we provide promising preliminary experiments that demonstrates the strength of algorithms. 
Authors 
 Stephen Pasteris (UCL)
 Mark Herbster (UCL)
 Richard Tomsett (IBM UK)
 Don Towsley (UMass)
 YuZhen Chen (UMass)

Date 
Sep2020 
Venue 
4th Annual Fall Meeting of the DAIS ITA, 2020 
