Ranking and updating machine learning models based on data inputs at edge nodes

Abstract	An input dataset for training a new machine learning model is received by a processor. For each of a plurality of trained machine learning models, a hash function and a sketch of the training dataset used to train that model are retrieved. A sketch of the input dataset is computed from the hash function and the input dataset, and a distance between the sketch of the training dataset and the sketch of the input dataset is computed. The computed distances for the trained machine learning models are ranked from smallest to largest, and a seed machine learning model for the input dataset is selected from the trained machine learning models based at least in part on the ranking. A training process for the new machine learning model is then initiated using the selected seed machine learning model and the input dataset.
Authors	Raghu Ganti (IBM US), Mudhakar Srivatsa (IBM US), Swati Rallapalli (IBM US), ShreeRanjani SrirangamSridharan (IBM US)
Date	Jan-2020
Venue	U.S. Patent Application 16/021,086, filed January 2, 2020.
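
The selection procedure summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which sketching technique or distance measure is used, so this example assumes a MinHash sketch and a distance of one minus the estimated Jaccard similarity. All names here (`make_hash`, `sketch`, `distance`, `TrainedModel`, `rank_models`, `select_seed`, `NUM_HASHES`) are hypothetical and chosen only for illustration. The final step, training the new model from the selected seed, is not shown.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, List, Tuple

NUM_HASHES = 64  # assumed sketch length; the abstract does not specify one


def make_hash(seed: int) -> Callable[[Hashable, int], int]:
    """Return a seeded hash h(item, i) -> 64-bit int (illustrative stand-in)."""
    def h(item: Hashable, i: int) -> int:
        data = f"{seed}:{i}:{item!r}".encode()
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
    return h


def sketch(dataset: Iterable[Hashable], h: Callable[[Hashable, int], int]) -> Tuple[int, ...]:
    """MinHash sketch (an assumption): the minimum hash per slot over a non-empty dataset."""
    items = set(dataset)
    return tuple(min(h(x, i) for x in items) for i in range(NUM_HASHES))


def distance(a: Tuple[int, ...], b: Tuple[int, ...]) -> float:
    """One minus the fraction of matching slots, i.e. 1 - estimated Jaccard similarity."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 1.0 - matches / len(a)


@dataclass
class TrainedModel:
    name: str
    hash_fn: Callable[[Hashable, int], int]  # hash function stored with the model
    training_sketch: Tuple[int, ...]         # sketch of the model's training dataset


def rank_models(models: List[TrainedModel], input_dataset: Iterable[Hashable]) -> List[Tuple[float, str]]:
    """Sketch the input with each model's hash function; rank by distance, smallest first."""
    items = list(input_dataset)
    scored = []
    for m in models:
        input_sketch = sketch(items, m.hash_fn)
        scored.append((distance(m.training_sketch, input_sketch), m.name))
    return sorted(scored)


def select_seed(models: List[TrainedModel], input_dataset: Iterable[Hashable]) -> str:
    """Pick the top-ranked (closest) model as the seed for further training."""
    return rank_models(models, input_dataset)[0][1]


# Toy usage: model_a's training data overlaps heavily with the input, model_b's does not.
h_a, h_b = make_hash(1), make_hash(2)
models = [
    TrainedModel("model_a", h_a, sketch(range(0, 100), h_a)),
    TrainedModel("model_b", h_b, sketch(range(1000, 1100), h_b)),
]
print(select_seed(models, range(10, 110)))
```

Because each model carries its own hash function, the input dataset is re-sketched once per candidate model, which keeps the comparison between sketches built with the same function. Only the compact sketches of the training datasets need to be stored and compared, not the training datasets themselves.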