Self Generating Policies for Training Data Curation in Coalition Environments

Abstract In any machine learning problem, acquiring good training data is the main challenge that must be overcome to build an effective model. When building machine learning based solutions in the context of coalition operations, one may be able to obtain training data from coalition partners. However, not all coalition partners may be equally trusted, which complicates the decision of whether to accept training data from a given partner. Policies can provide a mechanism for making these decisions, but defining the right policies manually would be difficult since they depend on the characteristics of the data being offered. Motivated by this observation, we propose an architecture that can generate the policies required for building a machine learning model in a coalition environment without a significant amount of human input.
Authors
  • Dinesh Verma (IBM US)
  • Seraphin Calo (IBM US)
  • Shalisa Witherspoon (IBM US)
  • Irene Manatos (IBM US)
  • Elisa Bertino (Purdue)
  • Amani Abu Jabal (Purdue)
  • Geeth de Mel (IBM UK)
  • Greg Cirincione (ARL)
  • Ananthram Swami (ARL)
Date Sep-2018
Venue 2nd International Workshop on Policy-based Autonomic Data Governance (PADG 2018)