Computing Messages that Reveal Selected Inferences while Protecting Others

Abstract We study a lossy coding scenario, posed as an algorithmic optimization problem, where we trade off between the conflicting goals of accuracy and privacy, motivated by scenarios such as the public release of estimates of data that accurately reflect some aspects of the raw data without revealing other sensitive confidential aspects of it, or permitting it to be inferred. More precisely, given a discrete probability distribution p(D, X, Y ), where X represents the whitelisted inferences from D and Y represents the blacklisted inferences, we seek to construct a conditional distribution p(M|D) with the dual goals of making I(M; X) large and I(M; Y ) small. Chakraborty et al. 2013 [1] provided optimal solutions within this model to the two extreme points on the two objectives' Pareto frontier: maximizing I(M; X) subject to the constraint that I(M; Y ) be as small as possible (“perfect privacy”, using linear programming (LP)) and vice versa (“perfect utility”, which is trivial). In this paper we provide a faster combinatorial optimal algorithm for the perfect privacy problem, which does not require the use of an LP solver. Moreover, this algorithm any be used to compute Pareto-optimal solutions at any point on the Pareto frontier. (En route to this algorithm, we also provide a mathematical programming-based solution.) This solves the primary open problem posed by [1].
Authors
  • Matthew Johnson
  • Liang Zhao
  • Supriyo Chakraborty (IBM US)
Date Nov-2017
Venue 2017 IEEE Information Theory Workshop