Achieving Pareto-Optimal MI-Based Privacy-Utility Tradeoffs Under Full Data

Abstract We study a fine-grained model in which a perturbed version of some data (D) is to be disclosed, with the aims of permitting the receiver to accurately infer some useful aspects (X = f(D)) of it, while preventing her from inferring other private aspects (Y = g(D)). Correlation between the bases for these inferences necessitates compromise between these goals. Determining exactly how the disclosure (M) will be probabilistically generated (from D), somehow trading off between making I(M; X) large and I(M; Y ) small, is cast as an algorithmic optimization problem. Chakraborty et al. 2013 [2] provided optimal solutions for the two extreme points on these objectives Pareto frontier: maximizing I(M; X) s.t. I(M; Y ) = 0 (“perfect privacy”, via linear programming (LP)) and minimizing I(M; Y ) s.t. H(X|M) = 0 (“perfect utility”, for which the trivial solution M = X is optimal). We show that when minimizing I(M; Y ) − βI(M; X), we can restrict ourselves w.l.o.g. to solutions satisfying several normal-form conditions, which leads to 1) an alternative convex programming formulation when β ∈ [0, 1], which we provide a practical optimal algorithm for, and 2) proof that M = X is actually optimal for all β ≥ 1. This solves the primary open problem posed by [2]. (It also provides a faster solution than [2]'s LP for the “perfect privacy” special case.)
  • Matthew Johnson
  • Liang Zhao
  • Supriyo Chakraborty (IBM US)
Date Oct-2018
Venue IEEE Journal of Selected Topics in Signal Processing ( Volume: 12 , Issue: 5, Oct. 2018 ) [link]