Dual query: practical private query release for high dimensional data

Marco Gaboardi, Emilio Jesús Gallego Arias, Justin Hsu, Aaron Roth, Zhiwei Steven Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

18 Citations (Scopus)

Abstract

We present a practical, differentially private algorithm for answering a large number of queries on high dimensional datasets. Like all algorithms for this task, ours necessarily has worst-case complexity exponential in the dimension of the data. However, our algorithm packages the computationally hard step into a concisely defined integer program, which can be solved non-privately using standard solvers. We prove accuracy and privacy theorems for our algorithm, and then demonstrate experimentally that our algorithm performs well in practice. For example, our algorithm can efficiently and accurately answer millions of queries on the Netflix dataset, which has over 17,000 attributes; this is an improvement on the state of the art by multiple orders of magnitude.

Original language: English
Title of host publication: 31st International Conference on Machine Learning, ICML 2014
Publisher: International Machine Learning Society
Pages: 2908-2916
Number of pages: 9
Volume: 4
ISBN (Print): 9781634393973
Publication status: Published - 2014
Event: 31st International Conference on Machine Learning, ICML 2014 - Beijing, China
Duration: 21 Jun 2014 to 26 Jun 2014
http://icml.cc/2014/

Publication series

Name: Journal of Machine Learning Research
Volume: 32

Conference

Conference: 31st International Conference on Machine Learning, ICML 2014
Abbreviated title: ICML 2014
Country/Territory: China
City: Beijing
Period: 21/06/14 to 26/06/14

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Software
