Abstract
We present a practical, differentially private algorithm for answering a large number of queries on high-dimensional datasets. Like all algorithms for this task, ours necessarily has worst-case complexity exponential in the dimension of the data. However, our algorithm packages the computationally hard step into a concisely defined integer program, which can be solved non-privately using standard solvers. We prove accuracy and privacy theorems for our algorithm, and then demonstrate experimentally that our algorithm performs well in practice. For example, our algorithm can efficiently and accurately answer millions of queries on the Netflix dataset, which has over 17,000 attributes; this is an improvement on the state of the art by multiple orders of magnitude.
Original language | English |
---|---|
Title of host publication | 31st International Conference on Machine Learning, ICML 2014 |
Publisher | International Machine Learning Society |
Pages | 2908-2916 |
Number of pages | 9 |
Volume | 4 |
ISBN (Print) | 9781634393973 |
Publication status | Published - 2014 |
Event | 31st International Conference on Machine Learning, ICML 2014 - Beijing, China. Duration: 21 Jun 2014 → 26 Jun 2014. http://icml.cc/2014/ |
Publication series
Name | Journal of Machine Learning Research |
---|---|
Volume | 32 |
Conference
Conference | 31st International Conference on Machine Learning, ICML 2014 |
---|---|
Abbreviated title | ICML 2014 |
Country/Territory | China |
City | Beijing |
Period | 21/06/14 → 26/06/14 |
Internet address | http://icml.cc/2014/ |
ASJC Scopus subject areas
- Artificial Intelligence
- Computer Networks and Communications
- Software