Abstract
Most previous work on differential privacy has focused on independent datasets, assuming that all records are sampled from a universe independently. In the real world, however, many datasets contain strong coupling relations in which records are correlated with one another. When such a dataset is released, the differential privacy guarantee can be undermined, because an adversary who knows the correlations has a higher chance of inferring sensitive information. It is therefore critical to find effective solutions that preserve rigorous differential privacy on correlated datasets. This chapter first formally defines the correlated differential privacy problem and outlines the research issues and challenges in providing privacy guarantees for correlated datasets. It then presents an innovative solution to the correlated differential privacy problem and shows that the solution is robust and effective.
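To make the threat concrete, the sketch below illustrates why correlation matters for the standard Laplace mechanism. It is a minimal toy example, not the chapter's actual solution: the dataset, the assumed correlation pattern, and the enlarged "correlated sensitivity" value are all illustrative assumptions. The idea is that if deleting one record effectively changes several correlated records, the noise must be calibrated to that larger effective sensitivity to keep the same epsilon guarantee.

```python
import numpy as np

def laplace_mechanism(query_result, sensitivity, epsilon, rng):
    """Standard Laplace mechanism: add noise with scale = sensitivity / epsilon."""
    return query_result + rng.laplace(0.0, sensitivity / epsilon)

# Toy dataset: a count query over 4 records.
data = np.array([1, 1, 1, 1])
query = int(data.sum())  # true answer: 4

eps = 1.0
rng = np.random.default_rng(0)

# Independent records: removing one record changes the count by at most 1,
# so sensitivity = 1 suffices.
independent_release = laplace_mechanism(query, sensitivity=1.0, epsilon=eps, rng=rng)

# Now assume (hypothetically) that records 0-2 are fully correlated, e.g.
# members of one family whose values always move together. Removing one of
# them effectively changes all three, so the effective ("correlated")
# sensitivity grows to 3, and the noise scale must grow accordingly;
# otherwise the nominal epsilon guarantee no longer holds against an
# adversary who knows the correlation.
correlated_release = laplace_mechanism(query, sensitivity=3.0, epsilon=eps, rng=rng)
```

Note that simply inflating the sensitivity to the worst case destroys utility when correlations are weak; the chapter's contribution is precisely about calibrating noise to the actual strength of the coupling rather than to this crude upper bound.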
© 2017 Springer International Publishing AG
Cite this chapter
Zhu, T., Li, G., Zhou, W., Yu, P.S. (2017). Correlated Differential Privacy for Non-IID Datasets. In: Differential Privacy and Applications. Advances in Information Security, vol 69. Springer, Cham. https://doi.org/10.1007/978-3-319-62004-6_14
Print ISBN: 978-3-319-62002-2
Online ISBN: 978-3-319-62004-6