
Privacy-preserving topic model for tagging recommender systems

  • Regular Paper
Knowledge and Information Systems

Abstract

Tagging recommender systems give users the freedom to explore tags and obtain recommendations. Releasing and sharing tagging datasets would accelerate both commercial and research work on recommender systems. However, releasing the original datasets raises serious privacy concerns, because an adversary with only a little background information may re-identify a user and his or her sensitive information. Several privacy techniques have recently been proposed to address this problem, but most lack a strict privacy notion and rarely prevent individuals from being re-identified. This paper proposes a privacy-preserving tag release algorithm, PriTop, designed to satisfy differential privacy, a strict privacy notion that aims to protect the users in a tagging dataset. PriTop comprises three privacy-preserving operations: private topic model generation structures the uncontrolled tags; private weight perturbation adds Laplace noise to the weights to hide the number of tags; and private tag selection finds the most suitable replacement tags for the original tags, so that the exact tags are hidden. We present extensive experimental results on four real-world datasets: Delicious, MovieLens, Last.fm and BibSonomy. The recommendation algorithm is successful in all cases, and the results further suggest that PriTop retains the utility of the datasets while preserving privacy.
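
The two perturbation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sensitivity default of 1, the clamping of noisy weights at zero, and the use of the exponential mechanism for tag selection are all assumptions made here for illustration, since the abstract states only that Laplace noise is added to the weights and that suitable replacement tags are selected privately.

```python
import math
import random


def perturb_weights(weights, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace(0, sensitivity / epsilon) noise to each topic weight.

    `weights` maps a topic id to a non-negative weight (e.g. a tag count).
    The sensitivity value is an assumption, not taken from the paper.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    noisy = {}
    for topic, weight in weights.items():
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        # Clamping at zero is post-processing (an assumption of this sketch).
        noisy[topic] = max(0.0, weight + noise)
    return noisy


def select_replacement(candidates, scores, epsilon, sensitivity=1.0, rng=None):
    """Pick one replacement tag with the exponential mechanism.

    Whether PriTop selects tags exactly this way is an assumption; `scores`
    is a hypothetical map from tag to a suitability score.
    """
    rng = rng or random.Random()
    top = max(scores[tag] for tag in candidates)
    # Subtracting the top score keeps the exponent non-positive (no overflow).
    probs = [
        math.exp(epsilon * (scores[tag] - top) / (2.0 * sensitivity))
        for tag in candidates
    ]
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    print(perturb_weights({"topic_a": 12.0, "topic_b": 3.0}, epsilon=1.0, rng=rng))
    print(select_replacement(["jazz", "rock"], {"jazz": 0.9, "rock": 0.4}, 1.0, rng=rng))
```

A smaller `epsilon` gives a larger noise scale and a flatter selection distribution, which is the usual trade-off between privacy and utility under differential privacy.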

Notes

  1. https://delicious.com/.

  2. http://www.bibsonomy.org/.

  3. http://www.last.fm/.

  4. https://www.netflix.com/.

  5. http://www.dai-labor.de/.

  6. http://www.kde.cs.uni-kassel.de/ws/dc09/.

  7. http://ir.ii.uam.es/hetrec2011.

Author information

Corresponding author

Correspondence to Gang Li.

Additional information

This manuscript is an extended version of a PAKDD best student paper.

About this article

Cite this article

Zhu, T., Li, G., Zhou, W. et al. Privacy-preserving topic model for tagging recommender systems. Knowl Inf Syst 46, 33–58 (2016). https://doi.org/10.1007/s10115-015-0832-9

  • DOI: https://doi.org/10.1007/s10115-015-0832-9
