Abstract
Data sparsity is a well-known problem in collaborative filtering (CF), and it makes accurate recommendation particularly difficult. In this paper, we address data sparsity in neighborhood-based CF and propose a maximum imputation framework to tackle it. The basic idea is to identify an imputation area that maximizes the imputation benefit for recommendation purposes while minimizing the imputation error introduced. To maximize the imputation benefit, the imputation area is determined from both the user and the item perspectives; to minimize the imputation error, at least one real rating is preserved for each item in the identified imputation area. We provide a theoretical analysis proving that the proposed imputation method outperforms conventional neighborhood-based CF methods through more accurate neighbor identification. We evaluate the proposed framework on two benchmark datasets by comparing it with seven relevant methods. The experimental results indicate that the proposed method significantly outperforms these methods.
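To make the notion of an imputation area concrete, the following is a minimal sketch of one plausible reading of the idea, not the algorithm of the paper. It assumes that, for an active user and a target item, the area consists of the users who rated the target item plus the active user (item perspective) and the items the active user has rated (user perspective), so that every item column keeps at least one real rating. The function names `imputation_area` and `impute_area`, the toy matrix `R`, and the user-mean fill are all hypothetical choices made for illustration.

```python
import numpy as np


def imputation_area(R, user, item):
    """Return (rows, cols) of a candidate imputation area for (user, item).

    Assumption: rows are the users who rated `item` plus the active user,
    and columns are the items the active user has rated. Missing ratings
    are encoded as NaN.
    """
    rated = ~np.isnan(R)
    rows = np.union1d(np.flatnonzero(rated[:, item]), [user])
    cols = np.flatnonzero(rated[user, :])
    return rows, cols


def impute_area(R, rows, cols):
    """Fill missing entries inside the area with each user's mean rating.

    The user-mean fill is a placeholder, not the paper's imputation rule.
    """
    area = R[np.ix_(rows, cols)].copy()
    user_means = np.nanmean(R[rows, :], axis=1)
    missing = np.isnan(area)
    # Broadcast each user's mean across that user's row, then copy it
    # only into the cells that were missing.
    area[missing] = np.broadcast_to(user_means[:, None], area.shape)[missing]
    return area


# Toy 4-user x 4-item rating matrix (hypothetical data).
R = np.array([
    [5.0, 3.0, np.nan, 1.0],
    [4.0, np.nan, 4.0, 1.0],
    [np.nan, 1.0, 5.0, 4.0],
    [1.0, 1.0, np.nan, 5.0],
])

rows, cols = imputation_area(R, user=0, item=2)
filled = impute_area(R, rows, cols)
```

In this sketch only the cells inside the selected area are imputed, and the dense submatrix `filled` could then be used to compute user similarities for neighbor selection. Restricting imputation to the area, rather than filling the whole matrix, is what limits the imputation error that gets introduced.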
Cite this article
Ren, Y., Li, G., Zhang, J. et al. The maximum imputation framework for neighborhood-based collaborative filtering. Soc. Netw. Anal. Min. 4, 207 (2014). https://doi.org/10.1007/s13278-014-0207-3
DOI: https://doi.org/10.1007/s13278-014-0207-3