Location prediction in large-scale social networks: an in-depth benchmarking study

  • Regular Paper
The VLDB Journal

Abstract

Location details of social network users are important in diverse applications, ranging from news recommendation systems to disaster management. However, user location is not easy to obtain from social networks, because many users either do not bother to provide this information or decline to do so due to privacy concerns. Thus, it is useful to estimate user locations from implicit information in the network. For this purpose, many location prediction models have been proposed that exploit different network features. Unfortunately, these models have not been benchmarked on common datasets using standard metrics. We fill this gap and provide an in-depth empirical comparison of eight representative prediction models using five metrics on four real-world large-scale datasets, namely Twitter, Gowalla, Brightkite, and Foursquare. We formulate a generalized procedure-oriented location prediction framework, which allows us to evaluate and compare the prediction models systematically and thoroughly under extensive experimental settings. Based on our results, we perform a detailed analysis of the merits and limitations of the models, providing significant insights into the location prediction problem.

Notes

  1. https://developers.google.com/maps/documentation/.

Acknowledgements

This research was mainly supported by the ARC Discovery Projects under Grant No. DP160102114 and the CSIRO Data61 Scholarship Program. It was also partially supported by the ARC Linkage Projects under Grant No. LP180100750 and by the Research Grants Council of Hong Kong SAR, China, under Grant Nos. 14203618 and 14221716. The authors would like to thank Dr. Quanxi Shao and Dr. Cecile Paris (Data61, CSIRO) for their helpful advice, and Prof. Ajmal Mian (UWA) for proofreading the paper.

Author information

Corresponding author

Correspondence to Jianxin Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Al Hasan Haldar, N., Li, J., Reynolds, M. et al. Location prediction in large-scale social networks: an in-depth benchmarking study. The VLDB Journal 28, 623–648 (2019). https://doi.org/10.1007/s00778-019-00553-0

  • DOI: https://doi.org/10.1007/s00778-019-00553-0
