Discovering and tracking query oriented active online social groups in dynamic information network

World Wide Web

Abstract

The efficient identification of social groups with common interests is a key consideration for viral marketing on online social networking platforms. Most existing studies on social group or community detection either focus on the common attributes of the nodes (users) or rely only on the topological links of the social network graph. The temporal evolution of user activities and interests has not been thoroughly studied to identify its effect on the formation of groups. In this paper, we investigate the problem of discovering and tracking time-sensitive, activity-driven user groups in dynamic social networks for a given input query consisting of a set of topics. The users in these groups tend to be temporally similar in terms of their activities on the topics of interest. To this end, we develop two baseline solutions to discover effective social groups. The first solution uses the network structure, whereas the second one uses the topics of common interest. We further propose an index-based method to incrementally track the evolution of groups at a lower computational cost. Our main idea is based on the observation that the degree of user activeness often rises or falls considerably over a period of time. The temporal tendency of user activities is modelled as the freshness of recent activities by tracking the social streams with a fading time window. We conduct extensive experiments on three real data sets to demonstrate the effectiveness and efficiency of the proposed methods. We also report some interesting observations on the temporal evolution of the discovered social groups using case studies.
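The abstract does not give the formula behind the fading time window, so the following is only a minimal sketch of one common way to model the freshness of recent activities: each activity on a query topic contributes a weight that decays exponentially with its age. The exponential form, the function names `activeness` and `active_users`, and the parameters `decay` and `threshold` are assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import defaultdict


def activeness(events, query_topics, now, decay=0.1):
    """Score each user by the freshness of their activities on the query topics.

    events: iterable of (user, topic, timestamp) tuples (hypothetical format).
    decay:  assumed exponential fading rate; larger values forget faster.
    """
    scores = defaultdict(float)
    for user, topic, ts in events:
        # Only activities on queried topics, and not from the future, count.
        if topic in query_topics and ts <= now:
            # An activity of age (now - ts) contributes exp(-decay * age).
            scores[user] += math.exp(-decay * (now - ts))
    return dict(scores)


def active_users(scores, threshold):
    """Return the users whose faded activity score reaches the threshold."""
    return {user for user, score in scores.items() if score >= threshold}


if __name__ == "__main__":
    stream = [
        ("a", "t1", 10), ("a", "t1", 9),   # two recent activities
        ("b", "t1", 0),                    # one old activity
        ("c", "t2", 10),                   # recent, but off-topic
    ]
    s = activeness(stream, {"t1"}, now=10, decay=0.5)
    print(s)
    print(active_users(s, threshold=1.0))
```

Under this assumed model, a user with a few recent activities can outrank a user with many old ones, which is the behaviour the abstract attributes to tracking activeness over time.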

Notes

  1. http://www.facebook.com

  2. http://www.twitter.com

  3. https://plus.google.com

  4. https://github.com/splorp/wordpress-comment-blacklist/blob/master/blacklist.txt

  5. https://gate.ac.uk/wiki/twitie.html

  6. https://www.noslang.com/

  7. https://www.webopedia.com/quick_ref/Twitter_Dictionary_Guide.asp

  8. https://github.com/EFord36/normalise

  9. A political protest movement comprising autonomous groups affiliated by their militant opposition to fascism and other forms of extreme right-wing ideology.

  10. A man who is readily upset or offended by progressive attitudes that conflict with his more conventional or conservative views.

  11. http://aspell.net/

  12. https://lucene.apache.org

  13. We use the Greek lowercase letter kappa (κ) to refer to the number of clusters produced by the k-means algorithm, and the English lowercase letter k to refer to the k-core of the social graph G.

  14. http://snap.stanford.edu/data/twitter7.html

  15. We do not include IGM-Hashtag in the efficiency results because the computation times for IGM-Topic and IGM-Hashtag are the same. IGM-Topic is therefore denoted as IGM in the efficiency results. We report the effectiveness comparison between IGM-Topic and IGM-Hashtag in Section 7.3.

  16. The adjacency matrix of a network is denoted by A, where Auv = 0 means there is no edge (no interaction) between nodes u and v, and Auv = 1 means there is an edge between the two.
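Notes 13 and 16 refer to the k-core of the social graph G and to its adjacency matrix A. As a rough illustration of both notions, the sketch below builds a 0/1 adjacency matrix and extracts a k-core by repeatedly removing nodes with fewer than k remaining neighbours. It assumes an undirected, unweighted graph with nodes numbered from 0; the helper names `adjacency_matrix` and `k_core` are hypothetical and this is not the authors' implementation.

```python
def adjacency_matrix(n, edges):
    """Build an n x n 0/1 matrix A with A[u][v] = 1 iff u and v share an edge.

    Assumes an undirected graph, so the matrix is symmetric.
    """
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        A[v][u] = 1
    return A


def k_core(A, k):
    """Return the node set of the k-core: the maximal subgraph in which
    every node has at least k neighbours inside the subgraph."""
    alive = set(range(len(A)))
    changed = True
    while changed:
        changed = False
        for u in list(alive):
            # Degree of u counted only over nodes that are still present.
            degree = sum(A[u][v] for v in alive)
            if degree < k:
                alive.discard(u)
                changed = True
    return alive


if __name__ == "__main__":
    # A triangle (0, 1, 2) with one pendant node 3 attached to node 2.
    A = adjacency_matrix(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
    print(k_core(A, 2))
```

This simple peeling loop is quadratic or worse on a dense matrix; it is meant to show the definition, not an efficient algorithm for large social graphs.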

Acknowledgments

This work is supported by the ARC Discovery Projects DP170104747, DP160102412 and DP160102114.

Author information

Corresponding author

Correspondence to Md Musfique Anwar.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Social Computing and Big Data Applications

Guest Editors: Xiaoming Fu, Hong Huang, Gareth Tyson, Lu Zheng, and Gang Wang

About this article

Cite this article

Anwar, M.M., Liu, C. & Li, J. Discovering and tracking query oriented active online social groups in dynamic information network. World Wide Web 22, 1819–1854 (2019). https://doi.org/10.1007/s11280-018-0627-5

  • DOI: https://doi.org/10.1007/s11280-018-0627-5
