Abstract
Twitter spam is a critical problem, and current solutions rely mainly on machine learning based detection. However, recent studies have found that spam features change continuously over time (the ‘Spam Drift’ problem), which may significantly degrade detection performance. In this paper, we carry out a real-data-driven study to explore the ‘Spam Drift’ problem and its impact on machine learning based detection. Our study finds that only a small group of spam features changes continuously. The results also suggest a counter-intuitive conclusion: the ‘Spam Drift’ problem does not have a serious impact on spam detection Precision (SP) or non-spam detection Recall (NR), two metrics that industry prioritises in practice.
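The abstract names SP and NR without defining them. The sketch below assumes the standard confusion-matrix definitions, with spam as the positive class; the paper's exact formulation may differ. The function names and the counts are hypothetical and are not taken from the paper's data.

```python
def spam_precision(tp: int, fp: int) -> float:
    """Share of tweets flagged as spam that really are spam: TP / (TP + FP)."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def non_spam_recall(tn: int, fp: int) -> float:
    """Share of legitimate tweets that are correctly left alone: TN / (TN + FP)."""
    legitimate = tn + fp
    return tn / legitimate if legitimate else 0.0


# Hypothetical counts for illustration only (not results from the paper).
tp, fp, tn, fn = 80, 5, 900, 15

sp = spam_precision(tp, fp)
nr = non_spam_recall(tn, fp)
print(f"SP = {sp:.3f}, NR = {nr:.3f}")
```

Under these definitions the only error term in either metric is the false-positive count. Missed spam (`fn`) appears in neither formula, so a classifier that starts missing more spam can still keep SP and NR stable as long as it does not begin flagging legitimate tweets.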
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wu, T., Wang, D., Wen, S., Xiang, Y. (2017). How Spam Features Change in Twitter and the Impact to Machine Learning Based Detection. In: Liu, J., Samarati, P. (eds) Information Security Practice and Experience. ISPEC 2017. Lecture Notes in Computer Science, vol 10701. Springer, Cham. https://doi.org/10.1007/978-3-319-72359-4_57
DOI: https://doi.org/10.1007/978-3-319-72359-4_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72358-7
Online ISBN: 978-3-319-72359-4
eBook Packages: Computer Science, Computer Science (R0)