Abstract
Anomalies are data points that deviate significantly from the norm. Anomaly detection thus amounts to finding data points located far away from their neighbors, i.e., those lying in low-density regions. Classic anomaly detection methods are largely designed for a single data type, either continuous or discrete. However, real-world data is increasingly heterogeneous: a data point can have both discrete and continuous attributes. Mixed data poses multiple challenges, including (a) capturing the inter-type correlation structures and (b) measuring deviation from the norm under multiple types. These challenges are exacerbated in (c) high-dimensional regimes. In this paper, we propose a new scalable unsupervised anomaly detection method for mixed data based on the Mixed-variate Restricted Boltzmann Machine (Mv.RBM). The Mv.RBM is a principled probabilistic method that estimates the density of mixed data. We propose to use the free energy derived from Mv.RBM as the anomaly score, as it is identical to the negative log-density of the data up to an additive constant. We then extend this method to detect anomalies across multiple levels of data abstraction, an effective approach for dealing with high-dimensional settings. The extension is dubbed \(\mathtt {MIXMAD}\), which stands for MIXed data Multilevel Anomaly Detection. In \(\mathtt {MIXMAD}\), we sequentially construct an ensemble of mixed-data Deep Belief Nets (DBNs) with varying depths. Each DBN is an energy-based detector at a predefined abstraction level. Predictions across the ensemble are finally combined via a simple rank aggregation method. The proposed methods are evaluated on a comprehensive suite of synthetic and real high-dimensional datasets.
The results demonstrate that for anomaly detection, (a) a proper handling of mixed types is necessary, (b) free energy is a powerful anomaly scoring method, (c) multilevel abstraction of data is important for high-dimensional data, and (d) empirically Mv.RBM and \(\mathtt {MIXMAD}\) are superior to popular unsupervised detection methods for both homogeneous and mixed data.
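To make the two central ideas of the abstract concrete (free energy as an anomaly score, and rank aggregation across detectors), the following is a minimal sketch under simplifying assumptions, not the authors' implementation. It assumes a plain binary RBM rather than the Mv.RBM, whose energy contains type-specific terms for mixed attributes. The weights are random placeholders instead of trained parameters, the two detectors stand in for DBNs of different depths, and mean-rank averaging is only one plausible choice of "simple rank aggregation". The names `free_energy` and `rank_aggregate` are hypothetical.

```python
import numpy as np

def free_energy(v, W, a, b):
    """Free energy of a binary RBM (an assumption; Mv.RBM adds type-specific terms).

    F(v) = -a.v - sum_k log(1 + exp(b_k + v.W[:, k])),
    which equals -log p(v) up to an additive constant (the log-partition function).
    """
    pre_activation = v @ W + b                      # shape: (n_samples, n_hidden)
    # logaddexp(0, x) is a numerically stable log(1 + exp(x))
    return -(v @ a) - np.logaddexp(0.0, pre_activation).sum(axis=1)

def rank_aggregate(score_lists):
    """Average the per-detector ranks; a higher value means more anomalous.

    Mean rank is an illustrative choice, not necessarily the paper's scheme.
    """
    ranks = [np.argsort(np.argsort(np.asarray(s))) for s in score_lists]
    return np.mean(ranks, axis=0)

# Toy data: 6 binary points with 5 visible units each.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(6, 5)).astype(float)

# Two untrained placeholder detectors (W, visible bias a, hidden bias b).
detectors = [
    (rng.normal(scale=0.1, size=(5, 3)), np.zeros(5), np.zeros(3))
    for _ in range(2)
]

# Higher free energy means lower estimated density, hence more anomalous.
scores = [free_energy(X, W, a, b) for W, a, b in detectors]
combined = rank_aggregate(scores)
```

Working with ranks rather than raw free energies avoids comparing scores whose scales differ from one detector to another, since each detector's partition function is unknown and generally different.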
Notes
A preliminary version of this paper has been published in [16].
The original Mv.RBM also covers rank data, but we do not consider it in this paper.
References
Aggarwal CC, Hinneburg A, Keim DA (2001) On the surprising behavior of distance metrics in high dimensional space. In: International conference on database theory, Springer, pp 420–434
Aggarwal CC, Sathe S (2015) Theoretical foundations and algorithms for outlier ensembles. ACM SIGKDD Explor Newsl 17(1):24–47
Akoglu L, Tong H, Vreeken J, Faloutsos C (2012) Fast and reliable anomaly detection in categorical data. In: Proceedings of the 21st ACM international conference on information and knowledge management, ACM, pp 415–424
Angiulli F, Pizzuti C (2002) Fast outlier detection in high dimensional spaces. In: European conference on principles of data mining and knowledge discovery, Springer, pp 15–27
Becker J, Havens TC, Pinar A, Schulz TJ (2015) Deep belief networks for false alarm rejection in forward-looking ground-penetrating radar. In: SPIE defense+ security, International Society for Optics and Photonics, pp 94540W–94540W
Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35(8):1798–1828
Bontemps L, McDermott J, Le-Khac NA et al (2016) Collective anomaly detection based on long short-term memory recurrent neural networks. In: International conference on future data and security engineering, Springer, pp 141–152
Bouguessa M (2015) A practical outlier detection approach for mixed-attribute data. Expert Syst Appl 42(22):8637–8649
Breunig MM, Kriegel HP, Ng RT, Sander J (2000) LOF: identifying density-based local outliers. In: ACM sigmod record, vol 29. ACM, pp 93–104
Campos GO, Zimek A, Sander J, Campello RJGB, Micenková B, Schubert E, Assent I, Houle ME (2015) On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study. Data Min Knowl Discov 30(4):891–927
Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv (CSUR) 41(3):15
Chauhan S, Vig L (2015) Anomaly detection in ECG time signals via deep long short-term memory networks. In: IEEE international conference on data science and advanced analytics (DSAA), 2015, IEEE, pp 1–7
Cheng M, Xu Q, Lv J, Liu W, Li Q, Wang J (2016) MS-LSTM: a multi-scale LSTM model for BGP anomaly detection. In: IEEE 24th international conference on network protocols (ICNP), 2016, IEEE, pp 1–6
Das K, Schneider J, Neill DB (2008) Anomaly pattern detection in categorical datasets. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 169–176
De Leon AR, Chough KC (2013) Analysis of mixed data: methods & applications. CRC Press, Boca Raton
Do K, Tran T, Phung D, Venkatesh S (2016) Outlier detection on mixed-type data: an energy-based approach. In: International conference on advanced data mining and applications (ADMA 2016)
Fiore U, Palmieri F, Castiglione A, De Santis A (2013) Network anomaly detection with the restricted Boltzmann machine. Neurocomputing 122:13–23
Gao N, Gao L, Gao Q, Wang H (2014) An intrusion detection model based on deep belief networks. In: Second international conference on advanced cloud and big data (CBD), 2014, IEEE, pp 247–252
Ghoting A, Otey ME, Parthasarathy S (2004) Loaded: link-based outlier and anomaly detection in evolving data sets. In: ICDM, pp 387–390
Hinton GE (2002) Training products of experts by minimizing contrastive divergence. Neural Comput 14:1771–1800
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507
Ienco D, Pensa RG, Meo R (2016) A semisupervised approach to the detection and characterization of outliers in categorical data. IEEE Trans Neural Netw Learn Syst 28(5):1017–1029
Kamyshanska H, Memisevic R (2015) The potential energy of an autoencoder. IEEE Trans Pattern Anal Mach Intell 37(6):1261–1273
Kingma D, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
Koufakou A, Georgiopoulos M (2010) A fast outlier detection strategy for distributed high-dimensional data sets with mixed attributes. Data Min Knowl Discov 20(2):259–289
Koufakou A, Georgiopoulos M, Anagnostopoulos GC (2008) Detecting outliers in high-dimensional datasets with mixed attributes. In: DMIN, Citeseer, pp 427–433
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
Lu YC, Feng C, Yating W, Lu CT (2016) Discovering anomalies on mixed-type data using a generalized student-t based approach. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2016.2583429
Malhotra P, Vig L, Shroff G, Agarwal P (2015) Long short term memory networks for anomaly detection in time series. In: Proceedings of ESANN, Presses universitaires de Louvain, pp 89–94
Mehta P, Schwab DJ (2014) An exact mapping between the variational renormalization group and deep learning. arXiv preprint arXiv:1410.3831
Nguyen TD, Tran T, Phung D, Venkatesh S (2013) Latent patient profile modelling and applications with mixed-variate restricted Boltzmann machine. In: Proceedings of Pacific-Asia conference on knowledge discovery and data mining (PAKDD), Gold Coast, Queensland, Australia
Nguyen TD, Tran T, Phung D, Venkatesh S (2013) Learning sparse latent representation and distance metric for image retrieval. In: Proceedings of IEEE international conference on multimedia & expo, California, USA, July 15–19
Otey ME, Parthasarathy S, Ghoting A (2005) Fast lightweight outlier detection in mixed-attribute data. Technical report, OSU–CISRC–6/05–TR43
Pai HT, Wu F, Hsueh PYSS (2014) A relative patterns discovery for enhancing outlier detection in categorical data. Dec Support Syst 67:90–99
Papadimitriou S, Kitagawa H, Gibbons PB, Faloutsos C (2003) Loci: fast outlier detection using the local correlation integral. In: Proceedings. 19th international conference on data engineering, 2003. IEEE, pp 315–326
Salakhutdinov R, Hinton G (2009) Semantic hashing. Int J Approx Reas 50(7):969–978
Serfling R, Wang S (2014) General foundations for studying masking and swamping robustness of outlier identifiers. Statis Methodol 20:79–90
Sun J, Wyss R, Steinecker A, Glocker P (2014) Automated fault detection using deep belief networks for the quality inspection of electromotors. tm-Technisches Messen 81(5):255–263
Tagawa T, Tadokoro Y, Yairi T (2014) Structured denoising autoencoder for fault detection and analysis. In: ACML
Tang G, Pei J, Bailey J, Dong G (2015) Mining multidimensional contextual outliers from categorical relational data. Intell Data Anal 19(5):1171–1192
Taylor A, Leblanc S, Japkowicz N (2016) Anomaly detection in automobile control network data with long short-term memory networks. In: IEEE international conference on data science and advanced analytics (DSAA), 2016, IEEE, pp 130–139
Tran N, Jin H (2012) Detecting network anomalies in mixed-attribute data sets. In: Third international conference on knowledge discovery and data mining, 2010. WKDD’10, IEEE, pp 383–386
Tran T, Phung D, Venkatesh S (2013) Thurstonian Boltzmann machines: learning from multiple inequalities. In: International conference on machine learning (ICML), Atlanta, USA, June 16–21
Tran T, Phung DQ, Venkatesh S (2011) Mixed-variate restricted Boltzmann machines. In: Proceedings of 3rd Asian conference on machine learning (ACML), Taoyuan, Taiwan
Tran T, Luo W, Phung D, Morris J, Rickard K, Venkatesh S (2016) Preterm birth prediction: deriving stable and interpretable rules from high dimensional data. In: Conference on machine learning in healthcare, LA, USA
Tuor A, Kaplan S, Hutchinson B, Nichols N, Robinson S (2017) Deep learning for unsupervised insider threat detection in structured cybersecurity data streams. In: Proceedings of the AAAI-17 Workshop on Artificial Intelligence for Cyber Security, pp 224–231
Wang Y, Cai W, Wei P (2016) A deep learning approach for detecting malicious JavaScript code. Secur Commun Netw 9:1520–1534
Ye M, Li X, Orlowska ME (2009) Projected outlier detection in high-dimensional mixed-attributes data set. Expert Syst Appl 36(3):7104–7113
Zhai S, Cheng Y, Lu W, Zhang Z (2016) Deep structured energy based models for anomaly detection. arXiv preprint arXiv:1605.07717
Zhang K, Jin H (2010) An effective pattern based outlier detection approach for mixed attribute data. In: Australasian joint conference on artificial intelligence, Springer, pp 122–131
Zimek A, Schubert E, Kriegel HP (2012) A survey on unsupervised outlier detection in high-dimensional numerical data. Statis Anal Data Mining 5(5):363–387
Acknowledgements
This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning.
Cite this article
Do, K., Tran, T. & Venkatesh, S. Energy-based anomaly detection for mixed data. Knowl Inf Syst 57, 413–435 (2018). https://doi.org/10.1007/s10115-018-1168-z
DOI: https://doi.org/10.1007/s10115-018-1168-z