Dirichlet Process Mixture Models with Pairwise Constraints for Data Clustering

Li, Cheng; Rana, Santu; Phung, Dinh; Venkatesh, Svetha

doi:10.1007/s40745-016-0082-z

Published: 12 May 2016

Annals of Data Science, Volume 3, pages 205–223 (2016)

Abstract

The Dirichlet process mixture (DPM) model, a typical Bayesian nonparametric model, can infer the number of clusters automatically, which makes it well suited to data clustering. This paper investigates the influence of pairwise constraints on the DPM model. Pairwise constraints come in two types, must-link (ML) and cannot-link (CL), and each constraint indicates the relationship between two data points. We propose two models that incorporate pairwise constraints: the constrained DPM (C-DPM) and the constrained DPM with selected constraints (SC-DPM). C-DPM introduces the concept of a chunklet: ML constraints are compiled into chunklets, and CL constraints exist between chunklets. We derive a Gibbs sampler for C-DPM based on chunklets. We further propose a principled approach to selecting the most useful constraints, which are then incorporated into SC-DPM. We evaluate the proposed models on three real datasets: the 20 Newsgroups dataset, the NUS-WIDE image dataset, and a Facebook comments dataset that we collected ourselves. SC-DPM achieves superior clustering performance. In addition, SC-DPM can potentially be used for short-text clustering.
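The chunklet idea in the abstract can be illustrated with a small sketch. The full construction used in the paper is not visible in this preview, so the code below makes an assumption: a chunklet is taken to be a connected component of the must-link graph (the transitive closure of ML constraints), and a cannot-link between two points is lifted to a cannot-link between their chunklets. The function name `build_chunklets` and its arguments are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_chunklets(n_points, must_links, cannot_links):
    """Group points into chunklets and lift CL constraints to chunklet level.

    Assumption (not taken from the paper's text): a chunklet is a connected
    component of the must-link graph.
    """
    parent = list(range(n_points))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the two sides of every must-link; the smaller root wins,
    # so each root is the smallest point index in its chunklet.
    for a, b in must_links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    groups = defaultdict(list)
    for i in range(n_points):
        groups[find(i)].append(i)

    # Chunklets are ordered by their smallest member.
    chunklets = [groups[root] for root in sorted(groups)]
    chunk_of = {i: c for c, members in enumerate(chunklets) for i in members}

    # A cannot-link between points becomes a cannot-link between chunklets.
    chunklet_cl = set()
    for a, b in cannot_links:
        ca, cb = chunk_of[a], chunk_of[b]
        if ca == cb:
            raise ValueError(f"cannot-link ({a}, {b}) contradicts the must-links")
        chunklet_cl.add((min(ca, cb), max(ca, cb)))

    return chunklets, chunklet_cl


if __name__ == "__main__":
    chunklets, chunklet_cl = build_chunklets(
        n_points=6,
        must_links=[(0, 1), (1, 2), (3, 4)],
        cannot_links=[(2, 3), (0, 4), (5, 0)],
    )
    print(chunklets)            # [[0, 1, 2], [3, 4], [5]]
    print(sorted(chunklet_cl))  # [(0, 1), (0, 2)]
```

Under this assumption, a sampler can assign a cluster label to each chunklet rather than to each point, which satisfies every ML constraint by construction and leaves only the chunklet-level CL pairs to check during assignment.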

Notes

http://qwone.com/~jason/20Newsgroups/.

Author information

Authors and Affiliations

Centre for Pattern Recognition and Data Analytics, Deakin University, Geelong, Australia
Cheng Li, Santu Rana, Dinh Phung & Svetha Venkatesh

Corresponding author

Correspondence to Cheng Li.

About this article

Cite this article

Li, C., Rana, S., Phung, D. et al. Dirichlet Process Mixture Models with Pairwise Constraints for Data Clustering. Ann. Data. Sci. 3, 205–223 (2016). https://doi.org/10.1007/s40745-016-0082-z

Received: 28 March 2016
Revised: 04 May 2016
Accepted: 06 May 2016
Published: 12 May 2016
Issue Date: June 2016
DOI: https://doi.org/10.1007/s40745-016-0082-z
