Abstract
Current work on multi-label learning generally assumes that the given labels are noise-free. However, obtaining noise-free labels is difficult and often impractical. In this paper, we study how to identify the subset of relevant labels among the candidate labels annotated to each instance, and introduce a matrix factorization based method called MF-INL. MF-INL first decomposes the original instance-label association matrix into two low-rank matrices using nonnegative matrix factorization, with feature-based and label-based constraints that retain the geometric structure of instances and the correlations between labels. It then reconstructs the association matrix as the product of the decomposed matrices and identifies the associations with the lowest confidence as noisy. An empirical study on real-world multi-label datasets with injected noisy labels shows that MF-INL identifies noisy labels more accurately than related solutions and is robust to its input parameters. We also demonstrate empirically that both the feature-based and label-based constraints contribute to the performance of MF-INL.
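The factorize, reconstruct, and rank pipeline described in the abstract can be illustrated with a minimal sketch. It is not the MF-INL algorithm itself: it uses plain nonnegative matrix factorization with the standard multiplicative updates and omits the feature-based and label-based constraints, whose exact form the abstract does not give. The function names `nmf` and `flag_noisy_labels` and the parameters `rank`, `n_iter`, and `n_flag` are hypothetical, chosen only for illustration.

```python
import numpy as np


def nmf(Y, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, Y ~= U @ V.T, via multiplicative updates.

    Assumption: this is unconstrained NMF; the feature-based and
    label-based constraints of MF-INL are NOT included here.
    """
    rng = np.random.default_rng(seed)
    n, q = Y.shape
    # Strictly positive random initialisation of both factors.
    U = rng.random((n, rank)) + eps
    V = rng.random((q, rank)) + eps
    for _ in range(n_iter):
        # Updates that do not increase ||Y - U V^T||_F^2.
        U *= (Y @ V) / (U @ (V.T @ V) + eps)
        V *= (Y.T @ U) / (V @ (U.T @ U) + eps)
    return U, V


def flag_noisy_labels(Y, rank, n_flag, **nmf_kwargs):
    """Return the n_flag observed associations with lowest confidence.

    Y is a binary instance-label matrix (rows: instances, columns: labels).
    Confidence is taken to be the reconstructed value U @ V.T at each
    observed (nonzero) entry of Y.
    """
    U, V = nmf(Y, rank, **nmf_kwargs)
    Y_hat = U @ V.T                      # reconstructed association matrix
    rows, cols = np.nonzero(Y)           # only observed labels are candidates
    scores = Y_hat[rows, cols]
    lowest = np.argsort(scores)[:n_flag]
    flagged = [(int(rows[i]), int(cols[i])) for i in lowest]
    return flagged, Y_hat
```

For a binary matrix `Y`, calling `flag_noisy_labels(Y, rank=2, n_flag=1)` returns the (instance, label) pair whose observed label is least supported by the low-rank reconstruction. In this sketch the number of flagged associations is fixed by the caller, which is a simplification.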
Acknowledgments
This work is supported by the Natural Science Foundation of China (61741217 and 61402378), the Natural Science Foundation of CQ CSTC (cstc2016jcyjA0351), the Open Research Project of Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP-2017A05), and the Chongqing Graduate Student Research Innovation Project (No. CYS18089).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Chen, X., Yu, G., Domeniconi, C., Wang, J., Zhang, Z. (2018). Matrix Factorization for Identifying Noisy Labels of Multi-label Instances. In: Geng, X., Kang, BH. (eds) PRICAI 2018: Trends in Artificial Intelligence. PRICAI 2018. Lecture Notes in Computer Science, vol 11013. Springer, Cham. https://doi.org/10.1007/978-3-319-97310-4_58
DOI: https://doi.org/10.1007/978-3-319-97310-4_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97309-8
Online ISBN: 978-3-319-97310-4
eBook Packages: Computer Science, Computer Science (R0)