Fuzzy-Based Feature and Instance Recovery

Liu, Shigang; Zhang, Jun; Wang, Yu; Xiang, Yang

doi:10.1007/978-3-662-49381-6_58

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9621)

Abstract

A severely skewed class distribution, in which some classes are under-represented, strongly affects the performance of learning algorithms and remains a challenge in data mining and machine learning. Much current research focuses on experimental comparison of existing re-sampling approaches. We believe that new ways of constructing better algorithms are needed to further balance and analyse the data set. This paper presents a Fuzzy-based Information Decomposition oversampling (FIDoS) algorithm for handling imbalanced data. Generally speaking, it is a new way of addressing imbalanced learning problems from a missing-data perspective. First, we assume that the imbalance arises because instances are missing from the minority class. Then the proposed algorithm, which takes advantage of a fuzzy membership function, is used to transfer information to the missing minority class instances. Finally, the experimental results demonstrate that the proposed algorithm is more practical and applicable than sampling techniques.
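The abstract does not specify the algorithm's details, so the following is only a minimal sketch of the general idea it describes: treat the shortfall in the minority class as missing instances and estimate their feature values by spreading information from the observed values through a fuzzy membership function. The triangular membership, the equal-width intervals, the per-feature weighted average, and all function names here are assumptions made for illustration, not the authors' FIDoS implementation.

```python
def triangular_membership(x, center, width):
    """Triangular fuzzy membership of x in a fuzzy set centred at `center`.

    Assumption: a triangular shape; the abstract does not say which
    membership function FIDoS uses.
    """
    if width <= 0:
        return 1.0 if x == center else 0.0
    return max(0.0, 1.0 - abs(x - center) / width)


def recover_feature(values, n_missing):
    """Estimate n_missing values of one feature from observed minority values."""
    if n_missing <= 0:
        return []
    lo, hi = min(values), max(values)
    mean = sum(values) / len(values)
    if hi == lo:
        # Constant feature: nothing to decompose, reuse the single value.
        return [mean] * n_missing
    step = (hi - lo) / n_missing  # one equal-width interval per missing value
    recovered = []
    for s in range(n_missing):
        center = lo + (s + 0.5) * step
        weights = [triangular_membership(v, center, step) for v in values]
        total = sum(weights)
        if total > 0:
            # Membership-weighted average of the observed values.
            recovered.append(sum(w * v for w, v in zip(weights, values)) / total)
        else:
            # No observed value contributes to this interval: fall back to the mean.
            recovered.append(mean)
    return recovered


def oversample_minority(minority, n_majority):
    """Return synthetic rows so that minority + synthetic has n_majority rows."""
    n_missing = n_majority - len(minority)
    if n_missing <= 0:
        return []
    n_features = len(minority[0])
    columns = [
        recover_feature([row[j] for row in minority], n_missing)
        for j in range(n_features)
    ]
    # Transpose the per-feature columns back into rows.
    return [list(row) for row in zip(*columns)]


minority = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
synthetic = oversample_minority(minority, n_majority=5)
print(synthetic)
```

Each feature is recovered independently in this sketch, so it ignores correlations between features; every recovered value is a weighted average of observed values and therefore stays inside the observed range of its feature.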

Author information

Authors and Affiliations

School of Information Technology, Deakin University, Melbourne, VIC, 3125, Australia
Shigang Liu, Jun Zhang, Yu Wang & Yang Xiang

Corresponding author

Correspondence to Shigang Liu.

Editor information

Editors and Affiliations

Wrocław University of Technology, Wrocław, Poland
Ngoc Thanh Nguyen
Wrocław University of Technology, Wrocław, Poland
Bogdan Trawiński
Iwate Prefectural University, Takizawa, Japan
Hamido Fujita
National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong

About this paper

Cite this paper

Liu, S., Zhang, J., Wang, Y., Xiang, Y. (2016). Fuzzy-Based Feature and Instance Recovery. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, TP. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49381-6_58

DOI: https://doi.org/10.1007/978-3-662-49381-6_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-49380-9
Online ISBN: 978-3-662-49381-6
eBook Packages: Computer Science, Computer Science (R0)
