Fuzzy-Based Feature and Instance Recovery

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9621)

Abstract

A severely skewed class distribution leaves some classes under-represented, which strongly affects the performance of learning algorithms and remains a challenge in data mining and machine learning. Much current research focuses on experimental comparison of existing re-sampling approaches. We believe that new ways of constructing better algorithms are needed to further balance and analyse data sets. This paper presents a Fuzzy-based Information Decomposition oversampling (FIDoS) algorithm for handling imbalanced data. It addresses imbalanced learning problems from a missing-data perspective. First, we assume that the data set is imbalanced because instances are missing from the minority class. Then the proposed algorithm, which takes advantage of a fuzzy membership function, transfers information to the missing minority class instances. Finally, the experimental results demonstrate that the proposed algorithm is more practical and applicable than existing sampling techniques.
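The abstract does not give the algorithm's details, so the following is only a rough sketch of the general idea, not the FIDoS algorithm as published. It assumes a triangular membership function, equal-width intervals over each feature's observed range, and independent treatment of features. All function names (`triangular_membership`, `recover_feature`, `oversample_minority`) are hypothetical and chosen for illustration.

```python
def triangular_membership(x, center, width):
    """Triangular fuzzy membership of x in a fuzzy set centred at `center`.

    Assumption: the paper's actual membership function may differ.
    """
    if width == 0:
        return 1.0 if x == center else 0.0
    return max(0.0, 1.0 - abs(x - center) / width)


def recover_feature(values, n_missing):
    """Estimate n_missing values for one feature of the minority class.

    The observed range is split into n_missing equal intervals; each
    "missing" value is the membership-weighted average of the observed
    values, with weights taken relative to the interval centre.
    """
    lo, hi = min(values), max(values)
    mean = sum(values) / len(values)
    if hi == lo:
        return [lo] * n_missing
    step = (hi - lo) / n_missing
    centers = [lo + (k + 0.5) * step for k in range(n_missing)]
    recovered = []
    for c in centers:
        weights = [triangular_membership(v, c, step) for v in values]
        total = sum(weights)
        if total == 0:
            # No observed value informs this interval: fall back to the mean.
            recovered.append(mean)
        else:
            weighted = sum(w * v for w, v in zip(weights, values))
            recovered.append(weighted / total)
    return recovered


def oversample_minority(minority, n_majority):
    """Return the minority rows plus enough recovered rows to match n_majority."""
    n_missing = n_majority - len(minority)
    if n_missing <= 0:
        return [list(row) for row in minority]
    columns = [list(col) for col in zip(*minority)]
    new_columns = [recover_feature(col, n_missing) for col in columns]
    synthetic = [list(row) for row in zip(*new_columns)]
    return [list(row) for row in minority] + synthetic


if __name__ == "__main__":
    minority_rows = [[1.0, 10.0], [2.0, 12.0], [4.0, 11.0]]
    for row in oversample_minority(minority_rows, n_majority=6):
        print(row)
```

Because each recovered value is a weighted average of observed values (or their mean), it always stays inside the observed range of its feature. Pairing the k-th recovered value of every feature into one row is a simplification of this sketch and ignores dependencies between features.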

Author information

Corresponding author

Correspondence to Shigang Liu.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, S., Zhang, J., Wang, Y., Xiang, Y. (2016). Fuzzy-Based Feature and Instance Recovery. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, TP. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49381-6_58

  • DOI: https://doi.org/10.1007/978-3-662-49381-6_58

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-49380-9

  • Online ISBN: 978-3-662-49381-6

  • eBook Packages: Computer Science, Computer Science (R0)
