Multiclass Lung Cancer Diagnosis by Gene Expression Programming and Microarray Datasets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10604)

Abstract

Lung cancer occurs in several types, which can be differentiated by cell size and growth pattern, and each type is treated differently. Classifying the types of lung cancer therefore helps to determine the specific treatment and so to decrease fatality rates. In this paper, we broaden the analysis of lung cancer by using gene expression data, binary decomposition strategies and the Gene Expression Programming (GEP) technique, aiming at better classification performance. Classification performance was assessed and compared between our GEP models and three representative machine learning techniques, support vector machine (SVM), neural network (NNW) and C4.5, on real microarray lung tumor datasets. Reliability was assessed by cross-data set validation. The evaluation results demonstrate that our technique can achieve better classification performance in terms of accuracy, standard deviation and area under the receiver operating characteristic curve (AUC). The technique proposed in this paper provides a helpful tool for lung cancer classification.
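The abstract does not say which binary decomposition strategy the authors use, so the following is only a minimal sketch of one common choice, one-vs-rest, in which a separate binary model is trained for each class against all others and the highest-scoring model decides the prediction. The centroid-based scorer is a hypothetical stand-in for a binary learner (it is not GEP), and the class labels and two-feature expression values are made up for illustration.

```python
from typing import Callable, Dict, List, Sequence

Sample = Sequence[float]
Scorer = Callable[[Sample], float]


def train_centroid_scorer(X: List[Sample], y: List[int]) -> Scorer:
    """Hypothetical stand-in binary learner (NOT GEP).

    Returns a function that scores a sample higher the closer it lies to the
    positive-class mean relative to the negative-class mean.
    """
    def mean(rows: List[Sample]) -> List[float]:
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a: Sample, b: Sample) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    pos = mean([x for x, label in zip(X, y) if label == 1])
    neg = mean([x for x, label in zip(X, y) if label == 0])
    return lambda x: sq_dist(x, neg) - sq_dist(x, pos)


def fit_one_vs_rest(
    X: List[Sample],
    labels: List[str],
    train_binary: Callable[[List[Sample], List[int]], Scorer] = train_centroid_scorer,
) -> Dict[str, Scorer]:
    """Train one binary model per class: that class (1) versus all others (0)."""
    models: Dict[str, Scorer] = {}
    for cls in sorted(set(labels)):
        binary_targets = [1 if label == cls else 0 for label in labels]
        models[cls] = train_binary(X, binary_targets)
    return models


def predict_one_vs_rest(models: Dict[str, Scorer], x: Sample) -> str:
    """Pick the class whose binary model gives the highest score."""
    return max(models, key=lambda cls: models[cls](x))


# Made-up toy data: two hypothetical expression features, three hypothetical classes.
X = [[5.0, 1.0], [4.8, 1.2], [1.0, 5.0], [1.2, 4.9], [3.0, 3.0], [3.1, 2.9]]
labels = ["type_a", "type_a", "type_b", "type_b", "type_c", "type_c"]

models = fit_one_vs_rest(X, labels)
print(predict_one_vs_rest(models, [4.9, 1.1]))
```

Any binary learner with the same signature can be passed as `train_binary`, which is the point of the decomposition: the multiclass problem is reduced to several independent two-class problems.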

Author information

Corresponding author

Correspondence to Hasseeb Azzawi.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Azzawi, H., Hou, J., Alanni, R., Xiang, Y., Abdu-Aljabar, R., Azzawi, A. (2017). Multiclass Lung Cancer Diagnosis by Gene Expression Programming and Microarray Datasets. In: Cong, G., Peng, WC., Zhang, W., Li, C., Sun, A. (eds) Advanced Data Mining and Applications. ADMA 2017. Lecture Notes in Computer Science, vol 10604. Springer, Cham. https://doi.org/10.1007/978-3-319-69179-4_38

  • DOI: https://doi.org/10.1007/978-3-319-69179-4_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69178-7

  • Online ISBN: 978-3-319-69179-4

  • eBook Packages: Computer Science (R0)
