
A parallel framework for software defect detection and metric selection on cloud computing

Cluster Computing

Abstract

With the continued growth of the Internet of Things (IoT) and its convergence with the cloud, numerous interoperable software systems are being developed for the cloud. There is therefore a growing demand to maintain higher software quality in the cloud for improved service. This is all the more crucial as the cloud environment moves quickly towards a hybrid model, a combination of public and private clouds. Given the high volume of software as a service (SaaS) available in the cloud, identifying non-standard software and measuring its quality is an urgent issue. Manually testing software and determining its quality is very expensive and, in some cases, infeasible. An automated software defect detection model that can measure the relative quality of software and identify its faulty components can significantly reduce software development effort and improve cloud services. In this paper, we propose a software defect detection model that can be used to identify faulty components in big software metric data. The novelty of the proposed approach is that it identifies significant metrics using a combination of different filter and wrapper techniques. An important contribution is the design and evaluation of a parallel framework for a hybrid software defect predictor, which handles big software metric data in a computationally efficient way in a cloud environment. Two different hybrids have been developed using Fisher and Maximum Relevance (MR) filters with an Artificial Neural Network (ANN) based wrapper in the parallel framework. All parallel versions are evaluated on real defect-prone software datasets. Experimental results show that the proposed parallel hybrid framework achieves a significant computational speedup on a computer cluster, with higher defect prediction accuracy and fewer software metrics than the independent filter or wrapper approaches.
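The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a filter and a wrapper could be combined and evaluated in parallel. It rests on several assumptions that are not taken from the paper: the filter uses one common two-class form of the Fisher score, a nearest-centroid classifier stands in for the ANN wrapper, a thread pool stands in for the nodes of a computer cluster, and the search simply tries nested top-k subsets of the ranked metrics. All function names and the toy data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score of one metric: squared mean gap over summed class variances."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (fmean(pos) - fmean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread > 0:
        return gap / spread
    return float("inf") if gap > 0 else 0.0


def centroid_accuracy(rows, labels, subset):
    """Stand-in wrapper score (assumption): training accuracy of a nearest-centroid classifier."""
    def centroid(cls):
        members = [r for r, y in zip(rows, labels) if y == cls]
        return [fmean(r[j] for r in members) for j in subset]

    c0, c1 = centroid(0), centroid(1)

    def predict(row):
        d0 = sum((row[j] - c) ** 2 for j, c in zip(subset, c0))
        d1 = sum((row[j] - c) ** 2 for j, c in zip(subset, c1))
        return 1 if d1 < d0 else 0

    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def hybrid_select(rows, labels, workers=4):
    """Filter step ranks metrics; wrapper step scores nested top-k subsets concurrently."""
    n_metrics = len(rows[0])
    columns = [[r[j] for r in rows] for j in range(n_metrics)]
    ranked = sorted(range(n_metrics),
                    key=lambda j: fisher_score(columns[j], labels),
                    reverse=True)
    candidates = [ranked[:k] for k in range(1, n_metrics + 1)]
    # Threads only illustrate the structure; a cluster would distribute these evaluations.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda s: centroid_accuracy(rows, labels, s), candidates))
    # Prefer the highest accuracy, breaking ties in favour of fewer metrics.
    best = max(range(len(candidates)), key=lambda i: (scores[i], -len(candidates[i])))
    return candidates[best], scores[best]


# Toy data (hypothetical): metric 0 separates the classes, metrics 1 and 2 are noise.
ROWS = [
    [1.0, 5.0, 10.0], [1.2, 3.0, 90.0], [0.9, 4.0, 50.0],
    [3.0, 4.5, 20.0], [3.1, 3.5, 80.0], [2.9, 4.0, 40.0],
]
LABELS = [0, 0, 0, 1, 1, 1]  # 1 = defective module, 0 = clean module

if __name__ == "__main__":
    selected, accuracy = hybrid_select(ROWS, LABELS)
    print("selected metrics:", selected, "accuracy:", accuracy)
```

On this toy data the sketch keeps only the first metric, since adding the noisy metrics lowers the stand-in classifier's accuracy. Because each candidate subset is scored independently, the wrapper step is the natural place to distribute work across processes or machines.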



Acknowledgements

The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at King Saud University for its participation in funding this research group (RGP-1436-039).

Author information


Corresponding author

Correspondence to Shamsul Huda.


Cite this article

Ali, M.M., Huda, S., Abawajy, J. et al. A parallel framework for software defect detection and metric selection on cloud computing. Cluster Comput 20, 2267–2281 (2017). https://doi.org/10.1007/s10586-017-0892-6


  • DOI: https://doi.org/10.1007/s10586-017-0892-6
