A parallel framework for software defect detection and metric selection on cloud computing

Ali, Md Mohsin; Huda, Shamsul; Abawajy, Jemal; Alyahya, Sultan; Al-Dossari, Hmood; Yearwood, John

doi:10.1007/s10586-017-0892-6

Published: 24 May 2017

Volume 20, pages 2267–2281 (2017)

Cluster Computing

Md Mohsin Ali¹,
Shamsul Huda²,
Jemal Abawajy²,
Sultan Alyahya³,
Hmood Al-Dossari³ &
John Yearwood²

Abstract

With the continued growth of the Internet of Things (IoT) and its convergence with the cloud, numerous interoperable software systems are being developed for the cloud. Consequently, there is a growing demand to maintain high software quality in the cloud for improved service. This is all the more crucial as the cloud environment is moving quickly towards a hybrid model, a combination of public and private cloud models. Given the high volume of software available as a service (SaaS) in the cloud, identifying non-standard software and measuring its quality is an urgent issue. Manual testing and quality assessment of this software is very expensive and, to some extent, impossible to accomplish. An automated software defect detection model that can measure the relative quality of software and identify its faulty components can significantly reduce software development effort and improve the cloud service. In this paper, we propose a software defect detection model that can be used to identify faulty components in big software metric data. The novelty of our proposed approach is that it can identify significant metrics using a combination of different filter and wrapper techniques. An important contribution of the proposed approach is that we designed and evaluated a parallel framework of a hybrid software defect predictor in order to deal with big software metric data in a computationally efficient way in a cloud environment. Two different hybrids have been developed using Fisher and Maximum Relevance (MR) filters with an Artificial Neural Network (ANN) based wrapper in the parallel framework. The evaluations are performed with real defect-prone software datasets for all parallel versions. Experimental results show that the proposed parallel hybrid framework achieves a significant computational speedup on a computer cluster, with higher defect prediction accuracy and a smaller number of software metrics than the independent filter or wrapper approaches.
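The filter stage mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact formulation, so the following assumes the common two-class Fisher score, (mean of defective minus mean of clean) squared divided by the sum of the two class variances, and uses a local thread pool only as a stand-in for the cluster-level parallelism the paper describes. The function names, the metric names and the toy values are hypothetical and are not taken from the paper or its datasets.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score of one metric (assumed textbook form)."""
    defective = [v for v, y in zip(values, labels) if y == 1]
    clean = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(defective) - mean(clean)) ** 2
    spread = pvariance(defective) + pvariance(clean)
    if spread == 0:
        # No within-class variation: perfectly separating if the means differ.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_metrics(columns, labels, workers=4):
    """Score every metric independently and sort from most to least relevant."""
    names = list(columns)
    # Each metric is scored on its own, so the work splits cleanly across workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda n: fisher_score(columns[n], labels), names))
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)


# Made-up toy values for illustration only (1 = defective module, 0 = clean).
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "loc": [10, 12, 11, 40, 42, 41],
    "comments": [5, 6, 7, 5, 6, 7],
}
ranked = rank_metrics(columns, labels)
for name, score in ranked:
    print(name, score)
```

In a hybrid scheme of the kind the abstract outlines, a ranking like this would only narrow the candidate metrics; a wrapper (an ANN in the paper) would then evaluate subsets of the top-ranked metrics by prediction accuracy. That second stage is not shown here.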

Acknowledgements

The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at King Saud University for its participation in funding this research group (RGP-1436-039).

Author information

Authors and Affiliations

The Australian National University, Canberra, Australia
Md Mohsin Ali
Deakin University, Melbourne, Australia
Shamsul Huda, Jemal Abawajy & John Yearwood
King Saud University, Riyadh, Saudi Arabia
Sultan Alyahya & Hmood Al-Dossari

Corresponding author

Correspondence to Shamsul Huda.

About this article

Cite this article

Ali, M.M., Huda, S., Abawajy, J. et al. A parallel framework for software defect detection and metric selection on cloud computing. Cluster Comput 20, 2267–2281 (2017). https://doi.org/10.1007/s10586-017-0892-6

Received: 23 October 2016
Revised: 13 March 2017
Accepted: 27 April 2017
Published: 24 May 2017
Issue Date: September 2017
DOI: https://doi.org/10.1007/s10586-017-0892-6
