Abstract
With the continued growth of the Internet of Things (IoT) and its convergence with the cloud, a large amount of interoperable software is being developed for the cloud, and there is a growing demand to maintain better software quality in the cloud for improved service. This is all the more crucial as the cloud environment moves quickly towards a hybrid model, a combination of public and private clouds. Given the high volume of software available as a service (SaaS) in the cloud, identifying non-standard software and measuring its quality is an urgent issue. Manual testing and quality assessment of software is very expensive and, to some extent, impossible to accomplish. An automated software defect detection model that is capable of measuring the relative quality of software and identifying its faulty components can significantly reduce software development effort and improve the cloud service. In this paper, we propose a software defect detection model that can be used to identify faulty components in big software metric data. The novelty of our proposed approach is that it identifies significant metrics using a combination of different filter and wrapper techniques. One important contribution of the proposed approach is that we designed and evaluated a parallel framework of a hybrid software defect predictor in order to deal with big software metric data in a computationally efficient way for the cloud environment. Two different hybrids have been developed using Fisher and Maximum Relevance (MR) filters with an Artificial Neural Network (ANN) based wrapper in the parallel framework. The evaluations are performed with real defect-prone software datasets for all parallel versions. Experimental results show that the proposed parallel hybrid framework achieves a significant computational speedup on a computer cluster, with higher defect prediction accuracy and a smaller number of software metrics than the independent filter or wrapper approaches.
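To make the filter stage concrete, the following is a minimal sketch of how software metrics might be ranked by Fisher score, with the per-metric scoring spread over a pool of workers. It is an illustration under stated assumptions, not the framework evaluated in the paper: the function names `fisher_score` and `rank_metrics`, the toy data, and the use of a thread pool (standing in for cluster-level parallelism) are all hypothetical, and the ANN-based wrapper stage is omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean, pvariance


def fisher_score(column, labels):
    """Fisher score of one metric: between-class scatter / within-class scatter."""
    overall_mean = fmean(column)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        # Values of this metric for the modules belonging to one class.
        values = [v for v, y in zip(column, labels) if y == cls]
        n = len(values)
        between += n * (fmean(values) - overall_mean) ** 2
        within += n * pvariance(values)
    # A metric with no within-class spread gets score 0 in this sketch.
    return between / within if within > 0 else 0.0


def rank_metrics(rows, labels, workers=4):
    """Return metric indices ordered from most to least discriminative."""
    columns = list(zip(*rows))  # one tuple of values per metric
    # Each metric is scored independently, so the work can be distributed.
    # Threads are only a stand-in here for workers on a cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda col: fisher_score(col, labels), columns))
    return sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: 4 modules, 2 metrics, label 1 = defective.
rows = [
    (10.0, 3.0),
    (12.0, 1.0),
    (30.0, 2.0),
    (32.0, 3.0),
]
labels = [0, 0, 1, 1]
ranking = rank_metrics(rows, labels)
```

In a hybrid scheme of the kind the abstract describes, a ranking like this would typically be used to guide or prune the candidate metric subsets that the wrapper then evaluates with a classifier.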
Acknowledgements
The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at King Saud University for its participation in funding this research group (RGP-1436-039).
Cite this article
Ali, M.M., Huda, S., Abawajy, J. et al. A parallel framework for software defect detection and metric selection on cloud computing. Cluster Comput 20, 2267–2281 (2017). https://doi.org/10.1007/s10586-017-0892-6
DOI: https://doi.org/10.1007/s10586-017-0892-6