Improving the Generalization Ability of an Artificial Neural Network in Predicting In-Flight Particle Characteristics of an Atmospheric Plasma Spray Process

Journal of Thermal Spray Technology

Abstract

This paper presents the application of an artificial neural network to an atmospheric plasma spray process for predicting the in-flight particle characteristics, which have a significant influence on the in-service coating properties. One of the major problems for such function-approximating neural networks is over-fitting, which reduces the generalization capability of a trained network and its ability to work with sufficient accuracy in a new environment. Two methods are used to analyze the improvement in the network's generalization ability: (i) cross-validation and early stopping, and (ii) Bayesian regularization. Simulations are performed on both the original and an expanded database, under different training conditions, to obtain the variations in performance of the trained networks across environments. The study further illustrates the design and optimization procedures and compares the predicted values with the experimental ones to evaluate the performance and generalization ability of the network. The simulation results show that the performance of networks trained with Bayesian regularization is improved over that obtained with cross-validation and early stopping; furthermore, the generalization capability of the networks is improved, thus preventing the phenomena associated with over-fitting.


References

  1. D.E. Rumelhart, Brain Style Computation: Learning and Generalization, An Introduction to Neural and Electronic Networks, Academic Press Professional, Inc., 1990, p 405-420

  2. C.J. Einerson, D.E. Clark, B.A. Detering, and P.L. Taylor, Intelligent Control Strategies for the Plasma Spray Process, Thermal Spray Coatings: Research, Design and Applications, Proceedings of the Sixth NTSC, June 1993 (Anaheim, CA), ASM International, Materials Park, OH, USA, 1993, p 205-211

  3. E. Pfender, Fundamental Studies Associated with the Plasma Spray Process, Surf. Coat. Technol., 1988, 34(1), p 1-14

  4. P. Fauchais and M. Vardelle, Plasma Spraying—Present and Future, Pure Appl. Chem., 1994, 66(6), p 1247-1258

  5. S. Guessasma, G. Montavon, and C. Coddet, On the Implementation of Neural Network Concept to Optimize Thermal Spray Deposition Process, Combinatorial and Artificial Intelligence Methods in Materials Science, I. Takeuchi, J.M. Newsam, L.T. Wille, H. Koinuma, and E.J. Amis, Ed., Materials Research Society, Warrendale, PA, 2002, p 253-258

  6. S. Guessasma, G. Montavon, P. Gougeon, and C. Coddet, Designing Expert System Using Neural Computation in View of the Control of Plasma Spray Processes, Mater. Des., 2003, 24(7), p 497-502

  7. S. Guessasma, G. Montavon, and C. Coddet, Neural Computation to Predict In-Flight Particle Characteristic Dependences from Processing Parameters in the APS Process, J. Therm. Spray Technol., 2004, 13(4), p 570-585

  8. P. Koistinen and L. Holmstrom, Kernel Regression and Backpropagation Training with Noise, IEEE, 1992, p 367-372

  9. E. Parzen, On Estimation of a Probability Density Function and Mode, Ann. Math. Stat., 1962, 33(3), p 1065

  10. M. Rosenblatt, Remarks on Some Nonparametric Estimates of a Density Function, Ann. Math. Stat., 1956, 27(3), p 832-837

  11. T. Cacoullos, Estimation of a Multivariate Density, Ann. Inst. Stat. Math., 1966, 18(2), p 179

  12. Z. Dongling, T. Yingjie, and Z. Peng, Kernel-Based Nonparametric Regression Method, Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT '08), 2008, p 410-413

  13. M.M. Nelson and W.T. Illingworth, A Practical Guide to Neural Nets, Addison-Wesley Publishing Company Inc., Reading, MA, 1991

  14. S.E. Fahlman, Faster-Learning Variations on Back Propagation: An Empirical Study, Proceedings of the 1988 Connectionist Models Summer School, 1988, p 38-51

  15. S. Guessasma, Z. Salhi, G. Montavon, P. Gougeon, and C. Coddet, Artificial Intelligence Implementation in the APS Process Diagnostic, Mater. Sci. Eng. B, 2004, 110(3), p 285-295

  16. D. Shanno, Recent Advances in Numerical Techniques for Large-Scale Optimization, Neural Networks for Control, 1995, p 171

  17. E. Barnard, Optimization for Training Neural Nets, IEEE Trans. Neural. Netw., 1992, 3(2), p 232-240

  18. C. Charalambous, Conjugate Gradient Algorithm for Efficient Training of Artificial Neural Networks, IEE Proc. G Circ. Dev. Syst., 1992, 139(3), p 301-310

  19. S. Kollias and D. Anastassiou, An Adaptive Least Squares Algorithm for the Efficient Training of Artificial Neural Networks, IEEE Trans. Circ. Syst., 1989, 36(8), p 1092-1101

  20. D. Marquardt, An Algorithm for Least-Squares Estimation of Nonlinear Parameters, J. Soc. Ind. Appl. Math., 1963, 11(2), p 431-441

  21. M.T. Hagan and M.B. Menhaj, Training Feedforward Networks with the Marquardt Algorithm, IEEE Trans. Neural Netw., 1994, 5(6), p 989-993

  22. A.J. Adeloye and A. De Munari, Artificial Neural Network Based Generalized Storage-Yield-Reliability Models Using the Levenberg-Marquardt Algorithm, J. Hydrol., 2006, 326(1-4), p 215-230

  23. D.J.C. Mackay, Bayesian Interpolation, Maximum Entropy and Bayesian Methods, C.R. Smith, G.J. Erickson, and P.O. Neudorfer, Ed., Kluwer Academic Publishers, Dordrecht, 1992, p 39-66

  24. J.-E. Döring, R. Vaßen, and D. Stöver, The Influence of Spray Parameters on Particle Properties, ITSC-International Thermal Spray Conference (DVS-ASM), 2002, p 440-445

  25. M. Vardelle and P. Fauchais, Plasma Spray Processes: Diagnostics and Control?, Pure Appl. Chem., 1999, 71(10), p 1909-1918

  26. C. Moreau, Towards a Better Control of Thermal Spray Processes, Thermal Spray: Meeting the Challenges of the 21st Century. Fifteenth International Thermal Spray Conference, 1998 (Nice, France), C. Coddet, Ed., 1998, 2, p 1681-1693

  27. B. Pateyron, M.-F. Elchinger, G. Delluc, and P. Fauchais, Thermodynamic and Transport Properties of Ar-H2 and Ar-He Plasma Gases Used for Spraying at Atmospheric Pressure. I: Properties of the Mixtures, Plasma Chem. Plasma Process., 1992, 12(4), p 421-448

  28. M.I. Boulos, P. Fauchais, A. Vardelle, and E. Pfender, Fundamentals of Plasma Particle Momentum and Heat Transfer, Plasma Spraying: Theory and Applications, World Scientific Publishing Co. Pte. Ltd., Singapore, 1993, p 3-57

  29. M. Friis, C. Persson, and J. Wigren, Influence of Particle In-Flight Characteristics on the Microstructure of Atmospheric Plasma Sprayed Yttria Stabilized ZrO2, Surf. Coat. Technol., 2001, 141(2-3), p 115-127

  30. C. Bossoutrot, F. Braillard, T. Renault, M. Vardelle, and P. Fauchais, Preliminary Studies of a Closed-Loop for a Feedback Control of Air Plasma Spray Processes, International Thermal Spray Conference (DVS-ASM), 2002, p 56-61

  31. I. Fisher, Variables Influencing the Characteristics of Plasma-Sprayed Coatings, Int. Metall. Rev., 1972, 17(1), p 117-129

Author information

Corresponding author

Correspondence to T. A. Choudhury.

Appendices

Appendix A1: Database (DSO)

Table 5

Appendix A2: Generalization Errors Generated by Networks Trained by the Levenberg-Marquardt Algorithm with the Datasets DSOTR and DSETR

Table 6

Appendix A3: Levenberg-Marquardt Algorithm

The Levenberg-Marquardt algorithm is an approximation to Newton's method and is designed to approach second-order training speed without computing the Hessian matrix directly. The approximation of the Hessian matrix and the error gradient are computed as per Eq 10 and 11.

$$ H = J^{\text{T}} J $$
(10)
$$ g = J^{\text{T}} e $$
(11)

J represents the Jacobian matrix, formed from the first derivatives of the network errors e on the training set with respect to the network's weights and biases, and can be calculated using the standard back-propagation technique (Ref 21). The Levenberg-Marquardt algorithm uses this approximation of the Hessian matrix to update and tune the parameters. If z_k represents the old parameter vector, then the new parameter vector, after calculation of the network errors, is given by Eq 12.

$$ z_{k + 1} = z_{k} - [J^{\text{T}} J + \mu I]^{ - 1} J^{\text{T}} e $$
(12)

The parameter μ is set to a specific value at the start of training. After each epoch, the performance function is computed. If the performance function is lower than in the previous epoch, the value of μ is decreased by a specific factor; if the performance function increases, μ is increased by a specific factor. Setting μ to zero reduces Eq 12 to Newton's method. The aim is to shift toward Newton's method rapidly, since it is faster and more accurate near the error minimum. A maximum value of μ is set before training; if μ reaches this maximum, the training stops, indicating that the network has failed to converge. The training is also stopped when the error gradient (Eq 11) falls below a specified value or when the goal set for the performance function is met.
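
As a concrete illustration, the following is a minimal sketch of the update of Eq 12 together with this μ-adaptation scheme. It is not the authors' implementation: the callables jacobian_fn, error_fn, and loss_fn, and the specific increase/decrease factors, are hypothetical placeholders chosen for illustration.

    import numpy as np

    def lm_step(z, jacobian_fn, error_fn, mu):
        # One Levenberg-Marquardt parameter update (Eq 12).
        J = jacobian_fn(z)                  # Jacobian of the network errors at z
        e = error_fn(z)                     # error vector at z
        H = J.T @ J                         # Gauss-Newton Hessian approximation (Eq 10)
        g = J.T @ e                         # error gradient (Eq 11)
        return z - np.linalg.solve(H + mu * np.eye(len(z)), g)

    def train_lm(z, jacobian_fn, error_fn, loss_fn, mu=1e-3,
                 mu_inc=10.0, mu_dec=0.1, mu_max=1e10, grad_tol=1e-7):
        # Adaptive-mu loop: shrink mu after a successful epoch (shifting
        # toward Newton's method), grow it after a failed one.
        loss = loss_fn(z)
        while mu < mu_max:                  # mu at its maximum => failed to converge
            z_trial = lm_step(z, jacobian_fn, error_fn, mu)
            trial_loss = loss_fn(z_trial)
            if trial_loss < loss:           # performance improved: accept step, decrease mu
                z, loss = z_trial, trial_loss
                mu *= mu_dec
            else:                           # performance worsened: reject step, increase mu
                mu *= mu_inc
            if np.linalg.norm(jacobian_fn(z).T @ error_fn(z)) < grad_tol:
                break                       # gradient below the set value: stop training
        return z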

The training set is used during the learning stage to calculate the error gradient (Eq 11) and update the network's weights and biases accordingly. The validation and test sets are never used to update the weights and biases. The network's error on the validation set is calculated and monitored during training. As training starts, the validation set error decreases along with the training set error; however, as training progresses, the validation error eventually starts to rise, and if it increases for a specific number of epochs (iterations), this indicates that the network has started to overfit. The training is then immediately stopped, and the values of the weights and biases at the minimum validation error are returned and saved. The test set is used only to evaluate the performance of the trained network. In order to achieve statistically significant results, several independent data splits must be performed, followed by lengthy training runs.
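
The early-stopping procedure described above can be summarized by the sketch below. It is illustrative only: the net object and its train_one_epoch, evaluate, get_weights, and set_weights methods are assumed interfaces, not an actual library API.

    def train_with_early_stopping(net, train_set, val_set, patience=10, max_epochs=1000):
        # Stop when the validation error has not improved for `patience`
        # consecutive epochs; return the network restored to the weights
        # recorded at the minimum validation error.
        best_val = float("inf")
        best_weights = net.get_weights()
        strikes = 0
        for epoch in range(max_epochs):
            net.train_one_epoch(train_set)   # weights updated on the training set only
            val_err = net.evaluate(val_set)  # validation set is monitored, never trained on
            if val_err < best_val:
                best_val, best_weights, strikes = val_err, net.get_weights(), 0
            else:
                strikes += 1                 # validation error rising: possible over-fitting
                if strikes >= patience:
                    break
        net.set_weights(best_weights)        # restore the minimum-validation-error weights
        return net                           # evaluate on the test set afterward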

About this article

Cite this article

Choudhury, T.A., Hosseinzadeh, N. & Berndt, C.C. Improving the Generalization Ability of an Artificial Neural Network in Predicting In-Flight Particle Characteristics of an Atmospheric Plasma Spray Process. J Therm Spray Tech 21, 935–949 (2012). https://doi.org/10.1007/s11666-012-9775-9
