Improving the Generalization Ability of an Artificial Neural Network in Predicting In-Flight Particle Characteristics of an Atmospheric Plasma Spray Process

Journal of Thermal Spray Technology

Abstract

This paper presents the application of an artificial neural network to an atmospheric plasma spray process for predicting the in-flight particle characteristics, which have a significant influence on the in-service coating properties. One of the major problems for such function-approximating neural networks is over-fitting, which reduces the generalization capability of a trained network and its ability to work with sufficient accuracy in a new environment. Two methods are used to analyze the improvement in the network's generalization ability: (i) cross-validation and early stopping, and (ii) Bayesian regularization. Simulations are performed on both the original and an expanded database, under different training conditions, to obtain the variations in performance of the trained networks across environments. The study further illustrates the design and optimization procedures and compares the predicted values with the experimental ones to evaluate the performance and generalization ability of the network. The simulation results show that the performance of networks trained with Bayesian regularization is improved over that obtained with cross-validation and early stopping; furthermore, the generalization capability of the networks is improved, thus preventing the phenomena associated with over-fitting.


References

  1. D.E. Rumelhart, Brain Style Computation: Learning and Generalization, An Introduction to Neural and Electronic Networks, Academic Press Professional, Inc., 1990, p 405-420

  2. C.J. Einerson, D.E. Clark, B.A. Detering, and P.L. Taylor, Intelligent Control Strategies for the Plasma Spray Process, Thermal Spray Coatings: Research, Design and Applications, Proceedings of the Sixth NTSC, June 1993 (Anaheim, CA), ASM International, Materials Park, OH, USA, 1993, p 205-211

  3. E. Pfender, Fundamental Studies Associated with the Plasma Spray Process, Surf. Coat. Technol., 1988, 34(1), p 1-14

  4. P. Fauchais and M. Vardelle, Plasma Spraying—Present and Future, Pure Appl. Chem., 1994, 66(6), p 1247-1258

  5. S. Guessasma, G. Montavon, and C. Coddet, On the Implementation of Neural Network Concept to Optimize Thermal Spray Deposition Process, Combinatorial and Artificial Intelligence Methods in Materials Science, I. Takeuchi, J.M. Newsam, L.T. Wille, H. Koinuma, and E.J. Amis, Ed., Materials Research Society, Warrendale, PA, 2002, p 253-258

  6. S. Guessasma, G. Montavon, P. Gougeon, and C. Coddet, Designing Expert System Using Neural Computation in View of the Control of Plasma Spray Processes, Mater. Des., 2003, 24(7), p 497-502

  7. S. Guessasma, G. Montavon, and C. Coddet, Neural Computation to Predict In-Flight Particle Characteristic Dependences from Processing Parameters in the APS Process, J. Therm. Spray Technol., 2004, 13(4), p 570-585

  8. P. Koistinen and L. Holmstrom, Kernel Regression and Backpropagation Training with Noise, IEEE, 1992, p 367-372

  9. E. Parzen, On Estimation of a Probability Density Function and Mode, Ann. Math. Stat., 1962, 33(3), p 1065

  10. M. Rosenblatt, Remarks on Some Nonparametric Estimates of a Density Function, Ann. Math. Stat., 1956, 27(3), p 832-837

  11. T. Cacoullos, Estimation of a Multivariate Density, Ann. Inst. Stat. Math., 1966, 18(2), p 179

  12. Z. Dongling, T. Yingjie, and Z. Peng, Kernel-Based Nonparametric Regression Method, Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT '08), 2008, p 410-413

  13. M.M. Nelson and W.T. Illingworth, A Practical Guide to Neural Nets, Addison-Wesley Publishing Company Inc., Reading, MA, 1991

  14. S.E. Fahlman, Faster-Learning Variations on Back Propagation: An Empirical Study, Proceedings of the 1988 Connectionist Models Summer School, 1988, p 38-51

  15. S. Guessasma, Z. Salhi, G. Montavon, P. Gougeon, and C. Coddet, Artificial Intelligence Implementation in the APS Process Diagnostic, Mater. Sci. Eng. B, 2004, 110(3), p 285-295

  16. D. Shanno, Recent Advances in Numerical Techniques for Large-Scale Optimization, Neural Networks for Control, 1995, p 171

  17. E. Barnard, Optimization for Training Neural Nets, IEEE Trans. Neural. Netw., 1992, 3(2), p 232-240

  18. C. Charalambous, Conjugate Gradient Algorithm for Efficient Training of Artificial Neural Networks, IEE Proc. G Circ. Dev. Syst., 1992, 139(3), p 301-310

  19. S. Kollias and D. Anastassiou, An Adaptive Least Squares Algorithm for the Efficient Training of Artificial Neural Networks, IEEE Trans. Circ. Syst., 1989, 36(8), p 1092-1101

  20. D. Marquardt, An Algorithm for Least-Squares Estimation of Nonlinear Parameters, J. Soc. Ind. Appl. Math., 1963, 11(2), p 431-441

  21. M.T. Hagan and M.B. Menhaj, Training Feedforward Networks with the Marquardt Algorithm, IEEE Trans. Neural Netw., 1994, 5(6), p 989-993

  22. A.J. Adeloye and A. De Munari, Artificial Neural Network Based Generalized Storage-Yield-Reliability Models Using the Levenberg-Marquardt Algorithm, J. Hydrol., 2006, 326(1-4), p 215-230

  23. D.J.C. Mackay, Bayesian Interpolation, Maximum Entropy and Bayesian Methods, C.R. Smith, G.J. Erickson, and P.O. Neudorfer, Ed., Kluwer Academic Publishers, Dordrecht, 1992, p 39-66

  24. J.-E. Döring, R. Vaßen, and D. Stöver, The Influence of Spray Parameters on Particle Properties, ITSC-International Thermal Spray Conference (DVS-ASM), 2002, p 440-445

  25. M. Vardelle and P. Fauchais, Plasma Spray Processes: Diagnostics and Control?, Pure Appl. Chem., 1999, 71(10), p 1909-1918

  26. C. Moreau, Towards a Better Control of Thermal Spray Processes, Thermal Spray: Meeting the Challenges of the 21st Century. Fifteenth International Thermal Spray Conference, 1998 (Nice, France), C. Coddet, Ed., 1998, 2, p 1681-1693

  27. B. Pateyron, M.-F. Elchinger, G. Delluc, and P. Fauchais, Thermodynamic and Transport Properties of Ar-H2 and Ar-He Plasma Gases Used for Spraying at Atmospheric Pressure. I: Properties of the Mixtures, Plasma Chem. Plasma Process., 1992, 12(4), p 421-448

  28. M.I. Boulos, P. Fauchais, A. Vardelle, and E. Pfender, Fundamentals of Plasma Particle Momentum and Heat Transfer, Plasma Spraying: Theory and Applications, World Scientific Publishing Co. Pte. Ltd., Singapore, 1993, p 3-57

  29. M. Friis, C. Persson, and J. Wigren, Influence of Particle In-Flight Characteristics on the Microstructure of Atmospheric Plasma Sprayed Yttria Stabilized ZrO2, Surf. Coat. Technol., 2001, 141(2-3), p 115-127

  30. C. Bossoutrot, F. Braillard, T. Renault, M. Vardelle, and P. Fauchais, Preliminary Studies of a Closed-Loop for a Feedback Control of Air Plasma Spray Processes, International Thermal Spray Conference (DVS-ASM), 2002, p 56-61

  31. I. Fisher, Variables Influencing the Characteristics of Plasma-Sprayed Coatings, Int. Metall. Rev., 1972, 17(1), p 117-129

Author information

Corresponding author

Correspondence to T. A. Choudhury.

Appendices

Appendix A1: Database (DSO)

Table 5

Appendix A2: Generalization Errors Generated by Networks Trained by the Levenberg-Marquardt Algorithm with the Datasets DSOTR and DSETR

Table 6

Appendix A3: Levenberg-Marquardt Algorithm

The Levenberg-Marquardt algorithm is an approximation to Newton's method and is designed to approach second-order training speed without computing the Hessian matrix directly. The approximation of the Hessian matrix and the error gradient are computed as per Eq 10 and 11.

$$ H = J^{\text{T}} J $$
(10)
$$ g = J^{\text{T}} e $$
(11)

J represents the Jacobian matrix, formed from the first derivatives of the network errors e on the training set with respect to the network's weights and biases, and can be calculated using the standard back-propagation technique (Ref 21). The Levenberg-Marquardt algorithm uses this approximation of the Hessian matrix to update and tune the parameters. If z_k represents the old parameter vector, then the new parameter vector, after calculation of the network errors, is given by Eq 12.

$$ z_{k + 1} = z_{k} - [J^{\text{T}} J + \mu I]^{ - 1} J^{\text{T}} e $$
(12)

The parameter μ is set to a specific value at the start of training. After each epoch, the performance function is computed. If the performance function is lower than in the previous epoch, the value of μ is decreased by a specific factor; if the performance function increases, μ is increased by a specific factor. Setting μ to zero reduces Eq 12 to Newton's method. The aim is to shift toward Newton's method rapidly, since it is faster and more accurate near the error minimum. A maximum value of μ is set before training; if μ reaches this maximum, the training stops, indicating that the network has failed to converge. The training is also stopped when the error gradient (Eq 11) falls below a specified value or when the goal set for the performance function is met.
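
As a concrete illustration, the following is a minimal sketch of the update of Eq 12 together with this μ-adaptation scheme. It is not the authors' implementation: the callables jacobian_fn, error_fn, and loss_fn, and the specific increase/decrease factors, are hypothetical placeholders chosen for illustration.

    import numpy as np

    def lm_step(z, jacobian_fn, error_fn, mu):
        # One Levenberg-Marquardt parameter update (Eq 12).
        J = jacobian_fn(z)                  # Jacobian of the network errors at z
        e = error_fn(z)                     # error vector at z
        H = J.T @ J                         # Gauss-Newton Hessian approximation (Eq 10)
        g = J.T @ e                         # error gradient (Eq 11)
        return z - np.linalg.solve(H + mu * np.eye(len(z)), g)

    def train_lm(z, jacobian_fn, error_fn, loss_fn, mu=1e-3,
                 mu_inc=10.0, mu_dec=0.1, mu_max=1e10, grad_tol=1e-7):
        # Adaptive-mu loop: shrink mu after a successful epoch (shifting
        # toward Newton's method), grow it after a failed one.
        loss = loss_fn(z)
        while mu < mu_max:                  # mu at its maximum => failed to converge
            z_trial = lm_step(z, jacobian_fn, error_fn, mu)
            trial_loss = loss_fn(z_trial)
            if trial_loss < loss:           # performance improved: accept step, decrease mu
                z, loss = z_trial, trial_loss
                mu *= mu_dec
            else:                           # performance worsened: reject step, increase mu
                mu *= mu_inc
            if np.linalg.norm(jacobian_fn(z).T @ error_fn(z)) < grad_tol:
                break                       # gradient below the set value: stop training
        return z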

The training set is used during the learning stage to calculate the error gradient (Eq 11) and update the network's weights and biases accordingly. The validation and test sets are never used to update the weights and biases. The network's error on the validation set is calculated and monitored during training. As training starts, the validation set error decreases along with the training set error; however, as training progresses, the validation error eventually starts to rise, and if it increases for a specific number of epochs (iterations), this indicates that the network has started to overfit. The training is then immediately stopped, and the values of the weights and biases at the minimum validation error are returned and saved. The test set is used only to evaluate the performance of the trained network. In order to achieve statistically significant results, several independent data splits must be performed, followed by lengthy training runs.
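
The early-stopping procedure described above can be summarized by the sketch below. It is illustrative only: the net object and its train_one_epoch, evaluate, get_weights, and set_weights methods are assumed interfaces, not an actual library API.

    def train_with_early_stopping(net, train_set, val_set, patience=10, max_epochs=1000):
        # Stop when the validation error has not improved for `patience`
        # consecutive epochs; return the network restored to the weights
        # recorded at the minimum validation error.
        best_val = float("inf")
        best_weights = net.get_weights()
        strikes = 0
        for epoch in range(max_epochs):
            net.train_one_epoch(train_set)   # weights updated on the training set only
            val_err = net.evaluate(val_set)  # validation set is monitored, never trained on
            if val_err < best_val:
                best_val, best_weights, strikes = val_err, net.get_weights(), 0
            else:
                strikes += 1                 # validation error rising: possible over-fitting
                if strikes >= patience:
                    break
        net.set_weights(best_weights)        # restore the minimum-validation-error weights
        return net                           # evaluate on the test set afterward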

About this article

Cite this article

Choudhury, T.A., Hosseinzadeh, N. & Berndt, C.C. Improving the Generalization Ability of an Artificial Neural Network in Predicting In-Flight Particle Characteristics of an Atmospheric Plasma Spray Process. J Therm Spray Tech 21, 935–949 (2012). https://doi.org/10.1007/s11666-012-9775-9
