Abstract

By combining particle swarm optimization (PSO) and genetic algorithms (GA), this paper offers an innovative algorithm for training artificial neural networks (ANNs) to calculate the experimental growth parameters of CNTs. The paper explores the use of experimentally obtained data to train ANNs, as a method to reduce simulation time while preserving the precision of formal physics models. The results are compared with conventional particle swarm optimization based neural network (CPSONN) and Levenberg–Marquardt (LM) techniques. The results show that PSOGANN can be successfully utilized for modeling the experimental parameters that are critical for the growth of CNTs.

1. Introduction

Increasing requirements for high manufacturing efficiency, such as low throughput time, better product quality, and cheaper finished parts, are still driving equipment manufacturers and the fabrication industries in their search for new technologies. These requirements drive the industry to search for smaller transistors, where size reductions result in higher clock frequencies and lower power dissipation. CNTs show promise in satisfying the need for smaller transistors as a result of their physical and electrical properties [1, 2]. The mechanisms involved in growing CNTs are often complex, with numerous experimental parameters that require precise control, and the growth rate is often very slow. Moreover, the growth process involves precursor materials, hydrocarbons, carrier gases, expensive equipment, and a high thermal budget. A simulator platform that allows the growth parameters to be optimized before conducting trial-and-error experiments would save considerable time and money. Therefore, in this paper, we present a novel algorithm combining particle swarm optimization (PSO) and a genetic algorithm (GA) that predicts the experimental growth results of CNTs and makes parameter optimization less cumbersome. Ton et al. [3] have presented a numerical piecewise nonlinear approximation of the nonequilibrium mobile charge density to be used in the modeling of CNT transistors. Similarly, Yamacli and Avci [4] have developed a parameterisable model of CNTFET nanoelectronics. Recent advances in ANNs have made it possible to model and simulate the behaviour of CNTs. Through the imitation of biological systems and the analysis of brain structures, these ANNs have achieved human-like performance [3–6].

However, ANNs used for pattern classification and optimization problems often suffer from issues such as finding an appropriate architecture that delivers satisfactory modeling performance [7]. Considerable research has been conducted on new architectures and learning algorithms for neural networks to achieve this objective, such as modular neural networks [8], hybrid neural networks [9], and evolutionary algorithms and evolutionary programming [10].

Seeking improved performance over conventional neural networks, researchers [11, 12] have turned to GA. Three evolutionary operations are required to implement a GA: selection, crossover, and mutation. It has been found experimentally that, with large training samples, the convergence speed of the GA is significantly reduced [13]. Furthermore, applying the crossover operation of GA to neural networks can result in what is known as the “permutation problem.” Consequently, employing GA is seen as a generally complicated process. Recently, other evolutionary techniques such as PSO have been applied in other branches of engineering [14, 15]. By contrast, the PSO algorithm does not have two of the evolutionary operators of GA (crossover and mutation). This reduction in parameters yields faster convergence and an easier implementation [16]. Accordingly, PSO is suitable for dynamic problems or problems that change rapidly over time [17].

The superiority of PSO and GA over the BP algorithm stems from the ability of PSO- and GA-trained ANNs to deal with nondifferentiable functions and to work without gradient information. However, one of the most notorious problems in training neural networks is falling into local minima and failing to converge. This problem is more visible when the number of data sets is insufficient. Another common problem in conventional NNs is overfitting: if the number of weights of the NN exceeds the number of data sets available for training to some extent, overfitting may occur. GAs are capable of isolating global optima but converge to that optimum at low speed. On the other hand, PSOs converge quickly, albeit at a greater risk of being trapped at a local optimum. Balancing the global search of GA against the fast convergence of PSO is therefore itself an optimization problem. In order to overcome the downsides associated with each algorithm, a combination of GA and PSO may be used, referred to here as PSOGANN.

In GA, the binary strings of the initial population are generated randomly, so different runs of the GA often give different results. The idea of PSOGANN is to select these initial populations appropriately by using PSONN.

As mentioned before, when there is a lack of sufficient training data, neither conventional PSO- nor GA-based NNs can provide a proper learning method for training the network. A robust algorithm that can be trained with less training data would therefore be a promising tool for the application of NNs in microelectromechanical system (MEMS) fabrication and many other engineering fields.

In this study, a novel PSO-GA based neural network is proposed for improving the training capacity of neural networks. To evaluate the performance of the proposed algorithm, the training capacity of the improved PSO-based ANN is first tested and then compared with that of a conventional PSO-based ANN and a back-propagation-based ANN, using the experimental data obtained from the carbon nanotube growth process. The optimal back-propagation-based neural network architecture is designed using the MATLAB Neural Network Toolbox. The PSOGANN and conventional PSO-based ANN programs are implemented in C++.

2. Genetic Algorithm

A GA emulates the evolutionary characteristic of survival of the fittest. At each phase, encoded chromosomes are simulated and the algorithm establishes the strength of each chromosome. The chromosomes then mutate and cross over to produce the next generation, and the process repeats. The input parameters for a GA are a set of solutions (the chromosomes of the GA) and a fitness function defining the success characteristics and stopping criteria. At each step of the algorithm, chromosomes are first evaluated for suitability against the success characteristics. Successful chromosomes are then randomly pooled to mate, and pairs of chromosomes in this pool randomly exchange genetic information with each other. Finally, the chromosomes are evaluated against the stopping criteria, and the process repeats if the criteria are not met. These features make GA well suited to handling large, nonlinear problems with unpredictable results. Owing to its multipoint search, the chance of convergence to the global optimal solution is much higher than the chance of falling into a local optimum. GA has a positive track record, having successfully dealt with problems in a variety of fields, including but not limited to optimization, fuzzy logic, NNs, expert systems, and scheduling [11].
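
To make the loop concrete, a minimal sketch of one GA generation is given below in C++ (the implementation language used later in this paper); the binary tournament selection, one-point crossover, and bit-flip mutation operators are illustrative assumptions rather than the exact operators used in this work:

#include <algorithm>
#include <random>
#include <vector>

using Chromosome = std::vector<int>;   // binary string encoding one solution
double fitness(const Chromosome&);     // problem-specific, assumed given

void evolve(std::vector<Chromosome>& pop, int maxGen,
            double pCross, double pMut) {
    std::mt19937 rng{std::random_device{}()};
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> pick(0, pop.size() - 1);
    std::uniform_int_distribution<std::size_t> cut(1, pop[0].size() - 1);
    for (int gen = 0; gen < maxGen; ++gen) {
        std::vector<Chromosome> next;
        while (next.size() < pop.size()) {
            // Selection: binary tournament, the fitter parent survives.
            Chromosome a = pop[pick(rng)], b = pop[pick(rng)];
            if (fitness(b) > fitness(a)) std::swap(a, b);
            // Crossover: one-point exchange of genetic information with a random mate.
            if (u(rng) < pCross) {
                const Chromosome& mate = pop[pick(rng)];
                std::size_t k = cut(rng);
                std::copy(mate.begin() + k, mate.end(), a.begin() + k);
            }
            // Mutation: flip each bit with small probability pMut.
            for (int& gene : a)
                if (u(rng) < pMut) gene = 1 - gene;
            next.push_back(std::move(a));
        }
        pop = std::move(next);  // the new generation replaces the old one
    }
}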

3. IPSOGA-Based ANN

Particle swarm optimization is a population-based stochastic optimization algorithm. In the PSO algorithm, the potential solutions, called particles (here comprising the weight and threshold vectors), fly through the problem space by following the current optimal particles. During training, after the PSO parameters are initialized with a group of random particles (solutions), the optimal solution is sought through the solution space [9]. The velocity and position of each particle are then updated according to its own experience and global cooperation. Although regular PSOs converge rapidly on solutions, they may often become trapped within local maxima and minima [12]. To obviate this problem and improve the training capacity, an improved PSO algorithm is proposed that considers both best-case and worst-case particle positions: by the nature of the algorithm, best-case positions gravitate towards the optimal positions and away from the worst-case positions [13, 18].

Individual particles in the swarm are represented by $D$-dimensional position and velocity vectors contained in the search space $S$:

$$x_i = (x_{i1}, x_{i2}, \ldots, x_{iD}), \qquad v_i = (v_{i1}, v_{i2}, \ldots, v_{iD}).$$

The evaluation of each particle is performed against the success function, with individual best positions being cumulatively stored in a position vector:

$$p_i = (p_{i1}, p_{i2}, \ldots, p_{iD}).$$

A global optimum position, $p_g$, is established from an evaluation of the individual positions.

Between iterations, the new velocity of each particle is calculated from the distance to the global best position, the distance to the local best position, and an inertia-weighted previous velocity:

$$v_{id}^{t+1} = w\,v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd} - x_{id}^{t}\right), \quad (1)$$

where $w$ is the inertia weight, $c_1$ and $c_2$ are acceleration constants, and $r_1, r_2 \in [0, 1]$ are uniform random numbers, so that each term yields a randomly distributed acceleration coefficient.
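
As a concrete illustration, a minimal C++ sketch of this velocity update, together with the position update of equation (2) below, is given here; the Particle structure and variable names are illustrative assumptions, not the exact implementation:

#include <cstddef>
#include <random>
#include <vector>

struct Particle {
    std::vector<double> x, v, pBest;   // position, velocity, personal best (p_i)
};

// One update sweep over the swarm: equation (1) for the velocity,
// equation (2) for the position. w, c1, c2, r1, r2 are as defined in the text.
void updateSwarm(std::vector<Particle>& swarm,
                 const std::vector<double>& gBest,    // global best p_g
                 double w, double c1, double c2, std::mt19937& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    for (Particle& p : swarm) {
        for (std::size_t d = 0; d < p.x.size(); ++d) {
            double r1 = u(rng), r2 = u(rng);
            p.v[d] = w * p.v[d]
                   + c1 * r1 * (p.pBest[d] - p.x[d])  // pull towards personal best
                   + c2 * r2 * (gBest[d] - p.x[d]);   // pull towards global best
            p.x[d] += p.v[d];                         // equation (2)
        }
    }
}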

The updated position is given by the sum of the previous position and the new velocity over the next iteration:

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}. \quad (2)$$

The proposed algorithm (PSOGANN) is developed by incorporating the advantages of both PSO and GA into the training process. A stopping criterion (either a maximum number of iterations or reaching a certain MSE) is imposed; if the PSO is unable to meet the stopping criterion, the best population for the GA (including weights and biases) is determined by the PSO, and the GA then searches again for the best parameter set. This process continues until the stopping criterion is satisfied. Figure 1 shows the flow chart of the proposed algorithm, and the details are presented below.

Step 1. Initialization of PSOGANN parameters: this includes (a) determination of the initial PSONN parameters; (b) random selection of the weights and biases for the network (first iteration); (c) random selection of the initial position and velocity vectors for all particles; (d) random selection of the initial values of $r_1$ and $r_2$; (e) determination of the number of circuits of group 1, which is the number of generations in which the PSO can try to meet the stopping criteria in each step before its current best particle ($p_g$) is saved as one of the GA’s populations; (f) determination of the number of circuits of group 2, which is the number of initial populations in the GA.

Step 2. Compute the fitness of the individual particles using the feedforward network.

Step 3. (a) Perform PSO operators to find the best PSONN parameters. (b) Update weights and thresholds according to equations (1) and (2) until “counter 1 > number of circuits of group 1” is satisfied.

Step 4. (a) The best position found by the PSO is saved as an initial population for the GA; (b) counter 1 is reset; (c) the algorithm continues to search for the optimal PSO parameters for the current set of network weights and biases until the tolerance is met in Step 3.

Step 5. (a) If the tolerance is not met after the maximum number of attempts (counter 2), run the GA using the initial populations saved by the PSO in the previous steps. (b) Continue until the stopping criterion is met in Step 4.
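
Putting Steps 1–5 together, the alternation between PSO and GA can be sketched as follows; runPSO, runGA, and mse are hypothetical helper routines standing in for the procedures described above, not the authors' actual code:

#include <vector>

using Weights = std::vector<double>;          // network weights and biases

Weights runPSO(int generations);              // Steps 2-3: PSO search, returns its best particle
Weights runGA(const std::vector<Weights>&);   // Step 5: GA over the PSO-seeded populations
double  mse(const Weights&);                  // network error for a candidate solution

Weights trainPSOGANN(int circuitsGroup1, int circuitsGroup2, double tol) {
    std::vector<Weights> gaSeeds;                // populations saved in Step 4(a)
    for (int counter2 = 0; counter2 < circuitsGroup2; ++counter2) {
        Weights best = runPSO(circuitsGroup1);   // PSO tries for "circuits of group 1" generations
        if (mse(best) <= tol) return best;       // tolerance met in Step 3: stop
        gaSeeds.push_back(best);                 // save p_g as a GA population; counter 1 resets
    }
    return runGA(gaSeeds);                       // Step 5: GA searches from the PSO seeds
}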

4. Carbon Nanotube Growth Process

4.1. Sample Preparation

A 4′′ silicon wafer was oxidized with a 1 μm thick oxide layer on both sides, which functions as a buffer layer to prevent interaction of the catalyst particles with the silicon during the CNT growth process. Then a layer of Fe catalyst with a nominal thickness of 2 nm was deposited onto the top surface by electron beam evaporation. After the wafer was sliced into small samples, each sample was kept in the CVD quartz chamber for CNT growth.

4.2. Growth Process

In these experiments, we set the initial temperature ramping rate to 50°C/min and the final CNT growth temperature to $T_g$ = 725°C in the control program, and the pressure inside the chamber was maintained at about 11 Torr. Based on the temperature profile of the substrate shown in Figure 2(A), the growth can be divided into three steps. Similar to our previous CVD processes [16, 17, 19], a gas mixture of hydrogen (H2) and argon (Ar) was provided throughout the three steps: Ar functions as the carrier gas and helps to dilute the acetylene (C2H2) concentration, while H2 acts as the reductive agent that refreshes the activity of the catalyst particles during the growth [20]; the carbon source C2H2 was introduced only in the second step to initiate and maintain the CNT growth. The gas flow rates were controlled and monitored in situ with mass flow controllers.

To begin with, the temperature increases rapidly from the initial temperature (usually room temperature) towards the growth temperature $T_g$ (725°C here). The temperature ramping rate is not constant over time; instead, it decreases as the substrate temperature approaches $T_g$.

When the temperature reached $T_g$, the second step began. The catalyst layer went through several minutes of pretreatment (annealing), which further turns its thin-film morphology into isolated small particles by increasing the surface tension. Then the C2H2 gas was introduced into the chamber to initiate the CNT growth. After 30 minutes of growth, the C2H2 supply was stopped, and the system started to cool down; once the temperature was below 200°C, the samples could be taken out for characterization. The surface morphology and the length of the as-grown CNTs on the substrate were characterized with scanning electron microscopy (SEM, Hitachi S-3500N). A typical SEM image of the CNT mat is shown in Figure 2(B)(a): the as-grown CNT mat is perpendicular to the substrate top surface, with a uniform thickness of about 320 μm, and the closer look in Figure 2(B)(b) reveals that these CNTs are densely packed with slightly wavy entanglement between them. Transmission electron microscopy (TEM, model: FEI Titan) was used to characterize the structure of the CNTs at very high resolution. As shown in Figures 2(B)(c)-(d), the CNTs grown here are multiwall carbon nanotubes (MWCNTs) with 10–30 walls and outermost diameters of 10–30 nm.

5. Modeling Results and Discussions

In this study, five input parameters recorded during CNT growth (the C2H2, Ar, and H2 flow rates and the pretreatment and growth durations) were used as inputs, and the length of the CNTs was the only output. The training set comprises 90% of the experimental data (43 groups); the remaining 10% (5 groups) was randomly set aside for testing. The stopping criteria for all networks were either 1500 iterations or a minimum error (MSE) of 0.005. A preliminary analysis, not presented in this paper, was performed on conventional NNs to establish a control for comparison with the proposed PSO ANN. Among the traditional NNs, the LM-NN performed best on both the training and test data sets, yielding the lowest mean-squared error (MSE). Tables 1 and 2 compare the performance of CPSONN, LM-NN, and IPSOGANN. In both the training and testing phases, IPSOGANN proved superior to CPSONN and LM-NN. The data suggest that IPSOGANN provides a 55% improvement in MSE over CPSONN and an 80% improvement over LM-NN. During training it was also found that, in terms of convergence speed (the number of iterations needed to meet the stopping criteria), PSOGANN is approximately 80% faster than CPSONN (see Figure 3).
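
For reference, the stopping rule applied to all networks (1500 iterations or an MSE of 0.005) amounts to the following check; the function names are illustrative assumptions:

#include <cstddef>
#include <vector>

// Mean-squared error between predicted and measured CNT lengths.
double mse(const std::vector<double>& pred, const std::vector<double>& target) {
    double sum = 0.0;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        double e = pred[i] - target[i];
        sum += e * e;
    }
    return sum / static_cast<double>(pred.size());
}

// Stopping criterion used for all networks in this study.
bool shouldStop(int iteration, double currentMse) {
    return iteration >= 1500 || currentMse <= 0.005;
}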

6. Conclusion

This study proposed a novel algorithm based on PSO and GA for training ANNs (PSOGANN). The application of the proposed algorithm to modeling the growth of CNTs was discussed. In particular, the proposed model demonstrated about a 40% improvement in offline training average error compared with the conventional PSO-based ANN algorithm. PSOGANN can be trained extremely quickly, which makes it possible to perform the large number of evaluations required by the GA. The method is less sensitive to the permutation problem and improves the results of the evolved networks. It can also substantially mitigate critical issues associated with traditional neural network systems, such as overfitting and falling into local minima.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

Majid Ebrahimi Warkiani would like to acknowledge the support of the Australian Research Council through a Discovery Project Grant (DP170103704).