A neural network applied to estimate Burr XII distribution parameters

https://doi.org/10.1016/j.ress.2010.02.001

Abstract

The Burr XII distribution can closely approximate many other well-known probability density functions such as the normal, gamma, lognormal and exponential distributions, as well as the Pearson type I, II, V, VII, IX, X and XII families of distributions. Given its wide range of shape and scale parameters, the Burr XII distribution can play an important role in reliability modeling, risk analysis and process capability estimation. However, estimating the parameters of the Burr XII distribution can be a complicated task, and the use of conventional methods such as maximum likelihood estimation (MLE) and the method of moments (MM) is not straightforward. Tables for estimating the Burr XII parameters have been provided by Burr (1942) [1], but they are not adequate for many purposes or data sets: the Burr tables contain specific values of skewness and kurtosis and their corresponding Burr XII parameters, and using interpolation or extrapolation to estimate other values may yield inappropriate estimates. In this paper, we present a neural network that estimates the Burr XII parameters from skewness and kurtosis values given as inputs. The trained network is presented so that one can use it to estimate the Burr XII distribution parameters without prior knowledge of neural networks. The accuracy of the parameter estimates is examined through simulation studies.

Introduction

The Burr XII distribution was first introduced in the literature by Burr [1]. It approximates the distributional form of the normal, lognormal, gamma, logistic, and several Pearson-type distributions. For instance, the normal density function may be approximated by a Burr XII distribution with c=4.85437 and k=6.22665, the gamma distribution with shape parameter 16 can be approximated by a Burr XII distribution with c=3 and k=6, and the log-logistic distribution is a special case of the Burr XII distribution [2]. Rodriguez [3] explored in great detail the connection between the Burr XII distribution and other continuous distributions. Figs. 1 and 2 show probability density functions (pdf) and cumulative distribution functions (cdf) of the Burr XII distribution for different values of c and k, respectively. (Figs. 1 and 2 are borrowed from the Wikipedia page on the Burr distribution.)

The Burr XII distribution has been applied frequently in areas of quality control, reliability analysis, and failure time modeling. Gupta et al. [4] analyzed failure time data using the Burr distribution. Maximum likelihood estimation (MLE) of the parameters and methods for fitting the Burr distribution to life-test data have been studied by Wingo [5], [6]. Ghitany and Al-Awadhi [7] gave examples of survival studies associated with different treatments of leukemia with censored data from the Burr distribution.

Zimmer and Burr [2] developed a method for sampling variables from non-normal populations using the Burr XII distribution. Burr [8] used his distribution to investigate the effect of non-normality on the constants of the X̄ and R control charts. Chou et al. [9] applied the Burr XII distribution to generate an economic–statistical design of the X̄ chart for non-normally distributed data. Abbasi [10] used the Burr XII distribution to provide training data for a neural network that estimates the process capability index of non-normal processes.

The pdf and cdf of the Burr XII distribution are defined by

$$f(x)=kcx^{c-1}(1+x^{c})^{-(k+1)},\qquad x,c,k\ge 0 \qquad (1)$$

$$F(x)=1-(1+x^{c})^{-k},\qquad x,c,k\ge 0 \qquad (2)$$

where c>0 and k>0 are related to the skewness and kurtosis coefficients of the Burr XII distribution. For more information on the properties of the Burr XII distribution, see Wang et al. [11], Zimmer et al. [12], Cizek et al. [13], Burr [14] and Ali Mousa and Jaheen [15].
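As a quick numerical check of these expressions (not part of the original paper), the sketch below evaluates the pdf and cdf directly and compares them with scipy.stats.burr12, whose second shape parameter plays the role of k:

```python
import numpy as np
from scipy.stats import burr12

def burr_pdf(x, c, k):
    """Burr XII pdf of Eq. (1): f(x) = k*c*x^(c-1) * (1 + x^c)^-(k+1)."""
    return k * c * x ** (c - 1) * (1 + x ** c) ** (-(k + 1))

def burr_cdf(x, c, k):
    """Burr XII cdf of Eq. (2): F(x) = 1 - (1 + x^c)^-k."""
    return 1 - (1 + x ** c) ** (-k)

# c, k quoted above as approximating the normal shape
c, k = 4.85437, 6.22665
x = np.linspace(0.1, 3.0, 5)
print(burr_pdf(x, c, k))
print(burr12.pdf(x, c, k))   # scipy's burr12(c, d) with d == k; values should agree
print(burr_cdf(x, c, k))
print(burr12.cdf(x, c, k))
```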

There are two methods commonly used to estimate Burr XII parameters:

1. using standard tables of the standard Burr XII distribution

2. using the MLE method

We explain them briefly in the following sections.

Burr [1] constructed several tables of expected mean values, standard deviations, and skewness and kurtosis coefficients for the Burr XII distribution with specified values of c and k. In these tables, curvilinear interpolation is used to find the value of F(x) with the desired moments. To find an appropriate Burr XII distribution for a given data set, the parameters c and k and the mean and standard deviation of the Burr XII distribution are selected from the Burr tables based on the skewness and kurtosis values of the data. The first column of the Burr tables contains different values of Sk (skewness) and the second column contains different values of Ku-3 (kurtosis-3); c, k, μ and σ appear in the subsequent columns.
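The quantities needed to enter the Burr tables (and, later, to feed the neural network) are the sample skewness and kurtosis-3. A minimal sketch of computing them with scipy follows; note that kurtosis(..., fisher=True) already returns excess kurtosis, i.e. Ku-3, and the Burr XII sample drawn here is only for illustration (scipy's burr12 uses d in place of k).

```python
from scipy.stats import burr12, kurtosis, skew

# Illustrative sample; in practice this would be the observed data set.
sample = burr12.rvs(c=3.0, d=6.0, size=1000, random_state=0)

sk = skew(sample)                           # Sk, first column of the Burr tables
ku_minus_3 = kurtosis(sample, fisher=True)  # Ku - 3, second column of the Burr tables
print(sk, ku_minus_3)
```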

There is a standardized transformation between a Burr variable (say Z) and another random variable (say X). This transformation may be expressed as

$$\frac{Z-\mu_z}{\sigma_z}=\frac{X-\mu_x}{\sigma_x} \qquad (3)$$

where μz and σz are the mean and standard deviation of Z given in the Burr tables, and μx and σx are the mean and standard deviation of X. The pdf of Z is

$$f(z)=kcz^{c-1}(1+z^{c})^{-(k+1)},\qquad z,c,k\ge 0 \qquad (4)$$

Considering $Z=((X-\mu_x)/\sigma_x)\sigma_z+\mu_z$, the pdf of X is

$$f(x)=\frac{\sigma_z}{\sigma_x}\,kc\left(\left(\frac{x-\mu_x}{\sigma_x}\right)\sigma_z+\mu_z\right)^{c-1}\left(1+\left(\left(\frac{x-\mu_x}{\sigma_x}\right)\sigma_z+\mu_z\right)^{c}\right)^{-(k+1)},\qquad x\ge\mu_x-\mu_z\frac{\sigma_x}{\sigma_z},\ c,k\ge 0 \qquad (5)$$
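To make the standardization concrete, the short sketch below applies Eq. (3) to a raw sample, producing the standardized Burr variable Z; the μz and σz used here are placeholder numbers, not values taken from the Burr tables.

```python
import numpy as np

def to_standard_burr(x, mu_z, sigma_z):
    """Apply Eq. (3): Z = ((X - mu_x) / sigma_x) * sigma_z + mu_z,
    where mu_x and sigma_x are the sample mean and standard deviation."""
    mu_x, sigma_x = x.mean(), x.std(ddof=1)
    return (x - mu_x) / sigma_x * sigma_z + mu_z

x = np.array([1.2, 0.8, 2.5, 1.9, 1.1, 3.0])
# Placeholder mu_z, sigma_z for illustration only (not Burr-table entries).
z = to_standard_burr(x, mu_z=0.6, sigma_z=0.2)
print(z)
```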

The tables do not contain all values of skewness and kurtosis (of the distribution), and using interpolation or extrapolation may yield inappropriate estimates.

The MLE method is one of the most useful methods for estimating distribution parameters. However, in some cases, finding the parameter values that maximize the likelihood function (LF) can be a complicated task. For the Burr XII and some other distributions, the LF is complex, and popular optimization algorithms may converge to a local optimum and yield poor estimates.

The logarithm of the LF for the Burr XII distribution is

$$\ln(L)=n(\ln(c)+\ln(k))+(c-1)\sum_{i=1}^{n}\ln(x_i)-(k+1)\sum_{i=1}^{n}\ln(1+x_i^{c}) \qquad (6)$$

Taking the derivatives with respect to c and k and setting them to zero yields

$$\frac{n}{c}+\sum_{i=1}^{n}\ln(x_i)-(k+1)\sum_{i=1}^{n}\frac{x_i^{c}\ln(x_i)}{1+x_i^{c}}=0 \qquad (7)$$

$$\frac{n}{k}-\sum_{i=1}^{n}\ln(1+x_i^{c})=0 \qquad (8)$$

To solve Eqs. (7), (8), numerical methods such as the Newton–Raphson method can be used, although they may converge to a local optimum. Moreover, as the sample size increases, solving Eqs. (7), (8) becomes more complicated and less efficient. In the MLE method, after estimating c and k, the mean and standard deviation of the Burr XII distribution are obtained from these parameters. μx and σx are estimated from the following equations [16]:

$$\mu_x=\frac{k\,\Gamma\!\left(k-\frac{1}{c}\right)\Gamma\!\left(1+\frac{1}{c}\right)}{\Gamma(k+1)},\qquad\text{exists only for }ck>1 \qquad (9)$$

$$\sigma_x=\left[\frac{k}{\Gamma(k+1)}\left(\Gamma\!\left(k-\frac{2}{c}\right)\Gamma\!\left(1+\frac{2}{c}\right)-\frac{k\left(\Gamma\!\left(k-\frac{1}{c}\right)\Gamma\!\left(1+\frac{1}{c}\right)\right)^{2}}{\Gamma(k+1)}\right)\right]^{1/2},\qquad\text{exists only for }ck>2 \qquad (10)$$
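Given estimates of c and k, Eqs. (9), (10) can be evaluated directly with gamma functions; a minimal sketch (assuming ck > 2 so that both moments exist):

```python
from math import gamma

def burr_mean(c, k):
    """Mean of the Burr XII distribution, Eq. (9); requires c*k > 1."""
    return k * gamma(k - 1 / c) * gamma(1 + 1 / c) / gamma(k + 1)

def burr_std(c, k):
    """Standard deviation of the Burr XII distribution, Eq. (10)
    (written equivalently as sqrt(E[X^2] - mean^2)); requires c*k > 2."""
    mean = burr_mean(c, k)
    second_moment = k * gamma(k - 2 / c) * gamma(1 + 2 / c) / gamma(k + 1)
    return (second_moment - mean ** 2) ** 0.5

print(burr_mean(3.0, 6.0), burr_std(3.0, 6.0))
```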

Before using the MLE method, the variable X should be transformed according to Eq. (3). Therefore, there are four unknown parameters, i.e., c, k, μz and σz (see Eq. (5)), and four nonlinear equations must be solved, obtained by taking the derivatives of the likelihood function for the pdf in Eq. (5) with respect to c, k, μz and σz. Solving these equations can be problematic and the solution may converge to a local optimum. One may substitute Eqs. (9), (10) for μz and σz in Eq. (5) to eliminate two of the equations when applying the MLE method. However, the likelihood function and its derivative equations would still be very complex.
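For the two-parameter case, one pragmatic alternative to solving Eqs. (7), (8) directly is to maximize the log-likelihood of Eq. (6) with a general-purpose optimizer. The sketch below does this with scipy.optimize.minimize on the negative log-likelihood; it is only an illustration of the difficulty discussed above (the starting point is arbitrary and the search may still end in a local optimum), not the procedure used in this paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import burr12

def neg_log_lik(params, x):
    """Negative of the Burr XII log-likelihood in Eq. (6)."""
    c, k = params
    if c <= 0 or k <= 0:
        return np.inf
    return -(len(x) * (np.log(c) + np.log(k))
             + (c - 1) * np.sum(np.log(x))
             - (k + 1) * np.sum(np.log1p(x ** c)))

# Illustrative data from a known Burr XII (scipy's d plays the role of k).
x = burr12.rvs(c=3.0, d=6.0, size=100, random_state=1)
res = minimize(neg_log_lik, x0=[1.0, 1.0], args=(x,), method="Nelder-Mead")
c_hat, k_hat = res.x
print(c_hat, k_hat)
```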

In this paper, a neural network is used to estimate the Burr XII parameters. We develop a neural network that estimates c, μz and σz from the skewness and kurtosis of the sample data (empirical values) as inputs. Then, based on the c, μz and σz values estimated by the neural network, a simple equation is used to estimate k.

The remainder of the paper is organized as follows. The next section briefly reviews artificial neural networks (ANN) and the multilayer perceptron (MLP) used in this paper. Section 3 describes the application of neural networks to the estimation of the Burr XII parameters. The trained neural network and a closed-form equation for estimating the Burr XII parameters are presented in Section 4. Section 5 presents simulation studies. A comparison with the MLE method is presented in Section 6. Finally, conclusions are given in Section 7.

Section snippets

Neural networks

An ANN can be defined as a massively parallel distributed processor that can store experimental knowledge and make it available for future use [17]. The knowledge is acquired via a learning process through several input–output vectors and stored in inter-neuron connection strengths known as synaptic weights and bias (threshold) values. ANNs consist of numerous interconnected processing elements called neurons with an activation function, which are typically organized into layers linked via

Using MLP to estimate Burr XII parameters

Many statisticians and engineers are familiar with successful applications of ANNs, and in this research we use a neural network to estimate the Burr XII parameters. From the tables provided by Burr [1], we can estimate the Burr XII parameters (c and k) based on the skewness and kurtosis of the data, and then directly obtain the mean and standard deviation of the Burr XII distribution for the determined c and k.

In order to apply neural networks to estimate Burr XII parameters, similar to Burr XII tables, we

Trained neural network

The weight and bias (threshold) values after training are listed in Table 1. W_i and B_i are the weight and bias values in the ith layer. By using these values in Eq. (12), we have a function that, for each input vector [Skewness, Kurtosis-3], yields a vector of the form $(\hat{\mu}_z,\hat{\sigma}_z,1/\hat{c})$.

The ANN's three outputs form an output vector (output(1), output(2), output(3)) as follows:

$$\text{Output}=\text{tansig}\bigl(W_3\,\text{tansig}\bigl(W_2\,\text{tansig}\bigl(W_1[\text{Skewness},\ \text{Kurtosis}-3]+B_1\bigr)+B_2\bigr)+B_3\bigr) \qquad (12)$$
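Eq. (12) is simply three nested tansig (i.e. tanh) layers, so it can be evaluated in a few lines once the Table 1 values are available. The sketch below implements that forward pass; the weight matrices used here are random placeholders with assumed layer sizes (2 inputs, two hidden layers of 6 units, 3 outputs), chosen only to make the example runnable, while the actual trained values are those listed in Table 1.

```python
import numpy as np

def tansig(a):
    """MATLAB's tansig activation, identical to tanh."""
    return np.tanh(a)

def burr_net(skewness, ku_minus_3, W1, B1, W2, B2, W3, B3):
    """Forward pass of Eq. (12); returns the estimates (mu_z, sigma_z, 1/c)."""
    a0 = np.array([skewness, ku_minus_3])
    a1 = tansig(W1 @ a0 + B1)
    a2 = tansig(W2 @ a1 + B2)
    return tansig(W3 @ a2 + B3)

# Placeholder weights/biases with assumed shapes; replace with Table 1 values.
rng = np.random.default_rng(0)
W1, B1 = rng.normal(size=(6, 2)), rng.normal(size=6)
W2, B2 = rng.normal(size=(6, 6)), rng.normal(size=6)
W3, B3 = rng.normal(size=(3, 6)), rng.normal(size=3)
print(burr_net(0.5, 1.2, W1, B1, W2, B2, W3, B3))
```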

Because of the characteristics of the Tansig

Simulation study

In this section, we present a simulation study for different parameters of the Burr XII distribution to analyze the performance of our developed ANN. For each distribution, we generate a sample of size n, estimate the input variables required by the ANN to obtain the desired parameters, and compare the ANN estimates with the actual distribution parameter values. Three Burr XII distributions are used in this simulation study. Based on different sample sizes, i.e., n=100, 1000, 2500 and
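A simulation run of this kind can be outlined as follows: draw a Burr XII sample of size n, compute the skewness and kurtosis-3 that the ANN takes as inputs, and compare the resulting estimates against the known c and k. The sketch below stops at the input-computation step, since the trained weights themselves are those reported in Table 1; the parameter values and sample sizes are illustrative, not the paper's exact cases.

```python
from scipy.stats import burr12, kurtosis, skew

true_c, true_k = 3.0, 6.0   # illustrative "actual" parameters
for n in (100, 1000, 2500):
    sample = burr12.rvs(c=true_c, d=true_k, size=n, random_state=42)
    inputs = [skew(sample), kurtosis(sample, fisher=True)]  # [Skewness, Kurtosis-3]
    # These inputs would be passed to the trained network to obtain
    # (mu_z, sigma_z, 1/c), and then k, for comparison with true_c, true_k.
    print(n, inputs)
```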

Comparison study

In this section, the MLE method based on Eqs. (7), (8) is used to estimate the parameters of the Burr XII distributions described in the previous simulation study section. The sample size was set to 100 because, for large sample sizes, solving Eqs. (7), (8) is not practical.

The results of using the MLE method are presented in Table 3 and, for all cases, they were compared with the corresponding results in Table 2 in terms of accuracy and variability. These results indicate that the ANN method performs

Conclusions

In this paper, an MLP neural network has been introduced to estimate the parameters of the Burr XII distribution. The developed neural network is easy to use. The simulation study shows that it yields very acceptable results and is very fast. The network function and the weight and bias values of the trained neural network were presented so that statisticians and engineers can use them in spreadsheets (with sufficient precision) even without any knowledge of ANNs.

Acknowledgments

The authors would like to thank the referees for their valuable comments that improved the presentation of this paper. In addition, the authors would like to thank Mr. Edris Babaei for his help in computational analysis in using Burr tables.

