Multi-stage hybridized online sequential extreme learning machine integrated with Markov Chain Monte Carlo copula-Bat algorithm for rainfall forecasting
Introduction
Anthropogenic and naturally-induced anomalies in regional-scale rainfall can directly affect the agricultural sector since rainfall plays a vital role in both the growth and the production of crops (Maraseni et al., 2012; Nguyen-Huy et al., 2018). The effect is not restricted to the agricultural sector: rainfall anomalies also bring major water-related disasters (Barredo, 2007). A long-run shortage of rainfall leads to drought events (Palmer, 1965), which can in turn lead to water scarcity (Langridge et al., 2006; Vörösmarty et al., 2010), while excessive amounts of rainfall can cause flooding and damage to human and wildlife health, infrastructure and the economy (Bhalme and Mooley, 1980). The economy of Pakistan, a nation that is still in its developing phase, has also been severely damaged by major flooding events, including damage to infrastructure and agricultural crops (News, 2010). The estimated damage to infrastructure in the 2010 event was approximately 4 billion US dollars, whereas the damage in the agricultural sector amounted to about 500 million US dollars (Hicks and Burton, 2010). The total economic damage was considerable, totaling approximately 43 billion US dollars in 2010 (Mansoor, 2010; Tarakzai, 2010). Equally, drought events (Ali et al., 2018) have been a major contributing factor towards reduced agricultural yields and significant reductions of the gross domestic product of Pakistan. Further, a prolonged decline in rainfall can cause a fall in hydraulic heads, with severe consequences for crop irrigation from wells due to changes in the properties of groundwater reservoirs (Santos et al., 2014). Therefore, the ability to forecast rainfall accurately, particularly in agricultural belt regions, can help stakeholders to formulate better water planning and resource management decisions.
Data-intelligent models, particularly those developed for local (e.g., farm) scales, have the ability to utilize past data, and hence may offer a viable and reasonably accurate solution to drought disaster management through a projection of future rainfall (Luk et al., 2001). The study of Chiew et al. (1998) developed data-intelligent, predictive models for rainfall forecasting using an empirical method, whereas Sharma (2000) developed a nonparametric probabilistic model to forecast seasonal to inter-annual rainfall in Australia. Burlando et al. (1993) used an autoregressive moving average (ARMA) model for short-term rainfall forecasting in the USA, whereas Hung et al. (2009) applied an artificial neural network (ANN) model for rainfall forecasting in Thailand and Lin et al. (2009) forecasted hourly rainfall using support vector machines for Taiwan. Yaseen et al. (2017) developed a rainfall forecasting model using a novel hybrid intelligent model based on the adaptive neuro-fuzzy inference system (ANFIS) integrated with the Firefly algorithm (FFA) for the Pahang river catchment located in the Malaysian Peninsula. Mason (1998) forecasted seasonal rainfall of South Africa using a nonlinear discriminant analysis model, while Nguyen-Huy et al. (2017) developed a novel copula-statistical rainfall forecasting model in Australia's agro-ecological zones. Accurate rainfall forecasting is a significant challenge for Pakistan due to high variation in seasonal, annual and inter-annual rainfall, exacerbated by climate change.
Despite the need, only a few studies on rainfall forecasting, particularly at local or regional scales, have been carried out in Pakistan. For example, the study of Salma et al. (2012) forecasted rainfall trends in different climatic zones of Pakistan utilizing the autoregressive integrated moving average (ARIMA) model. Archer and Fowler (2008) applied meteorological data to forecast seasonal runoff on the River Jhelum, Pakistan on the basis of multiple linear regression models. Reale et al. (2012) forecasted an extreme rainfall event (in the Indus River Valley, Pakistan, 2010) with a global data assimilation and forecasting model. Faisal and Gaffar (2012) utilized the Thiessen polygon method of weighted rainfall forecast in Pakistan, whereas the study of Ahasan and Khan (2013) simulated a flood-producing rainfall event in 2010 over north-west Pakistan using the Weather Research and Forecasting (WRF) model. These studies have provided immensely useful information to various stakeholders, revealing the capability of data-driven models to generate acceptably accurate rainfall forecasts where only historical datasets were applied to construct the forecast model.
The aforementioned studies (Ahasan and Khan, 2013; Archer and Fowler, 2008; Faisal and Gaffar, 2012; Reale et al., 2012; Salma et al., 2012) focused on Pakistan indicate that rainfall forecasting has been mostly based on statistical models. In addition, a majority of these studies have been conducted to forecast seasonal rainfall using several different datasets. Moreover, the application of advanced data-intelligent models (which account for the significantly non-linear behavior of rainfall and its predictors) to accurate forecasting at a micro (or landscape) scale has been limited, although such forecasts can support decision-making for better management of water resources and flood modelling aimed at reducing the overall risk. For example, accurate forecasting is beneficial at the catchment scale for agro-forestry applications (Terêncio et al., 2018; Terêncio et al., 2017). Accurate rainfall forecasting can also have several economic benefits; for example, a realistic forecast of heavy rainfall could allow airline dispatchers to route their flights in a timely manner (Graham, 2002). A more accurate rainfall forecasting tool might also enable appropriate decisions about flooding, crop sowing and harvesting, and the management of water resources (Graham, 2002; Jones et al., 2000; Toth et al., 2000). To address these issues, there is an apparent need for data-intelligent models that forecast rainfall more accurately than the current statistically-based (i.e., regression) approaches, which rely on various data distribution or linearity assumptions.
In this study, for the first time, a multi-stage online sequential extreme learning machine (OS-ELM) model integrated with Markov Chain Monte Carlo (MCMC) based copulas and the Bat algorithm is developed, denoted as the “MCMC-Cop-Bat-OS-ELM model”. For the purpose of comparison, the standalone extreme learning machine (ELM) without any hybridization and the random forest (RF) models are also developed. The proposed multi-stage MCMC-Cop-Bat-OS-ELM model is tested for rainfall forecasting in three agricultural districts located in Pakistan: Faisalabad, Multan, and Jhelum. The novelty of this study is, therefore, the design and application of the newly proposed multi-stage, hybrid MCMC-Cop-Bat-OS-ELM model for rainfall forecasting in Pakistan, a developing nation where accurate predictions are likely to bring significant benefits to agriculture, climate adaptation and decision-making in the water resources sector.
To test the applicability of the proposed multi-stage MCMC-Cop-Bat-OS-ELM model, this study fulfils four objectives: (1) To develop a probabilistic MCMC based copula model integrated with the Bat algorithm in order to determine the optimal MCMC-copula model; (2) To incorporate the selected optimal MCMC-copula model based on the Bat algorithm in the OS-ELM model to develop a multi-stage MCMC-Cop-Bat-OS-ELM hybrid prediction tool; (3) To incorporate the significant antecedent lagged rainfall to effectively forecast the current and future rainfall in the consequent month; and (4) To validate the forecasting ability of the proposed hybrid MCMC-Cop-Bat-OS-ELM model for rainfall forecasting in Pakistan.
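Objective (3) relies on antecedent (lagged) monthly rainfall as model input. The sketch below illustrates, under stated assumptions, how a predictor matrix of lagged values could be assembled from a monthly series. The function name `make_lagged_matrix`, the chosen lags and the synthetic rainfall values are hypothetical and are not taken from the study; the step that identifies which lags are statistically significant is not shown.

```python
import numpy as np

def make_lagged_matrix(series, lags):
    """Build predictors from antecedent values of a monthly series.

    Row i corresponds to month t = max(lags) + i; its columns hold
    series[t - lag] for each requested lag, and the target is series[t].
    """
    series = np.asarray(series, dtype=float)
    max_lag = max(lags)
    # Each column is the series shifted back by one of the chosen lags.
    X = np.column_stack(
        [series[max_lag - lag : len(series) - lag] for lag in lags]
    )
    y = series[max_lag:]
    return X, y

# Synthetic monthly totals (mm), purely illustrative, not observed data.
rain = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])

# Hypothetical choice: use the previous one and two months as predictors.
X, y = make_lagged_matrix(rain, lags=[1, 2])
```

Each row of `X` pairs with one entry of `y`, so the resulting arrays can be passed to any regression-style forecasting model.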
The literature on accurate rainfall forecasting shows that several approaches were adopted using data intelligent models.
Section snippets
Previous work
Accurate rainfall forecasting plays a key role in agriculture, water resources and early flood warning systems (Yaseen et al., 2016, 2017, 2018). Ortiz-García et al. (2014) used meteorological variables and observational data to forecast rainfall in Spain using support vector classifiers in comparison with multi-layer perceptron, extreme learning machine, decision trees and K-nearest neighbor
Online sequential extreme learning machine (OS-ELM)
ELM is a state-of-the-art data-intelligent model developed by Huang et al. (2006) for the purpose of designing a Single Layer Feedforward Neural Network (SLFN). ELM is relatively fast, and thus computationally efficient, compared with other traditional learning algorithms (Rajesh and Prakash, 2011; Deo and Şahin, 2015; Deo et al., 2017). The SLFN with M hidden nodes of N arbitrary inputs (xk, yk) ∈ Γn × Γn with an activation function f(.) can be mathematically formulated as:
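As a hedged illustration of the general ELM idea, the Python sketch below assumes the commonly described formulation: hidden-layer weights and biases are drawn at random and left fixed, and only the output weights are solved by least squares. The online sequential variant then updates those output weights chunk by chunk with a recursive least-squares step. The class name `OSELMSketch`, the sigmoid activation, the number of hidden nodes and the small ridge term (added here only for numerical stability) are assumptions for illustration, not the configuration used in the study.

```python
import numpy as np

class OSELMSketch:
    """Minimal online sequential ELM for a single-output regression task."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Random hidden-layer parameters are fixed after initialization.
        self.W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)
        self.beta = None  # output weights
        self.P = None     # inverse of the regularized Gram matrix

    def _hidden(self, X):
        # Sigmoid activation of the hidden layer (an assumed choice).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit_initial(self, X0, y0, ridge=1e-3):
        # Batch least-squares solution on the initial chunk.
        H0 = self._hidden(X0)
        self.P = np.linalg.inv(H0.T @ H0 + ridge * np.eye(H0.shape[1]))
        self.beta = self.P @ H0.T @ y0

    def partial_fit(self, X, y):
        # Recursive least-squares update for a new chunk of data.
        H = self._hidden(X)
        K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (y - H @ self.beta)

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Synthetic data, purely illustrative.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(60, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

# Sequential training: an initial chunk, then chunks of 10 samples.
online = OSELMSketch(n_inputs=2, n_hidden=5, seed=0)
online.fit_initial(X[:20], y[:20])
for start in range(20, 60, 10):
    online.partial_fit(X[start:start + 10], y[start:start + 10])

# Batch training on all data with the same random hidden layer.
batch = OSELMSketch(n_inputs=2, n_hidden=5, seed=0)
batch.fit_initial(X, y)
```

With the same random hidden layer and ridge term, the sequential updates reproduce the batch solution up to floating-point error, which is why the online variant can absorb new observations without retraining from scratch.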
Rainfall data
In this paper, we use rainfall data obtained from the Pakistan Meteorological Department, Pakistan for the years 1981 to 2015 (PMD, 2016) for the selected regions of Faisalabad, Multan and Jhelum in Punjab, as shown in Fig. 1.
To evaluate the versatility of the multi-stage, hybridized MCMC-Cop-Bat-OS-ELM model for rainfall forecasting in Pakistan's agricultural belt, the study sites were chosen carefully to ensure that they were broadly representative of the diverse climatic conditions. The first
Results
The results of the MCMC-Cop-Bat-OS-ELM model and the comparative models, MCMC-Cop-Bat-ELM and MCMC-Cop-Bat-RF, have been evaluated based on the above criteria (Eqs. (19) to (30)). Table 3 shows the MCMC-copula models selected by the Bat algorithm on the basis of feature selection. Out of a total of twenty-five tested MCMC-copula models, seven were selected to be the best by the Bat algorithm for Faisalabad, eight for Jhelum and seven for
Discussion: limitations and opportunity for further research
Accurate rainfall forecasting can complement and facilitate better planning of water management (Terêncio et al., 2018; Ali et al., 2018; Yaseen et al., 2016, 2017, 2018). Furthermore, accurate predictions of rainfall can reduce water-related natural disasters (Barredo, 2007; Langridge et al., 2006; Palmer, 1965; Vörösmarty et al., 2010), and potential impacts upon wildlife health, infrastructure and the economy (Bhalme and Mooley, 1980).
Conclusion
For the first time, this paper has developed a hybrid multi-stage MCMC-Cop-Bat-OS-ELM model using the significant antecedent lags of monthly rainfall as predictor variables to forecast future rainfall for different geographical sites in Pakistan. The rainfall data from 1981 to 2015 for a total of three stations were used to develop the proposed multi-stage MCMC-Cop-Bat-OS-ELM model in order to achieve a high level of accuracy. Further, several types of evaluation criteria were adopted to
Acknowledgement
This research utilized rainfall data acquired from the Pakistan Meteorological Department, Pakistan, which is duly acknowledged. This study was supported by the University of Southern Queensland Office of Graduate Studies Postgraduate Research Scholarship (USQPRS, 2017-2019) awarded to the first author. We thank both reviewers and the Editor-in-Chief for their constructive comments, which have improved the clarity of the final paper.
References (114)
- et al. Input selection and optimisation for monthly rainfall forecasting in Queensland, Australia, using artificial neural networks. Atmos. Res. (2014)
- et al. An ensemble-ANFIS based uncertainty assessment model for forecasting multi-scalar standardized precipitation index. Atmos. Res. (2018)
- et al. Two-phase particle swarm optimized-support vector regression hybrid model integrated with improved empirical mode decomposition with adaptive noise for multiple-horizon electricity demand forecasting. Appl. Energy (2018)
- et al. Using meteorological data to forecast seasonal runoff on the river Jhelum, Pakistan. J. Hydrol. (2008)
- et al. A framework model for the dimensioning and allocation of a detention basin system: the case of a flood-prone mountainous watershed. J. Hydrol. (2016)
- et al. Forecasting of short-term rainfall using ARMA models. J. Hydrol. (1993)
- et al. El Nino/southern oscillation and Australian rainfall, streamflow and drought: links and potential for forecasting. J. Hydrol. (1998)
- et al. HydroTest: a web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts. Environ. Model Softw. (2007)
- et al. Application of the extreme learning machine algorithm for the prediction of monthly Effective Drought Index in eastern Australia. Atmos. Res. (2015)
- et al. Comparison of some existing models for estimating global solar radiation for Antalya (Turkey). Energy Convers. Manag. (2000)
Extreme learning machine: theory and applications
Neurocomputing
Potential benefits of climate forecasting to agriculture
Agric. Ecosyst. Environ.
A neural network-based local rainfall prediction system using meteorological data on the internet: a case study using data from the Japan meteorological agency
Appl. Soft Comput.
Ensemble of online sequential extreme learning machine
Neurocomputing
An application of artificial neural networks for rainfall forecasting
Math. Comput. Model.
Integrated analysis for a carbon-and water-constrained future: an assessment of drip irrigation in a lettuce production system in eastern Australia
J. Environ. Manag.
Uncertainty analysis of bias from satellite rainfall estimates using copula method
Atmos. Res.
A new hybrid support vector machine–wavelet transform approach for estimation of horizontal global solar radiation
Energy Convers. Manag.
River flow forecasting through conceptual models part I—A discussion of principles
J. Hydrol.
Copula-statistical precipitation forecasting model in Australia's agro-ecological zones
Agric. Water Manag.
Modeling the joint influence of multiple synoptic-scale, climate mode indices on Australian wheat yield using a vine copula-based approach
Eur. J. Agron.
Accurate precipitation prediction with support vector classifiers: a study including novel predictive variables and observational data
Atmos. Res.
An efficient neuro-evolutionary hybrid modelling mechanism for the estimation of daily global solar radiation in the Sunshine State of Australia
Appl. Energy
Simultaneous modelling of rainfall occurrence and amount using a hierarchical nominal–ordinal support vector classifier
Eng. Appl. Artif. Intell.
The impact of climate change, human interference, scale and modeling uncertainties on the estimation of aquifer properties and river flow components
J. Hydrol.
Seasonal to interannual rainfall probabilistic forecasts for improved water supply management: part 3—a nonparametric probabilistic forecast model
J. Hydrol.
Improved framework model to allocate optimal rainwater harvesting sites in small watersheds for agro-forestry uses
J. Hydrol.
Rainwater harvesting in catchments for agro-forestry uses: a study focused on the balance between sustainability values and storage capacity
Sci. Total Environ.
Comparison of short-term rainfall prediction models for real-time flood forecasting
J. Hydrol.
Long-term trends and variability of rainfall extremes in the Philippines
Atmos. Res.
Pakistan 7th most Vulnerable Country to Climate Change
Simulation of a flood producing rainfall event of 29 July 2010 over north-west Pakistan using WRF-ARW model
Nat. Hazards
A new look at the statistical model identification
IEEE Trans. Autom. Control
Pakistan Needs a New Crop Forecasting System
Major flood disasters in Europe: 1950–2005
Nat. Hazards
Large-scale droughts/floods and monsoon circulation
Mon. Weather Rev.
Bagging predictors
Mach. Learn.
Random forests
Mach. Learn.
Turbulent Mirror: An Illustrated Guide to Chaos Theory and the Science of Wholeness
Bat algorithm with Gaussian walk
Int. J. Bio-Inspired Computation.
Particle swarm optimization algorithm
Information and Control-Shenyang
A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence
Biometrika
Modelling radar-rainfall estimation uncertainties using elliptical and Archimedean copulas with different marginal distributions
Hydrol. Sci. J.
Rainfall forecasting using neural network: a survey
Handbook of Genetic Algorithms
A wavelet-coupled support vector machine model for forecasting global incident solar radiation using limited meteorological dataset
Appl. Energy
Dry Weather Predicted in the Country during Friday/Monday
Ensemble learning
Applied Regression Analysis