Elsevier

Journal of Hydrology

Volume 545, February 2017, Pages 424-435
Research papers
A fully-online Neuro-Fuzzy model for flow forecasting in basins with limited data

https://doi.org/10.1016/j.jhydrol.2016.11.057

Highlights

  • GSETSK is a fully-online model which does not require a large training dataset.

  • River flow forecasting in two catchments was conducted with GSETSK.

  • GSETSK results compared favorably against physical models used as benchmarks.

  • GSETSK produces a more interpretable rule base compared to other neuro-fuzzy models.

  • The study shows the applicability of GSETSK for forecasts with limited data.

Abstract

Current state-of-the-art online neuro-fuzzy models (NFMs) such as DENFIS (Dynamic Evolving Neural-Fuzzy Inference System) have been used for runoff forecasting. Online NFMs adopt a local learning approach and are able to adapt to changes continuously. The DENFIS model, however, requires upper and lower bounds for normalization, and its number of rules increases monotonically. The normalization requirement makes the model unsuitable for use in basins with limited data, since a priori data is required. In order to address this and other drawbacks of current online models, the Generic Self-Evolving Takagi-Sugeno-Kang (GSETSK) model is adopted in this study for forecast applications in basins with limited data. GSETSK is a fully-online NFM which updates its structure and parameters based on the most recent data. The model does not require historical data and adopts clustering and rule pruning techniques to generate a compact and up-to-date rule-base. GSETSK was used in two forecast applications: rainfall-runoff forecasting (a catchment in Sweden) and river routing forecasting (Lower Mekong River). Each of these two applications was studied under two scenarios: (i) there is no prior data, and (ii) only limited data is available (1 year for the Swedish catchment and 1 season for the Mekong River). For the Swedish catchment, GSETSK model results were compared to available results from a calibrated HBV (Hydrologiska Byråns Vattenbalansavdelning) model. For the Mekong River, GSETSK results were compared against the URBS (Unified River Basin Simulator) model. Both comparisons showed that results from GSETSK are comparable with those of the physically based models, which were calibrated with historical data. Thus, even though GSETSK was trained with a very limited dataset in comparison with HBV or URBS, similar results were achieved. Further comparisons of GSETSK with DENFIS and RBF (Radial Basis Function) models highlighted additional advantages of GSETSK: it has a rule-base (unlike the opaque RBF model), and that rule-base is more compact, more up-to-date and more easily interpretable.

Introduction

Data Driven Models (DDMs) often need to be trained with a dataset which is representative of the system under consideration (Solomatine et al., 2008). Among the variety of DDMs reported in the literature, Neuro-Fuzzy Models (NFMs) have recently gained popularity due to their reasoning and learning capabilities. An early application of an NFM in stream flow prediction can be found in Chang and Chen (2001). The authors used a counterpropagation fuzzy-neural network (CFNN) for flow forecasting at the Da-Cha River in central Taiwan. The results obtained were more reliable than those of ARMAX. Other studies utilizing NFMs for flow forecasting usually employ either the ANFIS (Adaptive-Network-Based Fuzzy Inference System) (Jang, 1993) or DENFIS (Dynamic Evolving Neural-Fuzzy Inference System) (Kasabov and Song, 2002) models. Studies using ANFIS include Nayak et al. (2004), who used ANFIS for forecasting of river flow in the Baitarani Basin in India. The study showed that the flow forecasted with ANFIS can preserve the statistical properties of the actual flow. In another study, Nayak et al. (2005) used ANFIS for 1–6 lead time flow forecasting in the Kolar Basin, India and showed that ANFIS can model the nonlinear dynamics of the rainfall-runoff process very well. Chau et al. (2005) used ANFIS for daily river flow forecasting in a part of the Yangtze River, suggesting that ANFIS can be used as a complement to traditional models. Chen et al. (2006) used ANFIS for forecasting floods in the Choshui River in central Taiwan, using subtractive clustering for rule base management. Aqil et al. (2007) demonstrated improved ANFIS model performance with subtractive clustering for rainfall-runoff modeling for the Cilalawi River basin in west Java, Indonesia. Firat and Güngör (2007, 2008) used ANFIS for daily river flow forecasting. Remesan et al. (2008) used ANFIS to model the rainfall-runoff process in the Brue Catchment, UK and found that ANFIS produced results with high accuracy and reliability. Mukerji et al. (2009) used ANFIS for flood forecasting at a gauging site in India under different lead times ranging from 3 to 9 h, with excellent results obtained for forecasts up to 6 h. Wang et al. (2009) compared ANFIS, Autoregressive Moving Average (ARMA), Artificial Neural Network (ANN), Genetic Programming (GP) and Support Vector Machine (SVM) models for forecasting monthly discharge for Manwan Hydropower on the Lancangjiang River and Hongjiadu Hydropower on the Wujiang River in China. ANFIS, GP and SVM achieved the best results. Firat and Turan (2010) applied the ANFIS model in forecasting monthly river flow using data from the Göksu River, southern Turkey. Talei et al. (2010a) used ANFIS for event-based rainfall-runoff modeling and demonstrated the superiority of ANFIS over the physical model SWMM (Storm Water Management Model). Talei et al. (2010b) and Talei and Chua (2012) also discussed the effect of lag time and input selection for improved ANFIS model performance. Ghalkhani et al. (2013) used the ANFIS model for a river in Iran and suggested the usage of ANFIS as a backup tool for flood forecasting and warning systems. Rath et al. (2013) used a subtractive clustering method to partition the input space and generate rules for ANFIS and other computational intelligence (CI) models. The model was trained incrementally, hence permitting its operation in real time. Nguyen et al. (2014) used ANFIS for level forecasting in the Lower Mekong River. In more recent studies, Chang et al. (2014) and Chang and Tsai (2016) used ANFIS in forecasting rainfall and river flow in Shihmen Reservoir, Taiwan. Chang and Tsai (2016) used ANFIS for 1–4 h ahead flow forecasting. The authors used a Self-Organizing Map (SOM) and a 2-stage Gamma Test to integrate radar rainfall data as input. Comparison between modelled and measured data showed good prediction of the flood peak and mitigation of the lag time effect.

Some studies have also focused on the use of clustering techniques and genetic algorithms (GA) to create the rules for ANFIS. Nayak et al. (2007) proposed a hybrid intelligent system model where the proper number of rules is realized by partitioning the input space using a hyper-ellipsoidal fuzzy clustering method. Their model was concluded to be comparable to ANFIS. Chidthong et al. (2009) proposed a hybrid multi-NFM where the fuzzy rules are determined using GA. This model was used for daily flood forecasting for Chiang Mai in Thailand and Koriyama in Japan and was benchmarked against the ANFIS model. Recently, some studies have adopted a dynamic NFM known as DENFIS. Talei et al. (2013) applied DENFIS and a real-time version of DENFIS (RTDENFIS) in addition to ANFIS for rainfall-runoff forecasting in three catchments of different sizes. The forecasting results were comparable to or better than those of the physically based models SWMM and HBV. This study also showed that DENFIS needed less training time and provided better results when compared to ANFIS.

Although the above-mentioned studies show the usefulness of NFMs, a new modeling scheme is required in cases where training data is limited or where the data during the simulation stage deviate from the training data. This is because in NFMs such as ANFIS, a fixed number of rules is created during the training phase, and these rules are then used for testing. Each rule is a mapping from a partition of the input space, created using a specific clustering method, to a part of the output space. Thus, when training data is insufficient, the number of rules is inadequate to completely describe the input-output space. Also, if the testing data deviate significantly from the training data, the NFM rules may not be valid. To resolve these problems, a dynamic model can be used in which online learning replaces offline (batch) learning. In online learning, data is fed to the NFM sequentially, one sample at a time, and there is no a priori knowledge of the size of the dataset that will be available to train the model. Currently, online models such as DENFIS suffer from one or both of the following problems: (i) the upper/lower bound of the data used for training is required for model normalization; (ii) the number of rules increases monotonically. In the latter case, there is sometimes a need to repartition the input space and create new rules. This, however, cannot be done when the input-output space mapping is saturated with old rules, some of which may no longer be relevant. In addition, the existence of old rules may reduce the accuracy and interpretability of the model.
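
The normalization problem described above can be illustrated with a minimal sketch. It assumes plain min-max scaling; the bounds, the discharge values and the function name are hypothetical and are not taken from DENFIS or from the study data. The point is only that bounds fixed in advance stop being valid once the stream contains a value outside them.

```python
def min_max_normalize(x, lower, upper):
    """Scale x to [0, 1] using bounds that must be known before the data arrive."""
    return (x - lower) / (upper - lower)


# Hypothetical bounds estimated from a short historical discharge record (m^3/s).
lower, upper = 2.0, 40.0

# Hypothetical stream of new observations; the last value is a flood larger
# than anything in the historical record.
stream = [5.0, 21.0, 40.0, 65.0]

normalized = [min_max_normalize(q, lower, upper) for q in stream]

# Observations whose normalized value leaves [0, 1] violate the assumed bounds.
out_of_range = [q for q, z in zip(stream, normalized) if z < 0.0 or z > 1.0]
print(out_of_range)
```

With little or no prior data, the bounds themselves are a guess, which is why a model that depends on them is hard to deploy in such basins.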

In order to address the deficiencies of NFMs such as ANFIS and DENFIS, a new NFM, GSETSK or Generic Self-Evolving Takagi-Sugeno-Kang (Nguyen and Quek, 2010; Nguyen et al., 2012; Nguyen et al., 2015), is introduced in this paper. GSETSK is a fully online model, i.e. the model does not need any historical data for training; instead, it learns in real time as measured data is presented to it incrementally. As a result: (i) there is no need to define the upper/lower bound of the data, which is required for data normalization in models such as DENFIS; (ii) no assumption is required on the number of partitions to be created in the input space; and (iii) GSETSK uses a rule pruning algorithm which enables the rule-base to discard redundant rules and therefore keeps the rule base current. These capabilities are provided by the following main features of GSETSK: (i) Multidimensional-Scaling Growing Clustering (MSGC), a clustering method that does not require a priori information about the nature of the data, and (ii) a rule pruning algorithm which prunes redundant rules. The objective of this paper was to assess the applicability of GSETSK as an online forecast model in basins without prior data. Datasets obtained from two different sites were used to investigate the applicability of GSETSK for two common hydrological forecast applications, namely rainfall-runoff and river routing forecasting. The dataset used for the rainfall-runoff study was obtained from a Swedish catchment, and GSETSK was used to model 1 day ahead (1DA) river discharge. The river routing study was conducted using data derived from the Lower Mekong River, and GSETSK was used for 1, 3 and 5 days ahead river stage forecasting.
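
The general idea of a rule base that grows with incoming data and discards stale rules can be sketched as follows. This is a toy illustration only: it is not the MSGC clustering method or the GSETSK pruning algorithm, and the distance threshold, decay factor and pruning threshold (`radius`, `decay`, `min_strength`) are hypothetical parameters chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    center: float           # 1-D cluster centre standing in for a rule antecedent
    strength: float = 1.0   # decays while the rule goes unused


def update_rule_base(rules, x, radius=1.0, decay=0.5, min_strength=0.2):
    """Process one sample: refresh or create a rule, then prune stale rules."""
    # Every existing rule loses some strength at each time step.
    for r in rules:
        r.strength *= decay
    # Find the rule closest to the new sample, if any rule exists yet.
    nearest = min(rules, key=lambda r: abs(r.center - x), default=None)
    if nearest is not None and abs(nearest.center - x) <= radius:
        nearest.strength = 1.0          # the rule is still relevant: refresh it
    else:
        rules.append(Rule(center=x))    # no rule covers x: grow the rule base
    # Discard rules that have not been used recently.
    return [r for r in rules if r.strength >= min_strength]


# Hypothetical stream whose regime shifts from about 1 to about 5.
rules = []
for x in [1.0, 1.2, 5.0, 5.1, 5.3, 4.9]:
    rules = update_rule_base(rules, x)

print([r.center for r in rules])
```

In this toy run the rule created for the early regime is eventually pruned once the stream settles in the new regime, so the rule base stays compact without any prior bounds or a preset number of partitions.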

Section snippets

Generic Self-Evolving Takagi-Sugeno-Kang (GSETSK) model

The Generic Self-Evolving Takagi-Sugeno-Kang or GSETSK proposed by Nguyen (2012) is used in this study. The self-evolving property of this model makes it a fully-online model which can work with data streams without any prior assumption of the data distribution. GSETSK is able to dynamically change its structure and parameters to model the most recent knowledge from the data stream. At every data arrival the following steps are performed in GSETSK: (i) First phase of structure learning which

Study area and data

Data from two different geographical locations were used in this study. The first dataset belongs to Klippan_2, a sub-basin of the Rönne catchment, which is located in southern Sweden (Talei et al., 2013). Daily data of river discharge (Q), precipitation (P) and temperature (T) from 1 Jan 1979 to 31 Dec 1980 were used for the rainfall-runoff forecasting in this catchment. Temperature was taken as a surrogate parameter for snowmelt. As this is a small catchment (242.9 km²) and hence the time of

Forecasting in basins without prior data (Scenario 1)

Table 3(a) shows the forecasting results and evaluation indices for rainfall-runoff forecast with GSETSK for 1979 and 1980 (except the first day which was used for initialization) compared with the naïve model and Table 3(b) shows the GSETSK results for 1980 compared with the naïve and HBV models for 1980 (since HBV model results are only available for 1980 (Talei et al., 2013)). The time series are compared in Fig. 2. Note that the HBV model was calibrated using the daily data from 1 Jan 1961

Conclusion

The following can be concluded from this study:

  • (i) When GSETSK model results were compared against physical models such as URBS and HBV, similar results were obtained. The obvious advantage of GSETSK is that a separate training phase is not required, since it is a fully online model.

  • (ii) Being a fully online model, GSETSK was also found to be advantageous as it does not require data for model initialization. This is in comparison with NFMs such as DENFIS which require prior data since upper/lower bound

References (39)

  • A. Talei et al.

    Evaluation of rainfall and discharge inputs used by Adaptive Network-based Fuzzy Inference Systems (ANFIS) in rainfall-runoff modeling

    J. Hydrol.

    (2010)
  • W.-C. Wang et al.

    A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series

    J. Hydrol.

    (2009)
  • P.P. Angelov et al.

    An approach to online identification of Takagi-Sugeno fuzzy models

    IEEE Trans. Syst. Man Cybernet. Part B – Cybernet.

    (2004)
  • K.W. Chau et al.

    Comparison of several flood forecasting models in Yangtze River

    J. Hydrol. Eng.

    (2005)
  • S.H. Chen et al.

    The strategy of building a flood forecast model by neuro-fuzzy network

    Hydrol. Process.

    (2006)
  • Y. Chidthong et al.

    Developing a hybrid multi-model for peak flood forecasting

    Hydrol. Process.

    (2009)
  • M. Firat et al.

    Hydrological time-series modelling using an adaptive neuro-fuzzy inference system

    Hydrol. Process.

    (2008)
  • M. Firat et al.

    Monthly river flow forecasting by an adaptive neuro-fuzzy inference system

    Water Environ. J.

    (2010)
  • H. Ghalkhani et al.

    Application of surrogate artificial intelligent models for real-time flood routing

    Water Environ. J.

    (2013)