
Neurocomputing

Volume 171, 1 January 2016, Pages 89-105

An incremental meta-cognitive-based scaffolding fuzzy neural network

https://doi.org/10.1016/j.neucom.2015.06.022

Abstract

The idea of meta-cognitive learning has enriched the landscape of evolving systems, because it emulates three fundamental aspects of human learning: what-to-learn; how-to-learn; and when-to-learn. However, existing meta-cognitive algorithms still exclude Scaffolding theory, which can realize a plug-and-play classifier. Consequently, these algorithms require laborious pre- and/or post-training processes to be carried out in addition to the main training process. This paper introduces a novel meta-cognitive algorithm termed GENERIC-Classifier (gClass), where the how-to-learn part constitutes a synergy of Scaffolding Theory (a tutoring theory that fosters the ability to sort out complex learning tasks) and Schema Theory (a learning theory of knowledge acquisition by humans). The what-to-learn aspect adopts an online active learning concept by virtue of an extended conflict and ignorance method, making gClass an incremental semi-supervised classifier, whereas the when-to-learn component makes use of the standard sample reserved strategy. A generalized version of the Takagi-Sugeno-Kang (TSK) fuzzy system is devised to serve as the cognitive constituent. That is, the rule premise is underpinned by multivariate Gaussian functions, while the rule consequent employs a subset of non-linear Chebyshev polynomials. Thorough empirical studies, confirmed by their corresponding statistical tests, have numerically validated the efficacy of gClass, which delivers better classification rates than state-of-the-art classifiers while having lower complexity.

Introduction

The consolidation of the meta-cognitive aspect in machine learning was initiated by Suresh et al. [7], [8], [9], [10], [11] based on a prominent meta-memory model proposed by Nelson and Narens [6]. The works in [7], [8], [9], [10], [11] identify that the meta-cognitive components, namely what-to-learn, how-to-learn and when-to-learn, can respectively be modelled with a sample deletion strategy, a sample learning strategy and a sample reserved strategy. Nevertheless, these pioneering works still omit the construct of Scaffolding theory [12,22], which is what renders a classifier plug-and-play. They also do not address semi-supervised learning, since their what-to-learn phase requires the data to be fully labelled.
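The mapping of the three meta-cognitive components onto sample deletion, sample learning and sample reserved strategies can be illustrated with a minimal single-pass sketch. Everything in it is an assumption made for illustration: `ToyPrototypeModel`, `metacognitive_pass`, the scalar feature, and the two radii are hypothetical and are not the criteria used in [7], [8], [9], [10], [11] or in gClass.

```python
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[float, int]  # (scalar feature, class label)


class ToyPrototypeModel:
    """Hypothetical stand-in for a cognitive component: one running mean per class."""

    def __init__(self) -> None:
        self.means: Dict[int, float] = {}
        self.counts: Dict[int, int] = {}

    def predict(self, x: float) -> int:
        # The class whose mean is nearest to x wins.
        return min(self.means, key=lambda k: abs(x - self.means[k]))

    def learn(self, x: float, y: int) -> None:
        # Incremental mean update; no past sample is stored.
        n = self.counts.get(y, 0) + 1
        old = self.means.get(y, x)
        self.means[y] = old + (x - old) / n
        self.counts[y] = n


def metacognitive_pass(
    stream: Iterable[Sample],
    model: ToyPrototypeModel,
    delete_radius: float = 0.1,  # assumed threshold, illustrative only
    learn_radius: float = 2.0,   # assumed threshold, illustrative only
) -> Tuple[Dict[str, int], List[Sample]]:
    """Route each sample to exactly one of: delete, learn, reserve."""
    reserved: List[Sample] = []
    counts = {"deleted": 0, "learned": 0, "reserved": 0}
    for x, y in stream:
        known = y in model.means
        dist = abs(x - model.means[y]) if known else 0.0
        if known and model.predict(x) == y and dist < delete_radius:
            counts["deleted"] += 1       # what-to-learn: redundant sample
        elif not known or dist <= learn_radius:
            model.learn(x, y)            # how-to-learn: update the model
            counts["learned"] += 1
        else:
            reserved.append((x, y))      # when-to-learn: postpone the sample
            counts["reserved"] += 1
    return counts, reserved


if __name__ == "__main__":
    toy_stream = [(0.0, 0), (0.05, 0), (5.0, 1), (0.5, 0), (9.0, 0)]
    print(metacognitive_pass(toy_stream, ToyPrototypeModel()))
```

Each sample is seen once and routed to a single action, so the sketch needs neither the stream length nor a stored history, apart from the explicitly reserved samples.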

A novel meta-cognitive-based Scaffolding classifier, the GENERIC-classifier (gClass), is proposed in this paper. The gClass learning engine comprises three elements: what-to-learn; how-to-learn; and when-to-learn. The underlying novelty of gClass lies in the use of Schema and Scaffolding theories in the how-to-learn component to realize it as a plug-and-play classifier. The plug-and-play learning paradigm emphasizes the need for all learning modules to be embedded in a single learning process without invoking any pre- and/or post-training processes. With respect to its cognitive constituent, gClass generates non-axis-parallel fuzzy rules in the input space, underpinned by a multivariate Gaussian rule premise. Unlike the standard form of TSK fuzzy rule consequents, the rule consequent of gClass is built upon a non-linear function stemming from a subset of non-linear Chebyshev polynomials. All training mechanisms run in the strictly sequential learning mode to assure fast model updates and comply with the four principles of online learning [32]: (1) all training observations are sequentially presented one by one or chunk by chunk to gClass; (2) only one training datum is seen and learned in every training episode; (3) a training sample which has been seen is discarded without being reused; and (4) gClass does not require any information pertaining to the total number of training data.
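To make the rule structure described above more concrete, the following minimal sketch evaluates a multivariate Gaussian premise with a non-diagonal inverse covariance matrix and a Chebyshev-expanded consequent. It is an illustration under stated assumptions, not the gClass implementation: the function names, the factor 1/2 in the exponent, the expansion order of 2 and the example matrices are all hypothetical choices.

```python
import math
from typing import List, Sequence


def mahalanobis_sq(x: Sequence[float], c: Sequence[float],
                   inv_cov: Sequence[Sequence[float]]) -> float:
    """Squared Mahalanobis distance (x - c)^T inv_cov (x - c)."""
    d = [xi - ci for xi, ci in zip(x, c)]
    n = len(d)
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))


def firing_strength(x: Sequence[float], c: Sequence[float],
                    inv_cov: Sequence[Sequence[float]]) -> float:
    """Gaussian rule premise; the factor 0.5 is an assumed convention."""
    return math.exp(-0.5 * mahalanobis_sq(x, c, inv_cov))


def chebyshev_expand(x: Sequence[float], order: int = 2) -> List[float]:
    """Bias term plus T_1..T_order of every input, using the recurrence
    T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x)."""
    feats = [1.0]
    for xi in x:
        t_prev, t_cur = 1.0, xi          # T_0 and T_1
        feats.append(t_cur)
        for _ in range(2, order + 1):
            t_prev, t_cur = t_cur, 2.0 * xi * t_cur - t_prev
            feats.append(t_cur)
    return feats


def rule_output(x: Sequence[float], weights: Sequence[float],
                order: int = 2) -> float:
    """Non-linear consequent: weighted sum of the expanded inputs."""
    return sum(w * f for w, f in zip(weights, chebyshev_expand(x, order)))


if __name__ == "__main__":
    centre = [0.0, 0.0]
    rotated = [[2.0, -1.0], [-1.0, 2.0]]   # non-diagonal inverse covariance
    axis_par = [[2.0, 0.0], [0.0, 2.0]]    # diagonal inverse covariance
    for point in ([1.0, 1.0], [1.0, -1.0]):
        print(point,
              firing_strength(point, centre, rotated),
              firing_strength(point, centre, axis_par))
    print(chebyshev_expand([0.5, -1.0]))
```

The two test points lie at the same Euclidean distance from the centre. The diagonal matrix gives them identical firing strengths, whereas the non-diagonal matrix separates them, which is the rotated-ellipsoid behaviour an axis-parallel rule cannot express.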

The gClass learning scenario utilizes several learning modules from our previous algorithms in [18], [19]. Three rule growing cursors, namely Datum Significance (DS), Data Quality (DQ), and Generalized Adaptive Resonance Theory+ (GART+), are used to evolve fuzzy rules according to Schema theory [14]. Two rule pruning strategies, namely the Extended Rule Significance (ERS) and Potential (P+) methods, remove obsolete and inactive fuzzy rules and portray the fading aspect of Scaffolding theory. The P+ method also governs the rule recall process, manifesting the problematizing component of Scaffolding theory to cope with recurring concept drift. The Fuzzily Weighted Generalized Recursive Least Square (FWGRLS) method adjusts the rule consequents and in turn delineates the passive supervision of Scaffolding theory. Like its counterparts in [7], [8], [9], [10], [11], gClass employs the sample reserved strategy in the when-to-learn process. Nonetheless, several new learning modules are proposed in this paper:

  • The what-to-learn component is built upon a new online active learning scenario, called the Extended Conflict and Ignorance (ECI) method. The ECI method is derived from the conflict and ignorance method [2], and its ignorance part is enhanced by using the DQ method instead of the classical rule firing strength concept. This modification makes the online active learning method more robust against outliers and more accurate in deciding the sample ignorance. Note that this mechanism can also be perceived as an enhanced version of the original what-to-learn module in [7], [8], [9], [10], [11], which is limited to ruling out redundant samples for model updates and still assumes that the data are fully labelled.

  • A new fuzzy rule initialization strategy, built on the potential per-class method, is proposed. This method is used to avoid misclassifications caused by class overlapping. Several attempts have been made in [7], [8], [9], [10], [69], [70], [71] to circumvent class overlapping; however, they rely on the distance ratio method, which overlooks the existence of unclean clusters. An unclean cluster is a cluster that contains supports from different classes, and such clusters are prevalent in real-world problems. This learning aspect actualizes the restructuring phase of Schema theory.

  • gClass is also equipped with a local forgetting scheme inspired by [28] to surmount gradual concept drift, where the forgetting intensity is determined by a newly developed method, called the Local Data Quality (LDQ) method. It is worth stressing that gradual concept drift is more precarious than abrupt concept drift, because it cannot be detected by standard drift detection or by the rule generation method. On the other hand, it cannot be handled by the conventional parameter learning method either. This situation calls for the local forgetting scheme, which adapts fuzzy rule parameters more strongly and is thereby able to track changing data distributions. In the realm of Scaffolding theory, the local drift-handling strategy plays a problematizing role in the active supervision of the theory.

  • gClass enhances the Fisher Separability Criterion (FSC) in the empirical feature space method with an optimization step via the gradient ascent method. This step not only alleviates the curse of dimensionality, but also improves the discriminatory power of the input features. Notably, it has a direct impact on the classifier's generalization. The online feature weighting technique is employed to address the complexity reduction scenario in the active supervision of the scaffolding concept.
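The conflict and ignorance idea behind the what-to-learn module in the first item above can be sketched in its classical form, where conflict is read from the gap between the two highest class confidences and ignorance from a low maximum rule firing strength. This is only a hedged illustration: the function name `query_decision` and both thresholds are hypothetical, and the ECI method replaces the firing-strength test with the DQ method, which is not reproduced here.

```python
from typing import Sequence, Tuple


def query_decision(
    class_confidences: Sequence[float],
    max_firing_strength: float,
    conflict_margin: float = 0.2,   # assumed threshold, illustrative only
    ignorance_level: float = 0.1,   # assumed threshold, illustrative only
) -> Tuple[bool, bool]:
    """Return (conflict, ignorance) flags for one unlabelled sample.

    conflict:  the two most confident classes are nearly tied, so the
               sample lies close to a decision boundary.
    ignorance: no existing rule covers the sample well, so it lies in an
               unexplored region of the input space.
    """
    ranked = sorted(class_confidences, reverse=True)
    conflict = len(ranked) > 1 and (ranked[0] - ranked[1]) < conflict_margin
    ignorance = max_firing_strength < ignorance_level
    return conflict, ignorance


def should_request_label(class_confidences: Sequence[float],
                         max_firing_strength: float) -> bool:
    """Ask the operator for a label only when either flag is raised."""
    conflict, ignorance = query_decision(class_confidences, max_firing_strength)
    return conflict or ignorance


if __name__ == "__main__":
    print(should_request_label([0.55, 0.45], 0.80))  # near the boundary
    print(should_request_label([0.90, 0.10], 0.80))  # confident and covered
    print(should_request_label([0.90, 0.10], 0.01))  # far from every rule
```

Samples that raise neither flag can be skipped without requesting a label, which is how an active learning scheme of this kind lowers the annotation effort in a semi-supervised setting.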

The contributions of this paper are summarized as follows: (1) the paper proposes a new class of meta-cognitive classifiers, which consolidate the Schema and Scaffolding theories to drive the how-to-learn module. (2) The paper introduces a novel type of TSK fuzzy rule, crafted from the multivariate Gaussian function in the premise component and the non-linear Chebyshev polynomial in the output component. (3) Four novel learning modules in the gClass learning engine are proposed: online feature selection; online active learning; class overlapping strategy; and online feature weighting mechanism. The viability and efficacy of gClass have been validated by means of thorough numerical studies on both real-world and artificial study cases. gClass has also been benchmarked against various state-of-the-art classifiers, with the comparisons confirmed by rigorous statistical tests, in which gClass demonstrates highly encouraging generalization power while keeping complexity at an acceptable level. The remainder of this paper is organized as follows: Section 2 discusses related works. Section 3 illustrates the gClass inference mechanism, i.e., its cognitive aspect. Section 4 outlines the algorithmic development of gClass, i.e., its meta-cognitive component. Section 5 presents the empirical studies and discusses the research gap and contributions of gClass. Concluding remarks are drawn in the last section of this paper.

Section snippets

Literature review

In this section, two related areas are discussed. A survey of the psychological concepts implemented in gClass is undertaken, as well as a literature review of state-of-the-art evolving classifiers.

Cognitive component of gClass

gClass is endowed with a generalized fuzzy rule [21], in which the multivariate Gaussian function, which possesses a non-diagonal covariance matrix, is utilized as the rule antecedent. This rule premise is an attractive option for covering real-world data distributions because it can evolve non-axis parallel ellipsoids and is capable of conferring more exact coverage of data distributions. It is worth noting that this advantage cannot be achieved with axis parallel rules induced by the

Meta-cognitive learning

An incoming datum is first vetted by the what-to-learn module (Section 4.2), which aims to rule out inconsequential samples for the model updates. The training samples, admitted by the what-to-learn component, are injected into the how-to-learn module (Section 4.1), which updates the cognitive component. The training samples, which do not satisfy the learning criteria set out in the how-to-learn component, are assigned as reserved samples. The reserved samples are utilized after the main

Efficacy of gClass learning modules

This section is intended to evaluate the efficacy of gClass's learning modules. Three data sets, namely thyroid, wine, and ionosphere, obtained from the University of California, Irvine (UCI) machine learning repository (http://www.ics.uci.edu/mlearn/MLRepository.html), are used to assess the qualities of the proposed learning components. The weather data set is also used, because it contains severe concept drift. In this section, we evaluate the weather data set from the Offutt Air

Conclusions

A novel meta-cognitive classifier, namely gClass, is proposed in this paper. The major contributions of gClass are threefold: (1) gClass introduces a generalized meta-cognitive learning paradigm, in which the how-to-learn module is consistent with the Schema and Scaffolding theories; (2) gClass relies on a generalized TSK fuzzy rule, exploiting the multivariate Gaussian function in the premise component and the non-linear Chebyshev function in the consequent component; (3) four

Acknowledgments

The work presented in this paper is partly supported by the Australian Research Council (ARC) under Discovery Project nos. DP110103733 and DP140101366, and the first author acknowledges receipt of a UTS research seed funding grant.


References (75)

  • E. Lughofer

    On-line assurance of interpretability criteria in evolving fuzzy systems—achievements, new concepts and open issues

    Inf. Sci.

    (2013)
  • E. Lughofer et al.

    Autonomous data stream clustering implementing incremental split-and-merge techniques—towards a plug-and-play approach

    Inf. Sci.

    (2015)
  • A. Almaksour et al.

    LClass: error-driven antecedent learning for evolving Takagi-Sugeno classification systems

    Appl. Soft Comput.

    (2014)
  • M. Han et al.

    Endpoint prediction model for basic oxygen furnace steel-making based on membrane algorithm evolving extreme learning machine

    Appl. Soft Comput.

    (2014)
  • A. Zdesar et al.

    Self-tuning of 2 DOF control based on evolving fuzzy model

    Appl. Soft Comput.

    (2014)
  • R.-E. Precup et al.

    Online identification of evolving Takagi-Sugeno-Kang fuzzy models for crane systems

    Appl. Soft Comput.

    (2014)
  • N. Wang et al.

    A fast and accurate online self-organizing scheme for parsimonious fuzzy neural networks

    Neurocomputing

    (2009)
  • P. Angelov et al.

    Evolving fuzzy-rule-based classifiers from data streams

    IEEE Trans. Fuzzy Syst.

    (2008)
  • E. Lughofer et al.

    Reliable all-pairs evolving fuzzy classifiers

    IEEE Trans. Fuzzy Syst.

    (2013)
  • R.D. Baruah, P. Angelov, J. Andreu, Simpl_eClass: simplified potential-free evolving fuzzy rule-based classifiers, in:...
  • G.S. Babu et al.

    Sequential projection-based metacognitive learning in a radial basis function network for classification problems

    IEEE Trans. Neural Netw. Learn. Syst.

    (2013)
  • K. Subramanian et al.

    A Meta-Cognitive Neuro-Fuzzy Inference System (McFIS) for sequential classification systems

    IEEE Trans. Fuzzy Syst.

    (2013)
  • G. Sateesh Babu et al.

    Meta-cognitive rbf network and its projection based learning algorithm for classification problems

    Appl. Soft Comput.

    (2013)
  • K. Subramanian, R. Savitha, S. Suresh, A meta-cognitive interval type-2 fuzzy inference system classifier and its...
  • B.J. Reiser

    Scaffolding complex learning: the mechanisms of structuring and problematizing student work

    J. Learn. Sci.

    (2004)
  • F.C. Bartlett

    Remembering: A Study in Experimental and Social Psychology

    (1932)
  • J.H. Flavell

    Piaget's legacy

    Psychol. Sci.

    (1996)
  • R. Elwell et al.

    Incremental learning of concept drift in non-stationary environments

    IEEE Trans. Neural Netw.

    (2011)
  • M. Pratama et al.

    PANFIS: a novel incremental learning machine

    IEEE Trans. Neural Netw. Learn. Syst.

    (2014)
  • M. Pratama et al.

    GENEFIS: towards an effective localist network

    IEEE Trans. Fuzzy Syst.

    (2014)
  • M. Pratama et al.

    pClass: An Effective Classifier to Streaming Examples

    IEEE Trans. Fuzzy Syst.

    (2015)
  • M. Pratama, S. Anavatti, E. Lughofer, Evolving fuzzy rule-based classifier based on GENEFIS, in: Proceedings of the...
  • M. Pratama, M.-J. Er, S. Anavatti, E. Lughofer, I. Arifin, A novel meta-cognitive-based scaffolding classifier to...
  • J.C. Patra et al.

    Nonlinear dynamic system identification using Chebyshev functional link artificial neural networks

    IEEE Trans. Syst., Man Cybern.—Part B: Cybern.

    (2002)
  • Y.-Y. Lin et al.

    Identification and prediction of dynamic systems using an interactively recurrent self-evolving fuzzy neural network

    IEEE Trans. Neural Netw. Learn. Syst.

    (2013)
  • E. Lughofer

    Hybrid active learning for reducing the annotation effort of operators in classification systems

    Pattern Recognit.

    (2013)
  • A. Shaker et al.

    Self-adaptive and local strategies for a smooth treatment of drifts in data streams

    Evol. Syst.

    (2014)

    Mahardhika Pratama received his B.Eng. degree (First Class Honor) in Electrical Engineering from the Sepuluh Nopember Institute of Technology, Indonesia, in 2010. At the same time, he was awarded the best and most favorite final project by the same institution. Dr. Pratama obtained his Master of Science (M.Sc.) degree in Computer Control and Automation (CCA) from Nanyang Technological University, Singapore, in 2011 and received a prestigious engineering achievement award given by the Institute of Engineers, Singapore. He attained a Ph.D. degree from the University of New South Wales, Australia, in 2014, where he was awarded the high impact publication award in 2013 and 2014. Dr. Pratama is currently working at the Centre of Quantum Computation and Intelligent Systems (QCIS), University of Technology, Sydney (UTS), as a research fellow. Dr. Pratama is a member of IEEE, the IEEE Computational Intelligence Society (CIS), the IEEE Systems, Man and Cybernetics Society (SMCS), and the Indonesian Soft Computing Society (ISC-INA), and serves as a reviewer for top-tier journals such as IEEE Transactions on Systems, Man and Cybernetics Part-B: Cybernetics, Neurocomputing and Applied Soft Computing. His research interests involve machine learning, computational intelligence, evolutionary computation, fuzzy logic, neural networks and evolving adaptive systems.

    Professor Jie Lu is the Associate Dean in Research in the Faculty of Engineering and Information Technology at the University of Technology Sydney (UTS). Her main research interests lie in the area of decision support systems, recommender systems, prediction and early warning systems, fuzzy transfer learning, and e-Service intelligence. She has published five research books and 400 papers in refereed journals and conference proceedings. She has won seven Australian Research Council (ARC) discovery grants, and 10 other research grants. She received the first UTS Research Excellent Medal for Teaching and Research Integration in 2010. She serves as Editor-In-Chief for Knowledge-Based Systems (Elsevier) and Editor-In-Chief for International Journal on Computational Intelligence Systems (Atlantis), and has delivered many keynote speeches at international conferences.

    Sreenatha Anavatti received his Ph.D. degree in aerospace engineering from the Indian Institute of Science in 1990 and his Bachelor of Engineering degree in mechanical engineering from Mysore University, India, in 1984. He is currently a Senior Lecturer at the School of Aerospace, Civil and Mechanical Engineering (ACME), University of New South Wales at Australian Defence Force Academy (UNSW@ADFA), Australia. His current research interests include control systems, flight dynamics, robotics, aeroelasticity, artificial neural networks, fuzzy systems and unmanned systems.

    Edwin David Lughofer received his Ph.D. degree from the Department of Knowledge-Based Mathematical Systems, University Linz. He is now employed as key researcher at the department's branch Fuzzy Logic Laboratorium in the Softwarepark Hagenberg. He has participated in several international research (EU) projects and has published more than 100 journal and conference papers in the fields of evolving fuzzy systems, machine learning and vision, clustering, and fault detection, including a monograph on ‘Evolving Fuzzy Systems’ (Springer, Heidelberg) and an edited book on ‘Learning in Non-stationary Environments’ (Springer, New York). He acts as a reviewer for peer-reviewed international journals and as (co-)organizer of special sessions and issues at international conferences and journals. In 2014, he served as Main Chair of the international IEEE conference on Evolving and Adaptive Intelligent Systems. He has served as programme committee member in several international conferences and is a member of the editorial board and associate editor of the international Springer journal ‘Evolving Systems’ and the international Elsevier Journal ‘Information Fusion’. In 2006, he received the best paper award at the International Symposium on Evolving Fuzzy Systems, and in 2013 the best paper award at the IFAC Conference on Manufacturing Modeling, Management and Control.
