An incremental meta-cognitive-based scaffolding fuzzy neural network
Introduction
The consolidation of the meta-cognitive aspect in machine learning was initiated by Suresh et al. [7], [8], [9], [10], [11], based on the prominent meta-memory model proposed by Nelson and Narens [6]. The works in [7], [8], [9], [10], [11] show that the meta-cognitive components, namely what-to-learn, how-to-learn and when-to-learn, can be modelled with a sample deletion strategy, a sample learning strategy and a sample reserved strategy, respectively. Nevertheless, these pioneering works still discount the construct of Scaffolding theory [12,22], which is what renders a plug-and-play classifier. They also do not address the issue of semi-supervised learning, since the what-to-learn phase requires the data to be fully labelled.
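The mapping of the three meta-cognitive components onto sample deletion, sample learning and sample reserve can be pictured with a minimal sketch. It is illustrative only: the function name `metacognitive_pass`, the scalar novelty score attached to each sample and the two thresholds are hypothetical and are not taken from [7], [8], [9], [10], [11].

```python
def metacognitive_pass(stream, delete_below=0.2, learn_from=0.6):
    """Route each sample of a stream to one of three strategies.

    Each sample is represented here by a hypothetical novelty score
    in [0, 1]; both thresholds are illustrative assumptions.
    """
    learned, reserved, deleted = [], [], 0
    for novelty in stream:
        if novelty < delete_below:
            # what-to-learn: sample deletion strategy (redundant sample)
            deleted += 1
        elif novelty >= learn_from:
            # how-to-learn: sample learning strategy (update the model)
            learned.append(novelty)
        else:
            # when-to-learn: sample reserved strategy (postpone the sample)
            reserved.append(novelty)
    return learned, reserved, deleted


learned, reserved, deleted = metacognitive_pass([0.1, 0.7, 0.4, 0.9, 0.05])
print(learned, reserved, deleted)
```

In an actual classifier the novelty score would be replaced by model-dependent criteria, and the learning branch would trigger a structural or parametric update rather than appending to a list.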
A novel meta-cognitive-based Scaffolding classifier, the GENERIC-classifier (gClass), is proposed in this paper. The gClass learning engine comprises three elements: what-to-learn; how-to-learn; and when-to-learn. The underlying novelty of gClass lies in the use of Schema and Scaffolding theories in the how-to-learn component to realize it as a plug-and-play classifier. The plug-and-play learning paradigm emphasizes the need for all learning modules to be embedded in a single learning process without invoking any pre- and/or post-training processes. In respect of its cognitive constituent, the gClass fuzzy rule forms a non-axis-parallel ellipsoidal region in the input space, underpinned by the multivariate Gaussian function in the rule premise. Unlike the standard form of TSK fuzzy rule consequents, the rule consequent of gClass is built upon a non-linear function stemming from a subset of non-linear Chebyshev polynomials. All training mechanisms run in the strictly sequential learning mode to assure fast model updates and comply with the four principles of online learning [32]: (1) all training observations are sequentially presented one by one or chunk by chunk to gClass; (2) only one training datum is seen and learned in every training episode; (3) a training sample which has been seen is discarded without being reused; and (4) gClass does not require any information pertaining to the total number of training data.
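The rule structure described above can be illustrated with a short sketch. It rests on several assumptions that are not stated in this section: the firing strength is taken as exp(-(x-c)^T Σ^{-1} (x-c)) without a 1/2 factor, the consequent uses only the first- and second-order Chebyshev terms T1(x) = x and T2(x) = 2x^2 - 1 of each input, and the rule outputs are combined by normalized firing strengths. All function names are hypothetical.

```python
import numpy as np


def firing_strength(x, center, inv_cov):
    """Multivariate Gaussian premise with a full (non-diagonal) inverse covariance."""
    d = x - center
    return float(np.exp(-d @ inv_cov @ d))


def chebyshev_expand(x):
    """Extended input [1, T1(x1), T2(x1), ..., T1(xu), T2(xu)] (assumed order 2)."""
    terms = [1.0]
    for xj in x:
        terms.extend([xj, 2.0 * xj ** 2 - 1.0])
    return np.array(terms)


def infer(x, centers, inv_covs, weights):
    """Weighted average of non-linear rule consequents; returns one score per class."""
    phi = chebyshev_expand(x)
    strengths = np.array(
        [firing_strength(x, c, s) for c, s in zip(centers, inv_covs)]
    )
    rule_outputs = np.array([phi @ w for w in weights])  # (n_rules, n_classes)
    normalized = strengths / max(strengths.sum(), 1e-12)  # guard against underflow
    return normalized @ rule_outputs


# One rule with a rotated (non-axis-parallel) ellipsoid in a 2-D input space.
centers = [np.zeros(2)]
inv_covs = [np.array([[2.0, 0.8], [0.8, 1.0]])]
weights = [np.ones((5, 2))]  # 1 + 2 * n_inputs expansion terms, 2 classes
print(infer(np.array([0.3, -0.2]), centers, inv_covs, weights))
```

The off-diagonal entries of the inverse covariance matrix are what rotate the ellipsoid away from the coordinate axes; with a diagonal matrix the sketch reduces to a conventional axis-parallel rule.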
The gClass learning scenario utilizes several learning modules from our previous algorithms in [18], [19]. Three rule growing cursors, namely Datum Significance (DS), Data Quality (DQ), and Generalized Adaptive Resonance Theory+ (GART+), are used to evolve fuzzy rules according to the Schema theory [14]. Two rule pruning strategies, namely the Extended Rule Significance (ERS) and Potential (P+) methods, are assembled to get rid of obsolete and inactive fuzzy rules and portray the fading aspect of Scaffolding theory. The P+ method also carries out the rule recall process, manifesting the problematizing component of Scaffolding theory to cope with recurring concept drift. The Fuzzily Weighted Generalized Recursive Least Square (FWGRLS) method is integrated to adjust the rule consequents and in turn delineates the passive supervision of Scaffolding theory. Like its counterparts in [7], [8], [9], [10], [11], gClass employs the sample reserved strategy in the when-to-learn process. Nonetheless, several new learning modules are proposed in this paper:
- The what-to-learn component is built upon a new online active learning scenario, called the Extended Conflict and Ignorance (ECI) method. The ECI method is derived from the conflict and ignorance method [2], and the ignorance method is enhanced by the use of the DQ method instead of the classical rule firing strength concept. This modification makes the online active learning method more robust against outliers and more accurate in deciding the sample ignorance. Note that this mechanism can also be perceived as an enhanced version of the original what-to-learn module in [7], [8], [9], [10], [11]. In [7], [8], [9], [10], [11], the what-to-learn module is limited to ruling out redundant samples for model updates, and still assumes that data are fully labelled.
- A new fuzzy rule initialization strategy is proposed and is constructed by the potential per-class method. This method is used to avoid misclassifications caused by the class overlapping situation. A number of research efforts have been made in [7], [8], [9], [10], [69], [70], [71] to circumvent the class overlapping situation; however, they rely on the distance ratio method, which overlooks the existence of unclean clusters. An unclean cluster is a cluster that contains supports from different classes and is prevalent in real-world problems. This learning aspect actualizes the restructuring phase of Schema theory.
- gClass is also equipped with a local forgetting scheme inspired by [28] to surmount gradual concept drift, where the forgetting intensity is determined by a newly developed method, called the Local Data Quality (LDQ) method. It is worth stressing that gradual concept drift is more precarious than abrupt concept drift, because gradual concept drift cannot be detected by standard drift detection or the rule generation method. On the other hand, it cannot be handled by the conventional parameter learning method either. This situation entails the local forgetting scheme, which adapts fuzzy rule parameters more firmly and is thereby able to pursue changing data distributions. In the realm of Scaffolding theory, the local drift-handling strategy plays a problematizing role in the active supervision of the theory.
- gClass enhances the Fisher Separability Criterion (FSC) in the empirical feature space method with an optimization step via the gradient ascent method. This step not only alleviates the curse of dimensionality, but also improves the discriminatory power of input features. Noticeably, it has a direct impact on the classifier's generalization. The online feature weighting technique is employed to address the complexity reduction scenario in the active supervision of the scaffolding concept.
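As a rough illustration of how a Fisher-type separability score can be turned into feature weights, the sketch below computes the classical per-feature ratio of between-class to within-class scatter on a batch of labelled data. This is a simplification under stated assumptions: it works in the original input space and in batch mode, whereas the method outlined in the last item above operates online in the empirical feature space and adds a gradient ascent step. The function names and the max-normalization of the weights are hypothetical.

```python
import numpy as np


def fisher_scores(X, y):
    """Per-feature ratio of between-class scatter to within-class scatter."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    return between / np.maximum(within, 1e-12)  # avoid division by zero


def feature_weights(X, y):
    """Scale the scores to [0, 1] so the most separable feature gets weight 1."""
    scores = fisher_scores(X, y)
    return scores / max(scores.max(), 1e-12)


# Feature 0 separates the two classes; feature 1 carries no class information.
X = np.array([[0.0, 1.0], [0.2, -1.0], [1.0, 1.0], [1.2, -1.0]])
y = np.array([0, 0, 1, 1])
print(fisher_scores(X, y))
print(feature_weights(X, y))
```

Features with weights close to zero contribute little to class discrimination, so down-weighting them is one way to soften the curse of dimensionality without discarding inputs outright.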
The contributions of this paper are summarized as follows: (1) the paper proposes a new class of meta-cognitive classifiers, which consolidates the Schema and Scaffolding theories to drive the how-to-learn module; (2) the paper introduces a novel type of TSK fuzzy rule, crafted from the multivariate Gaussian function in the premise component and the non-linear Chebyshev polynomial in the output component; (3) four novel learning modules in the gClass learning engine are proposed: online feature selection; online active learning; class overlapping strategy; and online feature weighting mechanism. The viability and efficacy of gClass have been validated by thorough numerical studies on both real-world and artificial study cases. gClass has also been benchmarked against various state-of-the-art classifiers, with rigorous statistical tests confirming that gClass delivers highly encouraging generalization power while keeping complexity at an acceptable level. The remainder of this paper is organized as follows: Section 2 discusses related works. Section 3 illustrates the gClass inference mechanism, i.e., its cognitive aspect. Section 4 outlines the algorithmic development of gClass, i.e., its meta-cognitive component. Section 5 presents the empirical studies and discusses the research gap and contribution of gClass. Concluding remarks are drawn in the last section of this paper.
Literature review
In this section, two related areas are discussed. A survey of the psychological concepts implemented in gClass is undertaken, as well as a literature review of state-of-the-art evolving classifiers.
Cognitive component of gClass
gClass is endowed with a generalized fuzzy rule [21], in which the multivariate Gaussian function, which possesses a non-diagonal covariance matrix, is utilized as the rule antecedent. This rule premise is an attractive option for covering real-world data distributions because it can evolve non-axis parallel ellipsoids and is capable of conferring more exact coverage of data distributions. It is worth noting that this advantage cannot be achieved with axis parallel rules induced by the
Meta-cognitive learning
An incoming datum is first vetted by the what-to-learn module (Section 4.2), which aims to rule out inconsequential samples for the model updates. The training samples, admitted by the what-to-learn component, are injected into the how-to-learn module (Section 4.1), which updates the cognitive component. The training samples, which do not satisfy the learning criteria set out in the how-to-learn component, are assigned as reserved samples. The reserved samples are utilized after the main
Efficacy of gClass learning modules
This section is intended to evaluate the efficacy of gClass's learning modules. Three data sets, namely thyroid, wine, and ionosphere, obtained from the University of California, Irvine (UCI) machine learning repository (http://www.ics.uci.edu/mlearn/MLRepository.html), are used to assess the qualities of the proposed learning components. The weather data set is also used, because it contains severe concept drift. In this section, we evaluate the weather data set from the Offutt Air
Conclusions
A novel meta-cognitive classifier, namely gClass, is proposed in this paper. The major contributions of gClass are threefold: (1) gClass introduces a generalized meta-cognitive learning paradigm, in which the how-to-learn module is consistent with the Schema and Scaffolding theories; (2) gClass relies on a generalized TSK fuzzy rule, exploiting the multivariate Gaussian function in the premise component and the non-linear Chebyshev function in the consequent component; (3) four
Acknowledgments
The work presented in this paper is partly supported by the Australian Research Council (ARC) under Discovery Project nos. DP110103733 and DP140101366, and the first author acknowledges receipt of a UTS research seed funding grant.
References (75)
- et al., Evolving fuzzy classifiers using different model architectures, Fuzzy Sets Syst. (2008)
- et al., Adaptive fault detection and diagnosis using an evolving fuzzy classifier, Inf. Sci. (2013)
- et al., Metamemory: a theoretical framework and new findings, Psychol. Learn. Motiv. (1990)
- et al., A meta-cognitive sequential learning algorithm for neuro-fuzzy inference system, Appl. Soft Comput. (2012)
- et al., A massively parallel architecture for a self-organizing neural pattern recognition machine, Comput. Vis., Graph., Image Process. (1987)
- et al., Data driven modelling based on dynamic parsimonious fuzzy neural network, Neurocomputing (2013)
- On-line incremental feature weighting in evolving fuzzy classifiers, Fuzzy Sets Syst. (2011)
- et al., Fuzzy passive-aggressive classification: a robust and efficient algorithm for online classification problems, Inf. Sci. (2013)
- et al., Handling drifts and shifts in on-line data streams with evolving fuzzy systems, Appl. Soft Comput. (2011)
- et al., Data compression by volume prototypes for streaming data, Pattern Recognit. (2010)
- On-line assurance of interpretability criteria in evolving fuzzy systems—achievements, new concepts and open issues, Inf. Sci.
- Autonomous data stream clustering implementing incremental split-and-merge techniques—towards a plug-and-play approach, Inf. Sci.
- LClass: error-driven antecedent learning for evolving Takagi-Sugeno classification systems, Appl. Soft Comput.
- Endpoint prediction model for basic oxygen furnace steel-making based on membrane algorithm evolving extreme learning machine, Appl. Soft Comput.
- Self-tuning of 2 DOF control based on evolving fuzzy model, Appl. Soft Comput.
- Online identification of evolving Takagi-Sugeno-Kang fuzzy models for crane systems, Appl. Soft Comput.
- A fast and accurate online self-organizing scheme for parsimonious fuzzy neural networks, Neurocomputing
- Evolving fuzzy-rule-based classifiers from data streams, IEEE Trans. Fuzzy Syst.
- Reliable all-pairs evolving fuzzy classifiers, IEEE Trans. Fuzzy Syst.
- Sequential projection-based metacognitive learning in a radial basis function network for classification problems, IEEE Trans. Neural Netw. Learn. Syst.
- A Meta-Cognitive Neuro-Fuzzy Inference System (McFIS) for sequential classification systems, IEEE Trans. Fuzzy Syst.
- Meta-cognitive RBF network and its projection based learning algorithm for classification problems, Appl. Soft Comput.
- Scaffolding complex learning: the mechanisms of structuring and problematizing student work, J. Learn. Sci.
- Remembering: A Study in Experimental and Social Psychology
- Piaget's legacy, Psychol. Sci.
- Incremental learning of concept drift in non-stationary environments, IEEE Trans. Neural Netw.
- PANFIS: a novel incremental learning machine, IEEE Trans. Neural Netw. Learn. Syst.
- GENEFIS: towards an effective localist network, IEEE Trans. Fuzzy Syst.
- pClass: an effective classifier to streaming examples, IEEE Trans. Fuzzy Syst.
- Nonlinear dynamic system identification using Chebyshev functional link artificial neural networks, IEEE Trans. Syst., Man Cybern.—Part B: Cybern.
- Identification and prediction of dynamic systems using an interactively recurrent self-evolving fuzzy neural network, IEEE Trans. Neural Netw. Learn. Syst.
- Hybrid active learning for reducing the annotation effort of operators in classification systems, Pattern Recognit.
- Self-adaptive and local strategies for a smooth treatment of drifts in data streams, Evol. Syst.
Mahardhika Pratama received the B.Eng. degree (First Class Honours) in Electrical Engineering from the Sepuluh Nopember Institute of Technology, Indonesia, in 2010. At the same time, he was awarded the best and most favorite final project by the same institution. Dr. Pratama obtained his Master of Science (M.Sc.) degree in Computer Control and Automation (CCA) from Nanyang Technological University, Singapore, in 2011 and received a prestigious engineering achievement award from the Institution of Engineers, Singapore. He attained a Ph.D. degree from the University of New South Wales, Australia, in 2014, where he was awarded the high impact publication award in 2013 and 2014. Dr. Pratama is currently working at the Centre for Quantum Computation and Intelligent Systems (QCIS), University of Technology, Sydney (UTS), as a research fellow. Dr. Pratama is a member of IEEE, the IEEE Computational Intelligence Society (CIS), the IEEE Systems, Man, and Cybernetics Society (SMCS), and the Indonesian Soft Computing Society (ISC-INA), and serves as a reviewer for top-tier journals such as IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Neurocomputing and Applied Soft Computing. His research interests include machine learning, computational intelligence, evolutionary computation, fuzzy logic, neural networks and evolving adaptive systems.
Professor Jie Lu is the Associate Dean in Research in the Faculty of Engineering and Information Technology at the University of Technology Sydney (UTS). Her main research interests lie in the area of decision support systems, recommender systems, prediction and early warning systems, fuzzy transfer learning, and e-Service intelligence. She has published five research books and 400 papers in refereed journals and conference proceedings. She has won seven Australian Research Council (ARC) discovery grants, and 10 other research grants. She received the first UTS Research Excellent Medal for Teaching and Research Integration in 2010. She serves as Editor-In-Chief for Knowledge-Based Systems (Elsevier) and Editor-In-Chief for International Journal on Computational Intelligence Systems (Atlantis), and has delivered many keynote speeches at international conferences.
Sreenatha Anavatti received his Ph.D. degree in aerospace engineering from the Indian Institute of Science in 1990 and his Bachelor of Engineering degree in mechanical engineering from Mysore University, India, in 1984. He is currently a Senior Lecturer at the School of Aerospace, Civil and Mechanical Engineering (ACME), University of New South Wales at Australian Defence Force Academy (UNSW@ADFA), Australia. His current research interests include control systems, flight dynamics, robotics, aeroelasticity, artificial neural networks, fuzzy systems and unmanned systems.
Edwin David Lughofer received his Ph.D. degree from the Department of Knowledge-Based Mathematical Systems, University Linz. He is now employed as key researcher at the department's branch Fuzzy Logic Laboratorium in the Softwarepark Hagenberg. He has participated in several international research (EU) projects and has published more than 100 journal and conference papers in the fields of evolving fuzzy systems, machine learning and vision, clustering, and fault detection, including a monograph on ‘Evolving Fuzzy Systems’ (Springer, Heidelberg) and an edited book on ‘Learning in Non-stationary Environments’ (Springer, New York). He acts as a reviewer for peer-reviewed international journals and as (co-)organizer of special sessions and issues at international conferences and journals. In 2014, he served as Main Chair of the international IEEE conference on Evolving and Adaptive Intelligent Systems. He has served as programme committee member for several international conferences and is a member of the editorial board and associate editor of the international Springer journal ‘Evolving Systems’ and the international Elsevier journal ‘Information Fusion’. In 2006, he received the best paper award at the International Symposium on Evolving Fuzzy Systems, and in 2013 the best paper award at the IFAC Conference on Manufacturing Modeling, Management and Control.