Elsevier

Information Fusion

Volume 41, May 2018, Pages 105-118

Multistage fusion approaches based on a generative model and multivariate exponentially weighted moving average for diagnosis of cardiovascular autonomic nerve dysfunction

https://doi.org/10.1016/j.inffus.2017.08.004

Highlights

  • Multistage fusion approach for diagnosis of autonomic nerve dysfunction.

  • Independent Component Analysis and statistical process control are used for fusion.

  • Body sensor data from ECG and blood chemistry are used for fusion approach.

  • Decision fusion has been proposed for diagnosis by using a multi-classifier system.

  • Proposed fusion approach achieves high performance for diagnosis of nerve dysfunction.

Abstract

As with many medical diagnoses, a clinical decision support system (CDSS) is essential for diagnosing cardiovascular autonomic neuropathy (CAN). However, diagnosing CAN with the traditional ‘Ewing battery test’ is very difficult because the collected clinical data are inherently imbalanced and incomplete. This leads health professionals to investigate other related diagnostic reports of patients, including electrocardiogram (ECG) data from ECG sensors and blood chemistry, podiatry and endocrinology features. However, these additional components increase the dimensionality, heterogeneity and modality of the clinical data, which may limit the use of traditional data mining approaches for accurate diagnosis of CAN in a CDSS. To address this problem, we propose a novel multistage fusion approach based on a generative model and a statistical process control (SPC) technique to diagnose CAN more accurately. The proposed approach develops two different generative models, using a shared and a separated Independent Component Analysis (ICA), to overcome the incompleteness and modality of the data. Because the features are heterogeneous and non-normal, statistical correlations and multivariate control limits for the CAN diagnosis parameters are determined by fusing a series of multivariate exponentially weighted moving average (MEWMA) control processes. The fused features from both the component analyses and SPC are applied in an ensemble classification system. The proposed multistage fusion approach is verified experimentally on a large dataset collected by the diabetes screening research initiative (DiScRi) project at Charles Sturt University, NSW, Australia. Our comprehensive experimental results show that the proposed fusion approach performs better than the standard classifier for both the ‘Ewing’ feature set and the ‘Ewing and additional’ feature set, with a significant improvement in accuracy.

Introduction

Body Sensor Networks (BSNs) [17], [18] are a specific class of wireless sensor networks that are emerging as a noteworthy unobtrusive technology for collecting and processing a patient’s vital signs in order to manage chronic diseases and detect health anomalies. BSN sensors are typically worn on the human body as tiny patches, hidden in users’ clothes or even implanted in the body [18]. These sensors can collect real-time data on various physiological parameters, e.g. heart rate (HR), respiratory rate (RR), blood pressure (BP), pulse, body temperature, blood oxygen saturation (SpO2) and electrocardiogram (ECG) [17], [18], [23]. Monitoring patient health conditions helps in preventing terminal illness, tracking the progression of chronic disease and enhancing emergency services, especially for elderly and physically impaired people [1]. The data from sensor nodes are collected on local personal devices such as mobile phones and PCs. These data can then be used for real-time monitoring and stored remotely over the long term for diagnostic analysis. Chronic diseases such as cardiovascular autonomic neuropathy (CAN) are often very hard to detect if they are not monitored carefully at early stages. In these cases, BSNs are useful for collecting patients’ physiological data, which can later be used for analysis and detection [18], [23].

Cardiovascular autonomic neuropathy (CAN) [4], [15] is directly associated with cardiac arrhythmia and many cardiovascular diseases, which increases the rate of unexpected death [8], [11], particularly among diabetes patients. Appropriate monitoring and early detection of CAN are therefore essential in clinical diagnostic and management systems. Clinical decision support systems (CDSS) are often very helpful for managing chronic diseases such as diabetes and CAN, supporting accurate monitoring and early diagnosis. However, data analysis and intelligent processing of medical data in a CDSS are quite challenging. Medical data are heterogeneous, multimodal, imbalanced and high dimensional because of the complexity of the data collection and production processes at their sources in medical systems. The complexity arises for many reasons, including cost sensitivity, the high risk of side effects from diagnostic tests and the operation of dangerous equipment, so diagnostic tests are not always completed for patients unless strictly required. This can lead to incompleteness in the collected data [36], which makes diagnosis difficult for health professionals and complicates data analysis tasks in a CDSS. This in turn may result in a complete failure to diagnose the disease or in an inaccurate diagnosis.

As with other medical data, successful diagnosis of CAN using the conventional ‘Ewing battery tests’ [4], [15] is complex and depends on the patients’ ability to undergo all the tests. Many patients are unable to complete all of the ‘Ewing’ tests; for example, one of the tests requires the patient to move from a ‘lying’ to a ‘standing’ position or vice versa. Some other tests may not be suitable for elderly patients, who are hard to diagnose because of either an insensitive response to the tests or impaired mobility. This may lead to incomplete datasets of clinical CAN diagnosis features and can affect the performance of a CDSS for CAN diagnosis.

Researchers are investigating complementary features, including ECG and blood chemistry, that may help to overcome these difficult test conditions. However, additional features pose data analysis challenges for a CDSS, including high dimensionality, heterogeneity, some degree of multimodality and incompleteness in the data. In this paper, we address these data analysis challenges in a CDSS for CAN diagnosis in order to achieve high-performance detection of CAN. High dimensionality was addressed in CAN diagnosis on a small dataset (only 291 patients) in 2010 by the co-authors [22], using a wrapper-filter approach that achieved up to 82% accuracy with ‘Ewing’ features. The relevance of blood chemistry to CAN has recently been studied in a number of related works [14], [38], particularly glucose level among diabetes patients. The relationship between lipid profile and CAN has also been studied recently [33], [34]. These studies [14], [33], [34], [38] show that blood chemistry has a strong relationship with CAN. However, they are limited to the study of independent parameters and did not consider their combined effect together with all CAN features. In addition, the earlier work [22] uses repeated evaluation of an Artificial Neural Network (ANN) [21] in a backward elimination process, which greatly increases the computational complexity [26] because of the training time of the ANN [21], and is not suitable for large-scale datasets. Therefore, an extensive study of the combined challenge of incompleteness, heterogeneity and modality, including additional blood chemistry features and a large number of patients, is of utmost importance for developing a robust and high-performance CDSS for CAN diagnosis. This is the main focus of this paper.

In this paper, we propose a multistage fusion approach that fuses an independent component analysis (ICA) based generative model with a multivariate exponentially weighted moving average (MEWMA) based SPC technique. Two different generative models have been developed using a shared ICA and a separated ICA of the CAN features. The extracted components are then passed through multilevel fused MEWMA processes. The upper control limits identified by SPC, together with the patients’ multivariate characteristic features, are fused with the components of the original CAN features. These features are applied to an ensemble classifier for CAN classification. The novelties and contributions of the proposed approach are as follows:

  1. A novel multistage fusion approach has been developed using a generative model and a multivariate exponentially weighted moving average for CAN diagnosis.

  2. An unsupervised statistical model has been developed that fuses a series of MEWMA processes to determine the multivariate correlations and the corresponding statistical upper control limit, in order to distinguish in-control from out-of-control patients.

  3. A feature-based fusion and ensemble decision model has been developed using independent component analysis (ICA) and MEWMA to minimize the effect of the non-normality, heterogeneity, high dimensionality and multimodality of CAN data.
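The ICA generative model named in these contributions can be written compactly. This is a generic sketch of linear ICA, not the paper's exact formulation, which this excerpt does not show. The symbols $\mathbf{x}$, $\mathbf{A}$, $\mathbf{s}$, $\mathbf{W}$ and the feature-group index $m$ are notation assumed here for illustration, and the "shared" and "separated" forms are one plausible reading of those terms.

```latex
% Linear ICA: observed features are an unknown mixture of independent sources.
% W approximates the inverse of A when A is square and invertible.
\mathbf{x} = \mathbf{A}\,\mathbf{s}, \qquad
\hat{\mathbf{s}} = \mathbf{W}\,\mathbf{x}, \quad \mathbf{W} \approx \mathbf{A}^{-1}

% Shared ICA (assumed reading): one unmixing matrix for the stacked feature groups
\mathbf{x} = \begin{bmatrix} \mathbf{x}^{(1)} \\ \vdots \\ \mathbf{x}^{(M)} \end{bmatrix},
\qquad \hat{\mathbf{s}} = \mathbf{W}\,\mathbf{x}

% Separated ICA (assumed reading): one unmixing matrix per feature group m
\hat{\mathbf{s}}^{(m)} = \mathbf{W}^{(m)}\,\mathbf{x}^{(m)}, \qquad m = 1, \dots, M
```

Under this reading, the shared model lets sources span feature groups such as Ewing, ECG and blood chemistry, while the separated model keeps each group's sources distinct.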

ICA-based generative models identify the sources of the input features through blind source separation, which also deals with the high dimensionality. MEWMA is used to identify multivariate correlations among the heterogeneous data; it can also identify a joint upper control limit for the CAN parameters through the fusion of a series of MEWMA processes with varying average run length (ARL). The multivariate characteristics identified by the MEWMA process are fused with the components of CAN and used in an ensemble classification. A large CAN dataset from the diabetes screening research initiative (DiScRi) project at Charles Sturt University, NSW, Australia has been used to evaluate the performance of the proposed multistage fusion framework.
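A single MEWMA chart can be sketched in a few lines of Python. The recursion Z_i = λX_i + (1 − λ)Z_{i−1} and the statistic T²_i = Z_iᵀ Σ_{Z_i}⁻¹ Z_i, with Σ_{Z_i} = (λ / (2 − λ)) [1 − (1 − λ)^{2i}] Σ, are the standard MEWMA chart definitions. Everything else is an assumption for illustration: the function names, the default smoothing weight and the control limit passed by the caller are hypothetical, and the sketch runs one chart rather than the fused series of charts with varying ARL that the paper describes.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    # Augmented matrix [a | b]; copy rows so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def mewma_t2(samples, mean, cov, lam=0.2):
    """Return the MEWMA T^2 statistic for each observation, in order.

    samples: list of feature vectors (hypothetical patient measurements)
    mean, cov: in-control mean vector and covariance matrix (assumed known)
    lam: smoothing weight in (0, 1]; 0.2 is an illustrative default
    """
    p = len(mean)
    z = [0.0] * p  # Z_0 = 0 for data centred on the in-control mean
    stats = []
    for i, x in enumerate(samples, start=1):
        # Z_i = lam * (X_i - mean) + (1 - lam) * Z_{i-1}
        z = [lam * (x[j] - mean[j]) + (1 - lam) * z[j] for j in range(p)]
        # Exact (time-varying) covariance of Z_i
        scale = lam / (2 - lam) * (1 - (1 - lam) ** (2 * i))
        cov_z = [[scale * cov[r][c] for c in range(p)] for r in range(p)]
        w = solve(cov_z, z)  # w = cov_z^{-1} z
        stats.append(sum(z[j] * w[j] for j in range(p)))
    return stats


def out_of_control(stats, ucl):
    """Indices of observations whose T^2 exceeds the upper control limit."""
    return [i for i, t2 in enumerate(stats) if t2 > ucl]
```

In use, `mewma_t2` would be called with in-control estimates of the mean and covariance, and `out_of_control` with an upper control limit chosen for the desired in-control ARL. Running several charts with different limits and combining their signals is one way to approximate the fused series mentioned above.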

The rest of the paper is organized as follows. Section 2 discusses related work on CAN and existing techniques for CAN identification. Section 3 explains the proposed multistage fusion approaches. The data collection and pre-processing methods are described in Section 3.1. Sections 3.3 and 3.4 describe the two proposed multistage fusion models, including the fusion of MEWMA processes and the fusion of feature and decision models. The experimental results are presented in Section 4. Conclusions from this study and references are presented in the last two sections.

Section snippets

Related work

CAN is a complication of diabetes mellitus that involves severe damage to the autonomic nerve fibres and is directly associated with increased levels of systemic inflammation and a high risk of cardiovascular disease [8], [11]. The conventional method of CAN diagnosis requires five simple autonomic function tests known as the ‘Ewing battery tests’ [4], [15]. The tests include measurements of variations in heart rate (HR) and blood pressure (BP) in different situations while patients perform

Proposed methodology: multistage fusion approach based on a generative model and multivariate process control technique

SPC [20], [35] techniques are used to determine the quality of a process from the distribution of its quality characteristics in many multivariate processes [7], [39]. They can also be used to monitor biomedical processes. SPC approaches such as multivariate exponentially weighted moving average (MEWMA) charts can detect abrupt changes or variations in observed medical data and, at the same time, evaluate unanticipated aberrations in the data. Tennant et al. [39] also have shown in

Experiment results and discussion

After feature extraction from the collected data, a normality test is performed for all variables. Most of the features are found to be non-normal; examples of the normality test for glucose level and LSHR are presented in Figs. 8 and 9. Latent variable components have been computed using shared-ICA and separated-ICA, as described in earlier sections. For separated-ICAs of Ewing plus ECG features, a total of 10 components have been taken based on their eigenvalues of

Conclusion

Early and accurate monitoring and diagnosis of cardiovascular autonomic neuropathy (CAN) can significantly reduce the rate of death related to cardiovascular disease. The conventional ‘Ewing battery tests’ are often difficult for elderly patients to undertake accurately, or cannot be undertaken by them at all. In this situation, ‘Ewing test’ results can lead to an incomplete CAN dataset. Blood biochemistry and different morphological features from ECG have been considered as complementary features

Acknowledgement

The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at King Saud University for funding this research through Research Group Project No. RGP-1437-35.

References (41)

  • C.H. Antink et al. Beat-to-beat heart rate estimation fusing multimodal video and sensor data. Biomed. Opt. Express (2015)
  • American Diabetes Association and American Academy of Neurology, 1988. Report and recommendations of the San Antonio...
  • S. Barak et al. Fusion of multiple diverse predictors in stock market. Inf. Fusion (2017)
  • A. Bradley. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. (1997)
  • X. Chanudet et al. Coronary heart disease and cardiovascular autonomic neuropathy in the elderly diabetic. Diabetes Metab. (2007)
  • C. Chen et al. An ensemble of patch-based subspaces for makeup-robust face recognition. Inf. Fusion (2016)
  • C.-C. Chiu et al. SVM classification for diabetics with various degrees of autonomic neuropathy based on cross-correlation features. J. Med. Biol. Eng. (2013)
  • J.A. Cohen et al. Diabetic autonomic neuropathy is associated with an increased incidence of strokes. Auton. Neurosci. (2003)
  • N.M. Correa et al. Canonical correlation analysis for data fusion and group inferences: examining applications of medical imaging data. IEEE Signal Process. Mag. (2010)
  • M. Cortex, J.R. Singleton, A.G. Smith, Handbook of Clinical Neurology, vol. 126,...