In recent years, automatic emotion recognition has become a research hotspot in Artificial Intelligence and related fields, as the role played by affect in human life and everyday functioning is well recognized and studied. It not only greatly benefits natural human-computer interaction, but also shows great potential in a wide variety of applications, such as personalized learning, health monitoring, surveillance, and anomaly detection. However, emotion recognition from facial and bodily expressions in the wild remains a challenging task because of pose variations, scaling and illumination changes, occlusion, and background clutter. Moreover, identifying the discriminative facial and bodily features that characterize each expression is also difficult because of the subtlety and variability of such expressions.

This special issue aims to stimulate research and discussion on automatic human behaviour understanding. We present a total of 10 articles that highlight the latest developments in facial and bodily expression classification, as well as other related methods that are useful in image analysis. The first five articles are related to facial expression, the subsequent three to posture and gait recognition, and the last two to biomedical and healthcare applications. A description of each article is presented as follows.

Sadeghi and Raie address feature extraction for facial expression recognition using Gabor filters. In the proposed method, a facial image is firstly convolved with Gabor filters, and the resulting convolution matrices are coded based on the maximum and minimum responses. The feature vector is obtained by calculating the histogram of these codes. Evaluated with three benchmark data sets, the proposed method outperforms existing image texture descriptors in facial expression recognition with both controlled and uncontrolled images.
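A minimal NumPy sketch of this style of max/min-response coding is given below. The filter parameters, the coding scheme (max-orientation index combined with min-orientation index), and the histogram length are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, lambd, gamma=0.5, psi=0.0):
    """Real part of a Gabor filter with orientation theta."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    gauss = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return gauss * np.cos(2 * np.pi * xr / lambd + psi)

def gabor_code_histogram(img, n_orient=8, ksize=9):
    """Code each pixel by the orientations giving its maximum and minimum
    Gabor responses, then histogram the codes into a feature vector."""
    kernels = [gabor_kernel(ksize, 2.0, k * np.pi / n_orient, 4.0)
               for k in range(n_orient)]
    pad = ksize // 2
    padded = np.pad(img.astype(float), pad, mode="reflect")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (ksize, ksize))
    # filter responses at every pixel: shape (n_orient, H, W)
    resp = np.stack([np.einsum("ijkl,kl->ij", windows, k) for k in kernels])
    # joint code of the strongest and weakest orientation per pixel
    code = np.argmax(resp, axis=0) * n_orient + np.argmin(resp, axis=0)
    hist = np.bincount(code.ravel(), minlength=n_orient * n_orient)
    return hist / hist.sum()
```

The returned normalized histogram (here 64 bins for 8 orientations) would serve as the per-image feature vector fed to a classifier.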

The research of Zhang, Xia, and Liu focuses on a variable-length 3D convolution network that introduces local receptive fields in the time domain for facial expression recognition. A Siamese 3D convolution network that utilizes information from another subject is proposed to provide attention weights for the extracted features. In addition, a method to extract fixed-length landmark features from expression sequences as an auxiliary input to the convolution network is developed. The proposed network outperforms other methods in experiments on both subject-independent tasks and cross-database evaluation.

A fusion method that combines visible and infrared images for face authentication is proposed by Seal and Panigrahy. The proposed method relies on the wavelet transform and fractal dimension estimated with a differential box counting method. A new similarity measure is also proposed to check the closeness between a fused face image and others. Using three databases, the results show that the proposed method along with the similarity measure outperforms other methods in terms of accuracy, precision, and recall in face authentication.
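Differential box counting, one component of this pipeline, can be sketched in a few lines of NumPy. The grid sizes and the +1 gray-level shift below are illustrative assumptions; the slope of log N_r against log(1/r) estimates the fractal dimension.

```python
import numpy as np

def fractal_dimension_dbc(img, sizes=(2, 4, 8, 16)):
    """Differential box-counting estimate of the fractal dimension of a
    square gray-level image (values in [0, 255])."""
    M = img.shape[0]
    G = 256  # gray-level range
    log_n, log_inv_r = [], []
    for s in sizes:
        h = G * s / M  # box height at this grid scale
        n_r = 0
        for i in range(0, M, s):
            for j in range(0, M, s):
                block = img[i:i + s, j:j + s]
                # number of gray-level boxes spanned over this block
                n_r += int(np.ceil((block.max() + 1) / h)
                           - np.ceil((block.min() + 1) / h)) + 1
        log_n.append(np.log(n_r))
        log_inv_r.append(np.log(M / s))
    # slope of log N_r vs. log(1/r) is the fractal dimension estimate
    return np.polyfit(log_inv_r, log_n, 1)[0]
```

As a sanity check, a perfectly flat image yields a dimension of 2 (a smooth surface), while noisier textures push the estimate toward 3.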

Li and You introduce a new method based on two-dimensional locality adaptive discriminant analysis for face image representation and recognition. The method focuses on closely-related data points and adaptively uses the local relationships of points in the learned subspace, which preserves the spatial structure of the image and extracts more accurate low-dimensional features. The method does not depend on any assumption about the data distribution, and is suitable for real-world applications. The empirical evaluations using artificial and real-world databases demonstrate the usefulness of the proposed method.

To undertake the challenges pertaining to age estimation from face images, Sawant, Addepalli, and Bhurchandi propose an aging feature descriptor, namely the local direction and moment pattern, to capture both directional and textural variations due to aging. Given a face image, the orientation information is encoded in eight directions, while the texture is embedded into the magnitudes of higher-order moments. Both orientation and texture information are combined into a robust feature descriptor. Warped Gaussian process regression is applied to the proposed feature vector for age estimation. The experimental analysis demonstrates the effectiveness of the proposed method on two large databases.

To improve the teaching and learning process, the work of Kuang, Guo, Peng, and Pei focuses on posture recognition of students for evaluating their learning states. The method fuses the improved scale invariant local ternary patterns and local directional patterns for posture recognition. Based on features extracted from learners’ posture images, a support vector machine is used for classification. The experimental results show that the proposed method can effectively recognize learners’ postures of sitting, raising hands, and lowering heads in classroom scenes.

A novel computer vision-based posture monitoring system is proposed by Manocha and Singh. The aim is to predict generalized anxiety disorder-oriented physical abnormalities of individuals in working environments. A deep convolutional neural network is used for spatio-temporal feature extraction, while a gated recurrent unit model is applied to the extracted temporal dynamics for adversity scale determination. The usefulness of the proposed method is demonstrated through extensive experiments on health-related conditions, including staggering, headache, stomachache, backache, neckache, and nausea or vomiting. The alert-based decisions with the deliverance of physical states facilitate the use of the proposed system in the healthcare or assistive-care domains.

Bhatti develops a central pattern generator that can produce coupled leg oscillations derived through user-controlled parameters for quadruped animation. The various gaits produced by the pattern generator include walk, trot, gallop, canter, pace, and rack. The dynamic motion is calculated independently for each body part, and the user can manipulate the simulation parameters. The system automatically adjusts the motion gaits and transitions between each gait at runtime. This procedural model for animating quadrupeds can generate various locomotion gaits with varying speed and footfall patterns dynamically. Its usefulness is evaluated through a user perception test pertaining to the believability and accuracy of the generated animation, with statistical significance.
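The general idea of a phase-oscillator CPG, legs locking onto gait-specific phase offsets, can be sketched as follows. The gait phase tables, gains, and the leader-follower coupling are illustrative assumptions for two example gaits, not Bhatti's actual oscillator model.

```python
import numpy as np

# assumed footfall phase offsets (fraction of a stride) for legs LF, RF, LH, RH
GAITS = {"walk": [0.0, 0.5, 0.75, 0.25],
         "trot": [0.0, 0.5, 0.5, 0.0]}

def run_cpg(gait, freq=1.5, gain=6.0, dt=0.002, steps=2500):
    """Integrate four coupled phase oscillators until each leg locks to its
    desired offset from a leader leg (leg 0)."""
    target = 2 * np.pi * np.asarray(GAITS[gait])
    phase = 0.01 * np.arange(4.0)  # small jitter avoids an unstable start
    for _ in range(steps):
        # every leg advances at the stride frequency...
        dphase = 2 * np.pi * freq * np.ones(4)
        # ...and followers are pulled toward their offset from the leader
        dphase[1:] += gain * np.sin(phase[0] + target[1:] - phase[1:])
        phase += dt * dphase
    # converged relative footfall phases, as fractions of a stride
    return np.mod((phase - phase[0]) / (2 * np.pi), 1.0)
```

Switching gaits at runtime then amounts to swapping the target phase table while the oscillators keep running, which yields smooth transitions.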

To identify the imagery hand movement and no-movement states from electroencephalograph signals in brain-computer interfaces, Hekmatmanesh, Wu, Nasrabadi, Li, and Handroos develop a method that combines detrended fluctuation analysis and the discrete wavelet packet transform for feature extraction. To produce an appropriate mother wavelet and distinctive features, the best channels and frequency bands are determined by the event-related desynchronization diagnosis method. A soft margin support vector machine with the generalized radial basis function is employed to classify the features. Based on a series of experimental evaluations, the results along with statistical tests indicate the usefulness of the proposed method.
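Detrended fluctuation analysis, one component of this feature-extraction pipeline, can be sketched in NumPy as follows. The window sizes and the linear detrending order are illustrative assumptions; the slope of log F(n) against log n is the scaling exponent used as a feature.

```python
import numpy as np

def dfa_exponent(x, window_sizes=(8, 16, 32, 64, 128)):
    """Detrended fluctuation analysis scaling exponent (alpha) of a 1-D signal."""
    y = np.cumsum(x - np.mean(x))  # integrated profile of the signal
    log_f, log_n = [], []
    for n in window_sizes:
        n_win = len(y) // n
        segs = y[:n_win * n].reshape(n_win, n)
        t = np.arange(n)
        resid = []
        for seg in segs:
            # remove the least-squares linear trend within each window
            coef = np.polyfit(t, seg, 1)
            resid.append(seg - np.polyval(coef, t))
        f_n = np.sqrt(np.mean(np.concatenate(resid) ** 2))
        log_f.append(np.log(f_n))
        log_n.append(np.log(n))
    # slope of log F(n) vs. log n is the scaling exponent alpha
    return np.polyfit(log_n, log_f, 1)[0]
```

For reference, uncorrelated white noise gives alpha near 0.5 while a random walk gives alpha near 1.5, so the exponent discriminates signals by their long-range correlation structure.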

A robust and secure color image watermarking method is introduced by Singh for tele-health applications. Based on the lifting wavelet transform and discrete cosine transform, the proposed method embeds watermarks in the cover image for the purpose of confidentiality, authentication, and non-repudiation. Experimental demonstrations indicate that the method provides sufficient robustness and security against various attacks. The method also outperforms previously reported techniques in the literature.

The guest editors would like to thank the authors for contributing their articles, the reviewers for improving the quality of the articles through constructive comments and suggestions, and the publisher for the opportunity to produce this special issue. Thank you.

Guest Editors

Dr. Li Zhang (Lead Guest Editor), Northumbria University, Newcastle, UK

Dr. Chee Peng Lim, Deakin University, Geelong, Australia

Dr. Jungong Han, Lancaster University, Lancaster, UK

1 List of articles

  1. Human Vision Inspired Feature Extraction for Facial Expression Recognition, by Hamid Sadeghi and Abolghasem-A. Raie

  2. 3D Convolution Network and Siamese-Attention Mechanism for Expression Recognition, by Yi-Feng Zhang, Tian Xia, and Yuan Liu

  3. Human Authentication based on Fusion of Thermal and Visible Face Images, by Ayan Seal and Chinmaya Panigrahy

  4. Two-dimensional Locality Adaptive Discriminant Analysis, by Qin Li and Jane You

  5. Age estimation using local direction and moment pattern (LDMP) features, by Manisha Sawant, Shalini Addepalli, and Kishor Bhurchandi

  6. Learner posture recognition via a fusing model based on improved SILTP and LDP, by Yuqian Kuang, Min Guo, Yali Peng, and Zhao Pei

  7. Computer vision based working environment monitoring to analyze Generalized Anxiety Disorder (GAD), by Ankush Manocha and Ramandeep Singh

  8. Oscillator driven Central Pattern Generator (CPG) System for Procedural Animation of Quadruped Locomotion, by Zeeshan Bhatti

  9. Combination of Discrete Wavelet Packet Transform with Detrended Fluctuation Analysis Using Customized Mother Wavelet with the Aim of an Imagery-Motor Control Interface for an Exoskeleton, by Amin Hekmatmanesh, Huapeng Wu, Ali Motie Nasrabadi, Ming Li, and Heikki Handroos

  10. Robust and distortion control dual watermarking in LWT domain using DCT and error correction code for color medical image, by A. K. Singh