
Applied Ergonomics

Volume 80, October 2019, Pages 75-88

RGB-D ergonomic assessment system of adopted working postures

https://doi.org/10.1016/j.apergo.2019.05.004

Highlights

  • This paper proposes a vision-based semi-automated RULA ergonomic posture assessment system using deep learning techniques.

  • The proposed method analyzes the posture holistically and estimates body joint angles directly from a single depth image.

  • We presented a novel inverse kinematic modeling process to obtain ground truth joint angles from motion capture data.

  • This allows training learning algorithms to estimate joint angles from input images acquired using different modalities.

  • The developed system supports Kinect and ASUS Xtion depth cameras and does not rely on skeleton data from the Kinect SDK.

  • The holistic posture analysis approach ensures robustness to different forms of occlusions.

Abstract

Ensuring a healthier working environment is of utmost importance for companies and global health organizations. In manufacturing plants, the ergonomic assessment of adopted working postures is indispensable for avoiding risk factors of work-related musculoskeletal disorders. This process has received considerable research interest and requires extracting plausible postural information as a preliminary step. This paper presents a semi-automated, end-to-end ergonomic assessment system of adopted working postures. The proposed system analyzes the human posture holistically, does not rely on any attached markers, uses low-cost depth technologies and leverages state-of-the-art deep learning techniques. In particular, we train a deep convolutional neural network to analyze the articulated posture and predict body joint angles from a single depth image. The proposed method relies on learning from synthetic training images, which allows simulating several physical tasks, different body shapes and rendering parameters, and thus obtaining a highly generalizable model. The corresponding ground truth joint angles have been generated using a novel inverse kinematics modeling stage. We validated the proposed system in real environments and achieved a joint angle mean absolute error (MAE) of 3.19 ± 1.57 degrees and a rapid upper limb assessment (RULA) grand score prediction accuracy of 89% with a Kappa index of 0.71, indicating substantial agreement with reference scores. This work facilitates evaluating several ergonomic assessment metrics, as it provides direct access to the necessary postural information and overcomes the need for computationally expensive post-processing operations.

Introduction

Musculoskeletal disorders (MSDs) are a common concern across labor-intensive industries. A recent statistical study by the Bureau of Labor Statistics (BLS) showed that MSD cases account for 31% of all work-related injury and illness cases (Bureau of Labor Statistics, 2016). These injuries most commonly involve the muscular components of the neck, back, arms and legs (Luttmann et al.). In addition to the personal impact these injuries can have on workers, compensation costs and days away from work can greatly affect the productivity of the organization itself (Bureau of Labor Statistics, 2016; Bernard and Putz-Anderson). The manufacturing industries endeavor to constantly provide a safe working environment via the early identification of, and intervention in, problematic procedures. Currently, proactive task planning using digital human models and virtual facilities helps minimize MSD risk factors; however, given the complex interactions between force and frequency during automotive assembly tasks, advancements in injury prevention technology must continue to be implemented to ensure sustained harm minimization. Adopting ergonomically invalid or awkward working postures while performing these manual tasks has the potential to cause long-term MSDs (Krüger and Nguyen, 2015; Bernard and Putz-Anderson; Luttmann et al.). Therefore, ergonomics specialists have been investigating methods and tools to evaluate the adopted working posture and identify potential MSD risks.

The Rapid Upper Limb Assessment (RULA) (McAtamney and Corlett, 1993) is one of the most popular ergonomic assessment tools in industry (Plantard et al.; Liebregts et al., 2016). Despite its limitations and low resolution in pinpointing problematic working procedures, RULA is simple, easy to compute and does not require prior knowledge of biomechanics or ergonomics. The RULA score quantifies the exposure of the adopted posture to MSD risk factors, with a focus on the neck, trunk and upper limbs. It ranges from one to seven, representing the level of MSD risk and suggesting an action level that describes whether intervention is required (McAtamney and Corlett, 1993), with one being an acceptable posture and seven requiring immediate intervention. Automating the RULA score evaluation process has gained much attention from ergonomics researchers (Manghisi et al., 2017; Plantard et al., 2017) to overcome the intra- and inter-rater variability problem (Manghisi et al., 2017). However, developing an automated RULA-based ergonomic feedback system requires estimating the joint angles of the upper body parts.
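The grand score to action level mapping described above can be sketched as follows; the bands follow the levels defined by McAtamney and Corlett (1993), while the function name and wording of the recommendations are illustrative:

```python
# Sketch: map a RULA grand score (1-7) to its action level, following the
# bands defined in McAtamney and Corlett (1993). Function name and the
# phrasing of the recommendations are our own, not from the original paper.
def rula_action_level(grand_score: int) -> tuple[int, str]:
    """Return (action level, recommended action) for a RULA grand score."""
    if not 1 <= grand_score <= 7:
        raise ValueError("RULA grand score must be in [1, 7]")
    if grand_score <= 2:
        return 1, "Posture acceptable if not maintained or repeated for long periods"
    if grand_score <= 4:
        return 2, "Further investigation needed; changes may be required"
    if grand_score <= 6:
        return 3, "Investigation and changes required soon"
    return 4, "Investigation and changes required immediately"

print(rula_action_level(7))  # action level 4: immediate intervention
```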

Recent studies have proposed automating ergonomic assessment methods relying on computer vision and machine learning techniques (Diego-Mas and Alcaide-Marzal, 2014). In particular, the Kinect camera, alongside its software development kit (SDK), has been extensively used to analyze the adopted posture and evaluate the RULA score (Plantard et al., 2015; Liebregts et al., 2016; Plantard et al., 2017; Manghisi et al., 2017; Abobakr et al., 2017a). The Kinect SDK tracks the human body and estimates the 3D Cartesian coordinates of 20 joint positions. It uses a random decision forest classifier to segment the body into parts, followed by a localization algorithm to infer joint positions (Abobakr et al., 2017a). However, several difficulties result from using the Kinect SDK. First, it relies on local body part detectors, and hence may produce unrealistic skeletons under occlusions due to cluttered environments (Abobakr et al., 2018; Plantard et al., 2017). The Kinect SDK also has difficulty tracking self-occluded postures involving arm crossing, trunk bending, trunk lateral flexion and trunk rotation (Manghisi et al., 2017). This requires applying preprocessing operations to correct the resulting kinematic structure, as suggested in (Plantard et al., 2015; Plantard et al.). Second, an additional processing stage is required to convert the 3D Cartesian coordinates of body joint positions into joint angles. For instance, Plantard et al. (2017) corrected Kinect data using the method presented in (Plantard et al., 2017) and estimated missing anatomical landmarks using the approach proposed in (Bonnechere et al., 2014), to make the reconstructed skeleton compatible with the ISB recommendations (Wu et al., 2005) and compute the joint angles. Clark et al. (2012) used the inverse tangent method to convert 3D joint positions into joint angles.
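The position-to-angle conversion mentioned above can be sketched with basic vector geometry: the angle at a joint is recovered from the 3D positions of the joint and its two neighboring landmarks. This is a minimal illustration of the idea, not the pipeline used in the cited works; the joint names are hypothetical:

```python
import numpy as np

# Sketch: recover a joint angle (e.g. at the elbow) from three 3D joint
# positions, using atan2 of the cross-product magnitude and the dot product,
# which is numerically more stable than arccos near 0 and 180 degrees.
def joint_angle_deg(parent: np.ndarray, joint: np.ndarray, child: np.ndarray) -> float:
    """Angle at `joint` between segments joint->parent and joint->child, in degrees."""
    u = parent - joint
    v = child - joint
    return float(np.degrees(np.arctan2(np.linalg.norm(np.cross(u, v)), np.dot(u, v))))

# Toy example: shoulder, elbow and wrist forming a right angle at the elbow.
shoulder = np.array([0.0, 1.0, 0.0])
elbow = np.array([0.0, 0.0, 0.0])
wrist = np.array([1.0, 0.0, 0.0])
print(joint_angle_deg(shoulder, elbow, wrist))  # → 90.0
```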
Although these approaches have been successful in obtaining high-quality joint angles, they may exhibit large errors due to their reliance on the Kinect skeleton data, especially for occluded postures (Plantard et al., 2017; Manghisi et al., 2017). Improving the quality of the Kinect skeleton data for ergonomic studies remains an open area of research (Plantard et al., 2017). This work focuses on addressing the limitations of the Kinect V1 sensor, as it uses structured light technology, which has been incorporated in a wide range of depth sensors (Abobakr et al., 2018). This allows better generalization to different depth cameras, for instance the ASUS Xtion. The Kinect V2, on the other hand, uses time-of-flight imaging technology, which helps produce more robust skeleton data; however, it consumes more power and requires cooling (Fankhauser et al., 2015).

In this paper, we propose a skeleton-free holistic posture analysis system that accurately predicts body joint angles from a single depth image without utilizing the temporal information between subsequent images, as shown in Fig. 1. Although incorporating a temporal dynamics modeling stage can help ensure consistency between subsequent frame predictions and achieve higher frame rates, tracking algorithms require regular initialization to avoid drift anomalies (Shotton et al., 2013). The fundamental building block of the proposed method is a cascade of two deep convolutional neural network (ConvNet) models. The depth sensor produces two synchronized video feeds of RGB and depth images. First, we segment the body from the background by passing the RGB image to an object instance segmentation deep ConvNet model. This network computes segmentation masks for a predefined set of objects in a given scene, and we apply the obtained person segmentation mask to the corresponding depth image. Second, the depth values of the posture are encoded using a proposed depth encoding algorithm. Third, the encoded image is passed through the second ConvNet model to predict body joint angles. Finally, the estimated joint angles are used to compute the RULA score. Thus, the overall ergonomic evaluation procedure is simplified to directly mapping the predicted joint angles to a RULA score. Using this score, the MSD risk level is identified and a recommended action is suggested to decrease the risk of work-related injuries, as defined in (McAtamney and Corlett, 1993). This is made possible by training our models on a large amount of highly varied synthetic training images with ground truth joint angles that have been biomechanically modeled using a novel inverse kinematics step.
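The first two pre-processing steps above (masking the depth image with the person segmentation mask, then encoding the depth values) can be sketched as follows. The simple per-body min-max normalization used here is an assumption for illustration; the paper's actual depth encoding algorithm may differ:

```python
import numpy as np

# Sketch of the pre-processing steps: zero out background depth pixels using
# the person mask, then rescale the remaining body depth values to 8 bits.
# The min-max normalization is an illustrative assumption, not the paper's
# exact encoding algorithm.
def encode_masked_depth(depth_mm: np.ndarray, person_mask: np.ndarray) -> np.ndarray:
    """Return an 8-bit encoded depth image with the background zeroed out."""
    masked = np.where(person_mask, depth_mm, 0).astype(np.float32)
    body = masked[person_mask & (depth_mm > 0)]  # valid body pixels only
    if body.size == 0:
        return np.zeros_like(masked, dtype=np.uint8)
    lo, hi = body.min(), body.max()
    scale = 255.0 / max(float(hi - lo), 1.0)
    encoded = np.where(person_mask, (masked - lo) * scale, 0.0)
    return np.clip(encoded, 0, 255).astype(np.uint8)

# Toy example: a 2x2 depth image (millimeters) where only the left column
# belongs to the person.
depth = np.array([[1200, 3000], [1400, 3000]], dtype=np.float32)
mask = np.array([[True, False], [True, False]])
print(encode_masked_depth(depth, mask))
```

The background is suppressed before encoding so that the posture network sees only depth variation across the body, independent of the person's absolute distance from the camera.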

The remainder of this paper is structured as follows. Section 2 describes the proposed method and the used deep ConvNet models. Section 3 presents the experiments and results. Key aspects and limitations of the proposed method are discussed in Section 4. Section 5 highlights the conclusion and future work.

Section snippets

Material and methods

We propose a vision-based ergonomic posture assessment system composed of two cascaded ConvNets: an object instance segmentation network and a holistic posture analysis network. We utilize both the RGB video and depth feeds of a low-cost depth sensor. The input feeds are synchronized, which means that each RGB image has an associated depth image. In particular, we employed the segmentation network to detect and segment the person from an input RGB image and reject other background objects. This…

Results

We have trained a deep ConvNet model on a synthetic training dataset to predict 15 body joint angles from a single depth image. The estimated joint angles are then used to compute the RULA score for the adopted posture. In this section, we evaluate the performance of the proposed method and explore the generalization capabilities on a real test dataset. Table 4 provides a detailed description of the synthetic and real datasets used in training and evaluating the proposed system. We also examine…
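The two evaluation metrics reported for this system (joint-angle MAE, and Cohen's kappa agreement between predicted and reference RULA grand scores) can be computed as below. The sample score arrays are illustrative only, not data from the paper:

```python
import numpy as np

# Sketch of the two evaluation metrics used above. Sample data are made up.
def mean_absolute_error(pred: np.ndarray, true: np.ndarray) -> float:
    """Mean absolute error between predicted and reference values."""
    return float(np.mean(np.abs(pred - true)))

def cohens_kappa(pred: np.ndarray, true: np.ndarray) -> float:
    """Cohen's kappa (Cohen, 1960): chance-corrected agreement between raters."""
    labels = np.union1d(pred, true)
    po = float(np.mean(pred == true))  # observed agreement
    # expected agreement under independent marginal label distributions
    pe = sum(float(np.mean(pred == c)) * float(np.mean(true == c)) for c in labels)
    return (po - pe) / (1.0 - pe)

# Illustrative predicted vs. reference RULA grand scores for 8 frames.
pred_scores = np.array([3, 4, 4, 5, 2, 7, 3, 3])
true_scores = np.array([3, 4, 5, 5, 2, 7, 3, 4])
print(mean_absolute_error(pred_scores, true_scores))  # → 0.25
print(cohens_kappa(pred_scores, true_scores))
```

A kappa above 0.61 is conventionally interpreted as "substantial agreement", which is the band the reported value of 0.71 falls into.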

Discussion

This paper proposed a semi-automated ergonomic assessment system of adopted working postures. The proposed method analyzes the posture holistically and estimates body joint angles directly from a single depth and RGB image pair. Hence, we do not exploit temporal dependencies or skeleton data from the Kinect SDK. The estimated joint angles are used to compute the RULA score, the MSD risk level and subsequently the urgency of intervention required to reduce the risk of injury. The RULA score is…

Conclusions

This paper proposed a semi-automated holistic ergonomic posture assessment system. It is composed of an instance segmentation model that detects and segments the person in the scene and a deep convolutional neural network that we trained to estimate body joint angles directly from a single depth image. The joint angles prediction model is trained on synthetic depth images. This allows simulating a wide range of manual tasks performed by workers of different body shapes and sizes from several…

Acknowledgement

This research was supported by the Institute for Intelligent Systems Research and Innovation (IISRI) at Deakin University, Australia. The project was funded via Ford's university research program (URP 2014-4055R), Ford Motor Co., USA. The data used in this project was obtained from the CMU graphics lab motion capture database, mocap.cs.cmu.edu. The database was created with funding from NSF EIA-0196217, USA.

References (56)

  • M.E. Raabe et al.

    An investigation of jogging biomechanics using the full-body lumbar spine model: model development and validation

    J. Biomech.

    (2016)
  • J.A. Reinbolt et al.

Simulation of human movement: applications using OpenSim

    Procedia IUTAM

    (2011)
  • A. Seth et al.

OpenSim: a musculoskeletal modeling and simulation framework for in silico investigations and exchange

    Procedia IUTAM

    (2011)
  • N. Vignais et al.

    Innovative system for real-time ergonomic feedback in industrial manufacturing

    Appl. Ergon.

    (2013)
  • N. Vignais et al.

    Physical risk factors identification based on body sensor network combined to videotaping

    Appl. Ergon.

    (2017)
  • E. Weston et al.

    A biomechanical and physiological study of office seat and tablet device interaction

    Appl. Ergon.

    (2017)
  • G. Wu et al.

ISB recommendation on definitions of joint coordinate systems of various joints for the reporting of human joint motion - Part II: shoulder, elbow, wrist and hand

    J. Biomech.

    (2005)
  • W. Wu et al.

    Subject-specific musculoskeletal modeling in the evaluation of shoulder muscle and joint function

    J. Biomech.

    (2016)
  • A. Abobakr et al.

    Body joints regression using deep convolutional neural networks

  • A. Abobakr et al.

A Kinect-based workplace postural analysis system using deep residual networks

  • A. Abobakr et al.

RGB-D human posture analysis for ergonomic studies using deep convolutional neural network

  • A. Abobakr et al.

    A skeleton-free fall detection system from depth images using random decision forest

    IEEE Syst. J.

    (2018)
  • B.P. Bernard et al.

    Musculoskeletal Disorders and Workplace Factors; a Critical Review of Epidemiologic Evidence for Work-Related Musculoskeletal Disorders of the Neck, Upper Extremity, and Low Back

    (1997)
  • Nonfatal Occupational Injuries and Illnesses Resulting in Days Away from Work in 2015

    (2016)
  • J. Cohen

    A coefficient of agreement for nominal scales

    Educ. Psychol. Meas.

    (1960)
  • C. Couprie, C. Farabet, L. Najman, Y. LeCun, Indoor Semantic Segmentation Using Depth Information, arXiv preprint...
  • S.L. Delp et al.

    An interactive graphics-based model of the lower extremity to study orthopaedic surgical procedures

    IEEE Trans. Biomed. Eng.

    (1990)
  • S.L. Delp et al.

OpenSim: open-source software to create and analyze dynamic simulations of movement

    IEEE Trans. Biomed. Eng.

    (2007)