RGB-D ergonomic assessment system of adopted working postures
Introduction
Musculoskeletal disorders (MSDs) are a common concern across labor-intensive industries. A recent statistical study by the Bureau of Labor Statistics (BLS) showed that MSD cases account for a substantial share of all work-related injury and illness cases (Bureau of Labor Statistics, 2016). These injuries most commonly affect the muscular components of the neck, back, arms and legs (Luttmann et al.). In addition to the personal impact these injuries have on workers, compensation costs and days away from work can greatly affect the productivity of the organization itself (Bureau of Labor Statistics, 2016; Bernard and Putz-Anderson). The manufacturing industries endeavor to provide a consistently safe working environment via the early identification of, and intervention in, problematic procedures. Currently, proactive task planning using digital human models and virtual facilities helps minimize MSD risk factors; however, given the complex interactions between force and frequency during automotive assembly tasks, advancements in injury prevention technology must continue to be implemented to ensure harm minimization is maintained. Adopting ergonomically invalid or awkward working postures while performing these manual tasks has the potential to cause long-term MSDs (Krüger and Nguyen, 2015; Bernard and Putz-Anderson; Luttmann et al.). Therefore, ergonomics specialists have been investigating methods and tools to evaluate adopted working postures and identify potential MSD risks.
The Rapid Upper Limb Assessment (RULA) (McAtamney and Corlett, 1993) is one of the most popular ergonomic assessment tools in industry (Plantard et al., 2017; Liebregts et al., 2016). Despite its limitations and its low resolution in pinpointing problematic working procedures, RULA is simple, easy to compute, and does not require prior knowledge of biomechanics or ergonomics. The RULA score quantifies the exposure of the adopted posture to MSD risk factors, with particular focus on the neck, trunk and upper limbs. It ranges from one to seven, representing the level of MSD risk and suggesting an action level that describes whether intervention is required (McAtamney and Corlett, 1993), with one being an acceptable posture and seven requiring immediate intervention. Automating the RULA evaluation process has gained much attention from ergonomics researchers (Manghisi et al., 2017; Plantard et al., 2017) as a way to overcome the intra- and inter-rater variability problem (Manghisi et al., 2017). However, developing an automated RULA-based ergonomic feedback system requires estimating the joint angles of the upper body.
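To make the scoring concrete, the sketch below maps a few predicted joint angles to RULA sub-scores using the angle thresholds from McAtamney and Corlett (1993). It is a deliberately abbreviated illustration: the full method also uses lookup tables (Tables A, B, C), adjustments for shoulder abduction, arm support, wrist twist, muscle use and load, all omitted here, and the function names are our own.

```python
# Simplified RULA sub-scores from joint angles (degrees).
# Thresholds per McAtamney and Corlett (1993); adjustments and the
# grand-score lookup tables of the full method are omitted.

def upper_arm_score(shoulder_flexion_deg: float) -> int:
    # Extension is handled approximately via abs(); RULA scores
    # extension beyond 20 degrees the same as 20-45 degrees flexion.
    a = abs(shoulder_flexion_deg)
    if a <= 20:
        return 1
    if a <= 45:
        return 2
    if a <= 90:
        return 3
    return 4

def lower_arm_score(elbow_flexion_deg: float) -> int:
    # 60-100 degrees of elbow flexion is the low-risk working range.
    return 1 if 60 <= elbow_flexion_deg <= 100 else 2

def trunk_score(trunk_flexion_deg: float) -> int:
    a = trunk_flexion_deg
    if a <= 0:
        return 1  # upright (or supported while seated)
    if a <= 20:
        return 2
    if a <= 60:
        return 3
    return 4
```

A full implementation would combine these sub-scores (plus neck, wrist, muscle-use and force scores) through the RULA tables to obtain the final one-to-seven score and action level.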
Recent studies have proposed automating ergonomic assessment methods relying on computer vision and machine learning techniques (Diego-Mas and Alcaide-Marzal, 2014). In particular, the Kinect camera alongside its software development kit (SDK) has been extensively used to analyze the adopted posture and evaluate the RULA score (Plantard et al., 2015; Liebregts et al., 2016; Plantard et al., 2017; Manghisi et al., 2017; Abobakr et al., 2017a). The Kinect SDK tracks the human body and estimates the 3D Cartesian coordinates of 20 joint positions. It uses a random decision forest classifier to segment the body into parts, followed by a localization algorithm to infer joint positions (Abobakr et al., 2017a). However, there are several difficulties resulting from using the Kinect SDK. First, it relies on local body part detectors, and hence may produce unrealistic skeletons in cases of occlusions due to cluttered environments (Abobakr et al., 2018; Plantard et al., 2017). The Kinect SDK also has difficulty tracking self-occluded postures involving arm crossing, trunk bending, trunk lateral flexion and trunk rotation (Manghisi et al., 2017). This requires applying preprocessing operations to correct the resulting kinematic structure, as suggested in (Plantard et al., 2015; Plantard et al., 2017). Second, an additional processing stage is required to convert the 3D Cartesian coordinates of body joint positions into joint angles. For instance, Plantard et al. (Plantard et al., 2017) corrected Kinect data and estimated missing anatomical landmarks using the approach proposed in (Bonnechere et al., 2014), to make the reconstructed skeleton compatible with the ISB recommendations (Wu et al., 2005) and compute the joint angles. Clark et al. (2012) used the inverse tangent method to convert 3D joint positions into joint angles.
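The position-to-angle conversion step mentioned above can be illustrated with a short sketch. Here the angle at a joint is recovered from three tracked 3D positions (e.g. shoulder, elbow, wrist for elbow flexion) using `atan2`, which is numerically more stable than the plain arccosine of the dot product; the function name and interface are ours, not from any cited work.

```python
import numpy as np

def joint_angle(p_prox, p_joint, p_dist):
    """Angle (degrees) at p_joint between the two adjacent segments.

    Uses atan2(|u x v|, u . v), which stays well-conditioned for
    near-0 and near-180 degree angles, unlike arccos of the dot product.
    """
    u = np.asarray(p_prox, dtype=float) - np.asarray(p_joint, dtype=float)
    v = np.asarray(p_dist, dtype=float) - np.asarray(p_joint, dtype=float)
    return np.degrees(np.arctan2(np.linalg.norm(np.cross(u, v)), np.dot(u, v)))

# Example: a right angle at the elbow.
shoulder, elbow, wrist = [0, 1, 0], [0, 0, 0], [1, 0, 0]
# joint_angle(shoulder, elbow, wrist) -> 90.0
```

Note that this yields the included angle between segments; anatomical conventions (flexion vs. extension sign, plane of elevation) require projecting onto segment-local coordinate frames as in the ISB recommendations.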
Although these approaches have been successful in obtaining joint angles of high quality, they may exhibit large errors from relying on the Kinect skeleton data, especially in cases of occluded postures (Plantard et al., 2017; Manghisi et al., 2017). Improving the quality of the Kinect skeleton data for ergonomic studies remains an open area of research (Plantard et al., 2017). This work focuses on addressing the limitations of the Kinect V1 sensor, as it uses structured-light technology, which has been incorporated in a wide range of depth sensors (Abobakr et al., 2018). This allows better generalization to different depth cameras, such as the ASUS Xtion. The Kinect V2, on the other hand, uses time-of-flight imaging, which helps produce more robust skeleton data; however, it consumes more power and requires cooling (Fankhauser et al., 2015).
In this paper, we propose a skeleton-free holistic posture analysis system that accurately predicts body joint angles from a single depth image without utilizing the temporal information between subsequent images, as shown in Fig. 1. Although incorporating a temporal dynamics modeling stage can help ensure consistency of subsequent frame predictions and achieve higher frame rates, tracking algorithms require regular initialization to avoid drift anomalies (Shotton et al., 2013). The fundamental building block of the proposed method is a cascade of two deep convolutional neural network (ConvNet) models. The depth sensor produces two synchronized video feeds of RGB and depth images. First, we segment the body from the background by passing the RGB image to an object instance segmentation deep ConvNet model. This network computes segmentation masks for a predefined set of objects in a given scene, and we apply the obtained person's segmentation mask to the corresponding depth image. Second, depth values of the posture are encoded using a proposed depth encoding algorithm. Third, the encoded image is passed through the second ConvNet model to predict body joint angles. Finally, the estimated joint angles are used to compute the RULA score. Thus, we simplify the overall ergonomic evaluation procedure to a direct mapping from predicted joint angles to a RULA score. Using this score, the MSD risk level is identified and a recommended action is suggested to decrease the risk of work-related injuries, as defined in (McAtamney and Corlett, 1993). This is made possible by training our models on a large number of highly varied synthetic training images with ground truth joint angles that have been biomechanically modeled using a novel inverse kinematics step.
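The cascade described above can be summarized as a small pipeline sketch. All component functions here are placeholders standing in for the paper's models (the instance segmentation ConvNet, the depth encoder, the joint-angle regression ConvNet, and the RULA mapping); only the data flow between them is taken from the text.

```python
import numpy as np

def assess_posture(rgb, depth, segment_person, encode_depth,
                   angle_model, rula_from_angles):
    """Data flow of the proposed two-ConvNet cascade (components are
    injected as callables; their internals are not specified here)."""
    # 1) Person mask from the RGB frame (instance segmentation ConvNet).
    mask = segment_person(rgb)                 # boolean H x W mask
    # 2) Apply the mask to the synchronized depth image,
    #    zeroing background pixels.
    person_depth = np.where(mask, depth, 0.0)
    # 3) Encode depth values (the paper proposes its own encoding).
    encoded = encode_depth(person_depth)
    # 4) Regress the body joint angles with the second ConvNet.
    angles = angle_model(encoded)
    # 5) Map the predicted joint angles to a RULA score / action level.
    return rula_from_angles(angles)
```

Because each stage is a pure function of the current frame, the system needs no temporal tracking state and cannot drift, at the cost of forgoing inter-frame smoothing.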
The remainder of this paper is structured as follows. Section 2 describes the proposed method and the used deep ConvNet models. Section 3 presents the experiments and results. Key aspects and limitations of the proposed method are discussed in Section 4. Section 5 highlights the conclusion and future work.
Section snippets
Material and methods
We propose a vision-based ergonomic posture assessment system composed of two cascaded ConvNets: an object instance segmentation network and a holistic posture analysis network. We utilize both the RGB video and depth feeds of a low-cost depth sensor. The input feeds are synchronized, which means that each RGB image has an associated depth image. In particular, we employed the segmentation network to detect and segment the person from an input RGB image and reject other background objects. This
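The masking and depth-encoding step described above can be sketched as follows. This is a hypothetical encoding (min-max normalization of the in-mask depth values); the paper proposes its own depth encoding algorithm, whose details are not reproduced here.

```python
import numpy as np

def encode_person_depth(depth, mask, out_range=255.0):
    """Zero background pixels and rescale in-mask depth to [0, out_range].

    A plausible stand-in for the paper's depth encoding: the network
    should see only the segmented posture, with depth values mapped to
    a fixed numeric range regardless of the person's distance.
    """
    d = np.where(mask, depth.astype(float), 0.0)
    vals = d[mask]
    if vals.size == 0:
        return d                      # no person detected in this frame
    lo, hi = vals.min(), vals.max()
    if hi > lo:
        d[mask] = (d[mask] - lo) / (hi - lo) * out_range
    else:
        d[mask] = out_range           # degenerate case: flat depth
    return d
```

Normalizing per-person rather than per-frame makes the encoded image approximately invariant to the worker's distance from the sensor.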
Results
We have trained a deep ConvNet model on a synthetic training dataset to predict 15 body joint angles from a single depth image. The estimated joint angles are then used to compute the RULA score for the adopted posture. In this section, we evaluate the performance of the proposed method and explore the generalization capabilities on a real test dataset. Table 4 provides a detailed description of the synthetic and real datasets used in training and evaluating the proposed system. We also examine
Discussion
This paper proposed a semi-automated ergonomic assessment system of adopted working postures. The proposed method analyzes the posture holistically and estimates body joint angles directly from a single depth and RGB image pair. Hence, we do not exploit temporal dependencies or skeleton data from the Kinect SDK. The estimated joint angles are used to compute the RULA score, the MSD risk level and subsequently the urgency of intervention required to reduce the risk of injury. The RULA score is
Conclusions
This paper proposed a semi-automated holistic ergonomic posture assessment system. It is composed of an instance segmentation model that detects and segments the person in the scene and a deep convolutional neural network that we trained to estimate body joint angles directly from a single depth image. The joint angles prediction model is trained on synthetic depth images. This allows simulating a wide range of manual tasks performed by workers of different body shapes and sizes from several
Acknowledgement
This research was supported by the Institute for Intelligent Systems Research and Innovation (IISRI) at Deakin University, Australia. The project was funded via Ford's university research program (URP 2014-4055R), Ford Motor Co., USA. The data used in this project was obtained from the CMU graphics lab motion capture database, mocap.cs.cmu.edu. The database was created with funding from NSF EIA-0196217, USA.
References (56)
- et al., Validity and reliability of the Kinect within functional assessment activities: comparison with standard stereophotogrammetry, Gait Posture (2014)
- et al., Validity of the Microsoft Kinect for assessment of postural control, Gait Posture (2012)
- et al., Using Kinect sensor in observational methods for assessing postures at work, Appl. Ergon. (2014)
- et al., An ocular biomechanic model for dynamic simulation of different eye movements, J. Biomech. (2018)
- et al., Automated vision-based live ergonomics analysis in assembly operations, CIRP Ann. - Manuf. Technol. (2015)
- et al., Photograph-based ergonomic evaluations using the Rapid Office Strain Assessment (ROSA), Appl. Ergon. (2016)
- et al., Real time RULA assessment using Kinect V2 sensor, Appl. Ergon. (2017)
- et al., RULA: a survey method for the investigation of work-related upper limb disorders, Appl. Ergon. (1993)
- et al., Biomechanical loading of the shoulder complex and lumbosacral joints during dynamic cart pushing task, Appl. Ergon. (2013)
- et al., Validation of an ergonomic assessment method using Kinect data in real workplace conditions, Appl. Ergon. (2017)
- An investigation of jogging biomechanics using the full-body lumbar spine model: model development and validation, J. Biomech.
- Simulation of human movement: applications using OpenSim, Procedia IUTAM
- OpenSim: a musculoskeletal modeling and simulation framework for in silico investigations and exchange, Procedia IUTAM
- Innovative system for real-time ergonomic feedback in industrial manufacturing, Appl. Ergon.
- Physical risk factors identification based on body sensor network combined to videotaping, Appl. Ergon.
- A biomechanical and physiological study of office seat and tablet device interaction, Appl. Ergon.
- ISB recommendation on definitions of joint coordinate systems of various joints for the reporting of human joint motion - part II: shoulder, elbow, wrist and hand, J. Biomech.
- Subject-specific musculoskeletal modeling in the evaluation of shoulder muscle and joint function, J. Biomech.
- Body joints regression using deep convolutional neural networks
- A Kinect-based workplace postural analysis system using deep residual networks
- RGB-D human posture analysis for ergonomic studies using deep convolutional neural network
- A skeleton-free fall detection system from depth images using random decision forest, IEEE Syst. J.
- Musculoskeletal Disorders and Workplace Factors: A Critical Review of Epidemiologic Evidence for Work-Related Musculoskeletal Disorders of the Neck, Upper Extremity, and Low Back
- Nonfatal Occupational Injuries and Illnesses Resulting in Days Away from Work in 2015
- A coefficient of agreement for nominal scales, Educ. Psychol. Meas.
- An interactive graphics-based model of the lower extremity to study orthopaedic surgical procedures, IEEE Trans. Biomed. Eng.
- OpenSim: open-source software to create and analyze dynamic simulations of movement, IEEE Trans. Biomed. Eng.