Measuring nursing competencies in the operating theatre: Instrument development and psychometric analysis using Item Response Theory
Introduction
The focus of the study was on developing and validating a Performance Based Scoring Rubric designed to assess the competence of the instrument nurse in the operating theatre.
The aim of preparing nurses for practice is to ensure that they are competent to manage and provide quality care for patients (Dolan, 2003, Pirie and Gray, 2007). Evidence surrounding medical errors has made patient safety and quality of care major concerns in healthcare services. The competence of healthcare professionals to deliver safe, effective, patient-centred care is an important factor contributing to patient safety outcomes (Lunney et al., 2007), with the delivery of quality care directly related to the level of competence (Meretoja and Leino-Kilpi, 2001, Winslade et al., 2007). While the purpose of education is relatively easy to define, measuring the outcome of education and ensuring that competent nurses are being prepared is a far greater challenge (Edwards et al., 2001). Nurses are informed of the standard of practice expected of them through the development and acceptance of nursing competency standards; however, an assessment system to support such competencies is required (Evans, 2008).
A major focus in nursing education is the judgement of clinical performance which, although it may appear straightforward, is a complex process (Dolan, 2003). The challenge of assessing and measuring clinical competence is inherent in the nursing profession because of the diverse nature of nursing roles (Chambers, 1998, Dolan, 2003) and the complexity of nursing practice and human interaction (Gonczi et al., 1993), all of which are difficult to measure (Cowan et al., 2008, McGrath et al., 2006). With the transfer of nursing education to the tertiary sector, the assessment procedures that dominated nursing education in Australia, such as multiple choice tests and clinical skills tests, have been reviewed, with a move towards competency-based assessment. The difficulty of finding an effective measure is reported in the literature as the search for a valid and reliable method of assessing competence continues (Norman et al., 2002, While, 1994). To determine a nurse's competence in delivering the expected level of care, the nursing profession needs to develop procedures that enhance agreement about what constitutes a satisfactory performance (Dunn et al., 2000, Hager and Gillis, 1995).
Difficulties in developing valid and reliable assessment measures in nursing have resulted in instruments that closely mirror the clinical experience: that is, instruments with increased face validity but less focus on reliability, given the different variables present in the clinical setting that may influence the outcome of the assessment (McGrath et al., 2006). Although rarely addressed, the reliability and validity of assessment measures reported in the literature have not been specific enough (Watson et al., 2002), and theoretical frameworks are rarely reported (Meretoja and Leino-Kilpi, 2001). Literature reviews suggest that there is no ‘gold standard’ for measuring clinical competence (Dolan, 2003, Redfern et al., 2002, Watson et al., 2002), and assessing nurses continues to pose a challenge for nursing education (Evans, 2008).
Competence cannot be directly observed (Gonczi et al., 1993, Heywood et al., 1992, Wolf, 1989); therefore the evidence needs to be of sufficient quality and quantity to make a sound judgement about the individual's level of competence (Gonczi, 1994). With the movement in nursing from an apprenticeship model to one that focuses on the development of cognitive and caring skills (Benner, 1984), the ability to measure nursing skills has become increasingly important, since such measurement is central to the notion of professionalism and essential to nursing practice (Fotheringham, 2010, Norman et al., 1992). Similarly, evaluation of competence should assess competence in an integrated manner (Wolf, 1989). The method selected should be relevant to what is being assessed, to increase the validity of the evidence and ensure generalisability of the performance to other tasks (Bailey, 1993).
Reporting of results has been a contentious issue since the introduction of competency-based training and assessment, with much of the debate focused on reporting competence beyond the competent/not yet competent dichotomy (Gillis and Griffin, 2004). Several scoring approaches are used for evaluating performance, with two main approaches, analytical and holistic scoring, identified by Anthanasou (1997). Analytical scoring requires an examination of specific aspects of the performance against a set of criteria, as opposed to judging the overall impression of the candidate's performance (Anthanasou, 1997, Perkins, 1983). Assessors may judge the entire performance as a whole, but each aspect of the performance is awarded a separate score in addition to an overall score (Gillis, 2003, Goulden, 1989). An advantage of the analytical method is that it provides more detailed information about the candidate's performance: strengths and weaknesses are identified, allowing feedback to be tailored to the needs of the student.
When the overall performance is judged holistically, the evaluation considers the overall quality of the performance, including its component parts, but does not mark them separately (Goulden, 1989); a single classification of competence level is made. A limitation of holistic scoring is that a single score, while useful for ranking, provides very little diagnostic information about the various aspects of the performance. If an analytical scoring process is adopted, evidence of competence needs to be evaluated against scoring criteria, usually referred to as a ‘rubric’. A rubric, at its most basic, is a “scoring tool that lays out the specific expectations for an assignment” (Stevens and Levi, 2005, p. 3), implemented when an evaluation of the quality of performance is required. ‘Rubric’ is the technical term for the scheme used to allocate scores and is considered the most important aspect of performance assessment (Gillis and Griffin, 2004, Griffin, 2000).
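The contrast between the two scoring approaches can be sketched in code. The criteria, rating scale, and cut-scores below are hypothetical illustrations only; they are not the rubric developed in this study:

```python
# Illustrative sketch: analytical vs holistic scoring of one observed performance.
# Criterion names and thresholds are hypothetical, not taken from the study's rubric.

def analytical_score(ratings: dict, cut_score: int = 2) -> dict:
    """Analytical scoring: a separate score per criterion plus an overall score,
    so diagnostic detail is retained for feedback."""
    return {
        "per_criterion": ratings,
        "overall": sum(ratings.values()),
        "weaknesses": [c for c, r in ratings.items() if r < cut_score],
    }

def holistic_score(ratings: dict, pass_mark: int = 8) -> str:
    """Holistic scoring: one classification only, no per-criterion detail."""
    return "competent" if sum(ratings.values()) >= pass_mark else "not yet competent"

ratings = {"asepsis": 3, "instrument handling": 2, "count procedure": 1, "communication": 3}
print(analytical_score(ratings)["weaknesses"])  # → ['count procedure']
print(holistic_score(ratings))                  # → competent
```

The analytical result pinpoints where remediation is needed, while the holistic result only ranks the candidate, which is the trade-off described above.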
McGrath et al. (2006) provided a range of views on the use of generic domains of clinical competence that do not take into consideration the specific skills required to practise in a specialist environment. The use of generic domains raises concerns about standardising nursing practice, a common criticism of competency-based assessment (Evans, 2008). With the development of specific competencies, for example competencies for specialised areas of nursing, the assumption is that generic competencies will not be sufficient because the roles in different contexts of nursing vary considerably; hence the need to develop practice-specific competencies has been identified (Sutton and Arbon, 1994). A lack of existing instruments for measuring perioperative competencies was identified in the literature (Nicholson, 2005) and supported by a report published with the revised Australian College of Operating Room Nurses Competency Standards (2006).
The aim of the study was to examine the validity and reliability of a Performance Based Scoring Rubric designed to assess the competence of the instrument nurse in the operating theatre using the Rasch model (1960). The investigation into developing a measurement instrument for determining clinical competence in nursing was driven by two main rationales. The first related to a lack of a ‘gold standard’ for judging clinical performance in nursing practice (Dolan, 2003, Redfern et al., 2002, Watson et al., 2002). The second was linked to exploring the use of Rasch model theory in validating a competency-based assessment for the operating theatre to attain empirical support for construct and content validity and identify measurement problems (Beck and Gable, 2001, Fox, 1999, Smith, 2004).
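For reference, the dichotomous Rasch model (Rasch, 1960) invoked here expresses the probability that person $n$ succeeds on item $i$ as a function of the person's ability $\theta_n$ and the item's difficulty $\delta_i$:

```latex
P(X_{ni} = 1) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}
```

Because person and item parameters are placed on a common logit scale, the model supports the separation and fit statistics used in the analysis; for polytomous rubric ratings, the partial credit model cited in the references extends this formulation to multiple score categories.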
Section snippets
Development of the Performance Based Scoring Rubric
The instrument for recording the field observations of the nurse educators and preceptors included analytical scoring rubrics as well as holistic scoring of both levels of proficiency and competence.
The internal consistency of the 18-item Analytical Observation Form
To examine the internal consistency of the Analytical Observation Form, the responses of the 32 nurse educators and preceptors observing 95 nurses in the operating theatre were analysed using Cronbach's alpha and the Person Separation Reliability Index. The latter is a person index derived from Rasch modelling, indicating how clearly the test is able to distinguish between the persons and the items (Wright and Masters, 1982).
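Cronbach's alpha can be computed directly from a persons-by-items score matrix. The sketch below uses a small hypothetical ratings matrix (6 nurses by 4 rubric items on a 0–3 scale) for illustration; it is not the study's 95-nurse data set:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a persons x items score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)        # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of persons' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical ratings: rows = observed nurses, columns = rubric items (0-3 scale)
scores = np.array([
    [3, 3, 2, 3],
    [2, 2, 2, 2],
    [1, 1, 0, 1],
    [3, 2, 3, 3],
    [0, 1, 1, 0],
    [2, 3, 2, 2],
])
print(round(cronbach_alpha(scores), 3))  # → 0.937
```

High alpha here simply reflects that the illustrative items covary strongly; the Person Separation Reliability Index reported in the study is the Rasch-based analogue, computed from person ability estimates rather than raw scores.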
The internal reliability of the
Conclusions
The implications for assessing and inferring competence in nursing practice have been identified in the literature. While a number of measurement instruments have been developed in nursing, it is acknowledged that inadequate psychometric testing has been reported, resulting in a lack of confidence in supporting implementation of the measurement instrument. Evidence is required to reduce the subjectivity of direct observation and of the checklist approach, which has been relied on heavily in
References (60)
- et al. Increasing the accuracy of observer ratings by enhancing cognitive processing skills. American Journal of Pharmaceutical Education (1995)
- et al. Measuring nursing competence: development of a self-assessment tool for general nurses across Europe. International Journal of Nursing Studies (2008)
- Triangulation for the assessment of clinical nursing skills: a review of theory, use and methodology. International Journal of Nursing Studies (2010)
- et al. Using the Rasch model in nursing research: an introduction and illustrative example. International Journal of Nursing Studies (2009)
- et al. The validity and reliability of methods to assess the competence to practice of pre-registration nursing and midwifery students. International Journal of Nursing Studies (2002)
- et al. Exploring the assessors' and nurses' experience of formal assessment of clinical competency in the administration of blood components. Nurse Education in Practice (2007)
- et al. Australian nursing — moving forward? Competencies and the nursing profession. Nurse Education Today (1994)
- Introduction to Educational Testing (1997)
- Australian College of Operating Room Nurses Competency Standards (2006)
- ACORN Standards for Perioperative Nursing (2008)
- Theoretical Models and the Assessment of Competency
- Item Response Theory in affective instrument development: an illustration. Journal of Nursing Measurement
- From Novice to Expert. Excellence and Power in Clinical Nursing Practice
- Applying the Rasch Model. Fundamental Measurement in the Human Sciences
- Criterion-referenced definitions for rating scales in clinical evaluation. The Journal of Nursing Education
- Advanced Design in Nursing Research
- Some issues in the assessment of clinical practice: a review of the literature. Journal of Clinical Nursing
- Maximising confidence in assessment decision-making: a springboard to quality in assessment
- Assessing student nurse clinical competency: will we ever get it right? Journal of Clinical Nursing
- Formal models vs human situational understanding: inherent limitations on the modeling of business expertise. Office: Technology and People
- The development of competency standards for specialist critical nurses. Journal of Advanced Nursing
- Evaluating student learning: an Australian case study. Nursing & Health Sciences
- Competency assessment in nursing. A summary of literature published since 2000
- An introduction to the partial credit model for developing nursing assessment. The Journal of Nursing Education
- Using rubrics to recognise varying levels of performance. Training Agenda: A Journal of Vocational Education and Training
- Competency based assessment in the professions in Australia. Assessment in Education: Principles, Policy & Practice
- The Development of Competency-Based Assessment Strategies for the Professions
- Theoretical and empirical comparisons of holistic and analytical scoring of written and spoken discourse
Cited by (14)
- The use and quality of reporting of Rasch analysis in nursing research: A methodological scoping review. International Journal of Nursing Studies (2022)
- Emergency department registered nurses’ disaster medicine competencies. An exploratory study utilizing a modified Delphi technique. International Emergency Nursing (2019). Citation excerpt: “Although recognized as important, a review of literature revealed that specific disaster medicine competencies for ED registered nurses, as well as a method for assessing disaster preparedness of emergency department registered nurses may be vague or missing [23,24]. The ability to assess disaster preparedness, as well as training and educating staff as required may be difficult [25,26]. Training and education in disaster response should be planned in relation to defined learning objectives based on what the nurses need to master.”
- Students’ approaches to learning in a clinical practicum: A psychometric evaluation based on item response theory. Nurse Education Today (2018). Citation excerpt: “This item-level analysis can yield valuable information for the utility of a learning approach instrument, because some items may possess higher discriminating power in differentiating students at varied levels of a learning approach, and some items may reflect a more difficult learning strategy than others. In response to this research gap, item response theory (IRT; van der Linden and Hambleton, 2013), as a modern measurement theory, offers a promising solution, and it has been increasingly used in nurse education in recent years (Nicholson et al., 2013). Differing from the conventional classical test theory, IRT models the response of an individual to an item as a function of item measures (item discrimination and item difficulty) and person measures (also referred to as latent trait, in a continuum ranging from low to high levels).”
- Psychometric properties of the Global Operative Assessment of Laparoscopic Skills (GOALS) using item response theory. American Journal of Surgery (2017). Citation excerpt: “In CTT, the total score is calculated regardless of how much each item correlates with the operative skill that is intended to be measured. However, using global rating scales such as GOALS, scores on items are ordinal rather than interval and the psychological distance between items differs from one to the other. ‘Bimanual dexterity’, ‘efficiency’ and ‘autonomy’ in the GOALS scale are more difficult items.”
- Assessment of bachelor's theses in a nursing degree with a rubrics system: Development and validation study. Nurse Education Today (2016). Citation excerpt: “Rubrics are assessment tools that establish criteria and achievement levels by setting a rating scale (Shipman et al., 2012). Rubrics are frequently used in nursing education in order to assess clinical judgement (Shin et al., 2015), clinical laboratories (Wu et al., 2015), or the skills acquisition in clinical settings (Nicholson et al., 2013). However, no studies have been found on their applications in the BT assessment.”
- Using Rasch analysis to identify midwifery students' learning about providing breastfeeding support. Women and Birth (2015). Citation excerpt: “Another unique aspect of Rasch analysis is its capacity to explore individuals’ response patterns to a survey, to identify if participants tended to answer the items using the same rating category throughout the survey, or if they tended to avoid selecting the categories located at either ends of the scale. Rasch analysis has many applications; it has been used to measure self-efficacy in different occupational roles in nursing; to measure competence of operating theatre nurses’ and renal nursing; to determine the performance of a depression screening scale and a psychological distress scale; and for assessing English language skill development. Rasch analysis was chosen as the method of choice for this study for two specific reasons.”