Introduction
Critical reflection is an educational strategy that systematically integrates clinical practice experiences, praxes, and theoretical knowledge [1]. Critical reflection in nursing refers to how nurses analyze their experiences, actions, and decisions thoughtfully and systematically. It involves questioning and evaluating one’s practices to gain deeper insights into patient care, enhance professional development, and improve nursing practice [2].
Critical reflection plays a pivotal role in enhancing nurses’ capabilities to thoroughly analyze clinical situations by applying evidence-based practices, ultimately improving their problem-solving skills [3]. This process enables them to systematically evaluate their experiences and decisions, yielding valuable insights into the complexities of the clinical challenges they face daily and into their overall performance in the nursing field. Through metacognition (the awareness and understanding of one’s thought processes), nurses can better recognize patterns in their reasoning and decision-making [4]. This self-awareness mitigates the risk of errors in patient care and promotes more effective strategies for addressing unforeseen situations [5]. Critical reflection also fosters improved communication with patients and colleagues, as it encourages a culture of openness and collaborative dialogue, which is essential for providing high-quality care. Moreover, critical reflection serves as a mechanism for nurses to better understand the negative emotions and stresses often inherent in their professional lives [5]. By engaging in reflective practices, nurses can learn to identify and manage these emotional responses more effectively, leading to healthier coping strategies and reduced burnout. Ultimately, this holistic approach equips them to navigate the emotional landscape of their work, thus enhancing their overall well-being and professional effectiveness [6, 7].
Despite the acknowledged benefits of critical reflection, research within the nursing field remains relatively limited. Existing studies have concentrated on defining the concept of reflection [8], assessing the extent of reflective thinking as a determinant for enhancing nursing competencies, and establishing the correlation between these elements [9]. The emphasis has primarily been on comprehensive reflection, which encompasses critical reflection. Furthermore, evidence suggests that critical reflection yields significant educational advantages, particularly in enhancing critical thinking skills and competencies in nursing practice [10]. One investigation indicated that adopting reflective journaling as a pedagogical approach effectively fosters improvements in critical thinking and problem-solving skills [11, 12]. Most nursing research has focused on self-reflection, which has been shown to bolster clinical competence among nursing students [11]. Previous research evaluated critical reflection through qualitative data from reflection journals and interviews [13]. However, the assessment tools employed primarily measured general self-reflection or reflective thinking rather than focusing specifically on the critical reflection of nurses in clinical contexts [14–16]. While these tools are prevalent in pedagogy, psychology, and business, they fall short in assessing critical reflection competencies that pertain directly to nursing practice. Although critical reflection is acknowledged as essential to individual and organizational learning in nursing education, there remains a need for instruments designed to measure this facet of nursing care [17].
The necessity of conducting this study stems from the critical role that reflection plays in nursing practice, particularly as it relates to enhancing clinical competencies and improving patient care outcomes. Despite the recognized benefits of critical reflection, there is a significant gap in research focusing on measuring critical reflection competencies among clinical nurses. Previous studies have largely concentrated on self-reflection and general reflective thinking, often overlooking the unique critical reflection aspects essential for nursing practice. The tool for assessing critical reflection competencies was first developed by Shin et al. [17], who aimed to create a reliable instrument to effectively evaluate nurses’ abilities to engage in critical reflection within clinical settings. Our study aimed to validate and assess the psychometric properties of the Persian version of the critical reflection competency scale, thereby addressing the existing gap in the literature and providing a culturally relevant tool for Persian-speaking clinical nurses. This research aims to enhance the understanding of critical reflection in nursing and contribute to developing effective strategies for improving nursing education and practice.
Research questions and hypotheses
Methods
Design
The current study was a methodological study conducted in Ardabil province, located in the northwestern region of Iran, from July to September 2024. The main focus of this research was to perform a comprehensive psychometric evaluation of the Persian version of the Critical Reflection Competency Scale (CRCS-P). To ensure the rigor and reliability of the evaluation, the study adhered to the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) guidelines [18], which provide a structured framework for assessing measurement properties, accuracy, and applicability of health-related instruments.
Setting and participants
The study included clinical nurses from various wards within educational, treatment, and research centers in Ardabil province. We employed a convenience sampling method, targeting participants who met specific inclusion criteria: they had to provide informed consent, possess at least a bachelor’s degree in nursing, and have a minimum of six months of clinical experience. The experience criterion was established based on literature indicating that at least six months of clinical experience is necessary for nurses to develop foundational critical thinking skills and engage effectively in reflective practice [17, 19, 20]. This duration allows nurses to encounter a variety of clinical situations, enhancing their ability to reflect critically on their experiences. Nurses not directly involved in patient care, as well as those in administrative or educational roles, were excluded from participation to focus on those actively engaged in clinical practice. This approach ensures that the participants have relevant and practical experiences that contribute to their critical reflection competencies.
To assess the representativeness of our sample, we conducted a preliminary survey of clinical nursing staff in Ardabil province, identifying approximately 1,300 registered nurses across various healthcare settings, including hospitals and clinics. This initial survey provided insight into the broader population from which our sample was drawn. In conducting confirmatory factor analysis (CFA), having at least 5 to 10 participants for each estimated parameter in the model is commonly recommended [21, 22]. Recognizing the importance of statistical power, we adopted a conservative approach by targeting a ratio of 10 participants per item in the Critical Reflection Competency Scale (CRCS-P), which contains 19 items. This led us to aim for a minimum of 190 participants for the analysis. Anticipating a 10% non-response rate based on previous studies, we adjusted our target sample size to 209 individuals. During the data collection phase, we experienced a sample attrition rate of approximately 4%: eight incomplete questionnaires were removed due to data deficiencies. Ultimately, data from 201 completed questionnaires were analyzed, which ensured a sufficient sample size for the CFA and maintained the integrity of our psychometric evaluation.
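The sample-size reasoning above can be retraced with a few lines of arithmetic. The sketch below is illustrative only; the function name and its parameters are hypothetical and simply encode the 10-participants-per-item rule and the 10% non-response allowance described in the text.

```python
import math


def target_sample_size(n_items, per_item=10, non_response_pct=10):
    """Return (minimum CFA sample, recruitment target inflated for non-response).

    Hypothetical helper: the percentage is kept as an integer so that the
    inflation step avoids floating-point rounding surprises.
    """
    minimum = n_items * per_item
    # Add the anticipated share of non-responders, rounded up to whole people.
    target = minimum + math.ceil(minimum * non_response_pct / 100)
    return minimum, target


minimum, target = target_sample_size(19)   # the CRCS-P has 19 items
removed = 8                                # incomplete questionnaires
analyzed = target - removed
attrition_pct = round(100 * removed / target)

print(minimum, target, analyzed, attrition_pct)
```

Under these assumptions the numbers match those reported: a minimum of 190, a target of 209, and 201 analyzed questionnaires after an attrition of roughly 4%.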
The personal information form used in this study was designed to collect the demographic and clinical characteristics of the participating nurses. This form included questions regarding age, gender, clinical experience, educational background, and marital status. The data collected through this form provided essential context for analyzing the psychometric properties of the CRCS-P and understanding the sample characteristics.
Critical reflection competency scale (CRCS)
The Critical Reflection Competency Scale (CRCS), developed by Lee et al. [17], consisted of five dimensions: factor 1 (Critical and reflective thinking, 4 items), factor 2 (Holistic perspective and self-reflection, 5 items), factor 3 (Meaningful engagement and self-awareness, 3 items), factor 4 (Commitment to professional growth and reflection, 4 items), and factor 5 (Evidence-based reflection and outcome evaluation, 3 items). Participants rated each item on a 5-point Likert scale, ranging from 1 (not at all) to 5 (strongly agree). The scale’s minimum and maximum possible scores were 19 and 95, respectively. The overall scale demonstrated a Cronbach’s alpha of 0.853.
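As a minimal scoring sketch, the structure described above (19 items, five factors, responses from 1 to 5) can be expressed as follows. The mapping of item positions to factors is an assumption made for illustration (items taken consecutively in factor order); the source gives only the number of items per factor, and `score_crcs` is a hypothetical helper name.

```python
# Items per factor as described in the text; the consecutive ordering of
# items within the questionnaire is an ASSUMPTION for illustration only.
FACTOR_ITEM_COUNTS = {
    "factor_1": 4,  # Critical and reflective thinking
    "factor_2": 5,  # Holistic perspective and self-reflection
    "factor_3": 3,  # Meaningful engagement and self-awareness
    "factor_4": 4,  # Commitment to professional growth and reflection
    "factor_5": 3,  # Evidence-based reflection and outcome evaluation
}


def score_crcs(responses):
    """Sum 19 Likert responses (1-5) into factor scores and a total score."""
    n_items = sum(FACTOR_ITEM_COUNTS.values())
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    scores = {}
    start = 0
    for name, count in FACTOR_ITEM_COUNTS.items():
        scores[name] = sum(responses[start:start + count])
        start += count
    scores["total"] = sum(responses)
    return scores


print(score_crcs([1] * 19)["total"], score_crcs([5] * 19)["total"])
```

The two extreme response patterns reproduce the stated score range of 19 to 95.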
Translation process
With authorization from the original developer, Dr. Lee, the Critical Reflection Competency Scale (CRCS) was translated into Persian using a meticulous forward-backward translation method. This process was conducted by two skilled translators who were fluent in both English and Persian. Both translators possessed qualifications in linguistics and experience in healthcare contexts, ensuring that the translation maintained the essence and meaning of the original scale. The research team conducted pilot tests with a small group of Persian-speaking nurses (n = 15) to enhance cultural equivalence. This step was crucial in confirming that the translated items were contextually appropriate and relevant to the experiences of these nurses. Feedback from the pilot tests led to further refinements, ensuring the language used was accurate and culturally resonant.
After completing the initial translations and pilot testing, the research team collaborated to create a cohesive Persian version that maintained clarity and fidelity to the original scale. This finalized version was then subjected to a rigorous back-translation process into English. The back-translation was reviewed by the original developers of the CRCS, who assessed the translated content for accuracy and alignment with the original scale. Through this comprehensive translation and review process, the CRCS-P was prepared for psychometric assessment in Persian-speaking populations. This thorough approach underscores our commitment to cultural adaptation and the instrument’s integrity, making it suitable for the target population.
Psychometric testing
Content validity
The content validity of the CRCS-P was assessed through the Content Validity Index (CVI) and the Content Validity Ratio (CVR), involving a panel of 10 experts who evaluated each item for its relevance and necessity. These experts were selected based on predetermined criteria, including their experience in the field, academic qualifications, and prior involvement in similar research. We prioritized experts with a minimum of five years of experience in relevant clinical practice or research, as well as a demonstrated record of publications in the area of content validity and measurement in health-related fields.
A cut-off threshold of 0.78 or higher was established for the CVI to determine the validity of individual items. Those items that met or surpassed this benchmark were deemed valid. The necessity of each item was evaluated using the Content Validity Ratio (CVR), with a threshold of 0.62 or higher indicating that an item is essential. The Scale-Level Content Validity Index for Unanimous Agreement (S-CVI/UA) was calculated with a cut-off point of 0.80 or above, demonstrating a high level of consensus among experts. Likewise, the Scale-Level Content Validity Index Average (S-CVI/Ave) employed a cut-off of 0.90 or higher to signify robust average agreement across items [23].
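The indices above can be illustrated with a short sketch. It assumes the conventional definitions, which the text does not spell out: the item-level CVI as the proportion of experts rating an item 3 or 4 on a 4-point relevance scale, Lawshe's CVR as (n_e − N/2)/(N/2), S-CVI/Ave as the mean of the item-level CVIs, and S-CVI/UA as the proportion of items on which all experts agree. All function names and ratings are hypothetical.

```python
def item_cvi(relevance_ratings):
    """Proportion of experts rating the item 3 or 4 (assumed 4-point scale)."""
    return sum(1 for r in relevance_ratings if r >= 3) / len(relevance_ratings)


def item_cvr(essential_votes):
    """Lawshe's CVR: (n_essential - N/2) / (N/2) for a panel of N experts."""
    n = len(essential_votes)
    n_essential = sum(1 for vote in essential_votes if vote)
    return (n_essential - n / 2) / (n / 2)


def scale_cvi(ratings_by_item):
    """Return (S-CVI/Ave, S-CVI/UA) from per-item lists of expert ratings."""
    i_cvis = [item_cvi(ratings) for ratings in ratings_by_item]
    ave = sum(i_cvis) / len(i_cvis)
    ua = sum(1 for value in i_cvis if value == 1.0) / len(i_cvis)
    return ave, ua


# Hypothetical ratings from a 10-expert panel for two items.
item_a = [4, 4, 3, 3, 4, 4, 3, 4, 2, 4]   # nine of ten experts rate 3 or 4
item_b = [4] * 10                          # unanimous agreement
print(item_cvi(item_a))                    # compared against the 0.78 cut-off
print(item_cvr([True] * 9 + [False]))      # compared against the 0.62 cut-off
print(scale_cvi([item_a, item_b]))
```

With 10 experts, an item needs at least nine "essential" votes to pass the 0.62 CVR threshold under this formula, since eight votes give 0.60.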
Construct validity
The construct validity of the CRCS-P was assessed through confirmatory factor analysis (CFA). A first-order CFA model was employed to explore the relationships between the observed variables and their corresponding latent constructs. Factor loadings were evaluated, using a threshold of 0.40 to indicate an acceptable association level between the observed variables and their latent factors. The statistical significance of the factor loadings was determined using the critical ratio (C.R.), with a significance threshold set at C.R. > 1.96 (p < 0.05) [24].
Various fit indices were calculated to evaluate the goodness of fit of the CFA model. The model’s fit was assessed against established cut-off values for absolute, incremental, and parsimonious fit indices. These indices included CMIN/DF < 3, RMSEA < 0.08, GFI > 0.90, CFI > 0.90, NFI > 0.90, and AGFI > 0.80 [25].
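The cut-off values listed above lend themselves to a simple lookup table. The following sketch is not part of the study's AMOS workflow; `FIT_CRITERIA` and `check_fit` are hypothetical names, and the example values are the three indices quoted in the Discussion (CMIN/DF = 1.349, RMSEA = 0.034, CFI = 0.989).

```python
import operator

# Cut-offs as listed in the text: (comparison, threshold) per index.
FIT_CRITERIA = {
    "CMIN/DF": (operator.lt, 3.0),
    "RMSEA": (operator.lt, 0.08),
    "GFI": (operator.gt, 0.90),
    "CFI": (operator.gt, 0.90),
    "NFI": (operator.gt, 0.90),
    "AGFI": (operator.gt, 0.80),
}


def check_fit(indices):
    """Map each supplied fit index to True/False against its cut-off."""
    results = {}
    for name, value in indices.items():
        compare, cutoff = FIT_CRITERIA[name]
        results[name] = compare(value, cutoff)
    return results


reported = {"CMIN/DF": 1.349, "RMSEA": 0.034, "CFI": 0.989}
print(check_fit(reported))
```

Only indices that are actually supplied are checked, so a partial report does not silently pass the remaining criteria.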
Convergent and discriminant validity
The Fornell and Larcker criterion was used to determine the convergent and discriminant validity. The convergent validity of the CRCS-P was evaluated using Composite Reliability (CR) and Average Variance Extracted (AVE). CR values were determined for each factor of the scale, ensuring they exceeded the recommended threshold of 0.70. Additionally, AVE values were calculated to confirm that they surpassed the acceptable cut-off of 0.50, indicating that the respective latent constructs account for a significant portion of the variance in the indicators [26].
Maximum Shared Variance (MSV) was assessed for discriminant validity. For the constructs, MSV should be less than the AVE to confirm that they do not share too much variance with each other. This measure collectively validates that the model’s constructs are adequately distinct [26].
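For readers who want to see how these criteria are computed, the sketch below uses the standard formulas based on standardized factor loadings: CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)), AVE = Σλ² / n, and MSV as the largest squared correlation between a factor and any other factor. The loadings and correlations are hypothetical and are not taken from the study.

```python
def composite_reliability(loadings):
    """CR from standardized loadings, with error variance 1 - loading**2."""
    sum_loadings = sum(loadings)
    error_variance = sum(1 - loading ** 2 for loading in loadings)
    return sum_loadings ** 2 / (sum_loadings ** 2 + error_variance)


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


def maximum_shared_variance(correlations_with_other_factors):
    """MSV: largest squared correlation with any other factor."""
    return max(r ** 2 for r in correlations_with_other_factors)


# Hypothetical standardized loadings for a three-item factor and its
# hypothetical correlations with two other factors.
loadings = [0.7, 0.8, 0.9]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
msv = maximum_shared_variance([0.5, 0.6])

print(round(cr, 3), round(ave, 3), round(msv, 3))
print(cr > 0.70, ave > 0.50, msv < ave)   # the three criteria in the text
```

In this illustrative case all three checks hold, which is the pattern the criteria above require for each factor.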
Reliability
The CRCS-P’s reliability was assessed through internal consistency and test-retest reliability analysis. Internal consistency was evaluated using Cronbach’s alpha (α) and McDonald’s omega coefficient (ω). These coefficients were calculated for the entire scale and dimensions to ensure they exceeded the recommended threshold of 0.70, signifying that the instrument reliably measures the intended constructs [27–32].
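As an illustration of the internal-consistency calculation, Cronbach's alpha can be computed from a respondents-by-items table as α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The sketch below uses only the Python standard library and hypothetical data; McDonald's omega is not shown because it requires loadings from a fitted factor model.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: four respondents, two items.
responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(responses), 3))
```

A value at or above 0.70 would meet the threshold used in this study.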
Test-retest reliability was evaluated using the Intraclass Correlation Coefficient (ICC), with a threshold greater than 0.75 for the total CRCS-P and subscales. This analysis was conducted by administering the scale to 40 clinical nurses selected through simple random sampling at two different time points spaced two weeks apart [33, 34].
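The text does not state which form of the ICC was used. As one possible illustration, the sketch below computes the two-way random-effects, absolute-agreement, single-measurement coefficient, often written ICC(2,1), from the ANOVA mean squares; this model choice is an assumption, and the scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) for a list of subjects, each a list of k repeated scores.

    ASSUMED model: two-way random effects, absolute agreement, single measure.
    """
    n = len(scores)        # subjects
    k = len(scores[0])     # measurement occasions
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test and retest totals for three nurses; every retest score
# is one point higher, so ranking is preserved but agreement is not perfect.
shifted = [[1, 2], [2, 3], [3, 4]]
print(round(icc_2_1(shifted), 3))
```

Because this form penalizes systematic shifts between the two administrations, a constant change in scores lowers the coefficient even when respondents keep the same rank order.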
Data analysis
Descriptive analyses were performed using IBM SPSS Statistics for Windows, version 26 (IBM Corp., Armonk, NY, USA). For CFA and Structural Equation Modeling (SEM), we employed IBM SPSS AMOS Graphics, version 24. Statistical significance was assessed using a two-tailed test, with a p-value of less than 0.05 considered significant.
Ethical considerations
The research study received formal approval from the Research Ethics Committees of Ardabil University of Medical Sciences, with the approval ID: IR.ARUMS.REC.1403.103 (June 16, 2024). In conducting this study, we adhered strictly to the ethical principles established in the most recent revision of the Declaration of Helsinki, ensuring the highest standards of ethical conduct throughout the research process. Before participating, all individuals involved in the study were thoroughly informed about the research’s objectives, methodology, and potential implications. This comprehensive briefing included detailed explanations of the study’s goals, the procedures they would be undergoing, and any possible risks or benefits associated with their participation. Participants were required to read and sign informed consent forms to ensure they fully understood their rights and the nature of their involvement.
Discussion
In this study, we aimed to validate the Persian version of the Critical Reflection Competency Scale (CRCS-P) for clinical nurses, assessing its psychometric properties to ensure its applicability in the Iranian healthcare context. The findings demonstrated strong construct validity, internal consistency, and reliability, indicating that the CRCS-P was a robust tool for measuring critical reflection competencies among nurses. By comparing our results with the original scale, we provided evidence of the CRCS-P’s effectiveness in capturing the nuances of critical reflection in clinical practice. This validation was crucial, as it enhanced the understanding of reflective practices within nursing education and professional development and contributed to improving patient care quality by promoting reflective competencies among healthcare providers.
In our assessment, the CVI for each item ranged from 0.80 to 0.92, indicating that all items exceeded the acceptable threshold of 0.78. By comparison, the original tool [17] reported an I-CVI ranging from 0.88 to 1.00, with none of the questions falling below 0.78. This comparison showed that while our results had a slightly lower range, all items in our study were still deemed valid. Additionally, the CVR in our study ranged from 0.70 to 1.00, highlighting the significance of all items as perceived by the expert panel. The original tool, however, did not provide an analysis of CVR. This aspect of our findings added an extra layer of validation, indicating that the items were acceptable and considered essential by experts. Regarding the S-CVI, we calculated the S-CVI/UA at 0.87 and the S-CVI/Ave at 0.92. In comparison, the original tool reported an S-CVI/Ave of 0.96, which exceeded the standard value of 0.90. Although our S-CVI/Ave was lower than that of the original tool, it still reflected a strong consensus among experts regarding the relevance of the items. Overall, while there are differences in some indices between our findings and those of the original tool, both studies support the validity of the assessed items. The variations in scores may be attributed to factors such as the size of the expert panel, the context in which the items were evaluated, and potential differences in expert opinions. Nevertheless, both sets of results affirm the robustness of the content validity assessment and the importance of the items evaluated [35].
For construct validity, we used a first-order CFA model, which provided robust goodness-of-fit indices. Specifically, we achieved a CMIN/DF value of 1.349, an RMSEA of 0.034, and a CFI of 0.989, among others. These results indicated a well-fitting model, suggesting that the measured variables accurately represented our examined constructs. The factor loadings ranged from 0.62 to 0.92, demonstrating the significance and relevance of each variable within the model, with critical ratios indicating strong statistical significance (p < 0.001). Additionally, CR and AVE values exceeded the acceptable thresholds, further affirming our constructs’ convergent and discriminant validity. In contrast, the original tool [17] employed Exploratory Factor Analysis (EFA) to assess construct validity. The initial analysis revealed a corrected item-to-total correlation coefficient ranging from r = 0.292 to 0.650, with 16 items removed due to low correlations. The KMO value of 0.888 and a significant Bartlett’s sphericity test confirmed the appropriateness of the data for factor analysis. Ultimately, five factors were derived from 19 items, explaining 53.02% of the total variance. While both methodologies yielded valid results, our CFA approach allowed for a more stringent evaluation of the hypothesized relationships among constructs. Our findings indicated that the constructs were distinct yet related, as evidenced by the high factor loadings and good fit indices. In contrast, the original tool’s EFA identified five factors but required the removal of numerous items, which may have suggested less clarity in the underlying structure of the constructs.
The differences in findings could be attributed to the distinct methodologies used. CFA is confirmatory, testing a specified model based on theoretical expectations, whereas EFA is exploratory, allowing for the identification of potential factors without predefined hypotheses. This methodological divergence may have accounted for the stronger construct validity evidenced in our study, as CFA provided a clearer picture of how well the items reflected the underlying constructs [36]. Hence, our findings demonstrated robust construct validity through CFA, highlighting the interconnectedness and distinctiveness of the constructs measured. While the original tool’s EFA results were valuable, the methodological rigor of our approach offered a more definitive understanding of the constructs, reinforcing their relevance and applicability in the field. These insights contribute to the ongoing discourse on the importance of methodological choices in validating measurement tools and the constructs they aim to assess.
Our evaluation of the internal consistency of the CRCS-P utilized both Cronbach’s alpha and McDonald’s omega coefficients. The overall Cronbach’s alpha was 0.862, indicating strong reliability across all factors. Additionally, the McDonald’s omega coefficient of 0.871 further supported the conclusion of robust internal consistency. The stability of the CRCS-P was assessed through the ICC, which yielded a value of 0.806 (95% CI: 0.758–0.848). This result indicated good reliability over time and across different measurements, suggesting that the tool produced consistent results under varying conditions. In contrast, the original tool [17] reported varying levels of internal consistency across its five factors. The overall Cronbach’s alpha for the total was 0.853, which was acceptable; however, the alpha values for individual factors were notably lower, ranging from 0.515 to 0.738. The lowest reliability was observed in Factor 5 (α = 0.515), indicating that this factor might not have consistently measured the intended construct. Such variability in reliability among the factors suggested that while the original tool was generally reliable, certain areas might have required further refinement to enhance their internal consistency.
The differences in reliability findings could be attributed to several factors. Our study’s overall higher Cronbach’s alpha and omega coefficients suggested that the CRCS-P had a more cohesive structure, with items better aligned in measuring the same underlying constructs. The original tool’s lower factor-specific alpha values might have indicated that some items did not correlate well with others within the same factor, leading to reduced internal consistency [37]. Furthermore, the CRCS-P’s stability, indicated by the ICC value of 0.806, showed that the tool was reliable over time. This aspect was crucial for longitudinal studies, where consistent measurement was necessary to draw valid conclusions. The original tool [17] did not provide similar stability measures, which limited the assessment of its reliability across different time points. Accordingly, our findings demonstrated strong internal consistency and reliability for the CRCS-P, outperforming the original tool regarding overall reliability and stability. While the original tool showed acceptable overall reliability, the variability in factor-specific alphas raised concerns about the consistency of certain constructs. These insights highlighted the importance of evaluating reliability comprehensively and suggested that further refinement might have been necessary for the original tool to enhance its reliability across all factors. This comparison underscored the significance of robust measurement tools in ensuring accurate and consistent data collection in research.
Limitations and strengths
This study had several limitations that should be acknowledged. One significant limitation was the sample, which, while adequate in size for the analyses performed, was drawn by convenience sampling from a single province and may not have fully represented the diversity of clinical settings across Iran. Future research could benefit from a larger, more varied sample to enhance the generalizability of the findings. Additionally, the cross-sectional design limited the ability to assess changes in critical reflection competencies over time, suggesting that longitudinal studies would provide deeper insights into how these competencies develop and affect nursing practice. Another limitation was that the study focused exclusively on clinical nurses, leaving the applicability of the CRCS-P to other healthcare professionals unexplored.
Despite these limitations, this study presented several strengths that contributed to nursing education and practice. The validation of the CRCS-P addressed a significant gap in assessing reflective practices among Iranian clinical nurses, providing a culturally relevant tool that could enhance educational and professional development initiatives. The use of robust psychometric methods, including CFA and reliability testing, further strengthened the credibility of the findings. Additionally, the comprehensive evaluation of both convergent and discriminant validity supported the conclusion that the CRCS-P accurately measured the intended constructs, making it a valuable resource for advancing reflective practice in nursing.
Conclusions
The Persian version of the Critical Reflection Competency Scale (CRCS-P) has demonstrated strong psychometric properties, including high internal consistency, reliability, and validity. These findings suggest that the CRCS-P is a robust instrument for assessing critical reflection competencies among clinical nurses in Iran. The importance of evaluating these competencies cannot be overstated, as reflective practice is integral to enhancing nursing education and improving patient care outcomes. Research indicates that critical reflection is associated with better clinical decision-making and improved patient outcomes. By fostering a culture of reflection, nurses can systematically analyze their experiences, leading to deeper insights into their practice and the complexities of patient care. The validation of the CRCS-P provides a dependable tool for measuring reflective practices, thereby supporting the ongoing professional development of nurses in Iran. Moreover, promoting critical reflection equips nurses to navigate complex clinical situations more effectively. This capability enhances their decision-making and contributes to a patient-centered approach to care, which is essential in today’s dynamic healthcare environment.
For future research, it is advisable to investigate the application of the CRCS-P across diverse healthcare settings and among various healthcare professionals to further establish its utility and effectiveness. Conducting longitudinal studies could yield valuable insights into how critical reflection competencies develop over time and their influence on clinical practice. Additionally, qualitative research might be beneficial in gaining a deeper understanding of the factors that shape reflective practices among nurses, including organizational culture and support systems.
Finally, exploring the connection between critical reflection and patient outcomes would further emphasize the significance of this competency in nursing practice. Such investigations could inform educational strategies and policy development to enhance reflective practices, ultimately leading to improved patient care and professional satisfaction among nurses.