Open Access 01.12.2024 | Research

Psychometric evaluation of the study interest questionnaire-short form among Chinese nursing students based on classical test theory and item response theory

Written by: Yue Yi Li, Lai Kun Tong, Mio Leng Au, Wai I. Ng, Si Chen Wang, Yongbing Liu, Liqiang Zhong, Yi Shen, Xichenhui Qiu

Published in: BMC Nursing | Issue 1/2024

Abstract

Background

There is currently no dedicated measurement for assessing nursing students’ study interest in China. Considering the good reliability, validity, and widespread applicability of the Study Interest Questionnaire-Short Form (SIQ-SF), the objective of this study was to validate its usage among Chinese nursing students.

Methods

The translation and cross-cultural adaptation rigorously followed the modified Brislin model. A cross-sectional survey was conducted using the Chinese version of the SIQ-SF, and convenience sampling was employed to recruit nursing students. The psychometric evaluation of the Chinese version of the SIQ-SF was based on Classical Test Theory and Item Response Theory.

Results

A total of 1158 participants were included in the analysis. The item-level content validity index (CVI) ranged from 0.9 to 1.0, and the scale-level CVI was 0.98. In the exploratory factor analysis, three factors with eigenvalues above 1 were identified, accounting for 62.554% of the cumulative variance. In the confirmatory factor analysis, the CMIN/DF was 5.639, the GFI was 0.953, the CFI was 0.902, and the IFI was 0.904. The Cronbach’s α coefficient of the Chinese version of the SIQ-SF was 0.70. Thirty-one participants were invited to complete the scale again after two weeks. The intraclass correlation coefficient was 0.784, and that of the items ranged from 0.70 to 0.819. The infit MnSQ values ranged between 0.76 and 1.51, and the outfit MnSQ values ranged between 0.72 and 1.76. The point-measure correlation values ranged between 0.30 and 0.68. The item difficulty measures ranged from −0.66 to 1.44 logits, and the individual learning interest estimates ranged from −4.22 to 4.97 logits. DIF contrast ranged from 0.00 to 0.33 logits, with all p values greater than 0.05.

Conclusions

The Chinese version of the SIQ-SF demonstrated acceptable reliability and validity among Chinese nursing students and could be used to assess nursing students’ study interest in China. With the aid of this scale, teachers can gain a better understanding of nursing students’ study interests, thereby maximizing their learning effects through appropriate content and methods.

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1186/s12912-024-02390-1.
The original version of this article was revised: In this article, affiliation 1 was incorrectly given as “Nursing College of Macau, Edifício do Instituto de Enfermagem Kiang Wu de Macau, Avenida do Hospital das Ilhas no.447, Coloane, RAEM, Macau SAR, China” but should have been “Kiang Wu Nursing College of Macau, Edifício do Instituto de Enfermagem Kiang Wu de Macau, Avenida do Hospital das Ilhas no.447, Coloane, RAEM, Macau SAR, China”.
A correction to this article is available online at https://doi.org/10.1186/s12912-025-02735-4.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Background

The global nurse shortage, coupled with a rise in nursing student dropout rates, has profound implications for individuals, families, educational institutions, and society at large [1, 2]. While challenges within nursing programs contribute to student attrition [3, 4], a significant factor is the lack of interest in the nursing profession [5, 6]. Study interest, a vital motivational factor [7], influences students’ engagement with content and their capacity for deep learning [8]. Research indicates that study interest positively impacts nursing students’ academic performance [9], self-efficacy [10], critical thinking [11], and learning satisfaction [12]. Consequently, study interest plays a crucial role in improving nursing students’ learning state and quality, and the development of study interest is a powerful tool for fostering deeper learning [7].
Many researchers have explored methods to enhance study interest among nursing students [13–15]. However, there is currently a lack of dedicated measurement tools for assessing nursing students’ study interest in China [16]. Existing measures mainly focus on assessing the interest of primary and secondary school students in specific subjects [16–18]. While scales are available to measure study interest among college students, they are often tailored to specific subjects or situations [19, 20]. Consequently, it is essential to utilize a measurement tool specifically developed to evaluate college students’ interest in professional learning to obtain accurate results. Employing an inappropriate scale may result in an inability to accurately capture the intended construct, potentially introducing systematic measurement errors and biasing study outcomes [21]. Therefore, selecting an appropriate scale for assessing study interest is critical for minimizing measurement bias and enhancing the overall quality of research findings.
The Study Interest Questionnaire (SIQ) has been identified as a suitable tool for assessing college students’ interest in professional learning [22]. Grounded in educational interest theory, the original SIQ consists of 18 items [22]. It has been translated into multiple languages and widely used [11, 23]. In 2013, a scholar proposed a condensed version of the SIQ known as the SIQ-SF [24]. The SIQ-SF exhibits robust reliability and validity [24], with the added benefit of containing only 9 items, thereby saving respondents valuable time. Considering the good reliability, validity, and widespread applicability of the SIQ-SF [24–26], the objective of this study was to validate its usage among Chinese nursing students. To better evaluate the psychometric characteristics of the Chinese version of the SIQ-SF, classical test theory (CTT) and item response theory (IRT) were employed.

Methods

This cross-sectional study was conducted in two distinct phases: (1) translation and cross-cultural adaptation of the SIQ-SF and (2) validation of the SIQ-SF in a sample of Chinese nursing students.
The nine items of the SIQ-SF capture three factors of study interest: emotion (5 items), value (2 items), and intrinsic quality (2 items) [22]. Each item is rated on a Likert scale from 1 (not at all true) to 4 (completely true), with items 2 and 5 scored in reverse [24]. The total score, which ranges from 9 to 36, is calculated by summing the scores of all nine items [24]. A higher total score indicates a greater level of interest in the student’s professional learning [24]. The SIQ-SF demonstrates satisfactory reliability (Cronbach’s α = 0.82) [22] and validity, effectively measuring students’ interest in professional learning [24–26].
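The scoring rule described above can be sketched in a few lines. This is only an illustration under the stated rule (items 2 and 5 reversed on a 1 to 4 scale, then summed); the function name `siq_sf_total` and the constants are hypothetical and do not come from the questionnaire's authors.

```python
from typing import Sequence

# Assumption for illustration: items are given in questionnaire order Q1..Q9.
REVERSE_ITEMS = {2, 5}          # 1-based item numbers scored in reverse
MIN_SCORE, MAX_SCORE = 1, 4     # 4-point Likert scale

def siq_sf_total(responses: Sequence[int]) -> int:
    """Return the SIQ-SF total score (9 to 36) for one respondent."""
    if len(responses) != 9:
        raise ValueError("expected 9 item responses")
    total = 0
    for item_no, raw in enumerate(responses, start=1):
        if not MIN_SCORE <= raw <= MAX_SCORE:
            raise ValueError(f"item {item_no}: response {raw} is outside 1-4")
        # Reverse-scored items map 1->4, 2->3, 3->2, 4->1.
        if item_no in REVERSE_ITEMS:
            total += MIN_SCORE + MAX_SCORE - raw
        else:
            total += raw
    return total
```

A respondent who answers 4 on every regular item and 1 on items 2 and 5 reaches the maximum of 36.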

Phase 1: translation and cross-cultural adaptation

To maintain semantic consistency with the SIQ-SF, the translation and cross-cultural adaptation rigorously followed the modified Brislin model [27]. The process included the following four steps: (1) Forward translation: The SIQ-SF was independently translated from English to Chinese by two bilingual translators residing in China, both of whom have nursing doctorates. Subsequently, a panel comprising five nursing educators, two bilingual translators and one English lecturer deliberated and modified two versions of the translated SIQ-SF together, ultimately reaching a consensus on the phrasing and expression of an initial Chinese version of the SIQ-SF. (2) Backward translation: Two additional bilingual translators who worked in the UK translated the initial Chinese version of the SIQ-SF back into English and were blinded to the original SIQ-SF. These two bilingual translators both hold master’s degrees. Then, the same expert panel and the two bilingual translators working in the UK undertook the task of scrutinizing the back-translation in comparison to the original text. The objective of this comparison was to identify and rectify any linguistic errors present in both translations and finally form the draft Chinese version of the SIQ-SF that best matches the original English version. (3) Cross-Cultural Adaptation: A group of ten experts were invited to enhance the draft Chinese version of the SIQ-SF by aligning it with Chinese expressions and habits, drawing upon their professional expertise and work experience. Each of these experts holds a doctorate, and they specialize in various fields, including nursing, education, medicine, and psychology. The translational relevance of each of the 9 items was graded on a 4-point scale (1 = “totally different” and 4 = “equivalent”). (4) Pilot study: Ten nursing students were invited to evaluate the fluency, readability, and comprehensibility of the items. 
Any inappropriate wording was revised according to the feedback received, yielding the final Chinese version of the SIQ-SF.

Phase 2: validation

Participants and settings

Convenience sampling was employed in the present study. Participants who fulfilled the following criteria were included: (1) Chinese nursing student, (2) over 18 years old, and (3) volunteering to participate. Participants who were unable to read and write Chinese were excluded. To broaden the coverage of the sample, three provinces representing the eastern, central, and western regions of mainland China were chosen, considering variations in economic development levels and educational resources. Additionally, because the educational approaches in the two special administrative regions (SARs) of China (Hong Kong and Macao) differ from those in the provinces of mainland China, it was considered necessary to include samples from an SAR. Therefore, samples were ultimately collected from the eastern province of Jiangsu, the central province of Hunan, the western province of Sichuan, and the Macao SAR. Ideally, the sample size should consist of at least 500 participants for the factor analysis in CTT [28]. Although the appropriate sample size for the Rasch Rating Scale Model (RRSM) in IRT has not been determined with certainty, for the purpose of estimating fit statistics, a sample of 500 students was considered adequate [28].

Procedures

The data were collected using Wenjuanxing (https://www.wjx.cn/), a well-known Chinese online questionnaire survey platform. Initially, the research team generated promotional materials, such as posters incorporating QR codes for electronic questionnaires. Subsequently, the research coordinators in each province assumed the responsibility of disseminating the research through social media platforms within their respective regions. To ensure the completion of the questionnaire, participants were required to respond to all questions prior to submission. Furthermore, each participant was limited to a single instance of answering the questionnaire, as determined by the IP address documented by the Wenjuanxing survey platform.

Statistical analysis

To conduct the analysis, the total sample was randomly divided into two equal-sized groups. The randomization techniques help to distribute any potential confounding variables evenly across groups, thus reducing demographic differences [29]. The first group of samples was used for exploratory factor analysis (EFA), and the second group of samples was used for confirmatory factor analysis (CFA) and Rasch analysis (RA). The EFA was conducted with SPSS 29.0, the CFA was conducted with AMOS 22.0, and the RA was conducted with Winsteps 3.74.0. Demographic characteristics (such as age, sex, and education) were analyzed through descriptive statistics. A p value < 0.05 was considered to indicate statistical significance [30].
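As a rough illustration of the random split described above, the sketch below shuffles participant indices and cuts them into two equal halves. It is not the authors' procedure (the source does not say which tool performed the split); the function name `split_half` and the seed are assumptions made for reproducibility of the example.

```python
import random

def split_half(n_participants, seed=0):
    """Randomly split participant indices 0..n-1 into two halves.

    The first half could serve the EFA, the second the CFA and Rasch analysis.
    """
    indices = list(range(n_participants))
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    half = n_participants // 2
    return sorted(indices[:half]), sorted(indices[half:])

efa_idx, cfa_ra_idx = split_half(1158)
```

With 1158 participants, each half contains 579 indices and no participant appears in both.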

CTT analysis

The validity of the Chinese version of the SIQ-SF was assessed through content validity and construct validity. The Content Validity Index (CVI) was calculated by first categorizing scores of 3 and 4 as ‘relevant’ (assigned a value of 1) and scores of 1 and 2 as ‘irrelevant’ (assigned a value of 0). The item-level content validity index (I-CVI) and scale-level CVI (S-CVI) were then calculated: I-CVI = (number of experts rating the item as ‘relevant’) / (total number of experts), and S-CVI/Ave = (sum of I-CVI scores) / (total number of items) [31]. An I-CVI ≥ 0.80 and an S-CVI ≥ 0.9 were deemed to indicate good content validity [32]. To assess construct validity, EFA and CFA were conducted. Before factor analysis, the Kaiser‒Meyer‒Olkin (KMO) measure and Bartlett’s test of sphericity were used to verify sampling adequacy. When the KMO value is above 0.60 and Bartlett’s test of sphericity is significant, the data are suitable for factor analysis [33, 34]. Principal component analysis (PCA) with varimax rotation was used in the EFA. The number of extracted factors was based on examination of the scree plot and eigenvalues > 1, and factor loadings were required to be greater than 0.5. In the CFA, a robust maximum likelihood method was used to evaluate model fit. Ideally, the chi-squared value divided by the degrees of freedom (CMIN/DF) should be smaller than 5, and the goodness-of-fit index (GFI), comparative fit index (CFI), and incremental fit index (IFI) should be greater than 0.9 [35].
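The two CVI formulas above translate directly into code. The sketch below assumes a ratings matrix in which `ratings[e][i]` is expert `e`'s 1 to 4 rating of item `i`; the function name and the sample ratings are hypothetical and are not the study's data.

```python
def content_validity(ratings):
    """Return (list of I-CVI values, S-CVI/Ave) from a matrix of 1-4 ratings.

    ratings[e][i] is the rating given by expert e to item i.
    """
    n_experts = len(ratings)
    n_items = len(ratings[0])
    i_cvi = []
    for i in range(n_items):
        # Ratings of 3 or 4 count as 'relevant' (1); 1 or 2 as 'irrelevant' (0).
        relevant = sum(1 for e in range(n_experts) if ratings[e][i] >= 3)
        i_cvi.append(relevant / n_experts)
    s_cvi_ave = sum(i_cvi) / n_items
    return i_cvi, s_cvi_ave

# Hypothetical example: 10 experts, 3 items; one expert rates item 2 as '2'.
example = [[4, 3, 4] for _ in range(9)] + [[4, 2, 3]]
i_cvi, s_cvi = content_validity(example)
```

In this made-up example, the second item's I-CVI drops to 0.9 because one of ten experts rated it below 3, while the other two items stay at 1.0.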
The reliability of the Chinese version of the SIQ-SF was determined by analyzing its internal consistency and test-retest reliability. Cronbach’s α coefficient was used to evaluate internal consistency, and a value of > 0.70 indicated satisfactory internal consistency [36]. The intraclass correlation coefficient (ICC) was calculated to determine the test-retest reliability at an interval of 2 weeks, and the minimal acceptable value was set at 0.70 [37].
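For readers unfamiliar with the internal-consistency statistic, the following is a minimal sketch of the standard Cronbach's α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). It assumes reverse-scored items have already been recoded; the function name and the toy data are illustrative only, and the study itself used SPSS.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for scores[p][i] = respondent p's score on item i."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Toy data (hypothetical): 4 respondents, 2 items.
toy = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(toy)
```

Items that rise and fall together across respondents push α toward 1; unrelated items push it toward 0.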

Rasch analysis

The RRSM was used to estimate item and person parameters. Several assumptions must be met before using the RRSM. Unidimensionality is one of the foundational assumptions, meaning that all items measure a single concept or construct [38]. Principal component analysis (PCA) of the residuals was used to investigate unidimensionality [38]. The criteria for unidimensionality were that the variance explained by the measurement dimension be at least 40% and that the eigenvalue of the unexplained variance in the first contrast be < 3.0 [38]. Local independence is another major assumption of Rasch models, which holds that, once the measured trait is accounted for, responses to the items in a test are not related to one another [39]. Local independence can be assumed if Yen’s Q3 statistic is equal to or less than 0.36 [40]. To evaluate the items’ fit to the RRSM, the outfit mean square (outfit MnSQ) and infit MnSQ were computed, which should be between 0.6 and 1.4 [41]. The point-measure correlation (PTMEA CORR.), which represents the relationship between responses to an individual item and the overall person measures, was used to investigate item polarity [42]. It is ideal for these correlations to be positive [42]. To ensure that the items discriminate between different levels of person performance and that the sample can distinguish calibration differences between items, separation indices were calculated for items and persons. Person separation reliability measures the degree to which a respondent would maintain the same position if given another set of items measuring the same construct, whereas item separation reliability measures how consistently the set of items would be ordered by different respondents who possess similar abilities [43]. Separation is generally considered adequate if the separation index exceeds 2 and the separation reliability coefficient exceeds 0.7 [44].
To visualize item features and individual measures, a Wright map was used to display the measures of the sample respondents and the locations of all items on a common scale (logits) [45]. Additionally, differential item functioning (DIF) analysis was conducted to test measurement invariance across genders; DIF is typically flagged when the DIF contrast exceeds 0.5 logits with a p value < 0.05 [46].
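To make the fit statistics above more concrete, the sketch below computes category probabilities under a rating scale model and then the infit and outfit mean squares for one item, using their conventional definitions (outfit as the mean squared standardized residual, infit as the variance-weighted version). It is a simplified illustration and not the Winsteps estimation routine: person measures, the item difficulty, and the thresholds are taken as given, responses are coded 0 to 3 rather than 1 to 4, and all names and numbers are hypothetical.

```python
import math

def rsm_probs(theta, delta, taus):
    """Probabilities of categories 0..len(taus) for person measure theta,
    item difficulty delta, and shared rating-scale thresholds taus."""
    logits = [0.0]
    for tau in taus:
        # Each step adds (theta - delta - tau) to the cumulative logit.
        logits.append(logits[-1] + (theta - delta - tau))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def item_fit(observed, thetas, delta, taus):
    """Return (infit MnSQ, outfit MnSQ) for one item.

    observed[n] is person n's category (0-based); thetas[n] is that person's measure.
    """
    sq_resid_sum = var_sum = z2_sum = 0.0
    for x, theta in zip(observed, thetas):
        p = rsm_probs(theta, delta, taus)
        expected = sum(k * pk for k, pk in enumerate(p))
        var = sum((k - expected) ** 2 * pk for k, pk in enumerate(p))
        sq_resid_sum += (x - expected) ** 2
        var_sum += var
        z2_sum += (x - expected) ** 2 / var
    return sq_resid_sum / var_sum, z2_sum / len(observed)

def within_fit_range(mnsq, low=0.6, high=1.4):
    """Apply the 0.6 to 1.4 acceptance range stated in the text."""
    return low <= mnsq <= high
```

Values near 1 indicate responses about as variable as the model predicts; values well above 1 indicate unexpectedly noisy responses.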

Ethical considerations

Initially, permission to translate the SIQ-SF into Chinese was obtained from the original author. Ethical approval for the present study was received from the Research Management and Development Department of a College of Macau (No. REC-2021.801). The Helsinki Declaration was followed throughout the study. The data were used for research purposes only. Prior to participating in the survey, participants had to read the informed consent form and click the “Agree” button. Participants have the right to withdraw from the study without any negative effects.

Results

Demographic characteristics

A total of 1158 participants were included in the analysis, and their demographic characteristics are presented in Table 1. The participants ranged in age from 18 to 50 years, and the average age was 23 ± 5.0 years. Female participants accounted for 81.5%, and 72% of the participants had an education level of undergraduate or above.
Table 1
Demographic characteristics of the participants (n = 1158)

| Variables | EFA (N = 579), n (%) | CFA & RA (N = 579), n (%) |
| --- | --- | --- |
| Age, mean (standard deviation) | 23.1 (4.8) | 23.1 (4.3) |
| Gender | | |
| Female | 466 (80.5) | 478 (82.6) |
| Male | 113 (19.5) | 101 (17.4) |
| Education | | |
| Junior college | 160 (27.6) | 165 (28.5) |
| Undergraduate | 333 (57.5) | 329 (56.8) |
| Graduate | 86 (14.9) | 85 (14.7) |
| Location | | |
| Jiangsu | 142 (24.5) | 144 (24.9) |
| Hunan | 154 (26.6) | 155 (26.8) |
| Sichuan | 148 (25.6) | 139 (24.0) |
| Macao | 135 (23.3) | 141 (24.3) |

Validity

Content validity

The I-CVI of the Chinese version of the SIQ-SF ranged from 0.9 to 1.0, and the S-CVI was 0.98, as shown in Supplementary Table S1. All the I-CVIs were at least 0.90 and the S-CVI exceeded 0.90, suggesting that the content validity of the Chinese version of the SIQ-SF was good.

Construct validity

The construct validity was evaluated by EFA and CFA. The KMO value was 0.760, and Bartlett’s test of sphericity was statistically significant (χ2 = 1337.904, P < 0.001), indicating that the data were suitable for factor analysis. In the EFA, three factors (emotion, value, and intrinsic quality) were identified, all of which align with the original scale. These factors accounted for 62.554% of the cumulative variance. A scree plot displaying the eigenvalues is shown in Fig. 1. The factor loadings are presented in Table 2. Consequently, the three-factor model was examined by CFA. The results of the CFA are shown in Fig. 2. The initial model was modified according to the modification indices. In the revised model, the CMIN/DF was 5.639, the GFI was 0.953, the CFI was 0.902, and the IFI was 0.904, indicating that the model fit the data adequately. The construct validity of the Chinese version of the SIQ-SF was found to be acceptable, with the identified factors and their corresponding items consistent with the original scale.
Table 2
Factor analysis results for the Chinese version of the SIQ-SF (N = 579)

| No. | Item | Emotion | Value | Intrinsic quality |
| --- | --- | --- | --- | --- |
| Q8 | I am confident that I have chosen the major that corresponds to my personal preferences. | 0.809 | | |
| Q9 | I chose my major primarily because of the interesting subject matter involved. | 0.745 | | |
| Q3 | When I am in a library or bookstore, I like to browse through magazines or books having to do with topics related to my major. | 0.714 | | |
| Q7 | Even before I started studying, my current major was important to me. | 0.693 | | |
| Q4 | It was of great personal importance to me to be able to study this particular subject. | 0.682 | | |
| Q5 | Compared to other things that are of great importance to me (e.g., hobbies, social life), my studies are of markedly less significance to me. | | 0.856 | |
| Q2 | I prefer to talk about my hobbies rather than about my major. | | 0.722 | |
| Q6 | Working with particular subject matter is more important to me than leisure and amusement. | | | 0.607 |
| Q1 | After a long weekend or vacation, I look forward to getting back to my studies. | | | 0.590 |

Reliability

The Cronbach’s α coefficient of the SIQ-SF was 0.70, indicating that the internal consistency of the Chinese version of the SIQ-SF was satisfactory. Thirty-one participants were invited to complete the scale after two weeks. The ICC of the Chinese version of the SIQ-SF was 0.784, and that of the items ranged from 0.70 to 0.819, as shown in Supplementary Table 1. All the ICCs were greater than 0.70, suggesting that the test-retest reliability of the Chinese version of the SIQ-SF was good.

Rasch analysis results

Unidimensionality and local independence

Based on the PCA of the standardized residuals, the dimension extracted by the Rasch model accounted for 33.4% of the variance by persons and items, and the eigenvalue of the first contrast (the largest secondary dimension) was 2.3, which was considered appropriate. Furthermore, to ensure local independence, all pairs of items were thoroughly examined, and no pairwise residual correlation exceeded 0.5. Detailed information regarding this analysis can be found in Supplementary Table S2, which suggests that local dependence was not a concern.

Item fit

Infit and outfit indices indicated the construct validity of the assessment test for differentiating students with varying levels of learning interest. Table 3 provides the item statistics for each of the 9 items, including item difficulty, standard error of measurement, fit statistics (comprising infit and outfit), and point-measure correlation.
The infit MnSQ values ranged between 0.76 and 1.51, and the outfit MnSQ values ranged between 0.72 and 1.76. With the exception of Q5, whose infit (1.51) and outfit (1.76) values were above the upper reference value, the fit values for all items were within the range of 0.6 to 1.4. The point-measure correlations ranged from 0.30 to 0.68. Because all of these correlations were positive, each item in the scale works in the same direction as the other items in assessing nursing students’ learning interest.
Table 3
Item difficulty, standard error, fit, and point-measure correlation

| Item number | Item difficulty | Standard error | Infit MnSQ | Outfit MnSQ | PTMEA CORR. |
| --- | --- | --- | --- | --- | --- |
| Q1 | -0.38 | 0.07 | 1.05 | 1.04 | 0.50 |
| Q2 | -0.53 | 0.08 | 1.37 | 1.31 | 0.30 |
| Q3 | 0.16 | 0.07 | 0.80 | 0.77 | 0.64 |
| Q4 | -0.62 | 0.08 | 0.90 | 0.87 | 0.57 |
| Q5 | 1.44 | 0.06 | 1.51 | 1.76 | 0.36 |
| Q6 | 0.18 | 0.07 | 0.91 | 0.92 | 0.53 |
| Q7 | -0.66 | 0.08 | 0.76 | 0.72 | 0.59 |
| Q8 | -0.05 | 0.07 | 0.84 | 0.81 | 0.68 |
| Q9 | 0.44 | 0.07 | 0.81 | 0.80 | 0.67 |

Item and person reliability

In the Rasch model, reliability is a measure that indicates the consistency of the position of the person and item on the logit scale. According to the Rasch analysis, the SIQ-SF had a person reliability coefficient of 0.70 and a person separation index of 1.51, indicating that the scores of the participants could be reliably estimated and classified into at least 2 strata [(4 × 1.51 + 1)/3 ≈ 2.35]. The item reliability coefficient was 0.99, and the item separation index was 8.30. These findings suggest that the items provided accurate measurements and were effectively differentiated into at least 11 strata [(4 × 8.30 + 1)/3 = 11.4] of learning interest. A visual representation of the distribution of item difficulty and person ability is shown in Fig. 2. The right side of the Wright map displays the distribution of items ranked by their difficulty, with the easiest items (Q4 and Q7) positioned at the bottom and the most difficult item (Q5) positioned at the top. The item difficulty measures ranged from −0.66 to 1.44 logits, a spread of 2.10 logits. Individuals are positioned on the left side of the Wright map. Nursing students with lower learning interest were situated at the base of the scale, while nursing students with higher learning interest were located at the top of the map. The individual learning interest estimates ranged from −4.22 to 4.97 logits, a spread of 9.19 logits. The mean item difficulty for the test was 0.00 logits, while the mean person ability was approximately 0.75 logits higher. The dispersion of individuals in the Wright map was considerably wider than the spread of the test items.
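The strata figures quoted above follow from the formula (4G + 1)/3, where G is the separation index. The sketch below reproduces that arithmetic and also shows the conventional Rasch relation between separation and reliability, R = G²/(1 + G²); that second relation is not stated in the text and is included here as an assumption, although it agrees with the reported coefficients. Function names are illustrative.

```python
def strata(separation):
    """Number of statistically distinct strata: (4G + 1) / 3."""
    return (4 * separation + 1) / 3

def reliability_from_separation(separation):
    """Conventional Rasch relation R = G^2 / (1 + G^2) (assumed, not from the text)."""
    return separation ** 2 / (1 + separation ** 2)

person_strata = strata(1.51)   # about 2.35, i.e. at least 2 strata
item_strata = strata(8.30)     # about 11.4, i.e. at least 11 strata
person_rel = reliability_from_separation(1.51)   # about 0.70
item_rel = reliability_from_separation(8.30)     # about 0.99
```

Both reliability values round to the coefficients reported in the paragraph above, which suggests the reported separation and reliability figures are mutually consistent.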

DIF across gender

The DIF analysis determines whether constructs are equivalent across groups. Mean scores of the SIQ-SF based on demographic data are presented in Supplementary Table S3. Figure 3 compares the item DIF to the overall baseline item difficulty for the gender-based classification of people. For all items, DIF contrast ranged from 0.00 to 0.33 logits, with all p values greater than 0.05, showing no substantive DIF based on gender.

Discussion

In the present study, the SIQ-SF was translated into Chinese, and the psychometric properties were determined among Chinese nursing students using CTT and IRT. Based on these findings, the Chinese version of the SIQ-SF demonstrated acceptable reliability and validity among Chinese nursing students. Educators can leverage this scale to gain a deeper understanding of nursing students’ learning interests. This understanding can inform the development of tailored content and pedagogical approaches, ultimately enhancing educational outcomes.
During the translation and cross-cultural adaptation, the modified Brislin model was rigorously followed to maintain semantic consistency [27]. The translation and cross-cultural adjustment did not alter the original structures of the SIQ-SF, and only a few items were modified to make them more easily understandable in Chinese. The S-CVI of the Chinese version of the SIQ-SF was greater than 0.90, demonstrating that the content validity of the Chinese version of the SIQ-SF was satisfactory. The Chinese version of the SIQ-SF maintained semantic consistency with the original scale, which may be partly due to the strict following of the guidelines of the theoretical model in the process of translation and cross-cultural adaptation [27]. On the other hand, it may also be that experts in various fields provided professional advice during the adjustment process.
According to the EFA, the Chinese version of the SIQ-SF had three factors (emotion, value, and intrinsic quality), which corresponded with the original scale. In the CFA, the model fit indices, including CMIN/DF, GFI, CFI, and IFI, all met the standard except for CMIN/DF. The CMIN/DF is a commonly used fit index in CFA, and it is a measure of the discrepancy between the observed covariance matrix and the model-implied covariance matrix, adjusted for degrees of freedom [47]. The CMIN/DF value was slightly greater than the normal range, indicating that there is still room for improvement. However, it is worth noting that the CMIN/DF ratio tends to increase as the sample size increases [48], and the large sample size in this study may partly account for the slightly greater CMIN/DF ratio. Additionally, all model fitting indices except CMIN/DF were within normal ranges, thus indicating an acceptable model fit.
In terms of reliability, the Cronbach’s α coefficient and ICC were within normal ranges, indicating satisfactory internal consistency and test-retest reliability of the Chinese version of the SIQ-SF. The Cronbach’s α coefficient of this scale was 0.7, which was lower than that of the original version. Compared to the original scale, the Chinese version of the SIQ-SF has half as many items, which may explain the lower Cronbach’s α coefficient [49]. Regarding test-retest reliability, the ICC of the original SIQ was 0.67, which was lower than that of the Chinese version of the SIQ-SF [22]. In the present study, thirty-one participants were invited after two weeks, while the original SIQ was retested after two years. The various retest times may partly explain these differences [50, 51].
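The link between test length and reliability mentioned above can be illustrated with the Spearman–Brown prophecy formula. This is a general psychometric result rather than an analysis from the study, it assumes roughly parallel items, and the numerical value used below is hypothetical:

```latex
% Predicted reliability when test length is multiplied by a factor k,
% given the reliability \rho of the original-length test:
\rho_{k} = \frac{k\,\rho}{1 + (k - 1)\,\rho}

% Halving the number of items (k = 1/2):
\rho_{1/2} = \frac{\tfrac{1}{2}\,\rho}{1 - \tfrac{1}{2}\,\rho} = \frac{\rho}{2 - \rho}

% Hypothetical example: \rho = 0.80 for the full-length scale
\rho_{1/2} = \frac{0.80}{2 - 0.80} \approx 0.67
```

Under these assumptions, halving a scale lowers its expected reliability even when item quality is unchanged.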
In addition to using CTT to verify the psychometric properties of the scale, IRT was also used to obtain a comprehensive understanding of how the survey items and respondents were distributed and whether any of them behaved erratically [52, 53]. Prior to applying the RRSM, the unidimensionality of the scale and the local independence of the items were examined [54], and both assumptions were considered to hold. In terms of item fit, the infit and outfit MnSQ values of all items ranged between 0.6 and 1.4 except for Q5. The infit and outfit MnSQ values of Q5 exceeded the acceptable limits (1.51 and 1.76, respectively), so this item might measure something other than or in addition to learning interest [55]. Several potential reasons can contribute to this misfit. First, poorly constructed items, such as those with complicated language or double-barreled questions, can trigger inconsistent or unexpected responses and subsequently lead to Rasch model misfit [56]. Second, guessing behavior can result if the item is too easy or too difficult for the target population [57]. As shown in Fig. 4, Q5 was the most difficult item and was positioned at the top of the Wright map. It is possible that respondents answered it at random, leading to inconsistent responses and poor fit with the Rasch model. Although Q5 did not meet the infit and outfit MnSQ criteria, its point-measure correlation was 0.36, indicating that Q5 was aligned with the other items in assessing students’ learning interest. Accordingly, the 9 items were considered to fit the Rasch model adequately.
DIF analysis provides insight into how items perform in different groups and safeguards the validity of a scale [58]. An item is considered to exhibit DIF when a respondent characteristic other than the trait being measured influences the response. In the present study, DIF analysis was performed across gender [58]. The DIF contrasts for all the items were less than 0.5 logits, with all p values greater than 0.05, indicating that the items functioned equivalently for male and female students. This result may be related to China’s social culture. Nursing has traditionally been dominated by women, but as society has become more diverse and inclusive, males are less influenced by gender expectations when selecting the nursing specialties in which they are interested [4]. Another possible reason is self-actualization. Today’s nurses perform a wide range of roles, including clinical nurses, nurse anesthesiologists, nursing educators, and nurse researchers [59]. As a result of this diversity, the nursing field creates a wider range of employment opportunities and attracts both males and females with diverse interests and ambitions.

Limitations

There are several limitations to this study. First, only some nursing students from three provinces in eastern, central, and western China and one special administrative region participated in this study. This sample may not represent all Chinese or international nursing students. Consequently, the results should be generalized to broader populations with caution, and a more representative sample could be considered in further studies. Second, the data were collected through self-reports on an online platform. Although each participant was allowed to answer the questionnaire only once, the response process itself was not supervised. In the future, other more effective methods may be considered. Third, although CTT and IRT were used to validate the Chinese version of the SIQ-SF, a criterion validity test was not performed because of the lack of a gold standard for evaluating nursing students’ interest in learning. Finally, the study did not establish a definitive cut-off point for the scale. The absence of a validated cut-off point limits the interpretability and practical application of the findings, as it may affect the categorization of study interest levels among participants. Future research should focus on determining an appropriate cut-off point for the scale.

Conclusions

The Chinese version of the SIQ-SF demonstrated acceptable reliability and validity among Chinese nursing students and could be used to assess nursing students’ interest in learning in China. With the aid of this scale, teachers can gain a better understanding of nursing students’ learning interests, thereby maximizing their learning effects through appropriate content and methods. Future research should focus on determining an appropriate cut-off point for the Chinese version of the SIQ-SF and applying it to a larger population of college students to verify its psychological characteristics, thereby broadening its scope of application.

Acknowledgements

Not applicable.

Declarations

This research was approved by the Research Management and Development Department of Kiang Wu Nursing College of Macau (No. REC-2021.801) and conducted according to the Declaration of Helsinki. The Research Management and Development Department is in charge of ethical approval of scientific research projects and of issuing the ethical approval documents. Participation was completely voluntary, anonymous, and unrewarded. Informed consent was obtained from all participants, who read and agreed to the informed consent form before starting to fill in the questionnaire. To ensure voluntariness, participants could withdraw at any time without losing benefits. Anonymity was guaranteed by not collecting any personal identification information (such as names). Confidentiality was ensured by storing data on a computer protected by a password known only to the researchers.
Not applicable.

Competing interests

The authors declare no competing interests.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

2.
7. Renninger KA, Hidi SE. Interest Development and Learning. In: Renninger KA, Hidi SE, editors. The Cambridge Handbook of Motivation and Learning. Cambridge Handbooks in psychology. Cambridge, UK: Cambridge University Press; 2019. pp. 265–90.
17. Chang F. The Development and Validation of STEM Career Interest Scale for High School Students. Taiwan: National Taiwan Normal University; 2019.
21. Breakwell GM, Barnett J, Wright DB. Research methods in psychology. London: SAGE Publications Ltd; 2020.
22. Schiefele U, Krapp A, Wild K-P, Winteler A. Der Fragebogen zum Studieninteresse (FSI). Diagnostica. 1993;39(4):335–51.
26. Dünne AA, Zapf S, Hamer HM, Folz BJ, Käuser G, Fischer MR. Teaching and assessment in otolaryngology and neurology: does the timing of clinical courses matter? Eur Archives oto-rhino-laryngology: Official J Eur Federation Oto-Rhino-Laryngological Soc (EUFOS) : Affiliated German Soc Oto-Rhino-Laryngology - Head Neck Surg. 2006;263(11):1023–30. https://doi.org/10.1007/s00405-006-0114-y.
27. Jones PS, Lee JW, Phillips LR, Zhang XE, Jaceldo KB. An adaptation of Brislin’s translation model for cross-cultural research. Nurs Res. 2001;50(5):300–4.
34. Bartlett MS. A note on the multiplying factors for various χ² approximations. J Royal Stat Soc Ser B (Methodological). 1954;16(2):296–8.
36. Loewenthal KM, Lewis CA. An introduction to psychological tests and scales. 3rd ed. Routledge; 2020.
38. Hambleton RK, Swaminathan H. Item response theory: principles and applications. Springer Science & Business Media; 2013.
39. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Routledge; 2013.
41. Wright BD. Reasonable mean-square fit values. Rasch Meas Trans. 1996;2:370.
42. Abdaziz A, Jusoh MS, Omar, Amlus H, Salleh T. Construct validity: a Rasch Measurement Model approaches. J Appl Sci Agric. 2014;9(12):7–12.
43. Bond TG, Fox CM. Applying the Rasch model: Fundamental measurement in the human sciences. 2nd ed. Psychology Press; 2007.
44. Chan M, Subramaniam R. Validation of a Science Concept Inventory by Rasch Analysis. In: Khine MS, editor. Rasch Measurement: applications in quantitative Educational Research. Singapore: Springer Singapore; 2020. pp. 159–78.
45. Wilson M. Constructing measures: An item response modeling approach. 2nd ed. Routledge; 2023.
47. Collier J. Applied structural equation modeling using AMOS: Basic to advanced techniques. Routledge; 2020.

Metadata
Title
Psychometric evaluation of the study interest questionnaire-short form among Chinese nursing students based on classical test theory and item response theory
Authors
Yue Yi Li
Lai Kun Tong
Mio Leng Au
Wai I. Ng
Si Chen Wang
Yongbing Liu
Liqiang Zhong
Yi Shen
Xichenhui Qiu
Publication date
1 December 2024
Publisher
BioMed Central
Published in
BMC Nursing / Issue 1/2024
Electronic ISSN: 1472-6955
DOI
https://doi.org/10.1186/s12912-024-02390-1