Abstract
Purpose:
To linguistically and culturally adapt the 31-item Singaporean Diabetic Retinopathy Knowledge and Attitudes (DRKA) questionnaire for a Chinese population and assess its reliability and validity using classical and modern psychometric theory.
Methods:
A total of 230 patients with diabetic retinopathy (DR) were recruited, of whom 202 provided valid responses that were analyzed. Rasch analysis and classical test theory (CTT) methods were used to assess the psychometric properties of the Knowledge (n = 22 items) and Attitudes (n = 9 items) scales, including the functionality of the response categories, fit statistics, person and item reliability and separation, unidimensionality, targeting, differential item functioning (DIF), internal consistency, convergent validity, and known-group validity.
Results:
After revision, both the Knowledge and Attitudes scales were unidimensional and had good measurement precision (Person Separation Index = 2.18 and 1.72) and internal consistency (Cronbach's α = 0.83 and 0.82). While the items in the Knowledge scale aptly targeted participants’ ability level, targeting of the Attitudes scale was slightly suboptimal, with items too easy on average for participants’ ability level. There were no issues with DIF and item fit, and the scales showed good known-group validity (scores increased as education level increased) and convergent validity (high correlation with the DRKA Practice questionnaire).
Conclusions:
After a thorough language and cultural verification process, the Chinese version of the DRKA is culturally appropriate and has good psychometric performance.
Translational Relevance:
The DRKA questionnaire may be useful to assess patients’ DR-related knowledge and attitude level, as well as inform specific education interventions and optimize patients' ability to manage their condition.
The Singaporean DRKA was translated into Chinese using the modified Brislin translation model, which is outlined in more detail below.19
Literal translation: Two Chinese translators with master's degrees in ophthalmic care (translator 1 and translator 2), both proficient in Chinese and English, independently translated the source language (English) into the target language (Chinese), producing first-draft Chinese versions C1 and C2.
Comparison: The translation research team consisted of five clinical nursing researchers with different clinical backgrounds. The team compared Chinese versions C1 and C2 in repeated rounds of group discussion to evaluate whether the translation of each item achieved20 (1) conceptual equivalence (the same theoretical framework and item content), (2) semantic equivalence (the same meaning before and after translation), and (3) content equivalence (content appropriate for the culture of the people using the scale). After repeated comparison and correction, the team members were satisfied with the translation of each item, and Chinese version C3 was obtained.
Back translation: Chinese version C3 was back-translated into English (version E1) by a graduate student (back translator 3) who was familiar with the DR patient population and had worked in an ophthalmic ward or outpatient clinic but was not familiar with the contents of the scale. The research team then discussed English versions E0 and E1 with back translator 3 and revised them until consensus was reached, yielding English version E2.
Common deliberation: Under the guidance of Shandong Youqi Translation Service Co. Ltd., the research team, translator 1, translator 2, and back translator 3 translated English version E2 into Chinese version C4 and sent these versions to the original developers of the DRKA (authors EKF and ELL) for approval. The translation process is shown in Figure 1.
Cultural adjustment: Chinese version C4 was then reviewed by an expert panel (n = 6) in writing or via face-to-face interview. Selection criteria for the expert panel included21 (1) a bachelor's degree or above, (2) a title of associate senior or above, (3) extensive experience in specialized diagnosis and treatment, specialized nursing management, or English teaching, and (4) willingness to participate in this study. Based on Chinese people's familiarity with and understanding of common nouns, the panel recommended the following changes:
1. Item 9: “the blood sugar monitoring that you do throughout the day” was changed to “all-day dynamic blood glucose monitoring.”
2. Item 17: “too much computer use” was changed to “excessive use of electronic products (mobile phones, computers, etc.).”
3. Item 18: “natural remedies e.g. fenugreek, bitter gourd or neem” was changed to “natural remedies (such as season beans, bitter gourd and neem jujube and other ‘earthworks’).”
4. Item 6 of the Attitudes scale: the word “gene” was changed to “inheritance.”
After this process, Chinese version C5 was obtained.
Pre-survey of target population: Thirty patients with DR who had good reading comprehension and language expression skills were selected by convenience sampling to complete Chinese version C5 and answer questions about the items, including the following: (1) Do you have any questions about the content of the questionnaire? (2) What do you think this item means? (3) How do you think this item can be expressed more easily? (4) Do you have any suggestions or opinions on this scale? As a result of this process, item 20 was changed from “avoid receiving laser treatment or eye injection” to “laser treatment or eye injection,” resulting in Chinese version C6 of the DRKA. This final version was deemed to be fully adapted to the cultural environment of China and met the expectations of the original developers.
For Rasch analysis to produce reliable item calibration or person measurement within the (–0.5, 0.5) logit interval with 99% confidence, Linacre22 recommends a sample size of 150 (range, 108–243 for best to poor targeting). In this cross-sectional study, 230 patients with DR were recruited as the DRKA validation sample from the ophthalmic outpatient department and ophthalmic ward of a class Ⅲ, grade A hospital in Hohhot, Inner Mongolia Autonomous Region, between December 2021 and August 2022. Convenience sampling, a nonprobabilistic sampling technique in which researchers select samples based on who is available or easily accessible, was used to recruit participants. Inclusion criteria were the following: (1) met the international clinical grading criteria for diabetic retinopathy by the American Academy of Ophthalmology in 2019, including patients with diabetic macular edema23; (2) age ≥18 years; and (3) agreed to volunteer for research. Exclusion criteria were the following: (1) concomitant eye disease not caused by diabetes and (2) a cognitive disorder.
Severity of DR was graded by fundus photos, which were taken, read, diagnosed, and graded by trained doctors. The international standard logarithmic visual acuity chart was used to measure visual acuity, which was converted to the logarithm of the minimum angle of resolution to determine whether the patient had visual impairment. Most information was filled out by self-report, and clinical information was supplemented by consulting medical records if the patient could not provide it. If patients had difficulty in filling out the study questionnaires due to vision problems or for other reasons, data were collected by trained research assistants.
The study followed the principles of the Declaration of Helsinki. The purpose and significance of the study were explained to eligible patients before data collection, and all participants signed an informed consent form.
The data from the Chinese DRKA were statistically analyzed with SPSS version 26.0 (SPSS, Inc., Chicago, IL, USA) and Winsteps version 3.66.0 (Winsteps, Chicago, IL, USA). Mean ± standard deviation (SD) was used to describe continuous data, while categorical data were expressed as percentages and frequency. Participants for whom complete data were not available were excluded from the data analysis.
Winsteps was used for the Rasch analysis. The basic dichotomous Rasch model was used for the Knowledge scale. Because the items of the Attitudes scale are polytomous, the partial credit model (PCM) was used for that scale. Rasch parameters analyzed included functionality of the response categories, fit statistics, person and item reliability, person and item separation, unidimensionality, targeting, and differential item functioning (DIF).
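For reference, the two models named here can be written in their standard textbook forms. The notation is an assumption introduced for illustration and is not taken from the study: \theta_n is the ability of person n, b_i the difficulty of item i, \delta_{ik} the k-th step difficulty of item i, and m_i the highest score category of item i.

```latex
% Dichotomous Rasch model (Knowledge scale: each item scored 0 or 1)
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}

% Partial credit model (Attitudes scale: item i scored 0, 1, ..., m_i)
P(X_{ni} = x \mid \theta_n) =
  \frac{\exp\left( \sum_{k=0}^{x} (\theta_n - \delta_{ik}) \right)}
       {\sum_{h=0}^{m_i} \exp\left( \sum_{k=0}^{h} (\theta_n - \delta_{ik}) \right)},
  \qquad x = 0, 1, \dots, m_i,
  \qquad \text{with } \sum_{k=0}^{0} (\theta_n - \delta_{ik}) \equiv 0 .
```

With m_i = 1 the partial credit model reduces to the dichotomous model, which is why the two scales can be analyzed within the same Rasch framework.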
Functionality of the response categories: To determine whether the response options of the items functioned as intended, they were checked against the following criteria24,25: (1) There should be at least 10 observations for each response category. (2) The values of the response options should increase sequentially. (3) The difficulties of adjacent categories should differ by at least 1.1 logits on a 4-point scale and 1.4 logits on a 3-point scale. (4) The outfit mean square (MNSQ) should be less than 2.0. If these conditions are not met, collapsing of categories may be warranted.
Fit statistics: Infit MNSQ and outfit MNSQ were used to verify the compatibility of the observed data with the predictions of the Rasch model. Infit is sensitive to responses to items whose difficulty is close to the person's ability level, whereas outfit is more sensitive to extreme or unexpected responses. The infit and outfit MNSQ values for each item should fall within [0.5, 1.5]; a value of 1 indicates perfect fit, and standardized fit statistics should ideally lie between –2 and 2. MNSQ <0.5 indicates possible item redundancy, and MNSQ >1.5 indicates that the data underfit the model (“noise”).26 Similarly, the overall infit MNSQ and outfit MNSQ values of the scale were used to verify that the DRKA scale overall conformed to the Rasch model.27
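As an illustration of how the two mean-square statistics differ, the following minimal sketch computes infit and outfit MNSQ for a single dichotomous item using the standard textbook definitions. It is not the Winsteps implementation, and it assumes that person measures and the item measure (in logits) have already been estimated; all function and parameter names are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a score of 1 under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_fit(responses, abilities, difficulty):
    """Return (infit_mnsq, outfit_mnsq) for one dichotomous item.

    responses  -- observed 0/1 scores, one per person
    abilities  -- person measures in logits, in the same order
    difficulty -- item measure in logits
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, difficulty)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of squared standardized residuals,
    # so responses far from expectation dominate.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square, dominated by persons
    # whose ability is close to the item difficulty.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


def fit_flag(mnsq, low=0.5, high=1.5):
    """Classify an MNSQ value against the [0.5, 1.5] range used in the text."""
    if mnsq < low:
        return "overfit (possible redundancy)"
    if mnsq > high:
        return "underfit (noise)"
    return "acceptable"
```

Calling `item_fit` once per item and passing each result to `fit_flag` reproduces the item-level screening described above; the standardized (t-like) fit statistics are not computed in this sketch.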
Reliability: Reliability refers to the consistency of the results obtained when the subjects are measured repeatedly.28 Item reliability (IR) and person reliability (PR) were used to evaluate the reliability of measurement. Low person reliability indicates that the measurement tool is not sufficiently reliable to distinguish between subjects of different ability.29 Low item reliability indicates that the sample is not large enough to reflect the difficulty levels of the items. Values above 0.9 were considered excellent, values between 0.8 and 0.9 were considered good, and values between 0.7 and 0.8 were considered acceptable.
Separation: The Person Separation Index (PSI) and Item Separation Index (ISI) were used to evaluate the discrimination and the construct validity of the DRKA scales, respectively. A low PSI means that the scale is not sensitive enough to distinguish between high- and low-ability subjects, indicating the need for more items. The ISI is used to verify the item difficulty hierarchy, and a low ISI means that the sample was not large enough to confirm the item difficulty hierarchy of the scale.27 Generally, ISI should be greater than 3.00 and PSI should be greater than 2.00, indicating that the scale has good discrimination.30
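Separation and reliability are two expressions of the same information. The sketch below uses the conventional Rasch relationships (separation G = sqrt(R / (1 − R)) and number of statistically distinct strata = (4G + 1) / 3); these formulas come from the general Rasch literature rather than from this study, and the function names are hypothetical.

```python
import math


def separation_from_reliability(reliability):
    """Convert a Rasch reliability coefficient (0 <= R < 1) to a separation index."""
    return math.sqrt(reliability / (1.0 - reliability))


def reliability_from_separation(separation):
    """Convert a separation index G to the equivalent reliability coefficient."""
    return separation ** 2 / (1.0 + separation ** 2)


def strata(separation):
    """Number of statistically distinct measure levels implied by separation G."""
    return (4.0 * separation + 1.0) / 3.0


if __name__ == "__main__":
    # PSI values taken from the abstract (Knowledge, Attitudes).
    for name, psi in (("Knowledge", 2.18), ("Attitudes", 1.72)):
        print(name, round(reliability_from_separation(psi), 2), round(strata(psi), 2))
```

By this arithmetic, which is ours and not a result reported by the study, a PSI of 2.18 corresponds to roughly 3.2 strata and a PSI of 1.72 to roughly 2.6 strata.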
Unidimensionality: Unidimensionality means that all items in the scale jointly reflect a single underlying construct, and local independence of the items is a prerequisite for testing unidimensionality.31 Ideally, the primary dimension should explain more than 50% of the variance in the data,32 and the eigenvalue of the first contrast of the standardized residuals should be <2.1 to meet the requirement of unidimensionality.33
Targeting: Targeting of item difficulty to person ability is assessed by calculating the difference between average person ability and average item difficulty (a difference of >1 logit indicates notable mistargeting).34 It is also assessed by inspecting the person–item map to examine the distribution of the items and participants. Large gaps in item coverage also indicate poor targeting.34
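The numerical part of the targeting check can be sketched as follows. This is a minimal illustration that assumes person and item measures (in logits) are already available as plain lists; the function names are hypothetical, and using the absolute difference is our assumption. The visual inspection of the person–item map is not covered.

```python
from statistics import mean


def targeting_gap(person_measures, item_measures):
    """Mean person ability minus mean item difficulty, in logits."""
    return mean(person_measures) - mean(item_measures)


def is_mistargeted(person_measures, item_measures, threshold=1.0):
    """Flag notable mistargeting when the absolute gap exceeds the threshold."""
    return abs(targeting_gap(person_measures, item_measures)) > threshold
```

A positive gap means the items are, on average, too easy for the sample, which is the pattern the abstract reports for the Attitudes scale.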
Differential item functioning: DIF indicates whether item bias is present for certain participant characteristics.35,36 An item shows DIF under the Rasch model if responses to it cannot be fully explained by the ability of the subject and the difficulty of the item.35 There are two widely recognized DIF test criteria28: (1) a DIF contrast greater than 0.5 logits and (2) an absolute value of t greater than 2. Items meeting both criteria were considered to exhibit DIF. In this study, DIF was tested for the revised Knowledge and Attitudes scales according to gender (male, female), age (≥50 years, <50 years), and education level (high school and below, junior college and above).
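The two-part DIF rule can be expressed as a single predicate. The sketch below assumes the DIF contrast and t value for an item have already been obtained from the Rasch software; taking the absolute value of the contrast is our assumption (the text states the size criterion without a sign), and the function name is hypothetical.

```python
def has_dif(dif_contrast, t_value, contrast_cutoff=0.5, t_cutoff=2.0):
    """Return True only when both DIF criteria are met.

    dif_contrast -- difference in item difficulty between two groups, in logits
    t_value      -- t statistic for that difference
    """
    large_contrast = abs(dif_contrast) > contrast_cutoff  # assumption: sign ignored
    significant = abs(t_value) > t_cutoff
    return large_contrast and significant
```

Requiring both conditions avoids flagging items whose contrast is large but imprecisely estimated, or statistically significant but too small to matter in practice.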
SPSS was used to assess reliability, convergent validity, and known-group validity.
Internal consistency: Cronbach's α coefficient was used to determine the reliability of the scale. The larger the value of Cronbach's α, the better the homogeneity, and Cronbach's α ≥0.7 was taken as the minimum acceptable value.37
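For readers who want to see the computation behind the coefficient, the following sketch implements the standard formula α = k / (k − 1) × (1 − Σ item variances / variance of total scores). It is not the SPSS routine used in the study; it assumes complete data arranged as one row per respondent and one column per item, and the function names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the summed (total) scores.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_variances) / total_variance)


def is_acceptable(alpha, minimum=0.7):
    """Apply the minimum acceptable value stated in the text."""
    return alpha >= minimum
```

The function expects rows of equal length with no missing values, which matches the complete-case approach described in the statistical analysis.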
Convergent validity: Construct validity measurements comprise convergent, divergent, criterion (known group), and predictive validity.38 This study assessed the convergent validity of the Knowledge and Attitudes scales of the Chinese DRKA (i.e., that it measures what it purports to measure) by assessing its correlation (Pearson's coefficient) with the Knowledge and Attitudes scores of the KAPDR. A moderate correlation (r > 0.3) was deemed to indicate good convergent validity.
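The correlation check can be sketched with a plain implementation of Pearson's coefficient. This is an illustration only: the study used SPSS, the score lists are assumed to be paired by participant, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance_sum = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariance_sum / math.sqrt(ss_x * ss_y)


def shows_convergent_validity(r, cutoff=0.3):
    """Apply the r > 0.3 criterion stated in the text."""
    return r > cutoff
```

In use, `x` would hold the DRKA scale scores and `y` the corresponding KAPDR scores for the same participants.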
Known-group validity: Known-group validity refers to whether the scale can distinguish between the different groups at which the content of the scale is aimed.39 In this study, the associations between DRKA Knowledge and Attitudes scores and age, gender, education level, professional level, DR severity, and vision impairment severity were assessed using the t-test or analysis of variance.
Using a rigorous process of cultural and linguistic translation, our study found that the Chinese version of the DRKA has very good psychometric properties. Both the Knowledge and Attitudes scales were unidimensional with well-functioning response categories and had no item misfit or DIF. While the Knowledge scale had good person and item reliability, the Attitudes scale had somewhat low person reliability, suggesting it cannot distinguish multiple different “ability” levels. Therefore, when using the Attitudes scale, the results should be interpreted with caution. The Chinese version of the DRKA also showed good known-group validity with scores increasing as education level increased and good convergent validity, suggesting that it is measuring what it purports to measure. With its domains and items based on the KAP framework, the Chinese DRKA questionnaire can be used to evaluate patients' level of DR-related knowledge and attitudes and, in turn, develop and evaluate interventions and health education programs to improve awareness and understanding of DR and target misconceptions and erroneous beliefs about DR management for at-risk individuals. Researchers, clinicians, and policymakers can also use the Chinese DRKA to better understand the association between improving knowledge and attitudes, as well as changes in patient practices, such as attending scheduled DR screening and diabetes self-management.
The design of the DRKA scale was based on the KAP framework, which emphasizes that behavior is affected by knowledge and attitude.40 For example, doctors and researchers can use the scale to quantify the level of knowledge and attitudes of patients with DR and use the results to develop and evaluate personalized interventions that improve patients' ability to prevent and control their disease, potentially slowing disease progression. Promoting behavior change by improving the knowledge and attitude of patients may help improve the prognosis of patients as well as their overall quality of life, psychological burden, and disease-related costs, although this is yet to be tested using the Singaporean or Chinese DRKA questionnaire and is an area for future research.
During the revision of the scale, we found that in item 7 of the attitude scale, “The outcome of my diabetic eye disease is determined by fate,” fewer than 10 people chose the “strongly disagree” option, and the difficulty level of this item was too low. Moreover, with the gradual improvement of the Chinese people’s medical and health literacy, as well as the acceptance of scientific theoretical guidance, the view that destiny guides life is increasingly less accepted. In view of this evidence, consensus was reached by the research group to delete item 7, after which targeting and other fit statistics improved. However, the PSI of the revised Attitudes scale remained lower than desired, indicating that the scale was unable to discriminate between more than two levels of “attitudes,” which may be due to a lack of variance in DR-related attitudes in our sample.
Our study found that the average score on the Knowledge scale of patients with DR was low, at 11.43 out of 22, while the average score on the Attitudes scale was higher, at 24.20 out of 32, based on raw scores generated by simple summation of item scores using classical test theory methods. The relatively high Attitudes scores may reflect selection bias, as the recruiting center mainly served patients with good local economic and medical conditions, who may have had more positive attitudes than patients in poorer socioeconomic areas. The low level of knowledge of DR is of concern. Importantly, as education level increased, the Knowledge and Attitudes levels of patients with DR also increased, which is similar to the results obtained using the KAPDR questionnaire (Supplementary Material S3), in the Singaporean DRKA sample,17 and by Qi and colleagues,41 who reported similar results for a population in northeast China using a validated KAP questionnaire. This indicates that the knowledge and attitude level of patients can be improved through education, suggesting the importance of health education for patients by medical workers. In addition, we found that the knowledge and attitude levels of patients with DR over 50 years old were lower than those of patients under 50 years old, which may be related to poorer memory with increasing age and to the continuous improvement of education in China, which has resulted in a higher education level in younger people. The knowledge and attitude level of male patients was significantly higher than that of female patients, which may be due to the lower education levels of female patients resulting from the previously held belief in “male superiority and female inferiority” in China.42 The differences in the knowledge and attitude level of patients in terms of age and gender may be explained by differences in education level, which again suggests that education is the key to improving the knowledge and attitude of patients. Importantly, our known-group validity analysis was only univariate, and it is not possible to tease out which factors are causative versus associative in relation to the DRKA scores. Indeed, residual confounding that has not been taken into account in our analysis is possible. Future work in a large, well-characterized sample of patients across the spectrum of DR with diverse sociodemographic characteristics, using multivariable regression analyses, is needed to untangle these relationships.
The development of the Chinese version of the DRKA followed a rigorous scientific process. First, the sinicization process strictly followed the established standard for scale adaptation20 and used the modified Brislin translation model19 to translate the original text. Taking into account the Chinese cultural background, language conventions, and professional knowledge, the conceptual, semantic, and content equivalence of the scale was ensured through repeated rounds of group discussion and iterative revision. Second, the application of Rasch analysis to a sample of more than 200 responses from the population of interest strengthens the robustness of the validation results. Rasch analysis overcomes the problems of “measurement dependence” and “sample dependence” in traditional measurement methods to achieve objective measurement and complements validation methods based on classical test theory.43 However, this study has some limitations. First, the patient sample was limited to Hohhot, Inner Mongolia, so the data are not fully representative of all Chinese patients with DR; the Chinese DRKA scale should be validated in other parts of China in the future. Second, for patients with severe visual impairment who needed the assistance of researchers to complete the questionnaire, some proxy bias may be present.
Following a comprehensive cultural and linguistic adaptation of the DRKA scale, including expert discussions among research group members, item wording was adapted to suit the local population and one item in the Attitudes scale was deleted, resulting in a culturally appropriate scale. Although some minor psychometric issues were revealed during Rasch analysis, the Chinese DRKA is a valid and reliable measure of knowledge and attitudes of patients with DR in this population. It can be used to assess patients' level of DR-related knowledge and attitudes, as well as develop and evaluate interventions to improve awareness and understanding of DR and its management and ultimately influence changes in DR-related patient practices.
The authors thank researchers from the Singapore Eye Research Institute for their contribution to our manuscript.
Supported by the Inner Mongolia Higher Education Science and Technology Research Project (NJZZ22652). The funding organization had no intervention in conducting this research or preparing this manuscript.
Disclosure: W. Jiang, None; E.K. Fenwick, None; E.L. Lamoureux, None; Z. Zhang, None; Y. Feng, None; Y. Wang, None; X. Yang, None