Development of the questionnaire
The design and content of the Rheumatoid Arthritis Foot Disease Activity Index–5 (RADAI-F5) was derived from the mRADAI-5, a 5-item patient-reported outcome measure for the self-report of global disease activity, developed and evaluated by Leeb et al (16) and Rintelen et al (24). The mRADAI-5 is completed in a numerical rating scale format from 0 to 10 and scored as an average summary score ranging from 0 to 10. The RADAI-F5 was developed by editing the mRADAI-5 with an opening statement: “THINKING ONLY OF YOUR FEET,” and editing the original questions to subsequently read as follows: “How active was your arthritis IN YOUR FEET over the last 6 months?” (0 = completely inactive to 10 = extremely active); “How active is your FOOT arthritis today with respect to joint tenderness and swelling?” (0 = completely inactive to 10 = extremely active); “How severe is your arthritis pain IN YOUR FEET today?” (0 = no pain to 10 = unbearable pain); “How would you describe your general FOOT health today?” (0 = very good to 10 = very bad); “Did you experience foot joint stiffness on awakening yesterday morning? If yes, how long was this stiffness IN YOUR FEET?” (0 = no stiffness to 10 = stiffness the whole day). Like the mRADAI-5, the RADAI-F5 is scored as an average summary score ranging from 0 to 10.
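The scoring rule described above can be illustrated with a minimal sketch. The function name and the example responses are hypothetical, and the handling of incomplete questionnaires (here, rejection) is an assumption, because the text does not state how missing items were treated.

```python
from statistics import mean


def radai_f5_score(items):
    """Return the summary score: the mean of the 5 item ratings (each 0-10)."""
    if len(items) != 5:
        # Assumption: incomplete questionnaires are rejected rather than imputed.
        raise ValueError("the RADAI-F5 has exactly 5 items")
    if any(not 0 <= rating <= 10 for rating in items):
        raise ValueError("each item is rated from 0 to 10")
    return mean(items)


# Hypothetical responses, in questionnaire order: 6-month activity,
# tenderness/swelling today, pain today, general foot health, morning stiffness.
example_score = radai_f5_score([4, 5, 6, 3, 2])
```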
Study setting and participants
The 2 data sources for this study were 1) a primary RADAI-F5 validation study, conducted at rheumatology outpatient clinics at Glasgow Royal Infirmary, Gartnavel General Hospital, and Stobhill Hospital within the Greater Glasgow and Clyde National Health Service Board, and 2) a larger randomized controlled trial, the details of which have been published previously (25). Briefly, the trial was a multicenter, parallel-group, randomized controlled trial with 6- and 12-month follow-up periods, with participants randomly allocated to either customized or prefabricated foot orthoses. Trial participants were recruited from rheumatology outpatient clinics within the NHS Grampian, Fife, Lanarkshire, and Lothian Health Boards, Dorset Healthcare University Trust, and Homerton University Hospital Trust.
Participants were included if they were ages 18–75 years, with a definitive clinical diagnosis of RA. Patients were excluded if they were unable to read, write, and/or understand the English language, or if they were diagnosed with other major medical conditions that could have diminished their ability to distinguish between RA-related foot problems and problems due to alternative disease mechanisms. Ethical approval was obtained from the West of Scotland Research Ethics Committee 5 (13/WS/0106) and the East of England Essex Research Ethics Committee (15/EE/0410). Participants were recruited consecutively, and written consent was obtained from all participants.
Data collection and measures
Demographic and clinical information was collected at baseline, including age, sex, and disease duration. The newly developed RADAI-F5 was collected at baseline, 1 week from baseline, and 6 months from baseline. All other measurements were recorded at baseline and 6 months. The DAS28-ESR scores were recorded by rheumatologists as part of routine care and made available to researchers. The mRADAI-5 was collected as an additional self-reported measure of global disease activity (16). Foot-related impairments and disability were evaluated using the FFI (18) and the FIS (17). The FFI is a widely used and extensively validated 23-item patient-reported outcome measure, completed using a 100-mm visual analog scale format, providing a mean summary score from 0 to 100 (higher scores indicating worse disability) (18). The FIS is an extensively validated RA-specific 51-item measure with domains for impairment/footwear (21 items) and activity limitation/participation restriction (30 items). It is completed using a yes/no dichotomous format, and domain scores are calculated by summing “yes” responses (higher scores indicating worse disability) (17).
To evaluate the content validity and practical burden of the RADAI-F5, 3 additional items were evaluated: a 5-point Likert scale regarding questionnaire relevance to participants (ranging from extremely irrelevant to extremely relevant), a 5-point Likert scale regarding participants’ opinion on the readability/understanding of the new questionnaire (ranging from very difficult to very easy), and the time taken to complete the questionnaire (in minutes).
Data were analyzed using SPSS 25 and Excel 2016. Descriptive statistics for age (median [IQR]) in years, sex (female:male ratio), and disease duration (median [IQR]) in months were generated for all participants at baseline. The RADAI-F5 was examined using factor analysis by principal component analysis to reveal the structure and item loading. The Kaiser-Meyer-Olkin test and Bartlett’s test of sphericity were undertaken to determine data suitability for factor analysis. The number of factors extracted was decided by a combination of Kaiser’s rule (eigenvalues >1), examination of the scree plot, and interpretation of items’ contribution to the factor. To test internal consistency, we evaluated the inter-item correlation matrix and calculated Cronbach’s alpha, a measure of consistency between items in a scale. A Cronbach’s α = 0.7–0.9 was considered acceptable (26, 27).
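As an illustration of the internal consistency statistic, the following sketch computes Cronbach's alpha from item-level responses using the standard variance-based formula. It is not the SPSS procedure used in the study; the function name and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from 6 participants to the 5 items (0-10 each).
hypothetical_rows = [
    [2, 3, 2, 3, 2],
    [5, 5, 6, 4, 5],
    [7, 8, 7, 8, 6],
    [1, 2, 1, 2, 1],
    [9, 8, 9, 9, 10],
    [4, 4, 5, 3, 4],
]
alpha = cronbach_alpha(hypothetical_rows)
acceptable = 0.7 <= alpha <= 0.9  # acceptability range stated in the text
```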
Hypotheses were generated a priori to examine the extent to which baseline scores (construct validity) and 0–6-month change scores (longitudinal validity) on the RADAI-F5 were associated with baseline and 0–6-month change scores from other measures in a manner that was theoretically consistent (28). Hypotheses for construct validity, which focused on baseline scores, were specified as follows: moderate positive correlations between the RADAI-F5 score and mRADAI-5, FFI, and FIS domains; and a weak positive correlation between the RADAI-F5 score and the DAS28 score. Hypotheses for longitudinal validity, which focused on 0–6-month change scores, were identical except for the FIS subscales, where a weak positive correlation was anticipated, because the FIS is less responsive to change (29). Spearman’s rank (rs) correlation and 95% confidence intervals (95% CIs) were used to test these hypotheses, and coefficients were interpreted as follows: 0–0.1 = negligible, 0.1–0.39 = weak, 0.4–0.69 = moderate, 0.7–0.89 = strong, 0.9–1.0 = very strong (30).
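The correlation analysis can be sketched as follows, with Spearman's rs computed as the Pearson correlation of tie-averaged ranks and then mapped onto the interpretation bands above. The function names are hypothetical, the confidence interval is omitted, and the treatment of values falling exactly on or between band edges (for example 0.1 or 0.395) is an assumption, because the published bands do not specify it.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def interpret_correlation(rs):
    """Map |rs| onto the bands in the text; edge handling is an assumption."""
    r = abs(rs)
    if r < 0.1:
        return "negligible"
    if r < 0.4:
        return "weak"
    if r < 0.7:
        return "moderate"
    if r < 0.9:
        return "strong"
    return "very strong"
```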
The 1-week (test–retest) reliability was examined using a 2-way mixed intraclass correlation coefficient (ICC) with corresponding 95% CIs for baseline and 1-week scores. Once preliminary foot disease categories were established (see below), Cohen’s quadratic weighted kappa and the corresponding 95% CI for foot disease categories (remission, low, moderate, high) were calculated, with values >0.61 indicating substantial reliability (31).
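For the categorical agreement statistic, a minimal sketch of Cohen's quadratic weighted kappa is given below. It assumes the 4 ordered categories are coded 0 to 3 and that more than one category is observed; the function name is hypothetical and the confidence interval is not computed.

```python
def quadratic_weighted_kappa(first, second, n_categories=4):
    """Quadratic weighted kappa for two sets of ordinal category codes.

    first, second: category codes (0 .. n_categories - 1) for the same
    participants, e.g. 0 = remission, 1 = low, 2 = moderate, 3 = high.
    """
    n = len(first)
    k = n_categories
    observed = [[0.0] * k for _ in range(k)]
    for i, j in zip(first, second):
        observed[i][j] += 1 / n
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j]
    return 1 - weighted_observed / weighted_expected
```

Under the threshold cited in the text, a returned value above 0.61 would be read as substantial reliability.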
Absolute measurement error was evaluated using the standard error of measurement (SEm), derived by dividing the SD of the change between the 2 measurements by √2 (SDchange/√2); the 95% limits of agreement, derived by calculating the mean change between the 2 measurements ±1.96 × the SD of the changes ([meanchange] ± 1.96 × [SDchange]); the 95% smallest detectable change (1.96 × √2 × SEm); and construction and examination of Bland-Altman plots (32–34).
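The three measurement error quantities follow directly from the paired baseline and 1-week scores, as the sketch below shows. The function name and the returned dictionary keys are hypothetical; the formulas are those given in the text.

```python
from math import sqrt
from statistics import mean, stdev


def measurement_error(test_scores, retest_scores):
    """SEm, 95% limits of agreement, and 95% smallest detectable change."""
    changes = [b - a for a, b in zip(test_scores, retest_scores)]
    mean_change = mean(changes)
    sd_change = stdev(changes)           # sample SD of the paired differences
    sem = sd_change / sqrt(2)            # SEm = SDchange / sqrt(2)
    limits = (mean_change - 1.96 * sd_change,
              mean_change + 1.96 * sd_change)
    sdc95 = 1.96 * sqrt(2) * sem         # algebraically equal to 1.96 * SDchange
    return {"sem": sem, "limits_of_agreement": limits, "sdc95": sdc95}
```

Because the SEm is defined here as SDchange/√2, the 95% smallest detectable change reduces to 1.96 × SDchange, which is the half-width of the limits of agreement.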
Responsiveness was evaluated using 4 different effect size statistics: Wilcoxon’s signed ranks test, Cohen’s d, the standardized response mean, and Guyatt’s Index (35). In the absence of an anchor question to calculate the minimal important difference (MID), the MID was calculated using a value of 0.5 × SDchange scores between baseline and 6 months. Guyatt’s Index, representing the magnitude and variability in change scores relative to its MID, was calculated as MID/√(2 × SDchange) (36). Effect sizes were interpreted as follows: <0.15 = negligible, ≥0.15 to <0.40 = small, ≥0.40 to <0.75 = medium, ≥0.75 to <1.10 = large, ≥1.10 to <1.45 = very large, and ≥1.45 = huge (35).
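Part of the responsiveness analysis can be sketched as below. The sketch covers only Cohen's d, the standardized response mean, and the distribution-based MID. It assumes that Cohen's d uses the baseline SD as its denominator, which the text does not specify, and all function names are hypothetical.

```python
from statistics import mean, stdev


def responsiveness(baseline, followup):
    """Effect size statistics for paired baseline and 6-month scores."""
    changes = [f - b for b, f in zip(baseline, followup)]
    mean_change = mean(changes)
    sd_change = stdev(changes)
    return {
        # Assumption: mean change standardized by the baseline SD.
        "cohens_d": mean_change / stdev(baseline),
        # Standardized response mean: mean change / SD of the change scores.
        "srm": mean_change / sd_change,
        # Distribution-based MID as described in the text: 0.5 x SDchange.
        "mid": 0.5 * sd_change,
    }


def interpret_effect_size(value):
    """Map the absolute effect size onto the bands cited in the text (35)."""
    size = abs(value)
    if size < 0.15:
        return "negligible"
    if size < 0.40:
        return "small"
    if size < 0.75:
        return "medium"
    if size < 1.10:
        return "large"
    if size < 1.45:
        return "very large"
    return "huge"
```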
Participants were classified according to mRADAI-5 thresholds for remission, mild, moderate, or high disease activity. With participants assigned to the mRADAI-5 reference categories, the third quartile of corresponding RADAI-F5 scores was calculated to establish the thresholds for respective RADAI-F5 categories (24). Cohen’s quadratic weighted kappa and 95% CI were used to evaluate agreement in disease activity categories between the mRADAI-5 and the RADAI-F5.
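The threshold derivation can be sketched as a grouped third-quartile calculation. The sketch assumes each participant has already been assigned an mRADAI-5 reference category and that every category contains at least 2 participants. The function name is hypothetical, and the quartile interpolation method is an assumption that may differ from the one used by the study's statistical software.

```python
from statistics import quantiles


def radai_f5_thresholds(reference_categories, radai_f5_scores):
    """Third quartile of RADAI-F5 scores within each mRADAI-5 category."""
    grouped = {}
    for category, score in zip(reference_categories, radai_f5_scores):
        grouped.setdefault(category, []).append(score)
    return {
        # quantiles(..., n=4) returns [Q1, Q2, Q3]; index 2 is the third quartile.
        category: quantiles(scores, n=4, method="inclusive")[2]
        for category, scores in grouped.items()
    }
```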
Median (IQR) values were obtained for readability and relevance Likert scores and completion time. For evaluation of floor and ceiling effects for the RADAI-F5 in the RA population, we adopted the conventional 15% threshold for patients achieving the highest and lowest scores to define ceiling and floor effect, respectively (32, 37). To evaluate structural validity via factor analysis, a minimum sample size of n ≥100 at baseline was targeted a priori to achieve a participant-to-item ratio of 20 (38). For hypothesis testing for construct and longitudinal validity, between 61 and 123 participants were required to detect at least weak correlation coefficients from 0.25 to 0.35 at 80% power and a 0.05 significance level (G*Power).
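The sample size range for the correlation hypotheses can be checked approximately with the Fisher z transformation, as sketched below. This is not the G*Power calculation itself: it assumes a 2-sided test, uses a large-sample approximation, and can therefore differ by a participant or so from the exact values reported. The function name is hypothetical.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect a correlation r (2-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    # Fisher z approximation: n = ((z_alpha + z_beta) / atanh(r))^2 + 3
    return ceil(((z_alpha + z_beta) / atanh(r)) ** 2 + 3)


n_upper_r = n_for_correlation(0.35)  # smaller n for the larger coefficient
n_lower_r = n_for_correlation(0.25)  # larger n for the smaller coefficient
```

Applied to coefficients of 0.35 and 0.25, the approximation lands close to the 61 and 123 participants stated in the text.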