Measurement properties of the OARSI core set of performance-based measures for hip osteoarthritis: a prospective cohort study on reliability, construct validity and responsiveness in 90 hip osteoarthritis patients
Jaap J Tolk, Rob P A Janssen, C (Sanna) A C Prinsen, M (Marieke) C van der Steen, Sita M A Bierma-Zeinstra & Max Reijman
Background and purpose — Improvement of physical function is one of the main treatment goals in severe hip osteoarthritis (OA) patients. The Osteoarthritis Research Society International (OARSI) has identified a core set of performance-based tests to assess the construct physical function: 30-s chair stand test (30-s CST), 4×10-meter fast-paced walk test (40 m FPWT), and a stair-climb test. Despite this recommendation, available evidence on the measurement properties is limited. We evaluated the reliability, validity, and responsiveness of these performance-based measures in patients with hip OA scheduled for total hip arthroplasty (THA).
Patients and methods — Baseline and 12-month follow-up measurements were prospectively obtained in 90 end-stage hip OA patients who underwent THA. As there is no gold standard for comparison, the hypothesis testing method was used for construct validity and responsiveness analysis. A test can be assumed valid if ≥75% of predefined hypotheses are confirmed. A subgroup (n = 30) underwent test–retest measurements for reliability analysis. The Oxford Hip Score, Hip injury and Osteoarthritis Outcome Score—Physical Function Short Form, pain during activity score, and muscle strength were used as comparator instruments.
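The 75% decision rule described above can be illustrated with a minimal sketch. The function name and the example outcomes are hypothetical and are not taken from the study; the only element drawn from the text is the threshold of 75% confirmed hypotheses.

```python
def hypothesis_confirmation(confirmed, threshold=0.75):
    """Return (fraction confirmed, whether the threshold is met).

    `confirmed` holds one boolean per predefined hypothesis:
    True if the observed result matched the predefined expectation.
    """
    if not confirmed:
        raise ValueError("at least one predefined hypothesis is required")
    fraction = sum(confirmed) / len(confirmed)
    return fraction, fraction >= threshold


# Hypothetical outcomes for one test (illustrative only, not study data).
example_outcomes = [True, True, False, True]
fraction, supported = hypothesis_confirmation(example_outcomes)
print(f"{fraction:.0%} confirmed, threshold met: {supported}")
```

Under this rule, a test with 3 of 4 hypotheses confirmed just meets the threshold, while 2 of 4 would not.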
Results — Test–retest reliability was appropriate; intraclass correlation coefficient values exceeded 0.70 for all 3 tests. None of the performance-based measures reached 75% hypothesis confirmation for the construct validity or responsiveness analysis.
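For readers unfamiliar with the intraclass correlation coefficient, the sketch below shows one common way to compute it from test–retest data. The abstract does not state which ICC form was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, and the scores are made up for illustration.

```python
def icc_2_1(scores):
    """ICC(2,1) from a list of rows, one row per subject, one column per session."""
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of sessions (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up test and retest scores for three subjects (not study data).
example_scores = [[10, 11], [12, 13], [14, 15]]
icc = icc_2_1(example_scores)
print(f"ICC(2,1) = {icc:.3f}, exceeds 0.70: {icc > 0.70}")
```

Because this form measures absolute agreement, a systematic shift between sessions lowers the coefficient even when subjects keep the same rank order.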
Interpretation — The performance-based tests have good reliability in the assessment of physical function. Construct validity and responsiveness, using patient-reported measures and muscle strength as comparator instruments, could not be confirmed. Therefore, our findings do not justify their use for clinical practice.