Validity and reliability of the Finkelstein test

1 Physiotherapy Program, Department of Health Sciences, School of Sciences, European University Cyprus, Diogenes 6 Street, Engomi, P.O. Box 22006, 1516, Nicosia, Cyprus
2 Department of Health Sciences, School of Sciences, European University Cyprus, Diogenes 6 Street, Engomi, P.O. Box 22006, 1516, Nicosia, Cyprus
3 School of Medicine, European University Cyprus, Diogenes 6 Street, Engomi, P.O. Box 22006, 1516, Nicosia, Cyprus
4 Director of Cyprus Musculoskeletal and Sports Trauma Research Centre (CYMUSTREC), Diogenes 6 Street, Engomi, P.O. Box 22006, 1516, Nicosia, Cyprus


Introduction
De Quervain Tenosynovitis (DQT), as initially described by the Swiss surgeon Fritz de Quervain in 1895, is a common painful musculoskeletal disorder of the hand. It involves inflammation of the tendons of abductor pollicis longus (APL) and extensor pollicis brevis (EPB) as they travel through the first extensor compartment of the wrist. DQT may also be characterized as a tendinopathy, because it features non-inflammatory thickening of the tendons and the synovial sheaths and a degenerative process, as opposed to inflammation [1,2].
The most common clinical symptoms of DQT include pain at the styloid apophysis area that can also spread to the thumb and wrist, swelling, sensitivity, painful repetitive movements, weakness and pain (particularly in thumb extension and abduction), reduced fist strength, and a weak and sensitive pincer grip. Less common symptoms include burning or numbness. Typically, such patients find functional tasks problematic, for example brushing hair, holding a newspaper, lifting objects and rotational movements [3].
Clinical diagnosis can be aided by the Wrist Hyperflexion and Abduction of the Thumb (WHAT) test, which has previously been validated as a diagnostic screening tool for DQT [4]. Further examination aiding diagnosis requires at least 5 of the following 7 criteria: pain, sensitivity, positive Finkelstein's test, local swelling, thickness of extensor retinaculum, absence of other causes such as ganglion cysts and sensitivity in the distal radioulnar joint [5]. The clinician must be cognizant of other pathology such as osteoarthritis of the thumb, radial neuritis, trigger thumb etc. [5]. Further investigations such as diagnostic ultrasound (US) and Magnetic Resonance Imaging (MRI) should always be considered for a definitive diagnosis [6,7].
The Finkelstein test was first described by the American surgeon Harry Finkelstein (1865-1939). As described by several authors, it has been widely used in clinical practice and taught to undergraduates as well as health professionals to assist with the diagnosis of DQT [2,8-15].
Differing methods for the Finkelstein test have been reported throughout the literature. As reported by Leo (1958), Eichhoff in 1927 described the Finkelstein test as a technique involving tension, whereby the thumb is stabilized in the palm, gripped by the fingers of the same hand (like a fist), and the examiner passively causes ulnar deviation [16]. A positive test reproduces the same symptomatology experienced by the patient. This test was also reported by Finkelstein, who in one of his early reports attempted to understand the biomechanics of the test [17]. Additionally, Goubau et al. investigated the active WHAT test, confirming the validity, specificity and sensitivity of the aforementioned tests in patients with DQT [4]. Accordingly, the aim of this study was to evaluate the validity and reliability of the Finkelstein test.

Clinical tests
The three clinical tests (Finkelstein, Eichhoff and WHAT) were performed on two separate dates by two physiotherapists with ten and three years of experience. Each test was first performed by the physiotherapist with ten years of experience, with one minute between tests. The second round of tests was performed three minutes later by the physiotherapist with three years of experience, again with one minute between tests. For all tests, the participant was fully informed about the test procedure. Both arms were tested and a VAS pain score (0-10) was recorded immediately before and after each test (0 represents no pain whatsoever and 10 represents excruciating pain). The second session, identical to the first, was performed one week later to obtain the second round of measurements.

Finkelstein test
The participant was seated opposite the examiner with their forearm on the table. The forearm was positioned flat on the table with the hand hanging over the edge and the ulnar side facing down, as seen in Figures 1 and 2. At the initial stage, the examiner applied light ulnar deviation, assisting gravity. In the next step, if there was little or no pain, slight pressure causing ulnar deviation was applied to create tension in the first dorsal compartment. Finally, the examiner distracted the thumb distally as shown in Figures 1 and 2 [22]. A positive result is pain in the first dorsal compartment.

Materials and methods
This study was undertaken at the Cyprus Musculoskeletal and Sports Trauma Research Centre (CYMUSTREC), European University Cyprus (EUC), from February to March 2016. An announcement at EUC encouraged interested individuals to apply and participate. Applicants signed an informed consent form acknowledging that they had received a participant information brochure, that they understood the purpose of the study, that they could withdraw at any time without their status being affected at that time or in the future, and that while the information obtained from the study would be published, no identifiable information would be disclosed and it would remain confidential. Participants were screened based on exclusion criteria and were asked the following questions:
1. Do you currently have any signs or symptoms in the hands or upper limbs?
2. Have you had any acute injury in the last 6 months?
3. Are you using any pain-relieving or anti-inflammatory medications?
If the answer was 'No' to all three questions, a physiotherapist with 10 years of experience in musculoskeletal conditions examined them subjectively, and a physical examination followed.
Individuals were further excluded if they did not meet the inclusion criterion of having no apparent health condition that would prevent them from participating in this study.
Subjective assessment involved questions based on the framework of the International Classification of Functioning, Disability and Health in order to obtain an overview of the participants' lifestyle and habits. Information recorded included basic characteristics such as age, occupation, dexterity (use of hands), sporting abilities (professional vs hobby), in particular whether their work or sport involved repetitive hand movements, heavy work etc., usual daily activities, and whether the participant was right or left handed. Participants were excluded if there was any symptomatology related to DQT, judged by whether the three basic criteria were present [18]: 1. Pain with movement; 2. Edema; and 3. Sensitivity in the region of the first dorsal compartment.
If the participant was found to be healthy, with no known pathology in the hands, they proceeded to the clinical assessment.
The physical examination included assessment of both upper limbs. According to Magee [19], active and passive range of motion (AROM and PROM) for all joints evaluated must be free of pain, irritation, sensitivity or any crippling for the individual to be considered disease free. Muscle power was measured with the standard Medical Research Council scale (MRC scale) and individuals were required to achieve a 5/5 gross score [20,21]. Individuals deemed disease free with good muscle function and power continued to the next phase, 'Clinical Tests'.

Eichhoff test
The participant was seated opposite the examiner with their forearm on the table. At the initial stage, the examiner asked the volunteer to enclose the thumb within the rest of their fingers and then proceeded to apply a slight ulnar deviation in the direction of gravity. If there was little or no pain, slight overpressure causing ulnar deviation was applied to create tension in the first dorsal compartment as shown in Figures 3 and 4 [16]. A positive result is pain over the first dorsal compartment.

WHAT test
The participant was seated opposite the examiner with their forearm on the table. The participant was asked to place their wrist in a fully flexed position within the limits of any pain and to keep the thumb in extension and abduction while the examiner applied stepwise isometric resistance to the thumb, as shown in Figures 5 and 6. The test was completed when the participant could no longer maintain their position against the force applied by the examiner. A positive result is pain near the base of the thumb during the test.

Reliability study
The study was designed as an examiner variability study. The aim was to compare measurements obtained by two examiners (inter-rater reliability) for the results of the Finkelstein test, the Eichhoff test and the WHAT test for both left and right arms. A VAS score of 0-10 was recorded immediately after each test was performed.
The study also investigated test-retest (intra-rater) reliability, with each examiner repeating these tests one week later. This was likewise measured with a pain VAS score from 0-10 on both hands.

Validity study
To evaluate the validity of the Finkelstein test, this study compared it with the already validated WHAT test and the Eichhoff test for diagnosing DQT [3]. Secondary data comparing the Eichhoff test with the WHAT test findings was also considered with regard to the validity of the Finkelstein test. The specificity for evaluation of the clinical tests was assumed based on previous studies [3].
The hypothesis of this study was that healthy subjects should not have positive signs. We expected all subjects to be pain free on examination using the clinical special tests. The True Negative Rate was calculated as the proportion of negative tests (i.e. healthy subjects who were pain free on examination) among all tests in healthy subjects, the remainder being false positives (i.e. healthy subjects who felt pain on examination). The proportion of False Positives was expected to be low, as all subjects appeared healthy.
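The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the function name `true_negative_rate` and the counts in the usage lines are hypothetical and are not taken from the study's tables. It assumes, as in this study, that every participant is healthy, so each test result is either a true negative or a false positive.

```python
def true_negative_rate(true_negatives, false_positives):
    """Specificity in an all-healthy sample: TN / (TN + FP).

    Both arguments are counts of healthy participants (hypothetical helper,
    not part of any statistics package).
    """
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("at least one test result is required")
    return true_negatives / total


# Illustrative counts only (assumed, not the study's data):
# 40 pain-free and 5 painful results among 45 healthy participants.
tnr = true_negative_rate(40, 5)
fpr = 1 - tnr  # the false positive rate is the complement
print(f"true negative rate = {tnr:.2f}, false positive rate = {fpr:.2f}")
```

Because the sample contains no participants with DQT, this calculation says nothing about sensitivity or false negatives.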

Statistical analysis
The Statistical Package for Social Sciences (SPSS) 2009 version software package was used. The results were calculated by a third person who was not one of the examiners. Each participant was entered with a code and the following variables were recorded: age, endurance, occupation and use of the upper limbs, whether the use of hand skills in ADL was continuous or not, whether they were a professional athlete and the extent of upper limb use in their sport. Measurements were recorded as a categorical variable according to whether there was pain (1) or not (0), and as a VAS score from 0 to 10. Data was recorded for the three clinical tests, for the two assessors, and for the first and second measurement.
Cohen's kappa was used to evaluate the agreement between examiners. The frequency of agreement was calculated for the 45 cases at the first and the second measurement.
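For readers unfamiliar with the statistic, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance from each rater's marginal frequencies: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a plain-Python illustration under that standard definition; it is not the SPSS procedure used in the study, and the function name `cohen_kappa` and the example ratings are hypothetical.

```python
import math
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters give the same score.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical category: kappa is undefined.
        return math.nan
    return (p_o - p_e) / (1 - p_e)


# Hypothetical pain (1) / no pain (0) ratings for four cases.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

The undefined case (chance agreement of exactly 1) arises when both raters give every case the same score.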
Specificity testing was performed by calculating the True Negative and False Positive ratios. The Finkelstein, WHAT and Eichhoff tests were each evaluated separately.

Conflicts of interest: None
Participants signed an informed consent form acknowledging that they had received a participant information brochure, that they understood the purpose of the study, that they could withdraw at any time without their status being affected at that time or in the future, and that while the information obtained from the study would be published, no identifiable information would be disclosed and it would remain confidential. Additionally, participants were con[...] was caused, the physiotherapist would give the appropriate advice for treatment. This study was approved by the competent authority, the Bioethics Committee of the European University Cyprus (2016).

Results
Forty-five healthy individuals (n=45) without any known pathology, pain or swelling in the wrist area, and with full range of motion and strength, were evaluated. The demographic characteristics are presented in Table 1.

Agreement between two examiners (Inter-rater reliability)
The agreement between examiners for the Finkelstein test was minimal at both the first and the second measurement. Similarly, minimal agreement was observed for the Eichhoff test; for the left hand, however, agreement was considered moderate because the kappa values were close to 0.41. The inter-rater reliability for the WHAT test could not be calculated using the kappa method because many of the outcome values were close to 0. The frequency of agreement for the Eichhoff and Finkelstein tests ranges between 23 and 27, while for the WHAT test it is higher, ranging between 42 and 44. This data is presented in Table 2.
When the categorical pain variable (pain or no pain) is used, the examiners may still agree on whether a test provoked pain even when the degree of pain recorded differs. The Finkelstein and Eichhoff tests showed moderate to substantial agreement (kappa values of 0.41-0.60 and 0.61-0.80, respectively). The level of agreement for the WHAT test cannot be calculated using the kappa method because nearly all values were close to 0. As shown in Table 3, the examiners appear to be in greater agreement with the Eichhoff test; however, the greatest agreement appears to be in the WHAT test, at a frequency of 43/45. No difference between the left and right hand was observed. These findings are statistically significant (p < 0.05).
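The contrast between a high frequency of agreement and an uninformative kappa can be reproduced with made-up ratings. The sketch below uses hypothetical data, not the study's measurements: two examiners each record pain in a single, different participant out of 45. The helper name `agreement_and_kappa` is an assumption introduced for illustration.

```python
from collections import Counter


def agreement_and_kappa(rater_a, rater_b):
    """Return (number of agreeing cases, Cohen's kappa) for two raters."""
    n = len(rater_a)
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    p_o = agreements / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal proportions.
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else float("nan")
    return agreements, kappa


# Hypothetical ratings: almost every participant is pain free (0).
examiner_1 = [1] + [0] * 44       # pain recorded only in participant 1
examiner_2 = [0, 1] + [0] * 43    # pain recorded only in participant 2

agreements, kappa = agreement_and_kappa(examiner_1, examiner_2)
print(agreements, "of", len(examiner_1), "cases agree")  # 43 of 45
print(f"kappa = {kappa:.3f}")                            # about -0.023
```

When nearly all ratings fall in one category, chance agreement is already close to 1, so kappa becomes unstable or undefined even though the raw agreement is high. This is why the frequency of agreement is the more informative figure for a test with very few positive results.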

Agreement between two measurements (Intra-rater reliability)
The level of agreement between each physiotherapist's first and second measurement for the Finkelstein test is presented in Table 4.
When considering the pain variable as pain or no pain, as shown in Table 5, the Finkelstein test's level of agreement was moderate to good, ranging from 0.41-0.60 to 0.61-0.80. In some cases, as shown in Table 5, there was less agreement. The Eichhoff test showed a high level of agreement with statistical significance (p < 0.05). Kappa could not be calculated for the WHAT test because of the high homogeneity of values close to 0. The agreement percentage for the WHAT test was the highest of the three tests regardless of which hand was used, indicating that the WHAT test has the highest intra-rater reliability of the three.

Validity-True negative, false positive
Validity of the Finkelstein special test was assessed by the use of True Negative and False Positive values. As noted in Table 6, nearly half of those assessed using the Finkelstein and Eichhoff tests reported pain (i.e. false positive). Note that this was the case even though all these individuals had been screened for no apparent injury or pathology. In contrast, with the WHAT test a high percentage of people (89%) who were known to be healthy did not report pain (i.e. true negative).

Agreement between tests
The agreement between the Finkelstein and Eichhoff tests is slight to fair, with a frequency of agreement of 30 and 32 of the 45 subjects for the right and left hand, respectively. The agreement level between the Finkelstein and WHAT tests is poor. Further, the level of agreement between the WHAT and Eichhoff tests is even weaker. The results are listed in Table 7.
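The verbal labels used for kappa in this section (slight, fair, moderate, substantial) correspond to bands such as 0.41-0.60 and 0.61-0.80. The helper below is an assumption-laden sketch that maps a kappa value to the widely used Landis and Koch descriptors; the paper does not state which scale was applied, and the function name `interpret_kappa` is hypothetical.

```python
def interpret_kappa(kappa):
    """Map a kappa value to a Landis and Koch style descriptor (assumed scale)."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


print(interpret_kappa(0.45))  # moderate
print(interpret_kappa(0.15))  # slight
```

Other scales use different wording for the same ranges, so the labels should be read together with the numerical kappa values in the tables.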

Discussion
Clinicians need a valid and reliable tool to diagnose soft tissue disorders in the upper extremity. MRI and diagnostic ultrasound, although useful in the diagnosis of DQT, are costly and burdensome for the patient.

Finkelstein test reliability
The level of agreement between measurements was slight to fair when the VAS (0 to 10) score was used. This low reliability may be attributed to differences of even one point on the VAS scale. We hypothesize that different weekly activities undertaken by the participants before assessment may have caused greater sensitivity during the tests [23,24]. Muscle imbalance may also have contributed to the different results, but this parameter was not evaluated [25]. In addition, the passive diagnostic mechanism in the Finkelstein test is a tension mechanism within the tendon sheath, and it is highly probable that it also irritates other anatomical tissues [26]. The better agreement of the WHAT test may be due to the fact that it is an active test and individuals can control their strength against the examiner [4].
When the two measurements of the Finkelstein and Eichhoff tests were compared using the variable of pain or absence of pain, the agreement was moderate to good. This suggests that reliability between the two measurements was higher than with the VAS scale because there were only two categories. Agreement between the first and the second measurement for the Finkelstein and Eichhoff tests was fair when the VAS scale was used as the variable. The two physiotherapists were blinded to each other's results so as not to affect recording. The agreement between the examiners is poor and not satisfactory, even though they examined exactly the same people with the same methodology. This may be attributed to human error or individuality, for example the pressure applied as a result of each examiner's manual handling skill [27]. In contrast, reliability between observers is higher when pain is documented with two categories (any pain or absence of pain), as participants simply reported pain or not.
It is important to note that because the only outcome variable in this study is pain, both inter-rater and intra-rater reliability are at risk of being compromised by the individuality of the pain experience, including environmental and psychological factors, according to the biopsychosocial model of pain assessment [28-30]. Accordingly, the application of clinical tests may differ among individuals and therefore a different VAS score may be reported.

Finkelstein validity
The validity of the Finkelstein test, and secondarily of the Eichhoff test, was evaluated with the use of True Negative and False Positive values [31]. The results showed that although the sample was composed of healthy individuals, they experienced pain in both the Finkelstein and Eichhoff tests. We showed here that there can be up to 50% false positives in the Finkelstein test, which means that for every 2 healthy people without wrist symptoms, one will have a positive result. The finding that the Finkelstein test is likely to be positive in healthy individuals was also reported by Huisstede et al. [32]. We hypothesize that this may be due to individual differences, i.e. thickening of the tendon or other anatomical factors, such as whether the two tendons pass through the same sheath. However, diagnostic ultrasound or MRI is necessary to be able to diagnose any pathology.
There appeared to be minimal agreement between the WHAT and Finkelstein tests and poor agreement between the WHAT and Eichhoff tests. However, as outlined above, there was agreement between the Finkelstein and Eichhoff tests. This led us to the conclusion that the Finkelstein and Eichhoff tests, although they give similar results, do not agree with the WHAT test, which appears to be more sensitive and specialized for people with DQT.
The results of this study show that because the Finkelstein and Eichhoff tests can be positive in a healthy population, they cannot be a valid diagnostic criterion on their own. According to the literature, grip and pinch force testing should also be incorporated into clinical reasoning for the diagnosis of DQT, because these patients have reduced strength in the affected hand [33]. However, an investigation that verified the reliability and validity of the De Quervain Screening Tool concluded that the Finkelstein test is a valid and reliable test to include as part of the diagnostic criteria. This is consistent with the results of the present study: the Finkelstein test clearly cannot be used alone in diagnosis, but it aids the clinical reasoning process. It may be preferable and more appropriate for the Finkelstein test to be included with the other diagnostic criteria as set by Batteson, et al. [17]. The results of this study strengthen the suggestion of Goubau, et al. [4] that the WHAT test has better sensitivity and specificity for diagnosis than the Eichhoff test. The WHAT test seemed to have greater reliability between measurements and among examiners, giving a measure of stability to our study. Thus, we hypothesize that the WHAT test should be used as part of the diagnostic criteria to aid the clinician in diagnosing and ruling out DQT. In a situation whereby a patient presents with a negative WHAT test, the clinician may conclude with up to 90% certainty (as per this study) that the tendon complex is unlikely to be implicated.
The results of this study suggest that the Finkelstein test is not a valid tool for diagnosing DQT when used alone, as it may cause pain even in healthy individuals who do not have DQT. The Finkelstein and Eichhoff tests could be used as one diagnostic criterion in combination with other criteria.

Limitations of study and future studies
Whilst the results of this study are indicative of the true negatives and false positives, the study lacks evidence for the evaluation of false negatives. Hence, in future studies, healthy individuals who experience pain during the tests could be assessed with diagnostic ultrasound [6] or even with MRI [34] to clearly identify whether any underlying pathology is present.
Another issue which became evident from the results between examiners is the inter-rater reliability. There were only 2 examiners, and inter-rater reliability cannot be strongly predicted from so few.
The sample size was only 45 individuals and the study was limited to healthy individuals, so we are limited in extrapolating to the wider community [35]. Repeating the study with people with known DQT would also provide further valuable data.
A further limitation was the lack of randomization of the tests. For example, even though the examiners were blinded to each other's VAS scores, the order in which the tests were performed may have contributed to the results. As the Finkelstein test was the first test, this may have been a reason for the high number of false positives. Additionally, the first session may have influenced the second by preconditioning the subject for the session one week later.
Electromyography and motor control tests could be used to control for muscle imbalance. For example, the results of this study could be due to a change in muscle activation timing and tendon tension. Similarly, future research could follow the participants who experienced pain to see whether they develop DQT in the future. The ergonomics of their profession, exercise, daily activities, psychological state and handedness (right or left) may also have affected the results of the aforementioned tests at the second measurement.
Confounders such as being an athlete or non-athlete, and continuous or non-continuous use of the upper limbs at work and in everyday life, may have caused differences in the results.
Future studies may also look at women who have just given birth, as they are at risk of DQT.