The effects of therapies for Myalgic Encephalomyelitis and chronic fatigue syndrome should be assessed using objective measures

There is controversy with regard to therapies proposed to be effective for Myalgic Encephalomyelitis (ME) and chronic fatigue syndrome (CFS), especially behavioral therapies: cognitive behavioral therapy (CBT) and graded exercise therapy (GET). As will be exemplified by the PACE trial and other studies, the positive effects of CBT and GET are almost exclusively based on strongly varying subjective criteria (measures and cut-off thresholds for fatigue, physical functioning, et cetera). Depending on the subjective criteria used, ‘recovery’ rates vary from 7% to 69%. Looking at the objective measures, e.g. work rehabilitation, physical fitness, and activity levels, CBT and GET seem to have a negligible effect or no effect at all. Trials into proposed therapies for ME and CFS, including CBT, GET, rituximab and rintatolimod, should use objective measures to impartially assess the effectiveness.
Correspondence to: Frank Twisk, ME-de-patiënten Foundation, The Netherlands, Tel: 31-72-505 4775, E-mail: frank.twisk@hetnet.nl


Introduction
ME, acknowledged as a new clinical entity in the 1950s [1], is a multisystemic disease, characterised by distinctive muscular symptoms, especially muscle weakness and myalgia after minor exertion lasting for days, neurological symptoms implicating cerebral dysfunction, e.g. cognitive impairment and sleep reversal, dysregulation of the circulatory system and other systems, and a chronic relapsing course [2][3][4]. CFS, introduced in 1988 [5] and redefined in 1994 [6], is primarily defined by chronic fatigue. To meet the diagnosis of CFS [6], chronic fatigue must be accompanied by at least four out of eight additional symptoms: loss of memory or concentration, sore throat, tender lymph nodes, muscle pain, joint pain, headaches, unrefreshing sleep and prolonged extreme exhaustion after minor exertion (post-exertional 'malaise'). Although the terms ME and CFS are often used interchangeably, the diagnostic criteria for ME [2][3][4] and CFS [6] define distinct clinical entities with partial overlap. A patient can meet the diagnosis of ME [2][3][4] while not fulfilling the CFS [6] criteria and vice versa.
Both ME and CFS are defined by a complex of symptoms: ambiguous and abstract notions, e.g. muscle weakness, fatigue, cognitive deficits and unrefreshing sleep. However these symptoms, if not operationalized, are also experienced in other conditions, e.g. mitochondrial disorders and depression, and assessment of the presence and severity of a specific symptom fully relies on the patient's judgement. Diagnosis, improvement and recovery are almost always based on subjective notions, e.g. fatigue scores, derived from questionnaires, visual analogue scales et cetera.
This article will illustrate that the various subjective definitions of caseness and recovery, based on cut-off scores for subjective measures, fully determine the outcomes of trials into the effects of treatments, e.g. CBT and GET, yielding recovery rates from 7% (without any significant differences compared to non-intervention) to 69%.
Commonly used subjective measures for ME and CFS relate to 'fatigue' and physical functioning. An often used variant of the Chalder Fatigue Scale (CFQ) [7,8] contains 11 questions related to 'fatigue' [9], e.g. "Do you lack energy?". The answers can be scored 'binary' ("better than usual" and "no worse than usual" scoring 0, "worse than usual" and "much worse" scoring 1), yielding a range of 0 to 11 for the CFB fatigue score. Alternatively, the CFQ answers are given Likert scores ("better than usual": 0, "no worse than usual": 1, "worse than usual": 2, and "much worse": 3), giving a range of 0 to 33 for the CFL fatigue score. Higher CFB and CFL fatigue scores indicate higher levels of fatigue. The CIS F [10] score is based on the patients' rating of 8 statements related to 'fatigue', e.g. "I get tired very quickly", on a seven-point Likert scale (from 1 for "no, that is not true" to 7 for "yes, that is true", or reversed scoring), yielding a total score for CIS F ranging from 8 to 56. Higher CIS F scores indicate higher levels of fatigue. The SF-36 PF score [11-13] is derived from 10 questions with regard to physical limitations, with possible scores being "limited a lot" (score: 0), "limited a little" (50), or "not limited at all" (100). The total SF-36 PF score (range: 0-1,000) is divided by 10, yielding a range of 0 to 100. Higher SF-36 PF scores indicate better physical functioning. Other studies use the SIP 8 score [10,14] to assess the degree of disability (limitations), subdivided into 8 categories. The SIP 8 score (total weighted score of all items in the 8 SIP subscales) ranges from 0 to 5,799, lower scores indicating less disability.
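The scoring rules described above can be illustrated with a minimal Python sketch. The function names and the example respondents are hypothetical and serve only to show the arithmetic; they are not taken from the cited instruments or their official scoring manuals.

```python
# Illustrative scoring helpers (hypothetical names), following the rules given in the text.

CFQ_OPTIONS = (
    "better than usual",
    "no worse than usual",
    "worse than usual",
    "much worse",
)

SF36_PF_POINTS = {
    "limited a lot": 0,
    "limited a little": 50,
    "not limited at all": 100,
}


def cfq_bimodal(answers):
    """CFB: the first two options score 0, the last two score 1 (range 0-11)."""
    assert len(answers) == 11, "the CFQ variant described has 11 questions"
    return sum(0 if CFQ_OPTIONS.index(a) < 2 else 1 for a in answers)


def cfq_likert(answers):
    """CFL: the options score 0, 1, 2, 3 in order (range 0-33)."""
    assert len(answers) == 11, "the CFQ variant described has 11 questions"
    return sum(CFQ_OPTIONS.index(a) for a in answers)


def sf36_pf(answers):
    """SF-36 PF: sum of the 10 item scores (0-1,000) divided by 10 (range 0-100)."""
    assert len(answers) == 10, "the SF-36 PF subscale has 10 questions"
    return sum(SF36_PF_POINTS[a] for a in answers) / 10


# Hypothetical respondent A: 9 items "worse than usual", 2 items "better than usual".
respondent_a = ["worse than usual"] * 9 + ["better than usual"] * 2
print(cfq_bimodal(respondent_a), cfq_likert(respondent_a))  # 9 18

# Hypothetical respondent B: every item "no worse than usual".
respondent_b = ["no worse than usual"] * 11
print(cfq_bimodal(respondent_b), cfq_likert(respondent_b))  # 0 11

# Hypothetical SF-36 PF answers: 7 items "limited a little", 3 "not limited at all".
print(sf36_pf(["limited a little"] * 7 + ["not limited at all"] * 3))  # 65.0
```

Respondent A shows that a CFL score of 18 can coincide with a CFB score of 9, and respondent B shows that a person reporting no change on any item still has a CFL score of 11.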

The outcomes of trials into the effect of therapies are fully determined by strongly varying subjective criteria
Most trials into therapies for ME and CFS relate to CBT, GET or variants thereof. Some trials have investigated the effects of [..]
According to Knoop et al. [32], 69% of the patients 'recovered from CFS' by CBT/GET. 'Recovery from CFS' was defined as a CIS F score <35 and a SIP 8 score <700. Since the inclusion criteria (CIS F ≥35 and SIP 8 ≥700) border on the recovery criteria, a minimal improvement was sufficient to be qualified as 'recovered from CFS'. But the cut-off thresholds for recovery don't come close to the criteria for "normal fatigue" (CIS F ≤27) and "no disabilities in all domains" (SIP 8 ≤203) as defined by the authors. According to Knoop et al. [32], 23% of the patients recovered using the "most comprehensive definition". However, this definition of recovery doesn't include a criterion for SIP 8, one of the two measures used to define 'CFS'. Moreover, all cut-off thresholds (mean + 1 SD) assume normal distributions for CIS F, SF-36 PF and SF-36 Social Functioning. But, as the authors confirm, the scores are not normally distributed. For that reason, percentile-based cut-offs (85th percentile) should be used. If percentiles were used and the SIP 8 score was included, recovery rates would drop substantially below 20%. A control group was lacking, but another study by the same group [33] observed a self-rated clinical improvement in 30% of the patients in the non-intervention group.
Prins et al. [33] conducted a study of the effects of CBT/GET, guided support, and non-intervention in CF patients and concluded that "[CBT/GET] was significantly more effective than both control conditions for fatigue severity [..] and for functional impairment [..]". Both CBT/GET and non-intervention had a positive effect on the mean CIS F [34] and SIP 8 [35] scores. But the effects of CBT/GET and non-intervention were by far insufficient to achieve 'normal levels' for CIS F [32,36] and SIP 8 [32,33,36] as defined in other studies by the research group. There was no significant difference between the effect of CBT/GET and non-intervention on any of the secondary subjective measures, except for the Karnofsky status [37,38], rated by a clinical psychologist. Prins and colleagues [33] observed a clinically significant improvement of fatigue (CIS F) for 35%, of the Karnofsky score for 49%, and a self-rated improvement for 50% of the CF patients after CBT/GET. However, 32% of the patients in the non-intervention group also reported a clinically significant improvement without any intervention and 23% experienced a significant improvement of the Karnofsky performance status. The subjective improvement experienced in the CBT/GET and non-intervention groups was not reflected by an improvement of objective measures. CBT/GET had no substantial effect on the (low) number of hours worked [33] or activity levels [39].
Both CBT/GET and non-intervention had a positive effect on CIS F and SF-36 PF in adolescents with CFS in a study by Stulemeijer and others [40], although the effects of CBT/GET on these two primary measures were significantly larger. The effects of CBT/GET on the eight other CFS symptoms [6] were very modest or non-existent. No less than 44% of the patients in the non-intervention group and 71% in the CBT/GET arm rated themselves as "completely recovered" or "much better". Although the effect of CBT/GET was significantly larger, both CBT/GET and non-intervention had a positive effect on school attendance, the only objective measure reported by this study. Despite the positive effects, school absence remained rather high in both groups. A reduction of school absence is at odds with the negligible effects of CBT/GET on activity levels established in another study by the same research group [39]. A follow-up study [41] observed that CBT/GET didn't result in improved cognitive test scores. This is extremely relevant since Nijhoff and others [42] observed that CFS has a dramatic impact on cognitive functioning and that this decline in the IQ of CFS patients is not correlated to school absence.
[..], implicating a CFB score ≥6 (CFL >12), and the SF-36 PF score was ≤65. A 'positive outcome' for CFB was defined as "a 50% reduction in fatigue score, or a score of 3 or less" ("normal fatigue"), while a score of ≥75, or a 50% increase from the baseline score, was considered to be a 'positive outcome' for SF-36 PF. Recovery was defined by four criteria: CFB ≤3, SF-36 PF ≥85, a self-rated Clinical Global Impression (CGI) of "very much better", and no longer meeting the criteria for CF [25], CFS [6] or 'ME'.
Deviating from the protocol [24], the eligibility criterion for SF-36 PF was defined as ≤60 at the start of the trial [15]. During the trial this criterion was changed from ≤60 to ≤65 to increase recruitment. The eligibility criterion for the CFB score remained ≥6. In the post-hoc analysis the 'normal range' for the CFL score was defined as ≤18, corresponding to a CFB score of ≤9, and the 'normal range' for SF-36 PF as ≥60 [15]. This implies that a patient in the PACE trial could meet the 'normal ranges' for CFL and SF-36 PF afterwards without any improvement or even with a small deterioration. Moreover, the 'normal values' defined by the PACE trial [15] (CFL ≤18, SF-36 PF ≥60) don't come close to the recovery criteria from the protocol (CFB ≤3/CFL ≤9, SF-36 PF ≥85).
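The overlap between the entry criteria and the post-hoc 'normal ranges' can be checked with a few comparisons. The sketch below uses only the thresholds quoted above; the function names and the example patient are hypothetical, and the protocol check covers only the two score thresholds, not the CGI or case-definition criteria.

```python
# Threshold checks using the cut-offs quoted in the text (function names are hypothetical).

def meets_entry_criteria(cfb, sf36_pf):
    """Eligibility after the change during the trial: CFB >= 6 and SF-36 PF <= 65."""
    return cfb >= 6 and sf36_pf <= 65


def within_normal_range(cfl, sf36_pf):
    """Post-hoc 'normal range': CFL <= 18 and SF-36 PF >= 60."""
    return cfl <= 18 and sf36_pf >= 60


def meets_protocol_score_thresholds(cfb, sf36_pf):
    """Score part of the protocol recovery definition: CFB <= 3 and SF-36 PF >= 85."""
    return cfb <= 3 and sf36_pf >= 85


# Hypothetical patient at baseline: 6 CFQ items "worse than usual" (CFB 6)
# and 5 items "no worse than usual", giving CFL = 6 * 2 + 5 * 1 = 17; SF-36 PF = 65.
cfb, cfl, pf = 6, 17, 65

print(meets_entry_criteria(cfb, pf))             # True
print(within_normal_range(cfl, pf))              # True
print(meets_protocol_score_thresholds(cfb, pf))  # False
```

Under these assumptions the same baseline scores satisfy both the entry criteria and the 'normal ranges', while remaining far from the score thresholds in the protocol's recovery definition.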
In 2013 the PACE trial investigators published 'recovery rates' [26] for "trial recovery from CFS" and "clinical recovery from the illness". "Trial recovery" was defined by the two criteria based on 'normal values' for CFL (≤18) and SF-36 PF (≥60), combined with two other criteria: a CGI of "much better" or "very much better", and no longer meeting the Oxford case definition of CF [25]. Based on these new 'recovery' criteria, which are much less strict than the recovery criteria defined in the PACE trial protocol [24], the authors reported that the percentages of 'recovered' CF patients were 22% after CBT, 22% after GET, 8% after APT and 7% after SMC [26]. Note that these rates potentially include patients who already met 'recovery' criteria at baseline. A re-analysis concluded that "When recovery was defined according to the original protocol, recovery rates in the GET and CBT groups were low and not significantly higher than in the control group [SMC arm] (4%, 7% and 3%, respectively)" [27].
So, when the subjective recovery criteria defined in the PACE trial protocol are employed, the difference between the effects of CBT and GET and the effect of SMC is negligible. This finding is reflected by the effects of CBT, GET and SMC on objective measures. CBT and GET had a very small effect on the number of meters walked in 6 minutes [15], largely insufficient to achieve normal levels [28,29]; 'return to employment', health care usage and social welfare benefits didn't improve [30]; and fitness and perceived exertion during a step test also didn't substantially improve [31].

Example 2: Dutch studies into CBT/GET
The use of various subjective measures and strongly fluctuating cut-off scores to define caseness, improvement and recovery is also observed in three Dutch studies into the effects of CBT/GET.
[..] [43]. Although studies showed positive effects on subjective measures for the severity of the symptoms and degree of disability [44][45][46], the evidence of positive effects on objective measures has so far been limited to exercise tolerance and concomitant medication usage [44,46].
Three studies reported positive effects of B-cell depletion by the anti-CD20 antibody rituximab in a subgroup of CFS patients [47][48][49]. Evidence of a positive effect on objective measures, e.g. (maximum) oxygen uptake or activity levels, is still lacking, but the protocol of a randomized phase III study [50], which will be finished in the near future, states the trial will report on the effects of rituximab on activity levels.

Trials into therapies for ME and CFS should be using objective measures
Subjective measures are associated with risk of bias ('researcher allegiance') [51], the placebo effect [32], buy-in effects [52], and other effects related to the nature of these measures. The shortcomings of subjective measures are extremely relevant since, as observed by Knoop and colleagues [32], the effect of behavioral treatments, like CBT and GET, is (mainly) due to expectations and a placebo effect. Knoop et al. [32] also confirm that: "[R]ecovery is a construction. The percentage of recovered patients differed depending on the definition of recovery used.". Recovery rates vary from 69% to non-existent, just by employing different cut-offs for subjective measures. Since the effect of CBT/GET is ascribed to a placebo response by Knoop et al. [32] (14% for psychological interventions in ME/CFS), one could argue that the real effect of CBT/GET is non-existent. This position is supported by the observation that CBT/GET has no substantial effect on the low activity levels [39].
Given the current controversy with regard to therapies for ME and CFS, trials into proposed therapies for ME and CFS, including CBT, GET, rituximab and rintatolimod, should use objective measures to impartially assess the effectiveness.