COCs containing dienogest and 30 μg ethinylestradiol may carry a higher VTE risk compared to corresponding preparations with levonorgestrel: A meta-analysis of four large cohort studies

Background: The European Medicines Agency requested a meta-analysis of four large, multi-national cohort studies on hormonal contraceptives to clarify whether dienogest/ethinylestradiol-containing combined oral contraceptives (DNG/EE) carry a different risk of venous thromboembolic events (VTE) compared to levonorgestrel/ethinylestradiol-containing preparations (LNG/EE). The primary objective of the meta-analysis was to assess VTE risk in a study population that is representative of the actual users of the individual preparations.
Methods: All four studies were prospective, observational cohort studies. Cohorts consisted of new users of hormonal contraceptives: starters, switchers and restarters. Study participants were followed up for up to 10 years. The analysis was restricted to preparations containing 30 μg of ethinylestradiol. Primary risk measure: VTE hazard ratio (HR) in the European study population for DNG/EE versus LNG/EE adjusted for age, BMI, duration of current use, family history of VTE and data source.
Results: The analysis set included data from 228,122 users of hormonal contraceptives. The European study participants had used DNG/EE and LNG/EE for 38,708 and 45,359 woman years (WY), respectively. The meta-analysis includes 102 VTEs: DNG/EE, 56 cases and 14.5 VTE/10,000 WY; LNG/EE, 46 cases and 10.1 VTE/10,000 WY. The primary analysis showed an adjusted HR for DNG/EE versus LNG/EE of 1.6 (95% confidence interval 1.1-2.3). Four alternative analyses showed similar results although only one of these analyses reached statistical significance.
Conclusion: DNG/EE is probably associated with a slightly higher risk of VTE compared to LNG/EE. However, some uncertainty regarding the validity of this result remains.
*Correspondence to: Jürgen Dinger, ZEG-Berlin Center for Epidemiology and Health Research, Invalidenstrasse 115, 10115 Berlin, Germany, Tel: +49 30 9451010; E-mail: j.dinger@zeg-berlin.de


Introduction
The safety of combined oral contraceptives (COCs) has improved over the years with the reduction in doses of estrogen and progestogen. However, concerns about COC safety have remained, peaking in the mid-1990s and early 2010s with discussion on whether COCs containing so-called "third" and "fourth generation" progestogens (desogestrel/gestodene and drospirenone, respectively) have a higher risk of cardiovascular side effects, especially venous thromboembolic events (VTE), than older formulations [1][2][3][4].
Although the specific combination of 2 mg of dienogest (DNG) and 30 μg of ethinylestradiol (EE) has a substantial market share in Europe, published data on VTE risk are limited. It is currently scientifically unclear whether this specific combination (DNG/EE) is associated with a different risk of VTE compared to levonorgestrel (LNG)/EE-containing COCs, which are often used as the reference standard for the VTE risk associated with combined hormonal contraceptives.
The Berlin Center for Epidemiology and Health Research conducted several large prospective cohort studies on the risk of VTE associated with the use of hormonal contraceptives. Four of these studies included a substantial number of women using DNG/EE or LNG/EE-containing COCs. The European Medicines Agency requested a meta-analysis of these four prospective cohort studies to clarify whether DNG/EE carries a different VTE risk compared to LNG/EE. Therefore, the data on DNG/EE and LNG/EE from the following four prospective cohort studies were combined: i) "Long-term Active Surveillance Study for Oral Contraceptives" (LASS) [5]; ii) "International Active Surveillance Study of Women Taking Oral Contraceptives" (INAS-OC) [6]; iii) "Transatlantic Active Surveillance on Cardiovascular Safety of Nuvaring" (TASC) [7]; and iv) "International Active Surveillance Study - Safety of Contraceptives: Role of Estrogens" (INAS-SCORE) [8].

Materials and methods
Details of design and methodology of the four cohort studies are described elsewhere [1][2][3][4]. All studies were conducted in accordance with the ethical principles of the Declaration of Helsinki. The primary ethical approvals in Europe for all four studies were provided by the ethical committee of the physicians' association in Berlin, Germany. The study outlines were published at ClinicalTrials.gov prior to the recruitment phase of the individual studies. Each study was governed by an independent Safety Monitoring and Advisory Council to ensure its scientific independence.
The primary objective of the meta-analysis was to assess the risk of VTE associated with the short and long-term use of DNG/EE and LNG/EE in a study population that is representative for the actual users of the individual preparations. The secondary objective was to characterize the baseline risk of users of the two formulations (lifetime history of co-morbidity, prognostic factors for VTE, co-medication, socio-demographic and life-style data).
The LASS study was conducted in Europe only; the other three studies were transatlantic studies that included a large proportion of European women. Study participants were recruited via large networks of hormonal contraceptive prescribing health care professionals in the United States and a total of 12 European countries. DNG/EE-containing COCs are not available in the United States. Thus, only European data were used for the meta-analysis presented here. The methodology used in the four studies was almost identical. All studies were large, prospective, observational, active surveillance studies that focused on the risk of VTE associated with the use of hormonal contraceptives. Cohorts consisted of new users of COCs: starters, switchers and restarters. A 'non-interference' approach was used to provide standardized, comprehensive, reliable information under routine medical conditions: i.e., all patients who were new users of an OC were eligible for enrolment if they gave their informed consent, and the physicians' prescribing behaviour was not influenced by quotas for specific OCs. Study participants were followed up for up to 10 years. All outcomes of interest were captured by direct contacts between the investigator team and the study participants. Inclusion and exclusion criteria, methods of patient recruitment, follow-up and data documentation (including prognostic factors for VTE) were almost identical. The data from these studies could therefore be combined for the planned meta-analysis without any substantial methodological problems.
Overall, the analysis set included data from 228,122 users of hormonal contraceptives with a follow-up of 736,793 woman years (WY) of observation. The European study participants had used DNG/EE and LNG/EE (preparations with 30 µg EE only) for 38,708 WY and 45,359 WY, respectively. The proportions of starters, switchers and restarters in the DNG/EE user group were 21%, 31% and 48%. The corresponding values for LNG/EE were 26%, 27% and 47%.
A low "loss to follow-up rate" was essential for the validity of all four studies. In order to minimize loss to follow-up, the same multifaceted, four-level follow-up process was established in all studies. Level 1 activities included mailing the follow-up questionnaire and, in case of no response, two reminder letters. If level 1 activities did not lead to a response, multiple attempts were made to contact the woman, her friends, relatives, and gynaecologist/primary care physician by phone (level 2 activities). In parallel to these level 2 activities, searches in national and international telephone and address directories as well as social networks were started (level 3 activities). If this was not successful, an official address search via the respective governmental administration was conducted (in some countries centralized, in others decentralized at community level). This level 4 activity usually yielded information on a new address (or information that the respondent had moved abroad or died). Overall, the loss to follow-up rate was 3.3% or lower in each of the four studies.
In all four studies the same procedures were used for the validation of reported adverse events. All serious adverse events and particularly VTEs were validated via the diagnosing and/or treating physician. All VTEs were checked at the end of the studies by three independent medical experts specializing in radiology/nuclear medicine, cardiology, and internal medicine/phlebology. For the blinded adjudication process, the brand names, dose, regimen and composition of the OC(s) used by the reporting woman were rendered anonymous. The adjudicators conducted their reviews independently of each other and without knowing the judgement of the other adjudicators or the investigators.
The primary risk measure was the VTE hazard ratio (HR) in the European study population for DNG/EE versus LNG/EE. In general, it is very difficult to interpret a relative risk of two or less in observational research [9,10]. Therefore, the author focused the analysis on excluding a twofold risk. Accordingly, the null hypothesis prior to the meta-analysis was: HR VTE > 2 (i.e., the adjusted VTE hazard ratio for DNG/EE vs. LNG/EE is higher than 2). The alternative hypothesis was: HR VTE ≤ 2. However, regulatory authorities often request the exclusion of a 1.5-fold risk even in non-experimental studies. The a priori power of the pooled analysis to exclude a twofold and 1.5-fold VTE risk for DNG/EE compared to LNG/EE was about 94% and 58%, respectively [11,12]. The corresponding values for the individual studies are shown in Table 1. These power calculations confirm that the meta-analysis is sufficiently powered to show non-inferiority of DNG/EE compared to LNG/EE if a non-inferiority hazard ratio of 2 is accepted. If a non-inferiority hazard ratio of 1.5 is requested, the power of the pooled analysis is limited.
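The dependence of power on the non-inferiority margin can be illustrated with a standard normal approximation for the log hazard ratio. The following is only a sketch under stated assumptions: the function name, the number of events (100), the equal split of exposure (0.5) and the one-sided significance level (0.025) are hypothetical illustration values, and the calculation does not attempt to reproduce the a priori power figures reported above or the method of references [11,12].

```python
from math import log, sqrt
from statistics import NormalDist


def noninferiority_power(events, share_exposed, margin, alpha_one_sided=0.025):
    """Approximate power to exclude a hazard ratio of `margin` if the true HR is 1.

    Assumption: the log hazard ratio is approximately normal with variance
    1 / (events * p * (1 - p)), where p is the exposed cohort's share.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    information = sqrt(events * share_exposed * (1 - share_exposed))
    return NormalDist().cdf(information * log(margin) - z_alpha)


# Hypothetical inputs chosen for illustration only.
power_margin_2 = noninferiority_power(events=100, share_exposed=0.5, margin=2.0)
power_margin_1_5 = noninferiority_power(events=100, share_exposed=0.5, margin=1.5)
print(f"margin 2.0: {power_margin_2:.2f}, margin 1.5: {power_margin_1_5:.2f}")
```

With the same number of events, the power to exclude a 1.5-fold risk is much lower than the power to exclude a twofold risk, because the distance between the null margin and a true hazard ratio of 1 is smaller on the log scale.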
Inferential statistics were based on Cox proportional hazard models. Crude and adjusted HRs between the two cohorts of interest (DNG/EE and LNG/EE) were calculated. Four prognostic factors for VTE were included as covariates in the Cox models: age (continuous variable), BMI (continuous variable), current duration of use (continuous variable), and family history of VTE (binary variable). This selection was based on those factors that had consistently shown a substantial impact on VTE risk estimates in the regular statistical analyses of the individual studies. Furthermore, the data source (i.e. the study the women were participating in: LASS, INAS-OC, TASC or INAS-SCORE) was included in the Cox models for the pooled analysis.
To assess the robustness and validity of the primary statistical model, four alternative Cox models were used for sensitivity analyses: i) overall information on 20 potential confounders for VTE was available in all 4 studies (e.g., concomitant medication, smoking, acne), and the 20 potential confounders were included as covariates in a saturated Cox model; ii) starting with the 20 potential confounders for VTE a backward stepwise procedure was used to reduce the number of covariates; all covariates that did not change the point estimate of the hazard ratio by more than 10% or that had no statistically significant impact (p > 0.05) were removed from the model in a stepwise procedure; iii) a Cox model with acne included in addition to the prognostic factors of the primary model: this model was used to investigate the impact of the baseline differences between the two treatment groups; iv) a Cox model selected by the Akaike information criterion (AIC) as an estimator of the relative quality of different Cox models [13]: i.e. during a stepwise backward procedure (see model ii) the model with the lowest AIC value was chosen.
Regulatory authorities requested that the statistical analyses be conducted based on the "as treated" (AT) population as well as the "intention to treat" (ITT) population. For the AT analyses, data on outcomes of interest were assigned to the product actually used by the respective study participant at the time of the event. For the ITT analyses, all data from individual participants were assigned to the treatment they used at study entry, regardless of any switching (or stopping) or of any different (or no) product being used at the time of the event. For studies on efficacy, the "intention to treat" (ITT) approach is often preferred because it is conservative with respect to the superiority of a new treatment. For an analysis of drug safety, however, the ITT approach dilutes differences between treatments. Therefore, the "as treated" analysis was designated as the primary analysis for assessing the data.
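The difference between the two assignment rules can be made concrete with a small sketch. The data structure, function names and the example exposure history below are hypothetical and serve only to illustrate the definitions given above; they are not taken from the study databases.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Episode:
    product: Optional[str]  # None means no hormonal contraceptive in use
    start_day: int          # inclusive, days since study entry
    end_day: int            # exclusive


def product_at(episodes: List[Episode], day: int) -> Optional[str]:
    """Return the product in use on the given day, or None if there is none."""
    for episode in episodes:
        if episode.start_day <= day < episode.end_day:
            return episode.product
    return None


def assign_event(episodes: List[Episode], event_day: int) -> dict:
    """Assign an event under the 'as treated' and 'intention to treat' rules."""
    return {
        "as_treated": product_at(episodes, event_day),  # product at time of event
        "intention_to_treat": episodes[0].product,      # product at study entry
    }


# Hypothetical participant who entered on DNG/EE and later switched to LNG/EE.
history = [Episode("DNG/EE", 0, 400), Episode("LNG/EE", 400, 900)]
assignment = assign_event(history, event_day=650)
print(assignment)
```

In this hypothetical case the event counts against LNG/EE in the AT analysis but against DNG/EE in the ITT analysis, which shows how switching can dilute differences between treatments under ITT.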
The primary analysis was based on the pooled cohorts. As requested by regulatory authorities, HRs were calculated per study as well as per user status (starter/switcher/restarter) for exploratory reasons. It should be noted that the total number of statistical tests is 25: 5 (4 studies plus pooled analysis) times 4 (complete cohorts, starters, switchers, restarters) plus 5 ITT analyses (complete cohorts in the pooled analysis and four studies). Accordingly, the likelihood of incorrectly rejecting a null hypothesis was substantial. Therefore, 24 out of the 25 tests were calculated for exploratory reasons only. Furthermore, a sub-analysis was only conducted if a minimum of three VTEs were available for each of the two comparison groups.
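The count of 25 tests and the resulting multiplicity problem can be sketched as follows. The significance level of 0.05 per test and the assumption that all tests are independent with true null hypotheses are simplifications introduced here for illustration; the tests in this meta-analysis overlap and are therefore not independent.

```python
analysis_sets = 5   # four individual studies plus the pooled analysis
user_groups = 4     # complete cohorts, starters, switchers, restarters
itt_analyses = 5    # complete cohorts: pooled analysis and four studies

total_tests = analysis_sets * user_groups + itt_analyses

# Assumed per-test significance level (not stated for the exploratory tests).
alpha = 0.05

# Chance of at least one false rejection if all tests were independent
# and every null hypothesis were true.
familywise_error = 1 - (1 - alpha) ** total_tests

print(total_tests, round(familywise_error, 2))
```

Under these simplifying assumptions the chance of at least one spurious rejection would be roughly 0.72, which is consistent with treating all but the primary test as exploratory.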

Results
Overall, baseline characteristics were similar for the DNG/EE and LNG/EE users (Table 2). The similarities include age, weight, height, BMI, cardiovascular risk factors (e.g. family history of VTE), medical history and concomitant medication. A substantial difference was found for the prevalence of acne (30.0% and 9.1% for DNG/EE and LNG/EE, respectively). Slight to moderate differences were also found: i) a higher proportion of LNG/EE users had delivered a child or had been pregnant prior to study entry; ii) a higher proportion of DNG/EE users had switched OCs prior to study entry; and iii) DNG/EE users had a higher educational level.
Acne is associated with polycystic ovary syndrome (PCOS), which is associated with an approximately twofold risk of VTE [14][15][16]. Therefore, it was important that the sensitivity analyses included a Cox model that adjusted for acne (alternative model iii). A more detailed analysis of the age profile showed that the similar mean age of the two exposure groups is slightly misleading, as the LNG/EE group included high proportions of both teenagers and women age 30. Given the exponential increase of the VTE risk with age, the LNG/EE users already had a slightly higher VTE risk prior to their enrolment. Therefore, it was to be expected that age-adjusted VTE hazard ratios for DNG/EE versus LNG/EE would be slightly higher compared to crude hazard ratios. Given the opposing effects of adjustment for acne and age, it was not expected that adjustment for the differences discussed above would result in a substantial change of the unadjusted VTE hazard ratio.
The meta-analysis is based on 102 VTEs: 56 and 46 VTEs occurred in the DNG/EE and LNG/EE exposure groups, respectively. Only preparations with the same EE-content (30 µg) were used for this comparison. The VTE incidence rates were higher for DNG/EE compared to LNG/EE (14.5 vs. 10.1 VTE/10,000 WY). The corresponding overall incidence rate ratio was 1.4; the 95% confidence interval (CI) included unity: 1.0-2.1. The results for the individual studies are shown in Figure 1. Breaking down VTE into deep venous thrombosis (DVT) and pulmonary embolism (PE) showed similar results: DVT, incidence rate ratio 1.4 (95% CI, 0.9-2.2); PE, incidence rate ratio 1.4 (95% CI, 0.6-3.1).
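The crude incidence rates and the overall rate ratio can be recomputed from the case counts and woman years given above. The sketch below uses a Wald-type confidence interval on the log scale; this interval method and the function names are assumptions made for illustration, since the method used for the reported interval is not stated here.

```python
from math import exp, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def rate_per_10000(events, woman_years):
    """Crude incidence rate per 10,000 woman years."""
    return events / woman_years * 10_000


def rate_ratio_with_ci(events_a, wy_a, events_b, wy_b):
    """Incidence rate ratio (a vs. b) with a Wald 95% CI on the log scale."""
    ratio = (events_a / wy_a) / (events_b / wy_b)
    se_log = sqrt(1 / events_a + 1 / events_b)
    lower = exp(log(ratio) - Z_95 * se_log)
    upper = exp(log(ratio) + Z_95 * se_log)
    return ratio, lower, upper


# Case counts and exposure for the European study population.
dng_rate = rate_per_10000(56, 38_708)
lng_rate = rate_per_10000(46, 45_359)
irr, ci_low, ci_high = rate_ratio_with_ci(56, 38_708, 46, 45_359)

print(f"DNG/EE {dng_rate:.1f}, LNG/EE {lng_rate:.1f} VTE/10,000 WY")
print(f"IRR {irr:.1f} (95% CI {ci_low:.1f}-{ci_high:.1f})")
```

Rounded to one decimal, this approximation gives 14.5 and 10.1 VTE/10,000 WY and a rate ratio of 1.4 (1.0-2.1), in line with the figures reported above.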
The primary analysis (pooled dataset for all users of DNG/EE or LNG/EE) resulted in an adjusted HR of 1.6 (95% CI, 1.1-2.3). The result of the ITT analysis was similar: 1.5 (95% CI, 1.0-2.4). The results for starters, switchers and restarters were also similar: starters, 1.6 (95% CI, 0.6-4.6); switchers, 1.9 (95% CI, 1.0-3.4); and restarters, 1.3 (95% CI, 0.7-2.5). A comparison of the individual study results shows consistent results (Figure 2). The huge LASS dataset had the strongest impact on the overall results. However, the data are sufficiently consistent across studies to justify combining all four study databases.
The results of the four alternative Cox models are shown in Table 3. The point estimates of the VTE hazard ratio were only slightly lower compared to the estimate from the primary model. The lower limits of the 95% confidence interval were close to one in all cases. Formal statistical significance for DNG/EE versus LNG/EE was only reached for the primary model and alternative model ii (backward stepwise procedure).

Discussion and conclusion
The incidence rate for VTE was higher for DNG/EE compared to LNG/EE and the primary statistical analysis yielded an adjusted VTE hazard ratio for the comparison of DNG/EE versus LNG/EE of 1.6. The corresponding 95% CI did not include unity and suggested a slightly higher VTE risk for DNG/EE compared to LNG/EE. This result is supported in principle by the results of four alternative analyses although only one of these analyses reached statistical significance.
In non-experimental studies like LASS, INAS-OC, TASC and INAS-SCORE the possibility of bias and residual confounding can never be entirely eliminated, and the ability to infer causation is correspondingly limited [17]. Valid information on potential sources of confounding, and sophisticated statistical and epidemiologic methodology help to reduce the impact of bias and residual confounding [18]. However, the difficulty remains unresolved when all that exists is a weak association [19,20]. Relative risk estimates that are close to unity may not allow differentiation between causation, bias and confounding [21,22]. In general, it is very difficult to interpret a relative risk of two or less in observational research [9,10].
Selection and misclassification bias were probably not a major issue in any of the four cohort studies because i) their participants are representative for adult COC users [23]; and ii) reliable information on exposure and duration of OC use was available. Furthermore, the low loss to follow-up rates of 3.3% or less in all studies is noteworthy. In theory, a disproportionately high percentage of VTE could have occurred in those patients who were lost to follow-up, because VTEs could be the reason for the break in contact with the investigators. An advantage of the design of the four included studies, however, is that the investigator teams had direct contact with the participants; contact was not lost if the women changed their gynaecologists (e.g. due to change of residence or dissatisfaction with treatment).
In contrast, it was impossible to exclude diagnostic bias. Clinical symptoms of VTE cover the spectrum from a complete absence or unspecific, slight symptoms to dramatic, acute, life-threatening symptoms [24][25][26]. A high awareness of potential cardiovascular risks of combined oral contraceptive use might have led to more diagnostic procedures and therefore to more detected VTEs. However, this is a general consideration and there is no evidence that diagnostic bias influenced the results of this meta-analysis. Another issue is the fact that information on specific gene mutations was only available for VTE cases but not for the vast majority of study participants. This limitation was mitigated by information on family history of VTE which has a higher predictive value for VTE compared to gene mutations [27].
The studies included in this meta-analysis combine several methodological strengths that are important for the validity of the results such as: i) prospective, comparative cohort design; ii) availability of important confounder information (e.g. BMI and family history of VTE); iii) validation of outcomes of interest and exposure for the relevant cases; iv) comprehensive follow-up procedure and very low loss to follow-up to minimize underreporting; v) independent, blinded adjudication of VTE cases; vi) study population representative of oral contraceptive users under routine clinical conditions; vii) quite different statistical approaches resulting in similar risk estimates and 95% CIs, which supports the validity and robustness of the primary statistical model; and viii) supervision by an independent Safety Monitoring and Advisory Council as well as scientific independence from the study funder.
The validity of synthetic meta-analysis as applied to non-experimental (observational) studies has been challenged [28][29][30]. Reasons why this approach is questioned include issues such as: variation in quality among studies; variation in methodology including variable definitions of exposure and outcome; and variable precision in the recording, measurement, and control of confounding factors. These issues do not apply to this meta-analysis. However, the possibility that a series of studies may tend to share the same biases and sources of confounding cannot be rejected for the studies included in this meta-analysis. The latter consideration is of special relevance when it comes to considering small associations [19].
In the author's judgment, the results of this meta-analysis are valid within the general limitations of observational research and meta-analysis of observational studies. The impact of residual confounding and bias could be limited to an extent that would allow causal interpretation of statistically significant results with hazard ratios that are equal to or higher than 2. Statistically significant results with hazard ratios of 1.5 to 2 can be cautiously interpreted as risk. However, some uncertainty remains in these cases.
The thresholds of 1.5 and 2 also reflect the limitations of the statistical power of this meta-analysis. The analysis was sufficiently powered to detect a twofold risk of VTE but had limited power to detect smaller risks. Nevertheless, the primary statistical analysis yielded a statistically significant increased risk of VTE for DNG/EE compared to LNG/EE. The adjusted hazard ratio of 1.6 is clearly below 2, and therefore some uncertainty regarding a causal interpretation of this result remains. The fact that the alternative analysis that adjusted for acne (the only baseline characteristic with a substantial difference between the exposure groups) showed no statistically significant difference adds to this uncertainty. It is also conceivable that adjustment for acne did not completely adjust for the likely differences in the prevalence of PCOS, and that some residual confounding remained unadjusted. However, it should be noted that primary and alternative analyses showed quantitatively similar risk estimates and the lower limits of the 95% CIs were always close to one.
Given the methodological strengths of the individual studies, the similarity of their study designs, and the quantitative consistency of the analysis results, the investigator considers it likely, but not definitively proven, that in 30 µg EE preparations DNG carries a higher risk of VTE compared to LNG.

Authorship
The meta-analysis was planned, conducted and reported by the author based on requests of the European Medicines Agency.