Adaptive Designs I

Statistical Issues in Confirmatory Platform Trials
Martin Posch, Elias Meyer, Franz König
Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Austria

Adaptive platform trials provide a framework to simultaneously study multiple treatments in a disease. They are multi-armed trials where interventions can enter and leave the platform based on interim analyses as well as external events, for example, if new treatments become available [3]. The attractiveness of platform trials compared to separate parallel group trials is not only due to operational aspects as a joint trial infrastructure and more efficient patient recruitment, but results also from the possibility to share control groups, to efficiently prune non-efficacious treatments, and to allow for direct comparisons between experimental treatment arms [2]. However, the flexibility of the framework also comes with challenges for statistical inference and interpretation of trial results such as the adaptivity of platform trials (decisions on the addition or dropping of arms cannot be fully pre-specified and may have an impact on the recruitment for the current trial arms), multiplicity issues (due to multiple interventions, endpoints, subgroups and interim analyses) and the use of shared controls (which may be non-concurrent controls or control groups where the control treatment changes over time). We will discuss current controversies and the proposed statistical methodology to address these issues [1,3,4]. Furthermore, we give an overview of the IMI project EU-PEARL (Grant Agreement no. 853966) that aims to establish a general framework for platform trials, including the necessary statistical and methodological tools.

[1] Collignon O., Gartner C., Haidich A.-B., Hemmings R.J., Hofner B., Pétavy F., Posch M., Rantell K., Roes K., Schiel A. Current Statistical Considerations and Regulatory Perspectives on the Planning of Confirmatory Basket, Umbrella, and Platform Trials. Clinical Pharmacology & Therapeutics 107(5), 1059–1067, (2020)

[2] Collignon O., Burman C.F., Posch M., Schiel A.Collaborative platform trials to fight COVID-19: methodological and regulatory considerations for a better societal outcome. Clinical Pharmacology & Therapeutics (to appear)

[3] Meyer E.L., Mesenbrink P., Dunger-Baldauf C., Fülle H.-J., Glimm E., Li Y., Posch M., König F. The Evolution of Master Protocol Clinical Trial Designs: A Systematic Literature Review. Clinical Therapeutics 42(7), 1330–1360, (2020)

[4] Posch, M., & König, F. (2020).  Are p-values Useful to Judge the Evidence Against the Null Hypotheses in Complex Clinical Trials? A Comment on “The Role of p-values in Judging the Strength of Evidence and Realistic Replication Expectations”. Statistics in Biopharmaceutical Research, 1-3, (2002)


Type X Error: Is it time for a new concept?
Cornelia Ursula Kunz
Boehringer Ingelheim Pharma GmbH & Co. KG, Germany

A fundamental principle of how we decide between different trial designs and different test statistics is the control of error rates as defined by Neyman and Pearson, namely the type I error rate and the type II error rate. The first one controlling the probability to reject a true null hypothesis and the second one controlling the probability to not reject a false null hypothesis. When Neyman and Pearson first introduced the concepts of type I and type II error, they could not have predicted the increasing complexity of many trials conducted today and the problems that arise with them.

Modern clinical trials often try to address several clinical objectives at once, hence testing more than one hypothesis. In addition, trial designs are becoming more and more flexible, allowing to adapt ongoing trial by changing, for example, number of treatment arms, target populations, sample sizes and so on. It is also known that in some cases the adaptation of the trial leads to a change of the hypothesis being tested as for example happens when the primary endpoint of the trial is changed at an interim analysis.

While their focus was on finding the most powerful test for a given hypothesis, we nowadays often face the problem of finding the right trial design in the first place before even attempting on finding the most powerful or in some cases even just a test at all. Furthermore, when more than one hypothesis is being tested family-wise type I error control in the weak or strong sense also has to be addressed with different opinions on when we need to control for it and when we might not need to.

Based on some trial examples, we show that the more complex the clinical trial objectives, the more difficult it might be to establish a trial that is actually able to answer the research question. Often it is not sufficient or even possible to translate the trial objectives into simple hypotheses that are then being tested by some most powerful test statistic. However, when the clinical trial objectives cannot completely be addressed by a set of null hypotheses, type I and type II error might not be sufficient anymore to decide on the admissibility of a trial design or test statistic. Hence, we raise the question whether a new kind of error should be introduced.


Control of the population-wise error rate in group sequential trials with multiple populations
Charlie Hillner, Werner Brannath
Competence Center for Clinical Trials, Germany

In precision medicine one is often interested in clinical trials that investigate the efficacy of treatments that are targeted to specific sub-populations defined by genetic and/or clinical biomarkers. When testing hypotheses in multiple populations multiplicity adjustments are needed. First, we propose a new multiple type I error criterion for clinical trials with multiple intersecting populations, which is based on the observation that not all type I errors are relevant to all patients in the overall population. If the sub-populations are disjoint, no adjustment for multiplicity appears necessary, since a claim in one sub-population does not affect patients in the other ones. For intersecting sub-populations we suggest to control the probability that a randomly selected patient will be exposed to an inefficient treatment, which is an average multiple type I error. We propose group sequential designs that control the PWER where possibly multiple treatments are investigated in multiple populations. To this end, an error spending approach that ensures PWER-control is introduced. We exemplify this approach for a setting of two intersecting sub-populations and discuss how the number of different treatments to be tested in each sub-population affects the critical boundaries needed for PWER-control. Lastly, we apply this error spending approach to a group sequential design example from Magnusson & Turnbull (2013), where the efficacy of one treatment is to be tested after a certain sub-population that is likely to benefit from the treatment is found. We compare our PWER-controlling method with their FWER-controlling method in terms of critical boundaries and the resulting rejection probabilities and expected information.

References:

Magnusson, B.P. and Turnbull, B.W. (2013), Group sequential enrichment design incorporating subgroup selection. Statist. Med., 32: 2695-2714. https://doi.org/10.1002/sim.5738


Adaptive group sequential designs for phase II trials with multiple time-to-event endpoints
Moritz Fabian Danzer1, Tobias Terzer2, Andreas Faldum1, Rene Schmidt1
1Institute of Biostatistics and Clinical Research, University of Münster, Germany; 2Division of Biostatistics, German Cancer Research Center, Heidelberg, Germany

Existing methods concerning the assessment of long-term survival outcomes in one-armed trials are commonly restricted to one primary endpoint. Corresponding adaptive designs suffer from limitations regarding the use of information from other endpoints in interim design changes. Here we provide adaptive group sequential one-sample tests for testing hypotheses on the multivariate survival distribution derived from multi-state models, while making provision for data-dependent design modifications based on all involved time-to-event endpoints. We explicitly elaborate application of the methodology to one-sample tests for the joint distribution of (i) progression-free survival (PFS) and overall survival (OS) in the context of an illnessdeath model, and (ii) time to toxicity and time to progression while accounting for death as a competing event. Large sample distributions are derived using a counting process approach. Small sample properties and sample size planning are studied by simulation. An already established multi-state model for non-small cell lung cancer is used to illustrate the adaptive procedure.