Chairs: Jan Beyersmann and Georg Zimmermann

**CASANOVA: Permutation inference in factorial survival designs**

Marc Ditzhaus^{1}, Arnold Janssen^{2}, Markus Pauly^{1}

^{1}TU Dortmund, Germany; ^{2}Heinrich-Heine-University Duesseldorf

In this talk, inference procedures for general factorial designs with time-to-event endpoints are presented. Similar to additive Aalen models, null hypotheses are formulated in terms of cumulative hazards. Deviations are measured in terms of quadratic forms in Nelson–Aalen-type integrals. Unlike existing approaches, this makes it possible to work without restrictive model assumptions such as proportional hazards. In particular, crossing survival or hazard curves can be detected without a significant loss of power. For a distribution-free application of the method, a permutation strategy is suggested. The theoretical findings are complemented by an extensive simulation study and the discussion of a real data example.
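To make the two ingredients named above concrete, the following minimal sketch computes a Nelson–Aalen estimate of the cumulative hazard and a permutation p-value for a two-sample comparison. It is not the CASANOVA procedure: the statistic (squared difference of the two cumulative hazard estimates at a fixed time `tau`) is a simplified stand-in for the quadratic forms in the abstract, and the function names, the choice of `tau`, and the plain label shuffling are assumptions made for illustration only.

```python
import random


def nelson_aalen(times, events, tau):
    """Nelson-Aalen estimate of the cumulative hazard at time tau.

    times:  observed times (event or censoring)
    events: 1 if the time is an event, 0 if it is censored
    """
    data = sorted(zip(times, events))
    n = len(data)
    cum_hazard = 0.0
    i = 0
    while i < n and data[i][0] <= tau:
        t = data[i][0]
        at_risk = n - i  # subjects whose observed time is >= t
        d = 0
        while i < n and data[i][0] == t:  # collect tied observations
            d += data[i][1]
            i += 1
        cum_hazard += d / at_risk
    return cum_hazard


def permutation_pvalue(times, events, groups, tau, n_perm=2000, seed=1):
    """Permutation p-value for a simplified two-sample statistic.

    groups holds labels 0/1. The statistic is an illustrative assumption,
    not the quadratic form used by the authors.
    """
    rng = random.Random(seed)

    def statistic(labels):
        hazards = []
        for g in (0, 1):
            t = [x for x, lab in zip(times, labels) if lab == g]
            e = [x for x, lab in zip(events, labels) if lab == g]
            hazards.append(nelson_aalen(t, e, tau))
        return (hazards[0] - hazards[1]) ** 2

    observed = statistic(groups)
    labels = list(groups)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reassign group labels at random
        if statistic(labels) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Shuffling the labels keeps each (time, status) pair intact, so only the group membership is permuted.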

**Statistical MODEling of Additive Time Effects in Survival Analysis**

Annika Hoyer^{1}, Oliver Kuss^{2}

^{1}Department of Statistics, Ludwig-Maximilians-University Munich, Germany; ^{2}Institute for Biometrics and Epidemiology, German Diabetes Center, Leibniz Institute for Diabetes Research at Heinrich-Heine-University Duesseldorf, Germany

In survival analysis, there have been various efforts to model intervention or exposure effects on an additive scale rather than on a hazard, odds or accelerated life scale. Though it might be intuitively clear that additive effects are easier to understand, there is also evidence from randomized trials that this is indeed the case: treatment benefits are easier to understand if communicated as postponement of an adverse event [1]. In clinical practice, physicians and patients tend to interpret an additive effect on the time scale as a gain in life expectancy, which is added as additional time to the end of life [2]. However, as the gain in life expectancy is, from a statistical point of view, an integral, this is not a precise interpretation. As a more easily interpretable alternative, we propose to model the increasing “life span” [3] and to examine the corresponding densities instead of the survival functions. The difference between the respective modes describes a change in life span, specifically the shift of the most probable event time. Therefore, it seems reasonable to model differences in lifetime in terms of mode differences instead of differences in expected times. To this end, we propose mode regression models (which we write “Statistical MODEls” to emphasize that the modes are modelled) based on parametric distributions (Gompertz, Weibull and log-normal). We illustrate our MODEls with an example from a randomized controlled trial on the efficacy of a new glucose-lowering drug for the treatment of type 2 diabetes.
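As a rough illustration of what a mode difference means for the three distributions mentioned, the sketch below evaluates the closed-form modes under one common parametrisation each. The parametrisations, the function names and the example parameter values are assumptions; the abstract does not state which parametrisation or regression structure the authors use.

```python
import math


def weibull_mode(shape, scale):
    """Mode of a Weibull density with the given shape and scale.

    For shape <= 1 the density is maximal at time 0.
    """
    if shape <= 1:
        return 0.0
    return scale * ((shape - 1) / shape) ** (1 / shape)


def lognormal_mode(mu, sigma):
    """Mode of a log-normal density; mu and sigma refer to log-time."""
    return math.exp(mu - sigma ** 2)


def gompertz_mode(a, b):
    """Mode of a Gompertz density with hazard h(t) = b * exp(a * t), a, b > 0.

    Setting the derivative of the log-density, a - b * exp(a * t), to zero
    gives t = log(a / b) / a, which is positive only if a > b.
    """
    if a <= b:
        return 0.0
    return math.log(a / b) / a


# Hypothetical parameters, for illustration only: the treatment changes
# the Weibull scale, and the effect is read off as a shift of the mode.
control_mode = weibull_mode(shape=2.0, scale=10.0)
treated_mode = weibull_mode(shape=2.0, scale=12.0)
mode_shift = treated_mode - control_mode
```

In a regression setting, the mode itself (rather than the scale) would be linked to covariates, so that a coefficient can be read directly as a shift of the most probable event time.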

[1] Dahl R, Gyrd-Hansen D, Kristiansen IS, et al. Can postponement of an adverse outcome be used to present risk reductions to a lay audience? A population survey. BMC Med Inform Decis Mak 2007; 7:8

[2] Detsky AS, Redelmeier DA. Measuring health outcomes-putting gains into perspective. N Engl J Med 1998; 339:402-404

[3] Naimark D, Naglie G, Detsky AS. The meaning of life expectancy: what is a clinically significant gain? J Gen Intern Med 1994; 9:702-707

**Assessment of additional benefit for time-to-event endpoints after significant phase III trials – investigation of ESMO and IQWiG approaches**

Christopher Alexander Büsch, Johannes Krisam, Meinhard Kieser

*University of Heidelberg, Germany*

New cancer treatments are often promoted as major advances after a significant phase III trial. Clear and unbiased knowledge about the magnitude of the clinical benefit of newly approved treatments is therefore important for determining the amount of reimbursement by public health insurance. To perform these evaluations, two distinct “additional benefit assessment” methods are currently used in Europe.

The European Society for Medical Oncology (ESMO) developed the Magnitude of Clinical Benefit Scale Version 1.1 (ESMO-MCBSv1.1), which classifies new treatments into 5 categories using a dual rule that considers the relative and the absolute benefit, assessed by the lower limit of the 95% HR confidence interval and the observed absolute difference in median treatment outcomes, respectively [1,2]. As an alternative, the German IQWiG compares the upper limit of the 95% HR confidence interval to specific relative-risk-scaled thresholds, classifying new treatments into 6 categories [4]. Until now, these methods have only been compared empirically [3].

We evaluate and compare the two methods in a simulation study with a focus on time-to-event outcomes. The simulation includes aspects such as different censoring rates and types, incorrect HRs assumed for sample size calculation, informative censoring, and different failure time distributions. Since no “placebo” method reflecting a true (deserved) maximal score is available, different thresholds of the simulated treatment effects are used as alternatives. The methods’ performance is assessed via ROC curves, sensitivity / specificity, and the methods’ percentage of achieved maximal scores. Our results indicate that IQWiG’s method is usually more conservative than ESMO’s. In some scenarios, however, such as quick disease progression or an incorrectly assumed HR, IQWiG’s method is too liberal compared to ESMO’s. Further research is nevertheless required, e.g. on the methods’ performance under non-proportional hazards.
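The kind of simulation described above can be sketched in a few lines. The example below is a simplified illustration, not the authors' simulation design: it assumes exponential event and censoring times, estimates the HR as a ratio of event rates with a Wald-type 95% interval, and reports how often the upper confidence limit falls below a threshold. The function names, the censoring hazard of 0.3 and the threshold passed in are placeholders and do not reproduce the published ESMO or IQWiG rules.

```python
import math
import random


def simulate_arm(n, hazard, censor_hazard, rng):
    """Return (number of events, total follow-up time) for one arm.

    Event and censoring times are independent exponentials (an assumption).
    """
    events, exposure = 0, 0.0
    for _ in range(n):
        t_event = rng.expovariate(hazard)
        t_censor = rng.expovariate(censor_hazard)
        exposure += min(t_event, t_censor)
        events += t_event <= t_censor
    return events, exposure


def hr_upper_limit(d_trt, t_trt, d_ctl, t_ctl, z=1.96):
    """Upper limit of a Wald-type 95% CI for the HR under exponential models.

    Requires at least one event in each arm.
    """
    log_hr = math.log((d_trt / t_trt) / (d_ctl / t_ctl))
    se = math.sqrt(1 / d_trt + 1 / d_ctl)
    return math.exp(log_hr + z * se)


def share_below_threshold(true_hr, threshold, n_per_arm=300, n_sim=500, seed=1):
    """Share of simulated trials whose upper HR limit is below threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        d_ctl, t_ctl = simulate_arm(n_per_arm, 1.0, 0.3, rng)
        d_trt, t_trt = simulate_arm(n_per_arm, true_hr, 0.3, rng)
        if hr_upper_limit(d_trt, t_trt, d_ctl, t_ctl) < threshold:
            hits += 1
    return hits / n_sim
```

Calling `share_below_threshold` for a grid of true HRs and a fixed placeholder threshold gives a crude picture of how often a threshold-based rule would award a given category.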

References:

[1] N.I. Cherny, U. Dafni et al. (2017): ESMO-Magnitude of Clinical Benefit Scale version 1.1. Annals of Oncology, 28:2340-2366

[2] N.I. Cherny, R. Sullivan et al. (2015): A standardised, generic, validated approach to stratify the magnitude of clinical benefit that can be anticipated from anti-cancer therapies: the European Society for Medical Oncology Magnitude of Clinical Benefit Scale (ESMO-MCBS). Annals of Oncology, 26:1547-1573

[3] U. Dafni, D. Karlis et al. (2017): Detailed statistical assessment of the characteristics of the ESMO Magnitude of Clinical Benefit Scale (ESMO-MCBS) threshold rules. ESMO Open, 2:e000216

[4] G. Skipka, B. Wieseler et al. (2016): Methodological approach to determine minor, considerable, and major treatment effects in the early benefit assessment of new drugs. Biometrical Journal, 58:43-58

**Independent Censoring in Event-Driven Trials with Staggered Entry**

Jasmin Rühl

*Universitätsmedizin Göttingen, Germany*

In the pharmaceutical field, randomised clinical trials with time-to-event endpoints are frequently stopped after a pre-specified number of events has been observed. This practice, however, leads to dependent data and non-random censoring, which generally cannot be remedied by conditioning on the underlying baseline information.

If the observation period starts at the same time for all of the subjects, the assumption of independent censoring in the counting process sense is valid (cf. Andersen et al., 1993, p. 139), and the common methods for analysing time-to-event data can be applied. The situation is less clear when staggered study entry is considered. We demonstrate that this study design indeed entails general independent censoring in the sense of Andersen et al.

By means of simulations, we further investigate possible consequences of employing techniques such as the non-parametric bootstrap that make the more restrictive assumption of random censoring. The results indicate that the dependence in event-driven data with staggered entry is generally too weak to affect the outcomes; however, in settings where only a few occurrences of the event of interest are observed, the implications become clearer.
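The data-generating mechanism of an event-driven trial with staggered entry can be sketched as follows. This is a minimal illustration under assumptions not taken from the abstract: uniform accrual over a fixed period, exponential event times, and no other censoring mechanism. The function name and all parameters are hypothetical.

```python
import random


def event_driven_trial(n, target_events, accrual_period, hazard, seed=1):
    """Simulate one event-driven trial with staggered entry.

    Returns the calendar stopping time and a list of
    (follow-up time, status) pairs, with status 1 = event, 0 = censored.
    Requires target_events <= n.
    """
    rng = random.Random(seed)
    entry = [rng.uniform(0, accrual_period) for _ in range(n)]
    survival = [rng.expovariate(hazard) for _ in range(n)]

    # The trial stops at the calendar time of the target_events-th event.
    calendar_events = sorted(e + s for e, s in zip(entry, survival))
    stop = calendar_events[target_events - 1]

    data = []
    for e, s in zip(entry, survival):
        if e >= stop:
            continue  # subject would have entered after the trial stopped
        follow_up = min(s, stop - e)
        data.append((follow_up, int(e + s <= stop)))
    return stop, data
```

Because the stopping time depends on all event times, each subject's censoring time `stop - e` is a function of the other subjects' outcomes, which is the source of the dependence discussed above.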