Statistical Software Development

Chairs: Fabian Scheipl and Gernot Wassmer

A Web-Application to determine statistical optimal designs for dose-response trials, especially with interactions.
Tim Holland-Letz, Annette Kopp-Schneider
German Cancer Research Center DKFZ, Germany

Statistical optimal design theory is well developed but almost never used in practical applications in fields such as toxicology. For dose-response trials, we therefore present an R-Shiny-based web application that calculates D-optimal designs for the most commonly fitted dose-response functions, namely the log-logistic and the Weibull function. The application also generates a graphical representation of the design space (a “design heatmap”) and allows users to check the efficiency of user-specified designs. In addition, uncertainty about the true parameter values can be included in the form of average optimal designs. Thus, the user can find a design that is a compromise between rigid optimality and more practical designs that also incorporate specific preferences and technical requirements.
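As an illustration of what checking the efficiency of a user-specified design involves, the following Python sketch computes a D-efficiency under a simplified two-parameter log-logistic curve f(d) = 1 / (1 + (d/ED50)^slope) with homoscedastic errors. The parameter values, the dose grid, and all function names are assumptions made for this example and are not taken from the application. The reference design is only the best two-point, equal-weight design found on a coarse grid, used here as a stand-in for a D-optimal design.

```python
import math
from itertools import combinations


def gradient(dose, slope, ed50):
    """Gradient of f(d) = 1 / (1 + (d/ed50)**slope) w.r.t. (slope, ed50)."""
    u = (dose / ed50) ** slope
    denom = (1.0 + u) ** 2
    d_slope = -u * math.log(dose / ed50) / denom
    d_ed50 = u * slope / (ed50 * denom)
    return d_slope, d_ed50


def information_det(doses, weights, slope, ed50):
    """Determinant of the 2x2 information matrix M = sum_i w_i g_i g_i^T."""
    m11 = m12 = m22 = 0.0
    for dose, weight in zip(doses, weights):
        g1, g2 = gradient(dose, slope, ed50)
        m11 += weight * g1 * g1
        m12 += weight * g1 * g2
        m22 += weight * g2 * g2
    return m11 * m22 - m12 * m12


def d_efficiency(doses, weights, ref_doses, ref_weights, slope, ed50):
    """D-efficiency (det M / det M_ref)**(1/p) with p = 2 parameters."""
    ratio = (information_det(doses, weights, slope, ed50)
             / information_det(ref_doses, ref_weights, slope, ed50))
    return ratio ** 0.5


# Assumed "true" parameters (hypothetical values for illustration).
slope, ed50 = 2.0, 10.0

# Coarse log-spaced dose grid from 0.1 to 1000 (an assumption).
grid = [10 ** (k / 20) for k in range(-20, 61)]

# Stand-in reference: best two-point, equal-weight design on the grid.
ref_weights = (0.5, 0.5)
ref_doses = max(
    combinations(grid, 2),
    key=lambda pair: information_det(pair, ref_weights, slope, ed50),
)

# A user-specified design: five equally weighted doses.
user_doses = [1.0, 3.0, 10.0, 30.0, 100.0]
user_weights = [0.2] * 5
efficiency = d_efficiency(user_doses, user_weights,
                          ref_doses, ref_weights, slope, ed50)
print(f"reference doses: {ref_doses[0]:.2f}, {ref_doses[1]:.2f}")
print(f"D-efficiency of the user design: {efficiency:.3f}")
```

An efficiency well below 1 indicates that the user design needs correspondingly more observations than the reference design to reach the same precision of the parameter estimates.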

Finally, the app can also be used to compute designs for interaction trials in which two substances are combined in a ray design setup, including an a priori estimate of the parameters of the combination expected under the (Loewe) additivity assumption.
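To make the Loewe additivity assumption concrete: a combination (d1, d2) produces effect level e if d1/D1(e) + d2/D2(e) = 1, where Di(e) is the dose of substance i alone that produces e. On a ray with a fixed fraction p of substance 1 in the total dose, the total dose reaching effect e is therefore 1 / (p/D1(e) + (1 - p)/D2(e)). The Python sketch below evaluates this for two simplified log-logistic curves f(d) = 1 / (1 + (d/ED50)^slope). The parameter values and function names are hypothetical, and this is not necessarily how the application derives its a priori estimate.

```python
def dose_for_effect(effect, slope, ed50):
    """Invert f(d) = 1 / (1 + (d/ed50)**slope) for the dose d."""
    return ed50 * ((1.0 - effect) / effect) ** (1.0 / slope)


def loewe_ray_dose(effect, fraction_1, params_1, params_2):
    """Total dose on a ray (fraction_1 of substance 1) that yields `effect`
    under Loewe additivity: 1 / (p / D1(e) + (1 - p) / D2(e))."""
    d1 = dose_for_effect(effect, *params_1)
    d2 = dose_for_effect(effect, *params_2)
    return 1.0 / (fraction_1 / d1 + (1.0 - fraction_1) / d2)


# Hypothetical single-substance parameters (slope, ED50).
substance_1 = (2.0, 10.0)
substance_2 = (2.0, 40.0)

# Expected ED50 of a 50:50 mixture under Loewe additivity.
mixture_ed50 = loewe_ray_dose(0.5, 0.5, substance_1, substance_2)
print(f"expected mixture ED50: {mixture_ed50:.2f}")
```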

Distributed Computation of the AUROC-GLM Confidence Intervals Using DataSHIELD
Daniel Schalk1, Stefan Buchka2, Ulrich Mansmann2, Verena Hoffmann2
1Department of Statistics, LMU Munich; 2The Institute for Medical Information Processing, Biometry, and Epidemiology, LMU Munich

Distributed calculation protects data privacy without ruling out complex statistical analyses. Individual data stay in local databases, invisible to the analyst, who only receives aggregated results. We present a distributed algorithm that calculates a ROC curve and its AUC estimate with confidence interval (CI) in order to evaluate a therapeutic decision rule. It will be embedded in the DataSHIELD framework [1].

The starting point is the ROC-GLM approach by Pepe et al. [2]. The additivity of the Fisher information matrix, of the score vector, and of the CI proposed by DeLong [3] makes it possible to aggregate intermediate results and thus to design a distributed algorithm that calculates estimates of the ROC-GLM, its AUC, and the CI.
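The additivity argument can be illustrated with a small Python sketch: each site computes only its own score vector and Fisher information at the current parameter value, and the analyst sums these aggregates to perform a Fisher scoring step. For brevity the sketch uses a logistic model with an intercept and one covariate, which is an assumption of this example; it is not the ROC-GLM itself and not the DataSHIELD implementation. The data and the split into three sites are simulated.

```python
import math
import random


def site_summaries(xs, ys, beta):
    """Score vector and Fisher information computed locally on one site."""
    s0 = s1 = i00 = i01 = i11 = 0.0
    for x, y in zip(xs, ys):
        mu = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        w = mu * (1.0 - mu)
        s0 += y - mu
        s1 += (y - mu) * x
        i00 += w
        i01 += w * x
        i11 += w * x * x
    return (s0, s1), (i00, i01, i11)


def fisher_scoring(sites, n_iter=25):
    """Fisher scoring that only uses summed per-site aggregates."""
    beta = [0.0, 0.0]
    for _ in range(n_iter):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for xs, ys in sites:
            (a0, a1), (b00, b01, b11) = site_summaries(xs, ys, beta)
            s0, s1 = s0 + a0, s1 + a1
            i00, i01, i11 = i00 + b00, i01 + b01, i11 + b11
        det = i00 * i11 - i01 * i01
        # Update beta by I^{-1} s, using the explicit 2x2 inverse.
        beta[0] += (i11 * s0 - i01 * s1) / det
        beta[1] += (-i01 * s0 + i00 * s1) / det
    return beta


# Simulated data (assumed true coefficients -0.5 and 1.2).
random.seed(1)
all_x = [random.gauss(0.0, 1.0) for _ in range(600)]
all_y = [
    1.0 if random.random() < 1.0 / (1.0 + math.exp(0.5 - 1.2 * x)) else 0.0
    for x in all_x
]

# Three sites of unequal size; individual records never leave a site.
sites = [
    (all_x[:100], all_y[:100]),
    (all_x[100:350], all_y[100:350]),
    (all_x[350:], all_y[350:]),
]

distributed = fisher_scoring(sites)
pooled = fisher_scoring([(all_x, all_y)])
print("distributed:", distributed)
print("pooled:     ", pooled)
```

Because sums over observations decompose into sums over sites, the distributed estimate agrees with the pooled one up to floating-point rounding.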

We simulate scores and labels (responses) to create AUC values in the range [0.5, 1]. The size of the individual studies is uniformly distributed on [100, 2500], while the proportion of treatment response covers [0.2, 0.8]. Per scenario, 10000 studies are produced. For each study, the AUC is calculated in a non-distributed empirical setting as well as in a distributed setting. The difference in AUC between the two approaches is independent of the number of distributed components and lies within [-0.019, 0.013]. The boundaries of the bootstrapped CIs in the non-distributed empirical setting are close to those of the distributed approach with the DeLong CI: the differences in the lower boundary range over [-0.015, 0.03], and the differences in the upper boundary over [-0.012, 0.026].
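For reference, the empirical AUC and a DeLong-type CI on pooled data can be sketched in a few lines of Python. This is the textbook, non-distributed version on the untransformed AUC scale, not the distributed algorithm described here. The toy scores and the function names are made up for illustration, and the interval is not clipped to [0, 1].

```python
import math


def psi(pos, neg):
    """Pairwise kernel: 1 if the positive scores higher, 0.5 on ties."""
    if pos > neg:
        return 1.0
    return 0.5 if pos == neg else 0.0


def delong_auc_ci(pos_scores, neg_scores, z=1.959964):
    """Empirical AUC with a DeLong-type Wald CI (default: 95%)."""
    m, n = len(pos_scores), len(neg_scores)
    # Structural components per positive and per negative observation.
    v10 = [sum(psi(p, q) for q in neg_scores) / n for p in pos_scores]
    v01 = [sum(psi(p, q) for p in pos_scores) / m for q in neg_scores]
    auc = sum(v10) / m
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    se = math.sqrt(s10 / m + s01 / n)
    return auc, (auc - z * se, auc + z * se)


# Toy scores for responders (positives) and non-responders (negatives).
positives = [0.8, 0.6, 0.4]
negatives = [0.7, 0.3, 0.2]
auc, (lower, upper) = delong_auc_ci(positives, negatives)
print(f"AUC = {auc:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```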

The distributed algorithm allows anonymous multicentric validation of the discrimination of classification rules. A specific application is the audit use case within the MII consortium DIFUTURE. The multicentric prospective ProVAL-MS study (DRKS: 00014034) on patients with newly diagnosed relapsing-remitting multiple sclerosis provides the data for a privacy-protected validation of a treatment decision score (also developed by DIFUTURE) regarding discrimination between good and insufficient treatment response. The simulation results demonstrate that our algorithm is suitable for the planned validation. The algorithm is implemented in R to be used within DataSHIELD. It will be made publicly available.

[1] Gaye, A. et al. (2014). DataSHIELD: taking the analysis to the data, not the data to the analysis. International Journal of Epidemiology.

[2] Pepe, M. S. (2003). The statistical evaluation of medical tests for classification and prediction. Medicine.

[3] DeLong, E. R., DeLong, D. M., and Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, pages 837–845.

Interactive review of safety data during a data monitoring committee using R-Shiny
Tobias Mütze1, Bo Wang2, Douglas Robinson2
1Statistical Methodology, Novartis Pharma AG, Switzerland; 2Scientific Computing and Consulting, Novartis Pharma AG, Switzerland

In clinical trials, it is common that the safety of patients is monitored by a data monitoring committee (DMC) that operates independently of the clinical trial teams. After each review of the accumulating trial data, it is the DMC’s responsibility to decide whether to continue or stop the trial. The data are generally presented to DMCs in a static report through tables, listings, and sometimes figures. In this presentation, we share our experiences with supplementing the safety data review with an interactive R-Shiny app. We will first present the layout and content of the app. Then, we outline the advantages of reviewing (safety) data by means of an interactive app compared to the standard review of a DMC report, namely the extensive use of graphical illustrations in addition to tables, the ability to quickly change the level of detail, and the ability to switch between study-level data and subject-level data. We argue that this leads to a robust collaborative discussion and a more complete understanding of the data. Finally, we discuss the qualification process of an R-Shiny app and outline how the learnings may be applied to enhance standard DMC reports.



An R package for an integrated evaluation of statistical approaches to cancer incidence projection
Maximilian Knoll1,2,3,4, Jennifer Furkel1,2,3,4, Jürgen Debus1,3,4, Amir Abdollahi1,3,4, André Karch5, Christian Stock6,7
1Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany; 2Faculty of Biosciences, Heidelberg University, Heidelberg, Germany; 3Clinical Cooperation Unit Radiation Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany; 4German Cancer Consortium (DKTK) Core Center Heidelberg, Heidelberg, Germany; 5Institute of Epidemiology and Social Medicine, University of Muenster, Muenster, Germany.; 6Institute of Medical Biometry and Informatics (IMBI), University of Heidelberg, Heidelberg, Germany; 7Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany

Background: Projection of future cancer incidence is an important task in cancer epidemiology. The results are also of interest for biomedical research and public health policy. Age-Period-Cohort (APC) models, usually based on long-term cancer registry data (>20 years), are established for such projections. In many countries (including Germany), however, nationwide long-term data are not yet available. It is unclear which statistical approach should be recommended for projections based on rather short-term data.

Methods: To enable a comparative analysis of the performance of statistical approaches to cancer incidence projection, we developed an R package (incAnalysis), supporting in particular Bayesian models fitted by Integrated Nested Laplace Approximations (INLA). Its use is demonstrated by an extensive empirical evaluation of operating characteristics (bias, coverage and precision) of potentially applicable models differing in complexity. Observed long-term data from three cancer registries (SEER-9, NORDCAN, Saarland) were used for benchmarking.
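The three operating characteristics can be made concrete with a short Python sketch. The definitions used here (mean relative bias, share of observed values inside the projection interval, and mean interval width as an inverse measure of precision) are assumptions for illustration and need not match those implemented in incAnalysis. The function name and all numbers are invented.

```python
def operating_characteristics(observed, projected, lower, upper):
    """Bias, coverage and mean interval width of a set of projections.

    Assumed definitions for this sketch:
      bias     = mean of (projected - observed) / observed
      coverage = share of observed values inside [lower, upper]
      width    = mean interval width (wider means less precise)
    """
    n = len(observed)
    bias = sum((p - o) / o for p, o in zip(projected, observed)) / n
    coverage = sum(
        1 for o, lo, up in zip(observed, lower, upper) if lo <= o <= up
    ) / n
    width = sum(up - lo for lo, up in zip(lower, upper)) / n
    return {"bias": bias, "coverage": coverage, "width": width}


# Invented incidence rates per 100,000 for four projection years.
observed = [100.0, 120.0, 150.0, 160.0]
projected = [110.0, 115.0, 150.0, 180.0]
lower = [90.0, 100.0, 120.0, 165.0]
upper = [130.0, 130.0, 180.0, 200.0]

result = operating_characteristics(observed, projected, lower, upper)
print(result)
```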

Results: Overall, coverage was high (mostly >90%) for Bayesian APC models (BAPC), whereas less complex models showed differences in coverage depending on the projection period. Intercept-only models yielded coverage values below 20%. Bias increased and precision decreased for longer projection periods (>15 years) for all except intercept-only models. Precision was lowest for complex models such as BAPC models, generalized additive models with multivariate smoothers and generalized linear models with age × period interaction effects.

Conclusion: The incAnalysis R package allows a straightforward comparison of cancer incidence rate projection approaches. Further detailed and targeted investigations into model performance in addition to the presented empirical results are recommended to derive guidance on appropriate statistical projection methods in a given setting.

Using Differentiable Programming for Flexible Statistical Modeling
Maren Hackenberg1, Marlon Grodd1, Clemens Kreutz1, Martina Fischer2, Janina Esins2, Linus Grabenhenrich2, Christian Karagiannidis3, Harald Binder1
1Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Germany; 2Robert Koch Institute, Berlin, Germany; 3Department of Pneumology and Critical Care Medicine, Cologne-Merheim Hospital, ARDS and ECMO Center, Kliniken der Stadt Köln, Witten/Herdecke University Hospital, Cologne, Germany

Differentiable programming has recently received much interest as a paradigm that facilitates taking gradients of computer programs. While the corresponding flexible gradient-based optimization approaches so far have been used predominantly for deep learning or enriching the latter with modeling components, we want to demonstrate that they can also be useful for statistical modeling per se, e.g., for quick prototyping when classical maximum likelihood approaches are challenging or not feasible.

In an application from a COVID-19 setting, we utilize differentiable programming to quickly build and optimize a flexible prediction model adapted to the data quality challenges at hand. Specifically, we develop a regression model, inspired by delay differential equations, that can bridge temporal gaps of observations in the central German registry of COVID-19 intensive care cases for predicting future demand. With this exemplary modeling challenge, we illustrate how differentiable programming can enable simple gradient-based optimization of the model by automatic differentiation. This allowed us to quickly prototype a model under time pressure that outperforms simpler benchmark models.
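The principle of gradient-based optimization via automatic differentiation can be shown with a self-contained Python sketch. It uses hand-rolled dual numbers (forward-mode differentiation) as a stand-in for a differentiable programming framework, and a simple lag-1 regression y_t ≈ a + b·y_{t-1} on a synthetic series as a stand-in for the model. Neither is the delay-differential-equation-inspired model nor the tooling used in the application; all names and values are illustrative assumptions.

```python
import math


class Dual:
    """Number carrying a value and its derivative w.r.t. one parameter."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def loss(a, b, series):
    """Mean squared one-step-ahead error of y_t = a + b * y_{t-1}."""
    total = 0.0
    for prev, cur in zip(series, series[1:]):
        resid = cur - (a + b * prev)
        total = total + resid * resid
    return total * (1.0 / (len(series) - 1))


def gradient(a, b, series):
    """Gradient of the loss: one forward pass per parameter."""
    d_a = loss(Dual(a, 1.0), Dual(b, 0.0), series).deriv
    d_b = loss(Dual(a, 0.0), Dual(b, 1.0), series).deriv
    return d_a, d_b


# Synthetic series (assumption): autoregression with a periodic shock.
series = [1.0]
for t in range(1, 40):
    series.append(0.5 + 0.8 * series[-1] + 0.3 * math.sin(t))

# Plain gradient descent from (0, 0) with a fixed step size.
a, b = 0.0, 0.0
initial_loss = loss(a, b, series)
for _ in range(200):
    grad_a, grad_b = gradient(a, b, series)
    a -= 0.05 * grad_a
    b -= 0.05 * grad_b
final_loss = loss(a, b, series)
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f} (a={a:.3f}, b={b:.3f})")
```

The loss function is written as ordinary code, and changing the model only requires changing `loss`; the gradient code stays the same, which is what makes this style convenient for quick prototyping.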

We thus exemplify the potential of differentiable programming also outside deep learning applications, to provide more options for flexible applied statistical modeling.