Personalised Medicine

Learning about personalised effects: transporting anonymized information from individuals to (meta-) analysis and back
Els Goetghebeur
Ghent University, Belgium

Evidence on `personalized‘ or stratified medicine draws information from subject-specific records on prognostic factors, (point) treatment and outcome from relevant population samples. A focus on treatment by covariate interactions makes such studies more data hungry than a typical trial focusing on a population average effect. Nationwide disease registers or individual patient meta-analysis may overcome sample size issues, but encounter new challenges, especially when targeting a risk or survival outcome. Between study heterogeneity comes as a curse and a blessing when aiming to transport treatment effects to new patient populations. It reveals sources of variation with specific roles in the transportation. For survival analysis special attention must be given to calendar time and internal time. A core set of covariates is needed for the stratified analysis, ideally measured with similar precision. A shared minimum follow-up time, and well understood censoring mechanisms are expected. Variation in baseline hazards under standard of care may reflect between study variation in diagnostic criteria, in populations, in unmeasured baseline covariates, in standard of care delivery and its impact.

When core set covariates are lacking for some studies or over certain stretches of time, a range of solutions may be considered. Different assumptions on missing covariates of survival models will affect treatment balancing IPW and direct standardization methods differentially. Alternatively, one may seek to link data from different sources to fill the gaps. It then pays to consider models with estimators that can be calculated from (iteratively) constructed summary statistics involving weighted averages over functions of the missing covariates. By thus avoiding the need for additional individually linked measures, one may open access to a range of existing covariates or biomarkers that can be merged while circumventing (time)consuming lengthy confidentiality agreements.

In light of the above we discuss pros and cons of various methods of standardizing effects to obtain transportable answers that are meaningfully compared between studies. We thus aim to provide relevant evidence on stratified interventions referring to several case studies.

Precision medicine in action – the FIRE3 NGS study
Laura Schlieker1, Nicole Krämer1, Volker Heinemann2,3, Arndt Stahler4, Sebastian Stintzing3,4
1Staburo GmbH, Germany; 2Department of Medicine III, University Hospital, University of Munich, Germany; 3DKTK, German Cancer Consortium, German Cancer Research Centre (DKFZ); 4Medical Department, Division of Hematology, Oncology and Tumor Immunology (CCM), Charité Universitaetsmedizin Berlin

The choice of the right treatment for patients based on their individual genetic profile is of utmost importance in precision medicine. To identify potential signals within the large number of biomarkers it is mandatory to define criteria for signal detection beforehand and apply appropriate statistical models in the setting of high dimensional data.

For the identification of predictive and prognostic genetic variants as well as tumor mutational burden (TMB) in patients with metastatic colorectal cancer, we derived the following pre-defined and hierarchical criteria for signal detection.

a) All biomarkers identified via a multivariate variable selection procedure

b) If a) reveals no signal, all biomarkers with adjusted p-value ≤ 0.157

c) If neither a) nor b) reveals signals, the top 5 biomarkers according to sorted, adjusted p-value

Regularized regression models were used for variable selection, and the stability of the selection process was quantified and visualized. Selected biomarkers were analyzed in terms of their predictive potential on a continuous scale.

With our analyses we confirmed the predictive potential of several already known biomarkers and identified additional promising candidate variants. Furthermore, we identified TMB as a potential prognostic biomarker with a trend towards prolonged survival for patients with high TMB.

Our analyses were supported by power simulations for the variable selection method, assuming different prevalences of biomarkers, numbers of truly predictive biomarkers and effect sizes.

Tree-based Identification of Predictive Factors in Randomized Trials using Weibull Regression
Wiebke Werft1, Julia Krzykalla2, Dominic Edelmann2, Axel Benner2
1Hochschule Mannheim University of Applied Sciences, Germany; 2German Cancer Research Center (DKFZ), Heidelberg, Germany

Keywords: Predictive biomarkers, Effect modification, Random forest, Time-to-event endpoint, Weibull regression

Novel high-throughput technology provides detailed information on the biomedical characteristics of each patient’s disease. These biomarkers may qualify as predictive factors that distinguish patients who benefit from a particular treatment from patients who do not. Hence, large numbers of biomarkers need to be tested in order to gain evidence for tailored treatment decisions („personalized medicine“). Tree-based methods divide patients into subgroups with differential treatment effects in an automated and data-driven way without requiring extensive pre-specification. Most of these methods mainly aim for a precise prediction of the individual treatment effect, thereby ignoring interpretability of the tree/random forest.

We propose a modification of the model-based recursive partitioning (MOB) approach for subgroup analyses (Seibold, Zeileis et al. 2016), the so-called predMOB, that is able to specifically identify predictive factors (Krzykalla, Benner et al. 2020) from a potentially large number of candidate biomarkers. The original predMOB is developed for normally distributed endpoints only. To widen the field of application, particularly for time-to-event endpoints, we enhanced the predMOB to these situations. More specifically, we use Weibull regression for the base model in the nodes of the tree since MOB and predMOB require fully parametrized models. However, the Weibull model includes the shape parameter as a nuisance parameter which has to be fixed to focus on predictive biomarkers only.

The performance of this extension of the predMOB is assessed concerning identification of the predictive factors as well as prediction accuracy of the individual treatment effect and the predictive effects. Using simulation studies, we are able to show that predMOB provides a targeted approach to predictive factors by reducing the erroneous selection of biomarkers that are only prognostic.

Furthermore, we apply our method to a data set of primary biliary cirrhosis (PBC) patients treated with D-penicillamine or placebo in order to compare our results to those obtained by Su et al. 2008. The aim is to identify predictive factors with respect to overall survival. On the whole, similar variables are identified, but the ranking differs.


Krzykalla, J., et al. (2020). „Exploratory identification of predictive biomarkers in randomized trials with normal endpoints.“ Statistics in Medicine 39(7): 923-939.

Seibold, H., et al. (2016). „Model-based recursive partitioning for subgroup analyses.“ The International Journal of Biostatistics 12(1): 45-63.

Su. X., et al. (2008). “Interaction trees with censored survival data.” The International Journal of Biostatistics 4(1).