P001 - A comparison of methods to develop individualized treatment rules on real data - 20/04/23

Abstract |
Introduction |
Identifying subgroups of patients who benefit from a treatment is a key aspect of personalized medicine. This can be achieved by developing individualized treatment rules (ITRs), which map individual characteristics to a treatment. Many machine learning algorithms have been proposed to develop such rules. It is unclear to what extent those algorithms lead to the same ITRs, i.e. recommend the treatment for the same individuals. To address this question, we compared the most common approaches on two multi-center randomized controlled trials: the International Stroke Trial (IST) and the CRASH-3 trial.
Methods |
We distinguished two classes of algorithms for developing an ITR. Algorithms in the first class predict each individual's outcome under each of the treatments compared, derive individualized treatment effects from these predictions, and then form an ITR by recommending treatment to those with a predicted benefit. Algorithms in the second class estimate the ITR directly by minimizing a loss function, without estimating the purely prognostic components of the outcome model. Most of the algorithms we compared fell under the first class: meta-learners (T-learner, S-learner, X-learner and DR-learner, each with parametric and non-parametric models), generalized random forests (GRF) and virtual twins (VT). A-learning, Weighting, outcome weighted learning (OWL) and contrast weighted learning (CWL) fell under the second class. When using non-parametric models, we compared results with and without cross-fitting. For each trial, we assessed the ITRs in terms of the value of the rule, the average benefit of treatment among people with a positive score, the average benefit of no treatment among people with a negative score, the population average prescription effect and the c-statistic for benefit. We assessed pairwise agreement between ITRs using Cohen's kappa and the Matthews correlation coefficient.
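As an illustration of the first class of algorithms and of the agreement measures, the following minimal Python sketch derives two ITRs from predicted outcomes and compares them with Cohen's kappa and the Matthews correlation coefficient. It is not the code used in the study. The predicted risks are invented numbers standing in for the output of two hypothetical fitted models, the outcome is assumed to be unfavorable (so a lower predicted risk under treatment counts as a benefit), and all function names are chosen here for illustration.

```python
import math


def itr_from_predicted_risks(risk_treated, risk_control):
    """Recommend treatment (1) when the predicted risk is lower under treatment."""
    return [1 if r1 < r0 else 0 for r1, r0 in zip(risk_treated, risk_control)]


def agreement_counts(rule_a, rule_b):
    """Return the 2x2 counts (n11, n10, n01, n00) for two binary rules."""
    pairs = list(zip(rule_a, rule_b))
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)
    return n11, n10, n01, n00


def cohen_kappa(rule_a, rule_b):
    """Chance-corrected agreement between two binary rules."""
    n11, n10, n01, n00 = agreement_counts(rule_a, rule_b)
    n = n11 + n10 + n01 + n00
    observed = (n11 + n00) / n
    expected = ((n11 + n10) * (n11 + n01) + (n01 + n00) * (n10 + n00)) / n**2
    if expected == 1.0:
        return float("nan")  # undefined when both rules are constant and equal
    return (observed - expected) / (1.0 - expected)


def matthews_corrcoef(rule_a, rule_b):
    """Matthews correlation coefficient between two binary rules."""
    n11, n10, n01, n00 = agreement_counts(rule_a, rule_b)
    denom = math.sqrt((n11 + n10) * (n11 + n01) * (n00 + n10) * (n00 + n01))
    if denom == 0.0:
        return float("nan")  # undefined when either rule is constant
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical predicted risks for 8 individuals from two different models.
risk_treated_a = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.15, 0.45]
risk_control_a = [0.20, 0.30, 0.40, 0.30, 0.40, 0.50, 0.25, 0.35]
risk_treated_b = [0.10, 0.20, 0.35, 0.40, 0.50, 0.40, 0.15, 0.45]
risk_control_b = [0.20, 0.30, 0.30, 0.30, 0.40, 0.50, 0.25, 0.35]

itr_a = itr_from_predicted_risks(risk_treated_a, risk_control_a)
itr_b = itr_from_predicted_risks(risk_treated_b, risk_control_b)

kappa = cohen_kappa(itr_a, itr_b)
mcc = matthews_corrcoef(itr_a, itr_b)
print(itr_a, itr_b, kappa, mcc)
```

In this toy example the two rules give the same recommendation for 6 of 8 individuals, yet both agreement coefficients are only 0.5, because half of that raw agreement would be expected by chance.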
Results |
The ITRs obtained from the different algorithms generally disagreed considerably about which individuals should be treated. Concordance was better among algorithms of the same family (e.g. among all meta-learners with parametric models, or all meta-learners with non-parametric models and cross-fitting). Overall, when the ITRs were evaluated in a hold-out validation sample (33% of the original sample, selected at random), all algorithms produced ITRs with limited performance, regardless of their performance in the training set, which suggests a high potential for overfitting.
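The hold-out evaluation can be sketched as follows, under stated assumptions. The abstract does not say which estimator was used for the value of the rule; this sketch uses an inverse-probability-weighted estimator, which is one common choice in a randomized trial with a known allocation probability. The outcome is assumed to be coded 1 for a favorable result, the allocation probability of 0.5 is an assumption, and the data and function names are invented for illustration.

```python
import random


def split_holdout(n, holdout_fraction=0.33, seed=0):
    """Randomly split indices 0..n-1 into (training, hold-out) lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n * holdout_fraction)
    return sorted(indices[n_holdout:]), sorted(indices[:n_holdout])


def ipw_value(outcomes, treatments, recommendations, p_treat=0.5):
    """Inverse-probability-weighted estimate of the value of a rule.

    Only individuals whose randomized treatment matches the rule's
    recommendation contribute, each weighted by the inverse of the
    probability of having received that treatment.
    """
    total = 0.0
    for y, a, d in zip(outcomes, treatments, recommendations):
        if a == d:
            prob = p_treat if a == 1 else 1.0 - p_treat
            total += y / prob
    return total / len(outcomes)


# Hypothetical hold-out data: favorable outcome (1/0), randomized arm, rule output.
outcomes = [1, 0, 1, 1, 0, 1, 0, 1]
treatments = [1, 1, 0, 0, 1, 0, 1, 0]
recommendations = [1, 0, 0, 1, 1, 0, 0, 1]

value = ipw_value(outcomes, treatments, recommendations, p_treat=0.5)
train_idx, holdout_idx = split_holdout(100, holdout_fraction=0.33, seed=0)
print(value, len(train_idx), len(holdout_idx))
```

Computing the same estimate on the training indices and on the hold-out indices, and comparing the two, is one simple way to see the kind of optimism described above.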
Conclusion |
The methods are not interchangeable, which raises concerns about their practical use. This finding should be considered when developing ITRs in practice.
Keywords |
Personalized medicine, Individualized treatment rules, Causal inference, Machine learning
Declaration of competing interests |
The authors have not disclosed any potential competing interests.
Vol 71 - N° S2, Article 101641 - May 2023