Affiliations 

  • 1 School of Physics, University of Western Australia, Australia; School of Health Sciences, National University of Malaysia, Malaysia.
  • 2 School of Physics, University of Western Australia, Australia; Department of Radiation Oncology, Sir Charles Gairdner Hospital, Australia
  • 3 Institute for Health Research, University of Notre Dame, Fremantle, Australia
  • 4 Department of Radiation Oncology, Sir Charles Gairdner Hospital, Australia
  • 5 Department of Radiation Oncology, Sir Charles Gairdner Hospital, Australia; School of Surgery, University of Western Australia, Australia
  • 6 School of Medicine and Public Health, University of Newcastle, Australia
Radiother Oncol, 2016 08;120(2):339-45.
PMID: 27370204 DOI: 10.1016/j.radonc.2016.05.010

Abstract

BACKGROUND AND PURPOSE: Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunction following radiotherapy of the prostate.

MATERIALS/METHODS: Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed.

RESULTS: Fourteen models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve (AUC), ranged from 0.473 to 0.695, generally lower than in internal validation. Four models had an AUC >0.6. Shrinkage of the coefficients was required for all predictive models, ranging from -0.309 (predicted probability was inverse to the observed proportion) to 0.823. Predictive models that included baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients.
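
As an illustration of the three quantities named above (overall performance, discrimination and calibration), the following minimal Python sketch computes them for a toy set of predictions. It is not the authors' code. The abstract does not state which statistics were used for overall performance or calibration, so the Brier score and the logistic recalibration slope are assumptions here, and the function names and toy data are hypothetical.

```python
import math


def auc(y_true, p_pred):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome
    (assumed here as the overall performance measure)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def calibration_slope(y_true, p_pred, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(p_pred) by Newton-Raphson.

    Returns (a, b). A slope b below 1 suggests the model's coefficients need
    shrinkage; a negative b means predictions run opposite to outcomes.
    Requires 0 < p_pred < 1 and outcomes that are not perfectly separated.
    """
    x = [math.log(p / (1 - p)) for p in p_pred]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * xi     # score for the slope
            h00 += w                 # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical toy data: observed outcomes and a published model's predictions.
y = [0, 0, 1, 0, 1, 1, 0, 1]
p = [0.1, 0.3, 0.2, 0.4, 0.6, 0.5, 0.7, 0.9]

toy_auc = auc(y, p)            # 11 of 16 positive/negative pairs ranked correctly
toy_brier = brier(y, p)
toy_intercept, toy_slope = calibration_slope(y, p)
```

A model that returns exactly 0 or 1 for every patient cannot be passed to calibration_slope as written, because the logit of those probabilities is undefined.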

CONCLUSIONS: Predictive models vary in performance and transferability, illustrating the need for improvements in model development and reporting. Several models showed reasonable potential, but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models.
