Development of a prognostic model of overall survival in oropharyngeal cancer from real-world data: PRO.M.E.THE.O.
Objective. The PRO.M.E.THE.O. study (PredictiOn Models in Ent cancer for anti-EGFR based THErapy Optimization) aimed to develop a predictive model (PM) of overall survival (OS) for patients with locally advanced oropharyngeal cancer (LAOC) treated with radiotherapy (RT) and cetuximab (Cet) from an Italian dataset.
Methods. We enrolled patients with LAOC from 6 centres treated with RT-Cet. Clinical and treatment variables were collected. Patients were randomly divided into training (TS) (80%) and validation (VS) (20%) sets. A binary logistic regression model with stepwise feature selection was fitted on the TS and then applied to the VS. Timepoints of 2, 3 and 5 years were considered. The area under the receiver operating characteristic curve (AUC) at 2, 3 and 5 years and confusion matrix statistics at 5-threshold were used as performance criteria.
Results. Overall, 218 patients were enrolled and 174 (79.8%) were analysed. Age at diagnosis, gender, ECOG performance, clinical stage, dose to high-risk volume, overall treatment time and day of RT interruption were considered in the final PMs. The PMs were developed and represented by nomograms with AUC of 0.75, 0.73 and 0.73 for TS and 0.713, 0.713, 0.775 for VS at 2, 3 and 5 years, respectively.
Conclusions. PRO.M.E.THE.O. allows the creation of a PM for OS in patients with LAOC treated with RT-Cet.
Meta-analyses on chemotherapy (CT) in locally advanced head and neck squamous cell carcinoma have demonstrated an overall survival (OS) benefit for the addition of CT to radiotherapy (RT) 1-5. Cetuximab is considered a viable treatment option for patients who are unfit for cisplatin, and has been shown to significantly improve OS when combined with RT compared with RT alone in a randomised phase III trial 3.
Even if squamous cell carcinomas (SCC) of the head and neck region share major risk factors and some clinical features, they still have specific characteristics, different treatment options and variable prognosis, depending on the tumour site and subsite 6.
As new strategies and therapies are being tested, it is becoming apparent that the magnitude of benefit derived from a specific treatment, and the corresponding toxicity profile, may vary in different patient groups 7.
Over the last decade, remarkable advances in cancer care have raised new challenges, leading clinical practice towards personalised medicine, although there remains a gap between evidence from clinical trials and real-world practice 8.
Moreover, studies investigating the physicians’ performance in predicting radiosensitivity and oncological outcomes are currently lacking 9.
Since the clinical introduction of cetuximab, the lack of biomarkers to predict its efficacy has profoundly hampered its routine clinical use. The development of tools that allow physicians to individualise treatment will facilitate the transition from population-based strategies to personalised medicine, with an essential role for decision support systems (DSSs) 10,11. The DSS development process necessarily relies on an ontology that represents knowledge as a set of shared concepts within a domain and the relationships between them, using a large amount of data that closely reflects daily clinical practice 12.
The PRO.M.E.THE.O. (PredictiOn Models in Ent cancer for anti-EGFR based THErapy Optimization) project involved several Italian RT centres to implement a system that is able to analyse large heterogeneous datasets specific for oropharyngeal SCC (OPC). The aim of this project is to develop predictive prognostic models (PPM) of OS at 2, 3 and 5 years for OPC patients treated with RT and cetuximab (bio-radiotherapy [b-RT]) based on a real-world data collection.
Materials and methods
The PRO.M.E.THE.O. project involved 6 Italian Radiation Oncology Departments. The promoting centre was Fondazione Policlinico Universitario A. Gemelli-IRCCS, Rome. A multidisciplinary team of physicians defined the project milestones and a teleconference was scheduled every two weeks for progress updates. The objective of this first phase was to implement a system that can analyse a large heterogeneous dataset which included all available data conforming to a standardised ontology collected from daily activity. The final aim was to develop reliable PPM for OS in OPC at 2, 3 and 5 years for use in clinical management DSS.
CREATION OF AN ONTOLOGY
The first step was the creation of an ontology, as a defined data collection model, which is capable of collecting, standardising and organising features of patients with OPC treated with cetuximab. The ontology is fundamental for the data mining process, as it explicitly declares the clinical variables involved and their mutual relationships. Each variable has four main properties: name, form, field type and level. The existing field types are: text (some variables are described with a multiple-choice option and others with a free-text option); number (integer or decimal); date; table; and files (DICOM and .txt files are the standard file formats chosen for images and treatment data, respectively).
Initially, a small multidisciplinary team based in the promoter centre identified clinical variables to be included in the ontology, and all the six centres validated them. Next, a technical committee consisting of an engineer, a physicist, a physician with experience in data storage, and a software expert was created. This multiprofessional group defined the characteristics required for the ontology to be accepted. These requirements included defining the data type for each field, the possible values allowed, the cardinality of the entries (i.e., single- or multi-selection field) and the range allowed in case of numeric values.
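The requirements listed above (data type, allowed values, cardinality and numeric range for each field) can be illustrated with a minimal Python sketch. This is not the BOA implementation: the class name, attribute names and the two example variables are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple, Union

Number = Union[int, float]


@dataclass(frozen=True)
class OntologyVariable:
    """One ontology entry; attribute names are illustrative assumptions."""
    name: str
    form: str          # input form (CRF) the variable belongs to
    field_type: str    # "text", "number", "date", "table" or "file"
    level: str         # "registry", "procedural" or "research"
    allowed_values: Optional[Sequence[str]] = None       # multiple-choice text fields
    multi_select: bool = False                           # cardinality of the entry
    value_range: Optional[Tuple[Number, Number]] = None  # numeric fields only

    def validate(self, value) -> bool:
        """Return True if `value` satisfies the constraints declared for this variable."""
        if self.field_type == "number":
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                return False
            if self.value_range is not None:
                low, high = self.value_range
                return low <= value <= high
            return True
        if self.field_type == "text" and self.allowed_values is not None:
            if self.multi_select:
                if not isinstance(value, (list, tuple)):
                    return False
                return all(v in self.allowed_values for v in value)
            return value in self.allowed_values
        # Free text, date, table and file checks are omitted in this sketch.
        return True


# Hypothetical example variables (names and constraints are assumptions).
ecog = OntologyVariable("ecog_ps", "registry and history", "number", "registry",
                        value_range=(0, 2))
gender = OntologyVariable("gender", "registry and history", "text", "registry",
                          allowed_values=("male", "female"))
```

Declaring the constraints alongside each variable means that legacy data from any centre can be checked against the same rules before being shared.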
After the formal definition of the ontology and its requirements, the working group and the technical committee were asked to define the tools to share the ontology among centres via a standardised form. To accomplish this task, the “Beyond Ontology Awareness” (BOA) software was developed to reproduce the ontology structure, manage the import of legacy data and coordinate data sharing activities 13,14.
From January 2017 to January 2018, we selected more than 200 variables across 16 input forms related to OPC. The ontology was organised into three levels:
- registry: with exclusive epidemiologic information (age, gender, ethnicity, height, etc.);
- procedural: where treatment information and related toxicities were reported;
- research: where dimensional data, such as radiomics and genomics were collected.
A BOA-Web service platform was created and the centres involved collected data using Case Report Forms (CRFs). A total of 16 CRFs were created for: registry and history; blood and serum test pre-treatment and follow-up; histology; staging c, yc, p, and yp; external beam RT; CT; brachytherapy; surgery; toxicity according to Common Terminology Criteria for Adverse Events (CTCAE) v4.0 and Radiation Therapy Oncology Group (RTOG); follow-up and outcomes.
Standardised data sharing
A network of private connections was implemented between the various centres and a computer system that automatically translated the various available information into extractable parameters that conformed to the ontology. The data was anonymised, encrypted and sent to the central repository via a secure https-based web service. The system allowed the data to be aggregated anonymously into a single ‘large database’ to proceed with analysis of selected patient data.
We retrospectively collected data on patients with OPC treated with b-RT from 2006 to 2018 with curative intent. Inclusion criteria were: age > 18 years, Eastern Cooperative Oncology Group Performance Status (ECOG-PS) 0-2, stage I-IVa, and ineligibility for cisplatin (due to clinical conditions known to contraindicate the use of cisplatin according to international consensus recommendations 15). The only exclusion criterion was the presence of metastatic disease.
Patients underwent RT with different techniques such as 3D, intensity-modulated radiotherapy (IMRT), tomotherapy, volumetric modulated arc therapy (VMAT) with simultaneous integrated boost (SIB), or sequential RT boost. Patients underwent weekly physical evaluation during treatment. Acute toxicities, including cutaneous side effects and onset of rash, were graded according to CTCAE version 4.0. Clinical follow-up consisted of physical examination every 3 months for 3 years from diagnosis and every 6 months thereafter, and was performed alternately by the radiation oncologist, otolaryngologist and medical oncologist or in joint consultation.
Table I shows the clinical and treatment variables that were collected. Categorical covariates were dummy-coded, while numerical covariates were kept in their original form and binarised at different cut-off thresholds as reported in Table II.
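The covariate preparation described here can be sketched as follows. This is an illustration in Python rather than the R code used for the study; the function names, the inclusive-threshold default and the example values are assumptions.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator column per category other than `reference`."""
    categories = sorted(set(values) - {reference})
    return {f"is_{c}": [1 if v == c else 0 for v in values] for c in categories}


def binarise(values, threshold, inclusive=True):
    """Return 1 where a numeric value reaches the cut-off, otherwise 0."""
    if inclusive:
        return [1 if v >= threshold else 0 for v in values]
    return [1 if v > threshold else 0 for v in values]


# Hypothetical values for four patients.
ecog_ps = [0, 1, 2, 0]
dose_gy = [70.0, 60.0, 66.0, 69.9]
interruption_days = [0, 3, 5, 1]

ecog_dummies = dummy_code(ecog_ps, reference=0)
dose_ge_66 = binarise(dose_gy, 66)
interrupted_ge_3 = binarise(interruption_days, 3)
```

Keeping the original numeric column alongside its binarised versions lets the later feature selection step choose whichever form is more informative.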
Overall survival was defined as the time between the date of diagnosis of the neoplasm and the date of last follow-up or death from any cause; 2-, 3- and 5-year time points were considered. For PPM development and internal validation, the dataset was randomly split into training (80%) and validation (20%) sets 16. The selected covariates, and those engineered from them, were included in a multivariable logistic regression model on the training set and further selected by stepwise regression based on the Akaike Information Criterion (AIC). The model was then applied to the validation set to assess performance in terms of the area under the receiver operating characteristic (ROC) curve (AUC). The models were retrained for each OS time point on the entire dataset to be represented in the form of nomograms. Statistical analysis was performed using R version 3.4.4.
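Three steps of this workflow can be sketched in plain Python: turning survival times into a binary outcome at a given time point, the random 80/20 split, and the AUC. The sketch is not the authors' R code. The handling of patients censored before the time point (returned as `None`, to be excluded) is an assumption, since the text does not state how they were treated, and the logistic regression with stepwise AIC selection is not reproduced.

```python
import random


def label_at(time_months, died, horizon_months):
    """Binary outcome at a time point.

    1 = died on or before the horizon, 0 = known to be alive at the horizon,
    None = censored before the horizon (assumed here to be excluded).
    """
    if died and time_months <= horizon_months:
        return 1
    if time_months >= horizon_months:
        return 0
    return None


def split_80_20(items, seed=0):
    """Randomly split items into training (80%) and validation (20%) sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


train_ids, validation_ids = split_80_20(range(174))
```

With 174 patients, this rounding gives 139 training and 35 validation patients, the set sizes reported in Table II.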
Patient characteristics and treatment
From February to December 2018, we collected data on 218 OPC patients. We considered clinical and treatment variables, as shown in Table II. The covariates analysed were: age at diagnosis, gender, ECOG score, smoking status, alcohol consumption, human papilloma virus (HPV) status (p16 immunohistochemistry), cT, cN, tumour grading, dose to high-risk clinical target volume (CTV), fractionation, overall treatment time (OTT) and days of RT interruption (DRTI). DRTI was taken as a surrogate of treatment tolerance. We considered only variables with at least 75% of available values (Tab. II).
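The 75% availability rule can be expressed as a simple filter. The sketch below is illustrative only: the record layout, the variable names and the four example records are assumptions, and missing values are represented as `None`.

```python
def availability(records, variable):
    """Fraction of records in which `variable` is recorded (not None)."""
    if not records:
        raise ValueError("no records supplied")
    present = sum(1 for r in records if r.get(variable) is not None)
    return present / len(records)


def select_variables(records, variables, min_available=0.75):
    """Keep only variables recorded in at least `min_available` of the records."""
    return [v for v in variables if availability(records, v) >= min_available]


# Hypothetical records for four patients.
records = [
    {"ecog": 0, "alcohol": None, "interruption_days": 2},
    {"ecog": 1, "alcohol": "yes", "interruption_days": 0},
    {"ecog": 0, "alcohol": None, "interruption_days": None},
    {"ecog": 2, "alcohol": None, "interruption_days": 5},
]
kept = select_variables(records, ["ecog", "alcohol", "interruption_days"])
```

Applied to the availability figures in Table I, this rule would drop smoking status (63%) and alcohol consumption (23%) while keeping interruption days (83%).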
A total of 174 (79.8%) patients were analysed. All patients underwent combined b-RT with a median OTT of 51 days (range 6-101) with a median dose of 69.9 Gy (17.6-79.2 Gy). This wide dose range is due to the discontinuation of RT treatment for patients receiving less than 50 Gy.
The analysis was performed considering several variables. Categorical covariates were dummy-coded, whereas numerical covariates were kept in their original form and binarised at different cut-off thresholds (such as total RT dose to high-risk CTV ≥ 66 Gy, age at diagnosis > median value, DRTI yes/no, DRTI > 2, > 3, > 5, > 7, > 10, > 20). HPV status was not considered because more than 25% of values were not available (Tab. I). Covariates that showed a negative impact on 2-, 3- and 5-year OS were cN+, ECOG ≥ 1, age, RT dose < 66 Gy and DRTI > 3. At a median follow-up of 57.6 months (range 1.5-142.0), the OS rates at 2, 3 and 5 years were 71.5%, 67.4% and 64.7%, respectively.
No significant difference was found between the distributions of training and validation set covariates. The logistic regression models trained with the covariates selected from the previous stepwise selection are summarised in Table III for the three different OS time points. The performance of these models in terms of AUC is reported in Table IV, along with their respective ROC curves (Fig. 1). The nomograms built from these models are reported in Figures 2, 3 and 4 and two clinical examples are reported in Table V.
In this study, we collected registry and procedural level variables, and prognostic models were developed to predict 2-, 3- and 5-year OS for patients with OPC treated with b-RT. Specifically, nomograms allowed the integration of clinical, treatment-related and epidemiological risk factors, assessing their interactions and estimating the final effect on survival. Most importantly, the PPMs provide personalised, patient-specific estimates of OS that can be used for risk stratification and prognosis discussions with patients. Fundamental to the creation and validation of any PPM is the generation of an ontology, which in this study was used to collect, standardise and organise data from OPC patients treated with cetuximab.
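As an illustration of how a nomogram weighs each factor, the 2-year coefficients in Table III can be rescaled so that the factor with the largest absolute coefficient spans 100 points. This linear rescaling is a common convention and is assumed here; it is not necessarily how the published figures were drawn. The age term is omitted because its coefficient does not appear in the table.

```python
# 2-year logistic regression coefficients as reported in Table III.
COEF_2Y = {
    "N = 0": -1.0439,
    "ECOG PS = 0": -1.3410,
    "RT dose >= 66 Gy": -2.0091,
    "RT interruption days >= 3": 0.9058,
}


def nomogram_points(coefficients):
    """Scale each binary factor to 0-100 points relative to the strongest factor."""
    largest = max(abs(beta) for beta in coefficients.values())
    return {name: 100.0 * abs(beta) / largest for name, beta in coefficients.items()}


points_2y = nomogram_points(COEF_2Y)
for name, pts in sorted(points_2y.items(), key=lambda item: -item[1]):
    print(f"{name}: {pts:.0f} points")
```

On this scale, RT dose spans the full 100 points and the other factors span proportionally fewer, which makes the relative influence of each factor visible at a glance.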
The collection of data from several Italian centres selected for accrual and expertise allowed us to perform an analysis on a large population of patients with OPC based on real-world experience. Our results are consistent with the literature, showing better survival rates than those reported in the Bonner studies 17,18.
The creation of a large database gave us the ability to identify which variables impact clinical outcomes. In our analyses, clinically negative nodal status (p < 0.05), good performance status with ECOG 0 and RT dose ≥ 66 Gy were protective factors, while DRTI ≥ 3 appeared to be a detrimental factor for OS at 2, 3 and 5 years.
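The direction and size of these effects can be read from Table III by converting each logistic regression coefficient into an odds ratio with a Wald 95% confidence interval, exp(β ± 1.96·SE). The sketch below uses the 2-year coefficients and standard errors from Table III. It assumes that the modelled event is death, which the sign pattern and the "Death risk" column of Table V suggest but the text does not state explicitly.

```python
import math

# (coefficient, standard error) at 2 years, as reported in Table III.
MODEL_2Y = {
    "N = 0": (-1.0439, 0.6260),
    "ECOG PS = 0": (-1.3410, 0.4634),
    "RT dose >= 66 Gy": (-2.0091, 0.6714),
    "RT interruption days >= 3": (0.9058, 0.4405),
}


def odds_ratio(coef, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) of the Wald 95% interval."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


odds_2y = {name: odds_ratio(coef, se) for name, (coef, se) in MODEL_2Y.items()}
for name, (ratio, low, high) in odds_2y.items():
    print(f"{name}: OR {ratio:.2f} (95% CI {low:.2f}-{high:.2f})")
```

An odds ratio below 1 corresponds to a protective factor and one above 1 to a detrimental factor; an interval that includes 1 corresponds to a P-value above 0.05 in the table.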
The strength of our project is related to the multicentre origin of the data and their quality due to the relative homogeneity of the population, including patients with oropharynx cancer treated with cetuximab-based RT protocols.
PRO.M.E.THE.O also has the benefit of taking into account treatment compliance, which is a key parameter in patients with OPC, where discontinuation of RT may affect the overall efficacy of treatment. It indirectly shows compliance with cetuximab, underscoring its importance 3. Unfortunately, we do not have clear data about the main reasons for treatment interruptions, which would have been interesting and could be a subject of future research.
HPV status is a major determinant of prognosis in patients with OPC: data from randomised trials highlight its impact on survival, demonstrating a 60% reduction in the risk of death in HPV-positive patients.
Our retrospective observational case series on a large amount of data lacks HPV status because in the period analysed HPV determination was not part of routine clinical practice; nevertheless, the model provided has highlighted prognostic factors for survival that in our opinion may represent an essential integration in the prognostic framing of patients with OPC.
There is heterogeneity among HPV-positive patients due to tumour stage, smoking status and other prognostic factors such as radiological extranodal extension, matted nodes and PIK3CA 19. Therefore, even if the integration of HPV status into the nomogram for clinical use is strongly recommended, in this heterogeneous setting our models, which do not depend on HPV status, could be useful as well. In addition, the benefit of the present work is that it is based on real-world data collection and reflects the fact that in many Italian centres the evaluation of HPV in OPC is still neither standardised nor routinely performed.
Nonetheless, our results provide evidence of the feasibility of model building on real-world data also considering that they are in line with the literature, even without considering HPV-status. In fact, our models showed a good predictive power with an AUC between 0.73 and 0.75.
Some sources of big data already exist in the literature 10,20, but are constrained by important limitations, including low granularity (i.e. lack of detailed information on RT). Available big data sources are usually at least 2-3 years behind current practice due to the time required to collect and assemble the data and perform quality control. Adding the time for data analysis and publication, current studies examining quality of care and comparative effectiveness usually report data that is at least 5 years old 22,23. However, to our knowledge, the reliability, consistency and accuracy of these calculators in predicting outcomes for individual patients in different populations remain unclear. These uncertainties are critical in order to optimally implement these tools in clinical practice 24.
The present study has some limitations. First, it lacks external validation and an analysis of variables such as HPV status, smoking and alcohol habits that may have an impact on OS. However, considering the results of the De-ESCALaTE 4 and RTOG 1016 5 trials, it is important to create a model that bypasses HPV status and investigates the impact of other factors on survival. Another limitation is that we did not study adherence to cetuximab therapy, as we did not consider cetuximab dose intensity as a potential variable influencing outcome. We also did not specifically address the impact of toxicity, as we assumed that toxicity-induced RT discontinuations were too rare to have an impact on OS. However, we are planning to investigate this aspect in future analyses, where it may be of interest to evaluate the impact of cetuximab dose intensity and cumulative dose on skin toxicity and OS.
This project represents the first example of a PPM created specifically for OPC patients treated with b-RT based on real-world data. The next step of this project will be external validation of the PPM. We would also like to extend our analysis with HPV status, smoking status, alcohol consumption, details about toxicity (and in particular compliance with cetuximab treatment) and radiomics data. Schematisations have been shown to significantly enhance physician-patient communication, and nomograms provide a visual picture of prognostic factors and their relative influence to support personalised treatment. This could facilitate active participation by patients in decision-making.
Conflict of interest statement
The authors declare no conflict of interest.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
All authors have equally contributed to the manuscript.
This study was approved by the Institutional Ethics Committee of Catholic University of the Sacred Heart (approval number/protocol number 0049309/16). The research was conducted ethically, with all study procedures being performed in accordance with the requirements of the World Medical Association’s Declaration of Helsinki. Written informed consent was obtained from each participant/patient for study participation and data publication.
The local ethics committees of each centre approved the protocol before patient enrolment according to Italian legislation.
A Collaboration and Data Transfer Agreement was signed to define the type of data, permitted use and data protection policies.
Figures and tables
|Patient characteristics||Available (%)|
|Analysed patients/collected patients per RT centre|
|HPV (human papillomavirus) DNA||Not analysed|
|HPV (human papillomavirus) RNA||Not analysed|
|ECOG PS||174 (100%)|
|RT start date||154 (88%)|
|RT end date||154 (88%)|
|Interruption days||144 (83%)|
|Prescription dose to CTV||166 (95%)|
|Dose reached (y/n)||166 (95%)|
|Censor death||139 (92%)|
|Last follow-up date||104 (94%)|
|Smoking status (Pack year)||109 (63%)|
|Alcohol consumption||41 (23%)|
|Acute toxicity||Not analysed|
|Relapse (y/n)||Not analysed|
|Cetuximab number of cycles||Not analysed|
|Covariates tested:||Training set 139 patients (%)||Validation set 35 patients (%)||P-value|
|Age at diagnosis > 65 years||70 (50%)||13 (37%)||0.187 a|
|Male||106 (76%)||26 (74%)||0.826*|
|Female||33 (24%)||9 (26%)|
|N0||116 (83%)||29 (83%)||1*|
|N+||23 (17%)||6 (17%)|
|ECOG = 0||55 (39%)||20 (57%)||0.084*|
|ECOG = 1||48 (34%)||2 (5%)|
|Yes||110 (79%)||32 (91%)||0.359*|
|No||29 (21%)||3 (9%)|
|Interruption days ≥ 2||95 (68%)||26 (74%)||0.544*|
|Interruption days ≥ 3||85 (61%)||22 (62%)||1*|
|Interruption days ≥ 5||52 (37%)||19 (54%)||0.084*|
|Interruption days ≥ 8||40 (29%)||16 (46%)||0.068*|
|Interruption days ≥ 10||25 (18%)||12 (34%)||0.062*|
|Interruption days ≥ 21||8 (5%)||5 (14%)||0.140*|
|Median OTT in days (range)||51 (6-101)||52 (41-81)||0.284 a|
|Total RT dose ≥ 66 Gy||126 (90%)||34 (97%)||0.306 a|
|Covariate||Coefficient (2 years)||Standard error (2 years)||P-value (2 years)||Coefficient (3 years)||Standard error (3 years)||P-value (3 years)||Coefficient (5 years)||Standard error (5 years)||P-value (5 years)|
|N = 0||-1.0439||0.6260||0.095||-0.8863||0.5702||0.120||-0.8863||0.5702||0.120|
|ECOG PS = 0||-1.3410||0.4634||0.003**||-0.9209||0.4129||0.025*||-0.9209||0.4129||0.025*|
|RT Dose ≥ 66 Gray||-2.0091||0.6714||0.002**||-1.7503||0.6499||0.007**||-1.7503||0.6499||0.007**|
|RT Interruption days ≥ 3||0.9058||0.4405||0.039*||1.0223||0.4165||0.014*||1.0223||0.4165||0.014*|
|OS time point||AUC training set||AUC validation set|
|2 years||0.75||0.713|
|3 years||0.73||0.713|
|5 years||0.73||0.775|
|Age at diagnosis||NO||ECOG 0||RT Dose ≥ 66 Gy||Interruption RT days ≥ 3||Death risk at 2 y (%)|
|OS > 90% at 2 y|
|OS > 75-85% at 2 y|
|OS > 50-70% at 2 y|
|OS > 15-40% at 2 y|
- Lacas B, Carmel A, Landais C. Meta-analysis of chemotherapy in head and neck cancer (MACH-NC): an update on 107 randomized trials and 19,805 patients, on behalf of MACH-NC Group. Radiother Oncol. 2021; 156:281-293.
- Blanchard P, Baujat B, Holostenco V. Meta-analysis of chemotherapy in head and neck cancer (MACH-NC): a comprehensive analysis by tumour site. Radiother Oncol. 2011; 100:33-40.
- Magrini SM, Buglione M, Corvò R. Cetuximab and radiotherapy versus cisplatin and radiotherapy for locally advanced head and neck cancer: a randomized phase II trial. J Clin Oncol. 2016; 34:427-435.
- Mehanna H, Robinson M, Hartley A. Radiotherapy plus cisplatin or cetuximab in low-risk human papillomavirus-positive oropharyngeal cancer (De-ESCALaTE HPV): an open-label randomised controlled phase 3 trial. Lancet. 2019; 393:51-60.
- Gillison ML, Trotti AM, Harris J. Radiotherapy plus cetuximab or cisplatin in human papillomavirus-positive oropharyngeal cancer (NRG Oncology RTOG 1016): a randomised, multicentre, non-inferiority trial. Lancet. 2019; 393:40-50.
- Rios Velazquez E, Hoebers F, Aerts HJWL. Externally validated HPV-based prognostic nomogram for oropharyngeal carcinoma patients yields more accurate predictions than TNM staging. Radiother Oncol. 2014; 113:324-330.
- Bentzen SM, Hendry JH. Variability in the radiosensitivity of normal cells and tissues. Report from a workshop organised by the European Society for Therapeutic Radiology and Oncology in Edinburgh, UK, 19 September 1998. Int J Radiat Biol. 1999; 75:513-517.
- van Baardwijk A, Wanders S, Boersma L. Mature results of an individualized radiation dose prescription study based on normal tissue constraints in stages I to III non-small-cell lung cancer. J Clin Oncol. 2010; 28:1380-1386.
- Scott JG, Berglund A, Schell MJ. A genome-based model for adjusting radiotherapy dose (GARD): a retrospective, cohort-based study. Lancet Oncol. 2017; 18:202-211.
- Meldolesi E, Van Soest J, Dinapoli N. An umbrella protocol for standardized data collection (SDC) in rectal cancer: a prospective uniform naming and procedure convention to support personalized medicine. Radiother Oncol. 2014; 112:59-62.
- Gambacorta MA, Valentini C, Dinapoli N. Clinical validation of atlas-based auto-segmentation of pelvic volumes and normal tissue in rectal tumors using auto-segmentation computed system. Acta Oncol. 2013; 52:1676-1681.
- van Stiphout RGPM, Lammering G, Buijsen J. Development and external validation of a predictive model for pathological complete response of rectal cancer patients including sequential PET-CT imaging. Radiother Oncol. 2011; 98:126-133.
- Tagliaferri L, Kovács G, Autorino R. ENT COBRA (Consortium for Brachytherapy Data Analysis): interdisciplinary standardized data collection system for head and neck patients treated with interventional radiotherapy (brachytherapy). J Contemp Brachytherapy. 2016; 8:336-343.
- Meldolesi E, van Soest J, Dinapoli N. Medicine is a science of uncertainty and an art of probability (Sir W. Osler). Radiother Oncol. 2015; 114:132-134.
- Porceddu SV, Scotté F, Aapro M. Treating patients with locally advanced squamous cell carcinoma of the head and neck unsuitable to receive cisplatin-based therapy. Front Oncol. 2020; 9:1522.
- Collins GS, Reitsma JB, Altman DG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015; 350:g7594.
- Bonner JA, Harari PM, Giralt J. Radiotherapy plus cetuximab for squamous-cell carcinoma of the head and neck. N Engl J Med. 2006; 354:567-578.
- Bonner JA, Harari PM, Giralt J. Radiotherapy plus cetuximab for locoregionally advanced head and neck cancer: 5-year survival data from a phase 3 randomised trial, and relation between cetuximab-induced rash and survival. Lancet Oncol. 2010; 11:21-28.
- Beaty BT, Moon DH, Shen CJ. PIK3CA mutation in HPV-associated OPSCC patients receiving deintensified chemoradiation. J Natl Cancer Inst. 2020; 112:855-858.
- Tagliaferri L, Budrukkar A, Lenkowicz J. ENT COBRA ONTOLOGY: the covariates classification system proposed by the Head & Neck and Skin GEC-ESTRO Working Group for interdisciplinary standardized data collection in head and neck patient cohorts treated with interventional radiotherapy (brachytherapy). J Contemp Brachytherapy. 2018; 10:260-266.
- Fakhry C, Zhang Q, Nguyen-Tân PF. Development and validation of nomograms predictive of overall and progression-free survival in patients with oropharyngeal cancer. J Clin Oncol. 2017; 35:4057-4065.
- Pagedar NA, Chioreso C, Schlichting JA. Treatment selection in oropharyngeal cancer: a surveillance, epidemiology, and end results (SEER) patterns of care analysis. Cancer Causes Control. 2017; 28:1085-1093.
- Hararah MK, Stokes WA, Jones BL. Nomogram for preoperative prediction of nodal extracapsular extension or positive surgical margins in oropharyngeal squamous cell carcinoma. Oral Oncol. 2018; 83:73-80.
- Beesley LJ, Hawkins PG, Amlani LM. Individualized survival prediction for patients with oropharyngeal cancer in the human papillomavirus era. Cancer. 2019; 125:68-78.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
© Società Italiana di Otorinolaringoiatria e chirurgia cervico facciale , 2022