Comparative Analysis of Classification Models for Predicting Cancer Stage in a Chilean Cancer Center

Authors

  • Marcela Aguirre, Instituto Oncológico Fundación Arturo López Pérez, Medical Informatics and Data Science Unit, Department of Cancer Research, Santiago, Chile. ORCID: https://orcid.org/0000-0001-9113-2057
  • Sergio Peñafiel, Instituto Oncológico Fundación Arturo López Pérez, Medical Informatics and Data Science Unit, Department of Cancer Research, Santiago, Chile. ORCID: https://orcid.org/0000-0002-0025-7805
  • April Anlage, Massachusetts Institute of Technology, Global Health Informatics Course, Boston, USA
  • Emily Brown, Massachusetts Institute of Technology, Global Health Informatics Course, Boston, USA
  • Cecilia Enriquez Chavez, Massachusetts Institute of Technology, Global Health Informatics Course, Boston, USA
  • Inti Paredes, Instituto Oncológico Fundación Arturo López Pérez, Medical Informatics and Data Science Unit, Department of Cancer Research, Santiago, Chile

DOI:

https://doi.org/10.56294/dm2023123

Keywords:

Cancer Staging, Machine Learning, Decision Support Techniques

Abstract

This study aimed to develop a predictive model for cancer stage using data from a Chilean cancer registry. Several factors, including cancer type, patient age, medical history, and the delay between diagnosis and treatment, were examined for their association with cancer stage. Multiple supervised multi-class classification methods were tested, and the best-performing models were identified. The random forest, polynomial-kernel SVM, and composite models performed well across stages, although distinguishing between Stages II and III was more challenging. The most important features for predicting cancer stage were cancer type, TNM variables, and diagnostic extension; variables related to treatment timing and sequence also showed some importance. The results of predictive models should be interpreted carefully to avoid overprediction or underprediction of stage, and clinical context and additional information should be considered to improve the accuracy of predictions. The small dataset and limited data availability made it difficult to predict cancer stage accurately for different cancer types. Implementing the predictive model could inform treatment decisions, help assess disease severity, and optimize resource allocation. Further research and expansion of the model's scope are recommended to improve its performance and impact. Overall, the study highlights the potential of predictive models in cancer staging and the need for ongoing advances in this field.
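The abstract's points about per-stage performance and about overprediction versus underprediction can be illustrated with a small evaluation sketch. The code below is not the authors' pipeline: the stage encoding, the function names, and the toy labels are all assumptions made up for illustration, and the numbers are not results from the study. It computes recall for each stage and counts how often a predicted stage is higher or lower than the true one.

```python
from collections import Counter

# Assumed ordinal encoding of stages (hypothetical, not taken from the paper).
STAGES = ["I", "II", "III", "IV"]


def per_stage_recall(y_true, y_pred):
    """Return the fraction of cases of each true stage that were predicted correctly."""
    pair_counts = Counter(zip(y_true, y_pred))
    totals = Counter(y_true)
    return {s: pair_counts[(s, s)] / totals[s] for s in STAGES if totals[s]}


def over_under(y_true, y_pred):
    """Count predictions above (over) and below (under) the true stage."""
    rank = {s: i for i, s in enumerate(STAGES)}
    over = sum(rank[p] > rank[t] for t, p in zip(y_true, y_pred))
    under = sum(rank[p] < rank[t] for t, p in zip(y_true, y_pred))
    return over, under


# Made-up toy labels, chosen so that the only errors are II/III confusions.
y_true = ["I", "I", "II", "II", "II", "III", "III", "III", "IV", "IV"]
y_pred = ["I", "I", "II", "III", "II", "II", "III", "III", "IV", "IV"]

recall = per_stage_recall(y_true, y_pred)
over, under = over_under(y_true, y_pred)
print(recall)
print("overpredicted:", over, "underpredicted:", under)
```

Reporting recall per stage rather than a single accuracy figure makes confusions between adjacent stages visible, and separating over- from underprediction matters because the two kinds of error have different clinical consequences.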

Published

2023-12-08

Section

Original

How to Cite

Aguirre M, Peñafiel S, Anlage A, Brown E, Enriquez Chavez C, Paredes I. Comparative Analysis of Classification Models for Predicting Cancer Stage in a Chilean Cancer Center. Data and Metadata [Internet]. 2023 Dec. 8 [cited 2024 Dec. 21];2:123. Available from: https://dm.ageditor.ar/index.php/dm/article/view/62