• Using Machine Learning (ML) in Predicting the Risk of Thyroid Cancer Based on Genetic Information
  • Sana Tarashandeh Hemmati,1,* Mahdi Esmaeilpour Eshka,2
    1. Islamic Azad University, Lahijan branch
    2. University of Tehran (College of Farabi)


  • Introduction: Thyroid cancer is one of the most common endocrine malignancies, and its incidence has been increasing in recent years. Early diagnosis and risk assessment are essential for effective management and treatment. Machine learning (ML) has recently emerged as a promising tool for clinical risk prediction, including for thyroid cancer. This review summarizes the current state of research on using ML to predict thyroid cancer risk, with particular attention to the role of genetic information.
  • Methods: We conducted a comprehensive literature review, searching databases including PubMed, Embase, and the Cochrane Library. Search terms included "thyroid cancer," "machine learning," "risk prediction," and "genetic information." We selected studies that applied ML algorithms to predict thyroid cancer risk using genetic information as a feature. The selected studies were critically appraised for methodological quality, covering study design, data collection, feature selection, model development, and model validation.
  • Results: Several studies have shown that ML has the potential to predict thyroid cancer risk from genetic information. Xie et al. reported that a classification model built with the XGBoost algorithm reached an AUC of 0.84 for predicting thyroid nodule malignancy. Age, obesity, prothrombin time, fibrinogen, and HBeAb were high-risk factors, whereas monocytes, D-dimer, T3, FT3, and albumin were low-risk factors. A Bagged CART model built by Jiang et al. predicted thyroid cancer with 99.1% accuracy. Thyroid cancer recurrence has also been predicted, with the BRAFV600E mutation identified as an important predictor.
  • Conclusion: Machine learning shows considerable potential for predicting thyroid cancer risk from genetic data. By integrating genetic data with clinical and demographic data, machine learning models can help identify high-risk populations and inform management strategies. Further research is needed to validate these models in large, diverse populations and to study how ML can be implemented in clinical decision-making workflows.
  • Keywords: Thyroid Cancer, Machine learning, Risk prediction, Genetic information, BRAFV600E mutation
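
The Results report model performance as an AUC (0.84 for the XGBoost model of Xie et al.). As a reading aid, the following minimal sketch shows one common way such a figure can be computed: the AUC equals the probability that a randomly chosen malignant case receives a higher predicted score than a randomly chosen benign case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not data or code from any of the reviewed studies.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic).

    labels: 1 for a positive (e.g. malignant) case, 0 for a negative (benign) case.
    scores: predicted risk scores, where higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 malignant (1) and 4 benign (0) cases.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.7]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise form is quadratic in the number of cases, which is fine for illustration; rank-based implementations are typically preferred for large datasets.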