Breast cancer is one of the most common types of cancer affecting women worldwide. Previous research has highlighted etiological differences between breast cancer before and after menopause. Several predictors of this disease have been identified, including genetic factors, reproductive factors, and lifestyle. Recently, British scientists have integrated various approaches, including machine learning, to accurately predict breast cancer in women.
How Machine Learning Was Utilized
Machine learning (ML) methods can analyze large datasets to identify early predictors and handle complex non-linear relationships. Previous studies have used machine learning to forecast the risk of breast cancer, but not to determine predictors.
The main objective of the research was to demonstrate the effective application of ML in feature selection to assist classical statistical methods.
To investigate potential interactions between phenotypic features and polygenic risk scores (PRS), the researchers used SHapley Additive exPlanations (SHAP) dependence plots. The data came from the UK Biobank, which includes more than half a million participants from England, Wales, and Scotland. Information was collected through questionnaires, interviews conducted by trained nurses, physical examinations, and biological samples.
Overall, this study included 104,313 postmenopausal women aged 40 to 69 years.
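SHAP dependence plots are built on Shapley values, which split a model's prediction for one person into a contribution from each feature. The sketch below is only an illustration of that idea, not the study's pipeline: the risk function, its coefficients, and the feature names "prs" and "bmi" are hypothetical, and a single reference point stands in for the background dataset that SHAP normally averages over.

```python
from itertools import permutations


def toy_risk(x):
    """Hypothetical risk score with a PRS x BMI interaction (not the study's model)."""
    return 0.5 * x["prs"] + 0.2 * x["bmi"] + 0.3 * x["prs"] * x["bmi"]


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample.

    Features that are "absent" take their value from `baseline`; each
    feature's contribution is averaged over every order of adding features.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orders) for name, total in totals.items()}


# Assumed reference point: standardized features, so 0.0 means "average".
baseline = {"prs": 0.0, "bmi": 0.0}

# Same BMI value, two different PRS values.
low_prs = shapley_values(toy_risk, {"prs": 0.0, "bmi": 1.0}, baseline)
high_prs = shapley_values(toy_risk, {"prs": 1.0, "bmi": 1.0}, baseline)

print("BMI contribution at low PRS: ", round(low_prs["bmi"], 3))
print("BMI contribution at high PRS:", round(high_prs["bmi"], 3))
```

In this toy case the contribution credited to the same BMI value is larger when the PRS is high. A dependence plot makes that kind of pattern visible by plotting a feature's value against its contribution and coloring the points by a second feature.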
Research Findings
Breast cancer developed in 4,010 participants during the nearly 12-year observation period. By combining machine learning with traditional statistical approaches in cancer epidemiology, the scientists identified both established and previously unreported risk factors.
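To put the case count in proportion, the reported figures can be turned into a cumulative incidence and a rough crude rate. This is a back-of-the-envelope sketch, not a result from the study: it assumes every participant was followed for a full 12 years, which overstates person-time and so slightly understates the rate.

```python
cases = 4_010            # incident breast cancers reported in the article
participants = 104_313   # postmenopausal women included in the study
follow_up_years = 12     # assumption: "nearly 12 years" applied to everyone

# Share of the cohort diagnosed during follow-up.
cumulative_incidence = cases / participants

# Crude incidence rate per 1,000 person-years under the assumption above.
crude_rate_per_1000_py = cases / (participants * follow_up_years) * 1000

print(f"Cumulative incidence: {cumulative_incidence:.2%}")
print(f"Approximate crude rate: {crude_rate_per_1000_py:.1f} per 1,000 person-years")
```

Under these assumptions, about 3.8% of the cohort was diagnosed during follow-up, or roughly 3 cases per 1,000 women per year.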
The known risk factors identified included age, age at menopause, and testosterone level. In addition, five new predictors were discovered across four categories: body mass, blood biochemistry, blood analysis, and urine biomarkers. These new predictors were closely associated with the incidence of postmenopausal breast cancer.
The new predictors reflect detailed measures of body composition rather than body mass index (BMI) alone. Basal metabolic rate was also a significant predictor of breast cancer. Plasma urea, a blood biomarker associated with kidney function, was likewise correlated with the disease. This is the first reported association of plasma phosphate, urinary sodium, or urinary creatinine with breast cancer.
These findings warrant further research into utilizing more accurate anthropometric measurements to improve breast cancer prediction. External validation of the results is the next crucial step before implementation in clinical practice.




