Machine learning classification of HER2 mutations and treatment response using gene expression in breast cancer

Presenting Author

Muyiwa Ategbole

University of South Carolina

Submitting Author

Muyiwa Ategbole

Additional Authors

Kesheng Wang, Swann Adams

Abstract

Background:
HER2 hotspot mutations are actionable targets in breast cancer, yet transcriptomic signatures that reliably distinguish clinically relevant variants, including V777L, and predict treatment-related phenotypes are not well characterized.

Methods:
We analyzed RNA sequencing data from recount3 (SRP166112), comprising 864 breast cancer cell samples representing multiple HER2 genotypes and drug exposures. After variance-stabilizing transformation and filtering, we used LASSO regularization to identify a sparse panel of 29 informative genes. Five machine learning models (random forest, LASSO, elastic net, support vector machine, and gradient boosting) were trained to classify treated versus control samples using a 70/30 stratified train–test split. Model performance was evaluated on the held-out 30% test set using accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
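As an illustration of this workflow, the following is a minimal sketch rather than the authors' pipeline. It assumes scikit-learn, substitutes a synthetic matrix for the transformed expression data, and uses an L1-penalized logistic regression as the LASSO step. The penalty strength, sample count, and feature count are hypothetical, and only one of the five classifiers (random forest) is shown. In this sketch the selection is fit on the training portion only, which is one way to keep the test set out of feature selection; the abstract does not state the order used in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a (samples x genes) matrix; NOT the SRP166112 data.
X, y = make_classification(
    n_samples=300, n_features=200, n_informative=10, n_redundant=0, random_state=0
)

# 70/30 split, stratified so both sets keep the treated/control ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Standardize using training-set statistics only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# L1-penalized logistic regression; features with nonzero coefficients
# form the sparse panel. C=0.2 is an arbitrary illustrative value.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.2, random_state=0)
lasso.fit(X_train_s, y_train)
selected = np.flatnonzero(lasso.coef_[0])

# Train one downstream classifier on the selected features only.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train_s[:, selected], y_train)

# AUC on the held-out 30%.
auc = roc_auc_score(y_test, rf.predict_proba(X_test_s[:, selected])[:, 1])
print(f"selected features: {selected.size}, test AUC: {auc:.3f}")
```

In practice the penalty strength would typically be chosen by cross-validation within the training set rather than fixed in advance.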

Results:
All models demonstrated strong discrimination between treated and control samples. Test-set accuracy ranged from 98.1% to 99.6% across models, with AUC values near 1.00. The LASSO model achieved the highest accuracy (99.6%), sensitivity (97.6%), and specificity (100%). Random forest, elastic net, support vector machine, and gradient boosting models showed similarly high performance (AUC 0.995–1.000). The LASSO-selected gene panel included biologically relevant predictors such as CYP1A1, SERPINA6, CISH, and CYP1B1, which were strongly associated with treatment exposure.
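For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from the counts of a binary confusion matrix. The function name `binary_metrics` and the toy labels are hypothetical, and treating "treated" as the positive class (1) is an assumption, since the abstract does not state which class was positive.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels.

    Assumes 1 is the positive class (here taken to mean "treated").
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of positives detected
        "specificity": tn / (tn + fp),  # share of negatives correctly rejected
    }


# Toy labels only; these are not the study's predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In the toy example every error is a false negative, so specificity is perfect while sensitivity is not. Under the same positive-class assumption, a specificity of 100% with a sensitivity of 97.6% would mean that the few test-set errors were treated samples classified as controls.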

Conclusions:
A sparse set of gene expression markers enables highly accurate classification of treatment exposure in HER2-mutant breast cancer cell models. These transcriptomic signatures provide a foundation for biomarker discovery and future translational studies aimed at stratifying HER2-mutant tumors by therapeutic response.
