Big Data/Machine Learning/AI
Comparing Extreme Gradient Boosting and Multi-Layer Perceptron algorithms for the prediction of neonatal mortality
Renan Moritz Varnier Rodrigues de Almeida*, Nubia Karla de Oliveira Almeida
Objective: To apply eXtreme Gradient Boosting (XGBoost) and Multilayer Perceptron (MLP) models to the prediction of neonatal mortality.
Methods: The study used 167,928 singleton births in Rio de Janeiro State (Brazil), 2019-2020, obtained from a national administrative information system (DATASUS). The outcome was neonatal mortality, and the predictors were 15 variables pertaining to characteristics of the mother, the pregnancy and the newborn.
Data were randomly split into a training set (70% of the data) and a testing set, after which an MLP with one hidden layer and an XGBoost model were developed on the training set. Backpropagation MLPs are “classical” machine-learning models, introduced in the 1970s and popularized in the 1980s, whereas XGBoost is widely regarded for its efficiency and robustness and is currently one of the most popular methods in machine-learning applications. Performance was evaluated in the testing set with the usual metrics AUC, Accuracy, Precision, Sensitivity, F1-score and Specificity. Two classification thresholds were tested: the usual 0.5 and an empirical threshold that maximized sensitivity. All analyses were done with the R language.
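A minimal R sketch of this pipeline is given below. The data frame births, the 0/1 outcome column neonatal_death, the random seed, the hidden-layer size and the XGBoost hyperparameters are illustrative assumptions, not the study's actual specification.

```r
## Minimal sketch of the pipeline described above (assumptions, not the authors' code):
## 'births' is a data frame with a 0/1 outcome column 'neonatal_death' and the 15 predictors;
## the seed, hidden-layer size and XGBoost hyperparameters are illustrative choices.
library(nnet)     # single-hidden-layer MLP
library(xgboost)  # eXtreme Gradient Boosting
library(pROC)     # ROC curves and AUC

set.seed(2024)
idx   <- sample(nrow(births), size = round(0.7 * nrow(births)))  # 70% training split
train <- births[idx, ]
test  <- births[-idx, ]

## MLP with one hidden layer, fitted by backpropagation (entropy = TRUE uses the logistic loss)
mlp_fit <- nnet(neonatal_death ~ ., data = train, size = 10,
                entropy = TRUE, decay = 1e-3, maxit = 500, trace = FALSE)
p_mlp <- as.numeric(predict(mlp_fit, newdata = test, type = "raw"))

## XGBoost on a dummy-coded design matrix of the same predictors
x_train <- model.matrix(neonatal_death ~ . - 1, data = train)
x_test  <- model.matrix(neonatal_death ~ . - 1, data = test)
dtrain  <- xgb.DMatrix(x_train, label = train$neonatal_death)
xgb_fit <- xgb.train(params = list(objective = "binary:logistic", max_depth = 4, eta = 0.1),
                     data = dtrain, nrounds = 200, verbose = 0)
p_xgb <- predict(xgb_fit, xgb.DMatrix(x_test))

## Confusion-matrix metrics at a given classification threshold
metrics <- function(p, y, thr) {
  pred <- as.integer(p >= thr)
  tp <- sum(pred == 1 & y == 1); tn <- sum(pred == 0 & y == 0)
  fp <- sum(pred == 1 & y == 0); fn <- sum(pred == 0 & y == 1)
  sens <- tp / (tp + fn); prec <- tp / (tp + fp)
  c(AUC         = as.numeric(auc(roc(y, p, quiet = TRUE))),
    Accuracy    = (tp + tn) / length(y),
    Precision   = prec,
    Sensitivity = sens,
    Specificity = tn / (tn + fp),
    F1          = 2 * prec * sens / (prec + sens))
}

y_test <- test$neonatal_death
metrics(p_mlp, y_test, thr = 0.5)    # usual threshold
metrics(p_xgb, y_test, thr = 0.5)
metrics(p_mlp, y_test, thr = 0.01)   # sensitivity-maximizing threshold (see Results)
metrics(p_xgb, y_test, thr = 0.01)
```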
Results: The threshold identified for sensitivity maximization was 0.01. The two models did not differ much in any metric at either threshold. The main XGBoost vs. MLP differences were: AUC, 0.83 vs. 0.88; Precision, 0.28 vs. 0.54; F1-score, 0.11 vs. 0.13 (threshold 0.5); and AUC, 0.83 vs. 0.88; Sensitivity, 0.68 vs. 0.72; F1-score, 0.07 vs. 0.10 (sensitivity-maximizing threshold; other metrics unchanged).
Conclusions: Both models were good classifiers for the studied data. In fact, despite its alleged advantages, XGBoost performed slightly worse than the classical MLP model. Threshold definition also did not have a marked effect on outcomes.
Overall, these results once again indicate that the most important factors for the application of machine-learning models to epidemiologic data are model specification and data quality.