 Research
 Open Access
A machine learning classifier approach for identifying the determinants of under-five child undernutrition in Ethiopian administrative zones
BMC Medical Informatics and Decision Making volume 21, Article number: 291 (2021)
Abstract
Background
Undernutrition is the main cause of child death in developing countries. This paper aimed to explore the efficacy of machine learning (ML) approaches in predicting under-five undernutrition in Ethiopian administrative zones and to identify the most important predictors.
Method
The study applied ML techniques to retrospective cross-sectional survey data from Ethiopia, nationally representative data collected in 2000, 2005, 2011, and 2016. We explored six commonly used ML algorithms: logistic regression, Least Absolute Shrinkage and Selection Operator (LASSO; L1-regularized logistic regression), ridge (L2 regularization), elastic net, neural network, and random forest (RF). Sensitivity, specificity, accuracy, and area under the curve (AUC) were used to evaluate the performance of these models.
Results
Based on the different performance measures, the RF algorithm was selected as the best ML model. In order of importance, urban–rural settlement, literacy rate of parents, and place of residence were the major determinants of disparities in the nutritional status of under-five children among Ethiopian administrative zones.
Conclusion
Our results showed that the considered machine learning classification algorithms can effectively predict under-five undernutrition status in Ethiopian administrative zones. Persistent under-five undernutrition was found in the northern part of Ethiopia. The identification of such high-risk zones could provide useful information to decision-makers trying to reduce child undernutrition.
Background
Proper nutrition is crucial to a healthy life. Malnutrition, particularly undernutrition, is a global concern for the health and survival of children [1,2,3,4,5]. Almost half of the deaths of children in developing countries are directly or indirectly linked to malnutrition [3, 6]. Malnourished children are more vulnerable to illness than their well-nourished counterparts [1,2,3,4,5,6]. A considerable number of studies have investigated malnutrition among under-five children and its associated risk factors. These studies employed classical models such as generalized linear (mixed) models [4, 5, 7,8,9,10]. Their findings showed, among other things, that the nutritional status of children in this age group has gradually improved over the last two decades in Ethiopia. In particular, the prevalence of underweight among under-five children in Ethiopia was 47.1% in 2000, 38.5% in 2005, 28.8% in 2011, 23.3% in 2016, and 20.56% in 2019, while the prevalence of stunting was 51.22% in 2000, 46.5% in 2005, 44.3% in 2011, 38.3% in 2016, and 36.9% in 2019. Similarly, 10.7% of under-five children were wasted in 2000, 10.5% in 2005, 9.9% in 2011, 10.1% in 2016, and 7% in 2019. The prevalence of having at least one undernutrition indicator, measured by the composite index for anthropometric failure (CIAF), was 61.38% in 2000, 56.58% in 2005, 51.58% in 2011, 46.49% in 2016, and 42.4% in 2019. The CIAF is computed by grouping the different forms of anthropometric failure as follows: B, wasting only; C, wasting and underweight; D, wasting, stunting, and underweight; E, stunting and underweight; F, stunting only; and Y, underweight only. The CIAF is calculated by aggregating these six (B–Y) categories [11,12,13,14,15].
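As a minimal sketch of how the CIAF groups described above could be assigned, the following Python snippet maps a child's three anthropometric indicators to a group letter and computes the share of children with at least one failure. The function names, the label "A" for children with no failure, and the "unclassified" fallback for combinations not listed in the text are assumptions made for illustration and are not taken from the study.

```python
# Group letters B-Y follow the text; "A" (no failure) is an assumed label.
CIAF_GROUPS = {
    # (wasted, stunted, underweight): group letter
    (False, False, False): "A",
    (True, False, False): "B",   # wasting only
    (True, False, True): "C",    # wasting and underweight
    (True, True, True): "D",     # wasting, stunting and underweight
    (False, True, True): "E",    # stunting and underweight
    (False, True, False): "F",   # stunting only
    (False, False, True): "Y",   # underweight only
}


def ciaf_group(wasted, stunted, underweight):
    """Return the CIAF group letter, or 'unclassified' if not listed."""
    return CIAF_GROUPS.get((wasted, stunted, underweight), "unclassified")


def ciaf_binary(wasted, stunted, underweight):
    """Binary outcome: 1 if the child has at least one failure, else 0."""
    return int(wasted or stunted or underweight)


def ciaf_prevalence(children):
    """Share of children with at least one anthropometric failure."""
    children = list(children)
    return sum(ciaf_binary(*c) for c in children) / len(children)


# Made-up example records: (wasted, stunted, underweight)
sample = [
    (False, False, False),
    (True, False, False),
    (False, True, True),
    (False, False, True),
]
print([ciaf_group(*c) for c in sample])  # ['A', 'B', 'E', 'Y']
print(ciaf_prevalence(sample))           # 0.75
```

The binary outcome used for modeling corresponds to `ciaf_binary`, that is, membership in any of the groups B to Y.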
Most such studies conducted in Ethiopia described the effects of socioeconomic and demographic covariates associated with under-five undernutrition using classical regression models [4, 5, 7, 8]. These traditional models are widely used for causal inference, with built-in feature selection and a relatively small number of covariates [16, 17]. Correlation between covariates (multicollinearity) and a large number of factors are common analytical challenges in traditional modeling [18,19,20,21]. Compared with these classical models, machine learning (ML) methods can use a larger number of predictors, require fewer assumptions, incorporate “multidimensional correlations”, and allow more flexible relationships between the predictor variables and the outcome variables [16,17,18, 20,21,22]. In addition, ML models built for prediction have shown superiority over classical approaches in handling classification problems [16,17,18, 21, 23]. In the present paper, we focused on predicting CIAF in Ethiopia with ML, drawing on nationally representative data. Machine learning employs methods developed within statistics, computer science, mathematics, and artificial intelligence that allow the construction of algorithms that can learn from data and make predictions [24,25,26,27,28,29]. It is applied in many disciplines, for example in the medical sciences for diagnosis and outcome prediction [23, 30,31,32,33,34,35,36,37,38,39,40,41,42,43,44], disease modeling [33], disease prediction [34,35,36,37], and child mortality [23, 38], as well as in industrial applications [39,40,41]. Only a few studies have investigated the use of ML to build prediction models of childhood malnutrition [42,43,44]. Moreover, this study is conducted at the level of the administrative zones of Ethiopia.
This is because, in Ethiopia, the zonal health departments have the mandate to plan, follow up, monitor, and evaluate the health activities of Woreda health offices, and the Woredas within the same Zone are relatively similar in many respects. Moreover, the administrative Zones are mainly ethnic-based, and assessing them captures the cultural practices regarding staple food and the geographic environment of the community in each Zone [45,46,47,48]. Hence, detecting undernutrition and its variation among administrative Zones provides deeper insight into health priorities, which helps policymakers design focused intervention strategies. The main objective of this study was, therefore, to identify ML algorithms for predicting childhood CIAF and for identifying the important covariates that underlie its spatial variation among the 72 Ethiopian administrative zones.
Materials and methods
This study examined disparities in malnutrition in Ethiopia. With a surface area of 1.1 million km², the country shares borders with Eritrea in the north, Djibouti and Somalia in the east, Sudan and South Sudan in the west, and Kenya in the south. It is divided into 11 administrative units (regions), including Addis Ababa, the capital city. The regions are further divided into 72 second-level administrative units called zones [49] (Fig. 1).
Data sources and analysis tools
We conducted the analysis on four Ethiopian Demographic and Health Survey (EDHS) datasets (2000, 2005, 2011, and 2016), a nationally representative household survey developed by the United States Agency for International Development (USAID) in the 1980s [50]. The outcome variable to be predicted is the undernutrition status of under-five children, measured by the composite index for anthropometric failure (CIAF). CIAF is coded as a binary response: nourished (0) or undernourished (1). The covariates (features) were selected from the literature [4, 5, 7,8,9,10]. All categorical features were converted to numerical dummy variables by mapping each unique value to a number [4, 5, 7,8,9,10]. Zone boundary shapefiles were used to define the second-level administrative zones and were merged with the survey dataset for analysis [51].
Methodology
Model building ML models have shown superiority over traditional models (such as generalized linear mixed models) in handling classification problems. Raw data are usually not in the form required for optimal performance of ML algorithms. The algorithms operate only on numerical values, so categorical variables must be transformed into numerical values. Hence, preprocessing is one of the most important steps in applying ML models [21, 23, 52,53,54]. In this study, the categorical features were encoded as numerical values and the continuous features were normalized. For ML approaches, the dataset is randomly split into a training dataset, on which the model is trained, and a test dataset, on which the response variable is predicted and the predictions are compared with the actual outcomes; a validation dataset is used to choose the parameter values incorporated in the trained models [24,25,26,27,28,29]. We checked the influence of the training/testing ratio on the performance of the ML models using two splits, 80/20 and 70/30. Popular statistical indicators were used to evaluate the predictive capability of the models under each ratio. The results showed that the 70/30 train-test split was more advantageous for undernutrition classification than the 80/20 split. The supervised ML algorithms included in the analysis were Logistic Regression (LR) [55], ridge regression [56], Least Absolute Shrinkage and Selection Operator (LASSO) regression [57], Elastic Net [27, 58], Artificial Neural Network (ANN) [59, 60], and Random Forest (RF) [27, 61].
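The preprocessing and splitting steps described above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the study's actual pipeline (which was run in R): the helper names, the choice of min-max scaling as the normalization, the fixed random seed, and the toy values are all hypothetical.

```python
import random


def one_hot(values):
    """Encode a categorical column as 0/1 dummy columns (one per level)."""
    levels = sorted(set(values))
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return rows, levels


def min_max(values):
    """Rescale a continuous column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def train_test_split(rows, test_fraction, seed=0):
    """Randomly split rows into (train, test) with the given test share."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(rows) * test_fraction)
    test = [rows[i] for i in idx[:n_test]]
    train = [rows[i] for i in idx[n_test:]]
    return train, test


# Toy values, not study data.
encoded, levels = one_hot(["rural", "urban", "rural"])
print(levels, encoded)        # ['rural', 'urban'] [[1, 0], [0, 1], [1, 0]]
print(min_max([10, 20, 30]))  # [0.0, 0.5, 1.0]

rows = list(range(10))
for fraction in (0.3, 0.2):   # the 70/30 and 80/20 splits
    train, test = train_test_split(rows, fraction)
    print(len(train), len(test))
```

Comparing the two ratios then amounts to fitting each model on `train` and scoring it on `test` for each value of `fraction`.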
The ridge, LASSO, and elastic net models are very similar to LR, except that an additional penalty term (regularization) is used when estimating the regression coefficients [26, 27], which reduces overfitting and the adverse effects of multicollinearity [26,27,28, 62]. Their advantage over classical statistical methods is that, in addition to fitting optimized models, the penalty causes covariates with little impact on the outcome variable to be shrunk or dropped from the final model. This reduces the model's complexity while increasing its generalizability.
Logistic regression (LR) LR is a widely applied statistical model for classification problems. It uses maximum likelihood estimation to estimate the parameters of interest. Let \(y_{i}\) be the response variable for the ith child. It is Bernoulli distributed and takes the value 1 with probability \(\pi_{i} = P(y_{i} = 1 \mid {\varvec{x}}_{i} )\), where \({\varvec{x}}_{i} = \left( {x_{i1} , . . . , x_{ip} } \right)^{T}\) is the ith child’s covariate vector, and the value 0 with probability 1 − \(\pi_{i}\). Then the logistic regression model with the logit link function can be given as:
\[\log \left( {\frac{{\pi_{i} }}{{1 - \pi_{i} }}} \right) = \beta_{0} + {\varvec{x}}_{i}^{T} {\varvec{\beta}} \qquad (1)\]
where \(\beta_{0}\) is the intercept term, and \({\varvec{\beta}} = \left( {\beta_{1} , . . . , \beta_{p} } \right)^{T}\) is a p × 1 vector of regression parameters on the logit scale. If \({\varvec{\theta}} = \left( {\beta_{0} ,{\varvec{\beta}}} \right)^{T}\), then the corresponding log-likelihood function for a sample of n children is given by the following equation, as also shown in [55]:
\[\ell ({\varvec{\theta}}) = \sum\limits_{i = 1}^{n} {\left[ {y_{i} \log \pi_{i} + \left( {1 - y_{i} } \right)\log \left( {1 - \pi_{i} } \right)} \right]} \qquad (2)\]
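Substituting \(\pi_{i}\) from Eq. 1 into Eq. 2, we have:
\[\ell ({\varvec{\theta}}) = \sum\limits_{i = 1}^{n} {\left[ {y_{i} \left( {\beta_{0} + {\varvec{x}}_{i}^{T} {\varvec{\beta}}} \right) - \log \left( {1 + \exp \left( {\beta_{0} + {\varvec{x}}_{i}^{T} {\varvec{\beta}}} \right)} \right)} \right]} \qquad (3)\]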
In the maximum likelihood method, the goal is to find the \({\varvec{\theta}}\) that maximizes Eq. (3). With a large number of features (high dimensionality), traditional LR has several problems: overfitting, multicollinearity, and computational difficulties. To address these problems, we used regularization, in which a GLM is fitted with a penalty on the parameters that shrinks them towards zero [27, 55,56,57,58, 63].
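To make the quantity being maximized concrete, the sketch below evaluates the logistic log-likelihood in two algebraically equivalent ways: in terms of the probabilities \(\pi_{i}\), and in terms of the linear predictor \(\beta_{0} + {\varvec{x}}_{i}^{T} {\varvec{\beta}}\). The function names and the toy data are assumptions for illustration only; this is not the estimation code used in the study.

```python
import math


def linear_predictor(beta0, beta, x):
    """beta0 + x'beta for one observation."""
    return beta0 + sum(b * xj for b, xj in zip(beta, x))


def loglik_probability_form(beta0, beta, X, y):
    """Sum of y*log(pi) + (1 - y)*log(1 - pi)."""
    total = 0.0
    for x, yi in zip(X, y):
        eta = linear_predictor(beta0, beta, x)
        pi = 1.0 / (1.0 + math.exp(-eta))
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total


def loglik_predictor_form(beta0, beta, X, y):
    """Sum of y*eta - log(1 + exp(eta)), obtained by substituting pi."""
    total = 0.0
    for x, yi in zip(X, y):
        eta = linear_predictor(beta0, beta, x)
        total += yi * eta - math.log(1.0 + math.exp(eta))
    return total


# Toy data: two covariates, four children (values are made up).
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
y = [1, 0, 1, 0]
a = loglik_probability_form(-0.5, [0.3, 1.2], X, y)
b = loglik_predictor_form(-0.5, [0.3, 1.2], X, y)
print(abs(a - b) < 1e-12)  # True: both forms give the same value
```

With all coefficients set to zero every \(\pi_{i}\) equals 0.5, so the log-likelihood reduces to \(n \log 0.5\), which gives a quick sanity check.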
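Ridge regression (L2 regularization, which shrinks the coefficients of correlated covariates towards each other) is obtained by maximizing the log-likelihood with a penalty, controlled by the parameter \(\lambda\), applied to all the parameters except the constant (intercept) [55, 56]. The penalized likelihood for ridge regression is given by
\[\ell_{R} ({\varvec{\theta}}) = \ell ({\varvec{\theta}}) - \lambda \sum\limits_{j = 1}^{p} {\beta_{j}^{2} } \qquad (4)\]
where \(\ell ({\varvec{\theta}})\) is the log-likelihood in Eq. (3).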
When λ is very large (λ → ∞), all the coefficients tend to zero, whereas when λ = 0, ridge regression is equivalent to the traditional (unpenalized) approach.
LASSO regression uses the L1 penalty for variable selection and shrinkage. If \(\lambda\) is large enough, it forces some coefficients to be exactly zero, which yields a smaller number of predictors [57]. The penalized likelihood for LASSO regression is given by
\[\ell_{L} ({\varvec{\theta}}) = \ell ({\varvec{\theta}}) - \lambda \sum\limits_{j = 1}^{p} {\left| {\beta_{j} } \right|} \qquad (5)\]
The tuning parameter \(\lambda\) controls the strength of the penalty, and the model is fitted over a range of \(\lambda\) values to find the optimum values of all coefficients. The optimal regularization parameter (\(\lambda\)) was determined using n-fold cross-validation. The larger the \(\lambda\) value, the stronger the effect of regularization on the number of covariates (features) in the model and on their respective coefficients [26,27,28]. Variables with nonzero estimates are then considered the important covariates for the outcome variable of interest.
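Elastic net regularization combines the penalties in (4) and (5) [27, 58]. This method can effectively handle groups of correlated features and also shrink the coefficients of noninformative features to zero [27, 58, 63, 64]. The penalized likelihood for elastic net regression is given by
\[\ell_{EN} ({\varvec{\theta}}) = \ell ({\varvec{\theta}}) - \lambda \left[ {\left( {1 - \alpha } \right)\sum\limits_{j = 1}^{p} {\beta_{j}^{2} } + \alpha \sum\limits_{j = 1}^{p} {\left| {\beta_{j} } \right|} } \right] \qquad (6)\]
where \(\alpha \in [0,1]\) is the mixing parameter between the two penalties.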
All the ML algorithms, including logistic regression, were fitted in the R statistical software using the packages glmnet, pROC, caret, randomForest, ggplot2, and ROCit [65,66,67,68,69]. In this paper, we trained the penalized generalized linear model (GLM) estimators with \(\alpha\) values from the set \(\left\{ {0,0.5,1} \right\}\), where \(\alpha\) = 0, 0.5, and 1 correspond to the ridge, elastic net, and lasso penalties, respectively [27, 58, 63].
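The role of \(\alpha\) can be illustrated with a short sketch of the three penalty terms. The scaling used here, \(\lambda [(1 - \alpha )\sum \beta_{j}^{2} + \alpha \sum |\beta_{j} |]\), is an assumption chosen so that \(\alpha\) = 0 and \(\alpha\) = 1 reproduce the ridge and lasso penalties exactly; software packages may scale the quadratic term differently (for example by a factor of 1/2). The function names and coefficient values are hypothetical.

```python
def ridge_penalty(beta, lam):
    """L2 penalty: lambda * sum of squared coefficients (intercept excluded)."""
    return lam * sum(b * b for b in beta)


def lasso_penalty(beta, lam):
    """L1 penalty: lambda * sum of absolute coefficients (intercept excluded)."""
    return lam * sum(abs(b) for b in beta)


def elastic_net_penalty(beta, lam, alpha):
    """Mix of the two penalties; alpha=0 is ridge, alpha=1 is lasso."""
    l2 = sum(b * b for b in beta)
    l1 = sum(abs(b) for b in beta)
    return lam * ((1.0 - alpha) * l2 + alpha * l1)


beta = [1.0, -2.0]  # made-up slope coefficients
for alpha in (0.0, 0.5, 1.0):
    print(alpha, elastic_net_penalty(beta, lam=0.5, alpha=alpha))
# 0.0 2.5
# 0.5 2.0
# 1.0 1.5
```

The penalized objective is then the log-likelihood minus the chosen penalty, so a larger \(\lambda\) pulls the coefficients more strongly towards zero.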
Random forest (RF) is a popular supervised ML approach in applied statistics because it is applicable to both classification and regression [70,71,72]. It is also used for variable screening and dimension reduction. It is a “tree-based” technique in which several decision trees are constructed, each from a random set of covariates and a subset of samples, and used to predict an outcome label. It builds multiple trees (called the forest), and the decision is based on the majority vote over all the trees in the forest [70,71,72,73].
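The majority-vote step can be sketched as follows. The three hand-written "trees" are hypothetical one-split rules standing in for fitted decision trees, and the feature names are invented for the example; a real forest would learn its trees from bootstrap samples and random covariate subsets.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the largest number of trees."""
    return Counter(labels).most_common(1)[0][0]


def forest_predict(trees, x):
    """Collect one prediction per tree and take the majority vote."""
    return majority_vote([tree(x) for tree in trees])


# Hypothetical one-split trees (1 = undernourished, 0 = nourished).
trees = [
    lambda x: 1 if x["rural"] else 0,
    lambda x: 0 if x["mother_literate"] else 1,
    lambda x: 1 if x["low_wealth"] else 0,
]

child = {"rural": True, "mother_literate": True, "low_wealth": False}
print(forest_predict(trees, child))  # votes are 1, 0, 0, so the forest predicts 0
```

Using an odd number of trees avoids ties in a two-class problem.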
The Neural Network (NN) is a type of ML algorithm made up of layers of nodes: an input layer [74], one or more hidden layers, and an output layer. The input layer has several input neurons (X) that represent the information extracted from each feature in the dataset. Backpropagation is the training process in which prediction errors are fed back through the NN and the weights of the neural connections are modified until the error level is minimized [59, 60].
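A minimal sketch of backpropagation for a network with one hidden layer is given below. The architecture (two tanh hidden units, one sigmoid output), the fixed starting weights, the learning rate, and the toy data are all assumptions chosen to keep the example small and reproducible; they do not describe the network fitted in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_tiny_network(X, y, lr=0.1, epochs=500):
    """Full-batch gradient descent; returns the mean loss at each epoch."""
    n, n_in = len(X), len(X[0])
    w1 = [[0.1] * n_in, [-0.1] * n_in]  # input -> hidden weights (assumed start)
    b1 = [0.0, 0.0]
    w2 = [0.1, -0.1]                    # hidden -> output weights (assumed start)
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        g_w1 = [[0.0] * n_in for _ in w1]
        g_b1 = [0.0] * len(b1)
        g_w2 = [0.0] * len(w2)
        g_b2 = 0.0
        loss = 0.0
        for x, yi in zip(X, y):
            # Forward pass: hidden activations, then output probability.
            h = [math.tanh(b + sum(w * xj for w, xj in zip(ws, x)))
                 for ws, b in zip(w1, b1)]
            p = sigmoid(b2 + sum(w * hk for w, hk in zip(w2, h)))
            loss -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
            # Backward pass: propagate the output error to every weight.
            d_out = p - yi
            g_b2 += d_out
            for k, hk in enumerate(h):
                g_w2[k] += d_out * hk
                d_hidden = d_out * w2[k] * (1.0 - hk * hk)
                g_b1[k] += d_hidden
                for j, xj in enumerate(x):
                    g_w1[k][j] += d_hidden * xj
        losses.append(loss / n)
        # Update: move each weight against its average gradient.
        w1 = [[w - lr * g / n for w, g in zip(ws, gs)] for ws, gs in zip(w1, g_w1)]
        b1 = [b - lr * g / n for b, g in zip(b1, g_b1)]
        w2 = [w - lr * g / n for w, g in zip(w2, g_w2)]
        b2 -= lr * g_b2 / n
    return losses


losses = train_tiny_network([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
print(losses[-1] < losses[0])  # True: the error shrinks as weights are updated
```

The recorded loss is the mean cross-entropy before each update, so comparing the first and last entries shows the effect of repeatedly feeding the errors back through the network.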
Model evaluation
Model performance The performance of the ML models was evaluated using sensitivity, specificity, and accuracy [24,25,26,27,28,29, 75], calculated with the observed data as the gold standard. The relationship between model sensitivity and specificity is shown by the receiver operating characteristic (ROC) curves (Fig. 2).
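The three measures named above follow directly from the counts of true and false positives and negatives. The sketch below shows the calculation on made-up labels; the function names are illustrative, with 1 denoting undernourished and 0 nourished.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true, y_pred):
    """Share of truly undernourished children that are predicted as such."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Share of truly nourished children that are predicted as such."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def accuracy(y_true, y_pred):
    """Share of all children that are classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up observed and predicted labels.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(confusion_counts(observed, predicted))  # (3, 3, 1, 1)
print(sensitivity(observed, predicted))       # 0.75
print(specificity(observed, predicted))       # 0.75
print(accuracy(observed, predicted))          # 0.75
```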
Curves plotted above and to the left of the diagonal line perform better than chance. The area under each curve (AUC) gives an aggregate value that can be interpreted as the probability that the algorithm ranks a randomly chosen undernourished child above a randomly chosen nourished child [25, 76]. The AUC of the ROC curve was averaged over 10 cross-validation folds (ten repeats) [25]: the original sample is partitioned into ten disjoint subsets, nine of which are used in the training process, and predictions are made for the remaining subset. The best-fit model identified in this way is then used to predict undernutrition in another dataset, known as the test dataset [24,25,26,27,28,29].
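Both ideas in this paragraph, the AUC as a ranking probability and the partition of the sample into ten disjoint folds, can be sketched in a few lines. The function names, the pairwise AUC formula (ties counted as one half), the seed, and the toy scores are assumptions for illustration and are not the pROC or caret implementations used in the study.

```python
import random


def auc(y_true, scores):
    """Share of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the row indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))  # 0.75

folds = k_fold_indices(25, k=10)
for held_out in range(len(folds)):
    test_rows = folds[held_out]
    train_rows = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # A model would be fitted on train_rows and scored on test_rows here;
    # the ten AUC values would then be averaged.
    assert len(train_rows) + len(test_rows) == 25
```

An AUC of 0.5 corresponds to the diagonal line (no better than chance), and 1.0 to a model that ranks every undernourished child above every nourished child.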
Covariate selection and ranking Covariate selection is very important for prediction and interpretation, especially for high-dimensional datasets. To assess the importance of the predictors in the selected model, the study employed two measures: Mean Decrease Accuracy (MDA) and Mean Decrease Gini (MDG). The larger the decrease in model accuracy or in Gini impurity attributable to a variable, the more important that variable is for successful classification [77] (Table 1).
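The intuition behind the two importance measures can be sketched as follows. This is a simplified illustration: the accuracy-based measure here permutes one feature on a single dataset, whereas random forest software typically averages the decrease over out-of-bag samples of every tree, and the Gini function shows the impurity decrease of one split rather than the mean over a forest. The model, data, and function names are hypothetical.

```python
import random


def accuracy(model, X, y):
    return sum(1 for x, t in zip(X, y) if model(x) == t) / len(y)


def permutation_importance(model, X, y, feature, seed=0):
    """Drop in accuracy after shuffling one feature column."""
    baseline = accuracy(model, X, y)
    column = [x[feature] for x in X]
    random.Random(seed).shuffle(column)
    X_perm = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X, column)]
    return baseline - accuracy(model, X_perm, y)


def gini_impurity(labels):
    """Gini impurity of a node holding binary 0/1 labels."""
    p1 = sum(labels) / len(labels)
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = len(parent)
    return (gini_impurity(parent)
            - len(left) / n * gini_impurity(left)
            - len(right) / n * gini_impurity(right))


# Hypothetical model that uses only feature 0 and ignores feature 1.
model = lambda x: x[0]
X = [[0, 5], [0, 7], [1, 5], [1, 7]]
y = [0, 0, 1, 1]
print(permutation_importance(model, X, y, feature=1))  # 0.0: feature 1 is unused
print(gini_decrease([1, 1, 0, 0], [1, 1], [0, 0]))     # 0.5: a perfect split
```

A feature the model never uses has an accuracy decrease of exactly zero, while a split that separates the classes perfectly removes all of the parent node's impurity.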
Results
This analysis included data from 29,333 children aged 0–59 months. Of these, 15,281 (52.09%) had at least one form of undernutrition (stunting, wasting, or underweight) as measured by CIAF. We examined the prevalence of CIAF among under-five children (U5C) across child-level and mother/household-level covariates. CIAF was more prevalent among children of parents with no formal education than among children of parents with secondary or post-secondary education. Most of the undernourished children were from rural areas. Undernutrition was also more prevalent in households with a lower wealth index, among children of mothers with no media exposure, and in households with unimproved toilets and sanitation, compared with their counterparts. Covariates that were significant in the Chi-square tests were used to develop the ML algorithms on the training dataset (Table 2).
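The Chi-square screening step can be illustrated with a small sketch that computes the Pearson Chi-square statistic for a contingency table of a covariate against CIAF status. The counts below are invented for the example and are not study data; the function name is hypothetical.

```python
def chi_square_statistic(table):
    """Pearson Chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Invented 2 x 2 table: rows = residence (rural, urban),
# columns = CIAF status (undernourished, nourished).
stat, df = chi_square_statistic([[20, 10], [10, 20]])
print(round(stat, 3), df)  # 6.667 1
# With 1 degree of freedom the 5% critical value is about 3.841, so a
# covariate with this table would pass the screening.
print(stat > 3.841)        # True
```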
Figures in the supplementary documents show the effect of the log of the regularization parameter (\(\lambda\)) for the ridge, elastic net, and lasso regressions; the dotted vertical lines (at x = −4.51, x = −7.84, and x = −8.71, respectively) mark the values at which prediction accuracy is maximized. The coefficients of the model features are shown for different values of log (\(\lambda\)), with the selected value minimizing the mean squared error (MSE) established during cross-validation. The graphs show that as log (\(\lambda\)) decreases, the number of variables included in the model (those with nonzero coefficients) increases (Additional file 1).
Performance comparisons Accuracy and AUC were used to evaluate the efficiency of the ML algorithms. The comparison of the ML algorithms with traditional LR is shown in Fig. 3 and Table 3. All the ML algorithms considered in this study performed better than the classical logistic regression model in predicting undernutrition status. More detail is given in Additional file 1.
The 70/30 and 80/20 training/validation splits were compared to examine the behavior of the six models using several statistical measures and the area under the receiver operating characteristic curve. Although the models had almost identical performance metrics under the two train-test split ratios, the 70/30 split was chosen as the most appropriate for undernutrition classification. The prediction model based on RF performed best, with an AUC of up to 0.761, followed by LASSO (AUC = 0.717), while the prediction model using the traditional approach (LR) was the least efficient (AUC = 0.653). Hence, the RF model was chosen as the classification engine for the prediction model of under-five undernutrition in Ethiopian administrative zones (Table 3).
In machine learning prediction, identifying important attributes is also crucial. Feature importance scores represent the contribution of each feature to the trees' decisions. The random forest (the best algorithm for childhood undernutrition in our study) gives the MDA and MDG measures of the relative importance of the covariates in the model, which are summarized in Fig. 4. Urban–rural settlement (ur), the total under-five population, BMI, literacy rates of parents, and zone were the most important predictors of CIAF, whereas household size, age of mother, parity, and autonomy were the least predictive variables in our model (Fig. 4).
The predicted and actual values of undernutrition among the 72 administrative zones are mapped in Fig. 5. Using the best predictive model (RF), which yielded the highest AUC, we predicted the undernutrition status of under-five children by administrative zone. Both the crude and predicted undernutrition values were merged with the shapefiles of the second-level administrative units (zones). A visual comparison confirms that, while discrepancies exist in a few zones, the overall pattern of the observed prevalence is in line with the pattern of the predicted prevalence of undernutrition. The degree of agreement indicates that the actual and predicted values are strongly correlated. The third map shows the difference between the crude and predicted CIAF of U5C: a positive difference in a zone indicates that the crude prevalence is less than the predicted value, and a negative difference the reverse (Fig. 5).
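The zone-level comparison can be sketched as a simple aggregation of child-level records. The sign convention below (predicted minus crude) is chosen so that a positive difference means the crude prevalence is below the predicted one, matching the description of the third map; the zone names, records, and function name are hypothetical.

```python
from collections import defaultdict


def zone_prevalence(records):
    """Aggregate (zone, actual, predicted) records with 0/1 outcomes by zone."""
    sums = defaultdict(lambda: [0, 0, 0])  # actual count, predicted count, n
    for zone, actual, predicted in records:
        entry = sums[zone]
        entry[0] += actual
        entry[1] += predicted
        entry[2] += 1
    summary = {}
    for zone, (n_actual, n_predicted, n) in sums.items():
        crude = n_actual / n
        pred = n_predicted / n
        summary[zone] = {
            "crude": crude,
            "predicted": pred,
            "difference": pred - crude,  # positive: crude below predicted
        }
    return summary


# Made-up child-level records: (zone, observed CIAF, predicted CIAF).
records = [
    ("zone_a", 1, 1), ("zone_a", 0, 1),
    ("zone_b", 1, 0), ("zone_b", 1, 1),
]
for zone, row in sorted(zone_prevalence(records).items()):
    print(zone, row)
# zone_a {'crude': 0.5, 'predicted': 1.0, 'difference': 0.5}
# zone_b {'crude': 1.0, 'predicted': 0.5, 'difference': -0.5}
```

The resulting per-zone table is what would be joined to the zone shapefile to draw the crude, predicted, and difference maps.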
Discussions
Previous studies on this subject reported that Ethiopia is one of the countries with the highest number of undernourished under-five children in the world [2, 4, 8, 78, 79]. These studies also indicated that, while the prevalence of under-five undernutrition in the nation has declined over time, more effort is needed to accelerate this decline and to contain its negative consequences. In this study, we described spatial disparities in under-five undernutrition and predicted under-five undernutrition among Ethiopian administrative zones. The spatial maps show evidence of considerable zonal disparities in under-five undernutrition rates, similar to what has been reported in other countries [80,81,82]. The continuous data in this study were normalized and the categorical variables were encoded. ML models are regarded as advanced techniques for quick and accurate prediction in real-world problems. In this paper, we investigated the influence of the training/testing ratio on the performance of six popular ML models in predicting undernutrition among under-five children. The performance of the ML models changed only slightly between the two ratios, and the 70/30 ratio was the most suitable for training and validating the models. This finding is in line with previously published studies [18, 23, 30,31,32,33,34,35,36,37,38,39,40,41,42,43,44, 83,84,85,86]. ML can offer insight into novel factors associated with under-five undernutrition that can serve as targets for intervention. Among the six predictive models built using these techniques, the Random Forest (RF) model showed higher predictive power than the other ML models, including logistic regression.
The RF model shows that the urban–rural settlement ratio, the literacy level of parents, the under-five population, the BMI of mothers, location (zone, place of residence), and rainfall distribution were the most important predictors of under-five undernutrition in Ethiopia. This finding is consistent with previous studies [4, 42, 79, 81]. Moreover, the effects of the covariates in the selected ML algorithm are consistent with classical generalized linear models, which show that the educational level of parents, the age of the child, the sex of the child, birth order, dietary diversity, type of place of residence, women’s autonomy, household sanitation, and a clean water supply are the most significant variables for undernutrition [4, 6, 7, 10, 21, 79,80,81,82]. The child’s residence (zone) was one of the important risk factors for the U5C CIAF rate, which varied significantly across zones. This paper also explored the spatial variation in under-five child undernutrition and its predicted risk factors in Ethiopia using different machine learning approaches. Hence, we mapped the crude prevalence and the RF-predicted rate of under-five undernutrition by zone to document the zonal disparities in under-five undernutrition in the country.
Limitations
Since ML algorithms yield no regression coefficients and no directional effects, their results are difficult to interpret [21, 23, 87]. In the current study, the ML models only predict or classify, ranking variables by the importance of their contribution to determining under-five undernutrition; they do not support causal inference. More types of classification ML algorithms could also have been used [21, 23, 28, 38, 59].
Conclusions
The main objective of this study was to compare and evaluate the performance of different machine learning (ML) algorithms, considering the influence of two train-test split ratios, in predicting under-five undernutrition status. Popular statistical indicators, such as accuracy and area under the curve, were employed to evaluate the predictive power of the ML models under the different testing and training ratios. The higher the accuracy of a model, the better its performance. Our results confirm that ML models can effectively predict under-five undernutrition status and hence may be useful as decision tools for the concerned bodies. The best model was the RF, with accuracy and AUC of 68.2% and 76.2%, respectively. The findings showed that considerable zonal disparities exist and that under-five undernutrition persists in the northern part of Ethiopia. When implementing health policies aimed at the reduction of child undernutrition in Ethiopian administrative zones, zone characteristics must be taken into account.
Availability of data and materials
The dataset used and analyzed during the current study is available from the corresponding author on reasonable request.
Abbreviations
AUC: Area under the curve
EDHS: Ethiopian Demographic and Health Survey
BMI: Body mass index
ML: Machine learning
RF: Random forest
ANN: Artificial neural network
ROC: Receiver operating characteristic
U5C: Under-five children
LR: Logistic regression
References
 1.
Phalkey RK, et al. Systematic review of current efforts to quantify the impacts of climate change on undernutrition. Proc Natl Acad Sci. 2015;112(33):E4522–9.
 2.
World Health Organization. The state of food security and nutrition in the world 2019: safeguarding against economic slowdowns and downturns, vol 2019. Food & Agriculture Org; 2019.
 3.
El-Ghannam AR. The global problems of child malnutrition and mortality in different world regions. J Health Soc Policy. 2003;16(4):1–26.
 4.
Fenta HM, et al. Determinants of stunting among under-five years children in Ethiopia from the 2016 Ethiopia demographic and Health Survey: application of ordinal logistic regression model using complex sampling designs. Clin Epidemiol Glob Health. 2020;8(2):404–13.
 5.
Kassie GW, Workie DL. Determinants of undernutrition among children under five years of age in Ethiopia. BMC Public Health. 2020;20:1–11.
 6.
Pelletier DL, Frongillo EA. Changes in child survival are strongly associated with changes in malnutrition in developing countries. J Nutr. 2003;133(1):107–19.
 7.
Degarege D, Degarege A, Animut A. Undernutrition and associated risk factors among school age children in Addis Ababa, Ethiopia. BMC Public Health. 2015;15(1):1–9.
 8.
Takele K, Zewotir T, Ndanguza D. Understanding correlates of child stunting in Ethiopia using generalized linear mixed models. BMC Public Health. 2019;19(1):1–8.
 9.
Suriyakala V et al. Factors affecting infant mortality rate in India: an analysis of Indian states. In: The international symposium on intelligent systems technologies and applications. Springer; 2016.
 10.
Habyarimana F, Zewotir T, Ramroop S. A proportional odds model with complex sampling design to identify key determinants of malnutrition of children under five years in Rwanda. Mediterr J Soc Sci. 2014;5(23):1642–1642.
 11.
Nandy S, Svedberg P. The composite index of anthropometric failure (CIAF): an alternative indicator for malnutrition in young children. In: Handbook of anthropometry. Springer, pp 127–137; 2012.
 12.
Rasheed W, Jeyakumar A. Magnitude and severity of anthropometric failure among children under two years using Composite Index of Anthropometric Failure (CIAF) and WHO standards. Int J Pediatr Adolesc Med. 2018;5(1):24.
 13.
Shit S, et al. Assessment of nutritional status by composite index for anthropometric failure: a study among slum children in Bankura, West Bengal. Indian J Public Health. 2012;56(4):305.
 14.
Mandal G, Bose K. Assessment of overall prevalence of undernutrition using composite index of anthropometric failure (CIAF) among preschool children of West Bengal, India; 2009.
 15.
Sen J, Mondal N. Socioeconomic and demographic factors affecting the Composite Index of Anthropometric Failure (CIAF). Ann Hum Biol. 2012;39(2):129–36.
 16.
Knol MJ, et al. What do case-control studies estimate? Survey of methods and assumptions in published case-control research. Am J Epidemiol. 2008;168(9):1073–81.
 17.
Gu W, et al. Use of random forest to estimate population attributable fractions from a case-control study of Salmonella enterica serotype Enteritidis infections. Epidemiol Infect. 2015;143(13):2786–94.
 18.
Goldstein BA, Navar AM, Carter RE. Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges. Eur Heart J. 2017;38(23):1805–14.
 19.
Ambale-Venkatesh B, et al. Cardiovascular event prediction by machine learning: the multi-ethnic study of atherosclerosis. Circ Res. 2017;121(9):1092–101.
 20.
Adler ED, et al. Improving risk prediction in heart failure using machine learning. Eur J Heart Fail. 2020;22(1):139–47.
 21.
Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920–30.
 22.
Shameer K, et al. Machine learning in cardiovascular medicine: are we there yet? Heart. 2018;104(14):1156–64.
 23.
Kotsiantis SB, Zaharakis I, Pintelas P. Supervised machine learning: a review of classification techniques. Emerg Artif Intell Appl Comput Eng. 2007;160(1):3–24.
 24.
Quinlan JR. Induction of decision trees. Mach Learn. 1986;1(1):81–106.
 25.
James G, et al. An introduction to statistical learning: with applications in R. Berlin: Springer; 2013.
 26.
Molina M, Garip F. Machine learning for sociology. Annu Rev Sociol. 2019;45:27–45.
 27.
Géron A. Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems. O'Reilly Media; 2019.
 28.
Marsland S. Machine learning: an algorithmic perspective. Boca Raton: CRC Press; 2015.
 29.
Zhang H. The optimality of Naïve Bayes. FLAIRS2004 conference. 2004.
 30.
Esteva A, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542:115–8.
 31.
Anderson JP, et al. Reverse engineering and evaluation of prediction models for progression to type 2 diabetes: an application of machine learning using electronic health records. J Diabetes Sci Technol. 2016;10(1):6–18.
 32.
Friedman CP, Wong AK, Blumenthal D. Achieving a nationwide learning health system. Sci Transl Med. 2010;2(57):57cm29.
 33.
Ayer T, et al. Comparison of logistic regression and artificial neural network models in breast cancer risk estimation. Radiographics. 2010;30(1):13–22.
 34.
Farran B, et al. Predictive models to assess risk of type 2 diabetes, hypertension and comorbidity: machine-learning algorithms and validation using national health data from Kuwait: a cohort study. BMJ Open. 2013;3(5):e002457.
 35.
Aneja S, Lal S. Effective asthma disease prediction using naive Bayes—Neural network fusion technique. In: 2014 international conference on parallel, distributed and grid computing. 2014. IEEE.
 36.
Behroozi M, Sami A. A multiple-classifier framework for Parkinson’s disease detection based on various vocal tests. Int J Telemed Appl. 2016;2016:6837498.
 37.
Weiss JC, et al. Machine learning for personalized medicine: predicting primary myocardial infarction from electronic health records. AI Mag. 2012;33(4):33–33.
 38.
Methun MIH, et al. A machine learning logistic classifier approach for identifying the determinants of under-5 child morbidity in Bangladesh. Clin Epidemiol Glob Health. 2021;12:100812.
 39.
Bertolini M, et al. Machine learning for industrial applications: a comprehensive literature review. Expert Syst Appl. 2021:114820.
 40.
Schmidt J, et al. Recent advances and applications of machine learning in solid-state materials science. NPJ Comput Mater. 2019;5(1):1–36.
 41.
Wuest T, et al. Machine learning in manufacturing: advantages, challenges, and applications. Prod Manuf Res. 2016;4(1):23–45.
 42.
Talukder A, Ahammed B. Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh. Nutrition. 2020;78:110861.
 43.
Khare S, et al. Investigation of nutritional status of children based on machine learning techniques using Indian demographic and health survey data. Procedia Comput Sci. 2017;115:338–49.
 44.
Rahman SJ, et al. Investigate the risk factors of stunting, wasting, and underweight among under-five Bangladeshi children and its prediction based on machine learning approach. PLoS ONE. 2021;16(6):e0253172.
 45.
Gebreyesus SH, et al. Local spatial clustering of stunting and wasting among children under the age of 5 years: implications for intervention strategies. Public Health Nutr. 2016;19(8):1417–27.
 46.
Collaborators GRF. Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990–2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet (London, England). 2016;388(10053):1659.
 47.
Corsi DJ, et al. Shared environments: a multilevel analysis of community context and child nutritional status in Bangladesh. Public Health Nutr. 2011;14(6):951–9.
 48.
Griffiths P, et al. A tale of two continents: a multilevel comparison of the determinants of child nutritional status from selected African and Indian regions. Health Place. 2004;10(2):183–99.
 49.
Fetene N, et al. The Ethiopian health extension program and variation in health systems performance: what matters? PLoS ONE. 2016;11(5):e0156438.
 50.
Croft TN, et al. Guide to DHS statistics. Rockville, Maryland, USA: ICF; 2018.
 51.
Esri, ArcGIS Version 10.1. ESRI; 2010.
 52.
Ibeji JU, et al. Modelling children ever born using performance evaluation metrics: a dataset. Data Brief. 2021;36:107077.
 53.
Raschka S. Python machine learning. Birmingham: Packt Publishing Ltd; 2015.
 54.
Seger C. An investigation of categorical variable encoding techniques in machine learning: binary versus one-hot and feature hashing; 2018.
 55.
Yu HF, Huang FL, Lin CJ. Dual coordinate descent methods for logistic regression and maximum entropy models. Mach Learn. 2011;85(1–2):41–75.
 56.
Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics. 1970;12(1):55–67.
 57.
Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodol). 1996;58(1):267–88.
 58.
Zou H, Hastie T. Addendum: regularization and variable selection via the elastic net. J R Stat Soc Ser B (Stat Methodol). 2005;67(5):768–768.
 59.
Hecht-Nielsen R. Theory of the backpropagation neural network. In: Neural networks for perception. Elsevier; 1992. p. 65–93.
 60.
Abdelhafiz D, et al. Deep convolutional neural networks for mammography: advances, challenges and applications. BMC Bioinform. 2019;20(11):1–20.
 61.
Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (KDD '16). New York, NY, USA: ACM; 2016.
 62.
Garg A, Tai K. Comparison of statistical and machine learning methods in modelling of data with multicollinearity. Int J Model Identif Control. 2013;18(4):295–312.
 63.
Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics. 1970;12(1):55–67.
 64.
Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Ser B (Stat Methodol). 2005;67(2):301–20.
 65.
Yuan GX, Ho CH, Lin CJ. An improved GLMNET for L1-regularized logistic regression. J Mach Learn Res. 2012;13(1):1999–2030.
 66.
Genuer R, Poggi JM, Tuleau-Malot C. VSURF: an R package for variable selection using random forests. R J. 2015;7(2):19–33.
 67.
Robin X, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011;12(1):1–8.
 68.
Khan MRAA. ROCit: an R package for performance assessment of binary classifier with visualization; 2019.
 69.
Wickham H, Chang W, Wickham MH. Package ‘ggplot2’. Create elegant data visualisations using the grammar of graphics. Version. 2016; 2(1): 1–189.
 70.
Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
 71.
Genuer R, Poggi JM, Tuleau-Malot C. Variable selection using random forests. Pattern Recogn Lett. 2010;31(14):2225–36.
 72.
Janitza S, Tutz G, Boulesteix AL. Random forest for ordinal responses: prediction and variable selection. Comput Stat Data Anal. 2016;96:57–73.
 73.
Liaw A, Wiener M. Classification and regression by randomForest. R news. 2002;2(3):18–22.
 74.
Liang NY, et al. A fast and accurate online sequential learning algorithm for feedforward networks. IEEE Trans Neural Netw. 2006;17(6):1411–23.
 75.
Bland JM, Altman D. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;327(8476):307–10.
 76.
Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29–36.
 77.
Han H, Guo X, Yu H. Variable selection using mean decrease accuracy and mean decrease gini based on random forest. In: 2016 7th IEEE international conference on software engineering and service science (ICSESS). IEEE; 2016.
 78.
Gebre A, et al. Prevalence of malnutrition and associated factors among under-five children in pastoral communities of Afar Regional State, Northeast Ethiopia: a community-based cross-sectional study. J Nutr Metab. 2019;2019.
 79.
Kassie GW, Workie DL. Determinants of undernutrition among children under five years of age in Ethiopia. BMC Public Health. 2020;20(1):1–11.
 80.
Spray AL, et al. Spatial analysis of undernutrition of children in Leogane Commune, Haiti. Food Nutr Bull. 2013;34(4):444–61.
 81.
Simler KR. Nutrition mapping in Tanzania: an exploratory analysis. IFPRI Food Consumption and Nutrition Division Discussion Paper No. 204; 2006.
 82.
Khan J, Mohanty SK. Spatial heterogeneity and correlates of child malnutrition in districts of India. BMC Public Health. 2018;18(1):1–13.
 83.
Pham BT, et al. Spatial prediction of rainfall-induced landslides using aggregating one-dependence estimators classifier. J Indian Soc Remote Sens. 2018;46(9):1457–70.
 84.
Verma C, Illés Z. Attitude prediction towards ICT and mobile technology for the real-time: an experimental study using machine learning. In: The international scientific conference eLearning and software for education. “Carol I” National Defence University; 2019.
 85.
Van Dao D, et al. A spatially explicit deep learning neural network model for the prediction of landslide susceptibility. CATENA. 2020;188:104451.
 86.
Nguyen PT, et al. Soft computing ensemble models based on logistic regression for groundwater potential mapping. Appl Sci. 2020;10(7):2469.
 87.
Bitew FH, et al. Machine learning approach for predicting under-five mortality determinants in Ethiopia: evidence from the 2016 Ethiopian Demographic and Health Survey. Genus. 2020;76(1):1–16.
Acknowledgements
We thank the DHS Program for authorizing us to download the datasets used in this study from its website.
Funding
Not applicable.
Author information
Contributions
HMF carried out the data management and data analysis, drafted the manuscript, and revised the final version. TZ and EKM contributed to the conception and design of the study, the interpretation of the data, and the review and revision of the manuscript. All authors have read and approved the manuscript.
Ethics declarations
Ethics approval and consent to participate
The institutional review boards of Macro International and USAID gave ethical approval for the data used in this study. Authorization to use the data was formally requested through online registration on the MEASURE DHS website. The study protocol was submitted as part of that request, and approval to use the datasets was sought in this way.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1:
Implementation of the different supervised machine learning (SML) algorithms using R statistical software.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Fenta, H.M., Zewotir, T. & Muluneh, E.K. A machine learning classifier approach for identifying the determinants of under-five child undernutrition in Ethiopian administrative zones. BMC Med Inform Decis Mak 21, 291 (2021). https://doi.org/10.1186/s12911-021-01652-1
DOI: https://doi.org/10.1186/s12911-021-01652-1
Keywords
 Composite index for anthropometric failure (CIAF)
 Confusion matrix
 Covariate selection and ranking
 Multicollinearity
 Receiver operating characteristics (ROC)