Concordance between the estimates of wasting measured by weight-for-height and by mid-upper arm circumference for classification of severity of nutrition crisis: analysis of population-representative surveys from humanitarian settings

Background Despite frequent use of mid-upper arm circumference (MUAC) to assess populations at risk of nutrition emergencies, as well as evidence that measurement of children based on MUAC identifies different children than weight-for-height Z-score (WHZ) as wasted, no crisis classification thresholds based on prevalence of wasting by MUAC currently exist.
Methods We analyzed 733 population-representative anthropometric surveys from 41 countries conducted by Action Contre la Faim (ACF) and the United Nations High Commissioner for Refugees (UNHCR) between 2001 and 2016. Children aged 6–59 months were classified as wasted if they had a WHZ < −2 and/or a MUAC < 125 mm. Prevalence of wasting as assessed by WHZ and by MUAC were compared using correlations and linear regression models adjusting for stunting prevalence, sex and age distribution of the sample. Median prevalence of wasting by MUAC corresponding to each of the WHZ-based crisis thresholds was examined.
Results Median prevalence of wasting by WHZ was 10.47% (IQR: 6.34–17.55%) and by MUAC was 6.66% (IQR: 4.12–10.88%). Prevalence of wasting by WHZ exceeded prevalence by MUAC in 543 (74.1%) surveys and median prevalence by WHZ was greater in 30 (73.17%) countries. Prevalence of wasting by WHZ was poorly correlated with prevalence of wasting by MUAC (ρ = 0.55). R² was 0.36 for the unadjusted and 0.45 for the adjusted linear regression model. The difference between the prevalence by WHZ and by MUAC increased as the overall prevalence by WHZ increased (ρ = 0.69). Surveys with prevalence of wasting by WHZ approximately equal to thresholds for “poor” (5% ± 2.5%), “serious” (10% ± 2.5%), “emergency” (15% ± 2.5%), and “famine” (30% ± 2.5%) were observed to have median prevalence of wasting by MUAC of 4.51% (IQR: 2.73–6.81%), 6.67% (IQR: 4.27–10.03%), 8.15% (IQR: 5.11–11.86%), and 15.71% (IQR: 10.28–17.50%), respectively. There was a very substantial overlap of MUAC values across the threshold categories.
Conclusions Given a poor correlation between population prevalence of wasting by WHZ and by MUAC, classification of surveys based on prevalence of wasting by MUAC will result in poor concordance with current WHZ-based crisis thresholds, even if regional differences are considered, regardless of the cutoffs used.


Background
Prevalence of acute malnutrition is commonly used to benchmark the severity of a nutritional emergency to help inform the scale and scope of humanitarian response activities. Prevalence in a given context is compared with global standard thresholds. The World Health Organization (WHO) initially outlined guidance on these standard thresholds in 1995, modifying guidance from a 1992 consultation by the WHO Eastern Mediterranean regional office [1]. The guidance proposed classification of a situation using thresholds of less than 5% prevalence of wasting ("acceptable"), less than 10% ("poor"), less than 15% ("serious") and equal to or greater than 15% ("critical"). The Management of Nutrition in Major Emergencies, a joint guidance document drafted in 2000 by WHO, United Nations High Commissioner for Refugees (UNHCR), International Federation of Red Cross (IFRC), and World Food Programme (WFP), included these same thresholds, prompting a more universal adoption [2]. In 2004, the need for a higher, famine threshold was proposed by Howe and Devereux [3]. Currently, both the Integrated Phase Classification (IPC) used in East Africa and Asia and the Cadre Harmonisé (CH) used in the Sahel and West Africa use a cutoff of 30% prevalence of wasting as a threshold for famine, such that prevalence of wasting is used to classify a situation as Phase 1 (< 5%), Phase 2 (5 to < 10%), Phase 3 (10 to < 15%), Phase 4 (15 to < 30%) or Phase 5 (≥ 30%) [4,5].
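The phase boundaries listed above amount to a simple lookup on the WHZ-based prevalence. The following is a minimal sketch of that lookup, not an official IPC or CH tool; the function name `whz_phase` and the constant `PHASE_LOWER_BOUNDS` are illustrative names introduced here.

```python
from bisect import bisect_right

# Lower bounds (in percent) of Phases 2 to 5, as listed in the text.
PHASE_LOWER_BOUNDS = [5.0, 10.0, 15.0, 30.0]


def whz_phase(prevalence_pct: float) -> int:
    """Return the phase (1 to 5) for a WHZ-based wasting prevalence in percent.

    Each lower bound is inclusive: 5.0 falls in Phase 2, 30.0 in Phase 5.
    """
    if not 0.0 <= prevalence_pct <= 100.0:
        raise ValueError("prevalence must be between 0 and 100 percent")
    # bisect_right counts how many lower bounds are <= the prevalence.
    return bisect_right(PHASE_LOWER_BOUNDS, prevalence_pct) + 1
```

For example, `whz_phase(12.3)` falls between the 10% and 15% bounds and therefore maps to Phase 3.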
The above standard thresholds are all based on prevalence of wasting as assessed by weight-for-height Z scores (WHZ) [1,2,4,5]. In addition to WHZ, wasting can be assessed using mid-upper arm circumference (MUAC). Since 2005, WHO, WFP, the United Nations Children's Fund (UNICEF), and the Standing Committee on Nutrition (SCN) have recommended MUAC as an independent measure of wasting to be used as a criterion for admission into selective nutrition feeding programs [6][7][8]. However, separate thresholds for classifying a crisis based on prevalence of wasting as assessed by MUAC do not exist. Previous research demonstrating substantial discrepancy in diagnosis of children as wasted using WHZ and MUAC has prompted questions about the validity of applying WHZ-based thresholds to estimates of wasting based on MUAC. Based on an analysis of over 560 surveys from 31 countries, WHO estimated that only about 4 in 10 children were identified as wasted by both WHZ and MUAC, concluding that "the cases selected using weight-for-height and MUAC were not the same" [6]. A multi-country analysis by Grellety et al. using 1832 surveys from 47 countries similarly highlighted that a large proportion of children were identified as wasted by MUAC but not WHZ and by WHZ but not MUAC, adding that the proportion of children in each of these categories varied widely by country [9]. An analysis by Roberfroid et al. found that stunting, sex, and age all influenced diagnosis of acute malnutrition by MUAC but not WHZ [10]. However, while there is evidence to suggest poor correlation between WHZ and MUAC diagnosis of individual children, whether or not this translates into population-level differences in prevalence of wasting by WHZ and MUAC has yet to be evaluated.
Mid-upper arm circumference is increasingly used to measure wasting, especially as part of community-based screenings and at remote clinics where height boards and other anthropometric equipment may not be available. Additionally, several studies suggest that low MUAC better predicts mortality than low WHZ, as summarized by Briend et al. [11]. MUAC-only assessments are particularly common in humanitarian settings with extreme insecurity. In the absence of clear guidance on thresholds for classifying prevalence of acute malnutrition by MUAC, the WHZ-based thresholds have been applied in many contexts. The Cadre Harmonisé, for example, recommends this approach for the Sahel and West Africa [4]. The Integrated Food Security and Nutrition Phase Classification technical committee has identified the need for secondary analysis of existing survey data to explore the possibility of deriving thresholds for classifying severity of wasting at the population level using prevalence of wasting by MUAC where WHZ-based anthropometry data are not available.
The objective of this research therefore was to explore the concordance of the prevalence of wasting by WHZ and MUAC at the population level and the possibility of deriving MUAC-based crisis thresholds corresponding to the existing WHZ-based thresholds. The focus of the analysis was on total rather than severe wasting, as WHO-recommended emergency thresholds are based on the prevalence of total wasting [1,2]. To this aim, we assessed the correlation of prevalence of low MUAC and low WHZ in survey samples globally, and described the prevalence of wasting by MUAC in populations with prevalence approximately equal to the poor, serious, critical, and famine thresholds as determined by WHZ.

Methods
Data included in these analyses were from small-scale field nutrition surveys conducted in humanitarian settings by Action Contre la Faim (ACF) International (an international humanitarian non-governmental organization focused on nutrition in humanitarian settings worldwide) and by the United Nations High Commissioner for Refugees (UNHCR). Data were drawn from a database of 808 population-representative cross-sectional surveys conducted between 2001 and 2016 [12,13]. Surveys with sample sizes smaller than 196 persons and cluster surveys with fewer than 25 clusters were excluded a priori from all analyses as they did not meet minimum standards for small-scale cluster surveys [9]. Surveys that did not collect both MUAC and weight-for-height (weight, height, age and sex) were also excluded.
Weight-for-Height Z-scores (WHZ) were calculated for each child from the WHO 2006 growth standards using the WHO SAS macro [14]. Only children aged 6-59 months were included in the analyses. Prevalence of wasting by WHZ for each survey reflects the proportion of children with WHZ less than −2. Outlier observations were excluded from a survey if the child's Z-score fell outside the flexible exclusion range of ±4 Z-scores from the observed survey sample mean, as described by WHO [1]. Prevalence of wasting by MUAC for each survey reflects the proportion of children with MUAC values less than 125 mm. MUAC values less than 70 mm and greater than 220 mm were excluded as outliers. Individual observations within each survey were also excluded from calculations of wasting by WHZ for children without information on height, weight, age or sex and from calculations of wasting by MUAC for children without information on MUAC and age. Cases of bilateral pitting edema were not included in estimated prevalence of wasting by WHZ or MUAC; edema cases were relatively rare in all surveys, representing approximately 3 per 1000 (mean: 0.32%) sampled children.
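The per-survey prevalence calculations described above can be sketched as follows. This is an illustrative sketch only: it assumes WHZ values have already been computed (the study used the WHO SAS macro), represents missing values as `None`, applies the ±4 Z-score flexible exclusion in a single pass around the observed mean, and does not model the edema or age criteria. The function names are introduced here for illustration.

```python
from statistics import mean
from typing import Optional, Sequence


def prevalence_by_whz(whz_values: Sequence[Optional[float]]) -> float:
    """Percent of children with WHZ < -2 after flexible outlier exclusion."""
    observed = [z for z in whz_values if z is not None]
    if not observed:
        raise ValueError("no WHZ values available")
    survey_mean = mean(observed)
    # Flexible exclusion range: keep values within 4 Z-scores of the survey mean.
    kept = [z for z in observed if abs(z - survey_mean) <= 4.0]
    return 100.0 * sum(z < -2.0 for z in kept) / len(kept)


def prevalence_by_muac(muac_mm: Sequence[Optional[float]]) -> float:
    """Percent of children with MUAC < 125 mm after range-based exclusion."""
    # Values below 70 mm or above 220 mm are treated as outliers.
    kept = [m for m in muac_mm if m is not None and 70.0 <= m <= 220.0]
    if not kept:
        raise ValueError("no MUAC values available")
    return 100.0 * sum(m < 125.0 for m in kept) / len(kept)
```

Because the two functions apply different exclusions, the denominators for WHZ-based and MUAC-based prevalence can differ within the same survey, as in the analysis described above.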
Countries where the surveys were conducted were categorized into seven geographical or country groupings (Latin America and the Caribbean; Eastern and Southern Africa; Democratic Republic of Congo (DRC); West and Central Africa; South, Southeast Asia and Pacific; Sudan; Middle East and North Africa) as seen in Table 1. DRC and Sudan were each analyzed as their own grouping given the large number of surveys conducted in both countries.
The first aim of the analysis was to describe the relationship between prevalence of wasting as assessed by WHZ and the prevalence of wasting as assessed by MUAC on the same population. Spearman correlations were therefore calculated to describe the correlation between prevalence of wasting by WHZ and by MUAC, as well as between the difference in prevalence by WHZ and MUAC and the prevalence by each of WHZ and MUAC. A multivariate model was then constructed to explore the relationship of the prevalence by WHZ and by MUAC, controlling for key factors shown in previous research to be associated with the prevalence of wasting by MUAC. These factors included stunting prevalence, sex distribution, and age distribution of the survey sample [10]. Prevalence of wasting by MUAC as an outcome and all predictor variables were modeled as continuous linear terms. Sex ratio was calculated as the proportion of females in the survey sample. Age ratio was calculated as the proportion of younger children aged 6-29 months in the survey sample. Observations with significantly high leverage or Cook's distance were removed from the multivariable analyses. The regression analysis was repeated using logit-transformed independent and dependent variables [15]. To assess the reproducibility of the results, the analysis above was repeated with DRC included in the West and Central Africa region and Sudan included in the Middle East and North Africa region, and separately for surveys conducted by ACF and UNHCR.
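The Spearman correlations described above are Pearson correlations computed on ranks. The sketch below is a plain standard-library illustration of that calculation, with tied values given their average rank; it is not the Stata procedure used in the study, and the function names are introduced here for illustration.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation between two survey-level series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))
```

With survey-level lists `whz` and `muac` (prevalence in percent), `spearman_rho(whz, muac)` gives the first correlation described above, and `spearman_rho(whz, [w - m for w, m in zip(whz, muac)])` gives the correlation between WHZ-based prevalence and the WHZ minus MUAC difference.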
Second, we described the prevalence of wasting by MUAC in surveys with the prevalence of wasting by WHZ corresponding to the existing WHZ-based crisis thresholds (5, 10, 15 and 30%) to assess the feasibility of deriving corresponding thresholds using MUAC. Surveys with prevalence of wasting by WHZ within ±2.5% of the 5, 10, 15, and 30% crisis thresholds were included in the analysis. For example, to explore prevalence of MUAC that may correspond to the 10% WHZ-based crisis threshold we used surveys with a prevalence of wasting by WHZ between 7.5 and 12.5%. Median and interquartile range (IQR) for prevalence of wasting by MUAC were calculated for each of the sub-sets of surveys with prevalence approximately equal to the four thresholds, overall and by geographic region. The analysis was repeated using only the surveys within ±1.5% of the crisis thresholds.
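The windowing step described above can be sketched as a filter followed by a median and interquartile range. This is a minimal illustration under stated assumptions: surveys are represented as hypothetical `(whz_prevalence, muac_prevalence)` pairs in percent, the window bounds are treated as inclusive, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which may differ slightly from the quantile definition used in the original Stata analysis.

```python
from statistics import median, quantiles
from typing import Dict, Iterable, Tuple


def muac_summary_near_threshold(
    surveys: Iterable[Tuple[float, float]],
    threshold_pct: float,
    half_width_pct: float = 2.5,
) -> Dict[str, float]:
    """Summarize MUAC-based prevalence for surveys near a WHZ threshold.

    `surveys` holds (WHZ prevalence, MUAC prevalence) pairs in percent.
    """
    muac = sorted(
        m for w, m in surveys if abs(w - threshold_pct) <= half_width_pct
    )
    if len(muac) < 2:
        raise ValueError("need at least two surveys inside the window")
    # Quartile method is an assumption; the source does not specify one.
    q1, _, q3 = quantiles(muac, n=4, method="inclusive")
    return {"n": len(muac), "median": median(muac), "q1": q1, "q3": q3}
```

Calling the function with `half_width_pct=1.5` corresponds to the narrower sensitivity window. With inclusive bounds, a survey at exactly 7.5% WHZ prevalence would fall in both the 5% and 10% windows.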
Finally, we explored concordance of the possible MUAC classification by determining the proportion of surveys that would be classified into the same crisis category if categorized separately based on prevalence of wasting by WHZ and by MUAC. An example set of MUAC thresholds was used for this classification, derived from the median values observed in the previous stage of the analysis.
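The concordance measure described above is the share of surveys that land in the same category under both indicators. The sketch below illustrates the calculation; the MUAC cutoffs in the usage note are hypothetical placeholders chosen for illustration, not the threshold sets evaluated in the study, and the function names are introduced here.

```python
from bisect import bisect_right
from typing import Iterable, Sequence, Tuple


def category(prevalence_pct: float, cutoffs: Sequence[float]) -> int:
    """Index of the category for a prevalence, given ascending lower bounds."""
    return bisect_right(list(cutoffs), prevalence_pct)


def concordance(
    surveys: Iterable[Tuple[float, float]],
    whz_cutoffs: Sequence[float],
    muac_cutoffs: Sequence[float],
) -> float:
    """Proportion of surveys placed in the same category by WHZ and by MUAC.

    `surveys` holds (WHZ prevalence, MUAC prevalence) pairs in percent.
    """
    if len(whz_cutoffs) != len(muac_cutoffs):
        raise ValueError("both indicators need the same number of cutoffs")
    pairs = list(surveys)
    if not pairs:
        raise ValueError("no surveys supplied")
    same = sum(
        category(w, whz_cutoffs) == category(m, muac_cutoffs) for w, m in pairs
    )
    return same / len(pairs)
```

For instance, `concordance(pairs, (5, 15, 30), (4.5, 8.0, 16.0))` compares WHZ categories cut at 5, 15 and 30% with MUAC categories cut at the hypothetical values 4.5, 8.0 and 16.0%.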
All data were aggregated and cleaned using SAS Version 9.3, analysis was performed in Stata IC Version 14.2, and figures were produced in JMP Version 13.0.0.

Results
In total, 808 surveys were reviewed for this study. Seventy-five surveys were excluded from the analysis: 60 surveys did not collect MUAC measurements and another 15 had fewer than 25 clusters and/or had a sample size smaller than 196 children, resulting in 733 surveys from 41 countries retained for analysis. As seen in Table 1, the countries with the largest number of surveys were Sudan (150 surveys), DRC (130), Chad (67), Ethiopia (59) and Kenya (46). All other countries had 32 or fewer surveys each. Among selected surveys, 0.7% of children aged 6-59 months were excluded due to missing anthropometric values (sex, weight, height or MUAC) and an additional 0.4% were excluded due to out-of-range values for WHZ or MUAC. After exclusions, these surveys represent data from approximately 550 thousand children. The data suggest a positive but relatively weak monotonic correlation (ρ = 0.5485) between prevalence of wasting by WHZ and by MUAC (Table 2 and Fig. 1a). By region, correlation was highest for the Middle East and North Africa (ρ = 0.8901) and DRC (ρ = 0.6822) and lowest for Eastern and Southern Africa (ρ = 0.3553). Rho for all surveys was higher when prevalence of wasting by WHZ was correlated with the difference between the prevalence of wasting by WHZ and by MUAC (ρ = 0.6859). Notably, the difference in prevalence by WHZ and by MUAC was greatest for surveys with higher prevalence of wasting by WHZ. Conversely, overall correlation was lowest when prevalence of wasting by MUAC was correlated with the difference between the prevalence of wasting by WHZ and by MUAC (ρ = −0.1634). The strength of the correlation varied by region (Table 2 and Fig. 1b, c). These correlations did not change markedly when surveys from ACF and UNHCR were analyzed separately (not presented). R² in the univariate linear model with prevalence by WHZ as a predictor and prevalence by MUAC as an outcome was 0.36.
R² in the multivariate model, adjusted for prevalence of stunting, the proportion of younger children (aged 6-29 months), and the proportion of females, increased to 0.46 (Table 3). Multivariate model results suggest that a 1% increase in prevalence of wasting by WHZ was associated with a 0.5% increase in prevalence of wasting by MUAC; this association was highly significant (p < 0.001). All other covariates were also positively associated with prevalence of wasting by MUAC. Prevalence of stunting and the proportion of younger children were both significant (p < 0.001 for both), whereas the proportion of females was not (p = 0.218) (Table 3). Logit transformation of all variables in the model did not improve fit of either univariate or multivariate models (R² = 0.35 and 0.43, respectively). Table 4 and Fig. 2 present the median prevalence of wasting by MUAC corresponding to each of the WHZ-based crisis thresholds (5, 10, 15, and 30%). Overall, median prevalence of wasting by MUAC was 4.51% (IQR: 2.73-6.81%) for surveys near the "poor" threshold (5 ± 2.5%), 6.67% (IQR: 4.27-10.03%) for surveys near the "serious" threshold (10 ± 2.5%), 8.15% (IQR: 5.11-11.86%) for surveys near the "emergency" threshold (15 ± 2.5%), and 15.71% (IQR: 10.28-17.50%) for surveys near the "famine" threshold (30 ± 2.5%). Median prevalence of wasting by MUAC corresponding to the 5, 10, and 15% thresholds was virtually unchanged when only surveys within ±1.5% of the thresholds rather than ±2.5% of the thresholds were included in the analysis: 4.46, 7.06, and 7.92%, respectively. However, median wasting prevalence by MUAC corresponding to the famine threshold (30%) was higher when only surveys within ±1.5% of the threshold were included (16.94% vs. 15.71%); this estimate is likely less stable due to the smaller number of surveys in this threshold category.
Prevalence of wasting by MUAC for surveys with WHZ prevalence near all four thresholds varied considerably, as illustrated by the wide interquartile ranges and overall distributions (Fig. 2). For example, for surveys with wasting prevalence by WHZ approximately equal to 10%, prevalence of wasting by MUAC ranged from less than 1% to nearly 20%. The distributions for each of the four threshold categories all overlap substantially. Nearly half (48.0%) of all surveys corresponding to the 10% threshold (± 2.5%) have prevalence of wasting by MUAC within the IQR for the 15% threshold (± 2.5%), too great an overlap to allow for meaningful discriminatory power. Median prevalence of wasting by MUAC for each threshold category varied by region, suggesting that regional variation contributed to the overall variability observed. For surveys with prevalence of wasting by WHZ approximately equal to the 5, 10 and 15% thresholds, median wasting prevalence by MUAC was greatest in the DRC. Median wasting by MUAC in DRC was nearly double that of surveys from West and Central Africa for surveys with wasting by WHZ of 5 ± 2.5% and 10 ± 2.5%, and more than triple that of surveys from Eastern and Southern Africa with wasting prevalence by WHZ of 15 ± 2.5%. Median prevalence of wasting by MUAC was also lowest in Eastern and Southern Africa for surveys near the famine threshold (30 ± 2.5%).
Due to the large observed variation in wasting prevalence by MUAC corresponding to the current WHZ-based crisis thresholds, classification of surveys based on prevalence by MUAC and by WHZ independently resulted in poor concordance regardless of the MUAC-based thresholds used. Table 5 presents as illustration the proportion of surveys that would be classified into the same crisis category using a dozen MUAC threshold combinations, derived from the analysis presented in Table 4, when compared with WHZ categories of 5, 15 and 30%. In all iterations, approximately 4 in 10 surveys were classified into the same crisis category. No combination of MUAC-based thresholds achieved greater than 50% concordance. Notably, Table 5 only contains suggested MUAC thresholds corresponding to the 5, 15 and 30% WHZ thresholds. As shown in the previous analyses, the overlap in MUAC values around the 10 and 15% WHZ thresholds was too great to suggest a separate meaningful MUAC threshold for the 10% WHZ threshold. Including this threshold generally resulted in a lower proportion of concordant surveys.

Discussion
The analysis presented in this paper aimed to assess the feasibility of developing thresholds for determining the severity of a crisis in contexts where assessments of wasting using weight-for-height, the indicator for which WHO-recommended emergency thresholds exist, were not practical and only mid-upper arm circumference could be measured. However, analysis of data from over 700 surveys from more than 40 countries suggests that prevalence of wasting as assessed by MUAC was poorly correlated with prevalence of wasting by WHZ. Correlation was not substantively improved when analysis was repeated separately by region; rho values for all regions were below 0.7. Consistent with previous literature [10], the multivariable model demonstrated that an increase in prevalence of wasting by MUAC was significantly associated with an increased prevalence of stunting and an increased proportion of younger children (6 to 29 months of age) in the survey sample. Proportion of females in the sample was not significantly associated with the prevalence of wasting by MUAC. However, while prevalence of stunting and the proportion of younger children were both significant, including them in the model did not markedly improve fit (multivariate R² = 0.46; univariate R² = 0.36). A poor correlation of wasting prevalence as assessed by WHZ and MUAC is consistent with previous literature on inconsistencies in diagnosis of individual children as wasted using WHZ and MUAC [6,9,10].
Prevalence of wasting in most contexts was higher when assessed by WHZ than by MUAC; however, the reverse was true in approximately a quarter of all surveys.