Study design and population
This cross-sectional study included individuals aged 18 to 59 years who belonged to socioeconomic classes A, B, or C and resided in the urban area of Sao Paulo. Pregnant women and individuals with any physical or mental condition that might make them incapable of participating were excluded from the study.
Sampling occurred in two stages (Fig. 1). The first stage was probabilistic and consisted of the selection of census sectors; the second was non-probabilistic and used convenience sampling to select the participants. In the first stage, of the 15,879 census sectors in the urban area of Sao Paulo with a predominance of socioeconomic classes A, B, and C, 261 were selected. This selection was conducted systematically using the probability proportional to size (PPS) method. The second stage was conducted within each census sector using proportional quotas for the following variables: sex (female, 64%; male, 36%), age (18–29 years, 29%; 30–39 years, 22%; 40–49 years, 25%; 50–59 years, 24%), and socioeconomic class (AB, 43%; C, 57%). The quotas were based on a study by Possa et al., which aimed to verify the factors associated with yogurt consumption in Sao Paulo [22].
In each sector, 10 interviews were planned, five with yogurt consumers and five with non-consumers, for a total of 2610 interviews. The interviews took place in randomly selected homes, with a systematic skip applied after each home in which interviews were completed.
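The first-stage selection can be illustrated with a minimal sketch of systematic PPS sampling. The function name `pps_systematic`, the random seed, and the sector sizes are assumptions made for illustration only: the text does not state which size measure was used, so synthetic household counts stand in for it here.

```python
import random

def pps_systematic(sizes, n_select, seed=0):
    """Return indices of units chosen by systematic PPS sampling."""
    total = sum(sizes)
    step = total / n_select              # sampling interval
    rng = random.Random(seed)
    start = rng.random() * step          # random start in [0, step)
    selected = []
    cumulative = 0
    k = 0                                # index of the next selection point
    for i, size in enumerate(sizes):
        cumulative += size
        # A unit is selected once for every selection point in its size range.
        while k < n_select and start + k * step < cumulative:
            selected.append(i)
            k += 1
    return selected

# Hypothetical size measure: synthetic household counts per census sector.
size_rng = random.Random(1)
sector_sizes = [size_rng.randint(50, 500) for _ in range(15879)]
sectors = pps_systematic(sector_sizes, n_select=261)
planned_interviews = len(sectors) * 10   # 5 consumers + 5 non-consumers each
```

Under this scheme, larger sectors span more of the cumulative size scale and are therefore more likely to contain a selection point, which is the defining property of PPS selection.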
Yogurt consumers were those who reported a frequency of yogurt consumption ≥ 4 times a week in the last year [23–26]. The group of non-consumers, paired with the consumers for age, sex, and socioeconomic class, consisted of individuals with yogurt consumption frequencies of less than once a week. These pairing variables were selected because they are important confounders associated with the consumption of yogurt [22]. Individuals with a consumption frequency of one to three times a week, as well as those who reported consuming yogurt ≥ 4 times a week for less than one year, were not included in the study.
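The group definitions above amount to a simple decision rule, sketched below. The function name and the category labels are hypothetical codings of the questionnaire answers, not those used in the study instrument.

```python
def classify_participant(frequency, period):
    """Assign a study group from reported yogurt consumption.

    frequency: "<1/week", "1-3/week" or ">=4/week" (hypothetical labels)
    period:    "<1 year" or ">=1 year" (hypothetical labels)
    """
    if frequency == "<1/week":
        return "non-consumer"
    if frequency == ">=4/week" and period == ">=1 year":
        return "consumer"
    # One to three times a week, or frequent consumption for under one year.
    return "not included"
```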
The project was approved by the Institutional Review Board of the Universidade Federal de Sao Paulo. The participants provided their written informed consent.
Data collection and processing
Data collection was performed by trained interviewers in the participants’ homes during the week and on the weekend.
The interview occurred in two phases, both carried out on the same day. In the first phase, recruiting and selection were performed, and the participants’ demographic, socioeconomic, and yogurt consumption frequency data were collected.
To measure the frequency of yogurt consumption, the participants were initially asked: “Do you usually consume yogurt?” Next, a portfolio was presented containing different brands, versions, and packaging. The objective was to confirm the non-consumption of yogurt by individuals who responded negatively to the initial question and to confirm the consumption of yogurt among those who responded affirmatively. The participants were also questioned regarding the frequency of yogurt consumption (less than once a week; one to three times a week; four or more times a week) and the period of consumption (less than one year; one year or more).
Participants who met the study inclusion criteria continued to the second phase of the interview, in which anthropometric (weight, height, and waist circumference) and lifestyle (food consumption, physical activity, smoking, and alcohol consumption) data were obtained, as well as socioeconomic information (relationship status, level of education, and employment) and the presence of self-reported morbidities, including arterial hypertension, diabetes, cardiovascular diseases, and dyslipidemia.
Anthropometric data
Anthropometric data were measured in triplicate according to the procedures recommended by the World Health Organization (WHO) [27]. Weight was measured using calibrated platform-type digital scales (Wiso®, model W801, capacity of 180 kg and precision of 100 g), and height was measured using a portable stadiometer with platform (WCS®, maximal measurement 220 cm, precision of 0.1 cm). The mean values of weight and height were used to calculate the body mass index (BMI), defined as body mass in kilograms divided by the square of the height in meters (kg/m²). According to the WHO criteria, participants were classified as normal weight, overweight, or obese [28]. The abdominal circumference was obtained by measuring the waist at the midpoint between the last rib and the iliac crest using an inextensible measuring tape with 0.1-cm precision. To determine abdominal obesity, the cut-offs proposed by the WHO were used (≥ 94 cm for men and ≥ 80 cm for women) [28].
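The anthropometric indicators can be derived as in the following sketch. The function names are hypothetical, and the BMI cut-offs of 25 and 30 kg/m² are the usual WHO adult thresholds, assumed here because the text cites the WHO criteria without listing the values; the waist cut-offs are those given above.

```python
from statistics import mean

def bmi(weights_kg, heights_cm):
    """BMI from the means of repeated weight and height measurements."""
    weight = mean(weights_kg)
    height_m = mean(heights_cm) / 100
    return weight / height_m ** 2

def weight_status(bmi_value):
    """Classify BMI; 25 and 30 kg/m2 are assumed WHO adult cut-offs."""
    if bmi_value >= 30:
        return "obese"
    if bmi_value >= 25:
        return "overweight"
    return "normal weight"

def abdominal_obesity(waist_cm, sex):
    """Cut-offs from the text: >= 94 cm for men, >= 80 cm for women."""
    cutoff = 94 if sex == "male" else 80
    return waist_cm >= cutoff
```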
Demographic and socioeconomic data
To determine socioeconomic classes A, B and C, the classification proposed by the Brazilian Association of Market Research Institutes (ABIPEME, 1997), which considers the consumers’ goods and the educational level of the head of the family, was used. This classification divides the individuals into classes A, B, C, D, and E, based on the composite scores: class E (0–19 points); class D (20–34 points); class C (35–58 points); class B (59–88 points); and class A (≥89 points). Class A represents the most favored social stratum, and class E represents the least favored social stratum. It should be emphasized that this classification considers the education level of the head of the family, which will not necessarily be that of the participant.
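The score bands above translate directly into a lookup, sketched here with a hypothetical function name. Only the bands are taken from the text; the computation of the composite score itself is not reproduced.

```python
def abipeme_class(score):
    """Map an ABIPEME composite score to a socioeconomic class."""
    if score >= 89:
        return "A"
    if score >= 59:
        return "B"
    if score >= 35:
        return "C"
    if score >= 20:
        return "D"
    return "E"

def eligible_class(score):
    """The study included classes A, B and C only."""
    return abipeme_class(score) in {"A", "B", "C"}
```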
Furthermore, the individuals were categorized according to their level of education (<8 years of study/≥ 8 years of study), currently working and/or studying (yes/no), relationship status (with a partner/without a partner), and the presence of a child aged 3 to 12 years residing in the home (yes/no).
Lifestyle data
Physical activity
Data on physical activity were collected using the long version of the International Physical Activity Questionnaire (IPAQ) [29]. The individuals were initially distributed into two categories of physical activity level: sedentary and non-sedentary. For this classification, the four domains of the IPAQ (work, means of transport, domestic tasks, and leisure) were considered. Subsequently, the individuals were classified as active or not active in the “leisure” domain. Being physically active was defined as engaging in moderate-intensity physical activity for at least 30 min a day on five days a week or in vigorous-intensity physical activity for at least 20 min a day on three days a week.
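The definition of physically active can be written as a short rule. This is a sketch with a hypothetical function name; treating the stated numbers of days as minimums (five or more, three or more) is an assumption, as is the summary of each intensity by a single minutes-per-day value.

```python
def is_physically_active(moderate_min_per_day, moderate_days_per_week,
                         vigorous_min_per_day, vigorous_days_per_week):
    """Apply the moderate OR vigorous activity criterion from the text."""
    moderate_ok = moderate_min_per_day >= 30 and moderate_days_per_week >= 5
    vigorous_ok = vigorous_min_per_day >= 20 and vigorous_days_per_week >= 3
    return moderate_ok or vigorous_ok
```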
Consumption of alcoholic beverages and smoking
The consumption of alcoholic beverages was evaluated using the Alcohol Use Disorders Identification Test (AUDIT) [30]. By means of quantitative scores, the AUDIT identifies low-risk use (scores, 0–7), risky (hazardous) use (scores, 8–15), harmful use (scores, 16–19), and symptoms of alcohol dependence (scores, ≥ 20). The individuals who reported that they did not consume alcoholic beverages did not answer the AUDIT questionnaire.
The participants were classified as current smokers or non-smokers/former smokers.
Morbidities
Self-reported morbidity data were obtained using the following question: “Has any physician told you that you currently have any of the following diseases or conditions: hypertension, diabetes, osteoporosis, dyslipidemia, allergy, lactose intolerance, heart disease, obesity, anxiety or depression, Parkinson’s disease, sleep disorder, cancer, rheumatic disease and/or HIV?” Additionally, the following questions were asked: “Are you in treatment for this disease?” “In the past, have you had any of these diseases diagnosed by a physician?” The individuals who responded affirmatively to any of the questions or reported use of medications were considered as presenting morbidity.
Food intake data
Food intake data were collected using a Quantitative Food-Frequency Questionnaire (QFFQ) composed of 65 food items, each recorded with a frequency ranging from 0 (never) to 10 times; a unit of time (day, week, or year); and a portion size (small, medium, or large). The medium portion is the reference serving size and is presented in household measures and in grams. The food items are organized into the following categories: dairy products including yogurt (natural or with fruits), breads and biscuits, rice and tubers, legumes and eggs, meat and fish, soups and pasta, vegetables, sauces and spices, fruits, beverages, and sweets and desserts. An album containing images of household utensils was used to help complete the QFFQ. The QFFQ was developed and validated for the population of Sao Paulo to evaluate habitual food consumption during the year preceding its application [31].
To calculate habitual intake, the intake frequencies of the different items were converted to daily intake frequencies, which were multiplied by the size of the respective portion to obtain the daily intake of each item. To calculate the energy and nutrient intake, the daily quantity of food consumed was multiplied by the nutritional value of the item obtained from the food composition table of the United States Department of Agriculture (USDA) and the Brazilian Table of Food Composition (TACO) [32]. The nutrients evaluated in the present study were saturated fat, alcohol, added sugar, vitamin D, and the minerals magnesium, phosphorus, and calcium. Added sugar was defined as sugar added to foods and products during processing or preparation, as well as sugar added to the food at the time of consumption.
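The conversion from questionnaire answers to daily intake can be sketched as follows. The function names are hypothetical, the divisors of 7 and 365 days are the natural reading of the week and year units, and the expression of nutrient content per 100 g of food is an assumption about how the composition tables were applied.

```python
DAYS_PER_UNIT = {"day": 1, "week": 7, "year": 365}

def daily_frequency(times, unit):
    """Convert 'times per unit of time' to times per day."""
    return times / DAYS_PER_UNIT[unit]

def daily_intake_g(times, unit, portion_g):
    """Daily intake of an item in grams: daily frequency x portion size."""
    return daily_frequency(times, unit) * portion_g

def daily_nutrient(intake_g, nutrient_per_100g):
    """Daily nutrient intake, assuming composition values per 100 g."""
    return intake_g * nutrient_per_100g / 100
```

For example, an item reported three times a week with a 140 g portion corresponds to about 60 g/day, and the nutrient intake follows by scaling the composition value to that amount.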
Quartiles of consumption of yogurt and other dairy products
Based on the quantity of consumption in grams/day of yogurt and milk, cheeses, and fruit smoothies (that use milk), evaluated together here and named “other dairy products,” four other analysis groups were formed, called group LOW-Y/LOW-D (low consumption of yogurt and other dairy products), group LOW-Y/HIGH-D (low consumption of yogurt and high consumption of other dairy products), group HIGH-Y/LOW-D (high consumption of yogurt and low consumption of other dairy products), and group HIGH-Y/HIGH-D (high consumption of yogurt and other dairy products). In this case, the original groups of the study, “consumers of yogurt” and “non-consumers of yogurt” were not considered (Fig. 1).
To structure these new study groups, the values of the 25th and 75th percentiles of yogurt consumption in grams/day were first obtained, as well as the values of the 25th and 75th percentiles of consumption of the “other dairy products.” Subsequently, the consumption characteristics of each group were defined, whereby the consumption of yogurt or other dairy products below the 25th percentile was described as “low consumption” and the consumption of those products above the 75th percentile was described as “high consumption.”
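The formation of the four groups can be sketched as below. The function name is hypothetical, and the percentile method is the default of Python's `statistics.quantiles`, which may differ slightly from the definition used by the statistical package in the original analysis. Individuals between the 25th and 75th percentiles for either product fall into none of the four groups.

```python
from statistics import quantiles

def dairy_groups(yogurt_g, other_dairy_g):
    """Label each individual LOW/HIGH for yogurt and other dairy products."""
    y25, _, y75 = quantiles(yogurt_g, n=4)        # quartile cut points
    d25, _, d75 = quantiles(other_dairy_g, n=4)
    groups = []
    for y, d in zip(yogurt_g, other_dairy_g):
        y_level = "LOW-Y" if y < y25 else "HIGH-Y" if y > y75 else None
        d_level = "LOW-D" if d < d25 else "HIGH-D" if d > d75 else None
        # Only individuals extreme on both products enter a group.
        groups.append(f"{y_level}/{d_level}" if y_level and d_level else None)
    return groups
```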
Statistical analyses
The statistical analyses were performed using SAS software, version 9.2 (SAS Institute, Cary, NC, USA). The significance level was set at 5%.
Initially, the percentage of the participants classified as consumers or non-consumers of yogurt was calculated according to their relationship status (with/without a partner); currently working and/or studying (yes/no); educational level (<8 years/≥ 8 years); child residing in the home (yes/no); consumption of alcoholic beverages (dependent/harmful/risk/low risk); smoking (yes; no/former smoker); sedentary (yes/no); leisure activity (active: yes/no); nutritional status (no excess weight/overweight/obesity); abdominal circumference (adequate/high); and presence of self-reported morbidities, such as hypertension, diabetes, osteoporosis, dyslipidemia, allergy, lactose intolerance, cardiovascular diseases, anxiety or depression, Parkinson’s disease, sleep disorder, cancer, rheumatic disease, and HIV (yes/no). The chi-squared test or Fisher’s exact test was used to compare proportions. Additionally, the consumers and non-consumers of yogurt were compared according to the mean alcohol intake (g/day) using Student’s t test. Sex, age, and socioeconomic level were not compared between the groups because they were pairing variables.
Finally, the individuals included in the groups LOW-Y/LOW-D, LOW-Y/HIGH-D, HIGH-Y/LOW-D, and HIGH-Y/HIGH-D were compared according to the intake of nutrients (calcium, vitamin D, phosphorus, magnesium, saturated fat, and added sugar) and according to age, BMI, abdominal circumference, leisure physical activity (yes/no), sex (female/male), and educational level (<8 years/≥ 8 years). The groups LOW-Y/LOW-D and HIGH-Y/LOW-D were compared to verify the association between the consumption of yogurt and the variables analyzed; the groups LOW-Y/HIGH-D and HIGH-Y/LOW-D were compared to verify whether there were differences between the consumption of yogurt and that of other dairy products; the groups LOW-Y/LOW-D and LOW-Y/HIGH-D were compared to verify the association between the consumption of other dairy products and the variables analyzed; and the groups LOW-Y/LOW-D and HIGH-Y/HIGH-D were compared to verify the association between the combined consumption of yogurt and other dairy products and the variables analyzed.