Reproducibility and relative validity of a newly developed web-based food-frequency questionnaire for assessment of preconception diet

Background The preconception period is a window of opportunity to promote future parental and transgenerational health through diet and nutrition. As a sub-study of a large Norwegian study, ‘Diet today – health of tomorrow’, a food-frequency questionnaire (FFQ) was developed to assess diet during the preconception phase in young adults aged 20-30 years. In this paper we report the reproducibility and relative validity of this questionnaire. Methods The FFQ was developed from an existing FFQ validated in adolescents. Participants were recruited on social media and at a university. Reproducibility was assessed by comparing the test and retest of the FFQ. Relative validity was assessed by comparing intake measured by the FFQ with a 7-day weighed food record. Energy, nutrient and food intakes were used to assess the reproducibility and relative validity of the FFQ. Spearman’s rank correlation coefficient, percentage of agreement and Cohen’s Kappa were used to assess reproducibility and validity. Results A total of 32 participants were recruited to the study, of whom 21 completed both the test-retest reproducibility and the relative validation. The test-retest reproducibility had a median correlation coefficient of 0.85 for energy and nutrients, a median Spearman’s rank correlation coefficient of 0.75 and a median Cohen’s Kappa of 0.51 for food groups. The relative validity of the FFQ had a median correlation coefficient of 0.59 for energy and nutrients, a median Spearman’s rank correlation coefficient of 0.54 and a median Cohen’s Kappa of 0.28 for food groups. Conclusion This newly developed FFQ for preconception diet in young adults had a satisfactory test-retest reproducibility and fair relative validity.


Background
Eating a healthy and balanced diet throughout the life course protects against malnutrition in all its forms, as well as a range of non-communicable diseases (NCDs) and conditions [1]. The understanding that diet and nutrition during the preconception phase of life is important for a future child's development and later life conditions is a developing field of study, showing promise in promoting future parental and transgenerational health [2][3][4][5][6]. The preconception phase of reproductive life is defined as the time from reproductive maturity to conception [2].
The International Federation of Gynecology and Obstetrics [3] calls for worldwide action to improve diet and nutrition prior to conception in order to promote lifelong health and wellbeing, and to prevent the transmission of metabolic susceptibility to the next generation. The Lancet's Maternal Obesity 3 [7] points out the need for future research on detailed information about specific maternal lifestyle, nutritional, and metabolic exposures that underpin effects of maternal obesity on outcomes in offspring. Assessing diet and nutritional status is therefore essential to understand how to improve the health of individuals and populations [8]. Today there is no suitable Norwegian questionnaire to approach the preconception target population, from adolescence to young adulthood. The development of a food-frequency questionnaire (FFQ) that targets this population is therefore important.
There are several dietary assessment methods, each with its strengths and limitations. The retrospective methods, such as 24-h recall, diet history and FFQs, offer a retrospective view on dietary habits and food intake. These methods rely on the participants' memory and their ability to recall foods eaten and the frequency of intake [9]. Prospective methods, such as food records or the duplicate diet approach, assess food intake in real time. These methods assess actual intake over a specific period, but are less suitable for large scale epidemiological studies, as they are time consuming, demand a high level of motivation and represent a large burden for the participants [9]. The weighed food record (WFR) should be among the first methods of choice when assessing the validity of an FFQ [10]. The prospective nature of the method reduces errors related to participants recalling food intake, and using a weighing scale to quantify the amounts of food eaten improves the accuracy of the recorded intake. The limitations associated with the method, such as expenses, burden for participants and social desirability bias, make it less feasible for larger scale implementation [9,10]. An FFQ generally consists of a fixed food list and a frequency response section and may include further details on quantity and composition. FFQs are common in large scale observational studies because they are easily administered, are the least expensive and have the lowest participant burden compared with other dietary assessment methods, while being able to capture usual long-term dietary intake [11].
Given the importance of gaining more knowledge on the impact of diet and nutrition on health outcomes, it is crucial to examine the degree to which a dietary assessment method measures true intake [10,12] by testing its validity and reproducibility. Validity is tested by comparing the method with a more reliable reference method. In order to collect data on the large population that makes up preconception young adults, developing an FFQ was considered an appropriate methodological approach.
The present study is part of a larger study, 'Diet today - health of tomorrow'. The main study aims to develop, implement and evaluate a theory- and evidence-based digital intervention that promotes a healthful diet preconception, optimizes fetal conditions during pregnancy, and prevents NCDs in future children. No relevant FFQs were found for the target age group in Norway, making it necessary to develop an FFQ and test its accuracy.
Therefore, the purpose of the present study was to develop an FFQ for preconception young adults and investigate its reproducibility and validity in the target population.

Study design and recruitment
Reproducibility was assessed by comparing a test and retest of the FFQ. Relative validity was assessed by comparing intake measured by the FFQ (test) with 7-day weighed food records. Recruitment of participants took place from November 1st until November 24th in 2017. Data were collected from November 2017 until January 2018. To be included in the study, participants had to be 20-30 years old, without children and give their consent to participate. The lower limit of the age range was chosen to target young adults as they move away from home and start their independent life, thereby deciding their own diet. The upper limit of the age range was based on the age at first childbirth in Norway, which was 29 years for women and 32 years for men in 2018 [13]. The study included both preconception and periconception young adults but did not distinguish between the two. The participants needed to have access to the internet, possess the necessary skills to complete an internet-based questionnaire and be willing to weigh and record their intake of food for seven consecutive days. Finally, they needed to be able to meet in person at the University of Agder, Kristiansand, Norway, at least once to attend an instructional meeting.
The study was advertised on social media, among students at the Faculty of Health and Sport Sciences at University of Agder, and through word of mouth. The advertisement led potential participants to the study website, which contained a general outline of the study, an invitation to participate, contact information, and a button for enrolment in the study. When signing up for the study, the participants gave their consent to participate and selected one of 11 possible instructional meetings to attend. As participants signed up, an e-mail containing a link to the online FFQ was sent to their email address. The e-mail also contained information about the FFQ, a deadline for completing the FFQ, and information on where to meet for the instructional meeting. If a participant had not completed the FFQ 1 week before their scheduled meeting, a reminder was sent by e-mail.
For the test-retest reproducibility investigation, a link to the FFQ retest version was sent to participants via email on December 13th, 2017, resulting in at least 19 days between test and retest of the FFQ.

Study population
During the recruitment period of approximately 4 weeks, 32 participants signed up. Of these, 29 completed the first FFQ (of which three did not complete the entire FFQ). In the course of 11 instructional meetings 25 participants attended and started their 7-day WFR. Of these 25 participants, 22 completed the WFR (one participant recorded 6 days) and returned their recording booklet.
The FFQ retest was completed by 21 of the 22 participants that completed the WFR. A total of 21 participants completed all the components of the study (Fig. 1).

The food-frequency questionnaire
The FFQ was developed from an existing FFQ targeting adolescents [14]. In the first stage of development, the original FFQ was sent to five volunteers selected by purposive sampling, who completed the FFQ and then answered questions according to an interview guide. Feedback from the volunteers was considered by the authors. The inputs deemed relevant based on our understanding of the age group and development of an FFQ were included in the revision of the questionnaire. In addition, the original questionnaire was revised in accordance with input from the authors and expert colleagues at the faculty. In the second stage of development, the revised version of the FFQ was sent to two new volunteers, again selected by purposive sampling. The participants were interviewed in accordance with the interview guide upon completion of the FFQ. Based on their feedback the final version of the FFQ was created, using the online survey tool SurveyXact [15]. The food-frequency questionnaire can be found in Additional file 1. The majority of the food items in the original FFQ were included in the revised version. Some food items were excluded (food items not found relevant by the participants during the development, and food items targeting children, e.g. specific sweetened breakfast cereals), and some food items were added (e.g. low- and full-fat options, a wider range of condiments). Changes to frequency of intake were based on some participants requesting a greater degree of variation in order to reflect their dietary intake (e.g. adding 'more than 3 cups per day' for coffee).
The FFQ consisted of 146 questions aiming to reflect diet over the preceding four weeks by asking respondents to report their average intake of the specified food items during the last 4 weeks. Completing the FFQ took about 25 min. The FFQ started with the respondents' sex, age, self-reported height and weight, and level of education, followed by 121 questions related to average consumption of foods and beverages. These were divided into different sections (beverages and dairy products, bread and grain products, lunch meats, dinner meals and side dishes, fruits and vegetables, desserts, cakes and snacks). The FFQ ended with 13 questions regarding food habits and 7 questions related to physical activity, screen time, sleep and tobacco use.
The response alternatives regarding frequency of intake varied according to food and beverage questions. For beverages and dairy products (not including yoghurt), the interval range was 'never', '1-3 per month', '1-3 per week', '4-6 per week', '1 per day', '2-3 per day' and 'more than 3 per day'. Dinner meals and side dishes used the interval ranges 'never', '1-3 per month', '1 per week', '2-4 per week' and 'more than 4 times per week'. Fruit and vegetables, desserts, cakes and snacks all used the interval range 'never', '1-3 per month', '1 per week', '2-3 per week', '4-6 per week' and '1 or more times per day'. The respondents reported their food intake in 'units per month', 'units per week' or 'units per day'. The unit measurements differed between sections and foods, whereas most questions were related to a standard portion size (e.g. cup of coffee, a piece of bread, an apple). For some questions, extra information was provided (e.g. maize = 2 tablespoons or soda = 0.5 l).
Calculating food and nutrient intake was overseen by the corresponding author. All food and beverage related questions were linked to a corresponding food-code in the Norwegian food composition table [16]. The Norwegian Food Safety Authority's "Weights, measures and portion sizes for food" [17] and the web-page "Food-Recipes" [18] were used when assigning portion sizes. Amount in grams/millilitres was calculated using portion sizes and reported frequency of intake per day (based on 1 month being 28 days, as the FFQ reported intake in a four-week retrospect). FoodCalc [19] was used to process the FFQ, based on nutritional values from the Norwegian food composition table [16]. Food and beverage items assessed by the FFQ were organized into 28 non-overlapping food groups according to nutrient profile, biological classification or culinary usage.
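The conversion from reported frequency to daily amount described above can be illustrated with a minimal sketch. The mapping below is a hypothetical coding, not the one used in the study: range categories are assigned their midpoint, the open-ended category is assigned an arbitrary value, and the 200 ml cup is an assumed portion size. Only the 28-day month is taken from the text.

```python
# Illustrative sketch: convert an FFQ frequency response and a portion
# size into intake per day. All coded values are assumptions.
DAYS_PER_MONTH = 28  # one month counted as 28 days (four-week FFQ period)
DAYS_PER_WEEK = 7

# Hypothetical coding of the beverage/dairy response alternatives,
# expressed as occasions per day (midpoints used for ranges).
FREQUENCY_PER_DAY = {
    "never": 0.0,
    "1-3 per month": 2 / DAYS_PER_MONTH,
    "1-3 per week": 2 / DAYS_PER_WEEK,
    "4-6 per week": 5 / DAYS_PER_WEEK,
    "1 per day": 1.0,
    "2-3 per day": 2.5,
    "more than 3 per day": 4.0,  # assumed value for the open-ended category
}


def daily_intake(response: str, portion_size: float) -> float:
    """Return intake per day (g or ml) for one FFQ item."""
    return FREQUENCY_PER_DAY[response] * portion_size


# Example: coffee reported as '2-3 per day' with an assumed 200 ml cup.
coffee_ml_per_day = daily_intake("2-3 per day", 200.0)
```

How open-ended and range categories are coded affects the estimated intake, so any real implementation would need the coding rules actually applied by the authors.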

The 7-day weighed food record
At the instructional meetings, the participants received general information on how and why the study was conducted, and instructions on how to weigh and record their food. They also received the equipment necessary to implement the WFR. The participants were encouraged to maintain their normal diet during the recording period, as any change in diet could influence the validation of the FFQ. Every participant received a weighing scale (Swordfish SFKSW14E) and was told to use this rather than a personal weighing scale. The recording booklet contained two pages of information on how to weigh and record food. This information was reviewed, followed by a review of the booklet itself. To accurately record the weight of the food, participants were given boxes to weigh the remains after a meal. A practical example was conducted with weighing and recording of a test meal. When eating out, participants were instructed to weigh and record their food as usual if possible. If the weighing scale was not accessible, participants were instructed to take note of what they ate, take a picture if possible and estimate portion size. The participants were offered a pre-paid envelope if they did not have the opportunity to deliver the recording booklet in person after completing the WFR. The WFR started the following day and continued for seven consecutive days. The participants were encouraged to make contact by e-mail or telephone if they had any questions.
Data entry of the food records was conducted by the corresponding author in collaboration with two research assistants. Calculating food and nutrient intake was overseen by one of the co-authors. Foods and beverages recorded by the participants in the WFR were linked to a corresponding food-code in the Norwegian food composition table [16]. The Norwegian Food Safety Authority's "Weights, measures and portion sizes for food" was used when converting from volume measurements to grams and for calculating the weight yield factor from cooking [17]. When participants provided recipes, these were replaced with the closest approximate food-code. We used FoodCalc [19] and the Norwegian food composition table [16] to calculate food and nutrient intakes from the WFR. Food and beverage items assessed by the WFR were aggregated into the same 28 non-overlapping food groups used for the FFQ.

Statistical analysis
Descriptive analyses were used to evaluate the characteristics of the participants (age, height, weight, body mass index (BMI), level of education). Most of the nutrients used to assess reproducibility and relative validity were not normally distributed and are therefore presented as median with 25th and 75th percentiles, although some were considered to be normally distributed and are therefore presented as mean with standard deviation (SD). The food groups used to organize the FFQ test-retest and WFR were not normally distributed, and are therefore presented as median with 25th and 75th percentiles. We used Spearman's rank correlation coefficient to examine the correlations for the test-retest reproducibility and the relative validity. Correlation coefficients between 0.5 and 0.7 have been shown to be common when testing the reproducibility between two administrations of an FFQ [10], and a Spearman correlation coefficient above 0.5 is recommended for nutrients in dietary validation studies [20].
We also examined energy intake and intake of vegetables and fruits assessed by the FFQ and the WFR using Bland-Altman plots, i.e. by plotting, for each participant, the mean of the two measurements (x-axis) against the difference between them (y-axis) [21]. Further, the total intake per day for the foods and beverages included in each food group was ranked into tertiles of intake for the FFQ and WFR. The FFQ's ability to categorize participants into the correct tertile of intake was assessed by calculating the percentage of agreement. Participants were presented as percent correctly classified or grossly misclassified. Participants correctly classified were categorized in the same tertile of intake for both measurements, whereas participants that were grossly misclassified were categorized in non-adjacent tertiles. Unweighted Cohen's Kappa was calculated for food intake in each food group ranked into tertiles to identify the strength of agreement. Values of Kappa, according to Masson et al., are categorized as follows: < 0.20: poor agreement, 0.21-0.40: fair agreement, 0.41-0.60: moderate agreement, 0.61-0.80: good agreement, and > 0.80: very good agreement [20]. Self-reported height and weight were used to calculate BMI (kg/m²). The significance level was set to 5%, and all statistical analyses were carried out using the computer program IBM SPSS Statistics for Windows, version 24 (IBM Corp., Armonk, N.Y., USA).
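As an illustration of the correlation measure named above, Spearman's rank correlation coefficient can be computed as the Pearson correlation of the ranked values, with tied values given their average rank. The following is a minimal sketch in plain Python, not the SPSS procedure used in the study; the function names and the intake values are made up for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up energy intakes (MJ/day) for six participants, test and retest.
ffq_test = [8.1, 9.7, 6.4, 11.2, 7.9, 10.3]
ffq_retest = [7.8, 9.9, 6.9, 10.8, 8.4, 10.1]
rho = spearman(ffq_test, ffq_retest)
```

Because the coefficient depends only on rank order, it is suited to the skewed intake distributions described above.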
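The tertile-based agreement measures described above can be sketched as follows. This is an illustrative plain-Python version under simplifying assumptions (ties are broken by position, and the intake values are made up); it is not the SPSS procedure used in the study.

```python
def tertiles(values):
    """Assign each value to tertile 0, 1 or 2 according to its rank order.

    Ties are broken by position, which is a simplification.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for position, index in enumerate(order):
        groups[index] = position * 3 // n
    return groups


def agreement(a, b):
    """Return (% correctly classified, % grossly misclassified)."""
    n = len(a)
    correct = sum(x == y for x, y in zip(a, b)) / n
    # Grossly misclassified: lowest tertile by one method, highest by the other.
    gross = sum(abs(x - y) == 2 for x, y in zip(a, b)) / n
    return 100 * correct, 100 * gross


def cohen_kappa(a, b, categories=(0, 1, 2)):
    """Unweighted Cohen's Kappa for two categorical classifications."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)


# Made-up intakes (g/day) of one food group for nine participants.
ffq = [120, 300, 80, 410, 250, 60, 390, 200, 150]
wfr = [100, 280, 150, 90, 260, 70, 350, 330, 140]
ffq_t, wfr_t = tertiles(ffq), tertiles(wfr)
percent_correct, percent_gross = agreement(ffq_t, wfr_t)
kappa = cohen_kappa(ffq_t, wfr_t)
```

In this made-up example six of nine participants fall in the same tertile and one is grossly misclassified. With equally sized tertiles the agreement expected by chance is one third, which gives a Kappa of 0.5, i.e. moderate agreement on the scale cited above.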

Sample
The characteristics of the 29 participants who completed the first FFQ are presented in Table 1. There was an uneven distribution of women and men in the study. Median age of the participants was 23 years while median BMI was 24.1 kg/m². The majority of the sample had a higher level of education (university/college up to 4 years or more). The 21 participants (17 women and 4 men) that completed both the test-retest of the FFQ and the WFR were used to assess reproducibility and relative validity of the FFQ.

Test-retest reproducibility
The median correlation coefficient for energy and nutrients was 0.85, ranging from r = 0.56 for vitamin D to r = 0.93 for calcium. There were 15 nutrients that showed high correlation (> 0.7) and two nutrients that were in the common range of correlation for reproducibility (0.5-0.7) (Table 2). The median Spearman's rank correlation coefficient for the food groups was 0.75, ranging from r = 0.22 for processed red meat to r = 0.93 for oils, butter and margarine (Table 3). There were 19 food groups that showed high correlation, five food groups in the common range of correlation for reproducibility, and four food groups that showed a low degree of correlation. For all but the food groups processed red meat; liver-pate; fatty fish; fish dishes; rice, pasta and noodle; and salty snacks, less than 10% of the participants were grossly misclassified, that is, classified into a non-adjacent tertile. The median Cohen's Kappa value for the food groups was 0.51, with a range from k = 0.01 for processed red meat to k = 0.78 for low fibre bread and crispbread.

Relative validity
The calculated daily median energy intake was 9.7 MJ (mean: 9.1 MJ) by the FFQ and 9.1 MJ (mean: 10.2 MJ) by the WFR. The Bland-Altman plot showed that although the mean difference between the methods (bias) was small, the limits of agreement were wide and showed large differences at the individual level (Fig. 2). Energy intake by the FFQ ranged from 4.3 MJ (1028 kcal) to 13.4 MJ (3203 kcal) and by the WFR from 5.5 MJ (1315 kcal) to 9.9 MJ (2366 kcal), all of which are within a plausible range for young adults. The calculated daily median intake of vegetables and fruit was 367 g (mean: 311 g) by the FFQ and 350 g (mean: 395 g) by the WFR (Fig. 3). Vegetable and fruit intake by the FFQ ranged from 49 g to 590 g and by the WFR from 75 g to 866 g, all of which are within a plausible range. The median intakes were higher by the FFQ than by the WFR for energy and four other nutrients (carbohydrates, fibre, sodium, and iron). This was also seen for food groups including vegetables and fruit, processed red meat, fish dishes, high fibre bread and crispbread, sugar, sweets and desserts, and pizza. Still, the calculated mean intakes were lower by the FFQ than by the WFR, as illustrated by the Bland-Altman plots (Figs. 2 and 3). Correlations for energy and nutrients had a median of 0.59, with a range from r < 0.05 for saturated fat to r = 0.78 for folate. There were 13 nutrients that showed moderate to good correlation (> 0.5) and four nutrients that were below the recommended correlation coefficient for relative validation (Table 2). Spearman's rank correlation coefficient, percentage of agreement and Cohen's Kappa value were analysed for food groups to investigate the relative validity (Table 4). The median Spearman's rank correlation coefficient for the 28 corresponding food groups was 0.54, ranging from r = −0.14 for potatoes to r = 0.81 for oils, butter and margarine. There were 16 food groups that showed moderate to good correlation and 12 food groups with low to moderate correlation. For all but the food groups fatty fish, potatoes and salty snacks, less than 20% of the participants were grossly misclassified. The median Cohen's Kappa value for the food groups was 0.28, with a range from k = −0.13 for potatoes to k = 0.65 for oils, butter and margarine.
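The quantities summarized by the Bland-Altman plots (the mean difference, or bias, and the limits of agreement) can be computed as in the following minimal sketch. It assumes limits of agreement at ±2 SD of the differences and uses made-up energy intakes; the function name and data are illustrative and do not reproduce the study's results.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b, multiplier=2.0):
    """Return per-participant means and differences, the bias, and the
    limits of agreement (bias +/- multiplier * SD of the differences)."""
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - multiplier * sd, bias + multiplier * sd)
    return means, diffs, bias, limits


# Made-up energy intakes (MJ/day) for five participants.
ffq_energy = [8.0, 10.0, 6.0, 12.0, 9.0]
wfr_energy = [9.0, 9.0, 8.0, 11.0, 10.0]
means, diffs, bias, (lower, upper) = bland_altman(ffq_energy, wfr_energy)
```

A negative bias here means that the first method (the FFQ in this sketch) gives lower values on average than the second. Wide limits of agreement indicate large differences for individual participants even when the bias is small.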

Discussion
The FFQ developed for assessing preconception diet among young adults showed satisfactory test-retest reproducibility, indicating that the FFQ is suitable for its target group. The relative validity of the FFQ explored against 7-day WFR indicated a fair agreement between the test and reference method.
[...] for vitamin D to r = 0.93 for calcium. The study included a wider age group, with a mean age of 27.5 (range 18-71). The reproducibility of an online semi-quantitative FFQ to be used for personalized dietary advice was tested by Fallaize et al. [23]. Their unadjusted correlations for energy and nutrients show results that are similar to ours. The reproducibility of 35 food groups showed a mean correlation coefficient of 0.75, similar to our results. The study had a similar percentage of exact agreement for food groups, although their intake was ranked into quartiles. The study sample included a somewhat wider age group with a mean age of 32 (SD 12). Hebden et al. tested the reproducibility of a semiquantitative FFQ on participants 18-34 years old [24]. The study only had food groups for fruit, fruit including fruit juice, and vegetables. Their weighted Kappa value for vegetable servings was similar to our unweighted Kappa results, but their value for fruit servings was higher. Their food groups were ranked into quintiles instead of tertiles.
Food intake (gram/milliliter per day) for food groups (median, 25th and 75th percentile, Spearman's rank correlation coefficient, percent correctly classified (CC) and grossly misclassified (GM) into tertiles of intake, Cohen's Kappa value). Statistically significant p-values marked in bold

Relative validity
Comparison of intake between the FFQ and WFR resulted in a median correlation coefficient of 0.59 for energy and nutrients and 0.54 for food groups. The median Cohen's Kappa value was 0.28 for the food groups. This indicates fair agreement between the two methods. This is also evident from the Bland-Altman plots (Figs. 2 and 3). The proportion of participants correctly classified into the same tertile of intake ranged from 19% for the food group lean fish to 76% for oils, butter and margarine. Three of the food groups in the relative validation had 20% or more participants grossly misclassified. The WFR dietary assessment does not reflect the same four-week time span as the FFQ, which it ideally should [10]. Although the WFR assessment covered only 7 days in our study, the mean difference in calculated energy intake was relatively low (approximately 13%). The differences between mean intakes calculated by the FFQ and WFR were negative (Figs. 2 and 3), indicating that intakes were skewed to the left, i.e. with a wider range of low than high intakes. As most nutrient intakes correlate strongly with energy intake, energy intake is commonly used to identify and exclude individuals with invalid dietary reports. There are no established limits for plausible energy intakes by FFQs. Without further justification than being implausible, a lower limit of 2.5 MJ/day (600 kcal) and an upper limit of 15 MJ/day (3600 kcal) have been applied as cutoffs for reported energy intake by FFQs in several observational studies [25][26][27]. All the reported energy intakes in the current study were well within these limits and were considered plausible for young adults. The level of underreporting has in some studies been as high as 46% for women and 29% for men [10].
Steinemann et al. [28] examined the relative validity of an FFQ to be used for estimating the food intake in an adult population. The study sample (n = 56) had a mean age of 40 years (range 22-85) and completed a 4-day weighed food record. Correlations were reported for energy, four nutrients and 25 food groups. Our results showed correlations similar to those in that study for protein and fat, but higher correlations for energy, carbohydrates and fibre. Comparing the relative validity of food groups, our study had a higher median correlation coefficient than Steinemann et al. The present study had higher correlation coefficients for fruit and vegetable intake compared with Hebden et al. [24]. Our Kappa values for fruit and vegetable intake were somewhat lower than their weighted Kappa, despite our intake being ranked into tertiles instead of quintiles. Their relative validity for energy, protein, sugars and alcohol was similar to ours, although our results showed lower correlation for saturated fat and higher correlations for carbohydrates and dietary fibre. Fallaize et al. compared nutrients and food groups from an FFQ with a 4-day WFR on a sample with a mean age of 26.9 (SD 8.4) [23]. The correlations of 30 nutrients (of which seven were E%) ranged from 0.23 for vitamin D to 0.65 for protein, E%, with a mean value of 0.47. Our results show a wider range with a somewhat higher median correlation value. The correlations for 35 food groups showed a similar range as in our study, although our median correlation value was higher. Their results ranged from 18 to 55% of participants being correctly classified into quartiles of intake with a mean of 5% grossly misclassified, which is comparable to our 28 food groups ranked into tertiles.
The nutrients included in our analysis were based on the Norwegian Directorate of Health's "A healthy lifestyle before and during pregnancy" [29], which highlights folate, vitamin D, iron, calcium, iodine and vitamin B12. Energy, macronutrients and fibre were included to assess dietary contributions, alcohol was included to assess alcohol intake and vitamin A was included because of its importance in fetal development during pregnancy [30]. Salt and sugar were included as they are important factors in a public health perspective, and are priority areas in the Norwegian "Partnership for a healthier diet" [31]. The study was promoted at a university and on social media related to this university, which may have resulted in a majority of student participants. The implementation of the study coincided with the end-of-year examinations for students. Four participants (19%) pointed out that this had interfered with their dietary habits. The upcoming Christmas time was mentioned as a reason for increased consumption of unhealthy foods by five participants (24%). The study sample size is a limitation. We did not conduct an a priori sample size calculation. A reasonable sample size for reproducibility and validation studies is 100-200 persons [10]. The present sample, with its overrepresentation of women, students and participants with a high level of education, has limited representativeness for the population. Evidence has shown that women are more likely to underestimate intake, which may have affected the results [8]. Of the 21 participants included in the analysis, 16 reported using dietary supplements in the FFQ, of whom 12 (75%) also reported using dietary supplements in the WFR. There is limited available research on diet in the preconception age group. A scoping review identified a paucity of longitudinal data into the mid and late twenties, a varying use and quality of dietary assessment methods, and a large variety of macronutrients and food groups studied [32].
Bland-Altman plot for measuring daily vegetable and fruit intake. Bland-Altman plot between the food frequency questionnaire (FFQ) and the weighed food record (WFR) methods for measuring daily vegetable and fruit intake. The solid line represents the mean difference between the two methods, and the dashed lines represent the limits of agreement corresponding to ±2 SD

Conclusion
This food-frequency questionnaire (FFQ) developed for assessing preconception diet in young adults had satisfactory test-retest reproducibility and fair relative validity compared with 7-day weighed food records. The validated FFQ will be used to investigate transgenerational diet-disease associations in future studies.
Food intake (gram/milliliter per day) for the food-frequency questionnaire (FFQ) and the 7-day weighed food record (WFR) for food groups (median, 25th and 75th percentile, Spearman's rank correlation coefficient, percent correctly classified (CC) and grossly misclassified (GM) into tertiles of intake, Cohen's Kappa value). Statistically significant p-values marked in bold