Food Compass Is A Nutrient Profiling System Using Expanded Characteristics For Assessing Healthfulness of Foods
https://doi.org/10.1038/s43016-021-00381-y
Nutrient profiling systems (NPS) aim to discriminate the healthfulness of foods for front-of-package labelling, warning labels, taxation, company ratings and more. Existing NPS often assess relatively few nutrients and ingredients, use inconsistent criteria across food categories and have not incorporated the newest science. Here, we developed and validated an NPS, the Food Compass, to incorporate a broader range of food characteristics, attributes and uniform scoring principles. We scored 54 attributes across 9 health-relevant domains: nutrient ratios, vitamins, minerals, food ingredients, additives, processing, specific lipids, fibre and protein, and phytochemicals. The domain scores were summed into a final Food Compass Score (FCS) ranging from 1 (least healthy) to 100 (most healthy) for all foods and beverages. Content validity was confirmed by assessing nutrients, food ingredients and other characteristics of public health concern; face validity was confirmed by assessing the FCS for 8,032 foods and beverages reported in NHANES/FNDDS 2015–16; and convergent and discriminant validity was confirmed from comparisons with the NOVA food processing classification, the Health Star Rating and the Nutri-Score. The FCS differentiated food categories and food items well, with mean ± s.d. ranging from 16.4 ± 17.7 for savoury snacks and sweet desserts to 78.6 ± 17.4 for legumes, nuts and seeds. In many food categories, the FCS provided important discrimination of specific foods and beverages as compared with NOVA, the Health Star Rating or the Nutri-Score. On the basis of demonstrated content, convergent and discriminant validity, the Food Compass provides an NPS scoring a broader range of attributes and domains than previous systems with uniform and transparent principles. This publicly available tool will help guide consumer choice, research, food policy, industry reformulations and mission-focused investment decisions.
Diet is a leading modifiable cause of poor health globally1. While broad outlines of a healthful diet are clear—for example, eat more fruits and vegetables, and avoid soda2—ambiguity exists on how to distinguish many other food groups, packaged and processed foods, and restaurant and mixed-food dishes, which together represent the majority of most diets. Clear metrics to characterize the healthfulness of these food items as well as groups of items, such as whole meals, diets or company product portfolios, are essential. Such metrics can help inform consumer choices; define incentives in worksite wellness, health care or nutrition assistance programmes; promote industry reformulations and compliance with societal targets; guide public health policies such as front-of-package (FOP) labelling, food procurement and school meal standards, taxation, and marketing restrictions; inform agricultural and trade practices; and guide environmental, social and corporate governance (ESG) investment decisions3–8.

Several approaches to assessing the healthfulness of foods rely on isolated single nutrients, such as nutrient labelling (for example, Nutrition Facts) or FOP labels based on single nutrient thresholds (for example, UK ‘traffic light’ labels and Chile and Mexico’s ‘black box’ warnings)9. Other approaches rely on general determinations of food processing10. Because assessments of healthfulness based on single nutrients in isolation or broad processing categories can have limited applicability, nutrient profiling systems (NPS) have emerged as a more comprehensive approach. NPS represent “the science of classifying or ranking foods according to their nutritional composition for reasons related to preventing disease and promoting health.”3,11 NPS provide quantitative algorithms to score food products on the basis of the presence and/or amounts of attributes scored as beneficial (for example, selected vitamins and fruit content) or detrimental (for example, calories, total fat, saturated fat, salt and sugar)5.

NPS are gaining momentum as major tools for policy actions—for example, in the European Union (Nutri-Score, Keyhole and others), North America (Guiding Stars), South America (Pan American Health Organization system and Chile stage III systems), Australia and New Zealand (Health Star Rating (HSR)), Asia (Healthier Choice Symbol) and the Middle East (Waqeya). Yet, limitations and gaps remain. In the European Union, for example, there is considerable ongoing discussion and controversy over the best NPS for a harmonized approach to FOP nutrition labelling12. Of numerous NPS models evaluated by the World Health Organization3, the majority (84%) had not undergone any type of validation, such as for content validity (including nutrients and/or dietary factors of public health concern), criterion validity (the contents are compared against a ‘gold standard’ reference where possible), convergent and discriminant validity (comparing the NPS with other scoring systems) or construct validity (evaluating the NPS against population diet quality indices or health outcomes)13,14. In addition, most NPS score a relatively small number of nutrients and ingredients; several prioritize nutrients with outdated evidence for health impacts (such as total calories or total fat); and most score nutrient contents
1Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, USA. 2These authors contributed equally: Dariush Mozaffarian,
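Table 1 (excerpt) | Attributes by Food Compass domain
Nutrient ratios: fibre:carbohydrate ratio; potassium:sodium ratio
Vitamins: thiamine (B1); riboflavin (B2); vitamin K; choline
Minerals: phosphorus; magnesium
Food ingredients: vegetables, non-starchy; beans and legumes
Additives: nitrites; artificial sweeteners, flavours or colours
Processing: fermentation; frying
Specific lipids: eicosapentaenoic acid + docosahexaenoic acid; medium-chain fatty acids
Fibre and protein: total protein
Phytochemicals: total carotenoids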
Across 9 domains, 54 individual attributes were assessed per 100 kcal (418.4 kJ) of food product, with scoring from 0 to 10 for beneficial attributes, −10 to 0 for harmful attributes and −10 to 10 for
attribute ratios that could range from harmful to beneficial. Each domain received a score, calculated as the average of all attributes in that domain (or for food ingredients, as the sum, given that the
contents of ingredients are interdependent). For vitamins and minerals, the domain score was calculated from the highest (absolute value—that is, negative or positive) five attribute scores; and for specific
lipids, from the highest three attribute scores (Supplementary Table 3). Attributes with emerging evidence for health impacts (five additives, as well as fermentation and frying as processing methods) were
scored using half-weights. All domain scores were then summed, using equal weights for the first six domains and half-weights for the latter three domains (Methods). The final FCS was scaled across all
food and beverage items to range from 1 (least healthful) to 100 (most healthful). For the full scoring details, see Supplementary Table 3. Potassium and sodium were included both as ‘Nutrient ratios’ given
evidence for their biologic interaction and separately as ‘Minerals’ given evidence that their absolute intakes also influence health. Carbohydrate and fibre were included as ‘Nutrient ratios’ given evidence
that their ratio predicts the healthfulness of carbohydrate-rich foods, and separately as total fibre (in ‘Fibre and protein’) given evidence that the absolute intake of dietary fibre, but not total carbohydrate,
influences health. Partially hydrogenated oils were included in addition to trans fat content (scored −10 to 0 in ‘Specific lipids’) on the basis of emerging evidence that industrial trans fats may have greater
adverse health effects than naturally occurring trans fats, and that the presence of such partially hydrogenated oils may serve as a marker of more intensive and potentially adverse industrial processing.
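The domain rules in the note above can be sketched in a few lines of Python. This is a minimal illustration, not the published algorithm: the function name `domain_score`, its parameters and the sample attribute scores are hypothetical, and it assumes that "calculated from the highest five (or three) attribute scores" means averaging the scores with the largest absolute values. Half-weights for emerging-evidence attributes are left out because their exact application is specified only in Supplementary Table 3.

```python
from statistics import mean


def domain_score(attribute_scores, combine="mean", top_n=None):
    """Collapse attribute scores (each between -10 and 10) into one domain score.

    combine: "mean" for most domains; "sum" for food ingredients.
    top_n:   if given, keep only the top_n scores by absolute value
             (5 for vitamins and minerals, 3 for specific lipids).
             Averaging the retained scores is an assumption of this sketch.
    """
    scores = list(attribute_scores)
    if not scores:
        raise ValueError("a domain needs at least one attribute score")
    if top_n is not None:
        # Rank by magnitude so strongly negative scores (e.g. sodium) are retained.
        scores = sorted(scores, key=abs, reverse=True)[:top_n]
    if combine == "sum":
        return sum(scores)
    if combine == "mean":
        return mean(scores)
    raise ValueError(f"unknown combine rule: {combine!r}")


# Hypothetical attribute scores, for illustration only.
vitamins = domain_score([10, 2, 0, 0, 8, 6, 4, 1, 0, 0, 0, 3], top_n=5)
minerals = domain_score([-10, 5, 3, 2, 1, 0], top_n=5)
food_ingredients = domain_score([2.5, -1.0, 3.0], combine="sum")
```

Ranking by absolute value rather than by signed value follows the note's wording ("absolute value, that is, negative or positive"), so a large penalty is not discarded in favour of small positive scores.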
per gram, which can be strongly influenced by water content4,5,15–17. Most widely used NPS also have varying attribute criteria and scoring principles for different, arbitrarily grouped food categories (from 4 to 33), which increases subjectivity and inconsistency and limits consistent scoring of mixed foods or meals. Few existing NPS incorporate evidence on the diversity of food attributes relevant to health, such as the presence of polyphenols or other phytonutrients, probiotics, other additives, or processing characteristics that may be adverse (for example, ultraprocessing) or beneficial (for example, fermentation).

To address these gaps, we developed the Food Compass, a comprehensive NPS incorporating an expanded assessment of nutrient and ingredient characteristics, additional food components and processing parameters in a uniform fashion across food categories, to help guide healthy food choices; industry reformulations; environmental, social and corporate governance metrics; and policy actions. Relevant nutritional attributes were selected, scoring principles identified and an algorithm established on the basis of an assessment of existing NPS, dietary guidelines, health claims and diet–health relationships. Testing and validation were performed on the basis of content validity for nutrients, food ingredients and other characteristics of public health concern; face validity from the scoring of 8,032 unique foods and beverages in a nationally representative US database; and convergent and discriminant validity from comparisons with the NOVA food processing classification and the HSR and Nutri-Score NPS.

Results
Food Compass attributes, domains and scoring. Relevant attributes and scoring principles were developed on the basis of an assessment of more than 100 reported NPS4,9, including 7 widely used NPS of diverse origins (Supplementary Table 1); a systematic review of national and international dietary guidelines2; nutrient requirements for health claims (Supplementary Table 2); and an assessment of nutrients, ingredients and other food characteristics linked to health outcomes18–21. The final algorithm incorporated 54 attributes across 9 domains (Table 1 and Fig. 1). Of the domains, ‘Nutrient ratios’ included measures of fat quality (unsaturated:saturated fat ratio), carbohydrate quality (carbohydrate:fibre ratio) and mineral quality (potassium:sodium ratio). ‘Vitamins’ and ‘Minerals’ included major nutrients related to undernutrition and chronic diseases. ‘Food-based ingredients’ included food groups with probable or convincing evidence for impacts on chronic diseases including cardiovascular diseases, type 2 diabetes or cancers18,22. ‘Additives’ included factors with evidence for health harms (for example, added sugar and nitrites in processed meats) or emerging but not definitive evidence for harms and/or potential markers of adverse
[Fig. 1 image: the nine Food Compass domains (Nutrient ratios, Vitamins, Minerals, Food-based ingredients, Additives, Processing, Specific lipids, Fibre and protein, Phytochemicals) and two sample spider plots of domain scores, for products with FCS: 23 and FCS: 83.]
Fig. 1 | Domains of the Food Compass. The Food Compass scores food items across nine domains (top), with six domains equally weighted and three
domains each given a half-weight on the basis of more modest relative health effects. Each domain score is calculated as the average of the scores of the
specific attributes within that domain; sample spider plots of the nine domain scores for two food products are shown (bottom). The nine domain scores
are summed and scaled to calculate a final FCS, scaled to range from 1 (least healthful) to 100 (most healthful).
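The weighting described in this caption (six domains at full weight, three at half-weight) can be written as a short weighted sum. The sketch below is illustrative only: the dictionary keys, the function name `weighted_domain_sum` and the sample domain scores are hypothetical, and the result is the unscaled sum, before any conversion to the 1 to 100 FCS scale.

```python
# Full weight for the first six domains, half-weight for the last three,
# as stated in the Table 1 note and the Fig. 1 caption.
DOMAIN_WEIGHTS = {
    "nutrient_ratios": 1.0,
    "vitamins": 1.0,
    "minerals": 1.0,
    "food_ingredients": 1.0,
    "additives": 1.0,
    "processing": 1.0,
    "specific_lipids": 0.5,
    "fibre_and_protein": 0.5,
    "phytochemicals": 0.5,
}


def weighted_domain_sum(domain_scores):
    """Return the weighted sum of the nine domain scores (unscaled)."""
    missing = set(DOMAIN_WEIGHTS) - set(domain_scores)
    if missing:
        raise ValueError(f"missing domain scores: {sorted(missing)}")
    return sum(weight * domain_scores[name] for name, weight in DOMAIN_WEIGHTS.items())


# Hypothetical domain scores for a single product, for illustration only.
example = {
    "nutrient_ratios": 4, "vitamins": 6, "minerals": 2,
    "food_ingredients": 5, "additives": -3, "processing": -2,
    "specific_lipids": 4, "fibre_and_protein": 6, "phytochemicals": 2,
}
raw_sum = weighted_domain_sum(example)  # 12 from full-weight domains + 6 from half-weight ones
```

Requiring all nine domains, rather than silently treating a missing domain as zero, keeps products with incomplete data from being scored as if the missing attributes were neutral.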
industrial processing (for example, artificial sweeteners, flavours or colours). ‘Processing’ metrics included the NOVA classification, as well as fermentation and frying as emerging processing characteristics with health implications. The remaining three domains incorporated specific lipids, fibre and protein, and phytochemicals, each given a half-weight on the basis of less formally recognized evidence or dietary guidance supporting health associations, specific target intakes or health effects independent of the other attributes and domains. The full details of the scoring algorithm are in Supplementary Table 3, and score distributions for each domain for the 8,032 unique foods and beverages in the Food and Nutrient Database for Dietary Studies (FNDDS) are shown in Supplementary Fig. 1. Three example foods with details of attribute values, attribute scores and domain scores are given in Supplementary Table 4. Summed across domains, the overall score for any item could range from −35 to 60; the final Food Compass Score (FCS) was scaled for interpretability to range from 1 (least healthful) to 100 (most healthful).

Testing and validation using NHANES. To assess usefulness and validity, we applied the Food Compass to all 8,032 unique foods and beverages in FNDDS. Overall, the mean ± s.d. FCS was 43.2 ± 28.5 (median, 39.3) (Supplementary Table 5). Among 12 major food categories, the FCS varied from 16.4 ± 17.7 for savoury snacks and sweet desserts to 78.6 ± 17.4 for legumes, nuts and seeds (Fig. 2). The interquartile range (25th to 75th percentiles) was the narrowest for savoury snacks and sweet desserts (2.0–23.2) and meat, poultry and eggs (22.9–45.5) and was the broadest for beverages (8.1–58.8). Within each food category, the FCS distributions were relatively normal and also broad, often as low as 1.0 (exceptions included dairy, 3.8; legumes, nuts and seeds, 17.0; fruits, 17.1; and seafood, 22.5) and as high as 100 (exceptions included meat, poultry and eggs, 73.0; savoury snacks and sweet desserts, 80.9; fats and oils, 89.7; and dairy, 95.2). On the basis of the observed ranges, FCS ≥ 70 was selected as a reasonable cut-off point for foods or beverages to be encouraged; FCS = 31–69, to be consumed in moderation; and FCS ≤ 30, to be minimized.

Subcategorization into 44 food subcategories supported further discrimination and specificity of the FCS (Fig. 2). For example, the mean ± s.d. FCS was 27.6 ± 28.7 for sugar-sweetened sodas and energy drinks versus 67.0 ± 18.4 for 100% fruit or vegetable juices; 24.9 ± 6.9 for beef, 42.6 ± 12.0 for poultry and 67.0 ± 18.9 for seafood; and 43.2 ± 14.0 for starchy vegetables versus 88.2 ± 14.5 for green vegetables. Among fruits, nearly all raw fruits received an FCS of 100, while higher-sugar fruits such as bananas, dates and figs received a lower FCS (but still >70). Examples of scores are given in Supplementary Table 6 and Supplementary Fig. 2. Among beverages, for instance, the FCS for low-sodium tomato juice was 100; carrot juice, 84; apple juice, 55; a packaged fruit juice drink, 19; and most energy drinks, sports drinks and colas, 1–2. Among grain-based products, a whole-oat cereal received a 95; plain instant oatmeal with water, 79; whole-grain pasta, 70; whole-wheat bread, 60; plain waffles, 19; cooked noodles, 17; white rice, 10; and pita bread, 1. Other examples among fruits, mixed dishes and savoury snacks and sweets are also shown. As one illustrative example, the scores for bulgur (a whole grain generally encouraged in dietary guidance) and sweet potato chips (a savoury
[Fig. 2 image: box plots of FCS (y axis, 0 to 100). Top panel, major food categories: Beverages; Grains; Fruits; Vegetables; Legumes, nuts and seeds; Meat, poultry and eggs; Seafood; Dairy; Fats and oils; Mixed foods; Sauces and condiments; Savoury snacks and sweets. Bottom panel, food subcategories: Waters; Coffees and teas; 100% fruit and vegetable juices; Sugar-sweetened drinks; Breads; Rice and pasta; Cold cereals; Cooked cereals; Cereal bars; General fruits; Berries; Citrus fruits; Dried fruits; General vegetables; Green vegetables; Red and orange vegetables; Starchy vegetables; Pickled vegetables; Legumes; Nuts and seeds; Beef; Pork; Lamb and game; Organ meats; Poultry; Cured meats; Eggs; Seafood; Milk; Cheese; Yogurt; Plant-based dairy; Plant oils; Animal fats; Vegetable dishes; Meat/poultry/egg/seafood dishes; Rice and pasta dishes; Grain-encased dishes; Soups; Sauces; Condiments; Savoury snacks; Desserts.]
Fig. 2 | FCS for 8,032 unique foods and beverages consumed by US adults, based on NHANES 2015–16. Foods and beverages are grouped in 12 major
food categories (top) and 44 food subcategories (bottom). Standard box plots are shown, with the open circles representing the mean score, the
horizontal lines representing the median FCS, the shaded bars representing the interquartile range (25th to 75th percentiles), the error bars representing
the 5th and 95th percentiles, and the small circles representing additional outliers. The shaded blue region represents the 25th and 75th percentile bounds
for all 8,032 items.
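The Results report that the summed score can range from −35 to 60 before rescaling to 1 to 100, and that cut-off points of FCS ≥ 70, 31–69 and ≤ 30 separate foods to be encouraged, consumed in moderation or minimized. The sketch below illustrates both steps. The linear rescaling is an assumption made for illustration (the exact transformation is given only in Supplementary Table 3), and the function names and category labels are hypothetical.

```python
RAW_MIN, RAW_MAX = -35.0, 60.0  # range of the summed score reported in the Results


def scale_to_fcs(raw_sum):
    """Map a summed score onto 1..100.

    ASSUMPTION: a simple linear (min-max) rescaling; the source text does not
    state the exact transformation.
    """
    clipped = min(max(raw_sum, RAW_MIN), RAW_MAX)
    return 1.0 + 99.0 * (clipped - RAW_MIN) / (RAW_MAX - RAW_MIN)


def guidance_band(fcs):
    """Apply the reported cut-off points to a final FCS."""
    if fcs >= 70:
        return "encourage"
    if fcs <= 30:
        return "minimize"
    # Scores strictly between 30 and 70 (reported as 31-69) fall here.
    return "moderate"


# Scores quoted in the Results: whole-grain pasta 70, bulgur 69, white rice 10.
bands = {name: guidance_band(score)
         for name, score in {"whole-grain pasta": 70, "bulgur": 69, "white rice": 10}.items()}
```

Under these cut-offs bulgur (69) and whole-grain pasta (70) land in different bands despite a one-point difference, which is one reason the continuous score is more informative than the band alone.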
snack generally limited in dietary guidance) were both 69. A more detailed discussion of their comparative nutritional profiles is provided in Supplementary Text 2. The scores for all 8,032 foods and beverages are provided in Supplementary Table 7.

Food Compass versus NOVA classification. The discriminant potential of the FCS was evaluated within each category of NOVA. The mean ± s.d. FCS was 81.7 ± 20.3 for NOVA = 1 (unprocessed, ~5.0% of all food items), 38.5 ± 32.7 for NOVA = 2 (culinary ingredients, ~0.8%), 61.9 ± 26.6 for NOVA = 3 (processed, ~16.7%) and 36.8 ± 25.4 for NOVA = 4 (ultraprocessed, ~77.5%). Within each NOVA category, food items had relatively broad ranges of scores (Supplementary Fig. 3). Among all items, 28.8%, 72.4%, 57.6% and 87.2% of products had FCS < 70 in NOVA categories 1, 2, 3 and 4, respectively.

The discriminant ability of the Food Compass versus NOVA can be compared by reviewing all 8,032 food items (Supplementary Table 7). For instance, within NOVA = 1, the FCS was 100 each for raw raspberries and raw avocado, 83 for raw banana, 77 for raw fig, 55 for 100% apple juice, 51 for whole boiled egg and 49 for whole milk. Within NOVA = 2, the FCS was 85 for olive oil, 19 for lard, 8 for unsalted butter and 1 for white granulated sugar. Within NOVA = 3 (processed), food items with FCS > 70 included fresh, frozen and canned cooked asparagus (100), salted mixed nuts (97), salted roasted peanuts (88), restaurant meals such as broiled halibut (95) and mussels with tomato-based sauce (95), and pico de gallo salsa (80). And, within this same category, Swiss, Gouda and Brie cheese had FCS of ~29–44, low-sodium bacon scored 31, smoked or cured pork chops scored 20 and plain cooked couscous scored 11. The Food Compass was especially
[Fig. 3 image: four box-plot panels of FCS (y axis) by HSR category (x axis, 0.5 to 5.0).]
Fig. 3 | FCS according to HSR category for 8,032 unique foods and beverages consumed in the United States (NHANES/FNDDS 2015–16). Box plot
distributions are shown for all foods (top left), grains (top right), vegetables (bottom left), and legumes, nuts and seeds (bottom right). The horizontal
lines (and values) represent the median FCS, the shaded bars represent the interquartile range (25th to 75th percentiles), the error bars represent the 5th
and 95th percentiles, and the small circles represent additional outliers.
biologic interaction, and phytochemicals are considered23. The use of distinct scoring domains facilitates joint consideration of different aspects of foods while also preventing undue influence of any one attribute or domain. The scoring of attributes per 100 kcal provides a more comparable assessment across foods and beverages than by weight (confounded by water) or serving size (highly variable across items). Finally, the Food Compass scores all items uniformly using the same attributes, domains, algorithm and cut-off points.

These features contrast with existing NPS3–5,9,11,15–17, which often assess few attributes, prioritize harmful factors, omit many food characteristics and use inconsistent scoring across arbitrary food categories. The HSR, for instance, has six different food group scoring algorithms, while the Nutri-Score has four, to help their scores match external evidence on health effects and dietary guidance. Beyond the hazards of subjectivity, inconsistency and industry lobbying for each category’s algorithm (already happening for the HSR), variable treatment of different food groups creates anomalies for assessing foods with mixed ingredients. In the HSR, for example, not only do milk and yogurt, fats and oils, cheese, and grains each have separate scoring algorithms when packaged separately, but a mixed food containing more than one of these must be classified and scored on the basis of selecting only one of these algorithms, crafted to assess only one of these components. In addition, most existing NPS have not undergone any type of validation3,4.

The FCS, with strengths of objectivity, diverse health-relevant attributes, universality of scoring and additive discriminatory ability, has corollary limitations in its parsimony and ease of scoring. Its comprehensive nature could thus limit applicability in certain contexts, such as when product data are more limited. Such challenges could be partly addressed by leveraging the 8,000+ scored food items in NHANES and imputing missing attributes in other datasets by using averages or weighted averages of similar products. In future work, online applications could also allow consumers and companies to enter products, using public nutrient information or UPC codes, to be matched against existing product datasets and generate an automated FCS. Because we have transparently published the mathematical algorithm in the current report, additional scientific groups can develop other approaches for applying the FCS. The limitation is not the current science or practical ability to measure any of these attributes but the current absence of meaningful motivation for the broader food industry to measure and report on all of them. Such characteristics could be readily calculated and provided by food manufacturers and restaurants if they were provided the incentive to do so.

When comparing the Food Compass with NOVA, unprocessed items (NOVA = 1) generally scored as healthier (mean FCS, ~82) than ultraprocessed items (NOVA = 4; mean FCS, ~37). However, the Food Compass also classified processed foods (NOVA = 3; mean FCS, ~62) as healthier than culinary ingredients (NOVA = 2; mean FCS, ~39). These findings are consistent with observational associations with health risk factors and endpoints suggesting that most of NOVA’s discrimination occurs between categories 1 and 4, and not 2 or 3 (refs. 10,24). More importantly, the Food Compass displayed further discrimination within each NOVA category, such as comparing whole fruits, whole eggs and milk in NOVA = 1; olive oil, lard and granulated sugar in NOVA = 2; and canned vegetables, salted nuts, cheese, bacon and white rice in NOVA = 3. The greatest discrimination of the Food Compass was within NOVA = 4, the largest NOVA category in NHANES (>70% of all items). Among NOVA = 4 foods, 46.7% had FCS ≤ 30, 40.5% had FCS = 31–69 and
[Fig. 4 image: four box-plot panels of FCS (y axis) by Nutri-Score category (x axis, E to A).]
Fig. 4 | FCS according to Nutri-Score category for 8,032 unique foods and beverages consumed in the United States (NHANES/FNDDS 2015–16).
Box plot distributions are shown for all foods (top left), grains (top right), legumes, nuts and seeds (bottom left), and savoury snacks and sweet desserts
(bottom right). The horizontal lines (and values) represent the median FCS, the shaded bars represent the interquartile range (25th to 75th percentiles),
the error bars represent the 5th and 95th percentiles, and the small circles represent additional outliers. The continuous Nutri-Score is converted to five
categories, with different cut-off points for four different food groups for each of these categories; category A is considered the healthiest, and category E
is considered the least healthy.
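The per-100 kcal basis used throughout the scoring can be illustrated with the weights and energy values quoted in the Methods (soda, 150 kcal in 245 g; fruit-flavoured candy, 150 kcal in 37.5 g; cooked rice, ~141 kcal in 93 g). The sketch below is a minimal illustration; the function names are hypothetical, and the fibre value in the last line is made up purely to show the conversion.

```python
def grams_per_100_kcal(weight_g, energy_kcal):
    """Weight of food that supplies 100 kcal."""
    if energy_kcal <= 0:
        raise ValueError("energy must be positive to express contents per 100 kcal")
    return weight_g * 100.0 / energy_kcal


def per_100_kcal(amount_per_100_g, kcal_per_100_g):
    """Convert a nutrient content per 100 g into a content per 100 kcal."""
    if kcal_per_100_g <= 0:
        raise ValueError("energy density must be positive")
    return amount_per_100_g * 100.0 / kcal_per_100_g


# Weights and energies as quoted in the Methods.
soda = grams_per_100_kcal(245, 150)    # about 163 g of soda per 100 kcal
candy = grams_per_100_kcal(37.5, 150)  # 25 g of candy per 100 kcal
rice = grams_per_100_kcal(93, 141)     # about 66 g of cooked rice per 100 kcal

# Hypothetical food with 2 g fibre per 100 g and 50 kcal per 100 g.
fibre = per_100_kcal(2.0, 50.0)        # 4 g fibre per 100 kcal
```

The division by energy density also shows why very low-calorie items (<5 kcal per 100 g) are hard to score on this basis: as the denominator approaches zero, small measured contents are inflated into very large per-100 kcal values.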
12.8% had FCS ≥ 70. Review of these ‘ultraprocessed’ foods with low, intermediate and high scores suggested concordance of the FCS with dietary guidance, nutritional qualities and health associations of these food items.

The HSR has been applied in government-sanctioned voluntary FOP labelling in Australia and New Zealand and in the Access to Nutrition Index ratings of major multinational food companies. The Nutri-Score has similarly been adopted in several European nations and is being considered for broader use across the European Union. The HSR and the Nutri-Score were designed for packaged foods, not home-cooked foods, restaurant meals or mixed meals including multiple items, and thus the comparison with the FCS (which can be used for these purposes) should be interpreted in that context. Conversely, validation of these scores against health outcomes has necessarily assessed all foods and beverages as consumed25–31. The very high correlations between the HSR and the Nutri-Score (often approaching r = 0.90) indicate that these two systems provide very similar information on foods. In contrast, the FCS only moderately correlated with these other NPS overall, and much less for categories such as fish and seafood; beverages; grains; vegetables; dairy; legumes, nuts and seeds; and savoury snacks and sweet desserts. Compared with the FCS, the Nutri-Score often scored 100% juices much lower and pre-sweetened coffees, energy drinks and fruit drinks much higher. Both the Nutri-Score and the HSR often gave healthier scores than the FCS to processed foods rich in refined grains and starch, and much lower scores than the FCS to foods rich in healthful unsaturated plant oils. The HSR also often gave healthier scores than the FCS to low-fat dairy and low-fat processed meats. These results suggest that, particularly for certain major food categories, the Food Compass may better reflect current evidence on health harms of processed refined grains and low-fat processed meats, the health benefits of unsaturated fats from plant sources, and the minor health relevance of total fat. These observed advantages are consistent with a 2019 government-solicited review of the older (pre-2020) HSR, which identified over-scoring by the HSR of breakfast cereals, snack bars, sweetened milks and sugar-based confectionery, and under-scoring of healthier oils and oil-based spreads32. These problems led to 2020 updates in the HSR algorithm; but even when using this new HSR algorithm, our analyses identified persistent similar challenges.

Potential limitations should be considered. Some Food Compass attributes with known health effects, such as iodine and trans fats, were unavailable in FNDDS. However, because domain scoring prevents major influence of any single attribute, their addition from other data sources should further improve validity and discrimination but would probably not greatly alter any single item’s FCS. The potential differential scoring of naturally occurring versus fortified vitamins, minerals, trace lipids or fibres requires further study, when feasible based on both scientific information on potential differential health effects and available food databases. We elected to include, with reduced scoring weights, attributes with emerging but not yet conclusive evidence for health effects—this values emerging science but could also increase controversy. Because domains, attributes and relative weights were identified on the basis of biologic and scientific considerations of major health endpoints, this aided objectivity but reduced ability to explore varying weighting schemes. Scores were derived per 100 kcal, requiring more study on how to score very low-calorie items (<5 kcal per 100 g), which
could influence health directly (for example, coffee and tea) or displace (for example, diet drinks and water) other beverages. Finally, while the Food Compass was tested and validated against two major NPS and NOVA, next steps include testing against health outcomes, application to other product datasets and testing in other world regions with widely differing available food products.

The Food Compass is an NPS with attractive characteristics that can assess and compare the healthfulness of diverse foods and beverages and their combinations such as in a shopping basket, diet pattern or company portfolio. The publicly available scoring algorithm can inform a more nuanced approach to help guide consumer behaviour, food policy, scientific research, industry reformulations and socially focused investment decisions.

Methods
The design and development of the Food Compass involved four main steps: (1) the assessment of existing NPS, dietary guidelines, health claims and diet–health relationships, (2) the selection of attributes, (3) the development of scoring principles and the scoring algorithm, and (4) testing and validation. As this study was performed using published reports as well as deidentified, publicly available data in NHANES, institutional review board approval was not required for this investigation.

Assessment of existing NPS, dietary guidance and diet–health relationships. To help inform the selection of attributes and scoring principles of the Food Compass, we assessed the current scientific landscape of major NPS and related FOP labels, as well as national and international dietary guidance. We found more than 100 reported NPS with certain similarities and many differences in components, scoring principles and thresholds, and design4,9. We reviewed the NPS characteristics, including the nutrients or other food attributes scored, whether positive and/or negative factors were considered, the scoring algorithm and whether the scoring varied for different food or beverage categories. From that evaluation, we focused on seven widely used NPS of diverse origins, used primarily for interpretative food/FOP labelling, that represent a variety of methodologies, organizations and country coverage—namely, Guiding Stars (United States), the Nutri-Score (Europe), the HSR (Australia and New Zealand), the Nordic Keyhole (Scandinavia), Singapore Healthy Choice (Singapore), Waqeya (United Arab Emirates) and the Nestle Nutrient Profiling System (Supplementary Table 1). The number of attributes considered in each NPS typically ranged from 7 to 12 (and also varied within some of these NPS depending on the food category). Most

of all foods and beverages, including mixed dishes, using the same algorithm and scoring thresholds. We prioritized attributes related to risk of major chronic diseases such as obesity, diabetes, cardiovascular diseases and cancers, and to risk of major undernutrition outcomes, especially for maternal and child health and the elderly. We further considered food characteristics with more recent evidence for important health impacts, such as processing characteristics and presence of phytochemicals (carotenoids and flavonoids), and characteristics with emerging evidence for potential benefits (for example, fermentation) or harms (for example, preservatives). The final selections were guided by discussions and consensus among all investigators.

We considered several standards for assessing each attribute, including content per 100 kcal, per 100 g, per litre and per serving size4. Contents per 100 g, commonly used for many other NPS, are strongly confounded by water weight—for example, when comparing raw versus cooked spinach. Contents measured by volume, such as cups or litres, are confounded by differing contents of air, water, dietary fibre and fat, such as when comparing a high-bulk breakfast cereal versus muesli or granola. These differences are also important when comparing grains and other starchy staples. For example, a half cup of cooked rice (~141 kcal) weighs 93 g, whereas two slices of white bread (~158 kcal) weigh 60 g. Among discretionary foods, 150 kcal of soda weighs 245 g (8 fl oz), while 150 kcal of fruit-flavoured candy weighs 37.5 g (1.3 oz). On the basis of these considerations, we selected assessment of all foods and beverages per 100 kcal (418.4 kJ) to facilitate the use of a single scoring algorithm for a diverse range of items, from a single small item to a food with mixed ingredients or a large mixed dish or meal, even among items that differ greatly in bulk. Scoring per 100 kcal was also considered valuable for scaling up to compare diverse combinations that may be sold and consumed together—for example, to score an entire shopping basket, an entire diet or an entire portfolio of foods being sold by a particular vendor.

Additional relevant attributes. Several other relevant attributes were considered but not included due to more limited available scientific information on their health impacts and/or contents in major food and beverage databases. For example, we considered separately scoring naturally occurring versus fortified vitamins, minerals and fibre. However, most food databases do not provide separate information on such contents, nor is the science clear that the health impacts of many naturally occurring versus fortified compounds differ. Instead, the use of Dietary Reference Intakes (DRIs) and percentile ranges in the Food Compass buffered potential extremes in maximum values of nutrients (for example, vitamin C) that may be fortified at high levels. In the ‘Processing’ domain, the extent of heating or cooking (for example, the charring of meat), milling of grains or other measures of food structure were considered, but reliable information on these characteristics was considered to be still lacking in most large food databases. Also, the development of the Food Compass was based on the current best evidence
NPS counted macronutrients, vitamins and/or minerals; a few NPS also considered regarding dietary factors and their links to cardiometabolic diseases, cancers
a limited number of ingredients, such as fruit, vegetable or legume content. For and a variety of conditions associated with undernutrition. However, the Food
each of these seven NPS, different subgroups of foods (for example, milk, other Compass was designed so that additional attributes and scoring could evolve on
beverages, grains and cheese) were scored using different attributes, methods and the basis of future evidence, including other outcomes such as gut health, immune
algorithms, ranging from 4 to 33 differentially scored food categories across function, brain health, bone health, sarcopenia and degrees of physical and mental
these NPS. performance.
For guidelines and official recommendations, we used a recent systematic
review of about 90 national and international dietary guidelines2. In contrast Scoring principles and algorithm. Attributes could be scored in four ways. Most
to the nutrient focus of most NPS, this analysis identified most dietary guidelines were scored on the basis of a linear 10-point scale from 0 to 10 for attributes
to be food-focused. All encouraged certain beneficial food groups, such as considered to have a positive overall health impact and from −10 to 0 for attributes
vegetables, fruits and whole grains; and most provided guidance on other food considered to have an adverse overall health impact (Supplementary Table 3).
groups to help achieve certain nutrient targets, such as consuming dairy foods to For attributes with a defined DRI, such as vitamins and minerals, the target level
achieve calcium and vitamin D targets. Most guidelines also provided information (maximum points) was set at 25% of the adult DRI value for a 2,000-kcal-per-day
on a selected number of nutrients to limit, primarily sodium, added sugar and diet, which most consistently distinguished foods and beverages with higher versus
saturated fats, and some extended these to specific foods, such as the reduction lower levels of these nutrients and was generally similar to the 95% percentile
of red and processed meats. Brazil’s guidelines emphasized food processing value of content across all foods and beverages reported in NHANES 2015–16.
characteristics. The World Health Organization offered generalized dietary For attributes without DRIs, the target level was set on the basis of the 95th
recommendations aimed at the prevention of malnutrition, as well as limiting percentile value of foods and beverages consumed by the US population (based
certain nutrients linked to non-communicable diet-related diseases such as sodium on 2015–16 NHANES data). For attributes that were ratios of positive versus
and added sugar. adverse factors (for example, the ratio of unsaturated to saturated fat), scoring
We also assessed the US Food and Drug Administration nutrient content was on a log-linear scale from −10 to 10 points to represent the full range of the
requirements for health claims, including for general health claims; for claiming ratio, with reference targets for the lowest and highest points based on the 5th and
‘good source’, ‘high’, ‘more’, and ‘high potency’; for claiming ‘light’ or ‘lite’; and 95th percentile values of foods consumed by the US population. For attributes for
for using the term ‘healthy’ or related terms (Supplementary Table 2). Building which information was generally binary (for example, the presence or absence of
on these evaluations of NPS, dietary guidelines and health claims, we assessed preservatives; artificial sweeteners, flavours or colours; fermentation; or frying),
which nutrients, ingredients and other food characteristics were linked to health scoring was binary (−10, 0), with half-weights for most of these factors based
outcomes in observational studies or randomized trials, have been prioritized in on still emerging evidence for health impacts. The attribute based on NOVA
population diet pattern scores, or were of emerging public health interest18–21. processing was scored categorically ranging from −10 to 0.
To prevent any single attribute from dominating a food’s score and to provide a
Selection of attributes. On the basis of the assessments above, we selected more holistic assessment of overall health impact, the identified relevant attributes
key nutrients, ingredients and other food characteristics for inclusion. We did were grouped into nine domains that represented different health-relevant aspects
not make any assumptions about a desirable number of attributes. Rather, we of foods: major nutrients, vitamins, minerals, food ingredients and so on. Each
aimed to address gaps in existing NPS by developing a more updated, more domain’s score was calculated as the average of its attribute scores (or for food
discriminatory NPS that incorporated key new attributes likely to be related to ingredients, the sum), and then the domain scores were summed to calculate the
health, that excluded outdated nutrients for which modern evidence suggests summary score for each food. The same scoring principles and algorithm were
little impact on major health endpoints and that allowed for universal scoring used for all foods and beverages.
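
The scoring rules described under 'Scoring principles and algorithm' can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names, the example values, the zero-anchoring of the linear scale, the clamping at the target levels and the handling of non-positive energy are assumptions made for illustration; the actual targets and weights are in Supplementary Table 3, and any rescaling of the summary score to the final FCS range is not shown.

```python
import math


def per_100_kcal(amount_per_100g: float, kcal_per_100g: float) -> float:
    """Convert content per 100 g into content per 100 kcal.

    Assumption: items with non-positive energy are rejected here; the
    text above does not say how such items are handled.
    """
    if kcal_per_100g <= 0:
        raise ValueError("energy density must be positive")
    return amount_per_100g * 100.0 / kcal_per_100g


def linear_score(value: float, target: float, beneficial: bool = True) -> float:
    """Linear 10-point scale: 0..10 if beneficial, -10..0 if adverse.

    Assumption: the scale starts at zero content and is capped at `target`.
    """
    fraction = min(max(value / target, 0.0), 1.0)
    return 10.0 * fraction if beneficial else -10.0 * fraction


def log_ratio_score(ratio: float, p5: float, p95: float) -> float:
    """Log-linear scale from -10 (at the 5th percentile) to 10 (at the 95th)."""
    if p5 <= 0 or p5 >= p95:
        raise ValueError("need 0 < p5 < p95")
    if ratio <= p5:
        return -10.0
    if ratio >= p95:
        return 10.0
    position = (math.log(ratio) - math.log(p5)) / (math.log(p95) - math.log(p5))
    return -10.0 + 20.0 * position


def binary_score(present: bool, weight: float = 1.0) -> float:
    """Binary (-10, 0) scoring; weight=0.5 gives the half-weight case."""
    return -10.0 * weight if present else 0.0


def domain_score(attribute_scores: list[float], use_sum: bool = False) -> float:
    """Average of attribute scores, or their sum (food ingredients domain)."""
    if not attribute_scores:
        return 0.0
    total = sum(attribute_scores)
    return total if use_sum else total / len(attribute_scores)


def summary_score(domains: dict[str, list[float]],
                  summed_domains: frozenset = frozenset({"food ingredients"})) -> float:
    """Sum the domain scores into one summary score for a food."""
    return sum(domain_score(scores, name in summed_domains)
               for name, scores in domains.items())


# Hypothetical example: a nutrient with an illustrative DRI of 90 units,
# so the target is 25% of the DRI.
target = 0.25 * 90.0
content = per_100_kcal(amount_per_100g=20.0, kcal_per_100g=50.0)
example_total = summary_score({
    "vitamins": [linear_score(content, target), linear_score(11.25, target)],
    "food ingredients": [4.0, -2.0],
})
```

Because every attribute is bounded between −10 and 10 and most domains are averaged, no single attribute can dominate the summary score, which is the design goal stated in the text.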