
Published in IEEE Transactions on Instrumentation and Measurement, August 2014. DOI: 10.1109/TIM.2014.2303533


Measuring Calorie and Nutrition from Food Image
Parisa Pouladzadeh¹,², Shervin Shirmohammadi¹,², and Rana Almaghrabi¹
¹ Distributed and Collaborative Virtual Environment Research Laboratory, University of Ottawa, Ottawa, Canada
² Colleges of Engineering and Natural Sciences, Istanbul Şehir University, Istanbul, Turkey
Email: {ppouladzadeh, shervin, ralmaghrabi}@discover.uottawa.ca

Abstract— As people across the globe are becoming more interested in watching their weight, eating more healthily, and avoiding obesity, a system that can measure calories and nutrition in everyday meals can be very useful. In this paper, we propose a food calorie and nutrition measurement system that can help patients and dietitians to measure and manage daily food intake. Our system is built on food image processing and uses nutritional fact tables. Recently, there has been an increase in the usage of personal mobile technology such as smartphones or tablets, which users carry with them practically all the time. Via a special calibration technique, our system uses the built-in camera of such mobile devices and records a photo of the food before and after eating it in order to measure the consumption of calorie and nutrient components. Our results show that the accuracy of our system is acceptable and that it will greatly improve and facilitate current manual calorie measurement techniques.

Keywords: Calorie measurement, Food Image processing, Obesity management.

I. INTRODUCTION

Obesity in adults has become a serious problem. A person is considered obese when the Body Mass Index is higher than or equal to 30 (kg/m2) [1]. In 2008, more than one in ten of the world's adult population was obese [1], but by 2012 this figure had risen to one in six adults [2], an alarming growth rate. Recent studies have shown that obese people are more likely to have serious health conditions such as hypertension, heart attack, type II diabetes, high cholesterol, breast and colon cancer, and breathing disorders. The main cause of obesity is the imbalance between the amount of food intake and the energy consumed by the individual [3]. So, in order to lose weight in a healthy way, as well as to maintain a healthy weight for normal people, the daily food intake must be measured [4]. In fact, all existing obesity treatment techniques require the patient to record all food intake per day in order to compare the food intake to the consumed energy. But, in most cases, patients unfortunately face difficulties in estimating and measuring the amount of food intake due to self-denial of the problem, lack of nutritional information, the manual process of writing down this information (which is tiresome and can be forgotten), and other reasons. As such, a semi-automatic monitoring system to record and measure the amount of calories consumed in a meal would be of great help not only to patients and dietitians in the treatment of obesity, but also to the average calorie-conscious person. Indeed, a number of food intake measuring methods have been developed in the last few years. But most of these systems have drawbacks such as usage difficulties or large calculation errors. Furthermore, many of these methods are for experimental practice and not for real-life usage, as we shall see in Section II.

In this paper, we propose a personal software instrument to measure calorie and nutrient intake using a smartphone or any other mobile device equipped with a camera. Our system uses image processing and segmentation to identify food portions (i.e., isolating portions such as chicken, rice, vegetables, etc., from the overall food image), measures the volume of each food portion, and calculates the nutritional facts of each portion by calculating the mass of each portion from its measured volume and matching it against existing nutritional fact tables. While a preliminary description of our work has been presented in [5], here we extend it by proposing a more accurate measurement method for estimating food portion volume, which also works for food portions with an irregular shape, and by evaluating our approach with more food items. More importantly, the segmentation features are enriched by involving texture as well as the color, shape, and size of the objects. Our results show reasonable accuracy in the estimation of nutritional values of food types for which our system has been trained.

Color and texture are fundamental characteristics of natural images, and play an important role in visual perception. Color has been used in identifying objects for many years. Texture has been one of the most active topics in machine intelligence and pattern analysis since the 1950s; it tries to discriminate different patterns in images by extracting the dependency of intensity between pixels and their neighboring pixels [6], or by obtaining the variance of intensity across pixels [7]. Recently, different features of color and texture have been combined in order to measure food nutrition more accurately [8].

In our proposed system, we also aim at using smartphones as monitoring tools, as they are widely accessible and easy to use. However, compared to existing work, our system has the following contributions:

• Our system is currently the only one that not only explains and discusses uncertainties in image-based food calorie measurement, but also measures and presents actual uncertainty results using food images and its application scenario. This puts our system properly in the context of Instrumentation and Measurement research, and leads to more meaningful results for food recognition systems.

• To the best of our knowledge, this is the first study of a food image segmentation, classification, identification, and calorie measurement system that not only uses 3000 images, but does so under different conditions such as different cameras, lighting, and angles. We also use a variety of food, such as solid or liquid food, and mixed or non-mixed food. Other existing works use far fewer images (typically hundreds) of mostly very specific foods, and also do not consider the above condition variations. For example, [9] has used shape and texture features with only 180 images of food with very distinct shape and texture, [10] has used only fruits in fruit salad, and [11] has used 120 pizza images. From a measurement perspective, our study and results are more comprehensive, meaningful, and generalizable.

• In our proposed system, we use more features than other systems, including color, texture, size, and shape, whereas most existing methods in this area, such as [9], use only color and shape features. As we show in Section VI, Table III, using 4 features significantly increases the accuracy of the system compared to using fewer features.

• We design a method to apply Gabor filters for texture segmentation of food images. To do this, a bank of Gabor filters with different desired orientations and wavelengths is applied to an image. The outcome of each of these Gabor filters is a two-dimensional array of the same size as the input image. The sum of all elements in one such array is a number that represents the matching orientation and spatial frequency of the input image. In our method, 6 orientations are used as the Gabor parameter.

The rest of this paper is organized as follows: Section II covers related work in this area, while Section III presents a brief background on calorie measurement requirements and available calorie tables. Section IV presents our system design, which is followed by Section V, where our food portion volume measurement technique is proposed. Section VI covers the performance evaluation of our proposed method, while Section VII analyzes the proposed work. Finally, Section VIII concludes the paper and provides a brief discussion of future work.

II. RELATED WORK

There have been a number of proposed methods for measuring daily dietary information. One example, which is typical of current clinical approaches, is the 24-Hour Dietary Recall [12]. The idea of this method is the listing of the daily food intake, using a special format, for a period of 24 hours. This method requires a trained interviewer, such as a dietician, to ask the respondent to remember in detail all the food and drinks s/he has consumed during a period of time in the recent past (often the previous 24 hours). The 24HR requires only short-term memory, and if the recall is unannounced, the diet is not changed. Also, the interview is relatively brief (20 to 30 minutes), and the subject burden is low in comparison with other food recording methods [13]. However, it is not always easy for a person to remember the actual contents as well as the amount of the food intake. In addition, seeing an expert every 24 hours is difficult and in many cases not feasible. In fact, the great majority of existing clinical methods are similar to this, and typically require food records to be obtained for 3 to 7 days, with 7 days being the "gold standard" [5]. The problem with this manual approach is obvious: people do not remember exactly what they ate, forget to take notes, and need to see an expert dietician on a very frequent basis so the dietician can guess how many calories and nutrients the person has taken.

To alleviate the shortcomings of these clinical methods, researchers have been trying to come up with improved techniques. Some of these techniques require the person to take a picture of the food before eating it, so that the picture can be processed offline, either manually or automatically, to measure the amount of calories. For example, the work in [14] proposes a method that uses a calibration card as a reference; this card should be placed next to the food when capturing the image, so that the dimensions of the food are known. However, this card must always be present in the photo when the user wants to use the system; in the case of misplacement or absence of the card, the system will not work. Another method uses the photo of the food and feeds it to a Neural Network developed by researchers in [15]. But the user must capture the photo in a special tray (for calibration purposes), which might not always be possible, and so the method might be difficult to follow for the average user. A personal digital assistant (PDA) system has also been proposed for food calorie measurement in [16], where patients use the PDA to record their daily food intake information on a mobile phone. But it has been shown that the result of the portion estimation has significant error, and that it also takes a long time for the user to record the information [17]. Yet another approach appears in [18], where the picture of the food taken with a smartphone is compared to photos of predefined foods with known nutritional values stored in a database, and the values are estimated based on picture similarity. The main disadvantage of this system is that it does not take into account the size of the food, which is extremely important.

Compared to the above methods, our proposed system has fewer of their shortcomings. Our measurement system also uses a photo of the food, taken with the built-in camera of a smartphone, but uses the patient's thumb for calibration, which solves the problem of carrying cards or special trays. More specifically, an image of the thumb is captured and stored with its measurements at the first usage time (first-time calibration). This unique method will lead to relatively accurate results without the difficulties of other methods.
Food images will then be taken with the user's thumb placed next to the dish, making it easy to measure the real-life size of the portions. We then apply image processing and classification techniques to find the food portions, their volume, and their nutritional facts. But before discussing the details of our system, let us first review some background about calorie measurement and its requirements.

III. BACKGROUND

a. Required accuracy of the measurement system

Before discussing any technical issues, it is important to understand what level of accuracy is expected from our system. To answer this question, we must first see what level of accuracy existing clinical methods have in their measurement of food's nutritional facts. There are two things to consider. First, if we put a plate of food in front of an expert dietician, s/he cannot give an accurate measurement of its nutritional facts by simply looking at it or even examining it manually, because it is impossible to know the exact contents of the dish: does it contain salt, and if so how much? Does it contain oil, and if so what type (olive, corn, animal-based, …) and how much? Also, some food portions can be obstructed; for example, a piece of meat could be deep inside a soup, making it invisible to the dietician. So we can see already that high accuracy of calorie measurement is not possible in real life. Second, when we add this to what happens in existing clinical methods such as [4], in which the dietician goes over a list of food items recorded by the patient without necessarily even seeing the actual food or its picture, and without knowing the size of portions, it becomes clear that accuracy is decreased even more.

This is very important, because it directly affects the objectives of our system. The goal of our measurement system is therefore to design an automated measurement tool, running on a smartphone or other mobile device with a built-in camera, that makes it easier to record food intake, measure the size of food portions, and measure nutritional facts, compared to existing clinical methods. Our goal is not necessarily to have high accuracy, because as explained above such accuracy is not possible in practice. Of course, the more accurate the system is, the better the end results, and this is why in this paper we have tried to measure the size of food portions as accurately as possible. But it is very important to understand that high accuracy is not possible when dealing with food pictures only.

b. Measurement unit: Calorie definition and nutritional tables

The calorie is a typical measuring unit, defined as the amount of heat energy needed to raise the temperature of one gram of water by one degree [19]. This unit is commonly used to measure the overall amount of energy in any food portion, which consists of the main food components of carbohydrate, protein, and fat. Besides gram units, calorie units are also adopted in developing nutritional facts tables. Each person should take in a certain amount of calories daily; if this amount is exceeded, it will lead to weight gain.

Table I Sample of a Typical Nutritional Table

Food Name                  Measure   Weight (grams)   Energy
Apple with skin            1         140              80
Potato, boiled, no skin    1         135              116
Orange                     1         110              62
Tomatoes, raw              1         123              30
Bread, white, commercial   1         100              17
Cake                       1         100              250
Egg                        1         150              17
Cucumber                   1         100              30
Banana                     1         100              105
Orange                     1         110              62

Table I illustrates a small sample of a typical nutritional facts table, this specific one from Health Canada [20]. Such tables are readily available from international or national health organizations around the world. Our proposed system relies on such tables as a reference to measure nutritional facts from any selected food photo.

IV. PROPOSED SYSTEM

The overall design of our system and its blocks are shown in Figure 1.

Figure 1 Overall system design

As the figure shows, at the early stage, images are taken by the user with a mobile device, followed by a pre-processing step. Then, at the segmentation step, each image will be analyzed to extract the various segments of the food portion. It is known that without a good image segmentation mechanism, it is not possible to process the image appropriately. That is why we have jointly used color and texture segmentation tools. We will show how these steps lead to an accurate food separation scheme. For each detected food portion, a feature extraction process has to be performed. In this step, various food features including size, shape, color, and texture will be extracted. The extracted features will be sent to the classification step where, using the Support Vector Machine (SVM) scheme, the food portion will be identified.
Finally, by estimating the area of the food portion and using some nutritional tables, the calorie value of the food will be extracted. The thumb of the user and its placement on the plate are also shown in Figure 1. There is a one-time calibration process for the thumb, which is used as a size reference to measure the real-life size of food portions in the picture. We reported the concept of using the thumb for calibration, as well as its implementation and evaluation, in [21] and [22], respectively, and so we do not repeat them here. An example of food picture capturing and thumb isolation and measurement is shown in Figure 2.

Compared to the calibration methods of similar systems, using the thumb is more flexible, controllable, and reliable. For users with a thumb disability or an amputated thumb, another finger or a coin can be used instead, the latter still more ubiquitous than the special plates or cards used in other systems.

Figure 2 (a, b) Test images with food and thumb (c) Calculation of the thumb dimensions
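To illustrate the effect of the one-time thumb calibration (its full design and evaluation are in [21] and [22]), the following is a minimal sketch of how a stored thumb width can turn pixel measurements into real-world units. The function names and the 5.0 cm thumb width are hypothetical, not values from the paper:

```python
def cm_per_pixel(thumb_width_cm: float, thumb_width_px: float) -> float:
    """One-time calibration: real-world size of one pixel in this photo.
    The 5.0 cm thumb width used below is a hypothetical example value."""
    return thumb_width_cm / thumb_width_px

def area_cm2(region_area_px: float, scale_cm_per_px: float) -> float:
    """Convert a segmented region's pixel count to cm^2 (scale applies squared)."""
    return region_area_px * scale_cm_per_px ** 2

scale = cm_per_pixel(5.0, 200.0)    # thumb stored as 5.0 cm, spans 200 px here
print(area_cm2(40_000, scale))      # a 40,000-pixel portion -> 25.0 cm^2
```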
Figure 3 shows the overall sequence of steps in our system. The user captures two photos of the food: one from above and one from the side; the side photo is needed to measure depth, in order to have a more accurate volume measurement, as will be explained in Section VI.

Figure 3 System's flowchart (camera capture → image transformation → contour recognition, repeated until the correct contour is obtained → food recognition → area-volume calculation → nutrition info retrieval and calculation → store information → show result)

The system uses image segmentation on the photo taken from the top and uses contours to isolate the various food portions. The detailed design, implementation, and evaluation of this image processing and segmentation component were described in [22]. For texture features, we used Gabor filters to measure local texture properties in the frequency domain. We used a Gabor filter-bank proposed in [23]. It is highly suitable for our purpose: the texture features are obtained by subjecting each image to a Gabor filtering operation in a window around each pixel, and we can then estimate the mean and the standard deviation of the energy of the filtered image. The size of the block is proportional to the size of the segment. A Gabor impulse response in the spatial domain consists of a sinusoidal plane wave of some orientation and frequency, modulated by a two-dimensional Gaussian envelope. It is given by:

h(x, y) = exp{ -(1/2) (x²/σ_x² + y²/σ_y²) } cos(2πUx + φ)   (1)

where U and φ are the frequency and phase of the sinusoidal plane wave along the x-axis (i.e., the 0° orientation), and σ_x and σ_y are the space constants of the Gaussian envelope along the x- and y-axis, respectively.

A Gabor filter-bank consists of Gabor filters with Gaussian kernel functions of several sizes, modulated by sinusoidal plane waves of different orientations from the same Gabor-root filter as defined in equation (1); it can be represented as:

g_{m,n}(x, y) = a^{-m} h(x′, y′),   a > 1   (2)

where:

x′ = x cos θ + y sin θ,   y′ = -x sin θ + y cos θ,
θ = nπ/k   (k = total number of orientations, n = 0, 1, …, k−1, m = 0, 1, …, s−1)

Given an image I_E(r, c) of size H×W, the discrete Gabor filtered output is given by a 2D convolution:

I_g(r, c) = Σ_{s,t} I_E(r − s, c − t) g_{m,n}(s, t)   (3)

As a result of this convolution, the energy of the filtered image is obtained, and then the mean and standard deviation are estimated and used as features. We used the following parameters: 5 scales (S = 5) and 6 orientations (K = 6). In our model we used the Gabor filter for texture segmentation. In the implementation phase, each image is divided into 4×4 blocks, and each block is convolved with the Gabor filters; 6 orientations and 5 scales are used, and the mean and variance of the Gabor responses are calculated for each block. In our project, using the Gabor filter, we can identify five different textures, labeled soft, rough, smooth, porous, and wavy, as shown in Table II. In this table, the number of image samples used in the training phase is reported for each texture as well.
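Before turning to the texture classes of Table II, here is a minimal sketch of the filter-bank feature extraction described above (mean and standard deviation of the filtered energy over 5 scales and 6 orientations), using OpenCV. The wavelength progression and the sigma-to-wavelength ratio are illustrative assumptions, not the exact parameters of [23]:

```python
import cv2
import numpy as np

def gabor_texture_features(gray, scales=5, orientations=6, ksize=31):
    """Mean and std of filtered-image energy for each (scale, orientation) pair."""
    feats = []
    for m in range(scales):
        wavelength = 4.0 * (2 ** m)              # illustrative scale progression
        for n in range(orientations):
            theta = n * np.pi / orientations     # theta = n*pi/k, as in Eq. (2)
            kern = cv2.getGaborKernel((ksize, ksize), sigma=0.56 * wavelength,
                                      theta=theta, lambd=wavelength,
                                      gamma=1.0, psi=0)
            resp = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kern)
            energy = resp ** 2                   # energy of the filtered image
            feats.extend([energy.mean(), energy.std()])
    return np.array(feats)                       # 5 scales x 6 orientations x 2 = 60

# Synthetic stand-in for one block of a segmented food image.
block = np.random.default_rng(0).integers(0, 256, (64, 64)).astype(np.uint8)
print(gabor_texture_features(block).shape)       # (60,)
```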
Table II Different Textures

Label   Class    Samples
1       Soft     400
2       Rough    450
3       Smooth   180
4       Porous   320
5       Wavy     200

As Figure 4 (below) shows, we have used these features as our classification inputs, and the results will be the input of the SVM phase. For each feature, several categories are engaged.

Figure 4 SVM Algorithm: the color (10 categories), size (6 categories), shape (5 categories), and texture (5 categories) features are prepared and scaled for the SVM; the model is selected with an RBF kernel, the cross-validation and RBF parameters (C & γ) are adjusted, and the SVM model is generated.

Some examples of various food types and their segmented portions are shown in Figure 5.

Figure 5 Segmentation of dishes into food portions

Once the food items are segmented and their features are extracted, the next step is to identify the food items using statistical pattern recognition techniques. The food item then has to be classified, using the SVM mechanism [24], [25]. SVM is one of the popular techniques used for data classification. A classification task usually involves training and testing data which consist of some data instances. Each instance in the training set contains one class label and several features. The goal of SVM is to produce a model which predicts the target value of data instances in the testing set, given only their attributes.

In our model, we use the radial basis function (RBF) kernel, which maps samples into a higher dimensional space in a non-linear manner. Unlike linear kernels, the RBF kernel is well suited for cases in which the relation between class labels and attributes is nonlinear.

In our proposed method, the feature vectors of the SVM contain 5 texture features, 5 color features, 3 shape features, and 5 size features. The feature vectors of each food item, extracted during the segmentation phase, will be used as the training vectors of the SVM.

To increase accuracy, after the SVM module has determined each food portion type, the system can optionally interact with the user to verify the kind of food portions. For instance, it can show a picture of the food to the user, annotated with what it believes are the portion types, such as chicken, meat, vegetable, etc., as described in [21] and shown in Figure 6. The user can then confirm or change the food type. This changes the system from an automatic one into a semi-automatic one; however, it will increase the accuracy of the system.

Figure 6 The SVM module verifies with the user the type of foods it has determined [21]

The system then measures the volume of each food portion and converts it to mass, using available density tables, and finally uses the mass and nutritional tables to measure the overall calories and nutrients in the food. These two latter components, i.e., food portion volume measurement and calorie measurement, are the focus of this paper and will be explained in the next section.

The system also has a module that allows the user or the dietician to use the measurement results and manage the user's eating habits or clinical program. This module provides useful graphs such as daily intake, weekly intake, comparison between various dates, and percentage change in calorie consumption, as discussed in [21].
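As a concrete illustration of the classification pipeline of Figure 4 (scaling, RBF kernel, and cross-validated tuning of C and γ), here is a minimal scikit-learn sketch; the random 18-dimensional vectors are placeholders standing in for the real 5 texture + 5 color + 3 shape + 5 size features:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: one 18-dim vector per portion (5 texture + 5 color +
# 3 shape + 5 size features) and one of 15 food-type labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 18))
y = rng.integers(0, 15, size=300)

# Scale the data, select the RBF kernel, and adjust C and gamma by
# cross-validation, mirroring the steps of Figure 4.
search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    param_grid={"svc__C": [1, 10, 100], "svc__gamma": [0.01, 0.1, 1.0]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_)
```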
V. PROPOSED MEASUREMENT METHOD

a. Food portion volume measurement

As explained before, in order to measure the size of the food inside the dish, two pictures must be taken: one from the top and one from the side, with the user's thumb placed beside the dish when taking the picture from the top. The picture from the side can be used to see how deep the food goes, and is needed for measuring the food portions' volumes. The system, which already has the dimensions of the user's thumb, can then use this information to measure the actual area of each food portion from the top picture, and can multiply this area by the depth (from the side picture) to estimate the volume of the food. Let us see this in more detail in the next paragraphs.
To calculate the surface area of a food portion, we propose to superimpose a grid of squares onto the image segment so that each square contains an equal number of pixels and, consequently, an equal area. Figure 7 illustrates an example with an actual food portion. The reasons for using a grid are twofold: First, compared to other methods, the grid will more easily match irregular shapes, which is important for food images because most food portions will be irregular. Naturally, there will be some estimation error, but this error can be reduced by making the grid finer. Second, depending on the processing capabilities of the user's mobile device and the expected system response time from the user's perspective, we can adjust the granularity of the grid to balance between the two factors. If the grid is made finer, measurements become more accurate but take a longer time; if the grid is made coarser, measurements become less accurate but the response time is faster.

Figure 7 Methodology for food portion area measurement

The total area (TA) of the food portion is calculated as the sum of the sub-areas (T_i) of each square (i) in the grid, as shown in equation (4):

TA = Σ_{i=1}^{n} T_i   (4)

where n is the total number of squares in the food portion's area. After that, by using the photo from the side view, the system extracts the depth of the food, d, to calculate the food portion's volume, V, using equation (5):

V = TA × d   (5)
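A minimal sketch of equations (4) and (5) on a binary segmentation mask: the mask is tiled into grid squares of equal pixel area, squares that are mostly food each contribute one sub-area T_i, and the depth d from the side photo converts the total area into a volume. The grid size and the half-full counting rule are illustrative assumptions:

```python
import numpy as np

def portion_volume(mask: np.ndarray, cm_per_px: float, depth_cm: float,
                   grid_px: int = 16) -> float:
    """Eq. (4)-(5): TA = sum of sub-areas over grid squares, V = TA * d."""
    square_area_cm2 = (grid_px * cm_per_px) ** 2
    total_area = 0.0
    for r in range(0, mask.shape[0], grid_px):
        for c in range(0, mask.shape[1], grid_px):
            cell = mask[r:r + grid_px, c:c + grid_px]
            if cell.mean() > 0.5:                # count squares mostly inside the food
                total_area += square_area_cm2    # each counted square adds one T_i
    return total_area * depth_cm                 # V = TA x d, Eq. (5)

# A finer grid (smaller grid_px) is slower but more accurate, as discussed above.
mask = np.zeros((480, 640))
mask[100:300, 200:500] = 1                       # synthetic segmented portion
print(portion_volume(mask, cm_per_px=0.025, depth_cm=2.0, grid_px=16))
```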
For better accuracy, if some food portions happen to be regular shapes like squares, circles, triangles, etc., we can use geometric formulas to calculate their area instead of using a grid. This, however, requires an additional module that can recognize regular shapes. Figure 8 illustrates some example calculations for regular shapes in a set of different food images.

Figure 8 Calculating area and volume of regular shapes in food images [5]

b. Calorie and Nutrition Measurement

The volume measurement method described above is really just an interim step in order to measure the mass of the food portion. Mass is what we really need, since all nutritional tables are based on food mass. Once we have the mass, we can use these tables to calculate the amount of calories and other nutrition, as described next.

It is known that the nutritional facts database is an important component of a useful and successful food recognition system [26]. The nutritional values of foods are stored in these tables, which are available from national and international health organizations. These tables, similar to the one shown in Table I, help us to calculate the amount of calories quickly and without reference to the Internet or an expert.

At this point, we have the measurement of the volume of each food portion, and we can use the following general equation to calculate its mass:

M = ρ V   (6)

where M is the mass of the food portion and ρ is its density. Food density can be obtained from readily available tables; for example, aqua-calc provides a volume to mass conversion for 3199 food items and ingredients [27].

In order to extract the density of each food portion, the system needs to know the type of the food, which is determined by our SVM-based food recognition module. An example of the information that is fed into the SVM module is shown in the right column of Figure 9. The SVM module uses this information and recognizes the type of food of each portion [28]. Also, as mentioned earlier, at this stage the system can ask the user to verify whether the food type recognized by the SVM module is correct. If not, the user can enter the correct type, as shown in Figure 6.

Now the system can calculate the mass, having the type of food. Consequently, the amount of calories and nutrition of each food portion can be derived using nutritional tables, such as Table I, based on the following equation:

Calorie in the photo = (Calorie from table × Mass in the photo) / Mass from table   (7)
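Equations (6) and (7) can then be chained in a few lines; in this sketch the density and table entries are illustrative stand-ins for the aqua-calc densities [27] and a Table I row:

```python
# Illustrative entries standing in for an aqua-calc density [27] and a
# Table I row (calories per reference mass in grams); values are examples only.
DENSITY_G_PER_CM3 = {"apple": 0.77, "bread": 0.42}
CALORIE_TABLE = {"apple": (80, 140), "bread": (17, 100)}  # (calorie, mass from table)

def calories_in_photo(food: str, volume_cm3: float) -> float:
    mass_g = DENSITY_G_PER_CM3[food] * volume_cm3          # Eq. (6): M = rho * V
    cal_table, mass_table = CALORIE_TABLE[food]
    return cal_table * mass_g / mass_table                 # Eq. (7)

print(round(calories_in_photo("apple", 180.0), 1))         # -> 79.2
```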
c. Partially-eaten Food

It is possible that a user does not finish the entire food captured in the first picture, taken before eating. For such cases, we propose a simple technique to increase measurement accuracy. If a user does not finish a meal, s/he should take another top picture of what is left of the meal. All of the above process can then be repeated on this new picture to calculate the amount of calories and nutrients in the remaining food. The actual intake value is then adjusted by deducting the values of the remaining food.
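The partially-eaten adjustment is then a simple deduction; this sketch reuses the hypothetical calories_in_photo() helper from the previous example:

```python
def consumed_calories(food: str, volume_before_cm3: float,
                      volume_after_cm3: float) -> float:
    """Actual intake: calories before eating minus calories left on the plate."""
    return (calories_in_photo(food, volume_before_cm3)
            - calories_in_photo(food, volume_after_cm3))

print(round(consumed_calories("apple", 180.0, 45.0), 1))    # -> 59.4
```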

Figure 9 Before (left) and after (right) color analysis and contour detection. The right column is fed into the SVM.

VI. PERFORMANCE EVALUATION

a. Evaluation strategy

We have implemented our system as a software prototype, where we successfully segmented the food images and identified food portions using their contours inside the dish [22]. We then extracted each portion one by one and analyzed them using the methods described in this paper. For the SVM part, we used around 3000 different images, which means a set of more than 300 images for each food portion: approximately 150 for the training set and another 150 images as a testing set. In the experiment, the color, texture, size, and shape properties of the food images were extracted after pre-processing, as shown in the examples of Figure 9. We then checked the recognition results with each feature separately, i.e., color, texture, size, and shape, respectively. In addition, we evaluated the performance of the system when all of the features are involved in the recognition phase. Furthermore, in order to test the accuracy of the SVM method, we applied 10-fold cross validation on different food portions. In cross validation, the original sample is randomly partitioned into k equal-size subsamples. In our model we have 10 different rotations of our sample: a single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k subsamples used exactly once as the validation data. The k results from the folds can then be averaged to produce a single estimate. The advantage of this method over repeated random sub-sampling is that all observations are used for both training and validation, and each observation is used for validation exactly once.

b. Evaluation of the Recognition Systems

The results of the above-mentioned evaluations are shown in Table III. As the table shows, we have low accuracy results for each separate feature, whereas the joint combination of all features works well, with an accuracy of approximately 92.21 percent. Finally, as shown in the last column of Table III, we have examined the system performance using the 10-fold cross validation technique, and we can see that the accuracy of the results is acceptable as well.

Table III RESULTS OF FOOD AND FRUIT RECOGNITION SYSTEM — Recognition Rate (%)

No.  Food items   Color     Texture   Size      Shape     All       All Features
                  Features  Features  Features  Features  Features  (10 fold cross-validation)
1    Apple        60.33     85.25     31.22     22.55     97.64     91.41
2    Orange       65.38     79.24     41.04     71.33     95.59     90.19
3    Corn         52.00     81.93     71.33     34.61     94.85     97.00
4    Tomato       71.29     69.81     48.09     45.01     89.56     79.82
5    Carrot       74.61     79.67     69.30     65.19     99.79     92.34
6    Bread        56.11     61.56     35.55     35.20     98.39     93.50
7    Pasta        71.22     81.57     52.09     48.30     94.75     96.10
8    Sauce        72.45     78.45     40.56     55.00     88.78     85.00
9    Chicken      69.81     71.45     28.02     34.27     86.55     84.52
10   Egg          45.12     75.71     31.00     48.37     77.53     92.53
11   Cheese       61.67     83.62     42.67     33.65     97.47     93.43
12   Meat         75.38     71.67     55.00     44.61     95.73     97.73
13   Onion        45.81     79.98     31.78     22.59     89.99     84.48
14   Bean         76.80     79.55     76.71     65.11     98.68     96.73
15   Fish         58.55     64.81     18.96     62.73     77.70     81.50
     Total Average 63.76    76.28     44.88     45.90     92.21     90.41

Since in 10-fold cross validation we divided the input data into 10 different groups, in each iteration we have to test the method on a group of images, meaning that the results are for a group of images, not for one single image. Compared with the previous model, in which we tested the system using only one image in each step so that the result is the accuracy of finding one food portion, the 10-fold cross method may reach lower accuracy for some food portions, which is why the last column of Table III is generally lower than its second-to-last column, with the exceptions of fish and egg.
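The 10-fold protocol described above maps directly onto scikit-learn's cross_val_score, as in this sketch (again with placeholder features in place of the real extracted ones):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 18))          # placeholder feature vectors
y = rng.integers(0, 15, size=300)       # placeholder food-type labels

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma=0.1))
scores = cross_val_score(clf, X, y, cv=10)   # each fold validates exactly once
print(scores.mean())                          # single averaged estimate
```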
Figure 10 Non-mixed food (left) and mixed food (right)

c. Evaluation of the area measurement technique

We have evaluated our proposed area measurement technique on a variety of simple foods (not liquids like soup, curry, etc.). We measured the area of each food portion twice: once by hand from the image, and once by using our proposed method. Our experimental results, some of which are presented in Table IV, show that our area measurement method achieves a reasonable error of about 10% in the worst case, and less than 1% in the best case.

Table IV AREA MEASUREMENT EXPERIMENT RESULTS

Food type    Error percentage
Bread        0.63%
Cake         2.30%
Spaghetti    -3.07%
Cookies      0.50%
Omelet       10.5%

d. System accuracy measurements

In order to evaluate the accuracy of the proposed method, we have performed two different simulation scenarios. In the first one, our proposed method is applied on several food portions, and their type and volume are extracted. Using the type and volume of each food portion, its mass is extracted using a density table [25]. Using the extracted mass, the calorie of each food portion is derived using Table I. In the second scenario, the real food portion is actually weighed and its real calorie is extracted using the tables. Finally, we have compared the extracted calories from these two scenarios. Some of the results are shown in Table V.

As the table shows, the accuracy of the proposed method for non-mixed food is approximately 86%. The results are lower than the recognition rate reported in Table III, though not significantly inaccurate.

Table V Accuracy of proposed method in comparison with real values

Food Portions   Weight (grams)   Calculated Calorie   Real Calorie   Absolute Accuracy (%)
Cake            100              275                  250            90
Egg             150              15                   17             88
Apple           200              100                  114            87
Tomato          150              23                   30             76
Cucumber        100              27.5                 30             91
Bread           100              21                   17             76
Banana          150              140                  157            89
Orange          160              98                   90             91
Average Accuracy                                                     86

e. Uncertainty measurements

One way to increase the confidence in experimental data is to repeat the same measurement many times and to better estimate uncertainties [1] by checking how reproducible the measurements are. When dealing with repeated measurements, there are three important statistical quantities: average (or mean), standard deviation, and standard error. These are summarized in Table VI.

Table VI Definition of statistical Quantities

Average (symbol: x_ave) — an estimate of the "true" value of the measurement; statistically, the central value.

Standard deviation (symbol: s) — a measure of the "spread" in the data; you can be reasonably sure that if you repeat the same measurement one more time, that next measurement will be less than one standard deviation away from the average.

Standard error (symbol: SE) — an estimate of the uncertainty in the average; you can be reasonably sure that if you do the entire experiment again with the same number of repetitions, the average value from the new experiment will be less than one standard error away from the average value from this experiment.
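As a worked example of the three quantities in Table VI, the following sketch recomputes the Average and Standard Error of the Red Apple row of Table VIII from its nine condition-averaged readings; using the sample standard deviation (ddof=1) is an assumption consistent with the reported values:

```python
import numpy as np

# Red Apple row of Table VIII: nine condition-averaged calorie readings.
readings = np.array([77.39, 79.24, 79.99, 76.81, 80.01,
                     79.40, 77.46, 81.31, 78.46])

average = readings.mean()                      # central value, x_ave
std_dev = readings.std(ddof=1)                 # spread in the data, s
std_err = std_dev / np.sqrt(readings.size)     # uncertainty in the average, SE
print(round(average, 2), round(std_err, 2))    # 78.9 0.49, matching Table VIII
```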
In our system, the following parameters may affect the results: illumination, camera angle, and the camera itself. Illumination is one of the important parameters affecting the system outcome, because illumination directly affects the segmentation algorithm, which in turn affects the rest of the algorithms. To take this into account, we put the same plate in three different locations with different illuminations and took pictures. This strategy was repeated for all of the images in our database.

The second effective parameter is the angle of photography; we have chosen three different angles, approximately 30, 90, and 150 degrees from the plate of food, for all pictures. This means that for each plate in the 3 different locations we have also taken three more pictures from different angles.

Finally, the camera itself will have an effect on the results in terms of its lens, hardware, and software. As such, we used three different cameras for our experiments: a Canon SD1400, an iPhone 4, and a Canon SD1300.

As discussed above, we have selected three different illuminations for our plates, each illumination combined with three different angles, and each angle taken with three different cameras. This means that we have 27 images for each plate of food under various conditions, which gives a good opportunity to measure uncertainties. Since we cannot show the values for each food's 27 different images, in Table VIII we show, for each parameter, the average values combined over the other two parameters. For example, the column that corresponds to the 30-degree angle represents the average over all images in all three illuminations taken with all three cameras when the angle was 30 degrees. As we can see from the table, different illuminations, different angles, and different cameras did not change the final results, which are approximately in the same range. Because of this, the standard error is in an acceptable range for each food portion, and the overall error percentage is small compared with the real calories. All in all, this tells us that the method can work well, with passable uncertainty, on non-mixed plates of food.

VII. ANALYSIS

We applied our method to 3 different categories of food: single food, non-mixed food, and mixed food. From the results, which are shown in Table III and Table VII, we saw that the SVM's accuracy is approximately 92.21%, 85%, and 35% to 65%, respectively.

While the above results are encouraging, there are still some limitations to our system, as follows:

1. Our method still has problems in detecting some mixed foods. In the current version of our proposed method, the segmentation step often fails to properly detect the various food portions in mixed foods. In addition, the illumination of food portions in a mixed food may change as they get mixed, making it harder to extract the different food portions. Furthermore, the sizes of food portions in different mixed foods are not similar; hence the method fails to segment food portions properly. To solve this problem, we are working on improving the segmentation mechanism to better support mixed food as well, with the following plan for our future work:

a) We are going to apply and test other methods, such as graph cut segmentation, to improve our segmentation steps. Having a more accurate segmentation method helps us to extract more reliable features for the recognition phase.

b) We are going to train the system with more mixed foods, to expand the operating range of the system.

c) In order to increase the accuracy of segmentation, we are also going to increase the range of each feature, e.g., expanding the range of the color or texture features.

2. The measurement of the mass of the food needs to be improved to achieve higher accuracy. This can be achieved by:

a) Better estimation of the area of each food portion, which can be improved using more accurate segmentation methods, as described in item 1 above.

b) Coming up with an approach to measure the depth of the food more accurately, instead of assuming that the depth is uniform throughout the food portion's area, which is what we assume now.

3. All of our simulations are performed on white plates with a smooth texture. We need to expand our work to various plates with different shapes, textures, and colors as well.

VIII. CONCLUSIONS AND FUTURE WORK

In this paper, we proposed a measurement method that estimates the amount of calories from a food's image by measuring the volume of the food portions from the image and using nutritional facts tables to measure the amount of calories and nutrition in the food. As we argued, our system is designed to aid dieticians in the treatment of obese or overweight people, although normal people can also benefit from our system by controlling their daily eating more closely, without worrying about overeating and weight gain. We focused on identifying food items in an image by using image processing and segmentation, food classification using SVM, food portion volume measurement, and calorie measurement based on food portion mass and nutritional tables. Our results indicated reasonable accuracy of our method in area measurement, and subsequently in volume and calorie measurement.

An obvious avenue for future work is to cover more food types from a variety of cuisines around the world. Also, more work is needed to support mixed or even liquid food, if possible.
Table VII Results of 10 fold cross validation technique on non-mixed (dishes a–c) and mixed (dishes d–f) food — Accuracy (%)

10 fold cross validation                                             a)      b)      c)       d)      e)      f)
Train classifier on folds 2 3 4 5 6 7 8 9 10; Test against fold 1    85.34   82.25   91.05    65      44.29   35.62
Train classifier on folds 1 3 4 5 6 7 8 9 10; Test against fold 2    79.36   78.24   100.21   65.25   45      33
Train classifier on folds 1 2 4 5 6 7 8 9 10; Test against fold 3    81.66   77.68   95.3     61.49   45      34.82
Train classifier on folds 1 2 3 5 6 7 8 9 10; Test against fold 4    73.92   89.98   75.41    64.5    43.25   32.38
Train classifier on folds 1 2 3 4 6 7 8 9 10; Test against fold 5    89.22   79.81   100.5    66.81   41.75   34
Train classifier on folds 1 2 3 4 5 7 8 9 10; Test against fold 6    81.3    89.89   95.18    60.15   45      34.3
Train classifier on folds 1 2 3 4 5 6 8 9 10; Test against fold 7    89.28   81.56   94.75    65.63   42.8    35.28
Train classifier on folds 1 2 3 4 5 6 7 9 10; Test against fold 8    91.26   91.57   70.19    64.5    44.19   33.19
Train classifier on folds 1 2 3 4 5 6 7 8 10; Test against fold 9    85.1    78.45   87.13    65.5    45.21   35.12
Train classifier on folds 1 2 3 4 5 6 7 8 9;  Test against fold 10   89      81.45   69.01    64.25   45      35.01
Average                                                              84.54   85.34   87.9     64.30   44.14   34.27

Table VIII Repeated uncertainty of measurement — Calories Measured by App

                        Illumination                      Angle                    Camera
Food items    Real      Location  Location  Location  30°     90°     150°    Canon   iPhone  Canon    Average  Standard
              Calories  1         2         3                                 SD1400  4       SD1300            Error
Red Apple     80        77.39     79.24     79.99     76.81   80.01   79.40   77.46   81.31   78.46    78.89    0.49
Orange        71        71.23     71.60     70.39     71.31   70.92   71.02   70.92   71.40   71.61    71.15    0.12
Tomato        30        21.49     22.51     22.30     25.12   28.01   22.93   23.35   23.71   24.66    23.78    0.65
Carrot        30        29.61     29.01     29.50     30.21   30.39   30.29   29.77   29.41   29.10    29.69    0.16
Bread         68        66.81     67.12     67.81     68.29   68.99   69.16   70.31   67.52   71.72    68.63    0.53
Pasta         280       270.14    268.00    259.91    281.56  285.01  279.48  269.10  271.88  259.93   271.66   2.97
Egg           17        15.63     16.00     15.99     17.32   16.89   16.93   14.59   15.12   15.52    15.99    0.30
Banana        10        8.50      8.29      8.31      8.45    8.45    8.00    7.90    7.91    7.23     8.11     0.13
Cucumber      30        27.34     28.01     28.00     28.21   28.00   28.49   27.37   27.61   27.99    27.89    0.12
Green Pepper  16        18.27     18.21     18.44     18.5    18.5    18.92   18.27   18.5    18.30    18.43    0.07
Strawberry    53        45.5      46.53     46.12     46.10   45.17   46.13   46.00   47.02   46.38    46.10    0.18
IX. REFERENCES

[1] World Health Organization. (2011, October) Obesity Study. [Online]. http://www.who.int/mediacentre/factsheets/fs311/en/index.html
[2] World Health Organization. (2012) World Health Statistics 2012. [Online]. http://www.who.int/gho/publications/world_health_statistics/2012/en/index.html
[3] George A. Bray and Claude Bouchard, Handbook of Obesity, Second Edition. Louisiana, USA: Pennington Biomedical Research Center, 2004.
[4] Wenyan Jia et al., "A Food Portion Size Measurement System for Image-Based Dietary Assessment," IEEE Bioengineering Conference, pp. 3-5, April 2009.
[5] R. Almaghrabi, G. Villalobos, P. Pouladzadeh, and S. Shirmohammadi, "A Novel Method for Measuring Nutrition Intake Based on Food Image," in Proc. IEEE International Instrumentation and Measurement Technology Conference, Graz, Austria, 2012, pp. 366-370.
[6] B. Kartikeyan and A. Sarkar, "An identification approach for 2-D autoregressive models in describing textures," Graphical Models and Image Processing, vol. 53, pp. 121-131, 1993.
[7] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Transactions on Systems, Man, and Cybernetics, vol. 3, pp. 610-621, 1973.
[8] A. Jain and G. Healey, "A multiscale representation including opponent color features for texture recognition," IEEE Transactions on Image Processing, vol. 7, no. 1, pp. 124-128, 1998.
[9] Yu Deng, Shiyin Qin, and Yunjie Wu, "An Automatic Food Recognition Algorithm with both," Image Processing and Photonics for Agricultural Engineering, 2009.
[10] S. A. Madival and Dr. Vishwanath B. C., "Recognition of Fruits in Fruits Salad Based on Color and Texture Features," International Journal of Engineering Research & Technology (IJERT), vol. 1, September 2012.
[11] Cheng-Jin Du and Da-Wen Sun, "Pizza sauce spread classification using colour vision," Journal of Food Engineering, pp. 137-145, 2005.
[12] M. Livingstone, P. Robson, and J. Wallace, "Issues in dietary intake assessment of children and adolescents," Br. J. Nutr., vol. 92, pp. S213-S222, 2004.
[13] (2011, October) Obesity: A Research Journal. [Online]. http://www.nature.com/oby/journal/v10/n11s/full/oby2002192a.html
[14] Mingui Sun et al., "Determination of food portion size by image processing," Engineering in Medicine and Biology Society, pp. 871-874, August 2008.
[15] B. Schölkopf, A. Smola, R. Williamson, and P. L. Bartlett, "New support vector algorithms," Neural Computation, vol. 12, no. 5, pp. 1207-1245, May 2000.
[16] L. E. Burke et al., "Self-monitoring dietary intake: current and future practices," Journal of Renal Nutrition, vol. 15, no. 3, pp. 281-290, 2005.
[17] J. Beasley, "The pros and cons of using PDAs for dietary self-monitoring," Diet Assoc., vol. 107, no. 5, p. 739, 2007.
[18] C. Gao, F. Kong, and J. Tan, "Healthaware: Tackling obesity with health aware smart phone systems," IEEE International Conference on Robotics and Biomimetics, 2009, pp. 1549-1554.
[19] Kim E. Barrett, Scott Boitano, Susan M. Barman, and Heddwen L. Brooks, "Digestion, Absorption & Nutritional Principles," in Ganong's Review of Medical Physiology. USA: McGraw Hill, 2009, ch. 27.
[20] Health Canada. (2011, November) Health Canada Nutrient Values. [Online]. http://www.hcsc.gc.ca/fnan/nutrition/fiche-nutridata/nutrient_value-valeurs_nutritives-tc-tmeng.php
[21] G. Villalobos, R. Almaghrabi, B. Hariri, and S. Shirmohammadi, "A Personal Assistive System for Nutrient Intake Monitoring," in Proc. ACM Workshop on Ubiquitous Meta User Interfaces, in Proc. ACM Multimedia, Arizona, USA, 2011, pp. 17-22.
[22] G. Villalobos, R. Almaghrabi, P. Pouladzadeh, and S. Shirmohammadi, "An Image Processing Approach for Calorie Intake Measurement," in Proc. IEEE Symposium on Medical Measurement and Applications, Budapest, Hungary, 2012, pp. 1-5.
[23] A. K. Jain and F. Farrokhnia, "Unsupervised texture segmentation using Gabor filters," Pattern Recognition, vol. 24, pp. 1167-1186, 1991.
[24] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121-167, 1998.
[25] K. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkopf, "An introduction to kernel-based learning algorithms," IEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 181-201, March 2001.
[26] Wen Wu and Jie Yang, "Fast Food Recognition From Videos of Eating for Calorie Estimation," in Intl. Conf. on Multimedia and Expo, 2009.
[27] aqua-calc Food Volume to Weight Conversions. [Online]. http://www.aqua-calc.com/page/density-table
[28] P. Pouladzadeh, G. Villalobos, R. Almaghrabi, and S. Shirmohammadi, "A Novel SVM Based Food Recognition Method for Calorie Measurement Applications," in Proc. International Workshop on Interactive Ambient Intelligence Multimedia Environments, in Proc. IEEE International Conference on Multimedia and Expo, Melbourne, Australia, 2012, pp. 495-498.
