
A PROJECT REPORT ON

TRACKING OF MOVING OBJECT AND


DETERMINATION OF ITS VELOCITY
Submitted in fulfilment of the requirements for the award of degree of

CONTENTS
I Abstract 1
1. CHAPTER-1 INTRODUCTION
1.1 Motivation 2
1.2 Objective and Scope 2
1.3 Problem Definition and Challenges 6
1.3.1 Assumptions 7
1.4 Contribution of the Thesis 8
1.4.1 Saliency Detection based on Low-level Perceptual cues 8
1.4.2 Salient Object Segmentation in Natural Images 10
2. CHAPTER-2 SEGMENTATION
2.1 Introduction 12
2.2 Segmentation using discontinuities 14
2.3 Edge Detection 15
2.4 Segmentation using thresholding 27

3. CHAPTER-3 RELATED WORK


3.1 Pre-Processing 30
3.2 Median Filter 30
3.3 Morphological Process 31
3.4 Connected Component Analysis 31
3.4.1 Background Subtraction Algorithms 32
3.4.1.1 Background Modelling 32
3.4.1.2 Background Subtraction Algorithm 36

3.4.2 Frame Difference 38


3.5 Feature Extraction 39
3.6 Bounding Box with Colour Feature 40
4. CHAPTER-4 SINGLE OBJECT TRACKING AND VELOCITY ESTIMATION
4.1 Optical Flow 42
4.2 Block Matching 42
4.3 Tracking 42
4.4 Distance 43
4.5 Velocity 43
5. CHAPTER-5 CODING 46
6. CHAPTER-6 CONCLUSION AND FUTURE SCOPE 53
II. REFERENCES 54
III. APPENDIX-A MATLAB 56
IV. APPENDIX-B DIGITAL IMAGE PROCESSING 70
ABSTRACT

Both industry and the academic community increasingly require image and video processing applications that operate under real-time constraints. At the same time, detection of moving objects is a very important task in mobile robotics and surveillance applications. This project proposes an efficient technique for moving object detection and velocity determination based on background subtraction using a dynamic threshold and morphological processing. Most previous methods depend on the assumption that the background is static over short time periods. In dynamic threshold based object detection, morphological processing and filtering are also used effectively to remove unwanted pixels from the background.

The object is then detected in a sequence of frames with respect to the frame rate at which the video was recorded. Simulation results show that the proposed method is effective for background subtraction, object detection and velocity determination compared with several competitive methods proposed in the literature. The method is able to identify moving persons, track them and provide a unique tag for the tracked persons. The background subtraction algorithm can also be used to detect multiple objects, and the algorithms developed can be extended to other applications (real-time processing, object classification, etc.).

CHAPTER 1

INTRODUCTION

Object tracking in a complex environment has long been an interesting


and challenging problem. This project deals with single object tracking, background subtraction, and object detection. Detection of
moving objects is a very important task in mobile robotics and
surveillance applications.

1.1 Motivation

The motivation behind this project is to develop software for object tracking, a major application in security, surveillance and vision analysis. The developed software must be capable of tracking any single object moving in the frame and of being simulated in software. The system could later be extended to real-time surveillance or object classification.

1.2 Objective

The objective of this project is to develop an algorithm for tracking an object using frame subtraction and object detection. The algorithms can be extended to real-time applications.

Modern life produces an overwhelming amount of visual data, with new images created every minute. This growth in image data has led to new challenges of processing it quickly and extracting the correct information, so as to facilitate different tasks from image search to image compression and transmission over the network. One specific problem for computer vision algorithms used to extract information from images is to find objects of interest in an image. The human visual system has an immense capability to extract important information from a scene. This ability enables humans to focus their limited perceptual and cognitive resources on the most pertinent subset of the available visual data, facilitating learning and survival in everyday life.

This amazing ability is known as visual saliency. Hence for a computer


vision system, it is important to detect saliency so that the resources can
be utilized properly to process important information. Applications range
from object detection to Content-Based Image Retrieval (CBIR), face or
human re-identification and video tracking.

1.1 Motivation:

What is Saliency?

Saliency is the ability or quality of a region in an image to stand out (or be prominent) from the rest of the scene and grab our attention. Saliency can be either stimulus-driven or task-specific. The former is known as bottom-up saliency, while the latter specifies top-down saliency and leads to visual search. Bottom-up saliency can be interpreted as a filter which allows only important visual information to grab attention for further processing. In our work, we concentrate on bottom-up salient object detection. Saliency is a useful concept when considering bottom-up feature extraction. In such circumstances, the role of context becomes extremely important. That is to say that saliency can be described as a relative measure of importance. Hence, bottom-up saliency can be interpreted as the state or quality of standing out (relative to other stimuli) in a scene.

Figure 1.1: The top row shows an example of saliency map generated
from the image (left) and the bottom row depicts an ideal segmentation
of the object in the image (left)

As a result, a salient stimulus will often pop out to the observer, such as a red dot in a field of green dots, an oblique bar among a set of vertical bars, the flickering message indicator of an answering machine, or a fast moving object in a scene with mostly static or slow moving objects. An important aspect of the saliency mechanism is that it helps the visual perceptual system to quickly filter and organize useful visual information, necessary for object recognition and/or scene understanding.

We propose two different methods for the task. The first is based on low-level perceptual features. The second combines the low-level saliency with generic object-specific cues in a graphical model-based approach.

Both the methods are thoroughly evaluated against state-of-the-art


methods on challenging benchmark datasets and found to produce
superior results.

1.2 Objective and Scope

The objective of the thesis is to devise an efficient salient object detection method that can serve as a pre-processing step for many of the previously mentioned tasks. Further, the method must be unsupervised so that it can detect any generic object. Moreover, it has to be computationally efficient to ensure fast processing, considering the huge amount of available data. As already discussed, bottom-up saliency can be characterized by the ability to pop out in a scene.
Figure 1.2: Different challenges in saliency detection, illustrated using samples from the saliency dataset MSRA-B.

1.3 Problem Definition and Challenges

The problem we address in the thesis is: Given a natural scene, detect
one or more regions of interest (ROI) which contain the salient objects
in a scene. The method must be unsupervised with no training sample
for classes of objects available. Parameters of an optimization function
may be learned using a part of another dataset. Although the problem is similar to unsupervised foreground segmentation, it differs in the choice of features, which is mostly inspired by biological motivation.
Some examples of finding objects of interest are presented in Figure
1.1.

This is a challenging task because objects of interest are detected


without any prior knowledge about them, purely based on unsupervised, stimulus-driven perceptual cues. Single features such as colour, brightness or depth alone are not enough to solve the problem. Further,
when finding saliency for challenging datasets like PASCAL to
facilitate the later process of object detection or recognition, segmenting
the object of interest becomes even harder. The study of different
methods and samples from the dataset reveals the following challenges,
apart from the factors already depicted in Figure 1.2:

1. Only a small part of an object is present on the boundary of an image;

2. Objects with large holes, such as cycle wheel;

3. Repeated distractors in the background or foreground.

These challenges are illustrated with respective samples in Figure 1.3.

1.3.1 Assumptions

1. The images are indoor or outdoor natural scenes, captured using an optical camera (not X-ray, infrared, etc.).

2. The object of interest is not visible in only a few pixels on the boundary of the image.

3. The object of interest is generally not hidden behind a large distractor, for example, the image in the third row of Figure 1.3.

4. There is not much overlap in both colour and texture between the object and the scene background.

Figure 1.3: Different challenges in saliency detection illustrated with


images (left) and respective ground truths (right), from PASCAL
dataset.


1.4 Contribution of the Thesis

The central contribution of the thesis is pixel-accurate localization of the object of interest. The saliency maps provided by our proposed methods assign each pixel a saliency value in the range 0 to 1, depicting its probability of being salient. Hence, the important or salient object can be easily segmented by a simple thresholding mechanism. In the work described here, saliency is described in terms of the spatial rarity of an image feature, mainly colour. This can change the conventional way of extracting features from the whole image or searching for objects in a huge 4-dimensional (position, scale and aspect ratio) sliding-window search space. It would help simulate the same logic as human vision and improve both the speed and accuracy of computer vision tasks. Moreover, since we produce a probability of each pixel being salient, the saliency map can also be utilized for identifying the most salient regions for different tasks, for example placing an advertisement in a video. In the following sub-sections, we briefly describe the methods proposed in the thesis.

1.4.1 Saliency Detection based on Low-level Perceptual cues

In the first method, we formulate three saliency criteria, namely:

(i) graph-based spectral rarity,
(ii) spatial compactness and
(iii) background prior.

Then, a weighted combination of these three measures of saliency produces a saliency map. A saliency map is represented as a grayscale image, where each pixel is assigned its probability of being salient.

The first two terms named above are based on the rarity of features, and the third term corresponds to the boundary prior. The idea of the boundary prior is that the boundary of an image mostly contains background image elements, or superpixels, and background superpixels are spatially connected among themselves, but not with foreground ones.

Graph-based spectral rarity assigns saliency based on the rarity or uniqueness of a superpixel. This measure utilizes spectral features computed using the Laplacian of the superpixel graph.
Figure 1.4: Illustration of the sequence of stages of our proposed
algorithm (PARAM) for saliency estimation, with an example from
MSRA-B Dataset.

On the other hand, spatial compactness takes into account that a colour belonging to a salient object would be grouped at a spatial location, and thus the spatial variance of that colour would be low, whereas background colours are generally distributed over the whole image and score low on spatial compactness.

Our formulation models the background prior using a Gaussian Mixture Model (GMM). All the superpixels touching the boundary of an image are modelled by a GMM in the Lab colour space. The saliency of a superpixel is measured as the sum of the distances from the GMM modes, weighted by the particular mixture coefficient. Since most of the boundary superpixels would be background, big GMM modes with a high value of the mixture coefficient belong to background colours, and thus the mentioned distance gives a good measure of saliency.

A non-linear weighted combination of these three different cues is used to compute the final saliency map. Also, binary segmentation maps are generated for quantitative evaluation of performance, using an adaptive threshold. An illustration of the complete flow chart of this proposed saliency detection method, which is named PARAM (background Prior And Rarity for saliency Modelling), is depicted in Figure 1.4.

1.4.2 Salient Object Segmentation in Natural Images

Next, we propose a salient object segmentation method that captures the same visual processing hierarchy as the human visual system. Our goal is to localize objects independent of their category by formulating an unsupervised algorithm.

A graphical model-based approach is then used for proper spatial propagation of these priors. The image cue priors form the unary term, and we formulate a submodular edge cost, or pairwise term, to specify the CRF. Hence, exact inference is done efficiently using graph-cut.

In short, the contributions can be described as:

1. Two saliency detection algorithms, which significantly outperform


(both quantitatively and qualitatively) several existing algorithms,
have been proposed.

2. A structured-SVM-based parameter learning method for our graphical model (CRF) based approach, using a high-order loss function.

3. Exhaustive experimentation on two real-world saliency and object segmentation datasets has been performed to validate the performance of the proposed methods.

4. Publicly available C++ code of the algorithms.



CHAPTER 2

SEGMENTATION

Segmentation is the process of dividing a digital image into multiple regions. Segmentation shows the objects and boundaries in an image. Each pixel in a region has some similar characteristics such as colour, intensity, etc. A few methods for segmenting images are explained below.
2.1 Introduction:

Segmentation is a process that divides an image into regions or objects that have similar features or characteristics.

Some examples of image segmentation are

1. In automated inspection of electronic assemblies, presence or absence


of specific objects can be determined by analyzing images.

2. Analyzing aerial photos to classify terrain into forests, water bodies


etc.

3. Analyzing MRI and X-ray images in medicine to classify body organs.

Some figures which show segmentation are given in Fig 2.1.

Segmentation has no single standard procedure and it is very difficult in non-trivial images. The extent to which segmentation is carried out depends on the problem specification. Segmentation algorithms are based on two properties of intensity values: discontinuity and similarity. The first category partitions an image based on abrupt changes in intensity, and the second partitions the image into regions that are similar according to a set of predefined criteria.

In this report some of the methods for determining discontinuities will be discussed and other segmentation methods will also be attempted. The three basic types of gray level discontinuity in a digital image are points, lines and edges.

The other segmentation technique is thresholding. It is based on the fact that different types of regions can be classified by applying a range function to the intensity values of image pixels. The main assumption of this technique is that different objects will have distinct frequency distributions and can be discriminated on the basis of the mean and standard deviation of each distribution.

Segmentation based on the third property is region processing. In this method an attempt is made to partition or group regions according to common image properties. These image properties consist of intensity values from the original image, textures that are unique to each type of region, and spectral profiles that provide multidimensional image data.

A very brief introduction to morphological segmentation will also be


given. This method combines most of the positive attributes of the other
image segmentation methods.


2.2 Segmentation using discontinuities

The three basic gray level discontinuities in a digital image are points, lines and edges, and several techniques exist for detecting them. The most common way to look for discontinuities is by spatial filtering methods.

The idea of point detection is to isolate a point which has a gray level significantly different from its background. A 3 x 3 mask is used (Fig 2.2) with coefficients

w1 = w2 = w3 = w4 = w6 = w7 = w8 = w9 = -1, w5 = 8.

The response is R = w1*z1 + w2*z2 + ... + w9*z9, where zi is the gray level of the pixel under coefficient wi.

Based on the response calculated from the above equation we can find the desired points.

Line detection is the next level of complexity after point detection; the lines could be vertical, horizontal or at a +/- 45 degree angle.

Fig 2.3

Responses are calculated for each of the masks above and, based on the values, we can detect the lines and their orientation.
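As a rough illustration of how these mask responses can be computed in practice, the following MATLAB sketch applies the point-detection mask and a horizontal line-detection mask to a grayscale test image; the image name and the thresholds are assumptions chosen only for illustration.

% Sketch: point and line detection by mask response (illustrative thresholds)
I = im2double(imread('cameraman.tif'));       % any grayscale test image
pointMask = [-1 -1 -1; -1 8 -1; -1 -1 -1];    % point-detection mask (w5 = 8, others -1)
lineMask  = [-1 -1 -1;  2 2 2; -1 -1 -1];     % horizontal line-detection mask
Rp = imfilter(I, pointMask, 'replicate');     % response R = w1*z1 + ... + w9*z9
Rl = imfilter(I, lineMask,  'replicate');
points = abs(Rp) >= 0.9 * max(abs(Rp(:)));    % keep only the strongest point responses
lines  = abs(Rl) >= 0.9 * max(abs(Rl(:)));    % keep only the strongest line responses
figure, subplot(1,2,1), imshow(points), title('Point responses');
subplot(1,2,2), imshow(lines),  title('Horizontal line responses');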

2.3 Edge detection


An edge is regarded as the boundary between two objects (two dissimilar regions), or perhaps a boundary between light and shadow falling on a single surface.

The differences in pixel values between regions can be computed by considering gradients.

The edges of an image hold much information in that image. The edges
tell where objects are, their shape and size, and something about their
texture. An edge is where the intensity of an image moves from a low
value to a high value or vice versa.

There are numerous applications for edge detection, which is often used
for various special effects. Digital artists use it to create dazzling image
outlines. The output of an edge detector can be added back to an original
image to enhance the edges.

Edge detection is often the first step in image segmentation. Image


segmentation, a field of image analysis, is used to group pixels into
regions to determine an image's composition.

A common example of image segmentation is the "magic wand" tool in


photo editing software. This tool allows the user to select a pixel in an
image. The software then draws a border around the pixels of similar
value. The user may select a pixel in a sky region and the magic wand
would draw a border around the complete sky region in the image.


The user may then edit the color of the sky without worrying about
altering the color of the mountains or whatever else may be in the image.
Edge detection is also used in image registration. Image registration
aligns two images that may have been acquired at separate times or from
different sensors.

Figure 2.4 Different edge profiles.

There are an infinite number of edge orientations, widths and shapes (Figure 2.4). Some edges are straight while others are curved with varying radii. There are many edge detection techniques to go with all these edges, each having its own strengths. Some edge detectors may work well in one application and perform poorly in others. Sometimes it takes experimentation to determine the best edge detection technique for an application.

The simplest and quickest edge detectors determine the maximum value
from a series of pixel subtractions. The homogeneity operator subtracts each of the 8 surrounding pixels from the center pixel of a 3 x 3 window as in
Figure 2.5. The output of the operator is the maximum of the absolute
value of each difference.

Figure 2.5 How the homogeneity operator works.

new pixel = maximum{ |11-11|, |11-13|, |11-15|, |11-16|, |11-11|, |11-16|, |11-12|, |11-11| } = 5

Similar to the homogeneity operator is the difference edge detector. It


operates more quickly because it requires four subtractions per pixel as
opposed to the eight needed by the homogeneity operator. The
subtractions are upper left - lower right, middle left - middle right, lower
left - upper right, and top middle - bottom middle (Figure 2.6)

Figure 2.6 How the difference operator works.

new pixel = maximum{ |11-11|, |13-12|, |15-16|, |11-16| } = 5
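The homogeneity operator described above can be written as a short, unoptimised loop in MATLAB; the sketch below is only an illustration of the definition (maximum absolute difference between the centre pixel and its eight neighbours) and assumes a standard toolbox test image.

% Sketch: homogeneity edge operator (max absolute difference to the 8 neighbours)
I = double(imread('cameraman.tif'));
[r, c] = size(I);
E = zeros(r, c);
for y = 2:r-1
    for x = 2:c-1
        win = I(y-1:y+1, x-1:x+1);            % 3 x 3 window around the centre pixel
        d = abs(win - I(y, x));               % subtractions against the centre pixel
        E(y, x) = max(d(:));                  % new pixel = maximum absolute difference
    end
end
imshow(uint8(E));                             % resulting edge map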

First order derivative for edge detection

If we are looking for any horizontal edges it would seem sensible to


calculate the difference between one pixel value and the next pixel value,
either up or down from the first (called the crack difference), i.e.
assuming top left origin

Hc= y_difference(x, y) = value(x, y) – value(x, y+1)

In effect this is equivalent to convolving the image with a 2 x 1 template

Likewise

Hr = X_difference(x, y) = value(x, y) – value(x – 1, y)

uses the template

–1 1

Hc and Hr are column and row detectors. Occasionally it is useful to plot both X_difference and Y_difference, combining them to create the gradient magnitude (i.e. the strength of the edge). Combining them by simply adding them could mean two edges cancelling each other out (one positive, one negative), so it is better to sum absolute values (ignoring the sign) or to sum their squares and then, possibly, take the square root of the result.

It is also possible to divide the Y_difference by the X_difference and identify a gradient direction (the angle of the edge between the regions).

The amplitude can be determined by computing the magnitude of the vector formed by Hc and Hr:

M = sqrt(Hr^2 + Hc^2)

Sometimes, for computational simplicity, the magnitude is computed as

M = |Hr| + |Hc|

The edge orientation can be found by

theta = arctan(Hc / Hr)

In real images, the lines are rarely so well defined; more often the change between regions is gradual and noisy. The following image represents a typical real edge. A large template is needed to average the gradient over a number of pixels, rather than looking at only two.
Sobel edge detection

The Sobel operator is more sensitive to diagonal edges than to vertical and horizontal edges. The Sobel 3 x 3 templates are normally given as

X-direction:
-1  0  1
-2  0  2
-1  0  1

Y-direction:
-1 -2 -1
 0  0  0
 1  2  1

(The original example applies these templates to a sample image, showing the original image, the combined response absA + absB, and the result thresholded at 12.)
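A minimal MATLAB sketch of the Sobel templates and the absA + absB combination described above is given below; the test image is an assumption and the threshold mirrors the "threshold at 12" of the example (on a 0-255 scale), for illustration only.

% Sketch: Sobel gradients and a thresholded edge map
I  = im2double(imread('cameraman.tif'));
Sx = [-1 0 1; -2 0 2; -1 0 1];                % X-direction template
Sy = [-1 -2 -1; 0 0 0; 1 2 1];                % Y-direction template
A  = imfilter(I, Sx, 'replicate');            % response of the X template
B  = imfilter(I, Sy, 'replicate');            % response of the Y template
G  = abs(A) + abs(B);                         % absA + absB
edges = G > 12/255;                           % threshold at 12 (image scaled to [0,1])
imshow(edges);
% For comparison, MATLAB's built-in detector: BW = edge(I, 'sobel');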

Other first order operators

The Roberts operator has a smaller effective area than the other masks, making it more susceptible to noise.

The Prewitt operator is more sensitive to vertical and horizontal edges than to diagonal edges.

The Frei-Chen mask is another first order operator of this family.

In many applications, edge width is not a concern. In others, such as machine vision, it is a great concern. The gradient operators discussed above produce a large response across an area where an edge is present. This is especially true for slowly ramping edges. Ideally, an edge detector should indicate an edge only at its centre. This is referred to as localization. If an edge detector creates an image map with edges several pixels wide, it is difficult to locate the centers of the edges. It becomes necessary to employ a process called thinning to reduce the edge width to one pixel. Second order derivative edge detectors provide better edge localization.

Example: if the basic Sobel vertical edge operator (as described above) is applied to an image whose gray level ramps steadily across it, it yields a constant value right across the image (the "all eight image" of the example). Implementing the same template on this "all eight image" then yields zero. This is not unlike applying the differentiation operator to a straight line, e.g. y = 3x - 2: once we have the gradient, if the gradient is then differentiated and the result is zero, it shows that the original line was straight.

Images often come with a gray level "trend" on them, i.e. one side of a
regions is lighter than the other, but there is no "edge" to be discovered in
the region, the shading is even, indicating a light source that is stronger at
one end, or a gradual colour change over the surface.

Another advantage of second order derivative operators is that the edge


contours detected are closed curves. This is very important in image
segmentation. Also, there is no response to areas of smooth linear
variations in intensity.

The Laplacian is a good example of a second order derivative operator. It


is distinguished from the other operators because it is omnidirectional. It
will highlight edges in all directions. The Laplacian operator will produce
sharper edges than most other techniques. These highlights include both
positive and negative intensity slopes.

The edge Laplacian of an image can be found by convolving with masks such as

0  1  0        1  1  1
1 -4  1   or   1 -8  1
0  1  0        1  1  1

The Laplacian set of operators is widely used. Since it effectively removes the general gradient of lighting or colouring from an image, it only discovers and enhances much more discrete changes than, for example, the Sobel operator. It does not produce any information on direction, which is seen as a function of gradual change. It enhances noise, though larger Laplacian operators and similar families of operators tend to ignore noise.

Determining zero crossings

The method of determining zero crossings with some desired threshold is to pass a 3 x 3 window across the image, determining the maximum and minimum values within that window. If the difference between the maximum and minimum values exceeds the predetermined threshold, an edge is present. Notice the larger number of edges with a smaller threshold. Also notice that the width of all the edges is one pixel.
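A possible MATLAB rendering of this zero-crossing test is sketched below: a Laplacian response is computed first, then the maximum and minimum in each 3 x 3 window are compared against a threshold; the Laplacian mask, test image and threshold are assumptions for illustration.

% Sketch: zero-crossing detection via the 3 x 3 max/min range test described above
I   = im2double(imread('cameraman.tif'));
lap = imfilter(I, [1 1 1; 1 -8 1; 1 1 1], 'replicate');  % second derivative response
mx  = ordfilt2(lap, 9, ones(3));              % maximum within each 3 x 3 window
mn  = ordfilt2(lap, 1, ones(3));              % minimum within each 3 x 3 window
T   = 0.2 * max(abs(lap(:)));                 % predetermined threshold (illustrative)
edges = (mx - mn > T) & (sign(mx) ~= sign(mn));  % range exceeds T and sign changes
imshow(edges);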

A second order derivative edge detector that is less susceptible to noise is the Laplacian of Gaussian (LoG). The LoG edge detector performs Gaussian smoothing before application of the Laplacian. Both operations can be performed by convolving with a mask of the form

LoG(x, y) = -(1/(pi*s^4)) * [1 - (x^2 + y^2)/(2*s^2)] * exp(-(x^2 + y^2)/(2*s^2))

where x, y represent the row and column of the image and s is a value of dispersion that controls the effective spread.

Due to its shape, the function is also called the Mexican hat filter. Figure 2.7 shows the cross section of the LoG edge operator with different values of s. The wider the function, the wider the edge that will be detected. A narrow function will detect sharp edges and more detail.

Figure 2.7 Cross section of LoG with various s.

The greater the value of s, the wider the convolution mask necessary. The first zero crossing of the LoG function is at sqrt(2)*s. The width of the positive center lobe is twice that. To have a convolution mask that contains the nonzero values of the LoG function requires a width three times the width of the positive center lobe (8.49s).

Edge detection based on the Gaussian smoothing function reduces the


noise in an image. That will reduce the number of false edges detected
and also detects wider edges.

Most edge detector masks are seldom greater than 7 x 7. Due to the shape
of the LoG operator, it requires much larger mask sizes. The initial work
in developing the LoG operator was done with a mask size of 35 x 35.

Because of the large computation requirements of the LoG operator, the Difference of Gaussians (DoG) operator can be used as an approximation to the LoG. The DoG can be shown as

DoG(x, y) = (1/(2*pi*s1^2)) * exp(-(x^2 + y^2)/(2*s1^2)) - (1/(2*pi*s2^2)) * exp(-(x^2 + y^2)/(2*s2^2))

The DoG operator is performed by convolving an image with a mask that is the result of subtracting two Gaussian masks with different s values. The ratio s1/s2 = 1.6 results in a good approximation of the LoG. Figure 2.8 compares a LoG function (s = 12.35) with a DoG function (s1 = 10, s2 = 16).

Figure 2.8 LoG vs. DoG functions.

One advantage of the DoG is the ability to specify the width of the edges to detect by varying the values of s1 and s2. Here are a couple of sample masks. The 9 x 9 mask will detect wider edges than the 7 x 7 mask.

(Sample coefficient values for a 7 x 7 mask and a 9 x 9 mask are given here in the original report.)
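The following MATLAB sketch builds a LoG mask and a DoG approximation with the 1.6 ratio mentioned above; the mask size, s values and test image are illustrative assumptions, not the specific masks of the original report.

% Sketch: LoG filtering and a DoG approximation
I    = im2double(imread('cameraman.tif'));
hLoG = fspecial('log', 9, 1.5);               % 9 x 9 LoG mask with s = 1.5
rLoG = imfilter(I, hLoG, 'replicate');        % LoG response
g1   = fspecial('gaussian', 9, 1.0);          % narrow Gaussian (s1)
g2   = fspecial('gaussian', 9, 1.6);          % wide Gaussian (s2), s2/s1 = 1.6
hDoG = g1 - g2;                               % difference of Gaussians mask
rDoG = imfilter(I, hDoG, 'replicate');        % DoG response, approximates rLoG
BW = edge(I, 'log', [], 1.5);                 % built-in zero-crossing based detector
figure, imshow(BW);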

2.4 Segmentation using thresholding.

Thresholding is based on the assumption that the histogram has two dominant modes, for example light objects and a dark background. The method to extract the objects is to select a threshold T that separates the two modes, labelling a pixel as object if f(x, y) > T. Depending on the kind of problem to be solved we could also have multilevel thresholding. Based on the region over which the threshold is computed, we distinguish global thresholding and local thresholding: global thresholding considers the function for the entire image, while local thresholding involves only a certain region. In addition, if the threshold function T depends on the spatial coordinates, the technique is known as dynamic or adaptive thresholding.

Let us consider a simple example to explain thresholding.

Figure 2.9: Hypothetical frequency distribution of intensity values for fat, muscle and bone.

A hypothetical frequency distribution f(I) of intensity values I(x, y) for fat, muscle and bone in a CT image. Low intensity values correspond to fat tissue, whereas high intensity values correspond to bone. Intermediate intensity values correspond to muscle tissue. F+ and F- refer to the false positives and false negatives; T+ and T- refer to the true positives and true negatives.

Basic global thresholding technique:

In this technique the entire image is scanned pixel by pixel and each pixel is labeled as object or background, depending on whether its gray level is greater or less than the threshold T. The success depends on how well the histogram is constructed. The technique is very successful in controlled environments, and finds its applications primarily in the industrial inspection area.

The algorithm for global thresholding can be summarized in a few steps.

1) Select an initial estimate for T.

2) Segment the image using T. This will produce two groups of pixels.
G1 consisting of all pixels with gray level values >T and G2 consisting of
pixels with values <=T.

3) Compute the average gray level values mean1 and mean2 for the pixels
in regions G1 and G2.

4) Compute a new threshold value T=(1/2)(mean1 +mean2).

5) Repeat steps 2 through 4 until the difference in T between successive iterations is smaller than a predefined parameter T0.
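A minimal MATLAB sketch of this iterative global thresholding algorithm is shown below; the test image and the convergence tolerance T0 are assumptions for illustration.

% Sketch: basic global thresholding (steps 1-5 above)
I  = double(imread('coins.png'));             % any grayscale image with two modes
T  = mean(I(:));                              % 1) initial estimate for T
T0 = 0.5;                                     % predefined convergence parameter
while true
    G1 = I(I >  T);                           % 2) pixels with gray level > T
    G2 = I(I <= T);                           %    pixels with gray level <= T
    Tnew = 0.5 * (mean(G1) + mean(G2));       % 3)-4) new threshold from the two means
    if abs(Tnew - T) < T0, break; end         % 5) stop when T changes by less than T0
    T = Tnew;
end
BW = I > T;                                   % segmented image
imshow(BW);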

CHAPTER 3

RELATED WORK

3.1 Pre-processing

Pre-processing performs some steps to improve the image quality. A few algorithms are explained below; filtering is done using a median filter and a morphological filter.

3.2 Median Filter

In signal processing, it is often desirable to perform noise reduction on an


image or signal. The median filter is a nonlinear digital filtering
technique, often used to remove noise. Such noise reduction is a typical
pre-processing step to improve the results of later processing (for
example, edge detection on an image). Median filtering is very widely
used in digital image processing because, under certain conditions, it
preserves edges while removing noise. The median filter is a classical
noise removal filter. Noise is removed by calculating the median of all the elements in a box around each pixel and storing that value in the central element. Consider the example of a 3 x 3 matrix: the median filter sorts the elements in the given matrix and the median value is assigned to the central pixel. For sorted elements 1, 2, 3, 4, 5, 6, 7, 8, 9, the median 5 is assigned to the central element. A similar box scan is performed over the whole image to reduce noise. The execution time is higher than for a mean filter, as the algorithm involves sorting, but it removes small pixel noise.
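As a brief illustration, the MATLAB sketch below adds impulse noise to a test image and removes it with a 3 x 3 median filter; the image and noise level are assumptions for demonstration only.

% Sketch: median filtering with a 3 x 3 window
I  = imread('cameraman.tif');
In = imnoise(I, 'salt & pepper', 0.05);       % add impulse noise for the test
If = medfilt2(In, [3 3]);                     % median of each 3 x 3 neighbourhood
figure, subplot(1,2,1), imshow(In), title('Noisy');
subplot(1,2,2), imshow(If), title('Median filtered');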

3.3 Morphological Process

Morphological operations are applied to the segmented binary image to smooth the foreground region. They process the image based on shapes and operate on the image using a structuring element. The structuring elements are created with specified shapes (disk, line, square) and contain 1's and 0's, where the ones represent the neighbourhood pixels. Dilation and erosion are used to enhance (smooth) the object region by removing unwanted pixels from outside the foreground object. After this process, connected component analysis is applied to the pixels and the object region is analysed to count the objects.
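The kind of smoothing described here can be sketched in MATLAB as follows; the binary test image and the structuring element shapes and sizes are illustrative assumptions rather than the exact settings used in this project.

% Sketch: smoothing a binary foreground mask with morphological operations
bw = imread('text.png') > 0;                  % stand-in for a segmented binary mask
se1 = strel('disk', 2);                       % structuring elements with chosen shapes
se2 = strel('square', 5);
clean = imopen(bw, se1);                      % erosion then dilation: removes specks
clean = imclose(clean, se2);                  % dilation then erosion: closes small gaps
clean = imfill(clean, 'holes');               % fill holes inside the object regions
imshow(clean);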

3.4 Connected Component Analysis

The output of the change detection module is a binary image that contains only two labels, i.e., '0' and '255', representing 'background' and 'foreground' pixels respectively, with some noise. The goal of connected component analysis is to detect the large connected foreground regions or objects. This is one of the important operations in motion detection. The pixels that are collectively connected can be clustered into moving objects by analysing their connectivity.
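A short MATLAB sketch of connected component analysis on such a binary mask is given below; the test image and the minimum region size are assumptions for illustration.

% Sketch: connected component analysis of a binary foreground mask
bw = imread('text.png') > 0;                  % stand-in for the change-detection output
bw = bwareaopen(bw, 50);                      % keep only sufficiently large regions
[L, num] = bwlabel(bw, 8);                    % label 8-connected components
stats = regionprops(L, 'Area', 'Centroid');   % size and position of each component
fprintf('Detected %d connected objects\n', num);
imshow(label2rgb(L));                         % visualise the labelled regions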


3.4.1 BACKGROUND SUBTRACTION METHODS

The common problem of all computer vision systems is to separate people from a background scene (i.e. determine the foreground and the background). Many methods have been proposed to solve this problem. Background subtraction is used in several applications to detect the moving objects in a scene, as in multimedia, video surveillance and optical motion capture. Background subtraction consists of steps such as background modelling, background initialization, background maintenance and foreground detection, as shown in Figure 3.1.

Figure 3.1 Classification of Background Subtraction

3.4.1.1 Background Modelling


Background modelling describes the kind of model used to represent the background. The simplest way to model the background is to acquire a background image which does not include any moving object. In some environments, the background is not available and can change under critical situations, such as illumination changes or objects being introduced or removed from the scene.


In the literature, there are many proposed background modelling algorithms. This is mainly because no single algorithm is capable of coping with all the challenges in this area. There are several problems that a good background subtraction algorithm must resolve. First, it must be robust against changes in illumination. Second, it should avoid detecting non-stationary background objects, such as swaying leaves, grass, rain, snow, and shadows cast by moving objects. Finally, the background model should react quickly to changes in the background, such as the starting and stopping of vehicles.

Background modelling techniques can be classified into two broad categories: predictive and non-predictive modelling. Predictive methods model the scene as a time series and create a dynamic model at each pixel to predict the incoming input from past observations; the magnitude of the deviation between the actual observation and the predicted value is then used to categorize pixels as foreground or background. Non-predictive methods neglect the order of the input observations and develop a statistical (probabilistic) model, such as a probability density function, at each pixel.

Background adaptation techniques can also be categorized as non-recursive and recursive. A non-recursive technique estimates the background using a sliding-window approach: the observed video frames are stored in a buffer, and the background image is estimated from the pixel variations within the buffer. Since in practice the buffer size is fixed, as time passes and more video frames arrive, the initial frames of the buffer are discarded, which makes these techniques adaptive to scene changes depending on their buffer size. However, to adapt to slow moving objects or to cope with transient stops of certain objects in the scene, non-recursive techniques require a large amount of memory for storing an appropriate buffer. With a fixed buffer size this problem can partially be solved by reducing the frame rate at which frames are stored.

On the contrary, recursive techniques, instead of maintaining a buffer to estimate the background, try to update the background model recursively using either a single model or multiple models as each input frame is observed. Even the very first input frames can leave an effect on new input video frames, which makes the algorithm adapt to periodic motions such as flickering or shaking leaves. Recursive methods need less storage than non-recursive methods, but possible errors stay visible for a longer time in the background model. The majority of schemes use exponential weighting or forgetting factors to determine the proportion of the contribution of past observations.

Many of the pixel-based probability density function methods in the literature are of the mixture-of-Gaussians (MOG) variety: each pixel's intensity is described by a mixture of K Gaussian distributions, where K is usually a small number. Over the years, increasingly complex pixel-level algorithms have been proposed. Among these, by far the most popular is the Gaussian Mixture Model (GMM). This model consists of modelling the distribution of the values observed over time at each pixel by a weighted mixture of Gaussians.

This background pixel model is able to cope with the multimodal nature of many practical situations and leads to good results when repetitive background motions, such as tree leaves or branches, are encountered. A particle swarm optimization method has been proposed to automatically determine the parameters of the GMM algorithm.

However, the technique has a considerable computational cost, as it only manages to process seven frames of 640 x 480 pixels per second with an Intel Xeon 5150 processor. The downside of the GMM algorithm resides in its strong assumptions that the background is more frequently visible than the foreground and that its variance is significantly lower. Neither of these is valid for every time window. Furthermore, if high- and low-frequency changes are present in the background, its sensitivity cannot be accurately tuned and the model may adapt to the targets themselves or miss the detection of some high speed targets. In addition, the estimation of the parameters of the model can become problematic in real-world noisy environments. This often leaves one with no other choice than to use a fixed variance in a hardware implementation.

In the codebook algorithm, each pixel is represented by a codebook, which is a compressed form of background model for a long image sequence. Each codebook is composed of codewords comprising colours transformed by an innovative colour distortion metric. An improved codebook incorporating the spatial and temporal context of each pixel has been proposed. Codebooks are believed to be able to capture background motion over a long period of time with a limited amount of memory. However, one should note that the proposed codebook update mechanism does not allow the creation of new codewords, and this can be problematic if permanent structural changes occur in the background. Instead of choosing a particular background density model, some methods keep a cache of a given number of the last observed background values for each pixel and classify a new value as background if it matches most of the values stored in the pixel's model. One might expect that such an approach would avoid the issues related to deviations from an arbitrarily assumed density model, but the values of the pixel models are replaced according to a first-in first-out update policy.

The W4 model is a rather simple but nevertheless effective method. It uses three values to represent each pixel in the background image: the minimum and maximum intensity values, and the maximum intensity difference between consecutive images of the training sequence. A small improvement has been made to the W4 model, together with the incorporation of a technique for shadow detection and removal. In an on-line K-means approach, each incoming pixel is matched to a cluster if it is within 2.5 standard deviations of the cluster mean. The parameters for that cluster are then updated with the new observation. Repeated similar pixels drive the weight of their cluster up and simultaneously reduce the cluster variance, indicating a higher selectivity. If no match is found, the observation becomes a new cluster with an initially low weight and wide variance. Background estimation has also been formulated as an optimal labelling problem in which each pixel of the background image is labelled with a frame number, indicating which colour from the past must be copied.

The proposed algorithm produces a background image which is constructed by copying areas from the input frames. Impressive results are shown for static backgrounds, but the method is not designed to cope with objects moving slowly in the background, as its outcome is a single static background frame. Pixel-based background subtraction techniques compensate for the lack of spatial consistency by constantly updating their model parameters.

3.4.1.2 Background Subtraction Algorithm

The background subtraction algorithm is a computational vision process for extracting foreground objects in a particular frame. A foreground object can be described as an object of attention, which helps in reducing the amount of data to be processed as well as providing important information to the task under consideration. Often, the foreground object can be thought of as a coherently moving object in a scene. We must emphasize the word coherent here because if a person is walking in front of moving leaves, the person forms the foreground object while the leaves, though having motion associated with them, are considered background due to their repetitive behaviour.

In some cases, the distance of the moving object also forms a basis for it to be considered background: if one person is close to the camera while another is far away in the background, the nearby person is considered foreground while the distant person is ignored due to its small size and the lack of information that it provides.
Identifying moving objects from a video sequence is a fundamental and
critical task in many computer-vision applications. A common approach
is to perform background subtraction, which identifies moving objects
from the portion of video frame that differs from the background model.
Fig.3.2 Flow Chart for background subtraction algorithm.


Fig.3.3 Single background Method

In this method a set of previous frames is taken and the calculation is done for separation. The separation is performed by calculating the mean and variance at each pixel position. If we take N frames, with I_i(P) denoting the intensity of pixel P in frame i:

Mean:      mu(P) = (1/N) * sum over i = 1..N of I_i(P)                      (1)

Variance:  sigma^2(P) = (1/N) * sum over i = 1..N of (I_i(P) - mu(P))^2     (2)

Now, after calculating the variance of each pixel, a threshold function is used to separate the foreground and background objects. Figure 3.3 shows the single Gaussian background method while testing.
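A minimal sketch of this per-pixel mean/variance (single Gaussian) model in MATLAB follows; the video file name, the number of training frames and the threshold k are assumptions for illustration, and the frames are assumed to be RGB.

% Sketch: single Gaussian background model built from N training frames
v = VideoReader('traffic.avi');               % hypothetical input video
N = 50;                                       % number of training frames
frames = zeros(v.Height, v.Width, N);
for i = 1:N
    frames(:,:,i) = im2double(rgb2gray(readFrame(v)));
end
mu    = mean(frames, 3);                      % per-pixel mean, as in eq. (1)
sigma = std(frames, 0, 3);                    % per-pixel standard deviation, cf. eq. (2)
test  = im2double(rgb2gray(readFrame(v)));    % next frame to classify
k = 2.5;                                      % pixels further than k*sigma are foreground
fg = abs(test - mu) > k * max(sigma, 0.02);   % small floor avoids zero-variance pixels
imshow(fg);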
3.4.2 Frame Difference

Frame difference calculates the difference between two frames at every pixel position and stores the absolute difference. It is used to visualize the moving objects in a sequence of frames. Consider a sequence of frames: the present frame and the next frame are taken into consideration at every calculation, and the frames are then shifted (after the calculation the next frame becomes the present frame and the frame that comes next in the sequence becomes the next frame).


Fig.3.4 Frame difference between 2 frames

Frame difference is calculated step by step as shown in Fig. 3.5. Let f_k be the current frame and f_(k-1) be the previous frame. Now f_(k-1) is subtracted from f_k; the result is the pixel variation between the two adjacent frames:

D_k(x, y) = | f_k(x, y) - f_(k-1)(x, y) |

Fig. 3.5 Flow chart for the frame difference method (start, subtract, binarization process, result).
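A minimal MATLAB sketch of this frame-difference loop is shown below; the video name and the binarisation threshold are assumptions, and RGB frames are assumed.

% Sketch: frame difference between consecutive frames, followed by binarisation
v  = VideoReader('traffic.avi');              % hypothetical input video
f1 = im2double(rgb2gray(readFrame(v)));       % previous frame f_(k-1)
while hasFrame(v)
    f2 = im2double(rgb2gray(readFrame(v)));   % current frame f_k
    D  = abs(f2 - f1);                        % absolute pixel variation between frames
    moving = D > 0.1;                         % binarisation with a fixed threshold
    imshow(moving); drawnow;
    f1 = f2;                                  % shift: current frame becomes previous
end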

3.5 Feature Extraction

Feature extraction plays a major role in detecting moving objects in a sequence of frames. Every object has a specific feature such as colour or shape. In a sequence of frames, any one of these features is used to detect the objects in the frame. Edges are formed where there is a sharp change in the intensity of the image. If there is an object, the pixel positions of the object boundary are stored, and in the next frames of the sequence this position is verified. A corner-based algorithm uses the pixel positions of edges for defining and tracking objects.

3.6 Bounding Box with Colour Feature

If the segmentation is performed using frame difference, the residual image is visualized with a rectangular bounding box with the dimensions of the object produced from the residual image. For a given image, a scan is performed to find where the intensity values of the image exceed a limit (which depends on the assigned threshold). Here the feature is extracted by colour, and the intensity value describes the colour. The pixel positions of the first hits of these intensity values from the top, bottom, left and right are stored.

Using these values, a rectangular bounding box is plotted within the limits of the values produced. The bounding box is drawn using the calculated height and width. Initially the centroid of the object is extracted and then, by calculating the height and width, a bounding box is drawn around the object.

We are often interested in only certain parts of an image. These parts are often referred to as targets or foreground (the other parts being the background). In order to identify and analyse the target in the image, we need to isolate it from the image. Image segmentation refers to dividing the image into regions, each with its own characteristics, and extracting the target of interest in the process. The image segmentation used in this work is threshold segmentation. Put simply, threshold segmentation of a gray-scale image identifies a gray-scale threshold, compares the gray value of every image pixel with that threshold, and according to the result assigns the pixel to one of two categories: foreground or background. In the simplest case, the image after single-threshold segmentation can be defined as

g(x, y) = 1 if f(x, y) >= T, and g(x, y) = 0 otherwise.

Threshold segmentation has two main steps:
1) Determine the threshold T.
2) Compare each pixel value with the threshold value T.
In the above steps, determining the threshold value is the most critical step in the partition. There is a best threshold depending on the particular goal of the image segmentation. If we can determine an appropriate threshold, we can correctly segment the image.
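The bounding box and centroid extraction described above can be sketched in MATLAB with regionprops; the binary mask used below is a stand-in and the minimum area is an assumption.

% Sketch: bounding box and centroid from a thresholded residual image
moving = imread('text.png') > 0;              % stand-in for the binary residual mask
moving = bwareaopen(moving, 50);              % discard small noisy regions
stats  = regionprops(moving, 'BoundingBox', 'Centroid');
imshow(moving); hold on;
for k = 1:numel(stats)
    bb = stats(k).BoundingBox;                % [x y width height]
    rectangle('Position', bb, 'EdgeColor', 'g');
    plot(stats(k).Centroid(1), stats(k).Centroid(2), 'r+');   % object centroid
end
hold off;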


CHAPTER 4

SINGLE OBJECT TRACKING AND VELOCITY ESTIMATION

Object tracking is the process of locating and following a moving object in a sequence of video frames. Smart cameras are used as input sensors to record the video. The recorded video may have some noise due to bad weather (light, wind, etc.) or due to problems in the sensors. A few algorithms are tested to improve the image quality, to detect the moving object, and to calculate the distance and velocity of the moving object. Extraction of objects using their features is known as object detection. Every object has a specific feature based on its dimensions. By applying a feature extraction algorithm, the object in each frame can be pointed out.

4.1 Optical Flow

Optical flow is one way to detect moving objects in a sequence of frames. In this method, the motion of each pixel between successive frames is calculated and compared; typically the motion is represented as a vector field over the pixel positions.
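A short sketch using the Computer Vision Toolbox Lucas-Kanade optical flow object is given below; the video name and noise threshold are assumptions, and this is one possible way to visualise per-pixel motion vectors rather than the exact method used in the project.

% Sketch: Lucas-Kanade optical flow on a video (Computer Vision Toolbox)
v  = VideoReader('visiontraffic.avi');        % example video name is an assumption
of = opticalFlowLK('NoiseThreshold', 0.009);
while hasFrame(v)
    g = rgb2gray(readFrame(v));               % grayscale frame
    flow = estimateFlow(of, g);               % per-pixel motion vectors (Vx, Vy)
    imshow(g); hold on;
    plot(flow, 'DecimationFactor', [10 10], 'ScaleFactor', 10);  % draw flow vectors
    hold off; drawnow;
end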

4.2 Block Matching

The block matching algorithm is a standard technique for determining the moving object in a video. Blocks are formed in a region without overlapping the other regions. Every block in a frame is compared with the corresponding blocks in the subsequent frames, and the candidate block with the smallest pixel-value distance is selected.
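A toolbox-free sketch of exhaustive block matching using the sum of absolute differences (SAD) is given below; the function would be saved as matchBlock.m, and the block size and search range are illustrative parameters.

% Sketch: exhaustive block matching with the sum of absolute differences (SAD)
function [dx, dy] = matchBlock(f1, f2, r, c, B, S)
% (r, c): top-left corner of a B x B block in frame f1; S: search range in pixels
block = f1(r:r+B-1, c:c+B-1);
best = inf; dx = 0; dy = 0;
for i = -S:S
    for j = -S:S
        rr = r + i; cc = c + j;
        if rr < 1 || cc < 1 || rr+B-1 > size(f2,1) || cc+B-1 > size(f2,2)
            continue;                         % candidate block falls outside frame f2
        end
        cand = f2(rr:rr+B-1, cc:cc+B-1);
        sad  = sum(abs(block(:) - cand(:)));  % distance between the two blocks
        if sad < best
            best = sad; dy = i; dx = j;       % remember the best displacement so far
        end
    end
end
end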

4.3 Tracking

The process of locating a moving object in a sequence of frames is known as tracking. Tracking can be performed by using feature extraction of objects and detecting the objects in the sequence of frames. By using the position values of the object in every frame, we can calculate the position and velocity of the moving object.

4.4 Distance

The distance travelled by the object is determined using the centroid. It is calculated with the Euclidean distance formula, where the variables are the pixel positions of the moving object at the initial and final stages:

Distance = sqrt( (x2 - x1)^2 + (y2 - y1)^2 )

where x1, y1 are the previous pixel positions and x2, y2 are the present pixel positions in width and height respectively.

4.5 Velocity

The velocity of the moving object is calculated from the distance it travelled with respect to time. The Euclidean distance formula is used to calculate the distance between the object positions in the sequence of frames. By using the distance values together with the frame rate, the velocity of the object is obtained. The velocity is two-dimensional (since the camera is static) and is determined from the distance travelled by the centroid relative to the frame rate of the video; it is expressed in pixels per second. The successive frames of a video of a moving object are shown in the figures that follow.

Fig. 4.1 Flow chart of object velocity determination

Velocity = Distance travelled / Frame Rate
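As an illustration of turning centroid positions into a distance and a velocity, the sketch below accumulates the Euclidean distances between the centroids of successive frames and converts the result into pixels per second; the centroid track and frame rate are made-up values, and the conversion to seconds is one possible reading of the formula above rather than a reproduction of the project's own computation.

% Sketch: distance and velocity from object centroids in successive frames
cents = [10 12; 14 15; 19 19; 25 24];         % illustrative centroid track, one [x y] per frame
Framerate = 30;                               % frames per second of the video
d = sqrt(sum(diff(cents).^2, 2));             % Euclidean distance between adjacent frames
totalDistance = sum(d);                       % pixels travelled by the centroid
elapsedTime   = (size(cents,1) - 1) / Framerate;   % seconds spanned by the track
velocity = totalDistance / elapsedTime;       % pixels per second
fprintf('Distance %.1f px, velocity %.1f px/s\n', totalDistance, velocity);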

Object tracking is a major application in security, surveillance and vision analysis. In this project, a video is recorded using a digital camera. The recorded video is converted into individual frames. Noise is removed from the imported images using a median filter. The filtered images are used as input to the frame difference step for the separation of foreground and background objects. A rectangular bounding box is plotted around the foreground objects produced from the frame difference and frame subtraction.

Fig.4.2 Step by step diagrams during evaluation and simulation

The processing pipeline is: input video, frame separation, noise removal, background subtraction / frame difference with dynamic threshold, object detection, object bounding box, calculation of distance travelled, and velocity determination. Figure 4.2 shows this step-by-step process of tracking and moving object detection, making clear the inputs and the output produced at each step.

CHAPTER 5

CODING

clc;
close all;
clear all;

% ---- Video loading ----
[f, p] = uigetfile('*.avi');            % select an AVI file
T = strcat(p, f);
Vid = VideoReader(T);

% ---- Basic video information ----
n = Vid.NumberOfFrames;                 % number of frames
disp('number of frames...');
disp(n);
H = Vid.Height;                         % height and width of video
W = Vid.Width;
disp('height of a video file');
disp(H);
disp('Width of a video file');
disp(W);
Framerate = Vid.FrameRate;
disp('Framerate of a video file');
disp(Framerate);

% ---- Region of interest points ----
p2 = [70, 600];
p3 = [70, 25];
q3 = [250, 300];
q2 = [250, 10];
alpha = 0.1;                            % learning rate for the running background
f = read(Vid, 1);                       % first frame, used as initial background
f = imcrop(f, [p3(2) p2(1) q3(2) q3(1)-p2(1)]);

% ---- GMM foreground detection and blob analysis ----
reader = vision.VideoFileReader(T);
detector = vision.ForegroundDetector('NumGaussians', 3, ...
    'NumTrainingFrames', 415, 'MinimumBackgroundRatio', 0.7);
blob = vision.BlobAnalysis('BoundingBoxOutputPort', true, ...
    'AreaOutputPort', true, 'CentroidOutputPort', true, ...
    'MinimumBlobArea', 400);
vid_frame = vision.VideoPlayer('Position', [20, 50, 600, 400]);
vid_mask  = vision.VideoPlayer('Position', [20, 400, 600, 400]);

ii = 1;
while ~isDone(reader)
    frame  = reader.step();
    fgMask = detector.step(frame);                       % foreground mask from GMM
    mask = imopen(fgMask, strel('rectangle', [3, 3]));   % remove small noise
    mask = imclose(mask, strel('rectangle', [15, 15]));  % close gaps in objects
    mask = imfill(mask, 'holes');                        % fill holes inside objects
    [~, centroid, bbox] = blob.step(fgMask);
    a1{ii, 1} = bbox;                                    % store bounding boxes per frame
    ii = ii + 1;
    [~, centroids, bboxs] = blob.step(mask);
    % draw bounding boxes around objects
    out  = insertShape(frame, 'Rectangle', bbox, 'Color', 'blue');
    out1 = insertShape(double(mask), 'Rectangle', bboxs, 'Color', 'green');
    % view results in the video players
    step(vid_frame, out);
    step(vid_mask, out1);
end

% ---- Background subtraction with a running average background ----
temp = 0;
for i = 1:n
    frame = read(Vid, i);
    frame = imcrop(frame, [p3(2) p2(1) q3(2) q3(1)-p2(1)]);
    F1 = ((1-alpha) .* f) + (alpha .* frame);   % background subtraction formula
    FF = frame - F1;                            % difference image
    temp = (temp + FF) ./ i;
    bw  = im2bw(FF, 0.1);                       % binarise the difference image
    out = bwareaopen(bw, 1000);                 % keep only large regions
    out = bwlabel(out, 8);                      % label 8-connected components
    figure(1), imshow(out)
    V(:, :, :, i) = out;                        % store the detection for each frame
end

% ---- Distance and velocity estimation ----
T1 = V(:, :, :, 1);
for i = 2:n
    T2 = V(:, :, :, i);
    DD(i) = dist(T1(:)', T2(:));                % Euclidean distance between frames
end
D1 = mean(DD);
disp([' Distance of object is ', num2str(D1), ' pixels'])
vl = (mean(DD) / Framerate);
disp([' Velocity of object is ', num2str(vl), ' pixel/fps'])

RESULTS:

Before Filtering

After Filtering

Object Tracking

Object Tracking

CHAPTER 6

CONCLUSION AND FUTURE WORKS

Tracking of moving objects is a major application in security, surveillance and vision analysis. In this project, a video is recorded using a digital camera. The recorded video is converted into individual frames. Noise is removed from the imported images using a median filter and a morphological filter. The filtered images are used as input to the frame difference step for the separation of foreground and background objects. Background subtraction is performed using a dynamic threshold and morphological processing.

In dynamic threshold based object detection, morphological processing and filtering are also used effectively to remove unwanted pixels from the background. These methods are effective for object detection, and most of the previous methods depend on the assumption that the background is static over short time periods. The algorithms can also be extended to real-time applications and object classification. Future work is the extension of the algorithm to include smooth movement of objects, connected component analysis, object bounding boxes, and distance and velocity determination of multiple objects with a changing background.

REFERENCES

[1] L. Meng and J. P. Kerekes, "Object tracking using high resolution satellite imagery," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 5, 2012.

[2] M. Presnar, A. Raisanen, D. Pogorzala, J. Kerekes, and A. Rice, "Dynamic scene generation, multimodal sensor design, and target tracking demonstration for hyperspectral/polarimetric performance-driven sensing," in Proc. SPIE, 2010, vol. 7672.

[3] G. Marchisio, F. Pacifici, and C. Padwick, "On the relative predictive value of the new spectral bands in the Worldview-2 sensor," in Proc. 2010 IEEE Int. Geoscience and Remote Sensing Symp., Jul. 2010.

[4] M. Gong, "Realtime background subtraction from dynamic scenes," in Proc. 2009 IEEE 12th Int. Conf. on Computer Vision.

[5] Y. Yu, C. Zhou, and L. Huang, "A moving target detection algorithm based on the dynamic background," IEEE, 2009.

[6] D. Lenhart, S. Hinz, J. Leitloff, and U. Stilla, "Automatic traffic monitoring based on aerial image sequences," Pattern Recognit. Image Anal., vol. 18, 2008.

[7] S. Hinz, D. Weihing, S. Suchandt, and R. Bamler, "Detection and velocity estimation of moving vehicles in high-resolution spaceborne synthetic aperture radar data," in Proc. 2008 Computer Vision and Pattern Recognition Workshops, Jun. 2008.

[8] S. Zhang, H. Yao, S. Liu, X. Chen, and W. Gao, "A covariance-based method for dynamic background subtraction," in Proc. IEEE Int. Conf. Pattern Recognition (ICPR), Dec. 2008.

[9] A. Yilmaz, O. Javed, and M. Shah, "Object tracking: A survey," ACM Comput. Surv., vol. 38, Dec. 2006.

[10] V. Parameswaran, V. Ramesh, and I. Zoghlami, "Fast crowd segmentation using shape indexing," IEEE, 2007.

[11] T. Zhao, R. Nevatia, and B. Wu, "Segmentation and tracking of multiple humans in crowded environments," IEEE Computer Society, vol. 30, 2007.

[12] S. Hinz, R. Bamler, and U. Stilla, "Theme issue: Airborne and spaceborne traffic monitoring," ISPRS J. Photogramm. Remote Sens., vol. 61, no. 3-4, 2006.

[13] J. Kerekes, M. Muldowney, K. Strackerhan, L. Smith, and B. Leahy, "Vehicle tracking with multi-temporal hyperspectral imagery," in Proc. SPIE, 2006, vol. 6233, p. 62330C.

APPENDIX A

MATLAB

INTRODUCTION

MATLAB is a high-performance language for technical
computing. It integrates computation, visualization, and programming in
an easy-to-use environment where problems and solutions are expressed
in familiar mathematical notation. MATLAB stands for matrix
laboratory, and was written originally to provide easy access to matrix
software developed by the LINPACK (linear system package) and EISPACK
(eigensystem package) projects. MATLAB is therefore built on a
foundation of sophisticated matrix software in which the basic element is
an array that does not require pre-dimensioning, which allows many
technical computing problems, especially those with matrix and vector
formulations, to be solved in a fraction of the time.
MATLAB features a family of application-specific solutions
called toolboxes. Very important to most users of MATLAB, toolboxes
allow learning and applying specialized technology. These are
comprehensive collections of MATLAB functions (M-files) that extend
the MATLAB environment to solve particular classes of problems. Areas
in which toolboxes are available include signal processing, control
systems, neural networks, fuzzy logic, wavelets, simulation and many
others.
Typical uses of MATLAB include math and computation, algorithm
development, data acquisition, modelling, simulation and prototyping, data
analysis, exploration and visualization, scientific and engineering graphics,
and application development, including graphical user interface building.

A.2 Basic Building Blocks of MATLAB

The basic building block of MATLAB is the matrix. The fundamental
data type is the array. Vectors, scalars, real matrices and complex matrices
are handled as specific classes of this basic data type. The built-in functions
are optimized for vector operations. No dimension statements are
required for vectors or arrays.
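As a small illustration (the variable names are arbitrary), a matrix can be created and grown without any dimension statement:

>> A = [1 2 3; 4 5 6]      % a 2x3 matrix created directly
>> A(3,4) = 10             % MATLAB automatically grows A to 3x4
>> v = 0:0.5:2             % a row vector [0 0.5 1 1.5 2]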

A.2.1 MATLAB Windows

MATLAB works with the following windows: the command window, the
workspace window, the current directory window, the command history
window, the editor window, the graphics window and the online-help window.

A.2.1.1 Command Window

The command window is where the user types MATLAB commands and
expressions at the prompt (>>) and where the output of those commands
is displayed. It is opened when the application program is launched. All
commands including user-written programs are typed in this window at
MATLAB prompt for execution.

A.2.1.2 Work Space Window

MATLAB defines the workspace as the set of variables that the user
creates in a work session. The workspace browser shows these variables
and some information about them. Double clicking on a variable in the
workspace browser launches the Array Editor, which can be used to
obtain information.

A.2.1.3 Current Directory Window

The Current Directory tab shows the contents of the current directory,
whose path is shown in the current directory window. For example, in the
Windows operating system the path might be as follows:
C:\MATLAB\Work, indicating that the directory "work" is a subdirectory of
the main directory "MATLAB", which is installed in drive C. Clicking
on the arrow in the current directory window shows a list of recently used
paths. MATLAB uses a search path to find M-files and other MATLAB-related
files. Any file run in MATLAB must reside in the current
directory or in a directory that is on the search path.
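For example (the folder name below is purely illustrative), a directory can be added to the search path from the command line:

>> addpath('C:\MATLAB\Work\tracking')    % add a folder to the search path
>> path                                  % display the current search path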

A.2.1.4 Command History Window

The Command History window contains a record of the
commands a user has entered in the command window, including both
current and previous MATLAB sessions. Previously entered MATLAB
commands can be selected and re-executed from the command history
window by right-clicking on a command or sequence of commands. This is
useful for selecting various options in addition to executing the commands,
and is a useful feature when experimenting with various commands in a
work session.

A.2.1.5 Editor Window

The MATLAB editor is both a text editor specialized for creating M-files
and a graphical MATLAB debugger. The editor can appear in a window
by itself, or it can be a sub window in the desktop. In this window one
can write, edit, create and save programs in files called M-files.

MATLAB editor window has numerous pull-down menus for tasks such
as saving, viewing, and debugging files. Because it performs some simple
checks and also uses color to differentiate between various elements of
code, this text editor is recommended as the tool of choice for writing and
editing M-functions.

A.2.1.6 Graphics or Figure Window

The output of all graphic commands typed in the command window is


seen in this window.

A.2.1.7 Online Help Window

MATLAB provides online help for all its built-in functions and
programming language constructs. The principal way to get help online is
to use the MATLAB Help Browser, opened as a separate window either
by clicking on the question mark symbol (?) on the desktop toolbar, or by
typing helpbrowser at the prompt in the command window. The Help
Browser is a web browser integrated into the MATLAB desktop that
displays Hypertext Markup Language (HTML) documents. It consists of
two panes: the help navigator pane, used to find information, and the
display pane, used to view the information. Self-explanatory tabs other
than the navigator pane are used to perform a search.

A.3 MATLAB Files

MATLAB has two types of files for storing information: M-files and MAT-files.

A.3.1 M-Files

These are standard ASCII text files with a .m extension to the file name.
They contain MATLAB code, and can be created with the MATLAB editor
or any other text editor by typing the same statements that would be entered
at the MATLAB command line and saving the file under a name that ends
in .m.
There are two types of M-files:

1. Script Files

A script file is an M-file with a set of MATLAB commands in it and is
executed by typing the name of the file on the command line. Script files
operate on the variables currently present in the workspace.

2. Function Files

A function file is also an M-file, except that the variables in a function file
are all local. This type of file begins with a function definition line.
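As a minimal sketch (file and variable names are arbitrary), the two types can be contrasted as follows. A script file saved as areascript.m and run by typing areascript at the prompt:

r = 5;
a = pi*r^2;
disp(['Area = ',num2str(a)])

A function file saved as circlearea.m and called as circlearea(5):

function a = circlearea(r)
% CIRCLEAREA returns the area of a circle of radius r
a = pi*r^2;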

A.3.2 MAT-Files

These are binary data files with a .mat extension that are created by
MATLAB when data are saved. The data are written in a special format
that only MATLAB can read, and are loaded back into MATLAB with the
load command.
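For instance (the variable names are illustrative only):

>> D1 = 12.5; vl = 0.42;
>> save results D1 vl      % creates results.mat in the current directory
>> clear                   % remove all variables from the workspace
>> load results            % restores D1 and vl from results.mat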

A.4 The MATLAB System:

The MATLAB system consists of five main parts:

A.4.1 Development Environment:

This is the set of tools and facilities that help you use MATLAB
functions and files. Many of these tools are graphical user interfaces. It
includes the MATLAB desktop and Command Window, a command
history, an editor and debugger, and browsers for viewing help, the
workspace, files, and the search path.

A.4.2 The MATLAB Mathematical Function Library:

This is a vast collection of computational algorithms ranging from
elementary functions like sum, sine, cosine, and complex arithmetic, to
more sophisticated functions like matrix inverse, matrix eigenvalues,
Bessel functions, and fast Fourier transforms.

A.4.3 The MATLAB Language:

This is a high-level matrix/array language with control flow statements,


functions, data structures, input/output, and object-oriented programming
features. It allows both "programming in the small" to rapidly create
quick and dirty throw-away programs, and "programming in the large" to
create complete large and complex application programs.

A.4.4 Graphics:

MATLAB has extensive facilities for displaying vectors and matrices as


graphs, as well as annotating and printing these graphs. It includes high-
level functions for two-dimensional and three-dimensional data
visualization, image processing, animation, and presentation graphics. It
also includes low-level functions that allow you to fully customize the
appearance of graphics as well as to build complete graphical user
interfaces on your MATLAB applications.

A.4.5 The MATLAB Application Program Interface (API):

This is a library that allows you to write C and FORTRAN programs that
interact with MATLAB. It includes facilities for calling routines from
MATLAB (dynamic linking), calling MATLAB as a computational
engine, and for reading and writing MAT-files.

A.5 SOME BASIC COMMANDS:

pwd        prints the working directory

demo       demonstrates what is possible in MATLAB

who        lists all of the variables in the MATLAB workspace

whos       lists the variables and describes their matrix size

clear      erases variables and functions from memory

clear x    erases the matrix 'x' from the workspace

close      by itself, closes the current figure window

figure     creates an empty figure window

hold on    holds the current plot and all axis properties so that subsequent
           graphing commands add to the existing graph

hold off   sets the next plot property of the current axes to "replace"

find       finds indices of nonzero elements, e.g.
           d = find(x>100) returns the indices of the vector x that are
           greater than 100

break      terminates execution of an M-file or a WHILE or FOR loop

for        repeats statements a specific number of times; the general form
           of a FOR statement is:
           FOR variable = expr, statement, ..., statement END

for n = 1:cc/c
    magn(n,1) = nanmean(a((n-1)*c+1:n*c,1));
end

diff       difference and approximate derivative, e.g.
           DIFF(X), for a vector X, is [X(2)-X(1) X(3)-X(2) ... X(n)-X(n-1)]

NaN        the arithmetic representation for Not-a-Number; a NaN is obtained
           as a result of mathematically undefined operations like 0.0/0.0

Inf        the arithmetic representation for positive infinity; an infinity
           is also produced by operations like dividing by zero, e.g.
           1.0/0.0, or from overflow, e.g. exp(1000)

save       saves all the matrices defined in the current session into the
           file matlab.mat, located in the current working directory

load       loads the contents of matlab.mat into the current workspace

save filename x y z          saves the matrices x, y and z into the file
                             filename.mat

save filename x y z -ascii   saves the matrices x, y and z into the named
                             file in ASCII format

load filename                loads the contents of filename into the current
                             workspace; the file can be a binary (.mat) file

load filename.dat            loads the contents of filename.dat into the
                             variable filename

xlabel(' ')    allows you to label the x-axis

ylabel(' ')    allows you to label the y-axis

title(' ')     allows you to give a title to the plot

subplot()      allows you to create multiple plots in the same window

A.6 SOME BASIC PLOT COMMANDS:

Kinds of plots:

plot(x,y)        creates a Cartesian plot of the vectors x and y

plot(y)          creates a plot of y versus the numerical values of the
                 elements in the y-vector

semilogx(x,y)    plots log(x) vs y

semilogy(x,y)    plots x vs log(y)

loglog(x,y)      plots log(x) vs log(y)

polar(theta,r)   creates a polar plot of the vectors r and theta, where
                 theta is in radians

bar(x)           creates a bar graph of the vector x (note also the command
                 stairs(x))

bar(x,y)         creates a bar graph of the elements of the vector y,
                 locating the bars according to the vector elements of x

Plot description:

grid                  creates a grid on the graphics plot

title('text')         places a title at the top of the graphics plot

xlabel('text')        writes 'text' beneath the x-axis of a plot

ylabel('text')        writes 'text' beside the y-axis of a plot

text(x,y,'text')      writes 'text' at the location (x,y)

text(x,y,'text','sc') writes 'text' at the point (x,y), assuming the lower
                      left corner is (0,0) and the upper right corner is (1,1)

axis([xmin xmax ymin ymax])   sets scaling for the x- and y-axes on the
                              current plot
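The commands above can be combined as in the following short sketch (the data are invented purely for illustration):

>> t = 0:0.1:10;                          % time axis in seconds
>> d = 2.5*t;                             % assumed displacement in pixels
>> plot(t,d), grid
>> xlabel('time (s)'), ylabel('distance (pixels)')
>> title('Displacement of tracked object')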

A.7 ALGEBRAIC OPERATIONS IN MATLAB:

Scalar Calculations:

+ Addition

- Subtraction

* Multiplication

/ Right division (a/b means a ÷ b)

\ left division (a\b means b ÷ a)

^ Exponentiation

For example, 3*4 executed in MATLAB gives ans = 12, and 4/5 gives ans = 0.8.

Array products: Recall that addition and subtraction of matrices involve
addition or subtraction of the individual elements of the matrices.
Sometimes it is desired to simply multiply or divide each element of a
matrix by the corresponding element of another matrix; these are called
array operations. Array or element-by-element operations are executed when
the operator is preceded by a '.' (period):

a .* b multiplies each element of a by the respective element of b

a ./ b divides each element of a by the respective element of b

a .\ b divides each element of b by the respective element of a

a .^ b raises each element of a to the power given by the respective element of b
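For example:

>> a = [1 2 3]; b = [4 5 6];
>> a.*b      % gives [4 10 18]
>> a./b      % gives [0.25 0.4 0.5]
>> a.^2      % gives [1 4 9]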

A.8 MATLAB WORKING ENVIRONMENT:

A.8.1 MATLAB DESKTOP

The MATLAB desktop is the main MATLAB application window. The desktop
contains five sub-windows: the command window, the workspace
browser, the current directory window, the command history window,
and one or more figure windows, which are shown only when the user
displays a graphic.

The command window is where the user types MATLAB commands and
expressions at the prompt (>>) and where the output of those commands
is displayed. MATLAB defines the workspace as the set of variables that
the user creates in a work session.

The workspace browser shows these variables and some information
about them. Double clicking on a variable in the workspace browser
launches the Array Editor, which can be used to obtain information and,
in some instances, edit certain properties of the variable.

The Current Directory tab above the workspace tab shows the contents of
the current directory, whose path is shown in the current directory
window. For example, in the Windows operating system the path might be
as follows: C:\MATLAB\Work, indicating that the directory "work" is a
subdirectory of the main directory "MATLAB", which is installed in drive C.
Clicking on the arrow in the current directory window shows a list of
recently used paths. Clicking on the button to the right of the window
allows the user to change the current directory.

MATLAB uses a search path to find M-files and other MATLAB-related
files, which are organized in directories in the computer file system. Any
file run in MATLAB must reside in the current directory or in a directory
that is on the search path. By default, the files supplied with MATLAB and
MathWorks toolboxes are included in the search path. The easiest way
to see which directories are on the search path, or to add or modify the
search path, is to select Set Path from the File menu of the desktop, and then
use the Set Path dialog box. It is good practice to add any commonly used
directories to the search path to avoid repeatedly having to change the
current directory.

The Command History window contains a record of the commands a
user has entered in the command window, including both current and
previous MATLAB sessions. Previously entered MATLAB commands
can be selected and re-executed from the command history window by
right-clicking on a command or sequence of commands. This action launches
a menu from which various options can be selected in addition to executing
the commands, which is a useful feature when experimenting with various
commands in a work session.

A.8.2 Using the MATLAB Editor to create M-Files:

The MATLAB editor is both a text editor specialized for creating M-files
and a graphical MATLAB debugger. The editor can appear in a window
by itself, or it can be a sub-window in the desktop. M-files are denoted by
the extension .m, as in pixelup.m.

The MATLAB editor window has numerous pull-down menus for tasks
such as saving, viewing, and debugging files. Because it performs some
simple checks and also uses color to differentiate between various
elements of code, this text editor is recommended as the tool of choice for
writing and editing M-functions.

To open the editor, typing edit filename at the prompt opens the M-file
filename.m in an editor window, ready for editing. As noted earlier, the
file must be in the current directory, or in a directory on the search path.

A.8.3 Getting Help:

The principal way to get help online is to use the MATLAB Help
Browser, opened as a separate window either by clicking on the question
mark symbol (?) on the desktop toolbar, or by typing helpbrowser at the
prompt in the command window. The Help Browser is a web browser
integrated into the MATLAB desktop that displays Hypertext Markup
Language (HTML) documents. It consists of two panes: the help navigator
pane, used to find information, and the display pane, used to view the
information. Self-explanatory tabs other than the navigator pane are used to
perform a search.

APPENDIX B

INTRODUCTION TO DIGITAL IMAGE PROCESSING

What is DIP?

An image may be defined as a two-dimensional function f(x, y), where x
and y are spatial coordinates, and the amplitude of f at any pair of
coordinates (x, y) is called the intensity or gray level of the image at that
point. When x, y and the amplitude values of f are all finite, discrete
quantities, we call the image a digital image. The field of DIP refers to
processing digital images by means of a digital computer. A digital image is
composed of a finite number of elements, each of which has a particular
location and value.

Vision is the most advanced of our senses, so it is not surprising that
images play the single most important role in human perception. However,
unlike humans, who are limited to the visual band of the electromagnetic
(EM) spectrum, imaging machines cover almost the entire EM spectrum,
ranging from gamma rays to radio waves. They can also operate on images
generated by sources that humans are not accustomed to associating with
images.

There is no general agreement among authors regarding where image
processing stops and other related areas, such as image analysis and computer
vision, start. Sometimes a distinction is made by defining image
processing as a discipline in which both the input and output of a process
are images. This is a limiting and somewhat artificial boundary. The area of
image analysis (image understanding) lies between image processing and
computer vision.

There are no clear-cut boundaries in the continuum from image
processing at one end to computer vision at the other. However, one
useful paradigm is to consider three types of computerized processes in
this continuum: low-, mid- and high-level processes. A low-level process
involves primitive operations such as image preprocessing to reduce noise,
contrast enhancement and image sharpening; it is characterized by the fact
that both its inputs and outputs are images. A mid-level process on images
involves tasks such as segmentation, description of objects to reduce them
to a form suitable for computer processing, and classification of individual
objects; it is characterized by the fact that its inputs generally are images
but its outputs are attributes extracted from those images. Finally,
high-level processing involves "making sense" of an ensemble of recognized
objects, as in image analysis, and, at the far end of the continuum,
performing the cognitive functions normally associated with human
vision.
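In the context of this project, the low- and mid-level stages can be sketched in a few MATLAB commands (a rough illustration only, not the exact processing chain of Chapter 5; the variable frame is assumed to be an RGB video frame already in the workspace):

g  = medfilt2(rgb2gray(frame));        % low level: noise reduction
bw = im2bw(g, graythresh(g));          % mid level: threshold-based segmentation
L  = bwlabel(bwareaopen(bw, 50));      % mid level: labelled candidate objects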

Digital image processing, as already defined, is used successfully in a
broad range of areas of exceptional social and economic value.

A.B.1 What is an image?

An image is represented as a two dimensional function f(x, y) where


x and y are spatial co-ordinates and the amplitude of ‘f’ at any pair of
coordinates (x, y) is called the intensity of the image at that point.

Gray scale image:

A grayscale image is a function I(x, y) of the two spatial coordinates of the
image plane, where I(x, y) is the intensity of the image at the point (x, y).
I(x, y) takes non-negative values; we assume the image is bounded by a
rectangle [0, a] x [0, b], so that I: [0, a] x [0, b] -> [0, inf).

Color image:

A color image can be represented by three functions: R(x, y) for red,
G(x, y) for green and B(x, y) for blue.

An image may be continuous with respect to the x and y
coordinates and also in amplitude. Converting such an image to digital
form requires that the coordinates as well as the amplitude be digitized.
Digitizing the coordinate values is called sampling; digitizing the
amplitude values is called quantization.

A.B.2 Coordinate convention:

The result of sampling and quantization is a matrix of real numbers. We
use two principal ways to represent digital images. Assume that an image
f(x, y) is sampled so that the resulting image has M rows and N columns;
we say that the image is of size M x N. The values of the coordinates
(x, y) are discrete quantities. For notational clarity and convenience, we use
integer values for these discrete coordinates. In many image processing
books, the image origin is defined to be at (x, y) = (0, 0), and the next
coordinate value along the first row of the image is (x, y) = (0, 1). It is
important to keep in mind that the notation (0, 1) is used to signify the
second sample along the first row; it does not mean that these are the actual
values of the physical coordinates when the image was sampled. Note that x
ranges from 0 to M-1 and y from 0 to N-1 in integer increments.

The coordinate convention used in the toolbox to denote arrays is
different from the preceding paragraph in two minor ways. First, instead
of using (x, y), the toolbox uses the notation (r, c) to indicate rows and
columns. Note, however, that the order of coordinates is the same as the
order discussed in the previous paragraph, in the sense that the first
element of a coordinate tuple, (a, b), refers to a row and the second to a
column. The other difference is that the origin of the coordinate system is
at (r, c) = (1, 1); thus, r ranges from 1 to M and c from 1 to N in integer
increments. The IPT documentation refers to these as pixel coordinates. Less
frequently the toolbox also employs another coordinate convention, called
spatial coordinates, which uses x to refer to columns and y to refer to rows.
This is the opposite of our use of the variables x and y.

A.B.3 Image as Matrices:

The preceding discussion leads to the following representation for a
digitized image function:

            | f(0, 0)      f(0, 1)      ...   f(0, N-1)   |
            | f(1, 0)      f(1, 1)      ...   f(1, N-1)   |
 f(x, y) =  |   .            .                   .        |
            |   .            .                   .        |
            | f(M-1, 0)    f(M-1, 1)    ...   f(M-1, N-1) |

The right side of this equation is a digital image by definition. Each
element of this array is called an image element, picture element, pixel or
pel. The terms image and pixel are used throughout the rest of our
discussion to denote a digital image and its elements.

A digital image can be represented naturally as a MATLAB matrix:

        | f(1, 1)    f(1, 2)    ...   f(1, N) |
        | f(2, 1)    f(2, 2)    ...   f(2, N) |
 f  =   |   .          .                 .    |
        |   .          .                 .    |
        | f(M, 1)    f(M, 2)    ...   f(M, N) |

where f(1, 1) = f(0, 0) (note the use of a monospace font to denote MATLAB
quantities). Clearly the two representations are identical, except for the
shift in origin. The notation f(p, q) denotes the element located in row p
and column q. For example, f(6, 2) is the element in the sixth row and
second column of the matrix f. Typically we use the letters M and N
respectively to denote the number of rows and columns in a matrix. A 1xN
matrix is called a row vector, whereas an Mx1 matrix is called a column
vector. A 1x1 matrix is a scalar.

Matrices in MATLAB are stored in variables with names such as A, a, RGB,
real_array and so on. Variables must begin with a letter and contain only
letters, numerals and underscores. As noted in the previous paragraph, all
MATLAB quantities are written using monospace characters. We use
conventional Roman italic notation, such as f(x, y), for mathematical
expressions.
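For example, assuming a grayscale image array f is already in the workspace:

>> [M, N] = size(f);       % number of rows and columns
>> p = f(6, 2);            % pixel in the sixth row, second column
>> topleft = f(1, 1);      % corresponds to f(0, 0) in the 0-based convention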
A.B.4 Reading Images:
Images are read into the MATLAB environment using function
imread whose syntax is
imread(‘filename’)

Format name    Description                          Recognized extensions

TIFF           Tagged Image File Format             .tif, .tiff
JPEG           Joint Photographic Experts Group     .jpg, .jpeg
GIF            Graphics Interchange Format          .gif
BMP            Windows Bitmap                       .bmp
PNG            Portable Network Graphics            .png
XWD            X Window Dump                        .xwd

Here filename is a string containing the complete name of the image file
(including any applicable extension). For example, the command line

>> f = imread('chestxray.jpg');

reads the JPEG image chestxray into image array f. Note the use of single
quotes (') to delimit the string filename. The semicolon at the end of a
command line is used by MATLAB for suppressing output; if a semicolon is
not included, MATLAB displays the results of the operation(s) specified in
that line. The prompt symbol (>>) designates the beginning of a command
line, as it appears in the MATLAB command window.
When, as in the preceding command line, no path is included in
filename, imread reads the file from the current directory and, if that fails,
it tries to find the file in the MATLAB search path. The simplest way to
read an image from a specified directory is to include a full or relative
path to that directory in filename.
For example,

>> f = imread('D:\myimages\chestxray.jpg');

reads the image from a folder called myimages on the D: drive, whereas

>> f = imread('.\myimages\chestxray.jpg');

reads the image from the myimages subdirectory of the current working
directory. The Current Directory window on the MATLAB desktop toolbar
displays MATLAB's current working directory and provides a simple, manual
way to change it. The table above lists some of the most popular
image/graphics formats supported by imread and imwrite.
Function size gives the row and column dimensions of an image:

>> size(f)
ans = 1024 1024

This function is particularly useful in programming when used in the
following form to determine automatically the size of an image:

>> [M, N] = size(f);

This syntax returns the number of rows (M) and columns (N) in the image.
The whos function displays additional information about an array.
For instance, the statement

>> whos f

gives

Name    Size         Bytes     Class
f       1024x1024    1048576   uint8 array
Grand total is 1048576 elements using 1048576 bytes

The uint8 entry shown refers to one of several MATLAB data classes. A
semicolon at the end of a whos line has no effect, so normally one is not used.
A.B.5 Displaying Images:

Images are displayed on the MATLAB desktop using function imshow,
which has the basic syntax:

imshow(f, g)

where f is an image array and g is the number of intensity levels used to
display it. If g is omitted, it defaults to 256 levels. Using the syntax

imshow(f, [low high])

displays as black all values less than or equal to low, and as white all
values greater than or equal to high. The values in between are displayed
as intermediate intensity values using the default number of levels.
Finally, the syntax

imshow(f, [ ])

sets variable low to the minimum value of array f and high to its maximum
value. This form of imshow is useful for displaying images that have a low
dynamic range or that have positive and negative values.

Function pixval is used frequently to display the intensity values of
individual pixels interactively. This function displays a cursor overlaid on
an image. As the cursor is moved over the image with the mouse, the
coordinates of the cursor position and the corresponding intensity values
are shown on a display that appears below the figure window. When
working with color images, the coordinates as well as the red, green and
blue components are displayed. If the left button on the mouse is clicked
and then held pressed, pixval displays the Euclidean distance between the
initial and current cursor locations. The syntax form of interest here is
pixval by itself, which shows the cursor on the last image displayed.
Clicking the X button on the cursor window turns it off.
The following statements read from disk an image called rose_512.tif,
extract basic information about the image, and display it using imshow:

>> f = imread('rose_512.tif');
>> whos f
Name    Size       Bytes     Class
f       512x512    262144    uint8 array
Grand total is 262144 elements using 262144 bytes
>> imshow(f)

A semicolon at the end of an imshow line has no effect, so normally one is
not used. If another image, g, is displayed using imshow, MATLAB replaces
the image on the screen with the new image. To keep the first image and
output a second image, we use function figure as follows:

>> figure, imshow(g)

Using the statement

>> imshow(f), figure, imshow(g)

displays both images.

Note that more than one command can be written on a line, as long as
different commands are properly delimited by commas or semicolons. As
mentioned earlier, a semicolon is used whenever it is desired to suppress
screen output from a command line.
Suppose that we have just read an image h and find that the image displayed
by imshow has a low dynamic range. This can be remedied for display
purposes by using the statement

>> imshow(h, [ ])
A.B.6 WRITING IMAGES:

Images are written to disk using function imwrite, which has the following
basic syntax:

imwrite(f, 'filename')

With this syntax, the string contained in filename must include a
recognized file format extension. Alternatively, the desired format can be
specified explicitly with a third input argument. For example, the following
commands both write f to a TIFF file named patient10_run1:

>> imwrite(f, 'patient10_run1', 'tif')

or, alternatively,

>> imwrite(f, 'patient10_run1.tif')

If filename contains no path information, then imwrite saves the file in the
current working directory.

The imwrite function can have other parameters, depending on the file
format selected. Most of the work in the following deals with either JPEG
or TIFF images, so we focus attention here on these two formats.

A more general imwrite syntax, applicable only to JPEG images, is

imwrite(f, 'filename.jpg', 'quality', q)

where q is an integer between 0 and 100 (the lower the number, the higher
the degradation due to JPEG compression). For example, for q = 25 the
applicable syntax is

>> imwrite(f, 'bubbles25.jpg', 'quality', 25)

The image for q = 15 has false contouring that is barely visible, but this
effect becomes quite pronounced for q = 5 and q = 0. Thus, an acceptable
solution with some margin for error is to compress the images with q = 25.
In order to get an idea of the compression achieved and to obtain other
image file details, we can use function imfinfo, which has the syntax

imfinfo filename

Here filename is the complete file name of the image stored on disk.

For example,

>> imfinfo bubbles25.jpg

outputs the following information (note that some fields contain no
information in this case):

Filename: 'bubbles25.jpg'
FileModDate: '04-Jan-2003 12:31:26'
FileSize: 13849
Format: 'jpg'
FormatVersion: ' '
Width: 714
Height: 682
BitDepth: 8
ColorType: 'grayscale'
FormatSignature: ' '
Comment: { }

where FileSize is in bytes. The number of bytes in the original image is
computed simply by multiplying Width by Height by BitDepth and dividing
the result by 8; the result is 486948. Dividing this by the compressed file
size gives the compression ratio: 486948/13849 = 35.16. This compression
ratio was achieved while maintaining image quality consistent with the
requirements of the application. In addition to the obvious advantages in
storage space, this reduction allows the transmission of approximately 35
times the amount of uncompressed data per unit time.

The information fields displayed by imfinfo can be captured into a
so-called structure variable that can be used for subsequent computations.
Using the preceding example and assigning the name K to the structure
variable,

we use the syntax

>> K = imfinfo('bubbles25.jpg');

to store into variable K all the information generated by command
imfinfo. The information generated by imfinfo is appended to the structure
variable by means of fields, separated from K by a dot. For example, the
image height and width are now stored in the structure fields K.Height and
K.Width.
As an illustration, consider the following use of structure variable K to
compute the compression ratio for bubbles25.jpg:

>> K = imfinfo('bubbles25.jpg');
>> image_bytes = K.Width*K.Height*K.BitDepth/8;
>> compressed_bytes = K.FileSize;
>> compression_ratio = image_bytes/compressed_bytes

compression_ratio =
    35.1620

Note that imfinfo was used in two different ways. The first was to type
imfinfo bubbles25.jpg at the prompt, which resulted in the information
being displayed on the screen. The second was to type
K = imfinfo('bubbles25.jpg'), which resulted in the information generated
by imfinfo being stored in K. These two different ways of calling imfinfo
are an example of command-function duality, an important concept that is
explained in more detail in the MATLAB online documentation.
A more general imwrite syntax, applicable only to TIFF images, has the
form

imwrite(g, 'filename.tif', 'compression', 'parameter', 'resolution', [colres rowres])

where 'parameter' can have one of the following principal values: 'none'
indicates no compression, 'packbits' indicates packbits compression (the
default for nonbinary images) and 'ccitt' indicates CCITT compression (the
default for binary images). The 1x2 array [colres rowres] contains two
integers that give the column resolution and row resolution in dots per unit
(the default values are in dots per inch). For example, if the image
dimensions are in inches, colres is the number of dots (pixels) per inch
(dpi) in the vertical direction, and similarly for rowres in the horizontal
direction. Specifying the resolution by a single scalar, res, is equivalent to
writing [res res].

>> imwrite(f, 'sf.tif', 'compression', 'none', 'resolution', [300 300])

The values of the vector [colres rowres] were determined by multiplying
200 dpi by the ratio 2.25/1.5, which gives 300 dpi. Rather than do the
computation manually, we could write

>> res = round(200*2.25/1.5);
>> imwrite(f, 'sf.tif', 'compression', 'none', 'resolution', res)

where function round rounds its argument to the nearest integer. It is
important to note that the number of pixels was not changed by these
commands; only the scale of the image changed. The original 450x450 image
at 200 dpi is of size 2.25x2.25 inches. The new 300-dpi image is identical,
except that its 450x450 pixels are distributed over a 1.5x1.5-inch area.
Processes such as this are useful for controlling the size of an image in a
printed document without sacrificing resolution.

Often it is necessary to export images to disk the way they appear on the
MATLAB desktop. This is especially true with plots. The contents of a
figure window can be exported to disk in two ways. The first is to use the
File pull-down menu in the figure window and then choose Export; with
this option the user can select a location, filename, and format. More
control over export parameters is obtained by using the print command:

print -fno -dfileformat -rresno filename

where no refers to the figure number of the figure window of interest,
fileformat refers to one of the file formats in the table above, resno is
the resolution in dpi, and filename is the name we wish to assign the file.

If we simply type print at the prompt, MATLAB prints (to the default
printer) the contents of the last figure window displayed. It is possible
also to specify other options with print, such as a specific printing device.
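For example (the figure number, format and resolution here are only illustrative), the tracking result shown in figure 1 could be exported with:

>> print -f1 -dpng -r300 tracking_result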
