
Chapter 2

Chapter 2 discusses digital image representation, detailing pixel intensity values, encoding methods (raster and vector), and types of images (binary, gray-level, and color). It also covers image compression techniques, file formats, basic terminology related to image topology, and various image processing operations in both spatial and transform domains. The chapter emphasizes the importance of understanding these concepts for effective image processing and manipulation.


Chapter 2

DIGITAL IMAGE REPRESENTATION


• The value of the two-dimensional function f (x, y) at any given
pixel of coordinates (x0, y0), denoted by f (x0, y0), is called the
intensity or gray level of the image at that pixel.
• The maximum and minimum values that a pixel intensity can
assume will vary depending on the data type and convention
used. Common ranges are as follows: 0.0 (black) to 1.0 (white) for
double data type and 0 (black) to 255 (white) for uint8 (unsigned
integer, 8 bits) representation.
• At the most basic level, there are two different ways of encoding
the contents of a 2D image in digital format: raster (also known as
bitmap) and vector.
▪ Bitmap representations use one or more two-dimensional arrays of
pixels, whereas vector representations use a series of drawing
commands to represent an image.
• Each encoding method has its pros and cons:
 the greatest advantages of bitmap graphics are their quality and
display speed;
 their main disadvantages include larger memory storage
requirements and size dependence (e.g., enlarging a bitmap image
may lead to noticeable artifacts).
▪ Vector representations require less memory and allow resizing and
geometric manipulations without introducing artifacts, but need to be
rasterized for most presentation devices.
• Binary (1-Bit) Images
▪ Binary images are encoded as a 2D array, typically using 1 bit per pixel,
where a 0 usually means “black” and a 1 means “white”.
▪ Binary images are represented in MATLAB using a logical array of 0’s and
1’s.
▪ The main advantage of this representation (usually suitable for images
containing simple graphics, text, or line art) is its small size.
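A minimal sketch of the 1-bit idea in Python/NumPy (the chapter's own examples use MATLAB; the 3 × 4 image below is a made-up illustration): a boolean array plays the role of MATLAB's logical class, and packing 1 bit per pixel shows the storage advantage over an 8-bit representation.

```python
import numpy as np

# A small binary image: 0 = black, 1 = white (the usual convention).
# A boolean array mirrors MATLAB's "logical" class.
img = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [1, 1, 1, 1]], dtype=bool)

# Packing 1 bit per pixel (as many file formats do) needs far less
# space than storing the same pixels as 8-bit integers.
packed = np.packbits(img)
print(packed.nbytes, img.astype(np.uint8).nbytes)  # 2 bytes vs 12 bytes
```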

• Gray-Level (8-Bit) Images


▪ Gray-level images are encoded as a 2D array, typically using 8 bits per
pixel, where 0 usually means “black” and 255 means “white” (although there
is no universal agreement on that).
▪ Intensity images can be represented in MATLAB using different data
types (or classes): uint8 and uint16, where each pixel has a value in the
[0, 255] or the [0, 65,535] range, respectively, or double, where pixel
values lie in the [0.0, 1.0] range.
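The three ranges above are linearly related, so converting between them is a simple scaling. A sketch in Python/NumPy (the chapter's examples use MATLAB; the 2 × 2 image is hypothetical):

```python
import numpy as np

# The same gray-level image in three common representations.
g8 = np.array([[0, 128], [192, 255]], dtype=np.uint8)   # [0, 255]
g16 = g8.astype(np.uint16) * 257                        # [0, 65535] (255 * 257 = 65535)
gd = g8.astype(np.float64) / 255.0                      # [0.0, 1.0]

print(gd)  # 0 maps to 0.0, 255 maps to 1.0
```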
• Color Images
▪ RGB representation: each pixel is usually represented by a 24-bit number
containing the amount of its red (R), green (G), and blue (B) components
▪ indexed representation: a 2D array contains indices to a color palette (or
lookup table - (LUT)).
• 24-Bit (RGB) Color Images
▪ Color images can be represented using three 2D arrays of same size, one
for each color channel: red (R), green (G), and blue (B).
▪ Each array element contains an 8-bit value, indicating the amount of
red, green, or blue at that point in a [0, 255] scale.
▪ The combination of the three 8-bit values into a 24-bit number allows
2^24 (16,777,216, usually referred to as 16 million or 16 M) color
combinations.
▪ An alternative representation uses 32 bits per pixel and includes a fourth
channel, called the alpha channel, that provides a measure of
transparency for each pixel and is widely used in image editing effects.
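The combination of three 8-bit values into one 24-bit number can be sketched with bit shifts. The R-in-the-high-byte ordering below is a common but not universal convention:

```python
# Pack three 8-bit channel values into one 24-bit number
# (R in the high byte, then G, then B).
def pack_rgb(r, g, b):
    return (r << 16) | (g << 8) | b

# Recover the individual channels by shifting and masking.
def unpack_rgb(v):
    return (v >> 16) & 0xFF, (v >> 8) & 0xFF, v & 0xFF

v = pack_rgb(255, 128, 0)   # an orange pixel
print(hex(v))               # 0xff8000
print(unpack_rgb(v))        # (255, 128, 0)
```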
• Indexed Color Images
▪ A problem with 24-bit color representations is backward compatibility
with older hardware that may not be able to display the 16 million colors
simultaneously.
▪ A solution devised before 24-bit color displays and video cards were
widely available consisted of an indexed representation, in which a 2D
array of the same size as the image contains indices (pointers) to a color
palette (or color map) of fixed maximum size (usually 256 colors).
▪ The color map is simply a list of colors used in that image.
▪ In a detailed view of an indexed color image (e.g., a 4 × 4 region), each
pixel stores only an index; the actual R, G, and B values are found at the
color palette entry that the index points to.
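The indexed lookup can be sketched in Python/NumPy (the 3-entry palette and 2 × 2 index array below are toy examples; a real palette typically has up to 256 entries):

```python
import numpy as np

# A toy color palette (color map): each row is one RGB color.
palette = np.array([[  0,   0,   0],    # index 0: black
                    [255,   0,   0],    # index 1: red
                    [255, 255, 255]],   # index 2: white
                   dtype=np.uint8)

# The indexed image stores only indices into the palette.
indices = np.array([[0, 1],
                    [2, 1]], dtype=np.uint8)

# Converting indexed -> RGB is a single table lookup (fancy indexing).
rgb = palette[indices]          # shape (2, 2, 3)
print(rgb[0, 1])                # [255   0   0]
```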

• Compression
▪ Most image file formats employ some type of compression.
▪ Compression methods can be:
• Lossy: a tolerable degree of deterioration in the visual quality of the
resulting image is acceptable.
• Lossless: the image is encoded in its full quality.
▪ As a general guideline:
• lossy compression should be used for general-purpose photographic images.
• lossless compression should be preferred when dealing with line art, drawings,
facsimiles, or images in which no loss of detail may be tolerable (most notably,
space images and medical images).
IMAGE FILE FORMATS
• Most of the image file formats used to represent bitmap images
consist of a file header followed by (often compressed) pixel data.
• The image file header stores information about the image, such as
image height and width, number of bands, number of bits per
pixel, and some signature bytes indicating the file type.
• Most common file types:
▪ BIN, PPM, PBM, PGM, PNM, BMP, JPEG, GIF, TIFF, PNG.

BASIC TERMINOLOGY
• Image Topology: It involves the investigation of fundamental image
properties (usually done on binary images and with the help of
morphological operators), such as number of occurrences of a particular
object, number of separate (not connected) regions, and number of holes.
• Neighborhood:
▪ The pixels surrounding a given pixel constitute its neighborhood, which
can be interpreted as a smaller matrix containing (and usually centered
around) the reference pixel, most neighborhoods used in image
processing algorithms are small square arrays with an odd number of
pixels.
▪ In the context of image topology, neighborhood takes a slightly different
meaning:
• The 4-neighborhood of a pixel as the set of pixels situated above, below, to the
right, and to the left of the reference pixel (p).
• The 8-neighborhood of a pixel as the set of all of p’s immediate neighbors.
• The pixels that belong to the 8-neighborhood, but not to the 4-neighborhood,
make up the diagonal neighborhood of p.
• Adjacency:
▪ In the context of image topology, two pixels p and q are 4-adjacent if
they are 4-neighbors of each other and 8-adjacent if they are 8-
neighbors of one another.
▪ A third type of adjacency (known as mixed adjacency (or simply m-
adjacency)) is sometimes used to eliminate ambiguities (i.e., redundant
paths) that may arise when 8-adjacency is used.
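The neighborhood definitions above can be sketched directly in Python (a minimal illustration using (row, column) coordinates; border handling is ignored):

```python
# 4-neighborhood: the pixels above, below, left, and right of p.
def n4(p):
    r, c = p
    return {(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)}

# 8-neighborhood: all of p's immediate neighbors.
def n8(p):
    r, c = p
    return {(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)}

# Diagonal neighborhood: in the 8-neighborhood but not the 4-neighborhood.
def diagonal(p):
    return n8(p) - n4(p)

p = (5, 5)
print(len(n4(p)), len(n8(p)), len(diagonal(p)))  # 4 8 4
```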

• Paths
▪ In the context of image topology, a 4-path between two pixels p and q is
a sequence of pixels starting with p and ending with q such that each
pixel in the sequence is 4-adjacent to its predecessor in the sequence.
▪ Similarly, an 8-path between two pixels p and q is a sequence of pixels
starting with p and ending with q such that each pixel in the sequence is
8-adjacent to its predecessor in the sequence.
• Connectivity
▪ If there is a 4-path between pixels p and q, they are said to be 4-
connected.
▪ Similarly, the existence of an 8-path between them means that they are
8-connected.
• Components
▪ A set of pixels that are connected to each other is called a component.
▪ If the pixels are 4-connected, the expression 4-component is used; if the
pixels are 8-connected, the set is called an 8-component.
▪ Components are often labeled (and optionally pseudocolored) in a
unique way, resulting in a labeled image, L(x, y), whose pixel values are
symbols of a chosen alphabet.
• The symbol value of a pixel typically denotes the outcome of a decision made for that
pixel; in this case, the unique number of the component to which it belongs.
▪ Components in MATLAB:
• function bwlabel for labeling connected components in binary images.
• function, label2rgb, helps visualize the results by painting each region with a
different color.
I = imread('test_bw_label.png');          % read the test image
J = logical(I);                           % convert to a binary (logical) image
L1 = bwlabel(J,4);                        % label components using 4-connectivity
L1_color = label2rgb(L1, 'lines', 'w');   % pseudocolor the labels
L2 = bwlabel(J,8);                        % label components using 8-connectivity
L2_color = label2rgb(L2, 'lines', 'w');
imshow(I)
figure, imshow(L1_color)
figure, imshow(L2_color)

• Following Figure shows an example of using bwlabel and label2rgb and highlights
the fact that the number of connected components will vary from 2 (when 8-
connectivity is used, following Figure b) to 3 (when 4-connectivity is used,
following Figure c).

• Distances Between Pixels:


▪ There are many image processing applications that require measuring
distances between pixels.
▪ The most common distance measures between two pixels p and q, of
coordinates (x0, y0) and (x1, y1), respectively, are as follows:
• Euclidean distance:

DE(p, q) = sqrt((x1 − x0)^2 + (y1 − y0)^2)
• D4 (also known as Manhattan or city block) distance:


D4(p, q) = |x1 − x0|+|y1 − y0|
• D8 (also known as chessboard) distance:
D8(p, q) = max(|x1 − x0|, |y1 − y0|)
▪ It is important to note that the distance between two pixels depends
only on their coordinates, not their values.
• The only exception is the Dm distance, defined as “the shortest m-path between
two m-connected pixels.”
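The three distance measures can be sketched in Python (a minimal illustration; the example points are made up):

```python
import math

# Euclidean distance between p = (x0, y0) and q = (x1, y1).
def d_euclidean(p, q):
    return math.hypot(q[0] - p[0], q[1] - p[1])

# D4 (Manhattan / city block) distance.
def d4(p, q):
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

# D8 (chessboard) distance.
def d8(p, q):
    return max(abs(q[0] - p[0]), abs(q[1] - p[1]))

p, q = (0, 0), (3, 4)
print(d_euclidean(p, q), d4(p, q), d8(p, q))  # 5.0 7 4
```

Note how the three measures order the same pair of pixels differently: D8 ≤ DE ≤ D4 always holds.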

OVERVIEW OF IMAGE PROCESSING OPERATIONS


In this section, we take a preliminary look at the main categories of
image processing operations. Although there is no universal agreement
on a taxonomy for the field, we will organize them as follows:
• Operations in the Spatial Domain:
▪ Here, arithmetic calculations and/or logical operations are performed on
the original pixel values.
▪ They can be further divided into three types:
• Global Operations:
 Also known as point operations, in which the entire image is treated in a
uniform manner and the resulting value for a processed pixel is a function of
its original value, regardless of its location within the image.
 Example: contrast adjustment.
• Neighborhood-Oriented Operations:
 Also known as local or area operations, in which the input image is treated on
a pixel-by-pixel basis and the resulting value for a processed pixel is a function
of its original value and the values of its neighbors.
 Example: spatial-domain filters.
• Operations Combining Multiple Images:
 Here, two or more images are used as an input and the result is obtained by
applying a (series of) arithmetic or logical operator(s) to them.
 Example: subtracting one image from another for detecting differences
between them.
• Operations in a Transform Domain:
▪ Here, the image undergoes a mathematical transformation (such as
Fourier transform (FT) or discrete cosine transform (DCT)) and the image
processing algorithm works in the transform domain.
▪ Example: frequency-domain filtering techniques.

➢ Global (Point) Operations:


• Point operations apply the same mathematical function, often called
transformation function, to all pixels, regardless of their location in
the image or the values of their neighbors.
g(x, y) = T[f(x, y)], s = T[r]

• Example: s = r/2
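The example s = r/2 applied to a whole image can be sketched in Python/NumPy (the chapter's examples use MATLAB; the 2 × 2 image is hypothetical):

```python
import numpy as np

# The point operation s = r/2 halves every pixel's intensity,
# regardless of its position: a simple contrast reduction.
f = np.array([[0, 100], [200, 250]], dtype=np.uint8)
g = f // 2                  # integer halving keeps values in uint8 range
print(g)                    # [[  0  50] [100 125]]
```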

➢ Neighborhood-Oriented Operations:
▪ Neighborhood-oriented (also known as local or area) operations consist
of determining the resulting pixel value at coordinates (x, y) as a function
of its original value and the value of (some of) its neighbors, typically
using a convolution operation.
▪ The convolution of a source image with a small 2D array (known as
window, template, mask, or kernel) produces a destination image in
which each pixel value depends on its original value and the value of
(some of) its neighbors.
▪ The convolution mask determines which neighbors are used as well as
the relative weight of their original values.
▪ Masks are normally 3 × 3.

▪ Each mask coefficient (W1,...,W9) can be interpreted as a weight.


▪ The mask can be thought of as a small window that is overlaid on the
image to perform the calculation on one pixel at a time. As each pixel is
processed, the window moves to the next pixel in the source image and
the process is repeated until the last pixel has been processed.
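The sliding-window process above can be sketched in Python/NumPy (a minimal, unoptimized illustration that ignores the border pixels; for the symmetric averaging mask used here, correlation and true convolution coincide, so the mask is not flipped):

```python
import numpy as np

# Slide a 3x3 mask over the image; each output pixel is the weighted
# sum of the 3x3 window centered on it (border pixels left at zero).
def convolve3x3(img, mask):
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = img[y - 1:y + 2, x - 1:x + 2]
            out[y, x] = np.sum(window * mask)
    return out

img = np.arange(25, dtype=float).reshape(5, 5)
mean_mask = np.full((3, 3), 1 / 9)       # averaging (smoothing) mask
out = convolve3x3(img, mean_mask)
print(out[2, 2])  # 12.0: the mean of the 3x3 window centered at (2, 2)
```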
➢ Operations Combining Multiple Images:
▪ There are many image processing applications that combine two images,
pixel by pixel, using an arithmetic or logical operator, resulting in a third
image, Z: X opn Y = Z
▪ where X and Y may be images (arrays) or scalars, Z is necessarily an array,
and opn is a binary mathematical (+, −, ×, /) or logical (AND, OR, XOR)
operator. Following Figure shows schematically how pixel-by-pixel
operations work. Chapter 6 will discuss arithmetic and logic pixel-by-
pixel operations in detail.
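A pixel-by-pixel operation X opn Y = Z can be sketched in Python/NumPy (the 2 × 2 images are made up; subtraction is used here as the difference-detection example mentioned above):

```python
import numpy as np

# Subtracting one image from another, pixel by pixel, highlights
# the locations where the two images differ.
X = np.array([[10, 20], [30, 40]], dtype=np.int16)
Y = np.array([[10, 25], [30, 35]], dtype=np.int16)

Z = np.abs(X - Y)          # nonzero exactly where the images differ
print(Z)                   # [[0 5] [0 5]]
```

A signed type (int16) is used so the intermediate subtraction cannot wrap around, as it would with uint8.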
➢ Operations in a Transform Domain:
▪ A transform is a mathematical tool that allows the conversion of a set of
values to another set of values, creating, therefore, a new way of
representing the same information.
▪ In the field of image processing, the original domain is referred to as
spatial domain, whereas the results are said to lie in the transform
domain.
▪ The motivation for using mathematical transforms in image processing
stems from the fact that some tasks are best performed by transforming
the input images, applying selected algorithms in the transform domain,
and eventually applying the inverse transformation to the result
(following Figure).
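The forward-transform / process / inverse-transform pipeline can be sketched in Python/NumPy with the 2D FFT (a toy 8 × 8 random image and an arbitrary low-pass mask, chosen only for illustration):

```python
import numpy as np

# Transform-domain processing: forward transform, operate, inverse.
img = np.random.default_rng(0).random((8, 8))

F = np.fft.fft2(img)                 # spatial -> transform domain
F_shift = np.fft.fftshift(F)         # move the DC term to the center
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1                   # keep only low frequencies (low-pass)
filtered = np.fft.ifft2(np.fft.ifftshift(F_shift * mask)).real

# With no processing in between, the round trip reproduces the input.
identity = np.fft.ifft2(np.fft.fft2(img)).real
print(np.allclose(identity, img))  # True
```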

Important level 3 (IT, CS)


Made by Abdulrahman Salah Eldin
