HOG (Histogram of Oriented Gradients)

A HOG descriptor extracts features from an image by calculating the distribution of intensity gradients or edge directions in localized portions of an image called blocks. Gradients are calculated within each block and used to form a histogram of gradient directions or edge orientations within each block. These histograms are then concatenated into a feature vector that describes the image. HOG descriptors are widely used for object detection tasks like pedestrian detection due to their representation of an object's characteristic edge or gradient structure.

Uploaded by

Nhân Hồ

HOG (Histogram of Oriented Gradients): A HOG is a feature descriptor generally used
for object detection, and is widely known for its use in pedestrian detection. A HOG
relies on the fact that objects within an image can be characterized by the distribution of
intensity gradients or edge directions. Gradients are calculated per block of the image.
A block is a grid of pixels, and its gradients are formed from the magnitude and
direction of the change in pixel intensity within the block.

Figure: HOG features of a sample face.

In the current example, all the face sample images of a person are fed to the feature
descriptor extraction algorithm, i.e., a HOG. The descriptors are gradient vectors
generated per pixel of the image. The gradient for each pixel consists of a magnitude and
a direction, calculated using the following formulae:

$$g = \sqrt{G_x^2 + G_y^2}, \qquad \theta = \arctan\left(\frac{G_y}{G_x}\right)$$

In the current example, Gx and Gy are respectively the horizontal and vertical
components of the change in pixel intensity. A window size of 128 x 144 is used for
face images since it matches the general aspect ratio of human faces. The descriptors are
calculated over blocks of 8 x 8 pixels. The gradient values of the pixels in each
8 x 8 block are quantized into 9 bins, where each bin represents a range of gradient
directions and the value in that bin is the sum of the magnitudes of all pixels
whose direction falls in that range. The histogram is then normalized over a 16 x 16
block, which means four blocks of 8 x 8 are normalized together to reduce the effect of
lighting conditions. This mitigates the accuracy drop caused by a change in light. An
SVM model is then trained using a number of HOG vectors for multiple faces.
HOW IT WORKS
STEP 1: Preprocessing
The HOG feature descriptor used for pedestrian detection is calculated
on a 64×128 patch of an image. Of course, an image
may be of any size. Typically, patches at multiple scales are analyzed at many image
locations. The only constraint is that the patches being analyzed have a fixed aspect ratio
(1:2 in this case).
To illustrate this point, I have shown a large image of size 720×475.
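
As a rough illustration of analyzing patches at multiple scales and locations, the following sketch enumerates candidate patches with a fixed 1:2 aspect ratio inside a 720×475 image. The function name, the scale factors and the stride are assumptions chosen for illustration, not values from this document. Each patch would typically be resized to 64×128 before the descriptor is computed.

```python
def sliding_patches(image_w, image_h, base_w=64, base_h=128,
                    scales=(1.0, 1.5, 2.0), stride=32):
    """Yield (x, y, w, h) patches that keep the base_w:base_h aspect ratio.

    scales and stride are illustrative assumptions.
    """
    for scale in scales:
        w = int(round(base_w * scale))
        h = int(round(base_h * scale))
        if w > image_w or h > image_h:
            continue  # a patch at this scale does not fit in the image
        for y in range(0, image_h - h + 1, stride):
            for x in range(0, image_w - w + 1, stride):
                yield (x, y, w, h)


patches = list(sliding_patches(720, 475))
print(len(patches), patches[0])
```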

STEP 2: Calculate the Gradient Images


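To calculate a HOG descriptor, we first need to calculate the horizontal and vertical
gradients; after all, we want to calculate the histogram of gradients. This is easily
achieved by filtering the image with the kernel [-1, 0, 1] for the horizontal gradient
and its transpose [-1, 0, 1]^T for the vertical gradient.
We can then find the magnitude and direction of the gradient using the following formulae:

$$g = \sqrt{G_x^2 + G_y^2}, \qquad \theta = \arctan\left(\frac{G_y}{G_x}\right)$$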

The figure below shows the gradients.

Left: Absolute value of x-gradient. Center: Absolute value of y-gradient. Right:
Magnitude of gradient.
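
The gradient computation of Step 2 can be sketched in plain Python as follows. This is a minimal sketch, assuming a grayscale image stored as a list of rows, centered [-1, 0, 1] differences, replicated border pixels, and unsigned directions folded into the range 0 to 180 degrees; the function name is hypothetical.

```python
import math


def gradients(img):
    """Return (magnitude, direction) grids for a 2-D list of intensities.

    Uses centered [-1, 0, 1] differences; border pixels are replicated.
    Directions are unsigned, in degrees (assumed convention).
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)  # sqrt(gx^2 + gy^2)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang


# A vertical edge: intensity changes only along x.
edge = [[0, 0, 10, 10] for _ in range(4)]
mag, ang = gradients(edge)
print(mag[1][1], ang[1][1])
```

Using `atan2` rather than a plain arctangent of Gy/Gx avoids a division by zero when the horizontal gradient is 0.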
STEP 3: Calculate Histogram of Gradients in 8×8 cells
In this step, the image is divided into 8×8 cells and a histogram of gradients is calculated
for each 8×8 cell.
One of the important reasons to use a feature descriptor to describe a patch of an image is
that it provides a compact representation: an 8×8 cell holds 64 gradient magnitudes and
64 gradient directions, which the histogram reduces to 9 numbers.
But why an 8×8 patch? Why not 32×32? It is a design choice informed by the scale of
the features we are looking for. HOG was used for pedestrian detection initially. 8×8 cells in
a photo of a pedestrian scaled to 64×128 are big enough to capture interesting features
(e.g. the face, the top of the head, etc.).
Let us look at one 8×8 patch in the image and see how the gradients look:
Center: RGB patch and gradients represented using arrows. Right: the gradient in the
same patch represented as numbers.
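
The per-cell histogram can be sketched as below. This is a simplified version that assumes unsigned directions in degrees (0 to 180) and adds each pixel's magnitude to a single 20-degree bin; fuller implementations often split each vote between the two nearest bins. The function name is hypothetical.

```python
def cell_histogram(magnitudes, angles, n_bins=9):
    """Sum gradient magnitudes into orientation bins for one cell.

    magnitudes, angles: flat sequences with one entry per pixel of the cell.
    angles are unsigned, in degrees, expected in [0, 180).
    """
    bin_width = 180.0 / n_bins  # 20 degrees for 9 bins
    hist = [0.0] * n_bins
    for m, a in zip(magnitudes, angles):
        hist[int(a // bin_width) % n_bins] += m
    return hist


# Three pixels only, to keep the example short; an 8x8 cell would have 64.
hist = cell_histogram([2.0, 3.0, 5.0], [0.0, 10.0, 165.0])
print(hist)
```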
STEP 4: 16x16 Block Normalization
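
As a sketch of this step, the four 9-bin histograms of the 8×8 cells in a 16×16 block can be concatenated into a 36-element vector and scaled to unit length. The L2 norm and the small constant `eps` are assumptions, since the text does not name the norm; the function name is hypothetical.

```python
import math


def normalize_block(cell_hists, eps=1e-6):
    """Concatenate the cell histograms of one block and L2-normalize them.

    eps guards against division by zero for a completely flat block.
    """
    vec = [v for hist in cell_hists for v in hist]
    norm = math.sqrt(sum(v * v for v in vec) + eps * eps)
    return [v / norm for v in vec]


# Four 9-bin histograms, and the same block under doubled gradient strength.
cells = [[float(i + j) for j in range(9)] for i in range(4)]
brighter = [[2.0 * v for v in hist] for hist in cells]
a = normalize_block(cells)
b = normalize_block(brighter)
print(len(a), max(abs(x - y) for x, y in zip(a, b)))
```

Scaling every gradient in the block by the same factor leaves the normalized vector essentially unchanged, which is why this step reduces sensitivity to lighting.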
STEP 5: Calculate the Histogram of Oriented Gradients feature vector

To calculate the final feature vector for the entire image patch, the 36×1 vectors
(four 9-bin cell histograms per 16×16 block) are
concatenated into one giant vector. What is the size of this vector? Let us calculate:

a. How many positions of the 16×16 blocks do we have? There are 7 horizontal and
15 vertical positions, making a total of 7 × 15 = 105 positions.
b. Each 16×16 block is represented by a 36×1 vector. So when we concatenate them
all into one giant vector we obtain a 36 × 105 = 3780 dimensional vector.
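
The arithmetic above can be checked with a short sketch. It assumes the 16×16 block moves in steps of 8 pixels (one cell), which is what yields 7 × 15 positions on a 64×128 patch; the function name and parameter names are hypothetical.

```python
def hog_length(win_w, win_h, cell=8, block=16, stride=8, n_bins=9):
    """Return (blocks_x, blocks_y, descriptor_length) for a window.

    stride is the assumed step of the block, in pixels.
    """
    blocks_x = (win_w - block) // stride + 1
    blocks_y = (win_h - block) // stride + 1
    cells_per_block = (block // cell) ** 2  # 4 cells in a 16x16 block
    length = blocks_x * blocks_y * cells_per_block * n_bins
    return blocks_x, blocks_y, length


print(hog_length(64, 128))  # (7, 15, 3780)
```

Under the same assumed stride, the 128 x 144 face window mentioned earlier would give 15 × 17 block positions.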

Visualizing Histogram of Oriented Gradients

The HOG descriptor of an image patch is usually visualized by plotting the 9×1
normalized histograms in the 8×8 cells. See image on the side. You will notice that
the dominant direction of the histogram captures the shape of the person, especially
around the torso and legs.

Unfortunately, there is no easy way to visualize the HOG descriptor in OpenCV.
