Bag of Feature

This document discusses bag-of-features models for image classification. It describes how bag-of-features models represent images as histograms of visual word frequencies. The process involves extracting local image features, quantizing them into visual words via clustering, and encoding each image as a histogram of visual word counts. These histograms can then be classified using techniques like support vector machines. Nonlinear kernels allow the histograms to be separated in higher-dimensional feature spaces for improved classification performance.

Uploaded by

Budi Purnomo


Bag-of-features models for category classification

[Figure: example images labelled with object location and category (car, cow)]
Difficulties: within object variations

Variability: camera position, illumination, internal parameters

Within-object variations
Difficulties: within class variations
Image classification
• Given
Positive training images containing an object class

Negative training images that don’t

• Classify
A test image as to whether it contains the object class or not

?
Bag-of-features – Origin: texture recognition

• Texture is characterized by the repetition of basic elements, or textons

[Figure: a texture represented as a histogram over a universal texton dictionary]

Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001;
Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003
Bag-of-features – Origin: bag-of-words (text)
• Orderless document representation: frequencies of words from a dictionary
• Classification to determine document categories

Bag-of-words

Word        Doc 1  Doc 2  Doc 3  Doc 4
Common        2      0      1      3
People        3      0      0      2
Sculpture     0      1      3      0
…             …      …      …      …
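A minimal sketch of how such a word-frequency row can be computed. The dictionary and the example sentence are hypothetical and chosen only for illustration; tokenization by whitespace is an assumption, not something the slides specify.

```python
from collections import Counter


def bag_of_words(document, dictionary):
    """Return word frequencies over a fixed dictionary, ignoring word order."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in dictionary]


# Hypothetical dictionary and document, for illustration only.
dictionary = ["common", "people", "sculpture"]
document = "People admire sculpture and people discuss sculpture"
histogram = bag_of_words(document, dictionary)
print(histogram)  # [0, 2, 2]
```

Shuffling the words of the document leaves the histogram unchanged, which is what "orderless" means here.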
Bag-of-features for image classification

Extract regions → Compute descriptors → Find clusters and frequencies → Compute distance matrix → Classification (SVM)

[Csurka et al., ECCV Workshop’04], [Nowak, Jurie & Triggs, ECCV’06],
[Zhang, Marszalek, Lazebnik & Schmid, IJCV’07]
Bag-of-features for image classification

Step 1: Extract regions, compute descriptors
Step 2: Find clusters and frequencies
Step 3: Compute distance matrix, classification (SVM)
Step 1: feature extraction
• Scale-invariant image regions + SIFT (see previous lecture)
– Affine-invariant regions give “too much” invariance
– Rotation invariance is “too much” invariance for many realistic collections

• Dense descriptors
– Improve results in the context of categories (for most categories)
– Interest points do not necessarily capture “all” features

• Color-based descriptors

• Shape-based descriptors
Dense features

- Multi-scale dense grid: extraction of small overlapping patches at multiple scales
- Computation of the SIFT descriptor for each grid cell
- Example: horizontal/vertical step size of 3 pixels, scaling factor of 1.2 per level
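The grid layout described above can be sketched as follows. The step of 3 pixels and the factor of 1.2 per level come from the slide; the base patch size of 16 pixels, the number of levels, and the choice to scale the patch rather than the image are assumptions made for this illustration.

```python
def dense_grid(width, height, patch_size=16, step=3, scale_factor=1.2, num_levels=3):
    """Return (x, y, size) for overlapping square patches on a multi-scale grid."""
    patches = []
    size = float(patch_size)
    for _ in range(num_levels):
        s = int(round(size))
        # Slide the patch over the image with a fixed step in both directions.
        for y in range(0, height - s + 1, step):
            for x in range(0, width - s + 1, step):
                patches.append((x, y, s))
        size *= scale_factor  # next level uses a larger patch
    return patches


patches = dense_grid(64, 48)
print(len(patches), patches[0])
```

A descriptor such as SIFT would then be computed on each returned patch.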
Step 2: Quantization

[Figure: descriptors are clustered to build the visual vocabulary]

Examples for visual words

Airplanes

Motorbikes

Faces

Wild Cats

Leaves

People

Bikes
Step 2: Quantization
• Cluster descriptors
– K-means
– Gaussian mixture model

• Assign each descriptor to a cluster (visual word)
– Hard or soft assignment

• Build frequency histogram

K-means clustering
• Minimizes the sum of squared Euclidean distances between points x_i and their nearest cluster centers

• Algorithm:
– Randomly initialize K cluster centers
– Iterate until convergence:
• Assign each data point to the nearest center
• Recompute each cluster center as the mean of all points assigned to it

• Converges to a local minimum; the solution depends on the initialization

• Initialization is important: run several times and select the best result
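The algorithm above can be sketched in a few lines of plain Python. This is an illustrative version, not the implementation used for the reported experiments; the function names, the restart count `n_init`, and the handling of empty clusters are assumptions.

```python
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, rng, max_iter=100):
    """One k-means run from a random initialization; returns (centers, assignment, cost)."""
    centers = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: assignments no longer change
        assignment = new_assignment
        # Update step: each center becomes the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    cost = sum(squared_distance(p, centers[a]) for p, a in zip(points, assignment))
    return centers, assignment, cost


def kmeans_restarts(points, k, n_init=5, seed=0):
    """Run k-means several times and keep the run with the lowest cost."""
    rng = random.Random(seed)
    return min((kmeans(points, k, rng) for _ in range(n_init)), key=lambda r: r[2])


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment, cost = kmeans_restarts(points, 2)
print(assignment, round(cost, 3))
```

Selecting the run with the lowest sum of squared distances is one straightforward way to "select best" across restarts.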

Gaussian mixture model (GMM)
• Mixture of Gaussians: weighted sum of Gaussians

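For reference, a weighted sum of Gaussians is usually written as below. The notation (mixture weights $w_k$, means $\mu_k$, covariances $\Sigma_k$, descriptor dimension $d$) is the standard textbook one and is an assumption here, since the slide's own formula is not available in this text.

```latex
p(x) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \sum_{k=1}^{K} w_k = 1, \quad w_k \ge 0,

\text{where} \quad
\mathcal{N}(x \mid \mu, \Sigma) =
\frac{1}{(2\pi)^{d/2} \, |\Sigma|^{1/2}}
\exp\!\left( -\tfrac{1}{2} (x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right)
```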
Hard or soft assignment
• K-means → hard assignment
– Assign to the closest cluster center
– Count number of descriptors assigned to a center

• Gaussian mixture model → soft assignment
– Estimate distance to all centers
– Sum over number of descriptors

• Represent image by a frequency histogram
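The two assignment schemes can be contrasted with a short sketch. For simplicity the soft version assumes equal-weight isotropic Gaussians with a shared, hypothetical width `sigma`, which is a simplification of a full Gaussian mixture model; all function names are illustrative.

```python
import math


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def hard_histogram(descriptors, centers):
    """Each descriptor adds one count to its closest center."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        nearest = min(range(len(centers)), key=lambda k: squared_distance(d, centers[k]))
        hist[nearest] += 1.0
    return hist


def soft_histogram(descriptors, centers, sigma=1.0):
    """Each descriptor spreads one unit of mass over all centers."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        dists = [squared_distance(d, c) for c in centers]
        smallest = min(dists)  # subtracted for numerical stability
        weights = [math.exp(-(dist - smallest) / (2.0 * sigma ** 2)) for dist in dists]
        total = sum(weights)
        for k, w in enumerate(weights):
            hist[k] += w / total
    return hist


def l1_normalize(hist):
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else list(hist)


centers = [(0.0, 0.0), (4.0, 0.0)]
descriptors = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
hard = hard_histogram(descriptors, centers)
soft = soft_histogram(descriptors, centers)
print(l1_normalize(hard), l1_normalize(soft))
```

Both histograms have one bin per visual word, so the downstream classifier does not need to know which scheme produced them.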

Image representation

[Figure: histogram of codeword frequencies; x-axis: codewords, y-axis: frequency]

• Each image is represented by a vector, typically of 1000-4000 dimensions, normalized with the L1 or L2 norm
• Fine grained – represent model instances
• Coarse grained – represent object categories
Step 3: Classification

• Learn a decision rule (classifier) assigning bag-of-features representations of images to different classes

[Figure: decision boundary separating zebra from non-zebra examples]

Training data
Vectors are histograms, one from each training image

[Figure: positive and negative training histograms]

Train classifier, e.g. SVM
Linear classifiers
• Find a linear function (hyperplane) to separate positive and negative examples

$x_i$ positive: $x_i \cdot w + b \ge 0$
$x_i$ negative: $x_i \cdot w + b < 0$

Which hyperplane is best?
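Linear classifiers - margin

• Generalization is not good in this case:
[Figure: separating line passing close to the training points; axes: x1 (roundness), x2 (color)]

• Better if a margin is introduced: b/|w|
[Figure: separating line with a margin; axes: x1 (roundness), x2 (color)]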
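Nonlinear SVMs
• Datasets that are linearly separable work out great:
[Figure: points of two classes along the x axis, separable by a threshold]

• But what if the dataset is just too hard?
[Figure: points of two classes along the x axis, not separable by a threshold]

• We can map it to a higher-dimensional space:
[Figure: the same points plotted against x and x², where they become linearly separable]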
Nonlinear SVMs
• General idea: the original input space can always be
mapped to some higher-dimensional feature space
where the training set is separable:

Φ: x → φ(x)
Nonlinear SVMs

• The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that
$K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$

• This gives a nonlinear decision boundary in the original feature space:

$\sum_i \alpha_i y_i K(x_i, x) + b$
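A small numerical check of the kernel trick, using the degree-2 polynomial kernel on 2-D inputs as an example. The choice of kernel and the explicit lifting `lift` are illustrative assumptions; the slides do not name a specific kernel at this point.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def lift(x):
    """Explicit lifting for 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly_kernel(x, y):
    """Degree-2 polynomial kernel, computed without lifting."""
    return dot(x, y) ** 2


def decision_value(x, support_vectors, labels, alphas, b, kernel):
    """Kernelized decision function: sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * kernel(sv, x)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b


x, y = (1.0, 2.0), (3.0, -1.0)
print(poly_kernel(x, y), dot(lift(x), lift(y)))  # both are 1.0 up to rounding
```

The kernel gives the same value as the dot product in the lifted space, but never constructs the lifted vectors.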
Kernels for bags of features
• Histogram intersection kernel: $I(h_1, h_2) = \sum_{i=1}^{N} \min(h_1(i), h_2(i))$

• Generalized Gaussian kernel:

$K(h_1, h_2) = \exp\left( -\frac{1}{A} D(h_1, h_2)^2 \right)$

• D can be Euclidean distance → RBF kernel

• D can be χ² distance:

$D(h_1, h_2) = \sum_{i=1}^{N} \frac{(h_1(i) - h_2(i))^2}{h_1(i) + h_2(i)}$
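The kernels above translate directly into code. This sketch follows the formulas as written on the slide; skipping bins where both histograms are zero is an added convention to avoid division by zero, and the example histograms are made up.

```python
import math


def histogram_intersection(h1, h2):
    return sum(min(a, b) for a, b in zip(h1, h2))


def euclidean_distance(h1, h2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def chi2_distance(h1, h2):
    # Bins that are empty in both histograms contribute nothing.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def generalized_gaussian_kernel(h1, h2, distance, A):
    """K(h1, h2) = exp(-(1/A) * D(h1, h2)^2) for a chosen distance D."""
    return math.exp(-distance(h1, h2) ** 2 / A)


h1 = [0.5, 0.5, 0.0]
h2 = [0.25, 0.25, 0.5]
print(histogram_intersection(h1, h2))
print(generalized_gaussian_kernel(h1, h2, euclidean_distance, A=1.0))
print(generalized_gaussian_kernel(h1, h2, chi2_distance, A=1.0))
```

Passing the distance as a function makes it easy to switch between the RBF and χ² variants.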
Combining features
• SVM with multi-channel chi-square kernel

● Channel c is a combination of detector, descriptor

● $D_c(H_i, H_j)$ is the chi-square distance between histograms:

$D_c(H_1, H_2) = \frac{1}{2} \sum_{i=1}^{m} \frac{(h_{1i} - h_{2i})^2}{h_{1i} + h_{2i}}$

● $A_c$ is the mean value of the distances between all training samples

● Extension: learning of the weights, for example with Multiple Kernel Learning (MKL)

[J. Zhang, M. Marszalek, S. Lazebnik and C. Schmid. Local features and kernels for
classification of texture and object categories: a comprehensive study, IJCV 2007]
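A sketch of how the channels might be combined. The combination rule used here, $K = \exp(-\sum_c D_c / A_c)$, is an assumption based on the usual form of the multi-channel chi-square kernel and is not spelled out in the slide text; the channel names and histograms are hypothetical.

```python
import math


def chi2(h1, h2):
    """Chi-square distance with the 1/2 factor used on the slide."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def mean_pairwise_distance(histograms):
    """A_c: mean chi-square distance over all pairs of training histograms."""
    n = len(histograms)
    dists = [chi2(histograms[i], histograms[j])
             for i in range(n) for j in range(i + 1, n)]
    return sum(dists) / len(dists)


def multichannel_kernel(x, y, A):
    """x, y map channel name -> histogram; A maps channel name -> A_c."""
    return math.exp(-sum(chi2(x[c], y[c]) / A[c] for c in A))


# Hypothetical training set with two channels.
train = [
    {"sift": [0.6, 0.4], "color": [0.2, 0.8]},
    {"sift": [0.1, 0.9], "color": [0.5, 0.5]},
    {"sift": [0.3, 0.7], "color": [0.9, 0.1]},
]
A = {c: mean_pairwise_distance([img[c] for img in train]) for c in ("sift", "color")}
print(multichannel_kernel(train[0], train[1], A))
```

Dividing each channel's distance by its mean $A_c$ puts the channels on a comparable scale before they are summed.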
Combining features
• For linear SVMs
– Early fusion: concatenation of the descriptors
– Late fusion: learning weights to combine the classification scores

• Theoretically no clear winner

• In practice late fusion gives better results
– In particular if different modalities are combined
Multi-class SVMs
• Various direct formulations exist, but they are not widely used in practice. It is more common to obtain multi-class SVMs by combining two-class SVMs in various ways

• One versus all:
– Training: learn an SVM for each class versus the others
– Testing: apply each SVM to the test example and assign to it the class of the SVM that returns the highest decision value

• One versus one:
– Training: learn an SVM for each pair of classes
– Testing: each learned SVM “votes” for a class to assign to the test example
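The two test-time rules can be sketched as follows, assuming the two-class SVMs are already trained and only their decision values are available. The class names and values are made up, and the convention that a positive pairwise value votes for the first class of the pair is an assumption of this sketch.

```python
from collections import Counter


def one_versus_all(decision_values):
    """decision_values maps class -> decision value of that class's SVM."""
    return max(decision_values, key=decision_values.get)


def one_versus_one(pairwise_values):
    """pairwise_values maps (class_a, class_b) -> decision value.

    A positive value is counted as a vote for class_a, otherwise for class_b.
    """
    votes = Counter()
    for (class_a, class_b), value in pairwise_values.items():
        votes[class_a if value > 0 else class_b] += 1
    return votes.most_common(1)[0][0]


ova = one_versus_all({"car": -0.2, "cow": 1.3, "bike": 0.4})
ovo = one_versus_one({("car", "cow"): -1.0, ("car", "bike"): 0.5, ("cow", "bike"): 0.7})
print(ova, ovo)  # cow cow
```

One versus all needs one SVM per class, while one versus one needs one per pair of classes, so the number of classifiers grows quadratically with the number of classes.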
Why does SVM learning work?

• Learns foreground and background visual words

foreground words – high weight

background words – low weight

Illustration

Localization according to visual word probability

[Figure: four correctly classified images (35, 37, 38, 39); each visual word is marked according to whether a foreground or a background word is more probable]

Illustration
A linear SVM trained from positive and negative window descriptors

A few of the highest weighted descriptor vector dimensions (= 'PAS + tile')

+ lie on object boundary (= local shape structures common to many training exemplars)
Bag-of-features for image classification
• Excellent results in the presence of background clutter

[Figure: example images for the categories bikes, books, building, cars, people, phones, trees]

Examples for misclassified images

Books - misclassified into faces, faces, buildings

Buildings - misclassified into faces, trees, trees

Cars - misclassified into buildings, phones, phones

Bag of visual words summary

• Advantages:
– largely unaffected by position and orientation of object in image
– fixed length vector irrespective of number of detections
– very successful in classifying images according to the objects they contain

• Disadvantages:
– no explicit use of configuration of visual word positions
– no model of the object location
Evaluation of image classification
• PASCAL VOC [05-12] datasets

• PASCAL VOC 2007
– Training and test dataset available
– Used to report state-of-the-art results
– Collected January 2007 from Flickr
– 500 000 images downloaded and random subset selected
– 20 classes
– Class labels per image + bounding boxes
– 5011 training images, 4952 test images

• Evaluation measure: average precision

PASCAL 2007 dataset
Evaluation
Precision/Recall

• Ranked list for category A :

A, C, B, A, B, C, C, A ; in total four images with category A
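Precision, recall and average precision for this ranked list can be computed as sketched below. The sketch uses the plain (non-interpolated) definition of average precision, which is an assumption; the official PASCAL VOC 2007 protocol uses an interpolated variant that can give a different number.

```python
def precision_recall(ranked, target, num_relevant):
    """Return (precision, recall) after each position of the ranked list."""
    points = []
    hits = 0
    for rank, label in enumerate(ranked, start=1):
        if label == target:
            hits += 1
        points.append((hits / rank, hits / num_relevant))
    return points


def average_precision(ranked, target, num_relevant):
    """Mean of the precision values at each relevant item, over all relevant items."""
    hits = 0
    total = 0.0
    for rank, label in enumerate(ranked, start=1):
        if label == target:
            hits += 1
            total += hits / rank
    return total / num_relevant


ranked = ["A", "C", "B", "A", "B", "C", "C", "A"]
curve = precision_recall(ranked, "A", num_relevant=4)
ap = average_precision(ranked, "A", num_relevant=4)
print(curve[-1], ap)  # (0.375, 0.75) 0.46875
```

Because only three of the four category-A images appear in the list, recall stops at 0.75 and the missing image contributes zero to the average.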

Results for PASCAL 2007
• Winner of PASCAL 2007 [Marszalek et al.]: mAP 59.4
– Combination of several different channels (dense + interest points, SIFT + color descriptors, spatial grids)
– Non-linear SVM with Gaussian kernel

• Multiple kernel learning [Yang et al. 2009]: mAP 62.2
– Combination of several features
– Group-based MKL approach

• Combining object localization and classification [Harzallah et al.’09]: mAP 63.5
– Use detection results to improve classification

• Adding objectness boxes [Sanchez et al.’12]: mAP 66.3

Spatial pyramid matching
• Add spatial information to the bag-of-features

• Perform matching in 2D image space

[Lazebnik, Schmid & Ponce, CVPR 2006]

Related work
Similar approaches:
Subblock description [Szummer & Picard, 1997]
SIFT [Lowe, 1999, 2004]
GIST [Torralba et al., 2003]

[Figure: subblock description, Szummer & Picard (1997); SIFT, Lowe (1999, 2004); Gist, Torralba et al. (2003)]
Spatial pyramid representation

Locally orderless representation at several levels of spatial resolution

[Figure: image subdivided at level 0, level 1 and level 2]

Spatial pyramid matching
• Combination of spatial levels with pyramid match kernel
[Grauman & Darrell’05]
• Intersect histograms, more weight to finer grids
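A sketch of the representation and of the weighted intersection. The level weights ($1/2^L$ for level 0 and $1/2^{L-l+1}$ for level $l \ge 1$) follow the scheme usually given for spatial pyramid matching and are stated here as an assumption; the feature format `(x, y, word)` and all names are illustrative.

```python
def pyramid_histograms(features, num_words, levels, width, height):
    """features: list of (x, y, word). Returns, per level, one histogram per grid cell."""
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level  # the grid is cells x cells at this level
        hists = [[0] * num_words for _ in range(cells * cells)]
        for x, y, word in features:
            cx = min(int(x * cells / width), cells - 1)
            cy = min(int(y * cells / height), cells - 1)
            hists[cy * cells + cx][word] += 1
        pyramid.append(hists)
    return pyramid


def intersection(h1, h2):
    return sum(min(a, b) for a, b in zip(h1, h2))


def pyramid_match(p1, p2):
    """Weighted sum of histogram intersections; finer levels get more weight."""
    top = len(p1) - 1
    score = 0.0
    for level, (cells1, cells2) in enumerate(zip(p1, p2)):
        weight = 1.0 / 2 ** top if level == 0 else 1.0 / 2 ** (top - level + 1)
        score += weight * sum(intersection(a, b) for a, b in zip(cells1, cells2))
    return score


features_a = [(5, 5, 0), (60, 10, 1), (30, 40, 0), (62, 45, 2)]
features_b = [(6, 4, 0), (10, 40, 1), (33, 41, 0), (61, 44, 2)]
pa = pyramid_histograms(features_a, num_words=3, levels=2, width=64, height=48)
pb = pyramid_histograms(features_b, num_words=3, levels=2, width=64, height=48)
print(pyramid_match(pa, pa), pyramid_match(pa, pb))
```

With these weights an image matched against itself scores exactly its number of features, and a visual word that moves to a different cell lowers the score at the finer levels only.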
Scene dataset [Lazebnik et al.’06]
Coast, Forest, Mountain, Open country, Highway, Inside city, Tall building, Street, Suburb, Bedroom, Kitchen, Living room, Office, Store, Industrial

4385 images
15 categories
Scene classification

L         Single-level   Pyramid
0 (1x1)   72.2 ±0.6
1 (2x2)   77.9 ±0.6      79.0 ±0.5
2 (4x4)   79.4 ±0.3      81.1 ±0.3
3 (8x8)   77.2 ±0.4      80.7 ±0.3
Retrieval examples
Category classification – CalTech101

L         Single-level   Pyramid
0 (1x1)   41.2 ±1.2
1 (2x2)   55.9 ±0.9      57.0 ±0.8
2 (4x4)   63.6 ±0.9      64.6 ±0.8
3 (8x8)   60.3 ±0.9      64.6 ±0.7
Evaluation BoF – spatial
Image classification results on PASCAL’07 train/val set

(SH, Lap, MSD) x (SIFT, SIFTC)

Spatial layout   AP
1                0.53
2x2              0.52
3x1              0.52
1, 2x2, 3x1      0.54

Spatial layout not dominant for PASCAL’07 dataset

Combination improves average results, i.e., it is appropriate for some classes
Evaluation BoF - spatial

Image classification results on PASCAL’07 train/val set for individual categories

              1       3x1
Sheep         0.339   0.256
Bird          0.539   0.484
DiningTable   0.455   0.502
Train         0.724   0.745

Results are category dependent!
→ Combination helps somewhat
Discussion

• Summary
– Spatial pyramid representation: appearance of local image patches + coarse global position information
– Substantial improvement over bag of features
– Depends on the similarity of image layout

• Recent extensions
– Flexible, object-centered grid
• Shape masks [Marszalek’12] => additional annotations
– Weakly supervised localization of objects
• [Russakovsky et al.’12]
Recent extensions

• Efficient Additive Kernels via Explicit Feature Maps
[Perronnin et al.’10, Maji and Berg’09, A. Vedaldi and Zisserman’10]

• Recently improved aggregation schemes

– Fisher vector [Perronnin & Dance ‘07]
– VLAD descriptor [Jegou, Douze, Schmid, Perez ‘10]
– Supervector [Zhou et al. ‘10]
– Sparse coding [Wang et al. ’10, Boureau et al.’10]

• Improved performance + linear SVM

Fisher vector

• Use a Gaussian Mixture Model as vocabulary

• Statistical measure of the descriptors of the image w.r.t. the GMM
• Derivative of likelihood w.r.t. GMM parameters

GMM parameters:
– weight
– mean
– co-variance (diagonal)

Translated cluster → large derivative on the mean for this component

[Perronnin & Dance 07]

Fisher vector

For image retrieval in our experiments:

- only deviation w.r.t. mean, dim: K*D [K number of Gaussians, D dim of descriptor]
- variance does not improve for comparable vector length
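The mean-only Fisher vector can be sketched as follows for a diagonal-covariance GMM whose parameters are assumed to be already estimated. The normalization by $N \sqrt{w_k}$ follows the commonly used formulation and is an assumption here; the toy parameters are made up.

```python
import math


def gaussian_log_density(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def posteriors(x, weights, means, variances):
    """Soft assignment of descriptor x to each Gaussian."""
    logs = [math.log(w) + gaussian_log_density(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtracted for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fisher_vector_means(descriptors, weights, means, variances):
    """Gradient of the log-likelihood w.r.t. the means; output length is K*D."""
    K, D, N = len(weights), len(means[0]), len(descriptors)
    acc = [[0.0] * D for _ in range(K)]
    for x in descriptors:
        gamma = posteriors(x, weights, means, variances)
        for k in range(K):
            for d in range(D):
                acc[k][d] += gamma[k] * (x[d] - means[k][d]) / math.sqrt(variances[k][d])
    return [value / (N * math.sqrt(weights[k])) for k in range(K) for value in acc[k]]


# Toy GMM with K=2 components in D=2 dimensions (hypothetical values).
weights = [0.5, 0.5]
means = [(0.0, 0.0), (10.0, 10.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]
descriptors = [(1.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
fv = fisher_vector_means(descriptors, weights, means, variances)
print(len(fv), fv)
```

Descriptors shifted away from a component's mean produce a large entry for that component, while descriptors sitting exactly on a mean contribute nothing.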
Image classification with Fisher vector
• Dense SIFT
• Fisher vector (k=32 to 1024, total dimension from approx.
5000 to 160000)
• Normalization
– square-rooting
– L2 normalization
– [Perronnin’10], [Image categorization using Fisher kernels of non-iid
image models, Cinbis, Verbeek, Schmid, CVPR’12]

• Classification approach
– Linear classifiers
– One-versus-rest classifier
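The two normalization steps listed above can be sketched as follows. Applying the square root to the absolute value and restoring the sign is an assumption made so that negative Fisher vector entries are handled; the function names are illustrative.

```python
import math


def power_normalize(vector):
    """Signed square root: sign(z) * sqrt(|z|) for every entry."""
    return [math.copysign(math.sqrt(abs(v)), v) for v in vector]


def l2_normalize(vector):
    """Scale the vector to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


raw = [4.0, -9.0, 0.0]
normalized = l2_normalize(power_normalize(raw))
print(power_normalize(raw), normalized)
```

The square root reduces the influence of a few very large entries before the vector is passed to a linear classifier.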
Image classification with Fisher vector
• Evaluation on PASCAL VOC’07 linear classifiers with
– Fisher vector
– Sqrt transformation of Fisher vector
– Latent GMM of Fisher vector

• Sqrt transform + latent MOG models lead to improvement

• State-of-the-art performance obtained with linear classifier
Evaluation image description
Fisher versus BOF vector + linear classifier on PASCAL VOC’07

• Fisher improves over BOF
• Fisher comparable to BOF + non-linear classifier
• Limited gain due to SPM on PASCAL
• Sqrt helps for Fisher and BOF
• [Chatfield et al. 2011]
Large-scale image classification
• ImageNet has 14M images from 22k classes

• Standard subsets
– ImageNet Large Scale Visual Recognition Challenge 2010 (ILSVRC)
• 1000 classes and 1.4M images
– ImageNet10K dataset
• 10184 classes and ~ 9 M images
Large-scale image classification
• Classification approach
– One-versus-rest classifiers
– Stochastic gradient descent (SGD)
– At each step choose a sample at random and update the parameters using a sample-wise estimate of the regularized risk

• Data reweighting
– When some classes are significantly more populated than others, rebalancing positive and negative examples
– Empirical risk with reweighting

Natural rebalancing, same weight to positive and negatives
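A sketch of SGD training for a single one-versus-rest classifier with rebalancing by sampling. The hinge loss, the constant learning rate `lr`, the regularization strength `lam`, and the sampling scheme (on average `beta` negatives drawn per positive) are assumptions chosen for illustration, not the exact setup of the reported experiments.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sgd_one_vs_rest(positives, negatives, beta=1.0, lam=1e-3, lr=0.1, steps=2000, seed=0):
    """Train a linear classifier w.x + b with hinge loss and L2 regularization."""
    rng = random.Random(seed)
    w = [0.0] * len(positives[0])
    b = 0.0
    p_positive = 1.0 / (1.0 + beta)  # beta=1 gives equal weight to both classes
    for _ in range(steps):
        # Pick one sample at random, rebalancing the two classes.
        if rng.random() < p_positive:
            x, y = rng.choice(positives), 1.0
        else:
            x, y = rng.choice(negatives), -1.0
        margin = y * (dot(w, x) + b)
        # Sample-wise gradient step on (lam/2)*|w|^2 + max(0, 1 - margin).
        w = [wi * (1.0 - lr * lam) for wi in w]
        if margin < 1.0:
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]
            b += lr * y
    return w, b


positives = [(2.0, 2.0), (3.0, 2.5)]
negatives = [(-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0), (-2.5, -2.0)]
w, b = sgd_one_vs_rest(positives, negatives, beta=1.0)
print(w, b)
```

In a multi-class setting one such classifier would be trained per class, with the images of all other classes serving as negatives.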

Importance of re-weighting

• Plain lines correspond to w-OVR, dashed ones to u-OVR

• β is the number of negative samples for each positive, β=1 natural rebalancing

• Results for ILSVRC 2010

• Significant impact on accuracy

• For very high dimensions little impact
Impact of the image signature size
• Fisher vector (no SP) for varying number of Gaussians +
different classification methods, ILSVRC 2010

• Performance improves for higher dimensional vectors
Experimental results
• Features: dense SIFT, reduced to 64 dim with PCA

• Fisher vectors
– 256 Gaussians, using mean and variance
– Spatial pyramid with 4 regions
– Approx. 130K dimensions (4x [2x64x256])
– Normalization: square-rooting and L2 norm

• BOF: dim 1024 + R=4

– 4960 dimensions
– Normalization: square-rooting and L2 norm
Experimental results for ILSVRC 2010

• Features: dense SIFT, reduced to 64 dim with PCA
• 256 Gaussian Fisher vector using mean and variance + SP
(3x1) (4x [2x64x256] ~ 130k dim), square-root + L2 norm
• BOF dim=1024 + SP (3x1) (dim 4000), square-root + L2 norm
• Different classification methods
Large-scale experiment on ImageNet10k

Top-1 accuracy: 16.7

• Significant gain by data re-weighting, even for high-dimensional Fisher vectors
• w-OVR > u-OVR
• Improves over state of the art: 6.4% [Deng et al.] and WAR [Weston et al.]
Large-scale experiment on ImageNet10k
• Illustration of results obtained with w-OVR and 130K-dim Fisher vectors, ImageNet10K top-1 accuracy
Conclusion

• Stochastic training: learning with SGD is well-suited for large-scale datasets

• One-versus-rest: a flexible option for large-scale image classification

• Class imbalance: optimizing the imbalance parameter in the one-versus-rest strategy is a must for competitive performance
Conclusion

• State-of-the-art performance for large-scale image classification

• Code available online at http://lear.inrialpes.fr/software

• Future work
– Beyond a single representation of the entire image
– Take into account the hierarchical structure
