Research Article
Image-Based Indoor Localization Using Smartphone Camera
Shuang Li,1,2 Baoguo Yu,1 Yi Jin,3 Lu Huang,1,2 Heng Zhang,1,2 and Xiaohu Liang1,2
1 State Key Laboratory of Satellite Navigation System and Equipment Technology, China
2 Southeast University, China
3 Beijing Jiaotong University, China
Received 17 April 2021; Revised 30 May 2021; Accepted 20 June 2021; Published 5 July 2021
Copyright © 2021 Shuang Li et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
With the increasing demand for location-based services in places such as railway stations, airports, and shopping malls, indoor positioning technology has become one of the most attractive research areas. Due to the effects of multipath propagation, wireless indoor localization methods such as WiFi, Bluetooth, and pseudolite have difficulty achieving high-precision positioning. In this work, we present an image-based localization approach which obtains the position simply by taking a picture of the surrounding environment. This paper proposes a novel approach which classifies different scenes with a deep belief network and solves the camera position from several spatial reference points extracted from depth images by the perspective-n-point algorithm. To evaluate the performance, experiments are conducted on a public dataset and in real scenes; the results demonstrate that our approach can achieve submeter positioning accuracy. Compared with other methods, image-based indoor localization does not require infrastructure and has a wide range of applications, including self-driving, robot navigation, and augmented reality.
we propose a particularly efficient approach based on a deep belief network with local binary pattern feature descriptors. It enables us to find the most similar pictures quickly. In addition, we restrict the search space according to adaptive visibility constraints, which allows us to cope with extensive maps.

2. Related Work

Before presenting the proposed approach, we review previous work on image-based localization and divide the existing methods roughly into three categories.

Manual mark-based localization methods address the fact that the natural features of an image lack robustness, especially under varying illumination. To improve the robustness and accuracy of the reference points, special coded marks are used to meet higher positioning requirements. There are three benefits: they simplify the automatic detection of corresponding points, introduce system dimensions, and distinguish and identify targets by using a unique code for each mark. Common types of marks include concentric rings, QR codes, or patterns composed of colored dots. The advantage is a higher recognition rate and an effective reduction of the complexity of the positioning method. The disadvantage is that installation and maintenance costs are high, some targets are easily obstructed, and the scope of application is limited [17, 18].

Natural mark-based localization methods usually detect objects in the image and match them against an existing building database that contains the location information of the natural marks in the building. The advantage of this method is that it does not require additional local infrastructure. In other words, the reference object is actually a series of digital reference points (control points in photogrammetry) in the database. Therefore, this type of system is suitable for large-scale coverage without adding too much cost. The disadvantage is that the recognition algorithm is complex and easily affected by the environment, the features are prone to change, and the dataset needs to be updated [19–22].

Learning-based localization methods have emerged in the past few years. They are end-to-end methods that directly obtain the 6-DoF pose and have been proposed to solve loop-closure detection and pose estimation [23]. These methods do not require feature extraction, feature matching, or complex geometric calculations and are intuitive and concise. They are robust to weak textures, repeated textures, motion blur, and lighting changes. However, the computational scale of the training phase is very large and usually requires GPU servers, so such models cannot run smoothly on mobile platforms [20]. In many scenarios, learned features are not as effective as traditional features such as SIFT, and their interpretability is poor [24–27].

3. Framework and Method

In this section, we first give an overview of the framework. The key modules are then explained in more detail in the subsequent sections.

3.1. Framework Overview. The whole pipeline of the visual localization system is shown in Figure 1. In the following, we briefly provide an overview of our system.

In the offline stage, RGB-D cameras are used to collect enough RGB images and depth images of the indoor environment. At the same time, the poses of the camera and the 3D point cloud are constructed. The RGB images are used as the training dataset for the network model, and the network model parameters are saved once the loss function value no longer decreases. In the online stage, a user enters the room, downloads the trained network model parameters to the mobile phone, and takes a picture with the phone; the most similar image is then identified by the deep learning network. Unmatched points are eliminated, and the pixel coordinates of the matched points and the depths of the corresponding points are extracted. According to the pinhole imaging model, the perspective-n-point method can then be used to calculate the pose of the mobile phone in the world coordinate system. Finally, the pose is converted into a real position and displayed on the map.

3.2. Camera Calibration and Image Correction. Due to the processing and installation errors of the camera lens, the image exhibits radial and tangential distortion. Therefore, we must calibrate the camera and correct the images in the preprocessing stage. The checkerboard contains a set of calibration reference points, and the coordinates of each point are disturbed by the same noise. We establish the objective function

\gamma = \sum_{i=1}^{n} \sum_{j=1}^{m} \left\| p_{ij} - \hat{p}(A, R_i, t_i, P_j) \right\|^2 ,    (1)

where p_{ij} is the coordinate of the projection point of reference point j on image i, R_i and t_i are the rotation and translation vectors of image i, P_j is the three-dimensional coordinate of reference point j in the world coordinate system, and \hat{p}(A, R_i, t_i, P_j) is the corresponding two-dimensional coordinate in the image coordinate system.
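As a concrete illustration of this calibration step, the following Python sketch uses OpenCV's standard checkerboard calibration, which minimizes a reprojection error of the same form as Equation (1). It is a minimal sketch, not the authors' implementation; the board dimensions, square size, and file paths are illustrative assumptions.

import glob
import cv2
import numpy as np

# Checkerboard with 9x6 inner corners and 25 mm squares (illustrative values only).
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 25.0

obj_points, img_points = [], []
for path in glob.glob("calib/*.png"):          # calibration images (placeholder path)
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# calibrateCamera minimizes the reprojection error over K (mtx), distortion, R_i, t_i,
# which is the same objective as Equation (1).
rms, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)

# Undistort a query image before feature extraction.
img = cv2.imread("query.png")                  # placeholder file name
undistorted = cv2.undistort(img, mtx, dist)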
3.3. Scene Recognition. In this section, we use a deep belief network (DBN) to categorize the different indoor scenes. The framework includes image preprocessing, LBP feature extraction, DBN training, and scene classification.

3.3.1. Local Binary Pattern. The improved LBP feature is insensitive to rotation and illumination changes. The LBP operator can be described as follows: the gray value of the center pixel of the window is taken as the threshold, and the gray values of the surrounding 8 pixels are compared with this threshold in a clockwise direction; if a neighbour's gray value is larger than the threshold, it is marked as 1, otherwise as 0. The comparison yields an 8-bit binary number, which is converted to decimal to obtain the LBP value of the center pixel of the window. This value reflects the texture information at that position. The calculation process is shown in Figure 2.
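The following Python sketch implements the basic 8-neighbour LBP operator described above. It is a plain reference version, not the authors' improved rotation-invariant variant; the clockwise read is assumed to start at the top-left neighbour, which the paper does not specify, and the sample 3x3 window mirrors the values shown in the paper's LBP illustration.

import numpy as np

def lbp_8(gray):
    """Basic 8-neighbour LBP: threshold the 3x3 neighbourhood by the center pixel,
    read the bits clockwise, and convert the 8-bit pattern to a decimal code."""
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.uint8)
    # Clockwise offsets, starting at the top-left neighbour (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = gray[y, x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if gray[y + dy, x + dx] > center:   # neighbour larger than threshold -> 1
                    code |= 1 << (7 - bit)
            out[y, x] = code
    return out

# Example 3x3 window with center (threshold) value 5, as in the paper's illustration.
window = np.array([[1, 2, 2],
                   [9, 5, 6],
                   [15, 1, 2]], dtype=np.uint8)
print(lbp_8(window)[1, 1])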
Figure 4: Rotation-invariant LBP schematic.

The DBN used for scene classification is trained from restricted Boltzmann machines (Figure 5); the energy function of an RBM with parameters θ is

E(v, h \mid \theta) = -\sum_{i=1}^{m} b_i v_i - \sum_{j=1}^{n} c_j h_j - \sum_{i=1}^{m} \sum_{j=1}^{n} v_i w_{ij} h_j .    (3)

Figure 5: Boltzmann machine and restricted Boltzmann machine. v is the visible layer, m indicates the number of input data, h is the hidden layer, and w is the connection weight between the two layers; ∀i, j, v_i ∈ {0, 1}, h_j ∈ {0, 1}.
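For reference, Equation (3) can be evaluated directly. The Python sketch below is a literal transcription of the energy function; the layer sizes, weights, and the interpretation of b and c as visible and hidden biases are standard RBM conventions chosen here for illustration, not the authors' DBN training code.

import numpy as np

def rbm_energy(v, h, W, b, c):
    """Energy of a visible/hidden configuration, Equation (3):
    E(v, h | theta) = -sum_i b_i v_i - sum_j c_j h_j - sum_ij v_i W_ij h_j."""
    return -b @ v - c @ h - v @ W @ h

# Illustrative sizes only: m visible units (e.g. LBP feature length), n hidden units.
rng = np.random.default_rng(0)
m, n = 256, 64
v = rng.integers(0, 2, m)          # binary visible vector
h = rng.integers(0, 2, n)          # binary hidden vector
W = rng.normal(0, 0.01, (m, n))    # connection weights
b = np.zeros(m)                    # visible biases (assumption: b are visible-side biases)
c = np.zeros(n)                    # hidden biases (assumption: c are hidden-side biases)
print(rbm_energy(v, h, W, b, c))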
3.4. Feature Point Detection and Matching. In this paper, we propose a multifeature point fusion algorithm. The combination of an edge detection algorithm and the ORB detector enables the method to extract edge information, thereby increasing the number of matching points on objects with few textures. The feature points of the edges are obtained by the Canny algorithm to ensure that objects with little texture still have feature points. ORB has scale and rotation invariance, and it is faster than SIFT. The BRIEF description algorithm is used to construct the feature point descriptors [28–31].

The brute-force algorithm is adopted as the feature matching strategy. It calculates the Hamming distance between each feature point of the template image and each feature point of the sample image. The minimum Hamming distance is then compared with a threshold; if the distance is less than the threshold, the two points are regarded as a matching pair; otherwise, they are not matched. The framework of feature extraction and matching is shown in Figure 6.
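A minimal Python sketch of this detection-and-matching step is shown below, using OpenCV's Canny edge detector, ORB, and a brute-force Hamming matcher. The way edge pixels are promoted to keypoints, the Hamming threshold, and the file names are illustrative assumptions rather than the authors' exact parameters.

import cv2
import numpy as np

def detect_and_describe(gray, max_edge_kps=500):
    # ORB keypoints (scale/rotation invariant, BRIEF-style binary descriptors).
    orb = cv2.ORB_create(nfeatures=1000)
    kps = list(orb.detect(gray, None))
    # Add keypoints on Canny edges so low-texture objects still receive features.
    edges = cv2.Canny(gray, 100, 200)
    ys, xs = np.nonzero(edges)
    step = max(1, len(xs) // max_edge_kps)           # subsample edge pixels
    kps += [cv2.KeyPoint(float(x), float(y), 31) for x, y in zip(xs[::step], ys[::step])]
    # Describe all keypoints with ORB's binary descriptor.
    kps, desc = orb.compute(gray, kps)
    return kps, desc

def match(desc1, desc2, max_hamming=50):
    # Brute-force matching with Hamming distance; keep matches below a threshold.
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = bf.match(desc1, desc2)
    return [m for m in matches if m.distance < max_hamming]

query = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)      # placeholder file names
ref = cv2.imread("retrieved.png", cv2.IMREAD_GRAYSCALE)
kps1, d1 = detect_and_describe(query)
kps2, d2 = detect_and_describe(ref)
good = match(d1, d2)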
3.5. Pose Estimation. The core idea is to select four noncoplanar virtual control points; all the spatial reference points are then represented by these four virtual control points, and the coordinates of the virtual control points are solved from the correspondences between the spatial reference points and their projection points, thereby obtaining the coordinates of all the spatial reference points. Finally, the rotation matrix and the translation vector are solved. The specific algorithm is described as follows.

Given n reference points, the world coordinates are \tilde{P}^w_i = (x_i, y_i, z_i)^T, i = 1, 2, \cdots, n. The coordinates of the corresponding projection points u_i satisfy

\lambda_i u_i = K [R \; t] P^w_i ,    (8)

where λ_i is the depth of the reference point and K is the internal parameter matrix of the camera:

K = \begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix} ,    (9)

where f = f_u = f_v is the focal length of the camera and (u_0, v_0) = (0, 0) is the optical center coordinate.

First, select four noncoplanar virtual control points in the world coordinate system. The relationship between the virtual control points and their projection points is shown in Figure 7.

In Figure 7, C^w_1 = [0, 0, 0, 1]^T, C^w_2 = [1, 0, 0, 1]^T, C^w_3 = [0, 1, 0, 1]^T, and C^w_4 = [0, 0, 1, 1]^T. {C^c_j, j = 1, 2, 3, 4} are the homogeneous coordinates of the virtual control points in the camera coordinate system, and {\tilde{C}^c_j, j = 1, 2, 3, 4} are the corresponding nonhomogeneous coordinates. {C_j, j = 1, 2, 3, 4} are the homogeneous coordinates of the corresponding projection points in the image coordinate system, and {\tilde{C}_j, j = 1, 2, 3, 4} are the corresponding nonhomogeneous coordinates. {P^c_i, i = 1, 2, \cdots, n} are the homogeneous coordinates of the reference points in the camera coordinate system, and {\tilde{P}^c_i, i = 1, 2, \cdots, n} are the corresponding nonhomogeneous coordinates. The relationship between the spatial reference points and the control points in the world coordinate system is as follows:

P^w_i = \sum_{j=1}^{4} \alpha_{ij} C^w_j , \quad i = 1, 2, \cdots, n ,    (10)

where the vector [\alpha_{i1}, \alpha_{i2}, \alpha_{i3}, \alpha_{i4}]^T is the coordinate of the Euclidean space based on the control points. From the invariance of this linear relationship under a Euclidean transformation,

P^c_i = \sum_{j=1}^{4} \alpha_{ij} C^c_j , \quad \lambda_i u_i = K \tilde{P}^c_i = K \sum_{j=1}^{4} \alpha_{ij} \tilde{C}^c_j , \quad i = 1, 2, \cdots, n .    (11)

Assume \tilde{C}^c_j = [x^c_j, y^c_j, z^c_j]^T; then

\lambda_i = \sum_{j=1}^{4} \alpha_{ij} z^c_j .    (12)

Combining (11) and (12) for all n reference points yields a homogeneous linear system in the unknown control-point coordinates, collected in a vector Z:

M Z = 0 .    (14)

The solution Z lies in the kernel space of the matrix M:

Z = \sum_{i=1}^{N} \beta_i W_i ,    (15)

where W_i is an eigenvector of M^T M, N is the dimension of the kernel, and β_i is an undetermined coefficient. For a perspective projection model, the value of N is 1, resulting in

Z = \beta W ,    (16)

where W = [w_1^T, w_2^T, w_3^T, w_4^T]^T and w_j = [w_{j1}, w_{j2}, w_{j3}]^T; then the image coordinates of the four virtual control points are

c_j = \left( \frac{w_{j1}}{w_{j3}}, \frac{w_{j2}}{w_{j3}}, 1 \right) , \quad j = 1, 2, 3, 4 .    (17)
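The derivation above follows the control-point idea of the EPnP algorithm. As a practical illustration, not the authors' implementation, the following Python sketch recovers the phone pose from 2D-3D correspondences with OpenCV's EPnP solver inside RANSAC; the intrinsics and the input files are placeholders, and the principal point is set to the image center rather than normalized to the origin as in Equation (9).

import cv2
import numpy as np

# pts_3d: Nx3 world coordinates of the matched reference points (from the depth images
# and the offline point cloud); pts_2d: Nx2 pixel coordinates in the query image.
pts_3d = np.load("matched_world_points.npy").astype(np.float32)   # placeholder files
pts_2d = np.load("matched_pixel_points.npy").astype(np.float32)

f, u0, v0 = 525.0, 320.0, 240.0        # illustrative intrinsics for a 640x480 image
K = np.array([[f, 0, u0],
              [0, f, v0],
              [0, 0, 1]], dtype=np.float64)
dist = np.zeros(5)                      # images are assumed to be undistorted already

# Solve lambda_i * u_i = K [R | t] P_i^w (Equation (8)); RANSAC rejects remaining mismatches.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts_3d, pts_2d, K, dist, flags=cv2.SOLVEPNP_EPNP)

R, _ = cv2.Rodrigues(rvec)              # rotation matrix from the rotation vector
camera_position = (-R.T @ tvec).ravel() # camera centre in world coordinates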
4. Experiments

We conducted two experiments to evaluate the proposed system. In the first experiment, we compare the proposed algorithm with other state-of-the-art algorithms on a public dataset and then perform a numerical analysis to show the accuracy of our system. The second experiment evaluates the accuracy in the real world.

4.1. Experiment Setup. The experimental devices include an Android mobile phone (Lenovo Phab 2 Pro) and a depth camera (Intel RealSense D435), as shown in Figure 8. The user interface of the proposed visual positioning system on a smart mobile phone running in an indoor environment is shown in Figure 9.

Figure 8: Intel RealSense D435 and Lenovo mobile phone.

Figure 9: The user interface of the proposed visual positioning system on a smart mobile phone running in an indoor environment.

4.2. Experiment on Public Dataset. In this experiment, we adopted the ICL-NUIM dataset, which consists of RGB-D images from camera trajectories in two indoor scenes. The ICL-NUIM dataset is aimed at benchmarking RGB-D, visual odometry, and SLAM algorithms [32–34]. Two different scenes (the living room and the office room) are provided with ground truth. The living room has 3D surface ground truth together with depth maps and camera poses, so it is perfectly suited not only for benchmarking camera trajectories but also for reconstruction. The office room scene comes with only trajectory data and does not have any explicit 3D model. The images were captured at 640 × 480 resolution.

Table 1 shows the localization results of our approach compared with state-of-the-art methods. The proposed localization method is implemented on an Intel Core i5-4460 CPU (3.20 GHz). The total procedure from scene recognition to pose estimation takes about 0.17 s to output a location for a single image.

Table 1: Comparison of mean error on the ICL-NUIM dataset.
Method        Living room       Office room
PoseNet       0.60 m, 3.64°     0.46 m, 2.97°
4D PoseNet    0.58 m, 3.40°     0.44 m, 2.81°
CNN+LSTM      0.54 m, 3.21°     0.41 m, 2.66°
Ours          0.48 m, 3.07°     0.33 m, 2.40°

4.3. Experiment on Real Scenes. The images are acquired by a handheld depth camera at a series of locations. The image size is 640 × 480 pixels, and the focal length of the camera is known. Several images of the laboratory are shown in Figure 10.

Using the RTAB-Map algorithm, we obtain the 3D point cloud of the laboratory, shown in Figure 11. The blue points are the positions of the camera, and the blue line is the trajectory.

The 2D map of our laboratory is shown in Figure 12. The length and width of the laboratory are 9.7 m and 7.8 m, respectively. First, a point in the laboratory is selected as the origin to establish a world coordinate system. Then, we hold the mobile phone, walk along different routes, and take photos, as indicated by the arrows.

In the offline stage, we collect a total of 144 images. Because some images captured at different scenes are similar, we divide them into 18 categories. In the online stage, we captured 45 images at different locations on route 1 and 27 images on route 2. The classification accuracy formula is

P = \frac{N_i}{N} ,    (18)

where N_i is the number of correctly classified scene images and N is the total number of scene images. The classification accuracy of our method is 0.925.

Most mismatched scenes are concentrated in the corners, mainly due to the lack of significant features or to mismatches. Several mismatched scenes are shown in Figure 13.
Figure 12: 2D map of the laboratory (origin O, world axes X_w and Z_w; 9.7 m × 7.8 m).
After removing the wrongly matched results, the trajectory of the camera is compared with the predefined route. After calculating the Euclidean distance between the positions estimated by our method and the true positions, we obtain the error cumulative distribution function graph shown in Figure 14. It can be seen that the average positioning error is 0.61 m. Approximately 58% of the points have a positioning error of less than 0.5 m, about 77% have an error of less than 1 m, about 95% have an error of less than 2 m, and the maximum error is 2.55 m.

Since the original depth images in our experiment are based on RTAB-Map, their accuracy is limited. For example, in an indoor environment, intense illumination and strong shadows may lead to inconspicuous local features, and it is also difficult to construct a good point cloud model.
Figure 14: Empirical CDF of the positioning error, F(x) versus error x (m).
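The error statistics above can be reproduced from the estimated and ground-truth positions with a few lines of Python. This is a generic evaluation sketch, not the authors' script; the input files are placeholders.

import numpy as np

est = np.load("estimated_positions.npy")   # Nx2 or Nx3 estimated positions (placeholder)
gt = np.load("true_positions.npy")         # matching ground-truth positions (placeholder)

errors = np.linalg.norm(est - gt, axis=1)  # Euclidean positioning error per test image
errors_sorted = np.sort(errors)
cdf = np.arange(1, len(errors_sorted) + 1) / len(errors_sorted)   # empirical CDF F(x), paired with errors_sorted

print(f"mean error: {errors.mean():.2f} m, max error: {errors.max():.2f} m")
for threshold in (0.5, 1.0, 2.0):
    frac = np.mean(errors < threshold)
    print(f"errors below {threshold} m: {100 * frac:.0f}%")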
[10] L. Kneip, D. Scaramuzza, and R. Siegwart, “A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2969–2976, Colorado Springs, CO, USA, 2011.
[11] T. Sattler, B. Leibe, and L. Kobbelt, “Fast image-based localization using direct 2d-to-3d matching,” in 2011 IEEE International Conference on Computer Vision, pp. 667–674, Barcelona, Spain, 2011.
[12] Y. Li, N. Snavely, D. Huttenlocher, and P. Fua, “Worldwide pose estimation using 3d point clouds,” in European Conference on Computer Vision (ECCV), Berlin, Heidelberg, 2012.
[13] M. Larsson, E. Stenborg, C. Toft, L. Hammarstrand, T. Sattler, and F. Kahl, “Fine-grained segmentation networks: self-supervised segmentation for improved long-term visual localization,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 31–41, Seoul, Korea, 2019.
[14] A. Anoosheh, T. Sattler, R. Timofte, M. Pollefeys, and L. Van Gool, “Night-to-day image translation for retrieval-based localization,” in 2019 International Conference on Robotics and Automation (ICRA), pp. 5958–5964, Montreal, QC, Canada, 2019.
[15] J. X. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba, “Sun database: large-scale scene recognition from abbey to zoo,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 3485–3492, San Francisco, CA, USA, 2010.
[16] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[17] P.-E. Sarlin, C. Cadena, R. Siegwart, and M. Dymczyk, “From coarse to fine: robust hierarchical localization at large scale,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 12716–12725, California, 2019.
[18] Q. Niu, M. Li, S. He, C. Gao, S.-H. Gary Chan, and X. Luo, “Resource efficient and automated image-based indoor localization,” ACM Transactions on Sensor Networks, vol. 15, no. 2, pp. 1–31, 2019.
[19] Y. Chen, R. Chen, M. Liu, A. Xiao, D. Wu, and S. Zhao, “Indoor visual positioning aided by CNN-based image retrieval: training-free, 3D modeling-free,” Sensors, vol. 18, no. 8, pp. 2692–2698, 2018.
[20] A. Kendall and R. Cipolla, “Modelling uncertainty in deep learning for camera relocalization,” in IEEE International Conference on Robotics & Automation, pp. 4762–4769, Stockholm, Sweden, 2016.
[21] T. Sattler, B. Leibe, and L. Kobbelt, “Efficient & effective prioritized matching for large-scale image-based localization,” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 39, no. 9, pp. 1744–1756, 2016.
[22] L. Svärm, O. Enqvist, F. Kahl, and M. Oskarsson, “City-scale localization for cameras with known vertical direction,” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 39, no. 7, pp. 1455–1461, 2016.
[23] B. Zeisl, T. Sattler, and M. Pollefeys, “Camera pose voting for large-scale image-based localization,” in IEEE International Conference on Computer Vision (ICCV), pp. 2704–2712, Santiago, Chile, 2015.
[24] A. Kendall, M. Grimes, and R. Cipolla, “Posenet: a convolutional network for real-time 6-dof camera relocalization,” in IEEE International Conference on Computer Vision (ICCV), pp. 2938–2946, Santiago, Chile, 2015.
[25] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “Mobilenetv2: inverted residuals and linear bottlenecks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520, Salt Lake City, Utah, 2018.
[26] Z. Chen, A. Jacobson, N. Sunderhauf et al., “Deep learning features at scale for visual place recognition,” in 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 2017.
[27] S. Lynen, B. Zeisl, D. Aiger et al., “Large-scale, real-time visual–inertial localization revisited,” The International Journal of Robotics Research, vol. 39, no. 9, pp. 1–24, 2020.
[28] M. Dusmanu, I. Rocco, T. Pajdla et al., “D2-net: a trainable cnn for joint description and detection of local features,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8092–8101, California, 2019.
[29] R. B. Rusu, N. Blodow, and M. Beetz, “Fast point feature histograms (FPFH) for 3D registration,” in IEEE International Conference on Robotics and Automation, pp. 1848–1853, Kobe, Japan, 2009.
[30] A. Xu and G. Namit, “SURF: speeded-up robust features,” Computer Vision & Image Understanding, vol. 110, no. 3, pp. 404–417, 2008.
[31] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, “ORB: an efficient alternative to SIFT or SURF,” in IEEE International Conference on Computer Vision, pp. 2564–2571, Barcelona, Spain, 2012.
[32] A. Handa, T. Whelan, J. Mcdonald, and A. J. Davison, “A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM,” in 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 1524–1531, Hong Kong, China, 2014.
[33] M. Labbe and F. Michaud, “RTAB-Map as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation,” Journal of Field Robotics, vol. 36, no. 2, pp. 416–446, 2019.
[34] Z. Gao, Y. Li, and S. Wan, “Exploring deep learning for view-based 3D model retrieval,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 16, no. 1, pp. 1–21, 2020.