ITM Web of Conferences 12, 01018 (2017). DOI: 10.1051/itmconf/20171201018
ITA 2017

3D Reconstruction from a Single Still Image Based on Monocular Vision of an Uncalibrated Camera

Tao Yu 1, Jian-Hua Zou 2, Qin-Bao Song 3

1,2 Systems Engineering Institute, School of Electronic & Information Engineering, Xi'an Jiaotong University, 28 Xian-Ning West Road, Xi'an, Shaanxi Province, 710049, P. R. China
1,2 State Key Laboratory for Systems Engineering, Xi'an Jiaotong University, 28 Xian-Ning West Road, Xi'an, Shaanxi Province, 710049, P. R. China
1,3 Dept. of Computer Science & Technology, School of Electronics & Information Engineering, Xi'an Jiaotong University, 28 Xian-Ning West Road, Xi'an, Shaanxi Province, 710049, P. R. China

[email protected], [email protected], [email protected]

Abstract: We propose a framework that combines Machine Learning with Dynamic Optimization for automatically reconstructing a 3D scene from a single still image of an unstructured outdoor environment, based on the monocular vision of an uncalibrated camera. After a first segmentation of the image, a search-tree strategy based on Bayes' rule is used to identify the occlusion hierarchy of all areas. After a second, superpixel segmentation of the image, the AdaBoost algorithm is applied to the integrated detection of depth from lighting, texture and material. Finally, all the factors above are optimized under constraint conditions, yielding the whole depthmap of the image. The source image is then integrated with its depthmap in point-cloud or bilinear-interpolation style to realize the 3D reconstruction. Experiments comparing our method with typical methods on an associated database demonstrate that it improves, to a certain extent, the reasonability of the estimation of the overall 3D architecture of the image's scene, and that it needs neither manual assistance nor any camera model information.

1. Introduction

We all know that a 3D scene contains more valuable information than a 2D scene because it includes depth information. However, acquiring a 3D scene is usually complex and difficult. It often needs many cameras taking pictures at multiple view angles, together with techniques such as camera calibration and image matching, or it needs complex, sophisticated depth-sensor instruments.

In reality we sometimes cannot acquire many images from different view angles to reconstruct 3D scenes because of limited conditions, such as vehicles in the distance, ships at sea, or stars in interstellar space, but we are still interested in understanding the front-and-back hierarchy relationships among the objects. In these situations, we can only hope to utilize the information from monocular vision to recover or approximate the real 3D effects directly.

Recently, the many investigations attempting to reconstruct 3D scenes directly from monocular vision have involved mainly two types: from videos [1-5] and from a still image. Among the latter, some demand camera calibration in advance [6-8], some need manual assistance [9-11], some resort to auxiliary accurate sensors [12], some use the camera's focus [13, 14], and some append a few extra constraint conditions [15-17].

At present, with the rapid development of Machine Learning, some investigators utilize these theories to train associated models such as Bayes [18, 19], AdaBoost [20-23] and Markov random fields (MRF) [24-26] to estimate the structural information of scenes. However, pure machine learning often brings the problem of under-fitting or over-fitting, which often leads to incorrect or inaccurate estimation of the whole 3D structure of a scene. To make up for this deficiency, and since optimization theory and methods deal with selecting the best alternative in the sense of a given objective function, we consider introducing an optimization strategy on top of the Machine Learning for estimating the final whole 3D structure of scenes.

In this paper, a method combining Machine Learning with Dynamic Optimization is used to reconstruct a 3D scene directly from a single still image based on the monocular vision of an uncalibrated camera. It improves the reasonability of the estimation of the whole 3D structure of the image's scene to a certain extent and needs neither manual assistance nor any camera model information.

The remainder of this paper is arranged as follows. Section 2 interprets the principle of 3D reconstruction from a single still image based on the monocular vision of an uncalibrated camera; Section 3 provides the related experiments and results analysis; Section 4 concludes this paper.

© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution
License 4.0 (http://creativecommons.org/licenses/by/4.0/).

2. The Principle of 3D Reconstruction from a Single Still Image Based on Monocular Vision of an Uncalibrated Camera

Fig.1. The principle of the 3D reconstruction: (1) occlusion relationship identification — the source image is segmented at a large scale and processed with the search-tree algorithm; (2) depth integration detection on material, texture and lighting — the image is segmented at a smaller scale and processed with the AdaBoost algorithm; (3) global depth architecture estimation with dynamic optimization; (4) 3D image reconstruction, attached with bilinear interpolation or point cloud.

2.1 Global Principle

Compared with a 3D image, monocular vision produces a 2D image, which carries information only along the X and Y orientations, without Z. That is to say, it has no depth information. So, in order to realize 3D reconstruction, we need to estimate the depth information from the 2D image.

If we close one eye and look at a picture with the other, we can still feel the front-and-back relationship among the different parts of the objects in the image. For a complex image, according to some rules, we can often also infer its front-and-back relationships with other surrounding objects, and further infer the whole depth architecture of the image. These phenomena suggest that 3D reconstruction of a scene from a monocular image is possible. Here, we regard the perception above as past experience and the rules above as a kind of optimization. So, this paper attempts to combine Machine Learning with Dynamic Optimization so that the computer can solve this problem automatically.

In a picture we usually see familiar or unfamiliar objects, which involve shape, material, texture, color, and the effect of illumination. And we usually also accept the following inferences:

A. For a familiar object, if only part of its shape can be seen, it is usually sheltered by something.
B. For material, the part near to us is usually clearer and rougher than the part in the distance.
C. For texture, the near part usually appears sparser, while the distant part appears denser.
D. For the effect of lighting, the part near the light source is usually brighter than the part far away.
E. Color usually changes with different objects, with different parts of one object, or even with different lighting sources.

According to the experiences above, we can use samples with different depth levels for the material, texture and brightness of pictures to train the corresponding learning machines; then we use some priority rules to infer the depth level of each area in the image, and for objects with familiar shapes we can use a decision algorithm to infer possibly existing occlusions; finally, over the whole image, we can use an optimization algorithm to integrate all the inferences above. Thus we acquire multiple possible depth architectures for the image and select the most likely one as the final result, according to the computed optimal value or the biggest probability value. In addition, because we do not use any camera model information, the depths studied here are not real absolute depths, but the relative depths of the different components in an image.

Fig.1 gives this paper's main principle of 3D reconstruction directly from a monocular 2D image: the source image is segmented at a bigger scale and a smaller scale in turn, to perform occlusion identification and depth integration detection on material, texture and lighting; then the global depth architecture is estimated with dynamic optimization based on the results of the two stages above; at last, combining the global depth architecture with the source image, the 3D image reconstruction attached with bilinear interpolation or point cloud is realized. Next, we describe each part of the principle in detail.
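To make the flow of Fig.1 concrete, the following minimal Python sketch strings the four stages together as placeholder functions. Everything here (the function names, return types, and dictionary-based depthmap) is a hypothetical illustration of the data flow only, not the authors' implementation, which was written in C++.

```python
def segment(image, scale):
    """Placeholder: return a list of regions at the given segmentation scale."""
    return []

def identify_occlusions(objects):
    """Placeholder for stage (1): relative depth hierarchy from occlusion reasoning."""
    return {}

def detect_factor_depths(superpixels):
    """Placeholder for stage (2): per-superpixel depth classes from material, texture, lighting."""
    return {}

def estimate_global_depth(occlusion_ranks, factor_depths):
    """Placeholder for stage (3): dynamic optimization of the whole relative depthmap."""
    return {}

def reconstruct_3d(image, depthmap, style="point-cloud"):
    """Placeholder for stage (4): combine image and depthmap (point cloud or bilinear)."""
    return image, depthmap

def pipeline(image):
    objects = segment(image, scale="large")             # first, coarse segmentation
    ranks = identify_occlusions(objects)                # (1) occlusion hierarchy
    superpixels = segment(image, scale="small")         # second, finer segmentation
    factors = detect_factor_depths(superpixels)         # (2) AdaBoost depth detection
    depthmap = estimate_global_depth(ranks, factors)    # (3) global optimization
    return reconstruct_3d(image, depthmap)              # (4) 3D reconstruction
```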
2.2 Implementation

To estimate the depth structure reasonably, the key is to represent the image's important information as correctly as possible. Image representation is usually hierarchical, so it is carried out on the basis of image segmentation at different scales [27, 28], in which the image's different components represent different meanings.

2.2.1 Occlusion Identification

First, we segment the image at the bigger scale with the method in Ref. [27] to acquire familiar objects and identify the associated occlusions. Here, the occlusion phenomenon is defined as follows: when object A can only be seen in partial shape and the other area, which should contain the rest of A's shape, is occupied by object B, we consider that object A is occluded by object B, or that B is in front of A. The division of the occluded areas refers to [29]. The related principle is shown in Eq. (2-1), in which M and N are two neighboring objects in an image, X represents the associated traits, Y is the associated prior knowledge for familiar objects, and Q is the number of objects in the image. According to Bayes' rule, the more remarkable the trait information P(X|(M,N)) and the prior information P(Y|(M,N,X)) are, the bigger the value of the posterior probability P((M,N)|(X,Y)) on occlusion will be:

$$P\big((M,N)\mid(X,Y)\big)=\frac{P\big(Y\mid(M,N,X)\big)\,P\big(X\mid(M,N)\big)}{\sum_{M\neq N}^{Q}P\big(Y\mid(M,N,X)\big)\,P\big(X\mid(M,N)\big)} \qquad (2\text{-}1)$$

To find out all the familiar objects' occlusion relationships in an image, we adopt a kind of search-tree algorithm. The searching process is shown in Fig.2, in which the relative occlusion relationships among all the familiar objects are inferred by the principle of Eq. (2-1), and at last a relative number is used to represent each different depth hierarchy. Of course, there is a likelihood of inference errors on the occlusion problem; it depends on whether the prior information for understanding the familiar objects is reliable, and the more reliable the prior knowledge is, the more accurate the inference is.

Fig.2. The search strategy on occlusion in one image: for each pair A, B, first decide whether an occlusion exists between A and B; if not, A and B are at the same hierarchical position (number A = B) and both are used in the next identification; if A occludes B, then A is in front of B (number A > B), otherwise B is in front of A (number B > A), and the winner is used in the next identification; the process then moves to the next group inference.
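As a rough illustration of how Eq. (2-1) and the search strategy of Fig.2 can work together, the hedged Python sketch below scores the "A occludes B" hypothesis for each neighboring pair and turns the decisions into relative depth numbers. The callables trait_score and prior_score stand in for P(X|(M,N)) and P(Y|(M,N,X)); they, the threshold, and the simple rank-propagation rule are assumptions made only for illustration.

```python
from itertools import combinations

def occlusion_posterior(m, n, pairs, trait_score, prior_score):
    """Eq. (2-1): posterior probability that m occludes n, normalized over
    all candidate neighboring pairs (the denominator's sum over M != N)."""
    num = prior_score(m, n) * trait_score(m, n)
    den = sum(prior_score(a, b) * trait_score(a, b) for a, b in pairs)
    return num / den if den > 0 else 0.0

def rank_by_occlusion(objects, are_neighbors, trait_score, prior_score, threshold=0.5):
    """Fig.2-style pass: assign each familiar object a relative depth number;
    a larger number means the object is further in front."""
    rank = {obj: 0 for obj in objects}
    pairs = [(a, b) for a, b in combinations(objects, 2) if are_neighbors(a, b)]
    ordered = pairs + [(b, a) for a, b in pairs]     # both occlusion directions
    for a, b in pairs:
        p_ab = occlusion_posterior(a, b, ordered, trait_score, prior_score)
        p_ba = occlusion_posterior(b, a, ordered, trait_score, prior_score)
        if max(p_ab, p_ba) < threshold:
            rank[b] = rank[a]                        # no occlusion: same hierarchy (A = B)
        elif p_ab >= p_ba:
            rank[a] = max(rank[a], rank[b] + 1)      # A occludes B: A in front (A > B)
        else:
            rank[b] = max(rank[b], rank[a] + 1)      # B occludes A: B in front (B > A)
    return rank
```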


2.2.2 Depth Integration Detection on Material, Texture and Lighting

After the occlusion identification is over, the image is segmented again at a smaller scale with the method in Ref. [27] to acquire superpixels, each of which represents a coherent region with similar properties such as lighting, texture, material or color. Fig.3 displays an example of segmenting an image, in which the two grades of segmentation are drawn in yellow and in red respectively.

Fig.3. An example of segmenting an image: 3-a, source image; 3-b, superpixel-segmented image.

Next, we process the superpixel image above from the aspects of lighting, texture and material to extract more depth information. Before further depth identification, we predefine multiple depth classes according to the different extents to which lighting, texture and material change with depth. We use the logistic-regression version of the AdaBoost algorithm in Ref. [30] to train on the samples of each depth class and acquire the corresponding detector. For each depth class, the detector based on the weighting of textural or material traits is trained, and the corresponding lighting model is also trained. During identification, we integrate the detectors according to a priority order to acquire the integrated depth on material, texture and lighting, as sketched below. Fig.4 displays the associated detection principle, in which the property of AdaBoost of using weak classifiers to construct strong classifiers is applied both to single depth-class detection and to the integrated depth detection.
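One plausible reading of the integration scheme of Fig.4, written as a minimal Python sketch: each factor (material, texture, lighting) has one AdaBoost-style detector per predefined depth class, the strongest response selects that factor's depth class, and the per-factor results are merged following the stated priority order (material first, then texture, then lighting). The detector interface, the confidence threshold and the merging rule are illustrative assumptions, not the paper's exact procedure.

```python
def detect_depth_class(detectors, features):
    """Pick the depth class whose detector responds most strongly to a
    superpixel's features; returns (class_label, score)."""
    scores = {cls: det(features) for cls, det in detectors.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

def integrate_depth(material_dets, texture_dets, lighting_dets, features, min_score=0.5):
    """Merge per-factor depth classes by priority: material > texture > lighting.
    The first factor whose best detector is confident enough wins."""
    for detectors in (material_dets, texture_dets, lighting_dets):
        cls, score = detect_depth_class(detectors, features)
        if score >= min_score:
            return cls
    return None  # no factor gave a confident depth class for this superpixel
```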
On the selection of the associated traits: the texture energies computed from Laws' masks in Refs. [31, 32] act as textural features to embody the extent of textural denseness at different depths; the Haar-like traits in Ref. [33] act as material features to describe the extent of material smoothness and blur as the depth changes; and the lighting model over depth refers to [34, 35], in which the distance from the light source to the viewer is regarded as the relative depth, and the image's other components are assigned to the corresponding lighting depth classes and model parameters in turn.

Fig.4. The principle of the integration detection of the depth of lighting, texture and material: the depth class is detected separately on the lighting, texture and material factors by AdaBoost detectors (lighting depth classes 1..N, texture depth classes 1..M, material depth classes 1..Q), and the results are integrated with the priority order 1. material, 2. texture, 3. lighting.
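For reference, the following small NumPy sketch shows the textbook construction of Laws' texture-energy features [31, 32]: 1-D vectors (Level, Edge, Spot, Ripple) are combined by outer products into 5×5 masks, the image is convolved with each mask, and the local absolute response is averaged to give one texture-energy channel per mask. The particular mask set and window size used by the authors are not specified in the paper, so treat those choices here as assumptions.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

# Standard 1-D Laws vectors: Level, Edge, Spot, Ripple.
L5 = np.array([1, 4, 6, 4, 1], dtype=float)
E5 = np.array([-1, -2, 0, 2, 1], dtype=float)
S5 = np.array([-1, 0, 2, 0, -1], dtype=float)
R5 = np.array([1, -4, 6, -4, 1], dtype=float)

def laws_texture_energy(gray, window=15):
    """Return an (H, W, 16) array of Laws texture-energy features for a
    2-D grayscale image: one channel per 5x5 mask built by outer products."""
    gray = np.asarray(gray, dtype=float)
    gray = gray - uniform_filter(gray, size=window)       # remove local mean (illumination)
    vectors = (L5, E5, S5, R5)
    masks = [np.outer(a, b) for a in vectors for b in vectors]
    energies = [uniform_filter(np.abs(convolve(gray, m)), size=window) for m in masks]
    return np.stack(energies, axis=-1)
```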
2.2.3 Global Depth Architecture

To acquire the final global depth architecture of an image, we combine all the inferences of the factors above and apply a kind of dynamic optimization with constraint conditions to solve the problem. In addition, the image's gray values usually reflect some depth information, so they are also regarded as one basis for estimating the final global depth architecture. The corresponding formulation is Eq. (2-2), where m and n represent two neighboring areas in an image; g_q(m, n, d) is the depth mapping function between neighboring areas m and n corresponding to the constraints above, with q = 1 based on the gray image, q = 2 on lighting, q = 3 on texture, q = 4 on material, and q = 5 on occlusion; λ are the associated Lagrange multipliers and μ is the penalty parameter; L is the total number of correlations between neighboring areas in the image; d is the final depth correlation to be established for all the smaller areas of the image; and c is an introduced slack variable used to transform the inequality constraints above into equality constraints.

$$\arg\min_{d,\;c\ge 0}\;\sum_{p=1}^{L}\left[-\sum_{q=1}^{5}\lambda_{pq}\big(g_q(m_p,n_p,d_p)-c_{pq}\big)+\frac{\mu}{2}\sum_{q=1}^{5}\big(g_q(m_p,n_p,d_p)-c_{pq}\big)^{2}\right] \qquad (2\text{-}2)$$

From the optimization theory in Ref. [36], we can infer the optimal value of c_pq as Eq. (2-3):

$$c_{pq}=\max\!\left(g_q(m_p,n_p,d_p)-\frac{\lambda_{pq}}{\mu},\,0\right) \qquad (2\text{-}3)$$

Combining Eq. (2-2) with Eq. (2-3), we can construct the augmented Lagrangian function (2-4):

$$L_A(d,\lambda,\mu)\overset{\text{def}}{=}\sum_{p=1}^{L}\sum_{q=1}^{5}\left\{-\lambda_{pq}\!\left[g_q(m_p,n_p,d_p)-\max\!\left(g_q(m_p,n_p,d_p)-\frac{\lambda_{pq}}{\mu},0\right)\right]+\frac{\mu}{2}\left[g_q(m_p,n_p,d_p)-\max\!\left(g_q(m_p,n_p,d_p)-\frac{\lambda_{pq}}{\mu},0\right)\right]^{2}\right\} \qquad (2\text{-}4)$$

Thus the problem (2-2) above is transformed into formula (2-5):

$$\min_{d}\;L_A(d,\lambda,\mu) \qquad (2\text{-}5)$$


Referring to [37], the specific solving procedure for the problem above is as follows:

Given μ_0 > 0, tolerance ε > 0, starting points d_s0 and λ_0;
for k = 0, 1, 2, ...
    Find an approximate minimizer d_k of L_A(·, λ_k; μ_k), starting at d_sk and terminating when ||∇_d L_A(d_k, λ_k; μ_k)|| ≤ ε;
    if the final convergence test for formula (2-5) is satisfied
        stop with approximate solution d_k;
    end (if)
    Update the Lagrange multipliers using Eq. (2-6) to obtain λ_{k+1}:
    $$\lambda^{k+1}=\max\!\big(\lambda^{k}-\mu_k\,g(m,n,d^{k}),\,0\big) \qquad (2\text{-}6)$$
    Choose a new penalty parameter μ_{k+1} ≥ μ_k;
    Set the starting point for the next iteration to d_{s(k+1)} = d_k;
end (for)

According to the procedure described above, the vector d, the multipliers λ and the penalty parameter μ are updated together until convergence. Note that the vector d refers to relative depth, not absolute depth.
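A compact Python sketch of this augmented-Lagrangian loop (Eqs. (2-4)-(2-6)) is given below; it leans on scipy.optimize.minimize for the inner unconstrained minimization of (2-5). The constraint functions g_q are assumed to be supplied by the earlier detection stages (one value per neighboring-area pair, with n_pairs playing the role of L), and the penalty growth factor and the simple feasibility-based stopping rule are illustrative choices, not the authors' settings.

```python
import numpy as np
from scipy.optimize import minimize

def augmented_lagrangian(d0, g_funcs, n_pairs, mu0=1.0, tol=1e-4, max_outer=50):
    """Minimize the augmented Lagrangian of Eq. (2-4) over the relative depths d.
    g_funcs: one callable per constraint source q (gray, lighting, texture,
    material, occlusion), each mapping d -> array of length n_pairs."""
    lam = np.zeros((len(g_funcs), n_pairs))               # multipliers lambda_pq
    mu, d = mu0, np.asarray(d0, dtype=float)

    def L_A(d_vec):
        total = 0.0
        for q, g in enumerate(g_funcs):
            gq = g(d_vec)
            r = gq - np.maximum(gq - lam[q] / mu, 0.0)    # g - c*, with c* from Eq. (2-3)
            total += np.sum(-lam[q] * r + 0.5 * mu * r * r)
        return total

    for _ in range(max_outer):
        d = minimize(L_A, d, method="L-BFGS-B", tol=tol).x   # inner problem, Eq. (2-5)
        g_vals = np.array([g(d) for g in g_funcs])
        if np.all(g_vals >= -tol):                        # simple feasibility-based stop rule
            break
        lam = np.maximum(lam - mu * g_vals, 0.0)          # multiplier update, Eq. (2-6)
        mu *= 2.0                                         # grow the penalty parameter
    return d
```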
Fig.5 illustrates an example of estimating an image's global depth architecture with the method of this paper, in which the different gray levels indicate different depths.

Fig.5. An example of the estimation of the global depth architecture of an image.

2.2.4 3D Reconstruction Realization

At last, after acquiring the complete depthmap of an image, we combine the depthmap with its source image to reconstruct the 3D image. Here, two kinds of methods are used to process the voxels of the 3D image: for images dominated by continuous qualities, the bilinear interpolation method is applied to complement and smooth the color areas among the reconstructed 3D pixel points; for images dominated by discrete qualities, the point-cloud method is applied directly to the reconstruction of the 3D image's voxels. In addition, during the course of the 3D image reconstruction, the borderlines of the earlier segmentation are replaced with the corresponding pixels of the original image. Fig.6 gives an example of reconstructed 3D images viewed at different visual angles from a single 2D source image. From the different view angles, we can see that they all approximate the usual real objects and scenes.

Fig.6. Example of 3D reconstruction from a single image: a, frontal image; b-i, right rotation by 10°, 20°, 30°, 40°, 50°, 60°, 70° and 80° respectively.
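As a minimal illustration of the point-cloud style of reconstruction, the sketch below lifts every pixel to a 3D point (x, y, relative depth) and carries its color along; because the camera is uncalibrated, no intrinsics are used and the depth axis is only relative, which matches the setting of this paper. Upsampling the coarse depthmap to the image resolution with bilinear interpolation (order=1) is an assumed preprocessing step.

```python
import numpy as np
from scipy.ndimage import zoom

def depthmap_to_point_cloud(image, depthmap):
    """Return an (N, 6) array of [x, y, relative_depth, r, g, b] points.
    image: (H, W, 3) array; depthmap: (h, w) array of relative depths."""
    H, W = image.shape[:2]
    # Bilinearly upsample the (coarser) depthmap to the image resolution.
    depth = zoom(np.asarray(depthmap, dtype=float),
                 (H / depthmap.shape[0], W / depthmap.shape[1]), order=1)
    ys, xs = np.mgrid[0:H, 0:W]
    return np.column_stack([xs.ravel(), ys.ravel(), depth.ravel(),
                            image.reshape(-1, 3).astype(float)])
```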
3. Experiment

3.1 Testing Environment

In order to prove the applicability of the proposed method in a general environment, all the experiments are conducted with Microsoft Visual C++ 6.0 on a Pentium 1.73 GHz personal computer.

The framework of 3D reconstruction from monocular vision proposed in this paper is tested on the Make3D Range Image Data dataset [24, 25], which consists of 534 image+depthmap pairs, with an image resolution of 2272×1704 and a depthmap resolution of 55×305. In our experiment, 400 of the image/depthmap pairs are used for training the associated detectors and the remaining 134 for testing. At the same time, we compare the experimental results with the typical Saxena [24] and HEH [23] methods. For fairness, their depthmaps are scaled and shifted before computing the errors, to match the global scale of our test images.


3.2 Testing Results and Analysis

Fig.7 gives some examples of the real effects of the typical methods, Saxena [24] and HEH [23], contrasted with ours. In the comparisons, HEH's way of generating a pop-up effect by folding the images at "ground-vertical" boundaries is not feasible for all images; especially for multiple non-ground and discontinuous areas with front-and-back relationships in an image, it often fails to describe these scenes' structure correctly, as in Figs. 7-b1 and 7-b2. Saxena's use of a Markov Random Field (MRF) to infer both the 3D location and the orientation of the patches in an image, without any explicit assumptions about the structure of the scene, makes it generalize well, even to scenes with significant non-vertical structure; its deficiency is that, for the whole structural reconstruction, it lacks a comprehensive consideration of many factors, so its global 3D structural estimation is sometimes inaccurate, as in Fig. 7-c3. Our algorithm's global dynamic optimization seems more reasonable and feasible for the whole 3D architectural reconstruction to a certain extent, producing better effects, although a few small local parts of the images are estimated somewhat coarsely.

Fig.7. Examples of 3D reconstruction from a single image with different algorithms: column a, source image; column b, HEH's algorithm; column c, Saxena's algorithm; column d, our algorithm (rows 1-3 show different scenes).

Table 1 and Fig.8 give the final statistical results of the testing comparisons of our algorithm with the two typical algorithms above, in which both the 'Depth error' and the 'Relative depth error' are averaged over all pixels in the hold-out test set, 'Correct rate' means the percentage of models that are qualitatively correct, and 'Planes correct rate' is the percentage of major planes correctly identified. Combining this with the earlier analysis, we can see from the whole of the testing data that Saxena's method is a little better than ours, probably because our method accumulates several small local estimation errors during the course of computation, and that our method outperforms HEH, mainly because of HEH's sometimes failed estimation of the whole structure of a scene; all of this further verifies our algorithm's feasibility and rationality.

Table 1. Comparisons with typical methods

Method    Depth error (|log10 d - log10 d̂|)    Relative depth error (|d - d̂| / d)    Correct rate (%)    Planes correct rate (%)
HEH       0.320                                 0.423                                  33.1                50.3
Saxena    0.187                                 0.370                                  64.9                71.2
Ours      0.253                                 0.382                                  52.5                69.4

Fig.8. Comparisons with typical methods: 8-a, depth error; 8-b, relative depth error; 8-c, correct rate; 8-d, planes correct rate, each shown for HEH, Saxena and our method.
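For reference, the two quantitative measures of Table 1 and Fig.8 can be computed as in the Python sketch below: the depth error is the mean |log10 d - log10 d̂| and the relative depth error is the mean |d - d̂|/d over all test pixels, after optionally aligning the prediction's global scale and shift to the ground truth, as is done for the compared methods. The least-squares alignment shown here is one common choice and should be read as an assumption about the exact alignment used.

```python
import numpy as np

def depth_errors(d_true, d_pred, align=True):
    """Return (mean |log10 d - log10 d_hat|, mean |d - d_hat| / d) over all pixels."""
    d_true = np.asarray(d_true, dtype=float).ravel()
    d_pred = np.asarray(d_pred, dtype=float).ravel()
    if align:
        # Least-squares global scale and shift of the prediction onto the ground truth.
        A = np.column_stack([d_pred, np.ones_like(d_pred)])
        (scale, shift), *_ = np.linalg.lstsq(A, d_true, rcond=None)
        d_pred = scale * d_pred + shift
    d_pred = np.clip(d_pred, 1e-6, None)          # keep the logarithm well defined
    log_err = np.mean(np.abs(np.log10(d_true) - np.log10(d_pred)))
    rel_err = np.mean(np.abs(d_true - d_pred) / d_true)
    return log_err, rel_err
```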
4. Conclusion

This paper proposes a framework for 3D reconstruction from a single still image of an unstructured outdoor environment based on the monocular vision of an uncalibrated camera. It integrates machine learning for each factor with a dynamic optimization that comprehensively considers multiple factors to estimate the scene's 3D structure, and it needs neither manual assistance nor any camera model information. In the experiments, we use the Make3D Range Image Data to compare it with the typical HEH and Saxena methods. The results basically confirm our method's reasonability and feasibility for the global architectural 3D reconstruction of an image's scene to a certain extent.

In future work, our investigation will aim to improve the accuracy of the estimation, especially for the scene's local patches but also, further, for the global structure. Meanwhile, whether this paper's method can be applied to other databases needs new testing. In addition, we will investigate real-time 3D reconstruction from a single still image, which requires a parallel mechanism and a further improvement of our algorithm's speed.

Acknowledgment

The authors thank the Computer Science Department of Stanford University for supplying investigators with the free Make3D Range Image Data, and acknowledge the support of the National Natural Science Foundation of China (50177025).

References

1. Jebara, T., A. Azarbayejani, and A. Pentland, 3D structure from 2D motion. Signal Processing Magazine, 1999. 16(3): p. 66-84.
2. Po, L.-M., et al., Automatic 2D-to-3D video conversion technique based on depth-from-motion and color segmentation, in Signal Processing (ICSP), 2010 IEEE 10th International Conference


on. 2010, IEEE. p. 1000-1003.
3. Bok, Y., Y. Hwang, and I.S. Kweon, Accurate motion estimation and high-precision 3D reconstruction by sensor fusion, in Robotics and Automation, 2007 IEEE International Conference on. 2007, IEEE. p. 4721-4726.
4. Cao, X., Z. Li, and Q. Dai, Semi-automatic 2D-to-3D conversion using disparity propagation. IEEE Transactions on Broadcasting, 2011. 57(2): p. 491-499.
5. Hertzmann, A. and S.M. Seitz, Example-based photometric stereo: Shape reconstruction with general, varying BRDFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005. 27(8): p. 1254-1264.
6. Guillou, E., et al., Using vanishing points for camera calibration and coarse 3D reconstruction from a single image. The Visual Computer, 2000. 16(7): p. 396-410.
7. Wilczkowiak, M., E. Boyer, and P. Sturm, Camera calibration and 3D reconstruction from single images using parallelepipeds, in Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on. 2001, IEEE.
8. Wang, G., et al., Camera calibration and 3D reconstruction from a single view based on scene constraints. Image and Vision Computing, 2005. 23(3): p. 311-323.
9. Criminisi, A., I. Reid, and A. Zisserman, Single view metrology. International Journal of Computer Vision, 2000. 40(2): p. 123-148.
10. Barinova, O., et al., Fast automatic single-view 3-D reconstruction of urban scenes, in Computer Vision - ECCV 2008. 2008, Springer. p. 100-113.
11. Sturm, P. and S. Maybank, A method for interactive 3D reconstruction of piecewise planar objects from single images, in The 10th British Machine Vision Conference (BMVC'99). 1999, The British Machine Vision Association (BMVA).
12. Willneff, J., J. Poon, and C. Fraser, Single-image high-resolution satellite data for 3D information extraction. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 2005. 36(1/W3): p. 1-6.
13. Namboodiri, V.P. and S. Chaudhuri, Recovery of relative depth from a single observation using an uncalibrated (real-aperture) camera, in Computer Vision and Pattern Recognition, CVPR. IEEE Conference on. 2008, IEEE. p. 1-6.
14. Zhuo, S. and T. Sim, On the recovery of depth from a single defocused image, in Computer Analysis of Images and Patterns. 2009, Springer. p. 889-897.
15. Van den Heuvel, F.A., 3D reconstruction from a single image using geometric constraints. ISPRS Journal of Photogrammetry and Remote Sensing, 1998. 53(6): p. 354-368.
16. El-Hakim, S.F., A flexible approach to 3D reconstruction from single images, in ACM SIGGRAPH. 2001, Citeseer.
17. Wang, G., et al., Single view metrology from scene constraints. Image and Vision Computing, 2005. 23(9): p. 831-840.
18. Delage, E., H. Lee, and A.Y. Ng, A dynamic Bayesian network model for autonomous 3D reconstruction from a single indoor image, in Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. 2006, IEEE. p. 2418-2428.
19. Han, F. and S.-C. Zhu, Bayesian reconstruction of 3D shapes and scenes from a single image, in Higher-Level Knowledge in 3D Modeling and Motion Analysis, 2003. HLK 2003. First IEEE International Workshop on. 2003, IEEE.
20. Hoiem, D., A.A. Efros, and M. Hebert, Recovering surface layout from an image. International Journal of Computer Vision, 2007. 75(1): p. 151-172.
21. Hoiem, D., A.A. Efros, and M. Hebert, Closing the loop in scene interpretation, in Computer Vision and Pattern Recognition, CVPR. IEEE Conference on. 2008, IEEE. p. 1-8.
22. Hoiem, D., A.A. Efros, and M. Hebert, Automatic photo pop-up. ACM Transactions on Graphics (TOG), 2005. 24(3): p. 577-584.
23. Hoiem, D., A.A. Efros, and M. Hebert, Geometric context from a single image, in Computer Vision, ICCV 2005. Tenth IEEE International Conference on. 2005, IEEE. p. 654-661.
24. Saxena, A., M. Sun, and A.Y. Ng, Make3D: Learning 3D scene structure from a single still image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009. 31(5): p. 824-840.
25. Saxena, A., S.H. Chung, and A.Y. Ng, Learning depth from single monocular images, in Advances in Neural Information Processing Systems (NIPS). 2005. p. 1161-1168.
26. Delage, E., H. Lee, and A.Y. Ng, Automatic single-image 3D reconstructions of indoor Manhattan world scenes, in Robotics Research. 2007, Springer. p. 305-321.
27. Levinshtein, A., et al., TurboPixels: Fast superpixels using geometric flows. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009. p. 2290-2297.
28. Felzenszwalb, P.F. and D.P. Huttenlocher, Efficient graph-based image segmentation. International Journal of Computer Vision, 2004, Springer. p. 167-181.
29. Hoiem, D., et al., Recovering occlusion boundaries from a single image, in Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on. 2007, IEEE. p. 1-8.
30. Collins, M., R.E. Schapire, and Y. Singer, Logistic regression, AdaBoost and Bregman distances, in Machine Learning. 2002, Springer. p. 253-285.
31. Davies, E.R., Laws' texture energy in texture, in Machine Vision: Theory, Algorithms, Practicalities. 1997.


32. Laws, K.I., Rapid texture identification, in 24th Annual Technical Symposium. 1980, International Society for Optics and Photonics. p. 376-381.
33. Lienhart, R., A. Kuranov, and V. Pisarevsky, Empirical analysis of detection cascades of boosted classifiers for rapid object detection, in Pattern Recognition. 2003, Springer. p. 297-304.
34. Klassen, R.V., Modeling the effect of the atmosphere on light. 1987, ACM. p. 215-237.
35. Narasimhan, S.G. and S.K. Nayar, Shedding light on the weather, in Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on. 2003, IEEE. p. I-665-I-672, vol. 1.
36. Nocedal, J. and S.J. Wright, Numerical Optimization. Springer, 2006.
37. Nocedal, J. and S.J. Wright, Numerical Optimization, Second Edition. 1999.
