Towards Automatic Modeling of Monuments and Towers

Sabry El-Hakim, J.-Angelo Beraldin, Jean-François Lapointe


Visual Information Technology (VIT) Group
Institute for Information Technology, National Research Council,
Ottawa, Ontario, Canada K1A 0R6
E-mail: {Sabry.El-Hakim, Angelo.Beraldin, Jean-Francois.Lapoints}@nrc.ca

Abstract

Three-dimensional modeling from images, when carried out entirely by a human, can be time consuming and impractical for large-scale projects. On the other hand, full automation may be unachievable or not accurate enough for many applications such as cultural heritage documentation. In addition, three-dimensional modeling from images, particularly fully automated methods, requires the extraction of features, such as corners, and needs them to appear in multiple images. However, in practical situations those features are not always available, sometimes not even in a single image, due to occlusions or lack of texture on the surface. Taking closely separated images or optimally designing view locations can preclude some occlusions. However, taking such images is often not practical and we are left with a small number of images that do not properly cover every surface or corner. The approach presented in this paper uses both interactive and automatic techniques, each where it is best suited, to accurately and completely model monuments and towers. It particularly focuses on automating the construction of unmarked surfaces such as columns, arches, and blocks from minimum available clues. It also extracts the occluded or invisible corners from existing ones. Many examples, such as the Arc de Triomphe in Paris, Florence’s St. John baptistery at Santa Maria del Fiore Cathedral, and other monuments and towers from around the world are completely modeled from a small number of images taken by tourists.

1. Introduction

This paper addresses several interconnected issues: full automation versus partial automation, how to handle the inevitable occlusions and lack of features or texture, and the importance of high accuracy to constructing and documenting monuments and towers. We will address only image-based approaches. However, it is important to note that to achieve complete geometric details, range sensors will also be required for sculpted surfaces that are usually found on many monuments [1]. This requires the integration of the two types of data [2].

1.1. Full Automation versus Human Interaction

Three-dimensional modeling from images, when carried out entirely by a human, can be very time consuming and impractical for large-scale projects. Efforts to increase the level of automation are essential in order to broaden the use of this technology. So far, however, the efforts to completely automate the processing, from image capture to the output of a 3D model, are not always successful or applicable [3, 4]. Full automation has been achieved under certain conditions and up to finding point correspondence and camera positions and orientation [5]. Self-calibration and 3D construction still require a human in the loop either to specify constraints or to perform post processing [6, 7]. Also, some sacrifice to the accuracy and fidelity of the created model may result when using full automation (see section 1.3). Automated methods also rely on features that can be extracted automatically from the scene, thus occlusions and un-textured surfaces are problematic. We often end up with areas with too many features that are not all needed for modeling, and areas with no or too few features to produce a complete model. This means that post processing is often required, so user interaction is still needed. The most impressive results were achieved with highly interactive approaches, e.g. [8]. Some interactive approaches with automated features that take advantage of environment constraints proved effective [3, 9]. Other more automated techniques that target specific objects such as architecture [10, 11, 12] have also been developed. If the goal is creating accurate and complete 3D models of medium and large scale objects under practical situations using only information contained in images, then full automation is still in the future.

Full automation is a priority for certain applications such as navigation, telepresence, augmented reality, and where a model is needed fast for decision-making. In those applications, complete details and high accuracy are secondary. For other applications such as documentation and even virtual museums, full automation cannot excuse the missing details or lack of accuracy.

Proceedings of the First International Symposium on 3D Data Processing Visualization and Transmission (3DPVT’02)
0-7695-1521-5/02 $17.00 © 2002 IEEE
1.2. Occlusions and Lack of Texture

Three-dimensional measurement and modeling from images obviously requires that relevant points be visible in the image. This is often not possible, either because the points or region of interest are hidden or occluded behind an object or a surface, or because there is no mark, edge, or visual feature to extract. In fact, even without multiple objects in the scene and when we can take images from well planned positions, there are not many objects that can be imaged without having portions of their surfaces either invisible or without texture to extract. In objects such as architecture and monuments in their normal settings we are also faced with restrictions limiting the positions from which the images can be taken. Also, illumination variations and shadows hamper feature extraction. Not only do those factors preclude the modeling of occluded parts, they also have a negative effect on the modeling of visible parts, for example when applying automatic matching.

1.3. Accuracy of 3D Modeling

Historic monuments and towers are particularly important and thus need to be constructed with high accuracy both for documentation and visualization purposes. To achieve the needed accuracy, one must use the most rigorous approach for 3D modeling from images rather than the simplest or easiest to implement. Tests showed that methods based on projective geometry, although an elegant and efficient approach, result in geometric errors in the range of 4 to 5% [6, 13]. This means that a 20-meter tower could have a significant 1-meter error. Photogrammetric methods such as bundle adjustment and proper camera calibration [14, 15, 16], although interactive and not as easy to use as projective methods, give several orders of magnitude smaller error, in the range of 0.01-0.001% on well defined features, depending on camera resolution and lens quality.

2. Outline Of The Approach

Our approach is photogrammetry-based. In order to increase the level of automation, the process takes advantage of properties found in monuments and towers. For example those structures usually have:

• Well defined surface shapes
• Well defined openings such as archways
• Regular blocks attached to flat surfaces
• Many symmetric sections
• Columns with known shape

The approach does not aim to fully automate the procedure nor completely rely on a human operator, for reasons discussed in section 1.1 above. It provides a sufficient level of automation to assist the operator without sacrificing accuracy or level of details. Figure 1 summarizes the procedure and indicates which step is interactive and which is automatic (interactive operations are grayed). Images are taken, all with the same camera setup, from positions where the object is suitably showing. Parts of the object should appear in two or more images when possible, and there should be a reasonable distance, or baseline, between the images. Several features appearing in multiple images are interactively extracted from the images, usually 12-15 per image. The user points to a corner and labels it with a unique number, and the system accurately extracts the corner point. The Harris operator [17] is used for its simplicity and efficiency. Image registration and 3D coordinate computation are based on the photogrammetric bundle adjustment approach for its accuracy, flexibility, and effectiveness [18] compared to other structure from motion techniques. Advances in bundle adjustment eliminated the need for control points or initial approximate coordinates. Many other aspects required for high accuracy, such as camera calibration with full distortion correction, have long been solved problems in Photogrammetry [16] and will not be discussed in the remainder of the paper.

Figure 1. Simplified diagram of the procedure. Interactive operations are grayed.

We now have all camera coordinates and orientations and the 3D coordinates of a set of initial points, all in the same global coordinate system. The next interactive operation is to divide the scene into connected segments to define the surface topology. An automatic corner extractor, again the Harris operator, is used and a matching procedure is applied across the images to add more points into each of the segmented regions. The matching is constrained by the epipolar condition and disparity range computed from the 3D coordinates of the initial points. The bundle adjustment is repeated with the newly added points to improve on previous results and re-compute the 3D coordinates of all points.
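
The constrained matching step can be illustrated with a small sketch. It is not the authors' implementation: it assumes that a fundamental matrix F relating the two images is already available (for instance derived from the registered cameras), it treats disparity simply as the horizontal offset between the two image positions, and the names `epipolar_line`, `distance_to_line`, `constrained_candidates`, and the sample values are hypothetical. The sketch only filters candidate corners; choosing among the survivors (for example by correlation) is left out.

```python
import math


def epipolar_line(F, pt):
    """Epipolar line (a, b, c) in image 2 for pt = (x, y) in image 1: l = F * [x, y, 1]^T."""
    x, y = pt
    return tuple(F[r][0] * x + F[r][1] * y + F[r][2] for r in range(3))


def distance_to_line(line, pt):
    """Perpendicular pixel distance from pt = (x, y) to the line a*x + b*y + c = 0."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * pt[0] + b * pt[1] + c) / norm


def constrained_candidates(F, pt, corners, max_dist, disparity_range):
    """Keep corners of image 2 near the epipolar line of pt and inside the disparity range."""
    line = epipolar_line(F, pt)
    d_min, d_max = disparity_range
    kept = []
    for corner in corners:
        # Assumption: disparity is the horizontal offset between the two image positions.
        disparity = corner[0] - pt[0]
        if d_min <= disparity <= d_max and distance_to_line(line, corner) <= max_dist:
            kept.append(corner)
    return kept


# Hypothetical rectified image pair: epipolar lines coincide with image rows (y' = y).
F_RECTIFIED = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
corners_2 = [(90.0, 50.4), (60.0, 50.0), (95.0, 58.0)]
matches = constrained_candidates(F_RECTIFIED, (100.0, 50.0), corners_2,
                                 max_dist=1.5, disparity_range=(-20.0, 0.0))
```

In this made-up example only the first corner survives: the second falls outside the disparity range and the third lies 8 pixels from the epipolar line. Restricting the search this way is what keeps the number of false matches low before the bundle adjustment is repeated.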

We now need to add more points in order to reconstruct un-textured surfaces and those that are occluded. Subdivision techniques [19] are used to add points on free-form shapes where some seed points are available. The points are then projected onto the images in order to determine texture coordinates. This results in a smooth appearance of sculptured surfaces.
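
The projection of newly generated 3D points onto an image to obtain texture coordinates can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described here: it uses an ideal pinhole camera looking along its +z axis, ignores lens distortion, expresses the focal length in pixels, and normalizes texture coordinates by the image size. The function names and the numbers in the example are hypothetical.

```python
def project_point(X, R, C, f, cx, cy):
    """Project 3D point X into an image with rotation R, camera centre C, focal length f (pixels)."""
    d = [X[i] - C[i] for i in range(3)]
    # Rotate the offset into the camera frame (assumed convention: camera looks along +z).
    xc, yc, zc = [sum(R[r][i] * d[i] for i in range(3)) for r in range(3)]
    if zc <= 0.0:
        raise ValueError("point is behind the camera")
    return (f * xc / zc + cx, f * yc / zc + cy)


def texture_coordinates(points, R, C, f, cx, cy, width, height):
    """Normalized (u, v) texture coordinates of 3D points in one image (no distortion model)."""
    coords = []
    for X in points:
        u, v = project_point(X, R, C, f, cx, cy)
        coords.append((u / width, v / height))
    return coords


# Hypothetical camera at the origin with identity rotation and a 640 x 480 image.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
uv = texture_coordinates([(0.5, -0.25, 2.0)], IDENTITY, (0.0, 0.0, 0.0),
                         f=1000.0, cx=320.0, cy=240.0, width=640, height=480)
```

Here the sample point projects to pixel (570, 115). In practice the calibrated camera parameters from the bundle adjustment, including the distortion correction, would replace the idealized values used in this sketch.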

Since many parts of the scene will show only in one image, an approach to extract 3D information from a single image is necessary [20]. Our approach applies the equation of the surface as a constraint, along with the camera parameters, to the single-image coordinates to compute the corresponding 3D coordinates. For example, in many monuments and towers the walls are planes that are either parallel or perpendicular to each other. The equations of some of the planes can be determined from seed points previously measured. The remaining plane equations are determined using the knowledge that they are either perpendicular or parallel to one of the planes already determined. With little effort, the equations of all the planes on the structure can be computed. From these equations and the known camera parameters for each image, we can determine the 3D coordinates of any point or pixel from a single image. This can also be applied to surfaces like quadrics or cylinders whose equations can be computed from existing points. Other constraints, such as symmetry and points with the same depth or same height, are also used.

The general rule for adding points on cylinders or columns, arches, and blocks and for generating points in occluded or symmetrical parts is to do the work in the 3D space, like in a CAD system, to find the new points, then project them on the images using the internal and external camera parameters. The texture images are edited afterwards to remove the occluding objects and replace them with texture from current or other images. We specifically designed features in the approach to automatically add columns, arches, and blocks. The cylinder is constructed after its direction and approximate radius and position have been automatically determined from four seed points (figure 2-a) using quadric surface formulation [21]. The ratio between the upper and the lower circle can be set in advance. It is set to less than 1.0 (about 0.85) to create a tapered column. From this information, points on the top and bottom circle of the column (figure 2-b) can be automatically generated in 3D, resulting in a complete solid model.

Figure 2. Left (a) 4 seed points are extracted on the base and crown of the column, right (b) column points are added automatically.

Arches are constructed by first fitting a plane to seed points on the wall (figure 3-a). An edge detector is applied to the region (figure 3-b) and points at constant interval along the arch are sampled. For edge detection, a specially designed morphological operator was developed (a variation on [22]). Using the image coordinates of these points (in one image only), the known image parameters (from the bundle adjustment), and the equation of the plane, the 3D coordinates are computed (figure 4).

Figure 3. Left (a) shows seed points extracted to fit a plane, right (b) shows edge detector.

It often happens that only part of a monument section, which we will call a block, is visible. For example, in figure 5 the bottom part of the block where it meets another block is not visible and needs to be measured in order to reconstruct the whole block. To solve this problem, we first extract the visible corners from several images and compute their 3D coordinates. We then fit a plane to the top of the base block, using the gray points in figure 5, then project a normal to the plane from each of the corners of the block attached to it (the white points). The intersection of each normal with the plane produces a new point (a black point in figure 5) automatically. We now have sufficient points to fully construct the block. More details of the procedure are given in the following examples.

Figure 4. Results of automatic point extraction on corners and selected edges (arches).

Figure 5. Constructing blocks.

3. Examples

Over the past year, members of our group visited different cities around the world. Whenever possible, they took images covering various interesting monuments. The images were taken during routine tours without any advance planning of where to take the images. We took the images just like any typical tourist, by walking around the monument and getting the best view under real conditions such as presence of other tourists, vehicles, and other buildings and objects. Several types of digital cameras and regular film cameras (where the film was digitized later) were used. The results were very encouraging and compelling. Over 100 models were created using this approach, each one usually in 1-2 days of work by one person. The number of points and level of interaction and automation obviously varied significantly from one model to another. Usually between 500 and 3000 points were needed, at least 80% of which were generated automatically. Eight examples are presented here (they and several more are on the web [23]), each to illustrate a specific feature. They are presented in wire-frame, solid model without texture, and solid model with texture, in figures 6 to 13. In some of the monuments, we found dimensional information available in travel or history books. This information was not used or needed in the model construction, but was valuable in evaluating the accuracy.

Figure 6 shows the Arc de Triomphe in Paris. The Olympus C3030 digital camera (3.1 Mega-pixels) was used (14 images). The arc measures 45 m x 22 m, as indicated in some tourist guides (height varied from one source to another, thus it was not used for evaluation). We used one distance (the 22 m width of one side) to scale our model. From the model, the dimensions on the four sides were: 22 m (fixed for scale), 22.06 m, 44.85 m, and 44.89 m. This gives an error of 0.28%. One should point out that the given dimensions are probably rounded off.

Figure 6. Arc de Triomphe, Paris (14 images). Illustrates automatic arches.

Figure 7. St. John Baptistery, Florence (8 images). Illustrates automatic blocks.

The next example is the St. John baptistery in Florence (figure 7). The Olympus E-10 (4 Mega pixels) camera was used to take eight images. The baptistery has eight sides. The actual dimensions were obtained from a plan in a book. The sides average about 13 m in length. Again we assign 13 m to one side and use it to scale the whole model. The average difference between the model sides and the actual sides is less than 1 cm, or 0.07%. This is significantly better than the accuracy of the Arc de Triomphe (figure 6). This is due to the better camera used (higher resolution, larger pixel size, and better quality lens) and smaller size object with good feature definition.
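
The accuracy figures quoted for these two monuments can be checked with a few lines of arithmetic. The paper does not state how the 0.28% value was obtained; the sketch below assumes it is the mean relative error over the three Arc de Triomphe sides that were not fixed for scale, which happens to reproduce the quoted number. The function name `relative_errors_percent` is hypothetical.

```python
def relative_errors_percent(measured, reference):
    """Relative error of each measured length against its reference length, in percent."""
    return [abs(m - r) / r * 100.0 for m, r in zip(measured, reference)]


# Arc de Triomphe: the three sides not used for scaling, against the 22 m x 45 m guide values.
arc_model = [22.06, 44.85, 44.89]
arc_guide = [22.0, 45.0, 45.0]
arc_errors = relative_errors_percent(arc_model, arc_guide)
arc_mean_error = sum(arc_errors) / len(arc_errors)  # about 0.28 (assumed averaging)

# Baptistery: a 1 cm difference on a 13 m side is an upper bound on the reported error.
baptistery_bound = 0.01 / 13.0 * 100.0  # about 0.077
```

Under this reading the Arc de Triomphe sides are off by roughly 0.27%, 0.33%, and 0.24%, and a difference slightly under 1 cm on a 13 m side is consistent with the 0.07% reported for the baptistery.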

Figure 8. The WWII monument, Quebec City (6 images). Illustrates automatic blocks.

Figure 9. Monument to Galileo, Padova (5 images). Illustrates automatic irregular blocks.

Figure 10. G. Poggi Tower, Florence (8 images). Illustrates automatic arches.

Figure 11. Modern Monument in Dublin (5 images). Illustrates automatic spheres.

The monument shown in figure 8 consists mostly of blocks, including the steps. After extracting the visible corners, all remaining points needed for complete reconstruction of the monument were easily added using the block approach described in section 2. Figure 9 shows a relatively uncomplicated monument. Corners of the main structures are first extracted and plane equations of each surface are computed. Sculptured details that are attached to the surfaces are added by automatically extracting the topmost points on the sculptures, applying our constrained matching technique to compute their 3D coordinates, then projecting a normal from each to the plane to which they are attached. The tower shown in figure 10 includes three arch-shaped openings. Points on these arches are automatically measured using the procedure illustrated in figures 3 and 4. The inside points of the arches, even though they do not appear in any image, were measured by intersecting the outside points with the back plane along the normal to that plane. Figure 11 shows a modern monument in Dublin. Only 5 points were measured interactively on the sphere, then a sphere equation was fitted to these points and 1000 more points were added automatically. The examples shown in figures 12 and 13 illustrate automatic modeling of columns, cylinders, steps and blocks.

Figure 12. Trinity College building, Dublin (2 images). Illustrates automatic columns and steps.

Figure 13. San Giacomo dell’Orio, Venice (6 images). Illustrates automatic cylinders.

4. Conclusion

A semi-automatic approach for constructing medium and large-scale objects, mainly monuments and towers, was presented. Several representative examples from images taken by tourists were given. Parts of the process that can straightforwardly be performed by humans, such as registration, extracting seed points, and topological segmentation, remain interactive. Numerous details plus the occluded and the un-textured parts are added automatically by taking advantage of some of the object characteristics and making some realistic assumptions. Efforts to automate the whole procedure are continuing and will undoubtedly intensify in the future. In the meantime, in order to achieve immediate and useful results, parts of the process necessitate human interaction.

5. Acknowledgements

Our colleagues François Blais and Eric Paquet took many of the images. Emily Whiting constructed some models.

6. References

[1] Beraldin, J.-A., F. Blais, L. Cornouyer, M. Rioux, S.F. El-Hakim, R. Rodella, F. Bernier, N. Harrison, “3D imaging system for rapid response on remote sites”, IEEE Proc. of 2nd Int. Conf. on 3D Digital Imaging and Modeling (3DIM’99), pp. 34-43, 1999.
[2] El-Hakim, S.F., “3D modeling of complex environments”, In Proc. Videometrics and Optical Methods for 3D Shape Measurement, San Jose, California, SPIE Vol. 4309, pp. 162-173, Jan. 2001.
[3] Shum, H.Y., R. Szeliski, S. Baker, M. Han, P. Anandan, “Interactive 3D modeling from multiple images using scene regularities”, European Workshop 3D Structure from Multiple Images of Large-scale Environments, SMILE (ECCV’98), pp. 236-252, 1998.
[4] Oliensis, J., “A critique of structure-from-motion algorithms”, Computer Vision and Image Understanding, 80(2), pp. 172-214, 2000.
[5] Pollefeys, M., R. Koch, M. Vergauwen, L. Van Gool, “Hand-held acquisition of 3D models with a video camera”, IEEE Proc. of 2nd Int. Conf. on 3D Digital Imaging and Modeling (3DIM’99), pp. 14-23, 1999.
[6] Pollefeys, M., R. Koch, L. Van Gool, “Self-calibration and metric reconstruction in spite of varying and unknown intrinsic camera parameters”, International J. of Computer Vision, 32(1), pp. 7-25, 1999.
[7] Faugeras, O., L. Robert, S. Laveau, G. Csurka, C. Zeller, C. Gauclin, I. Zoghlami, “3-D reconstruction of urban scenes from image sequences”, Computer Vision and Image Understanding, 69(3), pp. 292-309, 1998.
[8] Debevec, P., C.J. Taylor, J. Malik, “Modeling and rendering architecture from photographs: A hybrid geometry and image-based approach”, SIGGRAPH’96, pp. 11-20, 1996.
[9] Gruen, A., “Semi-automatic approaches to site recording and modeling”, International Archives of Photogrammetry and Remote Sensing, Volume 33, Part B5A, Commission V, pp. 309-318, Amsterdam, July 16-23, 2000.
[10] Liebowitz, D., A. Criminisi, A. Zisserman, “Creating Architectural Models from Images”, EUROGRAPHICS ’99, 18(3), 1999.
[11] Tarini, A., P. Cignoni, C. Rocchini, R. Scopigno, “Computer assisted reconstruction of buildings from photographic data”, Vision, Modeling and Visualization 2000 Conference Proc., pp. 213-220, Saarbrucken, DE, November 2000.
[12] Dick, A.R., P.H. Torr, S.J. Ruffle, R. Cipolla, “Combining single view recognition and multiple view stereo for architectural scenes”, Proc. 8th IEEE International Conference on Computer Vision (ICCV'01), pp. 268-274, July 2001.
[13] Georgis, N., M. Petrou, J. Kittler, “Error guided design of a 3D vision system”, IEEE Trans. PAMI, 20(4), pp. 366-379, 1998.
[14] Brown, D.C., “The bundle adjustment - Progress and prospective”, International Archives of Photogrammetry, 21(3): 33 pages, ISP Congress, Helsinki, Finland, 1976.
[15] Fraser, C.S., “Network design considerations for non-topographic photogrammetry”, Photogrammetric Engineering & Remote Sensing, 50(8), pp. 1115-1126, 1984.
[16] Fraser, C.S., “Digital camera self-calibration”, ISPRS Journal for Photogrammetry and Remote Sensing, 52(4), pp. 149-159, 1997.
[17] Harris, C., M. Stephens, “A combined corner and edge detector”, Proc. 4th Alvey Vision Conf., pp. 147-151, 1988.
[18] Triggs, W., P. McLauchlan, R. Hartley, A. Fitzgibbon, “Bundle Adjustment for Structure from Motion”, in Vision Algorithms: Theory and Practice, Springer-Verlag, 2000.
[19] Zorin, D.N., Subdivision and Multiresolution Surface Representation, Ph.D. Thesis, Caltech, California, 1997.
[20] van den Heuvel, F.A., “3D reconstruction from a single image using geometric constraints”, ISPRS Journal Photogrammetry & Remote Sensing, 53(6), pp. 354-368, 1998.
[21] Zwillinger, D. (ed.), Standard Mathematical Tables, 30th Edition, CRC Press, Inc., West Palm Beach, Florida, pp. 311-316, 1996.
[22] Lee, J., R. Haralick, L. Shapiro, “Morphologic edge detection”, IEEE Journal of Robotics and Automation, 3(2), pp. 142-156, 1987.
[23] http://www.vit.iit.nrc.ca/elhakim/home.html
