
Computing in Civil Engineering 2019
Data, Sensing, and Analytics

Selected Papers from the ASCE International Conference on Computing in Civil Engineering 2019
Atlanta, GA
June 17–19, 2019

Edited by Yong K. Cho, Ph.D.; Fernanda Leite, Ph.D.; Amir Behzadan, Ph.D.; and Chao Wang, Ph.D.

SPONSORED BY
Computing Division of the
American Society of Civil Engineers

EDITED BY
Yong K. Cho, Ph.D.
Fernanda Leite, Ph.D.
Amir Behzadan, Ph.D.
Chao Wang, Ph.D.

Published by American Society of Civil Engineers
1801 Alexander Bell Drive
Reston, Virginia, 20191-4382
www.asce.org/publications | ascelibrary.org

Any statements expressed in these materials are those of the individual authors and do not
necessarily represent the views of ASCE, which takes no responsibility for any statement
made herein. No reference made in this publication to any specific method, product, process,
or service constitutes or implies an endorsement, recommendation, or warranty thereof by
ASCE. The materials are for general information only and do not represent a standard of
ASCE, nor are they intended as a reference in purchase specifications, contracts, regulations,
statutes, or any other legal document. ASCE makes no representation or warranty of any
kind, whether express or implied, concerning the accuracy, completeness, suitability, or
utility of any information, apparatus, product, or process discussed in this publication, and
assumes no liability therefor. The information contained in these materials should not be used
without first securing competent advice with respect to its suitability for any general or
specific application. Anyone utilizing such information assumes all liability arising from such
use, including but not limited to infringement of any patent or patents.

ASCE and American Society of Civil Engineers—Registered in U.S. Patent and Trademark
Office.

Photocopies and permissions. Permission to photocopy or reproduce material from ASCE publications can be requested by sending an e-mail to [email protected] or by locating a title in ASCE's Civil Engineering Database (http://cedb.asce.org) or ASCE Library (http://ascelibrary.org) and using the "Permissions" link.

Errata: Errata, if any, can be found at https://doi.org/10.1061/9780784482438

Copyright © 2019 by the American Society of Civil Engineers.
All Rights Reserved.
ISBN 978-0-7844-8243-8 (PDF)
Manufactured in the United States of America.

Preface

The ASCE International Conference on Computing in Civil Engineering (i3CE) 2019 was hosted by the Georgia Institute of Technology with sponsorship from ASCE's Computing Division and held in Atlanta, Georgia, from June 17–19, 2019. The conference is the Computing Division's major meeting event and is held biennially in the United States, with participation from scholars worldwide. i3CE 2019 aimed to present current research in the area of computing in civil engineering, attracting strong and active researchers and audiences through keynote sessions, dedicated topic sessions, industry sessions, and technical committee meetings.

The 2019 Conference, as a standalone event, received 454 abstracts, 288 full papers, and 58 extended abstracts for the poster and demonstration sessions. A total of 230 full papers from 26 countries around the globe were accepted and included in the proceedings. The final set of papers was selected through a rigorous peer-review process, which involved collecting at least two blinded reviews per paper. The review process was applied to both abstracts and full papers, ensuring that only the best contributions were selected. Finally, the authors had the chance to incorporate reviewers' comments into the final versions. We are very pleased with the high quality of the selected papers, and we wish to thank both authors and reviewers for their efforts. All papers were divided into three books with the following three research focus areas:

• Visualization, Information Modeling, and Simulation
• Data, Sensing, and Analytics
• Smart Cities, Sustainability, and Resilience

Organizing this conference has been possible only with the support of many. We are particularly grateful to the School of Civil and Environmental Engineering at the Georgia Institute of Technology for their support and infrastructure. We would also like to thank the Computing Division's Executive Committee for their guidance and ASCE for their assistance.

We hope that you enjoyed the technical sessions, posters, demonstrations, technical committee
meetings, and industry panel discussion during the conference and that you had a memorable and
meaningful i3CE 2019 experience in Atlanta, Georgia.

Yong Cho, Ph.D.
Conference Chair, Organizing Committee, Georgia Institute of Technology

Fernanda Leite, Ph.D.
Conference Vice Chair, Organizing Committee, University of Texas at Austin

Amir Behzadan, Ph.D.
Chair, Technical Committee, Texas A&M University

Chao Wang, Ph.D.
Manager, Technical Committee Proceedings, Louisiana State University


Acknowledgments

The members of the Organizing Committee are recognized for their dedication, support, and contributions to the success of the 2019 ASCE International Conference on Computing in Civil Engineering.

Yong K. Cho, Ph.D.
Conference Chair, Georgia Institute of Technology

Fernanda Leite, Ph.D.
Conference Vice Chair, University of Texas at Austin

Amir Behzadan, Ph.D.
Technical Committee Chair, Texas A&M University

Chao Wang, Ph.D.
Technical Committee Proceedings Manager, Louisiana State University

Special thanks are due to the following Poster Session Chair, Demo Session Chair, Best Paper Award Committee Chair, and Student Assistants for their help with the related processes.

Poster Session Chair:
Semiha Ergan, New York University

Demo Session Chair:
Fei Dai, West Virginia University

Best Paper Award Committee Chair:
Chao Wang, Louisiana State University

Student and Postdoc Assistants:
Kinam Kim, Georgia Institute of Technology
Jitae Kim, Georgia Institute of Technology
Jisoo Park, Georgia Institute of Technology
Jingdao Chen, Georgia Institute of Technology
Pileun Kim, Georgia Institute of Technology
Rachel Samuels, Georgia Institute of Technology
Youjin Jang, Ph.D., Georgia Institute of Technology
Inbae Jeong, Ph.D., Georgia Institute of Technology


The Organizing Committee acknowledges the support of the technical committee session chairs and
members who helped in the peer-review and selection process of the articles that are part of these
proceedings.
Technical Committee (Session Chairs):

Artificial Intelligence and Machine Learning:
Abbas Rashidi, West Virginia University
Reza Akhavian, California State University, East Bay
Youngjib Ham, Texas A&M University

Building and Urban Systems:
Carol Menassa, University of Michigan
Zoltan Nagy, University of Texas, Austin

Data Sensing:
Hubo Cai, Purdue University
Pingbo Tang, Arizona State University
Yong-Cheol Lee, Louisiana State University

Disaster Preparedness and Response:
Masoud Gheisari, University of Florida
Nan Li, Tsinghua University

Facilities and Blockchain:
Sharareh Kermanshachi, University of Texas, Arlington

Information Modeling:
Farrokh Jazizadeh, Virginia Tech
Fei Dai, West Virginia University
Semiha Ergan, New York University

Infrastructure and Energy:
Rui Liu, University of Florida
Zhenhua Zhu, Concordia University

Professional Issues:
Amirhosein Jafari, Louisiana State University

Project and Construction Management:
Mohammed Mehany, Colorado State University
Youngcheol Kang, Yonsei University

Reality Capture:
Behzad Esmaeili, George Mason University
Dong Zhao, Michigan State University

Robotics and Human-Computer Interaction:
Jiansong Zhang, Purdue University
Joseph Louis, Oregon State University

Safety and Health:
Jun Wang, Mississippi State University
Ryan Ahn, Texas A&M University

Simulation:
Wenying Ji, George Mason University

Visualization:
Steven Ayer, Arizona State University

Special Topics:
Chen Feng, New York University
Lina Sela, University of Texas, Austin
Qianwei Xu, Tongji University
Yang Wang, Georgia Institute of Technology

The Organizing Committee would also like to thank the following members of the ASCE Computing Division Executive Committee, the Local Conference Committee, and the Conference Event Planning and Registration Administrator team for their assistance and support toward the success of the conference.

Executive Committee:
Burcin Becerik-Gerber, University of Southern California
Lucio Soibelman, University of Southern California
Nora M. El-Gohary, University of Illinois at Urbana-Champaign
Heather M. Brooks, CTA Liaison
Yong K. Cho, Georgia Institute of Technology
Robert Keith Goldberg, University of Akron
Christian Koch, Bauhaus-Universität Weimar
Ken-Yu Lin, University of Washington
James M. Neckel, ASCE Administrator, Technical Advancement

Local Conference Committee:
John E. Taylor, Georgia Institute of Technology
Eric Marks, Georgia Institute of Technology
Yi-Chang James Tsai, Georgia Institute of Technology
Yang Wang, Georgia Institute of Technology
Javier Irizarry, Georgia Institute of Technology

Conference Event Planning and Registration Administrator:
Robin Finey, Global Learning Center at Georgia Institute of Technology
Catherin Shaw, Global Learning Center at Georgia Institute of Technology
Jamia Luckett, Georgia Institute of Technology

Finally, sincere appreciation goes to EasyChair Ltd. for providing the Organizing Committee free access to EasyChair's Conference Management Software System and for customizing the online platform for the conference.


Contents
Big Data and Machine Learning

Automated Activity Recognition of Construction Equipment Using a Data Fusion Approach .......... 1
Behnam Sherafat, Abbas Rashidi, Yong-Cheol Lee, and Changbum R. Ahn

Estimating Commuting Patterns from High Resolution Phone GPS Data .......... 9
Bita Sadeghinasr, Armin Akhavan, and Qi Wang

Reference Signal-Based Method to Remove Respiration Noise in Electrodermal Activity (EDA) Collected from the Field .......... 17
Gaang Lee, Byungjoo Choi, Houtan Jebelli, Changbum Ryan Ahn, and SangHyun Lee

A Machine Learning Framework to Identify Employees at Risk of Wage Inequality: U.S. Department of Transportation Case Study .......... 26
Hamid R. Karimian, Behzad Rouhanizadeh, Amirhosein Jafari, and Sharareh Kermanshachi

Integrating Positional and Attentional Cues for Construction Working Group Identification: A Long Short-Term Memory Based Machine Learning Approach .......... 35
Jiannan Cai, Yuxi Zhang, and Hubo Cai

Neuro Fuzzy Inference Systems for Estimating Normal Concrete Mixture Proportions .......... 43
Jorge L. Santamaria, Luis Morales, and Paulina Lima

Evaluation of Machine Learning Algorithms for Worker’s Motion Recognition Using Motion Sensors .......... 51
Kinam Kim, Jingdao Chen, and Yong K. Cho

Pressure Transient Detection and Pattern Discovery in Water Distribution Systems .......... 59
Lu Xing and Lina Sela

Deep Learning Models for Content-Based Retrieval of Construction Visual Data .......... 66
Nipun D. Nath and Amir H. Behzadan

Unsupervised Machine Learning for Augmented Data Analytics of Building Codes .......... 74
Ruichuan Zhang and Nora El-Gohary

A Data-Driven and Physics-Based Approach to Exploring Interdependency of Interconnected Infrastructure .......... 82
Shenghua Zhou, S. Thomas Ng, Yifan Yang, Frank Jun Xu, and Dezhi Li

Machine Learning Based Automatic Concrete Microstructure Analysis: A Study on Effect of Image Magnification .......... 89
Srikanth Sagar Bangaru and Chao Wang

Artificial Neural Network for Semantic Segmentation of Built Environments for Automated Scan2BIM .......... 97
Yeritza Perez-Perez, Mani Golparvar-Fard, and Khaled El-Rayes

Historical Accident and Injury Database-Driven Audio-Based Autonomous Construction Safety Surveillance .......... 105
Yiyi Xie, Yong-Cheol Lee, Moeid Shariatfar, Zhongjie "Doc" Zhang, Abbas Rashidi, and Hyun Woo Lee

Business Failure Prediction with LSTM RNN in the Construction Industry .......... 114
Youjin Jang, In-Bae Jeong, Yong K. Cho, and Yonghan Ahn

Identifying Patterns in Design-Build Projects in Terms of Project Cost Performance .......... 122
Yunping Liang, Baabak Ashuri, and Wei Sun

Reality Capture Technologies (LiDAR, RGB-D, Vision)

Reconstruction of Wind Turbine Blade Geometry and Internal Structure from Point Cloud Data .......... 130
Benjamin Tasistro-Hart, Tristan Al-Haddad, Lawrence C. Bank, and Russell Gentry

Falling Objects Detection for Near Miss Incidents Identification on Construction Site .......... 138
Chengqian Li and Lieyun Ding

Fast Dataset Collection Approach for Articulated Equipment Pose Estimation .......... 146
Ci-Jyun Liang, Kurt M. Lundeen, Wes McGee, Carol C. Menassa, SangHyun Lee, and Vineet R. Kamat

Emerging Construction Technologies: State of Standard and Regulation Implementation .......... 153
Ifeanyi Okpala, Chukwuma Nnaji, and Ibukun Awolusi

Exploring the Potential of Image-Based 3D Geometry and Appearance Reasoning for Automated Construction Progress Monitoring .......... 162
Jacob J. Lin, Jae Yong Lee, and Mani Golparvar-Fard

Dimensional Quality Inspection of Prefabricated MEP Modules with 3D Laser Scanning .......... 171
Jingjing Guo and Qian Wang

Segmentation Approach to Detection of Discrepancy between As-Built and As-Planned Structure Images on a Construction Site .......... 178
Juhyeon Bae and SangUk Han

Development of Massive Point Cloud Data Geoprocessing Framework for Construction Site Monitoring .......... 185
Minh Hieu Nguyen, Sanghyun Yoon, Sangyoon Park, and Joon Heo

Spatial Change Tracking of Structural Elements of a Girder Bridge under Construction Using 3D Point Cloud .......... 193
Sudip Subedi, Vamsi Kalasapudi, and Nipesh Pradhananga

A 3D Irregular Packing Algorithm Using Point Cloud Data .......... 201
Yinghui Zhao and Carl T. Haas

Visual-Semantic Alignments for Automated Interpretation of 3D Imagery Data of High-Pier Bridges .......... 209
Zhe Sun, Pingbo Tang, Ying Shi, and Wen Xiong

Robotics, Automation, and Control

4D BIM Based Optimal Flight Planning for Construction Monitoring Applications Using Camera-Equipped UAVs .......... 217
Amir Ibrahim and Mani Golparvar-Fard

Selective Deconstruction Programming for Adaptive Reuse of Buildings .......... 225
Benjamin Sanchez, Christopher Rausch, and Carl Haas

Task Allocation and Route Planning for Robotic Service Networks with Multiple Depots in Indoor Environments .......... 233
Bharadwaj R. K. Mantha and Borja Garcia de Soto

Vision-Based Excavator Activity Recognition and Productivity Analysis in Construction .......... 241
Chen Chen, Zhenhua Zhu, Amin Hammad, and Walid Ahmed

The Design of Future Robotic Construction Lab .......... 249
C. H. Yang, T. H. Wu, and S. C. Kang

Digital Twins as the Next Phase of Cyber-Physical Systems in Construction .......... 256
C. Kan and C. J. Anumba

Semantic Relation Detection between Construction Entities to Support Safe Human-Robot Collaboration in Construction .......... 265
Daeho Kim, Ankit Goyal, Alejandro Newell, SangHyun Lee, Jia Deng, and Vineet R. Kamat

Assessments of Intuition and Efficiency: Remote Control of the End Point of Excavator in Operational Space by Using One Wrist .......... 273
Dong-ik Sun, Sang-keun Lee, Yong-seok Lee, Sang-ho Kim, Jun Ueda, Yong K. Cho, Yong-han Ahn, and Chang-soo Han

Real-Time Hazard Proximity Detection—Localization of Workers Using Visual Data .......... 281
Idris Jeelani, Hariharan Ramshankar, Kevin Han, Alex Albert, and Khashayar Asadi

A Computational Framework for Characterizing Multiple Object Tracking Methods in Construction Field Applications .......... 290
Jiawei Chen and Pingbo Tang

Sequential Pattern Learning of Visual Features and Operation Cycles for Vision-Based Action Recognition of Earthmoving Excavators .......... 298
Jinwoo Kim, Seokho Chi, and Minji Choi

Modelling and Controlling Unmanned Excavation Equipment on Construction Sites .......... 305
Joo-sung Lee, Byeol Kim, Dong-ik Sun, Chang-soo Han, and Yong-han Ahn

Automatic Wall Defect Detection Using an Autonomous Robot: A Focus on Data Collection .......... 312
Jun Wang and Chaomin Luo

Real-Time Scene Segmentation Using a Light Deep Neural Network Architecture for Autonomous Robot Navigation on Construction Sites .......... 320
Khashayar Asadi, Pengyu Chen, Kevin Han, Tianfu Wu, and Edgar Lobaton

Vision-Based Obstacle Removal System for Autonomous Ground Vehicles Using a Robotic Arm .......... 328
Khashayar Asadi, Rahul Jain, Ziqian Qin, Mingda Sun, Mojtaba Noghabaei, Jeremy Cole, Kevin Han, and Edgar Lobaton

Planning and Execution for Geometrically Adaptive BIM-Driven Robotized Construction Processes .......... 336
Kurt M. Lundeen, Vineet R. Kamat, Carol C. Menassa, and Wes McGee

Enhancing Visual SLAM with Occupancy Grid Mapping for Real-Time Locating Applications in Indoor GPS-Denied Environments .......... 344
Lichao Xu, Chen Feng, Vineet R. Kamat, and Carol C. Menassa

Industrialized Construction: Emerging Methods and Technologies .......... 352
Mohamad Razkenari, Qi Bing, Andriel Fenner, Hamed Hakim, Aaron Costin, and Charles J. Kibert

Automating the Digital Fabrication of Concrete Formwork in Building Projects: Workflow and Case Example .......... 360
M. S. Fardhosseini, H. Abdirad, C. Dossick, H. W. Lee, R. DiFuria, and J. Lohr

State-of-the-Art Review on the Applicability of AI Methods to Automated Construction Manufacturing .......... 368
Mohsen Hatami, Ian Flood, Bryan Franz, and Xun Zhang

Game Simulation to Support Construction Automation in Modular Construction Using BIM and Robotics Technology—Stage I .......... 376
Oscar Wong Chong and Jiansong Zhang

UAV-UGV Cooperative 3D Environmental Mapping .......... 384
Pileun Kim, Leon C. Price, Jisoo Park, and Yong K. Cho

Deep Learning with Spatial Constraint for Tunnel Crack Detection .......... 393
Qingquan Li, Qin Zou, Jianghai Liao, Yuanhao Yue, and Song Wang

Automatic Review of Construction Specifications Using Natural Language Processing .......... 401
Seonghyeon Moon, Gitaek Lee, Seokho Chi, and Hyunchul Oh

Factors Influencing Measurement Accuracy of Unmanned Aerial Systems (UAS) and Photogrammetry in Construction Earthwork .......... 408
Xi Wang, Julia C. Chen, and Gabriel B. Dadi

Perceptions for Crane Operations .......... 415
Bo Xiao, Keith Yin Kong Lam, Jieyu Cui, and Shih-Chung Kang

An Improved Convolutional Neural Network System for Automatically Detecting Rebar in GPR Data .......... 422
Zhongming Xiang, Abbas Rashidi, and Ge (Gaby) Ou

Human-Technology Frontier, Sensing, and Computing

Key Attributes of Change Agents for Successful Technology Adoptions in Construction Companies: A Thematic Analysis .......... 430
Afiqah R. Radzi, Hashim R. Bokhari, Rahimi A. Rahman, and Steven K. Ayer

Improved Optimization Model for Finance-Based Scheduling .......... 438
Ahmed Shiha and Ossama Hosny

Identifying a Ranking Method for Assessing the Potential Risk of Knee Musculoskeletal Disorders among Roofers in Shingle Installation .......... 445
Amrita Dutta, Scott P. Breloff, Fei Dai, Erik W. Sinsel, Christopher M. Warren, and John Z. Wu

Optimizing Neighborhood-Scale Walkability .......... 454
Andrew J. Sonta and Rishee K. Jain

A Novel Method for Monitoring Air Speed in Offices Using Low Cost Sensors .......... 462
Ashrant Aryal, Ishan Shah, and Burcin Becerik-Gerber

Review of Human-in-the-Loop Cyber-Physical Systems (HiLCPS): The Current Status from Human Perspective .......... 470
Behnam Moshkini Tehrani, Jun Wang, and Chao Wang

A Deep Learning Framework for Construction Equipment Activity Analysis .......... 479
Carlos Hernandez, Trevor Slaton, Vahid Balali, and Reza Akhavian

Seeding Strategies in Online Social Networks for Improving Information Dissemination of Built Environment Disruptions in Disasters .......... 487
Chao Fan, Yucheng Jiang, and Ali Mostafavi

Overview of Supporting Technologies for Cyber-Physical Systems Implementation in the AEC Industry .......... 495
Daniel A. Linares, Chimay Anumba, and Nazila Roofigari-Esfahan

Investigation and Analysis of Human, Organizational, and Project Based Rework Indicators in Construction Projects .......... 505
Elnaz Safapour and Sharareh Kermanshachi

Development of Effective Communication Network in Construction Projects Using Structural Equation Modeling Technique .......... 513
Elnaz Safapour, Sharareh Kermanshachi, and Shirin Kamalirad

Localizing Local Vulnerabilities in Urban Areas Using Crowdsourced Visual Data from Participatory Sensing .......... 522
Hongjo Kim, Youngjib Ham, and Hyoungkwan Kim

Enhancing Construction Safety Monitoring through the Application of Internet of Things and Wearable Sensing Devices: A Review .......... 530
Ibukun Awolusi, Chukwuma Nnaji, Eric Marks, and Matthew Hallowell

Evaluating Generated Layouts in a Healthcare Departmental Adjacency Optimization Problem .......... 539
Jennifer I. Lather, Timothy Logan, Kate Renner, and John I. Messner

Thermal Comfort Aggregation Modeling Based on Social Science Theory: Towards a Comfort-Driven Cyber Human System Framework .......... 547
Lu Zhang and Shankar Sanake

Prototype Development of a Tactile Sensing System for Improved Worker Safety Perception .......... 555
Sayan Sakhakarmi, JeeWoong Park, and Chunhee Cho

Robustness Analysis of Design Phase Performance Predictors Using Extreme Bounds Analysis (EBA) .......... 563
Sharareh Kermanshachi and Behzad Rouhanizadeh

Biomechanical Analysis of Manual Material Handling Tasks on Scaffold .......... 572
Srikanth Sagar Bangaru, Chao Wang, and Fereydoun Aghazadeh

Development of Effective Communication Framework Using Confirmatory Factor Analysis Technique .......... 580
Thahomina Jahan Nipa, Sharareh Kermanshachi, and Shirin Kamalirad

Reliability and Validity of a Posture Matching Method Using Inertial Measurement Unit-Based Motion Tracking System for Construction Jobs .......... 589
Wonil Lee, Jia-Hua Lin, Stephen Bao, and Ken-Yu Lin

Investigating the Neurophysiological Effect of Thermal Environment on Individuals’ Performance Using Electroencephalogram .......... 598
Xi Wang, Da Li, Carol C. Menassa, and Vineet R. Kamat

Enhanced Welding Operator Quality Performance Measurement: Work Experience-Integrated Bayesian Prior Determination .......... 606
Yitong Li, Wenying Ji, and Simaan M. AbouRizk

Understanding Different Views on Emerging Technology Acceptance between Academia and the AEC/FM Industry .......... 614
Yong K. Cho, Youjin Jang, Kinam Kim, Fernanda Leite, and Steven Ayer


Automated Activity Recognition of Construction Equipment Using a Data Fusion Approach

Behnam Sherafat1; Abbas Rashidi2; Yong-Cheol Lee3; and Changbum R. Ahn4

1Ph.D. Student of Construction Engineering, Dept. of Civil and Environmental Engineering, Univ. of Utah, UT, USA. E-mail: [email protected]
2Assistant Professor, Dept. of Civil and Environmental Engineering, Univ. of Utah, UT, USA. E-mail: [email protected]
3Assistant Professor, Dept. of Construction Management, Louisiana State Univ., LA, USA. E-mail: [email protected]
4Associate Professor, Dept. of Construction Science, Texas A&M Univ., TX, USA. E-mail: [email protected]

ABSTRACT
Automated monitoring of construction operations, especially operations of equipment and machines, is an essential step toward cost estimating and planning of construction projects. In recent years, a number of methods have been suggested for recognizing the activities of construction equipment. These methods are based on processing a single type of data (audio, visual, or kinematic). Considering the complexity of construction jobsites, one source of data is not reliable enough to cover all conditions and scenarios. To address this issue, we utilized a data fusion approach based on collecting audio and kinematic data, which includes the following steps: 1) recording audio and kinematic data generated by machines; 2) preprocessing the data; 3) extracting time- and frequency-domain features; 4) fusing the features; and 5) categorizing activities using a machine learning algorithm. The proposed approach was implemented on multiple machines, and the experiments show that it is possible to obtain up to 25% more accurate results compared to cases using single data sources.
Keywords: Construction Equipment, Audio and Kinematic Data, Feature Fusion, Activity
Recognition, Machine Learning
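The five-step workflow listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the window length, the particular time-domain features (RMS energy, zero-crossing rate, mean, standard deviation), and the simple concatenation used for feature-level fusion are assumptions chosen for brevity.

```python
import numpy as np

def time_domain_features(window):
    """Extract simple time-domain features from one signal window."""
    rms = np.sqrt(np.mean(window ** 2))                  # signal energy
    zcr = np.mean(np.abs(np.diff(np.sign(window))) > 0)  # zero-crossing rate
    return np.array([rms, zcr, np.mean(window), np.std(window)])

def fuse_features(audio_window, kinematic_window):
    """Feature-level fusion: concatenate per-sensor feature vectors."""
    return np.concatenate([time_domain_features(audio_window),
                           time_domain_features(kinematic_window)])

# Two synthetic one-second windows sampled at 100 Hz.
rng = np.random.default_rng(0)
audio = rng.standard_normal(100)
kinematic = np.sin(np.linspace(0, 4 * np.pi, 100))

fused = fuse_features(audio, kinematic)
print(fused.shape)  # (8,)
```

In a full pipeline, one such fused vector per sliding window would then be fed to the classifier of step 5.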

INTRODUCTION
In the construction industry, it is well known that the ownership, rental, and maintenance costs of heavy equipment significantly impact the overall budget and schedule of projects (Cheng et al. 2017, Ahn et al. 2012). As a result, it is vital to continuously monitor the various operations of such equipment under different conditions. Manually recognizing these activities is time-consuming and costly, so researchers and practitioners have suggested automated techniques for handling this important task. Results of an automated activity recognition system could be further used to identify idle times, calculate productivity rates, and estimate cycle times for repetitive activities (Rezazadeh and McCabe 2011, Teizer et al. 2010). Within the last couple of years, several studies have been conducted on this topic. Three major methods have been utilized for automated activity detection on construction job sites: 1) computer vision-based methods (Yang et al. 2016, Golparvar-Fard et al. 2013); 2) kinematic-based methods (Akhavian and Behzadan 2012, Ahn et al. 2013, Kim et al. 2018); and 3) audio-based methods (Zhang et al. 2018, Cheng et al. 2017a, Sabillon et al. 2017, Cheng et al. 2017b). Because of the scope of this paper, recent audio- and kinematic-based studies are reviewed in more detail in the following paragraphs.

© ASCE
Computing in Civil Engineering 2019 2

Akhavian and Behzadan (2012) proposed a method that is capable of detecting the motion of
different parts of the equipment using magnetic field and tilt sensing, and creating 3D
simulations. This method can provide simulation models with accurate input data. In order to
improve their framework, Akhavian and Behzadan (2013) and Akhavian and Behzadan (2014)
used data fusion by merging weight, position, and orientation to detect equipment activities. Ahn
et al. (2013) proposed a method to detect engine-off, idling, and working activities of excavators
using accelerometer data. Recently, Kim et al. (2018) analyzed cabin rotation data recorded by
Inertial Measurement Units (IMUs), using a dynamic time warping technique to
detect the activities of excavators. The output of their method can determine the activity cycle time
of the excavator.

Fig. 1. The proposed hybrid kinematic-acoustic system for activity detection of construction equipment
Furthermore, a few studies have been conducted using audio data. Recently, Zhang et al.
(2018) identified six types of equipment sounds using Mel-Frequency Cepstral Coefficients
(MFCCs) and Hidden Markov Model (HMM) and obtained an accuracy of 94%. Cheng et al.
(2017a), Cheng et al. (2017b), and Sabillon et al. (2017) implemented activity detection methods
based on the audio recorded while the equipment operates. These methods use Short Time
Fourier Transform (STFT) of audio signals to extract frequency-domain features.
All of the aforementioned studies use either kinematic or audio data, and each approach has
its specific issues. For example, audio might not work efficiently on congested, noisy, and large
job sites, because the energy of an audio signal diminishes over longer distances. Kinematic-based
methods also have limitations; for example, some types of equipment (e.g., hand drills) do
not provide enough space to mount the sensors. To address these limitations, we propose an
automated construction activity detection method using both audio and kinematic sensors
simultaneously. The authors evaluated the applicability of this method on different job sites with
several types of equipment, such as excavators, loaders, and trucks, and obtained
very promising results.

RESEARCH METHODOLOGY
This paper proposes an automated method to detect various types of activities performed by
construction equipment using audio and kinematic signals. To achieve this goal, the authors
developed a binary classifier to distinguish major (value-adding) from minor (non-value-adding)
activities. Value-adding activities (e.g., moving soil) are directly related to the
productivity of the operation. On the other hand, non-value-adding activities (e.g., rotating the cabin,

maneuvering) support the performance of value-adding activities but do not directly
contribute to final productivity. The steps of the proposed system are shown in Fig. 1.

RECORDING DATA
In this research, two types of data are recorded. The first one is audio, which is recorded
using a single microphone placed outside the equipment cabin. The other type of data is
kinematic, which is recorded using acceleration and angular velocity sensors. Three axes (x, y,
and z) are recorded for each data type. The sampling frequencies of the audio and kinematic sensors are
44100 Hz and 100 Hz, respectively.

FEATURE EXTRACTION
In this step, different features are extracted in the time and frequency domains. Segments of
120 ms were chosen for feature extraction; previous studies have shown that this segment duration
provides sufficient time resolution. A Short Time Fourier Transform (STFT)
with a Hanning window (50% overlap) is used for feature extraction. Features are chosen
based on their performance in previous studies (Table 1); spectral features in particular have shown
good performance in detecting equipment activities from the sound of the equipment engine. All
of the features are extracted and fed into the machine learning model described in the SVM Model section.
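The framing and feature computations described above can be sketched as follows. This is a minimal illustration on a synthetic signal, not the authors' implementation: it computes a handful of the Table 1 features (RMS, STE, spectral centroid, entropy, roll-off, ZCR) over 120 ms Hanning-windowed frames with 50% overlap; the 25 STFT coefficients and the papers' exact feature definitions are not reproduced here.

```python
import numpy as np

def frame_signal(x, fs, frame_ms=120, overlap=0.5):
    """Split a 1-D signal into Hanning-windowed frames with 50% overlap."""
    n = int(fs * frame_ms / 1000)
    hop = int(n * (1 - overlap))
    win = np.hanning(n)
    return np.array([x[s:s + n] * win
                     for s in range(0, len(x) - n + 1, hop)])

def frame_features(frame):
    """A few of the Table 1 features for one windowed frame."""
    mag = np.abs(np.fft.rfft(frame))
    p = mag / (mag.sum() + 1e-12)                        # normalized spectrum
    bins = np.arange(len(mag))
    rms = np.sqrt(np.mean(frame ** 2))                   # Root Mean Square
    ste = np.sum(frame ** 2)                             # Short Time Energy
    sc = np.sum(bins * p)                                # Spectral Centroid (bin units)
    se = -np.sum(p * np.log2(p + 1e-12))                 # Spectral Entropy
    sro = np.searchsorted(np.cumsum(p), 0.85)            # 85% Spectral Roll-off bin
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2   # Zero Crossing Rate
    return np.array([rms, ste, sc, se, sro, zcr])

fs = 44100                                   # audio sampling rate
t = np.arange(fs) / fs                       # 1 s of synthetic "engine" tone
x = np.sin(2 * np.pi * 200 * t) + 0.1 * np.random.randn(fs)
frames = frame_signal(x, fs)
features = np.vstack([frame_features(f) for f in frames])
```

The same framing and features would be applied to each 100 Hz kinematic axis with fs=100.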

Table 1. List of extracted features for both audio and kinematic data
Features                   Usage               Reference
25 STFT Coefficients       Audio & Kinematic   (Cheng et al. 2017)
Root Mean Square (RMS)     Audio & Kinematic   (Akhavian and Behzadan 2015)
Short Time Energy (STE)    Audio & Kinematic   (Akhavian and Behzadan 2015)
Spectral Flux (SF)         Audio & Kinematic   (Kozhisseri and Bikdash 2009, Padmavathi et al. 2010)
Spectral Entropy (SE)      Audio & Kinematic   (Wieczorkowska et al. 2018, Padmavathi et al. 2010)
Spectral Centroid (SC)     Audio & Kinematic   (Wieczorkowska et al. 2018, Padmavathi et al. 2010)
Spectral Roll-off (SRO)    Audio & Kinematic   (Wieczorkowska et al. 2018, Padmavathi et al. 2010)
Zero Crossing Rate (ZCR)   Kinematic           (Wieczorkowska et al. 2018)

SENSOR FUSION
All of the features explained in the previous section are extracted from the audio and
kinematic signals. As the next step, the authors used a feature fusion approach to combine
features from both audio and kinematic data to achieve a more robust system. Fused features are
used as input to the machine learning model.
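The feature-level fusion step amounts to concatenating the per-segment feature vectors of the two modalities. The sketch below assumes hypothetical feature matrices whose rows are aligned in time; the z-score normalization is an illustrative choice, not necessarily the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-segment feature matrices, one row per 120 ms segment,
# with row i of both matrices covering the same time interval:
audio_feats = rng.normal(size=(500, 31))   # e.g. 25 STFT coeffs + 6 others
kin_feats = rng.normal(size=(500, 8))      # e.g. per-axis RMS, STE, ...

def zscore(F):
    """Normalize each feature column so neither modality dominates."""
    return (F - F.mean(axis=0)) / (F.std(axis=0) + 1e-12)

# Feature-level fusion: column-wise concatenation of the aligned,
# normalized audio and kinematic feature vectors.
fused = np.hstack([zscore(audio_feats), zscore(kin_feats)])
```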

SVM MODEL
The authors implemented a Support Vector Machine (SVM) as the machine learning
algorithm, which has shown good performance in similar previous efforts (Sabillon et
al. 2017). In this paper, the different equipment activities are also recorded with a video camera for
ground-truth labeling. Four time periods of major activity and four time periods of
minor activity, each 5 to 10 seconds long, were used as samples for training


the model. The other time periods are used as testing data. SVM predicts the labels and these
labels are then fed into the post-processing algorithms.
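A minimal sketch of this classification step, using scikit-learn's SVC on synthetic fused features (the kernel and hyperparameters are illustrative assumptions; the paper does not specify them):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Synthetic fused feature vectors: class 1 = major, class 0 = minor activity.
X = np.vstack([rng.normal(loc=1.0, size=(80, 10)),     # major windows
               rng.normal(loc=-1.0, size=(80, 10))])   # minor windows
y = np.array([1] * 80 + [0] * 80)

# Train on a handful of labeled windows (the paper uses four major and
# four minor periods of 5-10 s each); predict the remaining windows.
train_idx = np.r_[0:20, 80:100]
test_idx = np.setdiff1d(np.arange(160), train_idx)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X[train_idx], y[train_idx])
labels = clf.predict(X[test_idx])          # these go to post-processing
accuracy = float((labels == y[test_idx]).mean())
```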

Fig. 2. Placement of devices near and inside the cabin of the equipment

Fig. 3. Predicted labels after MCF and actual labels for Jackhammer using fused data

POST-PROCESSING DATA
The labels from the previous step are not directly appropriate for predicting activities: they
fluctuate over very short periods of time, which is not practical for prediction purposes. In this
research, three algorithms are used to smooth the labels: 1) Small Window Filtering
(SWF); 2) Big Window Filtering (BWF); and 3) a Markov Chain Filter (MCF). These algorithms
scan the label sequence and replace each label with the most frequent label among its preceding
and following neighbors. The windows for checking label frequency are set to 2 and 6 for
SWF and BWF, respectively (Sabillon et al. 2017). A more detailed explanation of these three
filters is provided in previous research (Sabillon et al. 2017).
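The SWF/BWF steps can be illustrated with a simple sliding-window majority vote, using the window sizes 2 and 6 named above. The two-sided mode filter below is a simplified stand-in for the published filters, and the Markov Chain Filter is not reproduced.

```python
import numpy as np

def window_filter(labels, w):
    """Replace each label with the most frequent label among its w
    neighbors on each side (a simplified stand-in for SWF/BWF)."""
    labels = np.asarray(labels)
    out = labels.copy()
    for i in range(len(labels)):
        lo, hi = max(0, i - w), min(len(labels), i + w + 1)
        out[i] = np.bincount(labels[lo:hi]).argmax()
    return out

raw = np.array([1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0])   # noisy SVM output
smoothed = window_filter(window_filter(raw, 2), 6)      # SWF, then BWF
# smoothed -> [1 1 1 1 1 1 1 1 0 0 0 0]
```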

EXPERIMENTAL SETUP AND RESULTS


The proposed system was evaluated on five different types of equipment: a loader (CAT
259D), a dozer (Dozer 850K), an excavator (CAT 308E), a drilling machine (jackhammer), and
a lift (XTREME). The data capturing devices used in this study are a Zoom H1 digital
handy recorder, an off-the-shelf microphone, and an iPhone for recording kinematic data. The
microphone is placed outside the cabin within a distance of about 10 ft, and the iPhone is mounted
inside the cabin. The iPhone and a laptop are connected through MATLAB to send data
from the iPhone to the laptop. In addition, a video camcorder is used to record all activities of the
equipment for labeling and validation purposes. Fig. 2 shows the configuration of the
devices.
Figure 3 demonstrates the generated signals after fusing data for the jackhammer. As shown
in this figure, there is a strong correlation between the predicted and actual labels. Finally, the


summary of the accuracies obtained for all five pieces of equipment is presented in Figure 4.

Fig. 4. Equipment activity detection accuracies using two types of data and the fused data

DISCUSSIONS
This paper proposed a data fusion approach, based on audio and kinematic data, for
automated activity detection of construction equipment. While the proposed system shows very
promising results, several issues and challenges remain. First, in some cases it might be
difficult to classify an activity as productive or non-productive; in other words, the classification
depends on the judgement of project managers. For example, when a drill attached to the arm of a CAT 305.5E2
breaks concrete while moving back and forth, rotating and extending the arm are considered
minor activities because they are not directly related to breaking concrete. Furthermore, the authors
found that some activities could not be detected correctly, because some types
of equipment generate similar sounds when performing different tasks. For example, moving the arm
forward and backward are major and minor activities, respectively, but they have similar sound
patterns. For the Dozer 850K, some portions of moving backwards were incorrectly detected as a
major activity.
For the jackhammer, both data sources are appropriate for activity detection, because
drilling operations generate strong and distinct vibration and sound patterns. However,
for smaller backhoes, the generated kinematic signals are not strong enough, which
leads to less accurate results. The accuracies for the dozer (i.e., Dozer 850K) make it
clear that vibration is the better data source for its activity detection. For the excavator (CAT 308E), both
data sources lead to the same results: because the activities of excavators generate very distinct
vibration and sound patterns, they can be detected accurately with either source. Another possible
source of inaccuracy is imperfect time synchronization. The starting points of the audio and
kinematic signals might not be exactly the same, which can negatively impact the accuracy of the
results; in this study, this issue was addressed through manual alignment. Also, some
older types of equipment generate outlier kinematic signals that distort the desired kinematic
signals and decrease accuracy. On the other hand, some newer models are quieter and tend
to generate lower levels of noise and vibration, which can also lead to lower accuracy rates. A
detailed analysis of the results shows that each of the audio and kinematic sensors works better with
certain types of equipment and might not be appropriate for others.
In conclusion, the results show that audio sensors are a good choice for
excavators, while kinematic sensors work better for bulldozers. For loaders, both types of data lead
to similar results. Table 2 presents a brief comparison between the different types of equipment
and sensors.


Table 2. Accuracy rates for kinematic data, audio data, and the fused data
(accuracy bands: Low = 0%-75%, Moderate = 75%-90%, High = 90%-100%)
Equipment        Vibration Accuracy   Audio Accuracy   Fused Data Accuracy
Jackhammer       ✔                    ✔                ✔
CAT 259D         ✔                    ✔                ✔
Skyjack SJ6826   ✔                    ✔                ✔
CAT 308E         ✔                    ✔                ✔
Dozer 850K       ✔                    ✔                ✔

In summary, fusing audio and kinematic signals helps detect equipment activities more
robustly and accurately; the results show that data fusion leads to accuracy rates over
90%. The next section explains some further advantages of the system as well as future research
plans.

CONCLUSION
The proposed system automatically detects different types of activities performed by
construction equipment using the generated audio and kinematic signals. This system consists of
recording data, preprocessing data, extracting several features, feature fusion, and classifying
activities using the SVM model. The contributions of the proposed system are as follows: 1) it
addresses the limitations of using a single type of sensor; for example, in congested and/or large job
sites the energy of an audio signal decreases over large distances, and some newer models do not
generate detectable kinematic signals, which decreases the accuracy of single-sensor systems; 2)
different time- and frequency-domain features are extracted to support the machine
learning model; all of these features were evaluated in previous studies and shown to
detect equipment engine sounds accurately; and 3) different pre-processing and post-processing
algorithms are applied to refine the data and results.
This paper focused on detecting the activities of single pieces of equipment. In the future, the
authors will evaluate this method on more challenging cases where multiple machines operate at
a jobsite simultaneously. In addition, there is a need to collect more data from other types of
equipment such as graders and compactors.

ACKNOWLEDGMENTS
This research project has been funded by the U.S. National Science Foundation (NSF) under
Grant CMMI-1606034. The authors gratefully acknowledge NSF's support. Any opinions,
findings, conclusions, and recommendations expressed in this manuscript are those of the authors
and do not reflect the views of the funding agency. The authors also appreciate the assistance of
Mr. Richard Peterson, undergraduate student at University of Utah, with data collection and
audio recordings.

REFERENCES
Ahn, C. R., Lee, S., & Peña-Mora, F. (2012). Monitoring system for operational efficiency and
environmental performance of construction operations using vibration signal analysis. In
Construction Research Congress 2012: Construction Challenges in a Flat World (pp. 1879-
1888). https://doi.org/10.1061/9780784412329.189
Ahn, C. R., Lee, S., & Peña-Mora, F. (2013). Application of low-cost accelerometers for
measuring the operational efficiency of a construction equipment fleet. Journal of Computing
in Civil Engineering, 29(2), 04014042. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000337
Akhavian, R., & Behzadan, A. H. (2012). Remote monitoring of dynamic construction processes
using automated equipment tracking. In Construction Research Congress 2012: Construction
Challenges in a Flat World (pp. 1360-1369). https://doi.org/10.1061/9780784412329.137
Akhavian, R., & Behzadan, A. H. (2013). Knowledge-based simulation modeling of construction
fleet operations using multimodal-process data mining. Journal of Construction Engineering
and Management, 139(11), 04013021. https://doi.org/10.1061/(ASCE)CO.1943-7862.0000775
Akhavian, R., & Behzadan, A. H. (2014). Client-server interaction knowledge discovery for
operations-level construction simulation using process data. In Construction Research
Congress 2014: Construction in a Global Network (pp. 41-50).
https://doi.org/10.1061/9780784413517.005
Akhavian, R., & Behzadan, A. H. (2015). Construction equipment activity recognition for
simulation input modeling using mobile sensors and machine learning classifiers. Advanced
Engineering Informatics, 29(4), 867-877. https://doi.org/10.1016/j.aei.2015.03.001
Cheng, C. F., Rashidi, A., Davenport, M. A., & Anderson, D. V. (2017). Activity analysis of
construction equipment using audio signals and support vector machines. Automation in
Construction, 81, 240-253. https://doi.org/10.1016/j.autcon.2017.06.005
Cheng, C. F., Rashidi, A., Davenport, M. A., Anderson, D. V., & Sabillon, C. A. (2017).
Acoustical modeling of construction jobsites: hardware and software requirements. In
Computing in Civil Engineering 2017 (pp. 352-359).
https://doi.org/10.1061/9780784480847.044
Golparvar-Fard, M., Heydarian, A., & Niebles, J. C. (2013). Vision-based action recognition of
earthmoving equipment using spatio-temporal features and support vector machine
classifiers. Advanced Engineering Informatics, 27(4), 652-663.
https://doi.org/10.1016/j.aei.2013.09.001
Kim, H., Ahn, C. R., Engelhaupt, D., & Lee, S. (2018). Application of dynamic time warping to
the recognition of mixed equipment activities in cycle time measurement. Automation in
Construction, 87, 225-234. https://doi.org/10.1016/j.autcon.2017.12.014
Kozhisseri, S., & Bikdash, M. (2009, March). Spectral features for the classification of civilian
vehicles using acoustic sensors. In Computational Intelligence in Vehicles and Vehicular
Systems, 2009. CIVVS'09. IEEE Workshop on (pp. 93-100). IEEE.
https://doi.org/10.1109/CIVVS.2009.4938729
Padmavathi, G., Shanmugapriya, D., & Kalaivani, M. (2010, October). Neural network
approaches and MSPCA in vehicle acoustic signal classification using wireless sensor
networks. In Communication Control and Computing Technologies (ICCCCT), 2010 IEEE
International Conference on (pp. 372-376). IEEE.
https://doi.org/10.1109/ICCCCT.2010.5670580
Rezazadeh Azar, E., & McCabe, B. (2011). Automated visual recognition of dump trucks in
construction videos. Journal of Computing in Civil Engineering, 26(6), 769-781.
https://doi.org/10.1061/(ASCE)CP.1943-5487.0000179
Sabillon, C. A., Rashidi, A., Samanta, B., Cheng, C. F., Davenport, M. A., & Anderson, D. V.
(2017). A productivity forecasting system for construction cyclic operations using
audio signals and a Bayesian approach. In Construction Research Congress 2018 (pp. 295-
304). https://doi.org/10.1061/9780784481264.029
Teizer, J., Allread, B. S., Fullerton, C. E., & Hinze, J. (2010). Autonomous pro-active real-time
construction worker and equipment operator proximity safety alert system. Automation in
Construction, 19(5), 630-640. https://doi.org/10.1016/j.autcon.2010.02.009
Wieczorkowska, A., Kubera, E., Słowik, T., & Skrzypiec, K. (2018). Spectral features for audio
based vehicle and engine classification. Journal of Intelligent Information Systems, 50(2),
265-290. https://doi.org/10.1007/s10844-017-0459-2
Yang, J., Shi, Z., & Wu, Z. (2016). Vision-based action recognition of construction workers
using dense trajectories. Advanced Engineering Informatics, 30(3), 327-336.
https://doi.org/10.1016/j.aei.2016.04.009
Zhang, T., Lee, Y. C., Scarpiniti, M., & Uncini, A. (2018). A supervised machine learning-based
sound identification for construction activity monitoring and performance evaluation. In
Construction Research Congress 2018 (pp. 358-366).
https://doi.org/10.1061/9780784481264.035


Estimating Commuting Patterns from High Resolution Phone GPS Data


Bita Sadeghinasr1; Armin Akhavan2; and Qi Wang, Ph.D.3
1Dept. of Civil and Environmental Engineering, Northeastern Univ., 360 Huntington Ave., Boston, MA 02115
2Dept. of Civil and Environmental Engineering, Northeastern Univ., 360 Huntington Ave., Boston, MA 02115
3Dept. of Civil and Environmental Engineering, Northeastern Univ., 360 Huntington Ave., Boston, MA 02115 (corresponding author). E-mail: [email protected]

ABSTRACT
The rise of location positioning technologies has generated enormous volumes of digital
footprints. Translating this big data into understandable trip patterns plays a crucial role in
estimating infrastructure demands. Previous studies were unable to correctly represent
commuting patterns on smaller urban scales due to insufficient spatial accuracy. In this study, we
investigated if, and to what extent, estimated commuting patterns identified from GPS data can
replicate the results from transportation surveys and to what degree these estimates improve the
estimates of trips distribution pattern on census tract level using higher resolution data. We
inferred average daily home-to-work trips by analyzing phone GPS data and compared these
patterns with U.S. Census summary tables. We found that trips detected by GPS data highly
correlate with census trips. Furthermore, GPS data is a better proxy for Census tract-pairs with
larger numbers of trips.

INTRODUCTION
Gaining a profound insight into human mobility is of crucial importance to many areas such
as urban planning (Appleyard et al. 1964, Hägerstraand 1970, Carlstein et al. 1978, Jiang et al.
2012), emergency response and evacuation (Wang et al. 2014, Wang et al. 2016), traffic
monitoring and travel demand forecasting (Wilson et al. 2004, Treiber et al. 2013). Human
movements and activities in conjunction with infrastructures’ sprawl shape urban patterns;
therefore, comprehending and predicting urban movement patterns can enormously contribute to
finding solutions to urban complexities. The importance of broadening understanding of urban
mobility has required planners to seek different sources of information on this subject (Fan et al.
2008). Traditionally, planners and policymakers benefit from household travel surveys and
census data to learn about people’s whereabouts. However, these surveys are time-consuming
and expensive to conduct (Meyer et al. 1984, Stopher et al. 2007). Additionally, surveys only
capture a snapshot of the travel behavior of a sample of people and are susceptible to self-
reporting and upscaling errors (Palmer et al. 2013).
A fruitful direction of urban mobility study is to estimate trip patterns. Having an accurate
estimate of daily trips can facilitate traffic congestion management and travel time forecasting.
Trips can be constructed by estimating origins and destinations (ODs). Calculating OD matrices
enables authorities to better estimate volumes of traffic in transportation networks and develop
the infrastructures accordingly (Barbosa et al. 2018). Among trips with different purposes,
commute trips account for the largest portion of travels during peak hours (Polzin et al. 2015).
Therefore, having a clear picture of the distribution of these trips plays a key part in managing
traffic demand. ODs are traditionally estimated using travel surveys. Yet, due to travel surveys’


shortcomings, there have been efforts to estimate trips using other datasets and methods (Pan
et al. 2006, Caceres et al. 2007, Sohn et al. 2008).
Previous studies have most commonly adopted call detail records (CDR) for inferring trip
patterns, due to the availability of CDR for millions of users (Barbosa et al. 2018). Many studies
have estimated commute trips using CDR data and validated them against travel surveys (Zhang
et al. 2010, Calabrese et al. 2011, Frias-Martinez et al. 2012, Alexander et al. 2015, Toole et al.
2015, Jiang et al. 2016). They assigned OD matrices generated from CDR to networks of roads
and estimated average daily OD trips by purposes and time of day. They compared estimated
CDR trips with travel survey datasets and found that there is a strong correlation on the town
level.
Although CDR has provided researchers with useful insights into trip patterns (Blondel et al.
2015), using CDR to detect trips between smaller geographic levels such as tracts can be
problematic, due to the fact that users’ locations are approximated by the position of the tower
that their cell phone is connected to. Coverage area by cell towers considerably varies from tens
of meters in the densest areas up to a few kilometers in rural areas (Calabrese et al. 2011). In less
urbanized areas, users might transmit all communication through one tower while moving in the
area covered by the same tower. However, in dense areas, users ping multiple towers with much
smaller movements (Barbosa et al. 2018).
GPS data provides researchers with a high level of spatial accuracy and temporal frequency
and thus can be a rich source for detecting mobility patterns (Barbosa et al. 2018). Previously
GPS datasets were most commonly collected from smaller groups of individuals (Calabrese et al.
2011, Blondel et al. 2015) or trackers on cars (Bazzani et al. 2010, Pappalardo et al. 2013) and
often were not available at the scale of a city. Thus, commuting patterns had seldom been
explored using complete GPS datasets. Recently, data sets generated by phone GPS are emerging
and have been used for mobility research (Li et al. 2008, Zheng et al. 2008, Zheng et al. 2009,
Zheng et al. 2010, Akhavan et al. 2018). As opposed to travel surveys, which report respondents'
travel behavior over a single past day or a few days, technologies like GPS enable us to
capture travelers' behavior over many days. However, before using such data to estimate
urban trips, especially commuting trips, the limitations of the dataset should be quantified. The
objective of this study is to investigate if, and to what extent, the estimated commuting patterns
identified from phone GPS data of millions of users can replicate the results from commuting
surveys. In order to do so, we aim to estimate daily trip patterns using ODs extracted from
millions of phone GPS records.

DATA
In this study, we analyzed 810 million phone GPS records from 1 million users in Houston,
Texas over a course of two consecutive weeks (1st-15th August 2017). Phone GPS data was
generated by more than 50 mobile applications that anonymously collect users’ geolocation
records. Each record has an anonymous device ID, latitude, longitude and time stamp.

STUDY AREA
The area of study is the city of Houston. The Census Bureau delineates census tracts based on
population rather than political boundaries. As a result, census tracts do not necessarily follow
city boundaries (1994). Most of the census tracts are located entirely within the city boundaries,
but some cross over the city limits (Figure 1). In order to accurately capture the trips of Houston
residents, we limited our analysis to tracts in which 50% or more of the area was


contained in the city of Houston limits.

METHODOLOGY
Stay extraction: In order to infer trips and activities from GPS data, we need to detect areas
where the user has remained stationary for a while. In this study, these areas are referred to as
“stay points”. Finding stay points helps us detect significant places where the user has engaged in
an activity and filter out GPS noise caused by user dependent and independent errors.
In this study, we developed a method based on the work from (Li et al. 2008) to detect stay
points. In order to eliminate temporary recurring stay points, such as routine stops at road
intersections, time and distance were both used as filters. Each stay point is marked by its
latitude, longitude, arrival time and departure time. Stay points’ latitudes and longitudes are
computed based on the mean coordinates of spatially-temporally close group of GPS points. The
arrival time and departing time are associated with timestamps of the first and last temporally
ordered points in the group of points respectively.
We detected stay points using this method rather than clustering GPS points for the
following reasons. First, conventional clustering methods such as Density-Based Spatial Clustering of
Applications with Noise (DBSCAN) and Hidden Markov Models cause chaining effects, which
reflect routes of travel and create spread-out clusters; this would distract from the goal of
finding centralized stay points. Furthermore, clustering raw GPS records merely based on
geography would dismiss their temporal sequence. On the other hand, to distinguish short visits
from more significant places like homes or workplaces, we introduced time thresholds in
forming stay points to account for the time users spent in those areas. In our experiment, an
area was selected as a stay point if the user exceeded the time threshold within a radius of 250
meters.
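A minimal sketch of this stay-point extraction, in the spirit of Li et al. (2008): scan time-ordered GPS fixes, group consecutive points that stay within the 250 m radius of an anchor point, and keep the group as a stay point if it spans a minimum duration. The 30-minute threshold below is an assumed value for illustration; the paper does not state its time threshold.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in meters."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def extract_stay_points(points, dist_m=250.0, min_stay_s=1800):
    """points: time-ordered (lat, lon, unix_ts) tuples. Returns stay
    points as (mean_lat, mean_lon, arrival_ts, departure_ts)."""
    stays, i, n = [], 0, len(points)
    while i < n:
        j = i + 1
        while j < n and haversine_m(points[i][0], points[i][1],
                                    points[j][0], points[j][1]) <= dist_m:
            j += 1
        if points[j - 1][2] - points[i][2] >= min_stay_s:
            group = points[i:j]
            stays.append((sum(p[0] for p in group) / len(group),
                          sum(p[1] for p in group) / len(group),
                          group[0][2], group[-1][2]))
            i = j             # jump past the whole stay
        else:
            i += 1            # transient point: keep scanning
    return stays

# One hour near downtown Houston, then a jump several kilometers away.
pts = [(29.7600, -95.3600, 0), (29.7601, -95.3601, 1800),
       (29.7600, -95.3600, 3600), (29.8000, -95.4000, 3700)]
stays = extract_stay_points(pts)
```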
Stay region: After extracting stay points, hierarchical clustering with complete linkage was
performed to cluster stay points that are spatially close. All points belonging to one cluster were
within 250 meters from each other. The center of these stay point clusters is referred to as “stay
region”. The number of visits to a stay region is counted by the number of stay points within
each cluster. This is due to the fact that stay points are spatially close to each other but may have
occurred on different days.
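The stay-region step can be sketched with SciPy's complete-linkage hierarchical clustering, cutting the dendrogram at 250 m so that every pair of points in a cluster lies within that distance. Coordinates are assumed already projected to local meters; the input points are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def stay_regions(stay_xy, max_diameter_m=250.0):
    """Complete-linkage clustering cut at max_diameter_m, so all members
    of a region are within that distance of each other. Returns each
    stay point's region label, region centers, and visit counts."""
    labels = fcluster(linkage(pdist(stay_xy), method="complete"),
                      t=max_diameter_m, criterion="distance")
    centers = {k: stay_xy[labels == k].mean(axis=0) for k in np.unique(labels)}
    visits = {k: int((labels == k).sum()) for k in np.unique(labels)}
    return labels, centers, visits

# Hypothetical stay points in local meters: two visits to one place
# (on different days), one visit to a distant place.
pts = np.array([[0.0, 0.0], [100.0, 50.0], [5000.0, 5000.0]])
labels, centers, visits = stay_regions(pts)
```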
Home location: In order to produce home-to-work (HW) trips, we need to detect
individuals’ home and workplaces. In this study, home locations were assumed to be places
where users typically spend time during night hours (8 pm-5 am) and that have the greatest number of
visits. Therefore, to detect homes, stay points were chosen with (1) at least 3 hours spent during
night hours or (2) a stay duration of more than 24 hours. These were hierarchically clustered into stay
regions. Both weekends and weekdays were used to detect home locations.
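A sketch of the home-candidate filter under these two rules. Timestamps are treated as UTC for simplicity; a real pipeline would use Houston local time.

```python
from datetime import datetime, timezone

def night_seconds(arrival_ts, departure_ts):
    """Seconds of a stay falling within night hours (8 pm-5 am),
    accumulated hour by hour."""
    total, t = 0, arrival_ts - arrival_ts % 3600
    while t < departure_ts:
        hour = datetime.fromtimestamp(t, tz=timezone.utc).hour
        if hour >= 20 or hour < 5:
            total += max(0, min(t + 3600, departure_ts) - max(t, arrival_ts))
        t += 3600
    return total

def is_home_candidate(arrival_ts, departure_ts):
    """Rule (1): at least 3 night hours, or rule (2): stay over 24 h."""
    return (night_seconds(arrival_ts, departure_ts) >= 3 * 3600
            or departure_ts - arrival_ts > 24 * 3600)

base = 1501545600                                 # 2017-08-01 00:00 UTC
evening = (base + 21 * 3600, base + 26 * 3600)    # 9 pm to 2 am: home-like
midday = (base + 10 * 3600, base + 12 * 3600)     # 10 am to noon: not
```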

Table 1. Summary of GPS user statistics
                                             Individuals   Percentage
Users in the raw data set                    1,000,000     100%
Users with “home” detected                   286,718       29%
Users with both “home” and “work”
detected (commuters)                         40,623        14.1%


Figure 1. The city of Houston boundaries and tracts with more than 50% area inside city
boundaries. Stay points and estimated home location of one individual
Work location: To detect work locations, stay points with arrival time within working hours
(8 am-6 pm) on weekdays were clustered. Based on previous studies (Levinson et al. 1994,
Schafer 2000), for a given number of visits, longer trips are more likely to lead to work places.
Therefore, for each potential workplace cluster, the attributes “distance from home” (d) and “number
of visits” (n) were computed. Clusters within walking distance (half a mile, ~800 meters) of
home or with fewer than two visits during the studied period (on average less than once a week)
were dismissed. Finally, the cluster with the largest value of n·d was selected as the work
location. We also examined higher powers of n (n² and n³) in order to counter clustering skewed by
long-distance travel. However, we found that less than two percent of users had a different
“work location” identified by these higher powers of n.
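The workplace-selection rule can be sketched as a filter plus an n·d score; the candidate clusters below are hypothetical.

```python
# Hypothetical candidate workplace clusters for one user, each with a
# distance from home d (meters) and visit count n over the two weeks.
candidates = [
    {"name": "office", "d": 12000.0, "n": 9},
    {"name": "gym", "d": 3000.0, "n": 3},
    {"name": "corner_shop", "d": 400.0, "n": 12},    # within walking distance
    {"name": "one_off_site", "d": 25000.0, "n": 1},  # visited only once
]

def pick_work(clusters, walk_m=800.0, min_visits=2):
    """Dismiss clusters within walking distance of home or with fewer
    than min_visits visits, then take the largest n*d score."""
    ok = [c for c in clusters if c["d"] > walk_m and c["n"] >= min_visits]
    return max(ok, key=lambda c: c["n"] * c["d"]) if ok else None

work = pick_work(candidates)   # -> the "office" cluster
```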

Figure 2. (a) Commuters’ home distribution in tracts of study area by GPS. (b)
Commuters’ home distribution in tracts of study area by CTPP data.
Average daily home-to-work trips: To account for the fact that not all commuters go to
work on every weekday, average daily HW trip was computed for each commuter. It was
presumed that on weekdays users start their trips from home. Therefore, we assume that a
commuter went to work on a given day if they had a stay point within walking distance (800 m) of their work

location. In such cases, their trip origin was assumed to be their home location even if there
was no record close to it. For each individual, the number of days they commuted was then
divided by the number of weekdays. Lastly, these average daily trips were
aggregated into pairs of census tracts.
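Aggregating average daily HW trips to tract pairs can be sketched as follows; the tract IDs and day counts are hypothetical.

```python
from collections import defaultdict

n_weekdays = 10     # weekdays in the two-week study window
# Hypothetical commuters: home tract, work tract, and the number of
# weekdays with a stay point within walking distance of work.
commuters = [
    {"home": "48201A", "work": "48201B", "days": 8},
    {"home": "48201A", "work": "48201B", "days": 10},
    {"home": "48201C", "work": "48201B", "days": 5},
]

# Each commuter contributes days/n_weekdays average daily trips,
# summed over their home-work census tract pair.
od = defaultdict(float)
for c in commuters:
    od[(c["home"], c["work"])] += c["days"] / n_weekdays
```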
Commute distance and duration: An open source routing engine was used to estimate the
commute duration and distance for each individual. Historical data for the morning peak was
requested through the Navigation and Routing API of HERE Technologies, for both personal cars
and public transportation. The API is fed a pair of origin and destination locations [home, work] and
returns the corresponding estimated travel distances and durations.

RESULTS AND DISCUSSION


To validate the distribution of the commute flows detected by GPS phone data, we compared
them to home-to-work flows reported by the 2006–2010 Census Transportation Planning
Products (CTPP). The correlation between the commuting trips estimated from the GPS phone
data and the CTPP tract-pair home-work trips was 0.61 with a high level of significance
(p-value < 0.0001), which shows that GPS phone data can be used to estimate urban commuting
patterns. Additionally, this correlation indicates that our sample of mobile phone users can
accurately capture the distribution of trips between census tracts. Figure 3 shows that the
relationship is more linear for tract pairs with larger census estimates, where the GPS trips
validate better. This trend can be explained by the sparse nature of the data in tracts with small
numbers of commuters. Furthermore, survey data are susceptible to sampling and upscaling
errors, and the uncertainty in the number of trips between tract pairs with few commuters is
reflected in the relatively large standard errors reported along with the estimated home-to-work
flows by CTPP. Yet, using two weeks of high-resolution GPS data yields a 17% higher
correlation with the CTPP flows at the tract level than a previous study (Alexander et al. 2015)
using CDR data over a period of two months.
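The validation statistic above is a plain Pearson correlation over tract-pair flows; a minimal sketch with made-up flow values:

```python
# Pearson correlation between tract-pair commute flows estimated from GPS
# and those reported by CTPP. The flow values below are illustrative only.
from scipy.stats import pearsonr

gps_flows  = [12, 40, 7, 95, 23, 61, 5, 33]   # GPS-estimated trips per pair
ctpp_flows = [15, 38, 9, 90, 30, 55, 4, 41]   # CTPP-reported trips per pair

r, p_value = pearsonr(gps_flows, ctpp_flows)
```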

Figure 3. Correlation of tract-pair trips reported by CTPP and GPS data.


According to the routing engine, the estimated mean commute time by car was 27 minutes,
which is close to the 26.8 minutes estimated by the census, with 88% of the sample commuting
in driving mode. The mean commute distance was estimated to be 22 km for people driving
personal cars and 15 km for commuters taking public transit.


Figure 4. (a) Home-to-work travel duration distribution using personal cars. (b) Home-to-
work travel duration distribution for people in the sample for whom the engine could find
a route using public transit.
LIMITATIONS
In this study, we constructed average daily home-work trips for 40,623 commuters in
Houston and compared them to census data to validate the applicability of phone GPS data for
estimating commute trips. While the results demonstrate a high correlation between GPS trips
and CTPP trips, not all commuters who work in Houston were considered, due to gaps in
coverage within the city limits. Furthermore, we used two weeks of data to detect commuters,
and the sample would grow with longer periods of observation. A longer observation period
might also yield a distribution of sampled workers that is more representative of the population,
and thus better represents trip distributions. Finally, although we used the most recent census
data, a seven-year gap remains between it and the GPS data, and trips might have changed to
some extent during this time.

CONCLUSION
In this study, we developed algorithms to extract significant locations where a user has
engaged in an activity within a specific timeframe. We investigated the effect of the time
thresholds used to extract these locations, with an emphasis on avoiding both oversimplification
and computationally expensive tasks. This investigation is important because there is limited
literature on finding the optimal time threshold for extracting stay points. Given the high
temporal resolution of phone GPS data, we found that the time threshold used for extracting
stay points is highly dependent on the types of activities to be detected. Extracting stay points
significantly reduced the volume of the raw phone GPS data and filtered out noise. We were
then able to detect home and work locations and to identify commuters. Average daily
home-to-work trips were constructed for commuters based on the observed working days of all
individuals. For validation, the home-to-work trip distributions inferred from GPS were
compared with the home-to-work flow tabulations of the 2006–2010 CTPP at the tract-to-tract
level. We observed a high correlation between the two datasets. The results confirm that phone
GPS data can create a clearer picture of the spatial distribution of trips to address traffic
demand issues.

ACKNOWLEDGMENT
This work was supported by National Science Foundation Grant HDBE-1761950;
Northeastern Tier 1 Project on “Neighborhood Connectivity and Social Inequality”; and Global
Resilience Institute Project on “Geosocial Network Resilience”. To protect the confidentiality of
any given individual’s movement trajectory, all individuals’ information from the phone GPS
data was encrypted, and all data are reported in nonidentifiable form. All data used in this paper
were reviewed and exempted by the Northeastern University Institutional Review Board (IRB).
The data are proprietary and will not be shared.

REFERENCES
U.S. Bureau of the Census (1994). Geographic Areas Reference Manual.
Akhavan, A., N. E. Phillips, J. Du, J. Chen, B. Sadeghinasr and Q. Wang (2018). "Accessibility
Inequality in Houston." IEEE Sensors Letters.
Alexander, L., S. Jiang, M. Murga and M. C. González (2015). "Origin–destination trips by
purpose and time of day inferred from mobile phone data." Transportation Research Part C:
Emerging Technologies 58: 240-250.
Appleyard, D., K. Lynch and J. R. Myer (1964). The view from the road, MIT Press,
Cambridge, MA.
Barbosa, H., M. Barthelemy, G. Ghoshal, C. R. James, M. Lenormand, T. Louail, R. Menezes, J.
J. Ramasco, F. Simini and M. Tomasini (2018). "Human mobility: Models and applications."
Physics Reports.
Bazzani, A., B. Giorgini, S. Rambaldi, R. Gallotti and L. Giovannini (2010). "Statistical laws in
urban mobility from microscopic GPS data in the area of Florence." Journal of Statistical
Mechanics: Theory and Experiment 2010(05): P05001.
Blondel, V. D., A. Decuyper and G. Krings (2015). "A survey of results on mobile phone
datasets analysis." EPJ Data Science 4(1): 10.
Caceres, N., J. Wideberg and F. Benitez (2007). "Deriving origin destination data from a mobile
phone network." IET Intelligent Transport Systems 1(1): 15-26.
Calabrese, F., G. Di Lorenzo, L. Liu and C. Ratti (2011). "Estimating Origin-Destination flows
using opportunistically collected mobile phone location data from one million users in
Boston Metropolitan Area." IEEE Pervasive Computing 10(4): 36-44.
Carlstein, T., D. Parkes and N. Thrift (1978). Human activity and time geography.
Fan, Y. and A. J. Khattak (2008). "Urban form, individual spatial footprints, and travel:
Examination of space-use behavior." Transportation Research Record 2082(1): 98-106.
Frias-Martinez, V., C. Soguero and E. Frias-Martinez (2012). Estimation of urban commuting
patterns using cellphone network data. Proceedings of the ACM SIGKDD International
Workshop on Urban Computing, ACM.
Hägerstraand, T. (1970). "What about people in regional science?" Papers in Regional Science
24(1): 7-24.
Jiang, S., J. Ferreira and M. C. González (2012). "Clustering daily patterns of human activities in
the city." Data Mining and Knowledge Discovery 25(3): 478-510.
Jiang, S., Y. Yang, S. Gupta, D. Veneziano, S. Athavale and M. C. González (2016). "The
TimeGeo modeling framework for urban mobility without travel surveys." Proceedings of
the National Academy of Sciences 113(37): E5370-E5378.
Levinson, D. M. and A. Kumar (1994). "The rational locator: why travel times have remained
stable." Journal of the American Planning Association 60(3): 319-332.
Li, Q., Y. Zheng, X. Xie, Y. Chen, W. Liu and W.-Y. Ma (2008). Mining user similarity based
on location history. Proceedings of the 16th ACM SIGSPATIAL international conference on
Advances in geographic information systems, ACM.
Meyer, M. D. and E. J. Miller (1984). "Urban transportation planning: a decision-oriented
approach."
Palmer, J. R., T. J. Espenshade, F. Bartumeus, C. Y. Chung, N. E. Ozgencil and K. Li (2013).
"New approaches to human mobility: Using mobile phones for demographic research."
Demography 50(3): 1105-1128.
Pan, C., J. Lu, S. Di and B. Ran (2006). "Cellular-based data-extracting method for trip
distribution." Transportation Research Record 1945(1): 33-39.
Pappalardo, L., S. Rinzivillo, Z. Qu, D. Pedreschi and F. Giannotti (2013). "Understanding the
patterns of car travel." The European Physical Journal Special Topics 215(1): 61-73.
Polzin, S. and A. Pisarski (2015). "Commuting in America 2013: The National Report on
Commuting Patterns and Trends." American Association of State Highway and
Transportation Officials, Washington, DC.
Schafer, A. (2000). "Regularities in travel demand: an international perspective." Journal of
Transportation and Statistics 3(3): 1-31.
Sohn, K. and D. Kim (2008). "Dynamic origin–destination flow estimation using cellular
communication system." IEEE Transactions on Vehicular Technology 57(5): 2703-2713.
Stopher, P. R. and S. P. Greaves (2007). "Household travel surveys: Where are we going?"
Transportation Research Part A: Policy and Practice 41(5): 367-381.
Toole, J. L., S. Colak, B. Sturt, L. P. Alexander, A. Evsukoff and M. C. González (2015). "The
path most traveled: Travel demand estimation using big data resources." Transportation
Research Part C: Emerging Technologies 58: 162-177.
Treiber, M. and A. Kesting (2013). Trajectory and Floating-Car Data. Traffic Flow Dynamics,
Springer: 7-12.
Wang, Q. and J. E. Taylor (2014). "Quantifying human mobility perturbation and resilience in
Hurricane Sandy." PLoS one 9(11): e112608.
Wang, Q. and J. E. Taylor (2016). "Patterns and limitations of urban human mobility resilience
under the influence of multiple types of natural disaster." PLoS one 11(1): e0147299.
Wilson, T. and M. Bell (2004). "Comparative empirical evaluations of internal migration models
in subnational population projections." Journal of Population Research 21(2): 127.
Zhang, Y., X. Qin, S. Dong and B. Ran (2010). Daily OD matrix estimation using cellular probe
data. 89th Annual Meeting Transportation Research Board.
Zheng, Y., Q. Li, Y. Chen, X. Xie and W.-Y. Ma (2008). Understanding mobility based on GPS
data. Proceedings of the 10th international conference on Ubiquitous computing, ACM.
Zheng, Y., X. Xie and W.-Y. Ma (2010). "Geolife: A collaborative social networking service
among user, location and trajectory." IEEE Data Eng. Bull. 33(2): 32-39.
Zheng, Y., L. Zhang, X. Xie and W.-Y. Ma (2009). Mining interesting locations and travel
sequences from GPS trajectories. Proceedings of the 18th international conference on World
wide web, ACM.


Reference Signal-Based Method to Remove Respiration Noise in Electrodermal Activity
(EDA) Collected from the Field
Gaang Lee1; Byungjoo Choi2; Houtan Jebelli3; Changbum Ryan Ahn4; and SangHyun Lee5
1Ph.D. Candidate, Tishman Construction Management Program, Dept. of Civil and
Environmental Engineering, Univ. of Michigan, 2350 Hayward St., G. G. Brown Bldg., Ann
Arbor, MI 48109 (corresponding author). E-mail: [email protected]
2Assistant Professor, Dept. of Architectural Engineering, Ajou Univ., 206, World cup-ro,
Suwon-si, Gyeonggi-do 16499, Republic of Korea. E-mail: [email protected]
3Ph.D. Candidate, Tishman Construction Management Program, Dept. of Civil and
Environmental Engineering, Univ. of Michigan, 2350 Hayward St., G. G. Brown Bldg., Ann
Arbor, MI 48109. E-mail: [email protected]
4Associate Professor, Dept. of Construction Science, Texas A&M Univ., 3137 TAMU, College
Station, TX 77843. E-mail: [email protected]
5Associate Professor, Tishman Construction Management Program, Dept. of Civil and
Environmental Engineering, Univ. of Michigan, 2350 Hayward St., G. G. Brown Bldg., Ann
Arbor, MI 48109. E-mail: [email protected]

ABSTRACT
Measuring built environment users' responses using wearable biosensors could provide a new
opportunity for understanding their experience in the built environment. Electrodermal activity
(EDA) sensors are especially useful in detecting people's stressful interactions with the built
environment. Despite this potential, detection accuracy is still limited because of noise in EDA
collected from uncontrolled settings. Respiration noise is the most challenging to alleviate due
to the similarity in signal characteristics between respiration noise and the EDA response to
distress. The authors propose an adaptive denoising method that references the
photoplethysmogram (PPG) to detect and remove respiration noise in EDA. The quality of
denoising and the resulting improvement in stress measurement were assessed for validation.
The results showed that the proposed method removed respiration noise better than previous
methods and therefore improved stress-measurement quality. The findings can contribute to
improving the quality of EDA collected from the field, which is essential to accurately
understanding people's stressful interactions with the built environment.

INTRODUCTION
In recent years, the built environment has been expected to improve users' comfort and safety
beyond mere effectiveness. Advancements in wearable sensing have provided a new opportunity
to advance built environment users' comfort and safety by detecting their stress (i.e., a state of
mental or emotional response to the perception of a demand or challenge (Bunge 1987))
corresponding to uncomfortable or unsafe events in the built environment. Since the sympathetic
nervous system aroused by stress affects various physiological activities such as heart activity
and sweat production, wearable biosensors (e.g., electrodermal activity (EDA) and
electrocardiography (ECG) sensors) can detect stress by continuously measuring these
physiological activities (McCorry 2007) during people's daily trips. Specifically, EDA (i.e.,
changes in the human skin's electrical properties caused by eccrine sweat gland activity
(Boucsein 2012)) has outperformed other physiological signals for monitoring human stress
because EDA is exclusively innervated by
the sympathetic nervous system, while the other signals are also affected by the parasympathetic
nervous system, which is related to relaxation.
Despite such potential, analyzing EDA to identify stress is still challenging due to the many
sources of noise (e.g., environment noise and intrinsic body noise) in EDA collected from
people's daily trips, which are uncontrolled and ambulatory. To detect human stress,
electrodermal responses (EDR), a specific phasic waveform in EDA, are identified because EDR
can serve as an indication of stress occurrence. Since noise in EDA can be misinterpreted as
EDR, it compromises the accuracy of stress measurement. Therefore, appropriate denoising
processes are essential to accurately detect human stress. Specifically, this study proposes a
denoising method to alleviate respiration noise, one of the most significant noise sources in
EDA (Boucsein 2012). Respiration noise is the most challenging to remove because its signal
characteristics are indistinguishable from those of EDR (Boucsein 2012).

DENOISING EDA
In this study, noise is defined as changes in the EDA signal that do not originate from the
signal source of interest, stress (Boucsein 2012). When EDA is collected by wearable biosensors
in uncontrolled ambulatory settings, three main types of noise contaminate the original signal:
environment noise, skin-electrode interface noise, and intrinsic body noise (Boucsein 2012).
Environment noise refers to noise resulting from the recording environment (Heikenfeld et
al. 2018). Since EDA is measured from electrical activity on the skin, power-line noise
caused by the alternating-current frequency input is one of the main environment noises. A
magnetic field can also contaminate EDA measurement, which often happens in uncontrolled
settings (Heikenfeld et al. 2018). Since such environment noise has a distinctly higher frequency
(i.e., 50-60 Hz) than the clean EDA signal (i.e., 0.0167-0.25 Hz) (Boucsein 2012), a low-pass
filter can successfully alleviate it. Other smoothing techniques, such as exponential smoothing
and the moving-average filter, can also reduce the impact of environment noise by flattening
the small waves it causes (Dube et al. 2009).
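As a sketch of the low-pass filtering described above, the following attenuates a synthetic 50 Hz power-line component while preserving a slow EDA-band drift; the cutoff frequency, filter order, and sampling rate are illustrative assumptions, not values from the paper.

```python
# Hedged sketch: a zero-phase Butterworth low-pass filter attenuating
# power-line noise (50-60 Hz) while preserving the EDA band
# (~0.0167-0.25 Hz). Cutoff and sampling rate are illustrative choices.
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass_eda(signal, fs, cutoff_hz=1.0, order=4):
    """Zero-phase Butterworth low-pass for an EDA trace sampled at fs Hz."""
    b, a = butter(order, cutoff_hz / (fs / 2.0), btype='low')
    return filtfilt(b, a, signal)

# Example: a slow EDA-band drift plus 50 Hz power-line noise at 128 Hz.
fs = 128
t = np.arange(0, 10, 1.0 / fs)
clean = 2.0 + 0.1 * np.sin(2 * np.pi * 0.1 * t)    # in-band EDA component
noisy = clean + 0.3 * np.sin(2 * np.pi * 50 * t)   # power-line interference
denoised = lowpass_eda(noisy, fs)                  # 50 Hz component removed
```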
Since the electrical impedance between skin and electrodes is drastically affected by changes
in the skin-electrode contact (Heikenfeld et al. 2018), unstable contact causes serious noise,
called skin-electrode interface noise. In contrast to stationary data collection, people's body
movements can hardly be controlled when measuring EDA with wearable biosensors in
ambulatory settings (Jebelli et al. 2018). Such movements lead to detachment of electrodes and
changes in their position and contact pressure, thereby generating significant skin-electrode
interface noise. Unlike environment noise, this type of noise shows higher amplitude than EDR
caused by stress and appears abruptly, so it cannot be suppressed by smoothing or low-pass
filtering (Shukla et al. 2018). To effectively alleviate skin-electrode interface noise, several
adaptive denoising methods (i.e., methods that work only on noisy intervals, not clean
(noise-free) intervals) based on an EDA signal model or wavelet decomposition have been
suggested and have shown successful denoising performance (Shukla et al. 2018).
The intrinsic body noises are caused by physiological responses in body systems other than
the physiological arousal induced by stress (Boucsein 2012). The most representative intrinsic
source of noise in EDA is respiratory activity (Boucsein 2012; Schneider et al. 2003). Irregular
respiratory activities such as deep breaths and coughs have been associated with sudden
increases in free-circulating adrenaline, thereby producing sweat responses (Boucsein 2012). As
a result, such irregular respiratory activities cause noise in EDA, which leads researchers to
overestimate stress (Hygge and Hugdahl 1985). Specifically, the respiration noise in EDA would be more
serious when capturing EDA in ambulatory settings than in stationary settings, because people
are more likely to breathe irregularly during physical activity than when they are still (Bradley
and Esformes 2014). However, the aforementioned denoising techniques are of limited use
against respiration noise because it has signal characteristics similar to EDR caused by stress.
To suppress the respiration noise in EDA, previous studies recorded subjects' respiration
simultaneously with EDA using belt-type sensors. Based on the respiration data, noises in EDA
that occurred after irregular respiration were manually detected and removed (Schneider et al.
2003). To facilitate this manual denoising, Ksander et al. (2018) recently developed a graphical
user interface that helps analysts efficiently observe the interaction between EDA and
respiration data. However, such manual filtering based on respiration data is unfit for
monitoring people's daily trips because acquiring respiration data is very invasive. It is also too
time- and labor-intensive to denoise the huge amount of EDA collected from multiple people's
daily trips. Therefore, to acquire high-quality EDA from people's daily trips for detecting
people's stress in the built environment, a non-invasive and non-manual denoising method for
respiration noise must be developed.

RESEARCH OBJECTIVE
Since the respiration noise in EDA is indistinguishable from EDR driven by stress in terms of
signal characteristics, it might not be possible to detect and reduce the respiration noise by
analyzing the EDA signal alone. Rather, referencing another signal that is correlated with the
respiration noise is a more feasible way to alleviate it. The authors therefore propose a
denoising method that references the photoplethysmogram (PPG) to alleviate respiration noise
in EDA, and test its performance on data collected from the field. PPG is an optical signal
acquired by illuminating human skin and measuring changes in the skin's light absorption,
reflection, and scattering (Ugnell and Öberg 1995). The intensity of the absorbed or reflected
light detected by a photodetector can be used to measure the volumetric flow of blood in the
target skin area. The volumetric flow is modulated by respiratory activity because the alterations
in intrathoracic pressure caused by respiration induce variations in venous return to the heart
(Ugnell and Öberg 1995). With this background, previous studies have found that
respiratory-induced intensity variations (RIIV) extracted from PPG can be used to estimate
respiratory status (e.g., rate and air volume) (Ugnell and Öberg 1995). Since irregular
respiration that causes noise in EDA manifests as sudden changes in respiratory rate or volume,
PPG can also be used as a noise reference signal to detect and alleviate the respiration noise.
This approach does not require wearing an additional sensor to measure PPG, because PPG and
EDA sensors can be built into a single wristband.

RESPIRATION NOISE REMOVAL METHOD


To remove the respiration noise, this method first detects occurrences of irregular respiration
using PPG. Then, EDRs that follow the irregular respiration, and are thus inferred to be
respiration noise, are removed by a rule-based algorithm. Finally, the performance of the
respiration noise removal is validated in two respects: denoising quality and the improvement in
stress-measurement quality.

Irregular Respiration Detection


A machine-learning-based classifier is developed to detect irregular respiration based on
features extracted from PPG. First, the respiratory-induced intensity variations (RIIV) are
extracted from the PPG by a band-pass filter in the range of 0.13-0.48 Hz (Ugnell and Öberg
1995). Then, the continuous RIIV is segmented into samples by a moving window of 5 breaths
(i.e., peaks of the signal). From the RIIV of the 5 consecutive breaths, the authors calculate 11
features, which have been used in previous studies to detect irregular respiration or to estimate
respiratory rate and volume from PPG (Ayappa et al. 2009): (1) integral of the positive area;
(2) integral of the negative area; (3-10) average and standard deviation of peak amplitudes and
peak intervals for local maxima and local minima; and (11) length of the RIIV line.
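A minimal sketch of this extraction step, assuming a sampling rate and computing only a few of the 11 features; the helper names are hypothetical.

```python
# Hedged sketch of RIIV feature extraction: band-pass the PPG within
# 0.13-0.48 Hz, window it into 5 consecutive breaths (RIIV peaks), and
# compute a few of the 11 features named in the text.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def extract_riiv(ppg, fs):
    """Band-pass the PPG into the respiratory band, 0.13-0.48 Hz."""
    b, a = butter(2, [0.13 / (fs / 2.0), 0.48 / (fs / 2.0)], btype='band')
    return filtfilt(b, a, ppg)

def riiv_window_features(riiv, fs):
    """Features per moving window of 5 consecutive breaths (RIIV peaks)."""
    peaks, _ = find_peaks(riiv)
    feats = []
    for i in range(len(peaks) - 4):
        seg = riiv[peaks[i]:peaks[i + 4] + 1]
        feats.append({
            'pos_area': np.sum(np.clip(seg, 0, None)) / fs,  # feature 1
            'neg_area': np.sum(np.clip(seg, None, 0)) / fs,  # feature 2
            'peak_amp_mean': float(np.mean(riiv[peaks[i:i + 5]])),  # of 3-10
        })
    return feats
```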
Based on the features extracted from RIIV, the best classification algorithm is selected
according to validation accuracy. Specifically, the authors focus on the classification
algorithm's subject independency (i.e., how accurately the algorithm works for the general
public, not only for a specific group of people). Since it is almost infeasible to develop a
user-specific classifier by collecting labeled irregular respiration data for every person, the
subject independency of the classification algorithm is critical for this study. To learn a
subject-independent classification algorithm, domain adaptation techniques are applied. Domain
adaptation is a transfer learning technique that enables a model learned from labeled data in a
source domain to conduct a similar task on unlabeled data in a different target domain (Daume
III and Marcu 2006). The technique makes this possible by finding an optimized feature
mapping that minimizes the difference in distribution between the source and target domains
while maximizing task performance on the source domain. Once a classifier is learned from
several people's labeled irregular respiration data, it can be used to detect irregular respiration
in new people via domain adaptation. The authors tried several domain adaptation algorithms
(e.g., subspace alignment (SA), transfer component analysis (TCA), information-theoretical
learning (ITL), and domain-adversarial training of neural networks (DANN)) to pick the best
one in this context. To select the most accurate and subject-independent classification and
domain adaptation algorithms, the authors use leave-one-subject-out cross-validation
(LOSOCV). By holding out one subject's data from training and then validating the trained
model on that subject's data, LOSOCV indicates a model's accuracy and subject independency.
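The LOSOCV loop itself can be sketched with scikit-learn's LeaveOneGroupOut, here on synthetic data with a plain logistic-regression classifier; the domain-adaptation step described above is omitted, and all data shapes and labels are made up for illustration.

```python
# Illustrative leave-one-subject-out cross-validation (LOSOCV):
# each fold trains on six subjects and validates on the held-out one.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.normal(size=(210, 11))                  # 11 RIIV features per sample
y = (X[:, 0] + 0.3 * rng.normal(size=210) > 0).astype(int)  # synthetic labels
subjects = np.repeat(np.arange(7), 30)          # 7 subjects, 30 samples each

accuracies = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    accuracies.append(clf.score(X[test_idx], y[test_idx]))
losocv_accuracy = float(np.mean(accuracies))
```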

Respiratory-Induced EDR Removal


This step removes respiration noise that closely follows the irregular respiration detected by
the classifier developed above. To remove respiration noise, EDRs must first be identified. The
authors use a sparse-representation-based method (Chaspari et al. 2016) to identify EDRs
because it has shown more efficient and accurate EDR identification than previous
decomposition methods. After identifying all EDRs, the EDRs caused by irregular respiration
are detected and removed based on the rules introduced in Schneider et al. (2003). Basically,
EDRs occurring 1-5 seconds after the occurrence of irregular respiration are picked out. Then, a
specific amplitude threshold (i.e., over 0.4 μS) (Schneider et al. 2003) is applied to finally
determine the respiratory-induced EDRs.
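The rule above (EDRs 1-5 s after an irregular-respiration event, with amplitude over 0.4 μS) can be sketched as follows; representing an EDR as an (onset, amplitude) pair is an assumption for illustration.

```python
# Hedged sketch of the rule-based removal step: an identified EDR is
# flagged as respiration noise if it starts 1-5 s after a detected
# irregular-respiration event AND its amplitude exceeds 0.4 uS
# (threshold from Schneider et al. 2003).

def remove_respiratory_edrs(edrs, irregular_resp_times,
                            lag_s=(1.0, 5.0), amp_threshold=0.4):
    """Return EDRs that are NOT attributed to irregular respiration.

    edrs: list of (onset_time_s, amplitude_uS) pairs.
    irregular_resp_times: onset times (s) of detected irregular breaths.
    """
    kept = []
    for onset, amplitude in edrs:
        is_noise = amplitude > amp_threshold and any(
            lag_s[0] <= onset - t <= lag_s[1] for t in irregular_resp_times)
        if not is_noise:
            kept.append((onset, amplitude))
    return kept
```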

Validation
The proposed respiration noise removal method is validated in two phases. First, the
denoising quality is quantitatively compared with previous methods (i.e., exponential smoothing
and convex optimization (Greco et al. 2016)). Artifact power attenuation (APA) and normalized
mean squared error (NMSE) are used as indices of the denoising quality (Shukla et al. 2018).
The APA, defined in Equation 1, indicates the extent of noise attenuation achieved by a
denoising method. The NMSE, calculated by Equation 2, quantifies how much distortion the
denoising method has introduced into the signal. Higher APA and lower NMSE indicate better
denoising quality. Since a signal clearly divided into noisy and clean intervals is required to
measure APA and NMSE, a controlled in-lab data collection was conducted, which is described
in the data collection section below.

APA  10log10
 nSmVar  y  n  (1)
 nSm  
Var  y  n  

  y  n   y  n  
2
nSm 
NMSE  10 log10 (2)
  y  n   y  n  
2
nSm 

* y  n  : Original Signal , y  n  : Denoised Signal , y  n  : Meanof y  n  , Sm : the Noisy Intervals


The second phase assesses the improvement in the stress-measurement quality of the denoised
signal. This is an important index of denoising performance because the purpose of denoising
EDA is to measure stress accurately from EDA. Specifically, the two EDA metrics suggested in
Chaspari et al. (2016) (i.e., mean amplitude over all phasic widths (Sint) and variability of
amplitude over all phasic widths (Svar)) are used as stress measures because they have shown
better stress measurement than previous metrics such as power spectral density or the mean of
EDA. To assess the improvement in stress-measurement quality, the authors compare two
dimensions (i.e., validity and reliability) before and after the proposed denoising
(LoBiondo-Wood and Haber 2014). Validity indicates how closely the stress measure correlates
with the actual stress subjects perceive, while reliability means how consistent the stress
measure is. As an index of validity, this study uses the point biserial correlation coefficient
(BCC), which is suited to measuring the correlation between a continuous variable (i.e., Sint or
Svar) and a binary variable (i.e., stress or non-stress). To quantify reliability, the intraclass
correlation coefficient (ICC) is used. The ICC describes how similar measurements (i.e., Sint and
Svar) are within a class and how different they are between classes (i.e., stress or non-stress).
For both coefficients, the closer the value is to 1, the better the stress measurement.
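The validity index can be sketched with SciPy's point-biserial correlation; the metric values and labels below are made up for illustration.

```python
# Point-biserial correlation between a continuous EDA metric (e.g. S_int)
# and the binary perceived-stress label; values are illustrative only.
from scipy.stats import pointbiserialr

s_int  = [0.9, 1.1, 1.0, 0.2, 0.3, 0.1, 1.2, 0.25]  # EDA metric per event
stress = [1,   1,   1,   0,   0,   0,   1,   0]      # perceived stress label

bcc, p_value = pointbiserialr(stress, s_int)         # validity index (BCC)
```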

DATA COLLECTION
Two types of data were collected to test the proposed denoising method. The first is in-lab
data, needed to develop the classifier for irregular respiration detection and to assess the
denoising quality. The second is field data, used to assess the improvement in the denoised
signal's stress-measurement quality. The data collection protocol was approved by the
University of Michigan Institutional Review Board.

In-Lab Data Collection


EDA and PPG data were collected from seven subjects, all graduate students at the
University of Michigan. To acquire data clearly separated into clean intervals and
respiratory-induced noise intervals, the data collection consisted of two parts:
a regular respiration part and an irregular respiration part. During the regular respiration part,
the subjects were asked to breathe regularly while walking (6 minutes) and sitting (6 minutes).
During the irregular respiration part, they simulated coughs (6 times in total over 6 minutes;
3 times during 3 minutes of walking and 3 times during 3 minutes of sitting) and deep breaths
(6 times in total over 6 minutes; 3 times during 3 minutes of walking and 3 times during 3
minutes of sitting), which Boucsein (2012) cites as representative causes of respiration noise in
EDA. Each simulation of irregular respiration lasted 4 seconds.

Field Data Collection


The field data collection was conducted with 10 residents of Ypsilanti Township, Michigan.
Through the field data collection, the authors gathered the elderly subjects' EDA and PPG data
labeled with their perceived stress (i.e., stress or non-stress). The subjects experienced 15
pre-designated environmental stressors (e.g., cracked sidewalks and curbs without ramps) along
a route. One research staff member walked ahead to guide the subjects, and another followed to
take videos, which were required to label the data. Afterwards, the subjects assessed their
perceived stress for each environmental stressor using a binary category (i.e., stress or
non-stress). Based on the subjects' perceived stress reports and the videos, their EDA and PPG
data were categorized as stress or non-stress. This perceived stress was used as the actual stress
when the stress-measurement quality was assessed.

RESULTS AND FINDINGS


First, based on the in-lab data, in which 2,247 samples were labeled as irregular (650
samples) or regular respiration (1,597 samples), the accuracy of irregular respiration detection
was assessed to select the best classification and domain adaptation algorithms. Table 1 shows
the LOSOCV accuracy of irregular respiration detection for each combination of classification
and domain adaptation algorithms. Considering that the best LOSOCV accuracy without
domain adaptation is 66.8% (with a Gaussian support vector machine (SVM)), the results show
that domain adaptation clearly increases the classifier's subject independency. The best
LOSOCV accuracy was 83.8% (true positive rate 67.5% and true negative rate 90.4%) with
logistic regression and transfer component analysis. This model was selected as the irregular
respiration classifier.

Table 1. Validation Accuracy of Irregular Respiration Detection

Classifier            Domain Adaptation    Accuracy (LOSOCV)
Logistic Regression   TCA                  83.8 %
Logistic Regression   SA                   66.7 %
Logistic Regression   ITL                  83.5 %
Gaussian SVM          TCA                  81.8 %
Gaussian SVM          SA                   67.5 %
Gaussian SVM          ITL                  69.8 %
K-Nearest Neighbors   TCA                  79.5 %
K-Nearest Neighbors   SA                   75.2 %
K-Nearest Neighbors   ITL                  64.3 %
Neural Network        DANN                 70.9 %

The denoising quality was assessed in terms of APA and NMSE based on the in-lab data. As
baselines, the exponential smoothing and convex optimization methods were compared with the
proposed denoising method. As shown in Table 2, the proposed method achieved the highest
APA with an NMSE comparable to exponential smoothing, meaning it has the best performance

© ASCE
Computing in Civil Engineering 2019 23

to attenuate the respiration noise in EDA, while its NMSE indicates that the distortion of clean
EDA intervals remains at a level similar to that of the other denoising methods. This result is
expected, since the previous methods are limited in detecting and responding to respiration
noise, which has signal characteristics similar to the EDR caused by stress. However, it was
found that several misdetections (i.e., missing real noise, or detecting actual EDR as noise)
degraded the denoising quality of the proposed method. Specifically, since the learned irregular
respiration classifier showed a lower true positive rate than true negative rate, it missed noise
more often than it introduced distortion in clean intervals. Future research is therefore required
to improve the denoising quality by learning a more accurate and more subject-independent
irregular respiration classifier from data on more subjects.

Table 2. Quality Comparison of Respiration Noise Removal


Index Proposed Method Exponential Smoothing Convex Optimization
APA 4.46 1.17 0.05
NMSE -7.59 -7.37 -13.79
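The NMSE index can be sketched as below. The paper does not restate its exact definition in this section, so this assumes the common form: the energy of the residual relative to the energy of the clean signal, expressed in decibels (more negative means less distortion).

```python
# Hypothetical NMSE implementation (in dB), assuming the standard
# residual-energy-over-signal-energy definition.
import numpy as np

def nmse_db(clean, denoised):
    clean = np.asarray(clean, dtype=float)
    denoised = np.asarray(denoised, dtype=float)
    return 10.0 * np.log10(np.sum((clean - denoised) ** 2) / np.sum(clean ** 2))
```

Under this definition, a denoised signal that deviates less from the clean reference yields a more negative NMSE.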

Using the field data, the quality of the EDA metrics (i.e., Sint and Svar) for stress measurement
was compared before and after the proposed denoising as the final validation. The field data
were denoised using the classification and domain adaptation algorithms selected above. Table
3 shows the changes in BCC and ICC due to denoising for each subject. In general, denoising
improved both the validity and the reliability of both EDA metrics. Out of 10 subjects, 8 showed
significant increases in BCC and ICC. This improvement was mainly because the proposed
denoising method made the two sample types (i.e., stress and non-stress) more clearly and
reliably distinguishable by removing respiration noise that caused overestimation of stress in
non-stress samples. During the field data collection, the authors often observed that the subjects
exhibited irregular respiration (e.g., coughing and deep breathing) regardless of their stress,
which is consistent with this result. Despite the general improvement, the result of one subject
(i.e., Subject 7) showed that denoising can degrade stress measurement quality. In this case,
several irregular respirations occurred while the subject felt stress caused by environmental
stressors. As a result, removing respiration noise also removed some of the EDR due to actual
stress, causing underestimation of stress in stress samples. Future research should further
investigate this potential degradation using data from more subjects.

Table 3. Changes in Validity and Reliability of Stress Measurement (before → after denoising)

Subject  Metric  BCC            ICC
1        Sint    0.34 → 0.44    0.62 → 0.76
1        Svar    0.32 → 0.44    0.58 → 0.77
2        Sint    0.24 → 0.26    0.37 → 0.41
2        Svar    0.24 → 0.25    0.36 → 0.38
3        Sint    0.11 → 0.23    0.08 → 0.35
3        Svar    0.11 → 0.25    0.08 → 0.39
4        Sint    0.30 → 0.46    0.45 → 0.72
4        Svar    0.31 → 0.45    0.47 → 0.71
5        Sint    -0.07 → 0.11   0.06 → 0.21
5        Svar    -0.06 → 0.12   0.05 → 0.25
6        Sint    0.40 → 0.45    0.70 → 0.76
6        Svar    0.39 → 0.45    0.69 → 0.76
7        Sint    0.42 → 0.29    0.49 → 0.28
7        Svar    0.39 → 0.29    0.46 → 0.27
8        Sint    0.43 → 0.56    0.64 → 0.80
8        Svar    0.44 → 0.56    0.66 → 0.80
9        Sint    0.14 → 0.36    0.16 → 0.65
9        Svar    0.15 → 0.35    0.19 → 0.64
10       Sint    0.00 → 0.18    0.00 → 0.33
10       Svar    -0.01 → 0.17   0.00 → 0.28


CONCLUSION
Current respiration noise removal, which is manual and requires invasively acquired
respiration data, is difficult to apply to EDA collected during people's daily trips. This study
proposes a non-invasive, automated denoising method that references the PPG signal to remove
respiration noise from EDA collected by wearable biosensors, enabling accurate and efficient
detection of people's stress responses to the built environment. To test the performance of the
proposed method, its denoising quality was first compared with previous denoising methods.
Then, the improvement in the EDA metrics' stress measurement quality was compared before
and after respiration noise removal. The proposed denoising method achieved not only better
respiration noise removal than previous methods but also improved quality of the EDA metrics
for stress measurement. The findings show that the PPG signal can be used to detect and
attenuate respiration noise in EDA, and further suggest the potential of cross-referencing signals
commonly collected by wearable sensors (e.g., ECG, skin temperature, and acceleration) to
mitigate noise in one another. The findings also contribute to acquiring the high-quality EDA
signal that is essential for wearable-based stress detection.

ACKNOWLEDGEMENT
This study was supported by the Exercise and Sport Science Initiative (ESSI-2018-4), the
Urban Collaboratory at the University of Michigan, and the National Science Foundation -
United States (#1800310).

REFERENCES
Ayappa, I., Norman, R. G., Whiting, D., Tsai, A. H., Anderson, F., Donnely, E., Silberstein, D.
J., and Rapoport, D. M. (2009). "Irregular respiration as a marker of wakefulness during
titration of CPAP." Sleep, 32(1), 99-104.
Boucsein, W. (2012). Electrodermal activity, Springer Science & Business Media.
Bradley, H., and Esformes, J. D. (2014). "Breathing pattern disorders and functional movement."
International journal of sports physical therapy, 9(1), 28.
Bunge, C. (1987). "Stress in the library." Library Journal, 112(15), 47-51.
Chaspari, T., Tsiartas, A., Duker, L. I. S., Cermak, S. A., and Narayanan, S. S. (2016). "EDA-
Gram: Designing electrodermal activity fingerprints for visualization and feature extraction."
Proc., 38th Annual International Conference of the IEEE Engineering in Medicine and
Biology Society (EMBC), IEEE, 403-406.
Daume III, H., and Marcu, D. (2006). "Domain adaptation for statistical classifiers." Journal of
Artificial Intelligence Research, 26, 101-126.
Dube, A.-A., Duquette, M., Roy, M., Lepore, F., Duncan, G., and Rainville, P. (2009). "Brain
activity associated with the electrodermal reactivity to acute heat pain." Neuroimage, 45(1),
169-180.
Greco, A., Valenza, G., Lanata, A., Scilingo, E. P., and Citi, L. (2016). "cvxEDA: A convex
optimization approach to electrodermal activity processing." IEEE Transactions on
Biomedical Engineering, 63(4), 797-804.
Heikenfeld, J., Jajack, A., Rogers, J., Gutruf, P., Tian, L., Pan, T., Li, R., Khine, M., Kim, J., and
Wang, J. (2018). "Wearable sensors: modalities, challenges, and prospects." Lab on a Chip,
18(2), 217-248.


Hygge, S., and Hugdahl, K. (1985). "Skin conductance recordings and the NaCl concentration of
the electrolyte." Psychophysiology, 22(3), 365-367.
Jebelli, H., Choi, B., Kim, H., and Lee, S. "Feasibility study of a wristband-type wearable sensor
to understand construction workers' physical and mental status." Proc., Construction
Research Congress.
Ksander, J. C., Kark, S. M., and Madan, C. R. (2018). "Breathe Easy EDA: A MATLAB toolbox
for psychophysiology data management, cleaning, and analysis." F1000Research, 7.
LoBiondo-Wood, G., and Haber, J. (2014). "Reliability and validity." Nursing research-ebook:
Methods and critical appraisal for evidence-based practice. Missouri: Elsevier Mosby, 289-
309.
McCorry, L. K. (2007). "Physiology of the autonomic nervous system." American journal of
pharmaceutical education, 71(4), 78.
Schneider, R., Schmidt, S., Binder, M., Schäfer, F., and Walach, H. (2003). "Respiration-related
artifacts in EDA recordings: introducing a standardized method to overcome multiple
interpretations." Psychological reports, 93(3), 907-920.
Shukla, J., Barreda-Ángeles, M., Oliver, J., and Puig, D. (2018). "Efficient wavelet-based artifact
removal for electrodermal activity in real-world applications." Biomedical Signal Processing
and Control, 42, 45-52.
Ugnell, H., and Öberg, P. (1995). "The time-variable photoplethysmographic signal; dependence
of the heart synchronous signal on wavelength and sample volume." Medical engineering &
physics, 17(8), 571-578.


A Machine Learning Framework to Identify Employees at Risk of Wage Inequality: U.S.
Department of Transportation Case Study

Hamid R. Karimian, Ph.D.1; Behzad Rouhanizadeh2; Amirhosein Jafari, Ph.D., M.ASCE3; and
Sharareh Kermanshachi, Ph.D., P.E., M.ASCE4

1Postdoctoral Research Fellow, Dept. of Biomedical Engineering, Lindy Boggs Center, Suite
500, Tulane Univ., New Orleans, LA 70118. E-mail: [email protected]
2Ph.D. Student, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman Hall,
416 Yates St., Arlington, TX 76019. E-mail: [email protected]
3Assistant Professor, Bert S. Turner Dept. of Construction Management, Louisiana State Univ.,
3315K Patrick F. Taylor Hall, Baton Rouge, LA 70803 (corresponding author). E-mail:
[email protected]
4Assistant Professor, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman
Hall, 416 Yates St., Arlington, TX 76019. E-mail: [email protected]

ABSTRACT
In the last decade, many programs have been developed to help decrease or eliminate
wage inequality in the United States; however, identifying employees who might be at risk of
wage inequality remains challenging. This paper presents a framework to identify such
employees in an organization, using a machine learning approach. The U.S. Department of
Transportation (DOT) workforce demographic information was used to train and test the
model. First, a prediction model is developed to estimate the salary ranges of employees based on
historical data, using supervised machine learning techniques. Then, a minority score is defined
to determine the employees who might be at risk of inequality, based on three factors:
gender, ethnicity, and disability type. Finally, a framework is developed to identify the
employees at risk of wage inequality, using the predicted salary range and the minority index. The
proposed framework enables employers to establish fair wages, resulting in the reduction and/or
elimination of inequality challenges and their consequences in their organizations.

INTRODUCTION
Wages are an undeniably important concern of both employees and organizations. U.S. wage
growth remains slow and uneven, with African-Americans and women still at a clear
disadvantage. Gender and racial wage gaps in the United States persist, even though they have
narrowed in some cases over the years. The issue of wage inequality has also become important
in the political debate due to growing economic crises around the world, specifically in
capitalist societies. Thus, decision makers have been motivated to understand the sources of
wage inequality.
Recently, a growing body of research has explored wage inequality and its sources.
Stiglitz (2012) argued that income inequality leads to instability in a country's economic
condition by lowering growth, decreasing demand, and boosting the rate of unemployment.
Inequality slows the pace of growth, which causes poverty to increase, and poverty is more
severe when capital is unevenly distributed (Ravallion, 2013). Kumhof et al. (2012) showed an
association between rising income inequality and other immense shortages that create economic
instability in society. Galor and Moav (2004) demonstrated that larger income inequality
deprives the lower-income class of the ability to stay healthy.


Several factors lead to growth in wage inequality, such as changes in the capital-output ratio,
skill-biased technological upgrading, dispersion of wages across industries, and supervisor
evaluations (Pupato, 2014). Park and Mah (2011) revealed that several of the economic policies
applied to overcome the financial crisis contributed to increasing income inequality. Stypińska
and Gordo (2018) examined the interaction among three elements, age, gender, and migration
background, to identify whether they contribute to wage inequality, and found that these factors
are interactively associated with it.
According to the literature, gender, ethnicity, and level of disability, the most documented
minority statuses, have the potential to produce inequality in the labor market (Jean et al.,
2016). Inequality between genders, proven to cause many socio-economic issues, is one of the
most incisive and universal forms of social inequality (Duflo, 2012). Understanding gender
wage inequality provides more information for labor market development and for improving
the functioning of various labor market segments (Stypińska and Gordo, 2018). The wage
gap, as a factor in gender inequality, has been well investigated and documented (Abeliansky &
Prettner, 2017). According to recent research, lessening gender wage disparities would increase
economic prosperity (Walls & Dowler, 2015). While the number of females earning graduate
degrees and entering professional positions has surged discernibly in recent decades, there is
still a persistent difference between the earnings of male and female workers (Misra &
Murray-Close, 2014). Several studies have focused on the diverse effects of workers' ethnic or
racial minority status on their earnings (Stypińska and Gordo, 2018). In the last century, the
literature on the racial wage gap has primarily focused on men, because men have, on average,
constituted the majority of wage earners as well as the greatest contributors to household
income. Currently, however, black women participate in the labor force at the same rate as
black men, and black women even outnumber black men in parts of the labor force. Neal
(2004) found that the racial wage gap between young black and white women is at least sixty
percent larger than measured. Fisher and Houseworth (2012) revealed that among educated
women, there is no wage gap between white and black females. Individuals with disabilities are
arguably the most economically marginalized population in the U.S. (Kermanshachi and
Sadatsafavi, 2018); thus, state-federal rehabilitation counseling makes a significant effort to
reduce poverty among them (Walls & Dowler, 2015). Poverty and disability are closely tied,
because of the associated healthcare costs and the limitations on staying in, or even entering,
the labor market (Abeliansky & Prettner, 2017). In addition, people with disabilities commonly
have lower rates of participation in the paid labor market (Duflo, 2012).
Due to the pace of technological change and the arising complexities of decision making, the
implementation of artificial intelligence (AI) for various purposes, including social and economic
affairs, has increased dramatically in recent years (Agrawal et al., 2018). In addition,
implementing AI increases the productivity of analysis when dealing with big data (Abeliansky
& Prettner, 2017). Kleinberg et al. (2015) indicated that machine learning methods are critical
to prediction and policy-making problems.
Several studies have investigated wage inequities and their causes; however, no study has
incorporated an intersectional perspective to analyze the combined effects of gender, ethnicity,
and type of disability on how susceptible workers are to wage inequality. This study aims to fill
that gap by developing a framework that uses a supervised machine learning approach. The
framework assists employers in creating fair wage rates regardless of employees' demographic
information, which leads to the reduction and/or elimination of inequality


challenges and consequences.

MODEL DEVELOPMENT
The purpose of the developed framework is to identify employees at risk of wage
inequality based on historical employment data, using a machine learning approach. The
flowchart presented in Figure 1 shows the major steps of the proposed framework.

Figure 1: The developed model for identifying employees at risk of wage inequality
The framework starts with developing a prediction model to estimate employees' salary
ranges based on historical data, using a supervised machine learning approach. Different
models can be developed to predict the salary range from the available inputs in the historical
data (such as employees' age, experience, and education, among others), using linear regression,
neural networks, or other techniques; however, the predictive model needs to be validated and
satisfy a certain level of accuracy to be used in this framework. After the model is trained, new
employment data are entered as the input of the developed predictive model, and the estimated
salary range output is compared to the actual salary. If a certain amount of variation is
observed, that employee might be at risk of wage inequality. Therefore, a minority index is
calculated based on the employee's gender, ethnicity, and status as a person with disability (PWD).
If an employee with a high variation in his/her actual salary is considered a minority based on the
defined minority index, the employee is marked as a person at risk of wage inequality.
The two main models used in the developed framework are described in detail in this section.
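The flagging logic described above can be sketched as a simple decision rule. The threshold values shown (two salary ranges, an 80% minority index) follow the case study later in the paper, but are exposed as parameters here.

```python
# Hypothetical sketch of the framework's flagging rule: an employee is
# flagged when the model predicts a salary at least `gap_threshold` ranges
# above the actual salary AND the minority index falls below `index_threshold`.
def flag_at_risk(predicted_range, actual_range, minority_index,
                 gap_threshold=2, index_threshold=0.80):
    underpaid = (predicted_range - actual_range) >= gap_threshold
    return underpaid and (minority_index < index_threshold)
```

Both conditions must hold: a large salary gap alone, or minority status alone, is not sufficient to flag an employee.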
Prediction Model: Supervised machine learning techniques are used to develop a prediction
model for employees' salary ranges. Formally, in supervised machine learning, given a
set of N input-output pairs of training examples E = {(x_i, y_i) ∈ X × Y : i = 1, ..., N}, we learn an


objective function f(x; W) = y that maps each input x ∈ X to an output y ∈ Y. The
learned mapping f is assumed to generalize well and to operate on unobserved
datasets in the future, making reasonably accurate predictions. In this study, a multilayer
feedforward neural network (FFN) is implemented with five layers: an input layer, three hidden
layers, and an output layer. An FFN is a neural network in which no loops or cycles connect the
nodes of one layer to the next. Information is fed to the input nodes and moves forward to
the output nodes through the hidden layers. The parameters of the network are trained by
gradient descent using the back-propagation algorithm. In our case study, we use a binary
cross-entropy loss function. We report the performance of our model in terms of per-class
accuracy and the confusion matrix.
Minority Index: The minority index is defined based on three main categories: gender,
ethnicity, and PWD. The weight of each category in the minority index is calculated from the
actual probability of the majority group in that category in the studied data. For example, as the
majority share of a gender (male, in this case) increases, the weight of gender in the minority
index increases; therefore, the gender minority (females, in this case) is highlighted more
strongly. The weight of each category is calculated using equation (1):

W_j = P(j_Majority) / Σ_{j∈J} P(j_Majority)    (1)

where J is the set of minority categories (in this case gender, ethnicity, and PWD), W_j
is the weight of minority category j, and P(j_Majority) is the probability of the majority group
in category j in the historical data.
The minority index for each individual i can be calculated using equation (2):

Minority Index(i) = Σ_{j∈J} W_j × P(j_i)    (2)

where W_j is the calculated weight of minority category j, j_i is the status of person
i in category j, and P(j_i) is the probability of status j_i in category j in
the historical data. An acceptable threshold X is required to identify people at risk of wage
inequality based on the minority index; i.e., when the minority index is less than X, there might
be a risk of wage inequality. The threshold X can be determined from the actual diversity
numbers and the level of management concern about wage inequality risk.
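Equations (1) and (2) can be sketched as follows, using the majority probabilities from the case study data (male 76.1%, white 74.8%, without disability 92.2%); the category names are illustrative.

```python
# Equation (1): weight each category by its majority probability,
# normalized over all categories.
maj = {"gender": 0.761, "ethnicity": 0.748, "pwd": 0.922}  # P(j_Majority)
total = sum(maj.values())
weights = {j: p / total for j, p in maj.items()}

def minority_index(status_probs):
    # Equation (2): status_probs maps each category j to P(j_i), the
    # probability of this person's status within that category.
    return sum(weights[j] * status_probs[j] for j in weights)
```

For a white male without disability, the index evaluates to roughly 0.82, matching the top-left cell of Table 2.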

DATA COLLECTION AND ANALYSIS


The data used for this study were obtained from the U.S. Department of Transportation (DOT)
and provide demographic information on all onboard skilled transportation workforce hires
across the nation in 2016. The collected data contain DOT employees' age, years of service,
education, retirement eligibility, veteran preference, type of appointment for the job position,
type of region, supervisory role, equivalent grade, the location of the station where the job
position is located, disability status, gender, and ethnicity, as well as the salary ranges. These
data were collected in the 4th quarter of 2016 and include 45,818 employees located in the
United States.
From the available data set, around 2% (818 records) were selected randomly for testing the
developed model. Among the 818 data points selected for testing, 17 (2%) were selected
randomly and transformed into false data, in which one of the minority categories (gender,
ethnicity, or PWD) was changed while the salary was decreased. Therefore, the testing data set
includes 2% of people who are at risk of wage inequality. The remaining 98%

(45,000 records) of the data are treated as historical data and used for training the prediction
model. Table 1 presents the results of the descriptive analysis for the 45,000 onboard DOT employees.

Table 1: Descriptive Analysis of 2016 Onboard DOT Employees


Demographic Information Number (%) Demographic Information Number (%)
Salary ($1,000) Years of Service (year)
Under $30 376 (0.8%) Under 5 5,545 (12.3%)
$30 - $49 1,605 (3.6%) 5–9 10,084 (22.4%)
$50 - $69 4,891 (10.9%) 10 – 14 7,308 (16.2%)
$70 - $89 6,951 (15.4%) 15 – 19 5,529 (12.3%)
$90 - $109 7,918 (17.6%) 20 – 24 3,693 (8.2%)
$110 - $129 8,043 (17.9%) 25 – 29 6,308 (14.0%)
$130 - $149 8,007 (17.8%) 30 – 34 4,512 (10.0%)
Over $149 7,209 (16.0%) Over 35 2,021 (4.5%)
Age (year) Retirement Eligibility
Under 25 548 (1.2%) Not Eligible 11 (0.0%)
25 – 29 3,137 (7.0%) Next 31 Years or More 1,342 (3.0%)
30 – 34 6,218 (13.8%) Next 30 Years 5,418 (12.0%)
35 – 39 5,664 (12.6%) Next 25 Years 7,063 (15.7%)
40 – 44 3,720 (8.3%) Next 20 Years 4,575 (10.1%)
45 – 49 5,698 (12.7%) Next 15 Years 5,221 (11.6%)
50 – 54 8,522 (19.0%) Next 10 Years 6,665 (14.8%)
55 – 59 6,394 (14.2%) Next 5 Years 8,901 (19.7%)
Over 60 5,099 (11.3%) Current Eligible 5,804 (12.9%)
Education Level Federal Region
Up to High School Graduate 16,353 (36.3%) Region I 1,556 (3.4%)
Post High School Degree 524 (1.2%) Region II 3,630 (8.1%)
Some College 12,007 (26.7%) Region III 6,741 (15.0%)
Bachelor's Degree 11,950 (26.5%) Region IV 7,174 (15.9%)
Post-Bachelors 977 (2.2%) Region V 5,272 (11.7%)
Master’s Degree 2,871 (6.4%) Region VI 8,351 (18.6%)
Doctorate Degree 318 (0.7%) Region VII 1,748 (3.9%)
Type of Region Region VIII 2,020 (4.5%)
Headquarter 8,811 (19.6%) Region IX 5,032 (11.2%)
Not Headquarter 36,189 (80.4%) Region X 3,470 (7.7%)
Appointment Status Gender
Permanent 44,422 (98.7%) Male 34,248 (76.1%)
Temporary 578 (1.3%) Female 10,752 (23.9%)
Veteran Preferences PWD
Entitled to Veteran Pref. 14,183 (31.5%) With Disabilities 3,526 (7.8%)
Not Entitled to Veteran Pref. 30,817 (68.5%) Without Disabilities 41,474 (92.2%)
DOT MCO Families Ethnicity
Community Planning 47 (0.1%) White 33,646 (74.8%)
Contracting 254 (0.7%) American Indian/Alaska Native 502 (1.1%)
Economist 29 (0.1%) Asian 1,884 (4.2%)
Engineering 4,122 (9.2%) Black/African American 4,925 (10.9%)
Human Resources Prof. 329 (0.7%) Hispanic/Latino 3,420 (7.6%)
Information technology 1,878 (4.2%) Native Hawaiian/Other Islander 235 (0.5%)
Transportation Safety 22,728 (50.5%) Two or More races 382 (0.8%)
Non MCO 15,613 (34.7%) Other 6 (0.0%)
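The 98%/2% split with synthetic at-risk records described above can be sketched as follows. The field names and the choice of which minority attribute to flip are hypothetical; the paper only states that one minority category is changed while the salary is decreased.

```python
# Sketch of the train/test construction with injected at-risk records.
import numpy as np

def make_splits(records, n_test=818, n_false=17, seed=42):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(records))
    test = [records[i] for i in idx[:n_test]]
    train = [records[i] for i in idx[n_test:]]
    # Flip one minority attribute and lower the salary for a few test
    # records, creating known at-risk employees to evaluate against.
    for i in rng.choice(n_test, size=n_false, replace=False):
        rec = test[i]
        rec["gender"] = "Female" if rec["gender"] == "Male" else "Male"
        rec["salary_range"] = max(0, rec["salary_range"] - 2)
        rec["at_risk"] = True
    return train, test
```

Because the falsified records are labeled, the framework's flags can later be scored against a known ground truth.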

RESULTS AND DISCUSSION


Prediction Model Implementation: In preparing the data, we split it into 80%


training and 20% validation sets. We use a one-hot representation and convert each attribute into
an n-dimensional vector, where n is the number of values in the domain of that attribute. For
example, Salary Range has 8 values, so each value can be represented by an 8-dimensional
vector: Under $30 → [1,0,0,0,0,0,0,0], $30-$49 → [0,1,0,0,0,0,0,0], and so on. In the dataset,
each example has 21 attributes with discrete values. The salary range is the dependent variable,
while the other variables are the independent variables. The one-hot representation yields
165-dimensional input vectors (employee features) and 8-dimensional output vectors (salary
ranges). The model was developed with the Keras deep learning tool, an open-source neural
network library that runs on top of the TensorFlow package. The architecture of the model is as
follows: five fully connected layers; an input layer with 165 nodes; a first hidden layer that
compresses the input information into 64 nodes, followed by a ReLU activation function to
make the transformation nonlinear; a second hidden layer with 32 nodes, followed by dropout
and normalization layers to prevent the model from overfitting the training data; and 16 and 8
nodes in the last hidden layer and the output layer, respectively. The output layer uses a sigmoid
activation function to produce outputs between zero and one.
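The described five-layer network can be sketched in Keras as below; the dropout rate, optimizer, and other training settings are not stated in the paper and are assumptions here.

```python
# Sketch of the five-layer feedforward network (assumed hyperparameters
# marked in comments).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(165,)),              # one-hot employee features
    layers.Dense(64, activation="relu"),    # first hidden layer
    layers.Dense(32, activation="relu"),    # second hidden layer
    layers.Dropout(0.5),                    # rate is an assumption
    layers.BatchNormalization(),
    layers.Dense(16, activation="relu"),    # last hidden layer
    layers.Dense(8, activation="sigmoid"),  # one output per salary range
])
model.compile(optimizer="adam",             # optimizer is an assumption
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

The sigmoid output layer produces 8 independent scores in [0, 1], one per salary range, matching the one-hot target vectors.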
The model was trained on 80% of the data points, while 20% was held out for testing and
validating the trained model. The confusion matrix of the prediction model on the training data
is shown in Figure 2(a). The R-square of the prediction model for the training data is 89.8%. To
increase the success likelihood of the prediction model, the acceptable predicted salary range is
assumed to span from one range below to one range above the actual range (as highlighted in
Figure 2). With this tolerance, the diagonal band in Figure 2(a) contains more than 98% of the
total training data, resulting in a very accurate prediction. In addition, Figure 2(b) shows the
confusion matrix of the prediction model on the validation data. The R-square of the prediction
model for the validation data is 88.0%, illustrating that the model is valid for predicting the
salary range from an employee's characteristics with acceptable accuracy.

Figure 2. Normalized Confusion Matrix of the Prediction Model


Minority Index Implementation: Based on the available historical data (Table 1), the
majorities for gender, ethnicity, and PWD are male (76.1%), white (74.8%), and persons without


disability (92.2%), respectively. Therefore, the weight of each minority category can be
calculated from Equation (1) as 0.31 for gender, 0.31 for ethnicity, and 0.38 for PWD. Based on
the available historical data, the minority index calculated using Equation (2) is shown in
Table 2.

Table 2: Calculated Minority Index Based on the Available Historical Data

Ethnicity                        Male Without  Male With   Female Without  Female With
                                 Disability    Disability  Disability      Disability
White                            82%           50%         65%             33%
American Indian/Alaska Native    59%           27%         43%             11%
Asian                            60%           28%         44%             12%
Black/African American           62%           30%         46%             14%
Hispanic/Latino                  61%           29%         45%             13%
Native Hawaiian/Other Islander   59%           27%         43%             11%
Two or More races                59%           27%         43%             11%
Other                            59%           27%         42%             10%

Table 3: Results of the Model Testing

Actual Employees                        Predicted In Risk of     Predicted Not in Risk of
                                        Wage Inequality (25)     Wage Inequality (793)
In Risk of Wage Inequality (17)         16 (94.1%)1              1 (5.9%)2
Not in Risk of Wage Inequality (801)    9 (1.1%)3                792 (98.9%)4

1 true positive (TP); 2 false negative (FN); 3 false positive (FP); 4 true negative (TN)

As shown in Table 2, the highest minority index, 82%, belongs to white males without
disability, while the lowest, 10%, belongs to females of other ethnicities with disability. It is
worth mentioning that increasing diversity will level the minority index across the different
minority categories. Increasing the acceptable threshold increases the number of people
considered at risk of wage inequality. For example, a threshold of 80% on the available
historical data considers anyone except a white male without disability a minority, whereas
decreasing this threshold to 60% excludes White, Black/African American, and Hispanic/Latino
males, as well as White females, without disability from the risk of wage inequality.
Framework Implementation: The framework was implemented on the selected 818 data
points, which include 17 persons at risk of inequality. First, the prediction model, trained on the
historical data, is used to predict the salary ranges of all 818 employees. If the predicted salary
for an employee is at least two salary range categories (approximately $40,000) higher than the
actual salary, that employee is selected for further investigation. After that, the minority index is


calculated for each selected employee. If the minority index is less than a threshold of 80%, that
employee is flagged as a person at risk of wage inequality. Table 3 presents the results of the
model implementation on the selected data (a total of 25 employees were marked at risk of wage
inequality).
As shown in Table 3, among the 17 employees at risk of wage inequality, the model
identified 16 (TP = 16), while 1 at-risk employee was not identified by the model (FN = 1). On
the other hand, among the 801 employees who were not at risk of wage inequality, the model
correctly classified 792 (TN = 792), while only 9 were incorrectly flagged as at risk of wage
inequality (FP = 9). To evaluate the error of the developed model, two factors are used:
precision and recall. Precision (also called positive predictive value) is the fraction of relevant
instances among the retrieved instances, while recall (also known as sensitivity) is the fraction
of relevant instances that are retrieved out of the total number of relevant instances. The
precision, the ratio of true positives to the sum of true positives and false positives, is 64.0%,
while the recall, the ratio of true positives to the sum of true positives and false negatives, is
94.1%. By further investigating the 25 flagged employees (3% of the test set), the proposed
framework can assist the employer in identifying 94.1% of the employees at risk of wage
inequality (16 out of 17), which shows a high recall value.
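Using the standard definitions, the counts in Table 3 (16 true positives, 1 missed at-risk employee as a false negative, 9 not-at-risk employees flagged as false positives) give these precision and recall values:

```python
# Precision and recall from the Table 3 confusion counts.
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)  # fraction of flagged employees truly at risk
    recall = tp / (tp + fn)     # fraction of at-risk employees that were flagged
    return precision, recall

p, r = precision_recall(tp=16, fp=9, fn=1)  # p = 16/25 = 0.64, r = 16/17 ≈ 0.941
```

For this application a high recall is the priority: the cost of reviewing a few extra flagged employees is low compared to missing someone actually underpaid.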

CONCLUSIONS
In this study, a framework was developed to identify employees at risk of wage inequality
based on historical employment data, using a supervised machine learning approach. The
framework utilized an FFN to predict employees' salary ranges and developed a minority
index to define the inequality risk. The implementation of the proposed framework on DOT
employee demographic data demonstrated its capability to identify employees who might be at
risk of wage inequality. The results showed that while 17 out of 818 employees were at risk of
wage inequality, the framework flagged 25 employees, 16 of whom were actually at risk (recall
of 94.1%). The authors hope that more employers will use this framework in the construction
industry to identify employees at risk of wage inequality and, by reducing and/or eliminating
inequities, move toward creating a fair wage and an equal work environment.

REFERENCES
Abeliansky, A., & Prettner, K. (2017). “Automation and demographic change.” CEGE
Discussion Paper, 310.
Agrawal, A., Gans, J., & Goldfarb, A. (2018). "Prediction, Judgement, and Uncertainty." NBER
Working Paper, 24243.
Duflo, E. (2012). “Women, empowerment and economic development.” Journal of Economic
Literature, 50:1051-1079.
Fisher, J. D., & Houseworth, C. A. (2012). "The Reverse Wage Gap among Educated White and
Black Women." Journal of Economic Inequality, 10(4): 449–470.
Galor, O., Moav O. (2004). “From physical to human capital accumulation: inequality and the
process of development.” Rev Econ Stud, 71(4):1001–26.
Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon S. (2016). “Combining
satellite imagery and machine learning to predict poverty.” Science, 353(6301):790-794.
Kleinberg, J., Ludwig, J., Mullainathan, S., & Obermeyer. Z. (2015). “Prediction policy
problems.” American Economic Review, 105(5):491-495.

© ASCE
Computing in Civil Engineering 2019 34

Kermanshachi, S., & Sadatsafavi, H. (2018). "Predictive Modelling of U.S. Transportation
Workforce Diversity Trends: A Study of Human Capital Recruitment and Retention in
Complex Environments." Proceedings of the ASCE International Conference on Transportation
& Development, Pittsburgh, PA, July 15-18, 2018.
Kumhof, M., Lebarz C., Ranciere R., Richter A. (2012). “Income inequality and current account
imbalances.” IMF Working Papers 12/08.
Misra, J., & Murray-Close, M. (2014). “The gender wage gap in the United States and cross
nationally.” Sociology Compass, 8:1281-1295.
Neal, D., (2004). “The Measured Black-White Wage Gap among Women is Too Small.” Journal
of Political Economy, 112(1): S1–S28.
Park, J., Mah J. (2011). “Neo-liberal reform and bipolarization of income in Korea.” Journal of
Contemporary Asia, 41:249–265.
Pupato, G., P. (2014). “Performance Pay, Trade and Inequality.” manuscript.
Ravallion, M. (2013). “Pro-poor growth: a primer.” World Bank Policy Research Working Paper
Series, 3242.
Stiglitz, J. (2012). “The price of inequality.” Published by W. W. Norton & Company, Inc.
Stypińska, J., & Gordo, L. R. (2018). "Gender, Age and Migration: An Intersectional Approach
to Inequalities in the Labour Market." European Journal of Ageing, 15(1): 23–33.
Walls, R. T., & Dowler, D. L. (2015). "Disability and income." Rehabilitation Counseling
Bulletin, 58:146–153.


Integrating Positional and Attentional Cues for Construction Working Group Identification: A Long Short-Term Memory Based Machine Learning Approach
Jiannan Cai¹; Yuxi Zhang²; and Hubo Cai, Ph.D., P.E., M.ASCE³

¹Lyles School of Civil Engineering, Purdue Univ., 550 Stadium Mall Dr., West Lafayette, IN 47906. E-mail: [email protected]
²Lyles School of Civil Engineering, Purdue Univ., 550 Stadium Mall Dr., West Lafayette, IN 47906. E-mail: [email protected]
³Lyles School of Civil Engineering, Purdue Univ., 550 Stadium Mall Dr., West Lafayette, IN 47906. E-mail: [email protected]

ABSTRACT
Construction entities interact with each other to accomplish assigned tasks, constituting
working groups. Recognizing working groups is important as it enables the correct
comprehension of jobsite context, which in turn facilitates the interpretation of entities’
intentions, and the prediction of their movements. Aiming at identifying working groups formed
by interacting workers and/or equipment, this study devises a machine learning approach to
integrate positional and attentional cues. Methods are created to represent the spatial and
attentional states of individual entities and compute positional and attentional cues between two
entities. A long short-term memory network is adopted to identify the working groups. The
proposed method is validated using construction videos that are available online and taken by the
authors. The results suggest that by integrating positional and attentional cues, the working
groups can be identified with an accuracy of over 95%, much higher than that obtained using
positional cues only.

INTRODUCTION
Construction entities (including both workers and equipment) interact with each other to
accomplish the assigned tasks, constituting several working groups. Inappropriate interactions
among these entities may cause disastrous consequences, such as struck-by accidents. If the
movements of workers and equipment can be accurately predicted in advance, struck-by
accidents can be prevented. Since an entity's behavior is triggered and determined by its
involved activities, and the behaviors of entities in the same group tend to exhibit group
patterns, recognizing the working groups is of great significance. It enables the correct
comprehension of jobsite context, which in turn facilitates the interpretation of workers’
intentions, the prediction of their movements, and the detection of potential abnormal events.
Many studies have created methods to automatically recognize actions of individual entities
from visual data, including workers laying bricks, transporting materials (Luo et al. 2018), and
holding and picking up objects (Khosrowpour et al. 2014), and the excavating and loading actions
of excavators (Golparvar-Fard et al. 2013). Some of them exploited the interaction between
multiple entities, with a particular emphasis on interactions between different types of equipment
(Kim et al. 2018a, b). In these studies, hand-crafted decision rules were used to determine the
working states (e.g., loading, hauling, idling) of the equipment based on their spatial-temporal
relation (e.g., distance). Such an approach is effective for activities with simple interactions
and repeated processes, but is insufficient for complex group activities, especially those
involving multiple workers, as workers' behavior is much more flexible and diverse. As a result, the


analysis of complex interactions among multiple entities remains challenging. To address this
challenge, the working groups formed by interactive entities should first be identified, which,
however, has been overlooked in existing studies.
Aiming at accurately identifying working groups with a particular focus on those involving
human workers, this paper presents a machine learning approach that integrates the positional
and attentional cues to implicitly learn the patterns presented in multi-entities interactions. In this
study, entities refer to construction workers and equipment. The positional cues refer to the
spatial relationship between entities, such as the distance and relative direction. The attentional
cues reflect the entity’s visual attention, such as the head pose.
In this study, the working group identification is modeled as a classification problem. The
spatial and attentional states of individual entities are first represented numerically, and the
corresponding positional and attentional cues between any two entities are then computed. These
computed cues within a time period are constructed as time-sequential features and fed into a
long short-term memory (LSTM) network that can learn their temporal dependencies and
classify the working groups. Finally, the proposed method is validated using videos from a
hospital construction project that are available online and videos of an ongoing teaching-building
project taken by the authors on campus.
The contribution of this study is threefold. First, a set of positional and attentional cues have
been identified as critical features for construction working group identification. Second, the
proposed method can deal with a varying number of entities and can be used in general
construction scenarios. Third, the LSTM network has been adopted to model the temporal
dependency among features, which can better capture the dynamic interactions on the jobsite.

METHODOLOGY
This study stems from the premise that the real-time states of construction entities can be
acquired from visual data (Zhu et al. 2017, Memarzadeh et al. 2013). Methods are developed to
represent the states of individual entities in terms of location and visual attention and compute
the positional and attentional cues between any two entities that are further treated as features to
identify working groups. The annotation of working groups is based on the jobsite context – if
two entities are interacting with each other, they are annotated as one group, otherwise, they are
not in one group. Figure 1 illustrates the overall framework.

Spatial and Attentional State Representation


In this study, the entity’s spatial state refers to the real-time location, typically measured by
the central coordinates of its bounding box for 2D visual data. An entity’s attentional state refers
to the direction of its visual attention, which is measured by head pose, body orientation, and
body pose. For equipment, the main cab is regarded as its “head”.
At each time step $t$, the state of entity $i$ is represented as $S_t^i = \{P_t^i, H_t^i, bo_t^i, bp_t^i\}$. $P_t^i = (x_t^i, y_t^i)$ is
the central coordinate of the entity's bounding box, indicating its 2D position in the image.
$H_t^i = (yaw_t^i, pitch_t^i)$ represents the yaw and pitch of the head pose that are used to capture the visual
attention, as shown in Figure 2(a). $bo_t^i$ indicates the body orientation, and $bp_t^i$ represents the
body pose, which consists of three classes: standing (1), bending (2), and not applicable for
machines (0).


Figure 1. Overall framework.


The head yaw and body orientation are categorized into eight bins, i.e., north (N), south (S),
east (E), west (W), northeast (NE), northwest (NW), southeast (SE), and southwest (SW), as
shown in Figure 2(b-c). In addition, the head pitch is categorized into three bins, i.e., up (U),
horizontal (H), and down (D), as shown in Figure 2(d). Such a representation allows the
incorporation of uncertainty in orientation estimation and is suitable for construction sites, where
accurate estimation is extremely challenging. For equipment (considered as rigid objects), the
body orientation is identical to the head yaw (Figure 2(e)) and the head pitch always remains
horizontal (Figure 2(f)). Note that in this study, equipment is simplified as rigid object, while
future study can create a more sophisticated model that considers different components of
equipment to differentiate head pose and body orientation, and incorporate machine body pose.
Figure 2(g) illustrates an example image of the cooperation of a worker and a bulldozer at
time $t$. The spatial and attentional state of worker $i$ at time $t$ is denoted as $S_t^i = \{(30, 160), (1, 1), 1, 1\}$,
indicating the worker is located at (30, 160) with head and body facing east, looking
horizontally, and standing still. The state of equipment $j$ at time $t$ is denoted as
$S_t^j = \{(450, 150), (6, 1), 6, 0\}$, indicating it is located at (450, 150) and facing southwest.

Positional and Attentional Cues Modeling


This study models a series of positional and attentional cues by considering the spatial and
attentional relationship between two entities, which can serve as critical features for the working
group identification.

Positional Cues
Positional cues are computed using the trajectory information of two entities. A total of five
measures for positional cues are formulated.
Distance relationship: The distance relationship measures the proximity of two entities $i$ and
$j$, denoted as $P1^{i,j}$. Different from previous studies (Chamveha et al. 2013), in which the relative
distance is adopted to represent the spatial relation, this study uses the scale-invariant
topological relationship to generalize the approach to different scenarios. The 9-intersection
model (Egenhofer and Herring 1990) is used to represent the topological relation, as shown in
Figure 3(a), where two rectangular regions represent two entities. Topological relations connected
by a solid line are topological neighborhoods that can convert to each other through spatial
transformation. The change of topological relation over time reveals the evolution of the distance


relationship between two entities. The topological distance is adopted for the numerical
representation; readers are referred to Nabil et al. (1996) for more detail.

Figure 2. State representation (a. demonstration of head pose, b. head yaw, c. body
orientation, d. head pitch, e. head yaw and body orientation of equipment, f. head pitch of
equipment, g. example of state representation).
Directional relationship: The directional relation measures the direction of entity $j$ with
respect to entity $i$, denoted as $P2^{i,j}$. A projection-based model (Isli et al. 2001) is used to
represent the directional relation, as shown in Figure 3(b). The region that the vector
$r^{i,j} = (x^j - x^i, y^j - y^i)$ falls in indicates the directional relation between $i$ and $j$. Such a
representation can absorb the uncertainty in position estimation compared with numerical angles.
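A simple way to bin the relative vector $r^{i,j}$ into one of eight directions is sketched below. The bin numbering here (0 = east, counterclockwise) is our own convention and need not match the coding of Figure 3(b).

```python
import math

def directional_bin(pi, pj, n_bins=8):
    """Bin index (0..n_bins-1) of entity j's direction relative to entity i,
    from the relative vector r = (x_j - x_i, y_j - y_i)."""
    dx, dy = pj[0] - pi[0], pj[1] - pi[1]
    angle = math.atan2(dy, dx) % (2 * math.pi)  # 0 = east, counterclockwise
    width = 2 * math.pi / n_bins
    # Shift by half a bin so each bin is centered on its compass direction:
    return int(((angle + width / 2) % (2 * math.pi)) // width)

print(directional_bin((0, 0), (10, 0)))  # 0: due east
print(directional_bin((0, 0), (0, 10)))  # 2: "north" if the y axis points up
```

Note that in image coordinates the y axis usually points downward, so the north/south bins would be mirrored in practice.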

Figure 3. Representation of topological and directional relation.

Differences in speed and in moving direction: The difference in speed ($P3^{i,j}$) and the
difference in moving direction ($P4^{i,j}$) reflect the movement patterns of two entities. The
difference in speed is computed as $P3^{i,j} = \mathrm{abs}(v_t^i - v_t^j)$, where
$v_t^i = (x_{t+1}^i - x_t^i, y_{t+1}^i - y_t^i)$ is the velocity of entity $i$. The difference in moving
direction is computed as $P4^{i,j} = \min\{\mathrm{abs}(\theta^i - \theta^j), 8 - \mathrm{abs}(\theta^i - \theta^j)\}$,
where $\theta^i$ is the moving direction of entity $i$, represented by the numerical values in Figure 3(b).
Difference between moving direction and relative direction: This cue measures the degree
of one entity moving towards the other entity, computed as

P5i , j  min{abs( i   i , j ),8  abs( i   i , j )} , where  i represents the moving direction of entity i,
and  i , j represents the direction of entity j with respect to entity i.

Attentional Cues
According to Chamveha et al. (2013), gaze exchange and joint attention are critical
attentional cues for social group discovery. In addition to their findings, this study considers
more attentional cues captured by the head pitch, body orientation, and body pose, as well as
attentional cues related to equipment. A total of six measures for attentional cues are formulated.
Difference between head pose and relative direction: The difference between the head
yaw of entity $i$ and its directional relation with entity $j$ indicates the gaze-exchange cue that
measures the degree of entity $i$ looking at entity $j$, computed as
$A1^{i,j} = \min\{\mathrm{abs}(yaw^i - \theta^{i,j}), 8 - \mathrm{abs}(yaw^i - \theta^{i,j})\}$.
Difference in head pose: The difference in head yaw between entities $i$ and $j$ indicates the
joint-attention cue that measures the degree to which they look at a common direction/object,
denoted as $A2^{i,j} = \min\{\mathrm{abs}(yaw^i - yaw^j), 8 - \mathrm{abs}(yaw^i - yaw^j)\}$.
Difference between head yaw and moving direction, and between head yaw and body
orientation: The difference between head yaw and body orientation (mostly for a static entity)
and between head yaw and moving direction (mostly for a moving entity) are strong cues that
reflect the change of visual attention. These two cues are computed as
$A3^i = \min\{\mathrm{abs}(yaw^i - bo^i), 8 - \mathrm{abs}(yaw^i - bo^i)\}$ and
$A4^i = \min\{\mathrm{abs}(yaw^i - \theta^i), 8 - \mathrm{abs}(yaw^i - \theta^i)\}$.
Head pitch and body pose: The head pitch, $A5^i = pitch^i$, and the body pose, $A6^i = bp^i$, provide
strong hints regarding the working state of the entity, which can reflect its attention.

LSTM-based Classification
The task of working group identification is formulated as a classification problem. The input
features are modeled based on the abovementioned positional and attentional cues. The output
consists of two classes, i.e., “is a group” (1), “is not a group” (0). An LSTM network is adopted
to learn the temporal dependency between the time-variant features.

Time-sequential features
Given a construction scene, the positional and attentional cues are normalized and
concatenated into a 17-dimensional feature vector that describes the relationship between any
two entities at a given time, denoted as
$f_t = (P1_t^{i,j}, P2_t^{i,j}, P3_t^{i,j}, P4_t^{i,j}, P5_t^{i,j}, P5_t^{j,i}, A1_t^{i,j}, A1_t^{j,i}, A2_t^{i,j}, A3_t^i, A3_t^j, A4_t^i, A4_t^j, A5_t^i, A5_t^j, A6_t^i, A6_t^j)$. In this study, the
time-sequential feature constructed from a series of time-variant feature vectors is used as the
input, denoted as $\{f_t, f_{t+\Delta t}, f_{t+2\Delta t}, ..., f_{t+T}\}$, where $t$ is the starting time, $\Delta t$ is the sampling
interval, and $T$ is the time duration.
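Assembling this input can be sketched as follows. The cue function here is a toy stand-in for the normalized 17-D cue computation described above.

```python
import numpy as np

def sequence_features(cue_fn, t0, T, dt):
    """Stack the 17-D cue vectors f_t into the time-sequential input
    {f_t, f_{t+dt}, ..., f_{t+T}}. `cue_fn(t)` is assumed to return
    the normalized 17-D cue vector for one entity pair at time t."""
    times = np.arange(t0, t0 + T + dt, dt)
    return np.stack([cue_fn(t) for t in times])  # shape: (n_steps, 17)

# Toy cue function for illustration only:
seq = sequence_features(lambda t: np.full(17, t, dtype=float), t0=0, T=4, dt=1)
print(seq.shape)  # (5, 17)
```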

LSTM models
An LSTM network (Hochreiter and Schmidhuber 1997) is used to model the temporal
dependency among features. It works in such a way that, during a time period of $T$, the input of the


LSTM network is a sequence of feature vectors, $\{f_t, f_{t+\Delta t}, f_{t+2\Delta t}, ..., f_{t+T}\}$, each of which is fed
into an LSTM unit that outputs a state value; the state values of previous time steps affect
those of the following time steps. The output of the final LSTM unit enters a fully connected
layer, followed by a Softmax layer that computes the probability of each class, as shown in
Figure 1.
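To make the recurrence concrete, the following NumPy sketch runs a single forward pass of a one-layer LSTM followed by a fully connected layer and softmax, mirroring the architecture just described. The weights are random stand-ins, not the trained model, and the parameter layout is our own.

```python
import numpy as np

def lstm_classify(seq, params):
    """Forward pass: one-layer LSTM over a (T, d_in) sequence, then FC + softmax."""
    W, U, b = params["W"], params["U"], params["b"]  # gate weights (4h, d_in), (4h, h), (4h,)
    h_dim = U.shape[1]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))
    for x in seq:                       # state of each step feeds the next
        z = W @ x + U @ h + b
        i, f, o, g = np.split(z, 4)     # input, forget, output gates and candidate
        i, f, o, g = sig(i), sig(f), sig(o), np.tanh(g)
        c = f * c + i * g               # cell state carries the long-term memory
        h = o * np.tanh(c)
    logits = params["Wfc"] @ h + params["bfc"]      # fully connected layer
    e = np.exp(logits - logits.max())
    return e / e.sum()                  # softmax over {not a group, is a group}

# Random untrained weights: 17 inputs, 17 hidden units, 2 output classes.
rng = np.random.default_rng(0)
params = {"W": rng.normal(size=(68, 17)), "U": rng.normal(size=(68, 17)),
          "b": np.zeros(68), "Wfc": rng.normal(size=(2, 17)), "bfc": np.zeros(2)}
probs = lstm_classify(rng.normal(size=(30, 17)), params)  # a 30-step (15 s at 2 fps) sequence
print(probs)
```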

IMPLEMENTATION AND RESULTS


In this study, construction videos were used as a major data source for working group
identification. This section discusses the implementation of the proposed method and the results.

Data Description
The implementation dataset consists of 14 videos from two sources: videos from a hospital
construction project that are available online (YouTube 2018), and videos taken by the authors
from an ongoing teaching building project on campus. These videos are taken in different
construction scenarios with various construction entities, working groups, and ongoing activities,
and from different viewpoints to evaluate the proposed method in general construction jobsite.
Figure 4 shows some sample images from the dataset. The summary of the dataset is listed in
Table 1. All videos were downsampled to 2fps in this study.

Figure 4. Sample images (a-c from hospital project (YouTube 2018), d-e from teaching
building project).

Table 1. Data summary.


Data source # of videos Duration (s) # of entities # of groups Average group size
YouTube 12 952 66 22 2.5
Purdue campus 2 117 9 4 2

Data Preprocessing
Since the focus of this study is to identify the working groups using higher-level information,
i.e., the features computed based on the extracted states, the entities’ spatial and attentional states
(e.g., location and head pose) in each video frame were manually annotated, which were further
used to compute the positional and attentional cues. State information can also be extracted
automatically using methods from existing studies (Memarzadeh et al., 2013, Raza et al. 2018).

Implementation details
The model was trained in a fully supervised manner. The input sequential features were fed
into a one-layer LSTM network with 17 hidden units, followed by a fully connected layer and a
Softmax layer that computes the probability of each class, as illustrated in Figure 1. In the
training process, Adam was used as the optimization algorithm with a batch size of 40 and the
initial learning rate of 0.001.


Results
The performance of the LSTM-based classification model was evaluated in terms of accuracy,
precision, and recall. Accuracy measures the overall correctness in classifying two classes.
Precision measures the proportion of predicted working groups being true working groups.
Recall measures the proportion of true working groups being correctly recognized.
A series of experiments were conducted. All videos in the dataset described in Table 1 were
trimmed into clips of a fixed length that can start from an arbitrary frame of the original
video. The sequential feature computed from one pair of entities in one video clip was used as
one data sample. A five-fold mechanism was used, with 4,800 and 1,200 samples randomly selected
from the dataset to serve as training and testing data, respectively. Video clips of different
lengths (i.e., 5 s, 10 s, 15 s) were used to assess the impact of the available previous
observations. Furthermore, the performance of integrating positional and attentional cues was
compared with that obtained using positional cues only. Table 2 lists the results. Note that,
although the frame rate of the videos is relatively low in this study, no important information
is lost, as the videos were recorded far away from the site with a large field of view (see
Figure 4), which is also compatible with commonly used surveillance videos.

Table 2. Results of working group identification


Feature Previous observation Accuracy Precision Recall
5s (10 frames) 0.951 0.899 0.894
position + attention 10s (20 frames) 0.958 0.907 0.920
15s (30 frames) 0.972 0.949 0.933
5s (10 frames) 0.842 0.666 0.698
position only 10s (20 frames) 0.864 0.736 0.680
15s (30 frames) 0.886 0.778 0.737

As shown in Table 2, the proposed method achieves over 95% accuracy in all three scenarios.
Furthermore, the performance is largely improved when longer previous observations are
available. This is reasonable: the longer the available previous observations, the better the
dynamic interactions among entities can be revealed, leading to both higher precision and recall.
However, on the construction site, longer previous observations require more accurate and
reliable detection of individual entities' real-time states, which is not always available.
Therefore, the duration of previous observation should be carefully selected by considering both
the target accuracy and the available sensing technologies. In addition, the proposed method
achieves much higher accuracy, precision, and recall than the results obtained using positional
cues alone. This demonstrates that attentional cues provide valuable information for
understanding the working scenario and should not be ignored.

CONCLUSION
This study devises a novel machine learning approach that leverages both positional and
attentional cues to automatically identify working groups on the construction site. Methods have
been created to represent the spatial and attentional states of individual entities and to compute
the positional and attentional cues between two entities. An LSTM network is used to model the
temporal dynamics of the identified cues and to classify whether two entities belong to one group.
The proposed method is validated using construction videos from different projects, achieving over
95% accuracy, much higher than that obtained using positional cues only.


There remain some limitations that deserve further research efforts. First, the positional and
attentional cues are computed based on manually annotated states of individual entities; future
study is needed to automate the entire process, including state detection and group identification.
Second, this study focuses only on working group identification; future studies will further
recognize the corresponding group activity based on these findings. Third, the state
information and the corresponding positional and attentional cues are represented in 2D; future
study is needed to extend them into 3D and assess the performance using 3D information.

REFERENCES
Chamveha, I., Sugano, Y., Sato, Y., & Sugimoto, A. (2013, September). Social Group Discovery
from Surveillance Videos: A Data-Driven Approach with Attention-Based Cues. In BMVC.
Egenhofer, M. J., & Herring, J. (1990). Categorizing binary topological relations between
regions, lines, and points in geographic databases. Technical Report, Department of Surveying
Engineering, Univ. of Maine, Orono, ME.
Golparvar-Fard, M., Heydarian, A., & Niebles, J. C. (2013). Vision-based action recognition of
earthmoving equipment using spatio-temporal features and support vector machine
classifiers. Advanced Engineering Informatics, 27(4), 652-663.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8),
1735-1780.
Isli, A., Haarslev, V., & Möller, R. (2001). Combining cardinal direction relations and relative
orientation relations in qualitative spatial reasoning. Univ., Bibliothek des Fachbereichs
Informatik.
Khosrowpour, A., Niebles, J. C., & Golparvar-Fard, M. (2014). Vision-based workface
assessment using depth images for activity analysis of interior construction operations.
Automation in Construction, 48, 74-87.
Kim, H., Bang, S., Jeong, H., Ham, Y., & Kim, H. (2018a). Analyzing context and productivity
of tunnel earthmoving processes using imaging and simulation. Automation in Construction,
92, 188-198.
Kim, J., Chi, S., & Seo, J. (2018b). Interaction analysis for vision-based activity identification of
earthmoving excavators and dump trucks. Automation in Construction, 87, 297-308.
Luo, H., Xiong, C., Fang, W., Love, P. E., Zhang, B., & Ouyang, X. (2018). Convolutional
neural networks: Computer vision-based workforce activity assessment in construction.
Automation in Construction, 94, 282-289.
Memarzadeh, M., Golparvar-Fard, M., & Niebles, J. C. (2013). Automated 2D detection of
construction equipment and workers from site video streams using histograms of oriented
gradients and colors. Automation in Construction, 32, 24-37.
Nabil, M., Ngu, A. H., & Shepherd, J. (1996). Picture similarity retrieval using the 2D projection
interval representation. IEEE Transactions on Knowledge and Data Engineering, 8(4), 533-
539.
Raza, M., Chen, Z., Rehman, S. U., Wang, P., & Bao, P. (2018). Appearance based pedestrians’
head pose and body orientation estimation using deep learning. Neurocomputing, 272, 647-
659.
YouTube, Hospital Construction (2018).
https://www.youtube.com/channel/UCEKwrM78pRv8WRcKvZNtE1w (accessed January 17,
2019).
Zhu, Z., Ren, X., & Chen, Z. (2017). Integrated detection and tracking of workforce and
equipment from construction jobsite videos. Automation in Construction, 81, 161-171.


Neuro Fuzzy Inference Systems for Estimating Normal Concrete Mixture Proportions
Jorge L. Santamaria, Ph.D.¹; Luis Morales²; and Paulina Lima³

¹School of Civil Engineering, Dept. of Engineering, Physical Sciences, and Mathematics, Central Univ. of Ecuador, Quito, Ecuador 170521. E-mail: [email protected]
²School of Civil Engineering, Dept. of Engineering, Physical Sciences, and Mathematics, Central Univ. of Ecuador, Quito, Ecuador 170521. E-mail: [email protected]
³School of Civil Engineering, Dept. of Engineering, Physical Sciences, and Mathematics, Central Univ. of Ecuador, Quito, Ecuador 170521. E-mail: [email protected]

ABSTRACT
Concrete is a very popular material in the construction industry. There are several methods for
determining the quantities of its components, the concrete mixture design method of the American
Concrete Institute (ACI) being the most common. Construction laboratories have conducted
innumerable concrete mixture designs applying different procedures over time, acquiring valuable
information that is not being fully exploited. This study used experimental historical data from
a construction laboratory to develop Sugeno-type fuzzy inference systems (FISs) to estimate
material proportions for fabricating normal concrete. Fuzzy modeling was accomplished using
subtractive clustering and adaptive neuro fuzzy inference system (ANFIS) techniques. A
Sugeno-type FIS was developed for estimating the proportion of each component in a concrete
mixture, namely, the water-cement ratio and the fine and coarse aggregates, with concrete
compressive strength and aggregate properties, including fineness moduli and abrasion resistance,
as the input variables. The resulting fuzzy models were able to estimate concrete constituents
very well, since the computed coefficients of determination (i.e., R-squared values) were greater
than 90% when validating the models. All FISs can be used as mixture design tools to compute
material proportions based on past experience when fabricating concrete on the jobsite. The
proposed framework for concrete mixture design could also be extended to any particular data set,
regardless of concrete component characteristics.

INTRODUCTION
Concrete is a construction material made of water, Portland cement, and aggregates that has
been widely utilized in the construction industry. Concrete components vary according to the
geographic location where a project is built, and so do the corresponding mixture proportions.
Kosmatka et al. (2002) pointed out that proportioning consists of establishing desired concrete
characteristics and choosing the proportions of available materials in order to produce concrete
complying with specifications at the lowest price. There are several methods to estimate the
quantities of each concrete constituent (i.e., concrete mixture design methods), which have
evolved from the arbitrary volumetric method (Abrams 1918) to the weight and absolute-volume
methods (ACI Committee 1991), the methodology established by the American Concrete
Institute (ACI) being the most common. A resulting mixture design only gives the initial
quantities that will be utilized to fabricate the first concrete sample. The final quantities are
derived by performing corrections based on desired fresh and hardened concrete properties, such
as workability and compressive strength, respectively.
All aforementioned procedures for designing and proportioning concrete mixtures tend to


comply with concrete specifications, the concrete compressive strength test being the usual test
performed in construction laboratories to assure that mixture designs satisfy clients'
requirements. Owing to this recurring activity, thousands of concrete mixture designs have
been conducted, causing construction labs to generate large amounts of experimental data that
have been stored in lab files without providing researchers with any information other than the
concrete compressive strength lab reports. Such experimental data constitute a valuable source
of information that is not being used and that can be utilized to estimate new data (i.e., concrete
proportions).
Ross (2017) pointed out that fuzzy set theory is a very efficient tool for understanding complex
systems when no mathematical functions are available. A fuzzy inference system (FIS) provides
a researcher with valuable information for understanding complex systems (Zadeh 1973). This study
uses experimental concrete mixture design data residing in construction laboratories to develop
Sugeno-type fuzzy models to estimate material proportions for fabricating normal concrete.
The main goal of this study is to provide concrete workers and technicians with fuzzy models for
estimating concrete constituents through the use of local experimental data.

METHODOLOGY
A Sugeno-type FIS was developed to estimate each concrete mixture proportion by weight
using experimental data and fuzzy modeling. The adaptive neuro fuzzy inference system
(ANFIS) is a technique that uses experimental input-output data to construct FISs by
developing membership functions and fuzzy rules based on data clustering.
Experimental Data: All experimental data was provided by the Laboratory of Construction
and Testing Materials (LCTM) of the School of Civil Engineering of the Central University of
Ecuador from January 2003 to December 2017. All tests were conducted at the age of 28 days
since compressive strength is expected to be an index of concrete strength at this age (Mehta and
Monteiro 2006). Also, concrete compressive strength was computed by averaging the strength of
three specimens, in accordance with the requirements of the American Concrete Institute (ACI
2014), which state that satisfactory concrete compressive strength should be calculated by
averaging the strength of three samples at 28 days. Such data were tabulated and correspond to
the results of 132 concrete mixture designs for normal concrete with no admixtures. All tests
were conducted at LCTM, and the experimental data consisted of data tuples containing input and
output variables. The properties of concrete constituents including fineness moduli of fine and
coarse aggregates as per ASTM C136 (ASTM International 2014a) and abrasion as per ASTM
C131 (ASTM International 2014b), concrete slump as per ASTM C143 (ASTM International
2015) and concrete compressive strength (f’c) as per ASTM C39 (ASTM International 2016)
were the selected input variables while water cement ratio ( w – c ), fine aggregate ( FA ) and
coarse aggregate ( CA ) were the output variables (i.e., concrete mixture proportions). All data
provided by LCTM is tabulated in Table 1.
Fuzzy Modeling: System identification, an approach for developing fuzzy models, was the
method selected for creating Sugeno-type FISs as described by Sugeno and Yasukawa (1993).
It consists of two phases: (1) structure identification and (2) parameter identification. In the
first phase, partitions of the data points, the if-then rules, and the number of rules are defined
through the subtractive clustering method (Chiu 1994); in the second phase, the parameters of
the model are adjusted to minimize output errors using an adaptive neuro-fuzzy inference
system (Jang 1993; Jang et al. 1997).

© ASCE
Computing in Civil Engineering 2019 45

Table 1. Experimental Data
Columns: No. | Compressive Strength (MPa) | Fineness Modulus, Fine Aggregate | Fineness
Modulus, Coarse Aggregate | Abrasion (%) | Slump (cm) | w–c | FA | CA. The first five are the
input variables; w–c, FA, and CA are the output mixture proportions by weight.
1 23.0 2.0 7.1 53.6 6.0 0.56 1.9 3.1
2 21.1 2.8 6.6 29.4 6.0 0.56 1.4 3.6
3 24.9 2.3 6.6 21.4 7.5 0.56 1.9 2.9
4 21.3 2.0 7.2 52.4 6.5 0.52 1.7 3.1
5 17.2 2.9 5.7 58.7 8.5 0.56 1.5 3.4
6 24.0 2.7 7.3 34.0 7.5 0.58 1.9 3.3
7 25.0 3.3 7.3 29.0 7.5 0.52 1.5 3.1
8 22.0 3.4 7.4 22.3 6.0 0.58 2.1 3.7
9 20.9 2.6 7.1 48.2 7.0 0.57 2.0 3.0
10 23.4 3.2 6.6 48.7 7.0 0.57 1.5 3.0
11 26.7 3.1 7.7 37.6 5.0 0.56 2.4 3.2
12 27.2 2.9 6.2 51.3 8.0 0.51 1.3 3.2
13 24.9 1.3 7.4 40.8 8.0 0.58 1.7 3.8
14 22.9 2.4 7.5 26.3 6.5 0.57 1.9 3.7
15 19.9 3.0 7.5 34.1 8.0 0.58 1.6 3.4
16 21.6 3.1 6.0 59.8 6.0 0.56 1.9 3.0
17 26.7 2.6 6.2 18.3 8.0 0.54 1.9 3.1
18 21.6 2.5 7.3 19.9 5.0 0.56 1.7 3.7
19 25.5 2.4 6.9 19.1 7.0 0.56 2.0 2.8
20 28.3 2.4 6.9 19.1 6.0 0.52 1.8 2.6
21 21.4 2.5 6.7 38.4 6.0 0.56 2.0 2.8
22 24.5 3.2 7.5 23.0 6.5 0.56 2.0 3.5
23 21.9 3.2 7.4 43.9 8.0 0.60 2.4 2.8
24 24.0 3.2 7.4 43.9 8.5 0.58 2.1 2.5
25 25.1 3.2 7.4 43.9 8.0 0.56 2.9 2.3
26 22.8 3.1 6.9 28.1 6.5 0.57 2.5 2.7
27 19.1 3.1 6.9 28.1 6.5 0.60 2.6 2.8
28 25.6 1.4 7.6 29.6 7.0 0.56 1.9 3.6
29 27.3 3.2 7.1 45.9 4.0 0.55 1.6 3.4
30 24.3 3.3 7.2 56.1 10.0 0.56 1.7 3.6
31 22.8 2.7 7.4 44.0 5.0 0.51 2.2 3.6
32 24.6 2.7 7.4 44.0 7.0 0.58 2.0 3.3
33 26.8 1.5 6.4 38.4 6.0 0.50 1.8 2.4
34 24.3 1.5 6.4 38.4 6.0 0.54 2.0 2.6
35 20.3 1.5 6.4 38.4 8.0 0.59 2.3 2.9
36 22.5 3.1 7.5 46.7 7.0 0.58 1.9 3.2
37 22.0 3.9 7.3 29.1 7.5 0.54 1.9 3.2
38 19.2 3.5 7.2 55.2 5.0 0.60 1.9 3.8
39 21.1 1.7 7.3 26.4 7.5 0.58 1.8 3.8
40 26.6 1.7 7.5 23.7 8.0 0.46 1.8 2.8
41 18.8 3.1 7.4 28.2 5.0 0.60 2.1 3.7
42 21.9 3.1 7.4 28.2 5.0 0.58 1.9 3.3
43 25.8 3.1 7.4 28.2 6.5 0.56 1.8 3.1
44 21.1 3.3 6.7 36.0 7.0 0.58 1.8 2.7
45 26.4 3.2 7.3 55.1 7.0 0.56 1.9 3.4
46 24.0 2.9 7.3 50.4 7.0 0.58 2.0 3.3
47 28.5 3.7 7.5 30.4 10.5 0.52 1.8 3.2
48 24.9 3.7 7.5 30.4 7.5 0.56 1.9 3.3
49 23.9 2.3 6.7 22.5 5.0 0.54 1.3 4.1
50 27.7 2.3 6.7 22.5 5.0 0.50 1.3 3.9
51 28.3 3.7 7.1 28.9 8.0 0.50 1.7 3.4
52 24.4 3.7 7.1 28.9 8.0 0.54 1.7 3.6
53 19.1 3.7 7.1 28.9 8.0 0.60 1.8 3.9
54 21.3 3.2 7.2 49.9 8.0 0.58 1.9 3.2
55 24.3 3.1 7.6 27.5 7.0 0.56 1.9 3.2
56 22.8 3.4 5.9 61.5 7.0 0.58 1.7 3.4

57 22.0 3.2 6.0 23.6 7.0 0.58 1.7 2.8
58 28.7 2.7 7.5 25.0 5.0 0.54 1.9 3.1
59 24.5 3.3 6.0 59.4 7.0 0.56 2.0 3.0
60 21.1 2.7 7.0 27.9 10.5 0.58 1.8 3.4
61 22.6 2.7 6.8 62.0 11.5 0.58 1.7 3.5
62 16.7 2.6 6.6 26.1 13.0 0.54 2.4 3.6
63 20.1 2.6 6.6 26.1 12.0 0.63 2.3 3.3
64 22.9 2.6 6.6 26.1 11.0 0.58 2.1 3.2
65 26.9 2.6 6.6 26.1 10.0 0.57 2.0 3.1
66 27.5 2.6 6.6 26.1 10.0 0.56 2.0 3.1
67 31.2 2.6 6.6 26.1 10.0 0.54 1.9 2.9
68 36.7 2.6 6.6 26.1 11.0 0.47 1.8 2.7
69 21.7 2.6 6.3 62.3 7.5 0.58 2.0 3.4
70 22.8 2.9 7.0 40.3 7.5 0.56 2.1 3.5
71 25.3 2.5 7.4 20.0 13.5 0.58 2.0 3.3
72 20.7 3.1 6.5 64.9 8.5 0.60 2.4 3.0
73 24.2 2.7 7.4 57.7 6.0 0.55 1.5 3.0
74 21.4 2.7 7.4 57.7 7.0 0.58 1.6 3.2
75 26.7 3.3 7.4 27.8 7.0 0.55 1.9 3.3
76 29.3 2.4 6.9 51.2 14.0 0.55 1.7 3.7
77 36.1 2.7 7.3 24.9 6.0 0.50 1.8 2.5
78 30.5 2.7 7.3 24.9 6.0 0.57 2.1 2.9
79 22.7 3.2 7.0 45.6 8.0 0.58 1.9 2.9
80 24.8 3.1 7.1 25.4 7.5 0.57 1.9 2.8
81 25.7 1.9 6.9 24.8 6.0 0.56 2.0 2.8
82 22.9 2.7 6.3 53.9 8.0 0.58 1.9 2.8
83 22.1 2.0 6.4 62.3 10.0 0.58 1.8 2.6
84 24.8 2.0 6.8 42.2 7.0 0.58 1.8 2.6
85 29.5 2.5 7.6 26.8 4.0 0.50 2.0 2.9
86 21.7 2.5 7.9 24.1 7.5 0.57 2.4 3.6
87 22.2 3.0 6.8 56.9 7.5 0.58 1.9 2.7
88 22.1 2.6 7.2 57.1 7.5 0.58 2.0 2.8
89 23.0 3.2 7.3 23.7 8.5 0.58 2.0 2.8
90 23.7 3.1 7.5 21.6 7.0 0.58 1.9 2.9
91 21.7 2.8 7.1 47.7 6.5 0.58 1.9 2.7
92 21.7 3.7 7.8 21.0 5.0 0.58 2.2 2.6
93 24.3 3.2 7.8 21.0 5.0 0.58 1.9 2.6
94 31.9 3.2 6.9 17.5 7.0 0.55 1.5 2.1
95 29.1 3.7 6.9 17.5 7.0 0.58 1.6 2.0
96 24.0 1.9 6.4 46.7 8.0 0.58 1.9 2.7
97 23.3 2.8 7.3 29.3 11.0 0.58 2.0 2.7
98 22.9 2.2 7.5 22.6 8.0 0.58 3.2 1.6
99 22.6 1.8 6.8 48.3 9.0 0.56 1.4 2.2
100 35.9 2.7 6.1 49.7 1.0 0.51 1.9 2.2
101 37.1 2.7 7.1 23.8 11.0 0.51 1.7 2.6
102 29.4 2.7 7.1 23.8 12.0 0.56 1.9 2.9
103 26.3 2.7 7.1 23.8 11.5 0.58 2.1 3.2
104 23.1 2.7 7.1 23.8 11.0 0.60 2.2 3.3
105 24.0 3.1 7.0 26.0 7.0 0.58 2.0 3.0
106 24.6 2.7 6.4 38.7 9.0 0.58 2.0 2.6
107 27.2 2.5 7.1 21.1 9.0 0.55 1.6 2.5
108 24.3 2.5 7.3 25.2 7.0 0.58 2.1 3.2
109 27.4 3.1 7.0 27.0 10.0 0.55 1.4 2.0
110 30.4 2.8 7.1 26.5 8.0 0.50 1.7 2.6
111 23.3 3.2 7.3 25.6 5.0 0.58 2.1 3.1
112 25.8 3.2 7.3 25.4 6.0 0.55 1.9 2.9
113 23.7 2.5 7.1 21.7 10.0 0.58 1.5 2.3

114 23.9 2.7 6.6 46.1 8.0 0.58 2.1 3.2
115 22.7 3.0 7.6 26.2 8.0 0.58 2.1 3.0
116 22.1 3.1 6.8 26.8 6.5 0.58 2.0 2.8
117 25.6 2.8 7.4 27.7 6.0 0.55 2.0 3.0
118 22.4 2.9 6.8 29.7 9.0 0.58 2.0 3.1
119 23.5 2.0 6.7 17.5 8.0 0.58 2.2 3.3
120 22.7 3.2 7.5 49.7 7.0 0.58 2.1 3.1
121 30.6 2.8 7.2 25.7 6.0 0.50 1.5 2.0
122 23.6 2.9 7.5 21.0 11.0 0.58 2.0 2.5
123 33.4 2.9 7.0 26.8 8.0 0.48 1.5 2.0
124 21.3 3.5 6.7 22.2 8.0 0.60 2.0 2.6
125 23.5 3.8 6.7 40.0 3.0 0.57 1.3 1.8
126 25.1 2.7 7.4 43.2 6.0 0.58 2.3 2.6
127 21.8 2.8 7.1 46.4 7.5 0.60 3.0 3.4
128 24.1 3.5 7.1 32.7 11.0 0.58 2.1 3.7
129 25.2 2.5 6.8 40.6 8.0 0.58 2.1 2.8
130 24.6 2.5 6.8 17.9 7.0 0.59 2.5 3.9
131 24.1 2.6 7.3 18.4 7.0 0.58 2.3 4.4
132 22.1 3.2 7.4 35.2 6.0 0.58 2.0 2.9

Subtractive Clustering: This method, presented by Chiu (1994), constitutes the first phase
of the fuzzy modeling process. It is a fast clustering method because it does not require
iterative nonlinear optimization. The technique considers every data point, rather than grid
points, as a potential cluster center. A potential value is computed for each data point, and the
point with the highest value becomes a cluster center (c_i). The potentials of all remaining data
points are then recalculated with respect to the previous cluster center to identify the next
center, and the process continues until all cluster centers are identified. Once clustering is
finished, the number of cluster centers determines the number of membership functions (MFs)
as well as the number of if-then rules, and the parameters of each Gaussian MF are defined by
its cluster center and the corresponding width (σ_i).
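The potential-based selection described above can be sketched as follows. This is an illustrative, simplified form of Chiu's method (the accept-ratio "gray zone" test is omitted, and the parameter names merely mirror common fuzzy-toolbox options), not the authors' implementation:

```python
import numpy as np

def subtractive_clustering(X, ra=0.35, squash=1.05, reject=0.15):
    """Simplified Chiu (1994) subtractive clustering on normalized data X (n x d)."""
    alpha = 4.0 / ra ** 2            # weight used when computing potentials
    beta = 4.0 / (squash * ra) ** 2  # weight used when revising potentials
    # Potential of each point: Gaussian-weighted closeness to all other points
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    P = np.exp(-alpha * d2).sum(axis=1)
    p_ref = P.max()                  # potential of the first (reference) center
    centers = []
    while P.max() > reject * p_ref:
        k = int(np.argmax(P))        # highest-potential point becomes center c_i
        centers.append(X[k].copy())
        # Subtract the new center's influence so nearby points lose potential
        P = P - P[k] * np.exp(-beta * ((X - X[k]) ** 2).sum(axis=1))
    return np.array(centers)
```

Because each selected point's own potential drops to zero after subtraction, the loop visits each point at most once and terminates once every remaining potential falls below the reject threshold.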
Adaptive Neuro-Fuzzy Inference Systems (ANFIS): ANFIS (Jang 1993; Jang et al. 1997)
generates a Sugeno-type FIS and takes advantage of artificial neural networks (ANNs) by
allowing fuzzy systems to learn. Past knowledge of a system (i.e., experimental data) is used to
tune the MFs and create the if-then rules of a fuzzy model; only Gaussian MFs can be used by
ANFIS in the parameter identification process. Figure 1 illustrates the ANFIS architecture used
to generate a Sugeno-type FIS. In Layer 1, a degree of membership is calculated for each input
using its corresponding MF through Equation 1.
μ(x_i) = exp( −(1/2) · ((x_i − c_i) / σ_i)² )    (1)
The second layer (Layer 2) is a product layer (Π) where the firing strength (w_i) of each rule is
obtained by multiplying all degrees of membership arriving at the node. In Layer 3, each w_i is
normalized by dividing it by the sum of all firing strengths. The consequent parameters
(p_i, q_i, and r_i) are computed in Layer 4 using the linear least-squares method. Finally, the
weighted-average defuzzification method is used to compute a single output (i.e., w–c, FA, or
CA). The resulting FIS developed through ANFIS is a Sugeno-type FIS, having the same number of
MFs and if-then rules, and each rule carries the same weight.

Figure 1. ANFIS Mechanism.
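Putting the layers together, the forward pass of such a Sugeno-type FIS can be sketched as follows. This is illustrative Python: the function and parameter names are ours, and any values supplied to it are placeholders rather than the trained models' parameters:

```python
import numpy as np

def gaussian_mf(x, c, sigma):
    """Degree of membership for input x (Equation 1)."""
    return np.exp(-0.5 * ((x - c) / sigma) ** 2)

def sugeno_predict(x, centers, sigmas, coeffs, biases):
    """First-order Sugeno FIS with one Gaussian MF per input per rule.

    x:       input vector, shape (d,)
    centers: rule MF centers, shape (R, d)
    sigmas:  rule MF widths, shape (R, d)
    coeffs:  consequent coefficients (p_i, q_i, ...), shape (R, d)
    biases:  consequent constants r_i, shape (R,)
    """
    # Layers 1-2: firing strength of each rule = product of memberships
    w = np.prod(gaussian_mf(x, centers, sigmas), axis=1)  # shape (R,)
    # Layer 4: linear consequent of each rule
    y = coeffs @ x + biases                               # shape (R,)
    # Layers 3 and 5: normalize and take the weighted average
    return float(np.sum(w * y) / np.sum(w))
```

With a single rule the output reduces to that rule's consequent; with several rules the output interpolates between consequents according to the normalized firing strengths.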


RESULTS
Three Sugeno-type fuzzy models (SFISs) were developed to perform a normal concrete
mixture design (one SFIS each for w–c, FA, and CA), since an SFIS can have only one output.
Each fuzzy model had the same input variables, namely concrete compressive strength, the
fineness moduli of the fine and coarse aggregates, abrasion, and slump. The subtractive
clustering parameters were a range of influence (r_a) of 0.35, a squash factor of 1.05, an accept
ratio of 0.50, and a reject ratio of 0.15, which identified 47 cluster centers (i.e., fuzzy rules) for
the w–c and FA models and 35 fuzzy rules for the CA model. Table 2 compiles the ranges of the
input and output variables for each fuzzy model.

Table 2. Input-Output Data Ranges
All three SFISs share the same input ranges: compressive strength 16.7–37.1 MPa, fine
aggregate fineness modulus 1.3–3.9, coarse aggregate fineness modulus 5.7–7.9, abrasion
17.5–64.9%, and slump 1.0–14.0 cm. The output ranges (proportions by weight) are
w–c 0.46–0.63, FA 1.26–3.20, and CA 1.60–4.40.

Model Validation: Tesfamariam and Najjaran (2007) suggested that training data should not
be used for model validation; instead, new data (checking data) should be used to test model
performance by plotting predicted versus experimental values. Kostić and Vasović (2015)
argued that statistical parameters, including the coefficient of determination (R²) and the root
mean square error (RMSE), show how well a model estimates new data. Six concrete mixture
designs were performed for 18, 21, 24, 28, 31, and 36 MPa using materials with different
properties. Six standard concrete cylinders were fabricated for each mixture and tested per
ASTM C39 (ASTM International 2016) on day 28, and the average of the six samples was used
for plotting and computing the statistics. The statistical results indicate that R² is greater than
90%, suggesting that the developed Sugeno-type FISs can estimate new data satisfactorily.
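The two validation statistics can be computed as follows; this is a generic sketch of the standard definitions, not a reproduction of the authors' plotting procedure:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)             # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)      # total sum of squares
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error between experimental and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```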

CONCLUSIONS
The results indicated that all Sugeno-type FISs perform well when estimating concrete
mixture proportions for a specified concrete compressive strength (f'c), since the coefficient of
determination (R²) exceeds 90% when predicted f'c is plotted against experimental f'c.
Verification tests on standard cylindrical samples fabricated using the fuzzy models showed
that past experience (i.e., previous knowledge) can be used to fabricate concrete from local
material characteristics. A construction technician in another geographic location could use
datasets from that location's construction laboratories to generate his or her own fuzzy models,
taking advantage of such valuable information.
Moreover, the framework described in this paper contributes to the body of knowledge by
illustrating how past experience (i.e., data stored in construction laboratories) can be used to
develop Sugeno-type FISs for estimating concrete mixture proportions. The procedure accounts
for the specific characteristics of the concrete components and can be extended to any dataset
containing mixture designs; in this way, previous knowledge can be used to generate new data.
In addition, the fuzzy models serve as proportioning tools for concrete technicians and provide
researchers with additional knowledge of the system through their if-then rules, since they are
not "black boxes" as ANNs are.

REFERENCES
Abrams, D. A. (1918). Design of Concrete Mixtures, Lewis Institute, Structural Materials
Research Laboratory, Bulletin No. 1, Portland Cement Association, Chicago, IL.
ACI (American Concrete Institute). (2014). “Building code requirements for structural concrete.”
ACI 318-14, Farmington Hills, MI.
ACI Committee 211. (1991). “Standard Practice for Selecting Proportions for Normal,
Heavyweight and Mass Concrete.” ACI 211.1-91, Farmington Hills, MI.
ASTM International. (2014a). "Standard test method for sieve analysis of fine and coarse
aggregates." ASTM C136/C136M − 14, West Conshohocken, PA.
ASTM International. (2014b). " Standard test method for resistance to degradation of small-size
coarse aggregate by abrasion and impact in the Los Angeles machine." C131/C131M − 14,
West Conshohocken, PA.
ASTM International. (2015). "Standard test method for slump of hydraulic-cement concrete."
ASTM C143 / C143M, West Conshohocken, PA.
ASTM International. (2016). "Standard test method for compressive strength of cylindrical
concrete specimens." C39/C39M − 16b, West Conshohocken, PA.
Chiu, S. L. (1994). “Fuzzy model identification based on cluster estimation.” J. Intell. Fuzzy
Syst., 2, 267-278.
Jang, J.-S. R. (1993). “ANFIS: Adaptive-network-based fuzzy inference system.” IEEE Trans.
Syst., Man, Cybern., 23(3), 665-685.
Jang, J.-S. R., Sun, C., and Mizutani, E. (1997). Neuro-fuzzy and soft computing: a
computational approach to learning and machine intelligence. Prentice Hall, Upper Saddle
River, NJ.
Kosmatka, S. H., Kerkhoff, B., and Panarese, W. C. (2002). Design and control of concrete
mixtures (14th ed.). Portland Cement Association, Skokie, IL.
Kostić, S., and Vasović, D. (2015). "Prediction model for compressive strength of basic concrete
mixture using artificial neural networks." Neural Comp. Appl., 26(5), 1005-1024.
Mehta, P. K., and Monteiro, P. J. M. (2006). Concrete: Microstructure, properties, and materials
(3rd ed.). McGraw-Hill, New York.
Ross, T. J. (2017). Fuzzy logic with engineering applications, John Wiley & Sons, Ltd.,
Chichester, U.K.
Sugeno, M., and Yasukawa, T. (1993). "A fuzzy-logic-based approach to qualitative modeling."
IEEE Trans. Fuzzy Syst., 1(1), 7-31.
Tesfamariam, S., and Najjaran, H. (2007). "Adaptive network–fuzzy inferencing to estimate
concrete strength using mix design." J. Mater. Civil Eng., 10.1061/(ASCE)0899-
1561(2007)19:7(550), 550-560.
Zadeh, L. A. (1973). "Outline of a New Approach to the Analysis of Complex Systems and
Decision Processes." IEEE Trans. Syst., Man, Cybern., 3(1), 28-44.


Evaluation of Machine Learning Algorithms for Worker's Motion Recognition Using
Motion Sensors

Kinam Kim1; Jingdao Chen2; and Yong K. Cho, Ph.D., M.ASCE3

1School of Civil and Environmental Engineering, Georgia Institute of Technology, 790 Atlantic
Dr., Atlanta, GA 30332-0355. E-mail: [email protected]
2Institute for Robotics and Intelligent Machines, Georgia Institute of Technology, 790 Atlantic
Dr., Atlanta, GA 30332-0355. E-mail: [email protected]
3School of Civil and Environmental Engineering, Georgia Institute of Technology, 790 Atlantic
Dr., Atlanta, GA 30332-0355. E-mail: [email protected]

ABSTRACT
Construction tasks involve various activities composed of one or more body motions. It is
essential to understand the dynamically changing behavior and state of construction workers to
manage construction workers effectively with regards to their safety and productivity. While
several research efforts have shown promising results in activity recognition, further research is
still necessary to identify the best locations of motion sensors on a worker’s body by analyzing
the recognition results for improving the performance and reducing the implementation cost.
This study proposes a simulation-based evaluation of multiple motion sensors attached to
workers performing typical construction tasks. A set of 17 inertial measurement unit (IMU)
sensors is utilized to collect motion sensor data from an entire body. Multiple machine learning
algorithms are utilized to classify the motions of the workers by simulating several scenarios
with different combinations and features of the sensors. Through the simulations, each IMU
sensor placed at different locations on the body is tested to evaluate its recognition accuracy for
the worker's different activity types. The effectiveness of each sensor location is then measured
with regard to activity-recognition performance to determine its relative advantage. Based on
the results, the required number of sensors can be reduced while maintaining recognition
performance. The findings of this study can contribute to the practical implementation of
activity recognition using simple motion sensors to enhance the safety and productivity of
individual workers.

INTRODUCTION
Monitoring the behavior and working state of construction workers is a challenging task due
to the dynamic nature of construction projects. Since construction tasks involve various physical
activities consisting of one or more body motions, understanding the ever-changing activities
and motions of workers is necessary to manage them effectively and improve safety and
productivity. Construction work generally requires strenuous and repetitive physical activity,
which creates a strong need to understand each worker's activities and motions in order to
ensure and improve the safety, health, and productivity of individual workers.
To recognize motions of the construction workers, machine learning algorithms have been
utilized to classify motion sensor data. Several research efforts have shown promising action
recognition performance using the motion sensor data. However, there is a lack of understanding
about the appropriate locations of motion sensors on a worker's body for better recognizing
motions and classifying activities. Motion sensor data, such as acceleration and angular
velocity, vary widely depending on the sensor's location on the body. Thus, understanding the
location of each sensor and its impact on motion recognition can improve recognition performance
and reduce the implementation cost. Hence, this study proposes a simulation-based evaluation of
motion sensor locations on a construction worker’s body. Seventeen inertial measurement unit
(IMU) sensors are used to capture the motions from the entire body while the worker is
performing typical construction tasks. Five machine learning algorithms are utilized to classify
the motions. By simulating several scenarios with different combinations and features of the
sensors, each IMU sensor located on different parts of the body is evaluated, and the locations
with higher recognition rates are determined.

LITERATURE REVIEW
Two approaches have been widely utilized to recognize human’s motions and activities,
which are 1) image-based approach and 2) sensor-based approach. While the image-based
approach collects motion data by extracting feature points or a 2D or 3D skeleton model from
images, the sensor-based approach collects motion data from sensors including accelerometer,
gyroscope, and magnetometer which are usually located on body joints.
Several research studies utilize RGB-D cameras, e.g., Kinect, which provide depth
information that can be used to build 2D or 3D skeleton model of a human (Escorcia et al. 2012;
Han and Lee 2013; Michel et al. 2017; Yu et al. 2017). A 2D skeleton model developed from
Kinect images was used to recognize the leading postures of unsafe behaviors (Yu et al. 2017).
Assuming that leading postures play a role as the precursor of unsafe behaviors, the ranges of
joint angles of the leading postures were determined through an experimental study. Multiple
cameras, 3D camcorder, and Kinect were used to generate the 3D skeleton model (Han and Lee
2013). By comparing the skeleton model and motion templates, unsafe behaviors during ladder
climbing were recognized. A machine learning algorithm, e.g., the support vector machine
classifier, was deployed to recognize construction workers’ actions using a 3D skeleton model
generated by Kinect (Escorcia et al. 2012). While RGB-D camera or multiple cameras provide
useful information for motion recognition, several efforts have been made to achieve the same
purpose using a single camera (Kim et al. 2016; Peddi et al. 2009; Yang et al. 2016). Machine
learning classifiers using image-based descriptors were developed to recognize various
construction activities (Yang et al. 2016). Although this study did not show satisfactory
performance for every activity, the study provided an insight into how images from a single
camera can be used for construction activity recognition. However, the image-based approach
still has the limitation that occlusion between objects can result in incomplete object detection,
which may decrease the recognition rate.
For sensor-based approaches, smartphones are widely used because smartphones have
embedded sensors for recognizing motions including accelerometer and gyroscope, and it is a
cost-effective way to collect data (Akhavian and Behzadan 2016; Bayat et al. 2014; Kwapisz et
al. 2011; Nath et al. 2017; Yang 2009). Akhavian and Behzadan (2016) analyzed the
productivity of construction workers through a machine learning classification method, using
data from the accelerometer and gyroscope embedded in a smartphone to measure the time
spent on each activity and analyze productivity. Other studies solely used
acceleration data to train machine learning classifiers for recognizing human activities (Bayat et
al. 2014; Kwapisz et al. 2011; Yang 2009). Although these studies showed promising
classification performance, target activities are limited to daily activities such as running,
dancing, and walking which are not descriptive enough to fully recognize specific construction
tasks. A study by Nath et al. (2017) used two smartphones and embedded accelerometers and
gyroscopes to analyze the construction worker’s ergonomic posture while the worker is
performing screw driving tasks. On the other hand, Valero et al. (2017) used a separately
developed sensor system to identify angular thresholds for detecting motions of construction
workers. The study utilizes multiple sensors so that the data can represent motions in more detail.
Cho et al. (2018) utilized motion sensors to detect the unsafe posture of exoskeleton wearers.
However, there is still a lack of understanding of how to determine the locations and number
of sensors needed to recognize construction workers' motions.

METHODOLOGY
The proposed study follows the procedure shown in Figure 1. Each step of the procedure will
be explained in the following subsections.

Figure 1. Evaluation procedure of sensor locations.


Identify motion candidates
The following motions from typical construction tasks are selected as motion candidates:
standing, bending, squatting, walking, twisting, working overhead, kneeling, and using stairs.
Each of the bending, squatting, and kneeling activities is divided into three classes (such as
bending-up, bending-down, and bending) to reduce the loss of information caused by transitions
of motions. For example, bending-up and bending-down motions indicate transitional motions
from the bending motion to other motions or vice versa. Hence, fourteen motion classes are used
in this study.

Define body nodes


As shown in Figure 2 (a), 21 body joints or body parts are designated as nodes for simulation.
Among the 21 nodes, 4 nodes located on the spine can be tracked by interpolation between the
neck and hip nodes. Thus, the integrated wearable sensor system only requires 17 IMU sensors
(red dots in Figure 2 (b)) to collect data from all the nodes.

Define feature vector


Once the nodes are defined, feature vectors are constructed using data from three types of
sensors (accelerometer, gyroscope, and magnetometer), embedded in each IMU. Each feature
vector is used as input of machine learning classifiers. As shown in Figure 3, each node with a
single IMU generates a feature set composed of 13 values including quaternion (4 values),
acceleration (3 values), velocity (3 values), and angular velocity (3 values). Feature sets from
multiple nodes are concatenated to construct an overall feature vector. The generated feature
vectors are used as training and test data for the machine learning classifiers; each data point
represents a discrete, time-independent motion state.

Figure 2. (a) Body nodes and (b) wearable sensor locations.

Figure 3. Formation of the feature vector.
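Assembling the per-node feature sets into a classifier input might look like the sketch below; the function and variable names are ours for illustration, not from the authors' implementation:

```python
import numpy as np

# Per-node feature layout described above: 13 values per IMU node
# quaternion (4) + acceleration (3) + velocity (3) + angular velocity (3)
NODE_FEATURES = 13

def node_feature(quat, accel, vel, ang_vel):
    """Concatenate one node's readings into its 13-value feature set."""
    return np.concatenate([quat, accel, vel, ang_vel])

def feature_vector(node_readings):
    """Concatenate the selected nodes' feature sets into one input vector.

    node_readings: list of (quat, accel, vel, ang_vel) tuples, one per node.
    """
    return np.concatenate([node_feature(*r) for r in node_readings])
```

The feature vector's dimension therefore scales with the number of selected nodes (e.g., 13 values for a single-node scenario versus 21 × 13 = 273 for the all-nodes scenario), which is how each simulation scenario ends up with a different input dimension.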

Generate simulation scenarios


To identify the impact of the sensor locations on motion recognition performance, simulation
scenarios are generated by considering different combinations of the selected nodes. Table 1
shows the combinations of nodes used in the simulation. Each scenario uses a different
dimension for the feature vector depending on the number of selected nodes.

RESULTS
Collect data and deploy five classifiers
A dataset containing 18,350 data points is collected from a subject performing the
aforementioned motions while handling a 28-lb concrete block (Figure 4). The subject's motions
are simultaneously videotaped to serve as ground truth. Once the dataset is collected, five
machine learning classifiers including logistic regression, k-nearest neighbors, multilayer
perceptron, random forest, and support vector machine classifiers are deployed to recognize the
motions from the dataset. Feature vectors of different dimensions, determined by the simulation
scenarios, are used to train and test the classifiers. Ten-fold cross-validation is implemented to
validate the classification performance, and the hyperparameters of each classifier are tuned
through the same cross-validation process.
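The 10-fold cross-validation used to score each classifier can be sketched as follows. To keep the example self-contained, a toy 1-nearest-neighbor classifier stands in for the five models; the function names are ours:

```python
import numpy as np

def k_fold_accuracy(X, y, fit, predict, k=10, seed=0):
    """Mean accuracy over k cross-validation folds (the paper uses k = 10)."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        accs.append(np.mean(predict(model, X[test]) == y[test]))
    return float(np.mean(accs))

def fit_1nn(X, y):
    """'Training' a 1-nearest-neighbor classifier just stores the data."""
    return (X, y)

def predict_1nn(model, Xq):
    """Label each query by its nearest training point (squared Euclidean)."""
    Xtr, ytr = model
    d2 = ((Xq[:, None, :] - Xtr[None, :, :]) ** 2).sum(axis=2)
    return ytr[np.argmin(d2, axis=1)]
```

Each of the five classifiers would be plugged into `k_fold_accuracy` in place of the 1-NN pair, and hyperparameters would be chosen by repeating the procedure over a parameter grid.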

Table 1. Combinations of nodes used in the simulation


Combination | Selected nodes (number of nodes) | Combination | Selected nodes (number of nodes)
1 All nodes (21) 17 Right Foot (1)
2 Upper body (15) 18 Left Thigh (1)
3 Lower body (7) 19 Left Leg (1)
4 Core nodes* (7) 20 Left Foot (1)
5 Hip and Head (2) 21 Right Shoulder (1)
6 Hip and Neck (2) 22 Right Arm (1)
7 Hip and Spine (5) 23 Right Forearm (1)
8 Head and Neck (2) 24 Right Hand (1)
9 Head and Spine (5) 25 Left Shoulder (1)
10 Neck and Spine (5) 26 Left Arm (1)
11 Hip (1) 27 Left Forearm (1)
12 Head (1) 28 Left Hand (1)
13 Neck (1) 29 Spine 3 – close to Neck (1)
14 Spine (4) 30 Spine 2 (1)
15 Right Thigh (1) 31 Spine 1 (1)
16 Right Leg (1) 32 Spine 0 – close to Hip (1)
*Head, Neck, Spine, and Hip

Figure 4. Bending and squatting motion examples.

Analyze classification performance


As shown in Figure 5, each classifier is evaluated in terms of accuracy which is the
evaluation metric for the suitability of different sensor placement locations. As expected, the
model using all nodes showed the best recognition performance. Among the five classifiers,
the random forest classifier showed the best performance in all cases. It is noteworthy that
models with two nodes located a certain distance apart, such as combinations 5 and 6 in Table
2, showed recognition performance similar to that of the model with all nodes. In the
single-node scenarios, those containing upper-body nodes showed better recognition
performance than the ones containing lower body nodes. While each classifier shows a different
accuracy for the same scenario, varying the node combination for each classifier resulted in
similar trends for accuracy.

Figure 5. Simulation results with 32 node combinations


Table 2. Classification accuracy of Random Forest classifier
Combination | Selected nodes | Accuracy || Combination | Selected nodes | Accuracy
1 All nodes 0.7983 17 Right Foot 0.2791
2 Upper body 0.7729 18 Left Thigh 0.5941
3 Lower body 0.7440 19 Left Leg 0.5973
4 Core nodes 0.7658 20 Left Foot 0.5090
5 Hip and Head 0.7617 21 Right Shoulder 0.6844
6 Hip and Neck 0.7544 22 Right Arm 0.6431
7 Hip and Spine 0.7713 23 Right Forearm 0.6134
8 Head and Neck 0.6811 24 Right Hand 0.6222
9 Head and Spine 0.7606 25 Left Shoulder 0.6762
10 Neck and Spine 0.7635 26 Left Arm 0.6158
11 Hip 0.6849 27 Left Forearm 0.6171
12 Head 0.6893 28 Left Hand 0.6110
13 Neck 0.6871 29 Spine 3 0.7298
14 Spine 0.7639 30 Spine 2 0.7310
15 Right Thigh 0.6032 31 Spine 1 0.7365
16 Right Leg 0.5658 32 Spine 0 0.6833

CONCLUSION
This study investigated the impacts of IMU sensor locations on the motion recognition
performance through a simulation. A set of 17 wearable IMU sensors was used to collect the
motion data from the entire body. Five machine learning classifiers were deployed to evaluate
their recognition performance for different motion sensor locations. Compared with the
best-performing node combination, which includes all of the nodes, combinations of two nodes
located a certain distance apart, such as neck and hip or head and hip, showed similar motion
recognition performance. An important finding of this study is that fewer motion sensors can
match the performance of the fully instrumented configuration if their locations are selected
with an understanding of their location-dependent effectiveness. Thus, a motion sensor-based
system can effectively recognize a worker's various motions with a reduced number of sensors,
decreasing the implementation cost.
Future work will focus on several issues. First, the training dataset needs to be extended with
additional subjects to improve its generality; since human motions vary with individual
working behavior, data from various subjects are essential to better represent the motions of
construction workers. Second, the formation of the feature vector will be further investigated to
achieve better recognition accuracy. Raw IMU data are rarely used as the input of existing
motion recognition systems; instead, statistical features are extracted from the raw data and
used as the input. Such features may affect recognition performance, and the evaluation of
sensor locations may change accordingly.
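As an illustration of the kind of statistical features referred to above, such features are commonly computed over short windows of each sensor channel; the window length and the particular statistics below are illustrative assumptions, not choices made in this study:

```python
import numpy as np

def window_features(channel, win=50):
    """Mean/std/min/max over non-overlapping windows of one sensor channel.

    channel: 1-D array of raw IMU samples; win: samples per window (assumed).
    Returns an (n_windows, 4) feature matrix; trailing samples are dropped.
    """
    n = len(channel) // win
    w = np.asarray(channel[: n * win], dtype=float).reshape(n, win)
    return np.column_stack([w.mean(1), w.std(1), w.min(1), w.max(1)])
```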

REFERENCES
Akhavian, R., and Behzadan, A. H. (2016). “Productivity Analysis of Construction Worker
Activities Using Smartphone Sensors.” Proceedings of 16th International Conference on
Computing in Civil and Building Engineering, 1067–1074.
Bayat, A., Pomplun, M., and Tran, D. A. (2014). “A Study on Human Activity Recognition
Using Accelerometer Data from Smartphones.” Procedia Computer Science, Elsevier, 34,
450–457.
Cho, Y. K., Ma, S., Ueda, J., and Kim, K. (2018). “A Wearable Robotic Exoskeleton for
Construction Worker’s Safety and Health.” 2018 ASCE Construction Research Congress
(CRC), New Orleans, LA, 19–28.
Escorcia, V., Dávila, M. A., Golparvar-Fard, M., and Niebles, J. C. (2012). “Automated Vision-
based Recognition of Construction Worker Actions for Building Interior Construction
Operations Using RGBD Cameras.” Proceedings of Construction Research Congress 2012,
879–888.
Han, S., and Lee, S. (2013). “A vision-based motion capture and recognition framework for
behavior-based safety management.” Automation in Construction, Elsevier, 35, 131–141.
Kim, H., Paik, J., Park, J., and Park, H. (2016). “Motion Estimation-based Human Fall Detection
for Visual Surveillance.” IEIE Transactions on Smart Processing and Computing, 5(5).
Kwapisz, J. R., Weiss, G. M., and Moore, S. A. (2011). “Activity Recognition using Cell Phone
Accelerometers.” Proceedings of ACM SigKDD Explorations Newsletter 2011, 74–82.
Michel, D., Qammaz, A., and Argyros, A. A. (2017). “Markerless 3D Human Pose Estimation
and Tracking based on RGBD Cameras: an Experimental Evaluation.” Proceedings of the
10th International Conference on PErvasive Technologies Related to Assistive Environments,
Island of Rhodes, Greece, 115–122.
Nath, N. D., Akhavian, R., and Behzadan, A. H. (2017). “Ergonomic analysis of construction
worker’s body postures using wearable mobile sensors.” Applied Ergonomics, Elsevier, 62,
107–117.
Peddi, A., Huan, L., Bai, Y., and Kim, S. (2009). “Development of Human Pose Analyzing
Algorithms for the Determination of Construction Productivity in Real-Time.” Construction
Research Congress 2009, American Society of Civil Engineers, Reston, VA, 11–20.
Valero, E., Sivanathan, A., Bosché, F., and Abdel-Wahab, M. (2017). “Analysis of construction


trade worker body motions using a wearable and wireless motion sensor network.”
Automation in Construction, Elsevier, 83, 48–55.
Yang, J. (2009). “Toward Physical Activity Diary: Motion Recognition Using Simple
Acceleration Features with Mobile Phones.” Proceedings of the 1st international workshop
on interactive multimedia for consumer electronics, 1–10.
Yang, J., Shi, Z., and Wu, Z. (2016). “Vision-based action recognition of construction workers
using dense trajectories.” Advanced Engineering Informatics, Elsevier, 30(3), 327–336.
Yu, Y., Guo, H., Ding, Q., Li, H., and Skitmore, M. (2017). “An experimental study of real-time
identification of construction workers’ unsafe behaviors.” Automation in Construction,
Elsevier, 82, 193–206.


Pressure Transient Detection and Pattern Discovery in Water Distribution Systems


Lu Xing, S.M.ASCE1; and Lina Sela, Ph.D., M.ASCE2
1Dept. of Civil, Architectural, and Environmental Engineering, Univ. of Texas at Austin, 301 E.
Dean Keeton St., Stop C1786, Austin, TX 78712. E-mail: [email protected]
2Dept. of Civil, Architectural, and Environmental Engineering, Univ. of Texas at Austin, 301 E.
Dean Keeton St., Stop C1786, Austin, TX 78712. E-mail: [email protected]

ABSTRACT
In water distribution systems (WDSs), pressure transients reflect the responses of WDSs to
normal and abnormal changes. The behavior of pressure transients is largely unknown and
cannot be fully assessed by numerical simulation or modeling. This study proposed a two-step
time series data mining (TSDM) approach for detecting and clustering pressure transients to
reveal recurrent and consistent patterns based on high-resolution pressure data collected by a
network of sensors. First, the modified cumulative sum control chart (CUSUM) approach is
performed on the pressure data preprocessed by piecewise aggregative approximation (PAA) to
detect and extract pressure transient events. Second, a novel clustering technique is applied to
cluster the extracted pressure transients based on their similarity as measured by dynamic time
warping (DTW). An example application of this method has been implemented to confirm that
the proposed approach provides a fast and efficient way to reveal consistent and unique patterns
in WDSs.

INTRODUCTION
The American Society of Civil Engineers (ASCE) estimates that aging and deteriorating
infrastructure, operating at its design limits in terms of lifespan or capacity, wastes 14% to 18%
of the treated water (ASCE 2017). However, at the current average pipe replacement rate of
0.5% per year, replacing the whole system would take more than 200 years (ASCE 2017);
therefore, the presence of aging pipes in water distribution systems (WDSs) is inevitable,
posing a primary challenge to the monitoring and maintenance of WDSs.
The monitoring of WDSs can be effectively achieved by accessing the unsteady pressure,
commonly termed pressure transients, since pressure transients can reliably reflect the responses
of WDSs to normal and abnormal changes, such as pipe failures (i.e., background leakages and
bursts), system operations, and demand fluctuations. Pressure transients typically occur quickly
but can lead to pipe deterioration by introducing pressure extremes and variability
(Starczewska et al. 2015). Despite the general understanding that pressure variability should be
minimized, research quantitatively characterizing the roles that pressure variability plays in
WDSs remains inadequate (Ghorbanian et al. 2016).
The rapid development of data logging and data mining technologies over the past few decades
has made it possible to investigate pressure transients in WDSs in a more rigorous manner.
The emerging application of low-cost, high-frequency transient pressure transmitters (TPTs),
introduced to continuously monitor water pressure, directly contributes to the monitoring of
WDSs (Visenti 2018). High-frequency pressure data have primarily been used to validate and
calibrate simulation models by comparing and fitting numerical results to real-time
pressure data (Meseguer et al. 2014; Rathnayaka et al. 2016). Moreover, integrated model- and
data-driven approaches have begun to gain popularity in the domain of pipe failure detection in WDSs.


Nevertheless, to this point, the aforementioned studies have not considered using the data collected
by TPTs to characterize the typical patterns of pressure transients occurring in WDSs.
Motivated by the availability of high-frequency pressure data collected over an extended
period of time, we propose a time series data mining (TSDM) approach to detect the pressure
transients and discover the patterns of pressure transients from high-resolution data. The
objectives of this study are as follows: (a) the modification and application of cumulative sum
control chart (CUSUM) change detection scheme to high-resolution pressure data; (b) the
identification and employment of appropriate similarity search and pattern discovery algorithms
to pressure data in WDSs; (c) the intuitive illustration of the pressure transient patterns
discovered using the proposed methodology.

METHODOLOGY
This study proposed a two-step TSDM approach for transient detection and pattern discovery
based on high-resolution pressure data collected by TPTs. In the first step, the modified CUSUM
approach is performed on the pressure data preprocessed by piecewise aggregative
approximation (PAA) to detect and extract pressure transient events. In the second step, a novel
clustering technique, clustering by fast search and find of density peak (CFSFDP), is applied to
cluster the extracted pressure transients based on their similarity as measured by dynamic time
warping (DTW). An outline of the proposed TSDM algorithm is presented in Figure 1.
In Step I, the raw pressure data collected by the pressure sensors are preprocessed to reduce
the dimensionality and impulsive noise using the computationally efficient PAA technique
(Keogh et al. 2004). Subsequently, abrupt changes are detected from the pressure data with
reduced dimensionality, i.e., lower resolution. This study modified and adopted the CUSUM
algorithm as the change detection technique due to its efficiency, unbiasedness, and the intuitive
nature of its parameters. The CUSUM algorithm, originally proposed by Page (1954) as two
repeated uses of sequential probability tests, tracks the characteristics of the changes, i.e., rate
and magnitude, and compares these characteristics with control limits. Once the rate and
magnitude of a change exceed their respective control limits at the same time, an alarm is
raised and an abrupt change is defined and recorded. Based on the detected
pressure changes, pressure data of 5-min duration (i.e., the window size) are collected to represent
a complete pressure transient, which typically contains several changes.
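As an illustration of the PAA preprocessing step, the following minimal Python sketch replaces each fixed-length segment of a series with its mean. The function name and the simplifying assumption that the series length is a multiple of the segment length are ours, not details from the study.

```python
def paa(series, segment_len):
    """Piecewise aggregate approximation: replace each fixed-length
    segment of the series with its mean, reducing dimensionality and
    smoothing impulsive noise. Assumes len(series) is a multiple of
    segment_len for simplicity."""
    return [
        sum(series[i:i + segment_len]) / segment_len
        for i in range(0, len(series), segment_len)
    ]

# A 64 Hz signal aggregated to 10-s resolution would use
# segment_len = 640; here a toy series is reduced 2:1.
reduced = paa([60.0, 62.0, 61.0, 59.0], 2)  # -> [61.0, 60.0]
```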

Figure 1. Flowchart of the proposed TSDM approach.


In Step II, the similarities between the pressure transients collected in Step I are measured by
DTW, an elastic similarity metric. DTW operates in the time domain and measures the distance
by comparing the values at each time step while compensating for potential temporal misalignment
through elastic adjustment (Lines and Bagnall 2015). Predicated on the distance matrix
calculated by DTW, the set of extracted pressure transients can then be mapped into a 2-D space
through multidimensional scaling, from where CFSFDP (Rodriguez and Laio 2014) can be
applied to find recurrent and consistent pressure transient patterns.
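A minimal Python sketch of the DTW distance described above is given below. The absolute difference is assumed as the local cost, and the classic O(nm) dynamic program is used; the study's exact implementation (e.g., windowing constraints) may differ.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences, using
    absolute difference as the local cost. The warping path
    compensates for temporal misalignment between the sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] = cost of the best alignment of a[:i] and b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# Identical shapes shifted in time remain close under DTW:
d = dtw_distance([0, 0, 1, 2, 1, 0], [0, 1, 2, 1, 0, 0])  # -> 0.0
```

Computing this distance for every pair of extracted transients yields the pairwise distance matrix used in the clustering step.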
CFSFDP is based on the assumption that cluster prototypes should exhibit high local density,
i.e., be surrounded by a relatively large number of samples, and lie at a significant distance
from samples of higher density. In detail, for each sample i, CFSFDP computes its local
density (\rho_i) and its distance to the nearest higher-density data point (\delta_i), which are defined as:

\rho_i = \sum_{j} k(d_{ij}, d_c)   (1)

\delta_i = \begin{cases} \min_{j:\, \rho_j > \rho_i} d_{ij}, & \text{if } \exists\, j \text{ s.t. } \rho_j > \rho_i \\ \max_{j} d_{ij}, & \text{otherwise} \end{cases}   (2)

where d_{ij} is the distance between samples i and j (as measured by DTW in this study), d_c is the
cut-off distance, and k(d_{ij}, d_c) represents a radial basis function kernel, such as the cut-off kernel
or the Gaussian kernel. The results of \rho and \delta for the 2-D mapping of ten pressure transient
events (Figure 2(a)) are illustrated in a decision graph (Figure 2(b)). The dashed lines represent
the thresholds for \rho and \delta. Figure 2(b) shows that samples 0 and 6 (red points) are cluster
prototypes, while samples 5, 8, and 10 (blue points) are outliers.

Figure 2. Illustration of the CFSFDP algorithm: \delta_{10} is the \delta distance of the 10th sample, d_c is the
cutoff distance; red points represent cluster prototypes, blue points show outliers, and black
points stand for samples belonging to a certain cluster but not serving as its prototype.
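The \rho and \delta computations of Eqs. (1) and (2) can be sketched as follows, assuming the simple cut-off kernel (count of neighbors within d_c). The function name and the synthetic example distance matrix are ours for illustration; they are not taken from the study's data.

```python
def decision_values(dist, d_c):
    """Compute CFSFDP decision-graph values from a pairwise distance
    matrix: local density rho_i (cut-off kernel: number of neighbors
    within d_c) and delta_i (distance to the nearest point of higher
    density; for the densest point, the largest distance to any point)."""
    n = len(dist)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Synthetic 4-sample example: three nearby samples and one far outlier.
dist = [[0, 1, 2, 10],
        [1, 0, 1, 9],
        [2, 1, 0, 8],
        [10, 9, 8, 0]]
rho, delta = decision_values(dist, d_c=1.5)
# Sample 1 (high rho, high delta) would appear as a prototype in the
# decision graph; sample 3 (low rho, high delta) as an outlier.
```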


APPLICATION AND RESULTS


The performance of the proposed algorithms is tested on real data, collected by high-
frequency TPTs distributed in a large water utility. Each TPT unit includes a pressure sensor
taking 64 samples per second and a remote terminal unit (RTU). An example of a pressure
signal recorded by a TPT within a day is shown in Figure 3(a). The dataset available in this study
is high-resolution pressure data collected by one TPT from October 2017 to August 2018.

Figure 3. Change detection by applying the CUSUM algorithm: (a) daily pressure data and the
detected changes; (b) daily CUSUM results (c+ and c−); (c) zoom-in view of one
pressure transient signal; (d) zoom-in view of the CUSUM results.

Step I: Transient detection and extraction


As the preprocessing step, PAA segmentation is applied to the raw pressure data (blue line in
Figure 3(a)) to eliminate noise and reduce the dimensionality, resulting in pressure data with
10-s resolution (black line in Figure 3(a)), on which CUSUM is then performed. The pressure
changes detected by the CUSUM algorithm are characterized by the start time (t_s) and the pressure
at the start time (P_{t_s}), as well as the end time (t_e) and the pressure at the end time (P_{t_e}).
The duration (T) of a pressure change is then defined by T = t_e - t_s, while the amplitude (\Delta P)
is defined as \Delta P = P_{t_e} - P_{t_s}. This work focuses on significant (i.e., |\Delta P| \geq 10 psi)
and abrupt pressure changes with a changing rate greater than 0.1 psi/s.
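The screening criteria above (amplitude of at least 10 psi and rate greater than 0.1 psi/s) translate directly into code. The function and parameter names in this minimal sketch are ours; only the two thresholds come from the text.

```python
def keep_change(t_s, p_s, t_e, p_e, min_amp=10.0, min_rate=0.1):
    """Filter a detected pressure change by the thresholds stated in
    this study: amplitude |dP| >= 10 psi and rate > 0.1 psi/s.
    Times are in seconds, pressures in psi."""
    duration = t_e - t_s            # T = t_e - t_s
    amplitude = p_e - p_s           # dP = P_te - P_ts
    rate = abs(amplitude) / duration
    return abs(amplitude) >= min_amp and rate > min_rate

# A 12 psi drop over 60 s (rate 0.2 psi/s) is retained:
keep = keep_change(0.0, 72.0, 60.0, 60.0)  # -> True
```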
The results of CUSUM applied to the pressure data collected from one TPT within a typical
day are shown in Figure 3(b), where the orange dashed lines represent the time periods when
changes are detected. The detailed views of the pressure data and CUSUM results for the first
time period are presented in Figure 3(c) and (d). When the CUSUM value (c+ or c−)
exceeds the threshold, represented by the red dashed line, a pressure change is detected
and recorded. It can be noticed in Figure 3(d) that several changes occurred sequentially. Figure
3(c) illustrates the corresponding pressure signal: a pressure drop followed by three cycles of
pressure rises and drops, the combination of which constitutes a pressure transient.
The CUSUM algorithm is then applied to the pressure data collected by one TPT from October
2017 to August 2018, in which 1,314 pressure changes are detected; based on these, 586 five-minute
transient events are extracted from the original data and input into Step II.
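For illustration, the classic two-sided tabular CUSUM recursion is sketched below. The study uses a modified variant, so this is only an approximation of the detection step under stated assumptions: the target level, the slack parameter k, the control limit h, and the reset-on-alarm behavior are ours, not the paper's.

```python
def cusum_alarms(series, target, k, h):
    """Two-sided tabular CUSUM (classic form; the study's variant is
    modified). c_plus accumulates upward drift, c_minus downward drift;
    an alarm index is recorded whenever either statistic exceeds the
    control limit h, after which both statistics are reset."""
    c_plus = c_minus = 0.0
    alarms = []
    for t, x in enumerate(series):
        c_plus = max(0.0, c_plus + (x - target - k))
        c_minus = max(0.0, c_minus + (target - x - k))
        if c_plus > h or c_minus > h:
            alarms.append(t)
            c_plus = c_minus = 0.0
    return alarms

# Stable pressure around 70 psi, then a sustained drop:
sig = [70, 70.2, 69.8, 70, 55, 54, 55, 54.5]
alarms = cusum_alarms(sig, target=70.0, k=1.0, h=10.0)
```

With these toy parameters, the first alarm fires at the sample where the sustained drop begins, which is the behavior sketched in Figure 3(d).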

Step II: Pattern discovery


In terms of pattern discovery, our interest lies in the patterns, not the absolute amplitudes;
therefore, the extracted pressure transients are normalized. DTW distances between every pair of
these 586 normalized transient events are calculated. Consequently, the pairwise DTW distances
comprise a 586 × 586 (N × N) distance matrix, whose (i, j)-th element represents the DTW
distance between the i-th and j-th transient events. Based on the distance matrix, the set of pressure
transients is then mapped into a 2-D space, as shown in Figure 4(a), using a multidimensional
scaling technique (Maaten and Hinton 2008). Each point in Figure 4(a) represents a pressure
transient signal, and the distance between points reflects their similarity as measured by DTW:
the closer the i-th and j-th points are in the 2-D space, the more similar the i-th and j-th transient events.

Figure 4. 2-D feature space and decision graph.


The normalized pressure transients, represented by points in the 2-D space, can then be mined by
the CFSFDP algorithm. The resulting density (\rho) and distance (\delta) values are illustrated in the decision
graph shown in Figure 4(b). Four prototypes, located in the upper right quadrant, are
discovered, representing four distinct clusters. The color of each prototype point identifies its
cluster. The same color code is used in Figure 4(a) to show the cluster to which each point
belongs. As expected, closer points are clustered into the same cluster.
Figure 5 shows the clustering results and the prototype patterns for each cluster. Explicitly,
Cluster 0 includes 145 down-surge pressure transients, in which the pressure drops at the beginning
of the transient event, fluctuates, and finally stabilizes. If we define a pair of pressure increase and
decrease as a cycle, Pattern 0 consists of one major cycle and one minor cycle. Cluster 1, also
representing down-surge transient events (105 of them), differs from Cluster 0 in that the major
cycle of Pattern 1 occurs more abruptly than that of Pattern 0. Cluster 2, in contrast, comprises 137
up-surge transient events, characterized by a pressure rise followed by a pressure drop.
Only one major cycle exists in this pattern, and the pressure stabilizes at a higher level
than the original. In Cluster 3, which is made up of 199 transient events, the representative
pattern shares some similarity with Pattern 2 but differs in that the pressure of Pattern 3 returns
to its original level when stabilized.

Figure 5. Pressure transient clusters and their prototypes.

CONCLUSIONS AND FUTURE WORK


A TSDM approach aimed at discovering pressure transient patterns in WDSs using high-
resolution pressure data was presented herein. The modified CUSUM change detection
algorithm, together with DTW distance measurement and CFSFDP techniques, was applied to
detect and discover the representative patterns. The proposed procedure was demonstrated using
pressure data collected by a TPT. The example application has shown TSDM to be a powerful
tool for transient pressure analysis, capable of revealing consistent and unique patterns in WDSs.
However, the practical value of this TSDM approach is significant only when combined with
domain knowledge in the field of WDSs. To realize the full benefits and functionality of
TSDM in transient pressure analysis in WDSs, future research should be conducted
to reveal the causes and effects of different patterns of pressure transients, upon which
suggestions for system operations in WDSs can be proposed.


ACKNOWLEDGEMENTS
This work was supported by the University of Texas at Austin New Faculty Start Up Grant
and Cooperative Agreement No. 83595001 awarded by the U.S. Environmental Protection
Agency to The University of Texas at Austin. It has not been formally reviewed by EPA. The
views expressed in this document are solely those of the authors and do not necessarily reflect
those of the Agency. EPA does not endorse any products or commercial services mentioned in
this publication.

REFERENCES
ASCE (2017). “The 2017 infrastructure report card: A comprehensive assessment of America's
infrastructure.” American Society of Civil Engineers, Washington, DC.
Ghorbanian, V., Karney, B., and Guo, Y. (2016). “Pressure standards in water distribution
systems: reflection on current practice with consideration of some unresolved issues.”
Journal of Water Resources Planning and Management, 142(8), 04016023.
Keogh, E., Chu, S., Hart, D., and Pazzani, M. (2004). “Segmenting time series: A survey and
novel approach.” Data mining in time series databases, World Scientific, 1-21.
Lines, J. and Bagnall, A. (2015). “Time series classification with ensembles of elastic distance
measures.” Data Mining and Knowledge Discovery, 29(3), 565-592.
Maaten, L. v. d. and Hinton, G. (2008). “Visualizing data using t-SNE.” Journal of Machine
Learning Research, 9(Nov), 2579-2605.
Meseguer, J., Mirats-Tur, J. M., Cembrano, G., Puig, V., Quevedo, J., Perez, R., Sanz, G., and
Ibarra, D. (2014). “A decision support system for on-line leakage localization.”
Environmental modelling & software, 60, 331-345.
Page, E. S. (1954). “Continuous inspection schemes.” Biometrika, 41(1/2), 100-115.
Rathnayaka, S., Keller, R., Kodikara, J., and Chik, L. (2016). “Numerical simulation of pressure
transients in water supply networks as applicable to critical water pipe asset management.”
Journal of Water Resources Planning and Management, 142(6), 04016006.
Rodriguez, A. and Laio, A. (2014). “Clustering by fast search and find of density peaks.”
Science, 344(6191), 1492-1496.
Starczewska, D., Collins, R., and Boxall, J. (2015). “Occurrence of transients in water
distribution networks.” Procedia Engineering, 119 (Computing and Control for the Water
Industry, CCWI2015), 1473-1482.
Visenti (2018). “Keeping an intelligent eye on your assets.” <https://www.visenti.com/>.


Deep Learning Models for Content-Based Retrieval of Construction Visual Data


Nipun D. Nath1 and Amir H. Behzadan, Ph.D.2
1Zachry Dept. of Civil Engineering, Texas A&M Univ., College Station, TX 77843. E-mail:
[email protected]
2Dept. of Construction Science, Texas A&M Univ., College Station, TX 77843. E-mail:
[email protected]

ABSTRACT
Deep learning (DL) algorithms such as convolutional neural networks (CNNs) can assist in
tasks such as content search and retrieval, image tagging and captioning, scene description,
motion prediction, and language processing. This paper presents research that aims at designing
and validating DL models for automated content-based retrieval of daily construction images and
videos. Information retrieval from visual data is key to labor-intensive tasks such as safety
inspection, crew activity logging, and work progress documentation. In order to train deep neural
networks (DNNs), large repositories of high-quality annotated visual data are needed. However,
generating such labeled datasets in construction is non-trivial and resource intensive, and
requires specific skillset. To overcome this challenge, we present a methodology for fast object
detection and tagging in visual data using DNNs trained with a relatively small dataset. Two
state-of-the-art object detection algorithms, i.e., you-only-look-once (YOLO) and mask region-
based CNN (a.k.a., Mask R-CNN) are investigated. Training data is obtained via web mining
(the Internet) and crowdsourcing. Results show that training on data from both sources yields the
best classification accuracy. Testing the model on new data reveals that the fully-tuned model
can achieve a minimum mean average precision (mAP) of 79% when tested on different image
subsets.

INTRODUCTION
In recent years, machine learning has become a major area of research in many disciplines
that deal with data-intensive applications including but not limited to computer vision, robotics,
speech and handwriting recognition, human-computer interaction (HCI), and biomedicine
(Witten 2016, Nasrabadi 2007). Among several approaches to machine learning, deep learning
(DL) has gained much popularity due to its ability to handle high volume of data while yielding
high accuracy. A DL model trained on a large image dataset can be used for visual or semantic
content search and retrieval, image tagging and captioning, scene understanding and description,
motion prediction, and language processing (Deng and Yu 2014, Schmidhuber 2015). In
particular to the construction domain, analyzing jobsite imagery and videos with a DL model can
assist in generating various key project documents such as progress and safety reports, request
for information (RFI), and change orders with increased accuracy and timeliness.
With these motivations, in Project PICTOR, the authors aim at designing and validating DL
models for automated content-based retrieval of daily construction images and videos. A key
facilitator of this research is the exponentially increasing number of images and videos captured
on a daily basis from construction sites using digital cameras, drones, and smartphones.
However, a review of existing methodologies and practices reveals that in most cases, such
visual data are manually sorted, analyzed for content, and ultimately archived. The majority of
digital images and videos are stored only with date/time and/or geolocation tags, making the


manual retrieval of a particular subset of digital media based on specific visual contents a non-
trivial and extremely time-consuming task. A potential remedy to this challenge is to
automatically analyze the captured scene for content followed by identifying specific objects of
interest (a.k.a. object detection) using a computer algorithm. For example, the progress of steel
frame erection can be quantified by comparing multiple images (or successive video frames)
captured over a period of time from a building under construction. Similarly, identifying heavy
equipment and workers and their spatial relationship would be very useful for monitoring
productivity, safety, and planning resource allocation. When performed at a high processing rate,
this approach can also cater to specific project needs such as creating and/or updating as-built
information in (near) real-time, marking deviations from plans, identifying risky behaviors, and
detecting impending accidents.
Within the general area of computer vision and digital data mining, and thanks in part to the
availability of large labeled image datasets, recent work such as Krizhevsky et al. (2012), and
Simonyan and Zisserman (2014) has made it possible to identify everyday objects/animals in
digital imagery using DL and computational methods. Despite this, to the authors’ best
knowledge, there is a dearth of publicly available datasets containing labeled construction site
imagery. In particular to the construction domain, a limited number of studies designed and
validated methodologies for recognizing construction equipment and materials (Zou and Kim
2007; Brilakis and Soibelman 2008) using traditional machine learning. A major limitation of
such studies is the need for careful extraction and selection of hand-crafted features that best
represent the content of a particular dataset, and therefore the trained models may not necessarily
be generalizable to a new dataset. In contrast, DL techniques can achieve significantly better
results by streamlining the feature learning process (Ding et al. 2018; Kolar et al. 2018; Siddula
et al. 2016). However, most DL frameworks require large computational time for both training
and testing due to the deep neural network (DNN) architecture, which could impede their
applicability to real-time prediction or detection problems. In light of this, the research presented
in this paper describes a methodology for building and testing a DL model trained on a relatively
small dataset but capable of detecting construction objects (e.g., building, equipment, worker) in
real-time and with high fidelity.

LITERATURE REVIEW
Object detection using hand-crafted features generally utilizes color information, e.g., HSV
(hue, saturation, and value) color space, and/or geometric information (e.g., shapes, edges). Zou
and Kim (2007) used saturation threshold to detect excavators. Brilakis and Soibelman (2008)
proposed a methodology to identify shapes and detect corresponding material types (e.g., steel,
concrete). Wu et al. (2009) applied Canny edge detection (Canny 1986) and watershed
transformation (Beucher and Meyer 1992) to detect column edges in an image. Recent studies
also applied traditional machine learning algorithms to recognize objects from hand-crafted
features. Examples include but are not limited to Chi and Caldas (2011) who used Naïve Bayes
(NB) and neural network (NN) classifiers to detect workers, loaders, and backhoes, as well as
Han and Golparvar-Fard (2015) who applied Support Vector Machine (SVM) to classify ~20
types of construction materials. However, these methodologies require careful extraction and
selection of features relevant to specific classes that might not be directly applicable to other
cases (Kolar et al. 2018). DL-based methods such as convolutional neural network (CNN), on
the other hand, offer a more convenient solution due to their ability to self-learn relevant and
useful features in a large number of annotated images (Kolar et al. 2018). In particular, CNN has


proven to achieve significantly better results in image classification as well as reduce


computational time compared to traditional NN (LeCun et al. 1998). Applications of CNN model
are plentiful, including handwritten digit recognition (LeCun et al. 1998), and identification of
everyday objects from among a total of 1,000 different classes (Krizhevsky et al. 2012;
Simonyan and Zisserman 2014). In the construction domain, Kolar et al. (2018) used CNN to
identify safety guardrails, Siddula et al. (2016) used CNN coupled with the Gaussian mixture
model (GMM) (Reynolds 2015) to detect objects related to roof construction, and Ding et al.
(2018) combined CNN with the long short-term memory (LSTM) to recognize unsafe behaviors
of construction workers (e.g., climbing a ladder) in video frames. More recently, the authors
have proposed a method to classify construction images using deep and transfer learning
techniques (Nath et al. 2018).
For object detection problems, Region-based CNN (R-CNN) is one of the most powerful
methods that can achieve excellent results (Girshick 2015). R-CNN performs object detection in
two steps. First, it finds candidate regions (a.k.a. regions of interest) for objects, and then, it
applies CNN to classify objects inside the proposed regions (Girshick 2015). To overcome the
enormous computational burden of R-CNN, however, more efficient variants of the algorithm,
e.g., Fast R-CNN (Girshick 2015) and Faster R-CNN (Ren et al. 2015), have been proposed.
Another variant, namely Mask R-CNN, extends Faster R-CNN by adding an extra branch which
outputs segmentation masks on top of the existing branches that output classification labels and
bounding boxes (He et al. 2017). Unlike these approaches, the YOLO (You-Only-Look-Once)
algorithm unifies classification and localization (i.e., predicting bounding boxes) tasks in a single
neural network, which allows extremely fast detection of objects (Redmon et al. 2016).
Therefore, it is possible to detect objects in real-time from a live video stream by sequentially
and individually processing successive frames at a rate of ~30 frames per second (FPS).
Real-time (or near real-time) detection is vital, particularly for ensuring safety and mitigating
risks. For example, contact collisions can be prevented by real-time detection and localization of
workers in close proximity to equipment. Therefore, the main focus of this paper is on
algorithms that can perform (near) real-time object detection (i.e., Mask R-CNN and YOLO).

METHODOLOGY
The schematic diagram of the overall methodology is shown in Figure 1. As shown in this
Figure, images are sourced from the Internet via web mining and through crowdsourcing. In this
work, the Google image search engine is used for web mining (Deng et al. 2009; Fergus et al. 2005).
Images are collected from this database by searching the following keywords: 1) building under
construction, 2) construction equipment, and 3) construction worker. Also, crowdsourced images
are collected from a number of construction projects in China. Collectively, these images
constitute the PICTOR v.1 dataset. Next, Labelbox (a web-based labeling toolbox) is used to label
the images within the dataset. Particularly, three types of objects, i.e., building (buildings under
construction), equipment (construction equipment, e.g., trucks, excavators, and loaders), and
worker (construction workers), are marked by drawing polygonal shapes around the boundaries
of these objects (a.k.a. semantic segmentation). The average time required to annotate an image
from the Internet and crowdsourced dataset is ~99 seconds and ~163 seconds, respectively. Of
note, there are more instances of each class per image in the images obtained through web
mining, compared to crowdsourced images. For instance, there are 1,816 instances of “worker”
in 667 Internet images (2.72 workers/image), versus 1,906 instances of “worker” in 832
crowdsourced images (2.29 workers/image). Therefore, the time required for annotating Internet


images is generally higher. Table 1 shows the breakdown of PICTOR v.1 dataset by the number
of images containing each object type. In this Table, a single image may contain multiple object
classes.

Figure 1. Schematic diagram of the designed methodology.


Table 1. Number of sourced images in PICTOR v.1 dataset containing each object type.
Internet Crowdsourced
Class Train Test Train Test
Building 319 299 324 147
Equipment 577 468 375 166
Worker 361 306 589 243
Total 1,000 862 1,000 417

Table 2. Three models and their descriptions.


Model Algorithm CNN Architecture FLOP a Pre-trained Dataset
YOLO V2 YOLO 23 Conv. Layers 62.94 Billion VOC
Tiny YOLO YOLO 9 Conv. Layers 5.41 Billion VOC
Mask RCNN Mask RCNN ResNet-50 3.8 Billion COCO
a
Floating-point operations: the number of mathematical operations (e.g., addition, subtraction,
multiplication, and division) performed on floating-point (real) numbers.

For a fair comparison between each source (Internet and crowdsourced subsets), 1,000
images from each subset are randomly selected for training and the rest is used for testing. For
consistent performance, all models utilize transfer learning, i.e., learning general features via pre-
training with a different but related (and larger) dataset, and learning more specific features
related to the target classes via re-training some parts of the model (a.k.a. fine-tuning) with the
desired (and smaller) dataset (Oquab et al. 2014; Shin et al. 2016). Particularly, three different
models are created using three different architectures, namely, YOLO V2, Tiny YOLO, and Mask
R-CNN (ResNet-50). A summary of these models is given in Table 2.
The YOLO V2 and Tiny YOLO models both use the YOLO algorithm for detection and are
pre-trained on the Pattern Analysis, Statistical Modelling and Computational Learning
(PASCAL) Visual Object Class (VOC) dataset (Everingham et al. 2015). On the other hand, the
Mask RCNN model utilizes the Mask R-CNN algorithm and is pre-trained on Microsoft’s
Common Objects in Context (COCO) dataset (Lin et al. 2014). Each model is trained and tested
on PICTOR v.1 subsets, both individually and collectively, and the mean average precision
(mAP) is calculated to compare the performance of the models.
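The mAP figure of merit can be sketched with the 11-point interpolated average precision used in the PASCAL VOC evaluation. This is a simplified illustration: a full evaluation would first match detections to ground-truth boxes by IoU to obtain the precision-recall points.

```python
def average_precision(recalls, precisions):
    """11-point interpolated AP (PASCAL VOC style): average, over the
    recall levels 0.0, 0.1, ..., 1.0, of the maximum precision achieved
    at recall >= that level."""
    ap = 0.0
    for t in [i / 10 for i in range(11)]:
        ap += max((p for r, p in zip(recalls, precisions) if r >= t),
                  default=0.0) / 11
    return ap

# Toy precision-recall points for a single class; mAP is simply the
# mean of the per-class APs.
recalls = [0.2, 0.4, 0.6, 0.8]
precisions = [1.0, 0.9, 0.7, 0.5]
print(round(average_precision(recalls, precisions), 3))  # 0.655
```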

RESULTS AND DISCUSSION


The performance of all three models in terms of mAP is shown in Figure 2. From this Figure,
it can be seen that overall, the performance of YOLO V2 is better than the other two models.
Particularly, the YOLO V2 model trained on the web mining subset or both subsets and tested on
the web mining subset can achieve 86% mAP. Of note, for comparison, the YOLO V2 model
trained on VOC-2007 and VOC-2012 training datasets has achieved 78.6% mAP when tested on
the VOC-2007 validation dataset (Everingham et al. 2015). In general, with an Intel Core i7-8850H
CPU @ 2.60 GHz (six cores), 16 GB RAM, and an NVIDIA Quadro P2000 4GB GPU, the inference
speed of YOLO V2, Tiny YOLO, and Mask R-CNN models is ~16, ~41, and ~2.5 FPS,
respectively.

Figure 2. Performance of three models trained and tested on different datasets.


It can be observed in Figure 2 that models tested on the web mining subset generally
perform better than those tested on the crowdsourced subset. This can be attributed to the fact
that in general, Internet images have better quality (taken with good cameras, in well-lit
conditions, without obstacles) preserving image details. On the contrary, crowdsourced images
are of lower quality (taken in diverse lighting and weather conditions, containing visual
obstacles). However, interestingly, models perform better when trained on images from both
subsets (web mining and crowdsourced). This can be best described considering that models
learn relevant and useful features from the well-structured Internet images, while diverse and
challenging crowdsourced images help prevent overfitting and assist in selecting features that are
more general. In short, this counteracting balance makes the models more robust. In this work, a
model that can perform well on crowdsourced images is of interest since these images resemble
the real-world conditions. Therefore, the YOLO V2 model is selected for real-time construction
object detection. As shown in Figure 3, this model takes a 416×416 image input and outputs
rectangular bounding boxes for detected objects, each associated with a confidence level
indicating the probability of the detected object inside the box. A threshold value is used to
discard detections with low confidence. A lower threshold allows the model to output more
detections with lower confidence, increasing false positives, and lowering detection precision.
On the contrary, a high threshold filters out potentially correct detections, increasing false
negatives, and lowering detection recall. This inherent precision-recall tradeoff can be better
understood from Figure 4. It is found that for a confidence threshold of 0.1, the harmonic mean
of precision and recall (a.k.a. F1 score) is optimal at 28.75%.
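The threshold-driven precision-recall tradeoff described above can be sketched as follows. The detection list is a toy example of (confidence, correct?) pairs, assuming each detection has already been matched against ground truth.

```python
def prf_at_threshold(detections, num_ground_truth, threshold):
    """Filter detections by confidence, then compute precision, recall,
    and F1 against the ground-truth count. `detections` is a list of
    (confidence, is_true_positive) pairs."""
    kept = [tp for conf, tp in detections if conf >= threshold]
    tp = sum(kept)       # true positives among kept detections
    fp = len(kept) - tp  # false positives among kept detections
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / num_ground_truth
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1

# Toy (confidence, correct?) detections: a lower threshold keeps more
# boxes, raising recall at the cost of precision.
dets = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
for t in (0.1, 0.5):
    p, r, f1 = prf_at_threshold(dets, num_ground_truth=4, threshold=t)
    print(t, round(p, 2), round(r, 2), round(f1, 2))
```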

Figure 3. Detection of building (blue), equipment (green), and worker (red) by YOLO (top
row) and Mask R-CNN (bottom row) algorithms.

Figure 4. Precision and recall of the YOLO V2 model.


SUMMARY AND CONCLUSION
DL algorithms can learn features and classify objects with high accuracy. However, two
major challenges still exist in applying DL, particularly in the construction domain: (1) scarcity of
large labeled datasets, and (2) high computational time. In this paper, the authors presented a DL
technique to overcome these challenges by detecting construction objects with high precision in
real-time. First, a large number of images were collected from the Internet (using web mining)
and crowdsourcing to construct a large dataset named PICTOR v.1. Transfer learning was used
to utilize the features learned from larger and more relevant datasets. Next, two algorithms,
namely YOLO and Mask R-CNN were used to train and test models to detect objects in real-
time. Three different models (i.e., YOLO V2, Tiny YOLO, and Mask R-CNN) were trained and
tested on three different combinations of the PICTOR v.1 dataset (i.e., web mining and
crowdsourced subsets, and the entire dataset). It was found that the YOLO V2 model performed
better, particularly, when trained on the entire dataset (images from the Internet and
crowdsourcing), and was able to detect buildings, equipment, and workers in crowdsourced and
web mining image subsets with 79% and 86% mAP, respectively. Therefore, this model can be
used for analyzing construction images/videos and retrieving specific contents of interest.

ACKNOWLEDGMENTS
The authors gratefully acknowledge the U.S. National Science Foundation (NSF) for supporting this
project through grant CMMI 1800957, and Mr. Yalong Pi for assisting in data preparation. Any
opinions, findings, conclusions, and recommendations expressed in this paper are those of the
authors and do not necessarily represent the views of the NSF or the individual named above.

REFERENCES
Beucher, S., and Meyer, F. (1992). “The morphological approach to segmentation: the watershed
transformation.” Optical Engineering-New York-Marcel Dekker Incorporated, 34, 433-433.
Brilakis, I., and Soibelman, L. (2008). “Shape-based retrieval of construction site photographs.”
J. Comput. Civ. Eng., 22, 14–20.
Canny, J. (1986). “A computational approach to edge detection.” IEEE Transactions on pattern
analysis and machine intelligence, 6, 679-698.
Chi, S., and Caldas, C. H. (2011). “Automated object identification using optical video cameras
on construction sites.” Comput.-Aided Civ. Infrastruct. Eng., 26, 368–380.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). “ImageNet: A large-
scale hierarchical image database.” Proc., IEEE Conference on Computer Vision and Pattern
Recognition, 248–255.
Deng, L. and Yu, D. (2014). “Deep learning: methods and applications.” Foundations and
Trends in Signal Processing, 7(3–4), 197-387.
Ding, L., Fang, W., Luo, H., Love, P. E., Zhong, B., and Ouyang, X. (2018). “A deep hybrid
learning model to detect unsafe behavior: Integrating convolution neural networks and long
short-term memory.” Autom. Constr., 86, 118–124.
Everingham, M., Eslami, S. A., Van Gool, L., Williams, C. K., Winn, J., and Zisserman, A.
(2015). “The pascal visual object classes challenge: A retrospective.” Int. J. Computer
Vision, 111(1), 98-136.
Fergus, R., Fei-Fei, L., Perona, P., and Zisserman, A. (2005). “Learning object categories from
Google’s image search.” Proc., 10th IEEE International Conference on Computer Vision,
1816–1823.
Girshick, R. (2015). “Fast R-CNN.” Proc., IEEE International Conference on Computer Vision,
1440-1448.
Han, K. K., and Golparvar-Fard, M. (2015). “Appearance-based material classification for
monitoring of operation-level construction progress using 4D BIM and site photologs.”
Autom. Constr., 53, 44–57.
He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017). “Mask R-CNN.” Proc., IEEE
International Conference on Computer Vision, 2980-2988.
Kolar, Z., Chen, H., and Luo, X. (2018). “Transfer learning and deep convolutional neural
networks for safety guardrail detection in 2D images.” Autom. Constr., 89, 58–70.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). “ImageNet classification with deep
convolutional neural networks.” Proc., Advances in Neural Information Processing Systems,
1097–1105.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). “Gradient-based learning applied to
document recognition.” Proc., IEEE, 2278–2324.
Lin, T-Y, Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C.
L., (2014). "Microsoft COCO: Common objects in context." Proc., European Conference on
Computer Vision, 740-755.
Nasrabadi, N. M. (2007). “Pattern recognition and machine learning.” Journal of electronic
imaging, 16(4), 049901.
Nath, N., Chaspari, T., Behzadan, A. H. (2018). “A Transfer Learning Method for Deep Neural
Network Annotation of Construction Site Imagery.” Proc., 18th Int. Conf. on Construction
Applications of Virtual Reality, Auckland, New Zealand, 1-10.
Oquab, M., Bottou, L., Laptev, I., and Sivic, J. (2014). “Learning and transferring mid-level
image representations using convolutional neural networks.” Proc., IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), IEEE, 1717–1724.
Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016). “You only look once: Unified,
real-time object detection.” IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), IEEE, 779-788.
Ren, S., He, K., Girshick, R., and Sun, J. (2015). “Faster R-CNN: Towards real-time object
detection with region proposal networks.” Proc., Advances in Neural Information Processing
Systems, 91-99.
Reynolds, D. (2015). “Gaussian mixture models.” Encyclopedia of biometrics, 827-832.
Schmidhuber, J. (2015). “Deep learning in neural networks: An overview.” Neural networks, 61,
85-117.
Shin, H.-C., Roth, H. R., Gao, M., Lu, L., Xu, Z., Nogues, I., Yao, J., Mollura, D., and Summers,
R. M. (2016). “Deep convolutional neural networks for computer-aided detection: CNN
architectures, dataset characteristics and transfer learning.” IEEE Trans. Med. Imaging, 35,
1285–1298.
Siddula, M., Dai, F., Ye, Y., and Fan, J. (2016). “Unsupervised feature learning for objects of
interest detection in cluttered construction roof site images.” Procedia Eng., 145, 428–435.
Simonyan, K., and Zisserman, A. (2014). “Very deep convolutional networks for large-scale
image recognition.” ArXiv Prepr. ArXiv:1409.1556.
Witten, I.H., Frank, E., Hall, M.A. and Pal, C.J. (2016). Data Mining: Practical machine
learning tools and techniques. Morgan Kaufmann.
Wu, Y., Kim, H., Kim, C., and Han, S. H. (2009). “Object recognition in construction-site
images using 3D CAD-based filtering.” J. Comput. Civ. Eng., 24, 56–64.
Zou, J., and Kim, H. (2007). “Using hue, saturation, and value color space for hydraulic
excavator idle time analysis.” J. Comput. Civ. Eng., 21, 238–246.


Unsupervised Machine Learning for Augmented Data Analytics of Building Codes


Ruichuan Zhang, S.M.ASCE1; and Nora El-Gohary, Ph.D., M.ASCE2
1Dept. of Civil and Environmental Engineering, Univ. of Illinois at Urbana–Champaign, 205
North Mathews Ave., Urbana, IL 61801. E-mail: [email protected]
2Dept. of Civil and Environmental Engineering, Univ. of Illinois at Urbana–Champaign, 205
North Mathews Ave., Urbana, IL 61801. E-mail: [email protected]

ABSTRACT
Existing automated code checking methods/tools are unable to automatically analyze and
represent all types of requirements (e.g., requirements that are too complex or that require human
judgement). Recent efforts in the area of augmented data analytics have proposed the use of
templates to facilitate the analysis of text. However, most of these efforts have constructed such
templates manually, which is labor-intensive. More importantly, it is difficult for manually-
developed templates to capture the linguistic variations in building codes. More research is, thus,
needed to automate the generation of templates to support the tagging and extraction of
information from building codes. To address this need, this paper proposes an unsupervised
machine-learning based method to extract sentence templates that describe syntactic and
semantic features and patterns from building codes. The proposed method is composed of four
main steps: (1) data preprocessing; (2) identifying the different groups of sentence fragments
using clustering; (3) identifying the fixed parts and the slots in the templates based on the
syntactic and semantic patterns of the sentence fragment groups; and (4) evaluating the extracted
templates. The proposed method was implemented and tested on a corpus of text from the
International Building Code. An accuracy of 0.76 was achieved.

INTRODUCTION
Various automatic and semi-automatic code checking methods have been developed,
including methods for automated text analysis (to capture the requirements in the textual codes)
and automated rule formalization (to formalize these requirements in the form of computable
rules). Although these methods have achieved different levels of automation and accuracy, they
have one common limitation, which is the inability to represent all types of building code
requirements, especially those requirements that have complex syntactic and semantic structures,
such as nested clauses and multiple proposition (verb-argument) units. Therefore, efforts must be
made to enhance the analysis of building code requirement sentences before any automatic or
semi-automatic compliance checking can be improved.
Recent efforts in the area of augmented data analytics have proposed the use of document
and sentence templates to facilitate the analysis of natural language data. A template is usually
composed of several fixed parts and slots. The fixed parts consist of words and/or phrases
frequently appearing in the corpus, and the slots are labeled by the syntactic and/or semantic
roles that the corresponding sentence fragments play in the sentence. Templates have been used
to facilitate semantic annotation and information extraction for various applications such as
document management (Arantes and Falbo 2010) and intelligent contracting (Clack et al. 2016).
A limited number of template-based approaches have also been used in the construction domain
for the analysis of contract specifications. For example, Ryoo et al. (2010) used premade
templates to facilitate the writing and editing of construction specifications in a web-based
system (Ryoo et al. 2010). Commercial software, such as e-Specs (Avitru 2018) and BIMdrive
(Digicon 2018), also used templates based on the National Master Specifications for developing
BIM-compatible specifications.
However, the template-based approach usually assumes that the templates are pre-defined or
manually constructed by domain experts, which is labor-intensive. Such manually-developed
templates are also typically rigid and static, lacking the flexibility and dynamism to capture the
linguistic variations across different types of chapters/documents. Manually-developed templates
are, thus, difficult to adapt from one chapter/document to another, especially when dealing with
complex, technical, and/or domain-specific documents like specifications and building codes.
Existing research to automate the generation of templates from text data includes automatic
template extraction from human-written weather reports for summarization (Das et al. 2008),
from news reports for information extraction (Chambers and Jurafsky 2011), and from emails for
email auto-reply (Proskurnia et al. 2017). Generally, automatic or semi-automatic template
generation approaches are composed of two primary steps: text/sentence clustering and template
induction from the clusters of textual data, with the latter step consisting of two substeps to deal
with the fixed parts and the slots, respectively. The objects to be clustered and the similarity
measures for clustering vary from one effort to another [e.g., verbs and WordNet similarity (Das
et al. 2008, Chambers and Jurafsky 2011), sentences and ROUGE similarity (Das et al. 2008),
and term frequencies and Euclidean distance (Proskurnia et al. 2017)] depending on the type of
text, domain/application characteristics and requirements, and template generation approach
taken. Thus, there is a need to develop a domain-specific automatic template generation method
to support the tagging and extraction of information from building codes.
To address this need, this paper proposes an unsupervised learning-based approach for
automatic generation of templates from building code sentences. The proposed approach consists
of four steps: data preparation and preprocessing; clustering of sentence fragments; induction of
templates from the clusters; and evaluation of the generated templates.

Figure 1. Proposed unsupervised learning-based approach for augmented data analytics of building codes.
BACKGROUND
Constituency parsing and shallow semantic labeling: Constituency parsing aims to
organize words in a sentence into nested constituents based on phrase structures (Jurafsky 2000).
The result of constituency parsing is often represented in the form of a tree, where the nodes are
phrase structure categories, and the leaves are part-of-speech (POS) tags and words. Shallow
semantic role labeling aims to extract the proposition units, which consist of target verbs and other
constituents, where each fills a specific semantic role of the verb (Carreras and Màrquez 2005).
The semantic roles include agent, patient, and also adjuncts such as location, manner, etc. The
results of both techniques are usually used to gain a better understanding of the syntactic and
semantic structure of a sentence, which can function as features in machine learning problems
such as clustering to further analyze the natural language data.
Agglomerative hierarchical clustering: Clustering is an unsupervised learning problem that
aims to find groups of similar objects in the data (Aggarwal and Zhai 2012). Clustering
algorithms can be classified into several categories: agglomerative, partitioning, and probabilistic
model-based algorithms. Agglomerative hierarchical clustering aims to successively merge
groups of data in a pairwise manner based on the pairwise distance/similarity, until all the data
are within one single group (Aggarwal and Zhai 2012).

PROPOSED UNSUPERVISED APPROACH FOR AUGMENTED DATA ANALYTICS OF BUILDING CODES
The proposed unsupervised machine learning-based approach for automatic generation of
templates for supporting building code analytics consists of four main steps, as per Figure 1.
Step 1: Data Preparation and Preprocessing: Around 1,000 sentences were collected from
Chapters 11 to 16 of the International Building Code (IBC) 2009 (ICC 2009). Three steps of data
preprocessing were conducted: tokenization, lowercasing, and stemming. All sentences were
then tagged and parsed using the Stanford CoreNLP constituency parser (Manning et al. 2014).
The results of constituency parsing were used for slot interpretation when inducing the templates
(Step 3.2). The sentences were also labeled using a shallow semantic role labeler, which
segments a sentence into several proposition units. Each proposition unit has a verb-argument
structure that is composed of: (1) one verb (V); (2) arguments (A): arguments are usually noun
phrases that define the verb-specific semantic roles [e.g., the agent (A0), the patient or theme
(A1), and a general argument with no specific semantic meaning (A2)]; (3) adjuncts (AM):
adjuncts are adverbs, adverb phrases, and preposition phrases that describe and/or modify the
verb [e.g., location (LOC), modal verb (MOD)]; and (4) references (R). The results of shallow
semantic role labeling were used as features for sentence fragment clustering (Step 2) and slot
interpretation (Step 3.2). Figure 2 shows an example of the semantic role labeling results.
Step 2: Sentence Fragment Clustering: Pairwise distance calculation. Edit distance was
used to describe the dissimilarity between each pair of proposition units generated by shallow
semantic role labeling. In edit distance, three types of edit operations are defined: removal,
insertion, and substitution (Jurafsky 2000). Each operation adds one to the distance between the
two sentences. For example, the edit distance between proposition units “A0 A1 V” and “A0 A1
LOC V” is 1 because the first unit can be converted into the second unit after one insertion
operation of the tag “LOC”.
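The edit distance over proposition-unit tag sequences is the standard Levenshtein distance, computed with dynamic programming:

```python
def edit_distance(a, b):
    """Levenshtein distance between two tag sequences; removal,
    insertion, and substitution each add 1 to the distance."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # removal
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

# The paper's example: "A0 A1 V" -> "A0 A1 LOC V" needs one insertion.
print(edit_distance(["A0", "A1", "V"], ["A0", "A1", "LOC", "V"]))  # 1
```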
Agglomerative hierarchical clustering. Different agglomerative hierarchical clustering
methods were tested and evaluated based on their average silhouette coefficients (Rousseeuw 1987):
single, complete, average, centroid, McQuitty, median, and Ward’s. The centroid method
achieved the highest performance, with an average silhouette coefficient of 0.5, and was
therefore selected. The centroid method defines the pairwise similarity as the cosine similarity
between the centroids of two groups (Steinbach et al. 2000). The average silhouette coefficient is
the average of the silhouette coefficients of all the data. The silhouette coefficient of a datum is
computed as per the following equation, where a(i) is the average difference between datum i
and the other data in the same cluster, and b(i) is the lowest average difference between i and
the other clusters. The silhouette coefficient ranges from -1 to 1. A value near 1 indicates that the
datum is far from the neighboring clusters; and a negative value indicates that the datum might
be assigned to a wrong cluster. A higher average silhouette coefficient indicates better clustering
result, which is essential for the following steps of template induction. If the size of a cluster is
too small compared to the average size of all the clusters, the sentence fragments corresponding
to the cluster are treated as outliers and are neglected in the following template generation steps.
Figure 2 shows an example of the clustering results.
silhouette(i) = (b(i) − a(i)) / max{a(i), b(i)}
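The silhouette computation can be sketched on toy one-dimensional data (in the actual work, the points are proposition units and the distance is the pairwise edit distance described above):

```python
def silhouette(i, labels, dist):
    """Silhouette of datum i: (b - a) / max(a, b), with a the mean
    distance to i's own cluster and b the lowest mean distance to
    any other cluster."""
    def mean_dist(members):
        return sum(dist(i, j) for j in members) / len(members)

    own = [j for j, c in enumerate(labels) if c == labels[i] and j != i]
    a = mean_dist(own)
    b = min(mean_dist([j for j, c in enumerate(labels) if c == k])
            for k in set(labels) - {labels[i]})
    return (b - a) / max(a, b)

# Toy 1-D data: two well-separated clusters should score near 1.
points = [0.0, 0.1, 0.2, 5.0, 5.1]
labels = [0, 0, 0, 1, 1]
dist = lambda i, j: abs(points[i] - points[j])
scores = [silhouette(i, labels, dist) for i in range(len(points))]
print(round(sum(scores) / len(scores), 2))  # close to 1 for this toy data
```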
Step 3: Template Generation: Defining the fixed parts. A template was generated for each
type of sentence or sentence fragment. The fixed parts in the templates were identified based on
the frequent words. First, the frequent words are identified. Then, for each sentence fragment
template, the frequent words form the fixed parts of the template. Term frequency was used to
find the frequent words in each cluster. The term frequency of every word in all sentence
fragments was calculated for each cluster. Articles such as “a”, “an”, and “the”, symbols, and
punctuations were neglected when calculating the term frequency. A threshold n was set to
define the frequent words. The top n percent of words in terms of frequency (or words with the
highest frequency if the size of the vocabulary is too small) were treated as frequent words.
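The frequent-word selection can be sketched as follows; the fragments are toy examples, and the top-n-percent threshold follows the description above.

```python
from collections import Counter

ARTICLES = {"a", "an", "the"}

def frequent_words(fragments, top_percent=0.05):
    """Rank a cluster's vocabulary by term frequency (ignoring
    articles and non-alphanumeric tokens) and keep the top
    `top_percent` as the template's fixed-part candidates."""
    counts = Counter(w for frag in fragments for w in frag.lower().split()
                     if w.isalnum() and w not in ARTICLES)
    ranked = [w for w, _ in counts.most_common()]
    n = max(1, round(len(ranked) * top_percent))
    return set(ranked[:n])

fragments = [
    "installed in accordance with section 1405",
    "constructed in accordance with section 1503",
    "anchored in accordance with section 1604",
]
print(frequent_words(fragments, top_percent=0.3))
```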
Interpreting and annotating the slots. A slot in a template is the non-fixed (or blank) part
between two consecutive fixed parts. The complete template is, thus, a mixed sequence of both
frequent words and slot labels. The slots were annotated using two sets of labels: syntactic and
semantic labels. The syntactic labels are the phrasal tags at the first level below the root of the
constituency parsing tree. The slots were labeled with the phrasal tags of the fragments
corresponding to the slots. For example, the slot in the sentence fragment “an airspace of not less
than 1 inch” corresponding to “an airspace” was labeled as NP. The semantic labels are the
semantic roles based on the semantic information elements by Zhang and El-Gohary (2015) (e.g.,
subject, compliance checking attribute, and quantity value) and the proposition units. For
example, a slot was labeled as “subject” if the corresponding fragment represents a thing (e.g.,
building element) that is subject to a requirement; and a slot was labeled as “location” if the slot
is in the “LOC” in a proposition unit, and follows a preposition such as “in”, “above”, or
“below”. Figure 2 shows an example of the template generation results.

Figure 2. An example of the clustering-based template generation.


Step 4: Template Evaluation: Five metrics were used to evaluate the templates: the total
number and the average length of templates, coverage, entropy, and accuracy. The total number
of templates and the average length of templates are two basic indicators of the complexity of a
template system. The larger the total number of templates and the average length of templates,
the more complex the template system is. In addition, coverage and entropy were computed to
evaluate the templates (Proskurnia et al. 2017). The coverage is how much of the training
sentences can be represented using the template system. The entropy indicates how
rigid/inflexible the templates are; the higher the probability of the word occurrence in the fixed
parts, the higher the entropy. One important step of template generation is to find the optimal
frequent-word threshold to maximize the coverage and minimize the entropy simultaneously.
Coverage was computed as per the following equation, where l(t_i) is the number of words in
the fixed parts of the ith template, T is the number of templates, l(s_j) is the total number of
words in the jth sentence, and S is the number of sentences.

coverage = Σ_{i=1}^{T} l(t_i) / Σ_{j=1}^{S} l(s_j)
Entropy was computed as per the following equation, where P(w_i) is the probability of w_i
occurring in the fixed parts of the templates and is computed by dividing the word frequency of
w_i by the total length of all templates, and N is the total number of frequent words.

entropy = − Σ_{i=1}^{N} P(w_i) log2 P(w_i)
Accuracy was computed as the proportion of the testing sentence fragments that are matched to
at least one of the templates out of the total number of testing sentence fragments. To develop the testing
dataset, a total of 160 sentence fragments were randomly selected from the rest of the chapters of
IBC 2009 (i.e., from Chapters 1 to 10 and Chapters 17 to 35).

PRELIMINARY EXPERIMENT RESULTS AND DISCUSSION


Sentence Fragment Clustering: A total of 3,000 sentence fragments were generated from
Chapters 11 to 16 of the IBC 2009 (ICC 2009). The corresponding proposition units were
clustered using the agglomerative hierarchical clustering method. The resulting average
silhouette coefficient is 0.5, which indicates good clustering performance. The clusters with size
lower than 30 were neglected, resulting in a total of eight clusters that were used for template
induction. Table 1 shows the proposition unit structures corresponding to the eight clusters and
the size of each cluster. The proposition unit roles that exist in the majority of data in the cluster
are bolded.

Table 1. Proposition Unit Structures of the Clusters.


Cluster Proposition unit structure Cluster size
1 A1 AM (PNC/MNR/LOC/MOD) V 243
2 A1 A0 V AM (MOD) 220
3 A1 A0 A2 R V 109
4 A1 A2 AM (LOC/MOD) V 389
5 A1 A2 V 305
6 A1 A0 V 414
7 A1 AM (MNR/ LOC) V 985
8 A1 A2 AM (PNC/MNR/LOC/ADV/MOD) V 111
Note: A0=Agent; A1=Patient; A2=General argument with no specific semantic meaning; AM=Adjunct
LOC=Location; MNR=Manner; MOD=Modal verb; PNC=Purpose; R=Reference argument; V=Verb


Template Development. Four example templates and the corresponding example sentences
or sentence fragments that were used to generate the templates are shown in Table 2. The left
side of each template is the proposition unit element and the right side includes the slots, slot
labels, and the fixed parts. The sentence parts corresponding to the template slots are underlined.

Table 2. Example Templates.


Example templates Example sentence or fragment
A1 Slot Subject [NP]
“a drainage system installed in
V Fixed part installed
accordance with Sections 1805.4.2
Fixed part in accordance with
MNR and 1805.4.3”
Slot Reference [NP]
A1 Slot Subject [NP]
R Fixed part that are
Quantitative relation “freestanding press boxes that are
V Slot
[V] elevated above grade 12 feet
Fixed part above grade minimum”
A2 Slot Quantity value [CD]
Fixed part feet minimum
Fixed part in
LOC
Slot Location [NP]
“In multilevel parking structures,
A1 Slot Subject [NP]
van-accessible parking spaces are
V Fixed part permitted
permitted on one level.”
Fixed part on
A2
Slot Location [NP]
A0 Slot Subject [NP]
MOD Fixed part shall
“A building, room or space used
V Fixed part comply
for assembly purposes with fixed
Fixed part with
A1 seating shall comply with Sections
Slot Reference [NP]
1108.2.1 through 1108.2.5.”
Fixed part through
MNR
Slot Reference [CD]
Note: A0=Agent; A1=Patient; A2=General argument with no specific semantic meaning; CD=Cardinal number;
LOC=Location; MNR=Manner; MOD=Modal verb; NP=Noun phrase; R=Reference argument; V=Verb

Figure 3. (a) Plot of coverage; (b) Plot of entropy.

Template Evaluation: A sequence of frequent-word thresholds ranging from 0.01 to 0.2
with a step of 0.01 was tested. Based on the empirical analysis of the coverage and entropy
results, a threshold value of 0.05 was found optimal. The coverage and entropy results are
illustrated in Figure 3(a) and (b), respectively. As shown, the coverage and entropy of the
templates increase as the threshold increases. The coverage increases because more words in the
training sentences are used in the fixed parts of the templates; and the entropy increases because
the rigidity of the templates increases. Similarly, the average length of the templates and the total
number of templates both increase as the threshold increases. Using the testing dataset, the
accuracy of the final templates is 0.76.

CONCLUSIONS AND FUTURE WORK


This paper proposed an unsupervised learning-based template extraction approach for
analyzing building code requirement sentences. The training building code sentences were first
parsed by a constituency parser and labeled by a shallow semantic role labeler. The proposition
units generated in the process of shallow semantic role labeling were clustered using an
agglomerative hierarchical clustering method based on pairwise edit distance. The sentence
fragments corresponding to the resulting clusters were then used to induce the templates: first,
frequent words were found and assembled into the fixed parts of the templates; second, the slots
were interpreted using syntactic and semantic labels. A number of building code requirement
sentence templates were generated and evaluated using five metrics: total number of templates,
average size of templates, coverage, entropy, and accuracy. An accuracy of 0.76 was achieved
for the final set of templates, at a 0.05 frequent-word threshold value.
This paper contributes to the body of knowledge in two primary ways. First, these
preliminary results show that the clusters learned from the syntactic and semantic features are
potentially effective for template generation. Second, the results also indicate that sentence
templates could help improve building code analytics by enabling the analysis of text and the
extraction of information using these templates.
In their future work, the authors plan to refine the semantic annotation of the template slots
using more compliance checking-related semantic roles; integrate the templates with ontologies
to enhance the quality of semantic annotation for improving coverage and decreasing entropy;
and leverage the templates to improve the tagging and extraction of information from building
codes and thus the performance of automated/semi-automated compliance checking.


© ASCE
Computing in Civil Engineering 2019 81



A Data-Driven and Physics-Based Approach to Exploring Interdependency of Interconnected Infrastructure
Shenghua Zhou1; S. Thomas Ng2; Yifan Yang3; Frank Jun Xu4; and Dezhi Li5

1Dept. of Civil Engineering, Univ. of Hong Kong, Pokfulam, Hong Kong. E-mail: [email protected]
2Dept. of Civil Engineering, Univ. of Hong Kong, Pokfulam, Hong Kong (corresponding author). E-mail: [email protected]
3Dept. of Civil Engineering, Univ. of Hong Kong, Pokfulam, Hong Kong. E-mail: [email protected]
4Dept. of Civil Engineering, Univ. of Hong Kong, Pokfulam, Hong Kong. E-mail: [email protected]
5School of Civil Engineering, Southeast Univ., Nanjing 210096, China. E-mail: [email protected]

ABSTRACT
Interdependency of interconnected infrastructure is a critical and daunting issue. Existing research is largely limited to single physical mechanism-based methods or data-driven statistical approaches, yet certain target infrastructure may offer only limited knowledge of physical operation mechanisms or lack sufficient associated data. To fill the gap, this study proposes an integrated data-driven (DD) and physics-based (PB) approach to explore the interdependency of interconnected infrastructure. The approach consists of three components: (i) “motivations”—to understand the targeted infrastructure and the availability of data and physical knowledge related to different infrastructure systems; (ii) “methods”—to select suitable DD or PB methods for individual infrastructure; and (iii) “modes”—to design the connection for bridging the DD and PB methods. A case study is conducted to explore the interdependency of the water supply pipe system and road transport networks of a district in Hong Kong. The preliminary findings reveal that the framework can help identify the hot spots of water pipe bursts and predict their cascading effects on the road transport networks, although certain limitations need to be overcome in future research.

INTRODUCTION
Failure of any interdependent infrastructure systems (e.g. transportation, gas, electricity,
water) or their sub-components could trigger a cascading effect (Buldyrev et al., 2010). To
understand the knock-on effect, a variety of research on the interdependency of interconnected
infrastructures (IoII) has been conducted.
Data-driven (DD) and physics-based (PB) approaches are the two basic research paradigms for studying IoII, apart from conceptual studies. PB methods are generally used to simulate the temporal and spatial motion and behavior of infrastructures (Feynman et al., 2011) with reference to the physical laws (e.g., Newton's law of gravity) that govern the operation of infrastructures. DD methods are usually adopted to extract insightful information or to discover knowledge (e.g., service disruption patterns, trends, and relationships) from the static and operational data of infrastructure systems (Mosley et al., 2010).
When utilizing a single PB or DD method to investigate IoII, a common challenge is that clear physical operation mechanisms or a large volume of consistent data are not available for all targeted infrastructures at the same time (Eusgeld et al., 2011): certain infrastructure systems have only fuzzy physical operation mechanisms or lack sufficient high-quality data. There is thus an urgent need to integrate the DD and PB methods for IoII research and practice.
In this paper, a 3M (Motivation, Method and Mode) framework is proposed to guide the
integration of the DD and PB methods for IoII research. The framework is applied and validated
through a preliminary case study by taking the water supply pipe network and transportation road
system of a district in Hong Kong as an example.

THE PROPOSED FRAMEWORK


The 3M framework is devised based on existing literature reported to have achieved DD and PB integration (An et al., 2015). As illustrated in Figure 1, the framework consists of three components: (i) determine motivations (red); (ii) select methods (green); and (iii) design connection modes (yellow). Notably, the double arrows between selecting methods and designing connection modes indicate that they affect each other in an iterative process.

[Figure 1 diagram: three interlinked components. "Determine Motivations" comprises (1.1) understand infrastructures A and B and (1.2) explore data and physical knowledge; "Select Methods" comprises (2.1) choose DD method and (2.2) choose PB method; "Design Connection Modes" comprises (3.1) identify interdependency types, (3.2) determine corresponding interaction variables, and (3.3) clarify formal characteristics.]

Figure 1. The 3M framework for integrating PB and DD approaches


Determine Motivations: This component comprises two steps as follows:
(i) Understand the targeted infrastructures. The temporal and spatial granularity of the targeted infrastructures' information, such as the abstraction level (component-level or system-level information), timeframe (real-time, daily, or weekly data), and scale (community, city, or regional infrastructure profiles), can significantly influence the applicability of different DD and PB methods.
(ii) Explore the accessibility and availability of infrastructure data or physical knowledge. The extent of available data and knowledge determines the motivation for choosing the DD or PB methods. The motivation can be divided into "improvement motivations" and "necessary motivations". "Improvement motivations" means that the objective of integrating the DD and PB methods is to improve efficiency and performance, saving time and cost while improving accuracy and reliability. "Necessary motivations" means that the two approaches shall be utilized complementarily to describe different infrastructures. For example, when it is difficult to use a PB method because the mechanisms governing the physical operations of an infrastructure system are unclear, the DD method may play a significant role; similarly, when infrastructure data are not available, the PB approach can be leveraged.
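This "necessary motivation" logic can be condensed into a small decision helper. The function below is a hypothetical sketch of the rule just described, not code from the framework itself.

```python
def select_paradigm(data_sufficient, physics_clear):
    # choose the paradigm whose prerequisite the target infrastructure meets:
    # DD needs sufficient data, PB needs clear physical operation mechanisms
    if data_sufficient and physics_clear:
        # both prerequisites met: integration is driven by "improvement
        # motivations" (saving time/cost, improving accuracy/reliability)
        return "DD+PB"
    if data_sufficient:
        return "DD"
    if physics_clear:
        return "PB"
    return "insufficient basis"
```

In the case study below, the water pipe network has historical data but fuzzy deterioration mechanisms, so DD is selected, while traffic flow has mature physical theory, so PB is selected.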
Select Methods. The motivations determined in the above steps could help select the DD or
PB methods. In recent years, various approaches have been applied to investigate the IoII.
Ouyang (2014) proposed a widely accepted way for classifying these approaches, including (i)
conceptual and qualitative studies; (ii) empirical approaches; (iii) agent-based methods; (iv)
economic-theory based approaches; (v) system dynamics; and (vi) network. To facilitate the
integration and selection of the DD or PB methods for IoII problems, these methods could be re-
organized into a decision matrix along the DD, PB and target infrastructure characteristics
dimensions (Table 1), and this decision matrix is extendable to include more indicators (An et
al., 2015).

Table 1. DD and PB methods commonly used for IoII

Methods                         Data-driven   Physics-based   Abstract level   Timeframe   Scale
Empirical approaches            √                             S/C              D           D
Economic-theory based                         √               S                L           Macro
System dynamics                               √               S                L           Macro
Agent-based                     √             √               S/C              D           D
Big data mining                 √                             S/C              D           D
Machine learning                √                             S/C              D           D
Network                                       √               S                S           Macro
Domain-specific PB simulation                 √               C                S           Micro

Note: For abstract level, S = system level and C = component level; for timeframe, L = long-term and S = short-term; D = depends on the scenario.

Emerging big data and machine learning technologies have significantly extended the boundary of the DD approach, as they allow users to collect and make better use of a large volume and variety of infrastructure data for establishing input-output relationships and discovering actionable knowledge. Representative technologies include natural language processing, support vector machines, and artificial neural networks. Additionally, domain-specific PB simulation constitutes another large group of PB methods; examples include transportation flow, computational fluid dynamics (CFD), and finite element simulation. Each of
the methods in Table 1 has its own merits and demerits depending on the specific research scenario.
Besides the decision matrix, Figure 2 describes the general process of the DD and PB methods, irrespective of which specific method is used. The DD process follows classical data-driven processes such as the Cross-Industry Standard Process for Data Mining and Knowledge Discovery in Databases (Azevedo & Santos, 2008). The PB process relies on object-oriented abstraction of the physical infrastructure components (An et al., 2015). Results of the PB (DD) methods can be incorporated into the DD (PB) process as input data.


[Figure 2 diagram: DD process: data collection → data cleansing → establish models → results → evaluation/interpretation. PB process: objects determination → set attributes and operations based on physical laws → realize models → results → evaluation/validation.]

Figure 2. Process of DD or PB methods
Design Connection Modes: To bridge the selected PB and DD methods, connection modes are designed (Winkler et al., 2011). This component contains three sub-steps:
(i) Identify conceptual interdependency types amongst infrastructures. Rinaldi et al. (2001)
classified infrastructure interdependencies into the “physical”, “cyber”, “geographic” and
“logical” categories. This hierarchy, enriched by other authors with the “policy and informational” type and “budgetary, market and economic” principles (Dudenhoeffer et al., 2006; Ouyang, 2014), has been widely accepted.
(ii) Determine the interaction variables corresponding to the infrastructure interdependency
types. Some typical interaction variables are summarized in Table 2 for demonstration.
There are many types of interaction variables, such as continuous and discrete ones; the type of variable can significantly affect the precision of IoII modeling, especially for "functional" interdependency. For example, when network-based approaches are used to study interconnected infrastructures, a binary variable can be selected to describe the service capability (i.e., functioning or failed) of a network node or link, but an ordinal or continuous variable is more suitable for depicting strong and weak infrastructure functionality (Liu et al., 2018).
(iii) Clarify formal characteristics. This step aims to clarify the formal characteristics of the modes, including connection type, direction, and frequency. Similar to circuit connections, the PB and DD methods can be connected in series or in parallel. For a "parallel connection", the results obtained from one type of method are used to verify the results from the other; for a "series connection", the output of one method becomes the input of the other. The frequency characteristic denotes the number of times information is transferred between the PB and DD methods, which could be one-off or frequent. The direction denotes the sequence of connecting the PB and DD methods in a "series connection", from PB to DD or vice versa.
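The series and parallel connection types can be expressed as higher-order functions. In this sketch, `dd_method`, `pb_method`, and `agree` are placeholders for whatever concrete methods and verification check a study selects; it illustrates the data flow only.

```python
def connect_series(dd_method, pb_method):
    # series connection (DD-to-PB direction): the DD output becomes
    # the PB input; reverse the arguments for the PB-to-DD direction
    def pipeline(data):
        return pb_method(dd_method(data))
    return pipeline

def connect_parallel(dd_method, pb_method, agree):
    # parallel connection: both methods run on the same input and the
    # result of one is used to verify the result of the other
    def pipeline(data):
        dd_out, pb_out = dd_method(data), pb_method(data)
        return dd_out, pb_out, agree(dd_out, pb_out)
    return pipeline
```

A one-off series connection calls the pipeline once; a frequent connection would call it repeatedly as new information is exchanged.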

Table 2. "Conceptual interdependency" and "operational interaction variables"

No.   Interdependency types                           Typical interaction variables
1     Geological / co-located / spatial               Failure location, distance, elevation
2     Cyber / information                             Whether information can be transferred
3     Functional / physical / input                   Function availability, service ability
4     Logical / budgetary / market economy / policy   Logical factors (market share, budget, policy)

CASE STUDY
A district of Hong Kong known as Wan Chai was selected for the case study. On the one hand, aging infrastructure is a serious chronic problem in the district, and many water supply pipes are older than 20 years. On the other hand, it is a downtown area with heavy traffic flows. In the district, water pipe bursts often cause serious traffic jams, which calls for an effective solution. The case study aims to investigate the cascading effects of water supply pipe bursts on road transportation performance.
Application of the 3M Framework: Figure 3 shows the detailed steps of applying the 3M
framework to the case study.
(i) Motivation for applying the DD or PB methods to the water supply pipe network: Firstly, the water supply pipe network system was understood from several perspectives (i.e., level, scale, and timeframe). Subsequently, the data and physical knowledge available for identifying pipe-burst hot spots were explored. Using PB methods to model water pipe failures would require drawing the pipe topology network of Wan Chai, setting pipe attributes (e.g., age, size, material, deterioration rate), monitoring water flows (e.g., speed, pressure) and outside pressure (e.g., vehicle loads), and then applying a hydraulic model and a structural mechanics model of the pipes to simulate operations and predict burst spots. There are two challenges with such a PB method: (1) the detailed deterioration mechanism of pipe bursts is not clear enough; and (2) a pipe burst is a sporadic event, and applying a PB method for long-term deterioration simulation is costly. Considering the availability of historical data and the need to drive down the analytical cost, the DD approach is better suited to identifying hot spots of water supply pipe bursts.
(ii) Data-mining method for water pipe bursts: In this case, data-mining techniques are applied to identify water pipe failure locations, with the location information extracted from raw unstructured text data. As illustrated in Figure 3, this comprises four sub-steps: (1) 2,657 news articles reporting water pipe bursts between January 2000 and July 2018 were collected from the largest news repository in Greater China, Wise news; (2) pre-processing and cleansing of the article text using the "uniqueness", "accuracy", and "authority" criteria; (3) mining the cleansed text to extract feature information on pipe-burst locations; and (4) descriptive statistical analysis of the extracted information. In addition, the DD method requires a City Infrastructure Dictionary (CID) and a location dictionary as its foundation.
(iii) Motivation for applying the DD or PB methods to traffic flow: The objective of the steps in the right part of Figure 3 is to identify the traffic flow pattern. By repeating the motivation process, the PB model is considered better suited due to the abundant and mature knowledge of the physical mechanisms of traffic flow.
(iv) PB methods for traffic flow: This phase includes (1) identification of objects (e.g., cars, traffic lights, lanes); (2) setting the attributes of these objects; (3) realizing the model with VISSIM, a mature traffic simulation platform based on traffic flow theory; and (4) acquiring traffic flow results under different situations.
(v) Connection mode between the DD and PB methods: As shown in the yellow part of Figure 3, the connection mode includes: (1) the interdependency between the water supply network and the road transportation system is primarily of the geographic type; (2) the corresponding interaction variable between the DD and PB methods is "location", which is discrete, and the hot spots of pipe bursts identified by the DD method are used as an input to the PB model for traffic simulation; and (3) the mode's formal characteristics are series connection, one-off frequency, and DD-to-PB direction.
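The location-extraction sub-step of the DD method (step ii) can be sketched as dictionary matching over news text followed by frequency counting. The location list and articles below are hypothetical stand-ins; the actual pipeline relies on the full City Infrastructure Dictionary and location dictionary.

```python
from collections import Counter

# hypothetical subset of a location dictionary, for illustration only
LOCATIONS = ["Jaffe Road", "Hennessy Road", "Queen's Road East"]

def extract_hotspots(articles, top_n=3):
    # count location-dictionary hits across burst-report articles and
    # rank the most frequently mentioned locations as hot-spot candidates
    hits = Counter()
    for text in articles:
        lowered = text.lower()
        for loc in LOCATIONS:
            if loc.lower() in lowered:
                hits[loc] += 1
    return hits.most_common(top_n)
```

Ranking mention counts this way is only a proxy for burst frequency; the paper additionally cleanses the articles with the "uniqueness", "accuracy", and "authority" criteria before counting.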
Preliminary Results: Figure 4a shows the hot spots of water pipe bursts in Hong Kong identified with the DD method. In the Wan Chai district (see Figure 4b), Jaffe Road is one of the most serious pipe-burst hot spots. Taking Jaffe Road as a case, the traffic flow conditions at peak hour (e.g., speed, density, volume, and time lost) in each road segment of Wan Chai are predicted, and the roads in the blue circle become visibly congested (see Figures 4c and 4d). Based on these results, an emergency plan can be designed to respond to a water pipe burst in that area.

Figure 3. The 3M framework implementation process of the case

Figure 4. Hotspots of water pipe bursts and traffic jam caused


CONCLUSION
A 3M framework is put forward to facilitate the integration of DD and PB methods for IoII
analysis. The analysis includes understanding the infrastructures to determine the improvement
or necessary motivations, choosing the methods for individual infrastructure, and designing the
connection modes to transfer conceptual interdependency to interaction variables in order to
combine these methods. Taking the water supply pipes and road transportation as an example, the 3M framework is implemented through the process of Motivations, Methods, and Modes. Finally, hot spots of pipe bursts are identified, and their cascading effects on transportation are investigated.
This study admittedly has limitations; for example, it remains a high-level framework for conducting IoII research. Future research will focus on enriching the content of the framework with more detailed domain knowledge, as well as on refining and validating the framework through more case studies.
ACKNOWLEDGMENTS
The authors thank the RGC of the HKSAR Government for funding this research under the
General Research Fund (No.: 17202215, 17248216 and 17204017).

REFERENCES
An, D., Kim, N. H., & Choi, J. H. (2015). Practical options for selecting data-driven or physics-
based prognostics algorithms with reviews. Reliability Engineering & System Safety, 133,
223-236.
Azevedo, A. I. R. L., & Santos, M. F. (2008). KDD, SEMMA and CRISP-DM: a parallel
overview. IADS-DM.
Buldyrev, S. V., Parshani, R., Paul, G., Stanley, H. E., & Havlin, S. (2010). Catastrophic cascade
of failures in interdependent networks. Nature, 464(7291), 1025.
Dudenhoeffer, D. D., Permann, M. R., & Manic, M. (2006). CIMS: A framework for
infrastructure interdependency modeling and analysis. Paper presented at the Proceedings of
the 38th conference on Winter simulation.
Eusgeld, I., Nan, C., & Dietz, S. (2011). “System-of-systems” approach for interdependent
critical infrastructures. Reliability Engineering & System Safety, 96(6), 679-686.
Feynman, R. P., Leighton, R. B., & Sands, M. (2011). The Feynman lectures on physics, Vol. I:
The new millennium edition: mainly mechanics, radiation, and heat (Vol. 1): Basic books.
Liu, R.-R., Eisenberg, D. A., Seager, T. P., & Lai, Y.-C. (2018). The “weak” interdependence of
infrastructure systems produces mixed percolation transitions in multilayer networks.
Scientific reports, 8(1), 2111.
Mosley, M., Brackett, M. H., Earley, S., & Henderson, D. (2010). DAMA guide to the data
management body of knowledge: Technics Publications.
Ouyang, M. (2014). Review on modeling and simulation of interdependent critical infrastructure
systems. Reliability Engineering & System Safety, 121, 43-60.
Rinaldi, S. M., Peerenboom, J. P., & Kelly, T. K. (2001). Identifying, understanding, and
analyzing critical infrastructure interdependencies. IEEE Control Systems, 21(6), 11-25.
Winkler, J., Dueñas-Osorio, L., Stein, R., & Subramanian, D. (2011). Interface network models
for complex urban infrastructure systems. Journal of Infrastructure Systems, 17(4), 138-150.


Machine Learning Based Automatic Concrete Microstructure Analysis: A Study on Effect of Image Magnification

Srikanth Sagar Bangaru1 and Chao Wang, Ph.D., A.M.ASCE2

1Ph.D. Student, Bert S. Turner Dept. of Construction Management, Louisiana State Univ., 237 Electrical Engineering Building, Baton Rouge, LA 70803. E-mail: [email protected]
2Assistant Professor, Bert S. Turner Dept. of Construction Management, Louisiana State Univ., 3315D Patrick F. Taylor Hall, Baton Rouge, LA 70803. E-mail: [email protected]

ABSTRACT
Scanning electron microscopy (SEM) images are commonly used to understand the microstructure of concrete. Many researchers have adopted image processing techniques for microstructure analysis, but little has been studied on how the magnification of SEM images influences the accuracy of the analysis. Therefore, this paper presents a machine learning (ML) based framework to study the effect of SEM image magnification on degree-of-hydration measurement. The authors examined the impact of SEM image magnification on model training, accuracy, and degree-of-hydration measurement using two scenarios: first, image segmentation was performed using a classifier trained on images of a specific magnification; second, a common classifier was trained using images of different magnifications. The preliminary results show no significant effect of magnification on model training and accuracy; however, magnification has a significant impact on the degree-of-hydration measurement.

INTRODUCTION
The microstructure of concrete is a complex system with a high degree of heterogeneity,
composed of solid phases, pores, and water. At the simplest heterogeneity level, concrete
consists of aggregates, hydrated cement paste, anhydrous cement and pores (Scrivener and
Composites 2004). The quantitative microstructure characterization of concrete is useful to
determine the relationship to macrostructure properties (Scrivener et al. 2016). Scanning
Electron Microscopy (SEM) image analysis and X-ray Diffraction are the most commonly used
techniques to quantify microstructure phases of the concrete. Due to the presence of amorphous
phases in the concrete, the SEM image analysis technique is easier to apply than
the X-ray diffraction method (Gaël et al. 2016). SEM images of concrete are segmented into
various components such as aggregates, hydrated cement, anhydrous cement, and pores using
image-processing techniques. The segmented images allow measurement of the area and volume fractions of the components. Since all measurements are based on segmented images, image segmentation is the crucial step in quantitative image analysis. Various image-processing techniques have been proposed to segment SEM concrete images for microstructure analysis. These techniques differ in the segmentation algorithm and the associated filters used to process the images. (Wong et al. 2009) proposed a thresholding-based image segmentation
method for estimating the water-cement ratio, cement content, water content and degree of
hydration of Portland cement-based materials which yielded results with a minimal error.
However, the proposed method is not able to characterize the entire range of capillary pores due
to insufficient image resolution. The results of thresholding-based image segmentation are
affected by image magnification and resolution. Another challenge of SEM image segmentation
is edge detection due to noise in the images. (Feng et al. 2006) proposed a segmentation technique using a hybrid ridge signal detector, which shows robustness to noise and good performance but has a computational issue in defining the coefficients of the high-order polynomial used for regression. To overcome the issues of noise and edge detection, (Lee and Yoo 2008) developed a segmentation method using the watershed segmentation algorithm, a global-local threshold method, and a Laplacian of Gaussian filter, and achieved an accuracy of 95%. (Yang et al. 2001) proposed a method combining grey-level thresholding, filtering, and binary operations to segment aggregates from the concrete image. Although the method segmented aggregates with high accuracy, the model was implemented only on images of one specific magnification. A method developed by (Gaël et al. 2016) used the Particle
Swarm Optimization (PSO) technique to segment and determine the proportion of anhydrous
cement in a concrete sample. The PSO algorithm segments the SEM images into four gray scale
levels. In this method, a large number of filters, such as hole filling, morphological operations, and a minimal particle size, are used, which increases the computational complexity. The results of PSO-segmented images have a lower standard deviation of error compared to manually segmented images. However, the PSO algorithm is based on the image histogram and is highly influenced by image quality factors such as brightness and contrast. Moreover, a small sample size (i.e., only twelve images were analyzed) and images of a specific resolution (640 x 480 pixels) and magnification (100x) were used for the analysis. Although it was stated that the number of images and the magnification used for SEM image analysis were chosen relative to the largest granulates, no evidence or explanation was provided. The pore segmentation method proposed by (Feng et al. 2013), which uses the brightness histograms of grey-level images, is affected by magnification and resolution in its porosity results. (Gaël et al. 2016) investigated the impact of
magnification and accelerating voltage of SEM images of cement and slag blended cement paste
on the measurement of anhydrous cement. The results showed that the contrast between anhydrous cement and the other phases increased with accelerating voltage; accelerating voltages of 15 and 20 kV were appropriate for neat cement and slag-blended Portland cement, respectively, and a magnification of 250x was sufficient to evaluate the anhydrous volume fraction.
A key limitation of the existing literature on concrete microstructure analysis is that the accuracy of the methods is highly influenced by the magnification, resolution, and accelerating voltage of the SEM images. To achieve high accuracy, previous studies have proposed the use of higher-resolution and higher-magnification images, which is costly and labor-intensive. Moreover, the current methods are not fully automated, are non-adaptable, and are time-consuming. There is a need for an image processing model that can process SEM images of various magnifications and resolutions. To overcome these challenges, this paper proposes a methodology for microstructure analysis of concrete using a machine learning based image processing technique, and evaluates the influence of image magnification on classifier training and microstructure property measurement.

AUTOMATIC CONCRETE MICROSTRUCTURE ANALYSIS


Specimen Preparation
SEM imaging requires specimen preparation using the standard procedure explained in (Winter and Winter 2012). The procedure involves four major tasks: core sampling, freezing and drying, epoxy impregnation or conductive coating, and grinding and polishing. Due to sample size limitations in the electron microscope chamber, the concrete blocks are cored to a rectangular sample of size 40 x 20 x 8 mm. A freeze-drying technique is applied to remove the free water present in the samples. Cementitious materials are non-conductive, so it is necessary to apply a conductive coating unless an Environmental Scanning Electron Microscope (ESEM) is used for SEM imaging. Epoxy impregnation fills the pores and voids, which stabilizes the microstructure, prevents damage during grinding and polishing, and may improve the identification of pores and voids. Finally, the flat surface required for effective quantitative microstructure analysis of concrete is obtained by grinding and polishing.

SEM Image Collection


An SEM system is capable of producing high-quality images of electronically conductive specimens using the electron microscopy method (Scrivener et al. 2016). SEM images can be obtained using the secondary electron (SE) or backscattered electron (BSE) imaging mode; the probe current, beam current, and spot size influence the choice between the SE and BSE modes (Winter and Winter 2012). Other factors that influence image quality are the working distance (WD), accelerating voltage, and magnification. A short WD yields better images because, as WD increases, the electron beam travels farther and tends to become more diffuse, with the electrons repelling each other. Similarly, a lower accelerating voltage and low-vacuum conditions improve spatial resolution and prevent the charging effect even in the absence of a conductive coating. Before imaging, the SEM system is calibrated for brightness and contrast so that each recorded image is centered on and stretched across the grayscale dynamic range (0 to 255) (Wong et al. 2009).
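The calibration step above centers each recorded image and stretches it across the 0 to 255 grayscale range. A simple digital analogue is a linear contrast stretch; this numpy sketch illustrates that operation and is not the SEM system's actual calibration routine.

```python
import numpy as np

def stretch_to_grayscale(img):
    # linearly map the image's intensity range onto the full 0-255
    # grayscale dynamic range (a digital analogue of the calibration)
    img = np.asarray(img, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # constant image: no contrast to stretch
        return np.zeros_like(img, dtype=np.uint8)
    return np.round((img - lo) / (hi - lo) * 255).astype(np.uint8)
```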

Figure 1. Original SEM image (left) and segmented SEM image (right) with M120 classifier

Machine Learning based SEM Image Segmentation


The proposed methodology uses the pixel-based image segmentation algorithm developed by
Arganda-Carreras et al. (2017). The algorithm uses pixel labels for image classification. The
input SEM images are pixel-labeled into four classes (i.e., Aggregates, Hydrated Cement,
Anhydrous Cement, and Pores). The labeled pixels are represented in a feature space that
includes Gaussian Blur, Sobel Filter, Hessian, Difference of Gaussians, and membrane
projections. The feature data are used as training data for a classifier (such as Random Forest,
Naïve Bayes, or J48). The performance of the classifier is assessed by analyzing the overall
accuracy, the accuracy by class, and the confusion matrix. Once the desired accuracy is
obtained, the trained classifier is applied to the rest of the input images or to new image data.
The segmented image

© ASCE
Computing in Civil Engineering 2019 92

(Figure 1) obtained after applying the classifier is used to measure the degree of hydration and
to characterize the microstructure using image analysis techniques.
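The pixel-classification workflow described above can be sketched as follows. This is an illustrative approximation only: the paper uses the Trainable Weka Segmentation plugin (Arganda-Carreras et al. 2017), whereas this sketch substitutes scikit-learn's Random Forest, a synthetic two-class "image," and a reduced feature set.

```python
# Sketch of pixel-based segmentation with a Random Forest classifier.
# Illustrative only: the paper's workflow uses Trainable Weka Segmentation;
# here scikit-learn and SciPy stand in, on a synthetic two-region image.
import numpy as np
from scipy.ndimage import gaussian_filter, sobel
from sklearn.ensemble import RandomForestClassifier

def pixel_features(img):
    """Stack per-pixel features: raw intensity, Gaussian blurs, Sobel gradients."""
    feats = [img,
             gaussian_filter(img, 1.0),
             gaussian_filter(img, 2.0),
             sobel(img, axis=0),
             sobel(img, axis=1)]
    return np.stack([f.ravel() for f in feats], axis=1)

# Synthetic grayscale "SEM image": two regions with distinct mean intensities.
rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[:, 32:] = 200.0
img += rng.normal(0, 5, img.shape)
labels = np.zeros((64, 64), dtype=int)   # toy classes: 0 = pores, 1 = aggregate
labels[:, 32:] = 1

X, y = pixel_features(img), labels.ravel()
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
segmented = clf.predict(X).reshape(img.shape)
print("training accuracy:", (segmented == labels).mean())
```

Once a classifier of this form reaches the desired accuracy, `predict` can be applied to new images in the same way.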

SEM Image Analysis


Microstructure Characterization
At the simplest level of heterogeneity, the composition of anhydrous cement, hydrated cement,
pores, and aggregates characterizes the microstructure of the concrete. The image analysis
algorithm uses the segmented image obtained from image processing to quantify these components
automatically. The segmented images have four classes, each with a unique pixel value. The
algorithm determines the proportion of each component by measuring the total pixel area of
each class. Microstructure properties of the concrete, such as the degree of hydration, are
determined using the class quantities obtained from the image analysis.
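The per-class area measurement can be sketched in a few lines of NumPy; the integer labels and class names below are illustrative, not the actual pixel values assigned by the segmentation tool.

```python
# Measure the area proportion of each microstructure class from a segmented
# image, assuming each class has a unique integer pixel label (labels and
# class names here are illustrative).
import numpy as np

CLASSES = {0: "Aggregates", 1: "Hydrated Cement", 2: "Anhydrous Cement", 3: "Pores"}

def class_proportions(segmented):
    """Return the fraction of total image area occupied by each class."""
    total = segmented.size
    return {name: np.count_nonzero(segmented == k) / total
            for k, name in CLASSES.items()}

# Toy 3x4 segmented image.
seg = np.array([[0, 0, 1, 1],
                [2, 1, 1, 3],
                [0, 1, 3, 3]])
props = class_proportions(seg)
print(props)
```

The proportions sum to one when every pixel belongs to exactly one class, which is the case for the four-class segmentation described above.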

Table 1. Overall classification accuracy of both Scenarios


Classifier Random Forest J48 Naive Bayes
Scenario I
M120 100 99.99 98.45
M150 100 99.99 99.04
M500 99.96 99.92 99.16
M650 99.93 99.9 92.64
M800 99.99 99.97 99.31
M1000 100 99.91 99.62
M1200 99.89 99.88 99.37
M1500 99.93 99.86 84.38
Scenario II
Common 99.95 99.95 97.56

Degree of Hydration
According to the method proposed by Feng et al. (2006), the degree of hydration (M) is estimated
from the ratio of the volume of anhydrous cement (VAH) to the volume of the total cement
paste (VC), where VC is the sum of the volume of hydrated cement (VH) and VAH, both
determined through image analysis of the segmented image.

M = 1 - VAH / VC (Equation 1)
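Taking the class volumes as pixel counts from the segmented image, Equation 1 reduces to a one-line computation (the pixel counts below are made-up numbers for illustration):

```python
# Degree of hydration from Equation 1: M = 1 - V_AH / V_C, with V_C = V_H + V_AH.
# Volumes are taken as pixel counts of the segmented classes.
def degree_of_hydration(v_anhydrous, v_hydrated):
    v_cement = v_anhydrous + v_hydrated
    return 1.0 - v_anhydrous / v_cement

# Example: 1,500 anhydrous pixels and 8,500 hydrated pixels -> M = 0.85.
print(degree_of_hydration(1500, 8500))
```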

CASE STUDY
Experiment Setup
The samples were analyzed in an environmental scanning electron microscope (ESEM), an FEI
Quanta 3D FIB, at an accelerating voltage of 20 kV under the backscattered electron (BSE)
imaging mode at low vacuum. The SEM images obtained are 1024 x 884 pixels and are converted
into 8-bit grayscale images. The images are captured at various magnifications: 120x, 150x,
500x, 650x, 800x, 1000x, 1200x, and 1500x, which are the most commonly used magnifications
for SEM image analysis.


To study the effect of image magnification on classifier training and on the measurement of
microstructure properties, two scenarios are considered. In Scenario I, a classifier is trained
using images of a specific magnification, so each magnification has its own classifier. In
Scenario II, a single classifier is trained using images of different magnifications. For
each scenario, three machine learning classification algorithms (i.e., Random Forest, J48, and
Naïve Bayes) are tested. To understand the effect of magnification, three different analyses are
performed. First, the models are evaluated for accuracy in both scenarios. Second, the
classifier of Scenario II is compared with those of Scenario I. Finally, the degree of hydration
results obtained in Scenarios I and II are compared. A statistical analysis of the average degree
of hydration measurements is performed to determine whether magnification has a significant
effect on model accuracy.

Results and Discussions

Classifier Performance
A k-fold cross-validation technique is used to evaluate the performance of the classifiers in
both scenarios. The overall classification accuracy, the accuracy by class, and the confusion
matrix are analyzed to evaluate the performance of the classifiers. For brevity, the overall
classification accuracies of all the classifiers and the confusion matrix for Scenario II are
shown in Table 1 and Table 2, respectively. Figure 1 shows a segmented image produced by the
M120 classifier.
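A minimal k-fold cross-validation loop of the kind used here can be sketched as follows. The classifier is a toy nearest-centroid model on synthetic data, standing in for the Random Forest, J48, and Naïve Bayes classifiers the paper actually evaluates.

```python
# Minimal k-fold cross-validation sketch. The nearest-centroid "classifier"
# and the synthetic data are illustrative stand-ins for the paper's models.
import numpy as np

def kfold_accuracy(X, y, k, fit, predict):
    """Mean held-out accuracy over k folds."""
    idx = np.random.default_rng(0).permutation(len(X))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        accs.append((predict(model, X[test]) == y[test]).mean())
    return float(np.mean(accs))

# Toy nearest-centroid classifier: one centroid per class, label by distance.
fit = lambda X, y: {c: X[y == c].mean(axis=0) for c in np.unique(y)}
predict = lambda m, X: np.array(
    [min(m, key=lambda c: np.linalg.norm(x - m[c])) for x in X])

# Two well-separated synthetic classes.
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (50, 2)),
               np.random.default_rng(2).normal(3, 0.1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
cv_acc = kfold_accuracy(X, y, 10, fit, predict)
print("mean CV accuracy:", cv_acc)
```

Per-class accuracies and a confusion matrix can be accumulated inside the same loop by comparing predicted and true labels fold by fold.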

Table 2. Confusion matrix of Scenario-II


                                    Predicted Class
Actual Class        Aggregate   Anhydrous Cement   Hydrated Cement   Pores
Aggregate             99.93%         0.00%              0.05%        0.00%
Anhydrous Cement       0.00%       100.00%              0.00%        0.00%
Hydrated Cement        0.00%         0.00%            100.00%        0.00%
Pores                  0.00%         0.00%              0.07%       99.28%

The accuracy results in Table 1 show that the Random Forest algorithm performed
statistically better than J48 and Naïve Bayes in Scenario I for all magnifications at a
significance level of 0.05. Moreover, the accuracies of the classifiers for the three
classification algorithms are statistically almost the same, with only slight differences. In
Scenario II, both Random Forest and J48 performed statistically better than Naïve Bayes at a
significance level of 0.05. The confusion matrix of the Scenario II classifier shows that only a
low percentage of instances are mispredicted as hydrated cement instead of aggregates (0.05%)
or pores (0.07%).

The Degree of Hydration Measurement


Since the performance of the classifiers on the data is acceptable, the classifiers are applied
to the entire dataset for segmentation. The segmented images are used for image analysis, where
the degree of hydration is calculated using Equation 1 for each image. An average value is
calculated for each magnification group. The results obtained from both scenarios are shown
in Figure 2.


Effect of Magnification on Classifier Training and Model Accuracy


An F-test and a t-test were performed on the degree of hydration measurements from the two
scenarios to test whether the variances and means are statistically equal. The null hypothesis
for the F-test is that the variances of the measurements in Scenarios I and II are equal,
whereas the null hypothesis for the t-test is that the means of the measurements in both
scenarios are equal. For the F-test, the null hypothesis is rejected if the F-value is greater
than F Critical one-tail. The F-test results in Table 4 show that the variances of the
measurements in the two scenarios are unequal. For the t-test, the null hypothesis is rejected
if either of the two inequalities (t Stat < -t Critical two-tail or t Stat > t Critical
two-tail) is satisfied. The t-test results in Table 4 show that the null hypothesis is accepted,
which means that the mean degree of hydration measurements from the two scenarios are
statistically equal. In addition, the accuracy of the Scenario II classifier is compared with
the classifiers of Scenario I; for this comparison, the data of each Scenario I magnification
group is supplied as test data.
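The F-test and t-test decisions above follow a standard pattern, sketched here with SciPy on hypothetical measurements (the sample sizes, means, and standard deviations are invented, not the paper's data):

```python
# Two-sample F-test for equal variances and a t-test for equal means, as
# applied to degree-of-hydration measurements. Data are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
scenario1 = rng.normal(0.80, 0.04, 40)   # hypothetical Scenario I measurements
scenario2 = rng.normal(0.80, 0.02, 40)   # hypothetical Scenario II measurements

# F-test: reject equal variances if F exceeds the one-tail critical value.
f_stat = np.var(scenario1, ddof=1) / np.var(scenario2, ddof=1)
f_crit = stats.f.ppf(0.95, len(scenario1) - 1, len(scenario2) - 1)
print("F =", f_stat, "reject equal variances:", f_stat > f_crit)

# t-test assuming unequal variances: reject equal means if p < 0.05.
t_stat, p_two_tail = stats.ttest_ind(scenario1, scenario2, equal_var=False)
print("t =", t_stat, "reject equal means:", p_two_tail < 0.05)
```

Because the F-test here finds unequal variances, the unequal-variance (Welch) form of the t-test is the appropriate follow-up.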

Figure 2. Average degree of hydration in Scenarios I and II for different magnifications

Table 3. Performance of the Scenario II classifier on Scenario I data


Accuracy Wt. Avg Precision Wt. Avg Recall
M120 92.39% 0.918 0.924
M150 99.77% 0.998 0.998
M500 99.16% 0.992 0.992
M650 99.38% 0.994 0.994
M800 99.71% 0.997 0.997
M1000 98.83% 0.989 0.988
M1200 97.74% 0.979 0.977
M1500 99.29% 0.933 0.933

The accuracy results in Table 3 show that the Scenario II classifier performed well
except in the case of M120. The lower accuracy for M120 is due to misclassification of the
aggregates class, which occupies a larger proportion of M120 images than of images at other
magnifications. Due to the large size of the aggregates, their full extent is not captured at
higher magnifications. Overall, from the t-test and classifier performance results it can be
concluded that


the image magnification does not affect the classifier training procedure or the model accuracy.
In other words, the results for the degree of hydration and microstructure proportions are
statistically the same whether the classifier is trained using a specific magnification or
using images of different magnifications. Therefore, the use of a single classifier reduces the
training effort.

Table 4. Summary of F-test and T-test Analysis


Output                          Null Hypothesis   Conclusion
F-Value                3.29
F Critical One-Tail    1.32     Reject            Variances are unequal
P-Value                4.41E-12
t Stat                -1.86
P (two-tail)           0.064    Accept            Means are equal
t Critical two-tail    1.97

Effect of Magnification on Degree of Hydration Measurement


An ANOVA analysis is performed on the degree of hydration measurements across the
magnification groups for each scenario to understand the effect of magnification on the
measurements. The null hypothesis of the ANOVA test is that the means of the measurements of
the different magnification groups are equal at a significance level of 0.05. The ANOVA test
results show that the p-value in both scenarios [Scenario I (2.12E-34), Scenario II (1.83E-13)]
is much lower than 0.05, so the null hypothesis is rejected (i.e., the means of the
measurements across magnification groups are significantly unequal). This result is consistent
with the fact that increasing the magnification reduces the field of view, so the entire extent
of the different classes is not captured (Wong and Buenfeld 2009). Further, a post-hoc test is
performed using the Bonferroni correction (α = 0.00625) to compare the significance between
pairs of groups. For both scenarios, the post-hoc test results show that the means of the
measurements of the magnification groups M800 & M1000 [p-value = 0.124] and M800 & M1500
[p-value = 0.654] are statistically equal, since these p-values are greater than the Bonferroni
adjustment. This methodology can be applied to identify the optimum magnification
required for microstructure analysis of a particular sample.
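The ANOVA-plus-Bonferroni procedure can be sketched with SciPy as follows; the group means, spreads, and sample sizes are invented for illustration and the comparison count of 8 mirrors the eight magnification groups:

```python
# One-way ANOVA across magnification groups, then pairwise t-tests judged
# against a Bonferroni-adjusted threshold (0.05 / 8 = 0.00625).
# Data are illustrative, not the paper's measurements.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
groups = {                          # hypothetical degree-of-hydration samples
    "M120":  rng.normal(0.70, 0.02, 20),
    "M800":  rng.normal(0.80, 0.02, 20),
    "M1000": rng.normal(0.80, 0.02, 20),
}

f_stat, p_anova = stats.f_oneway(*groups.values())
print("ANOVA p =", p_anova, "-> group means differ:", p_anova < 0.05)

alpha_bonf = 0.05 / 8               # Bonferroni correction for 8 comparisons
for a, b in [("M120", "M800"), ("M800", "M1000")]:
    _, p = stats.ttest_ind(groups[a], groups[b])
    print(a, "vs", b, "p =", round(p, 4), "means equal:", p > alpha_bonf)
```

A pair is declared statistically equal only when its pairwise p-value exceeds the adjusted threshold, matching the M800 & M1000 and M800 & M1500 findings above.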

CONCLUSION
The proposed framework is capable of producing consistent and accurate results for
microstructure proportions and the degree of hydration. In the proposed method, image
magnification has no significant effect on classifier training or model accuracy, whereas
certain image magnification groups have a significant effect on the degree of hydration
measurement, which is consistent with the fact that images at different magnifications capture
different proportions of the classes. The proposed method can, however, be used to determine the
optimum magnification required for microstructure analysis of a particular sample. One
limitation of this study is that SEM images of a single concrete sample were used for training
and testing the model; future studies will validate the proposed methodology using different
concrete samples. The proposed microstructure analysis of concrete using machine learning based image


processing is accurate and adaptable, requires less effort, and shows no effect of magnification
on model training and accuracy.

REFERENCES
Arganda-Carreras, I., Kaynig, V., Rueden, C., Eliceiri, K. W., Schindelin, J., Cardona, A., and
Sebastian Seung, H. (2017). "Trainable Weka Segmentation: a machine learning tool for
microscopy pixel classification." Bioinformatics, 33(15), 2424-2426.
Feng, H., Ye, J., and Pease, R. F. (2006). "Segmentation-assisted edge extraction algorithms for
SEM images." Proc., Photomask Technology 2006, International Society for Optics and Photonics,
63491L.
Feng, S., Wang, P., and Liu, X. (2013). "SEM-backscattered electron imaging and image
processing for evaluation of unhydrated cement volume fraction in slag blended Portland
cement pastes." Journal of Wuhan University of Technology-Materials Science Edition, 28(5),
968-972.
Gaël, B., Christelle, T., Gilles, E., Sandrine, G., and Tristan, S.-F. (2016). "Determination of
the proportion of anhydrous cement using SEM image analysis." Construction and Building
Materials, 126, 157-164.
Lee, J. H., and Yoo, S. I. (2008). "An effective image segmentation technique for the SEM
image." Proc., 2008 IEEE International Conference on Industrial Technology (ICIT), IEEE, 1-5.
Scrivener, K., Snellings, R., and Lothenbach, B. (2016). A practical guide to microstructural
analysis of cementitious materials, CRC Press.
Scrivener, K. L. (2004). "Backscattered electron imaging of cementitious microstructures:
understanding and quantification." Cement and Concrete Composites, 26(8), 935-945.
Winter, N. B. (2012). Scanning electron microscopy of cement and concrete, WHD Microanalysis.
Wong, H. S., and Buenfeld, N. R. (2009). "Determining the water-cement ratio, cement content,
water content and degree of hydration of hardened cement paste: Method development and
validation on paste samples." Cement and Concrete Research, 39(10), 957-965.
Yang, R., and Buenfeld, N. R. (2001). "Binary segmentation of aggregate in SEM image analysis
of concrete." Cement and Concrete Research, 31(3), 437-441.


Artificial Neural Network for Semantic Segmentation of Built Environments for


Automated Scan2BIM
Yeritza Perez-Perez1; Mani Golparvar-Fard, Ph.D.2; and Khaled El-Rayes, Ph.D.3

1Graduate Student, Dept. of Civil and Environmental Engineering, Univ. of Illinois at
Urbana-Champaign, 205 N. Mathews Ave., Urbana, IL 61801. E-mail: [email protected]
2Associate Professor and Faculty Entrepreneurial Fellow, Dept. of Civil and Environmental
Engineering, Univ. of Illinois at Urbana-Champaign, 205 N. Mathews Ave., Urbana, IL 61801.
E-mail: [email protected]
3Professor, Dept. of Civil and Environmental Engineering, Univ. of Illinois at
Urbana-Champaign, 205 N. Mathews Ave., Urbana, IL 61801. E-mail: [email protected]

ABSTRACT
3D modeling of the built environment has become common practice in the AEC/FM
industry. Practitioners take advantage of the geometric and semantic information embedded in
3D models to perform engineering analysis. Despite the benefits provided by the 3D model, the
modeling process is time-consuming, labor-intensive, and error-prone. In this paper, we propose
a new neural network based method for 3D point cloud semantic segmentation of building scenes
using a hierarchical approach: first, we reason about the local and global content of raw point
cloud data to extract geometric features; second, the features are used as input to an
artificial neural network that performs semantic segmentation on the points. The points are
classified into: beam, ceiling, clutter, column, door, floor, pipe, wall, and window. We
evaluated our approach on a dataset of several buildings and obtained an accuracy of 73%. Our
experiments produce robust results readily useful for practical Scan2BIM applications.

INTRODUCTION
The automatic identification of structural, architectural, and mechanical components present
in indoor point cloud scenes is a heavily studied area in the Architecture, Engineering, and
Construction (AEC) industry. The identification of these components has many applications,
such as scene visualization, 3D solid model generation, on-site communication, conflict
resolution, clash detection, and task coordination during the entire life cycle of the
infrastructure (Volk et al. 2014). The Scan2BIM method allows the geometric information
embedded in point cloud data to be used to identify the semantic components (e.g., beam,
ceiling, column, door, floor, pipe, wall, and window) present in the scene. During the past few
years, much research has been conducted with the objective of generating new methods that
reduce the complexity and processing time of the Scan2BIM method. Reviews of these methods can
be found in (Borrmann et al. 2011; Dimitrov et al. 2016; Grilli et al. 2017; Huber et al. 2010;
Nahangi et al. 2015; Tang et al. 2010).
Typically, point clouds are composed of millions of points, which makes generating 3D solid
models from point cloud data a challenge. Researchers divide the Scan2BIM process into
three steps: data processing, data modeling, and data analysis. In general, the Scan2BIM process
starts by dividing the point cloud into meaningful parts, such as beam, ceiling, column, floor,
pipe, and wall. Then, a 3D solid surface is fit over each point cloud component to generate a 3D
solid representation of the scene. Despite the available applications and the research conducted
with the objectives of simplifying the process and reducing the processing time, the process


remains time-consuming, labor-intensive, and error-prone, and requires a person with experience
in 3D modeling. In addition, the quality of the solid model depends on the point cloud density
distribution, the variation of surface roughness, the type of curvatures, the amount of clutter
and missing data, and the level of abstraction selected by the designer (Dimitrov et al. 2015).
For these reasons, this paper presents a new method for semantic labeling of mechanical and
structural components, such as beams, ceilings, columns, floors, pipes, and walls. In addition,
the method identifies clutter and openings (doors and windows). The input of the method is
the point cloud's points, each described by a 1x15 vector that contains the coordinates, the
color channels, the normals, the maximum and minimum curvatures, the surface roughness, and the
distance from the point to the point cloud centroid. The method uses an Artificial Neural
Network (ANN) to identify to which semantic category (beam, ceiling, clutter, column, door,
floor, pipe, wall, and window) each point belongs. The ANN was trained and tested using 244 and
35 real-world point clouds, respectively. The proposed method achieved a point accuracy of
73.80% and an average accuracy of 56.21% across the nine (9) categories. The experiments'
encouraging results suggest their practicality for Scan2BIM applications.
The key contribution of this paper is a new neural network architecture that takes as input an
unorganized 3D point cloud whose points are described by a 1x15 vector. This architecture
classifies the points into nine (9) semantic categories without the need to divide the point
cloud into meaningful segments before labeling them. The architecture describes each point as a
1x15 vector (x, y, z coordinates; RGB color channels; Nx, Ny, Nz normals; Cmax, Cmin
curvatures; Ra surface roughness; and the x, y, z distances from the point to the point cloud
centroid), which allows the method to use the geometric information of the point's
neighborhood. The next section presents work related to semantic segmentation of point cloud
indoor scenes.

RELATED WORK
During the past few years, the automatic identification of semantic components in point
cloud scenes has been a heavily studied area. Research is divided into two categories: (1)
identification of planar components such as ceilings, floors, and walls; and (2) identification
of cylindrical components such as pipes. In this paper, we focus only on recent methods that
assign semantic labels to structural and mechanical components in point cloud scenes.
One area of research is the semantic identification of planar components, such as ceilings,
floors, and walls. For example, Huber et al. (2011) and Xiong et al. (2013) generate 3D models
from laser scanner data. Their methods extract planar patches using the contextual relations
between them and classify the patches as floors, ceilings, walls, or clutter. In addition, their
methods analyze the planar patches to identify the openings present, such as doors and windows.
Wang et al. (2015) identify planar components and their openings by segmenting the point cloud
and using the plane boundaries to classify the segments into exterior walls, roofs, slabs,
windows, and doors. Similarly, Pu and Vosselman (2006) segment outdoor point cloud scenes and
classify the segments into walls, windows, doors, and roofs, using the segment properties to
identify the different building components. Despite the contributions made by this research,
these methods are limited to identifying planar components and depend on the quality of the
planar segments. Typically, construction scenes are composed of planar and cylindrical
components (e.g., pipes) surrounded by clutter, which affects the output of segmentation
methods (the result is an over-segmented or under-segmented point cloud scene).
On the other hand, to address these limitations, Perez-Perez et al. (2016) and Perez-Perez et
al. (2017) focused their research on identifying structural (e.g., beam, column, ceiling, floor, and


wall) and mechanical (e.g., pipe) components. They segment the point cloud using the multi-scale
region growing method proposed by Dimitrov and Golparvar-Fard (2015), which allows the desired
level of abstraction to be used to segment the point cloud. The resulting segments are
classified using a Markov Random Field (MRF); the MRF uses the relationship between semantic
and geometric labels, given by a Support Vector Machine (SVM) and an AdaBoost classifier
respectively, to classify the segments into semantic categories. The results showed that the
procedure achieves state-of-the-art performance in semantic labeling. However, their method
depends on the performance of the segmentation method: it yields excellent results on
over-segmented scenes, but its accuracy decreases on under-segmented scenes.
Finally, during the past few years the use of neural networks for identifying semantic
components in point cloud scenes has increased. Qi et al. (2017) developed a method called
PointNet that uses a deep learning architecture for 3D shape classification, shape part
segmentation, and scene semantic parsing. Their architecture uses a 1D convolutional neural
network for classifying floors, windows, beams, columns, walls, and ceilings, among others.
Despite the results obtained by this architecture, their method does not use the geometric
properties of the point neighborhood, such as the normals, curvatures, and surface roughness,
which have been proven to increase the accuracy of semantic segmentation. In addition, their
method does not identify mechanical components, such as pipes, which are commonly present in
construction scenes.

METHOD
This paper presents a new method that uses an ANN architecture to semantically segment
unorganized 3D point clouds. First, the method subsamples the point cloud using a radius, r.
Then, the method extracts the point features: normals (Nx, Ny, Nz), maximum curvature (Cmax),
minimum curvature (Cmin), surface roughness (Ra), and the x, y, z distances from the point to
the point cloud centroid. These features are concatenated with the point coordinates (x, y, z)
and the point color channels (RGB) to form a 1x15 feature vector. The point feature vectors are
the input to the ANN architecture, which has two hidden layers that use a Rectified Linear Unit
(ReLU) activation function; the output layer returns the classification scores of the nine (9)
categories using a sigmoid activation function. The ANN classifies the point cloud points into
the following categories: beam, ceiling, clutter, column, door, floor, pipe, window, and wall.
Figure 1 illustrates the method described above. The next section discusses the feature
extraction process.

Figure 1. An overview of the proposed method


Feature extraction
The method proposed in this paper takes as input a point cloud, PC, composed of n points. The
point cloud, PC, is subsampled using a radius, r, forming a subsampled point cloud, PCs. The
PCs points are represented by a [1 x 6] vector containing the x, y, z coordinates and the RGB


color channels. The method takes advantage of the geometric properties of the point
neighborhood by estimating Nx, Ny, Nz, Cmax, Cmin, the x, y, z centroid distances, and Ra. The
description of these features is presented in Table 1.

Table 1. Description of the point features used by the ANN


Feature      Description
x, y, z      Normalized x, y, and z coordinates
R, G, B      RGB color channels
Nx, Ny, Nz   Normal components in the x, y, and z directions
Cmin, Cmax   Magnitudes of the minimum and maximum curvatures
Ra           Distance of neighborhood points to the fitted surface
x, y, z      Distances from the point to the point cloud centroid

The method estimates the point features using a multi-scale feature extraction process. The
point neighborhood is formed by neighbor points within a distance d = {r, 0.75r, 0.5r, 0.25r}.
The method estimates the multi-scale features by fitting a surface over each point neighborhood
whose size is larger than 500 points. The normals are the unit surface normals Nx, Ny, Nz of
the fitted surface; the curvatures are the absolute maximum and minimum curvatures Cmax and
Cmin, respectively; the surface roughness, Ra, is calculated by estimating the distance from
the neighbor points to the fitted surface; and the x, y, z components are the distances from
the point to the point cloud centroid.
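A minimal sketch of the local-geometry step is shown below, assuming a plane fit in place of the paper's general surface fit: the normal is taken as the eigenvector of the smallest covariance eigenvalue and Ra as the RMS point-to-plane residual; curvature estimation, which requires a curved surface fit, is omitted.

```python
# Sketch of per-point geometric features via a local plane fit: the surface
# normal is the smallest-variance eigenvector of the neighborhood covariance,
# and Ra is the RMS point-to-plane distance. (Illustrative simplification of
# the paper's surface fit; curvatures Cmax/Cmin are not computed here.)
import numpy as np

def plane_features(neighborhood):
    centroid = neighborhood.mean(axis=0)
    centered = neighborhood - centroid
    cov = centered.T @ centered / len(neighborhood)
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues ascending
    normal = eigvecs[:, 0]                           # smallest-variance direction
    ra = np.sqrt(np.mean((centered @ normal) ** 2))  # RMS residual distance
    return normal, ra

# Nearly flat patch in the z = 0 plane: expect normal ~ (0, 0, 1) and Ra ~ 0.01.
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(-1, 1, 500),
                       rng.uniform(-1, 1, 500),
                       rng.normal(0, 0.01, 500)])
n, ra = plane_features(pts)
print("normal:", np.round(np.abs(n), 2), "Ra:", round(ra, 3))
```

Running this over neighborhoods at each of the four radii d yields the multi-scale portion of the feature descriptor.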

Artificial Neural Network Architecture


The method presented in this paper uses an ANN architecture to semantically segment a 3D
point cloud. The ANN classifies the point cloud points into the following categories: beam,
ceiling, clutter, column, door, floor, pipe, wall, and window. The input of the ANN is the point’s
feature descriptor built during the multi-scale feature extraction phase. An overview of the ANN
architecture is illustrated in Figure 2.

Figure 2. An overview of the ANN architecture


The inputs of the ANN are n x 15 feature descriptors, one feature descriptor for each point.
The activation function that the ANN uses is ReLU. The ANN architecture has an input layer
with 100 neurons, two hidden layers (the first with 200 neurons and the second with 400
neurons), and an output layer with nine (9) neurons that use the sigmoid function and return
the category scores. The hidden layers have a dropout rate of 0.5. The point label is selected
by choosing the category with the highest probability.
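A shape-level sketch of this architecture, using random untrained weights purely to illustrate the layer sizes and activations (a real implementation would train the weights and apply dropout during training):

```python
# Shape-level sketch of the described ANN: 15 input features -> layers of
# 100 -> 200 -> 400 neurons with ReLU -> 9 sigmoid category scores.
# Weights are random and untrained; dropout (rate 0.5 at training time) is
# omitted at inference, so this only illustrates the architecture.
import numpy as np

rng = np.random.default_rng(0)
sizes = [15, 100, 200, 400, 9]
weights = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(X):
    h = X
    for W in weights[:-1]:
        h = np.maximum(0, h @ W)                      # ReLU hidden layers
    return 1 / (1 + np.exp(-(h @ weights[-1])))       # sigmoid output scores

X = rng.normal(size=(4, 15))          # four points, one 1x15 descriptor each
scores = forward(X)
point_labels = scores.argmax(axis=1)  # pick the highest-scoring category
print(scores.shape, point_labels)
```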


RESULTS
The method proposed in this paper was trained and tested using two groups of datasets. The
first group was the S3DIS dataset (Armeni et al. 2016), and the second was a dataset built by
the RAAMAC Lab. In total, 244 and 35 rooms from real-world point clouds were used to train and
test the method, respectively. These point clouds were composed of beams, columns, ceilings,
clutter, doors, floors, pipes, walls, and windows. Table 2 lists the number of points used to
train and test the method.

Table 2. Number of points per semantic category used to train and test the method

                  Beam     Ceiling  Clutter  Column  Door     Floor    Pipe    Wall     Window  Total
Training dataset  1264494  6410302  8221673  612177  1475631  4681176  528893  8324518  674045  32192909
Testing dataset   1192745  2697867  1635708  444756  192986   2093113  267212  1706968  144958  10376313

The point clouds were subsampled using a radius, r, equal to 0.3. In addition, the point
feature vector had a size of 1x15. We used a cross-validation process to find the dimensions of
the ANN layers, testing how the number of layers and the layer sizes affect the method's
accuracy. We tested 20 different combinations, with layer sizes from 15 to 1024 nodes. In
addition, we took the dropout factor into consideration, so each combination was tested using
dropout factors of [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]. We found that the ANN architecture
returning the highest accuracy has an input layer of 100 neurons, two hidden layers of 200 and
400 neurons respectively, and an output layer of 9 neurons, and that the dropout rate returning
the highest accuracy is 0.5.
Using the architecture mentioned above, we obtained a point accuracy of 73.80%, and the
average accuracy across the 9 categories was 56.21%. Figure 3 shows examples of the results
obtained. The accuracy per category was 46.59%, 87.17%, 68.63%, 10.02%, 48.63%, 80.97%, 62.18%,
82.86%, and 18.87% for the beam, ceiling, clutter, column, door, floor, pipe, wall, and window
categories, respectively. Table 3 shows the accuracies obtained using feature vectors of sizes
6 and 15. The 1x6 feature vector describes a point using its coordinates (x, y, and z) and RGB
color channels. Using the 1x15 feature vector, the average accuracy increased by 14.6%. The
accuracies of the beam, ceiling, clutter, column, door, floor, pipe, and wall categories
increased by 17.13%, 47.77%, 11%, 2.22%, 0.74%, 9.01%, 21.95%, and 21.76% respectively, while
the accuracy of the window category decreased by 0.15%. The categories with the lowest accuracy
were column and window, which were the two categories with the fewest points for training and
testing. Figure 4 shows the confusion matrix for the ANN. As shown in Figure 4, the ANN
frequently mislabeled column points as wall and beam points as ceiling; one reason for this
behavior is the proximity of beams to the ceiling and of columns to walls. The categories with
the largest amounts of data were ceiling, floor, and wall, and the ANN accordingly suffered
less confusion for these three categories.

Table 3. ANN results using 6 and 15 features

Num. of features  Beam   Ceiling  Clutter  Column  Door   Floor  Pipe   Wall   Window  Average
6                 29.46  39.40    57.63    7.80    47.89  71.96  40.23  61.10  19.02   41.61
15                46.59  87.17    68.63    10.02   48.63  80.97  62.18  82.86  18.87   56.21


Figure 3. Example of ANN Results


Figure 4. ANN Confusion Matrix

CONCLUSIONS AND FUTURE WORK


The presented method uses an ANN architecture for semantic segmentation of point cloud
scenes. The method identifies the mechanical and structural components present in the scene,
such as beams, ceilings, columns, floors, pipes, and walls. In addition, the ANN architecture
identifies clutter and openings such as doors and windows. The method obtained a point
accuracy of 73.80% and an average accuracy of 56.21% across the nine (9) categories. The
method accurately segmented the ceiling, floor, and wall categories but performed less
impressively on the column and window categories.
Our preliminary experimental results using real-world point cloud data showed that the
method achieves state-of-the-art performance on semantic labeling of point cloud data. Our
anecdotal observations on the resulting models show how geometric information from a point's
surroundings, such as the maximum and minimum curvatures, surface normals, and surface
roughness, can be used to describe a point in order to semantically categorize it. In addition,
the results presented in this paper show that the method has potential for Scan2BIM
applications. In future work, we will explore adding convolutional layers to our architecture,
which will allow us to extract features from a larger neighborhood. Finally, we want to test
how the point cloud depth map can be used to enhance the semantic labeling accuracy.

ACKNOWLEDGEMENTS
The authors would like to thank partner construction companies for their technical support
and for allowing access to point cloud data. They would also like to thank the members of the
RAAMAC lab for their help and support in forming a large database of ground truth models.

REFERENCES
Armeni, I., Sener O., Zamir A., Jiang H., Brilakis I., Fischer M. and Savarese S. “3D Semantic
Parsing of Large-Scale Indoor Spaces”. CVPR (2016).
Borrmann, D., Elseberg, J., Lingemann, K., and Nüchter, A. (2011). “The 3D Hough Transform
for plane detection in point clouds: A review and a new accumulator design.” 3D Research,
2(2), 1–13.


Dimitrov, A., and Golparvar-Fard, M. (2015). “Segmentation of building point cloud models
including detailed architectural/structural features and MEP systems.” AutoCon, Elsevier
B.V., 51, 32–45.
Dimitrov, A., Gu, R., and Golparvar-Fard, M. (2016). “Non-Uniform B-Spline Surface Fitting
from Unordered 3D Point Clouds for As-Built Modeling.” COMPUT-AIDED CIV, 31(7),
483–498.
Grilli, E., Menna, F., and Remondino, F. (2017). “A Review of Point Clouds Segmentation and
Classification Algorithms.” ISPRS (2017), 339–344.
Huber, D., Akinci, B., Adan, A., Anil, E., and Okorn, B. (2011). “Methods for Automatically
Modeling and Representing As-built Building Information Models.” NSF Engineering
Research and Innovation Conference.
Huber, D., Akinci, B., Tang, P., Adan, A., Okorn, B., and Xiong, X. (2010). “Using laser
scanners for modeling and analysis in architecture, engineering, and construction.”
Information Sciences and Systems (CISS), 2010 44th Annual Conference on, IEEE,
Princeton, NJ, 1–6.
Nahangi, M., Czerniawski, T., and Haas, C. T. (2015). “Automated 3D Shape Detection and
Outlier Removal in Cluttered Laser Scans of Industrial Assemblies.” ICIC, 0–10.
Pătrăucean, V., Armeni, I., Nahangi, M., Yeung, J., Brilakis, I., and Haas, C. (2015). “State of
research in automatic as-built modelling.” AEI, 162–171.
Perez-Perez, Y., Golparvar-Fard, M., and El-Rayes, K. (2017). “Semantic-Rich 3D CAD Models
for Built Environments from Point Clouds: An End-to- End.” ASCE CCE 2017, Seattle,
Washington, 166–174.
Perez-Perez, Y., Golpavar-fard, M., and El-Rayes, K. (2016). “Semantic and Geometric Labeling
for Enhanced 3D Point Cloud Segmentation.” CRC 2016, San Juan, PR, 2542–2552.
Pu, S., and Vosselman, G. (2006). “Automatic extraction of building features from terrestrial
laser scanning.” IAP, 36(5), 25–27.
Qi, C., Su, H., Mo, K., Guibas, L. (2017). “PointNet: Deep Learning on Point Sets for 3D
Classification and Segmentation.” CVPR 2017
Tang, P., Huber, D., Akinci, B., Lipman, R., and Lytle, A. (2010). “Automatic reconstruction of
as-built building information models from laser-scanned point clouds: A review of related
techniques.” AutoCon, 19, 829–843.
Volk, R., Stengel, J., and Schultmann, F. (2014). “Building Information Modeling (BIM) for
existing buildings – Literature review and future needs”. Automation in Construction, 38,
109-127.
Wang, C., Cho, Y. K., and Kim, C. (2015). “Automatic BIM component extraction from point
clouds of existing buildings for sustainability applications.” Automation in Construction,
Elsevier B.V., 56, 1–13.
Xiong, X., Adan, A., Akinci, B., and Huber, D. (2013). “Automatic creation of semantically rich
3D building models from laser scanner data.” Automation in Construction, 31, 325–337.

© ASCE
Computing in Civil Engineering 2019 105

Historical Accident and Injury Database-Driven Audio-Based Autonomous Construction Safety Surveillance

Yiyi Xie1; Yong-Cheol Lee, Ph.D.2; Moeid Shariatfar3; Zhongjie "Doc" Zhang, Ph.D., P.E.4; Abbas Rashidi, Ph.D.5; and Hyun Woo Lee, Ph.D.6

1Ph.D. Student, Bert S. Turner Dept. of Construction Management, Louisiana State Univ., 3319 Patrick F. Taylor Hall, Baton Rouge, LA 70803. E-mail: [email protected]
2Assistant Professor, Bert S. Turner Dept. of Construction Management, Louisiana State Univ., 3319 Patrick F. Taylor Hall, Baton Rouge, LA 70803 (corresponding author). E-mail: [email protected]
3Ph.D. Student, Bert S. Turner Dept. of Construction Management, Louisiana State Univ., 3319 Patrick F. Taylor Hall, Baton Rouge, LA 70803. E-mail: [email protected]
4Research Administrator, Louisiana Transportation Research Center, Baton Rouge, LA 70803. E-mail: [email protected]
5Assistant Professor, Dept. of Civil and Environmental Engineering, Univ. of Utah, Salt Lake City, UT. E-mail: [email protected]
6Assistant Professor, Dept. of Construction Management, Univ. of Washington, 120 Architecture Hall, Campus Box 351610, Seattle, WA 98195. E-mail: [email protected]

ABSTRACT
Construction safety has always been one of the critical concerns in the construction industry,
and diverse approaches and technologies for consistently managing safety hazard issues have
been examined and adopted. However, current technology such as vision-based approaches does
not appear to fully support consistent and robust safety monitoring because of its heavy data
processing and inherent restrictions, including limited angle coverage and view detectability. A
vulnerable construction environment requires an advanced safety surveillance and event
detection approach. To provide a supplemental safety monitoring method, this project aims to
establish an audio-based autonomous safety surveillance system with alarming pre-notifications
and map visualization based on identified work activities and potential safety issues. The
proposed system adopts a schedule-based sound data training method and incorporates historical
occupational injury and illness manual data with classified sources and events of accidents for
each construction activity. The injury data are informed by a daily project schedule to link
potential safety hazards to daily planned work activities. By applying a machine learning
technique, the system accurately categorizes a sound type according to sound training data
scoped down by the project schedule and safety data, and provides pre-warnings for any detected
irregular events. The system is expected to contribute to the body of knowledge in safety
monitoring and shows the potential to be integrated into a robust automated safety surveillance
system with occupational accident data while significantly improving construction activity
classification accuracy.

INTRODUCTION
With the advancement of information technology, practitioners and researchers have paid
more attention to technical methods to address construction safety issues. According to the
international occupational safety statistics, the construction industry is considered one of the
most dangerous industries (Melzner et al. 2013). In addition, the National Census of Fatal
Occupational Injuries survey conducted by the U.S. Bureau of Labor Statistics in 2015 reveals
that the major causes of construction accidents include the unique nature of the industry, the
mistaken behavior of unskilled workers, a lack of safety knowledge, poor safety management,
and weak enforcement of mandatory safety rules (The U.S. Bureau of Labor Statistics 2018). To
prevent safety hazards and associated casualties in construction work environments, it is
recommended to adopt advanced safety identification technologies and security management
systems for accurately detecting various construction activities and security issues (Zhang et al.
2011). Cheung et al. (2018) state that exploiting advanced information and communication
technologies can potentially improve the current construction safety management and risk
identification processes. In contrast, conventional methods for detecting construction safety issues
generally involve human inspection and manual judgment. As a result of the nature of the
traditional methods, the necessary follow-up rescue measures are taken late after incidents occur.
To overcome these limitations, automated safety monitoring and identification is a promising
approach for robust and consistent safety surveillance on a construction site (Park et al. 2016).
Diverse technologies such as vision-based methods, the global positioning system (GPS), ultra-wide
band (UWB), and radio frequency identification (RFID) have been investigated and applied to
remotely monitor construction safety. However, because of the complexity and vulnerability
of the construction site environment, the application of state-of-the-art technologies for construction
safety management and automated surveillance has been limited. In response, the present study
proposes an audio-based method for autonomous site safety surveillance and rapid precautions.
This safety data-driven system for construction site safety monitoring employs a construction
project schedule and associated daily safety hazard data. This system is expected to provide new
opportunities for rapid safety issue identification and management.

LITERATURE REVIEW
Various technologies have been adopted for safety surveillance. An audio-based method,
however, offers diverse competitive benefits such as lightweight data processing for rapid data
sharing and unlimited angles for data collection and detection. In addition, audio sensors are not
limited by illumination, enabling the system to operate day and night. Audio-based activity
detection has been applied for outdoor and indoor safety monitoring in perimeter environments.
In terms of outdoor monitoring, an audio-based method is used to detect impulsive sounds
including glass breaks, human screams, and gunshots by employing classification algorithms
such as Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM) (Peng et al.
2009). In health and safety management, sound signals generated by body falls and distressed
speech expressions are used to detect emergency incidents for patients with attached equipment
at home (Doukas et al. 2009). This method monitors elderly people living alone at home and
provides first assistance through remote monitoring. To manage safety issues on construction
sites, Cho et al. (2017) propose an audio-based system for identifying construction activities and
events, which analyzes sound data for real-time monitoring. A framework integrating a wearable
sensor into BIM has also been proposed to prevent noise-caused hearing loss, with the noise
spatial distribution visualized on the platform (Wei et al. 2017).

RESEARCH APPROACH AND METHODOLOGY


To pre-identify safety issues and improve audio-based event detection in a construction
workplace, this study incorporates Occupational Injury and Illness Classification System
(OIICS) data into an audio-based safety surveillance system. The OIICS is referenced against a
daily construction schedule to provide corresponding historical accident issues that can be used
to notify workers in a workplace. Although previous studies in diverse domains have already
applied audio-based recognition methods for automated event detection, this study incorporates
historical safety and accident data reflecting accidents that can occur in a real construction
project according to the associated work activities and equipment operations. To accurately
reference the historical safety data, this study employs a construction schedule indicating daily
planned work activities. The safety data form the knowledge base that establishes a training
dataset of associated sound types and drives the audio-based safety monitoring and alarming
system.

Figure 1. Processes of a research study


Research Goal
The primary research goal is to develop a safety surveillance system integrated with the
OIICS and a project schedule. To achieve this goal, the study has the following two objectives:
(1) establishing a data mapping process between a project schedule and the OIICS; and (2)
generating a project-specific sound data library for an autonomous safety issue notification
system. Obviously, it is not feasible to create a sound database covering all sound types of
construction activities and equipment operations because of the heterogeneous materials,
equipment, and working environments of each project. Thus, this study aims to develop a
project-specific sound data library that addresses sound data of planned work activities in a
project for training, based on a construction project schedule. This schedule information is
directly mapped to the OIICS database to retrieve previous accident lists for creating sound types
of associated safety issues. In addition, this study aims to establish an audio-based framework to
pre-identify possible safety issues and detect construction accidents. By adopting a machine
learning algorithm and the OIICS data, this system can identify real-time construction activities
and accidents, providing pre-notifications.
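The schedule-to-OIICS mapping described above can be pictured as a lookup from planned activity names to historical hazard categories. The sketch below is purely illustrative: the activity keys and hazard strings are hypothetical stand-ins paraphrased from the excavation example in this paper, not actual OIICS codes.

```python
# Hypothetical mapping from scheduled activity names to historical OIICS-style
# hazard categories; the entries are illustrative, not real OIICS codes.
OIICS_HAZARDS = {
    "Excavating": ["Transportation accident (overturning, struck by vehicle)",
                   "Contact with objects and equipment (caught in machinery)"],
    "Compacting": ["Bodily reaction and exertion (repetitive motion)"],
    "Piling":     ["Struck by object or equipment (swinging object)"],
}

def hazards_for_schedule(todays_activities):
    """Collect historical hazard categories for today's planned activities."""
    return {activity: OIICS_HAZARDS.get(activity, ["No historical record"])
            for activity in todays_activities}
```

In the full system, these retrieved categories would drive both the pre-notifications sent to workers and the selection of accident sound types for training.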

Figure 2. Construction resource and possible safety issues in XML

Figure 3. Tasks, resources, start/finish dates extracted from XML


Methodology
The proposed method involves frameworks for project schedule- and OIICS-based sound
training data generation, pre-identification of possible safety issues, and audio-based
construction event and safety accident detection. Figure 1 represents the process of this research
study. Each construction project entails a detailed project schedule including daily work plans,
making it possible to predict what types of work activities and equipment operations will occur
on a project site on a daily basis. With the help of this schedule information, the proposed system
includes only the associated sound types and accident sound data when establishing training
data, restricting the types in the training dataset to improve classification accuracy. To
consistently map a project schedule to a safety database, this study applies the OmniClass
Construction Classification System (OmniClass) as a unified standard terminology for defining
construction tasks, activities, and resources in a project schedule as well as for referencing
possible safety issues in the OIICS manual. The OIICS manual is accumulated reference data
that clearly show potential safety issues of each resource or activity in previous construction
projects and categorize them in a manner that can be used in a current construction project. In
this study, the XML file format is used to translate information between the scheduling software
and the OIICS manual. The XML-based construction schedule contains detailed information
about each construction event, such as starting time, finishing time, tasks, and resources, which
provides the information needed to extract construction activities within the same period. In this
research study, the K-nearest neighbor algorithm is adopted as the sound classifier, and 62
features are extracted from the dataset.
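A minimal illustration of the K-nearest-neighbor classification step follows, assuming each audio segment has already been reduced to a numeric feature vector; the paper's full 62-feature set is not reproduced here.

```python
import numpy as np
from collections import Counter

def knn_classify(train_X, train_y, x, k=5):
    """Label one feature vector by majority vote among its k nearest
    training vectors under Euclidean distance (basic KNN)."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Restricting train_X and train_y to the sound types expected from the daily schedule is what "scopes down" the classifier in the proposed method.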

IMPLEMENTATION AND CASE STUDY


To assess the accuracy and feasibility of this framework, this study adopted a real
construction project schedule for generating daily construction activities and retrieving safety
issue data. Section 4.1 describes the detailed processes of construction schedule generation and
safety issue extraction using an XML-based construction schedule. Section 4.2 describes dataset
pre-processing and feature extraction. Section 4.3 compares the classification accuracy between
the three construction activities expected from the construction schedule and ten selected
construction activities. Section 4.4 shows safety hazard precaution and real-time detection.

Schedule Generation and Construction Activities and Safety Issues Data Extraction
An XML-based schedule defined according to OmniClass is generated from MS Project, one
of the common construction scheduling software packages, and is then mapped to the OIICS
manual as shown in Figure 2. Construction activities, including starting time, finishing time,
tasks, and resources, and possible safety issues are extracted from the XML-based daily project
schedule, as shown in Figure 3.
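A sketch of extracting task names and start/finish dates from an XML schedule follows. The element layout below is a simplified, hypothetical fragment seeded with two of the activities from Table 2; a real MS Project (MSPDI) export nests similar Task/Name/Start/Finish elements under a namespaced root.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical schedule fragment (dates taken from Table 2).
SCHEDULE_XML = """
<Project>
  <Tasks>
    <Task><Name>Earth Works</Name>
          <Start>2018-09-07</Start><Finish>2018-09-20</Finish></Task>
    <Task><Name>Cast-in-Place Concrete Pile</Name>
          <Start>2018-09-14</Start><Finish>2018-09-17</Finish></Task>
  </Tasks>
</Project>
"""

def extract_tasks(xml_text):
    """Return (name, start, finish) for every <Task> element in the schedule."""
    root = ET.fromstring(xml_text)
    return [(t.findtext("Name"), t.findtext("Start"), t.findtext("Finish"))
            for t in root.iter("Task")]
```

The extracted tuples are what the mapping step then joins against the OIICS manual.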

Figure 4. Sound of excavating before and after de-noising


Dataset Pre-processing and Feature Extraction
Because of the dynamic environment of a construction site, recorded audio generally
contains noise that severely influences the sound classification result. Consequently, a de-noising
algorithm is required to preprocess the dataset. The audio signals are divided into short segments
(bins 2 seconds long) to obtain feature values in each segment before de-noising. The noise
estimation algorithm of Rangachari and Loizou (2006) is adopted to enhance the original signal,
as shown in Figure 4. Afterwards, the features mentioned in Section 3.2 are extracted; some
features of the enhanced signal are shown in Figure 5.
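The segmentation and feature steps can be sketched as follows. Zero-crossing count (ZCC) and short-time energy are two of the simpler features of the kind shown in Figure 5; the sketch omits de-noising and the remaining features, and the 2 s bin length follows the paper.

```python
import numpy as np

def segment_features(signal, fs, bin_seconds=2.0):
    """Split a mono audio signal into fixed-length bins and compute two simple
    per-bin features: zero-crossing count (ZCC) and short-time energy."""
    bin_len = int(fs * bin_seconds)
    feats = []
    for b in range(len(signal) // bin_len):
        seg = signal[b * bin_len:(b + 1) * bin_len]
        zcc = int(np.count_nonzero(np.diff(np.sign(seg))))   # sign changes
        energy = float(np.sum(seg ** 2))
        feats.append((zcc, energy))
    return feats
```

Per-bin feature tuples like these, stacked into vectors, are what the KNN classifier consumes.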


Figure 5. ZCC and LEF Feature of Enhanced Excavating Sound

Table 1. Selected ten construction activities

Construction Activities: Excavating, Compacting, Hammering, Piling, Drilling, Concrete Mix, Bulldoze, Crane, Concrete Pump, Chainsaw

Table 2. Schedule-based three construction activities


Construction Activities Construction Schedule
Earth Works (Excavating) 2018-09-07 to 2018-09-20
Spreading and Compaction (Compacting) 2018-09-14 to 2018-09-20
Cast-in-Place Concrete Pile (Piling) 2018-09-14 to 2018-09-17

Figure 6. Ten Construction Activities


Classification Accuracy Comparison
Table 1 shows the ten construction activities executed in a construction project, and Table 2
shows the three activities planned by the daily project schedule. In this experiment, 200 short
time segments (2 s long) were used for each construction activity; 100 segments were set as the
training dataset, and the rest were used as the testing dataset. To evaluate the accuracy of sound
classification before and after the application of the proposed method, the confusion matrices
are demonstrated in Figures 6 and 7. Figure 6 illustrates the classification result for the ten
construction activities, and Figure 7 shows the result for the three construction activities planned
for the day. As shown in these two figures, compared with the ten construction activities, the
classification accuracy with three types of work activities is significantly improved. In Figure 6,
Bulldoze activities were completely misclassified as Drilling, and over half of the Concrete
Pump activities were misclassified as Piling and Drilling. However, in Figure 7, all three
activities are classified with high accuracy.
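The accuracy comparison rests on confusion matrices like those shown in Figures 6 and 7. A minimal sketch of how such a matrix (rows = true activity, columns = predicted activity) is built:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, labels):
    """Confusion matrix with rows = true activity, columns = predicted activity."""
    index = {lab: i for i, lab in enumerate(labels)}
    mat = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        mat[index[t], index[p]] += 1
    return mat
```

Overall accuracy is the trace of the matrix divided by the total count, and off-diagonal cells expose systematic confusions such as Bulldoze being labeled as Drilling.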

Figure 7. Three Construction Activities

Figure 8. Safety issue precaution


Safety Issues Precaution and Real-time Event Identification
The proposed framework can identify the sounds of construction activities and provide
associated safety hazard issues to workers via mobile devices or a web-based interface. In
addition, the system is capable of directly recognizing safety issues and accidents using sound
training data restricted by the daily construction schedule and safety accident information. For
example, if the system identifies the sound of excavating work on a site, the following safety
hazard notification, retrieved from historical safety accident data, is automatically sent to the
associated workers: transportation accident hazards (overturning, falling from and struck by
vehicle or mobile equipment, pedestrian struck by vehicle); bodily reaction and exertion
(repetitive motion); contact with objects and equipment (compressed or pinched by rolling,
sliding, or shifting objects or equipment; caught in running equipment or machinery; struck by
object or equipment, swinging or slipping object). These safety hazard notifications are expected
to support project managers and field personnel with rapid on-site safety hazard precautions and
prompt rescue procedures. Before each construction activity starts, possible safety issues related
to the planned work activities are sent through mobile devices, as shown in Figures 8 and 9.

Figure 9. Real-Time Safety Issue Detection


CONCLUSION
Automated safety monitoring is one of the most promising methods for accurate and
continuous monitoring of safety performance on a construction site. The proposed system can
not only be used for detecting safety issues immediately when an incident occurs but also
provides safety precautions for each activity before it starts. In this research study, sound
recognition is used for safety precautions and accident detection. Safety precautions are
announced through activity detection based on the proposed schedule-based sound recognition
system. Accident alarms and notifications are sent upon detection of any abnormal sound from a
construction site. A significant reduction in the number of incidents and injuries is expected
from applying the proposed framework on a construction site. In addition, this remote and
automated safety surveillance system is a promising approach for organizing complicated
construction processes and risky activities by providing immediate issue notification and
pre-identifying possible safety issues.

REFERENCES
Cho, C., Lee, Y. C., & Zhang, T. (2017). Sound Recognition Techniques for Multi-Layered
Construction Activities and Events. In Computing in Civil Engineering 2017 (pp. 326-334).
Cheung, W. F., Lin, T. H., & Lin, Y. C. (2018). A Real-Time Construction Safety Monitoring
System for Hazardous Gas Integrating Wireless Sensor Network and Building Information
Modeling Technologies. Sensors, 18(2), 436.
Doukas, C., Athanasiou, L., Fakos, K., & Maglogiannis, I. (2009). Advanced sound and distress
speech expression classification for human status awareness in assistive environments. The
Journal on Information Technology in Healthcare, 7(2), 111-117.
Melzner, J., Zhang, S., Teizer, J., & Bargstädt, H. J. (2013). A case study on automated safety
compliance checking to assist fall protection design and planning in building information
models. Construction Management and Economics, 31(6), 661-674.


Peng, Y. T., Lin, C. Y., Sun, M. T., & Tsai, K. C. (2009, June). Healthcare audio event
classification using hidden Markov models and hierarchical hidden Markov models. In
Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 1218-1221).
IEEE.
Park, J., Kim, K., & Cho, Y. K. (2016). Framework of automated construction-safety monitoring
using cloud-enabled BIM and BLE mobile tracking sensors. Journal of Construction
Engineering and Management, 143(2), 05016019.
Rangachari, S., & Loizou, P. C. (2006). A noise-estimation algorithm for highly non-stationary
environments. Speech Communication, 48(2), 220-231.
The U.S. Bureau of Labor Statistics. (2015). The National Census of Fatal Occupational Injuries
survey. https://www.bls.gov/news.release/archives/cfoi_12162016.pdf. Last accessed
October 30, 2018.
Wei, W., Wang, C., & Lee, Y. C. (2017). BIM-based construction noise hazard prediction and
visualization for occupational safety and health awareness improvement. In Computing in
Civil Engineering 2017 (pp. 262-269).
Zhang, J. P., & Hu, Z. Z. (2011). BIM-and 4D-based integrated solution of analysis and
management for conflicts and structural safety problems during construction: 1. Principles
and methodologies. Automation in construction, 20(2), 155-166.


Business Failure Prediction with LSTM RNN in the Construction Industry


Youjin Jang, Ph.D.1; In-Bae Jeong, Ph.D.2; Yong K. Cho, A.M.ASCE3; and Yonghan Ahn, A.M.ASCE4

1Robotics and Intelligent Construction Automation Laboratory, School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0355. E-mail: [email protected]
2Robotics and Intelligent Construction Automation Laboratory, School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0355. E-mail: [email protected]
3Robotics and Intelligent Construction Automation Laboratory, School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0355. E-mail: [email protected]
4Sustainable Building and Construction Management Laboratory, School of Architectural Engineering, Hanyang Univ., Ansan-City, Gyeonggi-do 15588, South Korea. E-mail: [email protected]

ABSTRACT
Due to the characteristics of construction projects, construction contractors are often more
vulnerable to business failure than those in other industries. Thus, predicting the potential
business failure of construction contractors has been a crucial issue. This study proposes a model
that predicts business failure using a long short-term memory recurrent neural network (LSTM
RNN), one of the deep-learning algorithms. The proposed model uses not only a set of
accounting data but also proxies for the construction market condition and the macroeconomic
environment as input variables. The prediction performance of the proposed model is examined
by varying the combination of input variable groups. The results showed that adding
construction market and macroeconomic variables to accounting variables could increase the
performance of business failure prediction. It was also found that macroeconomic variables had
a slightly higher impact on business failure prediction than construction market variables. The
results of this study are expected to be useful references for both researchers and practitioners in
developing business failure prediction models for construction contractors.

INTRODUCTION
As the scale and technical complexity of construction projects increase, a construction
contractor and a number of stakeholders must collaborate to accomplish the projects (Lam et al.
2009). If a construction contractor's business fails, the stakeholders related to that contractor will
also be affected. Unfortunately, construction contractors are vulnerable to business failure
compared to those in other industries, since construction projects have characteristics such as
uniqueness, long completion periods, and uncertainty of construction activities. Business failure
prediction would not only provide an early warning to prevent severe business failure but also
assist collaborators or investment partners in the selection of construction contractors. In this
respect, predicting the potential business failure of construction contractors has been a critical
issue in the construction industry.
Due to this importance, there has been abundant literature on the development of business
failure prediction models for construction contractors, which attempted to improve prediction
performance using numerous techniques such as statistical and artificial intelligence (AI)
techniques. However, the variables used in the business failure prediction models mostly
concern financial ratios covering the liquidity ratio, profitability ratio, leverage ratio, and
activity ratio. A few studies have attempted to employ other factors, including firm characteristic
factors related to company main activity, company size, company age, type of trade, and
headquarters geographic location (Adeleye et al. 2013; Horta and Camanho 2013); market
factors related to stock exchange type and market-to-book ratio (Adeleye et al. 2013; Tserng et
al. 2014); and macroeconomic environmental factors related to inflation and interest rate
(Kangari 1988; Russell and Zhai 1996). These studies used other factors in addition to financial
ratios to improve prediction performance but did not empirically examine whether using these
factors together contributes to the improvement of prediction performance compared with using
only financial ratios. Construction contractors are highly susceptible to macroeconomic effects,
and macroeconomic factors are essential to predict business failure (Arditi et al. 2000; Sang et
al. 2014). The level of business riskiness may also vary with the market condition. Thus, there is
a need to develop a business failure prediction model of construction contractors using not only
accounting variables but also construction market and macroeconomic variables, and to
investigate the effect of the construction market and macroeconomic variables on the business
failure prediction performance.
In an attempt to address these issues, this study aims to develop a prediction model using not
only a set of accounting data but also proxies for the construction market condition and the
macroeconomic environment as input variables, and to examine the effect of combinations of
these input variable groups on the prediction performance. We propose a model that predicts
business failure using a long short-term memory recurrent neural network (LSTM RNN), one of
the deep-learning algorithms. LSTM RNN can effectively learn sequential patterns from data
containing temporal or sequential information and shows excellent performance in predicting
time-series data. The remainder of this paper is structured as follows. The next section proposes
the business failure prediction model using LSTM RNN. Then, the variables are selected and the
dataset for the experiments is prepared. Lastly, this study provides the experiment setup and
results, followed by conclusions.

BUSINESS FAILURE PREDICTION MODEL WITH LSTM RNN


LSTM RNN is a kind of RNN developed to learn sequential and temporal patterns from
time-series or sequences of data, with the ability to learn long-term dynamics while avoiding the
vanishing and exploding gradient problems. In a standard RNN, the calculated h_t value is used
to calculate the output y_t, as shown in Equation (1), and concurrently to calculate the hidden
state h_(t+1) at the next time t+1; hence, the temporal pattern can be learned. The hidden state
h_t at time t is calculated by receiving the input information at time t (x_t) and using the
previous hidden state vector (h_(t-1)), following Equation (2).

y_t = f_1(V h_t + b_1)   (1)
h_t = f_2(U x_t + W h_(t-1) + b_2)   (2)

where U, V, and W are weight matrices, b_1 and b_2 are bias vectors, and f_1 and f_2 are
activation functions such as the sigmoid function or hyperbolic tangent function.

Fig. 1. The Architecture of Business Failure Prediction Model

In LSTM RNN, f_2 is replaced by the LSTM memory block, which consists of memory cells
and gates and plays an important role in training long-range dependency while controlling
information storage and flow. The LSTM memory block has one memory cell (c_t) and three
gates: the input (i_t), forget (f_t), and output (o_t) gates. The gates are used as a mechanism to
determine the information that can be received by the cell. Each gate contains a sigmoid neural
net layer and a pointwise multiplication operation. The sigmoid layer outputs numbers between
zero and one, describing how much of each component can be forwarded to the cell. For time
step t, the cell state can be updated using the following Equations (3)-(8).

i_t = σ(U_i x_t + W_i h_(t-1) + b_i)   (3)
f_t = σ(U_f x_t + W_f h_(t-1) + b_f)   (4)
o_t = σ(U_o x_t + W_o h_(t-1) + b_o)   (5)
g_t = tanh(U_g x_t + W_g h_(t-1) + b_g)   (6)
c_t = f_t ⊙ c_(t-1) + i_t ⊙ g_t   (7)
h_t = o_t ⊙ tanh(c_t)   (8)

where σ is the sigmoid activation function defined as σ(x) = 1/(1 + e^(-x)); ⊙ denotes pointwise
multiplication; i_t, f_t, o_t, and c_t are the outputs of the input, forget, and output gates and the
cell at time t, respectively; b_i, b_f, b_o, and b_g are bias vectors; and U_i, U_f, U_o, U_g, W_i,
W_f, W_o, and W_g are the coefficient matrices.
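Equations (3)-(8) can be checked with a direct NumPy transcription of one LSTM time step. This is an illustrative sketch, not the authors' code; the parameter dictionary P is an assumed container for the U, W, and b arrays of each gate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, P):
    """One LSTM time step following Equations (3)-(8). P is an assumed dict
    holding the U_*, W_* matrices and b_* bias vectors for the input (i),
    forget (f), output (o) gates and the candidate cell input (g)."""
    i = sigmoid(P["Ui"] @ x_t + P["Wi"] @ h_prev + P["bi"])   # Eq. (3)
    f = sigmoid(P["Uf"] @ x_t + P["Wf"] @ h_prev + P["bf"])   # Eq. (4)
    o = sigmoid(P["Uo"] @ x_t + P["Wo"] @ h_prev + P["bo"])   # Eq. (5)
    g = np.tanh(P["Ug"] @ x_t + P["Wg"] @ h_prev + P["bg"])   # Eq. (6)
    c = f * c_prev + i * g          # Eq. (7), elementwise products
    h = o * np.tanh(c)              # Eq. (8)
    return h, c
```

Stacking such steps over the time dimension, and layering them, yields the multi-layer LSTM used in the proposed model.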
This study developed a business failure prediction model using LSTM RNN with three
LSTM hidden layers and a time step of three, as illustrated in Fig. 1. The input data (x_t) were
normalized using standardization, which makes the values of each feature have zero mean and
unit variance. A given input value passes through the LSTM layers at each time step; the value
h_t produced at the last time step t is used to predict the class of business failure in the next year.
This prediction is output as y_t through the softmax layer. It should be noted that the output
classes are numbered 0 and 1, indicating healthy and business failure, respectively. This study
used the cross-entropy loss function.
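The standardization, softmax output, and cross-entropy loss just described can be sketched directly. These are generic formulations of the named operations, not the authors' exact training code.

```python
import numpy as np

def standardize(X):
    """Per-feature zero-mean, unit-variance scaling of the input matrix."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def softmax(z):
    """Map the final hidden state's logits to class probabilities
    (two classes here: healthy vs. business failure)."""
    e = np.exp(z - z.max())      # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(probs, true_class):
    """Cross-entropy loss for a single sample."""
    return -np.log(probs[true_class])
```

During training, the cross-entropy loss of the softmax output against the 0/1 label is what the network's gradients minimize.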

VARIABLE SELECTION AND DATASET PREPARATION


Variable Selection
Business failure firms were defined as construction contractors with delisting codes between
550 and 585 assigned by the Center for Research in Security Prices (CRSP), following Tserng et
al. (2015). These delisting codes represent bankruptcy, liquidity, or
poor performance. We set the values of output variables to (0) and (1), which mean the business
failure and healthy samples, respectively. A sample of business failure data refers to financial
data in a year before being delisted.
This study used three input variable groups which were accounting, construction market, and
macroeconomic variables. As accounting variables, this study employed twelve financial ratios
which were commonly used in the previous studies with the aspects of profitability, liquidity,
leverage and activity (Bal et al. 2013; Chen 2012; Cheng and Hoang 2015; Cheng et al. 2014;
Heo and Yang 2014; Horta and Camanho 2013; Tsai et al. 2012; Tserng et al. 2015, 2011, 2012,
2014): (1) return on assets; (2) return on equity; (3) return on sales; (4) current ratio; (5)
current assets to net assets; (6) working capital to total asset; (7) total liabilities to net
worth; (8) retained earnings to sales; (9) debt ratio; (10) working capital turnover; (11)
equity turnover; and (12) total asset turnover. In addition to accounting variables, three
construction market variables were employed (Ashuri et al. 2012; Shahandashti and Ashuri
2013): (1) construction spending; (2) housing starts; and (3) employment in construction.
Construction spending is a measure of the value of new construction activity, including
residential, non-residential, and public projects. Housing starts is the number of new privately
owned housing units on which construction has started in a given period, and it provides useful
information about expected construction activity in the relatively near future. Employment in
construction is the number of employees on construction payrolls and is a useful measure of the
labor force in the construction sector of the economy. Also, this study selected
three macroeconomic variables (Ashuri et al. 2012; Shahandashti and Ashuri 2013): (1)
consumer price index; (2) gross domestic product (GDP); and (3) federal funds rate.
The consumer price index, a widely used measure of inflation, measures the price level of a
representative basket of goods and services purchased by urban consumers. Gross domestic
product (GDP), which represents the economic health of a country, is a measure of the total
value of goods and services produced in a country in a given period. The federal funds rate is
the interest rate at which banks and other depository institutions lend to each other.

Dataset Preparation
This study chose North American publicly listed firms with Standard Industrial Classification
(SIC) codes between 1500 and 1799, which comprise building construction (1500 to 1599), heavy
construction (1600 to 1699), and special trade construction (1700 to 1799). Sample data were
collected from each construction contractor for the period from 1980 to 2016. To avoid sample
selection biases, this study used firm-year data for analysis and removed the construction
contractors who do not have continuous data for at least five years. The accounting data were
obtained from the Standard & Poor’s COMPUSTAT database (Wharton Research Data Service).
The construction market and macroeconomic data were collected from the US Bureau of Census,
US Bureau of Labor Statistics, U.S. Bureau of Economic Analysis and Board of Governors of
the Federal Reserve Systems. The dataset constructed in this study consisted of 1,336 healthy
samples and 41 business failure samples when considering all the firm-year data. To address this
class imbalance problem, we resampled the original data using a hybrid method that combines
SMOTE (Synthetic Minority Over-sampling Technique) and Tomek links. SMOTE oversamples
the minority class by generating new synthetic examples, and Tomek links cleaning removes data
from both the minority and majority classes to create better-defined class clusters.
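The hybrid resampling step can be illustrated from scratch; in practice one would typically use an off-the-shelf implementation such as imblearn's SMOTETomek. The toy data, neighbor count k, and two-feature setup below are assumptions, not the study's dataset:

```python
import numpy as np

def smote_oversample(X_min, n_new, k=3, rng=None):
    # Minimal SMOTE: create synthetic minority samples by interpolating between
    # a random minority seed and one of its k nearest minority neighbors.
    rng = rng or np.random.default_rng(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(dists)[1:k + 1]        # exclude the seed itself
        j = rng.choice(neighbors)
        u = rng.random()                              # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + u * (X_min[j] - X_min[i]))
    return np.array(synthetic)

def tomek_link_indices(X, y):
    # A Tomek link is a pair of mutual nearest neighbors with opposite labels;
    # following the text, both members are removed to sharpen class boundaries.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    nearest = dists.argmin(axis=1)
    drop = set()
    for a in range(len(X)):
        b = nearest[a]
        if nearest[b] == a and y[a] != y[b]:
            drop.update((a, b))
    return drop

rng = np.random.default_rng(1)
X_healthy = rng.normal(0.0, 1.0, size=(40, 2))        # majority class (label 0)
X_failure = rng.normal(1.5, 1.0, size=(5, 2))         # minority class (label 1)
X_syn = smote_oversample(X_failure, n_new=35, rng=rng)

X = np.vstack([X_healthy, X_failure, X_syn])
y = np.array([0] * 40 + [1] * 40)                     # balanced after SMOTE
drop = tomek_link_indices(X, y)
keep = np.array([i for i in range(len(X)) if i not in drop])
X_res, y_res = X[keep], y[keep]
```

Each synthetic point is a convex combination of two minority samples, so it stays inside the minority class's bounding box.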

EXPERIMENTS
Experiment Setup
The resampled dataset was randomly divided into 70% and 30% subsets. The 70% subset was
further divided into training (80%) and validation (20%) datasets, and the 30% subset served as
the test dataset. The procedure was repeated ten times, and ten prediction performances were
generated. The dropout probability was set to 0.5, and the learning model was trained for 50,000
epochs. The learning rate began from 0.001 and exponentially decayed to 0.00001 at the end of
the iterations. The models were trained with a workstation equipped with four GPUs (CPU:
Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, RAM: 64GB and VGA: NVIDIA TITAN Xp ×
4ea).
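The repeated splitting procedure can be sketched as follows; the dataset size and random seed are illustrative assumptions:

```python
import numpy as np

def split_indices(n, rng):
    # One repetition of the setup: a random 70/30 split, with the 70% portion
    # further divided 80/20 into training and validation indices.
    idx = rng.permutation(n)
    n_test = int(round(0.30 * n))
    test = idx[:n_test]
    rest = idx[n_test:]
    n_val = int(round(0.20 * len(rest)))
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

rng = np.random.default_rng(42)
n_samples = 1000                                     # illustrative dataset size
repetitions = [split_indices(n_samples, rng) for _ in range(10)]  # ten runs
train, val, test = repetitions[0]
```

For 1,000 samples this yields 560 training, 140 validation, and 300 test indices per repetition, with no overlap between the three subsets.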
To examine the effect of the combinations of input variable groups on the prediction
performance, four combinations of input variable groups (accounting, construction market, and
macroeconomic variables) were used, as shown in Table 1. Model 1 tested the prediction
performance of the accounting variables alone. Models 2 and 3 added construction market and
macroeconomic variables, respectively, to the accounting variables. Model 4 examined prediction
performance by adding both construction market and macroeconomic variables to the accounting
variables.

Table 1. The Combinations of Input Variable Groups

Models    Input Variable Groups (size)
Model 1   Accounting variables (12)
Model 2   Accounting variables (12) + Construction market variables (3)
Model 3   Accounting variables (12) + Macroeconomic variables (3)
Model 4   Accounting variables (12) + Construction market variables (3) + Macroeconomic variables (3)

Results
The performance of the prediction models was measured in terms of accuracy and F1-score.
Accuracy is defined as the number of correctly classified data elements among the total number
of test data elements, and the F1-score is the harmonic mean of precision and recall. As can be
seen in Fig. 2, the accuracy and F1-score curves for the validation and test datasets confirm that
overfitting of the LSTM RNN models does not occur.
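Both metrics can be computed directly; the predictions below are illustrative, with class 1 taken as the business failure class:

```python
import numpy as np

def accuracy(y_true, y_pred):
    # Fraction of correctly classified test elements
    return np.mean(y_true == y_pred)

def f1_score(y_true, y_pred, positive=1):
    # Harmonic mean of precision and recall for the positive (failure) class
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative labels: 1 = business failure, 0 = healthy
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
acc = accuracy(y_true, y_pred)    # 6 of 8 correct -> 0.75
f1 = f1_score(y_true, y_pred)     # precision = recall = 0.75 -> F1 = 0.75
```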
The experiment results are shown in Table 2. Model 4 achieved the best accuracy, 0.971, on the
test dataset, followed by Model 3 (0.965) and Model 2 (0.963). Model 1 showed the worst
accuracy, 0.941. Similar to
accuracy results, the best model in terms of F1-score was Model 4 (0.971), while
Model 1 showed the worst prediction (0.941). The F1-score of Model 3 and Model 2 achieved
0.965 and 0.963, respectively. The results of this study indicate that adding construction market
and macroeconomic variables to accounting variables can increase the performance of business
failure prediction. In addition, macroeconomic variables were found to have a slightly higher
impact on prediction performance than construction market variables.

Fig. 2. Prediction performance of LSTM RNN models for each epoch: (a) Accuracy curve
for validation data; (b) F1-score curve for validation data; (c) Accuracy curve for test data;
and (d) F1-score curve for test data

Table 2. Comparison of the Prediction Performance
Models Accuracy F1-Score
Model 1 0.9414 0.9416
Model 2 0.9633 0.9634
Model 3 0.9657 0.9654
Model 4 0.9710 0.9712

CONCLUSION
This study proposed a business failure prediction model for construction contractors using an
LSTM RNN and employed four combinations of input variable groups including accounting,
construction market, and macroeconomic variables. The prediction performance results
measured by both accuracy and F1-score showed that adding construction market and
macroeconomic variables to accounting variables could increase the performance of business
failure prediction. It was also found that macroeconomic variables had a slightly higher impact
on business failure prediction than construction market variables. This study confirms that it is
important to select not only techniques but also variables to improve the performance of a
business failure prediction model. The results of this study are expected to be useful
references for both researchers and practitioners to develop business failure prediction models of
construction contractors. Deriving optimal input variables remains an interesting topic for
further study. In addition, this study defined business failure using the year before delisting in
order to predict the possibility of business failure within one year. Since many construction
projects require a construction period longer than one year, it would be necessary for future
business failure prediction models to cover a relatively long construction project period.
ACKNOWLEDGMENT
This research was supported by the Basic Science Research Program through the National
Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future
Planning (No. 2015R1A5A1037548).
REFERENCES
Adeleye, T., Huang, M., Huang, Z., and Sun, L. (2013). “Predicting Loss for Large Construction
Companies.” Journal of Construction Engineering and Management, 139(9), 1224–1236.
Arditi, D., Koksal, A., and Kale, S. (2000). “Business failures in the construction industry.”
Engineering, Construction and Architectural Management, 7(2), 120–132.
Ashuri, B., Shahandashti, S. M., and Lu, J. (2012). “Empirical tests for identifying leading
indicators of ENR Construction Cost Index.” Construction Management and Economics,
30(11), 917–927.
Bal, J., Cheung, Y., and Wu, H. C. (2013). “Entropy for business failure prediction: An
improved prediction model for the construction industry.” Advances in Decision Sciences,
2013.
Chen, J. H. (2012). “Developing SFNN models to predict financial distress of construction
companies.” Expert Systems with Applications, 39(1), 823–827.
Cheng, M., and Hoang, N. (2015). "Evaluating contractor financial status using a hybrid fuzzy
instance based classifier: Case study in the construction industry." IEEE Transactions on
Engineering Management, 62(2), 184-192.
Cheng, M. Y., Hoang, N. D., Limanto, L., and Wu, Y. W. (2014). “A novel hybrid intelligent
approach for contractor default status prediction.” Knowledge-Based Systems, 71, 314–321.
Heo, J., and Yang, J. Y. (2014). “AdaBoost based bankruptcy forecasting of Korean construction
companies.” Applied Soft Computing Journal, 24(2014), 494–499.
Horta, I. M., and Camanho, A. S. (2013). “Company failure prediction in the construction
industry.” Expert Systems with Applications, 40(16), 6253–6257.
Kangari, R. (1988). "Business failure in construction industry." Journal of Construction
Engineering and Management, 114(2), 172-190.
Lam, K. C., Palaneeswaran, E., and Yu, C. yun. (2009). “A support vector machine model for
contractor prequalification.” Automation in Construction, 18(3), 321–329.
Russell, J. S., and Zhai, H. (1996). "Predicting contractor failure using stochastic dynamics of
economic and financial variables." Journal of Construction Engineering and Management,
122(2), 183-191.
Sang, J., Ham, N.-H., Kim, J.-H., and Kim, J.-J. (2014). “Impacts of Macroeconomic
Fluctuations on Insolvency: Case of Korean Construction Companies.” Journal of
Management in Engineering, 30(5), 05014009.
Shahandashti, S. M., and Ashuri, B. (2013). “Forecasting Engineering News-Record
Construction Cost Index Using Multivariate Time Series Models.” Journal of Construction
Engineering and Management, 139(9), 1237–1243.
Tsai, L. K., Tserng, H. P., Liao, H. H., Chen, P. C., and Wang, W. P. (2012). “Integration of
accounting-based and option-based models to predict construction contractor default.”
Journal of Marine Science and Technology (Taiwan), 20(5), 479–484.
Tserng, H. P., Chen, P.-C., Huang, W.-H., Lei, M. C., and Tran, Q. H. (2014). “Prediction of
default probability for construction firms using the logit model.” Journal of Civil
Engineering and Management, 20(2), 247–255.
Tserng, H. P., Liao, H.-H., Jaselskis, E. J., Tsai, L. K., and Chen, P.-C. (2012). “Predicting
Construction Contractor Default with Barrier Option Model.” Journal of Construction
Engineering and Management, 138(5), 621–630.
Tserng, H. P., Lin, G. F., Tsai, L. K., and Chen, P. C. (2011). “An enforced support vector
machine model for construction contractor default prediction.” Automation in Construction,
20(8), 1242–1249.
Tserng, H. P., Ngo, T. L., Chen, P. C., and Quyen Tran, L. (2015). “A grey system theory-based
default prediction model for construction firms.” Computer-Aided Civil and Infrastructure
Engineering, 30(2), 120–134.
Identifying Patterns in Design-Build Projects in Terms of Project Cost Performance
Yunping Liang1; Baabak Ashuri, M.ASCE2; and Wei Sun3

1School of Civil and Environmental Engineering, Georgia Institute of Technology, 790 Atlantic
Dr. NW, Atlanta, GA 30332. E-mail: [email protected]
2School of Building Construction and School of Civil and Environmental Engineering, Georgia
Institute of Technology, 280 Ferst Dr., Atlanta, GA 30332-0680. E-mail: [email protected]
3College of Civil Engineering, Tongji Univ., 1239 Siping Rd., Shanghai, China 200092. E-mail:
[email protected]
ABSTRACT
Statistical studies indicate that the design-build (DB) project delivery system has better
overall project cost performance compared to the traditional design-bid-build (DBB) project
delivery system. The project cost performance of DB projects could be further improved if
project managers had a better understanding of the budget implications of their specific project
attributes. Existing relevant studies are either inconclusive or only explore project characteristics
in isolation. The overall objective of this research is to comprehensively utilize data mining and
statistical analysis to identify common patterns associated with project attributes and respective
interaction effects in terms of cost performance in design-build projects. Several control factors,
such as evolving requirements from owners, unforeseen conditions, attributes of project team,
and major project characteristics are taken into account. The analysis shows that owner-directed
changes and project managers’ previous experience, particularly the lack thereof, are risk factors
having significant negative impacts on cost performance. Progressive design-build and
metropolitan locations are project attributes associated with best cost performance. Interaction
effects of project attributes also have statistically significant impacts on cost performance. The
findings of this research contribute to empirical knowledge related to cost performance and assist
project managers in addressing project issues to limit their negative impact on cost performance.
INTRODUCTION
Design-Build (DB) is a delivery method in which the owner manages only one contract
incorporating design and construction. Collaboration between design and construction from the
very beginning enables DB projects to have generally better cost performance over traditional
delivery methods. Due to this advantage of DB, nearly half of all the projects in the U.S. are
delivered using DB delivery method these days (DBIA 2014) and moreover, the market share of
DB is predicted to grow by 18% by 2021 (FMI 2018).
In response to the increase of DB projects, researchers have performed several studies that
have demonstrated support for the hypothesis that DB projects, in general, have better cost
performance over projects delivered in traditional methods (e.g., Rosner et al. 2009; Shrestha et
al. 2012; Minchin et al. 2013; Carpenter and Bausman 2016; Shrestha and Fernane 2017). While
the original purpose of these studies was to demonstrate the cost performance advantages of DB,
these studies also found that cost overrun is still common in DB projects.
The fact that cost overrun is common in DB projects suggests that investigating specific
project attributes and their budget implications might provide useful information for decreasing
cost overrun. Rather than simply exploring conclusive causality between project attributes and
cost performance, we conducted pattern analysis of project attributes and cost performance.
Empirically oriented pattern analysis provides practitioners with intuitive but useful knowledge
on complex project management to support risk management and decision making in relation to
cost performance.
LITERATURE REVIEW
Existing studies normally identify patterns based on the analysis of one project feature. For
example, El Wardani et al. (2006) analyzed the correlation between procurement method and
project performance based on 76 cases located in the United States. The study reveals that
qualification-based (QB) selection leads to the lowest cost overrun. Chen et al. (2016b) focused
on contract method and found that the guaranteed maximum price (GMP) contract performs
better in saving cost than the more frequently used lump sum (LS) contract. Several studies
consider
multiple factors in assessing performance of design-build projects. Creedy et al. (2010) used
multivariate regression to investigate the correlation of the cost overrun with project performance
risk and project features. The authors showed that cost overrun is due more to uncertainty
(uncontrollable) than to risk (controllable). It is worth noting that the developed model was
found to be statistically weak. Chen et al. (2016a) investigated the cost performance associated
with project type, owner type, procurement method, contract method, and LEED level. The
research identified owner type and contract method as the project features with significant
influence on cost performance. The authors did not consider the interaction effects of the four
features involved.
In conclusion, although existing studies have demonstrated the existence of patterns of features
associated with project cost performance, it is necessary to bring more project attributes,
including project features (e.g., owner, procurement method, contract cost) and potential
performance risks (i.e., owner-directed changes and unforeseen conditions), into the discussion
with the help of a workable analysis approach. Moreover, there is also a need to investigate the
interaction effects of project attributes on project cost performance, since not all attributes are
mutually independent.
RESEARCH METHODOLOGY
The overall objective of the research described in this paper is to 1) identify the attributes, i.e.,
project features and potential performance risks, that have statistically significant influence on
the cost performance of DB projects; and 2) identify the interaction effects of different project
attributes that have statistically significant impact on the project cost performance. To provide
adequate answers to these research questions, the following research stages were adopted:
1. Collecting data where each observation has information on multiple project features and
potential performance risks;
2. Preprocessing the unformatted dataset by transforming text information into formatted
variables and calculating necessary computed variables;
3. Using cluster analysis to generate a classification variable from the cost overrun percentage;
4. Using a general linear model (factorial ANOVA) and one-way ANOVA to identify the
attributes that significantly impact the cost overrun percentage.
DATA DESCRIPTION AND PREPROCESSING
Secondary data analysis was conducted on the basis of the Design-Build Project and Award
Database created by the Design-Build Institute of America, which has collected 148 successful DB
projects submitted by stakeholders by 2018. All of these carefully managed projects are
considered to have benefited from applying DB. This shared feature of being well managed
effectively treats project management as a control variable, which helps make the research in this
paper more rigorous. Although project management has been identified as a critical factor
impacting project performance (Lam et al. 2008), it has often been ignored in precedent research
of this kind.
As shown in Table 1, there are 13 variables representing project attributes and one cost
performance indicator in the formatted dataset. The 13 attribute variables, comprising seven
categorical, three binary, and three numerical variables, were either obtained directly or
converted from the text in the original dataset.
Table 1. Variables in Formatted Dataset
Variable Range of value
Project type {Civic, Educational, Governmental, Healthcare, Industrial, Office,
(PT) Transportation non-building}
Contract {Cost plus fee, Guaranteed maximum price(GMP), Lump sum, Target, Unit
method (CM) price, Other}
Procurement {Best value, Qualification-based, Two-step, Sole source, Progressive, Lowest
method (PM) bid}
Owner type (O) {Federal government agency, State government agency, Local government
agency, Military, NPO corporation, Private corporation, University}
Design-Builder {Three types of design-builder classified by Vashani et al. (2016) on the basis of
(D) market share of design-builders}
Owner-directed {0, 1} Note: 0= no, 1= yes.
changes (C)
Unforeseen {0, 1, 2, 3, 4, 5, 6, 7}
conditions (U) Note: 0= NA; 1= unforeseen animal/environmental protection task; 2=
unforeseen inclement weather; 3= unforeseen hazard materials on site; 4=
unforeseen underground/site condition; 5= unforeseen delay associated with
subcontractors/suppliers; 6= unforeseen delay associated with government
agencies; 7= unforeseen market changes (including code revisions)
Start-on-time {0, 1} Note: 0= no, 1= yes
(S)
New-built (N) {0, 1} Note: 0= no (renovation), 1= yes (new-built)
Lack-of- {0, 1} Note: 1= if the project is marked as unprecedented in the database, 0=
precedent (L) otherwise
Location factor {Coming from RS Means, continuous value, min=0.66 and max=1.35}
(LF)
Contract price Note: Integer
(P)
Contract Note: Integer
duration (D)
Cost overrun Note: cost performance indicator; maximum 74.03%, minimum -31.71%, mean
percentage 5.08%
(COP)
COST PERFORMANCE ANALYSIS
K-means clustering is used to assign the cost performance indicators to several clusters for
further analysis. With a manually specified k and randomly selected initial centroids, the
algorithm iterates between two steps (i.e., data assignment and centroid update) to assign each
data point to the cluster with the nearest center. In the first step, each data point is assigned to
its nearest centroid based on the squared Euclidean distance; in the second step, each centroid is
recomputed as the mean of the data points assigned to its cluster. The algorithm iterates between
the two steps until a stopping criterion is met (i.e., no data points change clusters, the sum of the
distances is minimized, or some maximum number of iterations is reached). To specify the
required number of clusters k, the elbow method is used by calculating the sum of Euclidean
distances between data points and their corresponding centroids. As shown in Fig. 1, the
reduction in the sum of distances drops significantly beyond k = 7, which indicates that seven
clusters is a reasonable choice for the dataset in this research.
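The clustering and elbow procedure can be sketched in NumPy with the standard mean-based centroid update; the synthetic cost overrun percentages below are illustrative, not the study's data:

```python
import numpy as np

def kmeans(X, k, n_iter=100, rng=None):
    # Plain K-means: alternate the assignment step and the centroid update
    # (mean of each cluster) until assignments stabilize.
    rng = rng or np.random.default_rng(0)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)                  # assign to nearest centroid
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break                                   # stopping criterion met
        centroids = new
    inertia = ((X - centroids[labels]) ** 2).sum()  # within-cluster sum of squares
    return labels, inertia

# Elbow method: compute the within-cluster sum of distances for a range of k
# and look for the bend where further increases in k yield little reduction.
cop = np.random.default_rng(3).normal(5.0, 10.0, size=(148, 1))  # toy COP values
inertia_by_k = {k: kmeans(cop, k)[1] for k in range(1, 10)}
```

Plotting `inertia_by_k` against k gives the elbow curve; the bend marks the chosen number of clusters.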
Figure 1. Sum of distance within clusters

Figure 2. Interaction of project type and procurement method
A factorial ANOVA, with the null hypothesis that the average cost performance of projects with
different categorical or binary project attributes is the same, was then conducted for pattern
analysis. The analysis results indicate that project type, procurement method, owner type, and
owner-directed changes are four attributes with statistically significant impacts on cost overrun
percentage (Table 2). As there are more than two groups in project type, procurement method,
and owner type, post-hoc tests were conducted to specify where the differences occurred
between groups. The results of the post-hoc tests indicate that two-step procurement has no
significant mean difference from any other group within procurement methods. Similarly, federal
government agency and military are two owner-type groups whose apparent differences
correspond to Type I errors in the ANOVA test. Mean cost overrun percentages within the
project attributes that are statistically significant are listed in Table 3. The statistical analysis
results show that: 1) progressive design-build projects and projects located in metropolitan areas
have the best cost performance; and 2) owner-directed changes, government-owned projects, and
unprecedented projects are attributes that have negative implications for project cost
performance. Moreover, some interaction effects have been identified for the first time in studies
of this kind. If the interaction effect of two project attributes is significant, the impact of either
attribute on cost performance is influenced by the other. For example, as shown in Fig. 2,
although projects procured by sole source procurement have overall worse cost performance
than projects procured by best value selection or qualification-based selection, government-
owned projects are an exception.
Table 2. Factorial ANOVA Test for Differences within Project Attributes
Variables PTa CM PMa Oa D S N La
p 0.000 0.856 0.033 0.034 0.280 0.056 0.793 0.007
Variables Ca U-1b U-2b U-3b U-4b U-5b U-6b U-7b
p 0.024 0.285 0.746 0.865 0.328 0.458 0.158 0.663
Levene's Test: p = 0.204. Retain the null hypothesis: the error variance is equal across groups.
a Significance level at 0.05.
b Because a project may have more than one unforeseen condition, this variable was transformed
into binary variables in the analysis; "U-2 = 1" means U = 2.
Table 3. Project Attributes with Statistically Significant Impacts on COP

Project attribute            Mean COP (%)
Project type
  Civic                      1.625
  Educational                4.783
  Governmental               11.250
  Healthcare                 0.385
  Industrial                 4.000
  Office                     9.167
  Transportation             7.400
Owner
  Local government agency    3.135
  NPO corporation            9.333
  Private corporation        3.500
  State government agency    10.696
  University                 4.440
Procurement method
  Best value                 8.188
  Lowest bid                 12.000
  Progressive                1.200
  Qualification-based        2.912
  Sole source                1.917
Owner-directed change
  No                         2.898
  Yes                        6.989
Lack-of-precedent
  Yes                        7.308
  No                         5.365
With the obtained clustering results, one-way ANOVA tests were conducted with the null
hypothesis that there is no statistically significant difference in mean values of continuous
project attributes across different cost overrun clusters. The results indicate that: 1) projects
assigned to different cost overrun clusters have statistically significant differences in location
factors; and 2) project size, indicated by contract cost and contract duration, has no statistically
significant impact on the cost performance of projects (see Table 4). Table 5 provides detailed
mean location factors for the different cost overrun clusters, which indicate that projects located
in places with higher location factors are more likely to end up with as-expected actual project
costs.

Table 4. One-way ANOVA Test for Differences within Cost Overrun Clusters
Variables           LFa    P      D
p (Levene's Test)   0.133  0.280  0.331
p (ANOVA)           0.019  0.811  0.917
a Significance level at 0.05.
Table 5. Cost Overrun Clusters with Statistically Significant Different Location Factors
Cluster                2     3     4     5     6
Mean location factor   0.91  0.98  1.01  0.99  0.89
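The variance-homogeneity check and group comparison reported in Tables 4 and 5 follow a standard SciPy workflow; the cluster samples below are synthetic, using the Table 5 means as illustrative targets with an assumed spread and sample size:

```python
import numpy as np
from scipy import stats

# Synthetic location factors for three cost overrun clusters; the means follow
# Table 5, while the spread (0.08) and group size (30) are assumptions.
rng = np.random.default_rng(11)
clusters = [rng.normal(mu, 0.08, size=30) for mu in (0.91, 0.98, 1.01)]

# Levene's test checks the equal-variance assumption before running ANOVA
w_stat, p_levene = stats.levene(*clusters)

# One-way ANOVA: H0 is that mean location factors are equal across clusters
f_stat, p_anova = stats.f_oneway(*clusters)
```

If Levene's p-value exceeds 0.05 the equal-variance assumption is retained, as in the paper's tables; a small ANOVA p-value then indicates the cluster means genuinely differ.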
DISCUSSION
Data analysis results indicate that progressive design-build has the best cost performance.
According to the definition of DBIA (2017), progressive design-build is an application of
design-build delivered via a stepped, or progressive, process. It comprises two phases (i.e.,
preliminary preconstruction services and final design and construction services) and provides the
owner more control over the project.
The results of the data analysis statistically support the claim that owner-directed project
changes have negative impacts on project cost performance, which has been identified in the
literature (e.g., Ibbs 2012). For both project type and owner type, government-related projects
have a generally higher cost overrun rate. Although the data used in this paper show a similar
pattern to the data used by Chen et al. (2016a), in that government-related projects have
generally higher project contract costs, the authors do not ascribe the reason to project contract
cost as Chen et al. did, because the ANOVA test indicates that project contract cost is not an
attribute with a statistically significant impact on project cost overrun. A closer look at the data
illustrates that government-related projects have a higher average frequency of owner-directed
changes (80.0%) than non-government-related projects (61.9%). In the literature, variations in
expectations are identified as the main reason for owner-directed changes (Williams et al. 2003).
Hsieh et al. (2004) further provided an explanation specific to public works: politics, elections,
early occupation of the newly built facility, or conflicting views held by higher authorities may
trigger changes of this kind.
Effective project management actions and design-builder experience are critical success factors
in DB projects (Lam et al. 2008; Lu et al. 2015). Lack of precedent makes it difficult for the
project management team to conduct management actions successfully. Moreover, working in
unfamiliar conditions increases the chance of encountering unexpected challenges. A
complementary result is that projects located in places with higher location factors are more
likely to have actual expenditures as anticipated. A closer look at the RS Means location factor
table indicates that metropolitan areas, where more projects have been completed and more
experience has been accumulated, consistently have higher location factors than remote areas.
The impact of procurement and contract methods on cost performance is affected by project
type. Existing research treats these three attributes separately; nevertheless, different types of
projects (e.g., governmental vs. non-governmental projects) may differ markedly in how these
methods are applied in practice due to statutes or market environments. For example, as a non-
competitive procurement method, sole source procurement in public projects is strictly confined
by statutes. Justification is required, and a strict review is conducted when sole source
procurement is unavoidable [FTA Circular C 4220.1F]. These legislative processes provide a
different environment for sole source procurement in public projects compared to others.
CONCLUSION AND FUTURE WORK
The findings of this research indicate that six project attributes (i.e., project type, owner type,
procurement method, lack-of-precedent, owner-directed changes, and location) and three
interaction effects (i.e., project type vs. lack-of-precedent, project type vs. procurement method,
and owner type vs. contract method) have statistically significant impacts on project cost
performance. The contributions of this research to the body of knowledge are: 1) improving
understanding of DB project performance by specifically indicating the implications of project
attributes for cost performance; and 2) demonstrating the necessity of considering interaction
effects of project attributes to better capture the widespread entanglement of project attributes.
In future work, the authors will perform more analysis focused on project time performance.
The authors will also explore the patterns in DB projects with a higher probability of
encountering unforeseen working conditions.
REFERENCES
Carpenter, N., and Bausman D. C. (2016). “Project Delivery Method Performance for Public
School Construction: Design-Bid-Build versus CM at Risk.” J. Constr. Eng. Manage.,
142(10): 05016009.
Chen, Q., Jin, Z., Xia, B., and Wu, P. and Skitmore, M. (2016a). “Time and Cost Performance of
Design–Build Projects.” J. Constr. Eng. Manage., 142(2): 04015074.
Chen, Q., Xia, B., Jin, Z., Wu, P., and Hu, Y. (2016b). “Choosing Appropriate Contract Methods
for Design-Build Projects.” J. Manage. Eng., 32(1): 04015029.
Creedy, G. D., Skitmore, M., and Wong, J. K. W. (2010). “Evaluation of Risk Factors Leading to
Cost Overrun in Delivery of Highway Construction Projects.” J. Constr. Eng. Manage.,
136(5), 528-537.
DBIA (Design-Build Institute of America). (2014). “What is Design-Build? A Design-Build
Done Right Primer”. https://dbia.org/wp-content/uploads/2018/05/Primers-What-is-Design-
Build.pdf <Nov. 5, 2018>.
DBIA. (2017). “Progressive Design-Build.” https://dbia.org/wp-content/uploads/2018/05/Primer-
Progressive-Design-Build.pdf <Nov. 16, 2018>.
El Wardani, M. A., Messner, J. I., and Horman, M. J. (2006). “Comparing Procurement Methods
for Design-Build Projects.” J. Constr. Eng. Manage., 132(3): 230-238.
FMI. (2018). “Design-Build Utilization: Combined Market Study.” https://dbia.org/wp-
content/uploads/2018/06/Design-Build-Market-Research-FMI-2018.pdf <Nov. 5, 2018>.
Hsieh, T.Y., Lu, S.T., and Wu, C.H. (2004). “Statistical analysis of causes for change orders in
metropolitan public works.” Int. J. Proj. Manage., 22: 679–686.
Ibbs, C. W. (2012). “Construction change: Likelihood, severity, and impact on productivity.” J.
Leg. Aff. Disp. Resolut. Eng. Constr., 4(3): 67–73.
Lam, E. W., Chan, A. P., and Chan, D. W. (2008). “Determinants of Successful Design-Build
Projects.” J. Constr. Eng. Manage., 134(5): 333-341.
Lu, W., Hua, Y.Y., and Zhang, S.J. (2017). “Logistic regression analysis for factors influencing
cost performance of design-bid-build and design-build projects.” Eng. Constr. Archit.
Manage., 24(1): 118-132.
Minchin, R. E. Jr., Li, X., Issa, R. R., and Vargas, G. G. (2013). “Comparison of Cost and Time
Performance of Design-Build and Design-Bid-Build Delivery Systems in Florida.” J. Constr.
Eng. Manage., 139(10): 04013007.
Rosner, J. W., Thal, A. E. Jr., and West, C. J. (2009). “Analysis of the Design-Build Delivery
Method in Air Force Construction Projects.” J. Constr. Eng. Manage., 135(8): 710-717.
Shrestha, P. P., O’Connor, J. T., and Gibson, G. E. Jr. (2012). “Performance Comparison of
Large Design-Build and Design-Bid-Build Highway Projects.” J. Constr. Eng. Manage.,
138(1): 1-13.
Shrestha, P. P., and Fernane, J. D. (2017). “Performance of Design-Build and Design-Bid-Build
Projects for Public Universities.” J. Constr. Eng. Manage., 143(3): 04016101.
Vashani, H., Sullivan, J., and El Asmar, M. (2016). “DB 2020: Analyzing and Forecasting
Design-Build Market Trends.” J. Constr. Eng. Manage., 142(6): 04016101.
Williams, T., Ackermann, F., and Eden, C. (2003). “Structuring a delay and disruption claim: An
application of cause-mapping and system dynamics.” Eur. J. Oper. Res., 148: 192–204.


Reconstruction of Wind Turbine Blade Geometry and Internal Structure from Point Cloud
Data
Benjamin Tasistro-Hart1; Tristan Al-Haddad2; Lawrence C. Bank3; and Russell Gentry4
1Georgia Institute of Technology, School of Architecture, 145 4th St. NW, Atlanta, GA 30332.
E-mail: [email protected]
2Georgia Institute of Technology, School of Architecture, 145 4th St. NW, Atlanta, GA 30332.
E-mail: [email protected]
3City College of New York, 160 Convent Ave., New York, NY 10031. E-mail:
[email protected]
4Georgia Institute of Technology, School of Architecture, 145 4th St. NW, Atlanta, GA 30332.
E-mail: [email protected]

ABSTRACT
This paper presents a method for the digital reconstruction of the geometry of a wind turbine
blade from a point-cloud model to a polysurface model. The digital reconstruction of the blade
geometry is needed to develop computer models that can be used by architects and engineers to
design and analyze blade parts for reuse and recycling of decommissioned wind turbine blades.
Initial studies of wind-blade geometry led to the creation of an airfoil database that stores the
normalized coordinates of publicly-available airfoil profiles. A workflow was developed in
which these airfoil profiles are best-fitted to targeted cross-sections of point-cloud
representations of a blade. The method for best-fitting airfoil curves is optimized by minimizing
the distance between points sampled on the curve and point-cloud cross section. To demonstrate
the workflow, a digitally-created point-cloud model of a 100 m blade developed by Sandia
National Laboratory was used to test the reconstruction routine.

INTRODUCTION
The production of wind energy worldwide has increased twenty-fold since 2001 (GWEC).
Almost all this energy is generated by three-bladed wind turbines with fiber reinforced polymer
(aka plastic) (FRP) composite wind blades. Blades for turbines installed in the early 2000s are
nearing the end of their 20-year design lives, and the need to recycle the blades has become a
well-recognized problem (Albers et al. 2009; Jensen and Skelton, 2018). In addition, many wind
farm sites are being re-powered, with new and more powerful turbines and blade sets being
installed even though the current turbines have not reached their design lives. Advancements in
turbine technology have rendered these turbines functionally obsolete (Delony 2018). Recycling
of the steel and copper that make up most of the turbine mass is well understood, but the
potential for recycling or re-purposing the composite materials has not been addressed (Liu and
Barlow 2017).
Most blades are constructed from thermoset composites using mostly glass but some carbon
fibers (Mishnaevsky et al. 2017). Currently, material is recycled by shredding the blades; the
shredded polymer composite material is utilized as a partial replacement for coal or natural gas in
fossil power plants, and the ash as feedstock in cement kilns (Job 2013). Efforts to use the
ground waste in the production of new FRP composites show that the inclusion of recycled
materials dramatically decreases the mechanical properties of the material (Beauson et al. 2016).
Chemical and thermal methods are also being investigated. For a review of FRP composite
recycling methods, see Oliveux et al. (2015). Given this context, our research program seeks to
develop re-use applications for wind blades, aiming to use entire blades or large parts of blades
in architectural and civil infrastructure applications, taking advantage of their structural
properties (see www.re-wind.info).

WIND BLADE GEOMETRY


Wind blades range from 9 m to 90 m in length. They are essentially large propellers, whose
exterior surface is created by a series of successively more aerodynamic profiles placed along a
reference axis. Beginning at the root, a circular section (about 10% of the total length) blends
into a series of airfoil sections which taper along their length (about 90% of the total length).
Each profile is located at a specific distance from the start point of the reference axis and
provides material information regarding the layering of the composites and cores as well as
geometric information (Fig. 1) that describes the shape of the blade at that point (Brønsted et al.
2005). Structural loading and aerodynamic performance have an inverse relationship (Fig. 2)
along the length of the blade: the root and transitional region have poor aerodynamic
performance but carry the highest structural loads, while the mid-span and tip have a maximized
lift-to-drag ratio and carry lower structural loads (Schubel and Crossley 2012). Wind blades behave as huge
consisting of spar caps (i.e., flanges) bounded by internal structural webs (Fig. 3) that are also
composed of sandwich panels (Gentry et al. 2018).

Figure 1. Schematic cross-section of wind blade.

Figure 2. Structural performance vs. aerodynamic performance along length of blade.


Figure 3. Internal structural webs of wind blade.

Figure 4. Interface of the NuMAD wind blade design software displaying key geometric
parameters (Arias 2016; Griffith and Ashwill 2011)
Most of the information regarding the geometric and material properties of wind blades is the
intellectual property of the blade designers and manufacturers. However, some of this data can
be acquired using 3D non-contact metrology. These methods are of interest in the re-use context
because low-cost variants such as photogrammetry, laser scanning, and LiDAR can provide
detailed point-cloud models that can be used in robot path planning. In addition, these models do
not infringe on the proprietary rights of the original manufacturers (Siddiqi et al. 2018). Architects and
engineers who work on renovation or retrofitting often use 3D non-contact metrology methods
to create a detailed point-cloud model as the starting point of a project. Much research has been
conducted on developing automated workflows that translate point-cloud models into
integrated volumetric building models, which allow for the necessary flexibility in the design
process (Ochmann et al. 2015).

WIND BLADE GEOMETRY RECONSTRUCTION


Software routines that build representative models of wind blades within CAD packages
commonly used by architects and engineers have been developed (Charalampous et al. 2015). In
this paper, the reconstruction of the wind blade geometry involves the use of an evolutionary
solver to create the complex exterior surfaces from a point-cloud. The resulting blade model is
represented as airfoils at stations along the blade length, in a manner analogous to the process used
in the Numerical Manufacturing and Design Tool (NuMAD), a wind blade design tool developed
by Sandia National Laboratory (Berg and Resor 2012). The use of NuMAD for the design of
wind blades, and the implementation of evolutionary solvers in modern CAD packages is well-
documented (Arias 2016; Rutten 2010). Both methods will be described briefly below.
Numerical Manufacturing and Design Tool (NuMAD): Developed by Sandia National
Laboratory, NuMAD facilitates the structural design of wind blades. In NuMAD, wind blades
are modelled as a series of closed section curves at set intervals, termed stations, along a
reference axis that describe the outer geometry of the wind blade and provide reference points
for the internal geometry. At each station, the user specifies a variety of geometric parameters,
such as the airfoil type, distance from the root, chord length, twist (rotation) of the station, and
normalized x-offset. Figure 4 shows a sample dialog box from NuMAD that displays the
geometric parameters as well as a visualization of part of the blade. These geometric parameters
are significant for the automated reconstruction of the wind blade geometry. Each airfoil profile is
divided into a series of segments that describe both the material properties of the composite
surface and the points for the polyline that creates the internal structural spars.

Figure 5. Transformations used to position the airfoil on the point-cloud cross section.
Evolutionary Solvers: The development of evolutionary solvers in computing is well
documented and has been summarized by Eiben and Smith (2015). Evolutionary solvers work with
populations of individuals that can recombine their characteristics (genomes) to produce a
population with greater fitness, in a manner analogous to biological evolution. The fitness of the
population is directed by maximizing or minimizing a single quantity. Evolutionary solvers
always return a solution, and over time the solution improves, eventually approaching a
maximum or minimum in a given fitness landscape. Applied to the reconstructive modeling of
wind blades, the individuals are known airfoils whose genomes are geometric properties such as
their rotation, x-and-y translation (offset) from the reference axis, and chord length along
the reference axis. Figure 5 summarizes the transformations used to position a known airfoil
within a cross-section of the point-cloud model.
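The Figure 5 transformations amount to a scale-rotate-translate pipeline applied to a normalized airfoil. A minimal sketch follows; the function and parameter names are our assumptions, not NuMAD's:

```python
import math

def place_airfoil(unit_pts, chord, twist_deg, dx, dy):
    """Position a normalized airfoil (chord length 1, leading edge at the
    origin) in the cross-section plane: scale by the chord length, rotate
    by the twist angle, then translate by the offset from the reference
    axis."""
    t = math.radians(twist_deg)
    ct, st = math.cos(t), math.sin(t)
    placed = []
    for x, y in unit_pts:
        xs, ys = x * chord, y * chord                      # scale
        xr, yr = xs * ct - ys * st, xs * st + ys * ct      # rotate
        placed.append((xr + dx, yr + dy))                  # translate
    return placed

# Degenerate two-point "airfoil" (just the chord line) to show the effect:
pts = place_airfoil([(0.0, 0.0), (1.0, 0.0)], chord=2.0, twist_deg=90.0, dx=1.0, dy=0.0)
print(pts)  # [(1.0, 0.0), (~1.0, 2.0)]
```

The evolutionary solver then varies exactly these genome values (chord, twist, and offsets) for each candidate airfoil.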


Figure 6. Partial surface output from the routine.


Proposed Reconstruction Routine: The proposed reconstruction routine developed by the
authors consists of best-fitting a known airfoil profile against a section of the wind blade point-
cloud model in a given sampling plane (specified as a Z offset from the root of the blade). Initial
studies to create point-cloud models from physical artifacts utilized a small quad-bladed wood
propeller, on which the surface-reconstruction routine proved feasible. For the purposes of this paper,
a point-cloud model of the Sandia National Laboratory 100 m prototype blade (SNL-100) was
digitally created so that airfoil profiles selected by the routine could be compared to the actual
profiles of the blade (Griffith and Ashwill 2011). Different resolutions of point-clouds were
tested ranging from 10,000 to 1 million points, and ultimately a point-cloud of 500,000 points
was chosen because of its ability to return errors of less than 100 mm while completing several
thousand iterations.
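Extracting the point-cloud section at a given sampling plane reduces to a tolerance filter on the Z coordinate. A minimal sketch, where the tolerance value and names are our assumptions:

```python
def slice_at_z(points, z0, tol=0.05):
    """Return the XY coordinates of all points whose Z coordinate lies
    within `tol` (metres, assumed) of the sampling plane z = z0. Applied
    to a dense cloud, this is the step that yields the section points
    used for airfoil fitting."""
    return [(x, y) for (x, y, z) in points if abs(z - z0) <= tol]

# Tiny synthetic cloud: two points near z = 29 m, one at z = 30 m.
cloud = [(0.0, 0.0, 29.00), (1.0, 0.5, 29.03), (0.5, 0.2, 30.00)]
print(slice_at_z(cloud, 29.0))  # [(0.0, 0.0), (1.0, 0.5)]
```

With a real 500,000-point cloud, the same filter returns the several hundred section points that the solver compares against each candidate airfoil.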
The airfoil selection process is automated by linking to the routine a database of publicly
available airfoil profiles, created as part of this project. This database contains a selection of
normalized coordinates that describe airfoil profiles created by the National Advisory Committee
for Aeronautics (UIUC 2018), Delft University (Berg and Resor 2012; UIUC 2018; Bertagnolio et al. 2001),
the Aeronautical Research Institute of Sweden (Bjorck 1990), and the Risø-DTU National Laboratories
(Bertagnolio et al. 2001). Each airfoil shape is stored in the database as a set of XY points, with
between 50 and 200 points for each airfoil.
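Because stored airfoils carry between 50 and 200 raw points while the fitting step compares fixed-size samples, a resampling routine is implied. A hedged sketch using linear arc-length interpolation (our own implementation, not the authors' code):

```python
import math

def resample_closed(pts, n=100):
    """Resample a closed outline (list of (x, y) vertices) at n points
    spaced equally by arc length, using linear interpolation between
    vertices. Duplicate consecutive vertices are assumed absent."""
    ring = pts + [pts[0]]                                  # close the loop
    seg = [math.dist(ring[i], ring[i + 1]) for i in range(len(pts))]
    total = sum(seg)
    out, done, i = [], 0.0, 0
    for k in range(n):
        target = total * k / n                             # arc length of sample k
        while done + seg[i] < target:                      # advance to its segment
            done += seg[i]
            i += 1
        f = (target - done) / seg[i]
        (x0, y0), (x1, y1) = ring[i], ring[i + 1]
        out.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
    return out

# A unit square resampled at 8 points: one sample every half edge.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(len(resample_closed(square, 8)))  # 8
```

The same routine can be applied to both the stored airfoil and the point-cloud section so that corresponding samples can be compared pairwise.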
During the runtime of the evolutionary solver, an airfoil profile is selected from the database
and is compared to a point-cloud section in a 2D XY plane. Given the density of the 3D point
cloud (500,000 points), the solver generally identifies between 500 and 700 points with
essentially equivalent Z coordinates. Both the airfoil and point-cloud section curves are sampled
at 100 points equidistant from each other. The fitness value is chosen as the mean distance
between corresponding points on the airfoil and the point-cloud section, and it is minimized by
altering geometric characteristics such as the specific airfoil type, the distance from the root,
chord length, in-plane rotation of the airfoil, and offset from the reference axis. The routine runs
through thousands of iterations over which the fit progressively improves. The routine is
repeated at key points of the blade, such as the transitional region, the plane of maximum chord,
and the mid-span. Once the cross-sections are established, a surface is created between these airfoils.
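The mean-distance fitness and its evolutionary minimization can be illustrated with a toy one-gene example; a real run optimizes the full genome (airfoil type, chord, twist, offsets), so this only sketches the principle:

```python
import math, random

def fitness(curve_pts, section_pts):
    """Mean distance between corresponding sample points on the fitted
    curve and the point-cloud section; this is the quantity minimized."""
    return sum(math.dist(a, b) for a, b in zip(curve_pts, section_pts)) / len(curve_pts)

def circle(radius, n=100):
    """Stand-in 'airfoil': the root section of a blade is circular."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

# Toy (1+1) evolution of a single gene (the radius) against a synthetic
# root section of radius 1.8; illustrative values only.
target = circle(1.8)
rng = random.Random(0)
gene, score = 1.0, fitness(circle(1.0), target)
for _ in range(300):
    trial = gene + rng.gauss(0.0, 0.05)   # mutate the gene
    s = fitness(circle(trial), target)
    if s < score:                          # keep the fitter mutant
        gene, score = trial, s
print(round(gene, 2), round(score, 4))     # converges toward radius 1.8
```

Population-based solvers add recombination and larger populations, but the accept-if-fitter loop above is the core of the fitness-driven search.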


A test case was used to re-create the blade surface of the SNL-100 blade from the point cloud
between Z coordinates of 29 m and 39 m (Figure 6). The routine was run in 6 instances to fit 6
airfoils. In each instance, pairs of points sampled on the airfoil and the point-cloud section were
between 10 mm and 75 mm apart. When the recreated surface is compared to the 3D point
cloud, the mean distance from a given 3D point to the nearest point on the recreated surface,
measured along the shortest normal vector, is 10 mm.

GENERATING THE INTERNAL STRUCTURE OF THE BLADE


Once the external geometry of the wind blade is established, the process advances to generate
the internal structure of the blade. The typical materials and relative thicknesses of the composite
structures within the blade have been described in earlier work by the authors (Gentry et al.
2018). The aerodynamic forces generated on the blade are complex, but the lift component on
the blade generates the flapwise bending moments that drive the design of the root section and
spar cap (Schubel and Crossley 2012). Given the external geometry, and knowledge of the
relative width of the spar cap, derived parametrically from the width of the blade at maximum
chord, the algorithm calculates the effective flapwise moment of inertia at multiple stations along
the reference axis. Given prior knowledge regarding the composite material strength and
modulus and the calculated cantilever bending moment at the section, the thickness of the
primary materials at that location can be established. The secondary structure (blade shell) is a
composite sandwich panel; this structure can be selected based on a rule set with blade length
and maximum chord dimensions as the input variables.
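A minimal version of the inertia/thickness calculation, idealizing the spar caps as two thin flanges a section depth apart; this is our simplification with illustrative numbers, not the authors' exact algorithm:

```python
def spar_cap_thickness(M, width, depth, sigma_allow):
    """Thin-flange estimate of spar cap thickness at one station.
    The two caps are idealized as thin flanges a distance `depth` apart,
    so I = w*t*h^2/2 and the extreme-fibre stress is
        sigma = M*(h/2)/I = M/(w*t*h).
    Solving for the thickness: t = M / (sigma_allow * w * h).
    M: flapwise bending moment (N*m); width, depth (m); sigma_allow (Pa)."""
    return M / (sigma_allow * width * depth)

# Illustrative numbers only: a 10 MN*m moment, 1.2 m cap width, 2.5 m
# section depth, and 200 MPa allowable stress for a glass-FRP laminate.
t = spar_cap_thickness(10e6, 1.2, 2.5, 200e6)
print(round(t * 1000, 1), "mm")  # 16.7 mm per cap
```

Repeating the calculation at multiple stations along the reference axis gives the thickness schedule implied by the calculated cantilever bending moments.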

CONCLUSIONS AND FUTURE WORK


This paper presents an automated routine that reconstructs a polysurface model from point-
cloud representations of wind blades. To demonstrate the reconstruction routine, a digitally-
created point-cloud model of the SNL-100-01 prototype blade was used. In the future, the routine
will be tested with field-produced point-cloud models of wind blades so that the results can be
more quantitatively compared and analyzed. Wind blades with reference axes other than a line
pose a challenge because of the complexities in interpolating a smooth non-linear curve. In
addition, the modeling of the internal structure of wind blades will be refined as its geometric
relationship to the exterior surface geometry becomes better understood.

ACKNOWLEDGEMENTS
This work was conducted under the auspices of a US-Ireland Tripartite Project funded by the
U.S. National Science Foundation (NSF) under grants 1701413 and 1701694;
InvestNI/Department for the Economy (DfE), grant USI-116; Science Foundation Ireland (SFI),
under grant 16/US/3334; and Royal Academy of Engineering (RAE) Distinguished Visiting
Fellowship (DVF) grant DVF1516\4\25. The authors would also like to acknowledge the support
of project PIs Jian Fei Chen (QUB), Paul Leahy (UCC), Lawrence Bank (CCNY) and Russell
Gentry (GT). Any opinions, findings, and conclusions or recommendations expressed in this
material are those of the authors and do not necessarily reflect the views of the funding agencies.

REFERENCES
Albers, H., Greiner, S., Seifert, H., and Kühne, U. (2009). “Recycling of Wind Turbine Rotor
Blades—Fact or Fiction? (Recycling von Rotorblättern aus Windenergieanlagen—Fakt oder
Fiktion?).” DEWI Mag., 34, 32–41.


Arias, F. (2016). NuMAD Modeling and Finite Element Analysis of SNL-100-01 Wind Turbine
Blade Shells, City College of New York, New York, New York.
Beauson, J., Lilholt, H., and Brønsted P. (2014). “Recycling solid residues recovered from glass
fibre-reinforced composites – A review applied to wind turbine blade materials.” J.
Reinforced Plastics & Composites, 10.1177/0731684414537131.
Berg, J. C., and Resor, B. R. (2012). Numerical Manufacturing And Design tool (NuMAD v2.0)
for Wind Turbine Blades: User’s Guide. U.S. Department of Commerce, Springfield, VA.
Brønsted P., Lilholt, H., Lystrup, A. (2005). “Composite materials for wind power turbine
blades.” Annual Reviews of Materials Research, 10.1146/annurev.matsci.35.100303.110641.
Bertagnolio, F., Sørensen, N. N., Johansen, J., Fuglsang, P. (2001). Wind turbine airfoil
catalogue, Risø National Laboratory, Roskilde, Denmark.
Bjorck, A. (1990). Coordinates and Calculations for the FFA-W1-XXX, FFA-W2-XXX and FFA-
W3-XXX Series of Airfoils for Horizontal Axis Wind Turbines, The Aeronautical Research
Institute of Sweden, Stockholm, Sweden.
Charalampous, K. G., Strofylas G. A., Mazanakis G. I., and Nikolos I. K. (2015). “Wind Turbine
Blades Parametric Design using Grasshopper” 8th GRACM International Congress on
Computational Mechanics.
Delony, J. (2018). “The Repowering Mission: Breathing New Life into Our Aging Wind Turbine
Fleet.” Renewable Energy World, < https://www.renewableenergyworld.com> (Oct. 15,
2018)
Eiben, A. E., and Smith, J. E. (2015). “Evolutionary Computing: The Origins.” Chapter 2 in
Introduction to Evolutionary Computing, Springer, Berlin, Heidelberg, 13-24.
Gentry, R., Bank, L. C., Chen, J.-F., Arias, F., and Al-Haddad, T. (2018). “Adaptive Reuse of
FRP Composite Wind Turbine Blades for Civil Infrastructure Construction.” 9th International
Conference on Fiber Reinforced Polymer Composites in Civil Engineering (CICE 2018),
July 17-19, 2018, Paris, France.
Griffith, D. T. and Ashwill, T. D. (2011) The Sandia 100-meter All-glass Baseline Wind Turbine
Blade: SNL-100-00. U.S. Department of Commerce, Springfield, VA.
GWEC (Global Wind Energy Council). “Global Wind Report—Annual Market Update 2017”;
Global Wind Energy Council: Brussels, Belgium, 2017. Available online:
<http://files.gwec.net/files/GWR2017.pdf> (Oct. 15, 2018).
Jensen J. P. and Skelton, K. (2018) “Wind turbine blade recycling: Experiences, challenges and
possibilities in a circular economy.” Renewable and Sustainable Energy Reviews,
10.1016/j.rser.2018.08.041
Job, S. (2013). “Recycling glass fibre reinforced composites – history and progress.” Reinforced
Plastics, 10.1016/S0034-3617(13)70151-6.
Liu, P., and Barlow, C. Y. (2017). “Wind turbine blade waste in 2050.” Waste Management,
10.1016/j.wasman.2017.02.007.
Mishnaevsky Jr. L., Branner, L., Petersen, K., Beauson, J., McGugan, M. and Sørensen, B.F.
(2017). “Materials for Wind Turbine Blades: An Overview.” Materials,
10.3390/ma10111285.
Ochmann, S., Vock, R., Wessel, R., and Klein, R. (2015). “Automatic reconstruction of
parametric building models from indoor point clouds.” Computers & Graphics,
10.1016/j.cag.2015.07.008.
Oliveux, G., Dandy, L., and Leeke, G. (2015). “Current status of recycling of fibre reinforced
polymers: Review of technologies, reuse and resulting properties.” Progress in Materials
Science, 10.1016/j.pmatsci.2015.01.004.
Rutten, D. (2010). “About Grasshopper” Computing Architectural Concepts,
<https://www.aaschool.ac.uk/VIDEO/lecture.php?ID=1212> (July 10, 2018).
Schubel, P., and Crossley, R. J. (2012). “Wind turbine blade design” energies,
10.3390/en5093425.
Siddiqi, M. U. R., Ijomah, W. L., Dobie, G. I., Hafeez, M., Pierce S. G., Ion, W., Mineo, C., and
Macleod, C. N. (2018). “Low cost three-dimensional virtual model construction for
remanufacturing industry.” J. Remanufacturing, 10.1007/s13243-018-0059-5.
UIUC (University of Illinois Urbana-Champaign) Applied Aerodynamics Group. (1996) “UIUC
Airfoil Coordinates Database” University of Illinois Urbana-Champaign, < https://m-
selig.ae.illinois.edu> (June 13, 2018).


Falling Objects Detection for Near Miss Incidents Identification on Construction Site
Chengqian Li, Ph.D.1; and Lieyun Ding, Ph.D.2
1School of Civil Engineering and Mechanics, Huazhong Univ. of Science and Technology, 1037
Luoyu Rd., Wuhan, China. E-mail: [email protected]
2School of Civil Engineering and Mechanics, Huazhong Univ. of Science and Technology, 1037
Luoyu Rd., Wuhan, China. E-mail: [email protected]

ABSTRACT
Falling object accidents occur frequently on construction sites and cause a great number of
casualties every year. In addition to the reported injury accidents, there are still a large number of
near miss incidents related to falling objects on construction sites, since by sheer luck not all
falling object incidents involve injuries. Therefore, identifying near miss incidents and their
causes in time is an effective way to prevent a real injury accident. This paper proposes a deep
learning-based video monitoring method to automatically detect and analyze the probable causes
of near miss incidents (falling objects) on construction sites, which forms an exhaustive case
database from real-time monitoring for safety evaluation, training, and improvement. The
proposed approach develops a comprehensive framework for near miss incident detection and
cause analysis using advanced computer vision methods. The experimental results show that our
approach achieves high performance in both falling object detection and probable cause
classification.

INTRODUCTION
Falling object incidents are the most common and an extremely dangerous scenario contributing
to struck-by accidents, accounting for 30.8% of all cases (Hinze et al., 2005). According to the
report on fatal occupational injuries by the Department of Labor's Bureau of Labor Statistics (BLS),
an average of 239 fatalities per year were associated with falling objects or equipment during
2011-2015 in the U.S., accounting for 5% of the total (Bureau of Labor Statistics, 2016). As for
non-fatal injuries, the Occupational Safety and Health Administration (OSHA) recorded that 52,260
"struck by falling object" injuries happen every year in the U.S. (Cherri et al., 2016). On average,
one worker is injured by falling objects every 10 minutes, and one is killed every 1 or 2 days.
Most research on reducing falling object injuries has concentrated on developing falling
object protection systems. Sari et al. developed a protective structure against falling objects,
which was designed as a frame, a cab (Pai, 2006), or halfpipe assemblies surrounding the workers
beneath the structure. A protective structure is a local shelter and does not cover the entire site.
Moreover, it takes up a great deal of space and hinders the operation of other machinery on site.
Wu et al. proposed a proactive prevention system based on a ZigBee RFID sensor network
against falling objects (Wu et al., 2013). Wu categorized several common falling objects on
construction sites and attached RFID tags to all these potentially falling objects for real-time tracking
and proximity alarms. However, the main construction activities are not considered in that research,
and it is unrealistic to attach an RFID tag to every small object or tool, such as wrenches and
bricks.
This paper proposes an object tracking method to detect falling objects on construction
site. Even if a near miss incident does not cause any injury, it still exposes a serious hidden danger
in the on-site safety management. It prompts a re-examination of the construction environment and
an improvement of workers' safety awareness to prevent a real accident next time.

LITERATURE REVIEW

Identifying Near Miss Incidents to Prevent A Real Accident


Despite the established standards, many workers do not conform to the requirements in actual
work. In particular, fall protection violations ranked first in OSHA's top ten violation lists during
2012-2013, with an average of 7,746 violations (OSHA, 2013); these standards include
protection against falling objects.
Workers who violate the protection standards repeatedly are very likely to trigger falling
object incidents, though not always with injuries. An empirical finding from Heinrich's law
(Heinrich et al., 1980) is that in a workplace, every major accident is bred from 29 minor
accidents that cause minor injuries and 300 near miss incidents that cause no injuries. Heinrich's
law illustrates the disparity in number between major accidents and near miss incidents.

Figure 1. The overall framework of our method: sliding windows are applied to the raw image
streams; a multi-stream network performs spatio-temporal feature extraction and instance
segmentation; and an attention model (encoder, decoder, classifier) identifies the features
associated with the key event for key event detection.


As a result, there are dozens of chances to discover the hidden dangers in field management
before a major accident happens. The common cause hypothesis assumes that near miss
incidents share the same causal pattern as accidents, so near misses are used as precursors of
later and more severe accidents (Alamgir et al., 2009). If the persons responsible for near miss
incidents are identified for strict punishment and safety training (Aksorn et al., 2008), and the
hidden hazards are eliminated in the meantime, a more serious accident that causes heavy
casualties can be avoided.

Automatic Monitoring of Construction Unsafe Behavior


Monitoring approaches for unsafe behaviors can be divided into sensor-based and
vision-based methods. Sensor-based methods are hindered by the following problems: the additional
cost of location equipment, low positioning accuracy, and a limited application range (Wang et
al., 2015). In comparison, vision-based methods show superiority for their wide range, high
accuracy, and use of local surveillance videos. SangHyun et al. proposed a series of motion
capture and action recognition methods for unsafe behavior and ergonomic analysis (Han et al.,
2013; Seo et al., 2014). Teizer et al. used point cloud data to identify fall risks (Wang et al., 2014)
and hazard events in heavy equipment operation (Teizer et al., 2015). However, most of these
methods are traditional handcrafted representation-based approaches, which rely on expert-designed
feature detectors and, owing to their poorer performance, have gradually been replaced by
learning-based approaches (Sargano et al., 2017). In recent years, deep learning has made
breakthroughs in many fields, especially image recognition (LeCun et al., 2015; Schmidhuber,
2015). Compared with other methods, image recognition based on deep Convolutional Neural
Networks (CNNs) offers higher recognition accuracy and faster processing. Therefore, some
scholars have introduced deep learning into construction safety behavior monitoring (Fang et al., 2018).
This paper uses a deep learning-based attention model, a powerful method for recognizing
key actors in multi-person events (Ramanathan et al., 2016), to detect dropping-object behavior.

PROPOSED METHODOLOGY
Figure 1 shows the main steps of the proposed method: first, the sliding window method is
applied to obtain temporal proposals; second, multi-stream features of each frame are extracted;
finally, the multi-stream features are input into attention models, which consist of multiple Bi-
directional Long Short-Term Memory networks (B-LSTMs), to encode and decode the event features.

Figure 2. Image pre-processing by the multi-stream network: a spatial-stream ConvNet
extracts appearance features from the raw frame and a temporal-stream ConvNet extracts
motion features from the optical flow; each feature map is then segmented (by SSD) into
full-frame, person-centric, and object-centric streams before classification.

Pre-Processing of A Frame Based On Multi-Stream Network


To intercept target events from the continuous video streams, we generate temporal
proposals using the sliding window method. The process of dropping an object includes the
action of throwing and the free fall of the object. Assuming that workers are working at heights
of 5 m~50 m, the free fall takes about 1~3 seconds. Considering that the throwing (parabolic)
motion of the worker may take 2 seconds, the length of the whole process is between
3~5 s. Therefore, we slide 4-second windows through all the surveillance videos and try to
classify the window into a negative class or Dropping Object Behavior (DOB).
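The window-length reasoning above follows directly from the free-fall relation t = sqrt(2h/g); a quick check with the assumed working heights:

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def fall_time(height_m):
    """Free-fall time from rest: t = sqrt(2h/g)."""
    return math.sqrt(2 * height_m / G)

# Working heights assumed in the paper: 5 m to 50 m.
t_min, t_max = fall_time(5), fall_time(50)
throw = 2.0  # assumed duration of the worker's throwing motion (s)
print(round(t_min, 1), round(t_max, 1))                   # ~1.0 s and ~3.2 s
print(round(t_min + throw, 1), round(t_max + throw, 1))   # whole event: ~3-5 s
```

The 4-second sliding window thus sits in the middle of the 3~5 s event-duration range.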
Images in a temporal proposal are first pre-processed by a multi-stream network (Singh et al., 2016) before classification. The purpose of pre-processing is to extract and classify feature maps of the images, in two steps. First, spatio-temporal contexts of the raw images are extracted by a dynamic stream and a static stream, which store temporal and spatial information, respectively. Second, the extracted features are segmented and categorized into two classes: global streams (f_t) and local streams (P_t^i and O_t^i). Global streams store the overall features of an image, while local streams store the features of a particular object, such as a worker or a wrench.
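As a toy illustration of this global/local split, the sketch below crops person-centric and object-centric regions out of a frame's feature map. Nested lists stand in for CNN feature tensors, and the function name and box format are invented for illustration:

```python
def split_streams(feature_map, person_boxes, object_boxes):
    """Split one frame's features into a global stream (f_t) and local
    streams (P_t^i for persons, O_t^i for objects) by cropping boxes.
    Boxes are (x, y, width, height) in feature-map cells."""
    def crop(fm, box):
        x, y, w, h = box
        return [row[x:x + w] for row in fm[y:y + h]]

    f_t = feature_map                                    # global stream
    p_t = [crop(feature_map, b) for b in person_boxes]   # person-centric
    o_t = [crop(feature_map, b) for b in object_boxes]   # object-centric
    return f_t, p_t, o_t

frame = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
f_t, p_t, o_t = split_streams(frame, [(0, 0, 2, 2)], [(1, 1, 2, 2)])
```

In the actual pipeline the boxes come from the SSD detector and the optical-flow segmentation, and the crops are taken from ConvNet feature tensors rather than raw pixel grids.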

Feature Encoder and Decoder Based On Attention Model


Figure 3 shows the process of fusing three different types of resources to obtain information about the key event in each frame: the information of the full frame, of all persons, and of all non-person objects. The full-frame information is represented as h_t, calculated by equation (1); a bidirectional LSTM filters useful information out of the past and future frames and incorporates it into the current frame. The information of all persons (h_t^{ip}) and all objects (h_t^{io}) is calculated in the same way.

    h_t = BLSTM_frame(h_{t-1}, h_{t+1}, f_t)    (1)

[Figure 3 diagram: at each frame t-1, t, t+1 the encoder takes the streams f_t, p_t, and O_t; the decoder produces the tracked features c_t^track and d_t^track and the event states h_t^e, which feed a classifier.]
Figure 3. Encoder and decoder process in attention model

Next, the comprehensive worker feature c_t^track, which blends all workers' movements, is calculated. The h_t^{ip} (h_t^{io}) in the equation is only the feature of the i-th worker (object), and not every worker (object) is equally important: only the thrown object and the worker performing the unsafe behavior need more attention and weight. Therefore, instead of simply summing the features of each worker or object, we discriminate between workers according to the type of movement of each worker or object. If the worker or object is related to the target event (the person performing the unsafe behavior or the object being thrown), we give it more attention, i.e., greater weight. Ultimately, the weighted average of all workers' features, which highlights the key worker in the image, is represented by c_t^track, as in equation (3). The weighted average of all objects' features is represented by d_t^track, calculated in the same way as c_t^track.

    c_t^track = sum_{i=1}^{N_t} alpha_t^i * h_t^{ip}    (3)

    alpha_t^i = exp(A(h_{t-1}^{e,i}, h_t^{ip})) / sum_{k=1}^{N_t} exp(A(h_{t-1}^{e,k}, h_t^{kp}))    (4)

where alpha_t^i is the weight produced by an alignment model A (a feedforward neural network) that scores how well the input features of person (object) i, including its action, match the previous event state h_{t-1}^{e,i}, and N_t is the number of persons (objects) at time t. Finally, the full-frame feature h_t, the workers' weighted-average feature c_t^track, and the objects' weighted-average feature d_t^track are all input into a unidirectional LSTM for feature decoding. Equation (5) describes the event state h_t^e, which is calculated from all of this information at time t. For example, when c_t^track indicates that a worker is performing a dropping action and d_t^track indicates that a neighboring object is falling, it can be initially determined that dropping-object behavior occurs in this frame.

    h_t^e = LSTM(h_{t-1}^e, h_t, c_t^track, d_t^track)    (5)

Table 1. Object recognition performance

                            Workers   Scaffold   Steel    Wood      Wrenches   Screwdrivers
                                      Couplers   Pipes    Braces
    Precision               97%       71%        91%      79%       73%        83%
    Recall                  96%       67%        93%      81%       74%        81%
    Recognition Difficulty  Easy      Hard       Easy     Medium    Hard       Medium

Table 2. DOB detection performance

              Throwing    Kicking over   Knocking off   Sweeping off
    Easy      92% (91%)   91% (93%)      76% (75%)      84% (85%)
    Medium    89% (86%)   87% (89%)      74% (76%)      80% (84%)
    Hard      70% (65%)   72% (68%)      58% (52%)      61% (61%)

EXPERIMENT AND RESULT


The evaluation of our method has two parts: object recognition performance from images (these objects are later tested as falling objects) and Dropping Object Behavior (DOB) detection performance from video clips.

The identification of common materials, tools, and other objects on site is the basis for understanding worker behavior. Precision and recall are chosen as the metrics to evaluate the performance of the SSD model, and the test results are shown in Table 1. Object recognition is clearly influenced by two aspects. First, the outline and shape features of an object greatly influence recognition; for example, workers have more distinct outlines and sharper edges than pieces of concrete, and they are detected more accurately. Second, the visual size of an object, which determines its resolution, greatly impacts detectability; for example, workers and wood braces are easier to recognize than fasteners and wrenches. Overall, the recognition accuracy of these objects is adequate despite the resolution and occlusion effects.
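For reference, the precision and recall reported in Table 1 reduce to simple counts of true positives (TP), false positives (FP), and false negatives (FN); the counts below are hypothetical, not the paper's:

```python
def precision_recall(tp, fp, fn):
    """precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical counts: 97 correct worker detections, 3 false positives,
# 4 missed workers -- giving precision/recall close to the Workers column.
p, r = precision_recall(tp=97, fp=3, fn=4)
```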
The test videos were each 1 minute long, during which 1 to 2 workers threw different objects from high places using different actions. Here, 'throwing' refers to throwing the object away with the hands; 'kicking over' refers to striking the object out with the feet; 'knocking off' refers to knocking the object off by accident; and 'sweeping off' refers to sweeping the object off with a broom or other tool. The results of DOB detection are shown in Table 2, from which two conclusions can be drawn. First, the detection of DOB is influenced by the worker's posture during the dropping action: the more obvious and wide-ranging the motion, the better the event detection performance (e.g., throwing and kicking over); conversely, the more covert and smaller the range of motion, the worse the performance (e.g., knocking off). Second, the recognition difficulty of the dropped object has a more significant effect on event detection than the worker's throwing action, especially for scaffold couplers and pieces of concrete.

DISCUSSION AND CONCLUSION


The key factor restricting DOB detection performance is clearly the object recognition algorithm applied to construction materials and tools: if the recognition accuracy for construction objects were greatly improved, the performance of our method would also improve significantly. A feasible solution is to install more high-definition surveillance cameras in a denser arrangement on site, which is expected to increase the pixel coverage of each object by narrowing each camera's field of view. In addition, with the rapid development of object recognition methods, the detection precision and recall of our method will improve when a more efficient alternative to SSD is proposed. Overall, the experimental results demonstrate the feasibility and great potential of our approach for detecting DOB in surveillance video.
The major contribution of this study is that, for the first time, we propose a computer-vision-based solution for identifying dynamic human-object interaction in construction, which could be extended to other similar situations.

Dropping-object behavior is a significant threat to construction safety and one of the important causes of struck-by accidents. Such behavior is essentially a bad habit of workers, which deserves attention and can be corrected through enhanced monitoring. We drew on the latest video recognition research and applied multi-stream features and an attention model in our study. Experimental results show that our method performs well on surveillance video tests and achieves good DOB detection accuracy.

REFERENCES
Occupational Safety & Health Administration (OSHA). (2013), OSHA Releases 2013 Top 10 Most Cited Standards, http://www.oshalawupdate.com/2013/11/04/osha-releases-2013-top-10-most-cited-standards/
Bureau of Labor Statistics (BLS). (2016), National Census of Fatal Occupational Injuries in
2015, https://www.bls.gov/news.release/pdf/cfoi.pdf


Aksorn, T. & Hadikusumo, B. H. W. (2008), Critical Success Factors Influencing Safety


Program Performance in Thai Construction Projects, Safety Science, 46(4), 709-727.
Alamgir, H., Yu, S., Gorman, E., Ngan, K. & Guzman, J. (2009), Near Miss and Minor
Occupational Injury: Does It Share a Common Causal Pathway with Major Injury?,
American journal of industrial medicine, 52(1), 69-75.
Cherri, S. & Argudin, R. (2016), Fall Rescue Plans & Dropped Object Prevention: What Every
Safety Manager Should Know, Professional Safety, 61(5), 38.
Fang, Q., Li, H., Luo, X., Ding, L., Luo, H., Rose, T. M. & An, W. (2018), Detecting Non-
Hardhat-Use by a Deep Learning Method from Far-Field Surveillance Videos, Automation in
Construction, 85, 1-9.
Fang, Q., Li, H., Luo, X., Ding, L., Rose, T. M., An, W. & Yu, Y. (2018), A Deep Learning-
Based Method for Detecting Non-Certified Work on Construction Sites, Advanced
Engineering Informatics, 35, 56-68.
Han, S., Lee, S. & Peña-Mora, F. (2013), Comparative Study of Motion Features for Similarity-
Based Modeling and Classification of Unsafe Actions in Construction, Journal of Computing
in Civil Engineering, 28(5), A4014005.
Heinrich, H. W., Petersen, D. C., Roos, N. R. & Hazlett, S. (1980), Industrial Accident
Prevention: A Safety Management Approach, McGraw-Hill Companies.
Hinze, J., Huang, X. & Terry, L. (2005), The Nature of Struck-by Accidents, Journal of
Construction Engineering and Management, 131(2), 262-268.
LeCun, Y., Bengio, Y. & Hinton, G. (2015), Deep Learning, Nature, 521(7553), 436-444.
Pai, K. N. (2006), Modeling of Rollover Protective Structure and Falling Object Protective
Structure Tests on a Composite Cab for Skid Steer Loaders.
Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K. & Fei-Fei, L. (2016),
Detecting Events and Key Actors in Multi-Person Videos, Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, pp. 3043-3053.
Sargano, A. B., Angelov, P. & Habib, Z. (2017), A Comprehensive Review on Handcrafted and
Learning-Based Action Representation Approaches for Human Activity Recognition,
Applied Sciences, 7(1), 110.
Schmidhuber, J. (2015), Deep Learning in Neural Networks: An Overview, Neural networks, 61,
85-117.
Seo, J. O., Han, S. U., Lee, S. H. & Armstrong, T. J. (2014), Feasibility of Onsite Biomechanical
Analysis During Ladder Climbing, Construction Research Congress 2014: Construction in a
Global Network, pp. 739-748.
Singh, B., Marks, T. K., Jones, M., Tuzel, O. & Shao, M. (2016), A Multi-Stream Bi-Directional
Recurrent Neural Network for Fine-Grained Action Detection, Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, pp. 1961-1970.
Teizer, J., Golovina, O., Wang, D. & Pradhanang, N. (2015), Automated Collection,
Identification, Localization, and Analysis of Worker-Related Proximity Hazard Events in
Heavy Construction Equipment Operation, ISARC. Proceedings of the International
Symposium on Automation and Robotics in Construction, Vilnius Gediminas Technical
University, Department of Construction Economics & Property, pp. 1.
Wang, J., Pradhananga, N. & Teizer, J. (2014), Automatic Fall Risk Identification Using Point
Cloud Data in Construction Excavation, Computing in Civil and Building Engineering
(2014), pp. 981-988.
Wang, J. & Razavi, S. N. (2015), Low False Alarm Rate Model for Unsafe-Proximity Detection


in Construction, Journal of Computing in Civil Engineering, 30(2), 04015005.


Wu, W., Yang, H., Li, Q. & Chew, D. (2013), An Integrated Information Management Model for
Proactive Prevention of Struck-by-Falling-Object Accidents on Construction Sites,
Automation in Construction, 34, 67-74.


Fast Dataset Collection Approach for Articulated Equipment Pose Estimation


Ci-Jyun Liang1; Kurt M. Lundeen2; Wes McGee3; Carol C. Menassa, Ph.D.4; SangHyun Lee, Ph.D.5; and Vineet R. Kamat, Ph.D.6

1Ph.D. Candidate, Laboratory for Interactive Visualization in Engineering, Dept. of Civil and Environmental Engineering, Univ. of Michigan, 2350 Hayward St., 2105 G. G. Brown Building, Ann Arbor, MI 48109-2125. E-mail: [email protected]
2Ph.D. Candidate, Laboratory for Interactive Visualization in Engineering, Dept. of Civil and Environmental Engineering, Univ. of Michigan, 2350 Hayward St., 2105 G. G. Brown Building, Ann Arbor, MI 48109-2125. E-mail: [email protected]
3Fabrication Lab, Taubman College of Architecture and Urban Planning, Univ. of Michigan, 2000 Bonisteel Blvd., Ann Arbor, MI 48109-2069. E-mail: [email protected]
4Sustainable and Intelligent Civil Infrastructure Systems Laboratory, Dept. of Civil and Environmental Engineering, Univ. of Michigan, 2350 Hayward St., 2140 G. G. Brown Building, Ann Arbor, MI 48109-2125. E-mail: [email protected]
5Dynamic Project Management Group, Dept. of Civil and Environmental Engineering, Univ. of Michigan, 2350 Hayward St., 2012 G. G. Brown Building, Ann Arbor, MI 48109-2125. E-mail: [email protected]
6Laboratory for Interactive Visualization in Engineering, Dept. of Civil and Environmental Engineering, Univ. of Michigan, 2350 Hayward St., 2008 G. G. Brown Building, Ann Arbor, MI 48109-2125. E-mail: [email protected]

ABSTRACT
Struck-by accidents are potential safety concerns on construction sites and require robust machine pose estimation. The development of deep learning methods has enhanced human pose estimation, which can be adapted to articulated machines. These methods require abundant data for training, which is challenging and time-consuming to obtain on site. This paper proposes a fast data collection approach to build a dataset for excavator pose estimation. It uses two industrial robot arms as the excavator and the camera monopod to collect data on different excavator poses. The 3D annotation is obtained from the robot's embedded encoders; the 2D pose is annotated manually. For evaluation, 2,500 pose images were collected and trained with the stacked hourglass network. The results showed that the dataset is suitable for training an excavator pose estimation network in a controlled environment, which opens the potential of augmenting the dataset with real construction site images.

INTRODUCTION
The prospect of human-robot collaboration (HRC) on construction sites raises safety concerns (Liang et al. 2018b; You et al. 2018). Unlike HRC in typical industrial settings, a robot on a construction site has to maneuver through an unstructured environment to its next task location. The robot's workspace changes dynamically with its location, which is a challenge for HRC safety. According to ISO standards, HRC safety must be ensured either by stopping the robot before human contact or by regulating force and speed limits (Salmi et al. 2018). Recently developed dynamic safety systems utilize human detection sensors and optical sensors to adjust the robot's speed according to the detected human action and the protective distance. However, the protective distance has to be very large, since the optical sensors only identify the difference between the current and previous frames instead of tracking the robot's exact pose (Salmi et al. 2018). This highlights the need to develop an effective on-site pose estimation system for articulated construction equipment and human workers, as shown in Figure 1.
Vision-based methods can extract an object's pose directly from input data either with markers (marker-based) or without (marker-less) (Liang et al. 2018b). Marker-based methods identify the markers mounted on equipment and estimate the pose from their geometric relations or a marker network (Feng et al. 2018; Liang et al. 2017; Lundeen et al. 2016; Rezazadeh Azar et al. 2015), whereas marker-less methods extract features directly and estimate the pose from them (Liang et al. 2018a; Soltani et al. 2018). Marker-less pose estimation requires only an on-site camera system, which is common on typical construction sites today, or RGB-D cameras (Han et al. 2013, 2014; Han and Lee 2013; Seo et al. 2015). Feature descriptors are one type of marker-less pose estimation method (Chen et al. 2017; Lundeen et al. 2017; Rezazadeh Azar et al. 2013). The recently emerging Convolutional Neural Networks (CNNs) are another type (Andriluka et al. 2014), with improved performance (accuracy and speed) compared with all other vision-based methods, especially for human pose estimation. The majority of human pose estimation methods are 2D-based (Newell et al. 2016), estimating the human pose in 2D pixel-wise coordinates, as shown in Figure 1; this is due to the lack of 3D ground truth posture data (Martinez et al. 2017). For human pose data collection, motion capture systems are primarily used to obtain ground truth human skeleton data in indoor environments (Ionescu et al. 2014), which is difficult to employ for construction equipment in an outdoor environment.
In this study, a fast dataset collection approach for articulated equipment pose estimation is developed and evaluated. This approach collects images of a robotic manipulator in a factory environment as well as images from real construction sites. Both 2D and 3D data are annotated. The performance of the dataset is validated with a state-of-the-art 2D human pose estimation network (Newell et al. 2016) and a 3D human pose estimation baseline network (Martinez et al. 2017), and compared with an IMU-based pose estimation method.

DATA COLLECTION APPROACH


The image dataset is collected with an articulated robotic manipulator outfitted with a
simulated excavator bucket. The dataset is separated into training and testing groups. The 2D and
3D networks are trained by the training group and then evaluated by the testing group.
Dataset Collection Setup: For the dataset collection setup, a KUKA seven degrees-of-
freedom (DOF) robot arm (KUKA KR120) was used to simulate the excavator, and the images
of the robot arm with different poses were captured. Figure 2 illustrates the simulated excavator
in the laboratory. The upper arm represents the excavator stick and the lower arm represents the
excavator boom. A bucket is mounted on the robot arm end-effector for a more realistic
simulation. In order to control the robot as an excavator, the profile of the mounted bucket must
remain perpendicular to the ground level. Thus, only four of the robot joints were moved during
the dataset collection process, and the others were fixed at all times. The robot arm was
controlled to follow trajectories to perform several excavator-like tasks such as digging,
swinging, or unloading. The ground truth of the excavator pose data was acquired from the robot
arm's embedded encoders, including the 6-DOF pose of the robot's end-effector (X, Y, Z, A, B, C) and the angles of all joints (A1, A2, A3, A4, A5, A6).
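For readers unfamiliar with the (A, B, C) orientation angles, the sketch below converts them to a rotation matrix. It assumes the common KUKA convention of successive rotations about Z, Y, and X, i.e. R = Rz(A) * Ry(B) * Rx(C); the paper does not state the convention, so treat this as an assumption and check the controller documentation:

```python
import math

def abc_to_rotation(a_deg, b_deg, c_deg):
    """Rotation matrix for an (A, B, C) end-effector orientation,
    assuming R = Rz(A) * Ry(B) * Rx(C) with angles in degrees."""
    a, b, c = (math.radians(v) for v in (a_deg, b_deg, c_deg))
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    return [
        [ca * cb, ca * sb * sc - sa * cc, ca * sb * cc + sa * sc],
        [sa * cb, sa * sb * sc + ca * cc, sa * sb * cc - ca * sc],
        [-sb,     cb * sc,                cb * cc],
    ]

R = abc_to_rotation(90.0, 0.0, 0.0)  # a pure 90-degree rotation about Z
```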


Figure 1. Illustration of the 2D on-site pose estimation system on a video frame for both
articulated equipment and human workers. Red lines are the estimated pose.

Figure 2. The simulated robotic excavator - robot arm mounted with an excavator bucket.

Figure 3. The camera mounted on the second robot arm to capture the images (red square).
In order to collect images of the simulated excavator, a Point Grey camera was used. The camera was mounted on a second KUKA robot arm in the laboratory, as shown in Figure 3. This not only provided several different camera locations and orientations to increase the variety of the dataset, but also yielded the 6-DOF pose of the camera itself (the end-effector of the camera robot) for further processing. The camera on the second robot arm was triggered by the same controller (a Programmable Logic Controller, PLC) used to control the first robot arm; thus, the captured images and the recorded ground truth pose data were synchronized with each other. In the data collection process, a total of 2,500 images were collected; 2,000 were used as training images and 500 as testing images. Figure 4 shows a set of collected images from the dataset. Each image is 2048x2048 pixels.

Figure 4. A set of the captured images for the excavator dataset with different camera
location and orientation, and excavator pose.
Data Annotation: Data annotation is required to indicate the locations of the excavator's joints in the images as ground truth. The structure of the excavator data annotation follows that of existing human pose dataset annotations: MPII for the 2D pose (Andriluka et al. 2014) and Human3.6M for the 3D pose (Ionescu et al. 2014). In the 2D pose annotation, excavator joint locations were annotated in pixel-wise coordinates, and the scale of the instance was measured with respect to a height of 200 pixels. In the 3D pose annotation, the locations of the excavator's joints were labeled as (X, Y) in pixel-wise coordinates, and Z was the distance from the camera to each joint, calculated from the robot arm's end-effector and joint ground truth data. A bounding box was also labeled to mark the area of the excavator in the image. The annotations were performed in MATLAB and saved as two separate annotation files, one for the 2D pose and the other for the 3D pose. Figure 5 shows an example of an annotated image.

Figure 5. An example of the annotated image. Stars represent the joint locations and the
rectangle represents the bounding box.
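A record in such an annotation file might look like the sketch below. The field names and layout are illustrative, not the paper's actual MATLAB file schema; only the 200-pixel scale convention and the (X, Y, Z) joint format come from the text:

```python
def make_annotation(joints_2d, bbox, joints_z=None):
    """One image's annotation record, loosely following the MPII-style
    layout described in the text (field names are illustrative)."""
    x, y, w, h = bbox
    record = {
        "joints_2d": joints_2d,        # [(x_px, y_px), ...] per joint
        "bbox": bbox,                  # (x, y, width, height) in pixels
        "scale": h / 200.0,            # instance height relative to 200 px
    }
    if joints_z is not None:           # 3D variant: camera distance per joint
        record["joints_3d"] = [(jx, jy, z)
                               for (jx, jy), z in zip(joints_2d, joints_z)]
    return record

rec_2d = make_annotation([(512, 600), (900, 450)], bbox=(400, 300, 700, 800))
rec_3d = make_annotation([(512, 600)], bbox=(400, 300, 700, 800),
                         joints_z=[5200.0])
```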
The 3D ground truth data was acquired from the robot arm's built-in encoders via the PLC. The PLC sent control commands to both robot arms (North and South). The South robot performed a predefined trajectory, such as digging or unloading, while the North robot held its pose to capture images. Several trigger points were set to trigger the camera on the North robot to capture an image, acquire the pose of both robots, and transfer them to a computer. After the South robot finished the entire trajectory, the North robot moved to a different pose and the process was re-run. This increased the variety of the dataset by providing different orientations in the images. The 3D pose of the end-effector was read directly from the robot arm, and the 3D poses of the remaining robot joints were obtained using inverse kinematics.
Sensor-based Pose Estimation: To evaluate the excavator dataset, a sensor-based pose estimation method was used for performance comparison. Four IMU sensors were deployed to measure the angular change of the robot joints. The sensors were placed on the axis of each joint so that they measured the correct angle when the robot changed its pose, and the results were compared with the ground truth joint angles of the robot arm. The 3D pose of each joint could then be calculated by forward kinematics. Since determining the exact location of a joint would require location sensors such as GPS, which were not available in the experiment, the first joint (A1) was aligned with the ground truth A1 joint location, and the other joints were calculated relative to it. The Xsens MTw Awinda wireless motion tracker system was used for the sensor-based method; it contains four motion trackers with embedded IMUs and a wireless receiver to transmit the data. The sensor data was synchronized with the vision-based pose data and the ground truth data so that they could be compared with each other.
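The forward-kinematics step can be sketched for a planar chain: each joint angle is applied relative to the previous link, and the chain is anchored at the (aligned) first joint, mirroring how the remaining joints are computed relative to A1. Link lengths and angles below are illustrative, not the robot's actual parameters:

```python
import math

def planar_fk(base, link_lengths, joint_angles_deg):
    """Forward kinematics of a planar articulated chain. Each angle is
    relative to the previous link; returns joint positions from base
    to tip, anchored at `base`."""
    x, y = base
    theta = 0.0
    points = [(x, y)]
    for length, ang in zip(link_lengths, joint_angles_deg):
        theta += math.radians(ang)
        x += length * math.cos(theta)
        y += length * math.sin(theta)
        points.append((x, y))
    return points

# Two links (boom-like and stick-like; lengths in mm are illustrative):
joints = planar_fk((0.0, 0.0), [5700.0, 2900.0], [30.0, -90.0])
```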

Figure 6. Results of the excavator 3D pose estimation. The left image is the result from 2D
pose estimation and the right image is the 3D result.
Table 1. Results of the average Euclidean distance (mm) between the predicted and the
ground truth joint location.
(mm) 3D Vision-based Sensor-based
Boom-Stick 148.16 84.35
Stick-Bucket 134.22 97.21
Bucket 151.58 99.42

EXPERIMENT RESULTS
The proposed pose estimation dataset was evaluated by comparing the estimated results and
the ground truth, as shown in Figure 6. The left image was the 2D result and the right image was
the 3D result. The dashed line is the vision-based result, the dotted line is the sensor-based result,
and the solid line is the ground truth. The Euclidean distance between the estimated joint location
and the ground truth joint location are used to evaluate the performance, as shown in Table 1.
The average Euclidean distance between the 3D vision-based method and ground truth is 144.65
mm, and between the sensor-based method and the ground truth is 93.66 mm. The result showed
that the error of the 3D vision-based is higher than the sensor-based method. One of the reasons
was that the 3D vision-based method predicted the pose based on the 2D pose estimation result,
wherein the error would accumulate from 2D prediction and decrease the accuracy in the 3D

© ASCE
Computing in Civil Engineering 2019 151

prediction. The reason was that the camera coordinates preprocessing mentioned in (Martinez et
al. 2017) was not applied to the ground truth data because the camera matrix was not determined
in the laboratory dataset. In addition, the occlusion issue also affected the prediction result.
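The evaluation metric is the mean Euclidean distance between predicted and ground truth joint locations; averaging the per-joint errors in Table 1 reproduces the two quoted means:

```python
import math

def mean_joint_error(predicted, ground_truth):
    """Average Euclidean distance (mm) over corresponding joint locations."""
    dists = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(dists) / len(dists)

# Averaging the per-joint errors reported in Table 1:
vision_mean = (148.16 + 134.22 + 151.58) / 3   # 144.65 mm
sensor_mean = (84.35 + 97.21 + 99.42) / 3      # 93.66 mm
```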

CONCLUSION
In this research, a fast dataset collection approach for 2D and 3D articulated construction robot pose estimation was proposed. A KUKA robot arm was utilized to represent an excavator in a factory setting, and a camera on a second robot arm was used to capture the images. The 3D pose was acquired from the robot arm's sensors and the 2D pose was annotated manually. A 3D pose estimation network was evaluated on the dataset, and a sensor-based pose estimation method was implemented for comparison. The results showed that a network trained on the dataset collected by the proposed approach could estimate the excavator's joints, albeit with higher estimation error than the sensor-based method.

ACKNOWLEDGMENTS
The authors would like to acknowledge the financial support for this research received from
the US National Science Foundation (NSF) via grant 1734266. Any opinions and findings in this
paper are those of the authors and do not necessarily represent those of the NSF.

REFERENCES
Andriluka, M., Pishchulin, L., Gehler, P., and Schiele, B. (2014). “2D human pose estimation:
new benchmark and state of the art analysis.” Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), IEEE, Columbus, OH, 3686–3693.
Chen, J., Fang, Y., Cho, Y. K., and Kim, C. (2017). “Principal axes descriptor for automated
construction-equipment classification from point clouds.” Journal of Computing in Civil
Engineering, 31(2), 04016058.
Feng, C., Kamat, V. R., and Cai, H. (2018). “Camera marker networks for articulated machine
pose estimation.” Automation in Construction, 96, 148–160.
Han, S., and Lee, S. (2013). “A vision-based motion capture and recognition framework for
behavior-based safety management.” Automation in Construction, 35, 131–141.
Han, S., Lee, S., and Peña-Mora, F. (2013). “Vision-based detection of unsafe actions of a
construction worker: case study of ladder climbing.” Journal of Computing in Civil
Engineering, 27(6), 635–644.
Han, S., Lee, S., and Peña-Mora, F. (2014). “Comparative study of motion features for
similarity-based modeling and classification of unsafe actions in construction.” Journal of
Computing in Civil Engineering, 28(5), A4014005.
Ionescu, C., Papava, D., Olaru, V., and Sminchisescu, C. (2014). “Human3.6M: large scale
datasets and predictive methods for 3D human sensing in natural environments.” IEEE
Transactions on Pattern Analysis and Machine Intelligence, 36(7), 1325–1339.
Liang, C.-J., Kamat, V. R., and Menassa, C. C. (2018a). “Real-time construction site layout and
equipment monitoring.” Proceedings of the 2018 Construction Research Congress, New
Orleans, LA, 64–74.
Liang, C.-J., Kang, S.-C., and Lee, M.-H. (2017). “RAS: a robotic assembly system for steel
structure erection and assembly.” International Journal of Intelligent Robotics and
Applications, 1(4), 459–476.
Liang, C.-J., Lundeen, K. M., McGee, W., Menassa, C. C., Lee, S., and Kamat, V. R. (2018b).


“Stacked hourglass networks for markerless pose estimation of articulated construction


robots.” Proceedings of the 35th International Symposium on Automation and Robotics in
Construction, Berlin, Germany.
Lundeen, K. M., Dong, S., Fredricks, N., Akula, M., Seo, J., and Kamat, V. R. (2016). “Optical
marker‐based end effector pose estimation for articulated excavators.” Automation in
Construction, 65, 51–64.
Lundeen, K. M., Kamat, V. R., Menassa, C. C., and McGee, W. (2017). “Scene understanding
for adaptive manipulation in robotized construction work.” Automation in Construction, 82,
16–30.
Martinez, J., Hossain, R., Romero, J., and Little, J. J. (2017). “A simple yet effective baseline for
3d human pose estimation.” Proceedings of the IEEE International Conference on Computer
Vision (ICCV), IEEE, Venice, Italy, 2659–2668.
Newell, A., Yang, K., and Deng, J. (2016). “Stacked hourglass networks for human pose
estimation.” Computer Vision – ECCV 2016, Lecture Notes in Computer Science, Springer,
Cham, Amsterdam, Netherlands, 483–499.
Rezazadeh Azar, E., Dickinson, S., and McCabe, B. (2013). “Server-customer interaction
tracker: computer vision–based system to estimate dirt-loading cycles.” Journal of
Construction Engineering and Management, 139(7), 785–794.
Rezazadeh Azar, E., Feng, C., and Kamat, V. R. (2015). “Feasibility of in-plane articulation
monitoring of excavator arm using planar marker tracking.” Journal of Information
Technology in Construction (ITcon), 20(15), 213–229.
Salmi, T., Ahola, J. M., Heikkilä, T., Kilpeläinen, P., and Malm, T. (2018). “Human-robot
collaboration and sensor-based robots in industrial applications and construction.” Robotic
Building, Springer Series in Adaptive Environments, H. Bier, ed., Springer International
Publishing, Cham, 25–52.
Seo, J., Starbuck, R., Han, S., Lee, S., and Armstrong, T. J. (2015). “Motion data-driven
biomechanical analysis during construction tasks on sites.” Journal of Computing in Civil
Engineering, 29(4), B4014005.
Soltani, M. M., Zhu, Z., and Hammad, A. (2018). “Framework for location data fusion and pose
estimation of excavators using stereo vision.” Journal of Computing in Civil Engineering,
32(6), 04018045.
You, S., Kim, J.-H., Lee, S., Kamat, V., and Robert, L. P. (2018). “Enhancing perceived safety in
human–robot collaborative construction using immersive virtual environments.” Automation
in Construction, 96, 161–170.


Emerging Construction Technologies: State of Standard and Regulation Implementation


Ifeanyi Okpala, S.M.ASCE1; Chukwuma Nnaji, Ph.D., A.M.ASCE2; and Ibukun Awolusi, Ph.D., A.M.ASCE3

1Dept. of Civil, Construction, and Environmental Engineering, Univ. of Alabama, PO Box 870205, Tuscaloosa, AL 35487. E-mail: [email protected]
2Dept. of Civil, Construction, and Environmental Engineering, Univ. of Alabama, PO Box 870205, Tuscaloosa, AL 35487. E-mail: [email protected]
3Dept. of Construction Science, Univ. of Texas at San Antonio, 501 W. Cesar E. Chavez Blvd., San Antonio, TX 78207. E-mail: [email protected]

ABSTRACT
In recent times, construction operations have reported enhanced performance as a result of utilizing reality capture technologies, robotics, and the Internet of Things (IoT). However, recent
studies indicate that these technologies could struggle to diffuse across the construction industry
at an acceptable rate. One reason for the predicted stagnation is the lack of standardization and
regulations for these technologies. Given the importance of standards and regulation to
technology adoption and the need to keep construction stakeholders informed with industry
trends, it is essential to synthesize and document the current state of industry standards and
regulations associated with emerging construction technologies. Presently, a few studies have
evaluated the current state of regulations affecting some promising construction technologies.
However, these studies were limited to national regulations and focused on a specific technology
without pointing out important trends, opportunities for interoperability standards, and synergies
across multiple technologies. To fill this gap, a global search was conducted to identify and
synthesize the available standards and regulations governing four promising construction
management technologies—exoskeleton, IoT, LiDAR, and UAVs. Technology implementation
experts from six countries were interviewed to qualitatively analyze the intersection between
standards and regulations and the diffusion of emerging technologies. Results from the study
suggest that although there is a growing development and implementation of technology
standards and regulations in Western Europe and North America, construction-specific standards
and regulations are sparse. In closing, recommendations that could help accelerate the
development of standards and practical regulations that drive adoption are provided.

INTRODUCTION
The built environment is an essential part of the global economy – contributing
approximately 13% of the gross domestic product (GDP) (Schilling 2014). Although considered
a vital contributor to the world’s economy, the performance of the construction industry has not
improved significantly in the recent past (McKinsey 2017). The stagnant project performance
witnessed in the construction industry requires a rethink of how projects are executed in this
industry (Mair 2016). Although unwanted, this stagnation provides an opportunity to
revolutionize how construction infrastructure is assembled. The introduction of advanced technology is expected to lead to considerably enhanced efficiency, economy, resilience, and adaptability, benefitting not just the construction industry but also the society served by its
infrastructure (Mair 2016).
In recent times, construction managers have reported interest in implementing promising technologies such as Exoskeleton, Internet of Things (IoT), Light Detection and Ranging
(LiDAR), and Unmanned Aerial Vehicles (UAVs) (SmartMarket Report 2017). It is recognized
that these emerging technologies (broadly classified as reality capture technologies, robotics, and
IoT) have increasingly begun to find their applications in project design, construction, planning,
and management. Although these technologies hold immense potential to improve performance
across the life cycle of projects, the innate nature of the construction industry (dynamic, resistant
to change and fragmented) is expected to retard the diffusion of these technologies. Furthermore,
the costs of acquiring and operating some of these technologies and the lack of standardization and regulations appear to be primary factors that limit the adoption of technologies in the construction industry (Chan et al. 2017). In Israel, the "Survey Regulations" have facilitated different levels of operable and programmable communication across heterogeneous technologies (digital cameras on UAVs, LiDAR, radar, and mobile mapping systems) and have instigated the development of computer systems to support new quality control requirements and the development of other regulations (Felus et al. 2013).
Currently, the literature on the state of standards and regulations implementation for the
technologies above is limited. Therefore, the present study seeks to assess the current state of
standardization and regulation for these promising technologies with the expectation that such
information will help improve awareness regarding the need for more research on technology
standards and regulations.

EMERGING CONSTRUCTION TECHNOLOGIES


Prior to assessing the state of standards and regulations associated with Exoskeleton, IoT,
LiDAR, and UAVs, background information on the utility of these technologies is provided
below.
Exoskeleton: Fundamentally, the exoskeleton is a mechanism designed to be wrapped
around the limbs of an operator, thereby allowing the replication or enhancement of forces at
body segments. In this arrangement, the human and the mechanical systems (robots) are
inherently coupled, and safety is paramount. These wearable exoskeleton devices are tailor-made
to reduce some of the mechanical stress of manual labor. Exoskeletons were first developed by the military and have gradually found applications in healthcare, manufacturing, and agriculture, industries in which employees also carry and transfer heavy loads and move in a repetitive manner. In the construction industry, professionals are beginning to adopt such robotic structures; Cho et al. (2018), for example, proposed an exoskeleton for improving workers' posture while performing construction tasks. However, the National Institute for Occupational Safety and Health (NIOSH) recently highlighted the need for standards and regulations to guide the implementation of exoskeletons and limit the potential negative impacts of human-robot interactions (NIOSH 2017).
Internet of Things (IoT): The Internet of Things (IoT) encompasses the millions of new Internet-connected devices that have been developed over the past two decades. These include technologies such as sensors, actuators, smartphones, signalization systems, and robots that are being developed to facilitate smart, automated, secure, and sustainable environments and working processes in construction (Kochovski and Stankovski 2014). As construction operations gradually move toward optimized processes, smart application possibilities include site automation, robot-assisted construction, infrastructure monitoring, material management, safety monitoring, and home automation (Kochovski and Stankovski 2014).
Light Detection and Ranging (LiDAR): Remote sensing, in general, refers to any noncontact technique whereby the object space can be observed (Lillesand et al. 2015). Remote sensing is widely regarded as a rapidly advancing technology, driven mainly by imaging sensor developments and the ever-increasing performance of the information infrastructure, including processing, storage capacity, and communication capabilities (Toth and Jóźków 2016).
advancement in sensing and computer technologies, sensors have become more affordable, and
modern remote sensing systems use multiple sensors, including identical and/or different
sensors, such as multiple cameras and/or Light Detection and Ranging (LiDAR) sensors (Asner
et al. 2012; Nagai et al. 2009; Paparoditis et al. 2012).
Unmanned Aerial Vehicles (UAVs): Siebert and Teizer (2014) explained that until recently,
Unmanned Aerial Vehicles (UAV), Unmanned Aerial Systems (UAS), Remotely Piloted
Vehicles (RPV), also often known as drones, were mostly developed and used for military
applications. In recent years, researchers have shown a growing interest in utilizing UAV
systems for diverse non-military purposes. They are reportedly used in forest and agricultural
applications, autonomous surveillance, emergency and disaster management, traffic surveillance
and management, photogrammetry for 3D modeling, remote-sensing based inspection systems,
and many more domains (Siebert and Teizer 2014). This level of utility is mainly due to the fast
speed, high maneuverability, and high safety of UAV systems for capturing important
information (Siebert and Teizer 2014). It is believed that UAVs have become so widespread that it is necessary to regulate their use to prevent illegal acts and accidents.
Although the technologies discussed above are applicable across the life cycle of a project,
the lack of standardization could impair the use of these technologies, thereby influencing their
diffusion across the construction industry.

METHODS
To meet the research goal, a detailed global search on the availability and application of
standards and regulations for using these emerging technologies in the construction industry and
other industries with similar characteristics was conducted. This detailed search included title,
keyword, and abstract search using the two primary databases for synthesizing literature – Web
of Science and Scopus. In addition to these databases, the researchers accessed Google Scholar
to identify relevant technical reports that are typically missing from the aforementioned
databases. In total, the authors reviewed 59 articles from six countries, published between 2000
and 2018. However, due to page limitations, only key articles will be highlighted in this paper.
Thereafter, individuals with experience in utilizing these technologies were interviewed. Given
the specificity of the study, participants were identified using a purposive sampling technique.

STATE OF STANDARD AND REGULATION


First, it is important to distinguish between standards and regulations as they differ
substantially. ‘Formal standards are developed in recognized standardization bodies and they
are voluntary and consensus-driven’ (WTO 2011). Two basic functions of standards are
interface (or compatibility) and quality (or safety). In general terms, a standard’s interface
specifications define how specific products should interoperate with each other or with the end-
users (Masum et al. 2013). In contrast, a standard’s quality specifications define the performance
of the product against quality or safety metrics.
As opposed to standards, regulations are legal restrictions that are enacted by the government and deemed mandatory (Blind et al. 2017). It is believed that well-designed regulations may guide or even force firms to invest in innovative activities, implement innovative processes, or release innovative products (Porter and van der Linde 1995).
Exoskeleton: Recently, the United States National Institute of Standards and Technology
(NIST) acknowledged that exoskeletons, which can dramatically improve and extend human
performance, are becoming a reality for workers in the manufacturing industry, and individuals
with mobility impairments. NIST’s leadership is developing standards that address the safety and
performance of exoskeletons, and to this end, NIST has engaged 23 private companies, 14
universities, and nine government agencies as part of an ASTM committee developing the first
consensus standards for exoskeletons and exosuits (NIST 2018). A report for the Centers for Disease Control and Prevention (CDC) notes that in Europe, a 2015 European Union (EU) research and development project set out to develop "standards for the safety of exoskeletons used by industrial workers performing manual handling activities." This effort intends to develop policies and standards for exoskeleton use in industry. Currently, there is
insufficient data to determine complete safety profiles or health effects for long-term use of
exoskeletons (Zingman et al. 2017). The introduction of standards and regulations will help
ensure that exoskeletons will not have negative long-term health impacts on workers (CDC
2017).
Internet of Things (IoT): A primary concern in the evolution of IoT is the lack of interoperability between devices. Concerning the plethora of sensing network devices under the umbrella of IoT, a robust search by the authors revealed that existing regulations mainly target manufacturing, with no standards or regulations tailor-made for construction engineering applications.
In the US, there are increasing calls for the government to introduce regulations, especially regarding the integrity of the data created, stored, and transmitted by these devices. A recent survey report by Gemalto (2017) revealed that most organizations and consumers believe there is a need for IoT security regulations, and the survey respondents also called for active government involvement. This is believed to be necessary since more than two-thirds of businesses encrypt all data captured or stored via IoT devices (Gemalto 2017).
Electrical and Electronics Engineers (IEEE), through the IEEE Standards Association (IEEE-
SA), developed a list of standards that focused on enabling products with real-world
applications. However, a review of this list revealed that the standards were broad in scope
(IEEE 2018). To increase the adoption and application of IoT in a fragmented, dynamic, and
complex industry such as construction, it is essential to encourage interoperability through the
development of standards that govern the use of construction management-related sensors and
other IoT devices (IEEE 2018). The construction industry could achieve some level of
interoperability by utilizing open standards such as the Open Messaging Interface (O-MI) and the Open Data Format (O-DF).
Recently, Dave et al. (2018) developed and validated a platform for integrating IoT data into
building information modeling (BIM) using open standards. Nevertheless, the researchers
highlighted the need for standardization guidelines for exporting IFC files.
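To make the interoperability argument concrete, the sketch below contrasts a sensor payload that follows a shared, standardized schema with an ad hoc vendor payload. It is illustrative only: the field names and device names are hypothetical and are not taken from O-MI, O-DF, IFC, or any published standard.

```python
# Illustrative sketch only: a minimal "shared schema" check showing why
# interoperability standards matter for IoT devices on construction sites.
# All field names below are hypothetical, not drawn from any real standard.

REQUIRED_FIELDS = {"device_id", "timestamp", "quantity", "unit", "value"}

def conforms(reading):
    """Return True if a sensor payload carries every field that a consuming
    application (e.g., a BIM platform) can rely on without a custom adapter."""
    return REQUIRED_FIELDS.issubset(reading)

# A device following the shared schema is usable by any consumer as-is.
vendor_a = {"device_id": "tiltmeter-01", "timestamp": "2019-06-17T09:00Z",
            "quantity": "tilt", "unit": "deg", "value": 0.4}

# An ad hoc payload forces every consumer to write a bespoke adapter.
vendor_b = {"sensor": "tiltmeter-02", "t": 1560762000, "deg": 0.5}

assert conforms(vendor_a)
assert not conforms(vendor_b)
```

The second payload is not wrong, merely nonstandard; the cost it imposes (one adapter per vendor per consumer) is exactly the duplication of effort that open interoperability standards are meant to eliminate.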
LiDAR: A detailed search of the existing literature indicates that there are no standards or regulations for implementing LiDAR in the construction industry. In Israel, Felus et al. (2013) explained that the field of mapping and geospatial information has regulations covering basic concepts that include the digital environment, quality control, and emerging technologies. For the latter, one goal is to encourage the production of more advanced mapping products. Some of these latest technologies include digital cameras (on satellite, aerial, UAV, or terrestrial systems) and LiDAR. The regulation permits the use of any technology that meets the quality
requirements of a specific mapping level and is certified by the Survey of Israel (SOI). This regulation will also define advanced mapping products that will be used to facilitate planning,
design, construction, facility management, and more. However, the authors could not find any
information on regulation in most countries.
UAVs: Following a detailed search of multiple databases, the authors were unable to identify
any technical or quality standards specific to the construction industry. However, UAV handlers are required to obtain a remote pilot certificate in the United States and in several European countries, such as Spain. In the United States, the Small UAS Rule added a new Part 107 to Title 14 of the Code of Federal Regulations (14 CFR) to allow for routine civil operation of small Unmanned Aircraft Systems (UAS) in the National Airspace System (NAS) and to provide safety rules for those operations.
requirements, and operational limits (FAA 2018). According to Herrmann (2018), the existing
regulations limit the application of UAVs in the construction industry, hence the need to develop more industry-centric regulations.
In 2017, when the European Commission set the alignment of different national UAV legislation as one of its Horizon 2020 goals, it called on the individual member states to adopt measures that would allow for the integration of these systems into the European civil airspace. However, this goal has been postponed until 2050. The Spanish Royal Decree 1036/2017 came into effect in 2018, following the "Strategic Plan for Drones" by the Spanish Ministry of Public Works. This new legislation regulates the civil use of remote-controlled aircraft by private individuals for recreational and leisure purposes (Chamoso et al. 2018). According to the review by Stöcker et al. (2017), nearly one-third of all 195 countries explored had UAV regulatory documents in place as of October 2016, while approximately half did not provide any information regarding the use of UAVs for civil applications. Announcements of pending UAV regulations were found in 15 countries, and in Cuba, Egypt, and Uzbekistan, UAVs are officially banned.
Findings from the global literature search indicate that although there is some interest in the
extended use of these emerging technologies in different countries, robust industry-specific
standards and regulations that have the potential of increasing the diffusion rate of these
technologies are largely non-existent.

PERSPECTIVES OF EXPERIENCED PROFESSIONALS


As part of the study, the researchers interviewed six individuals involved in investigating the
utility and enhancing the implementation of the technologies discussed in the previous sections.
The primary objective of the interview was to generate additional qualitative information that
validates or refutes the findings from the extensive literature review. This additional information
provides further insight into the intersection between standards, regulations and the potential
diffusion of emerging technologies in construction in different countries. In total, the interview
included six individuals from different countries (China [R1], England [R2], Israel [R3], South
Korea [R4], Turkey [R5], and US [R6]) with extensive experience (more than seven years each)
in handling, conducting research and strategic decision making concerning real-life applications
of these technologies in the various phases of the construction life cycle. Questions asked were
focused on determining: a) whether they adhered to any governmental regulation when operating
these technologies; b) whether they conformed to any technical or/and quality standard when
using these tools; c) if the construction industry has adopted any standards or regulation
regarding the technologies; d) their perspective on the impact of standards and regulation on the
diffusion of these emerging technologies.

Following the analysis of each participant's responses, it was observed that opinions were similar to the information contained in the secondary data outlined in previous sections. Below are some insights provided by the participants:
“I am not aware of specific regulations and standards guiding the use of IoT in
the construction industry in the United States. However, the future of IoT in the
construction industry rests on the development of standards and regulations that
encourage interoperability of devices. Without this [standards that encourage
interoperability], manufacturers of these devices cannot effectively scale
production because construction companies will not purchase isolated products
or technologies that require them to ‘reinvent the wheels’” [R6].
Regarding the use of exoskeletons, the participants highlighted that exoskeletons have not
been implemented within the construction space in their respective countries. However, the
participant from China indicated that some companies were exploring the implementation of exoskeletons in offsite construction.
“the use of robotics such as exoskeletons is considered a potential game-changer
in modular construction since it could enhance the productivity of workers when
executing tasks that require human input.” [R1].
“The use of drones in the construction industry could constitute some nuisance
such as distraction. It is essential to develop standards and regulations for the
construction industry that accounts for our unique characteristics [such as the
environment, operation, and human behavior]. This way, contractors and
consultants will have a guide for implementing such technologies in a safe,
ethical, and effective manner” [R4].
Although not presented in this paper, other comments extracted from discussions with the
experts were consistent with the need to encourage the development and implementation of
standards and regulations in order to maximize the potential of these devices. Also, the
participants pointed out that standards and regulations will ensure that these technologies are
used ethically and safely.

DISCUSSION AND CONCLUSION


The primary aim of this paper was to synthesize the current state of industry standards and
regulations associated with four emerging construction technologies - Exoskeleton, IoT, LiDAR,
and UAVs. Results from the study suggest that although several countries – especially in
Western Europe – have developed some standards and regulations for the application of these
technologies, little work focused on developing construction-specific regulations has been done
so far. To foster the standardization and regulation of emerging construction technologies such as
those reviewed in this study, strong efforts should be made to think beyond a single technology
and consider an ecosystem of interoperable technologies.
A huge need exists for standards that facilitate different levels of operable and programmable communication across the heterogeneous technologies reviewed in this study and other emerging construction technologies. Interoperability brings about commitment from diverse stakeholders, which provides the strong credibility signal necessary to advocate for the standardization of emerging construction technologies. As discussed previously, interoperability can foster widespread adoption of open platforms, which could potentially reduce the duplication of effort and investment inherent in creating separate construction technology devices and platforms.
Additionally, modularization of technologies into independent interfacing parts that can then
interact via well-defined interoperability standards can preclude the development of mutually
incompatible solutions by different technology developers (Masum et al. 2013). Construction
technologies can have complexity-reducing modular and interoperable architectures that can
foster innovation from diverse parties involved in technology creation.
Standards generate collective principles and rules capable of facilitating technology implementation. An efficient standard-setting process reinforces convergence to a single standard, which can reduce complexity and cost and enhance the effectiveness of these emerging technologies at the early stage of development. Developing standards as part of a larger consortium involving experts, innovators, developers, and stakeholders from both academia and industry can mitigate risks and avoid inappropriate standards. Another promising solution would be to encourage construction technology manufacturers to produce technology platforms using standards that allow interoperability of essential technologies. A well-implemented standardized platform that encourages interoperability and modularization can reduce barriers to the diffusion of these technologies, improve affordability, and create a vibrant ecosystem of emerging global construction technologies.

REFERENCES
Asner, G. P., Knapp, D. E., Boardman, J., Green, R. O., Kennedy-Bowdoin, T., Eastwood, M., &
Field, C. B. (2012). "Carnegie Airborne Observatory-2: Increasing science data dimensionality
via high-fidelity multi-sensor fusion." Remote Sensing of Environment, https://doi.org/10.1016/j.rse.2012.06.012
Blind, K., Petersen, S. S., & Riillo, C. A. F. (2017). "The impact of standards and regulation on
innovation in uncertain markets”. Research Policy, 46, 249-264.
Chamoso, P., González-Briones, P., Rivas, A., De Mata, F. & Corchado, J. (2018). “The Use of
Drones in Spain: Towards a Platform for Controlling UAVs in Urban Environments”.
Sensors, 18, 1416.
CDC (Centers for Disease Control and Prevention). (2017). "New NIOSH Center to Study Safety
and Health Implications of Occupational Robots."
<https://www.cdc.gov/niosh/updates/upd-10-16-17.html> (Nov. 20, 2018).
Chan, A., Darko, A., & Ameyaw, E. E. (2017). “Strategies for Promoting Green Building
Technologies Adoption in The Construction Industry—An international study.”
Sustainability, 9(6), 969.
Cho, Y. K., Kim, K., Ma, S., & Ueda, J. (2018). “A Robotic Wearable Exoskeleton for
Construction Worker’s Safety and Health”, Construction Research Congress 2018, 19-28.
Dave, B., Buda, A., Nurminen, A., & Främling, K. (2018). “A Framework for Integrating BIM
and IoT through Open Standards.” Automation in Construction, 95, 35-45.
FAA (Federal Aviation Administration). (2018). "FAA UAS Part 107: The Small UAS Rule."
Unmanned Aircraft Systems, <https://www.faa.gov/uas/> (Oct. 30, 2018).
Felus, Y., Keinan, E. & Regev, R. (2013). “Regulations in the field of Geo-Information”
International Archives of the Photogrammetry, Remote Sensing and Spatial Information
Sciences, XL-7/W2.
Gemalto (2017). “The State of IoT Security – Global Survey Report” Gemalto,

<https://www.gemalto.com/m2m/documents/iot-security-report> (Oct. 30, 2018).
Herrmann, M. (2018). “Regulation of Unmanned Aerial Vehicles and a Survey on Their Use in
the Construction Industry.” In Construction Research Congress 2018, 758-764.
IEEE (Institute of Electrical and Electronics Engineers). (2018). "Internet of Things Related
Standards." IEEE. <https://standards.ieee.org/initiatives/iot/stds.html> (Nov. 18, 2018).
Kochovski, P., & Stankovski, V. (2014). "Supporting smart construction with dependable edge
computing infrastructures and applications." Automation in Construction, 85, 182-192.
Lillesand, T., Kiefer, R., & Chipman, J. (2015). "Remote Sensing and Image Interpretation,
Seventh Ed." Wiley.
Mair, R. J. (2016). “Briefing: Advanced Sensing Technologies for Structural Health
Monitoring”. Forensic Engineering. 169, FE2 46-49.
Masum, H., Lackman, R., & Bartleson, K. (2013). "Developing Global Health Technology
Standards: What Can Other Industries Teach Us?" Globalization and Health,
<http://www.globalizationandhealth.com/content/9/1/49> (Oct. 30, 2018).
McKinsey Global Institute (2017). "Reinventing construction through a productivity
revolution." <https://www.mckinsey.com/industries/capital-projects-and-infrastructure/our-insights/reinventing-construction-through-a-productivity-revolution> (Oct. 3, 2017).
NIST (National Institute of Standards and Technology). (2018). "Innovation in Exoskeletons."
NIST. <https://www.nist.gov/industry-impacts/innovation-exoskeletons> (Oct. 30, 2018).
NIOSH (National Institute for Occupational Safety and Health). (2017). "Exoskeletons in
Construction: Will They Reduce or Create Hazards?" NIOSH.
<https://blogs.cdc.gov/niosh-science-blog/2017/06/15/exoskeletons-in-construction/> (Oct. 31, 2018).
Nagai, M., Chen, T., Shibasaki, R., Kumagai, H., & Ahmed, A. (2009). "UAV-Borne 3-D
Mapping System by Multisensor Integration." IEEE Transactions on Geoscience and Remote
Sensing, 47(3), 701-708.
Paparoditis, N., Papelard, J.-P., Cannelle, B., Devaux, A., Soheilian, B., David, N., & Houzay, E.
(2012). “Stereopolis II: a multi-purpose and multi-sensor 3D mobile mapping system for
street visualisation and 3D metrology.” Revue Française de Photogrammétrie et de
Télédétection, 200, 69–79.
Porter, M., & van der Linde, C. (1995). "Toward a New Conception of the Environment-
Competitiveness Relationship." Journal of Economic Perspectives, 9(4), 97–118.
Schilling D. R. (2014). “Global Construction Expected to Increase by $4.8 Trillion by 2020.”
<http://www.industrytap.com/global-construction-expected-to-increase-by-4-8-trillion-by-
2020/1483> (Oct. 1, 2017)
Siebert, S. & Teizer, J. (2014). “Mobile 3D Mapping for Surveying Earthwork Projects Using
An Unmanned Aerial Vehicle (UAV) System”. Automation in Construction, 41, 1-14.
SmartMarket Report. (2017). “Safety Management in The Construction Industry 2017.” Dodge
Data and Analytics, Bedford, MA.
Stöcker, C., Bennett, R., Nex, F., Gerke, M., & Zevenbergen, J. (2017). "Review of the Current
State of UAV Regulations." Remote Sensing, 9(5), 459.
Toth, C., & Jóźków, G. (2016). "Remote sensing platforms and sensors: A survey." ISPRS
Journal of Photogrammetry and Remote Sensing, 115, 22-36.
WTO (World Trade Organization). (2011). "Decisions and Recommendations Adopted by The
WTO Committee on Technical Barriers to Trade Since 1 January 1995." G/TBT/1/Rev.10.
<https://goo.gl/YOKovB> (Oct. 30, 2018).


Zingman, A., Earnest, G. S., Lowe, B. D., & Branche, C. M. (2017). "Exoskeletons in
Construction: Will They Reduce or Create Hazards?" Centers for Disease Control and
Prevention: NIOSH Science Blog,
<https://blogs.cdc.gov/niosh-science-blog/2017/06/15/exoskeletons-in-construction/> (Oct. 30, 2018).


Exploring the Potential of Image-Based 3D Geometry and Appearance Reasoning for Automated Construction Progress Monitoring
Jacob J. Lin, S.M.ASCE1; Jae Yong Lee2; and Mani Golparvar-Fard, Ph.D., M.ASCE3
1Dept. of Civil and Env. Engineering, Univ. of Illinois at Urbana–Champaign, 205 N. Mathews Ave., Urbana, IL 61801. E-mail: [email protected]
2Dept. of Computer Science, Univ. of Illinois at Urbana–Champaign, 201 North Goodwin Ave., Urbana, IL 61801. E-mail: [email protected]
3Dept. of Civil and Env. Engineering and Dept. of Computer Science, Univ. of Illinois at Urbana–Champaign, 205 N. Mathews Ave., Urbana, IL 61801. E-mail: [email protected]

ABSTRACT
The exponential increase in the volume of images and videos captured on construction sites
and the growing availability of building information models (BIM) and schedules with
production-level details has created a unique opportunity to automate how progress is monitored
and reported on construction sites. However, the state-of-the-art methods of automated progress comparison are still in their infancy, largely because these methods either only leverage the geometry of 3D reconstructed scenes to reason about presence, or detect and classify construction material from 2D images without considering geometrical characteristics.
best of our knowledge, this paper is the first to offer a computer vision method that can jointly
reason about geometry and appearance of observed BIM elements in site images and videos to
monitor and report on their state of progress. The new method fuses structure-from-motion
geometrical features together with directional and radial appearance features in a new deep
convolutional neural network (CNN) architecture to detect and classify state of work-in-progress.
Our experimental results show that using geometrical features reduces errors in appearance-
based recognition methods and offers a new opportunity to scale the applicability of automated
progress detection methods to real-world settings.

INTRODUCTION
Our goal is to design a single machine learning model that can jointly reason about geometry
and appearance of observed BIM elements in site images and videos to monitor and report on
their state of progress. Prior automated progress monitoring research either focuses on reasoning about the presence of elements in the 3D environment or on classifying construction material based on the appearance of elements in 2D images. As a result, research focused on element presence is unable to distinguish, for example, a finished concrete surface from the forming stage and cannot accurately report on the state of work-in-progress. On the other hand, methods
that detect and classify construction material from 2D images have primarily been challenged in
their performance due to their inability to reason about geometrical characteristics of their
detected components. With the recent exponential increase in the volume of images and videos
captured on construction sites and the growing availability of Building Information Models
(BIM) with model disciplines at Level of Development (LoD) 350 to 400 and schedules with
production-level details, there is a unique opportunity to leverage this information to extend the
potential of automated progress monitoring. The new method fuses Structure-from-Motion
geometrical features together with directional and radial appearance features in a new Deep
Convolutional Neural Network (CNN) architecture to detect and classify the state of work-in-
progress. This provides the opportunity to address the current time-consuming and labor-
intensive process of progress monitoring.

© ASCE
Computing in Civil Engineering 2019 163
Construction progress monitoring is imperative to keeping a project on schedule and within
budget. However, current practice suffers from several drawbacks in the process of collecting,
analyzing, and communicating work-in-progress status (Figure 1). First, inefficient reporting
mechanisms cause latency in communicating progress information and lead to incorrect
decision making. Current practice still relies heavily on coloring printouts of 2D drawings or
recording daily construction reports (DCRs) to track progress in the field. Even when 3D
models and schedules are accessible, a significant amount of upfront effort is usually required
to align the model with the work breakdown structure before it can be used for work tracking.
In addition, continuous effort is required to maintain the relationship between tasks in the
schedule and elements in the BIM. In the best practice, progress is reported back to an onsite
kiosk, which still requires traveling back and forth between the site and the trailer (Garcia-Lopez
and Fischer 2014; Kamat et al. 2010; Sacks et al. 2013). Second, subjective measurement of
progress causes misunderstanding between parties and leads to wasted resources on the site.
Reported progress is mostly estimated from the experience of the personnel instead of actual
quantities. This makes it difficult for the following trades to estimate their start dates and can
become a serious problem when the site is congested. Third, there is a lack of effective
visualization techniques for communicating current work-in-progress. Utilizing BIM as a
visualization tool to help people understand clashes between different systems in the model
has successfully facilitated the coordination of Mechanical, Electrical, and Plumbing (MEP)
systems. However, little research investigates how visual analytics of progress can improve
the understanding and communication of progress (Tezel et al. 2015), even though visual
feedback on actual progress, obtained by comparing the Reality and Planned models, could
also enhance communication between subcontractors.
To address the limitations listed above, we create a single machine learning model that
measures progress based on both appearance and geometry, providing an opportunity to
automatically measure progress, compare quantities against the 4D BIM, and improve
communication through visualization of progress. The following section provides a thorough
review of the current practice of progress monitoring.

Figure 1. Current practice of progress monitoring is conducted infrequently and
completely manually. While mobile apps and software applications have streamlined the data
collection process, these products still require a significant amount of upfront work, and the
collected data remain questionable due to the subjectiveness of the personnel. Visualization
of progress is then done on 2D drawings, which are not intuitive enough to
communicate the progress.


RELATED WORK
To empower systematic work tracking, research has focused on improving the current
workflow of progress reporting and on automated comparison of the 4D BIM against time-
lapse videos (Abeid et al. 2003; Abeid and Arditi 2002; Golparvar-Fard et al. 2009a; Yang et al.
2015) or against 3D image-based and laser scanning point clouds (Bosché et al. 2013; Bosche
and Haas 2008; Golparvar-Fard et al. 2009b, 2011, 2012; Han and Golparvar-Fard 2014; Son et
al. 2015; Turkan et al. 2012). These methods fall into two main categories: 1) methods that
reason about the physical presence of elements and 2) methods that detect progress based on
the appearance of elements in images. The following sections discuss the state-of-the-art
practice in the industry and the two categories mentioned above.
Industry Practice of Progress Monitoring: The current best practice of construction progress
monitoring is still labor intensive and time consuming. Most construction sites have a
designated project engineer who measures progress changes based on subjective observation
while walking around the site. This process can take up to several hours or even a day. At
best, time can be reduced by utilizing mobile applications and software to facilitate the process
of documenting progress; however, the result still depends on the subjective assessment of the
observer. To efficiently leverage progress information, it can also be linked to the BIM.
However, this linkage requires extra effort to match the work breakdown structure with the
model breakdown structure (Garcia-Lopez and Fischer 2014; Kamat et al. 2010). These
bottlenecks hold construction companies back from fully utilizing progress information to
improve planning, coordination, and communication. As a result, progress reports are
currently only used for schedule updates.
Occupancy-Based Progress Monitoring: Golparvar-Fard et al. (2009, 2011) utilized image-
based 3D point clouds and BIM to reason about the occupancy and visibility of elements, and
proposed a supervised machine learning method to infer construction progress. Bosche and
Haas (2008) and Bosché et al. (2013) compared laser scanning point clouds against BIM to
monitor the progress of Mechanical, Electrical, and Plumbing (MEP) systems. Turkan et al.
(2012) further introduced a method to differentiate operational details of concrete construction
objects such as formwork and rebar. These methods are limited in their ability to detect
operational details and are affected by the occlusion and visibility of elements in the point clouds.
Appearance-Based Progress Monitoring: To address the limitations of occupancy-based
progress monitoring, Han and Golparvar-Fard (2014) introduced a computer vision method to
back-project elements in the point cloud to the corresponding images and extract image patches
to detect the material of the elements. This reasoning enables progress tracking of operational
details and linking to the correct task level. Han et al. (2018) further leveraged geometry
features of the image patches to enhance the accuracy of material recognition. However, these
methods still do not jointly reason about the geometry and appearance of the detected
components within a single model.
To address the drawbacks of the above-mentioned methods, we introduce a new machine
learning model that jointly reasons about occupancy and appearance to measure the state of
work-in-progress.

METHODOLOGY
We first created the visual production model to generate the dataset for the new automated
progress monitoring method. The visual production model includes a production level 4D BIM,
construction site image collections and the image-based point clouds. In this paper, we used a
typical Structure from Motion (SfM) pipeline to generate the 3D image-based point cloud, and
the point cloud is aligned with the 4D BIM by picking at least three points to establish the
corresponding transformation matrix (Lin et al. 2015). Five types of data are generated from the
visual production model to create the dataset: (1) back-projected BIM elements to each camera;
(2) back-projected point cloud mesh to each camera; (3) back-projected point cloud mesh with
the normal vector corresponding to every point to each camera; (4) super pixels generated using
SLIC (Achanta et al. 2010) and (5) occupancy of BIM elements corresponding to the super pixel.
The deep convolutional neural network is inspired by the state-of-the-art object recognition
network developed by the Visual Geometry Group (VGG) (Simonyan and Zisserman 2014)
and extends it with the features described above. Figure 2 illustrates the
overall framework and the following section will discuss the visual production model, dataset
generation and network architecture in detail.
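The alignment step above derives a transformation matrix from at least three user-picked point correspondences between the point cloud and the BIM. A standard way to compute such a rigid transformation is the Kabsch (SVD-based) least-squares fit; the NumPy sketch below is our illustration of that step, not the authors' code (the function name and interface are assumptions):

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~= dst_i
    (Kabsch algorithm). src, dst: (N, 3) arrays of corresponding points, N >= 3."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

With more than three correspondences the extra points simply tighten the least-squares fit, which is why registration tools typically ask for "at least" three.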

Figure 2. The visual production model integrates the 4D point clouds with registered
images and the production-level 4D BIM with the schedule. The construction progress
monitoring dataset consists of five different types of data (back-projected BIM elements,
depth of point cloud mesh/BIM, normals of the point cloud mesh, super pixels, occupancy of
BIM elements) generated from the visual production model. The proposed method outputs
the occupancy and material of each super pixel in the images, which can be linked to the
corresponding BIM elements and visualized using a traffic light metaphor to indicate
progress status.
Visual Production Model: The visual production model includes images and point clouds
that are aligned to the 4D BIM (Figure 2). The point cloud is first transformed into the BIM
coordinates using the transformation matrix generated based on the user inputs of selecting at
least three corresponding points between the point cloud and BIM. The point cloud is generated
through a typical SfM procedure which outputs both the intrinsic and extrinsic camera
parameters such as focal length, radial distortion, camera rotation and camera translation matrix.
Using both the transformation matrix and the camera parameters, all information from the visual
production model can be back-projected to each camera. The point cloud is also converted into a
mesh model using TexRecon (Waechter et al. 2014) to efficiently extract surface normals. Three
different types of images are then generated based on the back-projection: (1) back-projected
BIM elements to each camera; (2) back-projected point cloud mesh to each camera; and (3)
back-projected point cloud mesh with the normal vector corresponding to every point to each
camera. These images, the point clouds, and the 4D BIM integrated with operation-level details
make up the visual production model (Figure 3).
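The back-projection relies on the intrinsic parameters (focal length, radial distortion) and extrinsic parameters (rotation, translation) that the SfM pipeline outputs. A minimal pinhole projection with polynomial radial distortion might look like the following sketch; axis conventions and the exact distortion model vary between SfM tools, so treat this as an assumption for illustration rather than the authors' implementation:

```python
import numpy as np

def project_points(X, R, t, f, k1=0.0, k2=0.0, cx=0.0, cy=0.0):
    """Project world points X (N, 3) into pixel coordinates for a camera with
    rotation R, translation t, focal length f, radial distortion k1/k2, and
    principal point (cx, cy). Returns an (N, 2) array."""
    Xc = X @ R.T + t                                   # world -> camera frame
    x, y = Xc[:, 0] / Xc[:, 2], Xc[:, 1] / Xc[:, 2]    # perspective divide
    r2 = x * x + y * y
    d = 1.0 + k1 * r2 + k2 * r2 * r2                   # radial distortion factor
    return np.stack([f * d * x + cx, f * d * y + cy], axis=1)
```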
Construction Progress Monitoring Dataset: To train the new model for progress
monitoring, we identified features that capture both appearance and geometry. For appearance,
to consider directional and radial appearance features along with texture and color patterns, the
normal vectors calculated from the point cloud mesh model are combined with the CNN
features. Adding the normal features provides geometric information that prevents the material
classification method from ignoring albedo patterns, face orientation, and small surface shape
variations. The CNN features are obtained from the last layer of the trained VGG-16 network.
We acquire the three-dimensional normal vectors from the point cloud mesh model; all the
normal vectors are transformed into the image frame using the camera matrix. For geometry,
depth information of the BIM elements and the point cloud can indicate whether the BIM
elements in the Planned model are present in the corresponding Reality point cloud model. We
extract the depth information of the 4D BIM and the point cloud for each camera and calculate
the average difference between them to represent the occupancy of the BIM elements in the
point cloud. We combine the depth information with the CNN features to identify the presence
of elements. The last step is to apply SLIC segmentation to obtain the super pixels for each
image. In the end, the depth difference between BIM and point cloud and the normal vectors
are generated for each super pixel of each image.
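Given the two back-projected depth images and the SLIC label map, the per-super-pixel average depth difference can be computed as in this illustrative sketch (the absolute difference and the array layout are our assumptions):

```python
import numpy as np

def superpixel_depth_diff(depth_bim, depth_pcl, labels):
    """Mean |BIM depth - point cloud depth| for every super pixel.
    depth_bim, depth_pcl: (H, W) depth images; labels: (H, W) integer SLIC ids.
    Returns {label: mean depth difference}."""
    diff = np.abs(depth_bim - depth_pcl)
    return {int(lab): float(diff[labels == lab].mean())
            for lab in np.unique(labels)}
```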

Figure 3. The dataset is generated based on information from the visual production
model. The top left image is the result of applying SLIC segmentation; each image patch
represents one super pixel. The top right image shows the normal vectors in terms of RGB
values in an image. The bottom left is the depth image of the BIM, and the bottom right is
the depth image of the point cloud.
Image Annotation: Each image is segmented by SLIC into approximately 300-400 super
pixels, and each super pixel is labeled with occupancy and material. The occupancy is
determined by comparing the back-projected BIM element with the corresponding super pixel.
If the BIM element and the super pixel represent the same element, the occupancy is marked as
true; otherwise, it is marked as false. For example, if the super pixel shows a concrete column
and the back-projection also shows the concrete column, the occupancy is true. If the super
pixel shows backfilled soil and the back-projected BIM super pixel shows a grade wall, the
occupancy is marked as false. The material label is based purely on observation of the super
pixel and is classified into 25 different classes: Brick, Cement - Granular, Cement - Smooth,
Concrete, Foliage, Formwork - Gang, Formwork - Wood, Grass, Gravel, Insulation, Marble,
Metal - Grills, Paving, Soil - Compact, Soil - Dirt and Vegetation, Soil - Loose, Steel Beam,
Stone - Granular, Stone - Limestone, Waterproofing Paint, Wood, Rebar, Asphalt, Sky, and
Others.
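As we read it, the annotation rule above reduces to a label comparison; the helper below is a hypothetical encoding of that rule (its name and interface are ours, and the actual annotation compares element identity, which we approximate here by material labels):

```python
def occupancy_label(observed_material, bim_material):
    """Hypothetical annotation helper: a super pixel is marked occupied when
    the element seen in the image matches the BIM element back-projected onto
    it (approximated here by comparing material labels). bim_material is None
    when no BIM element projects onto the super pixel."""
    if bim_material is None:
        return False
    return observed_material == bim_material
```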
Material and Occupancy Classification: The network architecture is shown in Figure 4.
The model detects occupancy and classifies material at the same time. The network is based on
the VGG-16 network, with the addition of the average depth difference between the BIM and
the point cloud and the normal vector of each super pixel as features. The super pixels are all
resized to 224x224x3, and the 1x3 normal vector is normalized. The resized super pixel and the
normalized normal vector are input to the first fully connected layer. In each fully connected
layer, the input and output are both 4096-dimensional, and the last layer outputs 25 dense
features for the final softmax layer to classify the materials. The depth difference is
concatenated with these 25 dense features and passed into another set of fully connected layers
to classify the occupancy. Cross entropy is used as the loss function for both material
classification and occupancy detection. The final output indicates whether the input super pixel
is occupied by any BIM element and what material the super pixel shows.
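For a single super pixel, the two cross-entropy terms can be written out as below; the unweighted sum of the material and occupancy terms is our assumption, since the text does not state a weighting between the two heads:

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # shift logits for numerical stability
    e = np.exp(z)
    return e / e.sum()

def combined_loss(material_logits, material_label, occ_logits, occ_label):
    """Cross entropy of the 25-way material head plus cross entropy of the
    binary occupancy head, for one super pixel."""
    p_mat = softmax(material_logits)
    p_occ = softmax(occ_logits)
    return -np.log(p_mat[material_label]) - np.log(p_occ[occ_label])
```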

Figure 4. The network architecture is similar to the VGG-16 network, with the added
normal feature and the depth difference between the BIM and the point cloud model. The
3D normal vector is normalized and input to the first FC layer. The resulting 25 dense
features are then passed to the final softmax layer to classify the material. For occupancy,
the depth difference is combined with the 25 dense features, passed to another set of FC
layers, and finally classified using the binary output of the last softmax layer.

EVALUATION
The evaluation is done by dividing the dataset into training and testing sets. The dataset
was collected on a real-world construction project with a total of 160 images. Each image is
divided into 300-400 super pixels, and the authors hand labeled 52,800 super pixels with
occupancy and material. The method achieves an accuracy of 87.4% for material recognition,
95.3% for occupancy, and 80.6% for both material and occupancy combined. Some of the
errors in material recognition and occupancy were due to the segmentation algorithm. For
example, “Rebar” elements in the image usually have a background of a different material,
even when directional and radial appearance features are introduced. By comparing the
occupancy and material between the BIM model and the result, we can infer the actual
progress of the project at the operation level and visualize the progress using a traffic light
metaphor on the BIM elements.
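The text does not spell out the traffic-light color rules, so the mapping below is one plausible illustration (our assumption) of how the model output could drive a per-element status:

```python
def traffic_light(expected_material, predicted_occupied, predicted_material):
    """Hypothetical status rule: green = element present with the expected
    material; yellow = present but in an earlier state (e.g. formwork where
    concrete is expected); red = element not detected at all."""
    if not predicted_occupied:
        return "red"
    if predicted_material == expected_material:
        return "green"
    return "yellow"
```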

Figure 5. The left image shows the original image, and the middle image shows the
predicted occupancy and material; progress can then be depicted on the BIM model based
on the model output.
CONCLUSION
This research developed a new machine learning model that addresses the problems of
current appearance- and occupancy-based automated progress monitoring solutions. The
machine learning model uses a single deep CNN to classify occupancy and material based on
image texture, color patterns, radiance, and depth from the visual production model, which
includes daily construction photos, the point cloud, and a 4D BIM integrated with an
operation-level schedule. The method successfully classifies both occupancy and material with
an average accuracy of 80.6%, and the results can be shown in the visual production model
using a traffic light metaphor to indicate progress. The authors plan to investigate the effect of
using different segmentation algorithms in the future.

ACKNOWLEDGEMENT
This material is in part based upon work supported by the National Science Foundation Grant
#1446765. The support and help of the construction team in collecting data and implementing
the work tracking system is greatly appreciated. The opinions, findings, and conclusions or
recommendations expressed are those of the authors and do not reflect the views of the NSF, or
the company mentioned above.

REFERENCES
Abeid, J., Allouche, E., Arditi, D., and Hayman, M. (2003). “PHOTO-NET II: a computer-based
monitoring system applied to project management.” Automation in Construction, 12(5),
603–616.
Abeid, J., and Arditi, D. (2002). “Linking Time-Lapse Digital Photography and Dynamic
Scheduling of Construction Operations.” Journal of Computing in Civil Engineering, 16(4),
269–279.
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., and Süsstrunk, S. (2010). SLIC
Superpixels.
Bosché, F., Guillemet, A., Turkan, Y., Haas, C., and Haas, R. (2013). “Tracking the Built Status
of MEP Works: Assessing the Value of a Scan-vs-BIM System.” Journal of Computing in
Civil Engineering.
Bosche, F., and Haas, C. T. (2008). “Automated retrieval of 3D CAD model objects in
construction range images.” Automation in Construction, article, 17(4), 499–512.
Garcia-Lopez, N. P., and Fischer, M. (2014). “A System to Track Work Progress at Construction
Jobsites.” Industrial and Systems Engineering Research Conference.
Golparvar-Fard, M., Peña-Mora, F., Arboleda, C. A., and Lee, S. (2009a). “Visualization of
Construction Progress Monitoring with 4D Simulation Model Overlaid on Time-Lapsed
Photographs.” Journal of Computing in Civil Engineering.
Golparvar-Fard, M., Pena-Mora, F., and Savarese, S. (2009b). “D4AR- A 4-Dimensional
Augmented Reality model for automating construction progress data collection, processing
and communication.” Journal of ITcon, Special Issue on Next Generation Construction IT:
Technology Foresight, Future Studies, Roadmapping, and Scenario Planning, 14(1),
129–153.
Golparvar-Fard, M., Peña-Mora, F., and Savarese, S. (2011). “Integrated Sequential As-Built and
As-Planned Representation with Tools in Support of Decision-Making Tasks in the AEC/FM
Industry.” Journal of Construction Engineering and Management.
Golparvar-Fard, M., Peña-Mora, F., and Savarese, S. (2012). “Automated model-based progress
monitoring using unordered daily construction photographs and IFC as-planned models.”
ASCE Journal of Computing in Civil Engineering, 10.1061/(ASCE)CP.1943-5487.0000205,
147.
Han, K., Degol, J., and Golparvar-Fard, M. (2018). “Geometry- and Appearance-Based
Reasoning of Construction Progress Monitoring.” Journal of Construction Engineering and
Management, American Society of Civil Engineers, 144(2), 4017110.
Han, K., and Golparvar-Fard, M. (2014). “Multi-Sample Image-Based Material Recognition and
Formalized Sequencing Knowledge for Operation-Level Construction Progress Monitoring.”
Computing in Civil and Building Engineering, I. Raymond and I. Flood, eds., American
Society of Civil Engineers, 364–372.
Kamat, V. R., Martinez, J. C., Fischer, M., Golparvar-Fard, M., Peña-Mora, F., and Savarese, S.
(2010). “Research in visualization techniques for field construction.” Journal of construction
engineering and management, American Society of Civil Engineers, 137(10), 853–862.
Lin, J., Han, K., and Golparvar-Fard, M. (2015). “Model-driven Collection of Visual Data using
UAVs for Automated Construction Progress Monitoring.” Int’l Conference for Computing in
Civil and Building Engineering 2015, Austin, TX.
Sacks, R., Barak, R., Belaciano, B., Gurevich, U., and Pikas, E. (2013). “Kanbim workflow
management system: Prototype implementation and field testing.” Lean Construction
Journal, LCI, 9(1), 19–34.
Simonyan, K., and Zisserman, A. (2014). “Very deep convolutional networks for large-scale
image recognition.” arXiv preprint arXiv:1409.1556.
Son, H., Bosché, F., and Kim, C. (2015). “As-built data acquisition and its use in production
monitoring and automated layout of civil infrastructure: A survey.” Advanced Engineering
Informatics, 29(2), 172–183.
Tezel, A., Koskela, L., Tzortzopoulos, P., Formoso, C. T., and Alves, T. (2015). “Visual
Management in Brazilian Construction Companies: Taxonomy and Guidelines for
Implementation.” Journal of Management in Engineering, American Society of Civil
Engineers, 31(6), 5015001.
Turkan, Y., Bosche, F., Haas, C., and Haas, R. (2013). “Tracking secondary and temporary
concrete construction objects using 3D imaging technologies.” Computing in Civil
Engineering (2013), 749–756.
Turkan, Y., Bosche, F., Haas, C., and Haas, R. (2012). “Automated progress tracking using 4D
schedule and 3D sensing technologies.” Automation in Construction, 22(0), 414–421.
Waechter, M., Moehrle, N., and Goesele, M. (2014). “Let There Be Color! Large-Scale
Texturing of 3D Reconstructions BT - Computer Vision – ECCV 2014.” D. Fleet, T. Pajdla,
B. Schiele, and T. Tuytelaars, eds., Springer International Publishing, Cham, 836–850.
Yang, J., Park, M.-W., Vela, P. A., and Golparvar-Fard, M. (2015). “Construction performance
monitoring via still images, time-lapse photos, and video streams: Now, tomorrow, and the
future.” Advanced Engineering Informatics, Elsevier.


Dimensional Quality Inspection of Prefabricated MEP Modules with 3D Laser Scanning


Jingjing Guo1 and Qian Wang2

1Dept. of Building, School of Design and Environment, National Univ. of Singapore, 4
Architecture Dr., Singapore 117566. E-mail: [email protected]
2Dept. of Building, School of Design and Environment, National Univ. of Singapore, 4
Architecture Dr., Singapore 117566. E-mail: [email protected]

ABSTRACT
Prefabricated mechanical, electrical, and plumbing (MEP) modules are an off-site
construction technology that can significantly improve construction productivity. To
successfully install MEP modules on-site, it is necessary to ensure that all the MEP elements,
including pipes, ventilation ducts, and cable trays, inside a module have accurate positions and
dimensions before on-site installation. However, the inspection process is time-consuming
using traditional manual methods. To address this problem, this study presents an automated
dimensional
quality inspection technology to estimate the geometric properties of MEP modules using 3D
laser scanning. This study uses an automated technique to align the scan data with the as-
designed BIM of the MEP module. Each MEP element is then extracted based on the scan data
within the region of interest for each respective element. Each pipe is extracted by fitting the
scan data within the region of interest to a cylinder, and each cable tray is obtained by detecting
the side and bottom surfaces of the cable tray as planar surfaces and finding the intersection lines
of the planes. Lastly, the checklist items are computed based on the extracted as-built objects. An
experiment on a real MEP module was conducted to demonstrate the accuracy and efficiency of
the proposed method.

INTRODUCTION
The construction industry has been moving towards prefabrication, where construction
work is mostly done in factories and on-site work is greatly reduced. Prefabrication has been
successfully implemented in many disciplines of construction, one of the most impressive of
which is the Mechanical, Electrical, and Plumbing (MEP) module. MEP modules include a
number of technologies pre-installed at the factory, with pipework, cable management, and
ductwork for building services, which contribute 40-60% of the total construction cost of a
building. These technologies are integrated in a multi-services module mounted in the ceiling,
under the floor, or in a service riser. Traditionally, MEP construction has suffered from labor
shortages and inefficiencies, inclement weather, dangerous working conditions, difficulties in
maintaining quality control, and the resulting rising costs. Also, access to building sites can be
restricted if other buildings exist outside the site boundaries, and different trade contractors
face difficulties in bringing in equipment and materials, which causes challenges in both
coordination and fabrication. The practice of MEP prefabrication provides a feasible
alternative in which MEP modules with a number of components can be manufactured,
inspected, labelled, and tested off-site.
Typically, a prefabricated MEP module has steel frames with pipes, ducts, and cable trays.
The components are broken into modules that are easy to transport and install in tight, hard-to-
reach locations with minimal disruption to other trades on-site, and these modules are designed
using BIM technology. Due to the adoption of BIM, the MEP module design process can
minimize rework and redesign of services, snagging, and defects. MEP modules result in
reduced project schedules, improved construction quality, improved safety of on-site
installation, and savings in both labor and cost.
To make sure the MEP modules can function well in the ceiling, several requirements are
stipulated by the Building and Construction Authority (BCA) in Singapore, including fire safety,
water leakage, access provisions, quality standards, etc. Among these requirements, high
geometric quality is one of the most important to achieve, because a precise and well-aligned
installation is the first step for the use of MEP services. However, during the installation of
MEP modules, problems often arise because of wrong sizes or displacement of components
inside the steel frame, so that two adjacent modules cannot be properly connected or inserted
into the wall cavities, leading to construction delays and additional replacement. Burati Jr et al.
(1992) reported that 5% to 16% of construction cost is related to quality defects of
components. Geometric quality assurance is therefore necessary to ensure consistency with the
design specifications before on-site installation.
The quality inspection of MEP modules in Singapore is mainly developed by the
prefabricator and endorsed by MEP trade specialists and consultants to ensure it is applicable
to the project according to the MEP guidebook from the BCA or the International Organization
for Standardization (ISO). Inspectors usually use tape measures or other contact devices to
conduct measurements manually. However, there are several problems with this traditional
approach. The first problem is low precision: because the assessment is carried out by workers
with simple measurement devices, it is difficult to obtain accurate geometric information. The
second problem is that it is time consuming: manual work includes alignment, measurement,
and recording, and each step takes a long time, especially for the large number of MEP
modules in a project. The third problem is the increased need for labour. To properly assess the
geometric quality of the modules, factories have to hire additional workers with relevant
certification and extensive experience to conduct the inspection, which generates redundant
labour and further increases cost. Thus, a more accurate and efficient geometric quality
inspection of prefabricated MEP modules is essential to quickly and precisely obtain the
geometric information with minimum labour and cost.
To achieve automated geometric quality inspection, non-contact sensing technologies such as
photogrammetry and 3D laser scanning have been widely used for 3D geometric information
acquisition (Kim et al., 2015). Image-based approaches such as computer vision and
photogrammetry are among the most widely used technologies to capture a product and
automatically extract the needed geometric information (Martinez et al., 2019). Nonetheless,
problems remain, such as limited detectable size, sensitivity to the lighting conditions of the
surrounding environment, and difficulty in recovering depth (Bosché, 2010). In contrast to
image-based technologies, 3D laser scanning can acquire 3D point clouds with good accuracy,
high density, and great speed over a wider measurement range (Bosché and Guenet, 2014),
without interference from lighting conditions. Due to these advantages, 3D laser scanning has
been widely adopted in the construction field for progress tracking (Kim et al., 2013; Rebolj et
al., 2017), 3D reconstruction of buildings (Son et al., 2015; Pu and Vosselman, 2009; Budroni
and Boehm, 2010; Ochmann et al., 2016), and construction quality assurance (Anil et al., 2013;
Kim et al., 2014; Kim et al., 2015; Rebolj et al., 2017).
Therefore, to overcome the limitations of the current quality inspection system for
prefabricated MEP modules, this study proposes to use 3D laser scanning to generate point
clouds of MEP modules, from which the dimensional quality of the MEP modules is
automatically inspected. The proposed method takes the laser scan data and the as-designed BIM
of the MEP modules as the inputs, and the discrepancies between the laser scan data and the as-
designed BIM are extracted for quality inspection. Experimental validation was conducted on a
real prefabricated MEP module to demonstrate the effectiveness and accuracy of the proposed
method.

PROPOSED METHOD
The proposed automated quality inspection technique includes three major steps, as shown in
Figure 1. The first step is to align the scan data of the MEP module with its as-designed BIM
model. The second step is to extract the as-built geometries of the MEP elements within the
MEP module from the laser scan data; the MEP elements include pipes and cable trays. Lastly,
the third step is to calculate the checklist items for quality inspection of the MEP module and
compare them against specific design codes.

Figure 1. Flowchart of the proposed quality inspection technique

Figure 2. Extraction of steel frame from scan data: (a) extraction of steel frame in X axis,
(b) extraction of steel frame in Y axis, (c) extraction of steel frame in Z axis, (d) extracted
steel frame in all axes.
Alignment of Scan with BIM
Each MEP module consists of the external steel frame and internal MEP elements. Due to
serious occlusions in laser scanning, only a part of the internal MEP elements is scanned.
Therefore, only the external steel frame of the MEP module is extracted for the accurate
alignment between the scan data and the as-designed BIM of the module.
First of all, the MEP module is manually adjusted such that the three principal directions of
the MEP module are roughly aligned with the X, Y, and Z axes, respectively. Then, the external
steel frame of the MEP module is extracted from the scan data using a slicing-based method. For
example, along the X axis, the scan data are sliced at a certain interval. For each slice, the
regions containing scan data are marked, and the regions that overlap across all slices are
obtained, as shown in Figure 2(a). These overlapping regions generally represent the steel frame
members running along the X axis. In the same manner, the steel frames along the Y and Z axes
are obtained, as shown in Figures 2(b) and 2(c). Combining the steel frames along the X, Y, and
Z axes yields the whole steel frame, as shown in Figure 2(d).
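The slicing idea can be sketched as follows. The slice width, grid cell size, and occupancy threshold below are illustrative parameters, not values from the paper: cells in the plane perpendicular to the sliced axis that are occupied in nearly every slice correspond to frame members running along that axis.

```python
import numpy as np

def frame_mask_along_axis(points, axis=0, slice_width=0.1, cell=0.05, min_frac=0.9):
    """Slice the cloud along one axis and keep grid cells (in the plane
    perpendicular to that axis) occupied in most slices; these cells
    correspond to frame members running along the sliced axis.
    Parameter values are illustrative, not from the paper."""
    other = [i for i in range(3) if i != axis]
    lo, hi = points[:, axis].min(), points[:, axis].max()
    n_slices = max(1, int(np.ceil((hi - lo) / slice_width)))
    # 2D occupancy grid in the perpendicular plane
    uv = points[:, other]
    uv0 = uv.min(axis=0)
    ij = np.floor((uv - uv0) / cell).astype(int)
    shape = ij.max(axis=0) + 1
    counts = np.zeros(shape, dtype=int)
    slice_idx = np.minimum(((points[:, axis] - lo) / slice_width).astype(int),
                           n_slices - 1)
    for s in range(n_slices):
        sel = slice_idx == s
        occ = np.zeros(shape, dtype=bool)
        occ[ij[sel, 0], ij[sel, 1]] = True
        counts += occ
    keep_cells = counts >= min_frac * n_slices   # occupied in most slices
    return keep_cells[ij[:, 0], ij[:, 1]]        # boolean mask per point
```

Repeating this for the Y and Z axes and taking the union of the three masks would approximate the combined frame extraction of Figure 2(d).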
While the steel frame is extracted from the as-built scan data (blue points in Figure 3(a)), the
steel frame of the module from the as-designed BIM is also extracted and transformed into a
simulated as-designed point cloud data (orange points in Figure 3(a)) based on a uniform
sampling method. Then, the as-built scan data and the simulated as-designed point cloud are
aligned by coarse registration. In the coarse registration, the principal directions of the as-built
scan data and as-designed point cloud of the steel frame are obtained from principal component
analysis (PCA). In addition, the centers of mass for the as-built scan data and as-designed point
cloud of the steel frame are also obtained as the averaged coordinates of all the points. Then, the
coordinates of the as-built scan data are transformed such that the two sets of principal
directions and centers of mass coincide. In other words, the as-built scan data are roughly
aligned with the as-designed point cloud, as shown in Figure 3(a). To further enhance the
alignment performance, fine registration based on the iterative closest point (ICP) algorithm is
performed, as shown in Figure 3(b). The result in Figure 3(b) shows that fine registration further
reduces the discrepancies between the as-built scan data and as-designed point cloud.

Figure 3. Alignment of as-built scan data with as-designed point cloud based on the
extracted steel frames: (a) coarse registration between as-built scan data (blue points) and
as-designed point cloud of the steel frame (orange points), and (b) fine registration between
as-built scan data and as-designed point cloud
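The coarse registration step (matching PCA principal directions and centers of mass) can be sketched as below. This is a simplified stand-in, not the authors' implementation: the sign and ordering ambiguities of PCA eigenvectors are not resolved here, and the ICP fine registration is omitted.

```python
import numpy as np

def pca_coarse_align(scan, design):
    """Roughly align `scan` to `design` by matching their centers of mass
    and principal directions (eigenvectors of the covariance matrix).
    Sketch only: PCA sign/ordering ambiguities are ignored, and ICP
    would follow for fine registration."""
    def frame(pts):
        c = pts.mean(axis=0)
        _, vecs = np.linalg.eigh(np.cov((pts - c).T))
        return c, vecs
    c_s, v_s = frame(scan)
    c_d, v_d = frame(design)
    R = v_d @ v_s.T          # rotation taking scan axes onto design axes
    return (scan - c_s) @ R.T + c_d
```

In practice the ambiguity resolution matters: PCA axes are only defined up to sign, so a robust implementation tests candidate axis assignments or relies on ICP to correct a flipped coarse alignment.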
Extraction of MEP Elements
To extract an MEP element (a pipe or a cable tray), the scan data near the target element are
first extracted. For each MEP element, its approximate location is obtained from its location in
the as-designed BIM. As the scan data and BIM are well aligned in the previous step, the as-built
location of an element should be close to the as-designed location. Therefore, scan data within a
certain distance to the as-designed element are extracted as the region of interest, as shown in
Figure 4.

Figure 4. Extraction of region of interest for each MEP element based on the as-designed
location
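The region-of-interest extraction amounts to a distance filter against points sampled from the as-designed element. The sketch below uses brute-force pairwise distances and an illustrative 0.05 m threshold (the paper does not state the distance used; a spatial index would be preferable for large clouds).

```python
import numpy as np

def region_of_interest(scan, element_pts, max_dist=0.05):
    """Keep the scan points lying within `max_dist` of an as-designed
    element, represented by points sampled from its BIM geometry.
    Illustrative sketch: brute-force distances via broadcasting."""
    diff = scan[:, None, :] - element_pts[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)          # squared point-to-point distances
    return scan[d2.min(axis=1) <= max_dist ** 2]
```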
For a pipe, the M-estimator SAmple Consensus (MSAC) algorithm is used to fit the scan data
within the region of interest into a cylinder, as shown in Figure 5(a). To improve the recognition
accuracy, the orientation of the cylinder is obtained from the as-designed BIM and is provided as
an input of the MSAC algorithm. For a cable tray, the two side surfaces and the bottom surface
of each cable tray are obtained by detecting planar surfaces using the random sample consensus
(RANSAC) algorithm. Then, the intersection lines of the three surfaces are obtained, shown as
red lines in Figure 5(b), which indicate the dimensions and positions of the cable tray.

Figure 5. Extraction of MEP elements: (a) extraction of pipes as cylinders, and (b)
extraction of cable trays based on plane detection
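The plane detection for the cable-tray surfaces can be sketched with a basic RANSAC consensus loop. This is a simplified stand-in rather than the authors' implementation (and MSAC, used for the pipes, additionally scores inliers by their residuals instead of merely counting them).

```python
import numpy as np

def ransac_plane(points, n_iter=200, tol=0.01, rng=None):
    """Fit one plane with a basic RANSAC loop (simplified stand-in for
    the RANSAC plane detection used for cable-tray surfaces).
    Returns (unit normal n, offset d) for n.x + d = 0 and an inlier mask."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    best = (None, None)
    for _ in range(n_iter):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:              # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n @ p0
        inliers = np.abs(points @ n + d) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers, best = inliers, (n, d)
    return best[0], best[1], best_inliers
```

Running the detector three times (removing inliers after each run) would recover the two side surfaces and the bottom surface, whose pairwise intersection lines give the tray's dimensions and position.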

Calculation of Checklist Items


For each pipe, the checklist items include its dimensions and positions. The dimensions
include the radius of the cylinder. The positions include the distances from the center to the left,
right, bottom, and top edges of the steel frame, and these distances are measured at the two ends
of the pipe.
For a cable tray, the checklist items include its dimensions and positions. The dimensions
include the width of the cable tray. The positions include the distances from its center to the left,
right, bottom, and top edges of the steel frame, and are measured at the two ends of the cable
tray.
After calculating the checklist items, the dimensions and positions of each MEP element are
compared to the as-designed ones and the discrepancies are obtained. Lastly, the discrepancies
are compared to relevant design codes, which help engineers to decide whether the MEP module
can be accepted, and if not, what measures should be taken to rectify the discrepancies.
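The final comparison step can be sketched as below. The item names, units, and single-tolerance scheme are hypothetical illustrations; actual acceptance criteria come from the relevant design codes.

```python
def check_items(measured, designed, tolerance):
    """Compare measured checklist items with their as-designed values
    against a tolerance (all in metres here). Item names and the
    single-tolerance scheme are illustrative; real acceptance criteria
    come from the applicable design codes."""
    report = {}
    for key, as_designed in designed.items():
        deviation = measured[key] - as_designed
        report[key] = {"deviation": deviation,
                       "pass": abs(deviation) <= tolerance}
    return report
```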

VALIDATION
An experiment was conducted on a real MEP module (see Figure 6) to verify the proposed
technique. The MEP module was manufactured in a prefabrication factory in Singapore for a
university project. The MEP module used for the validation experiment had dimensions of 4 m
(length) × 2 m (width) × 0.5 m (height). The MEP module contained 5 pipes and 10 cable trays
with different sizes. A total of four scans were conducted near the four corners of the MEP
module. The proposed method was applied to the scan data of the MEP module, and all the
pipes and cable trays were successfully extracted from the scan data. The dimensions and
positions of the pipes and cable trays were also obtained from the scan data.

Figure 6. The prefabricated MEP module for validation experiment


To examine the accuracy of the extracted as-built dimensions and positions, manual
measurements of the as-built dimensions and positions were conducted using measurement tapes,
and the manual measurements were taken as the ground truth. Comparisons of the extracted
dimensions and positions with the ground truth were conducted. The results show that the
discrepancies ranged from 0.4 mm to 3.2 mm, with an average discrepancy of less than 2 mm.
The results demonstrated that the proposed method was able to
provide accurate as-built geometries for the quality inspection of prefabricated MEP modules.

CONCLUSION
This study proposes an automated technique for the dimensional quality inspection of
prefabricated MEP modules. Firstly, the scan data are acquired for the MEP module and are then
aligned with the corresponding as-designed BIM of the module based on the external steel frame.
Secondly, each MEP element within the MEP module is extracted from the scan data. Each pipe
is obtained by fitting the scan data into a cylinder and each cable tray is obtained by extracting
the side and bottom planar surfaces and finding the intersection lines of the planes. Lastly, the
checklist items are calculated based on the extracted as-built objects, which include the
dimensions and positions of each pipe or cable tray. Experimental validation was conducted on a
real MEP module and the proposed method was successfully implemented to extract the as-built
objects and estimate the as-built dimensions and positions of MEP elements. Experimental
results showed that the proposed method could effectively and accurately inspect the
dimensional quality of prefabricated MEP modules from laser scan data.
However, this study still has a few limitations, which should be addressed in future research.
First, this research is not fully automated because the user needs to manually remove noise data
and adjust the orientation of the scan data at the beginning. Second, only one experiment was
conducted on a single MEP module. More experiments are needed to further demonstrate the
robustness of the proposed technique. Third, it is found that serious occlusions of MEP elements
may affect the performance of the proposed method. Future research should be conducted to
investigate the quantitative relationship between the level of occlusions and the performance of
the proposed method.

REFERENCES
Anil, E. B., Tang, P., Akinci, B., & Huber, D. (2013). Deviation analysis method for the
assessment of the quality of the as-is Building Information Models generated from point
cloud data. Automation in Construction, 35, 507-516.
Bosché, F. (2010). Automated recognition of 3D CAD model objects in laser scans and
calculation of as-built dimensions for dimensional compliance control in construction.
Advanced engineering informatics, 24(1), 107-118.
Bosché, F., & Guenet, E. (2014). Automating surface flatness control using terrestrial laser
scanning and building information models. Automation in Construction, 44, 212-226.
Budroni, A., & Boehm, J. (2010). Automated 3D reconstruction of interiors from point clouds.
International Journal of Architectural Computing, 8(1), 55-73.
Burati Jr, J. L., Farrington, J. J., & Ledbetter, W. B. (1992). Causes of quality deviations in
design and construction. Journal of construction engineering and management, 118(1), 34-
49.
Kim, C., Son, H., & Kim, C. (2013). Fully automated registration of 3D data to a 3D CAD model
for project progress monitoring. Automation in Construction, 35, 587-594.
Kim, M. K., Cheng, J. C., Sohn, H., & Chang, C. C. (2015). A framework for dimensional and
surface quality assessment of precast concrete elements using BIM and 3D laser scanning.
Automation in Construction, 49, 225-238.
Kim, M. K., Sohn, H., & Chang, C. C. (2014). Automated dimensional quality assessment of
precast concrete panels using terrestrial laser scanning. Automation in Construction, 45, 163-
177.
Martinez, P., Ahmad, R., & Al-Hussein, M. (2019). A vision-based system for pre-inspection of
steel frame manufacturing. Automation in Construction, 97, 151-163.
Ochmann, S., Vock, R., Wessel, R., & Klein, R. (2016). Automatic reconstruction of parametric
building models from indoor point clouds. Computers & Graphics, 54, 94-103.
Pu, S., & Vosselman, G. (2009). Knowledge based reconstruction of building models from
terrestrial laser scanning data. ISPRS Journal of Photogrammetry and Remote Sensing, 64(6),
575-584.
Rebolj, D., Pučko, Z., Babič, N. Č., Bizjak, M., & Mongus, D. (2017). Point cloud quality
requirements for Scan-vs-BIM based automated construction progress monitoring.
Automation in Construction, 84, 323-334.
Son, H., Kim, C., & Kim, C. (2015). 3D reconstruction of as-built industrial instrumentation
models from laser-scan data and a 3D CAD database based on prior knowledge. Automation
in Construction, 49, 193-200.

Segmentation Approach to Detection of Discrepancy between As-Built and As-Planned Structure Images on a Construction Site
Juhyeon Bae1 and SangUk Han2
1Graduate Student, Dept. of Civil and Environmental Engineering, Hanyang Univ., Seoul 04763, S. Korea. E-mail: [email protected]
2Assistant Professor, Dept. of Civil and Environmental Engineering, Hanyang Univ., Seoul 04763, S. Korea. E-mail: [email protected]

ABSTRACT
Object detection has been widely used to extract visual information from resources and build
components, enabling timely identification and evaluation of construction performance.
However, it is difficult to evaluate the progress of a structure solely based on the detection
approach. A segmentation approach using 2D images is proposed for finding discrepancies
between an as-built and an as-planned structure. First, preprocessing is implemented to convert
the drawing to a binary image, and a deep-learning based segmentation is conducted to extract
structural components from as-built images. Then, the extracted structure and the converted
drawing are compared using a 2D matrix of those images. An experiment using wood structure
images was performed. The results demonstrate the accurate detection of discrepancies. Thus,
the proposed approach can potentially be utilized for monitoring progress as well as inspection
and maintenance, for which the identified discrepancies between as-planned and as-built images
can be used.

INTRODUCTION
In the construction industry, image processing has gained increasing attention as an automated
means of extracting on-site information from images, which can be used for project management.
In particular, the detection of multiple objects has been widely used to extract meaningful
on-site information from images. Object detection identifies instances in an image based on
features (e.g., texture, color, shape) extracted from those instances. However, because
classification alone does not locate objects precisely in the image, localization has been utilized
to provide an approximate location of each object in the form of a bounding box. As a result,
object detection with localization has significantly improved data acquisition and analysis
practices in construction, particularly for detecting resources (e.g., humans, materials, and
equipment).
Apart from the resources, detecting elements of a structure or a building has also gained
considerable attention for analyzing the progress or quality of the structure. However, unlike the
resources, the shape of a structure changes as work progresses, which makes it difficult to
evaluate the progress merely from the result of object detection (i.e., the number of detected
structures in the image and the approximate boundary of the structure). Furthermore, when no
complete 4D model with a specific schedule is available for comparison with the detected
objects, the detection approach still requires visual checks on how much work has been done,
whether every component is correctly installed, and whether there are missing parts. The
conventional monitoring procedure visually carried out by a project manager is time-consuming,
labor-intensive, and unsafe; according to Son and Kim (2010), it has been reported that site
managers spend 30 to 49% of their time in monitoring and analysis. In that sense, an approach
that effectively extracts information of the structural components from image data is required to
improve the on-site information acquisition practice.
Image segmentation can be a useful option for improving the extraction of structural
information. Image segmentation is a process that partitions a digital image into multiple
segments, based on similar characteristics, to make the image more meaningful and easier to
analyze. Segmentation assigns a label to every pixel in an image, showing what each pixel is
classified as and locating the exact position of the classified objects in the 2D domain. From the
result, a desired object can be extracted as a separate
image while maintaining the context in the original image. This approach allows for the
comparison of the segmented objects with an as-planned model for detecting discrepancies. In
this paper, a method to detect discrepancies of 2D structures is proposed, and validated for
identifying and evaluating the built structural components. Specifically, each extracted structure
and drawing of the image is expressed as a 2D matrix, which enables an analysis of the structure
based on the drawing with a simple calculation using the two matrices.

RESEARCH BACKGROUND
Computer vision techniques utilizing image data have recently advanced in construction,
mainly focusing on three application areas such as progress monitoring, inspection, and
maintenance. In the progress monitoring area, Wu et al. (2010) proposed a 3D CAD filtering
method to extract concrete columns built on a construction site. Similarly, Son and Kim (2010)
also proposed a 3D structural component recognition and modeling method for progress
monitoring. Various research efforts have been made by previous studies (e.g., Abeid and Arditi,
2002; Golparvar-Fard et al., 2007; Golparvar-Fard and Peña-Mora, 2007; Lukins and Trucco,
2007; Wu and Kim, 2004; Shih and Wang, 2004) to recognize progress or visualize planned
information.
As an application of image data for inspection, Li et al. (2014) proposed a long-distance
crack inspection method for a bridge structure. The proposed method contains several
preprocessing steps including enhancement, smoothing, and denoising, and an extracting step
using a segmentation algorithm based on the Chan-Vese model. Mashford et al. (2010) presented
a new approach using image segmentation and support vector machine for automatic pipe
inspection. In recent years, Halfawy and Hengmeechai (2014) proposed a CCTV-based sewer
inspection system using image segmentation and support vector machine.
Similarly, research efforts to use image data for maintenance have also been made in
construction. Jiang and Tsai (2015) proposed an enhanced crack segmentation algorithm using
3D pavement data. Golparvar-Fard et al. (2015) proposed a novel algorithm for 2D image
segmentation based on Semantic Texton Forests (STFs) for effective data collection of
highway assets from video clips. For 12 categories (e.g., asphalt pavement, concrete pavement,
guard rail, and so on), multi-label segmentation was implemented in the study. Li et al. (2017)
also proposed a 2D pavement-crack detection and segmentation method for pavement
maintenance. A steerable matched filter was used to generate a saliency map of the crack.
Those research studies show what information can be extracted from site images, and how
such information can be used for management purposes by adopting and advancing computer
vision techniques. Particularly, it is found that the methods that can recognize and detect the
deviation between an as-planned model and an as-built model are the key step to understanding
the scene. For example, research works (e.g., Golparvar-Fard and Peña-Mora, 2007; Shih and
Wang, 2004) used 3D drawings, or as-planned models based on 3D point clouds, to detect the
discrepancies between an as-planned model and an as-built model. Although the number of
construction projects that use 3D models or 3D point clouds has been increasing recently, it is
still hard to assume that an as-planned 3D model (i.e., 4D BIM with schedule), or 3D point
clouds, is available for every project. In that sense, the proposed method in this paper, which
uses 2D drawings available on any construction site, is expected to offer advantages in
accessibility and usability.

Figure 1. Flowchart of the proposed method


2D-BASED DISCREPANCY DETECTION
The objective of this paper is to present a segmentation approach, using 2D images, for the
recognition of discrepancies between a drawing and a structure image. In particular, this method
provides a simple measure to help a project manager figure out the discrepancies in modular
construction applications. In modular construction, many standardized parts of a building are
produced in a repeated manner in the factory as done in the manufacturing industry, and the parts
are then transported and assembled on a construction site. This study focuses on the
manufacturing process of building components in shops where the environment can be controlled.
For example, the proposed method uses 2D images of the front view of the structure and a
corresponding drawing image. Particularly, it is assumed that the structure in the image has a
planar view from the same perspective as the drawing; a camera can be installed at a fixed
location in front of production lines in shops. Once the structure images and the drawing image
are obtained, the images are converted to grayscale, and the Otsu threshold is then applied to
obtain a binary image of the drawing. Images are stored in the form of a 2D matrix, in which
each pixel corresponds to a matrix element. A binary image is therefore a 2D matrix where each
element has a value of 0 or 1. As a result of the preprocessing, a structure image is converted to
a grayscale image to
perform the segmentation in the next step. The drawing image is converted to a binary image, in
which 0 is assigned to the background and 1 to the structural component. In the segmentation
phase, the part corresponding to the structure is segmented with a value of 1, and the rest of the
image has a value of 0. The segmented image is then compared with a binary image of the
drawing to find discrepancies. The comparison is conducted by subtracting the matrix of the
segmented image from the matrix of the drawing image. Consequently, in the calculated matrix,
the values of -1, 0, and 1 indicate a falsely installed part, agreement, and a missing part,
respectively. A flowchart of the proposed method is illustrated in Figure 1. The technical details
of each phase are described below.

Preprocessing
Once the image and the drawing for the structure are obtained, preprocessing is implemented
to get a gray-indexed image from the structure image, which results in less computation in the
training phase of the segmentation model, as well as to get a binary image of the corresponding
drawing. In this study, the Otsu threshold is employed to convert the drawing image to a binary
image. The pixels of the drawing image can have any value from 0 to 255, but theoretically they
carry only two meanings according to color (i.e., white for background and gray for structure).
With the Otsu threshold, the drawing images are thus converted to a simpler form containing
only 0 and 1: 0 is assigned to background pixels and 1 to structural components. Then, the
structure and drawing images are resized to identical dimensions. This preprocessing procedure
facilitates a simple comparison between the actual object in the image and the drawing of the
object in the following section.
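Otsu's method picks the threshold that maximizes the between-class variance of the two pixel populations. A minimal NumPy sketch, assuming an 8-bit grayscale input in which the structure is darker than the background (matching the white-background, gray-structure convention above):

```python
import numpy as np

def otsu_binarize(gray):
    """Binarize an 8-bit grayscale image with Otsu's threshold: choose
    the cut maximizing between-class variance, then assign 1 to the
    darker (structure) pixels and 0 to the lighter background."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                     # class-0 probability
    mu = np.cumsum(prob * np.arange(256))       # class-0 mean * omega
    mu_t = mu[-1]                               # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    t = np.nanargmax(sigma_b)                   # best threshold
    return (gray <= t).astype(np.uint8)         # dark pixels -> 1
```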

Figure 2. (a) Input image, (b) Result of U-net segmentation, (c) Ground truth

Image Segmentation
In this study, the U-net algorithm (Ronneberger et al., 2015), which extracts structural
components from the image, is adopted for the segmentation of the images. This algorithm is
known to perform well on binary segmentation tasks with low disparity and thin boundaries
between instances, even with a comparatively small training set (Ronneberger et al., 2015). Data
augmentation, which generates new images by rotating, shifting, or zooming existing images, is
employed so that the model can train sufficiently on a small training set and to reduce the risk
of overfitting. An example of an input image, the resulting output of the segmentation, and the
ground truth is shown in Figure 2. The performance of the segmentation model is measured by
the average ratio of the number of correctly classified pixels to the actual number of pixels in the
image that represents the structural components.
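The augmentation described above can be sketched as a random flip, rotation, and shift applied identically to the image and its label mask. The parameter ranges here are illustrative; the paper's exact augmentation settings (and its zoom step) are not given.

```python
import numpy as np

def augment(image, mask, rng=None):
    """Produce one augmented (image, mask) pair by a random 90-degree
    rotation, horizontal flip, and integer shift, applied identically
    to the image and its label mask. Parameter ranges are illustrative."""
    rng = np.random.default_rng(rng)
    k = int(rng.integers(0, 4))
    img, m = np.rot90(image, k), np.rot90(mask, k)
    if rng.random() < 0.5:
        img, m = np.fliplr(img), np.fliplr(m)
    dy, dx = rng.integers(-5, 6, size=2)        # shift in pixels
    img = np.roll(img, (dy, dx), axis=(0, 1))
    m = np.roll(m, (dy, dx), axis=(0, 1))
    return img, m
```

Applying the same transform to image and mask is essential: a pixel-wise label map is only valid if it moves with the pixels it labels.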

Discrepancy
In the segmented image shown in Figure 2(b), the value of pixels can be interpreted as
whether or not the structure component at the specific point is installed. In other words, if the
value of a pixel is 1, the structure is already installed at that point; if it is 0, it is not. This enables
the construction status of a structure to be represented in the form of a 2D matrix. As a result, a
matrix representing the construction status of the structure and a matrix representing a
construction plan of the structure can be compared with each other. When comparing those two
matrices, the elements which do not correspond in their values between the two matrices imply
that there can be some structural components that are wrongly constructed or missing. For
example, if an element at the same position has a value of 1 in a matrix from a drawing and a
value of 0 in a matrix from the corresponding structure image, this implies a missing part. A
simple calculation between the two matrices provides a better understanding of the construction
status of the structure, as shown in Figure 3. Subtracting a matrix of the structure image from a
matrix of its drawing gives a matrix that reveals the falsely installed parts and missing parts in
the current construction status.
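The subtraction described above amounts to a single NumPy operation. The arrays below are a toy 2x3 example, not the experiment's data:

```python
import numpy as np

def find_discrepancies(drawing_bin, built_bin):
    """Subtract the segmented as-built matrix from the drawing matrix:
    1 marks a missing part, -1 a falsely installed part, 0 agreement."""
    diff = drawing_bin.astype(np.int8) - built_bin.astype(np.int8)
    return diff, (diff == 1), (diff == -1)
```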

Figure 3. Matrix calculation for finding discrepancies

Figure 4. (a) Binary image of drawing, (b) Segmented image, (c) Missing part, (d) Ground
truth of missing part

EXPERIMENT RESULTS AND DISCUSSION


To prove the concept, a pilot experiment is carried out to evaluate the proposed method. In
this experiment, a wood structure image and the corresponding drawing are used as test images.
First, preprocessing is implemented to get grayscale images of the structure for the training of the
segmentation model, and to produce binary images of the drawing shown in Figure 4(a). After
this, the U-net segmentation model is trained with 23 independent wood structure images. The
accuracy of the trained segmentation model is 95.86%, which means 95.86% of the pixels are
correctly classified. Using this segmentation model, the structure of the test image is extracted,
as shown in Figure 4(b). Since the pixels of the segmented image range from 0 to 255, the image
is converted to a binary image using the Otsu threshold before the comparison. Then, the two
matrices of the extracted structure image and the drawing are compared by subtraction, as
presented in Figure 3. In the given test situation, since there are only missing parts, the matrix
elements (i.e., pixels) with a value of 1, which indicates a missing part, are separated and
visualized in Figure 4(c).
The proposed method only used 2D images for detecting the discrepancies between as-built
and as-planned images. As a result, it is found that the discrepancies can be detected using 2D
images of the built structure and the drawing, as shown in Figure 4. Eight discrepancies in
diagonal structural components, as seen in Figure 4(a), are correctly detected in the result
(Figure 4(c)).
However, it is seen that there are some falsely detected regions occurring due to errors of the
segmentation result. Moreover, there are some overlapping regions, where two or more
components or materials overlap at the same spot, resulting in missed detections of missing
parts. In this regard, a combined approach that distinguishes different materials in the
overlapping region, using object detection, can be investigated to improve the performance. In
addition, it should be noted that in this experiment, only a simple structure is used and tested to
prove the concept of segmentation. Thus, further research is required to evaluate the performance
of the proposed method for various types of structures with different shapes.

CONCLUSION
In this study, a segmentation approach has been presented that detects the discrepancies of a
structure. It was observed that 2D image segmentation allows for extracting structure
information for comparing it with an as-planned model and for detecting and evaluating the
discrepancies for early recognition of construction progress or errors. Early recognition of
construction performance can reduce the potential time and cost for rework or unforeseen issues
occurring from the discrepancies. Moreover, the extracted discrepancy information from the
images can be used for visualizing the construction performance, on as-built images, and for
supporting the project manager’s decision-making. In this regard, the proposed approach can
potentially be used for progress monitoring, quality inspection, and maintenance, by providing
information on newly-built structural components, and geometrical errors or deviations from
base plans. Our future work will include comparing structure images from different views and
detecting various structure components and materials, as well as examining a post-process of the
segmentation that can improve accuracy and reliability, resulting in better recognition of the
structure.

ACKNOWLEDGEMENT
This research was supported by the Basic Science Research Program through the National
Research Foundation of Korea funded by the Ministry of Science, ICT and Future Planning
(NRF 2018R1C1B6005108). Any opinions, findings, and conclusions or recommendations
expressed in this paper are those of the authors and do not necessarily reflect the views of the
National Research Foundation of Korea.

REFERENCES
Son, H., and Kim, C. (2010). 3D structural component recognition and modeling method using
color and 3D data for construction progress monitoring. Automation in Construction, 19(7),
844-854.
Wu, Y., Kim, H., Kim, C., and Han, S. H. (2009). Object recognition in construction-site images
using 3D CAD-based filtering. Journal of Computing in Civil Engineering, 24(1), 56-64.
Golparvar-Fard, M., and Peña-Mora, F. (2007). Application of visualization techniques for construction
progress monitoring. In Computing in Civil Engineering (2007) (pp. 216-223).
Golparvar-Fard, M., Balali, V., and de la Garza, J. M. (2012). Segmentation and recognition of
highway assets using image-based 3D point clouds and semantic Texton forests. Journal of
Computing in Civil Engineering, 29(1), 04014023
Golparvar-Fard, M., Sridharan, A., Lee, S., and Peña-Mora, F. (2007). Visual representation of

© ASCE
Computing in Civil Engineering 2019 185

Development of Massive Point Cloud Data Geoprocessing Framework for Construction Site Monitoring
Minh Hieu Nguyen1; Sanghyun Yoon2; Sangyoon Park3; and Joon Heo, Ph.D.4
1Dept. of Civil and Environmental Engineering, Yonsei Univ., Seoul, South Korea. E-mail: [email protected]
2Dept. of Civil and Environmental Engineering, Yonsei Univ., Seoul, South Korea. E-mail: [email protected]
3Dept. of Civil and Environmental Engineering, Yonsei Univ., Seoul, South Korea. E-mail: [email protected]
4Dept. of Civil and Environmental Engineering, Yonsei Univ., Seoul, South Korea. E-mail: [email protected]

ABSTRACT
In civil engineering, point cloud data are used for construction site monitoring. However, processing such massive datasets is hindered by their size and computational intensity. This study presents the first scalable Hadoop-based point cloud management and processing framework to overcome those barriers. The framework consists of three layers: (1) a storage layer; (2) an operation layer; and (3) an interactive layer. The storage layer is optimized with indexing techniques to accelerate the map-reduce applications. The operation layer provides not only common operations such as range query, kNN, and spatial join, but also two powerful modules, "Change Detection" and "3D Geometric Model", for efficient monitoring. The interactive layer enables users to use a parallel processing model on Hadoop without requiring deep related knowledge. In addition, this layer can visualize massive point cloud data directly from Hadoop, which is useful for analytical processes. With such components, our framework is scalable, inexpensive, and full-fledged.

INTRODUCTION
Bosché et al. (2015) showed the value of integrating laser scanning and Building Information
Modeling (BIM) for construction monitoring. The development of high accuracy laser scanning
technology allows 3D models of buildings to be obtained without using the traditional
measurement methods. However, there are still many challenges associated with building a BIM from point cloud data. Obstacles include: 1) the nature of indoor environments, which usually contain many materials, reducing the accuracy of the structural interpolation model of a building; and 2) the huge size of point cloud data, which often leads to system slowdown or failure during the modeling process (Jung et al. 2015). To address massive point cloud data processing, Pajić et al. (2018) introduced distributed point cloud data processing in a high-memory cloud-computing environment; however, the hardware investment for such systems costs more than a similar system running on Apache Hadoop (https://hadoop.apache.org). Kissling et al. (2017) proposed the eEcoLiDAR infrastructure based on Hadoop for ecological applications of LiDAR point clouds, but this project is still under development and very few results have been published. Li et al. (2018) provided general frameworks based on Hadoop for point cloud data processing. In these frameworks, the core functionalities are leveraged from LAStools (https://rapidlasso.com/lastools), allowing them to handle more LiDAR data processing tasks than the others. However, these frameworks, as well as most previous frameworks, completely lack the ability to visualize massive point cloud data in 3D space. Indeed, visualizing a mass of objects, such as point cloud data in 3D space, brings many challenges. Potree (Schütz 2016) solved this problem effectively but completely lacks management and analysis tools. Therefore, it would be practical to develop a framework for massive point cloud data processing with full components, including storage, processing, and visualization. This research introduces a full-fledged, massive point cloud geoprocessing framework for construction site monitoring, based on Hadoop, to overcome the limitations of the previous studies. The contributions of this paper include:
• Providing a comparison of five potential indexing techniques for point cloud data processing, in which our customized Octree structure (Octree*) is more stable and efficient than the original structure.
• Providing an efficient method for change detection on massive point clouds based on a novel "soft-join" operation running on Map-Reduce.
• Introducing a model for plane extraction from massive point clouds based on an advanced RANSAC algorithm and Hadoop.
• Introducing a model for visualizing an extremely large amount of point cloud data directly from Hadoop without having to transfer the data to a single computer.

PROPOSED FRAMEWORK
The overall architecture of our framework, named B-Eagle, with its three main layers is depicted in Figure 1. Within the scope of this paper, we do not give detailed explanations of all of B-Eagle's functionalities. Instead, we focus on the analysis of the crucial factors in construction site monitoring: "Change Detection" and "3D Geometric Model". "Change Detection" is the key feature of the monitoring, where old and new point cloud datasets are compared frequently to represent the state of the construction site. "3D Geometric Model" produces a geometric structure that is much smaller in size than the original data without losing the most important characteristics of the object. If each part of this structure is labeled automatically, BIM data can be obtained at a lower cost.
Storage Layer: This layer plays an important role in the overall framework, as all operations involve data. To optimize this layer, a global index structure is often created to eliminate the blocks not involved in a computation. In this study, we applied the Sort-Tile-Recursive algorithm, first introduced by Leutenegger et al. (1997), to construct the global index. In this way, points are grouped into the same partition if they are close together. Each partition corresponds to a block whose size does not exceed 128 MB. The MBRs (Minimum Bounding Rectangles) of the partitions are then organized in an R-tree structure. This global index structure is bulk-loaded into the memory of the master node as a preparation step of the computation. Right after the global index is created, a local index is usually constructed on each block. The efficiency of a local index structure is determined by the type of the input data, how the original structure is customized, the tuning parameters (if they exist), and the specific problem that needs to be solved. On point cloud data indexing, Han et al. (2011) showed that the Octree has the advantage of fast generation and querying, while the 3D R-tree is more memory-efficient. In other research, a hashing-based virtual grid was introduced as a suitable alternative to the Octree (Han et al. 2012). In addition, Han (2018) introduced a semi-isometric Octree structure that obtained better performance than the native structure. In this article, the section "Experiment 1" presents a comparison of five potential index structures for point cloud data within the context of the "Change Detection" problem.
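To illustrate the idea, the Sort-Tile-Recursive bulk partitioning described above can be sketched as follows. This is a minimal single-node 2D sketch (the framework builds the index over 3D blocks on Hadoop); the function name, the use of NumPy, and the `capacity` parameter are our own illustrative assumptions, not B-Eagle's implementation.

```python
import numpy as np

def str_partition(points, capacity):
    """Sort-Tile-Recursive-style bulk partitioning (2D sketch):
    sort by x into vertical slices, then sort each slice by y and cut it
    into partitions, so nearby points land in the same block. Each
    partition is returned with its MBR (minimum bounding rectangle)."""
    n = len(points)
    n_parts = int(np.ceil(n / capacity))
    n_slices = int(np.ceil(np.sqrt(n_parts)))     # tiles per axis
    by_x = points[np.argsort(points[:, 0])]       # tile along x first
    slice_size = int(np.ceil(n / n_slices))
    partitions = []
    for i in range(0, n, slice_size):
        sl = by_x[i:i + slice_size]
        sl = sl[np.argsort(sl[:, 1])]             # then along y inside the slice
        for j in range(0, len(sl), capacity):
            part = sl[j:j + capacity]
            mbr = (part.min(axis=0), part.max(axis=0))
            partitions.append((mbr, part))
    return partitions
```

In the framework the MBRs produced this way would be organized in an R-tree and kept in the master node's memory, so that a query can skip every block whose MBR it does not intersect.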

Figure 1. Architecture of the B-Eagle Framework


Operation Layer: In this article, "Change Detection" is based on a "soft-join" algorithm described by the following steps: 1) given two point cloud datasets A and B, where B is the newly collected dataset; 2) find all points p such that p ∈ B and p ∉ A, where p is considered to belong to A if there exists at least one point p′ ∈ A with d(p, p′) < dmax. Three typical spatial join techniques in parallel computing are compared by You et al. (2015). The advantages of our join algorithm over the previous studies include: 1) it does not require creating and storing a complete index structure for each dataset before the join process; 2) the entire local join process is performed in the memory of the worker nodes with the support of a local index structure, and thus the performance is improved over the previous studies; 3) it can join not only two datasets but also two data directories; and 4) it supports both "join" and "disjoin" operations in the same process. The "soft-join" algorithm is illustrated in Figure 2. The performance of this algorithm is presented in section "Experiment 2" of this article.
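As an illustration, the point-wise matching at the heart of the soft-join can be sketched on a single node with a k-d tree as the local index; in the framework this step runs distributed across Map-Reduce workers. The use of SciPy's cKDTree and the function name are our own assumptions for this sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def soft_join(a_points, b_points, d_max=0.01):
    """Split the new dataset B into 'changed' points (no neighbour in the
    old dataset A closer than d_max) and 'unchanged' points (matched in A).
    The k-d tree plays the role of the per-block local index."""
    tree = cKDTree(a_points)                 # local index on old dataset A
    dist, _ = tree.query(b_points, k=1)      # nearest-neighbour distance per p in B
    changed = b_points[dist >= d_max]        # "disjoin" result: new/moved geometry
    unchanged = b_points[dist < d_max]       # "join" result
    return changed, unchanged
```

Note that both the "join" and the "disjoin" result fall out of the same pass, matching advantage 4) above.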
The "3D Geometric Model" is a process of plane detection and contour extraction. The large point cloud is subdivided into sub point clouds based on the global index structure. The 3D geometric extraction is then performed in parallel and independently on each partition of the global index. Subdividing the data in this way enables RANSAC to be implemented on a large dataset. In fact, the RANSAC model often takes a lot of time in the early stages, when no points have been classified: even if a random plane is created, all the remaining points need to be scanned to find which ones belong to this plane. Subsequently, eliminating the already classified points from the input points also takes quite a lot of time, even if it can be done by processing arrays in memory. In our study, we improved the RANSAC algorithm of Jung et al. (2014) by applying the Octree* index structure. Specifically, the process of finding the points belonging to the base plane (usually created by 3 points) is accelerated, since only the points included in the child nodes that intersect this plane are used in the computation. In addition, separating the points found on a plane from the input points is accelerated by the "soft-join" algorithm mentioned above. The contour extraction process is performed using the Flood Fill algorithm (Burger and Burge 2016). The performance of this model is presented in section "Experiment 3" of this article.
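For reference, a minimal, non-accelerated RANSAC plane detector of the kind being sped up here can be sketched as below. This baseline scans all points for every candidate plane, which is exactly the cost that the Octree* acceleration avoids; the function name and parameters are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def ransac_plane(points, n_iter=200, tol=0.02, seed=0):
    """Minimal RANSAC plane fit: repeatedly sample 3 points, build the
    candidate plane, count inliers within `tol` of it, keep the best model.
    Returns a boolean inlier mask over `points`."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:                      # degenerate (collinear) sample
            continue
        normal /= norm
        dist = np.abs((points - p0) @ normal) # full scan: the expensive step
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers
```

In the accelerated version, the full-scan distance computation would be restricted to the points of Octree* leaf nodes intersecting the candidate plane, and the inliers would be removed from the working set via the soft-join.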

Figure 2. B-Eagle Join Operation based on labeling data


Interactive Layer: This layer makes it easier for users to access the system through a WebUI. For visualization, we generated multi-resolution (tile-based) data based on the Octree* structure. This structure ensures that the number of points is balanced between the tiles, since each leaf node corresponds to a tile at the highest LoD (Level of Detail). The generation of tile data is accelerated by a Map-Reduce application. The challenge in this step is that the tile data are often small in size, for which HDFS is not optimized. To overcome this limitation, the tile data are stored in HBase (https://hbase.apache.org), a column-oriented database management system running on top of HDFS. However, during the writing of tile data, too many reducers writing data to HFile (the storage unit of HBase) can cause bottlenecks. Therefore, the data can be temporarily written to a Hadoop SequenceFile before being converted into HFile. In this way, the performance of writing tile data is improved. At the same time, when a tile ID (key) is given, the tile data (value) can be loaded quickly. The loading and rendering engines are developed based on Potree, the state of the art in browser-based point cloud visualization; however, the whole structure had to be re-built to be compatible with Hadoop and the B-Eagle architecture. All images in "Experiment 3" were obtained using B-Eagle's visualization module.
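The tile-lookup idea can be illustrated as follows: each octree cell at a given LoD level maps to a key, and the points of that cell would be stored as the value under that key in a key-value store such as HBase. The encoding shown here is our own simplification, not B-Eagle's actual key format.

```python
import numpy as np

def tile_key(point, origin, root_size, level):
    """Derive a tile ID for a point at a given LoD level: the index of the
    octree cell containing the point at that level, encoded as
    'level/ix/iy/iz'. Coarser levels (smaller `level`) give fewer,
    larger tiles; level 0 is the root cell."""
    cell = root_size / (2 ** level)               # cell edge length at this level
    ix, iy, iz = ((np.asarray(point) - origin) // cell).astype(int)
    return f"{level}/{ix}/{iy}/{iz}"
```

A viewer can then request exactly the keys for the cells in its view frustum at the LoD matching the camera distance, which is what keeps rendering responsive for massive clouds.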

EXPERIMENT
Experiment 1: In this experiment, we compared five spatial index structures on point cloud data, including Octree, Octree*, Kd-tree, R-tree, and Z-order, through the "soft-join" operation (dmax = 1 cm). We used dataset A (128 MB, ~3 million points) and dataset B, which was generated by randomly moving the points of dataset A within a range of [0-10 m]. Only a small dataset (128 MB) was chosen in order to check the performance of the reducer when the data could be processed in memory. The "soft-join" performed a series of kNN operations instead of simply finding two coincident points, and the performance of the kNN operations was improved through the index structures. From the results shown in Figure 3, it is clear that creating an index structure in memory does not take much time when the data size is small.
The original Octree is unstable compared to our Octree*, because the Octree's performance depends on the dimension (m) of the leaf node. In reality, we do not know which size m is best, as it depends on the MBR of the data. In our Octree*, the convergence time to the query location is proportional to the value t, the maximum number of points per leaf node; in this experiment, the t value should be in [10-100]. The node partitioning of the Kd-tree is the same as in the Octree*. The convergence time of the Kd-tree structure is the fastest with t in [10-100]; however, creating a Kd-tree structure takes longer than creating an Octree structure, because it always requires sorting the locations of the objects during the indexing process. The R-tree structure is more complex and has the most optional parameters among the index structures. The time for creating an R-tree is the longest, due to updating the directory rectangles of the ancestor nodes instead of linearly dividing nodes as in the other tree structures. Of the three split methods of the R-tree, the Sort-Tile-Recursive split method produced better results than the others. The Z-order is a simple but efficient indexing method for point cloud data: it converts multi-dimensional data into one-dimensional data, allowing the search to be performed through a B-tree structure; thus, the convergence time with Z-order does not depend on the actual location of the object in real space. Overall, the bottom image in Figure 3 shows a comprehensive comparison of the five indexing techniques.
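The Z-order conversion mentioned above can be sketched as bit interleaving of quantized coordinates; this minimal version (with an assumed bit depth per axis) produces the one-dimensional Morton key on which a B-tree can be built.

```python
def morton3d(x, y, z, bits=10):
    """Interleave the bits of quantized x, y, z cell indices into a single
    Z-order (Morton) key, so that points close in 3D tend to receive
    nearby keys in the resulting 1D ordering."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)        # x occupies bit positions 0, 3, 6, ...
        code |= ((y >> i) & 1) << (3 * i + 1)    # y occupies positions 1, 4, 7, ...
        code |= ((z >> i) & 1) << (3 * i + 2)    # z occupies positions 2, 5, 8, ...
    return code
```

Because the key is a plain integer, range and neighbourhood searches reduce to 1D key-range scans, which is why the convergence time does not depend on where the object sits in real space.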

Figure 3. Performance of the five indexing techniques


Experiment 2: This experiment aims to observe the performance of the "soft-join" algorithm on large datasets. We used a real dataset (4 GB, 100+ million points) and generated different datasets with different sizes and MBRs. The Hadoop cluster was configured with 8 nodes of the same capacity (Intel Core i5-2300, 2.80 GHz x 4, 16 GB memory). In Figure 4a, when the data completely overlapped each other, our application took only 25 minutes for the whole process between 16x16 GB synthetic datasets (600x600 million points). As a relative comparison, SpatialHadoop used 20 nodes and took 60 minutes (not including indexing time) to complete a spatial join operation on synthetic data of the same size (Eldawy and Mokbel 2015). Of course, processing polygons is more complex than processing points, and that framework was designed for 2D data while ours was designed for 3D data. Nonetheless, the performance of our join algorithm is improved significantly because the entire join process is performed in memory (similar to Apache Spark). In another comparison (Figure 4b), the performance of our join algorithm was much better than PostgreSQL in both the preparation step (indexing the data) and the implementation step. More specifically, the Pgpointcloud library (https://github.com/pgpointcloud/pointcloud) was unable to perform the join operation on a large number of records, such as 100x100 million, as in this case.

Figure 4. Performance of the “soft-join” algorithm


Experiment 3: This experiment aims to observe the performance of the "3D Geometric Model". We used 4 GB (100+ million points) of registered data, from which points not related to the building structure were removed. With the same data, in the past, we had to spend a lot of time partitioning and down-sampling the data before processing it on a single computer. Using our new solution, this workflow can be handled concurrently by 8 nodes in an average of 4 minutes. The data after processing are only 20 MB (0.5%), yet they still maintain the characteristics of the building, as shown in Figure 5. In a similar experiment, the complexity was increased: the density of the input point cloud was not uniform and all materials at the site were retained. Particularly, we used 5 GB of data (130+ million points) collected in a lecture hall. The results in Figure 6 show that the "3D Geometric Model" could still extract the basic structure of the building, although the accuracy decreases for the reasons above.

Figure 5. Performance of the 3D Geometric Model

Figure 6. Performance of the 3D Geometric Model on noise data


CONCLUSION
In this article, we presented a framework for massive point cloud data geoprocessing, based on Hadoop, within the scope of construction site monitoring, covering all three aspects: storage, processing, and visualization. The "Change Detection" module achieved high granularity thanks to the "soft-join" algorithm as given. The results obtained from the "3D Geometric Model" were improved; however, they are still limited and need to be further refined before an advanced model such as BIM can be achieved. The initial visualization module has shown its feasibility; however, the speed of generating tile data and the loading process need to be further optimized. In the future, this framework will be equipped with more features, such as object recognition, to better support construction site monitoring. Step by step, we expect to develop a powerful point cloud data processing framework, where the interactions with common users remain the same while the processing is accelerated by a computer cluster such as Hadoop.

ACKNOWLEDGMENTS
This work was supported by the National Research Foundation of Korea (NRF) grant funded
by the Korea government (Ministry of Science and ICT) (No. 2018R1A2B2009160).

REFERENCES
Bosché, F., Ahmed, M., Turkan, Y., Haas, C. T., and Haas, R. (2015). “The value of integrating
Scan-to-BIM and Scan-vs-BIM techniques for construction monitoring using laser scanning
and BIM: The case of cylindrical MEP components.” J. Autom. Constr., 49, 201-213.
Burger, W., and Burge, M. J. (2016). "Regions in Binary Images." Digital Image Processing,
Springer, London, 209-222.
Eldawy, A., and Mokbel, M. F. (2015). “Spatialhadoop: A mapreduce framework for spatial
data.” Proc., 31st Int. Conf. on Data Engineering, IEEE, 1352-1363.
Han, S. H. (2018). “Towards Efficient Implementation of an Octree for a Large 3D Point Cloud.”
J. Sens., 18(12), 4398.
Han, S. H., Kim, S., Jung, J. H., Kim, C., Yu, K., and Heo, J. (2012). “Development of a
hashing-based data structure for the fast retrieval of 3D terrestrial laser scanned data.” J.
Comput. Geosci., 39, 1-10.
Han, S. H., Lee, S. J., Kim, S. P., Kim, C. J., Heo, J., and Lee, H. B. (2011). “A Comparison of
3D R-tree and octree to index large point clouds from a 3D terrestrial laser scanner.” J.
Korean Soc. Surv., Geod., Photogramm. Cartography, 29(1), 39-46.
Jung, J., Hong, S., Jeong, S., Kim, S., Cho, H., Hong, S., and Heo, J. (2014). “Productive
modeling for development of as-built BIM of existing indoor structures.” J. Autom. Constr.,
42, 68-77.
Jung, J., Hong, S., Yoon, S., Kim, J., and Heo, J. (2015). “Automated 3D wireframe modeling of
indoor structures from point clouds using constrained least-squares adjustment for as-built
BIM.” J. Comput. Civil Eng., 30(4), 04015074.
Kissling, W. D., Seijmonsbergen, A., Foppen, R., and Bouten, W. (2017). “eEcoLiDAR,
eScience infrastructure for ecological applications of LiDAR point clouds: reconstructing the
3D ecosystem structure for animals at regional to continental scales.” J. Res. Ideas
Outcomes, 3, e14939.
Leutenegger, S. T., Lopez, M. A., and Edgington, J. (1997). "STR: A simple and efficient
algorithm for R-tree packing." Proc., 13th Int. Conf. on Data Engineering (ICDE), IEEE,
497-506.
Li, Z., Hodgson, M. E., and Li, W. (2018). “A general-purpose framework for parallel processing
of large-scale LiDAR data.” Int. J. Digital Earth, 11(1), 26-47.


Pajić, V., Govedarica, M., and Amović, M. (2018). “Model of Point Cloud Data Management
System in Big Data Paradigm.” Int. J. Geo-Inf., ISPRS, 7(7), 265.
Schütz, M. (2016). “Potree: Rendering large point clouds in web browsers.” Diploma. Thesis,
TU Wien Uni., Wien, Austria.
You, S., Zhang, J., and Gruenwald, L. (2015). “Spatial join query processing in cloud: Analyzing
design choices and performance comparisons.” Proc., 44th Int. Conf. on Parallel Processing
Workshops (ICPPW), IEEE, Washington, DC, 90-97.


Spatial Change Tracking of Structural Elements of a Girder Bridge under Construction Using 3D Point Cloud
Sudip Subedi1; Vamsi Kalasapudi, Ph.D.2; and Nipesh Pradhananga, Ph.D.3
1Dept. of Civil and Environmental Engineering, Florida International Univ., 10555 W. Flagler St., Miami, FL 33174. E-mail: [email protected]
2Moss School of Construction, Infrastructure, and Sustainability, Florida International Univ., 10555 W. Flagler St., Miami, FL 33174. E-mail: [email protected]
3Moss School of Construction, Infrastructure, and Sustainability, Florida International Univ., 10555 W. Flagler St., Miami, FL 33174. E-mail: [email protected]

ABSTRACT
Inability to capture and resolve spatial changes (deviations and deformations) during the construction process can have severe impacts on the structural behavior of a bridge during its operation and maintenance. The major challenge lies in understanding the origin of the spatial changes of a bridge element, which develop either during the construction phase, post-construction, or during the service life. The objective of this study is to explore the potential of leveraging 3D point cloud data to analyze the quality of bridge construction and identify the deformations caused during the accelerated bridge construction process. This paper presents a case study of an ongoing bridge construction project where the site was scanned multiple times to obtain several 3D point cloud datasets. The obtained 3D point cloud data were analyzed to discover any spatial changes in major bridge members (abutment, concrete bearing pad, and bridge girder) induced by either construction error or structural deformation. The authors believe that such a detailed study will produce a novel method to differentiate between deformations caused in bridge elements during the accelerated construction process and those caused during the service life. The findings will potentially pave a new avenue of research by pinpointing the causes of long-term deformation and can play a key role in taking preventive actions during construction to rectify potential problems. The proposed research can also act as a tool for construction managers to evaluate the quality of work and help save time and money by identifying discrepancies as soon as they occur on site.

INTRODUCTION
The growing need for construction of new bridges and for replacement and repair of structurally deficient older bridges comes with supervision challenges that include minimizing traffic disruption, limiting financial liabilities and, most importantly, ensuring workers' safety (Li, Ma, Griffey, & Oesterle 2010). These challenges led to a rapid increase in the need for accelerated bridge construction techniques, demanding safer and faster construction of bridges. The implementation of accelerated construction techniques on a construction site demands efficient quality control mechanisms, and several researchers identified the effect of construction-related errors on the structural reliability of a bridge (Catbas, Ciloglu, Hasancebi, Grimmelsman, & Aktan 2007; Ellingwood 1987). However, existing studies are more focused on structural health monitoring of bridges during their service than on identifying errors that originate during the construction process (Hu & Wang 2013; Zhou, Li, Xia, Yang, & Zhang 2017).
Traditional infrastructure construction monitoring systems focus more on monitoring the overall construction processes and do not particularly emphasize identifying changes to the elements of the infrastructure during construction. Researchers have developed techniques that include visual monitoring using Unmanned Aerial Vehicles (UAVs) (Ham, Han, Lin, & Golparvar-Fard 2016), implementation of Building Information Modeling (BIM) for construction process optimization and monitoring (Liu, Guo, Li, & Li 2014), and application of Global Positioning System (GPS) and remote sensing technology for construction monitoring (Kang 2017). These methods cannot accurately identify important dimensional changes (deformations, deviations, etc.; hereafter "spatial changes") that occur in the elements of a bridge under construction. Failure to identify such errors could significantly affect the structural behavior of a bridge and can lead to adverse financial consequences (Ellingwood 1987). Statistics from past research showed that approximately 10% of construction rework cost is due to delays in detecting spatial changes that can cause structural collapse (Sedek & Serwa 2016). Additionally, it is extremely difficult to identify spatial changes during the accelerated construction process due to the tight schedule and rapid construction processes. Therefore, the authors leveraged 3D laser scanners to rapidly detect three-dimensional spatial changes of bridge elements during such an accelerated construction process.
Recently, the use of 3D laser scanners for sensing and modeling the construction environment has been increasing significantly (Akinci et al. 2006; Golparvar-Fard, Bohn, Teizer, Savarese, & Peña-Mora 2011; Sedek & Serwa 2016). Traditional monitoring methods such as optical sensors, GPS/GIS sensors, acceleration sensors, and strain gauges have also been used in some major long-span bridges for monitoring deflections and deformations (Zhou et al. 2017). But their inapplicability to older bridges, due to the requirement of pre-embedded sensors, high installation cost, and damage proneness, has limited the use of such technologies for structural health monitoring (Hu & Wang 2013). Several studies validated the potential of using a 3D laser scanner for construction quality control (Kalasapudi & Tang 2015), construction defect control (Sedek & Serwa 2016), structural health monitoring (Yang, Xu, & Neumann 2016), and spatial change and deformation monitoring (Kalasapudi & Tang 2017).
This paper focuses on identifying the spatial changes (including dimensional, position, and orientation changes) that originate during the construction of a single-span bridge structure using 3D point cloud data (data obtained by scanning the site with a laser scanner). Such identified changes can help us better understand the performance of the bridge during its service life and identify the root cause of a spatial change deteriorating the structure in the long term. The 3D point cloud data obtained from the laser scanner can also be used for post-construction as-built modeling and can serve as a basis for structural health monitoring during the bridge's service life.

CASE STUDY
The authors collected 3D point cloud data of an ongoing bridge construction project located in Broward County, Florida. The bridge is a simply supported single-span structure with 8 girders spaced 5'-4.125" center to center. The length of each girder is 180'-10.25". The girders rest upon elastomeric bearing pads placed on top of concrete bearing pads (hereinafter called the concrete pads). The construction of the two abutments and the concrete pads on top of them was already completed before the start of the data collection process.
The first 3D point cloud dataset was collected for the completed abutments without concrete girders (hereinafter referred to as Milestone_1). The 3D point cloud data also included adjacent permanent structures (highway, barriers on the highway, building roofs) for facilitating feature-based registration. Two girders were scheduled for placement on top of the concrete pads each day, requiring a total of four working days to complete the girder installation process. After the installation of every two girders, bracings were installed to support the girders, requiring at least one additional day between each girder-pair installation. Each two-girder installation task was defined as a milestone, and the construction site was scanned after the completion of each milestone. Table 1 shows the installation details, including the milestones, the number of girders installed, and the number of scans taken after each milestone. Figure 1 shows the actual construction site with the two completed abutments (on left) and its cleaned and segmented point cloud data (on right).

Table 1. Girder Installation and Scanning Details


Milestones No. of Girders Installed No. of Scans
1 0 6
2 2 (Girder 1 & 2) 7
3 4 (Girder 3 & 4) 5
4 6 (Girder 5 & 6) 4
5 8 (Girder 7 & 8) 4
6 Deck cast 5

Figure 1. Construction site with two completed abutments (left) and cleaned 3D point cloud (right)
For the scope of this paper, the authors analyzed milestone_1 (with no girders), milestone_2 (with 2 girders installed), and milestone_3 (with 4 girders installed). The 3D point cloud data pertaining to milestone_1 was taken as the reference scan for the registration of the milestone_2 and milestone_3 3D point cloud data. Any geometric changes in the girders from milestone_2 (length, shape, bending, torsion, etc.) were analyzed before and after the completion of milestone_3. Figure 2 shows the point cloud of the construction site with two girders (on left) and four girders (on right), respectively.
Due to ongoing construction work with moving construction workers and equipment, and moving traffic on the adjacent highway, the data contained a lot of noise, requiring rigorous preprocessing. Preprocessing involved noise cleaning and registration of the scans of the same milestone. After preprocessing the 3D point cloud data, the authors utilized a feature-based registration method to register all the milestones together in a single coordinate system (Böhm & Becker 2006; Shi et al. 2017; Stamos & Leordeanu 2003). After bringing all the scans of each milestone into a single coordinate system, they were registered accurately using a robust registration algorithm (Kalasapudi 2017; Kalasapudi & Tang 2017) and validated by manual registration.
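As background, the core of such a registration step, estimating the rigid transform that aligns matched features between two scans, can be sketched with the Kabsch/Procrustes method. This is a generic illustration under the assumption of known point correspondences, not the authors' specific robust algorithm; the function name is ours.

```python
import numpy as np

def rigid_transform(src, dst):
    """Kabsch/Procrustes estimate of the rigid transform (R, t) with
    R @ src_i + t ~= dst_i for corresponding point pairs, a building
    block of scan registration once feature matches are available."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    h = (src - cs).T @ (dst - cd)                # cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))       # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = cd - r @ cs
    return r, t
```

In practice a robust registration wraps an estimator like this in an outlier-rejection loop, since feature matches on a noisy construction site are never all correct.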

Figure 2. Construction site with two girders installed and blow-up of the concrete pad (left)
and four girders installed (right)

CASE STUDY RESULTS


The abutments, bridge girders, and the concrete pads upon which the girders rested were analyzed to find any significant spatial changes at the different stages of construction (milestone_2 and milestone_3). The abutments were inspected to detect any settlement of the structure; no significant change was found in the abutments between milestone_2 and milestone_3. Similarly, the bridge girders were inspected for any changes due to construction misalignment, torsion, side sways, etc. For this, the girders were automatically extracted from the scans and compared to observe any spatial changes. No significant spatial change was noticed in the first two girders (milestone_2) due to the installation of the additional two girders (milestone_3). Finally, the authors analyzed the concrete pads and identified the spatial changes.
The authors observed that after the placement of the additional two girders (milestone_3),
the distance between the concrete pads supporting each girder had increased. The
authors used a planar-surface recognition algorithm to identify the concrete pads. The
average perpendicular distance between the outer faces of the bearing pads
supporting the same girder was computed using a plane-to-plane distance measurement algorithm. Figure 3 shows
the plotted 3D point cloud data (top) and the fitted planes (bottom). Different scales
were used for the three axes to provide a better perspective, as the size of a concrete pad is very
small compared to the distance between the pads.
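A minimal sketch of such a plane-to-plane measurement, assuming each pad face has already been segmented into its own point set, could fit a least-squares plane by SVD and average the perpendicular distances. The function names and synthetic data are illustrative, not the authors' implementation.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through an n x 3 point array: (unit normal, centroid)."""
    centroid = points.mean(axis=0)
    # The plane normal is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(points - centroid)
    return vt[-1], centroid

def mean_plane_to_plane_distance(face_a, face_b):
    """Average perpendicular distance from face_b's points to face_a's fitted plane."""
    normal, centroid = fit_plane(face_a)
    return float(np.abs((face_b - centroid) @ normal).mean())
```

For two nominally parallel pad faces roughly 54.15 m apart, this returns their average perpendicular separation, which is the quantity tabulated per girder below.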


Figure 3. 3D point cloud data (on top) and the fitted planes (on bottom) for North and
South concrete pad outer faces
The distance between the concrete pads was also computed within the CloudCompare
software environment to validate the computational algorithm. First,
CloudCompare's plane-fitting tool created an individual plane for each concrete pad's outer face.
Then, the "compute cloud/mesh distance" algorithm provided the minimum, average, and
maximum distances between the two planes. The standard deviation of the computed distances
was negligible. Similar measurements were observed in both cases (the computational
algorithm and CloudCompare's compute cloud/mesh distance algorithm), further validating the
computational algorithm.
The perpendicular distances were computed between the 6 pairs of concrete pads located on the
north and south abutments for milestone_2 and milestone_3. The plane-to-plane distance
computation showed that the distance between the concrete pads increased for each pair
after the installation of the two additional girders (milestone_3) compared with milestone_2.
Table 2 shows the computed average plane-to-plane distance between the outer faces of the concrete
pads for the 6 girders after the placement of the first and second pairs of girders.


Table 2. Concrete pad to Concrete pad distances for 6 girders


Girder Milestone_2 (m) Milestone_3 (m) Difference (mm)
1 54.1471 54.1866 39.5
2 54.1957 54.2216 25.9
3 54.2015 54.2373 35.7
4 54.2116 54.2239 12.3
5 54.1716 54.2428 71.2
6 54.2075 54.2617 54.2

The average increment in plane-to-plane distance for the 6 measured pairs of concrete pads was
39.8 mm, with a sample standard deviation of 20.8 mm. The maximum increase observed
was 71.2 mm (girder 5) and the minimum was 12.3 mm (girder 4). This increase in the
distance between the outer faces of the concrete pads suggests that one or both abutments
moved away from each other due to the addition of the girders, as no spatial change was noticed
between the concrete pads of the same abutment. The authors plan to collect more 3D point
cloud data to pinpoint the movement of each abutment and to determine whether these observed
changes have any influence on the service life of the bridge structure.
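As a quick arithmetic check (not part of the authors' pipeline), the reported statistics follow directly from the Table 2 differences:

```python
from statistics import mean, stdev

# Increments in plane-to-plane distance (mm) for girders 1-6, from Table 2
diffs_mm = [39.5, 25.9, 35.7, 12.3, 71.2, 54.2]

avg = mean(diffs_mm)        # average increment
spread = stdev(diffs_mm)    # sample standard deviation
print(round(avg, 1), round(spread, 1), max(diffs_mm), min(diffs_mm))
# prints: 39.8 20.8 71.2 12.3
```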

CONCLUSION
From this pilot study, the authors observed that 3D laser scanning can be effective for quality
control in accelerated bridge construction projects as well as for documenting the
as-built condition for future inspection and maintenance planning. The 3D point cloud
analysis showed that no significant deviation occurred in the first two girders even after the
installation of the second pair of girders. Similarly, there was no significant deviation in the
abutments between milestone_2 and milestone_3. However, the authors observed a change in
the distance between the concrete bearing pads on which the girders rested between the
milestone_2 and milestone_3 point clouds. Additional analysis also showed that
there was no lateral displacement between the concrete pads of the same abutment between the
3D point cloud data collected at milestone_2 and milestone_3. This suggests that, with the
additional load of two more girders, the abutments underwent minor
displacements, which could be further verified by measuring the distance between the concrete pads
after placing additional girders. This method provides an easy way for engineers to identify,
visualize, and interpret such deformations.

Limitations
The major limitation of this research was the lack of design parameters defining the
permissible deviations for each structural element. Such design parameters will be obtained from
the structural designer of the bridge and used to check whether the
deformations or deviations noticed in the bridge elements are within permissible limits.
In addition, the authors compared the bridge structure between only two milestones in this paper.
Additional comparisons should be performed for other milestones to analyze the pattern of
change. For this, the authors plan to further analyze the behavior of the bridge elements for other
construction milestones and after the addition of service load (vehicular live load).


Future Work
Future work includes the identification and separation of structural deformations from
construction deviations, to allow errors to be rectified before further construction
work is carried out. More scans will be collected during construction and after completion to create
the as-built model, which will serve as the basis for future inspection. During the bridge's service life, the
authors plan to collect 3D point clouds every two months for spatiotemporal analysis of the bridge
structure, to monitor whether the spatial changes identified during construction are deteriorating and
interfering with the structural functioning of the bridge post-construction. Such continuous
monitoring can support better decisions during the rehabilitation of the bridge
structure.
The researchers will also focus on developing an automated structural health monitoring
model that, in collaboration with the bridge structural engineers, controls the quality of the bridge
structure during construction, creates the as-built model, supports service-life inspection, and
identifies and predicts possible structural deformations.
In addition, future work includes the development of an automated model that
identifies all important structural deformations and defects so that structural engineers can
retrofit the bridge structure before failure. The use of laser scanning for crack identification
has already been validated by researchers (Laefer, Truong-Hong, Carr, & Singh 2014).
Furthermore, the actual deflection of the bridge girders computed from the 3D point cloud data
will be compared periodically with the time-dependent deflection (including creep and shrinkage)
computed by the bridge structural engineers. If the actual deflection exceeds the
expected value, the structural engineers can conduct a full inspection to find any
abnormalities in the bridge. This will help increase the overall lifespan of the bridge by
preventing possible crack formation and failure in advance.

REFERENCES
Akinci, B., Boukamp, F., Gordon, C., Huber, D., Lyons, C., & Park, K. (2006). A Formalism for
Utilization of Sensor Systems and Integrated Project Models for Active Construction Quality
Control. Automation in Construction, 15(2), 124–138.
Böhm, J., & Becker, S. (2006). Automatic marker-free registration of terrestrial laser scans using
reflectance features. The International Archives of the Photogrammetry, Remote Sensing and
Spatial Information Sciences, 36(5/W17), 338–344.
Catbas, F. N., Ciloglu, S. K., Hasancebi, O., Grimmelsman, K., & Aktan, A. E. (2007).
Limitations in structural identification of large constructed structures. Journal of Structural
Engineering-ASCE, 133(8), 1051–1066. https://doi.org/10.1061/(ASCE)0733-9445(2007)133:8(1051)
Ellingwood, B. (1987). Design and Construction Error Effects on Structural Reliability. Journal
of Structural Engineering, 113(2), 409–422. https://doi.org/10.1061/(ASCE)0733-9445(1987)113:2(409)
Golparvar-Fard, M., Bohn, J., Teizer, J., Savarese, S., & Peña-Mora, F. (2011). Evaluation of
image-based modeling and laser scanning accuracy for emerging automated performance
monitoring techniques. Automation in Construction, 20(8), 1143–1155.
https://doi.org/10.1016/j.autcon.2011.04.016
Ham, Y., Han, K. K., Lin, J. J., & Golparvar-Fard, M. (2016). Visual monitoring of civil
infrastructure systems via camera-equipped Unmanned Aerial Vehicles (UAVs): a review of
related works. Visualization in Engineering, 4(1), 1–8. https://doi.org/10.1186/s40327-015-0029-z
Hu, X., & Wang, B. (2013). A Wireless Sensor Network-Based Structural Health Monitoring
System for Highway Bridges. Computer-Aided Civil and Infrastructure Engineering, 28,
193–209. https://doi.org/10.1111/j.1467-8667.2012.00781.x
Kalasapudi, V. S. (2017). Robust Registration Algorithm for Performing Change Detection of
Highway Bridges Using 3-D Laser Scanning Data. Eleventh International Bridge and
Structures Management Conference, (April), 243. Retrieved from www.TRB.org
Kalasapudi, V. S., & Tang, P. (2015). Automated tolerance analysis of curvilinear components
using 3D point clouds for adaptive construction quality control. Congress on Computing in
Civil Engineering, Proceedings, 2015-January, 57–65.
https://doi.org/10.1061/9780784479247.008
Kalasapudi, V. S., & Tang, P. (2017). A Robust Registration Algorithm for Automatic and
Reliable Geometric Change Detection of Bridges using 3D Laser Scanning Data, 09, 2017.
Kang, C. (2017). Application of GPS and Remote Sensing Image Technology in Construction
Monitoring of Road and Bridge, (Ssme), 337–342.
Laefer, D. F., Truong-Hong, L., Carr, H., & Singh, M. (2014). Crack detection limits in unit
based masonry with terrestrial laser scanning. NDT and E International, 62, 66–76.
https://doi.org/10.1016/j.ndteint.2013.11.001
Li, L., Ma, Z. (John), Griffey, M. E., & Oesterle, R. G. (2010). Improved Longitudinal Joint
Details in Decked Bulb Tees for Accelerated Bridge Construction: Concept Development.
Journal of Bridge Engineering, 15(3), 327–336. https://doi.org/10.1061/(ASCE)BE.1943-5592.0000067
Liu, W., Guo, H., Li, H., & Li, Y. (2014). Using BIM to improve the design and construction of
bridge projects: A case study of a long-span steel-box arch bridge project. International
Journal of Advanced Robotic Systems, 11(1), 1–11. https://doi.org/10.5772/58442
Sedek, M., & Serwa, A. (2016). Development of new system for detection of bridges
construction defects using terrestrial laser remote sensing technology. Egyptian Journal of
Remote Sensing and Space Science, 19(2), 273–283. https://doi.org/10.1016/j.ejrs.2015.12.005
Shi, Y., Xiong, W., Kalasapudi, V. S., Geng, C., Zhang, C., & Tang, P. (2017). Automated
Change Diagnosis of Single-Column-Pier Bridges Based on 3D Imagery Data.
Computing in Civil Engineering, 91–98. https://doi.org/10.1061/9780784407943
Stamos, I., & Leordeanu, M. (2003). Automated Feature-Based Range Registration of Urban
Scenes of Large Scale. CVPR (2), 0–6.
Yang, H., Xu, X., & Neumann, I. (2016). Laser Scanning-Based Updating of a Finite-Element
Model for Structural Health Monitoring. IEEE Sensors Journal, 16(7), 2100–2104.
https://doi.org/10.1109/JSEN.2015.2508965
Zhou, J., Li, X., Xia, R., Yang, J., & Zhang, H. (2017). Health Monitoring and Evaluation of
Long-Span Bridges Based on Sensing and Data Analysis: A Survey. Sensors, 17(3), 603.
https://doi.org/10.3390/s17030603


A 3D Irregular Packing Algorithm Using Point Cloud Data


Yinghui Zhao1 and Carl T. Haas, F.ASCE2
1Ph.D. Candidate, Dept. of Civil and Environmental Engineering, Univ. of Waterloo, ON N2L
3G1, Canada. E-mail: [email protected]
2Professor, Dept. of Civil and Environmental Engineering, Univ. of Waterloo, ON N2L 3G1,
Canada. E-mail: [email protected]

ABSTRACT
The cutting and packing (C&P) problem has been extensively studied as it has a wide variety
of applications in many industries. Good packing solutions can effectively reduce manpower and
production costs. However, approaches for packing the 3D irregular-shaped items common in
construction are very limited. In this paper, a heuristic algorithm is proposed that packs a set of
irregular-shaped items into a box-shaped container with the objective of maximizing the contact area between
objects, as one step toward alternative packing solutions. A 3D scanner is
employed to obtain the geometric information of the items. The heuristic algorithm determines the
rotation and translation of each item, moves the objects into close proximity, and fits the objects
together automatically using a point cloud representation. This is a new approach. Experimental
results show that the proposed approach has the potential to support good packing solutions for
realistic items in a reasonable time.

INTRODUCTION
The Cutting and Packing (C&P) problem has wide application in the medical, materials
science, chemical, mechanical engineering, shipbuilding, aircraft construction, transportation,
and garment industries. The basic problem is to find the packing layout that
optimizes a given objective. The C&P problem with 3D irregular shapes has become a significant research
interest in recent years. One of the main driving forces for research on the 3D irregular packing
problem comes from the 3D printing industry. Three-dimensional printing, also known as
additive manufacturing, is a manufacturing method that produces 3D shapes by adding
layers. Three-dimensional printing is growing rapidly and has massive potential in many
industries, including medical instruments, aerospace, and robotics. It also
has a promising outlook in the construction industry, with applications including buildings and
bridges. For example, the office building for the Dubai Future Foundation was printed by Winsun, a
Chinese 3D-printing architecture company. Benefits arise through the possibility of packing
several parts into one build chamber and printing them simultaneously in batch production (Wu
et al. 2014). As the use of 3D printing is mainly focused on off-site fabrication, the C&P problem
can help decrease transportation costs for the prefabricated parts.
Another application of the C&P problem in the construction industry lies in facility
decommissioning and disassembly. Take the decommissioning of a nuclear power plant as an
example: the packing of nuclear waste produced during the decommissioning phase strongly affects
the final number of containers required, the manpower required, the manipulation complexity, and
consequently the costs. Further, the C&P problem can be applied more generally in
the construction industry, such as assisting in the placement of construction materials on
construction sites, improving the transport efficiency of building tools or fabricated construction
assemblies by increasing the space utilization of trucks, and reducing the storage space required for
construction waste. Despite all these promising applications, as far as we know, no research has
been done on the C&P problem in the scope of the construction industry.
This paper describes a novel packing algorithm for 3D irregular objects using scanned point
cloud data. In the current study, a solution to the 3D irregular packing problem is proposed that uses the
principle of maximum contact surface between objects to minimize the packing volume. A
collision detection method for point cloud models is developed, and
multiple orientations of each object are taken into consideration.
The paper is organized as follows. A brief introduction of the existing 3D irregular packing
approaches is given in section 2. Section 3 explains the proposed collision detection algorithm
for point cloud models, followed by the description of the maximum contact surface principle in
section 4. The 3D irregular packing algorithm is described in detail in section 5. Experiment
results are shown in section 6. Finally, section 7 contains concluding remarks.

RELATED WORK
Heuristic methods: Heuristic algorithms, which aim to find an optimized, “good enough”
solution in an acceptable amount of time, are often used to deal with packing problems. Bortfeldt
and Wäscher (2013) asserted that heuristic algorithms, in particular metaheuristics, are and will
remain the most important class of algorithms for solving the container loading problems in the
foreseeable future. Liu et al. (2015) proposed a packing algorithm called HAPE3D, a heuristic
algorithm aiming at minimizing the total potential energy of the packing, which can be
hybridized with a simulated annealing algorithm to further improve the packing quality by
optimizing the sequence of the packing. Wu et al. (2017) introduced a genetic algorithm based
two-step packing procedure: (1) determine the sequence of packing and orientation for each
small item by genetic algorithm; and (2) place items into the container one-by-one using a
modified bottom-left-fill placement heuristic, which is an extended method from the bottom-left
heuristic for the 2D packing problem. Yao et al. (2015) introduced a relaxed placement method
using level-set representation of small items. The idea is to slowly reduce the size of the
container and remove the collision area using level set function. Another contribution of their
research is that they tried to incorporate the cutting process into the packing through iteration
between these two processes. Vanek et al. (2014) also managed to combine cutting and packing,
using iteration between height-field-based packing and Tabu-search sequence
optimization.
Mathematical modeling: Mathematical modeling refers to the methodologies that attempt to
formulate the packing problem into a mathematical programming problem. Due to the complex
characteristics of the irregular shapes and the multiplicity of objectives, there are only a few
methods that attempt to find the exact or optimal solutions to the problem. Stoyan et al. (2005)
extended phi-function into three dimensions to pack three-dimensional convex polytopes into a
parallelepiped. However, because the polytopes can only be translated, rotation is not allowed. Stoyan
et al. (2016a) developed a non-linear programming model using ready-to-use phi-function, to
search for local optimal solution for two-dimensional irregular packing problem with continuous
rotation. Romanova et al. (2018) proposed a non-linear programming formulation to manage the
concave polyhedra packing problem with continuous rotations using quasi phi-functions
introduced by Stoyan et al. (2016b). The complexity of generating phi-functions for arbitrary
shapes is the major limitation of this approach.
To address the shortcoming identified for phi-functions, the use of point cloud representation
is proposed as leading to a class of solutions to optimize the C&P problem of 3D objects. Point


cloud models are increasingly used to measure complex geometry or environments in various
fields, including environmental surveying and robotics. As 3D scanners become more
affordable, point clouds have also become a popular tool for shape representation, allowing
high-precision and rapid geometry acquisition for objects without existing synthetic models.
Compared with previously developed packing algorithms, the proposed packing algorithm
does not require a pre-designed CAD model; the point cloud representation is quick and easy to
acquire and is more consistent with the actual shape of the objects. For this reason, the
proposed packing algorithm would be suitable for construction components (with careful
tolerance management).

COLLISION DETECTION METHOD


Little literature (Figueiredo et al. 2010; Pan et al. 2013) exists on determining collisions
between point clouds. In this research, the 3D irregular packing problem with unknown geometries
is addressed. Three-dimensional scanning technology is employed to collect the raw data, and a
novel collision detection method for point clouds in the context of the packing problem is
proposed. Building upon the sphere assembly model (Li and Zhao 2009) and the bubblepack
algorithm in PFC3D (Particle Flow Code, a discrete element modeling framework for
three-dimensional programs), every point in a point cloud is represented by a sphere.

Figure 1. Collision detection method

Figure 2. Rotation and starting points for point clouds


Consider two point clouds A and B, which have m and n points respectively. To determine
whether these point clouds are intersecting, two parameters a and b are defined as shown in figure 1;
they are empirical values that depend on the density of the point clouds. The value a denotes the
diameter of the spheres representing each scanned point, which is also the minimum distance
between point cloud A and point cloud B when they are as close to each other as measurement
error might allow without collision and yet not connected. The value b denotes another critical
distance: when the distance between two point clouds is greater than b, the two point clouds are
considered separate. Let d_ij denote the distance between the i-th point in point cloud A and the j-th
point in point cloud B, where i ∈ [1, m] and j ∈ [1, n]. Let d_i denote the minimum distance between point i
in point cloud A and point cloud B, d_i = min_j(d_ij).
Consider the situation where two point clouds are approaching each other. For point i in point cloud
A, the distance d_i from this point to point cloud B falls into one of three cases:
- a ≤ d_i ≤ b: the distance between this point and point cloud B is appropriate, and the point is
in contact with point cloud B;
- d_i > b: the point is far from point cloud B;
- d_i < a: the point is inside point cloud B.
For point clouds A and B as a whole:
- Collide: d_i < a for some point;
- Separate: d_i > b for every point; and
- Contact: a ≤ d_i ≤ b for at least one point, with no point satisfying d_i < a.

Figure 3. Flowchart of packing algorithm


In the program, a and b can be adjusted to easily change the distance between items in the
final packing result.
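The criterion above can be sketched as a small routine. This is a brute-force illustration with an assumed function name; the paper organizes points in a Kd-tree for speed, and the check is stated one-sided (points of A against cloud B), as in the text:

```python
import math

def classify(cloud_a, cloud_b, a, b):
    """Classify the spatial relation of two point clouds using the a/b thresholds.

    a: sphere diameter (collision threshold); b: separation threshold (a < b).
    Brute force O(m*n) for clarity.
    """
    any_contact = False
    for p in cloud_a:
        d_i = min(math.dist(p, q) for q in cloud_b)  # d_i = min_j d_ij
        if d_i < a:
            return "collide"       # this point of A lies inside B
        if d_i <= b:
            any_contact = True     # a <= d_i <= b: this point of A touches B
    return "contact" if any_contact else "separate"
```

With a = 0.1 and b = 0.2, for instance, a nearest-point distance of 0.15 is classified as contact.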


OPTIMIZATION ASSUMPTION
The area of the contact surface between objects reflects how well they fit together: the larger the
contact area, the more closely the objects are attached and the smaller the final overall volume is.
Essentially, the aim is to nest the objects. For point clouds, the number of points in
contact is counted. Based on this assumption, a criterion is proposed to judge how good the
relative placement of two point clouds is. Let s denote the number of points in
contact between two point clouds; the algorithm chooses the layout with the maximum s value.
Consider point clouds A and B, and let s_AB be the number of points in
contact between them. For each point in point cloud A, calculate d_i and count the
points for which a ≤ d_i ≤ b. The algorithm calculates s for each candidate placement and chooses
the layout with the highest s_AB value.
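Given the thresholds a and b defined above, the score s_AB can be counted as follows (an illustrative helper with an assumed name, not the authors' code):

```python
import math

def contact_score(cloud_a, cloud_b, a, b):
    """s_AB: number of points of A whose nearest distance to B lies in [a, b]."""
    return sum(
        1 for p in cloud_a
        if a <= min(math.dist(p, q) for q in cloud_b) <= b
    )
```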

Figure 4. Eight scanned instances

Table 1. Experiment results for 8 instances

Number   Number of points/object   Packing sequence change   Bounding box volume   Calculation time (s)
1        1000                      ---                       0.67945               71.739
2        5000                      ---                       0.68647               120.019
3        5000                      12 times                  0.64929               1411.438

ALGORITHM OUTLINE
The packing algorithm can be described as follows:
1. Read the point cloud of each object, and down-sample the points to speed up
computation. The proposed algorithm computes distances between points, so the
efficiency of the program depends heavily on the number of points in each point
cloud.
2. Determine the initial sequence of packing by sorting the point clouds in ascending order
of their volume. Choose the first two point clouds to start.
3. Calculate the principal components of each point cloud using principal component
analysis (PCA). Align the two point clouds based on their first principal components such that
the corresponding principal components point in the same direction.
4. Fix the position of point cloud A, and move point cloud B to different initial starting
placements. However, trying unlimited starting points is both unrealistic and inefficient.
The solution is to limit the number of starting points while allowing both point
clouds to rotate about the axes of their first principal components. The rotation increment
can be adjusted as needed. The number of starting positions of the second point
cloud is controlled through an angular parameter, as shown in figure 2.


5. Move point cloud B from its starting position toward point cloud A in fixed step
distances, using the collision detection method described above. The process stops
when the two point clouds are as close to each other as measurement error might allow
without collision. Calculate s_AB.
6. Repeat step 4 and 5 with different starting positions for point cloud B.
7. Choose the placement with the highest s_AB value as the preferred layout of the two point
clouds.
8. Consider the former packed point clouds as a whole, defined herein as a cluster cloud,
and run the packing algorithm with the next point cloud in the sequence.
9. The iteration stops when all point clouds are allocated.
10. Change the packing sequence by randomly swapping two objects or reversing the
sequence between two randomly picked objects.
11. Repeat from step 3. The algorithm stops when all iterations finish, and the layout with
the minimum volume is chosen.
The algorithm flowchart is presented in figure 3.
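Steps 3 and 4 depend on principal axes and on rotations that align them. One possible sketch with NumPy follows; the function names are ours, and the sign of a principal axis obtained from SVD is inherently ambiguous:

```python
import numpy as np

def first_principal_axis(points):
    """Unit direction of greatest variance (first principal component) via SVD.
    The sign of the returned axis is arbitrary."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def rotation_aligning(u, v):
    """Rotation matrix sending unit vector u onto unit vector v (Rodrigues formula)."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    w = np.cross(u, v)
    c = float(np.dot(u, v))
    if np.isclose(c, -1.0):  # antiparallel: rotate 180° about any axis normal to u
        helper = np.eye(3)[np.argmin(np.abs(u))]
        w = np.cross(u, helper)
        w /= np.linalg.norm(w)
        K = np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])
        return np.eye(3) + 2 * K @ K
    K = np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])
    return np.eye(3) + K + K @ K / (1 + c)
```

Applying `rotation_aligning(first_principal_axis(B), first_principal_axis(A))` to cloud B is one way to realize the alignment in step 3.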

EXPERIMENT RESULTS
The packing program was written in MATLAB on the Windows platform. To evaluate the
performance of the proposed algorithm, experiments were conducted on a computer with a 2.80
GHz i7 CPU and 8 GB of memory.
Eight pieces of waste from a structural laboratory were scanned using the Structure Sensor 3D
scanner, as shown in figure 4. They represent typical decommissioning objects. To shorten the
processing time in the experiment, each point cloud is allowed 4 rotations (0°, 90°, 180°, 270°)
around the axis of its first principal component when approaching the other, and the angular
spacing of starting positions is set to 45°. A Kd-tree is used to organize the points in each point
cloud in order to improve runtime efficiency.
As the calculation time depends on the number of points input to the algorithm, the same
experiment was conducted twice, with the scan of each object down-sampled to 1000
points/object in experiment 1 and 5000 points/object in experiment 2. Apart from
the number of points, another important factor is the packing sequence. Packing experiment 3
was carried out with 12 changes of the packing sequence. The sequence is changed by randomly
switching two objects' order, randomly selecting one object and inserting it elsewhere in the
sequence, or reversing the order between two objects.
The packing results are shown in figure 5 and table 1. Figure 5 shows that the
collision detection method for point clouds works well in the context of packing. Comparison
between figures 5b and 5c shows that a more preferable packing order can be found by changing
the sequence, which leads to a smaller packing volume. It is not surprising to see from table 1
that reducing the size of the point clouds shortens the calculation time. Further experiments
will expand on these preliminary results.
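For reference, the bounding box volume reported in table 1 can be computed for a merged (packed) point cloud as the volume of its axis-aligned bounding box; this helper is our reading of the metric, not the authors' code:

```python
import numpy as np

def bounding_box_volume(points):
    """Volume of the axis-aligned bounding box of an n x 3 point cloud."""
    extent = points.max(axis=0) - points.min(axis=0)
    return float(np.prod(extent))
```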

CONCLUSION
The point cloud is a popular representation in many industries, such as robotics, because it allows
high-precision and rapid geometry acquisition for objects without existing synthetic models.
However, no paper in the literature addresses packing using point clouds. This paper
describes a novel collision detection method and packing algorithm for 3D irregular objects
using scanned point cloud data. The proposed collision detection method considers every point in

© ASCE
Computing in Civil Engineering 2019 207

the point cloud as a sphere of diameter a.


The packing algorithm is based on the principle of maximizing the contact area between
objects. It locally moves one or more objects into close proximity as it fits
the objects together automatically. It will be useful within a broader optimization algorithm that uses
various search methods and heuristics.
Experiments show that the proposed collision detection method is effective in the context of packing,
and the packing algorithm can produce good packing solutions for realistic items in a reasonable
time.

Figure 5. Packing results for 8 scanned instances: (a) result with 1000 points/object and
volume-ascending packing sequence; (b) result with 5000 points/object and volume-ascending
packing sequence; (c) result with 5000 points/object and with changes of the packing
sequence


REFERENCES
Bortfeldt, A., & Wäscher, G. (2013). Constraints in container loading–A state-of-the-art review.
European Journal of Operational Research, 229(1), 1-20.
Figueiredo, M., Oliveira, J., Araújo, B., & Pereira, J. (2010). An efficient collision detection
algorithm for point cloud models. In 20th International conference on Computer Graphics
and Vision (Vol. 43, p. 44).
Li, S. X., & Zhao, J. (2009). Sphere assembly model and relaxation algorithm for packing of
non-spherical particles. Chin. J. Comp. Phys, 26(3), 167-173.
Liu, X., Liu, J. M., & Cao, A. X. (2015). HAPE3D—a new constructive algorithm for the 3D
irregular packing problem. Frontiers of Information Technology & Electronic Engineering,
16(5), 380-390.
Pan, J., Şucan, I. A., Chitta, S., & Manocha, D. (2013, May). Real-time collision detection and
distance computation on point cloud sensor data. In Robotics and Automation (ICRA), 2013
IEEE International Conference on (pp. 3593-3599). IEEE.
Romanova, T., Bennell, J., Stoyan, Y., & Pankratov, A. (2018). Packing of concave polyhedra
with continuous rotations using nonlinear optimisation. European Journal of Operational
Research, 268(1), 37-53.
Stoyan, Y. G., Gil, N. I., Scheithauer, G., Pankratov, A., & Magdalina, I. (2005). Packing of
convex polytopes into a parallelepiped. Optimization, 54(2), 215-235.
Stoyan, Y., Pankratov, A., & Romanova, T. (2016a). Cutting and packing problems for irregular
objects with continuous rotations: mathematical modelling and non-linear optimization.
Journal of the Operational Research Society, 67(5), 786-800.
Stoyan, Y., Pankratov, A., & Romanova, T. (2016b). Quasi-phi-functions and optimal packing of
ellipses. Journal of Global Optimization, 65(2), 283-307.
Vanek, J., Galicia, J. G., Benes, B., Měch, R., Carr, N., Stava, O., & Miller, G. S. (2014,
September). Packmerger: A 3d print volume optimizer. In Computer Graphics Forum (Vol.
33, No. 6, pp. 322-332).
Wu, S., Kay, M., King, R., Vila-Parrish, A., & Warsing, D. (2014). Multi-objective optimization
of 3D packing problem in additive manufacturing. In IIE Annual Conference. Proceedings
(p. 1485). Institute of Industrial and Systems Engineers (IISE).
Wu, H., Leung, S. C., Si, Y. W., Zhang, D., & Lin, A. (2017). Three-stage heuristic algorithm for
three-dimensional irregular packing problem. Applied Mathematical Modelling, 41, 431-444.
Yao, M., Chen, Z., Luo, L., Wang, R., & Wang, H. (2015). Level-set-based partitioning and
packing optimization of a printable model. ACM Transactions on Graphics (TOG), 34(6),
214.


Visual-Semantic Alignments for Automated Interpretation of 3D Imagery Data of High-Pier Bridges
Zhe Sun, S.M.ASCE1; Pingbo Tang, Ph.D., P.E., M.ASCE 2; Ying Shi3 ; and Wen Xiong, Ph.D.4
1
School of Sustainable Engineering and the Built Environment, Arizona State Univ., 660 S.
College Ave., Tempe, AZ 85281. E-mail: [email protected]
2
School of Sustainable Engineering and the Built Environment, Arizona State Univ., 660 S.
College Ave., Tempe, AZ 85281. E-mail: [email protected]
3
Dept. of Bridge Engineering, School of Transportation, Southeast Univ., Nanjing 210096,
China. E-mail: [email protected]
4
Dept. of Bridge Engineering, School of Transportation, Southeast Univ., Nanjing 210096,
China. E-mail: [email protected]

ABSTRACT
Visual inspection of bridges relies heavily on inspectors’ experience. Such inspections are
time-consuming and dangerous because they require placing inspectors in hard-to-reach locations.
Terrestrial laser scanning has proved capable of overcoming the shortcomings of subjective
and error-prone visual inspection. Unfortunately, only experienced engineers with profound
bridge knowledge and 3D data analysis skills can reveal anomalous trends of structural
deterioration from images. Combined use of inspection reports and imagery data could help
address the lack of experienced inspectors, but manually analyzing observations from images
against documented bridge defects is also tedious and dependent on expert knowledge. This paper
examines the practical value and technical feasibility of automating the integrated analysis of 3D
imagery data and bridge inspection reports for supporting high-pier bridge inspections. The
results show the potential value, technical feasibility, and challenges of such automation for
supporting timely and reliable high-pier bridge inspection.

INTRODUCTION
Continuous spatial changes in bridge elements can be indicators of structural defects and
possible structural deterioration. Reliable spatial change analysis of a bridge is important for
structural condition assessment and deterioration pattern prediction. However, visual inspection
relies heavily on the skills and experience of on-site licensed inspectors and can often be
subjective and time-consuming. In addition, placing inspectors in hard-to-reach locations, such
as under high-pier bridges crossing valleys or wide rivers, can be extremely dangerous.
Large structures, such as high-pier bridges, make it challenging to assess displacements and
shape changes of structural elements that have large dimensions and lie far from locations
inspectors can access. Terrestrial laser scanning (TLS), however, can acquire geometric
details of structures hundreds of meters long and thereby overcome the shortcomings of visual
inspection. Nevertheless, the shortage of skilled engineers with profound bridge knowledge for
analyzing and interpreting imagery data complicates spatial change interpretation during bridge
inspection.
Combined use of imagery data and inspection reports has the potential to overcome these
shortcomings of bridge inspection (see Figure 1). Imagery data collected by TLS capture geometric
details of structural elements. Bridge inspection reports written by experienced inspectors
describe bridge conditions in detail. Unfortunately, manually cross-validating information
from reports against imagery data from TLS can be tedious and error-prone. An accurate alignment
method could help cross-validate the visual data and the semantic information. Such alignment
brings two complementary sources together to overcome the difficulties of spatial change
interpretation and forms the basis for automating change interpretation. However, the diverse
and complex structural compositions and deterioration mechanisms of different bridges are
obstacles to automating such alignment for reliable bridge inspection.

Figure 1. Overall framework.


Previous studies of image captioning, linguistic analysis, and laser scanning for bridge inspection
provide a basis and potential for reliable bridge inspection through integrated analysis. For
instance, previous studies developed algorithms for producing human-like captions for images.
These studies also examined the potential of implementing natural language processing
algorithms for generating comprehensive image interpretations. However, such image captioning
requires not only a large amount of training data with labels generated with expert
knowledge but also advanced machine learning algorithms to obtain accurate prediction results. On
the other hand, the use of imagery data for reliable bridge inspection remains a challenge.
Previous studies of using terrestrial laser scanners (TLS) for bridge inspection aim at overcoming
the shortcomings of visual inspection. Some studies investigated the use of TLS for concrete
crack recognition and bridge deterioration assessment (Law et al. 2018). Nevertheless, few
studies provide guidance on better interpreting the changes captured from 3D point clouds for
understanding the loading behaviors of structural elements on bridges. In addition, some
researchers applied text analysis to identify contributing factors of bridge failures from
inspection reports (Liu and El-Gohary 2017). Still, few studies have examined the potential to
integrate text analysis with imagery data to automate the change interpretation process
during bridge inspection.
Because conventional inspection of bridges that cross valleys and wide rivers is considerably
challenging, applying integrated text and image analysis to such bridges is especially important.
Moreover, prestressed concrete continuous rigid frame bridges with high piers have been widely
adopted in mountainous regions with extremely complex terrain. A number of factors, such as
increased traffic and the rigidity of the high piers, cause large deformations of such bridges
(Shan 2015). An effective and reliable automated inspection method is therefore crucial for
maintaining the functionality of these bridges. This paper explores visual-semantic alignment,
performed manually, using 3D imagery data and inspection reports for reliable and efficient 3D
data interpretation during high-pier bridge inspections. The results show the potential value of
automated alignment for supporting timely and reliable high-pier bridge inspection. In addition,
such automation could provide inputs for early prediction of potential structural failure modes
and corresponding recommendations for required maintenance.

METHODOLOGY
This study reviewed existing research applying text analysis algorithms to bridge
inspection reports and using imagery data for automated spatial change analysis. The
proposed visual-semantic alignment integrates the analysis of imagery data and bridge inspection
reports for reliable and efficient 3D data interpretation in the inspection of large-span
prestressed concrete continuous rigid frame bridges with high piers. In other words, the authors
studied a method of aligning visual and semantic information to support the automation of
visual-semantic alignments and automated interpretation of spatial changes of bridge structures.
The proposed method includes two parts:
1) Explore the value of applying the proposed visual-semantic alignment to bridge
inspection.
a. Synthesize the most common symptoms of structural deficiencies and failing
patterns from the literature and past inspection reports of similar bridges.
b. Collect data using a laser scanner to identify spatial changes in structural
elements. The authors use a large-span prestressed concrete continuous rigid frame
bridge with high piers (the Bridge in Anhui, China) as a case.
c. Cross-validate documented symptoms of structural deficiencies with changes
captured in 3D imagery data.
2) Synthesize technical challenges and explore the technical feasibility of such alignment
in bridge inspection.
a. Synthesize technical challenges for automated imagery data processing and text
analysis algorithms. The authors reviewed existing studies of automated spatial
change analysis, image annotation, and advanced text analysis for bridge inspection.
b. Explore the technical feasibility of automating visual-semantic alignments. The
authors further explored the feasibility of applying such alignment to bridge
inspection and discuss it in the future work plan.

CASE STUDY: INSPECTION OF A BRIDGE IN CHINA


The Bridge is located in Anhui Province, China, in an area surrounded by mountains with
extremely complex terrain. The Bridge has a total length of 1,010 m, including a six-span main
bridge of 612 m (66 m + 4×120 m + 66 m). The main bridge uses a prestressed concrete
continuous rigid frame with high hollow thin-walled piers (83 m).

Synthesizing structural deficiencies from literature and past inspection reports


The authors synthesized symptoms of structural deficiencies from the literature and past
inspection reports of large-span prestressed concrete continuous rigid frame bridges (see Table 1)
(Chen and Yang 2015; Sakata et al. 2000).

Table 1. Structural evaluation from inspection reports of similar bridges.

Bridge                                   | Span Length    | Pier Height | Major Issues
The Humen Bridge (CHN)                   | 150m+270m+160m | 40m         | Excessive deflection
The Parrots Ferry Bridge (U.S.)          | 99m+195m+99m   | 108m        | Excessive deflection
Shanghai-Chengdu Expressway Bridge (CHN) | 110m+200m+110m | 105m        | Minor deflection and settlement

*AASHTO specifies the maximum deflection limit (Span/1000) for concrete bridges under vehicular and pedestrian loads (AASHTO 2012).

The summarized results indicate that the most common structural deficiencies of such high-pier
bridges are deflection, settlement, and pier drift. For example, the Humen Bridge suffers a
deflection of 0.22 m, and the deflection of the Parrots Ferry Bridge is 0.64 m. The Shanghai-
Chengdu Expressway Bridge also suffers minor deflection and settlement.

Laser scanning data collection for spatial change detection


The authors collected 3D point cloud data for the Bridge in China in 2017 and 2018 using a
FARO Focus S 350 laser scanner (37 scans in 2017; 31 scans in 2018).

a) Preprocessing of each dataset (2017 and 2018)


The authors registered each 3D laser scanning dataset using Cloud-to-Cloud (C2C)
registration in SCENE and manually removed redundant data (e.g., trees) using the
segmentation tool in CloudCompare (see Figure 2).

Figure 2. Registration of 2017 laser scan data (a: registered data; b: laser scan locations).

Figure 3. Finely registered scans (yellow: 2017 dataset; red: 2018 dataset).
b) Registration of the two-year datasets
The authors applied the Iterative Closest Point (ICP) algorithm, using all points, to finely
register the two datasets with a minimum distance error (RMS = 0.335) for reliable spatial change
detection (see Figure 3).
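The fine registration step described above can be sketched as a minimal point-to-point ICP loop in NumPy. This is a simplified stand-in for the actual SCENE/CloudCompare workflow: the brute-force nearest-neighbour matching is suitable only for demo-sized clouds, and the iteration count is an illustrative choice.

```python
import numpy as np

def nearest_neighbors(source, target):
    """Brute-force nearest neighbour: for every point of `source`, the
    index of its closest point in `target` and the distance to it."""
    d = np.linalg.norm(source[:, None, :] - target[None, :, :], axis=2)
    idx = d.argmin(axis=1)
    return idx, d[np.arange(len(source)), idx]

def icp(source, target, iters=25):
    """Point-to-point ICP: repeatedly match points, then solve the
    best-fit rigid transform with the Kabsch/SVD method."""
    for _ in range(iters):
        idx, dist = nearest_neighbors(source, target)
        matched = target[idx]
        src_c, tgt_c = source.mean(axis=0), matched.mean(axis=0)
        H = (source - src_c).T @ (matched - tgt_c)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = tgt_c - R @ src_c
        source = source @ R.T + t         # apply the incremental transform
    return source, np.sqrt(np.mean(dist ** 2))
```

Given a cloud perturbed by a small rigid motion, the loop recovers the alignment and the returned RMS plays the role of the registration error reported above.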

c) Change detection from 3D point cloud


Results (see Figure 4) indicate that 1) the third and fourth spans of the bridge have
comparatively significant vertical deflections compared with the other spans; and 2) the pier that
connects the third and fourth spans of the right amplitude has minor drift perpendicular to the
traffic direction.

Figure 4. The C2C distance computed for the registered two scans.
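The C2C distance underlying Figure 4 reduces to a nearest-neighbour query between the two registered clouds; a minimal sketch follows (brute force, demo scale; the deformation threshold is an illustrative parameter, not a value from the paper):

```python
import numpy as np

def c2c_distances(reference, compared):
    """Cloud-to-cloud distance: for each point of `compared`, the
    distance to its nearest neighbour in `reference`."""
    d = np.linalg.norm(compared[:, None, :] - reference[None, :, :], axis=2)
    return d.min(axis=1)

def flag_changes(reference, compared, threshold):
    """Return the points of `compared` whose C2C distance exceeds a
    deformation threshold, i.e. candidate spatial changes."""
    d = c2c_distances(reference, compared)
    return compared[d > threshold], d
```

Thresholding the per-point distances is what separates genuine deformations (e.g., span deflection between the 2017 and 2018 scans) from registration noise.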
Table 2. Aligning change detection results against common structural failing patterns and
reasons documented in inspection reports.

Change detection results: The bridge has minor deflection in the third and fourth spans.
Common structural failing patterns and reasons: Deflections of large-span prestressed concrete
continuous rigid frame bridges are often due to the shrinkage of concrete, changes in the
stiffness of the main beam, and prestress loss of the longitudinal steel. Excessive deflection
will eventually lead to serious vertical cracks in the box girder and reduce the ability of the
beam body to resist the principal tensile stress (Chen and Yang 2015).

Change detection results: The pier connecting the third and fourth spans has minor drift.
Common structural failing patterns and reasons: The drift (lateral displacement perpendicular
to the traffic direction) of a high pier is caused by comparatively significant lateral force
(Sakata et al. 2000).

Visual-semantic alignment results


The proposed visual-semantic alignment requires both detected spatial changes and
documented descriptions of possible structural failing patterns from inspection reports. Spatial
changes shown in imagery data give a quantitative assessment of the structural deformations,
while text descriptions give possible interpretations of the structural behaviors causing such
deformations. The principle underlying such alignment is that, for a given structural type (e.g.,
continuous rigid frame), similar structural failing patterns recur and cause similar deformations.
Such alignment helps not only with change interpretation but also with localizing inspection to
certain parts of the bridge.
Based on the spatial changes detected from the two laser scanning datasets, the authors
manually aligned the detected changes from the laser scan data with the structural deficiencies
documented in the inspection reports and literature (see Table 2).
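The manual alignment summarized in Table 2 can be sketched as a keyword lookup against a small knowledge base. The keywords and explanations below are illustrative paraphrases of the documented patterns, not an actual report corpus; a production system would extract them from inspection reports with NLP.

```python
# Illustrative knowledge base paraphrasing documented failing patterns.
FAILING_PATTERNS = {
    "deflection": "Often caused by concrete shrinkage, changes in main-beam "
                  "stiffness, and prestress loss of longitudinal steel.",
    "drift": "Lateral displacement of a high pier perpendicular to the "
             "traffic direction, caused by significant lateral force.",
    "settlement": "Vertical movement of supports, often due to foundation "
                  "or soil issues.",
}

def align_change(change_description):
    """Match a detected-change description to documented failing patterns
    by simple keyword lookup; returns a list of (pattern, explanation)."""
    text = change_description.lower()
    return [(kw, why) for kw, why in FAILING_PATTERNS.items() if kw in text]
```

For example, `align_change("Minor vertical deflection in the third and fourth span")` retrieves the deflection entry, annotating the detected change with its documented likely causes.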

Discussion of technical challenges and feasibilities


The results indicate that visual-semantic alignment has the potential to annotate the
detected spatial changes with appropriate interpretations of structural loading behaviors and
failing patterns. Such alignment provides bridge engineering knowledge that not only helps field
engineers better understand the bridge condition but also reduces the risk of inconsistencies
between imagery data and the documented bridge condition. In addition, the proposed alignment
method provides a basis for implementing advanced image and text processing algorithms,
which could significantly improve computational efficiency for large bridges. Unfortunately,
visual-semantic alignment remains challenging in practice and requires in-depth research
to integrate existing text and image analysis algorithms effectively. As shown in Table 3, image
processing researchers aim at developing advanced image processing algorithms for object
recognition and detection of visual deterioration patterns. In addition, advanced text analysis
algorithms extract the causes and effects of structural failing patterns from bridge inspection
reports. These automated algorithms not only process tedious imagery and text
data but also improve computational efficiency when dealing with huge datasets.

Table 3. A synthesis of technical challenges related to automatic visual-semantic alignment.

Computer Vision (CV)
  Purpose: object recognition; detection of visual deterioration patterns; change
  classification; change analysis
  Challenges: laser scan planning; segmentation; registration
  Relevant studies: (Kalasapudi et al. 2018; Kasireddy and Akinci 2015; Strom et al. 2010;
  Tang et al. 2010)

Natural Language Processing (NLP)
  Purpose: extract deterioration descriptors from texts; determine semantic similarity
  Challenges: sentence parsing; information retrieval; semantic summarization
  Relevant studies: (Liu and El-Gohary 2017; Zhang and El-Gohary 2017)

Integration of CV and NLP
  Purpose: match corresponding texts and images; identify discrepancies between texts and
  images
  Challenges: automatic alignment; computational efficiency; discrepancy resolution
  Relevant studies: (Karpathy and Fei-fei 2017)

While visual-semantic alignment is challenging because it requires integrating research in
computer vision and natural language processing, the synthesized results illustrate the technical
feasibility of automating visual-semantic alignment for reliable bridge inspection. Such automation not
only requires a high-level understanding of the semantic contents of bridge images but also needs
to align the information with technical descriptions provided by professional engineers. The
reviewed studies reveal the potential of achieving automated image annotation through the
integrated use of computer vision, natural language processing, and machine learning methods.
Machine learning algorithms can use a number of images and descriptions generated by
human individuals to train a model that predicts image labels in terms of content. For
example, Mitchell et al. (2012) developed computer vision algorithms for labeling an image as
containing certain objects, actions, and spatial relationships. More
meaningful descriptions of image contents can be sentences longer than short keywords. For
example, Farhadi et al. (2010) established an automatic image captioning algorithm that
produces captions for given images by retrieving sentences from a pre-specified sentence pool.
Such algorithms help match corresponding texts and images. A typical
integration of computer vision and natural language processing is that of Karpathy and Fei-Fei
(2017), who developed deep convolutional neural networks for feature extraction from an image
and performed word prediction using a natural language model conditioned on the extracted
image features.
Unfortunately, although the studies mentioned above have revealed the potential of achieving
visual-semantic alignment for bridge inspection, many challenges listed in Table 3 remain
unresolved. Among these open challenges, the authors identified two as critical for near-future
efforts: 1) the exponentially growing computational complexity of matching image
contents and text elements; and 2) the difficulties of resolving inconsistencies between image
contents and text meanings.
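A purely lexical stand-in for the matching step above is bag-of-words cosine similarity between an image-derived change description and candidate report sentences. This is a deliberate simplification: learned visual-semantic scoring, as in Karpathy and Fei-Fei (2017), replaces this hand-crafted similarity with one trained from data.

```python
import math
import re
from collections import Counter

def cosine_similarity(a, b):
    """Bag-of-words cosine similarity between two sentences."""
    tokenize = lambda s: Counter(re.findall(r"[a-z]+", s.lower()))
    va, vb = tokenize(a), tokenize(b)
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_report_sentence(image_caption, report_sentences):
    """Pick the report sentence that best matches an image caption."""
    return max(report_sentences, key=lambda s: cosine_similarity(image_caption, s))
```

Even this crude scoring illustrates the computational-complexity concern: scoring every image description against every report sentence grows with the product of the two corpus sizes.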

CONCLUSION AND FUTURE RESEARCH


This paper shows that the proposed method has the potential to automate the interpretation of
imagery data into insights about anomalous trends of structural deterioration. Such automation
could not only help form a basis for supporting timely and reliable high-pier bridge inspection
but also provide inputs for early prediction of potential structural failure modes and
corresponding recommendations for required maintenance.
However, challenges remain in integrating the existing image annotation and text analysis
algorithms to automate the visual-semantic alignment. Future research will develop a method
for integrating the existing text and image analysis algorithms with a robust knowledge-based
point cloud registration method for automating visual-semantic alignment. The focus will be on
resolving challenges related to the computational efficiency of the visual-semantic alignment
problem and the discrepancy resolution problem.

ACKNOWLEDGMENT
This material is based on work supported by the U.S. National Science Foundation (NSF)
under Grant No. 1454654 and by the science and technology project on transportation
construction of the Ministry of Transport of the People's Republic of China (Project Nos.
2013318223380 and 2014318J14250). The support is gratefully acknowledged.

REFERENCES
AASHTO. (2012). “AASHTO LRFD bridge design specifications.” American Association of
State Highway and Transportation Officials, Sixth Edition, Washington, D.C.


Chen, D., and Yang, Y. (2015). “Analysis of Deflection Problems of Large-span Continuous
Rigid Frame Bridge and Prevention Measures.” 1, 1–6.
Farhadi, A., Hejrati, M., Sadeghi, M. A., Young, P., Rashtchian, C., Hockenmaier, J., and
Forsyth, D. (2010). “Every picture tells a story: Generating sentences from images.” Lecture
Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics), 15–29.
Kalasapudi, V. S., Tang, P., Xiong, W., and Shi, Y. (2018). “A multi-level 3D data registration
approach for supporting reliable spatial change classification of single-pier bridges.”
Advanced Engineering Informatics, Elsevier, 38(June), 187–202.
Karpathy, A., and Fei-fei, L. (2017). “Deep Visual-Semantic Alignments for Generating Image
Descriptions.” IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE,
39(4), 664–676.
Kasireddy, V., and Akinci, B. (2015). “Challenges in Generation of As-is Bridge Information
Model: A Case Study.” 32nd International Symposium on Automation and Robotics in
Construction.
Law, D. W., Silcock, D., and Holden, L. (2018). “Terrestrial laser scanner assessment of
deteriorating concrete structures.” Structural Control and Health Monitoring, 25(5), 1–15.
Liu, K., and El-Gohary, N. (2017). “Ontology-based semi-supervised conditional random fields
for automated information extraction from bridge inspection reports.” Automation in
Construction, Elsevier B.V., 81, 313–327.
Mitchell, M., Dodge, J., Goyal, A., Yamaguchi, K., Stratos, K., Mensch, A., Berg, A., Han, X.,
Berg, T., and Health, O. (2012). “Midge: Generating Image Descriptions From Computer
Vision Detections.” Proceedings of the Conference of the European Chapter of the
Association for Computational Linguistics (EACL), 747–756.
Sakata, T., Nakazawa, T., Kanemaru, R., and Tokunaga, M. (2000). “Earthquake-Resistance of
Four-Span Continuous Pc Rigid-Frame Box Girder Bridge.” 12th Italian National
Conference on Earthquake Engineering, 4–9.
Shan, A. (2015). “Analytical Research on Deformation Monitoring of Large Span Continuous
Rigid Frame Bridge during Operation.” Engineering, 7(August), 477–487.
Strom, J., Richardson, A., and Olson, E. (2010). “Graph-based segmentation for colored 3D laser
point clouds.” IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems,
IROS 2010 - Conference Proceedings, IEEE, 2131–2136.
Tang, P., Huber, D., Akinci, B., Lipman, R., and Lytle, A. (2010). “Automatic reconstruction of
as-built building information models from laser-scanned point clouds: A review of related
techniques.” Automation in Construction, Elsevier B.V., 19(7), 829–843.
Zhang, J., and El-Gohary, N. M. (2017). “Integrating semantic NLP and logic reasoning into a
unified system for fully-automated code checking.” Automation in Construction, Elsevier
B.V., 73, 45–57.


4D BIM Based Optimal Flight Planning for Construction Monitoring Applications Using
Camera-Equipped UAVs
Amir Ibrahim1 and Mani Golparvar-Fard2
1Ph.D. Student, Dept. of Civil and Environmental Engineering, Univ. of Illinois at Urbana–Champaign. E-mail: [email protected]
2Associate Professor and Faculty Entrepreneurial Fellow, Dept. of Civil and Environmental Engineering and Dept. of Computer Science, Univ. of Illinois at Urbana–Champaign, 205 N. Mathews Ave., Urbana, IL 61801. E-mail: [email protected]

ABSTRACT
Visual data used for progress tracking on job sites require knowledge of where changes
are happening and of the as-built state of the construction site. Today's common methods for
capturing visual data offer only (1) manually designed and operated missions; and (2) automated
collection using lawn-mowing patterns. This paper presents a new method for automated flight
planning and visual data collection that takes advantage of the availability of 4D BIM as a
prior to create optimal flight missions at locations of expected change. The developed approach
simulates the flight path to review flight execution safety and the visibility of the camera frames
to the in-progress elements, while accounting for baseline requirements among images for 3D
reconstruction purposes. Finally, a case study conducted on a construction project reported
improvements in visual coverage of the structure and reduction in 3D reconstruction errors
obtained by improving the visual quality of the collected frames.

INTRODUCTION
Construction companies hire experts or third-party companies to conduct manual drone
flights to collect meaningful images according to requirements usually set by the
companies or by drone flight guidelines. Operating drones manually on construction sites to
comply with a predefined flight plan is a complicated task that often ends without
achieving the targeted visual coverage objectives or with collecting excess visual data (Lin et al.
2015). Failing to collect informative visual data during flight execution leads to extra re-
planning, longer flight duration, and added post-processing operations to filter out redundant
collected data.
The current state of practice for automated drone-based visual inspection uses
2D lawn-mowing pattern plans. These plans are defined with a fixed altitude and nadir (pointing
down) camera trajectories to collect overlapping images observing the construction site. In such
plans, full observation (coverage) of the construction site is assumed by enforcing a
minimum percentage overlap between captured frames (usually 60% to 80%). The overlap
between images also ensures the generation of a complete orthographic photograph of
the job site through panoramic stitching of the captured frames. In addition, the specified
minimum overlap between frames supports the detection of redundant visual features required
for successful post-processing of the images to create a 3D as-built model using the Structure-
from-Motion (SfM) algorithm (Schönberger et al. 2016).
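The geometry behind the minimum-overlap rule can be sketched as follows for a nadir pinhole camera: the 60% to 80% figures come from the text above, while the altitude and field-of-view values in the test are illustrative, not from a specific platform.

```python
import math

def footprint(altitude_m, fov_deg):
    """Ground footprint (one side, in metres) of a nadir camera, from
    flight altitude and the lens field of view along that side."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)

def waypoint_spacing(altitude_m, fov_deg, overlap):
    """Distance between consecutive waypoints that yields the requested
    fractional overlap (e.g. 0.8 for 80%) between nadir frames."""
    return footprint(altitude_m, fov_deg) * (1.0 - overlap)
```

For instance, at 50 m altitude with a 60-degree field of view the footprint is about 57.7 m, so an 80% forward overlap requires waypoints roughly 11.5 m apart.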
Generally, flight planning and execution for complete visual coverage of a construction site
are associated with the challenges of:


i. Complexity of the structure's topography and continuous changes in the structure's
geometry and appearance, which require high-frequency flight planning by altering
previous plans or creating new ones whenever construction progress is expected.
ii. Risks associated with mid-air failure and collision with the permanent structure,
temporary structures such as tower cranes, and/or other vertical structures such as power
transmission lines, which should be accounted for in the flight plan.
Utilizing a BIM model as a prior for flight planning enables automated creation of flight plan
templates placed around the model. In addition, simulation of the flight plan in the BIM
environment helps evaluate both visual coverage and safety of execution before flight
deployment (Ibrahim et al. 2017a).

RELATED WORK
Lin et al. (2015) defined a pipeline for planning model-driven visual data acquisition using
camera-equipped Unmanned Aerial Vehicles (UAVs) for construction progress monitoring. That
research reported the importance of simulating UAV flight plans in a BIM environment to
assess the visual coverage of the plan, which is required to support complete and accurate image-
based 3D reconstruction of the as-built state of the construction site. A complete and accurate 3D
reconstructed model is important for geometry-based automated progress monitoring, where an
element is assumed to be constructed if the reconstructed as-is model occupies the same 3D
space as the as-planned 3D element (Golparvar-Fard et al. 2010).
Ensuring visual coverage (also referred to as the Art Gallery problem (Urrutia 2000)) addresses
the problem of finding the locations and orientations of the minimum number of visual sensors
that provide complete detection of all surfaces of a given model. Several solutions have been
presented to support complete visual coverage of flight plans for the constructed structure,
among them:
 Creating UAV waypoints on a planar surface parallel to the ground at a fixed height that
is set according to a specific Ground Sampling Distance (GSD), which implies the
required resolution for the reconstructed model. The required minimum overlap between
frames is enforced by fixing the spacing between waypoints (Mancini et al. 2013).
 Creating waypoints at a fixed offset distance facing the structure's walls. The geometry
of the structure's walls is detected using a separate module that fits surfaces to reality
data previously collected with LiDAR sensors (Phung et al. 2016).
 Using a prior model to create a series of uniformly spaced waypoints on a 3D bounding
box set at a safe offset distance from the structure, and orienting the camera trajectories
to look at specific points of interest (Freimuth and König 2018).
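The GSD-based altitude choice in the first of the solutions above follows from the pinhole camera model; a sketch is shown below. The camera parameters used in the test are illustrative values, not taken from any platform named in the paper.

```python
def altitude_for_gsd(gsd_m_per_px, focal_mm, sensor_width_mm, image_width_px):
    """Flight altitude (m) that achieves a target ground sampling distance
    for a nadir pinhole camera.

    Pinhole relation: GSD = sensor_width * altitude / (focal * image_width),
    solved here for altitude. Millimetres cancel, so the result is in
    metres per the GSD units.
    """
    return gsd_m_per_px * focal_mm * image_width_px / sensor_width_mm
```

A tighter (smaller) GSD demands a lower altitude, which in turn shrinks the footprint and lengthens the mission, one of the trade-offs the later optimization must balance.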
Ensuring visual coverage for geometry-based progress detection is not enough to provide the
accurate state of progress for constructed building elements (for instance, all the activities related
to constructing a concrete wall are detected at the same 3D location). In such cases, the collected
images can help classify the correct stage of element progress by detecting the element's
back-projected pixels and performing material classification using a machine learning model (Han
and Golparvar-Fard 2015). The success of the latter method requires redundant canonical
visibility of elements' pixels in multiple captured frames to enhance the accuracy of material
recognition.
BIM-driven visual quality metrics (Ibrahim et al. 2017a; Ibrahim et al. 2017b) were developed
to create flight plans that collect visual data supporting automated progress detection by
evaluating visual coverage and material recognition requirements at different stages of
construction. Visual quality metrics are based on counting the number of BIM elements observed
with a minimum number of back-projected pixels in the simulated waypoint frames, where the
number of observed elements indicates the amount of visual coverage, and redundancy in
element observations implies detection of redundant features for the image matching step of
3D reconstruction and enhances the performance of material classifiers.
To this end, research efforts have been directed towards optimizing the visual coverage of flight
plans by simulating the flight missions and using ray tracing to detect visible elements. Next-
best-view algorithms are used to generate consecutive camera trajectories that aim to observe
all sides of a defined model (Connolly 1985). Evolutionary algorithms have also been tested to
provide optimal waypoints with maximal coverage of an abstract representation of the model
(Strubel et al. 2017).

METHOD
In this work, we propose a new method and prototype for visual data collection using UAVs
to assist automated visual progress monitoring through geometry- and appearance-based models.
The process starts with (1) automatic generation of a 3D flight plan template around the structure
to be monitored, followed by (2) visual evaluation and safety checking of the generated template,
with the ability for manual refinements, (3) optimization of the flight plan using an evolutionary
multi-objective genetic algorithm, and (4) flight execution by uploading the mission's data to a
cloud server accessible by the flight execution platform (Figure 1).

Figure 1. Pipeline for optimized flight planning and automatic execution

Automatic 3D Flight Templates


Searching for an optimum flight plan in 3D space involves an unlimited number of
possible solutions, where each solution is defined by the number of waypoints (N), the 3D
position of each waypoint (P), given by x, y, and z coordinates, and the orientation of the camera
at each position (R), given by the heading and pitch angles. The optimization problem is thus
a six-degree-of-freedom problem with a continuous, infinite search space.
To reduce the search space for the optimization, four flight templates are used to provide an
initial solution. Flight templates have proven to perform well for mapping simple structures by
assuring a minimum percentage overlap between frames to maximize visual coverage and the
convergence of the Structure-from-Motion (SfM) algorithm used to generate 3D as-built models.


The four flight templates are (a) a top-down lawn mowing pattern defined by a fixed altitude
above the ground and nadir camera trajectories; (b) a top-down lawn mowing pattern with
perimeter capture, where an extra path with 45-degree oblique camera trajectories is added to
cover the structure's sides; (c) a top circular pattern used to simulate data collection with a
tower crane camera; and (d) a 3D grid-based plan generated around the building's bounding box
at a user-defined safe offset with canonical camera views (Figure 2).
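Template (a) can be sketched as a simple generator of serpentine waypoint positions. This is a hypothetical, simplified sketch assuming a flat rectangular site and a square camera footprint, not the authors' implementation:

```python
def lawnmower_waypoints(width, depth, altitude, footprint, overlap):
    """Generate a top-down lawn-mowing grid of (x, y, z) positions with a
    nadir camera, spacing frames so that consecutive ones overlap by
    `overlap` (a fraction). `footprint` is the side length of the ground
    area one frame covers at the given altitude (simplifying assumption)."""
    step = footprint * (1.0 - overlap)      # advance per frame for the overlap
    pts, x, direction = [], 0.0, 1
    while x <= width:
        ys = [y * step for y in range(int(depth / step) + 1)]
        if direction < 0:
            ys = ys[::-1]                    # reverse every other row (serpentine)
        pts += [(x, y, altitude) for y in ys]
        x += step
        direction *= -1
    return pts

# 20 x 20 site, 10 m footprint, 50% overlap -> 5 x 5 grid of waypoints.
pts = lawnmower_waypoints(width=20, depth=20, altitude=30, footprint=10, overlap=0.5)
```

The paper's templates additionally set camera orientations and, for template (b), append an oblique perimeter path; those steps are omitted here.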

Figure 2. Flight plan templates (a) top-down lawn mowing pattern, (b) lawn mowing
pattern with perimeter capture, (c) top circular pattern and (d) 3D grid-based plan
Flight Plan Evaluation
Two evaluation criteria are used to assure the visual quality of the flight plan and its safe
proximity to the structure, following Ibrahim et al. (2017b).

Visual quality
This criterion benchmarks the visual quality of a flight plan by computing the visibility of
elements observed in the camera frames and the redundant visibility of such observations across
different frames. The two metrics support the requirements of 3D image-based reconstruction
and of the material recognition models.

Flight Safety
The developed flight planning tool allows the user to edit the plan manually. Manual edits
can be performed by adding new waypoints or modifying the locations and orientations of
existing waypoints according to the data collection requirements that are not satisfied by the
flight plan template. Such manual modifications might place some or all waypoints in unsafe
proximity to the structure; these violations are detected automatically and reported visually to
the user.
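The automatic proximity check can be illustrated as follows (a hypothetical sketch that measures distance to an axis-aligned bounding box; the actual tool may test against the structure's full geometry):

```python
def unsafe_waypoints(waypoints, bbox_min, bbox_max, safe_offset):
    """Return indices of waypoints closer than `safe_offset` to the
    structure, approximated here by its axis-aligned bounding box."""
    unsafe = []
    for i, (x, y, z) in enumerate(waypoints):
        # Distance from a point to a box: clamp the point onto the box,
        # then take the Euclidean distance to the clamped point.
        cx = min(max(x, bbox_min[0]), bbox_max[0])
        cy = min(max(y, bbox_min[1]), bbox_max[1])
        cz = min(max(z, bbox_min[2]), bbox_max[2])
        d = ((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2) ** 0.5
        if d < safe_offset:
            unsafe.append(i)
    return unsafe

# A waypoint inside the structure's box is flagged; one 40 ft above it is not
# when the safe offset is 30 ft (hypothetical coordinates).
flags = unsafe_waypoints([(0, 0, 50), (0, 0, 140)], (-20, -20, 0), (20, 20, 100), 30)
```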

4D-BIM based flight optimization


An optimum flight plan is defined as the one that maximizes the visual quality metrics
and minimizes the flight duration. Near-optimum solutions are computed by formulating a multi-
objective optimization problem (Murata and Ishibuchi 1995; Rao 2009) to reduce the


computation time required to find the optimum solution. A genetic algorithm was selected to
search for sub-optimum solutions through local search around the best solutions in each
iteration. The genetic algorithm mimics the biological process of evolutionary improvement by
removing dominated solutions at each iteration.

Optimization Objectives
The three objectives of the optimization problem are defined by Equation 1.
max(V(x), R(x), -T(x)) (1)
where V(x) is the number of visible elements occupying at least 2% of the frame's pixels, R(x) is
the number of elements with six or more redundant observations, T(x) is the flight duration, and
x is a flight plan.
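Given the per-frame visibility fractions produced by the ray-tracing simulation, the objective values of Equation 1 can be tallied as in this hypothetical sketch (function and argument names are ours):

```python
def evaluate_objectives(frames, flight_duration, pixel_threshold=0.02, redundancy_min=6):
    """frames: list of dicts mapping element id -> fraction of that camera
    frame's pixels the element occupies (output of a visibility simulation).
    Returns (V, R, T): elements visible above the pixel threshold, elements
    observed in at least `redundancy_min` frames, and the flight duration."""
    counts = {}
    for frame in frames:
        for elem, frac in frame.items():
            if frac >= pixel_threshold:
                counts[elem] = counts.get(elem, 0) + 1
    V = len(counts)                                              # visible elements
    R = sum(1 for c in counts.values() if c >= redundancy_min)   # redundant ones
    return V, R, flight_duration

# A wall seen in six frames counts toward both V and R; a slab below the
# 2% threshold counts toward neither (hypothetical data).
V, R, T = evaluate_objectives([{'wall': 0.05, 'slab': 0.01}] * 6, flight_duration=12.0)
```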

Decision variables
Modifications to the initial flight plan are considered a solution (individual) to the problem,
where each solution defines the final positions (P) and orientations (O) of the plan’s waypoints.
The search space is constrained by fixing the number of waypoints to that of the initial flight
plan and by restricting modifications of a waypoint's position to directions away from or parallel
to the structure's surface (to avoid unsafe proximity), with a maximum user-defined offset
distance. The possible orientations (heading and pitch angles) are bounded by an angular offset
from the initial orientation to prevent the generation of waypoints that put the structure out of
the captured frame.

Procedure
Random solutions are first generated and used as the seed of the optimization model. At each
iteration, the new solutions are simulated, and their quality metrics are calculated along with the
flight duration. Since the objective space for this problem is three-dimensional (visibility,
redundancy, and time), there is no single best solution; instead, several sub-optimum solutions
exist. These solutions form a 3D surface containing all non-dominated solutions, the Pareto
front (Rao 2009). The best solutions (parents) are detected by constructing the Pareto front
surface after each iteration, eliminating all dominated solutions and keeping only the solutions
on the Pareto surface. Mutation is conducted by randomly selecting one of the Pareto front
solutions (the parent solution) and altering a single randomly picked waypoint (mutation step).
Crossover is conducted by selecting two parent solutions from the Pareto front surface and
generating two new solutions by randomly swapping a waypoint between the two (crossover
step).
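The Pareto-based iteration described above can be sketched as follows (a simplified, assumed implementation; solutions are reduced to their (V, R, T) objective tuples for the dominance test, and plans to plain lists of waypoints):

```python
import random

def dominates(a, b):
    """Objectives as (V, R, T): V and R are maximized, T is minimized.
    `a` dominates `b` if it is no worse in every objective and strictly
    better in at least one."""
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and better

def pareto_front(solutions):
    """Keep only the non-dominated solutions (the Pareto surface)."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]

def mutate(plan, perturb):
    """Mutation step: alter one randomly picked waypoint of a parent plan."""
    child = list(plan)
    i = random.randrange(len(child))
    child[i] = perturb(child[i])
    return child

def crossover(p1, p2):
    """Crossover step: swap one randomly picked waypoint between two parents."""
    i = random.randrange(min(len(p1), len(p2)))
    c1, c2 = list(p1), list(p2)
    c1[i], c2[i] = p2[i], p1[i]
    return c1, c2

# (8, 4, 15.0) is dominated by (10, 5, 12.0); the other two are incomparable.
front = pareto_front([(10, 5, 12.0), (8, 4, 15.0), (12, 3, 14.0)])
```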

EXPERIMENTS AND RESULTS


An under-construction three-story mixed-use facility is chosen to validate the developed
model. The developed pipeline is added as a client-server plug-in to the Reconstruct web
platform to access flight plans, visualize data collection waypoints and flight paths, run the
optimization module, and make the flight plans available for the execution module. Four
different flight plan templates were created and then optimized to improve visual quality and
minimize the data collection duration. 4D BIM is utilized to detect the expected change during
the data collection process at the building's top floor (including exterior and interior walls and
the roof slab). Parameters of the flight plan templates are set to ensure a minimum overlap of
60% between the camera


frames. The safe proximity is set to 30 feet, the drone speed is set to 11 mph, and the camera
parameters are set to match the specifications of the camera mounted on a DJI Phantom 4 drone.
Optimization parameters are set to constrain waypoints' translation to 18 feet and camera
trajectory to +/-45 degrees. The size of the seed solution is set to 100, with maximum iterations
set to 1,000 generations. The probability of mutation is set to 60% and that of crossover to 80%.
Each of the four initial flight plans is optimized by retrieving the waypoints of the initial plans
and altering the positions and orientations of the waypoints to generate new random solutions
(individuals). At each iteration, the visual quality of the individuals is calculated as described in
the Flight Plan Evaluation section. Since the waypoints are constrained to move parallel to or
away from the structure, the proximity of the generated flight plan to the building is assured to
be at least 30 feet, ensuring flight safety.
Figure 3 shows the four initial flight templates on the left and the optimal solution with the
maximum visibility on the right. Table 1 reports the best performing solutions for optimizing
each flight plan template in terms of maximum visibility, maximum redundancy and minimum
time. Quantitative results show that the optimal flight plans with minimum time were the same as
the initial plans. Optimizing the Top-down lawn mowing pattern (T) and the Top circular pattern
(TC) let to major improvements in the visibility and redundancy metrics of the plans. Optimizing
the Top-down lawn mowing pattern with perimeter capture (TP) let to minor visual quality
improvements. Finally, optimizing the 3D grid-based plan (G) resulted in no improvement to the
visibility metric and minor improvement to the redundancy metric by one more element.

Figure 3. Results of the optimization model showing each initial flight templates on the left
visualized using a yellow path and the optimized plan with best visibility on the right
visualized using a blue path


Table 1. Flight planning optimization results


Flight Template                                   T      TP     TC     G
# of Waypoints                                    122    187    149    361
Initial & Min Time   Time (mins)                  10     15     12     29
                     Visibility (# of elements)   131    178    114    234
                     Redundancy (# of elements)   3      147    87     205
Max Visibility       Time (mins)                  11.8   18     14.4   30.7
                     Visibility (# of elements)   153    182    170    234
                     Redundancy (# of elements)   18     117    85     206
Max Redundancy       Time (mins)                  11.4   16.2   14.4   30.7
                     Visibility (# of elements)   136    178    152    234
                     Redundancy (# of elements)   27     148    93     206
T: Top-down lawn mowing pattern, TP: Top-down lawn mowing pattern with perimeter capture, TC: Top circular
pattern and G: grid-based plan

CONCLUSION AND FUTURE WORK


The paper presents a new method for automatically creating optimized flight plans for
outdoor visual data collection on construction sites. 3D flight templates are used as initial
solutions and then optimized using a multi-objective genetic algorithm to maximize the plans’
visual quality and minimize the flight execution duration. Results of the optimization process
show that the initial plans (based on flight templates) have the lowest flight duration, as
optimizing the visual quality of the plans leads to offsetting the plans’ waypoints and thus
increasing the total flight distance and duration. In addition, the optimization has the most
significant effect on the visual quality of the top-down and circular pattern flight templates.
In the first plan, the optimization improved the visibility by 22 elements and the redundancy
by 24 elements. For the circular pattern plan, the visibility was improved by 56 elements and
the redundancy by 6 elements. The top-down plan with perimeter capture, on the other hand,
shows little improvement in visual quality due to the simple shape of the inspected structure (a
box-shaped building) that could already be covered well by the oblique perimeter cameras.
Finally, optimizing the 3D grid-based plan resulted in almost the same visual quality as the
initial plan for the same reasons.
Analyzing the deviation between the simulated and executed results was not addressed in this
paper, under the assumption that localization errors during execution will not have a major
effect on the results; further analysis of such deviations will be considered in future work.
Future work will also investigate different flight planning optimization techniques while
introducing the number of waypoints as a decision variable to improve the results.

ACKNOWLEDGEMENTS
This material is based in part upon the work supported by the National Science Foundation
under Grant CMMI 1446765. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the author(s) and do not necessarily reflect the views of
the National Science Foundation.

REFERENCES
Connolly, C. I. (1985). “The Determination of Next-best Views.” Proc., 1985 IEEE Int. Conf. on


Robotics and Automation, Vol. 4, 233–242.


Freimuth, H. and König, M. (2018). “Planning and executing construction inspections with
unmanned aerial vehicles.” J. Automation in Construction, 96, 540–553.
Golparvar-Fard, M., Savarese, S., and Peña-mora, F. (2010). “Automated Model-based
Recognition of Progress Using Daily Construction Photographs and IFC-based 4D Models.”
Construction Research Congress 2010, 51–60.
Han, K. K. and Golparvar-Fard, M. (2015). “Appearance-based Material Classification for
Monitoring of Operation-Level Construction Progress Using 4D BIM and Site Photologs.” J.
Automation in Construction, 53, 44–57.
Ibrahim, A., Roberts, D., Golparvar-Fard, M., and Bretl, T. (2017a). “An Interactive Model-
Driven Path Planning and Data Capture System for Camera-Equipped Aerial Robots on
Construction Sites.” Int. Workshop for Computing in Civil Engineering, 117–124.
Ibrahim, A., Golparvar-Fard, M., Bretl, T., and El-Rayes, K. (2017b). “Model-Driven Visual
Data Capture on Construction Sites: Method and Metrics of Success.” Int. Workshop for
Computing in Civil Engineering, 109–116.
Lin, J. J. J., Han, K. K., and Golparvar-Fard, M. (2015). “A Framework for Model-Driven
Acquisition and Analytics of Visual Data Using UAVs for Automated Construction Progress
Monitoring.” Computing in Civil Engineering, 156–164.
Mancini, F., Dubbini, M., Gattelli, M., Stecchi, F., Fabbri, S., and Gabbianelli, G. (2013). “Using
unmanned aerial vehicles (UAV) for high-resolution reconstruction of topography: The
structure from motion approach on coastal environments.” Remote Sensing, 5(12), 6880–
6898.
Murata, T., and Ishibuchi, H. (1995). “MOGA: Multi-Objective Genetic Algorithms.” In IEEE
International Conference on Evolutionary Computation (pp. 289–294).
<https://doi.org/10.1109/ICEC.1995.489161>
Phung, M. D., Quach, C. H., Chu, D. T., Nguyen, N. Q., Dinh, T. H., and Ha, Q. P. (2016).
“Automatic interpretation of unordered point cloud data for UAV navigation in
construction.” 14th Int. Conf. on Control, Automation, Robotics & Vision.
Rao, S. S. (2009). “Engineering Optimization: Theory and Practice.” Wiley-Interscience.
<https://doi.org/10.1002/9780470549124>
Reconstruct Inc. “Reconstruct Visual Data Analytics Platform for Construction.”
<https://www.reconstructinc.com> (Jan. 15, 2019)
Schönberger, J. L., and Frahm, J.-M. (2016). “Structure-from-Motion Revisited.” IEEE
Conference on Computer Vision and Pattern Recognition (CVPR).
https://doi.org/10.1109/CVPR.2016.445
Strubel, D., Morel, O., Saad, N. M., and Fofi, D. (2017). “Evolutionary algorithm for positioning
cameras networks mounted on UAV.” Intelligent Vehicles Symposium (IV), 2017 IEEE,
1758–1763.
Urrutia, J. (2000). “Art Gallery and Illumination Problems.” In Handbook of Computational
Geometry, 973–1027.


Selective Deconstruction Programming for Adaptive Reuse of Buildings


Benjamin Sanchez, Ph.D.1; Christopher Rausch2; and Carl Haas3
1Dept. of Civil and Environmental Engineering, Univ. of Waterloo, 200 University Ave. West,
Waterloo, ON N2L 3G1, Canada (corresponding author). E-mail: [email protected]
2Ph.D. Candidate, Dept. of Civil and Environmental Engineering, Univ. of Waterloo, 200
University Ave. West, Waterloo, ON N2L 3G1, Canada. E-mail: [email protected]
3Prof., Dept. of Civil and Environmental Engineering, Univ. of Waterloo, 200 University Ave.
West, Waterloo, ON N2L 3G1, Canada. E-mail: [email protected]

ABSTRACT
Adaptive reuse of buildings has been identified as a superior alternative to new construction,
in terms of sustainability and circular economy. The programming of deconstruction works plays
an important role in the adaptive reuse process in order to improve a project’s performance.
Unfortunately, there is little evidence of efficient implementation and use of deconstruction
programming for adaptive reuse building projects. The aim of this paper is to describe a selective
deconstruction programming approach for adaptive reuse of buildings. First, we describe a new
model for multiple-target selective disassembly sequence planning for buildings. Afterwards, we
describe a method for programming the deconstruction works based on the disassembly
sequences generated in the last step. The proposed approach helps to improve project
performance through process automation that supports quantitative analysis, low cost exploration
of alternatives, and an iterative design process for meeting project constraints, e.g. budget,
schedule, environmental impact.

INTRODUCTION
Adaptive reuse of buildings is considered a superior alternative to new construction in terms
of sustainability and Circular Economy (CE) (Kibert 2007; Conejos et al. 2015; Douglas 2006;
Teo and Lin 2011; Langston et al. 2008; Sanchez and Haas 2018a). However, the current
implementations of adaptive reuse rely on descriptive approaches with little objective
measurement that depends on the intuition and experience of practitioners (Sanchez and Haas
2018a; Volk et al. 2018; Mohamed et al. 2017). Increasing the efficiency of the adaptive reuse of
buildings through automation and optimization is fundamental to fully exploit the residual life-
cycle of buildings, and with it, avoid wasting the embedded resources.
In the same way that project planning for green-field construction has a significant impact on a
project’s outcome (Gibson and Gebken 2003; Kang et al. 2013; Camacho et al. 2018), project
planning for selective deconstruction has the potential of improving the performance of an
adaptive reuse building project. Selective deconstruction project planning involves the
scheduling for dismantling targeted building components, the choice of technology, the
definition of work tasks, the estimation of the required resources and durations for individual
tasks, and the identification of any interactions among the different work tasks. In contrast to the
vast literature on construction project planning, the literature on planning of selective
deconstruction projects is scarce (Hübner et al. 2017). Therefore, the development of project
planning methods for selective deconstruction is required to increase the efficacy of the process
of adaptive reuse. The goal of this paper is to describe a semi-automated selective deconstruction
programming approach for adaptive reuse of buildings. The proposed approach is able to support


efficient programming of the deconstruction works based on realistic and optimized selective
disassembly planning.

BACKGROUND
The design and planning process of an adaptive reuse building project is arguably more
complex than the design and planning of green-field construction projects. Adaptive reuse
involves restoring and, in some cases, changing the use of existing buildings that are obsolete or
are nearing their disuse stage. The extra complexity lies in the initial existing conditions that
represent initial design constraints and restrictions. If the initial evaluation of the existing
conditions is incorrect or underestimated, the impacts for the project’s outcome could be
disastrous (Conejos et al. 2016). Also, the adaptive reuse design process first includes an
intensive planning stage associated with the selective deconstruction of the existing building
asset, followed by the planning of the construction works for the redevelopment, adaptation, and
in some cases the expansion of the building asset. Our study is focused on the selective
deconstruction planning stage. In that stage, the designers decide which building components, or
subsystems, should be retrieved and the deconstruction methods to apply. Even though, the field
of deconstruction project planning of buildings has been studied for years, there is no evidence of
studies developed for the selective deconstruction planning for buildings (Sanchez and Haas
2018b; Hübner et al. 2017). Previous studies have focused on the total or partial deconstruction
of building assets with a presupposed fixed programming of work activities and work packages.
For adaptive reuse building projects, the programming of the work activities and work packages
can change according to the design decisions of the building components to retrieve. The
selective deconstruction planning is an iterative process where the different design options could
be driven by financial, technical, functional, and/or aesthetic needs, among others (Langston et
al. 2008).
The process of selective deconstruction programming, also known as selective
deconstruction scheduling, in adaptive reuse building projects is a cornerstone influencing the
project's outcome. Through appropriate programming, designers can estimate project parameters
such as total duration, critical path(s), activity precedence, and resource allocation. Planners can
then apply methods from the Project Management (PM) literature to improve the project’s
performance, such as resource leveling, schedule compression techniques, and optimization
methods (Bakry et al. 2014). In this matter, the adaptive reuse process faces the challenge that
selective deconstruction plans change based on the designer’s decisions, such as project needs
and the building components to retrieve. Each optional plan can lead to a distinctly different
programming of the activities, which has a profound impact on deconstruction planning. These
technical limitations perhaps explain why current implementations of adaptive reuse planning
rely on descriptive approaches with little objective measurement that depend on the intuition
and experience of practitioners (Sanchez and Haas 2018a; Volk et al. 2018). Due to the
importance of adaptive reuse within the CE, there is a need to improve the selective
deconstruction programming process through automation and optimization in order to
maximize the benefits of building stock renovation.

THE KNOWLEDGE GAP


Adaptive reuse of buildings plays a key role in the construction value chain of a Circular
Economy (CE). In order to improve the outcomes of adaptive reuse building projects, it is
necessary to implement appropriate project planning. This includes the development of a correct,


quantitative, analytical, and objective selective deconstruction programming. According to the


literature review, there is a lack of methods for selective deconstruction programming for
buildings. To the authors’ knowledge, this is the first study that describes a method for selective
deconstruction programming for adaptive reuse of buildings.

[Figure 1 flowchart: from the BIM model of the existing building and the adaptive reuse design,
the process retrieves BIM data, prepares the disassembly model, performs multi-target
disassembly planning, and programs the works with cost estimation and environmental impact
assessment; if the design meets the project needs, the final deconstruction planning (schedule
and budget) is produced, otherwise the adaptive reuse design is revised.]
Figure 1. Selective deconstruction project planning by using BIM-based phase planning.


A FRAMEWORK FOR DECONSTRUCTION PROJECT PLANNING FOR ADAPTIVE
REUSE OF BUILDINGS
Planning for disassembly plays a fundamental role within deconstruction project planning for
adaptive reuse, where the disassembly planning sequences for recovering targeted components
have to be estimated efficiently. The targeted components to recover depend directly on the
adaptive reuse design of the building. In this regard, the adaptive reuse design could be directed
by multiple design criteria that have been defined in several previous studies (Wilson 2010;
Bullen 2007; Conejos et al. 2015). Some of these design criteria are mandatory (e.g. physical
integrity, legal requirements, and functional service) while other ones optional (e.g.
technological retrofits, social aspects, architectural vision and programming, and political
context). The designers are responsible for making the final decisions of a functional and
affordable adaptive reuse design. Because of the multiple designs possible for adaptive reuse of a
fixed asset, the planning for disassembly of targeted components will also change from design to
design. Based on a proposed disassembly plan, it is possible to continue the deconstruction
planning in detail through the programming of the deconstruction works, estimation of resource
allocation, and computing the associated budget, among others. In the end, this becomes an
iterative design process, much of it intuitive, where the best design will be the one that fulfills
the project's needs and limitations.
However, intuitive planning has inevitably led to poor project outcomes and in some cases
has even led to project failures. This is due to the technical limitations and complexities that the
entire process requires, such as the large amount of required project data to gather and compute,
the multiple criteria for different possible designs, and the lack of user-friendly standardized
procedures for optimizing the outcomes. In this study we propose a framework for
deconstruction project planning using Building Information Modeling (BIM) based phase
planning (see Figure 1) in order to remedy the inefficiencies in the process of adaptive reuse of
buildings. Within this framework, we develop a selective deconstruction programming
method for buildings based on a multiple-target selective disassembly sequence planning model
for buildings. In addition, we setup the basis for retrieving data from the BIM model in order to


fully automate the process for deconstruction project planning.

Figure 2. An algorithm for creating a multiple-target sequential disassembly planning (SDP)
model for buildings.
AN APPROACH FOR CREATING A MULTIPLE-TARGET DISASSEMBLY
SEQUENCE PLANNING FOR BUILDINGS
This study is an extension of previous work on single-target selective disassembly planning
for adaptive reuse, in which Sanchez and Haas (2018b) describe and validate an optimization
method for sequential disassembly planning for buildings (SDPB). The SDPB tool is used to
generate optimized disassembly plans for retrieving
single targeted components from building assemblies. All the spatial, topological, and
interdependence constraints of a building assembly under study are organized in a building's
Disassembly Graph (DG) model for their computational processing. A DG model is represented
by constraint matrices where each cell position contains constraints for a part under study. A part
can either be a building component cn or a fastener fn. A constraint can be physical, functional,
environmental, or economic. The approach for creating a single-target selective disassembly
model for buildings gets parts from the DG model, arranges and orders the parts in levels, and
adds the parts to an inverted tree (Smith et al. 2012; Sanchez and Haas 2018b). Finally, the


approach uses expert rules to improve solution quality, minimize graph complexity, reduce
searching time, and reduce computational requirements (Sanchez and Haas 2018b).
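A DG constraint matrix can be pictured with a minimal sketch like the following (hypothetical part names and a single physical-blocking constraint type; the actual DG also encodes functional, environmental, and economic constraints and extraction directions):

```python
# Hypothetical sketch of a Disassembly Graph (DG) stored as a constraint
# matrix: entry [i][j] = 1 means part j blocks part i, i.e., j must be
# removed before i. Parts are components (c) or fasteners (f).
parts = ["f1", "f2", "c1", "c2"]
idx = {p: i for i, p in enumerate(parts)}

blocked_by = [[0] * len(parts) for _ in parts]

def add_constraint(part, blocker):
    blocked_by[idx[part]][idx[blocker]] = 1

add_constraint("c1", "f1")   # fastener f1 must come off before component c1
add_constraint("c2", "f2")
add_constraint("c2", "c1")   # component c1 physically blocks component c2

def removable(part, removed):
    """A part can be removed once all of its blockers have been removed."""
    row = blocked_by[idx[part]]
    return all(row[j] == 0 or parts[j] in removed for j in range(len(parts)))
```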
In this paper we extend the SDPB model in order to develop an approach for creating a
multiple-target disassembly sequence planning for buildings. This new approach merges single-
target plan models to create one with multiple targets. If the single-target plan models have
identical nodes, including the extraction direction, then the approach merges them in one plan
model with multiple targets. Otherwise, the plan models will remain disjoint. The new approach
uses the SDPB tool to choose the best direction for removing each target component and to
create one single-target disassembly plan model for each target component. The approach then
looks for identical nodes between the single-target plan models. If there are identical nodes
between two plan models, the method merges them to create one plan model with multiple
targets. The approach compares plan models one by one, starting from the plan model with the
largest number of parts. If there is more than one extraction direction for a target component,
the approach can create additional single-target plan models and select and merge the best one
into the multiple-target plan model. The best single-target plan model is defined as the one with a
maximum number of targeted components and a minimum number of parts to retrieve. The final
multiple-target disassembly sequence plan model represents a high quality, realistic, practical,
and physically feasible solution that is optimized for the number of removed parts. Figure 2
shows the workflow for the proposed approach for multiple-target disassembly sequence
planning for buildings. This kind of multiple-target model approach has proven successful in the
manufacturing industry.

[Figure 3 graphic: a building assembly of components c1–c10 and fasteners f1–f11 with x, y, z
axes, and the resulting single-target plan models SDP1, SDP2, and SDP3.]
Figure 3. Multiple-target disassembly sequence plan model for components c2, c5, and c6.
The multiple-target disassembly sequence plan model approach for buildings uses expert
rules to choose the best single-target plan models of different target components that are meant
to be part of the same disassembly plan model. Different rules can be used for different
applications; in this study the rules were derived from disassembly planning case studies for
buildings and a literature review on the topic. The following expert rules define the recursive
multi-target selective disassembly planning process.
• Rule 1: If the individual SDP’ have identical nodes, p’, including their extraction
direction, then merge the identical nodes, p’, of the single-target SDP’ to create a
multiple-target SDP’.
• Rule 2: If the individual SDP’ do not have identical nodes, p’, and they do not share the
same root node, ct, then keep the SDP’ with a unique root node separate.
• Rule 3: If the individual SDP’ do not have identical nodes, p’, but they share the same
root node, ct, then generate a new SDP for the shared root node, ct, with a different
extraction direction.
For the disassembly model shown in Figure 3, the best directions for removing components
c2, c5, and c6 are -y, -x, and -x, respectively. Figure 3 shows the single-target disassembly
sequence plan models for components c2, c5, and c6. The plan model for component c6 contains
the plan models for components c2 and c5. Therefore, the approach merges the plan models for
components c2 and c5 into one multiple-target disassembly sequence plan model for the three
components.
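Rules 1 and 2 can be sketched on plans represented as sets of (part, extraction direction) nodes. This is a hypothetical simplification loosely mirroring Figure 3; Rule 3, which regenerates a plan with a different extraction direction, is omitted:

```python
def merge_plans(plans):
    """plans: list of dicts {'target': str, 'nodes': frozenset of
    (part, direction) pairs}. Following Rule 1, a plan whose nodes all
    appear (same part, same extraction direction) in a larger plan is
    merged into it; otherwise (Rule 2) the plans stay disjoint."""
    merged = []
    for plan in sorted(plans, key=lambda p: len(p['nodes']), reverse=True):
        for m in merged:
            if plan['nodes'] <= m['nodes']:       # identical nodes found
                m['targets'].add(plan['target'])  # Rule 1: merge targets
                break
        else:                                     # Rule 2: keep disjoint
            merged.append({'targets': {plan['target']},
                           'nodes': set(plan['nodes'])})
    return merged

# Hypothetical plans: c6's plan subsumes c2's, while c9's stays disjoint.
sdp_c2 = {'target': 'c2', 'nodes': frozenset({('f1', '-y'), ('f2', '-y'), ('c2', '-y')})}
sdp_c6 = {'target': 'c6', 'nodes': frozenset({('f1', '-y'), ('f2', '-y'), ('c2', '-y'),
                                              ('f6', '-x'), ('c6', '-x')})}
sdp_c9 = {'target': 'c9', 'nodes': frozenset({('f9', '+x'), ('c9', '+x')})}
merged = merge_plans([sdp_c2, sdp_c6, sdp_c9])
```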

SELECTIVE DECONSTRUCTION PROGRAMMING FOR ADAPTIVE REUSE


As pointed out in the background section, the field of selective disassembly planning for
manufacturing products (for single and multiple targeted components) has been developed
during the last decade showing extraordinary results. For manufacturing products, the selective
disassembly time estimation focuses on sequential planning, dismantling one part at a time. This
makes sense since typically only one worker can be involved in the process of dismantling a
product. In contrast, buildings can be deconstructed by numerous workers operating in parallel.
Hence, in this paper we describe a new method to generate the programming of deconstruction
works based on the multi-target selective disassembly sequence planning approach. The new
method builds and stores individual source and target vectors with the precedence data of the
building parts to deconstruct. These vectors are generated during the execution of the multi-
target disassembly planning subroutine. The source and target vectors represent activity nodes.
The individual vectors are then merged into a main deconstruction plan, which is used to create
the activity project network of the deconstruction works.
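The precedence data can then drive a standard forward pass over the activity network, which, unlike sequential manufacturing disassembly, lets independent branches proceed in parallel (a hypothetical sketch with invented activity names and durations):

```python
def forward_pass(activities, precedes):
    """activities: dict of activity name -> duration.
    precedes: list of (source, target) pairs meaning `source` must finish
    before `target` can start (the source/target precedence vectors).
    Returns the earliest finish time of each activity (a CPM forward pass);
    activities with no precedence relation between them may run in parallel."""
    preds = {name: [] for name in activities}
    for src, tgt in precedes:
        preds[tgt].append(src)
    finish = {}
    def earliest_finish(name):
        if name not in finish:
            finish[name] = activities[name] + max(
                (earliest_finish(p) for p in preds[name]), default=0)
        return finish[name]
    for name in activities:
        earliest_finish(name)
    return finish

# Hypothetical chain: fastener f1, then component c1, then component c2.
durations = {'remove f1': 1, 'remove c1': 2, 'remove c2': 2}
network = [('remove f1', 'remove c1'), ('remove c1', 'remove c2')]
finish = forward_pass(durations, network)
```

The project duration is the maximum earliest finish, from which critical activities and resource allocation can then be derived in a PM package.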
In the first instance, the deconstruction works consist of non-destructive disassembly of the
building parts. The information related to the construction resources (materials, labor, and
machinery), labor productivity rates, activity duration, and the associated direct cost of the
deconstruction works is retrieved from the 6D BIM model. A 6D BIM model of a building
assembly integrates the three-dimensional geometry, topology, and physical properties of the
model (3D), time of building activities (4D), building works cost estimating (5D), and
environmental impact assessment (6D). Then, all this information is exported to a PM software
package in order to complete the selective deconstruction programming. The objective is to
determine essential deconstruction project planning information such as project duration, critical
activities, resource allocation, and total project cost. More specific project parameters could be
estimated as well, for example, work assignment, cashflow, and material inventory management.
This stage of deconstruction project planning can be further improved by using advanced PM
methods such as resource leveling, schedule crashing, and resource allocation. Moreover,
optimization methods could be applied for specific project objectives, expanding the scope of the
deconstruction planning to goals such as minimizing the environmental impact and maximizing
the life-cycle of the building components. After a preliminary selective deconstruction planning has been
calculated, the stakeholders can decide if the planning satisfies the needs of the adaptive reuse
building project. If the objectives of the project are not fulfilled, then the designers can go back
to modify the adaptive reuse design. This becomes an iterative process where the decision

© ASCE
Computing in Civil Engineering 2019 231

makers can learn from the old designs and move forward in order to improve the final adaptive
reuse design. The automation of the selective deconstruction project programming is the
cornerstone to make this process feasible and affordable.
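To make the vector-merging step described above concrete, the following is a minimal sketch in Python. The pair-based encoding of the source and target vectors, the part names, and the merge and ordering rules are all illustrative assumptions for this sketch, not the authors' implementation.

```python
# Illustrative sketch (assumed encoding): each single-target disassembly run
# yields (source, target) activity pairs; merging them gives the precedence
# edges of the main deconstruction plan, which can then be ordered into an
# activity sequence for the project network.

def merge_plans(vectors):
    """Merge (source, target) activity pairs from each single-target run
    into one set of precedence edges (the main deconstruction plan)."""
    edges = set()
    for vec in vectors:
        edges.update(vec)
    return edges

def topological_order(edges):
    """Order activities so every predecessor comes before its successor."""
    nodes = {n for e in edges for n in e}
    preds = {n: {s for s, t in edges if t == n} for n in nodes}
    order, done = [], set()
    while len(done) < len(nodes):
        ready = sorted(n for n in nodes - done if preds[n] <= done)
        order.extend(ready)
        done.update(ready)
    return order

# Two hypothetical single-target runs sharing one removal activity:
run_a = [("remove ceiling panel", "remove duct"), ("remove duct", "recover AHU")]
run_b = [("remove ceiling panel", "remove cable tray")]
plan = merge_plans([run_a, run_b])
print(topological_order(plan))
```

The merged edge set plays the role of the "main deconstruction plan"; a PM package would consume these precedence relations to build the activity project network.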

CONCLUSIONS
Adaptive reuse of buildings has been demonstrated to be a superior alternative to new
construction in terms of sustainability and CE when applied properly. This paper establishes
the reference framework about the importance of developing an appropriate deconstruction
project planning for improving adaptive reuse building projects' outcomes. In order to remedy
the current inefficiencies in the deconstruction planning phase for adaptive reuse, this study
describes the critical processes and subprocesses associated with the BIM-based selective
deconstruction project planning. In the end, a semi-automated selective deconstruction
programming method for buildings is developed and validated as a contribution for improving
the deconstruction project planning for adaptive reuse building projects. As discussed in the
paper, there is a lack of technical methods to assist in the decision making inside the selective
deconstruction planning of a building. Therefore, by developing user-friendly standardized
procedures and tools, it is possible to fully exploit the benefits in adaptive reuse projects. The
goal of this study is to describe a semi-automated selective deconstruction programming
approach for adaptive reuse of buildings. The proposed approach is able to create an efficient
programming of the deconstruction works based on realistic and optimized selective disassembly
planning. The new approach builds on previous work related to single-target selective
disassembly sequence planning algorithm for adaptive reuse of buildings. The approach can
create the programming of deconstruction works for recovering multiple components of a
building assembly in a semi-automated way. The automation of this complex process is a critical
feature in order to make the iterative process of adaptive reuse affordable, measurable, and
comparable.
The selective deconstruction programming approach developed in this study aims to improve
adaptive reuse project outcomes in terms of sustainability and CE. This method is a user-friendly
tool that helps during the iterative process of evaluating different design options for repurposing
an existing building asset. This study demonstrates the affordability and practicality of applying
selective deconstruction programming by using current technologies such as 6D BIM,
disassembly planning optimization models, and PM software. This study can be broadly used
with different objectives and purposes, for example the reduction of waste stream in the
construction industry through the recovery of building components for reuse and recycle.

REFERENCES
Bakry, I., Moselhi, O., and Zayed, T. (2014). "Optimized acceleration of repetitive construction
projects." Autom. Constr., 39, 145-151.
Bullen, P. A. (2007). "Adaptive reuse and sustainability of commercial buildings." Facilities,
25(1/2), 20-31.
Camacho, A., Cañizares, P. C., Estévez, S., and Núñez, M. (2018). "A tool-supported framework
for work planning on construction sites based on constraint programming." Automation in
Construction, 86, 190-198.
Conejos, S., Langston, C., and Smith, J. (2015). "Enhancing sustainability through designing for
adaptive reuse from the outset: A comparison of adaptstar and adaptive reuse potential (ARP)
models." Facilities, 33(9-10), 531-552.


Conejos, S., Langston, C., Chan, E. H. W., and Chew, M. Y. L. (2016). "Governance of heritage
buildings: Australian regulatory barriers to adaptive reuse." Build.Res.Inf., 44(5-6), 507-519.
Douglas, J. (2006). Building adaptation. Elsevier Ltd., UK.
Gibson, G., and Gebken, R. (2003). "Design quality in pre-project planning: applications of the
Project Definition Rating Index." Building Research and Information, 31(5), 346-356.
Hübner, F., Volk, R., Kühlen, A., and Schultmann, F. (2017). "Review of project planning
methods for deconstruction projects of buildings." Built Environ.Proj.Asset Manage., 7(2),
212-226.
Kang, Y., Kim, C., Son, H., Lee, S., and Limsawasd, C. (2013). "Comparison of preproject
planning for green and conventional buildings." J. Constr. Eng. Manage., 139(11).
Kibert, C. J. (2007). "The next generation of sustainable construction." Build.Res.Informat.,
35(6), 595-601.
Langston, C., Wong, F. K. W., Hui, E. C. M., and Shen, L. (2008). "Strategic assessment of
building adaptive reuse opportunities in Hong Kong." Build.Environ., 43(10), 1709-1718.
Mohamed, R., Boyle, R., Yang, A. Y., and Tangari, J. (2017). "Adaptive reuse: a review and
analysis of its relationship to the 3 Es of sustainability." Facilities, 35(3/4), 138-154.
Sanchez, B., and Haas, C. (2018a). "Capital project planning for a circular economy."
Constr. Manage. Econ., 36(6), 303-312.
Sanchez, B., and Haas, C. (2018b). "A novel selective disassembly sequence planning method
for adaptive reuse of buildings." Journal of Cleaner Production, 183, 998-1010.
Smith, S., Smith, G., and Chen, W. -. (2012). "Disassembly sequence structure graphs: An
optimal approach for multiple-target selective disassembly sequence planning." Adv.Eng.Inf.,
26(2), 306-316.
Teo, E. A. L., and Lin, G. (2011). "Building adaption model in assessing adaption potential of
public housing in Singapore." Build.Environ., 46(7), 1370-1379.
Volk, R., Luu, T. H., Mueller-Roemer, J. S., Sevilmis, N., and Schultmann, F. (2018).
"Deconstruction project planning of existing buildings based on automated acquisition and
reconstruction of building information." Autom. Constr., 91, 226-245.
Wilson, C. (2010). "Adaptive reuse of industrial buildings in Toronto, Ontario: evaluating
criteria for determining building selection". Master dissertation. Canadian theses.


Task Allocation and Route Planning for Robotic Service Networks with Multiple Depots in
Indoor Environments
Bharadwaj R. K. Mantha, Ph.D., S.M.ASCE1; and Borja Garcia de Soto, Ph.D., P.E., M.ASCE2
1S.M.A.R.T. Construction Research Group, Division of Engineering, New York Univ. Abu Dhabi (NYUAD), PO Box 129188, Abu Dhabi, UAE. E-mail: [email protected]
2S.M.A.R.T. Construction Research Group, Division of Engineering, New York Univ. Abu Dhabi (NYUAD), PO Box 129188, Abu Dhabi, UAE; Dept. of Civil and Urban Engineering, New York Univ. (NYU), Tandon School of Engineering, 6 MetroTech Center, Brooklyn, NY 11201, USA. E-mail: [email protected]

ABSTRACT
Studies suggest that the recent technological advancements in robotics can foster the
automation capabilities of the built infrastructure by introducing service robots. Task-allocation
and path-planning are two of the fundamental challenges faced by such task-oriented robots.
Existing algorithms in this domain have been adapted from outdoor logistics-based applications
with context-specific assumptions. This study addresses this issue, extends the authors’ previous
efforts, and proposes a new methodology to optimize task allocation and route planning in the
case of multiple start and destination depots, where each robot begins and ends at the same
depot. Scenario analysis is conducted to compare the performance (e.g., total distance)
of the proposed multi-depot algorithm with the single-depot one. The developed methodology is
generic and can be used for a wide range of indoor building environment applications. Finally,
limitations of the current approach are identified, and future work directions that require further
investigation are proposed.

INTRODUCTION
Intelligent robots will soon be ubiquitous, and there is a strong need to explore the potential
of these robotic systems to improve autonomy in the operation and utilization of today’s
buildings. To enable such autonomy, the robots need to a) optimally divide the task among
themselves (task allocation) and b) plan their respective paths for the assigned task locations
(route/path planning). Path planning refers to determining an optimal route from a designated
start location to a single or a set of destinations without necessarily returning to the start location.
On the other hand, tour planning refers to determining an optimal path originating and ending at
the same start location (or depot) while covering each of the desired destination locations exactly
once (Bellman 1962). This is also commonly referred to as the travelling salesman problem
(TSP) in case of a single salesman (or robot in this context) and multiple travelling salesman
problem (mTSP) in case of multiple robots (Kara and Bektas 2006; Sofge et al., 2002). Multi-
depot mTSP (MmTSP) is a more generalized version of single depot mTSP where there exist
more than one depot and salesmen at each depot. If each of the m salesmen has to return to their
respective start locations (or depots), it is referred to as the fixed destination MmTSP. On the
contrary, if the salesmen do not have to return to their original depots, but the subtotal number
of salesmen at each depot at the end should remain the same, it is referred to as the nonfixed
destination MmTSP (Kara and Bektas 2006).
The focus of previous studies was to solve the path-planning problem for indoor and outdoor
environments or tour-planning problem for outdoor environments, but indoor environments


received comparatively little attention (Jose and Pratihar 2016). Though several studies have looked into solving
mTSP and MmTSP for outdoor environments, those approaches are not directly applicable to
indoor environments due to some unique characteristic differences and challenges in indoor
environments (Mantha et al., 2017). For example, Ryan et al. (1998) proposed a heuristic based
approach (Reactive Tabu Search) to solve the routing problem for Unmanned Aerial Vehicles in
outdoor environments. Since the graph network created based on indoor environments is
incomplete because of the physical access constraints (e.g., locations inside buildings can only be
accessed through corridors and stairs), the same outdoor approach cannot be applied to indoor
built environments. Nallusamy et al. (2009) solved multi-agent task allocation for outdoor
networks based on K-means clustering, Shrink-Wrap Algorithm, and Meta-Heuristics. However,
the same approach cannot be adopted for indoor built environments, since it does not account for
the topology of the graph network (i.e., the edges that represent the connectivity between nodes).
Gorbenko et al. (2011) proposed a methodology to solve the route planning problem for
indoor service robots based on a few assumptions. One of the assumptions is that the final
solution consists of a Hamiltonian path. That is, the final solution consists of a traceable path
where the collection of nodes in the solution can be traced by visiting every node exactly once.
However, since the indoor graph network is incomplete, the final solution for each robot might
not be a Hamiltonian path. To address these limitations, Mantha et al. (2017) developed a
generalized iterative optimization algorithm to solve mTSP (consisting of multiple robots) with
single depot (or base location).
To extend this further, this work develops a generic interdisciplinary approach that addresses
the limitations of the current approaches and provides a general solution for multi-robot task
allocation and route planning involving multiple depots. It presents a methodology to solve the
fixed destination MmTSP problem for indoor environments. Specifically, the primary contribution
of this paper is a generalized methodology to optimize task allocation (i.e., divide the tasks
assigned among a group of robots in the most desirable way) and route planning (i.e., determine
the optimal way to visit several locations) for multiple concurrent or recurring (repeating or
periodic) tasks to be performed by multi-robotic systems in dynamic indoor built environments.
These robotic systems have similar mobility characteristics, physical constraints, and all begin
and end at their respective start and destination depots (i.e., charging stations). The applicability
and feasibility of the proposed approach are illustrated with the help of a case study and scenario
analysis.

PROBLEM STATEMENT
Assume ‘m’ robots distributed among ‘D’ depots (or charging stations) inside a building.
These robots complete ‘x’ tasks at different locations in the building. The objective is to visit all
the desired destination locations in the building with the help of the available robots. That is,
determine ‘m’ tours (for each robot) so the total distance traveled by the robots is minimized and
each of the desired destinations is visited at least once by any of the robots. Note that each robot
starts and ends at the same depot. This means that the sum, over all edges, of the product of the
number of times an edge (e.g., a corridor inside a building) is visited and the corresponding edge
distance needs to be minimized, as shown by the objective function (Eq. (1)). The edges and the nodes are extracted
from the graph network generated based on the floor plans (layout) of the indoor environment.
The details regarding the graph network generation are described in the methodology section of
this paper.


minimize  Σ_{i=0}^{n} Σ_{j=0}^{n} Σ_{k∈D}  c_ijk · d{V_i, V_j}                    (1)

where c_ijk ≥ 0 is the number of times the edge between nodes V_i and V_j is visited by a
robot originating from depot k, d{V_i, V_j} represents the distance between the respective
nodes, and n is the total number of nodes (or vertices) in the graph network.
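Once the edge-visit counts are known, the objective in Eq. (1) can be evaluated directly. The following is a minimal sketch; the two-edge network and the visit counts are toy assumptions, not the case-study data.

```python
# Sketch of the objective in Eq. (1): total distance = sum, over all edges and
# depots, of (times edge (i, j) is traversed by robots of depot k) x edge length.

def total_distance(visits, dist):
    """visits[(i, j, k)] = number of times edge (i, j) is traversed by robots
    originating from depot k; dist[(i, j)] = length of edge (i, j)."""
    return sum(c * dist[(i, j)] for (i, j, k), c in visits.items())

dist = {(0, 3): 4.0, (3, 8): 2.0}        # assumed corridor lengths
visits = {(0, 3, 0): 2, (3, 8, 0): 2}    # out-and-back tour 0 -> 3 -> 8 -> 3 -> 0
print(total_distance(visits, dist))      # 12.0
```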

METHODOLOGY
The proposed methodology can be broadly classified into two main steps, 1) network
formulation, and 2) network solution. The objectives of the network formulation and solution
steps are to create an indoor graph network based on the given task requirements and to solve it
for a single robot or multiple robots, respectively.

Network Formulation
The whole process of network formulation can be divided into two main steps: a) multi-layer
information generation, i.e., storing the information regarding the tasks to be performed; and b)
graphical representation, i.e., representing the obtained information in graphical terms for further
optimization. The indoor environment can be graphically represented as a weighted undirected
graph G= (V, E), where V refers to the nodes representing distinct locations (e.g., rooms), and E
refers to the edges representing the network topology connecting any two different nodes (e.g.,
corridors). Each edge can consist of several attributes such as distance, time, accessibility (e.g.,
based on the dimensions of the corridor), and feasibility (e.g., based on the type of mobility of
the robot). The aggregation of all the respective edge attributes (or weights or costs) in a matrix
is called a cost or weight (e.g., distance) matrix and is referred to as C. Further details regarding
the network formulation can be found in Mantha et al. (2017).
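The aggregation of edge weights into the cost matrix C can be sketched as follows. The three-corridor layout and its distances are illustrative assumptions, not the case-study network; node pairs with no direct corridor are left at infinity.

```python
# Sketch of the network formulation: a weighted undirected graph G = (V, E)
# for an indoor layout, aggregated into a cost (distance) matrix C.
import math

edges = {(0, 3): 4.0, (3, 4): 3.0, (3, 8): 2.0}   # corridors between rooms

n = 1 + max(v for e in edges for v in e)
C = [[math.inf] * n for _ in range(n)]
for i in range(n):
    C[i][i] = 0.0
for (i, j), d in edges.items():           # undirected: fill both directions
    C[i][j] = C[j][i] = d

# Pairs with no direct corridor stay at infinity; a tour hop between them is a
# "pseudo edge" until it is expanded through a shortest-path search.
print(C[0][3], C[3][0], C[0][4])
```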

Network Solution
A complete graph network is thus generated based on the given task requirement locations in
the indoor environment. Network solution is mainly divided into two phases. In the first phase, a
near-optimal feasible solution is determined for a single robot and depot. In the second phase,
this solution is used as an initial guess to determine the solution for multiple robots and depots.
Each of these phases is described in detail below.

Single Robot Solution


A three-step process determines the solution for the single robot case. First, the shortest tour for
a single robot to visit all the nodes in the network (at least once) and return to the base node is
determined with the help of a Traveling Salesman Problem (TSP) algorithm such as Branch and
Bound (Bellman 1962; Ali and Kennington 1986; Kara and Bektas 2006). The optimal tour
obtained from the algorithm will most likely contain pseudo edges because of the incomplete
nature of the network. Second, the pseudo edges, which do not physically signify anything in the
indoor built environment, need to be converted to actual edges. This is done with the help of
Dijkstra’s shortest path algorithm (Dijkstra 1959; Dorigo et al. 2008). For example, consider the
tour 8-->10→13 in Figure 1, where dotted (-->) and solid (→) arrows represent pseudo and
actual edges, respectively. The algorithm detects 8-->10 as a pseudo edge, calls the shortest path
algorithm, and determines 8→9→10 as the shortest path from 8 to 10. Therefore, the algorithm
modifies the total path to 8→9→10→13. This process is repeated for all such edges to obtain the
final solution, which consists of only actual edges. Lastly, the task requirement locations are
incorporated into the final tour, which consists of just the nodes and edges.
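The pseudo-edge expansion step above can be sketched as follows, reusing the 8-->10 example; the corridor weights are assumed for illustration.

```python
# Sketch of pseudo-edge expansion: every hop in the TSP tour that has no
# physical corridor is replaced by its Dijkstra shortest path.
import heapq

def dijkstra_path(adj, src, dst):
    """Shortest path by Dijkstra's algorithm on an adjacency dict."""
    pq, seen = [(0.0, src, [src])], set()
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in adj.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(pq, (cost + w, nxt, path + [nxt]))
    return None

def expand_tour(tour, adj):
    """Replace pseudo edges (hops with no direct corridor) with actual paths."""
    out = [tour[0]]
    for a, b in zip(tour, tour[1:]):
        hop = [a, b] if b in adj.get(a, {}) else dijkstra_path(adj, a, b)
        out.extend(hop[1:])
    return out

# Fragment of the Figure 1 network (nodes 8, 9, 10, 13); weights assumed.
adj = {8: {9: 1.0}, 9: {8: 1.0, 10: 1.0}, 10: {9: 1.0, 13: 2.0}, 13: {10: 2.0}}
print(expand_tour([8, 10, 13], adj))   # [8, 9, 10, 13]
```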

Multi-Robot Solution
The entire process of solving the route-planning problem for the swarm of robots is divided
into seven steps as described below.
Step 1. Initialization: The initialization of several existing algorithms used to solve TSP-related
problems (e.g., mTSP and MmTSP), such as Genetic Algorithms (GA) and Neural Networks
(NN), is random or based on a greedy search. This may explain their delayed convergence and
the large number of iterations required to obtain a near-optimal solution, which can still be far
from optimal. Thus, in this methodology, the tour obtained by the single robot solution is used
for initialization instead of a random tour. The goal is to arrive at a near-optimal solution with
fewer iterations based on this informed initialization.
Step 2. Group Formation: Previous studies based on GA performed the grouping to solve the
multi-robot route planning by compressing the length of the parental individual with the help
of a genetic (or crossover) operator (Yu et al., 2002). For this study, however, the crossover
operator was used to form different groups based on the single robot solution. In the context of
this paper, a crossover point is based on the pre-defined depots where the robots are located in
the network. For example, consider the single robot tour 0→3→8→9→10→4→3→0.
If the pre-defined depots are 0 and 10, then the division is made at the edge 9→10. That means
that the crossover operator is applied at that edge, and this division results in two groups with the
following nodes: (0,3,8,9) and (10,4,3). The predefined nodes are either determined based on
physical constraints (e.g., charging stations) or identified by the facility manager. The total
number of crossover operators will be one less than the total number of depots (or robots). That
is, if there are five robots, then there will be four crossover operators, because four crossover
operators will result in five different groups of nodes (one for each of the robots). During the
iterations, other combinations of groups are generated by randomly swapping nodes between the
existing groups of nodes.
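The crossover split above can be sketched as follows, reproducing the worked example. The simple cut rule (each depot node starts a new group) is an illustrative reading of the text, not the authors' code, and assumes each depot appears once before the tour's closing return.

```python
# Sketch of Step 2: cut the closed single-robot tour at the pre-defined depots;
# each depot keeps the nodes up to the next cut.

def form_groups(tour, depots):
    """Split a closed tour into one node group per depot."""
    cuts = [i for i, node in enumerate(tour[:-1]) if node in depots]
    groups = []
    for a, b in zip(cuts, cuts[1:] + [len(tour) - 1]):
        groups.append(tour[a:b])
    return groups

tour = [0, 3, 8, 9, 10, 4, 3, 0]    # closed tour from the paper's example
print(form_groups(tour, {0, 10}))   # [[0, 3, 8, 9], [10, 4, 3]]
```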
Step 3. Uniqueness Operator: As the name suggests, the objective of this step is to avoid nodes
being shared between the groups formed and to ensure the uniqueness of each group and of the
combination of groups. This eliminates the possibility of a robot (or group of robots) visiting a
specific node multiple times. The obtained solution is termed a feasible solution if the groups
formed abide by these rules: a) intra-group uniqueness (i.e., each group formed should not
contain a repeated node), b) inter-group uniqueness (i.e., all the groups are mutually
exclusive), and c) all the groups combined consist of all the nodes in the network.
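The three feasibility rules translate directly into set checks; a minimal sketch follows, with small assumed node groups for illustration.

```python
# Sketch of the uniqueness operator (Step 3): a candidate grouping is feasible
# only if a) groups have no internal repeats, b) groups are mutually exclusive,
# and c) together they cover every node in the network.

def is_feasible(groups, all_nodes):
    merged = [n for g in groups for n in g]
    intra = all(len(g) == len(set(g)) for g in groups)   # a) no repeats
    inter = len(merged) == len(set(merged))              # b) mutually exclusive
    cover = set(merged) == set(all_nodes)                # c) full coverage
    return intra and inter and cover

nodes = {0, 3, 8, 9, 10}
print(is_feasible([[0, 3], [8, 9, 10]], nodes))       # True
print(is_feasible([[0, 3, 8], [8, 9, 10]], nodes))    # False: node 8 shared
```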
Step 4. Intra Depot Solution: At this stage, a unique group of nodes is assigned to each depot.
The immediate next step is to distribute these nodes to the available robots at that particular
depot. This is done using the iterative algorithm for single depot mTSP (Mantha et al., 2017).
Step 5. Cost Estimation: In this step, the total cost to visit all the nodes (by all the robots) in
the network is estimated. The total cost is equal to the sum of all the depot costs based on the
intra-depot solution estimated in the previous step, i.e., the total tour length required for all the
‘m’ robots, starting and ending at their respective depots, to visit all the nodes at least once.


Step 6: Data Association and Aggregation: Data association in this context refers to
associating the information regarding the total cost with the corresponding unique
combination of nodes for each depot. Several iterations are run with different combinations of
groups, based on randomly mutating the groups of nodes generated initially. The data
elements mentioned above (e.g., the unique combination of nodes at each depot and the
corresponding total cost) are stored.
Step 7: Termination Rule: Due to the randomness of the combinations of nodes generated, it is
not possible to predict whether the solution will converge. Generally, the algorithm is iterated
until there is no significant improvement in the optimized value or for a certain number of
iterations (Florin et al., 2016). To demonstrate the effectiveness of the proposed approach, the
algorithm is iterated only 100 times (this can be increased or decreased) to compare the results.
After that, the network with the minimum total cost is selected, and the respective unique
combination is chosen as the near-optimal solution for the respective problem.
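Steps 5 through 7 can be sketched as one mutate-score-keep loop. The swap-based mutation and the toy cost function below are illustrative stand-ins for the full intra-depot mTSP tour-length evaluation, not the authors' implementation.

```python
# Sketch of Steps 5-7: mutate groups by random node swaps, score each candidate
# grouping, and keep the cheapest one found within a fixed iteration budget
# (100 in the paper).
import random

def iterate(groups, cost_fn, iters=100, seed=0):
    rng = random.Random(seed)
    best, best_cost = [list(g) for g in groups], cost_fn(groups)
    for _ in range(iters):
        cand = [list(g) for g in best]
        a, b = rng.sample(range(len(cand)), 2)      # pick two groups and swap
        if cand[a] and cand[b]:                     # one node between them
            i, j = rng.randrange(len(cand[a])), rng.randrange(len(cand[b]))
            cand[a][i], cand[b][j] = cand[b][j], cand[a][i]
        c = cost_fn(cand)
        if c < best_cost:                           # data association: keep the
            best, best_cost = cand, c               # grouping with its cost
    return best, best_cost

# Toy cost: spread of node labels within each group (assumption for the demo).
cost = lambda gs: sum(max(g) - min(g) for g in gs if g)
best, c = iterate([[0, 9, 3], [8, 1, 10]], cost)
print(c <= cost([[0, 9, 3], [8, 1, 10]]))   # never worse than the start
```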

Figure 1. Graph network based on the floor plan for the case study

CASE STUDY
The graphical node network representation of the building floor plan chosen for simulation
experiments is shown in Figure 1. The undirected graph network consists of 27 nodes, 34 edges,
and 28 task requirement locations. The objective of this case study is to compare and analyze the
total distance, task distribution, and frequency of the near-optimal solutions if single and
multiple depots occur.

RESULTS AND DISCUSSION


Table 1 shows the total and individual robot distance based on the near optimal solution
returned by the iterative algorithm for single and multiple depots with a number of robots
ranging from 2 to 6. As expected, the total distance is reduced when multiple depots are used
(when compared to total distances for single depot solutions). For example, in the case of 4
robots, the total distance traveled when there is a single depot is 108.6 units but it reduces to 87.7
units when the robots are distributed among four different depots. The total distance for multiple
depots does not necessarily decrease with an increase in the number of depots and robots. For


example, the total distance traveled by all the robots in the case of 2, 3, and 4 depots (with one
robot at each depot) is 85.2, 83.5, and 87.7 units, respectively. Note that when the number of
depots (= robots) increased from 3 to 4, the total distance did not decrease. Though it seems
counterintuitive, this phenomenon can be explained as follows.

Table 1. Simulation results of the iterative algorithm for mTSP and MmTSP.

Iterative                    Total     Robot
Method   #Robots  #Depots  Distance    1      2      3      4      5      6
mTSP        2        1        89.2     4.0   85.2    -      -      -      -
            3        1       104.6     8.0   11.4   85.2    -      -      -
            4        1       108.6     4.0    8.0   11.4   85.2    -      -
            5        1       126.0     4.0    8.0   10.0   23.4   80.6    -
            6        1       136.1     4.0    8.0   10.0   14.4   19.6   80.1
MmTSP       2        2        85.2     4.0   81.2    -      -      -      -
            3        3        83.5     4.0    3.4   76.1    -      -      -
            4        4        87.7     4.0    3.4    4.4   75.9    -      -
            5        5        86.4     8.0    4.2    4.4   14.2   55.6    -
            6        6       100.4     8.0    4.4    4.4   14.2   39.0   30.0

Consider the part of the network in Figure 1 with nodes 0, 3, 4, and 8. If there is one robot and
one depot at node 0, the optimal route to visit the rest of the nodes would be 0 → 3 → 4 → 3 →
8 → 3 → 0, with a total distance of 14 units. With the same network, consider two depots at nodes
0 and 4 with one robot at each of those depots. The respective optimal routes for the robots at
nodes 0 and 4 would be 0 → 3 → 8 → 3 → 0 and 4 → 3 → 4, with tour lengths of 8 and 6 units,
respectively (i.e., total distance = 14 units). Though the number of depots and robots was
increased, the total distance traveled by the robots did not decrease.
Table 1 also shows the distance traveled by each robot for both the single and multiple
depots. For example, in the case of multiple depots (MmTSP) with 2 robots, based on the
assigned nodes in the near optimal solution, the tour length of the first and second robot is 4.0
and 81.2 respectively. That is, the first robot (Robot 1) starts at its base depot (node 0) visits
node 3 (assigned node) and returns to its base depot (node 0). Similarly, the second robot (Robot
2) starts at its base depot (node 16), visits the rest of the nodes (other than nodes 0 and 3) in the
network and returns to its base depot (node 16). One of the interesting things to observe is the
unequal division of labor. Robot 2 travels most of the distance (81.2 units), while Robot 1 travels
very little (4.0 units). This is possibly because the objective is to minimize the total
distance traveled by all the robots. To address this, distance constraints can be imposed as a
condition in the overall objective of the optimization. This can be achieved by incorporating an
additional step to the methodology at the group formation stage with a minimum distance
constraint for the groups formed. Any group formed is considered a feasible group if the tour
length of that particular group is more than the threshold limit (or minimum distance). Similarly,
a maximum distance constraint can also be incorporated which signifies the total distance any
robot can travel with a single charge.
In addition, the results from Table 1 can be used to determine the optimal number of robots
required given the frequency of task accomplishment and vice versa. For example, consider that
the task is to periodically monitor the environmental parameters at all the task requirement
locations in Figure 1. The objective is to determine the optimal number of robots required to
perform this task with a required frequency of 10 and 20 minutes. Considering an average


velocity of the robot to be 4 units/ minute, with multiple depots, the required number of robots
would be 3 (20 minutes) and 6 (10 minutes). However, if there is only one depot, 6 robots are
required to achieve a frequency of 20 minutes and more than 6 for 10 minutes. In addition, the
frequency of task accomplishment does not significantly improve in the single depot case
compared to that of multiple depots. This is because of base node location and network topology.
Even the robot assigned the farthest node (e.g., node 26) in the network has to return to the base
depot (e.g., node 0) in every tour. Thus, it can be deduced that the topology of the network and
location of the depots in the network play a significant role in the results (task distribution and
robot distances). To that extent, the assumed velocity of the robot is incidental since frequency is
a proportional measure of distance and does not impact the results. To address this issue, the
depot locations need to be optimized to achieve the desired results. However, this might not
always be feasible because the depot location might not be flexible in buildings with a significant
number of public spaces.
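One plausible reading of the frequency arithmetic above: the achievable revisit frequency is governed by the longest individual robot tour, at the stated average speed of 4 units/minute. The sketch below applies this to the per-robot distances from Table 1; the `frequency` helper is an illustrative name, not the authors' code.

```python
# Frequency check against Table 1: minutes between successive visits to a set
# of locations = longest individual robot tour / robot speed (4 units/minute).

def frequency(tour_lengths, velocity=4.0):
    """Revisit interval in minutes, limited by the slowest (longest) tour."""
    return max(tour_lengths) / velocity

print(frequency([4.0, 3.4, 76.1]))                    # 3 depots: ~19 min (< 20)
print(frequency([8.0, 4.4, 4.4, 14.2, 39.0, 30.0]))   # 6 depots: < 10 min
print(frequency([4.0, 8.0, 10.0, 14.4, 19.6, 80.1]))  # 1 depot, 6 robots: ~20 min
```

Under this reading the numbers reproduce the text: three multi-depot robots meet the 20-minute target, six meet the 10-minute target, while six single-depot robots only reach roughly 20 minutes.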

CONCLUSION
This work presents a generalized task allocation and route-planning algorithm for multiple
robots with multiple depots. The results from the case study show the feasibility and applicability
of the approach for real-world applications. Scenario analysis was conducted with an objective to
compare factors such as total distance, task distribution, and frequency of task accomplishment
for both single and multiple depot solutions. It was found that the multiple depot solution
outperforms the single depot solution in all the cases (i.e., the number of robots ranging from 2 to
6). However, the total distance traveled by all the robots did not decrease with the increase in the
number of robots as expected. This might be because of the near optimal solution or the topology
of the floor plan. In addition, there was an unequal division of labor observed in the near optimal
solution. To address this, additional constraints such as minimum and maximum distance need
to be considered in the optimization function. It was also observed that the frequency of task
accomplishment depends on the location of the depots in the network and the topology of the
network itself. Thus, to address this issue, the location of depots needs to be optimized in the
network to achieve optimal frequency in case of a limited number of resources (robots). To
extend this further, the proposed algorithm can be iterated over all possible combinations of
unique sets of depots to determine the optimal depot placement. The proposed algorithm
is general and applies to several built-environment applications such as environmental data
collection, structural health monitoring, indoor wayfinding, occupant comfort evaluation,
security, and surveillance. The algorithm can also be adapted for robotic construction
applications such as progress monitoring, safety and quality inspection, as-built model
generation, worker tracking, and fault detection.

REFERENCES
Ali, A. I., and Kennington, J. L. (1986). “The asymmetric M-travelling salesmen problem: A
duality based branch-and-bound algorithm.” Discrete Applied Mathematics, 13, 259-276.
Bellman, R. (1962). “Dynamic programming treatment of the travelling salesman problem.”
Journal of the ACM, 9(1), 61-63, DOI:10.1145/321105.321111.
Dijkstra, E. W. (1959). "A note on two problems in connection with graphs". Numerische
Mathematik. 1: 269–271. DOI:10.1007/BF01386390
Dorigo, M., Birattari, M., Blum, C., Clerc, M., Stützle, T., and Winfield, A. (2008). “Ant Colony
Optimization and Swarm Intelligence” ANTS, Vol. 5217.


Florin, C., Akhil, S., Shubham, S., Vivek, J., and Rachit, J. (2016). “A data-driven approach for
solving route & fleet optimization problems.” hyperlink [Accessed: Sept 13, 2018].
Gorbenko, A., Mornev, M., and Popov, V. (2011), “Planning a typical working day for indoor
service robots.” IAENG International Journal of Computer Science, 38(3), 176-182.
Jose, K., and Pratihar, D. K. (2016). “Task allocation and collision-free path planning of
centralized multi-robots system for industrial plant inspection using heuristic methods.”
Robotics and Autonomous Systems, 80, 34-42.
Kara, I., and Bektas, T. (2006). “Integer linear programming formulations of multiple salesman
problems and its variations.” European Journal of Operations Research, 174(3), 1449–58.
Mantha, B. R., Menassa, C. C., and Kamat, V. R. (2017). “Task Allocation and Route Planning
for Robotic Service Networks in Indoor Building Environments.” JCCE, 31(5), 04017038.
Nallusamy, R., Duraiswamy, K., Dhanalaksmi, R., and Parthiban, P. (2009), “Optimization of
Non-Linear Multiple Traveling Salesman Problem Using K-Means Clustering, Shrink Wrap
Algorithm and Metaheuristics.” IJNS, 8(4), 480-487.
Ryan, J. L., Bailey, T. G., Moore, J. T., and Carlton, W. B. (1998), “Reactive tabu search in
unmanned aerial reconnaissance simulations.” Winter Simulation Conference, 873-880.
Sofge, D., Schultz, A., and De Jong, K. (2002). “Evolutionary computational approaches to
solving the multiple traveling salesman problem using a neighborhood attractor schema.” In
Workshops on Applications of Evolutionary Computation (pp. 153-162). Springer Berlin.
Yu, Z., Jinhai, L., Guochang, G., Rubo, Z., and Haiyan, Y. (2002), “An implementation of
evolutionary computation for path planning of cooperative mobile robots.” 4th World
Congress on Intelligent Control and Automation, IEEE, Vol. 3, (pp. 1798-1802).


Vision-Based Excavator Activity Recognition and Productivity Analysis in Construction


Chen Chen1; Zhenhua Zhu2; Amin Hammad3; and Walid Ahmed4
1Dept. of Building, Civil, and Environmental Engineering, Concordia Univ., 1455 De Maisonneuve Blvd. W., Montreal, QC. E-mail: [email protected]
2Dept. of Civil and Environmental Engineering, Univ. of Wisconsin–Madison, Madison, WI 53706. E-mail: [email protected]
3Concordia Institute for Information Systems Engineering, Concordia Univ., 1455 De Maisonneuve Blvd. W., Montreal, QC. E-mail: [email protected]
4Indus.ai, 5075 Yonge St., Toronto, ON. E-mail: [email protected]

ABSTRACT
Equipment activity recognition plays an important role in automating the analysis of onsite
construction productivity, considering that human observation is always labor-intensive and
time-consuming. Recently, several methods have been proposed to recognize the activity of
construction equipment from videos. However, a common limitation of these methods is their
inability to classify a sequence of equipment activities in a long video. This paper proposes a
novel equipment activity recognition method with the support of a three-dimensional
(3D) convolutional neural network. The network could extract both temporal and spatial
information of the equipment to recognize its activities in a long video. The recognition results
are further compiled and analyzed to identify the equipment productivity. The proposed method
was tested to recognize the excavator activities (digging, swinging, and loading) on real
construction sites. The results showed that the method outperformed existing equipment activity
recognition methods in terms of accuracy and robustness.

INTRODUCTION
Construction equipment is an important component of construction projects and has a major
impact on construction productivity (Kim et al. 2018a). Recognizing the sequential activities of
equipment could provide vital information for calculating equipment productivity. Moreover,
such activity information could help construction managers optimize the equipment's operation
time, improve work efficiency, and make important project-related decisions. Despite these
benefits, equipment activity analysis is considered a time-consuming, expensive, and error-prone
job (Golparvar-Fard et al. 2013), as it requires manual observation and recording of the entire
operation process for each piece of construction equipment. Therefore, it is necessary to create
an efficient automatic activity recognition method to accurately and continuously analyze
equipment activities.
Existing automatic methods to recognize construction equipment activities can be classified
into sensor-based methods and vision-based methods. The sensor-based methods categorize
activities by analyzing the acceleration, velocity, and orientation information of the equipment
extracted from sensors, such as the Global Positioning System (GPS), inertial measurement units
(IMU), and radio frequency identification (RFID) tags (Kim et al. 2018b). However, sensor-based
methods need to attach additional devices to all the equipment being monitored, which might
not be feasible (Azar et al. 2013). Also, they have difficulty categorizing activities in detail
when the equipment stands at a fixed location and performs different activities (Kim et al.
2018b).


Compared to sensor-based methods, vision-based methods do not have the aforementioned
drawbacks. Furthermore, they can provide more direct and detailed activity information from
images and/or videos. For example, Golparvar-Fard et al. (2013) extracted the motion
information of the equipment using dense trajectories to classify the activities into detailed
categories (e.g., digging, hauling, and dumping). Kim et al. (2018b) used object detection
technology to obtain the location and transformation information of the equipment to identify
the activities of working and idling. However, the dense trajectory methods have limitations in
recognizing consecutive activities in long videos, and the object detection methods cannot
discriminate the activities in detail.
In order to address these issues, this paper introduces a novel method to recognize the
activities of excavators on the construction site. A state-of-the-art three-dimensional (3D)
convolutional neural network (CNN) was used to extract both the spatial and temporal
information of the activities and precisely recognize the series of activities of the excavator in
long videos. In addition, an automatic productivity calculation method is also developed to
analyze the recognition results and calculate the productivity of the excavator.

BACKGROUND

Equipment Activity Recognition in Construction


Recently, vision-based methods have been developed for the recognition of construction
equipment activities. One commonly used approach is the motion-based method, which uses
dense trajectories to capture the motion information of the activity. Typically, the trajectories are
the locations of interest points across consecutive frames in a video, which represent the motion
features of the activity. The activities can then be identified by classifying the motion features.
For example, Gong et al. (2011) used the 3D-Harris detector and local histograms to extract and
represent the motion features of the excavator's activities. Then, a Bayesian network was used as
the learning mechanism to classify the activities into three categories: relocating, excavating, and
swinging (Gong et al. 2011). The activity recognition schema proposed by Golparvar-Fard et
al. (2013) is similar to the one proposed by Gong et al. (2011). The major difference is that they
used 3D Histograms of Oriented Gradients (HOG) to represent the motion features and a
support vector machine (SVM) classifier to recognize the digging, hauling, dumping, and
swinging of the excavator. These studies have been tested on recognizing the activities of
construction equipment and achieved recognition accuracies ranging from 73.6% to 86.33%.
However, one major limitation of these motion-based recognition methods is that the trajectories
drift as time progresses. As a result, these methods cannot be used to detect activities in long
video sequences, which contain several consecutive activities of the equipment.
Existing activity recognition methods mainly rely on object detection and tracking
technologies to identify consecutive equipment activities in long video sequences. These
methods first obtain the equipment locations from the detection and tracking results. Then, the
types of the equipment activities are identified by comparing the spatial relationships of the
equipment with each other. Zou and Kim (2007) used the hue-saturation-value color space to
track the excavator and estimate its idle time by analyzing its changing centroid coordinates.
Azar et al. (2013) used HOG-based excavator and truck detectors to obtain the locations of the
excavator and the truck. Then, they trained an SVM classifier to identify the loading activity
using vectors computed from the distance between the base point of the excavator and the four
corners of the dump truck. Kim et al. (2018a) also used HOG-based excavator and truck
detectors to obtain the centroid distance between the excavator and the truck. Then, they
compared the change in distance across consecutive frames to distinguish the working and idling
states of the equipment. Moreover, Kim et al. (2018b) improved the previous method by tracking
the location of the truck to recognize its hauling activity. Although these methods can achieve
promising results in long videos, it is hard for them to classify the activities into more detailed
categories, such as digging, swinging, and dumping.

State-of-the-art Activity Recognition in Computer Vision


In recent years, video activity recognition has made great progress with the development of
deep learning in the computer vision field. There are two main categories of video activity
recognition with deep neural networks: two-stream methods and deep 3D Convolutional Neural
Network (CNN) methods.
Simonyan and Zisserman (2014) first proposed the two-stream method. In their method, two
CNNs were used to extract the spatial and temporal information from the frames and optical
flows of a video, separately. Then, the two CNNs were fused to identify the activity in the video.
The test on a human activity dataset achieved an accuracy of 88%. Based on this original
method, numerous variations and improvements have been produced. Wang et al. (2015) applied
data augmentation, multi-GPU, and pre-training techniques to achieve better activity recognition
accuracy. Ng et al. (2015) improved activity recognition performance on long videos by
replacing the widely used CNN with a Recurrent Neural Network (RNN). Wang et al. (2016)
presented a more efficient temporal segment network, which operates on a sequence of short
snippets sparsely sampled from the input video instead of the whole video. Although two-stream
methods achieve good performance in activity recognition, they are not efficient to use:
extracting the optical flows from the videos takes a long time and enormous computational
resources.
The deep 3D CNN methods recognize the activities in videos using a 3D neural network,
which can learn spatial-temporal features from video sequences more efficiently and directly.
Tran et al. (2017) first proposed the deep 3D CNN architecture, which used 3D convolutional
kernels to capture both temporal and spatial information from consecutive frames. Following
that, numerous research studies have been conducted to improve the performance of the 3D
CNN architecture on activity recognition. Hara et al. (2017) created a deep 101-layer 3D CNN
based on the residual network architecture, which achieved state-of-the-art performance of
90.7% in human activity recognition. Varol et al. (2018) created a 3D neural network with
long-term temporal convolutions (LTC), which can capture the long-temporal features of an
activity and achieved a high accuracy of 82.4% in long videos.

Challenges and Objective


Current methods for construction equipment activity recognition have shown promising
results in tests. However, limitations remain. First, it is difficult to precisely recognize several
consecutive activities of the equipment in a long video sequence. Specifically, the
motion-feature-based methods can merely identify a single activity in short video clips, and the
detection-based methods can only distinguish working and idling states, which is not detailed
enough for further analysis. Second, existing methods are not efficient and are difficult to use.
For example, the computation of the motion trajectories is extremely heavy, and the computation
process is complicated. Moreover, in order to identify the activity of the equipment using the
detection-based methods, a threshold has to be pre-calculated and adjusted whenever the camera
position changes.
This research will address the limitations of the current research works. The objective of this
study is to propose an efficient method to automatically recognize and analyze the sequential
activities of the excavator in long videos. A 3D CNN will be applied to recognize the activities
of the excavator in real earthmoving work, and an automatic calculation method will be created
to compute the productivity of the excavator.

METHODOLOGY

Activity Recognition
In the activity recognition task, the latest 3D residual neural network (ResNet) (Hara et al.
2017) is used to recognize the excavator's activities in the videos. The 3D ResNet has the same
residual block architecture as the ResNet (He et al. 2016), but it performs the convolution and
pooling with 3D kernels. All of the inputs, kernels, and outputs in the network are 3D tensors of
temporal length × height × width (L × H × W) (Tran et al. 2017). Specifically, the network takes
16 × 112 × 112 video frames as input. The sizes of the convolutional kernels are 3 × 3 × 3, and
the temporal strides are 1 for the first convolutional layer and 2 for the other layers.
In the training stage, to avoid overfitting, a pre-trained 3D ResNet model (Hara et al. 2017) is
selected here and tuned to be the equipment activity recognition model. The original top dense
layer in the pre-trained 3D ResNet is replaced with a new dense layer. Then, the neural network
with the new dense layer is further trained with an excavator activity database.
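As a sanity check on these dimensions, the standard convolution size formula can be applied to the input clip. The sketch below is illustrative only (not the authors' code); it assumes a padding of 1 and a spatial stride of 2 in the first layer, neither of which is stated in the paper:

```python
def conv3d_output_shape(in_shape, kernel=3, stride=(1, 1, 1), padding=1):
    """(T, H, W) output shape of a 3D convolution.

    Standard formula per dimension:
    out = floor((in + 2*padding - kernel) / stride) + 1.
    """
    return tuple((d + 2 * padding - kernel) // s + 1
                 for d, s in zip(in_shape, stride))

# 16-frame clips of 112 x 112 pixels, the network input described above.
clip = (16, 112, 112)

# First layer: temporal stride 1 (the spatial stride of 2 is an assumption);
# later layers: stride 2 in all dimensions.
after_first = conv3d_output_shape(clip, stride=(1, 2, 2))
print(after_first)  # (16, 56, 56)
```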

Activity Analysis and Productivity Calculation


In this stage, a method is created to automatically calculate the productivity of the excavator
based on the activity recognition results. In independent earthmoving work, the equipment
productivity can be calculated from the cycle time and the bucket payload as

    Productivity (LCY/hr) = (Cycles / hr) × (Average bucket payload (LCY) / Cycle)

Since the bucket payload is given by the manufacturer of the excavator, the target of the
productivity calculation becomes determining the cycle time of the excavator. Typically, one
cycle of the excavator's earthmoving work can be broken down into the activities of digging,
swinging with the bucket full, loading the truck, and swinging back with the bucket empty
(Golparvar-Fard et al. 2013). To simplify the procedure, the two types of swinging (swinging
with the bucket full and swinging with the bucket empty) are not distinguished in this paper.
Instead, they are only differentiated by considering their temporal relationships with the other
two activities. This way, one excavator working cycle is broken down into digging, swinging,
and loading, as shown in Fig. 1.
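The relationship can be expressed as a one-line helper (a hypothetical sketch; the function name and the 20-second cycle with a 1.5 LCY bucket are illustrative values, not figures from the paper):

```python
def excavator_productivity(cycle_time_s, bucket_payload_lcy):
    """Productivity in loose cubic yards per hour (LCY/hr):
    (cycles per hour) * (average bucket payload per cycle)."""
    cycles_per_hour = 3600.0 / cycle_time_s
    return cycles_per_hour * bucket_payload_lcy

# Hypothetical example: a 20 s cycle and a 1.5 LCY bucket.
print(excavator_productivity(20.0, 1.5))  # 270.0 LCY/hr
```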

Figure 1. Excavator working cycle


The time for each cycle is measured following the workflow in Fig. 2. After the activity
recognition, each video frame is labeled to indicate the activity of the excavator in that frame.
The labels are shown as Li in Fig. 2. Then, the labels of two consecutive frames are compared. If
they are the same, it means that the activity continues; therefore, the time for this activity is
increased by 1/FPS (frames per second). If the labels are different, it means that a new activity
has started, and the time of the newly recognized activity will increase by 1/FPS. The total time
of one cycle is the difference between the start times of two adjacent digging activities.
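The label-comparison logic above can be sketched in a few lines of Python (a simplified illustration, not the authors' implementation; the function name and the toy label sequence are assumptions):

```python
def cycle_times(frame_labels, fps):
    """Cycle times from per-frame activity labels.

    A label change between consecutive frames marks the start of a new
    activity; the total time of one cycle is the difference between the
    start times of two adjacent digging activities.
    """
    dig_starts = []
    for i, label in enumerate(frame_labels):
        is_new_activity = (i == 0) or (label != frame_labels[i - 1])
        if is_new_activity and label == "digging":
            dig_starts.append(i / fps)
    return [b - a for a, b in zip(dig_starts, dig_starts[1:])]

# Toy sequence at 1 FPS: two cycles of digging -> swinging -> loading.
labels = (["digging"] * 4 + ["swinging"] * 3 + ["loading"] * 3) * 2
print(cycle_times(labels, fps=1))  # [10.0]
```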

Figure 2. Workflow for cycle time calculation

IMPLEMENTATION AND RESULTS


Implementation
In the model training stage, a video database including digging, swinging, and loading
activities was created to train the model. In order to avoid bias, 351 video clips were collected
from 21 different construction scenarios, considering site conditions and equipment viewpoints,
scales and colors. The detailed information of the database is shown in Table 1.

Table 1. Statistical information of the database

Activity type    Number of videos    Total time (s)    Average video length (s)    Number of excavators
Digging          122                 651               5.3                         19
Swinging         119                 440               3.7                         21
Loading          110                 490               4.5                         19

Then, the training started by fine-tuning the 3D ResNet model developed in the work of Hara et
al. (2017). The batch size in the model was set to 16, and the learning rate was set to 0.001. In
the training process, all of the video clips were fixed to 25 FPS. Also, the data augmentation
technique was used to increase the size of the database by flipping the video frames, shifting the
video channels, and shearing the video size.
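As a minimal, pure-Python illustration of the flipping augmentation (a real pipeline would operate on image arrays with a library; the function names and the toy frame are hypothetical, and channel shifting and shearing are omitted):

```python
def flip_frame(frame):
    """Mirror one frame left-right (frame = list of pixel rows)."""
    return [row[::-1] for row in frame]

def augment_clips(clips):
    """Return the original clips plus horizontally flipped copies,
    doubling the size of the training set."""
    flipped = [[flip_frame(frame) for frame in clip] for clip in clips]
    return clips + flipped

# A one-clip "dataset" holding a single 2 x 3 grayscale frame.
clips = [[[[1, 2, 3],
           [4, 5, 6]]]]
print(augment_clips(clips)[1][0])  # [[3, 2, 1], [6, 5, 4]]
```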
In the testing stage, the proposed methods for activity recognition and productivity
calculation were tested on long video sequences taken from different construction sites. Two
indicators, precision and recall, were applied to measure the performance of the activity
recognition method:

    Precision = TP / (TP + FP),  Recall = TP / (TP + FN)

where TP is the number of true positives, FP is the number of false positives, and FN is the
number of false negatives. These two indicators are widely used to validate the performance of
object classification and detection methods in the computer vision domain. More details about
the concepts of precision and recall can be found in the work of Kim et al. (2018b).
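The two indicators can be computed directly from the counts; a small sketch (the TP/FP/FN values are hypothetical, chosen only to illustrate the calculation):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical counts: 17 true positives, 3 false positives, 4 false negatives.
p, r = precision_recall(tp=17, fp=3, fn=4)
print(round(p, 2), round(r, 2))  # 0.85 0.81
```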

Results
All of the implementation processes were performed on a server equipped with the 64-bit
Ubuntu operating system, an NVIDIA GeForce 920M GPU, and 32 gigabytes of memory. The
activity recognition results are presented in Table 2. The recognition precisions for the three
activities are 85% for digging, 89% for swinging, and 88% for loading. The recall rates are 81%,
97%, and 75%, respectively. The average accuracy of the recognition is 86%. The results
indicate that the proposed method can effectively recognize the consecutive activities of the
excavator in long video sequences.

Table 2. Activity recognition results.

                 Number of activities in the videos
Activity         Total    Incorrect recognition       Precision    Recall    Accuracy
Digging          21       4                           85%          81%
Swinging         40       1                           89%          97%
Loading          20       4                           88%          75%
Average/Total    81                                   87%          84%       86%

In addition, the times of the activities and the productivity of the excavators were also
calculated by the productivity calculation method. Figure 3 shows an example of the calculation
result, which was printed on a frame of the video. Specifically, the information provided by the
productivity calculation method includes the times of the activities, the number of cycles, and
the productivity of the excavator per hour.

Figure 3. Example of test results: (a) frame in the video (Beytekin 2018); (b) results of
activity recognition and productivity calculation printed on the frame.

CONCLUSION AND FUTURE WORK


This paper introduced the use of a 3D ResNet to recognize the activities of the excavator (i.e.,
digging, swinging, and loading) in long video sequences. Based on the recognition results, the
productivity of the excavator is estimated automatically. The proposed methods were tested
on surveillance videos from construction sites. The test results indicated that the 3D ResNet
achieved 86% average accuracy in excavator activity recognition. Moreover, the results of the
activity recognition method tested on different excavator sizes and types, camera positions, and
lighting conditions proved its robustness in construction environments and provided a solid
basis for automating the calculation of equipment productivity.
However, there are still some limitations in this research, which will be addressed in the
future. First, the proposed method can only recognize the activities when the excavator works
continuously; it is also important to identify the idling status of the excavator. Second, criteria
should be established to evaluate the performance of the productivity calculation method, and
more tests are needed to evaluate the accuracy of the productivity calculation results. Finally,
object detection and tracking methods should be integrated, so that the method can be applied
on large construction sites, where multiple pieces of equipment work together.

REFERENCES
Azar, E. R., Dickinson, S., and McCabe, B. (2013). “Server-Customer Interaction Tracker:
Computer vision-based system to estimate dirt-loading cycles.” J. Constr. Eng.
Manage., 139(7), 785-794.
Beytekin, B. (2018). Caterpillar 323F loads man tipper trailer trucks. YouTube,
<https://www.youtube.com/watch?v=G-tNgBb1WX0&t=122s> (Jun.16, 2018).
Golparvar-Fard, M., Heydarian, A., and Niebles, J. C. (2013). “Vision-based action recognition of
earthmoving equipment using spatio-temporal features and support vector machine
classifiers.” Adv. Eng. Inf., 27(2013), 652-663.
Gong, J., Caldas, C. H., and Gordon, C. (2011). “Learning and classifying actions of construction
workers and equipment using bag-of-video-feature-words and bayesian network models.”
Adv. Eng. Inf., 25(4), 771–782.
Hara, K., Kataoka, H., and Satoh, Y. (2017). “Can Spatiotemporal 3D CNNs Retrace the History
of 2D CNNs and ImageNet?” arXiv: 1711.09577.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). “Deep residual learning for image recognition.” In
Proc., IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 770–778. Las
Vegas: CVPR.
Kim, H., Ahn, C. R., Engelhaupt, D., and Lee, S. (2018a). “Application of dynamic time warping
to the recognition of mixed equipment activities in cycle time measurement.” Autom. Constr.,
87(2018), 225-234.
Kim, J., Chi, S., and Seo, J. (2018b). “Interaction analysis for vision-based activity identification
of earthmoving excavators and dump trucks.” Autom. Constr., 87(2018), 297-308.
Ng, J. Y. H., Hausknecht, M., Vijayanarasimhan, S., Vinyals, O., Monga, R., and Toderici, G.
(2015). “Beyond Short Snippets: Deep Networks for Video Classification.” In Proc., IEEE
Conf. on Computer Vision and Pattern Recognition (CVPR), 4694–4702. Boston: CVPR.
Simonyan, K., and Zisserman, A. (2014). “Two-Stream Convolutional Networks for Action
Recognition in Videos.” arXiv:1406.2199.
Tran, D., Ray, J., Shou, Z., Chang, S., and Paluri, M. (2017). “ConvNet architecture search for
spatiotemporal feature learning.” arXiv: 1708.05038.
Varol, G., Laptev, I., and Schmid, C. (2018). “Long-Term Temporal Convolutions for Action
Recognition.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6),
1510–1517.
Wang, L., Xiong, Y., Wang, Z., and Qiao, Y. (2015). “Towards Good Practices for Very Deep
Two-Stream ConvNets.” arXiv:1507.02159.
Wang, L., Xiong, Y., Wang, Z., Qiao, Y., Lin, D., Tang, X., and Van Gool, L. (2016). “Temporal
Segment Networks: Towards Good Practices for Deep Action Recognition.” European
Conference on Computer Vision (ECCV), 20-36. Amsterdam: ECCV.
Zou, J., and Kim, H. (2007). “Using hue, saturation, and value color space for hydraulic excavator
idle time analysis.” J. Comput. Civ. Eng. 21 (4), 238–246.


The Design of Future Robotic Construction Lab


C. H. Yang1; T. H. Wu2; and S. C. Kang3
1Ph.D. Student, Dept. of Civil and Environmental Engineering, Univ. of Alberta, 116 St. and 85 Ave., Edmonton, AB T6G 2R3, Canada. E-mail: [email protected]
2Ph.D. Candidate, Dept. of Civil Engineering, National Taiwan Univ., No. 1, Sec. 4, Roosevelt Rd., Taipei 10617, Taiwan. E-mail: [email protected]
3Professor, Dept. of Civil and Environmental Engineering, Univ. of Alberta, 116 St. and 85 Ave., Edmonton, AB T6G 2R3, Canada. E-mail: [email protected]

ABSTRACT
The increasing labor shortage and growing awareness of work safety make the development of
new construction methods urgent. A new type of lab for new construction processes is required
to expedite the innovation cycle. This paper presents the ongoing work of building a
construction lab at the University of Alberta. The goal of the construction lab is to provide a
sandbox for developing new construction processes and machines. We designed the lab with four
major systems: (1) sensors, to collect data from the construction site for operation assistance and
virtual reconstruction; (2) manipulators, to explore path planning algorithms and to develop
cooperation approaches between humans and machines; (3) visualizers, to construct the digital
twin of a real construction site for revealing the simulated results in a virtual environment; and
(4) computers, to run machine learning algorithms for recognizing and tracking objects in
construction environments. This laboratory allows researchers in construction engineering to
test and develop their tools in a controlled environment. Such scaled tests in the lab can bring
significant benefits in finance, efficiency, and safety.
Technical Area: Robotics, automation, and control.
Application Context: Project design, construction, planning, and management.

INTRODUCTION
The laboratory plays an essential role in both developing techniques and training personnel
in the field of engineering. In a laboratory, environmental factors can be manipulated easily
and scrutinized arbitrarily (Allen 1980). Newly developed techniques can be prototyped,
tested, and improved in a well-controlled environment. A laboratory is also a place for students
to put into practice what they have learned from lectures or textbooks, and sometimes a stepping
stone for them to industry (Basey et al. 2008; Feisel and Rosa 2013). In the field of structural
engineering, researchers usually rely on the structural lab for testing material parameters (Chan
and Chu 2004) and behaviors (Morrow et al. 1970), and for developing new types of dampers or
connectors (Yang et al. 2018). However, there is currently no such lab that can serve as an
environment for developing new construction methods or machines in the field of construction.
Nowadays, with rising labor shortages and safety awareness, it is important for engineers to
develop advanced construction methods that reduce the use of labor. The labor shortage in the
construction industry is becoming a severe problem, especially in developed countries.
According to the Canadian Federation of Independent Business (CFIB)’s recent report
(CFIB 2018), the construction industry held the second-highest job vacancy rate (3.6%) in
Canada in the first quarter of 2018. The rate is expected to rise due to the increasing aging
population and decreasing birth rate. The labor shortage brings serious issues for construction


productivity. Besides the labor shortage issue, construction safety is another increasingly critical
issue in the industry. In Canada, fatalities in the construction industry accounted for about
23.18% of all industries during 2014-2016, according to the report from the Association of
Workers’ Compensation Boards of Canada (AWCBC).

Figure 1. The average selling price of industrial robots from 2009 to 2018 (in 1,000 U.S.
dollars) (The Statistics Portal1 2018).

Figure 2. The average sensor sales price from 2010 to 2018 (in U.S. dollars per unit) (The
Statistics Portal2 2018).
With the growth of robot technology and the decrease in robot prices, developing robotic
and Internet of Things (IoT) solutions is a promising direction to reduce the labor on the job site
while increasing the productivity of the construction industry. However, shifting the long-
standing construction process from labor-based to robot-based is not a simple task; it requires a
series of research efforts. A new type of lab equipped with robots and sensors is required to test
innovative approaches. Therefore, this paper proposes a newly designed construction lab,
tentatively coined the R-Lab, with an integration of robotics, sensors, and computational units.
The lab will serve the role of innovation promoter for robotic construction processes. Just as
structural engineers rely on structural labs, the R-Lab can be the environment for developing
and testing new machines, new sensors, and new construction processes with the assistance of


robots.

REASONS FOR ROBOTIC CONSTRUCTION LAB


This paper presents the ongoing work of building a robotic construction lab in the Department
of Civil and Environmental Engineering at the University of Alberta. The objective of such a lab
is to provide a sandbox for developing new construction processes and machines. Three main
reasons explain why a robotic lab is a promising direction for the construction lab.
1. Decreasing robot and sensor prices: The prices of robots and sensors decrease
yearly due to the growth of related technology and manufacturing capacity. For instance,
the average selling price of an industrial robot is now three-quarters of what it was a
decade ago (Figure 1), and the unit price of a sensor in 2018 is half that of 2010
(Figure 2). These price trends show that robots and sensors are now more affordable
and readier for the construction industry.
2. Accurate and programmable control of robots: Compared with existing
construction machines, robots can achieve higher accuracy in manipulation. The
programmable control of a robot also makes it able to simulate different kinds of
machines. With accurate and programmable control, robots can be efficiently utilized
in the development of new machines.
3. Design toward robotic construction: Robots have been widely utilized in the
manufacturing industry for complex, repetitive, and tedious tasks for years (Rüßmann et
al. 2015). However, in the field of construction, the application of industrial robots is
still in its infancy and has a long way to go. According to the Statistics Portal’s report,
eighty-three percent of construction companies had not implemented robots in their
working processes as of 2017. With the development of both hardware and software,
robots are considered an immediate solution for achieving higher productivity by
replacing labor in the field of construction. With a robotic construction lab, researchers
can develop and test advanced robotic construction methods.

Figure 3. Degree of robotic process automation/digital labor implementation in U.S.
engineering and construction companies as of 2017 (The Statistics Portal3 2018).


Figure 4. The overall plan of the R-Lab.

Table 1. The specifications and roles of each device in R-Lab.

Device                              Specifications                                Roles in the lab
Stationary robot arm with a track   300-kg payload, 3-m reach, and 30-m track     Machine behavior simulation
Hanging robot arm with a track      150-kg payload, 1.5-m reach, and 30-m track   Robot-robot cooperation
Mobile robot with arm on the top    10-kg payload                                 Robot-human interaction
Augmented virtuality environment    -                                             Virtual environment simulation
Computational environment           High-performance computer clusters            Data processing and robot simulation

DESIGN OF R-LAB
Figure 4 illustrates the overall plan of the R-Lab. The lab will contain industrial robot arms
mounted on tracks, multiple mobile platforms with an arm on top, augmented virtuality devices
for visualizing the physical robot experiments and the virtual construction scenarios, and a high-
performance computational environment for real-time simulations. In the lab, a large stationary
robot with a high payload (at least 300 kg), long reach (at least 3 m), and a track (approximately
30 m) to increase the working space can support multiple research topics in both on-site
construction and off-site manipulation. A hanging robot arm with a mid payload (100-150 kg)
can support research topics in robot-robot cooperation for various construction tasks. Two
mobile robots with a low payload (approximately 10 kg) can be used for studying human-robot


cooperation and interaction on the construction site. The R-Lab will also be equipped with
augmented virtuality and a remote-control environment that allows a complete demonstration of
innovations for future construction sites. The high-performance computational environment can
support real-time data processing and robot simulations (Table 1).

Figure 5. The supported research topics in R-Lab. (a) autonomous construction process;
(b) teleoperations; (c) augmented virtuality; (d) 3D construction printing.

SUPPORTED RESEARCH TOPICS IN R-LAB


R-Lab is designed as a sandbox to develop new construction methods and machines for the
future construction industry. In the following four years, the lab will be utilized for supporting
four main research topics:
1. Autonomous construction process: The lab will focus on the development of autonomous
construction processes. With a 300-kg-payload robot arm, a 3-m reach, and a 30-m track,
the maximal working area reaches 6 m by 30 m, which allows processes to be simulated
at actual scale (1:1). We can simulate simplified on-site activities and manufacturing
activities with a precision ten times higher than current human processes. When
considering a 1:20 scale experiment, we can simulate construction activities over a site
area of 120 meters by 600 meters. Because the robot arms are highly flexible, they can
easily serve as simulators of construction machines, such as cranes and excavators, for
testing newly developed construction processes (Figure 5a).
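The scale arithmetic above can be checked with a short script. This is only a sketch using the dimensions quoted in the text; the function name and return format are ours:

```python
# Working area of the stationary arm: 3-m reach on each side of the
# track gives a 6-m span; the track itself is 30 m long.
ARM_REACH_M = 3.0
TRACK_LENGTH_M = 30.0

def working_area(scale: int = 1) -> tuple[float, float]:
    """Return the (width, length) of the site area the arm can represent
    when experiments are run at a 1:scale scale."""
    width = 2 * ARM_REACH_M * scale
    length = TRACK_LENGTH_M * scale
    return width, length

print(working_area(1))    # actual scale (1:1): (6.0, 30.0)
print(working_area(20))   # 1:20 scale experiments: (120.0, 600.0)
```

At 1:20 scale the 6 m by 30 m workspace stands in for a 120 m by 600 m site, matching the figures quoted above.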
2. Teleoperations: With a remote-control environment, we can develop teleoperation
systems that keep workers away from dangerous and unpredictable construction sites.
The robot arms in the lab can serve as simulators of construction machines. Sensors will
be installed to collect data for guiding the operator. The developer and the actual
machine operator can remotely control the simulated machine from the remote-control
room for testing and iterating the teleoperation system (Figure 5b).
3. Augmented virtuality: Research will also be conducted on combining a virtual view of a
real construction site with a numerical model linked to lab experiments. The augmented
virtuality system will be developed for observing machine-machine, machine-human,
and machine-environment behaviors on the construction site (Figure 5c).
4. 3D construction printing: 3D construction printing techniques will be developed in
R-Lab. We will apply additive 3D printing approaches to the manufacture of irregular
building components, such as domes and curved window frames (Figure 5d).

CONCLUSIONS
This paper presents ongoing work on building a robotic construction lab, tentatively
named R-Lab, at the University of Alberta. The lab consists of stationary robot arms with
tracks, mobile robots, an augmented virtuality environment, and high-performance
computational units. Four primary research topics will be pursued in the newly developed lab:
autonomous construction process development, teleoperation system development, augmented
virtuality implementation, and 3D construction printing. R-Lab is designed to be a sandbox for
developing new construction methods and new machines. Such a lab is expected to lead the
industry toward robotic construction.

REFERENCES
Allen, E. (1980). “Things Learned in Lab.” Journal of Architectural Education, 34(2): 22-25.
DOI: 10.1080/10464883.1980.10758245.
Association of Workers’ Compensation Boards of Canada. “2014-2016 National Work Injury,
Disease and Fatality Statistics.” < http://awcbc.org/wp-content/uploads/2018/03/National-
Work-Injury-Disease-and-Fatality-Statistics-Publication-2014-2016-May.pdf> (Nov. 18,
2018).
Basey, J., Sackett, L., and Robinson, N. (2008). “Optimal Science Lab Design: Impacts of
Various Components of Lab Design on Students’ Attitudes Toward Lab.” International
Journal for the Scholarship of Teaching and Learning, 2(1): Art. 15. DOI:
10.20429/ijsotl.2008.020115.
Chan, Y. W. and Chu, S. H. (2004). “Effect of Silica Fume on Steel Fiber Bond Characteristics
in Reactive Powder Concrete.” Cement and Concrete Research, 34(2004): 1167-1172. DOI:
10.1016/j.cemconres.2003.12.023.
Feisel, L. D. and Rosa, A. J. (2013). “The Role of the Laboratory in Undergraduate Engineering
Education.” The Research Journal for Engineering Education, 94(1): 121-130. DOI:
10.1002/j.2168-9830.2005.tb00833.x.
Morrow, J., Wetzel, R., and Topper, T. (1970). "Laboratory Simulation of Structural Fatigue
Behavior," Effects of Environment and Complex Load History on Fatigue Life, STP462-EB,
Rosenfeld, M., Ed., ASTM International, West Conshohocken, PA, 74-91. DOI:
10.1520/STP32036S.
Rüßmann, M., Lorenz, M., Gerbert, P., Waldner, M., Justus, J., Engel, P., and Harnisch, M.
(2015) “Industry 4.0: The Future of Productivity and Growth in Manufacturing Industries,”
Boston Consulting Group 2015.
<https://www.bcg.com/publications/2015/engineered_products_project_business_industry_4_future_productivity_growth_manufacturing_industries.aspx> (Nov. 18, 2018).


Statista - The Statistics Portal1. “Degree of robotic process automation/digital labor
implementation in U.S. engineering and construction companies as of 2017.” <
https://www.statista.com/statistics/805207/degree-of-technological-adoption-in-us-
engineering-and-construction-companies-2017/> (Nov. 18, 2018).
Statista - The Statistics Portal2. “Global average sensor sales price from 2010 to 2020 (in U.S.
dollars per unit).” < https://www.statista.com/statistics/736563/global-average-sales-price-of-
smart-sensors/> (Nov. 18, 2018).
Statista - The Statistics Portal3. “Worldwide sales of industrial robots from 2004 to 2017 (in
1,000 units).” <https://www.statista.com/statistics/264084/worldwide-sales-of-industrial-
robots/> (Nov. 18, 2018).
The Canadian Federation of Independent Business (2018). “Help Wanted: Private sector job
vacancies, Q1 2018.” < https://www.cfib-fcei.ca/en/research-economic-analysis/help-wanted-
private-sector-job-vacancies> (Nov. 18, 2018).
Yang, Y. Y., Chang, C. M., and Kang, S. C. (2018). “Framework of Automated Beam Assembly
and Disassembly System for Temporary Bridge Structures.” The 2018 International
Symposium on Automation and Robotics in Construction (ISARC), Berlin, Germany.


Digital Twins as the Next Phase of Cyber-Physical Systems in Construction


C. Kan1 and C. J. Anumba, Ph.D., D.Sc., P.E., F.ASCE2

1M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, PO Box 115703, Gainesville, FL 32611, U.S. E-mail: [email protected]
2College of Design, Construction, and Planning, Univ. of Florida, PO Box 115703, Gainesville, FL 32611, U.S. E-mail: [email protected]

ABSTRACT
The Industry 4.0 wave has driven various industries to become increasingly digital. Under this
trend, the construction industry is also seeking out new ways to keep pace with the rapid
transformation. As one of the major technologies associated with Industry 4.0, digital twins
(DT), a subset of cyber-physical systems (CPS), show enormous potential to benefit the
construction industry. The aim of this paper is to present a comprehensive, up-to-date literature
review and critical analysis of existing research into DT applications, with a view to identifying
the opportunities for research and applications in the construction domain. Based on the results
of the analysis, DT has the potential to gather and analyze information at different scales. In this
context, DT would facilitate more efficient construction processes and allow informed
decisions to be made, which will ultimately promote a safer environment at reduced cost and
improved efficiency.

INTRODUCTION
Technological advancement has driven dramatic increases in industrial productivity since the
dawn of the Third Industrial Revolution. Currently, the industry is under the wave of a new
technological advancement: the rise of digital transformation, known as ‘Industry 4.0’, a concept
that was initiated by the German federal government in 2011 to promote the computerization of
manufacturing. Industry 4.0 represents the fourth industrial revolution across diverse industries
including manufacturing, energy (Energy 4.0), transportation, and supply chain (Logistics 4.0)
(Lasi et al. 2014). It is now a collective term for a number of enabling technologies, including
Big Data, cloud computing, Cyber-Physical Systems (CPS), the Internet of Things (IoT), and
Digital Twins (DT) (Negri et al. 2017). While Industry 4.0 is unlocking a whole new world of
possibilities for many industries, 'Construction 4.0' is still at a very early stage, and the
construction industry still lags behind other industries in terms of automated production and
level of digitalization.
As one of the main concepts associated with the Industry 4.0 wave, DT has received
increasing attention in recent years. It is a specific form of cyber-physical system (CPS) that
refers to a near-real-time digital replica of a physical product or process, which includes all
information that could be useful throughout all lifecycle phases (Boschert and Rosen 2016). The
concept of DT has been around since 2002 and was first used in the field of aerospace. More
recently, other industry sectors such as manufacturing, industrial engineering, and robotics have
adopted the DT concept (Negri et al. 2017). As the construction industry continues to digitalize
at an increasing rate and seeks out innovations, the notion of creating DTs could open up great
opportunities for the industry's future.
Since DT in the construction industry is still in its infancy, a review of the applications in
other domains would be highly beneficial, in order to lay out the conceptual foundation for future


research, explore potential opportunities, and identify the impediments for the technological
adoption within the construction industry. In this regard, an extensive literature review was
conducted, and this paper is structured as follows: it starts with an overview of DT including its
history, evolution, and definitions. This is followed by a review of DT applications in other
industries. The key features and application areas of DT are summarized based on their relevance
to the construction industry. The concluding part of this paper makes suggestions on how DT
application in the construction industry can be achieved.

OVERVIEW OF DIGITAL TWINS (DT)


The origin of the 'twin' concept can be traced back to NASA's Apollo program; it was later
applied to the Apollo 13 moon exploration mission and the Mars rover Curiosity. Two identical
space vehicles were built, and the one remaining on earth during the mission was called the
'twin'. The twin vehicle in the Apollo program was used for training during preparation, and
while in flight, it mirrored the conditions of the vehicle in space (Rosen et al.
2015). In this sense, a 'twin' can be defined as a prototype that allows for the mirroring of
real-time conditions.
The first definition of Digital Twin (DT) was given by NASA in the Technology Roadmap
2010 as ‘an integrated multi-physics, multi-scale, probabilistic simulation of an as-built vehicle
or system that uses the best available physical models, sensor updates, fleet history, etc., to
mirror the life of its corresponding flying twin’ (Shafto et al. 2010). Around 2015, the DT
concept was adapted as comprising a generic ‘product’, and began to be adopted by various
industries such as manufacturing, industrial engineering, and informatics (Negri et al. 2017).
Several definitions of DT have been proposed and it should be noted that there are different
viewpoints on the concept of a DT. For instance, Rosen et al. (2015) viewed DT as ‘the next
wave in modelling, simulation and optimization technology’, where DT was regarded as the
model of the system itself. Some other researchers hold the view that DT represents a platform
where simulations can be built upon. It refers to a comprehensive model of the physical product
as well as the linked description and the operation data. Thus, DT not only serves as a
representation of the physical product, but it is also applicable in the whole lifecycle phases
(Glaessgen and Stargel 2012; Kraft 2016). A general definition of DT, which has been
recognized and used the most, was given by Glaessgen and Stargel in 2012: ‘digital twin is an
integrated multi-physics, multi-scale, probabilistic simulation of a complex product and uses the
best available physical models, sensor updates, etc., to mirror the life of its corresponding twin’.
Based on the definitions, a DT can be regarded as an emerging form of cyber-physical
system (CPS), while the latter refers to the bi-directional integration of computational and
physical resources, wherein the physical and virtual components can interact and communicate
with each other through embedded computers and networks (Lee 2008). DT consists of three
primary components: a physical product in real space, a virtual product in virtual space, and the
connections of data and information that tie the physical and virtual products together (Grieves
and Vickers 2017). It has emerged as a new generation of CPS, which serves to integrate
components from both physical and virtual space and to realize the interconnection between the
two spaces. This 'twinning' process, i.e., the establishment of a connection between physical and
virtual space, is achieved through the data collected by sensors. As DT is a concept of having a
real-time digital equivalent to a physical product, ideally, it contains all aspects of the data
related to the product, both geometrically and functionally (Grieves and Vickers 2017). The data
collected by sensors is used to establish the representation of the physical product, and the digital


representation is later used for modeling, visualization, analysis, simulation and further planning.
Recently, DTs have gained the ability to change dynamically in near real-time as the state of
the physical product changes. Sensory data is able to represent the impact of the external
environment on the product and vice versa (Grieves and Vickers 2017).
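The structure described above (a physical product, a virtual product, and the data connection that keeps them in sync) can be caricatured in a few lines of Python. This is an illustrative sketch only; the class and field names are ours, not from any DT standard:

```python
class DigitalTwin:
    """Minimal sketch of a digital twin: a virtual state kept in sync with
    a physical asset through sensor readings (illustrative only)."""

    def __init__(self):
        self.virtual_state = {}   # virtual product: mirrored properties
        self.history = []         # data accumulated over the lifecycle

    def ingest(self, sensor_reading: dict) -> None:
        # The data connection: each reading updates the mirror in near
        # real time and is retained for later analysis and simulation.
        self.virtual_state.update(sensor_reading)
        self.history.append(dict(sensor_reading))

twin = DigitalTwin()
twin.ingest({"temperature_c": 21.5})
twin.ingest({"temperature_c": 22.0, "vibration_mm_s": 0.3})
print(twin.virtual_state)  # latest mirrored state of the physical asset
print(len(twin.history))   # 2 readings retained for historical analysis
```

The mirror always reflects the latest readings, while the history preserves the data generated in earlier phases, echoing the real-time reflection and convergence characteristics discussed in this section.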
The definitions of DT also reveal one of its key characteristics: real-time reflection (Tao et
al. 2018). DT enables up-to-date information from the physical space to be reflected in the virtual
space with a high level of precision and synchronization. Another key characteristic of DT is
self-evolution: the model in the virtual space undergoes dynamic modifications through
comparison of the two spaces (Tuegel et al. 2011). DT also serves to integrate the two spaces, a
characteristic termed 'convergence and interaction' by Gabor et al. (2016). Tao et
al. (2018) interpreted this characteristic with two further implications: 1) convergence and
interaction of all physical space data generated in different phases, and 2) convergence and
interaction between historical data and real-time data.

APPLICATIONS OF DIGITAL TWINS IN VARIOUS INDUSTRIES


DT technology has been used in a number of industries for various purposes. It was initially
adopted and implemented in the aerospace industry. Manufacturing is another frontier industry
that started applying DT to achieve a ‘smart’ manufacturing environment. Applications can also
be seen in fields such as industrial engineering, product lifecycle management, informatics and
robotics. A summary of the applications of DT in different fields has been developed from
literature and is shown in Table 1. However, due to space limitations, only a few are discussed in
this paper. These are covered in the next section with regard to their relevance to the adoption of
DT in the construction industry. Based on the literature review, the main applications of DT
technology include:
• Enabling real-time monitoring for: 1) service life prediction, 2) ensuring product reliability, 3) quality optimization, 4) improved maintenance, and 5) data capturing and integration.
• Providing analysis capabilities to: 1) predict future performance, 2) manage the lifecycle of IoT, 3) support decision-making, and 4) promote future planning.

DISCUSSION: OPPORTUNITIES FOR DTS IN THE CONSTRUCTION INDUSTRY


The early applications of DT in other industries have given rise to the recognition of how DT
could potentially benefit the construction industry. They offer insights regarding the
opportunities for DT applications in the construction industry - these are discussed below:

Lifecycle Management
Several authors share the perspective of DT being employed in lifecycle management. For instance,
Tao et al. (2018) applied the DT concept across the whole product lifecycle (design, manufacturing,
and service), while Canedo (2017) proposed a DT method to improve IoT lifecycle phases. There
is potential for significant benefits to construction project delivery through the adoption of the
DT concept. The benefits extend from the early design phase, through construction delivery, and
ultimately to long-term building operation and facility management. One of the biggest benefits is the
transparency and collaboration across multiple parties (owner, architect, engineer, contractor,
and operator) and project stages.


Table 1. Applications of DT in different industry sectors.

No | Field | Year | Reference | DT Application | Aim
1 | Aerospace | 2011 | Tuegel et al. | Reengineer aircraft structural life prediction | To predict life of aircraft structure and assure its structural integrity
2 | Aerospace | 2012 | Glaessgen & Stargel | Integrate simulation with existing management system to mirror the life of the flying vehicle | To continuously monitor and forecast the health of the vehicle
3 | Aerospace | 2016 | Kraft | Merge physical modelling and experimental data to generate whole lifecycle digital representation | To provide engineering analysis capabilities and support decision making
4 | Aerospace | 2017 | Li et al. | Aircraft wing health monitoring, crack growth prediction | To provide the decision maker with the information on damage state of the aircraft
5 | Manufacturing | 2015 | Rosen et al. | Autonomous manufacturing system | To be able to quickly respond to unexpected events without re-planning and human control
6 | Manufacturing | 2017 | Schleich et al. | Build a reference model for geometrical variation management | To ensure model scalability, interoperability, expansibility and fidelity
7 | Manufacturing | 2017 | Uhlemann et al. | Realize the Cyber-Physical Production System (CPPS) | To enhance transparency in the production system and allow real-time production control
8 | Product Lifecycle Management | 2017 | Grieves & Vickers | Address human interaction that leads to accidents | To mitigate unpredictable, undesirable emergent behavior in complex systems
9 | Product Lifecycle Management | 2018 | Tao et al. | Apply big data in product design, manufacturing, and service | To enable more efficient, smart, and sustainable product lifecycle management
10 | Industrial Engineering | 2017 | Söderberg et al. | Real-time geometry assurance in individualized production | To optimize geometrical quality during design and pre-production
11 | Informatics | 2017 | Canedo | Industrial IoT lifecycle management and optimization | To improve IoT lifecycle phases with information feedback and feedforward flows
12 | Informatics | 2017 | Alam & Saddik | Create an architecture reference model for cloud-based cyber-physical systems (C2PS) | To identify various degrees of basic and hybrid computation-interaction modes; to enhance cross-domain integrations
13 | Robotics | 2016 | Schluse & Rossmann | Virtual commissioning | To develop a comprehensive software environment (eRobotics) for the development of complex technical systems


Big Data
Big data is a key aspect of the DT technology. Rosen et al. (2015) stated that the DT model
requires huge digital data storage. Schleich et al. (2017) confirmed that big data management and
analytics regarding model scalability, interoperability, expansibility, and fidelity have become an
issue in the DT context. Tao et al. (2018) also mentioned big data analysis as a method to
elaborate data for DT-based product lifecycle optimization. The advances in information
technology also drive the construction industry toward the big data era. Data mining and analysis
are now very important in dealing with the vast amount of data generated during the entire
project lifecycle – planning, design, construction, and maintenance.
Data interoperability has become a significant issue in the exploration of Big Data
applications in the construction industry, and the need for open data standards and a standard
exchange format to support efficient information exchange is well understood. DT can contribute
to addressing this as it promotes tight coordination between the virtual and physical aspects of a
project. It also enables the direct analysis of the theoretical values of big data and the real
condition of a project. Various information generated by different parties throughout the entire
project lifecycle can be simulated, monitored, optimized, and verified on the same platform. This
sharable information platform will facilitate standardized process, workflow, and procedure for
each specific project, and efficient, transparent and collaborative ways of working will ultimately
benefit the whole industry.

Real-time Monitoring
The real-time monitoring capability of DT is one of the major research streams. Some
researchers focus on the monitoring of the product/process itself. For example, Glaessgen and
Stargel (2012) integrated simulation with existing management system to continuously monitor
the health of a space vehicle while Li et al. (2017) used DT to monitor the damage state of
aircraft and to predict potential crack growth. Söderberg et al. (2017) performed real-time
monitoring in individualized production to assure geometrical quality during design and pre-
production. Another research strand is concerned with monitoring labor behavior to reduce
human error. Grieves and Vickers (2017) addressed human interaction that leads to accidents, to
mitigate unpredictable and undesirable emergent behavior in complex systems.
The construction industry has long been criticized for its poor safety record. As a labor-
intensive industry, human error is one of the significant causes that lead to potential safety
problems. DT, which links the physical with its virtual equivalent, has the potential to mitigate
hazardous conditions or unsafe human behavior. On one hand, the dynamic conditions on site
can be studied by the embedded DT algorithms, and the safety condition can be assessed against
a variety of different boundary conditions. For instance, DT makes it possible to check for cracks
on columns or any material displacement through pre-set image-processing algorithms. This
would trigger additional inspections and thus help to detect potential safety issues before an
accident occurs. Additionally, the real-time monitoring and re-construction features of DT allow
for tracking the presence and behavior of workers on site, in order to prevent inappropriate
behavior and activities in hazardous zones. A notification system can be employed, to issue
warnings to the manager on site as well as any workers in potential danger.
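A notification of the kind described above could be sketched as a simple geofence check on tracked worker positions. The zone coordinates, worker IDs, and function name below are invented for illustration:

```python
# Sketch: flag workers whose tracked positions fall inside a hazardous zone.
# The zone is an axis-aligned rectangle (xmin, ymin, xmax, ymax) in site metres.
HAZARD_ZONE = (10.0, 0.0, 20.0, 15.0)   # hypothetical hazardous area

def workers_in_danger(positions: dict[str, tuple[float, float]],
                      zone: tuple[float, float, float, float]) -> list[str]:
    """Return the IDs of workers currently inside the zone."""
    xmin, ymin, xmax, ymax = zone
    return [wid for wid, (x, y) in positions.items()
            if xmin <= x <= xmax and ymin <= y <= ymax]

tracked = {"W-01": (12.0, 5.0), "W-02": (30.0, 8.0)}
for wid in workers_in_danger(tracked, HAZARD_ZONE):
    print(f"WARNING: worker {wid} inside hazardous zone")  # flags W-01 only
```

In a real DT the positions would stream from on-site sensors and the warning would be routed to the site manager and the worker in danger; here the check simply runs over a static snapshot.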

Decision Making / Performance Prediction


Another benefit of DT is improved efficiency in the decision-making process, which results


from the multi-modal data (e.g., historical data and data generated during operation) fused into
a single analytics system. The analysis performed on the stored data makes it possible to provide
the derived insights immediately, thus reducing effort in decision making and providing a basis
for moving forward. This capability of DT was highlighted by researchers such as Tuegel et al.
(2011), who used it to predict the life of aircraft structures through mirrored flight information.
Glaessgen and Stargel (2012) forecasted the health of space vehicles by continuously monitoring
and analyzing performance data. Kraft (2016) merged physics-based modelling with
experimental data to generate a whole-lifecycle digital representation, thus providing
engineering analysis capabilities in supporting decision-making. Li et al. (2017) monitored the
health of aircraft wings, enabling the decision maker to be able to quickly respond to unexpected
events.
Along the lifecycle of a construction project, DT has the capacity to record the progress on
site, keeping a real-time digital replica of the project, and continuously adjusting itself for
optimized results. In this sense, performance prediction generated within the virtual space
represents an accurate basis for well-informed decisions.
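The fusion of historical and real-time data into a forward-looking estimate can be sketched with a simple exponentially weighted forecast. This is a stand-in for whatever predictive model a real DT would use; the data and parameter values are invented:

```python
# Sketch: blend a historical baseline with each real-time reading
# into a smoothed estimate that can serve as a basis for decisions.
def ewma_forecast(history: list[float], live: list[float], alpha: float = 0.3) -> float:
    """Start from the mean of the historical data, then fold in each
    real-time reading with weight alpha (exponentially weighted average)."""
    est = sum(history) / len(history)
    for reading in live:
        est = alpha * reading + (1 - alpha) * est
    return est

baseline = [10.0, 10.2, 9.8]   # e.g., past cycle durations (minutes)
stream = [11.0, 11.5]          # readings from the current operation
print(round(ewma_forecast(baseline, stream), 2))
```

As live readings drift upward, the estimate moves above the historical baseline, giving the decision maker early, quantified notice of a slowdown.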

Efficiency
Beyond the safety perspective highlighted in the real-time monitoring capability of DT,
efficiency is another driver for several industries to implement DT. In the construction industry,
there is no doubt that efficiency is of vital importance in ensuring on-time and on-budget project
delivery. With the capability of integrating multi-scale data and simulation, DT offers new
approaches in improving efficiency throughout the project lifecycle:
DT-based design: DT provides a platform for various models (e.g. conceptual model,
architectural model, structural model, etc.) to be integrated, which enables calculation and
simulation to be performed as a whole. Therefore, enhanced coordination is achieved, simulation
scenarios accounting for all aspects (architectural and structural) can be created to accurately
predict the actual performance of the proposed building, and early modifications can be made.
DT-based construction: There are potential benefits from a number of perspectives:
• Progress monitoring. This verifies that the completed work is consistent with plans and specifications. With the application of DT, an as-built state of a project can be re-constructed and constantly synced. Moreover, this as-built model can be compared with an as-designed model in real time, which allows corrective actions to be taken.
• Site logistics planning. With DT, material allocation and equipment utilization can be monitored and tracked automatically. This will assist in avoiding over-allocation and dynamically predicting resource requirements on sites, thereby providing a more efficient approach to logistics planning.
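The as-built versus as-designed comparison described above can be sketched as a per-element tolerance check. The element names, coordinates, and tolerance value below are hypothetical:

```python
# Sketch: compare as-built element positions against the as-designed model
# and report deviations beyond a tolerance, prompting corrective action.
TOLERANCE_M = 0.05  # hypothetical 5-cm placement tolerance

def deviations(as_designed: dict[str, tuple[float, float, float]],
               as_built: dict[str, tuple[float, float, float]],
               tol: float = TOLERANCE_M) -> dict[str, float]:
    """Return elements whose built position deviates from design by more than tol."""
    out = {}
    for elem, d in as_designed.items():
        b = as_built.get(elem)
        if b is None:
            continue  # element not yet built; skip
        dist = sum((di - bi) ** 2 for di, bi in zip(d, b)) ** 0.5
        if dist > tol:
            out[elem] = round(dist, 3)
    return out

design = {"column_A1": (0.0, 0.0, 0.0), "beam_B2": (4.0, 0.0, 3.0)}
built = {"column_A1": (0.02, 0.01, 0.0), "beam_B2": (4.10, 0.0, 3.0)}
print(deviations(design, built))  # only beam_B2 exceeds the tolerance
```

In practice the as-built coordinates would come from reality-capture data (e.g., laser scanning) synced into the twin, and the flagged deviations would feed the corrective-action workflow.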
DT-based operation: While digital twin technology is needed initially for the planning and
construction phase, it can also provide the basis for building operations (which typically include
utilization and maintenance) to move forward. With DT, building utilization can be optimized,
unknowns can be foreseen earlier, and maintenance can be planned accordingly.

Enhanced Cyber-Physical Systems (CPS)


CPS offers a strong basis for DT, though this is not acknowledged in all DT literature. One
research strand from the manufacturing field utilizes DT to simulate CPS-based products or
production systems. For instance, Uhlemann et al. (2017) proposed a Cyber-Physical Production


System (CPPS) to enhance transparency in the production system and allow real-time production
control. While DT has not been widely investigated in the construction industry, applications of
CPS can be seen in certain fields already, namely energy monitoring, temporary structures
monitoring and mobile crane operations. With the aim of improving mobile crane safety, the
authors proposed a CPS-based approach for planning and monitoring mobile crane operations on
construction sites (Kan et al. 2018). A five-layer framework was developed, with the object
layer representing the physical world, the actuation layer representing the virtual model, and
three intermediate layers, including sensing and communication layers, serving to connect the
two system representations. In this research, CPS
provides a platform which enables the bi-directional information flow and the tight coordination
between physical crane on site and its virtual representation (Kan et al. 2018). There is scope to
extend this using DT technology. The real-time monitoring capability of DT will make it
possible to synchronize the crane operating conditions with the system and thus feed the DT with
real-time operational data, such as environmental conditions, boom angle and load weight. As a
digital replica of the physical condition, the DT incorporates both the historical data and the real-
time operational data generated in different phases of a project. It will enable more thorough
synthesis and analysis to be performed on various states of the equipment/project. As a result,
informed decisions can be made to avoid potential hazards. With the use of DT, the system
would have enhanced decision making autonomy for improved mobile crane safety. More work
is needed to fully explore all the ramifications of developing digital twins of either the mobile
crane or its operations.
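As a concrete illustration of the kind of synthesis described, a crane twin might check each telemetry update (boom angle, load weight) against a capacity rule before the lift proceeds. The numbers, rule, and names below are invented for illustration, not taken from the cited CPS framework:

```python
import math

# Sketch: a mobile-crane digital twin checks each telemetry update against
# a simplified load-moment rule (load * horizontal radius <= rated moment).
BOOM_LENGTH_M = 30.0
MAX_LOAD_MOMENT_KNM = 2000.0   # hypothetical rated load moment

def is_safe(boom_angle_deg: float, load_kn: float) -> bool:
    """Return True if the lift stays within the rated load moment."""
    radius = BOOM_LENGTH_M * math.cos(math.radians(boom_angle_deg))
    return load_kn * radius <= MAX_LOAD_MOMENT_KNM

print(is_safe(70.0, 150.0))   # steep boom, short radius: within rating
print(is_safe(20.0, 150.0))   # shallow boom, long radius: exceeds rating
```

A real twin would fold in environmental conditions and historical data as well; the point here is only that each incoming state update can be evaluated immediately in the virtual space, enabling warnings before a hazardous configuration is reached.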

CONCLUSIONS
This paper has presented a comprehensive, up-to-date literature review concerning the
history and evolution of DT technology and its current applications in various industries.
Developed in the early 2000s, the concept of DT was first utilized by the aerospace industry. It is
now recognized as an emerging technology in other fields; however, there is little or no literature
on its adoption in the construction industry. It is apparent that the application of DT is still in its
infancy. This paper has sought to explore existing application areas and identify how DT could
potentially benefit the construction industry.
Based on the results of the review, current topics and trends concerning DT application in
other fields are identified, and their relevance to the construction industry discussed. It was
identified that DT has the potential to gather and analyze data on different scales, enabling more
efficient construction processes in a safer environment and at reduced costs. It is also clear that
with regard to the adoption of new technology in the construction process, there is considerable
room for improvement. Questions that will need to be answered include how to demonstrate the
role of DT in the built environment, what sensing technologies can be leveraged to precisely
capture the reality data on site, what data model structure can be used to aggregate the massive
amount of data collected through a project’s lifecycle, and how to support seamless data
exchange between heterogeneous applications.

REFERENCES
Alam, K. M. and Saddik, A. E. (2017). “C2PS: A Digital Twin Architecture Reference Model for
the Cloud-Based Cyber-Physical Systems.” IEEE Access, 5, 2050–2062.
Boschert, S. and Rosen, R. (2016). “Digital Twin—The Simulation Aspect.” Mechatronic
Futures: Challenges and Solutions for Mechatronic Systems and their Designers, P.


Hehenberger and D. Bradley, eds., pp. 59–74. Cham: Springer International Publishing.
Canedo, A. (2016). “Industrial IoT Lifecycle via Digital Twins.” Proceedings of the Eleventh
IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System
Synthesis (pp. 29:1–29:1). New York, NY, USA: ACM.
Gabor, T., Belzner, L., and Kiermeier, M. (2016). “A simulation-based architecture for smart
cyber-physical systems.” IEEE International Conference on Autonomic Computing (ICAC),
2016:374–379
Glaessgen, E. and Stargel, D. (2012). “The Digital Twin Paradigm for Future NASA and U.S.
Air Force Vehicles.” In 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics
and Materials Conference, American Institute of Aeronautics and Astronautics.
Grieves, M. and Vickers, J. (2017). “Digital Twin: mitigating unpredictable, undesirable
emergent behavior in complex systems.” Transdisciplinary Perspective on Complex System,
85–113.
Kan, C., Fang, Y., Anumba, C.J., and Messner, J.I. (2018). “A cyber–physical system (CPS) for
planning and monitoring mobile cranes on construction sites.” Proceedings of the Institution
of Civil Engineers – Management, Procurement and Law,
https://doi.org/10.1680/jmapl.17.00042
Kraft, E. M. (2016). “The Air Force Digital Thread/Digital Twin - Life Cycle Integration and
Use of Computational and Experimental Knowledge.” 54th AIAA Aerospace Sciences
Meeting. American Institute of Aeronautics and Astronautics.
Lasi, H., Fettke, P., Kemper, H. G., Feld, T., and Hoffmann, M. (2014). “Industry 4.0.” Business
& Information Systems Engineering, 6(4), 239–242.
Lee E. A. (2008). “Cyber Physical Systems: Design Challenges.” 2008 11th IEEE International
Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC),
Orlando, FL, 2008, pp. 363-369.
Li, C., Mahadevan, S., Ling, Y., Choze, S., and Wang, L. (2017). “Dynamic Bayesian Network
for Aircraft Wing Health Monitoring Digital Twin.” AIAA Journal, 55(3), 930–941.
Negri, E., Fumagalli, L., and Macchi, M. (2017). “A Review of the Roles of Digital Twin in
CPS-based Production Systems.” Procedia Manufacturing, 11, 939–948.
Rosen, R., Wichert, G., Lo, G., and Bettenhausen, D. (2015). “About the importance of
autonomy and digital twins for the future of manufacturing.” IFAC-PapersOnLine, 48(3),
567–572.
Shafto, M., Conroy, M., Doyle, R., Glaessgen, E., Kemp, C., LeMoigne, J., and Wang, L. (2010).
“Modeling, simulation, information technology &processing roadmap.” Technology Area 11.
Schleich, B., Anwer, N., Mathieu, L., and Wartzack, S. (2017). “Shaping the digital twin for
design and production engineering.” CIRP Annals, 66(1), 141–144.
Schluse, M. and Rossmann, J. (2016). “From simulation to experientable digital twins:
Simulation-based development and operation of complex technical systems.” In 2016 IEEE
International Symposium on Systems Engineering (ISSE), pp. 1–6.
Söderberg, R., Wärmefjord, K., Carlson, J. S., and Lindkvist, L. (2017). “Toward a Digital Twin
for real-time geometry assurance in individualized production.” CIRP Annals, 66(1), 137–
140. https://doi.org/10.1016/j.cirp.2017.04.038
Tao, F., Cheng, J., Qi, Q., Zhang, M., Zhang, H., and Sui, F. (2018). “Digital twin-driven product
design, manufacturing and service with big data.” The International Journal of Advanced
Manufacturing Technology, 94(9–12), 3563–3576.
Tuegel, E. J., Ingraffea, A. R., Eason, T. G., and Spottswood, S. M. (2011). “Reengineering

© ASCE
Computing in Civil Engineering 2019 264

Aircraft Structural Life Prediction Using a Digital Twin” [Research article].


Uhlemann, T. H.-J., Lehmann, C., and Steinhilper, R. (2017). “The Digital Twin: Realizing the
Cyber-Physical Production System for Industry 4.0.” Procedia CIRP, 61, 335–340.

© ASCE
Computing in Civil Engineering 2019 265

Semantic Relation Detection between Construction Entities to Support Safe Human-Robot


Collaboration in Construction
Daeho Kim1; Ankit Goyal2; Alejandro Newell3; SangHyun Lee4; Jia Deng5;
and Vineet R. Kamat6

1Ph.D. Student, Dept. of Civil and Environmental Engineering, Univ. of Michigan, Ann Arbor, MI 48109. E-mail: [email protected]
2Ph.D. Student, Dept. of Computer Science, Princeton Univ., Princeton, NJ 08544. E-mail: [email protected]
3Ph.D. Student, Dept. of Computer Science, Princeton Univ., Princeton, NJ 08544. E-mail: [email protected]
4Associate Professor, Dept. of Civil and Environmental Engineering, Univ. of Michigan, Ann Arbor, MI 48109. E-mail: [email protected]
5Assistant Professor, Dept. of Computer Science, Princeton Univ., Princeton, NJ 08544. E-mail: [email protected]
6Professor, Dept. of Civil and Environmental Engineering, Univ. of Michigan, Ann Arbor, MI 48109. E-mail: [email protected]

ABSTRACT
Construction robots have drawn increased attention as a potential means of improving
construction safety and productivity. However, it is still challenging to ensure safe human-robot
collaboration in dynamic and unstructured construction workspaces. On construction sites,
multiple entities dynamically collaborate with each other, and the situational context between
them evolves continually. Construction robots must therefore be equipped to visually understand
the scene's context (i.e., semantic relations to surrounding entities) and thereby collaborate
safely with humans, as the human vision system does. Toward this end, this study builds a unique
deep neural network architecture and develops a construction-specialized model by experimenting
with multiple fine-tuning scenarios. This study also evaluates the model's performance on real
construction operations data to examine its potential in real-world applications. The results
showed the promising performance of the tuned model: the recall@5 on the training and validation
datasets reached 92% and 67%, respectively. The proposed method, which equips construction
co-robots with holistic scene understanding, is expected to contribute to promoting safer
human-robot collaboration in construction.

INTRODUCTION
Autonomous robots have drawn increased attention in the construction industry as an effective
means of relieving human workers from the unsafe, repetitive, and unpleasant tasks of
construction operations. Recently, a variety of construction robots are under development and in
the early stage of deployment, including a 3D-printing robot (Zhang et al. 2018), autonomous
vehicles (Sutter et al. 2018), and a humanoid robot (Kurien et al. 2018). It is expected that the
successful deployment of construction robots will significantly contribute to improving both
construction safety and productivity (Feng et al. 2015; Lundeen et al. 2017).
Despite such promise, persistent challenges in deploying co-robots have thwarted safe
collaboration between human and robot co-workers. Construction takes place in a highly
dynamic and unstructured environment where multiple machines/robots and human workers


collaborate with each other in complex ways. The situational context between them evolves
continually as the project proceeds. Construction robots must thus be able to understand the
evolving scene contexts (i.e., semantic relations to surrounding entities), thereby safely
collaborating with humans, as a human vision system does.
With the use of deep neural networks (DNNs) and computer vision, several construction
studies have achieved construction scene understanding tasks, such as construction resource
detection (Zhu et al. 2017; Yuan et al. 2017; Fang et al. 2018), workers' action recognition (Ding
et al. 2018), and proximity monitoring (Kim et al. 2019). However, little research has attempted
to address situational context understanding, such as semantic relation detection (e.g., an
excavator is guided by a worker, or an excavator is not working with a worker). In the computer
vision community, semantic relation detection also remains one of the most challenging tasks
(Newell and Deng 2017).
To address these challenges, this study builds a unique DNN architecture for semantic
relation detection and develops a construction-specialized model fine-tuned to
construction-specific settings. Further, this study evaluates the developed model on real
construction operations data to demonstrate its potential in real-world applications.

TECHNICAL CHALLENGES IN SEMANTIC RELATION DETECTION


In recent years, the computer vision community has made great strides with the advancement of
DNNs. "Starting from breakthrough achievement in image classification from 2012, there is no
computer vision applications that has not been affected by this paradigm shift" (Girshick 2017).
The scope of scene understanding is rapidly expanding as a variety of DNN architectures and
learning algorithms are being developed.
Accordingly, many construction studies have leveraged DNNs [e.g., convolutional neural
networks (CNNs) and recurrent neural networks (RNNs)] and computer vision to achieve
several scene understanding tasks, including construction resource detection (Zhu
et al. 2017; Yuan et al. 2017; Fang et al. 2018), workers' action recognition (Ding et al. 2018),
and proximity monitoring (Kim et al. 2019). However, semantic relation detection has not
yet been tackled in the construction domain.
Semantic relation detection has recently garnered attention in the computer vision community
(Newell and Deng 2017). Given the challenging nature of the task, it remains an open
problem, leading to diverse approaches: fusing imagery and text data (Lu et al. 2016); using
message-passing RNNs (Xu et al. 2017); predicting over triplets of object proposals (Li et al.
2017); and using reinforcement learning to predict over object proposals (Liang et al. 2017). Most
previous approaches depend on bounding boxes proposed by a region proposal network (RPN).
The use of an RPN helps to break the task down into more manageable steps (i.e., two-stage
inference: region proposal and semantic relation detection). However, "this breakdown often
restricts the visual features used in later steps and limits reasoning over the full contents of the
image" (Newell and Deng 2017). This separation not only forfeits the advantage of end-to-end
training but also makes the architecture vulnerable to RPN errors.
In addition, developing a construction-specialized model from a DNN architecture poses
another challenge: how to successfully train and fine-tune the untrained architecture so that it
performs well in construction-specific settings. Higher levels of scene understanding naturally
demand deeper inference and a correspondingly deeper network architecture, which in
turn requires an extensive training dataset; otherwise, overfitting occurs. Transfer learning offers
a viable way to address this issue with a small training dataset. Pre-training with an extensive


benchmark dataset, followed by fine-tuning with a relatively small construction-specific dataset,
helps to customize a model to construction-specific settings without overfitting.
However, fine-tuning still requires a certain amount of construction-specific data, as well
as experiments on various fine-tuning scenarios.

RESEARCH OBJECTIVE AND FRAMEWORK


To address these challenges, this study builds a unique DNN architecture (Px2Graph; Newell
and Deng 2017) in which scene understanding not only of individual entities (i.e., their locations)
but also of their semantic relations can be drawn interactively. Further, a construction-specialized
model is developed by experimenting with multiple fine-tuning scenarios and is validated on real
construction operations data to demonstrate its potential in real-world applications. This
study follows the framework below to achieve these aims (Figure 1).

Figure 1. Research framework.

• DNN architecture development: This study builds a unique DNN architecture that can
synchronously detect multiple objects and their semantic relations, leveraging hourglass
networks and 1x1 convolution (Px2Graph; Newell and Deng 2017).
• Data collection and annotation: Extensive construction operations data (i.e., videos) were
collected via YouTube and annotated through web-based crowdsourcing [i.e.,
Amazon Mechanical Turk (AMT)] with complete inspection.
• Construction-specialized model development: A construction-specialized model is then
developed by pre-training the proposed architecture (i.e., Px2Graph) with a benchmark
dataset [i.e., Visual Genome (Krishna et al. 2016)] and fine-tuning it with the collected
construction dataset.
• Validation on real construction data: An evaluation on real construction operations data is


conducted. Lastly, a discussion of the results and implications follows.

DNN ARCHITECTURE DEVELOPMENT


To develop an end-to-end model that can concurrently detect both objects and relations, this
study builds a unique network architecture that can address the following questions: (i) how to
extract global features that can likely be effective for semantic relation detection and (ii) how to
localize both objects and relations in a single network without region proposals. The model
architecture is detailed below (Figure 2; for more information, refer to Newell and Deng 2017):

Figure 2. Network architecture: Px2Graph, Newell and Deng 2017.

• Feature tensor extractor: The four hourglass units stacked in a row take an image as
input and produce a feature tensor of fixed size. The unique design of the hourglass allows
the combination of global and local information, which can likely be effective in
inferring the semantic relations in a frame (Newell and Deng 2017).
• Feature vector localizer: The output tensor is then converted to heat-maps by 1x1
convolution and sigmoid activation. Each heat value represents the likelihood that an
entity (i.e., object or relation) exists at the given location. The feature vectors of interest
are extracted based on these likelihood values.
• Classifier: The corresponding feature vectors are then fed into a fully connected layer
and a softmax classifier, which perform the final classification of (i) the subject class
(e.g., an excavator), (ii) the relation (e.g., is guided by), and (iii) the object class (e.g., a
worker).
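As a sketch of the feature vector localizer step described above (the array shapes, likelihood values, and top-k selection rule are illustrative assumptions, not details from the paper), extracting feature vectors at the most likely heat-map locations can be written as:

```python
def extract_feature_vectors(feature_tensor, heatmap, top_k=3):
    """Pick the top-k grid cells of a likelihood heat-map and return
    their (row, col) locations plus the feature vectors stored there.

    feature_tensor: nested list of shape [H][W][C], standing in for the
                    tensor produced by the hourglass stack (toy example).
    heatmap:        nested list of shape [H][W] with sigmoid likelihoods.
    """
    # Rank every grid cell by its likelihood, highest first.
    cells = [(heatmap[r][c], r, c)
             for r in range(len(heatmap))
             for c in range(len(heatmap[0]))]
    cells.sort(reverse=True)
    picked = cells[:top_k]
    locations = [(r, c) for _, r, c in picked]
    vectors = [feature_tensor[r][c] for _, r, c in picked]
    return locations, vectors
```

In the actual architecture, the selected vectors would then be passed to the fully connected layer and softmax classifier; plain nested lists are used here purely for illustration.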

DATA COLLECTION AND ANNOTATION


As an axiom of deep learning in computer vision, the quantity and quality of the training dataset
have a significant impact on a model's final performance. Hence, this study collected
extensive data (i.e., videos) of real construction operations and conducted frame-wise
annotation with complete inspection. First, a variety of videos of real construction sites were


collected from YouTube, including various scenes of human-machine interactions (machines
such as excavators, wheel loaders, and trucks serve as stand-ins for co-robots). Further, the
authors developed an annotation template that links the collected data to web-based
crowdsourcing (i.e., AMT) to reduce the effort required for massive annotation. The template
leads workers to annotate each object's bounding box and its relations with other objects
(Figure 1, data collection and annotation). Lastly, manual inspection ensured the validity of the
annotations. Annotation examples are illustrated in Figure 3.

Figure 3. Examples of data annotation.


In total, 76 videos from different projects were collected. These videos capture (i) 7 types of
objects (i.e., worker, excavator, truck, wheel loader, roller, grader, and van/car) and (ii) 4 types
of relations (i.e., not working with, guided by, adjusted by, and filling) (Table 1). To avoid
duplication in the dataset, one frame per second was sampled from each video. As a result, a
total of 2,502 frames were annotated and manually inspected (Table 1). This dataset includes
5,468 objects and 3,110 relations (Table 1).

Table 1. Summary of data.


Category Detail
The # of videos collected 76
The # of images annotated 2,502
Object categories excavator, person, truck, wheel loader, roller, grader, van/car
Relation categories not working with, guided by, adjusted by, filling
The # of objects annotated 5,468
The # of relations annotated 3,110
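The one-frame-per-second sampling used to build the dataset can be sketched as follows (the frame rate and clip length in the example are hypothetical; the paper does not report them):

```python
def sample_frame_indices(total_frames, fps):
    """Return the indices of one frame per second of video.

    Sampling one frame per second keeps near-duplicate frames
    out of the annotated dataset.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    step = max(1, int(round(fps)))
    return list(range(0, total_frames, step))
```

For example, a 3-second clip recorded at 30 fps yields indices 0, 30, and 60.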

CONSTRUCTION-SPECIALIZED MODEL DEVELOPMENT


This study adopted transfer learning in developing a construction-specialized model, thereby
compensating for the limited training data. First, the whole network (i.e., Px2Graph) was
pre-trained with the Visual Genome dataset (Krishna et al. 2016), the most extensive dataset
widely used in relation detection studies (Newell and Deng 2017; Xu et al. 2017; Lu et al. 2016).
Visual Genome contains 108,077 frames including 3.8 million objects and 2.3 million
relations (Krishna et al. 2017). Fine-tuning then followed with the collected
construction data: 2,000 images (80% of the total) were used for fine-tuning and 502 (20%) for
validation.
To discover a better way to transfer the pre-trained network to construction-specific settings,
the fine-tuning considered four different scenarios, each with a


distinct set of layers (i.e., hourglass units) to be fine-tuned. Table 2 illustrates the four
tuning scenarios. For example, scenario #1 fine-tunes only the last hourglass unit (i.e., the 4th
hourglass in the feature tensor extractor) by setting a zero learning rate for the other three units,
whereas scenario #4 fine-tunes all hourglass units in the feature tensor extractor.

Table 2. Fine-tuning scenarios.


Hourglass unit to be fine-tuned (O = tuned, X = frozen)
Scenarios   Hourglass #1   Hourglass #2   Hourglass #3   Hourglass #4
S #1        X              X              X              O
S #2        X              X              O              O
S #3        X              O              O              O
S #4        O              O              O              O
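The scenario definitions in Table 2 can be expressed as per-unit learning rates, freezing a unit by assigning it a zero rate; a minimal sketch (the base learning rate is a hypothetical value, not one reported in the paper):

```python
def unit_learning_rates(scenario, base_lr=1e-4, num_units=4):
    """Map a fine-tuning scenario to per-hourglass-unit learning rates.

    Scenario k fine-tunes the last k hourglass units (Table 2); the
    remaining units are frozen with a zero learning rate.
    """
    if not 1 <= scenario <= num_units:
        raise ValueError("scenario out of range")
    return [base_lr if i >= num_units - scenario else 0.0
            for i in range(num_units)]
```

In a deep learning framework, these per-unit rates would typically be passed as per-parameter-group options to the optimizer.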

Table 3. Validation results: Recall@5 of each scenario.


Scenarios   Recall@5 (%), training dataset   Recall@5 (%), validation dataset
S #1        87.78                            63.90
S #2        92.20                            61.68
S #3        93.62                            65.12
S #4        91.93                            67.41

VALIDATION ON REAL CONSTRUCTION DATA


To examine the feasibility of the developed model in real-world applications, an evaluation on
real construction data was conducted. As the evaluation metric, this study applied recall@x, the
most common metric used in relation detection studies (Newell and Deng 2017; Xu et al. 2017;
Lu et al. 2016). Note that recall@x reports the fraction of ground-truth tuples that appear in the
set of top-x predictions. Considering the diversity of the construction dataset, this work applied
recall@5. The results for the four scenarios are summarized in Table 3, and the
recall@5 values during training are plotted in Figure 4 with prediction examples.
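The recall@x metric can be sketched as follows (the tuples in the example are illustrative, not items from the actual dataset):

```python
def recall_at_k(ground_truth, ranked_predictions, k=5):
    """Fraction of ground-truth (subject, relation, object) tuples
    that appear among the top-k ranked predictions for an image."""
    if not ground_truth:
        return 0.0
    top_k = set(ranked_predictions[:k])
    hits = sum(1 for t in ground_truth if t in top_k)
    return hits / len(ground_truth)
```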
It turned out that Scenario #4 (i.e., fine-tuning the entire feature tensor extractor, hourglass
#1~4) outperformed all the other scenarios (Table 3). The construction dataset is highly
distinct from the Visual Genome dataset, which covers universal objects and relations.
Accordingly, fine-tuning the entire feature tensor extractor proved a better option than focusing
on the last several layers, as shown in this evaluation.
During fine-tuning, the relation recall@5 values for the training dataset steadily increased
(Figure 4) and converged to around 90% for all scenarios (Table 3). This stably increasing
pattern shows that the proposed architecture is capable of being specialized to
construction-specific settings. On the other hand, the relation recall@5 for the validation dataset
plateaued at around 61-67% (Table 3 and Figure 4). Although the relation recall@5 on the
validation dataset showed a steadily increasing pattern for all scenarios, it began to converge at
an early stage of fine-tuning. This suggests that all scenarios suffered from an insufficient
fine-tuning dataset, resulting in significant overfitting.
Although the developed model showed promising performance on the training dataset (i.e.,
more than 87% recall@5), it failed to generalize, resulting in poorer performance on the
validation dataset (i.e., less than 68% recall@5). This may not be sufficient for real-world
applications.


However, it is noteworthy that the proposed architecture demonstrated its potential to be
specialized to construction-specific settings. A follow-up study will therefore focus more on
improving generalization capability, which can include (i) augmentation of the fine-tuning
dataset and (ii) hyper-parameter tuning (e.g., the width, height, and depth of the feature tensor
extractor).

Figure 4. Results of S #4: Recall@5 during fine-tuning and prediction examples.


CONCLUSION
To support safe human-robot collaboration on construction sites, this study proposes a DNN-
based computer vision method for semantic relation detection. A unique DNN architecture that
can interactively detect both objects and relations is built using hourglass networks and 1x1
convolution. Further, a construction-specialized model is developed by experimenting with
multiple fine-tuning scenarios. As a result, the best model (i.e., scenario #4) achieved recall@5
of 91.93% and 67.41% on the training and validation datasets, respectively. The performance on
the validation dataset may not be sufficient for real-world applications; however, there are still
plenty of opportunities to improve it, including (i) augmentation of the fine-tuning
dataset and (ii) hyper-parameter tuning. With such refinement, the proposed architecture is
likely to yield a more robust model for construction-specific settings. The improved model will
help construction robots understand evolving scene contexts (i.e., semantic relations to
surrounding entities), and it will ultimately contribute to promoting safe collaboration between
human and robot co-workers in construction.

ACKNOWLEDGEMENT
The work presented in this paper was supported financially by a National Science Foundation
Award (No. IIS-1734266, ‘Scene Understanding and Predictive Monitoring for Safe Human-
Robot Collaboration in Unstructured and Dynamic Construction Environment’). Any opinions,
findings, and conclusions or recommendations expressed in this paper are those of the authors
and do not necessarily reflect the views of the National Science Foundation.


REFERENCES
Ding, L., Fang, W., Luo, H., Love, P.E.D., Zhong, B., and Ouyang, X. (2018). "A deep hybrid
learning model to detect unsafe behavior: Integrating convolutional neural networks and long
short-term memory." Automation in Construction, 86(2018), 118-124.
Fang, Q., Li, H., Luo, X., Ding, L., Luo, H., and Rose, T.M. (2018). "Detecting non-hardhat-use
by a deep learning method from far-field surveillance videos." Automation in Construction,
85(2018), 1-9.
Girshick, R. (2017). "Editorial-Deep learning for computer vision." Computer Vision and Image
Understanding, 164(2017), 1-2.
Kim, D., Liu, M., Lee, S.H., and Kamat, V.R. (2019). “Remote proximity monitoring between
mobile construction resources using camera-mounted UAVs.” Automation in Construction,
99(2019), 168-182.
Krishna, R., Zhu, Y., Groth, O., Johnson, J., Hata, K., Kravitz, J., Chen, S., Kalantidis, Y., Li,
L.J., Shamma, D.A., Bernstein, M.S., and Fei-Fei, L. (2017). "Visual genome: Connecting
language and vision using crowdsourced dense image annotations." International Journal of
Computer Vision, 123(1), 32-73.
Kurien, M., Kim, M.K., Kopsida, M., and Brilakis, I. (2018). "Real-time simulation of
construction workers using combined human body and hand tracking for robotic construction
worker system." Automation in Construction, 86(2018), 125-137.
Li, Y., Ouyang, W., and Wang, X. (2017). "Vip-cnn: A visual phrase reasoning convolutional
neural network for visual relationship detection." arXiv:1702.07191.
Liang, X., Lee, L., and Xing, E.P. (2017). "Deep variation-structured reinforcement learning for
visual relationship and attribute detection." arXiv:1703.03054.
Lu, C., Krishna, R., Bernstein, M., and Fei-Fei, L. (2016). "Visual relationship detection with
language priors." European Conference on Computer Vision, 852-869.
Lundeen, K.M., Kamat, V.R., Menassa, C.C., and McGee, W. (2017). "Scene understanding for
adaptive manipulation in robotized construction work." Automation in Construction,
82(2017), 16-30.
Newell, A. and Deng, J. (2017). "Pixels to graphs by associative embedding." Advances in
Neural Information Processing Systems, 2171-2180.
Sutter, B., Leleve, A., Pham, M.T., Gouin, O., Jupille, N., Kuhn, M., Lule, P., Michaud, P., and
Remy, P. (2018). "A semi-automated mobile robot for bridge inspection." Automation in
Construction, 91(2018), 111-119.
Xu, D., Zhu, Y., Choy, C.B., and Fei-Fei, L. (2017). "Scene graph generation by iterative
message passing." IEEE Conference on Computer Vision and Pattern Recognition.
Yuan, C., Li, S., and Cai, H. (2017). "Vision-based excavator detection and tracking using hybrid
kinematic shapes and key nodes." Journal of Computing in Civil Engineering, ASCE, 31(1),
04016038.
Zhang, X., Li, M., Lim, J.H., Weng, Y., Tay, Y.W.D., and Pham, H. (2018). "Large-scale 3D
printing by a team of mobile robots." Automation in Construction, 95(2018), 98-106.
Zhu, Z., Ren, X., and Chen, Z. (2017). "Integrated detection and tracking of workforce and
equipment from construction jobsite videos." Automation in Construction, 81(2017), 161-
171.


Assessments of Intuition and Efficiency: Remote Control of the End Point of Excavator in
Operational Space by Using One Wrist
Dong-ik Sun1; Sang-keun Lee2; Yong-seok Lee, Ph.D.3; Sang-ho Kim4; Jun Ueda5;
Yong K. Cho6; Yong-han Ahn7; and Chang-soo Han8

1Ph.D. Candidate, Dept. of Mechatronics Engineering, Hanyang Univ. ERICA, 55 Hanyangdaehak-ro, Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
2Ph.D. Candidate, Dept. of Mechatronics Engineering, Hanyang Univ. ERICA, 55 Hanyangdaehak-ro, Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
3Graduate School, Hanyang Univ., 55 Hanyangdaehak-ro, Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
4Ph.D. Candidate, Graduate School, Hanyang Univ. ERICA, 55 Hanyangdaehak-ro, Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
5Associate Professor, George W. Woodruff School of Mechanical Engineering, Georgia Tech, Atlanta, GA 30332-0408, U.S.A. E-mail: [email protected]
6Associate Professor, School of Civil and Environmental Engineering, Georgia Tech, Atlanta, GA 30332-0355, U.S.A. E-mail: [email protected]
7Associate Professor, Dept. of Architectural Engineering, Hanyang Univ., 55 Hanyangdaehak-ro, Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
8Professor, Dept. of Robot Engineering, Hanyang Univ., 55 Hanyangdaehak-ro, Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]

ABSTRACT
A robotic teleoperation system for excavators has been developed to improve safety and
remote-control methods in hazardous situations such as construction or disaster sites. This paper
introduces an innovative remote-control method and a newly developed robotic hardware system,
Phantom, to rapidly retrofit existing construction equipment for efficient robotic control from a
remote place. The system allows one joystick to control one excavator. A remote operator with
two joysticks at a remote workstation can intuitively control two excavators equipped with
Phantom simultaneously. The proposed method was compared with typical excavator operation
methods to evaluate its performance based on operation time and mechanical movement
accuracy on given tasks. The tests show promising results, and the proposed robotic
equipment operation approach can significantly improve the safety of operators who need to
work in dangerous site conditions.

INTRODUCTION
Hydraulic excavators have been commonly used in construction fields to handle heavy and
bulky objects. In many cases, however, operators are exposed to dangerous situations, such as at
disaster sites. Robotic operation or teleoperation of the equipment can be a solution, but building
a new robotic system or modifying existing equipment's mechanical and control systems requires
very high cost and a time-consuming process. Also, tele-operated robots require physical contact
between the environment and the machine, using force feedback and creating a coupled system
including the environment, robot, and operator. Operational performance of the coupled system


often degrades due to inefficient adjustment of controller gains, which exacerbates the inherent
instabilities of the system's coupled dynamics arising from the unknown response of the
environment.
In this study, we propose a more efficient approach by attaching a remote controller and a
posture measurement sensor to the excavator body, which only commands the end-effector’s
position (e.g., bucket or gripper) without specifying the joint displacements of the other parts of
the excavator (e.g., boom, arm, cabin swing). This design allows the operator to quickly
transform a generic hydraulic excavator into an unmanned robot. To lift large-scale objects at a
remote site, more than one gripper is usually required. To resolve this issue, this study designs
control algorithms so that a single operator can control two excavators in a coordinated fashion.
In addition, the proposed system enables control of gripper rotation. These unique features of
the machine and interface design differentiate the proposed system from other existing
methods.
This paper introduces a new remote-control device, manipulated by only one hand, that
controls the end-point of the excavator in operational space. The overall remote-control
algorithms are also explained. This paper then presents the results of an assessment of the
intuitiveness and efficiency of the proposed system. The proposed method was compared with
typical excavator operation methods to evaluate its performance based on operation time and
mechanical movement accuracy on given tasks.

Figure 1. Remote excavator control system with Phantom devices

PREVIOUS WORK
Attachable manipulator for remotely manipulating each excavator lever
To be controlled remotely, the excavator must have an additional system actuated by
wireless signals. Previously, a specific manipulator was developed to control the excavator
remotely without mechanical modifications. This attachable manipulator can be attached to each
lever to receive the wireless signal and then behaves like a human hand. Thus, the lever can be
located anywhere in the workspace, and all cylinders can be operated simultaneously.
Consequently, the remote excavation system carries out the operation as if there were a
physical operator on board. It has already been developed and is described in Shin and Kang
(2012). The brief configuration of the robotic excavator is shown in Figure 1.
However, this additional system simply handles the actuators directly. Ji (2017) developed a


control algorithm for motion control of a hydraulic excavator using the attachable manipulator.
This algorithm controls the end-point of the excavator in operational space. Position control
of each link's cylinder was carried out first; point-to-point control was then applied to
end-point control of the excavator in operational space. Figure 2 shows positive results for
controlled grading work. Each point to be followed was computed and set with a personal
computer. The details can be found in Ji (2017).
The new remote-control device proposed in this paper, Phantom, serves as the command
device. It can replace the operator part illustrated in Figure 1, commanding where the end-point
of the excavator is to be moved in operational space. This system lets a worker intuitively
focus on operational space without worrying about joint space.
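The point-to-point, via-point style of end-point control described above can be sketched as straight-line interpolation between via points in the working plane (the step size and via points are hypothetical, not values from Ji 2017):

```python
def via_point_path(via_points, step=0.05):
    """Generate end-point targets by linearly interpolating between
    successive via points in the vertical working plane.

    via_points: list of (x, y) positions the bucket tip should pass
                through, as computed and sent from a PC.
    step:       fraction of each segment advanced per command tick.
    """
    path = []
    for (x0, y0), (x1, y1) in zip(via_points, via_points[1:]):
        n = max(1, int(round(1.0 / step)))
        for i in range(n):
            t = i / n
            path.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    path.append(via_points[-1])           # end exactly on the last point
    return path
```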

Figure 2. Results when the motion control developed by Ji (2017) is applied; a PC sends the
commands (leveling direction and via points)
APPROACH

General characteristics of a typical hydraulic excavator


From a kinematics viewpoint, the excavator has three links that work in a vertical two-
dimensional (planar) space. The links correspond to the boom, arm, and bucket of the
excavator. For this reason, the robotic excavator can be regarded as a hydraulic articulated serial
robot with three degrees of freedom, controlled by a worker manipulating the two levers in the
cabin. The three joints of the boom, arm, and bucket, each driven by a prismatic hydraulic
cylinder, are moved by the two levers: one is assigned to the boom and bucket, the other to the
arm. Swing and rotation of the cabin are excluded.
A typical excavator performs various tasks in 3-dimensional Cartesian (operational) space.
Pick-and-place tasks in particular demand considerable accuracy. However, it is difficult and
complex to control the links of an excavator at the same time, since each link and lever
corresponds to joint space and is actuated by a hydraulic power system, which generally offers
lower accuracy. This explains why skilled workers are needed to handle construction equipment,
which is normally hydraulically powered.
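Since the boom, arm, and bucket form a planar three-link serial chain, the end-point position follows from standard forward kinematics. A minimal sketch of this relation follows; the link lengths are hypothetical values chosen for illustration, not taken from the paper:

```python
import math

def forward_kinematics(thetas, lengths):
    """End-point (x, y) of a planar serial arm.

    thetas  -- joint angles in radians, each measured relative to the
               previous link (accumulated into an absolute angle below).
    lengths -- link lengths (boom, arm, bucket), in meters.
    """
    x = y = 0.0
    angle = 0.0
    for theta, length in zip(thetas, lengths):
        angle += theta                # accumulate relative joint angles
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y

# Hypothetical link lengths; fully extended along x: x = 5.7 + 2.9 + 1.5
print(forward_kinematics((0.0, 0.0, 0.0), (5.7, 2.9, 1.5)))  # ≈ (10.1, 0.0)
```

Controlling the excavator in joint space means commanding the three angles directly; controlling it in operational space means commanding the (x, y) output of this map, which is what the proposed device does.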

Usefulness of an easy method for controlling the excavator end-point in operational space
When workers operate a typical excavator, they have to manipulate the two levers
simultaneously. In other words, as with a general serial robot, the end-point of the excavator has
to be controlled in operational (Cartesian) space by manipulating the two levers properly. Many
tasks demand a high level of accuracy, which is not easy to achieve with a typical excavator.
Therefore, if the end-point of an excavator could be controlled easily in operational space,
performance could be more effective, enhancing the remote operator's visibility and workability,
especially in an unknown or unfamiliar working environment.

The necessity of intuitive control


Unskilled operators, or operators in very complex working environments, are not good at
controlling the excavator with the traditional control method. Remote control is no exception,
since it adds constraints such as limited visibility and the loss of vibratory and auditory
feedback. For these reasons, many researchers have investigated remote-control systems
offering high intuition, high efficiency, and low fatigue.

Figure 3. The concept of the framework


Remote control method
Many remote-control devices with a variety of shapes and functions have been developed.
Moon et al. (2009) developed a tele-operation control station for an intelligent excavator; it
follows the traditional control method, so it could not improve intuition or efficiency in control.
Mavridis et al. (2010) and Kim et al. (2009) implemented tele-operation systems in which an
industrial robotic arm moves according to the movement of a human arm. Tele-operation using
a human arm has the strength of high intuition and accessibility. Nevertheless, this type has a
significant weakness in handling objects, because the mapping between the robotic arm and the
human arm depends on position elements. In other words, if a user wants to keep a robotic arm
in some pose, the user also has to hold that arm pose, which can result in high user fatigue.
Oh et al. (2014) measured the fatigue of the human arm using EMG-based analysis while such a
device was being used, showing different fatigue values for a haptic device and a typical lever.
Considering these results, the development of the new remote-control device focused on
improving three things: intuition, efficiency, and low fatigue.
The motions of the excavator without swing and driving are confined to a 2-dimensional
vertical plane. A wrist holding a joystick can move upwards, downwards, forwards, and
backwards, and can also twist, giving it 3 degrees of freedom. This means that the end-point of
the excavator can be placed anywhere within the 2-dimensional workspace by a control device
manipulated using only wrist motion, because the wrist has more degrees of freedom than the
excavator motions without swing and driving. Additionally, this minimal action can help
overcome fatigue. The concept of the proposed remote-control device and framework is shown
in Figure 3.

Remote-control algorithm
Based on both the concept of the proposed device and the robotic excavator system, the control
input was set as follows:
- The value of the input is the velocity of the excavator end-point.
- The rate control method was applied to part of the remote-control device.
- The velocity vector of the input is made up of two basis vectors: upward/downward and
forward/backward.
- The start point of the excavator end-point is always updated at a given period.
- The command signals are input to the robotic excavator developed previously.

Figure 4. Flow chart of the remote-control algorithm: the reference point of the joystick is
the center. The displacement of the joystick is the difference between the moved point and
the center point. The longer the arrow, the faster the speed command transmitted to the
robotic excavator.
The flow chart in Figure 4 illustrates the initial state of the joystick, which stays horizontal in
its home position. If one pushes the joystick upwards, a velocity vector is generated that moves
the end-point of the excavator upwards; downward, forward, and backward motions work
intuitively in the same fashion. The excavator cannot move while the joystick is at the initial
point. Pushing or tilting the joystick further produces a larger velocity vector; in other words, a
faster speed command is transmitted to the robotic excavator so that the end-point moves faster.
By manipulating the joystick, i.e., by combining the two basis vectors, a velocity vector of any
magnitude and direction can be formed. A mathematical model, including the Jacobian matrix
and the forward kinematics of the excavator's serial links, was derived as shown in the flow
chart in Figure 4.
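The chain from joystick displacement to end-point velocity, and from end-point velocity to joint rates through the Jacobian, can be sketched as follows. This is a simplified two-link illustration rather than the paper's three-link model, and the gain and deadzone values are hypothetical:

```python
import math

def jacobian(t1, t2, l1, l2):
    """2x2 Jacobian of a two-link planar arm (relative joint angles)."""
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [ l1 * c1 + l2 * c12,  l2 * c12]]

def joint_rates(t1, t2, l1, l2, vx, vy):
    """Solve J * [dt1, dt2] = [vx, vy] for the joint rates."""
    (a, b), (c, d) = jacobian(t1, t2, l1, l2)
    det = a * d - b * c          # = l1 * l2 * sin(t2); singular at t2 = 0
    return ((d * vx - b * vy) / det, (-c * vx + a * vy) / det)

def joystick_to_velocity(dx, dy, gain=0.5, deadzone=0.05):
    """Rate control: joystick displacement maps to an end-point velocity
    command; inside the deadzone (home position) no motion is commanded."""
    if math.hypot(dx, dy) < deadzone:
        return (0.0, 0.0)
    return (gain * dx, gain * dy)

# e.g., joystick pushed forward: velocity command and resulting joint rates
vx, vy = joystick_to_velocity(0.2, 0.0)
print(joint_rates(0.3, 0.8, 5.7, 2.9, vx, vy))
```

The larger the joystick displacement, the larger the commanded velocity, matching the behavior described for Figure 4; the Jacobian inversion is what turns the operational-space command into cylinder motions.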


EVALUATION
To evaluate the effectiveness of the proposed remote-control system, experiments were
conducted by applying it to an unmanned robotic excavator performing a series of tasks.

Experimental setup
The implemented remote-control system is illustrated in Figure 5. The detailed system
descriptions are as follows:
- The joystick is attached to a bracket fixed to an armrest.
- The boom, arm, and bucket can be moved upwards, downwards, forwards, and backwards
with respect to the origin position, as shown in Figure 3.
- An ARM processor (MCU) was used for communication with the robotic excavator
equipped with Phantom.

Figure 5. Implementation of the proposed control method on an unmanned robotic
excavator

Table 1. Assessment and result table for intuitive control (the image shows B)

Assessments and results


Subjects manipulated the new remote-control device proposed in this study. To check the
intuition and efficiency of the proposed method, operation time was measured while subjects
manipulated the excavator for a simple task (Figure 6), both with the proposed method at a
remote workstation (A) and with the traditional control from the cabin (B). In this comparison,
the more time spent, the less intuitive and efficient the control system. The test results are
shown in Table 1. To derive more reliable assessments, the subjects were trained for an hour in
advance to get used to the control system. To analyze the intuition, a questionnaire survey was
also carried out (Table 2). The assessment results shown in Table 1 indicate the superior
performance of the proposed approach.

Table 2. Results of the questionnaire survey

CONCLUSION
A robotic remote-control system that uses only one wrist to control the excavator was devised
and implemented. To validate the intuition and efficiency of the proposed system, operation
time was measured and a questionnaire survey was carried out. The results show that the
proposed approach is a promising solution that can overcome the operational difficulties of
remotely controlling heavy equipment in hazardous situations, such as earthquake sites or toxic
working environments. From the survey, however, some subjects preferred the traditional
control from the excavator cabin because they could feel the entire excavator motions and
reaction forces. As a future study, therefore, we will focus on delivering realistic vibratory,
auditory, and force feedback to the remote operator.

ACKNOWLEDGMENTS
This research was supported by the Industrial Strategic Technology Development Program
(No. 10052965) funded by the Ministry of Trade, Industry and Energy (MOTIE), Korea, and
the Technology Innovation Program (No. 2017-10069072) funded by the Ministry of Trade,
Industry and Energy (MOTIE, Korea).

REFERENCES
Ji, C. (2017). Study on front motion control of hydraulic excavator equipped with installation-
type robot manipulator. Graduate School of Hanyang University.
Shin, D., Kang, M. S., Lee, S., & Han, C. (2012, December). Development of remote controlled
manipulation device for a conventional excavator without renovation. In System Integration
(SII), 2012 IEEE/SICE International Symposium on (pp. 546-551). IEEE.
Moon, S. M., Kim, B. S., Hwang, J. H., Kim, Y. O., Hong, D. H., & Ryu, B. G. (2009,
November). Development of tele-operation control station for Intelligent Excavator. In
Technologies for Practical Robot Applications, 2009. TePRA 2009. IEEE International
Conference on (pp. 123-128). IEEE.
Mavridis, N., Machado, E., Giakoumidis, N., Batalas, N., Shebli, I., Ameri, E., ... & Neyadi, A.
(2010). Real-time teleoperation of an industrial robotic arm through human arm movement
imitation. In Proceedings of the International Symposium on Robotics and Intelligent Sensors
(IRIS).
Kim, D., Kim, J., Lee, K., Park, C., Song, J., & Kang, D. (2009). Excavator tele-operation
system using a human arm. Automation in Construction, 18(2), 173–182.


Oh, K. W., Kim, D., & Hong, D. (2014). Performance evaluation of excavator control device
with EMG-based fatigue analysis. International Journal of Precision Engineering and
Manufacturing, 15(2), 193–199.


Real-Time Hazard Proximity Detection—Localization of Workers Using Visual Data


Idris Jeelani, S.M.ASCE1; Hariharan Ramshankar2; Kevin Han, M.ASCE3;
Alex Albert, M.ASCE4; and Khashayar Asadi, S.M.ASCE5
1Dept. of Civil, Construction, and Environmental Engineering, North Carolina State Univ.,
Stinson Dr., Raleigh, NC 27607. E-mail: [email protected]
2Dept. of Electrical and Computer Engineering, North Carolina State Univ., Raleigh
3Dept. of Civil, Construction, and Environmental Engineering, North Carolina State Univ.,
Stinson Dr., Raleigh, NC 27607
4Dept. of Civil, Construction, and Environmental Engineering, North Carolina State Univ.,
Stinson Dr., Raleigh, NC 27607
5Dept. of Civil, Construction, and Environmental Engineering, North Carolina State Univ.,
Stinson Dr., Raleigh, NC 27607

ABSTRACT
Research indicates that workers often fail to recognize a significant proportion of safety
hazards. To reduce injury likelihood, efforts have traditionally focused on developing and
delivering training interventions. Despite such efforts, desirable levels of hazard recognition are
rarely achieved. Therefore, augmenting human abilities with a technology-driven solution to
improve overall hazard recognition can yield substantial benefits. Accordingly, the objective of
this study is to develop a method for localizing workers with respect to pre-identified hazards in
real-time. To achieve this objective, a 3D point cloud of a construction site as a global map is
created and hazard locations are marked on this map. Workers are provided with a head-mounted
camera that continuously records their first-person view (FPV) videos. The image frames from
these videos are localized onto the global map using bag-of-words (BoW) localization. Apart
from estimating proximity to safety hazards, the system can also collect large-scale data
capturing unsafe behaviors (e.g., entry to restricted areas) and near-miss incidents for training
purposes.

INTRODUCTION
Despite efforts to improve safety performance, the construction industry is still considered
one of the most dangerous industries to work in (Pinto et al. 2011). Several studies have
attempted to identify the antecedents of these disproportionate injury rates (Abdelhamid and
Everett 2000; Mitropoulos 2003; Rajendran et al. 2009). A common finding of such studies is
that workers often fail to recognize a significant portion of the hazards in their environment. In
fact, studies from the US (Albert et al. 2014b; Jeelani et al. 2017b), UK (Carter and Smith
2006), Australia (Bahn 2013), and Israel (Sacks et al. 2009) report that about 50% of hazards
remain unrecognized in construction environments.
To improve the hazard recognition ability of workers, several training programs have been put
forward (Albert et al. 2014c; Gangolells et al. 2010; Jeelani et al. 2017c, 2018). While these
training programs have been effective in improving the hazard recognition performance of
workers, they have not completely resolved the issue of poor hazard recognition. Training alone
cannot guarantee the detection of all hazards by human workers, because hazard recognition is
a visual search process affected by various human factors, such as attention, bias, risk
tolerance, and current physiological and psychological states (Jeelani et al. 2019; Wolfe and
Horowitz 2017). Hence, there is a need to provide technological assistance to workers in their
efforts to detect hazards.
The objective of this study is to develop a framework that uses visual data to obtain real-time,
continuous worker positions and detect worker proximity to any pre-defined hazardous area.
This study builds on the current state-of-the-art localization technique ORB-SLAM
(Mur-Artal and Tardos 2017). The proposed localization method provides continuous
localization of visual data that are periodically captured (explained in detail in the system
overview section). The proposed system creates a one-time 3D point cloud of a construction site
that serves as a global map, on which the hazard locations are marked. The workers are provided
with a head-mounted camera that continuously records their first-person view (FPV) videos.
This video stream is used to localize workers (estimate their positions) on the previously built
global map. Finally, their positions are continuously monitored, and if the distance between a
worker and any of the pre-identified hazards is less than a set threshold, the system warns the
worker and/or the safety manager. Since this system uses simple monocular cameras to detect
the proximity of workers to hazardous areas, it can provide a fast, efficient, and economical
safety monitoring system compared with existing studies that use other sensors such as laser
scanners (Teizer et al. 2010), RFID (Park et al. 2016), or magnetic markers (Li et al. 2012).
Further, the system can be extended with an object detection pipeline that detects dynamic
hazards in the work environment.

BACKGROUND
Hazard Recognition: To avoid an accident, the first step is to recognize the hazard that
could potentially lead to it. As such, hazard recognition is often defined as the first step in a
safety management process (Albert et al. 2014a; CoVan 1995; Hinze 1997). To help workers
and managers recognize hazards, the construction industry uses several tools and techniques,
including pre-construction task planning, Job Hazard Analysis (JHA) (Rozenfeld et al. 2010),
safety hazard templates (Fang et al. 2004), and worker-to-worker observation programs
(Choudhry et al. 2007). However, these tools have not been sufficient to ensure that all, or at
least a significant portion, of hazards are recognized. This is evident as studies have shown that,
despite these tools, a large proportion of hazards still remains unrecognized in construction
environments (Carter and Smith 2006; Bahn 2013; Albert et al. 2013). While recent efforts
(Jeelani et al. 2017a, c) have been effective in improving hazard recognition performance,
several hazards (about 20%) were still missed by workers even after the training. Hence, there
is a need to develop technology that can assist workers in their efforts to detect and accurately
recognize hazards in their environment.
Image-based Localization: To detect a hazard near a worker, we must first know where
the worker is. In other words, an autonomous system must be aware of worker positions in
order to determine whether a worker is in proximity to any hazard. In image-based localization,
a camera is attached to the entity to be localized (e.g., a person or a robot), and the continuous
images captured by this camera (referred to as query images) are used to estimate the location
of the camera, and hence the entity (Asadi et al. 2019).
In traditional image-based localization methods, the query image is matched against a large
number of database images whose locations have been pre-determined, to find the images most
similar to the query image. The location of the query image is then determined relative to these
matched images (Zhang and Kosecka 2007). These methods have limited accuracy. To achieve
higher localization accuracy, recent efforts use 3D reconstruction of the scene instead of an
image database (Han et al. 2015; Jeelani et al. 2018; Sattler et al. 2011). These methods provide
the orientation as well as the location of the camera, which results in accurate localization.
However, such methods are computationally expensive and cannot be used in real time.
Simultaneous Localization and Mapping (SLAM) techniques estimate the camera poses
(orientation and location) of image sequences (or video) and map the environment
simultaneously, in real time (Asadi et al. 2018). These techniques use feature vocabularies to
find matching features between consecutive image frames of a video stream and estimate the
position and orientation of the camera relative to the previous frame. The corresponding
features of consecutive images are also used to map the environment as the system runs. While
these methods are very useful for real-time localization, they have two vital limitations: 1) the
localized positions are relative to one another rather than to a global entity; that is, the position
of the camera cannot be determined with respect to any pre-determined location, because every
time SLAM is run it starts with a different scale and coordinate system; and 2) these techniques
create their own map and cannot localize the camera (or entity) within a pre-created global map.
This significantly limits their utility in determining worker positions on a global map on which
pre-determined hazards are marked.

RESEARCH OBJECTIVE & CONTRIBUTIONS


The aim of this research effort is to overcome the limitations of current image-based
localization methods and propose a framework for worker localization that detects worker
proximity to pre-defined hazards in real time. The key steps in the framework are:
1. creating a global 3D map of a construction site (pre-processing step);
2. localizing workers within that global map;
3. detecting worker proximity to pre-marked hazard locations; and
4. alerting the worker and/or safety manager if the worker is within a threshold distance of
a hazardous area.
Computational Contributions: The proposed system presents an improved method for
Simultaneous Localization and Mapping (SLAM) that re-localizes within a previously built
global map, a feature not available in existing SLAM techniques (Mur-Artal and Tardos 2017).
The proposed method also solves the scale issue in existing SLAM methods: it re-localizes with
the same scale as the global map instead of a random scale, which otherwise significantly limits
the usability of existing SLAM. Finally, by using the local feature vocabulary of the global map,
the proposed system does not lose track even during fast camera movements.
Applications: Although this study uses the proposed system for hazard detection, the
framework can be used for several other tasks that require tracking of entities (people,
equipment, etc.), such as autonomous navigation. Moreover, the localization pipeline of the
proposed framework can be used for several tracking applications beyond safety monitoring,
including equipment tracking, productivity analysis, worker movement, and idle time analysis.

SYSTEM OVERVIEW
The input for the system is the live stream of FPV of the worker, which is recorded using a
camera mounted on the hard hat, and live streamed to the server. The server receives this video
stream and obtains the camera location (i.e., the worker location) by localizing the image frames
within the pre-built global map. The system computes the camera pose for every frame of the
FPV with respect to the pre-built global map of the construction site. These camera poses are

© ASCE
Computing in Civil Engineering 2019 284

used to detect worker proximity to any of the pre-identified locations of hazardous conditions. If
the proximity of worker to any pre-identified location is less than the set threshold, the system
generates an alert for the worker and/ or safety manager.

Figure 1. System overview

SYSTEM DESCRIPTION
As shown in Figure 1, the system has two modes: the pre-processing mode (run once) and
the real-time localization mode, which is run every time a worker enters the site. The system
builds on the existing ORB-SLAM (Mur-Artal and Tardos 2017), which is improved in this
study to save, and re-localize within, a pre-built global map using a custom visual vocabulary.
The following sub-sections provide more details.

Pre-processing Mode
The objective of this mode is to create a global map of a construction site within which future
localization of workers will happen. In this mode, we record a walkthrough video of a
construction site and extract ORB features (Rublee et al. 2011) from each image frame of the
continuous video stream. These features are used to create a sparse point cloud of the
environment using the original ORB-SLAM pipeline run in mapping mode. However, the
pipeline is modified to generate a custom bag-of-words visual vocabulary, using the enhanced
hierarchical bag-of-words library DBoW2 (Gálvez-López and Tardos 2012), which consists of
visual words built from the local ORB features. The custom vocabulary is saved as a binary file
and improves the tracking performance of SLAM in future runs. More importantly, the
generated map is also saved to disk at the end of the run. The global map file consists of the
scale-related variables, the key frames, the key frame database, and the map and map points,
which are serialized to disk as a binary file using the serialization feature of the Boost C++
libraries. Once the map is generated, the areas of interest (i.e., pre-identified hazardous areas)
are marked on the global map. The x, y, and z coordinates of the centers of these areas, together
with textual hazard information, are saved in a JSON file. Thus, the outputs of the pre-processing
step are: 1) a global map of the construction site as a binary file, 2) a custom visual vocabulary
saved as a binary file, and 3) pre-identified hazard locations and information as a JSON file.
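The JSON output of the pre-processing step might look like the following sketch; the field names and coordinates are illustrative assumptions, not the paper's actual schema:

```python
import json

# Hypothetical hazard annotations: center coordinates in the global-map
# frame plus textual hazard information, as described in the text.
hazards = [
    {"x": 4.0, "y": 1.0, "z": 0.0, "info": "fall potential: slab cutout"},
    {"x": 10.0, "y": 6.0, "z": 0.0, "info": "excavation area"},
]

# Pre-processing saves the annotations alongside the binary map files.
with open("hazards.json", "w") as f:
    json.dump(hazards, f, indent=2)

# The real-time mode later reloads the same file:
with open("hazards.json") as f:
    loaded = json.load(f)
print(loaded[0]["info"])  # → fall potential: slab cutout
```

Keeping the hazard annotations in a separate human-readable file means they can be edited when site conditions change without regenerating the global map.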


Real-time Localization Mode


This mode is used in the actual implementation of the system to localize workers in real time.
The FPV of the workers is recorded using a camera mounted on their hard hats. The camera is
connected to a smartphone with Wi-Fi or LTE/3G connectivity, which broadcasts the live
stream of the FPV using Open Broadcaster Software (OBS). The live stream is received by the
server, which can be located on- or off-site. The system loads the pre-built global map and the
custom vocabulary from their respective binary files and localizes the image frames from the
live stream within this global map. The module extracts the ORB features from each image
frame and represents the frame as a BoW vector using visual words from the custom
vocabulary. This makes the initialization (i.e., re-localization of the first frame) almost
instantaneous. In the original SLAM, since mapping occurs simultaneously with localization,
fast movements can cause it to lose track if there are not enough matching features between
successive image frames. In the proposed method, however, since localization occurs within the
pre-built map, local features are always available, which enables the system to re-localize itself
at any time and thus not lose track.
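The BoW representation amounts to quantizing each frame's descriptors against the vocabulary and counting word occurrences. A toy sketch with 2-D descriptors and Euclidean distance follows (real ORB descriptors are 256-bit binary vectors compared with Hamming distance, and DBoW2 uses a hierarchical vocabulary tree rather than this flat search):

```python
import math

def quantize(descriptor, vocabulary):
    """Index of the nearest visual word (centroid) for one descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))

def bow_histogram(descriptors, vocabulary):
    """Represent a frame as counts of visual words (its BoW vector)."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[quantize(d, vocabulary)] += 1
    return hist

# Toy 2-D "descriptors" for one frame and a 3-word vocabulary.
vocab = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
frame = [(0.1, 0.1), (0.9, 0.1), (0.2, 0.9), (0.1, 0.8)]
print(bow_histogram(frame, vocab))  # → [1, 1, 2]
```

Comparing two frames then reduces to comparing two short histograms instead of matching raw features, which is what makes re-localization of the first frame near-instantaneous.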
Localization occurs for every frame of the video stream, and whenever the view of the worker
changes (as the worker moves), the localized camera positions are sent to the proximity
detection module. In this way, worker positions are continuously sent to the proximity detection
module, which computes the Euclidean distance between the worker location and the
pre-identified hazard locations. If the distance is below the threshold, the system generates an alert.
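The proximity check itself is a straightforward Euclidean-distance comparison; a minimal sketch follows, with hypothetical hazard coordinates and an illustrative threshold (the paper does not specify a threshold value):

```python
import math

def check_proximity(worker_xyz, hazards, threshold=3.0):
    """Return info strings for every pre-identified hazard closer than
    threshold. hazards is a list of (x, y, z, info) tuples; the 3.0 m
    default is illustrative, not a value from the paper."""
    alerts = []
    for hx, hy, hz, info in hazards:
        if math.dist(worker_xyz, (hx, hy, hz)) < threshold:
            alerts.append(info)
    return alerts

hazards = [(4.0, 1.0, 0.0, "slab cutout"), (20.0, 6.0, 0.0, "excavation")]
print(check_proximity((5.0, 2.0, 0.0), hazards))  # → ['slab cutout']
```

In the full system the worker position would come from the localization module and a non-empty result would trigger the alert to the worker and/or safety manager.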

VALIDATION
To test the effectiveness of the system, it was validated on two fronts: 1) accuracy of
initialization within a previously built global map, and 2) accuracy in real-time localization. For
a SLAM system, the number of features present in the scene determines how hard it is to
initialize and localize: the fewer the features, the more difficult it is to initialize the system and
maintain continuous localization without losing track. The aim of the validation was to test the
proposed localization method under the toughest conditions (i.e., a scene with minimal
features). Therefore, a hallway with plain walls was chosen as the test site. The two attributes of
the system were tested as follows:
Initialization within Global Map
An off-the-shelf digital camera was strapped to the hard hat and used to record a continuous
walkthrough video of the test site. This video was used to run the proposed system in
pre-processing mode to create a sparse point cloud of the test site, which served as the global
map. Markers were placed on the site marking five test points, as shown in Figure 2a, with test
points A and E marking the start and end of the walkthrough, respectively. These marker
locations were also marked on the global map (Figure 2b) and their coordinates saved in a
JSON file, serving as the ground truth for validation.
In subsequent runs, the walkthrough was started from test points B, C, D, and E respectively,
and the objective was to load the global map and check whether SLAM could initialize
correctly within it. The localized position (x1, y1, z1) was obtained from the first camera pose
given by the system in each run, after converting it to the real-world scale. This position was
compared with the ground truth, i.e., the known marker location (Mx1, My1, Mz1), and the
error was calculated as the distance between the two points (after subtracting the camera height
h) using Eq. (1):

error = √[(x1 − Mx1)² + (y1 − My1)² + (z1 − Mz1 − h)²]        Eq. (1)

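Eq. (1) can be implemented directly; the sample coordinates below are illustrative, not the paper's measured data (h is the camera height above the floor marker):

```python
import math

def init_error(p, m, h):
    """Eq. (1): error between the localized position p = (x1, y1, z1) and
    the marker m = (Mx1, My1, Mz1), subtracting the camera height h."""
    return math.sqrt((p[0] - m[0]) ** 2
                     + (p[1] - m[1]) ** 2
                     + (p[2] - m[2] - h) ** 2)

# Illustrative check: a point 5 mm from its marker once the hypothetical
# 1.8 m camera height is removed (coordinates in meters).
print(init_error((1.003, 2.004, 1.800), (1.0, 2.0, 0.0), 1.8))  # ≈ 0.005
```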
Figure 2. Markers on the (a) test site and (b) point cloud
The errors obtained are shown in Table 1, and Figure 3a shows one example (Test #3).

Table 1: Errors in Initialization

Test #   Starting Point   End Point   Error in initialization
0        A                E           n/a (global map)
1        B                C           4.0 mm
2        C                D           2.0 mm
3        D                E           5.0 mm
4        E                A           4.5 mm

Localization Accuracy
In existing SLAM systems, the error in localization usually increases with time. This is called
SLAM drift. Therefore, to measure the effectiveness of the proposed system, the accuracy of
localization needed to be measured over time. Hence, a continuous trajectory was followed while
recording the walkthrough video, such that the camera encountered the five markers multiple
times and in different order. The aim was to monitor camera locations given by system and
compare those with the physical locations of the camera at each test point. Every time the camera
paused on a marker, the location obtained from the proposed system was compared with the
known location of that marker. The error was computed using Eq. 1. Figure 3b shows the test
trajectory and the corresponding errors

Figure 3: (a) Example of accurate initialization (Test #3) on the pre-built global map;
(b) localization errors


CONCLUSION
Recognizing hazards is one of the key steps in effective safety management; however, a
significant portion of hazards remains unrecognized in construction environments. Therefore,
the goal of this study was to assist workers by warning them of hazardous conditions in their
proximity. A vision-based system was proposed that uses the FPV of workers to estimate their
positions on the construction site. The system uses a bag-of-words (BoW) approach that uses
visual features in the image frames of the FPV to localize each frame (and thus the worker) on a
previously saved global map. Using the localized worker positions, the system computes how
far they are from the pre-defined hazardous areas and generates an alert if the distance is less
than a set threshold. The system was validated, and the maximum errors in initialization and
continuous localization were just 5 mm and 6.5 mm, respectively. The proposed system offers
several computational advantages over existing vision-based real-time localization methods:
1) re-localization within a previously built global map, 2) localizing and mapping with
real-world scale, and 3) continuous localization without track loss even during fast movements.
The proposed system can be used to monitor workers and warn them and/or safety managers
when they are in proximity to a hazard. Further, the localization method developed in this study
has numerous applications in the tracking of entities (people, equipment, etc.). One limitation of
this method is that it only works with static hazards (e.g., fall potential from edges or slab
cutouts, excavation areas, or any such hazards that do not change position over time) and cannot
detect proximity to dynamic hazards such as moving trucks, excavators, etc.

REFERENCES
Abdelhamid, T. S., and Everett, J. G. (2000). “Identifying Root Causes of Construction
Accidents.” Journal of Construction Engineering and Management, American Society of
Civil Engineers, 126(1), 52–60.
Albert, A., Hallowell, M. R., Kleiner, B., Chen, A., and Golparvar-Fard, M. (2014a).
“Enhancing Construction Hazard Recognition with High-Fidelity Augmented Virtuality.”
Journal of Construction Engineering and Management, 140(7), 1–11.
Albert, A., Hallowell, M. R., and Kleiner, B. M. (2014b). “Experimental field testing of a real-
time construction hazard identification and transmission technique.” Construction
Management and Economics, 32(10), 1000–1016.
Albert, A., Hallowell, M. R., and Kleiner, B. M. (2014c). “Enhancing Construction Hazard
Recognition and Communication with Energy-Based Cognitive Mnemonics and Safety
Meeting Maturity Model: Multiple Baseline Study.” Journal of Construction Engineering
and Management, 140(2), 04013042.
Asadi, K., Ramshankar, H., Noghabaee, M., and Han, K. (2019). “Real-time Image Localization
and Registration with BIM Using Perspective Alignment for Indoor Monitoring of
Construction.” Journal of Computing in Civil Engineering.
Asadi, K., Ramshankar, H., Pullagurla, H., Bhandare, A., Shanbhag, S., Mehta, P., Kundu, S.,
Han, K., Lobaton, E., and Wu, T. (2018). “Vision-based Integrated Mobile Robotic System
for Real-time Applications in Construction.” Automation in Construction.
Bahn, S. (2013). “Workplace hazard identification and management: The case of an underground
mining operation.” Safety Science, 57.

© ASCE
Computing in Civil Engineering 2019 288


A Computational Framework for Characterizing Multiple Object Tracking Methods in
Construction Field Applications

Jiawei Chen1 and Pingbo Tang2
1Del E. Webb School of Construction, Arizona State Univ., 660 South College Ave., Tempe, AZ
85281. E-mail: [email protected]
2Del E. Webb School of Construction, Arizona State Univ., 660 South College Ave., Tempe, AZ
85281. E-mail: [email protected]

ABSTRACT
Multiple object tracking using videos has gained interest in the construction industry to
support real-time monitoring and control of construction sites for productivity and safety. State-
of-the-art multiple object tracking algorithms can encounter various challenges in the field,
such as inaccurate detection and identity switches caused by occlusions. These field challenges
introduce uncertainties into the information derived from tracking results (e.g., the average
waiting time of workers in a workspace). Studying the failures of multiple object tracking
algorithms in various scenarios is therefore essential for assessing the decision risks related to
the use of these algorithms for deriving information used by field engineers. The authors
developed a multiple object tracking algorithm that outperforms the state-of-the-art methods
found in the literature and characterized its performance using video data sets collected during
a nuclear power plant (NPP) outage for monitoring the indoor check-in processes of multiple
workers before entering a workspace. The results indicate that video data collected on job sites
involve complex interactions among humans, equipment, and environmental objects, posing
various challenges to the multiple object tracking algorithm. Specifically, the authors used
videos collected from an indoor space of an NPP to quantify how reliably the developed
algorithm could predict the average waiting times of workers in queues at different waiting
areas of the check-in space. The authors categorized the scenarios where multiple object
tracking fails and found that the major failures came from identity switches and false positive
detections of workers in mirrors or shiny surfaces. For the construction research community,
this research forms a framework to assess the reliability of multiple object tracking algorithms
in deriving information used by field engineers. For the computer science community, this
research identifies the scenarios where state-of-the-art visual tracking algorithms fail, to
motivate the development of new algorithms.

INTRODUCTION
In recent years, with the emergence of affordable video cameras and advances in computer
vision techniques, an increasing number of construction companies have begun to set up
cameras on construction sites for field surveillance. Because most construction work is
collaborative, there are many interactions and much communication between workers, and
between workers and machines (Yang et al. 2010). Tracking multiple workers is thus important
for supplying information to analyze how different objects interact with each other. Multiple
object tracking is a computer vision technology that locates multiple objects, maintains their
identities, and generates trajectories of different objects given an input video (Luo et al. 2014).
As mentioned in (Luo et al. 2018), inaccurate detection and frequent identity switching are still
the major problems of multiple object tracking to be addressed. Previous studies did not systematically


consider identity switches in multiple object tracking, which can produce erroneous
information from objects that are missed, mislabeled, or tracked discontinuously.
The current approach adopted by the construction industry relies on inspectors for site
observation and inspection to recognize unsafe behaviors and monitor construction progress.
Manual monitoring is time-consuming and error-prone, and it is not suitable for monitoring
large construction sites with thousands of parallel activities (Zhu et al. 2017). Many researchers
have explored the potential of visual tracking to provide automated and continuous monitoring
(Cheng et al. 2013). Various challenges, such as occlusions and identity switches, introduce
uncertainties into the tracking results, and no existing study systematically examines and
quantifies such uncertainties. A systematic classification and synthesis of the failures of
multiple object tracking methods, and of the factors that cause those failures, is thus important
for quantifying decision risks based on information derived by multiple object tracking
algorithms from field videos.
Waiting lines or queues exist in almost all industrial processes as well as on construction
sites (Akhavian and Behzadan 2014). The research team chose the waiting time calculation of a
queuing system as the testing application of multiple object tracking. During nuclear power
plant outages, workers need to travel from the valve to the Radiation Protection Island (RPI,
the space connecting the containment and the outside). Monitoring the waiting time workers
spend in the RPI is important since delays in the RPI could compromise the productivity and
safety of the entire workflow (Zhang et al. 2017). The researchers proposed a multiple object
tracking algorithm that uses video input to automatically calculate the time different workers
spend in the critical areas related to tasks with co-prerequisite or resource-sharing
relationships. Evaluating the performance of multiple object tracking in this case is important
for assessing the reliability of the waiting time calculation.

In this paper, the researchers reviewed previous work on multiple object tracking and
proposed a multiple object tracking algorithm to identify the scenarios where the algorithm
calculated the waiting time with low precision and recall. The researchers summarized the
failure scenarios to characterize the performance of multiple object tracking using the test case.

LITERATURE REVIEW
Multiple object tracking (MOT) has gained much research interest in recent years due to its
academic and commercial potential (Luo et al. 2014). The object information generated by
multiple object tracking can support further behavior analysis and action recognition. Because
construction sites are complex and dynamic, involving many workers and pieces of equipment,
the algorithms proposed by researchers usually encounter various problems such as missing
objects and lost tracks. Previous researchers evaluated multiple object tracking in terms of
precision and recall (Xiao et al. 2018; Yang et al. 2010). Few of them gave a comprehensive
review of where multiple object tracking fails in the field. This research tested a multiple
object tracking algorithm on fourteen video clips (over 5,000 frames) to identify the scenarios
where multiple object tracking fails.

METHODOLOGY
This section describes the methodology used by the researchers to identify the scenarios
where multiple object tracking algorithms fail in the field. As shown in Figure 1, the proposed
methodology consists of three steps. The first step is multiple object tracking, which produces
the trajectories of the objects. Given the input videos, the researchers performed object
detection and tracking to

obtain the trajectories of multiple workers. The second step, activity analysis, makes use of the
trajectory information to extract project-related information (Cheng et al. 2013; Gong and
Caldas 2011). The researchers then evaluated the recall and precision of the waiting time
calculation and identified the videos for which the algorithm calculated the waiting time with
low recall and precision. The last step is to examine the scenarios where the waiting time
calculation has low recall and precision; the researchers analyzed the reasons why the multiple
object tracking algorithm failed.

Figure 1 Steps to identify where multiple object tracking fails
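The activity-analysis step turns trajectories into project information such as the time a worker spends in a station. The following is a minimal sketch of that idea, assuming a trajectory sampled at a fixed frame rate and a rectangular station region in image coordinates; the region, rate, and sample points are illustrative assumptions, not values from the paper.

```python
# Sketch of the activity-analysis step: estimate how long a tracked worker
# stays inside a station region, given a per-frame trajectory.

def time_in_station(trajectory, station, fps=6):
    """trajectory: list of (x, y) track points, one per sampled frame.
    station: (xmin, ymin, xmax, ymax) rectangle in image coordinates.
    Returns the estimated seconds spent inside the rectangle."""
    xmin, ymin, xmax, ymax = station
    inside = sum(1 for x, y in trajectory
                 if xmin <= x <= xmax and ymin <= y <= ymax)
    return inside / fps

traj = [(10, 10), (50, 40), (55, 42), (60, 45), (200, 200)]
print(time_in_station(traj, (40, 30, 100, 80), fps=1))  # 3 frames inside -> 3.0
```

Comparing such per-worker durations against manually labeled ground truth is what the evaluation step below measures.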

Multiple Object Tracking


Object detection serves as the first step of object tracking. Previous researchers have
conducted extensive research on detecting construction project-related objects such as workers,
tools, and resources (Fang et al. 2018; Memarzadeh et al. 2013; Soltani et al. 2017). The
project team needed to detect workers in each frame and then match the detected workers
across consecutive frames. Workers occlude one another, so the video cannot always show
workers' full bodies. The project team used a 2D human pose predictor (Cao et al. 2017) that
takes videos as input and predicts the poses of all workers in the video. When some workers'
bodies occlude other workers' left legs, the algorithm can still detect those workers' heads and
arms. The research team used the trajectory of each worker's right foot to represent the
trajectory of that worker.

Figure 2 Results of body joint detection of workers and Tracking head in camera view (the
algorithm colorized the trajectories of different workers differently; workers’ faces were
blurred for privacy protection)
The project team used the pose estimation algorithm proposed by Cao et al. to complete the detection

of the body joints of workers (Cao et al. 2017). Figure 2 shows the joint detection results on
video data collected in the RPI. The pose estimation algorithm can detect eighteen joints for
each worker. For the eighteen joints shown in Figure 2, index 0 represents the Nose; 1: Neck; 2:
Right Shoulder; 3: Right Elbow; 4: Right Wrist; 5: Left Shoulder; 6: Left Elbow; 7: Left Wrist;
8: Right Hip; 9: Right Knee; 10: Right Ankle; 11: Left Hip; 12: Left Knee; 13: Left Ankle; 14:
Right Eye; 15: Left Eye; 16: Right Ear; 17: Left Ear. The research team chose the right-ankle
detections for tracking multiple workers in the video.
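The joint indexing listed above can be captured directly in code. The sketch below assumes an OpenPose-style output array of shape (workers, 18, 3); the array contents and the helper name are illustrative, while the index constant follows the listing in the text.

```python
import numpy as np

# The eighteen-joint indexing listed above (OpenPose/COCO ordering).
JOINTS = ["Nose", "Neck", "RShoulder", "RElbow", "RWrist",
          "LShoulder", "LElbow", "LWrist", "RHip", "RKnee",
          "RAnkle", "LHip", "LKnee", "LAnkle", "REye", "LEye",
          "REar", "LEar"]
RIGHT_ANKLE = JOINTS.index("RAnkle")  # index 10, as in the text

def right_ankle_points(poses):
    """poses: array of shape (num_workers, 18, 3), each joint stored as
    (x, y, confidence). Returns the (x, y) right-ankle point that serves
    as each worker's track point."""
    return poses[:, RIGHT_ANKLE, :2]

poses = np.zeros((2, 18, 3))
poses[0, RIGHT_ANKLE] = [120.0, 400.0, 0.9]
poses[1, RIGHT_ANKLE] = [300.0, 410.0, 0.8]
print(right_ankle_points(poses))  # rows: [120, 400] and [300, 410]
```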
Given the detections of body joints in the current frame, the main task is to correctly match
each detected worker to the same worker in the previous frame. In this research, the research
team focused on conducting a systematic classification and synthesis of these failures, as well
as the factors underlying them. The authors also synthesized the impacts of those failures on
the uncertainties of the waiting times derived from the results generated by the multiple object
tracking algorithms. The model update strategy is defined using the Kalman filter, which is
easy to implement (Kalman 1960).
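One possible realization of such a Kalman-filter-based model update is a constant-velocity filter over the tracked 2D point. This is a generic sketch, not the paper's implementation; the noise parameters and initial covariance below are illustrative assumptions.

```python
import numpy as np

# Minimal constant-velocity Kalman filter for a 2D track point.
# State: [x, y, vx, vy]; measurement: detected (x, y).

class PointKalman:
    def __init__(self, x, y, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x, y, 0.0, 0.0])          # initial state
        self.P = np.eye(4) * 10.0                    # state covariance
        self.F = np.array([[1, 0, dt, 0],            # constant-velocity model
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],             # observe position only
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * q                       # process noise
        self.R = np.eye(2) * r                       # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        y = np.asarray(z, dtype=float) - self.H @ self.x   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)           # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

kf = PointKalman(100.0, 200.0)
for z in [(102, 201), (104, 202), (106, 203)]:
    kf.predict()
    est = kf.update(z)
print(est)  # filtered position near the last detection
```

In a full tracker, the predicted position would also be used to associate each existing track with the nearest new detection before the update step.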

Evaluation
This section describes the performance of the multi-worker tracking algorithm. From the
video clips for which the algorithm calculated the waiting time with low recall and precision,
the researchers investigated the videos and analyzed the scenarios where the algorithm fails.
This study investigated the performance of the proposed tracking algorithm on over fifty hours
of video collected from an indoor space to quantify how reliably the computer vision algorithm
could calculate the average waiting times of workers in queues at different areas of a
workspace. The researchers reviewed all the videos and selected fourteen video clips that are
30 to 60 seconds long; each clip contains 300 to 400 frames. These fourteen videos include
challenging scenarios with severe occlusions and scale variations. The researchers tested the
developed algorithm on the selected video clips and characterized the impacts of four factors
(occlusions, the number of people in the video, temporal resolution, and spatial resolution) on
the precision and recall of the waiting time estimation produced by the tested visual tracking
algorithms.

Figure 3 Example of performance evaluation


The authors manually labeled the ground truth of the objects of interest in the tested videos.
The tracking results from the proposed method were then compared with the ground truths to
calculate the precision and recall. Tracking precision is typically measured by the Euclidean
distance, in pixels, between the center locations of the objects and their corresponding ground
truths in the videos (Zhu et al. 2017). However, for the practical construction application in this
research, the research team chose to evaluate the tracking performance in terms of the accuracy
and precision of the waiting time estimation. The reason is that the final objective

of this multiple object tracking algorithm is to monitor and predict the waiting time of workers.
Recall = (t2 − t3) / (t2 − t1),   Precision = (t2 − t3) / (t4 − t3)      (Equation 1)

Table 1 Test results characterizing the number of workers, occlusion level, time resolution,
and spatial resolution (numbers highlighted by red and bold font in the original indicate low
recall and precision)

ID | Number of workers | Occlusion level | Time resolution | Spatial resolution | Average Precision | Average Recall
 1 | 4-6 | High   | 6 | 968*608 | 0.98 | 0.77
 2 | 2-3 | No     | 6 | 968*608 | 0.97 | 0.54
 3 | 1-3 | Medium | 6 | 968*608 | 1    | 0.99
 4 | 1   | No     | 6 | 968*608 | 0.32 | 0.1
 5 | 1-3 | No     | 6 | 968*608 | 1    | 0.15
 6 | 1-3 | No     | 6 | 968*608 | 0.70 | 0.58
 7 | 1   | No     | 6 | 968*608 | 1    | 0.61
 8 | 4-6 | High   | 6 | 600*600 | 0.5  | 0.17
 9 | 2-3 | No     | 6 | 600*600 | 0    | 0
10 | 1-3 | Medium | 6 | 600*600 | 1    | 0.05
11 | 1   | No     | 6 | 600*600 | 0.1  | 0.05
12 | 1-3 | No     | 6 | 600*600 | 0.87 | 0.1
13 | 1-3 | No     | 6 | 600*600 | 0.43 | 0.32
14 | 1   | No     | 6 | 600*600 | 1    | 0.94
Average |  |  |  |  | 0.70 | 0.38

As Figure 3 and Equation 1 show, the green area on the time axis represents the true
duration that person A stayed in station 1, while the blue area gives the duration predicted by
the computer vision algorithm. The researchers calculated the recall and precision of the
proposed algorithm. Recall is the percentage of the true duration that the computer vision
algorithm predicted correctly; precision is the percentage of the predicted duration that is
correct.
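Equation 1 can be read as interval-overlap ratios: recall divides the overlap between the true and predicted stays by the true duration, and precision divides it by the predicted duration. The sketch below generalizes this to arbitrary interval pairs; in the configuration of Figure 3 (predicted stay starting at t3 inside the true stay [t1, t2]) the overlap reduces to t2 − t3. The sample values are illustrative.

```python
# Interval-overlap form of the waiting-time recall/precision (Equation 1).

def interval_recall_precision(true_iv, pred_iv):
    t1, t2 = true_iv      # ground-truth stay: enters at t1, leaves at t2
    t3, t4 = pred_iv      # predicted stay from the tracking algorithm
    overlap = max(0.0, min(t2, t4) - max(t1, t3))
    recall = overlap / (t2 - t1) if t2 > t1 else 0.0
    precision = overlap / (t4 - t3) if t4 > t3 else 0.0
    return recall, precision

r, p = interval_recall_precision((0, 10), (4, 12))
print(r, p)  # overlap = 6 -> recall 0.6, precision 0.75
```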

EXPERIMENTS AND RESULTS


The researchers placed a camera in the Radiation Protection Island (RPI) during an outage
at a nuclear power plant and collected 24-hour video data in April 2017. The recording laptop
was placed in the RPI; it did not need to be connected to a network and did not live-stream
video. The researchers selected seven video clips to test the algorithm. In addition, the
researchers subsampled all the selected clips to test whether the performance would be affected
by lowering the video resolution. In total, the researchers had 14 video clips with which to
evaluate the performance of the algorithm. For each video clip, the researchers calculated the
precision and recall for every worker who appeared in that clip and used the average precision
and recall over all workers to represent the precision and recall of that clip.
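The per-clip aggregation just described (averaging per-worker precision and recall) can be sketched as follows; the input values are illustrative.

```python
# Average per-worker precision and recall to obtain one pair per video clip.

def clip_average(per_worker):
    """per_worker: list of (precision, recall) tuples, one per worker."""
    n = len(per_worker)
    avg_p = sum(p for p, _ in per_worker) / n
    avg_r = sum(r for _, r in per_worker) / n
    return avg_p, avg_r

print(clip_average([(1.0, 1.0), (0.5, 0.0)]))  # (0.75, 0.5)
```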
As Table 1 shows, the algorithm achieved good precision on the collected data: the average
precision over the 14 tested videos is 0.70, which means the algorithm calculated the waiting
time of workers in stations at a 70% precision level. The average recall of the


tested videos is 0.38, which means that of the time workers spent in stations, the algorithm
tracked 38%.

Figure 4 ID switch due to inter-worker occlusion

Figure 5 False detection due to reflective objects (the red circle marks the false detection:
the algorithm detected a worker in the video where there is none)
From the test results, the researchers identified four typical scenarios in which the
algorithm was likely to fail, based on the data available. The researchers plan to collect more
data for more comprehensive testing of the developed algorithm in additional scenarios. The
first scenario occurs when unrelated workers pass the station and occlude the tracked workers.
As Figure 4 shows, the algorithm assigned ID 2 to the worker in the station in the left image.
When other workers passed the station, ID 2 transferred to another person in the middle image,
and in the right image ID 2 was lost. This scenario is an example of an ID switch, which makes
the waiting time calculation inaccurate.

Figure 6 Occlusions due to background obstacles (the algorithm missed the worker at the
left)
Another typical failure is false detection, shown in Figure 5. Sometimes the algorithm
detects more people than the number of workers in the video. Due to reflections from the mirror in the RPI


room, the current state-of-the-art algorithm can produce false detections of workers, which
also makes the waiting time inaccurate. A similar problem can occur when there are reflective
objects on construction sites.

The research team found that the algorithm calculated the waiting time with low recall and
precision when background obstacles occluded the worker. Figure 6 shows that the algorithm
missed the worker at the left of the scene because the wall occluded part of his body. Occlusion
by background obstacles happens frequently on real construction sites, for example occlusion
by excavators and walls.

Figure 7 Missed objects due to workers merging and splitting


The researchers found another typical failure: the algorithm missed workers when they
merged and split. As shown in Figure 7, in the left image the two workers circled in red and
yellow first merge together. In the middle image, the algorithm considers them a single new
object. In the right image, when they split, the algorithm assigns new identities to the two
workers, which means the algorithm failed to track each worker continuously.

CONCLUSION
This research presented a framework for identifying the scenarios where multiple object
tracking fails. The proposed approach integrated detection and tracking, and the algorithm was
tested in a nuclear power plant indoor space. From the test results, the researchers generalized
four typical scenarios where multiple object tracking may fail in construction applications.
Based on the findings of this systematic characterization of multiple object tracking algorithms
and the uncertainties of waiting time estimation, future research will analyze the root causes of
the failures to improve multiple object tracking results in construction applications.

ACKNOWLEDGMENT
This material is based upon work supported by the U.S. Department of Energy (DOE),
Nuclear Engineering University Program (NEUP) under Award No. DENE0008403 and NASA
University Leadership Initiative program (Contract No. NNX17AJ86A, Project Officer: Dr. Kai
Goebel, Principal Investigator: Dr. Yongming Liu). The support is acknowledged. Any opinions
and findings presented are those of the authors and do not necessarily reflect the views of DOE
and NASA.


REFERENCES
Akhavian, R., and Behzadan, A. H. (2014). “Evaluation of queuing systems for knowledge-based
simulation of construction processes.” Automation in Construction, Elsevier B.V., 47, 37–49.
Cao, Z., Simon, T., Wei, S.-E., and Sheikh, Y. (2017). “Realtime Multi-Person 2D Pose
Estimation using Part Affinity Fields.” IEEE Conference On Computer Vision And Pattern
Recognition (CVPR).
Cheng, T., Teizer, J., Migliaccio, G. C., and Gatti, U. C. (2013). “Automated task-level activity
analysis through fusion of real time location sensors and worker’s thoracic posture data.”
Automation in Construction, Elsevier B.V., 29, 24–39.
Fang, Q., Li, H., Luo, X., Ding, L., Luo, H., Rose, T. M., and An, W. (2018). “Detecting non-
hardhat-use by a deep learning method from far-field surveillance videos.” Automation in
Construction, Elsevier, 85(May 2017), 1–9.
Gong, J., and Caldas, C. H. (2011). “An object recognition, tracking, and contextual reasoning-
based video interpretation method for rapid productivity analysis of construction operations.”
Automation in Construction, Elsevier B.V., 20(8), 1211–1226.
Kalman, R. E. (1960). “A New Approach to Linear Filtering and Prediction Problems.”
Journal of Basic Engineering, 82(Series D), 35–45.
Luo, W., Xing, J., Milan, A., Zhang, X., Liu, W., Zhao, X., and Kim, T.-K. (2014). Multiple
Object Tracking: A Literature Review.
Luo, X., Li, H., Cao, D., Yu, Y., Yang, X., and Huang, T. (2018). “Towards efficient and
objective work sampling: Recognizing workers’ activities in site surveillance videos with
two-stream convolutional networks.” Automation in Construction, Elsevier, 94(December
2017), 360–370.
Memarzadeh, M., Golparvar-Fard, M., and Niebles, J. C. (2013). “Automated 2D detection of
construction equipment and workers from site video streams using histograms of oriented
gradients and colors.” Automation in Construction, Elsevier B.V., 32, 24–37.
Soltani, M. M., Zhu, Z., and Hammad, A. (2017). “Skeleton estimation of excavator by detecting
its parts.” Automation in Construction, Elsevier, 82(May), 1–15.
Yang, J., Arif, O., Vela, P. A., Teizer, J., and Shi, Z. (2010). “Tracking multiple workers on
construction sites using video cameras.” Advanced Engineering Informatics, Elsevier Ltd,
24(4), 428–434.
Zhang, C., Tang, P., Cooke, N., Buchanan, V., Yilmaz, A., St. Germain, S. W., Boring, R. L.,
Akca-Hobbins, S., and Gupta, A. (2017). “Human-centered automation for resilient nuclear
power plant outage control.” Automation in Construction, Elsevier, 82(May), 179–192.
Zhu, Z., Ren, X., and Chen, Z. (2017). “Integrated detection and tracking of workforce and
equipment from construction jobsite videos.” Automation in Construction, Elsevier,
81(April), 161–171.


Sequential Pattern Learning of Visual Features and Operation Cycles for Vision-Based
Action Recognition of Earthmoving Excavators

Jinwoo Kim1; Seokho Chi, Ph.D., M.ASCE2; and Minji Choi3
1Construction Innovation Laboratory, Dept. of Civil and Environmental Engineering, and
Institute of Construction and Environmental Engineering, Seoul National Univ., 1 Gwanak-Ro,
Gwanak-Gu, Seoul 08826, Korea. E-mail: [email protected]
2Construction Innovation Laboratory, Dept. of Civil and Environmental Engineering, Seoul
National Univ., 1 Gwanak-Ro, Gwanak-Gu, Seoul 08826, Korea. E-mail: [email protected]
3Institute of Construction and Environmental Engineering, Seoul National Univ., 1 Gwanak-Ro,
Gwanak-Gu, Seoul 08826, Korea. E-mail: [email protected]

ABSTRACT
Excavator action recognition is a major task for cycle time and productivity analysis. Many
researchers have developed vision-based methods and shown promising results for automated
action recognition. However, previous methods did not fully incorporate the sequential
working patterns of excavators, although such patterns are crucial to explaining their natural
operation procedure. To overcome these limitations, this paper proposes a vision-based
excavator action recognition framework that considers the sequential patterns of visual features
and operation cycles. The framework was validated with 1,340 images collected from actual
earthmoving sites, and the average precision and recall rate were 90.9% and 89.2%,
respectively. This research performed additional experiments with another CNN model, and
the precision and recall rate decreased to 77.5% and 71.6%, respectively. The experimental
results showed the applicability of the developed framework and the positive impacts of
sequential pattern learning. This study can contribute to more reliable automated cycle time
analysis and productivity measurement.

INTRODUCTION
An excavator is one of the most important pieces of equipment for successful earthmoving
projects because of its great ability to perform various operations in harsh working
environments. For cycle time and productivity analysis of earthmoving excavators, a major
task is to recognize their operation types, such as ‘digging’, ‘hauling’, ‘dumping’, ‘swinging’,
‘moving’, and ‘stopping’. Through continuous action recognition, site managers can obtain
detailed information about operational efficiency and measure performance indicators (e.g.,
direct work rate, cycle time, idle time, etc.). Such information further allows site managers to
estimate the project time and costs required to finish repetitive earthmoving operations.

In the past, action recognition and operation analysis have commonly been performed
manually. However, such a human-dependent approach is labor-intensive, expensive, and
time-consuming. To this end, researchers have made extensive efforts to develop automated
action recognition and monitoring systems. One of the most popular systems is an Internet-of-
Things (IoT)-based approach, which attaches electronic sensors to equipment's main body or
components, collects point location data, and interprets its physical motion information. On the
other hand, computer vision-based methods have drawn high attention from many researchers
because visual data include not only physical motion information but also equipment's visual
features (e.g., colors, geometric relationships, etc.). Such visual information is significant for
classifying the operation types of an excavator; for instance, the

‘digging’ actions of excavators can easily be recognized based on the geometric relationship
between their main bodies, arms, and buckets. Furthermore, since 2016, the Korean
Government has even encouraged construction firms to include the costs of installing
monitoring cameras in their safety management budgets. These technical benefits and social
demands have made vision-based systems more practical and affordable.
Due to the above-mentioned needs and advantages, a series of vision-based methods have
been developed and have shown highly promising results for action recognition. However,
previous studies did not fully take into account the sequential working patterns of earthmoving
excavators, which can enable more precise action analysis. To address this drawback, this
research proposes a method that recognizes the operation types of excavators by learning and
analyzing sequential working patterns. Specifically, sequential patterns of visual features and
operation cycles are incorporated into the proposed method. An excavator performs a specific
operation for a while with sequential changes of its visual appearance (e.g., the excavator is
‘digging’ while lifting up its bucket and arm), and it also carries out a series of different
operations in sequence (e.g., the excavator performs ‘hauling’ after ‘digging’). For this
purpose, the research team builds upon advanced deep learning algorithms: Convolutional
Neural Networks (CNN) and Double-layer Long Short-Term Memory (DLSTM). The built
deep learning model is called CNN-DLSTM in this paper.

This research has the following contributions. The authors identify critical elements for
explaining the sequential working patterns of earthmoving excavators, develop a technical
framework that considers the sequential patterns of visual features and operation cycles, and
improve the performance and applicability of automated action recognition and operation
analysis.

LITERATURE REVIEW
Earthmoving operations and excavators: Earthmoving operations are fundamental and
essential construction tasks that move soil from one place to another. A single earthmoving cycle
includes a series of operations such as excavating, loading, hauling, and dumping the soil. The
excavator is one of the most important pieces of equipment because of its ability to perform
various operations in harsh working environments. The major role of excavators in earthmoving
projects is to cut soil and hand it over to dump trucks. Excavators can take one of six types of
actions: ‘digging’, ‘hauling’, ‘dumping’, ‘swinging’, ‘moving’, and ‘stopping’. During operation,
excavators usually have sequential working patterns. Sequential patterns of visual features can
be observed while specific operations are carried out. For example, an excavator is ‘hauling’ or
‘swinging’ while rotating its main body, root, and bucket, and it is lifting up and putting down its
root and bucket for ‘digging’ or ‘dumping’ actions. Additionally, an excavator performs cyclic
operations: it takes the actions ‘digging’, ‘hauling’, ‘dumping’, and ‘swinging’ in sequential
order. Other operation cycles might include ‘hauling → stopping’ and ‘stopping → dumping’.
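The cyclic behavior described above can be illustrated with a short sketch (purely illustrative, not the authors' code): the six action labels from the text plus an assumed transition table, used to check whether a per-frame label sequence follows a plausible operation cycle.

```python
# Illustrative sketch: the six excavator actions named in the text and an
# ASSUMED table of plausible next actions, built only from the cycles the
# text mentions (digging -> hauling -> dumping -> swinging, plus
# hauling -> stopping and stopping -> dumping).
NEXT_ACTIONS = {
    "digging":  {"hauling"},
    "hauling":  {"dumping", "stopping"},
    "dumping":  {"swinging", "stopping"},
    "swinging": {"digging", "moving"},
    "moving":   {"digging", "stopping"},
    "stopping": {"digging", "dumping"},
}

def follows_cycle(frame_labels):
    """Collapse per-frame labels into runs of operations and check that
    every transition between consecutive operations is plausible."""
    runs = [frame_labels[0]]
    for label in frame_labels[1:]:
        if label != runs[-1]:
            runs.append(label)
    return all(b in NEXT_ACTIONS[a] for a, b in zip(runs, runs[1:]))
```

In the proposed method this kind of cycle knowledge is learned by the second LSTM rather than hand-written; the table above only illustrates what "sequential patterns of operation cycles" means.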
Vision-based action recognition in construction: In order to conduct action recognition
automatically, existing studies have investigated a range of vision-based methods that identify
operation types of construction equipment and workers. Zou and Kim (2007), for instance,
developed a method that classifies whether a detected excavator is working or idling using the
Hue, Saturation, and Value (HSV) color space. Gong and Caldas (2010) detected and tracked a tower crane’s
bucket for automated productivity analysis of concrete pouring activities. Golparvar-Fard et al.
(2013) trained a machine learning-based classifier that analyzes spatio-temporal features for

© ASCE
Computing in Civil Engineering 2019 300

recognizing operation types of earthmoving equipment. In addition, other studies further
analyzed domain-specific knowledge and contextual information. Rezazadeh Azar et al. (2013)
considered interactive operations between an excavator and a dump truck for estimating dirt-
loading cycles. Kim et al. (2018c) improved the concept of interactive operations and enhanced
the performance of automated action recognition of earthmoving equipment. In recent years,
researchers have highlighted significant advances in deep learning algorithms such as CNN
and LSTM, because deep learning-based approaches have been reported to outperform traditional
machine-learning methods in image classification, object detection, and action recognition. This
opportunity has also accelerated extensive research in the construction domain. Kim et al. (2018b)
developed a region-based CNN model to detect construction resources from jobsite images. Such
CNN-based detection models have also been investigated for the purpose of detecting non-hardhat-
wearing workers (Fang et al. 2018). Ding et al. (2017) combined CNN and LSTM to analyze the
sequential patterns of workers’ unsafe behaviors.
The existing studies have shown promising performance on action recognition of
construction equipment and workers. However, the previous vision-based methods did not fully
incorporate the sequential working patterns of earthmoving excavators, which represent actual
equipment operations in a more natural way and could thus enhance recognition performance. To
address these limitations, this study develops a method that recognizes the operation types of
excavators by learning and analyzing the equipment’s sequential working patterns. To learn the
sequential patterns of visual features and operation cycles, the research team builds upon CNN
and a Double-layer LSTM (CNN-DLSTM).

RESEARCH METHODOLOGY
The research methodology includes three main processes as shown in Figure 1. First,
earthmoving excavators are detected from jobsite images using a pre-trained CNN detection
model. The detected excavators are then tracked by Tracking-Learning-Detection algorithm
(TLD). Last, to recognize their operation types, sequential working patterns are analyzed by
CNN-DLSTM. The details of each process are explained in the following sections.

Figure 1. Research methodology.


Excavator detection: Excavator detection is a process of obtaining information of
excavators’ locations and dimensions at image frames. This research builds on one of the most
popular object detection models: the Faster Region-proposal Convolutional Neural Network (Faster
R-CNN) developed by Ren et al. (2017). Faster R-CNN has been used in both computer vision
and construction, and previous researchers have demonstrated its high applicability for detecting
construction resources (Fang et al. 2018; Kim et al. 2018b). The Faster R-CNN model first
processes a raw Red-Green-Blue (RGB) image and proposes a set of possible regions of interest
with confidence scores. Specifically, the original RGB image passes through a series of
convolution and max-pooling layers to generate a feature map of the input image. Next,
spatial windows slide over the feature map, and each window is projected into a
lower-dimensional feature vector. The extracted feature vectors are then used as the input of two
types of fully connected layers (i.e., a box-classification layer and a box-regression layer). In this
research, the original network model was adapted; the technical details can be found in
Ren et al. (2017).
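The region-proposal structure just described (convolution/max-pooling backbone, feature map, sliding spatial window, and sibling classification and regression layers) can be sketched as a toy PyTorch module. This is an illustration of the architecture's shape only, not Ren et al.'s implementation; all layer sizes and the anchor count are assumptions.

```python
# Toy region-proposal head mirroring the description in the text
# (illustrative only): a conv backbone produces a feature map, a 3x3
# "sliding window" conv projects each location to a lower-dimensional
# vector, and two sibling 1x1 convs score and regress k anchors per location.
import torch
import torch.nn as nn

class TinyRPN(nn.Module):
    def __init__(self, k_anchors=3, window_dim=16):
        super().__init__()
        self.backbone = nn.Sequential(          # conv + max-pooling stage
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.window = nn.Conv2d(8, window_dim, 3, padding=1)  # sliding window
        self.cls = nn.Conv2d(window_dim, k_anchors, 1)        # objectness scores
        self.reg = nn.Conv2d(window_dim, 4 * k_anchors, 1)    # box regression

    def forward(self, image):                   # image: (B, 3, H, W)
        fmap = self.backbone(image)
        h = torch.relu(self.window(fmap))
        return self.cls(h), self.reg(h)         # per-location proposals
```

The real network uses a much deeper backbone and applies non-maximum suppression and the two fully connected heads downstream; the sketch only shows where the confidence scores and box coordinates come from.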
Excavator tracking: The purpose of this process is to obtain trajectory information of
earthmoving excavators from sequential images. To track multiple excavators, the research team
used a popular object tracking algorithm: Tracking-Learning-Detection, developed by Kalal et al.
(2012). Since the original algorithm was not fully effective on construction jobsite images, the
authors customized TLD by considering domain-specific characteristics in their previous
study (Kim and Chi 2017). The benefits of TLD originate from two core modules:
functional integration and online learning. Functional integration runs a detector and a tracker
in parallel and determines the final tracking results by cross-checking their outputs so that each
compensates for the other’s tracking errors. This enables the system to localize earthmoving
excavators over the long term under dynamic movements and high interclass/intraclass variations.
The detector, trained on a well-developed training database, can handle abrupt changes
of excavators and environments since it independently scans all areas of an image and localizes
target objects. However, the detector may have difficulty localizing target objects when it
faces unexpected events or visual appearances that are not included in the training
database. This limitation can be compensated for by the tracker. The tracker analyzes image frames in
sequence and localizes the most similar regions between consecutive images based on motion
information or feature representations. This means that the tracker is capable of adapting to smooth
changes of object features (e.g., shape changes while excavators are rotating). Hence, possible
detection errors can be minimized by the tracker. Conversely, the tracker has a limitation that
can be effectively handled by the detector: the tracker is sensitive to sudden
changes in excavators’ movements and shapes. Such errors can be counterbalanced by the
detector’s strengths, which are robust to abrupt changes and dynamic movements. The online
learning technique, which generates training data from the testing images in real time, is also a
core module for tracking earthmoving excavators. Since training data can be extracted from a
testing image, object-specific detectors can be generated, enhancing the tracking performance.
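The functional-integration idea can be reduced to a small sketch (an illustrative simplification of TLD, not the authors' customized implementation): measure detector/tracker agreement by intersection-over-union and let each module cover the other's failure case.

```python
# Illustrative detector/tracker fusion in the spirit of functional
# integration (not the authors' code); boxes are (x1, y1, x2, y2) tuples.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def fuse(detector_box, tracker_box, iou_threshold=0.3):
    """Combine detector and tracker outputs into one final box.

    - both respond and agree (IoU >= threshold): trust the detector, which
      also serves to re-initialize the tracker and correct its drift;
    - both respond but disagree: keep the tracker's smooth estimate,
      filtering a likely spurious detection;
    - only one responds: use it; neither responds: report a lost target.
    """
    if detector_box and tracker_box:
        if iou(detector_box, tracker_box) >= iou_threshold:
            return detector_box
        return tracker_box
    return detector_box or tracker_box  # None when both are missing
```

The real TLD algorithm additionally validates results with its learning component; the threshold and tie-breaking policy above are assumptions chosen for clarity.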
Sequential pattern modeling and learning: To learn the sequential working patterns of
excavators, the model built in this research is composed of two sub-models, as illustrated in
Figure 2: a CNN-LSTM model and a second LSTM model. The CNN-LSTM model learns and analyzes the
sequential patterns of visual features while an excavator performs a specific operation; for instance,
the excavator is ‘hauling’ or ‘swinging’ while rotating its main body, root, and bucket. The CNN is
used to extract key visual features from the tracking bounding boxes in each image. Once the visual
features are extracted, the first LSTM model learns and analyzes their sequential patterns. This
makes it possible to predict the operation types of excavators by incorporating visual features
across multiple image frames, since a single operation is carried out over a period of time. Next, the interim
results of the CNN-LSTM model are further processed in the second LSTM model, which examines the
sequential patterns of operation cycles. At the end, the operation type at each time-step is
re-evaluated by taking the interim operation sequences into account; that is, the current operation
is decided by the sequences and durations of the previous operations during a certain time interval.
This allows the second LSTM model to learn the sequential patterns of operation cycles and to
classify the current operation types more accurately. For example, the generally observable
operation cycle ‘digging → hauling → dumping → swinging’ can be learned. Additionally,
noisy/outlier inferences of the CNN-LSTM model can be filtered out by the second LSTM model.
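The two-stage design can be sketched in PyTorch as follows. The tiny CNN, feature dimensions, and hidden sizes are illustrative assumptions, not the paper's exact architecture, but the data flow (per-frame CNN features → first LSTM → interim per-step operation logits → second LSTM over interim probabilities → refined logits) mirrors the workflow of Figure 2.

```python
# Hedged sketch of the CNN-DLSTM idea; all dimensions and the stand-in CNN
# are illustrative assumptions, not the paper's trained model.
import torch
import torch.nn as nn

NUM_ACTIONS = 6  # digging, hauling, dumping, swinging, moving, stopping

class CNNDLSTM(nn.Module):
    def __init__(self, feat_dim=32, hidden=64):
        super().__init__()
        # Per-frame visual feature extractor (stand-in for the paper's CNN).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, feat_dim),
        )
        # First LSTM: sequential patterns of visual features.
        self.lstm1 = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head1 = nn.Linear(hidden, NUM_ACTIONS)
        # Second LSTM: sequential patterns of operation cycles, fed with the
        # interim per-step class probabilities.
        self.lstm2 = nn.LSTM(NUM_ACTIONS, hidden, batch_first=True)
        self.head2 = nn.Linear(hidden, NUM_ACTIONS)

    def forward(self, frames):                  # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        interim = self.head1(self.lstm1(feats)[0])           # (B, T, 6)
        refined = self.head2(self.lstm2(interim.softmax(-1))[0])
        return interim, refined                 # per-step operation logits
```

Feeding the second LSTM with probabilities rather than raw logits keeps its input on a fixed scale, which is one plausible way to realize the "interim results" hand-off the text describes.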

Figure 2. The workflows of sequential pattern modeling and learning.

EXPERIMENTAL RESULTS AND ANALYSIS


To validate the proposed methodology, this research carried out experiments and analyzed the
results. The authors collected video stream data from an earthmoving site using normal
optical cameras and smartphones. Since the excavators were performing earthmoving, it was
possible to record all of their operation types, including ‘digging’, ‘hauling’, ‘dumping’,
‘swinging’, ‘moving’, and ‘stopping’. When installing optical cameras on construction sites, the
authors followed a systematic camera placement framework developed in a previous study (Kim
et al. 2018d). A total of 1,340 image frames was collected at a resolution of 720 × 1280
pixels and a frame rate of 10 fps. Figure 3 shows the classified operation types of the excavator;
the average precision and recall rates were 90.9% and 89.2%, respectively. The experimental
results indicated that the developed framework has acceptable performance in recognizing the
operation types of earthmoving excavators. In detail, the various operation types were identified
continuously and correctly from the T = 78 image frame on the left to the T = 193 image frame
on the right. To confirm the positive effects of the sequential pattern modeling, the research team
performed an additional experiment with a CNN model without sequential analysis. When the
operation types were recognized without sequential analysis, the precision and recall rates
decreased to 77.5% and 71.6%, respectively. The results indicated that sequential pattern
modeling and learning had significant impacts on the performance of excavator action recognition.
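For reference, metrics of this kind can be computed per action class and then macro-averaged; a minimal sketch with made-up labels (not the paper's data) is:

```python
# Macro-averaged precision and recall over operation classes
# (illustrative implementation; the labels in the test are invented).

def macro_precision_recall(y_true, y_pred):
    """Average per-class precision and recall over the classes present."""
    classes = set(y_true) | set(y_pred)
    precisions, recalls = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        pred_c = sum(p == c for p in y_pred)   # predicted as class c
        true_c = sum(t == c for t in y_true)   # truly class c
        precisions.append(tp / pred_c if pred_c else 0.0)
        recalls.append(tp / true_c if true_c else 0.0)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n
```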


Figure 3. Examples of action recognition results.


CONCLUSION
This paper proposed a vision-based action recognition framework that considers sequential
working patterns of earthmoving excavators. The sequential patterns of visual features and
operation cycles were investigated in the developed framework. To analyze such sequential
patterns, the research team built the novel CNN-DLSTM model. In the model, CNN aids in
capturing key visual features of excavators such as their edges and the geometric relationship
between their buckets and main bodies. Based on the visual features extracted, the first LSTM
model analyzes their sequences (e.g., the excavator rotates its buckets and main body) and
predicts operation types. Then, such interim results are fed into the second LSTM model, which
re-determines the action types by further analyzing the sequences of the operation cycles. The
experimental results supported the feasibility of the proposed method as well as the statistical
significance of the sequential pattern modeling. The precision and recall rates increased by
13.4 and 17.6 percentage points, respectively, when the sequential patterns were analyzed. Regarding
practical applications, based on the results of action recognition, it is possible to automatically
measure the cycle time of earthmoving excavators and to analyze the root causes of productivity
deviations. For instance, according to the findings of previous studies (Peurifoy 2011; Yoon et
al. 2014), long durations of ‘digging’ operations might signal difficulty in digging the soil under the
current working conditions (e.g., improper height of cut or excessive soil strength). Also, long
durations of ‘hauling’ and ‘swinging’ operations may indicate that excavators are rotating through a
large angle, which is one of the most critical factors that increases their cycle time and lowers
operational efficiency. Such quantitative analyses can help site managers make project-control
decisions, such as operator training, layout planning, and soil blasting, to enhance the operational
efficiency of earthmoving excavators.
Several research challenges remain to be addressed. The deep learning model built in this
paper required excessive computational training time and a large amount of training data. Future
studies can investigate algorithmic enhancements, such as finding the optimal complexity of
network models with relatively short training times while maintaining acceptable accuracy. In
addition, to address intrinsic shortcomings of vision-based methods, future research can integrate
IoT-based systems such as the Global Positioning System (GPS). With further improvement, it is
expected that automated action recognition can be achieved for cycle time and productivity
analysis of earthmoving excavators.

ACKNOWLEDGEMENTS
This work was supported by Seoul National University Research Grant in 2018.

REFERENCES
Ding, L., Fang, W., Luo, H., Love, P. E. D., Zhong, B., and Ouyang, X. (2017). “A deep hybrid
learning model to detect unsafe behavior: Integrating convolution neural networks and long
short-term memory.” Automation in Construction, 86, 118–124.


Fang, Q., Li, H., Luo, X., Ding, L., Luo, H., Rose, T. M., and An, W. (2018). “Detecting non-
hardhat-use by a deep learning method from far-field surveillance videos.” Automation in
Construction, 85, 1–9.
Golparvar-Fard, M., Heydarian, A., and Niebles, J. C. (2013). “Vision-based action recognition
of earthmoving equipment using spatio-temporal features and support vector machine
classifiers.” Advanced Engineering Informatics, 27(4), 652–663.
Gong, J., and Caldas, C. H. (2010). “Computer Vision-Based Video Interpretation Model for
Automated Productivity Analysis of Construction Operations.” Journal of Computing in
Civil Engineering, 24(3), 252–263.
Kalal, Z., Mikolajczyk, K., and Matas, J. (2012). “Tracking-Learning-Detection.” IEEE
Transactions on Pattern Analysis and Machine Intelligence, 34(7), 1409–1422.
Kim, H., Bang, S., Jeong, H., Ham, Y., and Kim, H. (2018a). “Analyzing context and
productivity of tunnel earthmoving processes using imaging and simulation.” Automation in
Construction, 92, 188–198.
Kim, H., Kim, H., Hong, Y. W., and Byun, H. (2018b). “Detecting Construction Equipment
Using a Region-Based Fully Convolutional Network and Transfer Learning.” Journal of
Computing in Civil Engineering, 32(2), 04017082.
Kim, J., and Chi, S. (2017). “Adaptive Detector and Tracker on Construction Sites Using
Functional Integration and Online Learning.” Journal of Computing in Civil Engineering,
31(5), 04017026.
Kim, J., Chi, S., and Seo, J. (2018c). “Interaction analysis for vision-based activity identification
of earthmoving excavators and dump trucks.” Automation in Construction, 87, 297–308.
Kim, J., Ham, Y., Chung, Y., and Chi, S. (2018d). “Systematic Camera Placement Framework
for Operation-level Visual Monitoring on Construction Jobsites.” Journal of Construction
Engineering and Management, [In-Press].
Peurifoy, R. L. (2011). Construction planning, equipment, and methods. McGraw-Hill.
Ren, S., He, K., Girshick, R., and Sun, J. (2017). “Faster R-CNN: Towards Real-Time Object
Detection with Region Proposal Networks.” IEEE Transactions on Pattern Analysis and
Machine Intelligence, 39(6), 1137–1149.
Rezazadeh Azar, E., Dickinson, S., and McCabe, B. (2013). “Server-Customer Interaction
Tracker: Computer Vision–Based System to Estimate Dirt-Loading Cycles.” Journal of
Construction Engineering and Management, 139(7), 785–794.
Yoon, J., Kim, J., Seo, J., and Suh, S. (2014). “Spatial factors affecting the loading efficiency of
excavators.” Automation in Construction, 48, 97–106.
Zou, J., and Kim, H. (2007). “Using Hue, Saturation, and Value Color Space for Hydraulic
Excavator Idle Time Analysis.” Journal of Computing in Civil Engineering, 21(4), 238–246.


Modelling and Controlling Unmanned Excavation Equipment on Construction Sites


Joo-sung Lee1; Byeol Kim2; Dong-ik Sun3; Chang-soo Han4; and Yong-han Ahn5

1Research Professor, Dept. of Architectural Engineering, Hanyang Univ., 55 Hanyangdaehak-ro,
Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
2Master’s Student, Dept. of Architectural Engineering, Hanyang Univ., 55 Hanyangdaehak-ro,
Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
3Ph.D. Candidate, Dept. of Mechatronics Engineering, Hanyang Univ., 55 Hanyangdaehak-ro,
Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
4Professor, Dept. of Robot Engineering, Hanyang Univ., 55 Hanyangdaehak-ro, Sangnok-gu,
Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]
5Associate Professor, Dept. of Architectural Engineering, Hanyang Univ., 55 Hanyangdaehak-ro,
Sangnok-gu, Ansan-si, Gyeonggi-do 15588, South Korea. E-mail: [email protected]

ABSTRACT
Excavators on construction sites can pose a serious hazard for both their operators and the
laborers working alongside them because they can easily overturn when operating on uneven
ground. Although there have been many suggestions for improving onsite safety, the ultimate
solution is to avoid these hazards altogether by utilizing unmanned equipment. Unfortunately, it
would be both technically challenging and prohibitively expensive to completely redesign an
excavator so that it functions autonomously, as this would require an integrated control algorithm
capable of real-time sensing, analysis, and control if the excavator is to react appropriately to its
local environment. We propose an alternative solution: a removable manipulator that can be
installed in the cockpit to enable an operator at a remote location to control excavator operations
across six types of movement. A prototype system was constructed and tested.

INTRODUCTION
Construction projects are typically developed on a project-by-project basis. To construct a single
building according to the specified designs, processes, and methods, workers from multiple
specialties must be onsite simultaneously, working alongside each other as well as a wide range of
equipment and various types of materials.
Accidents caused by Excavators on Construction Sites: Due to the nature of construction
projects, especially those involving large-scale operations, an unpredictable climate, intense
equipment utilization, and the use of large-sized materials, many workers and the equipment
needed to support them must work together in harmony on a typical construction site. For civil
engineering projects that include extensive earthworks, foundations, and underground structures,
workers are always at risk of falling into holes or being buried when trenches or tunnels collapse
because a great deal of the construction takes place below ground level. Where excavators are
being used to shift large quantities of material, especially on uneven ground, there is also a risk
of the machine overturning and causing serious injuries to not only the operator but also those
working close by.
Existing Research on the Unmanned Operation of Excavators: Zweiri (2008) compared
parameter analysis approaches such as the Newton-Raphson method, the generalized Newton
method, and the least squares method to identify the optimum dynamic model for operating the
arm of an unmanned excavator. He reported that the unmanned excavator analyzed in his paper
suffered from a serious limitation in that the cost of converting existing equipment to an
unmanned system was very high and it would also be hard to commercialize because of the low
efficiency of the resulting field performance. Bender et al. (2017) proposed a predictive operator
model for virtual prototyping of a hydraulic excavator, including a methodology for simulating
the degree of freedom of the boom, arm, and bucket. This study suggests an alternative
simulation method for optimizing excavation work, but the researchers did not extend the new
methodology or their research results to encompass unmanned operation methods for an
excavator. Zhou et al. (2017) implemented a linkage algorithm from the boom cylinder to the
bucket cylinder to achieve an optimal balance between excavator driving and energy saving; they
also proposed a prediction based stochastic dynamic programming control methodology.
However, although their findings are a useful contribution to efforts to improve excavator control,
their application to work on developing an unmanned excavator is of only limited utility. J. W.
Kim et al. (2018) proposed a vision-based activity identification methodology for identifying the
status of dump trucks and excavators for unmanned excavation. Their results revealed that a
vision-based method is less reliable than more recent technologies such as deep learning based on
real-time 3D scan data, with the difference in results depending on the analysis frequency. J. H.
Kim et al. (2018) went further by proposing the concept of a robotic excavator, using a method
based on a predefined work algorithm to automate the machine. They did not consider the use of
real-time information collection methods such as 3D scanning and deep learning to cope with
varying conditions and events in the field, so there remains some way to go before their model
could lead to a practical application.
excavator studies, although a number of different methods have been developed, all suffer from
the common problem that the excavator or arm needs to be fully remodeled and hence none are
capable of dealing with constantly changing field conditions in real time. This means that such
systems will be difficult to commercialize as they will be prohibitively expensive to operate.
Unmanned Excavator Operations Using a Remotely Controlled Manipulator: In order to
address the issues identified above, a remote excavator manipulation method based on a remote
control system could prove to be a viable alternative. An unmanned control system based on a
removable manipulator installed in an excavator is both effective and easy to use under a wide
range of different environmental conditions. The remote control system is able to achieve the
same effect as conventional excavation work because it presents real-time imagery from various
sensors to a human operator located off site in a safe indoor environment. As it is operated
remotely, this approach exposes fewer workers to the large-scale accidents that occur all too often
during earthworks excavation, underground frame construction, and demolition work, all of
which carry a high risk of burial, rockfall, or collapse. The detachable manipulator can be
utilized for a wide variety of equipment by simply modifying the method used to manipulate the
control stick. In this study, an appropriate manipulator was installed in the cockpit of a
commercial excavator, and a remote control system and a reliable communication method
between the remote control station and manipulator were developed and tested.

RESEARCH METHODOLOGY
Scope of Research: An unmanned excavation system was designed to reduce the risk of
accidents during excavation operations, which account for about 18% of the deaths related to
equipment on construction sites. In order to achieve the same work performance as existing
excavators, an unmanned excavation control algorithm for navigation, gripper operation, and
body swing was developed.


Research Methodology: The research method for the unmanned excavation system and the
development of the individual components is shown in Figure 1.

Figure 1. Research methodology


In order to develop an unmanned excavation system that has the same work performance as
existing conventional excavators, the following procedure was carried out:
1) The conditions and minimum performance that would be required of an unmanned
excavation system were determined through a comprehensive review of the literature;
2) An appropriate system and components were designed; and
3) The system structure and its individual components, including a remote control operation
station, a manipulator for the excavator, a suitable communication network environment,
and a behavior control sensor were developed.
In addition, to analyze the technical performance of the unmanned excavation system, four
types of performance analysis, classified into 2 main categories, were performed.
1) Mechanical performance: the gripper position measurement accuracy and remote
control response speed
2) Communication performance: the optimum information update period for the
excavator and the behavior and maximum distance for the remote wireless
communication network.

DESIGN OF UNMANNED EXCAVATING SYSTEM


Remote Control Station to Control the Manipulator: The new unmanned excavation
system developed for this project consists of three components: 1) an unmanned excavator
control system in a remote location, 2) an unmanned excavator to conduct the on-site work, and
3) a sensor to provide excavator attitude control and position information. The remote control
components consist of a control unit that sends signals that operate the excavator’s levers and
pedals, a positioning GPS, and a gripper attitude control sensor. Real-time visualization is
achieved using photographic data recorded by a camera installed on the roof of the excavator,
which is transferred to the PC in the remote workspace via Zigbee. The operator manipulates the
remote control equipment installed in the indoor remote control station to control the excavator.


Figure 2 shows the linkages between the unmanned excavator and the remote operator.

Figure 2. Schematic diagram of the prototype unmanned excavating system

Figure 3. Development of remote control station


DEVELOPMENT OF UNMANNED EXCAVATING SYSTEM
Remote Control Station: In order to provide the same performance as a conventional
excavator, a remote control algorithm for the navigation, gripper operation and body swing was
developed. Six DOF (degrees of freedom) for the remote control joystick manipulation device
were implemented: driving modes (forward, backward, left-turn, right-turn and acceleration),
gripping modes (grasp & roll up, open & roll down), swing modes (left-swing and right-swing),
and boom and arm manipulation (forward & backward, up & down).
Mirroring the excavator’s degrees of freedom, the remote control station (Figure 3) consisted
of an integrated controller, including control arms, control pedals, a monitoring system, and a
camera system set up in a similar configuration to those in the excavator’s cab to ensure that an
experienced excavator driver would be able to operate them without difficulty.
Manipulator Installed to Operate the Excavator Controllers: The operator at the remote
location controlled the manipulator attached to the excavator via Zigbee communication. The
manipulator consisted of four separate control units, one each for the two levers and two pedals,
corresponding to the original joysticks in the excavator. These four control units operated the
joystick in the excavator based on the command signals received by the main controller.
Sensor System to Monitor the Location and Position of the Excavator: Based on the
command signal value, the excavator classifies the signals into a micro mode for driving in the
task space and a macro mode for driving the cylinder in the joint region. In the macro mode, the
actuator is directly driven by visual feedback. In the micro mode, the excavator decides whether
to perform the task in accordance with the position and speed values fed back through the
sensors and the kinematic workspace. In addition, a potentiometer- and encoder-type stroke sensor, a
Novatel GPS position sensor, and a tilt sensor were attached to collect necessary information
such as the excavator attitude and gripper position.
Communication Network Linking the Remote Control Station and Manipulator: To carry the signals
needed to remotely control the excavator, a wireless communication system was constructed
between the control station and manipulator installed in the excavator. Zigbee transceiver
modules were attached to the remote station and the excavator and the station was configured to
transmit the driving and arm operation commands to the manipulator.
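The paper does not specify the wire format of these commands, so the following is a hypothetical sketch of how the six control channels could be serialized into a fixed, checksummed frame for a Zigbee serial link; the header value, field layout, and mode byte are all assumptions for illustration.

```python
# HYPOTHETICAL 16-byte command frame for the remote-control link; the
# actual prototype's protocol is not described in the paper.
import struct

HEADER = 0xA5
FRAME = struct.Struct("<BB6hH")  # header, mode, six int16 channels, checksum

def pack_command(mode, channels):
    """Serialize one command: six signed channel values plus a checksum."""
    assert len(channels) == 6        # e.g., drive, swing, boom, arm, grip, pedal
    body = struct.pack("<BB6h", HEADER, mode, *channels)
    checksum = sum(body) & 0xFFFF    # simple additive checksum over the body
    return body + struct.pack("<H", checksum)

def unpack_command(frame):
    """Validate header and checksum, then return (mode, channel values)."""
    header, mode, *channels, checksum = FRAME.unpack(frame)
    if header != HEADER or sum(frame[:-2]) & 0xFFFF != checksum:
        raise ValueError("corrupted frame")
    return mode, channels
```

A checksum of some form matters on a lossy wireless link like this one, since a corrupted drive or swing command sent to heavy equipment is a safety issue, not just a data error.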

Figure 4. Mechanical performance analysis: precision of gripper’s localization (top), system response speed (bottom)
MECHANICAL PERFORMANCE ANALYSIS OF PROPOSED SYSTEM
Mechanical Performance of the Proposed System: In order to verify the mechanical
performance of the unmanned excavation system, the system was tested by examining the
precision of the gripper's localization, its response speed, and how well the excavator’s systems
functioned under remote operation. The excavator (a 1999 JCB
model XYZ) was operated at a location XX m (or km?) from the operator, who was a skilled
excavator driver with XX years’ experience.
To analyze the precision of the gripper's localization, the (X, Y, Z) coordinates of the gripper
were measured using a CAN Analyzer in real time, and the position of the gripper was
compared with the position of the second GPS antenna, based on the reference coordinates at
which the GPS antenna was installed. The measurement was repeated five times, and the mean
error was found to be within ±1.8 cm, confirming that the performance is satisfactory.
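A check of this kind reduces to averaging Euclidean errors over repeated trials; a minimal sketch follows (the coordinate values in the example are made up, not the experiment's data).

```python
# Mean localization error across repeated trials (illustrative helper;
# inputs are (x, y, z) positions in consistent units, e.g., meters).
import math

def mean_position_error(measured, reference):
    """Mean Euclidean distance between each measurement and the reference."""
    dists = [math.dist(m, reference) for m in measured]
    return sum(dists) / len(dists)
```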
To test the response speed and how well the excavator systems functioned under remote
operation, the difference between the remote control input time (measured by the CAN Analyzer)
and the response time (measured by RTK-GPS and the CAN Analyzer) was compared and
found to be within 0.357 seconds, confirming that the work-response delay commonly
experienced by unmanned excavation systems is not an issue here. Figure 4 shows the results of
the mechanical performance analysis of the prototype unmanned excavation system.
Communication Performance of Proposed System: The communication performance
between the excavator and the control station has a direct impact on both the precision and the
productivity of excavation work. We therefore verified the communication performance by
measuring the update frequency of the attitude information and the maximum effective distance
of the wireless telecommunication. Updating the attitude information proceeds via 3 steps: 1)
integrate the action information for the boom, arm and gripper measured by sensor; 2) calibrate
the position and angle; and 3) transmit the position value for the excavator and gripper to control
the system using UART and CAN communication devices. The results show that the attitude
value was transmitted and received every 200 msec (5 Hz). For the wireless remote control
distance analysis, RTK-GPS units were mounted on the remote control station and the excavator,
revealing that remote control became unreliable at distances above about 10.25 m. Figure 5
shows the results of the communication performance analysis.

Figure 5. Communication performance analysis: precision of gripper localization (left),
system response speed (right)
CONCLUSIONS
The results of this preliminary study suggest that an unmanned excavation system designed
to overcome the limitations identified in earlier research on unmanned excavator operations can
indeed be implemented to prevent some of the accidents that occur on dangerous worksites. The
technical performance of a prototype system was analyzed and the mechanical and

© ASCE
Computing in Civil Engineering 2019 311

communication performance tested. The proposed unmanned excavation system could be used
not only for high-risk work such as earthworks and demolition, but also in disaster areas or
places that are hard for people to access safely such as nuclear facilities. Installing manipulators
in an excavator could become a core technology for unmanned heavy equipment research
because of this approach’s versatility: it can be applied for a wide range of equipment simply by
modifying the manipulation method. This research is ongoing; the results of a more detailed
study, including the operating principles of the manipulator and the gripper manipulation
methodology will be derived in future research, and the results of the verification study on the
driving performance and the operational degrees of freedom, which determine the core
performance of the excavator, will be presented soon.

ACKNOWLEDGEMENTS
This work was supported by the Technology Innovation Program (No. 2017-10069072)
funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea).

REFERENCES
Bender, F. A., Mitschke, M., Bräunl, T., and Sawodny, O. (2017). “Predictive operator modeling
for virtual prototyping of hydraulic excavators.” Automation in Construction, 84, 133-145.
Kim J. H., Lee S. S, Seo J. W., and Kamat V. R. (2018). “Modular data communication methods
for a robotic excavator.” Automation in Construction, 90, 166-177.
Kim J. W., Chi S. H., and Seo J. W. (2018). “Interaction analysis for vision-based activity
identification of earthmoving excavators and dump trucks.” Automation in Construction, 87,
297-308.
Zhou H., Zhao P. Y., Chen Y. L., and Yang H. Y. (2017). “Prediction-based stochastic dynamic
programming control for excavator.” Automation in Construction, 83, 68-77.
Zweiri, Y. H. (2008). “Identification schemes for unmanned excavator arm parameters.”
International Journal of Automation and Computing, 05(2), 185-192.

Automatic Wall Defect Detection Using an Autonomous Robot: A Focus on Data Collection
Jun Wang, Ph.D.1; and Chaomin Luo, Ph.D.2
1Assistant Professor, Dept. of Civil and Environmental Engineering, Mississippi State Univ., PO
Box 9546, Mississippi State, MS 39762. E-mail: [email protected]
2Associate Professor, Dept. of Electrical and Computer Engineering, Mississippi State Univ., PO
Box 9571, Mississippi State, MS 39762. E-mail: [email protected]

ABSTRACT
Detection of wall defects by autonomous robots enhances building inspection with reduced
labor and improved productivity and inspection accuracy, in comparison with manual inspection.
Accordingly, various advanced technology-enabled techniques have been investigated to detect
defects through collected images. However, little effort has been devoted to developing an
automatic data collection system that ensures the collected data are of high quality (complete but
not redundant). In this paper, an autonomous robot-enabled data collection system is developed for
indoor wall condition inspection. The autonomous robot is equipped with sensors for navigation,
map building, and obstacle avoidance. To generate safer, more reasonable collision-free wall-
following trajectories, improved heuristic algorithms are used to optimize robot trajectory. The
developed data collection system with navigation of an autonomous robot is evaluated by
simulation. The obtained results indicate that the hybrid approach for the automatic data
collection system is able to cover all available walls and corners without redundant trajectories.

INTRODUCTION
Inspection of wall conditions is needed on multiple occasions such as periodic building
inspection, maintenance and renovation, changes of ownership, and post-disaster building
assessment. Wall defects such as surface cracks are unsightly, reflect wall conditions, and can be
symptomatic of structural failures. In many cases, cracks may even be hidden beneath freshly
painted walls. Periodic inspection of buildings (including walls) and civil infrastructure is
essential to ensure that their conditions still meet the expected service requirements (Koch et al.
2015). The defects of building walls vary with the type of wall, such as efflorescence for solid
masonry walls, buckling for wood frame walls, and cracks, a typical type of wall defect, for solid
masonry and masonry veneer walls. Cracks also occur commonly at corners and junctions (e.g.,
connecting floors and walls or connecting walls), where they should not be overlooked.
Furthermore, defects on walls (e.g., gaps between walls and floors and cracks originating from
the foot of walls) caused by sinking, sagging, or settling are precursors of structural failures and
must be inspected as well. Other common defects for painted walls are peeling and flaking
finishes. In addition, thermal
performance is essentially related to wall defects, e.g., moisture and thermal problems have been
proved to be precursors of some wall defects and defected walls undermine building moisture
behavior and thermal performance (Alencastro et al. 2018; Fox et al. 2016). Accordingly, various
wall inspection and defect detection methods and tools have been developed (Perilli et al. 2018;
Kim et al. 2015). However, manual inspection remains the principal approach in real-world
wall inspection practice. Compared with automatic building inspection and defect detection, the
primary limitations of manual inspection are its heavy time and labor demands, high cost, low
productivity, and, in some cases, high safety risk.

With the increased availability of low cost, high quality, and easy-to-use visual sensing
technologies (e.g., cameras), the studies and development of technology-enabled techniques such
as computer vision techniques for automatic defect detection have been exponentially increasing
over the last decade (Koch et al. 2015). Various visual inspection methods have been developed
and tested in different applications such as the inspection of railway, bridges, roads, and
buildings. For example, a fractal analysis technique based on digital images was developed for
surface defect tracking of reinforced concrete bridges (Adhikari et al. 2016). For masonry walls,
a defect finding classification method was developed by Samy et al. (2016) to classify the types
of defects based on the image data captured from vision and 2D laser sensors.
Evidently, data collection is the first step in the above-described research and the quality of
the data captured has an essential impact on the following steps of defect detection. However,
most of the existing literature emphasized the development of advanced techniques to process
the captured data for defect detection while very limited effort was put on the development of
effective data collection systems or methods to advance the productivity and performance of the
automatic detection process, particularly for wall defect detection. In recent years, rapidly
evolving autonomous robots have become one of the most innovative automation solutions for
engineering applications such as rescue and inspection robotics
(Kim et al. 2018a). Montero et al. (2015) reviewed the past, present, and future of tunnel
inspection using robotic systems. Several robots with their adopted sensors for detecting defects
in tunnels were presented. A primary drawback of all these systems is that they are tele-operated,
and the need to develop fully automated tunnel inspection systems was identified.
Robots also have been investigated for the inspections of onshore and offshore oil tanks,
pipelines, petro-chemical tanks, etc. Most of the robots are remotely operated with very little
autonomy, which also is the primary limitation identified (Shukla and Karki 2013; Leon-
Rodriguez et al. 2012). Applying autonomous robots for automatic wall defect detection presents
high promise, with multiple noticeable advantages such as reduced safety risk for the workforce
and improved productivity and accuracy.
Therefore, this paper is mainly focused on developing an automatic data collection system
using an autonomous robot to support the automated wall defect detection. The algorithms of
robot navigation under different application settings need to be adjusted. Accordingly, the
algorithms ensuring the robot not only to follow walls but also to effectively avoid collisions in
unknown environments need to be developed. The expected trajectories of the robot should cover
all the walls and corners that require to be inspected and also not be redundant. Specifically, the
data collected for further defect detection (i.e., data processing) should be complete and not
redundant, which contributes to improving the inspection accuracy and efficiency and also the
subsequent troubleshooting and repairing processes. Thus, one of the tasks of this study is to
explore methods allowing a robot to precisely cover all walls and corners, e.g., wall following.
In addition, as an important consideration for real-time navigation and mapping, Vector Field
Histogram (VFH) algorithm, an easily modifiable algorithm, and many of its variants are
computationally efficient (Babinec et al. 2018). The VFH with its variants has been investigated
for various applications of obstacle avoidance and path planning (Kim et al. 2018b).
The adopted autonomous robot(s) can be customized to be equipped with multiple sensing
technologies to meet their application requirements. The presented automatic wall defect
detection system also can be applied to the evaluation of the wall conditions after some disasters
such as earthquake to make the post-disaster reconstruction and management faster and more
efficient.

The next section describes the framework of the automatic wall defect detection system with
an emphasized explanation on the data collection. Thereafter, behavior-based and Vector Field
Histogram (VFH) algorithms used for robot navigation, obstacle avoidance, and map building
are explained. The performance of the autonomous robot for data collection was evaluated
through simulation. At the end, concluding remarks with future work are presented.

FRAMEWORK OF THE AUTOMATIC WALL DEFECT DETECTION SYSTEM


The framework of the automatic wall defect detection system is illustrated in Figure 1, which
includes three major components: data collection (the focus of this paper), defect detection (i.e.,
data processing), and decision making.

[Figure 1 diagram: an autonomous robot (or robots) equipped with sensors collects data from
buildings using navigation, map building, and obstacle avoidance algorithms (data collection,
the focus of this paper); the data are processed for defect detection with computer vision based
techniques (e.g., using cameras, laser sensors, and infrared thermography (IRT)), inspection
codes, and other techniques (the emphasis of existing studies); the detected conditions, together
with human knowledge, support decision making on maintenance, reconstruction, valuation, and
other actions.]
Figure 1. The framework of the automatic wall defect detection system.
Data Collection: Data collection is the first and primary step in the automatic building and
civil infrastructure inspection applications. The quality of the collected data, a longstanding issue
of interest in the construction engineering field, essentially affects the performance of the
subsequent processes, defect detection and countermeasure implementation (Westin and Sein
2014). To address the above-identified gap, the work presented in this paper focuses on
developing an autonomous robot-based data collection system for wall defect detection. The
autonomous robot is equipped with a camera, LIDAR, and IR sensors for navigation, map
building, and obstacle avoidance. A hybrid real-time navigation approach (explained later) is
proposed and used to optimize the robot trajectory among the potential trajectories in terms of
angles and distances to the detected walls; consequently, safer, complete, and more reasonable
collision-free wall-following trajectories are generated to capture data.
Defect Detection and Decision Making: With the data captured, the wall conditions will be
inspected in a near real-time or post-analysis manner based on the application requirements. The
technologies and infrastructures for real-time wireless data communication have made essential
advancements in recent years. The advanced defect detection techniques (such as the techniques
described in the Introduction section) together with inspection codes will be applied. Based on
the conditions detected, decisions are made, and accordingly, countermeasures can be
implemented. Human knowledge also will be involved in the decision making process.

WALL-FOLLOWING AND VECTOR FIELD HISTOGRAM (VFH) NAVIGATION AND
MAPPING
The behavior-based wall-following method is applied to manipulate the autonomous vehicle
in indoor room environments. Behavior decision is made at each sampling time according to
real-time sensor data, the results of global path planning and local navigation, and the distance
and relative velocity between the autonomous robot and walls and obstacles. Autonomous robot
behaviors are classified into straight-line following, corner following, turning, and obstacle
avoidance. In this section, behavior-based algorithms for straight-line wall-following and corner
wall-following with obstacle avoidance are presented. A LIDAR-driven hybrid real-time
navigation approach for the wall defect detection robot is proposed. At the top level, a behavior-
based algorithm generates a global wall-following trajectory for the autonomous wall defect
detection robot. At the bottom level, the Vector Field Histogram (VFH) algorithm uses the
LIDAR sensor information to guide the robot locally, so that it autonomously traverses from one
point to another within the planned wall-following trajectory while avoiding obstacles.

Behavior-Based Algorithm for Straight-Line Wall-Following with Obstacle Avoidance


Two range sensors, i.e., infrared (IR) sensors, are mounted on the sides (left and right) of the
mobile robot to accomplish wall following. The wall is inferred to be a straight-line segment
if no obstacle is detected in front of the mobile robot. Accordingly, the robot follows the
Range_Follow behavior utilizing the developed Range_Follow behavior-based algorithm
(presented below) (Luo et al. 2014). This algorithm ensures that the mobile robot maintains a
constant, desired distance L0 from the detected wall. If the distance measured by the IR
sensors is larger than the desired distance L0, the calculated rotation value is negative and the
robot turns counter-clockwise, toward the wall; conversely, if the measured distance is smaller
than L0, the robot turns clockwise, away from the wall (illustrated in Figure 2).
Behavior Range_Follow
    Rotation ← g × (L0 − d)
    Translation ← c
End Range_Follow
where d is the distance measured by the IR sensors; L0 is the desired (threshold) distance
maintained between the robot and the wall; c is the translation velocity, a positive constant; and
g is the gain of the proportional controller, which must be properly tuned.
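A minimal Python sketch of the Range_Follow proportional controller follows; the numeric values used for L0, g, and c are illustrative, as the paper does not report the tuned values.

```python
def range_follow(d, L0=0.5, g=1.5, c=0.3):
    """One Range_Follow control step (proportional wall-following).

    d  : wall distance measured by the side IR sensor (meters)
    L0 : desired wall distance; g : proportional gain; c : constant
         forward (translation) velocity. Defaults are illustrative only.
    Returns (rotation, translation).
    """
    rotation = g * (L0 - d)  # negative when d > L0 -> turn toward the wall
    translation = c          # constant forward speed
    return rotation, translation
```

For example, with the wall 0.7 m away and L0 = 0.5 m, the rotation command is negative, steering the robot back toward the wall, exactly as described for the d > L0 case above.
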

Behavior-Based Algorithm for Corner Wall-Following with Obstacle Avoidance


In real-world applications, corners, including internal angle, external angle, obtuse angle, and
acute angle corners, are also commonly encountered (Figure 3). Thus, a robot must also be able
to perform the behavior pertinent to each of the four corner types, not only straight wall
segments. The Avoidance_Wall_Following behavior algorithm is suitable for scenarios with
corners (Adib and Masoumi 2017) and is used in this paper. The two IR sensors of the robot
function together to identify the scenario. Four types of behaviors associated with the four
scenarios in Figure 3 can be performed by the robot for obstacle avoidance and wall-following.
In combination with the Range_Follow behavior algorithm for straight walls, the robot is capable
of performing most behaviors of wall-following navigation with obstacle avoidance.
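The four corner types could be distinguished, for instance, by the angle between the two detected wall segments. The paper does not give numeric criteria for this classification, so the thresholds in the following sketch are assumptions for illustration only.

```python
def classify_corner(wall_angle_deg):
    """Classify a corner by the angle between two detected wall segments.

    Maps the angle to the four corner types of Figure 3. The numeric
    thresholds are illustrative assumptions, not the paper's criteria.
    """
    if wall_angle_deg < 90:
        return "acute_angle_corner"
    if wall_angle_deg == 90:
        return "internal_angle_corner"   # a right-angle inside corner
    if wall_angle_deg < 180:
        return "obtuse_angle_corner"
    return "external_angle_corner"       # robot wraps around the wall end
```
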

Figure 2. Wall-following with distance from wall by Range_Follow behavior: (a) d > L0,
(b) d = L0, (c) d < L0.


Figure 3. Wall-following with obstacle avoidance in four behaviors: (a) internal angle
corner, (b) external angle corner, (c) obtuse angle corner, and (d) acute angle corner.
LIDAR-Based Local Navigator and Map Building
In this research, the reactive navigation algorithm used in combination with the wall-following
algorithm is the Vector Field Histogram (VFH), a method widely used in mobile robotics. As an
important consideration for real-time navigation and mapping, the VFH algorithm and many of
its variants are computationally efficient. The VFH navigation model outputs a preferred target
sector for the mobile robot to move towards; the recommended direction is derived from an
analysis of a polar obstacle density histogram constructed from the LIDAR scans of the obstacles
in front of the mobile robot (Luo et al. 2015; Ulrich and Borenstein 1998). In this research, the
aim is to incorporate the wall-following algorithm into the VFH navigation and mapping through
the LIDAR sensor.
The concept of certainty grids, a widely accepted map representation for sensor fusion and
sensor data collection, is used for navigation and mapping. The Virtual Force Field (VFF)
method (Borenstein and Koren 1989) is a widely used method that integrates the underlying idea
of potential fields with certainty grids. The VFF model can plan continuous, smooth, and rapid
collision-free trajectories for autonomous mobile robots with unforeseen-obstacle avoidance.
Although an autonomous mobile robot navigated by the VFF model is capable of traveling more
rapidly and stably (Borenstein and Koren 1989), it is likely to be trapped in local minima. The
VFH method uses the concept of a polar histogram, a kind of intermediate data representation, to
overcome this shortcoming. The VFH algorithm provides mobile robots with a sufficiently
detailed spatial representation of environments with densely cluttered obstacles (Luo et al. 2015;
Borenstein and Koren 1989). The polar histogram in this paper is decomposed into 54
sectors, since each sector spans 5° within the 270° LIDAR range. To carry out the wall defect
detection mission, the best sector to traverse, selected by a weighted formula that combines the
deviation from the desired direction with the associated obstacle density, is then used to guide
the wall defect detection robot. Consequently, starting from the initial point, the autonomous
robot is able to follow walls based on the proposed behavior-based wall-following algorithm
associated with the VFH navigation and mapping model.
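The sector-selection step can be sketched as follows, assuming the 54 five-degree sectors over the 270° LIDAR field of view described above. The weights, density threshold, and exact cost form are illustrative, since the paper does not give its weighted formula.

```python
def best_vfh_sector(densities, target_sector, w_dir=1.0, w_density=5.0,
                    density_threshold=0.3):
    """Pick the sector minimising a weighted cost of (a) deviation from the
    desired direction and (b) polar obstacle density.

    densities     : obstacle density per sector, e.g. 54 sectors of 5 deg
                    covering the 270 deg LIDAR field of view
    target_sector : index of the sector pointing toward the goal
    Weights and threshold are illustrative, not the paper's values.
    """
    # keep only sectors whose obstacle density is low enough to pass through
    candidates = [i for i, d in enumerate(densities) if d < density_threshold]
    if not candidates:
        return None  # no free sector: stop or trigger a recovery behavior
    return min(candidates,
               key=lambda i: w_dir * abs(i - target_sector)
                             + w_density * densities[i])
```

With an obstacle blocking the goal sector, the function steers to an adjacent free sector; with a clear histogram it simply returns the goal sector.
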

SIMULATION AND RESULTS ANALYSIS


The proposed data collection system for automatic wall defect detection was tested by
simulation. The developed hybrid system for real-time concurrent navigation and mapping of an
autonomous wall defect detection robot in unknown environments was tested on the Player/Stage
simulator. In this simulation, the wall defect detection robot is navigated to follow walls in an
indoor room environment populated with sufficient objects of different shapes for the testing
(Figure 4).
The robot starts at the bottom-left corner and follows the wall for wall defect detection. The
simulation results are presented in Figures 4 and 5: the workspace and trajectories with wall-
following and obstacle avoidance at different stages of movement in Figure 4, and the
corresponding built map in Figure 5. The robot is capable of traversing from the initial point and
planning the wall-following route so as to detect wall defects with obstacle avoidance while
building the map with the 270° LIDAR. The yellow portions in Figure 5 are the explored and
detected walls and obstacles, whereas the blue areas represent the map scanned by the 270°
LIDAR sensor. As unknown terrain is explored by the wall defect detection robot, a new map is
dynamically built up with the front LIDAR (Figure 5), and wall-following routes are created by
the behavior-based wall-following algorithm (Figure 4). The local map, in which the entire wall
of the indoor room is sensed, is dynamically built up until the robot returns to the starting point,
at which time the global map is formed. In Figures 4(c) and 5(c), the autonomous robot returns to
the starting point after traversing the entire wall of the indoor room along a collision-free wall-
following trajectory; once the robot reaches the initial point, all the walls have been followed.
The results indicate that the robot is able to follow all walls and corners as well as avoid
obstacles. The obtained trajectory is complete for inspection, with no repeated or redundant
segments.

CONCLUSION AND FUTURE WORK


In comparison with manual inspection, detection of wall defects by autonomous robots
enhances building inspection with reduced labor and improved productivity and inspection
accuracy. However, the development of an effective data collection system, particularly for
automatic wall defect detection, has not been sufficiently investigated. Therefore, a LIDAR-driven
hybrid real-time navigation approach for the wall defect detection robot is developed. The
developed data collection system was tested in simulation at the current stage. Assembly and
construction of the robot are ongoing, and pilot experiments to further evaluate and validate the
performance of the developed data collection system for wall defect inspection are the next step
of this research. Comparing the wall-following algorithms of this paper with other algorithms is
another direction for future work. Cameras and multiple other sensors
can be embedded on the customized autonomous robot(s) to collect and fuse data for defect

detection and analysis with improved accuracy. The developed data collection system with a
further customized autonomous robot also can be applied to the inspection of exterior walls.

Figure 4. Trajectory with wall-following and obstacle avoidance of the wall defect detection
robot in unknown indoor room workspace (a) at the early stage of movement, (b) at the
middle stage of movement, and (c) at the end of movement.

Figure 5. Built map of the wall defect detection robot in unknown indoor room workspace
(a) at the early stage of movement, (b) at the middle stage of movement, and (c) at the end
of movement.

REFERENCES
Adhikari, R. S., Moselhi, O., Bagchi, A., and Rahmatian, A. (2016). “Tracking of Defects in
Reinforced Concrete Bridges Using Digital Images.” J. Comput. Civ. Eng., 30(5), 04016004.
Adib, A., and Masoumi, B. (2017). “Mobile robots navigation in unknown environments by
using fuzzy logic and learning automata.” 7th Conference on Artificial Intelligence and
Robotics, 58–63.
Alencastro, J., Fuertes, A., and de Wilde, P. (2018). “The relationship between quality defects
and the thermal performance of buildings.” Renewable and Sustainable Energy Reviews, 81,
883–894.
Babinec, A., Duchoň, F., Dekan, M., Mikulová, Z. and Jurišica, L. (2018). “Vector Field
Histogram* with look-ahead tree extension dependent on time variable environment.”
Transactions of the Institute of Measurement and Control, 40(4), 1250–1264.

Borenstein, J., and Koren, Y. (1989). “Real-time obstacle avoidance for fast mobile robots.”
IEEE Transactions on Systems, Man, and Cybernetics, 19(5), 1179–1187.
Fox, M., Goodhew, S., and De Wilde, P. (2016). “Building defect detection: External versus
internal thermography.” Building and Environment, 105, 317–331.
Kim, M., Sohn, H., and Chang, C. (2015). “Localization and Quantification of Concrete Spalling
Defects Using Terrestrial Laser Scanning.” J. Comput. Civ. Eng., 29(6), 04014086.
Kim, P., Chen, J. and Cho, Y.K. (2018a). “Autonomous Mobile Robot Localization and Mapping
for Unknown Construction Environments.” ASCE Construction Research Congress, 147-
156.
Kim, P., Chen, J., Kim, J. and Cho, Y.K. (2018b). “SLAM-Driven Intelligent Autonomous
Mobile Robot Navigation for Construction Applications.” Workshop of the European Group
for Intelligent Computing in Engineering, Springer, Cham., 254-269.
Koch, C., Georgieva, K., Kasireddy, V., Akinci, B., and Fieguth, P. (2015). “A review on
computer vision based defect detection and condition assessment of concrete and asphalt
civil infrastructure.” Advanced Engineering Informatics, 29, 196–210.
Leon-Rodriguez, H., Hussain, S. and Sattar, T. (2012). “A compact wall-climbing and surface
adaptation robot for non-destructive testing.” The 12th International Conference on Control,
Automation and Systems (ICCAS), IEEE, 404-409.
Luo, C., Gao, J., Li, X., Mo, H., and Jiang, Q. (2015). “Sensor-based autonomous robot
navigation under unknown environments with grid map representation.” 2014 IEEE
Symposium on Swarm Intelligence, 98–104.
Luo, C., Krishnan, M., Paulik, M., and Fallouh, S. (2014). “An Intelligent Hybrid Behavior
Coordination System for an Autonomous Mobile Robot.” Intelligent Robots and Computer
Vision XXXI: Algorithms and Techniques, 9025W.
Montero, R., Victores, J.G., Martinez, S., Jardón, A. and Balaguer, C. (2015). “Past, present and
future of robotic tunnel inspection.” Automation in Construction, 59, 99-112.
Perilli, S., Sfarra, S., Ambrosini, D., Paoletti, D., Mai, S., Scozzafava, M., and Yao, Y. (2018).
“Combined experimental and computational approach for defect detection in precious walls
built in indoor environments.” Int. J. of Thermal Sciences, 129, 29–46.
Samy, M., Foong, S., Soh, G., and Yeo, K. (2016). “Automatic Optical & Laser-based Defect
Detection and Classification in Brick Masonry Walls.” IEEE Region 10 Conference, 3521–
3524.
Shukla, A. and Karki, H. (2013). “A review of robotics in onshore oil-gas industry.” IEEE
International Conference on Mechatronics and Automation (ICMA), 1153-1160.
Ulrich, I., and Borenstein, J. (1998). “VFH+: Reliable obstacle avoidance for fast mobile robots.”
IEEE International Conference on Robotics and Automation, 2, 1572–1577.
Westin, S., and Sein, M. K. (2014). “Improving Data Quality in Construction Engineering
Projects: An Action Design Research Approach.” J. Mana. Eng., 30(3), 05014003.

Real-Time Scene Segmentation Using a Light Deep Neural Network Architecture for
Autonomous Robot Navigation on Construction Sites
Khashayar Asadi, S.M.ASCE1; Pengyu Chen2; Kevin Han, Ph.D., M.ASCE3; Tianfu Wu4;
and Edgar Lobaton5
1Ph.D. Student, Dept. of Civil, Construction, and Environmental Engineering, North Carolina
State Univ., 2501 Stinson Dr., Raleigh, NC 27606. E-mail: [email protected]
2MSCS Student, Dept. of Computer Science, Columbia Univ. in the City of New York, Mudd
Building, 500 W. 120th St., New York, NY 10027
3Dept. of Civil, Construction, and Environmental Engineering, North Carolina State Univ., 2501
Stinson Dr., Raleigh, NC 27606
4Dept. of Electrical and Computer Engineering, North Carolina State Univ., 890 Oval Dr.,
Raleigh, NC 27606
5Dept. of Electrical and Computer Engineering, North Carolina State Univ., 890 Oval Dr.,
Raleigh, NC 27606

ABSTRACT
Camera-equipped unmanned vehicles (UVs) have received a lot of attention in data
collection for construction monitoring applications. To develop an autonomous platform, the UV
should be able to process multiple modules on an embedded platform. Pixel-wise semantic
segmentation provides a UV with the ability to be contextually aware of its surrounding
environment. However, in the case of mobile robotic systems with limited computing resources,
the large size of the segmentation model and its high memory usage require high computing
resources, which is a major challenge for mobile UVs. To overcome this challenge, this paper
presents a light and efficient deep neural network architecture to run on an embedded platform in
real-time. The proposed model segments navigable space on an image sequence (i.e., a video
stream). The results demonstrate the performance efficiency of the proposed architecture
compared to the existing models and suggest possible improvements that could make the model
even more efficient.

INTRODUCTION
In the past decade, the construction industry has struggled to improve its productivity while
the manufacturing industry has experienced a dramatic increase (Changal, S., Mohammad, A.,
and van Nieuwland 2015; Shakeri et al. 2015). The deficiency of advanced automation in
construction is one possible reason (Asadi and Han 2018). On the other hand, construction
progress monitoring has been recognized as one of the key elements that lead to the success of a
construction project (Asadi et al. 2019b). Although there were various attempts by researchers to
automate construction progress monitoring (Boroujeni and Han 2017; Bosché et al. 2015; Han
and Golparvar-Fard 2017; Kropp et al. 2018), in the present state, the monitoring task is still
performed by site managers through on-site data collection and analysis, which are time-
consuming and prone to errors (Balali et al. 2018; Jeelani et al. 2018; Noghabaei et al. 2019;
Yang et al. 2015). If automated, the time spent on data collection can be better spent by the
project management team, responding to any progress deviations by making timely and effective
decisions. The use of Unmanned ground and aerial Vehicles (UVs) on construction sites has
dramatically grown in the past few years (Asadi et al. 2018a; Ham et al. 2016). This growth can

potentially automate the data collection required for visual data analytics that will automate the
progress inference process from previous studies (Han et al. 2018).
The authors’ previous research on an integrated mobile robotic system (Asadi et al. 2018b)
presents an unmanned ground vehicle (UGV) that runs multiple modules and enables future
development of an autonomous UGV. In (Asadi et al. 2018b), NVIDIA Jetson TX1 boards
(denoted as Jetson boards) (NVIDIA 2017) are used to process simultaneous localization and
mapping, motion planning and control, and context awareness (via semantic image
segmentation) modules of the system. The two major bottlenecks of this platform, in terms of
computational loads, were SLAM and segmentation. For this reason, there was a designated
Jetson board for each of these tasks. This is a major challenge in developing an autonomous
robot because it increases the size and weight, especially with added batteries. Moreover, the
problem becomes even more challenging if this robotic system were to be applied to an
unmanned aerial vehicle (UAV).
This previous work implements ENet (Paszke et al. 2016) as the semantic segmentation
method. The ENet model was designed to run on embedded boards, which makes this model
more applicable to mobile robotic systems. However, the segmentation task performed by this model had the heaviest computational load, which forced the authors to restrict the speed of the UGV to maintain real-time performance. Combining the SLAM and Context-Awareness Modules on the same Jetson TX1 would help mitigate this problem, but the large size of the segmentation model and its high memory usage make this solution practically infeasible. The major goal of this paper is to propose a deep convolutional neural network (CNN) that reduces the computational load (model size) in order to reduce latency by running multiple modules on the same Jetson board while maintaining accuracy.

Figure 1. An overview of the proposed model for real-time navigable space segmentation.

METHOD
Real-time semantic segmentation algorithms are used to understand the content of images and find target objects in real time, which are crucial tasks in mobile robotic systems. In this section, a new convolutional neural network architecture is introduced, along with the strategies and techniques for its training, compression, and data processing. The Adam optimizer (Kingma and Ba 2014) is used to train the model, and cross entropy is utilized as the loss function. Also, a new pixel-level annotated dataset has been generated and validated for real-time and mobile semantic segmentation in a construction engineering environment. Figure 1 shows an overview of the


components of the proposed algorithm. In the following, the factorized convolution block, the core block on which the proposed model is built, is first described. The network structure is then illustrated, followed by a description of the model compression method.

Factorized Convolution Block


To optimize the number of parameters in a CNN model while maintaining its performance, reduce the model size, and run the model faster, Depthwise Separable Convolution from MobileNet (Howard et al. 2017) is applied. Depthwise Separable Convolution is a form of factorized convolution that factorizes a standard convolution into two separate operations: a depthwise convolution and a point-wise convolution. While a standard convolution operates over every channel of the input feature map, Depthwise Separable Convolution applies a single filter to each channel and uses a point-wise convolution (a 1×1 kernel convolution) to linearly combine the outputs. For a standard convolution layer whose kernel size is K, it takes a feature map of size Hin × Win × Cin as input, where Hin, Win, and Cin are height, width, and channels respectively, and outputs a feature map of size Hout × Wout × Cout. The computational cost is computed using Equation 1.
K × K × Cin × Cout × Hin × Win (1)
Equation 2 shows the computational cost for depthwise separable convolution, considering the same feature map.
K × K × Cin × Hin × Win + Cin × Cout × Hin × Win (2)
By transforming the original convolution operation into depthwise separable convolution, the reduction in computation cost is calculated by Equation 3.
(K × K × Cin × Hin × Win + Cin × Cout × Hin × Win) / (K × K × Cin × Cout × Hin × Win) = 1/Cout + 1/K² (3)
In the proposed model, depthwise separable convolutional layers replace the standard convolution layers of a CNN model, which reduces the computational cost. For instance, when a convolutional layer with 3×3 filters and 64 output channels is substituted with a depthwise separable convolutional layer, the computational cost ratio is about 0.13, which means that the model uses almost 8 times less computation than with standard convolutions (see Equation 4). This reduction in computation results in a negligible reduction in the model's accuracy (see EXPERIMENTAL SETUP AND RESULTS section).
1/Cout + 1/K² = 1/64 + 1/9 ≈ 0.13 (4)
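The cost formulas above can be checked numerically with a short standalone script (illustrative only, not part of the paper's implementation; the example dimensions are arbitrary):

```python
def standard_conv_cost(K, Cin, Cout, H, W):
    # Cost of a standard convolution: K * K * Cin * Cout * H * W (Equation 1)
    return K * K * Cin * Cout * H * W

def separable_conv_cost(K, Cin, Cout, H, W):
    # Depthwise pass plus point-wise (1x1) pass (Equation 2)
    return K * K * Cin * H * W + Cin * Cout * H * W

def reduction_ratio(K, Cin, Cout, H, W):
    # Equation 3; algebraically equal to 1/Cout + 1/K**2,
    # independent of Cin and the spatial dimensions.
    return separable_conv_cost(K, Cin, Cout, H, W) / standard_conv_cost(K, Cin, Cout, H, W)

# Values from the paper: 3x3 filters, 64 output channels.
ratio = reduction_ratio(K=3, Cin=64, Cout=64, H=128, W=64)
print(round(ratio, 2))  # 0.13 -> roughly 8 times less computation
```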
By utilizing the strength of depthwise separable convolution, we propose a new residual block
based on the traditional residual block. The block has two branches. One branch consists of two
1×1 convolutional layers (the first one is for projection and reducing the dimensionality which is
located before the depthwise separable convolution layer and the second one is placed afterward
which is for expansion), a depthwise convolutional layer, and a batch normalization layer at the
end. The other one is a shortcut branch which outputs an identical feature map as its input. If the
type of the block is Downsample, a MaxPooling layer is added to the shortcut branch. At the end
of the block, the outputs from the two branches are element-wise added. The structure of the
block is shown in Figure 2.
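For illustration, the depthwise-then-pointwise operation at the heart of the block can be sketched in NumPy as follows (a simplified stand-in, without the projection/expansion layers, batch normalization, or shortcut branch described above; all shapes are arbitrary examples):

```python
import numpy as np

def depthwise_separable_conv(x, dw_filters, pw_weights):
    """x: (H, W, Cin) feature map; dw_filters: (K, K, Cin), one KxK filter
    per input channel; pw_weights: (Cin, Cout) point-wise 1x1 weights."""
    H, W, Cin = x.shape
    K = dw_filters.shape[0]
    pad = K // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    dw = np.empty((H, W, Cin))
    for i in range(H):  # depthwise: filter each channel separately
        for j in range(W):
            dw[i, j] = (xp[i:i + K, j:j + K, :] * dw_filters).sum(axis=(0, 1))
    return dw @ pw_weights  # pointwise: linearly combine channels

x = np.random.rand(8, 8, 4)
out = depthwise_separable_conv(x, np.random.rand(3, 3, 4), np.random.rand(4, 16))
print(out.shape)  # (8, 8, 16)
```

A framework layer such as Keras's SeparableConv2D performs the same factorization with optimized kernels.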


Figure 2. Factorized Convolutional Block

Architecture Design
The proposed model is lightweight and has been optimized for mobile devices and real-time semantic segmentation applications. The model has an encoder-decoder architecture, which consists of several Factorized Convolution Blocks. The architecture of the network is presented in Table 1. In the first (Initial) block, following the ENet architecture, the output of a 3×3 convolutional layer with a stride of two is concatenated with the output of a MaxPooling layer to decrease the feature size at an early stage. The Downsample and Standard blocks follow the structure illustrated in the Factorized Convolution Block part. As for the Upsample and LastConv blocks, the same configurations as in ENet are applied.

Model Compressing
A well-trained CNN model usually has significant redundancy among different filters and
feature channels. Therefore, an efficient way to cut down both computational cost and model size
is to compress the model (Li et al. 2016). The L1 norm of a filter is used to evaluate the filter's efficiency. The L1 norm of a vector x is calculated using Equation 5, where xi represents each value of the kernel.
‖x‖1 = Σ |xi| (5)
If a filter's L1 norm is close to zero, it generates negligible output, because the values of its output feature map tend to be close to zero. Removing such filters reduces the computational expense and model size. Network compression consists of the following steps: for each filter, the L1 value is calculated, and the filters are sorted by their L1 values; then, the weakest filters and their corresponding feature maps are removed and the whole network is fine-tuned. However, it is hard to predict how many filters, and which filters, can be removed without damaging the performance. By calculating and sorting the L1 values of every filter in the model, it was found that the layers with 128 filters (denoted as 128-kernels) generally have smaller L1 values and contain the largest number of redundant filters. The 128-kernels are less sensitive to changes compared to other kernels; therefore, removing filters from the 128-kernels has a negligible impact on the performance, and these filters can be removed without harming the accuracy. As shown in Table 1, half of the filters of the 128-kernels are removed (indicated with * in the table) and the model is fine-tuned on the same dataset.
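The pruning criterion can be sketched as follows (a hypothetical standalone example; the actual compression was applied to the trained model and followed by fine-tuning, which is not shown):

```python
import numpy as np

def rank_filters_by_l1(conv_weights):
    """conv_weights: (K, K, Cin, Cout) kernel tensor. Returns filter indices
    sorted by ascending L1 norm (Equation 5), weakest filters first."""
    l1 = np.abs(conv_weights).sum(axis=(0, 1, 2))  # one L1 value per output filter
    return np.argsort(l1)

def prune_half(conv_weights):
    """Drop the half of the filters with the smallest L1 norms, as done for
    the 128-kernel layers marked with * in Table 1."""
    order = rank_filters_by_l1(conv_weights)
    keep = np.sort(order[conv_weights.shape[-1] // 2:])  # keep the strongest half
    return conv_weights[..., keep], keep

w = np.random.randn(3, 3, 16, 128)
pruned, kept = prune_half(w)
print(pruned.shape[-1])  # 64
```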

EXPERIMENTAL SETUP AND RESULTS


Dataset Construction and Training Strategy
A new dataset has been constructed for the proposed model. The main objective of our
classification task is to segment all objects from navigable spaces in the scene, which is why the


proposed model has only two classes. To segment objects even in new environments, the data is collected from three completely different environments: construction sites, parking lots, and roads (1000 images). The Cityscapes public dataset (Cordts et al. 2016) is also used in the training and testing process. This dataset helps prevent over-fitting of the proposed model; therefore, it was used for training and cross-validation. Cityscapes has 5,000 frames with high-quality pixel-level annotations. As it has 30 classes, these labels are grouped into 2 classes for our task (ground and not ground).
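The label grouping can be sketched as follows (the set of IDs mapped to "ground" is an assumption for illustration; the paper does not list the exact Cityscapes IDs it uses):

```python
import numpy as np

# Hypothetical set of Cityscapes label IDs treated as navigable "ground"
# (e.g., road and sidewalk); every remaining ID becomes "not ground".
GROUND_IDS = {7, 8}  # assumed IDs, for illustration only
GROUND, NOT_GROUND = 1, 0

def to_binary_labels(label_map):
    """label_map: (H, W) array of Cityscapes class IDs -> binary ground mask."""
    binary = np.full(label_map.shape, NOT_GROUND, dtype=np.uint8)
    for gid in GROUND_IDS:
        binary[label_map == gid] = GROUND
    return binary

labels = np.array([[7, 11], [8, 23]])
print(to_binary_labels(labels))  # ground pixels -> 1, everything else -> 0
```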

Table 1. Model Architecture. Output sizes are given for an example RGB input of 512×256.

Figure 3. Examples of proposed dataset images (top) and corresponding labels (bottom).


The TensorFlow (Google 2015) and Keras frameworks are used to implement the algorithm. CUDA (Nickolls et al. 2008) and cuDNN (Chetlur et al. 2014) are also utilized to accelerate the computations. The whole training process took about 12 hours. Since Asadi et al. (Asadi et al. 2018b) have already validated the ENet implementation on the embedded platform in real time, this paper focuses on comparing the performance of ENet and the proposed model on a server with the following specification: 128 GB RAM, Intel Xeon E5 processor, and two GPUs (NVIDIA Tesla K40c and Tesla K20c). Figure 3 shows sample images (top) and their
corresponding labels (bottom).


Test Results
For testing and evaluation, 150 images are used from three scenes, including a parking lot,
road, and construction site. Due to the differences among the three scenes, the model's performance varies and is highly related to the complexity of the images and the categories of objects. The test results are shown in Table 2.

Table 2. Proposed vs ENet Accuracy in Test Data

Model Comparison
The results show that only 2% of accuracy is lost, while the complexity of the network is greatly reduced by halving the number of parameters and feature layers. The model size drops from 2225 KB to 1068 KB. The comparison between the two models is shown in Table 3. The inference time is directly related to the computation cost of the network. The proposed model's inference time decreased by only 18%, although the number of parameters decreased significantly (see Table 3). There are two tasks performed by the GPU: creating the kernel and evaluating the model. The proposed model only decreases the evaluation time, which is why the inference time does not decrease at the same rate. In other words, the inference time depends greatly on computation factors other than the evaluation of the neural network model (i.e., the cost of launching the GPU kernel starts to dominate the computation time).
The average inference time has been reduced from 44.7 ms to 36.5 ms per frame. This reduction means that the maximum input frame rate for real-time performance can be increased by 5 fps (from 22 fps for ENet to 27 fps for the proposed model). Although the proposed segmentation model has a lower inference time compared to ENet, the main contribution of the proposed model is the reduction in model size (more than 50%). This reduction enables multiple modules to be run on the same Jetson TX1, which will reduce the latency caused by integrating multiple Jetson TX1s through the wired network.

CONCLUSION
This paper presents an efficient semantic segmentation model that can run in real time on multiple embedded platforms integrated as a system for navigable space segmentation. The main contributions of this paper are 1) a new pixel-level annotated dataset for real-time and mobile semantic segmentation in construction environments, combined with transfer learning to deal with the limited amount of training data, and 2) an efficient semantic segmentation method with a smaller model size and faster inference speed for future development of autonomous robots on construction sites. Although the focus of this study is on reducing the model size to enable running multiple modules on the same Jetson TX1, the inference time is also decreased, which increases the frame rate of the segmentation process. The 50% reduction in the model size is a significant contribution, which enables multiple modules to be combined and run on the same Jetson TX1. By doing this, the latency caused by integrating


multiple Jetson TX1 boards through the wired network will be drastically reduced (Asadi et al. 2019a).

Table 3. Proposed Model vs ENet

REFERENCES
Asadi, K., and Han, K. (2018). “Real-Time Image-to-BIM Registration Using Perspective
Alignment for Automated Construction Monitoring.” Construction Research Congress 2018,
American Society of Civil Engineers, Reston, VA, 388–397.
Asadi, K., Jain, R., Qin, Z., Sun, M., Noghabaei, M., Cole, J., Han, K., and Lobaton, E. (2019a).
“Vision-based Obstacle Removal System for Autonomous Ground Vehicles Using a Robotic
Arm.” Computing in Civil Engineering 2019.
Asadi, K., Ramshankar, H., Noghabaee, M., and Han, K. (2019b). “Real-time Image
Localization and Registration with BIM Using Perspective Alignment for Indoor Monitoring
of Construction.” Journal of Computing in Civil Engineering.
Asadi, K., Ramshankar, H., Pullagurla, H., Bhandare, A., Shanbhag, S., Mehta, P., Kundu, S.,
Han, K., Lobaton, E., and Wu, T. (2018a). “Building an Integrated Mobile Robotic System
for Real-Time Applications in Construction.” arXiv preprint arXiv:1803.01745.
Asadi, K., Ramshankar, H., Pullagurla, H., Bhandare, A., Shanbhag, S., Mehta, P., Kundu, S.,
Han, K., Lobaton, E., and Wu, T. (2018b). “Vision-based integrated mobile robotic system
for real-time applications in construction.” Automation in Construction, 96.
Balali, V., Noghabaei, M., Heydarian, A., and Han, K. (2018). “Improved Stakeholder
Communication and Visualizations: Real-Time Interaction and Cost Estimation within
Immersive Virtual Environments.” Construction Research Congress 2018, American Society
of Civil Engineers, Reston, VA, 522–530.
Boroujeni, K. A., and Han, K. (2017). “Perspective-Based Image-to-BIM Alignment for
Automated Visual Data Collection and Construction Performance Monitoring.” Congress on
Computing in Civil Engineering, Proceedings, 171–178.
Bosché, F., Ahmed, M., Turkan, Y., Haas, C. T., and Haas, R. (2015). “The value of integrating
Scan-to-BIM and Scan-vs-BIM techniques for construction monitoring using laser scanning
and BIM: The case of cylindrical MEP components.” Automation in Construction, Elsevier,
49, 201–213.
Changal, S., Mohammad, A., and van Nieuwland, M. (2015). “The construction productivity
imperative: How to build megaprojects better.” McKinsey&Company,
<http://www.mckinsey.com/industries/capital-projects-and-infrastructure/our-insights/the-
construction-productivity-imperative> (Aug. 30, 2017).
Chetlur, S., Woolley, C., Vandermersch, P., Cohen, J., Tran, J., Catanzaro, B., and Shelhamer, E. (2014). “cuDNN: Efficient Primitives for Deep Learning.” arXiv preprint arXiv:1410.0759.
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016). “The Cityscapes Dataset for Semantic Urban Scene Understanding.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).


Google. (2015). “TensorFlow.” <https://www.tensorflow.org/> (Nov. 29, 2018).
Ham, Y., Han, K. K., Lin, J. J., and Golparvar-Fard, M. (2016). “Visual monitoring of civil
infrastructure systems via camera-equipped Unmanned Aerial Vehicles (UAVs): a review of
related works.” Visualization in Engineering, Springer International Publishing, 4(1), 1.
Han, K., Degol, J., and Golparvar-Fard, M. (2018). “Geometry- and Appearance-Based
Reasoning of Construction Progress Monitoring.” Journal of Construction Engineering and
Management, 144(2), 04017110.
Han, K. K., and Golparvar-Fard, M. (2017). “Potential of big visual data and building
information modeling for construction performance analytics: An exploratory study.”
Automation in Construction, Elsevier B.V., 73, 184–198.
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M.,
and Adam, H. (2017). “Mobilenets: Efficient convolutional neural networks for mobile
vision applications.” arXiv preprint arXiv:1704.04861.
Jeelani, I., Han, K., and Albert, A. (2018). “Automating and scaling personalized safety training
using eye-tracking data.” Automation in Construction, Elsevier, 93, 63–77.
Kingma, D. P., and Ba, J. (2014). “Adam: A Method for Stochastic Optimization.” arXiv preprint arXiv:1412.6980.
Kropp, C., Koch, C., and König, M. (2018). “Interior construction state recognition with 4D BIM
registered image sequences.” Automation in Construction, Elsevier, 86, 11–32.
Li, H., Kadav, A., Durdanovic, I., Samet, H., and Graf, H. P. (2016). “Pruning Filters for Efficient ConvNets.” arXiv preprint arXiv:1608.08710.
Nickolls, J., Buck, I., Garland, M., and Skadron, K. (2008). “Scalable parallel programming with
CUDA.” ACM SIGGRAPH 2008 classes on - SIGGRAPH ’08, ACM Press, New York, New
York, USA, 1.
Noghabaei, M., Asadi, K., and Han, K. (2019). “Virtual Manipulation in an Immersive Virtual
Environment: Simulation of Virtual Assembly.” Computing in Civil Engineering 2019.
NVIDIA. (2017). “Unleash Your Potential with the Jetson TX1 Developer Kit | NVIDIA
Developer.” <https://developer.nvidia.com/embedded/buy/jetson-tx1-devkit> (Nov. 29,
2018).
Paszke, A., Chaurasia, A., Kim, S., and Culurciello, E. (2016). “ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.” arXiv preprint arXiv:1606.02147.
Shakeri, I., Boroujeni, K. A., and Hassani, H. (2015). “Lean construction: From theory to practice.” International Journal of Academic Research, 7(1).
Yang, J., Park, M.-W., Vela, P. A., and Golparvar-Fard, M. (2015). “Construction performance
monitoring via still images, time-lapse photos, and video streams: Now, tomorrow, and the
future.” Advanced Engineering Informatics, Elsevier, 29(2), 211–224.


Vision-Based Obstacle Removal System for Autonomous Ground Vehicles Using a Robotic
Arm
Khashayar Asadi, S.M.ASCE1; Rahul Jain2 ; Ziqian Qin3 ; Mingda Sun4 ;
Mojtaba Noghabaei, S.M.ASCE5; Jeremy Cole6 ; Kevin Han, Ph.D., M.ASCE7;
and Edgar Lobaton, Ph.D.8
1Dept. of Civil, Construction, and Environmental Engineering, North Carolina State Univ., 2501 Stinson Dr., Raleigh, NC 27606. E-mail: [email protected]
2Dept. of Civil and Environmental Engineering, Indian Institute of Technology Patna, Bihta, Patna, Bihar 801103, India
3College of Computer Science and Technology, Zhejiang Univ., 38 Zheda Rd., Hangzhou, China
4College of Computer Science and Technology, Zhejiang Univ., 38 Zheda Rd., Hangzhou, China
5Dept. of Civil, Construction, and Environmental Engineering, North Carolina State Univ., 2501 Stinson Dr., Raleigh, NC 27606
6Dept. of Electrical and Computer Engineering, North Carolina State Univ., 890 Oval Dr., Raleigh, NC 27606
7Dept. of Civil, Construction, and Environmental Engineering, North Carolina State Univ., 2501 Stinson Dr., Raleigh, NC 27606
8Dept. of Electrical and Computer Engineering, North Carolina State Univ., 890 Oval Dr., Raleigh, NC 27606

ABSTRACT
The use of camera-equipped robotic platforms for data collection and visual monitoring applications is growing exponentially. Cluttered construction sites with many objects on the ground are challenging environments for a mobile unmanned ground vehicle (UGV) to navigate.
This study presents a mobile UGV equipped with a stereo camera and a robotic arm that can
remove obstacles along the UGV’s path. To achieve this, the surrounding environment is
captured by the stereo camera and obstacles are detected. The obstacle’s relative location to the
UGV is sent to the robotic arm module through robot operating system (ROS). Then, the robotic
arm picks up and removes the obstacle. The proposed method will greatly enhance the degree of
automation and frequency of data collection for construction monitoring. The results successfully
demonstrate the detection and removal of obstacles, serving as one of the enabling factors for
developing an autonomous UGV with various construction operating applications.

INTRODUCTION
The number of applications of automated platforms working alongside human workers on construction sites has grown exponentially in the past few years. They are used for various
activities, including floor cleaning (Prabakaran et al. 2018), wall building (Gosselin et al. 2016;
Yu et al. 2009), wall painting (Sorour et al. 2011), data collection (Asadi et al. 2018a, 2019b;
Asadi and Han 2018; Boroujeni and Han 2017), constructibility assessment (Balali et al. 2018;
Noghabaei et al. 2019), safety monitoring (Jeelani et al. 2018, 2019b; a), and inspection
(Menendez et al. 2018). However, the implementation of robots on construction sites is still
limited. As more cost-effective applications are found, their use in practice will increase (Shakeri
et al. 2015). Repetitive building tasks have been a target for automation studies (García de Soto
et al. 2018). These studies mostly benefit from a robotic arm to handle different tasks.


(Sorour et al. 2011) developed a lightweight autonomous robotic arm that was capable of painting vertical walls. However, the robotic arm had only two degrees of freedom (DoF). Therefore, to increase the dimensional capability of robotic arms, (Gosselin et al. 2016) developed a robotic arm with six degrees of freedom to deposit construction material (e.g., cement mortar) layer by layer to build a multifunctional concrete wall. Recently, (García de Soto et al. 2018) developed a robotic system capable of building a curved concrete wall with a mesh mold using digital fabrication techniques. Likewise, (Lublasser et al. 2018) developed a robot that applies foam concrete to wall surfaces to achieve a facade finish.
Many activities during the construction process require lifting a three-dimensional object and placing it at a different location. The above-mentioned studies can help during the construction stage in many ways; however, these robots do not have the capability of picking up and placing three-dimensional objects. Therefore, many research studies have tried to develop a robotic system that can perform such activities during the construction stage. (Skotheim et al. 2012) presented a stationary robotic system that scans and localizes workpieces using a laser triangulation sensor for picking and placing operations. Likewise, (Furrer et al. 2017) conducted a study on a stationary robotic arm that utilizes irregular materials found on-site for autonomous construction. They demonstrated that the robot was able to form a tower from the detected objects by stacking them on top of each other. The main limitation of this method was the necessity of an offline step for scanning the objects and their geometry, which limits the robot's capability in dynamic scenes with rapid changes such as construction sites.
To address this limitation, the objective of the study presented in this paper is to develop a robotic system that uses a Kinova Jaco arm (KINOVA 2008) to automatically grab different types of objects (e.g., pipes and bricks in this study) and put them elsewhere. This automated task has different applications within the construction industry. One example is removing obstacles along the UGV's path, so that the robot does not have to deviate to a longer path during data collection along a predetermined route. Another example is material handling, moving a pile of objects from one location to another, which is a repetitive task that can be automated. To achieve this goal, a mobile robotic platform that builds on the authors' previous study is used. The platform is built on Clearpath's Husky mobile robotic platform (“ClearPathRobotics:Husky” 2018). A laptop is used as a processing unit. A stereo camera is used as a visual sensor to provide depth information alongside the RGB images (i.e., images with red, green, and blue color channels). For ease of data exchange among the various modules (Control, Context Awareness, Geometry Descriptor, and Robotic Arm), the Robot Operating System (ROS) is used (Quigley et al. 2009).

METHOD
This section describes the proposed vision-based obstacle removal system. Figure 1
illustrates the integration among multiple modules. The Context Awareness Module receives
images from a ZED stereo camera (StereoLabs 2018). A scene segmentation scheme processes
the images and detects objects of interest (bricks and pipes) in the image by creating a mask
around them. The segmented images and their corresponding depth information are inputs for the
Geometry Descriptor Module. This module calculates the coordinates of the mask's center with respect to the arm coordinate system and compares them with the predetermined range that the robotic arm can access. If the object is in that range, a command is sent to the Control Module to stop the robot, and then the mask's center coordinates, along with the mask's direction, are sent to the robotic arm. The arm can grab objects in its range with the known location (the coordinates of the mask center) and orientation (the mask's direction). Depending on the application, the grabbed object is then placed at another location. In this study, a joystick is used to send control
commands to the Raspberry Pi (Control Module) that is connected to the robot platform. The
main contribution of this approach is to integrate the stereo camera, context awareness, control,
and robotic arm modules for developing an autonomous obstacle removal system. The
capabilities of these modules are detailed in the following subsections.

Figure 1. An overview of the proposed vision-based obstacle removal system

ZED Stereo Camera


The stereo camera provides the Context Awareness Module with RGB-D images (i.e., images with red, green, and blue color channels and the related depth value for each pixel). Depth values are with respect to the camera coordinate system. To transfer all the values to the arm coordinate system, a rigid transformation consisting of a translation and a rotation is calculated. This transformation is calculated once during the system setup. Equation 1 shows the transformation, where PA and PC are the same physical point, described in the arm and camera coordinate systems respectively. tCA is a translation vector between the arm origin and the origin of the camera coordinate system, and RCA is the 3×3 rotation matrix of the camera axes with respect to the arm axes.
PA = RCA × PC + tCA (1)
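Equation 1 amounts to a few lines of NumPy (R_ca and t_ca below are placeholder values; the real calibration is measured once during system setup):

```python
import numpy as np

def camera_to_arm(p_camera, R_ca, t_ca):
    """Transform a 3D point from camera coordinates to arm coordinates
    using Equation 1: PA = RCA * PC + tCA."""
    return R_ca @ p_camera + t_ca

# Placeholder calibration: a 90-degree rotation about z plus a translation.
R_ca = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])
t_ca = np.array([0.10, 0.00, 0.25])

p_arm = camera_to_arm(np.array([0.5, 0.2, 1.0]), R_ca, t_ca)
print(p_arm)  # the same point expressed in the arm coordinate system
```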
The camera is placed on top of the robot platform with a fixed top-down view. Fixing the camera in a top-down view has the following advantages: 1) the depth values are more accurate compared to any other camera angle, because all the objects in the top-down view are less than 1.5 meters away from the camera; 2) this view includes only objects that are close enough to the arm to be picked up, which prevents the unnecessary processing of objects that are out of the arm's range; and 3) the transformation matrix is calculated more easily and more accurately compared to a camera with an arbitrary orientation.

Context Awareness
The Context Awareness Module receives images and corresponding depth information from
the ZED stereo camera. The segmentation model proposed by (Asadi et al. 2019a) is used as a
pixel-wise semantic segmentation method. This model takes pixel-wise labeled information as
input for training. In the current study, the authors have collected and labeled about 1000 image
frames. The following three classes have been chosen, as shown in Figure 2: brick, pipe, and
unlabeled. To increase the number of training images and prevent overfitting, label-preserving data augmentation similar to (Krizhevsky et al. 2012) is used. Data augmentation methods such as flipping, random color jitter, and random Gaussian noise are applied to generate 32 new images from each image.
The first step involves training of the encoder part, which provides a label map of size 64×32. In
the second step, the decoder is trained with the encoder to upscale the intermediate map to full
image dimensions. With pixel-wise semantic segmentation, the object of interest can be easily
detected from the resulting segmented image (see Figure 2).
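The augmentation step can be sketched roughly as follows (the jitter and noise magnitudes are assumed values, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, n_variants=32):
    """Generate label-preserving variants of an (H, W, 3) float image in [0, 1]
    via flipping, random color jitter, and random Gaussian noise. For flips,
    the corresponding label map must be flipped identically (not shown)."""
    variants = []
    for _ in range(n_variants):
        img = image.copy()
        if rng.random() < 0.5:
            img = img[:, ::-1, :]                          # horizontal flip
        img = img * rng.uniform(0.8, 1.2, size=(1, 1, 3))  # per-channel color jitter
        img = img + rng.normal(0.0, 0.02, size=img.shape)  # Gaussian noise
        variants.append(np.clip(img, 0.0, 1.0))
    return variants

image = rng.random((64, 64, 3))
print(len(augment(image)))  # 32
```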

Figure 2. Example of an image received from the left lens of the ZED camera (left) and the resulting segmented image (right)
Geometry Descriptor
The Geometry Descriptor Module calculates the coordinate of the detected mask's center (see
(xc, yc, zc) in the left image of Figure 3) and its direction with respect to the arm coordinate
system. This information is necessary for the arm to grab the object properly. The calculated
mask center is compared with the predetermined range that the robot is able to reach. If the
object is in the range, a command is sent to the Raspberry Pi to stop the robot. Then, the
direction of the mask is calculated based on the depth values of the detected mask. For this
purpose, the longest edge of the object is determined and its slope is calculated (see the red line
in Figure 3). This slope represents the direction of the object with respect to the arm coordinate
system. The arm's hand is supposed to reach the top of the object in parallel to this direction (to
be further detailed in the following subsection).
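One way to approximate the mask's center and dominant direction is sketched below; note that the paper computes the slope of the object's longest edge, while this sketch uses the mask's principal axis as a simplified stand-in:

```python
import numpy as np

def mask_center_and_direction(mask):
    """mask: (H, W) boolean array for one detected object.
    Returns the centroid (row, col) and the angle (radians) of the mask's
    principal axis, a proxy for the slope of its longest edge."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    # Principal axis from the covariance of the mask pixel coordinates.
    cov = np.cov(np.stack([xs - cx, ys - cy]))
    eigvals, eigvecs = np.linalg.eigh(cov)
    vx, vy = eigvecs[:, np.argmax(eigvals)]  # eigenvector of largest eigenvalue
    return (cy, cx), np.arctan2(vy, vx)

mask = np.zeros((20, 20), dtype=bool)
mask[8:12, 2:18] = True  # an elongated, horizontally oriented "brick"
center, angle = mask_center_and_direction(mask)
print(center, angle)
```

In practice an OpenCV routine such as a minimum-area rectangle fit would give the edge-based orientation directly.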

Figure 3. The mask's center (black dot) and direction (red line) are calculated. With this information, the arm can grab the object parallel to the object's longest edge (right image).

Robotic Arm
Kinova Jaco robotic arm (KINOVA 2008) with six DoF is used for this experiment. As


previously mentioned, the arm is able to grab objects within a predetermined range. This range is calculated based on the arm's constraints (Palacios 2015) in different directions. The Robotic Arm Module receives the location of an object (the center of the detected mask) that is within the arm's range, along with the object's direction (i.e., the slope of the longest edge of the mask), and then moves to reach the top of the object with a proper pose. The arm movement follows a specific sequence of motions. The first motion starts from an initial pose (the arm's home position). The arm moves to position itself on top of the object. In the next movement, it goes down to reach the object and grab it. Finally, the arm handles the object and releases it in another location (depending on the application). The average time for the proposed system to grab an object is almost 20 seconds. The response times for the Context Awareness and Geometry Descriptor Modules are negligible (less than a second).

EXPERIMENTAL SETUP AND RESULTS


The proposed system was tested with 10 objects from two classes (five bricks and five pipes). These objects varied in size and in their directions in the scene. To test the system under different lighting conditions, the experiment was held in an indoor environment. The segmentation model ran at a speed of 21 fps for an input image size of 512×256 on a laptop with the following specification: 16 GB DDR3 RAM, Intel Core i7-4710HQ quad-core Haswell processor, and NVIDIA GeForce GTX 960M. The system grabbed seven objects successfully and failed to grab one pipe and two bricks. Table 1 shows the failed cases alongside the modules that caused the failures.

Table 1. Causes of failures

Since the proposed method uses a ZED stereo camera for estimating depth values, accurate depth estimation is vital. However, in the first failure, the estimated location of the brick had an error of 2 cm. By processing the output values from each module, it could be observed that the cause of this error was an inaccurately estimated depth from the camera.

Figure 4. Example of the imperfect mask resulting in the second failure.


In the second failure, the Context Awareness Module failed to create a complete mask
around the object (see the left image in Figure 4. This happens because of a noticeable change in

© ASCE
Computing in Civil Engineering 2019 333

the lighting conditions. Consequently, the Geometry Descriptor Module also failed to calculate
the center of the object accurately from the imperfect detected mask. The arm thus reached the
object but failed to grab it with the proper pose, which resulted in a failure (see the right image in
Figure 4). Training the segmentation model with more data under different lighting conditions
can solve this issue.
By processing the last failure, it could be observed that although the information received
from the Camera, Context Awareness, and Geometry Descriptor Modules was all correct, the
arm failed to reach the object by following the predetermined order of motions. For instance,
instead of positioning itself on top of the object and then moving down to grab it, the arm could
reach the object properly if it first moved down slightly and then moved over the top of the
object. Thus, depending on the object's location with respect to the arm's coordinate system,
different picking scenarios can be defined to solve this issue.

CONCLUSION
This paper presents a vision-based obstacle removal system for autonomous ground robots
using a Kinova Jaco robotic arm. A scene segmentation pipeline, integrated with a stereo
camera, is used to detect the objects of interest. The Geometry Descriptor Module then tracks
the location and orientation of the detected objects. This information is sent to the robotic arm,
which moves to the object, grabs it, and places it at a desired location. The system was validated
through an experiment in an indoor environment. By investigating the failure cases, probable
solutions for improving the system's performance were suggested. The proposed system has the
potential to enable computer vision systems for object handling in automated construction
applications.

REFERENCES
Asadi, K., Chen, P., and Han, K. (2019a). “Real-time scene segmentation using a light deep
neural network architecture for autonomous robot navigation on construction sites.”
Computing in Civil Engineering 2019.
Asadi, K., and Han, K. (2018). “Real-Time Image-to-BIM Registration Using Perspective
Alignment for Automated Construction Monitoring.” Construction Research Congress 2018,
American Society of Civil Engineers, Reston, VA, 388–397.
Asadi, K., Ramshankar, H., Noghabaee, M., and Han, K. (2019b). “Real-time Image
Localization and Registration with BIM Using Perspective Alignment for Indoor Monitoring
of Construction.” Journal of Computing in Civil Engineering.
Asadi, K., Ramshankar, H., Pullagurla, H., Bhandare, A., Shanbhag, S., Mehta, P., Kundu, S.,
Han, K., Lobaton, E., and Wu, T. (2018a). “Building an Integrated Mobile Robotic System
for Real-Time Applications in Construction.” arXiv preprint arXiv:1803.01745.
Asadi, K., Ramshankar, H., Pullagurla, H., Bhandare, A., Shanbhag, S., Mehta, P., Kundu, S.,
Han, K., Lobaton, E., and Wu, T. (2018b). “Vision-based integrated mobile robotic system
for real-time applications in construction.” Automation in Construction, 96.
Balali, V., Noghabaei, M., Heydarian, A., and Han, K. (2018). “Improved Stakeholder
Communication and Visualizations: Real-Time Interaction and Cost Estimation within
Immersive Virtual Environments.” Construction Research Congress 2018, American Society
of Civil Engineers, Reston, VA, 522–530.
Boroujeni, K. A., and Han, K. (2017). “Perspective-Based Image-to-BIM Alignment for
Automated Visual Data Collection and Construction Performance Monitoring.” Congress on
Computing in Civil Engineering, Proceedings, 171–178.


“ClearPathRobotics: Husky.” (2018). <https://www.clearpathrobotics.com/husky-unmanned-ground-vehicle-robot/> (Jul. 21, 2017).
Furrer, F., Wermelinger, M., Yoshida, H., Gramazio, F., Kohler, M., Siegwart, R., and Hutter,
M. (2017). “Autonomous robotic stone stacking with online next best object target pose
planning.” 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE,
2350–2356.
García de Soto, B., Agustí-Juan, I., Hunhevicz, J., Joss, S., Graser, K., Habert, G., and Adey, B.
T. (2018). “Productivity of digital fabrication in construction: Cost and time analysis of a
robotically built wall.” Automation in Construction, Elsevier, 92, 297–311.
Gosselin, C., Duballet, R., Roux, P., Gaudillière, N., Dirrenberger, J., and Morel, P. (2016).
“Large-scale 3D printing of ultra-high performance concrete – a new processing route for
architects and builders.” Materials & Design, Elsevier, 100, 102–109.
Jeelani, I., Albert, A., Han, K., and Azevedo, R. (2019a). “Are Visual Search Patterns Predictive
of Hazard Recognition Performance? Empirical Investigation Using Eye-Tracking
Technology.” Journal of Construction Engineering and Management, 145(1), 04018115.
Jeelani, I., Asadi, K., Ramshankar, H., Han, K., and Albert, A. (2019b). “Real-world Mapping of
Gaze Fixations Using Instance Segmentation for Road Construction Safety Applications.”
2019 TRB Annual Meeting.
Jeelani, I., Han, K., and Albert, A. (2018). “Automating and scaling personalized safety training
using eye-tracking data.” Automation in Construction, Elsevier, 93, 63–77.
KINOVA. (2008). “Kinova | Assistive Robots | Jaco Robotic Arms.”
<https://www.kinovarobotics.com/en/products/assistive-technologies/kinova-jaco-assistive-
robotic-arm> (Nov. 29, 2018).
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). “ImageNet Classification with Deep
Convolutional Neural Networks.”
Lublasser, E., Adams, T., Vollpracht, A., and Brell-Cokcan, S. (2018). “Robotic application of
foam concrete onto bare wall elements - Analysis, concept and robotic experiments.”
Automation in Construction, Elsevier, 89, 299–306.
Menendez, E., Victores, J. G., Montero, R., Martínez, S., and Balaguer, C. (2018). “Tunnel
structural inspection and assessment using an autonomous robotic system.” Automation in
Construction, Elsevier, 87, 117–126.
Noghabaei, M., Asadi, K., and Han, K. (2019). “Virtual Manipulation in an Immersive Virtual
Environment: Simulation of Virtual Assembly.” Computing in Civil Engineering 2019.
Palacios, R. H. (2015). “Robotic Arm Manipulation Laboratory With a Six Degree of Freedom
JACO Arm.”
Prabakaran, V., Elara, M. R., Pathmakumar, T., and Nansai, S. (2018). “Floor cleaning robot
with reconfigurable mechanism.” Automation in Construction, Elsevier, 91, 155–165.
Quigley, M., Conley, K., Gerkey, B., Faust, J., Foote, T., Leibs, J., Wheeler, R., and Ng, A. Y.
(2009). “ROS: an open-source Robot Operating System.” ICRA workshop on open source
software, Kobe, Japan, 5.
Shakeri, I., Boroujeni, K. A., and Hassani, H. (2015). “Lean construction: From theory to
practice.” International Journal of Academic Research, 7(1).
Skotheim, O., Lind, M., Ystgaard, P., and Fjerdingen, S. A. (2012). “A flexible 3D object
localization system for industrial part handling.” 2012 IEEE/RSJ International Conference
on Intelligent Robots and Systems, IEEE, 3326–3333.
Sorour, M. T., Abdellatif, M. A., Ramadan, A. A., and Abo-Ismail, A. A. (2011). “Development


of roller-based interior wall painting robot.” World Academy of Science, Engineering and
Technology, 59.
“StereoLabs.” (2018). <https://www.stereolabs.com/zed/> (Nov. 29, 2018).
Yu, S.-N., Ryu, B.-G., Lim, S.-J., Kim, C.-J., Kang, M.-K., and Han, C.-S. (2009). “Feasibility
verification of brick-laying robot using manipulation trajectory and the laying pattern
optimization.” Automation in Construction, Elsevier, 18(5), 644–655.


Planning and Execution for Geometrically Adaptive BIM-Driven Robotized Construction
Processes

Kurt M. Lundeen, Ph.D.1; Vineet R. Kamat, Ph.D.2; Carol C. Menassa, Ph.D.3; and Wes McGee4
1Laboratory for Interactive Visualization in Engineering, Dept. of Civil and Environmental
Engineering, Univ. of Michigan, 2350 Hayward St., 2340 G. G. Brown Building, Ann Arbor, MI
48109-2125. E-mail: [email protected]
2Laboratory for Interactive Visualization in Engineering, Dept. of Civil and Environmental
Engineering, Univ. of Michigan, 2350 Hayward St., 2105 G. G. Brown Building, Ann Arbor, MI
48109-2125. E-mail: [email protected]
3Sustainable and Intelligent Civil Infrastructure Systems Laboratory, Dept. of Civil and
Environmental Engineering, Univ. of Michigan, 2350 Hayward St., 2140 G. G. Brown Building,
Ann Arbor, MI 48109-2125. E-mail: [email protected]
4Digital Fabrication Laboratory, College of Architecture and Urban Planning, Univ. of Michigan,
2000 Bonisteel Blvd., Ann Arbor, MI 48109-2069. E-mail: [email protected]

ABSTRACT
This research explores a means by which a construction robot can leverage its sensors and a
building information model (BIM) to perceive and model the actual geometry of its workpieces,
adapt its work plan, and execute work in a construction environment. The adaptive framework
uses the generalized resolution correlative scan matching (GRCSM) search algorithm for model
registration, a new formulation for fill plan adaptation, and new hardware for robotic material
dispensing. Joint filling is used as a case study to demonstrate the formulation of an adaptive
plan and to evaluate a robot’s ability to perform work when the actual locations and geometries
of its workpieces deviate from their designed counterparts. The robot was found capable of
identifying the true position and orientation of the joint’s center with a mean norm positioning
error of 0.13 mm and orientation error of 1.2°. The adaptive framework offers significant
promise for a range of construction activities, including those involving objects of complex
geometry and precision work.

INTRODUCTION
Despite the potential for robots to help alleviate many of the construction industry’s
problems, the very nature of the industry poses significant challenges for robots. First, the robot
must go to its workpieces instead of having its workpieces brought to it, which produces a
reversed spatial conveyance between the robot and its product. This and the sheer size of
construction projects require the robot to be mobile. Furthermore, construction tolerances are
relatively loose, which tends to result in large tolerance stack-ups. Such factors produce
additional challenges for robots by contributing to pose (i.e., position and orientation)
uncertainty between the robot and a point of interest on its workpiece.
To overcome these spatial uncertainties, a construction robot needs the ability to perceive its
workpieces in situ and adapt its work plan to the workpiece location and geometry encountered.
The research described in this paper builds upon previous work (Lundeen et al. 2017) in which
the authors explored how to enable a construction robot to sense and model the actual pose and
geometry of its workpieces, but that body of research stopped short of addressing the planning
and execution steps needed for the robot to act upon its newly perceived workpiece models. Thus,


the objective of this research is to develop a means by which a robot can adapt its work plan to
its as-perceived workpiece models so it can overcome spatial uncertainties and perform detailed
construction work.
This adaptive framework could be applied to a range of construction activities, but the
authors use joint filling, and in particular caulking, as a case study activity to demonstrate and
evaluate the robot’s adaptive capabilities. The specific objectives of this research are as follows.
• Establish a framework to enable a construction robot to perform geometrically adaptive,
model-driven work.
• Establish a formulation to enable a construction robot to adapt its joint filling work plan
to its as-perceived workpiece models.
• Evaluate a robot’s ability to perform geometrically adaptive, model-driven work using
joint filling as a case study activity.

RELATED WORK
Past research has been conducted into providing construction robots with the ability to adapt
to the circumstances they encounter so they can perform work. However, such studies have
limitations which preclude their methods from being widely adopted.
For example, some studies (Keating et al. 2017; Stentz et al. 1999) have demonstrated
geometrically adaptive capabilities with accuracies on the order of several centimeters, but the
methods used in such studies may not be sufficient for construction tasks with tighter tolerances.
Similarly, other studies (Kahane and Rosenfeld 2004; Wang et al. 2018) have used simple
workpiece models or workpiece identification methods, which might not be very generalizable to
a wide range of tasks or those requiring high-fidelity models. Furthermore, some studies
(Kermorgant 2018; Lussi et al. 2018) have employed model registration techniques that do not
lend well to objects of complex curvature or high geometric complexity, so the methods used in
these studies are unlikely to be suitable for a wide range of object geometries. Other studies have
used different sensing strategies, such as leveraging fiducial markers (Feng et al. 2015) or using
force feedback (Lublasser et al. 2017), but visualization of natural features is anticipated to serve
as the principal sensing modality for most construction robots. Lastly, some studies (Helm et al.
2012; Willmann et al. 2016) have demonstrated noteworthy adaptive construction capabilities,
but have not disseminated the methods used, thereby preventing the results from being
replicated, or the methods adapted or modified.
The research described in this paper attempts to help address such limitations and gaps by
proposing a more generalizable framework to enable construction robots to accurately sense their
workpieces, model their workpieces, adapt their work plans, and perform detailed work using
techniques that are highly accurate and applicable to a wide range of objects.

TECHNICAL APPROACH
The premise for this research is as follows. A construction robot is operating on a
construction site and has been instructed to perform a specific task on specific workpieces. The
robot has access to the project’s Building Information Model (BIM), which contains the as-
designed poses and geometries of the workpieces. The robot reaches the vicinity of its
workpieces via autonomous navigation, teleoperation, or other means. The robot’s jobsite pose
estimate naturally contains error, but the estimate is accurate enough that the robot can aim its
sensor in the direction it expects to find its workpieces and still detect them somewhere in its


field of view.

Workpiece Modeling
After collecting sensor data, the robot must register its workpiece models to the data. To do
so, an aggregated BIM model of the scene is first registered to the sensor data using the
Generalized Resolution Correlative Scan Matching (GRCSM) search algorithm (Lundeen et al.
2017). The scene is then decomposed and individual objects are re-registered independently.
Using these results, the robot is able to form as-perceived models of its workpieces that are
accurate relative to the robot’s local coordinate frame. Readers may refer to (Lundeen et al.
2017) for more details about workpiece sensing and modeling.

Work Planning
Once the robot has formed as-perceived models of its workpieces, it is then able to adapt its
work plan to the models. In the case of filling a square butt joint between two workpieces, the
work plan for an individual joint profile can be formulated as follows, where Figure 1 is used as
a visual reference. Due to space limitations, only select formulas are presented.

Figure 1. Individual profile model for a square butt joint fill plan.

The gap opening’s normal unit vector can be determined as follows, where \hat{t}_{gap,ind} is the gap
opening’s tangent unit vector:

\hat{n}_{gap,ind} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \hat{t}_{gap,ind} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \frac{t_{gap,ind}}{\lVert t_{gap,ind} \rVert}

The gap opening’s 3D rotation matrix can be expressed as follows:

R_{gap,ind} = \begin{bmatrix} \hat{t}_{gap,ind} & \hat{n}_{gap,ind} & \mathbf{0}_{2 \times 1} \\ \mathbf{0}_{1 \times 2} & & 1 \end{bmatrix}

The 3D rigid-body homogeneous transformation that defines the gap’s opening at a particular
cross section can then be expressed as follows:

T_{gap,ind} = \begin{bmatrix} R_{gap,ind} & p_{gap,cntr,ind} \\ 0 \;\; 0 \;\; 0 & 1 \end{bmatrix}

The gap opening’s 3D (6 DOF) pose at a given cross section can be described as follows:

X = \begin{bmatrix} x & y & z & Rot_x & Rot_y & Rot_z \end{bmatrix}
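These profile-level relations can be checked numerically. The following is a minimal Python/NumPy sketch, where the +90° sign convention for rotating the tangent into the normal is an assumption of this illustration, and the tangent and center values are illustrative rather than registration output.

```python
import numpy as np

def profile_gap_frame(t_gap, p_center):
    """Build the homogeneous transform T_gap,ind for one joint profile.

    t_gap:    2D gap-opening tangent vector (need not be unit length)
    p_center: 2D gap center point in the profile plane
    """
    t_hat = np.asarray(t_gap, dtype=float)
    t_hat = t_hat / np.linalg.norm(t_hat)
    # normal = tangent rotated 90 deg in the profile plane (sign assumed)
    n_hat = np.array([[0.0, -1.0], [1.0, 0.0]]) @ t_hat
    R = np.eye(3)            # third axis is the out-of-plane unit vector
    R[:2, 0] = t_hat
    R[:2, 1] = n_hat
    T = np.eye(4)            # homogeneous transform [R p; 0 0 0 1]
    T[:3, :3] = R
    T[:2, 3] = p_center
    return T
```

For example, `profile_gap_frame([3.0, 4.0], [0.010, 0.002])` yields a proper rotation (orthonormal, determinant +1) with the first column along the normalized tangent.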


Planning also occurs at the aggregated profile level. Shown in Figure 2 is an example of an
aggregated set of joint profiles.

Figure 2. Fill planning at the aggregated model level.


Outlier Removal: The first step in planning at the aggregated level is processing the raw
results from the 2D model registration process. The profile pose values for each workpiece are
filtered using a Hampel identifier (Liu et al. 2004) to remove statistical outliers, as shown in
Figure 2. The authors use a central moving window of 7 samples and a threshold of 3 standard
deviations.
Smoothing: Next, each workpiece’s filtered profile pose values are smoothed using a
Savitzky-Golay filter (Savitzky and Golay 1964) with a second degree polynomial, resulting in
the smoothed gap poses and fill profiles shown in Figure 2.
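A minimal sketch of this two-stage filtering (Hampel identifier, then Savitzky-Golay smoothing) in Python, assuming SciPy is available; the window sizes mirror those stated above, while the pose series and outlier are illustrative.

```python
import numpy as np
from scipy.signal import savgol_filter

def hampel(x, window=7, n_sigmas=3.0):
    """Replace samples farther than n_sigmas scaled MADs from the local
    median (central moving window) with that median."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    k = window // 2
    for i in range(len(x)):
        lo, hi = max(0, i - k), min(len(x), i + k + 1)
        med = np.median(x[lo:hi])
        mad = 1.4826 * np.median(np.abs(x[lo:hi] - med))  # ~ local std dev
        if np.abs(x[i] - med) > n_sigmas * mad:
            y[i] = med
    return y

# Illustrative 1-DOF profile pose series with one registration outlier
raw = np.linspace(0.0, 1.0, 51)
raw[25] += 5.0
filtered = hampel(raw, window=7, n_sigmas=3.0)
smoothed = savgol_filter(filtered, window_length=7, polyorder=2)
```

The Hampel stage removes the isolated spike without disturbing inliers, after which the degree-2 Savitzky-Golay filter smooths the remaining series.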
Path Tangency: Next, various characteristics, such as the gap opening’s tangent vector and
cross-sectional fill area, are recalculated for each profile using the smoothed pose values. Spline
interpolation is then used to fit a spline to the gap’s center along the length of the joint. The
authors use a spline with cubic interpolation, uniform spacing, and not-a-knot end conditions (De
Boor 1978). To reorient the gap’s coordinate frames, the spline’s gradient unit vectors, \hat{g}, are
determined for each point along the spline. The gap opening’s normal unit vector in the
aggregated model, \hat{n}_{gap,agg}, is then determined by cross multiplying each gradient unit vector by
the gap opening’s tangent unit vector from the aggregated model. A third orthonormal vector,
\hat{v}_{gap,third,agg}, is constructed by cross multiplying the gap opening’s normal unit vector by the
local gradient unit vector. The gap opening’s 3D rotation matrix is then constructed as follows:

R_{gap,agg} = \begin{bmatrix} \hat{v}_{gap,third,agg} & \hat{n}_{gap,agg} & \hat{g} \end{bmatrix}

The gap opening’s 3D homogeneous transformation matrix is then constructed using the gap
opening’s rotation matrix and center spline points, as follows:

T_{gap,agg} = \begin{bmatrix} R_{gap,agg} & p_{gap,cntr,agg} \\ 0 \;\; 0 \;\; 0 & 1 \end{bmatrix}
An example of a spline and its tangentially oriented coordinate frames can be seen in Figure
2.
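Under the stated choices (cubic spline with not-a-knot ends, frames built from cross products), the aggregated-level construction might look as follows in Python; the center points and gap tangent are illustrative stand-ins, not experimental data.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Illustrative smoothed gap-center points along a 100 mm joint (2 mm spacing)
s = np.linspace(0.0, 0.100, 51)                        # arc parameter (m)
centers = np.column_stack([s, 0.002 * np.sin(40.0 * s), np.zeros_like(s)])

spline = CubicSpline(s, centers)                       # not-a-knot is scipy's default
g = spline(s, 1)                                       # gradient along the spline
g_hat = g / np.linalg.norm(g, axis=1, keepdims=True)

t_hat = np.array([0.0, 0.0, 1.0])                      # gap-opening tangent (illustrative)
frames = []
for ghat, p in zip(g_hat, spline(s)):
    n_hat = np.cross(ghat, t_hat)                      # normal = gradient x tangent
    n_hat /= np.linalg.norm(n_hat)
    v_hat = np.cross(n_hat, ghat)                      # third orthonormal vector
    T = np.eye(4)
    T[:3, :3] = np.column_stack([v_hat, n_hat, ghat])  # R = [v n g]
    T[:3, 3] = p
    frames.append(T)
```

Each resulting frame is tangentially oriented along the spline, matching the coordinate frames sketched in Figure 2.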
Tool Speed: In order to produce the desired differential fill volume at each point along the
path, either the caulk flowrate or the tool’s speed must be varied. For simplicity, the authors
choose to fix the volumetric flowrate and vary the tool’s speed. As such, the following equation
can be used to determine the necessary tool speed along the path, where \dot{V}_{fill} is the volumetric
flowrate of material dispensed and A_{fill}(s) is the desired cross-sectional area of fill material:

v_{tool}(s) = \dot{V}_{fill} / A_{fill}(s)
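Numerically, this inverse relationship means the tool slows down over wider fill cross sections. A quick illustrative check (the flowrate and areas below are assumed values, not from the experiment):

```python
import numpy as np

V_dot_fill = 50.0                          # mm^3/s, fixed dispenser flowrate (assumed)
A_fill = np.array([4.0, 5.0, 4.0, 8.0])    # mm^2, desired fill cross sections
v_tool = V_dot_fill / A_fill               # mm/s along the path
```

Doubling the fill area halves the required tool speed, keeping the deposited volume per unit length on target.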

EXPERIMENT
An experiment was conducted to assess a construction robot’s ability to perform adaptive
work using these geometrically adaptive methods. In the experiment, the robot was provided a
BIM model containing the as-designed poses and geometries of two abutted workpieces, and the
robot was instructed to sense, plan, and adaptively caulk the joint between them. The robot was
expecting a straight, uniform joint with a 4 mm gap width, but the authors presented it with a
joint of unexpected geometry to see if it could adapt to the actual geometry it encountered.
Two Kuka KR120 robot arms were used in the experiment. A caulk dispensing tool was
mounted on one arm, and an LMI Technologies Gocator® 2350 2D laser profiler was mounted on
the other for workpiece sensing. An overview of the experimental setup can be seen in Figure 3.

Figure 3. Experimental overview.


The robot aimed its sensor toward the joint’s as-designed location and collected sensor data,
as shown in Figure 4. Sensor data were collected at intervals of 2 mm, yielding a total of 51
sensor profiles along the length of each joint. After collecting sensor data, the robot built an as-
encountered model of the joint, adapted its fill plan, and executed its plan, as shown in Figure 4.

Figure 4. Sensor data collection (left); execution of fill plan (right).


RESULTS
Shown in Figure 5 are the raw data, 3D modeling results, and adapted fill plan. Also shown is
a comparison between the as-perceived and true poses of the joint’s center along each degree of


freedom, thereby contrasting what the robot thought the joint to be versus what it truly was.

Figure 5. Raw data, 3D model, and fill plan (left); model per degree of freedom (right).
Shown in Table 1 are the summary statistics for the modeling errors along the joint, where
the various degrees of freedom have been combined into single norm measures of position and
orientation error via the 2-norm (e.g., Euclidean distance). As can be seen, the robot was found
capable of identifying the true position and orientation of the joint’s center with a mean norm
positioning error of 0.13 mm and orientation error of 1.19°.

Table 1. Modeling error summary statistics.


                         Abs Error   Abs Error   Norm Error   Norm Error
                           Mean       Std Dev       Mean       Std Dev
Position (mm)                                        0.13        0.05
  Y (mm)                   0.09        0.07
  Z (mm)                   0.07        0.04
Orientation (deg)                                    1.19        0.66
  Rot-X (deg)              0.80        0.64
  Rot-Y (deg)              0.26        0.26
  Rot-Z (deg)              0.62        0.54
Area (mm^2)                2.73        0.75
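The 2-norm combination used in Table 1 can be reproduced in a few lines; the per-profile Y/Z errors below are made-up values for illustration only.

```python
import numpy as np

# Per-profile absolute modeling errors (illustrative values, mm)
err_yz = np.array([[0.09, 0.07],
                   [0.12, 0.03],
                   [0.05, 0.10]])

norm_err = np.linalg.norm(err_yz, axis=1)   # Euclidean norm per profile
mean_norm = norm_err.mean()                  # reported "Norm Error Mean"
std_norm = norm_err.std(ddof=1)              # reported "Norm Error Std Dev"
```

Because the norm is taken per profile before averaging, the mean norm error generally exceeds the norm of the per-axis mean errors.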
Shown in Figure 6 is a series of images showing the motion of the caulk dispenser as the
robot executed its adapted fill plan. A video of the experiment can be found at (Lundeen et al.
2018).

Figure 6. Joint fill action sequence.


For visual assessment, close-up views of the caulk deposited into the joint are shown in


Figure 7.

Figure 7. Visual results of robot’s adaptive joint filling operation.


DISCUSSION
The robot demonstrated the ability to identify the true pose and geometry of its workpieces,
adapt its fill plan, execute its adapted plan, and fill the joint despite discrepancies between the as-
designed workpieces and actual workpieces. The caulk deposited by the robot was not without
defects, but visual inspection indicates it to be on a level similar to a layperson’s. Nonetheless, it
is important to bear in mind that this research is less about joint filling than it is about enabling a
construction robot to perceive and model the actual pose and geometry of its workpieces, adapt
its work plan, and execute work. Another objective of this research is to evaluate the accuracy
with which a robot can model its workpieces and adapt its plan, thereby providing an initial
benchmark of the tool positioning and orienting capabilities that might be achievable for other
tasks using this framework. Thus, the key takeaway from the experiment is that, despite
discrepancies between the as-designed workpieces and actual workpieces, the robot was able to
model the joint and identify the true position and orientation of its center with a mean norm
positioning error of 0.13 mm and orientation error of 1.2°. Such levels of positioning and
orientation accuracy would likely be sufficient for a wide range of construction tasks.

CONCLUSIONS AND FUTURE WORK


A basic framework was proposed to enable a construction robot to adapt to the as-
encountered pose and geometry of its workpieces. The adaptive model-driven framework was
experimentally evaluated to assess the performance capabilities of a robot in executing adaptive
work, and to provide a baseline measure of the tool positioning and orienting that might be
achievable with such a system. Joint filling, and in particular caulking, was used as the case
study construction activity. A joint of unexpected geometry was placed in front of a robot to see
if the robot could adapt to the circumstances it encountered and perform work. Numerical and
pictorial results were presented to provide readers with the ability to assess the suitability of such
a system for their own application. It was found that the robot was capable of identifying the true
position and orientation of an unexpected joint with a mean norm positioning error of 0.13 mm
and a mean norm orientation error of 1.2°.
Future work is needed to explore the system’s sensitivity to unexpected objects in the
sensor’s field of view. Future work is also needed to extend the methods to 3D, since many
objects have geometries more conducive to 3D analysis than 2D. Lastly, future work is needed to
expand the adaptive framework to other construction tasks by further generalizing the methods
so they can be applied to an even broader class of construction activities.


ACKNOWLEDGMENTS
This work was supported by NSF (IIS-1734266) and the University of Michigan Rackham
Graduate School Predoctoral Fellowship Program. The authors also thank LMI Technologies for
making their Gocator sensor technology available for use in the experiments. Any opinions,
findings, conclusions, or recommendations expressed in this paper are those of the authors and
do not necessarily reflect the views of the NSF, University of Michigan, or LMI Technologies.

REFERENCES
De Boor, C. (1978). A Practical Guide to Splines, Springer-Verlag New York.
Feng, C., Xiao, Y., Willette, A., McGee, W., and Kamat, V. R. (2015). "Vision guided
autonomous robotic assembly and as-built scanning on unstructured construction sites."
Automation in Construction, 59, 128-138.
Helm, V., Ercan, S., Gramazio, F., and Kohler, M. (2012). "Mobile robotic fabrication on construction
sites: DimRob." Proc., Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International
Conference on, IEEE, 4335-4341.
Kahane, B., and Rosenfeld, Y. (2004). "Real-time “Sense-and-Act” operation for construction
robots." Automation in Construction, 13(6), 751-764.
Keating, S. J., Leland, J. C., Cai, L., and Oxman, N. (2017). "Toward site-specific and self-
sufficient robotic fabrication on architectural scales." Science Robotics, 2(5).
Kermorgant, O. (2018). "A magnetic climbing robot to perform autonomous welding in the
shipbuilding industry." Robotics and Computer-Integrated Manufacturing, 53, 178-186.
Liu, H., Shah, S., and Jiang, W. (2004). "On-line outlier detection and data cleaning." Computers
& Chemical Engineering, 28(9), 1635-1647.
Lublasser, E., Hildebrand, L., Vollpracht, A., and Brell-Cokcan, S. (2017). "Robot assisted
deconstruction of multi-layered façade constructions on the example of external thermal
insulation composite systems." Construction Robotics, 1(1), 39-47.
Lundeen, K. M., Kamat, V. R., Menassa, C. C., and McGee, W. (2017). "Scene understanding
for adaptive manipulation in robotized construction work." Automation in Construction,
2017(82C), 16-30.
Lundeen, K. M., Kamat, V. R., Menassa, C. C., and McGee, W. (2018). "Left-Right Joint Filling
Experiment, Multi-View." <https://youtu.be/Zqr6Mr_NHes>. (06/08/2018).
Lussi, M., Sandy, T., Doerfler, K., Hack, N., Gramazio, F., Kohler, M., and Buchli, J. (2018). "Accurate
and adaptive in situ fabrication of an undulated wall using an on-board visual sensing
system." Proc., IEEE International Conference on Robotics and Automation.
Savitzky, A., and Golay, M. J. E. (1964). "Smoothing and differentiation of data by simplified
least squares procedures." Analytical Chemistry, 36(8), 1627-1639.
Stentz, A., Bares, J., Singh, S., and Rowe, P. (1999). "A robotic excavator for autonomous truck
loading." Autonomous Robots, 7(2), 175-186.
Wang, W., Chi, H., Zhao, S., and Du, Z. (2018). "A control method for hydraulic manipulators in
automatic emulsion filling." Automation in Construction, 91, 92-99.
Willmann, J., Knauss, M., Bonwetsch, T., Apolinarska, A. A., Gramazio, F., and Kohler, M.
(2016). "Robotic timber construction—Expanding additive fabrication to new dimensions."
Automation in Construction, 61, 16-23.


Enhancing Visual SLAM with Occupancy Grid Mapping for Real-Time Locating
Applications in Indoor GPS-Denied Environments
Lichao Xu1; Chen Feng2; Vineet R. Kamat3; and Carol C. Menassa4
1Ph.D. Student, Dept. of Civil and Environmental Engineering, Univ. of Michigan. E-mail:
[email protected]
2Assistant Professor, Dept. of Civil and Urban Engineering, New York Univ. E-mail:
[email protected]
3Professor, Dept. of Civil and Environmental Engineering, Univ. of Michigan. E-mail:
[email protected]
4Associate Professor, Dept. of Civil and Environmental Engineering, Univ. of Michigan. E-mail:
[email protected]

ABSTRACT
Existing real-time locating system (RTLS) solutions are mostly based on wireless technologies,
fiducial markers, or Lidar-based simultaneous localization and mapping (SLAM), and inevitably
suffer from drawbacks such as low accuracy, reliance on existing facilities, labor-intensive
environment instrumentation, or high economic cost. This paper introduces an ORB2 RGB-D
SLAM based indoor RTLS that can be readily adapted and applied to building or civil
infrastructure applications while addressing the above limitations. Besides the original sparse
feature map built by the visual SLAM (vSLAM), the proposed system builds and maintains an
additional 2D occupancy grid map (OGM) and overlays it with the real-time 2D camera pose
and a virtual laser scan for 2D localization. These designs not only allow users to interact with
the system but also open up the possibility of path planning and continuous navigation based on
feature-based vSLAM. The localization accuracy of the system is evaluated with a marker-based
method, which demonstrates its feasibility and applicability in indoor building and construction
applications.

INTRODUCTION
The burgeoning demand for robotic applications to support key construction and facility
management functions is creating a strong need for deployable mobile robots that are capable of
performing assigned tasks at specific locations automatically. Examples of such mobile agents
include data collection robots (Mantha et al. 2018; Xu et al. 2018), infrastructure inspection
robots (Menendez et al. 2018), indoor service robots (Llarena and Rojas 2016), or even some
robots that can move in complex environments for versatile applications (Liang et al. 2012; Xu et
al. 2015). Among all the fundamental technical capabilities that make such autonomous robots
possible, Real-Time Locating Systems (RTLS) are indispensable because they allow robots to
estimate their own pose (position and orientation) with respect to the map of an environment.
Benefiting from technology development, RTLS have been extensively utilized to facilitate and
improve safety management, construction resource tracking, infrastructure inspection, and
progress monitoring.
Compared to outdoor localization systems that can take advantage of the widely available
Global Positioning System (GPS), indoor localization in GPS-denied environments is relatively
more challenging. Even though significant research efforts have been invested in wireless
technologies-based indoor RTLS (e.g., Wireless Local Area Network (WLAN), radio frequency
identification device (RFID), Ultra Wideband (UWB), Bluetooth, and ultrasound), their


requirements on dedicated hardware and environment instrumentation inevitably prevent them
from being widely deployed in large-scale indoor environments (Brilakis et al. 2011; Khoury and
Kamat 2009).
As a promising alternative, vision-based RTLS solutions have also been explored (Fang et al.
2016; Park et al. 2011). However, they either depend on pre-installed fixed cameras which
cannot adequately handle inevitable occlusions that occur in typical indoor environments or need
computation-intensive structure from motion (SfM) and thus cannot run in real time. A real-time
mobile camera solution was achieved using artificial fiducial markers (Mantha et al. 2018). The
problem is that this approach needs a dense marker network to guarantee localization
accuracy and is difficult to apply in large-scale environments. Besides cameras, inertial
sensors offer another economical and non-instrumented solution. However, such dead-reckoning
methods suffer from drift error accumulation over time and distance (Jimenez et al. 2009).
More recently, 2D Lidar-based Simultaneous Localization and Mapping (SLAM) has started
to receive attention from researchers for 3D geometric modeling of construction sites (Kim et al.
2018). However, those methods need expensive laser scanner sensors and typically require the
user to input an initial pose estimate before they can start working correctly. This is not only
expensive but also inconvenient, and even infeasible when such a prior pose estimate is not available.
In order to overcome these limitations and provide a versatile indoor RTLS, this paper
proposes a vSLAM-based localization system that is suitable for a wide range of applications in
indoor, GPS-denied environments. In this system, an additional occupancy grid map (OGM) is built alongside ORB2
RGB-D’s sparse feature map. The OGM enables interaction with users and path planning that
cannot be supported by ORB2 RGB-D. In addition, the proposed RTLS does not need any
environment instrumentation or rely on any existing artificial facilities, which makes its rapid
deployment possible. More conveniently, it also provides visualization tools that allow users to
monitor the pose of the tracked object and interact with the system intuitively.

TECHNICAL DETAILS
The proposed RTLS can work in two modes, mapping mode and localization mode, and can
switch between the two modes at any time as needed (Figure 1). Compared to some other
algorithms (such as Hector SLAM) that do not allow incremental mapping on an existing map,
incremental mapping can be easily achieved with the proposed system by first using localization
mode to localize the current pose and then switching to mapping mode for further expansion of
the existing map. This provides significant flexibility for a wide range of applications.

Figure 1. Working modes of the proposed RTLS.


Mapping Mode
For an environment in which no map exists, or the existing map is not proportionally
accurate, or the map requires frequent updates due to dynamic changes, the RTLS
runs in the mapping mode to create a new map of the environment from scratch or


incrementally update the existing map. The map built in the mapping mode includes a sparse
feature map and an additional OGM. The sparse feature map is built incrementally by ORB2
RGB-D using an RGB frame and its corresponding depth image obtained from a Kinect sensor at
the same time (Mur-Artal and Tardós 2017). However, as mentioned before, the geometric
information included in this sparse map is not adequate to be useful for path planning or
navigation purposes. In order to address such issues, the OGM is built at the same time and
serves as an extension of the sparse map. Without using a laser scanner, this OGM is built with
the pose estimated by ORB2 RGB-D and the corresponding virtual laser scan created from the
point cloud observed by the Kinect. The two maps are saved for localization in the same
environment when the mapping process completes.

Figure 2. Mapping mode of the system.


As shown in Figure 2, there are mainly four components involved in the mapping mode
(“Kinect Driver”, “Point Cloud to Laser Scan”, “ORB SLAM Mapping” and “Occupancy Grid
Mapping”) and each of them is a ROS package.
With some configuration changes, the ROS openni_launch package (Mihelich 2013) is
directly used here as the “Kinect Driver”. Generally, the RGB images and the depth images from
a Kinect cannot overlap perfectly due to the physical offset between the two sensors. In the developed
implementation, the ROS openni_launch package is configured to align the depth image to its
RGB image, by setting the depth_registration argument to true on the command line or enabling
it via the ROS rqt_reconfigure tool. In this configuration, the driver automatically publishes
different messages to their corresponding topics. Among these messages, only three are further
used in the system, of which the RGB image message and the registered depth message are used
subsequently by “ORB SLAM Mapping” to track the Kinect’s pose in 3D space, and the point
cloud message is used by “Point Cloud to Laser Scan” to create virtual laser scans.
The pointcloud_to_laserscan ROS package (Bovbel and Foote 2015) is adopted to convert
the point cloud received from “Kinect Driver” to its corresponding virtual laser scan by cutting
out a horizontal slice of the point cloud with a certain height range and selecting the points that
have the smallest depth in each column of the slice. The virtual laser scan created in this way
allows the detection of all the obstacles appearing in a height range instead of only being able to
detect the obstacles at a fixed height when a real 2D laser scanner is used. The height range can
be set appropriately with min_height and max_height parameters for the node. There are some
other parameters that can be set to control the generation of the virtual laser scan. Such
parameter information is also included in the output laser scan message and can be retrieved
when these messages are used to update the OGM in “Occupancy Grid Mapping”.
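As a rough illustration of this conversion, the following sketch slices a point cloud by height and keeps the nearest point per angular bin. The function name, frame convention (x forward, y left, z up), and default parameters are illustrative, not the pointcloud_to_laserscan implementation:

```python
import math

def pointcloud_to_laserscan(points, min_height=0.0, max_height=1.0,
                            angle_min=-math.pi / 2, angle_max=math.pi / 2,
                            angle_increment=math.radians(1.0)):
    """Convert 3D points (x forward, y left, z up, metres) into a 2D scan
    by keeping, per angular bin, the closest point inside a height slice."""
    n_bins = int((angle_max - angle_min) / angle_increment) + 1
    ranges = [float('inf')] * n_bins
    for x, y, z in points:
        if not (min_height <= z <= max_height):
            continue  # outside the horizontal slice
        angle = math.atan2(y, x)
        if not (angle_min <= angle <= angle_max):
            continue
        r = math.hypot(x, y)
        i = int((angle - angle_min) / angle_increment)
        if r < ranges[i]:
            ranges[i] = r  # keep the nearest obstacle in this bin
    return ranges
```

Two points that share an angular bin collapse to the nearer one, which is what lets obstacles anywhere in the height range shadow the beam, unlike a fixed-height physical 2D scanner.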
The “ORB SLAM Mapping” package is developed upon ORB2 RGB-D (Mur-Artal and


Tardós 2017). In the implementation, it first synchronizes the input messages of RGB image,
depth image, and laser scan. Then it uses the current RGB image and depth image to estimate the
current pose of the camera and incrementally build the sparse feature map (at upper right in
Figure 2). At this time, it has all the information available such as historical keyframes, their
corresponding laser scans, as well as the current frame. According to the characteristics of the
ORB2 RGB-D, two special publishing strategies are adopted to guarantee the quality of the
OGM as introduced in (Singh and Amiri 2017). First, it publishes a single pose-laser scan pair of
the current keyframe most of the time to help update the OGM in real time. Second, it publishes
the pose-laser scan pairs of all the historical keyframes to help correct the drift introduced into
the OGM so far when either of the two conditions is met. One condition is that a certain number
of single pose-laser pairs have been published since the last time when the historical information
was published. The other condition is that a loop closure is detected and the keyframe poses
involved in the loop have been corrected. The sparse feature map is saved when a mapping
process completes.
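The two publishing strategies can be sketched as below; the counter threshold is an assumed parameter, not a value from the paper:

```python
class KeyframePublisher:
    """Sketch of the two publishing strategies: normally emit only the current
    keyframe's pose-scan pair; emit the full history when enough single pairs
    have accumulated or a loop closure has corrected past keyframe poses."""

    def __init__(self, full_publish_threshold=50):
        self.threshold = full_publish_threshold  # assumed value
        self.singles_since_full = 0

    def publish(self, current_pair, history, loop_closed=False):
        if loop_closed or self.singles_since_full >= self.threshold:
            self.singles_since_full = 0
            return list(history)   # all pairs: the OGM is erased and rebuilt
        self.singles_since_full += 1
        return [current_pair]      # single pair: incremental OGM update
```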

Figure 3. Localization mode of the system.


Once a message from “ORB SLAM Mapping” arrives, the “Occupancy Grid Mapping”
component will extract the pose and laser scan information from the received message and use
them to update the OGM it is building and set ROS markers to visualize the camera’s pose in the
as-built OGM. The algorithm first checks if the message only includes one pair of pose and laser
scan data. If so, the algorithm concludes that the message represents the pose and laser scan of
the current keyframe in ORB SLAM and the information can be directly used to update the
OGM based on its current status. However, if the message includes multiple poses (and thus
multiple pairs of pose and laser scan data), this indicates that the message includes the historical
keyframe information. In this case, in order to correct the error in the OGM introduced by the
previous inferior estimation of the keyframe poses, the OGM will be completely erased and
rebuilt entirely with the received poses. After this, the two situations can be fit into a unified
processing step. The algorithm then processes the pose and laser scan pair(s) one by one and uses
each pair of pose and laser scan to update the OGM. After processing all the pose-laser scan
pairs in the message, the algorithm sets the ROS visualization markers that can be displayed in
the ROS rviz tool (Hershberger et al. 2018). In the implementation, the robot pose is expressed
with a red isosceles triangle created by a Line-List Marker in ROS, with its apex representing the
head of the robot. In addition, the laser scan beams are shown as separate green lines with
another Line-List Marker. The OGM can be readily saved with the ROS map save tool since the
OGM is represented with an OccupancyGrid message in the proposed implementation.
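A minimal sketch of this kind of grid update — tracing each beam from the pose to its endpoint, marking traversed cells free and the end cell occupied, with ROS OccupancyGrid-style cell values (0 free, 100 occupied) — might look like the following. The dictionary grid, helper names, and parameters are assumptions, not the paper's code:

```python
import math

def bresenham(x0, y0, x1, y1):
    """Integer grid cells along a line (standard Bresenham traversal)."""
    cells, dx, dy = [], abs(x1 - x0), abs(y1 - y0)
    sx, sy = (1 if x1 > x0 else -1), (1 if y1 > y0 else -1)
    err = dx - dy
    while True:
        cells.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x0 += sx
        if e2 < dx:
            err += dx
            y0 += sy
    return cells

def update_ogm(grid, pose, scan, resolution=0.1):
    """Ray-trace each beam: cells along the beam become free (0), the end
    cell becomes occupied (100), mirroring ROS OccupancyGrid values."""
    px, py, yaw = pose
    gx0, gy0 = round(px / resolution), round(py / resolution)
    for angle, rng in scan:  # (beam angle relative to robot, measured range)
        ex = px + rng * math.cos(yaw + angle)
        ey = py + rng * math.sin(yaw + angle)
        ray = bresenham(gx0, gy0, round(ex / resolution), round(ey / resolution))
        for cell in ray[:-1]:
            grid[cell] = 0       # free space along the beam
        grid[ray[-1]] = 100      # obstacle at the beam endpoint
    return grid
```

Rebuilding after a loop closure then amounts to clearing the grid and replaying every corrected pose-scan pair through the same update function.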


Localization Mode
When working in the localization mode, the system loads both the sparse map and the OGM
first. Then, ORB2 RGB-D converts the descriptors of the key points extracted from the input
RGB frame into their bag-of-words (BoW) representations (Gálvez-López and Tardos 2012;
Mur-Artal and Tardós 2017) and queries the keyframe database for the initial pose estimation of
the RGB frame. This process continues until the pose of the current frame is initially estimated
and finally optimized and global localization is successful. Subsequently, the 3D pose of a new
frame in the sparse map can be tracked by tracking its previous frame (or the most recent
reference frame) and its local map. Finally, the 2D pose in the OGM can be found by projecting
the 3D pose onto the 2D plane and continuous 2D pose tracking can be achieved.
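The 3D-to-2D projection can be illustrated as follows: keep the planar position and extract the yaw angle from the orientation quaternion. This is a standard conversion, shown here as a sketch rather than the paper's implementation:

```python
import math

def pose_3d_to_2d(position, quaternion):
    """Project a 3D pose onto the ground plane: keep (x, y) and extract
    yaw from the (x, y, z, w) orientation quaternion (ZYX Euler order)."""
    x, y, _ = position
    qx, qy, qz, qw = quaternion
    yaw = math.atan2(2.0 * (qw * qz + qx * qy),
                     1.0 - 2.0 * (qy * qy + qz * qz))
    return x, y, yaw
```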

Figure 4. A sparse feature map and its corresponding OGM from a building scale
environment, and localization results on the maps.

Figure 5. Marker deployment in a basement corridor environment and corresponding marker position localization results.

The detailed implementation of the localization mode is shown in Figure 3. In localization,
the camera pose is localized in the maps built in the mapping mode, and this localization process


has several similarities to the way the mapping mode works. Therefore, instead of describing the
localization mode in detail, only its differences from the mapping mode are discussed in this
section.
In the localization mode, “ORB SLAM Localization” loads the saved sparse feature map and
localizes the 3D pose of the camera in the feature map. In order to enable real-time localization of
the 2D camera pose on the OGM, it publishes the 3D pose and corresponding laser scan of each
frame, instead of only the keyframe pose and laser scan as in the mapping mode.
The “OGM Localization” loads the OGM in the localization mode, and subsequently
receives a single 3D pose-laser scan pair at a time representing the 3D pose and laser scan of the
real-time frame, and then converts the 3D pose to its 2D pose and visualizes the 2D pose and
laser scan on the OGM. The difference of “OGM Localization” from its counterpart “Occupancy
Grid Mapping” in the mapping mode is that the OGM is not updated when the system works
only for localization. Therefore, the virtual laser scan is not used to update the OGM. Its value,
however, lies in localizing, in addition to the 2D camera pose, the scene currently observed by
the camera. This semantic information can be very useful for registering camera observations to
existing environment models.

Table 1. Evaluation results of marker position measurement.


Marker  Measured  Estimated Position    True Position        Position Error
ID      times     X (m)      Y (m)      X (m)      Y (m)     RMSE (m)
#1 76 -0.095 0.004 0.000 0.000 0.098
#2 54 0.147 3.031 0.000 3.050 0.151
#3 63 -0.830 3.057 -0.917 3.050 0.088
#4 54 -0.799 6.042 -0.917 6.100 0.135
#5 42 -0.844 10.277 -0.917 10.370 0.121
#6 50 -0.969 16.725 -0.917 16.778 0.075
#7 50 -1.035 24.240 -0.917 24.097 0.186
#8 64 6.887 24.207 6.816 24.097 0.133
#9 80 14.720 24.090 14.539 24.097 0.185
#10 68 13.640 20.296 13.729 20.412 0.149
#11 53 13.630 12.907 13.729 12.788 0.155
#12 57 13.695 5.547 13.729 5.470 0.088
#13 30 13.764 -1.242 13.729 -1.239 0.039
#14 93 7.146 -1.235 7.238 -1.239 0.094
#15 41 -0.058 -1.216 0.000 -1.239 0.065

Qualitative Results
The maps and localization results shown in Figure 3 were from a laboratory room scale
environment. The proposed system was also tested in an entire building scale environment and
the corresponding results are shown in Figure 4. The left side of Figure 4 shows the sparse
feature map built by ORB2 RGB-D and 3D localization within it. The subfigure at its bottom
right corner shows the feature matching between the features in the current frame and the
features in the sparse feature map. The right side of Figure 4 shows the built OGM and the
corresponding 2D localization results within it.


QUANTITATIVE EVALUATION
In this section, the system’s localization accuracy is quantitatively evaluated by measuring the
position of multiple markers in a building scale environment. As shown in Figure 5, 15 markers
were pre-deployed on the ground along the corridor of a basement and formed a loop whose
length was about 80m. The maps shown in Figure 4 were built in this environment and were used
in this experiment for evaluating localization accuracy. The accuracy of the localization system
was evaluated using the marker position root-mean-square error (RMSE), which is the RMS of
the distance between a marker’s estimated position and its true position over multiple
measurements. The corresponding results are shown in Table 1. It can be seen that the range of
RMSE is 0.039m to 0.186m, which is very competitive compared with other indoor localization
systems as shown in (Zafari et al. 2017). It can also be observed that the maximum measurement
errors occurred at the #7 marker and the #9 marker, which are 0.186m and 0.185m respectively.
The key reason for this observation is that the camera went very close to the wall when the
system observed and localized these two markers. Since only a small number of features could
be extracted from the surface of the wall, the localization accuracy of the system was impacted.
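The marker-position RMSE used here can be computed as in this short sketch (the function is illustrative; the paper does not publish code):

```python
import math

def marker_rmse(estimates, true_position):
    """RMS of the Euclidean distance between repeated position estimates
    of one marker and its surveyed true position."""
    tx, ty = true_position
    sq_dists = [(x - tx) ** 2 + (y - ty) ** 2 for x, y in estimates]
    return math.sqrt(sum(sq_dists) / len(sq_dists))
```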

CONCLUSION
This paper proposed a vSLAM-based localization system for indoor, GPS-denied
environments by building an OGM alongside a sparse feature map. The system can work in
mapping mode to create the maps (sparse feature map and OGM) of the environment and in
localization mode to localize the position of the camera in the built maps. The accuracy of the
system was evaluated with a landmark-based evaluation method and the evaluation results
showed its high localization accuracy and applicability for a broad range of indoor applications.

REFERENCES
Bovbel, P., and Foote, T. (2015). "ROS package pointcloud_to_laserscan."
<http://wiki.ros.org/pointcloud_to_laserscan>. (10/24/2018).
Brilakis, I., Park, M.-W., and Jog, G. (2011). "Automated vision tracking of project related
entities." Advanced Engineering Informatics, 25(4), 713-724.
Fang, Y., Chen, J., Cho, Y., and Zhang, P. (2016). "A point cloud-vision hybrid approach for 3D location
tracking of mobile construction assets." Proc., ISARC. Proceedings of the International
Symposium on Automation and Robotics in Construction, Vilnius Gediminas Technical
University, Department of Construction Economics & Property, 1.
Gálvez-López, D., and Tardos, J. D. (2012). "Bags of binary words for fast place recognition in
image sequences." IEEE Transactions on Robotics, 28(5), 1188-1197.
Hershberger, D., Gossow, D., and Faust, J. (2018). "ROS rviz." <http://wiki.ros.org/rviz>.
(10/24/2018).
Jimenez, A. R., Seco, F., Prieto, C., and Guevara, J. (2009). "A comparison of pedestrian dead-reckoning
algorithms using a low-cost MEMS IMU." Proc., Intelligent Signal Processing, 2009. WISP
2009. IEEE International Symposium on, IEEE, 37-42.
Khoury, H. M., and Kamat, V. R. (2009). "Evaluation of position tracking technologies for user
localization in indoor construction environments." Automation in Construction, 18(4), 444-
457.
Kim, P., Chen, J., and Cho, Y. K. (2018). "SLAM-driven robotic mapping and registration of 3D
point clouds." Automation in Construction, 89, 38-48.


Liang, X., Xu, M., Xu, L., Liu, P., Ren, X., Kong, Z., Yang, J., and Zhang, S. (2012). "The amphihex: A
novel amphibious robot with transformable leg-flipper composite propulsion mechanism."
Proc., Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on,
IEEE, 3667-3672.
Llarena, A., and Rojas, R. (2016). "I Am Alleine, the Autonomous Wheelchair at Your Service."
Intelligent Autonomous Systems 13, Springer, 1613-1626.
Mantha, B. R., Menassa, C. C., and Kamat, V. R. (2018). "Robotic data collection and simulation
for evaluation of building retrofit performance." Automation in Construction, 92, 88-102.
Menendez, E., Victores, J. G., Montero, R., Martínez, S., and Balaguer, C. (2018). "Tunnel
structural inspection and assessment using an autonomous robotic system." Automation in
Construction, 87, 117-126.
Mihelich, P. (2013). "ROS package openni_launch." <http://wiki.ros.org/openni_launch>.
(10/24/2018).
Mur-Artal, R., and Tardós, J. D. (2017). "Orb-slam2: An open-source slam system for
monocular, stereo, and rgb-d cameras." IEEE Transactions on Robotics, 33(5), 1255-1262.
Park, M.-W., Koch, C., and Brilakis, I. (2011). "Three-dimensional tracking of construction
resources using an on-site camera system." Journal of computing in civil engineering, 26(4),
541-549.
Singh, A. K., and Amiri, A. J. (2017). "2D Grid Mapping and Navigation with ORB SLAM,
GitHub repository." <https://github.com/abhineet123/ORB_SLAM2>. (10/24/2018).
Xu, L., Kamat, V. R., and Menassa, C. C. (2018). "Automatic extraction of 1D barcodes from
video scans for drone-assisted inventory management in warehousing applications."
International Journal of Logistics Research and Applications, 21(3), 243-258.
Xu, L., Zhang, S., Jiang, N., and Xu, R. (2015). "A hybrid force model to estimate the dynamics
of curved legs in granular material." Journal of Terramechanics, 59, 59-70.
Zafari, F., Gkelias, A., and Leung, K. (2017). "A survey of indoor localization systems and
technologies." arXiv preprint arXiv:1709.01015.


Industrialized Construction: Emerging Methods and Technologies


Mohamad Razkenari1; Qi Bing2; Andriel Fenner3; Hamed Hakim4; Aaron Costin, Ph.D.5; and
Charles J. Kibert, Ph.D.6
1Ph.D. Student, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, Gainesville, FL 32611-5703. E-mail: [email protected]
2Ph.D. Student, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, Gainesville, FL 32611-5703. E-mail: [email protected]
3Ph.D. Student, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, Gainesville, FL 32611-5703. E-mail: [email protected]
4Ph.D. Candidate, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, Gainesville, FL 32611-5703. E-mail: [email protected]
5Assistant Professor, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, Gainesville, FL 32611-5703. E-mail: [email protected]
6Holland Professor, Powell Center for Construction and Environment, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, Gainesville, FL 32611-5703. E-mail: [email protected]

ABSTRACT
In recent years, industrialized construction (IC) has been moving the construction industry
toward integrating emerging manufacturing technologies into offsite construction and factory
component assembly practices to improve productivity and efficiency. The key to success in the
IC business is providing mass customization and mass production simultaneously, which is only
possible in a highly flexible manufacturing system. The manufacturing industry is experiencing
the fourth industrial revolution which is blurring the separation of the physical, cyber, and
biological worlds. The primary objective of this study is to identify the new methods and
techniques emerging in manufacturing industry and evaluate their possible application in IC. A
questionnaire was designed to collect data from industry experts. Preliminary data shows a
significant lag in the transition of the new trends in cyber physical systems and data analytics
into the IC production process. The findings of this survey are analyzed and discussed in detail
in this paper.

INTRODUCTION
Industrialized construction (IC) has caught the attention of the construction industry due to
its economic benefits which include reduced project delivery time, improved quality control, and
increased labor productivity. The idea of IC revolves around integrating design and optimization
tools used in manufacturing to solve complex challenges in construction projects. IC is defined
as the process of producing prefabricated systems, building components, or building structures in
a protected factory environment and transporting them to the construction site for installation or
assembly (Razkenari et al. 2018a). Some examples of IC include precast concrete, metal frame
buildings, prefabricated trusses, panelized buildings, and preassembly components, to name a
few. From the theoretical viewpoint, IC is expected to substantially improve the efficiency of
production by using innovative technologies and facilitate the implementation of lean strategies
for more sustainable project delivery (Fenner et al. 2017). In the mechanical and electrical
sectors, a shortage of labor and the need to improve quality control seem to be the main drivers


for using prefabrication (Razkenari et al. 2018b). In fact, it is anticipated that IC will rapidly
replace on-site construction activities as a consequence of a rapidly shrinking construction
workforce and to counter rapidly inflating construction costs.
Despite all the advantages, IC still faces immense resistance from both the industry and the
market. The slow integration of advanced technologies such as computer-aided construction,
automation, standardization, and prefabrication is impeding the expected productivity growth in
the construction industry (Xue et al. 2018). In this aspect, evaluating the advances of the IC is
critical, especially when there is a need to unveil the factors that are still holding the industry
back. The primary purpose of this study is to investigate the state of technology in the
manufacturing process for industrialized construction in the United States. To meet these
objectives, this study first reviewed the literature to identify state-of-the-art technologies that can
be practically and economically implemented in the IC process. The information gathered from
the literature was then used to prepare a survey to collect data from industry experts. The data
collected from questionnaire provides the viewpoint of experts regarding the current state of
technology uptake and the potential for future investment in time, labor, and cost saving
strategies centered around technology.

BACKGROUND
Industrialized construction can achieve higher efficiency by viewing building design and
construction from the perspective of the manufacturing industry. Based on the type of product and level
of customization, manufacturing systems use a variety of production strategies such as Make-to-
Stock, Make-to-Forecast, Assemble-to-Order, Make-to-Order, and Engineer-to-Order (ETO).
Construction projects have similar scope to ETO manufacturing systems. ETO refers to the
design and production of complex, one of a kind products constrained by uncertain operational
durations, finite capacity resources, and multilevel product structures (Haug et al. 2009).
Innovative design methods, concurrent engineering, and flexible production techniques, in the
form of industrialized construction, shift the scope of construction projects closer to Assemble-to-
Order systems.
Design for Manufacturing and Assembly (DfMA) facilitates mass production and mass
customization, by migrating innovative ideas and state-of-the-art technologies to the design,
production, and assembly. In general, Design for X (DfX) are design approaches that
“incorporate the manufacturing and assembly criteria for design, as well as beyond to the rest of
the product life-cycle” (Molloy et al. 2012). Design for Manufacture (DfM) is the integration of
manufacturing criteria into the product design process while Design for Assembly (DfA) eases
the assembly process. DfMA is a combination of DfA and DfM with a major focus on concurrent
engineering, as well as the study of alternative products and the quantification of manufacturing
and assembly difficulties. To make DfMA possible, the designer must have extensive information on the
manufacturing methods for, and life-cycle of, the product (Molloy et al. 2012).
The manufacturing systems in many sectors are expected to see a boost in performance through the
use of emerging methods such as the Internet of Things (IoT) and Cyber Physical Systems (CPS).
Future research on manufacturing systems deals with the operation of CPS-integrated
machines and tools, which are capable of simultaneous data collection and analysis. The use of
CPS for collecting events, or of actuators for executing them, helps enable reason-based control,
monitoring, and management functionalities (Nagorny et al. 2012).
IoT in the construction industry is using sensors (such as RFID and QR) in the prefabrication
industry for monitoring the production, transportation, and assembly processes (Qi et al. 2018).


Table 1. Technology categories with possible application for industrialized construction


Category: Description and examples
Information and communications technologies (ICT): Technology to capture, transmit and display data electronically with a focus on communications (ex. Electronic Data Interchange, wireless, Internet, Bluetooth, etc.)
Computer-Aided design tools: The computer systems used to aid in the creation, modification and evaluation of a design (3D visualization, 4D, nD, BIM-based tools, etc.)
Document management and project integration software: The technologies used for coordinating and sharing design documents between project sectors (ex. BIM 360, Synchro, Navisworks, etc.)
Business and production management information systems: The information systems used to manage information and support processes, business intelligence, and E-commerce activities (ex. ERP, MES, MIS, GIS, etc.)
Project Management and scheduling software: The complete set of tools used for project management, planning, executing, and controlling (ex. Primavera, MS Project, Procore, etc.)
Industrial Control Systems: Different types of control systems used in factories for managing and controlling physical activities (ex. SCADA, DCS, PCS, PLC, etc.)
Industrial Internet: The control systems that have been integrated with information technology to empower the production planning process (ex. IoT, CPS, embedded systems, connected sensors, microcontrollers, etc.)
Data acquisition technologies: The technologies to collect data from real-world conditions (ex. RFID, GPS, QR Codes, BLE, barcoding, digital imaging, laser scanning, etc.)
Data analytics and computation systems: The methods and tools to explore data and provide insights using new concepts including big data analysis, data mining, machine learning, business intelligence, cloud computing, etc.
Automation systems including Robotics and Artificial Intelligence: The robotic systems used to automate construction processes including palletization and de-palletizing, material handling, welding, assembly, material removal, etc.
Autonomous Equipment and machinery: The new generation of autonomous machinery to monitor and facilitate construction tasks (ex. CNC machines, 3D printing, drones, etc.)
Extended Reality: The computer technology used to create real-and-virtual combined environments and human-machine interactions, such as AR, VR, and MR
Wearable Technology: The electronic devices incorporated into clothing or worn on the body, such as smart watches, helmets, exoskeletons, wearable sensors, etc.
Smart energy: The tools to monitor and control energy consumption and distribution, such as smart metering, energy dashboards, and dynamic simulations

METHODOLOGY
The purpose of this study is to identify the current status of penetration of emerging
technologies in industrialized construction in the U.S. A questionnaire was designed to
collect specific information which is missing from the literature and would be necessary to reach
the research objectives. The collected data is mostly about technologies that companies use in the
design and manufacturing process and the barriers for implementing these technologies. The
questionnaire and resulting data specifically addressed technologies and methods for smart
manufacturing and Industry 4.0, which are widely reviewed in previous studies (Lu 2017; Zhong
et al. 2017; Kumar 2017; Tao et al. 2018). Fourteen technology categories with possible
applications for IC were identified and are described in Table 1.
This questionnaire survey has two parts. The first section collects basic information about the
respondent’s background. In the second section, the participants evaluate emerging technologies


in industrialized construction. Also, for each technology that in their view has a high possibility
of future application, they are asked to select the possible areas of implementation. The
questions are designed to include the Likert scale, rank orders, multiple choice, and text entries.
In this paper, a descriptive analysis is presented on the preliminary result of the questionnaire
to compare sample mean and frequency of different factors. It would be possible to conduct
deeper statistical analysis on the results by normalizing the data and conducting pairwise
comparisons between various sectors. However, while the sample size for the preliminary results
is adequate for obtaining an understanding of the IC industry’s uptake of technology, it is not of
sufficient scale to allow factor analysis and pairwise comparisons.
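The descriptive analysis described here — comparing sample means across categories and ranking them — can be sketched as follows; the category names and Likert scores in the usage example are hypothetical, not the survey data:

```python
def rank_by_mean(responses):
    """Descriptive ranking: mean Likert score per technology category,
    sorted highest first (ties keep insertion order, since sorted() is stable)."""
    means = {cat: sum(scores) / len(scores) for cat, scores in responses.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)
```

For example, `rank_by_mean({"ICT": [4, 3, 4], "Extended Reality": [1, 2, 1]})` places ICT first with a mean of about 3.67.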

RESULTS
Respondents’ profile information: A snowball sampling method was utilized to distribute
the questionnaire survey and collect effective survey responses. Employees of a local
prefabricated construction company were selected as the initial respondents. They were
then requested to distribute the survey to other knowledgeable participants. Finally, 20 effective
responses were received. The profiles of the 20 respondents are presented in Table 2. Questions
about organization type, respondent’s position, and organization location were all multiple choice.
As shown in Table 2, most of the respondents come from construction companies, and their
organizations are distributed across various parts of the U.S. In addition, many of the respondents
come from large companies (annual revenue of more than 100 million dollars), are engaged in
management-related occupations (e.g., management and project management), and have long
working experience (more than 10 years). As shown in Table 3, respondents’ organizations have
participated in IC projects to varying degrees; only three of them have never utilized any IC
strategies. These profile data support the quality and reliability of the respondents’ perspectives
on IC and make the findings more convincing.

Table 2. Respondents’ profile


Characteristics                                  Frequency
Organization Type
  Construction company                           18
  Developer                                      1
  Component manufacturer                         1
  Industry trade association                     1
Organization location
  Northeast                                      8
  Midwest                                        10
  South                                          17
  West                                           7
  International                                  6
Annual Revenue (in million dollars)
  0~10                                           2
  11~100                                         4
  101~1000                                       10
  >1000                                          4
Respondent's position
  Management (other than project management)     11
  Project management                             7
  Engineering                                    2
  Design                                         0
  Construction                                   5
  Manufacturing operations                       2
  Other                                          4
Working experience (in years)
  0~5                                            6
  6~10                                           1
  11~15                                          2
  16~20                                          3
  >20                                            8

© ASCE
Computing in Civil Engineering 2019 356

Table 3. Participation degree of respondents’ organizations in IC projects

Characteristics Frequency
IC strategies that respondent's organization have utilized
Component manufacture & sub-assembly (Door furniture, windows, etc.) 9
Non-volumetric pre-assembly (Cladding, wall panels, bridge units, etc.) 11
Volumetric pre-assembly (Plant rooms, shower rooms, etc.) 5
Modular (Motels, prison blocks, medium rise residential, etc.) 10
We have never utilized industrialized construction strategies 3
Type of projects that utilized IC strategies
Single-family buildings 3
Multi-family buildings 6
Commercial/institutional buildings 12
Educational buildings 5
Industrial buildings 9
Retail buildings 4
Healthcare buildings 7
Infrastructure (Highway, bridges, power, water, etc.) 6
None 4

Table 4. Current utilization and possibility of future investment for technology categories

Technology category: current utilization (rank) | future investment (rank)
Project management and scheduling software: 3.45 (1) | 3.45 (4)
Information and communications technologies: 3.17 (2) | 3.67 (1)
Document management and project integration software: 3.08 (3) | 3.64 (2)
Computer-aided design tools: 2.83 (4) | 3.58 (3)
Data acquisition technologies: 2.64 (5) | 3.27 (5)
Business and production management information systems: 2.45 (6) | 3.09 (8)
Autonomous equipment and machinery: 2.25 (7) | 3.27 (6)
Data analytics and computation systems: 1.91 (8) | 3.09 (9)
Wearable technology: 1.83 (9) | 3.18 (7)
Smart energy: 1.67 (10) | 2.55 (12)
Automation systems including robotics and artificial intelligence: 1.58 (11) | 2.64 (10)
Industrial control systems: 1.55 (12) | 2.27 (13)
Extended reality: 1.5 (13) | 2.64 (11)
Industrial internet: 1.45 (14) | 2.18 (14)

Industry’s perspective towards emerging technologies: In the questionnaire survey,
respondents were asked to rate each technology’s current level of utilization and the likelihood
of future investment in it from 1 to 4. As shown in Table 4, the ranking by future investment
level is similar to the ranking by current use level: the technologies with high current use are
also considered worth further investment. Project management and scheduling software,
information and communications technologies, and document management and project
integration software are the three most-used technologies; they have received attention from
respondents and are expected to attract more investment in the future. Industrial control
systems, extended reality, and the industrial internet are the three least-used technologies and
are not expected to attract much attention in the future.
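The similarity between the two Table 4 rankings can be quantified with a quick Spearman rank-correlation check, sketched here in plain Python; the rank pairs below simply restate the two rank columns of Table 4, and the check itself is an illustration rather than an analysis reported in the paper.

```python
# (current-utilization rank, future-investment rank) for the 14
# technology categories of Table 4, listed in table order.
RANK_PAIRS = [
    (1, 4), (2, 1), (3, 2), (4, 3), (5, 5), (6, 8), (7, 6),
    (8, 9), (9, 7), (10, 12), (11, 10), (12, 13), (13, 11), (14, 14),
]

def spearman_rho(pairs):
    """Spearman rank correlation for untied ranks:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(pairs)
    d2 = sum((a - b) ** 2 for a, b in pairs)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

rho = spearman_rho(RANK_PAIRS)  # about 0.93
```

A coefficient near 0.93 supports the observation that technologies ranked high in current use are also ranked high for future investment.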
Table 5 lists respondents’ perspectives on future investment for each emerging technology
across various IC areas. Construction and field operation, quality control and progress
monitoring, and logistics, material and equipment management are the areas with the greatest
potential to attract future investment in emerging technologies.

Table 5. The IC areas which have potential for future investment in emerging technologies

Technologies: D, ES, CFO, LMEM, QCPM, CM, S, FM | Total
Information and communications technologies: 3, 4, 7, 6, 7, 5, 5, 4 | 41
Document management and project integration software: 7, 6, 6, 1, 6, 4, 2, 3 | 35
Computer-aided design tools: 6, 6, 7, 2, 2, 3, 3, 3 | 32
Project management and scheduling software: 2, 6, 6, 3, 4, 5, 1, 2 | 29
Data acquisition technologies: 1, 0, 4, 4, 4, 0, 3, 3 | 19
Data analytics and computation systems: 1, 2, 3, 3, 3, 2, 2, 2 | 18
Autonomous equipment and machinery: 0, 0, 3, 1, 3, 0, 2, 1 | 10
Automation systems including robotics and artificial intelligence: 0, 0, 1, 2, 2, 0, 2, 1 | 8
Business and production management information systems: 1, 0, 2, 2, 0, 0, 0, 1 | 6
Wearable technology: 0, 0, 2, 1, 0, 0, 2, 1 | 6
Industrial internet: 0, 0, 1, 1, 1, 0, 1, 1 | 5
Extended reality: 0, 0, 1, 1, 1, 0, 1, 1 | 5
Industrial control systems: 0, 0, 0, 0, 0, 0, 0, 0 | 0
Smart energy: 0, 0, 0, 0, 0, 0, 0, 0 | 0
Total: 21, 24, 43, 27, 33, 19, 24, 23 | 214
In Table 5, D denotes Design; ES denotes Estimating and Scheduling; CFO denotes Construction and Field Operation; LMEM denotes Logistics, Material and Equipment Management; QCPM denotes Quality Control and Progress Monitoring; CM denotes Contract Management; S denotes Safety; FM denotes Facility Management; Total denotes sum of frequencies.

DISCUSSION AND CONCLUSIONS
Although the application of emerging technologies in onsite and offsite construction practices
has been highlighted in prior research, the state of emerging technologies in IC practices and
the impact of this technological change on the construction industry have remained unknown.
This study outlines the technological change in IC practices from the expert viewpoint. Its
findings are summarized here, covering the perspective gap and potential future technology
investments.
Technology Gap: The results point out the gaps between the technologies explored in the
literature and the technology adoption perspective in the industry. The responses indicate that
industry experts are inclined to adopt technologies that automate existing systems. They
support increasing investment in autonomous equipment, wearable technology, and extended
reality because these technologies align easily with current processes. However, when there is
a fault in a process, automation only speeds up the fault’s impact, and reengineering is
necessary to improve performance. Research offers the potential for holistic improvement by
reengineering the production process. Research on process improvement using robotics, smart
energy, and the industrial internet is increasing, while at the same time the experts suggest a
low probability of future investment in these technologies. In fact, the respondents did not
indicate significant interest in adopting technologies to improve manufacturing systems. The
respondents stated that the current application of autonomous equipment is limited to drones
that capture imagery for monitoring and inspection and CNC machines that prefabricate
drywall panels. Because most of the experts are from construction companies, they overlooked
the possibility of drastic process changes through concurrent engineering and DfMA strategies.
Future Improvement Plans: The experts’ views on emerging technologies would be useful
for both managers and lenders who intend to invest in improving productivity in the
construction industry. As illustrated in Table 5, there is an emphasis on potential future
investment in CFO and QCPM. The research points out differences in the types of technologies
being adopted for different categories of IC operations. Additionally, the respondents selected
ICT, project integration software, and computer-aided design tools as the most desired process
improvement technologies. This indicates the need to investigate alternative design strategies
and the importance of integrating the design, fabrication, and assembly processes. The future
application of data analytics and computation systems is expected to occur alongside the use
of interoperable software packages for laying out design information. Opportunities in
industrialized construction identified from the literature and the survey results include:
1) Developing a collaborative platform for designers and manufacturers to share open-source
parametric BIM libraries of preassembly products to support flexible design ideas.
2) Offering design-to-fabrication work processes to increase the interoperability of design
tools with automated machinery such as CNC machines, laser cutters, and 3D printers.
3) Managing the supply chain, monitoring production progress, and analyzing performance
through sensor technology and cyber-physical systems for recording progress updates and
key performance indicators.
Process improvement strategies for the industry require careful consideration of a company’s
business model, work process, and scope. Since the selected scope of this study includes all
sectors of industrialized construction, further research is still needed to reach a consensus from
the expert viewpoint. Currently, subcontractors with specific products such as precast concrete,
mechanical and electrical systems, volumetric units, and prefabricated panels (interior and
exterior walls, floors, roofs, etc.) have seen more growth because their business models are less
complicated. However, they optimize the process to maximize profit for the specific services
they offer rather than for the entire project. Startups are beginning to appear that emphasize
integrated project delivery or design-build services, re-engineering the entire process and
integrating emerging technologies into their upgraded production plans. While evaluating the
process improvement strategies of the successful sectors is necessary, a limitation is that
companies are reluctant to share their cost-related or technological innovation data.



Automating the Digital Fabrication of Concrete Formwork in Building Projects: Workflow
and Case Example

M. S. Fardhosseini1; H. Abdirad2; C. Dossick, P.E., Ph.D.3; H. W. Lee, Ph.D.4; R. DiFuria5;
and J. Lohr6

1Ph.D. Candidate, College of Built Environments, Univ. of Washington, PO Box 355740,
Seattle, WA 98195. E-mail: [email protected]
2Ph.D. Candidate, College of Built Environments, Univ. of Washington, PO Box 355740,
Seattle, WA 98195. E-mail: [email protected]
3Dept. of Construction Management, Univ. of Washington, Architecture Hall 130h, PO Box
351610, Seattle, WA 98105. E-mail: [email protected]
4Dept. of Construction Management, Univ. of Washington, Architecture Hall 120d, PO Box
351610, Seattle, WA 98105. E-mail: [email protected]
5Turner Construction Company, 830 4th Ave. S., #300, Seattle, WA 98134. E-mail:
[email protected]
6Turner Construction Company, 830 4th Ave. S., #300, Seattle, WA 98134. E-mail:
[email protected]

ABSTRACT
Digital fabrication is an emerging approach to transforming designs into physical products.
While a number of studies have highlighted the use of digital technologies in construction,
there is still a dearth of studies focused on design-to-fabrication workflows in
construction projects. In response, the overarching goal of this study is to develop a design-to-
fabrication workflow using digital models and computer numerical control (CNC) machines with
a focus on formwork fabrication. The specific objectives are to (1) present a detailed workflow
from the design phase of formwork to the fabrication phase, and (2) develop automated
procedures to support the workflow without extensive manual intervention. This workflow
integrates virtual design and construction (VDC), trade coordination, parametric modeling,
software customization, tool path development, and CNC routing in order to produce precise
prefabricated formwork components. To test and validate the workflow, the authors conducted a
case study on the prefabrication of concrete edge formwork in a 26-story post-tensioned
cast-in-place concrete structure. The case study shows that the use of the workflow for CNC
machines supports craftsmanship to improve labor productivity, safety, and fabrication quality.
The results demonstrate the advantages of using this workflow over the traditional approach to
support project teams’ productivity and enable them to make informed decisions for their
implementation of digital fabrication.

INTRODUCTION
As the Architecture, Engineering and Construction (AEC) industry spotlights cost
management as a critical success factor in projects, cost-saving solutions in project planning and
production are highly valued by practitioners. In this paper, the authors posit automated concrete
formwork design and fabrication as a cost-saving opportunity in conventional commercial
construction. Previous research shows that, in buildings with concrete structures, the cost of
concrete formwork can constitute up to 60 percent of the cost of all concrete work-packages in
projects (Hurd 2005). Given the significance of this cost, value-adding innovations in the design,


planning, and production of concrete formwork can benefit the projects. The empirical AEC
research suggests that computational workflows and the use of manufacturing robots like
Computer Numerical Controlled (CNC) machines have extended the opportunities for the digital
design and fabrication of building products (Ajweh 2014). These machines help enhance
fabrication productivity as they largely outperform humans in speed and precision of work
without posing physiological stress (e.g., fatigue) (Samaneh and Masoud 2013). Furthermore,
in terms of safety, CNC machines can help workers avoid occupational hazards such as loud
noise, failure of manual equipment, tripping, and being struck by machines on job sites.
Despite the potential benefits of using CNC machines in construction,
researchers have highlighted that the construction industry has a low rate of adopting and
utilizing these machines (Arora et al. 2014; Davidson 2013; Hwang et al. 2018). This is in part
due to the lack of decision-making frameworks, comprehensive digital workflows, and
transferrable precedents. To address this gap in research and practice, the main goal of this study
was to develop and test workflows that utilize advanced integration of digital design with
manufacturing machines (e.g. CNC) for formwork fabrication. This paper presents a framework
that encapsulates design to fabrication processes for formwork production using digital models
and CNC machines. Additionally, this research reports on automation tasks that were
implemented within the digital tools to further maximize the efficiency of this workflow. The
findings will provide useful guidelines for contractors to implement similar workflows for
formwork fabrication to improve their profit margin in concrete work packages.

LITERATURE REVIEW
Gershenfeld (2012) defined digital fabrication as the ability to convert digital data into
physical things. Particularly from the AEC industry standpoint, design-to-fabrication
procedures using digital environments have received considerable attention since the 1990s. In a
nutshell, design to fabrication workflows constitute automating the task of extracting and
processing the information of a digital model, and then translating it to operational instructions
for digital fabrication tools such as CNC machines (Raspall 2015). Finding the most optimal
procedure for design to fabrication of formwork using a CNC machine requires a good
understanding of the capabilities and the benefits that the machine can offer. From the
production standpoint, Samaneh and Masoud (2013) reported that adopting a CNC machine
provides the following benefits to projects: (1) reduction in setup time, (2) reduction in lead time,
(3) accuracy and repeatability, (4) contouring of complex shapes, (5) simplified tooling and
work holding, (6) consistent cutting durations, and (7) increased productivity. From the human-
factors standpoint, Dozzi and AbouRizk (1993) suggest that factors such as fatigue, motivation,
physical limitation, and safety could make it infeasible for a construction worker to maintain a
high-level performance for a long time. Therefore, the industry values the tools that address these
human-factors and mitigate the issues associated with the manual processes. From the safety
point of view, traditional formwork production activities such as lifting, sawing, and hammering
by carpenters can result in frequent but low-severity injuries (Fardhosseini et al. 2015). Also,
prolonged awkward postures, and high physical workload in conjunction with the frequent use of
hand tools are ergonomic hazards that can reduce workers’ productivity and increase lost-time
costs (Aghazadeh and Mital 1987). Therefore, not only can innovations in formwork fabrication
help workers avoid occupational hazards, they can also save design and production costs for
contractors and promote consistent and high-quality formwork cuts (Spielholz et al. 1998; Bhoir
et al. 2015; Habibnezhad et al. 2016).


Figure 1. Suggested framework for design to fabrication using a digital model and a
CNC machine
Automating the design process in prefabrication workflows can provide a significant
time/cost advantage to the production processes (Manrique et al. 2015). To implement this, in the
first step, practitioners usually use Computer-Aided Design (CAD) tools to design and analyze
products’ geometry. Then, a Computer-Aided Manufacturing (CAM) tool will be used to convert
the cutting procedure into instructions (Hamid et al. 2018). In brief, CAD/CAM procedures entail
the use of technologies for design, analysis, and manufacturing of products (Hamid et al. 2018;
Schodek et al. 2005). In the final step of the design to fabrication workflows, a manufacturing
robot, like a CNC machine, is needed to convert the instructions generated in CAM to machine
operational tasks. The CNC machines work based on a controller application providing the
machine-code (G-code). This code instructs the machine to operate based on the CAM outcome
(Hamid et al. 2018).
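As a simplified illustration of the CAM-to-machine step described above, the sketch below emits minimal G-code for a single rectangular profile cut. This is a hypothetical generator, not RhinoCAM output; the function name, dimensions, and feed/spindle defaults are placeholders, and a real CAM tool would add tool-diameter compensation, multiple passes, and hold-down tabs.

```python
def rect_cut_gcode(width, height, depth, feed=400, spindle_rpm=18000, safe_z=0.5):
    """Build a minimal G-code program (inches, absolute coordinates)
    that profiles a width x height rectangle at the given cut depth."""
    lines = [
        "G20 G90",                # units: inches; absolute positioning
        f"S{spindle_rpm} M3",     # set spindle speed, start spindle clockwise
        f"G0 Z{safe_z}",          # rapid to a safe clearance height
        "G0 X0 Y0",               # rapid to the start corner
        f"G1 Z{-depth} F{feed}",  # plunge to cutting depth at feed rate
        f"G1 X{width} Y0",        # cut the four sides of the rectangle
        f"G1 X{width} Y{height}",
        f"G1 X0 Y{height}",
        "G1 X0 Y0",
        f"G0 Z{safe_z}",          # retract
        "M5",                     # stop spindle
    ]
    return "\n".join(lines)

program = rect_cut_gcode(width=96, height=48, depth=0.75)
```

The controller application mentioned above consumes exactly this kind of text: each line is one instruction, and the machine executes them in order.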
The AEC literature shows that although design to fabrication workflows are emerging in
commercial construction, there are still manual modeling processes in the digital workflows that
may pose opportunities for automation. Two recent studies, Arashpour et al. (2018) and Hamid
et al. (2018), presented workflows for fabricating commercial façade systems (using additive
manufacturing and CNC) and engineered-to-order wood cabinets respectively (using CNC).
These studies confirmed that despite the considerable investment costs for these manufacturing
machines, the construction industry can obtain a positive return on such investments if the
workflows are recurringly used across multiple projects for customized building products
(Arashpour et al. 2018; Hamid et al. 2018). In this paper, the authors aim to build on the previous
studies and propose a framework with repeatable workflows for the digital fabrication of


customized temporary structures and formwork in construction projects. Additionally, this
research highlights and implements the automation processes that outperform conventional
modeling practices in the digital fabrication workflows. An illustrative example, presented in the
following section, demonstrates the feasibility of using this framework and its workflows in
conventional commercial projects to increase the efficiency and accuracy of production and
reduce the potential wastes.

Figure 2. (a) Field installation drawing; (b) finding the optimal point for cutting the edge
forms; (c) installation model considering the optimal point for cutting

METHODS AND PROCEDURES OF DEVELOPING THE FRAMEWORK
In this study, a large General Contractor (GC) based in the Pacific Northwest region of the
U.S. (Turner Construction Company in Seattle) formed a research and development team of
BIM/VDC professionals, researchers, and digital design programmers. The team first observed
and documented the traditional processes for designing and fabricating concrete formwork in
several in-progress projects. The data analysis focused on classifying the observed processes into
standardized reproducible stages in all projects. These stages included: (1) Creating a digital
model of the project (the GC’s perspective), (2) Trade coordination and constructability review,
(3) Drafting Post-Tension (PT) strands/embeds and coordinating the constraints (e.g., location of
PT holes), (4) Establishing formwork design requirements and constraints (e.g. materials,
maximum allowable size, clearance for avoiding formwork cuts near PT holes), (5) Designing,
drafting, and labeling each piece of formwork per location and constraints, and (6) Cutting
formwork pieces with manual carpentry tools.
In the first development, the team attempted to take advantage of the coordination model to
facilitate the digital design and fabrication of the target formwork. Accordingly, the first
development focused on implementing all of the above-mentioned stages in a three-dimensional
environment (especially Stages 3 and 5, which originally were in 2D). Then, the team chose
interoperable and customizable software (customization through the Application Programming


Interface – API) that could support modeling, review, coordination, drafting, and fabrication of
formwork in all projects. With the flexibility to choose any model-authoring tool, the team chose
Autodesk Navisworks as a universal model reader for project review, clash detection, and
coordination. For creating PT elements and modeling formwork in 3D, the team used Rhino with
Grasshopper as a parametric modeling tool that could work with the variation of
parameters/constraints across different projects. Finally, for a seamless data transfer from the
model to a CNC machine, the team used Rhino-CAM, which translated the model into cutting
instructions for the machine. After the successful implementation of the first development, the
team aimed for automating manual modeling processes and manual checks in the model (e.g.
checking if a proposed cut intersects with PT, embeds, or PT holes, labeling formwork,
extracting 2D views and placing them on installation sheets). The team used visual programming
in addition to API functionalities (Rhino/Grasshopper API using Python language and
algorithms) to automate many of the manual modeling tasks/checks. This automated workflow is
the basis for the framework proposed in this paper for digital fabrication of concrete formwork
(see Figure 1). This framework requires minimal manual involvement in the design and
fabrication of formwork, as the users’ active involvement is limited to deciding when to
proceed from one stage to the next. The process map in Figure 1 shows the stages and
activities encapsulating the design-to-fabrication process for concrete formwork. In

ILLUSTRATIVE EXAMPLE
After incrementally refining the framework in a series of field tests, the team fully
implemented and validated the final framework in an actual project, presented here as an
illustrative example. The project was a 26-story commercial building, located in the Pacific
Northwest region of the U.S. The building had post-tensioned cast-in-place concrete floors. As
the GC self-performs all concrete works in its portfolio of projects, the team had a direct
involvement in fully implementing the framework in the project. In the following sections, the
authors summarize the five stages of the framework as implemented in the project.
BIM Development/VDC Coordination: In this stage, the team refined and consolidated the
3D models, and performed trade coordination specifically to vet slab location and PT
components (strands, heads, and embeds). The team used ArchiCAD for modeling and
Navisworks for coordination. The consolidated models captured the exterior envelope per trade
(with connection embeds), concrete slabs, PT component, curbs, and pony walls. For the
conflicts identified in the model, the team requested information from the structural engineer.
After receiving the information and updating the model, the final configuration was signed off
and sent to the designers for approval. After addressing the designer’s supplementary
information, the team finalized the model for importing it to Rhino.
Formwork Modeling: In this stage, the approved model was used to virtually create
formwork for the concrete slab and modeling constraints such as PT elements and related
clearances. Using the visual programming algorithm in Rhino/Grasshopper, the team picked the
concrete slabs for the process, and the algorithm automatically modeled the formwork on the
concrete surfaces. As the actual cuts for concrete formwork depend on the location of constraints
(e.g. PT holes and embeds, clearances, and their variations on different floors) the allowable size
for the CNC machine, and transportation considerations, the team fed the formwork geometries
and constraints data to the automation script that could parametrically address such variables and
create fabrication-ready formwork cuts.


Preparing Fabrication Components: The automation script (developed in Grasshopper API
with Python) gathers formwork geometries, the locations of PT components and clearances, and a
maximum allowable size as inputs, and generates formwork cut geometries as the output such
that the cuts are optimized and do not overlap with the constraining geometries (e.g. clearances
and PT holes components). In this specific case, the maximum allowable length for the cuts was
8 feet. However, there were many instances where the final cuts ended up having shorter lengths
due to the constraints imposed by the PT holes and required clearances. After modeling the
formwork cuts, the script labels them, generates a mass fabrication layout (laying down a group
of forms on CNC bed’s dimension), and drafts the installation drawings for the field use. The
labels help construction workers save time during the mobilization and assembly of formwork.
For the mass fabrication layout, the script placed formwork pieces on a rectangle representing
the dimension of the CNC bed. In the last step, the script aligns and places formwork layouts on
a set of installation drawings to show how the labeled forms should be assembled on the job site
(see Figure 2).
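The constraint-driven segmentation described above can be reduced, for illustration, to a one-dimensional sketch: split an edge into pieces no longer than the 8 ft (96 in.) limit while keeping joints out of clearance zones around PT holes. The function below is a simplified assumption-laden stand-in (greedy strategy, interval keep-outs), not the project's actual Grasshopper/Python script, which operates on full 3D geometry.

```python
def plan_cuts(edge_length, keepouts, max_len=96.0):
    """Split an edge of edge_length inches into (start, end) pieces,
    each at most max_len long. A joint that would fall inside a
    keep-out interval (e.g., a PT-hole clearance zone) is pulled back
    to that interval's start so the seam clears the constraint."""
    pieces, pos = [], 0.0
    while pos < edge_length:
        end = min(pos + max_len, edge_length)
        for lo, hi in sorted(keepouts):
            if lo < end < hi:   # joint would land inside a clearance zone
                end = lo        # pull the joint back to the zone's edge
        if end <= pos:
            raise ValueError("keep-out wider than the allowed piece length")
        pieces.append((pos, end))
        pos = end
    return pieces

# A 240 in. edge with one clearance zone spanning 90-100 in.
cuts = plan_cuts(240.0, [(90.0, 100.0)])
```

The pull-back behavior mirrors the observation above that many final cuts ended up shorter than the 8 ft maximum because of PT holes and required clearances.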
Tool Path Development: For tool path development, the team had to choose a material that
satisfied the required strength of the formwork as specified in the submittals. In this project, the
team selected 4x8 MDF sheets (¾ inch thickness). Based on the characteristic of the material and
the CNC machine, the team had to determine a cutting bit and the appropriate feed rate and
spindle speed. These important variables can influence the service life of CNC machines
(Albert 2011). For the cutting passes, the team chose a ½” carbide bit with two flutes. The team
found that a spindle speed of 18,000 RPM and a feed rate of around 400 worked best for MDF
formwork. To carve the labels on the edge formwork, a carbide V-carve bit with a 0.745
diameter was used.
Using RhinoCAM, the team simulated the cutting process, established the cutting instructions,
and generated a G-code for the CNC machine.
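The reported parameters can be sanity-checked against the standard machining relation chip load = feed rate / (spindle speed × number of flutes). Assuming the feed rate of 400 is in inches per minute (the unit is not stated above), the two-flute bit at 18,000 RPM gives roughly 0.011 in. per tooth, which is commonly considered a workable chip load for MDF.

```python
def chip_load(feed_in_per_min, spindle_rpm, flutes):
    """Chip load (inches per tooth) removed by each cutting edge:
    feed rate divided by (spindle RPM * number of flutes).
    Assumes the feed rate is expressed in inches per minute."""
    return feed_in_per_min / (spindle_rpm * flutes)

load = chip_load(400, 18000, 2)  # about 0.0111 in. per tooth
```

Chip loads far below this range tend to rub and burn the material rather than cut it, while far larger values overload the bit, which is why feed and spindle speed must be chosen together.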
CNC Routing and Transporting Components: In the last stage, for ‘sourcing materials’,
the team ordered the MDF sheets (with a week lead time) and fed them to the CNC machine. The
G-code generated in the previous stage guided the cutting process. After cutting each series of
mass-fabrication layout, the team stacked and organized them with labels to facilitate the
shipping and assembly process. Each stack was shipped to the job site two days before the
scheduled installation task.

CONCLUSION
With the emergence of opportunities to incorporate digital technologies at construction sites
(Kim et al. 2018; Fardhosseini and Esmaeili 2016; Hwang et al. 2018; Jebelli et al. 2016, 2017,
2018; Lucky Agung Pratama et al. 2018), digital fabrication workflows are increasingly growing
in the manufacturing industries (Wu et al. 2018). However, a systematic understanding of how
these technologies and tools could contribute to conventional construction projects is still
lacking. This research proposed and studied a framework with the associated workflows to
automate the design to fabrication process of concrete formwork in commercial construction
projects. The presented framework consisted of five main stages: VDC slab edge coordination,
formwork modeling, fabrication preparedness, tool path development, and CNC routing.
The authors acknowledge the limitations of this study, listed as follows. This research is
intended to offer a precedent, and not a generalizable workflow for all construction projects. The
workflow was implemented for one GC, who self-performs concrete in a series of relatively
similar projects. Furthermore, the team used a specific CNC machine coupled with certain
software and programming language, which other contractors may not have or intend to use in


their projects. Therefore, other GCs may add tasks to, or remove specific tasks (e.g., labeling)
from, the proposed workflow. Despite these limitations, the offered information is expected to
support project stakeholders in their efforts to develop their own digital fabrication workflows.

ACKNOWLEDGEMENT
This study was supported, in part, by Turner Construction Company through the ARC
(Applied Research Consortia) Program at the University of Washington (UW). The authors
thank Turner’s personnel involved in the study for their interest and input into the study. Any
opinions, findings, conclusions, and recommendations expressed in this paper are those of the
authors and do not necessarily reflect the views of Turner or UW.

REFERENCES
Aghazadeh, F., & Mital, A. (1987). Injuries due to handtools: Results of a questionnaire. Applied
ergonomics, 18(4), 273-278.
Ajweh, Z. (2014). A Framework for Design of Panelized Wood Framing Prefabrication Utilizing
Multi-panels and Crew Balancing (Doctoral dissertation, University of Alberta).
Albert, A. (2011). Understanding CNC Routers. FP Innovations/Forintek.
Arashpour, M., Bai, Y., Kamat, V., Hosseini, R., and Martek, I. (2018). Project Production Flows in
Off-Site Prefabrication: BIM-Enabled Railway Infrastructure (pp. 80-87). 35th International
Symposium on Automation and Robotics in Construction (ISARC) and International
AEC/FM Hackathon, Berlin, Germany.
Arora, S. K., R. W. Foley, J. Youtie, P. Shapira, and A. Wiek. (2014). “Drivers of Technology
Adoption - the Case of Nanomaterials in Building Construction.” Technological Forecasting
and Social Change 87: 232–244.
Bhoir, S. A., Hasanzadeh, S., Esmaeili, B., Dodd, M. D., and Fardhosseini, M. S. (2015).
“Measuring construction workers’ attention using eye-tracking technology.” Proc., ICSC15:
The Canadian Society for Civil Engineering 5th Int./11th Construction Specialty Conf, Univ.
of British Columbia, Vancouver, Canada.
Davidson, C. (2013). “Innovation in Construction – Before the Curtain Goes up.” Construction
Innovation 13: 344–351.
Dozzi, S. P., & AbouRizk, S. M. (1993). Productivity in construction (p. 44). Ottawa: Institute
for Research in Construction, National Research Council.
Fardhosseini, M. S., Esmaeili, B., & Wood, R. A. (2015). “A Strategic Safety-Risk Management
Plan for Recovery after Disaster Operations.” Proc., ICSC15: The Canadian Society for Civil
Engineering 5th Int./11th Construction Specialty Conf, Univ. of British Columbia,
Vancouver, Canada.
Fardhosseini, M. S., and Esmaeili, B. (2016). “The impact of the legalization of recreational
marijuana on construction safety.” Construction Research Congress 2016, 2972–2983.
Gershenfeld, N. (2012). How to make almost anything: The digital fabrication revolution.
Foreign Aff., 91, p 43.
Habibnezhad, M., Fardhosseini, S., Vahed, A. M., Esmaeili, B., and Dodd, M. D. (2016). “The
relationship between construction workers’ risk perception and eye movement in hazard
identification.” Construction Research Congress 2016, 2984–2994.
Hamid, M., Tolba, O., & El Antably, A. (2018). BIM semantics for digital fabrication: A
knowledge-based approach. Automation in Construction, 91, 62-82.
Hurd, M. (2005). Formwork for Concrete (SP4), Seventh Edition. Michigan: American Concrete
Institute.
Hwang, B.G., M. Shan, and K.-Y. Looi. (2018). “Key Constraints and Mitigation Strategies for
Prefabricated Prefinished Volumetric Construction.” Journal of Cleaner Production 183:
183–193.
Hwang, S., Jebelli, H., Choi, B., Choi, M., and Lee, S. (2018). “Measuring Workers’ Emotional
State during Construction Tasks Using Wearable EEG.” Journal of Construction Engineering
and Management, 144(7), 04018050.
Jebelli, H., Ahn, C. R., and Stentz, T. L. (2016). “Fall risk analysis of construction workers using
inertial measurement units: Validating the usefulness of the postural stability metrics in
construction.” Safety Science, 84, 161–170.
Jebelli, H., Hwang, S., and Lee, S. (2017). “Feasibility of Field Measurement of Construction
Workers’ Valence Using a Wearable EEG Device.” Computing in Civil Engineering 2017,
ASCE, Reston, VA, 99–106.
Jebelli, H., Khalili, M. M., Hwang, S., and Lee, S. (2018). “A Supervised Learning-Based
Construction Workers’ Stress Recognition Using a Wearable Electroencephalography (EEG)
Device.” Construction Research Congress 2018, ASCE, Reston, VA, 40–50.
Kim, H., Ahn, C. R., Stentz, T. L., and Jebelli, H. (2018). “Assessing the effects of slippery steel
beam coatings to ironworkers’ gait stability.” Applied Ergonomics, 68, 72–79.
Manrique, J. D., Al-Hussein, M., Bouferguene, A., & Nasseri, R. (2015). Automated generation
of shop drawings in residential construction. Automation in Construction, 55, 15-24.
Pratama L.A., Fardhosseini M.S., and Lin K.Y. (2018). “An Overview of Generating VR Models
for Disaster Zone Reconstruction Using Drone Footage.” The University of Auckland, New
Zealand, 336–344.
Raspall, F. (2015). A procedural framework for design to fabrication. Automation in
Construction, 51, 132-139.
Samaneh, M. Y., and Masoud, M. S. (2013). “CNC Routing Machine.”
<https://eng.najah.edu/sites/eng.najah.edu/files/report.pdf>. (Accessed April 5, 2017).
Schodek, D. L., Bechthold, M., Griggs, K., Kao, K. M., & Steinberg, M. (2005). “Digital design
and manufacturing: CAD/CAM applications in architecture and design” Hoboken, NJ:
Wiley, 339-344
Spielholz, P., Wiker, S. F., & Silverstein, B. (1998). An ergonomic characterization of work in
concrete form construction. American Industrial Hygiene Association Journal, 59(9), 629-
635.
Wu, P., Zhao, X., Baller, J. H., & Wang, X. (2018). Developing a conceptual framework to
improve the implementation of 3D printing technology in the construction industry.
Architectural Science Review, 61(3), 133-142.

State-of-the-Art Review on the Applicability of AI Methods to Automated Construction Manufacturing
Mohsen Hatami1 ; Ian Flood, Ph.D.2; Bryan Franz, Ph.D.3; and Xun Zhang4
1Ph.D. Student, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, PO Box 115703, Gainesville, FL 32611. E-mail: [email protected]
2Professor and Director of Research and Graduate Education, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, PO Box 115703, Gainesville, FL 32611. E-mail: [email protected]
3Assistant Professor, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, PO Box 115703, Gainesville, FL 32611. E-mail: [email protected]
4Ph.D. Student, M. E. Rinker, Sr. School of Construction Management, Univ. of Florida, PO Box 115703, Gainesville, FL 32611. E-mail: [email protected]

ABSTRACT
Productivity in the U.S. construction industry has stagnated over the past 50 years, whereas
manufacturing industries have roughly doubled their productivity levels. Adopting smart
manufacturing in construction, however, poses challenges to achieving the efficiency of a factory
environment, because construction projects are one-off designs with little replication in the
configuration of components. Smart manufacturing systems depend on the ability to reconfigure
factory production and to optimize network performance. Artificial intelligence (AI) is well
suited to this problem. This paper provides an in-depth review of AI methods and how the technology may be
problem. This paper provides an in-depth review of AI methods and how the technology may be
applied to automated construction manufacturing systems. This starts with a state-of-the-practice
review of AI applications within construction manufacturing. This is followed by an
identification of the AI needs of construction manufacturing systems. Lastly, the paper reviews
the state-of-the-art of artificial neural networks (ANNs) (e.g. deep learning and transfer learning)
from the domains of manufacturing and industrial engineering, and discusses the potential for
application to construction manufacturing. The objective of the paper is to help identify the
direction for future research and development in this field.

INTRODUCTION
The US construction industry, which represents about 6% of the total US GDP (US Census
Bureau, 2016), has experienced a declining trend in productivity over the last 50 years, whereas
manufacturing industries have more than doubled their productivity levels (US Dept. of
Labor, 2016). Over the same period, there has been a growing expectation from owners and
society for improvement in the performance of the construction industry particularly in terms of
construction cost, delivery time, safety, quality, and environmental impact (Jaillon et al. 2009).
The application of manufacturing concepts to construction (such as factory-based panelizing,
modularization and 3D printing) is a promising strategy for improving industrial performance in
all of these respects, as well as overcoming the decline in the availability of transient skilled
labor (Bernstein 2018).
The National Institute of Standards and Technology (NIST) noted that smart manufacturing
systems would be successful in optimizing system performance by applying effective
reconfiguration of factory production and supply networks (Thompson 2014). Importantly, such
systems developed in this way must be able to deal with adverse conditions effectively and adapt
efficiently to uncertain circumstances. To this end, these systems must be improved over time by
using existing experience and by developing straightforward interoperability among all
interacting parties and organizations. Furthermore, it is expected that smart manufacturing
systems will perform proactively as well as being active in response to past events. Also, these
systems must be able to predict future conditions and events such as machine failure using
predictive analytic techniques.
On the other hand, the Association of Equipment Manufacturers (AEM 2018) reported that
the application of Artificial Intelligence (AI) in the construction industry could enhance business
success with respect to factors such as productivity and safety. As an example, AI can boost the
value of the Internet of Things (IoT) by improving predictive maintenance: by applying AI to
patterns learned from equipment data, users can predict failures in the construction industry in a
timely manner (Kranz 2016). Analytic systems become smarter with the application of machine
learning, especially in environments that generate large volumes of data. This provides users
with improved daily performance and better planning decisions throughout all stages of work,
whether on construction sites or within manufacturing facilities.
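
The predictive-maintenance idea above can be illustrated with a minimal sketch: learn "healthy" and "pre-failure" patterns from labeled sensor histories, then score new readings against the nearest class centroid. All feature names and numbers below are invented for illustration, not drawn from any cited system.

```python
def centroid(rows):
    """Element-wise mean of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def train(healthy, failing):
    """A nearest-centroid classifier: one centroid per condition class."""
    return centroid(healthy), centroid(failing)

def predict(model, reading):
    healthy_c, failing_c = model
    near_failure = distance(reading, failing_c) < distance(reading, healthy_c)
    return "maintain" if near_failure else "ok"

# Invented features per reading: [vibration RMS, bearing temperature (deg C)]
healthy = [[0.9, 40.0], [1.1, 42.0], [1.0, 41.0]]
failing = [[3.1, 70.0], [2.8, 68.0], [3.3, 73.0]]

model = train(healthy, failing)
print(predict(model, [2.9, 69.0]))  # close to the failure pattern -> "maintain"
```

A deployed system would of course use richer features, time-series models, and far more data; the point here is only the pattern-to-prediction loop.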
However, AEM (2018) identifies challenges that hinder widespread application and adoption
of this technology: (1) fear among workers - a belief circulates among some employees that AI
applications may eliminate jobs by replacing human labor, so a better understanding of data
science among employees is required; (2) cultural resistance - some culturally rooted attitudes
act as obstacles to adopting groundbreaking technologies, so people need to recognize that AI is
already part of everyday life and that its application should be prioritized for industrial
improvement (Kranz 2016); and (3) security - this challenge has become a critical problem from
an IT perspective, and the security industry must eventually address the unique requirements of
the IoT, inclusive of AI (Kranz 2016).
In the field of construction, AI is expected to change business models, including logistics,
finance, customer relationship management, and workflow automation. Furthermore, AI can
generate realistic simulation scenarios for training, helping to reduce injuries and costly mistakes
and to make operations more efficient. Hence, operators will be able to better employ existing
labor resources and address the shortage of skilled labor in the construction industry.

SPECIAL NEEDS OF CONSTRUCTION MANUFACTURING SYSTEMS


Smart manufacturing can provide a huge benefit to the construction industry (Thompson
2014). However, it also has some complicated problems that may challenge the efficiency
expected in a factory environment. The arrival rate of projects is sporadic, and construction
projects are one-off designs with little replication in the configuration of components.
Consequently, the output of production can rarely be stockpiled, and there are large fluctuations
in the workload over time that can lead to significant inefficiencies in the utilization of factory
resources (Al-Bazi et al. 2010; Shojaei and Flood 2017). A smart manufacturing system that can
resolve these difficulties would help overcome many of the problems currently faced by the
construction industry.
In the field of smart manufacturing systems, determining the specific influence of each part
- complex system, sub-system, and component interactions - on process output metrics and data
integrity is challenging. At all levels in a system, a uniform process with the ability to
guide manufacturing operations management and integrate wireless technologies, prognostics
and diagnostics, including cybersecurity, does not exist (Chen et al. 2018). A variety of
solutions have been proposed and are currently being evaluated, but these efforts remain
fragmented. In fact, the simultaneous operation of systems improves our understanding of
information flow relationships and their relevant details.
Existing manufacturing environments have many specific problems. While higher-level
prognostics and diagnostics depend heavily on detailed, undocumented, ad-hoc human
intervention and/or broadly defined automated methods, detailed automated prognostics and
diagnostics are performed only at the lowest levels. In this regard, the ability to integrate
networked machine tools and robots and to assure their operational efficiency is not well
supported by appropriate methods, protocols, and tools (Chang et al. 2018; Hatami and Ameri
Siahooei 2013). This has resulted in increased costs and extended times for responding to
dynamic production demands (Yu et al. 2013). The machine tools, robots, tooling, and
equipment needed for smart manufacturing systems are complex. Modeling the information
needed to integrate these tools and methods, and measuring their efficiency and performance,
currently requires employing experts who use their own approaches. Artificial Neural Networks
(ANNs) can learn and adapt to changing circumstances in data-rich environments, and have
proven themselves capable of outperforming humans on some non-trivial problems (see, for
example, Silver et al. 2017), supplanting or assisting human decision-making. As such, ANNs
offer great potential for advancing computer-integrated manufacturing and intelligent
manufacturing systems.
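
As a minimal, purely illustrative sketch of the learning ability attributed to ANNs here, the following fits a single sigmoid neuron by stochastic gradient descent to an invented binary task (the "machine load" framing and all numbers are assumptions, not from the cited works):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=2000, lr=0.5):
    """Fit one neuron (weight w, bias b) by stochastic gradient descent on log-loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(w * x + b)
            grad = p - y              # d(log-loss)/d(pre-activation)
            w -= lr * grad * x
            b -= lr * grad
    return w, b

# Invented data: machine load factor -> overloaded (1) or normal (0)
xs = [0.2, 0.4, 0.5, 0.8, 0.9, 1.1]
ys = [0, 0, 0, 1, 1, 1]
w, b = train(xs, ys)
print(round(sigmoid(w * 1.0 + b)))  # a heavily loaded machine classifies as 1
```

Deep networks stack many such units and learn hierarchical features, but the fit-by-gradient loop is the same mechanism at every scale.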

METHODOLOGY
The goal of this study is to identify and review the role of AI Methods in automated
construction manufacturing and to identify the research trends in this area, the needs and the gaps
in our current knowledge. This includes an assessment of the problems, methodologies, and
results presented in the literature, both within construction manufacturing and in related
industries. This was based on a systematic search within various databases and publications
dated up to November 2018. Primary databases included Science Direct, EBSCO Host, Web of
Science, Emerald, ASCE, and ProQuest. To this end, an assessment was made of the full
contents of these publications for all scholarly works including reports, conference papers, and
journal papers. The literature search involved two broad categories: the first related to AI and
ANN methods, and the second to construction manufacturing. The keyword sets used were:
("Artificial Neural Network" OR "Artificial Intelligence" OR "Deep Learning" OR
"Reinforcement Learning" OR "Transfer Learning") AND ("Construction" OR "Construction
Manufacturing" OR "Automated Construction" OR "Construction Industry"). As a result of this
approach, the most relevant articles were used to compile this state-of-the-art literature review.
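
The two-category screen described above amounts to an AND of ORs over keyword sets. The sketch below applies that logic to a mock record list; the titles are invented stand-ins, since the actual searches used each database vendor's own query syntax.

```python
AI_TERMS = ["artificial neural network", "artificial intelligence",
            "deep learning", "reinforcement learning", "transfer learning"]
DOMAIN_TERMS = ["construction", "construction manufacturing",
                "automated construction", "construction industry"]

def matches(title):
    """True when the title hits at least one term from each category."""
    t = title.lower()
    return any(k in t for k in AI_TERMS) and any(k in t for k in DOMAIN_TERMS)

# Invented titles standing in for database search results
records = [
    "Deep learning for automated construction quality control",
    "Reinforcement learning for robot locomotion",
    "Lean methods in the construction industry",
]
selected = [r for r in records if matches(r)]
print(selected)  # only the first title satisfies both keyword categories
```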

CURRENT STATE-OF-THE-PRACTICE IN THE FIELD OF AI WITHIN CONSTRUCTION
Although a limited number of start-ups have taken AI-focused approaches to gain market
attention, the application of AI in the construction industry is still at an early stage (Kulkarni et
al. 2017). Examples include project schedule optimizers that evaluate millions of alternatives for
project delivery in order to improve overall project planning; image recognition and
classification systems that assess video data collected on construction sites to identify
unacceptable, insecure, and unsafe behaviors by workers; and the collection and analysis of data
from sensors to enhance analytic platforms, ultimately improving the understanding of signals
and patterns in order to establish real-time solutions, reduce costs, prioritize preventative
maintenance, and prevent unplanned downtime (Blanco et al. 2018).
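
A schedule optimizer of the kind described above can be reduced, at toy scale, to exhaustive evaluation of alternatives. The sketch below enumerates crash/no-crash choices per activity and picks the cheapest plan meeting a deadline; all activity names, durations, and costs are invented.

```python
from itertools import product

# activity: (normal_days, crashed_days, crash_cost) -- invented figures
activities = {"excavate": (3, 2, 10), "form": (4, 2, 25), "pour": (5, 3, 40)}

def evaluate(choices):
    """Total duration and total crash cost for one crash/no-crash plan."""
    acts = list(activities.values())
    days = sum(a[1] if crash else a[0] for a, crash in zip(acts, choices))
    cost = sum(a[2] for a, crash in zip(acts, choices) if crash)
    return days, cost

DEADLINE = 9  # days
feasible = [c for c in product([False, True], repeat=len(activities))
            if evaluate(c)[0] <= DEADLINE]
best = min(feasible, key=lambda c: evaluate(c)[1])
print(best, evaluate(best))  # cheapest combination that meets the deadline
```

Real schedule optimizers search far larger spaces with metaheuristics or constraint solvers rather than enumeration, but the evaluate-and-select loop is the same.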
Significant applications of AI in construction and building fields are classified by Bharadwaj
(2018) under the following categories:
1. Planning and Design: Building Information Modeling (BIM) is a 3D model-based
process. Various architecture, engineering, and construction (AEC) professionals can
gain insights through BIM to assist in the efficient planning, design, construction, and
management of infrastructure and building projects. A widely used BIM software package
(commonly termed 4D BIM in the construction industry) is Autodesk Revit (Hamad
2018). This software allows users to design buildings and internal components in 3D
format while linking time and schedule related information to individual components
within this framework.
2. Safety and Efficiency: BIM 360 Project IQ (Ashuri 2018), developed by Autodesk, is a
product development initiative that uses connected data and machine learning to predict
and prioritize high-risk issues, including project subcontractor risk. Currently, Project IQ
is still in the pilot phase and requires BIM 360 field data, i.e., data collected from existing
users. The developer aims to improve the system for general contractors who use the
Autodesk BIM 360 construction management software.
3. Autonomous Equipment: Komatsu, a Japanese manufacturer of construction and mining
equipment, has partnered with NVIDIA to develop an AI-enhanced safety system. The
aim is a job-site system that builds 3D visualizations and tracks the entire construction
site, including machinery, objects, and the real-time interaction of people. NVIDIA
founder and CEO Jensen Huang wants to help operators work more efficiently and safely
by interpreting their surroundings through 3D models and continuously alerting workers
to nearby hazards.
4. Monitoring and Maintenance: ‘Comfy’, developed by Building Robotics, a startup in
Oakland, CA, is an application for automated building management that gives users in
commercial spaces and office buildings a centralized means of requesting thermostat
temperature changes. The platform is claimed to alter thermostat settings as needed in
areas with this capability (AEM 2018). Cha et al. (2017) proposed a vision-based method
using a deep convolutional neural network (CNN) architecture to detect concrete cracks
without explicitly calculating defect features.
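
The CNN method of Cha et al. rests on learned convolutional filters. The toy below shows only the core convolution operation, with a hand-set edge filter responding where a crack-like intensity discontinuity crosses a small invented image patch; in the actual method the filters are learned from data, not fixed.

```python
def conv2d_valid(img, kernel):
    """2-D 'valid' convolution (cross-correlation, as in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            row.append(sum(img[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# 4x4 patch: bright concrete (1.0) with a dark vertical crack (0.0) in column 2
patch = [[1.0, 1.0, 0.0, 1.0]] * 4
edge_filter = [[1, -1], [1, -1]]  # responds to horizontal intensity change
response = conv2d_valid(patch, edge_filter)
print(max(abs(v) for row in response for v in row))  # strong response at the crack edges
```

A trained CNN stacks many such filters with nonlinearities and pooling, learning kernel values that maximize detection accuracy instead of using a hand-set edge filter.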

STATE-OF-THE-ART AI APPROACHES FROM MANUFACTURING, INDUSTRIAL ENGINEERING, AND COMPUTER SCIENCE
Deep learning-ANNs (LeCun et al. 2015; Schmidhuber 2015; Yu and Deng 2011) are a
relatively new development in the field of AI. They are remarkably good at solving complicated
problems whose solutions were previously defined by a combination of conventional ANNs,
classic AI techniques, and algorithmic optimization methods (Flood 2008). They have, for
example, made feasible the development of self-driving vehicles (Szegedy et al. 2013). The
potential of deep learning-ANNs results from their ability to develop highly structured
hierarchical representations of information. This generally is found to increase an ANN’s ability
to develop meaningful solutions to functionally complicated problems.
A good example of deep learning application in daily life is interface tools for electronic
devices such as cell phones, tablets, and computers. Furthermore, this technology has enabled
other advanced applications such as object detection (Han et al. 2015), image segmentation (He
et al. 2015), site selection (Jozaghi et al. 2018; Shafieardekani and Hatami 2013), voice
analyzing (Fang et al. 2018), emotion detection (Faust et al. 2018), and gender recognition (Raza
et al. 2018).
According to the Construction Equipment (2018), AI applications also support powering
infrastructure, cybersecurity defense, health care analysis, recruiting automation, intelligent
conversational interfaces, reduced energy use and costs, predicting vulnerability exploitation,
becoming more customer-centric, market prediction, accelerated reading, cross-layer resilience
validation, accounting, advanced billing rules, understanding intentions and behaviors, and
proposal review. Transfer learning is a technique used to improve the performance of deep
learning by harnessing the knowledge obtained by another task (Szegedy et al. 2013).
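
Transfer learning, as just described, can be sketched in miniature: freeze a "pretrained" feature extractor and fit only a small output layer on the new task's few labeled examples. The extractor below is a stand-in function, not a real pretrained network, and the data are invented.

```python
import math

def pretrained_features(x):
    """Frozen feature extractor, standing in for the early layers of a deep net."""
    return [x, x * x]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_head(xs, ys, epochs=3000, lr=0.5):
    """Train only the output layer on features from the frozen extractor."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            f = pretrained_features(x)
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

# New task with very little data: invented defect-severity scores -> severe (1) or not (0)
xs, ys = [0.1, 0.3, 0.7, 0.9], [0, 0, 1, 1]
w, b = fit_head(xs, ys)
p = sigmoid(sum(wi * fi for wi, fi in zip(w, pretrained_features(0.8))) + b)
print(p > 0.5)  # classified while the extractor's parameters stay untouched
```

Keeping most parameters frozen is what lets transfer learning work from small datasets, which is the relevant property for data-scarce construction applications.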

FUTURE POTENTIAL FOR AI IN CONSTRUCTION


AI provides enormous possibilities in areas such as natural language processing and robotics.
In the current review, the authors have highlighted four exemplary applications of AI in other
industries that have significant potential for application to automated construction
manufacturing.
1. In the pharmaceutical industry, predictive AI solutions have made it possible to forecast
trial outcomes (Blanco et al. 2018). The industry has invested in extensive research and
development (R&D) and has lowered development costs, in part, by forecasting medical
trial outcomes. A similar approach can be considered for AI applications in the
construction industry, in particular for projects with R&D budgets. Outcomes can be
predicted in two ways: first, predictive applications can anticipate project risks,
constructability, and the structural stability of alternative technical solutions, supporting
decision-making and significantly cutting expenses; and second, predictive applications
can evaluate multiple materials and limit the downtime of specific structures during
inspection (Juszczyk 2017).
2. In retail supply chain management, AI has helped optimize materials and inventory
management (Oztemel and Gursev 2018). AI has been used to reduce costs by limiting
manufacturing downtime, avoiding supply surpluses, and increasing the predictability of
shipments. A similar supervised learning approach could serve engineering and
construction (E&C), where modularization and prefabrication are now widespread and
many projects use off-site construction to supply a vast quantity of components.
Therefore, enhanced supply chain coordination is required to control overall expenses.
3. The application of robotics and automation to modular or prefabricated construction
(including 3D printing) could be taken further to maximize its benefits, for example
through machine learning, as the use of modularization and 3D printing continues to
expand in the construction industry. Researchers have, for instance, successfully trained
robotic arms to move by learning from simulations. Similar robotics applications are
relevant to the E&C industry in prefabrication techniques and in maintenance operations,
for example in the oil and gas industries.
4. In the healthcare industry, machine-learning methods have been successfully applied to
image recognition and analysis (Yang et al. 2015). These technological advances have
drastically improved the ability of medical professionals to diagnose illnesses by
detecting, for example, known markers for various conditions. Similar technology could be
applied to drone-collected imagery for automated progress monitoring and quality
inspection (Moud et al. 2018). Using this technology, identifying quality control issues
such as defects in execution and structural problems in infrastructure, as well as early
detection of critical events, becomes feasible. Engineers can benefit from these
technological advances in automated construction manufacturing by comparing final
products to initial designs. Furthermore, models could be trained in unsafe-behavior
detection to diagnose possible safety risks and hazards.

CONCLUSION
The construction industry in the United States suffers from low productivity compared to
other industries. This is due to the nature of construction projects: one-off designs that leave
little room for efficiencies of scale, the sporadic arrival of work, high costs and risk stemming
from a high level of uncertainty, and a reliance on transient skilled labor. Automated
construction manufacturing has been considered a potential way to tackle these ongoing issues
and to advance the construction industry. As with other industries where smart manufacturing
and artificial
intelligence methods have been successfully applied, the construction industry can benefit from
these advances across the board including all aspects of project planning, monitoring,
coordination and control, as well as safety diagnosis, and quality control. Deep learning methods
are providing effective solutions to functionally complicated problems (Szegedy et al. 2013).
Automated construction manufacturing may benefit from AI by following the approaches of
other industries, such as applying predictive AI solutions to reduce R&D expenses, online
optimization for better monitoring and management, supervised learning for modularization and
prefabrication in engineering and construction, robotic coordination for modular or prefabricated
construction, and machine learning methods for image recognition in risk and safety
management.

REFERENCES
AEM. (2018). “Why Artificial Intelligence Will Transform Industry Business Models.”
Association of Equipment Manufacturers, <https://www.aem.org/news/why-artificial-
intelligence-will-transform-industry-business-models/> (Dec. 1, 2018).
Al-Bazi, A., Dawood, N., and Dean, J. (2010).
“Improving performance and the reliability of off-site pre-cast concrete production
operations using simulation optimisation.” Journal of Information Technology in
Construction.
Ashuri, B. (2018). “Building Information Modeling (BIM) and Big Data Analytics.” 33.
Bernstein, P. (2018). “Future of Construction Is Manufacturing Buildings.” Redshift EN.
Bharadwaj, R. (2018). “AI Applications in Construction and Building - Artificial Intelligence
Companies, Insights, Research.” Emerj, <https://emerj.com/ai-sector-overviews/ai-
applications-construction-building/> (Dec. 3, 2018).
Blanco, J. L., Fuchs, S., Parsons, M., and Ribeirinho, M. J. (2018). “Artificial intelligence:
Construction technology’s next frontier | McKinsey.”
<https://www.mckinsey.com/industries/capital-projects-and-infrastructure/our-
insights/artificial-intelligence-construction-technologys-next-frontier> (Dec. 3, 2018).
Cha, Y.-J., Choi, W., and Büyüköztürk, O. (2017). “Deep Learning-Based Crack Damage
Detection Using Convolutional Neural Networks.” Computer-Aided Civil and Infrastructure
Engineering, 32(5), 361–378.
Chang, C.-W., Lee, H.-W., and Liu, C.-H. (2018). “A Review of Artificial Intelligence
Algorithms Used for Smart Machine Tools.” Inventions, 3(3), 41.
Chen, Q., García de Soto, B., and Adey, B. T. (2018). “Construction automation: Research areas,
industry concerns and suggestions for advancement.” Automation in Construction, 94, 22–38.
Construction Equipment. (2018). “The Ways Artificial Intelligence Will Change Construction |
Construction Equipment.” <https://www.constructionequipment.com/ways-artificial-
intelligence-will-change-construction> (Dec. 3, 2018).
Fang, S.-H., Tsao, Y., Hsiao, M.-J., Chen, J.-Y., Lai, Y.-H., Lin, F.-C., and Wang, C.-T. (2018).
“Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.”
Journal of Voice.
Faust, O., Hagiwara, Y., Hong, T. J., Lih, O. S., and Acharya, U. R. (2018). “Deep learning for
healthcare applications based on physiological signals: A review.” Computer Methods and
Programs in Biomedicine, 161, 1–13.
Flood, I. (2008). “Towards the next generation of artificial neural networks for civil
engineering.” Advanced Engineering Informatics, Intelligent computing in engineering and
architecture, 22(1), 4–14.
Flood, I. (2015). “Modeling Construction Processes: A Structured Graphical Approach
Compared to Construction Simulation.” Computing in Civil Engineering 2015, Proceedings.
Hamad, M. (2018). Autodesk Revit 2019 Architecture. Stylus Publishing, LLC.
Han, J., Zhang, D., Cheng, G., Guo, L., and Ren, J. (2015). “Object Detection in Optical Remote
Sensing Images Based on Weakly Supervised Learning and High-Level Feature Learning.”
IEEE Transactions on Geoscience and Remote Sensing, 53(6), 3325–3337.
Hatami, M., and Ameri Siahooei, E. (2013). “Examines criteria applicable in the optimal location
new cities, with approach for sustainable urban development.” Middle-East Journal of
Scientific Research, 14(5), 734–743.
He, K., Zhang, X., Ren, S., and Sun, J. (2015). “Deep Residual Learning for Image
Recognition.” arXiv:1512.03385 [cs].
Jaillon, L., Poon, C. S., and Chiang, Y. H. (2009). “Quantifying the waste reduction potential of
using prefabrication in building construction in Hong Kong.” Waste management, 29(1),
309–320.
Jozaghi, A., Alizadeh, B., Hatami, M., Flood, I., Khorrami, M., Khodaei, N., and Ghasemi Tousi,
E. (2018). “A Comparative Study of the AHP and TOPSIS Techniques for Dam Site
Selection Using GIS: A Case Study of Sistan and Baluchestan Province, Iran.” Geosciences,
8(12), 494.
Juszczyk, M. (2017). “The Challenges of Nonparametric Cost Estimation of Construction Works
with the use of Artificial Intelligence Tools.” Procedia Engineering, Creative Construction
Conference 2017, CCC 2017, 19-22 June 2017, Primosten, Croatia, 196, 415–422.
Kranz, M. (2016). Building the Internet of Things: Implement New Business Models, Disrupt
Competitors, Transform Your Industry. John Wiley & Sons.
Kulkarni, P., Londhe, S., and Deo, M. (2017). “Artificial Neural Networks for Construction
Management: A Review.” Soft Computing in Civil Engineering, 1(2), 70–88.
Lawson, R. M., Ogden, R. G., and Bergin, R. (2012). “Application of Modular Construction in
High-Rise Buildings.” Journal of Architectural Engineering, 18(2), 148–154.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). “Deep learning.” Nature, 521(7553), 436–444.


Game Simulation to Support Construction Automation in Modular Construction Using BIM and Robotics Technology—Stage I
Oscar Wong Chong1 and Jiansong Zhang, Ph.D., A.M.ASCE2
1Automation and Intelligent Construction (AutoIC) Laboratory, Purdue Univ., West Lafayette, IN 47907. E-mail: [email protected]
2Automation and Intelligent Construction (AutoIC) Laboratory, Purdue Univ., West Lafayette, IN 47907. E-mail: [email protected]

ABSTRACT
Modular construction has been proven to be more time-efficient than stick-built construction. However, the limited adoption of automation technologies represents a missed opportunity in modular construction, where the time efficiency of the modular process can be further improved, along with quality and safety. To address this problem, the authors proposed a new simulation method for modular construction using BIM, game simulation, and robotics technology to help analyze and promote automation in the modular construction of wood structures in a controlled indoor environment. As a first stage, the authors presented their simulation methodology, the definition of the parameters and constraints for the construction simulation, and the creation of the interactive simulation model. This first stage will be the foundation for further development that will enable the assessment of construction productivity when integrating robotic systems into the modular construction workflow.

INTRODUCTION
The productivity of the construction industry in the U.S. and many other countries has been stagnant for decades, compared with faster productivity improvements in other industries such as manufacturing (McKinsey Global Institute, 2017). The recent construction workforce shortage in the U.S. has led to a resurgence of modular construction due to its time-efficiency advantages over stick-built construction. However, the lack of wide adoption of automation technologies represents a missed opportunity in modular construction, where the time efficiency of the modular process can be further improved, as can quality and safety. For example, the fabrication and assembly operations of modular construction in residential units are still mainly performed manually, which can be prone to human error, safety issues, and labor constraints. Therefore, a methodology to determine the performance of automating the modular construction process with robotics technology is needed to quantify the potential productivity improvement in comparison to solely manual operations.
To address this gap, the authors proposed a simulation of modular construction using
building information modeling (BIM), game simulation, and robotics technology, to help analyze
automation in modular construction. The proposed game simulation consists of using robotic
systems to automate the modular construction of wood residential units in a controlled indoor
environment. The simulation game will be created in a modular fashion that will enable easy
reconfiguration and customization of the simulation components. Modular construction
workflows of a single-story wood housing unit with and without the use of robotic systems will
be simulated and analyzed to determine the performance difference between them. As a first
stage, the authors: (1) presented a simulation methodology, (2) defined the parameters and
constraints of automation technologies used, and (3) created an interactive simulation model.


This first stage will serve as the foundation for further development of the methodology that will
enable the assessment of productivity performance by integrating robots in modular construction
operations.

BACKGROUND
Modular construction: Modular construction is a construction method where building components and/or systems called “modules” are built in a factory setting and then transported to their final locations, where the modules are installed (Modular Building Institute 2010). Modular construction can be designed and constructed using different types of materials such as concrete, steel, and/or other types of composite materials (Lawson et al. 1999, Lawson et al. 2014). This study focuses on wood because it is one of the most utilized materials for residential buildings in the U.S. (Foliente 2000).
In North America, the concept of off-site construction has been around for more than a century, starting with the introduction of prefabricated houses shipped in pieces and then assembled on site by local builders (Sears 2012). The use of prefabrication in mass production began after World War II, when stick-built construction could not keep up with the high demand for houses at that time (Musa et al. 2016). Since the industrial revolution, the advancement of technologies and the development of automation techniques such as robotics have opened new possibilities for modular construction.
The selection of modular construction is subject to many factors and constraints, such as suitability for the project, the need to expedite the schedule, site accessibility, restrictions of the site layout, flexibility for changes, and owners’ perceptions (Azhar et al. 2013). However, modular construction has many advantages over conventional stick-built construction, such as working in a controlled environment, the ability to perform parallel activities to compress the schedule, and the reduction of construction-site waste (Lawson and Ogden 2010, KPMG 2016, McGraw Hill Construction 2011).
Automation in modular construction: The adoption of robotic systems in modular construction is suitable because modular construction is performed in an indoor environment similar to a factory setting, which reduces the complexity caused by the unstructured environment of a construction site and benefits from productivity gains through repetitive prefabrication and assembly tasks (Balaguer and Abderrahim 2008). A recent increase in automation research in modular construction has been observed (Neelamkavil 2009; Taghaddos et al. 2018), especially with the integration of robotics into the construction and fabrication processes. For example, Willmann et al. (2016) integrated a manipulator robot into the construction of non-standard timber structures from a series of simple wood members. Eversmann et al. (2017) combined CNC machines and two industrial robots to enable large-scale spatial fabrication of timber construction.
Game Engines: Game engines are traditionally used for game development, but they can also be used for simulation and modeling purposes because of their flexibility and capability in creating virtual scenarios with realistic interactions. Game simulation has been widely used in the architecture, engineering, and construction (AEC) domain (Nikolic et al. 2009; Natephra et al. 2016; Bille et al. 2014) to serve different purposes such as building virtual walk-throughs based on architectural designs (Yan et al. 2011), airflow simulation and visualization based on MEP designs (Shen et al. 2012), interior design decision support through intuitive comparison of alternatives (Heydarian et al. 2015), construction equipment training (Mastali and Zhang 2017),


construction safety investigation (Yu et al. 2017), facility management (Shi et al. 2016), and
lighting fixture design evaluations (Bucarelli et al. 2018), among others. The implementation of
these simulation systems would be technically difficult and/or time consuming without the use of
a game engine (Bille et al. 2014).
In this study, a game engine is used to simulate the construction process of modular buildings because of its ability to (1) incorporate BIM and robotic models, (2) add physics properties to objects, and (3) make changes to the simulation through a programming environment. Although simulation tools that use BIM are available, they are not as capable of incorporating robotic systems into their simulations as game engines. Similarly, many robotic simulators are available; however, they lack support for BIM. The authors’ literature review showed a lack of simulation work that incorporates modular construction, BIM, and robotic systems all together.

PROPOSED GAME SIMULATION METHOD AND EXPERIMENT


The authors proposed to use game simulation to analyze the productivity difference between conventional modular construction and modular construction with robotics technology. The game simulation will be built upon BIM and will integrate robotics components (e.g., a mobile robot) according to their real-world data, such as capabilities and movement speeds, to help analyze their use in construction. The proposed game simulation method is divided into three phases, and each phase consists of two steps, resulting in six steps in total (Figure 1).

Figure 1. Proposed simulation method for modular construction.


Phase I consists of: Step 1 – Select simulation software, BIM input, and robotic system(s); and Step 2 – Create simulation scenes and components. For Phase II, two main scenarios will be simulated. In Step 3 – Simulate construction tasks performed by workers, a construction operation based solely on manual labor will be simulated. In Step 4 – Simulate integrated robotic system(s) in construction tasks, selected robotic systems will be integrated into the workflow of modular construction using their real-world performance data. Lastly, in Phase III, the two steps will focus on the analysis and interpretation of results based on the experiments from Phase II: Step 5 – Compare performances in terms of productivity between different scenarios, and Step 6 – Interpret results. The implementation of Phase I is presented in the following section.
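The three-phase, six-step structure described above can be captured as a simple data structure for reference (a sketch in Python; the phase titles are paraphrased, not taken from the paper):

```python
# Phases and steps of the proposed simulation method, paraphrased from Figure 1.
SIMULATION_METHOD = {
    "Phase I (Setup)": [
        "Step 1: Select simulation software, BIM input, and robotic system(s)",
        "Step 2: Create simulation scenes and components",
    ],
    "Phase II (Simulation)": [
        "Step 3: Simulate construction tasks performed by workers",
        "Step 4: Simulate integrated robotic system(s) in construction tasks",
    ],
    "Phase III (Analysis)": [
        "Step 5: Compare productivity between the different scenarios",
        "Step 6: Interpret results",
    ],
}

for phase, steps in SIMULATION_METHOD.items():
    print(phase)
    for step in steps:
        print("  " + step)
```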


PRELIMINARY EXPERIMENTAL RESULTS AND ANALYSIS


By implementing Phase I (Steps 1 and 2) of the proposed method, the following preliminary results were obtained.
Step 1 – Select simulation software, BIM input, and robotic system(s): In this study, a single-story residential wood building was modeled using 3ds Max and Revit according to design drawings. Similar to Aldafaay et al. (2017), who presented the knowledge extraction, game simulation, and visualization development of a steel erection operation, the authors integrated BIM into the game engine to examine the modular construction of a wood structure. To incorporate real-world data into the simulation of the construction process, the actual construction operations in the Purdue University construction lab were observed and recorded.
Among available game engines, the authors selected Unity for this study based on its functional fidelity (physics modules), availability, and composability (ability to import/export various resources), according to the criteria for game engine selection described by Petridis et al. (2010). The game engine is compatible with many data formats such as .fbx, .dae, .3ds, .dxf, .obj, and .skp, and supports proprietary files such as those from Maya, Blender, and 3D Studio Max (Unity Technologies 2018b). One major limitation was the lack of a direct and seamless integration with BIM (e.g., IFC). To overcome this limitation, a partial solution was to use third-party pipeline approaches such as those summarized by Bille et al. (2014). Recently, Unity announced full integration with Autodesk Revit, making the integration of BIM into game simulation more straightforward (Unity Technologies 2018a).
The selected robotic system is the Fetch research robot, a wheeled robot with 2D mobility and flexible omnidirectional motion. Later, a workflow will be developed that integrates this mobile robot to transport small resources on the jobsite and to aid in the assembly of the wood structure.

Figure 2. (a) Crane; (b) crane simulation; (c) wood structure; (d) wood structure simulation.
Step 2 – Create simulation scenes and components: In this step, the background, static and dynamic components, and movement/interaction rules were developed and added to the simulation scenes. The Purdue University construction lab was used as the background, where the building structure and its enclosed space were modeled. Static components mainly refer to the wood structure under construction, and dynamic components include a heavy-duty crane, construction workers, and the robotic system. All resources were modeled based on their real-world geometries. For example, the building structure was created in BIM based on the drawings and then imported into the game simulation (Figure 2).
In addition to modeling the components, movement/interaction rules were also developed.
An experimental test was conducted to determine the speeds of the 3 DOF heavy-duty crane in the construction lab. The speeds (rotational, horizontal, and vertical translations) of the crane
movements were determined by measuring the time the hook took to travel specified distances.
The orientation and directional movements of the hook were chosen as shown in Figure 3.
The times and distances measured for the rotational, horizontal (forward – backward), and vertical (up – down) movements are presented in Table 1. A total of three measurements were taken for each directional and rotational movement. The rotational speed was calculated by dividing the full rotation (360 degrees) by the time the crane arm took to complete a full circle. For the horizontal and vertical speed measurements, predefined horizontal and vertical distances were used together with the travel times of the crane to determine the horizontal and vertical speeds. The average speeds were computed from the collected data.

Figure 3. Measurement data: (a) rotation; (b) horizontal and vertical directional movements.
Table 1. Experimental data and speed calculations.

      |        Rotational Speed          |     Horizontal Speed      |      Vertical Speed
  #   | Rotation   Time   Angular Speed  | Distance   Time    Speed  | Distance   Time    Speed
      | (degree)   (s)    (degree/s)     | (ft)       (s)     (ft/s) | (ft)       (s)     (ft/s)
  1   |   360     103.65     3.473       |  12.46    21.86    0.570  |   4.00     9.18    0.436
  2   |   360     103.30     3.485       |  12.46    21.88    0.569  |   4.00     9.15    0.437
  3   |   360     103.62     3.474       |  12.46    21.11    0.590  |   4.00     9.17    0.436
 Avg. |                      3.477       |                    0.577  |                    0.436
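The speed calculations described above can be reproduced directly from the measurements in Table 1 (a small Python sketch; the variable names are the authors' quantities, not code from the paper):

```python
from statistics import mean

# Timed measurements from Table 1.
rotation_times = [103.65, 103.30, 103.62]                 # s per full 360-degree turn
horizontal = [(12.46, 21.86), (12.46, 21.88), (12.46, 21.11)]  # (distance ft, time s)
vertical = [(4.00, 9.18), (4.00, 9.15), (4.00, 9.17)]          # (distance ft, time s)

# Rotational speed: full rotation angle divided by the time for a complete circle.
angular_speeds = [360.0 / t for t in rotation_times]       # degree/s
# Translational speeds: predefined distance divided by travel time.
horizontal_speeds = [d / t for d, t in horizontal]         # ft/s
vertical_speeds = [d / t for d, t in vertical]             # ft/s

print(round(mean(angular_speeds), 3))     # 3.477 degree/s
print(round(mean(horizontal_speeds), 3))  # 0.577 ft/s
print(round(mean(vertical_speeds), 3))    # 0.436 ft/s
```

The printed averages match the "Avg." row of Table 1, which were then used as the crane's movement parameters in the game simulation.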

Figure 4. Simulation of crane operation by transporting a bucket from point a to point b.


The average speeds were used as input for the modeled crane. To evaluate the feasibility of the simulation, the simulation results were compared against an actual crane operation. The simulated crane operation of carrying a bucket from a predefined point (a) to another predefined point (b) predicted a total of 100 seconds for this operation (Figure 4). A physical test in the construction lab took a total of 109 seconds to perform the same operation (Figure 5). The time difference between the simulation prediction and the actual operation is under 10%, which shows that the proposed simulation method is promising.

Figure 5. Testing of crane operation by transporting a bucket from point a to point b.
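The reported "under 10%" discrepancy can be verified with a one-line calculation using the two times above (the 100 s and 109 s figures come from the simulation and the physical test, respectively):

```python
simulated = 100.0  # seconds predicted by the game simulation
measured = 109.0   # seconds observed in the physical crane test

# Relative error of the prediction with respect to the measured time.
relative_error = abs(measured - simulated) / measured
print(f"{relative_error:.1%}")  # 8.3%
```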

CONCLUSION
In this study, the authors proposed a game simulation method that integrates BIM, a game engine, and robotic systems in a modularized way to investigate the modular construction of a wood structure in different scenarios regarding the use of robotic systems. The authors implemented the first phase of the method to build the fundamental assets: the components of a wood structure, a construction lab environment, and a 3 DOF heavy-duty crane. All assets were modeled following their real-world geometries and functions. At the completion of all phases of the proposed method, the simulations will help analyze the productivity of building the wood structure in different ways. The authors’ preliminary results in predicting the operation time of the heavy-duty crane in picking up and transporting an object (a bucket) gave a time difference under 10% and showed the promise of the method.

LIMITATIONS AND FUTURE WORK


A main limitation of this study is acknowledged: despite the promise of incorporating real-world data, as shown in the preliminary results, the difficulty of simulating real operations may increase when more interactions between workers, robots, and equipment/environments are introduced into the simulation. In future work, the simulations for constructing the wood structure with and without robotic systems, and their comparison, will be carried out. The study will be extended to further assess productivity implications when introducing different types of robotic systems into the construction workflow. Lessons learned from this simulation method can support its use in a wide range of scenarios, such as interactive safety training and construction education, by integrating it with virtual reality interfaces.

ACKNOWLEDGMENTS
The authors would like to thank the National Science Foundation (NSF). This material is based on work supported by the NSF under Grant No. 1827733. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

REFERENCES
Aldafaay, M., Zhang, J., and Oh, J. (2017). “Visualizing the constructability of a steel structure using building information modeling and game simulation.” Proc., International Conference on Maintenance and Rehabilitation of Constructed Infrastructure Facilities (2017 MAIREINFRA), South Korea.


Azhar, S., Lukkad, M.Y., and Ahmad, I. (2013). “An investigation of critical factors and constraints for selecting modular construction over conventional stick-built technique.” International Journal of Construction Education and Research, 9(3), 203–225.
Balaguer, C., and Abderrahim, M. (2008). “Trends in robotics and automation in construction.”
Chapter 1 in Robotics and Automation, Balaguer, C., and Abderrahim, M., eds. IntechOpen,
London, UK, 1-20.
Bille, R., Smith, S.P., Maund, K., and Brewer, G. (2014). “Extending building information
models into game engines.” Proc., 2014 Conference on Interactive Entertainment 2014 -
IE2014, 1–8.
Bucarelli, N., Zhang, J., and Wang, C. (2018). “Maintainability assessment of light design using
game simulation, virtual reality, and brain sensing technologies.” Proc., ASCE Construction
Research Congress 2018, ASCE, Reston, VA, 378–387.
Eversmann, P., Gramazio, F., and Kohler, M. (2017). “Robotic prefabrication of timber
structures: towards automated large-scale spatial assembly.” Construction Robotics, 1(1–4),
49–60.
Foliente, G.C. (2000). “History of timber construction.” Wood Structures: A Global Forum on the Treatment, Conservation, and Repair of Cultural Heritage, Kelley, S.J., Loferski, J.R., Salenikovich, A.J., and Stern, E.G., eds., STP1351-EB, ASTM International, West Conshohocken, PA, 3-22.
Heydarian, A., Carneiro, J., Gerber, D., Becerik-Gerber, B., Hayes, T., and Wood, W. (2015).
“Immersive virtual environments versus physical built environments: a benchmarking study
for building design and user-built environment explorations.” Automation in Construction,
54, 116-126.
KPMG. (2016). “Smart construction: how offsite manufacturing can transform our industry.”
<https://home.kpmg/uk/en/home.html> (Nov.30, 2018).
Lawson, R.M., Grubb, P., Prewer, J., and Trebilcock, P.J. (1999). “Modular construction using light steel framing: an architect’s guide.” P272, SCI, London, England.
Lawson, R.M., and Ogden, R. G. (2010). “Sustainability and process benefits of modular
construction.” Proc., 18th CIB World Building Congress, 38-51.
Lawson, R.M, Ogden, R., and Goodier, C. (2014). “Design in modular construction.” Taylor &
Francis Group, London, UK.
Mastali, M., and Zhang, J. (2017). "Interactive highway construction simulation using game
engine and virtual reality for education and training purpose." Proc., 2017 ASCE Intl.
Workshop on Comput. in Civ. Eng., ASCE, Reston, VA, 399-406.
McGraw Hill Construction. (2011). "Prefabrication modularization: increasing productivity in
the construction industry".
<https://www.nist.gov/sites/default/files/documents/el/economics/Prefabrication-
Modularization-in-the-Construction-Industry-SMR-2011R.pdf> (Oct. 12, 2018).
McKinsey Global Institute. (2017). “Reinventing construction: a route to higher productivity.”
McKinsey and Company, <http://www.mckinsey.com/industries/capital-projects-and-
infrastructure/our-insights/reinventing-construction-through-a-productivity-revolution>
(Nov. 30, 2018).
Modular Building Institute. (2010). “Improving construction efficiency and productivity with
modular construction.” < http://www.modular.org> (Oct. 12, 2018).
Musa, M.F., Yusof, M.R., Mohammad, M.F., and Sahidah, N. (2016). “Towards the adoption of modular construction and prefabrication in the construction environment: a case study in Malaysia.” Journal of Engineering and Applied Sciences, 11(13), 8122–8131.
Natephra, W., Motamedi, A., Fukuda, T., and Yabuki, N. (2016). “Integrating building
information modeling and game engine for indoor lighting visualization.” Proc., 16th
International Conference on Construction Applications of Virtual Reality 2016, HK, China,
605-618.
Neelamkavil, J. (2009). “Automation in the prefab and modular construction industry.” Proc.,
the International Symposium on Automation and Robotics in Construction (ISARC 2009),
IAARC, 299–306.
Nikolic, D., Jaruhar, S., and Messner, J.I. (2009). “An educational simulation in construction: the
virtual construction simulator.” Proc., 2009 ASCE Intl. Workshop on Comput. in Civ. Eng.,
ASCE, Reston, VA, 633–642.
Petridis, P., Dunwell, I., De Freitas, S., and Panzoli, D. (2010). “An engine selection
methodology for high fidelity serious games.” Proc., 2nd International Conference on
Games and Virtual Worlds for Serious Applications, VS-GAMES 2010, 27–34.
Sears. (2012). “Sears Archives.” <http://www.searsarchives.com/homes/index.htm> (Nov. 17,
2018).
Shen, Z., Jiang, L., Grosskopf, K., and Berryman, C. (2012). “Creating 3D web-based game environment using BIM models for virtual on-site visiting of building HVAC systems.” Proc., Construction Research Congress 2012, ASCE, Reston, VA, 1212–1221.
Shi, Y., Du, J., Lavy, S., and Zhao, D. (2016). “A multiuser shared virtual environment for
facility management.” Procedia Engineering 145, 120-127.
Taghaddos, H., Hermann, U., and Abbasi, A.B. (2018). “Automated crane planning and
optimization for modular construction.” Automation in Construction, 95, 219-232.
Unity Technologies. (2018a). “Unity and Autodesk: Powering immersive experiences.” <https://unity3d.com/partners/autodesk> (Nov. 22, 2018).
Unity Technologies. (2018b). “Unity User Manual.” <https://docs.unity3d.com/2018.3/Documentation/Manual/3D-formats.html> (Nov. 5, 2018).
Willmann, J., Knauss, M., Bonwetsch, T., Apolinarska, A.A., Gramazio, F., and Kohler, M.
(2016). “Robotic timber construction - Expanding additive fabrication to new dimensions.”
Automation in Construction, 61, 16–23.
Yan, W., Culp, C., and Graf, R. (2011). “Integrating BIM and gaming for real-time interactive
architectural visualization.” Automation in Construction, 20(4), 446–458.
Yu, Y., Zhang, J., and Guo, H. (2017). "Investigation of the relationship between construction
workers' psychological states and their unsafe behaviors using virtual environment-based
testing." Proc., 2017 ASCE Intl. Workshop on Comput. in Civ. Eng., ASCE, Reston, VA,
417-424.


UAV-UGV Cooperative 3D Environmental Mapping


Pileun Kim1; Leon C. Price2; Jisoo Park3; and Yong K. Cho4
1Ph.D. Student, School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0355, USA. E-mail: [email protected]
2Ph.D. Student, School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0405, USA. E-mail: [email protected]
3Ph.D. Student, School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0405, USA. E-mail: [email protected]
4Associate Professor, Dept. of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0355, USA (corresponding author). E-mail: [email protected]

ABSTRACT
The primary advantages of utilizing a mobile robot such as an unmanned ground vehicle (UGV) are its autonomous navigation and data collection capabilities in a large-scale environment. However, a UGV is restricted by its limited perception ability in a cluttered environment. To address this limitation, an unmanned aerial vehicle (UAV) can cooperate with the UGV to build a detailed and complete map. The objectives of this study are to operate aerial and ground robots cooperatively and to fuse point cloud data complementarily. This paper introduces an autonomous cooperation framework using a UGV and a UAV for 3D geometric data collection in a dynamic, cluttered environment. First, the UAV is deployed and collects 3D terrain data from photo images to get an overall idea of the target site. Then, the path plan and stationary scanning locations for the UGV are estimated using the gradient-based 3D model generated by the UAV. This method was tested on a real-world construction site and obtained promising results. The proposed UAV-UGV cooperative robotic approach is expected to significantly reduce human intervention in data collection and data processing time. Furthermore, it enables frequent monitoring, updating, and analysis of a cluttered environment for timely decision making.

INTRODUCTION
Currently available 3D laser scanners capture surrounding 3D position data with unprecedented speed and millimeter accuracy; their main applications are in the visualization, modeling, and monitoring of infrastructure and buildings. Obtaining a high-resolution point cloud quickly and with well-characterized observation quality is challenging, but this task is essential, particularly for surveying applications. This is because occlusion by objects and the scanning geometry play an important role in the quality of the resulting laser-scanned point cloud. Furthermore, the dynamic and complex characteristics of construction sites not only require a variety of 3D geometric data but also make it difficult to collect spatial data efficiently and effectively. The 3D geometric data acquisition process relies on the intuition or experience of the data collector, which includes choosing the 3D spatial domain and placing physical targets for registration (Cho et al. 2012). This might not be practical or efficient due to the complexity of the site, which leads to redundant, missing, and unnecessary data. Also, one cannot determine whether a scan task is successful until the point cloud registration is complete, as registration is a post-processing job. Therefore, a mobile-robot-based automatic job site inspection process for the daily work cycle is an attractive alternative.


To be a practical solution for construction applications, an autonomous mobile robot must have the capability to safely move around an unstructured and cluttered environment and to select optimal scan locations. In general, mobile robot navigation and planning are based on an existing map of the environment. However, the construction environment keeps changing dynamically and, in most cases, differs from as-designed conditions. To increase the efficiency of daily as-is data collection in the field, this study proposes a framework for laser-scan location and path planning of an autonomous mobile robot assisted by an unmanned aerial vehicle (UAV). The proposed method seeks to minimize the number of scanning positions by searching for the optimal positions of the mobile-robot-based laser scanning system while ensuring the completeness of the data. This process starts with a designated initial scan and places a grid of candidate scan spots across the entire target site. Then, a global visibility analysis is performed to identify the best scan locations. As a common cluttered environment, an outdoor construction site is primarily discussed and utilized as a test bed in this study. The following sections present the literature review, objective, methodology, discussion, and conclusion.
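The candidate-grid and visibility-analysis idea described above can be sketched as a greedy coverage selection over a grid of candidate scan spots. The following is a simplified 2D illustration under invented geometry (the grid, targets, obstacle wall, and the crude line-of-sight test are all assumptions for the example, not the authors' implementation):

```python
import math

def visible(scan, target, obstacles, max_range=30.0):
    """Crude line-of-sight test: the target is within scanner range and no
    obstacle point lies close to the straight segment from scan to target."""
    d = math.dist(scan, target)
    if d > max_range:
        return False
    steps = max(int(d), 1)
    for i in range(1, steps):
        t = i / steps
        p = (scan[0] + t * (target[0] - scan[0]),
             scan[1] + t * (target[1] - scan[1]))
        if any(math.dist(p, o) < 0.5 for o in obstacles):
            return False
    return True

def select_scan_locations(candidates, targets, obstacles):
    """Greedy set cover: repeatedly pick the candidate spot that sees the
    most still-uncovered targets, until every target is covered or no
    candidate adds coverage (remaining targets are fully occluded)."""
    uncovered = set(targets)
    chosen = []
    while uncovered:
        best = max(candidates,
                   key=lambda c: sum(visible(c, t, obstacles) for t in uncovered))
        gained = {t for t in uncovered if visible(best, t, obstacles)}
        if not gained:
            break
        chosen.append(best)
        uncovered -= gained
    return chosen

# Toy site: a 3x3 grid of candidate spots, a vertical obstacle wall at x = 10,
# and four target points to be covered by the scans.
candidates = [(x, y) for x in (0, 10, 20) for y in (0, 10, 20)]
obstacles = [(10.0, float(y)) for y in range(4, 17)]
targets = [(2, 2), (18, 2), (2, 18), (18, 18)]
print(select_scan_locations(candidates, targets, obstacles))
```

A real implementation would run the visibility analysis in 3D against the UAV-derived point cloud, but the greedy structure, minimizing scan positions while ensuring coverage, is the same.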

LITERATURE REVIEW

3D Reconstructions from UAV Images and Cooperation with a Mobile Robot


There has been extensive research involving the cooperative use of UGVs and UAVs. Some of this research has involved target detection and tracking. However, there have also been many instances of work on cooperative UAV/UGV systems for path planning of the UGV to accomplish different tasks. Typically, the UAV provides a high-level map to aid in path planning and object detection for the UGV. A cooperation scheme has been demonstrated to aid in humanitarian demining (Cantelli et al. 2013). Additionally, multi-robot teams of drones and UGVs have been developed to help transport objects in industrial environments, and many groups have geared their research efforts towards developing path planning algorithms for the UGV based on information provided by the UAV (Li et al., 2016; Giakoumidis et al. 2012). A key part of this process is generating a coarse 3D reconstruction from the images taken by the UAV. Even though a 3D reconstruction can be generated using only the UAV, the reconstruction generally has much lower resolution than required in many applications, thus requiring the deployment of the UGV (Irschara et al. 2010).

Scan Planning and Autonomous Scanning


In 3D reconstruction, it is often desirable to minimize the number of scans and the scan time while maintaining full coverage of the site of interest, which is why scan planning is an essential element in any 3D reconstruction scheme. While much research has been devoted to scan planning for photogrammetry and geodesy, Jia and Lichti (2018) point out that it is still an unsettled issue for terrestrial laser scanners (TLS). They also identify two common strategies: the Next Best Viewpoint method and another method based on visibility and occlusion analysis. There have been recent efforts to optimize scan locations; a genetic algorithm has been used to optimize the locations for terrestrial laser scans (Kim et al. 2014). Additionally, Song et al. (2014) developed an approach for scan planning that considers the level of detail in the resulting point cloud, and similarly, Biswas et al. (2015) developed an approach for scan planning with occlusion handling utilizing as-designed BIM models. Once scan locations have been
established, the next step is to navigate to the specific scan locations. Zhang et al. (2016)

© ASCE
Computing in Civil Engineering 2019 386

proposed a scan planning method related to data quality requirements in the field, such as
accuracy and level of detail, using as-designed BIM, but their work did not focus on autonomous
scanning. The simultaneous localization and mapping (SLAM) technique, a widely used method
for estimating a robot’s current position and orientation in an environment map, has been used in
approaches to generate dynamic point clouds (Kim et al. 2018a). One problem with this method
is that most SLAM systems are commanded manually, meaning that a human operator must
determine where to go and how to scan a large site completely (Borrmann et al. 2015; Toschi et
al. 2015).

Figure 1. The overall framework of the proposed approach


METHODOLOGY
System Architecture
The objective of this paper is to develop a method including laser scan planning, mobile
robot navigation, and registration of multiple point clouds collected by laser scanners mounted
on a mobile robot. At first, a UAV is deployed to get a rough 3D map of the target site. Then, the
scan locations are obtained through a simulation using a grid map and an occupancy map derived
from the UAV-generated point cloud. Finally, the navigation path is determined from among the simulated scan
locations. The overall framework of the proposed approach is shown in Figure 1. The proposed
method assumes the following pre-conditions. First, an a priori map of the target environment is
unavailable, such as at a construction site mid-phase. Second, the point cloud can be acquired
only while the mobile robot is stationary, not while it is moving, because stationary
acquisition yields a much higher quality point cloud.
In this study, the UAV is equipped with a camera, while the UGV, the Ground Robot for Mapping
Infrastructure (GRoMI), utilizes a robotic hybrid LiDAR system, as shown in Figure 2. The
upper part of GRoMI is the laser scanning system, including four vertically mounted 2D line laser
scanners that collect 3D map information. A horizontally mounted 2D line laser scanner is used to


estimate the robot’s location and pose in a 2D plane, and a regular digital camera captures the
RGB data of the scenes. The lower part of GRoMI is used for navigation. The robotic
system offers the following functionalities: 1) point-cloud data acquisition; 2) RGB data
collection through a DSLR camera; and 3) autonomous navigation.

Figure 2. UGV and UAV used in this study

Scan Planning for the Mobile Robot


The scan planning method proposed by this study is used to locate satisfactory stationary
scanning locations for GRoMI by evaluating the candidate scan locations with a line-of-sight
simulation of a 3D laser scanner. Considerations for selecting these locations include 1) a large
field-of-view of the surroundings with minimal obstructions; 2) minimal overlapped area
between scans; and 3) no missing scan area for the entire site. By optimizing for these criteria,
the number of scans and the data collection time can be reduced while maintaining complete
coverage and a high level of detail in the resulting 3D reconstruction. The scan-spot selection process is
accomplished through the following four steps:
1) Divide the job site into cells (1 m by 1 m each) and compute the gradient between
neighbor cells for each cell,
2) Create a 2D gradient map and an occupancy map to find the movable area at a site,
3) Run a line-of-sight simulation for every candidate scan-location cell and evaluate how
large an area is covered, how many points are captured, and how many occlusions occur, and
4) Find candidate scan locations where more point-cloud data and less occlusion can be
obtained.
These four steps are described in detail below.
The first stage of the scan planning process requires obtaining a 3D map of the target site.
Therefore, UAV is deployed to get the rough 3D point cloud map. The UAV-generated point
cloud of the construction site is divided into 1 m by 1 m cells and the gradients between neighbor
cells are then computed. Figure 3 (a) shows the gradient map of a test site. Blue areas are
relatively flat with small gradient values, and red areas are relatively inclined with larger
gradient values. Based on the generated gradient map, the movable areas in
the site are defined by setting a specific gradient threshold value. The movable area is the place
where GRoMI can move around and constitutes the candidate scan locations, which are used for
scan and path planning. A reference point is required for selecting optimal locations. This study
uses the starting position of GRoMI as the reference point. Figure 3 (b) shows the occupancy


map of the movable area, with GRoMI’s initial location shown as a small circle in the target
site. Occupancy indicates whether an obstacle is present at each cell of an evenly spaced field of
binary random variables.
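The gradient-map and occupancy-map construction described above can be sketched in a few lines (a pure-Python sketch; the 4-neighbor gradient definition and the threshold value are illustrative assumptions, not the authors' exact implementation):

```python
def movable_area(heights, grad_threshold=0.5):
    """Build a gradient map over 1 m x 1 m cell heights and mark a cell
    as movable (free in the occupancy map) when its largest height step
    to any 4-neighbor stays below the threshold."""
    rows, cols = len(heights), len(heights[0])
    grad = [[0.0] * cols for _ in range(rows)]
    movable = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            steps = [abs(heights[r][c] - heights[r + dr][c + dc])
                     for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                     if 0 <= r + dr < rows and 0 <= c + dc < cols]
            grad[r][c] = max(steps)
            movable[r][c] = grad[r][c] < grad_threshold

    return grad, movable

# A mound in the center of an otherwise flat site blocks the robot:
heights = [[0.0, 0.1, 0.2],
           [0.0, 0.9, 0.2],
           [0.0, 0.1, 0.2]]
grad, free = movable_area(heights)
```

Movable cells then serve both as the navigable region and as the pool of candidate scan locations for the line-of-sight simulation.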

Figure 3. Gradient and occupancy map of a test site

Figure 4. Identified scan locations from the scan planning simulation with the UAV’s point
cloud (top view)
Table 1. Error assessment
UAV+GRoMI
Ref. 1 2 cm
Ref. 2 2 cm
Ref. 3 2 cm
Ref. 4 1 cm
Ref. 5 1 cm
Ref. 6 1 cm
MAE 1.97 cm
RMSE 2.31 cm

The next step is to compute the visibility of each candidate scan location. The ray tracing
algorithm (Lichti 2017) is used to calculate the line-of-sight visibility. This process simulates the
GRoMI laser scanning. The mesh grid is created from the UAV-generated point cloud, and then


the 3D ray tracing for each laser line is simulated. This simulation lists the scan location with the
maximum number of points from the laser scanner and the minimum number of occlusion points.
In addition, the greedy cover algorithm (Parthasarathy 1997) is utilized to select an
approximation for the optimal number of scan locations necessary to cover the entire site. It
chooses the scan location that can see the largest amount of the boundary and then continues
selecting the scan locations to cover the remaining area from the potential viewpoints. This
process repeats until either the entire site has been covered or the iteration reaches the
approximation for the optimal number of scans. The scan planning simulation calculates the
number of laser points and occlusion points during the previous steps. Among these locations,
scan locations within a specific distance of one another are removed to minimize the overlap
caused by redundancy. The specific distance depends on the size of the target site; a 10 m
distance was used for our test bed. After the simulation, four optimized scan locations were
identified, shown as red dots in Figure 4.
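The greedy cover selection can be illustrated with a small sketch (the visibility sets here are hypothetical; in the actual pipeline they come from the ray-tracing simulation, and the 10 m redundancy filter is applied afterwards):

```python
def greedy_scan_locations(coverage, boundary):
    """Greedy set cover: repeatedly pick the candidate scan location that
    covers the most still-uncovered boundary cells.
    coverage: dict mapping candidate location -> set of visible cells."""
    uncovered = set(boundary)
    chosen = []
    while uncovered:
        best = max(coverage, key=lambda loc: len(coverage[loc] & uncovered))
        gain = coverage[best] & uncovered
        if not gain:  # remaining cells are not visible from any candidate
            break
        chosen.append(best)
        uncovered -= gain
    return chosen

coverage = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}}
chosen = greedy_scan_locations(coverage, boundary=range(1, 7))  # ["A", "C"]
```

Location "B" is never chosen because its cells are already covered by "A" and "C", which is exactly the redundancy the method is designed to avoid.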

Figure 5. Scan planning and reference lines (circles show scan locations)
Table 2. Required time
UAV+GRoMI TLS
Pre-processing 35 min 10 min
Operation 30 min 40 min
Post-processing 0 min 40 min

Path Planning and Automatic Registration of Scans


Once the scan locations are determined, the path from the current location to the next scan
location should be planned. In this paper, the object detection based potential vector field method
(Kim et al. 2018b) is used for path planning. From the collected point cloud data while UGV is
moving, the objects are defined, and a potential field is constructed to avoid them. Then, UGV
keeps moving following the gradient of the potential field and converges to the next scan
location. When UGV reaches the final scan location, it will start to register each point cloud to
one coordinate system by using localization data. The localization data is calculated from the
horizontal LiDAR. The laser-scan data from the horizontal LiDAR is used by the SLAM
algorithm to estimate the position and orientation of GRoMI on the horizontal plane. The Hector
SLAM algorithm, developed by Kohlbrecher et al. (2011) is employed in this study to perform
laser-scan matching between current LiDAR scans and progressively built maps to estimate the


robot’s postures and planar maps of the environment. In addition, the SLAM-driven localization
approach is used to automatically register the point clouds obtained from each stationary scan
location (Kim et al. 2018c). First, the SLAM-driven localization approach is used to calculate a
coarse transformation, and then a fine registration is achieved using the iterative closest point
(ICP) technique (Besl and McKay 1992).
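The coarse-to-fine idea can be sketched in 2D (a translation-only ICP refinement for brevity; the actual system applies the full SLAM pose and the Besl and McKay ICP to 3D point clouds):

```python
import math

def apply_pose(points, x, y, yaw):
    """Coarse registration: transform a local scan into the site frame
    using the SLAM-estimated robot pose (x, y, yaw)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * px - s * py + x, s * px + c * py + y) for px, py in points]

def icp_translation(source, target, iters=10):
    """Fine registration (translation-only sketch): match each source
    point to its nearest target point, then shift the source by the mean
    residual; repeat until the residual settles."""
    src = list(source)
    for _ in range(iters):
        pairs = [min(target, key=lambda t: (t[0] - p[0])**2 + (t[1] - p[1])**2)
                 for p in src]
        dx = sum(t[0] - p[0] for p, t in zip(src, pairs)) / len(src)
        dy = sum(t[1] - p[1] for p, t in zip(src, pairs)) / len(src)
        src = [(p[0] + dx, p[1] + dy) for p in src]
    return src

# A scan placed with a slightly drifted pose estimate, then refined:
target = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
scan = apply_pose(target, 0.1, 0.05, 0.0)   # coarse placement with drift
aligned = icp_translation(scan, target)      # fine correction
```

The coarse SLAM transform brings the scan close enough that ICP's nearest-neighbor matching converges to the right correspondences.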

RESULTS AND CONCLUSION


In this study, a construction site at the Georgia Tech campus was selected as a test bed for a
cluttered environment. The authors used the proposed UAV-UGV cooperative method to build
3D point clouds of the site. The authors also used a Terrestrial Laser Scanner (TLS) as the
ground truth to measure the error of the registered point cloud collected by GRoMI since the
TLS has higher accuracy (e.g., ±3 mm). The GRoMI-generated registered point cloud and the
six reference lines selected to assess its error are shown in Figure 5.
As shown in Table 1, the errors of all the reference lines are within 2 cm, with the TLS
measurement taken as the ground truth. The mean absolute error (MAE) between the GRoMI
and TLS point clouds is 1.97 cm, and the root mean square error (RMSE) is 2.31 cm; MAE is
the average distance between the two point cloud sets, and RMSE is the standard deviation of
those distances. Table 2 shows the required time to build a 3D registered point cloud with
each device. The pre-processing involves the scanning time of UAV and path-planning processes
for GRoMI, and a target placing and preparation time for the TLS. The operation includes the
data collection process with each device and post-processing includes the point-cloud
registration process. The 35 min of pre-processing for GRoMI comprises 10 min of UAV
operation, 20 min of point-cloud generation, and 5 min of scan planning. One advantage of the
cooperation between GRoMI and the UAV is the shortened operation time. The approach also
needs no post-processing (registration) because the scans are registered automatically using the
robot localization data (i.e., SLAM). Therefore, the proposed approach resulted in time and cost
efficiencies compared with the traditional TLS.
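For reference, the MAE and RMSE in Table 1 are the usual aggregates over per-point distances; a sketch using the six rounded reference errors above (so the results differ slightly from the paper's values, which are computed on the full point sets):

```python
import math

def mae_rmse(distances):
    """MAE: mean absolute distance between the two point cloud sets.
    RMSE: square root of the mean squared distance."""
    n = len(distances)
    mae = sum(abs(d) for d in distances) / n
    rmse = math.sqrt(sum(d * d for d in distances) / n)
    return mae, rmse

# Rounded reference errors from Table 1, in cm:
mae, rmse = mae_rmse([2, 2, 2, 1, 1, 1])
```

RMSE is always at least as large as MAE and penalizes the larger residuals more heavily.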
This paper introduces an autonomous method for UAV-assisted mobile robot scan and path
planning to generate a 3D point cloud, overcoming the problem that little useful a priori
information about the target site is available. The main advantages of the proposed method over
previous ones are that it (1) automates the point-cloud acquisition process by finding scan
locations and planning navigation paths from the UAV-generated point cloud; (2) removes
redundant scans and reduces the time and cost of data collection; (3) reduces missing areas of the
target site; and (4) automatically registers multiple scans. However, this approach requires
UAV-generated map data. In future work, the mobile robot should estimate the preferred scan
locations by itself for cases when UAV data are not available.

ACKNOWLEDGMENT
The work reported herein was supported by a grant (18CTAP-C144787-01) funded by the
Ministry of Land, Infrastructure, and Transport (MOLIT) of Korea Agency for Infrastructure
Technology Advancement (KAIA) and by the United States Air Force Office of Scientific
Research (Award No. FA2386-17-1-4655). Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the authors and do not necessarily
reflect the views of MOLIT and the United States Air Force. Their financial support is gratefully
acknowledged.


REFERENCES
Besl, P., and McKay, N. (1992). “A Method for Registration of 3-D Shapes.” IEEE Transactions
on Pattern Analysis and Machine Intelligence.
Borrmann, D., Heß, R., Houshiar, H. R., Eck, D., Schilling, K., and Nüchter, A. (2015). “Robotic
mapping of cultural heritage sites.” International Archives of the Photogrammetry, Remote
Sensing and Spatial Information Sciences - ISPRS Archives, 40(5W4), 9–16.
Cantelli, L., Mangiameli, M., Melita, C. D., and Muscato, G. (2013). “UAV/UGV cooperation
for surveying operations in humanitarian demining.” 2013 IEEE International Symposium on
Safety, Security, and Rescue Robotics (SSRR), IEEE, 1–6.
Cho, Y. K., Wang, C., Tang, P., and Haas, C. T. (2012). “Target-Focused Local Workspace
Modeling for Construction Automation Applications.” Journal of Computing in Civil
Engineering, 26(5), 661–670.
Giakoumidis, N., Bak, J. U., Gomez, J. V., Llenga, A., and Mavridis, N. (2012). “Pilot-Scale
Development of a UAV-UGV Hybrid with Air-Based UGV Path Planning.” 2012 10th
International Conference on Frontiers of Information Technology, IEEE, 204–208.
Irschara, A., Kaufmann, V., Klopschitz, M., Bischof, H., and Leberl, F. (2010). “Towards fully
automatic photogrammetric reconstruction using digital images taken from UAVs.”
Jia, F., and Lichti, D. D. (2018). “An efficient, hierarchical viewpoint planning strategy for
terrestrial laser scanner networks.” ISPRS Annals of the Photogrammetry, Remote Sensing
and Spatial Information Sciences, 4(2), 137–144.
Kabir Biswas, H., Bosché, F., and Sun, M. (2015). “Planning for scanning using building
information models: A novel approach with occlusion handling.”
Kim, M., Li, B., Park, J., Lee, S., and Sohn, H. (2014). “Optimal locations of terrestrial laser
scanner for indoor mapping using genetic algorithm.” The 2014 International Conference on
Control, Automation and Information Sciences (ICCAIS 2014), IEEE, Gwangju, South
Korea, 140–143.
Kim, P., Chen, J., and Cho, Y. K. (2018a). “Autonomous mobile robot localization and mapping
for unknown construction environments.” Construction Research Congress 2018:
Kim, P., Chen, J., and Cho, Y. K. (2018b). “SLAM-driven robotic mapping and registration of
3D point clouds.” Automation in Construction, 89C, 38–48.
Kim, P., Chen, J., Kim, J., and Cho, Y. K. (2018c). “SLAM-Driven Intelligent Autonomous
Mobile Robot Navigation for Construction Applications.” Advanced Computing Strategies
for Engineering, I. F. C. Smith and B. Domer, eds., Springer International Publishing, Cham,
254–269.
Kohlbrecher, S., Von Stryk, O., Meyer, J., and Klingauf, U. (2011). “A flexible and scalable
SLAM system with full 3D motion estimation.”
Li, J., Deng, G., Luo, C., Lin, Q., Yan, Q., and Ming, Z. (2016). “A Hybrid Path Planning
Method in Unmanned Air/Ground Vehicle (UAV/UGV) Cooperative Systems.” IEEE
Transactions on Vehicular Technology, 65(12), 9585–9596.
Lichti, D. D. (2017). “Ray-Tracing Method for Deriving Terrestrial Laser Scanner Systematic
Errors.” Journal of Surveying Engineering, 143(2), 06016005.
Parthasarathy, S. (1997). “A Tight Analysis of the Greedy Algorithm for Set Cover.” Journal of
Algorithms, 25(2), 237–254.
Song, M., Shen, Z., and Tang, P. (2014). “Data Quality-oriented 3D Laser Scan Planning.”


Construction Research Congress 2014, American Society of Civil Engineers, Reston, VA,
984–993.
Toschi, I., Rodríguez-Gonzálvez, P., Remondino, F., Minto, S., Orlandini, S., and Fuller, A.
(2015). “Accuracy evaluation of a mobile mapping system with advanced statistical
methods.” International Archives of the Photogrammetry, Remote Sensing and Spatial
Information Sciences - ISPRS Archives, 40(5W4), 245–253.
Zhang, C., Kalasapudi, V. S., and Tang, P. (2016). “Rapid data quality oriented laser scan
planning for dynamic construction environments.” Advanced Engineering Informatics,
Elsevier Ltd, 30(2), 218–232.


Deep Learning with Spatial Constraint for Tunnel Crack Detection


Qingquan Li, Ph.D.1; Qin Zou, Ph.D.2; Jianghai Liao3; Yuanhao Yue4; and Song Wang, Ph.D.5
1Shenzhen Key Laboratory of Spatial Smart Sensing and Service, Shenzhen Univ., Shenzhen,
Guangdong 518060, P.R. China. E-mail: [email protected]
2School of Computer Science, Wuhan Univ., Wuhan, Hubei 430072, P.R. China. E-mail:
[email protected]
3Shenzhen Key Laboratory of Spatial Smart Sensing and Service, Shenzhen Univ., Shenzhen,
Guangdong 518060, P.R. China. E-mail: [email protected]
4School of Computer Science, Wuhan Univ., Wuhan, Hubei 430072, P.R. China. E-mail:
[email protected]
5Dept. of Computer Science and Engineering, Univ. of South Carolina, Columbia, SC 29200,
USA. E-mail: [email protected]

ABSTRACT
Cracks are the most common defect on tunnel surfaces and potentially threaten the safety of the
tunnel and of the vehicles running through it. Timely repair of cracks is of critical importance.
In the past two decades, various vehicle platforms have been developed for efficient crack
detection and maintenance. With these platforms, images can be captured at traffic speed, and
automatic methods can be developed for fast crack localization.
However, for image-based crack detection, traditional methods often meet difficulties in
handling cracks with low contrast and poor continuity. In this paper, deep learning based
techniques are exploited for feature learning and representation for crack detection. A novel deep
neural network is presented for pixel-level crack recognition. Hierarchical features from different
stages of the convolution are fused together to overcome the influence of noise, and a spatial
constraint placed on the target pixels is used to guarantee crack continuity. In the
experiments, a tunnel crack dataset is constructed for performance evaluation. Experimental
results demonstrate the effectiveness of the proposed method.

INTRODUCTION
Tunnels are commonly constructed as part of highways, especially in mountainous regions. For
example, in China, the Qinling Zhongnan Mountain highway tunnel in Shaanxi province is 18.02
km long, the Maiji Mountain highway tunnel in Gansu province is 12.29 km long, and the West
Mountain highway tunnel in Shanxi province is 13.65 km long, as shown in Figure 1. Once
tunnels are put into operation, defects and damage appear after long use. For example, uneven
forces outside the tunnel may deform the tunnel lining and create cracks on the tunnel surface.
These cracks may lead to water leakage and, consequently, freezing damage in cold winters. A
crack may look like a minor defect, but it can easily deteriorate into more serious damage such
as a wide cleft, in which case the lining board could fall and threaten the safety of high-speed
vehicles running through the tunnel. Thus, it is necessary to fix cracks as early as possible.
Traditionally, defects including cracks are visually inspected by testers after closing the tunnel
and then repaired by professional workers, as illustrated in Fig. 2. This procedure is
time-consuming and labor-intensive.


Figure 1. Typical long tunnels in China

Figure 2. Traditional testing and repairing method

Figure 3. Three representative tunnel inspecting systems: Tunnelings, tCrack, and MIMM-R.
Due to the requirement of timely mending of cracks, fast crack detection techniques have
been developed in the past two decades. A tunnel lining inspection system named Tunnelings
was developed by Spanish Euroconsult and Pavemetrics Company, as shown in Fig. 3 (left),
which used cameras and laser sensors to scan the tunnel lining with a 1mm resolution at a speed
up to 30 km/h (Gavilán et al.(2013)). The platform that carries the laser cameras was installed on
a truck capable of running on rails and on flat terrain. Another tunnel inspection system named
tCrack was developed by the Swiss Terra Company, as shown in Fig. 3 (middle). It comprises ten
CCD cameras mounted on a site vehicle and can recognize cracks more than 0.3 mm in width
while running at a speed of 2.5 km/h. A third system, named MIMM-R, was developed by
Japanese Keisokukensa Company, which integrated CCD cameras, laser scanner and Ground
Penetrating Radar for inspecting cracks, leakage, tunnel deformation and tunnel lining cavities
(Huang et al. (2017)). It can detect cracks at a precision of 0.2mm and at a speed of 70 km/h.
For tunnel crack detection, image-based methods have been widely used. Generally, cracks


are darker than their surroundings in an image, resulting in different gray-scale values from
the background (Zou et al. 2012; Kaul et al. 2012; Oliveira et al. 2013; Amhaz et al. 2016;
Koch et al. 2015). This property allows threshold segmentation techniques to be used as a first
step to segment the image and extract potential crack features (Li et al. 2011). Roli (1996)
proposed a method utilizing conditional texture anisotropy for crack detection. Qu et al. (2016)
detected the tunnel lining cracks by firstly eliminating the seams on the concrete surface.
Fujita et al. (2006) proposed two preprocessing methods using the subtraction method and the
Hessian matrix. Since the local window is fixed, these methods cannot be flexibly applied to
cracks of different widths. Moreover, these methods only use low-level features for crack
detection and may fail when cracks have low contrast with the background or poor continuity.

Figure 4. Network Structure


In the past several years, deep convolutional neural networks (DCNNs) have achieved success
in many computer vision applications such as object detection, image segmentation, and image
retrieval. Features abstracted by DCNNs can represent the target in an image at a high level and
can be effectively used for high-level visual perception and reasoning. Deep convolutional
features have also been found useful for crack detection (Zhang et al. 2016; Schmugge et al.
2017). In this paper, we propose a robust crack detection method that fuses hierarchical deep
convolutional features to represent cracks. Meanwhile, to overcome the problem that line
structures are not well modeled by traditional deep models, we introduce a spatial constraint in
training our deep model. With this spatial constraint, the detection output will be a continuous
line structure, even when the cracks in the original image have low continuity.

THE PROPOSED METHOD


The model detects cracks via pixel-wise semantic segmentation and enhances the crack
continuity by predicted positive links.


Figure 5. Neighbor-connection map

Network Architecture
We build a fast crack segmentation architecture inspired by the Unet network (Ronneberger
et al. (2015)), a fully convolutional network. Unet is a deep convolutional encoder-decoder
architecture designed for pixel-wise semantic segmentation, which contains an encoder network
and a corresponding decoder network. As shown in Fig. 4, it consists of a down-sampling
encoder part and an up-sampling decoder part.
For image-based crack detection, a larger receptive field obtained by down-sampling
convolution feature is useful to overcome the influence of noise, and the decoder part can refine
crack edges with higher precision by using the encoded features. The encoder network is similar
to the convolutional layers in the VGG network (Simonyan et al.(2015)), but is constructed with
less convolution channels. It consists of the repeated application of two 3×3 convolutions. Each
of them is followed by a rectified linear unit (ReLU) and a 2×2 max-pooling operation with a
stride of 2 for down-sampling. Different from VGG network, we double the number of feature
channels before the down-sampling step such that the loss of feature information can be reduced.
In the decoder part, we use nearest neighbor up-sampling to increase the size of feature and
merge corresponding encoder layer features using point multiplication to reduce the amount of
parameters. At the final layer, a 1×1 convolution is used to map each 32-component feature
vector to the crack mask and the 8 neighbor-connection maps. After each convolution operation,
a batch-normalization step is applied to the feature maps, except for the final convolution layer.
The number of model parameters is only one-fifth that of Unet. Experimental results show that
this network structure is simple and effective.

Spatial Constraint
A crack is a line structure that holds good continuity from a global perspective. Since
semantic segmentation makes an independent judgment for each pixel, regardless of the
connectivity of other positive pixels, the detected cracks may be discontinuous, missing parts
of the structure. In our design, a special spatial continuity constraint is used to enhance crack
continuity in the training process. In fact, crack continuity can be effectively used for reasoning
about data missing due to noise during data acquisition. In our work, the ground-truth cracks are
labeled at single-pixel width, so the crack continuity relationship can be built by determining
whether a pixel belongs to a line. In our model, we predict 8 neighbor-connection maps to solve


this problem as illustrated by Fig. 5.
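The 8 neighbor-connection ground-truth maps can be derived from the single-pixel-wide crack labels roughly as follows (a sketch; the paper's exact construction may differ in detail):

```python
# The 8 neighbor offsets, in row-major order around the center pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def connection_maps(mask):
    """Build 8 neighbor-connection ground-truth maps from a binary,
    single-pixel-wide crack mask: maps[k][r][c] is 1 when pixel (r, c)
    and its k-th 8-neighbor are both crack pixels."""
    rows, cols = len(mask), len(mask[0])
    maps = [[[0] * cols for _ in range(rows)] for _ in OFFSETS]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for k, (dr, dc) in enumerate(OFFSETS):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]:
                    maps[k][r][c] = 1
    return maps

# A horizontal single-pixel crack:
mask = [[0, 0, 0],
        [1, 1, 1],
        [0, 0, 0]]
maps = connection_maps(mask)
```

During training, the network's predicted connection maps are supervised against these targets so that isolated positive pixels without supporting connections are discouraged.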

Loss Function
The training loss is a weighted sum of loss on pixels and loss on positive neighbor
connections, as formulated by
L = L_pixel + λ L_connected (1)
When calculating the crack loss, it is unfair to put the same weight on all pixels, since crack
pixels are far outnumbered by background pixels; we use a class-balanced cross-entropy loss to
solve this problem. We also use cross-entropy loss to calculate the neighbor-connection loss, but
only positive pixels are taken into account. The balance parameter λ is set to 5 across all the
experiments.
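A toy version of Eq. (1) over flattened pixel vectors might look like the following; the class-balancing weights use the common HED-style scheme, which is an assumption here since the paper does not spell out its exact weighting:

```python
import math

def balanced_bce(probs, labels):
    """Class-balanced cross-entropy: weight positive terms by the
    negative-class fraction (and vice versa) so that the rare crack
    pixels are not drowned out by the background."""
    n_pos = sum(labels)
    w_pos = (len(labels) - n_pos) / len(labels)
    w_neg = n_pos / len(labels)
    loss = sum(-w_pos * y * math.log(p) - w_neg * (1 - y) * math.log(1 - p)
               for p, y in zip(probs, labels))
    return loss / len(labels)

def total_loss(pixel_probs, pixel_labels, conn_probs, conn_labels, lam=5.0):
    """L = L_pixel + lambda * L_connected (Eq. 1), with the connection
    term evaluated on positive (crack) pixels only."""
    pos = [(p, y) for p, y, m in zip(conn_probs, conn_labels, pixel_labels) if m]
    l_conn = 0.0
    if pos:
        l_conn = sum(-y * math.log(p) - (1 - y) * math.log(1 - p)
                     for p, y in pos) / len(pos)
    return balanced_bce(pixel_probs, pixel_labels) + lam * l_conn
```

In the real model both terms are computed per connection map and over 2D tensors, but the weighting and the λ = 5 combination follow the same pattern.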

EXPERIMENTS AND RESULTS


Data collection: A fast tunnel inspecting system is developed by Wuhan ZOYON Company.
As shown in Fig. 6, the system is equipped on a truck and composed of line-scan CCD cameras,
LED light, infrared thermography and controller mount. High resolution line-scan CCD cameras
capture the tunnel lining images under high-power LED illumination. Tunnel lining images are
first pre-processed to compensate image shift caused by the vibration of the vehicle platform,
and then mosaicked into panorama to support the detection of cracks, water leakage and other
tunnel lining defects. The inspection system can identify cracks with 0.2mm width at a driving
speed of 0-80 km/h. A total of 328 tunnel crack images were collected, of which 250 are used
for training and the remaining 78 for testing.
Implementation Details: Augmenting the training dataset proves necessary for crack
detection. First, the input images are scaled by a factor of 0.5 to 2 and rotated, with a probability
of 0.5, by a random angle between 0 and π/2. Second, random distortion and affine
transformation are applied to the image according to its mask. Third, random cropping is
conducted with areas ranging from 0.5 to 1 and aspect ratios ranging from 0.5 to 2. Fourth, the
images are resized uniformly to 512 × 512. Finally, a skeleton extraction algorithm is used to
ensure that each crack label is a single-pixel-wide line structure.
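The parameter sampling behind this pipeline can be sketched as follows (a hypothetical parameterization; the exact distortion and affine settings are not specified in the paper):

```python
import math
import random

def sample_augmentation(rng=random):
    """Draw one set of augmentation parameters matching the pipeline:
    random scale, rotation with probability 0.5, random crop with
    bounded area and aspect ratio, and a fixed final resize."""
    return {
        "scale": rng.uniform(0.5, 2.0),
        "rotate": rng.uniform(0.0, math.pi / 2) if rng.random() < 0.5 else 0.0,
        "crop_area": rng.uniform(0.5, 1.0),
        "crop_aspect": rng.uniform(0.5, 2.0),
        "resize": (512, 512),
    }

# Seeded generator for a reproducible draw:
params = sample_augmentation(random.Random(0))
```

The same sampled parameters must be applied to both the image and its crack mask so the single-pixel labels stay aligned with the augmented image.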

Figure 6. ZOYON tunnel inspecting system


The models are optimized by SGD with a momentum of 0.9 and a weight decay of 5×10−4.
All convolution weights are randomly initialized by the Xavier method and the biases are set to
0. The learning rate is set to 10−3. The whole algorithm is implemented using PyTorch 0.4.0.
When training with a batch size of 32 on 2 GPUs (GTX 1080TI), it takes about 0.33s per


iteration, and the whole training processing takes about 2 hours.
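In PyTorch terms, this optimization setup corresponds to something like the following sketch, where the one-layer `model` is only a placeholder for the network described earlier:

```python
import torch
import torch.nn as nn

def init_weights(m):
    """Xavier-initialize convolution weights and zero the biases,
    as described in the training setup above."""
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# Placeholder standing in for the encoder-decoder network.
model = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
model.apply(init_weights)

# SGD with momentum 0.9, weight decay 5e-4, learning rate 1e-3.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=5e-4)
```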

Figure 7. Results obtained by different methods on six sample images


Results: We compare the performance of our method with current state-of-the-art deep learning
based methods on six images randomly sampled from the test dataset. The results are shown in
Fig.7. The results generated by RCF (Liu et al.(2017)) and SRN (Ke et al.(2017)) need to be
post-processed by the standard non-maximum suppression (NMS) to thin the crack maps. The
results obtained by the proposed method are directly evaluated without any post-processing
procedure. Table I lists the results obtained by the proposed method and four comparison
methods, which are DeepCrack (Zou et al.(2019)), SRN, HED (Xie et al.(2015)) and RCF. The
proposed method obtains a highest F-Measure value of 0.9068, while the precision and recall
values are also the highest among all comparison methods.

Table I: Performances of Different Methods

Method      Proposed   DeepCrack           SRN                HED                 RCF
                       Zou et al. (2019)   Ke et al. (2017)   Xie et al. (2015)   Liu et al. (2017)
Recall      0.9224     0.8122              0.7214             0.7105              0.7349
Precision   0.8917     0.8460              0.7748             0.8587              0.7967
F-Measure   0.9068     0.8288              0.7471             0.7776              0.7646
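As a sanity check, each F-Measure entry is the harmonic mean of the precision and recall above it:

```python
def f_measure(precision, recall):
    """F-measure (F1): the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1_proposed = round(f_measure(0.8917, 0.9224), 4)   # 0.9068, matching Table I
f1_deepcrack = round(f_measure(0.8460, 0.8122), 4)  # 0.8288, matching Table I
```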

CONCLUSION
In this paper, a novel deep learning network was proposed for crack detection from tunnel
lining images. In this network, a spatial constraint was embedded by checking the continuity of
cracks in eight-neighbor crack maps. A tunnel crack dataset containing 328 images was collected
for performance evaluation. Experimental results demonstrated that the proposed method
extracted cracks with better continuity than the competing methods, which led to the highest
values in precision, recall, and F-measure among all comparison methods.


REFERENCES
Amhaz, R., Chambon, S., Idier, J., & Baltazart, V. (2016). Automatic crack detection on two-
dimensional pavement images: an algorithm based on minimal path selection. IEEE
Transactions on Intelligent Transportation Systems, 17(10), 2718-2729.
Fujita, Yusuke, Yoshihiro Mitani, and Yoshihiko Hamamoto. (2006). A method for crack
detection on a concrete structure. In 18th IEEE International Conference on Pattern
Recognition.
Gavilán, M., et al. (2013). Mobile inspection system for high-resolution assessment of tunnels. In
6th International Conference on Structural Health Monitoring of Intelligent Infrastructure.
Huang, H., Sun, Y., Xue, Y., & Wang, F. (2017). Inspection equipment study for subway tunnel
defects by grey-scale image processing. Advanced Engineering Informatics, 32, 188-201.
<http://www.keisokukensa.co.jp/MIMM.html>
Kaul, V., Yezzi, A., & Tsai, Y. (2012). Detecting curves with unknown endpoints and arbitrary
topology using minimal paths. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 34(10), 1952.
Ke, W., Chen, J., Jiao, J., Zhao, G., & Ye, Q. (2017). SRN: side-output residual network for
object symmetry detection in the wild. In 2017 IEEE Conference on Computer Vision and
Pattern Recognition, pp. 302-310.
Koch, C., Georgieva, K., Kasireddy, V., Akinci, B., & Fieguth, P. (2015). A review on computer
vision based defect detection and condition assessment of concrete and asphalt civil
infrastructure. Advanced Engineering Informatics, 29(2), 196-210.
Li, Q., Zou, Q., Zhang, D., & Mao, Q. (2011). FoSA: F* seed-growing approach for crack-line
detection from pavement images. Image and Vision Computing, 29(12), 861-872.
Liu, Y., Cheng, M. M., Hu, X., Wang, K., & Bai, X. (2017). Richer convolutional features for
edge detection. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp.
5872-5881.
Oliveira, H., & Correia, P. L. (2013). Automatic road crack detection and characterization. IEEE
Transactions on Intelligent Transportation Systems, 14(1), 155-168.
Qu, Z., Bai, L., An, S. Q., Ju, F. R., and Liu, L. (2016). Lining seam elimination algorithm and
surface crack detection in concrete tunnel lining. Journal of Electronic Imaging, 25(6),
063004.
Roli, Fabio. (1996). Measure of texture anisotropy for crack detection on textured surfaces.
Electronics Letters 32.14: 1274-1275.
Ronneberger, O., Fischer, P., &Brox, T. (2015). U-net: convolutional networks for biomedical
image segmentation. In International Conference on Medical image computing and
computer-assisted intervention, pp. 234-241.
Schmugge, S. J., Rice, L., Lindberg, J., Grizziy, R., Joffey, C., & Shin, M. C. (2017). Crack
segmentation by leveraging multiple frames of varying illumination. In 2017 IEEE Winter
Conference on Applications of Computer Vision, pp. 1045-1053.
Simonyan K, Zisserman A. (2015). Very deep convolutional networks for large-scale image
recognition. In International Conference on Learning Representations.
Xie, S., & Tu, Z. (2015). Holistically-nested edge detection. In IEEE International Conference on
Computer Vision, pp. 1395-1403.
Zhang, L., Yang, F., Zhang, Y. D., & Zhu, Y. J. (2016). Road crack detection using deep
convolutional neural network. In 2016 IEEE International Conference on Image Processing,
(pp. 3708-3712).

© ASCE

Automatic Review of Construction Specifications Using Natural Language Processing


Seonghyeon Moon1; Gitaek Lee2; Seokho Chi, Ph.D., M.ASCE3; and Hyunchul Oh, Ph.D.4
1Construction Innovation Laboratory, Dept. of Civil and Environmental Engineering, Seoul National Univ., Seoul 08826, South Korea. E-mail: [email protected]
2Construction Innovation Laboratory, Dept. of Civil and Environmental Engineering, Seoul National Univ., Seoul 08826, South Korea. E-mail: [email protected]
3Construction Innovation Laboratory, Dept. of Civil and Environmental Engineering, Seoul National Univ., Seoul 08826, South Korea. E-mail: [email protected]
4Daewoo E&C, Smart Construction Team, Seoul 08826, South Korea. E-mail: [email protected]

ABSTRACT
Since construction specifications are normally over 1000 pages and are complicated and
often inconsistent, reviewing them is a labor-intensive and time-consuming activity. Thus, the
aim of this study was to automate the review process by comparing construction specifications
with standard specifications using natural language processing. Standard specifications for road
construction projects were collected from 43 different states in the U.S. and used as experimental
data. Doc2Vec, cosine similarity, and named entity recognition (NER) were used to recognize
construction objects, standard values, and execution conditions, which can be used to find
specification errors. In this early stage of the research, most of the related sentences were found in the standard specifications with high relevancy, and the average F1 score of NER was 0.256.
research findings will contribute to enhancing the efficiency of checking for specification errors
by automatically detecting abnormalities and the absence of specific standards.

INTRODUCTION
Reviewing construction specifications is a crucial process for contractors because they must
follow the clients’ requirements, which are stated clearly in the document. Failure to meet the
standards of the document causes economic, technical, and social problems. For example, when Korean contractors were working on a road construction site in Qatar, problems with the asphalt pavement occurred due to the use of incorrect construction standards. The contractors
performed the construction according to the specifications provided by the client, but the design
criteria specified in the specifications were not suitable for the environment of Qatar; especially
the hot weather condition. As a result, the contractors had conflicts with the client and the
designers, resulting in the waste of resources and delays in the project.
Even though reviewing construction specifications is crucial, it is difficult to analyze the documents due to the following issues. First, the documents are generally complicated and contain errors because some clients that do not have their own standardized specifications, as in Qatar, are not familiar with construction standards and sometimes tend to piece together parts of other specifications without careful investigation. Second, the reviewing process is time-consuming and expensive since it is performed manually. Third, the specifications are interpreted inconsistently because the reviewers are often unfamiliar with the local situations (e.g., environment, technical skills, and regulations).
To summarize, manual reviews of construction specifications waste time, increase costs, and
contain inconsistent interpretations. To address these problems, in this research, our aim was to develop an automatic process of reviewing construction specifications using Natural Language Processing (NLP). Since this research is in its early stages, the overall flow of the research is
described in this paper, including (1) selection of comparable specifications, (2) identification of
corresponding sentences, (3) extraction of construction standards, and (4) comparison of
construction standards.
The construction specifications for the Qatar highway construction site in 2014 were used for the analysis, and the standard specifications for road construction in 43 states in the U.S. were used as the reference set. We collected the most recent specifications from the websites of those 43 states, which permitted the specifications to be downloaded. The main beneficiaries of the research would be the construction companies whose employees must ascertain the appropriateness of the clauses in construction specifications.

RELATED WORKS

Natural Language Processing and Text Mining


NLP is a research area that utilizes various machine learning algorithms to process readable
text, which enables a computer to analyze the text data (Zhang and El-Gohary, 2016). Since there
is a large volume of documents that pertain to the construction industry, many researchers have
analyzed the data they contain to manage the empirical information that is available in the
documents. The text data in the construction industry include, for instance, regulations, bidding
documents, specifications, construction reports, accident reports, and claim documents.
Text mining is a research concept in which text data (i.e., unstructured data) are processed by
NLP and then analyzed by computer to extract information and determine relationships between
the sets of information (Lee et al., 2016). The field of text mining in construction covers
visualization, automatic summarization, information retrieval, ontology development,
compliance checking, and other categories.

FUNDAMENTAL RESEARCH METHODOLOGY


Preprocessing
The text data used in the research went through three steps of preprocessing to be converted
into a clean and computer-understandable format. The preprocessing steps consisted of
tokenization, stopword removal, and stemming.
First, in the tokenization step, the research separated the text into several tokens, a minimum
unit of text analysis, such as a document, paragraph, sentence, and word. This process was to
prepare the text for feature representation that would be essential in the following analysis. In
general, the ‘word,’ a chunk of alphabetical characters divided by space marks, is the most
common unit used to analyze text. In addition, in this research, the combination of a punctuation
mark (e.g., ‘.’, ‘,’, and ‘!’) and a space mark (e.g., ‘ ’, and ‘\n’) was used as a delimiter in order to
separate sentences.
Second, in the stopword removal step, words that appeared in the text too often and were not
significantly important in the analysis of the text were eliminated. The eliminated words are
called ‘stopwords,’ which include grammatical elements, such as definite and indefinite articles
(e.g., ‘a’, ‘an’, and ‘the’), prepositions (e.g., ‘to’, ‘on’, ‘in’), and pronouns (e.g., ‘he’, ‘she’, ‘it’).
Third, in the stemming step, the words that remained after the stopwords were removed were
pruned into root or stem forms to map the various forms of words that have the same meaning to one unique term. For example, 'construct,' 'construction,' 'constructor,' and 'constructing'
would be pruned to one term, ‘constr.’ The stemming process shortened the computing time by
reducing the size of the word feature matrix, and it enhanced the quality of the analytical results
by representing various words with essentially the same meaning with one word.
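As an illustration, the three preprocessing steps can be sketched in plain Python. The stopword set and suffix rules below are toy stand-ins for the full lists used in practice (e.g., NLTK's stopword list and the Porter stemmer, whose stems may differ from the 'constr' example above):

```python
import re

STOPWORDS = {"a", "an", "the", "to", "on", "in", "he", "she", "it"}  # toy subset
SUFFIXES = ("ional", "ing", "ion", "ors", "or", "ed", "s")           # toy rules

def preprocess(text):
    # Step 1 - tokenization: lowercase and split on punctuation/whitespace
    tokens = re.findall(r"[a-z]+", text.lower())
    # Step 2 - stopword removal: drop frequent, uninformative words
    tokens = [t for t in tokens if t not in STOPWORDS]
    # Step 3 - stemming: crude suffix stripping to merge word variants
    stems = []
    for t in tokens:
        for suf in SUFFIXES:
            if t.endswith(suf) and len(t) > len(suf) + 2:
                t = t[: -len(suf)]
                break
        stems.append(t)
    return stems
```

With these toy rules, 'constructing' and 'construction' both reduce to 'construct', illustrating how stemming shrinks the word feature matrix.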

Text Embedding
Text embedding is a kind of text representation method, mapping the text on a real number
vector space, the purpose of which is to use the text vector as input data to machine learning
models (Chopra et al., 2016). This process is essential in NLP to conduct language modeling and
feature learning. While there are several methods for text embedding, in this research, we used
Term Frequency & Inverse Document Frequency (TF-IDF), Word2Vec, and Doc2Vec. The
details of each method are provided below.
TF-IDF preserves the frequency information of text data, capturing both appearance (i.e., whether or not the frequency is zero) and importance (i.e., how many times a term appears). Term Frequency (TF) indicates the number of occurrences of a term in a document. Inverse Document Frequency (IDF) is based on the inverse of the number of documents that contain a certain term. TF-IDF represents text data via the importance of the constituent terms (i.e., TF) normalized by IDF. TF-IDF is calculated as shown in Equations 1-3, where t, d, and c indicate term, document, and corpus, respectively, and f(t, d) indicates the frequency with which term t appears in document d.

TFIDF(t, d, c) = TF(t, d) × IDF(t, c)                                Equation (1)

TF(t, d) = 0.5 + 0.5 × f(t, d) / max{f(w, d) : w ∈ d}                Equation (2)

IDF(t, c) = log( |c| / |{d ∈ c : t ∈ d}| )                           Equation (3)
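Equations 1-3 translate directly into code. A minimal sketch, with documents represented as token lists (function names are our own):

```python
import math

def tf(t, d):
    # Equation 2: augmented term frequency of term t in document d (token list)
    return 0.5 + 0.5 * d.count(t) / max(d.count(w) for w in d)

def idf(t, c):
    # Equation 3: inverse document frequency of t over corpus c
    # (assumes t occurs in at least one document of c)
    return math.log(len(c) / sum(1 for d in c if t in d))

def tfidf(t, d, c):
    # Equation 1: TF weighted by IDF
    return tf(t, d) * idf(t, c)
```

For a toy two-document corpus, a term appearing in only one document receives a positive weight, while a term appearing in every document is down-weighted to zero by the logarithm.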
Word2Vec is a neural network language model to learn word vectors, which models word-to-
word relationships (Mikolov et al., 2013). The word-to-word relationship means the distribution
of surrounding words, which could implicate the usage pattern of each word. Technically, the
objective function of Word2Vec is to maximize the log probability of a target word given its surrounding words, provided as Equation 4:

log P(wO | wI) = log σ(v′wO · vwI) + Σ(i=1..k) E[wi ~ Pn(w)] log σ(−v′wi · vwI)    Equation (4)

where wO is the target word (output word), wI is one of the surrounding words (input word), σ is the sigmoid function, k is the number of negative samples, Pn(w) is the noise distribution, vw is the input vector of word w, and v′w is the output vector used for the (negative) samples.
Doc2Vec is an extended version of Word2Vec to represent longer text (e.g., sentence,
paragraph, and document). The document vector would be generated according to the
combination of the Word2Vec vectors that compose the document (Le and Mikolov, 2014).
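As a rough illustration of mapping a longer text into the word-vector space, the sketch below simply averages pre-trained word vectors. Note this is a simplification of our own, not the actual Doc2Vec algorithm, which trains the document vector jointly with the word vectors:

```python
def doc_vector(tokens, word_vecs, dim):
    # Average the vectors of known words; return a zero vector if none are known.
    # word_vecs: mapping from word to a dim-length list of floats
    known = [word_vecs[t] for t in tokens if t in word_vecs]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]
```

The resulting document vectors live in the same space as the word vectors and can be compared with the similarity measure described next.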

Similarity Analysis
Throughout the research, cosine similarities between text data were calculated to investigate
the comparable specifications or to extract the corresponding sentences. The most well-known
measure of vector similarity would be the Euclidean distance; however, it does not fully reflect the similarity between text vectors. For this reason, in this research we used the cosine similarity to investigate the similarities in the text.
Cosine similarity computes the distance of two vectors based on the inner value of the angle,
not the straight distance. By doing so, the excessive frequency of certain words cannot distort the
distance between vectors. The cosine similarity between two vectors, A and B, would be
calculated as shown in Equation 5, where n indicates the dimension of the vectors, and Ai
indicates the value of the ith element of vector A.

Cosine Similarity = cos θ = (A · B) / (‖A‖ ‖B‖) = Σ(i=1..n) Ai Bi / ( √(Σ(i=1..n) Ai²) × √(Σ(i=1..n) Bi²) )    Equation (5)

Named Entity Recognition


Named Entity Recognition (NER) is a form of text classification that automatically labels each word with an informative category, such as location, name, object, and action (McCallum and Li, 2003). The target categories, called named entities, were assigned by the researchers; in this research, the words were labeled with six categories, i.e., (1) none, (2) object, (3) standard, (4) environment, (5) condition, and (6) reference. The description of each category is provided in Table 1.

Table 1. Word Categories for NER


Category Description
None Not an informative element for text analysis
Object A subject of construction specification standards
Standard A construction standard stated in the specification
Environment An environmental factor that affects the construction standard
Condition A detailed condition of the environmental factor
Reference A referenced document for the standard

NER is commonly conducted in two different ways, i.e., via the rule-based model and via the
machine learning model. The rule-based model performs the recognizing process based on the
predetermined rules, such as 'FHWA → [Organization]', 'Ohio → [Region]', and 'Asphalt → [Object]'. Because of the definite rules, the accuracy of the model would be considerably high,
but the model could not recognize any other entities that were not stated in the rules. To address
this limitation, in our research, we conducted NER with the machine learning concept by
developing a Recurrent Neural Network (RNN) model. RNN is a concept of Artificial Neural
Network (ANN), which is suitable for handling sequential data (Mikolov et al., 2010). The
model utilizes the previous classification results recurrently, that is, the input vector of the
current step and the output class of the previous step are used as the input to the current step.
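The recurrence described above can be illustrated with a toy greedy tagger in which the previous predicted class feeds into the current step's scores. This is a deliberately simplified stand-in of our own (hand-set score tables, no learned hidden state), not the authors' RNN model:

```python
LABELS = ["none", "obj", "std", "env", "con", "ref"]

def greedy_tag(word_ids, w_word, w_prev):
    """Tag a sentence left to right.

    word_ids: one feature index per token
    w_word:   w_word[i][j] = score of label j given current word id i
    w_prev:   w_prev[p][j] = score of label j given previous predicted label p
    """
    prev = 0  # start from the 'none' state
    tags = []
    for wid in word_ids:
        # current input and previous output jointly score the labels
        scores = [w_word[wid][j] + w_prev[prev][j] for j in range(len(LABELS))]
        prev = max(range(len(LABELS)), key=scores.__getitem__)
        tags.append(LABELS[prev])
    return tags
```

A trained RNN additionally carries a continuous hidden state and learns the score tables by backpropagation; the loop structure, however, is the same.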

EXPERIMENTAL DESIGN AND RESULTS


Research Framework
The overall research framework consists of four steps that are processed by the previously
mentioned text mining methodologies: (1) selection of comparable specifications, (2) identification of corresponding sentences, (3) extraction of construction standards, and (4) comparison of construction standards. The research has progressed through the third step (i.e., extraction of construction standards), and the interim results are presented in this section.

Step (1) Result: Selection of Comparable Specifications


Recalling that the research objective was to automatically review construction specifications, we first needed a set of reference specifications that could serve as the correct answer. We assumed that the standard specifications of the U.S. were well written, and we used them as candidates for the reference data in our research. In addition, considering the prior knowledge that construction specifications commonly consist of a combination of standard specifications, it seemed appropriate to select the most similar specifications as the reference data (i.e., comparable specifications).
The text data of specifications were represented in numeric vectors by TF-IDF embedding,
and then we calculated the cosine similarities between the construction specification in Qatar
(QAT) and the 43 standard specifications (USA). As a result, the standard specifications from
Alabama, Colorado, and Arkansas showed high similarities to QAT, i.e., 0.728, 0.723, and
0.718, respectively. After qualitative investigation by industry practitioners, these three were
used as the comparable specifications in the following steps.
Although Word2Vec and Doc2Vec are generally known to outperform TF-IDF in language modeling and computing efficiency, those models have disadvantages in interpretation because they mix up the vector space while learning the corpus (the relationships among the words used in the specifications). Therefore, since the results of this 'Selection of Comparable Specifications' step must be analyzed qualitatively by practitioners, we embedded the specification documents with the TF-IDF method.
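The selection step above can be sketched as ranking candidate references by similarity to the query specification. For brevity, this illustration (names of our choosing) scores raw term-count vectors with cosine similarity rather than the full TF-IDF weighting:

```python
from collections import Counter
import math

def rank_references(query_tokens, ref_docs):
    """Return (name, cosine similarity) pairs, most similar first.

    ref_docs maps a reference name (e.g., a state) to its token list.
    """
    def cos(c1, c2):
        # cosine similarity between two sparse count vectors
        dot = sum(c1[t] * c2[t] for t in c1)
        n1 = math.sqrt(sum(v * v for v in c1.values()))
        n2 = math.sqrt(sum(v * v for v in c2.values()))
        return dot / (n1 * n2) if n1 and n2 else 0.0

    q = Counter(query_tokens)
    scored = [(name, cos(q, Counter(toks))) for name, toks in ref_docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

The top-ranked references would then be handed to practitioners for the qualitative check described above.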

Step (2) Result: Identification of Corresponding Sentences


Occasionally, certain standards might have different values depending on the associated
category. For instance, the standard value of the thickness of concrete would be different
according to whether the category is the ceiling or the floor. Thus, it is crucial to identify the
corresponding text (e.g., category, paragraph, and sentence) that describes the same target prior
to reviewing the construction standard.
For the purpose of feasibility testing, this paper identified corresponding sentences from only two paragraphs that we had concluded correspond to each other. The omitted steps, i.e.,
identifying corresponding categories and paragraphs, will be covered by future research planned
by the authors.
In this research, we assumed that corresponding texts would show high similarity. Doc2Vec embedding was conducted for every sentence from the four documents (i.e., the QAT construction specification and the three comparable specifications), and then we calculated the cosine similarities between each pair of sentences. The results showed insufficient quality, as evidenced by the correct sentence being ranked only 7th among 334 sentences.

Step (3) Result: Extraction of Construction Standards


After the corresponding sentences were identified, the information of object, standard,
environment, condition, and reference must be extracted automatically. In this research, we
developed an RNN model to recognize the entities from the text data.

Since there is no existing labeled data for NER in the field of construction specification
review, the researchers had to label every sentence one by one. For now, only 273 sentences
have been labeled and utilized in developing the NER model. We used 70% of the data (191
sentences) to train the model, and the remaining 30% (82 sentences) was used to validate the
classification results.
Table 2 is a confusion matrix of the classification results of the NER model. The results in the table indicate that the model failed to classify anything into the 'none' category. Moreover, the model seemed to be naïve in that it categorized most words into the 'standard' category.

Table 2. Confusion Matrix of NER


                          Prediction
           none    obj    std    env    con    ref    total
Actual
  none        0    204    517    107     43      1      872
  obj         0    174     24     11      9      0      218
  std         0     27    425     40     22      0      514
  env         0     40     48     32     20      2      142
  con         0     17     64     19     20     11      131
  ref         0      3     15      0      1      3       22
  total       0    465  1,093    209    115     17    1,899

To validate the NER model quantitatively, precision and recall were measured as shown in
Table 3. As mentioned above, the model categorized most words into the ‘standard’ category, so
that the recall of ‘standard’ had a high value (0.827). However, the overall results of precision
and recall were both inadequate, and the average F1 score was only 0.256. Discussion of these
results and our plan for future research are presented in the conclusion section.

Table 3. Precision and Recall (NER)


Class Precision Recall F1 Score
none 0.000 (0/0) 0.000 (0/872) 0.000
obj 0.374 (174/465) 0.798 (174/218) 0.510
std 0.389 (425/1093) 0.827 (425/514) 0.529
env 0.153 (32/209) 0.225 (32/142) 0.182
con 0.174 (20/115) 0.153 (20/131) 0.163
ref 0.176 (3/17) 0.136 (3/22) 0.154
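The figures in Table 3 follow mechanically from the confusion matrix in Table 2: precision divides each diagonal entry by its column total, recall by its row total, and F1 is their harmonic mean. A small sketch reproducing them:

```python
LABELS = ["none", "obj", "std", "env", "con", "ref"]

# Rows are actual classes, columns are predicted classes (values from Table 2).
CONFUSION = [
    [0, 204, 517, 107, 43,  1],
    [0, 174,  24,  11,  9,  0],
    [0,  27, 425,  40, 22,  0],
    [0,  40,  48,  32, 20,  2],
    [0,  17,  64,  19, 20, 11],
    [0,   3,  15,   0,  1,  3],
]

def prf1(matrix, k):
    # precision, recall, and F1 for class index k
    tp = matrix[k][k]
    predicted = sum(row[k] for row in matrix)   # column total
    actual = sum(matrix[k])                      # row total
    p = tp / predicted if predicted else 0.0
    r = tp / actual if actual else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

For 'obj' this yields precision 174/465 ≈ 0.374, recall 174/218 ≈ 0.798, and F1 ≈ 0.510, matching Table 3.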

CONCLUSION
Since the research was ongoing, some critical assumptions had to be made, and the interim results were limited, with relatively large error rates. In addition, the number of training sentences was definitely insufficient to train the RNN model; the more training sentences we have, the better the deep learning model performs. To overcome these problems, in future research, we plan to collect more specifications from Australia and the United Kingdom, expand the training set by labeling additional sentences, and thus enhance the models that were developed.
The results of this research suggested that an automatic reviewing framework of construction
specifications was required to cover all of the processes involved, ranging from the collection of
data to extracting the target information. In a future study, we will conduct comparison analysis between the standard information from different specifications. In addition, we plan to test the
applicability of our approach by applying the research results to several construction sites as case
studies.

ACKNOWLEDGMENT
This research was supported by Daewoo Engineering & Construction and the BK21 PLUS
research program of the National Research Foundation of Korea, and we appreciated the pre-
processing of the data that was performed by researchers in the Construction Innovation
Laboratory at Seoul National University.

REFERENCES
Chopra, D., Joshi, N., and Mathur, I. (2016). Mastering Natural Language Processing with
Python, Packt Publishing Ltd.
Le, Q., and Mikolov, T. (2014). “Distributed Representations of Sentences and Documents.” 31st
International Conference on Machine Learning, Beijing, p. 1–9.
Lee, J., Yi, J.-S., and Son, J. (2016). Unstructured Construction Data Analytics Using R
Programming - Focused on Overseas Construction Adjudication Cases: Journal of the
Architectural Institute of Korea, Vol. 32, No. 5, pp. 37–44, DOI:
http://dx.doi.org/10.5659/JAIK_SC.2016.32.5.37.
McCallum, A., and Li, W. (2003). “Early results for named entity recognition with conditional
random fields, feature induction and web-enhanced lexicons.” Proceedings of the seventh
conference on Natural language learning at HLT-NAACL 2003, Edmonton, Canada, p. 188–
191.
Mikolov, T., Corrado, G., Chen, K., and Dean, J. (2013). Efficient Estimation of Word
Representations in Vector Space: Proceedings of the International Conference on Learning
Representations (ICLR 2013), pp. 1–12, DOI: 10.1162/153244303322533223.
Mikolov, T., Karafiat, M., Burget, L., Cernocky, J. H., and Khudanpur, S. (2010). "Recurrent neural network based language model." Eleventh Annual Conference of the International Speech Communication Association, p. 1045–1048.
Al Qady, M., and Kandil, A. (2015). Automatic Classification of Project Documents on the Basis
of Text Content: Journal of Computing in Civil Engineering, Vol. 29, No. 3, pp. 1–11, DOI:
10.1061/(ASCE)CP.1943-5487.0000338.
Zhang, J., and El-Gohary, N.M. (2016). Semantic NLP-Based Information Extraction from
Construction Regulatory Documents for Automated Compliance Checking: Journal of
Computing in Civil Engineering, Vol. 30, No. 2, pp. 1–14, DOI: 10.1061/(ASCE)CP.1943-
5487.0000346.


Factors Influencing Measurement Accuracy of Unmanned Aerial Systems (UAS) and Photogrammetry in Construction Earthwork
Xi Wang, Ph.D.1; Julia C. Chen2; and Gabriel B. Dadi, Ph.D., P.E.3
1Assistant Professor, Dept. of Engineering, 107 Engineering and Business Building, Univ. of Mount Union, 1972 Clerk Ave., Alliance, OH 44601-4988. E-mail: [email protected]
2Student, Dept. of Engineering, Univ. of Mount Union, 1972 Clerk Ave., Alliance, OH 44601-4988. E-mail: [email protected]
3Assistant Professor, Dept. of Civil Engineering, Univ. of Kentucky, 151C Raymond Building, Lexington, KY 40506-0281. E-mail: [email protected]

ABSTRACT
Earthwork is one of the major components of construction projects. In recent years,
unmanned aerial systems (UAS) as a data acquisition platform is becoming attractive for many
surveying applications in construction. This work identifies the effects of various flight
parameters and construction materials on the measurement accuracy and operational efficiency
of UAS. Flight altitudes of 60, 90, 120, and 150 feet above ground level, two image overlapping
rates (70% and 90%), and four common types of earthwork material with the use of 16 ground
control points (GCPs), are chosen to analyze their influence on the positional errors through
multiple comprehensive comparisons and statistical analysis. The results indicate that the use of
GCPs is the most influential factor. However, it is necessary to obtain a balance among all
factors because no single factor is able to improve the spatial quality of the model if others do
not also perform well.

INTRODUCTION
“Unmanned Aerial Systems (UAS)” is an all-encompassing description that encapsulates the
aircraft component, sensor payloads, and a ground control station. The Unmanned Aerial Vehicle
(UAV) platform is equipped with various sensors including cameras, Global Positioning Systems
(GPS) and other specialized communication devices. Capable of operating at different levels of
autonomy, the UAVs are controlled by a ground control station, the activity hub during UAV
missions, which provides the necessary capabilities to plan and execute UAV missions
(Natarajan, 2001). The UAS can transfer visual assets collected by the UAV’s platform to its
ground control station in near real-time (Irizarry and Costa, 2016). Photogrammetry, a
technology using visual assets to derive measurements and three-dimensional (3D) models of
real-world objects or scenes, uses the mathematics of light rays and the location of the camera
when the images are taken to build up information about the geometry of objects. The aim of
utilizing photogrammetry technology is to process or convert images captured by the UAS into
various outputs, such as point cloud models, according to different needs. As more accurate GPS
and camera technologies have developed, the use of UAS is becoming increasingly popular in
various domains such as archaeology and cultural heritage studies (Bendea et al., 2007 and
Gómez-Candón et al., 2014), forestry and agriculture (Grenzdörffer et al., 2008), environment
surveying (Ezequiel et al., 2014), emergency management (Chou et al., 2010 and Molina et al.,
2012), and transportation (Puri et al. 2007). Within the domain of civil engineering, UAS have
been adopted to solve various problems such as bridge inspection (Metni et al., 2007 and
Hallermann et al., 2014), soil erosion (d'Oleire-Oltmanns et al., 2012), earthwork monitoring (Siebert and Teizer, 2014) and measurement (Wang, X. et al., 2017), and 3D model creation (Xie
et al., 2012).
However, despite overall growing popularity within the domain of Civil Engineering, the
utilization of the UAS and photogrammetry technologies in construction is still at an early stage.
Little research has been conducted from a pragmatic perspective to evaluate the effectiveness of
this emerging technology due to practical issues such as local regulations, limited resources for test
fields, or strict flight conditions. Therefore, this work aims to conduct a quantitative analysis to
evaluate the influence of important UAS flight parameters and site conditions on measurement
accuracy. According to practical experience and existing literature, flight altitude, image
overlapping rate, GCPs, and soil type are key factors during the operation of UAS and modeling
quality control (Siebert and Teizer, 2014; Mesas-Carrascosa et al., 2016; Nassar and Jung, 2012).
The goal of this project is to compare the positional accuracy of points when applying different
factor parameters in order to identify the effectiveness of each factor and interactions among
them, thereby providing a practical reference for managers and engineers to allow for efficient
application of UAS and photogrammetry in construction projects.

INFLUENTIAL FACTORS
One of the most important flight parameters during UAS operations is the flight altitude. It
not only determines the relative size of the pixels of an image, but also the flight durations and
the area to be covered (Christiansen et al., 2017). To be more specific, the flight altitude is
related to the Ground Sampling Distance (GSD). GSD is the distance between two consecutive
pixel centers measured on the ground. The larger the value of GSD, the lower the spatial
resolution of the image and the lower visibility of the details. In selecting flight altitude, it is
essential to consider the balance between the spatial resolution and area covered. Higher spatial
resolution will contribute to image quality but may result in prolonged flight duration.
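Assuming the usual pinhole-camera relation, GSD = (sensor width × altitude) / (focal length × image width). The sketch below applies it; the camera parameters shown are illustrative assumptions of ours, not values reported in this study:

```python
def ground_sampling_distance(sensor_width_mm, focal_length_mm,
                             image_width_px, altitude_m):
    # Metres of ground covered by one pixel: a higher altitude (or a larger
    # sensor-to-focal-length ratio) gives a larger, i.e., coarser, GSD.
    return (sensor_width_mm * altitude_m) / (focal_length_mm * image_width_px)
```

With an assumed 6.17 mm sensor width, 3.61 mm focal length, and a 4096-pixel-wide image, flying at 30 m gives a GSD of roughly 1.25 cm/pixel, and doubling the altitude doubles the GSD.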
Another crucial factor is the overlapping rate between images. Photogrammetry is a
technology of image processing to interpret the shape and location of an object from one or more
photographs of that object. It aims to reconstruct an object from two-dimensional (2D) graphic
form to three-dimensional (3D) form. The shape and position of an object are determined by
reconstructing bundles of light rays which define the spatial direction of the ray to the
corresponding object point. From the intersection of at least two corresponding and separated
rays, an object point can be located in 3D space. Therefore, image processing is based on
automatically finding thousands of common points between images. Each characteristic point in
an image is called a key-point. When two key-points, from two different images captured at
different locations, are found to be the same, they will match together. When there is high
overlap between images, the camera on the UAS is able to capture a larger common area to
generate more matched key-points and thus improve the computational accuracy.
For the use in surveying application, an absolute accuracy test is mandatory. The quality of
the 3D model depends on the number of images and manual tie points. The use of Ground
Control Points (GCPs) is an effective method to improve accuracy. GCPs are points with known
coordinates measured by highly accurate GPS units in the area of interest. The photogrammetry
software is able to process projects with or without geo-locations, but accurate GCPs improve
the global accuracy of the project. GCPs will give the scales, orientations, and positions to the
final results (Wang, J. et al., 2012). Therefore, the number of GCPs and their distribution are
important to control the modeling quality and accuracy of measurements.
Lastly, the material of the mapping surface also has a great impact on the quality of models during the image processing. A 3D image is a non-contact measurement method applied to produce a 3D representation of a physical object (Furukawa and Ponce 2010). The point cloud
model is the major output of image processing through photogrammetry. A point cloud is
composed of a set of vertices used to represent the external surface of objects in a 3D coordinate
system. The photogrammetry software generates a point cloud model through measuring a large
number of points on the surface of an object (Nassar and Jung 2012). Therefore, the different
surface material of an object may affect the modelling quality at various levels.

Figure 1. Grid Flight Pattern over the Study Area


METHODOLOGY
The UAS used for image acquisition in this study was the DJI Inspire 1, a vertical takeoff
and landing aircraft powered by a 22.2 V battery (see Figure 1). The system has a maximum
takeoff weight of 7.71 lbs and a maximum wind resistance of up to 10 m/s. The maximum
flight duration is approximately 18 minutes. The UAS is equipped with a 1/2.3-inch CMOS
sensor with a 20 mm lens, and the stock camera has a 4096 × 2160 resolution for still
images (DJI 2018). During operation, the UAS autopilot signals the equipped sensor to capture
a photo while simultaneously recording geo-referencing information, such as location and
navigation angles, to an SD card for post-processing. The study area was 163 × 247 ft in size
and located at the Coldstream Dairy Research Farm Complex in Lexington, Kentucky.
Multiple flights were conducted following the scheme presented in Figure 1. This flight plan
is compatible with different flight altitudes, image overlap rates, and the use of GCPs. A set
of flight missions was performed at altitudes of 60, 90, 120, and 150 ft. Due to the height of wire
poles on the farm, it was dangerous to fly the UAS lower than 60 ft. For each altitude level, the
UAS captured photos at two combinations of forward and side overlap rates: 70%-40% and
90%-60%. All the flight missions followed the grid pattern because this study aimed to map
an area of large size rather than to model a vertical object (see Figure 1). In addition, flight
missions were performed under the same weather conditions,


especially wind speed. In this study, Pix4Dmapper photogrammetry software was selected to
process images and generate 3D point cloud and DSM models of the study area. Afterward,
images captured by each flight were processed with and without GCPs. The coordinates of GCPs
were measured by an EPOCH 50 GNSS Rover. A total of 16 GCPs were measured and spaced
evenly across the area of interest to minimize the errors in scale and orientation.

Table 1. RMSE (ft) of Flights Processed by Different Number of GCPs

Flight Altitude (ft)   Image Overlapping Rate (%)   No GCPs   1 GCP   4 GCPs   8 GCPs   12 GCPs   16 GCPs
60                     70%-40%                      4.09      4.08    0.92     0.78     0.55      0.52
60                     90%-60%                      5.35      4.25    0.29     0.28     0.28      0.28
90                     70%-40%                      4.99      4.55    0.81     0.70     0.64      0.50
90                     90%-60%                      3.31      3.31    0.32     0.29     0.28      0.28
120                    70%-40%                      3.12      3.19    0.60     0.60     0.50      0.43
120                    90%-60%                      6.85      5.74    0.50     0.45     0.31      0.29
150                    70%-40%                      3.49      3.65    0.64     0.57     0.38      0.38
150                    90%-60%                      2.91      2.89    0.51     0.48     0.30      0.31

RESULTS AND DISCUSSIONS


The major output of image processing was a point cloud model. The accuracy of the position
of each point directly contributed to the linear or volumetric measurements. To be more specific,
the positional absolute accuracy indicates how accurately a spatial object is positioned on the
map with respect to its true position on the ground, within an absolute reference frame such as
the UTM coordinate system (Küng et al. 2011). The 16 GCPs served as checkpoints for
measuring positional accuracy, regardless of how many GCPs were used for processing. In this
study, the positional accuracy of points was evaluated by the Root Mean Square Error (RMSE)
(Luhmann et al. 2013; Siebert and Teizer 2014). The desktop used for analysis had an Intel(R)
Core(TM) i7-4790 CPU @ 3.60 GHz and 32 GB of RAM; the operating system was 64-bit
Windows 7 Professional, and the photogrammetry platform was Pix4Dmapper Pro.
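The checkpoint-based RMSE described above can be sketched as follows (a minimal illustration with NumPy; the function name, array layout, and units are our assumptions, not taken from the study):

```python
import numpy as np

def checkpoint_rmse(modeled, surveyed):
    """RMSE (ft) between checkpoint coordinates taken from the point cloud
    and the surveyed ground-truth coordinates, both given as (N, 3) arrays
    of X, Y, Z values in a common reference frame (e.g., UTM)."""
    modeled = np.asarray(modeled, dtype=float)
    surveyed = np.asarray(surveyed, dtype=float)
    # 3D residual distance at each of the N checkpoints
    residuals = np.linalg.norm(modeled - surveyed, axis=1)
    return float(np.sqrt(np.mean(residuals ** 2)))
```

With the 16 checkpoints of this study, `modeled` and `surveyed` would each be a (16, 3) array.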
Table 1 shows the RMSE for each flight with varying numbers of GCPs applied at multiple
flight altitudes and image overlap rates. The errors decreased significantly when all GCPs were
used for processing, because the GCPs provide an accurate orientation of the coordinate
reference system. The results also show random RMSE behavior when no GCPs were used, due
to the lack of geometric constraints on the aerial-triangulation computation; this behavior
appears to be independent of flight altitude and image overlap rate. Regarding the image
overlap settings, the results indicate that higher overlap rates yield smaller errors when all GCPs
are applied, regardless of flight altitude. At the lower overlap rate, meanwhile, the errors
decreased in magnitude as the flight altitude increased.


According to observations of the collected data, a lower flight altitude, a higher image
overlap rate, and the use of GCPs result in better positional accuracy. A multiple regression
analysis was used to verify these observations. According to the results (see Figure 2), the use
of GCPs is a statistically significant predictor, because the p-value of its t-test is smaller
than 0.05.
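A regression of this form can be reconstructed from the Table 1 values alone. The sketch below is an ordinary-least-squares fit with NumPy; the study's exact model specification and the t-tests behind Figure 2 are not reproduced, so only the signs of the fitted slopes should be read from it:

```python
import numpy as np

# RMSE (ft) from Table 1; keys are (altitude_ft, high_overlap_flag),
# values are errors at 0, 1, 4, 8, 12, and 16 GCPs.
GCP_LEVELS = [0, 1, 4, 8, 12, 16]
RMSE = {
    (60, 0): [4.09, 4.08, 0.92, 0.78, 0.55, 0.52],
    (60, 1): [5.35, 4.25, 0.29, 0.28, 0.28, 0.28],
    (90, 0): [4.99, 4.55, 0.81, 0.70, 0.64, 0.50],
    (90, 1): [3.31, 3.31, 0.32, 0.29, 0.28, 0.28],
    (120, 0): [3.12, 3.19, 0.60, 0.60, 0.50, 0.43],
    (120, 1): [6.85, 5.74, 0.50, 0.45, 0.31, 0.29],
    (150, 0): [3.49, 3.65, 0.64, 0.57, 0.38, 0.38],
    (150, 1): [2.91, 2.89, 0.51, 0.48, 0.30, 0.31],
}

def fit_rmse_model():
    """Least-squares fit of RMSE on altitude, overlap level, and GCP count."""
    X, y = [], []
    for (alt, hi), errors in RMSE.items():
        for n_gcp, err in zip(GCP_LEVELS, errors):
            X.append([1.0, alt, hi, n_gcp])  # intercept + three predictors
            y.append(err)
    coef, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return coef  # slopes for [intercept, altitude, overlap, n_gcp]

coef = fit_rmse_model()
# coef[3], the GCP-count slope, comes out negative: adding GCPs lowers the error.
```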

Figure 2. Estimations of the Independent Variables Significance


In this study, the effect of soil type on volumetric measurement accuracy was tested by
modeling four samples composed of different soil types: sand, clay, fine-grade gravel, and
coarse-grade gravel. The actual volumes of the samples were based on the standard volumes
measured by the manufacturer. All samples were piled in similar shapes under the same
weather and illumination conditions (see Figure 3).

Figure 3. Sample Piles of Different Soil Types


As seen in Table 2, the results indicate that the measured volume of clay had the smallest
error. In addition, as the soil granularity increased and the color of the material became lighter,
the accuracy of measurement decreased. The reason may be that coarser surface textures created
more visual noise on the surfaces of the models, while light-colored and glossy surfaces tended
to saturate images, leading to difficulties in visual interpretation.

CONCLUSION
This study aimed to investigate how important flight parameters of the UAS and
environmental factors impacted measurement accuracy through experimental flights and
statistical analysis of positional errors computed through photogrammetry technologies.
After detailed comparisons and analysis of each flight plan, one can conclude that the
combination of a low flight altitude, a high image overlap rate, the use of a proper number of

GCPs, and a clay modeling surface can maximize the measurement accuracy. The
positional errors become much smaller when more than one GCP is used for processing, because
GCPs provide an accurate orientation of the coordinate reference system. This behavior is
independent of flight altitude and image overlap rate. Regarding the image overlap and flight
altitude settings, however, the higher overlap rate produced slightly larger errors as the flight
altitude increased, while the errors at the lower overlap rate decreased with altitude. This
tendency did not change when different numbers of GCPs were applied. Although the use of
GCPs was the most influential factor according to the multiple regression analysis, an unlimited
number of GCPs is not an optimal strategy for guaranteeing accurate measurements: in the
experiment, there were no significant differences in errors between using 4 GCPs and using
16 GCPs. The selection of parameter values largely depends on the level of accuracy required
by users.

Table 2. Impact of Soil Types on the Accuracy of Volumetric Measurements

Soil Type   Number of Calibrated Photos   Actual Volume (ft3)   Computed Volume (ft3)   % Error
Clay        11                            1.5                   1.48                    1.33
Sand        11                            0.5                   0.53                    6.00
Gravel      10                            0.5                   0.47                    6.00
Rock        10                            0.5                   0.45                    10.00
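The % Error column follows directly from the actual and computed volumes as an unsigned relative error (a trivial check; the function name is ours):

```python
def volume_percent_error(actual_ft3, computed_ft3):
    """Unsigned percent error of a photogrammetric volume estimate."""
    return abs(computed_ft3 - actual_ft3) / actual_ft3 * 100.0
```

For the clay pile, for example, `volume_percent_error(1.5, 1.48)` gives about 1.33%, matching Table 2.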

The limitations of this study mainly came from the selection of the UAS equipment and
photogrammetry software. The UAS, an especially low-cost device, limited the sensor payload
in weight and dimensions, so low-weight sensors with small-format amateur cameras had to be
used. Compared to more expensive UAS with large-format cameras, the low-cost UAS had to
acquire a higher number of images to obtain the same image coverage and a comparable image
resolution. Moreover, the low-cost sensors were less stable, which resulted in lower image
quality. When processing the collected images, this study did not examine the differences
caused by varying devices. In the future, more research could be conducted on how different
devices and other potential environmental factors impact measurement accuracy, once current
limitations of UAS technology, such as inaccurate geo-referencing capability and limited
battery capacity, are resolved.

REFERENCES
Bendea, H., Chiabrando, F., Tonolo, F. G., & Marenchino, D. (2007, October). Mapping
of archaeological areas using a low-cost UAV: The Augusta Bagiennorum test site. In XXI
International CIPA Symposium (Vol. 1).
Chou, T. Y., Yeh, M. L., Chen, Y. C., & Chen, Y. H. (2010). Disaster monitoring and
management by the unmanned aerial vehicle technology.
Christiansen, M. P., Laursen, M. S., Jørgensen, R. N., Skovsen, S., & Gislum, R. (2017).
Designing and Testing a UAV Mapping System for Agricultural Field Surveying. Sensors,
17(12), 2703.
d'Oleire-Oltmanns, S., Marzolff, I., Peter, K. D., & Ries, J. B. (2012). Unmanned aerial vehicle


(UAV) for monitoring soil erosion in Morocco. Remote Sensing, 4(11), 3390-3416.
Ezequiel, C. A. F., Cua, M., Libatique, N. C., Tangonan, G. L., Alampay, R., Labuguen, R. T., ...
& Loreto, A. B. (2014, May). UAV aerial imaging applications for post-disaster assessment,
environmental management and infrastructure development. In Unmanned Aircraft Systems
(ICUAS), 2014 International Conference on (pp. 274-283). IEEE.
Xie, F., Liu, Z., Gui, D., & Liu, H. (2012). Study on construction of 3D building based on UAV
images. The International Archives of the Photogrammetry, Remote Sensing and Spatial
Information Sciences, 469-473.
Furukawa, Y., & Ponce, J. (2010). Accurate, dense, and robust multiview stereopsis. IEEE
transactions on pattern analysis and machine intelligence, 32(8), 1362-1376.
Gómez-Candón, D., De Castro, A. I., & López-Granados, F. (2014). Assessing the accuracy of
mosaics from unmanned aerial vehicle (UAV) imagery for precision agriculture purposes in
wheat. Precision Agriculture, 15(1), 44-56.
Grenzdörffer, G. J., Engel, A., & Teichert, B. (2008). The photogrammetric potential of low-cost
UAVs in forestry and agriculture. The International Archives of the Photogrammetry,
Remote Sensing and Spatial Information Sciences, 31(B3), 1207-1214.
Hugenholtz, C. H., Whitehead, K., Brown, O. W., Barchyn, T. E., Moorman, B. J., LeClair, A.,
& Hamilton, T. (2013). Geomorphological mapping with a small unmanned aircraft system
(sUAS): Feature detection and accuracy assessment of a photogrammetrically-derived digital
terrain model. Geomorphology, 194, 16-24.
Luhmann, T., Robson, S., Kyle, S., & Boehm, J. (2013). Close-range photogrammetry and 3D
imaging. Walter de Gruyter.
Metni, N., & Hamel, T. (2007). A UAV for bridge inspection: Visual servoing control law with
orientation limits. Automation in construction, 17(1), 3-10.
Mesas-Carrascosa, F. J., Notario García, M. D., Meroño de Larriva, J. E., & García-Ferrer, A.
(2016). An analysis of the influence of flight parameters in the generation of unmanned aerial
vehicle (UAV) orthomosaicks to survey archaeological areas. Sensors, 16(11), 1838.
Nassar, K., & Jung, Y. H. (2012). Structure-From-Motion Approach to the Reconstruction of
Surfaces for Earthwork Planning. Journal of Construction Engineering and Project
Management, 2(3), 1-7.
Pix4D (2016). https://support.pix4d.com/hc/en-us/articles/202557459#label
Siebert, S., & Teizer, J. (2014). Mobile 3D mapping for surveying earthwork projects using an
Unmanned Aerial Vehicle (UAV) system. Automation in Construction, 41, 1-14.
Wang, J., Ge, Y., Heuvelink, G. B., Zhou, C., & Brus, D. (2012). Effect of the sampling design
of ground control points on the geometric correction of remotely sensed imagery.
International Journal of Applied Earth Observation and Geoinformation, 18, 91-100.
Wang, X., Al-Shabbani, Z., Sturgill, R., Kirk, A., & Dadi, G. B. (2017). Estimating Earthwork
Volumes Through Use of Unmanned Aerial Systems. Transportation Research Record:
Journal of the Transportation Research Board, (2630), 1-8.


Perceptions for Crane Operations


Bo Xiao1; Keith Yin Kong Lam2; Jieyu Cui3; and Shih-Chung Kang, Ph.D., P.Eng.4
1Graduate Student, Dept. of Civil and Environmental Engineering, Univ. of Alberta, 9211-116 St. NW, Edmonton, AB T6G 1H9, Canada. E-mail: [email protected]
2Undergraduate, Dept. of Civil and Environmental Engineering, Univ. of Alberta, 9211-116 St. NW, Edmonton, AB T6G 1H9, Canada. E-mail: [email protected]
3Undergraduate, School of Naval Architecture, Ocean and Civil Engineering, Shanghai Jiao Tong Univ., 800 Dongchuan Rd., Shanghai, China 200240. E-mail: [email protected]
4Professor, Dept. of Civil and Environmental Engineering, Univ. of Alberta, 9211-116 St. NW, Edmonton, AB T6G 1H9, Canada. E-mail: [email protected]

ABSTRACT
Sensors and computation increase the precision, efficiency, and agility of crane operations.
This paper presents ongoing work on developing computational methods to enhance crane
operations. This research focuses on three significant challenges for crane operators: (1)
identifying construction equipment and their activities, (2) identifying and tracking personnel,
and (3) tracking the rigging object. In the preliminary stage, we focus on the first challenge,
identifying construction equipment and activities. A total of 5,000 images have been collected
and manually labeled for training deep learning detection algorithms. In the next steps, we will
employ the Inception-SSD method to locate personnel, trucks, and excavators. After that, we
will propose a method that recognizes excavator activities from crane views. Once the learning
algorithm is reliable, it will help crane operators operate cranes with confidence and reduce the
difficulty of crane operation, lowering training time and safety concerns simultaneously.

INTRODUCTION
Sensors and computation increase the precision, efficiency, and agility of crane operations,
and computer vision technology has achieved great success in the construction automation
field. This paper presents ongoing work on developing computational methods to enhance
crane operations. Crane erections are often on the critical paths of construction projects, and
the efficiency of crane operations directly influences the overall project performance (Neitzel
et al., 2001). Meanwhile, swinging of the crane cableway caused by inertial forces and winds
makes it difficult for operators to control the crane safely. Additionally, a dynamic site
environment with moving people and construction equipment adds challenges for safe lifting.
The final objective of this research is to overcome three major challenges for crane operators:
(1) identifying construction machines and activities, (2) identifying and tracking personnel,
and (3) tracking rigging objects.
Crane perception based on smart sensors and controllers makes crane operations more
efficient and safer. As illustrated in Figure 1, the labels S and C represent smart sensors and
controllers, respectively. The main idea of crane perception is that widely deployed sensors act
like cameras in the environment, allowing remote users to "know" the working progress.
Information collected from the sites is sent for processing, which uses artificial intelligence for
crane control. The visual perception automatically identifies high-risk and high-value work.
This allows crane operators and remote users to sense the environment, prepare for the next
tasks, and, most importantly, prevent potential risks.


Figure 1. Vision of Crane Perceptions


In the preliminary stage, we have focused on identifying construction equipment and their
activities. Construction videos can be used to identify construction equipment, which helps
prevent potential collisions during crane operations, while recognizing equipment activities
helps remote users estimate productivity. So far, we have manually labeled 5,000 images of
equipment and tested them with two deep learning algorithms, YOLOV3 and Inception-SSD.
The detection results indicate that Inception-SSD performs better than YOLOV3 on our
dataset. After that, we have proposed a method that feeds the detected objects to a 3D CNN
classifier that recognizes excavator activities from the crane view. The proposed method can
be extended to other construction equipment such as lifters, bulldozers, and backhoes. In the
future, we will focus on developing a reliable tracking system for personnel and rigging
objects.

LITERATURE REVIEW
The early work on object detection is the cascade detector (Viola and Jones 2001), which
consists of multiple stages, each an ensemble of simple classifiers. The difficulty of detection
comes from the huge differences within the same category. To fill this gap, various deformable
template methods (Coughlan et al., 2000; Cootes et al., 2001) and part-based methods
(Crandall et al., 2005; Amit and Trouvé, 2007) have been developed in the computer vision
community. Recently, convolutional neural networks (CNNs) have been demonstrated in
object detection and achieved reliable performance. CNNs represent images through a designed
structure of many layers for feature extraction and transformation, which allows the detector
to understand images at a higher level (Krizhevsky et al., 2012; Vedaldi and Zisserman, 2015).
Girshick (2015) proposed the Fast R-CNN detection model, and Ren et al. (2015) proposed
the Faster R-CNN model. Redmon et al. (2016) introduced the YOLO darknet into


detection, which can reach real-time performance. Liu et al. (2016) proposed the SSD model
to exploit the information of tiny image areas.

Figure 2. The analysis of construction equipment detection dataset


Crane operators work in difficult environments, often completing erection activities without
a clear view while staying alert to all possible risks. Many researchers have developed methods
that use cameras to enhance perception while operating cranes. Gong and Caldas (2009)
installed multiple cameras on the crane boom to identify construction activities from the video
streams. Weerasinghe and Ruwanpura (2009) tracked construction resources to reduce waste.
Rezazadeh Azar and McCabe (2011) developed automated methods to detect and track trucks
to monitor productivity in real time. Han and Lee (2013) developed a method to protect
workers from potential collisions by using cameras. Yang et al. (2012) employed Gaussian
background subtraction (Wren et al., 1997) to detect crane jibs and analyze crane activities
from video streams. Kim and Chi (2017) developed tracking methods to locate construction
equipment. Xiao and Zhu (2018) tested 15 tracking algorithms on construction videos and
identified stable trackers in various backgrounds.

CONSTRUCTION EQUIPMENT DETECTION


Object detection is the primary component of this research: equipment, personnel, and
rigging objects all need to be identified by detectors, so choosing a reliable detector is
important. Deep learning detection methods have shown high performance in many
applications, and we decided to adopt this technology to identify construction objects. In order
to evaluate detection algorithms, we collected and manually labeled 5,000 images of equipment
and tested them with the deep learning detection algorithms YOLOV3 and Inception-SSD.
There are four types of construction equipment labeled in the current dataset: truck,
excavator, loader, and backhoe. We analyzed our dataset from different perspectives in Figure 2
and compared it with the COCO dataset (Lin et al., 2014), a well-known detection dataset in
computer vision. Figure 2(a) shows that most images in our dataset contain one or two
categories, while COCO has a uniform distribution. In Figure 2(b), 58% of images contain
only one instance and 30% contain two instances. Our dataset is thus a specific dataset for
construction equipment, while COCO is a general dataset for everyday objects. In Figure 2(c),
the instance size in our dataset is larger than in COCO. Based on these differences, detection
algorithms that perform well in computer vision need to be re-evaluated on the construction
equipment dataset. It should be pointed out that this is ongoing research: compared with other
mature datasets (Figure 2(d)), we have a limited number of categories and instances, and we
will continue to expand this construction detection dataset.
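Per-image statistics like those in Figure 2(b) can be computed from the labels in a few lines (a sketch; the annotation format, a mapping from image ID to a list of labeled instances, is our assumption):

```python
from collections import Counter

def instance_histogram(annotations):
    """Fraction of images that contain k labeled instances, given a
    mapping image_id -> list of labels (one entry per boxed instance)."""
    counts = Counter(len(labels) for labels in annotations.values())
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}
```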
YOLOV3 and Inception-SSD were selected for testing on our dataset because of their
promising performance on COCO. The dataset was split into a training set (90%) and a
validation set (10%). In this research, we used the Mean Average Precision (mAP, LeCun et al.,
2015) to evaluate the detection results. mAP is an evaluation criterion determined by precision
and recall: precision measures how accurate the algorithm is, but it cannot reflect the
performance of finding all positive instances, whereas mAP captures the detector performance
in both accuracy and robustness. A higher mAP value means better detection performance. The
testing results can be found in Table 1. Both detectors achieve higher mAP on the construction
dataset than on COCO, which suggests that detecting construction categories is simpler than
detecting general categories. This result indicates that detecting construction categories using
vision sensors installed on the crane is a reliable option. Inception-SSD performs better than
YOLOV3 overall, so we will employ Inception-SSD for crane perception.

Table 1. The testing performance of detectors ([email protected])

Detector        Truck   Excavator   Loader   Backhoe   Overall
YOLOV3          0.71    0.93        0.91     0.93      0.87
Inception-SSD   0.80    0.93        0.94     0.95      0.91
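mAP averages the per-category Average Precision (AP). A minimal sketch of AP at a fixed IoU threshold is shown below, assuming detections have already been matched to ground-truth boxes; the matching step and the exact interpolation scheme used by the authors are not specified in the paper, so this uses the common all-point interpolation:

```python
import numpy as np

def average_precision(scores, matched, num_gt):
    """AP for one category at a fixed IoU threshold (e.g., 0.5).
    `scores` are detection confidences, `matched` flags whether each
    detection was matched to an unclaimed ground-truth box, and
    `num_gt` is the number of ground-truth instances."""
    order = np.argsort(scores)[::-1]          # rank detections by confidence
    tp = np.asarray(matched, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / num_gt
    precision = cum_tp / (cum_tp + cum_fp)
    # all-point interpolation: make the precision envelope non-increasing
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_recall)           # area under the PR curve
        prev_recall = r
    return float(ap)
```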

EXCAVATOR ACTIVITY RECOGNITION


We have proposed a method to recognize construction activities with a 2D-CNN detector
and a 3D-CNN classifier. "3D-CNN" means putting continuous images, instead of a single
image, into the CNN model as inputs. The 3D-CNN classifier C3D (Tran et al., 2015) has been
used in this research. Because excavation is the most common activity on construction sites,
we take the excavator as an example to illustrate our method. An overview of the proposed
methodology is shown in Figure 3.
In order to identify excavator activities, multiple continuous images are used as inputs
instead of a single image. This kind of input is well suited for exploiting spatiotemporal
features. For each new frame, we compare it with the previous t frames. Inception-SSD detects
excavators


from the input images, and all excavators in each image are located. Since we are interested in
the equipment activities, the located objects are cropped for the next stage. The queue of
cropped images is put into the pre-trained 3D CNN classifier C3D. The 3D ConvNet is
designed for action recognition and performs convolution on multiple image matrices at the
same time; the 3D convolution and pooling operations are performed spatio-temporally,
whereas they are performed only spatially in 2D ConvNets. Similar to image classification, the
output of the softmax layer gives the probability of each action. According to the probability
level, we are able to know whether the excavator is working or idling.
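The detect-crop-classify loop can be sketched as follows. This is a structural sketch only: `detect`, `crop`, and `classify` stand in for the Inception-SSD detector, a bounding-box crop, and the 3D CNN classifier, and the single-excavator simplification is ours:

```python
from collections import deque

def label_activity(frames, detect, crop, classify, t=16):
    """For each incoming frame, detect excavators, crop the first one,
    and once t consecutive crops are buffered, classify the clip as
    'working' or 'idling' with the 3D CNN (sliding t-frame window)."""
    clip = deque(maxlen=t)
    labels = []
    for frame in frames:
        boxes = detect(frame)
        if not boxes:
            continue                  # no excavator visible in this frame
        clip.append(crop(frame, boxes[0]))
        if len(clip) == t:
            labels.append(classify(list(clip)))
    return labels
```

Feeding 20 frames through a t=16 window yields one label per frame once the buffer is full, i.e., five labels.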

Figure 3. The methodology of recognizing excavator activity


CONCLUSION AND FUTURE WORKS
Cranes are among the most crucial equipment on construction sites, and crane operations
significantly influence site safety. Deploying cameras on the crane boom captures images with
a fair view of the whole construction site. By processing this information with artificial
intelligence algorithms, crane operators and remote users can "know" what is happening on
the site, sense the environment, prepare for the next tasks, and, most importantly, prevent
potential risks.
In this research, we focused on the challenge of identifying construction equipment and
their activities. A construction equipment dataset has been organized and labeled. The
comparison of the construction dataset and COCO indicates that instance sizes are larger in
construction scenarios. We tested this dataset with the detection algorithms YOLOV3 and
Inception-SSD, and the testing results show that detecting construction objects in videos is a
promising solution. After that, we proposed a method that recognizes excavator activities from
the crane view, which can be extended to other construction equipment such as lifters,
bulldozers, and backhoes. With the perception of crane operations, construction projects will
be safer and more productive.
The final goal of this research is to develop a crane vision method that is able to locate


construction equipment, identify personnel, and track rigging objects through cameras.
Construction personnel are much smaller in pixels than equipment, so we will work on
developing a method that can utilize limited features and perform robustly from a top view.
For tracking rigging objects, we will install cameras on the rig, which will move during
construction activities; the tracking method is expected to overcome the challenge of motion
blur. Once a reliable system is developed, it will help crane operators operate cranes with
confidence and safety.

REFERENCES
Amit, Y., & Trouvé, A. (2007). Pop: Patchwork of parts models for object recognition.
International Journal of Computer Vision, 75(2), 267-282.
Cootes, T. F., Edwards, G. J., & Taylor, C. J. (2001). Active appearance models. IEEE
Transactions on pattern analysis and machine intelligence, 23(6), 681-685.
Coughlan, J., Yuille, A., English, C., & Snow, D. (2000). Efficient deformable template
detection and localization without user initialization. Computer Vision and Image
Understanding, 78(3), 303-319.
Crandall, D., Felzenszwalb, P., & Huttenlocher, D. (2005, June). Spatial priors for part-based
recognition using statistical models. In Computer Vision and Pattern Recognition, 2005.
CVPR 2005. IEEE Computer Society Conference on (Vol. 1, pp. 10-17). IEEE.
Girshick, R. (2015). Fast r-cnn. In Proceedings of the IEEE international conference on computer
vision (pp. 1440-1448).
Gong, J., & Caldas, C. H. (2009). Computer vision-based video interpretation model for
automated productivity analysis of construction operations. Journal of Computing in Civil
Engineering, 24(3), 252-263.
Han, S., & Lee, S. (2013). A vision-based motion capture and recognition framework for
behavior-based safety management. Automation in Construction, 35, 131-141.
Kim, J., & Chi, S. (2017). Adaptive detector and tracker on construction sites using functional
integration and online learning. Journal of Computing in Civil Engineering, 31(5), 04017026.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep
convolutional neural networks. In Advances in neural information processing systems (pp.
1097-1105).
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.
Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., & Zitnick, C. L. (2014,
September). Microsoft coco: Common objects in context. In European conference on
computer vision (pp. 740-755). Springer, Cham.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016,
October). SSD: Single shot multibox detector. In European conference on computer vision
(pp. 21-37). Springer, Cham.
Neitzel, R. L., Seixas, N. S., & Ren, K. K. (2001). A review of crane safety in the construction
industry. Applied occupational and environmental hygiene, 16(12), 1106-1117.
Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-
time object detection. In Proceedings of the IEEE conference on computer vision and pattern
recognition (pp. 779-788).
Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection
with region proposal networks. In Advances in neural information processing systems (pp.
91-99).


Rezazadeh Azar, E., & McCabe, B. (2011). Automated visual recognition of dump trucks in
construction videos. Journal of Computing in Civil Engineering, 26(6), 769-781.
Tran, D., Bourdev, L., Fergus, R., Torresani, L., & Paluri, M. (2015). Learning spatiotemporal
features with 3d convolutional networks. In Proceedings of the IEEE international conference
on computer vision (pp. 4489-4497).
Vedaldi, A., & Zisserman, A. (2015). VGG Convolutional Neural Networks Practical.
Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features.
In Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001
IEEE Computer Society Conference on (Vol. 1, pp. I-I). IEEE.
Weerasinghe, I. T., & Ruwanpura, J. Y. (2009). Automated data acquisition system to assess
construction worker performance. In Construction Research Congress 2009: Building a
Sustainable Future (pp. 61-70).
Wren, C. R., Azarbayejani, A., Darrell, T., & Pentland, A. P. (1997). Pfinder: Real-time tracking
of the human body. IEEE Transactions on Pattern Analysis & Machine Intelligence, (7), 780-
785.
Xiao, B., & Zhu, Z. (2018). Two-Dimensional Visual Tracking in Construction Scenarios: A
Comparative Study. Journal of Computing in Civil Engineering, 32(3), 04018006.
Yang, J., Vela, P., Teizer, J., & Shi, Z. (2012). Vision-based tower crane tracking for
understanding construction activity. Journal of Computing in Civil Engineering, 28(1), 103-
112.


An Improved Convolutional Neural Network System for Automatically Detecting Rebar in GPR Data
Zhongming Xiang1; Abbas Rashidi2; and Ge (Gaby) Ou3
1Ph.D. Candidate, Dept. of Civil and Environmental Engineering, Univ. of Utah, Salt Lake City, UT 84112. E-mail: [email protected]
2Assistant Professor, Dept. of Civil and Environmental Engineering, Univ. of Utah, Salt Lake City, UT 84112. E-mail: [email protected]
3Assistant Professor, Dept. of Civil and Environmental Engineering, Univ. of Utah, Salt Lake City, UT 84112. E-mail: [email protected]

ABSTRACT
As a mature technology, ground penetrating radar (GPR) is now widely employed for
detecting rebar and other embedded elements in concrete structures. Manually recognizing
rebar in GPR data is a time-consuming and error-prone procedure. Although there are several
approaches to automatically detect rebar, it is still challenging to find a high-resolution and
efficient method for different rebar arrangements, especially for closely spaced rebar meshes.
As an improved convolutional neural network (CNN), AlexNet shows superiority over
traditional methods in the image recognition domain. Thus, this paper introduces AlexNet as
an alternative solution for automatically detecting rebar in GPR data. In order to show the
efficiency of the proposed approach, a traditional CNN is built for comparison. Moreover, this
research evaluates the impacts of different rebar arrangements and different window sizes on
the accuracy of recognizing rebar. The results reveal that: (1) AlexNet outperforms the
traditional CNN approach, and its superiority is more notable when the rebar meshes are
densely distributed; (2) the detection accuracy varies significantly with the size of the splitting
window, and a proper window should contain enough information about the rebar; (3)
uniformly and sparsely distributed rebar meshes are more recognizable than densely or
unevenly distributed ones, due to lower chances of signal interference.
AUTHOR KEYWORDS: Ground penetrating radar; Rebar detection; Convolutional neural
network; AlexNet; Rebar arrangement; Window size

INTRODUCTION
Extracting necessary information about the number, location, and size of embedded
rebar in existing concrete elements is a major task for civil engineers. As a popular Non-
Destructive Testing (NDT) method, Ground Penetrating Radar (GPR) is capable of detecting
rebar and other embedded metallic objects without causing any damage to concrete elements. The
technology has proven to be very efficient in various projects and under different settings
(Kaur et al. 2016; Eisenmann et al. 2017). Based on the propagation principle of electromagnetic
(EM) waves, rebar appears as hyperbolic signatures in GPR data. As a result, extracting
necessary information from, and interpreting, the hyperbolic signatures is a critical step toward
automated detection of rebar. Several approaches have been suggested by researchers and
practitioners to handle this task (Dou et al. 2017; Lee and Mokji 2014; Yuan et al. 2018). One of
the most effective techniques is implementing machine learning algorithms, such as convolutional
neural networks (CNN), support vector machines (SVM), and back-propagation (BP) neural
networks. For applications such as 3D reconstruction of large-scale buildings, efficient rebar
recognition can be achieved with a well-trained neural network.

© ASCE
Computing in Civil Engineering 2019 423

Since images are the most intuitive form of GPR data, CNN, a powerful tool for many image
processing tasks (Chua and Roska 1993), is very suitable for interpreting GPR data. One
major advantage is that CNN does not require extracting any features from the raw data, which
eliminates the extra computing steps necessary for other machine learning
methods (Guyon and Elisseeff 2006). Several studies have been conducted on applications of
CNN for interpreting GPR data. CNN was first implemented by Besaw and Stimac (2015) to
interpret GPR data of buried explosive hazards. The results illustrated that accuracy could
increase by 10% compared to traditional feature extraction approaches. Similarly, Lameri et al.
(2017) employed CNN to detect buried landmines, and their study showed that accuracy
could reach 95% on real GPR data with minimal pre-processing procedures. For
recognizing rebar, Dinh et al. (2018) used CNN to locate and detect rebar in bridge decks.
Image processing methods were applied to obtain high-quality GPR data for the CNN,
and the reported accuracy level was 95.75% or higher. One characteristic of those studies is
that the buried rebar meshes were not densely distributed; in other words, little signal interference
occurred in the GPR data. However, in most existing concrete elements (e.g., columns, shear walls,
slabs), the distribution of rebar is quite dense, and the reflected hyperbolas in GPR patterns
often overlap. As a result, a more efficient method is required to deal with densely distributed
rebar meshes in concrete elements.
As an improved version of CNN, AlexNet showed promising results in image
recognition applications through the ImageNet competition in 2012 (Russakovsky et al. 2015).
This technique has been widely used for recognizing targets and has shown distinctly high
recognition rates. Owing to its deeper layers and several new features, a trained
AlexNet is very robust and can overcome the problem of scattered distribution
of dense rebar meshes. For the first time, this paper applies AlexNet to detect the existence of
rebar in reinforced concrete elements. To illustrate the superiority of AlexNet, a
traditional CNN has been built for comparing recognition rates. This study also evaluates the
impact of window size (used for dividing the entire GPR image into training and testing
segments) on the accuracy of the final results. The proposed method has been tested on three major
structural elements: concrete columns, shear walls, and concrete slabs. The following sections
describe the steps for constructing the proposed AlexNet system for detecting rebar, as well as
the experimental settings used to evaluate the proposed system and the obtained results.

RESEARCH METHODOLOGY
The ultimate goal of this research is to propose a more efficient method for automatically
detecting rebar in concrete elements. To achieve this goal, an AlexNet system has been
constructed. In parallel, a traditional CNN has been built for comparison purposes. Details
of the architecture of the two deep networks, as well as the necessary pre-processing steps
to implement these systems, are described below:

AlexNet Architecture
An AlexNet system consists of 8 layers: 5 convolutional and 3 fully connected
layers (Krizhevsky et al. 2012). Figure 1 depicts the network structure of AlexNet. The
uniform input size of the network is 227×227×3. In layer Conv1, 96
convolution kernels process the input data. Meanwhile, the ReLU activation function is
employed to keep the values in the feature map within a reasonable range. A max-pooling layer
and local response normalization (LRN) are used in Conv1 as well. Layers 2-5 resemble
Conv1, except that Conv3 and Conv4 do not contain pooling layers. After being processed by the
convolutional layers, the resulting feature map is sent to the fully connected layers 6-8. Similar to a
traditional CNN, AlexNet considers the weights as the connections between neurons in
different layers. In order to reduce overfitting, half of the neurons are randomly dropped out. In
addition, a softmax layer is employed for classification. The final
classification is based on the highest probability of the input data belonging to each of the
desired categories.
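The layer-size arithmetic implied by Figure 1 can be checked with a short calculation. The sketch below walks a 227×227×3 input through AlexNet's convolutional stages; the kernel sizes, strides, and padding values are taken from Krizhevsky et al. (2012), not from the text above, so treat them as assumptions about the exact configuration used.

```python
# Trace feature-map sizes through the convolutional part of AlexNet.
# Layer hyperparameters (kernel, stride, padding, channel count) follow
# Krizhevsky et al. (2012); the 227x227x3 input matches the text above.

def out_size(n, kernel, stride, padding):
    """Spatial output size of a conv/pool layer on an n x n input."""
    return (n - kernel + 2 * padding) // stride + 1

def alexnet_feature_maps(n=227):
    layers = [
        ("Conv1", 11, 4, 0, 96),   # 96 kernels, 11x11, stride 4
        ("Pool1", 3, 2, 0, 96),    # overlapping max-pooling
        ("Conv2", 5, 1, 2, 256),
        ("Pool2", 3, 2, 0, 256),
        ("Conv3", 3, 1, 1, 384),   # no pooling after Conv3
        ("Conv4", 3, 1, 1, 384),   # no pooling after Conv4
        ("Conv5", 3, 1, 1, 256),
        ("Pool5", 3, 2, 0, 256),
    ]
    sizes = {}
    for name, k, s, p, channels in layers:
        n = out_size(n, k, s, p)
        sizes[name] = (n, n, channels)
    return sizes

if __name__ == "__main__":
    maps = alexnet_feature_maps()
    for name, shape in maps.items():
        print(name, shape)
    # The flattened Pool5 output feeds the fully connected layers 6-8.
    h, w, c = maps["Pool5"]
    print("FC input length:", h * w * c)
```

With these assumed hyperparameters, Conv1 produces 55×55×96 maps and the final pooled map is 6×6×256, giving a 9216-element vector into the fully connected layers.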

Figure 1. The network structure of AlexNet


As a deep learning method, AlexNet incorporates several new features into CNN. These
additions improve both recognition rate and computation efficiency (Krizhevsky et al. 2012).
Different features of AlexNet, compared to traditional CNN, are summarized in Table 1.

Table 1. Improvement details of AlexNet compared to traditional CNN


Items                 Traditional CNN      AlexNet               Improvement
Activation function   Sigmoid              ReLU                  Avoid gradient diffusion in deep networks
Neurons in use        All of the neurons   A portion of neurons  Reduce overfitting
Pooling layer         Average-pooling      Max-pooling           Retain the significant features
Neuron activity       -                    LRN                   Improve the generalization
Operation mode        CPU                  GPU                   Reduce the computation time
Data size             Original data        Data augmentation     Reduce overfitting

Table 2. Major parameters of the constructed traditional CNN system (TraNet)


Layers              Parameters
Convolution 1       8@3×3; Stride: 1; Padding: 0; ReLU; Batch Normalization
Pooling 2           2×2; Stride: 2
Convolution 3       16@3×3; Stride: 1; Padding: 0; ReLU; Batch Normalization
Pooling 4           2×2; Stride: 2
Convolution 5       32@3×3; Stride: 1; Padding: 0; ReLU; Batch Normalization
Fully Connected 6   Output neurons: 4; Activation function: Softmax

Comparative CNN
As the next step, and in order to compare the performance of AlexNet with a traditional CNN,
another neural network, named TraNet, is constructed. TraNet contains 6 layers: layers
1, 3, and 5 are convolutional layers; layers 2 and 4 are pooling layers; and layer 6 serves as both
the fully connected layer and the output layer. Detailed settings of TraNet are summarized in
Table 2. For comparison purposes, and to achieve more uniform results, some parameters such as
the learning rate, maximum epochs, and weight learning rate factor have been set identically in both
the TraNet and AlexNet systems.
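Since Table 2 fixes TraNet's layer hyperparameters, the feature-map sizes and trainable parameter count for a 28×28×1 input can be verified by hand. The sketch below is a minimal calculation under two assumptions not stated in the paper: pooling output sizes are floored, and batch-normalization parameters are not counted.

```python
# Walk a 28x28x1 input through TraNet (Table 2) and count trainable
# parameters. Conv layers use 3x3 kernels, stride 1, no padding;
# pooling is 2x2 with stride 2. Flooring of pooling output sizes and
# the omission of batch-normalization parameters are simplifying
# assumptions about the implementation.

def tranet_summary(n=28, in_ch=1, classes=4):
    params = 0
    for out_ch in (8, 16, 32):                 # Convolution layers 1, 3, 5
        n = n - 3 + 1                          # 3x3 conv, stride 1, padding 0
        params += (3 * 3 * in_ch + 1) * out_ch # weights + biases
        in_ch = out_ch
        if out_ch != 32:                       # Pooling layers 2 and 4
            n = (n - 2) // 2 + 1               # 2x2 pooling, stride 2
    fc_in = n * n * in_ch                      # flattened feature map
    params += (fc_in + 1) * classes            # fully connected output layer
    return n, fc_in, params

if __name__ == "__main__":
    size, fc_in, params = tranet_summary()
    print("final feature map:", (size, size, 32))
    print("FC input length:", fc_in)
    print("trainable parameters:", params)
```

Under these assumptions the 28×28×1 input shrinks to a 3×3×32 map (288 values) before the softmax output layer, which makes TraNet orders of magnitude smaller than AlexNet.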

Figure 2. The four typical types of input data: a-left, b-peak, c-right, and d-other

Figure 3. Examples of different window sizes: a-120×30; b-150×50; c-200×80; d-250×100

GPR Data Pre-processing


A typical GPR scan contains several hyperbolic signatures reflected by embedded
rebar. In order to enable clear classification, a rectangular window has been applied to split the
whole GPR image into several smaller parts. The main assumption for setting these windows is
that no more than one complete hyperbolic signature should exist in each part. As presented in
Figure 2, there are four typical shapes in the segmented images: left, peak, right, and other. If the
shape in the sub-image is notably left, peak, or right, it is correspondingly classified as 'left',
'peak', or 'right'. Otherwise, if the segmented image does not include any recognizable shape,
it is classified as 'other'. Meanwhile, based on the input requirements of the network, all of these
images are resized to 227×227×3 prior to being used for training or testing.
During the training step, it was found that the size of the dividing window has a significant
impact on recognition rate. To further evaluate this impact, four different window sizes have
been selected: 120×30, 150×50, 200×80, and 250×100 (pixels). The visual effects of different
window sizes are demonstrated in Figure 3.
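The splitting and resizing steps above can be sketched as a sliding crop followed by a resize to the network's input size. The non-overlapping stride and the pure-Python nearest-neighbor resize below are illustrative assumptions; the paper does not state how adjacent windows were stepped, and in practice a library such as OpenCV would handle the resize.

```python
# Split a GPR B-scan (here a nested list of pixel rows) into fixed-size
# windows and resize each window to the 227x227 network input.
# Non-overlapping windows are an assumption; the paper does not state
# the stride used between adjacent windows.

def split_into_windows(image, win_w, win_h):
    """Yield win_h x win_w crops that tile the image without overlap."""
    rows, cols = len(image), len(image[0])
    for top in range(0, rows - win_h + 1, win_h):
        for left in range(0, cols - win_w + 1, win_w):
            yield [row[left:left + win_w] for row in image[top:top + win_h]]

def resize_nearest(window, out_w=227, out_h=227):
    """Nearest-neighbor resize of a nested-list image."""
    in_h, in_w = len(window), len(window[0])
    return [
        [window[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

if __name__ == "__main__":
    # Synthetic 240x400 grayscale "scan"; values stand in for amplitudes.
    scan = [[(r + c) % 256 for c in range(400)] for r in range(240)]
    windows = list(split_into_windows(scan, win_w=200, win_h=80))
    print("windows:", len(windows))
    resized = resize_nearest(windows[0])
    print("resized:", len(resized), "x", len(resized[0]))
```

On the synthetic 240×400 scan, a 200×80 window yields six non-overlapping crops, each stretched to 227×227 before being fed to the network.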

Figure 4. Three case studies used as testbeds: a-Column; b-Shear Wall; c-Suspended Slab

Figure 5. A typical GPR imagery data obtained by scanning a sample concrete element

EXPERIMENTAL SETUP AND CASE STUDY


For demonstrating the efficiency of the proposed AlexNet system, a number of experiments
have been conducted. This section briefly reviews the selected experimental setup as well as the
obtained results.

Experimental Data
Several reinforced concrete elements in a newly renovated building have been selected as
the testbed for this study. Three major building elements have been used as the case studies: one
concrete column with a cross-section of 305 mm×305 mm, one concrete shear wall with a thickness
of 305 mm, and a 203 mm thick concrete slab. Figure 4 depicts the three selected elements and the
corresponding scanning directions. The stirrup rebar, horizontal rebar, and rebar placed in two
directions are chosen as the objects of interest in the column, the shear wall, and the slab,
respectively.
The popular all-in-one ground penetrating system, StructureScan Mini XT, with a central
frequency of 2.7 GHz, has been used to scan the case studies and generate raw data
in the form of 2D images. Figure 5 shows a typical GPR image generated by this machine. In
this project, 48 images have been collected. All these images have been divided into relatively
small parts. 80% of these small parts have been used as training data, and the rest
were considered as test data.
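The 80/20 partition of the window images can be sketched as a shuffled split. The file names and fixed seed below are purely illustrative assumptions; the paper does not describe how the split was randomized.

```python
import random

# Randomly assign window images (here just hypothetical identifiers) to
# training and test sets with an 80/20 split, as described in the text.
# The seed is an illustrative assumption for reproducibility.

def train_test_split(items, train_fraction=0.8, seed=42):
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

if __name__ == "__main__":
    windows = [f"window_{i:04d}.png" for i in range(1000)]
    train, test = train_test_split(windows)
    print(len(train), "training /", len(test), "test")
```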

Results and Discussion


The trained AlexNet and TraNet systems are used to predict the label of
each small part. Table 3 summarizes the recognition rates achieved by the two networks on the
test data. As shown in the table, regardless of the window size, AlexNet generates
more accurate results. For certain window sizes (e.g., 120×30, 150×50, and 250×100),
the accuracy of AlexNet is at least 8% higher than that of TraNet.

Table 3. Testing recognition rate of AlexNet and TraNet


Window Size   TraNet (28×28×1)   TraNet (32×32×1)   TraNet (227×227×1)   AlexNet (227×227×1)
120×30        61.27%             57.82%             33.16%               72.68%
150×50        79.19%             78.73%             52.49%               87.78%
200×80        91.21%             91.31%             76.92%               94.51%
250×100       74.62%             77.69%             73.85%               82.31%

When it comes to different window sizes, the recognition rate is highest for the 200×80
window. This is mainly because this specific window size
contains the diverse parts of the hyperbolas. Considering the cases shown in Figures 3-a and 3-b, if the
selected window is too small, it may not contain enough information about the hyperbola. On the
other hand, if the selected window is too large, it may contain information from more than one
rebar (Figure 3-d).
The accuracy levels for the three elements are plotted separately in Figure 6. To make
the comparison concise and clear, only one TraNet case is shown, with an input
image size of 28×28×1. Analysis of the GPR data of the three elements shows that the
rebar spacing in the column is the largest and most even, resulting in little signal interference from
neighboring rebar. In contrast, there is considerable signal interference in the GPR data of the
slab. As a result, the recognition rate is highest for the column cases and lowest for the slab cases
(Figure 6).


Figure 6. Accuracy of implementing AlexNet and TraNet systems with different window sizes

CONCLUSION AND FUTURE WORK


This paper presented a novel approach for recognizing rebar in GPR data using the notable
convolutional neural network AlexNet. Based on a comprehensive analysis of the network
architecture, AlexNet was trained and tested using GPR data scanned from three concrete
elements (one column, one shear wall, and one suspended slab). Comparisons of detection
accuracy between AlexNet and a traditional CNN have been conducted. Meanwhile, to analyze
the influence of window size on rebar detection, four different window sizes have been set
and compared. Finally, the accuracy for the three elements was discussed as well. In
summary, the following conclusions have been drawn:
- Compared with a traditional CNN, AlexNet achieves higher accuracy in
recognizing rebar in actual constructed facilities.
- Variations in the size of the splitting window can remarkably affect the recognition results.
This effect is more pronounced for the traditional CNN; AlexNet is more robust to changes
in window size.
- Due to lower chances of signal interference from adjacent rebar, elements with
more sparsely distributed rebar are more recognizable by GPR scanners.
In this research, AlexNet has been employed only to detect the existence of rebar. As part of
future research, the authors plan to focus on recognizing the size and depth of rebar with
AlexNet, by considering the relations between the spatial
information of the rebar and the corresponding coordinates in GPR imagery data.

REFERENCES
Besaw, L. E., & Stimac, P. J. (2015, May). Deep convolutional neural networks for classifying
GPR B-scans. In Detection and Sensing of Mines, Explosive Objects, and Obscured Targets
XX (Vol. 9454, p. 945413). International Society for Optics and Photonics.
Chua, L. O., & Roska, T. (1993). The CNN paradigm. IEEE Transactions on Circuits and
Systems I: Fundamental Theory and Applications, 40(3), 147-156.
Dinh, K., Gucunski, N., & Duong, T. H. (2018). An algorithm for automatic localization and
detection of rebars from GPR data of concrete bridge decks. Automation in Construction, 89,
292-298.
Dou, Q., Wei, L., Magee, D. R., & Cohn, A. G. (2017). Real-time hyperbola recognition and
fitting in GPR data. IEEE Transactions on Geoscience and Remote Sensing, 55(1), 51-62.
Eisenmann, D., Margetan, F. J., Chiou, C. P., Ellis, S., Huang, T., & Tan, J. Y. (2017, February).
Effects of position, orientation, and metal loss on GPR signals from structural rebar. In AIP
Conference Proceedings (Vol. 1806, No. 1, p. 080005). AIP Publishing.
Guyon, I., & Elisseeff, A. (2006). An introduction to feature extraction. In Feature
extraction (pp. 1-25). Springer, Berlin, Heidelberg.
Kaur, P., Dana, K. J., Romero, F. A., & Gucunski, N. (2016). Automated GPR rebar analysis for
robotic bridge deck evaluation. IEEE Transactions on Cybernetics, 46(10), 2265-2276.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep
convolutional neural networks. In Advances in Neural Information Processing Systems (pp.
1097-1105).
Lameri, S., Lombardi, F., Bestagini, P., Lualdi, M., & Tubaro, S. (2017, August). Landmine
detection from GPR data using convolutional neural networks. In Signal Processing
Conference (EUSIPCO), 2017 25th European (pp. 508-512). IEEE.
Lee, K. L., & Mokji, M. M. (2014, August). Automatic target detection in GPR images using
Histogram of Oriented Gradients (HOG). In Electronic Design (ICED), 2014 2nd
International Conference on (pp. 181-186). IEEE.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Berg, A. C. (2015).
ImageNet large scale visual recognition challenge. International Journal of Computer Vision,
115(3), 211-252.
Yuan, C., Li, S., Cai, H., & Kamat, V. R. (2018). GPR Signature Detection and Decomposition
for Mapping Buried Utilities with Complex Spatial Configuration. Journal of Computing in
Civil Engineering, 32(4), 04018026.


Key Attributes of Change Agents for Successful Technology Adoptions in Construction Companies: A Thematic Analysis
Afiqah R. Radzi1; Hashim R. Bokhari2; Rahimi A. Rahman, Ph.D.3; and Steven K. Ayer, Ph.D.4
1Faculty of Civil Engineering and Earth Resources, Universiti Malaysia Pahang, Lebuhraya Tun Razak, Gambang, Pahang 26300, Malaysia. E-mail: [email protected]
2School of Sustainable Engineering and the Built Environment, Arizona State Univ., 660 S. College Ave., Tempe, AZ 85281, USA. E-mail: [email protected]
3Faculty of Civil Engineering and Earth Resources, Universiti Malaysia Pahang, Lebuhraya Tun Razak, Gambang, Pahang 26300, Malaysia. E-mail: [email protected]
4School of Sustainable Engineering and the Built Environment, Arizona State Univ., 660 S. College Ave., Tempe, AZ 85281, USA. E-mail: [email protected]

ABSTRACT
The construction industry has been criticized for lower productivity than other sectors, which
may be influenced in part by its historical resistance to change in adopting new technologies.
Therefore, it is common for organizations to have dedicated change agents to support effective
technological adoption. This study aimed to identify the key attributes of change agents that
influence the success of adopting new technologies in construction companies. To achieve this
objective, industry practitioners in the United States provided responses to open-ended questions
and the answers were analyzed to identify common themes. The significant findings include: (1)
change agents’ personality is the most frequently reported attribute that contributes to their
effectiveness, and (2) human-related attributes have a larger role in successful technological
adoptions compared to organizational- and technical-related attributes in change agents. This
paper contributes to the body of knowledge by identifying the frequency of attributes that are
reported to impact the effectiveness of change agents in industry.

INTRODUCTION
The construction industry is widely recognized as an industry that has lower productivity
compared to other sectors in the world (McKinsey Global Institute 2017). Various factors cause
poor labor productivity in the industry. One critical factor may be the comparatively slow pace in
adopting new technology compared to other sectors (CII 2008). The fact that each construction
project is unique regarding budget, schedule, specification, and project team stakeholders can
make adopting new technologies difficult. However, in recent years, more construction
companies have started to introduce and adopt new technologies in their projects. Advances in
technology have many benefits, and one of the most often cited advances is the ability to
enhance overall productivity (Goodrum et al. 2004). Although adopting new technologies is
proven to be beneficial to the industry, there are also barriers to adoption (Rahman
2013). Therefore, identifying approaches to improve the chances of having a successful
technology adoption is crucial.
Most efforts to bring about change in an organization are met with high resistance (Rahman
2013). Thus, it is essential for companies to recruit change agents to successfully lead the effort
of adopting new technologies in construction projects (Lines et al. 2015b). Change agents can be
defined as an individual or a group that is responsible for initiating and managing change in an
organization. Organizations that formally designate change agents to lead the implementation

© ASCE
Computing in Civil Engineering 2019 431

effort encountered less resistance than those organizations that did not formally identify their
change agents (Lines et al. 2015a). Moreover, change agents that are actively involved in the
process reduce resistance from the employees (Lines et al. 2015a). These prior findings illustrate
the impact of change agents and underscore the importance of better understanding the various
attributes that make them effective for supporting the adoption of new technologies in
construction projects.
The objective of this study is to identify the key attributes of change agents that influence the
success of adopting new technologies in the construction industry. To achieve the objective, this
paper addresses two research questions: (a) when implementing new technology, what are
the attributes of change agents that enable successful and sustained technology adoption? and (b)
when implementing new technology, what are the attributes of change agents that hinder
technological adoption? The authors answer these two questions by analyzing the data from the
questionnaire survey given to attendees at the Construction Industry Institute (CII)’s FIATECH
conference. This paper contributes to the body of knowledge by identifying the frequency of
attributes that are reported to impact the effectiveness of change agents in industry. The findings
from this research can help industry practitioners make decisions when selecting a change agent
for a construction project. If leveraged, these results may help project leaders to consider critical
factors that are occasionally overlooked when identifying change agents for technological
adoption.

BACKGROUND

Change Agent
It is critical for organizations to recruit and select people who are capable not only of
initiating change, but also of managing change effectively. A very significant stage of an
organizational change effort is the selection of the individuals who will design and execute the
change successfully. These individuals are often referred to in the literature as 'change agents.' To
successfully implement change, an organization must select the right people as change agents to
help promote change throughout the organization. Buchanan and Boddy (1992) compiled fifteen
critical competencies of change agents that can be categorized into specifying goals, team
building activities, communication skills, negotiation skills, and influencing skills. While prior
studies have suggested critical competencies of change agents in general, the critical
competencies of change agents specific to the construction industry have yet to be identified.

Change Agents in Construction Industry


Adopting a new approach in the procurement, contracting, and management of construction
projects requires significant organizational changes to assist employees who need to learn new
practices while simultaneously disengaging from traditional methods (Migliaccio et al. 2008).
Therefore, construction companies that want to adopt new technology successfully must hire
dedicated individuals to enact new behaviors so that desired change outcomes are achieved.
Construction companies are recommended to formally identify change agents to lead the
transition as part of their work responsibilities, and they should always be ready to assist
employees whenever they have questions about the change effort (Lines et al. 2015b). Also,
extensive change agent involvement is one of the factors that positively affect change readiness
(Lines et al. 2015b). Lines et al. (2017) produced seven steps of change management practices
that industry professionals can take to improve the chances of successful change adoption, and
change agent effectiveness was ranked highest. The study revealed that when
effective change agents were present to manage the change effort, the organization was seven
times more likely to adopt the change. In another study involving electrical contractors, change
agent effectiveness had the strongest association with achieving successful organizational
change adoption (Lines et al. 2018). In other words, although previous studies on the topic of
technology adoption within the construction industry have suggested the importance of
identifying the right change agent, there is still a lack of literature on the key attributes of change
agents. Therefore, this study will help fill the gap.

METHODOLOGY
The data collection involves acquiring data from a questionnaire survey. Both qualitative and
quantitative approaches are used to analyze the collected data. The following subsections discuss
the methods of collecting and analyzing the key attributes.

Data Collection
This study collects information through a questionnaire survey given to attendees at CII’s
FIATECH conference. FIATECH is an innovative technology conference where the international
community of technology experts and stakeholders work together to develop technologies and
practices for adopting the technology (CII 2018). Many of the attendees regularly work with new
and emerging technologies. Therefore, this research purposely selected the FIATECH conference
as the venue for the data collection because the industry practitioners in attendance are likely to
have relevant experience to be able to answer questions about technology adoption experiences
in their respective companies. The authors use open-ended questions because they encourage
participants to contribute as much detailed information as desired.

Data Analysis
The qualitative analysis involves performing thematic analysis to identify the key attributes
of change agents because the approach can assist in making sense of qualitative data (Braun and
Clarke 2006). The thematic analysis was conducted based on the six phases described in Braun
and Clarke (2006). The first phase is becoming familiarized with the data. The authors read, re-
read, transcribed the data, and noted the initial ideas based on all responses. The second phase is
to generate the initial codes. The authors coded for as many potential themes and patterns as
possible from the data. The authors then reviewed, discussed, and agreed on any additions and/or
changes to the coding. The third phase is to search for themes based on the initial codes. During
the process of creating the themes for each question, the authors frequently revisited the codes
from the second phase and the original data from the first phase. The fourth phase is to review
the themes. To ensure saturation of the data, the authors continually reviewed the sub-themes,
defined and refined them, checked whether the themes worked in relation to the coded extracts
and the entire data set, and reviewed the data to search for additional themes. The fifth phase is to define and
name the themes. The authors continually went back and forth between the themes, codes, and
responses to ensure that the themes were true to the independently coded responses. The final
phase (sixth phase) is to report the output of the analysis.
The quantitative analysis of this research involves grouping and counting the initial codes to
identify key attributes. The most frequently mentioned key attributes of the change agents are then
determined. Also, the number of key attributes grouped in each theme of attributes is
computed. Then, the mean and standard deviation of each theme were calculated from the
frequencies and the total number of responses received. For example, responses related to "good
personality" were mentioned 50 times by the 27 respondents. Therefore, after dividing 50 by 27,
the mean for "good personality" is 1.852 mentions per respondent. The standard deviation represents
the variation of the attributes mentioned across the 27 responses. Significant differences between the
means of the key attributes were determined through confidence interval analysis using the Least
Significant Difference (LSD) test with a confidence level of 95%.
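The per-respondent means reported in Table 1 can be reproduced directly from the frequencies and the respondent count (N = 27), as a minimal sketch. Note that the standard deviations and the LSD test require the raw per-respondent counts, which are not published, so only the means are recomputed here.

```python
# Per-respondent mean for each key attribute: total mentions divided by
# the number of respondents (N = 27 usable responses, per the text).
# Frequencies below are the "successful adoption" counts from Table 1.

N_RESPONDENTS = 27

frequencies = {
    "Good personality": 50,
    "Good people skills": 23,
    "Able to lead and manage a team": 10,
    "Great communicator": 6,
    "Able to solve problems": 4,
    "Understand the change process": 9,
    "Understand the business": 3,
    "Subject expert": 4,
}

means = {k: round(v / N_RESPONDENTS, 3) for k, v in frequencies.items()}

if __name__ == "__main__":
    for attribute, mean in means.items():
        print(f"{attribute}: {mean} mentions per respondent")
```

For example, 50 / 27 ≈ 1.852 and 23 / 27 ≈ 0.852, matching the means reported in Table 1.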

RESULTS AND DISCUSSION

Key attributes of change agents for successful and sustained technology adoption
Table 1 shows the frequency, mean, and standard deviation of the key attributes for successful
and sustained technology adoption. "Good personality" is the most frequently mentioned key
attribute. All other key attributes are significantly different from "good
personality" at p<0.05 per the LSD test. According to this result, people in the industry
perceive that change agents with a good personality can contribute to the success of technology
adoptions. A change agent can be considered an individual who leads the change process. Past
research has found that leaders with the right personality factors are associated with increased
employee performance and leadership effectiveness (Li et al. 2015). Also, other findings showed
that a leader's personality can influence the stress level of other employees (Robertson et al. 2014).
Thus, it would be useful to direct more attention toward identifying change agents with good
personalities when selecting an individual to lead the process of adopting new technologies in
construction projects.

Table 1: Frequency, Mean and Standard Deviation for each key attribute
Attributes related to   Percentage from N (N=109)   Key Attributes                   Frequency   Mean     Std. Dev.
Human                   85%                         Good personality                 50          1.852    1.936
                                                    Good people skills               23          0.852*   0.907
                                                    Able to lead and manage a team   10          0.370*   0.565
                                                    Great communicator               6           0.222*   0.424
                                                    Able to solve problems           4           0.148*   0.362
Organizational          11%                         Understand the change process    9           0.333*   0.620
                                                    Understand the business          3           0.111*   0.320
Technical               4%                          Subject expert                   4           0.148*   0.362
*Key attributes that are significantly different from "good personality" at p<0.05 per the LSD test.

Key attributes of change agents for unsuccessful technology adoption


Table 2 shows the frequency, mean, and standard deviation of the attributes for unsuccessful
technology adoption. "Bad personality" is the most frequently mentioned attribute.
All other attributes are significantly different from "bad personality" at p<0.05 per
the LSD test. Industry practitioners perceived that a change agent with a bad personality could
hinder the technology adoption from occurring or being sustained. Research in the field of
change has provided evidence that such negative leadership adversely impacts change
implementation (Higgs 2009). In other words, a change agent can be considered a leader of
change, and bad leadership will negatively affect employees.

Table 2: Frequency, Mean and Standard Deviation for each key attribute
Attributes related to   Percentage from N (N=76)   Key Attributes                          Frequency   Mean      Std. Dev.
Human                   86.8%                      Bad personality                         44          1.630     1.573
                                                   Lack of people skills                   13          0.481**   0.643
                                                   Bad communicator                        8           0.296**   0.669
                                                   Unable to solve problems                1           0.037**   0.192
Organizational          6.6%                       Doesn't understand the change process   3           0.111**   0.320
                                                   Doesn't understand the objectives       2           0.074**   0.267
Technical               6.6%                       Too technical                           3           0.111**   0.320
                                                   Lack of training                        2           0.074**   0.267
**Key attributes that are significantly different from "bad personality" at p<0.05 per the LSD test.

The relationship between other attributes and individual personality


While the other attributes have means significantly lower than the most frequently mentioned
attribute for both successful and unsuccessful technology adoptions, several of these attributes can
be associated with "good personality" and "bad personality" (i.e., individual personalities). For
example, past research found that the relationships between leaders with the right personality
traits and employees' work outcomes were significantly mediated by the leaders' people skills
(Mo et al. 2017). Also, to be able to lead and manage team members, a leader must have a
good personality (Zaccaro et al. 2004). Furthermore, leaders with the right personality traits are most
likely to adopt efficient and effective communication styles in dealing with their subordinates to
achieve predetermined organizational goals (Solaja et al. 2016). Therefore, it is
reasonable to think that personality plays a central role relative to the other attributes that may also
influence the success of change agents.

Human-related attributes
The identified key attributes can be categorized into three types: human-related attributes,
organizational-related attributes, and technical-related attributes. Table 1 shows that 85% of the
key attributes that influence the success of technology adoption are human-related, while 11%
are organizational-related and only 4% are technical-related. Table 2 shows that 86.8% of the
attributes of change agents reported to influence unsuccessful technology adoption are
human-related, while 6.6% are organizational-related and 6.6% are technical-related. Industry
practitioners believe that the non-technical capabilities, rather than the technical capability of
operating the technology itself, will influence the success of adopting new technologies. Past
research suggests that change agents who facilitate an emotional


connection to the change reduce employees’ resistance to adopting new technology (Vos et al.
2018). Therefore, by selecting the right change agent, employees will have the support required
to accept the new technology.

Limitations
Despite achieving its objectives, this study has some limitations. Because grouping the data
into the identified attributes involves interpretation by the researcher, the output may differ
between individuals. Even considering this limitation, however, the results consistently
illustrate that the non-technical attributes of change agents are reported more frequently than the
technical attributes by industry practitioners for both successful and unsuccessful technology
adoption in construction projects. Also, this study only collected questionnaires from attendees
of a specific conference, which may not represent all stakeholders in the industry. However, this
study uses the results only to identify trends among the attributes, not as a means to discredit
any particular group.

CONCLUSIONS
This study analyzed the suggested key attributes of change agents from 27 questionnaire
survey responses by attendees at the FIATECH conference. Quantitative and qualitative
approaches were used to analyze the questionnaire surveys. In conclusion, a change agent’s
personality is the most common attribute suggested by industry practitioners in enabling and
inhibiting the success of adopting new technologies in construction projects. The practical
implication is that companies should assess an individual’s personality before selecting them as a
change agent. Furthermore, it suggests that just because an individual has the technical
competency, it does not necessarily mean that the individual will make for an effective change
agent. Additionally, industry practitioners reported that the non-technical attributes of change
agents play a more important role than their technical attributes. This highlights the need for
stakeholders and practitioners to pay particular attention to the process of selecting change
agents for a construction project.
While the findings of this work may seem intuitive, research suggests that when construction
companies plan for adopting new technologies, greater emphasis is often placed on technical
skills at the expense of non-technical skills (Pant et al.
2008). This highlights the importance of planning for the non-technical attributes required for
effective change agents. In a more practical sense, the findings from this work can also be used
to justify the time and resources necessary to study methods of selecting change agents based on
appropriate non-technological skills. This could lead to a formalized tool for screening potential
change agent candidates to determine whether seemingly qualified individuals are likely to
possess the critical non-technological skills required for success.

ACKNOWLEDGEMENTS
The authors would like to thank the participating FIATECH conference attendees for
providing their insights for this work. Additionally, the authors would like to thank FIATECH’s
Horizon 360 Team and Productivity Advancement Target team leaders who helped to guide this
research.


REFERENCES
Braun, V., and Clarke, V. (2006). Using thematic analysis in psychology. Qualitative research in
psychology, 3(2), 77-101.
Buchanan, D. A., and Boddy, D. (1992). The expertise of the change agent: public performance
and backstage activity. Prentice-Hall.
CII (2008). Leveraging Technology to Improve Construction Productivity. Construction Industry
Institute, New York, NY.
CII (2018). Fiatech. Construction Industry Institute. Accessed October 29th, 2018.
<https://www.construction-institute.org/groups/sector-committees/fiatech>
Goodrum, P. M., and Haas, C. T. (2004). Long-term impact of equipment technology on labor
productivity in the US construction industry at the activity level. Journal of construction
engineering and management, 130(1), 124-133.
Higgs, M. (2009). The good, the bad and the ugly: Leadership and narcissism. Journal of change
management, 9(2), 165-178.
Li, X., Zhou, M., Zhao, N., Zhang, S., and Zhang, J. (2015). Collective‐efficacy as a mediator of
the relationship of leaders' personality traits and team performance: A cross‐level analysis.
International journal of psychology, 50(3), 223-231.
Lines, B. C., Sullivan, K. T., Smithwick, J. B., and Mischung, J. (2015a). Overcoming resistance
to change in engineering and construction: Change management factors for owner
organizations. International journal of project management, 33(5), 1170-1179.
Lines, B. C., Sullivan, K. T., and Wiezel, A. (2015b). Support for organizational change:
Change-readiness outcomes among AEC project teams. Journal of construction engineering
and management, 142(2), 04015062.
Lines, B. C., and Reddy Vardireddy, P. K. (2017). Drivers of Organizational Change within the
AEC Industry: Linking Change Management Practices with Successful Change Adoption.
Journal of management in engineering, 33(6), 04017031.
Lines, B. C., and Smithwick, J. B. (2018). Best practices for organizational change management
within electrical contractors. International journal of construction education and research, 1-
24.
McKinsey Global Institute. (2017). Reinventing construction: A route to higher productivity.
Migliaccio, G. C., Gibson Jr, G. E., and O'Connor, J. T. (2008). Changing project delivery
strategy: An implementation framework. Public Works Management and Policy, 12(3), 483-
502.
Mo, S., and Shi, J. (2017). Linking ethical leadership to employee burnout, workplace deviance
and performance: Testing the mediating roles of trust in leader and surface acting. Journal of
business ethics, 144(2), 293-303.
Pant, I., and Baroudi, B. (2008). Project management education: The human skills imperative.
International journal of project management, 26(2), 124-128.
Rahman, M. M. (2013). Barriers of implementing modern methods of construction. Journal of
management in engineering, 30(1), 69-77.
Robertson, I., Healey, M. P., Hodgkinson, G. P., Flint-Taylor, J., and Jones, F. (2014). Leader
personality and employees’ experience of workplace stressors. Journal of organizational
effectiveness: people and performance, 1(3), 281-295.
Solaja, M. O., Idowu, E. F., and James, E. A. (2016). Exploring the relationship between
leadership communication style, personality trait and organizational productivity. Serbian
journal of management, 11(1), 99-117.

© ASCE
Computing in Civil Engineering 2019 437

Vos, J. F., and Rupert, J. (2018). Change agent's contribution to recipients' resistance to change:
A two-sided story. European management journal, 36(4), 453-462.
Zaccaro, S. J., Kemp, C., and Bader, P. (2004). Leader traits and attributes. The nature of
leadership, 101, 124.


Improved Optimization Model for Finance-Based Scheduling


Ahmed Shiha1 and Ossama Hosny, Ph.D.2
1Dept. of Construction Engineering, American Univ. in Cairo, Parcel 8, 74 S. El-Teseen St., New Cairo, Cairo 11835, Egypt. E-mail: [email protected]
2Dept. of Construction Engineering, American Univ. in Cairo, Parcel 8, 74 S. El-Teseen St., New Cairo, Cairo 11835, Egypt. E-mail: [email protected]

ABSTRACT
Construction contractors often suffer from a lack of accurate management of their projects’ cash
flow. Retainage, advance payments, and interest rates are among several financial terms that
essentially influence the project’s cash flow. Many research efforts in finance-based scheduling
target developing models to aid contractors in cash flow management; however, these efforts
often concentrate only on the employer-contractor payment conditions. Providing reliable
management of the contractor’s cash flow requires addressing the contractor-subcontractor as well
as contractor-supplier agreements as they have a considerable impact on the project’s cash flow.
This paper proposes a user-friendly spreadsheet model that utilizes genetic algorithm to reach a
schedule that minimizes the negative cash flow required by the contractor to finance the project
and maximizes the internal rate of return. The model is not limited in the number of activities or
the number of predecessors per activity, and it is flexible enough to incorporate multiple payment
conditions of different subcontractors as well as suppliers to reflect the real picture of general
contractors’ cash flow. The proposed model is demonstrated through sample models
incorporating different subcontractors’ arrangements. The model accounts for time value of
money parameters, multiple subcontractors and suppliers with different payment agreements,
contractor-employer payment conditions, as well as a certain credit limit among other user
inputs. Compared with other cash flow models, the proposed model provides better support
to contractors in cash flow planning, monitoring, and control through an improved depiction
of the construction industry’s payment relations.

INTRODUCTION
The construction industry is usually characterized by its complexity in terms of payments,
schedules, methods, and a variety of other factors. These complexities are grounds for
complications that might affect the project in terms of its scope, quality, cost, and schedule. In
construction, Contractors usually work under cash-constrained conditions due to the time lag
between spending money on execution and getting payments from Owners as well as different
payment conditions such as retention (Gajpal & Elazouni, 2015). Contractors are often
challenged by providing the needed cash for the timely execution of construction activities,
which is known as construction financing (Elazouni & Abido, 2013). Usually, construction
contractors need financial institutions to help finance the project and these institutions will
impose financial charges on the contractors such as interest rates (Elazouni & Metwally, 2005).
In case of insufficient cash or an increase in these charges, Contractors often adjust the
construction schedules to cover the required costs of the scheduled activities at different times
during the project (Elazouni, 2009). Finance-based scheduling aims to find the most suitable
schedule alternative that optimizes cash flow related objectives. Many research efforts have
proposed models to help contractors schedule their projects while meeting a certain credit limit


(Ali & Elazouni, 2009; Elazouni & Abido, 2013; Elazouni & Metwally, 2005; Gajpal &
Elazouni, 2015; Hosny and Mamdouh 2004; Hegazy & Wassef, 2001). Mainly, these papers
discuss ways to maximize contractors’ profits when a credit limit is available to finance
the project. However, a limited number of researchers have attempted to include the impact of
contractor-subcontractor payment terms on finance-based scheduling. Contractor-subcontractor
payment terms include advance payment paid to subcontractors at the start of his work, time of
interim payments, retention percentage imposed on subcontractor interim payments, and the time
when the subcontractor is repaid this retained amount. This paper proposes a model for finance-
based scheduling from the Contractor’s point of view that includes Contractor-subcontractor
payment terms as well as terms with other parties such as financial institutions and Employers. The
model aims at reaching a schedule that minimizes the total interest paid while meeting a certain
credit limit and maximizing the profit.

METHODOLOGY
Model development
Figure 1 shows the architecture of the model. The model consists of four modules: a user interface
module, a CPM module, a cash flow module, and an optimization module. The user interface gets the
input from the user that includes: activities, predecessors for each activity, Contractor-
subcontractors payment terms, Owner-contractor payment terms, and credit limit. The Critical
Path Method (CPM) module performs the scheduling calculations including Early Start, Early
Finish, Late Start, Late Finish, Total Float, and Free Float for each activity. The cash flow
module estimates the cash-in flow, cash-out flow, and overdraft. The optimization module uses
Genetic algorithms to compare between the different subcontractors and payment options to
reach the project’s schedule that minimizes the contractor’s negative cash flow (overdraft) or
minimizes the total interest paid by the contractor which is the difference between the planned
profit and actual profit.

Figure 1- Model Architecture


To reach a realistic depiction of the construction industry, the model also allows the user to
adjust the subcontractors’ payment terms including advance payments and retention percentages.
Moreover, each activity may be fully subcontracted, not subcontracted, or only partially
subcontracted. Also, for each activity, the user can define three different subcontractors with
different payment terms, and the Genetic algorithm optimization selects the best subcontractor for
each activity that will coincide with the Contractor’s objectives. The optimization model is then


demonstrated using an example project of 15 activities. The model is developed through the
following procedures:

1. Schedule Critical Path Method (CPM) calculations


The user starts by entering the duration for each activity in weeks and then assigning
predecessors for each activity. As shown in Figure 2, the user can input an unlimited
number of predecessors. The model calculates the whole schedule’s logical relationships
based on the user’s input of predecessors. Once the user enters the durations and predecessors
for all the activities, the model, utilizing the CPM method, calculates Early Start, Early Finish,
Late Start, Late Finish, Total Float, and Free Float for all the activities, as shown in Figure 2.

Figure 2- User inputs and schedule calculations
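The forward and backward passes performed by the CPM module can be sketched as follows. This is a minimal illustration under stated assumptions (activities are listed in topological order; the small four-activity network and its durations are hypothetical, loosely echoing the case-study activity names), not the model's actual spreadsheet implementation.

```python
def cpm(activities):
    """activities: {name: (duration, [predecessor names])}, assumed to
    be listed in topological order. Returns {name: (ES, EF, LS, LF,
    total float, free float)} in weeks."""
    order = list(activities)
    succs = {a: [s for s in order if a in activities[s][1]] for a in order}
    es, ef = {}, {}
    for a in order:                              # forward pass
        dur, preds = activities[a]
        es[a] = max((ef[p] for p in preds), default=0)
        ef[a] = es[a] + dur
    end = max(ef.values())
    ls, lf = {}, {}
    for a in reversed(order):                    # backward pass
        lf[a] = min((ls[s] for s in succs[a]), default=end)
        ls[a] = lf[a] - activities[a][0]
    return {a: (es[a], ef[a], ls[a], lf[a], ls[a] - es[a],
                min((es[s] for s in succs[a]), default=end) - ef[a])
            for a in order}

# Hypothetical four-activity network (durations in weeks)
net = {"A": (10, []), "B": (12, []), "D": (9, ["B"]), "O": (7, ["A", "D"])}
print(cpm(net)["D"])  # → (12, 21, 12, 21, 0, 0): ES, EF, LS, LF, TF, FF
```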

2. Contractor-Subcontractor payment terms


The proposed model follows the assumption that subcontractors are paid an advance payment at
the start of the activity; the Contractor then pays the subcontractor weekly/monthly, with the
advance payment deducted from each weekly/monthly payment. The model also allows an amount to
be retained from each payment by the Contractor. The retention imposed by the Contractor on the
subcontractor is returned to the subcontractor in two installments, each half the retained
amount: the first half after a lag (Lag1) from the end of the subcontractor’s activity, and the
second half after a lag (Lag2) from the project’s end. These lags can be adjusted by the user
according to the subcontracted activity’s nature. Each activity can have up to three different
subcontractors, each with five different payment terms that are entered by the user as shown in
Figure 3.

Figure 3- Different subcontractors payment terms for different activities
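The payment stream described above can be sketched as follows. The pro-rata recovery of the advance from each weekly payment is an assumption made for illustration (the paper does not specify the recovery rule), and the function name, signature, and timing values are hypothetical.

```python
def sub_payments(cost, weeks, adv_pct, ret_pct, start, act_end,
                 proj_end, lag1, lag2):
    """Return {week: amount paid to the subcontractor}. The advance is
    paid at the activity start and recovered pro rata from each weekly
    payment (assumed rule); retention is withheld pro rata and repaid
    in two equal halves after Lag1 and Lag2."""
    flows = {start: adv_pct * cost}              # advance payment
    weekly = cost / weeks
    for w in range(start + 1, start + weeks + 1):
        flows[w] = flows.get(w, 0.0) + weekly * (1 - adv_pct - ret_pct)
    retained = ret_pct * cost
    flows[act_end + lag1] = flows.get(act_end + lag1, 0.0) + retained / 2
    flows[proj_end + lag2] = flows.get(proj_end + lag2, 0.0) + retained / 2
    return flows

# Activity A's terms from Table 2: cost 8,318,000, 10% advance,
# 25% retention, Lag1 = 4, Lag2 = 10; the week numbers are illustrative.
f = sub_payments(8_318_000, 10, 0.10, 0.25, 2, 12, 63, 4, 10)
print(round(sum(f.values())))  # → 8318000 (totals the subcontract cost)
```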

3. Cost information and Cash flow calculations


The direct cost of each activity is divided into two main parts: the subcontracted amount and the
amount executed by the Contractor’s own resources. This division was added to realistically depict
actual

construction subcontracting practice, wherein subcontractors may execute certain items but
require the Contractor to provide them with material or equipment to finish those items. The
user shall also enter the indirect cost per week for the project. The amount of each activity’s
direct cost that is executed by the Contractor’s own resources is distributed over the duration of
the activity. The cash flow calculations for the proposed model follow the financial terminology
used by Au & Hendrickson (1986), with some modifications of the equations used by Elazouni &
Gab-Allah (2004) and Tabyang & Benjaoran (2016) to include subcontractor payment terms.

4. Owner-Contractor payment terms


Owner-Contractor payment terms used in the model include markup %, retention %, indirect
cost/week, interest %/year, and advance payment %. The model assumes that an invoice is
submitted at the end of each month and paid to the contractor one month later.
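A single interim payment under such terms can be sketched as follows, assuming (hypothetically) that retention, advance recovery, and taxes are all deducted from the marked-up billed amount; the paper does not specify the deduction order, and the default parameter values echo Table 1.

```python
def owner_payment(work_done, markup=0.08, retention=0.15,
                  advance=0.05, taxes=0.11):
    """One monthly interim payment from the Owner: work billed at cost
    plus markup, with retention, advance recovery, and taxes deducted.
    The deduction order is an assumption for illustration."""
    billed = work_done * (1 + markup)
    return billed * (1 - retention - advance - taxes)

# $1,000,000 of work in a month, received one month after invoicing
print(round(owner_payment(1_000_000)))  # → 745200
```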

5. Optimization using Genetic Algorithm


The proposed model’s output is a detailed construction schedule for each activity and its
impact on the Contractor’s cash flow. The aim for the model is to reach a schedule that satisfies
an allowable credit limit at any time during the project, minimizes the financial costs paid by the
Contractor, and maximizes the Contractor’s profit. Parameters affecting the model’s output
schedule are: selection of subcontractor for each activity, payment terms between Contractor-
subcontractors, payment terms between Owner-Contractor, financial arrangements between
Banks-Contractor, and each activity start and end dates.

Table 1- Payment terms between Owner-Contractor and Bank-Contractor


Financial parameter                                                   Value
Interest rate/year (%)                                                22
Markup (%)                                                            8
Retention (%)                                                         15
Advance payment (%)                                                   5
Taxes (%)                                                             11
Indirect cost/week ($)                                                50,000
Credit limit ($)                                                      14,000,000
Time to submit invoices to Owner (weeks)                              4
Time to receive payment from Owner after invoice submission (weeks)   4
Time lag to receive the retained amount after project’s end (weeks)   8

Objective Function
The objective of the proposed model is to minimize the total interest paid by the Contractor
during the project. The interest paid by the Contractor is the financing cost incurred for the
months during the project in which the total overdraft is negative. At any period during the
project where the net cumulative balance between cash outflow and inflow is negative, the
contractor incurs an extra cost: the interest paid on financing this negative cash flow. Hence,
the objective function of the proposed model is to minimize the summation of all interest or
financing costs paid during all periods where the net cumulative cash balance is negative.
Minimizing the total financing cost will maximize the actual profit as a result.


Table 2- Selected subcontractors’ payment terms for each activity by the Genetic algorithm
optimization

Activity  Selected subcontractor index  Sub. Cost  Retention %  Adv. Payment %  Lag 1  Lag 2
A         1                             8318000    25%          10%             4      10
B         2                             7600000    15%          25%             4      12
C         1                             7658000    15%          32%             6      13
D         2                             11850000   20%          40%             7      9
E         1                             8000000    18%          22%             8      10
F         1                             8961000    5%           30%             4      7
G         1                             5064000    15%          30%             6      8
H         3                             12018733   25%          20%             8      7
I         1                             5951000    5%           20%             8      9
J         2                             11830000   5%           32%             4      16
K         1                             9338000    15%          25%             6      10
L         3                             12278482   15%          18%             6      12
M         1                             11806000   25%          15%             8      10
N         3                             3127279    25%          15%             7      8
O         2                             1850000    10%          25%             6      7

Decision Variables
The independent variables for the proposed model are the subcontractor selected for each
activity and the shift by which each activity may be moved. Each subcontractor has an index of
1, 2, or 3. The first set of variables comprises the indices of the selected subcontractors for
each activity. The second set of variables comprises the shifts: the time by which activities can
be shifted without affecting the project duration or the succeeding activities.

Constraints
The optimization problem of minimizing the financing cost of the project and maximizing the
profit through subcontractor selection and shifting of activities within their free floats has the
following constraints: the indices for subcontractors are integers ranging from 1 to 3; the shift
for each activity is a non-negative integer; the shift for each activity is less than or equal to
that activity’s free float; and the maximum overdraft at any time is less than or equal to the
available credit limit.
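The constrained search described above can be sketched with a toy genetic algorithm. The fitness function below is only a placeholder for the full cash-flow evaluation; the free-float bounds are taken from Table 3, and the operator choices (elitism, one-point crossover, single-gene mutation) are illustrative, not Evolver's exact mechanics.

```python
import random

FREE_FLOAT = [3, 0, 0, 1, 0, 2, 0, 0, 8, 0, 0, 5, 0, 9, 0]  # from Table 3
N = len(FREE_FLOAT)

def random_chromosome():
    """Genes 0..N-1: subcontractor index (1-3); genes N..2N-1: shift."""
    return ([random.randint(1, 3) for _ in range(N)]
            + [random.randint(0, ff) for ff in FREE_FLOAT])

def fitness(chrom):
    # Placeholder objective: the real model would evaluate the total
    # interest paid, plus a penalty when the maximum overdraft exceeds
    # the credit limit. Lower is better.
    return sum(chrom)

def evolve(pop_size=100, generations=200, cx=0.9, mut=0.1):
    pop = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        nxt = pop[:2]                          # elitism: keep two best
        while len(nxt) < pop_size:
            a, b = random.sample(pop[:20], 2)  # select among the fittest
            if random.random() < cx:           # one-point crossover
                cut = random.randint(1, 2 * N - 1)
                child = a[:cut] + b[cut:]
            else:
                child = a[:]
            if random.random() < mut:          # mutate one gene in range
                g = random.randrange(2 * N)
                child[g] = (random.randint(1, 3) if g < N
                            else random.randint(0, FREE_FLOAT[g - N]))
            nxt.append(child)
        pop = nxt
    return min(pop, key=fitness)

best = evolve()
print(fitness(best))
```

Because crossover swaps genes position-for-position and mutation respects each gene's bounds, every chromosome remains feasible with respect to the index and free-float constraints.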

CASE STUDY
An example project of 15 activities that was used by Ali & Elazouni (2009) is applied to the
proposed model. The user inputs the duration for each activity in weeks. The total project
duration resulting from the model’s CPM calculations is 63 weeks.
As mentioned, each of the three subcontractors is defined by the model with indices from 1
to 3. The variables for optimization are subcontractors’ indices from 1 to 3 for each of the 15
activities. In addition, the shift for each of the 15 activities within its free float is another set of
variables. Owner-Contractor and Bank-Contractor payment terms are to be set by the user. As


shown in Table 1, markup=8%, advance payment = 5%, Retention = 15%, and Interest
rate/year = 22% are used in the case study. Initialization was done by selecting subcontractor
index 1 and shift = 0 for each of the 15 activities. After initialization, the total interest to be paid
by the Contractor was $2,865,783, the maximum overdraft during the project was $14,011,025,
and the profit was $17,241,958. Then, the optimization model was applied using Evolver. The
population size is 100, the crossover rate is 0.9, and the mutation rate is 0.1. The stopping
criterion is 100,000 trials.

Table 3- Results of the schedule after optimization

                Before Optimization                After Optimization
Activity  Start  Finish  Free Float  Shift  Shifted Start  Shifted Finish
A         0      10      3           2      2              12
B         0      12      0           0      0              12
C         0      13      0           0      0              13
D         12     21      1           1      13             22
E         12     22      0           0      12             22
F         13     20      2           0      13             20
G         22     30      0           0      22             30
H         22     29      0           0      22             29
I         22     31      8           7      29             38
J         30     46      0           0      30             46
K         29     39      0           0      29             39
L         39     51      5           0      39             51
M         46     56      0           0      46             56
N         39     47      9           6      45             53
O         56     63      0           0      56             63

Figure 4- Cash flow after optimization


Table 2 shows the results of the first set of variables from the optimization, where each activity
is assigned a subcontractor index. From the database defined by the user, the selected index for
each activity determines the payment terms of that activity, which are reflected in the cash-out
calculations. Table 3 shows the results of the second set of variables, which are the shifts of
activities within their free floats. For example, as shown in Tables 2 and 3, Activity A will be
subcontracted to subcontractor 1 with a cost of $8,318,000, retention of 25%, advance payment
of 10%, Lag1 of 4 weeks, and Lag2 of 10 weeks, and it will be shifted 2 weeks within its free float of 3


weeks. Cash flow calculation results after optimization show the following: the total interest
paid was reduced to $2,159,520, the profit increased to $19,570,684, and the maximum overdraft
during the project was reduced to $13,784,227, which is below the credit limit of $14,000,000
mentioned in Table 1. The cash flow profile after optimization is shown in Figure 4.

CONCLUSION
In conclusion, the proposed model shows potential in depicting the payment relations among the
construction industry’s different parties. The model fills a gap in finance-based scheduling
research by including the impact of subcontracting, subcontractor selection, and subcontractors’
different payment terms with contractors on the contractors’ cash flow. As illustrated through
the case study, the optimization increases the contractor’s profit from $17,241,958 to
$19,570,684. Utilizing a Genetic Algorithm, the proposed model supports the decision maker in
selecting the best schedule and the appropriate subcontractor for each activity to minimize the
total interest paid during the project life, which maximizes the total profit while meeting a
certain credit limit.

REFERENCES

Ali, M. M., & Elazouni, A. (2009). Finance‐based CPM/LOB scheduling of projects with
repetitive non‐serial activities. Construction Management and Economics, 27(9), 839–856.
https://doi.org/10.1080/01446190903191764
Au, T., & Hendrickson, C. (1986). Profit Measures for Construction Projects. Journal of
Construction Engineering and Management, 112(2), 273–286.
https://doi.org/10.1061/(ASCE)0733-9364(1986)112:2(273)
Elazouni, A., & Abido, M. A. (2013). Contractor-finance decision-making tool using multi-
objective optimization. Canadian Journal of Civil Engineering, 40(10), 961–971.
https://doi.org/10.1139/cjce-2013-0086
Elazouni, A. M., & Gab-Allah, A. A. (2004). Finance-Based Scheduling of Construction Projects
Using Integer Programming. Journal of Construction Engineering and Management, 130(1),
15–24. https://doi.org/10.1061/(ASCE)0733-9364(2004)130:1(15)
Elazouni, A. M., & Metwally, F. G. (2005). Finance-Based Scheduling: Tool to Maximize
Project Profit Using Improved Genetic Algorithms. Journal of Construction Engineering and
Management, 131(4), 400–412. https://doi.org/10.1061/(ASCE)0733-9364(2005)131:4(400)
Gajpal, Y., & Elazouni, A. (2015). Enhanced heuristic for finance-based scheduling of
construction projects. Construction Management and Economics, 33(7), 531–553.
https://doi.org/10.1080/01446193.2015.1063676
Hegazy, T., & Wassef, N. (2001). Cost Optimization in Projects with Repetitive Nonserial
Activities. Journal of Construction Engineering and Management, 127(3), 183–191.
https://doi.org/10.1061/(ASCE)0733-9364(2001)127:3(183)
Hosny, O., and Mamdouh, D. (2004). Decision support system for construction companies cash-
flow forecasting. International Conference on Future Vision and Challenges for Urban
Development, Cairo, Egypt.
Tabyang, W., & Benjaoran, V. (2016). Modified finance-based scheduling model with variable
contractor-to-subcontractor payment arrangement. KSCE Journal of Civil Engineering,
20(5), 1621–1630. https://doi.org/10.1007/s12205-015-0581-z


Identifying a Ranking Method for Assessing the Potential Risk of Knee Musculoskeletal
Disorders among Roofers in Shingle Installation
Amrita Dutta1; Scott P. Breloff2; Fei Dai3; Erik W. Sinsel4; Christopher M. Warren5;
and John Z. Wu6
1Graduate Research Assistant, Dept. of Civil and Environmental Engineering, West Virginia Univ., PO Box 6103, Morgantown, WV 26506. E-mail: [email protected]
2Biomedical Research Engineer, National Institute for Occupational Safety and Health, 1095 Willowdale Rd., Morgantown, WV 26505. E-mail: [email protected]
3Associate Professor, Dept. of Civil and Environmental Engineering, West Virginia Univ., PO Box 6103, Morgantown, WV 26506. E-mail: [email protected]
4Computer Scientist, National Institute for Occupational Safety and Health, 1095 Willowdale Rd., Morgantown, WV 26505. E-mail: [email protected]
5Mechanical Engineer, National Institute for Occupational Safety and Health, 1095 Willowdale Rd., Morgantown, WV 26505. E-mail: [email protected]
6Senior Research Biomechanical Engineer, National Institute for Occupational Safety and Health, 1095 Willowdale Rd., Morgantown, WV 26505. E-mail: [email protected]

ABSTRACT
The objective of this study is to identify a ranking method for assessing the potential risk of
knee musculoskeletal disorders (MSD) among construction roofers. On a slope-adjustable
wooden platform, nine subjects performed the shingle installation, comprising seven phases: 1)
reaching for shingles, 2) placing shingles, 3) grasping the nail gun, 4) moving to the first nailing
position, 5) nailing shingles, 6) replacing the nail gun, and 7) returning to an upright position. Knee
flexion, abduction, adduction, internal, and external rotational angles were measured using an
optical motion analysis system. To analyze the relative level of risk at each phase, these angles
were combined using multiplication and aggregation-based scoring models that generated ranks
of phases at different roof slopes. The ranking results provide useful information for identifying
the postures that might pose greater MSD risk, and may facilitate the development of effective
interventions to reduce extreme knee positions, which are an MSD risk factor.

INTRODUCTION
Awkward kneeling posture is a common source of knee musculoskeletal disorders (MSD)
among construction roofers. However, the risk assessment of roofers’ knee MSD for a specific
roofing task that involves awkward kneeling is still missing. Shingle installation, a common
repetitive and awkward task performed by the residential roofers, is a cause of work-related
MSD among roofers in the construction community (Everett 1999). In a slanted roof setting,
roofers are restricted to various awkward postures such as crawling, stooping and kneeling for
more than 75% of their total working time. These awkward postures and repetitive motions are
considered to be major contributing factors of MSD. As roofers encounter both of these factors,
there is a high incidence rate of MSD injuries among roofers (Wang et al. 2015). During shingle
installation, roofers adopt awkward postures when their knees undergo a significant amount of
rotation beyond their tolerance limit. Awkward knee postures and repetitive motions have been
shown to be associated with knee MSD (Hofer et al. 2011). However, identifying the individual
phase of a shingle installation operation during which roofers experience the most awkward


knee rotations has not yet been explored. Ranking the phases based on the awkward postures that
expose the knees to potential MSD risk would be useful for developing effective interventions.

BACKGROUND
Ergonomics practice for occupational kneeling: Although roofers have a high MSD injury
incidence rate, there are very few ergonomic guidelines to protect them. Some of the existing
guidelines suggest using mechanical devices during roofing and knee pads while kneeling.
General ergonomic practices to minimize the risk of MSD, promoted by safety and health
organizations, such as the Occupational Safety and Health Administration (OSHA) and the
National Institute for Occupational Safety and Health (NIOSH), are not specifically related to
knee injury prevention for roofers working on a sloped surface. Most of these guidelines are
focused on using knee protective measures to reduce stress on the knee during construction work
on level surfaces (Albers and Estill 2007; OSHA 2018). There is still a lack of a detailed task-
specific risk analysis that identifies roofers’ riskier postures on sloped surfaces and suggests
effective interventions.
Ergonomics research on MSD among roofers: Very few MSD risk assessment studies
have previously been done for construction roofers. Choi and Fredericks (2008) investigated the
impact of surface slope on roofers’ shingling frequencies. Wang et al. (2017) assessed different
work-related risk factors — roof slope, working technique, and working pace, during shingle
installation for low back disorders among roofers. However, lower extremities were not
systematically assessed. Breloff et al. (2019) examined the lower extremity kinematics of roofers
and their associations to MSD while traverse walking across a sloped roof surface. But the
potential MSD risk exposure of the knees while kneeling on a sloped roof surface was not
explored.

PROBLEM STATEMENT AND RESEARCH OBJECTIVE


A detailed ranking of the shingle installation phases and their relation to prospective knee
MSD risk is missing in the literature. Such knowledge is important to promote knee interventions
that can minimize the awkward knee rotations and prevent knee MSD among roofers. Therefore,
the objective of this study is to identify a ranking method for evaluating the phases of a sloped
shingle installation process with respect to awkward kneeling posture. A phase which places the
roofer in a more awkward posture will be considered to be a greater risk for the development of
knee MSD.

METHODOLOGY
Risk indicators: Frequent and high contact stress at the knee joint is associated with knee
osteoarthritis and damage to the articular cartilage of the knee joint — two common forms of knee
disorders. A previous study showed that, as knee flexion increased from 15.5° during
walking to 90° during squatting, the contact stress in the knee joint increased significantly, by
over 80% (Thambyah et al. 2005). Knee flexion beyond 90° generates larger moments and forces,
which result in high stress in the knee joint (Nagura et al. 2002). These findings indicate a strong
association between knee rotational angle and knee joint contact stress which relates to knee
MSD. Therefore, in this study, potential knee MSD risk is defined as an increase in knee
rotations that creates high contact stress in the knee joint. Knee MSD risk is considered to
increase as the knee rotational angles encounter larger awkward postures. Awkward posture is
considered here as a deeply flexed knee posture (>90°) with medial and lateral rotation, leading to
an increased amount of stress in the knee joint. To assess the relative level of risk in each phase,
five knee rotational angles — flexion, abduction, adduction, internal and external rotation —
were measured. Three metrics — maximum, cumulative, and average of each knee rotational
angle — were considered, resulting in a total of fifteen (15) risk indicators. Maximum value of
the knee angles may reflect the risk of forceful exertion of knee during placing and installing
shingles which is a risk factor of MSD. Cumulative angle can represent the risk due to prolonged
kneeling, while the average knee angles can account for the extent of knee repetitive motions. In
this study, the relative level of risk among the phases was compared using these fifteen (15) risk
indicators. To determine the relative level of knee MSD risk, these risk indicators were combined
to produce a risk score for each phase. The phase with the highest score was deemed to yield the
largest potential risk for the development of knee MSD. The seven phases were ranked based on
the scores computed.
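As an illustrative sketch (the data layout and function name are assumptions, not from the paper), the fifteen indicators could be computed from each phase's angle time series as follows:

```python
def risk_indicators(angles):
    """Compute the 15 risk indicators for one shingling phase.

    angles: dict mapping each of the five knee rotational angles
    (flexion, abduction, adduction, internal and external rotation)
    to a list of angle samples (degrees) recorded during the phase.
    Returns {(angle_name, metric): value} — 3 metrics x 5 angles = 15.
    """
    indicators = {}
    for name, series in angles.items():
        indicators[(name, 'maximum')] = max(series)                 # forceful exertion
        indicators[(name, 'cumulative')] = sum(series)              # prolonged kneeling
        indicators[(name, 'average')] = sum(series) / len(series)   # repetitive motion
    return indicators
```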
Risk analysis framework: A comparative risk analysis of the seven phases was performed
according to the following steps, depicted in Figure 1.
Step 1: Descriptive statistics (as risk indicators) calculation: The maximum, cumulative,
and average knee angles were computed for each phase during each trial, and for each slope,
using the knee kinematics data. The resulting maximum, cumulative, and average knee angles
were then averaged for each phase. The phase averaged data were then used as risk indicators to
compute the risk score for each phase for comparative risk analysis.
Step 2: Scoring and ranking the phases: The seven shingling phases were scored and
ranked to compare their relative risks and identify the phase with the highest potential for knee
MSD risk based on the knee exposure to the highest amount of rotation. To combine the risk
indicators and compute the score for each shingling phase, two distinct scoring models were
used: 1) aggregation-based scoring model, and 2) multiplication scoring model. Three different
methodologies utilizing the aggregation-based scoring model were applied, which are explained
in the next subsection. These models were previously employed in studies, such as decision
making, university ranking, risk assessment and construction project management (El-Sayegh
and Mansour 2015; Odeh and Battaineh 2002), and were found useful to generate ranks from
multiple criteria. Since this study involved multiple knee injury risk indicators — phase averaged
knee rotation angles, these models were applicable for this situation. The phase with the highest
aggregated score was considered to display the most awkward knee posture and therefore
identified as the phase with the highest potential knee MSD risk and was ranked first.
(2a) Aggregation-based scoring model: The approach to construct a rank from multiple
indicators using the aggregation-based model included three steps: (a) normalizing the data, (b)
assigning weights to the indicators, and (c) aggregating the weighted values to produce an overall
score. For step (a), three separate normalization approaches were tested: i) dividing the indicator
values of each phase by the maximum among the phases, ii) dividing the indicator values of each
phase by the sum across the phases, and iii) range normalization where each indicator value was
scaled to fall within range [0,1] with respect to the maximum and minimum indicator values
among the phases. Because the current literature lacks knowledge of the relative contribution of
each knee rotational angle to knee MSD, and because there are biomechanical grounds for each of
these awkward rotations affecting the knee, equal weights were assigned to all indicators in step (b). In step (c),
for each normalization approach the weighted scores of all indicators were then combined to
generate an overall score for each phase. The phases were then ranked in the descending order of
scores.
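A minimal sketch of the three normalization approaches and the equal-weight aggregation (function names and data layout are illustrative):

```python
def normalize(values, method):
    """Normalize one risk indicator across the seven phases.

    values: the indicator's value for each phase.
    method: 'max' (divide by the maximum), 'sum' (divide by the sum),
            or 'range' (scale to [0, 1] using the min and max).
    """
    if method == 'max':
        m = max(values)
        return [v / m for v in values]
    if method == 'sum':
        s = sum(values)
        return [v / s for v in values]
    if method == 'range':
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]
    raise ValueError(method)

def phase_scores(indicator_columns, method):
    """Aggregate with equal weights: the score of a phase is the sum of
    its normalized indicator values.

    indicator_columns: one list per indicator, each giving that
    indicator's value for every phase.
    """
    normalized = [normalize(col, method) for col in indicator_columns]
    n_phases = len(indicator_columns[0])
    return [sum(col[p] for col in normalized) for p in range(n_phases)]
```

Phases would then be ranked in descending order of the returned scores.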


Figure 1. Risk analysis framework

Figure 2. Roof platform

Figure 3. Marker set up in lower extremity


(2b) Multiplication scoring model: This study also used a multiplication based scoring
model proposed by Tofallis (2014) to compute the phase scores. Using the phase averaged knee
rotational angles as indicators, the multiplication score of a given phase was computed using the
following equation:
Multiplication score = X1^w1 × X2^w2 × X3^w3 × … × Xn^wn
where Xi (i = 1, 2, …, n) are the risk indicators and wi (i = 1, 2, …, n) are the weights assigned to each
indicator. In our study, all wi were set to 1 and n = 15, representing the 15 risk indicators.
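A minimal sketch of this scoring model (the function name is illustrative; per the study, all exponents are 1):

```python
def multiplication_score(indicators, weights=None):
    """Multiplicative score of one phase (Tofallis 2014).

    indicators: the phase-averaged risk indicators X_i (positive values).
    weights: exponents w_i; defaults to 1 for every indicator, as in
    this study.
    """
    if weights is None:
        weights = [1.0] * len(indicators)
    score = 1.0
    for x, w in zip(indicators, weights):
        score *= x ** w
    return score
```

With equal exponents, rescaling any single indicator multiplies every phase's score by the same constant, so the induced ranking is unchanged — the scale-invariance property that motivates this model.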
Step 3: Comparative analysis of the consistencies of the scoring models. To ensure the
generated ranks were consistent across the roof slopes, the Spearman correlation coefficient (r)
(Rosso 1997) was used. The Spearman’s test provided the association of the phase ranks between
each pair of slopes generated by the two distinct scoring models (four total ranks: three from the
aggregation-based scoring model and one from the multiplication scoring model). The
Spearman’s rank correlation coefficient is a statistical tool to test the strength of association
between the ranks of two groups. An r value close to 1 indicates a strong association between
the two ranks.
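For tie-free rankings, the coefficient reduces to the classical formula r = 1 − 6Σd²/[n(n² − 1)]; the sketch below applies it to the left-knee multiplication-model ranks at 0° and 15° reported in Table 1:

```python
def spearman_r(rank_a, rank_b):
    """Spearman rank correlation for two rankings with no ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Phase ranks (1 = riskiest) from Table 1, multiplication model, left knee.
left_0deg = [3, 1, 7, 6, 2, 5, 4]
left_15deg = [4, 1, 6, 7, 2, 5, 3]
print(round(spearman_r(left_0deg, left_15deg), 3))  # → 0.929
```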

Table 1. Scores and ranks of the left knee


Slope   Phase   Multiplication      Divide by max     Range normalization   Divide by sum
                Score        Rank   Score    Rank     Score    Rank         Score    Rank
0°      1       1.79×10^36   3      10.78    4        6.62     5            2.09     3
0°      2       5.67×10^37   1      13.04    1        7.99     2            2.83     1
0°      3       1.99×10^34   7      9.36     6        4.67     6            1.68     6
0°      4       4.12×10^34   6      9.00     7        3.24     7            1.66     7
0°      5       4.07×10^37   2      12.72    2        8.71     1            2.61     2
0°      6       8.35×10^35   5      10.96    3        6.63     4            2.06     4
0°      7       1.06×10^36   4      10.70    5        7.55     3            2.03     5
15°     1       2.07×10^36   4      10.47    4        7.28     2            2.04     4
15°     2       1.92×10^38   1      13.26    1        9.52     1            2.95     1
15°     3       1.59×10^35   6      9.84     6        6.71     3            1.83     6
15°     4       1.57×10^35   7      9.26     7        4.56     7            1.75     7
15°     5       1.56×10^37   2      11.37    2        5.89     5            2.39     2
15°     6       5.59×10^35   5      10.03    5        4.57     6            1.94     5
15°     7       3.09×10^36   3      10.52    3        6.60     4            2.08     3
30°     1       5.55×10^36   4      10.73    4        7.84     3            2.08     3
30°     2       7.22×10^38   1      13.46    1        9.60     1            3.11     1
30°     3       9.92×10^34   7      9.27     7        3.28     7            1.68     7
30°     4       2.90×10^36   5      10.44    5        6.06     5            1.98     5
30°     5       6.72×10^37   2      11.63    2        6.95     4            2.44     2
30°     6       2.30×10^35   6      9.31     6        3.47     6            1.72     6
30°     7       5.93×10^36   3      10.79    3        8.05     2            2.07     4

EXPERIMENTAL SETUP AND PROCEDURE


Nine male volunteers [26.1 years (±5.6 years), 180.2 cm (±6.1 cm), and 99.7 kg (±27.6 kg)]
with no history of MSD simulated shingle installation on a 1.2 × 1.6 m custom-made adjustable
wood platform which was used as a rooftop (Figure 2). The research protocol was approved by
both the Institutional Review Board (IRB) of NIOSH and West Virginia University. A VICON
optical motion capture system with 14 MX VICON cameras (Oxford, U.K.) collected the lower
extremity kinematic data (3D coordinate points) from 42 retroreflective motion capture markers
placed bilaterally on the participant’s shanks, thighs, feet and hip joints (Figure 3). These
coordinate points were used to calculate the knee rotational angles. The experiment was
performed in the NIOSH biomechanics laboratory. Each participant performed five trials of the
task on the roof simulator at three slope angles: 0°, 15°, and 30°.

DATA PROCESSING
From the calibrated origin (x, y, z positions) of each marker, the local coordinates of thigh
and shank were calculated and then transformed to a 3D (XYZ) coordinate system. Combining
these local coordinate systems, a rotation transformation matrix was constructed to compute the
five knee rotational angles according to the equations provided by Robertson et al. (2013). All
kinematic data were processed in Visual 3D (C-Motion, Inc., Germantown, MD).

RESULTS
Tables 1 and 2 present the scores and ranks computed by multiplication and aggregation-
based scoring models. Risks in both knees are presented separately.
Table 2. Scores and ranks of the right knee
Slope   Phase   Multiplication      Divide by max     Range normalization   Divide by sum
                Score        Rank   Score    Rank     Score    Rank         Score    Rank
0°      1       4.76×10^37   3      11.58    2        9.44     2            2.37     2
0°      2       1.07×10^39   1      13.26    1        10.91    1            3.07     1
0°      3       4.17×10^34   7      8.47     7        4.19     6            1.58     7
0°      4       1.06×10^36   5      9.42     5        5.79     4            1.84     5
0°      5       4.86×10^37   2      10.86    3        5.39     5            2.34     3
0°      6       3.53×10^35   6      8.84     6        3.51     7            1.71     6
0°      7       6.52×10^36   4      10.33    4        6.98     3            2.07     4
15°     1       1.70×10^38   2      12.22    2        10.28    1            2.48     2
15°     2       7.41×10^38   1      12.94    1        9.76     3            2.89     1
15°     3       1.28×10^35   7      8.66     7        3.86     5            1.62     7
15°     4       3.60×10^35   6      8.72     6        3.43     6            1.67     6
15°     5       4.11×10^37   4      10.64    4        5.21     4            2.25     4
15°     6       5.24×10^35   5      8.88     5        3.07     7            1.71     5
15°     7       1.08×10^38   3      11.75    3        10.22    2            2.37     3
30°     1       5.75×10^37   3      11.45    3        10.15    1            2.27     3
30°     2       5.29×10^38   1      12.57    1        7.60     3            2.93     1
30°     3       6.46×10^34   7      8.32     7        2.34     7            1.55     7
30°     4       1.08×10^36   6      9.13     6        3.66     6            1.77     6
30°     5       1.57×10^38   2      11.58    2        7.02     4            2.46     2
30°     6       1.68×10^36   5      9.87     5        5.38     5            1.89     5
30°     7       1.50×10^37   4      10.73    4        8.58     2            2.10     4

Spearman correlation test result: The Spearman correlation test results presented in Table
3 demonstrate that most of the strongest associations of ranks between each slope pair were
obtained by the multiplication scoring model [0°-15° (left) with r = 0.929; 0°-30° (left and right)
with r = 0.929 and 0.964, respectively; 15°-30° (left and right) with r = 0.893]. Considering these
strong associations, the results obtained from this model were used for subsequent risk
analysis.


Table 3. Spearman correlation test result showing association of ranks across different slopes

            Multiplication     Divide by max      Range normalization   Divide by sum
Slope       Left     Right     Left     Right     Left     Right        Left     Right
0°-15°      0.929    0.857     0.857    0.929     0.286    0.786        0.893    0.929
15°-30°     0.893    0.893     0.893    0.893     0.536    0.857        0.857    0.893
0°-30°      0.929    0.964     0.679    0.929     0.571    0.714        0.821    0.929

Risk analysis on left and right knee: From Tables 1 and 2, phase 2 was ranked first at all
three slopes for both knees. Phase 5 was ranked second at all three slopes for the left knee and at
the 0° and 30° slopes for the right knee, which indicates that, overall, placing shingles was the
riskiest phase, followed by the nailing shingles phase. The next riskiest phases were phase 1
(ranked third at 0° for both knees and second at 15° for the right knee) and phase 7 (ranked third
at 15° and 30° for the left knee and at 15° for the right knee). Based on the remaining ranks, the
least risky phases were phases 4, 6, and 3.

DISCUSSION
A comparative ranking-based risk analysis of the phases that roofers undergo during a
shingle installation operation on a sloped surface was performed in this study. Based on the five
knee rotational angles, risk scores were generated for each phase using two distinct scoring
models. The multiplication scoring model generally performs better in ranking because it does
not require normalization of the indicators, even when some indicators are numerically much
greater than others; rescaling any indicator has no impact on the ranking result under this model.
Therefore, although the cumulative angles were much higher than the maximum and average
angles, the cumulative angles could not skew the ranks.
The correlation analysis also demonstrated higher consistency of the phase ranks across
different roof slopes when computed by the multiplication scoring model, which was therefore
considered more appropriate for rank generation. The consistency of the ranks across different
slopes was used to evaluate the performance of the scoring models because the objective of this
study was to identify the riskiest phases — those yielding the most awkward knee rotations at any
roof setting and hence potentially critical for knee MSD. Although roofers might need to maintain
different postures at different roof slopes, which could yield different knee rotational angles, the
ranking results at different slopes demonstrated a broadly similar risk pattern. A possible reason
is that, at different slopes, an individual knee rotation (e.g., flexion) might vary, but this variation
did not affect the overall risk pattern of the phases. However, further assessment is necessary to
confirm this causal relationship.
For both knees, at all slopes, the phase with the highest potential knee MSD risk was placing
shingles. Except for the right knee at the 15° slope, the next riskiest phase for both knees at all other
slopes was nailing shingles. A possible reason is that, compared to other phases, the participants
were more repetitively changing their knee angles encountering larger awkward posture during
placing and nailing shingles on sloped roof surfaces for a longer duration. The cumulative effect
of high repetition along with the awkward posture may induce additional stress and force on the
knee joint ligaments and accelerate knee osteoarthritis among roofers. Interventions or strategies
to minimize the extent of knee rotations during placing and nailing shingles may reduce knee
MSD among roofers. However, further assessment is needed to identify which knee rotation
contributes the most to the knee MSD so that proper interventions can be developed to minimize
the extreme rotations commonly associated with knee MSD development.

CONCLUSION AND FUTURE WORK


This study identified a ranking-based method for assessing the potential risk of knee MSD
among roofers during shingle installation. The level of risk at different shingling phases was
compared based on the risk scores generated by two distinct scoring models. The Spearman
correlation test exhibited better consistency for the multiplication scoring model. Based on
the fifteen risk indicators, the placing shingles and nailing shingles phases were identified as
potentially imposing the greatest risk of knee MSD development, in terms of awkward postures
and repetitive motions, compared to the other phases. Further work is required to examine the
contribution of each knee rotation to knee MSD among roofers, and to test different interventions
with the participation of professional roofers, with a larger sample size, in real-world construction
work environments.

DISCLAIMER
The findings and conclusions in this paper are those of the authors and do not necessarily
represent the official position of the National Institute for Occupational Safety and Health,
Centers for Disease Control and Prevention.

REFERENCES
Albers, J., and Estill, C. F. (2007). "Simple solutions; ergonomics for construction workers."
Breloff, S. P., Wade, C., and Waddell, D. E. (2019). "Lower extremity kinematics of cross-slope
roof walking." Applied Ergonomics, 75, 134-142.
Choi, S., and Fredericks, T. (2008). "Surface slope effects on shingling frequency and postural
balance in a simulated roofing task." Ergonomics, 51(3), 330-344.
El-Sayegh, S. M., and Mansour, M. H. (2015). "Risk assessment and allocation in highway
construction projects in the UAE." Journal of Management in Engineering, 31(6), 04015004.
Everett, J. G. (1999). "Overexertion injuries in construction." Journal of construction
engineering and management, 125(2), 109-114.
Hofer, J. K., Gejo, R., McGarry, M. H., and Lee, T. Q. (2011). "Effects on tibiofemoral
biomechanics from kneeling." Clinical Biomechanics, 26(6), 605-611.
Nagura, T., Dyrby, C. O., Alexander, E. J., and Andriacchi, T. P. (2002). "Mechanical loads at
the knee joint during deep flexion." Journal of Orthopaedic Research, 20(4), 881-886.
Odeh, A. M., and Battaineh, H. T. (2002). "Causes of construction delay: traditional contracts."
International journal of project management, 20(1), 67-73.
OSHA (Occupational Safety and Health Administration). (2018).
<https://www.osha.gov/SLTC/ergonomics/training.html> (Aug 22, 2018).
Robertson, G., Caldwell, G., Hamill, J., Kamen, G., and Whittlesey, S. (2013). Research methods
in biomechanics, 2E, Human Kinetics.
Rosso, R. (1997). Statistics, probability and reliability for civil and environmental engineers,
Mc-Graw-Hill Publishing Company.
Thambyah, A., Goh, J. C., and De, S. D. (2005). "Contact stresses in the knee joint in deep
flexion." Medical engineering & physics, 27(4), 329-335.
Tofallis, C. (2014). "Add or multiply? A tutorial on ranking and choosing with multiple criteria."
INFORMS Transactions on Education, 14(3), 109-119.

Wang, D., Dai, F., and Ning, X. (2015). "Risk assessment of work-related musculoskeletal
disorders in construction: State-of-the-art review." Journal of Construction Engineering and
management, 141(6), 04015008.
Wang, D., Dai, F., Ning, X., Dong, R. G., and Wu, J. Z. (2017). "Assessing Work-Related Risk
Factors on Low Back Disorders among Roofing Workers." Journal of Construction
Engineering and Management, 143(7), 0401702


Optimizing Neighborhood-Scale Walkability


Andrew J. Sonta, S.M.ASCE1; and Rishee K. Jain, Ph.D., A.M.ASCE2
1Urban Informatics Lab, Dept. of Civil and Environmental Engineering, Stanford Univ., 473 Via
Ortega, Room 269B, Stanford, CA 94107. E-mail: [email protected]
2Urban Informatics Lab, Dept. of Civil and Environmental Engineering, Stanford Univ., 473 Via
Ortega, Room 269A, Stanford, CA 94107. E-mail: [email protected]

ABSTRACT
Many designers and researchers have grappled with the problem of optimally locating
buildings and use types in a neighborhood-scale development. But little work has used
data-driven optimization to aid in creating urban design schemes. The paradigm of single-use
Euclidian zoning has heavily impacted the way our neighborhoods, cities, and suburbs are
designed, resulting in the physical separation of uses. However, as we grapple with emerging
issues of environmental and social sustainability in cities, there is a pressing need to consider
alternative urban designs that require less dependence on personal automobiles and that foster
healthier cities. In this paper, we develop a methodology for (1) automatically assessing the
walkability of neighborhoods by adopting a common walkability metric and (2) optimizing the
layout of buildings and amenities across a known grid in order to maximize the walkability
metric. We apply this methodology to a case study of the Potrero Hill neighborhood in San
Francisco, California. We find that, in comparison to the existing layout that can be characterized
by Euclidian-style separation of uses, the optimized layout suggests distributing amenities across
the street network, resulting in a two-fold increase in walkability. This tool and analysis have the
potential to provide computational and data-driven support for urban designers and researchers
hoping to understand and improve the walkability of urban spaces.

INTRODUCTION
The design and planning of urban spaces has a long and storied history, with ideas about the
best use of urban space dating to Ancient Rome. Some of the earliest plans for cities—including
Paris, London, and Washington, D.C.—were created by master-builders or architects with the
backing of government. Today, almost all cities implement some form of urban planning vis-à-
vis rules about building form, use, and location (Best 2016).
Single-use zoning, also known as Euclidian zoning—in which cities are divided into areas
with specific rules for building height, use, and density—became possible and prevalent after the
landmark case Village of Euclid v. Amber Realty Co. in 1926 (Wickersham 2000). In the period
following World War II, the physical separation of functional uses in cities became both feasible
and desirable due to increased rates of property ownership and use of the personal automobile
(Best 2016). Even in dense cities, single-use zoning replaced existing mixed-use developments
(Jacobs 1961). However, recent environmental and social concerns (e.g., the public and planetary
health consequences of automobile pollution) have led urbanists, local governments, and city
planners to rethink rigid Euclidian rules. One important reason is that developments with a mix
of uses reduce residents’ dependence on personal vehicles. Aside from the obvious
environmental implications, urbanists such as Jane Jacobs (1961) have argued that increased use
of sidewalks and reduced dependence on cars create vibrant, socially resilient communities. As a
result, the study and desirability of walkable communities have increased greatly in recent years.


Recently, we have also seen a vast surge in urban data resources and computing power.
Given these resources, researchers now have a unique opportunity to put these concepts of ideal
urban form to the test. This dual paradigm of evolving urban planning concepts and maturing
cyber-physical analysis has the potential to validate or entirely upend the consensus of what
makes a city effective. As a result, there is a pressing need to explore how computing tools such
as optimization can augment current decision-making processes around zoning and rule-making
in urban areas. Given the complexity of city planning—which includes street and path layouts,
building geometries, and use types—various approaches must be explored. In this paper, we
develop a methodology for maximizing the walkability of a neighborhood-scale development by
choosing the layout of buildings in an existing street grid, given the number of buildings, each
building’s prescribed use, and possible lots for placing each building. In a case study, we
compare the existing layout of a neighborhood in San Francisco, CA with an optimized layout
that distributes key urban amenities quite differently.

BACKGROUND
Recent urban design research has identified the concept of walkability as a key metric in
addressing environmental and social sustainability concerns in cities. Porta and Renne (2005)
include interconnectedness and accessibility of the street network as a critical component of their
tool for assessing the sustainability of urban form. Furthermore, they argue that in addition to
these street network characteristics, the community must colocate a diversity of land uses so that
multiple uses can be accessed by walking.
Some studies have used heuristic algorithms to optimize the walkability of neighborhood-
sized developments. These heuristics produce best-practice guidelines for walkable communities
built on architectural and urban design expert knowledge (Southworth 2005). While these
guidelines can be important and effective tools for urban designers in their planning work, they
lack an objective score that can be automatically calculated and applied quickly to various design
alternatives. Exploiting automated computational tools can greatly expand the solution space and
reveal previously overlooked options.
A few recent studies have explored the notion of optimizing physical layouts of structures in
real-world environments. Razavialavi and Abourizk (2017), for example, developed a genetic
algorithm framework for optimizing layout on a construction site. Rakha and Reinhart (2012)
developed a generative modeling platform that assesses different parametric urban massing
forms for walkability. They adopt the walkability scoring system discussed in Carr et al. (2010)
as the metric for optimization, and they utilize a genetic algorithm to optimize walkability by
placing different uses. This previous quantitative optimization work, while valuable in advancing
the role of computing in assessing urban form, has not been applied to evaluate the performance
of existing urban areas. Furthermore, the implications of the walkability optimization results
have not been fully explored, especially in their relationship to conventional wisdom about
effective urban design.

METHODOLOGY
The purpose of the methodology outlined in this section is to maximize a quantitative
walkability metric of a neighborhood-scale development given constraints about the number and
possible locations for each building type. The methods we outline here can be used to compare
optimized layout of buildings and amenities with alternative designs, including those created
through heuristics or those that already exist in cities.


Our approach follows a procedure with three main steps:


1. Problem definition—define the walkability objective function and how it is measured,
and define the solution space (i.e., possible locations of buildings) as well as the
constraints (i.e., number of each building type available).
2. Generate random designs—develop a routine for creating a population of randomly
generated designs, which are defined by the locations of each building type.
3. Optimize design—assess designs, create a new set of designs based on the best
performers, and repeat until convergence.

Problem Definition
In order to accomplish the ultimate goal of maximizing walkability, we first need a
walkability metric and a set of variables that can be changed to vary this metric. In this paper, we
adopt the metric discussed in Rakha and Reinhart (2012) and hereafter refer to it as the Street
Score. The Street Score is a value between 0 and 100, and it is calculated for one residential unit
at a time. In its most general form, the Street Score ( S ) is calculated as the sum of walking
distance scores between each residential unit and a prescribed number of different amenities
(e.g., parks, restaurants, grocery) as follows:
S = (1/A) · Σ (a = 1 to A) [w_a · s_a] × 100

where w_a is the weighting vector for amenity a and s_a is the distance score vector for amenity
a (defined below). The vectors w_a and s_a can have different sizes for each amenity, but the size
of w_a is always equal to the size of s_a. This difference in vector sizes is simply a function of
the fact that the implementation of the Street Score metric can specify different numbers of each
amenity to consider in the scoring (e.g., 2 coffee shops vs. 20 restaurants). The distance score is
calculated as a function of the walking distance x from the residential unit to the amenity under
consideration. This walking distance must be defined according to the street grid (e.g., in a
perpendicular north-south, east-west grid, the distance would be the L1 norm, or the “Manhattan”
walking distance). For instance i of amenity a, the distance score is calculated as follows:
s_a,i = 1                                   if x ≤ d1
s_a,i = 1 − [0.9 / (d2 − d1)] · (x − d1)    if d1 < x ≤ d2
s_a,i = 0.1 − [0.1 / (d3 − d2)] · (x − d2)  if d2 < x ≤ d3
s_a,i = 0                                   if x > d3
for walking distance x, where d1, d2, and d3 are set by the user. The result is a score, based on
the distance, that is scaled between 0 and 1.
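The scoring above can be sketched as follows (a minimal illustrative implementation; how amenity instances pair with the weight vector — here nearest-first — is an implementation choice, not specified in the text):

```python
def distance_score(x, d1, d2, d3):
    """Piecewise distance score: full credit within d1, linear decay to
    0.1 at d2, further linear decay to 0 at d3, and 0 beyond d3."""
    if x <= d1:
        return 1.0
    if x <= d2:
        return 1.0 - 0.9 / (d2 - d1) * (x - d1)
    if x <= d3:
        return 0.1 - 0.1 / (d3 - d2) * (x - d2)
    return 0.0

def street_score(walk_distances, weights, d1, d2, d3):
    """Street Score for one residential unit.

    walk_distances: {amenity: [network walking distance to each counted
    instance]}; weights: {amenity: weight vector w_a}. Instances are
    scored and paired with weights best-first (an assumption).
    """
    total = 0.0
    for amenity, dists in walk_distances.items():
        # Score each instance, closest (highest score) first, then
        # take the dot product with the amenity's weight vector.
        scores = sorted((distance_score(x, d1, d2, d3) for x in dists),
                        reverse=True)
        total += sum(w * s for w, s in zip(weights[amenity], scores))
    return 100.0 * total / len(walk_distances)
```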
The design variables for the problem are the locations of buildings and amenities. The
categories of buildings/amenities can be set according to the individual problem, but it is
important to note that the initial work by Rakha and Reinhart used residential units, restaurants,
generic shops, coffee shops, bookstores, banks, grocery stores, parks, schools, and entertainment
venues. In this initial work, we simplify the design space by defining specific lots at which
different buildings of different sizes can be placed.


Optimization
Given a calculable objective function (Street Score) and design variables (locations of
buildings/amenities), the next step is to perform optimization. We utilize a genetic algorithm, as these
have been shown in previous work to be effective in optimizing physical layouts with large
solution spaces (Rakha and Reinhart 2012; Razavialavi and Abourizk 2017). Each step in the
genetic algorithm requires creating a routine specific to this problem setting. These
subroutines are outlined in this subsection. To initialize the population, we must be able to create
random designs. Given a street grid with possible lots as well as a building stock with numbers
of available building/amenity types, we can randomly assign each building type to a lot. For
implementation, it can be simplest to randomly assign larger amenities—that may take up
multiple lots—first, working from largest amenities to smallest. Once an initial population is
created, a Street Score can be calculated for each neighborhood design. To adapt the Street Score
methodology from a single residential unit to an entire neighborhood, we randomly sample
residential units from a neighborhood, calculate the Street Score for each, and average the
results. Given Street Scores calculated for all neighborhood designs, we can select parents that
will help us create future generations. Different selection criteria can be utilized, including
truncation selection, tournament selection, and roulette wheel selection, as discussed in
Kochenderfer (2018).
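As one concrete example, truncation selection keeps the k best-scoring designs as parents (a sketch; which selection criterion was actually used is not specified here):

```python
def truncation_select(population, scores, k):
    """Truncation selection: return the k designs with the highest
    Street Scores to serve as parents for the next generation."""
    order = sorted(range(len(population)),
                   key=lambda i: scores[i], reverse=True)
    return [population[i] for i in order[:k]]
```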
Once parents are selected, crossover and mutation must be implemented to create new
generations. The process for crossover is detailed in Algorithm 1. The notion is to randomly
choose the location of each building/amenity from the parents’ locations for that
building/amenity. First, all non-residential buildings are selected from the parents and placed,
and then the residential units are filled in randomly. The concept of simulated annealing can be
incorporated into the overall genetic algorithm through modification of simple crossover (and
mutation), as discussed in Adler (1993). In simulated annealing crossover, a child is created from
two parents, and it is always accepted if it performs better than the parents. If it performs worse
than its parents, it is accepted with a probability that shrinks over generations. Formally, the
child is accepted with the following probability:

    P(accept) = 1                      if ΔS ≥ 0
    P(accept) = min( e^(ΔS / t), 1 )   if ΔS < 0

where ΔS is the difference between the child's Street Score and the best of the parents' Street
Scores, and t is a temperature value that decreases according to an exponential annealing
schedule, which makes use of the following decay factor: t^(k+1) = γ · t^(k), where γ is a
user-defined parameter.
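The acceptance rule and annealing schedule can be sketched as follows (a minimal illustration under the notation above, not the authors' code):

```python
import math
import random

def acceptance_probability(delta_s, t):
    """Probability of accepting a child whose Street Score differs from
    the best parent's score by delta_s, at temperature t."""
    if delta_s >= 0:  # child is no worse than the best parent: always accept
        return 1.0
    return min(math.exp(delta_s / t), 1.0)

def accept(delta_s, t, rng=random):
    """Stochastically accept or reject the child."""
    return rng.random() < acceptance_probability(delta_s, t)

def temperature(t0, gamma, k):
    """Exponential annealing schedule: t^(k+1) = gamma * t^(k),
    i.e. t at generation k is t0 * gamma**k."""
    return t0 * gamma ** k
```

As the temperature shrinks over generations, worse-performing children are accepted less and less often, so the search gradually shifts from exploration to exploitation.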
The algorithm for mutation is shown in Algorithm 2. Mutation is only performed on a child

© ASCE
Computing in Civil Engineering 2019 458

with probability given as a parameter in the overall genetic algorithm. When it is performed, a
given number of randomly chosen non-residential buildings/amenities are swapped with
residential buildings/amenities. Mutation is implemented in this way because the relative
locations of residential units and non-residential amenities are the key drivers in the Street Score
function. Simulated annealing can also be applied to mutation, using the original individual and
the mutated individual as the candidates for acceptance. Crossover and mutation are used to
create new generations of neighborhood designs. In the overall algorithm, we track the best
performing individuals to determine the overall most walkable neighborhood design.
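A swap-based mutation of this kind might be sketched as follows (the lot-to-type dictionary representation is a hypothetical stand-in, not the paper's Algorithm 2):

```python
import random

def mutate(design, n_swaps, rng=random):
    """Swap n_swaps randomly chosen non-residential buildings with
    randomly chosen residential units, preserving the building stock.

    design: dict mapping lot id -> building/amenity type.
    """
    non_res = [lot for lot, b in design.items() if b != "residential"]
    res = [lot for lot, b in design.items() if b == "residential"]
    mutated = dict(design)
    for a, b in zip(rng.sample(non_res, n_swaps), rng.sample(res, n_swaps)):
        mutated[a], mutated[b] = mutated[b], mutated[a]
    return mutated
```

Because the mutation only exchanges positions, the counts of each building type stay fixed while the relative locations of residential units and amenities (the key drivers of the Street Score) are perturbed.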

Figure 1. Potrero Hill existing layout with amenities and their weight vectors.

CASE STUDY: POTRERO HILL, SAN FRANCISCO, CALIFORNIA


To evaluate the proposed optimization methodology and test it on a real-world urban area,
we apply it to an existing neighborhood-scale urban design in the Potrero Hill area of San
Francisco, California. The grid we consider in this case study is 9 blocks by 3 blocks and roughly
320,000 m2 in area. Figure 1 shows the abstracted study space, the amenities that are present in
the design space, and weight vector associated with each amenity (as described above). These
amenities are found and categorized through a manual audit of the space using Google Maps.
The categorizations and weights in this study largely follow those in Rakha and Reinhart’s
previous work, which were chosen based on their analysis of both importance and the need for
choice (as lengths of the weight vectors indicate how many of each amenity are considered in the
calculation of the Street Scores). We increased the weight for the park amenity given its
prominence in the existing design. Furthermore, consistent with Rakha and Reinhart, we did not
consider offices to be an amenity.
To convert the physical layout to an abstract layout with appropriate dimensions and with
lots for placing buildings and amenities, we used the osmnx package (Boeing 2017) for Python.
We made certain assumptions in order to simplify the abstract representation of the
neighborhood. Based on our assessment of the study area, we assume that, on average, there are
32 possible lots in each block. We also assume that the park and the schools each occupy one full
block—where a full block is defined as the lots entirely contained by four intersections.
Furthermore, we assume that all other building types each occupy one lot. It is important to note
that this last assumption could be easily changed such that different building/amenity types take
up different numbers of lots and/or partial lots to reflect multi-use development. For this study,
however, we aimed to keep the abstract neighborhood representation as simple as possible in
order to focus on optimization and interpretation.
For calculating the Street Score, we need to set values for the parameters d1, d2, and d3
(discussed above). Given the geometry of the neighborhood, we set d1 = 50 m, d2 = 300 m, and
d3 = 1600 m. It is important to note that these values are smaller than they are in Rakha and
Reinhart’s initial work. We choose smaller values because the physical distances in our case
study are much smaller than those in the Rakha and Reinhart study, and therefore it would be
relatively difficult to achieve a perfect Street Score. For the existing urban layout, we calculate
the Street Score to be 31.8.

Figure 2. Comparison of implementations with varying degrees of simulated annealing.

Figure 3. Optimized neighborhood layout.


In order to optimize this layout, we execute the genetic algorithm outlined above. The first
step in this algorithm is to generate an initial population. We first generate a random population
of neighborhood designs and assess their Street Scores. The random design routine first chooses
random blocks for placing the park and schools, since these amenities take up full blocks. It then
chooses random lots for placing all other amenities, and finally it fills up the remaining lots with
residential units. After generating a population of 1,000 individuals, we calculate the Street Score
for each. The resulting distribution has a mean of 52.1 and a standard deviation of 3.4.
We implement a version of truncation selection in the genetic algorithm to bias toward the
better performing individuals. We first sort the individuals by decreasing Street Score (since we
are maximizing). We then choose from the best performing individuals, but we also ensure that a
randomly chosen set of the remaining population is incorporated in the selected group in order to
protect against local minima. The mutation and crossover routines are implemented as discussed
in the Methodology section. On a small population, we test three versions of the genetic
algorithm, each with different levels of use of the concept of simulated annealing. In the baseline
case, we do not include simulated annealing, but we test two other cases: one in which simulated
annealing is incorporated into mutation, and another in which simulated annealing is
incorporated into both mutation and crossover. When simulated annealing is incorporated, we
use the exponential annealing schedule with γ = 3/4. The results from this test are shown in
Figure 2 (where GA represents ‘genetic algorithm’ and SA represents ‘simulated annealing’). As
we can see, the genetic algorithm with simulated annealing incorporated into mutation performs
the best.
Having decided that simulated annealing should only be applied to the mutation step, we
execute the genetic algorithm with the following parameters: 1,000 design points in a single
population, 100 generations, 5% probability of mutation, 500 parents, 4 children per parent pair,
and initial annealing temperature of 10. The optimization convergence is shown in Figure 2. The
best performing individual found after all generations are scored has a Street Score of 68.7. This
is a little more than a two-fold increase from the existing layout (which was 31.8) and a roughly
32% increase over the random layouts. The final optimized layout is shown in Figure 3.
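The overall loop, tracking the best individual across generations, can be sketched generically (a simplified illustration; the initialization, scoring, selection, crossover, and mutation routines are passed in as functions and are our assumptions):

```python
import random

def genetic_algorithm(init, score, select, crossover, mutate,
                      pop_size=1000, generations=100, p_mutation=0.05,
                      rng=random):
    """Generic GA loop: keep the best-scoring individual ever seen."""
    population = [init() for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        parents = select(population)
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)   # pick a parent pair
            child = crossover(a, b)
            if rng.random() < p_mutation:   # mutate with small probability
                child = mutate(child)
            children.append(child)
        population = children
        best = max(population + [best], key=score)
    return best
```

On a toy one-dimensional problem (maximize closeness to 0.5), the same loop converges quickly, which is a useful sanity check before applying it to full neighborhood designs.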

DISCUSSION AND CONCLUSIONS


Results of our analysis indicate that the average Street Score for the randomly generated
layouts is significantly higher than the Street Score for the existing neighborhood layout. Perhaps
even more surprising, the existing layout’s score is about 6 standard deviations below the
randomly generated layouts’ mean score. It is important to note here that the existing layout
score could start to approach the random layout score if the parameters d1, d2, and d3 are
increased. However, this finding still suggests a significant difference between the random and
existing layouts. This is partially explained by the fact that the existing layout is reminiscent of
the planning notion of Euclidean zoning. In the existing layout, the shop and restaurant uses are
generally clustered in the lower right hand side of the grid. This clustering negatively impacts the
Street Scores for any residential units located relatively far from the cluster (in our case, the
houses on the upper left). Additionally, the grocery amenity in the existing layout is located all
the way in the upper left corner of the grid, having a similar effect on the scores for residential
units on the bottom right.
The optimization routine seems to converge around a maximum about 32% higher than the
random layout. This optimized layout (as seen in Figure 3) has a much more dispersed layout of
amenities. Importantly, the park and grocery amenities are located quite centrally in the grid. In
addition, the schools are distributed on the left and the right, and the restaurants and shops tend
to be distributed evenly across the entire grid. This makes intuitive sense: the more distributed
amenities are, the higher chance that all residential units will be proximate to at least one of each
amenity—questioning the benefits of Euclidean zoning for walkability in urban neighborhoods.
The main limitations in this work result from the various assumptions that were involved in
setting some of the scoring parameters, including the distance parameters and the weighting
parameters. Future work should consider which parameter values are most appropriate for
different problem settings. However, while assumptions had to be made, the results still suggest
important differences between the existing Euclidean-style layout and the more mixed layout
suggested through optimization. Another limitation of this work is that the optimization and
analysis were solely focused on an existing neighborhood layout in a real neighborhood. While
the optimization could be easily applied to a neighborhood designed from scratch, a few things
would need to be known before optimization: the street grid, the number of each
building/amenity, and the possible lots for building/amenity placement. Future work should
consider the problem of co-optimizing the street grid and possible locations along with building
placement, as this might provide further insights into the optimal design for the urban fabric.
Finally, the findings from this analysis should not be the sole input when designing a new
layout or assessing the performance of an existing layout. To be sure, there are metrics other than
walkability that should be seriously considered when designing an urban space, such as
proximity of amenities to transit stops, public health effects, or expected economic activity. The
relative colocation of a polluting factory with a grocery store may increase walkability, but it
could have dire consequences for public health. Similarly, it may improve walkability to
distribute amenities across a given area, but for canonical economic reasons such as those first
suggested in Hotelling’s law (Hotelling 1929), it may be more economically profitable for two
similar businesses to be located near each other. While the work presented in this paper cannot
provide a sole rationale for designing a neighborhood one way versus another, it can provide
helpful input for urban designers, engineers, and city governments in considering new layouts or
evaluating existing ones.

REFERENCES
Adler, D. (1993). “Genetic algorithms and simulated annealing: A marriage proposal.” IEEE
International Conference on Neural Networks - Conference Proceedings, 1104–1109.
Best, R. E. (2016). “Modeling, Optimization, and Decision Support for Integrated Urban and
Infrastructure Planning.” Stanford University.
Boeing, G. (2017). “OSMnx: New methods for acquiring, constructing, analyzing, and
visualizing complex street networks.” Computers, Environment and Urban Systems,
Pergamon, 65, 126–139.
Carr, L. J., Dunsiger, S. I., and Marcus, B. H. (2010). “Walk Score™ as a global estimate of
neighborhood walkability.” American journal of preventive medicine, Elsevier, 39(5), 460–
463.
Hotelling, H. (1929). “Stability in Competition.” The Economic Journal, 39(153), 41.
Jacobs, J. (1961). The Death and Life of Great American Cities. Random House, New York.
Kochenderfer, M. J., and Wheeler, T. A. (2018). Algorithms for optimization. MIT Press.
Porta, S., and Renne, J. L. (2005). “Linking urban design to sustainability: formal indicators of
social urban sustainability field research in Perth, Western Australia.” Urban Design
International, 10(1), 51–64.
Rakha, T., and Reinhart, C. F. (2012). “Generative Urban Modeling: A design work flow for
walkability-optimized cities.” SimBuild2012 5th National Conference of IBPSA-USA, 1–8.
Razavialavi, S., and Abourizk, S. M. (2017). “Site Layout and Construction Plan Optimization
Using an Integrated Genetic Algorithm Simulation Framework.” Journal of Computing in
Civil Engineering, 31, 04017011 (1-10).
Southworth, M. (2005). “Designing the walkable city.” Journal of urban planning and
development, American Society of Civil Engineers, 131(4), 246–257.
Wickersham, J. (2000). “Jane Jacobs's Critique of Zoning: From Euclid to Portland and Beyond.”
BC Envtl. Aff. L. Rev., HeinOnline, 28, 547.


A Novel Method for Monitoring Air Speed in Offices Using Low Cost Sensors
Ashrant Aryal1; Ishan Shah2; and Burcin Becerik-Gerber3
1Ph.D. Student, Sonny Astani Dept. of Civil and Environmental Engineering, Viterbi School of
Engineering, Univ. of Southern California, KAP 217, 3620 South Vermont Ave., Los Angeles,
CA 90089. E-mail: [email protected]
2Undergraduate Research Assistant, Sonny Astani Dept. of Civil and Environmental
Engineering, Viterbi School of Engineering, Univ. of Southern California, KAP 217, 3620 South
Vermont Ave., Los Angeles, CA 90089. E-mail: [email protected]
3Associate Professor, Sonny Astani Dept. of Civil and Environmental Engineering, Viterbi
School of Engineering, Univ. of Southern California, KAP 217, 3620 South Vermont Ave., Los
Angeles, CA 90089. E-mail: [email protected]

ABSTRACT
Even though HVAC systems consume around 40% of total building energy, they often fail to
provide satisfactory thermal conditions to occupants in commercial buildings. Personalized
environmental control systems (PECS) such as local fans and heaters have the potential to
control the local environment around the occupant to improve occupant satisfaction. Existing
methods for modeling thermal preferences primarily rely on temperature measurements and
often neglect air speed. Current methods for monitoring air speeds are either too expensive or too
bulky to be used in real office environments. We present our preliminary results on predicting air
speeds using alternate environmental parameters such as air pressure, temperature, and humidity
using low cost miniature sensors. The results show that we can accurately predict air speeds with
a mean absolute error of 0.056 m/s for conditions investigated in this study. Although the results
are promising, further studies are needed to improve the system before it can be used in real
world environments.

INTRODUCTION
In the U.S., HVAC systems consume around 43% of the total building energy consumption
(U.S. Department of Energy 2010), yet they often fail their primary purpose of providing
comfortable indoor conditions. A large-scale study, which collected occupant satisfaction
surveys in commercial buildings in North America over 10 years, showed that only 38% of the occupants
are thermally satisfied, with only 2% of buildings meeting the ASHRAE requirement of
satisfying at least 80% of occupants (Karmann et al. 2018). Poor occupant satisfaction stems
from two main reasons: the differences in comfort preferences among occupants, and the
inability of centralized HVAC systems to provide personalized environmental conditioning and
control opportunities. A recent simulation study showed that even when thermal comfort
preferences of occupants are known, it is very difficult to achieve 80% occupant satisfaction
using centralized HVAC systems due to lack of capability of controlling the environment at a
granular level to meet individual occupant requirements (Aryal and Becerik-Gerber 2018).
Personalized Environmental Control Systems (PECS) have recently received interest from
the research community as solutions to provide more comfortable environments while reducing
building energy consumption. PECS, such as local fans and heaters, have the potential to provide
local cooling and heating to maintain comfortable conditions around each occupant while
enabling central HVAC systems to be controlled over a wider range of temperatures, thereby
reducing overall energy consumption while improving occupant satisfaction. For example, a
recent study demonstrated that extending HVAC setpoints could reduce CO2 emissions up to
21.4% per year, and the use of personal cooling systems can result in cooling energy savings of
10% to 70% depending on the location (Heidarinejad et al. 2018). Thermal comfort depends on
several factors, such as air temperature, radiant temperature, relative humidity, air speed,
metabolic rate and clothing. There have been several recent efforts to learn individual thermal
comfort preferences of building occupants by leveraging recent advancements in Internet of
Things (IoT) and machine learning techniques (Kim et al. 2018). Once individual comfort
preferences are modeled, these models could be used to automatically control HVAC systems
and PECS to improve occupant comfort and satisfaction (Aryal et al. 2018). However, current
methods of modeling thermal comfort preferences primarily rely on environmental temperature
measurements and physiological measurements from wearable devices; they do not often
consider air speed due to the difficulty of monitoring it with existing methods
(Kim et al. 2018).
Traditionally, it was believed that air movement has a negative effect of causing draft that
can lead to discomfort. However, there has been a recent shift in focus towards the positive
effects of air movement and its ability to improve thermal comfort of occupants and even
provide thermal pleasure or thermal alliesthesia (Parkinson and de Dear 2017; Zhai et al. 2017).
Air speeds around an occupant could be affected by several factors, such as the fan size, fan
location and orientation, furniture in the room, and so on. For example, a recent study mapped
the influence of desks and office partitions on the air flow patterns inside a room with a ceiling
fan to provide a better understanding of how furniture affects air movement in a room (Gao et al.
2017). Climate chamber studies have also been conducted to identify suitable air speeds for
improving comfort in a typical office layout (Zhai et al. 2017). However, such studies utilize
bulky sensing methods, such as an array of anemometers mounted on a rod for monitoring air
speed around an occupant, which would be impractical for monitoring air movement in an actual
office due to the large space requirements. Other methods, such as hot-wire anemometers
(~$100-$500) and ultrasonic anemometers (~$100-$2,500), are too expensive to be utilized at a large scale.
A more practical approach would be to utilize sensing methods that are low cost, and small in
size so they could easily be embedded in an existing office space (e.g., on desks, chairs, and
other furniture) to monitor air movement around an occupant to provide real time measurements
of air speed even when other conditions change in the environment. In this paper, we present our
preliminary results from our attempt to find an alternate way to monitor air speeds in indoor
spaces.

METHODOLOGY
Moving air exerts pressure on objects as we experience in everyday life, such as trees
swaying in the wind. Pitot tubes, a common instrument used to measure air speed in aircraft,
rely on measuring the dynamic pressure caused by moving air to evaluate the air speed. Moving
air also has a cooling effect as it can take away some of the thermal energy from a body by
conduction and convection. Moving air can also change relative humidity by causing a change in
the density of air. Analytical modeling of the underlying physical relationships among
temperature, pressure, humidity and air speed for turbulent flows of open air is a complicated
task. However, machine learning algorithms can model complex relationships solely from the
provided data. Thus, we utilize inexpensive sensors to monitor environmental parameters, such
as air pressure, temperature and humidity, and utilize machine learning algorithms to predict the
air speed. The sensors utilized in this study are low-cost (2 sensors, $20 each) and small in size
(2 cm × 2 cm), so they can be easily embedded in office spaces and deployed at a large scale.

Figure 1: Left- Data Collection Setup. Right- Top: Main sensor, Bottom: Reference sensor

Figure 2: Pressure, temperature and humidity readings from main and reference sensors
Data Collection: For this study, air movement was generated using a typical desk fan
(Honeywell HT-908) with three different speed settings. A cup-based anemometer connected to
an Arduino Uno was used to monitor the air speed and provided the training labels for our
model. Environmental measurements for pressure, temperature and humidity were taken using
two BME280 sensors connected to the Arduino Uno. One of the BME280 sensors directly faced
the fan (referred to hereafter as the main sensor), and the other BME280 sensor (referred to
hereafter as the reference sensor) was placed in a location that was not directly influenced by the fan
to serve as a reference measurement of the ambient conditions in the room. The anemometer, as
well as the BME280 sensors were mounted on an office chair, with the anemometer and the main
sensor placed roughly at the location of the head of a person sitting on the chair, while the
reference sensor was mounted at the back of the chair as shown in Figure 1. The three fan speeds
as measured by the anemometer were around 3.2 m/s, 3.5 m/s, and 4 m/s.
Since ambient conditions in the room change throughout the day due to several factors, such
as HVAC operations, occupants in the room, outside climate etc., we utilized the reference
sensor to gather background changes that can be removed from the measurements collected by
the main sensor directly facing the fan. It is important to note that the office space was occupied
during the day throughout the data collection period, and the
temperature setpoint of the HVAC system was set by the occupants as desired. For this pilot
study, the fan was placed roughly 2 feet from the sensors, with the center of the fan aligned to
the sensors. We also utilized a smart plug that can be remotely controlled to switch the fan on or
off. The fan was switched on or off at a random interval between 5 to 15 minutes while the
measurements were taken from all the sensors. The data was collected for a period of 115 hours
(4 days and 19 hours) with sensor measurements taken every second resulting in approximately
416,000 data points. A portion of the collected data for a 6-hour window is shown in Figure 2,
where the large fluctuation is caused by the HVAC being turned on when an occupant entered
the office, and the smaller fluctuations are related to the fan operation.
Data pre-processing: The data collected from the reference sensor provided information
regarding changes in the background not affected by the fan. The background trend was
removed by subtracting the readings of pressure, temperature and humidity obtained by the
reference sensor from the readings obtained by the main sensor. The data was then normalized
by subtracting the mean from each data point and dividing it by the standard deviation. The
cleaned data, after removing the background trend and normalization, is shown in Figure 3 along
with the airspeed measurements obtained from the anemometer. Turning the fan on corresponds
to an increase in pressure and decreases in temperature and humidity, as seen in Figure 3. This trend
can be observed strongly when the HVAC system is not operating but becomes weaker when the
HVAC system is operating, around time 16:00 in Figure 3, due to additional background
variation caused by the system.
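The de-trending and normalization steps described above can be sketched with NumPy (variable and function names are ours):

```python
import numpy as np

def clean(main, reference):
    """Remove the ambient background trend by subtracting the
    reference-sensor reading sample by sample, then z-score normalize
    (subtract the mean, divide by the standard deviation)."""
    detrended = np.asarray(main, float) - np.asarray(reference, float)
    return (detrended - detrended.mean()) / detrended.std()
```

The same routine would be applied independently to each of the three data streams (pressure, temperature, and humidity).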

Figure 3: Cleaned and normalized data along with air speed measured by an anemometer
Table 1: RMSE for different algorithms using all predictors
Algorithm RMSE using all predictors
Linear Regression 0.91
Decision Trees (Large – minimum leaf size = 4) 0.21
Decision Trees (Small – minimum leaf size = 36) 0.49
Bagged Trees 0.15
Boosted Trees 0.84

Among the three data streams shown in Figure 3, pressure seems to be more robust against
the changes caused by HVAC operation as it does not abruptly change when the HVAC is turned
on (around time 16:00) compared to other measurements. Since the changes observed in the
pressure seemed to be more robust, a new variable that provides the information regarding
whether the fan was on or off (fan state) was derived from the pressure data. An algorithm that
finds the points where there is an abrupt change in the average of data was utilized to detect the
points of change in the cleaned data. Since we know that the pressure increases correspond to the
fan being turned on, the fan state was then calculated by evaluating the direction of change
between two change points. After including this derived measurement, the data was used to train
different regression models using supervised learning techniques.
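A much-simplified stand-in for this step is sketched below; a single-sample jump threshold replaces a proper change-point detector, so it only illustrates the direction-of-change idea (the threshold and all names are our assumptions):

```python
import numpy as np

def fan_state(pressure, threshold):
    """Derive a binary fan on/off signal from cleaned pressure data.

    A change point is declared wherever the jump between consecutive
    samples exceeds `threshold`; between change points the state is held
    constant. An upward jump marks the fan turning on (pressure rise),
    a downward jump marks it turning off.
    """
    p = np.asarray(pressure, float)
    state = np.zeros(len(p), dtype=int)
    current = 0
    for i in range(1, len(p)):
        jump = p[i] - p[i - 1]
        if jump > threshold:
            current = 1       # pressure rise: fan switched on
        elif jump < -threshold:
            current = 0       # pressure drop: fan switched off
        state[i] = current
    return state
```

A real implementation would use a detector of abrupt changes in the signal mean, as the text describes, which is more robust to noisy single samples than this per-sample threshold.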
Model Training: The sensor measurements for pressure, temperature and humidity from the
main and reference sensors, the cleaned data stream for pressure, temperature and humidity, and
the derived fan state, resulting in a total of 10 predictors, were used as predictors for training the
regression models. The target variable (response) was the air speed measured by the
anemometer. The dataset was split into training set (60%), and testing set (40%). We evaluated
the common regression algorithms to determine which algorithm performed well for predicting
air speeds.

RESULTS
The algorithms were evaluated using 5-fold cross validation on the training set in order to
pick the best algorithm. The Root Mean Square Error (RMSE) for each of the algorithms in the
5-fold cross validation is shown in Table 1. Bagged trees outperformed the other algorithms
evaluated, with the lowest RMSE of 0.15, as seen in Table 1. Bagged trees consist of a
collection of decision trees trained on separate subsets of the data, which are later combined
together to provide a more robust prediction. Since all the sensor signals in this case are
inherently noisy, bagged trees seem to outperform other algorithms due to their robust nature.
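A bagged-trees regressor of this kind can be sketched with scikit-learn on synthetic stand-in data (the paper does not state which toolbox was used, so this is an illustration only, not the authors' pipeline):

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 10 predictors, with the target driven mostly by one of them,
# loosely mirroring the 10-predictor / air-speed setup in the paper.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = X[:, 0] + 0.1 * rng.normal(size=1000)

# 60/40 train/test split, as in the paper.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=0)

# BaggingRegressor's default base learner is a decision tree, i.e. "bagged trees":
# each tree is fit on a bootstrap sample and their predictions are averaged.
model = BaggingRegressor(n_estimators=50, random_state=0).fit(X_tr, y_tr)

pred = model.predict(X_te)
rmse = float(np.sqrt(np.mean((pred - y_te) ** 2)))
```

Averaging over bootstrap-trained trees damps the variance of any single tree, which is consistent with the robustness to noisy sensor signals noted above.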
Furthermore, to evaluate which of the physical measurements were most useful in predicting
the air speed, we trained different bagged tree models using one physical measurement (pressure,
temperature, or humidity) at a time. In addition, we also evaluated the usefulness of adding fan
state, which was derived from changes in pressure readings, as a predictor by including fan state
with one physical measurement at a time. The RMSE in the 5-fold cross validation for bagged
trees, trained using a single type of physical measurement, is shown in Table 2. We see that
pressure was the most useful physical measurement that resulted in the lowest RMSE, followed
by temperature and humidity respectively. In Table 2, we also see that fan state was a useful
predictor in the model as it reduced the RMSE by roughly 50% compared to using one of the
physical measurements alone.

Table 2: RMSE when using data from single type of physical measurement
Predictors used RMSE with Bagged Trees
Temperature 0.85
Pressure 0.64
Humidity 0.9
Temperature, fan state 0.42
Pressure, fan state 0.35
Humidity, fan state 0.41

Using all of the predictors together to train a bagged tree model yielded the best prediction
with an RMSE of 0.15 as seen in Table 1. After selecting bagged trees as the preferred algorithm,
we evaluated its prediction error in the testing set (40% of the data) using all the predictors. The
RMSE in the testing set was 0.15, the same as the RMSE obtained from cross validation,
indicating that there was no overfitting on the training set. The measured vs.
predicted airspeeds from the testing set are shown in Figure 4 for a 6-hour window. The
coefficient of determination (r-squared) was 0.9934 and Mean Absolute Error (MAE) was
0.056 m/s between the measured and predicted airspeeds in the test set. MAE, in contrast to
RMSE, provides a better reflection of the average error that can be expected because it does not
square the errors to penalize outliers. The low MAE of 0.056 m/s provides strong evidence that
airspeed can be accurately predicted using other physical measurements.
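The contrast between the two metrics can be made concrete with a small numeric example (illustrative values only, unrelated to the paper's data):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the errors."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    """Root mean square error: squaring penalizes large errors more heavily."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

# A single large error inflates RMSE far more than MAE.
y_true = np.zeros(4)
y_pred = np.array([0.05, 0.05, 0.05, 1.0])
```

Here three small errors plus one outlier yield an MAE of about 0.29 but an RMSE of about 0.50, showing why MAE better reflects the typical error.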

Figure 4: Airspeed measured from anemometer and predicted from bagged trees model.

DISCUSSION
Our results from this preliminary investigation strongly support the idea of monitoring air
speed using alternative measurements of pressure, temperature and humidity. This could
provide a low-cost alternative to existing methods of monitoring air speed, such as the ones
currently used, e.g., hot-wire anemometers or ultrasonic anemometers. Furthermore, the BME280
sensor used in this study is only 2 cm × 2 cm in size, which makes it possible to embed these
sensors into everyday objects, such as office furniture, which is not practical with bulky cup-
based anemometers. In addition to monitoring airspeed, the data collected from the BME280
sensor is also useful for other purposes, such as ensuring temperature and humidity in the
environment are within the desired ranges. This increases the usefulness of using our approach to
monitor air speed compared to using different sensors for monitoring different parameters.
In this study, we collected air speeds under three fan speed settings, which were around 3.2
m/s, 3.5 m/s, and 4 m/s. Air speeds in offices tend to be lower than the air speeds used in this
study, typically lower than 0.8 m/s due to current guidelines to avoid draft. However, several
studies have shown that higher air speeds, such as 1.4 m/s (Zhai et al. 2017) or even as high as
3 m/s (Fong et al. 2010), can be effective ways to maintain thermal comfort at higher
temperatures. Since this study was intended to provide a proof of concept, a higher air speed was
used to improve the signal to noise ratio. Lower air speeds need to be considered in future
studies.
Improved air quality is another added benefit that may result from using higher air speeds
indoors. A recent study showed that using air movement devices such as a desk fan can
significantly reduce the CO2 inhaled by the occupant (Ghahramani et al. 2019). Since most
existing methods to model individual thermal comfort preferences rely only on temperature and
humidity measurements, having a convenient method to monitor air movement in real time can
be useful in developing new methods that can include air movement as an additional parameter
in individual comfort models. Ultimately, the actual effectiveness of using real time air speed
measurements for thermal comfort prediction needs to be evaluated after the system is fully
developed.
There are several limitations in this preliminary investigation that need to be overcome
before air speed can be monitored in real offices using the approach. In our study, only one type
of fan was used to generate air movement, but the changes observed in the sensor signals might
vary with different types of fans. The air speed also depends on the distance from the fan and the
angle that the fan is positioned in, which was fixed in this study. Future studies need to consider
different types of fans or air terminals positioned at different distances and orientations to
develop more robust models that can work under different conditions. For real-world
implementation, the impact of an occupant sitting on a chair where the sensor is placed also
needs to be considered. The ideal location for sensor placement needs to be investigated by
considering the occupant, and methods to filter out additional noise caused by the occupant’s
presence need to be developed.

CONCLUSION
In this study, we presented our preliminary results for monitoring air speed in office
environments using alternate measurements of air pressure, temperature and humidity. The
presented approach resulted in a mean absolute error of 0.056 m/s in air speed prediction, which
indicates the possibility of using this approach for real time monitoring of air speed in indoor
environments. The approach provides a low-cost alternative to existing methods, such as
hot-wire anemometers and ultrasonic anemometers, and is much smaller than cup-based
anemometers. Although this preliminary investigation has several limitations and there is
additional work needed to develop a robust method to monitor air speeds around an occupant, the
results are promising and warrant further investigation. We believe that with a convenient
method to monitor air speeds, we can better understand the impact of air speed on thermal
comfort by being able to obtain real time air speed measurements and comfort feedback from
occupants. Current methods of learning thermal comfort preferences of individual occupants
mostly rely on temperature measurements alone to infer thermal comfort of occupants and
including air speed measurements could potentially improve the modeling of thermal comfort
preferences. Furthermore, with real time monitoring of temperature and air speed, and by
modeling individual comfort preferences, the local environment around the occupant could be
automatically controlled to improve occupant comfort and satisfaction using PECS.
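As an aside on how such a prediction could be realized, the following is a minimal sketch of regressing air speed from windowed pressure, temperature, and humidity features. The synthetic data, the feature definitions, and the ordinary least squares model are illustrative assumptions for demonstration only, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for per-window sensor features: higher air speed is
# assumed to raise pressure fluctuation and slightly lower the temperature.
n = 2000
speed = rng.uniform(0.0, 1.5, n)                       # target air speed (m/s)
pressure_std  = 0.020 * speed + rng.normal(0, 0.002, n)
temp_mean     = 24.0 - 0.5 * speed + rng.normal(0, 0.10, n)
humidity_mean = 45.0 + rng.normal(0, 1.0, n)           # uninformative here

# Design matrix with an intercept column; simple train/test split.
X = np.column_stack([np.ones(n), pressure_std, temp_mean, humidity_mean])
train, test = slice(0, 1500), slice(1500, None)

# Ordinary least squares fit on the training windows, evaluated by MAE.
coef, *_ = np.linalg.lstsq(X[train], speed[train], rcond=None)
pred = X[test] @ coef
mae = float(np.mean(np.abs(pred - speed[test])))
print(f"MAE: {mae:.3f} m/s")
```

Even this linear baseline recovers the synthetic air speed to within a small fraction of the fan-speed range, which makes the reported 0.056 m/s error plausible when the real sensor signals carry comparably strong information.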

ACKNOWLEDGEMENTS
This material is based upon work supported by the National Science Foundation under
Grant Nos. 1763134 and 1351701. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reflect the views of the
National Science Foundation. The assistance of research assistants Irie Cooper and Victoria
Sanchez is greatly appreciated.

REFERENCES
Aryal, A., Anselmo, F., and Becerik-Gerber, B. (2018). “Smart IoT desk for personalizing indoor
environmental conditions.” Proceedings of the 8th International Conference on the Internet
of Things - IOT ’18, ACM Press, New York, New York, USA, 1–6.
Aryal, A., and Becerik-Gerber, B. (2018). “Energy Consequences of Comfort-driven
Temperature Setpoints in Office Buildings.” Energy and Buildings, Elsevier.
Fong, K. F., Chow, T. T., and Li, C. (2010). “Comfort zone of air speeds and temperatures for
air-conditioned environment in the subtropical Hong Kong.” Indoor and Built Environment,
19(3), 375–381.
Gao, Y., Zhang, H., Arens, E., Present, E., Ning, B., Zhai, Y., Pantelic, J., Luo, M., Zhao, L.,
Raftery, P., and Liu, S. (2017). “Ceiling fan air speeds around desks and office partitions.”
Building and Environment, Elsevier Ltd, 124, 412–440.


Ghahramani, A., Pantelic, J., Vannucci, M., Pistore, L., Liu, S., Gilligan, B., Alyasin, S., Arens,
E., Kampshire, K., and Sternberg, E. (2019). “Personal CO2 bubble: Context-dependent
variations and wearable sensors usability.” Journal of Building Engineering, Elsevier, 22,
295–304.
Heidarinejad, M., Dalgo, D. A., Mattise, N. W., and Srebric, J. (2018). “Personalized cooling as
an energy efficiency technology for city energy footprint reduction.” Journal of Cleaner
Production, 171, 491–505.
Karmann, C., Schiavon, S., and Arens, E. (2018). “Percentage of commercial buildings showing
at least 80% occupant satisfied with their thermal comfort.” Proceedings of 10th Windsor
Conference: Rethinking Comfort.
Kim, J., Schiavon, S., and Brager, G. (2018). “Personal comfort models – A new paradigm in
thermal comfort for occupant-centric environmental control.” Building and Environment,
Pergamon, 132, 114–124.
Parkinson, T., and de Dear, R. (2017). “Thermal pleasure in built environments: spatial
alliesthesia from air movement.” Building Research & Information, Routledge, 45(3), 320–
335.
U.S. Department of Energy. (2010). Buildings Energy Data Book.
Zhai, Y., Arens, E., Elsworth, K., and Zhang, H. (2017). “Selecting air speeds for cooling at
sedentary and non-sedentary office activity levels.” Building and Environment, Pergamon,
122, 247–257.


Review of Human-in-the-Loop Cyber-Physical Systems (HiLCPS): The Current Status from Human Perspective
Behnam Moshkini Tehrani, S.M.ASCE 1; Jun Wang, Ph.D., A.M.ASCE2;
and Chao Wang, Ph.D., A.M.ASCE3
1Ph.D. Student, Dept. of Civil and Environmental Engineering, Mississippi State Univ., PO Box 9546, Mississippi State, MS 39762. E-mail: [email protected]
2Assistant Professor, Dept. of Civil and Environmental Engineering, Mississippi State Univ., PO Box 9546, Mississippi State, MS 39762. E-mail: [email protected]
3Assistant Professor, Bert S. Turner Dept. of Construction Management, Louisiana State Univ., 3315D Patrick F. Taylor Hall, Baton Rouge, LA 70803. E-mail: [email protected]

ABSTRACT
Cyber-physical systems (CPS) are co-engineered interacting networks of physical and
computational components. Most future technologies are expected to be far more human-aware.
Accordingly, integrating the human context into CPS, instead of placing it outside the system
boundary, is becoming increasingly important. It is essential to develop a systematic understanding
of the principles for integrating humans into CPS as human-in-the-loop cyber-physical
systems (HiLCPS). However, the roles that humans should take in HiLCPS are
not clearly identified in the literature. HiLCPS are heterogeneous systems, and the uncertainty and
complexity associated with humans are high; specifying human roles in HiLCPS therefore enables
optimal system performance by fully considering human perspectives. This study synthesizes
the existing HiLCPS literature to gain insight into human roles. Three HiLCPS
configurations are identified. For each configuration, human roles and related studies are
reviewed. Furthermore, the associated challenges and opportunities are identified and
summarized.

INTRODUCTION
Cyber-physical systems (CPS), a term referring to a new generation of systems, are described
as the combination of the cyber world (computations and networks) and the physical world in a
closed loop (Petnga and Austin 2016). The progress, states, and/or changes of the physical world (e.g.,
mechanical devices, equipment, sensors, and humans) are monitored, collected and transmitted
to the cyber world (e.g., computational processes) for processing and analysis (Zhu et al. 2018).
The results obtained from the cyber world are sent back to the physical world for decision
implementations (Ma et al. 2018). The CPS approach is expected to bring advances in a wide
range of fields such as healthcare, traffic flow management, construction, and many other areas
just being envisioned. Furthermore, humans and technologies are being integrated more
intensively and closely than ever before. Accordingly, integrating the human context into CPS,
instead of placing it outside the system boundary, is becoming increasingly important. As Munir
et al. (2014) note, humans can help systems generate more reliable outcomes: humans are better
at performing tasks related to cognition, while the cyber world can more accurately analyze
problems in complicated environments with multiple criteria.
To reinforce CPS by considering human factors, it is essential to have a systematic
understanding of the principles for integrating humans into CPS as human-in-the-loop
cyber-physical systems (HiLCPS), because humans, unlike machines, are complex systems with
high levels of uncertainty and unpredictability in their cognition, behaviors, and responses. This
complexity in human behavior introduces unpredictable conditions into the systems (Munir
et al. 2014). However, in the current literature, how to best integrate human-related elements into
the closed control loop of HiLCPS is insufficiently studied. The roles that humans can play in
HiLCPS are not clearly identified. Therefore, this study focuses on synthesizing the existing
HiLCPS literature to gain insight into HiLCPS from the human perspective. A full
understanding of the HiLCPS approach from the human perspective with the associated progress
and challenges enables researchers in different disciplines to:
i. Develop HiLCPS more scientifically to meet their needs for integrating humans into
systems and achieve more reliable and efficient outcomes;
ii. Understand how to best design and operationalize both human and technological
functions in an integrated and smart system.
The remainder of this paper is structured as follows. First, the literature search with the
databases examined is explained. Afterwards, the synthesized definition of HiLCPS and the
progress and taxonomies in HiLCPS are presented with three subsections describing each
taxonomy in detail. At the end, the challenges and future opportunities are discussed, followed
by the conclusions and future work of this study.

LITERATURE SEARCH
In this paper, the literature on HiLCPS in fields such as manufacturing, maritime
transportation, and the construction industry was reviewed to gain insight into human roles in
HiLCPS. The keywords CPS, HiLCPS, Cyber-Physical Human Systems (CPHS), Human and
CPS, Cyber-Physical Social Systems (CPSS), and Human in Cyber-Physical Systems were used
for the literature search. The databases IEEE, IEEE CAS, ELSEVIER, SPRINGER, ASCE, SCOPUS,
TRB, SAE, and Telecommunications and Signal Processing (TSP) were examined. A total
of 96 papers were found based on the searched keywords. Studies were considered for inclusion
if: (i) they were published in peer-reviewed journals and proceedings; and (ii) the CPS they
introduced involved humans. Studies were excluded if they introduced CPS without considering
humans in the system. As a result, 24 papers meeting the inclusion criteria were selected, and the
remaining 72 papers were excluded from the systematic review. The number of papers on
HiLCPS is clearly limited; the study of HiLCPS is still in its infancy, in part because HiLCPS are
bi-directional communication systems in which the human must be considered inside, rather than
outside, the control loop. A summary of the areas and sources reviewed for this paper is presented in Table 1.

HUMAN-IN-THE-LOOP CYBER-PHYSICAL SYSTEMS (HILCPS)


Definition
HiLCPS refers to CPS in which the human has a prominent role and whose inputs
are integrated into the system in a closed loop (Nikolov et al. 2018). Human inputs can include
behaviors, data from the human mind (e.g., brain signals), and other types of human-related
information that can be translated into the CPS (Lv et al. 2017; Feng et al. 2016). HiLCPS
therefore consider the human inside the closed loop, and human responses can change the
behavior of the whole system. A synthesized definition of HiLCPS is illustrated in Figure 1.


Table 1. Summary of areas and sources for the reviewed papers.


Area (no. of papers reviewed) | Source | Reference
Automobiles Ind. (1) | Int. J. of Commercial Vehicles (SAE) | Lv et al. 2017
Civil Engineering (2) | J. of Automation in Construction (Elsevier) | Yuan et al. 2016; Agnisarman et al. 2019
Energy and Environment (3) | J. of Transactions on Human-Machine Systems (IEEE) | Lu 2018
 | J. of Transactions on Emerging Topics in Computing (IEEE) | Munir et al. 2014
 | J. of Applied Energy (Elsevier) | Wang et al. 2019
Information Technology (2) | J. of Proceedings of the IEEE | Cheng et al. 2018
 | Int. Conf. on Intelligent Data Acquisition and Advanced Computing Systems (IEEE) | Halcu et al. 2015
Manufacturing Ind. (9) | Int. Conf. on Research and Tech. for Society and Ind. (IEEE) | Assunta et al. 2017
 | Int. Conf. on System-Integrated Intelligence (Elsevier) | Zamfirescu et al. 2014
 | Int. Conf. on Cyber-Physical Systems, Networks, and Applications (IEEE) | Scheuermann et al. 2015
 | Int. Conf. on PErvasive Technologies Related to Assistive Environments (ACM Digital Library) | Tsiakas et al. 2017
 | J. of Robotics and Computer-Integrated Manufacturing (Elsevier) | Nikolakis et al. 2019
 | Int. Federation of Automatic Control (Elsevier) | Birtel et al. 2018
 | J. of Manufacturing Systems (Elsevier) | Yao et al. 2018
 | J. of Computers and Industrial Engineering (Elsevier) | Fantini et al. 2018
 | Int. Federation of Automatic Control (Elsevier) | Czerniak et al. 2017
Maritime Transp. (1) | J. of IEEE Access | Zhang et al. 2018
Medical Sci. (1) | Int. Sym. on Computer-Based Medical Systems (IEEE) | Fu et al. 2017
Nuclear Ind. (2) | Canadian Conf. on Electrical and Computer Engineering (IEEE) | Singh and Mahmoud 2017
 | Int. J. of Industrial Ergonomics (Elsevier) | Fan et al. 2018
Signal/Data Processing (3) | Int. Conf. on Telecommunications and Signal Processing (IEEE) | Nikolov et al. 2018
 | J. of Signal Processing Systems (Springer) | Ma et al. 2018
 | Int. Conf. on Bioinformatics and Bioengineering (IEEE) | Feng et al. 2016


Progress and Identified Taxonomies


In this section, the selected publications are reviewed and categorized based on the human
roles in HiLCPS. Different HiLCPS configurations exist in systems where the human plays one
or more roles. The human can feed inputs into the system (e.g., a driving-mode-aware controller),
and the system can regulate itself based on those inputs (e.g., a sport driving mode inferred
from driving behavior) (Lv et al. 2017). Another example of human input integration is systems
where human activities are monitored via tools (such as electroencephalography recording of
brain signals) and sent to the cyber world, which processes the data using related
software or algorithms (Feng et al. 2016). The human can also be the operator who implements the
actuation in a system; for instance, a human interacts with a smartphone to receive notifications
about the next procedures to follow and guided diagnostics to perform (Fantini et al. 2018). On the
other hand, there are systems in which the actuation process is triggered and implemented
automatically by tools other than humans. For example, an energy controller was triggered
automatically to prevent energy waste based on system activities and user behaviors
(Munir et al. 2014). There are also situations in which the human sits between the cyber and
physical worlds to enhance their interaction; for instance, fishing vessels’ trajectories were
detected, analyzed, and displayed to a human, enabling decisions about the vessels’ future
behaviors (Zhang et al. 2018).

Figure 1. Illustration of the HiLCPS.

Figure 2. Illustration of HiLCPS Configuration One.


According to the aforementioned findings, three main taxonomies of HiLCPS are generated
based on the existing research:
 HiLCPS Configuration One (Figure 2) where human inputs are integrated into the cyber
world and human is the operator or involved in the actuation process of the system.


 HiLCPS Configuration Two (Figure 3) where human inputs are integrated into the cyber
world, but the actuation is triggered and implemented automatically by the tools other
than human.
 HiLCPS Configuration Three (Figure 4) where human is the operator and only involved
in the actuation process of the system.
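The three configurations differ only in where human input enters the loop and in who carries out the actuation. The shared closed-loop skeleton can be sketched as follows (an illustrative toy in Python, not drawn from any of the reviewed systems):

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class HiLCPSLoop:
    """One sense-analyze-actuate cycle of a human-in-the-loop CPS."""
    sense: Callable[[], Dict]        # physical world -> measurements
    analyze: Callable[[Dict], str]   # cyber world -> decision
    actuate: Callable[[str], str]    # decision -> action on the physical world

def run_cycle(loop: HiLCPSLoop, human_input: Optional[str] = None) -> str:
    data = loop.sense()
    if human_input is not None:
        # Configurations One and Two: human input enters the cyber world.
        data["human"] = human_input
    decision = loop.analyze(data)
    # Configurations One and Three: `actuate` is performed by the human
    # operator; in Configuration Two it is an automatic controller.
    return loop.actuate(decision)

# Toy Configuration Two example: an automatic energy controller.
loop = HiLCPSLoop(
    sense=lambda: {"occupied": True, "temp_c": 27.0},
    analyze=lambda d: "cool" if d["occupied"] and d["temp_c"] > 25.0 else "idle",
    actuate=lambda decision: f"controller -> {decision}",
)
print(run_cycle(loop))  # controller -> cool
```

Swapping the `actuate` callable between a human operator and an automatic controller, and choosing whether `human_input` is supplied, reproduces the distinctions among the three configurations.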

Figure 3. Illustration of HiLCPS Configuration Two.


Taxonomy One: HiLCPS Configuration One
The above-described HiLCPS Configuration One is illustrated in Figure 2. One of the most
common instances of Configuration One is human-robot collaboration, mostly in the
manufacturing industry. For instance, human behaviors were monitored through multimodal
sensing in factories, and humans constantly collaborated with robots to produce goods (Tsiakas et
al. 2017). Today, technologies are helping people become more involved in HiLCPS. Smartphone
use is part of almost everyone’s life; using smartphone-embedded sensors (such as the
accelerometer and GPS) and customized mobile applications, the human can be integrated into
the control loop. For example, smartphones can monitor human emotions and help a user
join nearby friends when feeling lonely or depressed (Halcu et al. 2015).
Human-robot collaboration is a real-time activity involving many variables, such as the
varied and highly uncertain patterns of workers’ behaviors; robust models that precisely predict
human activities are therefore needed (Tsiakas et al. 2017). In addition, human interaction with
smart devices to receive instructions and notifications occurs in real time and in an iterative
manner, and the performance of the human-smart device system relies on these interactions.
Thus, highly secure communication platforms are needed to transfer the information
(Scheuermann et al. 2015; Halcu et al. 2015). Accordingly, two prominent points regarding
HiLCPS Configuration One are identified for future studies to address:
 Developing more reliable models to represent and analyze human activities and behaviors;
 Ensuring that human operators receive correct information (e.g., notifications or
instructions) when using smart devices or collaborating with machines to produce or
assemble products.

Taxonomy Two: HiLCPS Configuration Two


HiLCPS Configuration Two is illustrated in Figure 3. As discussed earlier, there are some
cases in which human activities are monitored via tools and sent to the cyber world for
processing. The system analyzes and infers human activities and implements the actuation (Feng
et al. 2016; Munir et al. 2014). For example, robots can take safety protections and
countermeasures while humans constantly collaborate with them. Robots can detect workers
via embedded sensors and automatically prevent crashes with humans by using controllers such
as a hardwired safety stop; if needed, robots can also send alerts about potential crash risks to
workers’ mobile phones (Nikolakis et al. 2019). Moreover, the use of automated
systems is promising for enhancing human safety (Agnisarman et al. 2019).

Figure 4. Illustration of HiLCPS Configuration Three.


Very little work has been done on HiLCPS Configuration Two, which should draw
researchers’ attention to HiLCPS where actuation is triggered and implemented automatically by
tools other than humans. One example is autonomous vehicles, which would generate less
environmental pollution and provide easier transportation. However, the associated challenges
increase with the level of automation (e.g., full automation, level five of
autonomous driving). For instance, failures in the automation system might occur, or the
way the system responds to complicated environments with very limited sensory data may be
uncertain and unreliable (Fisher et al. 2016; Li et al. 2016). The features of autonomous
systems can also be applied to the construction industry to improve workers’ safety. For example,
workers on foot and equipment working interactively in dynamic construction environments
can benefit from automated systems that prevent collisions in a timely manner. Thus, one
potential future research focus is HiLCPS with more autonomy in the actuation process.

Taxonomy Three: HiLCPS Configuration Three


The last taxonomy is dedicated to HiLCPS Configuration Three, illustrated in Figure 4.
Different from Configuration Two, the human is integrated into the actuation process to implement
changes or controls in the physical world. One benefit of such integration for construction safety
applications is that safety managers can be proactively notified before a hazardous situation
happens. For example, a mobile application was developed for monitoring temporary structures
and notifying safety inspectors of detected risks to improve site safety (Yuan et al. 2016).
Fan et al. (2018) designed a model to simulate the effects of human/operator errors on a system
by displaying false errors to the human. In the medical field, a mathematical model was designed
to improve system safety by incorporating human-system involvement assumptions and to
prevent humans from making errors (Fu et al. 2017). Human confusion could have
catastrophic consequences, and HiLCPS in domains such as nuclear plants and medical science
are sensitive to human operator involvement (Fan et al. 2018; Fu et al. 2017).
Another benefit is that managers can plan ahead more efficiently by monitoring the real-time
activities of a system in fields such as maritime transportation (Zhang et al. 2018).
However, further investigations are required to develop more in-depth
HiLCPS knowledge to most efficiently integrate the human into the actuation process of the system.


Moreover, the effectiveness of human-system interaction is another challenge and area of future
work. In manufacturing, for example, the interaction between humans and machines to control
complicated tasks is difficult because most tasks are accomplished automatically by machines.
Also, technical problems such as poor communication or weak batteries in portable smart
devices remain to be solved to ensure effective human-system interaction
(Wittenberg 2016).

CHALLENGES AND OPPORTUNITIES


Based on the three HiLCPS configurations developed above, two primary challenges and
three opportunities in integrating humans into HiLCPS are identified from the existing
studies:
Challenges: (i) It is challenging to completely predict human behaviors with current methods
and technologies, which adds unpredictable conditions into systems (Munir et al. 2014). (ii)
Components (cyber, physical, and human) in HiLCPS are heterogeneous, and challenges arise
when expertise across domains is integrated into one system.
Opportunities: (i) There is a need for improving human behavioral models, which can
improve safety and reliability in human-robot collaborations. (ii) More interdisciplinary
collaborations are required for HiLCPS development due to the complicated and dynamic nature
of HiLCPS. (iii) Integration of human factors into the smart systems requires comprehensive and
repetitive learning processes (Ma et al. 2018). The expected outcome of this opportunity is that
new systems can be designed to help human make more reliable decisions (Agnisarman et al.
2019).

CONCLUSIONS
Nowadays, humans and technologies are being integrated more intensively and closely than
ever before. Thus, integrating the human context into CPS, instead of placing it outside the system
boundary, is becoming increasingly important. The need to integrate humans into CPS and the
benefits of having a full understanding of the HiLCPS approach from the human perspective
were identified in this study. One primary challenge for this work was the limited literature
available for inclusion. Given this challenge, a total of 24 publications were identified and
systematically reviewed, and three configurations of HiLCPS were developed to explain human
roles in the systems. The review results indicate gaps to be filled in specifying how and where to
place the human in closed-loop systems. The needs for comprehensive human behavior modeling
and for more autonomous actuation were also highlighted.
A full understanding of the HiLCPS approach from the human perspective with the associated
progress and challenges enables researchers in different disciplines to develop the HiLCPS more
scientifically to meet their needs. The future work of this study is to investigate the data
collection tools and analysis methods used in the HiLCPS to further discover their usability and
performance for smart and integrated systems.

REFERENCES
Agnisarman, S., Lopes, S., Chalil Madathil, K., Piratla, K., and Gramopadhye, A. (2019). “A
survey of automation-enabled human-in-the-loop systems for infrastructure visual
inspection.” Automation in Construction, Elsevier, 97(April 2018), 52–76.
Assunta, C., Guido, G., Silvestro, V., and Giusy, V. (2017). “Man-CPS interaction: An


experimental assessment of the human behavior evolution.” RTSI 2017 - IEEE 3rd
International Forum on Research and Technologies for Society and Industry, Conference
Proceedings.
Birtel, M., Mohr, F., Hermann, J., Bertram, P., and Ruskowski, M. (2018). “Requirements for a
Human-Centered Condition Monitoring in Modular Production Environments.” IFAC-
PapersOnLine, Elsevier B.V., 51(11), 909–914.
Cheng, C. C., Hsiu, P. C., Hu, T. K., and Kuo, T. W. (2018). “Oasis: A Mobile Cyber-Physical
System for Accessible Location Exploration.” Proceedings of the IEEE, 106(9), 1744–1759.
Czerniak, J. N., Brandl, C., and Mertens, A. (2017). “Designing human-machine interaction
concepts for machine tool controls regarding ergonomic requirements.” IFAC-PapersOnLine,
Elsevier B.V., 50(1), 1378–1383.
Fan, C. F., Chan, C. C., Yu, H. Y., and Yih, S. (2018). “A simulation platform for human-
machine interaction safety analysis of cyber-physical systems.” International Journal of
Industrial Ergonomics, Elsevier, 68(June), 89–100.
Fantini, P., Pinzone, M., and Taisch, M. (2018). “Placing the operator at the centre of Industry
4.0 design: Modelling and assessing human activities within cyber-physical systems.”
Computers and Industrial Engineering, Elsevier, (xxxx), 0–1.
Feng, S., Quivira, F., and Schirner, G. (2016). “Framework for Rapid Development of Embedded
Human-in-The-Loop Cyber-Physical Systems.” Proceedings - 2016 IEEE 16th International
Conference on Bioinformatics and Bioengineering, BIBE 2016, 208–215.
Fisher, D. L., Lohrenz, M., Moore, D., Nadler, E. D., and Pollard, J. K. (2016). “Humans and
Intelligent Vehicles: The Hope, the Help, and the Harm.” IEEE Transactions on Intelligent
Vehicles, IEEE, 1(1), 56–67.
Fu, Z., Guo, C., Ren, S., Ou, Y., and Sha, L. (2017). “Modeling and Integrating Human
Interaction Assumptions in Medical Cyber-Physical System Design.” Proceedings - IEEE
Symposium on Computer-Based Medical Systems, 2017–June, 373–378.
Halcu, I., Nunes, D., Sgarciu, V., and Silva, J. S. (2015). “New mechanisms for privacy in
human-in-the-loop cyber-physical systems.” Proceedings of the 2015 IEEE 8th International
Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology
and Applications, IDAACS 2015, 1(September), 418–423.
Li, X., Sun, Z., Cao, D., He, Z., and Zhu, Q. (2016). “Real-time trajectory planning for
autonomous urban driving: Framework, algorithms, and verifications.” IEEE/ASME
Transactions on Mechatronics, IEEE, 21(2), 740–753.
Lu, C. H. (2018). “IoT-enabled adaptive context-aware and playful cyber-physical system for
everyday energy savings.” IEEE Transactions on Human-Machine Systems, IEEE, 48(4),
380–391.
Lv, C., Wang, H., Zhao, B., Cao, D., Huaji, W., Zhang, J., Li, Y., and Yuan, Y. (2017). “Cyber-
Physical System Based Optimization Framework for Intelligent Powertrain Control.” SAE
International Journal of Commercial Vehicles, 10(1), 210–220.
Ma, M., Lin, W., Pan, D., Lin, Y., Wang, P., Zhou, Y., and Liang, X. (2018). “Data and Decision
Intelligence for Human-in-the-Loop Cyber-Physical Systems: Reference Model, Recent
Progresses and Challenges.” Journal of Signal Processing Systems, Journal of Signal
Processing Systems, 90(8–9), 1167–1178.
Munir, S., Stankovic, J. A., Liang, C. J. M., and Lin, S. (2014). “Reducing energy waste for
computers by human-in-the-loop control.” IEEE Transactions on Emerging Topics in
Computing, 2(4), 448–460.


Nikolakis, N., Maratos, V., and Makris, S. (2019). “A cyber physical system (CPS) approach for
safe human-robot collaboration in a shared workplace.” Robotics and Computer-Integrated
Manufacturing, Elsevier Ltd, 56(June 2017), 233–243.
Nikolov, P., Boumbarov, O., Manolova, A., Tonchev, K., and Poulkov, V. (2018). “Skeleton-
Based Human Activity Recognition by Spatio-Temporal Representation and Convolutional
Neural Networks with application to Cyber Physical Systems with Human in the Loop.”
2018 41st International Conference on Telecommunications and Signal Processing (TSP),
IEEE, 1–5.
Petnga, L., and Austin, M. (2016). “An ontological framework for knowledge modeling and
decision support in cyber-physical systems.” Advanced Engineering Informatics, Elsevier
Ltd, 30(1), 77–94.
Scheuermann, C., Verclas, S., and Bruegge, B. (2015). “Agile Factory-An Example of an
Industry 4.0 Manufacturing Process.” Proceedings - 3rd IEEE International Conference on
Cyber-Physical Systems, Networks, and Applications, CPSNA 2015, 2008, 43–47.
Singh, H. V. P., and Mahmoud, Q. H. (2017). “EYE-on-HMI: A Framework for monitoring
human machine interfaces in control rooms.” 2017 IEEE 30th Canadian Conference on
Electrical and Computer Engineering (CCECE), 1–5.
Tsiakas, K., Papakostas, M., Theofanidis, M., Bell, M., Mihalcea, R., Wang, S., Burzo, M., and
Makedon, F. (2017). “An Interactive Multisensing Framework for Personalized Human
Robot Collaboration and Assistive Training Using Reinforcement Learning.” Proceedings of
the 10th International Conference on PErvasive Technologies Related to Assistive
Environments - PETRA ’17, 423–427.
Wang, W., Hong, T., Li, N., Wang, R. Q., and Chen, J. (2019). “Linking energy-cyber-physical
systems with occupancy prediction and interpretation through WiFi probe-based ensemble
classification.” Applied Energy, 236(August 2018), 55–69.
Wittenberg, C. (2016). “Human-CPS Interaction - requirements and human-machine interaction
methods for the Industry 4.0.” IFAC-PapersOnLine, Elsevier B.V., 49(19), 420–425.
Yao, B., Zhou, Z., Wang, L., Xu, W., Yan, J., and Liu, Q. (2018). “A function block based
cyber-physical production system for physical human–robot interaction.” Journal of
Manufacturing Systems, Elsevier, (March), 0–1.
Yuan, X., Anumba, C. J., and Parfitt, M. K. (2016). “Cyber-physical systems for temporary
structure monitoring.” Automation in Construction, Elsevier B.V., 66, 1–14.
Zamfirescu, C.-B., Pirvu, B.-C., Gorecky, D., and Chakravarthy, H. (2014). “Human-centred
Assembly: A Case Study for an Anthropocentric Cyber-physical System.” Procedia
Technology, Elsevier B.V., 15, 90–98.
Zhang, J., Geng, J., Wan, J., Zhang, Y., Li, M., Wang, J., and Xiong, N. N. (2018). “An
Automatically Learning and Discovering Human Fishing Behaviors Scheme for CPSCN.”
IEEE Access, 6, 19844–19858.
Zhu, Q., Sangiovanni-Vincentelli, A., Hu, S., and Li, X. (2018). “Design Automation for Cyber-
Physical Systems [Scanning the Issue].” Proceedings of the IEEE, IEEE, 106(9), 1479–1483.


A Deep Learning Framework for Construction Equipment Activity Analysis


Carlos Hernandez1; Trevor Slaton2; Vahid Balali, Ph.D., A.M.ASCE3;
and Reza Akhavian, Ph.D., M.ASCE4
1Graduate Student, School of Engineering, California State Univ. East Bay, 25800 Carlos Bee Blvd., Hayward, CA 94542. E-mail: [email protected]
2Graduate Student, Dept. of Mathematics, California State Univ. East Bay, 25800 Carlos Bee Blvd., Hayward, CA 94542. E-mail: [email protected]
3Assistant Professor, Dept. of Civil Engineering and Construction Engineering Management, California State Univ. Long Beach, 1250 Bellflower Blvd., Long Beach, CA 90840. E-mail: [email protected]
4Assistant Professor, School of Engineering, California State Univ. East Bay, 25800 Carlos Bee Blvd., Hayward, CA 94542. E-mail: [email protected]

ABSTRACT
Systematic evaluation of construction equipment activities is essential to efficient
management of the fleet. Recent research has made significant progress in developing machine
learning activity recognition frameworks. Deep learning in particular can circumvent the need
for complex manually-designed feature extraction/selection procedures that contribute to lower
accuracies in traditional and shallow models. The research presented in this paper develops and
compares deep learning algorithms for construction equipment activity recognition in different
levels of detail. Data are collected in a non-controlled environment from real-world activities. A
convolutional neural network (CNN) called BaselineCNN and a hybrid network that contains
both convolutional and recurrent long short-term memory (LSTM) layers called
DeepConvLSTM are studied. In summary, DeepConvLSTM proved superior to BaselineCNN.
In a six-class identification task, DeepConvLSTM achieved a validation accuracy of 77.1%. In
the vibration-setting-only subproblem, DeepConvLSTM achieved a validation accuracy of
75.2%. In the direction-only subproblem, DeepConvLSTM achieved a validation accuracy of
96.2%.

INTRODUCTION
The overall outcome of a construction project is often the sum of many smaller components.
The activities of individual workers and individual pieces of equipment effectively govern how
quantities such as productivity, sustainability, and safety develop across the project as a whole.
Thus, automated monitoring of a project’s character and progress can be achieved by tracking
the individual contributors working on it. It is widely acknowledged that the construction
industry lags behind other sectors in terms of productivity. While retail and manufacturing have
evolved along with the advent of new technologies to reach an average of 3.6% productivity
growth annually, this metric amounts to only 1% in construction (McKinsey Global Institute
2017). Additional statistics show that the manufacturing industry’s productivity rate of 88% is
more than double the 43% seen in construction (Lean Construction Institute 2004). Considering
this underperformance, many researchers have investigated means of enhancing productivity
such as automated progress monitoring and resource tracking at the levels of individual workers
and individual pieces of equipment (Cheng, et al. 2013, Akhavian and Behzadan 2016).
A key factor in productivity management that is often overlooked is that the productivity rate
varies during different phases of construction (Cheng, et al. 2017). Wideman (1994) indicated
that construction workforce productivity is slow during the early stages of a project’s execution
and grows as a project progresses. Furthermore, different productivity rates have been observed
across different construction sectors. According to the Bureau of Labor Statistics’ 2018 report,
the productivity growth index of the heavy construction sector was the lowest in comparison
with construction in single-family, multi-family, and industrial areas. Productivity during the
initial stages of a construction project is dependent mainly on the productivity of tasks involving
heavy construction machines. Consequently, systematic measurement and analysis of such
equipment’s operation is essential for productivity improvement (Gong and Caldas 2011). In
addition to offering insights about productivity, monitoring and analyzing construction
equipment activities can offer many other benefits. For example, reducing the time that heavy
equipment spends doing non-value-adding activities minimizes the environmental impact of
running such machinery. Although newer equipment, well-maintained equipment, and clean fuels can all improve exhaust emissions, reducing machines' idling times and
enhancing their operating efficiencies achieves better results (Ahn, Lee and Peña-Mora 2011,
Akhavian and Behzadan 2013).
In this paper a novel methodology is introduced that leverages wireless sensors and deep
learning algorithms to analyze the activities of construction equipment performing various tasks.
Developing and validating an accurate activity recognition framework is a first step toward
building a system that reliably monitors productivity and predicts greenhouse gas (GHG)
emissions. The point of departure from previous work in this area is the use of deep neural
network architectures. Using deep learning techniques, it is expected that higher accuracies can
be achieved, with less manual effort spent in system design and feature selection. Heavy
equipment generates vibration patterns while performing certain tasks that can be picked up by
accelerometers. Thus, processing readings from accelerometer sensors attached to equipment can
result in highly accurate, unintrusive activity recognition systems (Akhavian and Behzadan
2015). The framework proposed in this research consists of a series of steps including collecting
data, processing data, segmenting data using sliding windows, and classifying the activities at
each time step in the data using deep learning.

RESEARCH BACKGROUND
Sensing Approaches for Activity Recognition: Recent methods of automated construction
machinery tracking and activity recognition primarily involve adopting two different sensing
approaches. The first approach involves using sensing devices such as individual accelerometers
(Joshua and Varghese 2011) and inertial measurement units (IMUs) (Akhavian and Behzadan
2015). The second methodology uses computer vision techniques and therefore requires cameras,
either fixed or non-stationary (Golparvar-Fard, Heydarian and Niebles 2013). Even though both
approaches offer certain benefits, those that rely on cameras may miss activities that are out of
sight for any potential reason in the hectic, congested environments of construction jobsites.
Another innovative approach for measuring construction equipment performance analyzes audio
signals produced by working construction machinery (David 2003, Cheng, et al. 2017). This
third approach partially overcomes the line-of-sight and illumination issues facing visual sensing
methods. However, audio-based sensing still has limitations. For one, it is restricted to
recognizing discrete sound patterns. Obstructions can still distort the signal beyond recognition,
and the noise endemic to construction environments can drown it out entirely. Accelerometer-
based sensing offers a reasonable alternative that is also affordable and unintrusive. As such, it
has recently gained traction in monitoring the activities of construction equipment and
construction workers. For instance, Joshua & Varghese (2011) analyzed the performance and
productivity of mason workers wearing IMUs attached to their waists. Ahn et al. (2013)
developed a framework to distinguish among three main activities of an excavator, namely
working, idling, and engine-off. In another set of studies, IMUs built into smartphones were used
to detect the activities of construction equipment (Akhavian and Behzadan 2015) and of
construction workers (Akhavian and Behzadan 2016) performing complex jobs involving up to
five different tasks. All of these studies used kinematic sensors placed on the subject’s body. A
similar approach is adopted in the presented research, but improved accuracies were obtained
due to the differences outlined in the next section.

Figure 1. (a) Data collection station with the 1) sensors, 2) receiver antenna, and 3) webcam
for synchronous video and (b) sensor installation on the equipment body
Machine Learning Approaches for Activity Recognition: Deep learning is a subset of
machine learning methodologies in which multiple layers or stages of nonlinear information
processing are performed for classification and pattern analysis (Deng and Yu 2014). Recent
research has benefitted from this relatively new concept for activity recognition as well. For
example, Yang et al. (2015) used deep convolutional neural networks (CNN) to recognize the
activities of human subjects wearing inertial sensors. Ordóñez and Roggen (2016) used long
short-term memory (LSTM) recurrent units for human activity recognition using multimodal
wearable sensors. A notable advantage of deep learning-based activity recognition frameworks
that distinguishes them from previous work is the automation of the “feature extraction” step,
which otherwise requires a heuristic design (Wang, et al. 2018). This is a huge improvement
since feature engineering can be complex and time-consuming; furthermore, it tends to bake in
the biases of the human who selects the features, which can limit models’ accuracies. The
dynamic environments of construction jobsites add an extra complexity to the task of deducing
equipment’s activities. To the best of the authors’ knowledge, this is the first study that leverages
the power of deep learning to detect multiple complex activities of heavy machinery.

EXPERIMENTS
In order to ensure the practicality of the developed framework, it was applied to real
construction equipment performing real work. The operator was asked to resume a commercial
building project’s tasks as scheduled, meaning that the experiments were not conducted in a
controlled environment. The equipment studied was a BOMAG BW 145PDH-3 Single Drum
Vibratory Roller performing activities related to landscaping and soil preparation, as well as
traveling on a paved driveway. All of the experiments were video recorded. Two MyoMotion
684 sensors by Noraxon were attached to articulated parts of the equipment body (Noraxon USA
2018). A signal receiver antenna connected to a laptop on the jobsite was used to log data in real-
time. The software module included with the sensors kit was used for data preprocessing such as
automated synchronization between the sensor data and video recordings and manual labeling of
the activities. Figure 1 shows the data collection station with the laptop and the receiver. One of
the two sensors for this experiment was attached on the cabin dashboard close to the steering
wheel; the other was attached on the roller’s support arm as shown in Figure 1.

METHODOLOGY
Overview: Ordóñez and Roggen (2016) managed to achieve state-of-the-art results on the
problem of human gesture recognition using a novel neural network architecture called
DeepConvLSTM after training it on multimodal sensor data. The task of predicting construction
equipment activity from accelerometer readings presented here, however, requires a meticulous
understanding of the neural network architecture due to the different nature of the data and
movement patterns. The results indicate that the success of combining convolutional layers with
long short-term memory (LSTM) layers is not limited to the domain of human gesture
recognition; it translates to equipment activity recognition as well.
Data: While the roller performed its activities, readings were sampled from two 3-axis
accelerometers mounted at two different locations on the roller at a rate of 100 Hz. These
activities generated six channels of 116,536 sensor readings over a period of 20 minutes.
Data processing: The data were then split into two disjoint subsets: training and validation.
The first 92,728 contiguous samples were used for training, and the remaining contiguous
samples formed the validation set. This split was chosen so as to maximize the similarity
between the activity label distributions of the validation set and the activity label distributions of
the training set while maintaining the correlations in time between adjacent samples that are
critical to the problem. Additionally, a small number of samples were dropped from the extreme
ends of the data sets to exclude the activities Idle and Off from further consideration. Previous
research demonstrates that detection of these two modes can be done with almost 100%
accuracy. The first 1,040 samples were dropped, as were the last 8,017 samples. In addition to
the full problem with six activity classes presented above, the subproblem of distinguishing
forward motion from backward motion and the subproblem of distinguishing activities related to
the three vibration modes were studied separately by combining the class labels as appropriate in
two more sets of training and validation data, derived from the sets with all the labels.
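The contiguous split and trimming described above can be sketched in a few lines of NumPy (a minimal illustration; the function name is ours, the sample counts are taken from the text, and trimming the head of the training set and the tail of the validation set is our reading of "dropped from the extreme ends of the data sets"):

```python
import numpy as np

def split_and_trim(data, labels, n_train=92728, head=1040, tail=8017):
    """Contiguous train/validation split that preserves temporal order.

    The first `head` and last `tail` samples are dropped to exclude the
    Idle and Off modes; the first `n_train` samples of the full series
    form the training set and the remainder forms the validation set.
    """
    X_train, y_train = data[head:n_train], labels[head:n_train]
    X_val, y_val = data[n_train:-tail], labels[n_train:-tail]
    return (X_train, y_train), (X_val, y_val)
```

Because the split is contiguous rather than shuffled, the correlations in time between adjacent samples are preserved in both subsets.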
In order to learn patterns of significant predictive value, it is critical that the model be able to
analyze the sensor reading at each time step within a larger context of other sensor readings. To
facilitate this, a sliding window length of 200 samples, with a step size of 1 sample, was used to
segment the data into overlapping frames corresponding to 2 seconds of activity each. The
activity label at the last sample in each frame was used as the label of the frame. This setup
structures the problem as a task of predicting the activity label at each time step in the data
series, given the 199 most recent readings prior to the time-step of interest. Smaller window sizes
could be useful in real-time monitoring applications where a lead-time of 2 seconds is considered
too slow; larger window sizes provide greater context to each data point at the cost of
computational complexity. For this problem, it was only necessary that the window size be large
enough to provide adequate context for each sample. During the sliding window segmentation
process, each sensor channel was normalized to the range [0, 1] and stacked horizontally so that
time is the vertical axis of each frame and the sensor channels are on the horizontal axis. The
segmentation process was applied independently to the training and validation subsets of the data
to avoid the pitfall of validation data leaking into the training data when the frames overlap at the
boundary. The validation data are only useful as a means of evaluation.
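The sliding-window segmentation above might be implemented as follows (a minimal sketch; the function name is ours, and per-channel min–max scaling is one plausible reading of the normalization step):

```python
import numpy as np

def sliding_windows(data, labels, window=200, step=1):
    """Segment a (samples, channels) series into overlapping frames."""
    # Normalize each sensor channel to [0, 1]; the epsilon guards
    # against division by zero on a constant channel.
    lo, hi = data.min(axis=0), data.max(axis=0)
    normed = (data - lo) / (hi - lo + 1e-12)
    frames, frame_labels = [], []
    for end in range(window, len(normed) + 1, step):
        frames.append(normed[end - window:end])  # time on the vertical axis
        frame_labels.append(labels[end - 1])     # label of the last sample
    return np.stack(frames), np.array(frame_labels)
```

With a window of 200 and a step of 1, each frame covers 2 seconds at 100 Hz and is labeled by its most recent sample, matching the prediction task described in the text.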
Models: Implementations were done using the Python library Keras, running on top of
TensorFlow. The models studied include a convolutional neural network (CNN) and a hybrid
network that contains both convolutional and recurrent LSTM layers. Significant differences
between this work and prior research in human activity recognition are that batch normalization
was added between each convolutional layer to speed up convergence and an additional dropout
layer (p=0.25) was added between the convolutional feature-extraction block and the recurrent
sequencing block, as an additional form of regularization. The architectures share many
characteristics so that the meaningful difference between them is the presence of the LSTM
layers in DeepConvLSTM. In BaselineCNN, these layers are substituted with standard fully-
connected layers. BaselineCNN begins with a succession of four convolutional layers, each
containing 64 filters of size (3, 1). The filters’ active dimension is vertical so that convolution is
only performed across the time axis in each frame. This ensures that the features derived from
each sensor channel in the horizontal axis of each frame are kept independent as they propagate
through the layers of the network. Although it is a common practice to employ max pooling after
each convolutional layer, doing so encourages models that are invariant to translations in the
input data. Since it is important that the model considers where patterns occur in the sequence of
sensor readings it sees in each frame, max pooling is omitted. The set of four convolutional
layers described thus far is considered the feature-extracting block of the network. After
extraction, these features are then fed into two fully-connected layers and finally a softmax
classifier that outputs a prediction score for each class. The highest score is taken to be the
network’s prediction. DeepConvLSTM is a hybrid model that takes the same feature extracting
block used in the baseline network but follows it with a series of two LSTM layers instead of two
fully-connected layers. Each of the fully-connected layers and each of the LSTM layers is of size
128. Higher performance is expected from the LSTM layers. LSTMs are able to take advantage
of patterns in sequential data due to their memories. During training, a network inside the LSTM
learns to manage its finite memory by forgetting certain aspects of the data sequences it has
observed and holding onto others. Each output of the second LSTM is processed by the ensuing
softmax classifier, and the last value of the output sequence is taken as the overall prediction
since it corresponds to the output derived from a full set of observations of the input sequence.
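For illustration, the two architectures as described could be sketched in Keras roughly as follows (layer counts and sizes are taken from the text; the input shape, ReLU activations, valid padding, and the reshape feeding the LSTMs are our assumptions, not details confirmed by the paper):

```python
from tensorflow.keras import layers, models

WINDOW, CHANNELS, CLASSES = 200, 6, 6  # 2 s frames, two tri-axial sensors

def conv_block(x):
    # Four Conv2D layers of 64 (3, 1) filters: the active dimension is
    # time, so each sensor channel is convolved independently. Batch
    # normalization follows each convolution; max pooling is omitted.
    for _ in range(4):
        x = layers.Conv2D(64, (3, 1), activation="relu")(x)
        x = layers.BatchNormalization()(x)
    return x

def baseline_cnn():
    inp = layers.Input((WINDOW, CHANNELS, 1))
    x = layers.Flatten()(conv_block(inp))
    for _ in range(2):                      # two fully-connected layers
        x = layers.Dense(128, activation="relu")(x)
    out = layers.Dense(CLASSES, activation="softmax")(x)
    return models.Model(inp, out)

def deep_conv_lstm():
    inp = layers.Input((WINDOW, CHANNELS, 1))
    x = conv_block(inp)
    x = layers.Dropout(0.25)(x)             # extra regularization
    # Fold the channel and filter axes into one feature vector per step.
    x = layers.Reshape((x.shape[1], x.shape[2] * x.shape[3]))(x)
    x = layers.LSTM(128, return_sequences=True)(x)
    x = layers.LSTM(128)(x)                 # keeps only the last output
    out = layers.Dense(CLASSES, activation="softmax")(x)
    return models.Model(inp, out)
```

Under these assumptions, the only meaningful difference between the models is the recurrent block: BaselineCNN ends in two dense layers where DeepConvLSTM ends in two LSTM layers, mirroring the comparison in the text.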

RESULTS
Training: The model parameters were optimized over five epochs using batched gradient
descent with a batch size of 100 frames and the Adam optimizer with a learning rate of 0.001. In
order to combat exploding gradients inside the LSTM layers, gradient clipping was applied with
a maximum gradient norm of 1.0 and a maximum gradient value of 0.5. This technique leads to
smoother training curves. Model parameters were saved in checkpoints after each training epoch,
so the parameters that resulted in the highest validation accuracies were chosen for computing
additional performance metrics. In each of the tasks, both models were able to achieve very high
training accuracy, but this was deemed to be overfitting when it occurred at the expense of
validation accuracy. In particular, LSTMs are sometimes found to have the ability to memorize
the training data, so it is not surprising that DeepConvLSTM achieved nearly perfect training
accuracy. Its peak validation accuracy was also superior to that of BaselineCNN, however, so the
model selected had significant predictive value beyond mere memorization.
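The training setup described above might look roughly like this in Keras (a sketch under assumptions: the checkpoint path and loss function are ours, and since recent Keras versions accept only one gradient-clipping argument per optimizer, only the norm clip is shown here):

```python
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint

def compile_and_train(model, X_train, y_train, X_val, y_val):
    # Adam with a learning rate of 0.001; gradients clipped to a maximum
    # norm of 1.0. (The paper additionally caps gradient values at 0.5,
    # which the Keras version used at the time allowed alongside the
    # norm clip.)
    opt = Adam(learning_rate=0.001, clipnorm=1.0)
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # Checkpoint the weights after each epoch so the epoch with the best
    # validation accuracy can be selected afterwards.
    ckpt = ModelCheckpoint("epoch-{epoch:02d}.weights.h5",
                           save_weights_only=True)
    return model.fit(X_train, y_train,
                     batch_size=100, epochs=5,
                     validation_data=(X_val, y_val),
                     callbacks=[ckpt])
```

The returned history object records the per-epoch training and validation accuracies used to pick the best checkpoint.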
Classification Overview: Both BaselineCNN and DeepConvLSTM were able to classify the
roller’s activities with reasonable accuracy, but DeepConvLSTM was the best performer. Both
models showed higher performance on the easier subproblem of predicting combined classes.
Full activity identification problem: For the six-class category, the BaselineCNN had a
validation accuracy of 74.2% while the DeepConvLSTM achieved a validation accuracy of
77.1%. Table 1 summarizes precision, recall, and F1 score for both models. Predictions for both
BaselineCNN and DeepConvLSTM are plotted against the ground truth labels on the full data set
in Figure 2. As predictions coinciding with the ground truth signal get covered by it, the amount
of visible spikes in the prediction signals is an indication of the degree to which they deviate
from the ground truth. Furthermore, the signals are plotted with a degree of transparency, so
darker lines indicate stronger signals. The predictions are not considered for the yellow shaded
regions in the graphs, which were excluded from training and validation. Overall, the
DeepConvLSTM predictions displayed in orange are a better match for the ground truth signal in
both the training and the validation regions than the BaselineCNN predictions displayed in green.

Table 1. Full Activity Metrics for BaselineCNN and DeepConvLSTM


Activity Label   Precision              Recall                 F1-Score
                 BaseCNN   DCLSTM       BaseCNN   DCLSTM       BaseCNN   DCLSTM
Fwd. High        0.73      0.81         0.77      0.73         0.75      0.77
Bwd. High        0.81      0.75         0.34      0.32         0.47      0.45
Fwd. Low         0.65      0.72         0.67      0.80         0.66      0.76
Bwd. Low         0.76      0.75         0.91      0.93         0.83      0.83
Fwd. Off         0.87      0.80         0.72      0.90         0.79      0.85
Bwd. Off         0.69      0.86         0.99      0.86         0.82      0.86
Average          0.75      0.78         0.73      0.76         0.72      0.75
(BaseCNN = BaselineCNN; DCLSTM = DeepConvLSTM)
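Per-class metrics like those in Table 1 can be computed from a model's predictions with scikit-learn (a generic sketch on toy integer-encoded labels, not the paper's data):

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical integer-encoded labels standing in for activity classes.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

# One set of per-class scores per activity (one row of Table 1 each).
prec, rec, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=[0, 1, 2], zero_division=0)

# The macro averages correspond to the "Average" row.
macro_prec, macro_rec, macro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
```

Macro averaging weights each class equally regardless of how many frames it covers, which matters here because the activity classes are not balanced over the 20-minute recording.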

Direction-only subproblem: In this problem, the possible activity labels were reduced to
just Forward and Backward. BaselineCNN achieved a validation accuracy of 93.6% and an
average F1 score of 0.94. DeepConvLSTM achieved a validation accuracy of 96.2% and an
average F1 score of 0.96.
Vibration-setting-only subproblem: In this problem, the possible activity labels were
reduced to just the vibration settings: High, Low, and Off. BaselineCNN achieved a validation
accuracy of 74.4% and an average F1 score of 0.75. DeepConvLSTM achieved a validation
accuracy of 75.2% and an average F1 score of 0.75.

CONCLUSION
The ability of the models to almost completely fit the training data suggests that they are
complex enough to handle the kinds of time series studied and that, given more training data,
they would be able to generalize very well. State-of-the-art results in human gesture recognition
suggest that combining sensors of different modalities such as accelerometers and gyroscopes
can result in a significant performance boost when compared to using only sensors of a single
modality. As the data studied herein contained only accelerometer readings, it is likely that this
could have been somewhat limiting to the models’ abilities to distinguish similar activities in the
full six-class and vibration activity problems. Although the BaselineCNN managed to achieve a
validation accuracy similar to that of DeepConvLSTM in the full problem, the alignment of their
predictions with the ground truth labels shown in Figure 2 highlights that DeepConvLSTM is the
superior model. Readings from wearable sensors in the human activity recognition domain often
feature some degree of peculiarity specific to the individual wearer. As construction equipment
has fewer behavioral degrees of freedom when performing its tasks, it seems reasonable to
conjecture that predictive models trained on a single piece of equipment or a small set of
machines would retain much of their predictive power when observing different individual
machines of the same type. This property would be critical to a model’s ability to enable
management decisions over a population of hardware. Future work will study how a model
trained on a particular machine can make predictions about the activities of other machines.

Figure 2. The predictions of both models vs. time compared to the ground truth data.
ACKNOWLEDGEMENT
The presented work is supported by California's Senate Bill 1 - California State
University Transportation Consortium (CSUTC). The authors gratefully acknowledge CSUTC
support.

REFERENCES
Ahn, C. R., Lee, S., & Peña-Mora, F. (2013). Application of Low-cost Accelerometers for
Measuring the Operational Efficiency of a Construction Equipment Fleet. J. Comput. Civ.
Eng., 04014042.
Ahn, C., Lee, S. H., & Peña-Mora, F. (2011). Carbon Emissions Quantification and Verification
Strategies for Large-scale Construction Projects. Proc. Int. Conf. on Sustainable Design and
Construction.
Akhavian, R., & Behzadan, A. H. (2013). Simulation-Based Evaluation of Fuel Consumption in
Heavy Construction Projects by Monitoring Equipment Idle Times. In Simulation
Conference (WSC), 3098-3108.
Akhavian, R., & Behzadan, A. H. (2015). Construction Equipment Activity Recognition for
Simulation Input Modeling Using Mobile Sensors and Machine Learning Classifiers.
Advanced Engineering Informatics, 867-877.
Akhavian, R., & Behzadan, A. H. (2016). Productivity Analysis of Construction Worker
Activities Using Smartphone Sensors. 16th Int. Conf. Comput. Civil Building Eng.
Cheng, C.-F., Rashidi, A., Davenport, M. A., & Anderson, D. V. (2017). Activity Analysis of
Construction Equipment Using Audio Signals and Support Vector Machines. Aut. in Const,
240-253.
Cheng, T., Teizer, J., Migliaccio, G. C., & Gatti, C. (2013). Automated Task-Level Activity
Analysis Through Fusion of Real Time Location Sensors and Worker's Thoracic Posture
Data. Aut. in Const., 29, 24-39.
David, G. (2003). Audio Signal Classification: History and Current Techniques. Technical
Report.
Deng, L., & Yu, D. (2014). Deep Learning: Methods and Applications. Foundations and Trends
in Signal Processing, 1-199. doi:10.1561/2000000039
Golparvar-Fard, M., Heydarian, A., & Niebles, J. C. (2013). Vision-Based Action Recognition of
Earthmoving Equipment Using Spatio-Temporal Features and Support Vector Machine
Classifiers. Advanced Engineering Informatics, 652-663.
Gong, J., & Caldas, C. H. (2011). An Object Recognition, Tracking, and Contextual Reasoning-
based Video Interpretation Method for Rapid Productivity Analysis of Construction
Operations. Elsevier, 20(8), 1211-1226. doi:10.1016/j.autcon.2011.05.005
Joshua, L., & Varghese, K. (2011). Accelerometer-based Activity Recognition in Construction.
J. Comput. Civ. Eng., 370-379. doi:10.1061/(ASCE)CP.1943-5487.0000097
Lean Construction Institute. (2004). What is Lean Construction? Retrieved from
http://www.leanuk.leanconstruction.org/whatis.htm
McKinsey Global Institute. (2017). Reinventing Construction: A Route to Higher Productivity.
Moore, W. (2012). Practical Telematics. Construction Equipment. Retrieved from
http://www.constructionequipment.com/practical/telematics
Noraxon USA. (2018). Retrieved November 15, 2018, from http://www.noraxon.com
Ordóñez, F., & Roggen, D. (2016). Deep Convolutional and LSTM Recurrent Neural Networks
for Multimodal Wearable Activity Recognition. Sensors, 16(1), 115.
Wang, J., Chen, Y., Hao, S., Peng, X., & Hu, L. (2018). Deep Learning for Sensor-Based
Activity Recognition: A Survey. Pattern Recognition Letters.
Wideman, R. M. (1994). A Pragmatic Approach to Using Resource Loading, Production, and
Learning Curves on Construction Projects. Canadian Journal of Civil Engineering, 21(6),
939-953. doi:10.1139/l94-100
Yang, J., Nguyen, M., San, P. P., Li, X., & Krishnaswamy, S. (2015). Deep Convolutional
Neural Networks on Multichannel Time Series for Human Activity Recognition. In IJCAI, 15,
3995-4001.


Seeding Strategies in Online Social Networks for Improving Information Dissemination of


Built Environment Disruptions in Disasters
Chao Fan1; Yucheng Jiang2; and Ali Mostafavi, Ph.D.3
1Ph.D. Student, Zachry Dept. of Civil Engineering, Texas A&M Univ., College Station, TX 77843-3136. E-mail: [email protected]
2B.S. Student, Dept. of Computer Science and Engineering, Texas A&M Univ., College Station, TX 77843-3136. E-mail: [email protected]
3Assistant Professor, Zachry Dept. of Civil Engineering, Texas A&M Univ., College Station, TX 77843-3136. E-mail: [email protected]

ABSTRACT
The objective of this study is to propose a seed-search algorithm and develop seeding
strategies in online social networks for disseminating credible situational information regarding
built environment disruptions in disasters. Rapid and extensive dissemination of credible
situational information is important for disaster preparedness, response, and recovery in
communities. Online social networks such as Twitter have become popular media sources among
the public to share information in disasters. Due to the directed relations and fragmentations in
networks, however, little is known about the ways of selecting starting nodes (a.k.a., seeds) in
order to broadly and rapidly spread information online. To address this gap, this study proposes a
computational approach, which is an integration of greedy algorithm and graph analysis, to
capture fragmentations, identify critical seeds, and develop seeding strategies to disseminate
information in online social networks. A case study of an infrastructure disruption (water release
from reservoirs in Houston) during 2017 Hurricane Harvey was used to illustrate the capabilities
of the proposed approach. The results indicate that seeding top 10 users is effective for
distributing information to more than 80% of nodes in a network. The findings inform about
strategies to better report, transmit, and gather situational information on social media, which can
further enhance situation awareness and community resilience in disasters.

INTRODUCTION
Built environment disruptions happen abruptly and evolve rapidly in disasters (Zhu &
Mostafavi, 2018). Massive and rapid spread of credible situational information during built
environment disruptions is significantly important to disaster response and community resilience
(Fan et al., 2018). Online social networks such as Twitter, Facebook, and Instagram are
emerging media to prompt information dissemination in extreme events. Generally, online social
networks are directed networks, which means the direction of sharing information from one node
to another is irreversible unless they are following each other (Kim & Hastak, 2018). The
irreversibility of users’ relations also leads to the fragmentations in online networks where a
node from one group cannot distribute its information to a node in another group. Hence, it is
impossible to disseminate credible information throughout the whole network by randomly
selecting starting nodes (a.k.a., seeds) if fragmentations exist in the network. To maximize the
speed and magnitude of spreading credible situational information, thus, identifying the
fragmentations and strategizing the seeds in online social networks is essential in built
environment disruptions.
To this end, existing studies have attempted to develop computational approaches and
seeding strategies in online social networks (Chin et al., 2018). Some studies focus on one-hop
targeting, in which the seeds are chosen by taking the highest in-degree nodes and randomly
selecting one of their neighbors (Kim et al., 2015). These studies hypothesized that the online
social network is a strongly connected component, in which every node can get access to the rest
of the nodes in the network (Shakya et al., 2017). However, as discussed earlier, in some cases,
online social networks are sparse, and the connections between different components are
irreversible. Hence, the seeds in one component may not be able to deliver the information
through the entire network. Another stream of research focuses on the friendship networks and
the structural properties by applying social contagion models (Aral & Dhillon, 2018; Sela
et al. 2015). These studies hypothesized that information and influence can spread through the
established friendship based on threshold values and did not consider the presence of
fragmentations in large-scale social networks. However, network fragmentation can make a
drastic impact on the paths of information propagation and the accessibility of online users.
Meanwhile, the fragmentation is also a result of the variance of interests among users in
networks. On social media, people tend to retweet others as a form of social endorsement and
common interests. Therefore, approaches to analyze the retweeting network of localized people
who are affected by disasters are needed, and the outcomes can provide reliable evidence to
develop seeding strategies for prompt situational information dissemination on social media.
This paper presents a framework built upon retweeting networks related to built environment
disruptions to identify the seeds by taking the percentage of reachable users, neighbors’
overlap, and depth of cascades into account. Through the proposed framework, we identified critical seeds that play a primary role in spreading situational information, examined the robustness of the proposed approach over a six-day disaster period, and investigated the evolution of the seed sets.

SEED-SEARCH ALGORITHM
The development of seeding strategies relies on the characteristics of the seeds and their
social networks. There are two characteristics that are defined by existing studies to evaluate the
capability of seeding strategies: (1) the number of reachable users; and (2) the size of the seed
sets (Aral & Dhillon, 2018). As such, we can convert the seeding strategy problem to a
constrained extreme value problem, which can be defined as follows:
Given a directed online social network G = (V, E), where V is the set of active users and E is
the set of edges, which represent the retweeting behaviors among the active users. Each node
has posted or retweeted at least one tweet about the disruption (water release from
reservoirs). The problem asks to find a seed set S for a given network G with minimum |S|
subject to |N| ≥ 0.8|V|, where N is the set of users reachable by the seeds in S. The problem of
searching seeds in a directed network is NP-hard (Ahmad & Ali, 2017), and the computational
cost grows exponentially with the size of the network. To reduce the computational complexity,
this study is only interested in the top seeds that together reach more than 80% of the active users.
To address this problem in online social networks, we adopted a greedy algorithm (Das,
2017) integrated with pre-defined criteria (see Algorithm 1). The search starts with the user who
has the highest in-degree. To take dissemination efficiency into account, the algorithm also
computes the depth of the cascades that a seed can access: if two seeds reach the same number
of users, the seed with the deeper cascade disseminates information less efficiently. We then
iterate the computational process to identify further seeds and accumulate the number of
reachable users. The iteration criterion is that the next seed

© ASCE
Computing in Civil Engineering 2019 489

should maximize the cumulative number of reachable users |N|, and users already reachable
from the chosen seeds cannot themselves be selected as seeds. This criterion ensures that the
reachable nodes of the seeds do not overlap, so the dissemination process converges efficiently.
Based on this criterion, we iterate until the cumulative number of reachable nodes is greater
than 80% of |V|.
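The greedy procedure above can be sketched as follows. This is an illustrative reconstruction, not the authors' Algorithm 1: the adjacency-dict representation and the convention that edges point in the direction of information flow (author to retweeter, so a seed's reach is the set of nodes reachable by following edges forward) are assumptions.

```python
from collections import deque

def reachable(adj, src):
    """BFS over directed edges; returns all nodes reachable from src
    (excluding src itself). Edges are assumed to point in the
    direction of information flow: author -> retweeter."""
    seen, queue = set(), deque([src])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v != src and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def greedy_seeds(adj, nodes, coverage=0.8):
    """Greedy seed-search sketch: repeatedly add the node with the
    largest marginal gain in newly reachable users until `coverage`
    of all active users is covered. Nodes already reachable from the
    chosen seeds are excluded as candidates, which keeps the seeds'
    reachable sets from overlapping."""
    reach = {v: reachable(adj, v) for v in nodes}
    seeds, covered = [], set()
    target = coverage * len(nodes)
    while len(covered) < target:
        candidates = [v for v in nodes if v not in covered and v not in seeds]
        best = max(candidates, key=lambda v: len(reach[v] - covered), default=None)
        if best is None or not (reach[best] - covered):
            break  # no remaining seed adds new coverage
        seeds.append(best)
        covered |= reach[best] | {best}
    return seeds, covered
```

On a toy network with two components, `greedy_seeds` first picks the node covering the larger component, then a node from the smaller one, mirroring the paper's criterion that each new seed maximizes the cumulative number of reachable users.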

CASE STUDY
To illustrate the capabilities of this algorithm for strategizing seeds in online social networks,
we conducted a case study of information dissemination on Twitter during water release from
reservoirs in the west side of Houston in 2017. In this study, we investigated the event associated
with Barker and Addicks reservoirs during Hurricane Harvey and their effects on nearby
neighborhoods and other localized users in Houston. Water release from reservoirs is a built
environment event that was induced by Hurricane Harvey. Due to the sudden heavy rainfall, the
water levels in both reservoirs reached their maximum capacity, which led to water release and
flooded some nearby neighborhoods. This event followed the damages of Hurricane Harvey and
lasted a couple of days (August 27th to September 1st) (Olsen, 2018).
To investigate the dissemination of situational information on Twitter, we collected all of the
tweets, around 21 million, over the Houston area from August 22nd to September 30th
(Fan et al., 2019). The rules for filtering the tweets included multiple bounding boxes in Houston
and user profiles with a location of Houston. Based on those rules, all of the collected tweets
were posted by localized users, making them reliable for analyzing the impacts of built
environment disruptions on residents. In this study, we extracted 6,991 tweets posted or shared
during the water release and relevant to this event. As shown in Figure 1, the directed social
networks map the posting and retweeting behaviors of online users on each day. Due to the
rapid changes in the built environment and its influence on the public, the retweeting networks
vary. For example, the networks on August 29th and August 30th are much denser than the other
networks; many more users were active during these two days (2,417 users on August 29th and
1,617 users on August 30th). That is because the reservoirs reached their maximum capacities
on these two days. More water was released from those


reservoirs, and it led to severe damage in the nearby neighborhoods. After that, the hurricane
passed and the rain stopped. Floods gradually receded from the neighborhoods on August 31st and
September 1st. In addition, August 27th was the first day Harris County started to release
water from the reservoirs. Therefore, some users have a large number of in-degrees due to
their distribution of first-hand information on August 27th.
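As a concrete illustration, daily retweeting networks like those in Figure 1 can be assembled from tweet records. This is a sketch under stated assumptions: the record fields ('date', 'user', 'retweet_of') and the author-to-retweeter edge direction are illustrative, not the authors' actual data schema.

```python
def build_daily_networks(tweets):
    """Build one directed retweet network (adjacency dict) per day.
    Each record is assumed to be a dict with 'date', 'user', and
    'retweet_of' (the original author, or None for an original post);
    these field names are hypothetical. An edge author -> retweeter
    records who amplified whom."""
    networks = {}
    for t in tweets:
        adj = networks.setdefault(t["date"], {})
        adj.setdefault(t["user"], set())  # every active user is a node
        if t.get("retweet_of"):
            adj.setdefault(t["retweet_of"], set()).add(t["user"])
    return networks
```

Counting nodes and edges of each day's adjacency dict then reproduces per-day statistics of the kind reported in Figure 1.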

[Figure 1 shows the six daily retweeting networks: Aug. 27th (936 nodes, 1,213 edges), Aug. 28th (568 nodes, 573 edges), Aug. 29th (2,417 nodes, 2,916 edges), Aug. 30th (1,617 nodes, 1,903 edges), Aug. 31st (692 nodes, 717 edges), and Sept. 1st (761 nodes, 860 edges).]

Figure 1. Retweeting networks among localized online users.


Once the retweeting networks were constructed, we applied the proposed seed-search
algorithm to those networks and obtained six sets of top seeds that can reach at least 80% of
the active users in these networks. Figure 2 shows the cumulative percentage of reachable users
in the six online social networks as the seed sets expand. Due to the directions of retweeting
behaviors, information starting from one user cannot be disseminated through the entire
network. As a result, the contributions of different seeds to disseminating information through a
network vary. As shown in Figure 2, the slopes of the cumulative reachable nodes in these six
days are steep for the first ten seeds and then flatten as more seeds are added. Therefore, to
minimize the effort of seeding online users, in each daily online social network, seeding the 10
top users to spread the situational information can reach most of the online users. In addition,
at the beginning and the end of the disruption, the top seed alone can reach more than 50% of
the users in the networks, while it can only reach around 30% of the users during the disruption.
During the built environment disruption, the top seed is not the only user that can gather and
report real-time situational information; many other users are active in posting and sharing
their observations on social media. Therefore, the top seed reaches fewer users but is still
important for credible information dissemination in online social networks during the built
environment disruption. For example, the local news media account with the most reachable
residential users on Twitter can be the top seed. Other accounts such as a meteorological
department, a police department, and journalists who have their own massive audiences also
play the role of top seeds in distributing situational information. Capitalizing on this finding,
members of the public who want to obtain real-time information in disasters only need to pay
close attention to these ten seeds rather than all possible online users. This practical suggestion
can save users much effort in gathering information in a time-sensitive context, and subsequently
improve their response efficiency in disasters.
To better distinguish the contributions of the seeds, we analyzed the relationship between the
cumulative percentage of reachable nodes and the number of reachable nodes of each seed. As
shown in Figure 3, the rank of the seeds is based on their contribution to the cumulative
percentage of reachable nodes in the networks. Note that, in some networks such as those of
August 27th, August 29th, and August 30th, some seeds with a large number of reachable nodes
fall outside the top ten. In particular, three users can each reach more than 600 users in the
social network of August 29th; however, except for the top seed, the others rank 15th and 28th,
respectively. That is because there is a significant overlap of reachable users among these three
seeds. When one seed reaches the overlapped group of users, the other seeds contribute less to
increasing the total number of reachable users in the network. In the context of built
environment disruptions, the number of users who can get access to situational information is
important for effective response actions. Identifying the overlap of reachable users among the
seeds helps select seeds that spread information broadly in online social networks. For example,
a residential user who wants to report road damage only needs to mention or send direct
messages to a few top seeds whose reachable users do not overlap much. This practical way of
reporting situations improves the efficiency of spreading information and also reduces
information redundancy within a few groups of users.
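One simple way to quantify the overlap just described is the Jaccard index between seeds' reachable-user sets. This is an illustrative sketch; the paper does not specify which overlap measure it uses.

```python
def reach_overlap(reach_sets):
    """Pairwise Jaccard overlap between seeds' reachable-user sets.
    `reach_sets` maps seed -> set of reachable users. A value near 1
    for a pair means seeding both adds little extra coverage; a value
    near 0 means their audiences are largely disjoint."""
    seeds = list(reach_sets)
    overlap = {}
    for i, a in enumerate(seeds):
        for b in seeds[i + 1:]:
            inter = len(reach_sets[a] & reach_sets[b])
            union = len(reach_sets[a] | reach_sets[b]) or 1  # avoid 0/0
            overlap[(a, b)] = inter / union
    return overlap
```

A user picking a few seeds to message could prefer pairs with low overlap, which is exactly the selection rule suggested above.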
[Figure 2 plots the cumulative percentage of reachable nodes (y-axis, 20%–100%) against the number of seeds (x-axis, 1–29) for each of the six days, Aug. 27th through Sept. 1st. The top seed alone reaches just over 50% of users at the beginning and end of the disruption (50.94% and 52.64%) but only around 30%–39% during it (29.56%–39.04%); the curves converge to between 78.37% and 95.45% as seeds are added.]

Figure 2. Cumulative percentage of reachable users in online social networks


In addition, Figure 3 also indicates that seeds with a great number of reachable users, such as
seeds 1, 15, and 28 on August 29th, do not connect to each other. Even though these seeds have
overlapping reachable users, they did not share or retweet information among each other. This
is a common social problem, "network fragmentation" (Chami et al., 2017), present in most of
the retweeting networks. The presence of fragmentation is a cause of disparities in situation
awareness and inconsistency in response actions. For example, one group of users may get
access to real-time weather information, while another group has information about the
inundation level in the reservoirs. Due to the lack of connections between these two groups,
users in different groups may have different understandings of the coming risks to their lives
and properties, and consequently may take different actions such as evacuating or staying at
home. Such disparities can induce severe losses for people who take the wrong actions. The
seed-search algorithm presented in this paper can support the identification of fragmentation in
online social networks and provide evidence to address this issue. For example, we can
encourage the seeds who have a similar number of reachable nodes in the networks to follow
each other and share situational information during built environment disruptions. Adding
connections among these seeds can not only reduce fragmentation in the networks, but also
improve the efficiency of information distribution.
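Fragmentation of the kind discussed above can be detected as the weakly connected components of the retweeting network: groups of users with no retweet ties between them at all, so seeds in different groups cannot relay information to each other. A minimal sketch, assuming the same adjacency-dict representation as before (edges from author to retweeter):

```python
def weak_components(adj, nodes):
    """Return the weakly connected components of a directed network.
    Each component is a set of users; more than one component means
    the network is fragmented."""
    # Build an undirected view of the directed retweet edges.
    und = {v: set() for v in nodes}
    for u, outs in adj.items():
        for v in outs:
            und.setdefault(u, set()).add(v)
            und.setdefault(v, set()).add(u)
    seen, comps = set(), []
    for v in nodes:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:  # iterative DFS over the undirected view
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(und.get(u, ()))
        seen |= comp
        comps.append(comp)
    return comps
```

Seeds falling in different components are exactly the candidates the text suggests encouraging to follow each other.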


[Figure 3 shows, for each day (Aug. 27th through Sept. 1st), the top seeds ranked by their contribution to the cumulative percentage of reachable nodes, plotted against their number of reachable users.]

Figure 3. Top seeds and their number of reachable users in online social networks
The depth of cascades is an important indicator to quantify the capability of a seed to reach
other users in the networks. Figure 4 shows the depths of the cascades of the top eight seeds in
each network. The top seeds clearly have deeper cascades than the others; the cascade depth of
the other seeds is only one in most cases. However, some seeds, such as seeds 3, 5, 7, and 8 on
August 27th, also have cascade depths of more than 4. A seed with a depth of 2 means that
information from the seed can spread through a cascade in two steps, and a seed with a depth
of 4 means the information can spread through a cascade in four steps. The more steps a seed
has, the more influential the seed is. Therefore, seeds with a depth of more than 4 are potential
disseminators who can broadly spread information in the networks. These results support the
implications discussed above. Thus, the findings here validate the reliability of the proposed
algorithm in identifying and strategizing seeds in online social networks for disseminating
credible situational information in disasters.
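The cascade depth just described can be computed as the number of BFS hops needed to exhaust a seed's reachable set. A minimal sketch, assuming an adjacency dict with edges pointing from a user to their retweeters:

```python
from collections import deque

def cascade_depth(adj, seed):
    """Depth of the cascade a seed can trigger: the maximum number of
    retweet hops from the seed to any reachable user. Depth 1 means
    only direct retweeters; depth 4 means information travels four
    hops before the cascade is exhausted."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:  # first visit gives the shortest hop count
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())
```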
[Figure 4 plots the cascade depth (y-axis, 0–6) of the top eight seeds in sequence (x-axis) for each day, Aug. 27th through Sept. 1st.]

Figure 4. Depth of cascades that the top seeds can reach

CONCLUDING REMARKS
This paper presents a computational approach to identify seeds and develop seeding
strategies in online social networks regarding built environment disruptions in disasters. The
results of a case study of water release from reservoirs during the 2017 Hurricane Harvey imply
three important seeding strategies: first, seeding the 10 top users to spread credible information
is efficient for reaching more than 80% of users in a network; second, identifying overlaps of
reachable nodes among the seeds can help users spread information more extensively; third,
the proposed approach can capture fragmentation in social networks and help users reduce the
fragmentation and spread information more efficiently.

The proposed seed-search algorithm provides a new computing approach to promote
information spreading by seeding influential users. By considering the overlaps of reachable
users among the seeds, the strategies presented in this paper enable infrastructure managers,
disaster managers, and public officials to effectively identify users to maximize the diffusion of
information on social media regarding built environment disruption events that affect residents.
The use of these findings in educating online users can enhance the adoption of situational
information, the response to risks, and the resilience of communities in ever-growing disasters.
This work can be further extended by: (1) examining the relationship between the priority of the
seeds and their features, such as the number of followers, timeliness of the posted information,
and verification of their profiles; and (2) interpreting the temporal topology of retweeting
networks as an indicator of community resilience to infrastructure disruptions.

ACKNOWLEDGEMENT
This material is based in part upon work supported by the National Science Foundation under
Grant Number IIS-1759537. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reflect the views of the
National Science Foundation.

REFERENCES
Ahmad, W., & Ali, R. (2017). "A framework for seed user identification across multiple online
social networks." 2017 International Conference on Advances in Computing, Communications
and Informatics (ICACCI), 708–713.
Aral, S., & Dhillon, P. S. (2018). "Social influence maximization under empirical influence
models." Nature Human Behaviour, 2(June), 1–8.
Chami, G. F., Ahnert, S. E., Kabatereine, N. B., & Tukahebwa, E. M. (2017). "Social network
fragmentation and community health." Proceedings of the National Academy of Sciences,
114(36), E7425–E7431.
Chin, A., Eckles, D., & Ugander, J. (2018). "Evaluating stochastic seeding strategies in
networks." arXiv preprint, 1–42.
Das, S. (2017). "Seed Node Distribution for Influence Maximization in Multiple Online Social
Networks." 2017 IEEE 15th Intl Conf on Dependable, Autonomic and Secure Computing,
1260–1264.
Fan, C., Mostafavi, A., Gupta, A., & Zhang, C. (2018). "A System Analytics Framework for
Detecting Infrastructure-Related Topics in Disasters Using Social Sensing." In I. F. C. Smith
& B. Domer (Eds.), Advanced Computing Strategies for Engineering (pp. 74–91). Cham:
Springer International Publishing.
Fan, C., Mostafavi, A., Yao, W., & Huang, R. (2019). "A Graph-based Approach for Detecting
Critical Infrastructure Disruptions on Social Media in Disasters." In 2019 52nd Hawaii
International Conference on System Sciences (HICSS) (pp. 1975–1984).
Kim, D. A., Hwong, A. R., Stafford, D., Hughes, D. A., O’Malley, A. J., Fowler, J. H., &
Christakis, N. A. (2015). "Social network targeting to maximise population behaviour
change: A cluster randomised controlled trial." The Lancet, 386(9989), 145–153.
Kim, J., & Hastak, M. (2018). "Social network analysis: Characteristics of online social
networks after a disaster." International Journal of Information Management, 38(1), 86–96.
Olsen, L. (2018). "Record reservoir flooding was predicted even before Harvey hit Houston."
Houston Chronicle. Retrieved June 5, 2018, from https://www.houstonchronicle.com
Sela, A., Ben-Gal, I., Pentland, A. S., & Shmueli, E. (2015). "Improving Information Spread
through a Scheduled Seeding Approach." In Proceedings of the 2015 IEEE/ACM
International Conference on Advances in Social Networks Analysis and Mining.
Shakya, H. B., Stafford, D., Hughes, D. A., Keegan, T., Negron, R., Broome, J., … Christakis,
N. A. (2017). "Exploiting social influence to magnify population-level behavior change in
maternal and child health: Study protocol for a randomized controlled trial of network
targeting algorithms in rural Honduras." BMJ Open, 7(3), 1–13.
Zhu, J., & Mostafavi, A. (2018). "Enhancing Resilience in Disaster Response: A Meta-Network
Analysis Approach." Construction Research Congress 2018, 2250–2259.


Overview of Supporting Technologies for Cyber-Physical Systems Implementation in the AEC Industry

Daniel A. Linares, S.M.ASCE1; Chimay Anumba, M.ASCE2; and Nazila Roofigari-Esfahan, A.M.ASCE3

1Ph.D. Student, Dept. of Building Construction, Virginia Tech. E-mail: [email protected]
2Professor and Dean, College of Design, Construction, and Planning, Univ. of Florida. E-mail: [email protected]
3Assistant Professor, Dept. of Building Construction, Virginia Tech. E-mail: [email protected]

ABSTRACT
The development of cyber-physical systems (CPSs) is an essential milestone in advancing
operations in many industries, including architecture, engineering, and construction (AEC).
Their development is supported by technological advancements that align with the CPS goals of
automation and intelligent interaction between cyber and physical spaces. This study focuses on
the evaluation of the supporting technologies for CPSs in AEC projects. To this end, a holistic
review of industry and academic literature is carried out, and the technologies are then evaluated
in terms of their level of implementation and application throughout the project lifecycle for
horizontal and vertical project scenarios. Finally, a categorization is made to present the
interrelations between the technologies and provide new insight into future research into CPS
applications in the AEC industry.

INTRODUCTION
In the past few decades, advancements in Information and Communication Technologies
(ICT) have redefined the way people and industries perform basic daily routines. Although the
AEC industry is known to be slow to incorporate technology, notable advancements have been
made, re-shaping the industry’s traditional practices towards being more technology-oriented.
Construction projects and the built environment are expected to be augmented by autonomous
and connected systems to improve operations, productivity, comfort, safety, communication, etc.
(Correa 2018; Nunes et al. 2015). Cyber-physical systems (CPSs) are integrated monitoring,
sensing, and actuating systems that build a bi-directional connection between the physical world
and cyber components in order to autonomously manage processes, information, and resources.
Application of CPSs in the AEC industry will greatly improve the way infrastructure and
buildings are built, managed, and connected to other autonomous systems such as transportation
and health systems.
In order to better understand the potential of CPSs in any context, including construction, the
enabling components, including technologies, first need to be identified. CPS applications in the
AEC industry rely on technologies that are currently established, as well as technologies that
are under development for future implementation. Current studies have presented efforts toward
the evaluation and application of these technologies in different contexts. However, a review of
the current and future state of the technologies that support CPSs in the AEC industry is missing.
This study addresses this gap by analyzing the technologies that support CPS development
from the perspective of the current status of technology development and its level of
implementation. To this end, a holistic review of the academic and industry literature is carried
out to evaluate and categorize the supporting technologies from the perspective of their
implementation status in the AEC industry. The technologies supporting CPSs in the AEC
industry are classified here, according to their implementation status, as already established or
trendy technologies. A comparative analysis of applications of the identified technologies in
various processes of two different construction settings, namely vertical (buildings) and
horizontal (civil infrastructure) construction, is then presented.

REVIEW OF LITERATURE
The literature review comprehensively covers three subtopics pertaining to the technologies
supporting CPSs in construction projects: current applications of CPSs in the AEC industry, the
implementation status of the supporting technologies, and an overview of their characteristics.

Cyber-physical systems in the AEC Industry


Fundamentally, a CPS involves bidirectional interaction between the physical and cyber
components of a system. To achieve this bi-directional interaction, CPSs depend on sensors to
perceive the physical components; the captured data is then automatically transferred to the
cyber components, where it is analyzed and transformed into the required information by means
of cyber processes. The required actions are determined based on this analysis and comparison
and are transferred back to the physical components through actuators. In the AEC industry, the
construction site and the built assets are the physical CPS components, and BIM can be
perceived as an important CPS enabler, representing the cyber counterparts and connecting them
with their physical components and with other cyber components, including cloud servers and
knowledge management centers. In this context, real-time interaction is desirable but still
very challenging due to interoperability issues between BIM platforms and the CPS (Correa and
Maciel 2018).
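The sense-analyze-actuate loop described above can be illustrated with a deliberately minimal sketch. The threshold rule and every name here are hypothetical stand-ins for the cyber analysis and actuation, not anything specified in the paper:

```python
def cps_control_step(sensor_reading, threshold, actuator):
    """One iteration of a bi-directional CPS loop: sense the physical
    component, analyze in the cyber component, and actuate back.
    The threshold comparison stands in for the (unspecified) cyber
    analysis; `actuator` is a callback representing the physical side."""
    # Cyber side: turn raw sensed data into a decision.
    action = "open_valve" if sensor_reading > threshold else "hold"
    # Physical side: the actuator applies the decision to the asset.
    actuator(action)
    return action
```

A unidirectional IoT deployment, by contrast, would stop after the sensing step and leave the decision and actuation to a human user, which is the distinction the following paragraph draws.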
Current bi-directional CPS applications in the AEC industry are few, as it remains challenging
to perceive, and to safely and accurately actuate in, construction sites and the built
environment. Still, early examples include intelligent building systems (Zheng 2018), intelligent
traffic management systems (Chen et al. 2014), and construction site safety operations (Teizer et
al. 2010). However, most current applications of CPSs in the AEC industry are unidirectional
and similar to IoT implementations. The benefits of CPSs in the construction industry include the
automation of simple and repetitive activities, increased modularization, increased productivity,
increased safety, and the integration of future technology trends that enhance project outcomes.
However, the more important issues to consider for advancing CPSs in the industry include a
higher level of CPS integration, the definition of standards, and the development of the
supporting technologies for CPSs.
In other industries, standards for CPSs have been established to support research efforts
and to align objectives among stakeholders, organizations, industries, and the public (Ahmadi et
al. 2017). In order to avoid fragmentation and poor integration of CPS applications in the AEC
industry and to create a common vision for its practices, standards that align with other industries
need to be established. Such standards need to consider the constraints and particular methods
that the construction industry entails and help clarify higher-level concepts such as the
application of CPSs in the project development stage as well as in the built environment.
CPSs are supported by several technologies that, according to their level of implementation,
can be divided into two main types. The first type comprises technologies currently accepted and
implemented in the AEC industry, such as BIM, portable devices, image-capturing technologies,
GPS, and drones, which are used in various construction processes and also have the potential
to support future, higher levels of CPS implementation. The second type comprises trendy
technologies that are still under development or not fully implemented but have the potential to
take CPS applications in the AEC industry to the next level; examples include the Internet of
Things, robotics, AR/VR, and artificial intelligence.

Figure 1. Evaluation parameters for supporting technologies


Nowadays, the Internet of Things (IoT) and ICT may be considered intermediate states of
CPS implementation. These systems are mostly unidirectional, primarily monitoring the
physical world and transferring and visualizing information for a user who decides what actions
to take in the physical world. Current applications of lower-level CPSs and IoT/ICT in AEC
projects and the built environment seek to address common issues including safety (Kanan et al.
2018) and collaboration (Lecce et al. 2018).

Implementation status of technologies supporting CPS


Technology implementation in the AEC industry is influenced by various factors (Jacobsson
et al. 2017). An overview of the CPS implementation of the supporting technologies in the AEC
industry is presented here with the aim of bringing additional insight into their current status and
the challenges ahead. The CPS implementation of these technologies was evaluated using the
parameters shown in Figure 1. Technology in the AEC industry is either developed specifically
for the industry, e.g., BIM, or is adapted from other industries, e.g., virtual and augmented reality
(VR/AR). The latter is the most common approach, as it requires fewer resources for research
and development. Nevertheless, both approaches can only be effective if properly integrated with
the current means and methods of the industry.
CPSs are supported by the integration of different technologies that enable bi-directional
interaction between physical and cyber components to achieve automation. These technologies
are categorized here as already-implemented technologies and trendy technologies. Implemented
technologies refer to advancements that are accepted and widely used. Trendy technologies, on
the other hand, have a clear potential to facilitate the application of CPSs in the AEC industry
but are not yet widely accepted in the industry. One example of a trendy technology is artificial
intelligence (AI), which has limited implementation in the industry today (e.g., activity
recognition), but whose future potential is promising, as AI will be essential to drive the
autonomous processes of future self-directed CPSs. Other technologies, such as drones, are
already implemented for applications including data acquisition and have the potential to be
used for advanced CPSs and automated construction processes, so they are considered here as
both implemented and trendy.

Table 1. Analysis of technology implementation for CPSs applications

Supporting Technology for CPSs | Approach for Implementation | Level of Adoption | Status
CAD | Adapted from another industry/sector | Technology considered part of the means and methods | Implemented
Email | Adapted from another industry/sector | Technology considered part of the means and methods | Implemented
Internet | Adapted from another industry/sector | Technology considered part of the means and methods | Implemented
Audio and Video Communications | Adapted from another industry/sector | Technology considered part of the means and methods | Implemented
GIS | Adapted from another industry/sector | Technology considered part of the means and methods | Implemented
BIM | Specific for AEC industry | Technology recognized as a best practice; massively implemented | Implemented
Portable Devices | Adapted from another industry/sector | Technology considered part of the means and methods | Implemented
Image-capture technologies (photogrammetry and laser scanning) | Adapted from another industry/sector | Efforts to implement undergone; still a niche | Implemented
Sensors | Adapted from another industry/sector | Efforts to implement undergone; still a niche | Implemented
GPS | Adapted from another industry/sector | Technology considered part of the means and methods | Implemented
Internet of Things | Adapted from another industry/sector | Efforts to implement undergone; still a niche | Implemented
Drones | Adapted from another industry/sector | Efforts to implement undergone; still a niche | Implemented
3D Printing | Adapted from another industry/sector | Technology with potential in CPSs | Implemented
Exoskeletons | Adapted from another industry/sector | Technology with potential in CPSs | Trendy
Robotics | Adapted from another industry/sector | Technology with potential in CPSs | Trendy
Sensors | Adapted from another industry/sector | Technology with potential in CPSs | Trendy
AR/VR | Adapted from another industry/sector | Efforts to implement undergone; still a niche | Trendy
Artificial Intelligence | Adapted from another industry/sector | Technology with potential in CPSs | Trendy
Blockchain | Adapted from another industry/sector | Technology with potential in CPSs | Trendy
Autonomous vehicles | Adapted from another industry/sector | Technology accepted for implementation | Trendy
5G | Adapted from another industry/sector | Technology with potential in CPSs | Trendy

Finally, the impact of a technology is determined by its level of implementation. Once a
technology has been accepted and recognized to be useful in the industry, it will undergo
progressive adoption, as shown in the scaled qualitative assessment in Figure 1. For example,
drones’ level of implementation is considered here as “efforts to implement undergone, but still
a niche”. However, drones have the potential for a higher level of CPS implementation in
mainstream applications in construction operations related to data acquisition; thus, their level
of CPS implementation can be increased. Table 1 analyzes the current technologies according to
these evaluation parameters.


Overview of the supporting trendy technologies for CPS in construction projects

An extended overview of the trendy technologies shown in Table 1 is presented here. The
purpose of this section is to highlight the supporting role of the trendy technologies in CPS
implementation. Some of these technologies are cyber applications, while others are physical
devices that can be integrated into project development.

Cyber applications as supporting trendy technologies for CPS in construction projects


The trendy technologies for cyber applications presented here include IoT, blockchain, 5G,
and AI. IoT has been implemented in the AEC industry as services and products, including
semantic web technologies and ICT, to improve data sharing and communications. Its industry
acceptance is increasing, and standards for better interoperability with other systems are being
researched and implemented (Jacobsson et al. 2017; Rezgui et al. 2011). Current applications
include cost estimation, data sharing, sustainability, and automation in building operations;
potential applications include, but are not limited to, enhanced safety, digital construction, and
enhanced building management (Abanda et al. 2013; Kanan et al. 2018; Woodhead et al. 2018).
It is expected that AI will be essential for autonomous CPS implementation because of the need
for synthetic thinking by non-human decision-makers. This technological framework has the
potential to be used for high-level systems, including smart cities and smart buildings and
infrastructure. However, recognition of the construction and built environment will be one of
the early AI applications for CPSs in the AEC industry (Anumba and Aziz 2006).
Blockchain, as a technology framework, has the potential to solve some concerning issues in
CPS deployment in the AEC industry by supporting processes related to data verification
and improved cyber-security. Research has been conducted on using blockchain to verify shared
information and thus maintain trust in information exchanges (Gries et al. 2018; Klyukin et al.
2018). Other applications include supply chain management for material control, improvements
in BIM databases, etc. (Lanko et al. 2018; Turk and Klinc 2017). Thus, blockchain can enhance
the security and information sharing of CPSs. 5G is the next generation of wireless cellular
communication protocol, which will bring additional bandwidth for data transfer and, most
importantly, improved latency. The latter means faster response to signals and data transfers,
which is essential for CPS applications in safety and in autonomous systems, including
equipment (Total Telecom Magazine 2017).

Physical devices as supporting trendy technologies for CPS in construction projects


The trendy technologies considered as applied devices are drones, 3D printing, exoskeletons,
AR/VR, robotics, sensors, and autonomous vehicles. Drones open a new dimension in project
development in the AEC industry. Their current applications for image capturing, surveying, and
photogrammetry are becoming mainstream in the industry. Continued development of drone
technology may help overcome some of the technical challenges that limit their potential
applications, such as indoor use, range, and battery duration. Once these challenges are
surpassed, this technology has the potential to greatly support CPS implementation. The
construction industry is an ideal scenario in which exoskeletons may have a substantial impact,
due to health concerns for trade workers. Advancements in this technology have led to its
deployment in other industries; thus, the AEC industry is expected to explore it, and depending
on its acceptance, substantial adoption may occur (Linner et al. 2018). This technology, coupled
with sensors for monitoring purposes, has the potential to support CPSs to evaluate human

© ASCE
Computing in Civil Engineering 2019 500

performance and prevent health hazards on future construction sites.


New applications of AR/VR will impact project development by connecting technologies to
humans. VR/AR will serve as an interface to control, visualize, and access information from the
cyber systems. Applications of AR/VR include safety, design, remote-control operations, and
BIM interoperability (Li et al. 2018). Sensors include different technologies that collect data
from project elements, personnel, and the environment. Some examples are proximity, humidity,
depth, and passive and active RFID sensors, among others, for use on a construction site and in
facility management. Future sensors are expected to add many other automated functionalities to
construction components. Robots are expected to be used to perform simple and repetitive
tasks (Ardiny et al. 2015). However, their implementation has been lagging because of
interaction issues between robots, objects, and humans. Nevertheless, the research under way in
the past few years is helping to overcome this issue so that robots can be used in the industry.
Robotic deployment still requires overcoming technical challenges such as reliability,
programming for tasks, situational awareness, etc. (Czarnowski et al. 2018). Robots can be
implemented as the actuators in CPSs. Efforts to automate vehicles, particularly equipment in
the AEC industry, have been investigated; however, acceptance of this technology has been
lagging because of concerns related to safety and coordination between equipment and workers.
Recent research on proximity tracking between equipment and other parties may bolster the
adoption of autonomous equipment for CPSs (Fennelly 2018).

CATEGORIZING THE SUPPORTING TECHNOLOGIES


The supporting technologies' role in the CPS processes during project development is
categorized according to four parameters, namely, project stage, functional layer, CPS
implementation status, and project scenario, as shown in Table 2.

Project Stages
The supporting technologies evaluated were visualized in the context of the project phases:
(a) pre-construction, divided into planning and design; (b) construction; and (c) operations and
maintenance (O&M).

CPSs Processes and Cyber Functional Layers


The bi-directional communication in CPSs is established between the physical components
of the project or built asset and the cyber components, through project processes; both the
physical and cyber processes are impacted by the supporting technologies. Furthermore, the
cyber processes are divided into functional layers, in which the supporting technologies were
evaluated in the context of their functions along the project stages. The functional layers are:
1. Computational platforms layer. Technologies function as systems where data is stored,
processed and transmitted using established platforms.
2. Decision-making. Technologies act as rational systems that receive information, analyze
it and provide a direction for other systems to follow.
3. Monitoring. Technologies assure the performance or quality of other processes.
4. Creation. Technologies are used to create project elements.
5. Communication. Technologies enable communication among stakeholders and systems.
6. Data acquisition. Technologies function to sense and capture physical parameters.
7. Physical. Technologies interact directly with the physical project elements.


Table 2. Categorization of supporting technologies for CPS implementation in AEC
(The original is a matrix of functional layers against the project stages of planning, design,
construction, and O&M; it is summarized here by functional layer, with stage assignments noted
where they are clear.)

Computational platforms (all stages): information sharing by email; Internet; IoT services;
5G cellular technology.

Decision-making: AI for data recognition and AI for process recognition (all stages); drawing
references from CAD; BIM building database (B).

Monitoring: monitoring of information transfer with blockchain (all stages); photogrammetry
and laser scanning for material quantity control (C); sensors to monitor field operations; drones
to monitor project controls; sensors to monitor building occupants and systems (B) in O&M.

Creation: schematic design with CAD and BIM (planning); design creation with CAD and BIM
(design); visualizing and editing with CAD and BIM (construction); printing of 3D mockups
(planning), 3D models (design), and 3D elements (construction); visualization with VR/AR.

Communication (all stages): communication by email; communication based on IoT services;
portable devices for communication; audio and video communication; communication with
VR/AR.

Data acquisition: GIS for terrain (planning), for terrain and site recognition (design), for
topography and location control (construction), and for as-built civil infrastructure (C); laser
scanning for 3D terrain and for as-built capture; photogrammetry for terrain and for as-built
capture and creation; input and sensing of data from portable devices, including occupancy data
(B) in O&M; image capture with drones; sensors to capture data from the physical project,
facility, or infrastructure.

Physical: automated vehicles; actuators in buildings and infrastructure; robotics; exoskeletons.

Legend: (B) technology applies only to buildings; (C) technology applies only to civil
infrastructure; unmarked technologies apply to both scenarios. The original table also
distinguishes implemented technologies from trendy technologies by shading.


Project Scenarios
Project scenarios were also considered to represent the different applications of the
supporting technologies. The scenarios considered are (1) vertical construction, marked as B in
the table; (2) horizontal construction, also referred to as civil infrastructure construction,
marked as C in the table; and (3) technologies applied to both scenarios (no marks).

CPS Implementation status of the supporting technologies


The CPS implementation statuses of the supporting technologies are (1) implemented, for
technologies that are already established in the industry, and (2) trendy, for technologies with
potential for CPS in construction projects. These are highlighted in different colors in Table 2.
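To make the four-parameter categorization concrete, it can be sketched as a small data structure that supports simple queries. This is only an illustrative sketch: the class name, the handful of catalog entries, and their stage/layer assignments are examples loosely based on Table 2, not an exhaustive or authoritative transcription.

```python
from dataclasses import dataclass

@dataclass
class SupportingTechnology:
    name: str
    layers: set    # functional layers from the paper's list (1-7)
    stages: set    # "planning", "design", "construction", "o&m"
    status: str    # "implemented" or "trendy"
    scenario: str  # "B" (buildings), "C" (infrastructure), or "both"

# A few example entries; the real table contains many more.
catalog = [
    SupportingTechnology("Email", {"communication", "computational platforms"},
                         {"planning", "design", "construction", "o&m"},
                         "implemented", "both"),
    SupportingTechnology("5G", {"computational platforms"},
                         {"planning", "design", "construction", "o&m"},
                         "trendy", "both"),
    SupportingTechnology("Drones", {"monitoring", "data acquisition"},
                         {"construction", "o&m"},
                         "trendy", "both"),
]

def technologies_for(stage, layer):
    """Return the names of catalog technologies filling a layer at a stage."""
    return [t.name for t in catalog if stage in t.stages and layer in t.layers]

print(technologies_for("construction", "computational platforms"))
```

A query like the one above mirrors how a reader would use the matrix: pick a project stage and a functional layer, then read off the technologies available in that cell.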

CONCLUSIONS
This paper evaluates the essential technologies that support CPS implementation in the
construction industry. It also provides a visual overview of the current implementation status of
these technologies for a better comprehension of their interaction and application. As such, it
defines a new starting point for research efforts on this topic by bringing in recent innovations
that have not yet been evaluated, such as blockchain and 5G. The evaluation of each
technology's CPS implementation status may help to identify cases of adoption that can be
replicated in future technologies. The role of BIM in the implementation of CPS in the AEC
industry is essential, as it becomes the enabler that connects the physical and cyber components
of a project. Still, interoperability among these technologies in CPSs will continue to be a
challenge, and efforts are required to avoid fragmentation of their applications. The
categorization in this study shows that the trendy technologies will impact many aspects of the
industry, including decision-making, monitoring, communication, and physical systems.
According to the same categorization, implemented technologies are predominant in the data
acquisition and creation layers. The categorization also shows that the supporting technologies
are, in general, applicable to both horizontal and vertical scenarios.

REFERENCES
Abanda, F. H., Tah, J. H. M., and Keivani, R. (2013). “Trends in built environment semantic
Web applications: Where are we today?” Expert Systems with Applications, 40(14), 5563–
5577.
Ahmadi, A., Cherifi, C., Cheutet, V., and Ouzrout, Y. (2017). “A review of CPS 5 components
architecture for manufacturing based on standards.” 2017 11th International Conference on
Software, Knowledge, Information Management and Applications (SKIMA), 1–6.
Anumba, C., and Aziz, Z. (2006). “Case Studies of Intelligent Context-Aware Services Delivery
in AEC/FM.” Intelligent Computing in Engineering and Architecture, Lecture Notes in
Computer Science, I. F. C. Smith, ed., Springer Berlin Heidelberg, 23–31.
Ardiny, H., Witwicki, S., and Mondada, F. (2015). “Construction automation with autonomous
mobile robots: A review.” 2015 3rd RSI International Conference on Robotics and
Mechatronics (ICROM), IEEE, Tehran, Iran, 418–424.
Chen, Y., Luo, J., Li, W., Erqing, Z., and Shi, J. (2014). "Self-Organization Framework and
Simulation Realization of Transportation Cyber-Physical System." CICTP 2014, Proceedings.
Correa, F. R. (2018). “Cyber-physical systems for construction industry.” 2018 IEEE Industrial
Cyber-Physical Systems (ICPS), 392–397.
Correa, F. R., and Maciel, A. R. (2018). “A methodology for the development of interoperable
bim-based cyber-physical systems.” ISARC 2018.
Czarnowski, J., Dąbrowski, A., Maciaś, M., Główka, J., and Wrona, J. (2018). “Technology gaps
in Human-Machine Interfaces for autonomous construction robots.” Automation in
Construction, 94, 179–190.
Fennelly, J. (2018). “Autonomous Off-Road Construction Vehicles.” Machine Design, 90(9), 52.
Gries, S., Meyer, O., Wessling, F., Hesenius, M., and Gruhn, V. (2018). “Using Blockchain
Technology to Ensure Trustful Information Flow Monitoring in CPS.” 2018 IEEE
International Conference on Software Architecture Companion (ICSA-C), 35–38.
Jacobsson, M., Linderoth, H. C. J., and Rowlinson, S. (2017). “The role of industry: an analytical
framework to understand ICT transformation within the AEC industry.” Construction
Management and Economics, 35(10), 611–626.
Kanan, R., Elhassan, O., and Bensalem, R. (2018). “An IoT-based autonomous system for
workers’ safety in construction sites with real-time alarming, monitoring, and positioning
strategies.” Automation in Construction, 88, 73–86.
Klyukin, A. A., Kulachkovsky, V. N., Evseev, V. N., and Klyukina, A. I. (2018). “Possibilities
of New Information Technologies in the System of Urban Planning and Construction.” Key
Engineering Materials, 771, 49–55.
Lanko, A., Vatin, N., and Kaklauskas, A. (2018). “Application of RFID combined with
blockchain technology in logistics of construction materials.” MATEC Web of Conferences,
(I. Ilin and O. Kalinina, eds.), 170, 03032.
Lecce, V. D., Guaraganella, C., Palagachev, D., Dentamaro, G., Quarto, A., and Soldo, D.
(2018). “IoT based cooperative agents architecture: Lightweight applications for smart
cities.” 2018 IEEE Workshop on Environmental, Energy, and Structural Monitoring Systems
(EESMS), 1–6.
Li, X., Yi, W., Chi, H.-L., Wang, X., and Chan, A. P. C. (2018). “A critical review of virtual and
augmented reality (VR/AR) applications in construction safety.” Automation in Construction,
86, 150–162.
Linner, T., Pan, M., Pan, W., Taghavi, M., Pan, W., and Bock, T. (2018). “Identification of usage
scenarios for robotic exoskeletons in the context of the Hong Kong construction industry.”
35th International Symposium on Automation and Robotics in Construction, International
Association for Automation and Robotics in Construction (I.A.A.R.C.).
Nunes, D. S., Zhang, P., and Silva, J. S. (2015). “A Survey on Human-in-the-Loop Applications
Towards an Internet of All.” IEEE Communications Surveys Tutorials, 17(2), 944–965.
Rezgui, Y., Boddy, S., Wetherill, M., and Cooper, G. (2011). “Past, present and future of
information and knowledge sharing in the construction industry: Towards semantic service-
based e-construction?” Computer-Aided Design, 43(5), 502–515.
Teizer, J., Allread, B. S., Fullerton, C. E., and Hinze, J. (2010). “Autonomous pro-active real-
time construction worker and equipment operator proximity safety alert system.” Automation
in Construction, Building Information Modeling and Collaborative Working Environments,
19(5), 630–640.
Total Telecom Magazine. (2017). “NEC to conduct remote construction trial utilising 5G with
KDDI, Obayashi.” Computer Database.


Turk, Ž., and Klinc, R. (2017). “Potentials of Blockchain Technology for Construction
Management.” Procedia Engineering, 196, 638–645.
Woodhead, R., Stephenson, P., and Morrey, D. (2018). “Digital construction: From point
solutions to IoT ecosystem.” Automation in Construction, 93, 35–46.
Zheng, Y. (2018). “Design and Testing of Automatic Control System of Intelligent Building.”
2018 10th International Conference on Measuring Technology and Mechatronics
Automation (ICMTMA), IEEE, Changsha, China, 272–275.


Investigation and Analysis of Human, Organizational, and Project-Based Rework Indicators in
Construction Projects
Elnaz Safapour, S.M.ASCE1; and Sharareh Kermanshachi, Ph.D., M.ASCE 2
1Ph.D. Student, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman Hall,
416 Yates St., Arlington, TX 76019. E-mail: [email protected]
2Assistant Professor, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman
Hall, 416 Yates St., Box 19308, Arlington, TX 76019 (corresponding author). E-mail:
[email protected]

ABSTRACT
Since construction project rework is one of the most important causes of cost overruns, the
best strategy is to identify the causes of rework at the right time. Therefore, the aim of this study
is to investigate and analyze the rework indicators (RIs) belonging to three categories:
organization, project, and people. To fulfill the objectives of this study, a structured survey was
developed to collect data associated with various construction project characteristics, change
orders, cost, and schedule performance. Appropriate statistical tests were applied to the 39
collected survey responses. The results reveal that the experience of the project management
team (PMT) in the design and/or construction phase and the number of PMT staff who work on
a construction project are the most important indicators of rework. The findings of this study
would assist project managers (PMs) in planning proactively to prevent construction project
rework.

INTRODUCTION
Change orders are inevitable in all types of construction projects (Safapour et al. 2018). They
affect the cost of a project, create scheduling delays, and decrease productivity (Kermanshachi et
al. 2016a, 2016b). Additionally, change orders play an important role in a project's success or
failure (Safapour et al. 2018). Because of the uniqueness of the budget and schedule estimation
of each construction project, as well as the availability of resources for planning, such as time,
money, and human resources, change orders and their consequences vary significantly from
project to project. Change orders are usually issued to modify and/or change the design during
the construction phase, but they can be issued for various reasons by different stakeholder
parties, so the resulting rework has significant potential to create serious challenges for owner,
designer, and contractor stakeholders and may cause conflicts among the project stakeholders
(Habibi et al. 2018).
Investigating and identifying the causes of rework enables their effective management
(Safapour et al. 2017). Various studies have examined the serious causes of rework and their
undesirable consequences for construction projects (Love et al. 2010, 2016; Ye et al. 2015;
Forcada et al. 2017). Accordingly, Love et al. (2012) classified the critical causes of rework
into three main categories: organization, project, and people. Researchers believe that there are
inherently potential problems in project systems, such as organizational problems (e.g.,
insufficient quality management), project problems (e.g., unclear scope definition)
(Kermanshachi et al. 2017), and individual problems (e.g., low experience of employees). This
classification portrays a procedure that enables the mapping of dependencies, which has great
impact on rework prevention.


Since rework in the construction industry is considered one of the most important causes of
cost overruns, the best strategy is to identify the causes of rework at the right time. Therefore,
the overall goal of this study is to determine significant rework indicators belonging to three
categories: organization, project, and people. The following objectives were formulated to
achieve this goal: (1) identify the potential rework variables; (2) classify the potential rework
variables; and (3) determine the significant rework indicators. This study could help researchers
and professionals assess construction and/or design rework at the right time in order to prevent
significant dollar losses.

LITERATURE REVIEW
Change orders are unavoidable in the construction industry and have serious impacts on
construction performance and productivity. According to Baxendale and Schofield (1986), a
change order can be defined as any change that differs from the agreed-upon and signed
contract. It is stated that changes of plans, or changes in the construction process itself, must be
expected because of the complexity of construction projects, and change orders in both the
design and construction phases are unavoidable. Therefore, the construction industry is subject
to poor cost management and schedule performance due to rework (Habibi et al. 2018;
Kermanshachi et al. 2019).
There are several definitions of rework in the existing construction literature. Josephson et
al. (2002) define rework as useless output that commonly occurs due to mistakes during the
execution of a construction project. CII (2001) characterized rework in the construction phase
as activities that have to be done more than once, or activities that remove previous work
installed as part of a project.
Since the process of rework assessment and management is commonly time-consuming and
costly if it is not addressed at the right time, Bearup (1995) claimed that the earlier change
orders are identified and managed, the greater the time benefits that will be realized. It is
believed that during the early stages of the design phase, rework can be conducted at the lowest
cost and has the maximum potential for the greatest reduction in cost overruns (Kermanshachi
and Rouhanizadeh 2019). Arain and Pheng (2007) stated that rework is easier to assess and
prevent during the earlier construction stages because, simply put, those stages make it possible
to avoid changes. Kamalirad et al. (2017) likewise indicated that good contract documentation,
effective communication, and cooperation among the individuals involved are the critical
elements for mitigating change orders. Additionally, the same authors stated that good
communication is commonly facilitated by designing an effective change order prevention
process, which should be geared towards understanding the workflow of the change orders.

RESEARCH METHODOLOGY
A five-step research methodology was developed and implemented, as shown in Figure 1.
The details of each step are described as follows:
Step 1. A comprehensive literature review was conducted to define the focus of the present
study.
Step 2. The potential rework indicators were identified through the literature review and
classified into three categories: organization, project, and people.
Step 3. To collect the required data, a structured survey was developed and distributed among
Subject Matter Experts (SMEs).
Step 4. A descriptive data analysis was performed using the collected data.
Step 5. Based on the type of collected data, the appropriate statistical analyses, including the
two-sample t-test and the Kruskal-Wallis test, were conducted to determine the significant
rework indicators.
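To illustrate the mechanics of the Step 5 tests, the two test statistics can be computed directly from survey responses. The sketch below uses only the Python standard library and hypothetical Likert-scale responses (the paper's actual 39-response data set is not reproduced here); Welch's form of the two-sample t statistic is shown, and obtaining p-values would additionally require the reference distributions (e.g., from a statistics package).

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic for two groups of responses."""
    na, nb = len(sample_a), len(sample_b)
    se = math.sqrt(variance(sample_a) / na + variance(sample_b) / nb)
    return (mean(sample_a) - mean(sample_b)) / se

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for ordinal (Likert-scale) responses.
    Tied values receive average ranks; no tie correction is applied."""
    pooled = sorted((value, gi) for gi, g in enumerate(groups) for value in g)
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:                      # walk each run of equal values
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    rank_sums = [0.0] * len(groups)
    for (value, gi), r in zip(pooled, ranks):
        rank_sums[gi] += r
    return 12.0 / (n * (n + 1)) * sum(
        rs * rs / len(g) for rs, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)

# Hypothetical 7-point Likert ratings of one indicator, split by rework level.
likert_low = [2, 3, 3, 4, 2, 3]    # projects with little observed rework
likert_high = [5, 6, 5, 7, 6, 5]   # projects with substantial rework
print(welch_t(likert_low, likert_high))           # t ≈ -6.25
print(kruskal_wallis_h(likert_low, likert_high))  # H ≈ 8.31
```

A large |t| or H for an indicator corresponds to a small p-value, which is how the tables that follow flag indicators as significant at the 0.05 (**) or 0.10 (*) level.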

Figure 1. Research methodology approach.

Table 1. Descriptive Analysis of Survey Collected Data (construction phase)

                          Minimum     Mean         Maximum      Std. Deviation
Cost
  Baseline Budget         $337,721    $87M         $740M        $134M
  Actual Cost             $327,000    $151M        $2,500M      $393M
Schedule
  Baseline Schedule       4 months    16 months    40 months    9.6 months
  Actual Schedule         3 months    17.5 months  46 months    10.5 months
Change Orders
  Rework                  $21,000     $2M          $9M          $2M

Figure 2. Formation structure of rework.


DATA COLLECTION
To collect the required data, a structured survey was developed. The survey questions were
categorized into two groups: (1) general project description; and (2) potential root causes of


rework. The collected responses were in two forms: continuous numbers and seven-point
Likert-scale ratings. The first section of the survey consisted of 20 questions associated with
general information and project characteristics. The second section consisted of fifty questions
based on the potential rework indicators; each of the potential rework indicators became one
question of the survey.

Table 2. Significant Rework Indicators Associated with Organization Category

Rework Indicator; Cause of Rework; P-Value
RI-1. Difficulty in obtaining design approval; long waiting time for approval (Chan and
Kumaraswamy 1997); 0.015**
RI-2. Number of financial approval authority thresholds; long-lead procurement (Fisk 1997);
0.001**
RI-3. Number of external entities required to approve the design; occurrence of conflicts and
disputes (Wu et al. 2005); 0.034**
RI-4. Number of active internal stakeholders in the decision-making process; impediment of
prompt decision-making (Sanvido et al. 1992); 0.043**
RI-5. Alignment quality of internal stakeholders; poor coordination (Arain and Pheng 2005);
0.020**
RI-6. Number of owner organizations; impediment of prompt decision-making (Sanvido et al.
1992); 0.003**
RI-7. Number of designer organizations; poor coordination (Arain and Pheng 2005); 0.055*
RI-8. Number of contractor organizations; poor site management (Sunday 2010); 0.016**
RI-9. Communication effectiveness within owners; owner fails to make decisions at the right
time (Jadhav and Bhirud 2015); 0.006**
RI-10. Communication effectiveness within designers; failure by consultant to supervise
effectively (Jadhav and Bhirud 2015); 0.001**
RI-11. Communication effectiveness within contractors; poor project management by contractor
(Ye et al. 2015); 0.001**

Furthermore, a pilot test was administered to four experienced practitioners from the
construction industry to examine the clarity of each question. After the questionnaire was
validated, it was finalized and distributed to experienced industry professionals. The survey
process was entirely set up and managed through an online system. After two follow-up emails
were sent, 39 responses were received.

DESCRIPTIVE DATA ANALYSIS


The breakdown of information associated with baseline and actual budgets and schedules, as
well as change order dollar values for the construction phase, for the 39 construction projects is
shown in Table 1. The table indicates that the mean values of the baseline budget and actual
cost were $87 million and $151 million, respectively, with standard deviations of $134 million
and $393 million. Additionally, Table 1 shows that the mean values of the baseline schedule and
actual schedule were 16 months and 17.5 months, respectively, with standard deviations of 9.6
months and 10.5 months. In terms of change orders, the mean and standard deviation were both
$2 million.
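The descriptive statistics reported in Table 1 are the standard minimum, mean, maximum, and sample standard deviation of each response column. As a minimal sketch with the Python standard library, using a small set of hypothetical baseline-budget values (the paper's raw data for the 39 projects are not reproduced here):

```python
from statistics import mean, stdev

# Hypothetical baseline-budget responses in $M, for illustration only.
baseline_budgets = [0.34, 12, 45, 87, 150, 320, 740]

summary = {
    "Minimum": min(baseline_budgets),
    "Mean": round(mean(baseline_budgets), 1),
    "Maximum": max(baseline_budgets),
    "Std Dev": round(stdev(baseline_budgets), 1),  # sample standard deviation
}
print(summary)
```

As in Table 1, a standard deviation larger than the mean signals a heavily right-skewed sample, here driven by a few very large projects.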

DETERMINE SIGNIFICANT REWORK INDICATORS


In this step of the current study, the significant RIs were determined using the appropriate
statistical data analyses. The procedure employed for this step is shown in Figure 2. As
mentioned earlier, the RIs were categorized into three groups, organization, project, and people,
based on studies conducted by Love et al. (2012) and Safapour and Kermanshachi (2018).


Table 3. Significant Rework Indicators Associated with Project Category

Rework Indicator; Cause of Rework; P-Value
RI-12. Percentage of active project management staff; poor site management and supervision
(Ye et al. 2015); 0.023**
RI-13. Number of executive oversight entities above the PM; low speed of decision making
(Chan and Kumaraswamy 1997); 0.035**
RI-14. PMT experience in design phase; lack of experience (Arain and Pheng 2005); 0.001**
RI-15. PMT experience in construction phase; lack of experience (Arain and Pheng 2005);
0.001**
RI-16. Number of execution locations on this project during detailed design phase;
inappropriate linking of all design teams (Chan and Kumaraswamy 1997); 0.051*
RI-17. Number of countries involved in design phase; socio-cultural factors (O'Brien 1998);
0.057*
RI-18. Number of countries involved in construction phase; socio-cultural factors (O'Brien
1998); 0.081*
RI-19. Difficulty in system design; mistakes and defects in design (Hsieh et al. 2004); 0.038*
RI-20. Percentage of design at the start of construction; incomplete design information (Jadhav
and Bhirud 2015); 0.031*
RI-21. RFI leads to design changes; changes in design (Arain and Pheng 2005); 0.018*
RI-22. Number of new systems tied into existing systems; lack of experience (Arain and Pheng
2005); 0.035**
RI-23. Delay in delivery of permanent facility equipment; unavailability of equipment (O'Brien
1998); 0.003**
RI-24. Permanent equipment quality issues; low productivity of equipment (Assaf and Al-Hejji
2006); 0.047**
RI-25. Quality of bulk materials; replacement of material (Karthick et al. 2015); 0.043**
RI-26. Clarity of owner's project goals and objectives; owner may make changes to achieve
certain milestones within a given time frame (Wu et al. 2005); 0.039**
RI-27. Total number of joint-venture partners in a project; low speed of decision making (Chan
and Kumaraswamy 1997); 0.042**
RI-28. Number of funding phases; delay in payment (Karthick et al. 2015); 0.044**

As the survey consisted of continuous and seven-point Likert-scale questions, the two-sample
t-test and the Kruskal-Wallis test were utilized to determine the significant RIs. The results for
the significant RIs associated with the organization category are illustrated in Table 2, which
consists of three columns: the significant RIs, the causes of rework based on previous studies,
and the P-values of the significant RIs.
Figure 2 shows that each potential rework variable leads to a cause of rework identified in
existing literature, and each of the stated causes in turn gives rise to rework in construction
projects.
Table 2 illustrates that low quality of communication within the owners, designers, and/or
contractors seriously affects rework issuance, as it leads to conflicts that make reaching an
agreement very time-consuming. The possibility of design changes or modifications occurring
increases with late decision-making by the owners, designers, or contractors. Considering RI-7,
which belongs to the organization category, reaching an agreement can be very time-consuming
due to conflicts between designers; consequently, the decision-making process of the designer
entities takes considerable time, and the possibility of design changes increases.
The results for the significant rework indicators associated with the project category are
shown in Table 3. This table indicates that increasing the number of joint-venture partners in a
construction project increases the number of stakeholder parties, and thus the number of parties
with authority in the decision-making and approval processes. The more people involved, the
more problems pertaining to decision-making are likely to occur due


to disagreement, increasing the possibility of major rework in the construction phase.


As shown in Table 3, if the design of a system is complex, the potential for an increasing
number of design mistakes is greater, due to deficiencies in the designers' skill and/or
knowledge. Consequently, these design errors lead to rework and modifications during the
construction phase. Similarly, low quality of bulk materials causes the replacement of the
materials, which leads to changes and/or modifications of the design and/or an increased
construction duration.
As the PMT is usually responsible for the planning, programming, and execution of any
construction project, the experience of the PMT in the design and/or construction phases (i.e.,
RI-14 and RI-15) and the number of project management staff who work on the project (i.e.,
RI-13) were determined to be three significant rework indicators in construction projects. The
significant rework indicators associated with the people category are shown in Table 4. This
table illustrates that if the designers and/or field craft laborers have insufficient skills in working
with recent technologies in the design and/or construction phase, the probability of mistakes
and/or errors in complex construction projects increases.

Table 4. Significant Rework Indicators Associated with People Category


Rework Indicators    Causes of Rework    P-Value
RI-29. Familiarity with technologies in design phase    Defect in design (Hsieh et al. 2004)    0.082*
RI-30. Familiarity with technologies in construction phase    Changes in construction method (Wu et al. 2005)    0.063*
RI-31. Field craft labor quality issue    Skill shortage (Arain and Pheng 2005)    0.069*
RI-32. Percentage of craft labor sourced locally    Socio-cultural factors (O’Brien 1998)    0.011**
**denotes significance at the 95% confidence level; *denotes significance at the 90% confidence level

CONCLUSION
This study aimed to determine significant rework indicators belonging to three categories: organization, project, and people. The results reveal that the experience of the PMT in the design and/or construction phase, which belongs to the project category, was identified as a rework indicator; in the same category, the size of the PMT in both phases was also determined to be statistically significant. In terms of the people category, the skill, knowledge, and familiarity of designers/engineers and field craft laborers with new and recent technologies in the design and/or construction phases were identified as RIs. The present study will assist researchers and experts in the timely assessment of rework so that it can be successfully prevented.
REFERENCES
Arain, F. M., and Pheng, L. (2007). “Modeling for management of variations in building
projects.” Eng. Constr. Arch. Manage., 14(5), 420-433.
Assaf, S.A., and Al-Hejji, S. (2006). “Causes of delay in large construction projects.” Int. J. Pro.
Manage. 24 (4), 349-357.
Bearup, W. K., (1995). “An environment to support computer-assisted design review.” Ph.D.
Thesis. Department of Civil Engineering, University of Illinois at Urbana-Champaign.
Chan, D.W.M., and Kumaraswamy, M.M. (1997). “A comparative study of causes of time
overruns in Hong Kong construction projects.” Int. J. Pro. Manage. 15 (1), 55–63.
Construction Industry Institute (CII). (2001). “The field rework index: Early warning for field
rework and cost growth,” Rep. NO. RS153-1, Austin, TX.
Fisk, E.R. (1997). “Construction project administration.” 5th edition, Prentice Hall, Upper Saddle
River, NJ.


Forcada, N., Alvarez, A. P., Love, P. E. D., and Edwards, D. J. (2017b). “Rework in urban
renewal projects in Colombia.” J. Infra. Syst., http://doi.org/10.1061/(ASCE)IS.1943-555X.0000332, 04016034.
Habibi, M., Kermanshachi, S. (2018). “Phase-based analysis of key cost and schedule
performance causes and preventive strategies: Research trends and implications,” Eng.
Constr. Arch. Manage., 25, 1009–1033.
Habibi, M., Kermanshachi, S., and Safapour, E. (2018). “Engineering, Procurement and
Construction Cost and Schedule Performance Leading Indicators: State-of-the-Art Review,”
Proceedings of Construction Research Congress, ASCE, New Orleans, Louisiana, April 2-4,
2018.
Hsieh, T., Lu, S., and Wu, C. (2004). “Statistical analysis of causes for change orders in
metropolitan public works.” Int. J. Pro. Manage. 22(8), 679-686.
Jadhav, O. U., and Bhirud, A. N. (2015). “An analysis of causes and effects of change orders on
construction projects in Pune.” I. J. E. R. G. S., 3(6), 795-799.
Josephson, P. E., Larsson, B., and Li, H. (2002). “Illustrative benchmarking rework and rework
costs in Swedish construction industry.” J. Manage. Eng., http://doi.org/10.1061/(ASCE)0742-597X(2002)18:2(76), 76-83.
Karthick, R., Malathi, B., and Umarani, C. (2015). “Study on change order impact on project
lifestyle.” I. J. E. R. T., 4(5), 691-695.
Kamalirad, S., Kermanshachi, S., Shane, J. and Anderson, S. (2017). “Assessment of
Construction Projects’ Impact on Internal Communication of Primary Stakeholders in
Complex Projects,” Proceedings for the 6th CSCE International Construction Specialty
Conference, Vancouver, Canada May 31-June 3.
Kermanshachi, S., Beaty, C. and Anderson, S.D. (2016). “Improving Early Phase Cost
Estimation and Risk Assessment: A Department of Transportation Case Study.” In
Transportation Research Board 95th Annual Meeting, No. 16-2202.
Kermanshachi, S., Anderson, S. D., Goodrum, P., and Taylor, T. R. (2017). “Project Scoping
Process Model Development to Achieve On-Time and On-Budget Delivery of Highway
Projects,” Transportation Research Record: Journal of the Transportation Research Board,
(2630), 147-155.
Kermanshachi, S., Safapour, E. Anderson, S., Molenaar, K., and Schexnayder, C. (2019).
“Development of the Cost Baseline for Achieving Excellence in Rural Transit Facilities.” In
Transportation Research Board 98th Annual Meeting, No. 19-05955.
Kermanshachi, S., and Rouhanizadeh, B. (2019). “Robustness Analysis of Design Phase
Performance Predictors Using Extreme Bounds Analysis (EBA),” Proceedings of ASCE
International Conference on Computing in Civil Engineering, Atlanta, GA, June 17-19,
2019.
Kermanshachi, S., Shane, J., and Anderson, S. (2016). “Factors Influencing Procurement Phase
Cost Overruns in Construction Industry,” International Conference on Sustainable Design,
Engineering and Construction (ICSDEC), Tempe, AZ, 2016.
Love, P. E. D., Edwards, D. J., Watson, H., and Davis, P. (2010). “Rework in civil infrastructure
projects: Determination of cost predictors.” J. Constr. Eng. Manage., 10.1061/(ASCE)CO.1943-7862.0000136, 275-282.
Love, P. E. D., Lopez, R., Edwards, D. J., and Goh, Y. M. (2012). “Error begat error: Design
error analysis and prevention in social infrastructure projects.” Acc. Anal. Prev., 48, 100-110.
Love, P. E. D., Edwards, D. J., and Smith, J. (2016). “Rework causation: Emergent Theoretical


insights and implications for research.” J. Constr. Eng. Manage., 10.1061/(ASCE)CO.1943-7862.0001114, 04016010.
O’Brien, J. J. (1998). Construction change orders: Impact, avoidance, and documentation.
McGraw Hill, New York, NY.
Safapour, E., and Kermanshachi, S. (2019). “Identifying Early Indicators of Manageable Rework
Causes and Selecting Mitigating Best Practices for Construction,” J. Manage. Eng., 35(2),
04018060. http://doi.org/10.1061/(ASCE)ME.1943-5479.0000669.
Safapour, E., Kermanshachi, S., Shane, J. and Anderson, S. (2017). “Exploring and Assessing
the Utilization of Best Practices for Achieving Excellence in Construction Projects,”
Proceedings of the 6th CSCE International Construction Specialty Conference, Vancouver,
Canada May 31-June 3.
Safapour, E. Kermanshachi, S., Habibi, M., and Shane, J. (2018). “Resource-Based Exploratory
Analysis of Project Complexity Impact on Phase-Based Cost Performance Behavior,”
Proceedings of Construction Research Congress, ASCE, New Orleans, Louisiana, April 2-4,
2018.
Safapour, E., Kermanshachi, S. and Ramaji, I. (2018). “Entity-Based Investigation of Project
Complexity Impact on Size and Frequency of Construction Phase Change Orders,”
Proceedings of Construction Research Congress, New Orleans, Louisiana, April 2-4, 2018.
Sanvido, V., Grobler, F., Prafitt, K., Guvenis, M., and Coyle, M. (1992). “Critical success factors
for construction projects.” J. Constr. Eng. Manage., 118(1), 94-111.
Sunday, O. A. (2010). “Impact of variation orders on public construction projects.” Proc. 26th
Annual ARCOM Conference, Leeds, U.K., 101-110.
Ye, G., Jin, Z., Xia, B., and Skitmore, M. (2015). “Analyzing causes for reworks in construction
projects in China.” J. Manage. Eng., http://doi.org/10.1061/(ASCE)ME.1943-5479.0000347, 04014097.
Wu, C., Hsieh, T., and Cheng, W. (2005). “Statistical analysis of causes for design change in
highway construction on Taiwan.” Int. J. Pro. Manage., 23, 554-563.


Development of Effective Communication Network in Construction Projects Using Structural Equation Modeling Technique

Elnaz Safapour, S.M.ASCE1; Sharareh Kermanshachi, Ph.D., M.ASCE2; and Shirin Kamalirad, S.M.ASCE3

1Ph.D. Student, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman Hall, 416 Yates St., Arlington, TX 76019. E-mail: [email protected]
2Assistant Professor, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman Hall, 416 Yates St., Box 19308, Arlington, TX 76019 (corresponding author). E-mail: [email protected]
3Graduate Student, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman Hall, 416 Yates St., Arlington, TX 76019. E-mail: [email protected]

ABSTRACT
Stakeholder management plays a critical role in successfully delivering construction projects; thus, identifying the causes of miscommunication within the primary stakeholders (owner, designer, and contractor) would be beneficial. Previous studies have identified significant causes of miscommunication but have not examined how the different causes work together to influence communication within the primary stakeholders. Therefore, appropriate statistical methods were used to identify the effective communication indicators (ECIs) within the three primary stakeholders. The factor analysis technique was then utilized to investigate the main components associated with the ECIs, and structural equation modeling (SEM) was used to identify the pertinent relationships between the identified main components. For this purpose, 38 completed case study projects were collected; because information on other aspects of the case studies was needed, a structured survey was also developed, and 38 completed surveys were collected. Thirty ECIs associated with the three primary stakeholders were determined to be significant, belonging to 20 main components. The results revealed that project target, scope clarity, field labor, objectives and restrictions, and resource availability were the critical components. The outcomes of this study help project managers focus on the main effective communication factors to mitigate the risk of project failure due to dispute and conflict.

INTRODUCTION
The success of a construction project is the main target of each of its primary stakeholders (Cho et al. 2009). Communication is inherently one of the important factors in construction project success, particularly when numerous parties are involved and need to cope with probable issues in order to succeed (Kamalirad et al. 2017). Cleland and Ireland (1999) defined communication as a two-way process between a sender and a receiver to exchange information. In this respect, Malisiovas (2014) defines communication as “friendship or collaborations in projects, mutual organizational work, and others.” Similarly, Chinowsky et al. (2010) relate communication directly to project success, holding that a successful project requires the right amount of communication and knowledge sharing in completing a set of tasks. Additionally, communication refers to the relationships and collaboration that emerge when there is interaction between and within project teams (Senescu et al. 2012).
Communication simply refers to the exchange of information and other resources, such as ideas, knowledge, skills, and technology, among team members and organizations (Cheng 2001).


As different entities such as the primary stakeholders are generally involved in the execution of a construction project (Lee and Kim 2018), extensive information exchange across the members of a project is required. Dainty and Lingard (2006) explained that, through the transmission and exchange of information and knowledge, communication might become distorted. This distortion of communication between a project’s team members and parties portrays the misunderstanding, future extra workload, or even conflicts that may arise among them (Kermanshachi 2010; Kamalirad and Kermanshachi 2018). These issues cause schedule delays and cost overruns, leading to project failure (Forcada et al. 2017; Lee and Kim 2018), as cost increases may lead to the suspension of the project (Safapour et al. 2018). Some researchers believe that the timely transfer of relevant information is necessary for project performance in light of the mutually dependent nature of construction activities (Wong and Lam 2011; Habibi et al. 2018). Therefore, communication plays an important role in building a structured, successful team (Pentland 2012). Effective communication is regarded as the basis for creating the alliances that are required for a successful project (Cheng et al. 2001). Cook and Macaulay (2013) explained that effective communication can substantially improve teamwork and raise the level of collaboration in a construction project. On the contrary, ineffective communication might cause misunderstandings, delays, and defects in different phases of a construction project (Cheng et al. 2001; Lee and Kim 2018). Effective internal communication is necessary to structure a successful organization, as it has significant impacts on the ability of strategic managers to engage staff and attain project objectives (Welch and Jackson 2007; Nipa et al. 2019). Mazzei (2014) explained that establishing effective internal communication is crucial for the employees and staff of an organization, and believed that it is very beneficial for generating knowledge and skill, exchanging information, and supporting other strategic communication actions.
Since project characteristics have a strong connection with project performance (Cho et al. 2009), being able to recognize how different project characteristics affect the quality of communication assists in improving project performance (Kermanshachi 2016). Additionally, researchers have rarely investigated the quantitative relationships between the key factors of effective internal communication within the primary stakeholders. Thus, this study makes an effort to answer the following research questions: Q1. Which characteristics of a construction project affect the quality of internal communication within the three primary stakeholders? Q2. What are the key factors of effective communication within the primary stakeholders in a construction project? Q3. What is the quantitative relationship between the key factors of effective internal communication?
Primary stakeholders can utilize the outcomes of this study to improve communication quality within the primary stakeholders of construction projects. Additionally, this study allows project managers to plan proactively and allocate resources based on their role in construction projects.

RESEARCH METHODOLOGY
To achieve the objectives of this study, the five-step methodology presented in Figure 1 was developed and implemented. In the first step, a thorough literature review was conducted to identify the effective communication indicators associated with the three primary stakeholders. Then, 38 completed case studies of construction projects were collected. Next, a structured questionnaire was developed to collect comprehensive data from the same 38 completed case study projects. In the third step, several


appropriate statistical analyses were conducted to determine the significant entity-based communication indicators within each of the primary stakeholders. Additionally, the factor analysis method was used to reduce the number of significant ECIs and investigate the key entity-based components. In Step 4, a model was developed using the structural equation modeling technique. Finally, the outcomes and results were discussed.

Figure 1. Research framework.


Table 1. Results of Significant Communication Indicators within Primary Stakeholders

#    ECI Explanation    Owner    Designer    Contractor

ECI_1 Number of Required Total Permits 0.082* 0.836 0.986


ECI_2 Difficulty Level in Obtaining Permits 0.975 0.036** 0.322
ECI_3 Impact of External Agencies on Execution plan 0.421 0.349 0.470
ECI_4 Impact of Required Approvals-Internal Stakeholders 0.009** 0.021** 0.067*
ECI_5 Impact of Required Approvals-External Stakeholders 0.065* 0.678 0.396
ECI_6 Number of Owner Organizations 0.044** 0.862 0.713
ECI_7 Number of Designer/Engineer Organizations 0.096* 0.036** 0.279
ECI_8 Project Management Team Peak Size-Procurement 0.031** 0.073* 0.042**
ECI_9 Project Management Team Experience -Construction 0.004** 0.009** 0.014**
ECI_10 Percentage of Modularization (offsite Construction) 0.602 0.313 0.224
ECI_11 Number of Countries Involved in Design Phase 0.007** 0.848 0.929
ECI_12 Number of Countries Involved in Construction Phase 0.099* 0.062* 0.027**
ECI_13 Clarity of Projects Scope in Designer/Contractor Selection 0.008** 0.040** 0.054*
ECI_14 Clarity of Owners Project Goals and Objectives 0.024** 0.008** 0.022**
ECI_15 Delay in Delivery of Permanent Facility Equipment 0.612 0.053* 0.060*
ECI_16 Field Craft Labor Quality Issues 0.711 0.076* 0.017**
ECI_17 Bulk Materials Quality Issues 0.842 0.015** 0.007**
ECI_18 Permanent Equipment Quality Issues 0.804 0.060* 0.007**
ECI_19 Percentage of Craft Labor Turnover 0.476 0.048** 0.031**
ECI_20 Percentage of Permanent Equipment Sourced Locally 0.014** 0.693 0.511
ECI_21 Percentage of Craft Labor Sourced Locally 0.007** 0.359 0.563
ECI_22 Reuse of Existing Installed Equipment 0.741 0.031** 0.018**
ECI_23 Degree of Additional Construction Specifications 0.453 0.051* 0.001**
ECI_24 Degree of Additional Materials Specifications 0.094* 0.243 0.078*
ECI_25 Project Funding Delays 0.198 0.072* 0.039**
ECI_26 Clarity of Funding Process during Front End Planning 0.002** 0.093* 0.128
ECI_27 Company’s Familiarity with Technologies -Design 0.654 0.014** 0.554
ECI_28 Company’s Familiarity with Technologies -Construction 0.856 0.030** 0.839
ECI_29 Company’s Familiarity with Technologies -Operation 0.904 0.061* 0.418
ECI_30 Number of New Systems Tied into Existing Systems 0.956 0.029** 0.342
**denotes significance at the 95% confidence level; *denotes significance at the 90% confidence level
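The significance markers used in Table 1 follow a simple thresholding convention on the p-values. A minimal sketch of that flagging rule in plain Python (the example p-values are taken from the first column of Table 1):

```python
def significance_flag(p_value: float) -> str:
    """Map a p-value to the table's significance markers:
    '**' at the 95% confidence level (p < 0.05),
    '*'  at the 90% confidence level (p < 0.10),
    ''   when the indicator is not significant."""
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""

# Three p-values from the first column of Table 1:
for eci, p in [("ECI_9", 0.004), ("ECI_1", 0.082), ("ECI_3", 0.421)]:
    print(f"{eci}: {p}{significance_flag(p)}")  # -> ECI_9: 0.004**, ECI_1: 0.082*, ECI_3: 0.421
```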


Table 2. Rotated Component Matrix (Effective Communication within Owner)


Component
Component's Name Effective Communication Indicator
1 2 3 4

Clarity of Projects Scope in Designer/Contractor Selection 0.849 -0.102 0.371 -0.017

Clarity of Owners Project Goals and Objectives 0.772 -0.096 0.165 0.036

Project Target (O1) Clarity of Funding Process during Front End Planning 0.754 0.135 0.063 0.127

Impact of Required Approvals-Internal Stakeholders 0.656 0.082 -0.224 0.386

Percentage of Permanent Equipment Sourced Locally 0.481 -0.388 -0.476 -0.061

Number of Required Total Permits 0.128 0.91 -0.14 -0.186


Technical
Percentage of Craft Labor Sourced Locally 0.207 -0.839 -0.059 -0.031
Challenges (O2)
Degree of Additional Materials Specifications 0.091 0.45 0.332 0.009
Logistic Number of Countries Involved in Design Phase 0.08 -0.007 0.852 -0.122
Challenges (O3) Number of Owner Organizations -0.115 -0.018 -0.503 -0.028
Project Management Team Experience-Construction phase 0.077 0.478 -0.051 -0.77
Coordination (O4)
Impact of Required Approvals-External Stakeholders 0.37 0.131 -0.078 0.735

Table 3. Rotated Component Matrix (Effective Communication within Engineering)


Component
Component's Name Effective Communication Indicator
1 2 3 4 5 6

Company’s Familiarity with Technologies -Construction phase 0.863 0.123 -0.076 0.007 0.116 -0.085
Design & Technology
(D1) Company’s Familiarity with Technologies -Engineering phase 0.829 0.049 -0.039 -0.075 0.176 0.1
Company’s Familiarity with Technologies -Operation phase 0.823 -0.046 0.121 0.149 -0.036 0.195

Clarity of Projects Scope During Designer/Contractor Selection 0.069 0.86 0.006 0.095 0.107 0.009

Clarity of Owners Project Goals and Objectives -0.047 0.832 0.013 0.168 -0.064 -0.107
Scope Clarity (D2) Clarity of Funding Process during Front End Planning 0.488 0.706 0.182 -0.006 0.031 -0.034
Impact of Required Approvals-Internal Stakeholders 0.04 0.639 0.047 -0.2 -0.371 0.001
Number of Countries Involved in Construction Phase -0.277 0.471 0.299 0.317 0.127 0.392
Bulk Materials Quality Issues -0.016 0.407 0.332 0.25 -0.257 0.309

Project Management Team Peak Size-Procurement Phase -0.223 -0.094 0.826 0.065 0.049 -0.156
Technical & Financial
Support (D3) Degree of Additional Construction Specifications 0.027 0.223 0.696 -0.014 0.096 0.206
Project Funding Delays 0.47 0.074 0.693 0.059 -0.117 -0.016
Delay in Delivery of Permanent Facility Equipment -0.009 0.142 -0.036 0.886 0.076 -0.001
Facility (D4)
Number of New Systems Tied into Existing Systems 0.347 0.017 0.373 0.569 -0.311 -0.042

Experience Challenges Project Management Team Experience-Construction Phase 0.145 -0.12 -0.012 0.05 0.779 -0.098
(D5)
Difficulty Level in Obtaining Permits 0.158 0.158 0.397 -0.313 0.551 0.295
Decision-Making
Number of Designer/Engineer Organizations 0.16 -0.119 -0.021 -0.045 -0.038 0.853
Challenges (D6)

DATA COLLECTION
For this study, 38 completed case study projects were collected. As data regarding the potential root causes of effective communication within the primary stakeholders were required, a structured questionnaire was developed. The collected case studies consisted of information on the projects’ general descriptions. The developed survey consisted of 50 questions associated with the potential root causes of effective communication; each potential root cause became one question of the survey. The survey was then sent to a selected representative who had worked on each of the collected case studies, and the survey process was entirely set up and managed through an online system. After two follow-up emails were sent, 39 responses were received.


DATA ANALYSIS
Significant Effective Communication Indicators within Primary Entities
According to the type of the collected data, appropriate statistical analyses, including the two-sample t-test, chi-squared test, and Kruskal-Wallis test, were utilized to determine the significant ECIs associated with the quality of internal communication within each of the primary stakeholders. The results regarding the significant ECIs for the owner, consultant, and contractor stakeholders are shown in Table 1, which illustrates that 30 ECIs were found to be significant. It was found that the involvement of more than one country in the design and/or construction phase affects the quality of internal communication within the owner entities. When different countries are involved in the design and/or construction phase, the owners need to learn the restrictions, such as working hours, weather conditions, and religious events, of the different locations involved (Kamalirad et al. 2018), and the owner entities are required to learn and comply with international laws and regulations. These new challenges and issues affect the quality of internal communication within the mentioned stakeholders. As presented in Table 1, a high number of designer/engineering organizations affects the quality of internal communication within the designers. As each member of the design staff has unique experience, knowledge, and skills related to construction projects, the process of reaching an agreement among the designer and/or contractor staff can be challenging and time-consuming (Safapour et al. 2017); thus, the quality of internal communication within the mentioned stakeholders is affected. Next, the factor analysis method was applied to the identified significant ECIs associated with the three primary stakeholders in order to explore the groupings that might exist among them, as presented in Tables 2, 3, and 4.
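The entity-based screening above pairs each indicator with a test suited to its data type. The following sketch shows the three named tests with SciPy on synthetic stand-in data; the variables, group sizes, and values are illustrative assumptions, not the study's survey dataset:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative stand-ins for survey responses -- NOT the study's dataset.
# 1) Continuous indicator compared between two groups of projects
#    (e.g., good vs. poor internal communication): two-sample t-test.
good = rng.normal(loc=5.0, scale=1.0, size=20)
poor = rng.normal(loc=3.5, scale=1.0, size=18)
t_stat, p_ttest = stats.ttest_ind(good, poor)

# 2) Categorical indicator vs. communication quality: chi-squared test
#    on a 2x2 contingency table of project counts.
table = np.array([[12, 8],
                  [5, 13]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# 3) Ordinal indicator across three project groups: Kruskal-Wallis test.
g1 = rng.integers(1, 6, size=15)
g2 = rng.integers(1, 6, size=15)
g3 = rng.integers(2, 7, size=15)
h_stat, p_kw = stats.kruskal(g1, g2, g3)

# As in Table 1, an indicator would be flagged significant at the
# 95% (p < 0.05) or 90% (p < 0.10) confidence level.
print(p_ttest, p_chi2, p_kw)
```

The choice among the three tests depends only on whether the indicator is continuous, categorical, or ordinal, which is why the paper describes selecting tests "according to the type of the collected data."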
Table 2 shows that the results of the dimension reduction associated with the owner entity consisted of four main components: project target, technical challenges, logistic challenges, and coordination. As presented in Table 2, the first component of the owner entity is project target, which accounted for 21.77% of the total variance among the ECIs. According to factor analysis theory, the first factor accounts for the largest part of the total variance, which implies that project target is vital for effective internal communication within the owner entities. When the project is not well defined and/or well communicated, the perceptions and expectations of the participants may differ, and miscommunication may occur, adversely affecting the project’s ultimate outcome. Table 3 shows that the results of effective communication corresponding to the engineering/design entities consisted of six main components: design and technology, scope clarity, technical and financial support, facility, experience challenges, and decision-making challenges. As presented in Table 3, the first component, which constitutes the largest part of the total variance (17.05%), is design and technology, which includes five ECIs. Being the first component of the factor analysis, design and technology holds the maximum importance for the designers’ effective communication; in other words, working with commonly used technologies makes the designers more familiar with one another’s work, leading to more effective communication. In addition, an unclear scope can affect the quality of communication within the engineers/designers, as they need additional interactions and brainstorming sessions regarding the unclear scope (Kermanshachi et al. 2017).
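The percent-of-total-variance figures discussed above (e.g., 21.77% for project target, 17.05% for design and technology) come from the eigenvalue decomposition underlying the factor extraction. The following numpy illustration shows how loadings and variance shares arise from a correlation matrix; it uses principal-component-style extraction on synthetic data, since the paper does not detail its extraction and rotation settings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data -- NOT the study's survey: 100 observations of six
# indicators driven by two underlying common factors plus noise.
f = rng.normal(size=(100, 2))
X = np.column_stack(
    [f[:, 0] + 0.3 * rng.normal(size=100) for _ in range(3)]
    + [f[:, 1] + 0.3 * rng.normal(size=100) for _ in range(3)]
)

R = np.corrcoef(X, rowvar=False)                # 6x6 correlation matrix
eigvals, eigvecs = np.linalg.eigh(R)            # eigh returns ascending order
order = np.argsort(eigvals)[::-1]               # reorder: largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

loadings = eigvecs * np.sqrt(eigvals)           # unrotated component loadings
pct_variance = 100.0 * eigvals / eigvals.sum()  # percent of total variance

# The leading component carries the largest share of the total variance,
# mirroring how the paper reports the first component's share (21.77% for
# the owner entity, 17.05% for the designer entity) before rotation.
print(np.round(pct_variance, 2))
```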
Table 4 illustrates that the results of effective communication related to contractor entities


also consisted of six main components: field labor, objectives and restrictions, resource availability, equipment, site management challenges, and turnover.

Table 4. Rotated Component Matrix (Effective Communication within Contractor)


Component
Component's Name Effective Communication Indicator
1 2 3 4 5 6
Bulk Materials Quality Issues 0.875 0.173 0.051 0.139 -0.126 -0.081
Field Labor (C1) Field Craft Labor Quality Issues 0.874 0.279 0.124 0.127 0.046 0.028
Percentage of Craft Labor Turnover 0.749 0.115 0.405 0.133 0.03 0.064
Clarity of Projects Scope During Designer/Contractor Selection 0.062 0.851 0.112 0.247 -0.001 0.071
Objectives & Restrictions (C2) Clarity of Funding Process during Front End Planning 0.206 0.788 0.185 -0.061 0.07 -0.073
Clarity of Owners Project Goals and Objectives 0.139 0.776 -0.021 0.304 -0.139 0.008
Impact of Required Approvals-Internal Stakeholders 0.428 0.598 -0.079 -0.36 -0.155 0.022
Degree of Additional Materials Specifications 0.199 -0.015 0.865 -0.051 0.165 0.18
Degree of Additional Construction Specifications 0.27 0.108 0.857 0.152 -0.028 0.049
Resource Availability (C3) Project Funding Delays -0.1 0.25 0.745 0.019 -0.114 -0.246
Project Management Team Peak Size-Procurement Phase 0.381 -0.249 0.478 0.121 -0.048 -0.349
Permanent Equipment Quality Issues 0.027 0.083 0.258 0.767 -0.002 0.211
Equipment (C4) Delay in Delivery of Permanent Facility Equipment 0.195 0.092 -0.124 0.686 0.085 -0.07
Number of Countries Involved in Construction Phase 0.237 0.239 0.113 0.539 -0.441 -0.235
Site Management Challenges (C5) Project Management Team Experience-Construction Phase -0.014 -0.027 0.035 0.06 0.945 -0.119
Turnover (C6) Reuse of Existing Installed Equipment 0.009 -0.005 0 0.049 -0.09 0.92

Figure 2. Measurement model.


Figure 2 presents the final measurement model with the standardized coefficients obtained from the SEM analysis, which was carried out to define the relationships among the main components of effective communication within each of the three primary stakeholders. The standardized coefficients, which are based on the standard deviations of the significant communication indicators, are a typical


measurement used by authors to make coefficients comparable. By analyzing the standardized coefficients in the measurement model, Figure 2 shows that five main components had a significant correlation with effective internal communication: project target (O1), scope clarity (D2), field labor (C1), objectives and restrictions (C2), and resource availability (C3). Accordingly, these components had comparable influences on effective internal communication.
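The comparability argument above can be made concrete: dividing a regression coefficient by the ratio of the variables' standard deviations puts it on a scale-free footing, so coefficients of components measured on different scales can be compared. A small numpy sketch on illustrative data (not the study's dataset); for a single predictor, the standardized coefficient reduces to the Pearson correlation:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative variables on different scales -- NOT the study's data:
# x is a hypothetical component score, y an effective-communication score.
x = rng.normal(loc=50.0, scale=10.0, size=200)
y = 0.8 * x + rng.normal(scale=15.0, size=200)

# Unstandardized OLS slope of y on x:
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Scaling by the standard deviations makes the coefficient unit-free,
# which is what lets the coefficients of different components in the
# measurement model be compared with one another.
beta_std = b * np.std(x, ddof=1) / np.std(y, ddof=1)

# For a single predictor, the standardized coefficient equals the
# Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]
print(round(beta_std, 4), round(r, 4))
```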

CONCLUSION
This study had three aims: (1) to determine the significant effective communication indicators within the three primary stakeholders; (2) to determine the main components of the significant ECIs belonging to the three primary stakeholders; and (3) to identify the relationships between the main components. Thirty effective communication indicators were determined to be statistically significant within the owner, engineering, and contractor entities. The significant ECIs within the owner entities belonged to four main components: project target, technical challenges, logistic challenges, and coordination. Similarly, the main components of the significant ECIs associated with the engineering stakeholder were identified as design and technology, scope clarity, technical and financial support, facility, experience challenges, and decision-making challenges. Likewise, the significant ECIs corresponding to the contractor entities belonged to six main components: field labor, objectives and restrictions, resource availability, equipment, site management challenges, and turnover. Additionally, the results demonstrated that project target, scope clarity, field labor, objectives and restrictions, and resource availability were critical components. The outcomes of this study assist the project management team in identifying effective communication indicators early so that they can plan proactively to manage internal miscommunication and consequently improve project performance.

REFERENCES
Cheng, E. W. L., Li, H., Love, P., and Irani, Z. (2001). “Network communication in the
construction industry,” Cor. Com.: Int. J., 6(2), 61-70.
Chinowsky, P., Taylor, J. E., and Di Marco, M. (2010). "Project network interdependency
alignment: New approach to assessing project effectiveness." J. Manage. Eng., 27(3), 170-
178.
Cho, K. M., Hong, T. H., and Hyun, C. T. (2009). “Effect of project characteristics on project
performance in construction projects based on structural equation model,” Exp. Sys. App., 36,
10461-10470.
Cleland, D. I., and Ireland, L. R. (1999). “Project management: strategic design and
implementation.” Singapore: McGraw-Hill, 1999.
Cook, S., and Macaulay, S. (2013). “Collaboration within teams.” Training Journal.
Dainty, A. R., and Lingard, H. (2006). “Indirect discrimination in construction organizations and
the impact on women’s careers.” J. Manage. Eng., 22(3), 108-118.
Forcada, N., Serrat, C., Rodriguez, S., and Bortolini, R. (2017). “Communication key
performance indicators for selecting construction project bidders.” J. Manage. Eng., 33(6):
04017033.
Habibi, M., Kermanshachi, S., and Safapour, E. (2018). “Engineering, Procurement and
Construction Cost and Schedule Performance Leading Indicators: State-of-the-Art Review,”
Proceedings of Construction Research Congress, ASCE, New Orleans, Louisiana, April 2-4,
2018.


Kamalirad, S., and Kermanshachi, S. (2018). “Development of Project Communication Network:
A New Approach to Information Flow Modeling,” Proceedings of Construction Research
Congress, ASCE, New Orleans, Louisiana, April 2-4, 2018.
Kamalirad, S., and Kermanshachi, S. (2018). “Development of Project Life Cycle
Communication Ladder Framework Using Factor Analysis Method.” Proceedings of
Construction Research Congress, ASCE, New Orleans, Louisiana, April 2-4, 2018.
Kamalirad, S., Kermanshachi, S., Shane, J. and Anderson, S. (2017), “Assessment of
Construction Projects’ Impact on Internal Communication of Primary Stakeholders in
Complex Projects,” Proceedings for the 6th CSCE International Construction Specialty
Conference, Vancouver, Canada May 31-June 3.
Kermanshachi, S. (2010). “U.S. Multi-party Standard Partnering Contract for Integrated Project
Delivery,” Master’s Thesis, Mississippi State University, 2010.
https://books.google.com/books/about/U_S_Multi_party_Standard_Partnering_Cont.html?id
=qX3sygAACAAJ
Kermanshachi, S., Anderson, S. D., Goodrum, P., and Taylor, T. R. (2017). “Project Scoping
Process Model Development to Achieve On-Time and On-Budget Delivery of Highway
Projects,” Transportation Research Record: Journal of the Transportation Research Board,
(2630), 147-155.
Kermanshachi, S. (2016). Decision Making and Uncertainty Analysis in Success of Construction
Projects. Doctoral dissertation, Texas A&M University. http://hdl.handle.net/1969.1/158020.
Lee, N., and Kim, Y. (2018). “A conceptual framework for effective communication in
construction management: Information processing and visual communication,” Construction
Research Congress, 531- 540.
Malisiovas, A., and Song, X. (2014). "Social Network Analysis (SNA) for Construction Projects'
Team Communication Structure Optimization." Construction Research Congress, 2032-
2042.
Mazzei, A. (2014). “Internal communications for employee enablement,” Corp. Com.: I. J.,
19(1), 82-95.
Nipa, T., Kermanshachi, S., and Kamalirad, S. (2019). “Development of Effective
Communication Framework Using Confirmatory Factor Analysis Technique,” Proceedings
for the ASCE International Conference on Computing in Civil Engineering, Atlanta, Georgia
June 17-June 19.
Pentland, A. (2012). “The new science of building great teams.” Harvard Business Review, 90
(4), 60-69.
Safapour, E., Kermanshachi, S., Habibi, M., and Shane, J. (2018), “Resource-Based Exploratory
Analysis of Project Complexity Impact on Phase-Based Cost Performance Behavior,”
Proceedings of Construction Research Congress, ASCE, New Orleans, Louisiana, April 2-4,
2018.
Safapour, E., Kermanshachi, S., Shane, J. and Anderson, S. (2017), “Exploring and Assessing
the Utilization of Best Practices for Achieving Excellence in Construction Projects,”
proceedings of the 6th CSCE International Construction Specialty Conference, Vancouver,
Canada May 31-June 3.
Senescu, R. R., Aranda-Mena, G., and Haymaker, J. R. (2012). "Relationships between project
complexity and communication." Journal of Management in Engineering, 29(2), 183-197.
Welch, M., and Jackson, P. R. (2007). “Rethinking internal communication: a stakeholder
approach.” Corp. Com.: Int. J., 12(2), 177-198.


Wong, F. W. H., and Lam, P. T. I. (2011). “Difficulties and hindrances facing end users of
electronic information exchange systems in design and construction.” J. Manage. Eng.,
27(1), 28-39.

© ASCE
Computing in Civil Engineering 2019 522

Localizing Local Vulnerabilities in Urban Areas Using Crowdsourced Visual Data from
Participatory Sensing
Hongjo Kim, Ph.D.1; Youngjib Ham, Ph.D.2; and Hyoungkwan Kim, Ph.D.3
1Research Associate, Dept. of Construction Science, Texas A&M Univ., 3137 TAMU, College
Station, TX 77843; School of Civil and Environmental Engineering, Yonsei Univ., 50 Yonsei-ro,
Seodaemun-gu, Seoul 03722, Republic of Korea. E-mail: [email protected]
2Assistant Professor, Dept. of Construction Science, Texas A&M Univ., 3137 TAMU, College
Station, TX 77843. E-mail: [email protected]
3Professor, School of Civil and Environmental Engineering, Yonsei Univ., 50 Yonsei-ro,
Seodaemun-gu, Seoul 03722, Republic of Korea. E-mail: [email protected]

ABSTRACT
An essential prerequisite for reducing natural disaster damage is to identify objects
vulnerable to extreme weather events. However, it is not trivial to scrutinize large urban areas
within a short period of time, using conventional data collection processes for disaster
preparedness. To address this issue, we propose a novel geospatial localization method building
on participatory sensing to localize vulnerable objects or areas in cities. The proposed method
consists of sequential modules—a geographic coordinate conversion, mean-shift clustering, deep
learning-based object detection, magnetic declination adjustment, line of sight equation
formulation, and the Moore-Penrose generalized inverse method—to localize urban objects in
crowdsourced data. The localization accuracy of the proposed method is evaluated in a case
study of urban areas in Texas. The proposed method is expected to contribute to rapid data
collection practices in disaster preparedness and enable practitioners to concentrate their limited
resources where they are most needed.

INTRODUCTION
Globally, governments and municipal agencies have endeavored to reduce natural disaster
impacts through emergency management, which consists of four steps—mitigation, preparedness,
response, and recovery (Neal 1997). Among the four steps of emergency management, disaster
preparedness has been emphasized to reduce the potential damage in urban areas by minimizing
undesirable improvisation through preparing for a natural disaster (Alexander 2015). However,
there is a relatively small body of literature that is concerned with identifying potential hazards
in the earlier disaster phases, while previous studies have mainly addressed issues related to
damage assessment or disaster recovery in post-disaster phases (Ham et al. 2017).
Proactively identifying vulnerable objects or areas with respect to a forecasted severe
weather event is essential for enabling responsible government agencies to establish appropriate
resource allocation or action plans for local communities. However, conventional data collection
processes such as site visits, interviews, and surveys have been found to be a major obstacle to
the timely investigation of potentially vulnerable objects in large urban areas.
To facilitate data collection over large urban areas with little incremental cost, participatory
sensing has attracted considerable research attention in various applications (Christin et al. 2011;
Goldman et al. 2009). The term ‘participatory sensing’ refers to the voluntary participation of
citizens who report personal observations regarding a nearby object or area of interest, typically
using their mobile devices. While the benefit of participatory sensing for data collection in
disaster situations has long been acknowledged (Alexander 2015; Henriksen et al. 2018; Norris
et al. 2008), previous studies have mainly employed it to collect local information in regard to
disaster response and recovery phases such as flood monitoring (Lo et al. 2015; Restrepo-Estrada
et al. 2018; Wang et al. 2018). Considering the needs in disaster preparedness regarding severe
weather, the geographic locations of vulnerable objects or areas, which are likely to be affected
by strong wind and precipitation, should be identified in advance to prevent potential accidents.
Despite the benefits of participatory sensing, consumer-level smartphones are ill-suited to
localizing distant objects or places due to the absence of ranging sensors. Moreover,
crowdsourced data typically contain noise, which lowers data reliability (Kanhere 2011).
To address these challenges, this study presents a novel geospatial localization method to
pinpoint the geospatial location of distant objects from data collected by citizens’ smartphones.
The proposed method exploits three types of mobile sensor data: images, geographic locations,
and compass bearings to estimate the location of a distant target object through consecutive
processing algorithms. To obtain such data, consumer-level smartphones are used to capture an
image of a distant target object and to measure a compass bearing in the direction toward the
target object. Each algorithm of the proposed method is dedicated to improving the data
reliability by minimizing the impact of human error, measurement error, and the geospatial
proximity of urban objects that result in inconsistent localization accuracy. By presenting the
novel geospatial localization method from the perspective of participatory sensing, this work
contributes to the knowledge of geospatial localization for distant objects in cities using
consumer-level mobile devices. Moreover, it is expected that the proposed method will promote
the exploitation of participatory sensing in the earlier disaster phases to identify local
vulnerabilities in urban areas regarding extreme weather events. Although the definition of local
vulnerabilities varies depending on the types of weather events, the proposed localization method
can be universally used to find the target object location regardless of object types, when the
same types of mobile sensor data are available.

BACKGROUND
Previous studies on disaster management have primarily focused on damage assessment and
recovery phases. An important task in disaster recovery is identifying the current status of
damaged areas. In this context, Yeum et al. (2019) proposed a structural drawing image
restoration method after natural disasters to analyze structural performance of buildings. Naser
and Kodur (2018) presented an embedded sensor-based method to predict the structural
performance of infrastructure through key response parameters such as temperature, strain,
deformation, and vibration levels during disasters. Zhou et al. (2016) utilized the structure-from-
motion algorithm to investigate its applicability regarding residential building damage
assessment in the post-disaster phase. El-Anwar et al. (2009) presented a temporary housing
allocation system to arrange temporary housing solutions for displaced people after natural
disasters by optimizing a set of objectives. Lee et al. (2013) developed an urban facility
management system to monitor abnormalities in facilities through a ubiquitous sensor network
for facilitating prompt responses in emergency situations. These studies particularly contribute to
the knowledge with respect to emergency response during natural disasters or recovery of
damaged areas. Regarding disaster preparedness, a few studies have utilized simulation methods
to analyze emergency response of citizens to establish proactive action plans (Bunea et al. 2016;
Choi et al. 2018; Xiong et al. 2015). Meanwhile, there has been much less attention to
developing data collection methods using participatory sensing, despite its potential for
establishing proactive action plans in the disaster preparedness phase to reduce damages from
extreme weather events in urban areas. To employ participatory sensing as a data collection tool
for large areas, it should be equipped with the geospatial localization capability for distant
objects, since the proactive identification of local vulnerabilities is essential for disaster
preparedness. Herein, geospatial localization denotes the process of identifying the location of an
object or places in world coordinate systems. Previous studies related to geospatial localization
have investigated ways to localize distant objects or scenes using smartphones in urban areas
(Chen et al. 2016; Ha et al. 2018; Ham and Yoon 2018; Manweiler et al. 2012; Ouyang et al.
2013), but their methods showed unstable localization accuracy.

Figure 1. Overview of the proposed geospatial localization method using participatory sensing

THE PROPOSED GEOSPATIAL LOCALIZATION METHOD USING
PARTICIPATORY SENSING
Figure 1 shows an overview of the proposed method to localize distant objects from
crowdsourced data, which can be largely divided into three parts—pre-processing the input data,
geospatial localization, and post-processing. A data sample in a crowdsourced data set contains a
geographic location in the form of spherical coordinates (Latitude, Longitude, and Altitude), an
image containing a distant object of interest, and a compass bearing measured by pointing a
smartphone at the object of interest. To reliably identify the location of a distant target object,
the impact of noise—human error, measurement error, and the geospatial proximity of urban
objects—involved in crowdsourced data should be minimized, which is the subject of this paper.
Preprocessing the crowdsourced data: The preprocessing part is designed to facilitate
geospatial localization by converting geographic coordinates and to reduce the impact of human
error and the geospatial proximity of urban objects. First, the geographic location of
crowdsourced data is converted (from spherical coordinates) to two-dimensional Cartesian
coordinates (Easting and Northing) in the Universal Transverse Mercator (UTM) system. This
conversion replaces the complex calculation in spherical trigonometry with the simple
calculation in the two-dimensional space to robustly find the location of a distant object. Then,
the mean-shift clustering algorithm separates the crowdsourced data into each cluster based on
their proximity to nearby data samples. By doing so, only relevant data to a target urban object
can be selected for the localization process. Finally, a region-based fully convolutional network
(R-FCN) (Dai et al. 2016) is employed to detect a distant object of interest in images by
eliminating irrelevant data based on the presence of a target object in a selected cluster. In other
words, crowdsourced data samples that do not contain the target object are regarded as irrelevant
and are thus removed to improve the localization accuracy.
Geospatial Localization: The proposed geospatial localization method finds the location of
a distant object by calculating the point of intersection of multiple line equations, each of which
directs toward a distant object from a citizen’s location. Each line equation is formulated by
using the geographic location of a crowdsourced data sample and its compass bearing. The compass
bearing value is adjusted before the line formulation considering magnetic declination of the
Earth, which represents the angle difference between the true north and the magnetic north. A
value of magnetic declination varies by locations, thus, a specific magnetic declination value is
determined by referring to the International Geomagnetic Reference Field Model (National
Centers for Environmental Information 2018). Using a geographic location and an adjusted
compass bearing, a line equation is formulated as follows:
y  tan(90     )   x  x1   y1 (1)
where x : Easting, y : Northing,  : compass bearing,  : magnetic declination, x1 : the Easting of
a data sample, and y1 : the Northing of a data sample.
With this line formulation, each data sample has its own line equation that points to a
direction toward a distant object of interest. The location of a distant object is obtained by
solving a linear system that consists of multiple line equations obtained by equation (1). The
linear system can be arranged as follows:
\[
x \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{bmatrix}
+ y \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} \qquad (2)
\]
where \(a_{n1}\) is a coefficient of \(x\), \(a_{n2}\) is a coefficient of \(y\), and \(b_n\) is a constant.
\[
AL = b \qquad (3)
\]
where \(A = \begin{bmatrix} a_{11} & a_{12} \\ \vdots & \vdots \\ a_{n1} & a_{n2} \end{bmatrix}\), \(L = \begin{bmatrix} x \\ y \end{bmatrix}\), and \(b = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}\).
The estimated location of L is derived by solving equation (3) using the Moore-Penrose
generalized inverse method (Serre 2010), as follows:
A AL  A b (4)

LA b (5)
where A+ satisfies the four properties of (1) AA+A=A, (2) A+AA+=A+, (3) (AA+)T=AA+, and (4)
(A+A)T=A+A, when the elements of A are real numbers.
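To make Eqs. (1) to (5) concrete, the following sketch builds one line per data sample and solves the stacked system with NumPy's pseudoinverse. This is a minimal illustration, not the authors' implementation; the helper names `bearing_line` and `localize` are our own, and `np.linalg.pinv` supplies the Moore-Penrose generalized inverse of Eq. (5):

```python
import numpy as np

def bearing_line(x1, y1, bearing_deg, declination_deg=0.0):
    """Coefficients (a1, a2, b) of the line a1*x + a2*y = b through
    (x1, y1) along the declination-adjusted bearing, i.e. Eq. (1)
    rearranged as m*x - y = m*x1 - y1 with m = tan(90 - theta - delta).
    Note: bearings of exactly 0 or 180 degrees make m infinite and
    would need a special-cased vertical line."""
    m = np.tan(np.radians(90.0 - bearing_deg - declination_deg))
    return m, -1.0, m * x1 - y1

def localize(samples, declination_deg=0.0):
    """Least-squares intersection of all bearing lines (Eqs. 2-5).
    samples: iterable of (easting, northing, compass_bearing_deg)."""
    rows, rhs = [], []
    for x1, y1, theta in samples:
        a1, a2, b = bearing_line(x1, y1, theta, declination_deg)
        rows.append([a1, a2])
        rhs.append(b)
    A = np.asarray(rows)
    b = np.asarray(rhs)
    # L = A^+ b (Eq. 5); pinv computes the Moore-Penrose pseudoinverse
    return np.linalg.pinv(A) @ b
```

With two or more well-separated observations the system is overdetermined, and the pseudoinverse returns the least-squares intersection point in UTM coordinates.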
Post-processing the estimated locations: Since each crowdsourced data set carries different
measurement and human errors, the estimated locations of distant objects can vary widely. The
post-processing module is devised to stabilize the geospatial localization
accuracy by determining a final estimated location of a distant object of interest, using the mean-
shift clustering algorithm over multiple estimated locations. The underlying assumption is that
the selected cluster center location has a relatively smaller distance error than most estimated
locations with respect to the ground truth location. Specifically, once estimated locations are
generated by equation (5), the mean-shift clustering algorithm separates the estimated locations
into different clusters. Then, the cluster with the largest number of estimated locations is
selected, and its center location is determined as the final estimated location of a distant object.
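The paper does not publish code for this step, so the following is a hedged, self-contained NumPy sketch of a flat-kernel mean-shift that returns the center of the most populated cluster; the 40 m window follows the experiment section, and the function name `mean_shift_final_location` is hypothetical:

```python
import numpy as np

def mean_shift_final_location(points, window=40.0, iters=100, tol=1e-4):
    """Return the center of the most populated mean-shift cluster.
    points: (n, 2) array of estimated (easting, northing) locations.
    window: flat-kernel bandwidth in metres (40 m in the paper)."""
    pts = np.asarray(points, dtype=float)
    modes = pts.copy()
    for _ in range(iters):
        # Shift every mode to the mean of the points inside its window
        shifted = np.array([
            pts[np.linalg.norm(pts - p, axis=1) <= window].mean(axis=0)
            for p in modes
        ])
        if np.linalg.norm(shifted - modes) < tol:
            modes = shifted
            break
        modes = shifted
    # Merge converged modes that coincide and count their members
    centers, counts = [], []
    for m in modes:
        for j, c in enumerate(centers):
            if np.linalg.norm(m - c) <= window / 2.0:
                counts[j] += 1
                break
        else:
            centers.append(m)
            counts.append(1)
    # The cluster holding the most estimated locations wins
    return centers[int(np.argmax(counts))]
```

Feeding the many per-subset location estimates into such a function yields a single final location that is insensitive to the outlier estimates produced by noisy bearings.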

EXPERIMENTS AND DISCUSSION


The experiments were conducted using a computer with the configuration of the Intel i7-
6700 CPU and the GTX1080 8GB GPU in the Ubuntu 16.04 operating system. As a target urban
object, a tower crane was selected because it is generally considered a vulnerable object in
concentrated urban areas during severe wind-related events; twenty-one data samples were
collected at distances ranging from 68 m to 296 m (average distance: 186 m) in Houston, TX. An
additional localization experiment was also conducted with respect to fire hydrants in College
Station, TX, for evaluating the geo-clustering performance in the pre-processing module. The
localization accuracy was measured by calculating the difference between the ground truth
location and a final estimated location. To evaluate the object detection performance, a mean
average precision value was used. The magnetic declinations at the data collection locations were
+2°58′ in College Station and +2°23′ in Houston. The window size of the mean-shift clustering
algorithm was 400m at the pre-processing module, and 40m at the post-processing module. To
train the object detection model, 230 images containing tower cranes and 53 images for fire
hydrants were used, and the model was tested with 57 and 49 test images for each class,
respectively. In the detection experiment, a mean average precision of 74.67% for tower cranes
and 95.62% for fire hydrants were recorded. Based on the detection results, only relevant data
samples were selected for geospatial localization. For evaluation, 300 localization results were
averaged to obtain an average distance error. Table 1 shows the localization accuracy, and the
average computation time per each localization was 0.106s during the experiments.

Table 1. Examples of geospatial localization results in the case study.


Target object | Number of data samples | Average distance from data samples to a distant object | Average distance error of the final estimated location
Tower crane   | 22                     | 186 m                                                  | 27.8 m
Fire hydrant  | 25                     | 62 m                                                   | 2.5 m
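As a small illustration of the evaluation metric, the average distance error reported in Table 1 can be computed as the mean Euclidean distance between repeated final estimates and the surveyed ground truth. This is a sketch under stated assumptions (planar UTM coordinates; the function name is our own):

```python
import numpy as np

def average_distance_error(estimates, ground_truth):
    """Mean Euclidean distance (in metres, assuming planar UTM
    coordinates) between repeated final location estimates and the
    surveyed ground-truth location of the target object."""
    est = np.asarray(estimates, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    return float(np.mean(np.linalg.norm(est - gt, axis=1)))
```

Averaging over many repeated localizations, as done with the 300 results in the experiments, smooths out run-to-run variation caused by noisy bearings.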

The experimental result demonstrated the efficacy of the proposed geospatial localization
method to identify the location of a distant object based on multiple crowdsourced data. The
recorded localization errors for the tower crane and fire hydrants signify that the proposed
method has the potential for localizing a distant object in the context of participatory sensing.
Particularly, it was observed that the final localization error was considerably smaller than the
average distance errors of the individual estimated locations, by 2.5% to 1241.7%, in the case
studies. Such performance improvement was brought about primarily by the mean-shift clustering
module in the post-processing, which determines the final location of a distant object as the
cluster center of multiple estimated locations, as shown in Figure 2.

Figure 2. The final estimated location (CT: a cluster center with the largest number of the
estimated locations) and the ground truth (GT) of the tower crane among the estimated
locations (Circles) and the cluster centers (Diamonds) in the UTM Zone 15R.
CONCLUSION
This paper aims to present a novel participatory sensing-based geospatial localization method
to localize distant objects vulnerable to extreme weather events in urban areas. Through the
proposed computational process that includes the geographic coordinate conversion, mean-shift
clustering, deep learning-based object detection, magnetic declination adjustment, line of sight
equation formulation, and the Moore-Penrose generalized inverse method, the geographic
location of objects in an urban area could be robustly identified in the case study of
Houston, TX. In the context of participatory sensing for disaster management, the proposed
method is expected to be used to quickly and reliably collect and analyze local vulnerabilities
over large areas. To evaluate the generality of the proposed method, further studies should be
conducted to enhance the localization accuracy under challenging data collection conditions.

ACKNOWLEDGMENT
This material is in part based upon work supported by the National Science Foundation (NSF)
under CMMI Award#1832187. In addition, this work was supported by the National Research
Foundation of Korea (NRF) grant funded by the Korea government—the Ministry of Education
(No. 2018R1A6A1A08025348) and the Ministry of Science, ICT and Future Planning (No.
2018R1A2B2008600)—and the Yonsei University Research Fund (Yonsei Frontier Lab. Young
Researcher Supporting Program) of 2018. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the author(s) and do not necessarily
reflect the views of the funding agencies.


REFERENCES
Alexander, D. (2015). "Disaster and emergency planning for preparedness, response, and
recovery." Oxford University Press.
Bunea, G., Leon, F., and Atanasiu, G. M. (2016). "Postdisaster Evacuation Scenarios Using
Multiagent System." Journal of Computing in Civil Engineering, 30(6), 05016002.
Chen, H., Guo, B., Yu, Z., and Han, Q. "Toward real-time and cooperative mobile visual sensing
and sharing." Proc., IEEE INFOCOM 2016 - The 35th Annual IEEE International
Conference on Computer Communications, 1-9.
Choi, M., Starbuck, R., Lee, S., Hwang, S., Lee, S., Park, M., and Lee, H.-S. (2018). "Distributed
and interoperable simulation for comprehensive disaster response management in facilities."
Automation in Construction, 93, 12-21.
Christin, D., Reinhardt, A., Kanhere, S. S., and Hollick, M. (2011). "A survey on privacy in
mobile participatory sensing applications." Journal of Systems and Software, 84(11), 1928-
1946.
Dai, J., Li, Y., He, K., and Sun, J. (2016). "R-FCN: Object detection via region-based fully
convolutional networks." Proc., Advances in Neural Information Processing Systems, Neural
Information Processing Systems Foundation, Barcelona, Spain, 379-387.
El-Anwar, O., El-Rayes, K., and Elnashai, A. (2009). "An automated system for optimizing post-
disaster temporary housing allocation." Automation in Construction, 18(7), 983-993.
Goldman, J., Shilton, K., Burke, J., Estrin, D., Hansen, M., Ramanathan, N., Reddy, S., Samanta,
V., Srivastava, M., and West, R. (2009). "Participatory Sensing: A citizen-powered approach
to illuminating the patterns that shape our world." Foresight & Governance Project, White
Paper, 1-15.
Ha, I., Kim, H., Park, S., and Kim, H. (2018). "Image retrieval using BIM and features from
pretrained VGG network for indoor localization." Build. Environ., 140, 23-31.
Ham, Y., Lee, S. J., and Chowdhury, A. G. (2017). "Imaging-to-Simulation Framework for
Improving Disaster Preparedness of Construction Projects and Neighboring Communities."
Computing in Civil Engineering, 230-237.
Ham, Y., and Yoon, H. (2018). "Motion and Visual Data-Driven Distant Object Localization for
Field Reporting." Journal of Computing in Civil Engineering, 32(4), 04018020.
Henriksen, H. J., Roberts, M. J., van der Keur, P., Harjanne, A., Egilson, D., and Alfonso, L.
(2018). "Participatory early warning and monitoring systems: A Nordic framework for web-
based flood risk management." International Journal of Disaster Risk Reduction.
Kanhere, S. S. "Participatory Sensing: Crowdsourcing Data from Mobile Smartphones in Urban
Spaces." Proc., 2011 IEEE 12th International Conference on Mobile Data Management, 3-6.
Lee, J., Jeong, Y., Oh, Y.-S., Lee, J.-C., Ahn, N., Lee, J., and Yoon, S.-H. (2013). "An integrated
approach to intelligent urban facilities management for real-time emergency response."
Automation in Construction, 30, 256-264.
Lo, S.-W., Wu, J.-H., Lin, F.-P., and Hsu, C.-H. (2015). "Visual Sensing for Urban Flood
Monitoring." Sensors, 15(8), 20006-20029.
Manweiler, J. G., Jain, P., and Choudhury, R. R. (2012). "Satellites in our pockets: an object
positioning system using smartphones." Proceedings of the 10th international conference on
Mobile systems, applications, and services, ACM, Low Wood Bay, Lake District, UK, 211-
224.
Naser, M. Z., and Kodur, V. K. R. (2018). "Cognitive infrastructure - a modern concept for
resilient performance under extreme events." Automation in Construction, 90, 253-264.


National Centers for Environmental Information (2018). "Magnetic Field Calculators."
<https://www.ngdc.noaa.gov/geomag-web/#declination>. (Oct. 10, 2018).
Neal, D. M. (1997). "Reconsidering the Phases of Disasters." International journal of mass
emergencies and disasters, 15(2), 239-264.
Norris, F. H., Stevens, S. P., Pfefferbaum, B., Wyche, K. F., and Pfefferbaum, R. L. (2008).
"Community Resilience as a Metaphor, Theory, Set of Capacities, and Strategy for Disaster
Readiness." American Journal of Community Psychology, 41(1-2), 127-150.
Ouyang, R. W., Srivastava, A., Prabahar, P., Choudhury, R. R., Addicott, M., and McClernon, F.
J. (2013). "If you see something, swipe towards it: crowdsourced event localization using
smartphones." Proceedings of the 2013 ACM international joint conference on Pervasive and
ubiquitous computing, ACM, Zurich, Switzerland, 23-32.
Restrepo-Estrada, C., de Andrade, S. C., Abe, N., Fava, M. C., Mendiondo, E. M., and de
Albuquerque, J. P. (2018). "Geo-social media as a proxy for hydrometeorological data for
streamflow estimation and to improve flood monitoring." Computers & Geosciences, 111,
148-158.
Serre, D. (2010). Matrices: Theory and Applications, Springer, New York, USA.
Wang, R.-Q., Mao, H., Wang, Y., Rae, C., and Shaw, W. (2018). "Hyper-resolution monitoring
of urban flooding with social media and crowdsourcing data." Computers & Geosciences,
111, 139-147.
Xiong, C., Lu, X., Hori, M., Guan, H., and Xu, Z. (2015). "Building seismic response and
visualization using 3D urban polygonal modeling." Automation in Construction, 55, 25-34.
Yeum, C. M., Lund, A., Dyke, S. J., and Ramirez, J. (2019). "Automated Recovery of Structural
Drawing Images Collected from Postdisaster Reconnaissance." Journal of Computing in
Civil Engineering, 33(1), 04018056.
Zhou, Z., Gong, J., and Guo, M. (2016). "Image-Based 3D Reconstruction for Posthurricane
Residential Building Damage Assessment." Journal of Computing in Civil Engineering,
30(2), 04015015.


Enhancing Construction Safety Monitoring through the Application of Internet of Things
and Wearable Sensing Devices: A Review
Ibukun Awolusi, Ph.D.1; Chukwuma Nnaji, Ph.D.2; Eric Marks, Ph.D., P.E.3;
and Matthew Hallowell, Ph.D.4
1Dept. of Construction Science, Univ. of Texas at San Antonio, 501 W. Cesar E. Chavez Blvd.,
San Antonio, TX 78207. E-mail: [email protected]
2Dept. of Civil, Construction, and Environmental Engineering, Univ. of Alabama, 263 Hardaway
Hall, Tuscaloosa, AL 35487. E-mail: [email protected]
3School of Civil and Environmental Engineering, Georgia Institute of Technology, 790 Atlantic
Dr. NW, Atlanta, GA 30313. E-mail: [email protected]
4Dept. of Civil, Environmental, and Architectural Engineering, Univ. of Colorado at Boulder,
1111 Engineering Dr., Boulder, CO 80309. E-mail: [email protected]

ABSTRACT
The high frequency of work-related injuries and fatalities experienced on construction sites
makes the construction process a very hazardous endeavor. The collection and analysis of safety
data is an important element in measurement and improvement strategy development. Wearable
sensing devices (WSDs) and the internet of things (IoT) have been identified as emerging
technologies with strong potential for a transformative change in many aspects of construction
workers’ safety monitoring, including tracking and transmitting workers’ safety information in
real time. This paper provides an evaluation of the potential applications of WSDs and IoT for
the continuous collection, analysis, and monitoring of construction workers’ safety metrics to
mitigate safety hazards and health risks on construction sites. Wearable sensors and systems that
can be used for physiological monitoring, environmental sensing, proximity detection, and
location tracking of a wide range of construction hazards and vital signals which can provide
early warning signs of safety issues to construction workers are reviewed. A schematic model for
integrating wearable sensors for interoperability and multi-parameter monitoring to capture and
track several safety metrics is also presented. The challenges facing the widespread adoption of
WSDs and IoT in construction are also identified and evaluated. Based on the outcomes of the
review completed, recommendations are made on how WSDs and IoT can be effectively
implemented to enhance safety performance on construction sites.

INTRODUCTION
The high rate of fatalities in the construction industry remains a major concern of
practitioners and researchers. Given the high proportion of fatal and non-fatal accidents
occurring in the construction industry, construction companies constantly seek novel strategies
that promote safety (Demirkesen and Arditi 2015). Because of the transient and dynamic nature
of construction, organizations must be able to quickly adapt to change by effectively capturing,
storing, and disseminating new strategies that prevent injuries (Hallowell 2012). Thus, new
technologies may be candidates for safety advancement. Although technology has undoubtedly
played a major role in the improvement of construction processes, its application for
personalized construction safety monitoring has not been fully explored (Cheng et al. 2012).
Most of the existing data collection approaches are manual and are faced with major
challenges related to accurate recording, interpretation, and efficiency (Teizer and Vela 2009).


WSDs offer a non-intrusive solution that provides objective and real-time data that can be used
to make efficient and proactive decisions. WSDs are considered a subset of IoT, which spans
different market segments including smart appliances, connected cars, and many others.
Wearable IoT (WIoT) is a technological infrastructure that interconnects wearable sensors to
enable monitoring human factors including health, wellness, behaviors, and other data useful in
enhancing individuals’ everyday quality of life (Hiremath et al. 2014). WSDs can be used by
workers to monitor and control their health profile via real-time feedback, so that the earliest
signs of safety issues arising from health problems can be detected and corrected (Sung et al.
2005). Wearable sensors can also provide safety supervisors with quantitative measures of
subjects’ status on construction sites, thus facilitating decisions made concerning the adequacy of
ongoing interventions and possibly allowing for prompt modification of the strategy if needed
(Bonato 2009). In spite of the conceivable benefits of the application of WSDs and IoT, there
remains some resistance to applying this class of technology in the construction industry. This
paper provides an evaluation of the potential applications of WSDs and IoT for construction
safety monitoring.

CONSTRUCTION SAFETY MONITORING USING WSDS AND IOT


Over the years, the world has transitioned from basic internet services to social networks to
the wearable web, leading to a continuous upsurge in the need to interconnect smart wearable
devices. The emergence of wearable devices is giving a new dimension to IoT by creating an
intelligent fabric of body-worn or near-body sensors communicating with each other or with the
internet (Hiremath et al. 2014). The IoT is the network of physical objects supported by
embedded technology for data communication and by sensors that interact with internal and
external object states and with the environment (Haghi et al. 2017). According to Hiremath et al.
(2014), the concept of IoT provides a solid framework for interconnecting edge computing
devices––wearable sensors and smartphones––and cloud computing platforms for seamless
interactions. It merges the virtual world and the physical world by bringing different concepts
and technical components together: pervasive networks, miniaturization of devices, mobile
communication, and new ecosystem (Chen et al. 2014). Existing studies indicate that the
adoption of WSDs based on IoT infrastructure has the potential to enhance worker safety through
efficient data collection and analysis and the provision of real-time information about safety and
health risks to personnel (Bonato 2009; Ananthanarayan and Siek 2010; Nath et al., 2017;
Awolusi et al. 2018).
However, the application of WSDs and IoT in construction is at the incipient stage when
compared to other industries (Cheng et al. 2012). The constrained and slow implementation is
due, in part, to a lack of reliable data supporting their potential benefits and absence of critical
information needed for integrating such technologies into work processes (Yang et al. 2016;
Nnaji et al. 2018). Recently, the construction industry has been experiencing a gradual increase
in the adoption of mobility and automation tools as well as other technologies that can increase
efficiency. It is anticipated that in-line with this industry trend, a rise in the use of IoT based
WSDs could uncover possibilities for improvement in construction, particularly in their
implementation for personalized safety monitoring. It is expected that the increased utilization of
these technologies will improve construction safety performance by reducing injuries, illnesses,
and fatalities on construction sites. The various applications of WSDs for construction safety
monitoring are discussed as follows.
Physiological Monitoring: Construction site workers often encounter various health risks as
a result of the austere and dynamic work environments that can impact the safety performance
and overall working effectiveness of construction personnel. Physiological data such as heart
rate, breathing rate, body posture, body speed, and body acceleration can be automatically
recorded and analyzed using different sensors and systems, such as the Physiological Status
Monitoring (PSM) system and GPS tracking devices, to assess the health of ground workers and
construction equipment operators (Gatti et al. 2011; Awolusi et al. 2016; Shen et al. 2017). A broad
set of physiological sensors commonly used may include electrocardiogram (ECG/EKG) sensor,
electromyography (EMG) sensor, electroencephalography (EEG) sensor, skin temperature
sensor, blood pressure sensor, tilt sensor, breathing sensor, and movement sensors. The various
metrics captured by these sensors indicate a construction worker's stress level and health status,
which serve as measures of the worker's safety performance. WSDs containing
gyroscope, accelerometer, and magnetometer have gradually found practical applications in
human motion analysis to improve balance control and reduce falls. Data analysis procedures
could be exclusively developed to detect falls via processing of motion and vital sign data.
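As a minimal sketch of such a fall-detection procedure (the thresholds, function names, and two-phase rule below are illustrative assumptions, not values or logic from the studies cited above), a rule that looks for a low-acceleration window followed by an impact spike could be written as:

```python
import math

# Hypothetical thresholds in g-units; real systems tune these empirically.
FREE_FALL_G = 0.4   # total acceleration well below 1 g suggests free fall
IMPACT_G = 2.5      # a sharp spike afterwards suggests an impact

def magnitude(sample):
    """Total acceleration magnitude from a 3-axis accelerometer sample."""
    ax, ay, az = sample
    return math.sqrt(ax * ax + ay * ay + az * az)

def detect_fall(samples):
    """Flag a fall when a low-g window is later followed by a high-g impact.

    samples: iterable of (ax, ay, az) tuples in g-units, in time order.
    The free-fall flag latches once set, which is adequate for a sketch.
    """
    in_free_fall = False
    for sample in samples:
        g = magnitude(sample)
        if g < FREE_FALL_G:
            in_free_fall = True
        elif in_free_fall and g > IMPACT_G:
            return True
    return False
```

A deployed system would also fuse gyroscope and vital-sign data, as the text notes, rather than relying on acceleration magnitude alone.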
Environmental Sensing: The construction work environment poses health and safety risks to
workers not only due to the continuous exposure to weather elements but also because of the
inherent need for certain materials that might be hazardous to construction workers. Automated
sensing of these injurious materials and inclement weather elements is necessary to provide early
warning signals to construction personnel. More generally, it is now possible to use
environmental sensors to measure a range of concerns, including temperature, air quality,
humidity, barometric pressure, gas leaks, visibility, light intensity, spectrum, radiation, hydrogen
sulfide, and carbon monoxide (Swan 2012). Workers can be monitored while performing their
normal work and can simultaneously see highly localized, real-time data on temperature,
hazardous gases, air particles, and possible toxic chemical leaks.
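A threshold check against exposure limits is the simplest form of such early warning. The limits and names below are illustrative assumptions only; actual limits come from regulatory guidance, not from the cited work:

```python
# Hypothetical exposure limits; real values come from OSHA/NIOSH guidance.
LIMITS = {
    "temperature_c": 38.0,        # ambient temperature, degrees Celsius
    "carbon_monoxide_ppm": 35.0,
    "hydrogen_sulfide_ppm": 10.0,
}

def environmental_alerts(readings):
    """Return the names of any sensor readings that exceed their limit.

    readings: {sensor_name: current_value}; unknown sensors are ignored.
    """
    return [name for name, value in readings.items()
            if name in LIMITS and value > LIMITS[name]]
```

For example, a reading set with an elevated temperature but normal gas levels would flag only the temperature channel.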
Proximity Detection: Contact injuries, which occur frequently and with high severity on
construction sites, can be prevented in a timely manner using real-time proximity detection and
warning systems. WSDs for proximity detection are capable of alerting construction personnel
and equipment operators during hazardous proximity situations (Marks and Teizer 2012). Many
proximity avoidance systems have been developed by utilizing various technologies, such as an
ultrasonic-based sensor (Choe et al. 2014), radio-frequency identification (RFID) sensing
technology (Chae and Yoshida 2010; Teizer et al. 2010; Park et al. 2016), radar (Choe et al.
2014; Ruff 2006), GPS (Oloufa et al. 2003; Wang and Razavi 2016), and magnetic field
generators (Li et al. 2012), to prevent contact accidents, particularly for accidents due to being
struck by equipment. Most of these technologies provide some form of warning signals to
workers when they are close to heavy equipment. These signals could be visual, vibratory, or
audible warning signals. The choice of signal type also depends on the task being carried out on
the construction site. The proximity zones could either be within
warning zones with limited risks or within danger zones, which constitute regions of high risks.
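The warning/danger zone logic just described can be sketched as a simple classifier; the zone radii and function name are illustrative assumptions, not values from the cited systems:

```python
def classify_proximity(distance_m, warning_m=10.0, danger_m=4.0):
    """Map a worker-to-equipment distance onto the zones described above.

    distance_m: measured separation in meters.
    warning_m, danger_m: assumed zone radii (illustrative defaults).
    """
    if distance_m <= danger_m:
        return "danger"    # high-risk region: trigger an immediate alarm
    if distance_m <= warning_m:
        return "warning"   # limited risk: issue a visual/vibratory cue
    return "safe"
```

In practice the underlying distance estimate would come from RFID, radar, magnetic-field, or GPS measurements, as surveyed above.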
Location Tracking: Locating and tracking resources is critical in many industrial
applications for monitoring productivity and safety. In construction, various technologies such as
GPS (Papapostolou and Chaouchi 2011), RFID and RF localization (Zhu et al. 2012), UWB
(Cho et al. 2010; Saidi et al. 2011; Shahi et al. 2012), sonar, magnetic field, and radar have been
proposed for monitoring safety performance. Localization and tracking technologies have been
applied to identify undetected obstructions in blind spots (Fullerton et al. 2009) and have also
been utilized in the tracking of workers to manage factors related to human error, such as lack of
hazard recognition (Hallowell et al. 2010). All these applications highlight the importance of
real-time location and progress tracking technologies. The development of low-cost, small-sized
wearable sensors that work with IoT systems makes it possible to collect location information
and then provide services based on it.
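As a simple illustration of such a location-based service (the circular-geofence formulation and function names are assumptions, not a system from the cited work), a GPS fix can be checked against a hazard or work zone:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two GPS fixes."""
    r = 6371000.0  # mean Earth radius, meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def inside_geofence(fix, center, radius_m):
    """True if a (lat, lon) GPS fix falls inside a circular zone."""
    return haversine_m(fix[0], fix[1], center[0], center[1]) <= radius_m
```

A monitoring service could run this check on each incoming fix and raise an alert when a worker enters a restricted zone or leaves a permitted one.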
Daily, approximately 6.5 million people work at about 252,000 construction sites across the
U.S. (OSHA 2017). Due to the inherent nature of the diverse tasks performed on construction
sites, these workers are continuously exposed to a wide range of safety and health hazards that
increase the potential for becoming ill, injured, or even disabled for life. Some of the potential
safety and health hazards for construction workers include falls from heights due to improper
erection of scaffolding or use of ladders; repetitive motion injuries; heat exhaustion or heat
stroke due to body temperatures rising to dangerous levels; and being struck by moving
equipment working in close proximity to workers (OSHA 2017). Table 1 illustrates the sensors
and systems that are currently used in IoT enabled WSDs for monitoring and tracking the
common safety and health hazards associated with the construction process (Awolusi et al.
2018).

Table 1. Sensors and Systems for Construction Safety Monitoring


Category | Construction Site Hazard | Metrics | Sensing Technologies
Physiological Monitoring | Falls from height | Body posture | Gyroscope, Accelerometer, Magnetometer
Physiological Monitoring | Slips and trips | Body posture, Body speed, Body rotation and orientation | Gyroscope, Accelerometer
Physiological Monitoring | Stress | Heart rate, Blood pressure, Respiratory rate | ECG/EKG, Infrared, Radar
Physiological Monitoring | Heat or cold | Body temperature | Thermistor
Environmental Sensing | Fire and explosions | Smoke and fire detection | Infrared
Environmental Sensing | Noise | Noise level | Noise sensor
Proximity Detection and Location Tracking | Caught-in or -between | Proximity detection | RFID, UWB, Infrared, Radar, Bluetooth
Proximity Detection and Location Tracking | Cave-in | Location tracking | GPS, RFID, UWB
Proximity Detection and Location Tracking | Struck-by object | Proximity detection, Location tracking | RFID, UWB, Infrared, Radar, Bluetooth, GPS
Proximity Detection and Location Tracking | Electrocution | Proximity detection, Location tracking | RFID, UWB, Infrared, Radar, Bluetooth, GPS

A few commercially available IoT-based WSDs designed specifically for monitoring
construction worker safety exist. Most of these devices are still at a nascent stage of evolution
and thus have not been extensively applied in research to evaluate their effectiveness.
Additionally, many other wearable devices used for construction safety are not based on IoT.
Although these devices provide some form of protection or warning alerts to workers, they are
limited in the areas of information processing, cloud computing, and edge analytics that IoT
brings. A current study being undertaken by the researchers is the
evaluation and comparison of these WSDs and other potential devices used in other industrial
sectors based on parameters such as size/weight, power source/battery life, sensors and sensor
network, wireless connectivity, data logging and storage, software library, etc.
From the existing work on the implementation of WSDs in construction, no efficient
solution has been developed to integrate different sensors for comprehensive construction
safety monitoring that covers physiological monitoring, environmental sensing, proximity
detection, and location tracking. Creating systems such as multiple sensors on one node, multiple
nodes on one individual, and multiple individuals on one cloud system might offer a possible
breakthrough. Figure 1 illustrates how different sensors and systems can be integrated into
WSDs for multi-sensor platforms and multi-parameter monitoring.

Figure 1. Sensors and Systems for WSDs
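The integration hierarchy suggested above (multiple sensors on one node, multiple nodes on one individual, multiple individuals on one cloud system) can be sketched as a simple data model; the class and field names are illustrative assumptions, not an architecture from the cited studies:

```python
from dataclasses import dataclass, field

@dataclass
class SensorNode:
    """One wearable node carrying multiple sensors (e.g., on a hard hat)."""
    node_id: str
    sensors: list  # sensor names, e.g. ["accelerometer", "gas", "GPS"]

@dataclass
class Worker:
    """One individual wearing multiple nodes."""
    worker_id: str
    nodes: list = field(default_factory=list)

@dataclass
class CloudPlatform:
    """Cloud system aggregating data streams from many individuals."""
    workers: list = field(default_factory=list)

    def all_sensors(self):
        """Flatten every sensor name across all workers and nodes."""
        return [s for w in self.workers for n in w.nodes for s in n.sensors]
```

Such a model makes the multi-parameter monitoring concrete: one cloud query can span every physiological, environmental, proximity, and location channel on site.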


CHALLENGES FOR IOT AND WSDS ADOPTION IN CONSTRUCTION
WSDs based on IoT platforms must provide simple, powerful application access to IoT
devices, intelligent learning, fast deployment, best information understanding and interpreting,
and privacy protection against fraud and malicious attack (Chen et al. 2014). Some of the key
capabilities that leading IoT platforms must enable are simple and secure connectivity, privacy
and security (i.e., reduced risk of data loss), low power consumption (energy sustainability),
wearability, and interoperability (Chen et al. 2014; Haghi et al. 2017). For instance, some WSDs
required for monitoring construction workers' safety collect sensitive information, such as the
user's physiological data, absolute location, and movement activities, that can compromise the
user's privacy. This information must be protected during storage and communication.
For instance, a protocol could be developed that limits the type of information that is transmitted
through the IoT platform, thereby ensuring that workers have primary control of certain
important, yet private health-related information. Furthermore, a non-punitive, opt-in-based
system that provides some wellness benefits to workers should be considered. To mitigate the
risk of cyber-attacks on WSDs based on IoT, there is a need for strong network security
infrastructure for short- and long-range communication (Hiremath et al. 2014; Arias et al. 2015).
Careful precautions are needed at each layer of the system, from the wearable sensors to the
gateway devices to the cloud, to ensure users' privacy and security. The heterogeneity of
connected wearable devices and the multi-dimensionality of the safety data that can be collected
and generated make the demand for interoperability very high. Interoperability is the essential issue
for crossing layers of the physical device, communication (protocol and spectrum utility),
function, and application (Chen et al. 2014). A holistic approach is required in addressing and
solving the interoperability of IoT devices and services at several layers.
Large-scale service deployment of new technologies needs to be framed within a set of
standards. Because IoT spans multiple industries with many manufacturers and differs broadly in
application scenarios and user requirements, large-scale commercial deployment of related IoT
services seems very challenging. IoT itself currently lacks theory, technology architecture, and
standards that integrate the virtual world and the real physical world in a unified framework
Chen et al. 2014). The development and coordination of standards and proposals will stimulate the
effective expansion of IoT infrastructures and applications, services, and devices including
WSDs. In general, standards should be developed through the concerted effort of multiple
parties, and the information models and protocols in those standards should be open. It should be
noted that global standards are typically more relevant than any local agreements.

CONCLUSION
This paper provides a review of the potential applications of WSDs and IoT for the
continuous monitoring of construction workers’ safety metrics to mitigate safety hazards and
health risks on construction sites. The study evaluated wearable sensors and systems that can be
used for physiological monitoring, environmental sensing, proximity detection, and location
tracking of a wide range of construction hazards and vital signals which can provide early
warning signs of safety issues to construction workers. The schematic model presented in this
study can be used by manufacturers of WSDs as a tool for integrating wearable sensors and
systems into a single device for interoperability and multi-parameter monitoring of construction
safety metrics.
For technologies such as WSDs and IoT to be accepted by end-users, their effectiveness,
applicability to operations, and value-adding impact as identified and discussed in this study
must be continuously evaluated and established. The application of WSDs and IoT is expected to
foster proactive and active construction safety management strategies for reducing injuries,
illnesses, and fatalities on construction sites. Further research effort should be directed toward
identifying factors and developing tools that can drive the effective application of these
technologies whenever they are deployed on construction sites.
To enhance the diffusion of WSDs and IoT in construction, there is a need to further evaluate
some of the challenges facing their wide-spread adoption as identified and discussed in this
study, particularly, users’ inherent concerns with respect to privacy and security, interoperability,
and standardization of the technologies. Additionally, more research is required in understanding
the resistance of construction employees to WSDs. This review has opened up areas for further
in-depth research on how to enhance the application of WSDs and the IoT for proactive and
active construction safety management. The evaluation of the adoption, adaptation, and infusion
of WSDs in construction; evaluation of commercially available WSDs in construction; and
development of prototypes of construction specific WSDs are subjects of further research
currently being undertaken by the researchers involved in this study.

REFERENCES
Ananthanarayan, S., and Siek, K. A. (2010). “Health Sense: A Gedanken Experiment on
Persuasive Wearable Technology for Health Awareness.” Proceedings of the 1st ACM
International Health Informatics Symposium, 400–404.
Arias, O., Wurm, J., Hoang, K., and Jin, Y. (2015). “Privacy and Security in Internet of Things
and Wearable Devices.” IEEE Trans. Multi-Scale Comput. Syst., 1(2), 99-109.
Awolusi, I., Marks, E., and Hallowell, M. (2016). “Physiological Data Collection and
Monitoring of Construction Equipment Operators.” Construction Research Congress 2016,
1136–1145.
Awolusi, I., Marks, E., and Hallowell, M. (2018). “Wearable technology for personalized
construction safety monitoring and trending: Review of applicable devices.” Automation in
Construction, 85 (Jan): 96–106. https://doi.org/10.1016/j.autcon.2017.10.010.
Bonato, P. (2009). “Advances in wearable technology for rehabilitation.” Studies in Health
Technology and Informatics, 145, 145–159. https://doi.org/10.3233/978-1-60750-018-6-145
Chae, S., and Yoshida, T. (2010). “Automation in Construction Application of RFID technology
to prevention of collision accident with heavy equipment.” Automation in Construction,
19(3), 368–374. https://doi.org/10.1016/j.autcon.2009.12.008
Chen, S., Xu, H., Liu, D., Hu, B., and Wang, H. (2014). “A Vision of IoT: Applications,
Challenges, and Opportunities with China Perspective,” IEEE Internet of Things Journal,
1(4), 349-359.
Cheng, T., Migliaccio, G. C., Teizer, J., and Gatti, U. C. (2012). “Data fusion of Real-time
Location Sensing and Physiological Status Monitoring for Ergonomics Analysis of
Construction Workers.” Journal of Computing in Civil Engineering, 27(3), 320–335.
Cho, Y. K., Youn, J. H., and Martinez, D. (2010). “Error modeling for an untethered ultra-
wideband system for construction indoor asset tracking.” Automation in Construction, 19(1),
43–54. https://doi.org/10.1016/j.autcon.2009.08.001
Choe, S., Leite, F., Seedah, D., and Caldas, C. (2014). “Evaluation of sensing technology for the
prevention of backover accidents in construction work zones.” Journal of Information
Technology in Construction, 19(August 2013), 1–19.
Demirkesen, S., and Arditi, D. (2015). “Construction safety personnel’s perceptions of safety
training practices.” International Journal of Project Management, 33(5), 1160–1169.
Fullerton, C. E., Allread, B. S., and Teizer, J. (2009). “Pro-Active-Real-Time Personnel Warning
System.” 2009 Construction Research Congress, 31–40.
Gatti, U., Migliaccio, G., and Schneider, S. (2011). “Wearable Physiological Status Monitors for
Measuring and Evaluating Worker’s Physical Strain: Preliminary Validation.” Computing in
Civil Engineering 2011, 2000(413), 194–201. https://doi.org/10.1061/41182(416)94
Haghi, M, Thurow, K., Habil, I., Stoll, R., and Habil, M. (2017). “Wearable Devices in Medical
Internet of Things: Scientific Research and Commercially Available Devices.” Healthcare
Informatics Research, 23(1):4-15.
Hallowell, M. R. (2012). “Safety-Knowledge Management in American Construction
Organizations.” American Society of Civil Engineers, 28(2), 203–211.
Hallowell, M. R., Teizer, J., and Blaney, W. (2010). “Application of Sensing Technology to
Safety Management.” Construction Research Congress 2010, (2006), 31–40.
Hiremath, S., Yang, G., and Mankodiya, K. (2014). “Wearable Internet of Things: Concept,
Architectural Components and Promises for Person-Centered Healthcare.” 4th International
Conference on Wireless Mobile Communication and Healthcare (MOBIHEALTH), 304-307.
Li, J., Carr, J., and Jobes, C. (2012). “A shell-based magnetic field model for magnetic proximity
detection systems.” Safety Science, 50(3), 463–471.
Marks, E., and Teizer, J. (2012). “Proximity Sensing and Warning Technology for Heavy
Construction Equipment Operation.” Construction Research Congress 2012, 981–990.
Nath, N. D., Akhavian, R., and Behzadan, A. H. (2017). “Ergonomic analysis of construction
worker's body postures using wearable mobile sensors.” Applied Ergonomics, 62:107-117
Nnaji, C., Lee, H. W., Karakhan, A., and Gambatese, J. (2018). “Developing a Decision-Making
Framework to Select Safety Technologies for Highway Construction.” Journal of
Construction Engineering and Management, 144(4), 04018016.
Occupational Safety and Health Administration (OSHA) (2017). “Worker Safety Series:
Construction.” United States Department of Labor.
Oloufa, A. A., Ikeda, M., and Oda, H. (2003). “Situational awareness of construction equipment
using GPS, wireless and web technologies.” Automation in Construction, 12, 737–748.
Papapostolou, A., and Chaouchi, H. (2011). “RFID-assisted indoor localization and the impact of
interference on its performance.” Journal of Network and Computer Applications, 34(3),
902–913. https://doi.org/10.1016/j.jnca.2010.04.009
Park, J., Marks, E., Cho, Y. K. and Suryanto, W. (2016). “Performance Test of Wireless
Technologies for Personnel and Equipment Proximity Sensing in Work Zones.” Journal of
Construction Engineering and Management, 142(1), 04015049-1-9.
Ruff, T. (2006). “Evaluation of a radar-based proximity warning system for off-highway dump
trucks.” Accident Analysis and Prevention, 38(1), 92–98.
Saidi, K. S., Teizer, J., Franaszek, M., and Lytle, A. M. (2011). “Static and dynamic performance
evaluation of a commercially-available ultra wideband tracking system.” Automation in
Construction, 20(5), 519–530. https://doi.org/10.1016/j.autcon.2010.11.018
Shahi, A., Aryan, A., West, J. S., Haas, C. T., and Haas, R. C. G. (2012). “Deterioration of UWB
positioning during construction.” Automation in Construction, 24, 72–80.
Shen, X., Awolusi, I., and Marks, E. (2017). “Construction Equipment Operator Physiological
Data Assessment and Tracking.” Practice Periodical on Structural Design and Construction,
22(4), 04017006-1-7. http://dx.doi.org/10.1061/(ASCE)SC.1943-5576.0000329
Sung, M., Marci, C., and Pentland, A. (2005). “Wearable feedback systems for rehabilitation.”
Journal of Neuroengineering and Rehabilitation, 2(17), 1–12.
Swan, M. (2012). “Sensor Mania! The Internet of Things, Wearable Computing, Objective
Metrics, and the Quantified Self 2.0.” Journal of Sensor and Actuator Networks, 1(3), 217–
253. https://doi.org/10.3390/jsan1030217
Teizer, J., Allread, B. S., Fullerton, C. E., and Hinze, J. (2010). “Autonomous pro-active real-
time construction worker and equipment operator proximity safety alert system.” Automation
in Construction, 19(5), 630–640. https://doi.org/10.1016/j.autcon.2010.02.009
Teizer, J., and Vela, P. A. (2009). “Personnel tracking on construction sites using video
cameras.” Advanced Engineering Informatics, 23(4), 452–462.
Wang, J. and Razavi, S. N. (2016). “Low False Alarm Rate Model for Unsafe-Proximity
Detection in Construction.” Journal of Computing in Civil Engineering, 30(2), 1-13.
Zhu, X., Mukhopadhyay, S. K., and Kurata, H. (2012). “A review of RFID technology and its
managerial applications in different industries.” Journal of Engineering and Technology
Management - JET-M, 29(1), 152–167. https://doi.org/10.1016/j.jengtecman.2011.09.011


Evaluating Generated Layouts in a Healthcare Departmental Adjacency Optimization Problem
Jennifer I. Lather, S.M.ASCE1; Timothy Logan2; Kate Renner3; and John I. Messner, Ph.D., M.ASCE4

1Dept. of Architectural Engineering, Pennsylvania State Univ., 104 Engineering Unit A, University Park, PA 16802. E-mail: [email protected]
2HKS Inc., 350 North St. Paul St., Suite 100, Dallas, TX 75201. E-mail: [email protected]
3HKS Inc., 1250 Eye St. NW, Suite 600, Washington, D.C. 20005. E-mail: [email protected]
4Dept. of Architectural Engineering, Pennsylvania State Univ., 104 Engineering Unit A, University Park, PA 16802. E-mail: [email protected]

ABSTRACT
The effective layout of departments within a new hospital influences the efficiency and
effectiveness of delivering healthcare services. This study explores a computational approach to
generate and evaluate potential hospital layouts. Given a set of healthcare departments, areas,
and structural bay sizes, a graph theoretical approach with a placement strategy was used to
develop an initial set of optimal and near optimal layouts based on both adjacency ratings and
distances. Input data of adjacency ratings was collected from experts involved in the design of a
new hospital project. An optimal adjacency graph was calculated and a placement strategy with a
discrete set of constraints was used. Each layout was given a distance weighted score based on
the pair-wise distance weighted adjacency rating. Healthcare planning and design experts were
surveyed for their input on the use of this approach. Comparison of their results to the layout
scores indicates that planning and design experts select the best-scoring layouts more
consistently than the worst-scoring layouts.

INTRODUCTION
Healthcare planning and design is a complex process requiring careful consideration of the
key departmental adjacencies throughout the facility to ensure patient safety, operational
efficiency, and reduced travel distances for patients, care givers, and non-clinical staff. There are
often competing priorities influencing the location and adjacency relationships to other key
areas, particularly in the diagnostic and treatment (DT) departments. These priorities can be
dictated by code and guideline requirements, functional needs, and individual preferences (Carr
et al. 2017). In addition, healthcare owners are asking for increasing amounts of evidence to
support design decisions (Burmahl et al. 2017). Generative software can be used to develop
initial layouts based on evidence-based priorities to create a quantitative starting point for
healthcare planning and design experts to review and refine layouts with healthcare facility
stakeholders. This will allow the development of optimized configurations of the DT
departments within a facility. In this study, we investigate how planning and design experts
perceive generative optimal layouts.
Previous Research: Optimization methods have been discussed for healthcare facility layout
planning problems in the literature for more than 40 years. One of the earliest formulations of the
problem was from Elshafei (1977), where it was presented as a Quadratic Assignment Problem
(QAP). Many researchers have addressed generic QAPs as well as other formulations of facility
layout planning problems over the years (Anjos and Vieira 2017; Francis et al. 1992). The graph
theoretical approach (GTA) was developed as a heuristic method for solving common QAPs
(Foulds and Robinson 1978). In a healthcare setting, several researchers have applied this
method to hospital layout problems (Arnolds and Nickel 2015; Assem et al. 2012). Arnolds and
Nickel (2015) discussed the GTA as more useful than other techniques for communicating with
healthcare experts and architects who may not be as familiar with typical facility layout planning
optimization methods. Even with a growing demand for data-driven methods and years of
research on healthcare layout planning problems, there is a lack of research on the evaluation of
these techniques from experts such as healthcare strategists, planners, and designers. This study
investigates the use of optimization techniques to generate layouts of a test case hospital and
presents expert evaluation of the techniques for future development efforts.

RESEARCH METHODOLOGY
The research methodology focused on developing, implementing, and evaluating a layout
optimization technique for a hospital layout problem. A recently designed new construction
hospital was selected as the test case. The project had gone through a strategizing, planning, and
designing process with considerable changes in program scope and area over the project
timeline. Layout of the DT and support service departments had changed frequently throughout
the design. A layout optimization technique was proposed to test optimizing and streamlining the
layout of these departments based on programming and adjacency data. Using a graph theoretical
optimization technique, an optimal adjacency subgraph of hospital departments was obtained.
From the subgraph, a placement strategy was generated to locate departments conceptually on a
floor plan using constraints of structural bay size, department area, and strategy. Two score-based
evaluation criteria were used: a weighted adjacency score for the subgraph and a distance-weighted
adjacency score for the placed layout. Adjacency ratings from project experts were
used as the input data for the generated layouts. A script linking the layout-generating program
to common building information model (BIM) authoring software generated massing objects
from these layouts for visualization. To evaluate the usefulness of the
implementation of this technique, healthcare strategists, planners, and designers were surveyed
on their perceptions of the output from this technique. Data includes perceptions of the best and
worst layouts in a generated set; opinions on the advantages and disadvantages of using this
method; and demographic data.
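The second evaluation criterion can be sketched as follows. The exact functional form is an assumption (the paper does not give its formula in this excerpt): here the score is taken as the sum over department pairs of adjacency rating times centroid distance, so that a lower total rewards placing highly rated pairs close together.

```python
from itertools import combinations
import math

def distance_weighted_score(centroids, ratings):
    """Assumed distance-weighted adjacency score for a placed layout.

    centroids: {dept: (x, y)} department centroid coordinates.
    ratings: {(dept_a, dept_b): numeric adjacency weight}, unordered pairs.
    Returns sum of rating * Euclidean distance over all pairs (lower is
    better under this assumed convention).
    """
    total = 0.0
    for a, b in combinations(sorted(centroids), 2):
        (xa, ya), (xb, yb) = centroids[a], centroids[b]
        dist = math.hypot(xa - xb, ya - yb)
        total += ratings.get((a, b), ratings.get((b, a), 0)) * dist
    return total
```

Under this convention, two layouts of the same program can be ranked directly by their totals, which is how the generated set could be ordered from best to worst scoring.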

LAYOUT DEVELOPMENT APPROACH


Development of Graph Theoretical Approach: The GTA is a heuristic method of
generating a maximally planar and maximally weighted subgraph of nodes (in this case:
departments). The R-construction deltahedral heuristic developed by Foulds and Robinson
(1978) starts with a set of departments with a flow or adjacency rating between each pairwise
combination. An initial arrangement of four nodes is created by selecting the top weighted nodes
and placing them in a tetrahedron fashion, where each line represents the edge weighted by the
relationship between those nodes (Figure 1a). Subsequent nodes are added to the graph step by
step: first, by selecting the next node with the greatest overall impact using the department’s total
closeness score (TCS, sum of all adjacency ratings for that department), and, second, by
evaluating the entering node’s impact on the overall score of the graph based on its weight with
each triangular segment. In order to maintain the planar property, no lines can cross. A final step-
wise optimal graph is obtained based on the edge weights once all nodes are placed in the graph
(Figure 1b). A dual of this graph provides the rough arrangement of departments without specific
area and shape characteristics but provides abstract information to layout planners (Figure 2a).
A relationship diagram (REL) was used for the input data for this methodology. The
AEIOUX rating system was used, where each department pair was given an adjacency rating of:
(A) absolutely necessary, (E) especially important, (I) important, (O) ordinary importance, (U)
unimportant, and (X) undesirable. The following numeric weights were used for calculation
purposes: A-5, E-4, I-3, O-2, U-1, X-0. This method is sensitive to the adjacency input values
(Francis et al. 1992). To help alleviate this problem, six experts involved with the project were
surveyed to generate different optimal layout sets.
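Converting the AEIOUX letters of a REL chart into numeric weights and total closeness scores is mechanical; a minimal sketch (the department names are hypothetical):

```python
# Numeric weights for the AEIOUX closeness ratings used in the paper.
AEIOUX = {"A": 5, "E": 4, "I": 3, "O": 2, "U": 1, "X": 0}

def rel_to_weights(rel):
    """Convert a relationship (REL) chart into pairwise numeric weights
    and each department's total closeness score (TCS).

    rel: dict mapping (dept_i, dept_j) tuples to an AEIOUX letter.
    """
    weights = {frozenset(pair): AEIOUX[letter] for pair, letter in rel.items()}
    tcs = {}
    for pair, wgt in weights.items():
        for dept in pair:
            tcs[dept] = tcs.get(dept, 0) + wgt
    return weights, tcs

# Hypothetical three-department REL chart for illustration.
rel = {("Lobby", "Imaging"): "A", ("Lobby", "Lab"): "O", ("Imaging", "Lab"): "E"}
weights, tcs = rel_to_weights(rel)
```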

Figure 1. (a) Example initial tetrahedron formulation of graph theoretical approach. (b)
Example final maximally planar subgraph.

Figure 2. (a) Example dual (red) of the adjacency graph. (b) A possible block layout
formulation meeting all adjacency relationship requirements.
Development of Placement Strategy: After a near-optimal subgraph is obtained, a planner
can use the graph and dual to plan an optimal layout with area and shape information (Figure
2b), which is usually a manual exercise. Computational methods have been explored to create a
block layout, such as spiral, crystal formulation, serpentine placement, and unit grid. One
common algorithm used is the Computerized Relative Allocation of Facilities Technique
(CRAFT) algorithm, which is a layout improvement strategy, meaning it needs an initial layout.
For implementation, the GTA was selected given its easily communicated methodology, and a
serpentine placement strategy was selected given its ease of implementation. Common structural
bay sizes of 30, 60, and 90 feet were used, and the overall layout dimension was based on the square root of the
total area requirements. For deciding how to navigate the graph for placement, a starting node
was used to enter the graph, and each subsequent department was placed based on the set of
neighbors, picking the highest weighted relationship. If there was a relationship weight tie, the
department with the highest TCS was selected, and, if there was still a tie, then the department
with the greatest number of neighbors was selected. Each department was placed in order based
on its programmed area and prescribed bay size (Figure 3a). Since the bay size and starting node
were unknown parameters, two starting nodes (a prescribed start node and the top-weighted start
node) and all three common bay sizes were tested, yielding a total of six layouts. Each was given
a distance-weighted adjacency score.

Figure 3. (a) Serpentine Placement Strategy. (b) BIM Objects Generated
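The tie-breaking selection just described can be sketched as follows. This is an illustrative reading of the rule, not the authors' code: the candidate set is taken to be the graph neighbors of the already-placed departments, ranked by strongest relationship weight to the placed set, then by TCS, then by neighbor count.

```python
def placement_order(start, weights, tcs, neighbors):
    """Order departments for serpentine placement (illustrative sketch).

    Starting from `start`, repeatedly pick the unplaced neighbor of the
    placed set with the highest relationship weight to any placed
    department; ties are broken by higher TCS, then by a greater
    number of neighbors.
    """
    w = lambda a, b: weights.get(frozenset((a, b)), 0)
    order, placed = [start], {start}
    while True:
        candidates = {n for d in placed for n in neighbors[d]} - placed
        if not candidates:
            return order
        nxt = max(candidates,
                  key=lambda n: (max(w(n, d) for d in placed),
                                 tcs[n], len(neighbors[n])))
        order.append(nxt)
        placed.add(nxt)
```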


Development of Layout Output: At this stage, design authoring software was selected to
generate boundary massing BIM objects for each department. Each department had a set of
vertices outlining the shape. Generic massing elements that corresponded with the rectangular,
L-shape, and S-shape departments were created as BIM objects with the necessary parameters to
be sized according to each department. A script was created to take a set of layouts and identify
shape, define origin points, perform transforms, and ultimately place the department with
parameter values such as name and area (Figure 3b).
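The script's steps, shape identification, origin definition, and parameter assignment, can be sketched as follows. This is an illustrative Python sketch; the authors' script ran inside a design authoring environment, and the shape-by-vertex-count rule is an assumption made here for illustration.

```python
def department_placement(name, vertices):
    """Derive placement parameters for one department massing object.

    Classifies the shape by vertex count (4 -> rectangle, 6 -> L-shape,
    8 -> S-shape), takes the lower-left bounding corner as the origin,
    and returns the parameter set a BIM massing family would be driven
    by (name, shape, origin, area).
    """
    shape = {4: "rectangle", 6: "L-shape", 8: "S-shape"}.get(len(vertices), "other")
    origin = (min(x for x, _ in vertices), min(y for _, y in vertices))
    # Shoelace formula for the area of the simple boundary polygon.
    area = abs(sum(x1 * y2 - x2 * y1
                   for (x1, y1), (x2, y2)
                   in zip(vertices, vertices[1:] + vertices[:1]))) / 2
    return {"name": name, "shape": shape, "origin": origin, "area": area}
```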

LAYOUT EVALUATION METHODS


Scoring Metrics: The most common numeric approaches for evaluating layouts are
adjacency or distance (Francis et al. 1992). Some have proposed combined adjacency-distance
scoring metrics. For this study, we used an inverse-distance-weighted adjacency score metric to
take both adjacency and distance into account. Given n departments, the pairwise adjacency
rating between departments i and j (w_{ij}), and the adjacency state (a_{ij} ∈ {0, 1}, equal to 1
when departments i and j are adjacent), the adjacency score (S_A) can be calculated (Equation 1).
Given the distance between departments i and j (d_{ij}) and the cost or flow, depending on the
formulation, between departments i and j (V_{ij}), the distance score (S_D) can be calculated
(Equation 2). Distances are evaluated as the rectilinear distance between departments, but other
distance measures can be used as relevant to the application problem. Given the pairwise
adjacency rating w_{ij} and the inverse distance between departments i and j (I_{ij} = 1/d_{ij}),
the distance-weighted adjacency score (S_{DWA}) can be calculated as a maximization problem
(Equation 3). Instead of the adjacency state, this score is based on how far apart the centroids of
each pairwise combination of departments are, where d_{ij} is the rectilinear distance between
the centroids of departments i and j. The metric combines distance and adjacency rating into a
single measure and can easily be extended from adjacency ratings to department flow data. It
deviates from typical graph scores because, once shapes and areas are decided, the distance
between department centroids can increase considerably, providing a more realistic metric.
\max S_A = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} a_{ij} \quad (1)

\min S_D = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} V_{ij} \quad (2)

\max S_{DWA} = \sum_{i=1}^{n} \sum_{j=1}^{n} I_{ij} w_{ij} \quad (3)

Expert Evaluation: A survey was developed to gain expert perception of the layout outputs.
A survey sample was selected to represent a diverse range of roles and experience levels for
those with healthcare strategy, programming, planning, and design experience. A single
healthcare design firm was surveyed, utilizing an internal database of individuals who work
primarily on healthcare projects around the world. For the purposes of this study, participants
were limited to individuals working in the United States with more than one year of healthcare
experience and a focus on planning and design of healthcare facilities. Additionally, participants
were included from the company’s healthcare consulting group, focused on strategy,
programming, and operational planning, and the company’s computational design group, focused
on computational design methodologies for a variety of project types. In total, the survey was
sent to 262 potential participants. Potential participants were given information about the study
and were asked to volunteer to share feedback on generative layout options. The survey asked for
opinions on the best and worst layouts in a set of options, general perceptions of using generative
layout and optimization techniques, and demographic data, including age, gender, type of
expertise, years in the healthcare design field, and years at the current company. Respondents
were not given the scores of the layouts.

Table 1. Numeric Layout Scores


Option 1 2 3 4 5 6
Start Node Lobby Lobby Lobby Top Dept. Top Dept. Top Dept.
Bay Size 30’ 60’ 90’ 30’ 60’ 90’
S_DWA 4.502 4.434 5.009 4.787 4.580 4.821

EVALUATION OF LAYOUT RESULTS


Layout Generation Results: Adjacency ratings were collected from six individuals, some in
a group setting, resulting in three sets of input data and one set of combined scores. A graph was
generated for each of these four input sources using S_A (Equation 1). The highest relative graph
score, computed against the graph's theoretical upper bound (Foulds and Robinson 1978),
determined which layouts to use for expert evaluation. The highest-scoring graph was used to
generate six layout options corresponding to different combinations of start node and bay size,
each scored using S_DWA (Equation 3). The highest-scoring layouts were options 3 and 6, with
scores of 5.01 and 4.82, respectively (Table 1).
Demographics of Respondents: The survey had a response rate of 11.8%, with 31
respondents completing the survey. The average age fell between 40 and 44, with one respondent
younger than 24 and one older than 65; the modal age category was 40-44.
Respondents averaged 16.2 years of experience with a minimum of one year and a maximum of
35 years. Of the respondents, 54.8% were female. All had 4-year degrees or higher, and 71.0%
had Master’s degrees. 94% had experience in architecture, 84% in medical planning, 45% in
operational planning, 35% in research, and 29% in strategy. All but two respondents had never
seen or used generative layout methods before.
Survey Results: When asked on a scale of 1-7 whether additional information about the
decisions the system was making was needed, with 1 being strongly agree and 7 strongly
disagree, respondents on average agreed (mean 1.74, 95% confidence interval 1.25 to 2.23).
When asked on a scale of 1-7 whether they found generative layouts useful, with 1 being
extremely useful and 7 extremely useless, on average they found generative layouts slightly
useful (mean 3.13, 95% confidence interval 2.54 to 3.71).
Comparisons of Respondents and ‘Optimal’ Layout: Respondents were asked to select
the best, or set of best, options with a total of three possible choices. Of the total responses, 11
chose more than one ‘best’ option, and only one person chose three options. Of the respondents,
39% selected the highest scoring layout and 29% selected the second highest scoring layout as
the best option. The most commonly chosen ‘best’ functioning layout was the highest scoring
layout. Respondents were asked to select the worst, or set of worst, layout options with a total of
three possible choices. Of the total responses, 15 chose more than one, and four respondents
chose three options. When asked to select the worst layout, 26% chose the worst scoring layout
and 29% chose the second worst scoring layout. The most commonly chosen ‘worst’ functioning
layout was the third highest scoring layout, with 42% of respondents. Respondents were more
closely aligned with selecting a layout that scored well than one that scored poorly (Figure 4).

Figure 4. Frequency of Respondent’s Choice of ‘Best’ and ‘Worst’ Layouts

DISCUSSION
Evaluation of the GTA shows that respondents aligned well with higher scoring layouts, and
respondents did identify the best layout based on flow. It was found that people were interested
in using generative layout techniques, especially for fast iteration through multiple options not
traditionally possible. Some respondents thought the addition of these tools would add to layout
accuracy. Additionally, respondents thought these methods would provide evaluation tools for
past and future projects. They thought these methods would be advantageous for getting “the
creative planning started,” teaching young planners, communicating with other groups, and
“quickly generating multiple options for review and evaluation.” Several discussed the use of
generative tools to help filter different evaluative criteria and overcome personal design bias, as
one respondent explained: “every plan has an inherent bias, whether known or unknown,
intentional or unintentional.” While they saw advantages, the results were mixed. Respondents agreed that
more details about the generative layout system were needed to find these techniques useful. The
results add to healthcare facility layout methodologies by providing expert feedback on
adjacency focused generative layouts. Further evaluation of adjacency ratings from a variety of
experts including those from other architecture firms, consulting groups, and especially from
care providers, would provide a more robust understanding of the variability in adjacency ratings
and flow. Additionally, input from those different perspectives would be useful in testing if an
expert’s role or level of experience has a significant impact on their opinion of using these types
of methods. Future work should extend the method to multi-story optimization by considering
both horizontal and vertical adjacencies, examine intradepartmental adjacencies of rooms and
activities, and focus on a user interface with parameter selections, including the ability to adjust
programming data and department sizes and to apply a variety of placement strategies.

CONCLUSIONS
This study explores a computational methodology for generating many near-best layout
options in a new hospital facility, presents the findings from expert review of those layouts, and
provides insight into experts’ perceptions of using an automated approach to support layout
design. Generative layout techniques have many opportunities for aiding planning and design
experts in optimizing the layout of healthcare facilities. Previous work applied the GTA in
isolation from planning and design professionals. This work presents opportunities for the
approach in rapidly generating and analyzing multiple options, accounting for more factors than
designers typically have time to consider while also providing evaluation criteria. The results
show promise for generative layout techniques; however, respondents desired more detail about
the decisions the system makes and consideration of additional factors. These results indicate a need for
transparent approaches to generative layout methodologies.

REFERENCES
Anjos, M. F., and Vieira, M. V. C. (2017). “Mathematical optimization approaches for facility
layout problems: the state-of-the-art and future research directions.” European Journal of
Operational Research, 261(1), 1–16.
Arnolds, I., and Nickel, S. (2015). “Layout planning problems in health care.” Applications of
Location Analysis, International Series in Operations Research & Management Science, H.
A. Eiselt and V. Marianov, eds., Springer International Publishing, Cham, Switzerland, 109–
152.
Assem, M., Ouda, B. K., and Wahed, M. A. (2012). “Improving operating theatre design using
facilities layout planning.” 2012 Cairo International Biomedical Engineering Conference
(CIBEC), 109–113.
Burmahl, B., Hoppszallern, S., and Morgan, J. (2017). “2017 Hospital construction survey.”
Health Facilities Management, 30(2), 18–24.
Carr, R. F., and WBDG Health Care Subcommittee. (2017). “Health care facilities: Hospitals.”
Whole Building Design Guide, National Institute of Building Sciences, Washington, D.C.
Elshafei, A. N. (1977). “Hospital layout as a quadratic assignment problem.” Journal of the
Operational Research Society, 28(1), 167–179.
Foulds, L. R., and Robinson, D. F. (1978). “Graph theoretic heuristics for the plant layout
problem.” International Journal of Production Research, 16(1), 27–37.
Francis, R. L., McGinnis, L. F., and White, J. A. (1992). Facility layout and location: an
analytical approach, 2nd Ed. Prentice Hall, Englewood Cliffs, NJ.


Thermal Comfort Aggregation Modeling Based on Social Science Theory: Towards a


Comfort-Driven Cyber Human System Framework
Lu Zhang, Ph.D., A.M.ASCE1; and Shankar Sanake2
1
Assistant Professor, Moss School of Construction, Infrastructure, and Sustainability, Florida
International Univ., 10555 West Flagler St., EC 2935, Miami, FL 33174, USA. E-mail:
[email protected]
2
Graduate Student, Moss School of Construction, Infrastructure, and Sustainability, Florida
International Univ., 10555 West Flagler St., Miami, FL 33174, USA. E-mail: [email protected]

ABSTRACT
A classroom is a multioccupancy environment that typically involves a large number of
students with varying comfort zones and comfort levels in one single space, making it
challenging to provide a satisfactory environment to accommodate the comfort needs of this
diverse population. Current building practices that rely on centralized heating, ventilation, and
air conditioning (HVAC) systems result in poor occupant satisfaction with the indoor
environment, especially when the occupant group is large and diverse. There is, thus, a need to
integrate HVAC system with a cyber human system (CHS) that allows for adaptive HVAC
operation based on an aggregated group comfort level that collectively accounts for the comfort
levels of diverse student populations. As the cornerstone of the CHS system, this paper focuses
on discussing our proposed group comfort aggregation model that quantifies the thermal comfort
levels of a group of occupants based on their individual thermal comfort levels. The proposed
model is theoretically grounded in social welfare theory and social welfare function. This
research contributes to the body of knowledge by providing a more explicit understanding of
how to measure group thermal comfort in the indoor environments. The proposed model serves
as the foundation of adaptive and comfort-driven HVAC operation that offers more comfortable
and healthier classroom environments to students.

INTRODUCTION
The fundamental way of providing a classroom environment that is conducive to learning is
to enhance the health, comfort, and well-being of our students (USGBC 2013). Absenteeism,
difficulty concentrating, inability to perform mental tasks, and poor learning performance are
shown to be closely related to discomfort in the classroom environment (U.S.EPA 2018). The
basic tenet of classroom design and operation is – “every corner of [the] environment should
provide the things to promote [the] health and well-being [of the students]” (USGBC 2013). But,
all too often, that goal remains an aspiration (USGBC 2013). A classroom is a multioccupancy
environment that typically involves a large number of students with varying thermal comfort
levels in one single space, making it challenging to provide a satisfactory environment to
accommodate the thermal comfort needs of this diverse population. Current building practices
that rely on centralized heating, ventilation, and air conditioning (HVAC) systems result in poor
occupant satisfaction with the indoor environment, especially when occupant group is large and
diverse (Jazizadeh et al. 2013). Therefore, extensive efforts have been directed toward
personalized HVAC control to improve the overall occupant satisfaction with the indoor
environment. However, personalized HVAC control only allows for customized HVAC setting
in different spaces without solving the problem of accommodating or balancing diverse thermal
comfort needs in the same space. There is, thus, a need to integrate HVAC systems with a cyber
human system (CHS) that allows for adaptive HVAC operation based on an aggregated group
thermal comfort level that collectively accounts for the thermal comfort levels of diverse student
populations.
However, it is challenging to achieve comfort-driven HVAC operation due to a major
knowledge gap. There is a lack of understanding of group-level thermal comfort or how diverse
individual thermal comfort can be aggregated to represent group thermal comfort. For a
classroom environment that is occupied by a large number of students, it is difficult to satisfy the
thermal comfort needs of all students. Different students have different thermal comfort needs,
and their thermal comfort levels are different even when exposed to the same indoor
environment (Zhang et al. 2018). The current practices on group thermal comfort are limited in
revealing the true thermal comfort level of a group of occupants. The most commonly used
Predicted Percentage of Dissatisfied (PPD) function merely reflects the percentage of occupants
that are dissatisfied with the thermal conditions (Olesen and Parsons 2002) without accounting
for the degree of dissatisfaction or the inequality of occupant thermal comfort levels. Without a
method to model the overall group thermal comfort level while accounting for individual
comfort levels, the HVAC system cannot be operated to satisfy the comfort needs of a diverse
population in the same space.
To address the gap, this paper proposes a group thermal comfort aggregation (GTCA) model
for quantifying the thermal comfort level of a group of occupants based on their individual
thermal comfort levels. The proposed model is theoretically grounded in social welfare theory
and social welfare function. A case study was conducted to illustrate how the proposed GTCA
model can be applied in measuring the collective thermal comfort of occupants. The paper also
discusses a CHS framework that integrates the proposed GTCA model with the comfort-driven
HVAC control model to support adaptive HVAC operation. This research contributes to the body
of knowledge by providing a more explicit understanding of how to model group comfort in the
indoor environments. The proposed model serves as the foundation of adaptive and comfort-
driven HVAC operation that offers more comfortable and healthier indoor environments to
diverse student populations.

BACKGROUND
Thermal Comfort Modeling: In recent decades, although a number of studies (e.g., Yao et
al. 2009; Zhang et al. 2004) have been conducted to understand and measure thermal comfort of
building occupants, these studies have only focused on measuring thermal comfort on individual
occupant level. There has been a lack of study that aims to understand the comfort of large
populations who occupy the same space or aggregate/measure group comfort levels based on
individual comfort levels. Existing methods for measuring group thermal comfort or satisfaction
cannot reveal the true thermal comfort level of a group of occupants. One of the most commonly
used methods is Predicted Percentage of Dissatisfied (PPD) function, which predicts the
percentage of occupants that are dissatisfied with the thermal conditions. The recommended
acceptable PPD range is less than 20% of persons dissatisfied for an interior space (ASHRAE
2013). However, the PPD function only reflects the percentage of occupants that are dissatisfied
with the thermal conditions without measuring the collective thermal comfort level of the group
of populations (Olesen and Parsons 2002). Moreover, traditional methods are mostly
experimental-oriented without a theoretical foundation, which results in a number of research
questions unanswered. For example, how does group thermal comfort relate to individual
thermal comfort? What factors could affect group thermal comfort? The Moorean view holds
that the whole may not simply equal the sum of its parts (Moore 1992). In fact, in
many domains, the development of an aggregation method is a challenging task. Some
aggregation methods suffer from methodological difficulties or challenges that can be linked to a
lack of a theoretical foundation (Blanc et al. 2008).
Social Welfare Theory: Social welfare theory is a theory that studies the measurement of
collective or aggregated welfare of a society or a group of individuals (Feldman and Serrano
2006). A social welfare function is used to measure and analyze the welfare of the whole society
in various states; it is used to determine the optimal distribution of income to achieve the highest
social welfare (Tresch 2008). Social welfare function represents the relationship between the
collective welfare of a group and the welfare of the individuals in that group (Barr 1998). In
social welfare function, income is commonly adopted as the measurement of well-being.
Different researchers (e.g., Sen 1997, Bellu 2006, Atkinson and Brandolini 2010) have proposed
different social welfare functions. For example, Sen (1997) proposed a social welfare function
that accounts for the average welfare of a group and the inequality of distribution of welfare.
This function uses the Gini coefficient, which is a measure of inequality, to penalize the situation
of unequal distribution of welfare. According to this function, it is not favorable that a small
group of rich people possesses most of the welfare in the society. However, as per Sen’s
function, the equality (thus high group social welfare) can be achieved when all individuals are
poor, which is not a favorable condition in reality. Atkinson and Brandolini (2010), thus,
proposed a social welfare function that integrates the poverty indicator. This function uses a
predefined poverty line to penalize the condition where the welfare (e.g., income) of a group of
individuals is below the poverty line.

PROPOSED SOCIAL-WELFARE-BASED APPROACH


Social welfare function is generally defined as a measure of the aggregated welfare of a
group based on the allocation of one or more well-being requisites among the individuals of that
group. In the context of occupant thermal comfort in the indoor environments, the thermal
comfort level of each individual can be considered as the indoor environment well-being
indicator. The aggregated thermal comfort level of a group can be defined based on the
allocation of thermal comfort levels among the individuals of that group. Both inequality
indicator and poverty indicator can be adapted into group thermal comfort aggregation.
Inequality of Thermal Comfort Distribution: In an ideal indoor environment, all building
occupants should feel equally comfortable. In other words, an ideal indoor environment should be
able to provide equal comfort to all building occupants. However, in reality, different building
occupants usually have different thermal comfort levels, even when they are exposed to the same
indoor environment. Therefore, an indicator of inequality is introduced to reflect the differences
in thermal comfort levels perceived across different building occupants. The Gini coefficient
(Barr 1998) is adapted into the domain of thermal comfort inequality measurement. Gini
coefficient is one of the most commonly used measures for inequality of social welfare. In the
proposed GTCA model, the Gini coefficient is a measure of the inequality of a distribution that
ranges between 0 and 1. It is 0 when the thermal comfort is perceived perfectly equally among a
set of building occupants, and it is 1 when only one building occupant in the group perceives
perfect thermal comfort (i.e., a maximally unbalanced perception of thermal comfort among the
occupants). The Gini coefficient is obtained from areas defined by a group’s Lorenz curve
(Barr 1998). In our proposed model, the Lorenz curve (Figure 1) illustrates
the percentage of the cumulative thermal comfort possessed by the percentage of the occupants
in the built environment. A point on the curve shows what share of the total thermal comfort is
held by a given percentage of occupants. The Gini coefficient is calculated by dividing the
area between the Lorenz curve and the line of complete equality (the shaded area in Figure 1)
by the total area under the line of complete equality. The further away the Lorenz curve
is from the line of complete equality, the greater the degree of inequality.

Figure 1. Lorenz curve for distribution of occupant comfort.
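The Lorenz-curve construction of the Gini coefficient can be computed directly rather than measured graphically; a minimal sketch using trapezoidal areas under the discrete Lorenz curve (this is the standard discrete formula, not the authors' procedure):

```python
def gini(comfort):
    """Gini coefficient of a comfort distribution, computed from the
    discrete Lorenz curve (assumes positive comfort ratings).

    The Lorenz curve is traced over the sorted ratings; the area under
    it is accumulated with trapezoids, and the coefficient is the area
    between the line of complete equality and the Lorenz curve divided
    by the area under the line of equality (1/2).
    """
    xs = sorted(comfort)
    n, total = len(xs), sum(xs)
    cum, lorenz_area = 0.0, 0.0
    for x in xs:
        prev = cum
        cum += x / total                       # cumulative share of total comfort
        lorenz_area += (prev + cum) / (2 * n)  # trapezoid of width 1/n
    return 1 - 2 * lorenz_area
```

The function returns 0 for a perfectly equal distribution and approaches 1 as one occupant holds nearly all of the comfort.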


Poverty of Thermal Comfort: As per the above discussion, an increase of equality can
make it easier to achieve higher thermal comfort level of the group. However, increasing equality
can also be achieved by making every building occupant extremely uncomfortable. This is
analogous to the definition of poverty in social welfare, in which a large group of people receives
less welfare. When occupants perceive extreme discomfort in the indoor environment more often
than expected, such a situation severely affects their health and well-being.
Therefore, extreme discomfort needs to be penalized when aggregating group comfort. To define
the extreme discomfort perceived by the building occupants, a line of extreme discomfort is
defined by adapting the cutoff poverty line used in social welfare theory, where any individual
below the poverty line is identified as poor (Sen 1992). By defining this line, an alarm bell could
be set off to both the occupants and the building professionals (e.g., designer, facility manager) if
the thermal comfort levels of certain percentage of the occupants fall below this line. Unlike the
poverty line in the social science domain, there is no existing study on “poverty (i.e., extreme
discomfort) line” for occupant comfort. Therefore, we propose to set the extreme discomfort line
through empirical studies by soliciting the extreme discomfort ratings from the building
occupants. The extreme discomfort line is then defined as the mean value of the extreme
discomfort ratings provided by the building occupants. Further studies that integrate the building
codes and expert interviews will be conducted to further validate the line of extreme discomfort.
Group Thermal Comfort Aggregation Function: The proposed GTCA model defines the
thermal comfort level of a group of occupants by aggregating the thermal comfort levels of
individual building occupants. The theoretical foundation of the aggregation function is
grounded in the areas of social welfare theory and social welfare function. A GTCA function is
proposed by incorporating the thermal comfort levels of individual building occupants,
inequality in the distribution of thermal comfort among building occupants, and extreme
discomfort perceived by the building occupants. The proposed GTCA function is presented in
Eq. (1):
GC_k = \frac{1}{m} \sum_{i=1}^{m} IC_{ik} (1 - G_k) - \frac{1}{m} \sum_{i=1}^{m} \max(0, \; l - IC_{ik}) \quad (1)

where GC_k = the group thermal comfort level of a group of occupants in indoor environment k;
IC_{ik} = the individual thermal comfort level of occupant i in indoor environment k; G_k = the
Gini coefficient of indoor environment k; l = the line of extreme discomfort; and m = the total
number of individuals in indoor environment k.
As per Eq. (1), the GTCA function is composed of two subfunctions: group thermal comfort
inequality (GTCI) subfunction and thermal comfort poverty (TCP) subfunction. The GTCI
subfunction aggregates and averages the thermal comfort levels of individual occupants, and it
penalizes the conditions of unequal distribution of thermal comfort across different building
occupants. This function is based on the assumption that an ideal indoor environment should be
able to provide equal thermal comfort to all building occupants. Thus, when defining the group
comfort level, the unequal distribution of thermal comfort across individual occupants is
considered as a discounting factor, which lowers the aggregated group thermal comfort level.
The GTCI function penalizes the unequal distribution of thermal comfort by measuring such
inequality through the Gini coefficient. The TCP subfunction penalizes the conditions where
some occupants feel extremely uncomfortable in the indoor environment. This function is based
on the assumption that the indoor environment that makes some occupants extremely
uncomfortable is below the building standard, thus should be penalized when defining the group
thermal comfort level in such indoor environment.
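Eq. (1) can be implemented directly; a compact sketch in which the Gini coefficient is computed in its mean-absolute-difference form, which is equivalent to the Lorenz-area definition for discrete, positive data:

```python
def gini(xs):
    """Gini coefficient in mean-absolute-difference form (equivalent to
    the Lorenz-area definition for discrete, positive data)."""
    n, mu = len(xs), sum(xs) / len(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n * mu)

def group_comfort(ic, l):
    """Group thermal comfort GC_k of Eq. (1).

    ic: individual comfort levels IC_ik; l: line of extreme discomfort.
    The GTCI term discounts the mean comfort by inequality (1 - G_k);
    the TCP term subtracts the average shortfall below the line l.
    """
    m = len(ic)
    gtci = (sum(ic) / m) * (1 - gini(ic))     # equality-discounted mean comfort
    tcp = sum(max(0, l - c) for c in ic) / m  # extreme-discomfort penalty
    return gtci - tcp
```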

CASE STUDY
A hypothetical case study was conducted to illustrate the use of the proposed GTCA model
in measuring group thermal comfort level in a classroom environment. The use of a hypothetical
case study helps to illustrate the benefit of using our proposed model given extreme comfort
ratings provided by the hypothetical stakeholders. Hypothetical case studies are widely used in
the literature (e.g., de la Garza et al. 2007) to test or illustrate the use of new methodologies or
models.
Ten hypothetical occupants are involved in the case study to provide their thermal comfort
levels in a classroom environment. The comfort levels of the ten hypothetical occupants are
presented in Table 1.

Table 1. Thermal Comfort Ratings Provided by Occupants


Occupant Comfort description Comfort level
A Uncomfortable 2
B Slightly uncomfortable 3
C Neutral 4
D Slightly comfortable 5
E Slightly comfortable 5
F Very comfortable 7
G Very comfortable 7
H Slightly uncomfortable 3
I Uncomfortable 2
J Comfortable 6


The thermal comfort aggregation analysis was then conducted using the proposed GTCA
model. The Gini coefficient was first determined through the Lorenz Curve by dividing the area
between the Lorenz curve and the line of complete equality by the total area under the line of
complete equality. The Lorenz curve for thermal comfort ratings of the ten hypothetical
occupants is represented in Figure 2. AutoCAD was used to calculate the Gini coefficient, and
the Gini coefficient in this case study is 0.22. The group thermal comfort level in the classroom
environment was then calculated using the GTCA function [Eq. (1)]. The extreme discomfort
line is assumed to be 2 in this case and any given thermal comfort ratings at or below this line
were penalized. The result of the group thermal comfort for classroom indoor environment is 3.
The result means, collectively, occupants in this classroom environment are slightly
uncomfortable. This is a hypothetical case study that is used to illustrate the use of the proposed
GTCA model. In the future, we will conduct real case studies to further validate the proposed
GTCA model.
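As a rough cross-check of the geometric measurement, the Gini coefficient can also be computed directly from the Table 1 ratings. The sketch below is illustrative, not the authors' procedure: it applies the standard trapezoidal approximation of the discrete Lorenz curve, which yields about 0.23, close to the 0.22 obtained from the AutoCAD measurement (the small difference presumably reflects the geometric approach used in the paper).

```python
def gini(values):
    """Gini coefficient via trapezoidal integration of the discrete Lorenz curve."""
    xs = sorted(values)                  # occupants ordered by comfort level
    n, total = len(xs), sum(xs)
    cum, area = 0.0, 0.0                 # cumulative value; area under Lorenz curve
    for x in xs:
        prev = cum / total
        cum += x
        area += (prev + cum / total) / (2 * n)   # trapezoid over one 1/n step
    # Gini = (area between equality line and Lorenz curve) / (area under equality line)
    return (0.5 - area) / 0.5

ratings = [2, 3, 4, 5, 5, 7, 7, 3, 2, 6]         # comfort levels from Table 1
print(round(gini(ratings), 2))                   # -> 0.23
```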

Figure 2. Lorenz Curve for the case study.

INTEGRATION WITH CHS FRAMEWORK


The proposed GTCA model is an important element of a proposed “Group Comfort-Driven
HVAC CHS Framework” (Figure 3). The framework is composed of three key elements: (1) an
individual thermal comfort measurement module that learns from each individual occupant’s
physiological data captured through Physiological Status Monitoring (PSM) devices, (2) a group
comfort aggregation module that aggregates individual thermal comfort levels into group thermal
comfort level (the focus of this paper), and (3) a comfort-driven HVAC control module that
adaptively operates the HVAC system based on changes in the group thermal comfort level. The proposed CHS framework integrates the collective thermal comfort of occupants into the HVAC control loop, thus supporting occupants’ overall comfort and health in multi-occupancy classroom environments.

CONCLUSIONS
This paper presents a group thermal comfort aggregation (GTCA) model for defining the
aggregated thermal comfort level of a group of occupants based on their individual thermal

comfort levels, the inequality of comfort distribution among the occupants, and the extreme
discomfort perceived by some occupants. The proposed model is grounded primarily in social welfare theory. This research contributes to the body of knowledge by
providing a new thermal comfort aggregation model. This will lead to a better understanding of
the relationship between the “individual thermal comfort” and “group thermal comfort” and a
more robust measurement of the overall thermal comfort of all the occupants in the building.

Figure 3. Group comfort-driven HVAC CHS framework.


In ongoing and future work, the authors will integrate the group thermal comfort model with an HVAC control model so that the HVAC system adapts to changes in occupant thermal comfort levels. This work on group comfort aggregation, together
with the future work on adaptive building system control, will promote human-centered building
planning and design by offering a better understanding about the interactions between the
collective human comfort levels and built environment control.

REFERENCES
American Society of Heating, Refrigerating, and Air-Conditioning Engineers (ASHRAE).
(2013). ASHRAE Standard 55. ASHRAE, Atlanta, GA.
Atkinson, A., and Brandolini, A. (2010). “On analyzing the world distribution of income.” World
Bank Econ. Rev., 24(1), 1–37.
Barr, N. (1998). The economics of the welfare state, 3rd Ed., Stanford Press, Stanford, CA.
Zhang, H., Huizenga, C., Arens, E., and Wang, D. (2004). “Thermal sensation and comfort in transient non-uniform thermal environments.” Eur. J. Appl. Physiol., 92(6), 728–733.
Bellu, L. (2006). “Social welfare, social welfare functions and inequality aversion.” Food and
Agriculture Organization of the United Nations, 〈www.fao.org〉 (Nov. 8, 2017).
Blanc, I., Friot, D., Margni, M., and Jolliet, O. (2008). “Towards a new index for environmental
sustainability based on a DALY weighting approach.” Sustainable Dev., 16(4), 251–260.
De la Garza, J., Prateapusanond, A., and Ambani, N. (2007). “Preallocation of total float in the

application of a Critical Path Method based construction contract.” J. Constr. Eng. Manage.,
11(836), 836–845.
Feldman, A., and Serrano, R. (2006). Welfare economics and social choice theory, 2nd Ed.,
Springer, New York.
Jazizadeh, F., Ghahramani, A., Becerik-Gerber, B., and Kichkaylo, T. (2014). “Human-Building
Interaction Framework for Personalized Thermal Comfort-Driven Systems in Office
Buildings.” Journal of Computing in Civil Engineering, 28(1), 2-16.
Moore, G. E. (1992). Principia ethica, Cambridge University Press, Cambridge, U.K.
Olesen, B.W. and Parsons, K.C. (2002). “Introduction to thermal comfort standards and to the
proposed new version of EN ISO 7730.” Energy and Buildings, 34(2002), 537-548.
Sen, A. (1992). Inequality reexamined, 1st Ed., Harvard University Press, Cambridge, MA.
Sen, A. (1997). On economic inequality, 1st Ed., Clarendon Press, Oxford, UK.
U.S. EPA (U.S. Environmental Protection Agency). (2008). “Care for your air: A guide to indoor
air quality.” EPA 402/F-08/008, Washington, DC.
USGBC (U.S. Green Building Council). (2013). “Health is a Human Right. Green Building Can
Help.” Summit on Green Building & Human Health, USGBC, Washington D.C.
Tresch, R.W. (2008). Public Sector Economics. Palgrave Macmillan. New York, NY.
Yao, R., Li, B., and Liu, J. (2009). “A theoretical adaptive model of thermal comfort – Adaptive
Predicted Mean Vote (aPMV).” Build. & Envir., 44 (10), 2089-2096.
Zhang, L., Pradhananga, N., and D’Souza, N. (2018). “Capturing Human Sensation Using
Physiological Sensing Devices to Support Human-Centered Indoor Environment Design.”
Proc. 2018 Construction Research Congress (CRC), ASCE, Reston, VA, 178-188.

Prototype Development of a Tactile Sensing System for Improved Worker Safety Perception
Sayan Sakhakarmi1; JeeWoong Park, Ph.D.2; and Chunhee Cho, Ph.D.3
1Ph.D. Student, Dept. of Civil and Environmental Engineering and Construction, Univ. of Nevada, Las Vegas, NV. E-mail: [email protected]
2Dept. of Civil and Environmental Engineering and Construction, Univ. of Nevada, Las Vegas, NV. E-mail: [email protected]
3Dept. of Civil and Environmental Engineering, Univ. of Hawaii at Manoa. E-mail: [email protected]

ABSTRACT
The nature of construction activities limits the sensing capability of construction workers,
which causes difficulty in perception of hazards threatening their lives. This difficulty is a
continuous challenge for construction workers. In an attempt to overcome this challenge, this
study proposes a tactile communication system as a means to deliver information about detected
hazards to workers. The system is based on the sense of touch, which workers can perceive even
when their senses of sight and hearing are impaired or limited. In the prototype system
development, the researchers investigated three parameters of vibration signals: signal intensity,
signal duration, and delay between signals, in order to categorize a set of distinct signals that
would result in a minimum rate of signal misperception. This study implemented clustering
analysis and categorized eight distinguishable signals. Such tactile signals, each with specific
information, would enable workers to perceive surrounding information about hazards at
construction sites, which otherwise would be difficult. This method demonstrates the potential of
such a technique to seamlessly deliver hazard information to construction workers by enhancing
their perception ability on a loud and dynamic construction site.

INTRODUCTION
Records from the Bureau of Labor Statistics (2017) show that there were 5,190 fatal work
injuries in 2016 in the United States. The construction industry alone accounted for more than
20% of total worker fatalities (OSHA 2018), and according to statistics (Bureau of Labor
Statistics 2017), the majority of such fatalities in construction were due to falls, electrocution, or
being struck by an object or equipment. The prevention of mishaps due to these reasons alone
could have saved the lives of 631 construction workers (OSHA 2018). Thus, researchers have focused on detecting such hazards using advanced technologies and have developed real-time safety monitoring systems to detect potential hazards (Carbonari et al. 2011; Jebelli et al. 2016, 2018; Kim et al. 2016; Lee et al. 2009; Li et al. 2016; Park et al. 2016; Riaz et al. 2014). Despite this, workers often fail to perceive onsite hazards, which limits the widespread application of past proximity-sensing research.
For the prevention of fatalities, detected hazard information must be promptly communicated
to concerned workers, and the workers must take preventive action accordingly. However, only a
few studies have attempted to alert workers of potential hazards (Carbonari et al. 2011; Park et
al. 2015; Teizer et al. 2010). Moreover, the systems in these studies may be ineffective in
communicating information to the workers due to the adverse nature of construction sites, so
workers may not be able to take preventive actions in time. Some studies have identified the

harsh construction environment as a situation when alerting systems fail (Fyhrie 2016; Wang et
al. 2011). This is due to the fact that construction activities are typically loud, dynamic, and
complicated. The nature of construction activities and site conditions cause difficulty for
construction workers to perceive hazards through their senses of sight and hearing. The inability
to sense such potential hazards may result in severe injuries and fatalities. Therefore, it is
essential to investigate ways to enhance the sensing capability of workers, so that the detected
hazards can be effectively communicated to workers and potential fatalities can be prevented.
Previous studies have demonstrated that an artificial sensory system can replace lost senses
like sight or hearing (Chebat et al. 2011; Ward and Meijer 2010). Researchers have explored the
use of vibration systems to deliver directional information to people without the sense of sight
(Durá-Gil et al. 2017; Faugloire and Lejeune 2014). However, no studies have yet explored the
applicability of such artificial sensory systems to enhance construction safety. Therefore, this
research aims to develop a prototype tactile system that can be used to improve the
communication with workers in a harsh construction environment.

OBJECTIVE AND SCOPE


The main purpose of this research is to enhance the sensing capabilities of construction
workers through an additional sensing system, which would enable them to sense potential
hazards and prevent fatalities on construction sites. For this purpose, the research team
developed a prototype tactile system as a means of communicating important information about
detected hazards to the workers through the sense of touch on their backs. However, the
application of a tactile sensory system designed to deliver information requires an in-depth
understanding of the human capability to perceive such signals, which is the major element
explored in this research. The misperception of communicated signals may result in a more
dangerous situation, rather than solving the problem. Thus, to eliminate problems with signal
misperception, this research, as a prototype in development, focused on identifying a set of
distinct tactile signal units based on three parameters: signal intensity, signal duration, and delay
between signals. These distinguishable signal units could be used on construction sites for quick
delivery of specific hazard information to construction workers and, ideally, prevent potential
fatalities.

METHODOLOGY
The general approach of this experimental study is to develop and use a prototype tactile
sensing system and then identify distinguishable signal units based on signal intensity, signal
duration, and delay between signals. This prototype tactile system, as shown in Figure 1(a), is
developed with 3-volt cylindrical vibration motors, a Wemos D1 R1 Wi-Fi board, a Wemos D1
R2 Wi-Fi board, and a laptop. The vibration motors together with the Wemos D1 R2 Wi-Fi
board act as a sensor system firmly attached to a waistband. The Wemos D1 R2 Wi-Fi board acts
as a control processing unit that triggers the vibration motors. The Wemos D1 R1 Wi-Fi board
acts as a server, which wirelessly links the system with the laptop used to deliver signals to the
system. Upon receiving signals, the vibration motors vibrate to deliver information. Figure 1(b)
illustrates the general experimental procedure in which the system transmits vibration signals to
the worker, and the signal identified by the worker is recorded.
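The laptop-to-server link can be sketched as follows. This is an illustrative sketch only: the JSON wire format, port number, and plain-TCP transport are assumptions, since the paper does not specify how the laptop communicates with the Wemos D1 R1 server board.

```python
import json
import socket

def send_signal_unit(host, intensity_idx, duration_idx, delay_idx, port=8080):
    """Laptop-side helper: deliver one indexed signal unit to the Wi-Fi server board.

    The JSON message format, default port, and raw-TCP transport are
    hypothetical; they stand in for whatever protocol the prototype used.
    """
    payload = json.dumps({"intensity": intensity_idx,
                          "duration": duration_idx,
                          "delay": delay_idx}).encode("utf-8")
    with socket.create_connection((host, port), timeout=2.0) as sock:
        sock.sendall(payload)
```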
Figure 2 shows a sample of a signal profile with the signal parameters marked on it: signal
intensity, signal duration, and delay between signals. A completely different set of unique signal
profiles can be generated by making changes to any of these parameters. However, two different

signal profiles may not be easily distinguishable to human perception. Therefore, each of the three signal parameters is studied individually to gain a better understanding of such signals and to categorize a set of distinguishable signal units.

Figure 1. Prototype Tactile System.

Figure 2. Tactile Signal Profile.

Figure 3. Sample Probability Density Functions.


The vibration motors used in this study have the capacity to trigger signals up to 3.1 V
intensity. Thus, this study tested the signals with different intensities within 3.1 V. With an effort
to make the duration of signals as short as possible for quick delivery of information, durations
in milliseconds are tested, for both signal duration and delay between signals. The first step in
the experimental study is to identify the minimum threshold value of each signal parameter that
is perceivable to human subjects. After identifying the minimum values for all parameters,
different signal units are uniquely indexed for each parameter. Then the test subjects are asked to
identify these indexed signal units, and the researchers record their responses. The recorded data

are analyzed to determine the basic distinguishable signal units for each of the three signal
parameters.
Figure 3 shows a sample of probability density functions for three signal units. The figure
clearly shows overlapping of signals 1 and 2, whereas there is no overlapping of signals 2 and 3.
The overlapping of signals means that the two signals are not clearly distinct, while no
overlapping between two signals is an ideal situation, meaning that these two signals are distinct
and easily distinguishable. However, in reality, such an ideal situation, in which there is no
overlap, is rare. Thus, the signals with minimum overlapping areas are considered as
distinguishable signal units. Accordingly, this study used probability distribution functions and
clustering methods to identify the signal units with a minimum overlapping area for each of the
three signal parameters.
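The procedure just described can be illustrated with a short sketch (illustrative only, not the authors' implementation): each signal unit's perceived value is modeled as a normal PDF, the shared area under two PDFs is integrated numerically as min(pdf1, pdf2), and the k-unit cluster with the smallest summed pairwise overlap is found by brute force over index combinations.

```python
import math
from itertools import combinations

def overlap_area(mu1, sd1, mu2, sd2, lo=-10.0, hi=20.0, steps=30000):
    """Shared area under two normal PDFs, integrated numerically as min(pdf1, pdf2)."""
    def pdf(x, mu, sd):
        return math.exp(-(x - mu) ** 2 / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))
    dx = (hi - lo) / steps
    return sum(min(pdf(lo + i * dx, mu1, sd1), pdf(lo + i * dx, mu2, sd2)) * dx
               for i in range(steps))

def best_cluster(indices, pairwise_overlap, k):
    """Brute-force the k-unit cluster whose summed pairwise overlap area is smallest."""
    return min(combinations(indices, k),
               key=lambda cl: sum(pairwise_overlap[a][b]
                                  for a, b in combinations(cl, 2)))
```

Identical PDFs give an overlap near 1, well-separated PDFs give an overlap near 0, matching the intuition in Figure 3; the distribution parameters here are placeholders, not values fitted to the experimental data.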

EXPERIMENT
The experimental study used five test subjects who were able to perceive tactile signals
received on their backs and react accordingly. To determine the minimum perceivable value of each parameter, the test subjects were asked whether they were able to perceive different values of each parameter. In the case of signal intensity, it was found that signals below 1.3 V were not perceivable to the test subjects. Thus, a vibration intensity in the range of 1.3 V to 3.1 V with 10
equally spaced intensities was used in this study, as shown in Table 1. For the signal durations
and delays, various durations starting from 50 ms were tested. However, it was found that the
signals with a duration of less than 75 ms were not easily detectable. Thus, a minimum signal
duration of 75 ms was used. Signal durations, as well as delays between signals ranging from 75
ms to 500 ms were used in this study. Each of these parameters was indexed with numerical
values from 1 to 10 as shown in Table 1.

Table 1. Signal Units Indexing.

Signal Parameter   Index:   1    2    3    4    5    6    7    8    9    10
Intensity (V)             1.3  1.5  1.7  1.9  2.1  2.3  2.5  2.7  2.9  3.1
Duration (ms)              75  100  150  200  250  300  350  400  450  500
Delay (ms)                 75  100  150  200  250  300  350  400  450  500
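The indexing in Table 1 can be transcribed directly into a small lookup helper. This is a convenience sketch; the names are illustrative, and the values come verbatim from the table.

```python
# Parameter values transcribed from Table 1, positions 1..10.
INTENSITY_V = [1.3, 1.5, 1.7, 1.9, 2.1, 2.3, 2.5, 2.7, 2.9, 3.1]
DURATION_MS = [75, 100, 150, 200, 250, 300, 350, 400, 450, 500]
DELAY_MS = [75, 100, 150, 200, 250, 300, 350, 400, 450, 500]

def signal_unit(intensity_idx, duration_idx, delay_idx):
    """Map 1-based Table 1 indices to concrete (volts, ms, ms) parameter values."""
    return (INTENSITY_V[intensity_idx - 1],
            DURATION_MS[duration_idx - 1],
            DELAY_MS[delay_idx - 1])

print(signal_unit(1, 1, 1))    # -> (1.3, 75, 75)
```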

The experiment comprised both training and testing phases with the test subjects. The test
subjects were equipped with the prototype tactile system embedded on a waist belt, and a laptop
was used to send signals. For each parameter, the test subjects were trained with 100 signal units
for each index. Therefore, each test subject received a total of 3,000 signals during the training
phase. In the testing phase, each test subject was asked to identify 40 signal units for each index
separately: signal intensity, signal duration, and delay between signals. Therefore, the five test
subjects received a total of 6,000 random vibration signal units. The index that each subject reported for each signal, as he or she perceived it, was recorded along with the actual signal index.

ANALYSIS AND RESULTS


The data recorded from the experiment were used to plot the probability density functions for
each signal parameter separately, as shown in Figure 4. The figure illustrates that the overlapping
area between the probability density functions of two signals decreases when the signal indices

are far apart for all signal parameters. For example, the overlapping area between Signal 1 and
any other signal decreases as the index of the other signal changes from 2 toward 10, as seen in Figure 4(a). The plots also demonstrate that there is no ideal situation without overlap between two signal units.

Figure 4. Probability Density Functions.


The researchers implemented a clustering method to quantify signal effectiveness and
determine the optimum number of signal indices and their combinations for each of the signal
parameters. First, the signal clusters of 2 to 10 different signal indices were formed for each
signal parameter. Then, the sums of overlapping areas were computed for each cluster to find the
minimum value of the overlapping area. Table 2 shows the overlapping areas calculated for the
signal intensity cluster of two signal units. The table shows an increase in the overlapping area as
the signal indices get closer to one another.

Table 2. Overlap Areas for Signal Intensity Cluster of Two Signal Units.
Index 1 2 3 4 5 6 7 8 9 10
1 1 0.6019 0.3074 0.0671 0.0185 0 0 0 0 0
2 0.6019 1 0.7056 0.3662 0.0791 0.0303 0.0379 0.0192 0 0
3 0.3074 0.7056 1 0.5778 0.2333 0.0414 0.0490 0.0303 0 0
4 0.0671 0.3662 0.5778 1 0.6042 0.4122 0.3434 0.1442 0.0139 0
5 0.0185 0.0791 0.2333 0.6042 1 0.6079 0.5598 0.2229 0.0556 0
6 0 0.0303 0.0414 0.4122 0.6079 1 0.7576 0.3607 0.0530 0
7 0 0.0379 0.0490 0.3434 0.5598 0.7576 1 0.5425 0.0985 0.0152
8 0 0.0192 0.0303 0.1442 0.2229 0.3607 0.5425 1 0.1603 0.0769
9 0 0 0 0.0139 0.0556 0.0530 0.0985 0.1603 1 0.6708
10 0 0 0 0 0 0 0.0152 0.0769 0.6708 1

Figure 5 plots the minimum summation of overlapping areas against the corresponding number of signal units in the cluster for all signal parameters. For signal intensity and signal duration, the plot shows a sharp rise in the overlapping area when the number of signal units in the cluster is more than three. However, in the case of signal delay, there is a sharp
rise in the overlapping area for a cluster with more than two signal units. These results suggest
that there are three distinct signal units for both signal duration and signal intensity, whereas
there are just two distinct signal units for delay between signals.

Figure 5. Optimum Number of Signal Units.


Based on the minimum summation of overlapping areas for different clusters of three signal
units, it was found that the optimum cluster for signal intensity consisted of signal units 1, 5, and 10. Similarly, signal units 1, 5, and 9 formed the optimum cluster for signal duration. In the case of delay between signals, signal units 1 and 10 formed the optimum cluster. Figure 6 shows the plots
of the optimum cluster for each of the three signal parameters.

Figure 6. Distinguishable Signal Units.


CONCLUSION
Construction sites are loud, complex, and dynamic, which limits the perception ability of workers. To overcome workers’ limited sensing ability, especially for safety applications, this study presented a tactile-based sensing system through prototype development and follow-up testing. In particular, the research team conducted experiments to identify distinguishable tactile
signal units based on three signal parameters: signal intensity, signal duration, and delay between
signals. The researchers used the prototype tactile system to deliver different types of indexed
signal units to five test subjects and recorded the signals perceived by them. The data collected
from the five test subjects were analyzed using probability density functions and clustering
methods to identify an optimum number of distinct signal units for each of the three signal

parameters. Out of 30 signal units, the study categorized eight signal units as distinguishable,
meaning that the construction workers can easily perceive them.
This study focused on identifying basic signal units for a prototype tactile system. However,
the implementation of the prototype tactile system on construction sites will require a complete tactile-based communication language. The signal units identified in this study could serve as basic units for developing such a language to deliver meaningful information. With such a tactile-based language, the prototype tactile system could be enhanced into a full communication system for construction sites. Therefore, further extensive study would be required to develop fully communicable tactile signals carrying meaningful information.

ACKNOWLEDGEMENT
This publication was supported by CPWR through NIOSH cooperative agreement
OH009762. Its contents are solely the responsibility of the authors and do not necessarily
represent the official views of CPWR or NIOSH.

REFERENCES
Bureau of Labor Statistics. (2017). “National Census of Fatal Occupational Injuries in 2016.”
<https://www.bls.gov/news.release/pdf/cfoi.pdf> (Oct. 20, 2018).
Carbonari, A., Giretti, A., and Naticchia, B. (2011). “A proactive system for real-time safety
management in construction sites.” Automation in Construction, Elsevier B.V., 20, 686–698.
Chebat, D. R., Schneider, F. C., Kupers, R., and Ptito, M. (2011). “Navigation with a sensory
substitution device in congenitally blind individuals.” NeuroReport, 22(7), 342–347.
Durá-Gil, J. V., Bazuelo-Ruiz, B., Moro-Pérez, D., and Mollà-Domenech, F. (2017). “Analysis
of different vibration patterns to guide blind people.” PeerJ, 5, e3082.
Faugloire, E., and Lejeune, L. (2014). “Evaluation of Heading Performance With Vibrotactile
Guidance : The Benefits of Information – Movement Coupling Compared With Spatial
Language.” Journal of Experimental Psychology: Applied, 20(4), 397–410.
Fyhrie, P. B. (2016). “Work Zone Intrusion Alarms for Highway Workers.” Caltrans Division of
Research, Innovation and System Information,
<http://www.dot.ca.gov/newtech/researchreports/preliminary_investigations/docs/work_zone
_warning_preliminary_investigation.pdf> (Oct. 20, 2018).
Jebelli, H., Ahn, C. R., and Stentz, T. L. (2016). “Comprehensive Fall-Risk Assessment of
Construction Workers Using Inertial Measurement Units: Validation of the Gait-Stability
Metric to Assess the Fall Risk of Iron Workers.” Journal of Computing in Civil Engineering,
30(3), 04015034.
Jebelli, H., Hwang, S., and Lee, S. H. (2018). “EEG-based workers’ stress recognition at
construction sites.” Automation in Construction, Elsevier, 93, 315–324.
Kim, H., Lee, H. S., Park, M., Chung, B. Y., and Hwang, S. (2016). “Automated hazardous area
identification using laborers’ actual and optimal routes.” Automation in Construction, 65, 21–
32.
Lee, U.-K., Kim, J.-H., Cho, H., and Kang, K.-I. (2009). “Development of a mobile safety
monitoring system for construction sites.” Automation in Construction, Elsevier B.V., 18,
258–264.
Li, H., Yang, X., Wang, F., Rose, T., Chan, G., and Dong, S. (2016). “Stochastic state sequence
model to predict construction site safety states through Real-Time Location Systems.” Safety
Science, Elsevier Ltd, 84, 78–87.

OSHA. (2018). “Commonly Used Statistics.”


<https://www.osha.gov/oshstats/commonstats.html> (Mar. 18, 2018).
Park, J., Kim, K., and Cho, Y. K. (2016). “Framework of Automated Construction-Safety
Monitoring Using Cloud-Enabled BIM and BLE Mobile Tracking Sensors.” Journal of
Construction Engineering and Management, American Society of Civil Engineers, 143(2),
05016019.
Park, J., Marks, E., Cho, Y. K., and Suryanto, W. (2015). “Performance Test of Wireless
Technologies for Personnel and Equipment Proximity Sensing in Work Zones.” Journal of
Construction Engineering and Management, 142(1).
Riaz, Z., Arslan, M., Kiani, A. K., and Azhar, S. (2014). “CoSMoS: A BIM and wireless sensor
based integrated solution for worker safety in confined spaces.” Automation in Construction,
45, 96–106.
Teizer, J., Allread, B. S., Fullerton, C. E., and Hinze, J. (2010). “Autonomous pro-active real-
time construction worker and equipment operator proximity safety alert system.” Automation
in Construction, Elsevier B.V., 19, 630–640.
Wang, M.-H., Schrock, S. D., Bai, Y., and Rescot, R. A. (2011). Evaluation of Innovative Traffic
Safety Devices at Short-Term Work Zones. Kansas Department of Transportation.
Ward, J., and Meijer, P. (2010). “Visual experiences in the blind induced by an auditory sensory
substitution device.” Consciousness and Cognition, Elsevier Inc., 19, 492–500.

Robustness Analysis of Design Phase Performance Predictors Using Extreme Bounds Analysis (EBA)
Sharareh Kermanshachi, Ph.D., P.E., M.ASCE1; and Behzad Rouhanizadeh, S.M.ASCE2
1Assistant Professor, Dept. of Civil Engineering, Univ. of Texas at Arlington, 438 Nedderman Hall, 416 Yates St., Arlington, TX 76019 (corresponding author). E-mail: [email protected]
2Ph.D. Student, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman Hall, 416 Yates St., Arlington, TX 76019. E-mail: [email protected]

ABSTRACT
The design phase is one of the most important phases in any construction project. Leading cost and schedule performance indicators of the design phase have been studied by some researchers; however, the robustness of these variables has rarely been studied. The goal of this research is to discern between the robust and fragile cost overrun and schedule delay indicators of the design phase. Both extreme bounds analysis (EBA) methods, Leamer’s and Sala-i-Martin’s, were implemented in this research. Leamer’s method considers only the extreme bounds of an indicator’s distribution, while Sala-i-Martin’s considers the indicator’s whole distribution. The results of this study revealed robust cost overrun and schedule delay indicators in the design phase. Project managers can use the findings of this research to give the more robust indicators higher priority and to reduce the design modifications required during the project process.

INTRODUCTION
The construction industry plays a key role in the economic growth of communities and accounts for roughly six to nine percent of GDP in developed countries such as the U.S. (Chitkara, 1998).
There are a variety of inherent uncertainty sources in each phase of a construction project
including the design/engineering, procurement, and construction phases. These uncertainties can affect the performance of a project; thus, identifying and measuring their effects is essential to project success, as it helps minimize cost overruns and time delays. Uncertainties in the design phase can arise from many sources, such as unexpected site conditions, imprecise initial project requirements, and design errors (Assaf and Al-Hejji, 2006). Understanding the indicators that affect the performance of a construction project in the design phase is the first step in controlling their impacts (Habibi and Kermanshachi,
2018). In this regard, the indicators with the greatest effects on the performance of a project
should be identified. To measure the performance of a project, delays and cost overruns are considered indicators of performance (Kermanshachi et al. 2016). While the design process is generally acknowledged to be imperfect, there has been a lack of study on what uncertainties to expect and how to manage them in order to control their impacts on project performance. Furthermore, although some studies have analyzed the causes of delays and cost overruns in construction projects, few have focused on these problems in the design phase (Habibi et al. 2018). Therefore, the goal of this research was to identify the factors that have the greatest
impact on time and cost performance of a construction project. To achieve this goal, the
objective of this research was to develop a sensitivity analysis model which determines how
robustly each of the engineering/design cost and schedule indicators is associated with the
project cost and schedule performance. By identifying these robust indicators, project managers

could make better decisions for allocating their resources in the project, which leads to higher
project cost and schedule performance.

BACKGROUND
Most of the construction projects experience cost and time overruns in all or some of their
phases (Assaf and Al-Hejji, 2006). Some studies have been conducted on cost overruns leading
to different definitions of cost overruns. Flyvbjerg et al. (2002) defined cost overruns as the gap
between the estimated cost and the actual cost in a project. Alongside the cost overruns, a time
overrun is defined as a delay beyond the agreed contract deadline or the agreed date for the final delivery of a project. Cantarelli et al. (2012) studied cost overruns in several infrastructure projects and concluded that geography plays a key role in causing this problem. During the last decades, several studies have attempted to identify the causes of schedule delays in construction projects. For instance, Baldwin and Manthei (1971) examined causes of delays in building projects in the U.S.; Sullivan and Harris (1986) investigated causes of delays in large construction projects in the United Kingdom; and Odeh and Battaineh (2002) analyzed causes of time overruns in large construction projects executed in Jordan. These studies identified many indicators, chiefly poor labor productivity, design changes, and inadequate planning. Naturally, construction managers seek to minimize schedule delays and cost
overruns (Safapour et al. 2018). Different phases of a project have different influences on a
project’s cost and schedule, thus, each of them should be analyzed to identify all the causes of
this problem. Paulson (1976) indicated that the intensity of impact on cost of a project is the
greatest within the engineering/design phase. In addition, delays during design phase compress
the schedule when completion date of a project is fixed. Hence, spending effort in this early stage
of a project impacts the performance of the project and level of success tremendously (Chang et
al., 2010). Therefore, identifying the causes of schedule delays and cost overruns are the initial
step of resolving these problems. Kermanshachi (2016) has analyzed and identified the main
indicators of project performance during all the project phases. For the current study, the results
of that effort were used, and the indicators identified for the design phase were analyzed to
determine their robustness with respect to the performance of a construction project. As shown in
Tables 1 & 2, Kermanshachi (2016) found nine indicators from design phase, which affect cost
performance and nine indicators that affect schedule performance in a construction project. The
results of this study will help practitioners and academicians in the construction management field to
better understand phase-based project success. In addition, by identifying robust schedule and
cost performance indicators in the engineering/design phase, project managers can allocate
their resources more efficiently.

RESEARCH METHODOLOGY
In this study, first, a comprehensive literature review was performed on indicators of the
design phase in the construction industry. As shown in Tables 1 & 2, nine indicators for cost overruns
and nine for schedule delays were identified by Kermanshachi (2016) as the main factors
influencing the performance of a construction project, thus, these indicators were considered in
this research as the basis for robustness analysis of indicators in design/engineering phase. Then,
for both groups of indicators, the EBA analysis was performed, which includes the Leamer’s test
and Sala-i-Martin’s test. In the following, EBA and both corresponding approaches are
presented. EBA is a sensitivity analysis approach that investigates how robustly the dependent
variable of a regression model is related to the independent variables (Leamer, 1985). This

© ASCE
Computing in Civil Engineering 2019 565

method is highly recommended for uncertainty measurement due to its accuracy and its ability to
provide lower and upper limits for any given variable (Moosa & Cardak, 2006). It has been
applied in many fields of knowledge, including economics, social science, and political science.

Table 1. Design/Engineering Phase Schedule Performance Indicators


No.   Indicator                                                                                   P-value
ES1   Risk Assessment Process Implementation                                                       0.02
ES2   Project Management Team Peak Size-Engineering/Design Phase                                   0.00
ES3   Project Execution Driver                                                                     0.00
ES4   Engineering Project Management Team (PMT) Efficiency Level-Average Number of Participants    0.00
ES5   Actual Percentage of Engineering/Design Completion at the Start of Construction              0.03
ES6   Number of Owner Organizations                                                                0.00
ES7   Number of Financial Approval Authority Thresholds                                            0.00
ES8   Change Management Process Effectiveness in Controlling Cost & Schedule                       0.00
ES9   Planned Percentage of Engineering/Design Completion at the Start of Construction             0.00

Table 2. Design/Engineering Phase Cost Performance Indicators


No. Indicator P-value
EC1 Use of Alignment Strategy 0.02
EC2 Reuse of Existing Installed Equipment 0.17
EC3 Project Primary Nature 0.31
EC4 Project Documents Translated into a Different Language 0.11
EC5 Number of Owner Driven Change Orders-Engineering/Design Phase 0.45
EC6 Engineering/Design Phase Baseline Schedule 0.06
EC7 Contract Containing Penalties for Late Completion 0.15
EC8 Contract Containing Liquidated Damages 0.42
EC9 Engineering/Design Phase Actual Schedule 0.03

Levine & Renelt (1992) presented a simplified version of EBA as shown in Eq. 1.
y = α_j + β_j·v + γ_j·F + δ_j·D_j + ε   (Eq. 1)
where y stands for the dependent variable; j is the index of the regression model; F stands for a
vector of fixed variables included in every regression; v is the variable whose robustness is to be
examined, with coefficient β_j; D_j is a vector of up to three additional variables used in the
regression; and ε stands for the error term.
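The enumeration of model specifications implied by Eq. 1 can be sketched in code. The following is an illustrative outline only, not code from the study: the function and variable names, the use of a plain OLS estimator, and the NumPy arrays are all my assumptions. For every combination of up to three doubtful variables from D_j, the regression is re-estimated and the coefficient β_j on v is collected together with its standard error σ_j.

```python
from itertools import combinations
import numpy as np

def eba_specifications(y, v, F, doubtful, max_extra=3):
    """Estimate Eq. 1 for every combination of up to `max_extra` doubtful
    variables D_j; return a list of (beta_j, sigma_j) pairs for v."""
    results = []
    for k in range(max_extra + 1):
        for D in combinations(range(doubtful.shape[1]), k):
            # design matrix: intercept, variable of interest v, fixed F, chosen D_j
            X = np.column_stack([np.ones(len(y)), v, F, doubtful[:, list(D)]])
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ coef
            dof = len(y) - X.shape[1]
            s2 = resid @ resid / dof                   # residual variance
            cov = s2 * np.linalg.inv(X.T @ X)          # coefficient covariance
            results.append((coef[1], np.sqrt(cov[1, 1])))  # (beta_j, sigma_j)
    return results
```

With four doubtful variables, this yields 1 + 4 + 6 + 4 = 15 specifications, whose collected (β_j, σ_j) pairs feed the Leamer and Sala-i-Martin tests described below.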

Leamer’s EBA
Leamer’s EBA focuses on the extreme bounds of the regression coefficients to determine whether a
variable is fragile or robust (Leamer 1985). This EBA approach scans many model
specifications for the highest and lowest values that β could take at the desired confidence level.
First, considering v as the variable of interest, a base regression including the variables of the F
vector and D_j would be estimated. Second, to find the lowest and highest values of β_j, regressions
for all combinations of the variables in the D_j vector would be estimated. Then the lower extreme bound for
β_j, and the upper extreme bound for β_j, would be estimated using Eq. 2 and Eq. 3, respectively:
Lower Extreme Bound = min_j (β_j − 2σ_j)   (Eq. 2)
Upper Extreme Bound = max_j (β_j + 2σ_j)   (Eq. 3)
where σ_j stands for the standard deviation (standard error) of β_j.
If the estimated upper and lower bounds have the same sign, v is referred to as robust;
otherwise, it is fragile. While changes in the input of a robust variable will not affect the
model output significantly, the model output is sensitive to even slight changes in the input
of a fragile variable. This method imposes a very demanding robustness criterion because
the result of a single regression model is enough to classify a variable as fragile.
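Leamer’s classification rule can be written as a short helper. This is an illustrative sketch, not the study’s code; `estimates` is assumed to be a list of (β_j, σ_j) pairs collected across the candidate model specifications.

```python
def leamer_eba(estimates):
    """Classify a variable as 'robust' or 'fragile' from the extreme bounds
    of its coefficient across model specifications (Eq. 2 & 3)."""
    lower = min(b - 2 * s for b, s in estimates)  # Eq. 2
    upper = max(b + 2 * s for b, s in estimates)  # Eq. 3
    # robust only if the whole [lower, upper] interval lies on one side of zero
    return "robust" if lower * upper > 0 else "fragile"
```

For example, a variable whose bounds are both positive (say, ES7 in Table 3, with bounds 0.01 and 0.25) is classified robust, while a single wide estimate such as (0.05, 0.10) straddles zero and is classified fragile.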

Sala-i-Martin’s EBA
To address several drawbacks of Leamer’s EBA, Sala-i-Martin (1997) suggested an
alternative EBA technique which, instead of focusing only on the extreme bounds
of the regression coefficients, relies on their entire distribution. Therefore, this method assigns
a graded level of confidence to the robustness of each variable rather than applying a binary label of
fragile or robust. According to the literature on this method, “if 95 percent of the density function
for the estimates of β_1 lies to the right of zero and only 50 percent of the density function for β_2
lies to the right of zero, one will probably think of variable 1 as being more likely to be
correlated with the dependent variable than variable 2.” In Sala-i-Martin’s EBA, a variable is
considered more robust when a greater proportion of its coefficient estimates lies on the same
side of zero.
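Under a normality assumption, the share of a coefficient’s density lying to either side of zero can be approximated by averaging the normal CDF at zero across specifications. The sketch below illustrates the idea only; it does not reproduce the exact likelihood-based weighting of Sala-i-Martin (1997), and the function name and input format are my assumptions.

```python
from statistics import NormalDist

def cdf_share_right_of_zero(estimates):
    """Average, over all model specifications, the probability mass of a
    N(beta_j, sigma_j) coefficient density lying to the right of zero."""
    shares = [1.0 - NormalDist(mu=b, sigma=s).cdf(0.0) for b, s in estimates]
    return 100.0 * sum(shares) / len(shares)  # percentage, as in Tables 3 & 4
```

A variable for which this share (or its complement) approaches 100, as for ES8 in Table 3, would be judged robust; a share near 50 indicates a fragile variable.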

Sensitivity Analysis of Engineering/Design Phase Schedule Performance Indicators


The nine schedule performance indicators of design/engineering phase identified by
Kermanshachi (2016) were used for EBA sensitivity analysis and to decide if an indicator is
fragile or robust. Both of the EBA tests described previously were implemented, and as
demonstrated in Table 3, six of the variables were robust and three of them were fragile. Except
“risk assessment process implementation”, “PMT peak size of engineering/design phase”, and
“actual percentage of engineering/design completion at the start of construction”, the remaining
indicators are robust schedule delay indicators of the design phase. According to the EBA
results, if the average number of PMT participants during the engineering phase (ES4) increases,
the project will face less schedule overrun during the design phase due to the availability of
diverse skilled experts (Kermanshachi et al. 2017). Furthermore, this study revealed that if more
owner organizations are partnered on a project (ES6), the engineering phase schedule will benefit due to
the availability of more resources and collaboration of experienced practitioners. This study also
indicated that if the number of financial approval authority thresholds in the project increases
(ES7), the engineering phase schedule will suffer due to the uncertainties associated with the
ultimate budget. When the design and construction budget is not fully approved and finalized,
engineers will face difficulties in proceeding with the project design. In addition, the change
management process (ES8) and planned percentage of engineering/design completion at the start
of construction (ES9) were the last two robust determinants of schedule performance during the
engineering phase. The results indicated that implementing a balanced change culture of
recognition, planning, and evaluation of project changes in an organization reduces the
probability of extending the schedule during the design phase. Such a change culture
makes the project participants ready to embrace the owner’s change requests and accomplish
the owner’s objectives effectively. Besides, planned percentage of design completion at
the start of construction (ES9) has an adverse relationship with the schedule performance during
the design phase.
Although the fragile variables do not have as considerable an impact as the robust indicators,
they can still provide some information about the project schedule performance. It was also found
that implementation of the risk assessment process (ES1) could potentially reduce the
engineering phase schedule overrun through effective management of unexpected problems.
Planning for the risk assessment process could require more resource commitment and thus,
negatively affect the project schedule during the engineering phase. Therefore, depending on the
project nature, implementation of the risk assessment process should be considered consciously.
The peak project management team size during the engineering phase (ES2) was
another fragile design schedule performance determinant, which could have either a positive or
negative relationship with schedule performance.

Table 3. EBA Study of Project Engineering Schedule Overrun Model


                Leamer EBA Test                          Sala-i-Martin EBA
      Lower      Upper      Normal       Normal      Non-Normal   Non-Normal
      Extreme    Extreme    CDF(β<=0)    CDF(β>0)    CDF(β<=0)    CDF(β>0)     Robustness
      Bound      Bound
ES1 -0.15 0.11 79.03 20.97 76.79 23.21 Fragile
ES2 -0.01 0.01 38.15 61.85 37.92 62.08 Fragile
ES3 -0.07 0.23 1.25 98.75 3.98 96.02 Robust
ES4 -0.01 0.00 99.49 0.51 99.25 0.75 Robust
ES5 -0.29 0.09 94.91 5.09 93.14 6.86 Fragile
ES6 -0.60 0.00 99.69 0.32 99.40 0.60 Robust
ES7 0.01 0.25 0.03 99.97 0.17 99.83 Robust
ES8 -0.21 0.05 100.00 0.00 100.00 0.00 Robust
ES9 -0.02 0.01 98.84 1.17 95.91 4.09 Robust
(ES1): Risk Assessment Process Implementation, (ES2): Project Management Team Peak Size-Engineering/Design
Phase, (ES3): Project Execution Driver, (ES4): Engineering PMT Efficiency Level-Average Number of Participants,
(ES5): Actual Percentage of Engineering/Design Completion at the Start of Construction,(ES6): Number of Owner
Organizations, (ES7): Number of Financial Approval Authority Thresholds, (ES8): Change Management Process
Effectiveness in Controlling Cost and Schedule, (ES9): Planned Percentage of Engineering/Design Completion at
the Start of Construction.

Increasing the size of the project management team could positively affect schedule
performance since more human resources will be available to the project. At the same time,
increasing the number of participants could cause more disagreements and conflicts between
project members (Kermanshachi et al. 2018a, 2018b, and 2019). Thus, the impact of the peak
PMT size should be determined in light of other factors, such as how large the number of
participants is and how complicated the scope of the project is (Kermanshachi et al. 2017). Figure
1 illustrates the normality or non-normality of the ES2 and ES7 coefficient distributions for engineering
schedule overrun indicators. Once a determination is made regarding the normality or non-
normality of these indicators’ coefficients, appropriate columns in Table 3 will be used to decide
if an indicator is robust or fragile. The magnitudes of regression coefficients are on the horizontal
axis and the vertical axis indicates the corresponding probability density.


Figure 1. Graphical representation of Sala-i-Martin EBA for ES2 & ES7


Sensitivity Analysis of Engineering/Design Phase Cost Performance Indicators
The nine cost performance indicators of design/engineering phase identified by
Kermanshachi (2016) were used for EBA sensitivity analysis and to decide if an indicator is
fragile or robust. According to Table 4, six of the nine cost performance indicators in the
engineering phase were robust and the remaining three were fragile. Except “use of alignment
strategy”, “reuse of existing installed equipment”, and “number of owner-driven change orders in the
engineering/design phase”, the remaining indicators are robust cost performance indicators in the
design phase. As the analysis shows, if project documents are translated into a different language
(EC4), this is an indicator of cost overrun during the engineering phase. Moreover,
international projects often require more resources to plan and execute. This slower process
ultimately impacts engineering phase cost performance negatively due to the overhead costs.

Table 4. EBA Study of Project Engineering Cost Overrun Model


                Leamer EBA Test                          Sala-i-Martin EBA
      Lower      Upper      Normal       Normal      Non-Normal   Non-Normal
      Extreme    Extreme    CDF(β<=0)    CDF(β>0)    CDF(β<=0)    CDF(β>0)     Robustness
      Bound      Bound
EC1 -0.15 0.11 79.03 20.97 76.79 23.21 Fragile
EC2 -0.01 0.01 38.15 61.85 37.92 62.08 Fragile
EC3 -0.07 0.23 1.25 98.75 3.98 96.02 Robust
EC4 0.00 0.01 0.51 99.49 0.75 99.25 Robust
EC5 -0.09 0.29 5.09 94.91 6.86 93.14 Fragile
EC6 -0.60 0.00 99.69 0.32 99.40 0.60 Robust
EC7 0.01 0.25 0.03 99.97 0.17 99.83 Robust
EC8 0.05 0.21 0.00 100.00 0.00 100.00 Robust
EC9 -0.02 0.01 98.84 1.17 95.91 4.09 Robust
(EC1): Use of Alignment Strategy, (EC2): Reuse of Existing Installed Equipment, (EC3): Project Primary Nature,
(EC4): Project Documents Translated into a Different Language, (EC5): Number of Owner Driven Change Orders-
Engineering/Design Phase, (EC6): Engineering/Design Phase Baseline Schedule, (EC7): Contract Containing
Penalties for Late Completion, (EC8): Contract Containing Liquidated Damages, (EC9): Engineering/Design Phase
Actual Schedule.

The same analysis revealed that if the number of owner-driven change orders during the
engineering/design phase increases (EC5), the engineering phase cost performance will suffer
and there is a high chance of cost overrun in this phase. This poor cost performance could be
explained due to the extra engineering hours required to satisfy the owner’s change orders
(Safapour et al. 2018). This study also concluded that if the contract contains penalties for late
completion (EC7) as well as liquidated damages (EC8), the project engineering phase may face
cost overruns due to the extra human resources required to complete the project on time. This
issue often occurs when the project uses a design-build delivery method and the
engineering phase must be shortened to dedicate more time to the construction phase. The
results demonstrated that the engineering phase baseline schedule (EC6) is a robust indicator of
design phase cost performance. These results show that if a project has a more flexible baseline
schedule, there is a lower probability of engineering cost overrun. It is also concluded
that the use of alignment strategy (EC1) will reduce the probability of engineering phase cost
overrun. Alignment is the condition where appropriate project participants are working within
acceptable tolerances to develop and meet a uniformly defined and understood set of project
priorities (Safapour et al. 2017). Figure 2 illustrates the normality or non-normality of EC7 and
EC8 distributions for engineering cost overrun indicators.

Figure 2. Graphical representation of Sala-i-Martin EBA for EC7 & EC8

CONCLUSIONS
In this study, an EBA sensitivity analysis of the cost and schedule overrun indicators was
applied to determine their robustness. Most projects, including construction projects, deal
with schedule delays and cost overruns during all phases of their life cycle, including the
engineering/design phase. Since there were limited studies in the literature on determining the
robustness of cost and schedule performance indicators, this study aimed to determine their
robustness in the design/engineering phase of construction.
According to the analysis, except “use of alignment strategy”, “reuse of existing installed
equipment”, and “number of owner-driven change orders in the engineering/design phase”, the other
indicators are robust cost performance indicators in the design phase. Findings of the analysis also
indicate that except “risk assessment process implementation”, “project management team peak
size of engineering/design phase”, and “actual percentage of engineering/design completion at
the start of construction”, the other indicators are robust schedule delay indicators of the
design phase. Both EBA approaches, Leamer’s and Sala-i-Martin’s, were implemented and the
results were presented, but the conclusions about the fragility or robustness of the indicators were
based on the Sala-i-Martin EBA method. The results of this study will help project managers
allocate their resources more effectively in construction projects, especially when resources
of any type are scarce.

REFERENCES
Assaf, S.A. and Al Hejji, S. (2006). “Causes of delay in large construction projects.” Int. J.
Project Management, 24: 349–357.


Baldwin, J.R., Manthei J.M. (1971). “Causes of delay in the construction industry.” Journal of
Construction Division, ASCE, 97: 177–187.
Cantarelli, C. C., van Wee, B., Molin, E. J. E., and Flyvbjerg, B. (2012). “Characteristics of cost
overruns for Dutch transport infrastructure projects and the importance of the decision to
build and project phases.” Transport Policy, 22: 49–56.
Chang K., Shin T., Klein G. (2010). “User commitment and collaboration: motivational
antecedents and project performance.” Inf. Softw. Technol., 52:672–679.
Chitkara, K. K. (1998). “Construction project management: Planning, scheduling and
controlling.” McGraw-Hill, New Delhi.
Flyvbjerg, B., Holm, M. K. S. and Buhl, S. L. (2002). “Cost underestimation in public works
projects: Error or lie?” Journal of the American Planning Association, 68(3): 279–295.
Habibi, M., Kermanshachi, S., and Safapour, E. (2018). “Engineering, Procurement and
Construction Cost and Schedule Performance Leading Indicators: State-of-the-Art Review,”
Proceedings of Construction Research Congress, ASCE, New Orleans, Louisiana, April 2-4,
2018.
Habibi, M., and Kermanshachi, S. (2018). “Phase-based analysis of key cost and schedule
performance causes and preventive strategies: Research trends and implications.” Journal of
Engineering, Construction, and Architectural Management. 2018, 25, 1009–1033.
Kermanshachi, Sharareh (2016). Decision Making and Uncertainty Analysis in Success of
Construction Projects. Doctoral dissertation, Texas A&M University.
http://hdl.handle.net/1969.1/158020.
Kermanshachi, S., Dao, B., Shane. J., and Anderson, S. (2017). “Uncertainty Analysis of
Procurement Phase Performance Indicators Using Extreme Bounds Analysis (EBA),”
Proceedings of the 6th CSCE International Construction Specialty Conference, Vancouver,
Canada May 31-June 3.
Kermanshachi, S., Anderson, S. D., Goodrum, P., and Taylor, T. R. (2017). “Project Scoping
Process Model Development to Achieve On-Time and On-Budget Delivery of Highway
Projects,” Transportation Research Record: Journal of the Transportation Research Board,
(2630), 147-155.
Kermanshachi, S., Safapour, E., Anderson, S., Goodrum, P., Taylor, T., and Sadatsafavi, H.
(2019), “Development of Multi-Level Scoping Process Framework for Transportation
Infrastructure Projects Using IDEF Modeling Technique,” Proceedings of Transportation
Research Board 98th Annual Conference, Washington, DC, 2019.
Kermanshachi, S., Safapour, E., Anderson, S., Goodrum, P., and Taylor, T. (2018). “Exploring
Current Scoping Practices Used in the Development of Transportation Infrastructure
Projects,” Proceedings of the 12th CSCE International Transportation Specialty, Fredericton,
Canada June 13-June 16.
Kermanshachi, S., Safapour, E., Anderson, S., Goodrum, P., and Taylor, T. (2018).
“Establishment of Effective Project Scoping Process for Highway and Bridge Construction
Projects,” Practice Periodical on Structural Design and Construction (In Press).
Kermanshachi, S., Beaty, C. and Anderson, S. (2016). Improving Early Phase Cost Estimation
and Risk Assessment: A Department of Transportation Case Study. In Transportation
Research Board 95th Annual Meeting, No. 16-2202.
Leamer, E. E. (1985). “Sensitivity analysis would help.” American Economic Review, American
Economic Association, 57(3), 308-313.


Leavitt D., Ennis S., McGovern P. (1993). “The cost escalation of rail projects: using previous
experience to re-evaluate the Cal speed estimates.” California High Speed Rail Series,
University of California, Berkeley, CA.
Levine, R., and Renelt, D. (1992). “A Sensitivity Analysis of Cross-Country Growth
Regressions.” American Economic Review, 82(4): 942–963.
Moosa, I. A., and Cardak, B. A. (2006). “The Determinants of Foreign Direct Investment: An
Extreme Bounds Analysis.” Journal of Multinational Financial Management, 16: 199-211.
Odeh, A.M., Battaineh, H.T. (2002). “Causes of construction delay: traditional contracts.” Int. J.
Project Manage. 20: 67-73.
Safapour, E., Kermanshachi, S., Shane, J. and Anderson, S. (2017). “Exploring and Assessing
the Utilization of Best Practices for Achieving Excellence in Construction Projects,”
Proceedings of the 6th CSCE International Construction Specialty Conference, Vancouver,
Canada May 31-June 3.
Safapour, E., Kermanshachi, S., Habibi, M., and Shane, J. (2018). “Resource-Based Exploratory
Analysis of Project Complexity Impact on Phase-Based Cost Performance Behavior,”
Proceedings of Construction Research Congress, ASCE, New Orleans, Louisiana, April 2-4,
2018.
Safapour, E., Kermanshachi, S. and Ramaji, I. (2018). “Entity-Based Investigation of Project
Complexity Impact on Size and Frequency of Construction Phase Change Orders,”
Proceedings of Construction Research Congress, New Orleans, LA, April 2-4, 2018.
Sala-i-Martin, X. (1997). “I just ran two million regressions.” American Economic Review,
American Economic Association, 87(2): 178-183.
Sullivan, A., Harris, FC. (1986). “Delays on large construction projects.” Int. J. Oper. Prod.
Management, 6(1): 25-33.


Biomechanical Analysis of Manual Material Handling Tasks on Scaffold


Srikanth Sagar Bangaru1; Chao Wang, Ph.D., A.M.ASCE2;
and Fereydoun Aghazadeh, Ph.D., P.E.3
1Ph.D. Student, Bert S. Turner Dept. of Construction Management, Louisiana State Univ., 237
Electrical Engineering Building, Baton Rouge, LA 70803. E-mail: [email protected]
2Assistant Professor, Bert S. Turner Dept. of Construction Management, Louisiana State Univ.,
3315D Patrick F. Taylor Hall, Baton Rouge, LA 70803. E-mail: [email protected]
3Professor, Dept. of Mechanical and Industrial Engineering, Louisiana State Univ., 3250A
Patrick F. Taylor Hall, Baton Rouge, LA 70803. E-mail: [email protected]

ABSTRACT
Working on elevated surfaces such as scaffolds carries a high risk of work-related musculoskeletal
disorders (WMSDs), injuries, and fatal accidents. In the past, researchers have studied the postural
stability, rotational postural stability, and cardiovascular stress of manual work on scaffolds.
Most of these studies focused on erecting and dismantling scaffold end-frames. However, there
is a necessity for biomechanical evaluation of manual material handling (MMH) tasks and study
of the effect of independent factors such as scaffold height, task type, and weights on low-back
compression, shear forces, and perceived task difficulty. This study focuses on evaluating the
biomechanical stresses on the lower back due to three MMH tasks using three different weights
at two levels of scaffold height using 3DSSPP models. The statistical analysis results show
a significant increase in low-back compression and shear forces with scaffold height and
lifting weight.

INTRODUCTION AND BACKGROUND


Historically, health issues in the construction industry have received less attention compared
to other more immediate, high profile safety issues (Saurin and de Macedo Guimarães 2006).
This may be attributable to the temporary and often mobile nature of the construction workforce;
most construction workers are not directly employed; the composition of crews changes
continuously; the work to be completed is unique to each worksite; and there is a shortage of
health expertise within the industry, leading to difficulty in measuring and quantifying the benefits of safety
management (Institute 1999). Notwithstanding, ergonomists agree that the adoption of
ergonomic measures and biomechanical tools to reduce the physical workload of construction
workers should be encouraged.
Considering the prevalence of heavy construction equipment, working on elevated surfaces,
continuous use of machinery and powered tools, and manual handling of heavy construction
materials among others, the construction industry is characterized by high frequencies of work-
related injuries (Ohdo et al. 2011). Unfortunately, scaffolding work, which not only includes
dismantling and erection, but also working on scaffolds, involves many of the above working
conditions. Indeed, scaffolding work is one of the most hazardous jobs in construction (Hsiao
and Stanevich 1996). Scaffolds are used in all aspects of construction, including building
construction, heavy construction, and special trades. High-elevation workers have been reported
to encounter difficulties in physiological adjustment when performing heavy duties or delicate
tasks (Hsu et al. 2016). Further, the effects of extreme weather conditions, such as thermal or
cold stress and strong winds, on the work performance and safety of high-rise building

© ASCE
Computing in Civil Engineering 2019 573

construction workers have been shown to be greater than those of ground-level workers (Hsu et
al. 2008). These findings indicated that working on scaffolds or suspended heights affects the
performance of workers and exposes them to hazards. Min et al. (2011) investigated
whether the level of worker experience (expert vs. novice), the floor level (first vs. second),
and the availability of a safety handrail (with vs. without) affect postural stability and
rotational spinal stability during manual work on scaffolds. However, their
study did not examine how these factors affect workers’ perceived task difficulty ratings. In a
sequel study, Min et al. (2012) examined cardiovascular stress and postural stability in
construction workers. It showed that at greater scaffold heights and in the absence of handrails,
stability was compromised, but the effects of these variables on major body segment forces were
not examined.
Not many researchers have examined postures in occupational settings, particularly in
construction workers and in association with subjective ratings of stability (DiDomenico et al.
2010). Aghazadeh and Lu (1994) examined the relationship between posture
and lifting capacity, and concluded that wearing high-heeled shoes (≥7.6 cm) affects subjects’
lifting capacity and may cause back injury. However, those lifting tasks were performed on solid
ground, and subjective ratings of difficulty were not reported for all participants. The objective
of this study is to investigate the low-back compression forces, shear forces, and perceived task
difficulty of workers performing manual material handling (MMH) tasks on scaffold
structures.

INDOOR EXPERIMENTS
Participants
Four male university students between the ages of 23 and 26 voluntarily participated in the
experiment. The mean values of age, height, and weight were 24.75 ± 1.26 years, 173.15 ± 8.31
cm, and 90.45 ± 16.17 kg, respectively. All participants were healthy and had no musculoskeletal
disorders. None of the participants had previous experience working on scaffold, and the
instructions on the proper lifting procedures were provided before the start of the experiment.

Experimental Procedures
This experiment was conducted in a controlled lab environment. Before starting each
experiment, the objectives and procedures were explained to the participants, and each
participant gave verbal consent. Then, the participants were asked to perform light exercises to
warm up their bodies, followed by performing three manual material handling tasks on a
scaffold. The tasks involved lifting the weight from the surface to knuckle height (Task 1),
carrying weight from one end to the other end on the scaffold at knuckle height (Task 2), and
lifting the weight from elbow height to maximum reach (Task 3). These three tasks represent
most of the manual material handling construction activities on a scaffold. Participants repeated
each task three times and performed it on two levels of the scaffold. Each task was
performed using three different weights (28 lbs, 33 lbs, and 40 lbs). The participants were asked to
rest between tasks to prevent fatigue. The front and side views of the participants were
recorded during the experiment to capture their postures. The order of experiments was
randomized for each participant. After the experiment, the participant answered the 7-point
Likert scale survey regarding the perceived task difficulty.


Apparatus
The apparatus used for this experiment consists of two high-resolution cameras, a scaffold, a
wooden box, a weighing scale, weights, and a safety harness. The cameras record the side and front
views of the participants while performing the experiment to capture their postures. The weight and
dimensions of the wooden box used to hold the weights are 8 lbs. and 46 x 30.5 x 20, respectively.
The scaffold is shown in Figure 1 (a).

Independent Variables
Manual Material Handling Tasks: Three common manual material handling tasks that
construction workers usually perform on the scaffold are considered for this study.
Weights: Construction workers are involved in heavy lifting and carrying activities on the
scaffold. Three weights (28 lbs, 33 lbs, and 40 lbs) are used in this study.
Scaffold Height: Two levels of scaffold are considered for the study. The level-2 of the
scaffold is at a height of 7.5 feet from level-1 measured from the base of the scaffold tower. The
surface of the scaffold is rigid and supported with handrails. Construction workers often
work in confined spaces; accordingly, the working surface area is limited to 2.5 feet on the
scaffold during the experiment.

Dependent Variables
Low-back Compression: The low-back compression at the L4/L5 intervertebral disc is calculated
by simulating the motions using the 3D Static Strength Prediction Program (3DSSPP).
Shear Force: The shear force on the L4/L5 intervertebral disc resulting from the MMH tasks
on the scaffold is also studied.
Perceived Task Difficulty: A 7-point Likert scale was used to assess the difficulty of
manual material handling tasks in combination with the weight, and surface level. The scores
were 1 (easy job), 2 (light work), 3 (normal task), 4 (somewhat difficult), 5 (difficult), 6 (very
difficult), and 7 (physically fatiguing).

Biomechanical Analysis
To evaluate the biomechanical stresses due to manual material handling on the scaffold, the
postures associated with the tasks were reconstructed using the computer simulation software 3DSSPP
from images obtained from the recorded video. The images for one of the participants for all
three tasks using 28 lbs at Levels 1 and 2 are shown in Figure 1 (b), (c), and (d). The simulated
postures and other input data, such as anthropometry and the weight and shape of the box, are used to
estimate the low-back compression and shear forces due to the MMH on the scaffold. The
force information is then used to understand the effect of the independent variables (weight,
scaffold height, and MMH task) on the low-back compression and shear forces.

Statistical Analysis
To evaluate the effect of the experimental conditions on low-back compression, shear forces, and
perceived task difficulty, a t-test and a univariate ANOVA were performed. The Bonferroni
adjustment of the p-values was used for multiple comparisons to determine the significance of
differences among experimental conditions. The biomechanical analysis results of one
participant were excluded because the participant did not follow the instructions.
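The procedure above (one-tail t-tests between scaffold levels, a univariate ANOVA across weight groups, and Bonferroni-adjusted pairwise comparisons) can be sketched in Python with SciPy. The values below are hypothetical placeholders, not the study's 3DSSPP outputs:

```python
import numpy as np
from scipy import stats

# Hypothetical low-back compression values (N) for one task at two
# scaffold levels and three weight groups; the real data are the
# 3DSSPP outputs, which are not reproduced here.
rng = np.random.default_rng(0)
level1 = rng.normal(3400, 250, size=9)
level2 = rng.normal(3650, 250, size=9)
w28, w33, w40 = (rng.normal(m, 200, size=9) for m in (3300, 3500, 3900))

# One-tail t-test between scaffold levels (alternative: Level-2 > Level-1).
t, p_two = stats.ttest_ind(level2, level1)
p_one = p_two / 2 if t > 0 else 1 - p_two / 2

# Univariate one-way ANOVA across the three weight groups.
f, p_anova = stats.f_oneway(w28, w33, w40)

# Bonferroni adjustment: multiply each pairwise p-value by the number
# of comparisons (capped at 1).
pairs = [("28-33", w28, w33), ("28-40", w28, w40), ("33-40", w33, w40)]
m = len(pairs)
for name, a, b in pairs:
    p = stats.ttest_ind(a, b).pvalue
    print(name, "adjusted p =", min(1.0, p * m))
```

The same pattern applies to the shear-force and perceived-difficulty comparisons reported below.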

© ASCE
Computing in Civil Engineering 2019 575

Figure 1. (a1) side view of the scaffold; (a2) front view of the scaffold; (b1 and b2) task-1 on
scaffold level 1 and 2; (c1 and c2) task-2 on scaffold level 1 and 2; (d1 and d2) task-3 on
scaffold level 1 and 2

BIOMECHANICAL ANALYSIS
Low-back Compression
The average low-back compression differs significantly between scaffold levels for all
three tasks. The p-values of the one-tail t-tests between the low-back compression at Level-1 and
Level-2 for Tasks 1, 2, and 3 are 0.03, 0.002, and 0.023, respectively. The low-back compression
increases from Level-1 to Level-2 with a percentage difference of 6%, 11%, and 11% for Tasks 1, 2,
and 3, respectively (Figure 2(a)). The average low-back compression forces are significantly
different between the three weight groups for Task-1 and Task-2 at a significance level of 0.1; the
p-values of the ANOVA tests between the three weight groups of Task-1 and Task-2 are 0.08 and 0.09,
respectively. The average low-back compression forces increase as weight increases, with
percentage differences between 28 lbs–33 lbs and 28 lbs–40 lbs of 6% and 16% (Task-1) and
20% and 25% (Task-2) (Figure 2(b)). A Bonferroni post-hoc test between the weight groups of Task-3
shows a significant difference between 33 lbs and 40 lbs at a significance level of 0.05
(p = 0.0005). The low-back compression is also significantly different between the three tasks for
Level-1 (p = 0.0004) and Level-2 (p = 0.0009) at a significance level of 0.05. The low-back
compression is highest for Task-1 and lowest for Task-2 (Figure 2). The percentage
differences between Task-1 and Task-2 and between Task-1 and Task-3 are (70%, 41%) and (66%, 36%)
for Levels 1 and 2, respectively (Figure 2(c)).
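The percentage differences reported throughout this section can be reproduced with a small helper. The paper does not state its exact formula, so this sketch assumes the common relative-change definition against the baseline condition:

```python
def percent_diff(baseline: float, other: float) -> float:
    """Relative difference of `other` versus `baseline`, in percent.

    Assumption: the paper's "percentage difference" is the relative
    change (other - baseline) / baseline, which is not stated explicitly.
    """
    return (other - baseline) / baseline * 100.0

# Example: a rise from a hypothetical 3400 N at Level-1 to 3604 N at
# Level-2 corresponds to a 6% difference.
print(percent_diff(3400.0, 3604.0))
```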


Figure 2. Low-back compression versus (a) scaffold levels (b) weights and (c) tasks.
Different asterisks (*, **, and ***) indicate a significant difference between the groups
Shear Force
A significant effect of scaffold level on the average shear force exists for the three tasks.
The p-values of the one-tail t-tests of average shear force between scaffold levels for
Tasks 1, 2, and 3 are 0.98, 0.99, and 0.87, respectively. The shear force increases from Level 1 to
Level 2 with a percentage difference of 5%, 2%, and 9% for Tasks 1, 2, and 3, respectively (Figure 3(a)).
In the case of weights, the shear forces are significantly different between weight groups for Task 1
(p = 0.08) and Task 2 (p = 0.09) at a significance level of 0.1. No significance was observed for
Task 3. The shear force increases as weight increases, with percentage differences between
28 lbs–33 lbs and 28 lbs–40 lbs of 16% and 24% (Task-1) and 7% and 12% (Task-2) (Figure
3(b)). The shear forces also differ significantly between task groups at Level 1 and Level 2;
the p-values of the ANOVA analyses of shear force between the three task types are 0.0042
and 0.0009, respectively. The percentage differences between Task-1 and Task-2 and between Task-1
and Task-3 are (31%, 47%) and (34%, 43%) for Levels 1 and 2, respectively (Figure 3(c)).

Perceived Task Difficulty


The analysis of perceived task difficulty shows a significant increase in task
difficulty from Level-1 to Level-2, with a percentage difference of 18%, 38%, and 19% for Tasks
1, 2, and 3 (Figure 4(a)). The p-values of the t-tests of perceived task difficulty between
Level-1 and Level-2 for Tasks 1, 2, and 3 are 0.04, 0.01, and 0.03, respectively. As weight increases,
there is a significant increase in task difficulty for Task-1 and Task-3. The p-values of ANOVA


analysis between the three weight groups for Task-1 and Task-3 are 0.006 and 0.019, respectively.
A Bonferroni post-hoc test between the weight groups of Task-2 shows a significant difference
in perceived task difficulty between 28 lbs and 33 lbs (p = 0.025). The percentage
differences between 28 lbs–33 lbs and 28 lbs–40 lbs are (31%, 102%) and (58%, 89%)
for Task-1 and Task-3, respectively (Figure 4(b)). Perceived task difficulty also differed
significantly between task types.

Figure 3. Shear forces versus (a) scaffold levels (b) weights and (c) tasks. Different asterisks
(*, **, and ***) indicate a significant difference between the groups

Figure 4. Perceived task difficulty versus (a) scaffold levels and (b) weights. Different
asterisks (*, **, and ***) indicate a significant difference between the groups

DISCUSSION
In this study, low-back compression, shear forces, and subjective assessments of perceived


task difficulty were evaluated for novice construction laborers performing manual material
handling tasks at different scaffold platform heights with safety handrails and a harness. Our
results imply that the stress of working at greater heights increased the low-back compression
forces observed in the L4/L5 spinal region of participants. This result is in agreement with other
studies (Elders et al. 2001) suggesting that scaffold workers are more vulnerable to low-back
injuries.
Significant differences were found in the risk of low-back pain from high low-back
compression and shear forces due to the type of lifting task. Results from the biomechanical
analyses showed that lifting weights from the surface to elbow height produces the highest low-back
compression and shear forces. On the other hand, lifting at knuckle height produces the least low-back
compression, while lifting from knuckle height to maximum reach produces the least shear forces.
This could be attributed to the spinal column being fairly upright and the load being borne
by the shoulders. The evidence of higher low-back compression and shear forces when
lifting from floor level is widely reported in the literature, and HSE.gov.uk (2018) recommends
as part of its guidelines to “avoid lifting from floor level or above shoulder height, especially
heavy loads”.
As previously reported, falls are a major cause of fatalities, injuries, and lost workdays in the
construction industry. This study suggests that perceived task difficulty is higher because of
exposure to elevated working surfaces and deficits in balance. Wade et al. (2014) reported
that balance decreases following exposure to an inclined surface. When manually handling
materials on elevated surfaces, the worker’s points of contact are reduced compared to normal
working conditions, in which an individual has at least four points of contact for stability.
This possible loss of balance and risk of falling is a plausible explanation for participants’ higher
ratings of task difficulty as scaffold height increases.

CONCLUSION
In conclusion, scaffold height, lifting weight, and task type have significant effects on the
low-back compression forces, shear forces, and perceived task difficulty ratings of participants.
Based on the results of this study, we recommend that construction materials to be handled on a
scaffold be transported to workers via lifts, allowing workers to retrieve the materials at elbow
height rather than from floor level. Beyond reducing the risk of falling, both the installation of
safety handrails and practical training for workers on lifting techniques while working on
scaffolds and at elevated heights must be provided. Further study can be expanded to include
participants with experience working on scaffolds, older and female participants, and multiple
poses for analysis. This study serves as preliminary work for initiating intervention measures
against the hazards faced by scaffold workers, especially during work performed on scaffolds.

REFERENCES
Aghazadeh, F., and Lu, H. (1994). "Relationship between posture and lifting capacity."
International Journal of Industrial Ergonomics, 13(4), 353-356.
DiDomenico, A., McGorry, R. W., Huang, Y.-H., and Blair, M. F. (2010). "Perceptions of
postural stability after transitioning to standing among construction workers." Safety
Science, 48(2), 166-172.
Hsiao, H., and Stanevich, R. L. (1996). "Biomechanical evaluation of scaffolding tasks."
International Journal of Industrial Ergonomics, 18(5-6), 407-415.


Hsu, D., Sun, Y., Chuang, K., Juang, Y., and Chang, F. (2008). "Effect of elevation
change on work fatigue and physiological symptoms for high-rise building construction
workers." Safety Science, 46(5), 833-843.
Hsu, F.-W., Lin, C. J., Lee, Y.-H., and Chen, H.-J. (2016). "Effects of elevation change
on mental stress in high-voltage transmission tower construction workers." Applied
Ergonomics, 56, 101-107.
European Construction Institute. (1999). The ECI guide to managing health in construction,
Thomas Telford.
Min, S.-N., Kim, J.-Y., and Parnianpour, M. (2011). "Effects of construction worker's
experience, the presence of safety handrail and height of movable scaffold on postural and
spinal stability." Proc., 1st Middle East Conference on Biomedical Engineering (MECBME),
IEEE, 146-149.
Min, S.-N., Kim, J.-Y., and Parnianpour, M. (2012). "The effects of safety handrails and
the heights of scaffolds on the subjective and objective evaluation of postural stability and
cardiovascular stress in novice and expert construction workers." Applied Ergonomics,
43(3), 574-581.
Ohdo, K., Hino, Y., Takanashi, S., Takahashi, H., and Toyosawa, Y. (2011). "Study on
fall protection from scaffolds by scaffold sheeting during construction." Procedia
Engineering, 14, 2179-2186.
Saurin, T. A., and de Macedo Guimarães, L. B. (2006). "Ergonomic assessment of
suspended scaffolds." International Journal of Industrial Ergonomics, 36(3), 229-237.


Development of Effective Communication Framework Using Confirmatory Factor Analysis Technique
Thahomina Jahan Nipa, S.M.ASCE1; Sharareh Kermanshachi, Ph.D., M.ASCE 2;
and Shirin Kamalirad, S.M.ASCE3
1Ph.D. Student, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman Hall,
416 Yates St., Arlington, TX 76019. E-mail: [email protected]
2Assistant Professor, Dept. of Civil Engineering, Univ. of Texas at Arlington, 438 Nedderman
Hall, 416 Yates St., Box 19308, Arlington, TX 76019 (corresponding author). E-mail:
[email protected]
3Graduate Student, Dept. of Civil Engineering, Univ. of Texas at Arlington, 425 Nedderman
Hall, 416 Yates St., Arlington, TX 76019. E-mail: [email protected]

ABSTRACT
Primary stakeholders (owner, consultant, and contractor) are commonly subject to challenges
associated with internal miscommunication. Inaccurate transmission of data may cause major
project delays and cost overruns. Since few researchers have focused on the prediction and
analysis of designers’ effective communication indicators (DECIs) in construction projects, this
study aims to determine DECIs in construction projects. For this purpose, a survey containing 52
questions corresponding to potential effective communication indicators was developed and
distributed among experienced designers. Based on the survey responses and the implemented
statistical tools, DECIs were identified. The confirmatory factor analysis technique was then
utilized to determine the major components of the DECIs. Results revealed that the principal
components of the DECIs are design and technology, scope clarity, technical and financial support,
facility, experience issues, and decision-making issues. This study assists practitioners in making
proactive plans to utilize project resources properly in order to manage internal
miscommunication.

INTRODUCTION
Construction projects are like complex networks in which humans, information, resources, and
tasks act as nodes, and these nodes stay linked to each other through communication
(Kamalirad and Kermanshachi 2018). With the increasing rate of urbanization and
population growth, the construction industry is continuously expanding beyond national
boundaries and is always in search of more revenue (Javernick-Will and Scott 2009). Two of the
major factors that affect the revenue of construction projects are time and cost performance.
Unfortunately, miscommunication within primary stakeholders (owners, engineers, and
contractors) and secondary stakeholders (subcontractors, vendors, and suppliers) is most likely
to affect time and cost performance negatively (Kamalirad and Kermanshachi 2018).
The design phase is one of the most important phases of a construction project (Habibi et
al. 2018), and the success of this phase often depends on winning a very
competitive bid. This nature of the design phase oftentimes generates mistrust and conflicting
relationships among stakeholders (Hartmann and Caerteling 2010). Moreover, the current
construction industry is dealing with megaprojects involving multiple designers, which increases
the possibility of friction among them. Effective communication plays a vital role in clearing out
this friction; on the contrary, miscommunication fuels it, which costs the


project time and resources (Fulford and Standing 2014; Safapour et al. 2018). Yet, the literature
provides very limited material describing the factors that affect internal communication among designers.
Therefore, the aim of this study is to identify DECIs after analyzing the impact of designers’
internal communication on the performance of complex projects. In addition to identifying
DECIs, this study also aims to reduce the dimensionality of the indicators to a limited number of
components to find the similarity among indicators and their relative importance. The
outcome of this paper will help project managers identify stakeholders’ communication needs
and formulate proper engagement strategies to ensure effective communication. Thus, this study
addresses challenges that project managers often face in complex construction projects
(Mok et al. 2015).

LITERATURE REVIEW
The construction industry is taking full advantage of the open market by expanding its reach
beyond national borders with the help of advanced transportation and communication
systems (Ngowi et al. 2005). However, because the industry is very competitive in nature, a
slight miscommunication resulting in a combative relationship can ruin this progress
(Chan et al. 2004). Likewise, throughout the literature, many researchers have identified
ineffective communication as one of the main causes of delay in construction projects
(Larsen et al. 2015; Liu et al. 2007; Assaf and Al-Hejji 2006).
Communication, in a general sense, can be defined as an exchange of thoughts and information
from one source to another (Perumal and Abu-Bakar 2011). This simple definition is not
adequate for the construction industry, where multiple stakeholders come together for a very short
period of time with a specific aim in mind and are required to collaborate to fulfill the goal
of the project (Murray et al. 2007). In this sector, communication can be seen as a process of
generating, collecting, disseminating, storing, and transferring information among or within the
stakeholder entities (Perumal and Abu-Bakar 2011).
Many researchers over the decades have tried to establish the relationship between effective
communication and project performance (Kamalirad et al. 2017). For example, Sambasivan and
Soon (2007), after conducting a questionnaire survey among 150 owners, contractors, and
consultants, confirmed that without proper communication, the involved parties misunderstand each
other, which eventually delays the execution of the project. Ejohwomu et al. (2017) found
that construction project teams lose trust due to ineffective communication, which hampers
project performance. Similarly, Kermanshachi (2016) found that communication quality and
project performance are positively correlated. In addition, the present construction market uses
partnering nationally and internationally to handle complex projects (Safapour et al. 2017). This
partnership will result in output with minimum cost and time overruns only when an effective and
open communication environment exists among teams (Chan et al. 2004). In fact, the success
of construction projects significantly depends on a trustworthy and open communication
environment (Becker et al. 2011).
Besides, establishing the reasons behind miscommunication among stakeholders is also
important (Safapour et al. 2019). Ejohwomu et al. (2017) found that ambiguous project goals,
faulty reporting schemes, and poor leadership act as the main hindrances to effective
communication, based on the opinions of 100 contractors and consultants. Moreover, Odeh and
Battaineh (2002) found that a lack of proper communication affects the performance of designer
entities more than contractor entities. However, most of the above-mentioned studies focused on
communication within multiple stakeholders and developed corollaries based on the relationships among


them. Likewise, the current literature contains many studies in which communication indicators were
established for multiple stakeholders. As a result, the existing literature does not provide enough
material to ensure the required attention for a particular stakeholder.

RESEARCH METHODOLOGY
The methodology adopted in this paper was divided into the four steps shown in Figure 1. In the
first step, a thorough literature search was performed to find existing work related to the
impacts of designers’ communication on project performance and the ultimate success of
the project. Based on the reviewed scholarly articles, a list of potential DECIs and their related
attributes was identified and categorized.

Figure 1. Research Methodology


The second step focused on the data collection process. The data collection step started with the
development of a comprehensive survey on the DECIs. The questionnaire used in the survey was
finalized after pilot testing. The finalized survey was distributed among designers
who are actively working on design problems of construction projects. After multiple follow-ups,
the completed surveys were collected for further analysis. In the third step, a two-sample t-test and
a chi-squared test were used to identify effective communication indicators among designers in
construction projects. Based on the test results, the indicators were finalized. In the final step,
factor analysis was performed to reduce the number of identified DECIs for future modeling purposes.

DATA COLLECTION
To find out the relevance and importance of DECIs in the present condition of the
construction industry, 52 questions were generated, and a pilot questionnaire survey was
developed. Based on the responses of four pilot tests, the survey questions were modified so
that they would be clearly understandable to the intended participants. Based
on the type of response, the survey questions were categorized into three formats: continuous,
Likert scale, and binary. A few sample questions are shown in Table 1.
After appropriate modifications, the survey was sent to prominent construction practitioners


mainly involved in design projects with at least 10 years of experience. After several follow-up
emails, 30 completed responses were collected and grouped into two categories: complex projects
with effective communication and complex projects with ineffective communication. The
continuous, Likert scale, and binary questions went through a two-sample t-test, an analysis of
variance, and a chi-squared test, respectively. These statistical tests were performed considering
both 90% and 95% levels of confidence to determine the significant DECIs based on the relation
between the quality of communication and the potential indicators.
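The screening just described (a chi-squared test for binary questions and a two-sample t-test for continuous ones, judged at the 90% and 95% confidence levels) might look as follows with SciPy; the counts and responses shown are hypothetical stand-ins, not the actual survey data:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 contingency table for one binary survey question:
# rows = communication-quality group (effective / ineffective),
# columns = response (yes / no).
table = np.array([[12, 4],
                  [5, 9]])
chi2, p_chi, dof, expected = stats.chi2_contingency(table)

# Hypothetical continuous responses (e.g., number of countries involved)
# for the two groups, compared with a two-sample t-test.
effective = [1, 2, 1, 3, 2, 1, 2]
ineffective = [3, 4, 2, 5, 4, 3, 4]
t, p_t = stats.ttest_ind(effective, ineffective)

# An indicator is flagged significant at the 95% or 90% confidence level.
for name, p in [("binary question", p_chi), ("continuous question", p_t)]:
    level = "95%" if p < 0.05 else "90%" if p < 0.10 else "not significant"
    print(name, round(p, 3), level)
```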

Table 1. Sample questions for the survey


Type          Question (options for response)
Continuous    How many different countries worked on the detailed engineering/design phase of
              the project? (Number: __________)
Likert scale  What was the difficulty in obtaining design approvals? (Scale 1 to 7, 1 being not at
              all difficult, 4 being moderately difficult, and 7 being extremely difficult)
Binary        Did the project experience any delays or difficulties in securing project funding?
              (a. yes, b. no)

Table 2. Significant designer’s internal communication quality indicators


Category            DECI #   Designer’s Communication Indicator                              Sig.
Bureaucracy         DECI 7   Impact of Required Approvals - Internal Stakeholders            **0.021
Coordination        DECI 16  Difficulty Level in Obtaining Permits                           **0.036
                    DECI 10  Project Management Team Peak Size - Procurement Phase           *0.073
Interface           DECI 15  Project Management Team Experience - Construction Phase         **0.009
                    DECI 17  Number of Designer/Engineer Organizations                       **0.036
Location            DECI 8   Number of Countries Involved in Construction Phase              *0.062
Scope Definition    DECI 4   Clarity of Project Scope During Designer/Contractor Selection   **0.040
                    DECI 5   Clarity of Owner's Project Goals and Objectives                 **0.008
Material Resources  DECI 9   Bulk Materials Quality Issues                                   **0.015
                    DECI 11  Degree of Additional Construction Specifications                *0.051
                    DECI 13  Delay in Delivery of Permanent Facility Equipment               *0.053
Economic Issues     DECI 6   Clarity of Funding Process During Front End Planning            *0.093
                    DECI 12  Project Funding Delays                                          *0.072
Technology          DECI 1   Company's Familiarity with Technologies - Construction Phase    **0.030
                    DECI 2   Company's Familiarity with Technologies - Engineering Phase     **0.014
                    DECI 3   Company's Familiarity with Technologies - Operation Phase       *0.061
                    DECI 14  Number of New Systems Tied into Existing Systems                **0.029
** denotes significance with 95% level of confidence
* denotes significance with 90% level of confidence


DATA ANALYSIS
After analyzing the survey responses, 35 indicators were listed, and a two-sample t-test was
performed. Table 2 shows the 17 significant indicators, which are categorized into seven
categories. These 17 indicators significantly indicate whether the designers will have effective
communication in a complex construction project.

Dimension Reduction
This study used the Statistical Package for the Social Sciences (SPSS v. 10) to run the factor
analysis. Before proceeding to factor analysis, it must be determined whether the data are
adequate for it. To establish that point, the Kaiser-Meyer-Olkin (KMO) measure of sampling
adequacy test and Bartlett’s test of sphericity were conducted. The KMO value was found to be
0.508, which is greater than the least acceptable limit of 0.5 (Fadun and Saka 2018). Bartlett’s
test of sphericity was found to be significant at 0.001, which is less than the maximum
acceptable limit of 0.005 (Priyanka 2017). Both tests indicated that the variables are significantly
correlated and hence appropriate for factor analysis. The results of the KMO and Bartlett’s tests
are shown in Table 3.
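For readers without SPSS, both adequacy checks can be computed directly from the response matrix with NumPy. The 30x17 matrix below is random placeholder data standing in for the survey responses; note that with 17 variables the Bartlett degrees of freedom come out to 17 × 16 / 2 = 136, matching Table 3:

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(X):
    """Bartlett's test of sphericity: chi2 = -(n-1-(2p+5)/6) ln|R|."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    dof = p * (p - 1) / 2
    return chi2, dof, stats.chi2.sf(chi2, dof)

def kmo(X):
    """Kaiser-Meyer-Olkin measure: sum r^2 / (sum r^2 + sum partial^2)."""
    R = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d                     # partial correlation matrix
    off = ~np.eye(R.shape[0], dtype=bool)  # off-diagonal mask
    r2, a2 = (R[off] ** 2).sum(), (partial[off] ** 2).sum()
    return r2 / (r2 + a2)

# Hypothetical 30x17 response matrix (30 surveys, 17 indicators).
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 17))
chi2, dof, p = bartlett_sphericity(X)
print("KMO =", round(kmo(X), 3), "| Bartlett chi2 =", round(chi2, 1),
      "df =", int(dof), "p =", round(p, 4))
```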

Table 3. KMO and Bartlett’s Test


Kaiser-Meyer-Olkin Measure of Sampling Adequacy           0.508
Bartlett's Test of Sphericity   Approx. Chi-Square      195.326
                                df                          136
                                Sig.                      0.001

Factor analysis was utilized to find the similarity of the indicators. Using this
technique, the indicators were grouped into six components with a minimum significant loading
value of 0.4, which can be taken as standard (Brown 2014). Each component is a collection of
interrelated indicators, and these six components explain 70.62% of the total variance. Table 4 shows
the rotated component matrix from the SPSS factor analysis.
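The extraction and rotation steps (principal components followed by a varimax rotation, retaining six components and reading off loadings of at least 0.4) can be sketched without SPSS. The input matrix here is random placeholder data, so the resulting loadings will not reproduce Table 4:

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Varimax rotation of a factor-loading matrix (Kaiser's algorithm)."""
    L = loadings.copy()
    p, k = L.shape
    R = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        LR = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (LR ** 3 - LR @ np.diag((LR ** 2).sum(axis=0)) / p))
        R = u @ vt
        new_var = s.sum()
        if new_var - var < tol:
            break
        var = new_var
    return L @ R

# Hypothetical standardized 30x17 survey matrix.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 17))
Xc = (X - X.mean(0)) / X.std(0)

# Principal components from the correlation matrix.
R = np.corrcoef(Xc, rowvar=False)
eigval, eigvec = np.linalg.eigh(R)
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]

k = 6                                        # retain six components
loadings = eigvec[:, :k] * np.sqrt(eigval[:k])
rotated = varimax(loadings)
explained = eigval[:k].sum() / eigval.sum()  # share of total variance
print("variance explained by 6 components:", round(explained, 3))
print("indicators loading >= 0.4 on component 1:",
      np.where(np.abs(rotated[:, 0]) >= 0.4)[0])
```

An orthogonal rotation such as varimax leaves each indicator's communality (row sum of squared loadings) unchanged; only the distribution of loadings across components becomes simpler to interpret.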
The first component, which constitutes the largest part of the total variance, is Design and
Technology. The percentage of total variance for this component is 17.05%, and it includes five
indicators (DECI 1, 2, 3, 6, and 12). Being the first component of the factor analysis, the Design
and Technology component holds the maximum importance in designers’ effective communication.
In other words, using common technologies makes the designers more familiar with each other's
work, leading to effective communication.
The second component is Scope Clarity, with a total variance of 16.97%. This component
includes DECI 4, 5, 6, 7, 8, and 9. Having clear knowledge of the owner’s requirements from
the beginning of the project leads to a well-defined scope (Kermanshachi et al. 2017) and results
in smooth communication for the designers. Besides, the process by which the project is
funded helps the designers form an idea of the scope and timeline of the project; hence, clear
knowledge of this process helps communication. In addition, a clear idea of the number of
national and international stakeholders involved and the quality of materials almost
always makes the communication among designers effective.
The third component is Technical and Financial Support, with a total variance of 12.96%.
This component includes DECI 10, 11, and 12. The size of the team the designers must work with is
another factor that affects effective communication among designers. The degree of additional
construction specifications and the availability of funding to adopt these additions


also affect designers’ communication.


The fourth component is Facility, with a total variance of 8.75%. This includes DECI 13 and
14. Designers start working at the beginning of the project and keep working throughout the
project lifecycle. Not having permanent facility equipment delivered at the right time will delay
their work and affect the whole project life, which in turn affects the communication of designers.
Also, a combination of new systems with old systems makes it difficult for designers to
communicate with each other, especially if the experience of working with different
systems/facilities varies across designer entities.
The fifth component is Experience Issues, with a total variance of 7.77%. This includes
DECI 15 and 16. Designers have to obtain permits for their design before construction can begin.
A complex permitting process often affects the internal communication of designers, as seen from
the factor analysis. However, an experienced team can manage this complexity more efficiently;
thus, having proper experience is also important for effective communication among designers.

Table 4. Rotated component matrix (a)

                                                                           Component
DECI #   DECI Description                                                 1      2      3      4      5      6
DECI 1   Company's Familiarity with Technologies - Construction Phase   .863   .123  -.076   .007   .116  -.085
DECI 2   Company's Familiarity with Technologies - Engineering Phase    .829   .049  -.039  -.075   .176   .100
DECI 3   Company's Familiarity with Technologies - Operation Phase      .823  -.046   .121   .149  -.036   .195
DECI 4   Clarity of Project Scope During Designer/Contractor Selection  .069   .860   .006   .095   .107   .009
DECI 5   Clarity of Owner's Project Goals and Objectives               -.047   .832   .013   .168  -.064  -.107
DECI 6   Clarity of Funding Process During Front End Planning           .488   .706   .182  -.006   .031  -.034
DECI 7   Impact of Required Approvals - Internal Stakeholders           .040   .639   .047  -.200  -.371   .001
DECI 8   Number of Countries Involved in Construction Phase            -.277   .471   .299   .317   .127   .392
DECI 9   Bulk Materials Quality Issues                                 -.016   .407   .332   .250  -.257   .309
DECI 10  Project Management Team Peak Size - Procurement Phase         -.223  -.094   .826   .065   .049  -.156
DECI 11  Degree of Additional Construction Specifications               .027   .223   .696  -.014   .096   .206
DECI 12  Project Funding Delays                                         .470   .074   .693   .059  -.117  -.016
DECI 13  Delay in Delivery of Permanent Facility Equipment             -.009   .142  -.036   .886   .076  -.001
DECI 14  Number of New Systems Tied into Existing Systems               .347   .017   .373   .569  -.311  -.042
DECI 15  Project Management Team Experience - Construction Phase        .145  -.120  -.012   .050   .779  -.098
DECI 16  Difficulty Level in Obtaining Permits                          .158   .158   .397  -.313   .551   .295
DECI 17  Number of Designer/Engineer Organizations                      .160  -.119  -.021  -.045  -.038   .853
Extraction Method: Principal Component Analysis
Rotation Method: Varimax with Kaiser Normalization
a. Rotation converged in 7 iterations

The sixth component is Decision-Making Issues, with a total variance of 7.12%. This
includes DECI 17, the number of designer/engineer organizations. It is almost always
difficult to maintain effective communication, especially when making an important decision, when

the number of involved parties is large. Similarly, complex projects involving many designer firms
oftentimes suffer from ineffective communication regarding decision making.

CONCLUSION
Complex projects require multiple designers working together, and whenever more than one
entity with similar responsibility and authority is involved, there will be some friction. Hence,
the aim of this study was to develop designers’ internal communication indicators to help smooth
out this friction. This study pointed out six sets of indicators that depict the possible causes of
designers’ ineffective communication. These six sets, or components, were named design and
technology, scope clarity, technical and financial support, facility, experience issues, and
decision-making issues. This study also prioritized these components by their percentage of total
variance. Therefore, this study not only helps project managers identify the causes beforehand
and take proper measures to reduce friction based on the indicators but also directs project
managers to focus on communication factors based on their importance level. However,
communication is a changing matter that depends on time and culture, so before applying the
results of this study, one should consider the effects of time and place appropriately.

REFERENCES
Assaf, S. A., and Al-Hejji, S. (2006). “Causes of delay in large construction projects.”
International journal of project management, 24(4), 349-357.
Becker, T. C., Jaselskis, E. J., and McDermott, C. P. (2011). “Implications of construction
industry trends on the educational requirements for future construction professionals.” In
Proceedings of the Associated Schools of Construction 2011 International Conference,
Omaha, NE (pp. 1-12).
Brown, T. A. (2014). Confirmatory factor analysis for applied research. Guilford Publications.
Chan, A. P., Chan, D. W., Chiang, Y. H., Tang, B. S., Chan, E. H., and Ho, K. S. (2004).
“Exploring critical success factors for partnering in construction projects.” Journal of
construction engineering and management, 130(2), 188-198.
Chan, A. P., Scott, D., and Chan, A. P. (2004). “Factors affecting the success of a construction
project”. Journal of construction engineering and management, 130(1), 153-155.
Ejohwomu, O. A., Oshodi, O. S., and Lam, K. C. (2017). “Nigeria’s construction industry:
barriers to effective communication.” Engineering, Construction and Architectural
Management, 24(4), 652-667.
Fadun, O. S., and Saka, S. T. (2018). “Risk management in the construction industry: Analysis of
critical success factors (CSFS) of construction projects in Nigeria.” International Journal of
Development and Management Review, 13(1).
Fulford, R., and Standing, C. (2014). “Construction industry productivity and the potential for
collaborative practice.” International Journal of Project Management, 32(2), 315-326.
Habibi, M., Kermanshachi, S., and Safapour, E. (2018), “Engineering, Procurement and
Construction Cost and Schedule Performance Leading Indicators: State-of-the-Art Review,”
Proceedings of Construction Research Congress, ASCE, New Orleans, Louisiana, April 2-4,
2018.
Hartmann, A., and Caerteling, J. (2010). “Subcontractor procurement in construction: the
interplay of price and trust.” Supply chain management: an international journal, 15(5), 354-
362.




Reliability and Validity of a Posture Matching Method Using Inertial Measurement Unit-Based Motion Tracking System for Construction Jobs
Wonil Lee, Ph.D.1; Jia-Hua Lin, Ph.D.2; Stephen Bao, Ph.D.3; and Ken-Yu Lin, Ph.D.4
1Safety and Health Assessment and Research for Prevention (SHARP) Program, Washington State Dept. of Labor and Industries, WA 98504, USA. E-mail: [email protected]
2SHARP Program, Washington State Dept. of Labor and Industries, Olympia, WA 98504, USA. E-mail: [email protected]
3SHARP Program, Washington State Dept. of Labor and Industries, Olympia, WA 98504, USA. E-mail: [email protected]
4Dept. of Construction Management, Univ. of Washington, PO Box 351610, Seattle, WA 98195. E-mail: [email protected]

ABSTRACT
Posture quantification is important in the biomechanical analysis of loading on workers’ trunk
and upper extremities when studying back, neck, and shoulder musculoskeletal disorders caused
by labor-intensive construction activities. This study demonstrates the reliability and validity of
a posture-matching method that uses an inertial measurement unit (IMU) tracking system. A
commercial IMU-based motion capture system was used to obtain upper extremity and trunk
joint angles. Observers wore IMU sensors and mimicked prescribed distinct work postures
shown on a screen; joint angles were then obtained from the IMU sensors. A reliability test and
the root-mean-square difference indicated that the posture-matching method was reliable for
estimating certain joint angles, such as upper arm flexion/extension, but unreliable for
estimating upper arm rotation and wrist ulnar/radial deviation. Reliability and validity were
affected by the body part, joint angle range, the observers’ perception while mimicking and
matching postures, the angle of the camera that captured the worker’s posture, and errors from
the IMU sensors.

INTRODUCTION
Work-related musculoskeletal disorders (WMSDs) are prevalent among construction workers
engaged in repetitive motions, heavy lifting, awkward postures, and high-force exertions to
perform tasks. Studies on shoulder, upper arm, elbow, lower arm, and wrist disorders are the
most common, followed by back and knee WMSDs (Boschman et al., 2012). Awkward posture
is a leading risk factor for WMSDs. Posture risks are often assessed using observational
assessment tools including the Rapid Upper Limb Assessment, Ovako Working Posture Analysis
System, and Quick Exposure Check (David, 2005). These tools have been implemented in onsite
worker observation and office/laboratory post-assessment based on video or photo frames taken
from work sites. Such observational methods are inexpensive and efficient for ergonomic
professionals in ergonomic risk assessment applications (Dempsey et al., 2019). The reliability
and validity of the observer’s assessment are influenced by the observer’s capacity to process
information simultaneously (onsite assessment) and by the camera angle, body part size, and video
quality (off-site assessment) (Bao et al., 2009). To overcome the limitations of observational
methods, direct measurement can be used. For posture analysis, details of body joint
movement can be obtained using an electrogoniometer, inertial measurement unit (IMU) sensor,
and video-based motion-tracking system. The comparison of observational methods and direct
measurements in terms of reliability and validity was performed by several researchers for
applications in the construction industry. Chen et al. (2017) compared the entire body joint
angles measured by IMUs and machine learning algorithms with ground truth posture
information and demonstrated acceptable accuracy in awkward posture recognition. Alwasel et
al. (2017) obtained body joint angles using the IMU system for 3DSSPP biomechanical analysis
input to assess joint compression forces and moments of entire body segments in a concrete
block laying task.

ISSUES ON PRACTICAL IMPLEMENTATION OF POSTURE ASSESSMENT TOOLS


Video-based assessment is advantageous in that ergonomists are able to conduct posture
analysis using computerized technology at alternative locations and times (Weir et al., 2011).
In the conventional posture observation method, ergonomists observe postures on a computer
screen and either select the posture boundary category that matches the body segment angle in a
video frame or record the angle directly and then convert it numerically into a posture category.
Visual estimation by an analyst is one posture quantification method (Bao et al., 2007) in which,
after observing the worker’s posture in the image, the observer uses the mouse to click on
predetermined locations on the upper extremities and trunk in the posture diagram on the
computer screen (the clicking method). As an alternative approach, in the visual angle estimation
method, the observer analyzes the posture in the image and then enters the joint angle of each
body segment directly into a given form (Bao et al., 2007). The joint angles obtained by either
method are then summarized as posture distributions based on predefined angle categories and
used as postural exposure data for epidemiological analyses of musculoskeletal disorders.
Weir et al. (2011) demonstrated that conventional posture observation through such video-
based assessment is influenced by the analyst’s experience level, which determines
performance in trunk-bending posture analysis. Direct perception or analysis of the absolute joint
angle or joint angle range directly from the photo/video frames or worker’s posture observed
onsite could be a challenge to novice-level construction safety professionals. To overcome these
issues of observational assessment tools, direct measurement using IMU for posture analysis has
been tested for field application (Schall et al., 2015, 2016). However, for direct measurement of
awkward posture among the ergonomic risk factors, it is often impractical for construction
workers to wear multiple IMU sensors on all body limbs because the sensors interfere with
personal protective equipment and tool belts. Furthermore,
errors in joint-angle measurements will increase over time because drifting issues with the
gyroscope in the IMU are known to occur (El-Gohary and McNames, 2012). Magnetometer
disturbance is also a known issue (Roetenberg et al., 2007) because several materials on the
construction site may contain ferromagnetic properties.
This study evaluates a new and innovative method combining observational and direct
measurement (i.e., motion tracking using IMU) for construction safety research and field
applications. The proposed posture measurement method allows the observer to mimic the
worker’s posture as observed in a photo/video frame rather than directly judging absolute joint
angles or joint angle range. Rather than instructing workers to wear the IMU, the observers wear
the IMU system and mimic the postures of selected video/photo frames taken from the site.
Observers do not have to be trained in the conventional posture observation method; they only
need to mimic the postures to obtain the relevant measurements. We named this the
“posture-matching” method. The main purpose of the current study was to test
the reliability and validity of the proposed posture-matching method.

RESEARCH METHOD
Participants and Instrument: Three observers (one woman and two men) were recruited to
test the ability of the proposed posture-matching method. The observers were 163–180-cm tall
and weighed 70–86 kg. Two observers had basic level experience on ergonomic risk assessment
using observational methods, and one had general research experience on construction safety and
health. However, all were inexperienced novice-level observers of ergonomic posture analysis.
The I2M IMU system (NexGen Ergonomics Inc.) was used to automatically track and record
joint angles. The Kalman filter, a sensor fusion algorithm that compensates for the limitations of
individual sensor components (for example, combining gyroscope and accelerometer data to
accurately measure trunk flexion/extension; Schall et al., 2015), was applied to compensate for
gyroscope drift in this study. The I2M IMU system combined the accelerometer data and the
orientation information of the magnetometer with the gyroscope data to calculate joint angle
outputs (El-Gohary et al., 2011). Additionally, three trained ergonomists analyzed joint angles from the same
posture frames using conventional posture observation methods (including both clicking and
direct angle input). These data were compared with the results from the proposed posture-
matching method. We hypothesized that the posture-matching method used by a novice observer
will outperform conventional posture observation methods in terms of reliability and validity to
analyze postures, especially in obtaining joint angles of upper body extremities and trunk.
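The drift-compensation idea behind such sensor fusion can be illustrated with a complementary filter, a simpler relative of the Kalman filter; this sketch is not the I2M system’s actual algorithm, and the blend factor, sample rate, and gyroscope bias below are illustrative assumptions.

```python
def fuse_tilt(angle_prev, gyro_rate, accel_angle, dt, alpha=0.98):
    """One step of a complementary filter for a tilt angle (degrees).

    Integrating the gyroscope rate alone accumulates drift over time,
    while the accelerometer's gravity-based angle is drift-free but
    noisy. Blending the two (alpha = 0.98 is an illustrative choice)
    keeps the short-term gyroscope response and suppresses long-term drift.
    """
    gyro_angle = angle_prev + gyro_rate * dt          # short-term: integrate rate
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle  # pull toward accel

# Toy scenario: a stationary sensor whose gyroscope has a 0.5 deg/s bias.
dt, n_steps = 0.01, 1000            # 10 s at an assumed 100 Hz sample rate
drifted, fused = 0.0, 0.0
for _ in range(n_steps):
    drifted += 0.5 * dt             # gyroscope-only estimate drifts steadily
    fused = fuse_tilt(fused, gyro_rate=0.5, accel_angle=0.0, dt=dt)
print(drifted)  # gyroscope alone ends up about 5 degrees off
print(fused)    # fused estimate stays bounded near 0.245 degrees
```

The fixed point of the blend (bias × dt × alpha / (1 − alpha)) bounds the fused error, which is why fusion keeps joint-angle output usable over long recordings.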
Experiment Design: The current study was a proof of the “posture-matching” concept,
wherein a professional ergonomist posed prescribed joint angles of the upper extremities and
trunk in the laboratory. These known angles (verified with a manual goniometer) were used as
ground-truth values in the analysis. The observers wore four IMU sensors: one each on their hand,
wrist, upper portion of the dominant arm, and sternum of the torso (Figure 1).

Figure 1. Placements of sensors worn by an observer.


The observers then mimicked postures in 92 randomly-ordered sample images (symmetrical
postures of the upper extremities and trunk during general material handling activities) shown on
a computer screen. The photos were taken from three different camera angles (0°, 45°, and 90°)
with postures of various prescribed joint angles. Postures in these photos represent typical
manual material handling tasks. The experimental process is shown in Figure 2. This data-collection
procedure was repeated thrice for each observer; therefore, 276 data points were collected from
each observer.
Data analysis: The TK Motion Manager software (version 1.0.0, NexGen Ergonomics Inc.)
was used to obtain calibrated data from an accelerometer, gyroscope, and magnetometer. The
HM Analyzer software (version 2.5.0, NexGen Ergonomics Inc.) was used to convert these data
into joint-angle measurements, based on anthropometric data of the US population using the
participant’s weight and height as inputs. The calibrated raw IMU data were processed in the
HM analyzer to calculate joint angles of the upper extremities and trunk. The software obtains
joint angle measurements in the local Cartesian coordinate system calculated based on the
International Society of Biomechanics recommendation (Wu et al. 2005). As commonly
practiced, any joint-angle data point from the three repeated measurements of the same photo
frame that deviated substantially from the average of the three measurements was regarded as an
outlier and removed from the dataset (Howell, 1998).
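The repeat-screening step might look like the sketch below; the paper states the criterion only as deviation from the average of the three repeats, so the median reference and the 15-degree cutoff used here are illustrative assumptions, not the authors’ actual rule.

```python
from statistics import median

def screen_repeats(repeats, max_dev_deg=15.0):
    """Drop joint-angle repeats that sit far from the other mimics.

    `repeats` holds the angles from the three mimics of one photo frame.
    The median reference and the 15-degree cutoff are assumptions for
    illustration; the study describes its criterion only qualitatively.
    """
    ref = median(repeats)
    return [x for x in repeats if abs(x - ref) <= max_dev_deg]

print(screen_repeats([44.0, 46.0, 45.0]))  # consistent mimics: all kept
print(screen_repeats([44.0, 46.0, 90.0]))  # the 90-degree mimic is removed
```

The median is used instead of the mean so that a single aberrant mimic does not drag the reference toward itself and flag the good repeats.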

Figure 2. Scheme of experiments to test the ability of the proposed posture-matching.


In testing the reliability of the proposed method, the joint angles obtained were classified
according to posture categories defined in Bao et al. (2009, pp. 295). The reliability of posture-
matching method was evaluated based on the intraclass correlation coefficient (ICC). ICC is an
index to test the reliability of the measurement in biomechanical and physiological devices
(Denegar and Ball, 1993). ICC’s absolute agreement option was used because random errors due
to IMU drifting among raters also varied across repetitions. ICC model 2.1 (two-way random,
absolute agreement, single measures) was selected for the ICC of the three different
observers, denoting inter-observer repeatability (Shrout and Fleiss, 1979). For ICC within the
same observer, we used joint-angle data for one (randomly selected) of the three observers. Each
data session in three repeated data collections was used for testing within-observer repeatability;
ICC model 1.1 (one-way random, single measures) was selected. As per the guideline suggested
by Sankarpandi et al. (2017), we interpreted ICC ≥ 0.8 as excellent reliability of the
posture-matching method, 0.6 ≤ ICC < 0.8 as good, 0.4 ≤ ICC < 0.6 as moderate, and ICC <
0.4 as poor. In assessing the validity of the proposed posture-matching method, we estimated the
root-mean-square difference (RMSD) between the sensor data and true joint angle of eight
different body postures (e.g., elbow flexion) for each observer as an index to test the validity of
the new instrument measurement (Schall et al., 2015; 2016), and percentile angular displacement
values were estimated.
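As a concrete illustration of the two indices, the sketch below computes ICC(2,1) from the classical two-way ANOVA mean squares (Shrout and Fleiss, 1979) and RMSD against known angles; the synthetic 92-frame, three-observer dataset and its 5° noise level are illustrative assumptions, not study data.

```python
import numpy as np

def icc_2_1(X):
    """ICC(2,1): two-way random effects, absolute agreement, single
    measures (Shrout and Fleiss, 1979). X is an (n frames x k observers)
    matrix of joint angles."""
    n, k = X.shape
    grand = X.mean()
    rows = X.mean(axis=1)                                  # per-frame means
    cols = X.mean(axis=0)                                  # per-observer means
    msr = k * ((rows - grand) ** 2).sum() / (n - 1)        # between frames
    msc = n * ((cols - grand) ** 2).sum() / (k - 1)        # between observers
    mse = ((X - rows[:, None] - cols[None, :] + grand) ** 2).sum() / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def rmsd(measured, true):
    """Root-mean-square difference between measured and known angles."""
    d = np.asarray(measured) - np.asarray(true)
    return float(np.sqrt((d ** 2).mean()))

# Synthetic check: 92 frames, 3 observers, 5-degree observer noise.
rng = np.random.default_rng(0)
true_angles = rng.uniform(0.0, 90.0, size=92)
ratings = true_angles[:, None] + rng.normal(0.0, 5.0, size=(92, 3))
icc = icc_2_1(ratings)
err = rmsd(ratings[:, 0], true_angles)
print(round(icc, 2))  # high agreement: frame variance dominates observer noise
print(round(err, 1))  # close to the injected 5-degree noise level
```

Because the frames span a wide angle range while the noise is small, the ICC lands near 1; narrowing the angle range with the same noise would drive it down, which mirrors the body-part differences reported in Table 1.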

RESULTS AND DISCUSSION


The result of between-observer repeatability (Table 1, see ICC 2.1 columns) shows that the
proposed posture-matching method using the IMU system was reliable for most posture variables
among the three different novice observers, especially for trunk flexion/extension, upper arm
flexion/extension, upper arm abduction/adduction, and elbow flexion. A lower reliability level
was noted for upper arm rotation and wrist ulnar/radial deviation in observers’ analyses. Within-
observer repeatability analysis (Table 1, see ICC 1.1 columns) also indicated that the proposed
posture-matching methods were reliable except for wrist flexion/extension, wrist ulnar/radial
deviation, and forearm supination/pronation. Upon comparing the results of the ICC between and
within observers, we found a notable discrepancy in the reliability of upper arm rotation joint-
angle estimation (Table 1).

Table 1. Inter-rater and intra-rater reliability.

Posture parameter                Camera angle 0°a     Camera angle 45°     Camera angle 90°b    All
                                 ICC 2.1   ICC 1.1    ICC 2.1   ICC 1.1    ICC 2.1   ICC 1.1    ICC 2.1   ICC 1.1
Trunk flexion/extension          0.81      0.90       0.72      0.96       0.70      0.97       0.74      0.94
Upper arm flexion/extension      0.85      0.88       0.80      0.87       0.79      0.86       0.82      0.87
Upper arm abduction/adduction    0.81      0.95       0.86      0.80       0.90      0.88       0.86      0.87
Upper arm rotation               0.33      0.79       0.16      0.63       0.22      0.73       0.23      0.71
Elbow flexion                    0.85      0.92       0.67      0.89       0.84      0.93       0.78      0.91
Wrist flexion/extension          0.37      0.54       0.68      0.69       0.22      0.38       0.55      0.61
Wrist ulnar/radial deviation     0.36      0.55       0.18      0.51       0.31      0.73       0.24      0.59
Forearm supination/pronation     0.35      0.39       0.55      0.56       0.54      0.41       0.49      0.49
Number of observations           24                   43                   25                   92
Note: a. Sagittal plane view; b. Frontal plane view.

For validity testing in Table 2, the postures of larger body parts that showed high reliability
between observers (Table 1), including trunk flexion/extension, upper arm flexion/extension,
abduction/adduction, and elbow flexion, were selected because these postures involve the major
body parts associated with WMSDs (CPWR, 2018). Alongside the validity testing results (i.e.,
RMSD), Table 2 also summarizes the variation of joint angle estimates for the selected postures
to compare the mean joint angles from three posture analysis methods in each percentile rank.
The proposed posture-matching method did not improve the RMSD average of joint angle
estimation in trunk flexion/extension, upper arm flexion/extension, abduction/adduction, or
elbow flexion compared to the RMSD average of joint angles estimated using the observational
method of direct angle input or clicking method (see “a” markers in Table 2).
We found that the averages of novice observers’ joint angle measurements of trunk and
elbow flexion were close to the averages of the trained ergonomists’ assessments
using the conventional posture observation methods when the flexion angles reached the 90th percentile
flexion (see “b” markers in Table 2). However, the difference between angle measurements from
the proposed posture-matching increased when observers mimicked increased upper arm flexion
and abduction (see “c” markers in Table 2). We found that the averages of novice observers’
joint angle measurements in trunk flexion showed the greatest validity of measurement compared

© ASCE
Computing in Civil Engineering 2019 594

to true trunk flexion angle in “known” postures.

Table 2. Summary of joint angle estimates and RMSD from three posture analysis methods.

Summary measure               True joint angle    Conventional direct    Conventional        Proposed posture
                              from the known      angle input method     clicking method     matching
                              postures            Mean (SD)              Mean (SD)           Mean (SD)
Upper arm flexion/extension
  10th percentile (°)         0.0                 0.0 (0.0)              -0.7 (0.6)          1.0 (7.0)
  50th percentile (°)         0.0                 0.0 (0.0)              0.0 (1.0)           14.4 (10.2)
  75th percentile (°)         45.0                36.7 (3.8)c            26.5 (7.8)c         52.5 (20.7)
  90th percentile (°)         90.0                85.0 (8.7)c            76.0 (18.4)c        80.3 (11.2)
  Maximum flexion (°)         90.0                90.0 (0.0)             94.3 (2.3)          94.6 (3.7)
  Sample-to-sample RMSD (°)   Ref.                10.1 (1.2)a            11.9 (2.1)a         34.1 (1.2)a
Upper arm abduction/adduction
  10th percentile (°)         0.0                 0.0 (0.0)              0.0 (1.0)           1.7 (2.9)
  50th percentile (°)         0.0                 0.0 (0.0)              2.0 (1.7)           11.3 (3.7)
  75th percentile (°)         45.0                43.3 (2.9)c            37.8 (7.8)c         31.3 (6.4)
  90th percentile (°)         90.0                90.0 (0.0)c            89.0 (1.0)c         72.5 (4.4)
  Maximum abduction (°)       90.0                90.0 (0.0)             91.7 (1.2)          90.7 (3.2)
  Sample-to-sample RMSD (°)   Ref.                12.4 (3.6)a            11.6 (4.2)a         13.3 (0.8)a
Elbow flexion
  10th percentile (°)         0.0                 0.0 (0.0)              1.3 (0.6)           -1.3 (4.0)
  25th percentile (°)         45.0                40.0 (8.7)             37.0 (2.6)          19.9 (7.8)
  50th percentile (°)         90.0                86.7 (5.8)             87.8 (2.5)          56.6 (8.3)
  75th percentile (°)         90.0                90.0 (0.0)             90.0 (0.0)          72.9 (6.6)
  90th percentile (°)         90.0                90.0 (0.0)b            92.7 (2.9)b         87.1 (6.0)
  Maximum flexion (°)         135.0               146.7 (27.5)           128.0 (8.9)         118.5 (17.4)
  Sample-to-sample RMSD (°)   Ref.                11.1 (4.2)a            11.1 (1.5)a         24.4 (5.0)a
Trunk flexion/extension
  10th percentile (°)         0.0                 0.0 (0.0)              0.0 (1.0)           -8.6 (9.9)
  50th percentile (°)         0.0                 0.0 (0.0)              0.3 (1.5)           -3.5 (6.5)
  75th percentile (°)         0.0                 0.0 (0.0)              2.0 (1.7)           -0.7 (5.7)
  90th percentile (°)         30.0                25.0 (0.0)b            28.7 (2.3)b         34.3 (14.3)
  Maximum flexion (°)         30.0                40.0 (5.0)             48.3 (4.2)          48.8 (17.9)
  Sample-to-sample RMSD (°)   Ref.                0.0 (0.0)a             4.5 (1.1)a          9.4 (4.4)a
Note: Ref., reference measure

One source of error contributing to the lower reliability of results related to these movements
was the camera angle (Table 1), especially for wrist flexion/extension. Variations of ICC
estimates were calculated using three different camera angles. The reliability of the observer’s
posture analysis decreased if the worker’s postures were photographed from parallel directions
with the upper extremities elevated. Observers did not have clear perceptions of the wrist
postures and upper/lower arm inner and outer rotations from the images. The other source that
contributed to the lower reliability and validity of the posture-matching was potentially the errors
of sensors. For instance, the RMSD of upper arm flexion/extension reported consistently higher
values among three observers compared to estimates for the conventional observation methods,
especially the measurements between the 25th and 90th percentiles (Table 2). Our sensor drifting
test of angle output in the same static neutral posture in the beginning and end of data collection
demonstrates that drifting remarkably influenced the reliability and validity of upper arm rotation
measurements (Figure 3).

Figure 3. Angle outputs in the same neutral posture at the beginning and end of data collection.
Limitations and directions for future work: The 92 posture frames only included a limited
trunk flexion range between 0° and 30°, so we were unable to capture the sensitivity of the
proposed posture-matching in various trunk flexion ranges (Table 2). The current study could not
distinguish whether the error occurred due to human perception (Golabchi et al. 2017) or random
errors, such as gyroscope drifting, because of the limited number of observers recruited for this
study. Future studies will distinguish the different sources of variation by recruiting a
sufficient number of observers and will test the feasibility of posture-matching with photos of
worst-case situations that expose construction workers to a high risk of WMSDs.

CONCLUSION
The construction industry still has a higher level of WMSD prevalence than the average of all
industries combined (CPWR, 2018). The conventional observational ergonomic assessment
method requires a significant amount of training and experience from the observers, which poses
a major barrier to broad implementation and industry adoption. The proposed posture-matching
method utilizing the IMU system is a post hoc ergonomics posture quantification method that
industry practitioners can perform anywhere based on videos or photos of worker activities on
the construction site. In this method, observers mimic the posture in photo frames that were
obtained onsite while wearing the IMU system. The proposed posture-matching method was
reliable for estimating certain joint angles such as upper arm abduction/adduction but unreliable
for upper arm rotation and wrist ulnar/radial deviation. We foresee that this assessment method
for identifying awkward postures will eventually be incorporated into routine construction safety
and health management.

REFERENCES
Alwasel, A., Abdel-Rahman, E. M., Haas, C. T., and Lee, S. (2017). “Experience, productivity,
and musculoskeletal injury among masonry workers.” J. Constr. Eng. Manag., 143(6),
05017003.
Bao, S., Howard, N., Spielholz, P., and Silverstein, B. (2007). “Two posture analysis approaches
and their application in a modified rapid upper limb assessment evaluation.” Ergon., 50(12),
2118-2136.
Bao, S., Howard, N., Spielholz, P., Silverstein, B., and Polissar, N. (2009). “Interrater reliability
of posture observations.” Hum. Factors, 51(3), 292-309.
Boschman, J. S., van der Molen, H. F., Sluiter, J. K., and Frings-Dresen, M. H. (2012).
“Musculoskeletal disorders among construction workers: a one-year follow-up study.” BMC
Musculoskelet. Disord., 13(1), 196.
CPWR (Center for Construction Research and Training) (2018). The Construction Chart Book
(sixth edition), <https://www.cpwr.com> (Jan. 24, 2019).
Chen, J., Qiu, J., and Ahn, C. (2017). “Construction worker's awkward posture recognition
through supervised motion tensor decomposition.” Autom. Constr., 77, 67-81.
David, G. C. (2005). “Ergonomic methods for assessing exposure to risk factors for work-related
musculoskeletal disorders.” Occup. Med., 55(3), 190-199.
Dempsey, P. G., Lowe, B. D., and Jones, E. (2018). “An international survey of tools and
methods used by certified ergonomics professionals.” In Congr. of the International
Ergonomics Association. Springer, Cham, 223-230.
Denegar, C. R., and Ball, D. W. (1993). “Assessing reliability and precision of measurement: an
introduction to intraclass correlation and standard error of measurement.” J. Sport Rehabil.,
2(1), 35-42.
El-Gohary, M., and McNames, J. (2012). “Shoulder and elbow joint angle tracking with inertial
sensors.” IEEE Trans. Biomed. Eng., 59(9), 2635-2641.
El-Gohary, M., Holmstrom, L., Huisinga, J., King, E., McNames, J., and Horak, F. (2011).
Upper limb joint angle tracking with inertial sensors. In Annu. Int. Conf. of the IEEE
Engineering in Medicine and Biology Society, 5629-5632.
Golabchi, A., Han, S., Fayek, A. R., and AbouRizk, S. (2017). “Stochastic Modeling for
Assessment of Human Perception and Motion Sensing Errors in Ergonomic Analysis.” J.
Comput. Civil Eng., 31(4), 04017010.
Howell, D. C. (1998). Statistical methods in human sciences. New York: Wadsworth.
Roetenberg, D., Slycke, P. J., and Veltink, P. H. (2007). “Ambulatory position and orientation
tracking fusing magnetic and inertial sensing.” IEEE Trans. Biomed. Eng., 54(5), 883-890.
Sankarpandi, S. K., Baldwin, A. J., Ray, J., and Mazzà, C. (2017). “Reliability of inertial sensors
in the assessment of patients with vestibular disorders: a feasibility study.” BMC Ear, Nose
and Throat Disord., 17(1), 1.
Schall Jr, M. C., Fethke, N. B., Chen, H., Oyama, S., and Douphrate, D. I. (2016). “Accuracy
and repeatability of an inertial measurement unit system for field-based occupational
studies.” Ergon., 59(4), 591-602.
Schall, M. C., Fethke, N. B., Chen, H., and Gerr, F. (2015). “A comparison of instrumentation
methods to estimate thoracolumbar motion in field-based occupational studies.” Appl.
Ergon., 48, 224-231.
Shrout, P. E., and Fleiss, J. L. (1979). “Intraclass correlations: uses in assessing rater reliability.”
Psychol. Bull., 86(2), 420-428.
Spielholz, P., Davis, G., and Griffith, J. (2006). “Physical risk factors and controls for
musculoskeletal disorders in construction trades.” J. Constr. Eng. Manag., 132(10), 1059-
1068.
Weir, P. L., Andrews, D. M., van Wyk, P. M., and Callaghan, J. P. (2011). “The influence of
training on decision times and errors associated with classifying trunk postures using video-
based posture assessment methods.” Ergon., 54(2), 197-205.
Wu, G., Van der Helm, F. C., Veeger, H. D., Makhsous, M., Van Roy, P., Anglin, C., et al.
(2005). “ISB recommendation on definitions of joint coordinate systems of various joints for
the reporting of human joint motion—Part II: shoulder, elbow, wrist and hand.” J. Biomech.,
38(5), 981-992.


Investigating the Neurophysiological Effect of Thermal Environment on Individuals’ Performance Using Electroencephalogram
Xi Wang1; Da Li2; Carol C. Menassa3; and Vineet R. Kamat4

1MSE Student, Dept. of Civil and Environmental Engineering, Univ. of Michigan, Ann Arbor, MI 48109-2125. E-mail: [email protected]
2Ph.D. Candidate, Dept. of Civil and Environmental Engineering, Univ. of Michigan, Ann Arbor, MI 48109-2125. E-mail: [email protected]
3Associate Professor, Dept. of Civil and Environmental Engineering, Univ. of Michigan, Ann Arbor, MI 48109-2125. E-mail: [email protected]
4Professor, Dept. of Civil and Environmental Engineering, Univ. of Michigan, Ann Arbor, MI 48109-2125. E-mail: [email protected]

ABSTRACT
The thermal environment has a great influence on individuals’ performance; however, factors
such as one’s motivation to perform well under experimental conditions make it difficult to
assess how room temperature affects subjects’ performance. One approach to overcome this
problem is to understand the changes in individuals’ neurophysiological conditions. This paper
reports on the results of an experiment in which electroencephalogram (EEG) data were collected
from 5 subjects while they performed four computerized cognitive tasks. The power spectral
density (PSD) of the EEG signals in three different thermal environments (slightly cool, neutral,
and slightly warm) was compared within subjects. In most cases, significant differences in the
PSD of frontal theta (4–8 Hz) activity were observed, indicating that individuals’ mental effort
varies with room temperature. In the long run, the increased mental workload will reduce
individuals’ performance and be detrimental to their productivity. The study indicates that the
proposed method could be implemented on a larger scale for further studies.

INTRODUCTION AND BACKGROUND


Temperature is one of the most important factors of indoor environmental quality (IEQ),
which can significantly affect occupants’ thermal comfort, health, and well-being (Li et al.
2017a). An undesirable thermal environment may lead to sick building syndrome symptoms such
as eye, nose, and throat irritation and affect occupants’ performance and productivity (Witterseh
et al. 2004). Previously, researchers investigated how different IEQ factors including lighting,
acoustics, indoor air quality, and thermal comfort affect occupants’ performance (Al Horr et al.
2016). Providing a comfortable work environment is considered an effective way to improve
office workers’ performance and well-being (Seppänen and Fisk 2006, Li et al. 2018).
Meanwhile, it was shown that the high costs of improving those IEQ factors are offset by
improved worker health (e.g., reduced absenteeism) and productivity (Djukanovic et al. 2002).
Three methods are most commonly used to measure the effect of IEQ on office workers’ performance. The first asks workers to subjectively rate their perceived performance (Tanabe et al. 2015). This method is convenient to implement and independent of the type of task; although the rating is direct and comparable, it is highly biased and subjective, and therefore has very limited value in evaluating the actual performance of occupants in different indoor environments. In the second method, workers' productivity in real working conditions is directly measured to evaluate their performance (Kekäläinen et al. 2010); however,

© ASCE
Computing in Civil Engineering 2019 599

most office work (e.g., management, research) involves a variety of different tasks and skills and
does not have clearly measurable and comparable output. Therefore, it is impractical to quantify
office worker performance by directly measuring their productivity. The third method involves
simulating office work using performance tests that represent the typical office activities in a
laboratory environment (Lan et al. 2009). It enables us to directly understand the effect of IEQ
on individuals’ different cognitive and executive functions. Nevertheless, test results are
significantly influenced by people’s high motivation to perform better under experimental
conditions (McCarney et al. 2007). In addition, experimental performance tests have short durations (e.g., ranging from 30 minutes to 2 hours) compared to real office work (e.g., 8 hours per day), which gives subjects the motivation to maintain high performance throughout the experiment.
Among all the IEQ factors, controlling room temperature is one of the simplest ways to
achieve optimal workplace environments (Seppanen et al. 2005, Li et al. 2017b). Previous
studies mainly focused on the effect of thermal environment on occupants’ performance on the
behavioral level, which is reflected by their productivity and psychometric test results (Toftum et
al. 2005, Lan et al. 2009). However, subjects tend to maintain their performance under moderate thermal stress (typical of office environments) throughout the short experimental duration (Holm et al. 2009). Therefore, there are underlying changes at the occupants’ neurophysiological level that are not reflected behaviorally and thus cannot be directly detected through the measurement of task performance (Hocking et al. 2001). In real office settings, neglecting these
neurophysiological activities will eventually result in reduced long-term performance and
detrimental health effects. As a result, developing a method to understand the neurophysiological
effect of workplace thermal environment on workers’ mental workload is of great significance. It
could provide us with insights into setting workplace temperature in order to achieve the highest
worker overall performance and well-being in the long run.
The objective of this paper is to understand how indoor thermal environment influences
occupants’ performance by investigating the neurophysiological effect of room temperature on
subjects while they perform four different types of computerized cognitive tasks. In this paper,
neurophysiological activity was measured using a low-cost, wireless EEG headset while the
subjects performed tasks under three different thermal conditions (slightly warm, neutral, and slightly cool) derived from Fanger’s Predicted Mean Vote (PMV) model (Fanger 1970). After preprocessing the data, the PSD of each data segment was obtained, and the PSD of the slightly cool/warm environments (PMV=-1/PMV=1) was compared within subjects against the PSD of the neutral environment (PMV=0) using the Wilcoxon signed-rank test.

METHODOLOGY
In this study, a comprehensive framework was developed to study the neurophysiological
effect of three different thermal environments on occupants, as shown in Figure 1. The
temperature setpoints of the environment are based on the PMV model, representing occupants’ thermal sensations ranging from slightly cool to slightly warm. The subjects were
asked to perform four selected cognitive tasks representing different cognitive functions (detailed
in Section 2.1).
The neurophysiological activity was captured by Emotiv EPOC+, a low-cost wireless EEG
headset that can record brain activities with reasonable quality. EEG is a non-invasive technique
to monitor and record the electrical activities of the brain, typically through the electrodes placed
on the scalp surface. By directly capturing the activities of the central nervous system, EEG can
accurately reflect brain neural activities and subjects’ cognitive states. EEG can capture subtle variations in subjects’ cognitive states with a high time resolution that cannot be observed in subjects’ behavioral responses (Cohen 2011). Several studies have used EEG to investigate the effect of temperature on individuals; however, Lan et al. (2010) and Yao et al. (2009) did not record the EEG signals in real time while the occupants were performing tasks, and Choi et al. (2019) asked subjects to carry out their own work and did not control the difficulty of the tasks performed under each thermal condition.
To address the limitations mentioned above, in this study the EEG signal was measured in real time while subjects performed tasks of the same difficulty level, and the signals were compared among three moderate thermal conditions typical of office environments.

Figure 1: Framework to study the neurophysiological effect of the thermal environment

Figure 2: Overview of Cognitive tasks (a) Number addition (b) Digit Span (c) Choice
Reaction (d) Visual Search
Cognitive tasks
In this study, we selected four computer-based cognitive tasks to engage subjects’ thinking, working memory, perception, and choice-reaction functions. All the tasks were developed in JavaScript with a difficulty level appropriate for graduate students. The number addition
(NA) task asked subjects to mentally add up columns of four 3-digit numbers (see Figure 2a)
shown on the computer screen within a given time. In the digit span (DS) task, a sequence of eleven single-digit numbers appeared on the screen one digit at a time (see Figure 2b). After all the digits were shown, subjects were asked to recall the number sequence and input it using the number
pad. In the choice reaction (CR) task, the name of a color appeared in the center of the screen one at a time. The subjects were asked to respond to the font color of the word as quickly as possible, regardless of the word’s meaning, by pressing the first letter of the color on the
keyboard (see Figure 2c). The visual search (VS) task required subjects to rapidly and accurately
search for the target object shown on the right side of the screen within the 9x9 grid on the left side of the screen (see Figure 2d). All trials in the four tasks were randomly generated by a computer to ensure that the task difficulty remained the same each time a task was conducted.
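As a minimal illustration of this kind of randomized trial generation with fixed difficulty, the sketch below generates number addition (NA) trials in Python. It is not the authors’ JavaScript implementation; the function name and structure are hypothetical.

```python
# Hypothetical sketch of random trial generation with fixed nominal
# difficulty, in the spirit of the number addition (NA) task: every
# trial presents four 3-digit numbers, so difficulty is constant
# across randomly generated trials.
import random

def na_trial(rng):
    """Return one NA trial: four 3-digit addends and the correct sum."""
    addends = [rng.randint(100, 999) for _ in range(4)]
    return addends, sum(addends)

rng = random.Random(42)          # seeded for reproducibility
addends, answer = na_trial(rng)
print(addends, answer)
```

Because every trial draws exactly four 3-digit addends, the nominal difficulty is identical no matter how often the task is regenerated.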

Experiment design
In our experiment, the temperature settings were derived from the PMV model, which is an
international standard to evaluate the occupants’ indoor thermal comfort based on the human
body’s thermal balance equation (Fanger 1970). Given the fact that the temperature in an office
environment is usually controlled within a moderate range, we set the thermal condition to be
PMV=-1 (69.8 ℉/21 ℃), PMV=0 (76.3 ℉/24.61 ℃), and PMV=1 (82.7 ℉/28.17 ℃), which correspond to slightly cool, neutral, and slightly warm on the ASHRAE thermal sensation scale,
respectively. The experimental protocol was approved by the Institutional Review Board at the
University of Michigan. All subjects recruited were graduate students. Each subject was required to participate in the experiment at the same time every day for three consecutive days, with the same clothing level and a good night’s rest beforehand, to eliminate circadian effects.
In each experimental condition, the subjects followed the procedures shown in Figure 3.
Before the experiment started, the subjects spent 30 minutes relaxing to adapt to the
environment. Then, the authors spent 15 minutes fitting the EEG headset on the subject and ensuring a good connection. The four cognitive tasks were divided into two sections, with two in each
section. The order of the cognitive tasks was randomly shuffled among different subjects, while
each subject performed the tasks in the same order on different days. Between the two cognitive
task sections, the subjects had a fifteen-minute rest with the EEG headset removed from their
heads. After the cognitive tasks, the subjects conducted a short survey about their thermal
sensation and thermal comfort during the experiment. Because this paper focuses on the EEG data and its analysis, the results of this survey are not discussed.

Figure 3: Experiment procedure for each day


Data analysis
The raw EEG data were divided into segments according to the task start and end times recorded by the computer. Each segment corresponds to one subject performing one task in one thermal environment. To preprocess the raw data, we first removed the DC (direct current) offset and limited the slew rate. Next, a finite impulse response band-pass filter with a 1 Hz high-pass and a 55 Hz low-pass cutoff was applied to remove extrinsic artifacts from the data. Eye artifacts (i.e., eye blinks and eye movements) and muscular artifacts were removed using the Independent Component Analysis algorithm
(Comon 1994). After that, we calculated the PSD of each dataset. The PSD describes the power of a signal as a function of frequency; in our study, it was estimated using Welch’s average periodogram method with a Hanning window and a 2048-point discrete Fourier transform. Because previous studies suggested that frontal theta band power (4–8 Hz) increases with growing task demands, which require greater mental effort, and that the left cerebral hemisphere is detail-oriented and responsible for logical, mathematical, and scientific skills (Holm et al. 2009, Klimesch 1999), we focused only on the theta-band PSD of the F3 channel (placed over the left frontal lobe). The Wilcoxon signed-rank test was then used to compare the PSD of the same type of task across thermal environments within subjects, because individual differences make EEG signal patterns incomparable across subjects.
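The Welch PSD step can be sketched as follows, assuming the EPOC+'s nominal 128 Hz sampling rate. With a 2048-point DFT, the frequency resolution is 128/2048 = 1/16 Hz, so the 4–8 Hz theta band contains 65 bins, matching the PSD sampling described in the Results section. The function name and the synthetic signal are illustrative, not the authors’ code.

```python
# Hedged sketch of the Welch PSD computation on a preprocessed F3
# segment; the 128 Hz sampling rate is an assumption (Emotiv EPOC+).
import numpy as np
from scipy.signal import welch

FS = 128      # assumed sampling rate (Hz)
NFFT = 2048   # 2048-point DFT -> 128/2048 = 1/16 Hz resolution

def theta_psd(segment):
    """Return the 4-8 Hz portion of the Welch PSD of one task segment."""
    freqs, psd = welch(segment, fs=FS, window="hann",
                       nperseg=NFFT, nfft=NFFT)
    mask = (freqs >= 4) & (freqs <= 8)
    return freqs[mask], psd[mask]

rng = np.random.default_rng(0)
segment = rng.standard_normal(FS * 60)   # one minute of synthetic data
f, p = theta_psd(segment)
print(len(f))   # 65 theta-band points at 1/16 Hz spacing
```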

RESULTS
For each dataset, we calculated the PSD and selected 65 data points with an equal interval
(1/16 Hz) in the theta frequency band. The Wilcoxon signed-rank test was used to compare the
theta PSD of the neutral thermal environment (PMV=0) to the slightly cool (PMV=-1) and
slightly warm (PMV=1) environments for each type of task. The mean PSD is given in Table 1. Bold values indicate that the PSD of that dataset differs significantly (p<0.05) from the PSD in the corresponding neutral environment (PMV=0). In most cases, significant differences were observed in the theta-band PSD of the F3 channel when the thermal condition shifted from neutral to slightly cool or slightly warm. Since frontal theta PSD is positively correlated with task demands and the mental effort exerted, we conclude that subjects had different mental workloads in different thermal environments even though the task demands were the same. The method therefore enables us to understand how the thermal environment influences office workers’ performance through its effect on their neurophysiological activities.
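The within-subject comparison can be sketched as below: the 65 theta-band PSD values from a neutral session are paired with those from a non-neutral session and tested with the Wilcoxon signed-rank test. The PSD vectors here are synthetic placeholders, constructed only to make the test’s usage concrete.

```python
# Hedged sketch of the within-subject Wilcoxon signed-rank comparison;
# the two PSD vectors are synthetic stand-ins for real sessions.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
psd_neutral = rng.gamma(2.0, 0.5, size=65)                    # PMV = 0
psd_cool = 1.4 * psd_neutral + rng.normal(0, 0.05, size=65)   # PMV = -1

stat, p_value = wilcoxon(psd_neutral, psd_cool)
print(p_value < 0.05)   # True: the paired PSDs differ significantly
```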
Bar charts in Figure 4 show the change of mean frontal theta PSD for each subject on
different types of tasks. In most cases, subjects’ mean PSD was relatively low when PMV=0,
which implies relatively less mental effort was spent in the neutral thermal condition. It could be
found that subjects have different sensitivity to the room temperature. For example, Subject 2
was not sensitive to the slightly warm environment since his/her frontal theta PSD have little
difference for all types of tasks. The effect of the thermal environment also depends on the task
type. Taking Subject 5 for example, he/she was very sensitive when the thermal condition
deviated from neutral for the number addition, digit span and visual search task. However,
his/her PSD did not change significantly for the choice reaction task. In addition, he/she had
lower mental workload in the cooler environment than in the warmer one for the digit span and
visual search task, while vice versa for the other two tasks.

Table 1: The Wilcoxon signed-rank test results


Task    Subject 1            Subject 2            Subject 3            Subject 4            Subject 5
PMV      0     -1     1       0     -1     1       0     -1     1       0     -1     1       0     -1     1
NA     2.324 2.007 1.267   0.570 0.589 0.689   0.372 1.752 0.761   1.486 1.920 1.393   1.392 2.497 2.198
DS     1.543 1.439 1.525   0.632 2.026 0.565   0.587 1.183 0.636   1.381 2.465 1.143   0.885 1.559 2.673
CR     1.884 1.495 2.879   0.781 1.065 0.682   0.455 1.352 0.828   1.358 1.451 0.980   1.044 1.281 0.849
VS     1.628 1.436 2.883   0.507 1.810 0.389   0.516 1.249 0.633   1.542 2.556 1.684   0.439 1.038 1.670
Bold represents a significant difference (p<0.05)

Subjects’ task performance data are shown in the tables above the bar charts. We used the average response time (10 ms) for correct responses to quantify subjects’ task performance for the choice reaction task, and the number of correct trials to quantify performance for the other three tasks. In most cases, subjects had the worst performance in the slightly warm environment and better performance in the neutral or the slightly cool environment. Notably, the additional mental effort exerted did not necessarily result in better task performance, which warrants further data analysis as part of our future work.

Figure 4: Mean PSD and task performance comparison for each task
CONCLUSIONS
In this study, we proposed a method to measure the neurophysiological effect of thermal environments on individuals while they perform different types of cognitive tasks. Building on the finding of previous studies that frontal theta PSD rises with increasing task demands and higher mental workload, we investigated how the thermal environment affects subjects’ performance from a neurophysiological perspective. Even though the task difficulties remained the same across thermal conditions, the tasks placed different demands on subjects’ mental effort as the thermal conditions varied, and subjects tended to perform worse in the slightly warm environment. We also found that subjects did not achieve better performance with higher mental effort. The study shows the potential of using neurophysiological effects measured by EEG to identify optimal office environments that minimize workers’ mental workload while accounting for individual differences and the type of work performed. A limitation of the proposed method is that subjects need to remain relatively still while performing tasks to keep the EEG data clean. In the future, a more elaborate index of mental effort could be developed by including more EEG channels and a wider frequency range, which will enable us to examine the effect of temperature on individuals’ performance in greater depth with more subjects.

ACKNOWLEDGMENTS
The authors would like to acknowledge the financial support for this research received from
the U.S. National Science Foundation (NSF) CBET 1349921 and CBET 1804321. Any opinions
and findings in this paper are those of the authors and do not necessarily represent those of the

© ASCE
Computing in Civil Engineering 2019 604

NSF.

REFERENCES
Al Horr, Y., Arif, M., Kaushik, A., Mazroei, A., Katafygiotou, M., and Elsarrag, E. (2016).
“Occupant productivity and office indoor environment quality: A review of the literature.”
Building and Environment, 105, 369–389.
Choi, Y., Kim, M., and Chun, C. (2019). “Effect of temperature on attention ability based on
electroencephalogram measurements.” Building and Environment, 147, 299–304.
Cohen, M. X. (2011). “It’s about Time.” Frontiers in Human Neuroscience, 5, 1-15.
Comon, P. (1994). “Independent component analysis, A new concept?” Signal Processing, 36(3),
287–314.
Djukanovic, R., Wargocki, P., and Fanger, P. (2002). “Cost-benefit analysis of improved air
quality in an office building.” Indoor Air, 6.
Fanger P.O. (1970). Thermal Comfort: Analysis and Applications in Environmental Engineering,
McGraw-Hill, Berkeley, CA
Hocking, C., Silberstein, R. B., Lau, W. M., Stough, C., and Roberts, W. (2001). “Evaluation of
cognitive performance in the heat by functional brain imaging and psychometric testing.” 16.
Holm, A., Lukander, K., Korpela, J., Sallinen, M., and Müller, K. M. I. (2009). “Estimating
Brain Load from the EEG.” The Scientific World Journal, 9, 639–651.
Kekäläinen, P., Niemelä, R., Tuomainen, M., Kemppilä, S., Palonen, J., Riuttala, H., Nykyri, E.,
Seppänen, O., and Reijula, K. (2010). “Effect of reduced summer indoor temperature on
symptoms, perceived work environment and productivity in office work.” Intelligent
Buildings International, 17.
Klimesch, W. (1999). “EEG alpha and theta oscillations reflect cognitive and memory
performance: a review and analysis.” Brain Research Reviews, 29(2–3), 169–195.
Lan, L., Lian, Z., Pan, L., and Ye, Q. (2009). “Neurobehavioral approach for evaluation of office
workers’ productivity: The effects of room temperature.” Building and Environment, 44(8),
1578–1588.
Lan, L., Lian, Z., and Pan, L. (2010). “The effects of air temperature on office workers’ well-
being, workload and productivity-evaluated with subjective ratings.” Applied Ergonomics,
42(1), 29–36.
Li, D., Menassa, C. C., and Kamat, V. R. (2017a). “A Personalized HVAC Control Smartphone
Application Framework for Improved Human Health and Well-Being.” Computing in Civil
Engineering 2017, American Society of Civil Engineers, Seattle, Washington, 82–90.
Li, D., Menassa, C. C., and Kamat, V. R. (2017b). “Personalized human comfort in indoor building environments under diverse conditioning modes.” Building and Environment, 126, 304–317.
Li, D., Menassa, C. C., and Kamat, V. R. (2018). “Non-intrusive interpretation of human thermal
comfort through analysis of facial infrared thermography.” Energy and Buildings, 176, 246–
261.
McCarney, R., Warner, J., Iliffe, S., van Haselen, R., Griffin, M., and Fisher, P. (2007). “The
Hawthorne Effect: a randomised, controlled trial.” BMC Medical Research Methodology,
7(1).
Seppänen, O., Fisk, W. J., and Faulkner, D. (2005). “Control of Temperature for Health and
Productivity in Offices.” 680–686.
Seppänen, O., and Fisk, W. (2006). “Some Quantitative Relations between Indoor Environmental Quality and Work Performance or Health.” HVAC&R Research, 12(4), 957–973.
Tanabe, S., Haneda, M., and Nishihara, N. (2015). “Workplace productivity and individual
thermal satisfaction.” Building and Environment, 91, 42–50.
Toftum, J., Wyon, D., and Svanekj, H. (2005). “Remote performance measurement (RPM) – a
new, Internet-based method for the measurement of occupant performance in office.” Indoor
Air, 5, 357-361.
Witterseh, T., Wyon, D. P., and Clausen, G. (2004). “The effects of moderate heat stress and
open-plan office noise distraction on SBS symptoms and on the performance of office work.”
Indoor Air, 14(s8), 30–40.
Yao, Y., Lian, Z., Liu, W., Jiang, C., Liu, Y., and Lu, H. (2009). “Heart rate variation and
electroencephalograph - the potential physiological factors for thermal comfort study.”
Indoor Air, 19(2), 93–101.


Enhanced Welding Operator Quality Performance Measurement: Work Experience-Integrated Bayesian Prior Determination
Yitong Li, S.M.ASCE1; Wenying Ji, A.M.ASCE2; and Simaan M. AbouRizk, M.ASCE3
1Ph.D. Student, Dept. of Civil, Environmental, and Infrastructure Engineering, George Mason Univ., Fairfax, VA 22030. E-mail: [email protected]
2Assistant Professor, Dept. of Civil, Environmental, and Infrastructure Engineering, George Mason Univ., Fairfax, VA 22030. E-mail: [email protected]
3Professor, Dept. of Civil and Environmental Engineering, Univ. of Alberta, Edmonton, AB T6G 2W2, Canada. E-mail: [email protected]

ABSTRACT
Measurement of operator quality performance has been challenging in the construction
fabrication industry. Among various causes, the learning effect is a significant factor, which
needs to be incorporated in achieving a reliable operator quality performance analysis. This
research aims to enhance a previously developed operator quality performance measurement
approach by incorporating the learning effect (i.e., work experience). To achieve this goal, the
plateau learning model is selected to quantitatively represent the relationship between quality
performance and work experience through a beta-binomial regression approach. Based on this
relationship, an informative prior determination approach, which incorporates operator work
experience information, is developed to enhance the previous Bayesian-based operator quality
performance measurement. Academically, this research provides a systematic approach to derive
Bayesian informative priors through integrating multi-source information. Practically, the
proposed approach reliably measures operator quality performance in fabrication quality control
processes.

INTRODUCTION
Pipe spool fabrication is essential to the successful delivery of an industrial construction
project (Wang et al. 2009). During the process of pipe fabrication, welding is an important step
and its quality must be examined to ensure the specified requirements are satisfied. Although
welding is undertaken by skilled operators, variations commonly exist in welding operator
quality performance due to the lack of essential knowledge and skills (Ji et al. 2018). Therefore, the ability to reliably measure welding operator quality performance is crucial, since reliable performance measurement leads to considerable improvement in project quality performance, which in turn decreases rework costs and reduces schedule delays. To achieve this goal, Ji and AbouRizk (2018) developed a Bayesian statistics-based method to estimate operator quality performance by assuming that operator quality performance is stationary over time. However, one of the most significant factors, the learning effect (i.e., the continuous improvement in quality performance as operator work experience increases), was neglected, which leads to a biased estimation of operator quality performance.
This research aims to enhance the previously developed approach to reliably measure
welding operator quality performance by incorporating the effect of work experience.
Specifically, the objective is achieved by (1) selecting a learning curve model to describe the
relationship between quality performance and work experience; (2) applying the beta-binomial
regression to derive the equation of the selected model; (3) determining an informative prior to represent quality performance for a given operator; and (4) demonstrating the advantages of the enhanced Bayesian-based approach using a case study. The remainder of this paper is arranged
as follows. In the next section, previous work on Bayesian-based operator quality performance
measurement is discussed. After that, the newly proposed methodology is introduced step by
step. In the following section, a practical case study is conducted to demonstrate the advantages
of the newly proposed approach. Finally, contributions, limitations, and future work are
concluded.

PREVIOUS WORK
Previously, Ji and AbouRizk (2017) advocated the advantages of a Bayesian-based operator quality performance measurement that incorporates inspection sampling uncertainty. In their research, operator quality performance is reflected by the fraction nonconforming (i.e., percentage repair rate), as indicated below:
p = X / n (1)
where p denotes the fraction nonconforming, X denotes the number of welds that fail inspection, and n denotes the total number of welds. To cover the sampling uncertainty, a beta distribution Beta(a, b) was chosen to model the prior distribution for the Bayesian-based estimation of the fraction nonconforming. The prior distribution represents operator quality performance before any inspection results are collected; the posterior distribution describes the latest measurement of operator quality performance as more inspection results are continuously added. An analytical solution for the posterior distribution follows:
Beta(X + a, n - X + b) (2)
In Bayesian statistics, two types of priors are commonly used, namely, informative priors—
probability distributions derived from historical data or subjective knowledge; and
noninformative priors—vague, flat, and diffuse probability distributions that have the lowest bias
to prior estimation when information is insufficient (Ji and AbouRizk 2017).
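The conjugate update in Equations (1)-(2) can be sketched in a few lines. The inspection counts below are illustrative (chosen in the spirit of Table 1), and the Jeffreys prior Beta(1/2, 1/2) is the noninformative choice discussed above.

```python
# Minimal sketch: Beta(a, b) prior on the fraction nonconforming p,
# updated with X failed welds out of n inspections (Equation (2)).
from scipy.stats import beta

a, b = 0.5, 0.5        # Jeffreys noninformative prior
n, X = 208, 9          # illustrative inspection results

posterior = beta(X + a, n - X + b)   # Beta(X + a, n - X + b)
print(round(posterior.mean(), 3))    # 0.045, posterior estimate of p
```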
In estimating welding operator quality performance, Ji and AbouRizk (2017) used a noninformative prior distribution Beta(1/2, 1/2) without incorporating the learning effect of operators. The reliability of a Bayesian statistics-based method depends heavily on the prior determination (Winkler 1967). An inability to determine reliable priors leads to unreliable posterior inferences, which further misleads practical decision support. Therefore, with the aim of improving the existing approach, an informative prior determination method that incorporates work experience is developed in this study.

METHODOLOGY
The research methodology of this study is shown in Figure 1. First, the Plateau learning curve model is selected to describe the relationship between operator quality performance and work experience. Then, a beta-binomial regression approach is used to derive the unknown parameters of the selected learning curve model. After that, informative priors are determined through the derived learning curve equation. Lastly, posterior distributions are computed by incorporating newly collected inspection data.


Plateau Learning Curve Modeling → Beta-Binomial Regression → Informative Prior Modeling → Posterior Distribution Determination
Figure 1. Research methodology flow chart.
Plateau Learning Curve Modeling: The Plateau model (Baloff 1971) is a linear-log model with a constant term that indicates the operator’s steady-state performance. It is selected to represent the relationship between welding operator quality performance and work experience, and is applicable in this research because operator quality performance reaches a steady state once operators gain enough practice. The Plateau model in this research is represented in Equation (3):
FN = A + B(n)^C (3)
where FN denotes the fraction nonconforming and n denotes the total number of welds. A, B, and C are unknown parameters that can be derived using the regression approach described in the next section.
Beta-Binomial Regression: Regression is a statistical technique for determining the relationship between a dependent variable and independent variables. For this research, the beta-binomial regression model is selected to derive the parameters of the Plateau model.
R's gamlss package (Stasinopoulos and Rigby 2018) allows all parameters of the distribution of the dependent variable to be modeled as non-linear functions of the independent variables (Rigby and Stasinopoulos 2005). In this study, the gamlss function is used to model the mean and the variation of the fraction nonconforming (i.e., the dependent variable) as a non-linear function of the total number of welds (i.e., the independent variable). Here, the variation of the fraction nonconforming is assumed to be the same for all values of total welds and is denoted σ_FN. Following Equation (3), the relationship between the mean value of the fraction nonconforming and the total number of welds can be represented as:
μ_FN = A + B(n)^C (4)
This equation defines an exclusive mean value of the fraction nonconforming for every operator based on their total number of welds (i.e., work experience).
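The paper fits this model with a beta-binomial regression in R's gamlss package; as a simplified, hedged stand-in, the sketch below fits the same Plateau form FN = A + B*n^C by nonlinear least squares to synthetic data. The parameter values and noise level are assumptions for illustration only.

```python
# Simplified stand-in for the gamlss beta-binomial regression: fit the
# Plateau model to synthetic (total welds, fraction nonconforming)
# data by nonlinear least squares. True parameters are assumed.
import numpy as np
from scipy.optimize import curve_fit

def plateau(n, A, B, C):
    """Plateau learning curve: fraction nonconforming = A + B * n**C."""
    return A + B * n**C

rng = np.random.default_rng(2)
n_welds = rng.integers(20, 500, size=57).astype(float)   # 57 operators
fn_obs = plateau(n_welds, 0.05, 0.8, -0.5) + rng.normal(0, 0.005, 57)

(A, B, C), _ = curve_fit(plateau, n_welds, fn_obs, p0=[0.05, 1.0, -0.5])
# A, B, C should recover roughly the assumed values (0.05, 0.8, -0.5)
```

A full reproduction would model the repair counts as beta-binomial with a dispersion parameter, as the paper does; least squares is used here only to keep the sketch self-contained.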
Informative Prior Modeling: In the Bayesian-based operator quality performance measurement approach, the prior distribution of the fraction nonconforming is represented with a beta distribution, as shown in Equation (5). This distribution can be reparametrized using μ and σ, where μ (Equation (6)) is the mean value of the beta distribution and σ (Equation (7)) represents the spread of the distribution. The reparametrized distribution is shown as Equation (8):
Beta(a, b) (5)
μ = a / (a + b) (6)
σ = 1 / (a + b) (7)
Beta(μ_FN,i / σ_FN, (1 - μ_FN,i) / σ_FN) (8)
In Equation (8), μ_FN,i is computed from Equation (4) and represents the mean value of the fraction nonconforming for operator i with total number of welds n_i. The reparametrized beta distribution is used as the informative prior distribution for the Bayesian-based approach.
Posterior Distribution Determination: After obtaining the prior distribution, the posterior
distribution can be computed by following the same process as discussed in Equation (2).

CASE STUDY
Data Source: This study uses the same dataset as Ji and AbouRizk (2018), which contains information from the engineering design system and quality management system of an industrial pipe fabrication company in Edmonton, Canada. The engineering design system stores pipe design attributes, which are grouped in the format (nominal pipe size, pipe schedule, material type, weld type) to represent a type of weld. The quality management system stores inspection records for various weld types; the records for a specified weld type can be summarized as the inspection results shown in Table 1. For a detailed description of the dataset and the data processing steps, readers are referred to the authors’ previous research (Ji and AbouRizk 2018).

Table 1. Sample inspection results for a specified weld type.


Operator ID Total number of welds Repaired welds Fraction nonconforming
1 208 9 0.043
2 141 5 0.035
… … … …

To verify that the learning effect exists in the studied dataset, the relationships between the fraction nonconforming and the total number of welds for four common weld types are shown as scatter plots in Figure 2. All of these plots demonstrate improved quality performance with increased work experience, which supports the motivation of this study.

Figure 2. Relationship between fraction nonconforming and total manufactured welds for
four commonly used weld types.
Main Outputs: To demonstrate the feasibility of the proposed methodology, the weld type with design attributes (STD, 2, A, BW) is selected and further analyzed, since it is the most common weld type in the studied dataset. The processed data indicate that 57 welding operators had experience in producing this weld type. In Figure 3, the line is the Plateau learning curve fitted using the beta-binomial regression. The value of σ_FN equals 0.184 × 10^-1, and the computed Plateau learning curve follows the equation:
μ_FN,i = 0.149 - 0.544 × 10^-2 (n_i)^0.5 (9)


Figure 3. Relationship between fraction nonconforming and total number of welds for weld
type (STD, 2, A, BW).

Table 2. Bayesian statistics using informative prior determination for weld type (STD, 2, A, BW).
Operator ID   n    X    μ_FN    Prior: Beta(a, b)   Posterior: Beta(X + a, n - X + b)   Posterior mean
1            175   25   0.077   (4.172, 50.005)     (29.172, 200.005)                   0.127
2            111   11   0.092   (4.966, 49.211)     (15.966, 149.211)                   0.097
…             …    …     …           …                    …                                …

Figure 4. Relationship of mean fraction nonconforming between informative prior approach
and noninformative prior approach.
Using Equation (9), the mean value of fraction nonconforming for a welding operator can be


computed based on their work experience (i.e., total number of welds). Table 2 shows sample
calculations for welding operators, including welding inspection information, the mean value of
fraction nonconforming (as per Equation (9)), the informative prior distribution (as per Equation
(8)), the posterior distribution (as per Equation (2)), and the mean value of fraction
nonconforming computed from the posterior distribution (as per Equation (6)).
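The Table 2 computations follow the standard conjugate beta-binomial update. A minimal Python sketch (the prior parameterization a = μ/σ, b = (1 − μ)/σ is inferred from the reported values rather than quoted from Equation (8), which is not reproduced in this chunk; function names are illustrative):

```python
def informative_prior(mu, sigma):
    # Beta prior with mean mu; the a = mu/sigma, b = (1 - mu)/sigma
    # parameterization is an inference from Table 2, not a quote of Eq. (8).
    return mu / sigma, (1.0 - mu) / sigma

def beta_binomial_posterior(a, b, n, x):
    # Conjugate update after observing x nonconforming welds out of n.
    return a + x, b + (n - x)

def beta_mean(a, b):
    # Mean of a Beta(a, b) distribution.
    return a / (a + b)
```

For operator 1 (n = 175, X = 25), the mean from Equation (9) with σ_FN ≈ 0.0184 gives a prior near Beta(4.17, 50.16), a posterior near Beta(29.17, 200.16), and a posterior mean of about 0.127, matching Table 2 up to rounding.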
Figure 4 illustrates differences between the proposed informative prior approach and the
previous noninformative prior approach as the total number of welds changes. In this figure, the
x-axis represents the mean value of fraction nonconforming computed using the noninformative
prior approach and the y-axis shows the mean value of fraction nonconforming computed using
the informative prior approach. The legend color represents the total number of welds (i.e., a
darker color represents a low total number of welds and a lighter color represents a high total
number of welds). The diagonal line represents situations where there is no difference between
the two approaches. From Figure 4, it is observed that the majority of the lighter points fall on
the diagonal line, which indicates that there is no significant difference between the two
approaches when the total number of welds is high. However, the deviations of the darker points
from the diagonal line indicate that the two approaches differ when the total number of welds is
low. In summary, when the total number of welds is low, the noninformative prior approach
cannot provide a quality measurement as reliable as the informative prior approach, since the
noninformative prior approach does not incorporate work experience.
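The small-sample contrast described above can be illustrated with a short sketch. Here the uniform Beta(1, 1) is assumed as the noninformative prior, which may differ from the exact prior used in the authors' earlier work, and the operator counts are hypothetical:

```python
def posterior_mean(a, b, n, x):
    # Posterior mean of fraction nonconforming under a Beta(a, b) prior
    # after observing x nonconforming welds out of n.
    return (a + x) / (a + b + n)

# Hypothetical novice operator: 1 repaired weld out of 10.
small_noninf = posterior_mean(1.0, 1.0, 10, 1)    # assumed uniform noninformative prior
small_inf = posterior_mean(4.19, 50.16, 10, 1)    # illustrative experience-based prior

# Hypothetical experienced operator: 100 repaired welds out of 1000.
large_noninf = posterior_mean(1.0, 1.0, 1000, 100)
large_inf = posterior_mean(4.19, 50.16, 1000, 100)
```

With few welds the prior dominates and the two estimates diverge; with many welds the data dominate and the estimates converge, which mirrors the pattern of Figure 4.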

Figure 5. Posterior distributions of fraction nonconforming for 17 welding operators: (a)
noninformative prior approach and (b) informative prior approach.


In the authors’ previous research, 17 welding operators who had manufactured weld type
(STD, 2, A, BW) were selected to demonstrate the feasibility of the noninformative prior
approach (Ji and AbouRizk 2018). For comparison purposes, the proposed informative prior
determination approach was applied to the same group of operators. The results of the two
approaches are shown as boxplots in Figure 5.
In Figure 5(a), welding operators are ordered in decreasing order of the mean values of their
fraction nonconforming, with ID numbers reassigned from 1 to 17. Compared to Figure 5(a),
multiple changes in rankings are observed in Figure 5(b). These changes demonstrate that work
experience does have an impact on welding operator quality performance, which indicates that
the informative prior approach measures operator quality performance more realistically than
the noninformative prior approach.

CONCLUSION
This research enhances the previously developed operator quality performance measurement
approach to reliably measure welding operator quality performance by incorporating the effect of
work experience. In this research, the Plateau learning curve model is utilized to represent the
relationship between operator quality performance and work experience. A quantitative
representation of the relationship is derived using the beta-binomial regression model. Based on
the relationship, informative priors are determined and then utilized to obtain posterior
distributions to reflect operator quality performance.
Academically, this research developed a systematic informative prior determination approach
which is capable of incorporating rich information to represent the variable of interest.
Practically, the proposed approach can be used to measure operator quality performance and
provide practitioners with guidance in decision-making processes for improved project quality
control.
Still, operator quality performance is subject to various factors (e.g., training levels and
working conditions). Therefore, future studies will identify and quantify factors related to
operator quality performance to achieve a better measurement of operator quality performance.

REFERENCES
Baloff, N. (1971). “Extension of the learning curve — some empirical results.” Oper. Res. Q.,
22(4), 329-340.
Ji, W., and AbouRizk, S. M. (2017). “Credible interval estimation for fraction nonconforming:
Analytical and numerical solutions.” Autom. Constr., 83 (Nov), 56–67.
Ji, W., AbouRizk, S. M., Zaïane, O. R., and Li, Y. (2018). “Complexity analysis approach for
prefabricated construction products using uncertain data clustering.” J. Constr. Eng. Manag.,
144(8), 04018063.
Ji, W., and AbouRizk, S. M. (2018). “Simulation-based analytics for quality control decision
support: Pipe welding case study.” J. Comput. Civil Eng., 32(3), 05018002.
Rigby, R. A., and Stasinopoulos, D. M. (2005). “Generalized additive models for location, scale
and shape.” Appl. Statist., 54(3), 507–554.
R Core Team (2018). R: A language and environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria.
Stasinopoulos, D. M., and Rigby, R. A. (2018). “Package ‘gamlss’.” <https://cran.r-
project.org/web/packages/gamlss/index.html> (Nov. 19, 2018).


Wang, P., Mohamed, Y., AbouRizk, S., and Rawa, A. (2009). “Flow production of pipe spool
fabrication: Simulation to support implementation of lean technique.” J. Constr. Eng.
Manag., 135(10), 1027–1038.
Winkler, R. L. (1967). “The assessment of prior distributions in Bayesian analysis.” J. Am. Stat.
Assoc., 62(319), 776–800.


Understanding Different Views on Emerging Technology Acceptance between Academia
and the AEC/FM Industry
Yong K. Cho, A.M.ASCE1; Youjin Jang, Ph.D.2; Kinam Kim3; Fernanda Leite, A.M.ASCE4; and
Steven Ayer, A.M.ASCE5

1School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA
30332-0355. E-mail: [email protected]
2School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA
30332-0355. E-mail: [email protected]
3School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA
30332-0355. E-mail: [email protected]
4Civil and Environmental Engineering, Univ. of Texas at Austin, TX 78712-1094. E-mail:
[email protected]
5School of Sustainable Engineering and the Built Environment, Arizona State Univ., Tempe, AZ
85281. E-mail: [email protected]

ABSTRACT
Technology plays an essential role in accelerating and improving various construction
processes, so both academia and industry have an interest in adopting and adapting technologies in
their research or projects. In this respect, academic and industry collaboration has increasingly
become essential in the construction industry. A critical step for academia-industry collaboration
is to understand how academia and industry accept and reject technologies differently. Therefore,
this study aims to examine the technology maturity gap between academia and industry. In a
partnership with the Construction Industry Institute’s Horizon-360 team, we investigated the
levels of acceptance of emerging technologies in academic research and the architecture,
engineering, construction, and facilities management (AEC/FM) industry through an onsite and
online survey, and classified the identified technologies according to relevance and maturity. The
survey results show that several technologies are viewed differently by academia and industry.
The findings from this study will serve as a basis for closing the gap and fostering more active
emerging-technology-oriented academia-industry cooperation.

INTRODUCTION
The construction industry plays a key role for governments in both developed and developing
economies in terms of creating new jobs, driving economic growth, and providing solutions to
address social, climate and energy challenges. However, overall productivity has remained
nearly stagnated for the last 50 years because the construction industry had been slower to adopt
and adapt to new technologies than other industry sectors. Recently, with the advent of a new
technological era known as the fourth industrial revolution (Industry 4.0), advances made in
technologies have gradually led to a change in the construction industry. Technologies, including
building information modeling (BIM), prefabrication, wireless sensors, 3D printing, and
automated and robotic equipment, are affecting the entire construction industry. These
technologies ensure functioning overdetermination of construction processes and help to
consolidate separated knowledge advancements made within time, cost, accessibility,
sustainability, thermal and visual comfort, buildability and maintainability into a holistic
decision-making tool. In this respect, technologies play an essential role in accelerating and


improving various construction processes with respect to productivity, so both academia and
industry have an interest in adopting and adapting technologies in their research and construction
projects.
However, academia has pressures including the growth in new knowledge and technologies,
the challenge of rising costs and funding problems, and the industry also has pressures including
rapid technological change and intense global competition (Ankrah and AL-Tabbaa 2015). These
pressures on both parties have led to a stimulus for academia-industry collaboration. Academia-
industry collaboration refers to the interaction between any parts of the higher education system
and industry aiming mainly to encourage knowledge and technology exchange (Bekkers and
Bodas Freitas 2008; Siegel et al. 2003). Leite et al. (2016) examined the knowledge gap between
academia and industry by surveying the seventeen grand challenges of advanced technologies. A
successful collaboration between academia and industry makes it easier for industry to
commercialize new technologies, and academia can benefit from the new technology research
directions and additional funding that industry provides (Bercovitz and Feldmann 2006; D'Este
and Perkmann 2011). In this respect, academia-industry collaboration has
increasingly become essential in the architecture, engineering, construction, and facilities
management (AEC/FM) industry.
Several researchers have studied academia-industry collaboration in the construction
industry, including an academia-industry collaborative research agenda (Lucko and Kaminsky
2015), the key challenges and roles of academia-industry collaboration (Arif et al. 2014; Sahpira
and Rosenfeld 2011), and the ways in which academia-industry partnerships change engineering
students’ knowledge and attitudes about corporate social responsibility (Smith et al. 2018). In
practice, one of the critical issues for academia-industry collaboration is to understand the
technology gaps between the two and to select technologies to exchange. Nevertheless, few
studies have focused on analyzing the specific technologies that have been adopted or are being
considered by academia and industry.
To address this problem, this study aims to examine the technology maturity gap between
academia and industry. We investigated the technologies that are positively or negatively
accepted by academia and industry and classified the identified technologies in accordance with
relevance and maturity. The remainder of this paper is structured as follows. The next section
describes the derivation process of the emerging technologies in the AEC/FM industry. Then, we
design the questionnaires and survey the experts. Finally, this paper provides the survey results
on the technology maturity and acceptance gap in academia and industry, accompanied by
closing remarks.

EMERGING TECHNOLOGIES IN THE AEC/FM INDUSTRY


The list of emerging technologies in the AEC/FM industry was derived from the work of the
Horizon–360 (H–360) team, a legacy group of Fully Integrated & Automated Technologies
(Fiatech) within the Construction Industry Institute (CII). The H–360 team sought and
identified technology-based solutions and innovative practices that were implementable and
cost-effective for both incremental and breakthrough industry advancement.
They have scanned the global economy to identify advanced technologies that would impact and
benefit the AEC/FM industry. The H–360 team derived twenty-three available technologies, and
nine of them are shown in Fig. 1. The details of the twenty-three technologies are as follows: 1)
Autonomous vehicles; 2) Exoskeleton; 3) Active bridge monitoring; 4) ISO 15926-
interoperability; 5) Smart materials; 6) Biomaterials; 7) Data automation tools (Equip DX); 8)
Energy storage density in batteries; 9) Augmented reality; 10) 3D printing; 11) Just in time


fabrication; 12) Real-time auto translation; 13) Smart vehicle equipment; 14) Get rid of drawings
by using information or machine; 15) SUAVs & UAS; 16) Artificial intelligence; 17) Robotics;
18) Modularization; 19) Blockchain; 20) Construction holodeck / 3D glasses HoloLens; 21)
Computer vision monitoring to support robotics/vehicles; 22) Virtual reality; and 23) Multi-user
collaboration (not VR). Specifically, Augmented reality (AR) is divided into ‘reality capture,’
‘AR BIM integration,’ ‘AR BIM integration analytics’ and ‘collaborative AR – holodeck.’ 3D
printing is divided into ‘3D printing polymers’, ‘3D printing cement’, ‘3D printing in other
materials’ and ‘3D printing models for new construction’. SUAVs UAS is divided into ‘UAVs
for condition assessment’, ‘UAVs for autonomous data capture’, ‘UAVs for scanning &
photogrammetry’, ‘UAVs for analytics of condition assessment’, ‘UAVs for construction’,
‘UAVs for outside use only’, ‘UAVs for inside use’, ‘UAVs for reality capture’, ‘UAVs for
reality capture with BIM integration’, ‘UAVs for analytics of performance monitoring’.
Artificial intelligence (AI) is divided into ‘AI for automated pipe routing’ and ‘neural
networks/deep learning products.’ Robotics is divided into ‘robotics ground preparation,’
‘robotics bricks laying,’ ‘robotics fabrication-mechanical piping,’ ‘robotics-welding,’ ‘layout
robot,’ and ‘material lifting robots.’

Fig. 1. Example of Technology Scan Matrix

METHOD
Questionnaire Design
To examine the technology gap, this study first designed a questionnaire for identifying
technologies that were actively or inactively investigated by academic research and adopted by
the AEC/FM industry. Respondents ranked each listed technology on a 4-point scale: a)
currently using; b) considering using in the near future; c) not using or considering, but would
like more information; and d) not interested in this technology. This study also designed a
questionnaire for classifying the technologies into watch, use, learn, and pilot phases according
to relevance and maturity. Fig. 2 illustrates the


four quadrants for technology classification and example results of classified technologies. Each
listed technology in this survey was categorized by respondents using a 4-point scale: a) use this
technology widely; b) implementing in most projects (more than 60%); c) implementing in a few
projects (less than 20%); and d) piloting only.

Fig. 2. Four Quadrants for Technology Classification (Source:
https://wikis.utexas.edu/display/H360/About+Horizon+360)
Expert Survey
Technology experts in academia and the AEC/FM industry participated in a survey. The
expert survey was conducted in two ways, a web-based survey and an in-booth survey, as
depicted in Fig. 3.

Fig. 3. Expert Survey (left: web-based survey; right: in-booth survey)


We distributed a web-based survey through e-mail to experts in academia. The authors and
H-360 team performed an in-booth survey on industry experts at the CII annual conference in
Indianapolis on July 24–25, 2018. The in-booth survey was conducted by having participants
place green, yellow, and red colored dots on the matrix. From a data analysis standpoint, we
considered green dots, yellow dots, red dots, and no dot to be equivalent to the 4-point scale
answers (a), (b), (c), and (d), respectively.
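The dot-to-scale mapping and the per-technology answer ratios can be sketched as follows. This is an illustrative Python fragment; the data structure and function name are assumptions, not part of the actual analysis:

```python
from collections import Counter

# Mapping of booth dot colors to the 4-point scale; None means no dot placed.
DOT_TO_SCALE = {"green": "a", "yellow": "b", "red": "c", None: "d"}

def response_ratios(dots):
    # Share of each 4-point-scale answer for one technology,
    # given the list of dot colors placed by participants.
    counts = Counter(DOT_TO_SCALE[d] for d in dots)
    total = len(dots)
    return {k: round(counts.get(k, 0) / total, 2) for k in "abcd"}
```

For a technology receiving 3 green, 4 yellow, and 2 red dots plus 1 blank response, this yields shares of 0.3, 0.4, 0.2, and 0.1 for answers (a) through (d).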

Fig. 4. Survey Results (top: academia; bottom: industry) *Note: 1) Autonomous vehicles; 2)
Exoskeleton; 3) Active bridge monitoring; 4) ISO 15926-interoperability; 5) Smart
materials; 6) Biomaterials; 7) Data automation tools (Equip DX); 8) Energy storage dens in
batteries; 9) Augmented reality; 10) 3D printing; 11) Just in time fabrication; 12) Real-time
auto translation; 13) Smart vehicle equipment; 14) Get rid of drawings by using
information or machine; 15) SUAVs & UAS; 16) Artificial intelligence; 17) Robotics; 18)
Modularization; 19) Blockchain; 20) Construction holodeck / 3D glasses HoloLens; 21)
Computer vision monitoring to support robotics/vehicles; 22) Virtual reality; and 23)
Multi-user collaboration.


RESULTS
The survey effort is still ongoing, so this paper presents the initial results of the analyzed
survey data. To date, fifteen participants from academia and fifty participants from industry,
including fifteen owners, fifteen from engineering, procurement, and construction (EPC), seven
service providers/vendors, and three others, have responded to the survey. We calculated the
ratio of the four answers for each technology. Fig. 4 shows the up-to-date survey results
collected from academia and industry, respectively.

Fig. 5. Technologies Gap between Academia and Industry


The survey responses from academia revealed that the technologies currently in use were ‘VR’
(43%), followed by ‘SUAVs & UAS’ (37%) and ‘computer vision monitoring’ (33%). The
technologies most often considered for use in the near future were ‘blockchain,’ ‘AR,’ and
‘modularization,’ which accounted for 47%, 44%, and 40%, respectively. Meanwhile, ‘smart
material’ (60%), ‘biomaterial’ (47%), and ‘interoperability’ (47%) were not being used or
considered, but academic researchers had an interest in these technologies. It was, however,
found that academic researchers had no interest at all in ‘robotics’ (87%), ‘computer vision
monitoring’ (86%), and ‘biomaterial’ (82%). On the other hand, survey responses from industry
indicated that ‘modularization’ (50%), ‘just in time fabrication’ (38%), and ‘SUAVs & UAS’
(35%) were highly used in respondents’ companies. ‘3D printing’ (32%), ‘get rid of drawings’
(32%), and ‘just in time fabrication’ (26%) were considered for adoption in the near future by
respondents not currently using them. Companies were not using or considering ‘autonomous
vehicles’ (40%), ‘get rid of drawings’ (24%), and ‘exoskeleton’ (22%), but would like more
information about them. However, companies showed low interest in ‘robotics’ (87%),
‘computer vision monitoring’ (86%), and ‘biomaterial’ (82%).
More specifically, ‘Modularization’ (43%) and ‘VR’ (29%) had the largest gaps between
academia and industry, as illustrated in Fig. 5. In other words, while companies are widely
adopting ‘Modularization,’ academic researchers are not investigating it. On the contrary, ‘VR’
is widely used in academic research but relatively little in companies. ‘Energy storage density in
batteries’ (0%), ‘blockchain’ (0%), and ‘robotics’ (0%) had no gap, since they are not currently
used in either academia or industry. ‘SUAVs & UAS’ (1%) showed almost no gap in current use
between academia and industry.
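The technology gap reported here is the difference in the share of “currently using” responses between the two groups. A minimal sketch of that computation (the sample shares below are hypothetical, not the survey data):

```python
def adoption_gaps(academia, industry):
    # Absolute gap in the "currently using" share for each technology
    # present in both dictionaries (technology -> fraction of respondents).
    return {t: round(abs(academia[t] - industry[t]), 2)
            for t in academia.keys() & industry.keys()}

# Hypothetical shares; illustrative values only, not the survey data.
academia_share = {"VR": 0.43, "Modularization": 0.07, "Blockchain": 0.0}
industry_share = {"VR": 0.14, "Modularization": 0.50, "Blockchain": 0.0}
```

With these illustrative inputs, a technology used by neither group has a gap of zero, while one adopted heavily by only one group shows a large gap.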
The survey results reveal apparent gaps in the technologies that academia and industry are
interested in. Academia is mainly using or investigating ‘Virtual reality,’ whereas the industry is
widely using ‘Modularization.’ It is noteworthy that academia was currently using or
investigating ‘Computer vision monitoring,’ but companies were not interested in it.
Nevertheless, ‘SUAVs & UAS’ is widely used in both academia and industry at present.
Overall, it is also found that academia is more interested in cutting-edge technologies than
industry.

CONCLUSION
With the importance of adopting and adapting emerging technologies in the AEC/FM
industry, the academia-industry partnership has become increasingly essential. To examine the
technology maturity gap between academia and industry, this study identified the technologies
that are actively and inactively accepted by academic research and the AEC/FM industry and
classified the identified technologies according to relevance and maturity. The survey results
showed that there are several technologies that academia and industry view differently. The
technology mainly used in academia is ‘Virtual reality,’ whereas in industry it is
‘Modularization.’ In the case of ‘Computer vision monitoring,’ academia uses it widely, but
industry has very little interest in it. ‘SUAVs & UAS’ is the only technology that both academia
and industry widely use at present. It is noteworthy that academia is more interested in newer
technologies than industry.
This paper provided the initial results of the survey analyzed to date. These initial results
showed that gaps exist between academia and industry in construction technologies. Since this
study is still at an early stage, the results introduced in this paper could be biased. As a follow-up
effort, the survey results will therefore be revisited with a larger pool, and statistical analysis
such as analysis of variance (ANOVA) will be conducted. Technology gaps between academia
and industry will be discussed further by classifying technologies according to relevance and
maturity. It is expected that the findings from this study will serve as a basis for closing the gap
arising from technology acceptance disagreement between academia and industry and pave the
way for promoting technology-oriented academia-industry cooperation.

ACKNOWLEDGMENT
The authors appreciate the Construction Industry Institute (CII)’s Horizon-360 committee for
their support and input, as well as for sharing survey data for this collaborative study. Any
opinions, findings, conclusions, or recommendations expressed in this material are those of the
authors and do not necessarily reflect those of the CII.

REFERENCES
Ankrah, S., and AL-Tabbaa, O. (2015). “Universities–industry collaboration: A systematic
review.” Scandinavian Journal of Management, 31, 387–408.
Bekkers, R., and Bodas Freitas, I. (2008). “Analyzing knowledge transfer channels between
universities and industry: To what degree do sectors also matter?” Research Policy, 37,
1837–1853.
Bercovitz, J., and Feldmann, M. (2006). “Entrepreneurial universities and technology transfer: A
conceptual framework for understanding knowledge-based economic development.” The
Journal of Technology Transfer, 31(2), 175–188.
D’Este, P., and Perkmann, M. (2011). "Why do academics engage with industry? The
entrepreneurial university and individual motivations." The Journal of Technology Transfer,
36(3), 316–339.
Siegel, D., Waldman, D., and Link, A. (2003). “Assessing the impact of organizational practices
on the relative productivity of university technology transfer offices: An exploratory study.”
Research Policy, 32, 27–48.
Leite, F., Cho, Y., Behzadan, A. H., Lee, S., Choe, S., Fang, Y., Akhavian, R., and Hwang, S.
(2016). “Visualization, information modeling, and simulation: Grand challenges in the
construction industry.” Journal of Computing in Civil Engineering, 30(6), 04016035.
Lucko, G., and Kaminsky, J. A. (2015). “Construction Engineering Conference and Workshop
2014: Setting an industry-academic collaborative research agenda.” Journal of Construction
Engineering and Management, 142(4), 04015096.
Sahpira, A. and Rosenfeld, Y. (2011). “Achieving construction innovation through academia-
industry cooperation – key to success.” Journal of Professional Issues in Engineering
Education & Practice, 137(4), 223–231.
