
Text Mining and Natural Language Processing in Construction


Automation in Construction 158 (2024) 105200

Contents lists available at ScienceDirect

Automation in Construction
journal homepage: www.elsevier.com/locate/autcon

Review

Text mining and natural language processing in construction


Alireza Shamshiri, Kyeong Rok Ryu *, June Young Park
Department of Civil Engineering, The University of Texas at Arlington, 416 Yates Street, Arlington, TX 76019, USA

A R T I C L E I N F O

Keywords: Text mining; Natural language processing; Machine learning; Computational linguistics; Language models; Construction; Project management

A B S T R A C T

Text mining (TM) and natural language processing (NLP) have stirred interest within the construction field, as they offer enhanced capabilities for managing and analyzing text-based information. This highlights the need for a systematic review to identify the status quo, gaps, and future directions from the perspective of construction management. A review was conducted by aligning the objectives of 205 publications with the specific domains, areas, tasks, and processes outlined in construction management practices. This review reveals multiple facets of the construction sector empowered by TM/NLP approaches and highlights essential voids demanding consideration for automation possibilities and minimizing manual tasks. Ultimately, following identified obstacles, the review results indicate potential research opportunities: (1) strengthening overlooked construction aspects, (2) coupling diverse data formats, and (3) leveraging pre-trained language models and reinforcement learning. The findings will provide vital insights, fostering further progress in TM/NLP research and its applications in academia and industry.

1. Introduction

Since construction projects involve long multi-phased delivery among various stakeholders, they typically generate significant amounts of information. Textual information is the dominant data type that exists in every stage of construction management, with over 80% of it being unstructured [1]. Text data is stored in different structures, formats, and sizes, such as e-mails, drawings, and contracts, across construction projects through different phases for specific goals. Retrieving a specific piece of textual information from documents is critical for project parties to successfully carry out the project. Lack of proper and integrated information exchange and analysis in construction management in complicated business environments can lead to poor communication and performance throughout the project lifecycle [2]. Furthermore, many construction activities and processes are still performed either manually by an operator or semi-automatically, which is inefficient and labor-intensive. The rapidly growing amount of construction textual information has amplified the need for big data analytical tools. The emergence of advanced technologies such as text analytics in construction has sparked discussion on the digitalization and automation of construction management due to the increasing amount of construction text data.

Textual analytics has its roots in the late 1940s, with the introduction of machine translation and information retrieval. By 1961, computational linguistics had evolved to encompass various aspects, including morphology, syntax, and semantics [3]. From 1993 onwards, TM and NLP have garnered significant attention due to their remarkable advancements and diverse applications in various sectors such as health and law [4–7]. These applications range from analyzing clinical notes, prescriptions, and legal cases to the development of intelligent personal assistants like Siri and Alexa. In light of these advancements, the field of construction research has also witnessed a growing interest in the analysis of construction documents through TM and NLP techniques in recent years. Text mining extracts information from unstructured and structured text without semantic considerations, and NLP is the subfield of both machine learning and linguistics that can process, understand, and simulate human language abilities through computational techniques [3]. A variety of text-based analyses can be performed by NLP to understand the linguistic knowledge of humans based on the level of analysis, including semantic, syntactic, morphological, and lexical analysis [3]. Furthermore, NLP can perform: (1) an array of text processing tasks, including tokenization, stemming, and part-of-speech (POS) tagging; (2) information retrieval (IR), information extraction (IE), and speech recognition; and (3) higher-level text analysis using both rule-based and machine learning (ML) methods [1,8]. Progressive advancements in text analytics and NLP have driven progress in construction-related studies, enabling various construction domains to achieve a degree of automation in recent years [9]. Previous

* Corresponding author.
E-mail address: [email protected] (K.R. Ryu).

https://doi.org/10.1016/j.autcon.2023.105200
Received 8 March 2023; Received in revised form 6 November 2023; Accepted 13 November 2023
Available online 21 November 2023
0926-5805/© 2023 Elsevier B.V. All rights reserved.

text-based studies have addressed numerous challenges and issues within the construction field by utilizing TM/NLP methods. However, owing to the novelty of applying TM and NLP in construction and the fact that construction encompasses a wide range of tasks and processes across its various areas and domains, it is crucial to gain insight into the extent to which TM and NLP are actively applied to the areas, tasks, and processes, as well as to identify the aspects where TM and NLP have been less explored. Hence, this review is structured as follows to address these gaps: Section 1 provides an overview of the existing literature pertaining to the use of TM/NLP in the field of construction. Additionally, this section presents a rationale for the aims of this review. In Section 2, the objective and methodology of the review are presented. Following that, in Section 3, analyses of the studies in light of the processes and tasks of construction domains and areas are presented. The findings are discussed, and the review is concluded in Sections 4 and 5, respectively.

1.1. Related reviews

Given the rising interest in leveraging NLP and TM in construction [7], several scholars have directed their attention toward a detailed exploration of diverse NLP and TM tools and techniques within the construction sector or within specific domains. Baek et al. examined text-based research, reviewing methods, data sources, challenges, and future applications of text analytics in construction [7]. Wu et al. delved into the mainstream applications of NLP in smart construction by reviewing the various stages of NLP implementation within construction project texts and documents [1]. This encompassed crucial steps in NLP implementation, methods for information and relation extraction, exploration of information/document retrieval techniques, and downstream NLP applications. Ding et al. conducted a scientometric analysis of NLP applications in construction, with a primary emphasis on data sources, tools, technologies, and various applications [9]. Chung et al. performed a systematic review by comparing the utilization of NLP in the construction sector with the most recent advancements in NLP within the domain of computer science [10]. As previous reviews have pointed out, addressing the challenges and advancements in integrating NLP into future construction studies necessitates a collective endeavor focused on the seamless integration of diverse data sources, the effective utilization of pre-trained language models, and the implementation of NLP to automate and enhance construction processes. Other researchers have focused on specific domains of construction. Hassan et al. reviewed the application of NLP in construction legal issues and contracts, encompassing historical legal case analysis, violation detection in construction regulations, and regulatory code and contract review [11]. In a similar vein, Locatelli et al. explored NLP's potential and applications in the context of building information modeling (BIM) [12]. Dinis et al. also conducted a review of recent developments in semantic enrichment applications and systems for BIM [13].

1.2. Contribution of this review

Prior reviews have predominantly directed their focus toward the implementation of TM and NLP in the construction sector through the lens of computer science. This concentration revolved around the thorough examination and assessment of the advancements in state-of-the-art NLP and TM techniques and algorithms as found within the construction publications. Despite offering recommendations aimed at advancing the automation and intelligence of language models and text analytics, along with identifying construction domains where NLP and TM could diminish the need for manual investigation, these reviews have shortcomings in several regards. Firstly, these reviews lack consistency and do not comprehensively explore the subject from the perspective of construction management concepts. Secondly, no previous review has undertaken a comprehensive examination of the applications of TM and NLP in construction areas, tasks, and processes from a construction and project management standpoint, similar to what has been done in the previous reviews within the context of computer science. This gap in research leaves critical aspects unexplored, where TM and NLP could potentially bring significant benefits to construction management. Consequently, this review aims to address the knowledge gap, as there has been no systematic review conducted from the perspective of construction management to comparatively assess various areas, tasks, and processes in which TM and NLP have been implemented or not. The core value of this review is to identify the construction management domains, areas, tasks, and processes where TM/NLP have been applied. Consequently, it aims to shed light on detecting gaps that need addressing to minimize manual operations and boost automation within various construction domains in future research investigations. This review is expected to provide a comprehensive and up-to-date reference for TM/NLP applications and capabilities in construction management concepts and propose future research directions by highlighting the concepts that have received the least attention in current TM/NLP research.

2. Review methodology

In this review, construction and NLP/TM-related terminologies were integrated to identify the most pertinent publications. Fig. 1 provides an overview of the publication selection process, which unfolds as follows: Initially, Google Scholar and ProQuest were used to collect relevant articles that utilized NLP/TM techniques within the field of construction. Subsequently, a combination of text processing and construction-specific keywords was employed to scrutinize these publications. To facilitate this, several construction management practices were adopted, including “A Guide to the Project Management Body of Knowledge” (PMBOK Guide) [14], Construction Industry Institute Handbook [15], Total Facility Management [16], Construction Contracts [17], Principles of Construction Safety [18], Construction Codes and Inspection Handbook [19], and BIM and Construction Management [20]. These practices served two primary purposes: firstly, to identify a comprehensive array of construction-related terms that would bolster coverage of potential subject areas and domains; and secondly, to align the objectives of the collected papers with the domains, tasks, areas, and processes outlined in these practices for identifying contributions and gaps in construction-related studies.

The construction-related terms used in the paper collection include: Safety Management; Risk Management; Asset Management; Cost Management; Cost Overrun; Schedule Management; Schedule delays; Facility Management; Information Management; Document Management; Quality Management; Sustainability; Environmental; Green Building; Communication; Stakeholder Management; Public Engagement; Procurement Management; Supply Management; Modularization, Prefabrication, or Precast; Integration Management; Resource Management; Human Resource; Project Management; Knowledge Management; Materials Management; Contract Management; Claims and dispute Management; Project Controls; Bidding or Tender; Change Management; Operation and Maintenance; Construction; Designing; Closeout; Pre-construction; Post-construction; Scope Management; Project Planning; Designing; and Productivity. Conversely, the TM/NLP-related terminology included: Natural Language Processing; Text Mining; Text Analytics; Syntactic Analysis; Semantic Analysis; Information Retrieval; Information Extraction; Text Classification; Natural Language Understanding; Natural Language Generation; Sentiment Analysis; and Topic Modeling.

Subsequently, considering terminological variations, a comprehensive examination of logical combinations of the construction and TM/NLP-related terms was conducted. Because certain keywords can carry distinct interpretations within interdisciplinary studies (e.g., cost estimation in the business sector and construction in computer science), the selection was restricted to papers focused exclusively on construction-related topics. Following the extraction of all relevant papers, irrelevant papers were excluded through manual screening. In the subsequent phase, each paper's objective was manually reviewed, aligning with the


Fig. 1. The overview of paper collection and review procedure.
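The keyword-combination step of the collection procedure summarised in Fig. 1 can be sketched as follows. This is a minimal illustration rather than the authors' actual search script: the two term lists are short excerpts of the lists given in Section 2, and the `build_queries` helper and the quoted `"A" AND "B"` query syntax are assumptions made for readability.

```python
from itertools import product

# Excerpts of the two term lists given in Section 2 (not the full lists).
CONSTRUCTION_TERMS = ["Safety Management", "Cost Management", "Contract Management"]
TM_NLP_TERMS = ["Natural Language Processing", "Text Mining", "Topic Modeling"]


def build_queries(construction_terms, tm_nlp_terms):
    """Pair every construction term with every TM/NLP term.

    The quoted-phrase AND syntax is a hypothetical, generic Boolean format;
    a real search engine may require a different syntax.
    """
    return [
        f'"{c}" AND "{t}"'
        for c, t in product(construction_terms, tm_nlp_terms)
    ]


queries = build_queries(CONSTRUCTION_TERMS, TM_NLP_TERMS)
for q in queries:
    print(q)
```

Hits returned for such queries would still need the manual screening described in the text, since a keyword such as "construction" has unrelated meanings in other fields.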

processes, tasks, domains, and areas provided in the construction practices. Over a span of two decades, from January 2002 to August 2023, 205 papers were selected from the mentioned data sources for this review. The analysis revealed 12 primary domains/areas, where TM/NLP applications intersected with the construction field. The implications of publications in terms of tasks and processes in each specific domain/area will be discussed in Section 3.

2.1. Overall trend of literature collection

Fig. 2 depicts the trajectory of the paper collection, which can be categorized as the inception, take-off, and expedition periods. The first paper emerged in 2002, and the overall publication trend remained relatively stable until 2013, with merely ten publications over the eleven-year span of the inception phase. However, the upswing in TM/NLP applications within the construction sector began in 2013, coinciding with the introduction of the Word2Vec word embedding method and the initiation of neural network applications in NLP. A year later, another word embedding technique, Global Vectors (GloVe), was introduced. In the take-off period, starting in 2013, research on TM/NLP applications showed an increasing pace and extended until 2018. Starting the expedition phase in 2018, word embedding techniques met further developments with the introduction of bidirectional encoder representations from transformers (BERT) and embeddings from language models (ELMo), resulting in a substantial rise in the number of publications. Despite the pivotal advancements in the context of large language models (LLMs), such as the chat generative pre-trained transformer (ChatGPT) in late 2022, investigating how LLMs influence text-based research in the construction field is premature as of the time of writing this review. Fig. 2 shows how TM/NLP applications in the construction sector align with the developments in techniques, particularly in the context of word embedding methods. Moreover, the introduction of models such as RoBERTa and ALBERT in 2019 underscores how BERT-based language models have empowered NLP applications in construction-related studies as effective encoding tools. While there have been developments in fine-tuning BERT-based models for domain-specific applications like health and law [6,21], the construction industry still lacks a fine-tuned BERT model tailored to construction-specific corpora. Although the number of publications decreased slightly in 2022 and 2023 (as of the time of writing), advancements in LLMs such as GPT make it conceivable that future developments will comprise both encoding and decoding advances, rather than focusing solely on encodings. Still, researchers have shown an increasing interest in applying TM/NLP in construction research since 2018.

Fig. 3 provides an overview of the distribution patterns seen across

Fig. 2. Evolution of TM/NLP applications in construction research along with the introduction of key algorithms.


Fig. 3. Number of publications by domain/area/task covered.
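The publication shares that Section 2.1 reports for Fig. 3 follow from simple arithmetic on the paper counts. The sketch below recomputes the three shares for which the text gives explicit counts and sums the five reported percentages. The variable names are illustrative assumptions, and because the text does not state paper counts for BIM or for knowledge, document, and information management, only their reported percentages are used.

```python
TOTAL_PAPERS = 205

# Paper counts stated explicitly in the text.
reported_counts = {
    "Safety management": 42,
    "Automated compliance checking": 37,
    "Contract management": 30,
}

# Percentages as reported in the text for the five largest domains.
reported_percent = {
    "Safety management": 20,
    "Automated compliance checking": 18,
    "Contract management": 14,
    "BIM": 12,
    "Knowledge, document, and information management": 11,
}

# Share of the 205 papers for each domain with a known count.
shares = {d: 100 * n / TOTAL_PAPERS for d, n in reported_counts.items()}
for domain, share in shares.items():
    print(f"{domain}: {share:.1f}% of {TOTAL_PAPERS} papers")

# Sum of the percentages the text reports for the five domains.
combined = sum(reported_percent.values())
print(f"Combined reported share of the five domains: {combined}%")
```

The recomputed shares differ slightly from the whole-number figures in the text, which appear to have been shortened to integers.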

various domains, tasks, and areas in the field of construction. The largest segment, comprising approximately 20% (42 papers) of the total 205 publications, is dedicated to construction safety management. This area commands the highest share of research attention. The second focal point among the published papers is automated compliance checking, which encompasses 18% (37 papers) of the total publications. Additionally, there is a notable emphasis on contract management, comprising 14% (30 papers) of the publications. Furthermore, within the realm of BIM, studies that leverage TM and NLP techniques account for 12% of the overall publications. Additionally, knowledge, document, and information management collectively comprise 11% of the total publications. In summary, five primary domains (safety management, automated compliance checking, contract management, BIM, and knowledge, document, and information management) stand out as the main areas of focus, accounting for over 75% of the published papers.

3. TM and NLP applications in construction

This section delves into the ramifications and value-added insights derived from the selected publications, particularly in the context of domains, areas, tasks, and processes associated with each thematic category, all while assessing the utilization of TM/NLP techniques. Furthermore, a comparative analysis of each paper's objectives and contributions to construction management practices is conducted to identify areas, tasks, and processes where TM and NLP have yet to find application.

3.1. Cost management

Cost management involves several vital processes, including cost management planning, cost estimation, budget determination, and cost control [14]. Scholars have developed models to facilitate and automate cost-related processes, such as the semantic NLP-enabled model developed by Akanbi and Zhang, which matches design information from construction specifications with materials to enhance cost estimation within a BIM environment [22]. Their information extraction and matching (IEM) algorithm reduces specification information extraction time to 5.56% of the time required by the comparable traditional approach. Jafari et al. proposed a model for predicting the overhead costs and time to complete reporting requirements from specifications and contracts using Monte Carlo simulation, rule-based, and ML-based classification methods [23]. XGBoost achieved the highest recall accuracy among tested classification methods, making it a suitable choice during the bidding and contract phase before accepting the agreement. Tang et al. also proposed an IE model utilizing named entity recognition (NER) for automating quantity take-off based on RSMeans cost items [24]. These three studies implemented NLP tasks to improve the cost estimating step. Williams and Gong's model is suitable for the budget determination step of cost management and performs well for projects with significant cost overruns [25]. In brief, as illustrated in Table 1 and Fig. 4, while TM and NLP have been extensively utilized for BIM-based cost estimation, budgeting, and cost overrun estimation, their application to improve the cost control process, accounting practices, cost reporting, financial statement analysis, and overhead, direct, and indirect calculations has not been widely pursued.

3.2. Schedule management

Schedule management encompasses planning, defining activities, sequencing activities, estimating activity resources, estimating activity durations, and developing schedules [14]. Hong et al. compared various clustering methods for grouping construction activities, consequently enhancing scheduling quality [26]. This undertaking also contributes to the improvement of the activity definition process. Models incorporating Long Short-Term Memory (LSTM) have additionally been suggested to automate activity dependencies, making them suitable for automating the sequence of activities. Amer and Golparvar-Fard provided a model to automate the logical dependency of activities in order to model dynamic construction work templates from historical project schedules for new project scheduling and design optimization [27]. Amer et al. proposed a model for checking the quality of scheduling and automating the logical dependencies of unordered activities [28]. In a separate investigation, Amer et al. developed a model capable of generating look-ahead planning activities based on the input master activity prompt [29]. The model employs a combination of distance-based matching and transformer architecture [29]. Prieto et al. employed the generative pre-trained transformer (GPT-2) model to generate a construction schedule from a defined scope of work [30]. The outcomes of the created schedule are subsequently assessed by participants, considering the quality and logical connections of the dependencies. In short, scheduling-related studies have predominantly applied TM/NLP techniques to schedule activities and dependencies. Nevertheless, when taking into account the whole set of scheduling management tasks and processes, as depicted in Table 1 and Fig. 4, it becomes evident that TM/NLP remains insufficiently leveraged for planning, estimating activity durations and resources, BIM-based


Table 1
Summary of studies in schedule, cost, quality, and advanced work packaging.

Area: Cost Management
Implications: Cost estimation, budgeting, quantity take-off, BIM
Data Sources: Specifications, biddings, contracts, RSMeans cost items
Objectives: Supporting cost estimation, predicting cost overruns, automating quantity take-off, predicting the time and overhead costs of reporting requirements
Ref.: [22–25]

Area: Schedule Management
Implications: Defining, classifying, sequencing, and generating activities
Data Sources: Schedules' activities, master schedules, scope of work
Objectives: Automating the logical dependencies, comparing activities clustering techniques, generating look-ahead planning activities
Ref.: [26–30]

Area: Quality Management
Implications: Quality management planning, quality control and assurance, BIM
Data Sources: Standard specification, building quality complaints, inspection and supervision reports, Chinese national standards
Objectives: Automating quality requirements and extraction of constraints, classification of quality issues and complaints
Ref.: [31–35]

Area: Advanced work packaging
Implications: Constraint Management, CWP, IWP, and EWP, automating work packaging, modular construction, BIM
Data Sources: Working plans, bridge rehabilitation manuals, workflows of module production, schedules, bill of materials and quantities
Objectives: Improving constraint management, automating constraint information extraction, automating work packaging in modular construction
Ref.: [36–39]

Fig. 4. Connections between identified and unidentified schedule, cost, quality, and advanced work packaging processes and tasks.

scheduling, schedule overruns, and delay analysis.

3.3. Quality management

Quality management has three procedures: planning, quality assurance, and quality control [14]. As depicted in Table 1 and Fig. 4, researchers utilized TM/NLP techniques for almost all processes. For example, Jeon et al. employed Word2Vec, GloVe, convolutional neural networks (CNN), and recurrent neural networks (RNN) to automate quality requirements extraction from specifications and convert them into a checklist [31]. Their model automates three quality management processes simultaneously, including quality management planning, quality assurance, and quality control. The model proposed by Zhong et al. is suitable for quality assurance. They used hybrid bidirectional LSTM (Bi-LSTM), conditional random field (CRF), LSTM, and multilayer perceptron (MLP) together for automating the extraction of procedural constraints from regulations [32]. Zhong et al. utilized CNN, Bayes-based, and SVM classifiers for automatic classification of building quality complaints and identifying the semantic features [33]. Their model is ideal for the quality control step to categorize deficiencies for further improvement. Lin et al. developed a framework for on-site issues utilizing topic modeling on inspection records to explore the issues and concerns over time [34]. Zhang et al. proposed a model for extracting and classifying quality issues by employing BERT, Word2Vec, deep learning, and ML algorithms on supervision reports to support quality control and compliance checking tasks [35]. Their approach makes the quality control step efficient in terms of updating and understanding the issues for improvement actions. In general, TM/NLP techniques can automate quality management processes. However, complying with quality standards still requires further attention.

3.4. Advanced work packaging (AWP)

AWP is concerned with a systematic approach for enhancing


controllability, predictability, and productivity by integrating construction activities through the project lifecycle. The Construction Industry Institute (CII) classified work packages with various constraints into several groups: construction work area (CWA), construction work package (CWP), engineering work package (EWP), procurement work package (PWP), and installation work package (IWP) [15]. Three studies were discovered with the aim of enhancing and automating AWP for constraint management and modular construction. Wu et al. automated constraint information extraction to automate AWP [36]. In an alternative approach, a hybrid model combining Bi-LSTM-CRF and rule-based techniques was developed to enhance AWP by automating constraint information extraction [37]. Previous studies have primarily focused on automating extraction and identification of relations among CWP, IWP, and EWP for constraint management. Yet, the constraints related to procurement and CWA packages were not thoroughly addressed. Moreover, the processes for prioritizing and resolving constraints require more attention through constraint modeling. While previous studies emphasized constraint management, Li et al. proposed a model for the automatic generation of optimal work packages for modular construction project progresses [38]. In an alternative study, a dynamic knowledge graph-based approach for generating work packages in modular construction was proposed by employing graph convolutional networks (GCN) and Bi-LSTM [39]. The automatic package generation developed by Li et al. improved productivity and performance, progress tracking, and team engagement. Gaps still exist in applying TM/NLP as a tool for automating AWP tasks, specifically for cost, time, quality, safety, risk analysis, and front-end planning, as illustrated in Table 1 and Fig. 4.

3.5. Facility operations and maintenance

Sensors and user input are two major types of data that facility managers collect to assess facility management. Computerized maintenance management systems (CMMS) contain valuable user input information such as work order logs and maintenance requests. CMMS, along with other information and communication technology (ICT)-based tools like computer-aided facility management (CAFM), integrated workplace management systems (IWMSs), and enterprise asset management (EAM) [16], offers insights into user activities, financial management, and operational task requests. Publications in facility management predominantly focus on managing operational tasks by relying on CMMS using NLP. Researchers have implemented text analytics for automating the assignment of proper staff to work orders to enhance the productivity of maintenance and operations. Mo et al. proposed a model using ML methods for automating staff assignments for maintenance tasks using maintenance records [40]. This approach is applicable for human-building interactions in the long term and can be applied in the preconstruction phase to allocate the right staff for change order requests. Bouabdallaoui et al. implemented pre-trained word embedding and a combination of CNN and LSTM to classify service requests and assign them to the qualified technician [41].

Researchers have also developed models for supporting the preventive maintenance approach by extracting information from CMMS, focusing on analyzing and classifying maintenance requests based on building characteristics, types, and locations where faults occurred. Gunay et al. utilized TM for extracting failure occurrences and patterns of facility equipment [42]. Bortolini and Forcada utilized text mining and statistical techniques on maintenance requests to evaluate building systems that can help employ preventive strategies before a maintenance request is submitted [43]. Marocco and Garofolo proposed a

and prioritizing maintenance requests [46].

The assessment and benchmarking of maintenance operations were also conducted to enhance the decision-making processes. Nojedehi et al. visualized and benchmarked the maintenance performance by analyzing recorded work orders using a rule-based classifier, which is helpful at the portfolio level [47]. Dutta et al. also performed sentiment analysis and topic modeling methods on surveys and work orders to evaluate the performance of maintenance operations based on occupants' feedback and complaints [48]. Chen and Tsai developed a chatbot by integrating NLP and BIM to improve information delivery between users and facility management platforms [49]. Bazzan et al. developed an information management model for classifying complaints, leading to enhancements in data collection and analysis [50]. Studies applying IE and text mining methods to explore relationships among entities in inspection reports were also identified, aiming to enhance maintenance decision-making [51]. Overall, maintenance management services were the primary focus of the studies, as seen in Table 2 and Fig. 5. Yet, TM and NLP have not been applied to general management, security, leisure, and landscaping services.

3.6. Stakeholder management

Identifying, engaging, managing, and monitoring internal and external stakeholders are essential processes of successful project stakeholder management [14]. Xue et al. developed a dynamic stakeholder-associated topic modeling approach by employing a topic-over-time model based on Latent Dirichlet Allocation (LDA) and a scoring system to assess the relevance between public concerns and project stakeholders over three phases, including planning, construction, and handover [52]. Zhou et al. presented a framework utilizing LDA for online public opinion mining [53]. Both studies provided managerial suggestions for public engagement. Using sentiment analysis and LDA, frameworks have been developed to collect and analyze the stakeholders' and public's attitudes about the projects. For instance, Wan et al. performed spatiotemporal analysis using LDA and lexicon-based sentiment analysis on the South-to-North water diversion project located in China [54]. Moreover, a framework was developed utilizing LDA to assess governments' attitudes toward the impacts of expressway construction projects on the natural environment [55]. Ren et al. assessed sentiment changes to identify barriers obstructing the progress of modular construction [56]. Unlike previous studies, topic modeling and sentiment analysis were implemented to explore competencies and demand for project manager roles in construction [57]. According to Table 2 and Fig. 5, while pertinent studies have mainly centered on identifying and managing public opinion and engagement, planning for stakeholder engagement has received the least attention.

3.7. Risk management

Risk management processes include planning, risk identification, qualitative and quantitative risk analysis, risk response planning, and risk control [14]. The publications centered on retrieving, identifying, and analyzing risks. Models have been proposed to improve risk retrieval systems and identify similar risks for making predictions based on historical risk cases. For instance, a risk retrieval model was introduced to improve case-based reasoning (CBR) limitations [58], and an alternate method was introduced for measuring the risk similarity from historical risk register data [59]. Erfani and Cui constructed a model to automate generating risk templates [60]. Zhou et al. also employed deep learning and knowledge-based BERT methods to develop a model
model to extract maintenance information from work orders to identify capable of generating risk responses [61]. Matthews et al. developed an
room locations containing fault components [44]. D'Orazio et al. ontology for rework risk analysis in transportation projects by per­
implemented sentiment analysis using different lexicons to automati­ forming topic modeling [62]. Jallan and Ashuri performed text analytics
cally detect service requests based on their severity [45]. They also for risk identification and analysis for publicly traded construction
evaluated how text preprocessing methods impact the performance of companies from annual Securities and Exchange Commission 10-K re­
machine learning techniques in automatically classifying work orders ports [63]. NLP was also implemented for the risk identification and


Table 2
Summary of facility operations and maintenance, and stakeholder management studies.

| Area | Implications | Data Sources | Objectives | Ref. |
|---|---|---|---|---|
| Facility Operations and Maintenance | Preventive maintenance, staff assignment, performance measurement, BIM, quality and safety management | Service requests, work orders, tenant surveys, inspection and repair reports, complaints, BIM models | Staff allocation, maintenance request management and classification, proposing models to improving bridge deterioration prediction, user's satisfaction measurement, chatbot development, O&M data analysis | [40–51] |
| Stakeholder Management | Identify, evaluate, and manage (public and governments) opinions | Microblogs, Twitter, Weibo posts, environmental impact assessments | Proposing frameworks to identify and analyze public concerns and give managerial recommendations, analyze social media and explore public attitude and barriers in construction industry | [52–57] |

Fig. 5. Relationships between identified and unidentified stakeholder-related and facility operations and maintenance processes and tasks.

quantification of New Engineering Contract (NEC) projects in the U.K. [64]. While a major portion of risk studies has primarily focused on risk identification, quantitative risk analysis, and risk response approaches, there is insufficient development in qualitative risk analysis, risk response, and risk control processes, as shown in Table 3 and Fig. 6.

3.8. Contract management

Researchers referred to standard forms of contracts, such as the International Federation of Consulting Engineers (FIDIC), for text-based contractual studies. In addition, general conditions, specifications, and change orders are among the primary construction documents applied in contractual studies. Scholars employed different rule-based analytics as the primary method for identifying risk-prone, vague, and contractor-friendly clauses. Clause-related publications utilized expert opinions to validate and evaluate their models. The proposed models are suitable for use both before the contract agreement and during the bidding phase. Furthermore, it seems that the existing lexicons suffer from broader contract-related terms that should be developed for semantic contract analysis. Candaş and Tokdemir implemented rule-based and ML classifiers to identify vague terms of contracts [65]. Change management was also considered by employing change orders [66]. For instance, Ko et al. proposed an NLP-driven model using CRF to identify and extract change reasons and altered work items [66]. Their model is not only suitable for classifying and archiving but is also operational for retrieving similar changes when a change occurs. Specific efforts were made to automate specification reviewing, which is an essential task in risk management. The pertinent studies relied heavily on applications of two embedding methods, Doc2Vec and Word2Vec, and NER for developing automatic specification review systems [67]. For instance, Moon et al. employed Word2Vec as an input for developing an NER model based on BiLSTM for automating construction specification review [68]. Another concentration of the researchers is the analysis of contractual obligations specified in contracts, which subsequently assists in the selection of standard forms [69].

Models have been established to extract and classify contractual requirements [8,70]. Hassan and Le presented models using a wide range of ML, deep neural networks, and rule-based methods to extract and classify the contractual requirements [8]. Furthermore, Candaş and Tokdemir's model automates the classification of contractual requirements based on company departments, leading to enhanced contract review efficiency [70]. In addition to automating risk identification approaches within engineering-procurement-construction (EPC) contract clauses [71], studies have been identified focusing on the risks of EPC projects in the bidding phase [72]. For example, a model was designed to predict bidding risks from pre-bid request for information (RFI) documents [72,73]. Son and Lee proposed a model to estimate EPC schedule delay risks during the bidding process [73]. They used the vector space model (VSM) to assign feature weights to the vectorized words and regression. Frameworks have also been developed for EPC projects to identify and analyze unilateral changes and design risks from specifications [74]. Contractual semantics have been improved through the development of a taxonomy, which has consequently led to the development of ontologies for further contractual text analytics [75]. Moreover, Al Qady and Kandil utilized concept relation identification using shallow parsing to extract semantic knowledge for improving contract management, document management, and IR [76]. Pham et al. created a BERT-based model capable of handling and making semantic predictions of contractual risks [77]. Fu et al. suggested a model aimed at enhancing comprehension of the connections between contract complexity and various contract variables [78]. LDA was also performed to find the patterns of construction defect litigation [79]. In general, scholars primarily applied TM/NLP methods to enhance contract risk management and identify risks in diverse contracts and construction documents. However, managing construction contracts during the construction phase, claims, and disputes received the least attention when considering contractual procedures in construction [17]. Furthermore, there is room for improvement in applying text analytics to other types of contract documents, such as special conditions, drawings, integrated project delivery contracts, schedules, and delay contract terms for project control and delay analysis, as depicted in Table 3 and Fig. 6.


Table 3
Summary of risk, contract, and safety management studies.

| Area | Implication | Data Source | Objective | Ref. |
|---|---|---|---|---|
| Risk Management | Risk cases retrieval system, risk (identification and quantification, and response) | Risk cases, risk registers, risk factor disclosures | Proposing methods to identify, classify, and quantify risks; risk similarity measurement; query system framework for retrieving risk cases; risk response and template generator | [58–64] |
| Contract Management | Contract risk management, change management, contractual obligations analysis, contractual requirements, contractual semantics, knowledge management, construction-defect litigation analysis | Contracts (FIDIC, EPC, AIA, DB, EPC, NSW GC21), standard forms of contracts, invitation to bid, lessons learned, scope of work, specifications, change orders, request for information, number of issued addendum, litigation cases | Proposing models to analyze risk-prone contract clauses and changes in contracts, proposing models for automating construction specifications reviewing, enhancing semantics in contractual knowledge, automating classification of the contract requirements, contract complexity analysis | [8,65–79] |
| Safety Management | Construction safety scenarios classification, domain knowledge improvement, risk management, information extraction, BIM, preventive safety approach, accident case retrieval and question-answering system | Safety (accident, injury, hazard, and near-miss) reports, safety and health plans, accident news reports, fire accidents, images, specifications | Identifying and extracting injury precursors and outcomes of safety risk factors, proposing models to automate classification of construction accidents (causes, near-misses, work zone crashes, accident narratives), job hazard analysis, proposing models to find the hazard and accident causes, assess severity and frequency of the construction safety risks | [80–99] |

3.9. Safety management

Text classification and information extraction methods have been widely employed on construction site accidents and injury reports. Automating the extraction, identification, and prediction of safety risk factors are among the core themes of safety-related studies. Researchers introduced IE methods for extracting knowledge from accident reports to improve safety domain knowledge [80]. In addition, retrieval systems have been proposed for retrieving accident cases and supporting health and safety plan preparation [81]. Tian et al. have created an intelligent question-answering system that employs BERT and bidirectional gated recurrent units to automatically recommend safety hazard management measures [82]. Models were developed to identify the frequency and severity of construction safety risks both before and during construction operations [83]. A wide range of deep learning and ML-based models have been developed to extract precursors and outcomes from incident reports based on various attributes, including injury types, body parts, and sources [84]. Notably, deep learning has proven to be effective in extracting injury precursors and making safety predictions from injury reports [85]. Using LDA, the models identified accident precursors and safety risk factors [86]. Additionally, approaches have been presented that can be run with a small dataset to extract risks from accident reports [87,88]. Researchers have also aimed to support and improve safety issues during the design phase in the BIM environment for a better decision-making and accident prevention approach.

Text classification methods were implemented for (1) classification of accident causes, (2) improving job hazard analysis, (3) accident narratives, and (4) near misses and risky behaviors. In particular, ML and deep learning methods have been utilized to classify accident causes and injury types [89]. Sayad et al. proposed a model using text mining techniques to classify work zone crashes [90]. Near-miss information from safety reports has been classified using deep learning by researchers, which is useful for preventive safety approaches. Fang et al. developed a BERT-based model to classify near-misses [91]. Chen et al. also implemented classifiers based on near-miss data from hydropower station projects [92]. Classification algorithms have also been employed to analyze accident narratives [93–95] and the linear support vector machine (SVM) model achieved the highest accuracy compared to the other classifiers [93]. In contrast, the proposed CNN model by Zhong et al. outperformed other classifiers [94]. They also utilized network analysis using LDA to visualize narratives. While several shallow and deep learning methods have been utilized in classifying near misses, SVM and CNN achieved the highest accuracy [95]. Furthermore, ontology-based methods have been implemented to classify hazardous activities to improve job hazard analysis [96]. Fire accidents at construction sites were analyzed, and causes were identified [97]. Analyses have also been conducted to explore the risk factors and their interconnections in rail transit construction projects, with the aim of improving safety risk management [98]. In addition, scholars utilized and integrated computer vision and NLP for hazard identification, classification, and retrieval systems from image processing to support safety management [99]. Overall, considering safety procedures in construction [18], as illustrated in Table 3, Fig. 6, monitoring, progressive improvements, and safety-related issues in training and management levels held the lowest degree of prominence.

3.10. Building information modeling (BIM)

Scholars extensively utilized NLP to develop intelligent query answering systems (QAS) and search engines to extract and retrieve information from BIM. For instance, QAS for retrieving BIM model information [100] and a search engine for BIM model objects have been designed [101]. On the other hand, Shin and Issa proposed an intelligent automatic speech recognition system for BIM [102]. Their model is capable of changing objects and their attributed materials by speaking. Wang et al. developed a voice-based QAS based on automatic speech

Fig. 6. Relationships between identified and unidentified risk, contract, and safety processes and tasks.

recognition (ASR) and a pre-trained optimized BERT for retrieving BIM information and categorizing queries into predefined categories [103]. The proposed model can also be applied to safety management. Wong et al. proposed a voice-enabled interactive real-time location sharing system using ASR and natural language understanding (NLU) for fire emergencies [104]. Additionally, other chatbots have been created using BERT and NLU to acquire information for decision-making at the managerial level [105,106]. Query systems and retrieval models have been tailored to facilitate BIM design. Liu et al. introduced a retrieval method for online BIM product model libraries [107]. A novel approach has been developed by Yin et al., utilizing a graph neural network for semantic parsing, enabling automated alignment between queries and ontology to retrieve BIM models [108]. Additionally, a semantic parsing technique was developed that is capable of converting intricate queries into executable structured queries for retrieving BIM models [109]. Gao et al. proposed an automatic semantic annotation model for online BIM documents [110]. Zhang et al. also used term frequency and inverse document frequency (TF-IDF) and parsing methods to retrieve patterns from BIM user logs to measure productivity [111]. Wang et al. employed CBR and BERT to capture and retrieve knowledge in BIM [112].

Scholars have also suggested classification models for BIM objects and case studies [113]. Efforts, in particular, have been made to implement TF-IDF and TM methods for improving the interoperability and integration between Industry Foundation Classes (IFC) as a data exchange format for BIM software tools and City Geography Markup Language (CityGML) [114]. Studies also utilized NLP based on IFC to match facility locations on the BIM models [115] and to automate the identification and validation of change requests [116]. Following previous progressions, Yin et al. developed a model to enhance the BIM ontology and align different properties of BIM models with IFC using an ontology learning approach [117]. Social network analysis has been employed to determine essential roles and skills to enhance energy efficiency training through implementing BIM in construction [118]. Similarly, IR and association rule mining (ARM) have been utilized by employing singular value decomposition of the term-document matrix for analyzing BIM roles and skills [119]. To sum up, in addition to the previous contributions, improvements have been made to facility management, cost, quality, safety, and compliance checking within the BIM environment. Considering the BIM environment and its applications [20], and as shown in Table 4 and Fig. 7, the use of TM/NLP in developing various aspects of BIM has not been progressive, specifically in terms of improving communication and early engagement of stakeholders, which have received the least attention.

3.11. Knowledge/document/information management

The first TM/NLP implementations in construction belong to the classification of construction documents. As depicted in Table 4 and Fig. 7, various TM/NLP techniques have been used to improve construction information management systems (CIMS), document management systems (DMS), and QAS in construction. Typically, knowledge management processes consist of locating and accessing, capturing, representing, sharing, and creating new knowledge [15,120]. Text analytics have been applied to improve DMS by proposing different classification methods to classify random construction documents and automate document classification. Al Qady and Kandil, in several studies, developed methods for classifying construction documents based on text semantics and similarities [121]. A framework was also proposed to classify transportation data terms based on semantic similarity [122]. Through text analysis, Xue et al. established a project data-sharing framework to elevate communication among stakeholders in the context of smart construction [123]. Text analytics was also applied to visualize the construction documents' information.

Information retrieval, as an essential element of knowledge extraction and management, has been widely used for retrieving architecture, engineering, and construction (AEC) information and documents by developing retrieval and question-answering systems. For instance, Torkanfar and Azar proposed a method using different similarity measurements that is capable of finding similar projects based on their work breakdown structure (WBS) [124]. Ko et al. employed BERT to assess the similarity of project scope statements, a method that recommends similar previously executed projects for the early pre-construction phase [125]. Choudhary et al. presented a QAS using knowledge discovery and TM methods and general architecture for text engineering (GATE)


Table 4
Summary of BIM, knowledge/document/information management, automated compliance checking studies.

| Area | Implications | Data Sources | Objectives | Ref. |
|---|---|---|---|---|
| Building Information Modeling (BIM) | Information and model retrieval, query systems and chatbots, integration, classification, facility management | BIM (models, queries, product model libraries, online documents, objects, case studies, publications, user comments), tweets, BIM-related jobs, surveys | Developing NLP-based query answering system and search engines, automatic speech recognition model, improving IFC properties, analyzing required roles for BIM, proposing models for enhancing libraries and objects, developing parsing techniques to enhance BIM model retrieval | [100–119] |
| Knowledge/Document/Information Management | Document classification, text visualization, information retrieval, query system, data sharing | Meeting minutes, change orders, claims, scope statements, project reviews, weekly reports, engineering guidelines, online construction products and web pages, NCREE documents, CAD and WBS cases, LEED cases | Proposing methods for automating construction document and term classification, proposing retrieval systems, proposing methods to enhance machine translation in earthquake engineering domain, proposing project data sharing framework | [121–129] |
| Automated Compliance Checking | Regulatory information (extraction, classification, and transformation), BIM, question-answering, rule checking, safety management, violation detection | Regulatory documents and codes (IBC, EPA, OSHA, IECC, International Energy Conservation Code), safety regulations, utility glossaries and specifications, construction activities, jobsite data, health building notes | Proposing models to support ACC by improving POS tagging, regulatory text classification and extraction, extending rule checking, automating reasoning, developing question-answering system for building regulations, ACC enhancement and safety compliance checking in BIM, developing taxonomy for ambiguity in building requirements, addressing spatial reasoning constraints, safety regulations relation extraction | [131–147] |

Fig. 7. Relationships between identified and unidentified BIM, knowledge/document/information management, automated compliance checking processes
and tasks.

software to extract knowledge from post-project reviews [126]. Query systems have also been designed to retrieve information from the web. Demian and Fruchter utilized text analytics to retrieve objects in product models by measuring the relevance of different project features using latent semantic indexing to support design reuse [127]. Search queries have also been created to retrieve construction product manufacturing information from the web pages and online construction-related information in English or French [128]. Lin et al. created a reference collection considering matching semantics from Chinese to English to improve cross-language IR (CLIR) in AEC. Additionally, Lin et al. evaluated the quality of the produced machine translation through CLIR applications [129]. Overall, positioning, data sharing, and information accessibility processes received the least attention.

3.12. Automated compliance checking (ACC)

Construction designs and executions should comply with the codes and regulations in different areas, such as safety, design, environment,


and be controlled subsequently [19,130]. With the advances in computing technologies, ACC has attracted attention among researchers. Researchers have focused on the automatic extraction of regulatory concepts and information transformation using text analytics to facilitate ACC. Two studies primarily focused on improving the performance of POS tagging on building codes to improve rule conversion and IE in ACC. Xue and Zhang investigated the performance of POS tagging using a rule-based transformational method [131]. Later, they implemented deep learning to improve POS tagging [132]. Text classification methods were also employed by researchers to support ACC. They implemented and compared different ML classifiers, term weighting schemes, and ontologies for improving semantics and classifications in ACC. In addition, a clustering model has been proposed for the computability analysis of building codes [133].

Automating information extraction and transformation is another approach for the application of text analytics [134–136]. In this context, various methods, including rule-based NLP, ontology-based IE, and bidirectional LSTM-CRF, have been implemented. Zhang et al. developed a taxonomy for addressing ambiguity in building requirements [137]. The IE model developed by Zhou and El-Gohary achieved satisfactory accuracy in extracting energy efficiency requirements [135]. Zhang and El-Gohary also utilized rule-based NLP and deep learning methods to extract and transform regulatory information into logic clause elements and hierarchies for the full automation of ACC [138]. Wang et al. introduced BiLSTM-CNN and CNN-driven models designed for the automatic extraction and representation of safety requirements from construction safety regulations, with the capability to detect compliance violations [139,140]. Xue and Zhang developed a ruleset extension method using pattern matching-based rules that can be applied to various codes and regulatory requirements [141]. Furthermore, a semantic rule-based IE method was proposed to support ACC from construction procedural documents [136]. Several attempts have been made to make ACC even more intelligent and fully automated. Two studies developed systems for chatbots and automated generation of intelligent code. Zhong et al.'s system can answer questions and queries regarding building regulations [142]. Zhang and El-Gohary's model is capable of generating intelligent building code requirements [143]. Models have also been developed to facilitate spatial reasoning through text analytics. This involves the extraction of utility specifications and regulations that encompass spatial configurations while taking constraint relationships into consideration. Subsequently, these extracted details are transformed into machine-readable spatial rules to automate the compliance checking process [144]. Xu and Cai developed and enriched an ontology using LSTM and CRF for utility infrastructure, which is helpful for expanding the semantics and interoperability in the utility infrastructure domain [145]. Implementing ACC in the BIM environment is also prominent. These studies focused on developing computer-interpretable rules through extracting design and safety information regulations for ACC [134,146]. Additionally, methodologies were crafted within the IFC schema framework to enhance rule extraction and interpretation in the BIM setting [147]. As shown in Table 4 and Fig. 7, the reviewed studies predominantly support ACC systems and rules in the design phase. Despite two studies suggesting models for predicting compliance violations, it is imperative to underscore the significance of compliance violation checking within construction phase tasks.

4. Discussion

In this section, the challenges, gaps, and consequent suggestions for improving the seamless implementation of TM and NLP in construction are discussed.

4.1. Algorithmic variations for future direction

Supervised and unsupervised algorithms have already found applications in various NLP tasks within the field of construction management. However, the potential of reinforcement learning (RL) remains untapped in the construction field. Recently, RL has demonstrated remarkable advancements in NLP, achieving human-level performance. These developments in RL have made it feasible to incorporate RL into a wide range of NLP tasks, including information retrieval, QAS, and search engines; text generation; machine translation; sentiment analysis; text summarization for long documents; and IoT [148]. Additionally, the capabilities of RL in making sequential decisions based on textual information can be implemented in risk management, scheduling, cost estimation, bidding, resource allocation and leveling, and constraint management. Furthermore, information retrieval and QAS were widely applied in BIM and knowledge, document, and information management studies. While RL typically demands a substantial amount of data, which can complicate its implementation, it can potentially evolve IR and text generation applications within the field of construction.

Another approach to the current challenges in fully automating and developing construction is employing NLG in the context of NLP. Employing NLG has several applications, such as text and summary generation, ASR, machine translation, chatbots, and QAS, and has important roles in employing ASR and text-to-speech (TTS) synthesis. For instance, ChatGPT, a well-known model among LLMs, can be used to generate text-based information related to construction codes, materials, safety, and education. It is important to note that ChatGPT was trained on a large-scale dataset up to September 2021. However, this substantial volume of web data can result in biases, constraints in the produced results, and an inflated performance [149]. Still, scholars have already begun to utilize information retrieval techniques for BIM and the generation of construction schedules based on GPT prompts. It is highly recommended to employ LLMs such as GPT-4, Claude 2, and the LLaMA family to explore their potential applications in the construction sector.

On the other hand, advances in leveraging deep learning have made significant progress in performing NLP tasks. In contrast to handcrafted and other ML algorithms, LSTM-based models can be applied to higher-level text classification, NLG, query-answering systems, sentiment analysis, and machine translation. Moreover, the breakthrough of pre-trained models has made NLP applications more developed in several ways, such as reaching high performance and accuracy in a short time. For example, the recent utilization of pre-trained language models such as BERT, ELMo, GPT, and XLNet has advanced NLP tasks compared to traditional techniques. Another advantage of utilizing pre-trained language models is their ability to address the limitations posed by small datasets. Therefore, it is advisable to employ a range of pre-trained language models at different levels to address the challenges encountered in text-based studies related to construction. Furthermore, given the abundance of construction text data and advancements in other fields that introduced BERT-based models to specific domains, it is recommended to pretrain a BERT-based model tailored to construction corpora in order to enhance text analytics within the various construction areas and domains.

4.2. Data collection challenges

Text analytic studies and their subjects heavily rely on textual information datasets, which are the cornerstone of text-based studies. Yet, text-based studies are still not fully developed using different types of datasets because the level of accessibility to the information is limited and varies due to confidentiality and sensitivity. Therefore, one of the obstacles in construction-related text analytics is the lack of a real-world dataset of any kind and any part. For each domain/area of construction TM/NLP-related publications, researchers used specific data types, and the subject of published studies follows the availability of that area dataset. One of the solutions to overcome data limitations is combining textual information with other types. For instance, Williams and Gong, and Son and Lee combined textual and numerical data in their studies that can be used and applied in other areas such as schedule, cost, and

risk-related studies due to the significant connection between the textual and numerical information [25,72]. On the other hand, textual information and images can be utilized specifically for safety studies and can be generalized to other areas of construction such as monitoring and risk management as well. Furthermore, the majority of datasets used in facility operations and maintenance studies relied on user input, including work orders and service requests that exist in CMMS. Therefore, a sensor-based dataset with textual information on different facility management platforms can be the next step in applying TM and NLP. Voice-based information is another type of data that can be used for text analytics in the construction sector and can be utilized for retrieval and query-answering systems to improve communication, BIM, knowledge, and document and information management systems. Therefore, future studies can leverage voice and audio data and collect them through social media and communication platforms. Another partial solution would be developing models that work with a small dataset. Studies also primarily relied on social media for public opinion analysis. However, it is suggested to collect physical surveys and compare different online data sources such as news to achieve comprehensive results. Various features of a single online source, such as Twitter pictures and emojis, can also be utilized to tackle online data collection challenges and enhance sentiment analysis. Moreover, future studies can benefit from legal online databases such as “Westlaw” and “LexisNexis” to support dispute resolution and litigation in construction.

4.3. Current gaps and future suggestions for construction domains, areas, tasks, and processes

To fully benefit from text analytics in construction, it is necessary to identify the stages, processes, and tasks within each construction domain/area in terms of TM and NLP utilization. This will facilitate the development of a holistic system that is more intelligent and less dependent on human intervention. The following suggestions and discussions will elucidate the processes and tasks within current construction domains and areas where TM and NLP have either not been employed or received insufficient attention.

Construction scheduling-related studies focused on activities and their sequences. It is suggested to improve all schedule management processes, specifically for automating schedule control, schedule overruns, progress reporting, schedule development, and identifying, classifying, and assigning resources for schedules. Another recommendation is related to delay analysis in scheduling using contractual and delay clauses and terms. In addition to the previously mentioned recommendations, it is proposed to implement text analytics in scheduling, aligning it with other processes such as cost and resource management to enhance automation within scheduling. Furthermore, it is recommended to apply text analytics to both quantitative and qualitative scheduling analysis. For instance, cost estimation and prediction can be automated by identifying types of schedule activities or generating cost-effective activities using text analytics while considering their durations as well. This approach can also be extended to other aspects of construction management, such as resource management, by generating and allocating resources to the schedule or by automating the classification of resources based on other attributes present in schedules and costs. Cost-related studies also focused on cost estimation and cost overrun tasks. Yet future studies are encouraged to leverage TM/NLP for automating cost control, cost reporting, accounting, and identifying and classifying direct, indirect, and overhead costs. Future research in quality-related issues can implement TM/NLP using on-site inspections and quality control data to address facility operations and maintenance, compliance checking, and closeout phase issues.

Using TM and NLP in facility operation and maintenance will create a promising solution to address the grand challenge of occupant comfort management. As numerous data sources are generated by citizens on social media, it is easy to understand the public's opinions of facility operations. Incorporating more human opinions in the built environment is an ideal direction to support the paradigm shift in human-centric operations. In addition, other aspects of facility management systems and services, such as CAFM, IWMS, EAM, and general management, can also be enhanced. Contract-related studies predominantly concentrated on contract risk management and analyzing contract standard forms and clauses during the bidding stage. Future studies are encouraged to analyze other forms of contract and practices such as integrated project delivery (IPD) contracts, improve contract management during the lifecycle of the project, especially during the construction phase, and enhance contract management processes such as contract drafting, reporting updates, search and retrieval systems, review, and documentation. On the other hand, since disputes are so critical in construction projects, it is suggested to employ TM/NLP for dispute resolution, such as litigation and legal disputes among parties. Researchers can also employ TM/NLP for conflict resolution between different regulations and identify compliance violations to support ACC in the future. Predicting the consequences of actions and safety-related factors in construction operations can be another focus point for researchers in safety-related studies in the future. Further recommendations can involve the concurrent utilization of speech recognition, voice-to-text, and image processing methods to enhance safety management through text analytics and computer vision.

Stakeholder management and public opinion analysis can also be developed by predicting sentiments and performing temporal-spatial analysis of sentiments and topics in the future. It is also suggested to collect both online and in-person data to fill the gaps in data collection and achieve more comprehensive results. Studies have specifically empowered BIM through the development of retrieval systems, query-answering systems, and the improvement of interoperability. However, only a few studies were found to improve BIM's functions and dimensions, including facility, safety, cost, quality, and automated compliance checking. Therefore, it is suggested to take advantage of text analytics to improve different dimensions and aspects of BIM for scheduling, cost, safety, quality, contracting, engagement, and communication.

While TM/NLP have found applications in various domains and areas, there are still areas where they have yet to be employed. Resource management, scope management, constraint management, communication management, integration management, procurement management, commissioning and startup, materials management, and project control are among the construction areas where text analytics has not been employed in a systematic and progressive approach. Taken as a whole, applications of TM/NLP have significantly improved construction tasks and processes.

4.4. Limitation of this review

There are limitations in conducting this review. First, the existing construction areas and domains consist of interrelated and complex tasks and processes. This review only focuses on generally identified tasks and processes. There are obviously other tasks and processes in different construction areas that have not been mentioned, and a more detailed explanation is out of the scope of this review. Second, some articles might have been missed during the collection process due to the use of different terms and TM/NLP methods. Third, given the rapid advances in TM/NLP, some papers might be published after the preparation of this review, which might consequently cause inconsistency with its current findings.

5. Conclusions

This paper provides a comprehensive review of the TM/NLP applications in the construction domains, areas, tasks, and processes and reveals research areas where implementing text analytics would enhance automation in construction. First, a total of 205 papers were selected that utilized TM and NLP in the construction sector. The
collected papers were then classified based on their application and implication to each specific area and domain of construction management, as outlined in construction management practices. In the next stage, each paper was categorized with respect to its implications and contributions to processes, tasks, areas, and domains as outlined in construction practices. Based on our review results, we discussed the trends and implications of TM and NLP-related papers employed in various construction domains and areas over the past two decades. In Section 3, applied TM and NLP publications with respect to their implications and applications to each construction area and domain were comprehensively introduced and discussed in detail. In brief, this review provides a comprehensive reference and understanding of current TM/NLP potentials and gaps and proposes future directions, including constraint management, scope management, integration management, resource management, communication management, materials management, procurement management, commissioning and startup, and project control. In the end, the contributions of this review are twofold: (1) identifying the current status of TM/NLP applications in construction by comparing the focus points of publications and their contributions to processes and tasks existing in construction domains and areas; and (2) giving recommendations for future construction studies to make them more automated, intelligent, and less human-dependent and error-prone by identifying gaps in the current studies. Moreover, the limitations of this review include not covering all the tasks and processes, missing publications in the review procedure due to the interrelated terms among different sectors and publication time.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The journal articles analyzed in this review were collected using the Google Scholar and ProQuest and, therefore, they are publicly available.

References

[1] C. Wu, X. Li, Y. Guo, J. Wang, Z. Ren, M. Wang, Z. Yang, Natural language processing for smart construction: current status and future directions, Automation in Construction 134 (2022), 104059, https://doi.org/10.1016/j.autcon.2021.104059.
[2] N. Craig, J. Sommerville, Information management systems on construction projects: case reviews, Records Management Journal 16 (2006) 131–148, https://doi.org/10.1108/09565690610713192.
[3] K.R. Chowdhary, Natural language processing, in: K.R. Chowdhary (Ed.), Fundamentals of Artificial Intelligence, Springer, India, New Delhi, 2020, pp. 603–649, https://doi.org/10.1007/978-81-322-3972-7_19.
[4] N. Tavabi, J. Pruneski, S. Golchin, M. Singh, R. Sanborn, B. Heyworth, A. Kimia, A. Kiapour, Building large-scale registries from unstructured clinical notes using a low-resource natural language processing pipeline, MedRxiv (2022) 2012–2022, https://doi.org/10.1101/2022.12.23.22283914.
[5] S. Suresh, N. Tavabi, S. Golchin, L. Gilreath, R. Garcia-Andujar, A. Kim, J. Murray, B. Bacevich, A. Kiapour, Intermediate Domain Finetuning for Weakly Supervised Domain-adaptive Clinical NER, in: The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, 2023, pp. 320–325, https://doi.org/10.18653/v1/2023.bionlp-1.29.
[6] I. Chalkidis, M. Fergadiotis, P. Malakasiotis, N. Aletras, I. Androutsopoulos, LEGAL-BERT: The muppets straight out of law school, in: Findings of the Association for Computational Linguistics: EMNLP, 2020, pp. 2898–2904, https://doi.org/10.48550/arXiv.2010.02559.
[7] S. Baek, W. Jung, S.H. Han, A critical review of text-based research in construction: data source, analysis method, and implications, Automation in Construction 132 (2021), 103915, https://doi.org/10.1016/J.AUTCON.2021.103915.
[8] F. Ul Hassan, T. Le, Automated Requirements Identification from Construction Contract Documents Using Natural Language Processing, Journal of Legal Affairs and Dispute Resolution in Engineering and Construction 12, 2020, p. 04520009, https://doi.org/10.1061/(asce)la.1943-4170.0000379.
[9] Y. Ding, J. Ma, X. Luo, Applications of natural language processing in construction, Automation in Construction 136 (2022), 104169, https://doi.org/10.1016/j.autcon.2022.104169.
[10] S. Chung, S. Moon, J. Kim, J. Kim, S. Lim, S. Chi, Comparing natural language processing (NLP) applications in construction and computer science using preferred reporting items for systematic reviews (PRISMA), Automation in Construction 154 (2023), 105020, https://doi.org/10.1016/j.autcon.2023.105020.
[11] F. Ul Hassan, T. Le, X. Lv, Addressing legal and contractual matters in construction using natural language processing: a critical review, Journal of Construction Engineering and Management 147 (2021) 3121004, https://doi.org/10.1061/(ASCE)CO.1943-7862.0002122.
[12] M. Locatelli, E. Seghezzi, L. Pellegrini, L.C. Tagliabue, G.M. Di Giuda, Exploring natural language processing in construction and integration with building information modeling: a Scientometric analysis, Buildings 11 (2021) 583, https://doi.org/10.3390/buildings11120583.
[13] F.M. Dinis, J. Poças Martins, A.S. Guimarães, B. Rangel, BIM and semantic enrichment methods and applications: a review of recent developments, Archives of Computational Methods in Engineering 29 (2022) 879–895, https://doi.org/10.1007/s11831-021-09595-6.
[14] Project Management Institute, A Guide to the Project Management Body of Knowledge (PMBOK guide), 7th edition, Project Management Institute, 2022 (ISBN: 1628256648).
[15] Construction Industry Institute, Knowledge areas, in: Construction Industry Institute Knowledge Base, 2017. https://www.construction-institute.org/resources/knowledgebase (accessed March 4, 2022).
[16] B. Atkin, A. Brooks, Total Facility Management, 5th edition, Wiley-Blackwell, 2021 (ISBN: 1119707943).
[17] W. Hughes, J. Murdoch, Construction Contracts: Law and Management, 4th edition, Routledge, 2007 (ISBN: 0415393698).
[18] A.S.J. Holt, Principles of Construction Safety, 1st edition, Wiley-Blackwell, 2005 (ISBN: 1405134461).
[19] G. Taylor, Construction Codes & Inspection Handbook, 1st edition, McGraw Hill, 2006 (ISBN: 0071468250).
[20] B. Hardin, D. McCool, BIM and Construction Management: Proven Tools, Methods, and Workflows, 2nd edition, John Wiley & Sons, 2015 (ISBN: 1118942760).
[21] S. Golchin, M. Surdeanu, N. Tavabi, A. Kiapour, Do not mask randomly: effective domain-adaptive pre-training by masking in-domain keywords, in: Proceedings of the 8th Workshop on Representation Learning for NLP, Association for Computational Linguistics, 2023, pp. 13–21, https://doi.org/10.18653/v1/2023.repl4nlp-1.2.
[22] T. Akanbi, J. Zhang, Design information extraction from construction specifications to support cost estimation, Automation in Construction 131 (2021), 103835, https://doi.org/10.1016/j.autcon.2021.103835.
[23] P. Jafari, M. Al Hattab, E. Mohamed, S. AbouRizk, Automated extraction and time-cost prediction of contractual reporting requirements in construction using natural language processing and simulation, Applied Sciences 11 (2021) 6188, https://doi.org/10.3390/app11136188.
[24] S. Tang, H. Liu, M. Almatared, O. Abudayyeh, Z. Lei, A. Fong, Towards automated construction quantity take-off: an integrated approach to information extraction from work descriptions, Buildings 12 (2022) 354, https://doi.org/10.3390/buildings12030354.
[25] T.P. Williams, J. Gong, Predicting construction cost overruns using text mining, numerical data and ensemble classifiers, Automation in Construction 43 (2014) 23–29, https://doi.org/10.1016/j.autcon.2014.02.014.
[26] Y. Hong, H. Xie, G. Bhumbra, I. Brilakis, Comparing natural language processing methods to cluster construction schedules, Journal of Construction Engineering and Management 147 (2021) 4021136, https://doi.org/10.1061/(ASCE)CO.1943-7862.0002165.
[27] F. Amer, M. Golparvar-Fard, Modeling dynamic construction work template from existing scheduling records via sequential machine learning, Advanced Engineering Informatics 47 (2021), 101198, https://doi.org/10.1016/j.aei.2020.101198.
[28] F. Amer, J. Hockenmaier, M. Golparvar-Fard, Learning and critiquing pairwise activity relationships for schedule quality control via deep learning-based natural language processing, Automation in Construction 134 (2022), 104036, https://doi.org/10.1016/j.autcon.2021.104036.
[29] F. Amer, Y. Jung, M. Golparvar-Fard, Transformer machine learning language model for auto-alignment of long-term and short-term plans in construction, Automation in Construction 132 (2021), 103929, https://doi.org/10.1016/j.autcon.2021.103929.
[30] S.A. Prieto, E.T. Mengiste, B. García de Soto, Investigating the use of ChatGPT for the scheduling of construction projects, Buildings 13 (2023) 857, https://doi.org/10.3390/buildings13040857.
[31] J. Jeon, X. Xu, Y. Zhang, L. Yang, H. Cai, Extraction of construction quality requirements from textual specifications via natural language processing, Transportation Research Record 2675 (2021) 222–237, https://doi.org/10.1177/03611981211001385.
[32] B. Zhong, X. Xing, H. Luo, Q. Zhou, H. Li, T. Rose, W. Fang, Deep learning-based extraction of construction procedural constraints from construction regulations, Advanced Engineering Informatics 43 (2020), 101003, https://doi.org/10.1016/j.aei.2019.101003.
[33] B. Zhong, X. Xing, P. Love, X. Wang, H. Luo, Convolutional neural network: deep learning-based classification of building quality problems, Advanced Engineering Informatics 40 (2019) 46–57, https://doi.org/10.1016/j.aei.2019.02.009.
[34] J.-R. Lin, Z.-Z. Hu, J.-L. Li, L.-M. Chen, Understanding on-site inspection of construction projects based on keyword extraction and topic modeling, IEEE
Access 8 (2020) 198503–198517, https://doi.org/10.1109/ACCESS.2020.3035214.
[35] D. Zhang, M. Li, D. Tian, L. Song, Y. Shen, Intelligent text recognition based on multi-feature channels network for construction quality control, Advanced Engineering Informatics 53 (2022), 101669, https://doi.org/10.1016/j.aei.2022.101669.
[36] C. Wu, X. Wang, P. Wu, J. Wang, R. Jiang, M. Chen, M. Swapan, Hybrid deep learning model for automating constraint modelling in advanced working packaging, Automation in Construction 127 (2021), 103733, https://doi.org/10.1016/j.autcon.2021.103733.
[37] C. Wu, P. Wu, J. Wang, R. Jiang, M. Chen, X. Wang, Developing a hybrid approach to extract constraints related information for constraint management, Automation in Construction 124 (2021), 103563, https://doi.org/10.1016/j.autcon.2021.103563.
[38] X. Li, C. Wu, F. Xue, Z. Yang, J. Lou, W. Lu, Ontology-based mapping approach for automatic work packaging in modular construction, Automation in Construction 134 (2022), 104083, https://doi.org/10.1016/j.autcon.2021.104083.
[39] X. Li, C. Wu, Z. Yang, Y. Guo, R. Jiang, Knowledge graph-enabled adaptive work packaging approach in modular construction, Knowledge-Based Systems 260 (2023), 110115, https://doi.org/10.1016/j.knosys.2022.110115.
[40] Y. Mo, D. Zhao, J. Du, M. Syal, A. Aziz, H. Li, Automated staff assignment for building maintenance using natural language processing, Automation in Construction 113 (2020), 103150, https://doi.org/10.1016/j.autcon.2020.103150.
[41] Y. Bouabdallaoui, Z. Lafhaj, P. Yim, L. Ducoulombier, B. Bennadji, Natural language processing model for managing maintenance requests in buildings, Buildings 10 (2020) 160, https://doi.org/10.3390/buildings10090160.
[42] H.B. Gunay, W. Shen, C. Yang, Text-mining building maintenance work orders for component fault frequency, Building Research and Information 47 (2019) 518–533, https://doi.org/10.1080/09613218.2018.1459004.
[43] R. Bortolini, N. Forcada, Analysis of building maintenance requests using a text mining approach: building services evaluation, Building Research and Information 48 (2020) 207–217, https://doi.org/10.1080/09613218.2019.1609291.
[44] M. Marocco, I. Garofolo, Operational text-mining methods for enhancing building maintenance management, Building Research and Information 49 (2021) 893–911, https://doi.org/10.1080/09613218.2021.1953368.
[45] M. D’Orazio, E. Di Giuseppe, G. Bernardini, Automatic detection of maintenance requests: comparison of human manual annotation and sentiment analysis techniques, Automation in Construction 134 (2022), 104068, https://doi.org/10.1016/j.autcon.2021.104068.
[46] M. D’Orazio, G. Bernardini, E. Di Giuseppe, Automated priority assignment of building maintenance tasks using natural language processing and machine learning, Journal of Architectural Engineering 29 (2023) 04023027, https://doi.org/10.1061/JAEIED.AEENG-1516.
[47] P. Nojedehi, W. O’Brien, H.B. Gunay, Benchmarking and visualization of building portfolios by applying text analytics to maintenance work order logs, Science and Technology for the Built Environment 27 (2021) 756–775, https://doi.org/10.1080/23744731.2021.1913957.
[48] S. Dutta, H.B. Gunay, S. Bucking, Benchmarking operational performance of buildings by text mining tenant surveys, Science and Technology for the Built Environment 27 (2021) 741–755, https://doi.org/10.1080/23744731.2020.1851545.
[49] K.-L. Chen, M.-H. Tsai, Conversation-based information delivery method for facility management, Sensors 21 (2021) 4771, https://doi.org/10.3390/s21144771.
[50] J. Bazzan, M.E. Echeveste, C.T. Formoso, B. Altenbernd, M.H. Barbian, An information management model for addressing Residents’ complaints through artificial intelligence techniques, Buildings 13 (2023) 737, https://doi.org/10.3390/buildings13030737.
[51] K. Liu, N. El-Gohary, Semantic neural network ensemble for automated dependency relation extraction from bridge inspection reports, Journal of Computing in Civil Engineering 35 (2021) 4021007, https://doi.org/10.1061/(ASCE)CP.1943-5487.0000961.
[52] J. Xue, G.Q. Shen, Y. Li, J. Wang, I. Zafar, Dynamic stakeholder-associated topic modeling on public concerns in megainfrastructure projects: case of Hong Kong–Zhuhai–Macao bridge, Journal of Management in Engineering 36 (2020) 4020078, https://doi.org/10.1061/(ASCE)ME.1943-5479.0000845.
[53] Z. Zhou, X. Zhou, L. Qian, Online public opinion analysis on infrastructure megaprojects: toward an analytical framework, Journal of Management in Engineering 37 (2021) 4020105, https://doi.org/10.1061/(ASCE)ME.1943-5479.0000874.
[54] X. Wan, R. Wang, M. Wang, J. Deng, Z. Zhou, X. Yi, J. Pan, Y. Du, Online public opinion mining for large cross-regional projects: case study of the south-to-north water diversion project in China, Journal of Management in Engineering 38 (2022) 05021011, https://doi.org/10.1061/(ASCE)ME.1943-5479.0000970.
[55] L. Wu, K. Ye, P. Gong, J. Xing, Perceptions of governments towards mitigating the environmental impacts of expressway construction projects: a case of China, Journal of Cleaner Production 236 (2019), 117704, https://doi.org/10.1016/j.jclepro.2019.117704.
[56] X. Ren, Y. Li, M. Guo, Dynamically identifying and evaluating key barriers to promoting prefabricated buildings: text mining approach, Journal of Construction Engineering and Management 149 (2023) 04023075, https://doi.org/10.1061/JCEMD4.COENG-13285.
[57] J. Zheng, Q. Wen, M. Qiang, Understanding demand for project manager competences in the construction industry: data mining approach, Journal of Construction Engineering and Management 146 (2020) 4020083, https://doi.org/10.1061/(ASCE)CO.1943-7862.0001865.
[58] Y. Zou, A. Kiviniemi, S.W. Jones, Retrieving similar cases for construction project risk management using natural language processing techniques, Automation in Construction 80 (2017) 66–76, https://doi.org/10.1016/j.autcon.2017.04.003.
[59] A. Erfani, Q. Cui, I. Cavanaugh, An empirical analysis of risk similarity among major transportation projects using natural language processing, Journal of Construction Engineering and Management 147 (2021) 4021175, https://doi.org/10.1061/(ASCE)CO.1943-7862.0002206.
[60] A. Erfani, Q. Cui, Predictive risk modeling for major transportation projects using historical data, Automation in Construction 139 (2022), 104301, https://doi.org/10.1016/j.autcon.2022.104301.
[61] H. Zhou, S. Tang, W. Huang, X. Zhao, Generating risk response measures for subway construction by fusion of knowledge and deep learning, Automation in Construction 152 (2023), 104951, https://doi.org/10.1016/j.autcon.2023.104951.
[62] J. Matthews, P.E.D. Love, S.R. Porter, W. Fang, Smart data and business analytics: a theoretical framework for managing rework risks in mega-projects, International Journal of Information Management 65 (2022), 102495, https://doi.org/10.1016/j.ijinfomgt.2022.102495.
[63] Y. Jallan, B. Ashuri, Text mining of the securities and exchange commission financial filings of publicly traded construction firms using deep learning to identify and assess risk, Journal of Construction Engineering and Management 146 (2020) 4020137, https://doi.org/10.1061/(ASCE)CO.1943-7862.0001932.
[64] M.-F.F. Siu, W.-Y.J. Leung, W.-M.D. Chan, A data-driven approach to identify-quantify-analyse construction risk for Hong Kong NEC projects, Journal of Civil Engineering and Management 24 (2018) 592–606, https://doi.org/10.3846/jcem.2018.6483.
[65] A.B. Candaş, O.B. Tokdemir, Automated identification of vagueness in the FIDIC silver book conditions of contract, Journal of Construction Engineering and Management 148 (2022) 4022007, https://doi.org/10.1061/(ASCE)CO.1943-7862.0002254.
[66] T. Ko, H.D. Jeong, G. Lee, Natural language processing–driven model to extract contract change reasons and altered work items for advanced retrieval of change orders, Journal of Construction Engineering and Management 147 (2021) 04021147, https://doi.org/10.1061/(ASCE)CO.1943-7862.0002172.
[67] S. Moon, G. Lee, S. Chi, Semantic text-pairing for relevant provision identification in construction specification reviews, Automation in Construction 128 (2021), 103780, https://doi.org/10.1016/j.autcon.2021.103780.
[68] S. Moon, G. Lee, S. Chi, Automated system for construction specification review using natural language processing, Advanced Engineering Informatics 51 (2022), 101495, https://doi.org/10.1016/j.aei.2021.101495.
[69] F.U. Hassan, T. Le, C. Le, Automated approach for digitalizing scope of work requirements to support contract management, Journal of Construction Engineering and Management 149 (2023) 04023005, https://doi.org/10.1061/JCEMD4.COENG-12528.
[70] A.B. Candaş, O.B. Tokdemir, Automating coordination efforts for reviewing construction contracts with multilabel text classification, Journal of Construction Engineering and Management 148 (2022) 4022027, https://doi.org/10.1061/(ASCE)CO.1943-7862.0002275.
[71] S.-W. Choi, E.-B. Lee, Contractor’s risk analysis of engineering procurement and construction (EPC) contracts using ontological semantic model and bi-long short-term memory (LSTM) technology, Sustainability 14 (2022) 6938, https://doi.org/10.3390/su14116938.
[72] J. Lee, J.-S. Yi, Predicting project’s uncertainty risk in the bidding process by integrating unstructured text data and structured numerical data using text mining, Applied Sciences 7 (2017) 1141, https://doi.org/10.3390/app7111141.
[73] B.-Y. Son, E.-B. Lee, Using text mining to estimate schedule delay risk of 13 offshore oil and gas EPC case studies during the bidding process, Energies 12 (2019) 1956, https://doi.org/10.3390/en12101956.
[74] B. Shuai, A rationale-augmented NLP framework to identify unilateral contractual change risk for construction projects, Computers in Industry 149 (2023), 103940, https://doi.org/10.1016/j.compind.2023.103940.
[75] J. Niu, R.R.A. Issa, Developing taxonomy for the domain ontology of construction contractual semantics: a case study on the AIA A201 document, Advanced Engineering Informatics 29 (2015) 472–482, https://doi.org/10.1016/j.aei.2015.03.009.
[76] M. Al Qady, A. Kandil, Concept relation extraction from construction documents using natural language processing, Journal of Construction Engineering and Management 136 (2010) 294–302, https://doi.org/10.1061/(ASCE)CO.1943-7862.0000131.
[77] H.T.T.L. Pham, S. Han, Natural language processing with multitask classification for semantic prediction of risk-handling actions in construction contracts, Journal of Computing in Civil Engineering 37 (2023) 04023027, https://doi.org/10.1061/JCCEE5.CPENG-5218.
[78] Y. Fu, C. Xu, L. Zhang, Y. Chen, Control, coordination, and adaptation functions in construction contracts: a machine-coding model, Automation in Construction 152 (2023), 104890, https://doi.org/10.1016/j.autcon.2023.104890.
[79] Y. Jallan, E. Brogan, B. Ashuri, C.M. Clevenger, Application of natural language processing and text mining to identify patterns in construction-defect litigation cases, Journal of Legal Affairs and Dispute Resolution in Engineering and Construction 11 (2019) 4519024, https://doi.org/10.1061/(ASCE)LA.1943-4170.0000308.
[80] N. Xu, L. Ma, L. Wang, Y. Deng, G. Ni, Extracting domain knowledge elements of construction safety management: rule-based approach using Chinese natural
language processing, Journal of Management in Engineering 37 (2021) 4021001, [104] M.O. Wong, H. Zhou, H. Ying, S. Lee, A voice-driven IMU-enabled BIM-based
https://doi.org/10.1061/(ASCE)ME.1943-5479.0000870. multi-user system for indoor navigation in fire emergencies, Automation in
[81] M. Martínez-Rojas, R.M. Antolín, F. Salguero-Caparrós, J.C. Rubio-Romero, Construction 135 (2022), 104137, https://doi.org/10.1016/j.
Management of construction safety and health plans based on automated content autcon.2022.104137.
analysis, Automation in Construction 120 (2020), 103362, https://doi.org/ [105] T.-H. Lin, Y.-H. Huang, A. Putranto, Intelligent question and answer system for
10.1016/j.autcon.2020.103362. building information modeling and artificial intelligence of things based on the
[82] D. Tian, M. Li, Q. Ren, X. Zhang, S. Han, Y. Shen, Intelligent question answering bidirectional encoder representations from transformers model, Automation in
method for construction safety hazard knowledge based on deep semantic Construction 142 (2022), 104483, https://doi.org/10.1016/j.
mining, Automation in Construction 145 (2023), 104670, https://doi.org/ autcon.2022.104483.
10.1016/j.autcon.2022.104670. [106] W.Y. Lin, Prototyping a Chatbot for site managers using building information
[83] X. Luo, X. Li, Y.M. Goh, X. Song, Q. Liu, Application of machine learning modeling (BIM) and natural language understanding (NLU) techniques, Sensors
technology for occupational accident severity prediction in the case of 23 (2023) 2942, https://doi.org/10.3390/s23062942.
construction collapse accidents, Safety Science 163 (2023), 106138, https://doi. [107] H. Liu, Y.-S. Liu, P. Pauwels, H. Guo, M. Gu, Enhanced explicit semantic analysis
org/10.1016/j.ssci.2023.106138. for product model retrieval in construction industry, IEEE Transactions on
[84] H. Baker, M.R. Hallowell, A.J.-P. Tixier, AI-based prediction of independent Industrial Informatics 13 (2017) 3361–3369, https://doi.org/10.1109/
construction safety outcomes from universal attributes, Automation in TII.2017.2708727.
Construction 118 (2020), 103146, https://doi.org/10.1016/j. [108] M. Yin, L. Tang, C. Webster, J. Li, H. Li, Z. Wu, R.C.K. Cheng, Two-stage text-to-
autcon.2020.103146. BIMQL semantic parsing for building information model extraction using graph
[85] H. Baker, M.R. Hallowell, A.J.-P. Tixier, Automatically learning construction neural networks, Automation in Construction 152 (2023), 104902, https://doi.
injury precursors from text, Automation in Construction 118 (2020), 103145, org/10.1016/j.autcon.2023.104902.
https://doi.org/10.1016/j.autcon.2020.103145. [109] M. Yin, L. Tang, C. Webster, S. Xu, X. Li, H. Ying, An ontology-aided, natural
[86] Y. Liu, J. Wang, S. Tang, J. Zhang, J. Wan, Integrating information entropy and language-based approach for multi-constraint BIM model querying, Journal of
latent Dirichlet allocation models for analysis of safety accidents in the Building Engineering 76 (2023) 107066, https://doi.org/10.1016/j.
construction industry, Buildings 13 (2023) 1831, https://doi.org/10.3390/ jobe.2023.107066.
buildings13071831. [110] G. Gao, Y.-S. Liu, P. Lin, M. Wang, M. Gu, J.-H. Yong, BIMTag: concept-based
[87] D. Feng, H. Chen, A small samples training framework for deep learning-based automatic semantic annotation of online BIM product resources, Advanced
automatic information extraction: case study of construction accident news Engineering Informatics 31 (2017) 48–61, https://doi.org/10.1016/j.
reports analysis, Advanced Engineering Informatics 47 (2021), 101256, https:// aei.2015.10.003.
doi.org/10.1016/j.aei.2021.101256. [111] L. Zhang, M. Wen, B. Ashuri, BIM log mining: measuring design productivity,
[88] X. Li, R. Zhu, H. Ye, C. Jiang, A. Benslimane, MetaInjury: Meta-learning Journal of Computing in Civil Engineering 32 (2018) 4017071, https://doi.org/
framework for reusing the risk knowledge of different construction accidents, 10.1061/(ASCE)CP.1943-5487.0000721.
Safety Science 140 (2021), 105315, https://doi.org/10.1016/j.ssci.2021.105315. [112] H. Wang, X. Meng, X. Zhu, Improving knowledge capture and retrieval in the BIM
[89] X. Pan, B. Zhong, Y. Wang, L. Shen, Identification of accident-injury type and environment: combining case-based reasoning and natural language processing,
bodypart factors from construction accident reports: a graph-based deep learning Automation in Construction 139 (2022), 104317, https://doi.org/10.1016/j.
framework, Advanced Engineering Informatics 54 (2022), 101752, https://doi. autcon.2022.104317.
org/10.1016/j.aei.2022.101752. [113] N. Jung, G. Lee, Automated classification of building information modeling (BIM)
[90] M.A. Sayed, X. Qin, R.J. Kate, D.M. Anisuzzaman, Z. Yu, Identification and case studies by BIM use based on natural language processing (NLP) and
analysis of misclassified work-zone crashes using text mining techniques, unsupervised learning, Advanced Engineering Informatics 41 (2019), 100917,
Accident; Analysis and Prevention 159 (2021), 106211, https://doi.org/10.1016/ https://doi.org/10.1016/j.aei.2019.04.007.
j.aap.2021.106211. [114] X. Ding, J. Yang, L. Liu, W. Huang, P. Wu, Integrating IFC and CityGML model at
[91] W. Fang, H. Luo, S. Xu, P.E.D. Love, Z. Lu, C. Ye, Automated text classification of schema level by using linguistic and text mining techniques, IEEE Access 8 (2020)
near-misses from safety reports: an improved deep learning approach, Advanced 56429–56440, https://doi.org/10.1109/ACCESS.2020.2982044.
Engineering Informatics 44 (2020), 101060, https://doi.org/10.1016/j. [115] Q. Xie, X. Zhou, J. Wang, X. Gao, X. Chen, C. Liu, Matching real-world facilities to
aei.2020.101060. building information modeling data using natural language processing, IEEE
[92] S. Chen, J. Xi, Y. Chen, J. Zhao, Association Mining of near Misses in hydropower Access 7 (2019) 119465–119475, https://doi.org/10.1109/
engineering construction based on convolutional neural network text ACCESS.2019.2937219.
classification, Computational Intelligence and Neuroscience 2022 (2022), [116] H. Dawood, J. Siddle, N. Dawood, Integrating IFC and NLP for automating change
https://doi.org/10.1155/2F2022/2F4851615. request validations, Journal of Information Technology in Construction 24 (2019)
[93] Y.M. Goh, C.U. Ubeynarayana, Construction accident narrative classification: an 540–552. https://www.itcon.org/2019/30.
evaluation of text mining techniques, Accident; Analysis and Prevention 108 [117] M. Yin, L. Tang, C. Webster, X. Yi, H. Ying, Y. Wen, A deep natural language
(2017) 122–130, https://doi.org/10.1016/j.aap.2017.08.026. processing-based method for ontology learning of project-specific properties from
[94] B. Zhong, X. Pan, P.E.D. Love, L. Ding, W. Fang, Deep learning and network building information models, Computer-Aided Civil and Infrastructure
analysis: classifying and visualizing accident narratives in construction, Engineering (2023) 1–26, https://doi.org/10.1111/mice.13013.
Automation in Construction 113 (2020), 103089, https://doi.org/10.1016/j. [118] A. Hodorog, I. Petri, Y. Rezgui, J.-L. Hippolyte, Building information modelling
autcon.2020.103089. knowledge harvesting for energy efficiency in the construction industry, Clean
[95] J. Qiao, C. Wang, S. Guan, L. Shuran, Construction-accident narrative Technologies and Environmental Policy 23 (2021) 1215–1231, https://doi.org/
classification using shallow and deep learning, Journal of Construction 10.1007/s10098-020-02000-z.
Engineering and Management 148 (2022) 4022088, https://doi.org/10.1061/ [119] M.R. Hosseini, I. Martek, E. Papadonikolaki, M. Sheikhkhoshkar, S. Banihashemi,
(ASCE)CO.1943-7862.0002354. M. Arashpour, Viability of the BIM manager enduring as a distinct role:
[96] N.-W. Chi, K.-Y. Lin, S.-H. Hsieh, Using ontology-based text classification to assist association rule mining of job advertisements, Journal of Construction
job hazard analysis, Advanced Engineering Informatics 28 (2014) 381–394, Engineering and Management 144 (2018) 4018085, https://doi.org/10.1061/
https://doi.org/10.1016/j.aei.2014.05.001. (ASCE)CO.1943-7862.0001542.
[97] J. Kim, S. Youm, Y. Shan, J. Kim, Analysis of fire accident factors on construction [120] K. Ruikar, C.J. Anumba, C. Egbu, Integrated use of technologies and techniques
sites using web crawling and deep learning approach, Sustainability 13 (2021) for construction knowledge management, Knowledge Management Research and
11694, https://doi.org/10.3390/su132111694. Practice 5 (2007) 297–311, https://doi.org/10.1057/palgrave.kmrp.8500154.
[98] N. Xu, H. Chang, B. Xiao, B. Zhang, J. Li, T. Gu, Relation extraction of domain [121] M. Al Qady, A. Kandil, Automatic classification of project documents on the basis
knowledge entities for safety risk management in metro construction projects, of text content, Journal of Computing in Civil Engineering 29 (2015) 4014043,
Buildings 12 (2022) 1633, https://doi.org/10.3390/buildings12101633. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000338.
[99] S. Chen, K. Demachi, F. Dong, Graph-based linguistic and visual information [122] T. Le, H. David Jeong, NLP-based approach to semantic classification of
integration for on-site occupational hazards identification, Automation in heterogeneous transportation asset data terminology, Journal of Computing in
Construction 137 (2022), 104191, https://doi.org/10.1016/j. Civil Engineering 31 (2017) 4017057, https://doi.org/10.1061/(ASCE)CP.1943-
autcon.2022.104191. 5487.0000701.
[100] J. Lin, Z. Hu, J. Zhang, F. Yu, A natural-language-based approach to intelligent [123] H. Xue, T. Zhang, Q. Wang, S. Liu, K. Chen, Developing a unified framework for
data retrieval and representation for cloud BIM, Computer-Aided Civil and data sharing in the smart construction using text analysis, KSCE Journal of Civil
Infrastructure Engineering 31 (2016) 18–33, https://doi.org/10.1111/ Engineering 26 (2022) 4359–4379, https://doi.org/10.1007/s12205-022-2037-6.
mice.12151. [124] N. Torkanfar, E.R. Azar, Quantitative similarity assessment of construction
[101] S. Wu, Q. Shen, Y. Deng, J. Cheng, Natural-language-based intelligent retrieval projects using WBS-based metrics, Advanced Engineering Informatics 46 (2020),
engine for BIM object database, Computers in Industry 108 (2019) 73–88, 101179, https://doi.org/10.1016/j.aei.2020.101179.
https://doi.org/10.1016/j.compind.2019.02.016. [125] T. Ko, H. David Jeong, J. Lee, Natural language processing–driven similar project
[102] S. Shin, R.R.A. Issa, BIMASR: framework for voice-based BIM information determination using project scope statements, Journal of Management in
retrieval, Journal of Construction Engineering and Management 147 (2021) Engineering 39 (2023), 04023005, https://doi.org/10.1061/JMENEA.MEENG-
4021124, https://doi.org/10.1061/(ASCE)CO.1943-7862.0002138. 5229.
[103] N. Wang, R.R.A. Issa, C.J. Anumba, Transfer learning-based query classification [126] A.K. Choudhary, P.I. Oluikpe, J.A. Harding, P.M. Carrillo, The needs and benefits
for intelligent building information spoken dialogue, Automation in Construction of text mining applications on post-project reviews, Computers in Industry 60
141 (2022), 104403, https://doi.org/10.1016/j.autcon.2022.104403. (2009) 728–740, https://doi.org/10.1016/j.compind.2009.05.006.

15
A. Shamshiri et al. Automation in Construction 158 (2024) 105200

[127] P. Demian, R. Fruchter, Measuring relevance in support of design reuse from computable requirements, Journal of Computing in Civil Engineering 36 (2022)
archives of building product models, Journal of Computing in Civil Engineering 04022022, https://doi.org/10.1061/(ASCE)CP.1943-5487.0001014.
19 (2005) 119–136, https://doi.org/10.1061/(ASCE)0887-3801(2005)19:2 [139] X. Wang, N. El-Gohary, Deep learning-based relation extraction and knowledge
(119). graph-based representation of construction safety requirements, Automation in
[128] M. Kovacevic, J.-Y. Nie, C. Davidson, Providing answers to questions from Construction 147 (2023), 104696, https://doi.org/10.1016/j.
automatically collected web pages for intelligent decision making in the autcon.2022.104696.
construction sector, Journal of Computing in Civil Engineering 22 (2008) 3–13, [140] X. Wang, N. El-Gohary, Deep learning–based named entity recognition and
https://doi.org/10.1061/(ASCE)0887-3801(2008)22:1(3). resolution of referential ambiguities for enhanced information extraction from
[129] K.Y. Lin, K.W. Chou, H.T. Lin, S.H. Hsieh, H.P. Tserng, Exploring the effectiveness construction safety regulations, Journal of Computing in Civil Engineering 37
of Chinese-to-English machine translation for CLIR applications in earthquake (2023) 04023023, https://doi.org/10.1061/(ASCE)CP.1943-5487.0001064.
engineering, Journal of Computing in Civil Engineering 23 (2009) 140–147, [141] X. Xue, J. Zhang, Regulatory information transformation ruleset expansion to
https://doi.org/10.1061/(ASCE)0887-3801(2009)23:3(140). support automated building code compliance checking, Automation in
[130] X. Sun, M.A. Brown, M. Cox, R. Jackson, Mandating better buildings: a global Construction 138 (2022), 104230, https://doi.org/10.1016/j.
review of building codes and prospects for improvement in the United States, autcon.2022.104230.
Wiley Interdisciplinary Reviews: Energy and Environment 5 (2016) 188–215, [142] B. Zhong, W. He, Z. Huang, P.E.D. Love, J. Tang, H. Luo, A building regulation
https://doi.org/10.1002/wene.168. question answering system: a deep learning methodology, Advanced Engineering
[131] X. Xue, J. Zhang, Building codes part-of-speech tagging performance Informatics 46 (2020), 101195, https://doi.org/10.1016/j.aei.2020.101195.
improvement by error-driven transformational rules, Journal of Computing in [143] R. Zhang, N. El-Gohary, Natural language generation and deep learning for
Civil Engineering 34 (2020) 4020035, https://doi.org/10.1061/(ASCE)CP.1943- intelligent building codes, Advanced Engineering Informatics 52 (2022), 101557,
5487.0000917. https://doi.org/10.1016/j.aei.2022.101557.
[132] X. Xue, J. Zhang, Part-of-speech tagging of building codes empowered by deep [144] S. Li, H. Cai, V.R. Kamat, Integrating natural language processing and spatial
learning and transformational rules, Advanced Engineering Informatics 47 reasoning for utility compliance checking, Journal of Construction Engineering
(2021), 101235, https://doi.org/10.1016/j.aei.2020.101235. and Management 142 (2016) 4016074, https://doi.org/10.1061/(ASCE)
[133] R. Zhang, N. El-Gohary, Clustering-based approach for building code CO.1943-7862.0001199.
computability analysis, Journal of Computing in Civil Engineering 35 (2021) [145] X. Xu, H. Cai, Domain ontology for utility infrastructure: coupling the semantics
4021021, https://doi.org/10.1061/(ASCE)CP.1943-5487.0000967. of CityGML utility network ADE and domain glossaries, Journal of Computing in
[134] D. Guo, E. Onstein, A.D. La Rosa, A semantic approach for automated rule Civil Engineering 35 (2021) 4021011, https://doi.org/10.1061/(ASCE)CP.1943-
compliance checking in construction industry, IEEE Access 9 (2021) 5487.0000977.
129648–129660, https://doi.org/10.1109/ACCESS.2021.3108226. [146] J.-K. Lee, K. Cho, H. Choi, S. Choi, S. Kim, S.H. Cha, High-level implementable
[135] P. Zhou, N. El-Gohary, Ontology-based automated information extraction from methods for automated building code compliance checking, Developments in the
building energy conservation codes, Automation in Construction 74 (2017) Built Environment 15 (2023), 100174, https://doi.org/10.1016/j.
103–117, https://doi.org/10.1016/j.autcon.2016.09.004. dibe.2023.100174.
[136] R. Ren, J. Zhang, Semantic rule-based construction procedural information [147] R. Zhang, N. El-Gohary, Transformer-based approach for automated context-
extraction to guide jobsite sensing and monitoring, Journal of Computing in Civil aware IFC-regulation semantic information alignment, Automation in
Engineering 35 (2021) 4021026, https://doi.org/10.1061/(ASCE)CP.1943- Construction 145 (2023), 104540, https://doi.org/10.1016/j.
5487.0000971. autcon.2022.104540.
[137] Z. Zhang, L. Ma, N. Nisbet, Unpacking ambiguity in building requirements to [148] M. Naeem, S.T.H. Rizvi, A. Coronato, A gentle introduction to reinforcement
support automated compliance checking, Journal of Management in Engineering learning and its application in different fields, IEEE Access 8 (2020)
39 (2023) 04023033, https://doi.org/10.1061/JMENEA.MEENG-5359. 209320–209344, https://doi.org/10.1109/ACCESS.2020.3038605.
[138] R. Zhang, N. El-Gohary, Hierarchical representation and deep learning–based [149] S. Golchin, M. Surdeanu, Time travel in LLMs: tracing data contamination in large
method for automatically transforming textual building codes into semantic language models, ArXiv Preprint (2023), https://doi.org/10.48550/
arXiv.2308.08493.
