Man-Hour Estimation Model Development in BIM-Based
Conference Article
(First received September 18, 2023 and in final form December 22, 2023)
Abstract
using Python format. The data set, which was turned into a matrix, was analysed and the consistency
of its parameters was tested. The spent man-hours were then estimated and graphed against the
parameters with simple and multiple linear regression algorithms.
Keywords: Building Information Modelling (BIM), construction management, bidding, project
management, Artificial Intelligence (AI), machine learning, regression analysis
1. Introduction
is still a developing system. Although the BIM technique has many applications, practical
applications are few and lack a good entry point when applied to a project. The BIM
technique is not yet mature and presents some difficulties in practice; in addition,
project personnel are not familiar with it. All of this makes field application of the
BIM technique difficult [3, p. 310].
For these reasons, working techniques differ and are still developing, so the estimated
project duration can carry a misleading margin of error. Man-hour and project cost
estimations are financially very important for a company. This study therefore aims to
examine man-hour estimations in BIM projects at the bidding stage using advanced
technologies and different analysis methods.
In this study, applied machine learning algorithms are used to analyze the man-hour
estimations of BIM projects at the bidding stage.
Machine learning (ML) is the subset of artificial intelligence (AI) that focuses on building
systems that learn—or improve performance—based on the data they consume. AI is a
broad term that refers to systems or machines that mimic human intelligence [4].
ML, a subset of AI, is itself divided into categories; the method used in this study is
regression analysis, which belongs to the Supervised Learning category (Fig. 1).
In statistical modeling, regression analysis is a set of statistical processes for estimating
the relationships between a dependent variable (often called the 'outcome' or 'response'
variable, or a 'label' in machine learning parlance) and one or more independent variables
(often called 'predictors', 'covariates', 'explanatory variables' or 'features'). The most
common form of regression analysis is linear regression, in which one finds the line (or a
more complex linear combination) that most closely fits the data according to a specific
mathematical criterion. The types of regression analysis are listed in Table 1.
Linear Regression
Logistic Regression
Polynomial Regression
Quantile Regression
Ridge Regression
Lasso Regression
Ordinal Regression
Poisson Regression
Cox Regression
Tobit Regression
Not all of these types of analysis are included in this study. To summarize the study: in the
first section, the information needed to perform a man-hour analysis on completed construction
projects was requested from offices and individuals with experience in BIM
projects. As a result of this research, the information of 16 projects was tabulated.
In Section 2, the collected information was converted into a data set and transferred into
regression analysis algorithms using a Jupyter notebook (Python format). The data set,
which was turned into a matrix within the algorithm, was analysed and the
consistency of its parameters was tested. In Section 3 of the study, the spent man-hours
were estimated and graphed over various parameters with the linear
regression algorithm.
2.1. Collecting Project Information
In the first section of the study, the information to be used in the algorithm had to be
collected from the target audience in order to perform the targeted analyses. The target
audience consisted of people who had previously worked on BIM projects and
offices with expertise and experience in BIM project applications. To collect this
information, an Excel spreadsheet was sent by e-mail to the people
identified and they were asked to fill it in. The information requested from the experts is
as follows:
● Building Type: This parameter represents the difficulty level of the project, which
directly affects the number of man-hours spent. Each type of project poses different
challenges, and these are difficult to compare directly. For this reason, when the project
type was requested, the experts were also asked to rank the construction project types
according to their difficulty levels. The project type was not written directly into the
data matrix; instead, the difficulty level was entered numerically, using the ranking
method.
● Finish Year of Construction: The end date of the project was requested from the experts
so that the foreign currency values of that period could be taken into account if a cost
analysis was desired.
● Project Cost: This information was requested from the experts in order to determine
how consistent the project cost is with the man-hours spent.
● Total Construction Area: The size of the project in square meters is expected to be
directly proportional to the man-hours spent: since each element of the project is
modelled in BIM projects, the more square meters there are, the more work there is to do.
● Number of Architectural Drawing Sheets: As the number of sheets in the project increases,
the work to be drawn also increases, so this parameter directly affects the man-hours.
● Spent Man-Hour: This parameter is the main parameter of the analysis.
● Project Delivery Methods: The project delivery method affects the processes in the
project. It was aimed to observe the effect of this factor on the man-hour analysis.
● Architectural Project Type: The project type affects how detailed the project drawings
will be.
● Construction System: A more difficult construction system means more work and time
consumption. For this parameter, the types given by the experts (Concrete-Steel and
similar) were converted into numerical data by the ranking method according to their
difficulty levels.
● BIM Level: As the BIM level increases in the BIM project, the layers of the project and
the work to be modelled increase.
● LOD Level: As the LOD level increases, elements that have to be modeled from scratch in
some projects can require a large number of man-hours.
● Interdisciplinary Coordination: Interdisciplinary coordination in BIM requires different
working principles from the traditional project drawing process, and it can negatively
affect the man-hours spent.
As a result of the interviews with experts, information on 16 projects was collected as
shown in Figure 2.
Data Set
In the second stage, normalization was carried out in order to make the
information obtained processable. The parameters with verbal answers
were converted into numerical values with the ranking method and, as a result, the matrix
in Fig. 3 was created. The resulting data set was passed to the algorithm for analysis
as a CSV file.
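The ranking method described above can be sketched as follows. This is a minimal illustration only: the category names, their difficulty ranks and the column names are hypothetical and are not the values used in the study.

```python
import csv
import io

# Hypothetical difficulty rankings (1 = least difficult); illustrative only,
# not the rankings given by the experts in the study.
BUILDING_TYPE_RANK = {"Residential": 1, "Office": 2, "Shopping Center": 3, "Hospital": 4}
CONSTRUCTION_SYSTEM_RANK = {"Concrete": 1, "Steel": 2, "Concrete-Steel": 3}


def encode_project(row):
    """Replace verbal answers with numeric ranks and parse the numeric fields."""
    return {
        "building_type": BUILDING_TYPE_RANK[row["building_type"]],
        "construction_system": CONSTRUCTION_SYSTEM_RANK[row["construction_system"]],
        "total_area_m2": float(row["total_area_m2"]),
        "spent_man_hour": float(row["spent_man_hour"]),
    }


def to_csv(rows):
    """Serialize the encoded rows to CSV text that an analysis script can load."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up survey answers, not data from the study.
answers = [
    {"building_type": "Office", "construction_system": "Steel",
     "total_area_m2": "12000", "spent_man_hour": "9500"},
    {"building_type": "Hospital", "construction_system": "Concrete-Steel",
     "total_area_m2": "45000", "spent_man_hour": "61000"},
]
encoded = [encode_project(answer) for answer in answers]
csv_text = to_csv(encoded)
```

Keeping the rank tables in one place makes the encoding explicit, so a different expert ranking can be substituted without touching the rest of the script.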
The first stage of the analysis aimed to examine the consistency of the
parameters determined in the study. As a first step, a Python script was created in a
Jupyter notebook (Fig. 4) and the data set was loaded into the script.
The describe() method shows the statistical values of the data, so that the
count, standard deviation, minimum value and maximum value of each parameter can be analysed.
As can be seen in Fig. 3, when the parameters of the 16 projects are analysed, the oldest
project in the data set dates from 2009 and the newest from 2022. The
standard deviation of the project delivery method and of the parameters that follow it is
low: the values are the same or very close to each other in many projects, so there is very
little diversity in the data and it is difficult to make a prediction or comment on these
parameters.
Figure 5
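As a rough sketch of this step, the summary statistics can be produced with the pandas `describe()` method. The data frame below is made up for illustration and the column names are assumptions; counting distinct values per column is one simple way, not necessarily the one used in the study, to flag parameters with very little diversity.

```python
import pandas as pd

# Made-up values for illustration; not the study's data set.
df = pd.DataFrame({
    "finish_year": [2009, 2015, 2019, 2022],
    "total_area_m2": [120000.0, 45000.0, 30000.0, 8000.0],
    "project_delivery_method": [1, 1, 1, 2],
})

# count, mean, std, min, quartiles and max for every numeric column
summary = df.describe()
print(summary.loc[["count", "std", "min", "max"]])

# Columns with at most two distinct values carry little information
# for a regression model (the threshold of 2 is an arbitrary choice here).
distinct_values = df.nunique()
near_constant = [col for col in df.columns if distinct_values[col] <= 2]
print(near_constant)
```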
When the distribution of the variables and the correlation between them are analyzed with
the corr() method, the results in Figure 6 are obtained.
Figure 6
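A minimal sketch of the correlation step, assuming the data set has been loaded into a pandas data frame. The values and column names below are invented for illustration; `DataFrame.corr()` computes pairwise Pearson correlation coefficients by default.

```python
import pandas as pd

# Invented values for illustration; not the study's data set.
df = pd.DataFrame({
    "finish_year": [2009, 2013, 2017, 2020, 2022],
    "total_area_m2": [150000.0, 90000.0, 60000.0, 40000.0, 10000.0],
    "spent_man_hour": [90000.0, 70000.0, 30000.0, 35000.0, 12000.0],
})

# Pairwise Pearson correlations between all numeric columns
corr_matrix = df.corr()

# A negative coefficient means one variable tends to fall as the other rises;
# a positive coefficient means they tend to rise together.
year_vs_area = corr_matrix.loc["finish_year", "total_area_m2"]
area_vs_hours = corr_matrix.loc["total_area_m2", "spent_man_hour"]
print(corr_matrix.round(2))
```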
Figure 6 shows that there is a positive relationship between some variables and a negative
relationship between others. Within these 16 projects, for example, the later the end
date of a project, the smaller its area in square meters tends to be. In the same way,
the relationship between any two variables can be examined graphically with the help of the
seaborn library in Python. The relationship between the Building Type and Spent
Man-Hour variables is shown in Figure 7.
Figure 7
Since our main objective in the study is to analyze the man-hours spent, in Section 3 the
analysis is continued on two variables: total construction area and spent man-hours.
Analysis
The correlation graph (Fig. 8) between the Spent Man-Hour and Total
Construction Area variables shows that as the total square meters increase, the man-hours
spent also increase.
Figure 8
When the linear regression model was created with the statsmodels library in the
regression algorithm, the results were as shown in Figure 9. The model has an
R-squared value of 20%. This value indicates that 20% of the variation in the man-hour
value can be explained by the variation in total construction area.
Figure 9
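To make the meaning of the intercept, slope and R-squared values concrete, the sketch below fits a simple linear regression with the closed-form least-squares formulas. It is written in plain Python rather than with the statsmodels library used in the study, and the four data points are invented for illustration.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Invented example values (area in m2, spent man-hours); not the study's data.
area = [10000.0, 20000.0, 30000.0, 40000.0]
hours = [8000.0, 15000.0, 16000.0, 29000.0]
intercept, slope, r_squared = fit_simple_ols(area, hours)
print(intercept, slope, r_squared)
```

An R-squared close to 1 means the line accounts for most of the variation; a value such as the 20% reported above means most of the variation in man-hours is left unexplained by area alone.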
When we estimate with the print function, we can formulate (Equation 1) the relationship
between spent man-hour and total construction area as follows. The output of the
function is as follows (Equation 2).
(Equation 1)
SPENT MAN-HOUR = 17751.33 + TOTAL CONSTRUCTION AREA*0.56
Equation 2
When we graph this equation, it is as in Figure 10.
Figure 10
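The fitted equation above can be applied directly to a new project. The sketch below uses the intercept (17751.33) and slope (0.56) reported in the text; the function name and the 50,000 m2 example area are hypothetical.

```python
# Coefficients reported in the text for the simple linear regression model.
INTERCEPT_HOURS = 17751.33
HOURS_PER_M2 = 0.56


def estimate_man_hours(total_area_m2):
    """Estimate spent man-hours from total construction area (m2)."""
    return INTERCEPT_HOURS + HOURS_PER_M2 * total_area_m2


# Hypothetical project of 50,000 m2
estimate = estimate_man_hours(50_000)
print(round(estimate, 2))
```

Because this single-variable model explains only about 20% of the variation in man-hours, such an estimate is indicative at best.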
In this section, 20% of the data is held out as test data, following machine learning
practice, and a prediction is made after the necessary change analysis is done. As an
example prediction, let us assume that the model takes the following
scenario A as a reference.
Company A will bid for the tender of a shopping center project, planned to be
completed in 2025, in which architectural modeling and interdisciplinary coordination are
required to be done at BIM Level 2. The BEP states that the average LOD level
in the project is 300. According to the ranking of the project group in the cost dataset
allocated for the project, it corresponds to 7th place. The total area of the
project is 200,000 m2, and the application project of the plan project will be drawn for a
mixed structural system. A total of 450 sheets are targeted for delivery.
Contractual requirements should be taken into consideration for the design-build project.
The following estimation model has been created for the estimation of the project in this
scenario.
Figure 11
As can be seen in Figure 11, the coefficients of influence of the other variables on the spent
man-hours variable are observed, and the consistency of the factors with the dependent
variable is 96.3%. To perform the machine learning estimation, the formulation in
Figure 12 was applied; as a result, the recommended duration for
scenario A was 174,966 hours.
Figure 12
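The general workflow of an 80/20 split followed by a multiple linear regression can be sketched as below. This is not the study's model: the data are synthetic, only three of the scenario's parameters are used as features (area, sheet count, building-type rank), and the fit is done with NumPy least squares rather than the library used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_projects = 16

# Synthetic features: total area (m2), number of sheets, building-type rank.
X = np.column_stack([
    rng.uniform(5_000, 200_000, n_projects),
    rng.integers(50, 500, n_projects),
    rng.integers(1, 8, n_projects),
]).astype(float)

# Synthetic target generated from assumed coefficients plus noise.
assumed_coef = np.array([0.5, 40.0, 2_000.0])
y = 10_000.0 + X @ assumed_coef + rng.normal(0.0, 500.0, n_projects)

# 80/20 split: shuffle the project indices and hold out 20% for testing.
indices = rng.permutation(n_projects)
n_test = int(np.ceil(0.2 * n_projects))
test_idx, train_idx = indices[:n_test], indices[n_test:]


def add_intercept(features):
    """Prepend a column of ones so the model can learn a constant term."""
    return np.column_stack([np.ones(len(features)), features])


# Least-squares fit on the training projects only.
coef, *_ = np.linalg.lstsq(add_intercept(X[train_idx]), y[train_idx], rcond=None)

# Prediction for a scenario: 200,000 m2, 450 sheets, building-type rank 7.
scenario = np.array([[200_000.0, 450.0, 7.0]])
predicted_hours = float((add_intercept(scenario) @ coef)[0])
print(coef, predicted_hours)
```

With only 16 projects the test set contains just four of them, so any error measured on it will be highly sensitive to which projects happen to be held out.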
3. Result
RMSE (Root Mean Square Error) measures the distance between the predicted
values and the actual values of a machine learning model. RMSE is the standard deviation
of the prediction errors (residuals). That is, the residuals are a measure of how far the
data points are from the regression line, and the RMSE is a measure of how widely these
residuals are spread. In other words, it tells us how concentrated the data is around the
line that best fits the data. An RMSE value of zero means that the model makes no errors. The
plus-minus values for the prediction in the scenario in this study can be considered as
follows: 13,832 hours is the margin of error for the 80% of projects used to train the
system, and 137,978 hours is the margin of error for the 20% of projects used for testing.
Since the number of tested projects is very small, such a high deviation is to be expected
(Figure 13).
Figure 13
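The RMSE calculation itself is short. The sketch below implements the standard definition in plain Python; the actual and predicted man-hour values are invented for illustration and are not the study's results.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented man-hour values for three projects.
actual_hours = [10000.0, 20000.0, 30000.0]
predicted_hours = [11000.0, 18000.0, 32000.0]
error = rmse(actual_hours, predicted_hours)
print(round(error, 2))
```

Computing this value separately on the training and the test projects gives the two margins of error discussed above.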
save a lot of time for these estimations and with a successful application, the estimation
errors made in the proposals will be significantly reduced thanks to machine learning.
The machine learning applications shown in this report were analysed with simple
linear regression and multiple linear regression. Data was
of great importance in the analysis. However, during the data collection process,
sufficient information could not be obtained due to the confidentiality principles of the
companies. These analyses, which were conducted with a total of 16 projects, could have
yielded more reliable and more accurate results if they had been conducted with
many more projects. Nevertheless, realistic results were obtained with the existing projects.
In light of the results, integrating artificial intelligence applications into
this type of analysis appears to be of great benefit for business efficiency.
References
[1] X. D. Ma, F. Xiong, T. O. Olawumi and N. Dong, "Conceptual Framework and Roadmap
Approach for Integrating BIM into Lifecycle Project Management," Journal of
Management in Engineering, 2018.
[3] P. Yan, X. Xie and Y. Meng, "Application of the BIM Technique in Modern Project
Management," ASCE, 2014.