Estimating Project S-Curves Using Polynomial Function and Neural Networks

Li-Chung Chao¹ and Ching-Fa Chien²

Abstract: The S-curve is a graphical representation of a construction project’s cumulative progress from start to finish. While S-curves for project control during construction should be estimated analytically based on a schedule of activity times, empirical estimation methods using various mathematical S-curve formulas have been developed for initial planning at predesign stages, with the mean for past similar projects often used as the basis of prediction. In an attempt to make an improvement, a succinct cubic polynomial function for generalizing S-curves is proposed and a comparison with existing formulas shows its advantages of accuracy and simplicity. Based on an analysis of the attributes and actual progress of 101 projects, four factors, i.e., contract amount, duration, type of work, and location, are then used as the inputs of a model developed for estimating S-curves as represented by the polynomial parameters. For model development, it is proposed to use neural networks for their ability to perform complex nonlinear mapping. The neural network model is compared with statistical models with respect to modeling and testing accuracy. The results show that the presented methodology can achieve error reduction consistently, thereby being potentially useful for owners and contractors in early financial planning and checking schedule-based estimates.

DOI: 10.1061/(ASCE)0733-9364(2009)135:3(169)

CE Database subject headings: Construction management; Neural networks; Curve fitting; Polynomials; Estimates.

Downloaded from ascelibrary.org by UNIVERSITY OF VIRGINIA on 11/04/13. Copyright ASCE. For personal use only; all rights reserved.

¹ Associate Professor, Dept. of Construction Engineering, National Kaohsiung First Univ. of Science and Technology, Kaohsiung 824, Taiwan, ROC. E-mail: [email protected]
² Ph.D. Student, Institute of Engineering Science and Technology, National Kaohsiung First Univ. of Science and Technology, Kaohsiung 824, Taiwan, ROC. E-mail: [email protected]
Note. Discussion open until August 1, 2009. Separate discussions must be submitted for individual papers. The manuscript for this paper was submitted for review and possible publication on September 18, 2007; approved on September 11, 2008. This paper is part of the Journal of Construction Engineering and Management, Vol. 135, No. 3, March 1, 2009. ©ASCE, ISSN 0733-9364/2009/3-169–177/$25.00.
J. Constr. Eng. Manage. 2009.135:169-177.

Introduction

The S-curve is a graphical representation of the cumulative progress of a construction project from start to finish, with the horizontal scale showing time and the vertical scale showing cumulative project progress in dollars or in percent complete. For a project of n activities, the cumulative progress at time point t, $P_t$, is defined as

$$P_t = \sum_{i=1}^{n} w_i \cdot p_{ti} \tag{1}$$

where $w_i$ = percent weight of activity i in the project; $p_{ti}$ = percent complete of activity i at t.

The shape of the S-curve, normally with a smaller slope at the beginning and near the end and a larger slope in the middle, indicates that progress is slow in mobilization and demobilization periods but faster when the bulk of the work takes place. Although the shape generally applies to any project consisting of activities whose times overlap to some extent, each individual project being a unique undertaking will have an S-curve with differing geometric properties, such as the relative length and slope of each section. In addition, curves constructed from actual cumulative progress are not smooth and often are highly uneven; Fig. 1 shows an example in which time is standardized (divided) by project duration to the range of 0–1.

Estimated S-curves are widely used by owners and contractors for project planning and control. During the preconstruction phases, an estimated S-curve is used as the basis for forecasting cash flows in making financial arrangements. During the construction stage, an S-curve agreed in the contract is used as the target against which the actual progress of the project at any point can be evaluated to establish whether it is overall behind schedule and to assess the amount of delay. Since construction contracts often contain clauses stipulating that a delay should not exceed a certain percentage, the S-curve estimate will affect the determination of violation of a contract and the resulting penalty. In recent years, some schools of thought such as lean construction have questioned the practice of using the S-curve to establish the progress target for construction control, citing the possibility that contractors under the threat of penalty will speed up nonurgent activities so as to offset delays of earned value in critical activities, to undesirable effects (Kim and Ballard 2000). However, the S-curve, being able to tell overall project progress in single numbers, has the inherent advantage of simplicity and remains handy for many project cases in construction. On balance, use of the S-curve for financial planning is unquestionable, but, admittedly, unqualified use of it as the chief control during construction may give cause for concern due to oversimplified information, especially for large projects where additional means such as milestones should be used.

Different methods are used at different stages of a project to obtain reasonable estimates of project progress. When design and detailed project information is available, the normal and accepted approach to estimating an S-curve is analytical, i.e., based on a schedule of planned activity times and progress calculation using Eq. (1). However, S-curve estimation at predesign stages with only sketchy project definition has to use historical-data-based
empirical methods, the development of which has attracted much research interest. Over the years, many alternative ways of determination of S-curves have been studied, along with various mathematical formulas for generalizing cumulative project progress as a function of time (Skitmore 1992; Navon 1996). Since such methods give estimates of progress that are not produced from a schedule according to project-specific information, their results are intended mainly for initial preparation of financing, not for control purposes during construction. However, because a construction project is subject to the influence of many factors and a schedule is often not very certain, it is a prudent practice to compare a schedule-based S-curve estimate with one obtained according to historical realities and an empirical method will also serve for the checking purpose.

Fig. 1. Example of actual cumulative progress versus fitted curve

In light of the backdrop mentioned above, the objective of the research presented herein is to develop an improved approach based on a proposed polynomial function and neural networks, whose application is in line with an empirical method’s. In the following, existing methods for the early estimation of project progress as well as existing S-curve formulas are reviewed first. The proposed function and the solution to its parameters are presented next, along with a comparison of closeness of fit with existing functions. Based on collected cases of completed construction projects, a neural network model for estimating S-curves as represented by the polynomial function is then proposed, whose accuracy is compared with that of statistical models. Relevance to industry practitioners is discussed before conclusions are drawn at the end.

Review of Existing Methods and Formulas

Envelope Curves

Based on standardized data of actual progress versus time for past similar projects, empirical methods have been developed for evaluating the scheduled progress of a project and for monitoring its actual progress, which can be found in Kaka (1999) and Chen (2003), etc. One such method is to draw envelope curves (also known as banana curves because of the shape) showing the lower and upper limits of cumulative project progress over time established according to a sample of completed projects. The limits at each percent of time can be set at, say, the bounds of the estimated 90% confidence interval of the mean progress or the 10 and 90 percentiles of progress for the sample. An example built from real projects is shown in Fig. 2, in which there are limits <0 or >1 as a result of the broad confidence interval. Methods of this kind do not use a generalized mathematical form to represent S-curves, nor do they consider the factors that may influence project progress.

Fig. 2. Example of envelope curves built as 90% confidence interval of mean progress

Mathematical Formulas

Peer (1982) proposed five S-curve formulas for building construction projects in which percent progress is made a function of percent time with all parameters predetermined. However, the more common form adopted in other researches is a progress-versus-time relation in which a few parameters are left to be determined for an individual project by using some curve-fitting methods. Navon (1996) gave a summary of the formulas proposed by a number of researchers, some of which fail to meet the boundary conditions of 0% progress at 0% time and 100% progress at 100% time. Skitmore (1992) fitted each of four two-parameter formulas to 27 case projects and evaluated their closeness of fit. Since the proposed formula in the present research will be compared with these four formulas later, they are briefly reviewed next in Eqs. (2)–(7), where y and x denote standardized progress and standardized time, respectively, and a, b are the parameters to be determined.

1. The Department of Health and Social Security (DHSS) formula, which was developed for hospital projects, has the form of Eq. (2)

$$y = x + ax^2 - ax - (6x^3 - 9x^2 + 3x)/b \tag{2}$$

where a, b can be evaluated by the method in Tucker (1988) using the Weibull function according to contract value.

2. Kenley and Wilson (1986) proposed a logit transformation-based model that is widely referred to by other researchers on S-curves. It is defined as

$$\ln[y/(1 - y)] = a + b\{\ln[x/(1 - x)]\} \tag{3}$$

or equivalently

$$y = F/(1 + F) \tag{4}$$

and
$$F = y/(1 - y) = e^{a}[x/(1 - x)]^{b} \tag{5}$$

where a, b are obtained through a linear regression analysis with a set of pairs of x and y for a project transformed into a set of pairs of ln[x/(1 − x)] and ln[y/(1 − y)].

As x approaches 0 or 1, its logit, ln[x/(1 − x)], approaches −∞ or +∞, respectively, to the effect of distorting evaluation of a and b in a regression analysis by these least significant parts of a project. To deal with the problem, the authors suggested excluding the first 10% and final 10% of progress data from the set before fitting the model. In addition, Evans and Kaka (1998) found that after a regression analysis adjusting the values of a and b manually may lower the errors of Eq. (3) to achieve a better fit to project data.
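
The truncated-data regression described for Eq. (3) can be illustrated with a short Python sketch. It is not the procedure of Kenley and Wilson (1986) as published: the function names are hypothetical, and it assumes that "excluding the first 10% and final 10% of progress data" means dropping points whose standardized progress y lies outside (0.1, 0.9). The slope and intercept come from ordinary least squares on the logit-transformed pairs.

```python
import math


def logit(v):
    """Return ln[v / (1 - v)] for 0 < v < 1."""
    return math.log(v / (1.0 - v))


def fit_logit_s_curve(xs, ys, trim=0.10):
    """Estimate a, b of Eq. (3) by ordinary least squares on logits.

    Assumption: "excluding the first and final 10%" is read as dropping
    points whose progress y lies outside (trim, 1 - trim). Points with
    x at 0 or 1 are also dropped because their logit is undefined.
    """
    pts = [(logit(x), logit(y)) for x, y in zip(xs, ys)
           if 0.0 < x < 1.0 and trim < y < 1.0 - trim]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two usable (x, y) pairs")
    mean_u = sum(u for u, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    s_uu = sum((u - mean_u) ** 2 for u, _ in pts)
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    if s_uu == 0.0:
        raise ValueError("all usable points share the same x")
    b = s_uv / s_uu           # slope of the regression line
    a = mean_v - b * mean_u   # intercept
    return a, b


def logit_s_curve(x, a, b):
    """Evaluate Eqs. (4)-(5); end points are clamped, assuming b > 0."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    f = math.exp(a) * (x / (1.0 - x)) ** b
    return f / (1.0 + f)


if __name__ == "__main__":
    # Synthetic measurements generated from known parameters (not project data).
    xs = [i / 20 for i in range(1, 20)]
    ys = [logit_s_curve(x, 0.2, 1.5) for x in xs]
    print(fit_logit_s_curve(xs, ys))
```

On noise-free synthetic data the fit should recover the generating parameters; with real, uneven progress data the result depends on the truncation rule chosen, which is one reason a manual adjustment after regression can still lower the error.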


3. Skitmore (1992) modified the S-curve formula for process plant projects in Miskawi (1989) as

$$y = (3b/2)\sin[(\pi/2)(1 - x)]\sin(\pi x)\log[(x + 0.5)/(a + x)] - 2x^3 + 3x^2 \tag{6}$$

4. Finally, Skitmore (1992) also quoted the following third-degree polynomial function for representing S-curves

$$y = [1 + a(1 - x)(x - b)]x \tag{7}$$

In Eqs. (6) and (7), an approximation of the values of a and b for given project progress data can be obtained using the trial-and-error method by matching the right-hand and left-hand sides.

Models for Estimating S-Curves as a Mathematical Function

Using a mathematical function with its characterizing parameters to represent S-curves, models have been developed for predicting the cumulative progress of a project. Typically, in such models, past case projects are collected and classified according to some attributes into groups and the average function parameters for each group are used to produce a standard S-curve as the basis for prediction for a new project classified in the same category. For example, Kaka and Price (1993) calculated the values of a and b of Eq. (3) for each of 150 projects classified into seven groups according to duration and method and then obtained the mean a and b for each group for S-curve modeling. Evans and Kaka (1998) presented a similar model in which case projects were categorized by duration and contract value and concluded that prediction accuracy of the standard S-curve method would not improve even if more detailed project classifications were used.

To summarize, since such models do not include a mechanism for directly computing the combined effect of multiple factors, they have difficulty in reducing errors both in modeling and in prediction.

Proposed S-Curve Formula

To address existing S-curve formulas’ problems of large fitting error and complicated calculation for solving parameters, a third-degree polynomial is presented next

$$y = ax^3 + bx^2 + (1 - a - b)x \tag{8}$$

where x, y, a, and b are as defined earlier.

Normally, the value of a is negative, while the value of b is positive. The inflection point (where the rate of project work peaks) lies where the second derivative 6ax + 2b is zero, i.e., at x = −b/(3a). It can be shown that for the inflection point occurring at x = 0.5, so that b = −3a/2, Eq. (8) is of the form y = ax³ + (−3/2)ax² + (1 + a/2)x. The effect of changes in the values of a and b on the geometrical properties of Eq. (8) is such that an increase in the value of either a or b causes the location of the inflection point to move closer to the end, whereas a decrease in their value causes an opposite movement (see Fig. 3). Moreover, decreasing a’s value but increasing b’s value causes the curve to become steeper in the middle, indicating greater concentration of work there, whereas increasing a’s value but decreasing b’s value causes the curve to be straighter, indicating more evenly distributed work over the project duration; in fact, at a = 0 and b = 0, Eq. (8) becomes y = x, a straight line (see Fig. 4).

Fig. 3. Effects of an increase/decrease in the values of a or b on the shape of Eq. (8)

Fig. 4. Effects of reverse changes in the values of a and b on the shape of Eq. (8)

A two-parameter polynomial without the constant term that can meet the boundary conditions of (x = 0, y = 0) and (x = 1, y = 1), Eq. (8) has a more succinct form than existing functions; hence, it is more convenient when used in calculation of cumulative progress for any given time. A description of the technique for solving the values of a and b is given below, as well as a comparison with existing functions in accuracy of fitting to real cases.

Derivation of Solution to Function Parameters

For a given set of measurements of cumulative progress versus time for a project, standardized as $\{(x_1, y_1), (x_2, y_2), \ldots\}$, the values of a and b of Eq. (8) can be determined by using the least-
squared error method. The error for time $x_t$, $e_t$, is defined as the difference between the actual progress $y_t$ and the calculated progress from Eq. (8) as

$$e_t = y_t - a x_t^3 - b x_t^2 - (1 - a - b)x_t \tag{9}$$

The sum of squared errors for all $x_t$ in the set then is

$$\sum e_t^2 = \sum [y_t - a x_t^3 - b x_t^2 - (1 - a - b)x_t]^2 \tag{10}$$

For simplicity, subscript t is eliminated in the following statements. Since the optimum a and b will minimize the sum of squared errors, a and b can be solved directly by taking partial differentiation of Eq. (10) with respect to a and b each and setting the derivatives to zero as

$$\partial \sum e^2 / \partial a = 0 \tag{11}$$

$$\partial \sum e^2 / \partial b = 0 \tag{12}$$

Through rearrangements of Eqs. (11) and (12) with insertion of Eq. (10), a and b can be evaluated using expressions [Eqs. (13)–(19)] below

$$a = (AB - DE)/(BC - E^2) \tag{13}$$

$$b = (CD - AE)/(BC - E^2) \tag{14}$$

where

$$A = \sum x^3 y - \sum xy - \sum x^4 + \sum x^2 \tag{15}$$

$$B = \sum x^4 - 2\sum x^3 + \sum x^2 \tag{16}$$

$$C = \sum x^6 - 2\sum x^4 + \sum x^2 \tag{17}$$

$$D = \sum x^2 y - \sum xy - \sum x^3 + \sum x^2 \tag{18}$$

$$E = \sum x^5 - \sum x^4 - \sum x^3 + \sum x^2 \tag{19}$$

With the values of a and b of Eq. (8) determined, a fitted S-curve can be constructed. For example, the S-curve fitted to the actual progress data in Fig. 1 is y = −1.025x³ + 2.108x² − 0.083x. Although in some cases fitting Eq. (8) may result in y slightly less than 0 for x close to 0 or y slightly greater than 1 for x close to 1, this is a minor problem, because they happen in regions near the start or the end of a project with little significance, and they may be rectified readily by replacement of 0 or 1 if needed.

Error Measures

Two error measures are used to measure the accuracy of an S-curve formula in terms of closeness of fit as well as to provide a basis for model performance evaluation later: mean square error (MSE) and root-mean-square error (RMSE), as defined next

$$\mathrm{MSE} = \frac{\sum_{j=1}^{d} (\text{calculated}_j - \text{actual}_j)^2}{d} \tag{20}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{j=1}^{d} (\text{calculated}_j - \text{actual}_j)^2}{d}} \tag{21}$$

where d = number of progress measurements for a project; $\text{calculated}_j$ = calculated progress from an S-curve formula for measurement j; $\text{actual}_j$ = actual progress for measurement j.

Both Eqs. (20) and (21) are a consistent accuracy indicator for projects of various durations; while MSE is in line with the least-squared error method for curve fitting, RMSE gives a direct measure of the average error in percentage terms. As an illustration, for the fitted curve in Fig. 1, the MSE and RMSE obtained are 0.000576 and 0.024 or 2.4%, respectively.

Comparison with Other S-Curve Formulas in Fitting Accuracy

The proposed formula [Eq. (8)] was fitted to the actual progress data of the case projects in Skitmore (1992). For each project, a and b were solved using Eqs. (13)–(19) and then MSE and RMSE were obtained with the fitted S-curve using Eqs. (20) and (21). Table 1 gives the mean and maximum of MSE and RMSE for all 27 projects, along with those from Eqs. (2), (3), (6), and (7) for a comparison of fitting accuracy, which were solved using the contour map method (Skitmore 1992) as discussed in the following.

Table 1. Comparison of Fitting Accuracy of Various Formulas for Case Data in Skitmore (1992)

| S-curve formula | Parameters-solving method | Mean MSE | Maximum MSE | Mean RMSE | Maximum RMSE |
|---|---|---|---|---|---|
| Eq. (2) | Contour map | 0.000849 | 0.004107 | 0.0264 | 0.0641 |
| Eq. (3) | Regression/truncated data | 0.000859 | 0.004315 | 0.0260 | 0.0657 |
| Eq. (3) | Contour map | 0.000671 | 0.002350 | 0.0237 | 0.0485 |
| Eq. (6) | Contour map | 0.002843 | 0.015720 | 0.0456 | 0.1254 |
| Eq. (7) | Contour map | 0.000917 | 0.004508 | 0.0267 | 0.0671 |
| Eq. (8) | Optimization by Eqs. (13)–(19) | 0.000714 | 0.003405 | 0.0243 | 0.0584 |

The contour map method in Skitmore (1992) produces a solution to a and b in Eqs. (2), (3), (6), and (7) that achieves higher fitting accuracy than solutions otherwise produced, but it is essentially a systemized trial and error involving large amounts of calculation. On the other hand, the method of regression with truncated data suggested in Kenley and Wilson (1986) gives a direct solution to Eq. (3) as the optimization technique of Eqs. (13)–(19) does to Eq. (8), but its fitting accuracy is not as good. Although the accuracy of Eq. (3) when solved by the contour map method improves, solving Eq. (8) is more straightforward without complicated calculations. Therefore, for this set of cases, Eqs. (3) and (8) have comparable overall performance, while both clearly outperform the rest. Further comparisons between Eqs. (3) and (8) are made later with project data collected in the present study.

Description of Data

To illustrate the development of a model for estimating S-curves, data on the nature and actual progress of 110 construction projects for the second freeway of Taiwan completed in 1991–2001 was
collected. Of this data, nine projects were discarded because of their unusual delays, resulting in a usable set of 101 projects. The projects are spread all over Taiwan covering a variety of work, such as roads, bridges, and service areas, and vary greatly in contract amount and in duration. As is the common practice of valuation of work done, progress measurement for all the projects took place monthly. For each project, the progress versus time data was first standardized by its contract amount and project duration, with the number of pairs of (x, y) obtained equaling the number of measurements or months. a and b of Eq. (8) were then solved using Eqs. (13)–(19) and a fitted S-curve was obtained. See Table 2 for statistics of contract amount, project duration, and values of a and b for the 101 projects (note: NT$ 1 ≈ US$ 0.03).

Table 2. Statistics of Contract Amount, Duration, Parameter a, and Parameter b for Collected 101 Projects

| | Minimum | Maximum | Average | Standard deviation |
|---|---|---|---|---|
| Contract amount (NT$ million) | 60 | 5,663 | 1,617 | 1,050 |
| Project duration (month) | 8 | 117 | 50 | 18 |
| Parameter a | −3.285 | 0.089 | −1.431 | 0.767 |
| Parameter b | 0.300 | 0.5043 | 2.300 | 1.088 |

An analysis of one-to-one correlation among the quantifiable project attributes, i.e., contract amount and duration, and parameters a, b of Eq. (8) for the 101 projects was performed. The coefficients of correlation (COE) obtained are shown in Table 3. The strong and positive correlation between contract amount and project duration appears reasonable. The weak but noticeable correlations between contract amount and a (or b) and between duration and a (or b) may be attributed to effects of these attributes on project progress reflected by the geometric properties of the S-curve. Interestingly enough, a and b are closely and negatively correlated (COE = −0.9643). The above would have implications for development of a model for estimating S-curves with the proposed formula.

Table 3. Coefficients of Correlation among Contract Amount, Duration, Parameter a, and Parameter b for Collected 101 Projects

| | Contract amount | Project duration | Parameter a | Parameter b |
|---|---|---|---|---|
| Contract amount | 1 | | | |
| Project duration | 0.6317 | 1 | | |
| Parameter a | 0.0566 | 0.1860 | 1 | |
| Parameter b | −0.0805 | −0.229 | −0.9643 | 1 |

The accuracy of Eq. (8) in fitting to the 101 projects was compared with that of Eq. (3), which was solved using two methods: regression with truncated data in Kenley and Wilson (1986) and manual parameter adjustment after regression suggested by Evans and Kaka (1998). For solving Eq. (8), in addition to using Eqs. (13)–(19), rectification of any y < 0 and y > 1 by replacement of 0 and 1 after optimization was used as a second method. As in the previous comparison, the mean and maximum of MSE and RMSE for all 101 projects obtained from each method are shown in Table 4. The manual adjustment method for solving Eq. (3) clearly decreases the errors of the original method and achieves an improvement of about 20% in MSE reduction, albeit at the expense of added efforts. On the other hand, the second method for solving Eq. (8) achieves only a marginal MSE reduction of about 5% from optimization only, but its implementation is easier and quicker than the parameter adjustment for Eq. (3). With respect to fitting accuracy, Eqs. (3) and (8) are generally on a par, considering the errors of the two solving methods for each formula together. Therefore, although mathematical proof is not possible and more tests are required to judge conclusively, for the two sets of cases in Skitmore (1992) and herein, the proposed S-curve formula [Eq. (8)], with the advantages of a simpler form and the accompanying convenience in use, is shown to be at least as good as Eq. (3).

Table 4. Comparison between Eqs. (3) and (8) in Accuracy of Fitting to Collected 101 Projects

| S-curve formula | Parameters-solving method | Mean MSE | Maximum MSE | Mean RMSE | Maximum RMSE |
|---|---|---|---|---|---|
| Eq. (3) | Regression/truncated data | 0.000698 | 0.005399 | 0.0242 | 0.0735 |
| Eq. (3) | Ditto + manual adjustment | 0.000586 | 0.003446 | 0.0224 | 0.0587 |
| Eq. (8) | Optimization by Eqs. (13)–(19) | 0.000654 | 0.002869 | 0.0236 | 0.0536 |
| Eq. (8) | Ditto + rectification | 0.000625 | 0.002869 | 0.0230 | 0.0536 |

Description of Model

The idea of the model presented herein is to represent the S-curve by parameters a, b of polynomial Eq. (8) and use neural networks to acquire the ability to predict a, b from actual progress data, with the aim of producing a better early S-curve estimate for a few given project conditions. The attainment of the goal will be evaluated by comparing the accuracy of the model with that of the multiple regression and average curve methods.

Input Factors

The project data was filtered to set up input factors for a model for estimating S-curves. The two quantifiable attributes, contract amount and duration, together measure a project’s relative intensity that affects arrangement of activities, i.e., more work is done concurrently or sequentially, so they were selected. Categorical attributes comprising type of work, address of project, and identity of contractor were considered next. Type of work affects number of trades, lead time, and site character, so it has a bearing on distribution of work and project progress. Project location in Taiwan, an island with significant regional differences in rainy seasons and terrain, also has an effect. Although contractor performance certainly influences progress, the limited information available does not allow a separate indicator for it to be set up. Therefore, only four factors were used as inputs in model development: contract amount, duration, type of work, and location. While contract amount and duration being real numbers can be used as they are, type of work and location being categorical variables require a classification scheme in a model. With respect to type of work, a project is classified into one of three groups: bridges/elevated roads, embankment roads, and service areas/toll
stations including architectural work. With respect to location, a project is classified according to the main geographical divisions of Taiwan into one of three groups: south, central, and north/east. Tables 5 and 6 show the numbers of projects in each group ranked in order of decreasing mean value of a (and in exactly reverse order of b), for type of work and location, respectively. For either factor, equality tests for the mean values of a or b of the groups cannot reject all null hypotheses and conclude that they all differ significantly from each other. However, the combined effects of the four factors on project progress need to be studied within a multifactor model.

Table 5. Classification of Projects by Type of Work and Their Mean Values of Parameter a and Parameter b

| Group number (type of work) | Number of projects | Mean value of parameter a | Mean value of parameter b |
|---|---|---|---|
| 1 (Bridge/elevated roads) | 31 | −1.269 | 2.002 |
| 2 (Embankment roads) | 53 | −1.273 | 2.122 |
| 3 (Service areas/toll stations) | 17 | −2.217 | 3.386 |

Input-Output Mapping

With the input factors identified above, the model for estimating S-curves is intended to establish the following mapping of outputs from inputs:

$$(I_1, I_2, I_3, I_4) \Rightarrow (O_1, O_2) \tag{22}$$

where $I_1, I_2, I_3, I_4$ = inputs characterizing a project: contract amount, duration, type of work, and location, respectively; $O_1, O_2$ = outputs: estimates of a and b of Eq. (8) for the project, respectively.

An S-curve in the form of Eq. (8) would then be generated for a project with the estimated a, b for it. The mapping model is to be developed in two steps: model construction from case projects and test of prediction accuracy with additional cases. While the majority of the collected projects would be used as modeling cases for model construction, a portion of them should be set aside as testing cases for verifying the prediction accuracy of a constructed model. For each project, there are two types of error: the mapping error or the difference between the estimated a, b and the target a, b obtained from Eqs. (13)–(19), and the fitting error or the difference between the calculated progress from Eq. (8) with the target a, b and the actual progress. However, the total error is measured by the difference between the calculated progress from Eq. (8) with the estimated a, b and the actual progress, and RMSE defined in Eq. (21) is used again for the measurement. Ultimately, the performance of a model developed is evaluated in terms of the average and maximum RMSE for all modeling and testing cases.

Neural Networks

Since the relationships between project attributes and project progress are complex, and as the weak linear correlations between contract amount (or duration) and parameters a, b in Table 3 indicate, it is very likely that in Eq. (22) the outputs are nonlinearly related to the inputs. Therefore, neural networks were selected for the intended model, on the grounds that their structure of highly connected nodes with nonlinear transfer functions enables performance of complex multiattribute mapping with combined unknown degree effects of inputs on outputs (Chao and Skibniewski 1995; Chao 2001). In model construction, a neural network is trained with input-output patterns from the collected projects to acquire its mapping ability through repetitively adjusting the connection weights. The nature of Eq. (22) suggests application of a feed-forward multilayer network and a supervised learning technique such as the gradient descent back-propagation (BP) algorithm for training. In model testing, a trained neural network is tested with additional patterns new to the network to determine prediction accuracy using the error measures defined. Previous studies found that a neural network can achieve better performance with proper arrangement and representation of training data and a suitable configuration. Moreover, training needs to consider noise or randomness commonly existent in construction data so as to avoid overtraining to the point that a neural network recognizes only training data but performs poorly in testing.

Out of the rather limited collection of 101 projects, the largest possible part (90) was kept for modeling through training and a minimum yet significant part (11) was kept for testing. To counter bias in data, six training sets (designated modeling samples one to six) and six testing sets (designated testing samples one to six) were prepared from randomly picked projects. The six modeling samples were used to develop six corresponding neural networks, which are then tested by the six complementary testing samples. Prior to training, the contract amounts, durations, and values of a and b for all projects were standardized to the range of 0–1. For the two categorical variables, decimal representation according to the group numbers in Tables 5 and 6 was adopted. MATLAB’s Neural Network Toolbox was used as the training software, because its graphical interfaces, together with Microsoft EXCEL, facilitate formulation, inspection, and revision in model development. Each training session starts with random connection weights initialized by a different seed (MATLAB 2007).

The initial configuration for the neural networks was four input nodes, nine nodes in one hidden layer, two output nodes in the output layer, a learning rate of 0.2, a momentum of 0.9, and the log-sigmoid transfer function. The training error could be decreased fairly easily and always got lower with more training cycles (or epochs), indicating that the networks could adapt to the data. However, it was more difficult to bring down the testing error at the same time, which would stagnate or even increase again past a certain point, a sign of overtraining. Therefore, provided the training error is already low, training was
terminated when the testing error is at a minimum, and the num-
bers of training cycles used for the six networks range between
Table 6. Classification of Projects by Location and Their Mean Values of 20,000 and 30,000. Each network’s performance was then evalu-
Parameter a and Parameter b ated by the previously mentioned total error based on Eq. 共21兲
Number of Mean value of Mean value of against actual progress and the average RMSE for testing cases
Group number 共location兲 projects parameter a parameter b was near 6%, which was considered insufficiently accurate.
Subsequently, a few adjustments and trials were made and the
1 共South兲 41 −1.248 2.042
final configuration was revised to seven hidden nodes and a learn-
2 共Central兲 38 −1.345 2.218
ing rate of 0.7. The numbers of training cycles used now reduced
3 共North and east兲 22 −1.919 2.914
to a low of 10,000. These changes represent a coarser training of

174 / JOURNAL OF CONSTRUCTION ENGINEERING AND MANAGEMENT © ASCE / MARCH 2009

J. Constr. Eng. Manage. 2009.135:169-177.


Table 7. Modeling and Testing Errors of Neural Network, Regression, and Average Curve Models
                    Neural network              Multiple regression         Average curve
Sample              Mean RMSE     Maximum RMSE  Mean RMSE     Maximum RMSE  Mean RMSE     Maximum RMSE
Modeling sample 1   0.0567        0.1498        0.0583        0.1463        0.0624        0.1377
Modeling sample 2   0.0555        0.1429        0.0576        0.1386        0.0614        0.1379
Modeling sample 3   0.0552        0.1338        0.0597        0.1880        0.0618        0.1363
Modeling sample 4   0.0559        0.1381        0.0564        0.1395        0.0603        0.1410
Modeling sample 5   0.0595        0.1867        0.0598        0.1795        0.0622        0.1886
Modeling sample 6   0.0591        0.1815        0.0595        0.1764        0.0628        0.1868
Modeling average    0.0570        0.1555        0.0586        0.1614        0.0618        0.1547
Testing sample 1    0.0452        0.0898        0.0526        0.1130        0.0513        0.1042
Testing sample 2    0.0537        0.1291        0.0585        0.1468        0.0595        0.1206
Testing sample 3    0.0531        0.1029        0.0540        0.1043        0.0571        0.1133
Testing sample 4    0.0591        0.1101        0.0678        0.1489        0.0675        0.1240
Testing sample 5    0.0558        0.0868        0.0583        0.0975        0.0622        0.1051
Testing sample 6    0.0532        0.0836        0.0603        0.1103        0.0624        0.1120
Testing average     0.0534        0.1004        0.0586        0.1201        0.0600        0.1132
Overall average     0.0552        0.1279        0.0586        0.1408        0.0609        0.1340
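The summary rows of Table 7 can be re-derived from the per-sample rows. The following is a minimal sketch in Python, not the authors' procedure: the values are transcribed from the table, and the names `MODELING`, `TESTING`, `column_means`, and `nn_best_on_mean` are hypothetical helpers introduced only for this illustration. It assumes the "Overall average" row is the plain mean over all twelve samples.

```python
# RMSE values transcribed from Table 7. Each row holds, in order:
# NN mean, NN max, regression mean, regression max,
# average-curve mean, average-curve max.
MODELING = [
    (0.0567, 0.1498, 0.0583, 0.1463, 0.0624, 0.1377),
    (0.0555, 0.1429, 0.0576, 0.1386, 0.0614, 0.1379),
    (0.0552, 0.1338, 0.0597, 0.1880, 0.0618, 0.1363),
    (0.0559, 0.1381, 0.0564, 0.1395, 0.0603, 0.1410),
    (0.0595, 0.1867, 0.0598, 0.1795, 0.0622, 0.1886),
    (0.0591, 0.1815, 0.0595, 0.1764, 0.0628, 0.1868),
]
TESTING = [
    (0.0452, 0.0898, 0.0526, 0.1130, 0.0513, 0.1042),
    (0.0537, 0.1291, 0.0585, 0.1468, 0.0595, 0.1206),
    (0.0531, 0.1029, 0.0540, 0.1043, 0.0571, 0.1133),
    (0.0591, 0.1101, 0.0678, 0.1489, 0.0675, 0.1240),
    (0.0558, 0.0868, 0.0583, 0.0975, 0.0622, 0.1051),
    (0.0532, 0.0836, 0.0603, 0.1103, 0.0624, 0.1120),
]


def column_means(rows):
    """Average each of the six columns over the given sample rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nn_best_on_mean(rows):
    """Count samples where the neural network has the lowest mean RMSE."""
    return sum(1 for r in rows if r[0] < r[2] and r[0] < r[4])


if __name__ == "__main__":
    labels = ("Modeling average", "Testing average", "Overall average")
    groups = (MODELING, TESTING, MODELING + TESTING)
    for label, rows in zip(labels, groups):
        print(label, " ".join(f"{v:.4f}" for v in column_means(rows)))
    print("Samples where NN has the lowest mean RMSE:",
          nn_best_on_mean(MODELING + TESTING), "of 12")
```

Because the modeling and testing groups each contain six samples, the mean over all twelve rows equals the mean of the two group averages, so either reading of "Overall average" gives the same figure up to rounding.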

the networks, but they perform better than before; their modeling and testing RMSE are shown in Table 7. Two issues are noted. First, the testing error is sensitive to the weights at the start of training, and a network was retrained a few times with different initial weights to get the lowest error, which is somewhat biased. However, as model performance is judged by the average error of the six networks for the six random samples, the results of such retraining can be interpreted as the best achievable. Second, for some projects, a slight change in the values of a, b will lead to a large total error that is disproportionate to the change. For the total error to be acceptable, the mapping error should be sufficiently low, but it need not be the least mean squared error for a and b, since in any case a network trained to the point of lowest total error should be adopted.

Discussions

Comparison with Multiple Regression and Average Curve Models
Instead of the existing standard S-curve model based on some methods for grouping projects, multiple regression is the benchmark for the neural network model, as it represents a more general averaging technique, while the average curve for all projects is used for another comparison. Corresponding to each neural network above, two multiple regression equations, one with a as the dependent variable and the other with b, were built from the same data of 90 projects as for training, except that binary representation was adopted for the categorical variables. The resulting $R^2$ ranges between 0.20 and 0.26. As before, the regression equations were then used to estimate a and b for the 11 testing projects, and the total errors in RMSE for all modeling and testing projects were obtained. Likewise, an average S-curve based on the mean a and b for the same 90 projects was formed as the third model and used for obtaining the RMSE for each project. As an illustration, for a small project in the first testing sample, the fitted S-curve versus the estimated S-curves from the neural network, multiple regression, and average curve models are shown in Fig. 5, along with their total errors.

Fig. 5. Fitted S-curve versus estimated S-curves from neural network, multiple regression, and average curve models for a project of NT$60 million, 20 months, type of work 3, and location 2

The three models' mean and maximum RMSE for each of the modeling and testing samples are shown in Table 7. For all the samples, the neural network model consistently outperforms the regression and average curve models in terms of mean RMSE, although for a few samples it has a slightly higher maximum RMSE. The neural network model's overall averages of 5.52 and 12.79%, for mean and maximum RMSE, respectively, are both lower than those of the other two models and, hence, can represent an improvement in modeling and prediction accuracy. The neural networks' edge over their regression and average curve counterparts can be attributed to their being adaptive to the data in dealing with the demanding mapping of function parameters.

Effects of Changes in Inputs on Estimated S-Curves
To examine further how the trained neural networks work, a sensitivity analysis is performed using a hypothetical but representative case: a middle-size project with a contract amount of



NT$1,500 million, duration of 40 months, type of work 2 (embankment road), and location 1 (south). To eliminate bias of any particular network, the averages of the estimated a, b from the six neural networks were taken as the model output and used to produce an S-curve from Eq. (8). To see the effects of changes in duration on the S-curve produced, the project scenario was then changed to 20 months and also to 60 months and the same procedure was applied. The resulting three S-curves are shown in Fig. 6. As can be seen, as the duration decreases, the S-curve's inflection point moves nearer to the start and the curve becomes steeper in the middle section with a flatter ending section, because of the changes in the values of a and b. This is reasonable and can be explained by the fact that a crash schedule requires more concurrent activities for the bulk of the work and causes proportionally more work to start earlier and take place in the middle. Conversely, varying the contract amount to NT$2,500 million as well as to NT$500 million while fixing the duration at 40 months has similar effects, as shown in Fig. 7.

Fig. 6. Effect of changes in duration on estimated project progress for example project

Fig. 7. Effect of changes in contract amount on estimated project progress for example project

Relevance to Industry Practitioners
It appears that the presented methodology of using the polynomial function for progress generalization and neural networks for parameter mapping can improve the performance of an empirical approach to S-curve estimation. As it is experience based and does not rely on detailed project information, its result is intended not to replace a schedule-based progress estimate for construction control but to give early estimates for decision-making support. Scenarios in which the proposed model is applicable are suggested below.
For owners/consultants in project planning, the model can predict a baseline S-curve to be used for estimating financing requirements during construction, as the estimated outlay at each period is the estimated percentage progress multiplied by the estimated contract sum. A similar scenario involves sponsors/investors of build-operate-transfer (BOT) projects tendering for a concession contract, who will need a basis for preparation of financial proposals. Next, a design/build (DB) contractor at the tendering stage may find the model useful for obtaining an S-curve that is often required in the bid as well as for his own cash flow forecasting and interest assessment, since at this stage the design is usually not yet complete and is insufficient for producing a meticulous schedule for analytical progress estimation. Even for contractors bidding for construction-only contracts with complete design, the model has the advantage of providing a simple and quick estimate, considering the short time available and that they rarely plan detailed schedules before contracts are awarded (Blyth and Kaka 2006; Kaka and Price 1993). Afterwards, contractors can still use the model to check a schedule-based progress estimate by comparison with the model's estimate to see if the analytical estimate is too optimistic or pessimistic, from the viewpoint of realities in past projects. Similarly, the owner may use it for checking the contractor's planned progress before giving approval.

Conclusions
As mentioned in the introduction, the S-curve, being a simple tool with inherent advantages and limitations, has had its appropriateness questioned by recent researchers when used in contract administration for control purposes during construction. However, for planning and financial forecasting before construction, use of the S-curve is handy and remains the accepted method. Although, when design is complete, the best progress estimate should be based on a schedule of activity times according to detailed project information, an empirical S-curve model developed from real data has the potential advantage of making a realistic estimate with only a few given project conditions. If a reasonable accuracy is achievable, such a model can be used to prepare a preliminary progress estimate at predesign stages when only sketchy project information is available. It can also be used to forecast cash flows at the tendering stage and to check schedule-based estimates.
With application in line with an empirical method, the presented neural network model combined with the proposed S-curve formula has the following strengths over existing ones. First, the polynomial function is of a simpler mathematical form that allows more convenient usage in calculating progress, while its fitting accuracy is comparable to that of the best available so far. Second, the neural network model produces an S-curve by computing the combined effect of four factors and, as the illustrative example shows, the model being adaptive to the case data



can achieve a higher modeling and prediction accuracy than both the average curve model and the more general multiple regression model. Last, the model facilitates sensitivity analysis where project conditions are not certain and effects of changing conditions need to be evaluated.
As is common to all S-curve formulas, the fitting error will always exist; a smooth curve can never capture all the fluctuations of actual progress caused by reasons specific to an individual project, managerial or otherwise. Although this problem is less significant for early financial planning, whose margins for error are large, and other mathematical forms can be explored in the future, heuristics relating to improving the accuracy of a neural network model based on mapping of function parameters are summarized here. The performance of a neural network is subject to how it is trained, so training must consider noise and randomness in the progress data so as not to overtrain it. If a training session seems unlikely to give a good result, the network can be retrained with different initial weights. Retraining is also used to deal with the occasionally volatile relation between the mapping error and the total error; ultimately it is the latter that determines model performance and is to be minimized. To eliminate the bias of any particular network, the joint result of multiple networks developed from different random samples should be taken as the model's solution.
Project data for the present study were collected from a single owner. Future research may use data from different owners covering a wider variety of work. Meanwhile, additional input factors such as those relating to people and other constraints may be considered, along with a more detailed classification scheme for work, to improve on the presented model.

References

Blyth, K., and Kaka, A. P. (2006). "A novel multiple linear regression model for forecasting S-curves." Eng., Constr., Archit. Manage., 13(1), 82–95.
Chao, L. C. (2001). "Assessing earth-moving operation capacity by neural network-based simulation with physical factors." Comput. Aided Civ. Infrastruct. Eng., 16(4), 287–294.
Chao, L.-C., and Skibniewski, M. J. (1995). "Neural network method of estimating construction technology acceptability." J. Constr. Eng. Manage., 121(1), 130–142.
Chen, H.-F. (2003). "A study on the base of construction progress evaluation using banana curves." MS thesis, Dept. of Civil Engineering, National Central Univ., Taiwan.
Evans, R. C., and Kaka, A. P. (1998). "Analysis of the accuracy of standard/average value curves using food retail building projects as case studies." Eng., Constr., Archit. Manage., 5(1), 58–67.
Kaka, A. P. (1999). "The development of a benchmark model that uses historical data for monitoring the progress of current construction projects." Eng., Constr., Archit. Manage., 6(3), 256–266.
Kaka, A. P., and Price, A. D. F. (1993). "Modeling standard cost commitment curves for contractors' cash flow forecasting." Constr. Manage. Econom., 11(4), 271–283.
Kenley, R., and Wilson, O. D. (1986). "A construction project cash flow model—An idiographic approach." Constr. Manage. Econom., 4(3), 213–232.
Kim, Y., and Ballard, G. (2000). "Is the earned-value method an enemy of work flow?" Proc. 8th Annual Conf. of the Int. Group for Lean Construction, IGLC-6, <http://www.iglc.net/conferences/2000/papers/>.
MATLAB. (2007). "Neural network toolbox for use with MATLAB 7.4.0, R2007a." User's guide, Ver. 2, Math Works, Inc., Natick, Mass.
Miskawi, Z. (1989). "An S-curve equation for project control." Constr. Manage. Econom., 7(2), 115–124.
Navon, R. (1996). "Cash flow forecasting and updating for building projects." Proj. Manage. J., 27(2), 14–23.
Peer, S. (1982). "Application of cost-flow forecasting models." J. Constr. Div., 108(2), 226–232.
Skitmore, M. (1992). "Parameter prediction for cash flow forecasting models." Constr. Manage. Econom., 10(5), 397–413.
Tucker, S. N. (1988). "A single alternative formula for Department of Health and Social Security S-curves." Constr. Manage. Econom., 6(1), 13–23.
