Application of Machine Learning Techniques in Project Management Tools
May 2017
world scenarios, the machine learning algorithms must return solutions in real-time or near-real-time, and they must be integrated into a pleasant and easy-to-use graphical user interface. To achieve our goal, we must extensively analyze the existing scientific work in the machine learning field and adapt it to our problem.

2 Related Work

In this thesis we address the problem of IT project failure due to poor planning and bad risk analysis. In this section we review both project management methods and machine learning techniques and applications.

2.1 IT Project Management

Project Management is a discipline whose goal is initiating, planning, executing, controlling, and closing (Project Management Institute, 2004) the work of a team in order to achieve certain predefined success criteria. A Project can be defined as an individual or collaborative endeavor that is planned to achieve a particular aim. A Project Manager, in turn, is the individual responsible for planning and executing a particular project.

In the following sub-sections, some of the popular project design processes are presented in detail. Although project management is a general discipline, we keep our focus on project management techniques applied to IT projects.

2.1.1 The Waterfall model

The waterfall model (Royce, 1970) is a sequential project design process in which the project progresses in order through a sequence of defined steps, beginning with the product concept or requirements analysis and ending with system testing. The waterfall model usually serves as a basis for more complex and effective project life-cycle models.

The number of phases or steps in the waterfall model is flexible and depends on what the project is and on how the project manager wants to organize the project's development; however, the traditional waterfall model usually features the following 5 phases:

• Requirement analysis - The project manager confers with all the project's stakeholders and the project's team and gathers all the requirements in a requirement specification document.
• Design - The requirements of the project are carefully studied by the project's team and the project's architecture is planned.
• Implementation - The development team starts to implement the software based on the design obtained from the previous phase. The software is divided into small units. After the development of all the units is completed, they are all integrated.
• Verification - In the verification phase, the software is properly tested against the project's requirements and use cases.
• Maintenance - This phase consists of making small changes in the software to adapt it to possible changes of the requirements that arose during the previous phases.

The biggest disadvantage of the waterfall model is that it is usually very hard to fully specify the project requirements at the beginning of the project, before any design or development has been performed. The person or organization that ordered the development of the project doesn't always know, at that early stage, what they really want to be built. Non-technical users and organizations usually start a project with the goal of solving at least one problem, but some of the requirements needed to achieve that goal are very often only fully discovered during the design or development of the actual project.

Despite the previously pointed out disadvantages, the waterfall model works quite well as a life-cycle model for projects that are well understood from a technical point of view and that have had a stable requirements definition since the beginning of the project planning (McConnell, 1996).

2.1.2 Spiral model

The spiral model (Boehm et al., 1987) is a risk-driven process model. It partitions the project development process into multiple flexible loops that are appropriate to the context of the process being developed, but it can usually be broken down into the following steps:

• Determine objectives / Planning - During this phase the project's requirements are all gathered into a requirement specification document.
• Risk analysis / Identify and resolve risks - If any risk is found during this phase, the team will suggest alternatives to try to mitigate the risk, or an appropriate contingency plan will be created.
• Development and tests - This is the phase where the actual software is developed. After the development is completed, the software is then properly tested.
• Plan the next iteration - In this phase the customer evaluates the project in its current state before it goes into the next spiral.

A project developed using the spiral model is planned to start small and to incrementally expand its scope. The scope of the project is only expanded after the project manager reduces the risks of the next increment to a level that he/she considers acceptable. The spiral model is, therefore, highly dependent on risk analysis, which must be done by people with specific expertise.
In the spiral model, costs typically increase as the project is being developed, but risks decrease (McConnell, 1996). For a project that needs to be developed quickly using this model, the more time and money spent, the fewer risks the project leader is taking.

As the spiral model life cycle is risk-oriented, it is able to provide very valuable feedback even in early stages of the project. If the project cannot possibly be done for technical or financial reasons, the project manager will find out early and will be able to cancel the project before too much time and money has been spent. The spiral model's cycles, just like those of most project management models, are flexible and can be adapted to each project's needs.

2.2 Software Tools for Project Management

There are currently many software tools on the market whose goal is to help project managers plan their projects, organize tasks, perform risk assessment and provide other related features. Some of these tools are briefly described in the following sub-sections. Even though there is a considerable amount of good project management software, most of these tools do not offer any kind of "machine learning" solutions to assist the users.

2.2.1 Microsoft Project

Microsoft Project is a project management software developed by Microsoft, and as of 2017 it is the most widely used project management program in the world (http://project-management.zone/ranking/planning). Microsoft Project allows project managers to develop project plans, create and assign resources to tasks, track the progress of the project, among other features. The project flow can be visualized in Gantt charts.

2.2.2 Gantt charts

A Gantt chart is a horizontal bar chart that serves as a production control tool in project management and illustrates the project schedule (Wilson, 2003). This type of diagram makes it easy for both the project manager and the project's team members to see which activities have to be done and when.

2.3 Machine Learning and Project Management

Machine learning is a field of Computer Science whose goal is to give computers the ability to learn without being explicitly programmed. In this thesis we study the practical application of machine learning approaches to project management software, with the goal of assisting the project manager in developing better project plans and in identifying and mitigating risks at an early stage of the project.

2.3.1 Instance-based learning

Instance-based learning is a family of machine learning algorithms that compares new problem instances with instances previously observed during training, instead of performing explicit generalization like other algorithms, such as neural networks, do (Aha et al., 1991).

In our particular domain, we want our system to learn each individual user's patterns and provide information that will guide him/her over the course of his/her project management activities. In other common types of machine learning, a large amount of data is collected to train a learning model, such as logistic regression or artificial neural networks. These models learn an explicit representation of the training data and can then be used to classify new examples. The problem with this approach is that it requires a considerable amount of training data, whereas in our practical domain we want to help the user of our system by providing suggestions or identifying project risks as early as possible. The suggestions and risk analysis depend on each user and project type, so in our domain it is not appropriate to construct a single large dataset for the entire platform. The user-dependent information on the system is what must be used to provide information to that particular user. It is for this particular reason that instance-based learning must be explored in our system, as it is able to perform machine learning on a small and dynamically growing amount of data.

2.3.2 k-Nearest neighbors algorithm

The k-Nearest Neighbors (k-NN) algorithm (Altman, 1992) is a very simple instance-based learning algorithm where each training example is defined as a vector in R^n and is stored in a list of training examples every time it is observed. All the computation involved in classification is delayed until the system receives a classification query. A query consists of performing either classification or regression on the query point.

• Classification – The output of the algorithm is an integer that denotes the class membership of the majority of the input point's k nearest neighbors.
• Regression – The output of the algorithm is a real-valued number y that is the average of the values of the k training examples nearest to the input query point.

The most commonly used distance metric in the k-NN algorithm is the Euclidean distance:

d(x1, x2) = sqrt(Σ_{r=1}^{n} (a_r(x1) − a_r(x2))^2)

where x1 and x2 are two data points in R^n and a_r(x) denotes the value of the r-th coordinate of the data point x. Training a k-NN model can be done using Algorithm 1 and k-NN classification can be performed with Algorithm 2.
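As a minimal sketch of Algorithms 1 and 2, lazy "training" amounts to storing the examples, and classification/regression search the stored list at query time (illustrative code with NumPy; `knn_classify` and `knn_regress` are hypothetical names, not part of any library):

```python
import numpy as np
from collections import Counter

def knn_classify(examples, labels, x_q, k=3):
    """Return the majority label among the k training examples nearest to x_q."""
    X = np.asarray(examples, dtype=float)
    d = np.sqrt(((X - np.asarray(x_q, dtype=float)) ** 2).sum(axis=1))  # Euclidean distances
    nearest = np.argsort(d)[:k]                       # indices of the k nearest neighbors
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def knn_regress(examples, targets, x_q, k=3):
    """Return the average target value of the k nearest neighbors."""
    X = np.asarray(examples, dtype=float)
    d = np.sqrt(((X - np.asarray(x_q, dtype=float)) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]
    return float(np.mean([targets[i] for i in nearest]))

# "Training" (Algorithm 1) is just storing the examples:
X_train = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.8]]
y_train = [0, 0, 1, 1]
print(knn_classify(X_train, y_train, [0.2, 0.1], k=3))  # → 0
```

Note that all the cost is paid at query time: each classification scans the whole stored list, which is exactly the trade-off that makes the method suitable for small, dynamically growing datasets.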
Algorithm 1 k-NN Training algorithm
1: procedure Train
2:   Let D_examples be a list of training examples
3:   For each training example (x, f(x)), add the example to the list D_examples

Algorithm 2 k-NN Classification algorithm
1: procedure Classify
2:   Let x_q be a query instance
3:   Let x_1 ... x_k be the k instances from the training examples nearest to x_q by a distance metric D
4:   Return f(x_q) := argmax Σ_{i=1}^{k} f(x_i)

2.3.3 Distance-weighted nearest neighbor algorithm

The k-nearest neighbor algorithm can be modified to weight the contribution of each of the neighbors of the query point x_q, giving greater weight to close neighbors rather than distant ones. Clearly, when classifying a query point with the k-NN algorithm with k = 5, if 3 of the 5 nearest neighbors are clustered very far from the query point x_q and from its 2 other neighbors, they may introduce noisy error into the classification of the point.

By weighting the contribution of each point according to its distance, we considerably reduce this problem. For example, the weight w of each stored data point i can be calculated with the expression below, which consists of the inverse square distance between the query point and each of the data points:

w_i = 1 / d(x_q, x_i)^2

with d denoting the selected distance metric.

With this tweak to the k-NN algorithm, the classification algorithm doesn't even need to search for the k nearest neighbors of the query point, as the inverse-square distance weight almost eliminates the contributions of distant points, so it is now appropriate to perform classification using the contribution of the entire stored training set.

2.3.4 The curse of dimensionality

Dealing with data points in a high-dimensionality hyperspace brings certain difficulties to nearest neighbor search algorithms. For example, let x ∈ R^n with n = 50 be a vector where each coordinate represents one of the 50 features available in that specific domain. Let's suppose that we want to perform a classification task on that dataset, but only 3 of the 50 dimensions of the vector are actually relevant for our classification. While the 3 dimensions that represent features relevant to the classification could form clusters of objects of the same category (e.g. the vectors of the points of the same category are near in space according to some distance metric d), the other 47 dimensions could make these points of the same class end up very far apart, rendering the nearest neighbor algorithm completely useless without previous data processing.

This problem is called the "curse of dimensionality" (Bellman, 2003) and occurs due to the fact that there is an exponential increase in volume when adding extra dimensions to a mathematical space. The standard implementation of the k-NN algorithm can therefore be rendered useless on vector spaces with a high number of dimensions. In this case, the main problem stops being to find the k nearest neighbors of the queries that are passed to the algorithm, but rather to find a "good" distance function that is able to capture the similarity of two feature vectors, where each feature has a different weight contribution to the distance function, based on the importance of that feature in the context of the problem itself.

2.3.5 Feature-weighted nearest neighbor

One of the ways of overcoming the curse of dimensionality problem is to weight the contribution of each feature (Inza et al., 2002; Tahir et al., 2007) when performing k-nearest neighbor search. The feature vector is composed of a meaningful representation of an instance of an individual model, but not all the features have the same importance in each specific classification or regression problem. In fact, some of the selected features may even be found to be irrelevant for a particular problem.

For example, the following equation can be used to calculate the feature-weighted Euclidean distance between two feature vectors:

d(x1, x2) = sqrt(Σ_{r=1}^{n} (w_r·a_r(x1) − w_r·a_r(x2))^2)

where w_r denotes the weight of the r-th feature in the feature vectors.

2.4 Regression models

2.4.1 Logistic regression

Logistic regression is a regression model where the dependent variable (i.e. the variable that represents the output value of the model) is categorical (Freedman, 2005). Logistic regression can have either a binomial dependent variable (in the case where there are only two target categories) or a multinomial one (when there are more than two categories).

The goal of logistic regression is to create a model that receives an input vector x (e.g. a vector that represents a real-world object) and outputs an estimate y. The following expressions can be used to calculate the prediction of a given logistic regression model:

y = σ(W x + b)   (1)

σ(z) = 1 / (1 + e^(−z))   (2)
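The inverse-square weighting of Section 2.3.3 can be sketched over the entire stored training set as follows (illustrative code, not the dissertation's implementation; the small `eps` is an added guard against division by zero when the query coincides with a stored point):

```python
import numpy as np

def weighted_vote_classify(X_train, y_train, x_q, eps=1e-12):
    """Classify x_q with inverse-square distance weights over ALL stored examples."""
    X = np.asarray(X_train, dtype=float)
    d2 = ((X - np.asarray(x_q, dtype=float)) ** 2).sum(axis=1)  # squared Euclidean distances
    w = 1.0 / (d2 + eps)                                        # w_i = 1 / d(x_q, x_i)^2
    scores = {}
    for wi, yi in zip(w, y_train):
        scores[yi] = scores.get(yi, 0.0) + wi                   # accumulate weight per class
    return max(scores, key=scores.get)

X_train = [[0.0, 0.0], [0.1, 0.1], [4.0, 4.0]]
y_train = [0, 0, 1]
print(weighted_vote_classify(X_train, y_train, [0.05, 0.05]))  # → 0
```

Because far-away points receive near-zero weight, no explicit k-nearest search is needed, matching the observation in the text.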
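The feature-weighted Euclidean distance of Section 2.3.5 can likewise be sketched in a few lines (a sketch only; it assumes the per-feature weights w_r are already known, e.g. obtained from a regression model as discussed in Section 2.4.1):

```python
import numpy as np

def weighted_euclidean(x1, x2, w):
    """Feature-weighted Euclidean distance: each coordinate is scaled by its weight w_r."""
    x1, x2, w = (np.asarray(a, dtype=float) for a in (x1, x2, w))
    return float(np.sqrt(np.sum((w * x1 - w * x2) ** 2)))

# With the third (noisy) feature weighted to zero, it no longer affects the distance:
print(weighted_euclidean([1, 2, 9], [1, 2, -5], [1.0, 1.0, 0.0]))  # → 0.0
```

Setting a weight to zero is the limiting case of the idea in the text: an irrelevant feature is filtered out of the similarity computation entirely.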
The output of a logistic regression model consists of the weighted average of the inputs passed through a sigmoid activation function. The weights W and the bias term b are the parameters of the model that need to be learned. If the non-linear activation function were not used, then logistic regression would be the same as linear regression (Zou et al., 2003). The sigmoid function σ is bounded between 0 and 1 and is a simple way of introducing a non-linearity into the model and conveying a more complex behavior.

In order to train the regression model (i.e. to find the desired values for the parameters W and b), we can employ numerical optimization techniques such as Stochastic Gradient Descent (SGD) (Ruder, 2016).

Regression can be used not only to train a model that is able to make predictions, but also to weight the contribution of each feature in the feature vector for a particular problem. The magnitude of each learned parameter in binomial logistic regression does not directly represent the "semantic importance" of the feature that the parameter is associated with, in the context of the problem being modeled. For example, the feature vector f of dimension n could have one feature that represents the mere scaling of another feature (e.g. f1 = 100·f0). In this case, the learned coefficients associated with those features would differ by a factor of about 100, while at the same time the semantic importance of features f0 and f1 would be the same.

However, at the same time, the model will tend to learn smaller coefficients for less important features. As an example, if feature f3 consists of random noise and has no semantic importance for the classification problem whatsoever, then, as long as the model converges during training, a small coefficient will be learned so that the noisy feature gets filtered out.

2.4.2 Cost function

A cost function (also called a loss function) is a function C used in numerical optimization problems that returns a real scalar number representing how well a model maps training examples to the correct output. The binary cross-entropy function is a way of comparing the difference between two probability distributions t and o, and it is usually used as the cost function of binary logistic regression (when the output of the logistic regression is a scalar binary value - 0 or 1):

crossentropy(t, o) = −(t·log(o) + (1 − t)·log(1 − o))   (3)

where t is the target value and o is the output predicted by the model. When the logistic regression algorithm is used to classify data into multiple categories, the categorical cross-entropy function is usually used as the loss function:

H(p, q) = − Σ_x p(x)·log(q(x))   (4)

where p is the true distribution and q is the coded distribution (the one that results from the model's predictions at any given moment in time).

2.4.3 Stochastic Gradient Descent

Gradient descent is a first-order numerical optimization algorithm that is commonly used to train differentiable models (Ruder, 2016). A model can be trained by defining a cost function, which is then minimized using the Gradient Descent algorithm, which updates the model parameters at each time-step by applying the following rule:

θ := θ − η·(δ/δθ)·J(θ)   (5)

where η is the learning rate, θ is the parameter to be learned, and J is the cost function. This rule is applied iteratively and changes the value of the parameter until the model converges to a local minimum of the cost function.

2.4.4 Model overfitting and underfitting

In a statistical model, overfitting is the term used to describe a model that captures the noise or the random error in the training data, rather than the true underlying relationship between the input vector and the output value. Overfitting usually occurs when the model is either "too big" or the training data passed to the training algorithm is not enough. Conversely, underfitting occurs when the model is too simple to capture the underlying relationship in the data.

2.4.5 Dynamic Learning Rate

It is common practice to decay the learning rate during the training process, much like in the Simulated Annealing algorithm (Kirkpatrick and Gelatt, 1983). The following rule, if applied at each training epoch t, will exponentially decay the learning rate:

η_{t+1} := r·η_t   (6)

where r is a constant that denotes the decay ratio of the learning rate at each epoch.

3 Adaptive Risk Analysis Tool

Before we describe our system, we need to explain in detail the problem we address, which parameters we have to consider, and which restrictions/constraints we have to take into account. The following sections describe the problem, the data relevant to the problem and how it is organized, as well as the explored techniques.

3.1 Problem Formulation

Our problem consists of performing risk analysis at the "milestone" level. In order to perform risk analysis on a particular milestone that is being planned at a certain point in time, we must find similar milestones in the project manager's previous history and check their registered problem occurrences.
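Putting equations (1)-(3), (5) and (6) together, a binary logistic regression model can be trained with gradient descent and an exponentially decaying learning rate. The sketch below is for illustration only: it uses full-batch gradients rather than stochastic mini-batches, and the toy data and constants are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))           # equation (2)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)     # a linearly separable toy target

W, b = np.zeros(2), 0.0
eta, r = 0.5, 0.99                            # learning rate and decay ratio (eq. 6)
for epoch in range(300):
    o = sigmoid(X @ W + b)                    # predictions, equation (1)
    # Gradient of the binary cross-entropy (3) with respect to W and b:
    grad_W = X.T @ (o - y) / len(y)
    grad_b = np.mean(o - y)
    W -= eta * grad_W                         # update rule, equation (5)
    b -= eta * grad_b
    eta *= r                                  # exponential learning-rate decay

acc = np.mean((sigmoid(X @ W + b) > 0.5) == (y > 0.5))
print(f"training accuracy: {acc:.2f}")
```

On this separable toy problem the model quickly reaches near-perfect training accuracy; on real data, held-out accuracy is what matters, precisely because of the overfitting issue discussed in Section 2.4.4.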
3.2 Data Models

Creating a solution that helps teams manage their projects, by providing suggestions and identifying risks based on each user's previous history, requires storing information in several appropriate data models. The structure of these models and their purpose are described below. A Project is divided into Milestones and each Milestone is divided into Tasks. These three data models store the information necessary to keep track of the structure of the projects in our platform. A milestone can be independent from all the others, but it can also be connected to another milestone that must have been completed previously. Figure 1 shows a UML representation of the data models detailed in the following subsections.

• Project (Name, Start Date, Due Date)
• Milestone (Name, Project (FK), Previous Milestone (FK), Type, Start Date, Due Date)
• Task (Name, Project (FK), Milestone (FK), Type, Expected Duration, Due Date, Status, Complete Date)
• Problem (Name)
• ProblemOccurrence (Problem (FK), Milestone (FK), Task (FK), Date)
• ChangeEstimate (Milestone (FK), Task (FK), Date, Old Estimate, New Estimate)

We also need models that store the feedback from the users throughout the project. The ProblemOccurrence model has the purpose of storing problems that any team member may experience during the project, while the ChangeEstimate model has the purpose of keeping track of the changes in the duration of tasks made by the project manager or by any user assigned to that task.

Figure 1: UML diagram representing the data models

In this Section, we provide a description of the techniques used to develop a solution to the problem of creating a machine learning approach that helps project managers plan their projects. As we want to perform a risk analysis per milestone, we need to compare the current milestone, which is being planned by the project manager, with previous milestones that are stored in the system. In order to do this, we must list the features that will be used to create a feature vector for each milestone.

The features of each milestone include the Number of Users, Number of Tasks, Duration, Type, Average duration of tasks, duration of the project, the order of the milestone in the project, the standard deviation of the duration of the tasks, and a histogram of the Task types.

3.3.1 Feature representation

The Type of a milestone is a categorical feature which can take any value c in the set of milestone types L_MilestoneTypes. Let l = {A, B, C, D} be the list of milestone types of a particular project manager. Without any prior information about similarities between types, we must assume that all the types are equally different from each other. If we choose to represent each category as its index in the list (a single integer value), we run into problems when using distance metrics such as the Euclidean distance, as the distance between type A and type B would be lower than the distance between type A and type D.

In order to address this issue, the milestone Type feature can be represented as a one-hot vector, i.e. a vector that has zero magnitude in all its dimensions except one, where the magnitude is equal to 1. This vector v ∈ R^k, where k is the number of categories, has magnitude 1 in the dimension that corresponds to the index of the Type in the set of possible milestone types. As an example, to represent the type C ∈ l we use the following one-hot vector:

F_Type = [0, 0, 1, 0]^T   (7)

In order to solve our problem, we use two types of machine learning algorithms: instance-based learning algorithms and regression models.

3.3.3 Nearest neighbor algorithms

One of the approaches used to solve our problem is to use nearest neighbor search in order to find similar milestones and perform risk analysis during the planning of the current project.

3.3.4 Top milestone risks

The output of the nearest neighbor algorithm consists of the k nearest neighbors to the query point. In our domain, the nearest neighbors are milestone vectors. We check whether the milestones associated with these vectors have any problem occurrences associated with them. In our data models, problem occurrences are associated with individual tasks. As a milestone is composed of multiple tasks, a milestone can have a problem of a certain type associated with itself
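The one-hot encoding of the milestone Type described in Section 3.3.1 is straightforward to implement (a minimal sketch; the type list l = {A, B, C, D} follows the example above):

```python
import numpy as np

def one_hot(value, categories):
    """Encode a categorical value as a one-hot vector over the given category list."""
    v = np.zeros(len(categories))
    v[categories.index(value)] = 1.0   # magnitude 1 at the index of the category
    return v

types = ["A", "B", "C", "D"]
print(one_hot("C", types))             # → [0. 0. 1. 0.]

# Any two distinct types are now equally far apart (Euclidean distance sqrt(2)),
# which is exactly the property the integer encoding lacked.
```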
multiple times. If so, we then check what the most common problem associated with that milestone is.

3.3.5 Representing Problem Occurrences

As our system is designed to be an open web platform used by many project managers working at the same time, and as we need to provide user-specific suggestions, we needed a way of representing the problem occurrences in our algorithms. In a real-world scenario, a wide variety of problem types may occur during the course of a project. We represent problem types as integers: an integer number represents the index of a problem in the user-specific problems list. For example, if a user experienced problems of type A, B and C in his/her past history, then the problem-integer mapping will become

3.3.7 Regression models

We also tried an alternative approach to the nearest neighbor algorithms by building a logistic regression model that is able to predict a potential project risk from an input milestone (i.e. the milestone that a project manager has just introduced into the system). In order to do this, we build a training set T consisting of (x, y) tuples, where x is a milestone vector and y is an integer that represents the problem instance that occurred while the project team was working on that particular milestone.

The probability of each problem type happening can be calculated with expression 8:

P(Y = i | x, W, b) = softmax_i(W x + b) = e^(W_i x + b_i) / Σ_j e^(W_j x + b_j)   (8)
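Expression 8 can be computed directly from a weight matrix W and bias vector b (a sketch with arbitrary example numbers; subtracting the maximum score before exponentiating is a standard numerical-stability trick that does not change the result):

```python
import numpy as np

def softmax(z):
    """Softmax over the class scores z (expression 8)."""
    e = np.exp(z - np.max(z))          # subtract max for numerical stability
    return e / e.sum()

W = np.array([[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]])  # one row per problem type
b = np.array([0.1, 0.0, -0.2])
x = np.array([1.0, 2.0])                               # a milestone feature vector

p = softmax(W @ x + b)                 # P(Y = i | x, W, b) for each problem type i
print(p, p.sum())                      # the probabilities sum to 1
```

The predicted "top risk" for a milestone is then simply the problem type i with the highest probability p_i.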
Project Manager (PM) will learn from his/her past experience. For example, initially the project manager may begin
UIs that are appealing to the users and that look good on a wide range of screen sizes, including on mobile devices.

4.3 Database

The database component is used to store information in a persistent and consistent way and has no special features or functionalities. The web platform is connected to the database and reads and writes data. The database is a key component of this platform, as it is where all the data provided by the users is stored.

4.3.1 Implementation

We do not need any particular platform-specific features to develop or deploy our solution. We rather just need the basic database operations to access and manipulate the data: Select, Insert, Update and Delete.

4.3.2 Model

This project uses a relational database as its storage system. There are currently many other possibilities, but relational databases were chosen because they are widely used and they allow the practical implementation of the models presented in Section 3.2 with minor changes. It is not in the scope of this dissertation to explore alternative methods of storing information.

4.3.3 Database Management System

There are plenty of DBMS that implement a relational model. Despite having different features, all of those DBMS share a common base, and we are only interested in those base features, so our choice of DBMS was again based on easy access to infrastructure.

To develop this platform we used SQLite, a very lightweight database management system that stores information in a single small file on the OS file system, which makes it very portable. SQLite has the advantage of not requiring any particular configuration or database server. On the other hand, SQLite has poor performance and does not support user management, but these issues are not relevant during development time.

4.4 ProjManagerLearnerOpt

The ProjManagerLearnerOpt component is used to perform all the machine learning operations of the platform. This component is connected to both the database and the web platform. When a user performs an operation that requires a machine learning response (for example, the user has just finished planning a project milestone and wants to perform risk analysis on that particular milestone), the web platform queries the ProjManagerLearnerOpt system, which then makes database queries, computes the result using the appropriate algorithms and returns the response to the web platform.

The reason why the ProjManagerLearnerOpt component is kept separate from the web platform (not only in the system architecture, but also in the implementation itself) is to make it possible for the platform to scale in production. The web platform could itself do all the tasks that the ProjManagerLearnerOpt component does, but we want to make sure that the machine learning component, which performs all the heavy computation, can easily be replicated across several servers, depending on the number of users using the platform at a particular point in time. In this scenario the web platform would use a load balancer that would decide which ProjManagerLearnerOpt server is available to serve the user's requests.

4.4.1 Implementation

The ProjManagerLearnerOpt module was developed in the Python language, just like the web platform module. The reason for this choice is that Python is one of the most used programming languages for machine learning and data science, offering a great number of highly optimized and actively maintained machine learning libraries, such as scikit-learn, Tensorflow, Theano, Pylearn2 and Caffe.

In this module we often need to perform many mathematical computations, and for that we use Python's NumPy library, the fundamental library for scientific computing in Python. As Python is an interpreted language, code runs slower than in compiled programming languages such as C and C++. In order to address this issue, the NumPy library has many of its operations implemented in C, bringing much faster running times to Python (van der Walt et al., 2011).

5 Evaluation

In the previous sections we described the architecture and inner workings of our solution, whose goal is to help IT project managers better plan their projects. It is, therefore, of the utmost importance to assess the quality of the developed solution by performing evaluation tests and drawing conclusions from the obtained results.

5.1 Evaluation Sets and Environment

In order to evaluate our solution we programmatically created several datasets. The task of helping project managers during the planning of their projects depends not only on the project type and features, but also on the project manager himself/herself. An inexperienced project manager can, for example, underestimate the time necessary to complete a certain type of task and thereby create delays in the delivery of a milestone, and thus, of the project itself. As the learning task is dependent on the project manager himself/herself, it is necessary to create datasets that take this situation into account.

The datasets consist of several lists of Projects, Milestones and Tasks, where each milestone can be associated
with at least one Problem instance (that represents an occurrence that happened during the course of that milestone). Each of these components, as described in Section 3, has several defining properties. Assigning random values to all the properties of the instances of these models would render the learning task impossible and useless. The properties of the milestones and tasks are therefore assigned values that lie within intervals of realistic values. In order to make the dataset useful and interesting, we introduce certain "biases" into the dataset generator algorithm. For example, milestones of type "A" with more than "n" tasks have a "k%" probability of reporting the occurrence of a problem of type "P".

Another important factor to consider while building the datasets is that project managers learn over time. For example, an inexperienced project manager that uses our platform may underestimate the time needed to complete the milestones in the beginning, but may learn how to correctly estimate the durations over time. To address this real-life scenario, part of the evaluation sets are also generated taking this into account.

5.2 Solution quality

the web platform to our machine learning back-end that is generic enough to be replaced by a different one with a minimum amount of effort from the developers.

6 Conclusions

6.1 Summary

In this dissertation we proposed a system that has the goal of helping project managers improve their planning by performing risk analysis based on their previous professional history every time they plan a project milestone. Our literature review showed that there are two main types of machine learning algorithms that contribute to solving this problem: instance-based learning and regression models. We developed models using both approaches. Our tests show that both techniques are able to give quite satisfactory results when applied to real-world scenarios. We were able to integrate these algorithmic solutions into a platform that can be used by project managers and their teams in order to be more efficient and improve their project success rates by discovering and mitigating potential risks before they happen.
            Number of  Number of   Number of  PM learns
            projects   milestones  tasks      over time
Dataset 1   1          16          100        No
Dataset 2   3          30          200        No
Dataset 3   3          30          200        Yes

Table 1: Description of the datasets used to test our algorithms against different scenarios