Decision Tree:
A decision tree is a non-parametric supervised learning algorithm used for both
classification and regression tasks. It has a hierarchical tree structure consisting of a root
node, branches, internal nodes, and leaf nodes.
Advantages
• Easy to interpret: The Boolean logic and visual representations of decision trees make
them easier to understand and consume. The hierarchical nature of a decision tree also
makes it easy to see which attributes are most important, which isn’t always clear with
other algorithms, like neural networks.
• Little to no data preparation required: Decision trees have a number of
characteristics that make them more flexible than other classifiers. They can handle various
data types, i.e. discrete or continuous values, and continuous values can be converted
into categorical values through the use of thresholds. They can also handle
data with missing values, which can be problematic for other classifiers, like Naïve
Bayes.
• More flexible: Decision trees can be used for both classification and regression
tasks, making them more flexible than some other algorithms. They are also insensitive to
underlying relationships between attributes; this means that if two variables are highly
correlated, the algorithm will only choose one of the features to split on.
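The threshold idea mentioned under data preparation can be illustrated with a short sketch. The cut points (30 and 60) and the labels below are assumptions chosen only for illustration; they do not come from the notes.

```python
from bisect import bisect_left

def to_category(value, thresholds, labels):
    """Map a continuous value to a label.

    thresholds must be sorted ascending; labels needs one more entry
    than thresholds. A value equal to a threshold falls in the lower bin.
    """
    return labels[bisect_left(thresholds, value)]

# Hypothetical cut points and labels (assumed, not from the notes).
thresholds = [30, 60]
labels = ["young", "middle", "senior"]

ages = [22, 45, 71]
categories = [to_category(age, thresholds, labels) for age in ages]
print(categories)
```

A tree split of the form "age <= 30" is the same kind of threshold test, applied one cut point at a time.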
Disadvantages
• Prone to overfitting: Complex decision trees tend to overfit and do not generalize well
to new data. This scenario can be avoided through the processes of pre-pruning or post-
pruning. Pre-pruning halts tree growth when there is insufficient data while post-
pruning removes subtrees with inadequate data after tree construction.
• High variance estimators: Small variations within the data can produce a very different
decision tree. Bagging, or the averaging of estimates from several trees, is one way to reduce
the variance of decision trees. However, this approach is limited as it can lead to highly
correlated predictors.
• More costly: Given that decision trees take a greedy search approach during
construction, they can be more expensive to train compared to other algorithms.
• Not fully supported in scikit-learn: Scikit-learn is a popular machine learning library
for Python. While this library does have a decision tree module
(DecisionTreeClassifier), the current implementation does not
support categorical variables.
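A common workaround for the categorical-variable limitation is to encode each category as its own 0/1 column before training. The sketch below does this in plain Python; the column names ("soil", "area") and the helper `one_hot` are hypothetical, and in practice a library encoder such as scikit-learn's OneHotEncoder would normally be used instead.

```python
def one_hot(rows, column):
    """Replace one categorical column with a 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every other column unchanged.
        new_row = {key: val for key, val in row.items() if key != column}
        # Add one indicator column per category value.
        for cat in categories:
            new_row[f"{column}={cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded

# Hypothetical records (assumed for illustration only).
rows = [
    {"soil": "clay", "area": 3.0},
    {"soil": "sand", "area": 1.5},
    {"soil": "clay", "area": 2.0},
]
encoded = one_hot(rows, "soil")
print(encoded[0])
```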
ANUPAM GOPAL 2023M B 56 |2
i) Decision node
This indicates a point where a decision has to be made and is
represented by a closed square. Most decision tree diagrams
start with a decision node.
iv) Branches
Every time a decision is made, it leads to different nodes. A
branch connects these nodes and represents a situation or
a result. The result is usually written over the branch in plain
text.
Question 1:
ONGC has bought a piece of land and wants to decide whether to drill it or sell it.
1) The land has a 25% probability of containing oil and a 75% probability of being dry.
2) ONGC can run a seismic experiment first and then decide whether to drill or sell the
land. The seismic experiment costs 2M$. If the result is favorable, which has a
40% probability, the chance of finding oil is 60%. If the result is
unfavorable, which has a 60% probability, the chance of finding oil is 10%. The drilling
operation costs 5M$. If oil is found, the company will earn 40M$. On the
other hand, if the company sells the land, it will earn 4M$.
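One possible way to roll back this tree is sketched below. It rests on assumptions the question does not state: the 40M$ is treated as revenue before the 5M$ drilling cost, the purchase price of the land is ignored as already spent, and the land can still be sold for 4M$ after the seismic experiment. All names in the code are illustrative.

```python
# All amounts in millions of dollars.
SEISMIC_COST = 2.0
DRILL_COST = 5.0
OIL_REVENUE = 40.0   # assumed to be revenue before the drilling cost
SELL_PRICE = 4.0

def drill_emv(p_oil):
    """Expected monetary value of drilling when oil has probability p_oil."""
    return p_oil * OIL_REVENUE - DRILL_COST

def best_action(p_oil):
    """Pick the better of drilling and selling at a decision node."""
    options = {"drill": drill_emv(p_oil), "sell": SELL_PRICE}
    action = max(options, key=options.get)
    return action, options[action]

# Branch 1: decide immediately, using the 25% prior.
no_survey_action, no_survey_emv = best_action(0.25)

# Branch 2: pay for the seismic experiment, then decide per result.
fav_action, fav_emv = best_action(0.60)      # favorable result, P = 0.40
unfav_action, unfav_emv = best_action(0.10)  # unfavorable result, P = 0.60
survey_emv = 0.40 * fav_emv + 0.60 * unfav_emv - SEISMIC_COST

print("No survey:", no_survey_action, round(no_survey_emv, 2))
print("Survey, favorable:", fav_action, round(fav_emv, 2))
print("Survey, unfavorable:", unfav_action, round(unfav_emv, 2))
print("Survey EMV:", round(survey_emv, 2))
```

Under these assumptions, drilling directly is worth about 5M$, while running the seismic experiment first (drill on a favorable result, sell on an unfavorable one) is worth about 8M$.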
Question 2:
A company owns a piece of land near NH-8. The size of the land is 3 hectares. The owner
has 3 options for this land.
Option 1: The land is acquired by the developing authority (NHAI) at a price of
3 lakh/hectare, or the land remains unused. If unused, there will be a security cost of 0.25 lakhs.
Option 2: The land owner leases the land to an educational institution for 1 lakh/hectare for
extracurricular activities, or the land is given to local marketers at 0.5 lakh/hectare.
Option 3: The land is acquired by NHAI to complete the Bharatmala project at a price
of 5 lakh/hectare, or the land is used for agricultural purposes and generates an output of
5.75 lakh/hectare.
Information: There will be annual municipal charges equal to 5% of earnings or 0.1
lakhs/hectare, whichever is less, borne by the land owner.
The land is earmarked by the Airport Authority of India (AAI) to develop a domestic airport. This
project involves a cost of 12.5% of the expected revenue to be generated by this activity.
The revenue paid by AAI is 4 times higher than the present prevailing cost, i.e. 2
lakhs/hectare.
Q3. Calciatore Verragamo SPA has developed a new line of products. Top management
is trying to decide on the appropriate marketing and production strategy. Three strategies
are being considered, which we will call A (Aggressive), B (Basic) and C (Cautious).
The market conditions under study are denoted by S (Strong) or W (Weak).
Management’s best estimate of the net profit (in millions of $) in each case is given by
the following payoff table:
                State of Market
Decisions     Strong     Weak
A               30        -8
B               20         7
C                5        15
Management’s best estimates of the probabilities of a strong and a weak market are 45% and 55%,
respectively. Which strategy should be chosen?
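A short sketch of the expected monetary value (EMV) calculation for this payoff table follows. It assumes the decision criterion is to maximize EMV, which the question does not state explicitly; the variable names are illustrative.

```python
# Net profit in millions of dollars, taken from the payoff table.
payoffs = {
    "A": {"strong": 30, "weak": -8},
    "B": {"strong": 20, "weak": 7},
    "C": {"strong": 5, "weak": 15},
}
probs = {"strong": 0.45, "weak": 0.55}

# EMV of a strategy = sum over market states of probability * payoff.
emv = {
    strategy: sum(probs[state] * profit for state, profit in row.items())
    for strategy, row in payoffs.items()
}
best = max(emv, key=emv.get)

for strategy, value in emv.items():
    print(strategy, round(value, 2))
print("Best by EMV:", best)
```

The EMVs come to about 9.1 for A, 12.85 for B and 10.5 for C, so B is preferred under the EMV criterion.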
Q4. You are planning an activity which requires a crane. You can rent either a small
crane or a large crane. The small crane costs ₹10,000, which is less than the large crane
at ₹16,000, but it will be slower. As currently planned, the activity is not on the
critical path; however, an analysis indicates that there is a 30% probability that the
activity might become critical. In this case, each day of delay will cost the project ₹5,000.
Using expected monetary values, draw a decision tree and determine a course of action.
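The question does not say how many days slower the small crane is, so the sketch below keeps that as a parameter (`delay_days`, a hypothetical name) instead of picking a value. It also assumes that the large crane causes no delay and that delay only costs money if the activity becomes critical.

```python
SMALL_RENT = 10_000          # rupees
LARGE_RENT = 16_000          # rupees
P_CRITICAL = 0.30            # probability the activity becomes critical
DELAY_COST_PER_DAY = 5_000   # rupees per delayed day, only if critical

def small_crane_cost(delay_days):
    """Expected cost of the small crane for a given delay (in days)."""
    return SMALL_RENT + P_CRITICAL * delay_days * DELAY_COST_PER_DAY

def large_crane_cost():
    """Expected cost of the large crane (assumed to cause no delay)."""
    return LARGE_RENT

def choose_crane(delay_days):
    """Return the crane with the lower expected cost."""
    if small_crane_cost(delay_days) < large_crane_cost():
        return "small"
    return "large"

# Delay at which both cranes have the same expected cost.
break_even_days = (LARGE_RENT - SMALL_RENT) / (P_CRITICAL * DELAY_COST_PER_DAY)
print("Break-even delay (days):", round(break_even_days, 2))
```

Under these assumptions the break-even point is about 4 days: if the small crane would delay the activity by fewer days than that, it has the lower expected cost; beyond that, the large crane does.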
Q5. A piece of land is acquired by the developing authority in a rural area. The
management of the organization has the following options.
Option 1: If the land is used for agricultural purposes,
the operating cost will be ten lakh rupees per year and there will be three seasons of
food grains that will earn 25 lakh rupees.
Option 2: If the land is used to develop a water park, an expenditure of 20 lakh rupees will
arise, and the park will earn Rs. 10,000 a day over the summer season of 91 days.
Option 3: If the land is given on rent to Delhivery Logistics Private Limited, it will earn a rental
income of Rs. 35,000 per month for a period of 11 months in a year.
If the land is kept unused by the organization, there will be security charges of Rs. 12,500 per month and no
income will be generated.
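A rough one-year comparison of these options can be sketched as follows. The question leaves several points open, so the code assumes that the 25 lakh is the total earned across all three seasons, that the 20 lakh water park expenditure falls entirely in the year considered, and that no security charges apply while the land is rented.

```python
# One-year net result per option, in rupees (1 lakh = 100,000).
LAKH = 100_000

options = {
    "agriculture": 25 * LAKH - 10 * LAKH,   # earnings minus operating cost
    "water park": 10_000 * 91 - 20 * LAKH,  # 91 summer days minus expenditure
    "rent": 35_000 * 11,                    # 11 months of rental income
    "keep unused": -12_500 * 12,            # security charges only
}
best = max(options, key=options.get)

for name, net in options.items():
    print(f"{name}: {net / LAKH:.2f} lakh")
print("Best option:", best)
```

Under these assumptions, agriculture nets 15 lakh, rent 3.85 lakh, keeping the land unused loses 1.5 lakh, and the water park loses 10.9 lakh in the first year.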
Q6. A company is considering launching a new product. The decision tree has two
decision points: whether to conduct market research (Event A) and whether to proceed
with product development (Event B). Each decision point has two possible outcomes:
Success (S) or Failure (F). You also have probabilities associated with each outcome.
Calculate the expected value for each possible outcome and make decisions based on
them.
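Since the question gives no figures, the sketch below only shows the general rollback procedure: average at chance nodes, take the best option at decision nodes. Every probability, cost and payoff in it is a hypothetical placeholder, and the node layout (`kind`, `branches`, `options`) is an assumed representation, not one given in the question.

```python
def rollback(node):
    """Return the expected value of a decision tree node."""
    kind = node["kind"]
    if kind == "payoff":
        return node["value"]
    if kind == "chance":
        # Probability-weighted average over the outcomes.
        return sum(p * rollback(child) for p, child in node["branches"])
    if kind == "decision":
        # Best option after subtracting the cost of taking it.
        return max(rollback(child) - cost for cost, child in node["options"])
    raise ValueError(f"unknown node kind: {kind}")

def payoff(value):
    return {"kind": "payoff", "value": value}

def develop(p_success):
    """Event B: develop (Success/Failure) or abandon. Placeholder payoffs."""
    launch = {"kind": "chance",
              "branches": [(p_success, payoff(100)),
                           (1 - p_success, payoff(-40))]}
    return {"kind": "decision", "options": [(0, launch), (0, payoff(0))]}

# Event A: market research (placeholder cost 5) or no research.
research = {"kind": "chance",
            "branches": [(0.5, develop(0.75)),   # research Success
                         (0.5, develop(0.25))]}  # research Failure
tree = {"kind": "decision",
        "options": [(5, research), (0, develop(0.5))]}

print("Research branch:", rollback(research) - 5)
print("No-research branch:", rollback(develop(0.5)))
print("Tree value:", rollback(tree))
```

With these placeholder numbers the research branch is worth 27.5 and the no-research branch 30, so the rollback picks no research; with the question's actual probabilities and payoffs the result may differ.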