Department of Mining Engineering: Indian Institute of Technology (Indian School of Mines) Dhanbad

The document discusses dragline mining and planning using machine learning. It provides an introduction to dragline mining parameters including geological parameters, equipment parameters, and operational parameters that influence dragline productivity. It describes common dragline digging methods and mining methods used globally. The document then discusses various machine learning techniques that can be used for dragline planning, including linear regression, decision trees, random forests, and neural networks. It highlights that optimal dragline productivity depends on evaluating multiple operating methods to meet production and pit geometry requirements.

Uploaded by

Aditya Himanshu

Department of Mining Engineering

Indian Institute of Technology


(Indian School of Mines) Dhanbad

Project Report
On
DEVELOPMENT OF AI SOLUTION FOR DRAGLINE
PLANNING USING MACHINE LEARNING
SESSION: 2019-20

Under the guidance of:
Prof. S.S. RAI
Professor
Dept. of Mining Engg.

Submitted by:
Vivek Oraon (16JE002057)
Aditya Himanshu (16JE001908)

1|Page IIT (ISM) Dhanbad


Certificate

This is to certify that the project entitled “Development of AI Solution for Dragline Planning
using ML” by Vivek Oraon (16JE002057) and Aditya Himanshu (16JE001908) is a bonafide work
carried out by them under my supervision and guidance. The results embodied in this work
have neither been published before nor submitted to any other institution for the award of
a degree or diploma, to the best of my knowledge and belief.

Prof. S.S. RAI


(Professor)



Acknowledgement

A formal statement of acknowledgement can hardly do justice to the gratitude and obligation we
feel towards all those who helped us in the completion of this project report. We would like to
express our gratitude and appreciation to all those with whom we interacted and who helped us
to broaden our knowledge and understanding of Dragline Planning and Machine Learning.
We wish to express our deep sense of gratitude to our project guide, Prof. S.S. RAI, Professor,
Dept. of Mining Engineering, IIT (ISM) Dhanbad, for his able guidance and useful suggestions,
which helped us to proceed in the right direction with the project work throughout the
semester.



TABLE OF CONTENTS

SL.NO. TITLES

1.0 INTRODUCTION
1.1 Dragline Mining
1.2 Dragline Operating Parameters
1.3 Dragline and related mining processes
2.0 DRAGLINE DIGGING METHOD
2.1 Normal or Underhand digging
2.2 Overhand digging or chopping
2.3 Pull back operation
3.0 DRAGLINE MINING METHODS
3.1 Simple side casting
3.2 Horse shoe method
3.3 Extended bench method
3.4 Tandem operations
3.5 Low-wall In-Pit Bench
3.6 Extended Key Cut
3.7 Multi-Pass Extended Key-Cut
3.8 Multi Seam operation
4.0 MACHINE LEARNING TECHNIQUES
4.1 Linear Regression
4.2 Multivariate Linear Regression
4.3 Polynomial Regression
4.4 Decision Tree Regression
4.5 Random Forest Regression
4.6 ANN and DNN
5.0 MEASURING THE ACCURACY OF REGRESSION MODELS – R SQUARED METHOD
6.0 IMPLEMENTATION OF VARIOUS REGRESSION ALGORITHMS IN PYTHON
6.1 Multivariate Linear Regression
6.2 Polynomial Regression
6.3 Decision Tree and Random Forest Regression
7.0 RESULT



1. INTRODUCTION

1.1 Dragline Mining

Draglines are capital-intensive heavy earth-moving equipment. The procurement cost of a
dragline varies from USD 50 to 100 million, depending on its application and size. It is also
estimated that a 1 % improvement in the productivity of a medium-size dragline operation
(40–45 m³ bucket capacity) is worth around USD 0.75–1.00 million per annum. Every second
saved in the operating cycle of this machine therefore improves productivity and has a
considerable financial impact on the cost of production (Corke, Winstanley, Dunbabin,
Robert, 2006).

In a typical high-capacity open cut coal mine, overburden removal accounts for more than 60 %
of the total mining costs. It is, therefore, important for mine operators to optimize the cost of
overburden removal by enhancing equipment productivity. As per an estimate from
Australian dragline operations, a dragline with a 50 m³ bucket and an average cycle time
of around one minute may make 350,000 cycles per year. At a stripping ratio of 1:10, a 1 %
decrease in cycle time (0.6 seconds) would uncover an additional 18,000 tonnes of coal per year.
At $30 a tonne of coal, this amounts to about $540,000 a year of extra income for a typical
operation. A 1 % increase in the productivity of all of Australia’s dragline operations could
increase the industry’s coal sales by more than $30 million a year (Mirabediny, 1998).
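
The arithmetic behind this estimate can be retraced in a few lines of Python. This is only a
sketch under assumptions that the source does not state: the stripping ratio of 1:10 is read as
10 m³ of overburden per tonne of coal, every bucket is taken to carry its rated 50 m³, and the
annual operating time is assumed unchanged. The variable names are illustrative.

```python
# Figures quoted in the paragraph above (Mirabediny, 1998).
cycles_per_year = 350_000
bucket_capacity_m3 = 50.0
cycle_time_s = 60.0
coal_price_per_tonne = 30.0

# Assumption: "1:10" means 10 m3 of overburden per tonne of coal.
overburden_m3_per_tonne_coal = 10.0

improvement = 0.01  # 1 % shorter cycle time

time_saved_per_cycle_s = cycle_time_s * improvement

# With the same operating hours, a shorter cycle allows more cycles per year.
extra_cycles = cycles_per_year / (1.0 - improvement) - cycles_per_year
extra_overburden_m3 = extra_cycles * bucket_capacity_m3
extra_coal_tonnes = extra_overburden_m3 / overburden_m3_per_tonne_coal
extra_income = extra_coal_tonnes * coal_price_per_tonne

print(f"time saved per cycle: {time_saved_per_cycle_s:.1f} s")
print(f"extra coal uncovered: {extra_coal_tonnes:,.0f} t per year")
print(f"extra income:         ${extra_income:,.0f} per year")
```

Under these assumptions the sketch gives roughly 17,700 t of additional coal and about $530,000
a year, which is consistent with the rounded 18,000 t and $540,000 quoted in the text.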

Planning and design of a dragline operation for optimal productivity is largely influenced by
three main groups of parameters: geological, equipment and operational (Morey, 1990).
Geological parameters include overburden and inter-burden depth, thickness of coal seams, swell
factor, angle of repose of the overburden, bench slope etc. The mine planner has no control over
these and must design the dragline cut and operating method to meet these constraints.

Equipment or the machine Parameters include Rated Suspended Load, Bucket Capacity, Boom
Length, Boom Angle, Operating Radius, Tub Radius, Shoe Width, Digging Depth, Dumping Height
etc.

Operational parameters include production parameters such as Dig Position Time, Bucket Filling
Time, Outward Swing Time, Inward Swing Time, Dump Time, Walk set up Time, Operator’s
Efficiency etc.

Productivity is also affected by pit parameters such as Digging Depth / Height, Dumping Height,
Strip Length, High Wall Angle, Low Wall Angle, Dig Face Angle, amount of re-handling etc.

Considering all of these parameters together decides the optimal dragline operating method,
which covers, in addition to the pit parameters, the digging method and the digging sequence,
such as high-wall or low-wall operation, single or multi-pass operation etc.

There is always more than one possible dragline digging method and sequence in a given set of
geo-mining conditions. Identifying the one that best meets the operational parameters is the key
to the success of a dragline operation (Lumely and Haneman, 1994; Kishore & Dewangan, 2010).
Applying dragline simulation technology to select an optimal dragline operating method is the
requirement of the day for better visibility and control over the dragline operation.

There are over twenty traditional Dragline digging methods across global operations (Mirabediny,
1998). These operating methods may differ in terms of bench width, bench height, throw blast,
number of Dragline passes, Dragline position, walking pattern, digging modes, re-handle
percentage, swing angle, hoist distance, cycle time components etc. (Humphrey 1990, Lumely
and Haneman 1994, Esterhuyse 1997, Thorton 2000, Liang 2015). In order to arrive at the most
optimal Dragline productivity, a number of possible options of Dragline operating methods would
need to be evaluated to meet a specific requirement in terms of production and pit geometry.

A dragline is a self-contained system that loads and transports overburden material to a dump
point. Dragline systems are highly productive, extremely robust and have very long lives,
commonly 30 to 40 years. The dragline is a high capital cost, low operating cost system with
low sensitivity to geologic variance. Because of their high productivity and their ability to
direct cast material, draglines are suitable for flat-lying tabular geology with high production
requirements. The most common application for large draglines is overburden removal in coal
mining. The most common machines have a bucket size of 46–60 m³, and there has been a trend
in favour of larger machines.
Large draglines feature much larger bucket sizes (up to 115 m³) and dimensions, allowing them
to operate at greater depths and widths without pre-stripping and with reduced re-handle
(Aspinall et al., 1993; Gianazza, 2010). The largest draglines may have 125 m³ (160 yd³)
buckets. Draglines of this size can move 30–35 million BCM (bank cubic meters).

A large dragline can operate through a range from about 50 m above to 65 m below its working
level. With advanced techniques, the dragline can handle overburden depths of about 80 m.

Draglines work in strips: the overburden is excavated by the dragline and dumped in the
adjacent mined-out strip. Strips are generally aligned along strike, with each subsequent strip
down dip from the previous one. The dragline starts at one end (or the middle) of the strip and
advances along the strip to the other end. At the completion of each strip the dragline
relocates to the start of the strip and commences the next strip; this relocation is referred to
as deadheading.

The width of the dragline pit (commonly 40–90 m) is a critical pit parameter. It is influenced
by the depth of the overburden, blasting method, material characteristics, size and
specification of the dragline and the mining method. The length of the pit, or strip length, is
commonly 1,000–3,000 m. Some operations may have strip lengths as short as 300 m.

The pad on which a dragline sits is relatively flat and level. Draglines can propel up and down
a grade of <= 10 % or across a grade of <= 5 %. When they march, it is important that they
change grade gradually, always distributing the load evenly across the tub (the dragline’s
circular base) and shoes. As a rule, the rate of grade change should be <= 3 % per tub
diameter. For example, for a tub 20 m in diameter, the rate of grade change should be <= 3 %
per 20 m, so a change from 0 % to 9 % should take at least 9/3 * 20 = 60 m.
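
The grade-change rule of thumb above can be written as a small helper. This is a minimal sketch;
the function name and its parameters are illustrative and not taken from any dragline planning
tool.

```python
def min_transition_length_m(grade_from_pct, grade_to_pct, tub_diameter_m,
                            max_change_pct_per_tub=3.0):
    """Shortest travel distance over which the pad grade may change.

    Applies the rule that the grade should change by no more than
    `max_change_pct_per_tub` percent per tub diameter travelled.
    """
    change_pct = abs(grade_to_pct - grade_from_pct)
    return change_pct / max_change_pct_per_tub * tub_diameter_m


# The example from the text: 0 % to 9 % with a 20 m tub.
print(min_transition_length_m(0, 9, 20))  # 60.0
```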

Draglines are designed to work in soft underfoot conditions, and as such are designed with tub
ground bearing pressures on the order of 1.2–1.4 kg/cm² (17–20 psi). During propel, about 80 %
of the machine weight is transferred to the shoes, and the remaining weight is carried by the tub
edge. This ratio can be changed by carrying the bucket or setting it on the ground. The dragline
swings approximately 90° and casts into a pile in the previous pit. A dragline removes material
from a specified length of the pit, called a set or block. Set lengths for larger machines are
about 30 m (100 ft), or about 16 steps for the dragline.

Application of Dragline

• Dragline is the predominant machine, which is used to remove the overburden and
expose the coal.
• Draglines are the lowest cost overburden removal mining system.
• The use of Dragline for overburden removal is restricted by the following factors
(Westcott et al., 2009):

1. Size of Deposit: Large deposits to ensure adequate strip length and sufficient
reserves to justify large capital expenditure
2. Dip of Coal Seam: Gently dipping deposits, due to spoil instability on steep dips.
3. Thickness of Overburden / Inter-burden: Dragline is a high bench operation.

• Dragline mining is often associated with cast blast and/or dozer push to increase
overburden removal capacity (Westcott et al., 2009).
• The main advantages of Dragline applications are:
1. Direct cast,
2. Low operating cost, and
3. Handles hard digging
• Dragline Mining requires detailed and meticulous planning. Application is generally
determined by the following factors:
1. Pit Geometry
2. The placement of spoil material
3. Excavation methods and digging sequence
• Dragline application is generally constrained by the digging depth and dumping height.

1.2 Dragline Operating Parameters:

• Clearance radius (CR): The minimum distance from the dragline positioning centre that must
be kept free for safe dragline rotation.
• Boom foot radius (FR): Radius of the boom foot.
• Clearance height (CH): The clearance height for dragline swing or rotation.
• Dumping clearance (DC): The distance between the highest position of the dragline bucket and
the boom point end.
• Dumping height (DH): The height of the highest position of the dragline bucket above the
dragline operating surface level.
• Boom point height (BP): The height of the boom point end above the dragline operating
surface level.
• Digging depth (DD): The digging depth of the cut.
• Point sheave pitch diameter (PS): Diameter of the pulley block carrying the main rope of the
dragline bucket.
• Tub diameter (TD): The diameter of the dragline tub on the surface.
• Boom angle (BA): Maximum angle of the boom from horizontal.
• Boom foot height (BH): Height of the boom foot above the dragline operating surface.
• Dragline positioning (PD): The distance from the dragline tub centre point to the end of the
operating bench. It is usually 75 % of the dragline tub diameter.
• Operating radius (OR): The maximum distance from the positioning point that the dragline can
reach when swinging, in both digging and spoiling operations.
• Reach factor (RF): The maximum distance from the bench end that the equipment can reach.

Fig (a) and (b) showing different working parameters of Dragline.
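
As a worked illustration of how the positioning and reach definitions relate, the sketch below
takes the positioning distance as 75 % of the tub diameter and reads the reach beyond the bench
end as the operating radius minus the positioning distance. That subtraction is an interpretation
of the definitions above, not a formula given in this report, and the function names are
hypothetical. The example values (20 m tub, 90 m operating radius) are the illustrative figures
used elsewhere in this report for a tub and a medium-sized dragline.

```python
def positioning_distance_m(tub_diameter_m, fraction=0.75):
    """Distance from tub centre to bench end (PD), usually 75 % of TD."""
    return fraction * tub_diameter_m


def reach_beyond_bench_end_m(operating_radius_m, tub_diameter_m, fraction=0.75):
    """Assumed relation: reach measured from the bench end = OR - PD."""
    return operating_radius_m - positioning_distance_m(tub_diameter_m, fraction)


pd = positioning_distance_m(20.0)            # 15.0 m
reach = reach_beyond_bench_end_m(90.0, 20.0)  # 75.0 m
print(pd, reach)
```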



1.3 The dragline and the related mining processes:

The dragline cycle begins with the bucket being lowered into the pit and positioned to penetrate
the bank. The bucket is filled by dragging it into the digging face. Although the buckets are
designed to allow the teeth to have a good angle of attack in the relaxed position, sensitive
handling of the hoist tension at this time can improve the penetration rate and can reduce bucket
fill time. Once filled, hoisting and drag pay-out commences almost simultaneously, and this is
followed by swinging as the bucket clears the trench. As the bucket swings and climbs, proper
tension between the hoist and drag holds the bucket in the carry position. As the dumping point
is approached, the swing control is reversed (plugged) and the drag allowed to pay out until the
bucket is unbalanced and the load is dumped. Due to the swing inertia of the machine, the
direction of swing will not change for several seconds after the controls are reversed, thus giving
the bucket time to dump without delay. During the return swing, the hoist is paid out and the
drag is reeved in so as to begin the positioning process as the bucket settles into position. The
proficiency with which these five functions are carried out thus contributes significantly to the
productivity of the machine (Sargent 1990).
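
To make the link between the cycle functions and productivity concrete, the sketch below adds up
cycle-time components and converts the total into cycles per operating hour. The component
values are hypothetical, chosen only so that they sum to the one-minute cycle mentioned in
Section 1.1, and treating the components as strictly sequential is a simplifying assumption
(in practice hoisting, drag pay-out and swinging overlap).

```python
# Hypothetical component times in seconds (not measured data).
cycle_components_s = {
    "dig_position": 6.0,
    "bucket_fill": 14.0,
    "outward_swing": 18.0,
    "dump": 4.0,
    "inward_swing": 18.0,
}


def cycle_time_s(components):
    """Total cycle time as the sum of its component times."""
    return sum(components.values())


def cycles_per_hour(components):
    """Number of complete cycles in one operating hour."""
    return 3600.0 / cycle_time_s(components)


base = cycles_per_hour(cycle_components_s)

# Shaving two seconds off bucket filling shortens every cycle.
faster = dict(cycle_components_s, bucket_fill=12.0)
improved = cycles_per_hour(faster)

print(f"{base:.1f} -> {improved:.1f} cycles per operating hour")
```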

2. DRAGLINE DIGGING METHOD

2.1 Normal or Underhand Digging: In underhand digging, the dragline generally works from the
high-wall and digs material from below its working level. The swing angle ranges from 30 to 120
degrees. The key cut, the main cut and the extended bench are generally made by underhand digging.

Fig: Normal or Underhand Digging Method

2.2 Overhand Digging or Chopping: In this method, the dragline excavates material from a bench
above its pad level. This method is generally applied in soft or undulating ground surfaces. The
bucket is usually held in the dump position and the teeth are dropped into the material to give
the bucket a penetrating force. The bucket is then dragged downwards towards the dragline. This
method enhances the maximum digging depth of the dragline, but productivity decreases because of
longer bucket fill time, lower bucket fill factor, larger swing angle (usually between 130 and
150 degrees) and more dragline movement. The other disadvantage is the increase in downtime and
repair cost due to increased wear on rigging, ropes and bucket.

Fig: Overhand Digging or Chopping Digging Method

2.3 Pull Back Operation: In pull back operation, the dragline pad is on the dumps; it is built
at a higher level than the normal bridge and closer to the dumps. This method is used as an
alternative to the extended bench when the size of the dragline is not sufficient to dump
material from the high-wall bench. This operation is very common in multi-seam or tandem
dragline operations.

Fig: Pull Back Operation Digging Method



The main advantage of this operation is that it creates extra room for dumping, and the amount
of re-handling is substantially reduced.

The disadvantages of this technique are:

• Dragline efficiency is reduced by increased swing angles and bucket fill time in the chopping
operation, and by increased dragline walking.
• Inherent hazards of an unstable spoil dump.
• Complexities in planning and sequencing of the operation. A number of scenarios may have
to be evaluated.
• Additional dozing operations are required to prepare the dragline pad on the dump.

3. DRAGLINE MINING METHOD


As discussed earlier, the primary use of a dragline in an Indian open pit coal mine is to remove
overburden and expose the coal beneath, which is then produced by either a truck and shovel
system or a surface miner system. To accomplish this task, the following basic dragline methods
are in use in India for single seam operations, depending upon the overall productivity, the
required coal exposure rate and the associated mining cost:

3.1. Simple Side Casting: This is the simplest dragline operating method, in which the dragline,
sitting on its working level, removes the overburden and side casts it into the previously
de-coaled area. This method is applicable for shallow depths (usually up to 30 meters) for a
medium-sized dragline with an operating radius of approximately 90 meters. Generally, no
re-handle is associated with this method and the angle of swing is about 90°. Dragline bench
width and dumping radius are the restricting factors.

3.2. Horse Shoe Method: In this method, the dragline initially works from the highwall side to
remove part of the overburden and thereafter moves to the spoil side to clear the balance of the
overburden and the re-handle, if any, to expose the coal. The spoil side digging increases the
cycle time, but the amount of re-handle is significantly reduced.

3.3. Extended Bench Method: With increase in the depth and width of the pit, it would be difficult
to side cast all the overburden by Simple Side Casting Method. In Extended Bench method, the
initial key cut overburden is used to expand the bench for Dragline pad on the highwall side.
Working from this position significantly increases the dumping room and final dumping radius.
This method is generally associated with re-handling.

3.4. Tandem Operations: In this method two draglines operate together to remove overburden
and expose coal. This method is very popular in Northern Coalfields Limited (NCL) operations,
where the overburden is thick (up to 60 meters) and the draglines are small (maximum size
24/96). Tandem operation in NCL mines is employed in two ways:

3.4.1. Vertical Tandem Method: For deep dragline operations, the operation may be limited by
the digging depth. The dragline bench is therefore split into two vertical benches. The method
involves two highwall passes, with simple side casting in the first pass and the extended bench
method in the second pass. Theoretically, no re-handle is associated with the first pass; the
second pass has to do some re-handle depending upon the pit parameters. The top bench, which is
generally lower in height than the bottom bench, moves ahead, followed by the bottom bench. A
dragline tail clearance of 25–30 meters is required for the bottom pass. This method can handle
depths of up to 70–80 meters with a medium-size dragline (40–45 m³).

3.4.2. Horizontal Tandem Method: This method is typically used in NCL mines to enhance coal
exposure by increasing the dragline pit width up to 80 meters, while limiting the digging depth
to 30–35 meters. The balance of the overburden thickness is allotted to truck-shovel operation.
As in the Vertical Tandem method, there are two highwall passes, but the draglines operate from
the same horizon. Horizontal tandem is associated with a large amount of re-handle.

The current dragline digging methods in India have, thus far, worked well. As coal mining
operations move into deeper areas, stripping ratios increase and geological conditions become
more constraining, there is a need to review the existing dragline methods in the light of
technological advancement to enhance dragline productivity. There are over twenty traditional
dragline digging methods across global operations (Hamid Mirabediny, 1998). These operating
methods may differ in terms of bench width, bench height, throw blast, number of dragline
passes, dragline position, walking pattern, digging modes, re-handle percentage, swing angle,
hoist distance, cycle time components etc. (Humphrey 1990, Lumely and Haneman 1994, Willem
Petrus Esterhuyse 1997, Thorton 2000, Xin Liang 2015). To arrive at optimal dragline
productivity, a number of possible dragline operating methods need to be evaluated against the
specific production and pit geometry requirements.

Dragline operating methods are characterized by the number of coal seams, the overburden depth,
the dragline movement and the number of dragline lifts, and many improvements have been made to
them in global operations. Considering the geological, mining and operational parameters, some
modified dragline digging methods practiced in Australian open pit coal mining operations may
also be considered for Indian open pit coal operations, especially for NCL mines:

3.5. Low-wall In-Pit Bench Method: For a medium depth overburden single seam operation, the
Dragline dump height is generally not the limiting factor. Extended bench method may involve
large amount of re-handle. Low-wall In-pit bench method could provide the solution to
significantly reduce the re-handle. This is achieved by using an in-pit bench at a lower elevation.
The amount of re-handle is optimized by the selection of level of in-pit bench. Ideally, an in-pit
bench level should be prepared as low as possible keeping in mind the reach of the dumping space
available.

Usually this method is associated with maximum throw blast of material into the old pit. Dragline
sits on the blasted material (after pad preparation by Dozer) on the low-wall side of the pit. The
dimension of in-pit bench will depend on the pit-geometry, blast profile and the Dragline
parameters (operating radius, dumping height in particular). Usually, in-pit bench is made 10 – 15
meter below the in-situ highwall level. Dragline works solely from the in-pit bench on the low-
wall side. Mode of operation of Dragline is generally a combination of Chopping and Pull-Back.



3.6. Extended Key Cut method: This method is a two-pass operation consisting of a highwall pass
and a low-wall pass. The highwall pass makes the key cut and extends it further; the low-wall
pass is a pull back operation. The low-wall bench is about 10–15 meters below the pre-blasted
bench level.

Like low-wall in-pit bench method, a maximum throw of the blast into the de-coaled pit can be
achieved. The highwall is levelled and Dragline pad (approximately 30 meter) is prepared using
the auxiliary equipment for the Dragline to prepare the key-cut and extending it along strike to
prepare the low-wall side Dragline pad at a pre-designed height. Unlike Extended Bench method,
the highwall operation does not uncover the coal, therefore if this method is used by a single
dragline, the length of the cut needs to be shorter (approximately 500 meter). However, there is
no limit if the two passes are executed by two separate Draglines.

On the low-wall side, operation is generally a combination of Chopping and Pull-back operation.
Extended key cut method is characterized by higher swing in wider strips.

3.7. Multi-Pass Extended Key-Cut Method: This is a variation of the Extended Key Cut method
designed for deep overburden without employing the throw blast technique. The method is best
suited for thick coal seams, since coal loss is minimized in the absence of heavy blasting
(no throw).
In this method, there are two highwall and two low-wall passes. The first highwall pass is Simple
Side Casting to reduce the bench height. The second highwall pass covers the extended key cut,
the highwall trimming and the remainder of the main cut, as shown in the figure below:

Fig: Multi-Pass Extended Key Dragline Mining Method

Material from the second pass and the first pass is utilized to prepare the in-pit bench for the
first low-wall operation. An in-pit bench at a lower level than the highwall bench ensures less
re-handle. However, the height of this in-pit bench should be enough to create a dump height
that accommodates the material from the final pass (which could be from the same level or at a
higher level). The level of in-pit bench will also depend on its ability the toe of the bench
and creation of a safe working width of the final bench.

3.8. Multi – Seam Operation: In multi coal seam operations, the selection of suitable Dragline
digging method is influenced by the thickness of both the overburden and inter-burden.
Sequencing of various Dragline positions as well as coal mining is more complex in multi-seam
operations.



Multi-seam operations are usually a combination of highwall and low-wall dragline passes to
remove overburden and interburden material. Depending upon the thickness of the interburden,
a multi-seam operation could be either a Single High Wall – Double Low Wall operation or a
Double High Wall – Single Low Wall operation.

3.8.1 Single High-Wall and Double Low-Wall Multi Dragline Operation: The first pass is a
standard Highwall Simple Side Casting with a key – cut and the main cut. The
overburden is directly dumped into the previously de-coaled area with minimal re-
handle. The second and third pass are essentially the low-wall operations with a
combination of Chopping and Pull Back. A typical operation is shown in the following
figure;

Fig: Single High-Wall and Double Low-Wall Multi Dragline Mining Method

3.8.2. Double High Wall and Single Low Wall Multi Dragline Operation: As the top overburden
becomes thinner, there is a point at which the first pass no longer yields enough material to
form the in-pit bench level for the low-wall operation. It is then desirable to switch to the
Double High Wall and Single Low Wall operation.
In practice, the Double High Wall and Single Low Wall method is more productive than the Single
High Wall and Double Low Wall method, with very little or no re-handle.

Fig: Double High Wall and Single Low Wall Multi Dragline Operation



4. Machine Learning Techniques
This project uses several supervised machine learning algorithms to predict dragline
productivity from field and equipment parameters. In supervised learning, both the input and the
output variables of the training data are known; the model learns the mapping between them and
is then given new inputs to predict the output. We first look at how various supervised learning
algorithms work and then apply each of them to our dataset to find the one that suits it best.
We use regression algorithms rather than classification because we need to predict a continuous
value rather than a categorical one. The ML regression algorithms used in this project are:

• Linear Regression
• Polynomial Regression
• Decision Tree Regression
• Random Forest Regression

The dataset has been captured from DRAGSIM, a dragline simulation tool for dragline planning.

4.1) Linear Regression:


The linear regression is of the form:

$y = b_0 + b_1 x_1$

where, $y$ = dependent or output variable,

$x_1$ = independent or input variable

$b_0$ = y intercept

$b_1$ = regression coefficient (slope of the regression line)

In linear regression, there is only one input variable from which the output variable is
predicted. If we plot a linear regression dataset, it will look as shown:

Source: Wikipedia.org

Fig: A linear regression curve passing through datapoints



Now, the regression line is chosen such that it best fits all the data points. In other words,
the line whose Mean Squared Error (MSE) is least is chosen as the regression line. The MSE is
given by:

$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2$

where, $N$ = no. of observations

$\hat{Y}$ = predicted value

$Y$ = actual value

Thus, the regression line with the least MSE best fits the data points, as the squared distance
between the data points and the predicted values is minimum.

In this way, using linear regression, we can predict the dependent variable from the independent
variable in our dataset. Linear regression can be used for predicting house prices, forecasting
sales, etc.
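
A minimal pure-Python sketch of this idea is given below. It uses the standard closed-form
least-squares estimates for the slope and intercept, then compares the MSE of the fitted line
with that of another candidate line. The data points are made-up toy values, not DRAGSIM output,
and the function names are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


# Toy data (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
fitted = [b0 + b1 * x for x in xs]
other = [2.0 * x for x in xs]  # another candidate line, y = 2x

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
print(f"MSE of fitted line: {mse(ys, fitted):.4f}")
print(f"MSE of y = 2x:      {mse(ys, other):.4f}")
```

The fitted line has the smaller MSE of the two, which is what "best fit" means in this section.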

4.2) Multivariate Linear Regression:


It is the same as linear regression except that it uses multiple independent variables rather
than one to predict the dependent variable. It is of the form:

$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_n x_n$

where, $y$ = dependent or output variable,

$x_1, x_2, x_3, \dots, x_n$ = independent or input variables

$b_0$ = y intercept

$b_1, b_2, b_3, \dots, b_n$ = regression coefficients

If there are two independent variables, the equation becomes $y = b_0 + b_1 x_1 + b_2 x_2$. If
we plot the data points against $y$, $x_1$ and $x_2$, there will be a third axis instead of just
two as in simple linear regression. The plot will look as shown:

Fig: Linear Regression for 2 variables can be represented in 3-d with a regression plane passing through data points



In this case, we have a regression plane instead of a line, which best fits the data points in
$x_1$, $x_2$ and $y$. We cannot visualize more than three dimensions, but the idea behind
multivariate linear regression is clear: it is the same as simple linear regression except that
it has more than one independent variable. The best fit is chosen by the Ordinary Least Squares
(OLS) method, which minimizes the sum of squared differences between actual and predicted values.
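
The least-squares fit for several input variables can be sketched without any external library
by solving the normal equations $(X^T X) b = X^T y$, where $X$ is the input matrix with a
leading column of ones for $b_0$. The helper names below are illustrative, the data are
synthetic points lying exactly on a known plane, and a library routine would normally be used
in practice.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]


def fit_ols(rows, ys):
    """Return [b0, b1, ..., bn] minimising the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic points on the plane y = 1 + 2*x1 + 3*x2 (illustrative only).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coeffs = fit_ols(rows, ys)
print([round(c, 6) for c in coeffs])
```

Because the synthetic data contain no noise, the recovered coefficients should match the plane
used to generate them up to floating-point error.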

In multivariate regression there is more than one variable, and some of them might not be useful
for our ML algorithm. In other words, they do not contribute to training the algorithm but
rather make it noisier or less accurate. Machine learning models follow the principle of Garbage
In, Garbage Out: if we feed unnecessary variables to our model, the predictions may be less
accurate. For this reason we perform feature selection.

Feature selection is a technique for removing variables that do not contribute to the training
of the model and only make it worse. For this project, we use Backward Elimination, a commonly
used method for feature selection. Backward elimination is discussed further in the Python
implementation of the project.

4.3) Polynomial Regression:

Fig: Linear Regression curve unable to study patterns in data.

If our dataset is as shown in the above figure (red dots), a straight line (green line) cannot
fit the data well, which leads to underfitting. In such cases, we make use of Polynomial
Regression, in which powers of the original feature are added as new features. Thus
the linear equation $y = b_0 + b_1 x_1$ becomes:

$y = b_0 + b_1 x_1 + b_2 x_1^2$

The model remains linear because it is still linear in the coefficients; the squared term is simply
treated as one more feature (independent variable). Because of this new feature, the curve that
is fitted is quadratic in nature, and it can fit the points better, as shown:

Fig: Regression curve when high degree features are added to equation



Thus, Polynomial Regression is very useful for datasets in which a non-straight curve is needed to
study the pattern of the dataset properly. The generalized equation can be written as:

$y = b_0 + b_1 x_1 + b_2 x_1^2 + \dots + b_n x_1^n$
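
The idea that a polynomial fit is still a linear model on extra features can be sketched as follows. The data are an illustrative assumption (generated exactly from a quadratic); the squared term is added as a new column and the same least-squares solver is used for both fits.

```python
import numpy as np

# Illustrative data following y = 2 + 0.5*x + 1.5*x^2 exactly
x = np.linspace(-3, 3, 25)
y = 2.0 + 0.5 * x + 1.5 * x ** 2

# Straight-line model: columns are [1, x]
A_linear = np.column_stack([np.ones_like(x), x])
# Quadratic model: x^2 is simply one more feature column, [1, x, x^2]
A_quad = np.column_stack([np.ones_like(x), x, x ** 2])

coef_linear, *_ = np.linalg.lstsq(A_linear, y, rcond=None)
coef_quad, *_ = np.linalg.lstsq(A_quad, y, rcond=None)

# Sum of squared residuals for each fit
ssr_linear = float(np.sum((y - A_linear @ coef_linear) ** 2))
ssr_quad = float(np.sum((y - A_quad @ coef_quad) ** 2))
```

On this curved dataset the straight line leaves a large sum of squared residuals (underfitting), while the model with the added squared feature fits the points almost exactly.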

4.4) Decision Tree Regression:


Classification and Regression Trees (CART) algorithms are widely used for classification and
regression models. Classification trees are used for categorical dependent variables, whereas
regression trees are used for continuous dependent variables. A decision tree consists of a root
node to which different decision nodes are linked. A terminating node is called a leaf node.

A decision tree is mainly useful when the data points follow no simple line or curve, as shown:

Fig: Randomly distributed data points, Source: Statquest YouTube

We can see that neither a straight line nor a higher-degree polynomial curve can fit these data
points well. In such a situation, a decision tree comes in handy and is drawn as:

Fig: A regression tree formed for the dataset, Source: Statquest YouTube

The top node is called the root node. A node that has arrows both entering and leaving it is a
decision node, and a terminating node is a leaf node. The split at the root node, and at each
decision node below it, is chosen such that the sum of squared residuals
$\sum (y_{pred} - y_{actual})^2$ is minimum for the data reaching that node.
This is the principle behind the working of Regression Trees.
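
The split-selection principle above can be sketched for a single feature. The helper name `best_split` and the six data points are hypothetical, chosen only to illustrate how a threshold is picked by minimising the sum of squared residuals.

```python
import numpy as np

def best_split(x, y):
    """Return the threshold on x that minimises the total sum of squared residuals."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_threshold, best_ssr = None, np.inf
    # Candidate thresholds are midpoints between consecutive distinct x values
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue
        threshold = (x[i] + x[i - 1]) / 2
        left, right = y[:i], y[i:]
        # Each side predicts its own mean, so residuals are measured around that mean
        ssr = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if ssr < best_ssr:
            best_threshold, best_ssr = threshold, ssr
    return best_threshold, best_ssr

# Two clearly separated groups of points (illustrative values)
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([5.0, 5.0, 5.0, 20.0, 20.0, 20.0])
threshold, ssr = best_split(x, y)
```

Here the chosen threshold falls midway between the two groups (x = 6.5), since that split leaves no residual on either side. A full regression tree repeats the same search on each side of the split until a stopping rule is met.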



4.5) Random Forest Regression

Source: https://towardsdatascience.com/random-forest-and-its-implementation-71824ced454f

The Random Forest principle is simply to build multiple decision trees, each trained on a random
sample of the data, and to combine their outputs into a single, more accurate prediction. In the case
of classification, Random Forest takes the mode of the classes predicted by the trees, while in the
case of regression, it takes the mean of the values predicted by all the decision trees.

Thus, Random Forest models are usually more accurate than a single Decision Tree, and Random
Forest is a widely used ML algorithm in many classification and regression problems.
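
The averaging step can be checked with scikit-learn's `RandomForestRegressor`, whose fitted trees are exposed through the `estimators_` attribute. The data, the number of trees and the seeds in this sketch are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic non-linear data (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = np.sin(X[:, 0]) + 0.1 * X[:, 1]

forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

# Predict a few new points with every individual tree, then average by hand
X_new = rng.uniform(0, 10, size=(5, 2))
per_tree = np.array([tree.predict(X_new) for tree in forest.estimators_])
manual_mean = per_tree.mean(axis=0)

# The forest's own prediction for the same points
forest_pred = forest.predict(X_new)
```

The hand-computed mean of the individual tree predictions matches the forest's prediction, which is the regression rule described above.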

4.6) Artificial Neural Networks (ANN) and Deep Neural Networks (DNN):
An ANN uses neurons arranged in layers to process real-life problems. It mimics the operations of
the human brain to analyse and respond to the received information. The layers are divided into
three groups: input, hidden, and output layers. The neurons in adjacent layers are linked through
weights, and each neuron also has a bias (an offset added to its weighted input).
The accuracy of an ANN model depends on these weights and biases, which are learned during
training. In recent years, ANNs have developed in many different forms, such as the simple ANN
(with only one hidden layer), the complex ANN (with multiple hidden layers), the convolutional
neural network (CNN), and the recurrent neural network (RNN), to name a few. Of those, the ANN is
the most common and widely used form in many areas.

Like other AI techniques, an ANN can be trained through supervised or unsupervised learning
methods. The performance of ANN models depends strongly on the number of hidden layers and
hidden neurons. Complex ANN models with multiple hidden layers are called deep neural
networks (DNN). In theory, a model with few hidden layers trains faster, and keeping the network
as small as possible also helps to avoid overfitting. However, the predictive power of an ANN
generally increases as the number of hidden layers increases.

As an ANN requires a large dataset, it is not feasible to use one in our project.



Fig: Structure of an ANN with two hidden layers, Source: towardsdatascience.com

5. Measuring the accuracy of regression models – R Squared method


To measure how well our model is performing, we use the R-squared metric, which measures the
goodness of fit in regression analysis. Goodness of fit describes how well the regression model fits
the data points. The closer the value of R-squared is to 1, the better the model. The formula is
given by:

$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$

where, $SS_{res}$ = Sum of squared residuals = $\sum (y_{true} - y_{predicted})^2$ around the regression line.

And, $SS_{tot}$ = Total sum of squares = $\sum (y_{true} - y_{mean})^2$ around the mean line.

Fig a and b: Sum of squared residuals around Regression and Mean line, Source:geeksforgeeks.com



We can see that for the R-squared value to be close to 1, the sum of squared residuals has to be
small relative to the total sum of squares, i.e. the regression line must fit the data points closely.
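
The definition of R-squared in terms of the two sums of squares can be sketched directly. The function name `r_squared` and the four sample values are hypothetical, used only to show the two limiting cases.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R-squared computed as 1 - SSres / SStot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residuals around the predictions
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # residuals around the mean line
    return 1.0 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]

# Perfect predictions leave no residual
perfect = r_squared(y_true, y_true)
# Predicting the mean everywhere does no better than the mean line
mean_only = r_squared(y_true, [6.0, 6.0, 6.0, 6.0])
```

A perfect model gives an R-squared of 1, and a model that always predicts the mean gives 0. A model that does worse than the mean line gives a negative value.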

6. Implementation of various regression algorithms in Python:


Firstly, we will do data pre-processing (if required) and split our dataset into training and testing
datasets. We will keep 80% of our data in the training dataset to better train the algorithm and the
remaining 20% in the testing dataset. We will use the scikit-learn library to implement the
different machine learning models.

Fig: Importing various scikit libraries and importing the dataset
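
A minimal sketch of the 80/20 split described above, using scikit-learn's `train_test_split`. Randomly generated placeholder data stand in for the project dataset; the array shapes, the number of columns and the `random_state` values are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the project dataset (100 rows, 7 input columns)
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 7))
y = X @ rng.normal(size=7) + rng.normal(scale=0.1, size=100)

# Keep 80% of the rows for training and 20% for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```

Fixing `random_state` makes the split reproducible, so that different models are compared on the same test rows.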

6.1) Multivariate Linear Regression:


The dataset contains more than one explanatory variable, so we have performed multivariate
linear regression. We trained our model on the training dataset and then made it predict values
for the test dataset. Finally, we checked the accuracy of the model using the
“R squared” metric.

Fig: Applying linear regression and checking r2 score
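
The train, predict and score steps described above can be sketched with scikit-learn's `LinearRegression` and `r2_score`. The data here are synthetic (a linear signal with a little noise) and every number is an illustrative assumption, so the resulting score is not the project's result.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: four input variables, linear signal plus small noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(120, 4))
y = 3.0 + X @ np.array([2.0, -1.0, 0.5, 4.0]) + rng.normal(scale=0.05, size=120)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit on the training set only, then score on the held-out test set
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)
score = r2_score(y_test, y_pred)
```

Scoring on the held-out test set, rather than on the training set, gives a fairer estimate of how the model behaves on unseen data.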

Our model performed well, with an R-squared score of about 0.93 on the test dataset. As discussed
earlier, feature selection can further improve multivariate linear regression by eliminating
independent variables that are not required or that add noise to our
dataset. The feature selection technique that we use is Backward Elimination.



In this method we use the p-value, the t-value and hypothesis testing to decide which variables to
eliminate. In our case, the null (Ho) and alternate (Ha) hypotheses are:

Ho :- There is no relationship between the dependent and the independent variable.

Ha :- There is a relationship between the dependent and the independent variable.

The t-value indicates how far the estimated coefficient is from its hypothesized value (zero under
Ho), measured in standard errors. The larger the absolute t-value, the more plausible the alternate
hypothesis. The p-value is the probability of obtaining a t-value at least this extreme if the null
hypothesis were true, so a large p-value means that the observed t-value could easily be due to
chance. In short, we remove, one at a time and refitting the model after each removal, the variables
whose p-value is greater than the significance level of 0.05.

Fig: Successive backward elimination steps: variable x3 removed; variable x4 removed; variable x3 removed; final model with the required variables, all having p < 0.05

Thus, after performing backward elimination, only necessary variables remain which are:

• Pit Width
• OB thickness
• Cast %
• Dozer Volume %
• Coal Exposed (te/annum)
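
The elimination procedure can be sketched as follows. This is not the project's original code: the helper names `ols_p_values` and `backward_elimination`, the synthetic data and the variable labels are all hypothetical, and the p-values are computed from the standard OLS t-statistics using NumPy and SciPy.

```python
import numpy as np
from scipy import stats

def ols_p_values(X, y):
    """Two-sided p-values for each OLS coefficient (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    dof = A.shape[0] - A.shape[1]                 # residual degrees of freedom
    sigma2 = residuals @ residuals / dof          # estimated noise variance
    std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    t_values = coef / std_err
    return 2 * stats.t.sf(np.abs(t_values), dof)

def backward_elimination(X, y, names, significance=0.05):
    """Drop the least significant variable until all p-values are <= significance."""
    names = list(names)
    while names:
        p = ols_p_values(X, y)[1:]                # skip the intercept
        worst = int(np.argmax(p))
        if p[worst] <= significance:
            break
        X = np.delete(X, worst, axis=1)           # remove the column and refit
        del names[worst]
    return names

# Synthetic data: only the first and third variables influence y (illustrative)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 1.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)
kept = backward_elimination(X, y, ["x1", "x2", "x3", "x4"])
```

The two variables that actually drive `y` survive the elimination. Variables with no real effect are usually removed, although at a 0.05 significance level one may occasionally remain by chance.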



Fig: Applying Linear Regression after feature selection and checking r2 score

Training our linear regression model with only these independent variables, we obtain an
R-squared score of about 0.97, an improvement over the model containing all the independent
variables. Thus, feature selection increased the score by about 4 percentage points (from about
93% to about 97%).

6.2) Polynomial Regression:

Fig: Applying Polynomial Regression and checking r2 score

As covered in the theory part, the model is still linear because it is linear in the coefficients
associated with the features. Only the fitted curve is polynomial in nature. In the above
implementation, we have added features of degree two to our equation, which makes the curve quadratic.

However, the R-squared score obtained is about 0.89, which is lower than that
obtained with linear regression. If we increase the degree, the R-squared score drops further. This
suggests that a linear model captures the patterns in our dataset better than a polynomial one.
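
A comparison across polynomial degrees can be sketched with scikit-learn's `PolynomialFeatures` combined with `LinearRegression` in a pipeline. The data are synthetic and truly linear, which is an assumption made for illustration, so the scores are not the project's results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Small synthetic dataset with a purely linear relationship plus noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 5))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5, 4.0, 1.5]) + rng.normal(scale=0.2, size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the same linear model on polynomial features of increasing degree
scores = {}
for degree in (1, 2, 3):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X_train, y_train)
    scores[degree] = r2_score(y_test, model.predict(X_test))
```

When the underlying relationship is linear and the dataset is small, the extra polynomial features mostly fit noise, so the test score for higher degrees may fall below that of the plain linear model.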

6.3) Decision Tree and Random Forest Regression:

Fig: Applying Decision Tree and Random Forest Regression

We applied both Decision Tree and Random Forest Regression to our dataset. The R-squared scores
for Decision Tree and Random Forest Regression were about 0.52 and 0.70 respectively. Random
Forest, being the more advanced of the two, performed better, but both of these models performed
worse than the linear models. This may be due to the small size of our dataset.
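
A minimal sketch of fitting and scoring both tree-based models with scikit-learn's `DecisionTreeRegressor` and `RandomForestRegressor`. The data are synthetic and the hyperparameters are illustrative assumptions, so the scores do not reproduce the values reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Small synthetic dataset standing in for the project data
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 5))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5, 4.0, 1.5]) + rng.normal(scale=0.2, size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Train each model on the same split and record its test R-squared
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
```

Using the same split for every model keeps the comparison fair; with only a few dozen rows, tree-based models have little data to learn their splits from.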



7. RESULT:
Implementing various machine learning algorithms on our dataset showed us how these algorithms
work and which model is best suited to predicting the output for
Dragline Planning. Multivariate Linear Regression combined with feature selection predicted
dragline productivity with an R-squared score of about 0.97. We can experiment with various
parameters in our model to find the required productivity without having to run trials in a real
scenario, which would waste time as well as resources.



REFERENCES

S. V. Chaoji & Badal Chandra Dey, 2000, Metals, Mines & Fuels, Vol. XLVIII

Piyush Rai, Umakant Yadav & Ashok Kumar, 2011, Productivity Analysis of Draglines Operating in
Horizontal and Vertical Tandem Mode of Operation in a Coal Mine – A case study

Xin Liang – “The Optimization of Digging Sequence of a Dragline” 2015, University of Queensland,
M.Phil Thesis

Hamid Mirabediny, 1998, “Dragline Simulation Model for Strip Mine Design and Development”,
University of Wollongong, Thesis Collection

Lumley, G & Haneman, D 1994, Improved Monitoring of Dragline Operations, ACARP Project

Hong Zhang, Hoang Nguyen, Xuan-Nam Bui, Trung Nguyen-Thoi, Thu-Thuy Bui, Nga Nguyen, Diep-
Anh Vu, Vinyas Mahesh, Hossein Moayedi - Developing a novel artificial intelligence model to
estimate the capital cost of mining projects using deep neural network-based ant colony
optimization algorithm
