Iron Ore Quality Prediction Using Machine Learning
ABSTRACT:
The main goal of this project is to predict how much impurity is in the ore concentrate. The % of silica is measured in a lab experiment, and it takes at least one hour for the process engineers to have this value. As this impurity is measured every hour, if we can predict how much silica (impurity) is in the ore concentrate, we can help the engineers by giving them early information to act on. Hence, they will be able to take corrective actions in advance (reducing impurity, if that is the case) and also help the environment (reducing the amount of ore that goes to tailings as silica in the ore concentrate is reduced).

1. Introduction
The approach is simple. It asks whether we can predict % Silica Concentrate without using % Iron Concentrate. We develop one model that uses the iron concentrate as an input and one that does not, compare their performance using regression metrics such as R^2 and MAE, and draw conclusions from the results.

When multiple dependent variables exist in a regression model, the task is called multi-target regression. In this case, a multi-output regressor is employed to learn the mapping from input features to output variables jointly. In this study, a multi-target regression technique is implemented for quality prediction in a mining process to estimate the amount of silica and iron concentrates in the ore at the end of the process. In the experimental studies, different regressors that use the Random Forest, AdaBoost, k-Nearest Neighbors and Decision Tree algorithms separately in the background were compared to determine the best model. The coefficient of determination (R^2) was used as the evaluation metric. There are some studies that predict iron concentrate and silica concentrate separately. However, this model provides a new contribution to the field by calculating these two values jointly, since they are highly correlated.

Our approach asks whether:
1. % Iron Concentrate is correlated with % Silica Concentrate.
2. % Silica Concentrate can be predicted without using % Iron Concentrate.
3. If the two are correlated, both % Iron Concentrate and % Silica Concentrate can be predicted at the same time using ML and DL.

Research Objectives
1. To evaluate the feasibility of using machine learning algorithms to predict, in real time, the percentage of silica concentrate in a froth flotation processing plant.
2. Model selection: The project finds out which variables associated with iron ore extraction are statistically significant.
3. Estimate: The project will propose a model to predict the percentage of silica concentrate in froth flotation.

2. LITERATURE REVIEW

Column | Process description of variables in froth plant
--- | ---
Date | Date of the measurement
% Iron Feed | % of iron that comes from the iron ore that is being fed into the flotation cells
% Silica Feed | % of silica (impurity) that comes from the iron ore that is being fed into the flotation cells
Starch Flow | Starch (reagent) flow, measured in m³/h
Amina Flow | Amina (reagent) flow, measured in m³/h
Ore Pulp Flow | t/h
Ore Pulp pH | pH scale from 0 to 14
Ore Pulp Density | Density scale from 1 to 3 kg/cm³
Flotation Column 01 Air Flow | Air flow that goes into the flotation cell, measured in Nm³/h
Flotation Column 02 Air Flow | Air flow that goes into the flotation cell, measured in Nm³/h
Flotation Column 03 Air Flow | Air flow that goes into the flotation cell, measured in Nm³/h
Flotation Column 04 Air Flow | Air flow that goes into the flotation cell, measured in Nm³/h
Flotation Column 05 Air Flow | Air flow that goes into the flotation cell, measured in Nm³/h
Flotation Column 06 Air Flow | Air flow that goes into the flotation cell, measured in Nm³/h
Flotation Column 07 Air Flow | Air flow that goes into the flotation cell, measured in Nm³/h
Flotation Column 01 Level | Froth level in the flotation cell, measured in mm (millimeters)
Flotation Column 02 Level | Froth level in the flotation cell, measured in mm (millimeters)
Flotation Column 03 Level | Froth level in the flotation cell, measured in mm (millimeters)
Flotation Column 04 Level | Froth level in the flotation cell, measured in mm (millimeters)
Flotation Column 05 Level | Froth level in the flotation cell, measured in mm (millimeters)
Flotation Column 06 Level | Froth level in the flotation cell, measured in mm (millimeters)
Flotation Column 07 Level | Froth level in the flotation cell, measured in mm (millimeters)
% Iron Concentrate | % of iron, which represents how much iron is present at the end of the flotation process
% Silica Concentrate | % of silica, which represents how much silica is present at the end of the flotation process

2.1 Source of Data
Kaggle is an online community for descriptive analysis and predictive modelling. It collects datasets from a variety of research fields, contributed by data analytics practitioners. Data scientists compete to build the best model for both descriptive and predictive analytics. It also allows individuals to access its datasets in order to create models and to work with other data scientists to solve various real-world analytics problems. The input dataset used in developing this model was downloaded from Kaggle. The dataset contains design characteristics of an iron ore froth flotation processing plant, which were collected over three months. It is organized in a common format with a standardized set of features associated with the iron ore froth flotation system.

Structure of Dataset
The dataset contains 24 columns representing the measurements and 737,453 samples. The 24 columns include the date and time of the measurement, which will not be used as an input feature. The last two columns of the dataset represent the targets of this prediction task: the percentages of iron and silica concentrate, which are highly inversely correlated. Our goal is to predict silica concentrate without the use of iron concentrate. The other 21 columns will be used as features for predicting the target value. A description of each feature can be found in the table above.

2.2 Proposed Solution
Over the past two decades, there has been an upsurge of academic research within the froth flotation community. Although a significant number of plant processing problems have been successfully modelled using machine learning algorithms, other unresolved issues and impediments remain.

Random Forest Regressor
This method trains a number of decision trees on different subsamples of the data. It averages their predictions to improve predictive accuracy and to control over-fitting. Training samples are randomly selected with replacement. The size of each new training set is the same as that of the original dataset. That is to say, an instance may be chosen more than once and may appear in several distinct subsets. As input parameters, the number of trees in the algorithm and the maximum depth should be determined initially. Changes in their values may affect the performance and predictive power of the algorithm. Therefore, candidate parameter values in a range suited to the size of the dataset are given to the method and tested. The parameters leading to the best results become candidates to be used. This method performs efficiently without incurring too much computational cost.
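The bootstrap-and-average idea behind the Random Forest Regressor described above can be illustrated with a minimal sketch. It is not the regressor used in this study: each "tree" here is a one-split regression stump on a single feature, there is no random feature selection and no maximum depth parameter, and the (x, y) pairs are made-up values rather than plant measurements. The function names are hypothetical.

```python
import random
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement (same size as the original)."""
    return rng.choices(rows, k=len(rows))


def fit_stump(rows):
    """Fit a one-split regression tree on (x, y) pairs by minimising squared error."""
    ys = [y for _, y in rows]
    xs = sorted({x for x, _ in rows})
    if len(xs) < 2:
        # Only one distinct x in this sample: predict the mean on both sides.
        m = mean(ys)
        return (xs[0], m, m)
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(rows, n_trees, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def predict_bagged(stumps, x):
    """Average the predictions of all stumps."""
    return mean(predict_stump(s, x) for s in stumps)


# Made-up illustrative data: y steps up as x grows.
rows = [(float(x), 1.0 if x < 5 else 3.0) for x in range(10)]

sample = bootstrap_sample(rows, random.Random(42))
print("sample size:", len(sample), "distinct rows:", len(set(sample)))

# The number of trees is a tunable parameter, so try a few values.
for n_trees in (1, 5, 25):
    model = fit_bagged_stumps(rows, n_trees)
    print(n_trees, "trees ->", predict_bagged(model, 4.0))
```

Because sampling is with replacement, the bootstrap sample has the same length as the original data but usually fewer distinct rows, which is what lets each tree see a slightly different training set.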
© 2022, IJSREM | www.ijsrem.com DOI: 10.55041/IJSREM14486 | Page 2
International Journal of Scientific Research in Engineering and Management (IJSREM)
Volume: 06 Issue: 06 | June - 2022 Impact Factor: 7.185 ISSN: 2582-3930
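The two checks named in the introduction, the correlation between % Iron Concentrate and % Silica Concentrate and the comparison of models by R^2 and MAE, can be written out from their definitions. The sketch below is only an illustration: the function names are hypothetical and every number in it is made up, not taken from the dataset or from any result of this study.

```python
from math import sqrt
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error: average of |true - predicted|."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values, chosen only to show an inverse relationship.
iron_concentrate = [64.0, 65.0, 66.0, 67.0]
silica_concentrate = [4.0, 3.0, 2.0, 1.0]
print("correlation:", pearson(iron_concentrate, silica_concentrate))

# Made-up predictions from two hypothetical models of % Silica Concentrate.
pred_with_iron = [3.9, 3.1, 2.0, 1.1]
pred_without_iron = [3.5, 3.2, 2.4, 1.3]
for name, pred in (("with iron", pred_with_iron),
                   ("without iron", pred_without_iron)):
    print(name, "R^2:", r2_score(silica_concentrate, pred),
          "MAE:", mae(silica_concentrate, pred))
```

A perfect prediction gives R^2 = 1 and MAE = 0, while always predicting the mean of the targets gives R^2 = 0, so the two metrics together show both how much variance a model explains and how large its typical error is in percentage points.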
3. THEORETICAL ANALYSIS
3.1 Block Diagram