Data Based Modeling Approach of Iron and Steelmaking Processes
Abstract:
Iron and steelmaking processes are highly complex, and prediction tools are needed to guide
their control. Various modeling techniques have been adopted to develop good prediction
models, which form part of the automation and control systems of a steel plant. Such models may
be fundamental, based on the physical and chemical laws of the process, or empirical. Because
of errors in input measurements and other uncertain factors beyond control, the actual process
always carries some degree of uncertainty. Models based on actual plant data are therefore more
reliable than purely fundamental models. Fundamental models can also be used in association
with data based models, with the various relationships and coefficients of uncertainty
evaluated from actual plant data. In this paper a data based modeling approach is demonstrated
for the BOF steelmaking process in particular. A comparative study is made of combinations of
several approaches, namely ANN (artificial neural networks), MTS (Mahalanobis-Taguchi system),
PCA (principal component analysis) and MLR (multiple linear regression), for developing
prediction models from industrial data.
Introduction:
Currently, oxygen steelmaking accounts for 65% of worldwide crude steel production and is thus
the predominant steelmaking process. The oxygen converter utilizes oxygen as an oxidation
source for reacting with other elements to convert iron into steel and increase the bath
temperature. These reactions are characterized by a high reaction rate, short residence time,
numerous influencing factors and complicated reaction processes. In the BOF steelmaking process,
the composition and temperature of the steel bath cannot be measured continuously and operating
conditions vary frequently, which makes it difficult to control the BOF end-point bath precisely.
Operators often have to reblow the steel bath because of the low control precision of the
end point, so improving the control precision of the BOF steelmaking end point is important.
Earlier, the steel industry relied on fundamental heat and mass balance methods to predict
parameters such as temperature and blow time, and required inputs such as the tonnes of oxygen
to be blown. This approach is time-consuming and in many cases inaccurate, owing to the
factors mentioned above. Due to the coexistence of several phases and the complex flow
conditions with mass and heat transfer inside, a steelmaking furnace is very difficult to
model. Furnace operators have long been aware that there are no universally accepted methods
for accurately controlling complex iron and steelmaking operations and predicting their
outcome. Our task is to develop a predictive model from the operational data of the
steelmaking process. The models developed under this category use data based techniques,
particularly multiple linear regression (MLR) and artificial neural networks (ANN), along with
reduction of the dimensionality of the problem using the Mahalanobis-Taguchi system (MTS) and
principal component analysis (PCA). The application and advantages of these techniques are
explained in the following sections.
The Basic Oxygen Steelmaking (BOS) process converts hot metal from the blast furnace, together
with scrap, into steel by exothermic oxidation of the metalloids dissolved in the iron. Oxygen
also combines with carbon, which is removed from the bath as gas and collected. The main
purpose of the process is to lower the carbon content from approximately 4% in hot metal to
less than 0.08% in liquid steel. The BOF steelmaking process raises the bath temperature and
reduces the impurity level by blowing a proper volume of oxygen onto the steel bath surface and
adding appropriate amounts of flux and coolant to the bath. The raw materials of the process
comprise main materials (such as hot metal, scrap and pig iron) and sub-materials (such as
oxygen, iron ore, lime and dolomite), and the product is the steel bath, whose temperature and
composition are required to hit the tapping aim window.
The quantity of oxygen utilized plays an important role in determining the steel quality.
Specifically, if the amount of oxygen injected is too small, the endpoint carbon content will
exceed the required value or the endpoint temperature may be too low. If the amount of oxygen
is too large, the molten steel will be over-oxidized, the consumption of alloys will increase, the
temperature may be too high and the yield of liquid steel will decrease. Therefore, determination
of the exact oxygen blowing quantity has tremendous influence on the steelmaking process.
Given the characteristics of the BOF steelmaking process, a control method that combines
Static Process Control with Dynamic Process Control is widely used. Static Process Control
determines the gross requirement of oxygen and coolant for each heat from the initial
information. Once the sub-lance (SL1) measurement has been processed successfully in the later
period of the blow, Dynamic Process Control is started to adjust the dynamic requirement of
oxygen and coolant based on the measurement result of bath [C]
1. Fundamental approach
The BOS is a very complex chemical batch process. The amount and quality of scrap iron
change from batch to batch, the grades of steel produced can change frequently, and the
vessel shape also changes over the campaign lifetime. A first principles model, called the
charge balance or static model, which is a complete heat, mass and chemistry balance of the
steelmaking process, is used to predict the total oxygen blow necessary for each batch.
However, model mismatches and the unsteady-state nature of the decarburization rate lead to
poor control of end-point temperature and carbon percentage.
2. Linear regression
The multiple linear regression model relies on a large amount of production data;
data from nearly 1 000 heats of the same campaign therefore need to be collected from
steel plants. Before the production data are incorporated into the model, they are
filtered and treated. Filtration and treatment consist of removing the variables which
do not affect the model and omitting abnormal values of the variables, so that the
production data meet the actual requirements. The selection of independent variables
plays a key role in establishing the model. The reactions that occur in the molten steel
bath of a converter are very complex, and end-point manganese content is affected by
numerous interacting factors. Therefore, in order to describe the entire melting process
adequately and keep the model clear, the multiple linear regression model employs as its
basic variables those factors which change dramatically and play a key role in the BOF
steelmaking process. Broadly speaking, a regression model assigns a weight to each
contributing factor such that the sum of squared deviations between the fitted line and
the data points is minimized. Suppose, for example, that we wish to predict the end-point
manganese from some contributing factors. Linear regression fits a straight line to the
plot of contributing factors vs. manganese %, and the line with the best fit (highest
R squared value) is taken as the most accurate description. There is, however, no
justification for the choice of this particular form of relationship. This and other
difficulties associated with ordinary linear regression analysis can be summarized as
follows: (a) A relationship has to be chosen before analysis. (b) The relationship chosen
tends to be linear, or with non-linear terms added together to form a pseudo linear
equation. (c) The regression equation, once derived, applies across the entire span of
the input space.
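The least squares fit described above can be illustrated with a minimal sketch. It assumes NumPy is available and uses synthetic data in place of plant data; the function name `fit_mlr`, the number of variables and the coefficient values are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R squared)."""
    A = np.column_stack([np.ones(len(X)), X])        # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)     # minimizes the sum of squared residuals
    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot

# Synthetic stand-in for filtered plant data (assumption: 200 heats, 3 input variables).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([0.5, 1.0, -2.0, 0.3])          # intercept followed by 3 slopes
y = true_coef[0] + X @ true_coef[1:] + rng.normal(scale=0.05, size=200)

coef, r2 = fit_mlr(X, y)
print("coefficients:", coef)
print("R squared:", r2)
```

Because the synthetic response really is linear in the inputs, the recovered coefficients are close to `true_coef`; with plant data the quality of the fit depends on how well difficulties (a) to (c) are tolerated.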
3. Artificial Neural Networks (ANN)
With the development of artificial intelligence, control methods based on neural
networks, alone or combined with other algorithms, have been widely used in BOF end-point
control[1-5]. ANNs represent an alternative computational paradigm in which the solution
to a problem is learned from a set of examples. The concept of the ANN originally comes
from the mechanisms of information processing in the human brain. ANN models have been
applied to a wide range of complex metallurgical processes[1-5] and have proved successful
because of their ability to capture non-linear relationships. ANNs are mathematical
structures built from neurons arranged in different layers and interconnected through
complex networks. The layers are defined as the input layer, the output layer and at least
one hidden layer. A multilayer feed-forward back propagation ANN has been used in the
present work. The typical ANN topology is presented in Fig. 1.
Fig. 1: Architecture of feed-forward back propagation ANN
The output y_k of a neuron k in the network is obtained by summing all signals x_j from the
previous layer, each multiplied by its weight w_{k,j}, adding a bias b_k, and passing the result
through a transfer function f (tanh sigmoid) in the following way:
$$ y_k = f\left( \sum_{j=1}^{N} w_{k,j}\, x_j + b_k \right), \quad \text{where } f(z) = \frac{2}{1 + \exp(-2z)} - 1 \qquad (1) $$
The weights are obtained by minimizing the sum of the squared errors between the training
output data and the output data computed by the ANN.
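As a minimal sketch of Equation (1) and of the squared-error criterion, the following uses only the Python standard library. The function names and the numerical inputs are illustrative assumptions; the training procedure itself (back propagation) is not shown.

```python
import math

def tansig(z):
    """Tanh sigmoid transfer function of Equation (1): 2 / (1 + exp(-2z)) - 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * z)) - 1.0

def neuron_output(x, w, b):
    """Weighted sum of the inputs plus bias, passed through the transfer function."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return tansig(z)

def sse(targets, outputs):
    """Sum of squared errors between training targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))

# Illustrative values only (assumptions, not plant data).
x = [0.5, -1.0]      # signals from the previous layer
w = [0.2, 0.4]       # weights w_{k,j}
b = 0.1              # bias b_k
y = neuron_output(x, w, b)
print(y, math.tanh(-0.2))   # the two agree, since this f(z) is identical to tanh(z)
```

The transfer function is written exactly as in Equation (1) to show that it coincides with the hyperbolic tangent; in practice `math.tanh` would be the more robust choice for inputs of large magnitude.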
4. Principal Component Analysis (PCA)
Principal component analysis is used to reduce the dimensionality of the data set. The principal
components are mutually orthogonal linear combinations of the original variables, so that all
variables can be expressed in terms of them. Only the leading principal components whose
cumulative variance exceeds 90% of the total variance are retained for analysis.
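The 90% cumulative variance rule can be sketched as follows. This is an illustration under stated assumptions: NumPy is available, the components are taken from the covariance matrix of mean-centred (unscaled) data, and the data are synthetic. The function name `pca_reduce` is hypothetical, and whether the original analysis scaled the variables is not stated in the paper.

```python
import numpy as np

def pca_reduce(X, threshold=0.90):
    """Return (coefficients, scores, cumulative variance fractions).

    Keeps the smallest number of leading components whose cumulative
    variance fraction reaches the threshold.
    """
    Xc = X - X.mean(axis=0)                              # centre each variable
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]                    # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, threshold)) + 1  # first index reaching threshold
    coeffs = eigvecs[:, :k]                              # orthonormal loading vectors
    return coeffs, Xc @ coeffs, cumulative

# Synthetic data (assumption): three independent variables with very different spreads.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3)) * np.array([3.0, 2.0, 0.1])

coeffs, scores, cumulative = pca_reduce(X)
print("components kept:", coeffs.shape[1])
print("cumulative variance fractions:", cumulative)
```

The returned scores can then serve as the input variables of a regression model in place of the original variables.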
5. Mahalanobis Taguchi System (MTS)
The Mahalanobis-Taguchi system is used to minimize the number of variables (or control factors)
required to predict the performance of a system. It is based on the Mahalanobis distance: a
Mahalanobis space is constructed and used to discriminate between normal and abnormal data,
after which orthogonal arrays and signal-to-noise ratios are used to calculate the effect of
each variable. The reduction in dimensionality of the problem is thus based on Mahalanobis
distances and signal-to-noise ratios[6-8].
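The first step of MTS, building the Mahalanobis space and computing distances, can be sketched as below. This is a minimal illustration assuming NumPy, synthetic data, and the scaled form of the distance commonly used in the MTS literature (distance divided by the number of variables); the function names are hypothetical, and the orthogonal array and signal-to-noise screening steps are not shown.

```python
import numpy as np

def mahalanobis_space(normal):
    """Mean, standard deviation and inverse correlation matrix of the normal group."""
    mean = normal.mean(axis=0)
    std = normal.std(axis=0, ddof=1)
    corr_inv = np.linalg.inv(np.corrcoef(normal, rowvar=False))
    return mean, std, corr_inv

def scaled_md(X, mean, std, corr_inv):
    """Scaled Mahalanobis distance z^T C^-1 z / k for each row of X."""
    Z = (np.atleast_2d(X) - mean) / std          # standardize with normal-group statistics
    k = Z.shape[1]
    return np.einsum("ij,jk,ik->i", Z, corr_inv, Z) / k

# Synthetic "normal" group (assumption): 300 observations of 3 variables.
rng = np.random.default_rng(2)
normal = rng.normal(loc=[10.0, 5.0, 1.0], scale=[2.0, 1.0, 0.2], size=(300, 3))
mean, std, corr_inv = mahalanobis_space(normal)

md_normal = scaled_md(normal, mean, std, corr_inv)
abnormal = mean + 5.0 * std                      # a point far outside the normal group
md_abnormal = scaled_md(abnormal, mean, std, corr_inv)[0]
print("mean scaled MD of normal group:", md_normal.mean())
print("scaled MD of abnormal point:", md_abnormal)
```

With this scaling the normal group has an average distance close to one, so observations with distances well above one stand out as abnormal.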
Result analysis of data driven model developed for phosphorus prediction in BOF
steelmaking process:
Data driven models have been developed for the prediction of end-point phosphorus in the
BOF steelmaking process. Table 1 gives the details of the steel plant data (400 records) used
for calculation.
Table 1: Range, mean and standard deviation of the data set used for investigation
Variable                               Max      Min      Mean     Std. dev.
HMP (hot metal phosphorus, wt %)       0.29     0.20     0.24     0.02
SL_FE (Fe level of the slag, wt %)     26.50    14.60    19.50    2.00
It can be seen that phosphorus has the strongest correlation with EB_TEMP, followed by
OXY_ACT and SCP_ACT. The interdependence among the different variables is also evident from
the above table.
In the MTS-MLR method, an MTS run was carried out first. In the MTS run, the variables having
positive gain values were selected, as given in Table 3.
Table 3: The selected variables and their positive gain values after MTS run
            Coefficients    Standard Error   t Stat          P-value
Intercept   -0.095478988    0.008351118      -11.43307878    2.32082E-26
LIME        -0.000324903    7.41834E-05      -4.379719093    1.52226E-05
HMWT_ACT    -4.15385E-05    1.54942E-05      -2.680895095    0.007648581
EB_TEMP     7.48028E-05     5.02834E-06      14.87624737     4.26512E-40
CaO/SiO2    -0.001740103    0.000497778      -3.495740179    0.000526106
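As a reading aid for the regression output above, the t statistic of each term is the coefficient divided by its standard error. The short sketch below recomputes that ratio from the tabulated values using only the Python standard library; the dictionary layout and names are illustrative assumptions.

```python
# (coefficient, standard error, reported t statistic) copied from the table above.
rows = {
    "Intercept": (-0.095478988, 0.008351118, -11.43307878),
    "LIME":      (-0.000324903, 7.41834e-05, -4.379719093),
    "HMWT_ACT":  (-4.15385e-05, 1.54942e-05, -2.680895095),
    "EB_TEMP":   (7.48028e-05, 5.02834e-06, 14.87624737),
    "CaO/SiO2":  (-0.001740103, 0.000497778, -3.495740179),
}

def t_stat(coefficient, std_error):
    """t statistic of a regression coefficient."""
    return coefficient / std_error

recomputed = {name: t_stat(c, se) for name, (c, se, _) in rows.items()}
for name, (_, _, reported) in rows.items():
    print(f"{name:10s} recomputed {recomputed[name]: .4f}   reported {reported: .4f}")
```

The recomputed ratios agree with the reported t statistics to within rounding of the printed coefficients, and the largest magnitude among the variables belongs to EB_TEMP.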
[Plot: Predicted [P] versus Actual [P]]
In the MTS-MLR-ANN method, a neural network model was developed using the variables finally
selected in the MTS-MLR method. The predictive performance of the MTS-MLR-ANN model is plotted
in Fig. 3.
Fig. 3: Predictive performance of MTS-MLR-ANN models
            Coefficients    Standard Error   t Stat       P-value
Intercept   -0.111081634    0.008684         -12.7908     1.31E-31
SCP_ACT     0.000103661     1.59E-05         6.537584     1.92E-10
OXY_ACT     -1.20841E-06    4.1E-07          -2.94548     0.003414
EB_TEMP     8.11189E-05     5.5E-06          14.75673     1.34E-39
CaO/SiO2    -0.001239299    0.000508         -2.43904     0.015164
[Plot: Predicted [P] versus Actual [P]]
Fig. 4: Predictive performance of MLR model
Based upon Principal Component analysis using MATLAB for given data set, the following
principal components are calculated for more than 90% cumulative variance (Table 6).
            Coeff_Princomp1   Coeff_Princomp2   Coeff_Princomp3   Coeff_Princomp4
LIME        -0.000298829      -0.010145147      -0.006702508      0.089812428
HMA_SI      3.30E-07          -0.000713899      0.001339821       0.00026098
HMA_P       3.06E-06          -1.34E-05         5.60E-05          3.46E-05
HMA_TEMP    0.0130596         0.950162146       -0.309832054      -0.018530731
HMWT_ACT    0.004706269       -0.008619643      -0.114085367      0.731035508
SCP_ACT     -0.00981226       0.047189526       0.130666009       -0.636634756
ORE         0.0022988         0.00798071        -0.01753033       0.136681841
OXY_ACT     -0.999655022      0.01804258        0.01268418        0.013206178
EB_TEMP     -0.019884965      -0.307198914      -0.934555081      -0.175068211
SL_FE       -0.00024973       -0.005020839      0.001335654       0.048200327
SL_P2O5     -3.54E-05         0.001304467       -0.001737459      -0.004459078
CaO/SiO2    -7.20E-05         5.39E-05          -0.001447738      0.00501806
Multiple linear regression performed using these principal components as variables gives the
following results (Table 7):
            Coefficients    Standard Error   t Stat          P-value
Intercept   0.013216104     0.000133358      99.10259195     4.0634E-282
Princomp1   -831133312.8    375415963.2      -2.213899765    0.027403336
princomp2   831133312.8     375415963.2      2.213899765     0.027403336
princomp3   -2.19824E-05    4.77063E-06      -4.607860563    5.49031E-06
princomp4   -5.90027E-05    5.73514E-06      -10.28793552    3.65163E-22
Predictive performance of PCA-MLR model is plotted in Fig. 7.
Data driven models for the prediction of phosphorus from industrial data were developed using
the ANN, MLR, MLR-ANN, MTS-MLR, MTS-MLR-ANN and PCA-MLR approaches. The relative performances
of all these models are given in Table 8:
Based on the performance of the various data based models, the MTS-MLR approach is found to
perform best, followed by MLR and MLR-ANN (4-6-1). Reducing the dimensionality of the problem
using the MTS or PCA approach is recommended in order to work with fewer control variables.
The performance of any data based model depends on the distribution of the data of the process
concerned. In general, linear regression models should work better when the range of variation
of the variables is not large, which is the case for most industrial steelmaking processes, as
they are operated within a small, well-defined domain of variation. In this study the ANN
models did not yield better performance, which is attributed to the noise and chaotic nature
of the process.
References:
1. Rajesh N., Khare M.R. and Pabi S.K.: Materials Research, 2010, vol. 13(1), pp 15.
2. Shukla A.K., Deo B. and Robertson D.G.C.: Metallurgical and Materials Transactions B, July
2013.
3. Fileti A.M.F., Pacianotto T.A. and Cunha A. P.: Engineering Applications of Artificial
Intelligence, 2006, vol. 19, pp 9.
4. Das A., Maiti J., Banerjee R.N.: Expert Systems with Applications, 2010, vol. 37, pp 1075.
5. Cox I.J., Lewis R.W., Ransing R.S., Laszczewski H., Berni G.: Journal of Materials
Processing Technology, 2002, vol. 120, pp 310.
6. Taguchi G. and Rajesh J: Sankhya-The Indian Journal of Statistics, 2000, vol. 62(B), pp 233.
7. Cudney E.A., Hong J., Jugulum R., Paryani K., Ragsdell K.M. and Taguchi G.: Journal of
Industrial and Systems Engineering, 2007, vol. 1(2), pp 139.
8. Bagchi T.P.: Taguchi Methods Explained, Prentice Hall of India, New Delhi (1993).