Utilization of Support Vector Machine To Calculate Gas Compressibility Factor

Article history: Received 26 April 2013; Received in revised form 15 August 2013; Accepted 18 August 2013; Available online 26 August 2013

Keywords: Prediction of compressibility factor; Least square support vector machine; Coupled simulated annealing; Pseudo-reduced pressure; Pseudo-reduced temperature

Abstract

The compressibility factor (Z-factor) is considered a very important parameter in the petroleum industry because of its broad applications in PVT characterization. In this study, a meta-learning algorithm called Least Square Support Vector Machine (LSSVM) was developed to predict the compressibility factor. The proposed technique was compared with previous models, exhibiting an R2 and an MSE of 0.999 and 0.000014, respectively. A significant drawback of the conventional LSSVM is the determination of the optimal parameters needed to attain the desired output with reasonable accuracy. To eliminate this problem, the current study introduced the coupled simulated annealing (CSA) algorithm to develop a new model, known as CSA-LSSVM. The proposed algorithm used 4756 data points to validate the effectiveness of the CSA-LSSVM model using statistical criteria. The new technique can be utilized in chemical and petroleum engineering software packages where the most accurate value of the Z-factor is required to predict the behavior of real gases, significantly affecting the design of equipment involved in gas processing plants.

© 2013 Elsevier B.V. All rights reserved.
1.0 < Tpr < 3.0). To find this temperature function for low-temperature isotherms, 890 Z-factor data from the Standing–Katz chart were employed. The Z-factor correlation and the corresponding parameters introduced by Hall and Iglesias-Silva [18] are presented as follows:

Z = \frac{1 + y + y^2 - y^3}{(1 - y)^3} - \left(\frac{14.54}{T_{pr}} - \frac{8.23}{T_{pr}^2} + \frac{3.39}{T_{pr}^{3.5}}\right) y + \left(\frac{90.7}{T_{pr}} - \frac{242.2}{T_{pr}^2} + \frac{42.4}{T_{pr}^3}\right) y^{(1.18 + 2.82/T_{pr})} + k_1 y \exp[-k_2 (y - 0.421)^2] + k_3 y^{10} \exp[-69279 (y - 0.374)^4]    (6)

B = e + f T_{pr}^{2.4} + g P_{pr}^{1.56} + h P_{pr}^{0.124} T_{pr}^{3.033}    (13)

C = i [\ln(T_{pr})]^{-1.28} + j [\ln(T_{pr})]^{1.37} + k \ln(P_{pr}) + l [\ln(P_{pr})]^2 + m \ln(P_{pr}) \ln(T_{pr})    (14)

D = 1 + n T_{pr}^{5.55} + o P_{pr}^{0.68} T_{pr}^{0.33}    (15)

E = p [\ln(T_{pr})]^{1.18} + q [\ln(T_{pr})]^{2.1} + r \ln(P_{pr}) + s [\ln(P_{pr})]^2 + t \ln(P_{pr}) \ln(T_{pr})    (16)

The numerical values of the coefficients (e.g., a, b, c, and d) in Eqs. (12)–(15) are given in Table 1.

Table 1
Tuned coefficients of Azizi–Behbahani–Isazadeh (2010) equation.

Coefficient | Value | Coefficient | Value
a | 0.0373142485385592 | k | −24,449,114,791.1531
b | −0.0140807151485369 | l | 19,357,955,749.3274
c | 0.0163263245387186 | m | −126,354,717,916.607
d | 0.0307776478819813 | n | 623,705,678.385784
e | 13,843,575,480.943800 | o | 17,997,651,104.3330
f | −16,799,138,540.763700 | p | 151,211,393,445.064
g | 1,624,178,942.6497600 | q | 139,474,437,997.172
h | 13,702,270,281.086900 | r | −24,233,012,984.0950
i | −41,645,509.896474600 | s | 18,938,047,327.5205
j | 237,249,967,625.01300 | t | −141,401,620,722.689

Table 2 provides the values of the constant coefficients (e.g., A1, A2, A3, ..., A10) for Eq. (17) at two different pseudo-reduced pressure ranges.

Table 2
Optimal coefficients of Heidaryan–Moghadasi–Rahimi (2010) model.

Coefficient | Value [0.2 ≤ Ppr ≤ 3.0] | Value [3.0 < Ppr ≤ 15.0]
A1 | 2.827793 × 10^0 | 3.252838 × 10^0
A2 | −4.688191 × 10^−1 | −1.306424 × 10^−1
A3 | −1.262288 × 10^0 | −6.449194 × 10^−1
A4 | −1.536524 × 10^0 | −1.518028 × 10^0
A5 | −4.535045 × 10^0 | −5.391019 × 10^0
A6 | 6.895104 × 10^−2 | −1.379588 × 10^−2
A7 | 1.903869 × 10^−1 | 6.600633 × 10^−2
A8 | 6.200089 × 10^−1 | 6.120783 × 10^−1
A9 | 1.838479 × 10^0 | 2.317431 × 10^0
A10 | 4.052367 × 10^−1 | 1.632223 × 10^−1
A11 | 1.073574 × 10^0 | 5.660595 × 10^−1

2.1.2.5. Heidaryan–Salarabadi–Moghadasi model [25]. Multiple regression analysis was performed by Heidaryan et al. [25] to develop an empirical correlation. The non-linear regression equation was obtained by using 1220 data points, where the ranges of Ppr and Tpr are 0.2–15 and 1.2–3.0, respectively. The Heidaryan–Salarabadi–Moghadasi model [25] is given by:

Z = \frac{A_1 + A_2 \ln(P_{pr}) + A_3 [\ln(P_{pr})]^2 + A_4 [\ln(P_{pr})]^3 + A_5/T_{pr} + A_6/T_{pr}^2}{1 + A_7 \ln(P_{pr}) + A_8 [\ln(P_{pr})]^2 + A_9/T_{pr} + A_{10}/T_{pr}^2}    (18)

Table 3 lists the coefficients of the above correlation.

2.1.2.6. Sanjari and Nemati Lay model [23]. Given 5844 experimental data points for Z-factors of natural gas mixtures, Sanjari and Nemati Lay [23] proposed an empirical equation within the ranges of 1.01 < Tpr < 3.0 and 0.01 < Ppr < 15.0. This model divides the pressure region into two parts, 0.01 < Ppr < 3.0 and 3.0 < Ppr < 15. Depending on the range of reduced pressure, different values for the coefficients of the following correlation were suggested:

Z = 1 + A_1 P_{pr} + A_2 P_{pr}^2 + \frac{A_3 P_{pr}^{A_4}}{T_{pr}^{A_5}} + \frac{A_6 P_{pr}^{(A_4+1)}}{T_{pr}^{A_7}} + \frac{A_8 P_{pr}^{(A_4+2)}}{T_{pr}^{(A_7+1)}}    (19)

The tuned coefficients for Eq. (19) are tabulated in Table 4.
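Because Eqs. (18) and (19) are explicit in Tpr and Ppr, they are straightforward to evaluate numerically. As a minimal sketch, the rational form of Eq. (18) can be coded as below; the coefficient values A1–A10 belong to Table 3 (not reproduced here), so the array passed to the function is a placeholder that the user must fill in.

```python
import math

def z_heidaryan_salarabadi_moghadasi(ppr, tpr, a):
    """Evaluate Eq. (18): rational correlation for the gas compressibility factor.

    ppr, tpr : pseudo-reduced pressure and temperature
    a        : sequence of the ten coefficients A1..A10 from Table 3 (placeholder here)
    """
    lnp = math.log(ppr)
    num = a[0] + a[1] * lnp + a[2] * lnp**2 + a[3] * lnp**3 + a[4] / tpr + a[5] / tpr**2
    den = 1.0 + a[6] * lnp + a[7] * lnp**2 + a[8] / tpr + a[9] / tpr**2
    return num / den

# Example call with dummy coefficients (replace with the Table 3 values):
# z = z_heidaryan_salarabadi_moghadasi(ppr=2.0, tpr=1.5, a=[1.0] * 10)
```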
This approach was initially used by Normandin et al. [30] to develop an artificial neural network (ANN) model for Z-factor estimation of a number of pure gases. Kamyab et al. [31] also implemented ANN modeling to predict the compressibility factor of natural gases. The model they introduced, which covers wide ranges of reduced temperature and pressure (1 < Tpr < 3.0 and 0 < Ppr < 30), exhibits higher accuracy compared to the Normandin et al. model [30]. Baniasadi et al. [32] designed an ANN model in which the M-factor (BP/RT), reduced temperature, and reduced pressure were selected as the input variables; their model is based on 597 data points. Kamari et al. [33] obtained an intelligent model on the basis of support vector machines for estimation of the sour gas compressibility factor. In addition, Chamkalani et al. [34] introduced particle swarm optimization (PSO) and genetic algorithm (GA) as population-based stochastic search algorithms to optimize the weights and biases of networks while predicting the gas compressibility factor. They conducted a comparison between the optimized ANN model and 13 conventional models, showing the superiority of their developed smart technique.

2.2. Theoretical perspectives of smart systems used

2.2.1. Artificial neural network

An artificial neural network (ANN) is a nonlinear method for solving problems that might not be solvable by conventional techniques. Neurons are considered as synchronous processing elements inspired by the biological nerve system; these elements also follow human learning activity in terms of methodology [35,36]. There are different network structures for ANN. The feed-forward neural network (FFNN) is the most commonly used structure. An FFNN consists of three main layers: an input layer, an output layer, and an intermediate (hidden) layer. Each layer is connected to the next layer by the neurons. Each node in a layer is strengthened or weakened by a weight, leading to a significant effect on the power and role of the next layers. The link between the ith and jth neurons is multiplied by a weight coefficient w_{ij}, and this adjustment is conducted by the threshold coefficient \vartheta_i for the ith neuron. The output of the ith neuron, x_i, is obtained as follows:

x_i = f(\xi_i), \quad \text{where } \xi_i = \vartheta_i + \sum_{j} w_{ij} x_j    (20)

In the above, \xi_i and f(\xi_i) are the potential of the ith neuron and the activation function, respectively. The sigmoid function, shown below, is one of the most common activation functions, applicable to most engineering problems:

f(\xi) = \frac{1}{1 + e^{-\xi}}    (21)

Further information about ANN can be found in the recent literature [36–47].
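To make Eqs. (20) and (21) concrete, the sketch below evaluates one feed-forward pass of the 2–10–1 network described later in Section 3.2 (sigmoid hidden layer, linear output). The weights and thresholds are random placeholders; in the paper they are obtained by scaled-conjugate-gradient back-propagation, which is not reproduced here.

```python
import numpy as np

def sigmoid(xi):
    # Eq. (21): f(xi) = 1 / (1 + exp(-xi))
    return 1.0 / (1.0 + np.exp(-xi))

def ann_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One feed-forward pass, Eq. (20): xi_i = theta_i + sum_j w_ij * x_j, then x_i = f(xi_i)."""
    hidden = sigmoid(w_hidden @ inputs + b_hidden)   # sigmoid hidden layer (10 neurons)
    return float(w_out @ hidden + b_out)             # linear output neuron (Z estimate)

rng = np.random.default_rng(0)
w_hidden, b_hidden = rng.normal(size=(10, 2)), rng.normal(size=10)   # placeholder weights
w_out, b_out = rng.normal(size=10), rng.normal()
z_hat = ann_forward(np.array([2.0, 1.5]), w_hidden, b_hidden, w_out, b_out)  # inputs [Ppr, Tpr]
```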
The support vector machine formulation on which LSSVM is built can be written as the constrained optimization problem:

\min_{w,b,e} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} e_i^2    (22)

subject to y_i (w^T \varphi(x_i) + b) \geq 1 - e_i, \quad i = 1, \ldots, l.

The Lagrangian equation is defined by:

L(w, b, e; \alpha) = \frac{1}{2} w^T w + C \sum_{i=1}^{l} e_i^2 - \sum_{i=1}^{l} \alpha_i \{ y_i (w^T \varphi(x_i) + b) - 1 + e_i \}    (23)

If the Lagrangian is differentiated with respect to w, b, e_i, and \alpha_i, the following set of equations is obtained by applying the optimality conditions:

\partial L/\partial w = 0 \;\rightarrow\; w = \sum_{i=1}^{l} \alpha_i y_i \varphi(x_i)
\partial L/\partial b = 0 \;\rightarrow\; \sum_{i=1}^{l} \alpha_i y_i = 0
\partial L/\partial e_i = 0 \;\rightarrow\; \alpha_i = C e_i, \quad i = 1, \ldots, l
\partial L/\partial \alpha_i = 0 \;\rightarrow\; y_i (w^T \varphi(x_i) + b) - 1 + e_i = 0, \quad i = 1, \ldots, l    (24)

Eliminating w and e_i results in the following KKT system:

\begin{bmatrix} 0 & y^T \\ y & \Omega + C^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \vec{1} \end{bmatrix}    (25)

where C is a positive constant (regularization parameter) and b is the bias. The other variables in Eqs. (24) and (25) are defined by:

y = (y_1, \ldots, y_l)^T    (26)
\vec{1} = (1, \ldots, 1)^T    (27)
\alpha = (\alpha_1, \ldots, \alpha_l)^T    (28)
\Omega_{ij} = y_i y_j \varphi(x_i)^T \varphi(x_j) = y_i y_j K(x_i, x_j), \quad 1 \leq i, j \leq l    (29)

Applying the Mercer condition [52], the final result is as follows:

y(x) = \mathrm{sign}\left( \sum_{i=1}^{l} \alpha_i y_i K(x, x_i) + b \right)    (30)

The most common kernel functions used in SVM are the following three [53,54]:

1. Linear:
K(x_i, x_j) = x_i \cdot x_j    (31)

2. Polynomial:
K(x_i, x_j) = (x_i \cdot x_j + c)^d, \quad d \in \mathbb{N}, \; c \geq 0    (32)

3. Gaussian (RBF):
K(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right)    (33)

It is important to note that SVM-based methods have many advantages over traditional methods based on the ANN technique. The most important advantages of the SVM model are as follows [49,55]:

• The model has only two parameters.
• It has a higher probability of converging to the global optimum.
• The method can usually find a solution that is quickly attained by a standard algorithm.
• Determination of the network topology is not required in advance; the network structure is found automatically as the training process terminates.
• Over-fitting complications are less likely in SVM systems.
• There is no requirement for choosing the number of hidden nodes in the SVM method.
• Acceptable generalization performance is attainable with the technique compared to ANN.
• Fewer adjustable parameters are required in the SVM technique.
• SVM models require convex optimization procedures.

2.2.3. Coupled simulated annealing

The coupled simulated annealing (CSA) method was proposed by Xavier-de-Souza et al. [56]. CSA introduces a new class of acceptance probability functions that can be applied to an ensemble of optimizers [57–59]. This approach considers several current states that are coupled together by their energies in their acceptance probability functions [57–59]. Parallelism is also an inherent characteristic of this class of techniques. The objective of generating coupled acceptance probability functions, which consist of the energies of many current states and/or solutions, is to produce further information when deciding whether to accept less encouraging solutions.

Namely, this probability also depends on the costs of the solutions in a set \Theta \in \Omega, where \Omega is the set of all possible solutions. This dependence is given by the coupling term \gamma, which is generally a function of the costs of the solutions in \Theta. In CSA, the acceptance probability A_\Theta and the coupling term \gamma are given by

A_\Theta(\gamma, x_i \rightarrow y_i) = \frac{\exp\left( \left[ E(x_i) - \max_{q \in \Theta} E(q) \right] / T_k^{ac} \right)}{\gamma}    (34)

\gamma = \sum_{\forall q \in \Theta} \exp\left( \frac{E(q) - \max_{q \in \Theta} E(q)}{T_k^{ac}} \right)    (35)

Here, T_k^{ac} is the acceptance temperature, and x_i and y_i denote an individual solution of \Theta and its corresponding probing solution, respectively. These two equations define A_\Theta as a probability; the sum of the probabilities of leaving any of the current states equals 1.0.
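A minimal numerical illustration of Eqs. (34) and (35) is given below: for a set of current solutions with energies E, the coupled acceptance probabilities are obtained by normalizing each exponential term with the coupling term γ. The energies and temperature are arbitrary example values, not taken from the paper.

```python
import numpy as np

def csa_acceptance_probabilities(energies, t_acc):
    """Eqs. (34)-(35): coupled acceptance probabilities for leaving each current state."""
    e = np.asarray(energies, dtype=float)
    shifted = (e - e.max()) / t_acc          # E(x_i) - max_q E(q), scaled by T_k^ac
    gamma = np.exp(shifted).sum()            # coupling term, Eq. (35)
    return np.exp(shifted) / gamma           # acceptance probabilities, Eq. (34)

probs = csa_acceptance_probabilities(energies=[0.8, 1.2, 0.5, 2.0], t_acc=1.0)
# probs.sum() == 1.0, consistent with the statement that these probabilities sum to unity
```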
3. Data collection and methodology

3.1. Data assembly

The study was carried out using 4756 data points extracted from the Standing and Katz chart [6] in the ranges of 0 ≤ Ppr ≤ 30 and 1 ≤ Tpr ≤ 3. The inputs include Ppr and Tpr, while the compressibility factor, Z, is the output parameter. Although the globally accepted chart used in this study is more appropriate for various hydrocarbons, the model developed can be extended to all types of gases. The data were divided into two parts: the training phase contains 3503 data points, while the testing part includes data distributions of 258, 258, 258, 258, and 113 points. The corresponding pseudo-reduced temperature values for these distributions are 1.15, 1.35, 1.6, 2.2, and 2, respectively.

3.2. Research methodology

In this study, a variety of methodologies for prediction of the compressibility factor were examined. Two iterative correlations, four directly derived correlations, and two intelligent systems were chosen in order to make a comprehensive as well as fair judgment. The two intelligent systems used in this paper are the artificial neural network (ANN) and the least square support vector machine (LSSVM) combined with coupled simulated annealing (CSA) optimization. The two iterative models are those of Dranchuk and Abou-Kassem [19] and Hall and Iglesias-Silva [18], whereas the four direct techniques are the Azizi–Behbahani–Isazadeh [24], Heidaryan–Moghadasi–Rahimi [1], Heidaryan–Salarabadi–Moghadasi [25], and Sanjari and Nemati Lay [23] models.

One input layer with two neurons, one hidden layer with ten neurons, and one output layer with one neuron were determined to be the proper architecture for the ANN system. The scaled conjugate gradient back-propagation method was selected as the training algorithm. In addition, the ANN model was constructed with sigmoid and linear activation functions for the hidden and output layers, respectively. In order to remove outliers and keep the data in a certain range, data normalization was carried out.

Prior to implementation of LSSVM modeling, a number of system parameters should be determined through an optimization approach, as parameter selection is a problem that generally hinders the applicability of SVM techniques [57]. The predetermined parameters are the regularization parameter (C), the kernel parameter (σ²), and the kernel function. The Gaussian radial basis function (RBF) is chosen as the kernel function in the algorithm proposed in this study, as it is the most commonly used kernel for the LSSVM model. It is noteworthy that the kernel parameter (σ²) has a significant impact on the generalization performance of LSSVM [60]. The regularization parameter (C) determines the trade-off between minimizing the training error and minimizing model complexity [60]. The parameters of the kernel function implicitly describe the non-linear mapping from the input space toward the high-dimensional feature space [61]. If the system parameters are not properly selected, the performance of LSSVM will decline. Hence, it is necessary to optimize the parameters of LSSVM in order to obtain a reasonable performance and handle the learning task. This requires either a comprehensive search throughout the parameter space or an optimization methodology that examines only a finite subset of the possible values [62]. The rationale for parameter optimization in the LSSVM system makes it crucial to evaluate the generalization error and determine its global minimum over the parameter space.

The optimization technique adapted for the case under study is a heuristic-based method known as coupled simulated annealing (CSA). The parameters assigned to the CSA are the generation temperature (Tk) and acceptance temperature (Tkac) at time instant k; both parameters are initially set to one. The LSSVM parameter search ranges are considered to be σ² = [0.01, 10⁴] and C = [10, 10⁴].

4. Results and discussion

The optimal parameters for the LSSVM algorithm using the RBF kernel were calculated as σ² = 0.067643401336 and C = 1,044,674.4225 through the optimization approach.
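As a sketch of how the tuned model is evaluated, the snippet below trains an LSSVM in its regression (function estimation) form, after Suykens and Vandewalle [51], by solving the kernel linear system analogous to Eq. (25) with an RBF kernel and the reported σ² and C. The regression variant and the tiny synthetic data set are assumptions for illustration, not the authors' code.

```python
import numpy as np

def rbf_kernel(a, b, sigma2):
    # Eq. (33): K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2*sigma^2))
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma2))

def lssvm_train(x, y, c, sigma2):
    """Solve the LSSVM regression linear system (analogous to Eq. (25))."""
    n = len(y)
    omega = rbf_kernel(x, x, sigma2)
    lhs = np.block([[np.zeros((1, 1)), np.ones((1, n))],
                    [np.ones((n, 1)), omega + np.eye(n) / c]])
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(lhs, rhs)
    return sol[0], sol[1:]                   # bias b, support values alpha

def lssvm_predict(x_new, x_train, alpha, b, sigma2):
    return rbf_kernel(x_new, x_train, sigma2) @ alpha + b

# Illustration with the reported hyper-parameters and a toy (Ppr, Tpr) -> Z set:
sigma2, c = 0.067643401336, 1_044_674.4225
x = np.array([[0.5, 1.5], [2.0, 1.5], [4.0, 2.0]])   # placeholder training points
z = np.array([0.95, 0.80, 0.90])                     # placeholder Z-factors
b, alpha = lssvm_train(x, z, c, sigma2)
z_hat = lssvm_predict(np.array([[1.0, 1.5]]), x, alpha, b, sigma2)
```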
Fig. 1. Simulated values of compressibility factors against observed values for two iterative correlations, namely Dranchuk and Abou-Kassem (1975) and Hall and Iglesias-Silva (2007).
Fig. 2. Predicted values of compressibility factors versus observed values for four direct correlations including, Azizi–Behbahani–Isazadeh (2010),
Heidaryan–Moghadasi–Rahimi (2010), Heidaryan–Salarabadi–Moghadasi (2010), and Sanjari and Nemati Lay (2012).
Fig. 1 depicts the predicted values of compressibility factors against the values obtained from the two iterative correlations, namely Dranchuk and Abou-Kassem [19] and Hall and Iglesias-Silva [18]. As seen from the figure, the Dranchuk and Abou-Kassem correlation [19] exhibits moderate behavior, while the Hall and Iglesias-Silva model [18] follows a scattered pattern. As shown in Fig. 2, a similar comparison was conducted for the four direct correlations: Azizi–Behbahani–Isazadeh [24], Heidaryan–Moghadasi–Rahimi [1], Heidaryan–Salarabadi–Moghadasi [25], and Sanjari and Nemati Lay [23]. Also, the smart techniques (the ANN and the LSSVM model) were compared with each other in terms of predictive performance. The results depicted in Fig. 3 clearly denote the high predictive capability of both intelligent systems, though better computational performance is attained employing the LSSVM.

The relative error percentages of all models considered for comparison purposes are plotted in Fig. 4 as a function of pseudo-reduced pressure. The data used in the figure were included in the training phase. The interesting point in the figure is that a noisy pattern is observed when Ppr is lower than 2, although this scatter differs among the models. As is obvious from Fig. 4, the LSSVM model demonstrates its ability to attain higher performance in terms of predictive accuracy.
Fig. 3. Simulated values of compressibility factors against real data for two intelligent systems introduced as LSSVM and ANN.
Fig. 4. Relative error percentage [(Pred − Obs)/Obs × 100] of all models taken into account in the present study. “Pred” and “Obs” stand for “Predicted” and “Observed” values,
respectively.
Table 5
Performance assessment of various predictive models in the training phase. Column headings correspond to the statistical criteria defined in the Appendix.

Model | R² | ARE | AARE | SD | MSE | RMSE | Σ REP | AREP | AAREP
LSSVM | 0.999 | 8.89E-05 | 0.001884 | 0.333372 | 0.000009 | 0.003067 | 31.15468 | 0.008894 | 0.188414
ANN | 0.995 | 0.002557 | 0.015732 | 0.331552 | 0.000447 | 0.021304 | 895.6572 | 0.255683 | 1.573193
Dranchuk and Abou-Kassem (1975) | 0.894 | −0.04684 | 0.058937 | 0.38988 | 0.016044 | 0.133254 | −16,403.3 | −4.68397 | 5.893711
Hall and Iglesias-Silva (2007) | 0.081 | −0.54375 | 0.581698 | 0.675072 | 0.093971 | 0.710811 | −168,725 | −54.3747 | 58.16978
Heidaryan–Moghadasi–Rahimi (2010) | 0.994 | 0.004637 | 0.011061 | 0.338875 | 0.000590 | 0.02491 | 1623.863 | 0.463696 | 1.106065
Heidaryan–Salarabadi–Moghadasi (2010) | 0.968 | 0.011992 | 0.046886 | 0.323804 | 0.003334 | 0.059605 | 4199.539 | 1.199183 | 4.688588
Azizi–Behbahani–Isazadeh (2010) | 0.744 | −0.01311 | 0.177643 | 0.265633 | 0.016868 | 0.183981 | −4589.44 | −1.31052 | 17.76433
Sanjari and Nemati Lay (2012) | 0.790 | 0.100905 | 0.107511 | 0.487149 | 0.047368 | 0.257561 | 35,347.19 | 10.09055 | 10.75114
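The statistical criteria reported in Tables 5 and 6 follow the definitions given in the Appendix; a compact sketch of their computation, assuming arrays of observed and predicted Z-factors, is:

```python
import numpy as np

def error_metrics(z_obs, z_pred):
    """Statistical criteria from the Appendix (R2, ARE, AARE, MSE, RMSE, AREP, AAREP)."""
    z_obs, z_pred = np.asarray(z_obs, float), np.asarray(z_pred, float)
    rel = (z_pred - z_obs) / z_obs                      # relative error of each point
    ss_res = ((z_pred - z_obs) ** 2).sum()
    ss_tot = ((z_pred - z_obs.mean()) ** 2).sum()       # denominator as defined in the Appendix
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "ARE": rel.mean(),
        "AARE": np.abs(rel).mean(),
        "MSE": ((z_pred - z_obs) ** 2).mean(),
        "RMSE": np.sqrt(((z_pred - z_obs) ** 2).mean()),
        "AREP": 100.0 * rel.mean(),                     # relative error percent, averaged
        "AAREP": 100.0 * np.abs(rel).mean(),
    }
```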
Fig. 5. Predicted and digitized compressibility factor values plotted with dots and lines for the Dranchuk and Abou-Kassem model (1975).
Fig. 6. Predicted and observed compressibility factor values shown with dots and lines for the Hall and Iglesias-Silva model (2007).
It is also concluded from the figure that the relative error percent for LSSVM lies between −20 and 20, whereas the other models fall into larger limits.

Table 5 lists the magnitudes of the coefficient of determination (R²) and mean squared error (MSE) in order to assess the performance of the models in the training stage. As the data show, the LSSVM achieves an excellent performance with an R² of 0.999 and an MSE of 0.000005, while the R² and MSE of the next-best model (the conventional ANN) are equal to 0.995 and 0.000447, respectively. The definitions of the statistical parameters are presented in the Appendix.

The pseudo-reduced temperatures [1.15, 1.35, 1.60, 2.20, 2.00] and various pseudo-reduced pressures (mostly greater than 15) were selected for the testing phase in order to allow a better judgment of the predictive potential of the deterministic approaches.

In Fig. 5a, the compressibility factors predicted by the Dranchuk and Abou-Kassem correlation at different reduced temperatures and pressures are plotted versus the observed values of the Z-factor. It is important to note that the digitized and estimated values are shown in Fig. 5 with dots and lines, respectively. As can be seen in Fig. 5a, the outputs obtained from Dranchuk and Abou-Kassem [19] demonstrate a considerable deviation at low pressure and temperature compared to the observed (digitized) values. The relative error percent representing the correlation performance of the Dranchuk and Abou-Kassem model is also depicted in Fig. 5b; for better evaluation, a part of the figure was magnified. Based on Fig. 5, the Dranchuk and Abou-Kassem model [19] has a weak predictive performance at Tpr equal to 1.15, 1.35, and 2.00 when the gas pressure exceeds the critical pressure.
Fig. 7. Observed and estimated values of compressibility factor versus Ppr , based on Heidaryan–Moghadasi–Rahimi model (2010).
Fig. 8. Results obtained from Heidaryan–Salarabadi–Moghadasi correlation (2010) against observed data.
The performance of the Hall and Iglesias-Silva correlation [18] in determining the Z-factor was examined by including the real and predicted data in Fig. 6. The correlation demonstrates a satisfactory performance at very low Ppr, whereas unacceptable predictions, in contrast to the observed values, are obtained in the reduced temperature and pressure ranges of [1.15, 1.35] and [1, 7], respectively, resulting in high error percentages.

It is observed from Fig. 7 that the correlation proposed by Heidaryan et al. [1] to predict the gas compressibility factor attains acceptable outputs (for Ppr lower than 15), which look proper for practical cases, though the model overestimates the Z-factor when the pressure is greater than the critical pressure. The main reason for the high error percentage at Ppr > 15 appears to be the limited range of reduced pressure (0.2–15) used for derivation of the correlation.

Fig. 8 compares the predictive performance of the Heidaryan–Salarabadi–Moghadasi model [25] with the Heidaryan–Moghadasi–Rahimi model [1], exhibiting higher accuracy for the latter correlation. In contrast to the Heidaryan–Moghadasi–Rahimi correlation [1], the Heidaryan–Salarabadi–Moghadasi model underestimates the Z-factor at Ppr above 15. It is worth noting that the relative error percentage for the Heidaryan–Salarabadi–Moghadasi model mostly lies in the interval [−15, 21], while it varies between −3 and 8 for the Heidaryan–Moghadasi–Rahimi correlation [1].

The outcome of the Azizi–Behbahani–Isazadeh model [24] in estimating the Z-factor is depicted in Fig. 9, which clearly shows the weak predictive performance of the correlation: a noticeable deviation with respect to the observed compressibility factor is experienced, particularly when the reduced pressure is below 8.

Fig. 10 presents the results obtained from the Sanjari and Nemati Lay [23] model. It is clear from the figure that undesirable predictions are obtained at Tpr = 1.15. Also, the correlation gives higher values of the Z-factor for pseudo-reduced pressures over 15. This might be attributed to the limited database employed when the model was originally developed.
Fig. 9. Estimated and observed compressibility factor versus Ppr , based on Azizi–Behbahani–Isazadeh model (2010).
Fig. 10. Z-factors predicted by Sanjari and Nemati Lay model (2012) in contrast with observed data.
As shown in Fig. 11, the artificial neural network (ANN) is able to satisfactorily simulate the variation of the Z-factor as a function of Tpr and Ppr within the wide ranges of the data, except at reduced pressures above 15, at which the ANN fails to accurately predict the output parameter.

To remove the drawbacks of the previous models, the LSSVM technique was applied to the input data, leading to promising results both in predicting the Z-factor behavior over various temperatures and pressures and in precision of prediction. The outcome of this investigation is seen in Fig. 12. The very good match between the predicted and observed values conveys the message that the developed LSSVM model can be one of the best alternatives for estimation of the compressibility factor with minimum error percentage (see Fig. 12). In order to precisely investigate the performance of the LSSVM technique among the others, nine criteria were evaluated for the testing phase, as tabulated in Table 6. The LSSVM model acquired the highest R² (0.999) and the lowest values of the other parameters corresponding to various types of error. After that, the Heidaryan–Moghadasi–Rahimi [1] and ANN models take the next ranks in predicting the Z-factor, considering the accuracy criteria.

Further attempts were made to demonstrate the prediction capability of the models under study. Four natural gas samples were therefore taken from Elsharkawy's study [2], where a different composition was considered for each mixture. As clearly seen in Table 7, the LSSVM model predicts the target parameter with higher precision in comparison to the other methods.
Fig. 11. ANN outputs and digitized data versus pseudo-reduced pressure.
Fig. 12. Relative error percent and Z–Ppr plots, based on LSSVM model and actual data.
Table 6
Statistical investigation of the models in terms of predictive performance (testing phase). Column headings correspond to the statistical criteria defined in the Appendix.

Model | R² | ARE | AARE | SD | MSE | RMSE | Σ REP | AREP | AAREP
LSSVM | 0.999 | −0.00057 | 0.0028 | 0.360989 | 1.40E-05 | 0.004234 | −71.7074 | −0.05723 | 0.279989
ANN | 0.993 | 0.026969 | 0.029373 | 0.378632 | 0.000977 | 0.046553 | 3379.227 | 2.696909 | 2.937333
Dranchuk and Abou-Kassem (1975) | 0.933 | −0.03386 | 0.039634 | 0.402552 | 0.01082 | 0.109521 | −4239.1 | −3.38586 | 3.963378
Hall and Iglesias-Silva (2007) | 0.034 | −0.5481 | 0.587437 | 0.711505 | 0.110839 | 0.768338 | −63,086.4 | −54.8101 | 58.74369
Heidaryan–Moghadasi–Rahimi (2010) | 0.997 | 0.002478 | 0.009198 | 0.372844 | 0.000382 | 0.025175 | 310.1876 | 0.247754 | 0.919781
Heidaryan–Salarabadi–Moghadasi (2010) | 0.962 | −0.01232 | 0.034391 | 0.339221 | 0.004319 | 0.073807 | −1541.98 | −1.23162 | 3.439106
Azizi–Behbahani–Isazadeh (2010) | 0.753 | −0.05565 | 0.160361 | 0.269292 | 0.015149 | 0.216443 | −6966.88 | −5.5646 | 16.03605
Sanjari and Nemati Lay (2012) | 0.904 | 0.057412 | 0.06248 | 0.43028 | 0.017269 | 0.152934 | 7193.681 | 5.741166 | 6.247979
It should be noted here that the Stewart–Burkhardt–Voo [63] (SBV) mixing rule was used to determine the pseudo-critical properties of the gases. Due to the presence of hydrogen sulfide, carbon dioxide, and nitrogen in natural gas, the pseudo-critical properties were modified by employing the Wichert and Aziz [64] method.

In addition, the cumulative frequency analysis conducted for both the training and testing stages is provided in Fig. 13. This figure also confirms the high accuracy of the LSSVM model in estimating the compressibility factor. Again, this particular evaluation indicates that the model proposed by Heidaryan–Moghadasi–Rahimi [1] is capable of predicting the Z-factor accurately after the LSSVM technique.
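A cumulative frequency curve of the kind shown in Fig. 13 can be generated by sorting the absolute relative error percentages and plotting the fraction of data points falling below each error level; a minimal sketch with placeholder arrays is:

```python
import numpy as np

def cumulative_frequency(z_obs, z_pred):
    """Return (sorted absolute relative error %, cumulative fraction of data points)."""
    aarep = np.sort(np.abs((np.asarray(z_pred) - np.asarray(z_obs)) / np.asarray(z_obs)) * 100.0)
    cum_freq = np.arange(1, aarep.size + 1) / aarep.size
    return aarep, cum_freq

# err, freq = cumulative_frequency(z_obs, z_pred)   # e.g. plot err vs. freq for each model
```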
Table 7
Compressibility factor of four different samples of natural gas, estimated by various techniques.
Rich gas condensate CO2 rich gas Very light gas Highly sour gas condensate
Temperature (F) 313 313 313 219 219 219 209 209 209 250 250 250
Pressure (Psi) 6010 3000 700 4825 1900 700 4786 2600 700 4190 2400 700
Z-experimental 1.212 0.927 0.97 0.851 0.775 0.915 1.019 0.933 0.969 0.838 0.809 0.935
Z-LSSVMa 1.204449 0.921511 0.96051 0.847007 0.768027 0.909499 1.014395 0.916708 0.962592 0.86314 0.817899 0.93485
Z-ANNa 1.187856 0.907709 0.940908 0.833827 0.749734 0.89885 1.002046 0.904544 0.959286 0.908643 0.854668 0.926585
Z-DAKa 1.133726 0.914005 0.950089 0.808829 0.791032 0.907694 1.004169 0.912877 0.956252 0.883237 0.850583 0.931136
Z-HIa 0.948357 1.428421 0.939567 1.05156 0.162486 0.879032 0.997693 −0.06274 0.948535 1.03642 −43.7517 0.912166
Z-HMRa 1.197943 0.912982 0.950368 0.837683 0.758822 0.90037 1.007852 0.904228 0.956984 0.899627 0.837831 0.928819
Z-HSMa 1.222253 0.919526 0.932158 0.837311 0.714872 0.865572 1.030255 0.903419 0.944237 0.914723 0.828008 0.899401
Z-ABIa 1.003748 0.818413 0.776472 0.857297 0.747834 0.727992 0.859988 0.813678 0.784127 0.860023 0.782026 0.750287
Z-SNa 1.17985 0.935816 0.958863 0.843556 0.774701 0.901357 1.01608 0.925818 0.965574 0.897892 0.85345 0.935162
a The abbreviations are: Least Square Support Vector Machine (LSSVM), Artificial Neural Network (ANN), Dranchuk and Abou-Kassem (DAK), Hall and Iglesias-Silva (HI), Heidaryan–Moghadasi–Rahimi (HMR), Heidaryan–Salarabadi–Moghadasi (HSM), Azizi–Behbahani–Isazadeh (ABI), and Sanjari and Nemati Lay (SN).
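For the natural gas samples in Table 7, the correlations are entered with Tpr = T/T'pc and Ppr = P/P'pc. The sketch below uses the standard published forms of the SBV mixing rule [63] and the Wichert–Aziz sour-gas correction [64]; these forms are assumptions here, since the paper does not reproduce them, with temperatures in °R and pressures in psia.

```python
import math

def sbv_pseudo_criticals(y, tc, pc):
    """Stewart-Burkhardt-Voo mixing rule (standard form, assumed): mixture Tpc and Ppc."""
    j = sum(yi * tci / pci for yi, tci, pci in zip(y, tc, pc)) / 3.0 \
        + (2.0 / 3.0) * sum(yi * math.sqrt(tci / pci) for yi, tci, pci in zip(y, tc, pc)) ** 2
    k = sum(yi * tci / math.sqrt(pci) for yi, tci, pci in zip(y, tc, pc))
    tpc = k ** 2 / j
    return tpc, tpc / j                      # (Tpc, Ppc)

def wichert_aziz(tpc, ppc, y_h2s, y_co2):
    """Wichert-Aziz correction (standard form, assumed) for H2S/CO2 content."""
    a, b = y_h2s + y_co2, y_h2s
    eps = 120.0 * (a ** 0.9 - a ** 1.6) + 15.0 * (b ** 0.5 - b ** 4.0)
    tpc_c = tpc - eps
    ppc_c = ppc * tpc_c / (tpc + b * (1.0 - b) * eps)
    return tpc_c, ppc_c

# Ppr = P / ppc_c and Tpr = T / tpc_c are then fed to the correlations or the CSA-LSSVM model.
```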
Fig. 13. Cumulative frequency plots for all models in terms of error percentage.
In summary, the proposed LSSVM optimized with CSA can reduce the error between the predicted values and the actual values and consequently improve prediction performance within the broad ranges of input data.

5. Conclusions

This study employed the least square support vector machine (LSSVM) as a meta-learning technique to estimate the compressibility factor. The coupled simulated annealing (CSA) algorithm was proposed to obtain the optimum LSSVM parameters in an effective manner. The comparison of the LSSVM model with the other models (analytical and empirical correlations), based on the statistical analysis, showed the superiority of the newly developed method: R², ARE, and MSE for the developed LSSVM technique are 0.999, −0.00057, and 0.000014, respectively, statistically indicating a satisfactory predictive tool. The model introduced in this study also provides a considerable improvement over past correlations, with broader applicability in terms of temperature and pressure ranges.

Acknowledgements

The authors express their sincere thanks to Prof. Blasingame and Prof. Hall, from Texas A&M University, for their valuable comments throughout this study. Technical assistance of Dr. Kamyab, from Curtin University, is also appreciated.

Appendix.

Coefficient of determination: R^2 = 1 - \frac{\sum_i (Z_{pred} - Z_{obs})^2}{\sum_i (Z_{pred} - \bar{Z}_{obs})^2}

Average relative error: ARE = \frac{1}{N} \sum_i \frac{Z_{pred} - Z_{obs}}{Z_{obs}}

Average absolute relative error: AARE = \frac{1}{N} \sum_i \left| \frac{Z_{pred} - Z_{obs}}{Z_{obs}} \right|

Standard deviation: SD = \sqrt{ \frac{1}{N-1} \sum_i (Z_{pred} - \bar{Z}_{obs})^2 }

Mean square error: MSE = \frac{1}{N} \sum_i (Z_{pred} - Z_{obs})^2

Root mean square error: RMSE = \sqrt{ \frac{1}{N} \sum_i (Z_{pred} - Z_{obs})^2 }

Relative error percent: REP = \frac{Z_{pred} - Z_{obs}}{Z_{obs}} \times 100

Average relative error percent: AREP = \frac{1}{N} \sum_i \frac{Z_{pred} - Z_{obs}}{Z_{obs}} \times 100

Average absolute relative error percent: AAREP = \frac{1}{N} \sum_i \left| \frac{Z_{pred} - Z_{obs}}{Z_{obs}} \right| \times 100

where Z_{pred}, Z_{obs}, \bar{Z}_{obs}, and N refer to the predicted compressibility factor, the observed compressibility factor, the average of the observed compressibility factor, and the number of data points, respectively.
References

[1] E. Heidaryan, J. Moghadasi, M. Rahimi, J. Petrol. Sci. Eng. 73 (2010) 67–72.
[2] A.M. Elsharkawy, Fluid Phase Equilib. 218 (2004) 1–13.
[3] A. Bahadori, S. Mokhatab, B.F. Towler, J. Nat. Gas Chem. 16 (2007) 349–353.
[4] Y.A. Cengel, M.A. Boles, Thermodynamics: An Engineering Approach, sixth ed., McGraw Hill, 2007, p. 141, ISBN 978-0007-125771-8.
[5] A. Danesh, PVT and Phase Behaviour of Petroleum Fluids, Elsevier Science B.V., Netherlands, 1998, pp. 10 and 80, Chap. 1 and 2.
[6] M.B. Standing, D.L. Katz, Trans. AIME 146 (1942) 140–149.
[7] D.G. Rayes, L.D. Piper Jr., W.D. McCain, S.W. Poston, J. Soc. Petrol. Eng. Form. Eval. 7 (1992) 87–92.
[8] W.B. Kay, Ind. Eng. Chem. 28 (1936) 1014–1019.
[9] W.F. Stewart, S.F. Burkhardt, D. Voo, Prediction of pseudocritical parameters for mixtures, in: Proceedings of the Presentation at the AIChE Meeting, Kansas City, MO, 1959.
[10] R.P. Sutton, Paper SPE 14265, in: Proceedings of the Annual Technology Conference and Exhibition, Las Vegas, 1985.
[11] J.H. Corredor, L.D. Piper, W.D. McCain Jr., Compressibility factors for naturally occurring petroleum gases, in: Paper SPE 24864 presented at the SPE Annual Technical Meeting and Exhibition, Washington, D.C., October 4–7, 1992.
[12] L.D. Piper, S.A. McCain Jr., J.H. Corredor, Compressibility factors for naturally occurring petroleum gases, in: SPE 26668, Houston, TX, October 3–6, 1993.
[13] M.B. Standing, Volumetric and Phase Behaviour of Oil Field Hydrocarbon Systems, Society of Petroleum Engineers of AIME, Dallas, TX, 1981.
[14] A.M. Elsharkawy, Y.S.K. Hashim, A.A. Alikhan, Energ. Fuel J. 15 (2001) 807–816.
[15] A.M. Elsharkawy, A. Elkamel, Pet. Sci. Technol. 19 (2001) 711–731.
[16] F.E. Londono Galindo, R.A. Archer, T.A. Blasingame, SPE Reserv. Eval. Eng. 8 (2005) 561–572.
[17] R.P. Sutton, SPE Reserv. Eval. Eng. 10 (2007) 270–284.
[18] K.R. Hall, G.A. Iglesias-Silva, Hydrocarb. Process. 86 (2007) 107–110.
[19] P.M. Dranchuk, J.H. Abou-Kassem, J. Can. Pet. Technol. 14 (1975) 34–36.
[20] H. Nishiumi, S. Saito, J. Chem. Eng. Jpn. 8 (1975) 356–360.
[21] K.R. Hall, L. Yarborough, Oil Gas J. 71 (1973) 82–92.
[22] M. Benedict, G.B. Webb, L. Rubin, J. Chem. Phys. 8 (1940) 334–345.
[23] E. Sanjari, E. Nemati Lay, J. Nat. Gas Chem. 21 (2012) 184–188.
[24] N. Azizi, R. Behbahani, M.A. Isazadeh, J. Nat. Gas Chem. 19 (2010) 642–645.
[25] E. Heidaryan, A. Salarabadi, J. Moghadasi, J. Nat. Gas Chem. 19 (2010) 189–192.
[26] V.N. Gopal, Gas z-factor equations developed for computer, Oil Gas J. 75 (1977) 58–60.
[27] J.P. Brill, H.D. Beggs, Two-Phase Flow in Pipes, INTERCOMP Course, The Hague, 1974.
[28] J. Papay, A Termelestechologiai Parametrrek Valtozasa a Gazlelepk Muvelese Soran, Ogil Musz, Tud, Kuzl., Budapest, 1968, pp. 267–273.
[29] F.E. Londono Galindo, New Correlations for Hydrocarbon Gas Viscosity and Gas Density, Texas A&M University, 2001 (M.Sc. thesis).
[30] A. Normandin, P.A. Grandjean, J. Thibauld, Ind. Eng. Chem. Res. 32 (1993) 970–975.
[31] M. Kamyab Jr., J.H.B. Sampaio, F. Qanbari, A.W. Eustes III, J. Petrol. Sci. Eng. 73 (2010) 248–257.
[32] M. Baniasadi, A. Mohebbi, M. Baniasadi, J. Eng. Thermophys. 21 (2012) 248–258.
[33] A. Kamari, A. Hemmati-Sarapardeh, S.M. Mirabbasi, M. Nikookar, A.H. Mohammadi, Fuel Process. Technol. 116 (2013) 209–216.
[34] A. Chamkalani, A. Mae'soumi, A. Sameni, J. Nat. Gas Sci. Eng. 14 (2013) 132–143.
[35] A. Chamkalani, M. ArablooNareh'ei, R. Chamkalani, M.H. Zargari, M.R. Dehestani-Ardakani, M. Farzam, Chem. Eng. Commun. 200 (2013) 731–747.
[36] T. Malinova, Z.X. Guo, Mater. Sci. Eng. A 365 (2004) 219–227.
[37] P.M. Wong, M. Jang, S. Cho, T.D. Gedeon, Comput. Geosci. 26 (2000) 907–913.
[38] S. Zendehboudi, M.A. Ahmadi, L. James, I. Chatzis, Energy Fuels 26 (6) (2012).
[39] M.A. Ahmadi, S. Zendehboudi, A. Lohi, A. Elkamel, I. Chatzis, J. Geophys. Prospect. (2012).
[40] A.R. Rajabzadeh, S. Zendehboudi, A. Lohi, A. Elkamel, Can. J. Chem. Eng. (2013).
[41] S. Zendehboudi, M.A. Ahmadi, O. Mohammadzadeh, A. Bahadori, I. Chatzis, Ind. Eng. Chem. Res. (2013).
[42] S. Zendehboudi, M.A. Ahmadi, A. Bahadori, A. Shafiei, T. Babadagli, Can. J. Chem. Eng. (2013).
[43] S. Zendehboudi, G. Zahedi, A. Bahadori, A. Lohi, A. Elkamel, I. Chatzis, Can. J. Chem. Eng. (2013).
[44] M.T. Hagan, H.B. Demuth, M. Beal, Neural Network Design, PWS Publishing Company, Boston, 1996.
[45] H.R. Vallés, Master Thesis, University of Puerto Rico, 2006.
[46] K. Hornik, M. Stinchcombe, H. White, Neural Networks 2 (1989) 359–366.
[47] K. Hornik, M. Stinchcombe, H. White, Neural Networks 3 (5) (1990) 551–600.
[48] A. Chamkalani, M. Pordel Shahri, S. Poordad, Support vector machine model: a new methodology for stuck pipe prediction, in: SPE 164003, SPE Middle East Unconventional Gas Conference and Exhibition, Muscat, Oman, 28–30 January, 2013.
[49] A. Chamkalani, A.H. Mohammadi, A. Eslamimanesh, F. Gharagheizi, D. Richon, Chem. Eng. Sci. 81 (2012) 202–208.
[50] A. Chamkalani, Pet. Sci. Technol. (2012), http://dx.doi.org/10.1080/10916466.2011.651237.
[51] J.A.K. Suykens, J. Vandewalle, Neural Process. Lett. 9 (1999) 293–300.
[52] J. Mercer, Philos. Trans. R. Soc. Lond. A 209 (1909) 415–446.
[53] S. Abe, Support Vector Machines for Pattern Classification, Springer-Verlag, New York, 2005.
[54] C.A. Burges, Data Min. Knowl. Discov. 2 (1998) 121–167.
[55] A. Shokrollahi, M. Arabloo, F. Gharagheizi, A.H. Mohammadi, Fuel 112 (2013) 375–384.
[56] S. Xavier-de-Souza, J.A.K. Suykens, J. Vandewalle, D. Bollé, IEEE Trans. Syst. Man Cybern. B 40 (2010) 320–335.
[57] A. Chamkalani, M. Amani, M.A. Kiani, R. Chamkalani, Fluid Phase Equilib. 339 (2013) 72–80.
[58] S. Kirkpatrick Jr., C. Gelatt, M. Vecchi, Science 220 (1983) 671–680.
[59] E. Aarts, J. Korst, Simulated Annealing and Boltzmann Machines, John Wiley & Sons, 1989.
[60] X.L. Zhang, X.F. Chen, Z. He, Expert Syst. Appl. 37 (2010) 6618–6628.
[61] K. Duan, S.S. Keerthi, A.N. Poo, Neurocomputing 51 (2003) 41–59.
[62] F. Imbault, K. Lebart, A stochastic optimization approach for parameter tuning of support vector machines, in: Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), 2004, pp. 1051–4651.
[63] W.F. Stewart, S.F. Burkhard, D. Voo, Prediction of pseudo critical parameters for mixtures, in: Paper presented at the AIChE Meeting, Kansas City, MO, 1959.
[64] E. Wichert, K. Aziz, Hydrocarb. Process. 51 (1972) 119–122.