Stochastic generation of hourly mean wind speed data

Article in Renewable Energy · November 2004


DOI: 10.1016/j.renene.2004.03.011

Renewable Energy 29 (2004) 2111–2131
www.elsevier.com/locate/renene

Stochastic generation of hourly mean wind speed data

Hafzullah Aksoy*, Z. Fuat Toprak, Ali Aytek, N. Erdem Ünal
Department of Civil Engineering, Civil Engineering Faculty, Istanbul Technical University,
Hydraulics Division, Maslak, 34469 Istanbul, Turkey
Received 19 September 2003; accepted 23 March 2004

Abstract

Wind speed data are of great importance in civil engineering, especially in structural
and coastal engineering applications. Synthetic data generation techniques are used in
practice where long wind speed records are required. In this study, a new wind speed data
generation scheme based upon wavelet transformation is introduced and compared to
existing wind speed generation methods, namely normal and Weibull distributed independent
random numbers, the first- and second-order autoregressive models, and the first-order
Markov chain. The results suggest the wavelet-based approach as an alternative to the
existing wind speed data generation methods.
© 2004 Elsevier Ltd. All rights reserved.

Keywords: Normal distribution; Weibull distribution; Autoregressive models; Markov chain; Wavelet;
Hourly mean wind speed

1. Introduction and existing literature

Climatology is defined as a set of probabilistic statements on long-term weather
conditions [1], and wind climatology as that branch of climatology that specialises
in the study of winds, from which information on extreme winds is provided to
structural designers. Such information is also needed for wind energy producers
and engineers who design coastal civil structures, for example breakwaters. From a
structural engineering point of view, forecasting the maximum wind speed that is
expected to affect a structure during its lifetime is important to the designer. On the
other hand, in coastal engineering practices, not only the magnitude but also the
directionality of wind becomes important. The duration of wind, in addition to its
magnitude and direction, is also required in wind energy production systems, and
the amount of energy that can be produced depends upon it.

* Corresponding author. Tel.: +90-212-2856577; fax: +90-212-2856587.
E-mail address: [email protected] (H. Aksoy).

The information required by either structural and coastal engineers or wind
energy producers is related to wind speed data, and is a matter of quality and quan-
tity. The quality of the wind speed data refers to whether the data set is reliable and
micrometeorologically homogeneous. A data set is reliable if (i) the measurement
instrument performs adequately, (ii) the instrument is not influenced by obstructions
and (iii) the atmospheric stratification is neutral. A set of wind speed data is con-
sidered micrometeorologically homogeneous if the data set is obtained under ident-
ical micrometeorological conditions [1]. The size of the data set (quantity) is related
to the time period during which the wind speed data are recorded. The time period
over which wind speed data are recorded is usually shorter than the lifetime of civil
engineering structures. Therefore, the worst case of wind load that the structural
designer expects that the structure will face during its lifetime is determined by mod-
elling the wind speed data record in hand. For this, climatological and physical
modelling techniques are available. Additionally, probabilistic and stochastic models
have been developed, for which the existing literature is reviewed in brief below. The
main aim in those techniques is to determine minimum design loads due to wind [2].
Short records of daily, weekly, and monthly highest wind speeds taken at 36
weather stations in the US were empirically analyzed [3] in order to determine
design wind speeds. Short records of hourly mean wind speed data from normal
regions in the US were used by Cheng and Chiu [4] for determination of the tran-
sition probabilities of the Markov chain upon which the methodology in that study
was based. This methodology was extended later to tropical cyclone-prone regions
[5]. Also, a knowledge-based expert system, principally similar to the mentioned
methodologies, was made available [6,7]. Alternative approaches used in the gener-
ation of simulated wind speed time series were compared by Kaminsky et al. [8].
Sfetsos [9] examined adaptive neuro-fuzzy inference systems and neural logic net-
works and compared them to the traditional autoregressive moving average
(ARMA) models. Dukes and Palutikof [10] employed the Markov chain in order to
estimate hourly mean wind speed with very long return periods. Another Markov
chain based study was conducted by Sahin and Sen [11]. Castino et al. [12] coupled
autoregressive processes to the Markov chain and simulated both wind speed and
direction. A recent study [13] presents a wavelet-based method to generate artificial
wind data. The Weibull distribution has commonly been fitted to hourly mean
wind speed data [14,15]. The peaks-over-threshold approach has also been com-
monly used in the estimation of extreme quantiles of wind speed data [16–19].

2. Methods

In this study, a number of probabilistic and stochastic methods are used in order
to compare their ability to reproduce long series of hourly mean wind speed data
with the same statistical behaviour as that of the observations in hand. The normal
and Weibull probability distribution functions are chosen in order to generate
independent and identically distributed random numbers. Autoregressive processes
are useful tools in generating data sets in cases where persistency exists. Persistency
means that large values tend to be followed by large values, and small values by
small values, so that runs of values of similar magnitude tend to persist throughout
the sequence. First- and second-order autoregressive processes are chosen in this
study. Another concept commonly employed in wind speed data generation studies
is the first-order Markov chain. The results of these methods are compared to
those obtained from a newly developed wavelet-based approach.
The methods are described below. Only the wavelet-based approach will be
detailed, whereas the remaining five methods will be outlined briefly as they have
been well documented in literature.

2.1. Normal distribution

Hourly mean wind speed time series are generated by using a sequence of inde-
pendent random numbers from the normal distribution. The normal probability
distribution function is given by
$$ f(w) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(w-\mu)^2}{2\sigma^2}\right] \tag{1} $$
where w is the variable (hourly mean wind speed, in this study), $\mu$ the mean value of
wind speed, and $\sigma$ the standard deviation of wind speed. A number of computational
methods are available for the generation of random numbers with normal probability
distribution of mean $\mu$ and standard deviation $\sigma$.

2.2. Weibull distribution

The Weibull distribution is another probability distribution function commonly
used for the frequency analysis of wind speed data [14,15]. It is given by
$$ f(w) = \frac{a}{b^a}\, w^{a-1} \exp\left(-\frac{1}{b^a}\, w^a\right), \qquad w \ge 0;\ a, b > 0 \tag{2} $$
where a and b are shape and scale parameters, respectively, that can be determined
by using either a graphical method or the method of moments. They can also be
determined using the method of probability weighted moments (PWMs) for which
explicit equations are available. It is the method used in this study for the determi-
nation of parameters.
Equations to be used for this purpose are given by
$$ a = \frac{\ln(2)}{L_{2,(\ln w)}} \tag{3a} $$
$$ b = \exp\left(L_{1,(\ln w)} + \frac{0.5772}{a}\right) \tag{3b} $$
In Eqs. (3a,b), $L_{1,(\ln w)}$ and $L_{2,(\ln w)}$ are the L1 and L2 moments of the logarithm of
the hourly mean wind speed time series. The L1 and L2 moments of a series are
given by
$$ L_1 = b_0 \tag{4a} $$
$$ L_2 = 2b_1 - b_0 \tag{4b} $$
in which $b_0$ and $b_1$ are given by
$$ b_0 = \bar{x} \tag{5a} $$
$$ b_1 = \sum_{j=1}^{N-1} \frac{(N-j)}{N(N-1)}\, x_j \tag{5b} $$
$x_j$ in Eq. (5b) comes from the time series sorted in descending order as
$x_N \le x_i \le x_1$. Detailed information on L moments and the method of
PWM is given in [20].
Once the parameters are determined, the generation of Weibull distributed ran-
dom numbers is a matter of a simple computer code, as the cumulative distribution
function of the Weibull distribution can be obtained in closed form.
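
The two steps described above (parameter estimation by Eqs. (3a)-(5b), then generation by inverting the cumulative distribution function) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, NumPy is assumed, and the closed-form CDF $F(w) = 1 - \exp[-(w/b)^a]$ is the one implied by Eq. (2). The input series is assumed to contain no zero wind speeds, since the logarithm is taken.

```python
import numpy as np

def fit_weibull_lmoments(w):
    """Estimate Weibull shape a and scale b from Eqs. (3a)-(5b)."""
    x = np.sort(np.log(w))[::-1]                 # ln(w) sorted in descending order
    n = x.size
    j = np.arange(1, n + 1)                      # ranks 1..N (the j = N term has zero weight)
    b0 = x.mean()                                # Eq. (5a)
    b1 = np.sum((n - j) * x) / (n * (n - 1))     # Eq. (5b)
    l1, l2 = b0, 2.0 * b1 - b0                   # Eqs. (4a) and (4b)
    a = np.log(2.0) / l2                         # Eq. (3a)
    b = np.exp(l1 + 0.5772 / a)                  # Eq. (3b)
    return a, b

def sample_weibull(a, b, size, rng):
    """Invert F(w) = 1 - exp(-(w/b)**a) at uniform random numbers."""
    u = rng.random(size)
    return b * (-np.log1p(-u)) ** (1.0 / a)

rng = np.random.default_rng(0)
w = sample_weibull(1.583, 1.973, 50_000, rng)    # a and b as listed in Table 2
a_hat, b_hat = fit_weibull_lmoments(w)           # should recover values close to the inputs
```

Fitting the generated sample again is only a consistency check on the two functions; it says nothing about how well the Weibull distribution fits observed data.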

2.3. AR(1) model


The hourly mean wind speed time series is highly dependent (serially correlated),
so a wind speed data generation model should incorporate the dependence structure
of the observations. As mentioned, normal and Weibull distributed random numbers
do not take this property into account because they are independent, whereas
autoregressive models are correlated and hence capable of simulating this
property of the data series.
The use of autoregressive models is reported very commonly in the literature.
The observed sequence of wind speed data {$w_1$, $w_2$, ..., $w_t$, ...} is used to fit a
model of the form
$$ w_i = \sum_{j=1}^{m} a_j w_{i-j} + e_i \tag{6} $$
where w is the hourly mean wind speed, a the autoregressive coefficient, that is,
model parameter, and e a normally distributed independent random variable.
Eq. (6) is written for the mth order. The first-order autoregressive [AR(1)] model,
which is also called the Markov model, is the simplest case, obtained for m = 1; it
accommodates only the effect of the previous value in the series. Eq. (6) then becomes
$$ y_i = r_1 y_{i-1} + e_i \tag{7} $$
where y is the standardised (zero mean and unit variance) version of the variable
and $r_1$ the lag-one serial correlation coefficient of the sequence.
The random component (e) in AR(1) is normally distributed with zero mean
and a variance of $1 - r_1^2$. The simulation procedure is very simple: it
requires only normally distributed random numbers to be generated.
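
As an illustration of how little the AR(1) recursion in Eq. (7) needs, a minimal sketch is given below. The function name is hypothetical, NumPy is assumed, and starting the series from a standard normal value is a choice made here, not something the paper specifies. The output is in standardised units and would still have to be rescaled (and back-transformed, if a transformation was applied) to obtain wind speeds.

```python
import numpy as np

def simulate_ar1(r1, n, rng):
    """Generate n standardised values from the AR(1) recursion of Eq. (7)."""
    e = rng.normal(0.0, np.sqrt(1.0 - r1**2), size=n)   # noise with variance 1 - r1^2
    y = np.empty(n)
    y[0] = rng.normal()            # assumption: start from the stationary N(0, 1) distribution
    for i in range(1, n):
        y[i] = r1 * y[i - 1] + e[i]
    return y

rng = np.random.default_rng(1)
y = simulate_ar1(0.820, 100_000, rng)                   # r1 as listed in Table 2
lag1 = np.corrcoef(y[:-1], y[1:])[0, 1]                 # sample lag-one correlation
```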

2.4. AR(2) model

As the order of the autoregressive model increases, the dependence structure in
the observations is better preserved. Therefore, the second-order autoregressive
[AR(2)] model is preferred to AR(1). This becomes more important in cases where
dependence in the data set is very obvious, as in the hourly mean wind speed data.
AR(2) is formulated as
$$ y_i = \phi_1 y_{i-1} + \phi_2 y_{i-2} + e_i \tag{8} $$
where the autoregressive coefficients $\phi_1$ and $\phi_2$ are given by
$$ \phi_1 = r_1 (1 - r_2)/(1 - r_1^2) \tag{9a} $$
$$ \phi_2 = (r_2 - r_1^2)/(1 - r_1^2) \tag{9b} $$
in which $r_1$ is the lag-one autocorrelation coefficient and $r_2$ the lag-two
autocorrelation coefficient of the wind speed time series. The random component in AR(2) is
again normally distributed, with zero mean and variance equal to $1 - R^2$, where
$$ R^2 = \frac{r_1^2 + r_2^2 - 2 r_1^2 r_2}{1 - r_1^2} \tag{10} $$
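
A minimal sketch of the AR(2) generator follows, assuming NumPy; the function names and the burn-in length are illustrative choices, not taken from the paper. The noise variance is computed from the Yule-Walker identity $1 - \phi_1 r_1 - \phi_2 r_2$, which is what $1 - R^2$ evaluates to when the coefficients of Eqs. (9a) and (9b) are substituted.

```python
import numpy as np

def ar2_parameters(r1, r2):
    """Coefficients of Eqs. (9a), (9b) and the variance of the random component."""
    phi1 = r1 * (1.0 - r2) / (1.0 - r1**2)
    phi2 = (r2 - r1**2) / (1.0 - r1**2)
    var_e = 1.0 - (phi1 * r1 + phi2 * r2)      # equals 1 - R^2
    return phi1, phi2, var_e

def simulate_ar2(r1, r2, n, rng, burn_in=500):
    """Generate n standardised values from Eq. (8); burn_in values are discarded."""
    phi1, phi2, var_e = ar2_parameters(r1, r2)
    total = n + burn_in
    e = rng.normal(0.0, np.sqrt(var_e), size=total)
    y = np.zeros(total)                        # assumption: zero start, washed out by burn-in
    for i in range(2, total):
        y[i] = phi1 * y[i - 1] + phi2 * y[i - 2] + e[i]
    return y[burn_in:]

y = simulate_ar2(0.820, 0.688, 100_000, np.random.default_rng(5))   # r1, r2 from Table 2
```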

2.5. Markov chain


In this approach, the observed time series is divided into a number of states. A
wind speed state contains wind speeds between certain values. For example, State 1
might include wind speeds below 2 m/s, State 2 wind speeds between 2 and 4 m/s,
etc. until the final wind speed state includes all speeds above the highest observed
value or a predefined upper limit. The upper and lower limits of the states are
highly subjective values. For instance, the hourly mean wind speed data set in this
study was divided into 10 states. In another wind speed study [11], states were
defined depending upon the standard deviation of the data set. Each state in that
study [11] was taken as wide as one standard deviation of the observed hourly
mean wind speed time series. Dukes and Palutikof [10], on the other hand, used a
fixed width for the states, which was equal to 2 m/s.
In the Markov chain approach, the state of wind speed in the current hour can
be defined depending only upon the previous state. This is called the first-order or
one-step Markov chain. Two previous states are used in the second-order or
two-step Markov chain in determining the current state of the wind speed. Although
they are not as common as the first- and second-order Markov chains, higher-order
Markov chains can also be used. However, the dramatic increase in the number of
their parameters limits their use.
The parameter set of a Markov chain consists of probabilities of transition from
one state to another that are given in transition probability matrices. The transition
probability matrix of a first-order Markov chain with m states can be written
symbolically as
$$ P = \begin{bmatrix} P_{11} & P_{12} & \dots & P_{1m} \\ P_{21} & P_{22} & \dots & P_{2m} \\ \dots & \dots & P_{ij} & \dots \\ P_{m1} & P_{m2} & \dots & P_{mm} \end{bmatrix} \tag{11} $$
where $P_{ij}$ is the probability of transition from state i to state j. The number of
parameters is $m(m-1)$, as the sum of the probabilities is equal to 1 (100%) for
each row of the matrix. If $n_{ij}$ is the total number of hours of observation in state j
with the previous state i, the probabilities of transition from state i to state j can be
calculated as
$$ P_{ij} = \frac{n_{ij}}{\sum_j n_{ij}}, \qquad i, j = 1, 2, \dots, m \tag{12} $$

The procedure for generating the simulated hourly mean wind speed time series
is explained below.
First, the cumulative transition probability matrix is calculated. In the cumulative
transition probability matrix, cumulative summation of probabilities within each row
is carried out; hence, each row in that matrix ends with 1. Then, an initial state is
adopted. No wind (State 1) can, for example, be assumed as the initial state. Using a
uniform random number, the next state of wind speed can be determined. If State 1 is
obtained as the new state of wind speed, then it is first checked if the wind speed is
zero. If the wind speed is not zero, then a uniform random number is generated from
the interval of State 1. If the highest state is found to be the new state of wind speed,
then a shifted one-parameter gamma distributed random number is used in order to
find the magnitude of the wind speed. The reason for choosing the gamma distribution
will be discussed in the section where results obtained from application of the methods
are presented. For intermediate states, a uniform random number from the interval of
the corresponding state is generated and set as the wind speed at the current hour.
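
The estimation of the transition probabilities and the state-by-state generation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: NumPy is assumed, the function names are hypothetical, states are numbered from 0 instead of 1, and only the uniform within-state draw is shown. The separate treatment of zero wind speed in State 1 and the shifted gamma draw for the highest state are deliberately left out.

```python
import numpy as np

def transition_matrix(states, m):
    """Row-normalised transition counts, i.e. n_ij divided by the row sum."""
    counts = np.zeros((m, m))
    np.add.at(counts, (states[:-1], states[1:]), 1)     # count transitions i -> j
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

def simulate_states(P, n, rng, start=0):
    """Walk the chain using the cumulative transition probability matrix."""
    cum = np.cumsum(P, axis=1)
    cum[:, -1] = 1.0                 # each row must end with 1 (guards against round-off)
    s = np.empty(n, dtype=int)
    s[0] = start                     # assumed initial state
    for t in range(1, n):
        # first state whose cumulative probability exceeds the uniform number
        s[t] = np.searchsorted(cum[s[t - 1]], rng.random(), side="right")
    return s

def states_to_speeds(s, width, rng):
    """Uniform wind speed inside each state's interval [s*width, (s+1)*width)."""
    return (s + rng.random(s.size)) * width
```

A round trip (simulate states from a known matrix, then re-estimate the matrix from the simulated states) is a convenient check that the two functions agree with each other.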
2.6. Wavelet-based approach

A real or complex-valued continuous function with zero mean and finite variance
is called a wavelet [21]. There are many functions that can qualify as wavelets.
Some examples of wavelets are Morlet, Mexican hat, Shannon and Meyer. A simple
wavelet is the Haar wavelet (Fig. 1), defined as
$$ \psi(t) = \begin{cases} 1 & 0 \le t < 1/2 \\ -1 & 1/2 \le t < 1 \\ 0 & \text{otherwise} \end{cases} \tag{13} $$
Decomposing a signal and then reconstructing it is the basis of the wavelet
transform. In this study, the Haar wavelet was used due to its simplicity. Therefore,
decomposition of a signal (multiresolution analysis) with the Haar wavelet is
considered and explained in detail below.

Fig. 1. Haar wavelet.

For a certain value of k, let us define $f_k(t)$ as the average of f(s) over an interval
of size $2^k$:
$$ f_k(t) = \frac{1}{2^k} \int_{2^k l}^{2^k (l+1)} f(s)\,ds, \qquad 2^k l < t < 2^k (l+1) \tag{14} $$
where k and l are integers, k a scale variable (k > 0 means stretching and k < 0
means contracting of the wavelet) and l a translation variable [21]. For
$k = -\infty, \dots, -1, 0, 1, \dots, \infty$, $f_k(t)$ is as follows:
$$ \begin{aligned}
f_{-\infty}(t) &= f(t) \\
&\ \ \vdots \\
f_{-1}(t) &= 2 \int_{l/2}^{(l+1)/2} f(s)\,ds, && \frac{l}{2} < t < \frac{l+1}{2} \\
f_0(t) &= \int_{l}^{l+1} f(s)\,ds, && l < t < l+1 \\
f_1(t) &= \frac{1}{2} \int_{2l}^{2(l+1)} f(s)\,ds, && 2l < t < 2(l+1) \\
&\ \ \vdots \\
f_{\infty}(t) &= 0
\end{aligned} \tag{15} $$
The resolution decreases as k increases. The difference between the successive
averages $f_{k-1}(t)$ and $f_k(t)$ is defined as a detail function:
$$ g_k(t) = f_{k-1}(t) - f_k(t) \tag{16} $$
It can be easily seen that
$$ f(t) = \sum_{k=-\infty}^{\infty} g_k(t) \tag{17} $$

According to Eq. (17), the original signal is obtained when all detail functions
are summed up. Change in data resolution with change in k, the resolution level,
can be seen in the upper part of Fig. 2, in which the average of the time series
taken at different resolution levels according to Eq. (15) is shown. Note that the
data sample used in Fig. 2 has 16 elements. Increase in the ordinates of fk(t) with
decrease in k shows the change (increase) in the resolution. The middle part of
Fig. 2 shows the detail functions calculated using Eq. (16) for different resolution
levels. Note from Eq. (15) that $f_4(t) = 0$ for all t. At the bottom of Fig. 2, f(t), the
sum of the four detail functions according to Eq. (17), is seen, and it represents the
original data, f0(t). Eq. (17) is the basis for the generation algorithm explained
below.
Let us consider a data sample of size $M = 2^K$, where K is a positive integer
(K = 4 for the sequence in Fig. 2) taken from a stochastic process f(t) with zero
mean: f(1), f(2), ..., f(M). Define the sample $f_k(i)$ (k = 0, 1, ..., K; i = 1, ..., M)
consisting of averages of $2^k$ successive elements of the sample. $f_0(i)$ is the original
sample and $f_K(i)$ is a sample of all zeros, since the average of M elements is zero. The
detail function $g_k(t)$ has a sample consisting of M elements given by Eq. (16) for
k = 1, 2, ..., K.
Thus, for each element $f_i$ of the original sample, we have K detail function
values, $g_k(i)$, corresponding to different resolutions. Choosing from M elements for
each $g_k(t)$ randomly, and then summing them up using Eq. (17), one obtains a
simulated value for f(t) as
$$ f(j) = \sum_{k=1}^{K} g_k(j) \tag{18} $$
where j is the index for generated elements.


The generation algorithm is given step by step as follows [22] and is illustrated in
Fig. 3 for K = 4.
1. In order to obtain the first element of the series (j = 1), $g_k$ values (k = 1, 2, ...,
K) are chosen from M values randomly and summed up to obtain $f_1$ (Fig. 3).
2. The second element (j = 2) is generated by choosing, for each k, the $g_k$ coming
just after the $g_k$ values chosen in the first step. $f_2$ is obtained by the summation
of these (Fig. 3).
3. Data generation is continued in this way for a desired number of times using,
for the generation of each element $f_j$, the detail function values right next to
those of the previous step j − 1 at each resolution level.
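
The decomposition into detail functions and the three generation steps can be sketched as below. This is a minimal reading of the text, not the authors' code: NumPy is assumed, the function names are hypothetical, the input sample is assumed to have zero mean and a length that is a power of two, and wrapping around to the start of the sample when a detail series runs out is an assumption made here because the text does not say what happens at the end of the sample.

```python
import numpy as np

def haar_details(x):
    """Return g with g[k-1, i] = f_{k-1}(i) - f_k(i) for k = 1..K."""
    M = x.size
    K = int(np.log2(M))
    if 2**K != M:
        raise ValueError("sample size must be a power of two")
    f_prev = x.astype(float)                       # f_0 is the sample itself
    g = np.empty((K, M))
    for k in range(1, K + 1):
        block = 2**k
        # f_k: average of each block of 2^k successive elements, held over the block
        f_k = np.repeat(x.reshape(-1, block).mean(axis=1), block)
        g[k - 1] = f_prev - f_k
        f_prev = f_k
    return g

def wavelet_generate(g, n, rng, starts=None):
    """Sum one detail value per level, advancing one position per generated element."""
    K, M = g.shape
    if starts is None:
        starts = rng.integers(0, M, size=K)        # step 1: random position at each level
    starts = np.asarray(starts)
    idx = (starts[:, None] + np.arange(n)[None, :]) % M   # steps 2-3, cyclic (assumption)
    return g[np.arange(K)[:, None], idx].sum(axis=0)
```

When every level starts from the same position, the detail values recombine into the original sample (shifted), which gives a simple check of the decomposition; different starting positions at different levels are what produce a new sequence.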

Fig. 2. Decomposition and reconstruction of a data sequence.

Fig. 3. Construction of a simulated data sequence.

This generation algorithm is a newly developed approach for data simulation
purposes. It was first used in non-skewed annual and monthly streamflow data
simulation studies [22,23]. The approach was later used for the simulation of the
storage capacity of river reservoirs [24]. Suspended sediment discharge series [25]
and annual and monthly rainfall data series [26] were also modelled successfully
with this approach. The algorithm reproduced the mean, standard deviation
and correlation structure of the observed streamflow data sets. When one is
interested in the generation of skewed data, the data must first be transformed to
a non-skewed structure, generated, and then transformed back to their
skewed structure.

3. Application

The methods were applied to an hourly mean wind speed data set that will be
introduced in the following subsections. Results obtained from the application of
the methods are presented and discussed below. The performance of the methods
was measured according to their ability to capture the statistical behaviour of the
observed data set. A comparison of the methods is finally presented.
3.1. Data

Table 1 shows the main statistical characteristics of the data set of hourly mean
wind speed taken from the State Meteorological Works’ meteorology station in
Diyarbakir, a southeastern Anatolian city. The data set is four years long,
from 1994 to 1997 (35064 hours in total). The region is normal, as is seen in
Table 1. The data set is highly correlated, as expected, and skewed. For the
wavelet-based approach, 32768 hours of data, extending from the first hour of April 6,
1994, to the eighth hour of December 31, 1997, were used. This period was chosen
arbitrarily. Characteristics corresponding to that part of the observed series are
also given in Table 1.
3.2. Parameters
The hourly mean wind speed data set used in the study is skewed.
This prevents fitting of the normal distribution to the data. Therefore, a power
transformation [$y = x^h$, where x is the raw (untransformed) variable, y the transformed
variable, and h the transformation coefficient] was adopted in order to obtain
non-skewed data, to which the normal distribution can be fitted. The transformation
coefficient was obtained as h = 0.38585 for the data set in the study. As the normal
distribution is fitted to the transformed hourly mean wind speed time series (but
not to the raw data series), the parameters of the normal distribution are the mean
and standard deviation of the transformed hourly mean wind speed time series.
Those parameters are presented in Table 2. The normal probability distribution
function based upon the determined parameters was fitted to the transformed wind
speed data series (Fig. 4). It is seen that the distribution fits the observations
very well, as well as the generated data, which are discussed in the following
sections.
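
One way to turn the fitted normal distribution into wind speeds is to generate in the transformed domain and invert the power transformation. The sketch below assumes NumPy; the function name is hypothetical, and setting the rare negative draws to zero before back-transforming is an assumption made here, since the paper does not state how they are handled.

```python
import numpy as np

def generate_normal_speeds(mu, sigma, h, n, rng):
    """Draw y ~ N(mu, sigma) in the transformed domain and invert y = x**h."""
    y = rng.normal(mu, sigma, size=n)
    y = np.clip(y, 0.0, None)        # assumption: negative draws are set to zero
    return y ** (1.0 / h)            # back-transformation x = y**(1/h)

rng = np.random.default_rng(3)
# mu and sigma as listed in Table 2, h as reported in the text
w = generate_normal_speeds(1.347, 0.392, 0.38585, 200_000, rng)
```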
The Weibull distribution has two parameters (a, the shape parameter, and b, the
scale parameter). The parameters were determined using the method of L-moments
on which detailed information was given previously. The reason for choosing this
method is that explicit equations are available for determination of the parameters
of the distribution. The method also has the advantage of being less sensitive to
outliers, which means that outliers have little effect on the performance of the
method in determining the parameters correctly. The only problem with this method is the
presence of zero wind speeds, which makes the method inapplicable due to the
logarithm included. In order to overcome this problem, zero wind speeds were removed
from the observed time series as their number of occurrences was very small, less
than 0.5%. The parameters of the Weibull distribution determined by the method
of L-moments are listed in Table 2. Fig. 5 shows the agreement between the
observed data and the fitted Weibull probability distribution function. It can be
considered a very good fit, although the Weibull probability distribution function
gives the mode an occurrence probability slightly lower than that in the observation.

Table 1
Statistical characteristics of observed hourly mean wind speed time series

Date                             Number   Mean   Standard   Coefficient  Coefficient  Maximum     Correlation coefficient
                                 of data  (m/s)  deviation  of           of           wind speed  r1     r2     r3     r4     r5
                                                 (m/s)      variation    skewness     (m/s)
1 January 1994–31 December 1997  35064    2.538  1.786      0.703        1.285        14.4        0.860  0.732  0.633  0.549  0.476
6 April 1994–31 December 1997    32768    2.555  1.794      0.702        1.283        14.4        0.861  0.733  0.635  0.551  0.438

Table 2
Parameter sets of methods

Method          Parameter set
Normal          μ = 1.347 m/s    σ = 0.392 m/s
Weibull         a = 1.583        b = 1.973
AR(1), AR(2)    r1 = 0.820       r2 = 0.688

AR(1) is a parametric model with two parameters (a, the autoregression coefficient,
and $\sigma_e^2$, the variance of the independent normal variable). The model
requires only the lag-one serial correlation coefficient ($r_1$), as both parameters are
dependent only upon $r_1$.
AR(2) has three parameters ($\phi_1$ and $\phi_2$, the autoregression coefficients, and $\sigma_e^2$,
the variance of the independent normal variable), all functions of $r_1$ and $r_2$, the
lag-one and lag-two serial correlation coefficients listed in Table 2.
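
As a worked illustration, substituting the Table 2 values $r_1 = 0.820$ and $r_2 = 0.688$ into the relations of Sections 2.3 and 2.4 gives the following parameter values. These numbers are computed here for illustration and are not reported in the paper; the AR(2) variance uses the identity $R^2 = \phi_1 r_1 + \phi_2 r_2$.

```latex
\begin{aligned}
\text{AR(1):}\quad & a = r_1 = 0.820, \qquad \sigma_e^2 = 1 - r_1^2 = 1 - 0.6724 = 0.3276 \\
\text{AR(2):}\quad & \phi_1 = \frac{r_1 (1 - r_2)}{1 - r_1^2} = \frac{0.820 \times 0.312}{0.3276} \approx 0.781 \\
& \phi_2 = \frac{r_2 - r_1^2}{1 - r_1^2} = \frac{0.688 - 0.6724}{0.3276} \approx 0.048 \\
& R^2 = \phi_1 r_1 + \phi_2 r_2 \approx 0.673, \qquad \sigma_e^2 = 1 - R^2 \approx 0.327
\end{aligned}
```

The small value of $\phi_2$ indicates that, for these correlations, the second lag adds little beyond what the first lag already explains.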
Of the six methods, the Markov chain is the one that requires the highest number
of parameters. The number of parameters required changes with the number of
states used for the wind speed. In this study, 10 states were chosen for the wind
speed, each 1.5 m/s wide. This resulted in 90 transition probabilities to be determined
from the observed wind speed data set, when it is considered that summation
over any row in the transition probability matrix results in 100% probability. The
transition probability matrix of the data set is given in Table 3. Not only transition
probabilities, but also the wind speed distribution in each state should be known
by this method. In this study, wind speed was assumed to be distributed uniformly
over the states except for the last one (state of highest wind speeds with no upper
limit), where the one-parameter gamma distribution was used. In State 1 with the
lower limit of zero, the probability of occurrence of zero wind speed was also taken
into consideration in order to reproduce the zero wind speeds, although their
occurrence was very low.

Fig. 4. Normal probability distribution function fitted to the observed and simulated random wind
speed sequences.

Fig. 5. Weibull probability distribution function fitted to the observed and simulated random wind
speed sequences.

Table 3
Transition probability matrix of the observed hourly mean wind speed data set

Pij     j = 1   2       3       4       5       6       7       8       9       10
i = 1   0.7053  0.2779  0.0144  0.0015  0.0008  0.0001  0.0000  0.0000  0.0000  0.0000
2       0.2405  0.6089  0.1306  0.0153  0.0041  0.0005  0.0000  0.0001  0.0000  0.0000
3       0.0256  0.2839  0.5317  0.1352  0.0178  0.0048  0.0005  0.0003  0.0002  0.0000
4       0.0042  0.0491  0.3116  0.4954  0.1191  0.0179  0.0023  0.0003  0.0000  0.0000
5       0.0008  0.0176  0.0865  0.3486  0.4311  0.0978  0.0168  0.0008  0.0000  0.0000
6       0.0000  0.0089  0.0266  0.1197  0.3437  0.4013  0.0865  0.0111  0.0022  0.0000
7       0.0000  0.0152  0.0076  0.0455  0.1212  0.3561  0.3485  0.1061  0.0000  0.0000
8       0.0000  0.0000  0.0000  0.0526  0.0000  0.2105  0.3684  0.2632  0.0789  0.0263
9       0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.1818  0.3636  0.3636  0.0909
10      0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  1.0000  0.0000

There is no parameter to be listed for the wavelet approach, as it is a nonparametric
method. The length of the series to be used in this method is equal to $2^K$,
where K is a positive integer and equal to 15 in this study. This corresponds to a
series 32768 hours in length. Another requirement for the wavelet approach is that
the data set should be of a non-skewed structure. Therefore, the part of the
observed series used for the wavelet approach was transformed by using the power
transformation with h = 0.3853.

3.3. Simulation and results

A thousand-year-long (8760000-hour) series was generated with each method.
The correlogram, frequency distribution of maximum wind speeds and wind duration
curve obtained from the simulations are compared to those of the
observed series.
It is obvious that the hourly mean wind speed time series has a highly dependent
structure. The normal and Weibull distributions, however, are of independent
structures (Fig. 6), yet they are very common methods used in generating wind
speed data. These methods may be useful in offering, to the structural designer, the
highest wind speed that the structure will possibly face during its lifetime.
It is seen from Fig. 4 that wind speed data generated by the normal distribution
fit the observed series very well. It is seen in Fig. 5 that the Weibull fit is perfect as
well.

Fig. 6. Correlogram of the observed and simulated wind speeds.

Fig. 7. Cumulative frequency diagram of maximum wind speeds.

Other than those two methods, the AR(1), AR(2), and Markov chain methods
appeared to reproduce the dependence structure of the series. However, with increasing
lags in time, the success of those methods in reproducing the correlation structure
of the series decreases (Fig. 6). Of the six methods studied, the wavelet method was
found to be the best in preserving the correlation structure of the series.
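
A correlogram such as the one in Fig. 6 is built from sample autocorrelation coefficients at successive lags. A minimal sketch of one common estimator is given below, assuming NumPy; the function name is hypothetical and the paper does not state which estimator was used.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation coefficients r_1 .. r_max_lag of a series."""
    x = np.asarray(x, dtype=float) - np.mean(x)    # remove the mean
    denom = np.dot(x, x)                           # lag-zero sum of squares
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])
```

Applying the same function to the observed series and to each simulated series gives curves that can be compared lag by lag; for independent random numbers the coefficients should stay close to zero at every lag.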
The annual maximum values of the simulation series were compared in Fig. 7. It
is seen that the normal distribution, AR(1) and AR(2) produced similar maxima,
whereas the wavelet approach produced higher, the Markov chain slightly lower
and the Weibull distribution considerably lower maxima. From the structural
engineering point of view, therefore, it is safer to use the wind load due to the
maximum wind speed generated by the wavelet-based approach.
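
The annual maxima compared in Fig. 7 can be extracted from a generated hourly series as sketched below. NumPy is assumed, the function names are hypothetical, a year is taken as 8760 hours (consistent with the 8760000-hour, thousand-year series), and the simple i/n cumulative frequency is an assumption, since the paper does not give its plotting position.

```python
import numpy as np

def annual_maxima(hourly, hours_per_year=8760):
    """Maximum of each complete year in an hourly series; a trailing partial year is dropped."""
    n_years = hourly.size // hours_per_year
    trimmed = hourly[: n_years * hours_per_year]
    return trimmed.reshape(n_years, hours_per_year).max(axis=1)

def cumulative_frequency(maxima):
    """Sorted maxima with their empirical cumulative frequencies i/n (assumed plotting position)."""
    x = np.sort(maxima)
    freq = np.arange(1, x.size + 1) / x.size
    return x, freq
```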
Maxima obtained from the Markov chain method should be discussed specifi-
cally. There are three vertical jumps (one of them is very obvious) in the cumulat-
ive frequency diagram of the maxima of this method, as is seen in Fig. 7. The
reason for those jumps can be explained very simply. It is seen from Table 3 that
the probabilities of transition of the wind speed to the highest states are too low,
making transition of wind speed to those states almost impossible in the simulation
series. It is only possible to make a transition to State 10 if the previous state in the
simulation series is either State 8 or State 9. Otherwise State 10 is not simulated.
This causes the maximum value of the series to be bounded by the upper limit of
State 9, which was taken as 13.5 m/s in this study. A very small jump exists in the
frequency curve in Fig. 7 due to this circumstance. Similarly, State 9 can be simu-
lated if and only if the previous state of the wind speed is one of the following
states: 3, 6, 8, 9 and 10 (Table 3). The big jump in the cumulative frequency curve
in Fig. 7 is due to this situation. It is seen that the maxima of the simulated series
are bounded by the upper limit of State 8, which was taken as 12 m/s in this study.
The third jump at the very beginning of the curve (close to the y-axis of the graph)
is due to a similar situation. This is the result of not having simulated wind speeds
from State 8, which limits the maximum wind speed to 10.5 m/s at the upper limit
of State 7. This drawback of the method can be overcome by forcing the simula-
tion series to have at least one value from the highest state that results in a
maximum wind speed data series all generated from the highest state with no upper
limit. Such a forcing can be considered quite reasonable and it does not affect the
transition probability matrix as the number of data is usually very large (of the
order of tens of thousands).
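
The reachability argument above can be checked directly on a transition probability matrix. The sketch below uses the simulated matrix reported in Table 4 as a stand-in, because the text refers to the observed matrix of Table 3; the helper name `predecessors` is hypothetical.

```python
import numpy as np

# Simulated transition probability matrix copied from Table 4
# (row i = current state, column j = next state). The text argues from the
# observed matrix (Table 3); Table 4 is used here only as an illustration.
P = np.array([
    [0.7049, 0.2782, 0.0143, 0.0016, 0.0009, 0.0001, 0.0000, 0.0000, 0.0000, 0.0000],
    [0.2406, 0.6086, 0.1309, 0.0153, 0.0041, 0.0005, 0.0000, 0.0001, 0.0000, 0.0000],
    [0.0255, 0.2845, 0.5318, 0.1346, 0.0179, 0.0048, 0.0005, 0.0003, 0.0002, 0.0000],
    [0.0042, 0.0491, 0.3115, 0.4958, 0.1191, 0.0177, 0.0023, 0.0003, 0.0000, 0.0000],
    [0.0008, 0.0181, 0.0860, 0.3474, 0.4322, 0.0982, 0.0165, 0.0008, 0.0000, 0.0000],
    [0.0001, 0.0089, 0.0259, 0.1206, 0.3421, 0.4028, 0.0861, 0.0113, 0.0022, 0.0000],
    [0.0001, 0.0156, 0.0077, 0.0478, 0.1227, 0.3526, 0.3491, 0.1045, 0.0000, 0.0000],
    [0.0000, 0.0000, 0.0000, 0.0501, 0.0000, 0.2197, 0.3667, 0.2606, 0.0777, 0.0252],
    [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.1888, 0.3619, 0.3720, 0.0773],
    [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 1.0000, 0.0000],
])


def predecessors(matrix, state):
    """Return the 1-based states from which `state` can be reached in one step."""
    column = matrix[:, state - 1]
    return [int(i) + 1 for i in np.flatnonzero(column > 0)]


print(predecessors(P, 10))  # states that can lead to State 10
print(predecessors(P, 9))   # states that can lead to State 9
```

For this matrix, State 10 has only States 8 and 9 as predecessors, and State 9 has States 3, 6, 8, 9 and 10, which agrees with the sets named in the text.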
Uniformly distributed wind speeds were accepted for the intermediate states,
whereas the one-parameter gamma distribution was adopted for the highest state
(State 10 in this study). The reason for choosing this distribution is explained
below together with a discussion on other distributions.
A distribution with no upper limit should be used for the highest state so that
maximum wind speeds higher than those in the observed series can be generated.
Therefore, wind speeds of the highest state were first simulated by using the
exponential distribution shifted to that state, as Sahin and Sen [11] did. This
is quite a reasonable choice for simulating the wind speeds in that state.
However, the exponential distribution generated lower maximum wind speeds than
those generated by the other methods. The Gumbel distribution was tested next,
and the maximum wind speeds it generated were also too low compared to those
obtained by the other methods. The Frechet distribution, which is accepted as the
distribution of the maximum wind speeds [1], likewise generated low maximum wind
speeds. The two-parameter gamma distribution was then fitted, which again
resulted in low maximum wind speeds. Finally, the one-parameter gamma
distribution was fitted, and results comparable to those of the other methods
(Fig. 7) were obtained.
The conclusion that can be drawn from these trials is that a one-parameter
distribution fits the highest state better than distributions with two or more
parameters. If the standard deviation of the highest state, which is bounded by
the lower and upper limits of the state, is included in the generation scheme,
then lower maximum wind speeds are generated. Therefore, mean-dependent
probability distribution functions are better in the simulation of maximum wind
speeds.
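
The second step of the Markov chain method, drawing a wind speed once the state is known, might look like the sketch below. Several details are assumptions rather than values from the study: the uniform state width of 1.5 m/s starting at 0 m/s is inferred from the quoted upper limits of States 7, 8 and 9 (10.5, 12 and 13.5 m/s), the gamma variate is simply added to the lower limit of State 10, and `GAMMA_SHAPE` is a hypothetical parameter value.

```python
import numpy as np

STATE_WIDTH = 1.5  # m/s; assumption inferred from the upper limits of States 7-9
N_STATES = 10
GAMMA_SHAPE = 2.0  # hypothetical shape parameter, not a value from the study


def state_limits(state):
    """Return (lower, upper) limits in m/s; the highest state has no upper limit."""
    lower = (state - 1) * STATE_WIDTH
    upper = np.inf if state == N_STATES else lower + STATE_WIDTH
    return lower, upper


def speed_in_state(state, rng):
    """Draw one wind speed (m/s) for a 1-based state index."""
    lower, upper = state_limits(state)
    if state < N_STATES:
        # First and intermediate states: uniform between the state limits.
        return rng.uniform(lower, upper)
    # Highest state: lower limit plus a unit-scale (one-parameter) gamma variate.
    # Shifting the gamma variate by the lower limit is an assumption.
    return lower + rng.gamma(shape=GAMMA_SHAPE, scale=1.0)


rng = np.random.default_rng(0)
sample = [speed_in_state(s, rng) for s in (1, 5, 9, 10)]
```

Because the gamma variate is unbounded above, a value drawn for State 10 can exceed any observed maximum, which is the property the text asks of the distribution for the highest state.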
The transition probability matrix of the wind speed series simulated by the
Markov chain method is given in Table 4. It is almost the same as its observed
counterpart given in Table 3, which means that the Markov chain based simulation
technique reproduced the states of the wind speed series very well.

Table 4
Transition probability matrix of hourly mean wind speed data simulated by Markov chain method
Pij    j=1     2       3       4       5       6       7       8       9       10
i=1    0.7049  0.2782  0.0143  0.0016  0.0009  0.0001  0.0000  0.0000  0.0000  0.0000
2      0.2406  0.6086  0.1309  0.0153  0.0041  0.0005  0.0000  0.0001  0.0000  0.0000
3      0.0255  0.2845  0.5318  0.1346  0.0179  0.0048  0.0005  0.0003  0.0002  0.0000
4      0.0042  0.0491  0.3115  0.4958  0.1191  0.0177  0.0023  0.0003  0.0000  0.0000
5      0.0008  0.0181  0.0860  0.3474  0.4322  0.0982  0.0165  0.0008  0.0000  0.0000
6      0.0001  0.0089  0.0259  0.1206  0.3421  0.4028  0.0861  0.0113  0.0022  0.0000
7      0.0001  0.0156  0.0077  0.0478  0.1227  0.3526  0.3491  0.1045  0.0000  0.0000
8      0.0000  0.0000  0.0000  0.0501  0.0000  0.2197  0.3667  0.2606  0.0777  0.0252
9      0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.1888  0.3619  0.3720  0.0773
10     0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  1.0000  0.0000

The wind duration curve is a graph with time percentage as abscissa and wind
speed as ordinate (Fig. 8). It is an important tool used in determining the
percentage of time that the wind speed exceeds a specified level. Wind energy
production systems use this graph in order to determine the wind energy potential
of the region under consideration. A very good fit was obtained in Fig. 8, where
the wind duration curves of the six methods are plotted together with the one
extracted from the observed series. Although the wind duration curve of the
Markov chain method fluctuates around the others, its fit is also good enough.
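
A wind duration curve of the kind described above can be built from any hourly series by sorting the speeds in descending order. The sketch below is one common construction and is not taken from the study; the function names are hypothetical.

```python
import numpy as np


def wind_duration_curve(speeds):
    """Return (percent_of_time, speeds_descending) for a wind duration curve.

    percent_of_time[k] is the percentage of the record during which the wind
    speed is at least speeds_descending[k].
    """
    v = np.sort(np.asarray(speeds, dtype=float))[::-1]
    percent = 100.0 * np.arange(1, v.size + 1) / v.size
    return percent, v


def percent_time_exceeding(speeds, level):
    """Return the percentage of time the wind speed is above `level` (m/s)."""
    v = np.asarray(speeds, dtype=float)
    return 100.0 * np.mean(v > level)
```

Plotting the first returned array as abscissa and the second as ordinate gives the axis layout described in the text, and `percent_time_exceeding` reads a single point off that curve.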
The first three central moments of the observed and simulated series are given in
Table 5. The maximum values and the correlation coefficients at the first five
lags are also listed. It is seen that the mean values of the simulated series are
almost the same as those of the observations. The wavelet-based method approaches
its counterparts with a relative error of 0.3%. Standard deviation and variation
coefficient were best captured by the normal probability distribution and the
AR(1) and AR(2) processes. The skewness coefficient of the wind speed time series
was best reproduced by AR(1). Higher maxima were obtained by the AR(1), wavelet
and normal distribution methods, and lower maxima by the Weibull distribution.
Correlation structure, as discussed earlier, was best simulated by the
wavelet-based method.

Fig. 8. Wind duration curve of observed and simulated wind speeds.

Table 5
Statistical characteristics of simulated series
Series   Mean   Standard   Coefficient   Coefficient  Maximum     Correlation coefficient
         (m/s)  deviation  of variation  of skewness  wind speed  r1      r2      r3      r4      r5
                (m/s)                                 (m/s)
Normal   2.538  1.777      0.700         1.315        26.94       0.0005  0.0002  0.0005  0.0002  0.0002
Weibull  2.529  1.634      0.646         0.980        16.22       0.0003  0.0005  0.0003  0.0001  0.0007
AR(1)    2.537  1.776      0.700         1.309        27.30       0.815   0.661   0.536   0.436   0.355
AR(2)    2.537  1.776      0.700         1.313        25.23       0.815   0.677   0.561   0.466   0.387
Markov   2.585  2.058      0.796         0.983        21.29       0.715   0.587   0.483   0.399   0.330
Wavelet  2.566  1.833      0.714         1.438        31.13       0.715   0.580   0.523   0.443   0.421
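
The quantities listed in Table 5 can be computed from a series as in the sketch below. The estimators chosen here (sample standard deviation with n - 1, moment-based skewness, and the usual autocorrelation estimator) are assumptions, since the source does not state which formulas were used; `summary_statistics` is a hypothetical name.

```python
import numpy as np


def summary_statistics(series, max_lag=5):
    """Return mean, standard deviation, CV, skewness, maximum and r1..r_max_lag."""
    x = np.asarray(series, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)  # sample standard deviation (assumed estimator)
    dev = x - mean
    # Moment-based skewness coefficient: third central moment over variance^1.5.
    skew = np.mean(dev**3) / np.mean(dev**2) ** 1.5
    denom = np.sum(dev**2)
    # Lag-k autocorrelation: lagged cross-products over the total sum of squares.
    r = [float(np.sum(dev[:-k] * dev[k:]) / denom) for k in range(1, max_lag + 1)]
    return {
        "mean": float(mean),
        "std": float(std),
        "cv": float(std / mean),
        "skewness": float(skew),
        "maximum": float(x.max()),
        "r": r,
    }
```

As a consistency check on the table itself, the coefficient of variation is the standard deviation divided by the mean; for the normal distribution row, 1.777 / 2.538 ≈ 0.700, which matches the listed value.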

4. Summary and conclusion

In this study, hourly mean wind speed data sets were generated by traditional
simulation methods: the normal and Weibull probability distribution functions,
the first- and second-order autoregressive processes, and the Markov chain.
Additionally, the newly developed wavelet-based approach was used. The normal
and Weibull probability distribution functions generate independent identically
distributed random numbers. The autoregressive models include the correlation
structure of the observations and hence generate dependent series. The Markov
chain is a two-step method that first determines the state of the wind speed and
then generates its magnitude by using a preselected distribution. The traditional
methods are all parametric and they therefore require the time series to have a
specific probability distribution. This is a drawback of parametric models more
than a limitation. A nonparametric model, of which the wavelet approach in this
study is one of the best examples, can be applied to data sets with any
distribution. However, it should be kept in mind that the wavelet approach works
only with sequences of zero skewness.
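
To illustrate how an autoregressive model carries the correlation structure into a generated series, the sketch below implements a textbook first-order process with Gaussian noise. It is a simplification and not the scheme used in the study: Gaussian noise yields zero skewness and can produce negative values, whereas the AR(1) row of Table 5 reports a skewness of 1.309. The inputs 2.537 m/s, 1.776 m/s and 0.815 are taken from that row purely as illustrative parameters.

```python
import numpy as np


def generate_ar1(n, mean, std, r1, rng):
    """Generate n values of a stationary Gaussian AR(1) process.

    x[t] = mean + r1 * (x[t-1] - mean) + std * sqrt(1 - r1**2) * e[t],
    with e[t] standard normal, so the series keeps the given mean, standard
    deviation and lag-1 correlation in the long run.
    """
    x = np.empty(n)
    x[0] = mean + std * rng.standard_normal()
    noise_std = std * np.sqrt(1.0 - r1**2)
    for t in range(1, n):
        x[t] = mean + r1 * (x[t - 1] - mean) + noise_std * rng.standard_normal()
    return x


rng = np.random.default_rng(1)
series = generate_ar1(50_000, mean=2.537, std=1.776, r1=0.815, rng=rng)
lag1 = np.corrcoef(series[:-1], series[1:])[0, 1]
```

For such a process the theoretical lag-k correlation is r1 raised to the power k, so 0.815 squared gives about 0.664 and 0.815 cubed about 0.541, close to the r2 = 0.661 and r3 = 0.536 listed for AR(1) in Table 5.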
The correlation structure of the observations, distribution of the maximum wind
speeds, wind duration curve and statistical features of the series were used in order
to compare the success of the methods.
The generation of maximum wind speeds requires special attention in Markov
chain based simulation methods. Based upon the application in this study, it is
concluded that the uniform probability distribution function is suitable for use in
the first and intermediate states. A probability distribution function with no upper
limit should be used for the highest state. It is concluded that the one-parameter
gamma distribution is good enough in fitting to the wind speed data in the highest
state of the series for normal regions, such as the one used in this study.
Some methods preserved particular characteristics better than others did. For
example, the wavelet method is clearly the best in preserving the correlation
structure of the sequence, and it is as good as the other methods at preserving
the other statistical features of the series. Therefore, in conclusion, the
wavelet method is proposed as a tool to substitute for the classical generation
schemes for the simulation of hourly mean wind speed data.

Acknowledgements

The wavelet approach presented in this study is a result of an earlier
cooperation between the first author (H. Aksoy) and Professor M. Bayazit of
Istanbul Technical University, Turkey, whom the authors sincerely thank.

References
[1] Simiu E, Scanlan RH. Wind effects on structures. New York: John Wiley & Sons; 1986.
[2] American Society of Civil Engineers. Minimum design loads for buildings and other structures.
ANSI/ASCE 7-93 (Revision of ANSI/ASCE 7-88), New York, 1994.
[3] Simiu E, Filliben JJ, Shaver JR. Short-term records and extreme wind speeds. ASCE, Journal of
the Structural Division 1982;108(ST11):2571–7.
[4] Cheng EDH, Chiu ANL. Extreme winds simulated from short-period records. ASCE, Journal of
Structural Engineering 1985;111(1):77–94.
[5] Cheng EDH, Chiu ANL. Extreme winds generated from short records in a tropical cyclone-prone
region. Journal of Wind Engineering and Industrial Aerodynamics 1988;28:69–78.
[6] Cheng EDH. Wind data generator: a knowledge-based expert system. Journal of Wind Engineering
and Industrial Aerodynamics 1991;38:101–8.
[7] Cheng EDH, Chiu ANL. An expert system for extreme wind simulation. Journal of Wind
Engineering and Industrial Aerodynamics 1990;36:1235–43.
[8] Kaminsky FC, Kirchhoff RH, Syu CY, Manwell JF. A comparison of alternative approaches for
the synthetic generation of a wind speed time series. Transactions of the ASME 1991;113:280–9.
[9] Sfetsos A. A comparison of various forecasting techniques applied to mean hourly wind speed time
series. Renewable Energy 2000;21:23–35.
[10] Dukes MDG, Palutikof JP. Estimation of extreme wind speeds with very long return periods.
Journal of Applied Meteorology 1995;34:1950–61.
[11] Sahin AD, Sen Z. First-order Markov chain approach to wind speed modelling. Journal of Wind
Engineering and Industrial Aerodynamics 2001;89:263–9.
[12] Castino F, Festa R, Ratto CF. Stochastic modelling of wind velocities time series. Journal of Wind
Engineering and Industrial Aerodynamics 1998;74–76:141–51.
[13] Kitagawa T, Nomura T. A wavelet-based method to generate artificial wind fluctuation data.
Journal of Wind Engineering and Industrial Aerodynamics 2003;91:943–64.
[14] Garcia A, Torres JL, Prieto E, De Francisco A. Fitting wind speed distributions: a case study.
Solar Energy 1998;62(2):139–44.
[15] Grigoriu M. Estimates of design wind from short records. ASCE Journal of the Structural Division
1982;108(ST5):1034–48.
[16] Heckert NA, Simiu E, Whalen T. Estimates of hurricane wind speeds by ‘peaks over threshold’
method. ASCE Journal of Structural Engineering 1998;124(4):445–9.
[17] Lechner A, Simiu E, Heckert NA. Assessment of ‘peaks over threshold’ methods for estimating
extreme value distribution tails. Structural Safety 1993;12:305–14.
[18] Pandey MD, Van Gelder PHAJM, Vrijling JK. The estimation of extreme quantiles of wind
velocity using L-moments in the peaks-over-threshold approach. Structural Safety 2001;23:179–92.
[19] Simiu E, Heckert NA. Extreme wind distribution tails: a ‘peaks over threshold’ approach. ASCE,
Journal of Structural Engineering 1996;122(5):539–47.
[20] Stedinger JR, Vogel RM, Foufoula-Georgiou E. Frequency analysis of extreme events. In: Maidment
D, editor. Handbook of hydrology. New York: McGraw Hill Book Co; 1993 [Chapter 18].
[21] Rao RM, Bopardikar AJ. Wavelet transforms, introduction to theory and applications. Reading,
MA: Addison-Wesley; 1998.
[22] Bayazit M, Aksoy H. Using wavelets for data generation. Journal of Applied Statistics 2001;28(2):
157–66.
[23] Bayazit M, Onoz B, Aksoy H. Nonparametric streamflow simulation by wavelet or Fourier
analysis. Hydrological Sciences Journal 2001;46(4):623–34.
[24] Aksoy H. Storage capacity for river reservoirs by wavelet-based generation of sequent peak
algorithm. Water Resources Management 2001;15(6):423–37.
[25] Aksoy H, Akar T, Unal NE. Wavelet analysis for modeling suspended sediment discharge. Nordic
Hydrology 2004;35:165–74.
[26] Unal NE, Aksoy H, Akar T. Annual and monthly rainfall data generation schemes. Stochastic
Environmental Research and Risk Assessment 2004;18(6):in press.
