
976

JOURNAL OF NETWORKS, VOL. 4, NO. 10, DECEMBER 2009

Prediction of State of Wireless Network Using


Markov and Hidden Markov Model
Md. Osman Gani
Military Institute of Science and Technology, Department of Computer Science and Engineering, Dhaka, Bangladesh
Email: [email protected]

Hasan Sarwar and Chowdhury Mofizur Rahman


United International University, Department of Computer Science and Engineering, Dhaka, Bangladesh
Email: {mdhasan70, cmrbuet}@yahoo.com

Abstract: Optimal resource allocation and higher quality of service are much-needed requirements in wireless networks. Intelligent prediction of network behavior plays a very important role in improving both. The Markov Model (MM) and the Hidden Markov Model (HMM) are proven prediction techniques used in many fields. In this paper, we use Markov and Hidden Markov prediction tools to predict the number of wireless devices that are connected to a specific Access Point (AP) at a specific instant of time. Prediction is performed in two stages. In the first stage, we find the state sequence of wireless access points (APs) in a wireless network by observing the traffic load sequence in time. It is found that a particular choice of data may lead to 91% accuracy in predicting the real scenario. In the second stage, we use the Markov Model to find the future state sequence of the sequence found in the first stage. The prediction of the next state of an AP performed by the Markov tool shows 88.71% accuracy. It is found that the Markov Model can predict with an accuracy of 95.55% if the initial transition matrix is calculated directly. We also show that the O(1) Markov Model gives slightly better accuracy than the O(2) MM for predicting the far future.

Index Terms: state prediction, Markov model, Hidden Markov model, access point.

I. INTRODUCTION
WLANs are now the preferred choice for local area networks. They are easily deployable and ensure quick connectivity among laptops, PDAs and other mobile devices. An access point (AP) is a device that creates wireless connectivity with a station (STA) within its WLAN. An access point supports a particular number of devices at a time. This number changes as new mobile devices enter the LAN and existing mobile devices leave. As a result, the number of users supported by an AP at a time varies, unlike a hub or router of a fixed network, whose number of clients is fixed. As the number of users within a WLAN changes, so does the amount of traffic.
The boundaries of WLANs are not well-defined from one moment to the next, mostly due to the mobility of the nodes (the addressable units of the WLAN). At one

2009 ACADEMY PUBLISHER


doi:10.4304/jnw.4.10.976-984

moment, a congestion-free network may turn into a congested network within a short while due to a change in the number of client stations or a change in their use of multimedia services. By analyzing the history of an access point, the future workload of the access point may be determined in terms of its traffic load. The variation in the number of client nodes in a WLAN and the change in traffic load on the access point may affect the router in its decision to find the best route to a destination. Taking early action before such a situation arises requires the incorporation of intelligence in the congestion control algorithm or in routing. A good prediction of network traffic at each access point helps in early allocation of network resources as well as guaranteed quality of service. Resource allocation to individual users now depends on many parameters, and prediction schemes are included in many algorithms nowadays. A lot of work has been performed in this regard for mobility prediction [1-4], where attempts are made to predict the future time and space of a mobile node. However, from the perspective of the access point, attempts have rarely been made to predict the future states of an access point in terms of its traffic load variation in time.
Traffic load on an access point varies with time. This variation depends on the number of nodes attached to that access point. Again, the type of services in use may cause performance variation in throughput. For example, today's video, audio and data-rich multimedia applications may require more bandwidth than a simple application might. A model might be used to forecast the future status of an access point in terms of its traffic load. For example, the amount of traffic load that the access point is going to carry may be predicted earlier. Congestion control algorithms may use this prediction data to take actions, or the routing mechanism may use it to dynamically select a less costly route.
We propose a scheme which predicts the network situation in an indirect way. The scheme works in two phases. In the first phase, we analyze the traffic load at an access point with respect to time. Later we use this information to find out the load assignment on an access point. Thus


by using the traffic traces, we have generated a sequence of states of an access point. In the second phase, we made a prediction to find the future load of an access point. We used the Hidden Markov Model for the first-phase calculation, and in the second phase we used the Markov Model as a prediction tool.
The literature suggests that many research works have been reported on congestion control, routing mechanisms, mobility management, bandwidth reservation, and call admission control in wireless networks. Effective prediction mechanisms have been considered in these publications. The major contribution of our work is that our scheme is able to predict the load scenario of an access point in an indirect way. We focused on the usage of an access point. This can surely help in taking decisions in all of the above cases. Moreover, a better network topology may be suggested based on the usage of an access point at a particular location. Our scheme can be helpful in determining an efficient bandwidth management system as well.
This paper is organized as follows: Section II gives an account of related works. Sections III and IV describe the Markov Model and the Hidden Markov Model. Section V describes the data sets of traffic traces. Sections VI and VII present the implementation of the models on the data sets and the results achieved. Finally, the conclusion is outlined in Section VIII.

II. RELATED WORKS


Traffic prediction is important to assess future network capacity requirements and to plan future network developments. It can also help in early detection of network congestion. Congestion deteriorates system throughput and results in energy loss of nodes in wireless networks. The importance of prediction is shown in a congestion detection algorithm presented in [5], which decreases packet drops and provides a high packet delivery ratio. In [6], a suite of predictive congestion control schemes is shown that increases network throughput and efficiency and satisfies energy conservation for wireless sensor networks.
Traffic prediction can also be an important basis for planning the fastest route to a given destination in a network [7]. In [8], a traffic-aware routing metric for real-time communications (RTC) has been proposed. It is known that under the conditions of a busy network and time-varying topology, a dynamic ad-hoc routing algorithm that changes the routing paths according to the channel condition proves to be more efficient. The time-varying nature of the wireless channel may cause serious degradation of route quality and data throughput. Several other developments are found in [9] [10].
Generally, for the purpose of predicting future network traffic, a time series model, the Auto-Regressive Integrated Moving Average (ARIMA) model, is used. A
variation of the ARIMA model that captures seasonal patterns is shown in SARIMA. A comparative analysis between parametric predictors (i.e., Auto-Regressive Integrated Moving Average (ARIMA) and fractional ARIMA (FARIMA) predictors) and nonparametric predictors (i.e., artificial neural network (ANN) and wavelet-based predictors) has been given in [11][12]. A novel network traffic one-step-ahead prediction technique based on a state-of-the-art learning model called the minimax probability machine (MPM) is proposed in [13].

III. MARKOV MODEL


A Markov chain is a stochastic process with the Markov property [14-16]. The Markov property means that, given the present state, future states are independent of the past states. A probabilistic approach is used to determine the future state. Only information about the present state influences the evolution of the process. The change of state, i.e., from the current state to another state or to the same state, is called a transition, and the probabilities associated with the various state changes are called transition probabilities.
A Markov chain is a sequence of random variables X1, X2, X3, ... Formally,

Pr(X_{n+1} = x | X_n = x_n, ..., X_1 = x_1) = Pr(X_{n+1} = x | X_n = x_n)    (1)

The possible values of Xi form a countable set S called the state space of the chain, which is denoted by

S = {1, 2, 3, ..., N-1, N}    (2)

St denotes the state at time t; thus, St ranges over the set S. We are interested in the probability π_{t,i} that the Markov model will be in state i at time t. We denote this as

π_{t,i} = Pr(S_t = i),  i = 1, 2, 3, ..., N    (3)

The transition probabilities A_ij denote the probability of going from state i at time t (S_t = i) to state j at time t+1 (S_{t+1} = j). So,

A_ij = Pr(S_{t+1} = j | S_t = i),  i, j = 1, 2, 3, ..., N    (4)
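The transition probabilities of equation (4) can be estimated from an observed state sequence by simple transition counting. The sketch below is our own illustration; the function name and toy sequence are not from the paper:

```python
def estimate_transition_matrix(states, n_states):
    """Maximum-likelihood estimate of A[i][j] = Pr(S_{t+1} = j | S_t = i),
    obtained by counting observed transitions and normalizing each row."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for s, s_next in zip(states[:-1], states[1:]):
        counts[s][s_next] += 1
    for row in counts:
        total = sum(row)
        if total > 0:  # leave rows of unseen states at zero
            for j in range(n_states):
                row[j] /= total
    return counts

# Toy sequence over states {0, 1, 2} (e.g. low, mid, high)
seq = [0, 0, 1, 0, 0, 2, 1, 0]
A = estimate_transition_matrix(seq, 3)  # A[0][0] = 0.5, A[1][0] = 1.0
```

Each row of the resulting matrix sums to 1 for every state that actually appears in the training sequence.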

IV. HIDDEN MARKOV MODEL


The Hidden Markov Model (HMM) [17] is a powerful statistical tool for modeling generative sequences that can be characterized by an underlying process generating an observable sequence. HMMs are used today for modeling and analyzing time series or sequential data in various fields, such as automatic speech recognition, cryptanalysis, natural language processing, computational biology and bioinformatics. With its prior knowledge, an HMM is concerned with the unobserved sequence of hidden states and the corresponding sequence of related


TABLE I
DETAILS OF USER SESSION

Session 1:
  Connection ID:  C6d41f77041b6b7289f9fd9af1fb5476
  Time Stamp In:  2007-08-28 17:52:46    Time Stamp Out: NULL
  Node ID:        d919294205e44a1cf18fac49c2db65f7
  User ID:        24a3f7cc2b5aa5695a4d080d3885ed19
Session 2:
  Connection ID:  1ba6e1a57801fbd68754f5a786f1daf4
  Time Stamp In:  2007-08-28 16:42:48    Time Stamp Out: 2007-08-28 17:13:00
  Node ID:        d919294205e44a1cf18fac49c2db65f7
  User ID:        0115d2bb055e85cc0be89eeae7ddfd3f
Session 3:
  Connection ID:  89c26264c5501119676650110722f6f5
  Time Stamp In:  2007-08-28 16:33:48    Time Stamp Out: 2007-08-28 16:42:52
  Node ID:        d919294205e44a1cf18fac49c2db65f7
  User ID:        e6bbc2939f1abcecf9fe85f45785d561
Session 4:
  Connection ID:  A2b809b87878ba2d54371ad0a696d1e6
  Time Stamp In:  2007-08-28 16:11:19    Time Stamp Out: 2007-08-28 16:20:51
  Node ID:        d919294205e44a1cf18fac49c2db65f7
  User ID:        e6bbc2939f1abcecf9fe85f45785d561
Session 5:
  Connection ID:  13c33027b1142bd2d966e581824aadc7
  Time Stamp In:  2007-08-28 15:48:24    Time Stamp Out: 2007-08-28 16:24:52
  Node ID:        d919294205e44a1cf18fac49c2db65f7
  User ID:        ea1e3b4b418bdda4ccfae995e3d7dea8
Session 6:
  Connection ID:  C620a55893b245044f06403b7db0ec1d
  Time Stamp In:  2007-08-28 14:55:23    Time Stamp Out: 2007-08-28 15:17:56
  Node ID:        d919294205e44a1cf18fac49c2db65f7
  User ID:        29b8ac284559a4e6a447a4f2b5fef106

observation. An HMM is defined with respect to states, observations and their probabilities. These are:
N = number of states in the model
M = number of observation symbols
T = length of the observation sequence
V = {v1, v2, ..., vM}, the discrete set of possible observations
π = {π_i}, π_i = P(S_1 = i), the initial probability that the system will be in state i
A = {a_ij}, where a_ij = P(S_{t+1} = j | S_t = i), the probability of being in state j at time t+1 given that the system was in state i at time t
B = {b_j(k)}, b_j(k) = P(v_k at t | S_t = j), the probability that the observation will be v_k given that the system is in state j
O_t denotes the observation symbol observed at time instant t.
λ = (A, B, π) is the compact notation used to denote an HMM.
Applications of HMMs reduce to solving three main problems. These are:
1. Given the model λ = (A, B, π), compute P(O|λ), the probability of occurrence of the observation sequence O = O1, O2, ..., OT.
2. Given the model λ = (A, B, π), find the state sequence I = i1, i2, ..., iT such that P(O, I|λ), the joint probability of the observation sequence O = O1, O2, ..., OT and the state sequence, is maximized.
3. Adjust the HMM parameters λ = (A, B, π) so that P(O|λ) or P(O, I|λ) is maximized.
The nature of our problem falls into the second category. Here the observation sequence O is the observed traffic, and the state sequence I corresponds to the state sequence of the APs. The state of an AP can assume 3 values, namely low, mid and high, based on the number of active connections.
We want to find the most likely state sequence for a given sequence of observations O = O1, O2, ..., OT and a model λ = (A, B, π). The solution to this problem depends on how the "most likely state sequence" is defined. One approach is to find the most likely state S_t at each time t and to concatenate all such states. But


TABLE I (continued)

Session   User MAC                            Incoming Data   Outgoing Data
1         f2129958494bf6da5895e58f7ea6a5bc    285489          177490
2         8dad09844457b0d73d487e17fc8d9493    1205807         224965
3         7663756184a0056baf9267c501b49d86    1117223         59685
4         7663756184a0056baf9267c501b49d86    765625          56922
5         3f15d887a418830f65642027fc6c7cc6    122218210       40821009
6         ecfb5cbac5e3191c970e19af35af9f82    12614543        375922

sometimes this method does not give a physically meaningful state sequence. Therefore we use a more effective method, known as the Viterbi algorithm, in which the whole state sequence with the maximum likelihood is found. In order to facilitate the computation we define an auxiliary variable,

δ_t(i) = max over S1, S2, ..., S_{t-1} of P{S1, S2, ..., S_{t-1}, S_t = i, O1, O2, ..., O_t | λ}    (5)

which gives the highest probability that the partial observation sequence and state sequence up to time t can have when the current state is i. It is easy to observe that the following recursive relationship holds:

δ_{t+1}(j) = b_j(O_{t+1}) · max_{1 ≤ i ≤ N} [δ_t(i) a_ij],  1 ≤ j ≤ N, 1 ≤ t ≤ T-1    (6)

where

δ_1(j) = π_j b_j(O1),  1 ≤ j ≤ N    (7)

So the procedure to find the most likely state sequence starts from the calculation of δ_T(j), 1 ≤ j ≤ N, using the recursion above, while always keeping a pointer to the "winning state" in the maximum-finding operation. Finally, the state j* is found, where

j* = argmax_{1 ≤ j ≤ N} δ_T(j)    (8)

and starting from this state, the sequence of states is back-tracked as the pointer in each state indicates. This gives the required set of states. The whole algorithm can be interpreted as a search in a graph whose nodes are formed by the states of the HMM at each time instant t, 1 ≤ t ≤ T.
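The recursion of equations (5)-(8) translates directly into code. The following Python function is a minimal sketch of the Viterbi decoder (the names are ours, and probabilities are kept in the linear domain for clarity; a practical implementation would work with log-probabilities to avoid underflow):

```python
def viterbi(pi, A, B, obs):
    """Most likely hidden state sequence (equations (5)-(8)) for an
    observation sequence `obs`, given initial probabilities pi,
    transition matrix A and observation matrix B."""
    N, T = len(pi), len(obs)
    delta = [[0.0] * N for _ in range(T)]   # delta[t][i], eq. (5)
    psi = [[0] * N for _ in range(T)]       # back-pointers to the winning state
    for j in range(N):                      # initialization, eq. (7)
        delta[0][j] = pi[j] * B[j][obs[0]]
    for t in range(1, T):                   # recursion, eq. (6)
        for j in range(N):
            scores = [delta[t - 1][i] * A[i][j] for i in range(N)]
            psi[t][j] = max(range(N), key=lambda i: scores[i])
            delta[t][j] = B[j][obs[t]] * scores[psi[t][j]]
    # termination (eq. (8)) and back-tracking along the pointers
    path = [max(range(N), key=lambda j: delta[T - 1][j])]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    return path[::-1]
```

With a deterministic two-state toy model (identity emissions, alternating transitions), the decoder recovers the state sequence that mirrors the observations.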
V. DATA SETS
We have used trace files of wireless data. The files are available at CRAWDAD (Community Resource for Archiving Wireless Data) [18]. We have chosen two different datasets to implement our scheme. The first dataset [19], which we used to model our scheme, contains a summary of each connection created between

TABLE II
TRAFFIC LOAD ON ACCESS POINT

AP                t0   t1   t2   t3   t4
AcadBldg10AP10     0    0    4   24    0
AcadBldg10AP11     0    2    0    0    0
AcadBldg10AP12     0    0    0    0    0
AcadBldg10AP13     0    0    5    0    0
AcadBldg10AP14     0    0    0    0    0
AcadBldg10AP15     9    0    0    0    8
AcadBldg10AP16     0    0    2    0    0

a node and an AP. The specific information extracted from each row of this file is the connection id, the node id, the MAC address of the user, the time stamp in, the time stamp out, and the total incoming and outgoing data in bytes during the session interval. These user sessions were created in various Wi-Fi hotspots in Montréal, Québec, Canada over three years. The dataset has the tabular format shown in TABLE I.
The second dataset [20] was created in a network scenario which employed 6202 wireless devices and 624 wireless access points on the Dartmouth campus. This dataset contains network behavior for 300 consecutive days. The AP traces of each day are kept in an individual file.

TABLE II (continued)

AP                t5   t6  ...  t48
AcadBldg10AP10     0    0        0
AcadBldg10AP11     0    0        0
AcadBldg10AP12     0    0        0
AcadBldg10AP13     0   23        0
AcadBldg10AP14     0    0        0
AcadBldg10AP15     0    0       17
AcadBldg10AP16     0    0        0

S = {low, mid, high}

An AP makes transitions among the low, mid and high states. The state status is updated at the start of each time interval. Each time interval is 15 minutes long. The probability of changing from one state status to another is known as the transition probability, A_ij. With i = low and j = high, for instance, A_low,high indicates the transition probability of changing the state status of an access point from low to high. For our three-state model, there are nine probabilities of interest for each access point. Let S_t and S_{t+1} denote the state of the AP at time t and at time t+1, respectively. We define the nine transition probabilities as

The number of wireless devices associated with each AP throughout the day is stored in each row of 49 columns, as shown in TABLE II. In TABLE II, the 1st column shows the location of an AP. The location format for an AP is as follows:
[Building Name][Building Number][AP][AP Number]
The 2nd column specifies the number of wireless devices associated with the AP mentioned in the 1st column between 12:00 AM and 12:30 AM. Similarly, the other columns show the number of wireless devices connected to the AP in 30-minute intervals throughout the day.
VI. HMM IMPLEMENTATION AND RESULTS
To set the stage for the Hidden Markov Model on the first dataset, we consider a three-state model. An AP may attain a low, mid or high state status. Being in the low state indicates that at present the access point is connected to between 0 and 2 devices. Being in the mid state corresponds to a situation where the access point is connected to between 3 and 6 devices. Similarly, the high state status corresponds to a situation where the access point is connected to more than 6 devices. For convenience of graph plotting, the low, mid and high state statuses have been indexed as 1, 2 and 3 respectively, as shown in TABLE III.
TABLE III
STATE DEFINITION

State      No. of connections (MM)   No. of connections (HMM)
Low (1)    0-5                       0-2
Mid (2)    6-10                      3-6
High (3)   >10                       >6

Fig.1. The transition between the three states

A_low,low(t)   = Pr{S_{t+1} = low  | S_t = low}
A_low,mid(t)   = Pr{S_{t+1} = mid  | S_t = low}
A_low,high(t)  = Pr{S_{t+1} = high | S_t = low}
A_mid,low(t)   = Pr{S_{t+1} = low  | S_t = mid}
A_mid,mid(t)   = Pr{S_{t+1} = mid  | S_t = mid}
A_mid,high(t)  = Pr{S_{t+1} = high | S_t = mid}
A_high,low(t)  = Pr{S_{t+1} = low  | S_t = high}
A_high,mid(t)  = Pr{S_{t+1} = mid  | S_t = high}
A_high,high(t) = Pr{S_{t+1} = high | S_t = high}

This can be represented by the state transition matrix

         | A_low,low   A_low,mid   A_low,high  |
A(t) =   | A_mid,low   A_mid,mid   A_mid,high  |
         | A_high,low  A_high,mid  A_high,high |

We have classified the traffic flow, i.e. our observation, into several types. A data transfer which involves fewer than 1,000,000 total bytes is classified as observation type 1. The other observation types are shown in TABLE IV.


Initialize the time slot length (in our algorithm, 15 minutes).
Create state definitions according to TABLE III.
Create observation types according to TABLE IV.
// Determine (time slot index, state status, observation type)
For each row:
    Find the number of connections and accumulate it into the
    respective time slots
    Assign a state status to each time slot following the rules
    given in TABLE III
    Find the amount of data transferred and accumulate it into
    the respective time slots
    Assign an observation type to each time slot following the
    rules in TABLE IV
// Determine the HMM parameters
Compute the initial probability matrix
Compute the transition matrix A
Compute the observation matrix B

Fig.2. Algorithm to convert data set values into HMM parameters

The first dataset is processed to transform it into a set of rows containing the time slot index, state status and observation type. In order to create the HMM parameters, we process this according to the algorithm given in Fig. 2.
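The state and observation assignments of TABLES III (HMM column) and IV are simple threshold rules; they can be sketched in Python as follows (the function names are ours):

```python
def state_status(n_connections):
    """State status per TABLE III (HMM column):
    low = 1 (0-2 devices), mid = 2 (3-6), high = 3 (>6)."""
    if n_connections <= 2:
        return 1
    if n_connections <= 6:
        return 2
    return 3

def observation_type(n_bytes):
    """Observation type per TABLE IV: the first byte threshold
    exceeding the transferred amount determines the type."""
    if n_bytes == 0:
        return 0
    thresholds = [1000000, 2000000, 3000000, 4000000, 5000000,
                  6000000, 7000000, 8000000, 9000000, 10000000,
                  12000000, 15000000, 20000000]
    for obs_type, limit in enumerate(thresholds, start=1):
        if n_bytes < limit:
            return obs_type
    return 14  # more than 20,000,000 bytes
```

Each (time slot, connection count, byte count) triple from the trace then maps to a (state, observation) pair for the HMM.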
We have run the algorithm of Fig. 2 on the user sessions of the node id d919294205e44a1cf18fac49c2db65f7 and found the parameters of the HMM. For a training set of size 60 days, the parameters of the HMM are tabulated in TABLES V, VI and VII.
Using the parameter values from TABLES V, VI and VII, we have executed the Viterbi algorithm. Generally,
TABLE IV
DEFINITION OF OBSERVATION TYPE

No. of Data Transferred in Bytes   Observation Type
0                                  0
< 1,000,000                        1
< 2,000,000                        2
< 3,000,000                        3
< 4,000,000                        4
< 5,000,000                        5
< 6,000,000                        6
< 7,000,000                        7
< 8,000,000                        8
< 9,000,000                        9
< 10,000,000                       10
< 12,000,000                       11
< 15,000,000                       12
< 20,000,000                       13
> 20,000,000                       14

TABLE VII (part 1)
OBSERVATION MATRIX (B), observation symbols 0-3

State    0      1      2      3
Low      0.427  0.309  0.073  0.058
Mid      0.000  0.242  0.172  0.095
High     0.000  0.120  0.440  0.200

TABLE V
INITIAL PROBABILITY MATRIX (π)

State   Probability
Low     0.928
Mid     0.068
High    0.004

TABLE VI
STATE TRANSITION MATRIX (A)

State   Low     Mid     High
Low     0.973   0.026   0.000
Mid     0.364   0.608   0.027
High    0.000   0.520   0.480

TABLE VIII
ACCURACY OF HMM VITERBI DECODER

Training Set Size   Prediction Set Size   Accuracy
7 days              60 days               82.64%
25 days             60 days               89.43%
60 days             60 days               91.52%
7 days              115 days              79.64%
25 days             115 days              86.54%
60 days             115 days              88.70%

The Viterbi algorithm, as stated previously, gives the hidden state sequence based on the observation sequence. We have calculated different hidden state sequences by varying the size of the training set and also the observation sequence. We executed the algorithm with 7 days of training data and predicted the state sequence for the next 60 days. In TABLE VIII, the first row uses a training data set (observation sequence of traffic) of 7 days, and the prediction (state sequence of the AP) has been performed for the next 60 consecutive days. Similarly, for training data sets of 25 and 60 days, prediction has been performed to find the state sequence for the next 60 days. Accuracy results are given in TABLE VIII, where accuracy is defined as the total number of correct predictions of state status per 100 predictions. We can see in TABLE VIII that as the training set size increases from 7 to 25 days, the prediction accuracy increases from 82% to 89%. A further increase of the training data set size to 60 days increases the prediction accuracy to 91%.
Later, we extended our prediction period to 115 days, keeping the training data set sizes of 7, 25 and 60 days respectively. Figure 4 shows the number of correct and incorrect predictions with respect to time. The X-axis indicates the time index of every 30-minute interval throughout the 115 days. The Y-axis indicates the state status (1 = low, 2 = mid, 3 = high) at any particular time interval. The white boxes with blue edges are the predicted values, while the red boxes are the original values. Overlapping blue and red boxes indicate a correct prediction at a time interval. A snapshot of Fig. 4 is presented in Fig. 3. Here the blue line between blue markers corresponds to state transitions between predicted states, while the red line between red markers stands for

TABLE VII
OBSERVATION MATRIX (B)
Observation Probability
4
5
6
7
8
0.038 0.036 0.019 0.020 0.008
0.117 0.100 0.097 0.045 0.035
0.120 0.000 0.080 0.040 0.000

9
0.003
0.017
0.000

10
0.002
0.007
0.000

11
0.001
0.047
0.000

12
0.001
0.010
0.000

13
0.002
0.012
0.000

14
0.003
0.002
0.000


Fig.3: Actual states and predicted states by HMM (a closer snapshot of Fig. 4)

Fig.4: Actual states and predicted states by HMM for 115 days.

transitions between actual states. We see in TABLE VIII that as we increase the training data size from 7 days to 25 days and then to 60 days, the prediction accuracy for the next 115 days increases from 79% to 86% to 88% respectively.
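The accuracy figures above follow the definition given earlier, correct predictions of state status per 100 predictions; as a small sketch (the helper name is ours):

```python
def prediction_accuracy(predicted, actual):
    """Correct state predictions per 100 predictions, as defined in the text."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# 3 of 4 state statuses predicted correctly
acc = prediction_accuracy([1, 2, 3, 1], [1, 2, 1, 1])
```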

Thus our first-phase calculation involved predicting the state status (hidden sequence) of an access point by observing the traffic load (observed sequence).

VII. MARKOV MODEL IMPLEMENTATION AND RESULTS

We implemented the Markov Model on the second data set. For this purpose, we assume the state values stated in TABLE III: a total of 0-5 connections with an AP is considered the low state with index 1, 6-10 connections is termed the mid state with index 2, and more than 10 connections is termed the high state with index 3.
As mentioned in equation (4), for a first-order Markov process the probability of a state at time t+1 is a function only of the previous state. For our three-state model we use a first-order Markov model. Assuming that the model is in state k at time t, the next state is to be predicted. The first step is to generate a uniformly distributed random variable, U1, which is used to determine the new state. Given the transition matrix A of the model, the kth row of A is designated by the vector

A_k = [a_k,1  a_k,2  ...  a_k,i  a_k,i+1  ...  a_k,N]

The cumulative probabilities associated with A_k are denoted β_k and are given by

β_k,i = Σ_{j=1}^{i} a_k,j    (9)

The probability of making a transition from state k to state i is, by definition, a_k,i, and is given by

a_k,i = β_k,i - β_k,i-1    (10)

This probability is represented by the shaded area in Fig. 5.

Fig.5. Transition probability


A[3][3] is the transition matrix
N is the total number of time steps
STATE <- i                              // initial state
FOR T <- 1 TO N
    U1 <- Random(1)                     // uniform random number in [0, 1)
    Cum_Sum <- [0, CUMSUM(A(STATE, :))] // cumulative probabilities, eq. (9)
    FOR I <- 1 TO Total_States
        IF U1 >= Cum_Sum(I) AND U1 < Cum_Sum(I+1)
            STATE <- I
        END IF
    END FOR
    State_Seq[T] <- STATE
END FOR

Fig.6. Pseudo code for the simulation procedure
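The procedure of Fig. 6, together with equations (9) and (10), amounts to inverse-transform sampling over a row of A; a minimal Python sketch (the names are ours):

```python
import random

def simulate_states(A, initial_state, n_steps, rand=random.random):
    """Simulate a state sequence from transition matrix A (rows sum to 1),
    following Fig. 6: draw U1 and pick the state whose cumulative
    probability interval contains U1."""
    state = initial_state
    seq = []
    for _ in range(n_steps):
        u = rand()          # uniform U1 in [0, 1)
        cum = 0.0
        for j, a in enumerate(A[state]):
            cum += a        # beta_{k,j}: cumulative probability, eq. (9)
            if u < cum:     # U1 falls in the interval of width a_{k,j}, eq. (10)
                state = j
                break
        seq.append(state)
    return seq
```

With a deterministic matrix (rows of zeros and ones) the sampler simply alternates states, which makes the procedure easy to check by hand.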


The implementation of the simulation procedure is
realized using the algorithm stated in Fig.6.
We predicted the future state sequence by varying
present state values of different sets mentioned in table
IX. Initially, we used the state sequences of first 7 days
as our training data set and predicted state values for the
rest 93 days. We have run both the Markov O(1) and
O(2) predictor on the data set. It is seen that with O(1)
predictor, the predictor can produce a state sequence that
matches 69% states of the original sequence, whereas the

O(2) predictor shows an accuracy of 69% in prediction.


We have varied our training data size in different chunks
and made prediction for a maximum of 300 days shown
in table IX. We can see that a continuous training data of
first 50 days can produce the best prediction of the next
50 days with an efficiency of 95%. However, O(2)
predictor shows a little less efficiency of 93%. Figure 9,
Figure 10, and Figure 11 gives a comparative analysis of
O(1) and O(2) predictor. Figure 9 shows that as we
increase our training data size from 10 to 40, prediction
accuracy shown by the two predictors are equal. When
the training set size exceeds almost to 60 days, O(1)
achieves better accuracy compared to O(2) predictor.
Figure 10 shows that initially with increasing training set
size O(2) predictor shows a better accuracy with respect
to O(1) up to a training set size of 50 days. Afterwards,
O(1) shows a better result. Figure 11 shows similar
performance by both predictors up to a training data size
of 75. Around training set size of 75 days, O(1) shows
better performance. Here the training set consisted of data
from different interval. For example, training set was
composed of information of first 25 days, Day 100 to Day
125 and Day 226 to Day 250, shown in Table IX. An
increase of data size from 75 onwards show that O(1)
predictors prediction accuracy increases, while that of
O(2) decreases.
Next, we have predicted the future state sequences

Fig.7: Actual states and predicted states by MM (a closer snapshot of Fig. 8)

Fig.8. Actual states and predicted states by MM for 115 days.


TABLE IX
TRAINING DATA, PREDICTED DATA, AND ACCURACY OF MARKOV O(1) AND O(2) PREDICTORS

No  Training Set                                                Predicted Set       O(1) Accuracy  O(2) Accuracy
1   Day 1 to Day 7                                              Day 8 to Day 100    69.05%         69.23%
2   Day 1 to Day 25                                             Day 26 to Day 100   87.61%         87.46%
3   Day 1 to Day 40                                             Day 41 to Day 100   91.80%         91.99%
4   Day 1 to Day 50                                             Day 51 to Day 100   95.55%         93.29%
5   Day 1 to Day 25                                             Day 26 to Day 200   75.05%         75.10%
6   Day 1 to Day 40                                             Day 41 to Day 200   75.50%         77.22%
7   Day 1 to Day 25 and Day 100 to Day 125                      Day 26 to Day 200   78.77%         78.74%
8   Day 1 to Day 25                                             Day 26 to Day 300   69.06%         68.99%
9   Day 1 to Day 25 and Day 100 to Day 125                      Day 26 to Day 300   72.22%         72.22%
10  Day 1 to Day 25, Day 100 to Day 125 and Day 200 to Day 225  Day 26 to Day 300   69.86%         69.87%
11  Day 1 to Day 25, Day 100 to Day 125 and Day 226 to Day 250  Day 26 to Day 300   84.62%         66.81%

found from the hidden state sequences in Section VI. The predictions are shown in Fig. 8. Correctly predicted values have overlapping white and red markers. State transitions are also shown using blue and red lines.
A snapshot of the full diagram is shown in Fig. 7. The state status index falls between 1 and 2 mostly. We have used the O(1) predictor, since it shows better performance in terms of correctness of prediction. In TABLE X, we show that the output of the HMM prediction has been used as the input training data for the MM in the next phase. For example, with a training set size of 7 days for the HMM, a

Fig.9. Markov O(1) and O(2) predictors' accuracy for 100 days.

Fig.11. Markov O(1) and O(2) predictors' accuracy for 300 days.

TABLE X
PREDICTION ACCURACY OF MM TAKING PREDICTION DATA OF HMM AS INPUT

Training Set  Prediction Set  Accuracy  Training Set  Prediction Set  Accuracy
Size (HMM)    Size (HMM)      (HMM)     Size (MM)     Size (MM)       (MM)
7             30              84.72%    30            115             83.89%
25            45              88.59%    45            115             86.66%
25            60              88.76%    60            115             86.51%
60            60              91.52%    60            115             88.71%

prediction of data size 30 has been made. Later, this data set has been used as the training data for the O(1) Markov Model, which produced a prediction of data size 115. In TABLE X, we see that as we increase the training set size from 7 to 60 days, the prediction efficiency of the HMM increases from 84% to 91%; however, when this information is fed to the MM, the corresponding efficiency increases only from 83% to 88%.

VIII. CONCLUSION AND FUTURE WORKS


In this paper, we have tried to find the present load scenario of a wireless AP in a wireless network. Later, we predicted the future load of the AP for several consecutive days. In the first phase, we used the Hidden Markov Model. Traffic traces at each AP have been used as the observation sequence. The states of the AP have been determined by

Fig.10. Markov O(1) and O(2) predictors' accuracy for 200 days


observing the traffic load. In the next stage, this state


sequence has been extended to forecast the next state of
AP. It is also shown the choice of training data set affects
prediction. Proposed results show that this method can be
used to improve resource allocation in wireless network.
Different methods may be employed to predict the future state of an AP; a comparison with other techniques is an issue for future investigation. Moreover, this technique can be applied to congestion control, routing, or resource allocation schemes, and future studies may be carried out with respect to these applications.
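The first-phase step summarized above, inferring AP states from an observed traffic-load sequence, is typically done with Viterbi decoding of the HMM. The sketch below is illustrative only: the two AP states (0 = lightly loaded, 1 = heavily loaded), the binned traffic observations, and all probabilities are invented for the example, not taken from the paper's data.

```python
# Viterbi decoding in the log domain: most likely hidden state path
# given an observation sequence and HMM parameters (pi, A, B).
import math

def viterbi(obs, pi, A, B):
    """Return the most likely state sequence for the observations."""
    n = len(pi)
    delta = [math.log(pi[s]) + math.log(B[s][obs[0]]) for s in range(n)]
    psi = []  # backpointers, one list of best-predecessor indices per step
    for o in obs[1:]:
        back, new_delta = [], []
        for j in range(n):
            scores = [delta[i] + math.log(A[i][j]) for i in range(n)]
            best = max(range(n), key=lambda i: scores[i])
            back.append(best)
            new_delta.append(scores[best] + math.log(B[j][o]))
        psi.append(back)
        delta = new_delta
    # Backtrack from the best final state.
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for back in reversed(psi):
        state = back[state]
        path.append(state)
    return path[::-1]

pi = [0.6, 0.4]                 # illustrative initial state distribution
A = [[0.7, 0.3], [0.4, 0.6]]    # illustrative AP state transitions
B = [[0.9, 0.1], [0.2, 0.8]]    # illustrative P(traffic bin | AP state)
print(viterbi([0, 0, 1, 1], pi, A, B))  # → [0, 0, 1, 1]
```

In the paper's setting, the decoded state path plays the role of the "state sequence of wireless access points" that is then fed to the Markov predictor.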


MD. Osman Gani received the B.Sc. Engg. degree from the Dept. of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh, in 2008. His research interests include Wireless Communication and Pattern Recognition.

Hasan Sarwar received the B.Sc. Engg. degree from the Dept. of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, in 1995, the M.Phil. degree from the Dept. of Applied Physics, Electronics and Communication Engineering, University of Dhaka, Bangladesh, in 2002, and the Ph.D. degree from the same department in 2006. He is currently an Associate Professor and Head of the Dept. of Computer Science and Engineering, United International University, Dhaka, Bangladesh. His research interests include Wireless Communication, Simulation, Traffic Analysis, Pattern Recognition, and Solid State Materials. He is a member of the IEEE.

Chowdhury Mofizur Rahman received the B.Sc. Engg. degree from the Dept. of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, in 1989, the M.Sc. Engg. degree from the Dept. of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, in 1992, and the Ph.D. degree from the Dept. of Computer Science, Tokyo Institute of Technology, Japan, in 1996. He is currently Professor and Pro Vice-Chancellor of United International University, Dhaka, Bangladesh. His research interests include Artificial Intelligence, Data Mining, and Natural Language Processing. He is a member of the IEEE.
