Fuzzy c-means clustering identification method of urban road traffic state
Abstract-Urban road state identification refers to determining the operation status of the road network system, which plays an important role in urban road traffic management. By clustering time series of traffic flow, algorithms that recognize typical fluctuation patterns of traffic flow can obtain the operation states of the urban road network. As the detected traffic data contain vague and uncertain information, preprocessing is needed. An improved fuzzy c-means (FCM) clustering method is proposed in this paper. A case study based on an urban road section in Beijing demonstrates the feasibility and effectiveness of the improved FCM algorithm.

Keywords-Urban Road, Traffic State Identification, Clustering, Fuzzy c-Means

I. INTRODUCTION

Traffic state refers to the operation status of the urban road traffic network, and traffic state identification means recognizing the traffic state by using mathematical models.

With the growth of cities and of the total number of motor vehicles, traffic congestion and energy consumption are becoming the norm. Trustworthy traffic information and appropriate traffic-control strategies can only be produced on the basis of an accurate judgment of the urban road traffic state. Research on urban road traffic state identification methodology therefore plays an important role in the practice of managing urban traffic flow.

Through years of research and practice, scholars have developed various methods to identify the urban road traffic state. The automatic traffic state detection algorithm, also known as the Automatic Incident Detection (AID) algorithm, was first developed by the California Department of Transportation in the 1960s. It was modeled on the fact that when an event occurs, occupancy increases in the upstream detection section and decreases in the downstream detection section. In addition, flow volume, speed, and other parameters measured by detectors also form the basis of AID algorithm research.

The research contents and methods of AID have changed considerably as road traffic systems have become more and more complex. A series of new AID methods based on ANN (artificial neural network) [1~4], wavelet analysis [5], SVM (Support Vector Machine) [6] and fuzzy theory subsequently came into being in the 2000s. The main idea of such methods is to evaluate different traffic states by clustering the data of traffic flow parameters. All the above studies have provided premises and laid a solid foundation for this paper.

As the road traffic system is a complex system, traffic flow parameters change randomly over time under the influence of factors such as the level of urban economic development, travel modes, driving habits, weather and other environmental factors, which may cause errors during data detection. The uncertainty of traffic flow data directly results in fuzziness and uncertainty in representing the traffic state. The boundary between two given traffic flow states cannot be drawn clearly. Thus people can only approximately describe the traffic state with fuzzy concepts such as congestion, slowing down and unblocked. The key part of an AID algorithm is to distinguish each state in the state set by clustering the traffic data.

After fuzzy theory was introduced by Zadeh, researchers applied it to clustering. Fuzzy algorithms can assign a data object partially to multiple clusters. The degree of membership in the fuzzy clusters depends on the closeness of the data object to the cluster centers [7]. In experimental data processing, fuzzy clustering theory has an advantage over traditional statistical theory, namely that its results sufficiently embody the inherent properties of a system with less experimental data and unknown systemic probability [8]. So methods based on fuzzy processing technology began to play an increasingly important role in automatic road traffic incident identification [9~14].

The fuzzy c-means (FCM) algorithm is a clustering algorithm based on fuzzy partitions [15, 16]. It is designed to maximize the similarity between objects in the same cluster while minimizing the similarity between objects in different clusters. FCM usually describes the clustering problem as a constrained nonlinear programming problem, and obtains the optimal fuzzy partition and clustering results through iterative optimization. In addition, FCM is used because it requires smaller training sets and shorter training times. Moreover, it is compatible with future research needs to compare other systems based on cluster variances [17]. It is also useful for specifying the number and shape of membership functions, which are derived from the distribution of data points [18]. Thus, it is an ideal algorithm for traffic state identification [19].

For this reason, we adopt the fuzzy c-means (FCM) algorithm to identify traffic state in this paper. The key of using
The objective function of the FCM algorithm is:

$$J_m(U, V_1, \ldots, V_c) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2} \qquad (2)$$

In formula (2), $J_m(U, V)$ is the objective function, $V = [V_1, \ldots, V_c]$ is the set of clustering centers, $m \in [1, +\infty)$ is the fuzzy weighting index, and $d_{ij} = \|V_i - x_j\|$ is the distance between the j-th data point and the i-th clustering center.

An iterative procedure is used to obtain the membership matrix U and the clustering centers V. When the constraint conditions are satisfied, they can be solved with the method of Lagrange multipliers, as shown in (3) and (4).

$$V_i^{(r+1)} = \frac{\sum_{j=1}^{n} \left(u_{ij}^{(r)}\right)^{m} x_j}{\sum_{j=1}^{n} \left(u_{ij}^{(r)}\right)^{m}} \qquad (6)$$

Step 4: If $\|V^{(r+1)} - V^{(r)}\| \le \varepsilon$, terminate the algorithm and output the membership matrix U and the clustering centers V; otherwise, set $r \Leftarrow r + 1$ and go back to Step 2, repeating until the termination condition is met.

For the preceding algorithm, it is acceptable to initialize the membership matrix first, and then perform the iteration

B. Research on m and c Optimization

In this section, we introduce an effective criterion to select the optimal parameters of the FCM. A clustering algorithm is designed to classify the given data sample set so as to maximize the distance between clusters while minimizing the distance between data samples within a cluster. The indicator Xie-Beni
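The iteration built around formula (6) and the Step 4 stopping test can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `fcm`, the helper `dist`, and the choice of evenly spaced samples as initial centers are hypothetical. The membership update uses the standard FCM rule, which is assumed here to correspond to formula (5), and the norm in Step 4 is assumed to be the Euclidean norm over all center coordinates.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def fcm(points, c, m=2.0, eps=1e-5, max_iter=300):
    """Fuzzy c-means sketch. Returns (U, V), where U[i][j] is the membership
    of point j in cluster i and V[i] is the center of cluster i."""
    n, dim = len(points), len(points[0])
    # Assumption: initial centers are evenly spaced samples (deterministic).
    V = [list(points[(i * n) // c]) for i in range(c)]
    U = [[0.0] * n for _ in range(c)]
    for _ in range(max_iter):
        # Step 2: membership update (standard FCM rule, assumed for formula (5)).
        for j, x in enumerate(points):
            d = [dist(V[i], x) for i in range(c)]
            on_center = [i for i in range(c) if d[i] == 0.0]
            for i in range(c):
                if on_center:
                    # A point lying exactly on a center belongs fully to it.
                    U[i][j] = 1.0 / len(on_center) if i in on_center else 0.0
                else:
                    U[i][j] = 1.0 / sum(
                        (d[i] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c)
                    )
        # Step 3: center update, formula (6).
        V_new = []
        for i in range(c):
            w = [U[i][j] ** m for j in range(n)]
            total = sum(w)
            V_new.append(
                [sum(w[j] * points[j][k] for j in range(n)) / total
                 for k in range(dim)]
            )
        # Step 4: stop once the centers move by no more than eps.
        shift = math.sqrt(sum(dist(a, b) ** 2 for a, b in zip(V, V_new)))
        V = V_new
        if shift <= eps:
            break
    return U, V
```

Calling `fcm(points, c=2)` on a list of coordinate tuples returns the membership matrix and the centers; each column of `U` sums to 1, which is the constraint under which the Lagrange-multiplier solution is derived.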
So, formula (7) is used as the criterion for the selection of m and c. In addition, we improve the above-stated FCM algorithm as follows:

Step 1: Define the initial values of the number of clusters c and the fuzzy weighting index m, which are $c_0$ and $m_0$, respectively, and set k to 0; define the iteration termination threshold $\varepsilon$ and the initial clustering center $V^{(0)}$, and set r to 0.

Step 2: Calculate the membership matrix $U^{(r)}$ based on formula (5).

Step 3: Update the clustering center $V^{(r+1)}$ based on formula (6).

Step 4: If $\|V^{(r+1)} - V^{(r)}\| \le \varepsilon$, output the membership matrix $U^{k}$ and the clustering center $V^{k}$; otherwise, set r to r + 1 and go back to Step 2.

Step 5: Calculate the value of the clustering validity function based on formula (7), denoted $v_{XB}^{k}$. If $v_{XB}^{k} \ge v_{XB}^{k-1}$, terminate the algorithm and output the membership matrix $U^{k-1}$ and the clustering center $V^{k-1}$; otherwise, set $k \Leftarrow k + 1$, choose new values $c_k$ and $m_k$ for the number of clusters and the fuzzy weighting index, respectively, and go to Step 1.

III. CASE STUDY

In this part, we perform a case study to validate the effectiveness of the improved FCM algorithm. A four-lane section of the freeway on the South 4th Ring Road in Beijing is selected as the research subject. Data recorded every two minutes from 6:00 a.m. to 9:00 p.m. on June 30, 2011, were obtained from the traffic flow database and used to test the algorithm. We first preprocess the collected data according to the above algorithm, and then apply the improved FCM algorithm for clustering. Finally, the result is analyzed.

6:00    16.29   31.97    2.00
6:02    18.23   37.78    2.26
6:04    18.69   36.35    3.55
6:06    20.21   39.54    4.14
6:08    21.27   54.15    4.83
…       …       …        …
13:22   38.50   38.59   21.71
13:24   37.97   37.70   21.43
13:26   39.93   32.89   23.43
13:28   41.38   31.97   24.72
13:30   40.66   22.79   24.65
…       …       …        …
20:50   30.15   57.79   13.25
20:52   28.60   60.02   11.58
20:54   27.97   59.63   10.63
20:56   28.58   68.58   10.83
20:58   26.61   59.70    8.03

B. FCM Clustering and Result Analysis

(1) Definition of the values of m and c

First, it is necessary to define the value range of the fuzzy weighting index m and the number of clusters c.

According to existing research, the value range of m is (1, 2.5]. With a heuristic method, we search m over the range (1, 2.5] with an increment of 0.3, that is, m ∈ {1.3, 1.6, 1.9, 2.2, 2.5}.

For the value of c, the rule $c_{max} \le 2 \ln n$ is followed. In this paper, there are 450 traffic data samples, so $c_{max} \le 2 \ln n = 2 \ln 450 = 12.22$. As a result, c is a positive integer between 2 and 12.

Second, it is necessary to calculate the clustering validity function $v_{XB}(m, c)$ for different m and c values, to determine the ideal m and c values. Set the iteration termination threshold
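The candidate grid for m and c described above, together with a validity score to rank the candidates, can be sketched as follows. This is a hedged illustration: formula (7) is not reproduced in this excerpt, so the function below uses the commonly cited generalized Xie-Beni index (compactness with exponent m, divided by n times the minimum squared center separation), which is an assumption rather than the paper's exact definition. The names `xie_beni`, `c_values`, `m_values` and `candidates` are hypothetical.

```python
import math

n_samples = 450                                  # samples in the case study
c_max = math.floor(2 * math.log(n_samples))      # 2 ln 450 is about 12.22
c_values = list(range(2, c_max + 1))             # c = 2, ..., 12
m_values = [round(1.3 + 0.3 * k, 1) for k in range(5)]   # 1.3, 1.6, ..., 2.5
candidates = [(m, c) for m in m_values for c in c_values]

def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def xie_beni(points, U, V, m):
    """Generalized Xie-Beni index (assumed form); smaller values indicate
    more compact and better separated clusters."""
    n, c = len(points), len(V)
    compactness = sum(
        U[i][j] ** m * sq_dist(points[j], V[i])
        for i in range(c) for j in range(n)
    )
    separation = min(
        sq_dist(V[i], V[k]) for i in range(c) for k in range(i + 1, c)
    )
    return compactness / (n * separation)
```

With this grid there are 5 values of m and 11 values of c, so 55 (m, c) pairs would be scored; the pair with the smallest index value would be preferred under the assumed definition.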
127   49.05   18.41   41.66   0.0394   0.0613   0.1027   0.1878   0.6089   Congested
161   44.58   42.21   26.92   0.0296   0.0776   0.7326   0.1077   0.0526   Mildly Congested
331   43.85   30.25   29.25   0.0103   0.0199   0.0527   0.8635   0.0536   Moderately Congested
408   31.28   69.46   10.08   0.8683   0.0643   0.0311   0.0208   0.0156   Very Smooth

$\mu_1, \mu_2, \mu_3, \mu_4, \mu_5$ represent the memberships of Very Smooth, Smooth, Mildly Congested, Moderately Congested, and Congested, respectively. For sample point No. 25 in Table 3, its membership in Smooth is 0.8833, greater than its memberships in the other four states, so the corresponding state of No. 25 is Smooth. In this way we obtain the identification results for all sample points, shown in Table 4. The result is essentially in agreement with the real states on the road. Fig. 2 shows the traffic state of each sample point obtained through clustering.

Table 4. The Clustering Result of Traffic States

Traffic states      Very Smooth   Smooth    Mildly Congested   Moderately Congested   Congested
Number of samples   35            105       115                117                    78
Proportion          7.78%         23.33%    25.56%             26.00%                 17.33%
Error rate

Fig. 2. Traffic state diagram
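The maximum-membership rule used to read a state off each row of Table 3, and the proportions in Table 4, can be checked with a short Python sketch. The membership values and sample counts are copied from the tables above; the names `STATES`, `identify`, `labels` and `proportions` are hypothetical and serve only as illustration.

```python
# State names in the order of the memberships mu_1 ... mu_5.
STATES = ["Very Smooth", "Smooth", "Mildly Congested",
          "Moderately Congested", "Congested"]

# Membership rows for four sample points, taken from Table 3.
memberships = {
    127: [0.0394, 0.0613, 0.1027, 0.1878, 0.6089],
    161: [0.0296, 0.0776, 0.7326, 0.1077, 0.0526],
    331: [0.0103, 0.0199, 0.0527, 0.8635, 0.0536],
    408: [0.8683, 0.0643, 0.0311, 0.0208, 0.0156],
}

def identify(mu):
    """Return the state whose membership is the largest."""
    best = max(range(len(mu)), key=lambda i: mu[i])
    return STATES[best]

labels = {no: identify(mu) for no, mu in memberships.items()}

# Sample counts per state from Table 4 and their share of all samples.
counts = [35, 105, 115, 117, 78]
proportions = [round(100 * k / sum(counts), 2) for k in counts]
```

The counts sum to 450, the number of samples in the case study, and the computed shares reproduce the Proportion row of Table 4.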
IV. CONCLUSIONS
In this paper, FCM algorithm is improved to identify urban