Cluster and Calendar Based Visualization of Time Series Data

This document presents a new method for visualizing univariate time series data by clustering similar daily patterns and displaying them on a calendar. The approach aims to identify patterns and trends across multiple time scales, addressing challenges such as large data sets and varying time scales. Applications include analyzing employee presence and energy consumption, with a focus on interactive exploration and pattern detection.

To be presented at the IEEE Symposium on Information Visualization (INFOVIS’99), San Francisco, October 25-26, 1999

Cluster and Calendar based Visualization of Time Series Data

Jarke J. van Wijk
Eindhoven University of Technology
Dept. of Mathematics and Computing Science
P.O. Box 513, 5600 MB Eindhoven
The Netherlands
[email protected]

Edward R. van Selow
Neth. Energy Research Foundation ECN
P.O. Box 1, 1755 ZG Petten
The Netherlands
[email protected]

Abstract

A new method is presented to get insight into univariate time series data. The problem addressed here is how to identify patterns and trends on multiple time scales (days, weeks, seasons) simultaneously. The solution presented is to cluster similar daily data patterns, and to visualize the average patterns as graphs and the corresponding days on a calendar. This presentation provides a quick insight into both standard and exceptional patterns. Furthermore, it is well suited to interactive exploration. Two applications, numbers of employees present and energy consumption, are presented.

1 Introduction

Time series data are ubiquitous. The aim of time series analysis is to obtain insight into phenomena, to discover repetitive patterns and trends, and to predict the future. We focus here on the analysis of univariate time series data. Suppose we have collected energy consumption or air pollution data at short time intervals during one year; how can we then extract information from these data?

In the next section we discuss the problem and consider various solutions. Current methods fall short in the analysis of time series data at the various time scales, such as years, weeks, and days. Our new approach is based on a combination of two methods: the use of cluster analysis (section 3) and the visualization of the result on a calendar (section 4). Several applications are presented. In section 5 the strengths and limitations are discussed.

2 Background

Time series data consist of a sequence of $N$ pairs $(y_i, t_i)$, $i = 1, \ldots, N$, where $y_i$ is the measured value of a quantity at time $t_i$. They are the simplest type of data to be analyzed, much simpler than for instance flow data, which consist of a mix of scalar and vector quantities on a multi-dimensional grid. Visualization is trivial: just draw a graph. So what's the problem?

The first problem is that $N$ can be very large. For instance, measurement at 10-minute intervals during a year yields 52,560 values. The second is that repetitive data patterns often have different time scales. For our applications we usually distinguish three time scales: seasons, weeks, and days. Human activities can vary strongly over these time scales, and hence so can the related measured quantities. The third is that clear a priori hypotheses are rarely available. Hence, the user wants to have an overview first; subsequently he may want to zoom in on data and detect peculiar patterns or subsequences, and so on.

How can we analyze time series data? The first approach is to use mathematical models. A well-known method is the ARIMA model of Box and Jenkins [1]. This stochastic model can be used to predict future values, and for an expert, its coefficients give some insight into its time-dependent behavior. But in general the multi-scale aspect is not addressed and, as the counterpart of the very high compression, details are lost.

Transformation from the time domain to a scale space directly addresses the multiple scales that are present in the data. Fourier transforms, wavelet transforms, and fractal analysis [2] are conceivable approaches. They are most suited when the dominant frequencies or time scales are unknown. However, for the type of applications discussed here it is often known a priori that patterns will have a scale of days or weeks, hence such methods are too general. Furthermore, the result after transformation, defined over a frequency or scale-space domain, is much harder to interpret.

Another approach is to use the dependency on time scales explicitly, by considering the data as two-dimensional, for instance as $f(day, hour)$. The data can then be displayed as so-called fingerprints. The days and hours are mapped on different axes, and the data are visualized via color [3]. In addition to color, the third dimension can be used to display the data, yielding a mountain landscape. As an example, figure 1 shows the power demand data of a research facility (in this case ECN). Such images show all data simultaneously. Seasonal trends can be discerned, as well as the typical day pattern. Yet, the variation over the week is harder to discern and the day patterns of Saturdays and Sundays are obscured. Furthermore, in order to see the trends smoothing had to be used, but this eliminates fine details.

Figure 1. Power demand by ECN, displayed as a function of hours and days (axes: days, 1 Jan. to 17 Dec.; hours, 0:00 to 24:00; power, 0 to 2000 kW)

A simple way to get an overview is to average the data. For instance, temperature data over a year can be displayed as a graph of the average daily temperature, combined with a grey-shaded band to show the variation over the day [4]. However, if the data follow a weekly pattern, this technique is less useful, and any pattern within a day is not shown.

This can be overcome by showing multiple graphs. We could show the average day pattern for each month, for each day of the week, and so on. However, information is lost here too. As an example, many data patterns on holidays show strong similarities to data patterns on Sundays. If the data for each weekday are averaged separately, the holidays will disturb the results. To get more precise information and to eliminate cross-over of the various effects, we could make graphs for combinations of time scales, ranging from Sundays in January to Saturdays in December. As a result, the number of graphs to investigate becomes overwhelmingly large, and the difficulty arises how to combine graphs properly and how to extract information.

Let us take a step back. What do we want? We want to elucidate which standard day patterns occur, and how they are distributed over the year and over the week. Furthermore, we want to detect days with patterns that strongly deviate from these standard patterns. If we use multiple graphs, as suggested before, it is implicitly assumed that there is a fixed relation between the distribution of patterns over the months and weekdays. In general, this assumption cannot be tested a priori. An alternative is to drop this assumption, and let the analysis tool decide which daily patterns are similar and show their distribution over the year. This is the basis of our approach: cluster analysis, combined with a calendar based visualization.

3 Cluster analysis

Our aim is to merge similar day patterns into clusters, such that the day patterns within a cluster are more similar to each other than to the day patterns in other clusters. Each cluster contains an average day pattern. To this end, a simple and straightforward bottom-up clustering algorithm suffices [5]. We split the time series data into a sequence of $M$ day patterns. Each day pattern $Y_j$, $j = 1, \ldots, M$, consists of a sequence of pairs $(y_i, t_i)$, $i = 1, \ldots, N$, where $y_i$ denotes the measured value
and $t_i$ the time that has elapsed since midnight.

We start with $M$ clusters, each cluster containing one day pattern. Next, we compute the mutual differences between all clusters, and merge the two clusters which are most similar into a new cluster. As a result, we have $M - 1$ active clusters. This step of merging small clusters into larger clusters is repeated until a single cluster results, which contains the average of all day patterns. To speed up the clustering procedure, the calculated differences between clusters are stored in a table, which only has to be updated for new clusters. The result of this algorithm is a binary tree of $2M - 1$ clusters.

Various distance measures can be used. Suppose that we have two day patterns $y_i$ and $z_i$, $i = 1, \ldots, N$. A simple measure is the average geometric distance, or root-mean-square distance:

$$d_{rms} = \sqrt{\sum_i (y_i - z_i)^2 / N}.$$

This measure is robust and usually yields good results. If we want to cluster patterns with similar shapes, a normalized version can be used:

$$d_{nm} = \sqrt{\sum_i (y_i / y_{max} - z_i / z_{max})^2 / N}.$$

Here the measured values are normalized via division by the maximum value in the sequence. If we want to eliminate slow trends, we have to subtract the average difference. This means that we consider two patterns as equal if they are the same, except for an offset:

$$d_{sh} = \sqrt{\sum_i (y_i - z_i - \Delta)^2 / N},$$

with

$$\Delta = \sum_i (y_i - z_i) / N.$$

If we are interested in peak values, we can use:

$$d_{ma} = | y_{max} - z_{max} |.$$

We have experimented with several other measures, and found that the preceding measures gave the best results and provide an easy-to-use toolset.

4 Cluster visualization

Now that we have grouped the day patterns, how can we get insight into the result? A standard way to display the result of clustering is the use of dendrograms, as shown in figure 2. The bottom row shows the initial elements, and each next row shows how two clusters are combined. This works fine if the number of elements is small. For more than, say, a hundred clusters such images are much harder to grasp. Figure 3 shows a full clustering tree for 365 day patterns, which was generated by the well-known graph visualization package dot [6] without additional directives for the lay-out. Such an automatic tool yields the best result when the user does not supply lay-out directives that constrain the search for an optimal lay-out. Additional lay-out directives must be used to generate a dendrogram, with the days in their original order on the same row, and with each new cluster on a next row. This yields a highly cluttered image.

Figure 2. Dendrogram

Figure 3. Full clustering tree

This can be improved if only a selection of all clusters is displayed, where the user can browse through and zoom in and out on the clustering tree, in the same style as with a file browser. What we still lack then is insight into the distribution of the elements of the cluster over the year. What is needed is a visual representation such that the viewer can effortlessly determine whether similarity is due to the season, the day of the week, or some other correlation. Fortunately, such a representation already exists: a calendar.

4.1 Visualization

We have developed a combined representation of daily patterns and clusters. Patterns are shown as graphs, clusters are shown on a calendar. Colors indicate corresponding clusters and patterns. As an example, figure 4 shows a result of a cluster analysis of time series data on the number of employees present at ECN. The most significant seven clusters are shown. On the right, the average value per cluster is shown as a colored graph; on the left, each day in the calendar is colored according to the cluster to which it belongs.
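The bottom-up clustering of section 3, and the selection of a small number of clusters for display, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names (`d_rms`, `d_sh`, `cluster_days`, `top_clusters`) are hypothetical. The sketch assumes that the distance between two clusters is the distance between their average day patterns, that a merged average is weighted by the number of days in each cluster, and that the "most significant" clusters are found by undoing the last merges first.

```python
import math
from itertools import combinations


def d_rms(y, z):
    """Root-mean-square distance between two day patterns of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, z)) / len(y))


def d_sh(y, z):
    """Shape distance: RMS distance after subtracting the average difference."""
    delta = sum(a - b for a, b in zip(y, z)) / len(y)
    return math.sqrt(sum((a - b - delta) ** 2 for a, b in zip(y, z)) / len(y))


def cluster_days(patterns, dist=d_rms):
    """Merge M day patterns bottom-up into a binary tree of 2M - 1 clusters.

    Returns a list of nodes. Nodes 0..M-1 are the single days; every later
    node records its two children, its member days, and its average pattern.
    """
    nodes = [{"mean": list(p), "days": [j], "children": None}
             for j, p in enumerate(patterns)]
    active = list(range(len(nodes)))
    # Table of distances between all active clusters (assumption: distances
    # are measured between the average patterns of the clusters).
    table = {(a, b): dist(nodes[a]["mean"], nodes[b]["mean"])
             for a, b in combinations(active, 2)}
    while len(active) > 1:
        a, b = min(table, key=table.get)  # the two most similar clusters
        na, nb = nodes[a], nodes[b]
        wa, wb = len(na["days"]), len(nb["days"])
        mean = [(wa * p + wb * q) / (wa + wb)
                for p, q in zip(na["mean"], nb["mean"])]
        new = len(nodes)
        nodes.append({"mean": mean, "days": na["days"] + nb["days"],
                      "children": (a, b)})
        # Only entries involving the new cluster have to be recomputed.
        active = [c for c in active if c not in (a, b)]
        table = {k: v for k, v in table.items() if a not in k and b not in k}
        for c in active:
            table[(c, new)] = dist(nodes[c]["mean"], mean)
        active.append(new)
    return nodes


def top_clusters(nodes, k):
    """Pick k clusters top-down by undoing the most recent merges first."""
    chosen = [len(nodes) - 1]  # start at the root
    while len(chosen) < k:
        last = max(chosen)  # the most recently created cluster
        if nodes[last]["children"] is None:
            break  # only single days are left
        chosen.remove(last)
        chosen.extend(nodes[last]["children"])
    return sorted(chosen)
```

Calling `cluster_days` on a list of equally long day patterns and then `top_clusters(nodes, 7)` would give seven node indices whose `"days"` lists partition the year, which is the kind of input a calendar colouring needs. Rebuilding the distance table with a filter, as done here, is simple rather than fast.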
Figure 4. Calendar view of the number of employees (cluster viewer, 1997: on the left a calendar of the twelve months with one row per weekday, Monday to Sunday; on the right graphs of the number of employees, 0 to 600, against the hours 6:00 to 18:00, for the days 5/12/1997 and 31/12/1997 and for clusters 710, 718, 719, 721, and 722)
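The calendar layout of figure 4 can be approximated with a few lines of date arithmetic. The sketch below is an assumption about how such a layout could be computed, not a description of the ECN cluster viewer. The function name `calendar_cells` and the `cluster_of_day` mapping are hypothetical. Each day gets a month, a week column within that month, a weekday row (Monday first, as in the figure), and the cluster label that determines its colour.

```python
from datetime import date, timedelta


def calendar_cells(year, cluster_of_day):
    """Return one (month, column, row, cluster) tuple per day of the year.

    row is the weekday (0 = Monday ... 6 = Sunday); column is the index of
    the week within the month; cluster is the label from cluster_of_day,
    or None for a day that has no cluster assigned.
    """
    cells = []
    day = date(year, 1, 1)
    while day.year == year:
        first = date(year, day.month, 1)
        # Shift by the weekday of the first of the month so that every
        # column holds one Monday-to-Sunday week.
        column = (day.day - 1 + first.weekday()) // 7
        cells.append((day.month, column, day.weekday(), cluster_of_day.get(day)))
        day += timedelta(days=1)
    return cells


# Hypothetical usage: mark one exceptional day of 1997.
cells = calendar_cells(1997, {date(1997, 12, 5): "5/12/1997"})
```

A drawing routine would then map each label to a colour and paint one cell per tuple; days without a label stay uncoloured.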

Several conclusions can be drawn from this image. We see that:

• Office hours are followed strictly. Most people arrive between 8:30 and 9:00 am, and leave between 4:00 and 5:00 pm. Furthermore, in the morning the number of employees present is slightly higher than in the afternoon;

• On Fridays and in the summer fewer people are present (cluster 722);

• On Fridays in the summer even fewer people are present (cluster 718);

• In the weekend and on holidays only very few people are working (cluster 710): security and fire brigade;

• Holidays in the Netherlands in 1997 were January 1st, March 28th, March 31st, April 30th, May 5th, May 8th, May 19th, December 25th and 26th;

• School vacations are visible in Spring (May 3rd to May 11th), in Autumn (October 11th to October 19th), and in Winter (December 21st to December 31st);

• Many people take a day off after a holiday (cluster 721);

• On December 5th many people left at 4:00 pm. Dutch people will immediately know the explanation: on this day we celebrate Santa Claus and are allowed to leave earlier!

We see that for this distribution of patterns quite plausible explanations exist. The advantage of clustering is that none of these explanations has to be inserted a priori, such as separating working days and holidays, and all effects are elucidated automatically. The combined representation of average graphs and clusters enables a user to quantify these effects easily. Another strong point is that standard patterns (cluster 719) as well as exceptional patterns (December 5th) are detected automatically.

4.2 Interaction

For effective data exploration, user interaction is as important as presentation. The combination of cluster analysis with a calendar representation provides good opportunities for interaction. We have embedded our presentation in an interactive system for the analysis of time series data, such that the user can interact with the image presented to him (such as fig. 4) in many ways.

Selection of the data to be displayed can be done easily. Initially, no days are selected for display. The user can toggle days for display via point-and-click on a single day, on the label of a month, or on the label of the year. All days are then displayed as separate graphs. The user can point-and-click on a graph, upon which the corresponding day on the calendar is highlighted. Exceptional patterns are thus easy to locate.

When the user selects a day, a typical next question is which other days have a similar pattern. This is where the cluster analysis comes in. The user can select a day, and ask for more similar days via a single button press. The system determines the parent cluster, shows the average graph of this cluster and highlights the days within the cluster via color. This step can be repeated and reversed, so that the user can interactively enlarge and shrink the cluster to be displayed. Also, the user can select other days, and inspect several clusters simultaneously.

In addition to this bottom-up approach, the user can show clusters top-down. The user can select the number of clusters to be displayed, upon which the system generates a partitioning of the year as shown in figure 4. Via two more/less buttons, the user can add and remove clusters, until a meaningful decomposition is made.

The full clustering process itself is done initially and later on request of the user. Our non-optimized version takes about 5 seconds on a PC with a Pentium 100 MHz processor, which is quite acceptable for interactive use. The clustering tree is stored and re-used upon each query. Reclustering has to be done if the user wants to use a different distance measure. As an additional option, the user can reduce or enlarge the time interval upon which the comparison has to be made. For instance, if he finds in a graph a strange peak occurring between 9:00 am and 10:00 am, he can select this interval graphically, and ask for a reclustering using only this time interval and the $d_{sh}$ measure. As a result, all days with a similar peak in this time interval are clustered.

Many standard options are further provided for the display of the graphs. The user can zoom in and out, show the standard deviation for a cluster, and show each member of the cluster individually. Smoothing, with different filters and a user-controllable width, can be applied, which is useful if noisy data have to be processed. Clusters can be generated from these smoothed data. Some straightforward additional options could be fit easily within our framework. The use of the following distance measure:

$$d_{mn} = 5000\,| y_{mon}/6 - z_{mon}/6 | + 1200\,| y_{mon}/3 - z_{mon}/3 | + 400\,| y_{mon} - z_{mon} | + | y_{day} - z_{day} |,$$

where $y_{mon}$ and $y_{day}$ are the number of the month and the day respectively, gives a balanced clustering of the year in half-years, quarters of a year, months, etcetera. This enables a user to view standard averages and slow trends, using the same methods for interaction as with the content based clustering. With a slightly modified measure, a separation into weekdays can also be made.

Also, a simplified clustering method was implemented: starting at a selected day, all other days are added one after another in order of their distance to the initial day. This option is useful to determine stepwise whether certain patterns are exceptions or not, again using the same methods for interaction.

4.3 Application

The background of our interest in time series data is the liberalization of the energy markets. In the Netherlands, customers with very high energy consumption have recently been allowed to choose their gas and electricity supplier and negotiate a tailor-made tariff. Other customers will follow in the next few years. This will strongly enhance competition between the energy distribution companies, which will have to transform themselves from utility companies into market-oriented companies. Insight into consumption patterns is essential for the segmentation of their customer markets. But customers themselves also need insight into their energy consumption patterns in order to lower consumption, avoid peak rates, and negotiate a lower tariff.

Our aim is to develop methods, techniques, and tools that enable customers to analyze their energy consumption patterns easily and effectively. We started with a study of the electricity consumption at ECN itself. After collection of data several analysis and visualization methods were tried. The time series data analysis tool described here proved to be very helpful.

Figure 5 shows a cluster analysis of the power consumption. The five main clusters are shown here. During weekends power consumption was fairly constant. Furthermore, four clusters with about the same patterns but different plateau levels emerge. The correlation with the seasons is clearly visible. Finally, in the morning of February 4th a high peak demand occurred.

5 Discussion

We have presented a new method for the exploration and analysis of extensive time series data. The combination of cluster analysis and calendar based visualization turned out to be highly effective. Images such as figures 4 and 5, which provide good insight, can be generated almost effortlessly.

The next step to be made is the extension to the interactive visualization and analysis of several variables simultaneously. This enables a user to study correlations between variables, either manually or automatically. Detected correlations can lead the user in the direction of a suitable
model. Model parameters can subsequently be estimated by a regression method, and a statistical analysis of the model residuals will indicate the validity of the model. Adopting this procedure in the study of ECN energy consumption, a linear model was identified which could accurately predict the power consumption from the sunlight intensity and the number of employees [7]. We used different packages for this, integration of such methods in a single tool would be highly effective.

In conclusion, we think that our cluster and calendar based analysis is a useful method to explore and visualize large quantities of univariate time series data, and provides a sound basis for a general analysis tool.

Figure 5. Cluster analysis of power demand by ECN (cluster viewer, 1997: on the left a calendar of the twelve months with one row per weekday, Monday to Sunday; on the right graphs of power, 0 to 2200 kW, against the hours 0:00 to 24:00, for the day 4/2/1997 and for clusters 706, 714, 720, 722, and 723)

References

[1] Box, G.E.P. and Jenkins, G.M. Time Series Analysis: Forecasting and Control, 2nd edition, Holden-Day, 1976.

[2] Evertsz, C.J.G. Fractal Geometry of Financial Time Series. Fractals 3 (3), pp. 609-616, 1995.

[3] Keller, P.R. and Keller, M.M. Visual Cues, IEEE Press, Piscataway, NJ, USA, 1993, p. 53.

[4] Tufte, E.R. The Visual Display of Quantitative Information, Graphics Press, 1983.

[5] Kaufman, L. and Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley, 1990.

[6] Gansner, E.R., Koutsofios, E., North, S. and Vo, K-P. A Technique for Drawing Directed Graphs. IEEE Transactions on Software Engineering 19 (3), pp. 214-230, 1993.

[7] Selow, E.R. van, Wijk, J.J. van, Jehee, J.N.T. Identification and Visualization of Energy Consumption Patterns. In: Proceedings of DistribuTECH DA/DSM Europe, Pennwell, London, October 1998.

Meaningful Learning in The Practice
4 pages
Oracle Isetup - Frequently Asked Questions Faq
No ratings yet
Oracle Isetup - Frequently Asked Questions Faq
3 pages
IBP Road, Batasan Hills Quezon City: Pres. Corazon C. Aquino Elementary School
No ratings yet
IBP Road, Batasan Hills Quezon City: Pres. Corazon C. Aquino Elementary School
3 pages
Behavioral Finance Project
No ratings yet
Behavioral Finance Project
23 pages
Estimation of Welding Cost: by K.R.Prasanna Venkatesan WE0663
100% (1)
Estimation of Welding Cost: by K.R.Prasanna Venkatesan WE0663
41 pages
Cold Emailing Template
No ratings yet
Cold Emailing Template
3 pages
Supply Chain Management
No ratings yet
Supply Chain Management
21 pages
Geoprocessing ArcGIS
No ratings yet
Geoprocessing ArcGIS
5 pages
Design and Fabrication of Spring Loaded Fan
No ratings yet
Design and Fabrication of Spring Loaded Fan
10 pages
Saakx95lmk 075014
No ratings yet
Saakx95lmk 075014
150 pages
Think About or Reflect On Your Past. Has Your Past Influenced You in A Way or Another? How Does Your Past Shape Your Identity and Behavior?
No ratings yet
Think About or Reflect On Your Past. Has Your Past Influenced You in A Way or Another? How Does Your Past Shape Your Identity and Behavior?
2 pages
Insan Sase Service Review CLASS 2012: Biology No. 2
No ratings yet
Insan Sase Service Review CLASS 2012: Biology No. 2
2 pages
Water Rocket Challenge Competition Rules
No ratings yet
Water Rocket Challenge Competition Rules
2 pages
CF Exemplar NURS-FPX4020 Assessment 4
No ratings yet
CF Exemplar NURS-FPX4020 Assessment 4
13 pages
Olimpiade Bahasa Inggris
No ratings yet
Olimpiade Bahasa Inggris
9 pages
Step by Step of The SIPRED System
No ratings yet
Step by Step of The SIPRED System
32 pages
Group Assignment III
No ratings yet
Group Assignment III
7 pages
Checklist
No ratings yet
Checklist
8 pages
Assignment # 2 Database Management System: National University of Modern Languages
No ratings yet
Assignment # 2 Database Management System: National University of Modern Languages
5 pages
Theories On Language Learning - Vygotsky, Piaget, & Habit Formation
No ratings yet
Theories On Language Learning - Vygotsky, Piaget, & Habit Formation
30 pages
Pushp India CRED 2022
No ratings yet
Pushp India CRED 2022
42 pages
Econometrics I: Nicolás Corona Juárez, Ph.D. 4.11.2020
No ratings yet
Econometrics I: Nicolás Corona Juárez, Ph.D. 4.11.2020
45 pages

Figure 1. Power demand by ECN, displayed as a function of hours and days

This measure is robust and usually yields good results. If

Here the measured values are normalized via division by the maximum value in the sequence. If we want to eliminate slow trends, we have to subtract the average difference. This

same, except for an offset:

dsh = (yi − z i − 1)2 /N,
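The two adjustments mentioned above (division by the maximum value, and subtraction of the average difference) can be illustrated with a minimal sketch. It assumes a root-mean-square form for the base distance and is not a reproduction of the formulas in the paper, which are only partly legible here. The function names `rms_distance`, `max_normalized_distance`, and `offset_free_distance` are hypothetical, and y and z stand for two daily patterns of equal length N.

```python
import math


def rms_distance(y, z):
    """Root-mean-square distance between two equally long patterns (assumed base form)."""
    if len(y) != len(z) or not y:
        raise ValueError("patterns must be non-empty and of equal length")
    n = len(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, z)) / n)


def max_normalized_distance(y, z):
    """Divide each pattern by its own maximum value before comparing."""
    y_max, z_max = max(y), max(z)
    if y_max == 0 or z_max == 0:
        raise ValueError("cannot normalize a pattern whose maximum is zero")
    return rms_distance([a / y_max for a in y], [b / z_max for b in z])


def offset_free_distance(y, z):
    """Subtract the average difference, so that a constant offset is ignored."""
    if len(y) != len(z) or not y:
        raise ValueError("patterns must be non-empty and of equal length")
    n = len(y)
    mean_diff = sum(a - b for a, b in zip(y, z)) / n
    return math.sqrt(sum((a - b - mean_diff) ** 2 for a, b in zip(y, z)) / n)


if __name__ == "__main__":
    day_a = [1.0, 2.0, 3.0]
    day_b = [11.0, 12.0, 13.0]   # same shape as day_a, shifted by a constant
    day_c = [2.0, 4.0, 6.0]      # same shape as day_a, scaled by a constant
    print(rms_distance(day_a, day_b))
    print(offset_free_distance(day_a, day_b))
    print(max_normalized_distance(day_a, day_c))
```

Under these assumptions, two patterns that differ only by a constant offset have zero offset-free distance, and two patterns that differ only by a scale factor have zero distance after normalization by the maximum.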

4 Cluster visualization

4.1 Visualization

Figure 4. Calendar view of the number of employees

Figure 5. Cluster analysis of power demand by ECN
