Visualizing Music in Its Entirety Using Acoustic Features
MUSIC FLOWGRAM
Copyright: © 2016 Dasaem Jeong et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

ABSTRACT

In this paper, we present an automatic method for visualizing a music audio file from beginning to end, especially for classical music. Our goal is to develop an easy-to-use visualization method that is helpful for listeners and can be used for various kinds of classical music, even complex orchestral music. To represent musical characteristics, the method uses audio features such as volume, onset density, and auditory roughness, which describe loudness, tempo, and dissonance, respectively. These features are visually mapped onto a static two-dimensional graph, so that users can see at a glance how the music changes over time. We implemented the method with the Web Audio API so that users can access the visualization system in their web browser and make visualizations from their own music audio files. Two types of user tests were conducted to verify the effects and usefulness of the visualization for classical music listeners. The results show that it helps listeners memorize and understand the structure of music, and easily find a specific part of the music.

1. INTRODUCTION

Music visualization is widely used in various music activities for many purposes. Because music is an auditory art, its visual representations can contain information that cannot be transferred or preserved accurately with sound. Music notation is a typical example of a visual music representation. It is designed for communication between composers and performers. Notation systems have thus evolved to represent and deliver a composer’s intention as precisely as possible.

For listeners, however, music notation has some limitations. It contains too much information for listeners to interpret, so only a small part can be understood while following the music. In the case of orchestral music especially, score following is quite difficult unless the listeners are musically trained. Another problem is that the notation does not show the entire structure of a piece of music. The time span of a score that can be read at a glance is limited to a few measures. The notation is focused on delivering information about what is happening at a specific time. To understand the global structure, one needs to read through the score for a while, which requires a certain level of musical knowledge.

To make up for the shortcomings of music notation, audio-synchronized music scores have been developed [1]. Synchronized scores automatically follow the music on the score so that listeners can easily track the currently playing measure, or select a measure on the score to play the music from that position. However, such systems require a synchronization process between audio and score. Manual synchronization is too laborious for a large set of pieces, whereas automatic synchronization, an active research topic in music information retrieval (MIR), is not accurate enough, particularly for large orchestral music. Above all, these solutions still cannot show the entire structure of a piece.

In the case of classical music, particularly long instrumental pieces, visualizing information about the entire structure can be helpful to listeners because such pieces do not contain lyrics or clear storytelling to follow. Additional information about the music is therefore required to help listeners understand it. A traditional way of providing this information is giving a lecture or writing a program note, but these require professionals who can explain the music. Many researchers have instead suggested a content-based approach that visualizes the entire structure of music from audio. Most automatic music structure visualization methods are based on self-similarity between the parts of a piece [2, 3, 4]. These methods show the repetitive structure of the music based on the self-similarity. Further information about this research is given in Section 2. In general, it is not easy to interpret the meaning of these visualizations. Finding the repetitive structure can help music structure analysis, but its usefulness for listeners has not yet been verified.

To address this problem, we present an automatic music visualization method named music flowgram, which aims to visualize an entire piece as an easy-to-understand image. It extracts audio features from audio files and visualizes them on a static two-dimensional graph. In our previous research, we found that a simple static graph showing the change of volume of a music piece can help listeners concentrate more on classical music, compared to spectrum-based real-time visualization [5, 6]. We have improved this concept by adding features that can represent other important characteristics of music, and conducted user tests to verify its effect on listening to classical music.

The rest of this paper is organized as follows. First,
related work on visualizing music structure is briefly reviewed. Then, we present our visualization method in two sections: the concept of the visualization and its audio features in Section 3, and its implementation in Section 4. Detailed information about the user tests is given in Section 5, and the results with discussion in Section 6. The last section concludes the paper with a summary and our plan for future work.

2. RELATED WORK

There has been some research on visualizing the structure of a music piece, both in data visualization and in MIR. The majority of this work exploited the repetitive structure of music using self-similarity within a piece. Wattenberg visualized it using an arc diagram that connects repeated parts with edges drawn as semicircles [2]. Foote visualized the self-similarity as a two-dimensional matrix where each element is calculated from the similarity between two audio frames [3]. Müller and Jiang extended it to a scape plot representation that visualizes the repetitive structure with varying segment size. Other researchers combined this self-similarity information with volume transitions over time [7]. There has also been research that applies this structural information to music listening interfaces [8, 9].

Other than those based on repetitions in music, some work visualized the structure using tonality, such as key changes over multiscale segments [10]. Malt and Jourdan presented a visualization method using statistical characteristics of spectral information, including the spectral centroid and standard deviation of the audio spectrum [11]. They illustrated the change of this information over time on a two-dimensional graph, adding amplitude information as the color of the graph. However, most of the research mentioned above has not released an end-user application with which general users can render their own visualizations. Furthermore, this research lacks user tests or human-side experiments that verify its effect and usefulness for listeners.

Besides the automatic visualization methods using audio files or MIDI files, visualization of the semantic structure of music has also been proposed [12]. This method contains much more information than repetitive structure, for example, traditional structure analysis of sonata form, motif development, and how the role of each instrument changes through the piece. However, all of the information used in the visualization is manually extracted from written explanations of the music and cannot be automatically computed from audio files.

There is also music psychology research on visualizing whole pieces of music [13]. This research tested how people describe short pieces of music with graphical representations. Participants were asked to “make any marks” to describe five short orchestral works after listening to the music. The results showed that musically trained participants tended more to describe music with abstract representations such as symbols and lines. The most frequently used mapping was the X-axis as time and the Y-axis as pitch. The other type was pictorial representations, which were mostly drawn by untrained participants. Among 30 musically trained participants, 24 used an abstract representation and 21 of them were in continuous mode. This result indicates that a two-dimensional graph is a natural way for humans to represent a whole music piece.

3. MUSIC FLOWGRAM

The idea of the music flowgram is based on the dramatic structure of storytelling. Freytag explained the structure of a story with a two-dimensional graph visualization of tension progress [14]. Our idea is to apply a similar concept to music: drawing a continuous two-dimensional graph that shows the change of music over time. If listeners can see the dramatic structure of music, they could feel more comfortable concentrating on the music because they can clearly see when the tension will increase or decrease. This is similar to watching an opera, for which people are encouraged to know the dramatic structure before watching. The visualization will also help listeners recall the sequence of the music, as people remember the order of an opera story based on the order of important events.

A similar type of visualization is the waveform visualization or volume graph. It shows the volume progress of the music so that users can see which part is loud or quiet. This type of visualization is used in SoundCloud (www.soundcloud.com). Though volume is a highly important factor in determining the character of music, there are other quantitative parameters that explain the music. A spectrogram is another way to show the variation of music as a two-dimensional image. However, it contains too much detail to deliver meaningful musical information. Thus, more compact representations, which effectively extract musical elements, are needed.

Considering that emotion is the most influential high-level concept for listeners, we focus on musical elements that are associated with the emotional aspects of music. Among the many elements suggested in this regard [15], we choose loudness, tempo, and harmony. For visualization, we represent them with volume, onset density, and auditory roughness, respectively, as described below.

3.1 Volume

Unlike other genres of music, classical music consists of many different sub-parts, each of which has a different loudness characteristic. Therefore, temporal differences in loudness can explain the structural information of music effectively. We represent loudness with volume, which is simply calculated as frame-level energy. Though more complex measures of loudness could be adopted, we assume that volume is sufficiently effective for complex musical sound.

3.2 Onset Density

The emotion of music is highly dependent on its tempo characteristic, i.e., whether the music is fast or slow. Beats per minute (BPM) is a typical way of representing it. However, this single speed measure is not sufficient to describe the tempo characteristic of music because note passages can vary dramatically at the same tempo. For example, a long note and multiple short notes can be located in a single beat, but they produce a different nuance. For this reason, we represent the tempo characteristic with the number of notes per second. Since visualization needs the overall trend of local note density rather than the exact number of notes, we use a simple onset detection algorithm that counts note onsets in a selected frame based on amplitude information.
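
As a rough illustration of the volume and onset density features described in Sections 3.1 and 3.2, the following Python sketch computes frame-level energy as root-mean-square (RMS) amplitude and counts amplitude-based onsets per second. The function names, the non-overlapping framing, and the onset rule (a frame counts as an onset when its RMS is above a small floor and at least `ratio` times the RMS of the previous frame) are assumptions made for illustration only; the text states just that volume is frame-level energy and that onsets are counted from amplitude information.

```python
import math


def frame_rms(samples, frame_size):
    """Volume per non-overlapping frame, as root-mean-square amplitude."""
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    return [
        math.sqrt(sum(x * x for x in samples[s:s + frame_size]) / frame_size)
        for s in starts
    ]


def onset_density(samples, sample_rate, frame_size, window_sec,
                  ratio=2.0, floor=1e-3):
    """Onsets per second for consecutive windows of about window_sec seconds.

    Assumed onset rule (hypothetical, not taken from the paper): a frame is
    an onset when its RMS exceeds `floor` and is at least `ratio` times the
    RMS of the previous frame.
    """
    envelope = frame_rms(samples, frame_size)
    frames_per_window = max(1, round(window_sec * sample_rate / frame_size))
    window_duration = frames_per_window * frame_size / sample_rate
    counts = [0] * math.ceil(len(envelope) / frames_per_window)
    previous = 0.0
    for i, level in enumerate(envelope):
        if level > floor and level >= ratio * previous:
            counts[i // frames_per_window] += 1
        previous = level
    return [c / window_duration for c in counts]


# Two seconds of audio at 1 kHz with short bursts at 0.0 s, 0.5 s and 1.0 s.
signal = [0.0] * 2000
for start in (0, 500, 1000):
    signal[start:start + 100] = [0.5] * 100

print(frame_rms(signal, 100)[:6])             # volume of the first six frames
print(onset_density(signal, 1000, 100, 1.0))  # [2.0, 1.0]
```

A sustained note keeps the envelope roughly flat after its attack, so under this rule it contributes one onset, whereas a run of short notes contributes one onset per attack. This is what lets onset density separate the two cases at the same tempo.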
Table 7. Results of user test B with participants who know the material well.

loading time for the YouTube player. One of the participants found the excerpted part with a single click on the music flowgram. Since the excerpt contains a legato passage of strings, the participant could easily find it by searching for a part with low onset density.

7. CONCLUSION

In this paper, we have presented an automatic visualization method for representing music in its entirety. The goal of our visualization is to show how the music changes from beginning to end. The method visualizes music with three audio features (volume, onset density, and auditory roughness), which are highly associated with loudness, tempo, and dissonance, respectively. These features are visualized as a two-dimensional graph. We implemented the method on a web page using the Web Audio API and conducted user tests to verify the usefulness of our method in listening and searching tasks. The results showed that listening to music with a music flowgram helps listeners memorize the music more precisely. A music flowgram was also helpful for searching for a specific excerpt in the music.

Despite the overall positive results, there is still a large margin for improvement. Auditory roughness, which is intended to represent the harmonic characteristic of music, was not satisfactory for many participants. In future work, we plan to improve our algorithms for detecting onsets and calculating auditory roughness. We are also considering other audio features that can replace auditory roughness, such as tonal complexity [18]. Another important challenge will be finding more intuitive and visually pleasing mappings for each parameter.

8. REFERENCES

[1] M. Müller, F. Kurth, and T. Röder, “Towards an efficient algorithm for automatic score-to-audio synchronization,” in Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR), 2004.
[2] M. Wattenberg, “Arc diagrams: Visualizing structure in strings,” in Information Visualization, 2002. INFOVIS 2002. IEEE Symposium on. IEEE, 2002, pp. 110–116.
[3] J. Foote, “Visualizing music and audio using self-similarity,” in Proceedings of the seventh ACM international conference on Multimedia (Part 1). ACM, 1999, pp. 77–80.
[5] D. Jeong and Y. Noh, “Music visualization using volume graph and its effect on classical music listening,” 2014.
[6] D. Jeong, “Music visualization using flow graph and its effect on listening to classical music,” Master’s thesis, Korea Advanced Institute of Science and Technology, 2015.
[7] N. Kosugi, “Misual: music visualization based on acoustic data,” in Proceedings of the 12th International Conference on Information Integration and Web-based Applications & Services. ACM, 2010, pp. 609–616.
[8] M. Goto, “Smartmusickiosk: Music listening station with chorus-search function,” in Proceedings of the 16th annual ACM symposium on User interface software and technology. ACM, 2003, pp. 31–40.
[9] M. Goto, K. Yoshii, H. Fujihara, M. Mauch, and T. Nakano, “Songle: A web service for active music listening improved by user contributions,” in Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR). Citeseer, 2011, pp. 311–316.
[10] C. S. Sapp, “Harmonic visualizations of tonal music,” in Proceedings of the International Computer Music Conference, vol. 1001. Citeseer, 2001, pp. 423–430.
[11] M. Malt and E. Jourdan, “Le bstd–une représentation graphique de la brillance et de l’écart type spectral, comme possible représentation de l’évolution du timbre sonore,” in Proceedings of L’analyse musicale aujourd’hui. Crise ou (r)évolution ?, 2009.
[12] W.-Y. Chan, H. Qu, and W.-H. Mak, “Visualizing the semantic structure in classical music works,” Visualization and Computer Graphics, IEEE Transactions on, vol. 16, no. 1, pp. 161–173, 2010.
[13] S.-L. Tan and M. E. Kelly, “Graphic representations of short musical compositions,” Psychology of Music, vol. 32, no. 2, pp. 191–212, 2004.
[14] G. Freytag, Die Technik des Dramas. Hirzel, 1872.
[15] A. Gabrielsson and E. Lindström, “The role of structure in the musical expression of emotions,” Handbook of music and emotion: Theory, research, applications, pp. 367–400, 2010.
[16] E. J. Humphrey and J. P. Bello, “Four timely insights on automatic chord estimation,” in Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR), 2015.
[17] P. N. Vassilakis and K. Fitz, “Sra: A web-based research tool for spectral and roughness analysis of sound signals,” in Proceedings of the 4th Sound and Music Computing (SMC) Conference, 2007, pp. 319–325.
[18] C. Weiss and M. Müller, “Tonal complexity features for style classification of classical music,” in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 688–692.