
SIMPLE TEMPO MODELS FOR REAL-TIME MUSIC TRACKING

Andreas Arzt
Department of Computational Perception
Johannes Kepler University Linz

Gerhard Widmer
Department of Computational Perception, Johannes Kepler University Linz
The Austrian Research Institute for Artificial Intelligence (OFAI)
ABSTRACT

The paper describes a simple but effective method for incorporating automatically learned tempo models into real-time music tracking systems. In particular, instead of training our system with ‘rehearsal data’ by a particular performer, we provide it with many different interpretations of a given piece, possibly by many different performers. During the tracking process the system continuously recombines this information to come up with an accurate tempo hypothesis. We present this approach in the context of a real-time tracking system that is robust to almost arbitrary deviations from the score (e.g. omissions, forward and backward jumps, unexpected repetitions or re-starts) by the live performer.

Copyright: © 2010 Andreas Arzt et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
1. INTRODUCTION

Real-time audio tracking systems, which listen to a musical performance through a microphone and automatically recognize at any time the current position in the musical score, even if the live performance varies in tempo and sound, promise to be useful in a wide range of applications. They can serve as a (musical) partner to the performer(s) by e.g. automatically accompanying them, interacting with them or supplementing their art by the creation of visualizations of their performance.

In this paper we propose a very simple and general method for incorporating learned tempo models into real-time music trackers. These tempo models need not reflect one specific way of how to perform a piece of music, but rather illustrate many different possible performance strategies (in terms of timing and tempo). We present this approach in the context of a real-time music tracking system that is extremely robust in the face of almost arbitrary structural changes (e.g. disruptions or re-starts) during a live performance.

This unique ability distinguishes our real-time tracking system from the two major advanced score followers that have been developed in recent years. These systems have two quite different domains in mind. While Christopher Raphael’s ‘Music Plus One’ [1] focuses on the automatic accompaniment of music containing a quite regular pulse, like western classical music, where close synchronization between the solo and the accompanying parts is required, Arshia Cont’s system ‘Antescofo’ [2] addresses a slightly different domain, namely, contemporary music by composers like Boulez, Cage and Stockhausen, with musical characteristics quite different from ‘classical’ music. During the tracking process both systems are guided by sophisticated tempo models.

In contrast to the above-mentioned systems, which are based on probabilistic models, our music follower uses Online Dynamic Time Warping (ODTW) as its basic tracking algorithm (at multiple levels – see Section 3). Even without a predictive model of tempo, this algorithm is surprisingly robust. But for passages with extremely expressive timing, knowledge about plausible performance strategies is needed to improve the precision of real-time alignment. In this paper we will show two simple and very general ways of doing so, the second of which actually permits the system to adapt to different ways of playing without separate training each time.

In the following, we first re-capitulate the basic principles of our approach to on-line music following (Section 2), briefly point to a recent extension that makes the algorithm robust to almost arbitrary disruptions in a performance (Section 3; the details of this are described in a separate paper [3]), and then describe two simple, but effective ways of introducing expressive tempo information into the tracking process in Sections 4 and 5.

2. A HIGHLY ROBUST MUSIC TRACKER

Our approach to score following is via audio-to-audio alignment. That is, rather than trying to transcribe the incoming audio stream into discrete notes and align the transcription to the score, we first convert a MIDI version of the given score into a sound file by using a software synthesizer. The result is a ‘machine-like’, low-quality rendition of the piece, in which, due to the information stored in the MIDI file, we know the time of every event (e.g. note onsets).

2.1 Data Representation

The score audio stream and the live input stream to be aligned are represented as sequences of analysis frames, computed via a windowed FFT of the signal with a Hamming window of size 46 ms and a hop size of 20 ms. The data is mapped into 84 frequency bins, spread linearly up to 370 Hz and logarithmically above, with semitone spacing. In order to emphasize note onsets, which are the most important indicators of musical timing, only the increase in energy in each bin relative to the previous frame is stored.
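As an illustration, the following Python sketch computes such feature frames. The exact bin edges and normalization are not specified in the paper, so the mapping below – constant-width bins that meet the semitone grid at 370 Hz – is an assumption made for the example.

```python
# Illustrative sketch of the feature extraction of Section 2.1 (the precise
# bin layout and normalization are assumptions, not the authors' code).
import numpy as np

def odtw_features(audio, sr=44100, win_ms=46, hop_ms=20, n_bins=84, split_hz=370.0):
    """Map a mono signal to a sequence of 84-dimensional 'energy increase' frames."""
    win = int(sr * win_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    window = np.hamming(win)
    freqs = np.fft.rfftfreq(win, 1.0 / sr)

    # Assumed bin edges: constant width up to split_hz, semitone spacing above,
    # with the linear width chosen to match one semitone at 370 Hz.
    step = split_hz * (2 ** (1 / 12) - 1)
    n_lin = int(round(split_hz / step))                    # ~17 linear bins
    edges = np.concatenate([
        np.linspace(0.0, split_hz, n_lin + 1),
        split_hz * 2.0 ** (np.arange(1, n_bins - n_lin + 1) / 12.0),
    ])

    frames, prev = [], np.zeros(n_bins)
    for start in range(0, len(audio) - win, hop):
        spec = np.abs(np.fft.rfft(window * audio[start:start + win])) ** 2
        binned = np.array([spec[(freqs >= lo) & (freqs < hi)].sum()
                           for lo, hi in zip(edges[:-1], edges[1:])])
        # Keep only the increase in energy per bin (emphasizes note onsets).
        frames.append(np.maximum(binned - prev, 0.0))
        prev = binned
    return np.array(frames)
```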

2.2 On-line Dynamic Time Warping (ODTW)


This algorithm is the core of our real-time audio tracking
system. ODTW takes two time series describing the au-
dio signals – one known completely beforehand (the score)
and one coming in in real time (the live performance) –,
computes an on-line alignment, and at any time returns the
current position in the score. In the following we only give
a short intuitive description of this algorithm; for further details we refer the reader to [4].
Dynamic Time Warping (DTW) is an off-line alignment
method for two time series based on a local cost measure
and an alignment cost matrix computed using dynamic pro-
gramming, where each cell contains the costs of the opti-
mal alignment up to this cell. After the matrix computa-
tion is completed the optimal alignment path is obtained
by tracing the dynamic programming recursion backwards (backward path).

Originally proposed by Dixon in [4], the ODTW algorithm is based on the standard DTW algorithm, but has two important properties making it usable in real-time systems: the alignment is computed incrementally by always expanding the matrix into the direction (row or column) containing the minimal costs (forward path), and it has linear time and space complexity, as only a fixed number of cells around the forward path is computed.

Figure 1. Illustration of the ODTW algorithm, showing the iteratively computed forward path (white), the much more accurate backward path (grey, also catching the one onset that the forward path misaligned), and the correct note onsets (yellow crosses, annotated beforehand). In the background the local alignment costs for all pairs of cells are displayed. Also note the white areas in the upper left and lower right corners, illustrating the constrained path computation around the forward path.
At any time during the alignment it is also possible to compute a backward path starting at the current position, producing an off-line alignment of the two time series which generally is much more accurate. This constantly updated, very accurate alignment of the last couple of seconds will be used heavily throughout this paper. See also Figure 1 for an illustration of the above-mentioned concepts.
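The following simplified sketch illustrates the forward-path computation described above. It is not Dixon's reference implementation: the band width, the local cost measure and the run-count constraints of [4] are simplified or assumed, and the full cost matrix is allocated only for readability (a real implementation stores just the band, giving linear space).

```python
# Simplified sketch of the ODTW forward path (after Dixon [4]).
import numpy as np

def cost(x, y):
    # Local alignment cost between a live frame and a score frame (assumed).
    return float(np.linalg.norm(x - y))

def odtw_forward(score, live, band=100):
    """Follow `live` through `score`; returns the forward path of
    (live frame, score frame) index pairs."""
    INF = float("inf")
    D = np.full((len(live), len(score)), INF)  # accumulated costs
    D[0, 0] = cost(live[0], score[0])
    i = j = 0
    path = [(0, 0)]
    while i < len(live) - 1 and j < len(score) - 1:
        # Incrementally fill one new row (next live frame) and one new
        # column (next score frame), restricted to the band around (i, j).
        for jj in range(max(0, j - band), j + 2):
            left = D[i + 1, jj - 1] if jj > 0 else INF
            diag = D[i, jj - 1] if jj > 0 else INF
            D[i + 1, jj] = cost(live[i + 1], score[jj]) + min(D[i, jj], diag, left)
        for ii in range(max(0, i - band), i + 2):
            up = D[ii - 1, j + 1] if ii > 0 else INF
            diag = D[ii - 1, j] if ii > 0 else INF
            D[ii, j + 1] = cost(live[ii], score[j + 1]) + min(D[ii, j], diag, up)
        # Expand into the direction containing the minimal cost: advance in
        # the live signal, in the score, or in both (the forward path).
        if D[i + 1, j] < D[i, j + 1]:
            i += 1
        elif D[i, j + 1] < D[i + 1, j]:
            j += 1
        else:
            i += 1
            j += 1
        path.append((i, j))  # (live frame, current score position)
    return path
```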
Improvements to this algorithm, focusing both on adaptivity and robustness, were presented in [5] and are incorporated in our system, including the ‘backward-forward strategy’, which reconsiders past decisions (using the backward path) and tries to improve the precision of the current score position hypothesis.

In the following, we will give a short description of a dynamic and general solution to the problem of how to deal with structural changes effectively on-line, and then describe and evaluate our main new contribution: two ways to estimate the current tempo of a performance on-line, and how to use this information to improve the alignment.

3. ‘ANY-TIME’ REAL-TIME AUDIO TRACKING

In [3] we introduced a unique feature to this system, namely the ability to cope with arbitrary structural deviations from the score during a live performance. At the core is a process that continually updates and evaluates high-level hypotheses about possible current positions in the score, which are then verified or rejected by multiple instances of the basic alignment algorithm described above. To guide our system in the face of possible repetitions and to avoid random jumps between identical parts in the score, we also introduced automatically computed information about the structure of the piece to be tracked. We chose to call our new approach ‘Any-time Music Tracking’, as the system is continuously ready to receive input and find out what the performers are doing, and where they are in the piece.

Figure 2 visually demonstrates the capabilities of our system. In this case 5 different performances of the Prelude in G minor Op. 23 No. 5 by Sergei Rachmaninoff are tracked that start not at the beginning, but 20 bars into the piece. While the basic system finds the correct position after a long timespan (basically by chance), our ‘any-time’ tracker almost instantly identifies the correct position.

While testing this real-time tracking system with complex piano music played with a lot of expressive freedom in terms of tempo changes, we realized the need for a tempo model to improve the alignment accuracy and the robustness of our system. In the following we propose two simple tempo models, one only based on the analysis of the most recent couple of seconds of the live performance (Section 4) and one having access to automatically extracted additional knowledge about possible future tempo developments (Section 5). The result will be a robust real-time tracker that is able to adapt to and even anticipate tempo changes of the performer, thus leading to a significant increase in alignment precision.
[Figure 2: plot of alignment error in bars over time in seconds, ‘Alignment Errors on the Prelude by Rachmaninov (without bars 0–20)’, with curves (a) and (b) as described in the caption below.]
Figure 2. ‘Starting in the middle’: A visual comparison of the capabilities of the tracker in [5] and the ‘any-time’ real-time tracking system described in [3]. 5 performances of the G minor Prelude by Rachmaninoff, with bars 0–20 missing, are aligned to the score by both systems. For all performances, the ‘any-time’ real-time tracker (a) almost instantly identifies the correct position, while the old system (b) finds the correct position by mere chance.
4. A (VERY) SIMPLE TEMPO MODEL

4.1 Computation of the Current Tempo

The computation of the current tempo of the performance (relative to the score representation) is based on a constantly updated backward path starting in the current position of the forward calculation. As the backward path, in contrast to the forward path which has to make its decisions on-line, has perfect information about the performance – at least up to the current position in the performance –, it is much more accurate and reliable than the forward path (see also Figure 1).

Intuitively, the slope of such a backward path represents the relative tempo differences between the score representation and the actual performance. Given a perfect alignment, the slope between the last two onsets would give a very good estimation of the current tempo. But as the correctness of the alignment of these last onsets generally is quite uncertain, one has to discard the last few onsets and use a larger window over more note onsets to come up with a reliable tempo estimation.

In particular, our tempo computation algorithm uses a method described in [6]. It is based on a rectified version of the backward alignment path, where the path between note onsets is discarded and the onsets (known from the score representation) are instead linearly connected. In this way, possible instabilities of the alignment path between onsets (as, e.g., between the 2nd and 3rd onset in the lower left corner of Fig. 1) are smoothed away.

After computing this path, the n = 20 most recent note onsets which lie at least 1 second in the past are selected, and the local tempo for each onset is computed by considering the slope of the rectified path in a window of size 3 seconds centered on the onset. This results in a vector v_t of length n of relative tempo deviations from the score representation. Finally, an estimate of the current relative tempo t is computed using Eq. (1), which emphasizes more recent tempo developments while not discarding older tempo information completely, for robustness considerations:

    t = \frac{\sum_{i=1}^{n} (t_i \cdot i)}{\sum_{i=1}^{n} i}    (1)

where t_i is the i-th entry of v_t, so that the most recent onsets receive the highest weights.

Of course, due to the simplicity of the procedure and especially the fact that only information older than 1 second is used, this tempo estimation can recognize tempo changes only with some delay. However, the computation is very fast, which is important for real-time applications, and it proved very useful for the task we have in mind.
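As a concrete illustration of Eq. (1): the estimate is simply a weighted average with linearly growing weights. The helper below and its example values are hypothetical.

```python
# Eq. (1) as code: a weighted average of the n local tempo values, with
# weight i favouring the most recent onsets.
import numpy as np

def estimate_relative_tempo(v_t):
    """v_t: relative tempi at the n most recent usable onsets, oldest first."""
    n = len(v_t)
    weights = np.arange(1, n + 1)       # i = 1..n; the newest onset weighs most
    return float(np.dot(v_t, weights) / weights.sum())

# Example: tempo drifting from the score tempo (1.0) to 10% faster (1.1);
# recent onsets dominate, so the estimate leans towards 1.1.
v_t = np.linspace(1.0, 1.1, 20)
print(estimate_relative_tempo(v_t))     # ≈ 1.067
```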
4.2 Feeding Tempo Information to the ODTW

Based on the observation that both the alignment precision and the robustness directly depend on the similarity between the tempo of the performance and the score representation, we now use the current tempo estimate to alter the score representation on the fly, stretching or compressing it to match the tempo of the performance as closely as possible. This is done by altering the sequence of feature vectors representing the score audio. The relative tempo is directly used as the probability to compress or extend the sequence by either adding new vectors or removing vectors.

More precisely, after every incoming frame from the live performance, and before the actual path computation, the current relative tempo t is computed as given above, where t = 1 means that the live performance and the score representation currently are in the exact same tempo, and t > 1 means that the performance is faster than the score representation. The current position in the score p_s is given by the forward path and thus coincides with the index of the last processed frame of the score representation. If a newly computed random number r between 0 and 1 is larger than t (or 1/t if t > 1), an alteration step takes place. If t > 1, a feature vector is removed from the score representation by replacing the vectors at positions p_s + 1 and p_s + 2 with their mean. And if t < 1, a new feature vector, computed as the mean of the vectors at p_s and p_s + 1, is inserted into the sequence between p_s and p_s + 1. As our system is based on features emphasizing note onsets, score feature vectors representing onsets (which are known from the score) are not duplicated, as more (and wrong) onsets would be introduced into the score representation. In such cases the alteration process is postponed until the next frame. Furthermore, to avoid the system getting stuck at one frame, alterations may take place at most 3 times in a row.
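The following sketch shows one such alteration step as we read the description above. Details the text leaves open – how onset frames are flagged, and whether a postponed alteration counts towards the three-in-a-row limit – are assumptions, as are all names.

```python
# Sketch of one score-alteration step (Section 4.2). `score` is a list of
# score feature vectors (e.g. numpy arrays), `is_onset` flags frames that
# carry a note onset (known from the score), `ps` is the index of the last
# processed score frame, `t` the current relative tempo, and `streak` the
# number of alterations performed in a row.
import random

def maybe_alter_score(score, is_onset, ps, t, streak):
    r = random.random()
    # Alter only if r exceeds t (or 1/t if t > 1), and at most 3 times in a row.
    if r <= (t if t <= 1 else 1.0 / t) or streak >= 3:
        return 0
    if t > 1:
        # Performance faster than score: compress by merging two frames.
        merged = 0.5 * (score[ps + 1] + score[ps + 2])
        score[ps + 1:ps + 3] = [merged]
        is_onset[ps + 1:ps + 3] = [is_onset[ps + 1] or is_onset[ps + 2]]
    elif is_onset[ps] or is_onset[ps + 1]:
        # Never duplicate onset frames (this would introduce spurious onsets);
        # postpone the alteration until the next frame instead.
        return streak
    else:
        # Performance slower than score: stretch by inserting a mean frame.
        score.insert(ps + 1, 0.5 * (score[ps] + score[ps + 1]))
        is_onset.insert(ps + 1, False)
    return streak + 1
```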
5. ‘LEARNING’ TEMPO DEVIATIONS FROM DIFFERENT PERFORMERS

As will be shown later in Section 6, the introduction of this very simple tempo model – simply using the current estimated tempo to stretch/compress the reference score audio – already leads to considerably improved tracking results. But especially at phrase boundaries with huge changes in tempo (e.g. a slow-down or a speed-up by a factor of 2 is not uncommon, see also Figure 3) the above-mentioned delay in the recognition of tempo changes still results in large alignment errors. Furthermore, such tempo changes are very hard to catch instantly, even with more reactive tempo models. To cope with this problem we came up with an automatic and very general way to provide the system with information about possible ways in which a performer might shape the tempo of the piece.

[Figure 3: tempo in beats per minute over time in beats, one curve per performance (Alexeev, Ashkenazy, Biret, Gavrilov, Shelley).]
Figure 3. Tempo curves (at the level of quarter notes) automatically extracted from 5 different commercial recordings of
the Prelude Op. 23 No. 5 by Rachmaninoff. Note especially the slow-down around beat 130 and the subsequent speed-up
around beat 190 and the generally big differences in timing between the performances.

First we extract tempo curves from various different performances (audio recordings) of the piece in question. Again, as for the real-time tempo estimation, this is done completely automatically using the method described in [6] (see Section 4.1), but as the whole performance is known beforehand and the tempo analysis can be done off-line, there is now no need for further smoothing of the tempo computation. These tempo curves (see Figure 3) are directly imported into our real-time tracking system.

We use this additional information during the tracking process to compute a tempo estimate based not only on tracking information about the last couple of seconds, but also on similarities to other known performances. More precisely, as before, after every iteration of the path computation algorithm the vector v_t containing tempo information at note onsets is updated based on the backward path and the above-mentioned local tempo computation method. But now the tempo curve of the live performance over the last w = 50 onsets, again located at least 1 second in the past, is compared to the previously stored tempo curves at the same position. To do this, all n tempo curves are first normalized to represent the same mean tempo over these w onsets as the live performance. The Euclidean distances between the curve of the live performance and the stored curves are computed. These distances are inverted and normalized to sum up to 1, thus now representing the similarity to the tempo curve of the live performance.
Based on the stored tempo curves our system can now estimate the tempo at the current position. As the current position should be somewhere between the last aligned onset o_j and the onset o_{j+1} to be aligned next, we compute the current tempo t according to Formula (2), where t_{i,o_j} and t_{i,o_{j+1}} represent the (scaled) tempo information of curve i at onsets o_j and o_{j+1} respectively, and s_i is the similarity value of tempo curve i:

    t = \frac{\sum_{i=1}^{n} \left[ (t_{i,o_j} + t_{i,o_{j+1}}) \cdot s_i \right]}{2}    (2)

Intuitively, the tempo is estimated as the mean of the tempo estimates at these 2 onsets, which in turn are computed as a weighted sum of the (scaled) tempi in the stored performance curves, with each curve contributing according to its local similarity to the current performance. Please note that this approach somewhat differs from typical ways of training a score follower to follow a particular performance. We are not feeding the system with ‘rehearsal data’ by a particular musician, but with many different ways of how to perform the piece in question, as the analyzed performances may be by different performers and differ heavily in their interpretation style. The system then decides on-line at every iteration how to weigh the curves, effectively selecting a mixture of the curves which represents the current performance best.
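A sketch of this computation, assuming the stored curves are resampled so that column j of the curve matrix holds each curve's tempo at score onset o_j:

```python
# Sketch of the 'learned' tempo estimate of Eq. (2). `curves` holds the n
# stored tempo curves sampled at score onsets (shape: n x number of onsets);
# `live_curve` holds the live performance's tempo over the last w onsets,
# ending at onset j. The data layout is an assumption.
import numpy as np

def learned_tempo_estimate(curves, live_curve, j):
    w = len(live_curve)
    window = curves[:, j - w + 1:j + 1]
    # Normalize each stored curve to the live performance's mean tempo
    # over the last w onsets.
    scaled = curves * (live_curve.mean() / window.mean(axis=1, keepdims=True))
    # Inverted, normalized Euclidean distances act as the similarities s_i.
    dist = np.linalg.norm(scaled[:, j - w + 1:j + 1] - live_curve, axis=1)
    s = 1.0 / (dist + 1e-9)
    s /= s.sum()
    # Eq. (2): mean of the similarity-weighted tempi at onsets o_j and o_j+1.
    return float(0.5 * np.dot(s, scaled[:, j] + scaled[:, j + 1]))
```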
6. EVALUATION

The precision of our system was thoroughly tested on various pieces of music (see Table 1), with very well known musicians like Vladimir Horowitz, Vladimir Ashkenazy and Daniel Barenboim amongst the performers. While we currently focus on classical piano music, to show the independence of specific instruments we also tested our system on an oboe quartet by Mozart and the 1st movement of the 5th symphony by Beethoven.

As the evaluation requires reference alignments of the performances, Table 1 also indicates how the ground truth data was prepared. For the performance excerpts of the Ballade Op. 38 No. 1 by Chopin (CB) we have access to very accurate data about every note onset (‘match-files’), as these were recorded on a computer-monitored grand piano. For the performances of the 3 movements of Mozart’s Sonata KV279 (MS) the evaluation is based on exact information about every beat time, which was manually compiled. The evaluation of the other pieces is based on off-line alignments produced by our system, which generally are much more precise than on-line alignments.

ID    Composer      Piece Name                       Instruments                  # Perf.  Eval. Type
BF    Bach          Fugue BWV847                     Piano                        7        Offline Align.
BS    Beethoven     5th Symphony, 1st Movement       Orchestra                    5        Offline Align.
CB    Chopin        Ballade Op. 38 No. 1 (excerpt)   Piano                        22       Match
CW    Chopin        Waltz Op. 34 No. 1               Piano                        8        Offline Align.
MO1   Mozart        Oboe Quartet KV370 Mov. 1        Oboe, Violin, Viola, Cello   5        Offline Align.
MO3   Mozart        Oboe Quartet KV370 Mov. 3        Oboe, Violin, Viola, Cello   5        Offline Align.
MS1   Mozart        Sonata KV279 Mov. 1              Piano                        5        Beats
MS2   Mozart        Sonata KV279 Mov. 2              Piano                        5        Beats
MS3   Mozart        Sonata KV279 Mov. 3              Piano                        5        Beats
RP    Rachmaninoff  Prelude Op. 23 No. 5             Piano                        5        Offline Align.
SI    Schubert      Impromptu D935 No. 2             Piano                        12       Offline Align.

Table 1. The data set used for the evaluation of our real-time tracking system.

We are well aware that this information is not guaranteed to be entirely accurate, but we manually checked the alignments for obvious errors and are quite confident that the results based on these alignments are reasonable, especially as evaluations of CB and MS based on these alignments led to very similar numbers compared to the evaluation on the correct reference alignments.

For all pieces we used audio files synthesized from publicly available ‘flat’ MIDI files with fixed tempo as score representation; only the MIDI representing the Beethoven Symphony contained sparse tempo annotations.

The evaluation took the form of a cross-validation. Every performance in our data set (Table 1) was aligned with 3 algorithms: the system introduced in [5] with only minor changes and optimizations; the system including the simple tempo model (Section 4); and the tempo model that has access to a set of possible performance strategies (Section 5). For the latter, all recordings pertaining to the given piece were used except, of course, for the performance currently being aligned. The result, for each performance and each algorithm, is a set of events with detection times in milliseconds.

The evaluation itself was performed as proposed in [7]. For each event i the difference (offset e_i) in milliseconds to the reference alignment is computed. An event i is reported as missing if it is aligned with e_i > 250 ms. This percentage of notes thus misaligned (or, inversely, the percentage of correctly aligned notes) is the main performance measure for a real-time music tracking system. Further statistics, providing information about the alignment precision on those events that were correctly matched, and thus computed on the e_i excluding missed events (e_{c,i}), are: the average error, defined as the mean over the absolute values of all e_{c,i}; the mean error, defined as the regular mean without taking the absolute value; and the standard deviation of the e_{c,i}. Finally, two measures are computed which sum up the overall performance of the system: the piecewise precision rate (PP), as the average of the percentage of correctly detected events for each group of performances (see Table 1), and the overall precision rate (OP) on the whole database.
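For concreteness, here is a sketch of these statistics as we read [7]; applying the 250 ms criterion to absolute offsets, and all function names, are assumptions.

```python
# Sketch of the evaluation statistics described above. `offsets` holds the
# signed offsets e_i (in ms) of all events of one performance.
import numpy as np

def piece_stats(offsets, threshold_ms=250.0):
    e = np.asarray(offsets, dtype=float)
    matched = e[np.abs(e) <= threshold_ms]          # the e_ci of the text
    return {
        "miss_rate": 1.0 - len(matched) / len(e),
        "avg_error": float(np.abs(matched).mean()), # mean absolute offset
        "mean_error": float(matched.mean()),        # signed mean (bias)
        "std": float(matched.std()),
    }

def precision_rates(offsets_by_piece, threshold_ms=250.0):
    """PP: mean hit rate over the piece groups; OP: hit rate over all events."""
    rates = [1.0 - piece_stats(o, threshold_ms)["miss_rate"]
             for o in offsets_by_piece.values()]
    all_e = np.abs(np.concatenate([np.asarray(o, float)
                                   for o in offsets_by_piece.values()]))
    return float(np.mean(rates)), float(np.mean(all_e <= threshold_ms))
```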
Table 2 summarizes the results. Clearly, both tempo models lead to large improvements in tracking accuracy for pieces played with a lot of expressive freedom, especially for the Schubert Impromptu (SI), the Rachmaninov Prelude (RP) and the Chopin Waltz (CW), for which the number of missed notes is more than halved. Nonetheless these kinds of music still pose a great challenge to real-time tracking systems. As the results for the Beethoven Symphony (BS) show, our system can also cope quite well with orchestral music and does not depend on specific instruments. This is also supported by the results on the Oboe Quartet (MO).

As was to be expected, the results for pieces with less extreme tempo deviations were improved to a much smaller extent. Further investigation showed that, as intended, the ‘learned’ tempo curves guided the alignment path more accurately and more reactively during huge tempo changes (i.e., at phrase boundaries).

Unfortunately it is not easy to make comparisons between different approaches in the literature, as the focus on a particular kind of music (e.g. contemporary vs. romantic piano music, or monophonic vs. heavily polyphonic music) and the area of application (e.g. automatic accompaniment vs. visualization of music) have a huge influence on the design of the system. That makes it hard to compile a well-balanced ground truth database suitable for all systems.

With this in mind, and the fact that most of our results are currently only computed relative to off-line alignments as ground truth, we merely want to point out some observations. First, there is an overlap between our data set and the one used for the evaluation of ‘Antescofo’ [2], which was already used professionally in a number of live performances. Using the same evaluation metrics, our system performed significantly better (1.9% vs. 9.33% missed notes) on the Fugue by Bach (BF). Of course the result for ‘Antescofo’ is based on only 1 single performance, which may not even be in our data set. Furthermore, we are quite sure that our system will perform significantly worse than ‘Antescofo’ on sparse monophonic data, as we do not explicitly detect note onsets and our forward path tends to ‘randomly’ wander around during long pauses between note onsets. Also, we allow our system to report notes early while ‘Antescofo’ is purely reactive, thus effectively giving our system twice as large a window to report onsets ‘correctly’. While for the task of automatic accompaniment notes reported early are very bothersome, we think that for the task of real-time music visualization, which is our current focus, this is more tolerable.
No Tempo Model Simple Tempo Model ‘Learned’ Tempo Model
Offset (ms) % Offset (ms) % Offset (ms) %
ID Avg. Mean STD Miss Avg. Mean STD Miss Avg. Mean STD Miss
BF 52.1 -15.3 70.4 2.7% 41.7 0.1 61.3 2.2% 41.3 -0.3 59.3 1.9%
BS 84.1 4.3 106.5 15.9% 79.0 -11.6 100.8 15.0% 78.3 -6.4 100.3 13.9%
CB 63.1 16.6 83.7 10.9% 62.4 8.6 83.8 10.0% 63.1 3.9 85.2 9.9%
CW 86.3 -24.6 107.1 27.6% 78.7 -23.2 99.2 16.3% 75.4 -20.2 95.7 11.9%
MO1 94.8 -75.7 89.1 15.0% 70.1 -22.8 90.0 7.0% 72.1 -30.5 89.9 6.9%
MO3 99.9 -84.5 85.3 18.4% 64.3 -18.0 84.0 7.9% 65.7 -16.9 85.8 7.0%
MS1 47.4 13.8 64.5 3.6% 44.9 9.7 62.5 3.3% 42.7 10.1 59.5 3.2%
MS2 85.6 -21.3 104.8 19.8% 71.8 -4.7 93.7 13.8% 73.3 -6.4 94.5 11.3%
MS3 44.1 28.7 58.4 3.9% 40.2 6.7 59.5 3.3% 39.5 9.9 58.5 2.1%
RP 79.8 -18.7 102.0 31.8% 75.5 -10.5 96.8 17.1% 70.9 -10.6 93.2 14.8%
SI 107.3 -59.2 113.9 41.8% 77.9 -32.8 95.2 23.6% 78.7 -33.1 95.7 20.1%
OP 83.2% 89.7% 91.1%
PP 81.1% 87.9% 91.4%

Table 2. Real-time alignment results for all 3 evaluated systems (see text).

Unfortunately, we could not find a comparable evaluation of ‘Music Plus One’ [1], which, like our system, focuses on classical music. However, a number of live demonstrations and available videos suggest that the system works very well in real-time accompaniment settings, not only reacting to tempo changes, but actually predicting them quite well.

That said, our real-time tracking system combines competitive alignment results with a unique feature not found in the above-mentioned systems: the ability to cope with arbitrary jumps of the performer(s) on-line by continuously tracking the performance at a coarser level and refining hypotheses about the current score position (see Section 3). This not only allows the system to, e.g., automatically cope with arbitrary rehearsal situations, where the musician(s) may keep repeating parts of the piece over and over, but effectively makes it impossible for the system to get lost. (Detailed experimental proof of this can be found in [3].)

7. CONCLUSION AND FUTURE WORK

We have presented a new approach to the incorporation of tempo information into a very robust real-time tracking system that is capable of dealing on-line with almost arbitrary structural deviations from the score. We demonstrated two ways to compute a tempo estimate, one only based on the alignment of the last couple of seconds of the performance, and one additionally based on a collection of previously extracted possible timing patterns, thus giving the system the means to anticipate tempo changes of the performer. The system was evaluated on a range of pieces from Western classical music. Both tempo models lead to significantly improved alignment results, especially for pieces played with a lot of expressive freedom.

An important direction for future work is the introduction of explicit event detection into our system, based on both an estimation of the timing and an analysis of the incoming audio frames. Furthermore, we should think about ways to use the extracted tempo information to further improve the high-level ‘any-time’ tracking process (not described in this paper – see [3]).

8. ACKNOWLEDGEMENTS

This research is supported by the City of Linz, the Federal State of Upper Austria, the Austrian Federal Ministry for Transport, Innovation and Technology, and the Austrian Science Fund (FWF) under project number TRP 109-N23.

9. REFERENCES

[1] C. Raphael, “Current directions with music plus one,” in Proc. of the Sound and Music Computing Conference (SMC), (Porto, Portugal), 2009.

[2] A. Cont, “A coupled duration-focused architecture for realtime music to score alignment,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 99, 2009.

[3] A. Arzt and G. Widmer, “Towards effective ‘any-time’ music tracking,” in Proc. of the Starting AI Researchers’ Symposium (STAIRS 2010), European Conference on Artificial Intelligence (ECAI), (Lisbon, Portugal), 2010.

[4] S. Dixon, “An on-line time warping algorithm for tracking musical performances,” in Proc. of the 19th International Joint Conference on Artificial Intelligence (IJCAI), (Edinburgh, Scotland), 2005.

[5] A. Arzt, G. Widmer, and S. Dixon, “Automatic page turning for musicians via real-time machine listening,” in Proc. of the 18th European Conference on Artificial Intelligence (ECAI), (Patras, Greece), 2008.
[6] M. Mueller, V. Konz, A. Scharfstein, S. Ewert, and M. Clausen, “Towards automated extraction of tempo parameters from expressive music recordings,” in Proc. of the International Society for Music Information Retrieval Conference (ISMIR), (Kobe, Japan), 2009.

[7] A. Cont, D. Schwarz, N. Schnell, and C. Raphael, “Evaluation of real-time audio-to-score alignment,” in Proc. of the 8th International Conference on Music Information Retrieval (ISMIR), (Vienna, Austria), 2007.
