Alfonseca Et Al. - 2007 - A Simple Genetic Algorithm For Music Generation by Means of Algorithmic Informat
All content following this page was uploaded by Manuel Alfonseca on 16 May 2014.
M. Alfonseca, M. Cebrián and A. Ortega are with the Escuela Politécnica Superior of the Universidad Autónoma de Madrid, Tomas y Valiente 11, P. O. Box 28049, Madrid, Spain (e-mail: {manuel.alfonseca, manuel.cebrian, alfonso.ortega}@uam.es).
This work has been supported by grant TSI 2005-08255-C07-06 of the Spanish Ministry of Education and Science.
1-4244-1340-0/07$25.00 © 2007 IEEE

Abstract— Recent large scale experiments have shown that the Normalized Information Distance, an algorithmic information measure, is among the best similarity metrics for melody classification. This paper proposes the use of this distance as a fitness function which may be used by genetic algorithms to automatically generate music in a given pre-defined style. The minimization of this distance of the generated music to a set of musical guides makes it possible to obtain computer-generated music which recalls the style of a certain human author. The recombination operator plays an important role in this problem and thus several variations are tested to fine tune the genetic algorithm for this application. The superiority of the relative pitch envelope over other music parameters, such as the lengths of the notes, brought us to develop a simplified algorithm that nevertheless obtains interesting results.

I. INTRODUCTION

The automatic generation of musical compositions is a long-standing, multidisciplinary area of interest and research in computer science, with over thirty years of history behind it.

Some of the current approaches try to simulate how musicians play [1] or improvise [2] on the fly, while others are not concerned with execution time and mainly try to generate some ‘good’ output. Many of them apply models and procedures of theoretical computer science (cellular automata [3], parallel derivation grammars [1], or evolutionary programming [4], [5], [6], [7]) to the generation of complex compositions. The models are then assigned a musical meaning. In some cases, the music may be automatically found (composed) by means of genetic programming.

In a previous paper [8] we proposed the use of the well-known Normalized Compression Distance [9], an algorithmic information measure, as a fitness function which may be used by genetic algorithms to automatically generate music in a given pre-defined style. The superiority of the relative pitch envelope over other musical parameters, such as the lengths of the notes, has been confirmed in [10], bringing us to develop a simplified algorithm that nevertheless obtains interesting results.

In this paper we start from the results of the previous work and refine them, trying to increase the efficiency of the procedures described in the above-mentioned paper. This is done by testing several variations of the recombination operator to fine tune the genetic algorithm for this application, as it has been observed that this operator plays an important role in this procedure.

This paper is organized thus: the second section provides a short introduction to musical concepts needed to better understand the remainder, with a description of the restrictions applied in our experiments and an enumeration of different ways of representing music. The third section introduces the Normalized Compression Distance, which has been used to compute the distance from the results of the genetic algorithm to the target musical pieces. The fourth section describes the genetic algorithm we have used for music generation. In the fifth and sixth sections we describe our experiments, where we have compared the use of one or two target guides, and six different recombination procedures for the genetic algorithm. Finally, the last section presents our conclusions and possibilities for future work.

II. MUSICAL REPRESENTATION: RESTRICTIONS

Melody, rhythm and harmony are considered the three fundamental elements in music. In the experiments performed in this paper, we shall restrict ourselves to melody, leaving the management of rhythm and harmony as future objectives. In this way, we can forget about different instruments (parts and voices) and focus on monophonic music: a single performer executing, at most, a single note on a piano at a given point in time. Melody consists of a series of musical sounds (notes) or silences (rests) with different lengths and stresses, arranged in succession in a particular rhythmic pattern, to form a recognizable unit.

In the English notation for Western music the names of the notes belong to the set {A, B, C, D, E, F, G}. These letters represent musical pitches and correspond to the white keys on the piano. The black keys on the piano are considered as modifications of the white key notes, and are called sharp or flat notes. From left to right, the key that follows a white key is its sharp key, while the previous key is its flat key. To indicate a modification, a symbol is added to the white key name (as in A# or A+ to represent A sharp, or in Bb or B-, which represent B flat). The distance from a note to its flat or sharp notes is called a half step and is the smallest unit of pitch used in the piano, where every pair of adjacent keys is separated by a half step, no matter their color. Two consecutive half steps are called a whole step. Instruments different from the piano may generate additional notes; in fact, flat and sharp notes may not coincide; also, in different musical traditions (such as Arab or Hindu music) additional notes exist. However, in these experiments, we shall restrict ourselves to the Western piano layout, thus simplifying the problem to
just 88 different notes separated by half steps. An interval may be defined as the number of half steps between two notes.

Notes and rests have a length (a duration in time). There are seven different standard lengths (from 1, corresponding to a whole or round note, to 1/64), each of which lasts twice as long as the next (whole, half, quarter, ...). Intermediate durations can be obtained by means of dots or periods, triplets and other constructs. The complete specification of notes and silences includes their lengths.

A piece of music can be represented in several different, but equivalent ways:
1) With the traditional Western bi-dimensional graphic notation on a pentagram.
2) By a set of character strings: notes are represented by letters (A-G), silence by a P, sharp and flat alterations by + and - signs, and the lengths of notes by a number (0 would represent a whole note, 1 a half note, and so on). Adding a period provides intermediate lengths. Additional codes define the tempo, the octave and the performance style (normal, legato or staccato). Polyphonic music is represented with sets of parallel strings.
3) By numbering (1 to 88) the pitches of the notes in the piano keyboard. Note 0 would represent a silence. The length of a note can be represented by a multiple of the minimum unit of time. A voice in a piece of music would be a series of integer pairs representing notes and lengths. Polyphonic music may be represented by means of parallel sets of integer pairs.
4) Other coding systems are used to keep and reproduce music in a computer or a recording system, with or without compression, such as wave sampling, MIDI, MP3, etc.

In our experiments, we represent melodies by the second and third notation systems.

III. THE NORMALIZED COMPRESSION DISTANCE

The search for a universal distance has been, for a long time, one of the main objectives of cluster theory. The availability of such a distance would make it possible to apply the same algorithms to widely different clustering problems, such as the classification of music, texts, gene sequences, and so forth.

A deep result from Algorithmic Information Theory is that there exists such a universal similarity distance, which summarizes all computable similarities: the Normalized Information Distance (NID) [11]. It is universal in the sense that, when a small distance is measured by any means between any two given objects, the NID is also small between these objects. Thus, it is at least as good as any other computable similarity distance. The NID is mathematically defined as follows:

$$\mathrm{NID}(x, y) = \frac{\max\{K(x|y),\, K(y|x)\}}{\max\{K(x),\, K(y)\}} \tag{1}$$

where K(x|y) is the conditional Kolmogorov complexity of string x given string y, whose value is the length of the shortest program (for some universal machine) which, when run on input string y, outputs string x. K(x) is the degenerate case K(x|λ), where λ is the empty string; see [12] for a detailed exposition on the subject. Unfortunately it can be proven that, due to the well-known halting problem, both the conditional and the unconditional complexities happen to be incomputable functions.

In [9] a computable estimate of the NID, the Normalized Compression Distance (NCD), is presented:

$$\mathrm{NCD}(x, y) = \frac{\max\{C(xy) - C(x),\, C(yx) - C(y)\}}{\max\{C(x),\, C(y)\}} \tag{2}$$

where xy is the concatenation of strings x and y, and C(x) denotes the length of the text x compressed by some compression algorithm which approximates K(x) from above. Distances near 0 indicate similarity between objects, while those near 1 reveal dissimilarity.

Li and Sleep have reported that this distance, together with a nearest neighbor or a cladistic classifier, outperforms some of the finest (more complex) algorithms for clustering music by genre [10]. Earlier research has also reported success of the same distance for clustering tasks [13]. These results suggest that the NCD, although not achieving the universality of its incomputable predecessor (the NID), works well at extracting features shared between two musical pieces.

On the other hand, genetic algorithms need to define a fitness function to compare different individuals, subject to simulated evolution, and classify them according to their degree of adaptation to the environment. In many cases, fitness functions compute the distance from each individual to a desired goal.

Assume we want to generate a composition that resembles a Mozart symphony; in this situation, we can elaborate a natural fitness measure: an individual (representing a composition) has a high fitness if it shares many features with as many as possible of Mozart’s symphonies. We propose to use a genetic algorithm (with musical compositions as individuals of the population) which uses the NCD as the fitness measure. This measure may compute these shared features between the individuals and the target musical guides which, in this example, would be the set of Mozart’s symphonies.

It remains to choose the compressor used to estimate the NCD. Li and Sleep compute it by counting the number of blocks generated by executing the LZ78 compression algorithm [14] on an input. In our initial experiments, we used both the LZ78 and LZ77 algorithms, and found that LZ77 performs better, which agrees with theoretical results from Kosaraju and Manzini [15]. Therefore, LZ77 has been used as our reference compressor in all the experimental results presented in this paper.
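As an illustration of equation (2), the following is a minimal sketch of how the NCD might be computed in practice. It is not the authors' implementation: it uses Python's `zlib` module, whose DEFLATE format combines LZ77 with Huffman coding, as a stand-in for the plain LZ77 compressor mentioned above. The helper names `compressed_len`, `ncd` and `fitness` are hypothetical, and averaging the distances over the guide set is an assumption, since the text does not state how distances to several guides are combined.

```python
import zlib
from typing import Sequence


def compressed_len(data: bytes) -> int:
    """C(x): length in bytes of `data` after zlib (DEFLATE) compression."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """NCD as written in equation (2):
    max{C(xy) - C(x), C(yx) - C(y)} / max{C(x), C(y)}."""
    cx, cy = compressed_len(x), compressed_len(y)
    cxy, cyx = compressed_len(x + y), compressed_len(y + x)
    return max(cxy - cx, cyx - cy) / max(cx, cy)


def fitness(candidate: bytes, guides: Sequence[bytes]) -> float:
    """Assumed aggregation: mean NCD from the candidate to every guide.
    Lower values mean the candidate is closer to the guide set."""
    return sum(ncd(candidate, g) for g in guides) / len(guides)


if __name__ == "__main__":
    # Opening of the Yankee Doodle string quoted in the paper (second notation).
    melody = b"M2T2O3L2C+4C+4D+4F4C+4F4D+4O2G+4O3C+4C+4D+4F4C+3C4P4"
    unrelated = bytes(range(256))  # data with no repeated substrings
    print("NCD(melody, melody)    =", ncd(melody, melody))
    print("NCD(melody, unrelated) =", ncd(melody, unrelated))
```

With a real compressor the distance of a string to itself is small but not exactly 0, because compressing the concatenation still costs a few extra bytes.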
generated in the interval [0, n], [0, m] for each parent genotype. Each genotype is then cut into the five corresponding pieces, which are shuffled together (one of them is reversed). Finally, the genotypes of the progeny are obtained by concatenating five of the shuffled pieces.
  – Recombination based on a three point crossover: similar to the preceding one, but only three random ordered integers are used to divide the parent genotypes into four pieces, which are then joined, shuffled, and used (four at a time) to generate the genotypes of the progeny.
• Mutation (one mutation was applied to every generated genotype, although this rate may be modified in different experiments). It consists of replacing a random element of the individual genotype by a random integer in the same interval.
• Fusion (applied to a certain percentage of the generated genotypes, which in our experiments was varied between 5 and 10). The genotype is replaced by a concatenation of itself with a piece randomly broken from either itself or its paired genotype.
• Elision (applied to a certain percentage of the generated genotypes, in our experiments between 2 and 5). One integer in the vector (in a random position) is eliminated.

The last two operations, together with some recombination procedures, allow longer or shorter genotypes to be obtained from the original vectors.

V. TESTING DIFFERENT NUMBER OF GUIDE PIECES

In our first experiments, we selected the simplest recombination procedure (strategy 1 in sect. VI) and tested the effect of varying the number of guide pieces and the functions which generate the lengths of the notes in the best output pieces.

First, we used as the guide a single piece of music, Yankee Doodle, described by the following string with the second representation defined in sect. II:
M2T2O3L2C+4C+4D+4F4C+4F4D+4O2G+4O3C+4C+4D+4F4
C+3C4P4C+4C+4D+4F4F+4F4D+4C+4C4O2G+4A+4O3C4C+
3C+4P4O2A+4.O3C5.O2A+4G+4A+4O3C4C+4P4O2G+4.A+
5.G+4F+4F3G+4P4A+4.O3C5.O2A+4G+4A+4O3C4C+4O2A
+4G+4O3C+4C4D+4C+3C+3

The corresponding WAV formatted file, Yankee.wav, together with all the musical pieces mentioned in this paper, can be found at:
www.eps.uam.es/~mcebrian/music

After applying the genetic algorithm, the succession of notes obtained was completed by adding length information in the following way: each note was assigned the length of the note in the same position in the guide piece (the guide piece was shortened or circularly extended, if needed, to make it the same length as the generated piece, which could be shorter or longer).

In successive executions of the algorithm, we obtained different melodies at different distances from the guide. It was observed that a lower distance made the generated music more recognizable to the ear, as related to the guide piece. For instance, the distance to the guide of the following generated piece (see also figure 1, where the same music appears in standard musical notation), named Yankee NEW.wav, is 0.43:
T5O3D+2O2G+2O3C+2C+2D+2F2F+2F2E2C2D2E2O2F1D2E
2D2C2D2E2F+2G2G2A2B2O3C2O2B2O3D2E1O2F2D+2F2.G
3.G+2F2D+2G+2F+2E2F+2.F3.C+2C2D+1C+2C+2A+2.O3
C3.C+2O2G+2A+2G+2F+2O3D+2B2O3D+2C+2

The number of generations needed to reach a given distance to the guide depends on the guide length and the random seed used in each experiment, and follows an approximate Poisson curve, as shown in figure 2, which represents the result of one experiment.

In our second experiment, we used two guide pieces
G+2F+3A3A+3G+3.F+1D+3.C+3.C3.O2A+3.A+0A+3.O3C
3.O2A+3.O3C3.C+0D3.C3.O2C3.
Fig. 5. A piece generated using Chopin’s seventh prelude as the only guide (NCD 0.39).
movement in symphony 40 (Mozart40.wav), and a part of the second movement in sonata KV545 (KV545.wav). The result (Mozart NEW.wav), which sounds like a mixture of both sources to the ear in some parts, happens to be at distances of 0.65 and 0.58 from the two guide pieces, which on the other hand differ from one another by a distance of 0.90:
T5O4G+0F+3B3.A+3A3G+2.G+3B1F1G+1.F3.D+3D+3D3.
C3O3A+3O4F2.G+3F1C+1F+2.D+3.F+2E2D2C+2O3F2.F+
3.G+1O4C1O3A3.F3.O4F3.G3F3F3.G3E3G3.B3O3E3G3.
F2.G+3G4G+2A2O4G2G+2G1G+3.F3.G+3.O5C3O4A+3G+3
.A+3.A+2.G+3.G3.G3.E3D+3O3F+3F3F+3.G3.G+3.A3.
A3.G3F+3F+3.D+3.D3.O4D3.C+2C+3E2O3A+2O4C+3C2O
3B2G1B1O4D3.C3.O3G3.O4D+3G3D3.D+3D3D+3.D3E3F3
.G3.F3.E3F3G1O3D+3E3D+3.D+3B3B3.A+3.O4F3.G2.G
+3A+3.G3

The length of the notes was generated in the same way as in the second experiment.

VI. TESTING DIFFERENT RECOMBINATION PROCEDURES

We have evidence that the recombination operator plays a key role in our approach, both in the quality of the generated musical pieces and in the time the algorithm takes to generate them.

In order to fine tune the genetic algorithm for this application, we devote a section to discussing several variations we have tested experimentally. We analyzed four strategies that use respectively the four types of recombination described in section IV: strategy 1 (single point crossover, adjusted for variable length genomes) is the base case (the simplest recombination strategy), which was used in all the experiments described in the preceding section; strategy 2 (modified two-point crossover for variable length genomes); strategy 3 (recombination based on a four point crossover); and strategy 4 (recombination based on a three point crossover).

The one-point crossing-over strategy 1 has the property that the lengths of the parent genomes are invariant under recombination in the progeny. Since mutation also keeps the length of the genome, only fusion and elision change it. In fact, we did notice that, in our preceding experiments, fusion almost never leads to a fitter genome, while elision sometimes does, which means that the version of our genetic algorithm described in the previous section, which starts with a genome length copied from one of the target pieces of music, leads to genome lengths usually reduced by a little from their initial value.
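The length behaviour of strategy 1 can be illustrated with a minimal sketch of the operators involved. Everything here is an assumption made for illustration rather than the authors' code: genotypes are taken to be non-empty Python lists of integers, the function names are hypothetical, the value interval `[low, high]` is a free parameter, and the one-point crossover is assumed to cut both parents at the same index, which is one way of keeping the parents' lengths in the progeny.

```python
import random


def one_point_crossover(a, b, rng):
    """Cut both parents at the same index and swap tails.
    One child has len(b) elements and the other len(a)."""
    k = rng.randint(0, min(len(a), len(b)))
    return a[:k] + b[k:], b[:k] + a[k:]


def mutate(genotype, low, high, rng):
    """Replace one randomly chosen element by a random integer in [low, high]."""
    child = list(genotype)
    child[rng.randrange(len(child))] = rng.randint(low, high)
    return child


def fuse(genotype, mate, rng):
    """Append a random contiguous piece taken from the genotype or its mate."""
    source = rng.choice([genotype, mate])
    start = rng.randrange(len(source))
    end = rng.randint(start + 1, len(source))  # piece has at least one element
    return list(genotype) + list(source[start:end])


def elide(genotype, rng):
    """Remove the element at one random position."""
    i = rng.randrange(len(genotype))
    return list(genotype[:i]) + list(genotype[i + 1:])


if __name__ == "__main__":
    rng = random.Random(0)
    a, b = [1, 2, 3, 4, 5], [9, 8, 7, 6, 5, 4, 3]
    c1, c2 = one_point_crossover(a, b, rng)
    print(len(c1), len(c2))                   # same multiset of lengths as the parents
    print(len(mutate(a, -12, 12, rng)))       # unchanged length
    print(len(fuse(a, b, rng)), len(elide(a, rng)))  # longer, shorter
```

Under these assumptions crossover and mutation never change the set of genome lengths in the population, so any drift in length comes from fusion (longer) and elision (shorter), as the text observes.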
Fig. 6. Two different experiments with a comparison of three recombination strategies. ‘Mixed strategy’ refers to the mixed strategy 1.
Strategies 2, 3 and 4, however, all lead to progeny genomes with lengths usually quite different from those of their parents (even when both parent genomes had the same length), which provides the population with a larger genome length diversity than strategy 1.

After performing several experiments we noticed that, at the beginning of the evolution, the second recombination strategy converges more quickly towards the target, but after a certain number of generations (usually between 150 and 200), the first and fourth strategies behave better, while beyond about 500 generations after the beginning of the process the first strategy is clearly the best. Above 1000 generations, the first two strategies tend to converge, i.e. to obtain similar distances to the goal after the same number of generations.

This brought us to add two new strategies to the testbed, which are simple combinations of the four described above:

Mixed strategy 1: In the first 150 to 200 generations, the algorithm uses the second strategy (the two point recombination procedure with four different crossing-over points between both parents). During all the remaining generations, the first strategy is used instead (i.e. the one point recombination procedure with a single crossing-over point for both parents).

Mixed strategy 2: In the first 200 generations, the program uses the second strategy; between generations 200 and 500 it switches to the fourth strategy, and above 500 generations it uses the first strategy.

The data in table I correspond to a typical experiment in which two pieces by Mozart were used as the guide set: Symphony 40 and KV545. The results of the combined strategies are much better than those of any of the four strategies applied separately. It can be observed that the mixed strategy 1 reaches, in just 600 generations, target distances similar to those attained by the first two strategies