
World-GAN: A Generative Model for Minecraft Worlds

Maren Awiszus*, Frederik Schubert*, Bodo Rosenhahn
Institut für Informationsverarbeitung, Leibniz University Hannover, Hannover, Germany
[email protected] · [email protected] · [email protected]

Abstract—This work introduces World-GAN, the first method to perform data-driven Procedural Content Generation via Machine Learning in Minecraft from a single example. Based on a 3D Generative Adversarial Network (GAN) architecture, we are able to create arbitrarily sized world snippets from a given sample. We evaluate our approach on creations from the community as well as structures generated with the Minecraft World Generator. Our method is motivated by the dense representations used in Natural Language Processing (NLP) introduced with word2vec [1]. The proposed block2vec representations make World-GAN independent from the number of different blocks, which can vary a lot in Minecraft, and enable the generation of larger levels. Finally, we demonstrate that changing this new representation space allows us to change the generated style of an already trained generator. World-GAN enables its users to generate Minecraft worlds based on parts of their creations.

Index Terms—Minecraft, Level, Generation, PCG, GAN, SinGAN, Single Example, Scales, Style, Representation

Fig. 1: Examples of generated Minecraft world snippets trained on a sample of handcrafted stone ruins in a field of grass. The samples can be generated in arbitrary sizes and capture the structures of the ruins and the terrain of the original area.

I. INTRODUCTION
Procedural Content Generation (PCG) has been applied to many different areas and games [2], [3]. Recently, the progress in Machine Learning has especially spurred research in the field of PCG via Machine Learning (PCGML) [4]. While there are methods to generate levels for 2D games such as Super Mario Bros. [5], generating levels in 3D is an open area of research. There has been work on generating levels for the 3D game Doom [6], but the generation process relies on generating 2D descriptions of the level layout that prohibit complex vertical structures.

There are 3D games that use grammars and rule-based algorithms for PCG in level generation. However, these games do not yet use Machine Learning in their generation process and are thus only extendable by programming new rules. The most prominent game in this domain is Minecraft [7], which can generate endless worlds in a 3D voxel space. Minecraft's World Generator is intricately handcrafted to generate vast landscapes, filled with structures like villages or caves with mine shafts. These landscapes also change with so-called biomes, which define the type of area, from plains to deserts to beaches. Human-generated content also plays an important role in Minecraft, but the structures have to be placed manually in a fixed world and the World Generator cannot learn to reconstruct or generate newly built structures on its own.

In this paper, we bridge the gap between the rule-based PCG of the Minecraft World Generator and custom structures that were designed by hand, as our proposed method aims to learn and generate structures directly in 3D voxel space (compare Fig. 1; all world visualizations are made using Mineways [8] and Blender [9]).

In summary, our contributions are:
• We introduce a 3D Generative Adversarial Network (GAN) architecture for level generation in Minecraft.
• Our proposed dense token¹ representations using block2vec enable the processing of larger world snippets.
• Editing this representation space allows the application of style changes to generated levels without further training.
• Using a current version of Minecraft (1.16) makes our method widely applicable for practitioners.
• We enable others to generate their own world snippets by releasing our source code at https://github.com/Mawiszus/World-GAN.

*Equal contribution
¹The smallest building block or tile a level is made of, e.g. sprites or voxels.
978-1-6654-3886-5/21/$31.00 ©2021 IEEE
[Figure 2: (a) the training pipeline — multi-scale patch generators G0, …, GN and discriminators D0, …, DN trained in progression on real and fake samples; (b) the 3D convolution operation.]


Fig. 2: The World-GAN training pipeline. Similar to [10], different patch-based generators are trained at different scales to
create locally convincing world snippets that the discriminators are trying to distinguish. While the original method uses one-hot
encodings for their levels, we use dense representations from block2vec (Section III-E) that are mapped to functional levels
after training. These representations make us independent of the number and types of blocks in a level. World-GAN uses 3D
convolutions to process the 3D structures in the given world snippets.
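The coarse-to-fine sampling loop described in the caption (formalized later as Eqs. (1) and (2) in Section III-B) can be sketched as follows. This is a schematic illustration, not the trained networks from the paper: the identity-like `generator` closures and the nearest-neighbor upsampling are stand-in assumptions.

```python
import numpy as np

def upsample(x, factor=2):
    """Nearest-neighbor upsampling of a (C, D, H, W) tensor along the spatial axes."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2).repeat(factor, axis=3)

def sample_cascade(generators, base_shape, sigmas, rng):
    """SinGAN-style sampling: start at the coarsest scale N,
    then upsample and refine with noise at each finer scale."""
    z = rng.normal(0.0, sigmas[-1], size=base_shape)
    x = generators[-1](z)                      # coarsest scale: x_N = G_N(z_N)
    for g, sigma in zip(generators[-2::-1], sigmas[-2::-1]):
        x_up = upsample(x)
        z = rng.normal(0.0, sigma, size=x_up.shape)
        x = x_up + g(z + x_up)                 # x_n = x_{n+1}↑ + G_n(z_n + x_{n+1}↑)
    return x

# Stand-ins for trained generators; a real G_n is a small 3D conv-net.
gens = [lambda t: 0.1 * t for _ in range(3)]
out = sample_cascade(gens, base_shape=(32, 4, 4, 4), sigmas=[0.1, 0.5, 1.0],
                     rng=np.random.default_rng(0))
print(out.shape)  # each of the 2 refinement steps doubles D, H and W
```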

II. RELATED WORK

The field of PCGML has seen many advances in recent years, due to the growing capabilities of Machine Learning algorithms. Besides classical methods like Markov Random Fields and GANs [11], its methods have been used for level generation in Candy Crush Saga and Super Mario Bros. (SMB) [5]. A recent approach framed PCG as a Reinforcement Learning (RL) problem and generated Zelda and Sokoban levels [12] using a Deep RL agent. Our method is inspired by our previous work TOAD-GAN [10], which extended SinGAN [13] to token-based games by using a hierarchical downsampling operation.

In contrast to these existing studies, we propose a method for 3D level generation in Minecraft that adapts the idea of token embeddings from Natural Language Processing (NLP) to overcome memory bottlenecks and manually-defined token hierarchies. Embeddings of game entities have not been used in PCG, but were posed as a future research direction in [14].

A game where PCG plays an essential role is Minecraft [7]. The complex 3D structures in this game pose a problem for PCGML methods. The AI Settlement Generation Challenge [15]–[17] was recently created to spur research in this direction. The submitted algorithms generate villages in a given world and are evaluated using subjective measures (adaptability, functionality, narrative, aesthetics). The creators of the challenge mention data-driven approaches as a future direction of PCG in Minecraft, which was one motivation for our work. One method to increase the diversity of the generated content was published by Green et al. [18], which generates floor plans using a constrained-growth algorithm and Cellular Automata.

Several simplified Minecraft-inspired simulators were proposed [19], [20] to study the creative space of 3D structures. Grbic et al. [21] introduce the problem of open-ended procedural content generation in Minecraft. Sudhakaran et al. [22] use the Neural Cellular Automata architecture to produce a fixed structure in Minecraft given a seed or partial structure. Yoon et al. [23] classify Minecraft villages into several themes (e.g. medieval, futurist, asian) but do not perform PCG. There have been experiments to generate Minecraft structures based on user-defined content [24], but the results were not satisfying. Our proposed World-GAN is one of the first practical PCGML applications for Minecraft.

III. METHOD

Our method builds upon several existing techniques, which are briefly described in this section before we introduce World-GAN and our block2vec algorithm.

A. Generative Adversarial Networks

World-GAN is based on the GAN [25] architecture. Given a dataset, these networks are able to generate new samples that are similar to the provided examples. They are trained by using two adversaries, a generator G and a discriminator D. The generator is fed a random noise vector z and produces an output x̃. Then, the discriminator is either given a real sample x or the generated one and has to predict whether the sample is from the real dataset or not. By learning to fool the discriminator, the generator gradually produces more and more samples that look as if they belong to the training distribution. One problem with this architecture is that it requires a lot of data. Otherwise, it is too easy for the discriminator to distinguish between real and fake samples and the generator is not able to improve its output.

B. SinGAN

SinGAN [13] enables the generation of images from only one example by using a fully-convolutional generator and
discriminator architecture. Thus, the discriminator only sees
one part of the sample and can more easily be fooled by
the generator. Because the field of view in this architecture
is limited, long-range correlations can only be modeled by
introducing a cascade of generators and discriminators that
operate at N different scales. The samples for each scale are
downsampled and the GANs are trained beginning from the
smallest scale N
x̃N = GN(zN).  (1)
This scale defines the global structure of the generated
sample, which will be refined in the subsequent scales. At
scales 0 ≤ n < N , the output from the previous scale
is upsampled (↑) and passed to the scale’s generator after
disturbing it with a noise map zn ∼ N(0, σn²). The variance
of the noise determines the amount of detail that will be added
at the current scale by the generator to produce

x̃n = x̃n+1↑ + Gn(zn + x̃n+1↑).  (2)

At each scale, the discriminator either receives a downsampled real sample xn or the output of the generator with equal probability. The gradient of the discrimination loss is then propagated to the discriminator and the generator, which creates the Minimax problem

min_{Gn} max_{Dn} Ladv(Gn, Dn) + αLrec(Gn).  (3)

The loss Ladv is the widely-used Wasserstein GAN with Gradient Penalty (WGAN-GP) [26], [27] loss and Lrec is a reconstruction loss weighted by α, which ensures that the GAN's latent space contains the real sample². After training on one scale has converged, the parameters of the generator and discriminator are copied to the next scale as an initialization.

²For a more detailed description see [13] and [10].

C. TOAD-GAN

As SinGAN is designed for modeling natural images, its application to token-based games requires some modifications. TOAD-GAN [10] introduces several changes to SinGAN's architecture. Small structures that consist of only a few or a single token would be missing at lower scales due to aliasing by the downsampling operation. The bilinear downsampling is thus replaced by a special downsampling operation that considers the importance of a token in comparison with its neighbors. The importance is determined using a hierarchy that is constructed by a heuristic which is motivated by the TF-IDF metric from NLP. These extensions allow TOAD-GAN to be applied to SMB and several other 2D token-based games.

However, the generation of 3D content requires some changes to the network architecture of TOAD-GAN. The jump from 2D to 3D means the size of samples will be significantly bigger, and since TOAD-GAN is using one-hot encodings of tokens, the required GPU space grows substantially. This shortcoming is especially apparent in Minecraft, where the high number of tokens can drastically limit the volume that TOAD-GAN is able to generate. To put this difference into perspective, a one-hot encoded tensor of the original SMB level 1-1 has a shape of 202 × 16 with 12 (out of 28 possible) different tokens. Taking only the actually present tokens into account, this results in 38,784 floating point numbers, which take up 0.16 MB. The village example by comparison has a shape of 121 × 136 × 33 with 71 (out of 300+ possible) different tokens, resulting in 38,556,408 numbers that require 154.23 MB to store. If the data is not preprocessed so that only present tokens are taken into account, the difference becomes even steeper.

D. World-GAN

While the overall architecture of World-GAN in Fig. 2a is similar to TOAD-GAN, the 3D structure of Minecraft levels requires several modifications. The generator and discriminator now use 3D convolutional filters that can process the k × D × H × W sized slices from the input level. Here, k is the number of tokens in a level and D, H and W are the depth, height and width of the slice. Fig. 2b shows a visualization of the 3D convolution operation.

Another difficulty is the number of tokens in Minecraft and their long-tailed distribution, i.e. some of the tokens only appear a few times in a given sample whereas others (such as air) take up half of the map. To make World-GAN independent of the number of tokens, we turn to a technique from NLP.

Fig. 3: Embeddings learned by block2vec of the ruins structure. The embeddings have 32 dimensions but are transformed to two dimensions for this visualization using the Minimum Distortion Embedding method [28].

TABLE I: Structure coordinates of our example areas in DREHMAL:PRIMΩRDIAL [29] (visualizations are shown in Fig. 4).

Structure     x                 y               z            Volume
desert        [-3219, -3132]    [2628, 2717]    [116, 128]    92 916
plains        [1082, 1167]      [1110, 1186]    [65, 103]    245 480
ruins         [1026, 1077]      [1088, 1152]    [63, 73]      32 640
beach         [606, 695]        [-688, -629]    [39, 64]     131 275
swamp         [-2753, -2702]    [3242, 3296]    [56, 86]      82 620
mine shaft    [24987, 25029]    [-799, -754]    [20, 38]      34 020
village       [25165, 25286]    [-770, -634]    [55, 88]     543 048

E. block2vec

Previous works on GANs [5], [10] for PCGML use a one-hot encoding of each token in a level. The downsampling in TOAD-GAN's architecture requires a hierarchy of tokens to enable the generation of small structures at lower scales. This heuristic, based on term frequencies, has its limitations due to the large number of available tokens in Minecraft, which is also constantly expanding in newer versions of the game. In NLP, this problem was solved by learning a dense fixed-size representation of words [1]. These embeddings are constructed by modeling the joint probability of a token and its context.

To train these block2vec token embeddings, we construct a dataset of blocks and their neighbors from the area of interest, i.e. we create block2vec embeddings for each new area we want to train on. Some tokens such as air occur relatively often, which can lead to sub-optimal representations. Following [1], we mitigate this imbalance by sampling the tokens according to their occurrence probability

P(bi) = (√(f(bi)/0.001) + 1) · 0.001/f(bi),  (4)

where f(bi) is the frequency of the token bi in our given training sample.
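Eq. (4) can be computed directly from token counts. A minimal sketch, where the token counts are hypothetical example statistics for a small world snippet:

```python
from collections import Counter
from math import sqrt

def keep_probability(freq, t=0.001):
    """Sampling probability of a token with relative frequency `freq` (Eq. 4).
    Frequent tokens such as air get probabilities well below 1."""
    return (sqrt(freq / t) + 1.0) * (t / freq)

# Hypothetical token counts for a small world snippet.
counts = Counter({"air": 90_000, "stone_bricks": 800, "stone_brick_stairs": 12})
total = sum(counts.values())
probs = {tok: keep_probability(n / total) for tok, n in counts.items()}
# Rare tokens end up with P > 1 and are therefore always kept.
assert probs["air"] < probs["stone_bricks"] < probs["stone_brick_stairs"]
```

Note that P(bi) is strictly decreasing in f(bi), so the most common tokens are downsampled the most aggressively.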
We use a skip-gram model with two linear layers, i.e. predicting the context from the target token. Since our vocabulary
is still relatively small in comparison to other NLP tasks, we do
not have to employ negative sampling like Mikolov et al. [1].
This algorithm can be seen as a kind of matrix factorization
[30] into an m-dimensional token representation and a token
affinity matrix. Using a dimensionality reduction technique
(MDE) [28], we can visualize our token embeddings (Fig. 3).
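A minimal numpy sketch of such a two-linear-layer skip-gram forward pass; the toy vocabulary size and embedding dimension are illustrative assumptions, and the paper's actual model is trained with gradient descent on (target, neighbor) pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 8, 4          # the paper uses a 32-dimensional embedding

# Two linear layers: token id -> m-dim embedding -> context distribution.
W_in = rng.normal(scale=0.1, size=(vocab_size, embed_dim))   # embedding table
W_out = rng.normal(scale=0.1, size=(embed_dim, vocab_size))  # context projection

def skipgram_forward(token_id):
    """Predict the distribution over context tokens for one target token."""
    h = W_in[token_id]                         # dense block2vec-style embedding
    logits = h @ W_out
    e = np.exp(logits - logits.max())
    return e / e.sum()                         # softmax over the vocabulary

p = skipgram_forward(3)
print(p.shape)
```

After training, the rows of `W_in` are the token embeddings; a nearest-neighbor lookup in this table is what maps generated vectors back to tokens.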
The generators and discriminators in World-GAN are given the levels in this representation space, i.e. the generator produces an m × D × H × W tensor that is fed to the discriminator. After training is complete, the generated levels can be turned into a valid Minecraft level by choosing the token whose embedding is the nearest neighbor to the generator's output for each voxel. In contrast to the size calculated in Section III-C, the tensor for the village example now does not depend on the dimensionality of 71 in the token dimension but on the size of the token embedding. We choose 32 in our experiments, resulting in a tensor using 69.51 MB instead. The size of 32 was empirically chosen, so it can be reduced even further by choosing a smaller representation dimensionality where appropriate.

In addition to reduced memory requirements, block2vec also allows us to omit the definition of a hierarchy. The token hierarchy was proposed in [10] to enable the generation of small or rare tokens at lower scales. Making sure rare tokens are generated is especially important in video game levels, as gameplay-relevant secrets or power-ups are usually rare and hidden on purpose. In our new embedding space, rare tokens are placed close to semantically similar, more common tokens. For example, in the embeddings shown in Fig. 3, stone brick stairs is a rare token, but its representation is close to one of the most common tokens, stone bricks. With this, a generator at higher scales can more easily learn to generate the rarer token even if the more common token was generated one scale below.

Finally, choosing a different mapping from internal representations to tokens allows us to change the style of the generated content after training, which we demonstrate in Section IV-D.

Fig. 4: Qualitative samples of our generated levels using block2vec. Using our learned representation space, World-GAN is able to create convincing samples of different areas while still including meaningful variations. It excels at generating samples in which structures can be freely added to and subtracted from, like ruins.

IV. EXPERIMENTS

We perform several experiments to evaluate the capabilities of our method, which we will describe in the following sections. After showing some qualitative samples for a range of different areas in Minecraft, we evaluate several metrics, such as the Tile Pattern KL-Divergence (TPKL-Div) and the Levenshtein distance. We prove the effectiveness of our block2vec embedding by comparing it to variants of TOAD-GAN which we extended to 3D. We call these variants TOAD-GAN 3D and TOAD-GAN 3D*, for a TOAD-GAN extended to 3D with and without hierarchical³ downsampling respectively. Additionally, we present one version of World-GAN that we train on embeddings of the token descriptions from a general purpose NLP model called BERT [31]. Finally, we change the mapping from our representation space to tokens to showcase the possibility of editing World-GAN's output after training.

³The hierarchy uses a token frequency based heuristic like in [10].

A. Qualitative Examples

Our goal is to generate areas for Minecraft that are similar to a user-defined world snippet but show a reasonable amount of variation. There is no restriction on what kind of blocks are in the snippet; therefore, any type of area (biomes, buildings, plants) can be used for training World-GAN. To demonstrate the broad applicability of our method we choose a variety of user-defined biomes, like a desert, plains and a beach, in our experiments. We also want to investigate the capability of the method to create simple structures. For this we select samples with buildings or natural structures like ruins, swamp trees, a mine shaft and a village. For reproducibility, we extracted all of these samples from the handcrafted world "DREHMAL:PRIMΩRDIAL" [29], which is available online. The coordinates at which the areas can be found are shown in Table I. The village and the mine shaft are drawn from areas of the original Minecraft World Generator. All other areas are taken directly from handcrafted biomes in [29].

Fig. 4 shows each of the different areas and one sample generated with our proposed World-GAN. We can see that World-GAN is capable of reproducing a convincing sample of the area while still showing some variability. Even structures can be generated to a certain degree, such as ruins and trees. However, as World-GAN is not optimized to make coherent, functioning structures, the details of generated houses, like the interior blocks, windows, doors and specific structure, are not entirely correct. Still, the overall structure and style of all areas is captured well by our method.

For comparison, Fig. 5 shows a sample generated by using a simple hierarchy as in [10]. The rank of a block in the hierarchy is defined by the inverse of the sum of its occurrences, i.e. the blocks occurring most often, like air and grass blocks, are lower in the hierarchy, while blocks that are rare, such as stairs and a chest, are higher. We can see that TOAD-GAN 3D also works with its simple hierarchy. However, it is apparent that rarer blocks are not being generated as much. Especially in the area where grass and air blocks meet in the upper part of Fig. 5a, the rarer tall grass block is not generated.

B. Quantitative Evaluation

In this section, we describe three different metrics to evaluate our generated levels: block distribution histograms, the TPKL-Div and the Levenshtein distance.

1) Block Distribution: As shown in Section IV-A, especially rare tokens are difficult for PCGML methods to model. By counting the occurrences and visualizing them as histograms, we can empirically study whether our method is able to produce rare tokens given the limited number of examples. Fig. 6 shows how World-GAN compares to TOAD-GAN 3D under this metric. Both methods have varying success generating rare tokens. Usually, the discriminator will be more likely to label a sample with a rare token as fake. However, for World-GAN with block2vec the rare tokens share some similarities with other more common tokens and are not as easy to detect. This leads to more of them being generated, as can be seen in the bottom row of Fig. 6. Only the chest token is missing from World-GAN's output. As its embedding in Fig. 3 is also further away from the other tokens, we hypothesise that it might be too easy for the discriminator to detect compared to the other rare tokens. Developing different embeddings that generalise better between common and rare tokens is a direction for future work.

2) Tile Pattern KL-Divergence: Next, we evaluate how well the patterns in our generated content match the original sample based on the metrics used in [10]. The TPKL-Div [32] is the Kullback-Leibler Divergence of token patterns of size n that occur in a level. We apply it to Minecraft by considering all n × n × n patterns in our generated content

DKL(P||Q) = Σ_{s ∈ N^{n×n×n}} P(s) log(P(s)/Q(s)),  (5)

where P(s) is the frequency of the pattern s in the original level and Q(s) is its frequency in the generated level. We choose patterns of size 5 × 5 × 5 and 10 × 10 × 10 and average their resulting TPKL-Divs. In preliminary experiments we found that using 4 scales of sizes 1.0, 0.75, 0.5 and 0.25
(a) TOAD-GAN 3D (b) World-GAN (block2vec)
Fig. 5: Qualitative comparison between content from TOAD-GAN 3D and our block2vec approach. (a) was generated with a
simple hierarchy based on the method originally described in [10], and (b) is a sample from the same generator as in Fig. 4.
The samples look very similar overall and both methods generate viable structures.
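The simple hierarchy used for the TOAD-GAN 3D baseline, where a block's rank follows the inverse of its occurrence count as described in Section IV-A, could be sketched like this; the block counts are hypothetical snippet statistics:

```python
from collections import Counter

def hierarchy_ranks(counts):
    """Rank blocks by the inverse of their occurrence count: frequent blocks
    (air, grass) get low ranks, rare blocks (stairs, chest) get high ranks."""
    ordered = sorted(counts, key=lambda tok: counts[tok], reverse=True)
    return {tok: rank for rank, tok in enumerate(ordered)}

# Hypothetical block statistics for a small snippet.
counts = Counter({"air": 50_000, "grass": 9_000, "stone": 4_000,
                  "stairs": 40, "chest": 2})
ranks = hierarchy_ranks(counts)
assert ranks["air"] < ranks["grass"] < ranks["stairs"] < ranks["chest"]
```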

TABLE II: Average Tile-Pattern KL-Divergence between the real structure and 20 generated levels. A lower TPKL-Div implies that the patterns of the original level are matched better.

              World-GAN   TOAD-GAN 3D   TOAD-GAN 3D*
desert           16.28        18.18          18.56
plains           23.05        22.79          22.84
ruins            16.20        16.35          16.51
beach            16.76        15.80          16.22
swamp            20.32        18.41          19.95
mine shaft       14.67        14.64          14.62
village          21.73        21.57          21.35
Average          18.43        18.25          18.58

TABLE III: Average Levenshtein distance between the generated levels. A larger distance implies a larger variability in the generated output.

              World-GAN   TOAD-GAN 3D   TOAD-GAN 3D*
desert          3251.37      1342.24        1305.19
plains          5314.43      3895.96        4372.41
ruins           5073.49      5808.97        5615.67
beach          15 623.57    12 846.66      11 743.38
swamp           7900.03     10 515.40       7292.57
mine shaft      9691.69      6138.71        7764.37
village         5721.88      6452.61        6679.69
Average         7510.92      6714.36        6396.18

in our generation process leads to the best results⁴. We use this configuration for all qualitative samples that are presented in the paper. The results in Table II show that World-GAN with block2vec is able to match the training patterns as well as the other variants while requiring less memory (compare Section III-D).

⁴We evaluated World-GAN with scales (1.0, 0.5), (1.0, 0.75, 0.5) and (1.0, 0.75, 0.5, 0.25). The corresponding results are published with our source code.

3) Levenshtein Distances: Finding an objective measure for the uniqueness of procedurally generated content is no easy task. The Levenshtein distance [33] is an established metric from information theory that measures the similarity of two discrete strings. It is defined as the minimum number of insertions, deletions and substitutions that are needed to transform one string into the other. The distance is bounded from above by the length of the longer string and is equal to zero iff the two strings are equal. We can interpret slices of our generated levels as strings by concatenating the tokens at each position and assigning them a number. This allows us to compute the Levenshtein distance between all generated samples of an area, which lets us quantify their variability. The results are shown in Table III. We find that content generated using block2vec has a higher variability on average. One explanation could be the generation of rare tokens, which are more frequent in World-GAN's output. This supports our findings in Section IV-B1.

C. Alternative Token Embeddings

We evaluate the impact of the token embeddings on the generated output by comparing our block2vec approach to the canonical BERT [31] embeddings of the token descriptions. BERT is a widely used NLP model trained on many sample sentences of the English language and can therefore provide embeddings for any English sentence. For this experiment, we feed the token descriptions (e.g. mossy stone bricks) to a pretrained BERT model [34] and use the final layer's output as our token embedding. Fig. 7 shows a qualitative sample when training with these 768-dimensional embeddings. It is apparent that the patterns are not as closely modeled as with the block2vec embeddings. This could be attributed to the high dimensionality of the embeddings. However, despite not being trained on Minecraft, the general structure of stone ruins with grass around them is still visible. Since the embeddings are created using only their textual description, this experiment indicates a future research direction of grounding World-GAN in natural language. This is especially interesting regarding the recently introduced Chronicle Challenge [17].
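The two quantitative metrics from Section IV-B can be sketched in a few lines; the toy `level` array and the pattern size n = 2 are illustrative assumptions, while the paper uses 5 × 5 × 5 and 10 × 10 × 10 patterns:

```python
import itertools
import math
import numpy as np

def pattern_counts(level, n):
    """Count all n x n x n token patterns occurring in a 3D level."""
    counts = {}
    D, H, W = level.shape
    for i, j, k in itertools.product(range(D - n + 1), range(H - n + 1),
                                     range(W - n + 1)):
        key = level[i:i+n, j:j+n, k:k+n].tobytes()
        counts[key] = counts.get(key, 0) + 1
    return counts

def tpkl_div(real, generated, n=2, eps=1e-6):
    """KL divergence between pattern frequencies of two levels (Eq. 5).
    `eps` smooths patterns that never occur in the generated level."""
    p, q = pattern_counts(real, n), pattern_counts(generated, n)
    p_total, q_total = sum(p.values()), sum(q.values())
    return sum((c / p_total) * math.log((c / p_total) / (q.get(s, 0) / q_total + eps))
               for s, c in p.items())

def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions between sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j-1] + 1, prev[j-1] + (x != y)))
        prev = cur
    return prev[-1]

level = np.zeros((4, 4, 4), dtype=np.int64)
assert tpkl_div(level, level.copy()) < 1e-3   # identical levels match almost perfectly
assert levenshtein("kitten", "sitting") == 3
```

In the paper's setting the Levenshtein distance is applied to token sequences built from level slices, so any hashable per-voxel token id works as the sequence elements here.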
Fig. 7: Sample generated with a World-GAN trained using
BERT embeddings of the token descriptions. Some artifacts
such as floating tall grass are produced but the overall
structure of the map is visible. This indicates that meaningful
level generation within a learned word space is possible.

Fig. 6: Histograms depicting the block distributions for the ruins example, showing mean and variance over 100 generated samples. The y-axis is scaled logarithmically. While there are some differences in the token occurrence counts between our samples and the original, we capture the given distribution reasonably well and are also able to generate most rare tokens.

Fig. 8: An example of a level with edited representations. With block2vec, we can change the style of the generated structure (ruins) from one style (plains) to another (desert).

D. Representation Editing

Similar to TOAD-GAN, levels generated with World-GAN can be edited during generation. The editing capabilities discussed in [10] are all applicable to World-GAN as well. In this section, however, we want to highlight another possibility of editing the generated levels. As described in Section III-E, World-GAN is trained on a specific representation space. Since the generated samples also belong to this representation space, the transformation from that space back to the original tokens is crucial to the style of the generated level. By changing this interpretation of the latent space, we can change the style of the generated tokens without retraining the generator. This results in levels that have the same structure as the original training sample, but can use wildly different tokens. Fig. 8 shows an example of such a transfer. The World-GAN used in this example is the same generator used for the ruins examples in previous sections. For each token vector in the representation space we change the token it is mapped to, in order to fit the new style. In this example, we changed the blocks representing the ground, i.e. grass and dirt, to be interpreted as sand, and the blocks representing the ruins, i.e. stone bricks, to variations of red sandstone. This method allows a designer to use the generator of one basic structure for many different styles.

V. CONCLUSION AND FUTURE WORK

We introduce World-GAN, a method that enables data-driven PCGML in Minecraft. It is inspired by the TOAD-GAN [10] architecture, but is specifically extended to 3D by incorporating 3D convolutions and the application of dense token embeddings, which we call block2vec. We evaluated its generated content with respect to its pattern similarity to the original input, its variability and by how well it handles rare
tokens in the input. Finally, we present an easy way to change the representations after training, which enables us to edit the style of a given level. Especially with the Minecraft Settlement Generation Challenge [16] in mind, a few research directions open up for future work. With the current method, semantic correctness of structures is not enforced, which can result in, for example, nonsensical houses. We want to investigate improving our method in order to better generate semantic structures by incorporating semantic rules into the generation process. Our method opens up several directions for PCGML in Minecraft, as we will publish our source code and are able to handle the most current Minecraft version.

VI. ACKNOWLEDGMENT

This work has been supported by the Federal Ministry of Education and Research (BMBF), Germany, under the project LeibnizKILabor (grant no. 01DD20003), the Federal Ministry for Economic Affairs and Energy under the Wipano programme "NaturalAI" (03THW05K06), the Center for Digital Innovations (ZDIN) and the Deutsche Forschungsgemeinschaft (DFG) under Germany's Excellence Strategy within the Cluster of Excellence PhoenixD (EXC 2122).

REFERENCES

[1] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient Estimation of Word Representations in Vector Space," arXiv:1301.3781 [cs], Sep. 2013.
[2] S. Snodgrass and S. Ontanón, "Generating maps using markov chains," in Ninth Artificial Intelligence and Interactive Digital Entertainment Conference, 2013.
[3] J. Gutierrez and J. Schrum, "Generative Adversarial Network Rooms in Generative Graph Grammar Dungeons for The Legend of Zelda," in 2020 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2020, pp. 1–8.
[4] A. Summerville, S. Snodgrass, M. Guzdial, C. Holmgård, A. K. Hoover,
[15] C. Salge, M. C. Green, R. Canaan, F. Skwarski, R. Fritsch, A. Brightmoore, S. Ye, C. Cao, and J. Togelius, "The AI Settlement Generation Challenge in Minecraft: First Year Report," KI - Künstliche Intelligenz, vol. 34, no. 1, Mar. 2020.
[16] C. Salge, M. C. Green, R. Canaan, and J. Togelius, "Generative design in minecraft (GDMC): Settlement generation competition," in Proceedings of the 13th International Conference on the Foundations of Digital Games. Malmö, Sweden: ACM, Aug. 2018.
[17] C. Salge, C. Guckelsberger, M. C. Green, R. Canaan, and J. Togelius, "Generative Design in Minecraft: Chronicle Challenge," in 10th International Conference on Computational Creativity, ICCC 2019. Association for Computational Creativity (ACC), 2019, pp. 311–315.
[18] M. C. Green, C. Salge, and J. Togelius, "Organic building generation in minecraft," in Proceedings of the 14th International Conference on the Foundations of Digital Games. San Luis Obispo, California, USA: ACM, Aug. 2019.
[19] L. B. Soros, J. K. Pugh, and K. O. Stanley, "Voxelbuild: A minecraft-inspired domain for experiments in evolutionary creativity," in Proceedings of the Genetic and Evolutionary Computation Conference Companion. Berlin, Germany: ACM, Jul. 2017.
[20] C. Patrascu and S. Risi, "Artefacts: Minecraft meets collaborative interactive evolution," in 2016 IEEE Conference on Computational Intelligence and Games (CIG). Santorini, Greece: IEEE, Sep. 2016.
[21] D. Grbic, R. B. Palm, E. Najarro, B. Glanois, and S. Risi, "EvoCraft: A New Challenge for Open-Endedness," in Applications of Evolutionary Computation: 24th International Conference, EvoApplications 2021, Held as Part of EvoStar 2021, Virtual Event, April 7–9, 2021, Proceedings, vol. 12694. Springer Nature, 2021, p. 325.
[22] S. Sudhakaran, D. Grbic, S. Li, A. Katona, E. Najarro, C. Glanois, and S. Risi, "Growing 3D Artefacts and Functional Machines with Neural Cellular Automata," arXiv:2103.08737 [cs], Mar. 2021.
[23] E. Yoon, E. Andersen, B. Hariharan, and R. A. Knepper, "Design mining for minecraft architecture," in AIIDE, 2018.
[24] M. T., "Minecraft-gan-city-generator," https://github.com/BluShine/Minecraft-GAN-City-Generator, 2020.
[25] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Advances in Neural Information Processing Systems, 2014.
[26] M. Arjovsky, S. Chintala, and L. Bottou, "Wasserstein generative adversarial networks," in International Conference on Machine Learning, 2017, pp. 214–223.
[27] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville,
A. Isaksen, A. Nealen, and J. Togelius, “Procedural Content Genera- “Improved Training of Wasserstein GANs,” in Advances in neural
tion via Machine Learning (PCGML),” IEEE Transactions on Games, information processing systems, 2017, pp. 5767–5777.
vol. 10, no. 3, Sep. 2018. [28] A. Agrawal, A. Ali, and S. Boyd, “Minimum-distortion embedding,”
[5] V. Volz, J. Schrum, J. Liu, S. M. Lucas, A. Smith, and S. Risi, “Evolving arXiv, 2021.
mario levels in the latent space of a deep convolutional generative [29] “Drehmal:primordial,” https://www.planetminecraft.com/project/
adversarial network,” in Proceedings of the Genetic and Evolutionary drehmal-v2-prim-rdial-12k-x-12k-survival-adventure-map/, accessed:
Computation Conference, 2018, pp. 221–228. 2021-03-26.
[6] E. Giacomello, P. L. Lanzi, and D. Loiacono, “Doom level generation [30] O. Levy and Y. Goldberg, “Neural Word Embedding as Implicit Matrix
using generative adversarial networks,” in 2018 IEEE Games, Entertain- Factorization,” Advances in neural information processing systems,
ment, Media Conference (GEM). IEEE, 2018, pp. 316–323. 2014.
[7] Mojang Studios, “Minecraft,” 2009, version: 1.16. [31] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training
[8] E. Haines, “Mineways,” https://www.realtimerendering.com/erich/ of Deep Bidirectional Transformers for Language Understanding,” in
minecraft/public/mineways/index.html, version: 8.05. Proceedings of the 2019 Conference of the North American Chapter
[9] Bender Foundation, “Blender,” https://www.blender.org/, version: 2.79. of the Association for Computational Linguistics: Human Language
[10] M. Awiszus, F. Schubert, and B. Rosenhahn, “Toad-gan: coherent style Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186.
level generation from a single example,” in Proceedings of the AAAI [32] S. M. Lucas and V. Volz, “Tile Pattern KL-Divergence for Analysing and
Conference on Artificial Intelligence and Interactive Digital Entertain- Evolving Game Levels,” Proceedings of the Genetic and Evolutionary
ment, vol. 16, no. 1, 2020, pp. 10–16. Computation Conference, Jul. 2019.
[11] V. Volz, N. Justesen, S. Snodgrass, S. Asadi, S. Purmonen, C. Holmgård, [33] V. I. Levenshtein, “Binary Codes Capable of Correcting Deletions,
J. Togelius, and S. Risi, “Capturing Local and Global Patterns in Insertions and Reversals,” Soviet Physics Doklady, vol. 10, p. 707, Feb.
Procedural Content Generation via Machine Learning,” in 2020 IEEE 1966.
Conference on Games (CoG). IEEE, 2020, pp. 399–406. [34] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi,
[12] A. Khalifa, P. Bontrager, S. Earle, and J. Togelius, “Pcgrl: Procedural P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer,
content generation via reinforcement learning,” in Proceedings of the P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. L. Scao, S. Gugger,
AAAI Conference on Artificial Intelligence and Interactive Digital En- M. Drame, Q. Lhoest, and A. M. Rush, “Transformers: State-of-
tertainment, no. 1, 2020. the-art natural language processing,” in Proceedings of the 2020
[13] T. R. Shaham, T. Dekel, and T. Michaeli, “Singan: Learning a generative Conference on Empirical Methods in Natural Language Processing:
model from a single natural image,” in Proceedings of the IEEE System Demonstrations. Online: Association for Computational
International Conference on Computer Vision, 2019, pp. 4570–4580. Linguistics, Oct. 2020, pp. 38–45. [Online]. Available: https:
[14] N. Y. Khameneh and M. Guzdial, “Entity Embedding as Game Repre- //www.aclweb.org/anthology/2020.emnlp-demos.6
sentation,” in Proceedings of the Second Workshop on Experimental AI
in Games - EXAG ’20, Oct. 2020.