


The Next Generation of Modeling & Simulation:
Integrating Big Data and Deep Learning

Andreas Tolk
SimIS Inc.
Portsmouth, VA, USA
[email protected]

SCSC 2015, July 26-29, 2015, Chicago, IL, USA
© 2015 Society for Modeling & Simulation International (SCS)

ABSTRACT
Big data allows users to cope with data that are huge in regard to volume, velocity, variety, and veracity. It provides methods and tools to extract aggregates and new information out of heterogeneously structured data or even completely unstructured data. Deep learning is a collection of algorithms that allows us to discover correlations and learn — supervised and unsupervised — from information provided. This contribution introduces the main ideas and methods of big data and deep learning and shows how they can be applied to various phases of the traditional modeling and simulation process. Big data supports obtaining data for the initialization as well as evaluating the results of the simulation experiment. Deep learning can help with the conceptual modeling phase as well as with the discovery of correlations in the results. Examples of existing applications will be given to prove the feasibility of such ideas. This leads to the observation that big data, deep learning, and modeling and simulation have the potential to lead to a new generation of modeling and simulation applications that provide computational scientific support on a new scale beyond the current capabilities.

Author Keywords
artificial intelligence; big data; deep learning; modeling; simulation.

Computing methodologies — Modeling and simulation
Computing methodologies — Artificial intelligence

1. INTRODUCTION
Two terms are currently used enthusiastically by researchers as well as by business developers: Big Data and Deep Learning!

The Winter Simulation Conference of 2014 had "Exploring Big Data through Simulation" as its theme, showing the importance of this topic for our community. In the expert panel discussion on grand challenges conducted during the Spring Simulation Multi-Conference 2013, Khan called for research on how big simulation can have a similar effect as big data [37]. When the Massachusetts Institute of Technology compiled their list of the most influential technologies of the year 2013 [17], it identified Deep Learning as one of the winners. Why should we as simulation experts care? Do these new technologies, methods, and tools provide us with any innovative functionality that closes gaps in our body of knowledge?

Within this position paper, I will make the case that by bringing all three topics — modeling & simulation, big data, and deep learning — together we will create synergy that will allow us to significantly improve our services to other sciences. To make the case, I will first look into big data and deep learning and give a very short overview of these topics. Following these two sections, I will show how these methods and tools can enrich modeling and simulation approaches and result in a significant leap towards the next generation of modeling and simulation. To show that these ideas are feasible, some examples of applications will be given to prove the concept. Although the paper is not yet supported by the necessary full research, the implications can be clearly shown.

This paper will hopefully result in many discussions and supporting research. It is not meant to replace any tutorials or other introductory material. It has been written to provoke new ideas, even if the reader is not an expert in any of the three topics.

2. BIG DATA
Although the term Big Data is used ubiquitously, there is no generally accepted definition. What exactly people understand big data to mean seems to be very domain dependent. For example, the article by Jacobs [19] looks at big data from the viewpoint of how traditional applications can — or rather cannot — cope with big data in the sense of huge amounts of data. The research he presents addresses the need to be able to scale software to large problems described by a huge amount of data and to develop better access patterns. The pragmatic definition proposed in that paper is that big data should be defined at any point in time as "data whose size forces us to look beyond the tried-and-true methods that are prevalent at that time" [19].

In other words, whenever our data are getting too big to fit into our applications, we have a big data problem. Whenever our database systems or servers are no longer able to provide us in time with the required data, as they can no longer locate and load them in time, we have a big data problem. As a result, Big Data is usually associated with data analyses, data storage, and a huge amount of data that is complex in structure.
When conducting a literature review, most of the articles that address the topic focus on business opportunities, on improving decision making by providing deeper insight, or on very special technical solutions. Ward and Barker conducted a survey of various Big Data definitions as used by influential organizations [42]. They discovered the following definitions:

• Oracle: Big data is the derivation of value from traditional relational database-driven business decision making, augmented with new sources of unstructured data.

• Intel: Big data opportunities emerge in organizations generating a median of 300 terabytes of data a week. The most common forms of data analyzed in this way are business transactions stored in relational databases, followed by documents, e-mail, sensor data, blogs, and social media.

• Microsoft: Big data is the term increasingly used to describe the process of applying serious computing power — the latest in machine learning and artificial intelligence — to seriously massive and often highly complex sets of information.

• Method for an Integrated Knowledge Environment (MIKE), an open source delivery methodology for Enterprise information management: Big data is not a function of the size of a data set but of its complexity. Consequently, it is the high degree of permutations and interactions within a data set that defines big data.

• National Institute of Standards and Technology (NIST): Big data is data which exceed the capacity or capability of current or conventional methods and systems. In other words, the notion of big is relative to the current standard of computation, as is observed in [19].

Overall, these definitions are not satisfying to many scientists. Some definitions are pretty shallow, others introduce concepts that are relative with regard to the available technology. Accordingly, data that are understood to be big may no longer belong to this class when petascale and exascale computers become available.

Today, most experts from the domains of computer science and engineering fall back on a report by the META Group (today part of Gartner) that noted the increasing size of data, the increasing rate at which they are produced, and the increasing range of formats and representations employed. The report [22] identified Volume, Velocity, and Variety as the data characteristics that make data big. To address the important issue of uncertainty in such data sets, which are often derived from hundreds of different sources with different assumptions and constraints, the term Veracity is often introduced as the fourth V.

Although appropriate without doubt, none of these definitions explicitly points towards the new characteristics of data. Many computer engineers are still thinking about large and complex amounts of structured data that are stored in databases. However, one of the main drivers of the Big Data hype is the vast amount of unstructured data, in particular the data provided by social media. The task of deriving useful information from these data is complicated by the variety of devices used for Internet access, social media, and the like. All these challenges require another paradigm shift: instead of bringing the data to the processor, at least part of the processing needs to be brought to the data. To allow for this, data need to be stored, structured, and evaluated accordingly.

A technology that is often referred to when dealing with solutions supporting Big Data is Hadoop [36]. Hadoop is scalable, fault tolerant, and supports fully distributed computing and the sharing of resources. A detailed introduction goes beyond the scope as well as the intention of this article, but it is worth understanding the principles. The basic functionality is mapping unstructured data to key-value pairs, manipulating the resulting lists, and combining the outputs into aggregated values that provide the new insight or knowledge. This chain is called MapReduce. Hadoop uses MapReduce in conjunction with the Hadoop Distributed File System (HDFS) and YARN, Yet Another Resource Negotiator. For the interested reader, [43] gives a good overview of the technical details. For this position paper, the fact that heterogeneously structured or unstructured data are processed into aggregated lists that are well defined sets representing the derived data is very important.
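To make this chain concrete, the following minimal sketch expresses the MapReduce pattern in Python. It only mimics in one process what Hadoop distributes across a cluster, and the word-count task and helper names are illustrative assumptions of this text, not taken from [36] or [43]:

from itertools import groupby
from operator import itemgetter

def map_phase(records):
    # Map: turn each unstructured record into (key, value) pairs.
    for record in records:
        for word in record.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle: group the pairs by key, as the framework does between phases.
    grouped = groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))
    # Reduce: combine all values of one key into a single aggregate.
    for key, group in grouped:
        yield (key, sum(value for _, value in group))

records = ["big data needs new methods", "deep learning needs data"]
print(dict(reduce_phase(map_phase(records))))
# {'big': 1, 'data': 2, 'deep': 1, 'learning': 1, 'methods': 1, 'needs': 2, 'new': 1}

The aggregated list at the end is exactly the kind of well defined derived data set the paragraph above refers to.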
In summary, Big Data methods and technologies help to take structured and unstructured data that are huge regarding volume, velocity, variety, and veracity, and produce aggregates for data scientists, who now can find valuable information in these derived data. Very often, new paradigms are applied, such as bringing the processing power to the data instead of using the data as input for powerful local processors. However, the data scientist is still a very important factor, as he has a significant role in interpreting and sense making.

This is where the second recent big new idea comes into play: using Deep Learning to look for possible correlations and causalities in the identified data.

3. DEEP LEARNING
Deep Learning is often understood as a new domain of machine learning research that deals with learning multiple levels of representation and abstraction that can be discovered in structured data as well as in unstructured data. However, many of the deep learning algorithms are rooted in the domain of artificial intelligence (AI). They only seem new because they take full advantage of new computational resources and recent developments. Deep learning algorithms implement supervised as well as unsupervised learning algorithms. For a good introduction to Deep Learning from a computational perspective, the interested reader is referred to tutorials like [8] or [24].

Deep Learning has to address challenges on several levels, such as how a machine can learn representations from observing the perceptual world, or how we can learn abstractions from observing and evaluating several instances. Is it possible, and how is it possible, to learn hierarchical representations with a few algorithms? Deep learning tries to solve these challenges by using trainable feature hierarchies based on a series of trainable feature transformations, where each transformation connects two internal representations with each other. The algorithms developed help to learn all steps and representations by supervised and unsupervised learning. In other words, the algorithms help to learn the structures as well as the transformations of these structures into each other. The tools and methods applied will immediately be recognized by AI researchers:

• Multilayer and convolutional neural nets process the information through a set of layers of interconnected "neurons." As a rule, supervised learning is applied to learn connections and weights between neurons in hierarchical layers. Convolutional neural nets add the spatially-local aspect by applying a sort of filter with the first set of neurons. These neural nets are trained using supervised learning. Usually, a set of training data is used to learn the desired behavior, and then a set of control data is used to validate the result.

• Deconvolutional neural nets and stacked sparse coding are trained by backward propagation, similar to convolutional nets. Again, supervised learning is used as a rule to calibrate the solutions, which can be validated by control data. However, these nets are also used to discover hierarchical decompositions to extract features from complex inputs, such as images.

• Deep Boltzmann machines and stacked auto-encoders are the most complex types. They use forward as well as backward propagation, supervised and unsupervised learning, and often combine several of the approaches described in this itemization. Very often, energy-based methods are used to bring the calibrated system into a state of an energy minimum, which also minimizes the deviation of the learned functionality from the observable functionality.

All these techniques and methods have statistical counterparts, as in the end the various neural networks learn to approximate observable functionality from the data set used to train them. The better the observed data correlate with the representation by the neural net, the smaller the mistakes. Neural nets, even in the complex and complicated forms used here, are statistically speaking non-linear regression models. The universal approximation theorem states that a standard multilayer feed-forward network with just one single hidden layer, which contains a finite number of hidden neurons, is a universal approximator among continuous functions on compact subsets of the real numbers. All methods above are extensions of this relatively simple network, so they can approximate the hidden functionality that deep learning is interested in. The interested reader is referred to [21] for the details.
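To illustrate this statistical view, the following sketch fits exactly such a single-hidden-layer feed-forward net as a non-linear regression model, using plain gradient descent in Python with NumPy. It is a toy example added for this discussion; the target function, layer width, and learning rate are arbitrary assumptions:

import numpy as np

rng = np.random.default_rng(0)

# Noisy observations of a non-linear target function.
x = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(x) + 0.1 * rng.normal(size=x.shape)

# One hidden tanh layer: the universal approximator of the theorem above.
W1 = rng.normal(scale=0.5, size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)

lr = 0.01
for step in range(5000):
    h = np.tanh(x @ W1 + b1)            # hidden representation
    y_hat = h @ W2 + b2                 # network output
    err = y_hat - y                     # regression residual
    # Backward propagation of the mean squared error.
    gW2 = h.T @ err / len(x); gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print("final MSE:", float((err ** 2).mean()))

The deep architectures listed above stack many such trainable transformations; the training loop stays conceptually the same.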
Deep Learning is used to find functional connections between provided input and observed output data. These can be highly non-linear, complex functions. The software learns to recognize patterns in structured and unstructured data. It can therefore recognize sounds, images, or journal information that is applicable. With the right amount of data, the right mix of algorithms, and the right computational power, many objectives of AI can now be realized.

4. THE NEXT GENERATION OF M&S
Modeling and simulation is a discipline that is made up of two equally important subsections, namely first modeling and second simulation.

Modeling is the task-driven, purposeful simplification and abstraction of a perception of reality. This perception is shaped by physical and cognitive constraints [39]. The access to empirical data describing the system is limited by the physical constraints of the system as well as the sensors. The collection and interpretation is furthermore limited by the cognitive aspects of the modeling experts. These experts have a certain theory in mind that helps them to collect the relevant data, to observe and evaluate them, and to replace observed correlation with assumed causalities. The more they know about the theories applicable in the modeling domain, the better they can create models that reflect these theories, and the better they can find correlations in the data to point them towards which theory is applicable in the observed case. Another aspect is ethical constraints that limit the ability to collect data, in particular in medical modeling, but also in the humanities [1].

Simulation implements models in an executable form. We are mainly interested in computer-based simulations, where the model is the basis for a program that implements the causalities as computable functions that transform input data into output data. The recent developments in computer visualization helped significantly to communicate with the user, as already discussed in [32], and more recently in [3, 10]. The military is now using more and more game-based visualization to create more realistic images to display their simulation results as well as to provide more intuitive controls for the simulated entities. Examples and selected evaluations of the usefulness of such approaches are given in [16].

In the following section, I want to make the case that Big Data and Deep Learning can and already do influence the way we see and conduct modeling and simulation studies. We will start with a short overview of the traditional view of the phases of such a simulation study, then we will look at examples of the application of Big Data and Deep Learning methods, and finally come to an outlook.

4.1. The Traditional View of Modeling and Simulation Studies
The traditional view on how to build valid and credible simulation models is well understood and often taught in tutorials, such as in [23]. The following figure shows the steps that are generally agreed upon; a code sketch of the experimentation steps follows the list.

[Figure 1. Generic Steps in a Traditional Simulation Study]

1. Formulate the Task/Question: The very first challenge in every scientific effort is to understand the question to be answered and/or the task to be conducted. Requirements collection, analysis, and alignment should lead to a good understanding of what needs to be done. However, in many practical tasks the quote attributed to Vince Roske proves to be very valuable: "First find out what the question is — then find out what the real question is."

2. Observe the System/Collect Data: The next step is to observe the system and collect data. If the task is to simulate an existing system, studying and observing this system delivers the causal and temporal relation between input and output data. This also gives the frame for follow-on validation, as the simulated system should produce the same causal-temporal relationships later on as well [26].

If a new system needs to be developed, or new concepts need to be evaluated, this phase of data collection is more about gathering additional assumptions, constraints, and expectations, often in the form of additional requirements. If the system is specified in the form of architecture framework artifacts, these data may be used in support of this phase [41]. An example of how to directly derive executables from system architecture artifacts has been published in [15].

3. Develop and Validate a Conceptual Model: Robinson's definition of a conceptual model is often referred to in the M&S community: "a non-software specific description of the computer simulation model (that will be, is or has been developed), describing the objectives, inputs, outputs, content, assumptions and simplifications of the model" [31]. The resulting model needs to be validated, i.e., it needs to be shown that the model is accurate enough to fulfill the requirements that the customer formulated. Validation answers the question: "Did we build the correct model?"

4. Develop and Verify the Computable Model: Next, this model needs to be implemented. This is not a trivial step, as a lot of compromises may be necessary between accuracy and computing time. Many functions need to be numerically approximated, the decision space needs to be discretized, and executable code needs to be distributed to the supporting platforms. Oberkampf et al. conducted an impressive study on errors and uncertainty that are inherent to simulation approaches in [25]. Therefore, once this is accomplished, the simulation needs to be verified, i.e., it needs to be shown that the conceptual model was accurately transformed into the simulation. Verification answers the question: "Did we build the model correctly?" The state of the art in verification and validation has recently been summarized in [34]. Bair studied the currently used alternative views on this topic and summarized them in [2].

5. Design the Experiment and Obtain Data: As models are task-driven simplifications and abstractions, very often the design of the experiment has already been in the mind of the model developer. Ören and Zeigler therefore use the experimental frame as the general constraint for models [28]. The design of experiments that can be conducted with the simulation system is therefore framed by the assumptions and constraints underlying the conceptualization, as well as by the compromises made during the implementation. In addition, the original question of the customer needs to be answered as well, i.e., operational research methods need to be known as well. A valuable resource is [20].

6. Conduct the Simulation and Collect Data: The validated and verified approach is then used to conduct the simulation experiments, which in themselves produce more data, which can be used to validate the experiments themselves. In addition, these data provide the foundation for the analysis to follow. It is worth mentioning that the pure selection of important data to be collected is in itself a modeling process as well: the selection results in a data model of the data deemed important to be collected, and this data model is in itself another simplification and abstraction of reality.

7. Evaluate the Experiment: Although written by experts in the military domain, the "NATO Code of Best Practice for Command and Control Assessment" [35] as well as the "Guide for Understanding and Implementing Defense Experimentation" [4] provide valuable lessons learned and good practices for all domains. They address, among other topics, the need for sensitivity analysis of solutions to understand how stable they are, the necessity of repeatable experiments in support of collecting statistically relevant data to cope with uncertainty and risk, and more pitfalls when conducting and evaluating simulation experiments in complex domains.

Another aspect of interest is the use of visualization to represent the simulation and its results. As discussed in [7], visualization has its own rhetoric and needs to be aligned and harmonized with the model and the simulation.

8. Present the Results: Finally, the results need to be presented to the customer. The focus should be the recommended solution, not the simulation used. However, clearly stating assumptions and constraints that frame the results is essential. It would be unethical to oversell or misinterpret results to please the customer [27].
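As a toy illustration of steps 5 through 7, the following sketch runs a seeded, repeatable experiment design over a minimal single-server queue and evaluates it with a confidence interval rather than a single run. The model and all numbers are assumptions made for this example only, not taken from the references above:

import random
import statistics

def mm1_mean_wait(arrival_rate, service_rate, customers, seed):
    # Computable model: a minimal single-server queue.
    rng = random.Random(seed)              # seeded for repeatable experiments
    clock = free_at = total_wait = 0.0
    for _ in range(customers):
        clock += rng.expovariate(arrival_rate)     # next arrival
        start = max(clock, free_at)                # wait while the server is busy
        total_wait += start - clock
        free_at = start + rng.expovariate(service_rate)
    return total_wait / customers

# Step 5: a small experiment design over the arrival rate.
for rate in (0.5, 0.7, 0.9):
    # Step 6: repeated replications produce the data for the analysis.
    waits = [mm1_mean_wait(rate, 1.0, 10000, seed) for seed in range(20)]
    mean = statistics.mean(waits)
    half = 1.96 * statistics.stdev(waits) / len(waits) ** 0.5
    # Step 7: evaluate with a confidence interval and sensitivity over rates.
    print("arrival rate", rate, "mean wait", round(mean, 2), "+/-", round(half, 2))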
In the light of the insights on Big Data and Deep Learning described here, they can revolutionize M&S, as M&S can revolutionize the use of Big Data and Deep Learning. Big Data surely will significantly influence the steps Design the Experiment and Obtain Data and Observe the System/Collect Data, as the methods described are able to find and evaluate the applicability of data in the scenario design and initialization processes as well as in the evaluation of results. The openness of these efforts has the potential to overcome biases of the model developer, as they will potentially identify many more data than the traditional model developer may take into account. Similarly, Develop and Validate a Conceptual Model is closely related to Deep Learning, as the identification of correlations and the detection of possible causal functional connections is exactly what is common to both topics. Again, possible biases can be overcome by applying Deep Learning algorithms, as they systematically look into all possible connections and are not limited to the perception of a human developer and his knowledge of the domain.

The following subsections give examples where such ideas have already been implemented, so the ideas proposed here are not just fiction or a vision for the future; the methods are already applied. As this contribution is written as a position paper, the examples were selected mainly for their informative value. They are neither complete nor exclusive, and the reader is encouraged to publish additional examples that are even better applicable in this context.

4.2. Applying Big Data Methods
Internet and web-based methods were always a driver for new modeling and simulation ideas, as shown in [5, 13, 29]. One of the more recent overviews of web-based tools supporting modeling and simulation was provided in [6]. In the context of the ideas evaluated in this paper, two concepts related to such early contributions are data farming for simulation and crowd sourcing for simulation.

Data Farming
The Naval Postgraduate School's (NPS) Simulation, Experiments and Efficient Design (SEED) Center for Data Farming defines data farming as the process of using simulations and computer modeling to grow data, leveraging high-power computing, state-of-the-art experiment designs, and innovative analysis techniques to gain deeper insights from simulation models. As such, data farming seeks to provide decision makers with insights into complex issues by using simulations to produce data. Therefore, we address the step Observe the System/Collect Data from the last subsection.

This goes hand in hand with the very similar concept of data mining, which is applied to find structures and new insights in the developed data [33]. To emphasize the point, as evaluated in [18], one of the important aspects is the unbiased evaluation of data: data farming uses methods and heuristics to discover new insights, very similar to the big data approaches. Data farming also introduces the idea of using an orchestrated set of tools that all contribute to the data set to be evaluated, similar to the recommendations in [35] for operations research studies. The resulting data set again can comprise structured and unstructured data, and the structured data can follow different data schemes that are not aligned with each other. Evaluating such a data set requires the ideas described earlier.
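The growing step of data farming can be pictured in a few lines: run the model over a large experiment design and harvest every record for later evaluation. The model function and record format below are hypothetical stand-ins invented for this sketch, not the SEED Center's actual tooling:

from itertools import product
import json
import random

def run_model(arrival_rate, servers, seed):
    # Hypothetical stand-in for an arbitrary stochastic simulation model.
    rng = random.Random(seed)
    load = arrival_rate / servers * rng.uniform(0.9, 1.1)
    return {"arrival_rate": arrival_rate, "servers": servers,
            "seed": seed, "overloaded": load > 1.0}

# Grow data: a full-factorial design crossed with many replications.
with open("farmed_data.jsonl", "w") as sink:
    for rate, servers, seed in product(range(1, 11), range(1, 6), range(200)):
        sink.write(json.dumps(run_model(rate, servers, seed)) + "\n")
# 10 * 5 * 200 = 10,000 records, ready for mining or MapReduce-style evaluation.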
In summary, Big Data methods can be, and in parts already are, successfully applied to efficiently evaluate the data obtained in simulation experiments, in particular when a huge amount of potentially heterogeneous and unstructured data has to be evaluated.

Crowd Sourcing
The ideas of big data are also used increasingly to initialize and execute simulations, or even to set up scenarios. This is often referred to as crowd sourcing. The idea behind this is to use real-world data to initialize your simulation. Two examples that highlight this idea are traffic simulation [11] and the regional impact of potential climate change [40].

In the traffic example, a microscopic traffic simulation, i.e., a simulation with individual cars as simulated entities, is initialized with real-world data. These data describe the general traffic infrastructure, such as street locations and limitations, bus stops, railroad crossings, intersections, etc. They also describe observed cars, traffic flow, and other "live" and non-static information. The presented study concluded that "automatically creating a route network from a crowd-sourced data base is possible."

The second example is on a much higher level. In support of a study on the potential influence of sea level rise on the Tidewater area in the south-eastern part of Virginia, the Virginia Modeling Analysis and Simulation Center (VMASC) of Old Dominion University (ODU) developed a decision-support simulation system. Important and critical assets were identified based on the Department of Homeland Security Infrastructure Data Taxonomy and its 18 factors. These factors were not only used to guide a search; they were also used to automatically populate parts of the model with identified information. The findings also guided the generation of decision rules based on commonly accepted theories identified to be relevant in this context.

The decision-support simulation system comprised intelligent agents as well as system dynamics constructs that allowed for evaluation of the effects from various viewpoints and for answering several modeling questions, such as "What is the best combination of factors that would make an area as attractive as possible for as long as possible?" Such questions guided how much a community is willing to invest to secure an area, or to develop areas even if they lie in a potential flooding zone, etc. All relevant data were imported from open sources to initialize the system.

Both examples show the high potential of Big Data to support the initialization as well as the development of scenarios and potentially even modeling approaches.
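The initialization pattern behind such examples can be sketched as follows. The record format and field names are invented placeholders for whatever a crowd-sourced database such as the one used in [11] actually delivers:

import json

def load_route_network(path):
    # Build a simulation-ready road graph from crowd-sourced street segments.
    network = {}
    with open(path) as source:
        for line in source:
            segment = json.loads(line)       # one street segment per line
            a, b = segment["from"], segment["to"]
            attrs = {"length_m": segment["length_m"],
                     "speed_limit": segment.get("speed_limit", 50)}
            network.setdefault(a, []).append((b, attrs))
            if not segment.get("one_way", False):
                network.setdefault(b, []).append((a, attrs))
    return network

# Each simulated car then routes over this graph, and "live" observations
# (traffic flow, closures) can be replayed onto the same structure.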
4.3. Applying Deep Learning Methods
The term Deep Learning is so far rarely used in the modeling and simulation community, although many of the methods enumerated in the section above are used. In particular, the use of neural networks was evaluated early [12].

The following example shows how deep learning can help to find better rule sets. In the early nineties, a report published by the RAND Corporation [9] created many discussions in the operational analysis community. Dewar used a relatively simple combat model to create very counter-intuitive behavior of the overall system, although the underlying rules make perfect sense for the analytic community. Two opposing forces fought in a single battle. Each group had five reserve groups that were called under two conditions, namely

1. that the forces fell under a threshold relative to the initial value, or

2. that the force ratio between blue and red forces exceeded a certain value.

Once called, the reserve needed some time to arrive at the battle scene and was integrated into the fighting troops. The battle ended when one of the fighting forces fell under a minimal combat value or the ratio between the two forces exceeded a value perceived to lead to an insurmountable advantage of one side. When the team who conducted the study plotted which side was predicted to win depending on the initial strengths, the result showed significantly big areas in which blue and red wins alternated several times in a nearly chaotic pattern. This is shown in the upper part of figure 2.

[Figure 2. Chaotic and Optimized Combat Results]

The work published in [38] showed that these effects were not caused by chaotic behavior of the combat model, but that the rule sets used to engage the reserve units were simply not adequate for the situation. Using multi-layered perceptrons, new rules were learned to ensure better use of the available forces. Using the exact same initial conditions as captured by [9], these new rules were applied to decide when to engage the reserve. The lower part of figure 2 shows the results of these engagements.
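The rule-learning step can be pictured as a small supervised problem. The sketch below is not the setup of [38]; the features, the labeling rule, and the use of scikit-learn's multi-layer perceptron are assumptions chosen only to show the shape of the approach:

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Hypothetical training data: battle states labeled with the engagement
# decision that led to a good outcome in previously farmed simulation runs.
n = 5000
own = rng.uniform(0.1, 1.0, n)        # own strength relative to initial value
ratio = rng.uniform(0.3, 3.0, n)      # force ratio between blue and red
reserves = rng.integers(0, 6, n)      # reserve groups still available
X = np.column_stack([own, ratio, reserves])

# Stand-in for the "good" decisions mined from the simulation results.
y = ((own < 0.6) & (ratio < 1.2) & (reserves > 0)).astype(int)

net = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
net.fit(X, y)

# The trained net replaces the fixed thresholds as the engagement rule.
state = np.array([[0.45, 0.9, 3]])
print("commit reserve:", bool(net.predict(state)[0]))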
At the time this example was conducted, the computational power was only a fraction of what is available today. Although the example shows the potential of using Deep Learning methods and tools, we have only started to perceive its true potential.

5. CONCLUDING REMARKS
This position paper proposes to intensify the research on how to create more synergy between modeling and simulation, Big Data, and Deep Learning.

• Big data provides a means to obtain and evaluate data to an extent so far not accessible to simulation experts. It extends the methods and insights of data mining and data farming and brings them to a new level. It allows crowd sourcing for simulation on a bigger scale than we are currently able to conduct.

• Deep learning allows for the discovery of functional connections in even vast amounts of data. As such, it can help to discover functional connections, via supervised and unsupervised learning, that a human subject matter expert may not realize. It can utilize the results of big data efforts and analyze them.

Bringing both new technologies together and combining them with modeling and simulation has tremendous potential. An example is given in [14] regarding the ability to evaluate and utilize hundreds of textbooks and the most recent journal articles when diagnosing a patient. Imagine a conceptual model that captures all the knowledge provided by Big Data and analyzed by Deep Learning. And if the resulting system is designed open enough, it can be calibrated every time using newly discovered knowledge that is immediately integrated into the system.

[Figure 3. Synergy of Simulation, Big Data, and Deep Learning]

The synergy is visualized in figure 3. Simulation contributes that, based on all the available data, the observed correlations, and the derived causalities, complex interpolation and projection now become possible. Big data and deep learning allow for the observation and identification of new theories — functional connections between provided inputs under given constraints and observed outputs, based on empirical data — that become the foundation of a simulation system that executes this new theory to check if the theory still holds when computed with all possible inputs under all possible constraints. Big data, deep learning, and modeling and simulation therefore support observation, analysis, and application! Computational support of the sciences is therefore perceived to be on the brink of bringing a new era of support to scientists and researchers worldwide.

One field of interest is the modeling and simulation support of system of systems engineering applications. Several examples are given in [30]. Systems of systems are not just complex systems with many potentially non-linear interfaces between them; the challenge is not purely technical. Two of their defining characteristics are that they are operationally and organizationally independent. They have no common governance.

As a result, the data the systems are using are not aligned, and the processes supported and implemented are not harmonized. Big data can help to consolidate the data, and deep learning can be used to harmonize the processes by discovering underlying common functionality in the form of so far undiscovered correlations and possible causalities. Simulation is then used to execute the resulting system of systems representations and can be used to evaluate possible common applications without having to create the system of systems first. The simulation can furthermore be used to evaluate possible emergence within the system of systems before such emergence can have negative effects in real-world operations. Such an application brings all aspects discussed in this position paper together and is highly applicable, as many of the current system of systems challenges are no longer accessible through the traditional reductionist approach. Bringing all three methods together allows a new systemic and holistic approach that is based on unbiased data evaluation by repeatable and well understood mechanisms.

REFERENCES
1. Anderson, J. G. Social, ethical and legal barriers to e-health. International Journal of Medical Informatics 76, 5 (2007), 480–483.

2. Bair, L. J., and Tolk, A. Towards a unified theory of validation. In Proceedings of the 2013 Winter Simulation Conference, R. Pasupathy, S.-H. Kim, A. Tolk, R. Hill, and M. E. Kuhl, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2013), 1245–1256.

3. Bijl, J. L., and Boer, C. A. Advanced 3D visualization for simulation using game technology. In Proceedings of the 2011 Winter Simulation Conference, S. Jain, R. Creasey, J. Himmelspach, K. White, and M. Fu, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2011), 2815–2826.

4. Bowley, D., Comeau, P., Edwards, R., Hiniker, P., Howes, G., Kass, R., Labbé, P., Morris, C., Nunes-Vaz, R., Vaughan, J., et al. Guide for Understanding and Implementing Defense Experimentation (GUIDEx). The Technical Cooperation Program (TTCP) (2006).

5. Brutzman, D., Zyda, M., Pullen, J. M., Morse, K. L., Fouskarinis, S., Drake, D., Moen, D., Blais, C., Kapolka, A., and McGregor, D. Extensible modeling and simulation framework (XMSF) challenges for web-based modeling and simulation. Tech. rep., Naval Postgraduate School, Monterey, CA, 2002.

6. Byrne, J., Heavey, C., and Byrne, P. J. A review of web-based simulation and supporting tools. Simulation Modelling Practice and Theory 18, 3 (2010), 253–276.

7. Collins, A., and Knowles Ball, D. Philosophical and theoretic underpinnings of simulation visualization rhetoric and their practical implications. In Ontology, Epistemology, and Teleology for Modeling and Simulation. Springer, New York, NY, 2013, 173–191.

8. Deng, L. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing 3 (2014), e2.

9. Dewar, J. A., Gillogly, J., and Juncosa, M. L. Non-monotonicity, chaos, and combat models. RAND Corporation, 1991.

10. Ezzell, Z., Fishwick, P. A., and Cendan, J. Linking simulation and visualization construction through interactions with an ontology visualization. In Proceedings of the 2011 Winter Simulation Conference, S. Jain, R. Creasey, J. Himmelspach, K. White, and M. Fu, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2011), 2921–2932.

11. Feldkamp, N., and Strassburger, S. Automatic generation of route networks for microscopic traffic simulations. In Proceedings of the 2014 Winter Simulation Conference, A. Tolk, S. Y. Diallo, I. O. Ryzhov, L. Yilmaz, S. Buckley, and J. A. Miller, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2014), 2848–2859.

12. Fishwick, P. A. Neural network models in simulation: a comparison with traditional modeling approaches. In Proceedings of the 1989 Winter Simulation Conference, E. A. MacNair, K. J. Musselman, and P. Heidelberger, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 1989), 702–709.

13. Fishwick, P. A. Web-based simulation: some personal observations. In Proceedings of the 1996 Winter Simulation Conference, J. M. Charnes, D. J. Morrice, D. T. Brunner, and J. J. Swain, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 1996), 772–779.

14. Friedman, L. F. IBM's Watson supercomputer may soon be the best doctor in the world. Business Insider, Science (2014).

15. Garcia, J. J., and Tolk, A. Executable architectures in executable context enabling fit-for-purpose and portfolio assessment. The Journal of Defense Modeling and Simulation 12, 2 (2015), 91–107.

16. Glaser, W. R. The impact of user-input devices on virtual desktop trainers. Tech. rep., Naval Postgraduate School, 2010.

17. Hof, R. D. Deep learning. MIT Technology Review (2013).

18. Horne, G. E., and Meyer, T. E. Data farming: Discovering surprise. In Proceedings of the 2004 Winter Simulation Conference, R. G. Ingalls, M. D. Rossetti, J. S. Smith, and B. A. Peters, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2004), 807–813.

19. Jacobs, A. The pathologies of big data. Communications of the ACM 52, 8 (2009), 36–44.

20. Kleijnen, J. P. Design and Analysis of Simulation Experiments, vol. 111. Springer Science & Business Media, New York, NY, 2007.

21. Kürková, V. Kolmogorov's theorem and multilayer neural networks. Neural Networks 5, 3 (1992), 501–506.

22. Laney, D. 3D data management: Controlling data volume, velocity and variety. META Group Research Note 6 (2001).

23. Law, A. M. How to build valid and credible simulation models. In Proceedings of the 2009 Winter Simulation Conference, M. D. Rossetti, R. R. Hill, B. Johansson, A. Dunkin, and R. G. Ingalls, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2009), 24–33.

24. LeCun, Y., and Ranzato, M. Deep learning tutorial. In Tutorials in International Conference on Machine Learning (ICML13), Citeseer (2013).

25. Oberkampf, W. L., DeLand, S. M., Rutherford, B. M., Diegert, K. V., and Alvin, K. F. Error and uncertainty in modeling and simulation. Reliability Engineering & System Safety 75, 3 (2002), 333–357.

26. Ören, T. I. Concepts and criteria to assess acceptability of simulation studies: a frame of reference. Communications of the ACM 24, 4 (1981), 180–189.

27. Ören, T. I., Elzas, M. S., Smit, I., and Birta, L. G. Code of professional ethics for simulationists. In Summer Computer Simulation Conference, Society for Computer Simulation International (2002), 434–435.

28. Ören, T. I., and Zeigler, B. P. Concepts for advanced simulation methodologies. Simulation 32, 3 (1979), 69–82.

29. Page, E. H., Buss, A., Fishwick, P. A., Healy, K. J., Nance, R. E., and Paul, R. J. Web-based simulation: revolution or evolution? ACM Transactions on Modeling and Computer Simulation (TOMACS) 10, 1 (2000), 3–17.

30. Rainey, L. B., and Tolk, A. Modeling and Simulation Support of System of Systems Engineering Applications. John Wiley & Sons, Hoboken, NJ, 2015.

31. Robinson, S. Conceptual modelling for simulation part I: definition and requirements. Journal of the Operational Research Society 59, 3 (2008), 278–290.

32. Rohrer, M. W. Seeing is believing: the importance of visualization in manufacturing simulation. In Proceedings of the 2000 Winter Simulation Conference, J. A. Joines, R. R. Barton, K. Kang, and P. A. Fishwick, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2000), 1211–1216.

33. Sanchez, S. M., and Lucas, T. W. Exploring the world of agent-based simulations: simple models, complex analyses. In Proceedings of the 2002 Winter Simulation Conference, E. Yücesan, C.-H. Chen, J. L. Snowdon, and J. M. Charnes, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2002), 116–126.

34. Sargent, R. G. Verification and validation of simulation models. Journal of Simulation 7, 1 (2013), 12–24.

35. SAS-026, N. NATO Code of Best Practice for C2 Assessment. US DoD CCRP, Washington DC, USA (2002).

36. Shvachko, K., Kuang, H., Radia, S., and Chansler, R. The Hadoop distributed file system. In Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, IEEE (2010), 1–10.

37. Taylor, S. J., Khan, A., Morse, K. L., Tolk, A., Yilmaz, L., and Zander, J. Grand challenges on the theory of modeling and simulation. In Proceedings of the Symposium on Theory of Modeling & Simulation - DEVS Integrative M&S Symposium, Society for Computer Simulation International (2013), 34.

38. Tolk, A. Zur Reduktion struktureller Varianzen [transl.: On the reduction of structural variances]. Pro Universitate Verlag, 1995.

39. Tolk, A. Interoperability, composability, and their implications for distributed simulation: Towards mathematical foundations of simulation interoperability. In Proceedings of the 2013 IEEE/ACM 17th International Symposium on Distributed Simulation and Real Time Applications, IEEE Computer Society (2013), 3–9.

40. Tolk, A., Diallo, S. Y., Padilla, J. J., and Herencia-Zapana, H. Reference modelling in support of M&S — foundations and applications. Journal of Simulation 7, 2 (2013), 69–82.

41. Tolk, A., and Hughes, T. K. Systems engineering, architecture, and simulation. In Modeling & Simulation-based Systems Engineering Handbook, D. Gianni, A. D'Ambrosio, and A. Tolk, Eds., CRC Taylor and Francis Group (Boca Raton, FL, 2014), 11–42.

42. Ward, J. S., and Barker, A. Undefined by data: a survey of big data definitions. arXiv preprint arXiv:1309.5821 (2013).

43. White, T. Hadoop: The Definitive Guide. O'Reilly Media, Inc., Sebastopol, CA, 2012.
