The Next Generation of Modeling & Simulation: Integrating Big Data and Deep Learning
Andreas Tolk
SimIS Inc.
Portsmouth, VA, USA
[email protected]
2. Observe the System/Collect Data: The next step is to observe the system and collect data. If the task is to simulate an existing system, studying and observing this system delivers the causal and temporal relation between input and output data. This also gives the frame for follow-on validation, as the simulated system should produce the same causal-temporal relationships later on as well [26].

If a new system needs to be developed, or new concepts need to be evaluated, this phase of data collection is more about gathering additional assumptions, constraints, and expectations, often in the form of additional requirements. If the system is specified in the form of architecture framework artifacts, these data may be used in support of this phase [41]. An example of how to directly derive executables from system architecture artifacts has been published in [15].

3. Develop and Validate a Conceptual Model: Robinson's definition of a conceptual model is often referred to in the M&S community: "a non-software specific description of the computer simulation model (that will be, is or has been developed), describing the objectives, inputs, outputs, content, assumptions and simplifications of the model" [31]. The resulting model needs to be validated, i.e., it needs to be shown that the model is accurate enough to fulfill the requirements that the customer formulated. Validation answers the question: "Did we build the correct model?"

4. Develop and Verify the Computable Model: Next, this model needs to be implemented. This is not a trivial step, as many compromises may be necessary between accuracy and computing time. Many functions need to be numerically approximated, the decision space needs to be discretized, and executable code needs to be distributed to the supporting platforms. Oberkampf et al. conducted an impressive study on errors and uncertainty that are inherent to simulation approaches in [25]. Therefore, once this is accomplished, the simulation needs to be verified, i.e., it needs to be shown that the conceptual model was accurately transformed into the simulation. Verification answers the question: "Did we build the model correctly?"

The state of the art in verification and validation has recently been summarized in [34]. Bair studied the currently

6. Conduct the Simulation and Collect Data: The validated and verified approach is then used to conduct the simulation experiments, which in themselves produce more data, which can be used to validate the experiments themselves. In addition, these data provide the foundation for the analysis to follow. It is worth mentioning that the pure selection of important data to be collected is in itself a modeling process as well, as the selection of data results in a data model of the data deemed important to collect, and this data model is in itself another simplification and abstraction of reality.

7. Evaluate the Experiment: Although written by experts in the military domain, the "NATO Code of Best Practice for Command and Control Assessment" [35] as well as the "Guide for Understanding and Implementing Defense Experimentation" [4] provide valuable lessons learned and good practices for all domains. They address, among other topics, the need for sensitivity analysis of solutions to understand how stable they are, the necessity of repeatable experiments in support of collecting statistically relevant data to cope with uncertainty and risk, and more pitfalls when conducting and evaluating simulation experiments in complex domains.

Another aspect of interest is the use of visualization to represent the simulation and its results. As discussed in [7], visualization has its own rhetoric and needs to be aligned and harmonized with the model and the simulation.

8. Present the Results: Finally, the results need to be presented to the customer. The focus should be the recommended solution, not the simulation used. However, clearly stating the assumptions and constraints that frame the results is essential. It would be unethical to oversell or misinterpret results to please the customer [27].

In light of the insights described here, Big Data and Deep Learning can revolutionize M&S, just as M&S can revolutionize the use of Big Data and Deep Learning. Big Data surely will significantly influence the steps Design the Experiment and Obtain Data and Observe the System/Collect Data, as the methods described are able to find and evaluate the applicability of data in the scenario design and initialization processes as well as in the evaluation of results.
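Such an applicability check can be sketched as a small scoring routine. The dataset names, field names, and coverage values below are hypothetical, chosen only to illustrate the idea of ranking candidate data sources against a scenario's initialization needs:

```python
# Sketch: ranking candidate data sources for scenario initialization.
# All dataset descriptors below are hypothetical, for illustration only.

REQUIRED_FIELDS = {"road_segment", "speed_limit", "observed_flow"}

candidates = [
    {"name": "open_street_extract", "fields": {"road_segment", "speed_limit"}, "coverage": 0.95},
    {"name": "crowd_sensor_feed", "fields": {"road_segment", "observed_flow"}, "coverage": 0.60},
    {"name": "survey_2014", "fields": {"speed_limit"}, "coverage": 0.30},
]

def applicability(dataset, required=REQUIRED_FIELDS):
    """Score = fraction of required fields provided, weighted by spatial coverage."""
    provided = len(dataset["fields"] & required) / len(required)
    return provided * dataset["coverage"]

# Rank the candidate sources from most to least applicable.
ranked = sorted(candidates, key=applicability, reverse=True)
for d in ranked:
    print(f"{d['name']}: {applicability(d):.2f}")
```

In a realistic setting the score would of course also weigh data quality, timeliness, and licensing, but even this minimal screen makes the selection of initialization data explicit and repeatable.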
The openness of these efforts has the potential to overcome biases of the model developer, as they will potentially identify many more data than the traditional model developer may take into account. Similarly, Develop and Validate a Conceptual Model is closely related to Deep Learning, as the identification of correlations and the detection of possible causal functional connections is exactly what is common to both topics. Again, possible biases can be overcome by applying Deep Learning algorithms, as they systematically look into all possible connections and are not limited to the perception of a human developer and his knowledge of the domain.

The following subsections give examples where such ideas have already been implemented, so the ideas proposed here are not just fiction or a vision for the future; the methods are already applied. As this contribution is written as a position paper, the examples were selected mainly for their informative value. They are neither complete nor exclusive, and the reader is encouraged to publish additional examples that are even better applicable in this context.

4.2. Applying Big Data Methods
Internet and web-based methods were always a driver for new modeling and simulation ideas, as shown in [5, 13, 29]. One of the more recent overviews of web-based tools supporting modeling and simulation was provided in [6]. In the context of the ideas evaluated in this paper, two concepts related to such early contributions are data farming for simulation and crowd sourcing for simulation.

Data Farming
The Naval Postgraduate School's (NPS) Simulation, Experiments and Efficient Design (SEED) Center for Data Farming defines data farming as the process of using simulations and computer modeling to grow data, leveraging high-power computing, state-of-the-art experiment designs, and innovative analysis techniques to gain deeper insights from simulation models. As such, data farming seeks to provide decision makers with insights into complex issues by using simulations to produce data. Therefore, we address the step Observe the System/Collect Data from the last subsection.

This goes hand in hand with the very similar concept of data mining, which is applied to find structures and new insights in the developed data [33]. To emphasize the point, as evaluated in [18], one of the important aspects is the unbiased evaluation of data: data farming uses methods and heuristics to discover new insights, very similar to the big data approaches. Data farming also introduces the idea of using an orchestrated set of tools that all contribute to the data set to be evaluated, similar to the recommendations in [35] for operations research studies. The resulting data set again can comprise structured and unstructured data, and the structured data can follow different data schemes that are not aligned with each other. Evaluating such a data set requires the ideas described earlier.

In summary, Big Data methods can be, and in parts already are, successfully applied to efficiently evaluate the data obtained in simulation experiments, in particular when a huge amount of potentially heterogeneous and unstructured data has to be evaluated.

Crowd Sourcing
The ideas of big data are also increasingly used to initialize and execute simulations, or even to set up scenarios. This is often referred to as crowd sourcing. The idea behind this is to use real-world data to initialize your simulation. Two examples that highlight this idea are traffic simulation [11] and the regional impact of potential climate change [40].

In the traffic example, a microscopic traffic simulation, i.e., a simulation with individual cars as simulated entities, is initialized with real-world data. These data describe the general traffic infrastructure, such as street locations and limitations, bus stops, railroad crossings, intersections, etc. They also describe observed cars, traffic flow, and other "live" and non-static information. The presented study concluded that "automatically creating a route network from a crowd-sourced database is possible."

The second example is on a much higher level. In support of a study on the potential influence of sea level rise on the Tidewater area in the south-eastern part of Virginia, the Virginia Modeling Analysis and Simulation Center (VMASC) of Old Dominion University (ODU) developed a decision-support simulation system. Important and critical assets were identified based on the Department of Homeland Security Infrastructure Data Taxonomy and its 18 factors. These factors were not only used to guide a search; they were used to automatically populate parts of the model with identified information. The findings also guided the generation of decision rules based on commonly accepted theories identified to be relevant in this context.

The decision-support simulation system comprised intelligent agents as well as system dynamics constructs that allowed for evaluating the effects from various viewpoints and answering several modeling questions, such as "What is the best combination of factors that would make an area as attractive as possible for as long as possible?" Such questions guided how much a community is willing to invest to secure an area, or to develop areas even if they are in a potential flooding zone, etc. All relevant data were imported from open sources to initialize the system.

Both examples show the high potential of Big Data to support the initialization as well as the development of scenarios and potentially even modeling approaches.

4.3. Applying Deep Learning Methods
The term Deep Learning is so far rarely used in the modeling and simulation community, although many of the methods enumerated in the section above are used. In particular, the use of neural networks was evaluated early [12].

The following example shows how deep learning can help to find better rule sets. In the early nineties, a report published by the Rand Corporation [9] created many discussions in the operational analysis community.

Dewar used a relatively simple combat model to create very counter-intuitive behavior of the overall system, although the underlying rules make perfect sense for the analytic community. Two opposing forces fought in a single battle. Each
2. the force ratio between blue and red forces exceeded a certain value.

Once called, the reserve needed some time to arrive at the battle scene and was integrated into the fighting troops. The battle ended when one of the fighting forces fell under a minimal combat value or the ratio between the two forces exceeded a value perceived to lead to an insurmountable advantage of one side. When the team who conducted the study plotted which side was predicted to win depending on the initial strength, the result showed significantly large areas in which blue and red wins alternated several times in a nearly chaotic pattern. This is shown in the upper part of figure 2.

The work published in [38] showed that these effects were not caused by chaotic behavior of the combat model, but that the rule sets used to engage the reserve units were simply not adequate for the situation. Using multi-layered perceptrons, new rules were learned to ensure better use of the available forces. Using the exact same initial conditions as captured by [9], these new rules were applied to decide when to engage the reserve. The lower part of figure 2 shows the results of these engagements.

At the time this example was conducted, the computational power was only a fraction of what is available today. Although the example shows the potential of using Deep Learning methods and tools, we have only started to perceive their true potential.

5. CONCLUDING REMARKS
This position paper proposes to intensify the research on how to create more synergy between modeling and simulation, Big Data, and Deep Learning.

• Big data provides a means to obtain and evaluate data to an extent so far not accessible to simulation experts. It extends the methods and insights of data mining and data farming and brings them to a new level. It allows crowd sourcing for simulation on a bigger scale than we are currently able to conduct.

Figure 3. Synergy of Simulation, Big Data, and Deep Learning

The synergy is visualized in figure 3. Simulation contributes that, based on all the available data, the observed correlation, and the derived causality, complex interpolation and projection now become possible. Big data and deep learning allow for the observation and identification of new theories: functional connections between provided inputs under given constraints and observed outputs, based on empirical data.
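This loop of learning a candidate theory from observations and then executing it in simulation can be sketched in a few lines. The linear "theory", the synthetic data, and the tolerance below are illustrative assumptions, not taken from the cited work:

```python
# Sketch: learn a functional connection from empirical data, then execute it
# as a simulation over all admissible inputs and check whether it still holds.
# The linear model and the synthetic observations are illustrative only.

observations = [(x, 2.0 * x + 1.0) for x in range(10)]  # empirical (input, output) pairs

def fit_theory(data):
    """Least-squares fit of y = a*x + b — the 'detected functional connection'."""
    n = len(data)
    sx = sum(x for x, _ in data)
    sy = sum(y for _, y in data)
    sxx = sum(x * x for x, _ in data)
    sxy = sum(x * y for x, y in data)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def simulate(theory, inputs):
    """Execute the theory over the whole input range, producing new data."""
    a, b = theory
    return [(x, a * x + b) for x in inputs]

def theory_holds(theory, data, tol=1e-6):
    """Check the executed theory against every available observation."""
    a, b = theory
    return all(abs(a * x + b - y) <= tol for x, y in data)

a, b = fit_theory(observations)
print(f"learned theory: y = {a:.2f}*x + {b:.2f}; holds on data: {theory_holds((a, b), observations)}")
```

A real application would replace the linear fit with a deep network and the consistency check with a full simulation experiment, but the division of labor is the same: data suggest the theory, and simulation stress-tests it.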
These theories become the foundation of a simulation system that executes the new theory to check if it still holds when computed with all possible inputs under all possible constraints. Big data, deep learning, and modeling and simulation therefore support observation, analysis, and application! Computational support of the sciences is therefore perceived to be on the brink of bringing a new era of support to scientists and researchers worldwide.

One field of interest is the modeling and simulation support of system of systems engineering applications. Several examples are given in [30]. Systems of systems are not just complex systems with many potentially non-linear interfaces between them; the challenge is not purely technical. Two of their defining characteristics are that they are operationally and organizationally independent. They have no common governance.

As a result, the data the systems are using are not aligned, and the processes supported and implemented are not harmonized. Big data can help to consolidate the data, and deep learning can be used to harmonize the processes by discovering underlying common functionality in the form of so far undiscovered correlations and possible causalities. Simulation is then used to execute the resulting system of systems representations and can be used to evaluate possible common applications without having to create the system of systems first. The simulation can furthermore be used to evaluate possible emergence within the system of systems before such emergence can have negative effects in real-world operations. Such an application brings all aspects discussed in this position paper together and is highly applicable, as many of the current system of systems challenges are no longer accessible through the traditional reductionist approach. Bringing all three methods together allows a new systemic and holistic approach that is based on unbiased data evaluation by repeatable and well understood mechanisms.

Kapolka, A., and McGregor, D. Extensible modeling and simulation framework (XMSF) challenges for web-based modeling and simulation. Tech. rep., Naval Postgraduate School, Monterey, CA, 2002.

6. Byrne, J., Heavey, C., and Byrne, P. J. A review of web-based simulation and supporting tools. Simulation Modelling Practice and Theory 18, 3 (2010), 253–276.

7. Collins, A., and Knowles Ball, D. Philosophical and theoretic underpinnings of simulation visualization rhetoric and their practical implications. In Ontology, Epistemology, and Teleology for Modeling and Simulation. Springer, New York, NY, 2013, 173–191.

8. Deng, L. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing 3 (2014), e2.

9. Dewar, J. A., Gillogly, J., and Juncosa, M. L. Non-monotonicity, chaos, and combat models. Rand Corporation, 1991.

10. Ezzell, Z., Fishwick, P. A., and Cendan, J. Linking simulation and visualization construction through interactions with an ontology visualization. In Proceedings of the 2011 Winter Simulation Conference, S. Jain, R. Creasey, J. Himmelspach, K. White, and M. Fu, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2011), 2921–2932.

11. Feldkamp, N., and Strassburger, S. Automatic generation of route networks for microscopic traffic simulations. In Proceedings of the 2014 Winter Simulation Conference, A. Tolk, S. Y. Diallo, I. O. Ryzhov, L. Yilmaz, S. Buckley, and J. A. Miller, Eds., Institute of Electrical and Electronics Engineers, Inc. (Piscataway, NJ, 2014), 2848–2859.