Bridging Concepts and Practice in
eScience via Simulation-driven
Engineering
Rafael Ferreira da Silva¹, Henri Casanova², Ryan Tanaka², Frédéric Suter³
https://wrench-project.org
¹ USC Information Sciences Institute, Marina del Rey, CA, USA
² Information and Computer Sciences, University of Hawaii, Honolulu, HI, USA
³ IN2P3 Computing Center, CNRS, Villeurbanne, France
Disconnect between theoretical and practical works
Theoreticians produce
results that are never used
by practitioners
Practitioners use approaches that
may be vastly suboptimal because
they are not informed by any theory
One of the reasons for this disconnect is that theoretical work must be
done using formally defined models of computation
Ideally, these models are complete enough to be relevant to practice, but
simple enough that obtaining theoretical results (e.g., optimality results,
complexity bounds) is tractable
Real-world experiments are limited
One is limited to particular platform configurations (and sub-configurations)
How can “what if?” scenarios be explored?
How can generality be claimed?
One is limited by specifics of the software infrastructure that impose
constraints on CI application executions
Modifying complex software stacks (often written by others) just to test out ideas is not
feasible
In the end, the scope of real-world experiments is limited, which impedes
progress / discovery
Simulation
When one works in an experimental field in which experiments are
problematic, one resorts to simulation
Physicists understood this decades ago :)
In some fields of Computer Science simulation is a standard research and
development methodology
e.g., Networking, Computer Architecture
Several simulators and simulation frameworks have been developed for
parallel and distributed computing
Some of them were developed explicitly for workflows
Simulation-driven
engineering life cycle
Experimental
simulation
Research
idea
Evaluation of
simulation
results
Research
product
Implementation
onto CI platform
Design of
research
solution
unsatisfactory results
Accurate CI
simulator
Design of CI
simulator
The ability to define
parameterizable services is key for
developing accurate CI simulators,
from which research products
evaluated via experimental
simulation could be seamlessly
integrated into actual CI platforms
The SimGrid framework
SimGrid is a research project
Development of simulation models of hardware/software stacks
Models are accurate (validated/invalidated) and scalable (low computational complexity, low
memory footprint)
SimGrid is usable open-source software
Provides different APIs for a range of simulation needs, e.g.:
S4U: General simulation of Concurrent Sequential Processes
SMPI: Fine-grained simulation of MPI applications
SimGrid is a versatile scientific instrument
Used for (combinations of) Grid, HPC, Peer-to-Peer, Cloud, Fog simulation projects
First developed in 2000, latest release: v3.23.2 (July 2019)
https://simgrid.org
SimGrid’s philosophy
SimGrid’s philosophy: provide low-level abstractions
Advantage: you can do anything with it
Drawback: implementing a simulation of a complex system is a lot of work
Critical analysis:
[Kecskemeti et al.’14] pinpoints exactly the above trade-off:
"SimGrid is more scalable and validated than competing frameworks, but just too much
work when wanting to simulate a WMS that interacts with CI components"
The WRENCH simulation framework
Objective #1: Make it easy to develop simulators of complex CI
application executions
Done by providing high-level, reusable simulation abstractions
Objective #2: Produce accurate and scalable simulations
Done by building on SimGrid
Let’s look at an example system one can simulate with WRENCH…
System to simulate
WRENCH core services
Simulation core
All necessary simulation models and base
abstractions (computing, communicating,
storing), provided by SimGrid
Simulated core CI services
Abstractions for simulated CI
components to execute
computational workloads
Compute Services
Provide mechanisms for
executing application tasks,
which entail I/O and
computation
cloud
bare-metal virtualized cluster
batch-scheduled cluster
Storage Services
Store application files, which
compute services can then
read or write when
executing tasks that
read/write files
File Registry Services
Databases of key-value pairs
of storage services and file
replicas
Network Proximity Services
Monitor the network and
maintain a database of
host-to-host network distances
Workflow Management
System
Provides the mechanisms for
executing workflow
applications, including
decision-making for
optimizing various objectives
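To make the roles of these services concrete, here is a minimal sketch in plain Python. It is not the WRENCH API: the class names `FileRegistry` and `NetworkProximity`, the helper `pick_closest_replica`, and all host names and distances are hypothetical, chosen only to illustrate how a workflow management system might combine a replica database with host-to-host distances when deciding where to read a file from.

```python
from collections import defaultdict


class FileRegistry:
    """Toy key-value store: file name -> storage services holding a replica."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def add_entry(self, file_name, storage_service):
        self._replicas[file_name].add(storage_service)

    def lookup(self, file_name):
        # Return a copy so callers cannot mutate the registry by accident.
        return set(self._replicas.get(file_name, set()))


class NetworkProximity:
    """Toy database of symmetric host-to-host distances (arbitrary units)."""

    def __init__(self):
        self._distance = {}

    def record(self, host_a, host_b, distance):
        self._distance[frozenset((host_a, host_b))] = distance

    def distance(self, host_a, host_b):
        if host_a == host_b:
            return 0.0
        # Pairs that were never measured are treated as infinitely far away.
        return self._distance.get(frozenset((host_a, host_b)), float("inf"))


def pick_closest_replica(file_name, compute_host, registry, proximity):
    """Return the replica location closest to compute_host, or None if no replica exists."""
    candidates = registry.lookup(file_name)
    if not candidates:
        return None
    # Sorting first makes tie-breaking deterministic.
    return min(sorted(candidates), key=lambda s: proximity.distance(compute_host, s))


# Hypothetical platform: one compute host and two storage services.
registry = FileRegistry()
registry.add_entry("input.dat", "storage_A")
registry.add_entry("input.dat", "storage_B")

proximity = NetworkProximity()
proximity.record("compute_1", "storage_A", 12.0)
proximity.record("compute_1", "storage_B", 3.5)

closest = pick_closest_replica("input.dat", "compute_1", registry, proximity)
print(closest)  # storage_B
```

Choosing the nearest replica is only one possible decision rule; a real workflow management system could just as well optimize for load, cost, or energy.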
WRENCH’s impact on
CI research
Accuracy: the ability to capture the
behavior of a real-world system with
as little bias as possible
Scalability: the ability to simulate
large systems with as few CPU cycles
and bytes of RAM as possible
Empirical cumulative distribution function of task
completion times for sample real-world (“pegasus” and
“workqueue”) and simulated (“wrench”) executions.
Simulation Accuracy and Scalability
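One way to quantify how closely a simulated distribution of task completion times tracks a real one is to compare their empirical cumulative distribution functions. The sketch below is an illustration only: the sample values are made up, and the maximum vertical gap between the two curves (the two-sample Kolmogorov-Smirnov statistic) is one reasonable metric, not necessarily the one used by the authors.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def F(x):
        return bisect_right(ordered, x) / n

    return F


def max_ecdf_gap(real, simulated):
    """Largest vertical gap between the two ECDFs, evaluated at every sample point."""
    f_real, f_sim = ecdf(real), ecdf(simulated)
    points = sorted(set(real) | set(simulated))
    return max(abs(f_real(x) - f_sim(x)) for x in points)


# Hypothetical task completion times in seconds, for illustration only.
real_times = [10.0, 12.0, 15.0, 20.0]
simulated_times = [11.0, 12.0, 16.0, 20.0]

gap = max_ecdf_gap(real_times, simulated_times)
print(gap)  # 0.25
```

A gap close to 0 means the two distributions nearly coincide; a gap close to 1 means they barely overlap.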
[Figure: power consumption (W, left) and energy consumption (kWh, right) versus number of cores (1 to 12); series: estimation, real, wrench]
WRENCH’s impact on
CI research
Investigated the impact of resource
utilization and I/O operations on
the energy usage, as well as the
impact of executing multiple tasks
concurrently on multi-socket,
multi-core compute nodes
Comparison of power (left) and energy (right) consumption
measurements for a real-world application (“real”) using a
well-known model from the literature (“estimation”) and
our WRENCH model (“wrench”)
Energy-aware Computing
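For readers unfamiliar with power modeling, the sketch below shows a simple linear utilization-based power model of the kind often found in the literature. The slides do not name the “estimation” model, so this is an assumption for illustration, not a reproduction of it or of the WRENCH model. The idle and peak wattages are hypothetical, and the model deliberately ignores I/O and multi-socket effects, which the slides identify as relevant.

```python
def node_power_watts(busy_cores, total_cores, idle_watts, peak_watts):
    """Linear model: power grows proportionally with the fraction of busy cores."""
    if not 0 <= busy_cores <= total_cores:
        raise ValueError("busy_cores must be between 0 and total_cores")
    utilization = busy_cores / total_cores
    return idle_watts + (peak_watts - idle_watts) * utilization


def energy_kwh(power_watts, duration_seconds):
    """Energy at constant power: watts * seconds gives joules, converted to kWh."""
    joules = power_watts * duration_seconds
    return joules / 3.6e6  # 1 kWh = 3.6 million joules


# Hypothetical node: 12 cores, 100 W idle, 220 W at full load, running for one hour.
for cores in (1, 6, 12):
    power = node_power_watts(cores, 12, idle_watts=100.0, peak_watts=220.0)
    print(cores, power, energy_kwh(power, 3600))
```

Under this model, power is a straight line in the number of busy cores, so any curvature in measured data is a sign that other factors are at play.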
WRENCH
Pedagogic Modules
Simulation-driven self-contained
pedagogic modules supported by
WRENCH-based simulators
Activities entail running, through a
Web application, a simulator with
different input parameters
https://wrench-project.org/wrench-pedagogic-modules
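Running a simulator with different input parameters amounts to a parameter sweep. The following sketch uses a deliberately simplistic stand-in for a simulator (identical, independent tasks executed in waves on a fixed number of cores); the function names and the numbers are hypothetical and unrelated to the actual pedagogic modules.

```python
import math


def simulate_makespan(num_tasks, task_seconds, num_cores):
    """Toy model: identical independent tasks run in waves of num_cores tasks each."""
    if num_cores < 1:
        raise ValueError("need at least one core")
    waves = math.ceil(num_tasks / num_cores)
    return waves * task_seconds


def sweep(num_tasks, task_seconds, core_counts):
    """Run the toy simulator once per core count and collect the makespans."""
    return {c: simulate_makespan(num_tasks, task_seconds, c) for c in core_counts}


results = sweep(num_tasks=10, task_seconds=60.0, core_counts=[1, 2, 4, 8])
for cores, makespan in results.items():
    print(f"{cores} cores -> {makespan:.0f} s")
```

Even this toy model exposes a useful observation: going from 4 to 8 cores saves only one wave of tasks, because 10 tasks do not divide evenly across the cores.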
Thank You
Questions?
This work is funded by NSF contracts #1642369 and #1642335; by CNRS
under grant #PICS07239; and partly funded by NSF contracts #1923539
and #1923621.
https://wrench-project.org
Ad

More Related Content

What's hot (13)

Embacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CDEmbacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CD
Nebulaworks
 
A multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker imagesA multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker images
Ahmed Zerouali
 
Utility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDDUtility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDD
XP Conference India
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the Application
Ash Winter
 
DevOps Syllabus summer 2020
DevOps Syllabus summer 2020DevOps Syllabus summer 2020
DevOps Syllabus summer 2020
Len Bass
 
Static Analysis Primer
Static Analysis PrimerStatic Analysis Primer
Static Analysis Primer
Coverity
 
Devops syllabus
Devops syllabusDevops syllabus
Devops syllabus
Len Bass
 
Continuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database ObjectsContinuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database Objects
Prabhu Ramasamy
 
Adopting Agile
Adopting AgileAdopting Agile
Adopting Agile
Coverity
 
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your doorLFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
Eric Smalling
 
Kelis king - engineering approach to develop software.
Kelis king -  engineering approach to develop software.Kelis king -  engineering approach to develop software.
Kelis king - engineering approach to develop software.
KelisKing
 
Source Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code ScriptsSource Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code Scripts
Akond Rahman
 
Static Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with CoverityStatic Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with Coverity
Samsung Open Source Group
 
Embacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CDEmbacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CD
Nebulaworks
 
A multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker imagesA multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker images
Ahmed Zerouali
 
Utility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDDUtility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDD
XP Conference India
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the Application
Ash Winter
 
DevOps Syllabus summer 2020
DevOps Syllabus summer 2020DevOps Syllabus summer 2020
DevOps Syllabus summer 2020
Len Bass
 
Static Analysis Primer
Static Analysis PrimerStatic Analysis Primer
Static Analysis Primer
Coverity
 
Devops syllabus
Devops syllabusDevops syllabus
Devops syllabus
Len Bass
 
Continuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database ObjectsContinuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database Objects
Prabhu Ramasamy
 
Adopting Agile
Adopting AgileAdopting Agile
Adopting Agile
Coverity
 
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your doorLFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
Eric Smalling
 
Kelis king - engineering approach to develop software.
Kelis king -  engineering approach to develop software.Kelis king -  engineering approach to develop software.
Kelis king - engineering approach to develop software.
KelisKing
 
Source Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code ScriptsSource Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code Scripts
Akond Rahman
 
Static Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with CoverityStatic Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with Coverity
Samsung Open Source Group
 

Similar to Bridging Concepts and Practice in eScience via Simulation-driven Engineering (20)

Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Rafael Ferreira da Silva
 
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Rafael Ferreira da Silva
 
Grid Presentation
Grid PresentationGrid Presentation
Grid Presentation
Marielisa Peralta
 
Microkontroler
MicrokontrolerMicrokontroler
Microkontroler
Feira Project
 
01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf
OCRE | Open Clouds for Research Environments
 
Parallex - The Supercomputer
Parallex - The SupercomputerParallex - The Supercomputer
Parallex - The Supercomputer
Ankit Singh
 
AF-2599-P.docx
AF-2599-P.docxAF-2599-P.docx
AF-2599-P.docx
Sami Siddiqui
 
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon PhiDeep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Gaurav Raina
 
Deep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiDeep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon Phi
Gaurav Raina
 
Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2
Gaurav Raina
 
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratoryIRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET Journal
 
Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland
mictc
 
GRID COMPUTING
GRID COMPUTINGGRID COMPUTING
GRID COMPUTING
Abhiram Kanigolla
 
Lecture_IIITD.pptx
Lecture_IIITD.pptxLecture_IIITD.pptx
Lecture_IIITD.pptx
achakracu
 
Presentation for min project
Presentation for min projectPresentation for min project
Presentation for min project
araya kiros
 
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
Benoit Combemale
 
Using Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGridUsing Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGrid
gwickler
 
A REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGA REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTING
Amy Roman
 
Deep learning in manufacturing predicting and preventing manufacturing defect...
Deep learning in manufacturing predicting and preventing manufacturing defect...Deep learning in manufacturing predicting and preventing manufacturing defect...
Deep learning in manufacturing predicting and preventing manufacturing defect...
WMG centre High Value Manufacturing Catapult
 
Computer Oraganisation and Architecture
Computer Oraganisation and ArchitectureComputer Oraganisation and Architecture
Computer Oraganisation and Architecture
yogesh1617
 
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Rafael Ferreira da Silva
 
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Rafael Ferreira da Silva
 
Parallex - The Supercomputer
Parallex - The SupercomputerParallex - The Supercomputer
Parallex - The Supercomputer
Ankit Singh
 
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon PhiDeep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Gaurav Raina
 
Deep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiDeep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon Phi
Gaurav Raina
 
Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2
Gaurav Raina
 
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratoryIRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET Journal
 
Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland
mictc
 
Lecture_IIITD.pptx
Lecture_IIITD.pptxLecture_IIITD.pptx
Lecture_IIITD.pptx
achakracu
 
Presentation for min project
Presentation for min projectPresentation for min project
Presentation for min project
araya kiros
 
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
Benoit Combemale
 
Using Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGridUsing Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGrid
gwickler
 
A REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGA REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTING
Amy Roman
 
Computer Oraganisation and Architecture
Computer Oraganisation and ArchitectureComputer Oraganisation and Architecture
Computer Oraganisation and Architecture
yogesh1617
 
Ad

More from Rafael Ferreira da Silva (20)

Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...
Rafael Ferreira da Silva
 
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsAccurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Rafael Ferreira da Silva
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
Rafael Ferreira da Silva
 
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsOn the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
Rafael Ferreira da Silva
 
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Rafael Ferreira da Silva
 
Automating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsAutomating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific Workflows
Rafael Ferreira da Silva
 
Analysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCAnalysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTC
Rafael Ferreira da Silva
 
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Rafael Ferreira da Silva
 
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Rafael Ferreira da Silva
 
Pegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsPegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computations
Rafael Ferreira da Silva
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and Workflows
Rafael Ferreira da Silva
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Rafael Ferreira da Silva
 
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresExperiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Rafael Ferreira da Silva
 
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
Rafael Ferreira da Silva
 
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific WorkflowsLeveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Rafael Ferreira da Silva
 
A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...
Rafael Ferreira da Silva
 
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Rafael Ferreira da Silva
 
On-line, non-clairvoyant optimization of workflow activity granularity task o...
On-line, non-clairvoyant optimization of workflow activity granularity task o...On-line, non-clairvoyant optimization of workflow activity granularity task o...
On-line, non-clairvoyant optimization of workflow activity granularity task o...
Rafael Ferreira da Silva
 
Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...
Rafael Ferreira da Silva
 
VIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution serviceVIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution service
Rafael Ferreira da Silva
 
Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...
Rafael Ferreira da Silva
 
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsAccurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Rafael Ferreira da Silva
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
Rafael Ferreira da Silva
 
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsOn the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
Rafael Ferreira da Silva
 
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Rafael Ferreira da Silva
 
Automating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsAutomating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific Workflows
Rafael Ferreira da Silva
 
Analysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCAnalysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTC
Rafael Ferreira da Silva
 
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Rafael Ferreira da Silva
 
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Rafael Ferreira da Silva
 
Pegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsPegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computations
Rafael Ferreira da Silva
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and Workflows
Rafael Ferreira da Silva
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Rafael Ferreira da Silva
 
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresExperiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Rafael Ferreira da Silva
 
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...

Bridging Concepts and Practice in eScience via Simulation-driven Engineering

  • 1. Bridging Concepts and Practice in eScience via Simulation-driven Engineering. Rafael Ferreira da Silva (1), Henri Casanova (2), Ryan Tanaka (2), Frédéric Suter (3). https://wrench-project.org. (1) USC Information Sciences Institute, Marina del Rey, CA, USA; (2) Information and Computer Sciences, University of Hawaii, Honolulu, HI, USA; (3) IN2P3 Computing Center, CNRS, Villeurbanne, France.
  • 2. Disconnect between theoretical and practical works. Theoreticians produce results that are never used by practitioners. Practitioners use approaches that may be vastly suboptimal because they are not informed by any theory. One of the reasons for this disconnect is that theoretical work must be done using formally defined models of computation. Ideally, these models are complete enough to be relevant to practice, but simple enough that obtaining theoretical results (e.g., optimality results, complexity bounds) is tractable.
  • 3. Real-world experiments are limited. One is limited to particular platform configurations (and sub-configurations): How can “what if?” scenarios be explored? How can generality be claimed? One is limited by specifics of the software infrastructure that impose constraints on CI application executions: modifying complex software stacks (often written by others) just to test out ideas is not feasible. In the end, the scope of real-world experiments is limited, which impedes progress and discovery.
  • 4. Simulation. When one works in an experimental field in which experiments are problematic, one resorts to simulation. Physicists understood this decades ago :) In some fields of Computer Science, simulation is a standard research and development methodology (e.g., Networking, Computer Architecture). Several simulators and simulation frameworks have been developed for parallel and distributed computing, some of them explicitly for workflows.
  • 5. Simulation-driven engineering life cycle. [Figure: life-cycle diagram with the labels “Research idea”, “Design of research solution”, “Design of CI simulator”, “Accurate CI simulator”, “Experimental simulation”, “Evaluation of simulation results”, “unsatisfactory results”, “Implementation onto CI platform”, and “Research product”.] The ability to define parameterizable services is key for developing accurate CI simulators, from which research products evaluated via experimental simulation could be seamlessly integrated into actual CI platforms.
  • 6. The SimGrid framework. SimGrid is a research project: development of simulation models of hardware/software stacks; models are accurate (validated/invalidated) and scalable (low computational complexity, low memory footprint). SimGrid is open-source, usable software: it provides different APIs for a range of simulation needs, e.g., S4U (general simulation of Concurrent Sequential Processes) and SMPI (fine-grained simulation of MPI applications). SimGrid is a versatile scientific instrument: used for (combinations of) Grid, HPC, Peer-to-Peer, Cloud, and Fog simulation projects. First developed in 2000; latest release: v3.23.2 (July 2019). https://simgrid.org
  • 7. SimGrid’s philosophy: provide low-level abstractions. Advantage: you can do anything with it. Drawback: implementing a simulation of a complex system is a lot of work. Critical analysis: [Kecskemeti et al.’14] pinpoints exactly the above trade-off: "SimGrid is more scalable and validated than competing frameworks, but just too much work when wanting to simulate a WMS that interacts with CI components". https://simgrid.org
  • 8. The WRENCH simulation framework. Objective #1: Make it easy to develop simulators of complex CI application executions. Done by providing high-level, reusable simulation abstractions. Objective #2: Produce accurate and scalable simulations. Done by building on SimGrid. Let’s look at an example system one can simulate with WRENCH… wrench-project.org
  • 10. WRENCH core services. Simulation core: all necessary simulation models and base abstractions (computing, communicating, storing), provided by SimGrid. Simulated core CI services: abstractions for simulated CI components to execute computational workloads. Compute Services: provide mechanisms for executing application tasks, which entail I/O and computation (cloud, bare-metal, virtualized cluster, batch-scheduled cluster). Storage Services: store application files, which the compute services can then read and write when executing tasks. File Registry Services: databases of key-value pairs of storage services and file replicas. Network Proximity Services: monitor the network and maintain a database of host-to-host network distances. Workflow Management System: provides the mechanisms for executing workflow applications, including decision-making for optimizing various objectives.
  • 11. WRENCH’s impact on CI research: Simulation Accuracy and Scalability. Accuracy: the ability to capture the behavior of a real-world system with as little bias as possible. Scalability: the ability to simulate large systems with as few CPU cycles and bytes of RAM as possible. [Figure: Empirical cumulative distribution function of task completion times for sample real-world (“pegasus” and “workqueue”) and simulated (“wrench”) executions.]
  • 12. WRENCH’s impact on CI research: Energy-aware Computing. Investigated the impact of resource utilization and I/O operations on the energy usage, as well as the impact of executing multiple tasks concurrently on multi-socket, multi-core compute nodes. [Figure: Comparison of power (left; power consumption in W vs. number of cores, 1 to 12) and energy (right; energy consumption in kWh vs. number of cores, 1 to 12) consumption measurements for a real-world application (“real”) using a well-known model from the literature (“estimation”) and our WRENCH model (“wrench”).]
  • 13. WRENCH Pedagogic Modules. Simulation-driven, self-contained pedagogic modules supported by WRENCH-based simulators. Activities entail running, through a Web application, a simulator with different input parameters. https://wrench-project.org/wrench-pedagogic-modules
  • 14. Thank You. Questions? This work is funded by NSF contracts #1642369 and #1642335; by CNRS under grant #PICS07239; and partly funded by NSF contracts #1923539 and #1923621. https://wrench-project.org
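
The accuracy comparison on slide 11 is presented as empirical cumulative distribution functions (ECDFs) of task completion times for real and simulated executions. As a minimal sketch of how such a comparison could be quantified, the Python below builds an ECDF and reports the largest vertical gap between two of them (the two-sample Kolmogorov-Smirnov statistic). The function names and the sample values are hypothetical illustrations; the slides do not state which metric, if any, the authors used.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return a function F where F(t) is the fraction of samples <= t."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def F(t):
        # bisect_right counts how many sorted values are <= t
        return bisect_right(ordered, t) / n

    return F


def max_ecdf_gap(real, simulated):
    """Largest vertical distance between the two ECDFs.

    Both ECDFs are right-continuous step functions that only change at
    sample values, so checking every sample value is sufficient.
    """
    f_real, f_sim = ecdf(real), ecdf(simulated)
    points = sorted(set(real) | set(simulated))
    return max(abs(f_real(t) - f_sim(t)) for t in points)


# Hypothetical task completion times in seconds (not data from the slides).
real_times = [10.0, 12.0, 15.0, 20.0]
simulated_times = [11.0, 12.5, 14.0, 21.0]

gap = max_ecdf_gap(real_times, simulated_times)
print(f"max ECDF gap: {gap:.2f}")
```

A gap close to 0 means the simulated distribution of completion times tracks the real one closely; a gap close to 1 means the two distributions barely overlap.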