PRINS: Scalable Model Inference for
Component-based System Logs*
Donghwan Shin1), Domenico Bianculli2), and Lionel Briand2,3)
1) University of Sheffield
2) University of Luxembourg
3) University of Ottawa
* This presentation is for the Journal-First Track at ICSE 2023; the original paper was accepted by the Empirical Software Engineering (EMSE) journal.
[Figure: System Logs → Model Inference Technique → System Model. Each execution contributes one log of timestamped entries, e.g. 20221101.001 A, 20221101.004 B, 20221101.011 Z, 20221101.013 B, 20221101.101 Y, …]
Log = a sequence of log entries representing a single execution flow
Too large (system logs) → Not scalable enough (model inference technique) → No models
081111 090711 25010 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:38382 dest: /10.251.65.203:50010
081111 090711 25181 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.27.63:54730 dest: /10.251.27.63:50010
081111 090711 25487 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:40305 dest: /10.251.65.203:50010
081111 090711 00031 INFO dfs.FSNamesystem: BLOCK* NameSystem.allocateBlock: /user/root/rand8/_temporary/part-00156. blk_5652408071925555972
081111 090756 25011 INFO dfs.DataNode$PacketResponder: PacketResponder 2 for block blk_5652408071925555972 terminating
081111 090756 25011 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.65.203
081111 090756 25184 INFO dfs.DataNode$PacketResponder: PacketResponder 0 for block blk_5652408071925555972 terminating
081111 090756 25184 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.27.63
081111 090756 25488 INFO dfs.DataNode$PacketResponder: PacketResponder 1 for block blk_5652408071925555972 terminating
081111 090756 25488 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.65.203
081111 090756 00027 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.251.71.16:50010 is added to blk_5652408071925555972
081111 111345 00013 INFO dfs.DataBlockScanner: Verification succeeded for blk_5652408071925555972
Example HDFS Log
Component IDs
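The component IDs highlighted in the example log can be pulled out with a small parser. The following is a minimal sketch, assuming that the component ID is the logger-name field that precedes the first colon (e.g. dfs.DataNode$DataXceiver); the regular expression, the LogEntry type, and parse_line are illustrative names, not part of PRINS.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout of an HDFS line: date, time, process id, level, component, message.
LINE_RE = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{6}) (?P<pid>\d+) (?P<level>\w+) "
    r"(?P<component>[^:\s]+): (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    component: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return the parsed entry, or None if the line does not fit the assumed layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return LogEntry(f"{m['date']} {m['time']}", m["component"], m["message"])


sample = [
    "081111 090711 25010 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:38382 dest: /10.251.65.203:50010",
    "081111 090711 00031 INFO dfs.FSNamesystem: BLOCK* NameSystem.allocateBlock: /user/root/rand8/_temporary/part-00156. blk_5652408071925555972",
    "081111 090756 25011 INFO dfs.DataNode$PacketResponder: PacketResponder 2 for block blk_5652408071925555972 terminating",
]

entries = [parse_line(line) for line in sample]
components = [entry.component for entry in entries if entry is not None]
print(components)
```

Lines that do not match the assumed layout are skipped rather than guessed at, so a different log format would need its own pattern.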
Observation: Systems are often composed of multiple components
What if we infer INDIVIDUAL component
models and then stitch them together?
PRINS: PRojection-INference-Stitching
[Figure: the PRINS stages on a running example with two components, x and y.
System Logs (entries as event templates): ax, bx, dy, dy, ax, bx, dy, ax, cx, ey
PRojection: Component x receives ax, bx, ax, bx, ax, cx; Component y receives dy, dy, dy, ey
INference: one model per component (Model of x, Model of y)
Stitching: the component models are combined into a System Model (states s0 to s4, transitions a, b, c, d, e)]
+ (optional) Heuristic Determinisation (HD)
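The projection and inference stages can be sketched on the running example. This is an illustration under stated assumptions, not the PRINS implementation: the ten example entries are assumed to form three executions, a simple prefix tree stands in for the real per-component inference technique, and the stitching stage is omitted. The names project and infer_prefix_tree are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Each log is one execution: a list of (event, component) pairs.
# Assumption: the example entries split into these three executions.
system_logs = [
    [("a", "x"), ("b", "x"), ("d", "y"), ("d", "y")],
    [("a", "x"), ("b", "x"), ("d", "y")],
    [("a", "x"), ("c", "x"), ("e", "y")],
]


def project(logs):
    """Projection: split every log into one sub-log per component."""
    per_component = defaultdict(list)
    for log in logs:
        sub_logs = defaultdict(list)
        for event, component in log:
            sub_logs[component].append(event)
        for component, events in sub_logs.items():
            per_component[component].append(events)
    return dict(per_component)


def infer_prefix_tree(logs):
    """Stand-in inference: build a prefix tree as {(state, event): next_state}."""
    transitions = {}
    next_state = 1
    for log in logs:
        state = 0
        for event in log:
            key = (state, event)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
    return transitions


projected = project(system_logs)

# The per-component inference tasks are independent, so they can be run concurrently.
# Threads are used here only to show that independence.
with ThreadPoolExecutor() as pool:
    models = dict(zip(projected, pool.map(infer_prefix_tree, projected.values())))

for component, model in models.items():
    print(component, model)
```

Each component model is built from a much smaller input than the full system logs, which is the divide-and-conquer idea behind the approach; combining the component models back into a system model is the job of the stitching stage.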
Research Questions
• RQ1: How does the execution time of PRINS change according to the parallel inference tasks in the inference stage?
• RQ2: How does the execution time of HD_u change according to parameter u?
• RQ3: How does the accuracy of the models (in the form of gFSMs) generated by HD_u change according to parameter u?
• RQ4: How fast is PRINS when compared to state-of-the-art model inference techniques?
• RQ5: How accurate are the models generated by PRINS compared to those generated by state-of-the-art model inference techniques?
Grouping: Parallel inference (RQ1); Heuristic Determinisation (RQ2, RQ3); PRINS compared to MINT (RQ4, RQ5)
RQ4: Execution Time of PRINS compared to MINT
[Figure: execution time (s) versus duplication factor for MINT, PRINS-N, and PRINS-P, one panel per subject system: Hadoop, HDFS, Linux, Zookeeper, CoreSync, NGLClient, Oobelib, PDApp.]
PRINS-N = PRINS with no parallel inference (HD is enabled for a fair comparison with MINT)
PRINS-P = PRINS with parallel inference (HD is enabled for a fair comparison with MINT)
Duplication factor = how many times each log is duplicated to increase the input log size systematically
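The duplication factor can be illustrated with a few lines of code. This is a minimal sketch, assuming that a factor of k means each log appears k times in total in the enlarged input; duplicate_logs is a hypothetical helper, not part of the replication package.

```python
def duplicate_logs(logs, factor):
    """Return a log set in which every log appears `factor` times."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    # Copy each log so that the duplicates do not share list objects.
    return [list(log) for log in logs for _ in range(factor)]


logs = [["a", "b", "d"], ["a", "c", "e"]]
for factor in (1, 2, 4, 8):
    enlarged = duplicate_logs(logs, factor)
    print(factor, len(enlarged))
```

Because the duplicates add no new behaviour, the input size grows while the set of distinct executions stays fixed.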
RQ5: Accuracy of PRINS compared to MINT
Downside: Size of System Models
Contributions
• Tame the scalability issue of model
inference using divide-and-conquer.
• Present an empirical evaluation of
PRINS and its comparison with the
state-of-the-art model inference tool.
• PRINS works especially well when the
components appearing in different
executions are similar.
• Provide a publicly available
implementation of PRINS.
Paper (Open Access) Replication Package
ThousandEyes
 
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Orangescrum
 
Innovative Approaches to Software Dev no good at all
Innovative Approaches to Software Dev no good at allInnovative Approaches to Software Dev no good at all
Innovative Approaches to Software Dev no good at all
ayeshakanwal75
 
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Proactive Vulnerability Detection in Source Code Using Graph Neural Networks:...
Ranjan Baisak
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Ad

PRINS: Scalable Model Inference for Component-based System Logs

  • 1. PRINS: Scalable Model Inference for Component-based System Logs* Donghwan Shin1), Domenico Bianculli2), and Lionel Briand2,3) 1) University of Sheffield 2) University of Luxembourg 3) University of Ottawa * This presentation is for the Journal-First Track at ICSE 2023; the original paper was accepted in the Empirical Software Engineering (EMSE) journal.
  • 2. [Diagram: System Logs → Model Inference Technique → System Model] Log = a sequence of log entries representing a single execution flow. Example logs, one per execution: (20190621.001 A, 20190621.002 B, 20190621.002 Z, 20190621.002 B, …) and (20221101.001 A, 20221101.004 B, 20221101.011 Z, 20221101.013 B, 20221101.101 Y, …). Annotations on the diagram: Too large (system logs); Not Scalable Enough (model inference technique); No Models (system model).
  • 3. Example HDFS Log (annotated on the slide with "Component IDs"):
      081111 090711 25010 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:38382 dest: /10.251.65.203:50010
      081111 090711 25181 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.27.63:54730 dest: /10.251.27.63:50010
      081111 090711 25487 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:40305 dest: /10.251.65.203:50010
      081111 090711 00031 INFO dfs.FSNamesystem: BLOCK* NameSystem.allocateBlock: /user/root/rand8/_temporary/part-00156. blk_5652408071925555972
      081111 090756 25011 INFO dfs.DataNode$PacketResponder: PacketResponder 2 for block blk_5652408071925555972 terminating
      081111 090756 25011 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.65.203
      081111 090756 25184 INFO dfs.DataNode$PacketResponder: PacketResponder 0 for block blk_5652408071925555972 terminating
      081111 090756 25184 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.27.63
      081111 090756 25488 INFO dfs.DataNode$PacketResponder: PacketResponder 1 for block blk_5652408071925555972 terminating
      081111 090756 25488 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.65.203
      081111 090756 00027 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.251.71.16:50010 is added to blk_5652408071925555972
      081111 111345 00013 INFO dfs.DataBlockScanner: Verification succeeded for blk_5652408071925555972
    Observation: Systems are often composed of multiple components.
  • 4. What if we infer INDIVIDUAL component models and then stitch them together?
  • 5. PRINS: PRojection-INference-Stitching. [Diagram: the PRojection stage splits the System Logs by component; the INference stage infers one model per component (Model of x, Model of y); the Stitching stage combines the component models into the System Model, followed by (optional) Heuristic Determinisation (HD).] Example from the diagram: the system logs (ax bx dy dy), (ax bx dy), (ax cx ey) are projected onto Component x as (ax bx), (ax bx), (ax cx) and onto Component y as (dy dy), (dy), (ey).
  • 6. Research Questions
      • RQ1: How does the execution time of PRINS change according to the parallel inference tasks in the inference stage?
      • RQ2: How does the execution time of HD_u change according to parameter u?
      • RQ3: How does the accuracy of the models (in the form of gFSMs) generated by HD_u change according to parameter u?
      • RQ4: How fast is PRINS when compared to state-of-the-art model inference techniques?
      • RQ5: How accurate are the models generated by PRINS compared to those generated by state-of-the-art model inference techniques?
    Groupings on the slide: Parallel inference (RQ1); Heuristic Determinisation (RQ2, RQ3); PRINS compared to MINT (RQ4, RQ5).
  • 8. RQ4: Execution Time of PRINS compared to MINT. [Figure: execution time (s) against duplication factor for MINT, PRINS-N, and PRINS-P; one panel per system: Hadoop, HDFS, Linux, Zookeeper, CoreSync, NGLClient, Oobelib, PDApp.] PRINS-N = PRINS with No parallel inference (HD is enabled to be fair with MINT). PRINS-P = PRINS with Parallel inference (HD is enabled to be fair with MINT). Duplication Factor = how many times each log is duplicated to increase the input log size systematically.
  • 9. RQ5: Accuracy of PRINS compared to MINT
  • 10. Downside: Size of System Models
  • 11. Contributions
      • Tame the scalability issue of model inference using divide-and-conquer.
      • Present an empirical evaluation of PRINS and its comparison with the state-of-the-art model inference tool.
      • It works especially well when the components appearing in different executions are similar.
      • Provide a publicly available implementation of PRINS.
    Links on the slide: Paper (Open Access); Replication Package.
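
The projection stage from slide 5 can be illustrated with a minimal Python sketch. It is not the PRINS implementation: it assumes the HDFS entry layout shown on slide 3 (date, time, process ID, level, component, message) and assumes the component ID is the field just before the first colon. The names `ENTRY_RE`, `parse_entry`, `project`, and `SAMPLE_LOG` are hypothetical. For simplicity it keeps raw messages, whereas a real pipeline would likely map each message to an event template first.

```python
import re
from collections import defaultdict

# Assumed entry layout (from the slide 3 example): date time pid level component: message
ENTRY_RE = re.compile(r"^(\d{6}) (\d{6}) (\d+) (\w+) ([\w.$]+): (.*)$")


def parse_entry(line):
    """Return (component, message) for one log entry, or raise ValueError."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognised log entry: {line!r}")
    _date, _time, _pid, _level, component, message = match.groups()
    return component, message


def project(log):
    """Split one log (a list of entries) into per-component sub-sequences.

    The relative order of entries within each component is preserved,
    so each sub-sequence can be passed to a separate inference task.
    """
    projected = defaultdict(list)
    for line in log:
        component, message = parse_entry(line)
        projected[component].append(message)
    return dict(projected)


# Four entries taken from the slide 3 example log.
SAMPLE_LOG = [
    "081111 090711 25010 INFO dfs.DataNode$DataXceiver: Receiving block "
    "blk_5652408071925555972 src: /10.251.65.203:38382 dest: /10.251.65.203:50010",
    "081111 090711 00031 INFO dfs.FSNamesystem: BLOCK* NameSystem.allocateBlock: "
    "/user/root/rand8/_temporary/part-00156. blk_5652408071925555972",
    "081111 090756 25011 INFO dfs.DataNode$PacketResponder: PacketResponder 2 "
    "for block blk_5652408071925555972 terminating",
    "081111 090756 25011 INFO dfs.DataNode$PacketResponder: Received block "
    "blk_5652408071925555972 of size 67108864 from /10.251.65.203",
]

if __name__ == "__main__":
    for component, messages in project(SAMPLE_LOG).items():
        print(component, len(messages))
```

Because each component's sub-sequence is independent of the others, the per-component inference tasks could in principle run in parallel, which is the setting that RQ1 examines.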