Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization
Jaekwon Lee (1), Dongsun Kim (1), Tegawendé F. Bissyandé (1), Woosung Jung (2), Yves Le Traon (1)
(1) SnT, University of Luxembourg, Luxembourg
(2) Seoul National University of Education, South Korea
Bug Localization
Where should we fix?
[Figure: Bug Localization. A model takes a bug report and a set of code files (Java).]
Fault Localization
[Figure: test cases are run against a function F(x); example function shown below.]
    MultiMap<Method, Long> duration = new MultiMap<Method, Long>();
    // run all benchmarks in same order, recording duration
    for (Method m : benchmarks) {
        System.err.println("# " + m.getName() + " benchmarking");
        List<Integer> reps = getReps(min_reps, m);
        for (int r : reps) {
            System.gc();
            long start = System.nanoTime();
            m.invoke(suite, r);
            long stop = System.nanoTime();
            duration.map(m, stop - start);
        }
    }
Information Retrieval based Bug Localization (IRBL)
Inputs: Source Codes and a Bug Report
Extracting Features (from each input): NL tokens, Code elements, Meta Info.
Feature Vector (bug report) and Feature Vectors (source codes)
Comparing Similarity & Ranking
Recommend Code Files: ranked 1, 2, 3, …, N
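The pipeline above (extract features, build vectors, compare similarity, rank) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses plain TF-IDF weights and cosine similarity over natural-language and identifier tokens, which is not the scoring of any specific technique evaluated in this study. The function names `tokenize`, `cosine`, and `rank_files` are hypothetical, and the two toy source files are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; camelCase identifiers are split into their parts."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        tokens.extend(part.lower() for part in parts)
    return tokens


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def rank_files(bug_report, files):
    """Rank source files (path -> content) by similarity to the bug report text."""
    paths = list(files)
    file_tokens = [tokenize(files[path]) for path in paths]
    n_docs = len(paths)

    # Document frequency of each term over the source files.
    doc_freq = Counter()
    for tokens in file_tokens:
        doc_freq.update(set(tokens))

    def vectorize(tokens):
        # Term frequency times a smoothed inverse document frequency.
        counts = Counter(tokens)
        return {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }

    query = vectorize(tokenize(bug_report))
    scored = [(path, cosine(query, vectorize(tokens))) for path, tokens in zip(paths, file_tokens)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy inputs (made up): two tiny "source files" and one bug report summary.
files = {
    "PolicyDefinition.java": "class PolicyDefinition { void addOutput() {} }",
    "GoogleBigQueryProducer.java": "class GoogleBigQueryProducer { void process() {} }",
}
ranking = rank_files("Transacted and Policy should not have outputs", files)
for position, (path, score) in enumerate(ranking, start=1):
    print(position, path, round(score, 3))
```

Only the shared token "policy" links the report to the first file here; real techniques add further signals such as the sub modules listed later in the deck.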
Is there any issue?
Are these results mature enough?
Performance is not mature enough
Subjects   BRTracer   BLUiR   AmaLgam   Locus
ZXing      0.445      0.380   0.410     0.502
SWT        0.467      0.560   0.578     0.640
AspectJ    0.264      0.263   0.271     0.320
PDE        0.367      0.349   0.322     0.422
JDT        0.232      0.277   0.282     0.359
(metric: MAP)
Are the subjects still usable?
Outdated subjects
[Table: columns Subject, #Reports, Period. Subjects: PDE, Eclipse, ZXing, AspectJ, JDT, SWT. Extracted values, row alignment not recoverable: #Reports 98, 286, 20, 60, 98; Period 2004 - 2016, 2004 - 2010, 2002 - 2006, 2010 - 2010.]
Evaluation Configuration?
Inconsistent evaluation settings
Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR
Settings that differ: Version Matching, Test file inclusion
Study Design
Research Questions
Experiment Data Set
RQ1: To what extent do IRBL techniques perform on up-to-date subjects?
Experiment Configuration
RQ2: What is the impact of version matching on the performance of IRBL techniques?
RQ3: To what extent are IRBL techniques sensitive to the inclusion of test code files?
Potential Improvement
RQ4: What potential performance gain can be reached by leveraging duplicate bug reports?
IRBL Techniques we used
Technique (venue)        IRBL Features        Sub Modules
BugLocator (ICSE 2012)   Full text            Bug report fixing history
BRTracer (ICSME 2014)    Code segmentations   Bug report fixing history, Stack Trace Analysis
BLUiR (ASE 2013)         Identifiers          Bug report fixing history
AmaLgam (ICPC 2014)      Identifiers          Bug report fixing history, Revision history
Locus (ICSE 2016)        Identifiers          Revision history
BLIA (APSEC 2015)        Identifiers          Bug report fixing history, Stack Trace Analysis, Revision history
Subjects
Selection criteria: Written in Java; Publicly available bug reports; 20+ source code files in one of its versions
                New Subjects   Old Subjects
Projects        46             5
Major Versions  690            5
Bug Reports     9,459          558
Duplicate Reports  807         136
RQ1: The use of old vs. new subjects
Comparison: New Subjects vs. Old Subjects (bug reports, Java source files, and Bug Oracle for each)
Configuration: Single version matching; Test files included

RQ2: The importance of version matching
Comparison: Single version matching vs. Multiple version matching
Configuration: New subjects; Test files included

RQ3: The impact of test file inclusion
Comparison: Test files included vs. Test files excluded (bug reports, Java source files, and Bug Oracle for each)
Commit Log:
CAMEL-12558: Transacted and Policy should not have outputs
  M main/java/org/apache/camel/model/PolicyDefinition.java
  M main/java/org/apache/camel/model/TransactedDefinition.java
  A test/java/org/apache/camel/catalog/CamelCatalog.java
  A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
Added camel-web3j Spring-boot test
  A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
Update GoogleBigQueryProducer.java
  M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
Configuration: Multiple version matching; New subjects

RQ4: Leveraging duplicate bug reports
Comparison: Master reports vs. Duplicate reports vs. Merged reports, each with its own Bug Oracle
Configuration: All subjects; Test files included in the Bug Oracle

Experiment Results
Metrics
Mean Average Precision (MAP):
$$\mathrm{MAP} = \frac{1}{M} \sum_{j=1}^{M} AP(j)$$
Mean Reciprocal Rank (MRR):
$$\mathrm{MRR} = \frac{1}{M} \sum_{i=1}^{M} \frac{1}{f\text{-}rank_i}$$
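As a worked illustration of the two metrics, the sketch below computes MAP and MRR over M bug reports. It assumes that AP(j) is the usual average precision (the precision at each rank where a buggy file appears, averaged over the number of buggy files) and that f-rank is the 1-based rank of the first buggy file in the recommendation list. The function names and the two toy rankings are hypothetical.

```python
def average_precision(ranked_files, buggy_files):
    """AP for one bug report: mean of precision@k over the ranks k that hold a buggy file."""
    hits = 0
    precision_sum = 0.0
    for position, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(buggy_files) if buggy_files else 0.0


def reciprocal_rank(ranked_files, buggy_files):
    """1 / f-rank, where f-rank is the rank of the first buggy file (0.0 if none is ranked)."""
    for position, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            return 1.0 / position
    return 0.0


def mean_average_precision(results):
    """MAP over a list of (ranked_files, buggy_files) pairs, one pair per bug report."""
    return sum(average_precision(r, b) for r, b in results) / len(results)


def mean_reciprocal_rank(results):
    """MRR over a list of (ranked_files, buggy_files) pairs, one pair per bug report."""
    return sum(reciprocal_rank(r, b) for r, b in results) / len(results)


# Two toy bug reports (made up): the same ranking, different sets of buggy files.
results = [
    (["A.java", "B.java", "C.java"], {"A.java", "C.java"}),  # AP = (1/1 + 2/3) / 2, RR = 1
    (["A.java", "B.java", "C.java"], {"B.java"}),            # AP = 1/2, RR = 1/2
]
print(round(mean_average_precision(results), 4))
print(round(mean_reciprocal_rank(results), 4))
```

MRR only rewards the first correct file, while MAP also accounts for how high the remaining buggy files are ranked, which is why the deck reports both.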
Baseline Performance
[Figure: Distribution of MAP values of all subjects for each technique (Locus, BLIA, AmaLgam, BLUiR, BRTracer, BugLocator); axis from 0.00 to 1.00; labeled values 0.359, 0.35, 0.359, 0.365, 0.363, 0.38.]
[Figure: Distribution of MRR values of all subjects for each technique (Locus, BLIA, AmaLgam, BLUiR, BRTracer, BugLocator); axis from 0.00 to 1.00; labeled values 0.455, 0.501, 0.43, 0.516, 0.497, 0.506.]
Bug localization still has much room for improvement.

RQ1: The use of old vs. new subjects
Configuration: Single version matching; Test files included
Summary of MAP/MRR of IRBL techniques
             Old Subjects        New Subjects
Technique    MAP      MRR        MAP       MRR
BugLocator   0.2692   0.3985     ↗0.3052   ↗0.4223
BRTracer     0.2645   0.3664     ↗0.3330   ↗0.4690
BLUiR        0.3102   0.4556     0.2881    0.3869
AmaLgam      0.2950   0.4072     0.2906    0.3899
BLIA         0.2935   0.4242     ↗0.3014   0.4155
Locus        0.2641   0.3399     ↗0.3289   ↗0.4430
Not over-fitted to old subjects

RQ2: The importance of version matching
Configuration: New subjects; Test files included
Summary of MAP/MRR of IRBL techniques
             Single Version      Multiple Version
Technique    MAP      MRR        MAP       MRR
BugLocator   0.3052   0.4223     ↗0.3713   ↗0.5075
BRTracer     0.3330   0.4690     ↗0.3992   ↗0.5526
BLUiR        0.2881   0.3869     ↗0.3623   ↗0.4802
AmaLgam      0.2906   0.3899     ↗0.3657   ↗0.4840
BLIA         0.3014   0.4155     ↗0.3777   ↗0.5124
Locus        0.3289   0.4430     ↗0.4217   ↗0.5514
The evaluation/execution of IRBL techniques should apply multiple version matching

RQ3: The impact of test file inclusion
Configuration: Multiple version matching; New subjects
Summary of MAP/MRR of IRBL techniques
             Test files excluded   Test files included
Technique    MAP      MRR          MAP       MRR
BugLocator   0.3811   0.4647       0.3713    ↗0.5075
BRTracer     0.4141   0.5090       0.3992    ↗0.5526
BLUiR        0.3603   0.4385       ↗0.3623   ↗0.4802
AmaLgam      0.3633   0.4420       0.3657    ↗0.4840
BLIA         0.3902   0.4728       ↗0.3777   ↗0.5124
Locus        0.4146   0.5002       ↗0.4217   ↗0.5514
Including test files does not bring bias or noise

RQ4: Leveraging duplicate bug reports
Summary of MAP/MRR of IRBL techniques
             Master             Duplicate           Merged (Master+Duplicate)
Technique    MAP      MRR       MAP      MRR        MAP       MRR
BugLocator   0.3503   0.5051    0.3259   0.4667     0.3502    ↗0.5249
BRTracer     0.3852   0.5508    0.3776   0.5430     0.3787    ↗0.5692
BLUiR        0.3159   0.4540    0.2804   0.4192     ↗0.3325   ↗0.4728
AmaLgam      0.3202   0.4581    0.2829   0.4223     ↗0.3327   ↗0.4725
BLIA         0.3518   0.4915    0.3231   0.4537     ↗0.3577   ↗0.5041
Locus        0.2915   0.4707    0.2871   ↗0.4724    ↗0.3042   ↗0.5021
Duplicate reports complement master bug reports and guarantee a minimum level of performance

Summary
Dataset Available
https://github.com/exatoa/Bench4BL

Bug Linking
Bug-Code Linking
A Bug Report is linked to code through the Commit Log of the Code Repository:
CAMEL-12558: Transacted and Policy should not have outputs
  M main/java/org/apache/camel/model/PolicyDefinition.java
  M main/java/org/apache/camel/model/TransactedDefinition.java
  A test/java/org/apache/camel/catalog/CamelCatalog.java
  A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
Added camel-web3j Spring-boot test
  A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
Update GoogleBigQueryProducer.java
  M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
Bug Oracle
Bug Report 1:
  main/java/org/apache/camel/model/PolicyDefinition.java
  main/java/org/apache/camel/model/TransactedDefinition.java
  test/java/org/apache/camel/catalog/CamelCatalog.java
  main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
Bug Report 2:
  main/java/org/apache/camel/GoogleBigQueryProducer.java
Bug Report 3:
  main/java/org/apache/camel/component/StringConcatenator.java
…….
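One way to picture bug-code linking is the minimal sketch below, which builds an oracle from commit entries like those shown in the deck. It assumes that a commit is linked to a bug report only when its message contains an issue key such as CAMEL-12558; how Bench4BL actually links commits to reports may differ. The key pattern, the function name `build_bug_oracle`, and the input layout are assumptions made for this example.

```python
import re

# Assumed issue-key shape: upper-case project prefix, hyphen, number (e.g. CAMEL-12558).
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def build_bug_oracle(commits):
    """Map each issue key found in a commit message to the Java files that commit touched.

    `commits` is a list of (message, [(status, path), ...]) tuples.
    """
    oracle = {}
    for message, changes in commits:
        java_files = {path for _status, path in changes if path.endswith(".java")}
        for key in ISSUE_KEY.findall(message):
            oracle.setdefault(key, set()).update(java_files)
    return {key: sorted(paths) for key, paths in oracle.items()}


# The three commit log entries shown on the slide.
commits = [
    ("CAMEL-12558: Transacted and Policy should not have outputs", [
        ("M", "main/java/org/apache/camel/model/PolicyDefinition.java"),
        ("M", "main/java/org/apache/camel/model/TransactedDefinition.java"),
        ("A", "test/java/org/apache/camel/catalog/CamelCatalog.java"),
        ("A", "main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java"),
    ]),
    ("Added camel-web3j Spring-boot test", [
        ("A", "test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java"),
    ]),
    ("Update GoogleBigQueryProducer.java", [
        ("M", "main/java/org/apache/camel/component/GoogleBigQueryProducer.java"),
    ]),
]
oracle = build_bug_oracle(commits)
for key, paths in oracle.items():
    print(key, len(paths), "files")
```

Under this assumption the two commits without an issue key are not linked to any report, so only the four files of the CAMEL-12558 commit enter the oracle.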

Version Matching
Version Matching Strategy
Single version matching (previous techniques)
Multiple version matching
Version Matching Approach
Selecting earliest version

Test File Inclusion
Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR
We remove files that include “test” or “Test” in their path or filename.
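The removal rule stated above can be sketched as a simple substring filter. This is an illustration only: the function names are hypothetical, and the check is a literal, case-sensitive match on "test" and "Test", as the slide describes.

```python
def is_test_file(path):
    """True when the path or filename contains "test" or "Test" (literal substring match)."""
    return "test" in path or "Test" in path


def exclude_test_files(paths):
    """Keep only the files that the rule above does not classify as test files."""
    return [path for path in paths if not is_test_file(path)]


# Paths taken from the commit log example in the deck.
paths = [
    "main/java/org/apache/camel/model/PolicyDefinition.java",
    "test/java/org/apache/camel/catalog/CamelCatalog.java",
    "test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java",
    "main/java/org/apache/camel/component/GoogleBigQueryProducer.java",
]
for path in exclude_test_files(paths):
    print(path)
```

A plain substring rule is easy to reproduce but also matches unrelated names that happen to contain "test", so a stricter rule (for example, matching whole path segments) would be a design choice to weigh.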

Duplicate Bug Reports
New Subjects: 807 duplicate reports among 9,459 bug reports (46 projects, 690 major versions)
Old Subjects: 136 duplicate reports among 558 bug reports (5 projects, 5 major versions)
Example: MATH-760, MATH-1192, MATH-2022
MATH-760 and MATH-1192
MATH-760 and MATH-2022
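The "Merged (Master+Duplicate)" setting used in RQ4 could look like the sketch below. The deck does not state how the two reports are combined, so plain text concatenation is an assumption here, and the `BugReport` fields, the function name, and the example reports (with placeholder keys) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    key: str
    summary: str
    description: str


def merge_reports(master, duplicate):
    """Build a merged query: the master's key with the text of both reports concatenated."""
    return BugReport(
        key=master.key,
        summary=f"{master.summary} {duplicate.summary}",
        description=f"{master.description}\n{duplicate.description}",
    )


# Hypothetical reports; the keys and text are placeholders, not data from the study.
master = BugReport("BUG-1", "Solver returns NaN", "Fails for empty input.")
duplicate = BugReport("BUG-2", "NaN result in solver", "Stack trace points to Solver.solve.")
merged = merge_reports(master, duplicate)
print(merged.key)
print(merged.summary)
```

The merged text carries the vocabulary of both reports, which is one plausible reason a merged query can retrieve files that either report alone would miss.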

More Related Content

What's hot (12)

PDF
Exploring the Defender's Advantage
Raffael Marty
 
PDF
Autosar basics by ARCCORE
ARCCORE
 
PDF
Luận văn: Giải pháp tích hợp dịch vụ nghiệp vụ ngân hàng, 9đ
Dịch vụ viết bài trọn gói ZALO 0917193864
 
PDF
WHAT IS THE INFORMATION SYSTEM AUDIT.pdf
AxelKAPITA1
 
PDF
Penetration Testing Tutorial | Penetration Testing Tools | Cyber Security Tra...
Edureka!
 
ODP
Snort
Michael Boman
 
PPTX
Cyber Threat Modeling
EC-Council
 
PPTX
Introduction to Metasploit
GTU
 
PPTX
The CIS Critical Security Controls the International Standard for Defense
EnclaveSecurity
 
PPTX
Future of SOC: More Security, Less Operations
Anton Chuvakin
 
PDF
Alphorm.com Formation Techniques de Blue Teaming : L'Essentiel pour l'Analyst...
Alphorm
 
PPTX
Network penetration testing
Imaginea
 
Exploring the Defender's Advantage
Raffael Marty
 
Autosar basics by ARCCORE
ARCCORE
 
Luận văn: Giải pháp tích hợp dịch vụ nghiệp vụ ngân hàng, 9đ
Dịch vụ viết bài trọn gói ZALO 0917193864
 
WHAT IS THE INFORMATION SYSTEM AUDIT.pdf
AxelKAPITA1
 
Penetration Testing Tutorial | Penetration Testing Tools | Cyber Security Tra...
Edureka!
 
Cyber Threat Modeling
EC-Council
 
Introduction to Metasploit
GTU
 
The CIS Critical Security Controls the International Standard for Defense
EnclaveSecurity
 
Future of SOC: More Security, Less Operations
Anton Chuvakin
 
Alphorm.com Formation Techniques de Blue Teaming : L'Essentiel pour l'Analyst...
Alphorm
 
Network penetration testing
Imaginea
 

Similar to Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization (20)

PDF
pawel_jakubowski_master_thesis_swarm
Paweł Jakubowski
 
PDF
Machine Learning for Application-Layer Intrusion Detection
butest
 
PDF
masteroppgave_larsbrusletto
Lars Brusletto
 
PDF
Zap Scanning
Suresh Kumar
 
DOCX
JConrad_Mod11_FinalProject_031816
Jeff Conrad
 
PDF
edc_adaptivity
Ramin Zohouri
 
PPTX
Molecular Biology Software Links
university of education,Lahore
 
PDF
Breakfast cereal for advanced beginners
Truptiranjan Nayak
 
PDF
A preliminary study on using code smells to improve bug localization
krws
 
PDF
thesis-2
danish shrestha
 
PPTX
swatiVCprsentation artificial learning and machine learning.pptx
pooja71445
 
PDF
Java Performance & Profiling
Isuru Perera
 
DOCX
java traning report_Summer.docx
GauravSharma164138
 
PDF
Thesis_Report
Jérémy Pouech
 
PDF
Dissertation_of_Pieter_van_Zyl_2_March_2010
Pieter Van Zyl
 
PPTX
AVATAR : Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations
Dongsun Kim
 
PDF
ltu-cover6899158065669445093
Alberto Isaac Barquín Murguía
 
PPTX
The Hacking Games - Operation System Vulnerabilities Meetup 29112022
lior mazor
 
PDF
project(copy1)
Cameron White
 
pawel_jakubowski_master_thesis_swarm
Paweł Jakubowski
 
Machine Learning for Application-Layer Intrusion Detection
butest
 
masteroppgave_larsbrusletto
Lars Brusletto
 
Zap Scanning
Suresh Kumar
 
JConrad_Mod11_FinalProject_031816
Jeff Conrad
 
edc_adaptivity
Ramin Zohouri
 
Molecular Biology Software Links
university of education,Lahore
 
Breakfast cereal for advanced beginners
Truptiranjan Nayak
 
A preliminary study on using code smells to improve bug localization
krws
 
thesis-2
danish shrestha
 
swatiVCprsentation artificial learning and machine learning.pptx
pooja71445
 
Java Performance & Profiling
Isuru Perera
 
java traning report_Summer.docx
GauravSharma164138
 
Thesis_Report
Jérémy Pouech
 
Dissertation_of_Pieter_van_Zyl_2_March_2010
Pieter Van Zyl
 
AVATAR : Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations
Dongsun Kim
 
ltu-cover6899158065669445093
Alberto Isaac Barquín Murguía
 
The Hacking Games - Operation System Vulnerabilities Meetup 29112022
lior mazor
 
project(copy1)
Cameron White
 
Ad

More from Dongsun Kim (13)

PDF
LeakPair: Proactive Repairing of Leaks in Single Page Web Applications
Dongsun Kim
 
PPTX
iFixR: Bug Report Driven Program Repair
Dongsun Kim
 
PPTX
TBar: Revisiting Template-based Automated Program Repair
Dongsun Kim
 
PDF
Mining Fix Patterns for FindBugs Violations
Dongsun Kim
 
PDF
Learning to Spot and Refactor Inconsistent Method Names
Dongsun Kim
 
PPTX
You Cannot Fix What You Cannot Find! --- An Investigation of Fault Localizati...
Dongsun Kim
 
PPTX
A Closer Look at Real-World Patches
Dongsun Kim
 
PPTX
LSRepair: Live Search of Fix Ingredients for Automated Program Repair
Dongsun Kim
 
PDF
Impact of Tool Support in Patch Construction
Dongsun Kim
 
PDF
FaCoY – A Code-to-Code Search Engine
Dongsun Kim
 
PDF
Augmenting and structuring user queries to support efficient free-form code s...
Dongsun Kim
 
PDF
Good Hunting: Locating, Prioritizing, and Fixing Bugs Automatically (Keynote,...
Dongsun Kim
 
PDF
Automatic Patch Generation Learned from Human-Written Patches
Dongsun Kim
 
LeakPair: Proactive Repairing of Leaks in Single Page Web Applications
Dongsun Kim
 
iFixR: Bug Report Driven Program Repair
Dongsun Kim
 
TBar: Revisiting Template-based Automated Program Repair
Dongsun Kim
 
Mining Fix Patterns for FindBugs Violations
Dongsun Kim
 
Learning to Spot and Refactor Inconsistent Method Names
Dongsun Kim
 
You Cannot Fix What You Cannot Find! --- An Investigation of Fault Localizati...
Dongsun Kim
 
A Closer Look at Real-World Patches
Dongsun Kim
 
LSRepair: Live Search of Fix Ingredients for Automated Program Repair
Dongsun Kim
 
Impact of Tool Support in Patch Construction
Dongsun Kim
 
FaCoY – A Code-to-Code Search Engine
Dongsun Kim
 
Augmenting and structuring user queries to support efficient free-form code s...
Dongsun Kim
 
Good Hunting: Locating, Prioritizing, and Fixing Bugs Automatically (Keynote,...
Dongsun Kim
 
Automatic Patch Generation Learned from Human-Written Patches
Dongsun Kim
 
Ad

Recently uploaded (20)

PDF
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
DOCX
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
PDF
🚀 Let’s Build Our First Slack Workflow! 🔧.pdf
SanjeetMishra29
 
PPTX
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
PPTX
Role_of_Artificial_Intelligence_in_Livestock_Extension_Services.pptx
DrRajdeepMadavi
 
PDF
Bitkom eIDAS Summit | European Business Wallet: Use Cases, Macroeconomics, an...
Carsten Stoecker
 
PDF
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
PPTX
Digital Circuits, important subject in CS
contactparinay1
 
PPTX
Talbott's brief History of Computers for CollabDays Hamburg 2025
Talbott Crowell
 
PPTX
Mastering ODC + Okta Configuration - Chennai OSUG
HathiMaryA
 
PDF
Home Cleaning App Development Services.pdf
V3cube
 
PDF
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
PDF
UPDF - AI PDF Editor & Converter Key Features
DealFuel
 
PDF
NASA A Researcher’s Guide to International Space Station : Fundamental Physics
Dr. PANKAJ DHUSSA
 
PDF
Software Development Company Keene Systems, Inc (1).pdf
Custom Software Development Company | Keene Systems, Inc.
 
PDF
NASA A Researcher’s Guide to International Space Station : Earth Observations
Dr. PANKAJ DHUSSA
 
PDF
How do you fast track Agentic automation use cases discovery?
DianaGray10
 
PPTX
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
PDF
99 Bottles of Trust on the Wall — Operational Principles for Trust in Cyber C...
treyka
 
PDF
Next Generation AI: Anticipatory Intelligence, Forecasting Inflection Points ...
dleka294658677
 
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
🚀 Let’s Build Our First Slack Workflow! 🔧.pdf
SanjeetMishra29
 
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
Role_of_Artificial_Intelligence_in_Livestock_Extension_Services.pptx
DrRajdeepMadavi
 
Bitkom eIDAS Summit | European Business Wallet: Use Cases, Macroeconomics, an...
Carsten Stoecker
 
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
Digital Circuits, important subject in CS
contactparinay1
 
Talbott's brief History of Computers for CollabDays Hamburg 2025
Talbott Crowell
 
Mastering ODC + Okta Configuration - Chennai OSUG
HathiMaryA
 
Home Cleaning App Development Services.pdf
V3cube
 
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
UPDF - AI PDF Editor & Converter Key Features
DealFuel
 
NASA A Researcher’s Guide to International Space Station : Fundamental Physics
Dr. PANKAJ DHUSSA
 
Software Development Company Keene Systems, Inc (1).pdf
Custom Software Development Company | Keene Systems, Inc.
 
NASA A Researcher’s Guide to International Space Station : Earth Observations
Dr. PANKAJ DHUSSA
 
How do you fast track Agentic automation use cases discovery?
DianaGray10
 
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
99 Bottles of Trust on the Wall — Operational Principles for Trust in Cyber C...
treyka
 
Next Generation AI: Anticipatory Intelligence, Forecasting Inflection Points ...
dleka294658677
 

Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization

  • 1. Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization Jaekwon Lee1, Dongsun Kim1, Tegawendé F. Bissyandé1, 
 Woosung Jung2, Yves Le Traon1 1SnT, University of Luxembourg - Luxembourg 2Seoul National University of Education - South Korea
  • 4. Bug Localization !4 Model ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java Bug Report ……….. …. ….. …..…. ……..
 …. .. ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java A set of code files Bug Localization
  • 5. F(x) Test Case Test Case Test Case 01: MultiMap<Method, Long> duration = new MultiMap<Method, Long>(); 02: // run all benchmarks in same order, recording duration 03: for (Method m : benchmarks) { 04: System.err.println("# "+m.getName()+" benchmarking"); 05: List<Integer> reps = getReps(min_reps, m); 06: for (int r : reps) { 07: System.gc(); 08: long start = System.nanoTime(); 09: m.invoke(suite,r); 10: long stop = System.nanoTime(); 11: duration.map(m, stop - start); 12: } 13: } Function 01: MultiMap<Method, Long> duration = new MultiMap<Method, Long>(); 02: // run all benchmarks in same order, recording duration 03: for (Method m : benchmarks) { 04: System.err.println("# "+m.getName()+" benchmarking"); 05: List<Integer> reps = getReps(min_reps, m); 06: for (int r : reps) { 07: System.gc(); 08: long start = System.nanoTime(); 09: m.invoke(suite,r); 10: long stop = System.nanoTime(); 11: duration.map(m, stop - start); 12: } 13: } Function Fault Localization Bug Localization !5 Bug Localization Model ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java Bug Report ……….. …. ….. …..…. ……..
 …. .. ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java ….. …..….. …..….. …..….. ….…..Java A set of code files
  • 8. Information Retrieval based Bug Localization (IRBL). Extracting features from the source code and from the bug report: NL tokens, code elements, meta info.
  • 9. Information Retrieval based Bug Localization (IRBL). Extracting features (NL tokens, code elements, meta info) yields one feature vector for the bug report and a set of feature vectors for the source code files.
  • 10. Information Retrieval based Bug Localization (IRBL). Extracting features (NL tokens, code elements, meta info) yields one feature vector for the bug report and a set of feature vectors for the source code files. Comparing similarity & ranking then produces the recommended code files, ranked 1, 2, 3, …, N.
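The pipeline on the IRBL slides (extract features, build feature vectors, compare similarity, rank) can be sketched with a plain vector space model. This is a minimal illustration, not the method of any of the six techniques: the TF-IDF weighting, the camelCase splitting, and the names `tokenize`, `cosine`, and `rank_files` are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; camelCase identifiers are split into their parts."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        tokens.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word))
    return [t.lower() for t in tokens]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(bug_report, files):
    """Rank file paths by TF-IDF cosine similarity to the bug report text.

    files: dict mapping a file path to its source text (hypothetical input shape).
    """
    docs = {path: Counter(tokenize(source)) for path, source in files.items()}
    doc_freq = Counter()
    for counts in docs.values():
        doc_freq.update(counts.keys())
    idf = {term: math.log(len(docs) / df) + 1.0 for term, df in doc_freq.items()}

    def weigh(counts):
        # Terms that never occur in the code base carry no weight and are dropped.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    query = weigh(Counter(tokenize(bug_report)))
    scored = [(cosine(query, weigh(counts)), path) for path, counts in docs.items()]
    # Highest similarity first; ties are broken by path for a stable order.
    return [path for _, path in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

Calling `rank_files(report_text, {path: source_text, ...})` returns the file paths ordered from most to least similar, which corresponds to the "1, 2, 3, …, N" list of recommended code files.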
  • 12. Are these results mature enough? Not enough maturity of performance (metric: MAP).
 Subjects   BRTracer   BLUiR   AmaLgam   Locus
 ZXing      0.445      0.380   0.410     0.502
 SWT        0.467      0.560   0.578     0.640
 AspectJ    0.264      0.263   0.271     0.320
 PDE        0.367      0.349   0.322     0.422
 JDT        0.232      0.277   0.282     0.359
  • 13. Are the subjects still usable? Outdated subjects. Subject: PDE, Eclipse, ZXing, AspectJ, JDT, SWT. #Reports: 98, 286, 20, 60, 98. Period: 2004 - 2016, 2004 - 2010, 2002 - 2006, 2010 - 2010.
  • 15. Evaluation Configuration? Inconsistent evaluation settings across BugLocator, BLIA, Locus, AmaLgam, BRTracer, and BLUiR: version matching and test file inclusion.
  • 17. Research Questions.
 Experiment Data Set. RQ1: To what extent do IRBL techniques perform on up-to-date subjects?
  • 18. Research Questions.
 Experiment Data Set. RQ1: To what extent do IRBL techniques perform on up-to-date subjects?
 Experiment Configuration. RQ2: What is the impact of version matching on the performance of IRBL techniques? RQ3: To what extent are IRBL techniques sensitive to the inclusion of test code files?
  • 19. Research Questions.
 Experiment Data Set. RQ1: To what extent do IRBL techniques perform on up-to-date subjects?
 Experiment Configuration. RQ2: What is the impact of version matching on the performance of IRBL techniques? RQ3: To what extent are IRBL techniques sensitive to the inclusion of test code files?
 Potential Improvement. RQ4: What potential performance gain can be reached by leveraging duplicate bug reports?
  • 20. IRBL Techniques we used.
 Technique (venue)        IRBL Features        Sub Modules
 BugLocator (ICSE 2012)   Full text            Bug report fixing history
 BRTracer (ICSME 2014)    Code segmentations   Bug report fixing history, Stack Trace Analysis
 BLUiR (ASE 2013)         Identifiers          Bug report fixing history
 AmaLgam (ICPC 2014)      Identifiers          Bug report fixing history, Revision history
 BLIA (APSEC 2015)        Identifiers          Bug report fixing history, Stack Trace Analysis, Revision history
 Locus (ICSE 2016)        Identifiers          Revision history
  • 21. Subjects. Selection criteria: written in Java; publicly available bug reports; 20+ source code files in one of its versions.
  • 22. Subjects. New Subjects: 46 projects, 9,459 bug reports. Old Subjects: 5 projects, 558 bug reports.
  • 23. Subjects.
 New Subjects: 46 projects, 690 major versions, 9,459 bug reports, 807 duplicate reports.
 Old Subjects: 5 projects, 5 major versions, 558 bug reports, 136 duplicate reports.
  • 24. RQ1: The use of old vs. new subjects. New subjects (bug reports, Java files, bug oracle) vs. old subjects (bug reports, Java files, bug oracle). Configuration: single version matching, test files included.
  • 25. RQ2: The importance of version matching. Single version matching vs. multiple version matching. Configuration: new subjects, test files included.
  • 26. RQ3: The impact of test file inclusion. Bug oracle with test files included vs. bug oracle with test files excluded. Configuration: multiple version matching, new subjects. Commit log:
 CAMEL-12558: Transacted and Policy should not have outputs
 M main/java/org/apache/camel/model/PolicyDefinition.java
 M main/java/org/apache/camel/model/TransactedDefinition.java
 A test/java/org/apache/camel/catalog/CamelCatalog.java
 A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
 Added camel-web3j Spring-boot test
 A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
 Update GoogleBigQueryProducer.java
 M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
  • 27. RQ4: Leveraging duplicate bug reports. Master reports vs. duplicate reports vs. merged reports, each with its own bug oracle: Bug Oracle (Master reports), Bug Oracle (Duplicate reports), Bug Oracle (Merged reports). Configuration: all subjects, test files included in the bug oracle.
  • 29. Metrics: Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR).
 $$\mathrm{MAP} = \frac{1}{M}\sum_{j=1}^{M} \mathrm{AP}(j) \qquad \mathrm{MRR} = \frac{1}{M}\sum_{i=1}^{M} \frac{1}{\text{f-rank}_i}$$
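The two metrics on this slide can be computed as follows. This is a sketch under stated assumptions: the slide does not define AP(j), so the usual definition is assumed (precision at each rank where a buggy file appears, averaged over the number of buggy files for that report), and f-rank is taken to be the 1-based rank of the first buggy file. The function names and the input shape are hypothetical.

```python
def average_precision(ranked, relevant):
    """AP for one bug report: mean of precision@k over the ranks k of buggy files."""
    hits = 0
    precision_sum = 0.0
    for position, path in enumerate(ranked, start=1):
        if path in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / f-rank, where f-rank is the rank of the first buggy file (0.0 if none)."""
    for position, path in enumerate(ranked, start=1):
        if path in relevant:
            return 1.0 / position
    return 0.0


def mean_average_precision(results):
    """results: list of (ranked_paths, set_of_buggy_paths), one entry per bug report."""
    return sum(average_precision(r, rel) for r, rel in results) / len(results)


def mean_reciprocal_rank(results):
    """Same input shape as mean_average_precision."""
    return sum(reciprocal_rank(r, rel) for r, rel in results) / len(results)
```

For two reports with rankings `["a", "b", "c"]` (buggy: a, c) and `["x", "y"]` (buggy: y), AP is (1/1 + 2/3)/2 and 1/2, so MAP is about 0.667, and the reciprocal ranks are 1 and 1/2, so MRR is 0.75.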
  • 30. Baseline Performance. [Figure: two distribution plots on a 0.00 to 1.00 axis, with one row each for Locus, BLIA, AmaLgam, BLUiR, BRTracer, and BugLocator. MAP panel: "Distribution of MAP values of all subjects for each technique" (values printed on the plot: 0.359, 0.35, 0.359, 0.365, 0.363, 0.38). MRR panel: "Distribution of MRR values of all subjects for each technique" (values printed on the plot: 0.455, 0.501, 0.43, 0.516, 0.497, 0.506).]
  • 31. Baseline Performance (same MAP and MRR distribution plots as slide 30). Bug localization still has much room for improvement.
  • 32. RQ1: The use of old vs. new subjects. Summary of MAP/MRR of IRBL techniques. Configuration: single version matching, test files included.
 Technique    Old Subjects MAP   Old Subjects MRR   New Subjects MAP   New Subjects MRR
 BugLocator   0.2692             0.3985             ↗0.3052            ↗0.4223
 BRTracer     0.2645             0.3664             ↗0.3330            ↗0.4690
 BLUiR        0.3102             0.4556             0.2881             0.3869
 AmaLgam      0.2950             0.4072             0.2906             0.3899
 BLIA         0.2935             0.4242             ↗0.3014            0.4155
 Locus        0.2641             0.3399             ↗0.3289            ↗0.4430
  • 33. RQ1: The use of old vs. new subjects. Summary of MAP/MRR of IRBL techniques (same table and configuration as slide 32). Finding: not over-fitted to old subjects.
  • 34. RQ2: The importance of version matching. Summary of MAP/MRR of IRBL techniques. Configuration: new subjects, test files included.
 Technique    Single Version MAP   Single Version MRR   Multiple Version MAP   Multiple Version MRR
 BugLocator   0.3052               0.4223               ↗0.3713                ↗0.5075
 BRTracer     0.3330               0.4690               ↗0.3992                ↗0.5526
 BLUiR        0.2881               0.3869               ↗0.3623                ↗0.4802
 AmaLgam      0.2906               0.3899               ↗0.3657                ↗0.4840
 BLIA         0.3014               0.4155               ↗0.3777                ↗0.5124
 Locus        0.3289               0.4430               ↗0.4217                ↗0.5514
  • 35. RQ2: The importance of version matching. Summary of MAP/MRR of IRBL techniques (same table and configuration as slide 34). Finding: the evaluation/execution of IRBL techniques should apply multiple version matching.
  • 36. RQ3: The impact of test file inclusion. Summary of MAP/MRR of IRBL techniques. Configuration: multiple version matching, new subjects.
 Technique    Test files excluded MAP   Test files excluded MRR   Test files included MAP   Test files included MRR
 BugLocator   0.3811                    0.4647                    0.3713                    ↗0.5075
 BRTracer     0.4141                    0.5090                    0.3992                    ↗0.5526
 BLUiR        0.3603                    0.4385                    ↗0.3623                   ↗0.4802
 AmaLgam      0.3633                    0.4420                    0.3657                    ↗0.4840
 BLIA         0.3902                    0.4728                    ↗0.3777                   ↗0.5124
 Locus        0.4146                    0.5002                    ↗0.4217                   ↗0.5514
  • 37. RQ3: The impact of test file inclusion. Summary of MAP/MRR of IRBL techniques (same table and configuration as slide 36). Finding: including test files does not bring bias or noise.
  • 38. RQ4: Leveraging duplicate bug reports. Summary of MAP/MRR of IRBL techniques.
 Technique    Master MAP   Master MRR   Duplicate MAP   Duplicate MRR   Merged (Master+Duplicate) MAP   Merged (Master+Duplicate) MRR
 BugLocator   0.3503       0.5051       0.3259          0.4667          0.3502                          ↗0.5249
 BRTracer     0.3852       0.5508       0.3776          0.5430          0.3787                          ↗0.5692
 BLUiR        0.3159       0.4540       0.2804          0.4192          ↗0.3325                         ↗0.4728
 AmaLgam      0.3202       0.4581       0.2829          0.4223          ↗0.3327                         ↗0.4725
 BLIA         0.3518       0.4915       0.3231          0.4537          ↗0.3577                         ↗0.5041
 Locus        0.2915       0.4707       0.2871          ↗0.4724         ↗0.3042                         ↗0.5021
  • 39. RQ4: Leveraging duplicate bug reports. Summary of MAP/MRR of IRBL techniques (same table as slide 38). Finding: duplicate reports complement master bug reports and guarantee a minimum level of performance.
  • 43. Bug-Code Linking: a bug report is linked to the code repository through the commit log.
 CAMEL-12558: Transacted and Policy should not have outputs
 M main/java/org/apache/camel/model/PolicyDefinition.java
 M main/java/org/apache/camel/model/TransactedDefinition.java
 A test/java/org/apache/camel/catalog/CamelCatalog.java
 A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
 Added camel-web3j Spring-boot test
 A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
 Update GoogleBigQueryProducer.java
 M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
  • 45. Bug Oracle.
 Bug Report 1: main/java/org/apache/camel/model/PolicyDefinition.java, main/java/org/apache/camel/model/TransactedDefinition.java, test/java/org/apache/camel/catalog/CamelCatalog.java, main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
 Bug Report 2: main/java/org/apache/camel/GoogleBigQueryProducer.java
 Bug Report 3: main/java/org/apache/camel/component/StringConcatenator.java
 …
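One way to build such a bug oracle from a commit log is to look for issue keys such as CAMEL-12558 in commit messages and collect the files those commits touch. The sketch below assumes that linking rule; the regular expression, the input shape, and the name `build_bug_oracle` are illustrative and not taken from the slides.

```python
import re
from collections import defaultdict

# Assumed issue-key pattern: upper-case project key, a dash, then digits.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]*-\d+\b")


def build_bug_oracle(commits, known_bug_ids):
    """Map each known bug ID to the set of files changed by commits that mention it.

    commits: list of (message, [(status, path), ...]) tuples, where status is the
             change letter shown in the commit log (for example "M" or "A").
    """
    oracle = defaultdict(set)
    for message, changes in commits:
        for key in ISSUE_KEY.findall(message):
            if key in known_bug_ids:
                oracle[key].update(path for _status, path in changes)
    return dict(oracle)
```

Applied to the commit log shown on the Bug-Code Linking slide, only the first commit mentions an issue key, so only its four files end up in the oracle entry for CAMEL-12558; the two commits without a key stay unlinked.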
  • 47. Version Matching Strategy. Single version matching (previous techniques).
  • 48. Version Matching Strategy. Single version matching (previous techniques) vs. multiple version matching.
  • 50. Version Matching Approach: selecting the earliest version.
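The slide only says "selecting earliest version", so the following is one possible reading rather than the authors' procedure: when a bug report is associated with several candidate versions, pick the one with the smallest version number. The input shape, the numeric comparison of version strings, and the names `version_key` and `select_earliest_version` are assumptions.

```python
import re


def version_key(version):
    """Turn a version string such as "2.10.0" into a tuple of ints for ordering."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def select_earliest_version(candidate_versions):
    """Return the earliest candidate version, or None if there are no candidates."""
    if not candidate_versions:
        return None
    return min(candidate_versions, key=version_key)
```

Comparing numeric tuples instead of raw strings avoids ordering "2.10.0" before "2.9.1".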
  • 52. Test File Inclusion. Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR. Code repository commit log: same example as slide 43, in which two of the changed files are under test/ (test/java/org/apache/camel/catalog/CamelCatalog.java and test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java).
  • 53. Test File Inclusion. Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR. Code repository commit log: same example as slide 43. We remove files that include “test” or “Test” in their path or filename.
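The exclusion rule stated on this slide is a simple substring check on the file path. A minimal sketch, taking the rule literally (the function names are hypothetical):

```python
def is_test_file(path):
    """True if the path or filename contains "test" or "Test", as stated on the slide."""
    return "test" in path or "Test" in path


def exclude_test_files(paths):
    """Keep only the files that are not test files, preserving their order."""
    return [path for path in paths if not is_test_file(path)]
```

A literal substring check also matches unrelated names that merely contain "test" (for example a file called `Latest.java`), so a stricter variant might match whole path segments instead.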
  • 55. Duplicate Bug Reports.
 New Subjects: 46 projects, 690 major versions, 9,459 bug reports, 807 duplicate reports.
 Old Subjects: 5 projects, 5 major versions, 558 bug reports, 136 duplicate reports.
  • 57. Duplicate Bug Reports. Example: MATH-760, MATH-1192, MATH-2022. Pairs shown: MATH-760 with MATH-1192, and MATH-760 with MATH-2022.
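RQ4 compares master, duplicate, and merged (master+duplicate) reports. The slides do not say how a merged report is formed, so the sketch below assumes the simplest option: concatenating the text fields of the master report and its duplicates into one query. The dictionary keys and the name `merge_reports` are hypothetical.

```python
def merge_reports(master, duplicates):
    """Combine a master report with its duplicates into one merged report.

    Each report is a dict with "id", "summary", and "description" keys
    (an assumed shape; real bug tracker exports carry more fields).
    """
    reports = [master] + list(duplicates)
    return {
        "id": master["id"],
        # Concatenate the text so the merged query carries the vocabulary of every report.
        "summary": " ".join(r["summary"] for r in reports),
        "description": "\n".join(r["description"] for r in reports),
        "merged_from": [r["id"] for r in reports],
    }
```

The merged report can then be used as the query text in place of the master report alone.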