Predicting Defects in
SAP Java Code
An Experience Report
                       by Tilman Holschuh
                         (SQS AG)
                         Markus Päuser
                         (SAP AG)
                         Kim Herzig
                         (Saarland University)
                         Thomas Zimmermann
                         (Microsoft Research)
                         Rahul Premraj
                         (Vrije University Amsterdam)
                         Andreas Zeller
                         (Saarland University)
Motivation

Quality Manager
Problems: Resources, Time, Knowledge

Where do we put the most effort?
Replicated 2 Studies

1
    Source code:                     McCabe, FanOut, LoC, Coupling
    Version archive, Bug database:   Component Quality
    Predictor
Replicated 2 Studies

2
    Source code:                     McCabe, FanOut, LoC, Coupling
                                     Dependencies
    Version archive, Bug database:   Component Quality
    Predictor
The Product

‣   SAP Standard Software
‣   Large-scale Java software system (> 10M LoC)
‣   Separated into projects
‣   Service pack release cycles
Defect Distribution
20% of the code contains ~75% of defects

Upper bound for prediction

graphic created with TreeMap (University of Maryland)
see http://www.cs.umd.edu/hcil/treemap
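
The concentration figure above can be checked on any component list by ranking components by defect count and summing the defects in the top fifth. The following is a minimal sketch, not the study's procedure: it ranks by component count rather than by lines of code, and the function name and sample counts are hypothetical.

```python
def defect_share_of_top(defect_counts, fraction=0.2):
    """Share of all defects found in the top `fraction` of components,
    ranked by defect count (most defective first)."""
    total = sum(defect_counts)
    if total == 0:
        return 0.0
    # Number of components in the top slice; at least one.
    k = max(1, round(len(defect_counts) * fraction))
    top = sorted(defect_counts, reverse=True)[:k]
    return sum(top) / total

# Hypothetical defect counts for ten components (not data from the study).
counts = [30, 15, 5, 4, 3, 1, 1, 1, 0, 0]
share = defect_share_of_top(counts)
print(f"top 20% of components hold {share:.0%} of defects")
```

Read this way, the share is also what a perfect predictor would capture if it flagged exactly that top slice, which is one reading of the slide's "upper bound".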
Basics

Input -> Predictor Model -> Output
How to collect Input Data?

1   McCabe, FanOut, LoC, Coupling
2   Dependencies
Collecting Metric Data

1   McCabe, FanOut, LoC, Coupling

‣ Metric tools: ckjm, JDepend, ephyra
‣ Static code checkers: PMD, FindBugs
‣ Change frequency
Collecting Dependency Data

2   Dependencies

‣ Extracting package import relations
‣ Tool: JDepend
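
To make "package import relations" concrete, here is a rough sketch of how they could be read from Java source text. It is a regex-based stand-in, not how JDepend works (JDepend analyzes compiled class files); the function name and sample source are hypothetical. Static imports are skipped, and nested-class imports such as `java.util.Map.Entry` would be attributed to the wrong package.

```python
import re
from collections import defaultdict

PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
# Captures the imported name and, separately, a trailing ".*" if present.
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+?)(\.\*)?\s*;", re.MULTILINE)

def package_dependencies(sources):
    """Map each declaring package to the set of packages it imports from."""
    deps = defaultdict(set)
    for text in sources:
        declared = PACKAGE_RE.search(text)
        if declared is None:
            continue  # default package: skipped in this sketch
        pkg = declared.group(1)
        for name, wildcard in IMPORT_RE.findall(text):
            # Wildcard imports already name a package; class imports
            # need the trailing class name removed.
            target = name if wildcard else name.rpartition(".")[0]
            if target and target != pkg:
                deps[pkg].add(target)
    return dict(deps)

# Hypothetical source file used only for illustration.
example = """
package com.example.billing;

import java.util.List;
import com.example.core.*;
import static java.lang.Math.max;

public class Invoice {}
"""
print(package_dependencies([example]))
```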
How to measure Component Quality?

Input ✔ -> Predictor Model -> Output
Component Quality

Bug database:     Bug 42233
                  FileSystemPreferences lockFile() should close ...
Version archive:  v1.17   v1.18   v1.19
                  Fixed Bug 42233
Component Quality

Maintenance branch:  v1.17   v1.18   v1.19
                     Fixed Bug 42233: #defects + 1
Version archive:     v1.17   v1.18   v1.19
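
The "#defects + 1" step can be sketched as a scan of commit messages for bug IDs that also exist in the bug database. This is a minimal illustration under assumptions: the message pattern, the function name and the second component name are hypothetical, and the attribution of a fix to a particular release on the maintenance branch is left out.

```python
import re

# Matches references such as "Bug 42233" or "bug #42233".
BUG_REF = re.compile(r"\bbug\s*#?(\d+)", re.IGNORECASE)

def count_defects(commits, known_bug_ids):
    """Count, per component, the distinct bugs whose fixes touched it.

    `commits` is an iterable of (message, changed_components) pairs;
    `known_bug_ids` is the set of IDs present in the bug database.
    """
    fixed = set()  # (component, bug_id) pairs, so each bug counts once
    for message, components in commits:
        for match in BUG_REF.finditer(message):
            bug_id = int(match.group(1))
            if bug_id not in known_bug_ids:
                continue  # number does not refer to a recorded bug
            for component in components:
                fixed.add((component, bug_id))
    counts = {}
    for component, _ in fixed:
        counts[component] = counts.get(component, 0) + 1
    return counts

# Hypothetical commit log; only the bug ID comes from the slide.
commits = [
    ("Fixed Bug 42233", ["FileSystemPreferences"]),
    ("Refactoring, no bug", ["FileSystemPreferences"]),
    ("Fixed bug 42233 (follow-up)", ["FileSystemPreferences", "OtherComponent"]),
]
print(count_defects(commits, {42233}))
```

Counting distinct (component, bug) pairs keeps a bug that needed several commits from being counted more than once for the same component.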
How to build Predictor Models?

Linear Regression (Y = Xβ + ε):  McCabe, FanOut, LoC, Coupling
Support Vector Machine:          McCabe, FanOut, LoC, Coupling, Dependencies
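
For the linear-regression model, Y = Xβ + ε is commonly fitted by ordinary least squares. The slide does not state the fitting procedure, so the estimator below is an assumption. Here X holds one row of metric values per component, Y holds the defect counts, hats denote estimates, and the formula requires XᵀX to be invertible.

```latex
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} Y,
\qquad
\hat{Y}_{\text{new}} = X_{\text{new}} \, \hat{\beta}
```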
Forward Prediction

Timeline t:  V1, then V2
Legend:      static analysis, training bug data, test bug data
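
One way to read the timeline is that a model is fitted on the metrics and bug data of release V1 and then ranks the components of release V2, whose own bug data serves only for evaluation. The sketch below follows that reading, which is an assumption about the figure. It uses a single metric and closed-form least squares for brevity; all names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = intercept + slope * x.
    Assumes the xs are not all identical (otherwise sxx is zero)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def forward_predict(train, test):
    """Fit on release V1, then rank the components of release V2.

    `train` maps component -> (metric, defects) measured for V1;
    `test` maps component -> metric measured for V2.
    Returns V2 components ordered from most to least predicted defects.
    """
    intercept, slope = fit_line([m for m, _ in train.values()],
                                [d for _, d in train.values()])
    predicted = {c: intercept + slope * m for c, m in test.items()}
    return sorted(predicted, key=predicted.get, reverse=True)

# Hypothetical data: (LoC, defects) per component in V1, LoC only in V2.
v1 = {"a": (100, 1), "b": (200, 2), "c": (300, 3)}
v2 = {"a": 150, "b": 120, "d": 400}
print(forward_predict(v1, v2))
```

Component "d" does not exist in V1, which is the point of forward prediction: the model is applied to code it was not trained on.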
Results
Metric Correlations

Metric              Aggregation   Project 2         Project 4
                                  (package level)   (class level)
LoC                 Sum           0.583             0.377
                    Max           0.587             n/a
McCabe              Sum           0.583             0.299
                    Max           0.588             0.261
Efferent Coupling                 0.608             n/a
Design Rules        Sum           0.557             0.264
                    Max           0.578             n/a
Changes             Sum           0.308             0.403
                    Max           0.240             n/a
Prediction is more precise at higher granularity levels (package rather than class)
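
The Sum and Max rows suggest that finer-grained metric values were rolled up to the level being analyzed. Assuming that reading, a roll-up from class level to package level could look like the sketch below. The function name and sample values are hypothetical.

```python
from collections import defaultdict

def aggregate_by_package(class_metrics):
    """Roll class-level metric values up to package level.

    `class_metrics` maps a fully qualified class name to a metric value.
    Returns {package: {"sum": ..., "max": ...}}.
    """
    grouped = defaultdict(list)
    for class_name, value in class_metrics.items():
        # Everything before the last dot is taken as the package name.
        package = class_name.rpartition(".")[0]
        grouped[package].append(value)
    return {pkg: {"sum": sum(vals), "max": max(vals)}
            for pkg, vals in grouped.items()}

# Hypothetical McCabe values per class.
metrics = {"a.b.C": 3, "a.b.D": 5, "a.x.E": 2}
print(aggregate_by_package(metrics))
```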
Hit Rate
          actual   predicted
             1         4
             2         9    Hit rate = 50%
             3         2
Top 20%      4        11
             5         6
             6         1
             7         3
             8         5
             9        10
            10         8
            11         7
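
The 50% in this example can be reproduced by comparing the top of the two rankings. The sketch assumes the "Top 20%" bracket spans the first four rows, because that choice yields the stated 50%; taken literally, 20% of 11 components would be about two. The function name is hypothetical.

```python
def hit_rate(actual_order, predicted_order, k):
    """Fraction of the first k predicted components that are also
    among the first k actual components."""
    actual_top = set(actual_order[:k])
    predicted_top = set(predicted_order[:k])
    return len(actual_top & predicted_top) / k

# Rankings from the slide, read top to bottom.
actual = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
predicted = [4, 9, 2, 11, 6, 1, 3, 5, 10, 8, 7]
print(hit_rate(actual, predicted, k=4))  # 0.5
```

Components 2 and 4 appear in both top-four sets, giving two hits out of four.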
Predictions using Linear Regression
Input: McCabe, FanOut, LoC, Coupling

                Top 5%   Top 20%
All projects     46%      55%
Group 1          47%      63%
Project 1        21%      43%
Project 2        42%      64%
Project 3        41%      55%
Predicting from Dependencies
Support Vector Machine

                Top 5%   Top 20%
Group 1          26%      43%
Project 1        38%      50%
Project 2        36%      46%
Project 3        46%      49%
Stable prediction results across projects
Compare Results
[Bar chart: hit rate (0% to 80%) for Dependencies vs. Metrics, shown for Group 1, Project 1, Project 2 and Project 3]
Complexity metrics have higher predictive power
Lessons Learned

                             Nagappan et al.   Schröter et al.   our study
metrics defect correlation         ✔                n/a              ✔
prediction possible                ✔                ✔                ✔
forward prediction                 ✘                ✘                ✔
universal predictor                ✘                ✘                ✘
Lessons Learned

Predictions based on static code features provide limited results and depend on the project context

Software archives are a reliable and easily accessible source of defect data

Defects have many sources, and code is just one of them
SQS Software Quality Systems AG

Stollwerckstraße 11
51149 Cologne, Germany
Phone: + 49 22 03 91 54 - 7149
Fax: + 49 22 03 91 54 - 15
Email: tilman.holschuh@sqs.de

Internet: www.sqs-group.com
Thank you!
Predicting Defects in SAP Java Code: An Experience Report

  • 1. Predicting Defects in SAP Java Code: An Experience Report, by Tilman Holschuh (SQS AG), Markus Päuser (SAP AG), Kim Herzig (Saarland University), Thomas Zimmermann (Microsoft Research), Rahul Premraj (Vrije University Amsterdam), Andreas Zeller (Saarland University)
  • 7-8. Motivation. Quality Manager. Problems: Resources, Time, Knowledge. Where do we put the most effort?
  • 11-14. Replicated 2 Studies, study 1. Source code: McCabe, FanOut, LoC, Coupling. Version archive and bug database: Component Quality. Predictor.
  • 15-16. Replicated 2 Studies, study 2. Source code: McCabe, FanOut, LoC, Coupling, Dependencies. Version archive and bug database: Component Quality. Predictor.
  • 17. The Product
      ‣ SAP Standard Software
      ‣ Large-scale Java software system (> 10M LoC)
      ‣ Separated into projects
      ‣ Service pack release cycles
  • 18-21. Defect Distribution. 20% of the code contains ~75% of defects. Upper bound for prediction. Graphic created with TreeMap (University of Maryland), see http://www.cs.umd.edu/hcil/treemap
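
The 20% / ~75% figure can be read as a concentration measure: rank the components by defect count and ask what share of all defects falls into the top fifth. The sketch below assumes "code" is counted in components (the treemap on the slide may weight by size instead), and the defect counts are made up for illustration; they are not SAP data.

```python
def defect_share_of_top(defects_per_component, fraction=0.2):
    """Share of all defects located in the most defect-prone `fraction` of components."""
    ranked = sorted(defects_per_component, reverse=True)
    k = max(1, round(len(ranked) * fraction))  # number of components in the top group
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Hypothetical defect counts for ten components.
counts = [30, 15, 5, 4, 3, 1, 1, 1, 0, 0]
share = defect_share_of_top(counts, fraction=0.2)
```

A predictor that picked exactly these top components would find this share of the defects, which is why the slide calls the figure an upper bound for prediction.
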
  • 22. Basics. Predictor: Input, Model, Output
  • 23. How to collect Input Data? Study 1: McCabe, FanOut, LoC, Coupling. Study 2: Dependencies
  • 24-27. Collecting Metric Data (study 1: McCabe, FanOut, LoC, Coupling)
      ‣ Metric tools: ckjm, JDepend, ephyra
      ‣ Static code checkers: PMD, FindBugs
      ‣ Change frequency
  • 28-30. Collecting Dependency Data (study 2: Dependencies)
      ‣ Extracting package import relations
      ‣ Tool: JDepend
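
To illustrate what "package import relations" means, the sketch below derives a package-to-package dependency map from Java source text with regular expressions. It is not how JDepend works (JDepend analyzes compiled class files); the function name and the sample sources are hypothetical, static imports are ignored, and imports inside comments would be picked up by mistake.

```python
import re
from collections import defaultdict

PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
# Group 1: imported name, group 2: ".*" if it is a wildcard import.
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+?)(\.\*)?\s*;", re.MULTILINE)


def package_dependencies(sources):
    """Map each declared package to the set of other packages it imports from."""
    deps = defaultdict(set)
    for text in sources:
        match = PACKAGE_RE.search(text)
        if not match:
            continue  # skip files in the default package
        package = match.group(1)
        for name, wildcard in IMPORT_RE.findall(text):
            # "import a.b.*;" targets package a.b; "import a.b.C;" targets a.b as well.
            target = name if wildcard else name.rsplit(".", 1)[0]
            if target != package:
                deps[package].add(target)
    return dict(deps)


# Hypothetical source files.
sources = [
    "package com.example.app;\nimport com.example.util.Strings;\nimport java.util.*;\nclass A {}",
    "package com.example.util;\nimport java.util.List;\nclass Strings {}",
]
deps = package_dependencies(sources)
```
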
  • 31. How to measure Component Quality? Predictor: Input ✔, Model, Output
  • 33-37. Component Quality. Bug database: Bug 42233, "FileSystemPreferences lockFile() should close ...". Version archive (v1.17, v1.18, v1.19): change "Fixed Bug 42233".
  • 38-39. Component Quality. "Fixed Bug 42233" on the maintenance branch (v1.17, v1.18, v1.19) of the version archive: #defects + 1.
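
The mapping from bug database to version archive can be sketched as follows: scan commit messages for bug references, keep only IDs that exist in the bug database, and add one defect to each component the fix touched. The regular expression, the data layout and the component names are assumptions made for this illustration; the commits are assumed to be already restricted to the maintenance branch of the release under study.

```python
import re
from collections import Counter

BUG_REF_RE = re.compile(r"\bbug\s*#?(\d+)", re.IGNORECASE)


def count_defects(commits, known_bug_ids):
    """Count distinct fixed bugs per component.

    commits: iterable of (commit_message, changed_components) pairs.
    known_bug_ids: set of IDs present in the bug database.
    """
    fixed = set()  # (component, bug_id) pairs, so a bug counts once per component
    for message, components in commits:
        for raw_id in BUG_REF_RE.findall(message):
            bug_id = int(raw_id)
            if bug_id in known_bug_ids:
                fixed.update((component, bug_id) for component in components)
    return Counter(component for component, _ in fixed)


# Hypothetical commits; only bug 42233 exists in the bug database.
commits = [
    ("Fixed Bug 42233", ["prefs"]),
    ("Refactoring", ["prefs"]),
    ("Follow-up fix for bug 42233", ["prefs", "io"]),
    ("Fixed bug 99999", ["io"]),
]
defects = count_defects(commits, known_bug_ids={42233})
```
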
  • 40. How to build Predictor Models? Linear Regression (Y = Xβ + ε) and Support Vector Machine. Inputs: McCabe, FanOut, LoC, Coupling; Dependencies
  • 41. Forward Prediction. Timeline t with releases V1 and V2: static analysis, training bug data, test bug data
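
A minimal sketch of forward prediction, assuming it means fitting the model on one release and applying it to the next. For brevity it uses a single predictor (LoC) instead of the full model Y = Xβ + ε; all numbers and component names are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical training data from release V1: LoC and defect count per component.
v1_loc = [100, 200, 300, 400]
v1_defects = [0, 2, 3, 6]
intercept, slope = fit_simple_regression(v1_loc, v1_defects)

# Hypothetical static-analysis data for release V2 (its defects are not yet known).
v2_loc = {"a": 150, "b": 500, "c": 50}
predicted = {name: intercept + slope * loc for name, loc in v2_loc.items()}

# Components ordered from most to least defect-prone according to the model.
ranking = sorted(predicted, key=predicted.get, reverse=True)
```

The ranking for V2 would then be compared against the bug data that later accumulates for V2, which plays the role of the test set.
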
  • 43-44. Metric Correlations. Prediction is more precise at higher granularity levels.
      Metric              Aggregation   Project 2 (package level)   Project 4 (class level)
      LoC                 Sum           0.583                       0.377
      LoC                 Max           0.587                       n/a
      McCabe              Sum           0.583                       0.299
      McCabe              Max           0.588                       0.261
      Efferent Coupling   -             0.608                       n/a
      Design Rules        Sum           0.557                       0.264
      Design Rules        Max           0.578                       n/a
      Changes             Sum           0.308                       0.403
      Changes             Max           0.240                       n/a
  • 45. Hit Rate. Hit rate = 50% (Top 20%)
      actual:    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
      predicted: 4, 9, 2, 11, 6, 1, 3, 5, 10, 8, 7
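
One way to compute a hit rate is the overlap between the top k entries of the actual ranking and the top k entries of the predicted ranking. The sketch below assumes that definition and reads the two columns of the slide as ranked lists of component IDs. The cutoff k = 4 is also an assumption: with it the overlap is 2 of 4, which matches the 50% shown, although 20% of 11 components would be closer to 2.

```python
def hit_rate(actual_ranking, predicted_ranking, k):
    """Fraction of the predicted top-k components that are also in the actual top k."""
    actual_top = set(actual_ranking[:k])
    predicted_top = set(predicted_ranking[:k])
    return len(actual_top & predicted_top) / k


# Component IDs ordered from most to least defect-prone.
actual = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
predicted = [4, 9, 2, 11, 6, 1, 3, 5, 10, 8, 7]

rate = hit_rate(actual, predicted, k=4)  # components 2 and 4 are in both top groups
```
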
  • 46. Predictions using Linear Regression (inputs: McCabe, FanOut, LoC, Coupling)
                      Top 5%   Top 20%
      All projects    46%      55%
      Group 1         47%      63%
      Project 1       21%      43%
      Project 2       42%      64%
      Project 3       41%      55%
  • 47-48. Predicting from Dependencies (Support Vector Machine; input: Dependencies). Stable prediction results across projects.
                      Top 5%   Top 20%
      Group 1         26%      43%
      Project 1       38%      50%
      Project 2       36%      46%
      Project 3       46%      49%
  • 49-50. Compare Results. [Bar chart: hit rate (0% to 80%) for Dependencies vs. Metrics across Group 1, Project 1, Project 2, Project 3.] Complexity metrics have higher predictive power
  • 51. Lessons Learned
                                   Nagappan et al.   Schröter et al.   our study
      metrics defect correlation   ✔                 n/a               ✔
      prediction possible          ✔                 ✔                 ✔
      forward prediction           ✘                 ✘                 ✔
      universal predictor          ✘                 ✘                 ✘
  • 53-55. Lessons Learned
      ‣ Predictions based on static code features provide limited results and depend on the project context
      ‣ Software archives are a reliable and easily accessible source of defect data
      ‣ Defects have many sources, and code is just one of them
  • 56-57. Thank you! SQS Software Quality Systems AG, Stollwerckstraße 11, 51149 Cologne, Germany. Phone: +49 22 03 91 54 - 7149. Fax: +49 22 03 91 54 - 15. Email: [email protected]. Internet: www.sqs-group.com