An Empirical Study of Function Clones in Open Source Software. Chanchal K. Roy and James R. Cordy, Queen’s University. Presenter: MF Khan
Outline: Introduction; NICAD Overview; Experimental Setup; Experimental Results; Conclusions; Discussion
Introduction: Code clone (clone): reusing a code fragment by copying and pasting, with or without minor modifications. Benefits of detection: software maintenance (bug detection). History: several detection techniques have been proposed, but in-depth comparative studies of cloning across a variety of systems are lacking.
Introduction (Cont.): NICAD. An in-depth study of function cloning in 15+ C and Java systems, including Apache and the Linux kernel. NICAD stands for Accurate Detection of Near-Miss Intentional Clones. The focus is on its value in detecting copied and pasted near-miss clones by using pretty-printing, code normalization and filtering. It is lightweight, using simple text-line comparison, and is capable of detecting clones in very large systems in different languages.
NICAD Overview: Three phases of clone detection. Extraction: all potential clones are identified and extracted, that is, all functions and methods in C and Java, together with their original source coordinates. Comparison (determination of clones): potential clones are clustered and compared. The pretty-printed potential clones are compared line by line, text-wise, using the longest common subsequence (LCS) algorithm.
NICAD Overview: Unique Percentage of Items (UPI). If the UPI for both line sequences is zero or below a certain threshold, the potential clones are considered to be clones. Reporting: results from NICAD are reported as an XML database and as interactive HTML.
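The comparison step can be sketched in a few lines of Python. This is a minimal illustration, not NICAD's implementation: it assumes that the UPI of a line sequence is the fraction of its lines that are not part of the LCS, and the function names (lcs_length, upi, is_clone_pair) and the sample functions are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def upi(lines, common):
    """Assumed definition: fraction of lines that are NOT in the LCS.

    Expects a non-empty sequence (the deck only considers non-empty functions).
    """
    return (len(lines) - common) / len(lines)


def is_clone_pair(lines_a, lines_b, threshold):
    """Both sequences must have a UPI at or below the threshold."""
    common = lcs_length(lines_a, lines_b)
    return upi(lines_a, common) <= threshold and upi(lines_b, common) <= threshold


# Two pretty-printed functions that differ in one of five lines (made-up data).
original = ["int inc(int x)", "{", "x = x + 1;", "return x;", "}"]
edited = ["int inc(int x)", "{", "x = x + 2;", "return x;", "}"]
```

With this data each sequence has a UPI of 0.2, so the pair counts as a near-miss clone at a threshold of 0.3 but not as an exact clone at a threshold of 0.0.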
Experimental Setup: The paper applies NICAD to find function clones in a number of open source systems. It then introduces a set of metrics to analyze the results.
Experimental Setup: Subject systems. 10 C and 7 Java systems.
Clone Definition: Non-empty functions of at least 3 LOC in pretty-printed format. Different Unique Percentage of Items (UPI) thresholds are used to find exact and near-miss clones. E.g., a UPI threshold of 0.0 finds exact clones; with a UPI threshold of 0.10, two functions are reported as clones if the UPI of each is at most 0.10.
Validation of Clones: Validating the detected clones is a two-step process. 1: NICAD's interactive HTML output, to get an overall view of the original source of the clone classes. 2: XML output, to pairwise compare the original source of the functions in each clone class using Linux diff to determine the textual similarity of the original source.
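The pairwise comparison in step 2 can be sketched as follows. The slide uses Linux diff; this sketch swaps in Python's standard difflib module instead, and the clone class, function names and similarity measure (SequenceMatcher.ratio over lines) are illustrative assumptions rather than the procedure used in the study.

```python
import difflib
from itertools import combinations


def pairwise_similarity(clone_class):
    """Compare every pair of functions in one clone class, line by line.

    clone_class maps a (hypothetical) function name to its original source.
    Returns {(name_a, name_b): similarity ratio between 0.0 and 1.0}.
    """
    results = {}
    for (name_a, src_a), (name_b, src_b) in combinations(clone_class.items(), 2):
        matcher = difflib.SequenceMatcher(None, src_a.splitlines(), src_b.splitlines())
        results[(name_a, name_b)] = matcher.ratio()
    return results


# A made-up clone class with three members.
clone_class = {
    "read_a": "open(f);\nread(f);\ncheck(f);\nclose(f);",
    "read_b": "open(f);\nread(f);\ncheck(f);\nclose(f);",
    "read_c": "open(f);\nread(f);\nverify(f);\nclose(f);",
}
similarities = pairwise_similarity(clone_class)
```

A ratio of 1.0 means the two original sources are textually identical; lower values point to pairs worth inspecting by hand.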
Metrics and Visualizations: Total Cloned Methods (TCM): gives the overall cloning statistics. Files Associated with Clones (FAWC): gives the overall localization of clones. From a software maintenance point of view, a lower value of FAWCP is desirable. Why? Because the clones are then localized to certain specific files and thus may be easier to maintain. Still, FAWC cannot say which files contain the majority of the clones in the system.
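A small sketch of how these two metrics might be computed as percentages. The formulas are assumptions read off the metric names (cloned methods over all methods, and files with at least one cloned method over all files), and the record layout and sample data are made up for illustration.

```python
def tcm_percent(methods):
    """Assumed TCM percentage: cloned methods / all methods * 100."""
    cloned = sum(1 for m in methods if m["cloned"])
    return 100.0 * cloned / len(methods)


def fawc_percent(methods):
    """Assumed FAWC percentage: files with at least one cloned method / all files * 100."""
    all_files = {m["file"] for m in methods}
    files_with_clones = {m["file"] for m in methods if m["cloned"]}
    return 100.0 * len(files_with_clones) / len(all_files)


# Hypothetical system: four methods in two files, both clones in one file.
methods = [
    {"file": "src/a.c", "name": "f1", "cloned": True},
    {"file": "src/a.c", "name": "f2", "cloned": True},
    {"file": "src/b.c", "name": "g1", "cloned": False},
    {"file": "src/b.c", "name": "g2", "cloned": False},
]
```

Here half of the methods are cloned, but only one of the two files is affected, which is the localized situation the slide describes as easier to maintain.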
Metrics and Visualizations: Cloned Ratio of File for Methods (CRFM): with CRFM we attempt to discover highly cloned files; it is computed for a particular file (f). Profile of Cloning Locality w.r.t. Methods (PCLM): Kapser and Godfrey provide three location-based categories of function clones. 1: In the same file. 2: In the same directory. 3: In different directories.
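Both metrics can be sketched in Python. The CRFM formula (cloned methods in a file over all methods in that file) is an assumption based on the metric's name, the locality profile simply counts clone pairs per location category, and all names and data below are hypothetical.

```python
import posixpath
from collections import Counter


def crfm_percent(methods, file_path):
    """Assumed CRFM(f): cloned methods in file f / all methods in file f * 100."""
    in_file = [m for m in methods if m["file"] == file_path]
    cloned = sum(1 for m in in_file if m["cloned"])
    return 100.0 * cloned / len(in_file)


def location_category(file_a, file_b):
    """Classify a clone pair by where its two functions live."""
    if file_a == file_b:
        return "same file"
    if posixpath.dirname(file_a) == posixpath.dirname(file_b):
        return "same directory"
    return "different directory"


def locality_profile(clone_pairs):
    """Percentage of clone pairs in each location category."""
    counts = Counter(location_category(a, b) for a, b in clone_pairs)
    return {cat: 100.0 * n / len(clone_pairs) for cat, n in counts.items()}


# Made-up methods and clone pairs (each pair is the two files involved).
methods = [
    {"file": "src/a.c", "name": "f1", "cloned": True},
    {"file": "src/a.c", "name": "f2", "cloned": True},
    {"file": "src/a.c", "name": "f3", "cloned": False},
    {"file": "src/a.c", "name": "f4", "cloned": False},
]
clone_pairs = [
    ("src/a.c", "src/a.c"),
    ("src/b.c", "src/b.c"),
    ("src/a.c", "src/b.c"),
    ("src/a.c", "lib/c.c"),
]
profile = locality_profile(clone_pairs)
```

Sorting files by their CRFM value is one way to surface the highly cloned files that FAWC alone cannot identify.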
Experimental Results: 1. There is more function cloning in open source Java than in C; on average about 15% (7.2% with respect to LOC). 2. The effect of increasing the UPI threshold is almost identical.
Detailed Overview: 1. Several of the C systems have fewer than 10% cloned functions. The Java systems are consistent in their level of cloning.
Clone Associated Files
Clone Associated Files: FAWC addresses the question of what portion of the files in a system is associated with clones. From a software maintenance point of view, a system with more clones that are associated with only a few files is in some sense better than a system with fewer clones scattered over many files.
Profiles of Cloning Density: This tells us which files are highly cloned, or which files contain the majority of the clones. That means scattered files and more near-miss clones.
Profile of Cloning Density: The assumption is that cloned methods in high-density cloned files have been intentionally copied and pasted.
Profile of Cloning Localization: The location of a clone pair is a factor in software maintenance. Except for Linux, there are no exact clones (UPI threshold 0.0) in C. When the UPI threshold is 0.3, on average 45.9% (49.0% by LOC) of the clone pairs in C occur.
Conclusion: NICAD is capable of accurately finding both exact function clones and near-miss function clones.
Discussion: What is the definition of a clone? What is the definition of a near-miss clone? Why is Weltab higher in slide 14? What if we use C++ or C#? What will happen if we use a smaller clone granularity, such as begin-end blocks?
Thank you.
