Rename Chains:
An Exploratory Study on the Occurrence and Characteristics of
Identifiers Undergoing Multiple Renamings
Anthony Peruma & Christian Newman
Overview
We explore the phenomenon of a single
identifier undergoing multiple renames (i.e., a
rename chain) through a large-scale empirical
study of 800 open-source Java systems
Introduction
• Research shows that developers spend 58% of their time on program
comprehension activities
• Identifier names account for 70% of the characters in the code base
• Identifier names help developers understand the purpose of the
identifier – essential that names should be high-quality
• Names must be unambiguous and intent-revealing in communicating the purpose
and behavior of the code
• Developers correct poor-quality names via rename refactoring
operations – over 40% of refactoring operations are renames
• Not all renames result in a high-quality name
• An identifier can undergo multiple renamings throughout its lifetime (i.e., a chain
of renamings – a rename chain)
Rename Chain Examples
A method rename chain resulting in a more
descriptive method name in the final rename
A rename chain resulting in a weak
method name; it is just a copy of the
statement within the method
Related Work on Identifier Renaming
• Empirical Studies
• Arnaoudova et al. – Rename taxonomy to classify the semantic updates to a name;
developer study showing renaming is not always straightforward
• Peruma et al. – Multiple studies that examine the structure and meaning of names →
developers frequently narrow the meaning of the name; grammar patterns;
contextualization; taxonomy for digits in a name
• Recommendation Models
• Allamanis et al. – utilizes statistical NLP to learn the coding style of a codebase
• Suzuki et al. – An n-gram-based approach for assessing the comprehensibility of method
names and recommending intelligible method names
• Liu et al. – Deep learning techniques to provide recommendations based on the overlap
between method bodies and names that are close in a vector space
• …
While there are studies that investigate (or involve) rename refactorings, this is the first
study that examines a chain of rename operations for an identifier
Goal & Impact
Understand the evolution of identifier
names by constructing and studying the
characteristics of a chain of renames
for identifiers (i.e., a rename chain)
Facilitate the research and development
of tools to aid in name appraisal and
recommendations
Research Questions
• RQ1: To what extent do identifiers undergo multiple rename
refactoring operations?
• Understand the volume and types of identifiers that undergo multiple renames
• RQ2: How frequently do renames occur within a rename chain, and
who is responsible for their creation?
• Gain insight into the developers performing the renames in the chain
• RQ3: How do the semantics of an identifier's name evolve in a
rename chain?
• Determine the lexical-semantic properties of names in the rename chain
• RQ4: To what extent can commit log messages help contextualize the
occurrence of rename chains?
• Identify the specific causes for developers to create rename chains
Contributions
A publicly available dataset of
rename chains for replication
or extension studies
An understanding of identifier
name evolution and a discussion
on avenues for future research
Experiment Design
Source dataset of rename refactorings and commit details → Dataset of rename chains and their characteristics
Source Dataset: Used in prior refactoring research studies; renames mined using RefactoringMiner
Rename Chain Construction: For each project: sort renames using the author commit date; compare the
identifier’s old and new name by their fully qualified name
Part-of-Speech Tagging: Utilize a specialized identifier name part-of-speech tagger; tags for only the original and
last name in the rename chain
Topic Modeling: Commit messages associated with rename chains; preprocessing; Latent Dirichlet Allocation
RQ Analysis: Supplement our quantitative findings with qualitative examples from our dataset
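The Rename Chain Construction step above can be sketched roughly as follows. This is a minimal illustration, not the study's actual tooling: the `Rename` record layout, the `build_chains` function, and the sample names are assumptions made for this example, since the slides do not show the dataset schema. Renames are sorted per project by author commit date, and a rename joins an existing chain when its old fully qualified name equals the most recent new name of that chain.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical record layout; the actual dataset schema is not shown in the slides.
Rename = namedtuple("Rename", "project old_name new_name author date")


def build_chains(renames):
    """Link renames into chains by matching old and new fully qualified names."""
    chains = []       # every chain found so far, each a list of Rename records
    open_chains = {}  # (project, current fully qualified name) -> its chain
    # Sort per project by author commit date, as described on the slide.
    for r in sorted(renames, key=lambda r: (r.project, r.date)):
        # A rename extends a chain if its old name is that chain's latest name.
        chain = open_chains.pop((r.project, r.old_name), None)
        if chain is None:
            chain = []
            chains.append(chain)
        chain.append(r)
        open_chains[(r.project, r.new_name)] = chain
    # Only identifiers renamed more than once form a rename chain.
    return [c for c in chains if len(c) >= 2]


# Made-up sample data, deliberately listed out of date order.
renames = [
    Rename("demo", "com.demo.TheTestServlet", "com.demo.TestServlet",
           "bob", datetime(2020, 3, 1)),
    Rename("demo", "com.demo.TestServlet", "com.demo.TheTestServlet",
           "ann", datetime(2020, 1, 1)),
    Rename("demo", "com.demo.Util.run", "com.demo.Util.execute",
           "ann", datetime(2020, 2, 1)),
]
chains = build_chains(renames)
for chain in chains:
    print(" -> ".join([chain[0].old_name] + [r.new_name for r in chain]))
```

In this sample, the `Util.run` rename happens only once and is therefore excluded, while the two servlet renames are linked into a single chain. A real pipeline would also need to handle name collisions within a project, which this sketch ignores.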
Results
RQ 1: To what extent do identifiers undergo multiple
rename refactoring operations?
• Identifier renaming is a common operation developers perform
• 285,786 operations: Methods: 26.50%; Parameters: 25.53%; Variables: 21.75%
• Most identifiers undergo a single rename in their lifetime
• 17,404 detected rename chains – Methods are likely to have a chain
• Methods: 30.73%; Variables: 23.47%; Classes: 16.85%
• A rename chain is usually short – composed of a median of 2 rename
instances; variables typically undergo around 3 renamings
• Rename chains tend to occur in projects frequently
• 83.71% of projects have rename chains
• Projects have a median of 9 identifiers undergoing multiple renames
RQ 2: How frequently do renames occur within a
rename chain, and who is responsible for their
creation?
Interval Analysis – duration between renames in the chain
• Median duration between renames is 2 days
• Attributes: 25 days; Classes: 19 days; Methods: 14 days; Variables: 7 days; Parameters:
2 days
• Duration between the first and last rename:
• Parameters have the lowest interval: 17 days
• Variables have the highest interval: 357 days
Developer Analysis – developers performing the renames
• Most chains have the same developer performing all renames: 62.05%
• Multi-developer chains have around 2 developers involved
• Attribute chains have the largest number of developers: 4
• 11.51% of chains have different developers for the first and last rename
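The interval and developer measures reported above could be computed per chain along these lines. This is a sketch under assumptions: the `chain_metrics` function, its field names, and the sample chain are hypothetical, and the study may define its measures differently (for example, in how it aggregates across chains).

```python
from datetime import datetime
from statistics import median


def chain_metrics(chain):
    """Summarize one rename chain.

    chain: list of (author, commit_date) pairs, one per rename,
    already sorted by commit date (hypothetical input format).
    """
    authors = [author for author, _ in chain]
    dates = [date for _, date in chain]
    # Days between each pair of consecutive renames in the chain.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return {
        "median_gap_days": median(gaps),
        "first_to_last_days": (dates[-1] - dates[0]).days,
        "developer_count": len(set(authors)),
        "same_first_and_last_developer": authors[0] == authors[-1],
    }


# Made-up chain of three renames by two developers.
example_chain = [
    ("alice", datetime(2020, 1, 1)),
    ("alice", datetime(2020, 1, 3)),
    ("bob", datetime(2020, 1, 17)),
]
metrics = chain_metrics(example_chain)
print(metrics)
```

Grouping these per-chain values by identifier type (attribute, class, method, variable, parameter) and taking the median of each group would give figures of the kind listed on the slide.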
RQ 3: How do the semantics of an identifier's name
evolve in a rename chain?
Analysis of the lexical-semantic structure (i.e., part-of-speech tags) of
names in the rename chain
Analysis is limited to only the first and last names in the chain
• The same part-of-speech tags are used for the original and new name
• TestServlet → TheTestServlet → TestServlet : NM-N → DT-NM-N → NM-N
• Developers utilize standard naming structures:
• Class: NM-NM-N → NM-NM-N
• Attribute: NM-N → NM-N
• Method: V-NM-N → V-NM-N
• Parameter & Variable: N → N
• Usually, the original and new names are not identical – 78%
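The first-versus-last comparison described above can be illustrated with a short sketch. It does not run a part-of-speech tagger: the grammar patterns are supplied by hand, copied from the TestServlet example on the slide, and the `split_identifier` and `compare_chain_ends` helpers are hypothetical names introduced for this illustration.

```python
import re


def split_identifier(name):
    """Split a camelCase, PascalCase, or snake_case identifier into words."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)


def compare_chain_ends(chain):
    """Compare the first and last names of a chain.

    chain: list of (name, grammar_pattern) pairs in rename order, where the
    pattern is a hyphen-separated tag sequence such as "NM-N".
    """
    (first_name, first_pattern), (last_name, last_pattern) = chain[0], chain[-1]
    return {
        "same_pattern": first_pattern == last_pattern,
        "same_name": first_name == last_name,
    }


# Names and tags taken from the slide; the tags are not computed here.
servlet_chain = [
    ("TestServlet", "NM-N"),
    ("TheTestServlet", "DT-NM-N"),
    ("TestServlet", "NM-N"),
]

# Sanity check: one tag per word in each name.
for name, pattern in servlet_chain:
    assert len(split_identifier(name)) == len(pattern.split("-"))

result = compare_chain_ends(servlet_chain)
print(result)
```

In this example the chain ends where it started, so both the pattern and the name match. A chain whose first and last names differ but share a pattern such as V-NM-N would report `same_pattern` as true and `same_name` as false.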
RQ 4: To what extent can commit log messages help
contextualize the occurrence of rename chains?
An automated analysis of rename chain commit log messages
using Latent Dirichlet Allocation
3 high-level topics associated with these messages:
• Code Cleanup – developer improving code style quality by
adhering to standards – ‘naming’, ‘convention’, ‘whitespace’
• “Lots of fixes using Checkstyle - Fixed some names to follow conventions...”
• Refactoring – developers updating the code related to the
behavior and design of the system – ‘refactor’, ‘updated’, ‘revert’
• “Major refactor to start process of eventually moving content manager classes…”
• Bug Fix/Testing – renames are part of either a bug fix or unit
testing – ‘fix’, ‘bug’, ‘test’, ‘testcase’
• “fixed bug with searching for transitive dependencies + added test for it…”
Discussion & Conclusion
Overall Findings
Renaming is a common activity in software implementation
• Most identifiers typically undergo a single rename
• However, rename chains frequently occur in projects – methods are frequently associated
with rename chains
• Renames in a chain occur days apart – parameters typically having the shortest duration
(approx. two days) and attributes the longest
• Renames in a rename chain are usually performed by the same developer
• Multi-developer chains usually involve two developers
• The grammatical structure of the initial and last name in the chain remains the same
• Code Cleanup, Refactoring, and Bug Fix/Testing are the causes of rename chains;
however, these topics are high-level due to the nature of commit messages
Key Takeaways
• Part-of-speech tags are an efficient means of studying the semantic
updates a name undergoes when renamed
• Academia and practitioners should not limit their focus to only the words in a name
• Improvements to name recommendations and appraisal techniques
• These techniques should consider the historical evolution of an identifier’s name in
their evaluation process
• Emphasis on the importance of using high-quality names
• Academia should instill in students the importance of high-quality names in the source
code; practitioners should incorporate naming quality into code reviews
• Challenges with automated contextualization of rename chains
• Current NLP techniques are not adequate for analyzing software engineering artifacts
Conclusion & Future Work
• Interpreting identifier names forms the backbone of any code
comprehension task
• Developers perform renames to correct poor-quality names
• This can continue throughout the system’s lifetime
• We analyze multiple renames applied to a single identifier (i.e., rename
chain)
• Most projects (83.71%) exhibit this phenomenon, with a median chain size of two
renames
• We report on characteristics such as the interval between renames, developers
responsible for chain construction, grammatical changes, and motivation
• Future work: Human subject study
• Validate our empirical findings with developers of varying experience and skills
Thank You!
Anthony Peruma
https://www.peruma.me
Ad

More Related Content

Similar to Rename Chains: An Exploratory Study on the Occurrence and Characteristics of Identifiers Undergoing Multiple Renamings (20)

Thesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.pptThesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.ppt
Ptidej Team
 
Contextualizing Rename Decisions using Refactorings and Commit Messages
Contextualizing Rename Decisions using Refactorings and Commit MessagesContextualizing Rename Decisions using Refactorings and Commit Messages
Contextualizing Rename Decisions using Refactorings and Commit Messages
University of Hawai‘i at Mānoa
 
Candidate selection tutorial
Candidate selection tutorialCandidate selection tutorial
Candidate selection tutorial
Yiqun Liu
 
SIGIR 2017 - Candidate Selection for Large Scale Personalized Search and Reco...
SIGIR 2017 - Candidate Selection for Large Scale Personalized Search and Reco...SIGIR 2017 - Candidate Selection for Large Scale Personalized Search and Reco...
SIGIR 2017 - Candidate Selection for Large Scale Personalized Search and Reco...
Aman Grover
 
Recommending refactoring operations in large software systems
Recommending refactoring operations in large software systemsRecommending refactoring operations in large software systems
Recommending refactoring operations in large software systems
Carlos Eduardo
 
Keyphrase Extraction And Source Code Similarity Detection- A Survey
Keyphrase Extraction And Source Code Similarity Detection- A Survey Keyphrase Extraction And Source Code Similarity Detection- A Survey
Keyphrase Extraction And Source Code Similarity Detection- A Survey
Nakul Sharma
 
Lecture: Refactoring
Lecture: RefactoringLecture: Refactoring
Lecture: Refactoring
Marcus Denker
 
Text summarization
Text summarization Text summarization
Text summarization
prateek khandelwal
 
Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003
robertstevens65
 
Webinar: Question Answering and Virtual Assistants with Deep Learning
Webinar: Question Answering and Virtual Assistants with Deep LearningWebinar: Question Answering and Virtual Assistants with Deep Learning
Webinar: Question Answering and Virtual Assistants with Deep Learning
Lucidworks
 
Declarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrierDeclarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrier
Crai Macdonald
 
Natural Language to SQL Query conversion using Machine Learning Techniques on...
Natural Language to SQL Query conversion using Machine Learning Techniques on...Natural Language to SQL Query conversion using Machine Learning Techniques on...
Natural Language to SQL Query conversion using Machine Learning Techniques on...
HPCC Systems
 
UML Basics Department of CSE, SARANATHAN
UML Basics  Department of CSE, SARANATHANUML Basics  Department of CSE, SARANATHAN
UML Basics Department of CSE, SARANATHAN
arifcsg
 
Innovating Multi-Class Text Classification:Transforming Models with propmtify...
Innovating Multi-Class Text Classification:Transforming Models with propmtify...Innovating Multi-Class Text Classification:Transforming Models with propmtify...
Innovating Multi-Class Text Classification:Transforming Models with propmtify...
ankarao14
 
Innovating Multi-Class Text Classification:Transforming Models with propmtify...
Innovating Multi-Class Text Classification:Transforming Models with propmtify...Innovating Multi-Class Text Classification:Transforming Models with propmtify...
Innovating Multi-Class Text Classification:Transforming Models with propmtify...
ankarao14
 
Natural Language Processing Advancements By Deep Learning: A Survey
Natural Language Processing Advancements By Deep Learning: A SurveyNatural Language Processing Advancements By Deep Learning: A Survey
Natural Language Processing Advancements By Deep Learning: A Survey
Rimzim Thube
 
report summarizing semantic analyzer phase
report summarizing semantic analyzer phasereport summarizing semantic analyzer phase
report summarizing semantic analyzer phase
spektrasmith
 
The recommendations system for source code components retrieval
The recommendations system for source code components retrievalThe recommendations system for source code components retrieval
The recommendations system for source code components retrieval
AYESHA JAVED
 
Refactoring_Rosenheim_2008_Workshop
Refactoring_Rosenheim_2008_WorkshopRefactoring_Rosenheim_2008_Workshop
Refactoring_Rosenheim_2008_Workshop
Max Kleiner
 
52 - The Impact of Test Ownership and Team Structure on the Reliability and E...
52 - The Impact of Test Ownership and Team Structure on the Reliability and E...52 - The Impact of Test Ownership and Team Structure on the Reliability and E...
52 - The Impact of Test Ownership and Team Structure on the Reliability and E...
ESEM 2014
 
Thesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.pptThesis+of+laleh+eshkevari.ppt
Thesis+of+laleh+eshkevari.ppt
Ptidej Team
 
Contextualizing Rename Decisions using Refactorings and Commit Messages
Contextualizing Rename Decisions using Refactorings and Commit MessagesContextualizing Rename Decisions using Refactorings and Commit Messages
Contextualizing Rename Decisions using Refactorings and Commit Messages
University of Hawai‘i at Mānoa
 
Candidate selection tutorial
Candidate selection tutorialCandidate selection tutorial
Candidate selection tutorial
Yiqun Liu
 
SIGIR 2017 - Candidate Selection for Large Scale Personalized Search and Reco...
SIGIR 2017 - Candidate Selection for Large Scale Personalized Search and Reco...SIGIR 2017 - Candidate Selection for Large Scale Personalized Search and Reco...
SIGIR 2017 - Candidate Selection for Large Scale Personalized Search and Reco...
Aman Grover
 
Recommending refactoring operations in large software systems
Recommending refactoring operations in large software systemsRecommending refactoring operations in large software systems
Recommending refactoring operations in large software systems
Carlos Eduardo
 
Keyphrase Extraction And Source Code Similarity Detection- A Survey
Keyphrase Extraction And Source Code Similarity Detection- A Survey Keyphrase Extraction And Source Code Similarity Detection- A Survey
Keyphrase Extraction And Source Code Similarity Detection- A Survey
Nakul Sharma
 
Lecture: Refactoring
Lecture: RefactoringLecture: Refactoring
Lecture: Refactoring
Marcus Denker
 
Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003
robertstevens65
 
Webinar: Question Answering and Virtual Assistants with Deep Learning
Webinar: Question Answering and Virtual Assistants with Deep LearningWebinar: Question Answering and Virtual Assistants with Deep Learning
Webinar: Question Answering and Virtual Assistants with Deep Learning
Lucidworks
 
Declarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrierDeclarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrier
Crai Macdonald
 
Natural Language to SQL Query conversion using Machine Learning Techniques on...
Natural Language to SQL Query conversion using Machine Learning Techniques on...Natural Language to SQL Query conversion using Machine Learning Techniques on...
Natural Language to SQL Query conversion using Machine Learning Techniques on...
HPCC Systems
 
UML Basics Department of CSE, SARANATHAN
UML Basics  Department of CSE, SARANATHANUML Basics  Department of CSE, SARANATHAN
UML Basics Department of CSE, SARANATHAN
arifcsg
 
Innovating Multi-Class Text Classification:Transforming Models with propmtify...
Innovating Multi-Class Text Classification:Transforming Models with propmtify...Innovating Multi-Class Text Classification:Transforming Models with propmtify...
Innovating Multi-Class Text Classification:Transforming Models with propmtify...
ankarao14
 
Innovating Multi-Class Text Classification:Transforming Models with propmtify...
Innovating Multi-Class Text Classification:Transforming Models with propmtify...Innovating Multi-Class Text Classification:Transforming Models with propmtify...
Innovating Multi-Class Text Classification:Transforming Models with propmtify...
ankarao14
 
Natural Language Processing Advancements By Deep Learning: A Survey
Natural Language Processing Advancements By Deep Learning: A SurveyNatural Language Processing Advancements By Deep Learning: A Survey
Natural Language Processing Advancements By Deep Learning: A Survey
Rimzim Thube
 
report summarizing semantic analyzer phase
report summarizing semantic analyzer phasereport summarizing semantic analyzer phase
report summarizing semantic analyzer phase
spektrasmith
 
The recommendations system for source code components retrieval
The recommendations system for source code components retrievalThe recommendations system for source code components retrieval
The recommendations system for source code components retrieval
AYESHA JAVED
 
Refactoring_Rosenheim_2008_Workshop
Refactoring_Rosenheim_2008_WorkshopRefactoring_Rosenheim_2008_Workshop
Refactoring_Rosenheim_2008_Workshop
Max Kleiner
 
52 - The Impact of Test Ownership and Team Structure on the Reliability and E...
52 - The Impact of Test Ownership and Team Structure on the Reliability and E...52 - The Impact of Test Ownership and Team Structure on the Reliability and E...
52 - The Impact of Test Ownership and Team Structure on the Reliability and E...
ESEM 2014
 

More from University of Hawai‘i at Mānoa (20)

Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
Exploring Accessibility Trends and Challenges in Mobile App Development: A St...
Exploring Accessibility Trends and Challenges in Mobile App Development: A St...Exploring Accessibility Trends and Challenges in Mobile App Development: A St...
Exploring Accessibility Trends and Challenges in Mobile App Development: A St...
University of Hawai‘i at Mānoa
 
The Impact of Generative AI-Powered Code Generation Tools on Software Enginee...
The Impact of Generative AI-Powered Code Generation Tools on Software Enginee...The Impact of Generative AI-Powered Code Generation Tools on Software Enginee...
The Impact of Generative AI-Powered Code Generation Tools on Software Enginee...
University of Hawai‘i at Mānoa
 
Mobile App Security Trends and Topics: An Examination of Questions From Stack...
Mobile App Security Trends and Topics: An Examination of Questions From Stack...Mobile App Security Trends and Topics: An Examination of Questions From Stack...
Mobile App Security Trends and Topics: An Examination of Questions From Stack...
University of Hawai‘i at Mānoa
 
On the Rationale and Use of Assertion Messages in Test Code: Insights from So...
On the Rationale and Use of Assertion Messages in Test Code: Insights from So...On the Rationale and Use of Assertion Messages in Test Code: Insights from So...
On the Rationale and Use of Assertion Messages in Test Code: Insights from So...
University of Hawai‘i at Mānoa
 
A Developer-Centric Study Exploring Mobile Application Security Practices and...
A Developer-Centric Study Exploring Mobile Application Security Practices and...A Developer-Centric Study Exploring Mobile Application Security Practices and...
A Developer-Centric Study Exploring Mobile Application Security Practices and...
University of Hawai‘i at Mānoa
 
Building Hawaii’s IT Future Together CIO Council & UH Manoa ICS Collaboration
Building Hawaii’s IT Future Together CIO Council & UH Manoa ICS CollaborationBuilding Hawaii’s IT Future Together CIO Council & UH Manoa ICS Collaboration
Building Hawaii’s IT Future Together CIO Council & UH Manoa ICS Collaboration
University of Hawai‘i at Mānoa
 
Impostor Syndrome in Final Year Computer Science Students: An Eye Tracking an...
Impostor Syndrome in Final Year Computer Science Students: An Eye Tracking an...Impostor Syndrome in Final Year Computer Science Students: An Eye Tracking an...
Impostor Syndrome in Final Year Computer Science Students: An Eye Tracking an...
University of Hawai‘i at Mānoa
 
An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in And...
An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in And...An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in And...
An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in And...
University of Hawai‘i at Mānoa
 
Performance Comparison of Binary Machine Learning Classifiers in Identifying ...
Performance Comparison of Binary Machine Learning Classifiers in Identifying ...Performance Comparison of Binary Machine Learning Classifiers in Identifying ...
Performance Comparison of Binary Machine Learning Classifiers in Identifying ...
University of Hawai‘i at Mānoa
 
A Primer on High-Quality Identifier Naming [ASE 2022]
A Primer on High-Quality Identifier Naming [ASE 2022]A Primer on High-Quality Identifier Naming [ASE 2022]
A Primer on High-Quality Identifier Naming [ASE 2022]
University of Hawai‘i at Mānoa
 
Preparing for the Academic Job Market: Experience and Tips from a Recent F...
Preparing for the  Academic Job Market:  Experience and Tips from  a Recent F...Preparing for the  Academic Job Market:  Experience and Tips from  a Recent F...
Preparing for the Academic Job Market: Experience and Tips from a Recent F...
University of Hawai‘i at Mānoa
 
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
University of Hawai‘i at Mānoa
 
A Primer on High-Quality Identifier Naming
A Primer on High-Quality Identifier NamingA Primer on High-Quality Identifier Naming
A Primer on High-Quality Identifier Naming
University of Hawai‘i at Mānoa
 
Test Anti-Patterns: From Definition to Detection
Test Anti-Patterns: From Definition to DetectionTest Anti-Patterns: From Definition to Detection
Test Anti-Patterns: From Definition to Detection
University of Hawai‘i at Mānoa
 
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
University of Hawai‘i at Mānoa
 
Understanding Digits in Identifier Names: An Exploratory Study
Understanding Digits in Identifier Names: An Exploratory StudyUnderstanding Digits in Identifier Names: An Exploratory Study
Understanding Digits in Identifier Names: An Exploratory Study
University of Hawai‘i at Mānoa
 
IDEAL: An Open-Source Identifier Name Appraisal Tool
IDEAL: An Open-Source Identifier Name Appraisal ToolIDEAL: An Open-Source Identifier Name Appraisal Tool
IDEAL: An Open-Source Identifier Name Appraisal Tool
University of Hawai‘i at Mānoa
 
Using Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name EvolutionUsing Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name Evolution
University of Hawai‘i at Mānoa
 
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
University of Hawai‘i at Mānoa
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
Exploring Accessibility Trends and Challenges in Mobile App Development: A St...
Exploring Accessibility Trends and Challenges in Mobile App Development: A St...Exploring Accessibility Trends and Challenges in Mobile App Development: A St...
Exploring Accessibility Trends and Challenges in Mobile App Development: A St...
University of Hawai‘i at Mānoa
 
The Impact of Generative AI-Powered Code Generation Tools on Software Enginee...
The Impact of Generative AI-Powered Code Generation Tools on Software Enginee...The Impact of Generative AI-Powered Code Generation Tools on Software Enginee...
The Impact of Generative AI-Powered Code Generation Tools on Software Enginee...
University of Hawai‘i at Mānoa
 
Mobile App Security Trends and Topics: An Examination of Questions From Stack...
Mobile App Security Trends and Topics: An Examination of Questions From Stack...Mobile App Security Trends and Topics: An Examination of Questions From Stack...
Mobile App Security Trends and Topics: An Examination of Questions From Stack...
University of Hawai‘i at Mānoa
 
On the Rationale and Use of Assertion Messages in Test Code: Insights from So...
On the Rationale and Use of Assertion Messages in Test Code: Insights from So...On the Rationale and Use of Assertion Messages in Test Code: Insights from So...
On the Rationale and Use of Assertion Messages in Test Code: Insights from So...
University of Hawai‘i at Mānoa
 
A Developer-Centric Study Exploring Mobile Application Security Practices and...
A Developer-Centric Study Exploring Mobile Application Security Practices and...A Developer-Centric Study Exploring Mobile Application Security Practices and...
A Developer-Centric Study Exploring Mobile Application Security Practices and...
University of Hawai‘i at Mānoa
 
Building Hawaii’s IT Future Together CIO Council & UH Manoa ICS Collaboration
Building Hawaii’s IT Future Together CIO Council & UH Manoa ICS CollaborationBuilding Hawaii’s IT Future Together CIO Council & UH Manoa ICS Collaboration
Building Hawaii’s IT Future Together CIO Council & UH Manoa ICS Collaboration
University of Hawai‘i at Mānoa
 
Impostor Syndrome in Final Year Computer Science Students: An Eye Tracking an...
Impostor Syndrome in Final Year Computer Science Students: An Eye Tracking an...Impostor Syndrome in Final Year Computer Science Students: An Eye Tracking an...
Impostor Syndrome in Final Year Computer Science Students: An Eye Tracking an...
University of Hawai‘i at Mānoa
 
An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in And...
An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in And...An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in And...
An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in And...
University of Hawai‘i at Mānoa
 
Performance Comparison of Binary Machine Learning Classifiers in Identifying ...
Performance Comparison of Binary Machine Learning Classifiers in Identifying ...Performance Comparison of Binary Machine Learning Classifiers in Identifying ...
Performance Comparison of Binary Machine Learning Classifiers in Identifying ...
University of Hawai‘i at Mānoa
 
Preparing for the Academic Job Market: Experience and Tips from a Recent F...
Preparing for the  Academic Job Market:  Experience and Tips from  a Recent F...Preparing for the  Academic Job Market:  Experience and Tips from  a Recent F...
Preparing for the Academic Job Market: Experience and Tips from a Recent F...
University of Hawai‘i at Mānoa
 
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
University of Hawai‘i at Mānoa
 
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
University of Hawai‘i at Mānoa
 
Understanding Digits in Identifier Names: An Exploratory Study
Understanding Digits in Identifier Names: An Exploratory StudyUnderstanding Digits in Identifier Names: An Exploratory Study
Understanding Digits in Identifier Names: An Exploratory Study
University of Hawai‘i at Mānoa
 
Using Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name EvolutionUsing Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name Evolution
University of Hawai‘i at Mānoa
 
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
University of Hawai‘i at Mānoa
 
Ad

Rename Chains: An Exploratory Study on the Occurrence and Characteristics of Identifiers Undergoing Multiple Renamings

  • 1. Rename Chains: An Exploratory Study on the Occurrence and Characteristics of Identifiers Undergoing Multiple Renamings Anthony Peruma & Christian Newman
  • 2. Overview We explore the phenomenon of a single identifier undergoing multiple renames (i.e., a rename chain) through a large-scale empirical study of 800 open-source Java systems
  • 3. Introduction • Research shows that developers spend 58% of their time on program comprehension activities • Identifier names account for 70% of the characters in the code base • Identifier names help developers understand the purpose of the identifier – it is essential that names be high-quality • Names must be unambiguous and intent-revealing in communicating the purpose and behavior of the code • Developers correct poor-quality names via rename refactoring operations – over 40% of refactoring operations are renames • Not all renames result in a high-quality name • An identifier can undergo multiple renamings throughout its lifetime (i.e., a chain of renamings – a rename chain)
  • 4. Rename Chain Examples A method rename chain resulting in a more descriptive method name in the final rename A rename chain resulting in a weak method name; it is just a copy of the statement within the method
  • 5. Related Work on Identifier Renaming • Empirical Studies • Arnaoudova et al. – Rename taxonomy to classify the semantic updates to a name; developer study showing renaming is not always straightforward • Peruma et al. – Multiple studies that examine the structure and meaning of names → developers frequently narrow the meaning of the name; grammar patterns; contextualization; taxonomy for digits in a name • Recommendation Models • Allamanis et al. – Utilizes statistical NLP to learn the coding style of a codebase • Suzuki et al. – An n-gram-based approach for assessing the comprehensibility of method names and recommending intelligible method names • Liu et al. – Deep learning techniques to provide recommendations based on the overlap between method bodies and names that are close in a vector space • … While there are studies that investigate (or involve) rename refactorings, this is the first study that examines a chain of rename operations for an identifier
  • 6. Goal & Impact Understand the evolution of identifier names by constructing and studying the characteristics of a chain of renames for identifiers (i.e., a rename chain) Facilitate the research and development of tools to aid in name appraisal and recommendations
  • 7. Research Questions • RQ1: To what extent do identifiers undergo multiple rename refactoring operations? • Understand the volume and types of identifiers that undergo multiple renames • RQ2: How frequently do renames occur within a rename chain, and who is responsible for their creation? • Gain insight into the developers performing the renames in the chain • RQ3: How do the semantics of an identifier's name evolve in a rename chain? • Determine the lexical-semantic properties of names in the rename chain • RQ4: To what extent can commit log messages help contextualize the occurrence of rename chains? • Identify the specific causes for developers to create rename chains
  • 8. Contributions A publicly available dataset of rename chains for replication or extension studies An understanding of identifier name evolution and a discussion on avenues for future research
  • 9. Experiment Design Source dataset of rename refactorings and commit details Dataset of rename chains and their characteristics Source Dataset: Used in prior refactoring research studies; renames mined using RefactoringMiner Rename Chain Construction: For each project: sort renames using the author commit date; compare the identifier’s old and new name by their fully qualified name Part-of-Speech Tagging: Utilize a specialized identifier name part-of-speech tagger; tags for only the original and last name in the rename chain Topic Modeling: Commit messages associated with rename chains; preprocessing; Latent Dirichlet Allocation RQ Analysis: Supplement our quantitative findings with qualitative examples from our dataset
  • 11. RQ 1: To what extent do identifiers undergo multiple rename refactoring operations? • Identifier renaming is a common operation developers perform • 285,786 operations: Methods: 26.50%; Parameters: 25.53%; Variables: 21.75% • Most identifiers undergo a single rename in their lifetime • 17,404 detected rename chains – Methods are likely to have a chain • Methods: 30.73%; Variables: 23.47%; Classes: 16.85% • A rename chain is usually short – composed of a median of 2 rename instances; variables typically undergo around 3 renamings • Rename chains tend to occur in projects frequently • 83.71% of projects have rename chains • Projects have a median of 9 identifiers undergoing multiple renames
  • 12. RQ 2: How frequently do renames occur within a rename chain, and who is responsible for their creation? Interval Analysis – duration between renames in the chain • Median duration between renames is 2 days • Attributes: 25 days; Classes: 19 days; Methods: 14 days; Variables: 7 days; Parameters: 2 days • Duration between the first and last rename: • Parameters have the lowest interval: 17 days • Variables have the highest interval: 357 days Developer Analysis – developers performing the renames • Most chains have the same developer performing all renames: 62.05% • Multi-developer chains have around 2 developers involved • Attribute chains involve the most developers: 4 • 11.51% of chains have different developers for the first and last rename
  • 13. RQ 3: How do the semantics of an identifier's name evolve in a rename chain? Analysis of the lexical-semantic structure (i.e., part-of-speech tags) of names in the rename chain Analysis is limited to only the first and last names in the chain • The same part-of-speech tags are used for the original and new name • TestServlet → TheTestServlet → TestServlet : NM-N → DT-NM-N → NM-N • Developers utilize standard naming structures: • Class: NM-NM-N → NM-NM-N • Attribute: NM-N → NM-N • Method: V-NM-N → V-NM-N • Parameter & Variable: N → N • Usually, the original and new names are not identical – 78%
  • 14. RQ 4: To what extent can commit log messages help contextualize the occurrence of rename chains? An automated analysis of rename chain commit log messages using Latent Dirichlet Allocation 3 high-level topics associated with these messages: • Code Cleanup – developer improving code style quality by adhering to standards – ‘naming’, ‘convention’, ‘whitespace’ • “Lots of fixes using Checkstyle - Fixed some names to follow conventions...” • Refactoring – developers updating the code related to the behavior and design of the system – ‘refactor’, ‘updated’, ‘revert’ • “Major refactor to start process of eventually moving content manager classes…” • Bug Fix/Testing – renames are part of either a bug fix or unit testing – ‘fix’, ‘bug’, ‘test’, ‘testcase’ • “fixed bug with searching for transitive dependencies + added test for it…”
  • 16. Overall Findings Renaming is a common activity in software implementation • Most identifiers typically undergo a single rename • However, rename chains frequently occur in projects – methods are frequently associated with rename chains • Renames in a chain occur days apart – variables typically having the shortest duration (approx. two days) and attributes the longest • Renames in a rename chain are usually performed by the same developer • Multi-developer chains usually involve two developers • The grammatical structure of the initial and last name in the chain remains the same • Code Cleanup, Refactoring, and Bug Fix/Testing are the causes of rename chains – however, these topics are high-level due to the nature of commit messages
  • 17. Key Takeaways • Part-of-speech tags are an efficient means of studying the semantic updates a name undergoes when renamed • Academia and practitioners should not limit their focus to only the words in a name • Improvements to name recommendations and appraisal techniques • These techniques should consider the historical evolution of an identifier’s name in their evaluation process • Emphasis on the importance of using high-quality names • Academia should instill in students the importance of high-quality names in the source code; practitioners should incorporate naming quality into code reviews • Challenges with automated contextualization of rename chains • Current NLP techniques are not adequate for analyzing software engineering artifacts
  • 18. Conclusion & Future Work • Interpreting identifier names forms the backbone of any code comprehension task • Developers perform renames to correct poor-quality names • This can continue throughout the system’s lifetime • We analyze multiple renames applied to a single identifier (i.e., rename chain) • Almost all projects exhibit this phenomenon, with an average chain size of two renames • We report on characteristics such as the interval between renames, developers responsible for chain construction, grammatical changes, and motivation • Future work: Human subject study • Validate our empirical findings with developers of varying experience and skills
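
The rename chain construction step in the Experiment Design slide (sort a project's renames by author commit date, then match an identifier's old and new fully qualified names) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Rename` record, the `build_chains` and `summarize` helpers, the package prefix `app.`, the author names, and the dates are all hypothetical. Only the TestServlet → TheTestServlet → TestServlet sequence comes from the slides. The sketch also assumes that no two identifiers share a fully qualified name at the same time.

```python
from collections import namedtuple
from datetime import datetime
from statistics import median

# Hypothetical record for one detected rename; field names are assumptions.
Rename = namedtuple("Rename", "old_name new_name author date")


def build_chains(renames):
    """Link renames of one project into chains by fully qualified name."""
    chains = []        # every chain, in order of its first rename
    open_by_name = {}  # current fully qualified name -> chain ending at it
    for r in sorted(renames, key=lambda r: r.date):
        # A rename extends a chain when its old name is that chain's latest name.
        chain = open_by_name.pop(r.old_name, None)
        if chain is None:
            chain = []
            chains.append(chain)
        chain.append(r)
        open_by_name[r.new_name] = chain
    # A rename chain needs at least two renames of the same identifier.
    return [c for c in chains if len(c) >= 2]


def summarize(chain):
    """Compute the kind of per-chain characteristics reported in RQ1 and RQ2."""
    gaps = [(b.date - a.date).days for a, b in zip(chain, chain[1:])]
    return {
        "names": [chain[0].old_name] + [r.new_name for r in chain],
        "length": len(chain),
        "median_interval_days": median(gaps),
        "single_developer": len({r.author for r in chain}) == 1,
    }


# Illustrative input, deliberately out of date order.
renames = [
    Rename("app.TheTestServlet", "app.TestServlet", "alice", datetime(2020, 1, 10)),
    Rename("app.TestServlet", "app.TheTestServlet", "alice", datetime(2020, 1, 3)),
    Rename("app.Util.run", "app.Util.execute", "bob", datetime(2020, 2, 1)),
]

chains = build_chains(renames)
summary = summarize(chains[0])
print(summary["names"], summary["length"], summary["median_interval_days"])
```

Sorting before matching is what lets a single pass over the renames work: by the time a rename is processed, the chain that produced its old name has already been recorded. The identifier renamed only once (`app.Util.run`) is dropped, mirroring the slides' distinction between single renames and rename chains.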