Tse2022 - Code Cloning in Smart Contracts On The Ethereum Platform - An Extended Replication Study


IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. XX, NO. X, MONTH 20XX

Code Cloning in Smart Contracts on the Ethereum Platform: An Extended Replication Study

Faizan Khan∗, Istvan David∗, Daniel Varro, Shane McIntosh

Abstract—Smart contracts are programs deployed on blockchains that run upon meeting predetermined conditions. Once deployed,
smart contracts are immutable; thus, defects in the deployed code cannot be fixed. As a consequence, software engineering
anti-patterns, such as code cloning, pose a threat to code quality and security if unnoticed before deployment. In this paper, we report
on the cloning practices of the Ethereum blockchain platform by analyzing 33,073 smart contracts amounting to over 4 MLOC. Prior
work reported an unusually high 79.2% of code clones in Ethereum smart contracts. We replicate this study at the conceptual level, i.e.,
we answer the same research questions by employing different methods. In particular, we analyze clones at the granularity of functions
instead of code files, thereby providing a more fine-grained estimate of the clone ratio. Furthermore, we analyze more complex clone
types, allowing for a richer analysis of cloning cases. To achieve this finer granularity of cloning analysis, we rely on the NiCad clone
detection tool and extend it with support for Solidity, the programming language of the Ethereum platform. Our analysis shows that
most findings of the original study hold at the finer granularity of our study as well; but also sheds light on some differences, and
contributes new findings. Most notably, we report a 30.13% overall clone ratio, out of which 27.03% are exact duplicates. Our findings
motivate improving the reuse mechanisms of Solidity, and in a broader context, of programming languages used for the development of
smart contracts. Tool builders and language engineers can use this paper in the design and development of such reuse mechanisms.
Business stakeholders can use this paper to better assess the security risks and technical outlooks of blockchain platforms.

Index Terms—Code cloning, Smart contracts, Ethereum, Blockchain, Empirical Study

1 INTRODUCTION

Blockchains offer a novel computation paradigm for distributed systems, where information has to be stored without the possibility of modification. Smart contracts are software programs deployed on a blockchain. Once deployed, these programs are immutable and execute for as long as the platform is active. Due to its immutable nature, repair in the deployed code is not possible as the modified source code needs to be redeployed as a new instance. As a consequence, bad software engineering practices pose more severe threats in blockchains than in traditional software settings. Vulnerabilities in smart contracts can result in substantial financial repercussions, as the majority of deployed smart contracts are written for financial applications [1].

A particular bad practice that can deteriorate many functional and extra-functional properties (e.g., security, reliability, and performance) of a software system is the abundance of duplicated source code, also known as code cloning. Code clones can be commonly found in software systems. Studies show that a large proportion of code in software projects (6% to 50%) is duplicated [2], [3]. Although some benefits of code cloning exist, such as improved learning curve of APIs and rapid bug workarounds [4], unintentional cloning [5] affects the quality [6] and maintainability [7] of source code adversely. For example, upon the detection of a bug in a clone, its copies must be checked for bugs as well. Such problems are further exacerbated by code similarity often going beyond simple copy-and-paste [8], rendering the management of clones a complex problem.

Studies show that clones are ideal targets for refactoring aimed at improving the design of the software [9]. Despite the apparent benefits, the use of clone detection tools is limited in the development of smart contracts. This is partly attributed to the fact that the majority of clone detection tools are designed for traditional programming languages [10], and only limited support exists for the novel class of programming languages targeting decentralized execution platforms, such as blockchains. As a consequence, the vast body of knowledge on clone detection in traditional programming languages, such as C, C++, and Java [3], [11], cannot be exploited in programming languages used for developing smart contracts, such as Solidity for Ethereum [12].

In this paper, we report on the cloning practices on Ethereum,1 one of the most frequently used blockchain platforms. We have designed and carried out a study to analyze 33,073 smart contracts, containing more than four million lines of code (MLOC).

• F. Khan and D. Varro are with the Department of Electrical and Computer Engineering, McGill University, Canada. E-mail: [email protected], [email protected]
• I. David is with the Department of Computer Science and Operations Research, Université de Montréal, Canada. E-mail: [email protected]
• S. McIntosh is with the David R. Cheriton School of Computer Science, University of Waterloo, Canada. E-mail: [email protected]
• ∗ F. Khan and I. David contributed to this paper equally.

1 https://ethereum.org/en/

© 2022 IEEE. Author pre-print copy. The final publication is available online at: https://dx.doi.org/10.1109/TSE.2022.3207428.

Prior work by Kondo et al. [13] reported an unusually high 79.2% proportion of code clones on the Ethereum platform. Our work is an extended conceptual replication of [13], that is, we (i) pose the same research questions; but (ii) use different methods to answer them; and by that, (iii) refine and extend the findings of the original study.2

Specifically, in our study, we analyze code cloning practices at the level of function blocks, as opposed to the file-level analysis of the original study. To achieve this finer granularity of cloning analysis, we opt for the NiCad clone detection tool [14] instead of Deckard [15], which was used in the original study, and extend it to support Solidity, the programming language of the Ethereum platform. NiCad has been frequently used for clone detection tasks in conventional software systems. It has been thoroughly analyzed and benchmarked in previous studies to identify optimal configuration settings for detecting clones [16]. We also extend the scope of potential clone types to better identify near-miss (Type-3) clones, which can detect clones with modifications such as changed, added, or removed statements [17]. We assess the ratio of clones in the code base by removing clone duplicates, i.e., clones that have been identified multiple times as instances of different clone types. This allows for a better understanding of the types of cloning-related issues in Solidity smart contracts [18]. This step is explained in detail in Section 5.2.1.

To the best of our knowledge, this paper is the first to explore cloning in Solidity smart contracts at this finer granularity and with an awareness of these types of clones.

Results. We corroborate many findings of Kondo et al. [13], but observe some important differences as well. Most importantly, we observe that the clone ratio decreases from 79.2% to 30.13% at the finer level of granularity of functions. Moreover, we observe that the vast majority of clones (90%) are Type-1 clones (i.e., exact replicas). This 90% proportion among the clone types tends to be steady over an extended period of time, while the total number of clones increases; i.e., smart contract development practices heavily rely on copy-and-paste mechanisms. Tool builders and language engineers can use these results to improve reuse mechanisms in smart contract programming languages, including, but not limited to Solidity. Business stakeholders can use this paper to better assess the security risks and technical outlooks of blockchain platforms.

Fostering replication. The replication package containing the data and analysis scripts of our study is publicly available for independent verification or replication.3

2 In the remainder of this paper, we refer to [13] as the original study.
3 https://zenodo.org/record/6975351

2 BACKGROUND

Clone detection aims to identify repeated code. Clones are identified based on a similarity relation between their two respective code fragments. A clone fragment is a sequence of contiguous lines of code that is similar to another, non-overlapping sequence of contiguous lines of code. Clones with similar properties form a clone pair, and when there are many similar clones, they form a clone class (also referred to as clone group or clone cluster) [16].

Clone granularity can be either free or fixed. Free granularity clone detection considers the source code as a whole and does not make use of syntactic boundaries, such as functions, blocks, or statements [10]. Fixed granularity, however, incorporates such syntactic units. As such, fixed granularity provides a more precise estimate of clone ratio, and is more useful than free granularity in the eventual refactoring of the duplicated code [19]. Furthermore, clone detectors of free granularity produce a higher number of false positives [11], [20], which are code fragments that have been cloned with a purpose, such as getter/setter methods in Java code. In this paper, we use a fixed granularity at the function level.

Syntactic clones are identified based on textual program code, while the identification of semantic clones requires an analysis of the behavior of the units of code [21]. In this paper, we focus on syntactic clones, which are further divided into three types. Type-1 clone fragments are exactly identical except for variations in whitespaces, layout, and comments. For example, Listings 1 and 2 would be Type-1 clones of each other, were their respective source code on Lines 5 and 7 identical. Type-2 clone fragments include Type-1 clones, but allow for differences in identifiers, literals, and data types. For example, Listings 1 and 2 would be Type-2 clones of each other, were the respective assigned values on Lines 5 and 7 identical. Type-3 clone fragments include Type-2 clones, but allow code fragments to differ in complete lines of code, thereby capturing clones with entire lines added or removed. The number of lines to be tolerated is defined by the dissimilarity threshold, in ratio with the overall code block. In our experiments, we set the dissimilarity threshold to 0.3, which classifies clones as Type-3 if at least 70% of the normalized subsequences match. Accordingly, Listings 1 and 2 are Type-3 clones. They differ on two out of twenty lines, i.e., a 2/20 = 0.1 dissimilarity or 90% similarity, which exceeds the threshold of 70%. Identifiers in Type-2 and Type-3 clones are normalized by performing a renaming strategy. The two most common renaming strategies are blind renaming, where all identifiers are replaced with the same key; and consistent renaming, where identifiers are given a unique key. For example, the line int sum = 0 is changed to x x = 0 by blind renaming, and to x1 x2 = 0 by consistent renaming. Line 5 in Listings 1 and 2 is changed to x = "MT" and x = "NEM", respectively by both blind and consistent renaming. Were the variables named differently, e.g., symbol = "MT" in Listing 1 and sym = "NEM" in Listing 2, blind renaming would still change them to x = "MT" and x = "NEM"; however, consistent renaming would change them to x1 = "MT" and x2 = "NEM".

In this paper, we identify Type-1, Type-2, and Type-3 clones. The latter two types are further refined into blindly and consistently renamed clones (Type-2b, Type-2c; and Type-3b, Type-3c; respectively). Semantic clones (Type-4) are beyond the scope of this paper.

Smart contracts are programs that can be reliably executed by a network of anonymous distributed nodes without the need for a centralized trusted authority. The collection of these nodes forms a distributed computing platform called a blockchain [22], upon which smart contracts are executed. The name blockchain reflects the fact that transactions (i.e., actions initiated by an externally-owned account, such as

Listing 1: The MT.sol smart contract.
 1 contract MT is ERC20Interface, SafeMath {
 2   ...
 3   constructor(string memory _name) public {
 4     name = _name;
 5     symbol = "MT";
 6     decimals = 18;
 7     totalSupply = 500000000000000000000000000;
 8     balanceOf[msg.sender] = totalSupply;
 9   }
10
11   function transfer(address _to, uint256 _value) public returns (bool success) {
12     require(_to != address(0));
13     require(balanceOf[msg.sender] >= _value);
14     require(balanceOf[_to] + _value >= balanceOf[_to]);
15     balanceOf[msg.sender] = SafeMath.safeSub(balanceOf[msg.sender], _value);
16     balanceOf[_to] = SafeMath.safeAdd(balanceOf[_to], _value);
17     emit Transfer(msg.sender, _to, _value);
18     return true;
19   }
20   ...
21 }

Listing 2: The NEM.sol smart contract.
 1 contract NEM is ERC20Interface, SafeMath {
 2   ...
 3   constructor(string memory _name) public {
 4     name = _name;
 5     symbol = "NEM";
 6     decimals = 18;
 7     totalSupply = 860000000000000000000000000;
 8     balanceOf[msg.sender] = totalSupply;
 9   }
10
11   function transfer(address _to, uint256 _value) public returns (bool success) {
12     require(_to != address(0));
13     require(balanceOf[msg.sender] >= _value);
14     require(balanceOf[_to] + _value >= balanceOf[_to]);
15     balanceOf[msg.sender] = SafeMath.safeSub(balanceOf[msg.sender], _value);
16     balanceOf[_to] = SafeMath.safeAdd(balanceOf[_to], _value);
17     emit Transfer(msg.sender, _to, _value);
18     return true;
19   }
20   ...
21 }
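
The renaming strategies and the dissimilarity ratio described in Section 2 can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not NiCad's implementation: identifiers are found with a regular expression instead of a parser, literals are left untouched, and the helper names (blind_rename, consistent_rename, dissimilarity) are hypothetical. Line matching uses Python's difflib as a stand-in for the tool's line-wise comparison.

```python
import re
from difflib import SequenceMatcher

# String literals are matched first so that their contents are never renamed.
_TOKEN = re.compile(r'"[^"]*"|\b[A-Za-z_]\w*')


def blind_rename(line: str) -> str:
    """Replace every identifier with the same key, "x"."""
    return _TOKEN.sub(
        lambda m: m.group(0) if m.group(0).startswith('"') else "x", line
    )


def consistent_rename(lines: list[str]) -> list[str]:
    """Give each distinct identifier of one fragment its own key (x1, x2, ...)."""
    keys: dict[str, str] = {}

    def key_for(match: re.Match) -> str:
        token = match.group(0)
        if token.startswith('"'):
            return token  # keep string literals as they are
        return keys.setdefault(token, f"x{len(keys) + 1}")

    return [_TOKEN.sub(key_for, line) for line in lines]


def dissimilarity(a: list[str], b: list[str]) -> float:
    """Share of lines, relative to the longer fragment, with no matching line."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    total = max(len(a), len(b))
    return (total - matched) / total if total else 0.0


# Two synthetic 20-line fragments that differ on two lines.
frag_a = [f"stmt{i};" for i in range(20)]
frag_b = list(frag_a)
frag_b[4] = 'symbol = "NEM";'
frag_b[6] = "totalSupply = 860;"

ratio = dissimilarity(frag_a, frag_b)
is_type3_candidate = ratio <= 0.3  # dissimilarity threshold used in the paper
```

With the threshold of 0.3, the two synthetic fragments fall within the Type-3 tolerance, mirroring the two-out-of-twenty-lines argument made for Listings 1 and 2.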

a human) within this network are stored in a chain of immutable blocks. One commonly used platform is Ethereum [12]. Solidity is an object-oriented and statically-typed programming language designed for developing smart contracts, influenced by C++, Python, and ECMAScript.

Listings 1 and 2 show code snippets from the MonPayToken (MT.sol)4 and NEM token5 smart contracts written in Solidity. Both smart contracts create a custom token that can be treated as a virtual currency. The listings show that the contracts are identical apart from their symbols and the total supply of tokens. Both of the smart contracts use the SafeMath contract and ERC-20 interface6 to implement the Token functionality. There are 20 instances of the same smart contract being repeated with small changes in our corpus. Such repetitions pose a threat to the platform, as vulnerabilities in any of these base smart contracts would potentially affect a large number of smart contracts in production.

4 etherscan.io/token/0xa0b469450e78b3a85d828d454696f8e4bd420038
5 etherscan.io/token/0xc14db8e15690c28752dbda133f51821402d29f29
6 https://eips.ethereum.org/EIPS/eip-20

3 SUMMARY OF THE ORIGINAL STUDY

Kondo et al. [13] report (i) the amount of cloned Solidity smart contracts on the Ethereum platform; (ii) the characteristics of clones; and (iii) the overlaps of clones with code blocks of smart contract libraries (e.g., OpenZeppelin). The authors analyzed 33,073 smart contracts amounting to about 4 MLOC, and 13 releases of OpenZeppelin7 to answer three research questions.

7 https://github.com/OpenZeppelin/openzeppelin-contracts

3.1 Research questions and major findings

The research questions and key observations from the original study are the following.

RQ1. How frequently are verified contracts cloned?
79.2% of the studied contracts are clones. In particular: 16.7% of the studied contracts are Type-1 clones; 43.3% of the studied contracts are Type-2 clones. Type-3 clones were considered out of scope because their detection is still being actively researched.

RQ2. What are the characteristics of clusters of similar verified contracts?
The original study reports on three inferred characteristics: (i) category, (ii) activity concentration, and (iii) authorship. In particular: (i) 9 out of the top-10 largest clusters are token managers; (ii) transaction activity tends to be concentrated on a few contracts; and (iii) contracts in a cluster tend to be created by many authors.

RQ3. How frequently code blocks of verified contracts are identical to those from OpenZeppelin?
About one-third of all 165,005 code blocks extracted from verified contracts are identical to OpenZeppelin code blocks. 36.3% of the verified contracts include at least one code block that is identical to an OpenZeppelin code block. 50% of the code blocks from 26.3% of the verified contracts are identical to OpenZeppelin code blocks. The ERC-20 OpenZeppelin category is the most frequently reused category, containing code blocks to support the implementation of token contracts that comply with the ERC-20 standard. SafeMath.sol is the most frequently reused OpenZeppelin code file, containing functions that perform mathematical operations efficiently and safely.

3.2 Approach

Clone granularity and detection tool. Deckard [15], a free granularity clone detector, was used to detect clones between Solidity code files.

Clone types considered. Type-1 and Type-2 clones were considered as part of RQ1.

8 https://etherscan.io

Corpus. The corpus consists of 4,004,543 lines of code, extracted from 33,073 verified smart contracts. The files were retrieved from Etherscan8 in July 2018. Etherscan is an analytics platform for the Ethereum blockchain that analyzes each block on Ethereum and provides insights on each deployed contract. The existence of source code on Etherscan indicates that the source code in Solidity provided by Etherscan matches the bytecode deployed to Ethereum,

and therefore, it is considered verified. Thus, the corpus contains only verified contracts. Verified smart contracts publish their flattened version on Etherscan. This flattened version of the source code is referred to as the code file of a verified contract. No restriction on the transaction number on the contracts was imposed. The corpus was compared with 13 releases of OpenZeppelin in RQ3, released between 2016-11-24 and 2018-08-10, with continuous growth in the size of 1–5 KLOC over time.

4 STUDY DESIGN

In this section, we discuss the design of our replication study, following the guidelines of Carver [23].

4.1 Type of replication

We have carried out a conceptual replication study [24]. That is, we test the same research questions on the same corpus, but use different measures and techniques.

4.2 Motivation for replication

Our work is motivated by the high clone ratio in Solidity smart contracts reported by the original study being significantly higher than clone ratios in traditional software systems. The figures are suggestive of systemic issues in the design and methodology of engineering Solidity smart contracts. Other work [25], [26] confirms this unusually high rate of clones. Such unusual figures have to be verified by independent studies, especially because (i) the cost of performing a transaction or executing a smart contract is proportional to its size,9 and thus, minimizing the size of smart contracts can result in direct cost reduction; and (ii) the majority of smart contracts are deployed in financial applications, and thus, vulnerabilities might have serious financial repercussions [27]. Furthermore, the approach of the original study is often prone to false positives due to the free clone granularity it relies on (see Section 2). Therefore, we set out to replicate the analyses of the original study using a fixed granularity at the function level. We conjecture that this new viewpoint from which cloning can be observed also enhances the applicability of the results in refactoring processes aiming to eliminate duplicated code.

9 See the documentation of Gas, the unit of computational effort required to execute operations on the Ethereum platform at https://ethereum.org/en/developers/docs/gas/.

4.3 Level of interaction with the original researchers

Ours is an external replication, i.e., the original researchers were not involved in the replication [23]. The interaction with the original researchers was restricted to inquiring about the study’s data and receiving the data package along with technical pointers regarding its structure.

4.4 Changes to the original study

Clone granularity and detection tool. We approach clone detection with a fixed granularity, fixing our scope at the function level. Due to the lack of support for fixed granularity analysis by the clone detection tool used in the original study (Deckard [15]), we rely on the NiCad clone detection tool [14]. NiCad does not support Solidity out-of-the-box. Therefore, we contribute a custom Solidity grammar,10 which makes our analysis and other future work possible.

Clone types considered. In addition to the Type-1 and Type-2 clones that the original study reports on, we also include Type-3 clones in our scope. Furthermore, to refine our reporting, we (i) split Type-2 and Type-3 clones into subtypes based on the renaming strategy that has been applied in the specific clone detecting case; and (ii) provide a systematic process to remove duplicated clones.

5 EXPERIMENTAL SETUP

In this section, we present our experimental setup. As shown in Figure 1, our study is composed of three phases.

5.1 Tool configuration and clone detection

In this phase, we select the clone detector for our study and configure it (Section 5.1.1), develop a grammar to support clone detection in Solidity smart contracts (Section 5.1.2), carry out the clone detection (Section 5.1.3), and download the releases of OpenZeppelin to be analyzed (Section 5.1.4).

5.1.1 Tool selection and configuration

We set out to select a clone detection tool that was (i) freely available and (ii) customizable. While there exists a long list of freely available clone detection tools [16], we found NiCad to be easily customizable for our purposes. NiCad is a text-based clone detection tool that was primarily designed to detect near-miss clones. It has been widely used for clone detection studies, thanks to its high precision and high recall for detecting near-miss clones [16], [28]. Following the suggestions of Wang et al. [29] and the settings used by Hasanain et al. [3], we set the granularity threshold to 10 LOC and the dissimilarity threshold for Type-3 clones to 0.3. These are also the default settings of NiCad.

5.1.2 Grammar development

To conduct our experiment, we extended NiCad with a grammar to enable the parsing of Solidity source code. Our grammar10 is inspired by the grammar for Solidity available in ANTLR.11 In order to extract a parse tree, NiCad expects a context-free grammar for the source-code language to be provided in a TXL grammar format [30]. TXL is a programming language for rule-based transformations. The TXL grammar not only provides the correct input for parsing, but also provides special markers, such as indent, extent, and newlines for pretty-printing the source code.

10 The grammar is available at https://github.com/eff-kay/nicad6.
11 https://github.com/antlr/grammars-v4/tree/master/solidity

5.1.3 Clone detection

The clone detection of NiCad consists of (i) parsing and extraction of potential clones, (ii) pretty-printing and normalizing, and (iii) clone clustering.

Parsing and extraction of potential clones. NiCad extracts a parse tree representation from the source code, filters out

[Figure 1: Overview of the study. The workflow comprises three phases: tool configuration and clone detection; metadata extraction and preprocessing; analysis and reporting.]

irrelevant blocks, performs normalizations, and transforms the parse tree back to source code. We developed pretty-printers for the grammar to ensure that all functions are evaluated consistently. The basic rules of pretty-printing are the following: (i) function signatures appear on a single line; (ii) block parentheses follow the ECMAScript standard; and (iii) every complete statement appears on its own line. A block of at least ten lines of normalized source code is considered for regular clones because according to previous studies, this is the best threshold value for the NiCad tool to detect code clones from Java and C source code [29]. Most studies of clone detection consider code clones of less than five LOC to be false positives [20] or micro-clones [31], [32], [33], an entirely different type of copy-pasted artifact.

Flexible pretty-printing and normalization. In addition to pretty-printing, NiCad is capable of context-sensitive normalizations, i.e., normalization based on the context of the code fragment. For this initial exploration, we use the default normalization settings of NiCad. NiCad detects clone pairs in this step by performing a line-wise comparison of the normalized code snippets.

Clone clustering. Finally, NiCad conducts a basic cluster analysis of the clones identified to combine similar clone fragments into the same clone cluster. Clones in the same cluster belong to the same clone class.

Corpus. We use the corpus of the original study,12 described in detail in Section 3.2. The corpus contains 33,073 verified smart contracts, amounting to 4,004,543 lines of code. By using verified contracts deployed to Ethereum, we can be sure that the corpus is representative of code in production.

12 https://github.com/SAILResearch/suppmaterial-18-masanari-smart contract cloning

5.1.4 OpenZeppelin code analysis

We download the contracts of the twelve releases of OpenZeppelin that were analyzed by Kondo et al. [13]. We use NiCad to extract contract and function blocks from the corpus as well as from the OpenZeppelin releases. As explained in Section 5.1.3, the extraction of contracts by NiCad normalizes the source code within each code-block. Then, we calculate unique hashes for every code block extracted from OpenZeppelin releases. We will compare these hashes with the hashes extracted from the corpus.

5.2 Metadata extraction and preprocessing

In this phase, we preprocess the clone detection results by removing duplicated clones (Section 5.2.1), extracting metadata (Section 5.2.2), and preparing the data (Section 5.2.3) for the subsequent analysis.

5.2.1 Duplicate removal

Due to the overlapping definition of clone types, some clones might belong to multiple clone classes conforming to different types [18]. For example, if two code fragments are identical, they will also be identical after a blind renaming procedure is performed on them. Consequently, the class of Type-2 clone instances that have been obtained by blind renaming will contain fragments that are also within the class of Type-1 clone instances. (See Section 2.) We refer to this implied containment relationship between clone classes as the strictness of a clone class. Class $C_t$ of clone instances of type $t$ is stricter than $C_{t'}$ if each clone that belongs to $C_t$ also belongs to $C_{t'}$. It is directly implied by this definition that $C_t \subseteq C_{t'}$. We construct the classes of our approach based on (i) the type of contained clone instances, and (ii) the renaming procedure. For simplicity, we refer to these classes by their type and by appending c (consistent) or b (blind) to the name, depending on the renaming procedure that was applied while extracting the clones. Equation 1 defines the relations between the resulting clone classes.

$$\text{Type-1} \subseteq \text{Type-2c} \subseteq \text{Type-2b} \subseteq \text{Type-3} \subseteq \text{Type-3c} \subseteq \text{Type-3b} \tag{1}$$

We use this hierarchy to remove clone duplicates. The process iterates through the sets from the strictest to the weakest, and excludes every clone present in the current set from the sets that are weaker than the current set. That is, first, we exclude every Type-1 clone from classes Type-2c, Type-2b, etc.; then, we exclude every Type-2c clone from classes Type-2b, Type-3, etc.; and as the last step, we exclude every Type-3c clone from class Type-3b.

5.2.2 Metadata extraction

For the 33,073 verified smart contracts in the corpus, we have collected additional metainformation from the Etherscan8 analytics platform. We collect two types of information: Creation dates to answer the clone evolution aspect of RQ1, and Author information to answer the authorship aspect of RQ2. Both types of information are extracted from the transaction log of contracts. In about 3.5% of contracts, the creation date was not available from Etherscan. Those cases are excluded from the analyses. We also calculate the length of files in this phase for further analysis.
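
The duplicate-removal pass of Section 5.2.1 can be sketched in a few lines of Python. This is a minimal illustration, assuming each clone class is available as a set of fragment identifiers; the function name remove_duplicates and the identifiers f1 to f5 are hypothetical, and the actual scripts in the replication package may be organized differently.

```python
# Clone classes ordered from strictest to weakest, following Equation 1.
HIERARCHY = ["Type-1", "Type-2c", "Type-2b", "Type-3", "Type-3c", "Type-3b"]


def remove_duplicates(classes: dict[str, set[str]]) -> dict[str, set[str]]:
    """Keep each clone only in the strictest class that contains it."""
    seen: set[str] = set()
    deduplicated: dict[str, set[str]] = {}
    for clone_type in HIERARCHY:
        members = classes.get(clone_type, set())
        # Drop every clone already reported by a stricter class.
        deduplicated[clone_type] = members - seen
        seen |= members
    return deduplicated


# Hypothetical detection output: weaker classes repeat the stricter ones.
raw = {
    "Type-1": {"f1", "f2"},
    "Type-2c": {"f1", "f2", "f3"},
    "Type-2b": {"f1", "f2", "f3"},
    "Type-3": {"f1", "f2", "f3", "f4"},
    "Type-3c": {"f1", "f2", "f3", "f4"},
    "Type-3b": {"f1", "f2", "f3", "f4", "f5"},
}
clean = remove_duplicates(raw)
```

Accumulating the already-seen clones in one set is equivalent to subtracting each stricter class from all weaker ones in turn, and it guarantees that every fragment is counted once when clone proportions are summed per type.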

5.2.3 Data preprocessing

Cumulative % of clones
100.0
To allow for fast analysis in the subsequent phase, we 90.0
take care of the computation-intensive tasks of (i) merging 80.0
71.9
metainformation with the data obtained from the clone 60.0
analysis, and (ii) preprocessing the merged data in various 50.0
ways. For example, we calculate quarterly figures for RQ1 40.0
30.0
and calculate Gini-coefficients for RQ2. The preprocessing 20.0
scripts are available from the replication package.3 10.0
0.0
02 10 20 30 40 50 60 70 80 90 100
5.3 Analysis and reporting Cumulative % of clusters
In this phase, we analyze the data (Section 5.3.1) and carry
out the comparison with the original study (Section 5.3.2). Figure 2: Relationship between the proportions of clones
Finally, we report our findings (Section 5.3.3). and contracts with two characteristic values highlighted.

5.3.1 Analysis
We analyze cloning patterns quantitatively. For each research question, we design an analysis and encode it in automated data analysis scripts in Python. The scripts are available from the replication package.³

5.3.2 Comparison
We compare our results with the original study by Kondo et al. [13] by research question. We map our findings (Section 6) to the fourteen observations of the original study and observe whether our findings corroborate the specific observations (Table 5).

5.3.3 Reporting
Finally, we conduct a narrative synthesis [34] to synthesize the main findings from the analyses. We conducted multiple discussions on the findings to formulate conjectures and hypotheses. We were especially interested in identifying feasible and actionable directions for the Ethereum/Solidity community, and for blockchain communities in general. These discussions are reported at the end of each subsection in Sections 6.1–6.3.

6 EMPIRICAL STUDY ON CODE CLONING
In this section, we present the results of our study by answering the three research questions.

6.1 RQ1: How frequently are verified contracts cloned?
6.1.1 Approach
To conduct this experiment, we calculate the clone percentage as the ratio of the total normalized LOC of clone clusters, and the total normalized LOC, as shown in Equation 2.

\[ \mathit{Clone\%} = \frac{\mathit{NormLOC}_{cloned}}{\mathit{NormLOC}_{total}} \times 100 \tag{2} \]

where $\mathit{NormLOC}_{cloned}$ is the sum of all cloned lines after the lines are normalized and the repeated clones are removed. To calculate $\mathit{NormLOC}_{cloned}$ for a clone cluster, we count the total number of fragments in the clone cluster and multiply it by the number of lines.

6.1.2 Findings
Clone ratio. We have observed that 30.13% of the corpus are clones. As shown in Table 1, 27.03% are Type-1 clones; 2.05% are Type-3 clones; while Type-2b, Type-2c, Type-3b and Type-3c amount to 1.05%. Our experiments show that 89.7% of all clones (27.03% of the corpus) are of Type-1, i.e., exact clones.

Table 1: Clone proportions.

Clone type      Proportion
Type-1          27.03%
Type-3           2.05%
Type-2c          0.54%
Type-3b          0.33%
Type-3c          0.15%
Type-2b          0.03%
total cloned    30.13%
clone-free      69.87%

Clone clusters. A small proportion of clone clusters encompass a large proportion of clones. As shown in Figure 2, 20% of all clusters encompass 71.9% of all clones; and half of the clones can be found in just 2.07% of clusters.

Clone evolution. The number of clones among newly created contracts continues to increase over time. The amount of non-Type-1 clones increases proportionally within the cloned code base. The number of all clones has doubled quarterly from early 2016 to early 2018, as shown in Figure 3a. However, the proportion of clones other than Type-1 remained steady after the initial uptick in Q2 2016, as shown in Figure 3b. That is, the increase in the number of code clones in Figure 3a is predominantly due to Type-1 clones.

6.1.3 Discussion
We find that the proportion of clones in smart contracts is 30.13%. This figure is on par with the ones reported by studies on conventional software systems [2], [35], [9]. However, an important difference with conventional software systems is that the source code of deployed blockchain applications is immutable and therefore, modifications, such as bug fixes, are not possible after deployment. This, in turn, amplifies the threat of exploiting vulnerabilities that spread across the code base by cloning. This mechanism has been demonstrated, e.g., in the Parity Wallet Hack [36], in which a malicious agent drained 153,037 ETH (over 428 million USD at the time of writing) from three high-profile contracts. However, we also observe that most of the clones in smart contracts are of Type-1 (27.03% of the overall corpus and 89.7% of all clones), that is, the majority of the functions are being copied without any modifications. As Tsantalis et al. [9] report, Type-1 clones are easier to refactor using existing clone refactoring tools, which suggests that refactoring the code base of the Ethereum blockchain platform

might be feasible. Furthermore, the analysis of clone clusters suggests that there are hotspots of cloned source code that should be the primary targets of refactoring. Refactorings related to inheritance (such as class and method extraction, method pull up and push down) could be of particular utility. While inheritance is a supported language feature in Solidity, it is apparently underutilized, as evidenced by the high proportion of clones despite the immutability of the deployed code and demonstrated in Listings 1-2. This might indicate a need for better tool assistance in recognizing abstraction/inheritance opportunities.

Figure 3: Evolution of clone numbers and percentages. (a) Number of newly created clones. (b) Number of newly created non-Type-1 clones. [Plots not reproduced; x-axis: Quarter, from 15/3 to 18/2; y-axes: Number of new clones (a) and % of new non-Type-1 clones (b), the latter split into Type-3 and other.]

6.2 RQ2: What are the characteristics of clusters of similar verified contracts?
6.2.1 Approach
We extract the function identifier within each clone fragment using a custom regular expression. For Type-1 and Type-3 clone clusters with no renaming, the function identifiers are the same for each clone fragment within a clone cluster. Thus, there is one function identifier per clone cluster. For the rest of the clone types, the unique function identifiers are extracted from all clone fragments within a clone cluster.

Bartoletti and Pompianu [1] studied 811 smart contracts written in Solidity and categorized them by the design patterns that they apply. We use the same categorization applied at the function level. In addition, we add a new category of Helper functions, which broadly includes all functions not categorized elsewhere. Below, we describe the three categories that are most relevant to our study.

Token. The code clones in these patterns are used for the distribution of tokens or fungible goods to users. Token is an abstract concept that can represent anything that is countable and transferable, e.g., shares in a company, outcomes of an event, etc. DigixGold¹³ is an instance of the Token pattern, which tracks the ownership of a fixed amount of gold by using Tokens. A subset of token managers are authorization contracts. The function of the code clones in this category is to evaluate the authorization of the caller. With any transaction, one needs to check whether the parties invoking the transactions are permitted to do so.

¹³ https://digix.global/#/

Oracle. Some smart contracts require data from outside the scope of the blockchain, e.g., to determine the latest posted exchange rate for the US dollar. For this purpose, a collection of special smart contracts, called Oracles, have been created for the Ethereum platform. Oracle smart contracts provide hooks to the outside world, which allows an external service to update the state of the Oracle. This allows Oracles to act as stable interfaces between the outside world and other smart contracts. In practice, a non-Oracle smart contract queries the Oracle smart contract, instead of querying an external service. On the other hand, an external service sends a transaction to the Oracle when an update to the data encapsulated by the Oracle is requested.

Helper functions. This category includes functions that serve as wrappers around existing functions. Smart contract-specific functions, such as initialize and migrate, also belong to this category. The category can be considered as a collection of functions not categorized elsewhere.

6.2.2 Findings
Cloned functionality. Table 2 lists the 20 functions of the code base that contain the most clones. 17 of the 20 functions are Token-related, i.e., they provide functionality for the management and provision of tokens, such as buy, sell, withdraw, refund, etc. These functions have the same intent as the transfer and transferFrom functions. Another group of functions, including createTokens, mint, and deploy, are all variants of a mechanism for increasing the supply of Tokens available for a smart contract.

Table 2: Top 20 most frequently cloned functions.

 #   Function ID                     Clone count   Category
 1   transferFrom                    748           Token
 2   transfer                        610           Token
 3   () payable noAnyReentrancy      512           Token
 4   buyTokens                       398           Token
 5   transfer                        163           Token
 6   buy                             140           Token
 7   Crowdsale                       106           Helper
 8   createTokens                     98           Token
 9   withdraw                         98           Token
10   sell                             97           Token
11   purchase                         81           Token
12   decreaseApproval                 78           Token
13   mint                             72           Token
14   claimTokens                      69           Token
15   finalize                         69           Helper
16   refund                           64           Token
17   deploy                           61           Token
18   tokensOfOwner                    57           Token
19   callback                         54           Oracle
20   investInternal                   46           Token

The second most frequent category is Helpers. Crowdsale is a common Helper function that is used to set the initial conditions for carrying out a crowdsale operation. Crowdsale is a method of flash sale where a number of tokens are allocated to be sold within a time window. The presence of this function is not unexpected, as a large number of

smart contracts are written to conduct Initial Coin Offerings for raising capital for projects. Similarly, finalize is another Helper function, with the purpose of terminating the crowdsale initiated by the previous function.

One of the top 20 functions belongs to the Oracle category, specifically, the callback function. As the name suggests, the purpose of this function is to serve as a callback function to be invoked when an external query is completed. In our example, the most common calls are made to the Oracle smart contract. As explained in Section 6.2.1, an Oracle serves as a doorway between the blockchain and the external world. Therefore, the purpose of queries to Oracles is to access external data and resources.

Activity. Following the original study, we measured activity in terms of the number of transactions that are related to contracts. First, we observe that activity tends to be concentrated on a few contracts. We use the Gini-coefficient [37] as the measure of inequality among transactions related to clone clusters. A Gini-coefficient of 0 indicates no inequality among values, while a value of 1 indicates maximal inequality. To obtain meaningful results, we investigate clusters with at least ten clones. As Table 3 and Figure 4a show, the overall Gini-coefficient of the clone clusters is 0.86. Second, the medians in Figure 4b show that in 50% of the cases, the most active contract was created before 66.7% of contracts in the same clone cluster. (The proportion above the median line in the Overall case is 66.7%.) This number is higher in Type-2c clones, where 50% of affected contracts were created before 90.91% of the rest of the cluster.

Table 3: Gini-coefficients.

Clone type   Gini-coefficient
Type-1       0.86
Type-2c      0.73
Type-3       0.86
Type-3b      0.77
Overall      0.86

Figure 4: Clone clusters (with at least 10 clones) and the activity of the related contracts. (a) Gini-coefficient of clone clusters (with at least 10 clones). (b) Relative rank of the most active contract. [Plots not reproduced; x-axis categories: type-1, type-2c, type-3, type-3b, Overall; y-axes: Gini-coefficient (a) and Rank percentage (b).]

Authorship. Contracts in a clone cluster tend to be created by many authors. We measure this observation by the normalized Shannon-entropy [38] within a cluster. Maximum entropy (1.0) is measured for distributions with elements of uniform probability, i.e., in clone clusters with contracts that have equal transactions. The less uniform the probabilities of elements in a distribution, the lower the entropy. To obtain meaningful results, we once again investigate clusters with at least ten clones. As Figure 5 shows, the median entropy in our sample is 0.7, while the median normalized cluster size is close to zero (0.058). This means the average cluster is relatively small compared to the largest clusters while showing high entropy, i.e., a large variance in the authors. The darker area in the bottom-left corner shows that the majority of clone clusters have high entropy.

Figure 5: Distribution of authorship entropy with the median entropy highlighted. [Plot not reproduced; x-axis: Normalized cluster size; y-axis: Entropy; color scale: Number of contracts.]

6.2.3 Discussion
Out of the functionality that is subject to frequent cloning, token management contracts, including authorization, pose the most pressing issue. A detailed look at the cloned functions reveals that basic transaction functions such as transfer and createTokens are among the most frequently cloned. Providing a library of secure transfer primitives could simplify the development of such functionality. From a language design point of view, declarative and verifiable language constructs have been identified as potential enablers to a more secure design of smart contracts [39]. The benefits of such techniques have been demonstrated in blockchain languages, such as Pact and Liquidity.

The relatively high Gini-coefficients suggest that activity within a cluster tends to focus on a small number of contracts. The overall Gini-coefficient of 0.86 is roughly equivalent to a cluster of ten contracts with nine contracts having one transaction, and one contract having 250 transactions. Vulnerabilities in such frequently used contracts are more likely to be identified by malicious attackers. The effects of such vulnerabilities, in turn, are amplified by code

Table 4: Top 10 cloned functions from OpenZeppelin.

 #   Function                                    Clone count   % of all   Contract
 1   transferFrom public returns (bool)          15,287        28.43      StandardToken
 2   decreaseApproval public returns (bool)      12,021        22.35      StandardToken
 3   transfer                                    11,951        22.22      BasicToken
 4   allowance                                    1,602         2.98      StandardToken
 5   approve                                        974         1.81      StandardToken
 6   burn                                           779         1.45      BurnableToken
 7   transferFrom returns (bool)                    748         1.39      StandardToken
 8   decreaseApproval returns (bool success)        663         1.23      StandardToken
 9   burn                                           612         1.14      BurnableToken
10   TokenVesting                                   540         1.00      TokenVesting
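Table 4 counts corpus functions whose code is identical to a function from OpenZeppelin. Exact matching of this kind can be sketched by comparing hashes of code blocks. The following Python sketch is only an illustration: the hashing procedure of the study is described in Section 5.1.4 and is not reproduced here, so the use of SHA-256, the whitespace-collapsing normalization, and the names `block_hash` and `count_library_clones` are all assumptions.

```python
import hashlib
from collections import Counter

def block_hash(code: str) -> str:
    """Hash a code block after collapsing whitespace (a stand-in for the study's normalization)."""
    normalized = " ".join(code.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def count_library_clones(library_blocks, corpus_blocks):
    """Count, per library function, how many corpus blocks have an identical hash.

    library_blocks: dict mapping a function name to its source text (hypothetical input).
    corpus_blocks: iterable of source texts extracted from the corpus (hypothetical input).
    """
    name_by_hash = {block_hash(src): name for name, src in library_blocks.items()}
    counts = Counter()
    for src in corpus_blocks:
        name = name_by_hash.get(block_hash(src))
        if name is not None:
            counts[name] += 1
    return counts
```

Because only hashes are compared, each corpus block is matched with a single dictionary lookup, regardless of how many library functions are indexed.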

cloning as the same vulnerabilities can be anticipated in the contracts of the same clone cluster.

The high entropy in authorship suggests that cloning is a widespread phenomenon on Ethereum. Such community-wide bad practices are often addressed by guidelines published by community leaders, such as the Python Enhancement Proposal (PEP) 8 style guidelines for Python [40]. However, such general rules cannot be enforced in a computer-automated fashion, and a better solution could be establishing community-specific DevOps processes that include the usage of quality gates enforced by code quality tools that evaluate contracts that are ready to be deployed.

6.3 RQ3: How frequently are code blocks of verified contracts identical to those from OpenZeppelin?
6.3.1 Approach
To answer the research question, we identify the code blocks present in OpenZeppelin releases that are also present in the corpus. We do so by (i) extracting code blocks from OpenZeppelin, (ii) calculating their hashes (as explained in Section 5.1.4), and (iii) comparing those hashes with the hashes calculated for the code blocks of the corpus.

6.3.2 Findings
Table 4 shows the 10 most commonly cloned functions from OpenZeppelin, along with their category, the respective number of clones, and the proportion of these clones in the overall set of OpenZeppelin (OZ) clones.

Clone proportion. Of all verified contracts, 21.79% have functions identical to those of OpenZeppelin. As seen in Table 4, the three most cloned functions encompass 73% of all clones from OpenZeppelin.

Functionality. Most functions have been defined in the StandardToken OpenZeppelin contract. Six of the ten most cloned functions (Table 4) and fourteen of the hundred most cloned functions belong to this category. Other frequently encountered categories are SafeMath (10) and VestedToken (8). The most frequently cloned functionality is related to transfers, accounting for over 50% of cloned functionality.

6.3.3 Discussion
OpenZeppelin serves as a frequent source of code cloning on the Ethereum blockchain platform. The high volume of cloning from OpenZeppelin suggests that mechanisms for reusing functionality from libraries such as OpenZeppelin could reduce the number of clones, and improve the maintainability of the overall code base. This, in turn, could improve the extra-functional properties of Ethereum, such as security, reliability, and integrity. The functionality cloned from OpenZeppelin tends to concentrate on transfer-related functionality, and mostly from the StandardToken contract.

7 COMPARISON WITH THE ORIGINAL STUDY
In this section, we provide an overview of how the findings in Section 6 align with the results of the original study of Kondo et al. [13]. The mapping between the two studies is shown in Table 5. For the sake of compactness, we have presented our results in slightly different groups of findings. Below we give a detailed explanation.

7.1 RQ1
We have observed the most important difference between our study and the original study while analyzing RQ1.

Clone ratio. The overall proportion of clones that we detect (30.13%) is considerably smaller than the proportion observed in the original study (79.2%). This difference is due to three factors. First, our analysis is performed at the function-level, which is a finer granularity and provides a larger sample of code units that are subject to cloning. Second, we count every identified clone once, as explained in Section 5.1.3. Third, due to the normalization, the clones that we identify are mainly exact copies, further reducing the number of instances of less strict clone clusters. However, as a common treatment in near-miss clone detectors, we normalize the text of the function blocks using standard ECMA formatting, as explained in Section 5.1, reducing the number of false-negatives, and consequently, potentially increasing the number of identified clones.

We corroborate the high ratio of Type-1 clones but observe a much larger proportion of Type-1 clones among all clones. Our experiments show 89.7% of all clones are of Type-1, as opposed to the 21.1% (16.7% overall) reported by the original study. This can be explained by removing Type-1 clones from Type-2 and Type-3 clusters.

The substantial difference between our results and the original study shows that while 79.2% of contract files might be affected by cloning practices, it is typically only a subset of the encoded functionality that is actually cloned.

Clone clusters. The original study found that 20% of clusters encompass 68% of clones. These figures are nearly

identical to those that we observe: 20% of all clusters encompass 71.9% of all clones; and half of the clones can be found in just 2.07% of clusters. We conclude that our results at a finer level of granularity corroborate the findings of the original study at a coarser level of granularity.

Table 5: Mapping the findings of the current study to the observations of the original study.

Finding (current study)   Observations (original study)   Comparison
RQ1
  Clone ratio             Observations 1, 4, 5            Refined – different results
  Clone clusters          Observation 2                   Corroborated
  Clone evolution         Observation 3                   Corroborated & refined
RQ2
  Cloned functionality    Observations 6, 7               Corroborated – minor diff
  Activity                Observations 8, 9               Corroborated – minor diff
  Authorship              Observations 10, 11             Corroborated
RQ3
  Clone proportion        Observations 12, 13             Refined – different results
  Functionality           Observation 14                  Refined – different results

Clone evolution. Since the original study also observed an increasing trend, we conclude that our results corroborate the findings of the original study. However, we point out that different types of clones evolve at different paces. Specifically, the amount of newly created Type-1 clones is the predominant factor behind the increasing trend.

7.2 RQ2
We observed minor differences in RQ2 in terms of the cloned functionality (due to the different levels of granularity of the two studies), and the activity of cloned contracts.

Cloned functionality. The original study reports that nine of the ten most populous clusters are related to Token management. Our finer-grained results also show that nine of the top ten clusters are indeed Token management functions. Moreover, 17 of the top 20 are Token management functions. Unlike the original study, we find that the other top clusters were Helper and Oracle functions rather than Token Lockers. The Token Locker category of the original study covers three specific functionalities: lock(), lockOver() and release(). At the finer level of granularity of functions, however, these functionalities prove to be less frequently cloned than at the contract level.

Activity. The original study also observes that transactions tend to be concentrated on a few contracts, and reports an overall Gini-coefficient of 0.817. We enhance the prior observations by adding that Type-2c clones show a lower Gini-coefficient (0.73). We report slightly different figures regarding the relative creation date of the most active contracts. In 50% of cases, the most active contract of a cluster was created before 74.7% of the other contracts in the cluster according to the original study, and before 63.4% according to our finer-grained study. However, this difference is minor.

Authorship. We observe numbers that are almost identical to the ones reported by the original study. This means the authors of cloned smart contract files are the same as the authors of cloned functions. Therefore, identifying developers who are responsible for clones can be achieved either at the function or the contract level with similar results, with potentially different runtime performance.

7.3 RQ3
We observed relevant differences both in the detected clone proportion and the cloned functionality. These differences are due to the finer granularity of our analysis.

Clone proportion. The original study reported that 36.3% of verified contracts have at least one code block identical to an OpenZeppelin code block. Our finer-grained results show that this proportion decreases to 21.79% when analyzing code similarity at the level of functions with at least 10 LOC; increases to 47.21% when analyzing code similarity at the level of functions with at least 5 LOC; and increases to 64.59% when not considering a minimum function length. These proportions are on par with, and in some cases exceed, the 6–50% cloning rate reported from traditional engineering domains [2], [3], suggesting security risks.

Functionality. The original study reported that ERC20 is the most frequently cloned category from OpenZeppelin, and that ERC20 is more frequently cloned than its concrete implementation, StandardToken. However, our finer-grained analysis shows that the StandardToken implementation of ERC20 is more frequently cloned than ERC20. This is not unexpected because ERC20 is an interface and as such, it only defines function signatures but no bodies. While ERC20 might be the most cloned contract block, it is the concrete implementations of ERC20 that contribute the most cloned function blocks.

8 RELATED WORK
In this section, we briefly review the related work.

8.1 Empirical studies on smart contracts
Bartoletti and Pompianu [1] conducted a study to analyze top blockchain platforms and their usages. They analyzed 834 smart contracts written for the Ethereum and Bitcoin technologies, and grouped the contracts by application domain and design patterns that were applied. We use the same design patterns as the basis for assigning commonly cloned functions into different categories.

Durieux et al. [41] studied nine automated analysis tools for Solidity. Automated analysis tools can aid developers in meeting required functional and extra-functional qualitative measures, resulting in better performing, safer and more reliable code. The authors conclude that state-of-the-art analysis tools fall short in detecting numerous classes of vulnerabilities, identifying only 40% of the vulnerabilities in a testing corpus, and produce a large number of false positives. These results are corroborated by Ghaleb et al. [42] who investigated the effectiveness of static analysis tools for Solidity smart contracts using bug injection. These results provide evidence that code clones are hotspots of software issues because they facilitate the spread of faulty code, code smells, and anti-patterns. The ineffectiveness of analysis

Table 6: Studies on clone detection in Solidity contracts.

Authors              Granularity   Method/Tool         Clone%
Liu et al. [26]      free          Birthmarks/EClone   N/A
Gao et al. [25]      free          SmartEmbed¹⁴        90%
Kondo et al. [13]    free          Deckard [15]        79.2%
Our study            fixed         NiCad [14]          30.13%

Table 7: Studies on cloning in conventional software.

Author                    LOC          Lang.      Tools             Clone%
Tuzun et al. [50]         97K          Java       CCFinder          33%
Tsantalis et al. [9]      51K-209K     Java       Deckard, NiCad    14.3-81.9%
Laguë et al. [2]          15M          C          Datrix            6.4%-7.5%
Baxter et al. [35]        400K         C          custom            12.7%-28%
Hasanain et al. [3]       286K         C          NiCad             49%
van Bladel et al. [11]    3.7K-32.4K   Java       NiCad, IClones    23-39%
Our study                 2.7M         Solidity   NiCad             30.13%
tools positions clone detection as a viable technique to aid in combating the vulnerability of the code base on Ethereum.

Extra-functional properties of blockchains have been analyzed by Li et al. [43] on security, by Rouhani [44] on performance, by Scherer et al. [45] on scalability, and by Belchior et al. [46] on interoperability. These studies provide evidence of quality-related challenges in blockchain applications, which are further exacerbated by code cloning.

Application scenarios of blockchain technologies have been discussed in numerous domains, such as finance [47], e-governance [48], and healthcare [49], where blockchains are positioned as a core technology of foundational infrastructure. In such contexts, poor quality code that is prone to defects and vulnerabilities (e.g., code that is copied and pasted into contexts in which it was never intended to operate) can have serious repercussions.

8.2 Clone detection in Solidity smart contracts
Table 6 summarizes the studies conducted on clone detection in smart contracts.

The first known exploration of clone detection in Solidity smart contracts was conducted by Liu et al. [26]. Their clone detection approach relies on a custom semantics-preserving representation of smart contract traits, called birthmarks. Code similarity is then determined by calculating the statistical similarity between pairs of contract birthmarks. Their work focuses on the evaluations of the birthmark method in detecting self-injected clones, rather than the actual rate of clones in smart contracts themselves.

The SmartEmbed tool was developed by Gao et al. [25] for detecting clones in smart contracts. The traits of Solidity smart contracts are encoded by code embedding vectors. Clones are identified based on the pair-wise comparison of these vectors. The authors report a clone ratio of 90%, which is substantially higher than the clone rate of traditional software artifacts. The exact precision and recall measures of the tool, however, are not reported.

In contrast, we use NiCad [14] in our study. NiCad has been frequently used for clone detection tasks in conventional software systems. In addition, NiCad has been thoroughly analyzed by previous studies for optimum configuration in detecting clones [16], and assessed from a qualitative perspective [28].

8.3 Clone detection studies for other languages
Table 7 summarizes a list of studies conducted on clone detection for programming languages other than those for smart contracts. The studies approach clone detection with a fixed granularity at function/method level, and therefore, the reported figures are comparable to those of our work.

Tuzun and Er [50] conducted an empirical analysis of an industrial system consisting of 677 Java files and 97 KLOC. They observed a clone rate of 22% and found that more than half of the files (360 files) have at least one clone. Similarly, Tsantalis et al. [9] conducted a large-scale empirical study using nine open-source projects analyzed by four different clone detection tools (CCFinder, Deckard, CloneDR, and NiCad). They found that the clone rate in test code was between 14.3% and 81.9%, while the clone rate in application code was between 18.1% and 85.7%. In addition, they reported that Type-1 clones can be refactored more easily than other types of clones. This observation aligns well with our finding that about 90% of all clones in Solidity smart contracts are of Type-1, and suggests that the majority of cloning-related code quality issues in Solidity smart contracts can be efficiently refactored.

Laguë et al. [2] conducted a study to reveal the benefits of using clone detection in industrial software development processes. They analyzed a large telecommunication system of 15 MLOC over three years. The results show that the rate of clones stays at a constant level over time, ranging between 6.4% and 7.5%. Similarly, Baxter et al. [35] conducted a clone detection study to identify duplicated code in an industrial system written in C. The size of the code was 400 KLOC. They observed cloning rates of 12.7% clones overall and 28% in three specific subsystems. Their results show that the clone rates may vary for different subsystems.

More recently, Hasanain et al. [3] used NiCad to study clone detection in a large industrial test suite. They found 49% of the code to be duplicated with Type-3 being the prevalent clone type. Van Bladel and Demeyer [11] carried out a similar study on test code in open source projects. They used four different clone detection tools, i.e., NiCad, CPD, IClones, and TCore. They report a clone density between 23% and 29% with Type-2 clones having a higher representation. Test suites have not been analyzed in the context of smart contracts, but could provide further valuable insights.

9 THREATS TO VALIDITY
Construct validity. Our observations may be artifacts of the NiCad configuration settings that we used, rather than meaningful observations about cloning tendencies in Solidity smart contracts. To combat this, we use well-established settings [3], [11], [28] in our experiments, e.g., normalization, setting the granularity threshold to 10 LOC, and using a 0.3 dissimilarity threshold for Type-3 clones.

Internal validity. The manual classification of cloned functions and contracts (see Tables 2 and 4) can result in incorrectly classified data. Furthermore, because of the

broader definition of the categories, there is always room for interpretation when conducting the classification. To address this potential threat, we made the list of categorized functions available to public scrutiny.¹⁰

External validity. Our study has sampled only verified smart contracts deployed on the Ethereum platform, a subset of all smart contracts deployed on the platform. Thus, there are no guarantees on the safe generalization of our findings to all smart contracts written in Solidity. The same reasoning applies to generalizing our findings to other blockchain platforms. However, the goal of these experiments was not to provide a general theory for all Ethereum smart contracts, but to extract initial and high-level insights from existing smart contracts in order to raise awareness about the highly vulnerable state of systems relying on immutable code. We are still reasonably confident that many of our insights translate well to other platforms relying on immutable source code. An external threat to validity w.r.t. the original study we could not mitigate is the number of transactions used in the analysis of RQ2, as explained in Section 6.2.2.

Limitations. Due to the limited parsing support for Solidity (especially compared to that for mainstream languages, such as Java and C++), we have developed a custom parser using the TXL grammar [30]. Since this is the first version of the parser, bugs and other shortcomings are possible. Although we have not experienced such issues during our experiments, we have made the parser available to public scrutiny.¹⁰ Nevertheless, as a sign of maturity, the Solidity parsers and normalizers developed for our experiments have become part of NiCad starting with its v6.2 release.

Unfortunately, the lack of coordination between developers renders such efforts particularly challenging. Thus, we anticipate automated audit mechanisms to appear in the integration (pre-deployment) phase of blockchain DevOps processes [51]. Solutions such as the NiCad-based tool presented in this paper could serve as a machinery to generate refactoring recommendations to reduce the clone ratio in the code to be deployed, thereby improving the overall code quality of the platform. Furthermore, we foresee the emergence of quality control as a service, provided by platform agents in exchange for compensation that is proportional to their computation investment.

Future work should focus on extending the scope of the current study to smart contract programming languages of other platforms, such as Script, the language of Bitcoin. Opportunities in adapting traditional software engineering lifecycle models to the particularities of smart contract development should be considered as well.

REFERENCES
[1] M. Bartoletti and L. Pompianu, “An empirical analysis of smart contracts: platforms, applications, and design patterns,” in International conf. on financial cryptography and data security. Springer, 2017, pp. 494–509.
[2] B. Laguë et al., “Assessing the Benefits of Incorporating Function Clone Detection in a Development Process,” in International Conference on Software Maintenance. IEEE, 1997, pp. 314–321.
[3] W. Hasanain et al., “An analysis of complex industrial test code using clone analysis,” in International Conference on Software Quality, Reliability and Security. IEEE, 2018, pp. 482–489.
[4] C. J. Kapser and M. W. Godfrey, ““Cloning considered harmful” considered harmful: patterns of cloning in software,” Empirical Software Engineering, vol. 13, no. 6, pp. 645–692, 2008.
[5] C. K. Roy et al., “The vision of software clone management: Past, present, and future,” in 2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering. IEEE, 2014, pp. 18–33.

10 CONCLUSION
In this paper, we reported the results of our study on
source code cloning practices on the Ethereum blockchain [6] R. Koschke, “Frontiers of software clone management,” in 2008
Frontiers of Software Maintenance. IEEE, 2008, pp. 119–128.
platform. By analyzing 33,073 Solidity smart contracts, we [7] D. Chatterji et al., “Effects of cloned code on software maintain-
found that 30.13% of the source code is cloned. Our work ability: A replicated developer study,” in Working Conference on
is an extended conceptual replication of the study of Kondo Reverse Engineering. IEEE, 2013, pp. 112–121.
et al. [13] who reported a substantially higher clone ratio [8] E. Jürgens, F. Deissenboeck, and B. Hummel, “Code similarities
beyond copy & paste,” in European Conference on Software Mainte-
of 79.2%. The main difference between the two studies is nance and Reengineering. IEEE, 2010, pp. 78–87.
the level of granularity clones are analyzed at. Our analysis [9] N. Tsantalis, D. Mazinanian, and G. P. Krishnan, “Assessing
was carried out at the level of functions, while the original the refactorability of software clones,” IEEE Trans. Software Eng.,
analysis was carried out at the level of whole source files. vol. 41, no. 11, pp. 1055–1090, 2015.
[10] C. Roy and J. Cordy, “A survey on software clone detection
To achieve this finer granularity of cloning analysis, we research,” Ontario, Canada, Tech. Rep. 2007-541, 2007.
extended the NiCad clone detection tool to support Solid- [11] B. van Bladel and S. Demeyer, “Clone Detection in Test Code:
ity, the programming language of the Ethereum platform. An Empirical Evaluation,” in International Conference on Software
Analysis, Evolution and Reengineering. IEEE, 2020, pp. 492–500.
Our study reports a lower boundary of the clones on the
[12] C. Dannen, Introducing Ethereum and solidity. Springer, 2017.
Blockchain platform. This lower boundary is still on par [13] M. Kondo et al., “Code cloning in smart contracts: a case study
with the 6–50% rate of cloning reported from traditional on verified contracts from the Ethereum blockchain platform,”
software engineering domains [2], [3], suggesting potential Empirical Software Engineering, vol. 25, no. 6, pp. 4617–4675, 2020.
[14] C. K. Roy and J. R. Cordy, “NICAD: accurate detection of near-
risks of reduced security, reliability, and performance of the miss intentional clones using flexible pretty-printing and code
overall software system. normalization,” in Int. Conf. on Program Comprehension. IEEE,
An important takeaway of our study is that these prob- 2008, pp. 172–181.
lems could be effectively addressed by refactoring. The [15] L. Jiang et al., “DECKARD: scalable and accurate tree-based
detection of code clones,” in International Conference on Software
majority, about 90% of clones are of Type-1, i.e., exact Engineering. IEEE, 2007, pp. 96–105.
replicas, and such clones have been shown to be easier to [16] C. K. Roy, J. R. Cordy, and R. Koschke, “Comparison and evalu-
refactor [9]. Moreover, as shown by our cluster analysis, ation of code clone detection techniques and tools: A qualitative
approach,” Sci. Comput. Program., vol. 74, no. 7, pp. 470–495, 2009.
cloned functions tend to form hotspots in the source code:
[17] R. K. Saha et al., “Understanding the evolution of type-3 clones: an
half of the clones can be found in just about 2% of clusters. exploratory study,” in Proceedings of the 10th Working Conference on
Such clusters should be the prime candidates for refactoring. Mining Software Repositories. IEEE, 2013, pp. 139–148.
[18] W. O. A. Hasanain, “Analysis and maintainability of complex industry test code using clone detection,” Ph.D. dissertation, Carleton University, 2020.
[19] Y. Ueda, T. Kamiya, S. Kusumoto, and K. Inoue, “Gemini: Maintenance Support Environment Based on Code Clone Analysis,” in International Software Metrics Symposium. IEEE, 2002, pp. 67–76.
[20] S. Bellon, R. Koschke, G. Antoniol, J. Krinke, and E. Merlo, “Comparison and Evaluation of Clone Detection Tools,” IEEE Trans. Softw. Eng., vol. 33, no. 9, pp. 577–591, 2007.
[21] A. Kumar et al., “A systematic review of semantic clone detection techniques in software systems,” IOP Conference Series: Materials Science and Engineering, vol. 1022, p. 11, 2021.
[22] M. Swan, Blockchain: Blueprint for a new economy. O’Reilly, 2015.
[23] J. C. Carver, “Towards reporting guidelines for experimental replications: A proposal,” in 1st International Workshop on Replication in Empirical Software Engineering Research, vol. 1, 2010, pp. 1–4.
[24] A. R. Dennis and J. S. Valacich, “A replication manifesto,” AIS Transactions on Replication Research, vol. 1, no. 1, p. 1, 2015.
[25] Z. Gao et al., “SmartEmbed: A Tool for Clone and Bug Detection in Smart Contracts through Structural Code Embedding,” in Int. Conference on Software Maintenance and Evolution. IEEE, 2019, pp. 394–397.
[26] H. Liu et al., “Enabling clone detection for ethereum via smart contract birthmarks,” in Proceedings of the 27th International Conference on Program Comprehension. IEEE / ACM, 2019, pp. 105–115.
[27] M. I. Mehar et al., “Understanding a Revolutionary and Flawed Grand Experiment in Blockchain: The DAO Attack,” J. Cases Inf. Technol., vol. 21, no. 1, pp. 19–32, 2019.
[28] C. K. Roy and J. R. Cordy, “Towards a mutation-based automatic framework for evaluating code clone detection tools,” in Canadian Conf. on Comp. Science & Software Eng., ser. ACM International Conference Proceeding Series, vol. 290. ACM, 2008, pp. 137–140.
[29] T. Wang et al., “Searching for better configurations: a rigorous approach to clone evaluation,” in European Software Engineering Conference. ACM, 2013, pp. 455–465.
[30] J. R. Cordy, C. D. Halpern-Hamu, and E. Promislow, “TXL: A rapid prototyping system for programming language dialects,” Comput. Lang., vol. 16, no. 1, pp. 97–107, 1991.
[31] M. Beller, A. Zaidman, and A. N. Karpov, “The last line effect,” in Proceedings of the 2015 IEEE 23rd International Conference on Program Comprehension. IEEE, 2015, pp. 240–243.
[32] R. van Tonder and C. L. Goues, “Defending against the attack of the micro-clones,” in 24th IEEE International Conference on Program Comprehension. IEEE, 2016, pp. 1–4.
[33] M. Mondal, C. K. Roy, and K. A. Schneider, “Micro-clones in evolving software,” in 25th International Conference on Software Analysis, Evolution and Reengineering. IEEE, 2018, pp. 50–60.
[34] D. S. Cruzes and T. Dybå, “Research synthesis in software engineering: A tertiary study,” Information and Software Technology, vol. 53, no. 5, pp. 440–455, 2011.
[35] I. D. Baxter et al., “Clone detection using abstract syntax trees,” in Int. Conference on Software Maintenance. IEEE, 1998, pp. 368–377.
[36] S. Palladino, “The parity wallet hack explained,” OpenZeppelin, 2017.
[37] R. Dorfman, “A formula for the gini coefficient,” The review of economics and statistics, pp. 146–149, 1979.
[38] C. E. Shannon, “A mathematical theory of communication,” The Bell system technical journal, vol. 27, no. 3, pp. 379–423, 1948.
[39] R. M. Parizi, Amritraj, and A. Dehghantanha, “Smart contract programming languages on blockchains: An empirical evaluation of usability and security,” in Blockchain - ICBC 2018 - First International Conference, ser. Lecture Notes in Computer Science, vol. 10974. Springer, 2018, pp. 75–91.
[40] G. Van Rossum, B. Warsaw, and N. Coghlan, “Pep 8: style guide for python code,” Python. org, vol. 1565, 2001.
[41] T. Durieux, J. F. Ferreira, R. Abreu, and P. Cruz, “Empirical review of automated analysis tools on 47,587 Ethereum smart contracts,” in Int. Conference on Software Engineering. ACM, 2020, pp. 530–541.
[42] A. Ghaleb et al., “How effective are smart contract analysis tools? Evaluating smart contract static analysis tools using bug injection,” in Int. Symp. on Software Testing and Analysis. ACM, 2020, pp. 415–427.
[43] X. Li et al., “A survey on the security of blockchain systems,” Future Gener. Comput. Syst., vol. 107, pp. 841–853, 2020.
[44] S. Rouhani and R. Deters, “Security, performance, and applications of smart contracts: A systematic survey,” IEEE Access, vol. 7, p. 20, 2019.
[45] M. Scherer, “Performance and scalability of blockchain networks and smart contracts,” Master’s thesis, Umea Uni., Sweden, 2017.
[46] R. Belchior, A. Vasconcelos, S. Guerreiro, and M. Correia, “A survey on blockchain interoperability: Past, present, and future trends,” ACM Comput. Surv., vol. 54, no. 8, pp. 168:1–168:41, 2022.
[47] P. Treleaven, R. G. Brown, and D. Yang, “Blockchain technology in finance,” Computer, vol. 50, no. 9, pp. 14–17, 2017.
[48] C. Alexopoulos et al., “Benefits and Obstacles of Blockchain Applications in e-Government,” in Hawaii International Conference on System Sciences. ScholarSpace, 2019, pp. 1–10.
[49] C. C. Agbo et al., “Blockchain technology in healthcare: A systematic review,” Healthcare, no. 2, 2019.
[50] E. Tüzün and E. Er, “A case study on applying clone technology to an industrial application framework,” 2012 6th International Workshop on Software Clones, IWSC 2012 - Proceedings, 06 2012.
[51] M. Wöhrer and U. Zdun, “Devops for ethereum blockchain smart contracts,” in 2021 IEEE Intl. Conference on Blockchain, Blockchain 2021, Melbourne, Australia, 2021. IEEE, 2021, pp. 244–251.

Faizan Khan is a software engineer at Plotly working on data-visualization libraries. He completed his Masters at the Department of Electrical and Computer Engineering at McGill University. His research interests include programming languages and program synthesis.

Istvan David is a postdoctoral researcher at the University of Montreal, Canada. He received his PhD in Computer Science from the University of Antwerp, Belgium. His research interests include model-driven engineering of complex heterogeneous systems, and software quality improvement through automation. He is active outside of academia as well, especially in innovation consulting. Contact: https://istvandavid.com.

Daniel Varro is a full professor at McGill University. He serves on the editorial boards of the Software and Systems Modeling and Journal of Object Technology periodicals, and served as a program co-chair of the MODELS 2021, SLE 2016, ICMT 2014, and FASE 2013 conferences. He is a co-founder of the VIATRA open source software framework as well as IncQuery Labs, a technology-intensive company.

Shane McIntosh is an associate professor at the University of Waterloo. Previously, he was an assistant professor at McGill University. He received his Ph.D. from Queen’s University. In his research, Shane uses empirical methods to study software build systems, release engineering, and software quality: http://shanemcintosh.org/.
