Architectural Iot - Mod5.submit
8.1 Introduction
Several devices communicate among themselves in an IoT network. Consider
an example: your neighbors get access to all your personal details, such as
eating habits, diseases, date of birth, and income. This may pose a threat to
your bank account, credit card, and health card. Nobody likes their private
data to be known to everyone. Therefore, things data needs to be anonymized
before it is stored in a hospital, bank, or government organization. This
helps prevent unauthorized manipulation even if illegitimate persons get hold
of it. Consider another example: in a smart home application, the user
remotely turns a smart air conditioner, smart refrigerator, smart bulb, or
smart door on or off. Imagine that intruders intercept the communication
between the devices and gain access to them. Nobody wants intruders to control
these devices and trespass through the building. These two examples motivate
us to study the privacy and security of things data. There are several other
IoT applications, such as smart grid, smart manufacturing, smart
transportation, and smart parking, where things data must be anonymized and
secured. In particular, the chapter discusses data privacy, elliptic curve
cryptography (ECC), and blockchain technology.
TABLE 8.1: Dataset for k-anonymity. Entries in normal and italic fonts of
two blocks are for distinction purpose only.

              Non-Sensitive   Non-Sensitive   Non-Sensitive   Sensitive
Record No.    Age             Area Code       Country         Disease
1             27              800101          ∗               Stomach
2             32              800120          ∗               Flu
3             65              800105          ∗               Heart
4             43              800283          ∗               Heart
5             19              800140          ∗               Stomach
6             52              800787          ∗               Heart
There are several techniques for data anonymization. Three common ones are
k-anonymity, l-diversity, and t-closeness. In k-anonymity, each user is
indistinguishable from at least k − 1 other users in the same block. In
l-diversity, each block contains at least l distinct values of the sensitive
attribute. A simple approach to anonymization is to suppress the name of the
user. However, a malicious user can still extract the user’s sensitive
attributes by using non-sensitive attributes, such as age, gender, or the pin
code of the user’s area, together with prior knowledge. For example, in a
linking attack, the malicious user combines published data, such as a medical
database that contains sensitive attributes, with external data, such as voter
registration details, to determine the disease of a user. k-anonymity is
designed to resist this attack. Note that a row of a table is called a record
or tuple, while a column is called an attribute.
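The two definitions above can be checked mechanically. The following is a minimal sketch in Python that uses the records of Table 8.1. The generalization rule (age bucketed at 40, last three digits of the area code masked) and all function names are assumptions made for illustration; they are not taken from the chapter.

```python
from collections import defaultdict

# Rows of Table 8.1. "area" is the area code; the country is already suppressed.
RAW_RECORDS = [
    {"age": 27, "area": "800101", "country": "*", "disease": "Stomach"},
    {"age": 32, "area": "800120", "country": "*", "disease": "Flu"},
    {"age": 65, "area": "800105", "country": "*", "disease": "Heart"},
    {"age": 43, "area": "800283", "country": "*", "disease": "Heart"},
    {"age": 19, "area": "800140", "country": "*", "disease": "Stomach"},
    {"age": 52, "area": "800787", "country": "*", "disease": "Heart"},
]
QUASI_IDS = ("age", "area", "country")  # the non-sensitive attributes


def generalize(record):
    """Hypothetical rule: bucket the age and mask the last three area-code digits."""
    return {
        "age": "<40" if record["age"] < 40 else ">=40",
        "area": record["area"][:3] + "***",
        "country": record["country"],
        "disease": record["disease"],
    }


def blocks_of(records):
    """Group the sensitive values by identical non-sensitive attribute values."""
    blocks = defaultdict(list)
    for record in records:
        blocks[tuple(record[q] for q in QUASI_IDS)].append(record["disease"])
    return blocks


def is_k_anonymous(records, k):
    """Every block must contain at least k records."""
    return all(len(diseases) >= k for diseases in blocks_of(records).values())


def is_l_diverse(records, l):
    """Every block must contain at least l distinct sensitive values."""
    return all(len(set(diseases)) >= l for diseases in blocks_of(records).values())


ANONYMIZED = [generalize(r) for r in RAW_RECORDS]
print(is_k_anonymous(ANONYMIZED, 3))  # each block holds three records
print(is_l_diverse(ANONYMIZED, 2))    # the ">=40" block contains only "Heart"
```

Under this assumed generalization the data is 3-anonymous but not 2-diverse, because one block carries a single disease value.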
from heart disease because the sensitive attribute of heart disease is common
for all three records. Hence, we need a diversity of sensitive attributes in each
group.
TABLE 8.3: Dataset for l-diversity. Entries in normal and italic fonts of two
blocks are for distinction purpose only.

              Non-Sensitive   Non-Sensitive   Non-Sensitive   Sensitive
Record No.    Age             Area Code       Country         Disease
1             27              800101          ∗               Stomach
2             32              800120          ∗               Flu
3             65              800105          ∗               Heart
4             43              800183          ∗               Heart
5             50              800140          ∗               Stomach
6             52              800787          ∗               Flu
In an attack with prior knowledge, suppose the malicious user has a best
friend whose age is less than 30 years. Since they are friends, the malicious
user knows the friend's area code as well. Now, assume that the malicious
user has prior knowledge that the friend eats sufficient fiber-rich food.
This means that the friend may not be suffering from the flu and may instead
have a stomach-related disease.
$$y^2 = x^3 + ax + b \tag{8.1}$$
• If a line intersects two points A and B on the curve, then it also intersects
a third point, that is, −C on the curve.
• If a line is a tangent to the curve at point A, then it also intersects
another point −C on the curve.
• The vertical line intersects the curve at infinity.
There are two operations, called dot operations: point addition and point
doubling. Let us discuss them.
Point Addition: Let A and B be $(x_A, y_A)$ and $(x_B, y_B)$, respectively. The
straight line through points A and B intersects the curve at a third point −C.
The reflection of point −C about the x axis gives the point C, which is the sum
of points A and B, as shown in Figure 8.2. Let
$$\theta_a = \frac{y_B - y_A}{x_B - x_A} \tag{8.2}$$
be the slope of the straight line connecting A and B. Now, make it a general
straight line by replacing $(x_B, y_B)$ with $(x, y)$. That is,
$$\theta_a (x - x_A) = (y - y_A) \tag{8.3}$$
$$y = \theta_a x + y_A - \theta_a x_A \tag{8.4}$$
Privacy and Security of Things Data 213
$$x^3 + ax + b = (\theta_a x + y_A - \theta_a x_A)^2 \tag{8.5}$$
$$-y_C = y_A + \theta_a (x_C - x_A) \tag{8.9}$$
$$\theta_d = \frac{dy}{dx} = \frac{3x_A^2 + a}{2y_A} \tag{8.12}$$
at point A. We may consider A = B for point doubling. From the final expression
of $x_C$ of point addition, we can write
$$x_C = \theta_d^2 - 2x_A, \qquad y_C = -y_A + \theta_d (x_A - x_C) \tag{8.13}$$
• If we add the point A to −A, then the straight line becomes vertical.
It is assumed that the vertical line intersects the curve at infinity, and
the reflection of that point also remains at infinity. That is to say,
A + (−A) = O.
• Addition of a point A and the point at infinity O gives the point itself.
That is, A + O = A.
In the last case, there was no other option other than going to infinity
for the third intersection. However, in this case, we have the possibility
of a third intersection on the actual curve. So, there is no need to go to
infinity to find the third intersection with the curve.
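The addition and doubling rules above can be tried numerically. The following is a minimal Python sketch over the rational numbers. The curve $y^2 = x^3 - 7x + 10$ and the points (1, 2) and (3, 4) are assumptions chosen only because they give exact values; the function names are hypothetical. The x coordinate of the sum uses $x_C = \theta_a^2 - x_A - x_B$, the same expression that the worked example of this chapter applies.

```python
from fractions import Fraction

# Assumed example curve y^2 = x^3 + a*x + b with a = -7 and b = 10.
A_COEF = Fraction(-7)
B_COEF = Fraction(10)
INFINITY = None  # stands for the point O at infinity


def on_curve(point):
    """Check that a point satisfies the curve equation (O counts as on the curve)."""
    if point is INFINITY:
        return True
    x, y = point
    return y * y == x ** 3 + A_COEF * x + B_COEF


def add_points(pa, pb):
    """Return pa + pb using the chord (addition) or tangent (doubling) slope."""
    if pa is INFINITY:              # O + B = B
        return pb
    if pb is INFINITY:              # A + O = A
        return pa
    xa, ya = pa
    xb, yb = pb
    if xa == xb and ya == -yb:      # A + (-A) = O: the line is vertical
        return INFINITY
    if pa == pb:                    # point doubling: slope of the tangent at A
        theta = (3 * xa * xa + A_COEF) / (2 * ya)
    else:                           # point addition: slope of the chord AB
        theta = (yb - ya) / (xb - xa)
    xc = theta * theta - xa - xb
    yc = -ya + theta * (xa - xc)    # reflection of the third intersection
    return (xc, yc)


point_a = (Fraction(1), Fraction(2))
point_b = (Fraction(3), Fraction(4))
sum_ab = add_points(point_a, point_b)     # (-3, 2)
double_a = add_points(point_a, point_a)   # (-1, -4)
print(sum_ab, on_curve(sum_ab))
print(double_a, on_curve(double_a))
```

Both results lie on the assumed curve again, which is the property that makes repeated additions ("jumps") possible.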
curve. Here, start point (g) and end point (q) on an elliptic curve act as the
public keys. The number of jumps to go from g to q is a private key.
2A, (c) Add 4A and then double 4A, and (d) Add 8A. Therefore, we can do
it in only 4 steps. Both approaches produce the same result. Doubling
is easy, and the order of addition does not matter. We can trick the enemy by
doubling quickly on an elliptic curve rather than adding one by one. The
second approach requires only 4 steps and hence is faster than the first
approach, which requires 13 steps. Note that $n_r = 13$ is a very small number
herein. We consider it for visualization purposes only.
The computation takes on the order of $O(\log_2 n_r)$ steps. Even if
$n_r = 2^{200}$, we can do the computation in only about 200 very fast
double-and-add steps. If $n_r$ is a very large number, then it becomes
difficult for the enemy to determine the number of jumps. This is called the
elliptic curve discrete logarithm problem. For a secure message exchange, the
sender and receiver must agree upon the parameters of an elliptic curve, that
is, a, b, and a large prime number $p_r$. They then pick a generator or base
point g on the elliptic curve.
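The double-and-add idea can be sketched independently of the curve arithmetic. In the following Python sketch, the group operation is passed in as a function, and ordinary integer addition is used as a stand-in for point addition; this stand-in and the function name are assumptions for illustration only. One round is counted per binary digit of $n_r$, which matches the way the 4 steps are counted for $n_r = 13$.

```python
def double_and_add(n, point, add, identity):
    """Compute n * point with one round per binary digit of n.

    Returns (result, rounds). `add` is the group operation and `identity`
    its neutral element (the point at infinity O for an elliptic curve).
    """
    result = identity
    addend = point              # holds point, 2*point, 4*point, ...
    rounds = 0
    while n > 0:
        if n & 1:               # this binary digit is 1: add the current multiple
            result = add(result, addend)
        n >>= 1
        if n > 0:               # more digits remain: double for the next round
            addend = add(addend, addend)
        rounds += 1
    return result, rounds


def integer_add(a, b):
    """Stand-in group operation (assumption): plain integer addition."""
    return a + b


# 13 = (1101)_2 = 8 + 4 + 1, so 13 * 5 is assembled from 5, 20 and 40.
result_13, rounds_13 = double_and_add(13, 5, integer_add, 0)
print(result_13, rounds_13)     # 65 4

# A very large number of jumps still needs only a few hundred rounds.
big_n = 2 ** 200 - 1
result_big, rounds_big = double_and_add(big_n, 5, integer_add, 0)
print(rounds_big)               # 200
```

With an elliptic curve, `integer_add` would be replaced by point addition and `0` by the point at infinity; the number of rounds stays the same.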
Notably, we can get $s_{pub}$ (or $r_{pub}$) by adding g to itself $s_{priv}$
(or $r_{priv}$) times. Now, in key exchange, the sender (or receiver) adds
$r_{priv}\, g$ (or $s_{priv}\, g$) to itself $s_{priv}$ (or $r_{priv}$) times.
We can trick the enemy by using the double-and-add method with large
$s_{priv}$ and $r_{priv}$.
Hence,
$$C_M = \{ s_{priv}\, g,\ \mathrm{Message} + s_{priv}\, r_{pub} \} \tag{8.22}$$
which is again the x and y coordinates on the elliptic curve.
In decryption, the receiver subtracts the product of the receiver’s private key
and the first coordinate of $C_M$ from the second coordinate of $C_M$. That is,
$$\mathrm{Message} + s_{priv}\, r_{pub} - (r_{priv}\, s_{priv}\, g) = \mathrm{Message} + s_{priv}\, r_{priv}\, g - (r_{priv}\, s_{priv}\, g) \tag{8.23}$$
where $r_{pub} = r_{priv}\, g$. Hence, we can simplify to
The following example utilizes all the concepts like point adding, doubling,
and negation operations in detail.
Example of ECC Encryption and Decryption: Consider the elliptic
curve of the last example with generator point $g \equiv (x_g, y_g) = (3, 3)$ and
encoded message on the curve, $\mathrm{Message} \equiv (x_M, y_M) = (3, 4)$. Assume
the private keys for sender and receiver are 3 and 2, respectively. Carry out
the ECC encryption and decryption.
Solution: Here, $s_{priv} = 3$ and $r_{priv} = 2$. Let us compute the public key of
the sender. $s_{pub} = s_{priv}\, g = 3 \times (3, 3) = (3)_{10}\, g = (11)_2\, g = (2^1 + 2^0) g = 2g + g$.
Hence, (a) Add g and then double g (b) Add 2g.
Now, we double g as $2g = (x_C, y_C)$ say.
$\theta_d = \frac{3x_g^2 + a}{2y_g} \bmod p_r = \frac{3 \times 3^2 + 1}{2 \times 3} \bmod 7 = \frac{14}{3} \bmod 7$.
That is, $3\theta_d = 14 \bmod 7 = 0$. For $\theta_d = 7$, the
condition is satisfied as shown in Table 8.5. Therefore,
$x_C = (\theta_d^2 - 2x_g) \bmod p_r = (7^2 - 2 \times 3) \bmod 7 = 1$. Now,
$y_C = (-y_g + \theta_d (x_g - x_C)) \bmod p_r = (-3 + 7(3 - 1)) \bmod 7 = 4$.
Hence, $2g = (1, 4) = (x_C, y_C)$ (say).
Now, we add g and 2g, and denote the sum by $(x_D, y_D)$. Here,
$\theta_a = \frac{y_g - y_C}{x_g - x_C} \bmod p_r = \frac{3 - 4}{3 - 1} \bmod 7 = \frac{-1}{2} \bmod 7$.
That is, $2\theta_a = (-1 + 7) \bmod 7 = 6$. Therefore,
$\theta_a = 3$ as shown in Table 8.6. Let us add.
$x_D = (\theta_a^2 - x_g - x_C) \bmod p_r = (3^2 - 3 - 1) \bmod 7 = 5$ and
$y_D = (-y_g + \theta_a (x_g - x_D)) \bmod p_r = (-3 + 3(3 - 5)) \bmod 7 = -9 \bmod 7 = (-9 + 7 \times 2) \bmod 7 = 5$.
Hence, $g + 2g = (5, 5) \equiv (x_D, y_D)$.
We determine the public key of the receiver as: $r_{pub} = r_{priv}\, g = 2 \times (3, 3) = (1, 4) \equiv (x_E, y_E)$,
as computed previously. This was the first step (the doubling of g) in the
computation of $s_{pub}$. The secret key after key exchange for the sender is
computed as follows: $s_{key} = s_{priv}\, r_{pub} = 3 \times (1, 4) = (3)_{10} \times r_{pub} = (11)_2 \times r_{pub} = (2^1 + 2^0) \times r_{pub} = 2r_{pub} + r_{pub}$.
We need to carry out double and add operations for this as follows:
• Double of $r_{pub}$, denoted $(x_F, y_F)$: New
  $\theta_{d,F} = \frac{3x_E^2 + a}{2y_E} \bmod p_r = \frac{3 \times 1^2 + 1}{2 \times 4} \bmod 7 = \frac{1}{2} \bmod 7$.
  That is, $2\theta_{d,F} = 1 \bmod 7 = 1$. For $\theta_{d,F} = 4$, the condition
  is satisfied as shown in Table 8.7. Therefore,
  $x_F = (\theta_{d,F}^2 - 2x_E) \bmod p_r = (4^2 - 2 \times 1) \bmod 7 = 0$. Now,
  $y_F = (-y_E + \theta_{d,F} (x_E - x_F)) \bmod p_r = (-4 + 4(1 - 0)) \bmod 7 = 0$.
  Hence, $2r_{pub} = (0, 0) = (x_F, y_F)$.
• Addition: $(x_E, y_E) \equiv (1, 4)$ plus $(x_F, y_F) \equiv (0, 0)$ gives $(x_H, y_H)$.
  $\theta_{a,H} = \frac{y_E - y_F}{x_E - x_F} \bmod p_r = \frac{4 - 0}{1 - 0} \bmod 7 = 4 \bmod 7 = 4$.
  $x_H = (\theta_{a,H}^2 - x_E - x_F) \bmod p_r = (4^2 - 1 - 0) \bmod 7 = 1$ and
  $y_H = (-y_E + \theta_{a,H} (x_E - x_H)) \bmod p_r = (-4 + 4(1 - 1)) \bmod 7 = -4 \bmod 7 = (-4 + 7) \bmod 7 = 3$.
  Hence, $(x_H, y_H) \equiv (1, 3)$.
The secret key after key exchange for the receiver is computed as follows:
$r_{key} = r_{priv}\, s_{pub} = 2 \times (5, 5) = 2 \times (x_D, y_D) \equiv (x_I, y_I)$.
We only need a doubling operation herein.
• Double of $s_{pub}$, denoted $(x_I, y_I)$: New
  $\theta_{d,I} = \frac{3x_D^2 + a}{2y_D} \bmod p_r = \frac{3 \times 5^2 + 1}{2 \times 5} \bmod 7 = \frac{38}{5} \bmod 7$.
  That is, $5\theta_{d,I} = 38 \bmod 7 = 3$. For $\theta_{d,I} = 2$, the condition
  is satisfied as shown in Table 8.8. Therefore,
  $x_I = (\theta_{d,I}^2 - 2x_D) \bmod p_r = (2^2 - 2 \times 5) \bmod 7 = -6 \bmod 7 = (-6 + 7) \bmod 7 = 1$.
  Now, $y_I = (-y_D + \theta_{d,I} (x_D - x_I)) \bmod p_r = (-5 + 2(5 - 1)) \bmod 7 = 3$.
  Hence, $2s_{pub} = (1, 3) = (x_I, y_I)$.
Remarks: we can also observe numerically that $s_{key}$ and $r_{key}$ are indeed
the same, as expected.
Let us now do the encryption. $C_M = \{s_{priv}\, g,\ \mathrm{Message} + s_{key}\}$, where
$s_{key} = r_{key}$. Therefore, $C_M = \{s_{priv}\, g,\ \mathrm{Message} + s_{priv}\, r_{pub}\} = \{3 \times (3, 3),\ (3, 4) + 3 \times (1, 4)\}$.
We have already calculated $3 \times (1, 4) = (1, 3)$ and $3 \times (3, 3) = (5, 5)$.
So, $C_M = \{(5, 5),\ (3, 4) + (1, 3)\}$. Here, the addition of $(x_J, y_J) = (3, 4)$
and $(x_K, y_K) = (1, 3)$, denoted $(x_L, y_L)$, is needed.
• $\theta_{a,L} = \frac{y_J - y_K}{x_J - x_K} \bmod p_r = \frac{4 - 3}{3 - 1} \bmod 7 = \frac{1}{2} \bmod 7$.
  This gives $2\theta_{a,L} = 1 \bmod 7 = 1$. Therefore, $\theta_{a,L} = 4$ as shown
  in Table 8.9. Therefore,
220 Fundamentals of Internet of Things
  $x_L = (\theta_{a,L}^2 - x_J - x_K) \bmod p_r = (4^2 - 3 - 1) \bmod 7 = 5$ and
  $y_L = (-y_J + \theta_{a,L} (x_J - x_L)) \bmod p_r = (-4 + 4(3 - 5)) \bmod 7 = -12 \bmod 7 = (-12 + 7 \times 2) \bmod 7 = 2$.
  Hence, $(x_L, y_L) \equiv (5, 2)$.
Hence, the sender sends the ciphertext $C_M = \{(5, 5), (5, 2)\}$ to the receiver.
In order to decrypt the message, we follow the steps below:
We know that $\mathrm{Message} = (\mathrm{Message} + s_{priv}\, r_{pub}) - r_{priv}\, s_{priv}\, g = (5, 2) - 2 \times (5, 5) = (5, 2) - (1, 3)$,
where we use the already computed $2 \times (5, 5) = (1, 3)$. Now,
we subtract (1, 3) from (5, 2). Using the expression of
negation, we can write: $(5, 2) + (1, -3) = (5, 2) + (1, -3 \bmod 7) = (5, 2) + (1, (-3 + 7) \bmod 7) = (5, 2) + (1, 4)$.
We now need to add these points. Denote $(x_O, y_O) = (5, 2)$
and $(x_S, y_S) = (1, 4)$. The sum is denoted by $(x_T, y_T)$.
• $\theta_{a,T} = \frac{y_O - y_S}{x_O - x_S} \bmod p_r = \frac{2 - 4}{5 - 1} \bmod 7 = \frac{-1}{2} \bmod 7$.
  That is, $2\theta_{a,T} = (-1 + 7) \bmod 7 = 6$. Therefore, $\theta_{a,T} = 3$ as
  shown in Table 8.10. Now,
  $x_T = (\theta_{a,T}^2 - x_O - x_S) \bmod p_r = (3^2 - 5 - 1) \bmod 7 = 3$ and
  $y_T = (-y_O + \theta_{a,T} (x_O - x_T)) \bmod p_r = (-2 + 3(5 - 3)) \bmod 7 = 4$.
Hence, Message ≡ (3, 4). We get the decrypted message, which is the original
encoded message.
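The whole example can be replayed in a few lines of Python as a cross-check. This is a sketch under stated assumptions: the example uses a = 1 and $p_r$ = 7, while b = 0 is inferred here from the fact that the points (3, 3), (1, 4), and (5, 5) satisfy the curve equation modulo 7; the function names are hypothetical. The modular inverse `pow(v, -1, P_R)` requires Python 3.8 or later and replaces the lookup in Tables 8.5 to 8.10.

```python
P_R = 7          # prime modulus p_r of the example
A_COEF = 1       # curve parameter a of the example
B_COEF = 0       # assumption: inferred from the points used in the example
INFINITY = None  # the point O at infinity


def on_curve(pt):
    """Check y^2 = x^3 + a*x + b modulo p_r (O counts as on the curve)."""
    return pt is INFINITY or (pt[1] ** 2 - pt[0] ** 3 - A_COEF * pt[0] - B_COEF) % P_R == 0


def ec_add(p1, p2):
    """Point addition and doubling modulo p_r."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_R == 0:      # A + (-A) = O
        return INFINITY
    if p1 == p2:                               # doubling slope
        theta = (3 * x1 * x1 + A_COEF) * pow((2 * y1) % P_R, -1, P_R)
    else:                                      # addition slope
        theta = (y2 - y1) * pow((x2 - x1) % P_R, -1, P_R)
    x3 = (theta * theta - x1 - x2) % P_R
    y3 = (-y1 + theta * (x1 - x3)) % P_R
    return (x3, y3)


def ec_neg(pt):
    """Negation: reflect the point about the x axis."""
    return INFINITY if pt is INFINITY else (pt[0], (-pt[1]) % P_R)


def ec_mul(k, pt):
    """Scalar multiplication k * pt by double and add."""
    result = INFINITY
    while k > 0:
        if k & 1:
            result = ec_add(result, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return result


g, message = (3, 3), (3, 4)
s_priv, r_priv = 3, 2
s_pub = ec_mul(s_priv, g)          # (5, 5)
r_pub = ec_mul(r_priv, g)          # (1, 4)
s_key = ec_mul(s_priv, r_pub)      # (1, 3)
r_key = ec_mul(r_priv, s_pub)      # (1, 3)
cipher = (s_pub, ec_add(message, s_key))                             # ((5, 5), (5, 2))
decrypted = ec_add(cipher[1], ec_neg(ec_mul(r_priv, cipher[0])))     # (3, 4)
print(s_pub, r_pub, s_key, r_key, cipher, decrypted)
```

The printed values agree with the hand computation above: both parties obtain the shared key (1, 3), and decryption returns the encoded message (3, 4).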
8.4 Blockchain
Blockchain was introduced in 2008 by Satoshi Nakamoto, a person or a group
of people using that alias, who developed bitcoin. He/they authored the
bitcoin white paper titled ‘Bitcoin: A Peer-to-Peer Electronic Cash System’.
Let us understand the motivation behind blockchain technology. As we know,
conventional bank services have several drawbacks. Consider the example of
sending money from one country to another. There is a tedious process for
opening a bank account, a high transaction fee, vulnerability to online
scams, unavailability of bank service
8.4.1 Bitcoin
Bitcoin is a digital currency that is used for electronic payment from
a sender to a receiver. Blockchain is the underlying technology that makes
this possible. There are about 1600 digital currencies. Bitcoin can be
considered an application that uses blockchain technology. Blockchain
technology can be used in a multitude of use cases.
Blockchain is a distributed digital ledger of immutable records. Immutable
means that the data cannot be altered once it is recorded. A blockchain is
composed of cryptographically linked blocks, like a linked-list data
structure in which the hash acts as the pointer. That is, there is a chain
of blocks in chronological order. This is illustrated in Figure 8.5.
8.4.2 Hash
A hash function converts input data of any size into a fixed-size string of
alphanumeric characters. The hash represents each block uniquely. The first
block has the previous hash “0000....” as there is no preceding block to it.
The “previous hash” of the current block is the “Hash” of the previous block.
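The linking of blocks through the previous hash can be sketched as follows. This is a minimal Python illustration, not the actual bitcoin block format: the block fields, the way they are joined into a string, and the function names are assumptions; only the use of SHA-256 and the all-zero previous hash of the first block follow the text.

```python
import hashlib

GENESIS_PREVIOUS_HASH = "0" * 64   # the first block has no preceding block


def block_hash(index, data, previous_hash):
    """Hash the block contents together with the previous block's hash."""
    payload = f"{index}|{data}|{previous_hash}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(entries):
    """Create blocks in chronological order, each pointing to its predecessor."""
    chain = []
    previous_hash = GENESIS_PREVIOUS_HASH
    for index, data in enumerate(entries):
        current_hash = block_hash(index, data, previous_hash)
        chain.append({"index": index, "data": data,
                      "previous_hash": previous_hash, "hash": current_hash})
        previous_hash = current_hash
    return chain


def is_valid(chain):
    """Recompute every hash and check every previous-hash pointer."""
    expected_previous = GENESIS_PREVIOUS_HASH
    for block in chain:
        if block["previous_hash"] != expected_previous:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["previous_hash"]):
            return False
        expected_previous = block["hash"]
    return True


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(is_valid(chain))             # True

chain[1]["data"] = "B pays C 200"  # tamper with a recorded transaction
print(is_valid(chain))             # False: the stored hash no longer matches
```

Altering one block changes its hash, so the pointer stored in the next block no longer matches; this is what makes the recorded data tamper-evident.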
Hash Function and Its Properties: The blockchain performs cryptography
using a hash function. Given input data, the corresponding hash value
is generated. For example, if the input data is “We learn blockchain” the
• It is a one-way function, that is, given the input, the hash value can be
computed easily. However, given the hash value, it is very difficult to
produce the input back. This is like a trapdoor function in ECC.
• It is deterministic, that is, for the same input, the output is also the
same.
• A small change in the input produces a big change in the output.
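These properties can be observed with the SHA-256 function from the Python standard library. The sketch below is illustrative only; the helper name `sha256_hex` and the second input string (the same sentence with a trailing period) are assumptions.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("We learn blockchain")
again = sha256_hex("We learn blockchain")
changed = sha256_hex("We learn blockchain.")   # one extra character

# Deterministic: the same input always gives the same output.
print(first == again)

# Fixed size: inputs of any length give 64 hexadecimal characters.
print(len(first), len(sha256_hex("")), len(sha256_hex("x" * 10_000)))

# A small change in the input changes most of the output characters.
differing = sum(c1 != c2 for c1, c2 in zip(first, changed))
print(differing, "of", len(first), "characters differ")
```

The one-way property cannot be demonstrated by running code; it rests on the fact that no practical method is known for recovering an input from its SHA-256 value.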
8.4.4 Miner
There are three inputs, namely, the previous hash, the Merkle root hash, and
the nonce, to the SHA-256 cryptographic hashing algorithm, as shown in Figure
8.7. The miner varies the nonce, while the previous hash and Merkle root
remain the same, to generate a hash value lower than the target. If the hash
value meets the target set by the network, then we stop. Otherwise, the
process is repeated. This is called the proof of work consensus algorithm.
In a consensus algorithm, all the nodes of the network agree on the same
version of the facts.
The miner verifies and validates the transactions and adds the block. The
first miner who gets a hash that meets the target is rewarded. Since the
miner invests resources and computing power, he or she is rewarded for the
same in terms of bitcoin or other forms of remuneration. The miner gets 12.5
bitcoins and the sum of the transaction fees of that block. The amount is
halved every 210,000 blocks, which is approximately 4 years. Note that 1
bitcoin was equivalent to USD 51333 or 37.76 lakhs in rupees in May 2021. The miner
also verifies whether the sender has a sufficient amount to be transferred.
The target hash is set by the network in advance and is readjusted
periodically (in bitcoin, every 2016 blocks, which is roughly two weeks). It
is difficult to generate a nonce that satisfies the target, but it is easy
for other miners to verify it.
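The nonce search can be sketched as a simple loop. The following Python code is a simplified illustration, not the bitcoin implementation: bitcoin hashes a binary block header twice with SHA-256, whereas this sketch hashes a text string once. The placeholder input strings, the target of 12 leading zero bits, and the function names are assumptions chosen so that the search finishes quickly.

```python
import hashlib


def hash_value(previous_hash, merkle_root, nonce):
    """Interpret the SHA-256 hash of the three inputs as an integer."""
    header = f"{previous_hash}{merkle_root}{nonce}".encode("utf-8")
    return int(hashlib.sha256(header).hexdigest(), 16)


def mine(previous_hash, merkle_root, target, max_nonce=2 ** 32):
    """Vary only the nonce until the hash value is lower than the target."""
    for nonce in range(max_nonce):
        if hash_value(previous_hash, merkle_root, nonce) < target:
            return nonce
    return None   # no nonce found within the search limit


def verify(previous_hash, merkle_root, nonce, target):
    """Verification needs a single hash computation."""
    return hash_value(previous_hash, merkle_root, nonce) < target


PREVIOUS_HASH = "0" * 64                   # placeholder previous hash
MERKLE_ROOT = "example-merkle-root"        # placeholder Merkle root
TARGET = 1 << (256 - 12)                   # hash must start with 12 zero bits

found_nonce = mine(PREVIOUS_HASH, MERKLE_ROOT, TARGET)
print(found_nonce, verify(PREVIOUS_HASH, MERKLE_ROOT, found_nonce, TARGET))
```

A lower target means more leading zero bits and, on average, more nonces to try, while the cost of verification stays at one hash computation.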
In this scenario, the loyal general receives 2 commands to attack and 1
command not to attack. The majority of commands are to attack; therefore,
they attack the enemy. This is called Byzantine fault tolerance. Along
similar lines, in blockchain networks, the generals are the nodes and the
messages are the transactions. The attack command is the legitimate
transaction, while the command not to attack is the invalid transaction. We
arrive at a consensus even in the presence of some malicious nodes.
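The majority rule applied by the loyal general can be written down in a few lines. This Python sketch shows only the final counting step under the assumption that the commands have already been received; it does not model the message exchange between generals, and the function name is hypothetical.

```python
from collections import Counter


def decide(commands):
    """Return the command backed by a strict majority, or None if there is none."""
    command, votes = Counter(commands).most_common(1)[0]
    return command if votes > len(commands) / 2 else None


received = ["attack", "attack", "do not attack"]
decision = decide(received)
print(decision)   # attack
```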
8.4.11 Fork
A fork occurs when the blockchain diverges into a different chain with
different rules or protocols because the users of the network are not in
agreement. Some of the improvements to be made are minor, while some are
major. There are two types of forks.
• Soft fork: The new version works well with the old version, that is,
there is backward compatibility. Hence, there is flexibility.
• Hard fork: This is rigid as there are different rules for different chains
and the users have to choose one. Both chains are valid and cannot be
discarded. Some users support the old chain, while others the new chain.
Example: We have different versions of the same coin, for instance, bitcoin
(BTC) and bitcoin cash (B cash or BCH). Initially, the block size was 1 MB,
which was sufficient for a smaller number of transactions. With the
popularity of Bitcoin, the number of transactions has increased. Therefore,
bitcoin cash has used a larger block size of 8 MB since August 2017.
Proof of Stake is used in the Ethereum network. One gets the chance of
mining based on the number of coins he or she owns. The larger the stake,
the higher the chance and mining power. The one with more stake is likely to
perform genuine validation; otherwise, he or she loses the stake if the
validation is rejected by the rest of the users. The miner gets a reward of
3 ETH and the sum of the transaction fees for the validation of the block.
The reward does not get halved, unlike in the bitcoin network.
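The statement that a larger stake gives a higher chance can be illustrated with a stake-weighted random draw. The sketch below is not the Ethereum selection procedure; the participant names, the stake values, and the use of a seeded pseudo-random generator are assumptions made for a reproducible illustration.

```python
import random
from collections import Counter


def pick_validator(stakes, rng):
    """Pick one participant with probability proportional to his or her stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


STAKES = {"alice": 70, "bob": 20, "carol": 10}   # hypothetical coin holdings
rng = random.Random(42)                          # fixed seed for repeatability

wins = Counter(pick_validator(STAKES, rng) for _ in range(10_000))
print(wins)   # alice is chosen most often, carol least often
```

Over many rounds, the share of blocks validated by each participant approaches his or her share of the total stake.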
• Smart contract: Based on terms and conditions set using code in the
Ethereum network, a smart contract can automatically negotiate a
parking fee, order repair parts for a machine and pay for them
through the agreement of the contract, negotiate between different
energy sources (solar, wind, etc.) of a home in a smart grid system, and
much more. A smart contract can monitor flight status and automatically
pay compensation if a flight is delayed by some hours. The sensors in
a vehicle provide its real-time condition, so that compensation can be
paid automatically if the vehicle meets with an accident.
• Retail Management: The sensors collect the conditions of the product.
The product can be automatically tracked, and conditions at generation,
shipment, and arrival can be verified through history. The contract can
be executed between two vendors.
• Government Organizations: All the data can be secured, tamper-proof
and transparent among stakeholders eliminating paper-based records.
8.5 Summary
In this chapter, we studied data privacy, elliptic curve cryptography, and
blockchain technology. Data anonymization preserves the privacy of the
user. The anonymized data must not disclose sensitive attributes like the
name of the disease and the income of the user. At the same time, the data
should retain enough information to remain useful. Next, if the RSA
algorithm uses a 3072-bit public key, then the ECC algorithm uses only a
256-bit public key for a comparable level of security. The ECC algorithm
uses a smaller key size and hence is suitable for IoT networks. Finally,
the data cannot be tampered with in blockchain technology as it is
cryptographically secured.
8.6 Exercises
1. Apply k-anonymity and l-diversity anonymization techniques for dataset
of Table 8.11.
2. (a) What could be the original pin code if we generalize pin code in
k-anonymity as 844***?
(b) What could be the original gender if we generalize gender in k-
anonymity as *?
3. How does the key exchange happen in ECC when the receiver does not
know the sender?
4. Generate an elliptic curve ($y^2 = x^3 + ax + b$) over a finite field for a =
1, b = −1 and $p_r$ = 5.
5. Crack the number of jumps using the brute-force method, adding one by
one, if we start from the point (2, 3) and end at the point (1, 4) on the
elliptic curve of Problem 4. It is possible herein because $n_r$ is a small
number.
6. Carry out the encryption for the elliptic curve discussed in the Problem 4
for g = (1, 4), Message = (3, 2). The private keys for sender and receiver
are 5 and 2, respectively.
7. Carry out the decryption for the elliptic curve discussed in the Problem 4
for g = (1, 4), Message = (3, 2). The private keys for sender and receiver
are 5 and 2, respectively.
8. Do It Yourself : Study and modify the example of transactions of Token
available at https://andersbrownworth.com/blockchain/tokens