Architectural Iot - Mod5.submit

Iot in mod5 syllabus

Uploaded by

rajjeevsushma90
Copyright
© © All Rights Reserved
Chapter 8

Privacy and Security of Things Data

8.1 Introduction
Many devices communicate with one another in an IoT network. Consider an
example: your neighbors gain access to personal details such as your eating
habits, diseases, date of birth, and income. This may pose a threat to your
bank account, credit card, and health card. Nobody wants their private data
to be known to everyone. Therefore, things data needs to be anonymized
before it is stored in a hospital, bank, or government organization. This
prevents misuse of the data even if unauthorized persons get hold of it.
Consider another example: in a smart home application, the user remotely
turns on or off a smart air conditioner, smart refrigerator, smart bulb, or
smart door. Imagine that intruders intercept the communication of these
devices and gain access to them. Nobody wants intruders to control the
devices and trespass through the building. These two examples motivate us
to study the privacy and security of things data. There are several other IoT
applications, like smart grid, smart manufacturing, smart transportation, and
smart parking, where the things data needs to be anonymized and secured.
In particular, the chapter discusses data privacy, elliptic curve cryptography
(ECC), and blockchain technology.

8.2 Data Privacy


An organization often maintains a database of its employees or users with
attributes such as name, age, gender, area code, health conditions, and many
more. Generally, the database has sensitive attributes, like the disease of a
patient in a hospital, the income of a person in a government department, or
the location of a user in a mobile network. We anonymize the data before
using it for research or analysis purposes. The anonymized data must not
disclose sensitive attributes, like the disease or income, of an identifiable
user. At the same time, it should retain enough information to be useful.
Data anonymization thus preserves the privacy of the user.

DOI: 10.1201/9781003225584-8 207


208 Fundamentals of Internet of Things

TABLE 8.1: Dataset for k-anonymity. Entries in normal and italic fonts of
two blocks are for distinction purpose only.

Record No.   Age              Area Code        Country          Disease
             (Non-Sensitive)  (Non-Sensitive)  (Non-Sensitive)  (Sensitive)
1            27               800101           ∗                Stomach
2            32               800120           ∗                Flu
3            65               800105           ∗                Heart
4            43               800283           ∗                Heart
5            19               800140           ∗                Stomach
6            52               800787           ∗                Heart

There are several techniques for data anonymization. Among them, three
common techniques are k-anonymity, l-diversity, and t-closeness. In
k-anonymity, each user is indistinguishable from at least k − 1 other users
in the same block. In l-diversity, there are at least l distinct values of the
sensitive attribute in each block. A simple approach for anonymization is to
suppress the name of the user. However, a malicious user can still extract the
user’s sensitive attributes using non-sensitive attributes, such as the age,
gender, or pin code of the user’s area, together with prior knowledge. For
example, in a linking attack, the malicious user combines published data, like
a medical database that has sensitive attributes, with external data, like voter
registration details, to determine the disease of a user. k-anonymity protects
against this attack. Note that a row of a table is called a record or tuple, while
a column is an attribute.

8.2.1 Privacy Using k-anonymity


We anonymize the data through generalization and suppression. Consider
the dataset in Table 8.1. To form groups (blocks) of three records each, the
non-sensitive age attribute is generalized to a range, that is, less than or
equal to, or greater than, some value, say, 30. Further, some digits of the
non-sensitive area code are suppressed by asterisks, and the name of the
country is completely suppressed. In the 3-anonymized table, each record is
indistinguishable from at least 2 other records in its block. Notably, the name
of the person is suppressed to provide a first level of anonymization. The
3-anonymized dataset is shown in Table 8.2.
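The block structure described above can be checked mechanically. The following is a minimal Python sketch, not the book's procedure: it takes the already generalized rows of the published Table 8.2 as input, groups them by their non-sensitive values, and reports the size of the smallest block. The function names `suppress_digits` and `anonymity_level` are illustrative assumptions.

```python
from collections import Counter

# Rows of the published Table 8.2: (age range, area code, disease).
TABLE_8_2 = [
    ("<=30", "8001**", "Stomach"),
    ("<=30", "8001**", "Stomach"),
    ("<=30", "8001**", "Flu"),
    (">30", "800***", "Heart"),
    (">30", "800***", "Heart"),
    (">30", "800***", "Heart"),
]


def suppress_digits(area_code, keep):
    """Keep the first `keep` digits and replace the rest by asterisks."""
    return area_code[:keep] + "*" * (len(area_code) - keep)


def anonymity_level(rows):
    """Return the largest k for which the rows are k-anonymous.

    Rows with identical non-sensitive values form one block;
    k is the size of the smallest block.
    """
    block_sizes = Counter((age, area) for age, area, _disease in rows)
    return min(block_sizes.values())


print(suppress_digits("800101", 4))  # 8001**
print(anonymity_level(TABLE_8_2))    # 3
```

Grouping on the generalized values rather than on the raw ages keeps the sketch independent of how the generalization threshold was chosen.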
TABLE 8.2: 3-anonymized table

Record No.   Age     Area Code   Country   Disease
5            ≤ 30    8001**      ∗         Stomach
1            ≤ 30    8001**      ∗         Stomach
2            ≤ 30    8001**      ∗         Flu
4            > 30    800***      ∗         Heart
6            > 30    800***      ∗         Heart
3            > 30    800***      ∗         Heart

Drawbacks of k-anonymity: In a homogeneity attack, suppose the malicious
user is a neighbor and friend of the legitimate user. Therefore, the malicious
user knows the user’s area code and age, which is above 30 years. The
malicious user is interested in the friend’s disease, as he or she often goes to
the hospital. Now, the malicious user obtains the published 3-anonymized
table and, with the help of the age and area code, shortlists records 4, 6, and
3 of block 2. The malicious user thus learns that the friend is suffering from
heart disease, because the sensitive attribute is heart disease for all three
records. Hence, we need a diversity of sensitive attributes in each group.

TABLE 8.3: Dataset for l-diversity. Entries in normal and italic fonts of two
blocks are for distinction purpose only.

Record No.   Age              Area Code        Country          Disease
             (Non-Sensitive)  (Non-Sensitive)  (Non-Sensitive)  (Sensitive)
1            27               800101           ∗                Stomach
2            32               800120           ∗                Flu
3            65               800105           ∗                Heart
4            43               800183           ∗                Heart
5            50               800140           ∗                Stomach
6            52               800787           ∗                Flu

In an attack with prior knowledge, suppose the malicious user has another
close friend whose age is less than 30 years. Being a friend, the malicious
user knows his or her area code as well. Now, assume that the malicious user
has the prior knowledge that this friend eats sufficient fiber-rich food. From
this, the malicious user may conclude that the friend is not suffering from
the flu and may therefore have a stomach-related disease.

8.2.2 Privacy Using l-diversity


If a block contains at least l distinct values of the sensitive attribute, then
it is called l-diverse. If every block of the table is l-diverse, so is the table.
Let us consider the dataset of Table 8.3. The 3-diverse anonymized table is
in Table 8.4. We made some minor changes to the original table used for
k-anonymity to fit the l-diversity problem, as we do not have a large number
of records. Note that k and l need not be the same. Here, each block has 3
records and 3 distinct sensitive values, namely, stomach, heart, and flu. A
block may contain more than 3 records, as long as it has at least 3 distinct
sensitive values. Now, the malicious user needs at least l − 1 = 2 pieces of
prior knowledge to discard two records in each block. In k-anonymity, we
need just 1 piece of prior knowledge to obtain the sensitive attribute of a
person. Hence, l-diversity is a better anonymization technique than
k-anonymity.

TABLE 8.4: 3-diverse anonymized table

Record No.   Age     Area Code   Country   Disease
1            ≤ 45    8001**      ∗         Stomach
2            ≤ 45    8001**      ∗         Flu
4            ≤ 45    8001**      ∗         Heart
3            > 45    800***      ∗         Heart
5            > 45    800***      ∗         Stomach
6            > 45    800***      ∗         Flu
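The difference between Table 8.2 and Table 8.4 can be made concrete with a small check. The following Python sketch is an illustration under the assumption that a block is defined by identical generalized age and area code; the helper name `diversity_level` is hypothetical. It counts the distinct sensitive values per block and returns the smallest count.

```python
from collections import defaultdict

# (age range, area code, disease) rows of the two published tables.
TABLE_8_2 = [
    ("<=30", "8001**", "Stomach"), ("<=30", "8001**", "Stomach"),
    ("<=30", "8001**", "Flu"), (">30", "800***", "Heart"),
    (">30", "800***", "Heart"), (">30", "800***", "Heart"),
]
TABLE_8_4 = [
    ("<=45", "8001**", "Stomach"), ("<=45", "8001**", "Flu"),
    ("<=45", "8001**", "Heart"), (">45", "800***", "Heart"),
    (">45", "800***", "Stomach"), (">45", "800***", "Flu"),
]


def diversity_level(rows):
    """Return the largest l for which the rows are l-diverse."""
    sensitive_values = defaultdict(set)
    for age, area, disease in rows:
        sensitive_values[(age, area)].add(disease)
    return min(len(values) for values in sensitive_values.values())


print(diversity_level(TABLE_8_2))  # 1: block 2 contains only "Heart"
print(diversity_level(TABLE_8_4))  # 3
```

A result of 1 for Table 8.2 is exactly the situation exploited by the homogeneity attack: one block reveals its sensitive value without any further knowledge.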

Drawbacks of l-diversity: At times, l-diversity is difficult to achieve. For
example, suppose the sensitive attribute is a test report for a disease that is
either negative or positive, and negative reports dominate, say, 95% of 1000
records. With 2-diversity, every block must contain both a negative and a
positive report. Since there are only 5% × 1000 = 50 positive records, there
can be at most 50 distinct 2-diverse blocks. In the literature, an efficient
anonymization technique such as t-closeness is proposed.

8.3 Elliptic Curve Cryptography


Earlier, we had symmetric key cryptography, where only one key, called the
private (secret) key, is used. The sender and receiver must exchange this key.
This is vulnerable, since an eavesdropper on the communication medium can
intercept the key. The solution is to use asymmetric key cryptography, where
two keys are used, namely, private and public keys. The private key is like a
password and is kept secret by each party. On the other hand, the public key
is like an email id and is announced to the external world. Public key
cryptography was introduced by Diffie and Hellman in 1976. It is also called
asymmetric key cryptography because it uses two keys: private and public.
RSA and ECC are examples of public-key cryptography. The RSA algorithm
was developed by MIT researchers Ron Rivest, Adi Shamir, and Leonard
Adleman in 1977. The RSA algorithm addresses the key exchange issue of
symmetric key cryptography; however, it has higher computational
complexity.
Elliptic Curve Cryptography (ECC) was proposed independently in 1985 by
Neal Koblitz and Victor Miller, the latter then working at IBM. If the RSA
algorithm uses a 3072-bit public key, then the ECC algorithm uses only a
256-bit public key for a comparable level of security. Because ECC uses a
smaller key size, it is suitable for IoT networks. Notably, the elliptic curve
has nothing to do with an ellipse. The elliptic curve is described by a cubic
equation. In ECC, addition does not mean simple algebraic addition.
Similarly, the doubling of a point does not mean the simple multiplication of
each coordinate by two. Nevertheless, we use the same terminology.

FIGURE 8.1: ECC curves

8.3.1 ECC over Real Numbers


An elliptic curve over the real numbers is described by

$$y^2 = x^3 + ax + b \qquad (8.1)$$

where $x, y, a, b \in \mathbb{R}$, the set of real numbers, and $4a^3 + 27b^2 \neq 0$. This cubic
form is of degree 3.
ECC uses a trapdoor or one-way function. That is, given the public key,
it is difficult to derive the private key. Public and private keys are generated
using points of an elliptic curve. The curve is parametrized by $\{a, b, p_r, n_r, g\}$.
Here, $a$ and $b$ are the parameters of the elliptic curve. Figure 8.1 shows
the elliptic curves for different values of these parameters. $p_r$ is a prime
number used for the elliptic curve over a finite field; it imposes the maximum
limits along the x and y axes. $n_r$ relates to the cyclic group generated by $g$:
it is the number of times the point operation is repeated on $g$ (see Section
8.3.4). $g$ is the generator or base point on the elliptic curve, which is used
to compute other points on the curve.

FIGURE 8.2: ECC addition

8.3.2 Dot Operations over Real Numbers


Some characteristics of an elliptic curve are as follows: First, the curve is
symmetric about the x axis. Second, a straight line intersects the curve in at
most three points. Third, the curve extends to infinity on both ends.
There are also the following properties:

• If a line intersects two points A and B on the curve, then it also intersects
a third point, that is, −C on the curve.
• If a line is a tangent to the curve at point A, then it also intersects
another point −C on the curve.
• The vertical line intersects the curve at infinity.

There are two operations, called dot operations: point addition and point
doubling. Let us discuss them.
Point Addition: Let A and B be $(x_A, y_A)$ and $(x_B, y_B)$, respectively. The
straight line through points A and B intersects the curve at a third point −C.
The reflection of point −C about the x axis gives the point C, which is the
sum of points A and B, as shown in Figure 8.2. Let

$$\theta_a = \frac{y_B - y_A}{x_B - x_A} \qquad (8.2)$$

be the slope of the straight line connecting A and B. Now, obtain the
equation of this straight line by replacing $(x_B, y_B)$ with a general point $(x, y)$.
That is,

$$\theta_a (x - x_A) = (y - y_A) \qquad (8.3)$$

We can rewrite it as follows:

$$y = \theta_a x + y_A - \theta_a x_A \qquad (8.4)$$

In order to find the coordinates of point $-C \equiv (x_C, -y_C)$, solve the equation of
the curve $y^2 = x^3 + ax + b$ together with Equation 8.4. That is,

$$x^3 + ax + b = (\theta_a x + y_A - \theta_a x_A)^2 \qquad (8.5)$$

This can be further recast as

$$x^3 - \theta_a^2 x^2 + [a - 2\theta_a (y_A - \theta_a x_A)]\,x + b - (y_A - \theta_a x_A)^2 = 0 \qquad (8.6)$$

This is a monic polynomial, that is, the coefficient of the highest power of $x$
(here $x^3$) is equal to 1. Its three roots are the x coordinates of the three
intersection points, and for a monic cubic the sum of the roots equals the
negative of the coefficient of $x^2$. Hence,

$$x_A + x_B + x_C = -(-\theta_a^2) \qquad (8.7)$$

That is,

$$x_C = \theta_a^2 - x_A - x_B \qquad (8.8)$$

Replace $(x, y)$ with $(x_C, -y_C)$ in Equation 8.3. That is,

$$-y_C = y_A + \theta_a (x_C - x_A) \qquad (8.9)$$

Hence, $y_C$ for the reflected point C is

$$y_C = -y_A + \theta_a (x_A - x_C) \qquad (8.10)$$

Doubling Operation: The tangent line at point A intersects the elliptic
curve at point −C. The reflection of −C about the x axis gives the doubling
point C of point A. Taking the derivative of the elliptic curve $y^2 = x^3 + ax + b$
with respect to $x$ gives

$$2y \frac{dy}{dx} = 3x^2 + a \qquad (8.11)$$

That is, replacing $(x, y)$ with $(x_A, y_A)$,

$$\theta_d = \frac{dy}{dx} = \frac{3x_A^2 + a}{2y_A} \qquad (8.12)$$

at point A. We may consider A = B for point doubling. From the final
expressions of point addition, we can write

$$x_C = \theta_d^2 - 2x_A$$
$$y_C = -y_A + \theta_d (x_A - x_C) \qquad (8.13)$$
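As a numerical sanity check of Equations 8.2, 8.8, 8.10, 8.12, and 8.13, the following Python sketch adds and doubles points on a real-valued curve and confirms that the results lie on the curve again. The curve $y^2 = x^3 - 7x + 10$ and the points A = (1, 2) and B = (3, 4) are assumptions chosen for illustration only; they do not come from the text.

```python
import math


def add_points(A, B):
    """Point addition over the reals for two points with different x."""
    xA, yA = A
    xB, yB = B
    theta = (yB - yA) / (xB - xA)        # slope, Equation 8.2
    xC = theta ** 2 - xA - xB            # Equation 8.8
    yC = -yA + theta * (xA - xC)         # Equation 8.10
    return (xC, yC)


def double_point(A, a):
    """Point doubling over the reals for a point with y != 0."""
    xA, yA = A
    theta = (3 * xA ** 2 + a) / (2 * yA)  # tangent slope, Equation 8.12
    xC = theta ** 2 - 2 * xA              # Equation 8.13
    yC = -yA + theta * (xA - xC)
    return (xC, yC)


def on_curve(P, a, b):
    """Check y^2 = x^3 + a*x + b up to floating-point tolerance."""
    x, y = P
    return math.isclose(y ** 2, x ** 3 + a * x + b, abs_tol=1e-9)


a, b = -7.0, 10.0                 # illustrative curve: 4a^3 + 27b^2 != 0
A, B = (1.0, 2.0), (3.0, 4.0)     # both satisfy the curve equation

print(add_points(A, B))           # (-3.0, 2.0)
print(double_point(A, a))         # (-1.0, -4.0)
print(on_curve(add_points(A, B), a, b), on_curve(double_point(A, a), a, b))
```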

8.3.3 Operations over Finite Field


Computation using real numbers is easy to understand. However, it
requires high computational time for performing various operations. There
is also inaccuracy due to rounding error. At times, a number becomes
extremely large and a floating-point variable cannot hold it. A cryptographic
algorithm, however, needs accurate and fast computation. Hence, we
carry out the computation on a finite field $\{0, 1, \ldots, p_r - 1\}$ of integers modulo
$p_r$, where $p_r$ is a prime number. Here, $x, y \in \mathbb{Z}$, the set of integers. Using the
expressions of the point addition and doubling operations, we can summarize
the following operations over a finite field:
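Addition:

$$\theta_a = \left(\frac{y_B - y_A}{x_B - x_A}\right) \bmod p_r$$
$$x_C = (\theta_a^2 - x_A - x_B) \bmod p_r \qquad (8.14)$$
$$y_C = (-y_A + \theta_a (x_A - x_C)) \bmod p_r$$

Doubling:

$$\theta_d = \left(\frac{3x_A^2 + a}{2y_A}\right) \bmod p_r$$
$$x_C = (\theta_d^2 - 2x_A) \bmod p_r \qquad (8.15)$$
$$y_C = (-y_A + \theta_d (x_A - x_C)) \bmod p_r$$

Here, the division is modular: $\theta = (u/v) \bmod p_r$ denotes the value of $\theta$ that
satisfies $v\,\theta \equiv u \pmod{p_r}$.
Negation: If $A \equiv (x_A, y_A)$ is a point on the curve, then −A is obtained
by reflection about the x-axis. That is,

$$-A \equiv (x_A,\; -y_A \bmod p_r) \qquad (8.16)$$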

Identity element: The point at infinity O is the identity element under
the addition operation.

• If we add the point A to −A, then the straight line becomes vertical.
It is assumed that the line intersects the curve at infinity, and the
reflection of the point at infinity remains at infinity. That is to say,
A + (−A) = O.
• Addition of a point A and infinity O gives the point itself. That is, A +
O = A.
In the last case, there was no option other than going to infinity
for the third intersection. However, in this case, the vertical line through
A has a third intersection on the actual curve, namely −A, whose
reflection is A. So, there is no need to go to infinity to find the third
intersection with the curve.
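The operations of this section can be collected into one small routine. The following is a minimal Python sketch under stated assumptions: the point at infinity O is represented by `None`, the modular division is carried out with the modular inverse `pow(v, -1, p)` (available in Python 3.8 and later), and the function names `ec_add` and `ec_neg` are illustrative. The toy curve $y^2 = x^3 + x \bmod 7$ is the one used in the examples later in this chapter.

```python
def ec_neg(P, p):
    """Negation (Equation 8.16); the point at infinity is its own negative."""
    if P is None:
        return None
    x, y = P
    return (x, -y % p)


def ec_add(P, Q, a, p):
    """Add two points on y^2 = x^3 + a*x + b over the integers modulo p."""
    if P is None:                     # O + Q = Q
        return Q
    if Q is None:                     # P + O = P
        return P
    xP, yP = P
    xQ, yQ = Q
    if xP == xQ and (yP + yQ) % p == 0:
        return None                   # vertical line: P + (-P) = O
    if P == Q:                        # doubling, Equation 8.15
        theta = (3 * xP * xP + a) * pow((2 * yP) % p, -1, p) % p
    else:                             # addition, Equation 8.14
        theta = (yQ - yP) * pow((xQ - xP) % p, -1, p) % p
    xR = (theta * theta - xP - xQ) % p
    yR = (-yP + theta * (xP - xR)) % p
    return (xR, yR)


a, p = 1, 7                           # toy curve y^2 = x^3 + x mod 7
g = (3, 3)
two_g = ec_add(g, g, a, p)
print(two_g)                          # (1, 4)
print(ec_add(g, two_g, a, p))         # (5, 5)
print(ec_add(g, ec_neg(g, p), a, p))  # None, i.e. the point at infinity
```

Treating "same x, opposite y" as the vertical-line case also covers the doubling of a point with y = 0, where the tangent is vertical.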

8.3.4 Trapdoor Function


Let $g$ and $q$ be two points on the elliptic curve related in the following
manner:

$$q = n_r\, g, \qquad (8.17)$$

where $n_r < p_r$. By the closure property of the commutative group, $q$ is
also a point of the elliptic curve group. Since this is a trapdoor function, given
$n_r$ and $g$, it is easy to find $q$. However, given $q$ and $g$, it is very difficult to find
$n_r$ for a large $n_r$. This is called the discrete logarithm problem for an elliptic
curve. Here, the start point ($g$) and the end point ($q$) on an elliptic curve act
as the public keys. The number of jumps to go from $g$ to $q$ is the private key.

FIGURE 8.3: Jump in ECC

8.3.5 Beauty of ECC


We take a straight path and then jump to the other side to get A + B, as
shown in Figure 8.3(a). This is done in order to lose our enemy. However, the
enemy may still follow us. So, let us repeat the same trick: start from A + B
and go to C, then again jump to the other side, that is, to (A + B) + C. If the
number of jumps ($n_r$) is large enough, then it becomes difficult for the enemy
to track us.
We can consider a special case for illustration purposes in Figure 8.3(b).
Let us move along the tangent when A = B. Using the previous trick, we can
get 2A, then 3A, and so on till $n_r A$, to lose our enemy for a large $n_r$. We can
go from A to $n_r A$ if we know $n_r$. However, it is very difficult for the enemy
to recover $n_r$, given A and $n_r A$. The elliptic curve parameters $a$ and $b$, and
the generator or base point, $g$ or A, are public information. The sender and
receiver agree to use these parameters. The final point $n_r g$ or $n_r A$ is the
public key, whereas $n_r$ is the private key. The enemy can track us if $n_r$ is
small and we do the operation slowly.
The solution is to carry out the operation very quickly. Also, if we do some
operations on points on the curve, the result still belongs to the group. Further,
the add operation is associative. That is, (A + B) + C = A + (B + C). The
proof is difficult; however, it can be verified easily by taking points on the
curve. Hence, the order in which the additions are grouped does not matter.
Example: Let us consider a numerical example to understand the concepts
that we have learned so far. Suppose that we want to get $13A$. There are two
approaches. First, we can add A repeatedly, which takes thirteen steps.
Second, we can convert the decimal number to its binary equivalent. Then,
$(13)_{10} A = (1101)_2 A = 2^3 A + 2^2 A + 2^0 A = 8A + 4A + A$. Reading the bits from the
least significant one: (a) add A and then double A to get 2A, (b) double 2A
to get 4A, (c) add 4A and then double 4A to get 8A, and (d) add 8A.
Therefore, we can do it in only 4 steps. Both approaches produce the same
result, since the order of addition does not matter. We can trick the enemy by
doubling quickly on an elliptic curve rather than adding one by one. The
second approach requires only 4 steps and is hence faster than the first
approach, which requires 13 steps. Note that $n_r = 13$ is a very small number;
we consider it for visualization purposes only.
We can do the computation in a number of steps of the order of $O(\log_2 n_r)$.
Even if $n_r = 2^{200}$, we can do the computation in only about 200 very fast
double-and-add steps. If $n_r$ is a very large number, then it becomes difficult
for the enemy to determine the number of jumps. This is called the elliptic
curve discrete logarithm problem. For a secure message exchange, the sender
and receiver must agree upon the parameters of an elliptic curve, that is, $a$,
$b$, and a large prime number $p_r$.
They then pick a generator or base point $g$ on the elliptic curve.
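The double-and-add idea does not depend on the particular group, so it can be sketched generically. In the following Python sketch, `double_and_add` takes the group operation as a parameter and also counts one step per bit of the multiplier; ordinary integer addition is used as a stand-in group (an assumption made only so that the result is easy to check by eye), with A represented by the integer 1.

```python
def double_and_add(n, point, add, identity):
    """Compute n * point with the double-and-add method.

    The bits of n are read from the least significant one; each bit
    costs one step (an optional add followed by a doubling).
    Returns the result and the number of steps.
    """
    result = identity
    addend = point
    steps = 0
    while n > 0:
        if n & 1:                       # current bit is 1: add the addend
            result = add(result, addend)
        addend = add(addend, addend)    # double for the next bit
        n >>= 1
        steps += 1
    return result, steps


def int_add(x, y):
    """Stand-in group operation: ordinary integer addition."""
    return x + y


print(double_and_add(13, 1, int_add, 0))   # (13, 4)

# A 200-bit multiplier still needs only 200 steps.
big = 2 ** 200 - 1
result, steps = double_and_add(big, 1, int_add, 0)
print(result == big, steps)                # True 200
```

Passing an elliptic curve addition routine and the point at infinity in place of `int_add` and `0` gives the scalar multiplication used for key generation.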

8.3.6 Key Generation and Exchange


In key generation, the private keys are chosen as large integers between 0
and $p_r - 1$. Let the private keys of the sender and receiver be $s_{priv}$ and
$r_{priv}$, respectively. We compute the public keys of the sender and receiver as

$$s_{pub} = s_{priv}\, g \quad \text{and} \quad r_{pub} = r_{priv}\, g, \qquad (8.18)$$

respectively. After generating the public keys, they exchange them with each
other. Now, in key exchange, the public key of the receiver (or sender) is
with the sender (or receiver). We calculate the secret keys of the sender and
the receiver, respectively, as

$$s_{key} = s_{priv}\, r_{pub} \quad \text{and} \quad r_{key} = r_{priv}\, s_{pub}. \qquad (8.19)$$

Note that both are the same, as

$$s_{priv}\, r_{pub} = s_{priv}\, r_{priv}\, g = r_{priv}\, (s_{priv}\, g) = r_{priv}\, s_{pub}. \qquad (8.20)$$

Notably, we can get $s_{pub}$ (or $r_{pub}$) by adding $g$ to itself $s_{priv}$ (or $r_{priv}$) times.
Similarly, in key exchange, the sender (or receiver) adds $r_{priv}\, g$ (or $s_{priv}\, g$) to
itself $s_{priv}$ (or $r_{priv}$) times. We can trick the enemy by using the
double-and-add method and large values of $s_{priv}$ and $r_{priv}$.
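Equations 8.18 to 8.20 can be tried out on a toy curve. The following Python sketch assumes the small curve $y^2 = x^3 + x \bmod 7$ with base point g = (3, 3) and the private keys 3 and 2, which are the values of the worked example later in this chapter; such tiny numbers offer no security and serve only as an illustration. The helper names `ec_add` and `scalar_mult` are assumptions, and `None` stands for the point at infinity.

```python
A_PARAM, P_MOD = 1, 7        # toy curve y^2 = x^3 + x mod 7
G = (3, 3)                   # generator or base point


def ec_add(P, Q, a=A_PARAM, p=P_MOD):
    """Point addition and doubling modulo p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (xP, yP), (xQ, yQ) = P, Q
    if xP == xQ and (yP + yQ) % p == 0:
        return None
    if P == Q:
        theta = (3 * xP * xP + a) * pow((2 * yP) % p, -1, p) % p
    else:
        theta = (yQ - yP) * pow((xQ - xP) % p, -1, p) % p
    xR = (theta * theta - xP - xQ) % p
    return (xR, (-yP + theta * (xP - xR)) % p)


def scalar_mult(n, P):
    """n * P by double-and-add."""
    result, addend = None, P
    while n > 0:
        if n & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        n >>= 1
    return result


s_priv, r_priv = 3, 2                      # private keys (kept secret)
s_pub = scalar_mult(s_priv, G)             # Equation 8.18
r_pub = scalar_mult(r_priv, G)
s_key = scalar_mult(s_priv, r_pub)         # Equation 8.19
r_key = scalar_mult(r_priv, s_pub)

print(s_pub, r_pub)                        # (5, 5) (1, 4)
print(s_key, r_key)                        # (1, 3) (1, 3)
```

Only `s_pub` and `r_pub` travel over the channel; both parties arrive at the same secret point without ever sending a private key.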

8.3.7 Encryption and Decryption


Let us consider a simple elliptic curve encryption and decryption scheme for
illustration purposes. In encryption, the original message is first encoded by
mapping it to a point on the curve. The data to be encrypted is this point,
Message. The encrypted or ciphered message, which is sent from the sender
to the receiver, is given by

$$C_M = \{s_{priv}\, g,\; \text{Message} + s_{key}\}, \quad \text{where } s_{key} = r_{key} \qquad (8.21)$$

Hence,

$$C_M = \{s_{priv}\, g,\; \text{Message} + s_{priv}\, r_{pub}\} \qquad (8.22)$$

which is a pair of points, each given by its x and y coordinates on the elliptic
curve.
In decryption, the receiver multiplies the first component of $C_M$ by the
receiver’s private key and subtracts the product from the second component
of $C_M$. That is,

$$\text{Message} + s_{priv}\, r_{pub} - r_{priv}\,(s_{priv}\, g) = \text{Message} + s_{priv}\, r_{priv}\, g - r_{priv}\, s_{priv}\, g \qquad (8.23)$$

where $r_{pub} = r_{priv}\, g$. Hence, we can simplify to

$$\text{Message} + s_{priv}\, r_{priv}\, g - s_{priv}\, r_{priv}\, g = \text{Message} \qquad (8.24)$$

which is in fact the original message.

FIGURE 8.4: ECC finite field for $p_r = 7$


Example for Generation of Points: Generate the points of the elliptic
curve $y^2 = (x^3 + ax + b) \bmod p_r$ for $a = 1$, $b = 0$, and $p_r = 7$.
Solution: The points of the elliptic curve over the finite field are (0, 0),
(1, 3), (1, 4), (3, 3), (3, 4), (5, 2), and (5, 5) as shown in Figure 8.4. We can
easily verify these points below:

• For (0, 0): $0^2 \bmod 7 = (0^3 + 1 \times 0 + 0) \bmod 7$. This gives 0 = 0.
• For (1, 3): $3^2 \bmod 7 = (1^3 + 1 \times 1 + 0) \bmod 7$. This gives 2 = 2.
• For (1, 4): $4^2 \bmod 7 = (1^3 + 1 \times 1 + 0) \bmod 7$. This gives 2 = 2.
• For (3, 3): $3^2 \bmod 7 = (3^3 + 1 \times 3 + 0) \bmod 7$. This gives 2 = 2.
• For (3, 4): $4^2 \bmod 7 = (3^3 + 1 \times 3 + 0) \bmod 7$. This gives 2 = 2.
• For (5, 2): $2^2 \bmod 7 = (5^3 + 1 \times 5 + 0) \bmod 7$. This gives 4 = 4.
• For (5, 5): $5^2 \bmod 7 = (5^3 + 1 \times 5 + 0) \bmod 7$. This gives 4 = 4.

TABLE 8.5: Values of 3θd mod 7

θd           1   2   3   4   5   6   7
3θd          3   6   9  12  15  18  21
3θd mod 7    3   6   2   5   1   4   0

TABLE 8.6: Values of 2θa mod 7

θa           1   2   3
2θa          2   4   6
2θa mod 7    2   4   6
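The list of points in the solution above can be reproduced by brute force, since the field is tiny. The following Python sketch simply tests every pair (x, y) with coordinates in {0, ..., p − 1} against the curve equation; the function name `curve_points` is an illustrative assumption, and such an exhaustive search is only feasible for a toy prime like 7.

```python
def curve_points(a, b, p):
    """List all (x, y) with y^2 = x^3 + a*x + b (mod p), excluding infinity."""
    return [
        (x, y)
        for x in range(p)
        for y in range(p)
        if (y * y - (x ** 3 + a * x + b)) % p == 0
    ]


print(curve_points(1, 0, 7))
# [(0, 0), (1, 3), (1, 4), (3, 3), (3, 4), (5, 2), (5, 5)]
```

Together with the point at infinity, this toy curve therefore has 8 points in total.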

The following example utilizes all the concepts like point adding, doubling,
and negation operations in detail.
Example of ECC Encryption and Decryption: Consider the elliptic
curve of the last example with generator point $g \equiv (x_g, y_g) = (3, 3)$ and
encoded message on the curve, Message $\equiv (x_M, y_M) = (3, 4)$. Assume the
private keys of the sender and receiver are 3 and 2, respectively. Carry out
the ECC encryption and decryption.
Solution: Here, $s_{priv} = 3$ and $r_{priv} = 2$. Let us compute the public key of
the sender. $s_{pub} = s_{priv}\, g = 3 \times (3, 3) = (3)_{10}\, g = (11)_2\, g = (2^1 + 2^0)\, g = 2g + g$.
Hence, (a) add $g$ and then double $g$, and (b) add $2g$.
Now, we double $g$ to get $2g = (x_C, y_C)$, say. $\theta_d = \left(\frac{3x_g^2 + a}{2y_g}\right) \bmod p_r =
\left(\frac{3 \times 3^2 + 1}{2 \times 3}\right) \bmod 7 = \left(\frac{14}{3}\right) \bmod 7$. That is, $3\theta_d = 14 \bmod 7 = 0$. For $\theta_d = 7$, the
condition is satisfied as shown in Table 8.5. Therefore, $x_C = (\theta_d^2 - 2x_g) \bmod p_r
= (7^2 - 2 \times 3) \bmod 7 = 1$. Now, $y_C = (-y_g + \theta_d (x_g - x_C)) \bmod p_r =
(-3 + 7(3 - 1)) \bmod 7 = 4$. Hence, $2g = (1, 4) = (x_C, y_C)$.
Now, we add $g$ and $2g$ to get $(x_D, y_D)$. Here, $\theta_a = \left(\frac{y_g - y_C}{x_g - x_C}\right) \bmod p_r =
\left(\frac{3 - 4}{3 - 1}\right) \bmod 7 = \left(\frac{-1}{2}\right) \bmod 7$. That is, $2\theta_a = (-1 + 7) \bmod 7 = 6$. Therefore,
$\theta_a = 3$ as shown in Table 8.6. Let us add. $x_D = (\theta_a^2 - x_g - x_C) \bmod p_r =
(3^2 - 3 - 1) \bmod 7 = 5$ and $y_D = (-y_g + \theta_a (x_g - x_D)) \bmod p_r = (-3 + 3(3 -
5)) \bmod 7 = -9 \bmod 7 = (-9 + 7 \times 2) \bmod 7 = 5$. Hence, $g + 2g = (5, 5) \equiv
(x_D, y_D)$.
We determine the public key of the receiver as: $r_{pub} = r_{priv}\, g = 2 \times (3, 3) = (1,
4) \equiv (x_E, y_E)$, which was already computed as the first step of $s_{pub}$. The secret
key after key exchange for the sender is computed as follows: $s_{key} = s_{priv}\, r_{pub}
= 3 \times (1, 4) = (3)_{10} \times r_{pub} = (11)_2 \times r_{pub} = (2^1 + 2^0) \times r_{pub} = 2r_{pub} + r_{pub}$.
We need to carry out double and add operations for this as follows:

TABLE 8.7: Values of 2θd,F mod 7

θd,F           1   2   3   4
2θd,F          2   4   6   8
2θd,F mod 7    2   4   6   1

TABLE 8.8: Values of 5θd,I mod 7

θd,I           1   2
5θd,I          5  10
5θd,I mod 7    5   3

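• Double of $r_{pub}$, giving $(x_F, y_F)$: The new slope is $\theta_{d,F} = \left(\frac{3x_E^2 + a}{2y_E}\right) \bmod p_r =
\left(\frac{3 \times 1^2 + 1}{2 \times 4}\right) \bmod 7 = \left(\frac{1}{2}\right) \bmod 7$. That is, $2\theta_{d,F} = 1 \bmod 7 = 1$. For
$\theta_{d,F} = 4$, the condition is satisfied as shown in Table 8.7. Therefore,
$x_F = (\theta_{d,F}^2 - 2x_E) \bmod p_r = (4^2 - 2 \times 1) \bmod 7 = 0$. Now, $y_F =
(-y_E + \theta_{d,F} (x_E - x_F)) \bmod p_r = (-4 + 4(1 - 0)) \bmod 7 = 0$. Hence,
$2r_{pub} = (0, 0) = (x_F, y_F)$.
• Addition of $(x_E, y_E) \equiv (1, 4)$ and $(x_F, y_F) \equiv (0, 0)$, giving $(x_H, y_H)$:
$\theta_{a,H} = \left(\frac{y_E - y_F}{x_E - x_F}\right) \bmod p_r = \left(\frac{4 - 0}{1 - 0}\right) \bmod 7 = 4 \bmod 7 = 4$.
$x_H = (\theta_{a,H}^2 - x_E - x_F) \bmod p_r = (4^2 - 1 - 0) \bmod 7 = 1$ and
$y_H = (-y_E + \theta_{a,H} (x_E - x_H)) \bmod p_r = (-4 + 4(1 - 1)) \bmod 7 =
-4 \bmod 7 = (-4 + 7) \bmod 7 = 3$.
Hence, $(x_H, y_H) \equiv (1, 3)$, that is, $s_{key} = (1, 3)$.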

The secret key after key exchange for the receiver is computed as follows: $r_{key}
= r_{priv}\, s_{pub} = 2 \times (5, 5) = 2 \times (x_D, y_D) \equiv (x_I, y_I)$. We only need a doubling
operation here.

• Double of $s_{pub}$, giving $(x_I, y_I)$: The new slope is $\theta_{d,I} = \left(\frac{3x_D^2 + a}{2y_D}\right) \bmod p_r =
\left(\frac{3 \times 5^2 + 1}{2 \times 5}\right) \bmod 7 = \left(\frac{38}{5}\right) \bmod 7$. That is, $5\theta_{d,I} = 38 \bmod 7 = 3$. For
$\theta_{d,I} = 2$, the condition is satisfied as shown in Table 8.8. Therefore,
$x_I = (\theta_{d,I}^2 - 2x_D) \bmod p_r = (2^2 - 2 \times 5) \bmod 7 = -6 \bmod 7 =
(-6 + 7) \bmod 7 = 1$. Now, $y_I = (-y_D + \theta_{d,I} (x_D - x_I)) \bmod p_r =
(-5 + 2(5 - 1)) \bmod 7 = 3$. Hence, $2s_{pub} = (1, 3) = (x_I, y_I)$.

Remarks: We can also observe numerically that $s_{key}$ and $r_{key}$ are indeed
the same, as expected.
Let us now do the encryption. $C_M = \{s_{priv}\, g,\; \text{Message} + s_{key}\}$, where $s_{key}
= r_{key}$. Therefore, $C_M = \{s_{priv}\, g,\; \text{Message} + s_{priv}\, r_{pub}\} = \{3 \times (3, 3),\; (3, 4) +
3 \times (1, 4)\}$. We have already calculated $3 \times (1, 4) = (1, 3)$ and $3 \times (3, 3) =
(5, 5)$. So, $C_M = \{(5, 5),\; (3, 4) + (1, 3)\}$. Here, the addition of $(x_J, y_J) = (3, 4)$
and $(x_K, y_K) = (1, 3)$, giving $(x_L, y_L)$, is needed.
• $\theta_{a,L} = \left(\frac{y_J - y_K}{x_J - x_K}\right) \bmod p_r = \left(\frac{4 - 3}{3 - 1}\right) \bmod 7 = \left(\frac{1}{2}\right) \bmod 7$. This gives $2\theta_{a,L}
= 1 \bmod 7 = 1$. Therefore, $\theta_{a,L} = 4$ as shown in Table 8.9. Therefore,
$x_L = (\theta_{a,L}^2 - x_J - x_K) \bmod p_r = (4^2 - 3 - 1) \bmod 7 = 5$ and $y_L =
(-y_J + \theta_{a,L} (x_J - x_L)) \bmod p_r = (-4 + 4(3 - 5)) \bmod 7 = -12 \bmod 7 =
(-12 + 7 \times 2) \bmod 7 = 2$. Hence, $(x_L, y_L) \equiv (5, 2)$.

TABLE 8.9: Values of 2θa,L mod 7

θa,L           1   2   3   4
2θa,L          2   4   6   8
2θa,L mod 7    2   4   6   1

TABLE 8.10: Values of 2θa,T mod 7

θa,T           1   2   3
2θa,T          2   4   6
2θa,T mod 7    2   4   6
Hence, the sender sends the ciphertext $C_M = \{(5, 5), (5, 2)\}$ to the receiver.
In order to decrypt the message, we follow the steps below:
We know that Message $= (\text{Message} + s_{priv}\, r_{pub}) - r_{priv}\, s_{priv}\, g = (5, 2) - 2 \times
(5, 5) = (5, 2) - (1, 3)$, where we use the already computed $2 \times (5, 5) = (1, 3)$.
Now, we subtract $(1, 3)$ from $(5, 2)$. Using the expression for negation, we can
write: $(5, 2) + (1, -3) = (5, 2) + (1, -3 \bmod 7) = (5, 2) + (1, (-3 +
7) \bmod 7) = (5, 2) + (1, 4)$. We now need to add these points. Denote
$(x_O, y_O) = (5, 2)$ and $(x_S, y_S) = (1, 4)$. The sum is denoted by $(x_T, y_T)$.

• $\theta_{a,T} = \left(\frac{y_O - y_S}{x_O - x_S}\right) \bmod p_r = \left(\frac{2 - 4}{5 - 1}\right) \bmod 7 = \left(\frac{-1}{2}\right) \bmod 7$. That is, $2\theta_{a,T}
= (-1 + 7) \bmod 7 = 6$. Therefore, $\theta_{a,T} = 3$ as shown in Table 8.10. Now,
$x_T = (\theta_{a,T}^2 - x_O - x_S) \bmod p_r = (3^2 - 5 - 1) \bmod 7 = 3$ and $y_T =
(-y_O + \theta_{a,T} (x_O - x_T)) \bmod p_r = (-2 + 3(5 - 3)) \bmod 7 = 4$.

Hence, Message ≡ (3, 4). We get the decrypted message which is the orig-
inal encoded message.
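The whole worked example can be replayed in a few lines of code. The following Python sketch follows Equations 8.22 to 8.24 on the toy curve with the example's values (g = (3, 3), Message = (3, 4), private keys 3 and 2). The helper names `ec_add`, `ec_neg`, `scalar_mult`, `encrypt`, and `decrypt` are assumptions introduced for illustration, `None` stands for the point at infinity, and the scheme is the simplified one of this section rather than a production cipher.

```python
A_PARAM, P_MOD = 1, 7        # toy curve y^2 = x^3 + x mod 7
G = (3, 3)                   # generator or base point


def ec_add(P, Q, a=A_PARAM, p=P_MOD):
    """Point addition and doubling modulo p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (xP, yP), (xQ, yQ) = P, Q
    if xP == xQ and (yP + yQ) % p == 0:
        return None
    if P == Q:
        theta = (3 * xP * xP + a) * pow((2 * yP) % p, -1, p) % p
    else:
        theta = (yQ - yP) * pow((xQ - xP) % p, -1, p) % p
    xR = (theta * theta - xP - xQ) % p
    return (xR, (-yP + theta * (xP - xR)) % p)


def ec_neg(P, p=P_MOD):
    """Negation, Equation 8.16."""
    return None if P is None else (P[0], -P[1] % p)


def scalar_mult(n, P):
    """n * P by double-and-add."""
    result, addend = None, P
    while n > 0:
        if n & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        n >>= 1
    return result


def encrypt(message, s_priv, r_pub):
    """C_M = {s_priv * g, Message + s_priv * r_pub}, Equation 8.22."""
    return (scalar_mult(s_priv, G), ec_add(message, scalar_mult(s_priv, r_pub)))


def decrypt(cipher, r_priv):
    """Second component minus r_priv times the first, Equation 8.23."""
    first, second = cipher
    return ec_add(second, ec_neg(scalar_mult(r_priv, first)))


s_priv, r_priv = 3, 2
r_pub = scalar_mult(r_priv, G)
cipher = encrypt((3, 4), s_priv, r_pub)
print(cipher)                    # ((5, 5), (5, 2))
print(decrypt(cipher, r_priv))   # (3, 4)
```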

8.4 Blockchain
Blockchain was introduced in 2008 by Satoshi Nakamoto, an alias of the
person or group of people who developed bitcoin. He/they authored the
bitcoin white paper titled ‘Bitcoin: A Peer-to-Peer Electronic Cash System’.
Let us understand the motivation behind blockchain technology.
Conventional bank services have several drawbacks. Consider the example of
sending money from one country to another: opening a bank account is a
tedious process, the transaction fee is high, online scams make the user
vulnerable, the bank service is not available 24 × 7, there is a transfer limit
per day, and tampering with the data of a bank cannot be rolled back.
We are forced to trust a central authority, that is, a bank.
These problems can be solved using blockchain technology.

FIGURE 8.5: Illustration of a Blockchain
In a centralized architecture, there is a single copy of the information,
and the maintenance cost is high. In contrast, a decentralized network
distributes the information to all nodes. Blockchain technology decentralizes
the entire network. Therefore, data can be recovered in the blockchain, since
other nodes hold copies of it. Note that a node refers to a smart device or
computer connected to the network. Data cannot be tampered with unnoticed
in blockchain technology, as it is cryptographically secured. Let us discuss
some terminology to understand blockchain technology.

8.4.1 Bitcoin
Bitcoin is a digital currency used for electronic payment from a sender to
a receiver. Blockchain is the underlying technology that makes this possible.
There are about 1600 digital currencies. Bitcoin can be considered an
application of blockchain technology, which can itself be used in a multitude
of use cases.
Blockchain is a distributed digital ledger of immutable records. Immutable
means that the data cannot be altered once it is recorded. A blockchain is
composed of cryptographically linked blocks, like the linked list data
structure with the hash acting as a pointer. That is, there is a chain of blocks
in chronological order. This is illustrated in Figure 8.5.

8.4.2 Hash
A hash function converts input data of any size to a fixed-size string of
alphanumeric characters, called the hash. The hash represents each block
uniquely. The first block has the previous hash “0000....” as there is no
preceding block. The “previous hash” of the current block is the “Hash” of
the previous block.
Hash Function and Its Properties: The blockchain performs
cryptography using a hash function. Given an input data, the corresponding
hash value is generated. For example, if the input data is “We learn
blockchain”, the corresponding hash value¹ is

2a37723e5882bbced70103fc3e17b6b3d97bb5f36d55e52f25451e1aaeff1b2b.

There are 64 digits in the hexadecimal representation, or 32 bytes in the
binary representation. Each hexadecimal digit maps to 4 bits. Therefore,
the total size is 64 × 4 = 256 bits. The input can be of any size, and the
output has a fixed size of 256 bits for the SHA-256 hashing algorithm. SHA
stands for Secure Hash Algorithm. The properties of the hash function are
as follows:

• It is a one-way function, that is, given the input, the hash value can be
computed easily. However, given the hash value, it is very difficult to
produce the input back. This is like a trapdoor function in ECC.
• It is deterministic, that is, for the same input, the output is also the
same.
• A small change in the input produces a big change in the output.
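The properties listed above can be observed with Python's standard `hashlib` module. The following sketch hashes plain UTF-8 text with SHA-256; the helper name `sha256_hex` and the second input string are illustrative assumptions. The digest printed for the example sentence may differ from the value quoted in the text if the tool used there hashes additional fields, so no expected value is claimed for it here.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of a text as 64 hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


h1 = sha256_hex("We learn blockchain")
h2 = sha256_hex("We learn blockchain")     # same input again
h3 = sha256_hex("We learn blockchains")    # one extra character

# Fixed output size: 64 hexadecimal digits = 256 bits, whatever the input.
print(len(h1), len(sha256_hex("x" * 10000)))

# Deterministic: the same input always gives the same hash.
print(h1 == h2)

# A small change in the input changes many of the 64 digits.
changed = sum(c1 != c3 for c1, c3 in zip(h1, h3))
print(changed, "of 64 hexadecimal digits differ")
```

The one-way property cannot be demonstrated by running code; it rests on the absence of any known practical method to invert SHA-256.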

8.4.3 Blockchain and Its Versions


The blockchain is shared among all the users of the network. Each user
has the same copy of it and can see the history of transactions, going back to
the first block that was added to the blockchain network. The first block is
also called the genesis block. There are three versions of blockchain.
Blockchain 1.0 is used in digital currency, like bitcoin or other
cryptocurrencies, in a decentralized system. Blockchain 2.0 is used in smart
contracts and the transfer of stocks and bonds. Blockchain 3.0 is used in
government organizations, hospitals, and many more.
Blocks in Blockchain: The header of the block shown in Figure 8.6 has
the following fields:

• Version number: Sequence number of the block.


• Previous hash: Hash of the previous block to which the current block
is linked.
• Merkle root hash: Hashes of all transactions are structured in binary
Merkle tree. The root of the tree is called Merkle root hash.
• Timestamp: When the block is verified and mined.
• Nonce: A random number used to create the hash.
• Target: The generated hash should be less than the target set by the
network. For example, there should be 3 leading zeros.

¹ We refer to https://www.anders.com/blockchain for generating the hash value.

FIGURE 8.6: Header of a block

8.4.4 Miner
There are three inputs to the SHA-256 cryptographic hashing algorithm,
namely, the previous hash, the Merkle root hash, and the nonce, as shown in
Figure 8.7. The miner varies the nonce, while the previous hash and Merkle
root remain the same, to generate a hash value lower than the target. If the
hash value meets the target set by the network, then we stop. Otherwise, the
process is repeated. This is called the proof of work consensus algorithm. In
a consensus algorithm, all the nodes of the network agree on the same version
of the facts.
The miner verifies and validates the transactions and adds the block. The
first miner who gets a hash that meets the target is rewarded. Since the
miner invests resources and computing power, he or she is rewarded for the
same in bitcoin or other forms of remuneration. The miner gets a block
reward, which was 12.5 bitcoins until May 2020 and 6.25 bitcoins afterwards,
plus the sum of the transaction fees of that block. The block reward is
halved every 210,000 blocks, which is approximately 4 years. Note that 1
bitcoin was equivalent to USD 51333 or 37.76 lakhs in rupees in May 2021. The
miner also verifies that the sender has a sufficient amount to be
transferred. The target is set by the network, not by the miner; in Bitcoin
it is readjusted every 2016 blocks, which is roughly two weeks. It is
difficult to find a nonce that satisfies the target, but it is easy for other
miners to verify it.
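The nonce search described above can be sketched as follows. This is an illustrative toy, not Bitcoin's actual header format: the concatenation order of the three inputs, the placeholder previous hash, and the use of "leading zeros" as the target (as in the example of the header fields) are assumptions.

```python
import hashlib

def block_hash(previous_hash: str, merkle_root: str, nonce: int) -> str:
    """Hash the three header inputs named in the text with SHA-256."""
    header = f"{previous_hash}{merkle_root}{nonce}"   # assumed concatenation order
    return hashlib.sha256(header.encode("utf-8")).hexdigest()

def mine(previous_hash: str, merkle_root: str, leading_zeros: int = 3) -> tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash starts with enough zeros."""
    prefix = "0" * leading_zeros
    nonce = 0
    while True:
        digest = block_hash(previous_hash, merkle_root, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

prev = "00" * 32                                   # placeholder previous hash (hypothetical)
root = hashlib.sha256(b"A pays B 5").hexdigest()   # stands in for a Merkle root
nonce, digest = mine(prev, root, leading_zeros=3)
print(nonce, digest)

# Verification by another miner needs a single hash computation.
print(block_hash(prev, root, nonce) == digest)
```

With 3 leading hexadecimal zeros, about 16^3 = 4096 attempts are needed on average. Each additional zero multiplies the expected work by 16, while verification still costs one hash.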

FIGURE 8.7: SHA-256 cryptographic hashing algorithm


224 Fundamentals of Internet of Things

FIGURE 8.8: Confidentiality and authentication in digital signature

8.4.5 Tamper-Proof Blockchain


Let us consider that the attacker tampers with the data of block number 3
of Figure 8.5. Subsequently, the hash value of this block changes. However,
the “previous hash” value stored in block 4 remains intact, as there is no
tampering in block number 4. Consequently, the changed hash value of block
number 3 and the previous hash value of block number 4 do not match. This
makes the following blocks invalid. Hence, tampering can be detected in a
blockchain network. Notably, each user in the blockchain network has the
same copy of the ledger. Ledger refers to the history of all transactions.
To hide the tampering, the attacker must invest a lot of computing power
and resources to recalculate the hash of block 3. The “previous hash” of
block 4 must then be updated to match the recalculated hash of the tampered
block 3, which in turn changes the hash of block 4. Subsequently, the
attacker needs to recalculate the hash for block 4, then block 5, and so on.
This requires a lot of computing power and resources, while the network is
still progressing. This is nearly an impossible task in blockchain
technology. Hence, this makes blockchain technology tamper-proof.
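The detection argument can be traced with a small sketch. It assumes a simplified block made of a number, a data string, and the previous hash, and it leaves out the nonce and mining; all names are illustrative.

```python
import hashlib

def block_hash(index: int, data: str, previous_hash: str) -> str:
    return hashlib.sha256(f"{index}{data}{previous_hash}".encode("utf-8")).hexdigest()

def build_chain(records: list[str]) -> list[dict]:
    chain, previous_hash = [], "0" * 64      # genesis block links to all zeros
    for index, data in enumerate(records, start=1):
        digest = block_hash(index, data, previous_hash)
        chain.append({"index": index, "data": data,
                      "previous_hash": previous_hash, "hash": digest})
        previous_hash = digest
    return chain

def first_invalid_block(chain: list[dict]):
    """Return the number of the first block that fails validation, or None."""
    previous_hash = "0" * 64
    for block in chain:
        recomputed = block_hash(block["index"], block["data"], block["previous_hash"])
        if block["previous_hash"] != previous_hash or block["hash"] != recomputed:
            return block["index"]
        previous_hash = block["hash"]
    return None

chain = build_chain(["tx a", "tx b", "tx c", "tx d", "tx e"])
print(first_invalid_block(chain))            # None: the chain is consistent

chain[2]["data"] = "tx c (tampered)"         # attacker edits block 3
print(first_invalid_block(chain))            # 3: stored hash no longer matches

# Attacker recomputes the hash of block 3; now the link from block 4 is broken.
chain[2]["hash"] = block_hash(3, chain[2]["data"], chain[2]["previous_hash"])
print(first_invalid_block(chain))            # 4
```

Each repair only moves the mismatch one block further, which is why the attacker has to redo the work for every later block.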

8.4.6 Role of Digital Signature in Blockchain


Confidentiality is the state of keeping information secret or private. It
is achieved by encrypting with the receiver's public key and decrypting with
the receiver's private key, as shown in Figure 8.8(a). Authentication, on the
other hand, is the process of showing that something is legitimate. It is
achieved using a digital signature, as shown in Figure 8.8(b). A digital
signature is equivalent to a handwritten signature, but it is more secure.
In authentication, the sender encrypts with his or her private key, and the
receiver needs the sender's public key, not a private key, to decrypt. This
assures the receiver that the message is from the authentic sender. As shown
in Figure 8.8(c), we can achieve both confidentiality and authentication by
encrypting twice at the sender and decrypting twice at the receiver.

FIGURE 8.9: Digital signature

We can also use the SHA-256 hash function in the blockchain. First, the
original message is passed through the SHA-256 hash function, which gives a
string of alphanumeric characters. Subsequently, the hashed message is
encrypted with the private key of the sender to get a digital signature, as
shown in Figure 8.9. At the receiver, the digital signature is decrypted with
the sender’s public key to recover the hashed message. If this hashed message
and the hash obtained by directly applying the hash function to the received
data are the same, then the message was not altered during transmission and
comes from the legitimate sender.
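The hash-then-sign procedure can be sketched with a toy RSA key pair. This is an assumption made purely for illustration: the key is built from two Mersenne primes and is far too small to be secure, and Bitcoin in practice signs with elliptic curve keys (ECDSA) rather than RSA. The function names are hypothetical.

```python
import hashlib

# Toy RSA key pair (assumption, for illustration only; not secure).
# 2**61 - 1 and 2**89 - 1 are Mersenne primes.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                                   # public modulus
E = 65537                                   # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))           # private exponent (needs Python 3.8+)

def message_hash(message: str) -> int:
    """SHA-256 of the message as an integer reduced modulo N."""
    return int(hashlib.sha256(message.encode("utf-8")).hexdigest(), 16) % N

def sign(message: str) -> int:
    """Sender: encrypt the hash with the private key."""
    return pow(message_hash(message), D, N)

def verify(message: str, signature: int) -> bool:
    """Receiver: decrypt with the public key and compare with a fresh hash."""
    return pow(signature, E, N) == message_hash(message)

signature = sign("A pays B 5 bitcoin")
print(verify("A pays B 5 bitcoin", signature))    # True
print(verify("A pays B 50 bitcoin", signature))   # False
```

Only the holder of the private exponent D can produce a signature that verifies, while anyone who knows the public pair (N, E) can check it.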

8.4.7 Transaction in Blockchain


The transaction details are the amount deducted from the sender and the
amount credited to the receiver. There is no actual transfer of digital
currency. Only the transaction is recorded in the ledger of the blockchain.
The transaction detail is passed through the SHA-256 cryptographic hashing
function. We know that the private key is kept secret by the user, while the
public key is shared with all the Bitcoin users. The transaction detail is
signed using the user’s private key. The sender broadcasts the signed
transaction, with the amount of bitcoin to be transferred, to the Bitcoin
network. The miner receives the transaction, verifies, and validates it. The
miner then includes it in a block and propagates it to other Bitcoin users.
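Since only records are stored, a balance is obtained by replaying the ledger. The sketch below uses a simple account model with made-up users and a hypothetical "reward" source account. Bitcoin itself tracks unspent transaction outputs rather than account balances, so this shows only the idea of the sufficient-amount check.

```python
from collections import defaultdict

def balances(ledger: list[tuple[str, str, int]]) -> dict[str, int]:
    """Replay (sender, receiver, amount) records to get each user's balance."""
    totals = defaultdict(int)
    for sender, receiver, amount in ledger:
        totals[sender] -= amount      # deducted amount for the sender
        totals[receiver] += amount    # credited amount for the receiver
    return dict(totals)

def is_valid(ledger, sender: str, amount: int) -> bool:
    """A miner-style check: the sender must own at least the amount sent."""
    return amount > 0 and balances(ledger).get(sender, 0) >= amount

# "reward" is a hypothetical source account for newly issued coins.
ledger = [("reward", "Alice", 50), ("Alice", "Bob", 20)]

print(balances(ledger)["Alice"])          # 30
print(is_valid(ledger, "Alice", 30))      # True
print(is_valid(ledger, "Bob", 25))        # False: Bob only holds 20
```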

8.4.8 Distributed Ledger


Distributed ledger is a feature of blockchain. All users have the same copy
of the ledger. A blank ledger with no data input also has some hash. Each row
is a blockchain, where the “previous hash” of the current block is the “hash”
of the previous block. The combination of rows and columns makes it a
distributed ledger, as shown in Figure 8.10. Note that each row is the same
in terms of all parameters. If we write something in any block of some peer,
that block and all subsequent blocks of that peer turn pink (see footnote 2).
This is because the change propagates through the “hash” and “previous hash”
fields like a linked list. Now, clicking on mine for that block changes the
nonce and regenerates the hash such that it has 4 leading zeros to meet the
target. This makes that block green, and the process is repeated for the
subsequent blocks of that row. Notably, modification of a block makes all
subsequent blocks invalid, and these need to be re-mined.

FIGURE 8.10: Distributed ledger

For example, when we write in the data field of block 3 (say) of peer A,
this generates a hash value that is different from that of block 3 of peer B
or C. Note that the hash values of block 3 of peers B and C are the same, as
we have not modified their data fields. Hence, the data modification of a
block can be easily detected. This would not be possible if we had only a
single copy of the ledger. All the users of the network have the history of
transactions since the genesis block, which is the first block of the
network. Even if a node (computer) gets corrupted, we do not lose the data
because the same copy is with other nodes.
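The comparison among peers can be sketched as follows, assuming three peers with non-empty ledgers and a simplified block that holds only a data string. The peer names follow the example above; the function names are illustrative.

```python
import hashlib
from collections import Counter

def ledger_hashes(blocks: list[str]) -> list[str]:
    """Chain the blocks: each hash covers the data and the previous hash."""
    hashes, previous = [], "0" * 64
    for data in blocks:
        previous = hashlib.sha256((previous + data).encode("utf-8")).hexdigest()
        hashes.append(previous)
    return hashes

def find_tampered_peers(peers: dict[str, list[str]]) -> list[str]:
    """Peers whose final hash differs from the one held by most peers."""
    final = {name: ledger_hashes(blocks)[-1] for name, blocks in peers.items()}
    majority_hash, _ = Counter(final.values()).most_common(1)[0]
    return sorted(name for name, h in final.items() if h != majority_hash)

original = ["tx 1", "tx 2", "tx 3", "tx 4"]
peers = {"A": list(original), "B": list(original), "C": list(original)}
peers["A"][2] = "tx 3 (modified)"        # write into block 3 of peer A

print(find_tampered_peers(peers))        # ['A']
```

Comparing only the final hash is enough here, because a change in any block alters every later hash of that peer.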

8.4.9 Byzantine Generals Problem


Any modification to the blockchain must be approved by a majority of the
users of the network. This is because a change made to the blockchain is
permanent and immutable once recorded. This makes the network decentralized,
with no central authority like a bank. Let us assume that there are four
generals, out of whom three are loyal and one is a traitor, as shown in
Figure 8.11. The loyal generals give the command to attack, while the traitor
gives the command not to attack. The generals are not in one place, so they
communicate among themselves. The goal is victory even if a minority of the
generals are traitors.
2 Refer to https://andersbrownworth.com/blockchain/distributed for demo

FIGURE 8.11: Byzantine general problem

In this scenario, each loyal general receives 2 commands to attack and 1
command not to attack. The majority of the commands are to attack; therefore,
they attack the enemy. This is called Byzantine fault tolerance. On similar
lines, in blockchain networks, the generals are the nodes and the messages
are the transactions. The attack command is the legitimate transaction, while
the command not to attack is the invalid transaction. We arrive at a
consensus even in the presence of some malicious nodes.
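The majority decision of the four generals can be traced in a few lines. This sketch assumes the simplest case, in which the traitor sends the same command to everyone; the general names G1 to G4 are made up.

```python
from collections import Counter

def decide(received_commands: list[str]) -> str:
    """A general follows the command that the majority of messages carry."""
    command, _ = Counter(received_commands).most_common(1)[0]
    return command

generals = {"G1": "attack", "G2": "attack", "G3": "attack", "G4": "do not attack"}
loyal = ["G1", "G2", "G3"]               # G4 is the traitor

for name in loyal:
    # Each loyal general hears from the other three generals.
    received = [cmd for other, cmd in generals.items() if other != name]
    print(name, received, "->", decide(received))
```

Every loyal general sees two "attack" messages against one "do not attack" message and therefore reaches the same decision.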

8.4.10 Transaction Pool and Candidate Block


The transaction pool stores all the unverified transactions before they
are included in a new block. A candidate block is a temporary current block
that does not yet have a valid proof of work. It is made up of transactions
selected from the transaction pool. Suppose miner 1 picks candidate block
number 5 (say) and starts mining it. In the meantime, some other miner, say
miner 2, might get a hash that meets the target. Block number 5 is then mined
by miner 2, who gets the reward, before miner 1 completes, and the candidate
block of miner 1 is left without a valid proof of work. Now, miner 1 starts
competing for block number 6 by picking unverified transactions from the
transaction pool.
Tie in Generating Hashes or Adding Blocks by Miners: A tie is a rare
event; normally, we choose the miner who first finds a hash that meets the
target. If two miners get hashes that meet the target at the same time, both
add their blocks, and the blockchain temporarily splits into two. However,
two blockchains must not run in parallel, and only one blockchain can be
kept. We choose the blockchain on which the next block is added first and
verified by other users in the network. Say miner 3 adds the next block to
the second blockchain; then it is accepted as the dominant blockchain. The
first blockchain is discarded, as shown in Figure 8.12.

FIGURE 8.12: Tie in generating hashes
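The tie-breaking rule can be sketched as picking the chain that was extended first. The sketch below compares chain lengths, which is a simplification: Bitcoin compares the accumulated proof of work rather than the raw block count. The block labels are hypothetical.

```python
def dominant_chain(chains: list[list[str]]) -> list[str]:
    """Pick the chain with the most blocks; the first one listed wins a tie."""
    return max(chains, key=len)

common = ["block 1", "block 2", "block 3", "block 4"]
chain_1 = common + ["block 5 (miner 1)"]
chain_2 = common + ["block 5 (miner 2)"]

# Miner 3 extends the second chain first.
chain_2 = chain_2 + ["block 6 (miner 3)"]

winner = dominant_chain([chain_1, chain_2])
print(winner[-1])        # block 6 (miner 3)
```

The transactions of the discarded block are not lost in practice; those not already in the dominant chain can return to the transaction pool and be mined again.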

8.4.11 Fork
A fork occurs when the blockchain diverges into a different chain with
different rules or protocols because the users of the network are not in
agreement. Some of the improvements to be made are minor, while some are
major. There are two types of forks.

• Soft fork: The new version works well with the old version, that is,
there is backward compatibility. Hence, there is flexibility.
• Hard fork: This is rigid as there are different rules for different chains
and the users have to choose one. Both chains are valid and cannot be
discarded. Some users support the old chain, while others the new chain.

Example: We have different versions of the same coin, for instance, bitcoin
(BTC) and bitcoin cash (B cash or BCH). Initially, we had a block size of 1
MB with a smaller number of transactions. With the popularity of Bitcoin, the
number of transactions has increased. Therefore, bitcoin cash has used a
larger block size of 8 MB since August 2017.

8.4.12 Bitcoin versus Ethereum


There are more than 1600 cryptocurrencies, for example, Bitcoin, Bitcoin
cash, Ethereum, Litecoin, and many more. Ether, the currency of the Ethereum
network, is used for peer-to-peer transactions like bitcoin. In addition, it
is used for the creation and execution of smart contracts on a decentralized
network. A transaction in the Bitcoin network is slow and takes about 10
minutes. In the Ethereum network, it is fast and takes about 15 seconds.
Bitcoin uses the SHA-256 cryptographic hashing algorithm, while Ethereum uses
the Ethash algorithm.

Proof of stake is used in the Ethereum network. One gets the chance of
validating a block based on the number of coins he or she owns. The more the
stake, the more the chance and mining power. The one with more stake is
likely to perform genuine validation; otherwise, he or she loses the stake if
the block is invalidated by the rest of the users. The miner gets a reward of
3 ETH and the sum of the transaction fees for the validation of the block.
The reward does not get halved, unlike in the Bitcoin network.
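The stake-weighted chance can be illustrated with a weighted random draw. This is only a sketch of the selection idea, not Ethereum's actual validator selection; the validator names, stakes, and seed are made up.

```python
import random

def pick_validator(stakes: dict[str, float], rng: random.Random) -> str:
    """Choose one validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

stakes = {"Asha": 60.0, "Ben": 30.0, "Chen": 10.0}   # hypothetical stakes in ETH
rng = random.Random(2021)                            # fixed seed for repeatability

wins = {name: 0 for name in stakes}
for _ in range(10_000):
    wins[pick_validator(stakes, rng)] += 1
print(wins)    # counts roughly in the ratio 60 : 30 : 10
```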

8.4.13 Some Remarks on Blockchain Technology


• It is not clear whether Satoshi Nakamoto is the name of a person or
the alias of a group. Nobody knows if Satoshi Nakamoto is alive or dead.
The developers used to communicate electronically through email or the
BitcoinTalk forum created by Nakamoto. Satoshi Nakamoto is believed to
own about 1 million bitcoins. He disappeared from the Internet after
handing over the source code of the project in 2010.
• The word bitcoin refers to the unit of cryptocurrency in a transaction,
while Bitcoin refers to the protocol, which is a ledger that stores
information related to transactions.
• In May 2021, 1 bitcoin was equivalent to USD 51333 or 37.76 lakhs in
Indian rupees, 1 bitcoin cash was equivalent to USD 1337 or 0.98 lakhs in
Indian rupees, and 1 ETH was equivalent to USD 3995 or 2.93 lakhs in
Indian rupees.
• The blockchain network is fault-tolerant and offers complete (100%)
availability. Blockchain is a decentralized and transparent network.
There is trust between neighboring nodes, which may be established
through gossiping. Elliptic curves are used to generate the
cryptographic keys.

8.4.14 Applications of Blockchain


• Money transaction: We know the drawbacks of a bank and the steps
involved in transactions between sender and receiver. With blockchain,
the transaction needs no central authority like a bank.
• Know your customer (KYC): We do not need to submit the KYC form
to each bank separately. Once the KYC is recorded in a blockchain
network, we can allow each bank to access the KYC form.
• Voting system: Each voter casts a single vote; voter identity
information and votes cannot be tampered with and are stored permanently.
• Healthcare: The medical data can be communicated more securely using
blockchain technology. The data cannot be tampered with or lost in
the communication channel. Importantly, there would be a single version
of the data. This removes duplication of medical data and saves a lot of
storage capacity and, of course, money.
• Insurance claim: The medical history of a patient can be stored in
immutable records. This prevents false claims by the patient.
• Smart contract: Based on terms and conditions set using code in the
Ethereum network, a smart contract can automatically negotiate a
parking fee, order repair parts for a machine and pay for them under
the agreement, negotiate among the different energy sources (solar,
wind, etc.) of a home in a smart grid system, and many more. A smart
contract can monitor flight status and automatically pay compensation
in case a flight gets delayed by some hours. The sensors in a vehicle
provide real-time conditions of the vehicle, and compensation is paid
automatically if the vehicle meets with an accident.
• Retail Management: The sensors collect the conditions of the product.
The product can be automatically tracked, and conditions at generation,
shipment, and arrival can be verified through history. The contract can
be executed between two vendors.
• Government Organizations: All the data can be made secure,
tamper-proof, and transparent among stakeholders, eliminating paper-based
records.
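The flight-delay case from the smart contract item above can be sketched as a rule that executes itself once its condition is met. Contracts on Ethereum are normally written in a contract language such as Solidity; the plain Python class below, with its hypothetical names, threshold, and compensation amount, only illustrates the logic.

```python
from dataclasses import dataclass, field

@dataclass
class FlightDelayContract:
    """Hypothetical terms: pay a fixed compensation once the delay threshold is hit."""
    passenger: str
    threshold_hours: float
    compensation: int
    paid: bool = False
    payments: list = field(default_factory=list)   # record of executed payments

    def report_status(self, delay_hours: float) -> bool:
        """Called with the monitored flight status; pays at most once."""
        if not self.paid and delay_hours >= self.threshold_hours:
            self.payments.append((self.passenger, self.compensation))
            self.paid = True
            return True
        return False

contract = FlightDelayContract(passenger="Diya", threshold_hours=3, compensation=100)
print(contract.report_status(1.5))   # False: delay below the agreed threshold
print(contract.report_status(4.0))   # True: compensation is paid automatically
print(contract.report_status(5.0))   # False: already paid
print(contract.payments)             # [('Diya', 100)]
```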

8.4.15 Disadvantages of Blockchain


There is a perception that blockchain technology is difficult to implement
and that only skilled blockchain architects and blockchain developers can do
this. Only 1% of organizations currently use this technology. It is also
believed that scalability is an issue: it takes longer to verify and validate
a block as the network size increases. The block size has also been increased
from 1 MB to 8 MB. Further, we need rules and regulations from governments or
organizations to govern its use.

8.5 Summary
In this chapter, we studied data privacy, elliptic curve cryptography, and
blockchain technology. Data anonymization preserves the privacy of the user.
The anonymized data must not disclose sensitive attributes like the name of
the disease and the income of the user. At the same time, it should retain
enough information to remain useful. Next, if the RSA algorithm uses a
3072-bit public key, then the ECC algorithm uses only a 256-bit public key
for a comparable level of security. The ECC algorithm uses a smaller key size
and is hence suitable for IoT networks. Finally, the data cannot be tampered
with in blockchain technology as it is cryptographically secured.

8.6 Exercises
1. Apply the k-anonymity and l-diversity anonymization techniques to the
dataset of Table 8.11.

TABLE 8.11: Dataset for k-anonymity and l-diversity anonymization


              Non Sensitive   Non Sensitive   Non Sensitive   Sensitive
Record No.    Name            Age             Area Code       Income (Lakh)
1             David           27              101             10
2             Diya            32              120             25
3             Krish           41              192             18
4             Priyansh        22              148             14

2. (a) What could be the original pin code if we generalize pin code in
k-anonymity as 844***?
(b) What could be the original gender if we generalize gender in k-
anonymity as *?
3. How does the key exchange happen in ECC when the receiver does not
know the sender?
4. Generate an elliptic curve ($y^2 = x^3 + ax + b$) over a finite field for
a = 1, b = −1 and pr = 5.
5. Crack the number of jumps using the brute-force method, trying the jumps
one by one, if we start from the point (2, 3) and end at the point (1, 4)
on the elliptic curve of Problem 4. This is possible here because nr is a
small number.
6. Carry out the encryption for the elliptic curve discussed in Problem 4
for g = (1, 4), Message = (3, 2). The private keys for sender and receiver
are 5 and 2, respectively.

7. Carry out the decryption for the elliptic curve discussed in Problem 4
for g = (1, 4), Message = (3, 2). The private keys for sender and receiver
are 5 and 2, respectively.
8. Do It Yourself: Study and modify the example of transactions of Token
available at https://andersbrownworth.com/blockchain/tokens
