SlideShare a Scribd company logo
Introduction to
Cryptography &
Cryptocurrencies
Module 2
Contents
● Cryptographic Hash Functions
● Hash Pointers and Data Structures
● Digital Signatures
● Public Keys as Identities
● A Simple Cryptocurreny
Introduction
• All currencies need some way to control supply and enforce various
security properties to prevent cheating.
• These security features raise the bar for an attacker, but they don’t
make money impossible to counterfeit.
• Ultimately, law enforcement is necessary for stopping people from
breaking the rules of the system.
• Cryptocurrencies too must have security measures that prevent
people from tampering with the state of the system
• If Alice convinces Bob that she paid him a digital coin, for example,
she should not be able to convince Carol that she paid her that same
coin.
• But unlike fiat (Govt issued currencies) currencies, the security
rules of cryptocurrencies need to be enforced purely technologically
and without relying on a central authority.
• Cryptocurrencies make heavy use of cryptography.
• Cryptography provides a mechanism for securely encoding the rules of a
cryptocurrency system in the system itself.
• It can be used to prevent tampering and equivocation (when a dishonest
actor provides conflicting information for the same message),
• It can be used to encode the rules for creation of new units of the currency
into a mathematical protocol
• Cryptography is a deep academic research field utilizing many
advanced mathematical techniques that are subtle and complicated
to understand.
• Fortunately, Bitcoin only relies on a handful of relatively simple
and well known cryptographic constructions.
‐
Cryptographic Hash Functions
• A hash function is a mathematical function with the following
three properties:
• Input can be any string of any size.
• Fixed size output. We will assume a 256 bit output size. (hash value or
‐
digest). Its unique digital "fingerprint" for the input data. Even if
the input is a small file or a large document, the hash function will
always produce an output of the same length.
• Efficiently computable. For a given input string, we can compute
output of the hash function in a reasonable amount of time. More
technically, computing the hash of an n bit string should have a running
‐
time that is O(n).
• Those properties define a general hash function, one that could
be used to build a data structure such as a hash table.
Note: Message Digest
Message Digest is used to ensure the integrity of a message
transmitted over an insecure channel (where the content of the
message can be changed). The message is passed through a
Cryptographic hash function. This function creates a
compressed image of the message called Digest.
Lets assume, Alice sent a message and digest pair to Bob. To
check the integrity of the message Bob runs the cryptographic
hash function on the received message and gets a new digest.
Now, Bob will compare the new digest and the digest sent by
Alice. If, both are same then Bob is sure that the original
message is not changed.
Blockchain Technology, Cryptography and cryptocurrencies Module2.pptx
● To provide authenticity of the message, digest is encrypted with sender’s
private key. Now this digest is called digital signature, which can be only
decrypted by the receiver who has sender’s public key. Now the receiver can
authenticate the sender and also verify the integrity of the sent message.
Example:
The hash algorithm MD5 is widely used to check the integrity of messages. MD5
divides the message into blocks of 512 bits and creates a 128 bit digest(typically, 32
Hexadecimal digits). It is no longer considered reliable for use as researchers have
demonstrated techniques capable of easily generating MD5 collisions on
commercial computers.
The weaknesses of MD5 have been exploited by the Flame malware in 2012.
In response to the insecurities of MD5 hash algorithms, the
Secure Hash Algorithm (SHA) was invented.
Example
Let’s say you have a password like "mypassword123". A hash function will
take that password and convert it into something like
"5f4dcc3b5aa765d61d8327deb882cf99" (this is a real hash of the word
"password" using a hash function called MD5). This hash is a string of letters
and numbers that represents your password.
● If someone tries to hack into your account, they won’t see your real
password, but rather this hashed version.
● Even if they know the hash value, they won’t easily be able to reverse it and
figure out the original password.
Cryptographic Hash Functions
• For a hash function to be cryptographically secure, we require
that it has the following three additional properties:
(1) collision resistance,
‐
(2) hiding,
(3) puzzle friendliness.
‐
Property 1: Collision Resistance
A collision occurs when two distinct inputs produce the same
output. A hash function H(·) is collision resistant if nobody can
find a collision. In other words,
Nobody can find x and y such that
x!=y and H(x)=H(y)
Notice that it was said “nobody can find” a collision
But we did not say that no collisions exist.
Actually, collisions exist for any hash function.
The input space to the hash function contains all strings of all
lengths, yet the output space contains only strings of a specific
fixed length.
Because the input space is larger than the output space (indeed,
the input space is infinite, while the output space is finite), there
must be input strings that map to the same output string.
Blockchain Technology, Cryptography and cryptocurrencies Module2.pptx
Consider the following simple method for finding a collision for a
hash function with a 256-bit output size:
pick 2^256 + 1 distinct values,
compute the hashes of each of them, and
check whether any two outputs are equal.
Since we picked more inputs than possible outputs, some pair of
them must collide when you apply the hash function
The method above is guaranteed to find a collision.
But if we pick random inputs and compute the hash values, we’ll
find a collision with high probability long before examining 2 256 +
1 inputs.
In fact, if we randomly choose just 2^130 + 1 inputs, it turns out
there’s a 99.8 percent chance that at least two of them are going
to collide.
That we can find a collision by examining only roughly the square
root of the number of possible outputs results from a phenomenon
in probability known as the birthday paradox.
Birthday Paradox
It refers to the surprising result that in a group of just 23 people, the
probability of at least two people sharing the same birthday is more than
50%. This result is counterintuitive because most people expect a much
larger group for this to be likely.
In probability terms, the Birthday Paradox shows that the probability of
shared birthdays rises much faster than our intuition suggests. Even in
relatively small groups, the chance of a shared birthday becomes significant
because of the number of potential pairs.
This collision-detection algorithm works for every hash function.
But, the problem is that it takes a very long time to do.
For a hash function with a 256-bit output, you would have to
compute the hash function 2^256 + 1 times in the worst case,
and about 2^128 times on average.
That’s of course an astronomically large number—if a computer
calculates 10,000 hashes per second, it would take more than
one octillion (10^27) years to calculate 2^128 hashes!
A general but impractical algorithm to find a collision for any hash
function.
Consider, for example, the following hash function:
H(x)= x mod 2^256
This function meets our requirements of a hash function, and is
efficiently computable. It is also an efficient method for finding a
collision.
Is there a faster way to find collisions?
For some possible H’s, yes.
For others, we don,t know of one.
No hash functions have been proven to be collision resistant.
The cryptographic hash functions that we rely on in practice are
just functions for which people have tried really, really hard to find
collisions and haven’t yet succeeded.
And so we choose to believe that those are collision resistant.
(In some cases, such as the hash function known as MD5,
collisions were eventually found after years of work, resulting in
the function being deprecated and phased out of practical use.)
Application of Collision Resistance
Hash as Message Digest:
If we know that two inputs x and y to a collision-resistant hash function H
are different, then it’s safe to assume that their hashes H(x) and H(y) are
different.
This argument allows us to use hash outputs as a message digest.
To recognize a file we saw before, just remember it's hash.
Useful becoz hash is small (only 256 bits).
Consider SecureBox, an authenticated online file storage
system that allows users to upload files and to ensure their
integrity when they download them.
Collision-resistant hashes provide an elegant and efficient
solution to this problem. Alice just needs to remember the hash
of the original file. When she later downloads the file from
SecureBox, she computes the hash of the downloaded file and
compares it to the one she stored. If the hashes are the same,
then she can conclude that the file is indeed the same one she
uploaded, but if they are different, then Alice can conclude that
the file has been tampered with.
Property 2: Hiding
The hiding property asserts that if we’re given the output of the hash
function y = H(x), there’s no feasible way to figure out what the input, x,
was.
In other words,
Given H(x), it is infeasible to find x.
The problem is that this property does not exactly hold.
For ex: If the result of the coin flip was heads, we’re going to announce the
hash of the string “heads.” If the result was tails, we’re going to announce
the hash of the string “tails.”
In response, they would simply compute both the hash of the string “heads”
and the hash of the string “tails,” and they could see which one they were
given. And so, in just a couple steps, they can figure out what the input was.
So, its easy to find x.
To be able to achieve the hiding property, there must be no value
of x that is particularly likely.
That is, x has to be chosen from a set that is, in some sense,
very spread out.
Can we achieve the hiding property? “Yes”
We can hide even an input that’s not spread out by concatenating
it with another input that is spread out.
Hiding
A hash function H is said to be hiding if when a secret value r is
chosen from a probability distribution that has high min-entropy,
then, given H(r x), it is infeasible to find x.
‖
In information theory, min-entropy is a measure of how predictable an
outcome is, and
High min-entropy means that the distribution (i.e., of a random
variable) is very spread out.
When we sample from the distribution, there’s no particular value
that’s likely to occur. So, for a concrete example, if r is chosen
uniformly from among all strings that are 256 bits long, then any
particular string is chosen with probability 1/2^256 , which is an
infinitesimally small value (no particular value is chosen with more
than negligible property.
APPLICATION: COMMITMENTS
A commitment is the digital analog of taking a value, sealing it in an
envelope, and putting that envelope out on the table where everyone
can see it.
When you do that, you’ve committed yourself to what’s inside the
envelope.
But you haven’t opened it, so even though you’ve committed to a value,
the value remains a secret from everyone else.
Later, you can open the envelope and reveal the value that you
committed to earlier.
In other words,
Want to “ seal a value in an envelope”, and
“open the envelope later”
Commit to a value. reveal it later.
Commitment scheme
A commitment scheme consists of two algorithms:
• com := commit(msg, nonce) The commit function takes a
message and secret random value, called a nonce, as input
and returns a commitment.
• verify(com, msg, nonce) The verify function takes a
commitment, nonce, and message as input. It returns true if
com == commit(msg, nonce) and false otherwise.
Note: Every time you commit to a value, it is important that you
choose a new random value nonce.
In cryptography, the term nonce is used to refer to a value that can
only be used once.
We require that the following two security properties hold:
• Hiding: Given com, it is infeasible to find msg.
• Binding: It is infeasible to find two pairs (msg, nonce) and
(msg′, nonce ′) such that msg ≠ msg′ and commit(msg, nonce)
== commit(msg′, nonce′).
To use a commitment scheme, we first need to generate a
random nonce. We then apply the commit function to this nonce
together with msg, the value being committed to, and we publish
the commitment com. This stage is analogous to putting the
sealed envelope on the table.
At a later point, if we want to reveal the value that we committed
to earlier, we publish the random nonce that we used to create
this commitment, and the message, msg. Now anybody can
verify that msg was indeed the message committed to earlier.
This stage is analogous to opening the envelope.
The two security properties dictate that the algorithms actually
behave like sealing and opening an envelope.
• First, given com, the commitment, someone looking at the
envelope can’t figure out what the message is.
• The second property is that it’s binding. This ensures that
when you commit to what’s in the envelope, you can’t change
your mind later. That is, it’s infeasible to find two different
messages, such that you can commit to one message and
then later claim that you committed to another.
To implement commitment scheme,we make use of cryptographic
hash function.
commit(msg, nonce) := H(nonce msg), where nonce is a
‖
random 256-bit value
• To commit to a message, we generate a random 256-bit
nonce. Then we concatenate the nonce and the message and
return the hash of this concatenated value as the commitment.
• To verify, someone will compute this same hash of the nonce
they were given concatenated with the message. And they will
check whether the result is equal to the commitment that they
saw.
If we substitute the instantiation of commit and verify as well as
H(nonce msg) for com, then these properties become:
‖
• Hiding: Given H(nonce msg), it is infeasible to find msg.
‖
• Binding: It is infeasible to find two pairs (msg, nonce) and (msg
′, nonce ′) such that msg ≠ msg′ and H(nonce msg) == (nonce′
‖
msg′).
‖
Therefore, if H is a hash function that is both collision resistant and
hiding, this commitment scheme will work, in the sense that it will
have the necessary security properties.
Commitment API
(com,key):=commit(msg)
match:=verify(com,key,mesg)
To seal msg in envelope:
(com,key):=commit(msg) - - then publish com
To open envelope:
publish key msg
anyone can use verify() to check validity
Where
commit(msg):=(H(key|msg),H(key)
verify(com,key,msg):=(H(key|msg)==com)
Property 3: Puzzle Friendliness
A hash function H is said to be puzzle friendly ,
• if for every possible n-bit output value y,
• if k is chosen from a distribution with high min-entropy,
• then it is infeasible to find x such that H(k x) = y in time
‖
significantly less than 2^n
In other words,
if someone wants to target the hash function to have some
particular output value y, and if part of the input has been chosen
in a suitably randomized way, then it’s very difficult to find another
value that hits exactly that target.
APPLICATION: SEARCH PUZZLE
Search puzzle is a mathematical problem that requires searching a
very large space to find the solution.
In particular, a search puzzle has no shortcuts.
That is, there’s no way to find a valid solution other than searching
that large space.
A search puzzle consists of
• a hash function, H,
• a value, id ( puzzle-ID), chosen from a high min-entropy
distribution, and
• a target set Y.
A solution to this puzzle is a value, x, such that
Puzzle friendly property implies that no solving strategy is much
better than trying random values of x.
The intuition is:
• If H has an n-bit output, then it can take any of 2^ n values.
• Solving the puzzle requires finding an input such that the output
falls within the set Y, which is typically much smaller than the
set of all outputs.
• The size of Y determines how hard the puzzle is.
• If Y is the set of all n-bit strings, then the puzzle is trivial,
• whereas if Y has only one element, then the puzzle is maximally
hard.
• That the puzzle ID has high min-entropy ensures that there are
no shortcuts.
On the contrary, if a particular value of the ID were likely, then
someone could cheat, say, by precomputing a solution to the
puzzle with that ID.
If a hash function is puzzle friendly, then there’s no solving
strategy for this puzzle that is much better than just trying random
values of x.
And so, if we want to pose a puzzle that’s difficult to solve, we
can do it this way as long as we can generate puzzle-IDs in a
suitably random way.
Hash function used by Bitcoin:SHA-256
As long as we can build a hash function that works on fixed-
length inputs, there’s a generic method to convert it into a hash
function that works on arbitrary-length inputs. It’s called the
MerkleDamgård transform.
SHA-256 is one of a number of commonly used hash functions
that make use of this method.
In common terminology, the underlying fixed-length collision-
resistant hash function is called the compression function.
It has been proven that if the underlying compression function is
collision resistant, then the overall hash function is collision
resistant as well.
The Merkle-Damgård transform
SHA-256 uses the MerkleDamgård transform to turn a fixed-length collision-
resistant compression function into a hash function that accepts arbitrary-
length inputs. The input is padded, so that its length is a multiple of 512 bits.
IV stands for initialization vector.
Suppose that the compression function takes inputs of length m and
produces an output of a smaller length n. The input to the hash function,
which can be of any size, is divided into blocks of length m – n.
SHA-256 uses a compression function that takes 768-bit input
and produces 256-bit outputs. The block size is 512 bits.
The construction works as follows:
• Pass each block together with the output of the previous block
into the compression function.
• Input length will then be (m – n) + n = m, which is the input
length to the compression function.
• For the first block, to which there is no previous block output,
we instead use an initialization vector . This number is reused
for every call to the hash function.
• The last block’s output is the result that you return.
Modeling Hash Functions
Hash functions are the Swiss Army knife of cryptography:
• They find a place in a spectacular variety of applications.
• The flip side to this versatility is that different applications
require slightly different properties of hash functions to ensure
security.
• It has proven notoriously hard to pin down a list of hash
function properties that would result in provable security
across the board.
1.2. HASH POINTERS AND DATA STRUCTURES
A hash pointer is a data structure.
A hash pointer is simply a pointer to where some information is
stored together with a cryptographic hash of the information.
A regular pointer gives you a way to retrieve the information,
whereas hash pointer also allows you to verify that the information
hasn’t been changed.
A hash pointer is a pointer to where data is stored together with a
cryptographic hash of the value of this data at some fixed point in
time.
Hash pointers to build all kinds of data structures.
We can take a familiar data structure that uses pointers, such as a
linked list or a binary search tree, and implement it with hash
pointers instead of ordinary pointers.
We call this data structure a block chain.
In a regular linked list where you have a series of blocks, each
block has data as well as a pointer to the previous block in the list.
But in a blockchain, the previous-block pointer will be replaced
with a hash pointer. So each block not only tells us where the
value of the previous block was, but it also contains a digest of
that value, which allows us to verify that the value hasn’t been
changed.
We store the head of the list, which is just a regular hash-pointer
that points to the most recent data block.
A use case for a block chain is a tamper-evident log. That is,
we want to build a log data structure that stores data and allows
us to append data to the end of the log. But if somebody alters
data that appears earlier in the log, we’re going to detect the
change.
Why a block chain achieves this tamper-evident
property?
An adversary wants to tamper with data in the middle of the chain.
Specifically, the adversary’s goal is to do it in such a way that someone who remembers
only the hash pointer at the head of the blockchain won’t be able to detect the tampering.
To achieve this goal, the adversary changes the data of some block k. Since the data has
been changed, the hash in block k + 1, which is a hash of the entire block k, is not going
to match up.
Remember that we are statistically guaranteed that the new hash will not match the
altered content, since the hash function is collision resistant. And so we will detect the
inconsistency between the new data in block k and the hash pointer in block k + 1.
Of course, the adversary can continue to try and cover up this change by changing the
next block’s hash as well. The adversary can continue doing this, but this strategy will fail
when she reaches the head of the list. Specifically, as long as we store the hash pointer
at the head of the list in a place where the adversary cannot change it, she will be unable
to change any block without being detected
Thus, If an adversary modifies data anywhere in the blockchain, it will
result in the hash pointer in the following block being incorrect. If we
store the head of the list, then even if an adversary modifies all
pointers to be consistent with the modified data, the head pointer will
be incorrect, and we can detect the tampering.
The upshot is that if the adversary wants to tamper with data
anywhere in this entire chain, to keep the story consistent, she’s
going to have to tamper with the hash pointers all the way to the end.
And she’s ultimately going to run into a roadblock, because she
won’t be able to tamper with the head of the list.
Thus, by remembering just this single hash pointer, we’ve essentially
determined a tamper-evident hash of the entire list.
So we can build a blockchain like this containing as many blocks as
we want, going back to some special block at the beginning of the list,
which we will call the genesis block.
Merkle Trees
A binary tree with hash pointers is known as a Merkle tree , after
its inventor, Ralph Merkle.
In a Merkle tree, data blocks are grouped in pairs, and the hash of each
of these blocks is stored in a parent node.
The parent nodes are in turn grouped in pairs, and their hashes stored
one level up the tree.
This pattern continues up the tree until we reach the root node.
Here, we remember just one hash pointer: in this case, the one at the
root of the tree. We now have the ability to traverse through the hash
pointers to any point in the list.
This allows us to make sure that the data has not been tampered with
because, just as we saw for the blockchain, if an adversary tampers with
some data block at the bottom of the tree, his change will cause the hash
pointer one level up to not match, and even if he continues to tamper
with other blocks farther up the tree, the change will eventually
propagate to the top, where he won’t be able to tamper with the hash
pointer that we’ve stored. So again, any attempt to tamper with any piece
of data will be detected by just remembering the hash pointer at the top.
Proof of Membership
Another nice feature of Merkle trees is that, unlike the block chain is proof of
membership.
Suppose that someone wants to prove that a certain data block is a member
of the Merkle tree, We remember just the root. Then they need to show us
this data block, and the blocks on the path from the data block to the root.
We can ignore the rest of the tree, as the blocks on this path are enough to
allow us to verify the hashes all the way up to the root of the tree.
If there are n nodes in the tree, only about log(n) items need to be shown.
And since each step just requires computing the hash of the child block, it
takes about log(n) time for us to verify it.
And so even if the Merkle tree contains a large number of blocks, we can still
prove membership in a relatively short time.
Verification thus runs in time and space that’s logarithmic in the number of
nodes in the tree.
Proof of membership: To prove that a data block is included in
the tree only requires showing the blocks in the path from that
data block to the root.
A sorted Merkle tree is just a Merkle tree where we take the
blocks at the bottom and sort them using some ordering function.
This can be alphabetical order, lexicographical order, numerical
order, or some other agreed-on ordering.
Proof of Non membership
It is a cryptographic concept in which a party can prove that a specific
element does not belong to a particular set without revealing any
information about the other elements of the set or compromising
privacy. This is often used in scenarios where one needs to verify that an
item or entity is not part of a restricted list or group (e.g., in blacklists,
revocation lists, or negative databases).
Using a sorted Merkle tree, it becomes possible to verify nonmembership in
logarithmic time and space. That is, we can prove that a particular block is
not in the Merkle tree.
A sorted Merkle tree is just a Merkle tree where we take the blocks at the
bottom and sort them using some ordering function. This can be
alphabetical order, lexicographical order, numerical order, or some other
agreed-on ordering.
• It can be done by showing a path to the item just before where
the item in question would be and showing the path to the item
just after where it would be.
• If these two items are consecutive in the tree, then this serves as
proof that the item in question is not included—because if it were
included, it would need to be between the two items shown, but
there is no space between them, as they are consecutive.
• We can use hash pointers in any pointer-based data structure as
long as the data structure doesn’t have cycles.
• If there are cycles in the data structure, then we won’t be able to
make all the hashes match up.
• In an acyclic data structure we can start near the leaves, or near
the things that don’t have any pointers coming out of them,
compute the hashes of those, and then work our way back
toward the beginning.
• But in a structure with cycles, there’s no end that we can start
with and compute back from.
1.3. DIGITAL SIGNATURES
A digital signature is digital analog to a handwritten signature on
paper.
Two desirable properties from digital signatures that correspond
well to the handwritten signature analogy:
• First, only you can make your signature, but anyone who sees
it can verify that it’s valid.
• Second, we want the signature to be tied to a particular
document, so that the signature cannot be used to indicate
your agreement or endorsement of a different document.
For handwritten signatures, this latter property is analogous to
ensuring that somebody can’t take your signature and snip it off
one document and glue it to the bottom of another one.
Blockchain Technology, Cryptography and cryptocurrencies Module2.pptx
Digital signature scheme.
A digital signature scheme consists of the following three algorithms:
➔ (sk, pk) := generateKeys(keysize) The generateKeys method
takes a key size and generates a key pair. The secret key sk is kept
privately and used to sign messages. pk is the public verification
key that you give to everybody. Anyone with this key can verify your
signature.
➔ sig := sign(sk, message) The sign method takes a message and a
secret key, sk, as input and outputs a signature for message under
sk.
➔ isValid := verify(pk, message, sig) The verify method takes a
message, a signature, and a public key as input. It returns a
boolean value, isValid, that will be true if sig is a valid signature for
message under public key pk, and false otherwise.
We require that the following two properties hold:
➔ Valid signatures must verify: verify(pk, message, sign(sk,
message)) == true.
➔ Signatures are existentially unforgeable.
2 properties of digital signature
1. Valid signatures must be verifiable. If I sign a message with sk,
my secret key, and someone later tries to validate that signature
over that same message using my public key, pk, the signature
must validate correctly. This property is a basic requirement for
signatures to be useful at all.
2. Unforgeability. The second requirement is that it’s computationally
infeasible to forge signatures. That is, an adversary who knows your
public key and sees your signatures on some other messages can’t
forge your signature on some message for which he has not seen
your signature. This unforgeability property is generally formalized
in terms of a game that we play with an adversary. The use of
games is quite common in cryptographic security proofs.
Unforgeability game
The attacker and the challenger play the unforgeability game. If the
attacker is able to successfully output a signature on a message that
he has not previously seen, he wins. If he is unable to do so, the
challenger wins, and the digital signature scheme is unforgeable.
➔Use generateKeys to generate a secret signing key and a
corresponding public verification key.
➔We give the secret key to the challenger, and we give the public
key to both the challenger and the adversary. So the adversary
only knows information that’s public, and his mission is to try to
forge a message.
➔The challenger knows the secret key. So he can make signatures.
The setup of this game matches real-world conditions.
A real attacker would likely be able to see valid signatures from his
would-be victim on different documents.
And the attacker might even be able to manipulate the victim into
signing innocuous-looking(something appears harmless or not likely to cause
harm or offense) documents if that’s useful to the attacker.
To model this in our game, we allow the adversary to get signatures on some
documents of his choice, for as long as he wants, as long as the number of
guesses is plausible. What we mean by a plausible number of guesses, we would
allow the adversary to try 1 million guesses, but not 2^80 guesses.
In asymptotic terms, we allow the adversary to try a number of guesses that is a
polynomial function of the key size, but no more (e.g., he cannot try exponentially
many guesses).
Once the adversary is satisfied that he’s seen enough signatures, then he picks
some message, M, that he will attempt to forge a signature on.
The only restriction on M is that it must be a message for which the adversary has
not previously seen a signature (because then he can obviously send back a
signature that he has been given).
The challenger runs the verify algorithm to determine whether the signature
produced by the attacker is a valid signature on M under the public verification key.
If it successfully verifies, the adversary wins the game.
We say that the signature scheme is unforgeable if and only if, no matter what
algorithm the adversary is using, his chance of successfully forging a message is
extremely small—so small that we can assume it will never happen in practice.
Practical Concerns
Several practical things must be done to turn the algorithmic idea into a
digital signature mechanism that can be implemented.
Need a good source of randomness: The importance of this
requirement can’t be overestimated, as bad randomness will make your
otherwise-secure algorithm insecure.
Message size: In practice, there’s a limit on the message size that
you’re able to sign, because real schemes are going to operate on bit
strings of limited length.
There’s an easy way around this limitation: sign the hash of the
message, rather than the message itself. If we use a cryptographic
hash function with a 256-bit output, then we can effectively sign a
message of any length as long as our signature scheme can sign 256-
bit messages. It’s safe to use the hash of the message as a message
digest in this manner, since the hash function is collision resistant.
Another trick is that you can sign a hash pointer.
If you sign a hash pointer, then the signature covers, or protects,
the whole structure—not just the hash pointer itself, but
everything the chain of hash pointers points to.
For example, if you were to sign the hash pointer located at the
end of a block chain, the result is that you would effectively be
digitally signing the entire block chain.
ECDSA -Elliptic Curve Digital Signature Algorithm
Bitcoin uses a particular digital signature scheme known as the Elliptic
Curve Digital Signature Algorithm (ECDSA).
ECDSA is a U.S. government standard, an update of the earlier DSA
algorithm adapted to use elliptic curves.
These algorithms have received considerable cryptographic analysis over
the years and are generally believed to be secure.
It is estimated to provide 128 bits of security (i.e., it is as difficult to break
this algorithm as it is to perform 2^128 symmetric-key cryptographic
operations, such as invoking a hash function).
Although this curve is a published standard, it is rarely used outside Bitcoin;
Other applications using ECDSA (such as key exchange in the TLS protocol
for secure web browsing) typically use the more common secp256r1 curve.
This is just a quirk of Bitcoin, as it was chosen by Satoshi in the early
specification of the system and is now difficult to change.
ECDSA can technically only sign messages 256 bits long, this is not a problem:
messages are always hashed before being signed, so effectively any size message
can be efficiently signed.
With ECDSA, a good source of randomness is essential, because a bad source will
likely leak your key.
But it’s a quirk of ECDSA that, even if you use bad randomness only when making
a signature and you use your perfectly good key, the bad signature will also leak
your private key. (For those familiar with DSA, this is a general quirk in DSA and is
not specific to the elliptic curve variant.)
And then it’s game over: if you leak your private key, an adversary can forge your
signature. We thus need to be especially careful about using good randomness in
practice. Using a bad source of randomness is a common pitfall of otherwise
secure systems.
1.4. PUBLIC KEYS AS IDENTITIES
A public key, one of those public verification keys from a digital
signature scheme can be equated to an identity of a person or an
actor in a system.
If you see a message with a signature that verifies correctly under a
public key, pk, then you can think of pk as stating the message.
You can literally think of a public key as being like an actor, or a party
in a system, who can make statements by signing those statements.
From this viewpoint, the public key is an identity. For someone to
speak for the identity pk, he must know the corresponding secret key,
sk.
A consequence of treating public keys as identities is that you
can make a new identity whenever you want—you simply create
a new fresh key pair, sk and pk, via the generateKeys operation
in digital signature scheme.
This pk is the new public identity that you can use, and sk is the
corresponding secret key that only you know and that lets you
speak on behalf of the identity pk.
In practice, you may use the hash of pk as your identity, since
public keys are large. If you do that, then to verify that a message
comes from your identity, one will have to check that
(1) pk indeed hashes to your identity, and
(2) the message verifies under public key pk.
Moreover, by default, your public key pk will basically look
random, and nobody will be able to uncover your real-world
identity by examining pk.
(Of course, once you start making statements using this identity,
these statements may leak information that allows others to
connect pk to your real-world identity. )
You can generate a fresh identity that looks random, like a face in
the crowd, and is controlled only by you.
Decentralized Identity Management
Rather than having a central authority for registering users in a system, you
can register as a user by yourself.
You don’t need to be issued a username, nor do you need to inform someone
that you’re going to be using a particular name. If you want a new identity, you
can just generate one at any time, and you can create as many as you want.
If you prefer to be known by five different names, no problem! Just make five
identities.
If you want to be somewhat anonymous for a while, you can create a new
identity, use it for just a little while, and then throw it away.
All these things are possible with decentralized identity management, and this
is the way Bitcoin, in fact, handles identity.
These identities are called addresses, in Bitcoin jargon. You’ll frequently
hear the term “address” used in the context of Bitcoin and cryptocurrencies,
and it’s really just a hash of a public key. It’s an identity that someone made up
out of thin air, as part of this decentralized identity management scheme.
At first glance, it may seem that decentralized identity management leads
to great anonymity and privacy. After all, you can create a randomlooking
identity all by yourself without telling anyone your real-world identity.
But it’s not that simple. Over time, the identity that you create makes a
series of statements. People see these statements and thus know that
whoever owns this identity has done a certain series of actions. They can
start to connect the dots, using this series of actions to make inferences
about your real-world identity.
An observer can link together these observations over time and make
inferences that lead to such conclusions as, “Gee, this person is acting a
lot like Joe. Maybe this person is Joe.” In other words, in Bitcoin you don’t
need to explicitly register or reveal your real-world identity, but the pattern
of your behavior might itself be identifying. This is the fundamental privacy
question in a cryptocurrency like Bitcoin,
1.5. TWO SIMPLE CRYPTOCURRENCIES
1.Goofycoin: the simplest cryptocurrency we can imagine.
Two rules of Goofycoin.
• The first rule is that a designated entity, Goofy, can create new coins
whenever he wants, and these newly created coins belong to him.
• To create a coin, Goofy generates a unique coin ID uniqueCoinID that
he’s never generated before and constructs the string CreateCoin
[uniqueCoinID].
• He then computes the digital signature of this string with his secret
signing key.
• The string, together with Goofy’s signature, is a coin.
• Anyone can verify that the coin contains Goofy’s valid signature of a
CreateCoin statement and is therefore a valid coin.
• The second rule of Goofycoin is that whoever owns a coin can
transfer it to someone else. Transferring a coin is not simply a matter
of sending the coin data structure to the recipient—it’s done using
cryptographic operations.
To summarize, the rules of Goofycoin are:
• Goofy can create new coins by simply signing a statement that he’s
making a new coin with a unique coin ID.
• Whoever owns a coin can pass it on to someone else by signing a
statement that says, “Pass on this coin to X” (where X is specified
as a public key).
• Anyone can verify the validity of a coin by following the chain of
hash pointers back to its creation by Goofy, verifying all signatures
along the way.
Fundamental security problem with
Goofycoin
Let’s say Alice passed her coin on to Bob by sending her signed statement
to Bob but didn’t tell anyone else.
She could create another signed statement that pays the same coin to
Chuck.
To Chuck, it would appear that it is a perfectly valid transaction, and now
he’s the owner of the coin.
Bob and Chuck would both have valid-looking claims to be the owner of this
coin.
This is called a double-spending attack—Alice is spending the same coin
twice.
Intuitively, we know coins are not supposed to work that way.
Diagram
Goofycoin coin. Shown here is a coin that’s been created
(bottom) and spent twice (middle and top).
Double-spending attacks are one of the key problems that any
cryptocurrency has to solve. Goofycoin does not solve the
double-spending attack, and therefore it’s not secure. Goofycoin
is simple, and its mechanism for transferring coins is actually
similar to that of Bitcoin, but because it is insecure, it is
inadequate as a cryptocurrency.
2. Scroogecoin
It solves the double-spending problem.
The first key idea is that a designated entity called Scrooge publishes an
append-only ledger containing the history of all transactions.
The append-only property ensures that any data written to this ledger will
remain forever in the ledger to defend against double spending by requiring all
transactions to be written in the ledger before they are accepted.
That way, it will be publicly documented if coins were previously sent to a
different owner.
Scrooge has a blockchain data structure, which the owner will digitally sign.
It consists of a series of data blocks, each with one transaction in it (in practice,
as an optimization, we’d really put multiple transactions in the same block, as
Bitcoin does.)
Each block has the ID of a transaction, the transaction’s contents, and a hash
pointer to the previous block.
Scrooge digitally signs the final hash pointer, which binds all the data in this
entire structure, and he publishes the signature along with the block chain.
In Scroogecoin, a transaction only counts if it is in the block chain signed
by Scrooge.
Anyone can verify that a transaction was endorsed by Scrooge by
checking Scrooge’s signature on the block that records the transaction.
Scrooge makes sure that he doesn’t endorse a transaction that attempts
to double spend an already spent coin.
We need a block chain with hash pointers in addition to having Scrooge
sign each block to ensure the append-only property.
If Scrooge tries to add or remove a transaction or to change an existing
transaction, it will affect all following blocks because of the hash
pointers.
As long as someone is monitoring the latest hash pointer published by
Scrooge, the change will be obvious and easy to catch.

More Related Content

PPTX
2 Cryptographic_Hash_Functions.pptx
PPTX
Lecture 1_Blockchain.pptx
PPTX
Blockchain Technology Explained: A Beginner's Guide to the Future of the Inte...
PDF
PDF
Concepts of BlockChain explained very well
PPTX
How Hashing Algorithms Work
PDF
Applied cryptanalysis - everything else
2 Cryptographic_Hash_Functions.pptx
Lecture 1_Blockchain.pptx
Blockchain Technology Explained: A Beginner's Guide to the Future of the Inte...
Concepts of BlockChain explained very well
How Hashing Algorithms Work
Applied cryptanalysis - everything else

Similar to Blockchain Technology, Cryptography and cryptocurrencies Module2.pptx (20)

PDF
Hash Function.pdf
PDF
A 2,000-BIT MESSAGE IS USED TO GENERATE A 256-BIT HASH. ONE THE AVERAGE, HOW ...
PDF
How does cryptography work? by Jeroen Ooms
PPT
Network security cryptographic hash function
PDF
Hash Functions in Software Security.p df
PDF
HASH FUNCTIONS.pdf
DOC
Encryption
PPT
Hash crypto
PPT
Hash crypto
PPT
Hash crypto
PPT
Hash crypto
PPT
Hash crypto
PPT
Hash crypto
PPT
Hash crypto
PDF
Dnssec tutorial-crypto-defs
PDF
Cryptography Unchained - BeeBryte (White Paper)
PDF
The MD5 hashing algorithm
PPTX
All details of cryptography and all the topics of cryptography was explained
PPTX
INFO 2106 2-17-25.pptx Course Slide Deck
Hash Function.pdf
A 2,000-BIT MESSAGE IS USED TO GENERATE A 256-BIT HASH. ONE THE AVERAGE, HOW ...
How does cryptography work? by Jeroen Ooms
Network security cryptographic hash function
Hash Functions in Software Security.p df
HASH FUNCTIONS.pdf
Encryption
Hash crypto
Hash crypto
Hash crypto
Hash crypto
Hash crypto
Hash crypto
Hash crypto
Dnssec tutorial-crypto-defs
Cryptography Unchained - BeeBryte (White Paper)
The MD5 hashing algorithm
All details of cryptography and all the topics of cryptography was explained
INFO 2106 2-17-25.pptx Course Slide Deck
Ad

Recently uploaded (20)

PPTX
Ship’s Structural Components.pptx 7.7 Mb
PDF
dse_final_merit_2025_26 gtgfffffcjjjuuyy
PPTX
Soil science - sampling procedures for soil science lab
PDF
ETO & MEO Certificate of Competency Questions and Answers
PDF
Monitoring Global Terrestrial Surface Water Height using Remote Sensing - ARS...
PDF
Geotechnical Engineering, Soil mechanics- Soil Testing.pdf
PDF
MCAD-Guidelines. Modernization of command Area Development, Guideines
PDF
Top 10 read articles In Managing Information Technology.pdf
PPT
Chapter 6 Design in software Engineeing.ppt
PPTX
Practice Questions on recent development part 1.pptx
PDF
오픈소스 LLM, vLLM으로 Production까지 (Instruct.KR Summer Meetup, 2025)
PPTX
Fluid Mechanics, Module 3: Basics of Fluid Mechanics
PDF
BRKDCN-2613.pdf Cisco AI DC NVIDIA presentation
PPTX
metal cuttingmechancial metalcutting.pptx
PPTX
Simulation of electric circuit laws using tinkercad.pptx
PDF
LEAP-1B presedntation xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PPTX
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
PPTX
436813905-LNG-Process-Overview-Short.pptx
PPT
SCOPE_~1- technology of green house and poyhouse
Ship’s Structural Components.pptx 7.7 Mb
dse_final_merit_2025_26 gtgfffffcjjjuuyy
Soil science - sampling procedures for soil science lab
ETO & MEO Certificate of Competency Questions and Answers
Monitoring Global Terrestrial Surface Water Height using Remote Sensing - ARS...
Geotechnical Engineering, Soil mechanics- Soil Testing.pdf
MCAD-Guidelines. Modernization of command Area Development, Guideines
Top 10 read articles In Managing Information Technology.pdf
Chapter 6 Design in software Engineeing.ppt
Practice Questions on recent development part 1.pptx
오픈소스 LLM, vLLM으로 Production까지 (Instruct.KR Summer Meetup, 2025)
Fluid Mechanics, Module 3: Basics of Fluid Mechanics
BRKDCN-2613.pdf Cisco AI DC NVIDIA presentation
metal cuttingmechancial metalcutting.pptx
Simulation of electric circuit laws using tinkercad.pptx
LEAP-1B presedntation xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
436813905-LNG-Process-Overview-Short.pptx
SCOPE_~1- technology of green house and poyhouse
Ad

Blockchain Technology, Cryptography and cryptocurrencies Module2.pptx

  • 2. Contents ● Cryptographic Hash Functions ● Hash Pointers and Data Structures ● Digital Signatures ● Public Keys as Identities ● A Simple Cryptocurreny
  • 3. Introduction • All currencies need some way to control supply and enforce various security properties to prevent cheating. • These security features raise the bar for an attacker, but they don’t make money impossible to counterfeit. • Ultimately, law enforcement is necessary for stopping people from breaking the rules of the system. • Cryptocurrencies too must have security measures that prevent people from tampering with the state of the system • If Alice convinces Bob that she paid him a digital coin, for example, she should not be able to convince Carol that she paid her that same coin. • But unlike fiat (Govt issued currencies) currencies, the security rules of cryptocurrencies need to be enforced purely technologically and without relying on a central authority.
  • 4. • Cryptocurrencies make heavy use of cryptography. • Cryptography provides a mechanism for securely encoding the rules of a cryptocurrency system in the system itself. • It can be used to prevent tampering and equivocation (when a dishonest actor provides conflicting information for the same message), • It can be used to encode the rules for creation of new units of the currency into a mathematical protocol • Cryptography is a deep academic research field utilizing many advanced mathematical techniques that are subtle and complicated to understand. • Fortunately, Bitcoin only relies on a handful of relatively simple and well known cryptographic constructions. ‐
  • 5. Cryptographic Hash Functions • A hash function is a mathematical function with the following three properties: • Input can be any string of any size. • Fixed size output. We will assume a 256 bit output size. (hash value or ‐ digest). Its unique digital "fingerprint" for the input data. Even if the input is a small file or a large document, the hash function will always produce an output of the same length. • Efficiently computable. For a given input string, we can compute output of the hash function in a reasonable amount of time. More technically, computing the hash of an n bit string should have a running ‐ time that is O(n). • Those properties define a general hash function, one that could be used to build a data structure such as a hash table.
  • 6. Note: Message Digest Message Digest is used to ensure the integrity of a message transmitted over an insecure channel (where the content of the message can be changed). The message is passed through a Cryptographic hash function. This function creates a compressed image of the message called Digest. Lets assume, Alice sent a message and digest pair to Bob. To check the integrity of the message Bob runs the cryptographic hash function on the received message and gets a new digest. Now, Bob will compare the new digest and the digest sent by Alice. If, both are same then Bob is sure that the original message is not changed.
  • 8. ● To provide authenticity of the message, digest is encrypted with sender’s private key. Now this digest is called digital signature, which can be only decrypted by the receiver who has sender’s public key. Now the receiver can authenticate the sender and also verify the integrity of the sent message. Example: The hash algorithm MD5 is widely used to check the integrity of messages. MD5 divides the message into blocks of 512 bits and creates a 128 bit digest(typically, 32 Hexadecimal digits). It is no longer considered reliable for use as researchers have demonstrated techniques capable of easily generating MD5 collisions on commercial computers. The weaknesses of MD5 have been exploited by the Flame malware in 2012. In response to the insecurities of MD5 hash algorithms, the Secure Hash Algorithm (SHA) was invented.
  • 9. Example Let’s say you have a password like "mypassword123". A hash function will take that password and convert it into something like "5f4dcc3b5aa765d61d8327deb882cf99" (this is a real hash of the word "password" using a hash function called MD5). This hash is a string of letters and numbers that represents your password. ● If someone tries to hack into your account, they won’t see your real password, but rather this hashed version. ● Even if they know the hash value, they won’t easily be able to reverse it and figure out the original password.
  • 10. Cryptographic Hash Functions • For a hash function to be cryptographically secure, we require that it has the following three additional properties: (1) collision resistance, ‐ (2) hiding, (3) puzzle friendliness. ‐
  • 11. Property 1: Collision Resistance A collision occurs when two distinct inputs produce the same output. A hash function H(·) is collision resistant if nobody can find a collision. In other words, Nobody can find x and y such that x!=y and H(x)=H(y)
  • 12. Notice that it was said “nobody can find” a collision But we did not say that no collisions exist. Actually, collisions exist for any hash function. The input space to the hash function contains all strings of all lengths, yet the output space contains only strings of a specific fixed length. Because the input space is larger than the output space (indeed, the input space is infinite, while the output space is finite), there must be input strings that map to the same output string.
  • 14. Consider the following simple method for finding a collision for a hash function with a 256-bit output size: pick 2^256 + 1 distinct values, compute the hashes of each of them, and check whether any two outputs are equal. Since we picked more inputs than possible outputs, some pair of them must collide when you apply the hash function
  • 15. The method above is guaranteed to find a collision. But if we pick random inputs and compute the hash values, we’ll find a collision with high probability long before examining 2 256 + 1 inputs. In fact, if we randomly choose just 2^130 + 1 inputs, it turns out there’s a 99.8 percent chance that at least two of them are going to collide. That we can find a collision by examining only roughly the square root of the number of possible outputs results from a phenomenon in probability known as the birthday paradox.
  • 16. Birthday Paradox It refers to the surprising result that in a group of just 23 people, the probability of at least two people sharing the same birthday is more than 50%. This result is counterintuitive because most people expect a much larger group for this to be likely. In probability terms, the Birthday Paradox shows that the probability of shared birthdays rises much faster than our intuition suggests. Even in relatively small groups, the chance of a shared birthday becomes significant because of the number of potential pairs.
  • 17. This collision-detection algorithm works for every hash function. But, the problem is that it takes a very long time to do. For a hash function with a 256-bit output, you would have to compute the hash function 2^256 + 1 times in the worst case, and about 2^128 times on average. That’s of course an astronomically large number—if a computer calculates 10,000 hashes per second, it would take more than one octillion (10^27) years to calculate 2^128 hashes!
  • 18. A general but impractical algorithm to find a collision for any hash function. Consider, for example, the following hash function: H(x)= x mod 2^256 This function meets our requirements of a hash function, and is efficiently computable. It is also an efficient method for finding a collision. Is there a faster way to find collisions? For some possible H’s, yes. For others, we don,t know of one. No hash functions have been proven to be collision resistant.
  • 19. The cryptographic hash functions that we rely on in practice are just functions for which people have tried really, really hard to find collisions and haven’t yet succeeded. And so we choose to believe that those are collision resistant. (In some cases, such as the hash function known as MD5, collisions were eventually found after years of work, resulting in the function being deprecated and phased out of practical use.)
  • 20. Application of Collision Resistance Hash as Message Digest: If we know that two inputs x and y to a collision-resistant hash function H are different, then it’s safe to assume that their hashes H(x) and H(y) are different. This argument allows us to use hash outputs as a message digest. To recognize a file we saw before, just remember it's hash. Useful becoz hash is small (only 256 bits).
  • 21. Consider SecureBox, an authenticated online file storage system that allows users to upload files and to ensure their integrity when they download them. Collision-resistant hashes provide an elegant and efficient solution to this problem. Alice just needs to remember the hash of the original file. When she later downloads the file from SecureBox, she computes the hash of the downloaded file and compares it to the one she stored. If the hashes are the same, then she can conclude that the file is indeed the same one she uploaded, but if they are different, then Alice can conclude that the file has been tampered with.
  • 22. Property 2: Hiding The hiding property asserts that if we’re given the output of the hash function y = H(x), there’s no feasible way to figure out what the input, x, was. In other words, Given H(x), it is infeasible to find x. The problem is that this property does not exactly hold. For ex: If the result of the coin flip was heads, we’re going to announce the hash of the string “heads.” If the result was tails, we’re going to announce the hash of the string “tails.” In response, they would simply compute both the hash of the string “heads” and the hash of the string “tails,” and they could see which one they were given. And so, in just a couple steps, they can figure out what the input was. So, its easy to find x.
  • 23. To be able to achieve the hiding property, there must be no value of x that is particularly likely. That is, x has to be chosen from a set that is, in some sense, very spread out. Can we achieve the hiding property? “Yes” We can hide even an input that’s not spread out by concatenating it with another input that is spread out.
  • 24. Hiding A hash function H is said to be hiding if when a secret value r is chosen from a probability distribution that has high min-entropy, then, given H(r x), it is infeasible to find x. ‖ In information theory, min-entropy is a measure of how predictable an outcome is, and High min-entropy means that the distribution (i.e., of a random variable) is very spread out. When we sample from the distribution, there’s no particular value that’s likely to occur. So, for a concrete example, if r is chosen uniformly from among all strings that are 256 bits long, then any particular string is chosen with probability 1/2^256 , which is an infinitesimally small value (no particular value is chosen with more than negligible property.
  • 25. APPLICATION: COMMITMENTS A commitment is the digital analog of taking a value, sealing it in an envelope, and putting that envelope out on the table where everyone can see it. When you do that, you’ve committed yourself to what’s inside the envelope. But you haven’t opened it, so even though you’ve committed to a value, the value remains a secret from everyone else. Later, you can open the envelope and reveal the value that you committed to earlier. In other words, Want to “ seal a value in an envelope”, and “open the envelope later” Commit to a value. reveal it later.
  • 26. Commitment scheme A commitment scheme consists of two algorithms: • com := commit(msg, nonce) The commit function takes a message and secret random value, called a nonce, as input and returns a commitment. • verify(com, msg, nonce) The verify function takes a commitment, nonce, and message as input. It returns true if com == commit(msg, nonce) and false otherwise. Note: Every time you commit to a value, it is important that you choose a new random value nonce. In cryptography, the term nonce is used to refer to a value that can only be used once.
  • 27. We require that the following two security properties hold: • Hiding: Given com, it is infeasible to find msg. • Binding: It is infeasible to find two pairs (msg, nonce) and (msg′, nonce ′) such that msg ≠ msg′ and commit(msg, nonce) == commit(msg′, nonce′).
  • 28. To use a commitment scheme, we first need to generate a random nonce. We then apply the commit function to this nonce together with msg, the value being committed to, and we publish the commitment com. This stage is analogous to putting the sealed envelope on the table. At a later point, if we want to reveal the value that we committed to earlier, we publish the random nonce that we used to create this commitment, and the message, msg. Now anybody can verify that msg was indeed the message committed to earlier. This stage is analogous to opening the envelope.
  • 29. The two security properties dictate that the algorithms actually behave like sealing and opening an envelope. • First, given com, the commitment, someone looking at the envelope can’t figure out what the message is. • The second property is that it’s binding. This ensures that when you commit to what’s in the envelope, you can’t change your mind later. That is, it’s infeasible to find two different messages, such that you can commit to one message and then later claim that you committed to another.
  • 30. To implement commitment scheme,we make use of cryptographic hash function. commit(msg, nonce) := H(nonce msg), where nonce is a ‖ random 256-bit value • To commit to a message, we generate a random 256-bit nonce. Then we concatenate the nonce and the message and return the hash of this concatenated value as the commitment. • To verify, someone will compute this same hash of the nonce they were given concatenated with the message. And they will check whether the result is equal to the commitment that they saw.
  • 31. If we substitute the instantiation of commit and verify as well as H(nonce msg) for com, then these properties become: ‖ • Hiding: Given H(nonce msg), it is infeasible to find msg. ‖ • Binding: It is infeasible to find two pairs (msg, nonce) and (msg ′, nonce ′) such that msg ≠ msg′ and H(nonce msg) == (nonce′ ‖ msg′). ‖ Therefore, if H is a hash function that is both collision resistant and hiding, this commitment scheme will work, in the sense that it will have the necessary security properties.
  • 32. Commitment API (com,key):=commit(msg) match:=verify(com,key,mesg) To seal msg in envelope: (com,key):=commit(msg) - - then publish com To open envelope: publish key msg anyone can use verify() to check validity Where commit(msg):=(H(key|msg),H(key) verify(com,key,msg):=(H(key|msg)==com)
  • 33. Property 3: Puzzle Friendliness A hash function H is said to be puzzle friendly , • if for every possible n-bit output value y, • if k is chosen from a distribution with high min-entropy, • then it is infeasible to find x such that H(k x) = y in time ‖ significantly less than 2^n In other words, if someone wants to target the hash function to have some particular output value y, and if part of the input has been chosen in a suitably randomized way, then it’s very difficult to find another value that hits exactly that target.
  • 34. APPLICATION: SEARCH PUZZLE Search puzzle is a mathematical problem that requires searching a very large space to find the solution. In particular, a search puzzle has no shortcuts. That is, there’s no way to find a valid solution other than searching that large space. A search puzzle consists of • a hash function, H, • a value, id ( puzzle-ID), chosen from a high min-entropy distribution, and • a target set Y. A solution to this puzzle is a value, x, such that
  • 35. Puzzle friendly property implies that no solving strategy is much better than trying random values of x. The intuition is: • If H has an n-bit output, then it can take any of 2^ n values. • Solving the puzzle requires finding an input such that the output falls within the set Y, which is typically much smaller than the set of all outputs. • The size of Y determines how hard the puzzle is. • If Y is the set of all n-bit strings, then the puzzle is trivial, • whereas if Y has only one element, then the puzzle is maximally hard. • That the puzzle ID has high min-entropy ensures that there are no shortcuts.
  • 36. On the contrary, if a particular value of the ID were likely, then someone could cheat, say, by precomputing a solution to the puzzle with that ID. If a hash function is puzzle friendly, then there’s no solving strategy for this puzzle that is much better than just trying random values of x. And so, if we want to pose a puzzle that’s difficult to solve, we can do it this way as long as we can generate puzzle-IDs in a suitably random way.
  • 37. Hash function used by Bitcoin:SHA-256 As long as we can build a hash function that works on fixed- length inputs, there’s a generic method to convert it into a hash function that works on arbitrary-length inputs. It’s called the MerkleDamgård transform. SHA-256 is one of a number of commonly used hash functions that make use of this method. In common terminology, the underlying fixed-length collision- resistant hash function is called the compression function. It has been proven that if the underlying compression function is collision resistant, then the overall hash function is collision resistant as well.
  • 38. The Merkle-Damgård transform SHA-256 uses the MerkleDamgård transform to turn a fixed-length collision- resistant compression function into a hash function that accepts arbitrary- length inputs. The input is padded, so that its length is a multiple of 512 bits. IV stands for initialization vector. Suppose that the compression function takes inputs of length m and produces an output of a smaller length n. The input to the hash function, which can be of any size, is divided into blocks of length m – n.
  • 39. SHA-256 uses a compression function that takes 768-bit input and produces 256-bit outputs. The block size is 512 bits. The construction works as follows: • Pass each block together with the output of the previous block into the compression function. • Input length will then be (m – n) + n = m, which is the input length to the compression function. • For the first block, to which there is no previous block output, we instead use an initialization vector . This number is reused for every call to the hash function. • The last block’s output is the result that you return.
  • 40. Modeling Hash Functions Hash functions are the Swiss Army knife of cryptography: • They find a place in a spectacular variety of applications. • The flip side to this versatility is that different applications require slightly different properties of hash functions to ensure security. • It has proven notoriously hard to pin down a list of hash function properties that would result in provable security across the board.
  • 41. 1.2. HASH POINTERS AND DATA STRUCTURES A hash pointer is a data structure. A hash pointer is simply a pointer to where some information is stored together with a cryptographic hash of the information. A regular pointer gives you a way to retrieve the information, whereas hash pointer also allows you to verify that the information hasn’t been changed. A hash pointer is a pointer to where data is stored together with a cryptographic hash of the value of this data at some fixed point in time.
  • 42. Hash pointers to build all kinds of data structures. We can take a familiar data structure that uses pointers, such as a linked list or a binary search tree, and implement it with hash pointers instead of ordinary pointers. We call this data structure a block chain. In a regular linked list where you have a series of blocks, each block has data as well as a pointer to the previous block in the list. But in a blockchain, the previous-block pointer will be replaced with a hash pointer. So each block not only tells us where the value of the previous block was, but it also contains a digest of that value, which allows us to verify that the value hasn’t been changed. We store the head of the list, which is just a regular hash-pointer that points to the most recent data block.
  • 43. A use case for a block chain is a tamper-evident log. That is, we want to build a log data structure that stores data and allows us to append data to the end of the log. But if somebody alters data that appears earlier in the log, we’re going to detect the change.
  • 44. Why a block chain achieves this tamper-evident property? An adversary wants to tamper with data in the middle of the chain. Specifically, the adversary’s goal is to do it in such a way that someone who remembers only the hash pointer at the head of the blockchain won’t be able to detect the tampering. To achieve this goal, the adversary changes the data of some block k. Since the data has been changed, the hash in block k + 1, which is a hash of the entire block k, is not going to match up. Remember that we are statistically guaranteed that the new hash will not match the altered content, since the hash function is collision resistant. And so we will detect the inconsistency between the new data in block k and the hash pointer in block k + 1. Of course, the adversary can continue to try and cover up this change by changing the next block’s hash as well. The adversary can continue doing this, but this strategy will fail when she reaches the head of the list. Specifically, as long as we store the hash pointer at the head of the list in a place where the adversary cannot change it, she will be unable to change any block without being detected
  • 45. Thus, If an adversary modifies data anywhere in the blockchain, it will result in the hash pointer in the following block being incorrect. If we store the head of the list, then even if an adversary modifies all pointers to be consistent with the modified data, the head pointer will be incorrect, and we can detect the tampering.
  • 46. The upshot is that if the adversary wants to tamper with data anywhere in this entire chain, to keep the story consistent, she’s going to have to tamper with the hash pointers all the way to the end. And she’s ultimately going to run into a roadblock, because she won’t be able to tamper with the head of the list. Thus, by remembering just this single hash pointer, we’ve essentially determined a tamper-evident hash of the entire list. So we can build a blockchain like this containing as many blocks as we want, going back to some special block at the beginning of the list, which we will call the genesis block.
  • 47. Merkle Trees A binary tree with hash pointers is known as a Merkle tree , after its inventor, Ralph Merkle.
  • 48. In a Merkle tree, data blocks are grouped in pairs, and the hash of each of these blocks is stored in a parent node. The parent nodes are in turn grouped in pairs, and their hashes stored one level up the tree. This pattern continues up the tree until we reach the root node. Here, we remember just one hash pointer: in this case, the one at the root of the tree. We now have the ability to traverse through the hash pointers to any point in the list. This allows us to make sure that the data has not been tampered with because, just as we saw for the blockchain, if an adversary tampers with some data block at the bottom of the tree, his change will cause the hash pointer one level up to not match, and even if he continues to tamper with other blocks farther up the tree, the change will eventually propagate to the top, where he won’t be able to tamper with the hash pointer that we’ve stored. So again, any attempt to tamper with any piece of data will be detected by just remembering the hash pointer at the top.
  • 49. Proof of Membership Another nice feature of Merkle trees is that, unlike the block chain is proof of membership. Suppose that someone wants to prove that a certain data block is a member of the Merkle tree, We remember just the root. Then they need to show us this data block, and the blocks on the path from the data block to the root. We can ignore the rest of the tree, as the blocks on this path are enough to allow us to verify the hashes all the way up to the root of the tree. If there are n nodes in the tree, only about log(n) items need to be shown. And since each step just requires computing the hash of the child block, it takes about log(n) time for us to verify it. And so even if the Merkle tree contains a large number of blocks, we can still prove membership in a relatively short time. Verification thus runs in time and space that’s logarithmic in the number of nodes in the tree.
  • 50. Proof of membership: To prove that a data block is included in the tree only requires showing the blocks in the path from that data block to the root. A sorted Merkle tree is just a Merkle tree where we take the blocks at the bottom and sort them using some ordering function. This can be alphabetical order, lexicographical order, numerical order, or some other agreed-on ordering.
  • 51. Proof of Non membership It is a cryptographic concept in which a party can prove that a specific element does not belong to a particular set without revealing any information about the other elements of the set or compromising privacy. This is often used in scenarios where one needs to verify that an item or entity is not part of a restricted list or group (e.g., in blacklists, revocation lists, or negative databases). Using a sorted Merkle tree, it becomes possible to verify nonmembership in logarithmic time and space. That is, we can prove that a particular block is not in the Merkle tree. A sorted Merkle tree is just a Merkle tree where we take the blocks at the bottom and sort them using some ordering function. This can be alphabetical order, lexicographical order, numerical order, or some other agreed-on ordering.
  • 52. • It can be done by showing a path to the item just before where the item in question would be and showing the path to the item just after where it would be. • If these two items are consecutive in the tree, then this serves as proof that the item in question is not included—because if it were included, it would need to be between the two items shown, but there is no space between them, as they are consecutive. • We can use hash pointers in any pointer-based data structure as long as the data structure doesn’t have cycles. • If there are cycles in the data structure, then we won’t be able to make all the hashes match up. • In an acyclic data structure we can start near the leaves, or near the things that don’t have any pointers coming out of them, compute the hashes of those, and then work our way back toward the beginning. • But in a structure with cycles, there’s no end that we can start with and compute back from.
  • 53. 1.3. DIGITAL SIGNATURES A digital signature is digital analog to a handwritten signature on paper. Two desirable properties from digital signatures that correspond well to the handwritten signature analogy: • First, only you can make your signature, but anyone who sees it can verify that it’s valid. • Second, we want the signature to be tied to a particular document, so that the signature cannot be used to indicate your agreement or endorsement of a different document. For handwritten signatures, this latter property is analogous to ensuring that somebody can’t take your signature and snip it off one document and glue it to the bottom of another one.
  • 55. Digital signature scheme. A digital signature scheme consists of the following three algorithms: ➔ (sk, pk) := generateKeys(keysize) The generateKeys method takes a key size and generates a key pair. The secret key sk is kept privately and used to sign messages. pk is the public verification key that you give to everybody. Anyone with this key can verify your signature. ➔ sig := sign(sk, message) The sign method takes a message and a secret key, sk, as input and outputs a signature for message under sk. ➔ isValid := verify(pk, message, sig) The verify method takes a message, a signature, and a public key as input. It returns a boolean value, isValid, that will be true if sig is a valid signature for message under public key pk, and false otherwise. We require that the following two properties hold: ➔ Valid signatures must verify: verify(pk, message, sign(sk, message)) == true. ➔ Signatures are existentially unforgeable.
  • 56. 2 properties of digital signature 1. Valid signatures must be verifiable. If I sign a message with sk, my secret key, and someone later tries to validate that signature over that same message using my public key, pk, the signature must validate correctly. This property is a basic requirement for signatures to be useful at all. 2. Unforgeability. The second requirement is that it’s computationally infeasible to forge signatures. That is, an adversary who knows your public key and sees your signatures on some other messages can’t forge your signature on some message for which he has not seen your signature. This unforgeability property is generally formalized in terms of a game that we play with an adversary. The use of games is quite common in cryptographic security proofs.
  • 57. Unforgeability game The attacker and the challenger play the unforgeability game. If the attacker is able to successfully output a signature on a message that he has not previously seen, he wins. If he is unable to do so, the challenger wins, and the digital signature scheme is unforgeable.
  • 58. ➔Use generateKeys to generate a secret signing key and a corresponding public verification key. ➔We give the secret key to the challenger, and we give the public key to both the challenger and the adversary. So the adversary only knows information that’s public, and his mission is to try to forge a message. ➔The challenger knows the secret key. So he can make signatures. The setup of this game matches real-world conditions. A real attacker would likely be able to see valid signatures from his would-be victim on different documents. And the attacker might even be able to manipulate the victim into signing innocuous-looking(something appears harmless or not likely to cause harm or offense) documents if that’s useful to the attacker.
  • 59. To model this in our game, we allow the adversary to get signatures on some documents of his choice, for as long as he wants, as long as the number of guesses is plausible. What we mean by a plausible number of guesses, we would allow the adversary to try 1 million guesses, but not 2^80 guesses. In asymptotic terms, we allow the adversary to try a number of guesses that is a polynomial function of the key size, but no more (e.g., he cannot try exponentially many guesses). Once the adversary is satisfied that he’s seen enough signatures, then he picks some message, M, that he will attempt to forge a signature on. The only restriction on M is that it must be a message for which the adversary has not previously seen a signature (because then he can obviously send back a signature that he has been given). The challenger runs the verify algorithm to determine whether the signature produced by the attacker is a valid signature on M under the public verification key. If it successfully verifies, the adversary wins the game. We say that the signature scheme is unforgeable if and only if, no matter what algorithm the adversary is using, his chance of successfully forging a message is extremely small—so small that we can assume it will never happen in practice.
  • 60. Practical Concerns Several practical things must be done to turn the algorithmic idea into a digital signature mechanism that can be implemented. Need a good source of randomness: The importance of this requirement can’t be overestimated, as bad randomness will make your otherwise-secure algorithm insecure. Message size: In practice, there’s a limit on the message size that you’re able to sign, because real schemes are going to operate on bit strings of limited length. There’s an easy way around this limitation: sign the hash of the message, rather than the message itself. If we use a cryptographic hash function with a 256-bit output, then we can effectively sign a message of any length as long as our signature scheme can sign 256- bit messages. It’s safe to use the hash of the message as a message digest in this manner, since the hash function is collision resistant.
  • 61. Another trick is that you can sign a hash pointer. If you sign a hash pointer, then the signature covers, or protects, the whole structure—not just the hash pointer itself, but everything the chain of hash pointers points to. For example, if you were to sign the hash pointer located at the end of a block chain, the result is that you would effectively be digitally signing the entire block chain.
  • 62. ECDSA -Elliptic Curve Digital Signature Algorithm Bitcoin uses a particular digital signature scheme known as the Elliptic Curve Digital Signature Algorithm (ECDSA). ECDSA is a U.S. government standard, an update of the earlier DSA algorithm adapted to use elliptic curves. These algorithms have received considerable cryptographic analysis over the years and are generally believed to be secure. It is estimated to provide 128 bits of security (i.e., it is as difficult to break this algorithm as it is to perform 2^128 symmetric-key cryptographic operations, such as invoking a hash function). Although this curve is a published standard, it is rarely used outside Bitcoin; Other applications using ECDSA (such as key exchange in the TLS protocol for secure web browsing) typically use the more common secp256r1 curve. This is just a quirk of Bitcoin, as it was chosen by Satoshi in the early specification of the system and is now difficult to change.
  • 63. ECDSA can technically only sign messages 256 bits long, this is not a problem: messages are always hashed before being signed, so effectively any size message can be efficiently signed. With ECDSA, a good source of randomness is essential, because a bad source will likely leak your key. But it’s a quirk of ECDSA that, even if you use bad randomness only when making a signature and you use your perfectly good key, the bad signature will also leak your private key. (For those familiar with DSA, this is a general quirk in DSA and is not specific to the elliptic curve variant.) And then it’s game over: if you leak your private key, an adversary can forge your signature. We thus need to be especially careful about using good randomness in practice. Using a bad source of randomness is a common pitfall of otherwise secure systems.
  • 64. 1.4. PUBLIC KEYS AS IDENTITIES A public key, one of those public verification keys from a digital signature scheme can be equated to an identity of a person or an actor in a system. If you see a message with a signature that verifies correctly under a public key, pk, then you can think of pk as stating the message. You can literally think of a public key as being like an actor, or a party in a system, who can make statements by signing those statements. From this viewpoint, the public key is an identity. For someone to speak for the identity pk, he must know the corresponding secret key, sk.
  • 65. A consequence of treating public keys as identities is that you can make a new identity whenever you want—you simply create a new fresh key pair, sk and pk, via the generateKeys operation in digital signature scheme. This pk is the new public identity that you can use, and sk is the corresponding secret key that only you know and that lets you speak on behalf of the identity pk. In practice, you may use the hash of pk as your identity, since public keys are large. If you do that, then to verify that a message comes from your identity, one will have to check that (1) pk indeed hashes to your identity, and (2) the message verifies under public key pk.
  • 66. Moreover, by default, your public key pk will basically look random, and nobody will be able to uncover your real-world identity by examining pk. (Of course, once you start making statements using this identity, these statements may leak information that allows others to connect pk to your real-world identity. ) You can generate a fresh identity that looks random, like a face in the crowd, and is controlled only by you.
  • 67. Decentralized Identity Management Rather than having a central authority for registering users in a system, you can register as a user by yourself. You don’t need to be issued a username, nor do you need to inform someone that you’re going to be using a particular name. If you want a new identity, you can just generate one at any time, and you can create as many as you want. If you prefer to be known by five different names, no problem! Just make five identities. If you want to be somewhat anonymous for a while, you can create a new identity, use it for just a little while, and then throw it away. All these things are possible with decentralized identity management, and this is the way Bitcoin, in fact, handles identity. These identities are called addresses, in Bitcoin jargon. You’ll frequently hear the term “address” used in the context of Bitcoin and cryptocurrencies, and it’s really just a hash of a public key. It’s an identity that someone made up out of thin air, as part of this decentralized identity management scheme.
  • 68. At first glance, it may seem that decentralized identity management leads to great anonymity and privacy. After all, you can create a randomlooking identity all by yourself without telling anyone your real-world identity. But it’s not that simple. Over time, the identity that you create makes a series of statements. People see these statements and thus know that whoever owns this identity has done a certain series of actions. They can start to connect the dots, using this series of actions to make inferences about your real-world identity. An observer can link together these observations over time and make inferences that lead to such conclusions as, “Gee, this person is acting a lot like Joe. Maybe this person is Joe.” In other words, in Bitcoin you don’t need to explicitly register or reveal your real-world identity, but the pattern of your behavior might itself be identifying. This is the fundamental privacy question in a cryptocurrency like Bitcoin,
  • 69. 1.5. TWO SIMPLE CRYPTOCURRENCIES 1.Goofycoin: the simplest cryptocurrency we can imagine. Two rules of Goofycoin. • The first rule is that a designated entity, Goofy, can create new coins whenever he wants, and these newly created coins belong to him. • To create a coin, Goofy generates a unique coin ID uniqueCoinID that he’s never generated before and constructs the string CreateCoin [uniqueCoinID]. • He then computes the digital signature of this string with his secret signing key. • The string, together with Goofy’s signature, is a coin. • Anyone can verify that the coin contains Goofy’s valid signature of a CreateCoin statement and is therefore a valid coin. • The second rule of Goofycoin is that whoever owns a coin can transfer it to someone else. Transferring a coin is not simply a matter of sending the coin data structure to the recipient—it’s done using cryptographic operations.
  • 70. To summarize, the rules of Goofycoin are: • Goofy can create new coins by simply signing a statement that he’s making a new coin with a unique coin ID. • Whoever owns a coin can pass it on to someone else by signing a statement that says, “Pass on this coin to X” (where X is specified as a public key). • Anyone can verify the validity of a coin by following the chain of hash pointers back to its creation by Goofy, verifying all signatures along the way.
  • 71. Fundamental security problem with Goofycoin Let’s say Alice passed her coin on to Bob by sending her signed statement to Bob but didn’t tell anyone else. She could create another signed statement that pays the same coin to Chuck. To Chuck, it would appear that it is a perfectly valid transaction, and now he’s the owner of the coin. Bob and Chuck would both have valid-looking claims to be the owner of this coin. This is called a double-spending attack—Alice is spending the same coin twice. Intuitively, we know coins are not supposed to work that way.
  • 72. Diagram Goofycoin coin. Shown here is a coin that’s been created (bottom) and spent twice (middle and top).
  • 73. Double-spending attacks are one of the key problems that any cryptocurrency has to solve. Goofycoin does not solve the double-spending attack, and therefore it’s not secure. Goofycoin is simple, and its mechanism for transferring coins is actually similar to that of Bitcoin, but because it is insecure, it is inadequate as a cryptocurrency.
  • 74. 2. Scroogecoin It solves the double-spending problem. The first key idea is that a designated entity called Scrooge publishes an append-only ledger containing the history of all transactions. The append-only property ensures that any data written to this ledger will remain forever in the ledger to defend against double spending by requiring all transactions to be written in the ledger before they are accepted. That way, it will be publicly documented if coins were previously sent to a different owner. Scrooge has a blockchain data structure, which the owner will digitally sign. It consists of a series of data blocks, each with one transaction in it (in practice, as an optimization, we’d really put multiple transactions in the same block, as Bitcoin does.) Each block has the ID of a transaction, the transaction’s contents, and a hash pointer to the previous block. Scrooge digitally signs the final hash pointer, which binds all the data in this entire structure, and he publishes the signature along with the block chain.
  • 75. In Scroogecoin, a transaction only counts if it is in the block chain signed by Scrooge. Anyone can verify that a transaction was endorsed by Scrooge by checking Scrooge’s signature on the block that records the transaction. Scrooge makes sure that he doesn’t endorse a transaction that attempts to double spend an already spent coin. We need a block chain with hash pointers in addition to having Scrooge sign each block to ensure the append-only property. If Scrooge tries to add or remove a transaction or to change an existing transaction, it will affect all following blocks because of the hash pointers. As long as someone is monitoring the latest hash pointer published by Scrooge, the change will be obvious and easy to catch.