Graypaper-0 5 3
Abstract. We present a comprehensive and formal definition of Jam, a protocol combining elements of both Polkadot
and Ethereum. In a single coherent model, Jam provides a global singleton permissionless object environment—much
like the smart-contract environment pioneered by Ethereum—paired with secure sideband computation parallelized
over a scalable node network, a proposition pioneered by Polkadot.
Jam introduces a decentralized hybrid system offering smart-contract functionality structured around a secure and
scalable in-core/on-chain dualism. While the smart-contract functionality implies some similarities with Ethereum’s
paradigm, the overall model of the service offered is driven largely by the underlying architecture of Polkadot.
Jam is permissionless in nature, allowing anyone to deploy code as a service on it for a fee commensurate with the
resources this code utilizes and to induce execution of this code through the procurement and allocation of core-time,
a metric of resilient and ubiquitous computation, somewhat similar to the purchasing of gas in Ethereum. We already
envision a Polkadot-compatible CoreChains service.
(3) Performance: able to perform computation quickly and at low cost.
(4) Coherency: the causal relationship possible between different elements of state and thus how well individual applications may be composed.
(5) Accessibility: negligible barriers to innovation; easy, fast, cheap and permissionless.
As a declared Web3 technology, we make an implicit assumption of the first two items. Interestingly, items 3 and 4 are antagonistic according to an information theoretic principle which we are sure must already exist in some form but are nonetheless unaware of a name for it. For argument’s sake we shall name it size-coherency antagonism.
1.3. Scaling under Size-Coherency Antagonism. Size-coherency antagonism is a simple principle implying that as the state-space of an information system grows, the system necessarily becomes less coherent. It is a direct implication of the principle that causality is limited by speed. The maximum speed allowed by physics is C, the speed of light in a vacuum; however, other information systems may have lower bounds: in biological systems this is largely determined by various chemical processes, whereas in electronic systems it is determined by the speed of electrons in various substances. Distributed software systems will tend to have much lower bounds still, being dependent on a substrate of software, hardware and packet-switched networks of varying reliability.
The argument goes:
(1) The more state a system utilizes for its data-processing, the greater the amount of space this state must occupy.
(2) The more space used, the greater the mean and variance of distances between state-components.
(3) As the mean and variance increase, the time for causal resolution (i.e. all correct implications of an event to be felt) becomes divergent across the system, causing incoherence.
Setting the question of overall security aside for a moment, we can manage incoherence by fragmenting the system into causally-independent subsystems, each of which is small enough to be coherent. In a resource-rich environment, a bacterium may split into two rather than growing to double its size. This pattern is rather a crude means of dealing with incoherency under growth: intra-system processing has low size and total coherence, inter-system processing supports higher overall sizes but without coherence. It is the principle behind meta-networks such as Polkadot, Cosmos and the predominant vision of a scaled Ethereum (all to be discussed in depth shortly). Such systems typically rely on asynchronous and simplistic communication with “settlement areas” which provide a small-scoped coherent state-space to manage specific interactions such as a token transfer.
The present work explores a middle-ground in the antagonism, avoiding the persistent fragmentation of state-space of the system as with existing approaches. We do this by introducing a new model of computation which pipelines a highly scalable, mostly coherent element to a synchronous, fully coherent element. Asynchrony is not avoided, but we bound it to the length of the pipeline and substitute the crude partitioning we see in scalable systems so far with a form of “cache affinity” as is typically seen in multi-cpu systems with a shared ram.
Unlike with snark-based L2-blockchain techniques for scaling, this model draws upon crypto-economic mechanisms and inherits their low-cost and high-performance profiles and averts a bias toward centralization.
1.4. Document Structure. We begin with a brief overview of present scaling approaches in blockchain technology in section 2. In section 3 we define and clarify the notation from which we will draw for our formalisms.
We follow with a broad overview of the protocol in section 4, outlining the major areas including the Polka Virtual Machine (pvm), the consensus protocols Safrole and Grandpa, and the common clock, and build the foundations of the formalism.
We then continue with the full protocol definition split into two parts: firstly the correct on-chain state-transition formula helpful for all nodes wishing to validate the chain state, and secondly, in sections 14 and 19, the honest strategy for the off-chain actions of any actors who wield a validator key.
The main body ends with a discussion of the performance characteristics of the protocol in section 20 and finally concludes in section 21.
The appendix contains various additional material important for the protocol definition including the pvm in appendices A & B, serialization and Merklization in appendices C & D and cryptography in appendices E, G & H. We finish with an index of terms which includes the values of all simple constant terms used in the work in appendix I, and close with the bibliography.
2. Previous Work and Present Trends
In the years since the initial publication of the Ethereum YP, the field of blockchain development has grown immensely. Other than scalability, development has been done around underlying consensus algorithms, smart-contract languages and machines and overall state environments. While interesting, these latter subjects are mostly out of scope of the present work since they generally do not impact underlying scalability.
2.1. Polkadot. In order to deliver its service, Jam co-opts much of the same game-theoretic and cryptographic machinery as Polkadot, known as Elves and described by Jeff Burdges, Cevallos, et al. 2024. However, major differences exist in the actual service offered with Jam, providing an abstraction much closer to the actual computation model generated by the validator nodes its economy incentivizes.
It was a major point of the original Polkadot proposal, a scalable heterogeneous multichain, to deliver high-performance through partition and distribution of the workload over multiple host machines. In doing so it took an explicit position that composability would be lowered. Polkadot’s constituent components, parachains, are, practically speaking, highly isolated in their nature. Though a message passing system (xcmp) exists, it is asynchronous, coarse-grained and practically limited by its reliance on a high-level slowly evolving interaction language xcm.
As such, the composability offered by Polkadot between its constituent chains is lower than that of
JAM: JOIN-ACCUMULATE MACHINE DRAFT 0.5.3 - December 20, 2024 3
Ethereum-like smart-contract systems offering a single and universal object environment and allowing for the kind of agile and innovative integration which underpins their success. Polkadot, as it stands, is a collection of independent ecosystems with only limited opportunity for collaboration, very similar in ergonomics to bridged blockchains though with a categorically different security profile. A technical proposal known as spree would utilize Polkadot’s unique shared-security and improve composability, though blockchains would still remain isolated.
Implementing and launching a blockchain is hard, time-consuming and costly. By its original design, Polkadot limits the clients able to utilize its service to those who are both able to do this and raise a sufficient deposit to win an auction for a long-term slot, one of around 50 at the present time. While not permissioned per se, accessibility is categorically and substantially lower than for smart-contract systems similar to Ethereum.
Enabling as many innovators as possible to participate and interact, both with each other and each other’s user-base, appears to be an important component of success for a Web3 application platform. Accessibility is therefore crucial.
2.2. Ethereum. The Ethereum protocol was formally defined in this paper’s spiritual predecessor, the Yellow Paper, by Wood 2014. This was derived in large part from the initial concept paper by Buterin 2013. In the decade since the YP was published, the de facto Ethereum protocol and public network instance have gone through a number of evolutions, primarily structured around introducing flexibility via the transaction format and the instruction set and “precompiles” (niche, sophisticated bonus instructions) of its scripting core, the Ethereum virtual machine (evm).
Almost one million crypto-economic actors take part in the validation for Ethereum.² Block extension is done through a randomized leader-rotation method where the physical address of the leader is public in advance of their block production.³ Ethereum uses Casper-FFG, introduced by Buterin and Griffith 2019, to determine finality, which with the large validator base finalizes the chain extension around every 13 minutes.
Ethereum’s direct computational performance remains broadly similar to that with which it launched in 2015, with a notable exception that an additional service now allows 1mb of commitment data to be hosted per block (all nodes to store it for a limited period). The data cannot be directly utilized by the main state-transition function, but special functions provide proof that the data (or some subsection thereof) is available. According to Ethereum Foundation 2024b, the present design direction is to improve on this over the coming years by splitting responsibility for its storage amongst the validator base in a protocol known as Dank-sharding.
According to Ethereum Foundation 2024a, the scaling strategy of Ethereum would be to couple this data availability with a private market of roll-ups, sideband computation facilities of various design, with zk-snark-based roll-ups being a stated preference. Each vendor’s roll-up design, execution and operation comes with its own implications.
One might reasonably assume that a diversified market-based approach for scaling via multivendor roll-ups will allow well-designed solutions to thrive. However, there are potential issues facing the strategy. A research report by Sharma 2023 on the level of decentralization in the various roll-ups found a broad pattern of centralization, but notes that work is underway to attempt to mitigate this. It remains to be seen how decentralized they can yet be made.
Heterogeneous communication properties (such as datagram latency and semantic range), security properties (such as the costs for reversion, corruption, stalling and censorship) and economic properties (the cost of accepting and processing some incoming message or transaction) may differ, potentially quite dramatically, between major areas of some grand patchwork of roll-ups by various competing vendors. While the overall Ethereum network may eventually provide some or even most of the underlying machinery needed to do the sideband computation, it is far from clear that there would be a “grand consolidation” of the various properties should such a thing happen. We have not found any good discussion of the negative ramifications of such a fragmented approach.⁴
2.2.1. Snark Roll-ups. While the protocol’s foundation makes no great presuppositions on the nature of roll-ups, Ethereum’s strategy for sideband computation does centre around snark-based roll-ups and as such the protocol is being evolved into a design that makes sense for this. Snarks are the product of an area of exotic cryptography which allows proofs to be constructed to demonstrate to a neutral observer that the purported result of performing some predefined computation is correct. The complexity of the verification of these proofs tends to be sub-linear in the size of the computation to be proven, and the proofs will not give away any of the internals of said computation, nor any dependent witness data on which it may rely.
Zk-snarks come with constraints. There is a trade-off between the proof’s size, verification complexity and the computational complexity of generating it. Non-trivial computation, and especially the sort of general-purpose computation laden with binary manipulation which makes smart-contracts so appealing, is hard to fit into the model of snarks.
To give a practical example, risc-zero (as assessed by Bögli 2024) is a leading project and provides a platform for producing snarks of computation done by a risc-v virtual machine, an open-source and succinct risc machine architecture well-supported by tooling. A recent benchmarking report by Polkavm Project 2024 showed that compared to risc-zero’s own benchmark, proof generation alone takes over 61,000 times as long as simply recompiling and executing even when executing on 32 times as many cores, using 20,000 times as much ram and an additional state-of-the-art gpu. According to hardware
² Practical matters do limit the level of real decentralization. Validator software expressly provides functionality to allow a single instance to be configured with multiple key sets, systematically facilitating a much lower level of actual decentralization than the apparent number of actors, both in terms of individual operators and hardware. Using data collated by Dune and hildobby 2024 on Ethereum 2, one can see one major node operator, Lido, has steadily accounted for almost one-third of the almost one million crypto-economic participants.
³ Ethereum’s developers hope to change this to something more secure, but no timeline is fixed.
⁴ Some initial thoughts on the matter resulted in a proposal by Sadana 2024 to utilize Polkadot technology as a means of helping create a modicum of compatibility between roll-up ecosystems!
rental agents https://cloud-gpus.com/, the cost multiplier of proving using risc-zero is 66,000,000x the cost⁵ to execute using the Polkavm recompiler.
Many cryptographic primitives become too expensive to be practical to use and specialized algorithms and structures must be substituted. Oftentimes they are otherwise suboptimal. In expectation of the use of snarks (such as plonk as proposed by Gabizon, Williamson, and Ciobotaru 2019), the prevailing design of the Ethereum project’s Dank-sharding availability system uses a form of erasure coding centered around polynomial commitments over a large prime field in order to allow snarks to get acceptably performant access to subsections of data. Compared to alternatives, such as a binary field and Merklization in the present work, it leads to a load on the validator nodes orders of magnitude higher in terms of cpu usage.
In addition to their basic cost, snarks present no great escape from decentralization and the need for redundancy, leading to further cost multiples. While the need for some benefits of staked decentralization is averted through their verifiable nature, the need to incentivize multiple parties to do much the same work is a requirement to ensure that a single party not form a monopoly (or several not form a cartel). Proving an incorrect state-transition should be impossible; however, service integrity may be compromised in other ways: a temporary suspension of proof-generation, even if only for minutes, could amount to major economic ramifications for real-time financial applications.
Real-world examples exist of the pit of centralization giving rise to monopolies. One would be the aforementioned snark-based exchange framework; while notionally serving decentralized exchanges, it is in fact centralized with Starkware itself wielding a monopoly over enacting trades through the generation and submission of proofs, leading to a single point of failure: should Starkware’s service become compromised, the liveness of the system would suffer.
It has yet to be demonstrated that snark-based strategies for eliminating the trust from computation will ever be able to compete on a cost-basis with a multi-party crypto-economic platform. All as-yet proposed snark-based solutions are heavily reliant on crypto-economic systems to frame them and work around their issues. Data availability and sequencing are two areas well understood as requiring a crypto-economic solution.
We would note that snark technology is improving and the cryptographers and engineers behind it do expect improvements in the coming years. In a recent article by Thaler 2023 we see some credible speculation that with some recent advancements in cryptographic techniques, slowdowns for proof generation could be as little as 50,000x from regular native execution and much of this could be parallelized. This is substantially better than the present situation, but still several orders of magnitude greater than would be required to compete on a cost-basis with established crypto-economic techniques such as Elves.
2.3. Fragmented Meta-Networks. Directions for general-purpose computation scalability taken by other projects broadly centre around one of two approaches: either what might be termed a fragmentation approach or, alternatively, a centralization approach. We argue that neither approach offers a compelling solution.
The fragmentation approach is heralded by projects such as Cosmos (proposed by Kwon and Buchman 2019) and Avalanche (by Tanana 2019). It involves a system fragmented by networks of a homogenous consensus mechanic, yet staffed by separately motivated sets of validators. This is in contrast to Polkadot’s single validator set and Ethereum’s declared strategy of heterogeneous roll-ups secured partially by the same validator set operating under a coherent incentive framework. The homogeneity of said fragmentation approach allows for reasonably consistent messaging mechanics, helping to present a fairly unified interface to the multitude of connected networks.
However, the apparent consistency is superficial. The networks are trustless only by assuming correct operation of their validators, who operate under a crypto-economic security framework ultimately conjured and enforced by economic incentives and punishments. To do twice as much work with the same levels of security and no special coordination between validator sets, such systems essentially prescribe forming a new network with the same overall levels of incentivization.
Several problems arise. Firstly, there is a similar downside as with Polkadot’s isolated parachains and Ethereum’s isolated roll-up chains: a lack of coherency due to a persistently sharded state preventing synchronous composability.
More problematically, the scaling-by-fragmentation approach, proposed specifically by Cosmos, provides no homogenous security (and therefore trustlessness) guarantees. Validator sets between networks must be assumed to be independently selected and incentivized with no relationship, causal or probabilistic, between the Byzantine actions of a party on one network and potential for appropriate repercussions on another. Essentially, this means that should validators conspire to corrupt or revert the state of one network, the effects may be felt across other networks of the ecosystem.
That this is an issue is broadly accepted, and projects propose for it to be addressed in one of two ways. Firstly, to fix the expected cost-of-attack (and thus level of security) across networks by drawing from the same validator set. The massively redundant way of doing this, as proposed by Cosmos Project 2023 under the name replicated security, would be to require each validator to validate on all networks and for the same incentives and punishments. This is economically inefficient in the cost of security provision as each network would need to independently provide the same level of incentives and punishment-requirements as the most secure with which it wanted to interoperate. This is to ensure the economic proposition remains unchanged for validators and the security proposition remains equivalent for all networks. At the present time, replicated security is not a readily available permissionless service. We might speculate that these punishing economics have something to do with it.
The more efficient approach, proposed by the OmniLedger team, Kokoris-Kogias et al. 2017, would be to
⁵ In all likelihood actually substantially more as this was using low-tier “spare” hardware in consumer units, and our recompiler was unoptimized.
make the validators non-redundant, partitioning them between different networks and periodically, securely and randomly repartitioning them. A reduction in the cost to attack over having them all validate on a single network is implied since there is a chance of having a single network accidentally have a compromising number of malicious validators even with less than this proportion overall. This aside, it presents an effective means of scaling under a basis of weak-coherency.
Alternatively, as in Elves by Jeff Burdges, Cevallos, et al. 2024, we may utilize non-redundant partitioning, combine this with a proposal-and-auditing game which validators play to weed out and punish invalid computations, and then require that the finality of one network be contingent on all causally-entangled networks. This is the most secure and economically efficient solution of the three, since there is a mechanism for being highly confident that invalid transitions will be recognized and corrected before their effect is finalized across the ecosystem of networks. However, it requires substantially more sophisticated logic and their causal-entanglement implies some upper limit on the number of networks which may be added.
2.4. High-Performance Fully Synchronous Networks. Another trend in the recent years of blockchain development has been to make “tactical” optimizations over data throughput by limiting the validator set size or diversity, focusing on software optimizations, requiring a higher degree of coherency between validators, imposing onerous requirements on the hardware which validators must have, or limiting data availability.
The Solana blockchain is underpinned by technology introduced by Yakovenko 2018 and boasts theoretical figures of over 700,000 transactions per second, though according to Ng 2024 the network is only seen processing a small fraction of this. The underlying throughput is still substantially more than most blockchain networks and is owed to various engineering optimizations in favor of maximizing synchronous performance. The result is a highly-coherent smart-contract environment with an api not unlike that of YP Ethereum (albeit using a different underlying vm), but with a near-instant time to inclusion and finality which is taken to be immediate upon inclusion.
Two issues arise with such an approach: firstly, defining the protocol as the outcome of a heavily optimized codebase creates structural centralization and can undermine resilience. Jha 2024 writes “since January 2022, 11 significant outages gave rise to 15 days in which major or partial outages were experienced”. This is an outlier within the major blockchains as the vast majority of major chains have no downtime. There are various causes of this downtime, but they are generally due to bugs found in various subsystems.
Ethereum, at least until recently, provided the most contrasting alternative with its well-reviewed specification, clear research over its crypto-economic foundations and multiple clean-room implementations. It is perhaps no surprise that the network very notably continued largely unabated when a flaw in its most deployed implementation was found and maliciously exploited, as described by Hertig 2016.
The second issue concerns the ultimate scalability of the protocol when it provides no means of distributing workload beyond the hardware of a single machine.
In major usage, both historical transaction data and state would grow impractically. Solana illustrates how much of a problem this can be. Unlike classical blockchains, the Solana protocol offers no solution for the archival and subsequent review of historical data, crucial if the present state is to be proven correct from first principles by a third party. There is little information on how Solana manages this in the literature, but according to Solana Foundation 2023, nodes simply place the data onto a centralized database hosted by Google.⁶
Solana validators are encouraged to install large amounts of ram to help hold its large state in memory (512 gb is the current recommendation according to Solana Labs 2024). Without a divide-and-conquer approach, Solana shows that the level of hardware which validators can reasonably be expected to provide dictates the upper limit on the performance of a totally synchronous, coherent execution model. Hardware requirements represent barriers to entry for the validator set and cannot grow without sacrificing decentralization and, ultimately, transparency.
3. Notational Conventions
Much as in the Ethereum Yellow Paper, a number of notational conventions are used throughout the present work. We define them here for clarity. The Ethereum Yellow Paper itself may be referred to henceforth as the YP.
3.1. Typography. We use a number of different typefaces to denote different kinds of terms. Where a term is used to refer to a value only relevant within some localized section of the document, we use a lower-case roman letter e.g. x, y (typically used for an item of a set or sequence) or e.g. i, j (typically used for numerical indices). Where we refer to a Boolean term or a function in a local context, we tend to use a capitalized roman alphabet letter such as A, F. If particular emphasis is needed on the fact a term is sophisticated or multidimensional, then we may use a bold typeface, especially in the case of sequences and sets.
For items which retain their definition throughout the present work, we use other typographic conventions. Sets are usually referred to with a blackboard typeface, e.g. ℕ refers to all natural numbers including zero. Sets which may be parameterized may be subscripted or be followed by parenthesized arguments. Imported functions, used by the present work but not specifically introduced by it, are written in calligraphic typeface, e.g. ℋ, the Blake2 cryptographic hashing function. For other non-context-dependent functions introduced in the present work, we use upper case Greek letters, e.g. Υ denotes the state transition function.
Values which are not fixed but nonetheless hold some consistent meaning throughout the present work are denoted with lower case Greek letters such as σ, the state
⁶ Earlier node versions utilized the Arweave network, a decentralized data store, but this was found to be unreliable for the data throughput which Solana required.
identifier. These may be placed in bold typeface to denote 3.5. Dictionaries. A dictionary is a possibly partial
that they refer to an abnormally complex value. mapping from some domain into some co-domain in much
the same manner as a regular function. Unlike functions
3.2. Functions and Operators. We define the precedes however, with dictionaries the total set of pairings are
relation to indicate that one term is defined in terms of necessarily enumerable, and we represent them in some
another. E.g. y ≺ x indicates that y may be defined purely data structure as the set of all (key ↦ value) pairs. (In
in terms of x: such data-defined mappings, it is common to name the
values within the domain a key and the values within the
(3.1) y ≺ x ⇐⇒ ∃f ∶ y = f (x) co-domain a value, hence the naming.)
Thus, we define the formalism D⟨K → V⟩ to denote a
The substitute-if-nothing function U is equivalent to dictionary which maps from the domain K to the range
the first argument which is not ∅, or ∅ if no such argu- V. We define a dictionary as a member of the set of all
ment exists: dictionaries D and a set of pairs p = (k ↦ v):
x−1
(3.2) U (a0 , . . . an ) ≡ ax ∶ (ax ≠ ∅ ∨ x = n), ⋀ ai = ∅ (3.3) D ⊂ {{(k ↦ v)}}
i=0
A dictionary’s members must associate at most one
Thus, e.g. U (∅, 1, ∅, 2) = 1 and U (∅, ∅) = ∅. unique value for any key k:
(3.4) ∀d ∈ D ∶ ∀(k ↦ v) ∈ d ∶ ∃!v ′ ∶ (k ↦ v ′ ) ∈ d
3.3. Sets. Given some set s, its power set and cardinality
are denoted as the usual ℘⟨s⟩ and ∣s∣. When forming a This assertion allows us to unambiguously define the
power set, we may use a numeric subscript in order to re- subscript and subtraction operator for a dictionary d:
strict the resultant expansion to a particular cardinality. ⎧
⎪
⎪v if ∃k ∶ (k ↦ v) ∈ d
E.g. ℘⟨{1, 2, 3}⟩2 = {{1, 2}, {1, 3}, {2, 3}}. (3.5) ∀d ∈ D ∶ d[k] ≡ ⎨
⎪
Sets may be operated on with scalars, in which case ⎩∅ otherwise
⎪
the result is a set with the operation applied to each el- (3.6) ∀d ∈ D, s ⊆ K ∶ d ∖ s ≡ {(k ↦ v) ∶ (k ↦ v) ∈ d, k ∈/ s}
ement, e.g. {1, 2, 3} + 3 = {4, 5, 6}. Functions may also Note that when using a subscript, it is an implicit as-
be applied to all members of a set to yield a new set, sertion that the key exists in the dictionary. Should the
but for clarity we denote this with a # superscript, e.g. key not exist, the result is undefined and any block which
f # ({1, 2}) ≡ {f (1), f (2)}. relies on it must be considered invalid.
We denote set-disjointness with the relation ⫰. For- It is typically useful to limit the sets from which the
mally: keys and values may be drawn. Formally, we define a
A ∩ B = ∅ ⇐⇒ A ⫰ B typed dictionary D⟨K → V ⟩ as a set of pairs p of the form
(k ↦ v):
We commonly use ∅ to indicate that some term is
validly left without a specific value. Its cardinality is (3.7) D⟨K → V ⟩ ⊂ D
defined as zero. We define the operation ? such that (3.8) D⟨K → V ⟩ ≡ {{(k ↦ v) ∣ k ∈ K ∧ v ∈ V }}
A? ≡ A ∪ {∅} indicating the same set but with the ad-
dition of the ∅ element. To denote the active domain (i.e. set of keys) of a dic-
The term ∇ is utilized to indicate the unexpected fail- tionary d ∈ D⟨K → V ⟩, we use K(d) ⊆ K and for the range
ure of an operation or that a value is invalid or unexpected. (i.e. set of values), V(d) ⊆ V . Formally:
(We try to avoid the use of the more conventional here (3.9) K(d ∈ D) ≡ { k ∣ ∃v ∶ (k ↦ v) ∈ d }
to avoid confusion with Boolean false, which may be in-
(3.10) V(d ∈ D) ≡ { v ∣ ∃k ∶ (k ↦ v) ∈ d }
terpreted as some successful result in some contexts.)
Note that since the co-domain of V is a set, should dif-
3.4. Numbers. N denotes the set of naturals including ferent keys with equal values appear in the dictionary, the
zero whereas Nn implies a restriction on that set to values set will only contain one such value.
less than n. Formally, N = {0, 1, . . . } and Nn = {x ∣ x ∈ Dictionaries may be combined through the union oper-
N, x < n}. ator ∪, which priorities the right-side operand in the case
Z denotes the set of integers. We denote Za...b to be of a key-collision:
the set of integers within the interval [a, b). Formally, (3.11) ∀d ∈ D, e ∈ D ∶ d ∪ e ≡ (d ∖ K(e)) ∪ e
Za...b = {x ∣ x ∈ Z, a ≤ x < b}. E.g. Z2...5 = {2, 3, 4}. We
denote the offset/length form of this set as Za⋅⋅⋅+b , a short 3.6. Tuples. Tuples are groups of values where each item
form of Za...a+b . may belong to a different set. They are denoted with
It can sometimes be useful to represent lengths of se- parentheses, e.g. the tuple t of the naturals 3 and 5 is de-
quences and yet limit their size, especially when dealing noted t = (3, 5), and it exists in the set of natural pairs
with sequences of octets which must be stored practically. sometimes denoted N ×N, but denoted in the present work
Typically, these lengths can be defined as the set $N_{2^{32}}$. To improve clarity, we denote NL as the set of lengths of octet sequences, which is equivalent to $N_{2^{32}}$.
as (N, N).
We have frequent need to refer to a specific item within a tuple value and as such find it convenient to declare a
We denote the % operator as the modulo operator, name for each item. E.g. we may denote a tuple with two
e.g. 5 % 3 = 2. Furthermore, we may occasionally express a division result as a quotient and remainder with the separator R, e.g. 5 ÷ 3 = 1 R 2.
named natural components a and b as T = ⎧⎩a ∈ N, b ∈ N⎫⎭. We would denote an item t ∈ T through subscripting its name, thus for some t = ⎧⎩a ▸▸ 3, b ▸▸ 5⎫⎭, ta = 3 and tb = 5.
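The dictionary operations of section 3.5 can be illustrated with a minimal sketch. It assumes Python `dict` values as stand-ins for dictionaries; the helper names `keys_of`, `values_of` and `union` are hypothetical and are not part of the protocol definition.

```python
# Illustrative only: Python dicts standing in for the dictionaries of section 3.5.
# The helper names below are hypothetical, chosen for this sketch.

def keys_of(d):
    """K(d): the active domain (set of keys) of dictionary d (equation 3.9)."""
    return set(d.keys())


def values_of(d):
    """V(d): the range of d (equation 3.10); equal values collapse into one."""
    return set(d.values())


def union(d, e):
    """d ∪ e as in equation 3.11: (d ∖ K(e)) ∪ e, so e wins on a key collision."""
    without_collisions = {k: v for k, v in d.items() if k not in keys_of(e)}
    return {**without_collisions, **e}


d = {"a": 1, "b": 2, "c": 1}
e = {"b": 9, "z": 0}

print(keys_of(d))    # the three keys a, b, c
print(values_of(d))  # only two values, since 1 appears under two keys
print(union(d, e))   # the entry for "b" comes from e
```

Swapping the operands, `union(e, d)`, keeps the value 2 for the key "b", which shows that the operator is not commutative when keys collide.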
JAM: JOIN-ACCUMULATE MACHINE DRAFT 0.5.3 - December 20, 2024 7
3.7. Sequences. A sequence is a series of elements with We denote sequence subtraction with a slight modifica-
particular ordering not dependent on their values. The set tion of the set subtraction operator; specifically, some se-
of sequences of elements all of which are drawn from some quence s excepting the left-most element equal to v would
set T is denoted ⟦T ⟧, and it defines a partial mapping be denoted s m {v}.
N → T . The set of sequences containing exactly n ele-
ments each a member of the set T may be denoted ⟦T ⟧n
3.7.3. Boolean values. Bs denotes the set of Boolean
and accordingly defines a complete mapping Nn → T . Sim-
strings of length s, thus Bs = ⟦{⊥, ⊺}⟧s . When dealing
ilarly, sets of sequences of at most n elements and at least
n elements may be denoted ⟦T ⟧∶n and ⟦T ⟧n∶ respectively.
with Boolean values we may assume an implicit equivalence mapping to a bit whereby ⊺ = 1 and ⊥ = 0, thus
Sequences are subscriptable, thus a specific item at in-
B◻ = ⟦N2 ⟧◻ . We use the function bits(Y) ∈ B to de-
dex i within a sequence s may be denoted s[i], or where
note the sequence of bits, ordered with the most signif-
unambiguous, si . A range may be denoted using an ellip-
icant first, which represent the octet sequence Y, thus
sis for example: [0, 1, 2, 3]...2 = [0, 1] and [0, 1, 2, 3]1⋅⋅⋅+2 =
bits([160, 0]) = [1, 0, 1, 0, 0, . . . ].
[1, 2]. The length of such a sequence may be denoted ∣s∣.
We denote modulo subscription as s[i]↺ ≡ s[ i % ∣s∣ ].
We denote the final element x of a sequence s = [..., x] 3.7.4. Octets and Blobs. Y denotes the set of octet strings
through the function last(s) ≡ x. (“blobs”) of arbitrary length. As might be expected, Yx
denotes the set of such sequences of length x. Y$ denotes
3.7.1. Construction. We may wish to define a sequence the subset of Y which are ASCII-encoded strings. Note
in terms of incremental subscripts of other values: that while an octet has an implicit and obvious bijec-
[x0 , x1 , . . . ]n denotes a sequence of n values beginning tive relationship with natural numbers less than 256, and
x0 continuing up to xn−1 . Furthermore, we may also we may implicitly coerce between octet form and natural
wish to define a sequence as elements each of which number form, we do not treat them as exactly equivalent
are a function of their index i; in this case we denote entities. In particular for the purpose of serialization, an
[f (i) ∣ i <− Nn ] ≡ [f (0), f (1), . . . , f (n − 1)]. Thus, when the ordering of elements matters we use <− rather than the unordered notation ∈. The latter may also be written in short form [f (i <− Nn )]. This applies to any set which has an unambiguous ordering, particularly sequences, thus [ i² ∣ i <− [1, 2, 3] ] = [1, 4, 9]. Multiple sequences may be combined, thus [ i ⋅ j ∣ i <− [1, 2, 3], j <− [2, 3, 4] ] = [2, 6, 12].
octet is always serialized to itself, whereas a natural number may be serialized as a sequence of potentially several octets, depending on its magnitude and the encoding variant.
3.7.5. Shuffling. We define the sequence-shuffle function F , originally introduced by Fisher and Yates 1938, with an
As with sets, we use explicit notation f # to denote a efficient in-place algorithm described by Wikipedia 2024.
function mapping over all items of a sequence. This accepts a sequence and some entropy and returns a
Sequences may be constructed from sets or other se- sequence of the same length with the same elements but
quences whose order should be ignored through sequence in an order determined by the entropy. The entropy may
ordering notation [ik ^^ i ∈ X], which is defined to result be provided as either an indefinite sequence of naturals or
in the set or sequence of its argument except that all ele- a hash. For a full definition see appendix F.
ments i are placed in ascending order of the corresponding
value ik .
The key component may be elided in which case it is as- 3.8. Cryptography.
sumed to be ordered by the elements directly; i.e. [i ∈ X] ≡
[i ^^ i ∈ X]. [ik __ i ∈ X] does the same, but excludes any duplicate values of i. E.g. assuming s = [1, 3, 2, 3], then [i __ i ∈ s] = [1, 2, 3] and [−i ^^ i ∈ s] = [3, 3, 2, 1].
3.8.1. Hashing. H denotes the set of 256-bit values typically expected to be arrived at through a cryptographic function, equivalent to Y32 , with H0 being equal to [0]32 .
Sets may be constructed from sequences with the reg- We assume a function H(m ∈ Y) ∈ H denoting the Blake2b
ular set construction syntax, e.g. assuming s = [1, 2, 3, 1], 256-bit hash introduced by Saarinen and Aumasson 2015
then {a ∣ a ∈ s} would be equivalent to {1, 2, 3}. and a function HK (m ∈ Y) ∈ H denoting the Keccak 256-
Sequences of values which themselves have a defined bit hash as proposed by Bertoni et al. 2013 and utilized
ordering have an implied ordering akin to a regular dic- by Wood 2014.
tionary, thus [1, 2, 3] < [1, 2, 4] and [1, 2, 3] < [1, 2, 3, 1]. We may sometimes wish to take only the first x octets
of a hash, in which case we denote Hx (m) ∈ Yx to be the
3.7.2. Editing. We define the sequence concatenation op- first x octets of H(m). The inputs of a hash function
erator ⌢ such that [x0 , x1 , . . . , y0 , y1 , . . . ] ≡ x ⌢ y. For should be expected to be passed through our serialization
sequences of sequences, we define a unary concatenate-all codec E to yield an octet sequence to which the cryp-
operator: Ìx ≡ x0 ⌢ x1 ⌢ . . . . Further, we denote ele- tography may be applied. (Note that an octet sequence
ment concatenation as x i ≡ x ⌢ [i]. We denote the conveniently yields an identity transform.) We may wish
sequence made up of the first n elements of sequence s to to interpret a sequence of octets as some other kind of
be $\overrightarrow{s}^{n} \equiv [s_0, s_1, \dots, s_{n-1}]$, and only the final n elements as $\overleftarrow{s}^{n}$.
We define T x as the transposition of the sequence-of-sequences x, fully defined in equation H.5. We may also apply this to sequences-of-tuples to yield a tuple of sequences.
value with the assumed decoder function $E^{-1}(x \in Y)$. In both cases, we may subscript the transformation function with the number of octets we expect the octet sequence term to have. Thus, $r = E_4(x \in N)$ would assert $x \in N_{2^{32}}$ and $r \in Y_4$, whereas $s = E_8^{-1}(y)$ would assert $y \in Y_8$ and $s \in N_{2^{64}}$.
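The hashing and subscripted codec notation of section 3.8.1 may be sketched as follows. The sketch uses `hashlib.blake2b` with a 32-octet digest for H. The fixed-width encoder assumes little-endian byte order, which is not stated in this section and is an assumption made here; the function names are likewise hypothetical.

```python
import hashlib


def H(m: bytes) -> bytes:
    """Blake2b hash with a 256-bit (32-octet) digest."""
    return hashlib.blake2b(m, digest_size=32).digest()


def H_x(m: bytes, x: int) -> bytes:
    """The first x octets of H(m)."""
    return H(m)[:x]


def E(x: int, n: int) -> bytes:
    """E_n(x): encode a natural into exactly n octets.

    ASSUMPTION: little-endian byte order, chosen for this sketch only.
    The range check mirrors the assertion that x is less than 2^(8n).
    """
    if not 0 <= x < 2 ** (8 * n):
        raise ValueError("value does not fit in the requested number of octets")
    return x.to_bytes(n, "little")


def E_inv(y: bytes, n: int) -> int:
    """E_n^{-1}(y): decode exactly n octets into a natural below 2^(8n)."""
    if len(y) != n:
        raise ValueError("octet sequence has an unexpected length")
    return int.from_bytes(y, "little")


H0 = bytes(32)  # [0]_32, the all-zero hash

print(len(H(b"jam")))        # 32
print(H_x(b"jam", 4).hex())  # first four octets of the hash
print(E(1, 4))               # b'\x01\x00\x00\x00'
print(E_inv(E(2 ** 32 - 1, 4), 4) == 2 ** 32 - 1)  # True
```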
3.8.2. Signing Schemes. Ek ⟨m⟩ ⊂ Y64 is the set of valid data external to the system and thus said to be extrinsic,
Ed25519 signatures, defined by Josefsson and Liusvaara E:
2017, made through knowledge of a secret key whose pub-
(4.2) B ≡ (H, E)
lic key counterpart is k ∈ Y32 and whose message is m. To
aid readability, we denote the set of valid public keys HE . (4.3) E ≡ (ET , ED , EP , EA , EG )
We use YBLS ⊂ Y144 to denote the set of public keys for
The header is a collection of metadata primarily con-
the bls signature scheme, described by Boneh, Lynn, and
cerned with cryptographic references to the blockchain an-
Shacham 2004, on curve bls12-381 defined by Hopwood
cestors and the operands and result of the present tran-
et al. 2020.
sition. As an immutable known a priori, it is assumed
We denote the set of valid Bandersnatch public keys as
to be available throughout the functional components of
block transition. The extrinsic data is split into its several portions:
tickets: Tickets, used for the mechanism which manages the selection of validators for the permissioning of block authoring. This component is
HB , defined in appendix G. $F^{m \in Y}_{k \in H_B}\langle x \in Y \rangle \subset Y_{96}$ is the set of valid singly-contextualized signatures utilizing the secret counterpart to the public key k, some context x and message m.
$\bar{F}^{m \in Y}_{r \in Y_R}\langle x \in Y \rangle \subset Y_{784}$, meanwhile, is the set of valid Bandersnatch Ringvrf deterministic singly-contextualized
proofs of knowledge of a secret within some set of secrets denoted ET .
identified by some root in the set of valid roots YR ⊂ Y144 . preimages: Static data which is presently being re-
We denote O(s ∈ ⟦HB ⟧) ∈ YR to be the root specific to the quested to be available for workloads to be able
set of public key counterparts s. A root implies a specific to fetch on demand. This is denoted EP .
set of Bandersnatch key pairs, knowledge of one of the reports: Reports of newly completed workloads
secrets would imply being capable of making a unique, whose accuracy is guaranteed by specific valida-
valid—and anonymous—proof of knowledge of a unique tors. This is denoted EG .
secret within the set. availability: Assurances by each validator concern-
Both the Bandersnatch signature and Ringvrf proof ing which of the input data of workloads they have
strictly imply that a member utilized their secret key in correctly received and are storing locally. This is
combination with both the context x and the message m; denoted EA .
the difference is that the member is identified in the former disputes: Information relating to disputes between
and is anonymous in the latter. Furthermore, both define validators over the validity of reports. This is de-
a vrf output, a high entropy hash influenced by x but not by m, formally denoted $Y(\bar{F}^{m}_{r}\langle x \rangle) \subset H$ and $Y(F^{m}_{k}\langle x \rangle) \subset H$.
validators over the validity of reports. This is denoted ED .
4.2. The State. Our state may be logically partitioned into several largely independent segments which can both
We define the function S as the signature function, such that $S_k(m) \in F^{m}_{k}\langle [] \rangle \cup E_k\langle m \rangle$. We assert that the ability
to compute a result for this function relies on knowledge help avoid visual clutter within our protocol description
of a secret key. and provide formality over elements of computation which
may be simultaneously calculated (i.e. parallelized). We
therefore pronounce an equivalence between σ (some com-
4. Overview plete state) and a tuple of partitioned segments of that
As in the Yellow Paper, we begin our formalisms by state:
recalling that a blockchain may be defined as a pairing
(4.4) σ ≡ (α, β, γ, δ, η, ι, κ, λ, ρ, τ, φ, χ, ψ, π, ϑ, ξ)
of some initial state together with a block-level state-
transition function. The latter defines the posterior state In summary, δ is the portion of state dealing with ser-
given a pairing of some prior state and a block of data vices, analogous in Jam to the Yellow Paper’s (smart con-
applied to it. Formally, we say: tract) accounts, the only state of the YP’s Ethereum. The
identities of services which hold some privileged status are
(4.1) σ ′ ≡ Υ(σ, B)
tracked in χ.
Where σ is the prior state, σ ′ is the posterior state, B is Validators, who are the set of economic actors uniquely
some valid block and Υ is our block-level state-transition privileged to help build and maintain the Jam chain, are
function. identified within κ, archived in λ and enqueued from ι. All
Broadly speaking, Jam (and indeed blockchains in gen- other state concerning the determination of these keys is
eral) may be defined simply by specifying Υ and some gen- held within γ. Note this is a departure from the YP proof-
esis state σ 0 .7 We also make several additional assump- of-work definitions which were mostly stateless, and this
tions of agreed knowledge: a universally known clock, and set was not enumerated but rather limited to those with
the practical means of sharing data with other systems sufficient compute power to find a partial hash-collision in
operating under the same consensus rules. The latter two the sha2-256 cryptographic hash function. An on-chain
were both assumptions silently made in the YP. entropy pool is retained in η.
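Equation 4.1 describes a chain as a genesis state folded through a state-transition function, one block at a time. The sketch below shows only that shape. The state layout, the block fields and the body of `upsilon` are toy assumptions invented for illustration and bear no relation to the real definition of Υ.

```python
from functools import reduce

# Illustrative only: a toy state and a toy transition function.
# The field names "height", "total" and "parent_height" are hypothetical.


def upsilon(state, block):
    """Toy stand-in for Υ: derive the posterior state from (prior state, block)."""
    header, extrinsic = block  # a block is a pairing of header and extrinsic data
    if header["parent_height"] != state["height"]:
        raise ValueError("block does not extend the given prior state")
    return {
        "height": state["height"] + 1,
        "total": state["total"] + sum(extrinsic),
    }


genesis = {"height": 0, "total": 0}
blocks = [
    ({"parent_height": 0}, [1, 2]),
    ({"parent_height": 1}, [3]),
]

# The head state is the genesis state folded through every block in order.
head_state = reduce(upsilon, blocks, genesis)
print(head_state)  # {'height': 2, 'total': 6}
```

Because the function is deterministic, any two parties starting from the same genesis state and applying the same blocks arrive at the same head state.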
Our state also tracks two aspects of each core: α, the
4.1. The Block. To aid comprehension and definition of authorization requirement which work done on that core
our protocol, we partition as many of our terms as possible must satisfy at the time of being reported on-chain, to-
into their functional components. We begin with the block gether with the queue which fills this, φ; and ρ, each of the
B which may be restated as the header H and some input cores’ currently assigned report, the availability of whose
7. Practically speaking, blockchains sometimes make assumptions of some fraction of participants whose behavior is simply honest, and
not provably incorrect nor otherwise economically disincentivized. While the assumption may be reasonable, it must nevertheless be stated
apart from the rules of state-transition.
work-package must yet be assured by a super-majority of must nonetheless attempt to minimize it. We therefore
validators. strive to ensure that:
Finally, details of the most recent blocks and timeslot (1) It be generally unlikely for two heads to form.
index are tracked in β and τ respectively, work-reports (2) When two heads do form they be quickly resolved
which are ready to be accumulated and work-packages into a single head.
which were recently accumulated are tracked in ϑ and ξ (3) It be possible to identify a block not much older
respectively; judgments are tracked in ψ and validator statistics are tracked in π.
than the head which we can be extremely confident will form part of the blockchain’s history in
4.2.1. State Transition Dependency Graph. Much as in perpetuity. When a block becomes identified as
the YP, we specify Υ as the implication of formulating such we call it finalized and this property natu-
all items of posterior state in terms of the prior state and rally extends to all of its ancestor blocks.
block. To aid the architecting of implementations which These goals are achieved through a combination of
parallelize this computation, we minimize the depth of two consensus mechanisms: Safrole, which governs the
the dependency graph where possible. The overall depen- (not-necessarily forkless) extension of the blockchain; and
dency graph is specified here: Grandpa, which governs the finalization of some extension
(4.5) τ′ ≺ H into canonical history. Thus, the former delivers point 1,
the latter delivers point 3 and both are important for de-
(4.6) β † ≺ (H, β) livering point 2. We describe these portions of the protocol
(4.7) β ′ ≺ (H, EG , β † , C) in detail in sections 6 and 19 respectively.
While Safrole limits forks to a large extent (through
(4.8) γ ′ ≺ (H, τ, ET , γ, ι, η ′ , κ′ , ψ ′ )
cryptography, economics and common-time, below), there
(4.9) η ′ ≺ (H, τ, η) may be times when we wish to intentionally fork since we
(4.10) κ′ ≺ (H, τ, κ, γ) have come to know that a particular chain extension must
be reverted. In regular operation this should never hap-
(4.11) λ′ ≺ (H, τ, λ, κ) pen, however we cannot discount the possibility of mali-
(4.12) ψ ′ ≺ (ED , ψ) cious or malfunctioning nodes. We therefore define such
an extension as any which contains a block in which data
(4.13) ρ† ≺ (ED , ρ)
is reported which any other block’s state has tagged as
(4.14) ρ‡ ≺ (EA , ρ† ) invalid (see section 10 on how this is done). We further
require that Grandpa not finalize any extension which con-
(4.15) ρ′ ≺ (EG , ρ‡ , κ, τ ′ )
tains such a block. See section 19 for more information
(4.16) W∗ ≺ (EA , ρ′ ) here.
(4.17) (ϑ′ , ξ ′ , δ ‡ , χ′ , ι′ , φ′ , C) ≺ (W∗ , ϑ, ξ, δ, χ, ι, φ)
4.4. Time. We presume a pre-existing consensus over
(4.18) δ ′ ≺ (EP , δ ‡ , τ ′ )
time specifically for block production and import. While
(4.19) α′ ≺ (H, EG , φ′ , α) this was not an assumption of Polkadot, pragmatic and
(4.20) π ′ ≺ (EG , EP , EA , ET , τ, κ′ , π, H) resilient solutions exist including the ntp protocol and
network. We utilize this assumption in only one way: we
The only synchronous entanglements are visible require that blocks be considered temporarily invalid if
through the intermediate components superscripted with their timeslot is in the future. This is specified in detail
a dagger and defined in equations 4.6, 4.18 and 4.14. The in section 6.
latter two mark a merge and join in the dependency graph Formally, we define the time in terms of seconds passed
and, concretely, imply that the availability extrinsic may since the beginning of the Jam Common Era, 1200 UTC
be fully processed and accumulation of work happen be- on January 1, 2025.8 Midday UTC is selected to ensure
fore the preimage lookup extrinsic is folded into state. that all major timezones are on the same date at any exact
4.3. Which History? A blockchain is a sequence of 24-hour multiple from the beginning of the common era.
blocks, each cryptographically referencing some prior Formally, this value is denoted T .
block by including a hash of its header, all the way back
to some first block which references the genesis header. 4.5. Best block. Given the recognition of a number of
We already presume consensus over this genesis header valid blocks, it is necessary to determine which should be
H0 and the state it represents already defined as σ 0 . treated as the “best” block, by which we mean the most
By defining a deterministic function for deriving a single posterior state for any (valid) combination of prior
recent block we believe will ultimately be within all future Jam chains. The simplest and least risky means of
state and block, we are able to define a unique canonical doing this would be to inspect the Grandpa finality mech-
state for any given block. We generally call the block with anism which is able to provide a block for which there is a
the most ancestors the head and its state the head state. very high degree of confidence it will remain an ancestor
It is generally possible for two blocks to be valid and yet to any future chain head.
reference the same prior block in what is known as a fork. However, in reducing the risk of the resulting block ul-
This implies the possibility of two different heads, each timately not being within the canonical chain, Grandpa
with their own state. While we know of no way to strictly will typically return a block some small period older than
preclude this possibility, for the system to be useful we the most recently authored block. (Existing deployments
8. 1,735,732,800 seconds after the Unix Epoch.
suggest around 1-2 blocks in the past under regular oper- as Rust and C++. Furthermore, the instruction set sim-
ation.) There are often circumstances when we may wish plicity which risc-v and pvm share, together with the
to have less latency at the risk of the returned block not register size (64-bit), active number (13) and endianness
ultimately forming a part of the future canonical chain. (little) make it especially well-suited for creating efficient
E.g. we may be in a position of being able to author a recompilers on to common hardware architectures.
block, and we need to decide what its parent should be. The pvm is fully defined in appendix A, but for contex-
Alternatively, we may care to speculate about the most tualization we will briefly summarize the basic invocation
recent state for the purpose of providing information to a function Ψ which computes the resultant state of a pvm
downstream application reliant on the state of Jam. instance initialized with some registers (⟦NR ⟧13 ) and ram
In these cases, we define the best block as the head of (M) and has executed for up to some amount of gas (NG ),
the best chain, itself defined in section 19. a number of approximately time-proportional computa-
tional steps:
(4.22) $\Psi \colon (Y,\ N_R,\ N_G,\ ⟦N_R⟧_{13},\ M) \to (\{∎, ☇, ∞\} \cup \{F, h̵\} \times N_R,\ N_R,\ Z_G,\ ⟦N_R⟧_{13},\ M)$
4.6. Economics. The present work describes a crypto-economic system, i.e. one combining elements of cryptography, economics and game theory to deliver a self-sovereign digital service. In order to codify and manipulate economic incentives we define a token which is
We refer to the time-proportional computational steps
native to the system, which we will simply call tokens in as gas (much like in the YP) and limit it to a 64-bit quan-
the present work. tity. We may use either NG or ZG to bound it, the first as
A value of tokens is generally referred to as a balance, a prior argument since it is known to be positive, the latter
and such a value is said to be a member of the set of balances, NB , which is exactly equivalent to the set of naturals less than $2^{64}$ (i.e. 64-bit unsigned integers in coding parlance). Formally:
(4.21) $N_B \equiv N_{2^{64}}$
as a result where a negative value indicates an attempt to execute beyond the gas limit. Within the context of the pvm, ϱ ∈ NG is typically used to denote gas.
(4.23) $Z_G \equiv Z_{-2^{63} \dots 2^{63}}, \quad N_G \equiv N_{2^{64}}, \quad N_R \equiv N_{2^{64}}$
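The integer sets of equations 4.21 and 4.23 amount to range checks on ordinary integers. The following minimal sketch assumes Python integers as stand-ins; all function names are hypothetical. The supply figure at the end assumes the denomination of 10^9 tokens that this section presumes.

```python
# Illustrative only: range predicates for the sets N_B, N_G, N_R and Z_G.

U64_LIMIT = 2 ** 64


def in_N_B(x: int) -> bool:
    """Balances: naturals less than 2^64."""
    return 0 <= x < U64_LIMIT


def in_N_G(x: int) -> bool:
    """Gas as a prior argument: naturals less than 2^64."""
    return 0 <= x < U64_LIMIT


def in_Z_G(x: int) -> bool:
    """Gas as a result: integers in the half-open interval [-2^63, 2^63)."""
    return -(2 ** 63) <= x < 2 ** 63


def remaining_gas(limit: int, used: int) -> int:
    """Signed remainder; a negative value marks an attempt to exceed the limit."""
    if not in_N_G(limit):
        raise ValueError("gas limit must be a natural less than 2^64")
    return limit - used


print(remaining_gas(100, 130))  # -30, i.e. execution ran past the limit

# ASSUMPTION: one named denomination equals 10^9 tokens.
MAX_WHOLE_DENOMINATIONS = U64_LIMIT // 10 ** 9
print(MAX_WHOLE_DENOMINATIONS)  # 18446744073, i.e. around 18 × 10^9
```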
Though unimportant for the present work, we presume that there be a standard named denomination for $10^{9}$ tokens. This differs from Ethereum (which uses a denomination of $10^{18}$), Polkadot (which uses a denomination of $10^{10}$) and Polkadot’s experimental cousin Kusama (which uses $10^{12}$).
The fact that balances are constrained to being less than $2^{64}$ implies that there may never be more than around $18 \times 10^{9}$ tokens (each divisible into portions of $10^{-9}$) within Jam. We would expect that the total number of tokens ever issued will be a substantially smaller amount than this.
We further presume that a number of constant prices
It is left as a rather important implementation detail to ensure that the time taken to compute the function Ψ(. . . , ϱ, . . . ) is bounded by a maximum approximately proportional to the value of ϱ, regardless of other operands.
The pvm is a very simple risc register machine and as such has 13 registers, each of which is a 64-bit quantity, denoted as NR , a natural less than $2^{64}$.⁹ Within the context of the pvm, ω ∈ ⟦NR ⟧13 is typically used to denote the registers.
(4.24) $M \equiv ⎧⎩ V \in Y_{2^{32}},\ A \in ⟦\{W, R, ∅\}⟧_{p} ⎫⎭, \quad p = \frac{2^{32}}{Z_P}$
(4.25) $Z_P = 2^{12}$
single machine. In the present work we expect the network to be able to do upwards of 300 times the amount of computation in-core as that which could be performed by a single machine running the virtual machine at full speed.
● A page fault (attempt to access some address in ram which is not accessible), F. This includes the address of the page at fault.
● An attempt at progressing a host-call, h̵. This allows for the progression and integration of a
context-dependent state-machine beyond the reg- Since in-core consensus is not evaluated or verified by
ular pvm. all nodes on the network, we must find other ways to be-
The full definition follows in appendix A. come adequately confident that the results of the com-
putation are correct, and any data used in determining
4.8. Epochs and Slots. Unlike the YP Ethereum with this is available for a practical period of time. We do
its proof-of-work consensus system, Jam defines a proof-of- this through a crypto-economic game of three stages called
authority consensus mechanism, with the authorized val- guaranteeing, assuring, auditing and, potentially, judging.
idators presumed to be identified by a set of public keys Respectively, these attach a substantial economic cost to
and decided by a staking mechanism residing within some the invalidity of some proposed computation; then a suffi-
system hosted by Jam. The staking system is out of scope cient degree of confidence that the inputs of the computa-
for the present work; instead there is an api which may tion will be available for some period of time; and finally,
be utilized to update these keys, and we presume that a sufficient degree of confidence that the validity of the
whatever logic is needed for the staking system will be computation (and thus enforcement of the first guaran-
introduced and utilize this api as needed. tee) will be checked by some party who we can expect to
The Safrole mechanism subdivides time following gen- be honest.
esis into fixed length epochs with each epoch divided into All execution done in-core must be reproducible by any
E = 600 timeslots each of uniform length P = 6 seconds, node synchronized to the portion of the chain which has
giving an epoch period of E ⋅ P = 3600 seconds or one hour. This six-second slot period represents the minimum
been finalized. Execution done in-core is therefore designed to be as stateless as possible. The requirements for
time between Jam blocks, and through Safrole we aim doing it include only the refinement code of the service,
to strictly minimize forks arising both due to contention the code of the authorizer and any preimage lookups it
within a slot (where two valid blocks may be produced carried out during its execution.
within the same six-second period) and due to contention When a work-report is presented on-chain, a specific
over multiple slots (where two valid blocks are produced block known as the lookup-anchor is identified. Cor-
in different time slots but with the same parent). rect behavior requires that this must be in the finalized
Formally when identifying a timeslot index, we use a chain and reasonably recent, both properties which may
be proven and thus are acceptable for use within a consensus protocol.
We describe this pipeline in detail in the relevant sections later.
natural less than $2^{32}$ (in compute parlance, a 32-bit unsigned integer) indicating the number of six-second timeslots from the Jam Common Era. For use in this context we introduce the set NT :
(4.28) $N_T \equiv N_{2^{32}}$
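The arithmetic behind timeslots can be checked with a short sketch. It takes the Jam Common Era as 1200 UTC on January 1, 2025, with P = 6 and E = 600 as given in this section. The function names are hypothetical, and the assumption that epochs are aligned to timeslot zero is made here for illustration only.

```python
from datetime import datetime, timedelta, timezone

JCE = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)  # Jam Common Era
P = 6    # seconds per timeslot
E = 600  # timeslots per epoch


def timeslot_index(now: datetime) -> int:
    """Number of whole six-second timeslots elapsed since the Jam Common Era."""
    if now < JCE:
        raise ValueError("time precedes the Jam Common Era")
    index = (now - JCE) // timedelta(seconds=P)
    if index >= 2 ** 32:
        raise ValueError("timeslot index does not fit in 32 bits")
    return index


def epoch_and_phase(slot: int):
    """ASSUMPTION: epochs are aligned to slot zero, so this is a plain divmod."""
    return divmod(slot, E)


print(int(JCE.timestamp()))                        # 1735732800
print(timeslot_index(JCE + timedelta(hours=1)))    # 600, exactly one epoch
print(epoch_and_phase(1205))                       # (2, 5)

# The first instant whose timeslot index no longer fits in 32 bits.
end_of_range = JCE + timedelta(seconds=(2 ** 32) * P)
print(end_of_range.year, end_of_range.month)       # 2841 8
```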
This implies that the lifespan of the proposed protocol
takes us to mid-August of the year 2841, which with the
current course that humanity is on should be ample. 4.9.2. On Services and Accounts. In YP Ethereum, we
have two kinds of accounts: contract accounts (whose actions are defined deterministically based on the account’s associated code and state) and simple accounts which act as gateways for data to arrive into the world state and are
4.9. The Core Model and Services. In the Ethereum Yellow Paper, when defining the state machine which is held in consensus amongst all network participants, it is presumed that all machines maintaining the full
controlled by knowledge of some secret key. In Jam, all
network state and contributing to its enlargement—or, at
accounts are service accounts. Like Ethereum’s contract
least, hoping to—evaluate all computation. This “every-
accounts, they have an associated balance, some code and
body does everything” approach might be called the on-
state. Since they are not controlled by a secret key, they
chain consensus model. It is unfortunately not scalable,
do not need a nonce.
since the network can only process as much logic in consensus as it could hope any individual node is capable
The question then arises: how can external data be fed
into the world state of Jam? And, by extension, how does
of doing itself within any given period of time.
overall payment happen if not by deducting the account
4.9.1. In-core Consensus. In the present work, we achieve balances of those who sign transactions? The answer to
scalability of the work done through introducing a sec- the first lies in the fact that our service definition actually
ond model for such computation which we call the in-core includes multiple code entry-points, one concerning refine-
consensus model. In this model, and under normal cir- ment and the other concerning accumulation. The former
cumstances, only a subset of the network is responsible acts as a sort of high-performance stateless processor, able
for actually executing any given computation and assur- to accept arbitrary input data and distill it into some much
ing the availability of any input data it relies upon to smaller amount of output data. The latter code is more
others. By doing this and assuming a certain amount of stateful, providing access to certain on-chain functionality
computational parallelism within the validator nodes of including the possibility of transferring balance and invok-
the network, we are able to scale the amount of computa- ing the execution of code in other services. Being stateful
tion done in consensus commensurate with the size of the this might be said to more closely correspond to the code
network, and not with the computational power of any of an Ethereum contract account.
To understand how Jam breaks up its service code is P is thus defined as being the mapping from one block
to understand Jam’s fundamental proposition of general- header to its parent block header. With P , we are able to
ity and scalability. All data extrinsic to Jam is fed into define the set of ancestor headers A:
the refinement code of some service. This code is not
executed on-chain but rather is said to be executed in- (5.3) h ∈ A ⇔ h = H ∨ (∃i ∈ A ∶ h = P (i))
core. Thus, whereas the accumulator code is subject to
the same scalability constraints as Ethereum’s contract We only require implementations to store headers of
accounts, refinement code is executed off-chain and sub- ancestors which were authored in the previous L = 24 hours
ject to no such constraints, enabling Jam services to scale of any block B they wish to validate.
dramatically both in the size of their inputs and in the The extrinsic hash is a Merkle commitment to the
complexity of their computation. block’s extrinsic data, taking care to allow for the possibil-
While refinement and accumulation take place in con- ity of reports to individually have their inclusion proven.
sensus environments of a different nature, both are exe- Given any block B = (H, E), then formally:
cuted by the members of the same validator set. The Jam
protocol through its rewards and penalties ensures that (5.4) Hx ∈ H , Hx ≡ H(E(H# (a)))
code executed in-core has a comparable level of crypto- (5.5) where a = [ET (ET ), EP (EP ), g, EA (EA ), ED (ED )]
economic security to that executed on-chain, leaving the
(5.6) and g = E(↕[E(H(w), E4 (t), ↕a) ∣ (w, t, a) <− EG ])
primary difference between them one of scalability versus
synchroneity. A block may only be regarded as valid once the time-
As for managing payment, Jam introduces a new ab- slot index Ht is in the past. It is always strictly greater
straction mechanism based around Polkadot’s Agile Core- than that of its parent. Formally:
time. Within the Ethereum transactive model, the mecha-
nism of account authorization is somewhat combined with (5.7) Ht ∈ N T , P (H)t < Ht ∧ Ht ⋅ P ≤ T
the mechanism of purchasing blockspace, both relying on
a cryptographic signature to identify a single “transactor” Blocks considered invalid by this rule may become valid
account. In Jam, these are separated and there is no such as T advances.
concept of a “transactor”. The parent state root Hr is the root of a Merkle trie
In place of Ethereum’s gas model for purchasing and composed by the mapping of the prior state’s Merkle root,
measuring blockspace, Jam has the concept of coretime, which by definition is also the parent block’s posterior
which is prepurchased and assigned to an authorization state. This is a departure from both Polkadot and the Yel-
agent. Coretime is analogous to gas insofar as it is the low Paper’s Ethereum, in both of which a block’s header
underlying resource which is being consumed when utiliz- contains the posterior state’s Merkle root. We do this
ing Jam. Its procurement is out of scope in the present to facilitate the pipelining of block computation and in
work and is expected to be managed by a system parachain particular of Merklization.
operating within a parachains service itself blessed with a
number of cores for running such system services. The au- (5.8) Hr ∈ H , Hr ≡ Mσ (σ)
thorization agent allows external actors to provide input
to a service without necessarily needing to identify them- We assume the state-Merklization function Mσ is ca-
selves as with Ethereum’s transaction signatures. They pable of transforming our state σ into a 32-octet commit-
are discussed in detail in section 8. ment. See appendix D for a full definition of these two
functions.
5. The Header All blocks have an associated public key to identify the
We must first define the header in terms of its com- author of the block. We identify this as an index into the
ponents. The header comprises a parent hash and prior posterior current validator set κ′ . We denote the Bander-
state root (Hp and Hr ), an extrinsic hash Hx , a time-slot snatch key of the author as Ha though note that this is
index Ht , the epoch, winning-tickets and offenders mark- merely an equivalence, and is not serialized as part of the
ers He , Hw and Ho , a Bandersnatch block author index header.
Hi and two Bandersnatch signatures; the entropy-yielding
vrf signature Hv and a block seal Hs . Headers may be (5.9) Hi ∈ N V , Ha ≡ κ′ [Hi ]
serialized to an octet sequence with and without the latter
seal component using E and EU respectively. Formally: 5.1. The Markers. If not ∅, then the epoch marker
(5.1) H ≡ (Hp , Hr , Hx , Ht , He , Hw , Ho , Hi , Hv , Hs ) specifies key and entropy relevant to the following epoch
in case the ticket contest does not complete adequately
The blockchain is a sequence of blocks, each crypto-
(a very much unexpected eventuality). Similarly, the
graphically referencing some prior block by including a
winning-tickets marker, if not ∅, provides the series of
hash derived from the parent’s header, all the way back to
600 slot sealing “tickets” for the next epoch (see the next
some first block which references the genesis header. We
section). Finally, the offenders marker is the sequence of
already presume consensus over this genesis header H0
Ed25519 keys of newly misbehaving validators, to be fully
and the state it represents defined as σ 0 .
explained in section 10. Formally:
Excepting the Genesis header, all block headers H have
(5.10)
an associated parent header, whose hash is Hp . We denote
He ∈ ⎧ ⎫
⎩H, H, ⟦HB ⟧V ⎭? , Hw ∈ ⟦C⟧E ? , Ho ∈ ⟦HE ⟧
the parent header H− = P (H):
(5.2) Hp ∈ H , Hp ≡ H(E(P (H))) The terms are fully defined in sections 6.6 and 10.
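The timeslot rule of equation 5.7 lends itself to a short illustration. The following Python sketch checks that a header's slot index is strictly greater than its parent's and that the slot has already begun. It is only a sketch under stated assumptions: the function name is hypothetical, the slot period P is taken as six seconds (from the six-second timeslot described in section 6), and T is assumed to be the current time in seconds counted from the same origin as the slot index.

```python
# Assumption: P, the slot period in seconds, taken from the six-second timeslot.
SLOT_PERIOD_SECONDS = 6


def timeslot_is_valid(parent_slot: int, slot: int, now_seconds: int) -> bool:
    """Sketch of equation 5.7: P(H)_t < H_t and H_t * P <= T.

    parent_slot -- P(H)_t, the parent header's time-slot index
    slot        -- H_t, this header's time-slot index
    now_seconds -- T, assumed to be seconds since the slot-index origin
    """
    if parent_slot < 0 or slot < 0:
        raise ValueError("time-slot indices are natural numbers")
    strictly_after_parent = parent_slot < slot
    slot_has_begun = slot * SLOT_PERIOD_SECONDS <= now_seconds
    return strictly_after_parent and slot_has_begun
```

A header rejected only because its slot has not yet begun passes the same check once `now_seconds` has advanced far enough, mirroring the remark that such blocks may become valid as T advances.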
JAM: JOIN-ACCUMULATE MACHINE DRAFT 0.5.3 - December 20, 2024 13
6. Block Production and Chain Growth
As mentioned earlier, Jam is architected around a hybrid consensus mechanism, similar in nature to that of Polkadot’s Babe/Grandpa hybrid. Jam’s block production mechanism, termed Safrole after the novel Sassafras production mechanism of which it is a simplified variant, is a stateful system rather more complex than the Nakamoto consensus described in the YP.
The chief purpose of a block production consensus mechanism is to limit the rate at which new blocks may be authored and, ideally, preclude the possibility of “forks”: multiple blocks with equal numbers of ancestors.
To achieve this, Safrole limits the possible author of any block within any given six-second timeslot to a single key-holder from within a prespecified set of validators. Furthermore, under normal operation, the identity of the key-holder of any future timeslot will have a very high degree of anonymity. As a side effect of its operation, we can generate a high-quality pool of entropy which may be used by other parts of the protocol and is accessible to services running on it.
Because of its tightly scoped role, the core of Safrole’s state, γ, is independent of the rest of the protocol. It interacts with other portions of the protocol through ι and κ, the prospective and active sets of validator keys respectively; τ, the most recent block’s timeslot; and η, the entropy accumulator.
The Safrole protocol generates, once per epoch, a sequence of E sealing keys, one for each potential block within a whole epoch. Each block header includes its timeslot index Ht (the number of six-second periods since the Jam Common Era began) and a valid seal signature Hs, signed by the sealing key corresponding to the timeslot within the aforementioned sequence. Each sealing key is in fact a pseudonym for some validator which was agreed the privilege of authoring a block in the corresponding timeslot.
In order to generate this sequence of sealing keys in regular operation, and in particular to do so without making public the correspondence relation between them and the validator set, we use a novel cryptographic structure known as a Ringvrf, utilizing the Bandersnatch curve. Bandersnatch Ringvrf allows for a proof to be provided which simultaneously guarantees the author controlled a key within a set (in our case validators), and secondly provides an output, an unbiasable deterministic hash giving us a secure verifiable random function (vrf). This anonymous and secure random output is a ticket and validators’ tickets with the best score define the new sealing keys allowing the chosen validators to exercise their privilege and create a new block at the appropriate time.
6.1. Timekeeping. Here, τ defines the most recent block’s slot index, which we transition to the slot index as defined in the block’s header:
(6.1) τ ∈ NT , τ′ ≡ Ht
We track the slot index in state as τ in order that we are able to easily both identify a new epoch and determine the slot at which the prior block was authored. We denote e as the prior’s epoch index and m as the prior’s slot phase index within that epoch and e′ and m′ are the corresponding values for the present block:
(6.2) let e R m = τ / E , e′ R m′ = τ′ / E
6.2. Safrole Basic State. We restate γ into a number of components:
(6.3) γ ≡ (γk, γz, γs, γa)
γz is the epoch’s root, a Bandersnatch ring root composed with the one Bandersnatch key of each of the next epoch’s validators, defined in γk (itself defined in the next section).
(6.4) γz ∈ YR
Finally, γa is the ticket accumulator, a series of highest-scoring ticket identifiers to be used for the next epoch. γs is the current epoch’s slot-sealer series, which is either a full complement of E tickets or, in the case of a fallback mode, a series of E Bandersnatch keys:
(6.5) γa ∈ ⟦C⟧∶E , γs ∈ ⟦C⟧E ∪ ⟦HB⟧E
Here, C is used to denote the set of tickets, a combination of a verifiably random ticket identifier y and the ticket’s entry-index r:
(6.6) C ≡ (y ∈ H, r ∈ NN)
As we state in section 6.4, Safrole requires that every block header H contain a valid seal Hs, which is a Bandersnatch signature for a public key at the appropriate index m of the current epoch’s seal-key series, present in state as γs.
6.3. Key Rotation. In addition to the active set of validator keys κ and staging set ι, internal to the Safrole state we retain a pending set γk. The active set is the set of keys identifying the nodes which are currently privileged to author blocks and carry out the validation processes, whereas the pending set γk, which is reset to ι at the beginning of each epoch, is the set of keys which will be active in the next epoch and which determine the Bandersnatch ring root which authorizes tickets into the sealing-key contest for the next epoch.
(6.7) ι ∈ ⟦K⟧V , γk ∈ ⟦K⟧V , κ ∈ ⟦K⟧V , λ ∈ ⟦K⟧V
We must introduce K, the set of validator key tuples. This is a combination of a set of cryptographic public keys and metadata which is an opaque octet sequence, but utilized to specify practical identifiers for the validator, not least a hardware address.
The set of validator keys itself is equivalent to the set of 336-octet sequences. However, for clarity, we divide the sequence into four easily denoted components. For any validator key k, the Bandersnatch key is denoted kb, and is equivalent to the first 32 octets; the Ed25519 key, ke, is the second 32 octets; the bls key denoted kBLS is equivalent to the following 144 octets, and finally the metadata km is the last 128 octets. Formally:
(6.8) K ≡ Y_336
(6.9) ∀k ∈ K ∶ kb ∈ HB ≡ k_{0⋅⋅⋅+32}
(6.10) ∀k ∈ K ∶ ke ∈ HE ≡ k_{32⋅⋅⋅+32}
(6.11) ∀k ∈ K ∶ kBLS ∈ Y_BLS ≡ k_{64⋅⋅⋅+144}
(6.12) ∀k ∈ K ∶ km ∈ Y_128 ≡ k_{208⋅⋅⋅+128}
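The decomposition of a validator key in equations 6.8 to 6.12 amounts to fixed-offset slicing of a 336-octet sequence. A minimal Python sketch follows; the type and function names are hypothetical and chosen only for this illustration, while the offsets and lengths are those of the equations.

```python
from typing import NamedTuple


class ValidatorKey(NamedTuple):
    """The four components of a validator key k (names are illustrative)."""
    bandersnatch: bytes  # k_b: octets 0 to 31
    ed25519: bytes       # k_e: octets 32 to 63
    bls: bytes           # k_BLS: octets 64 to 207
    metadata: bytes      # k_m: octets 208 to 335


def split_validator_key(k: bytes) -> ValidatorKey:
    """Slice a 336-octet validator key as in equations 6.9 to 6.12."""
    if len(k) != 336:
        raise ValueError("a validator key is exactly 336 octets")
    return ValidatorKey(
        bandersnatch=k[0:32],
        ed25519=k[32:64],
        bls=k[64:208],
        metadata=k[208:336],
    )
```

Concatenating the four components in order reproduces the original octet sequence, which is a convenient sanity check for any implementation of this layout.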
With a new epoch under regular conditions, validator keys get rotated and the epoch’s Bandersnatch key root is updated into γz′:
(6.13) (γk′, κ′, λ′, γz′) ≡
  ⎧ (Φ(ι), γk, κ, z)   if e′ > e
  ⎩ (γk, κ, λ, γz)     otherwise
  where z = O([kb ∣ k −< γk′])
(6.14) Φ(k) ≡ [ c(k) ∣ k −< k ] , with each entry c(k) given by
  ⎧ [0, 0, . . . ]   if ke ∈ ψo′
  ⎩ k                otherwise
Note that on epoch changes the posterior queued validator key set γk′ is defined such that incoming keys belonging to the offenders ψo′ are replaced with a null key containing only zeroes. The origin of the offenders is explained in section 10.
6.4. Sealing and Entropy Accumulation. The header must contain a valid seal and valid vrf output. These are two signatures both using the current slot’s seal key; the message data of the former is the header’s serialization omitting the seal component Hs, whereas the latter is used as a bias-resistant entropy source and thus its message must already have been fixed: we use the entropy stemming from the vrf of the seal signature. Formally:
let i = γs′[Ht]↺ ∶
(6.15) γs′ ∈ ⟦C⟧ ⟹
  ⎧ iy = Y(Hs) ,
  ⎨ Hs ∈ F^{EU(H)}_{Ha}⟨XT ⌢ η3′ ⧺ ir⟩ ,
  ⎩ T = 1
(6.16) γs′ ∈ ⟦HB⟧ ⟹
  ⎧ i = Ha ,
  ⎨ Hs ∈ F^{EU(H)}_{Ha}⟨XF ⌢ η3′⟩ ,
  ⎩ T = 0
(6.17) Hv ∈ F^{[]}_{Ha}⟨XE ⌢ Y(Hs)⟩
(6.18) XE = $jam_entropy
(6.19) XF = $jam_fallback_seal
(6.20) XT = $jam_ticket_seal
Sealing using the ticket is of greater security, and we utilize this knowledge when determining a candidate block on which to extend the chain, detailed in section 19. We thus note that the block was sealed under the regular security with the boolean marker T. We define this only for the purpose of ease of later specification.
In addition to the entropy accumulator η0, we retain three additional historical values of the accumulator at the point of each of the three most recently ended epochs, η1, η2 and η3. The second-oldest of these η2 is utilized to help ensure future entropy is unbiased (see equation 6.29) and seed the fallback seal-key generation function with randomness (see equation 6.24). The oldest is used to regenerate this randomness when verifying the seal above (see equations 6.16 and 6.15).
(6.21) η ∈ ⟦H⟧4
η0 defines the state of the randomness accumulator to which the provably random output of the vrf, the signature over some unbiasable input, is combined each block. η1, η2 and η3 meanwhile retain the state of this accumulator at the end of the three most recently ended epochs in order.
(6.22) η0′ ≡ H(η0 ⌢ Y(Hv))
On an epoch transition (identified as the condition e′ > e), we therefore rotate the accumulator value into the history η1, η2 and η3:
(6.23) (η1′, η2′, η3′) ≡
  ⎧ (η0, η1, η2)   if e′ > e
  ⎩ (η1, η2, η3)   otherwise
6.5. The Slot Key Sequence. The posterior slot key sequence γs′ is one of three expressions depending on the circumstance of the block. If the block is not the first in an epoch, then it remains unchanged from the prior γs. If the block signals the next epoch (by epoch index) and the previous block’s slot was within the closing period of the previous epoch, then it takes the value of the prior ticket accumulator γa. Otherwise, it takes the value of the fallback key sequence. Formally:
(6.24) γs′ ≡
  ⎧ Z(γa)         if e′ = e + 1 ∧ m ≥ Y ∧ ∣γa∣ = E
  ⎨ γs            if e′ = e
  ⎩ F(η2′, κ′)    otherwise
Here, we use Z as the outside-in sequencer function, defined as follows:
(6.25) Z ∶
  ⎧ ⟦C⟧E → ⟦C⟧E
  ⎩ s ↦ [s0, s∣s∣−1, s1, s∣s∣−2, . . . ]
Finally, F is the fallback key sequence function which selects an epoch’s worth of validator Bandersnatch keys (⟦HB⟧E) from the validator key set k using the entropy collected on-chain r:
(6.26) F ∶
  ⎧ (H, ⟦K⟧) → ⟦HB⟧E
  ⎩ (r, k) ↦ [k[E⁻¹(H4(r ⌢ E4(i)))]↺b ∣ i ∈ NE]
6.6. The Markers. The epoch and winning-tickets markers are information placed in the header in order to minimize data transfer necessary to determine the validator keys associated with any given epoch. They are particularly useful to nodes which do not synchronize the entire state for any given block since they facilitate the secure tracking of changes to the validator key sets using only the chain of headers.
As mentioned earlier, the header’s epoch marker He is either empty or, if the block is the first in a new epoch, then a tuple of the next and current epoch randomness, along with a sequence of Bandersnatch keys defining the Bandersnatch validator keys (kb) beginning in the next epoch. Formally:
(6.27) He ≡
  ⎧ (η0, η1, [kb ∣ k −< γk′])   if e′ > e
  ⎩ ∅                           otherwise
The winning-tickets marker Hw is either empty or, if the block is the first after the end of the submission period for tickets and if the ticket accumulator is saturated, then the final sequence of ticket identifiers. Formally:
(6.28) Hw ≡
  ⎧ Z(γa′)   if e′ = e ∧ m < Y ≤ m′ ∧ ∣γa′∣ = E
  ⎩ ∅        otherwise
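The two sequence functions used for the slot key sequence, the outside-in sequencer Z of equation 6.25 and the fallback function F of equation 6.26, can be sketched in Python as follows. This is an illustration under assumptions that are not fixed by the present section: the hash H is taken to be Blake2b with a 32-octet digest, E4 is taken to be a four-octet little-endian encoding with E⁻¹ its inverse, and the function names and the choice to pass only the Bandersnatch components kb are hypothetical.

```python
import hashlib
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def outside_in(s: Sequence[T]) -> List[T]:
    """Z of equation 6.25: [s0, s_{|s|-1}, s1, s_{|s|-2}, ...]."""
    out: List[T] = []
    lo, hi = 0, len(s) - 1
    while lo <= hi:
        out.append(s[lo])
        if lo != hi:  # avoid emitting the middle item twice for odd lengths
            out.append(s[hi])
        lo += 1
        hi -= 1
    return out


def fallback_keys(r: bytes, bandersnatch_keys: Sequence[bytes],
                  epoch_length: int) -> List[bytes]:
    """F of equation 6.26, applied to the k_b component of each validator key.

    Assumptions: H is Blake2b-256 and E4 is 4-octet little-endian.
    """
    if not bandersnatch_keys:
        raise ValueError("the validator key set must not be empty")
    out: List[bytes] = []
    for i in range(epoch_length):
        digest = hashlib.blake2b(r + i.to_bytes(4, "little"),
                                 digest_size=32).digest()
        # H4: first four octets; E^-1: decode as a natural number; the
        # modulo implements the cyclic index into the validator key set.
        index = int.from_bytes(digest[:4], "little") % len(bandersnatch_keys)
        out.append(bandersnatch_keys[index])
    return out
```

For instance, `outside_in([0, 1, 2, 3, 4, 5])` yields `[0, 5, 1, 4, 2, 3]`, so the two best-scoring tickets are separated by the width of the epoch rather than sealing adjacent slots.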
6.7. The Extrinsic and Tickets. The extrinsic ET is a sequence of proofs of valid tickets; a ticket implies an entry in our epochal “contest” to determine which validators are privileged to author a block for each timeslot in the following epoch. Tickets specify an entry index together with a proof of ticket’s validity. The proof implies a ticket identifier, a high-entropy unbiasable 32-octet sequence, which is used both as a score in the aforementioned contest and as input to the on-chain vrf.
Towards the end of the epoch (i.e. Y slots from the start) this contest is closed implying successive blocks within the same epoch must have an empty tickets extrinsic. At this point, the following epoch’s seal key sequence becomes fixed.
We define the extrinsic as a sequence of proofs of valid tickets, each of which is a tuple of an entry index (a natural number less than N) and a proof of ticket validity. Formally:
(6.29) ET ∈ ⟦(r ∈ NN, p ∈ F̄^{[]}_{γz′}⟨XT ⌢ η2′ ⧺ r⟩)⟧
(6.30) ∣ET∣ ≤
  ⎧ K   if m′ < Y
  ⎩ 0   otherwise
We define n as the set of new tickets, with the ticket identifier, a hash, defined as the output component of the Bandersnatch Ringvrf proof:
(6.31) n ≡ [(y ▸ Y(ip), r ▸ ir) ∣ i −< ET]
The tickets submitted via the extrinsic must already have been placed in order of their implied identifier. Duplicate identifiers are never allowed lest a validator submit the same ticket multiple times:
(6.32) n = [xy __ x ∈ n]
(6.33) {xy ∣ x ∈ n} ⫰ {xy ∣ x ∈ γa}
The new ticket accumulator γa′ is constructed by merging new tickets into the previous accumulator value (or the empty sequence if it is a new epoch):
(6.34) γa′ ≡ →([xy ^^ x ∈ n ∪ { ∅ if e′ > e ; γa otherwise }])^E
The maximum size of the ticket accumulator is E. On each block, the accumulator becomes the lowest items of the sorted union of tickets from prior accumulator γa and the submitted tickets. It is invalid to include useless tickets in the extrinsic, so all submitted tickets must exist in their posterior ticket accumulator. Formally:
(6.35) n ⊆ γa′
Note that it can be shown that in the case of an empty extrinsic ET = [], as implied by m′ ≥ Y, and unchanged epoch (e′ = e), then γa′ = γa.
7. Recent History
We retain in state information on the most recent H blocks. This is used to preclude the possibility of duplicate or out of date work-reports from being submitted.
(7.1) β ∈ ⟦(h ∈ H, b ∈ ⟦H?⟧, s ∈ H, p ∈ D⟨H → H⟩)⟧∶H
For each recent block, we retain its header hash, its state root, its accumulation-result mmr and the corresponding work-package hashes of each item reported (which is no more than the total number of cores, C = 341).
During the accumulation stage, a value with the partial transition of this state is provided which contains the update for the newly-known roots of the parent block:
(7.2) β† ≡ β except β†[∣β∣ − 1]s = Hr
We define an item n comprising the new block’s header hash, its accumulation-result Merkle tree root and the set of work-reports made into it (for which we use the guarantees extrinsic, EG). Note that the accumulation-result tree root r is derived from C (defined in section 12) using the basic binary Merklization function MB (defined in appendix E) and appending it using the mmr append function A (defined in appendix E.2) to form a Merkle mountain range.
(7.3) let r = MB([s ^^ E4(s) ⌢ E(h) ∣ (s, h) ∈ C], HK)
      let b = A(last([[]] ⌢ [xb ∣ x −< β]), r, HK)
      let p = {((gw)s)h ↦ ((gw)s)e ∣ g ∈ EG}
      let n = (p, h ▸ H(H), b, s ▸ H^0)
The state-trie root is given as the zero hash, H^0, which, while inaccurate at the end state of the block β′, is nevertheless safe since β′ is not utilized except to define the next block’s β†, which contains a corrected value for this.
The final state transition is then:
(7.4) β′ ≡ ←(β† ⧺ n)^H
8. Authorization
We have previously discussed the model of work-packages and services in section 4.9, however we have yet to make a substantial discussion of exactly how some coretime resource may be apportioned to some work-package and its associated service. In the YP Ethereum model, the underlying resource, gas, is procured at the point of introduction on-chain and the purchaser is always the same agent who authors the data which describes the work to be done (i.e. the transaction). Conversely, in Polkadot the underlying resource, a parachain slot, is procured with a substantial deposit for typically 24 months at a time and the procurer, generally a parachain team, will often have no direct relation to the author of the work to be done (i.e. a parachain block).
On a principle of flexibility, we would wish Jam capable of supporting a range of interaction patterns both Ethereum-style and Polkadot-style. In an effort to do so, we introduce the authorization system, a means of disentangling the intention of usage for some coretime from the specification and submission of a particular workload to be executed on it. We are thus able to disassociate the purchase and assignment of coretime from the specific determination of work to be done with it, and so are able to support both Ethereum-style and Polkadot-style interaction patterns.
8.1. Authorizers and Authorizations. The authorization system involves two key concepts: authorizers and authorizations. An authorization is simply a piece of opaque data to be included with a work-package. An authorizer meanwhile, is a piece of pre-parameterized logic which accepts as an additional parameter an authorization and, when executed within a vm of prespecified computational limits, provides a Boolean output denoting the veracity of said authorization.
Authorizers are identified as the hash of their logic (specified as the vm code) and their pre-parameterization. The process by which work-packages are determined to be authorized (or not) is not the competence of on-chain logic and happens entirely in-core and as such is discussed in section 14.3. However, on-chain logic must identify each set of authorizers assigned to each core in order to verify that a work-package is legitimately able to utilize that resource. It is this subsystem we will now define.
8.2. Pool and Queue. We define the set of authorizers allowable for a particular core c as the authorizer pool α[c]. To maintain this value, a further portion of state is tracked for each core: the core’s current authorizer queue φ[c], from which we draw values to fill the pool. Formally:
(8.1) α ∈ ⟦⟦H⟧∶O⟧C , φ ∈ ⟦⟦H⟧Q⟧C
Note: The portion of state φ may be altered only through an exogenous call made from the accumulate logic of an appropriately privileged service.
The state transition of a block involves placing a new authorization into the pool from the queue:
(8.2) ∀c ∈ NC ∶ α′[c] ≡ ←(F(c) ⧺ φ′[c][Ht]↺)^O
(8.3) F(c) ≡
  ⎧ α[c] m {(gw)a}   if ∃g ∈ EG ∶ (gw)c = c
  ⎩ α[c]             otherwise
Since α′ is dependent on φ′, practically speaking, this step must be computed after accumulation, the stage in which φ′ is defined. Note that we utilize the guarantees extrinsic EG to remove the oldest authorizer which has been used to justify a guaranteed work-package in the current block. This is further defined in equation 11.23.
9. Service Accounts
As we already noted, a service in Jam is somewhat analogous to a smart contract in Ethereum in that it includes amongst other items, a code component, a storage component and a balance. Unlike Ethereum, the code is split over two isolated entry-points each with their own environmental conditions; one, refinement, is essentially stateless and happens in-core, and the other, accumulation, which is stateful and happens on-chain. It is the latter which we will concern ourselves with now.
Service accounts are held in state under δ, a partial mapping from a service identifier NS into a tuple of named elements which specify the attributes of the service relevant to the Jam protocol. Formally:
(9.1) NS ≡ N_{2^32}
(9.2) δ ∈ D⟨NS → A⟩
The service account is defined as the tuple of storage dictionary s, preimage lookup dictionaries p and l, code hash c, and balance b as well as the two code gas limits g & m. Formally:
(9.3) A ≡ ( s ∈ D⟨H → Y⟩ , p ∈ D⟨H → Y⟩ , l ∈ D⟨(H, NL) → ⟦NT⟧∶3⟩ , c ∈ H , b ∈ NB , g ∈ NG , m ∈ NG )
Thus, the balance of the service of index s would be denoted δ[s]b and the storage item of key k ∈ H for that service is written δ[s]s[k].
9.1. Code and Gas. The code c of a service account is represented by a hash which, if the service is to be functional, must be present within its preimage lookup (see section 9.2). We thus define the actual code 𝐜:
(9.4) ∀a ∈ A ∶ a𝐜 ≡
  ⎧ ap[ac]   if ac ∈ ap
  ⎩ ∅        otherwise
There are three entry-points in the code:
0 refine: Refinement, executed in-core and stateless.10
1 accumulate: Accumulation, executed on-chain and stateful.
2 on_transfer: Transfer handler, executed on-chain and stateful.
10 Technically there is some small assumption of state, namely that some modestly recent instance of each service’s preimages. The specifics of this are discussed in section 14.3.
Whereas the first, executing in-core, is described in more detail in section 14.3, the latter two are defined in the present section.
As stated in appendix A, execution time in the Jam virtual machine is measured deterministically in units of gas, represented as a natural number less than 2^64 and formally denoted NG. We may also use ZG to denote the set Z_{−2^63...2^63} if the quantity may be negative. There are two limits specified in the account, g, the minimum gas required in order to execute the Accumulate entry-point of the service’s code, and m, the minimum required for the On Transfer entry-point.
9.2. Preimage Lookups. In addition to storing data in arbitrary key/value pairs available only on-chain, an account may also solicit data to be made available in-core, and thus available to the Refine logic of the service’s code. State concerning this facility is held under the service’s p and l components.
There are several differences between preimage-lookups and storage. Firstly, preimage-lookups act as a mapping from a hash to its preimage, whereas general storage maps arbitrary keys to values. Secondly, preimage data is supplied extrinsically, whereas storage data originates as part of the service’s accumulation. Thirdly, preimage data, once supplied, may not be removed freely; instead it goes through a process of being marked as unavailable, and only after a period of time may it be removed from state. This ensures that historical information on its existence is retained. The final point especially is important since preimage data is designed to be queried in-core, under the Refine logic of the service’s code, and thus it is important that the historical availability of the preimage is known.
We begin by reformulating the portion of state concerning our data-lookup system. The purpose of this system is to provide a means of storing static data on-chain such that it may later be made available within the execution of any service code as a function accepting only the hash of the data and its length in octets.
During the on-chain execution of the Accumulate function, this is trivial to achieve since there is inherently a state which all validators verifying the block necessarily
have complete knowledge of, i.e. σ. However, for the in-core execution of Refine, there is no such state inherently available to all validators; we thus name a historical state, the lookup anchor which must be considered recently finalized before the work result may be accumulated hence providing this guarantee.
By retaining historical information on its availability, we become confident that any validator with a recently finalized view of the chain is able to determine whether any given preimage was available at any time within the period where auditing may occur. This ensures confidence that judgments will be deterministic even without consensus on chain state.
Restated, we must be able to define some historical lookup function Λ which determines whether the preimage of some hash h was available for lookup by some service account a at some timeslot t, and if so, provide its preimage:
(9.5) Λ ∶
  ⎧ (A, N_{Ht−CD...Ht}, H) → Y?
  ⎩ (a, t, H(p)) ↦ v ∶ v ∈ {p, ∅}
9.3. Account Footprint and Threshold Balance. We define the dependent values i and l as the storage footprint of the service, specifically the number of items in storage and the total number of octets used in storage. They are defined purely in terms of the storage map of a service, and it must be assumed that whenever a service’s storage is changed, these change also.
Furthermore, as we will see in the account serialization function in section C, these are expected to be found explicitly within the Merklized state data. Because of this we make explicit their set.
We may then define a second dependent term t, the minimum, or threshold, balance needed for any given service account in terms of its storage footprint.
(9.8) ∀a ∈ V(δ) ∶
  ⎧ ai ∈ N_{2^32} ≡ 2 ⋅ ∣al∣ + ∣as∣
  ⎨ al ∈ N_{2^64} ≡ ∑_{(h,z)∈K(al)} (81 + z) + ∑_{x∈V(as)} (32 + ∣x∣)
  ⎩ at ∈ NB ≡ BS + BI ⋅ ai + BL ⋅ al
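The dependent values of equation 9.8 are straightforward to compute from the storage and preimage-lookup dictionaries. The Python sketch below is an illustration only: the function name is hypothetical, and the constants BS, BI and BL are passed as parameters because their values are not given in this section.

```python
from typing import Dict, List, Tuple


def footprint_and_threshold(
    storage: Dict[bytes, bytes],
    lookups: Dict[Tuple[bytes, int], List[int]],
    b_s: int,
    b_i: int,
    b_l: int,
) -> Tuple[int, int, int]:
    """Return (a_i, a_l, a_t) following equation 9.8.

    storage -- a_s, mapping storage keys to values
    lookups -- the lookup dictionary l, keyed by (hash, length z)
    b_s, b_i, b_l -- the constants B_S, B_I and B_L (values assumed by caller)
    """
    # a_i: two items per lookup entry plus one per storage entry.
    items = 2 * len(lookups) + len(storage)
    # a_l: 81 + z octets per lookup key, 32 + |x| octets per storage value.
    octets = sum(81 + z for (_hash, z) in lookups)
    octets += sum(32 + len(value) for value in storage.values())
    # a_t: the threshold balance.
    threshold = b_s + b_i * items + b_l * octets
    return items, octets, threshold
```

With two storage values of lengths 3 and 0 and a single lookup entry of length 10, the footprint is 4 items and 158 octets; taking placeholder constants of 100, 10 and 1 gives a threshold of 298.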
of said report, usually forking at the block immediately keys which are already in the punish-set:
prior to that at which accumulation happened. The spe- ⎧r ∈ ψb′ ,
⎪
⎪
⎪
cific strategy for chain selection is described fully in section ⎪
19. Authoring a block with a non-positive verdict has the (10.5) ∀(r, k, s) ∈ c ∶ ⋀⎨k ∈ k ,
⎪
⎪
⎪
effect of cancelling its imminent accumulation, as can be ⎪
⎩s ∈ Ek ⟨XG ⌢ r⟩
seen in equation 10.15. ⎧
⎪
⎪r ∈ ψb′ ⇔ r ∈/ ψg′ ⇔ v ,
Registering a verdict also has the effect of placing a ⎪
⎪
(10.6) ∀(r, v, k, s) ∈ f ∶ ⋀⎨k ∈ k ,
permanent record of the event on-chain and allowing any ⎪
⎪
⎪
offending keys to be placed on-chain both immediately or ⎪
⎩s ∈ Ek ⟨Xv ⌢ r⟩
in forthcoming blocks, again for permanent record. where k = {ke ∣ k ∈ λ ∪ κ} ∖ ψo
Having a persistent on-chain record of misbehavior is
Verdicts v must be ordered by report hash. Offender
helpful in a number of ways. It provides a very simple
signatures c and f must each be ordered by the valida-
means of recognizing the circumstances under which ac-
tor’s Ed25519 key. There may be no duplicate report
tion against a validator must be taken by any higher-level
hashes within the extrinsic, nor amongst any past reported
validator-selection logic. Should Jam be used for a public
hashes. Formally:
network such as Polkadot, this would imply the slashing of
the offending validator’s stake on the staking parachain. (10.7) v = [r _ ⎧ ⎫
_⎩r, a, j⎭ ∈ v]
As mentioned, recording reports found to have a high (10.8) c = [k _ ⎧ ⎫ _⎧
_⎩r, k, s⎭ ∈ c] , f = [k _⎩r, v, k, s⎭ ∈ f ]
⎫
confidence of invalidity is important to ensure that said ⎧ ⎫
(10.9) {r ∣ ⎩r, a, j⎭ ∈ v} ⫰ ψg ∪ ψb ∪ ψw
reports are not allowed to be resubmitted. Conversely,
recording reports found to be valid ensures that additional The judgments of all verdicts must be ordered by val-
disputes cannot be raised in the future of the chain. idator index and there may be no duplicates:
10.1. The State. The disputes state includes four items, (10.10) ∀(r, a, j) ∈ v ∶ j = [i _ ⎧ ⎫
_⎩v, i, s⎭ ∈ j]
three of which concern verdicts: a good-set (ψg ), a bad- We define V to derive from the sequence of verdicts
set (ψb ) and a wonky-set (ψw ) containing the hashes of introduced in the block’s extrinsic, containing only the
all work-reports which were respectively judged to be cor- report hash and the sum of positive judgments. We re-
rect, incorrect or that it appears impossible to judge. The fourth item, the punish-set (ψ_o), is a set of Ed25519 keys representing validators which were found to have misjudged a work-report.

(10.1) $\psi \equiv (\psi_g, \psi_b, \psi_w, \psi_o)$

10.2. Extrinsic. The disputes extrinsic, E_D, may contain one or more verdicts v as a compilation of judgments coming from exactly two-thirds plus one of either the active validator set or the previous epoch’s validator set, i.e. the Ed25519 keys of κ or λ. Additionally, it may contain proofs of the misbehavior of one or more validators, either by guaranteeing a work-report found to be invalid (culprits, c), or by signing a judgment found to be in contradiction to a work-report’s validity (faults, f). Both are considered a kind of offense. Formally:

(10.2) $\mathbf{E}_D \equiv (\mathbf{v}, \mathbf{c}, \mathbf{f})$
where $\mathbf{v} \in \llbracket (\mathbb{H},\ \lfloor \tau/\mathsf{E} \rfloor - \mathbb{N}_2,\ \llbracket (\{\top, \bot\}, \mathbb{N}_\mathsf{V}, \mathbb{E}) \rrbracket_{\lfloor 2/3\,\mathsf{V} \rfloor + 1}) \rrbracket$
and $\mathbf{c} \in \llbracket (\mathbb{H}, \mathbb{H}_E, \mathbb{E}) \rrbracket$, $\mathbf{f} \in \llbracket (\mathbb{H}, \{\top, \bot\}, \mathbb{H}_E, \mathbb{E}) \rrbracket$

The signatures of all judgments must be valid in terms of one of the two allowed validator key-sets, identified by the verdict’s second term which must be either the epoch index of the prior state or one less. Formally:

(10.3) $\forall (r, a, \mathbf{j}) \in \mathbf{v},\ \forall (v, i, s) \in \mathbf{j} : s \in \mathbb{E}_{\mathbf{k}[i]_e}\langle \mathsf{X}_v \frown r \rangle$
where $\mathbf{k} = \begin{cases} \kappa & \text{if } a = \lfloor \tau/\mathsf{E} \rfloor \\ \lambda & \text{otherwise} \end{cases}$

(10.4) $\mathsf{X}_\top \equiv \texttt{\$jam\_valid}$, $\mathsf{X}_\bot \equiv \texttt{\$jam\_invalid}$

Offender signatures must be similarly valid and reference work-reports with judgments and may not report

quire this total to be either exactly two-thirds-plus-one, zero or one-third of the validator set indicating, respectively, that the report is good, that it’s bad, or that it’s wonky.11 Formally:

(10.11) $\mathbf{V} \in \llbracket (\mathbb{H}, \{0, \lfloor 1/3\,\mathsf{V} \rfloor, \lfloor 2/3\,\mathsf{V} \rfloor + 1\}) \rrbracket$

(10.12) $\mathbf{V} = \left[ \left( r, \sum_{(v, i, s) \in \mathbf{j}} v \right) \,\middle|\, (r, a, \mathbf{j}) \mathrel{<\!\!-} \mathbf{v} \right]$

There are some constraints placed on the composition of this extrinsic: any verdict containing solely valid judgments implies the same report having at least one valid entry in the faults sequence f. Any verdict containing solely invalid judgments implies the same report having at least two valid entries in the culprits sequence c. Formally:

(10.13) $\forall (r, \lfloor 2/3\,\mathsf{V} \rfloor + 1) \in \mathbf{V} : \exists (r, \dots) \in \mathbf{f}$

(10.14) $\forall (r, 0) \in \mathbf{V} : |\{(r, \dots) \in \mathbf{c}\}| \ge 2$

We clear any work-reports which we judged as uncertain or invalid from their core:

(10.15) $\forall c \in \mathbb{N}_\mathsf{C} : \rho^\dagger[c] = \begin{cases} \emptyset & \text{if } \{(\mathcal{H}(\rho[c]_w), t) \in \mathbf{V},\ t < \lfloor 2/3\,\mathsf{V} \rfloor\} \\ \rho[c] & \text{otherwise} \end{cases}$

The state’s good-set, bad-set and wonky-set assimilate the hashes of the reports from each verdict. Finally, the punish-set accumulates the keys of any validators who have been found guilty of offending. Formally:

(10.16) $\psi_g' \equiv \psi_g \cup \{r \mid (r, \lfloor 2/3\,\mathsf{V} \rfloor + 1) \in \mathbf{V}\}$

(10.17) $\psi_b' \equiv \psi_b \cup \{r \mid (r, 0) \in \mathbf{V}\}$

(10.18) $\psi_w' \equiv \psi_w \cup \{r \mid (r, \lfloor 1/3\,\mathsf{V} \rfloor) \in \mathbf{V}\}$

(10.19) $\psi_o' \equiv \psi_o \cup \{k \mid (r, k, s) \in \mathbf{c}\} \cup \{k \mid (r, v, k, s) \in \mathbf{f}\}$
11This requirement may seem somewhat arbitrary, but these happen to be the decision thresholds for our three possible actions and
are acceptable since the security assumptions include the requirement that at least two-thirds-plus-one validators are live (Jeff Burdges,
Cevallos, et al. 2024 discusses the security implications in depth).
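The verdict thresholds and the set updates of equations 10.11 and 10.16 to 10.18 can be illustrated with a minimal Python sketch. It assumes V = 1,023 validators (the figure used in section 11.3) and represents ψ as a plain dictionary of sets; the names `classify` and `assimilate` are hypothetical and chosen only for this illustration.

```python
V = 1023  # validator count, as used in section 11.3

GOOD = (2 * V) // 3 + 1   # floor(2/3 V) + 1 positive judgments
WONKY = V // 3            # floor(1/3 V) positive judgments
BAD = 0                   # no positive judgments


def classify(total: int) -> str:
    """Map a verdict's sum of positive judgments to its outcome (10.11)."""
    if total == GOOD:
        return "good"
    if total == WONKY:
        return "wonky"
    if total == BAD:
        return "bad"
    raise ValueError(f"vote total {total} is not an allowed threshold")


def assimilate(psi: dict, verdicts: list) -> dict:
    """Return a copy of psi with each report hash added to the set
    matching its verdict, mirroring (10.16) to (10.18).

    psi maps "good" / "bad" / "wonky" to sets of report hashes;
    verdicts is a list of (report_hash, vote_total) pairs.
    """
    out = {name: set(hashes) for name, hashes in psi.items()}
    for report_hash, total in verdicts:
        out[classify(total)].add(report_hash)
    return out
```

With V = 1,023 the three admissible totals are 683, 341 and 0; any other total is rejected rather than rounded to the nearest threshold.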
JAM: JOIN-ACCUMULATE MACHINE DRAFT 0.5.3 - December 20, 2024 19
10.3. Header. The offenders markers must contain exactly the keys of all new offenders, respectively. Formally:

(10.20) $\mathbf{H}_o \equiv [k \mid (r, k, s) \in \mathbf{c}] \frown [k \mid (r, v, k, s) \in \mathbf{f}]$

11. Reporting and Assurance

Reporting and assurance are the two on-chain processes we use to allow the results of in-core computation to make their way into the service state singleton, δ. A work-package, which comprises several work items, is transformed by validators acting as guarantors into its corresponding work-report, which similarly comprises several work outputs and is then presented on-chain within the guarantees extrinsic. At this point, the work-package is erasure coded into a multitude of segments and each segment distributed to the associated validator who then attests to its availability through an assurance placed on-chain. After enough assurances the work-report is considered available, and the work outputs transform the state of their associated service by virtue of accumulation, covered in section 12. The report may also be timed-out, implying it may be replaced by another report without accumulation.

From the perspective of the work-report, therefore, the guarantee happens first and the assurance afterwards. However, from the perspective of a block’s state-transition, the assurances are best processed first since each core may only have a single work-report pending its package becoming available at a time. Thus, we will first cover the transition arising from processing the availability assurances followed by the work-report guarantees. This synchroneity can be seen formally through the requirement of an intermediate state ρ‡, utilized later in equation 11.29.

11.1. State. The state of the reporting and availability portion of the protocol is largely contained within ρ, which tracks the work-reports which have been reported but are not yet known to be available to a super-majority of validators, together with the time at which each was reported. As mentioned earlier, only one report may be assigned to a core at any given time. Formally:

(11.1) $\rho \in \llbracket (w \in \mathbb{W},\ t \in \mathbb{N}_T)? \rrbracket_\mathsf{C}$

As usual, intermediate and posterior values (ρ†, ρ‡, ρ′) are held under the same constraints as the prior value.

11.1.1. Work Report. A work-report, of the set W, is defined as a tuple of the work-package specification s, the refinement context x, and the core-index c (i.e. on which the work is done) as well as the authorizer hash a and output o, a segment-root lookup dictionary l, and finally the results of the evaluation of each of the items in the package r, which is always at least one item and may be no more than I items. Formally:

(11.2) $\mathbb{W} \equiv (s \in \mathbb{S},\ x \in \mathbb{X},\ c \in \mathbb{N}_\mathsf{C},\ a \in \mathbb{H},\ \mathbf{o} \in \mathbb{Y},\ \mathbf{l} \in \mathbb{D}\langle \mathbb{H} \to \mathbb{H} \rangle,\ \mathbf{r} \in \llbracket \mathbb{L} \rrbracket_{1:\mathsf{I}})$

We limit the sum of the number of items in the segment-root lookup dictionary and the number of prerequisites to J = 8:

(11.3) $\forall w \in \mathbb{W} : |w_\mathbf{l}| + |(w_x)_\mathbf{p}| \le \mathsf{J}$

11.1.2. Refinement Context. A refinement context, denoted by the set X, describes the context of the chain at the point that the report’s corresponding work-package was evaluated. It identifies two historical blocks: the anchor, header hash a along with its associated posterior state-root s and posterior Beefy root b; and the lookup-anchor, header hash l and its timeslot t. Finally, it identifies the hash of any prerequisite work-packages p. Formally:

(11.4) $\mathbb{X} \equiv (a \in \mathbb{H},\ s \in \mathbb{H},\ b \in \mathbb{H},\ l \in \mathbb{H},\ t \in \mathbb{N}_T,\ \mathbf{p} \in \{\mathbb{H}\})$

11.1.3. Availability. We define the set of availability specifications, S, as the tuple of the work-package’s hash h, an auditable work bundle length l (see section 14.4.1 for more clarity on what this is), together with an erasure-root u, a segment-root e and segment-count n. Work-results include this availability specification in order to ensure they are able to correctly reconstruct and audit the purported ramifications of any reported work-package. Formally:

(11.5) $\mathbb{S} \equiv (h \in \mathbb{H},\ l \in \mathbb{N}_L,\ u \in \mathbb{H},\ e \in \mathbb{H},\ n \in \mathbb{N})$

The erasure-root (u) is the root of a binary Merkle tree which functions as a commitment to all data required for the auditing of the report and for use by later work-packages should they need to retrieve any data yielded. It is thus used by assurers to verify the correctness of data they have been sent by guarantors, and it is later verified as correct by auditors. It is discussed fully in section 14.

The segment-root (e) is the root of a constant-depth, left-biased and zero-hash-padded binary Merkle tree committing to the hashes of each of the exported segments of each work-item. These are used by guarantors to verify the correctness of any reconstructed segments they are called upon to import for evaluation of some later work-package. It is also discussed in section 14.

11.1.4. Work Result. We finally come to define a work result, L, which is the data conduit by which services’ states may be altered through the computation done within a work-package.

(11.6) $\mathbb{L} \equiv (s \in \mathbb{N}_S,\ c \in \mathbb{H},\ l \in \mathbb{H},\ g \in \mathbb{N}_G,\ \mathbf{o} \in \mathbb{Y} \cup \mathbb{J})$

Work results are a tuple comprising several items. Firstly s, the index of the service whose state is to be altered and thus whose refine code was already executed. We include the hash of the code of the service at the time of being reported c, which must be accurately predicted within the work-report according to equation 11.42. Next, the hash of the payload (l) within the work item which was executed in the refine stage to give this result. This has no immediate relevance, but is something provided to the accumulation logic of the service. We follow with the gas prioritization ratio g used when determining how much gas should be allocated to the execution of this item’s accumulate.

Finally, there is the output or error of the execution of the code o, which may be either an octet sequence in case it was successful, or a member of the set J, if not. This latter set is defined as the set of possible errors, formally:

(11.7) $\mathbb{J} \in \{\infty, \lightning, \circledcirc, \mathrm{BAD}, \mathrm{BIG}\}$
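The shape constraints on a work-report stated in equations 11.2 and 11.3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and field names are hypothetical, most tuple members are omitted, and the item limit I is passed in as a parameter because its value is not given in this section.

```python
from dataclasses import dataclass, field

J_LIMIT = 8  # J = 8, the bound stated alongside equation 11.3


@dataclass
class WorkReportShape:
    """Only the members of a work-report needed for the size checks."""
    lookup: dict = field(default_factory=dict)         # l: segment-root lookup
    prerequisites: set = field(default_factory=set)    # (w_x)_p
    results: list = field(default_factory=list)        # r: one per work item


def shape_ok(report: WorkReportShape, max_items: int) -> bool:
    """Check 1 <= |r| <= I (from 11.2) and |l| + |p| <= J (11.3)."""
    item_count_ok = 1 <= len(report.results) <= max_items
    dependency_count_ok = (
        len(report.lookup) + len(report.prerequisites) <= J_LIMIT
    )
    return item_count_ok and dependency_count_ok
```

The two bounds are independent: a report with a single result can still be rejected if its lookup dictionary and prerequisite set together exceed eight entries.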
The first two are special values concerning execution of the virtual machine, ∞ denoting an out-of-gas error and ☇ denoting an unexpected program termination. Of the remaining three, the first indicates that the number of exports made was invalidly reported, the second indicates that the service’s code was not available for lookup in state at the posterior state of the lookup-anchor block. The third indicates that the code was available but was beyond the maximum size allowed W_C.

In order to ensure fair use of a block’s extrinsic space, work-reports are limited in the maximum total size of the successful output blobs together with the authorizer output blob, effectively limiting their overall size:

(11.8) $\forall w \in \mathbb{W} : |w_\mathbf{o}| + \sum_{r \in w_\mathbf{r},\ r_\mathbf{o} \in \mathbb{Y}} |r_\mathbf{o}| \le \mathsf{W}_R$

(11.9) $\mathsf{W}_R \equiv 48 \cdot 2^{10}$

11.2. Package Availability Assurances. We first define ρ‡, the intermediate state to be utilized next in section 11.4 as well as W, the set of available work-reports, which we will utilize later in section 12. Both require the integration of information from the assurances extrinsic E_A.

11.2.1. The Assurances Extrinsic. The assurances extrinsic is a sequence of assurance values, at most one per validator. Each assurance is a sequence of binary values (i.e. a bitstring), one per core, together with a signature and the index of the validator who is assuring. A value of 1 (or ⊺, if interpreted as a Boolean) at any given index implies that the validator assures they are contributing to its availability.12 Formally:

(11.10) $\mathbf{E}_A \in \llbracket (a \in \mathbb{H},\ f \in \mathbb{B}_\mathsf{C},\ v \in \mathbb{N}_\mathsf{V},\ s \in \mathbb{E}) \rrbracket_{:\mathsf{V}}$

The assurances must all be anchored on the parent and ordered by validator index:

(11.11) $\forall a \in \mathbf{E}_A : a_a = \mathbf{H}_p$

(11.12) $\forall i \in \{1 \dots |\mathbf{E}_A|\} : \mathbf{E}_A[i - 1]_v < \mathbf{E}_A[i]_v$

The signature must be one whose public key is that of the validator assuring and whose message is the serialization of the parent hash H_p and the aforementioned bitstring:

(11.13) $\forall a \in \mathbf{E}_A : a_s \in \mathbb{E}_{\kappa'[a_v]_e}\langle \mathsf{X}_A \frown \mathcal{H}(\mathcal{E}(\mathbf{H}_p, a_f)) \rangle$

(11.14) $\mathsf{X}_A \equiv \texttt{\$jam\_available}$

A bit may only be set if the corresponding core has a report pending availability on it:

(11.15) $\forall a \in \mathbf{E}_A,\ c \in \mathbb{N}_\mathsf{C} : a_f[c] \Rightarrow \rho^\dagger[c] \ne \emptyset$

11.2.2. Available Reports. A work-report is said to become available if and only if there is a clear 2/3 super-majority of validators who have marked its core as set within the block’s assurance extrinsic. Formally, we define the sequence of newly available work-reports W as:

(11.16) $\mathbf{W} \equiv \left[ \rho^\dagger[c]_w \,\middle|\, c \mathrel{<\!\!-} \mathbb{N}_\mathsf{C},\ \sum_{a \in \mathbf{E}_A} a_f[c] > 2/3\,\mathsf{V} \right]$

This value is utilized in the definition of both δ′ and ρ‡ which we will define presently as equivalent to ρ† except for the removal of items which are either now available or have timed out:

(11.17) $\forall c \in \mathbb{N}_\mathsf{C} : \rho^\ddagger[c] \equiv \begin{cases} \emptyset & \text{if } \rho[c]_w \in \mathbf{W} \vee \mathbf{H}_t \ge \rho^\dagger[c]_t + \mathsf{U} \\ \rho^\dagger[c] & \text{otherwise} \end{cases}$

11.3. Guarantor Assignments. Every block, each core has three validators uniquely assigned to guarantee work-reports for it. This is borne out with V = 1,023 validators and C = 341 cores, since V/C = 3. The core index assigned to each of the validators, as well as the validators’ Ed25519 keys are denoted by G:

(11.18) $\mathbf{G} \in (\llbracket \mathbb{N}_\mathsf{C} \rrbracket_{\mathbb{N}_\mathsf{V}},\ \llbracket \mathbb{H}_K \rrbracket_{\mathbb{N}_\mathsf{V}})$

We determine the core to which any given validator is assigned through a shuffle using epochal entropy and a periodic rotation to help guard the security and liveness of the network. We use η_2 for the epochal entropy rather than η_1 to avoid the possibility of fork-magnification where uncertainty about chain state at the end of an epoch could give rise to two established forks before it naturally resolves.

We define the permute function P, the rotation function R and finally the guarantor assignments G as follows:

(11.19) $R(\mathbf{c}, n) \equiv [(x + n) \bmod \mathsf{C} \mid x \mathrel{<\!\!-} \mathbf{c}]$

(11.20) $P(e, t) \equiv R\left( \mathcal{F}\left( \left[ \left\lfloor \frac{\mathsf{C} \cdot i}{\mathsf{V}} \right\rfloor \,\middle|\, i \mathrel{<\!\!-} \mathbb{N}_\mathsf{V} \right], e \right), \left\lfloor \frac{t \bmod \mathsf{E}}{\mathsf{R}} \right\rfloor \right)$

(11.21) $\mathbf{G} \equiv (P(\eta_2', \tau'),\ \Phi(\kappa'))$

We also define G∗, which is equivalent to the value G as it would have been under the previous rotation:

(11.22) let $(e, \mathbf{k}) = \begin{cases} (\eta_2', \kappa') & \text{if } \left\lfloor \frac{\tau' - \mathsf{R}}{\mathsf{E}} \right\rfloor = \left\lfloor \frac{\tau'}{\mathsf{E}} \right\rfloor \\ (\eta_3', \lambda') & \text{otherwise} \end{cases}$
$\mathbf{G}^* \equiv (P(e, \tau' - \mathsf{R}),\ \Phi(\mathbf{k}))$

11.4. Work Report Guarantees. We begin by defining the guarantees extrinsic, E_G, a series of guarantees, at most one for each core, each of which is a tuple of a work-report, a credential a and its corresponding timeslot t. The core index of each guarantee must be unique and guarantees must be in ascending order of this. Formally:

(11.23) $\mathbf{E}_G \in \llbracket (w \in \mathbb{W},\ t \in \mathbb{N}_T,\ a \in \llbracket (\mathbb{N}_\mathsf{V}, \mathbb{E}) \rrbracket_{2:3}) \rrbracket_{:\mathsf{C}}$

(11.24) $\mathbf{E}_G = [(g_w)_c \mathrel{\text{\textasciicircum\textasciicircum}} g \in \mathbf{E}_G]$

The credential is a sequence of two or three tuples of a unique validator index and a signature. Credentials must be ordered by their validator index:

(11.25) $\forall g \in \mathbf{E}_G : g_a = [v \mathrel{\text{\_\_}} (v, s) \in g_a]$

The signature must be one whose public key is that of the validator identified in the credential, and whose message is the serialization of the hash of the work-report. The signing validators must be assigned to the core in question in either this block G if the timeslot for the guarantee is in the same rotation as this block’s timeslot, or
12This is a “soft” implication since there is no consequence on-chain if dishonestly reported. For more information on this implication
see section 16.
in the most recent previous set of assignments, G∗:

(11.26) $\forall (w, t, a) \in \mathbf{E}_G,\ \forall (v, s) \in a : \begin{cases} s \in \mathbb{E}_{(\mathbf{k}_v)_e}\langle \mathsf{X}_G \frown \mathcal{H}(\mathcal{E}(w)) \rangle \\ \mathbf{c}_v = w_c \wedge \mathsf{R}(\lfloor \tau'/\mathsf{R} \rfloor - 1) \le t \le \tau' \end{cases}$
$k \in \mathbf{R} \Leftrightarrow \exists (w, t, a) \in \mathbf{E}_G,\ \exists (v, s) \in a : k = (\mathbf{k}_v)_e$
where $(\mathbf{c}, \mathbf{k}) = \begin{cases} \mathbf{G} & \text{if } \lfloor \tau'/\mathsf{R} \rfloor = \lfloor t/\mathsf{R} \rfloor \\ \mathbf{G}^* & \text{otherwise} \end{cases}$

(11.27) $\mathsf{X}_G \equiv \texttt{\$jam\_guarantee}$

We note that the Ed25519 key of each validator whose signature is in a credential is placed in the reporters set R. This is utilized by the validator activity statistics bookkeeping system (section 13).

We denote w to be the set of work-reports in the present extrinsic E:

(11.28) let $\mathbf{w} = \{g_w \mid g \in \mathbf{E}_G\}$

No reports may be placed on cores with a report pending availability on it. A report is valid only if the authorizer hash is present in the authorizer pool of the core on which the work is reported. Formally:

(11.29) $\forall w \in \mathbf{w} : \rho^\ddagger[w_c] = \emptyset \wedge w_a \in \alpha[w_c]$

We require that the gas allotted for accumulation of each work item in each work-report respects its service’s minimum gas requirements. We also require that each work-report’s total allotted accumulation gas is no greater than the overall gas limit G_A:

(11.30) $\forall w \in \mathbf{w} : \sum_{r \in w_\mathbf{r}} (r_g) \le \mathsf{G}_A \wedge \forall r \in w_\mathbf{r} : r_g \ge \delta[r_s]_g$

11.4.1. Contextual Validity of Reports. For convenience, we define two equivalences x and p to be, respectively, the set of all contexts and work-package hashes within the extrinsic:

(11.31) let $\mathbf{x} \equiv \{w_x \mid w \in \mathbf{w}\}$, $\mathbf{p} \equiv \{(w_s)_h \mid w \in \mathbf{w}\}$

There must be no duplicate work-package hashes (i.e. …

We ensure that the work-package does not appear anywhere within our pipeline. Formally:

(11.36) let $\mathbf{q} = \{(w_x)_\mathbf{p} \mid q \in \vartheta,\ (w, \mathbf{d}) \in q\}$

(11.37) let $\mathbf{a} = \{((i_w)_x)_\mathbf{p} \mid i \in \rho,\ i \ne \emptyset\}$

(11.38) $\forall p \in \mathbf{p},\ p \notin \bigcup_{x \in \beta} \mathcal{K}(x_\mathbf{p}) \cup \bigcup_{x \in \xi} x \cup \mathbf{q} \cup \mathbf{a}$

We require that the prerequisite work-packages, if present, and any work-packages mentioned in the segment-root lookup, be either in the extrinsic or in our recent history.

(11.39) $\forall w \in \mathbf{w},\ \forall p \in (w_x)_\mathbf{p} \cup \mathcal{K}(w_\mathbf{l}) : p \in \mathbf{p} \cup \{x \mid x \in \mathcal{K}(b_\mathbf{p}),\ b \in \beta\}$

We require that any segment roots mentioned in the segment-root lookup be verified as correct based on our recent work-package history and the present block:

(11.40) let $\mathbf{p} = \{((g_w)_s)_h \mapsto ((g_w)_s)_e \mid g \in \mathbf{E}_G\}$

(11.41) $\forall w \in \mathbf{w} : w_\mathbf{l} \subseteq \mathbf{p} \cup \bigcup_{b \in \beta} b_\mathbf{p}$

(Note that these checks leave open the possibility of accepting work-reports in apparent dependency loops. We do not consider this a problem: the pre-accumulation stage effectively guarantees that accumulation never happens in these cases and the reports are simply ignored.)

Finally, we require that all work results within the extrinsic predicted the correct code hash for their corresponding service:

(11.42) $\forall w \in \mathbf{w},\ \forall r \in w_\mathbf{r} : r_c = \delta[r_s]_c$

11.5. Transitioning for Reports. We define ρ′ as being equivalent to ρ‡, except where the extrinsic replaced an entry. In the case an entry is replaced, the new value includes the present time τ′ allowing for the value to be replaced without respect to its availability once sufficient time has elapsed (see equation 11.29).

(11.43) $\forall c \in \mathbb{N}_\mathsf{C} : \rho'[c] \equiv \begin{cases} (w,\ t \blacktriangleright \tau') & \text{if } \exists (c, w, a) \in \mathbf{E}_G \\ \rho^\ddagger[c] & \text{otherwise} \end{cases}$
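The core-occupancy, authorizer and gas requirements of equations 11.29 and 11.30 can be sketched in Python. The sketch assumes a simplified representation that is not taken from the source: reports are dictionaries with hypothetical keys, ρ‡ is a list with `None` for an empty core, and the overall limit G_A is passed in as a parameter because its value is not stated in this section.

```python
def guarantees_admissible(reports, pending, auth_pools, min_gas, gas_limit):
    """Return True only if every report passes (11.29) and (11.30).

    reports    -- list of dicts with keys "core", "authorizer_hash" and
                  "results" (each result a dict with "service" and "gas")
    pending    -- list indexed by core; None marks an empty core (rho-double-dagger)
    auth_pools -- list indexed by core of sets of authorizer hashes (alpha)
    min_gas    -- dict from service index to its minimum accumulate gas
    gas_limit  -- the overall per-report gas limit (G_A), assumed given
    """
    for report in reports:
        core = report["core"]
        # (11.29) the target core must be free ...
        if pending[core] is not None:
            return False
        # ... and the authorizer hash must be in that core's pool.
        if report["authorizer_hash"] not in auth_pools[core]:
            return False
        # (11.30) the report's total allotted gas must fit the limit ...
        if sum(r["gas"] for r in report["results"]) > gas_limit:
            return False
        # ... and each item must meet its service's minimum.
        if any(r["gas"] < min_gas[r["service"]] for r in report["results"]):
            return False
    return True
```

The checks are evaluated per report, so a single failing report makes the whole extrinsic inadmissible in this sketch.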
… ($w \in \mathbf{w}_{\dots i}$, $r \in w_\mathbf{r}$) …
… and $(g^*, \mathbf{o}^*, \mathbf{t}^*, \mathbf{b}^*) = \Delta_*(\mathbf{o}, \mathbf{w}_{\dots i}, \mathbf{f})$
… and $(j, \mathbf{o}', \mathbf{t}, \mathbf{b}) = \Delta_+(g - g^*, \mathbf{w}_{i \dots}, \mathbf{o}^*, \{\})$

We come to define the parallelized accumulation function ∆∗ which, with the help of the single-service accumulation function ∆1, transforms an initial state-context, together with a sequence of work-reports and a dictionary of privileged always-accumulate services, into a tuple of the total gas utilized in pvm execution u, a posterior state-context (x′, d′, i′, q′) and the resultant accumulation-output pairings b and deferred-transfers t:

(12.17) $\Delta_* : (\mathbb{U}, \llbracket \mathbb{W} \rrbracket, \mathbb{D}\langle \mathbb{N}_S \to \mathbb{N}_G \rangle) \to (\mathbb{N}_G, \mathbb{U}, \llbracket \mathbb{T} \rrbracket, \mathbb{B})$
$(\mathbf{o}, \mathbf{w}, \mathbf{f}) \mapsto (u, (\mathbf{x}', \mathbf{d}', \mathbf{i}', \mathbf{q}'), \widehat{\mathbf{t}}, \mathbf{b})$
where:
$\mathbf{s} = \{r_s \mid w \in \mathbf{w},\ r \in w_\mathbf{r}\} \cup \mathcal{K}(\mathbf{f})$
$u = \sum_{s \in \mathbf{s}} (\Delta_1(\mathbf{o}, \mathbf{w}, \mathbf{f}, s)_u)$
$\mathbf{b} = \{(s, b) \mid s \in \mathbf{s},\ b = \Delta_1(\mathbf{o}, \mathbf{w}, \mathbf{f}, s)_b,\ b \ne \emptyset\}$
$\mathbf{t} = [\Delta_1(\mathbf{o}, \mathbf{w}, \mathbf{f}, s)_\mathbf{t} \mid s \mathrel{<\!\!-} \mathbf{s}]$
$((m, a, v, \mathbf{z}), \mathbf{d}, \mathbf{i}, \mathbf{q}) = \mathbf{o}$
$\mathbf{x}' = (\Delta_1(\mathbf{o}, \mathbf{w}, \mathbf{f}, m)_\mathbf{o})_\mathbf{x}$
$\mathbf{i}' = (\Delta_1(\mathbf{o}, \mathbf{w}, \mathbf{f}, a)_\mathbf{o})_\mathbf{i}$
$\mathbf{q}' = (\Delta_1(\mathbf{o}, \mathbf{w}, \mathbf{f}, v)_\mathbf{o})_\mathbf{q}$
$\mathbf{d}' = \{s \mapsto \mathbf{d}_s \mid s \in \mathcal{K}(\mathbf{d}) \setminus \mathbf{s}\} \cup \bigcup_{s \in \mathbf{s}} ((\Delta_1(\mathbf{o}, \mathbf{w}, \mathbf{f}, s)_\mathbf{o})_\mathbf{d})$

We note that all newly added service indices, defined in the above context as $\bigcup_{s \in \mathbf{s}} \mathcal{K}((\Delta_1(\mathbf{o}, \mathbf{w}, \mathbf{f}, s)_\mathbf{o})_\mathbf{d}) \setminus \mathbf{s}$, must not conflict with the indices of existing services K(δ) or other newly added services. This should never happen, since new indices are explicitly selected to avoid such conflicts, but in the unlikely event it happens, the block must be considered invalid.

The single-service accumulation function, ∆1, transforms an initial state-context, sequence of work-reports and a service index into an alterations state-context, a sequence of transfers, a possible accumulation-output and the actual pvm gas used. This function wrangles the work-items of a particular service from a set of work-reports and invokes pvm execution with said data:

(12.18) $\mathbb{O} \equiv (\mathbf{o} \in \mathbb{Y} \cup \mathbb{J},\ l \in \mathbb{H},\ k \in \mathbb{H},\ \mathbf{a} \in \mathbb{Y})$

… $\mathbf{p} = [(\dots,\ a \blacktriangleright w_\mathbf{o},\ k \blacktriangleright (w_s)_h) \mid w \mathrel{<\!\!-} \mathbf{w},\ r \mathrel{<\!\!-} w_\mathbf{r},\ r_s = s]$

This introduces O, the set of wrangled operand tuples, used as an operand to the pvm Accumulation function Ψ_A. It also draws upon g, the gas limit implied by the work-reports and gas-privileges for s and p, a rephrasing of the work-items for s within w into a sequence of operand tuples O.

12.3. Deferred Transfers and State Integration. Given the result of the top-level ∆+, we may define the posterior state χ′, φ′ and ι′ as well as the first intermediate state of the service-accounts δ† and the Beefy commitment map C:

(12.20) let $g = \max\left( \mathsf{G}_T,\ \mathsf{G}_A \cdot \mathsf{C} + \sum_{x \in \mathcal{V}(\chi_\mathbf{g})} (x) \right)$

(12.21) let $(n, \mathbf{o}, \mathbf{t}, \mathbf{C}) = \Delta_+(g, \mathbf{W}^*, (\chi, \delta, \iota, \varphi), \chi_\mathbf{g})$

(12.22) $(\chi', \delta^\dagger, \iota', \varphi') \equiv \mathbf{o}$

Note that the accumulation commitment map C is a set of pairs of indices of the output-yielding accumulated services to their accumulation result. This is utilized in equation 7.3, when determining the accumulation-result tree root for the present block, useful for the Beefy protocol.

We have denoted the sequence of implied transfers as t, ordered internally according to the source service’s execution. We define a selection function R, which maps a sequence of deferred transfers and a desired destination service index into the sequence of transfers targeting said service, ordered primarily according to the source service index and secondarily their order within t. Formally:

(12.23) $R : (\llbracket \mathbb{T} \rrbracket, \mathbb{N}_S) \to \llbracket \mathbb{T} \rrbracket$
$(\mathbf{t}, d) \mapsto [t \mid s \mathrel{<\!\!-} \mathbb{N}_S,\ t \mathrel{<\!\!-} \mathbf{t},\ t_s = s,\ t_d = d]$

The second intermediate state δ‡ may then be defined with all the deferred effects of the transfers applied:

(12.24) $\delta^\ddagger = \{s \mapsto \Psi_T(\delta^\dagger, \tau', s, R(\mathbf{t}, s)) \mid (s \mapsto a) \in \delta^\dagger\}$

Note that Ψ_T is defined in appendix B.5 such that it results in δ†[d], i.e. no difference to the account’s intermediate state, if R(t, d) = [], i.e. said account received no transfers.

We define the final state of the ready queue and the accumulated map by integrating those work-reports which were accumulated in this block and shifting any from the prior state with the oldest such items being dropped entirely:

(12.25) $\xi'_{\mathsf{E}-1} = P(\mathbf{W}^*_{\dots n})$

(12.26) $\forall i \in \mathbb{N}_{\mathsf{E}-1} : \xi'_i \equiv \xi_{i+1}$
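The selection function R of equation 12.23 can be sketched in Python. This is a minimal illustration: the `Transfer` tuple here carries only a source, a destination and an amount, which is a simplification assumed for the example and not the full deferred-transfer tuple, and the name `select_transfers` is hypothetical.

```python
from typing import NamedTuple


class Transfer(NamedTuple):
    source: int   # t_s, index of the sending service
    dest: int     # t_d, index of the receiving service
    amount: int   # illustrative payload only


def select_transfers(transfers, dest):
    """Transfers targeting `dest`, ordered by source service index first
    and by original position in `transfers` second, as in (12.23)."""
    tagged = [
        (t.source, position, t)
        for position, t in enumerate(transfers)
        if t.dest == dest
    ]
    tagged.sort(key=lambda item: (item[0], item[1]))
    return [t for _, _, t in tagged]
```

Iterating over every service index and filtering, as the formal definition does, gives the same ordering; sorting by the pair (source, position) simply avoids the outer loop.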
(12.27) $\forall i \in \mathbb{N}_\mathsf{E} : \vartheta'^{\circlearrowleft}_{m-i} \equiv \begin{cases} E(\mathbf{W}^Q, \xi'_{\mathsf{E}-1}) & \text{if } i = 0 \\ [] & \text{if } 1 \le i < \tau' - \tau \\ E(\vartheta^{\circlearrowleft}_{m-i}, \xi'_{\mathsf{E}-1}) & \text{if } i \ge \tau' - \tau \end{cases}$

12.4. Preimage Integration. After accumulation, we must integrate all preimages provided in the lookup extrinsic to arrive at the posterior account state. The lookup extrinsic is a sequence of pairs of service indices and data. These pairs must be ordered and without duplicates (equation 12.29 requires this). The data must have been solicited by a service but not yet provided in the prior state. Formally:

(12.28) $\mathbf{E}_P \in \llbracket (\mathbb{N}_S, \mathbb{Y}) \rrbracket$

(12.29) $\mathbf{E}_P = [i \mathrel{\text{\_\_}} i \in \mathbf{E}_P]$

(12.30) $R(\mathbf{d}, s, h, l) \equiv h \notin \mathbf{d}[s]_\mathbf{p} \wedge \mathbf{d}[s]_\mathbf{l}[(h, l)] = []$

(12.31) $\forall (s, \mathbf{p}) \in \mathbf{E}_P : R(\delta, s, \mathcal{H}(\mathbf{p}), |\mathbf{p}|)$

We disregard, without prejudice, any preimages which due to the effects of accumulation are no longer useful. We define δ′ as the state after the integration of the still-relevant preimages:

(12.32) let $\mathbf{P} = \{(s, \mathbf{p}) \mid (s, \mathbf{p}) \in \mathbf{E}_P,\ R(\delta^\ddagger, s, \mathcal{H}(\mathbf{p}), |\mathbf{p}|)\}$

(12.33) $\delta' = \delta^\ddagger \text{ ex. } \forall (s, \mathbf{p}) \in \mathbf{P} : \begin{cases} \delta'[s]_\mathbf{p}[\mathcal{H}(\mathbf{p})] = \mathbf{p} \\ \delta'[s]_\mathbf{l}[\mathcal{H}(\mathbf{p}), |\mathbf{p}|] = [\tau'] \end{cases}$

13. Validator Activity Statistics

The Jam chain does not explicitly issue rewards; we leave this as a job to be done by the staking subsystem (in Polkadot’s case envisioned as a system parachain, hosted without fees, in the current imagining of a public Jam network). However, much as with validator punishment information, it is important for the Jam chain to facilitate the arrival of information on validator activity into the staking subsystem so that it may be acted upon.

Such performance information cannot directly cover all aspects of validator activity; whereas block production, guarantor reports and availability assurance can easily be tracked on-chain, Grandpa, Beefy and auditing activity cannot. In the latter case, this is instead tracked with validator voting activity: validators vote on their impression of each other’s efforts and a median may be accepted as the truth for any given validator. With an assumption of 50% honest validators, this gives an adequate means of oraclizing this information.

The validator statistics are made on a per-epoch basis and we retain one record of completed statistics together with one record which serves as an accumulator for the present epoch. Both are tracked in π, which is thus a sequence of two elements, with the first being the accumulator and the second the previous epoch’s statistics. For each epoch we track a performance record for each validator:

(13.1) $\pi \in \llbracket \llbracket (b \in \mathbb{N},\ t \in \mathbb{N},\ p \in \mathbb{N},\ d \in \mathbb{N},\ g \in \mathbb{N},\ a \in \mathbb{N}) \rrbracket_\mathsf{V} \rrbracket_2$

The six statistics we track are:
b: The number of blocks produced by the validator.
t: The number of tickets introduced by the validator.
p: The number of preimages introduced by the validator.
d: The total number of octets across all preimages introduced by the validator.
g: The number of reports guaranteed by the validator.
a: The number of availability assurances made by the validator.

The objective statistics are updated in line with their description, formally:

(13.2) let $e = \left\lfloor \frac{\tau}{\mathsf{E}} \right\rfloor$, $e' = \left\lfloor \frac{\tau'}{\mathsf{E}} \right\rfloor$

(13.3) $(\mathbf{a}, \pi_1') \equiv \begin{cases} (\pi_0, \pi_1) & \text{if } e' = e \\ ([(0, \dots, [0, \dots]), \dots], \pi_0) & \text{otherwise} \end{cases}$

(13.4) $\forall v \in \mathbb{N}_\mathsf{V} : \begin{cases} \pi_0'[v]_b \equiv \mathbf{a}[v]_b + (v = \mathbf{H}_i) \\ \pi_0'[v]_t \equiv \mathbf{a}[v]_t + \begin{cases} |\mathbf{E}_T| & \text{if } v = \mathbf{H}_i \\ 0 & \text{otherwise} \end{cases} \\ \pi_0'[v]_p \equiv \mathbf{a}[v]_p + \begin{cases} |\mathbf{E}_P| & \text{if } v = \mathbf{H}_i \\ 0 & \text{otherwise} \end{cases} \\ \pi_0'[v]_d \equiv \mathbf{a}[v]_d + \begin{cases} \sum_{d \in \mathbf{E}_P} |d| & \text{if } v = \mathbf{H}_i \\ 0 & \text{otherwise} \end{cases} \\ \pi_0'[v]_g \equiv \mathbf{a}[v]_g + (\kappa_v' \in \mathbf{R}) \\ \pi_0'[v]_a \equiv \mathbf{a}[v]_a + (\exists a \in \mathbf{E}_A : a_v = v) \end{cases}$

Note that R is the Reporters set, as defined in equation 11.26.

14. Work Packages and Work Reports

14.1. Honest Behavior. We have so far specified how to recognize blocks for a correctly transitioning Jam blockchain. Through defining the state transition function and a state Merklization function, we have also defined how to recognize a valid header. While it is not especially difficult to understand how a new block may be authored for any node which controls a key which would allow the creation of the two signatures in the header, nor indeed to fill in the other header fields, readers will note that the contents of the extrinsic remain unclear.

We define not only correct behavior through the creation of correct blocks but also honest behavior, which involves the node taking part in several off-chain activities. This does have analogous aspects within YP Ethereum, though it is not mentioned so explicitly in said document: the creation of blocks along with the gossiping and inclusion of transactions within those blocks would all count as off-chain activities for which honest behavior is helpful. In Jam’s case, honest behavior is well-defined and expected of at least 2/3 of validators.

Beyond the production of blocks, incentivized honest behavior includes:
● the guaranteeing and reporting of work-packages, along with chunking and distribution of both the chunks and the work-package itself, discussed in section 15;
● assuring the availability of work-packages after being in receipt of their data;
● determining which work-reports to audit, fetching and auditing them, and creating and distributing judgments appropriately based on the outcome of the audit;
● submitting the correct amount of auditing work seen being done by other validators, discussed in section 13.

14.2. Segments and the Manifest. Our basic erasure-coding segment size is W_E = 684 octets, derived from the fact we wish to be able to reconstruct even should almost two-thirds of our 1023 participants be malicious or incapacitated, the 16-bit Galois field on which the erasure-code is based and the desire to efficiently support encoding data of close to, but no less than, 4kb.

Work-packages are generally small to ensure guarantors need not invest a lot of bandwidth in order to discover whether they can get paid for their evaluation into a work-report. Rather than having much data inline, they instead reference data through commitments. The simplest commitments are extrinsic data.

Extrinsic data are blobs which are being introduced into the system alongside the work-package itself generally by the work-package builder. They are exposed to the Refine logic as an argument. We commit to them through including each of their hashes in the work-package.

Work-packages have two other types of external data associated with them: a cryptographic commitment to each imported segment and finally the number of segments which are exported.

14.2.1. Segments, Imports and Exports. The ability to communicate large amounts of data from one work-package to some subsequent work-package is a key feature of the Jam availability system. An export segment, defined as the set G, is an octet sequence of fixed length W_G = 4104. It is the smallest datum which may individually be imported from, or exported to, the long-term Imports da during the Refine function of a work-package. Being an exact multiple of the erasure-coding piece size ensures that the data segments of a work-package can be efficiently placed in the da system.

(14.1) $\mathbb{G} \equiv \mathbb{Y}_{\mathsf{W}_G}$

Exported segments are data which are generated through the execution of the Refine logic and thus are a side effect of transforming the work-package into a work-report. Since their data is deterministic based on the execution of the Refine logic, we do not require any particular commitment to them in the work-package beyond knowing how many are associated with each Refine invocation in order that we can supply an exact index.

On the other hand, imported segments are segments which were exported by previous work-packages. In order for them to be easily fetched and verified they are referenced not by hash but rather the root of a Merkle tree which includes any other segments introduced at the time, together with an index into this sequence. This allows for justifications of correctness to be generated, stored, included alongside the fetched data and verified. This is described in depth in the next section.

14.2.2. Data Collection and Justification. It is the task of a guarantor to reconstitute all imported segments through fetching said segments’ erasure-coded chunks from enough unique validators. Reconstitution alone is not enough since corruption of the data would occur if one or more validators provided an incorrect chunk. For this reason we ensure that the import segment specification (a Merkle root and an index into the tree) be a kind of cryptographic commitment capable of having a justification applied to demonstrate that any particular segment is indeed correct.

Justification data must be available to any node over the course of its segment’s potential requirement. At around 350 bytes to justify a single segment, justification data is too voluminous to have all validators store all data. We therefore use the same overall availability framework for hosting justification metadata as the data itself.

The guarantor is able to use this proof to justify to themselves that they are not wasting their time on incorrect behavior. We do not force auditors to go through the same process. Instead, guarantors build an Auditable Work Package, and place this in the Audits da system. This is the original work-package, its extrinsic data, its imported data and a concise proof of correctness of that imported data. This tactic routinely duplicates data between the Imports da and the Audits da; however, it is acceptable in order to reduce the bandwidth cost for auditors, who must justify the correctness as cheaply as possible since auditing happens on average 30 times for each work-package whereas guaranteeing happens only twice or thrice.

14.3. Packages and Items. We begin by defining a work-package, of set P, and its constituent work items, of set I. A work-package includes a simple blob acting as an authorization token j, the index of the service which hosts the authorization code h, an authorization code hash u and a parameterization blob p, a context x and a sequence of work items w:

(14.2) $\mathbb{P} \equiv (\mathbf{j} \in \mathbb{Y},\ h \in \mathbb{N}_S,\ u \in \mathbb{H},\ \mathbf{p} \in \mathbb{Y},\ x \in \mathbb{X},\ \mathbf{w} \in \llbracket \mathbb{I} \rrbracket_{1:\mathsf{I}})$

A work item includes: s the identifier of the service to which it relates, the code hash of the service at the time of reporting c (whose preimage must be available from the perspective of the lookup anchor block), a payload blob y, gas limits for Refinement and Accumulation g & a, and the three elements of its manifest, a sequence of imported data segments i which identify a prior exported segment through an index and the identity of an exporting work-package, x, a sequence of blob hashes and lengths to be introduced in this block (and which we assume the validator knows) and e the number of data segments exported by this work item.

(14.3) $\mathbb{I} \equiv (s \in \mathbb{N}_S,\ c \in \mathbb{H},\ \mathbf{y} \in \mathbb{Y},\ g \in \mathbb{N}_G,\ a \in \mathbb{N}_G,\ e \in \mathbb{N},\ \mathbf{i} \in \llbracket (\mathbb{H} \cup (\mathbb{H}^\boxplus), \mathbb{N}) \rrbracket,\ \mathbf{x} \in \llbracket (\mathbb{H}, \mathbb{N}) \rrbracket)$

Note that an imported data segment’s work-package is identified through the union of sets H and a tagged variant H⊞. A value drawn from the regular H implies the hash value is of the segment-root containing the export, whereas a value drawn from H⊞ implies the hash value is the hash of the exporting work-package. In the latter case it must be converted into a segment-root by the guarantor and this conversion reported in the work-report for on-chain validation.

We limit both the total number of exported items and the total number of imported items to W_M = 2^11:

(14.4) $\forall p \in \mathbb{P} : \sum_{w \in p_\mathbf{w}} w_e \le \mathsf{W}_M \wedge \sum_{w \in p_\mathbf{w}} |w_\mathbf{i}| \le \mathsf{W}_M$

We make an assumption that the preimage to each extrinsic hash in each work-item is known by the guarantor.
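The size relationships of section 14.2 and the manifest limit of equation 14.4 can be checked with a short Python sketch. The constants are the values stated in the text; the function names and the representation of a work item as an `(imports, exports)` pair are assumptions made for this illustration.

```python
W_E = 684       # erasure-coding piece size in octets (section 14.2)
W_G = 4104      # export segment size in octets (section 14.2.1)
W_M = 2 ** 11   # cap on total imports and on total exports (14.4)


def pieces_per_segment() -> int:
    """Number of erasure-coding pieces in one segment; the text requires
    the segment size to be an exact multiple of the piece size."""
    if W_G % W_E != 0:
        raise ValueError("segment size is not a multiple of the piece size")
    return W_G // W_E


def manifest_within_limits(items) -> bool:
    """Check (14.4) for a work-package given as (imports, exports) pairs,
    one pair per work item."""
    total_imports = sum(imports for imports, _ in items)
    total_exports = sum(exports for _, exports in items)
    return total_imports <= W_M and total_exports <= W_M
```

A segment is exactly six pieces, and a piece of 684 octets holds 342 sixteen-bit field elements, which is one more than a third of 1,023. The two sums in the limit are bounded separately, so a package may import 2,048 segments and export 2,048 segments at the same time.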
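In general this data will be passed to the guarantor alongside the work-package.

We limit the total size of the implied import and extrinsic items, together with all payloads, the authorizer parameter and the authorization token to 12mb in order to allow for around 2mb/s/core data throughput:

…

(14.8) $C : \dots\ ((s, c, \mathbf{y}, a), \mathbf{o}) \mapsto (s, c, \mathcal{H}(\mathbf{y}), g \blacktriangleright a, \mathbf{o})$

We define the work-package’s implied authorizer as p_a, the hash of the concatenation of the authorization code and the parameterization. We define the authorization code as p_c and require that it be available at the time of the lookup anchor block from the historical lookup of service p_h. Formally:

(14.9) $\forall p \in \mathbb{P} : \begin{cases} p_a \equiv \mathcal{H}(p_\mathbf{c} \frown p_\mathbf{p}) \\ p_\mathbf{c} \equiv \Lambda(\delta[p_h], (p_x)_t, p_u) \\ p_\mathbf{c} \in \mathbb{Y} \end{cases}$

(The historical lookup function, Λ, is defined in equation 9.7.)

… may be made from it:

(14.10) $P : \llbracket \mathbb{G} \rrbracket \to \llbracket \mathbb{G} \rrbracket$
$\mathbf{s} \mapsto [\mathcal{P}_l(\mathcal{E}(\updownarrow\!\mathcal{J}_6(\mathbf{s}, i), \updownarrow\!\mathcal{L}_6(\mathbf{s}, i))) \mid i \mathrel{<\!\!-} \mathbb{N}_{\lceil |\mathbf{s}|/64 \rceil}]$
where $l = \mathsf{W}_G$

Where:

$\mathcal{K}(\mathbf{l}) \equiv \{h \mid w \in p_\mathbf{w},\ (h^\boxplus, n) \in w_\mathbf{i}\}$, $|\mathbf{l}| \le 8$
$\mathbf{o} = \Psi_I(p, c)$
$(\mathbf{r}, \mathbf{e}) = {}^{\mathsf{T}}[(C(p_\mathbf{w}[j], r), \mathbf{e}) \mid (r, \mathbf{e}) = I(p, j),\ j \mathrel{<\!\!-} \mathbb{N}_{|p_\mathbf{w}|}]$
$I(p, j) \equiv \begin{cases} (r, \mathbf{e}) & \text{if } |\mathbf{e}| = w_e \\ (r, [\mathbb{G}_0, \mathbb{G}_0, \dots]_{\dots w_e}) & \text{otherwise if } r \notin \mathbb{Y} \\ (\circledcirc, [\mathbb{G}_0, \mathbb{G}_0, \dots]_{\dots w_e}) & \text{otherwise} \end{cases}$
where $(r, \mathbf{e}) = \Psi_R(w_c, w_g, w_s, h, w_\mathbf{y}, p_x, p_a, \mathbf{o}, S(w, \mathbf{l}), X(w), \ell)$
and $h = \mathcal{H}(p)$, $w = p_\mathbf{w}[j]$, $\ell = \sum_{k < j} p_\mathbf{w}[k]_e$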
Note that we gracefully handle the case where number
of segments actually exported by a work-item’s Refine ex-
14.3.1. Exporting. Any of a work-package’s work-items ecution is incorrectly reported in the work-item’s export
may export segments and a segments-root is placed in the segment count. In this case, the work-package continues
work-report committing to these, ordered according to the to be valid as a whole, but the work item’s exported seg-
work-item which is exporting. It is formed as the root of a ments are replaced by a sequence of zero-segments equal
constant-depth binary Merkle tree as defined in equation in size to the export segment count.
E.4. Initially we constrain the segment-root dictionary l: It
Guarantors are required to erasure-code and distribute should contain entries for all unique work-package hashes
two data sets: one blob, the auditable work-package con- of imported segments not identified directly via a segment-
taining the encoded work-package, extrinsic data and self- root but rather through a work-package hash.
justifying imported segments which is placed in the short- We immediately define the segment-root lookup func-
term Audit da store and a second set of exported-segments tion L, dependent on this dictionary, which collapses
data together with the Paged-Proofs metadata. Items in a union of segment-roots and work-package hashes into
the first store are short-lived; assurers are expected to keep segment-roots using the dictionary:
them only until finality of the block in which the availabil-
⎧
⎪
ity of the work-result’s work-package is assured. Items in ⎪r if r ∈ H
(14.12) L(r ∈ H ∪ H⊞ ) ≡ ⎨
the second, meanwhile, are long-lived and expected to be ⎪
⎪l[h] if ∃h ∈ H ∶ r = h⊞
⎩
kept for a minimum of 28 days (672 complete epochs) fol-
lowing the reporting of the work-report. In order to expect to be compensated for a work-report
We define the paged-proofs function P which accepts they are building, guarantors must compose a value for l
a series of exported segments s and defines some series to ensure not only the above but also a further constraint
of additional segments placed into the Imports da system that all pairs of work-package hashes and segment-roots
via erasure-coding and distribution. The function evalu- do properly correspond:
ates to pages of hashes, together with subtree proofs, such (14.13)
that justifications of correctness based on a segments-root ∀(h ↦ e) ∈ l ∶ ∃p, c ∈ P, NC ∶ H(p) = h ∧ (Ξ(p, c)s )e = e
As long as the guarantor is unable to satisfy the above constraints, then it should consider the work-package unable to be guaranteed. Auditors are not expected to populate this but rather to reuse the value in the work-report they are auditing.
The next term to be introduced, $o$, is the authorization output, the result of the Is-Authorized function. The second term, $(r, e)$ is the sequence of results for each of the work-items in the work-package together with all segments exported by each work-item. The third definition $I$ performs an ordered accumulation (i.e. counter) in order to ensure that the Refine function has access to the total number of exports made from the work-package up to the current work-item.
The above relies on two functions, $S$ and $X$ which, respectively, define the import segment data and the extrinsic data for some work-item argument $w$. We also define $J$, which compiles justifications of segment data:
(14.14)
$X(w \in \mathbb{I}) \equiv [d \mid (\mathcal{H}(d), |d|) \mathrel{<\!\!-} w_x]$
$S(w \in \mathbb{I}) \equiv [s[n] \mid \mathcal{M}(s) = L(r), (r, n) \mathrel{<\!\!-} w_i]$
$J(w \in \mathbb{I}) \equiv [{\updownarrow}J(s, n) \mid \mathcal{M}(s) = L(r), (r, n) \mathrel{<\!\!-} w_i]$
We may then define $s$ as the data availability specification
Note that while $S$ and $J$ are both formulated using the term $s$ (all segments exported by all work-packages exporting a segment to be imported) such a vast amount of data is not generally needed as the justification can be derived through a single paged-proof. This reduces the worst case data fetching for a guarantor to two segments for every one to be imported. In the case that contiguously exported segments are imported (which we might assume is a fairly common situation), then a single proof-page should be sufficient to justify many imported segments.
Also of note is the lack of length prefixes: only the Merkle paths for the justifications have a length prefix. All other sequence lengths are determinable through the work package itself.
The Is-Authorized logic it references must be executed first in order to ensure that the work-package warrants the needed core-time. Next, the guarantor should ensure that all segment-tree roots which form imported segment commitments are known and have not expired. Finally, the guarantor should ensure that they can fetch all preimage data referenced as the commitments of extrinsic segments.
Once done, then imported segments must be reconstructed. This process may in fact be lazy as the Refine function makes no usage of the data until the import host-call is made. Fetching generally implies that, for each imported segment, erasure-coded chunks are retrieved from enough unique validators (342, including the guarantor) and is described in more depth in appendix H. (Since we specify systematic erasure-coding, its reconstruction is trivial in the case that the correct 342 validators are responsive.) Chunks must be fetched for both the data itself and for justification metadata which allows us to ensure that the data is correct.
Validators, in their role as availability assurers, should index such chunks according to the index of the segments-tree whose reconstruction they facilitate. Since the data for segment chunks is so small at 12 octets, fixed communications costs should be kept to a bare minimum. A good network protocol (out of scope at present) will allow guarantors to specify only the segments-tree root and index together with a Boolean to indicate whether the proof chunk need be supplied. Since we assume at least 341 other validators are online and benevolent, we can assume that the guarantor can compute $S$ and $J$ above with confidence, based on the general availability of data committed to with $s^{\clubsuit}$, which is specified below.
14.4.1. Availability Specifier. We define the availability specifier function $A$, which creates an availability specifier from the package hash, an octet sequence of the audit-friendly work-package bundle (comprising the work-package itself, the extrinsic data and the concatenated import segments along with their proofs of correctness), and the sequence of exported segments:
(14.16) $A : \begin{cases} (\mathbb{H}, \mathbb{Y}, [\![\mathbb{G}]\!]) \to \mathbb{S} \\ (h, b, s) \mapsto (h, l \blacktriangleright |b|, u, e \blacktriangleright \mathcal{M}(s), n \blacktriangleright |s|) \end{cases}$
The paged-proofs function $P$, defined earlier in equation 14.10, accepts a sequence of segments and returns a sequence of paged-proofs sufficient to justify the correctness of every segment. There are $\lceil |s|/64 \rceil$ paged-proof segments for $|s|$ exported segments, that is 1/64 as many (rounded up), each composed of a page of 64 hashes of segments, together with a Merkle proof from the root to the subtree-root which includes those 64 segments.
The functions $\mathcal{M}$ and $\mathcal{M}_B$ are the fixed-depth and simple binary Merkle root functions, defined in equations E.4 and E.3. The function $C$ is the erasure-coding function, defined in appendix H.
And $\mathcal{P}$ is the zero-padding function to take an octet array to some multiple of $n$ in length:
(14.17) $\mathcal{P}_{n \in \mathbb{N}_{1\dots}} : \begin{cases} \mathbb{Y} \to \mathbb{Y}_{k \cdot n} \\ x \mapsto x \frown [0, 0, \dots]_{((|x| + n - 1) \bmod n) + 1 \dots n} \end{cases}$
Validators are incentivized to distribute each newly erasure-coded data chunk to the relevant validator, since they are not paid for guaranteeing unless a work-report is considered to be available by a super-majority of validators. Given our work-package $p$, we should therefore send the corresponding work-package bundle chunk and exported segments chunks to each validator whose keys are together with similarly corresponding chunks for imported, extrinsic and exported segments data, such that each validator can justify completeness according to the work-report’s erasure-root. In the case of a coming epoch change, they may also maximize expected reward by distributing to the new validator set.
We will see this function utilized in the next sections, for guaranteeing, auditing and judging.
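The zero-padding function of equation 14.17 and the page count implied by equation 14.10 are both small pieces of modular arithmetic. The following Python sketch shows one way to compute them, assuming octet sequences are represented as `bytes` and reading the subscript of equation 14.17 as a half-open index range; the function names are illustrative only.

```python
def zero_pad(x: bytes, n: int) -> bytes:
    """Sketch of the padding function of equation 14.17.

    Extends x with zero octets so that its length becomes a multiple of n.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Number of elements in the slice ((|x| + n - 1) mod n) + 1 ... n of an all-zero sequence.
    pad_len = n - 1 - ((len(x) + n - 1) % n)
    return x + bytes(pad_len)


def paged_proof_count(segment_count: int) -> int:
    """Number of paged-proof segments for the given number of exported segments: ceil(|s| / 64)."""
    return -(-segment_count // 64)
```

An input whose length is already a multiple of `n`, including the empty sequence, is returned unchanged, which matches the slice in equation 14.17 being empty in that case.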
15. Guaranteeing
Guaranteeing work-packages involves the creation and distribution of a corresponding work-report which requires certain conditions to be met. Along with the report, a signature demonstrating the validator’s commitment to its correctness is needed. With two guarantor signatures, the work-report may be distributed to the forthcoming Jam chain block author in order to be used in the $E_G$, which leads to a reward for the guarantors.
We presume that in a public system, validators will be punished severely if they malfunction and commit to a report which does not faithfully represent the result of $\Xi$ applied on a work-package. Overall, the process is:
(1) Evaluation of the work-package’s authorization, and cross-referencing against the authorization pool in the most recent Jam chain state.
(2) Creation and publication of a work-package report.
(3) Chunking of the work-package and each of its extrinsic and exported data, according to the erasure codec.
(4) Distributing the aforementioned chunks across the validator set.
(5) Providing the work-package, extrinsic and exported data to other validators on request is also helpful for optimal network performance.
For any work-package $p$ we are in receipt of, we may determine the work-report, if any, it corresponds to for the core $c$ that we are assigned to. When Jam chain state is needed, we always utilize the chain state of the most recent block.
For any guarantor of index $v$ assigned to core $c$ and a work-package $p$, we define the work-report $r$ simply as:
(15.1) $r = \Xi(p, c)$
Such guarantors may safely create and distribute the payload $(s, v)$. The component $s$ may be created according to equation 11.26; specifically it is a signature using the validator’s registered Ed25519 key on a payload $l$:
(15.2) $l = \mathcal{H}(\mathcal{E}(c, r))$
To maximize profit, the guarantor should require the work result meets all expectations which are in place during the guarantee extrinsic described in section 11.4. This includes contextual validity and inclusion of the authorization in the authorization pool. Not doing so does not result in punishment, but will prevent the block author from including the package and so reduces rewards.
Advanced nodes may maximize the likelihood that their reports will be includable on-chain by attempting to predict the state of the chain at the time that the report will get to the block author. Naive nodes may simply use the current chain head when verifying the work-report. To minimize work done, nodes should make all such evaluations prior to evaluating the $\Psi_R$ function to calculate the report’s work results.
Once evaluated as a reasonable work-package to guarantee, guarantors should maximize the chance that their work is not wasted by attempting to form consensus over the core. To achieve this they should send the work-package to any other guarantors on the same core which they do not believe already know of it.
In order to minimize the work for block authors and thus maximize expected profits, guarantors should attempt to construct their core’s next guarantee extrinsic from the work-report, core index and set of attestations including their own and as many others as possible.
In order to minimize the chance of any block authors disregarding the guarantor for anti-spam measures, guarantors should sign an average of no more than two work-reports per timeslot.
16. Availability Assurance
Validators should issue a signed statement, called an assurance, when they are in possession of all of their corresponding erasure-coded chunks for a given work-report which is currently pending availability. For any work-report to gain an assurance, there are two classes of data a validator must have:
Firstly, their erasure-coded chunk for this report’s bundle. The validity of this chunk can be trivially proven through the work-report’s work-package erasure-root and a Merkle-proof of inclusion in the correct location. The proof should be included from the guarantor. This chunk is needed to verify the work-report’s validity and completeness and need not be retained after the work-report is considered audited. Until then, it should be provided on request to validators.
Secondly, the validator should have in hand the corresponding erasure-coded chunk for each of the exported segments referenced by the segments root. These should be retained for 28 days and provided to any validator on request.
17. Auditing and Judging
The auditing and judging system is theoretically equivalent to that in Elves, introduced by Jeff Burdges, Cevallos, et al. 2024. For a full security analysis of the mechanism, see this work. There is a difference in terminology, where the terms backing, approval and inclusion there refer to our guaranteeing, auditing and accumulation, respectively.
17.1. Overview. The auditing process involves each node requiring itself to fetch, evaluate and issue judgment on a random but deterministic set of work-reports from each Jam chain block in which the work-report becomes available (i.e. from $W$). Prior to any evaluation, a node declares and proves its requirement. At specific common junctures in time thereafter, the set of work-reports which a node requires itself to evaluate from each block’s $W$ may be enlarged if any declared intentions are not matched by a positive judgment in a reasonable time or in the event of a negative judgment being seen. These enlargement events are called tranches.
If all declared intentions for a work-report are matched by a positive judgment at any given juncture, then the work-report is considered audited. Once all of any given block’s newly available work-reports are audited, then we consider the block to be audited. One prerequisite of a node finalizing a block is for it to view the block as audited. Note that while there will be eventual consensus on whether a block is audited, there may not be consensus at the time that the block gets finalized. This does not affect the crypto-economic guarantees of this system.
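The audited condition outlined in the overview is formalized later as equations 17.19 and 17.20, which additionally admit a work-report once more than two-thirds of the validator set has judged it positively. The following Python sketch illustrates that rule under stated assumptions: validators are represented by integer indices, judgments and audit assignments by plain sets, and the subset relation is read as non-strict. All function and parameter names are hypothetical.

```python
def report_audited(assigned_by_tranche, positive, negative, validator_count):
    """Sketch of the per-report audited condition (equation 17.19).

    assigned_by_tranche: list of sets; entry n holds the validators we believe
        are required to audit the report in tranche n.
    positive, negative: sets of validators whose positive / negative judgments we hold.
    validator_count: total number of validators, V.
    """
    # More than two-thirds of validators judged positively (kept in integer arithmetic).
    supermajority = 3 * len(positive) > 2 * validator_count
    # No negative judgment, and some tranche whose required auditors all judged positively.
    tranche_satisfied = not negative and any(
        assigned <= positive for assigned in assigned_by_tranche
    )
    return tranche_satisfied or supermajority


def block_audited(reports):
    """Sketch of equation 17.20: a block is audited when every newly available report is."""
    return all(report_audited(*report) for report in reports)
```

Because late-arriving announcements or judgments can change the inputs, a node would re-evaluate such a predicate whenever its view of the assignments or judgments changes.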
17.3. Selection of Reports. Each validator shall perform auditing duties on each valid block received. Since we are entering off-chain logic, and we cannot assume consensus, we henceforth consider ourselves a specific validator of index $v$ and assume ourselves focused on some recent block $B$ with other terms corresponding to the state-transition implied by that block, so $\rho$ is said block’s prior core-allocation, $\kappa$ is its prior validator set, $H$ is its header &c. Practically, all considerations must be replicated for all blocks and multiple blocks’ considerations may be underway simultaneously.
We define the sequence of work-reports which we may be required to audit as $Q$, a sequence of length equal to the number of cores, which functions as a mapping of core index to a work-report pending which has just become available, or $\emptyset$ if no report became available on the core.
(17.9) $S \equiv E_{\kappa[v]_e}\langle X_I\; n \frown x_n \frown \mathcal{H}(H) \rangle$
(17.10) where $x_n = \mathcal{E}([\mathcal{E}_2(c) \frown \mathcal{H}(w) \mid (c, w) \in a_n])$
(17.11) $X_I = \texttt{\$jam\_announce}$
We define $A_n$ as our perception of which validator is required to audit each of the work-reports (identified by their associated core) at tranche $n$. This comes from each other validators’ announcements (defined above). It cannot be correctly evaluated until $n$ is current. We have absolute knowledge about our own audit requirements.
(17.12) $A_n : W \to \wp\langle \mathbb{N}_V \rangle$
(17.13) $\forall (c, w) \in a_0 : v \in q_0(w)$
We further define $J_\top$ and $J_\bot$ to be the validator indices who we know to have made respectively, positive and negative, judgments mapped from each work-report’s
core. We don’t care from which tranche a judgment is made.
(17.14) $J_{\{\bot, \top\}} : W \to \wp\langle \mathbb{N}_V \rangle$
We are able to define $a_n$ for tranches beyond the first on the basis of the number of validators who we know are required to conduct an audit yet from whom we have not yet seen a judgment. It is possible that the late arrival of information alters $a_n$ and nodes should reevaluate and act accordingly should this happen.
We can thus define $a_n$ beyond the initial tranche through a new vrf which acts upon the set of no-show validators.
$\forall n > 0 :$
(17.15) $s_n(w) \in F^{[]}_{\kappa[v]_b}\langle X_U \frown \mathcal{Y}(H_v) \frown \mathcal{H}(w)\; n \rangle$
(17.16) $a_n \equiv \{ \tfrac{V}{256F}\, \mathcal{Y}(s_n(w))_0 < m_n \mid w \in Q, w \ne \emptyset \}$ where $m_n = |A_{n-1}(w) \setminus J_\top(w)|$
We define our bias factor $F = 2$, which is the expected number of validators which will be required to issue a judgment for a work-report given a single no-show in the tranche before. Modeling by Jeff Burdges, Cevallos, et al. 2024 shows that this is optimal.
Later audits must be announced in a similar fashion to the first. If audit requirements lessen on the receipt of new information (i.e. a positive judgment being returned for a previous no-show), then any audits already announced are completed and judgments published. If audit requirements rise on the receipt of new information (i.e. an additional announcement being found without an accompanying judgment), then we announce the additional audit(s) we will undertake.
As $n$ increases with the passage of time $a_n$ becomes known and defines our auditing responsibilities. We must attempt to reconstruct all work-packages and their requisite data corresponding to each work-report we must audit. This may be done through requesting erasure-coded chunks from one-third of the validators. It may also be short-cutted through asking a cooperative third-party (e.g. an original guarantor) for the preimages.
Thus, for any such work-report $w$ we are assured we will be able to fetch some candidate work-package encoding $F(w)$ which comes either from reconstructing erasure-coded chunks verified through the erasure coding’s Merkle root, or alternatively from the preimage of the work-package hash. We decode this candidate blob into a work-package.
In addition to the work-package, we also assume we are able to fetch all manifest data associated with it through requesting and reconstructing erasure-coded chunks from one-third of validators in the same way as above.
We then attempt to reproduce the report on the core to give $e_n$, a mapping from cores to evaluations:
(17.17) $\forall (c, w) \in a_n : e_n(w) \Leftrightarrow \begin{cases} w = \Xi(p, c) & \text{if } \exists p \in \mathbb{P} : \mathcal{E}(p) = F(w) \\ \bot & \text{otherwise} \end{cases}$
Note that a failure to decode implies an invalid work-report.
From this mapping the validator issues a set of judgments $j_n$:
(17.18) $j_n = \{ S_{\kappa[v]_e}(X_{e(w)} \frown \mathcal{H}(w)) \mid (c, w) \in a_n \}$
All judgments $j_*$ should be published to other validators in order that they build their view of $J$ and in the case of a negative judgment arising, can form an extrinsic for $E_D$.
We consider a work-report as audited under two circumstances. Either, when it has no negative judgments and there exists some tranche in which we see a positive judgment from all validators who we believe are required to audit it; or when we see positive judgments for it from greater than two-thirds of the validator set.
(17.19) $U(w) \Leftrightarrow \bigvee \begin{cases} J_\bot(w) = \emptyset \wedge \exists n : A_n(w) \subset J_\top(w) \\ |J_\top(w)| > 2/3\, V \end{cases}$
Our block $B$ may be considered audited, a condition denoted $U$, when all the work-reports which were made available are considered audited. Formally:
(17.20) $U \Leftrightarrow \forall w \in W : U(w)$
For any block we must judge it to be audited (i.e. $U = \top$) before we vote for the block to be finalized in Grandpa. See section 19 for more information here.
Furthermore, we pointedly disregard chains which include the accumulation of a report which we know at least 1/3 of validators judge as being invalid. Any chains including such a block are not eligible for authoring on. The best block, i.e. that on which we build new blocks, is defined as the chain with the most regular Safrole blocks which does not contain any such disregarded block. Implementation-wise, this may require reversion to an earlier head or alternative fork.
As a block author, we include a judgment extrinsic which collects judgment signatures together and reports them on-chain. In the case of a non-valid judgment (i.e. one which is not two-thirds-plus-one of judgments confirming validity) then this extrinsic will be introduced in a block in which accumulation of the non-valid work-report is about to take place. The non-valid judgment extrinsic removes it from the pending work-reports, $\rho$. Refer to section 10 for more details on this.
18. Beefy Distribution
For each finalized block $B$ which a validator imports, said validator shall make a bls signature on the bls12-381 curve, as defined by Hopwood et al. 2020, affirming the Keccak hash of the block’s most recent Beefy mmr. This should be published and distributed freely, along with the signed material. These signatures may be aggregated in order to provide concise proofs of finality to third-party systems. The signing and aggregation mechanism is defined fully by Jeff Burdges, Ciobotaru, et al. 2022.
Formally, let $F_v$ be the signed commitment of validator index $v$ which will be published:
(18.1) $F_v \equiv S_{\kappa'_v}(X_B \frown \mathcal{H}_K(\mathcal{E}_M(\mathrm{last}(\beta)_b)))$
(18.2) $X_B = \texttt{\$jam\_beefy}$
19. Grandpa and the Best Chain
Nodes take part in the Grandpa protocol as defined by Stewart and Kokoris-Kogia 2020.
We define the latest finalized block as $B^{\natural}$. All associated terms concerning block and state are similarly superscripted. We consider the best block, $B^{\flat}$ to be that which is drawn from the set of acceptable blocks of the following criteria:
● Has the finalized block as an ancestor.
● Contains no unfinalized blocks where we see an equivocation (two valid blocks at the same timeslot).
● Is considered audited.
Formally:
(19.1) $A(H^{\flat}) \ni H^{\natural}$
(19.2) $U^{\flat} \equiv \top$
(19.3) $\nexists H^A, H^B : \bigwedge \begin{cases} H^A \ne H^B \\ H^A_T = H^B_T \\ H^A \in A(H^{\flat}) \\ H^A \notin A(H^{\natural}) \end{cases}$
Of these acceptable blocks, that which contains the most ancestor blocks whose author used a seal-key ticket, rather than a fallback key should be selected as the best head, and thus the chain on which the participant should make Grandpa votes.
Formally, we aim to select $B^{\flat}$ to maximize the value $m$ where:
(19.4) $m = \sum_{H^A \in A^{\flat}} T^A$
20. Discussion
20.1. Technical Characteristics. In total, with our stated target of 1,023 validators and three validators per core, along with requiring a mean of ten audits per validator per timeslot, and thus 30 audits per work-report, Jam is capable of trustlessly processing and integrating 341 work-packages per timeslot.
We assume node hardware is a modern 16 core cpu with 64gb ram, 1tb secondary storage and 0.5gbe networking.
Our performance models assume a rough split of cpu time as follows:
                      Proportion
Audits                10/16
Merklization          1/16
Block execution       2/16
Grandpa and Beefy     1/16
Erasure coding        1/16
Networking & misc     1/16
Estimates for network bandwidth requirements are as follows:
                      Upload (mb/s)   Download (mb/s)
Guaranteeing          30              40
Assuring              60              56
Auditing              200             200
Block publication     42              42
Grandpa and Beefy     4               4
Total                 336             342
Thus, a connection able to sustain 500mb/s should leave a sufficient margin of error and headroom to serve other validators as well as some public connections, though the burstiness of block publication would imply validators are best to ensure that peak bandwidth is higher.
Under these conditions, we would expect an overall network-provided data availability capacity of 2pb, with each node dedicating at most 6tb to availability storage.
Estimates for memory usage are as follows:
                      gb
Auditing              20    (2 × 10 pvm instances)
Block execution       2     (1 pvm instance)
State cache           40
Misc                  2
Total                 64
As a rough guide, each parachain has an average footprint of around 2mb in the Polkadot Relay chain; a 40gb state would allow 20,000 parachains’ information to be retained in state.
What might be called the “virtual hardware” of a Jam core is essentially a regular cpu core executing at somewhere between 25% and 50% of regular speed for the whole six-second portion and which may draw and provide 2.5mb/s average in general-purpose i/o and utilize up to 2gb in ram. The i/o includes any trustless reads from the Jam chain state, albeit in the recent past. This virtual hardware also provides unlimited reads from a semi-static preimage-lookup database.
Each work-package may occupy this hardware and execute arbitrary code on it in six-second segments to create some result of at most 90kb. This work result is then entitled to 10ms on the same machine, this time with no “external” i/o beyond said result, but instead with full and immediate access to the Jam chain state and may alter the service(s) to which the results belong.
20.2. Illustrating Performance. In terms of pure processing power, the Jam machine architecture can deliver extremely high levels of homogeneous trustless computation. However, the core model of Jam is a classic parallelized compute architecture, and for solutions to be able to utilize the architecture well they must be designed with it in mind to some extent. Accordingly, until such use-cases appear on Jam with similar semantics to existing ones, it is very difficult to make direct comparisons to existing systems. That said, if we indulge ourselves with some assumptions then we can make some crude comparisons.
20.2.1. Comparison to Polkadot. Pre-asynchronous backing, Polkadot validates around 50 parachains, each one utilizing approximately 250ms of native computation (i.e. half a second of Wasm execution time at around a 50% overhead) and 5mb of i/o for every twelve seconds of real time which passes. This corresponds to an aggregate compute performance of around parity with a native cpu core and a total 24-hour distributed availability of around 20mb/s. Accumulation is beyond Polkadot’s capabilities and so not comparable.
Post asynchronous-backing and estimating that Polkadot is at present capable of validating at most 80 parachains each doing one second of native computation in every six, then the aggregate performance is increased
to around 13x native cpu and the distributed availability increased to around 67mb/s.
For comparison, in our basic models, Jam should be capable of attaining around 85x the computation load of a single native cpu core and a distributed availability of 852mb/s.
20.2.2. Simple Transfers. We might also attempt to model a simple transactions-per-second amount, with each transaction requiring a signature verification and the modification of two account balances. Once again, until there are clear designs for precisely how this would work we must make some assumptions. Our most naive model would be to use the Jam cores (i.e. refinement) simply for transaction verification and account lookups. The Jam chain would then hold and alter the balances in its state. This is unlikely to give great performance since almost all the needed i/o would be synchronous, but it can serve as a basis.
A 15mb work-package can hold around 125k transactions at 128 bytes per transaction. However, a 90kb work-result could only encode around 11k account updates when each update is given as a pair of a 4 byte account index and 4 byte balance, resulting in a limit of 5.5k transactions per package, or 312k tps in total. It is possible that the eight bytes could typically be compressed by a byte or two, increasing maximum throughput a little. Our expectations are that state updates, with highly parallelized Merklization, can be done at between 500k and 1 million reads/writes per second, implying around 250k-350k tps, depending on which turns out to be the bottleneck.
A more sophisticated model would be to use the Jam cores for balance updates as well as transaction verification. We would have to assume that state and the transactions which operate on them can be partitioned between work-packages with some degree of efficiency, and that the 15mb of the work-package would be split between transaction data and state witness data. Our basic models predict that a 4bn 32-bit account system paginated into $2^{10}$ accounts/page and 128 bytes per transaction could, assuming only around 1% of oraclized accounts were useful, average upwards of 1.7mtps depending on partitioning and usage characteristics. Partitioning could be done with a fixed fragmentation (essentially sharding state), a rotating partition pattern or a dynamic partitioning (which would require specialized sequencing).
Interestingly, we expect neither model to be bottlenecked in computation, meaning that transactions could be substantially more sophisticated, perhaps with more flexible cryptography or smart contract functionality, without a significant impact on performance.
20.2.3. Computation Throughput. The tps metric does not lend itself well to measuring distributed systems’ computational performance, so we now turn to another slightly more compute-focussed benchmark: the evm. The basic YP Ethereum network, now approaching a decade old, is probably the best known example of general purpose decentralized computation and makes for a reasonable yardstick. It is able to sustain a computation and i/o rate of 1.25M gas/sec, with a peak throughput of twice that. The evm gas metric was designed to be a time-proportional metric for predicting and constraining program execution.
Attempting to determine a concrete comparison to pvm throughput is non-trivial and necessarily opinionated owing to the disparity between the two platforms including word size, endianness and stack/register architecture and memory model. However, we will attempt to determine a reasonable range of values.
Evm gas does not directly translate into native execution as it also combines state reads and writes as well as transaction input data, implying it is able to process some combination of up to 595 storage reads, 57 storage writes and 1.25M gas as well as 78kb input data in each second, trading one against the other.¹³ We cannot find any analysis of the typical breakdown between storage i/o and pure computation, so to make a very conservative estimate, we assume it does all four. In reality, we would expect it to be able to do on average 1/4 of each.
Our experiments¹⁴ show that on modern, high-end consumer hardware with a modern evm implementation, we can expect somewhere between 100 and 500 gas/µs in throughput on pure-compute workloads (we specifically utilized Odd-Product, Triangle-Number and several implementations of the Fibonacci calculation). To make a conservative comparison to pvm, we propose transcompilation of the evm code into pvm code and then re-execution of it under the Polkavm prototype.¹⁵
To help estimate a reasonable lower-bound of evm gas/µs, e.g. for workloads which are more memory and i/o intensive, we look toward real-world permissionless deployments of the evm and see that the Moonbeam network, after correcting for the slowdown of executing within the recompiled WebAssembly platform on the somewhat conservative Polkadot hardware platform, implies a throughput of around 100 gas/µs. We therefore assert that in terms of computation, 1µs approximates to around 100-500 evm gas on modern high-end consumer hardware.¹⁶
Benchmarking and regression tests show that the prototype pvm engine has a fixed preprocessing overhead of around 5ns/byte of program code and, for arithmetic-heavy tasks at least, a marginal factor of 1.6-2% compared to evm execution, implying an asymptotic speedup of around 50-60x. For machine code 1mb in size expected to take of the order of a second to compute, the compilation cost becomes only 0.5% of the overall time.¹⁷ For code not inherently suited to the 256-bit evm isa, we would expect substantially improved relative execution times on pvm, though more work must be done in
13 The latest “proto-danksharding” changes allow it to accept 87.3kb/s in committed-to data, though this is not directly available within state, so we exclude it from this illustration, though including it with the input data would change the results little.
14 This is detailed at https://hackmd.io/@XXX9CM1uSSCWVNFRYaSB5g/HJarTUhJA and is intended to be updated as we get more information.
15 It is conservative since we don’t take into account that the source code was originally compiled into evm code and thus the pvm machine code will replicate architectural artifacts and thus is very likely to be pessimistic. As an example, all arithmetic operations in evm are 256-bit and 32-bit native pvm is being forced to honor this even if the source code only actually required 32-bit values.
16 We speculate that the substantial range could possibly be caused in part by the major architectural differences between the evm isa and typical modern hardware.
17 As an example, our odd-product benchmark, a very much pure-compute arithmetic task, takes 58s to execute on evm and 1.04s within our pvm prototype, including all preprocessing.
JAM: JOIN-ACCUMULATE MACHINE DRAFT 0.5.3 - December 20, 2024 33
order to gain confidence that these speed-ups are broadly applicable.
If we allow for preprocessing to take up to the same component within execution as the marginal cost (owing to, for example, an extremely large but short-running program) and for the pvm metering to imply a safety overhead of 2x to execution speeds, then we can expect a Jam core to be able to process the equivalent of around 1,500 evm gas/µs. Owing to the crudeness of our analysis we might reasonably predict it to be somewhere within a factor of three either way, i.e. 500-5,000 evm gas/µs.
Jam cores are each capable of 2.5mb/s bandwidth, which must include any state i/o and data which must be newly introduced (e.g. transactions). While writes come at comparatively little cost to the core, only requiring hashing to determine an eventual updated Merkle root, reads must be witnessed, with each one costing around 640 bytes of witness, conservatively assuming a one-million entry binary Merkle trie. This would result in a maximum of a little under 4k reads/second/core, with the exact amount dependent upon how much of the bandwidth is used for newly introduced input data.
Aggregating everything across Jam, excepting accumulation which could add further throughput, numbers can be multiplied by 341 (with the caveat that each core’s computation cannot interfere with any of the others’ except through state oraclization and accumulation). Unlike for roll-up chain designs such as Polkadot and Ethereum, there is no need to have persistently fragmented state. Smart-contract state may be held in a coherent format on the Jam chain so long as any updates are made through the 15kb/core/sec work results, which would need to contain only the hashes of the altered contracts’ state roots.
Under our modelling assumptions, we can therefore summarize:

                          Eth. L1    Jam Core     Jam
Compute (evm gas/µs)      1.25†      500-5,000    0.15-1.5m
State writes (s−1)        57†        n/a          n/a
State reads (s−1)         595†       4k‡          1.4m‡
Input data (s−1)          78kb†      2.5mb‡       852mb‡

What we can see is that Jam’s overall predicted performance profile implies it could be comparable to many thousands of times that of the basic Ethereum L1 chain. The large factor here is essentially due to three things: spatial parallelism, as Jam can host several hundred cores under its security apparatus; temporal parallelism, as Jam targets continuous execution for its cores and pipelines much of the computation between blocks to ensure a constant, optimal workload; and platform optimization by using a vm and gas model which closely fits modern hardware architectures.
It must however be understood that this is a provisional and crude estimation only. It is included only for the purpose of expressing Jam’s performance in tangible terms and is not intended as a means of comparing to a “full-blown” Ethereum/L2-ecosystem combination. Specifically, it does not take into account:
● that these numbers are based on real performance of Ethereum and performance modelling of Jam (though our models are based on real-world performance of the components);
● any L2 scaling which may be possible with either Jam or Ethereum;
● the state partitioning which use of Jam would imply;
● the as-yet unfixed gas model for the pvm;
● that pvm/evm comparisons are necessarily imprecise;
● (†) all figures for Ethereum L1 are drawn from the same resource: on average each figure will be only 1/4 of this maximum.
● (‡) the state reads and input data figures for Jam are drawn from the same resource: on average each figure will be only 1/2 of this maximum.
We leave as further work an empirical analysis of performance and an analysis and comparison between Jam and the aggregate of a hypothetical Ethereum ecosystem which included some maximal amount of L2 deployments together with full Dank-sharding and any other additional consensus elements which they would require. This, however, is out of scope for the present work.

21. Conclusion
We have introduced a novel computation model which is able to make use of pre-existing crypto-economic mechanisms in order to deliver major improvements in scalability without causing persistent state-fragmentation and thus sacrificing overall cohesion. We call this overall pattern collect-refine-join-accumulate. Furthermore, we have formally defined the on-chain portion of this logic, essentially the join-accumulate portion. We call this protocol the Jam chain.
We argue that the model of Jam provides a novel “sweet spot”, allowing for massive amounts of computation to be done in secure, resilient consensus compared to fully-synchronous models, and yet still have strict guarantees about both timing and integration of the computation into some singleton state machine unlike persistently fragmented models.
21.1. Further Work. While we are able to estimate theoretical computation possible given some basic assumptions and even make broad comparisons to existing systems, practical numbers are invaluable. We believe the model warrants further empirical research in order to better understand how these theoretical limits translate into real-world performance. We feel a proper cost analysis and comparison to pre-existing protocols would also be an excellent topic for further work.
We can be reasonably confident that the design of Jam allows it to host a service under which Polkadot parachains could be validated; however, further prototyping work is needed to understand the possible throughput which a pvm-powered metering system could support. We leave such a report as further work. Likewise, we have also intentionally omitted details of higher-level protocol elements including cryptocurrency, coretime sales, staking and regular smart-contract functionality.
A number of potential alterations to the protocol described here are being considered in order to make practical utilization of the protocol easier. These include:
● Synchronous calls between services in accumulate.
● Restrictions on the transfer function in order to allow for substantial parallelism over accumulation.
● The possibility of reserving substantial additional computation capacity during accumulate under certain conditions.
● Introducing Merklization into the Work Package format in order to obviate the need to have the whole package downloaded in order to evaluate its authorization.
The networking protocol is also left intentionally undefined at this stage and its description must be done in a follow-up proposal.
Validator performance is not presently tracked on-chain. We do expect this to be tracked on-chain in the final revision of the Jam protocol, but its specific format is not yet certain and it is therefore omitted at present.

22. Acknowledgements
Much of this present work is based in large part on the work of others. The Web3 Foundation research team and in particular Alistair Stewart and Jeff Burdges are responsible for Elves, the security apparatus of Polkadot which enables the possibility of in-core computation for Jam. The same team is responsible for Sassafras, Grandpa and Beefy.
Safrole is a mild simplification of Sassafras and was made under the careful review of Davide Galassi and Alistair Stewart.
The original CoreJam rfc was refined under the review of Bastian Köcher and Robert Habermeier and most of the key elements of that proposal have made their way into the present work.
The pvm is a formalization of a partially simplified PolkaVM software prototype, developed by Jan Bujak. Cyrill Leutwiler contributed to the empirical analysis of the pvm reported in the present work.
The PolkaJam team and in particular Arkadiy Paronyan, Emeric Chevalier and Dave Emett have been instrumental in the design of the lower-level aspects of the Jam protocol, especially concerning Merklization and i/o.
Numerous contributors to the repository since publication have helped correct errors. Thank you to all.
And, of course, thanks to the awesome Lemon Jelly, a.k.a. Fred Deakin and Nick Franglen, for three of the most beautiful albums ever produced, the cover art of the first of which was inspiration for this paper’s background art.
A.2. Instructions, Opcodes and Skip-distance. The program blob p is split into a series of octets which make
up the instruction data c and the opcode bitmask k as well as the dynamic jump table, j. The former two imply an
instruction sequence, and by extension a basic-block sequence, itself a sequence of indices of the instructions which follow
a block-termination instruction.
The latter, dynamic jump table, is a sequence of indices into the instruction data blob and is indexed into when
dynamically-computed jumps are taken. It is encoded as a sequence of natural numbers (i.e. non-negative integers) each
encoded with the same length in octets. This length, term z above, is itself encoded prior.
The pvm counts instructions in octet terms (rather than in terms of instructions) and it is thus convenient to define
which octets represent the beginning of an instruction, i.e. the opcode octet, and which do not. This is the purpose of k,
the instruction-opcode bitmask. We assert that the length of the bitmask is equal to the length of the instruction blob.
We define the Skip function skip which provides the number of octets, minus one, to the next instruction’s opcode, given the index of an instruction’s opcode in c (and by extension k):
$$\text{skip}\colon \left\{ \begin{aligned} \mathbb{N} &\to \mathbb{N} \\ i &\mapsto \min(24,\ j \in \mathbb{N} : (\mathbf{k} \frown [1, 1, \dots])_{i+1+j} = 1) \end{aligned} \right. \tag{A.2}$$
The Skip function appends k with a sequence of set bits in order to ensure a well-defined result for the final instruction
skip(∣c∣ − 1).
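As a minimal sketch of equation A.2 (not a reference implementation), the Skip function can be written as a bounded scan over the bitmask. The names `skip`, `k` and `i` mirror the text; representing the bitmask as a Python list of 0/1 integers is an assumption made purely for illustration.

```python
def skip(k, i):
    """Octets, minus one, from the opcode at index i to the next opcode (capped at 24)."""
    for j in range(24):
        pos = i + 1 + j
        # Positions past the end of k read as set bits, i.e. k ⌢ [1, 1, ...].
        if pos >= len(k) or k[pos] == 1:
            return j
    return 24


# Hypothetical bitmask: opcodes at indices 0, 3 and 4.
k = [1, 0, 0, 1, 1, 0]
distances = [skip(k, 0), skip(k, 3), skip(k, 4)]
```

Here the final instruction (index 4) still yields a well-defined result because the scan treats everything beyond the bitmask as set.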
Given some instruction-index i, its opcode is readily expressed as ci and the distance in octets to move forward to the
next instruction is 1 + skip(i). However, each instruction’s “length” (defined as the number of contiguous octets starting
with the opcode which are needed to fully define the instruction’s semantics) is left implicit though limited to being at
most 16.
We define ζ as being equivalent to the instructions c except with an indefinite sequence of zeroes suffixed to ensure that
no out-of-bounds access is possible. This effectively defines any otherwise-undefined arguments to the final instruction
and ensures that a trap will occur if the program counter passes beyond the program code. Formally:
(A.3) ζ ≡ c ⌢ [0, 0, . . . ]
A.3. Basic Blocks and Termination Instructions. Instructions of the following opcodes are considered basic-block
termination instructions; other than trap & fallthrough, they correspond to instructions which may define the instruction-
counter to be something other than its prior value plus the instruction’s skip amount:
● Trap and fallthrough: trap , fallthrough
● Jumps: jump , jump_ind
● Load-and-Jumps: load_imm_jump , load_imm_jump_ind
● Branches: branch_eq , branch_ne , branch_ge_u , branch_ge_s , branch_lt_u , branch_lt_s , branch_eq_imm ,
branch_ne_imm
● Immediate branches: branch_lt_u_imm , branch_lt_s_imm , branch_le_u_imm , branch_le_s_imm , branch_ge_u_imm ,
branch_ge_s_imm , branch_gt_u_imm , branch_gt_s_imm
We denote this set, as opcode indices rather than names, as T . We define the instruction opcode indices denoting the
beginning of basic-blocks as ϖ:
$$\varpi \equiv [0] \frown [n + 1 + \text{skip}(n) \mid n \leftarrow \mathbb{N}_{|\mathbf{c}|} \wedge \mathbf{k}_n = 1 \wedge \mathbf{c}_n \in T] \tag{A.4}$$
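The construction of ϖ in equation A.4 can be sketched as follows. This is an illustration only: `TERMINATORS` is a partial stand-in for the set T, containing just those termination opcodes whose numbers appear in the instruction tables of this section, and the example program octets are hypothetical.

```python
# Partial, illustrative stand-in for T: trap, fallthrough, jump,
# load_imm_jump, the immediate branches (81-90) and load_imm_jump_ind.
TERMINATORS = {0, 1, 40, 160} | set(range(80, 91))


def skip(k, i):
    """Skip distance as in equation A.2; bits past the end of k read as set."""
    for j in range(24):
        if i + 1 + j >= len(k) or k[i + 1 + j] == 1:
            return j
    return 24


def basic_block_starts(c, k):
    """Indices at which a basic block begins: 0, plus the index following each terminator."""
    starts = [0]
    for n in range(len(c)):
        if k[n] == 1 and c[n] in TERMINATORS:
            starts.append(n + 1 + skip(k, n))
    return starts


# Hypothetical program: load_imm (2 argument octets), jump (1 argument octet),
# fallthrough, trap.
c = [51, 7, 42, 40, 3, 1, 0]
k = [1, 0, 0, 1, 0, 1, 1]
starts = basic_block_starts(c, k)
```

Note that the last entry may point one past the end of the code, exactly as the formula allows; reads there are covered by the zero-suffixed ζ.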
A.4. Single-Step State Transition. We must now define the single-step pvm state-transition function Ψ1 :
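$$\Psi_1\colon \left\{ \begin{aligned} (\mathbb{Y}, \mathbb{B}, \llbracket\mathbb{N}_R\rrbracket, \mathbb{N}_R, \mathbb{N}_G, \llbracket\mathbb{N}_R\rrbracket_{13}, \mathbb{M}) &\to (\{\lightning, \blacksquare, \blacktriangleright\} \cup \{F, \hbar\} \times \mathbb{N}_R,\ \mathbb{N}_R,\ \mathbb{Z}_G,\ \llbracket\mathbb{N}_R\rrbracket_{13},\ \mathbb{M}) \\ (\mathbf{c}, \mathbf{k}, \mathbf{j}, \imath, \varrho, \omega, \mu) &\mapsto (\varepsilon, \imath', \varrho', \omega', \mu') \end{aligned} \right. \tag{A.5}$$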
We define ε together with the posterior values (denoted as prime) of each of the items of the machine state as being
in accordance with the table below.
In general, when transitioning machine state for an instruction a number of conditions hold true and instructions are defined essentially by their exceptions to these rules. Specifically, the machine does not halt, the instruction counter advances to the next instruction (one plus the skip distance), the gas remaining is reduced by the amount corresponding to the instruction type and ram & registers are unchanged. Formally:
(A.6) ε = ▸, ı′ = ı + 1 + skip(ı), ϱ′ = ϱ − ϱ∆, ω′ = ω, µ′ = µ except as indicated
Where ram must be inspected and yet access is not possible, then machine state is unchanged, and the exit reason is a fault with the lowest address to be read which is inaccessible. More formally, let a be the set of indices, modulo 2^32, in to which µ must be subscripted in order to calculate the result of Ψ1. If a ⊄ Vµ then let ε = F × min(a ∖ Vµ).
Similarly, where ram must be mutated and yet mutable access is not possible, then machine state is unchanged, and the exit reason is a fault with the lowest address to be written which is inaccessible. More formally, let a be the set of indices in to which µ′ must be subscripted in order to calculate the result of Ψ1. If a ⊄ V∗µ then let ε = F × min(a ∖ V∗µ).
We define signed/unsigned transitions for various octet widths:
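$$Z_{n \in \mathbb{N}}\colon \left\{ \begin{aligned} \mathbb{N}_{2^{8n}} &\to \mathbb{Z}_{-2^{8n-1} \dots 2^{8n-1}} \\ a &\mapsto \begin{cases} a & \text{if } a < 2^{8n-1} \\ a - 2^{8n} & \text{otherwise} \end{cases} \end{aligned} \right. \tag{A.7}$$
$$Z^{-1}_{n \in \mathbb{N}}\colon \left\{ \begin{aligned} \mathbb{Z}_{-2^{8n-1} \dots 2^{8n-1}} &\to \mathbb{N}_{2^{8n}} \\ a &\mapsto (2^{8n} + a) \bmod 2^{8n} \end{aligned} \right. \tag{A.8}$$
$$B_{n \in \mathbb{N}}\colon \left\{ \begin{aligned} \mathbb{N}_{2^{8n}} &\to \mathbb{B}_{8n} \\ x &\mapsto \mathbf{y} : \forall i \in \mathbb{N}_{8n} : \mathbf{y}[i] \Leftrightarrow \left\lfloor \frac{x}{2^i} \right\rfloor \bmod 2 \end{aligned} \right. \tag{A.9}$$
$$B^{-1}_{n \in \mathbb{N}}\colon \left\{ \begin{aligned} \mathbb{B}_{8n} &\to \mathbb{N}_{2^{8n}} \\ \mathbf{x} &\mapsto y : y = \sum_{i \in \mathbb{N}_{8n}} \mathbf{x}_i \cdot 2^i \end{aligned} \right. \tag{A.10}$$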
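The four conversions of equations A.7 to A.10 can be sketched in a few lines. The function names (`to_signed`, `to_unsigned`, `to_bits`, `from_bits`) are hypothetical labels for Z_n, Z_n^-1, B_n and B_n^-1, and bit sequences are assumed to be Python lists ordered least-significant bit first.

```python
def to_signed(n, a):
    """Z_n: reinterpret an n-octet unsigned value as two's-complement signed."""
    assert 0 <= a < 2 ** (8 * n)
    return a if a < 2 ** (8 * n - 1) else a - 2 ** (8 * n)


def to_unsigned(n, a):
    """Z_n^-1: map a signed value back into the n-octet unsigned range."""
    return (2 ** (8 * n) + a) % 2 ** (8 * n)


def to_bits(n, x):
    """B_n: the 8n bits of x, least-significant bit first."""
    return [(x >> i) & 1 for i in range(8 * n)]


def from_bits(bits):
    """B_n^-1: sum of bit_i * 2^i."""
    return sum(bit << i for i, bit in enumerate(bits))


examples = (to_signed(1, 255), to_unsigned(8, -1), to_bits(1, 5))
```

The bitwise instructions (and, xor, or) in the tables then amount to applying the boolean operator position by position on `to_bits(8, ...)` of each operand.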
Immediate arguments are encoded in little-endian format with the most-significant bit being the sign bit. They may be compactly encoded by eliding more significant octets. Elided octets are assumed to be zero if the msb of the value is zero, and 255 otherwise. This allows for compact representation of both positive and negative encoded values. We thus define the sign-extension function operating on an input of n octets as Xn:
$$X_{n \in \{0,1,2,3,4,8\}}\colon \left\{ \begin{aligned} \mathbb{N}_{2^{8n}} &\to \mathbb{N}_R \\ x &\mapsto x + \left\lfloor \frac{x}{2^{8n-1}} \right\rfloor (2^{64} - 2^{8n}) \end{aligned} \right. \tag{A.11}$$
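A small sketch of the sign-extension function X_n of equation A.11, assuming 64-bit registers as in the formula. The helper name `sign_extend` and the example immediate are illustrative; the n = 0 case is handled explicitly since the only possible input is then 0.

```python
def sign_extend(n, x):
    """X_n: extend an n-octet value to 64 bits, replicating its most-significant bit."""
    if n == 0:
        return 0  # the only value representable in zero octets
    assert 0 <= x < 2 ** (8 * n)
    msb = x >> (8 * n - 1)  # floor(x / 2^(8n-1)), either 0 or 1
    return x + msb * (2 ** 64 - 2 ** (8 * n))


# A one-octet immediate 0xFE stands for -2, i.e. 2^64 - 2 as a register value.
imm = int.from_bytes(b"\xfe", "little")
extended = sign_extend(1, imm)
```

This is equivalent to assuming the elided higher octets are 255 when the msb is set and zero otherwise.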
Any alterations of the program counter stemming from a static jump, call or branch must be to the start of a basic
block or else a panic occurs. Hypotheticals are not considered. Formally:
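$$\text{branch}(b, C) \implies (\varepsilon, \imath') = \begin{cases} (\blacktriangleright, \imath) & \text{if } \neg C \\ (\lightning, \imath) & \text{otherwise if } b \notin \varpi \\ (\blacktriangleright, b) & \text{otherwise} \end{cases} \tag{A.12}$$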
Jumps whose next instruction is dynamically computed must use an address which may be indexed into the jump-
table j. Through a quirk of tooling18, we define the dynamic address required by the instructions as the jump table index
incremented by one and then multiplied by our jump alignment factor ZA = 2.
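As with other irregular alterations to the program counter, the target code index must be the start of a basic block or else a panic occurs. Formally:
$$\text{djump}(a) \implies (\varepsilon, \imath') = \begin{cases} (\blacksquare, \imath) & \text{if } a = 2^{32} - 2^{16} \\ (\lightning, \imath) & \text{otherwise if } a = 0 \vee a > |\mathbf{j}| \cdot Z_A \vee a \bmod Z_A \neq 0 \vee \mathbf{j}_{(a / Z_A) - 1} \notin \varpi \\ (\blacktriangleright, \mathbf{j}_{(a / Z_A) - 1}) & \text{otherwise} \end{cases} \tag{A.13}$$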
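The case analysis of equation A.13 can be sketched as below. The string exit reasons (`"halt"`, `"panic"`, `"continue"`) and the argument names are illustrative stand-ins for ∎, ☇, ▸ and for j, ϖ and ı; only Z_A = 2 and the special address 2^32 − 2^16 are taken from the text.

```python
Z_A = 2  # jump alignment factor
HALT_ADDRESS = 2 ** 32 - 2 ** 16


def djump(a, jump_table, block_starts, pc):
    """Resolve a dynamic jump to address a; returns (exit_reason, new_pc)."""
    if a == HALT_ADDRESS:
        return ("halt", pc)
    if a == 0 or a > len(jump_table) * Z_A or a % Z_A != 0:
        return ("panic", pc)
    target = jump_table[a // Z_A - 1]  # index incremented by one, times Z_A
    if target not in block_starts:
        return ("panic", pc)
    return ("continue", target)


# Hypothetical jump table and basic-block starts.
table, starts = [5, 9], {0, 5}
outcomes = [djump(a, table, starts, pc=3) for a in (2, 4, 3, 0, 6, HALT_ADDRESS)]
```

Checking the alignment and range conditions before indexing keeps the table lookup well-defined, which matches the order in which the disjunction is written.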
A.5. Instruction Tables. Note that in the case that the opcode is not defined in the following tables then the instruction
is considered invalid, and it results in a panic; ε = ☇.
We assume the skip length ℓ is well-defined:
(A.14) ℓ ≡ skip(ı)
A.5.1. Instructions without Arguments.
18The popular code generation backend llvm requires and assumes in its code generation that dynamically computed jump destinations
always have a certain memory alignment. Since at present we depend on this for our tooling, we must acquiesce to its assumptions.
ζı Name ϱ∆ Mutations
0 trap 0 ε=☇
1 fallthrough 0
ζı Name ϱ∆ Mutations
10 ecalli 0 $\varepsilon = \hbar \times \nu_X$
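A.5.3. Instructions with Arguments of One Register and One Extended Width Immediate.
$$\text{let } r_A = \min(12, \zeta_{\imath+1} \bmod 16),\quad \omega'_A \equiv \omega'_{r_A},\quad \nu_X \equiv \mathcal{E}_8^{-1}(\zeta_{\imath+2 \cdots + 8}) \tag{A.16}$$
ζı Name ϱ∆ Mutations
20 load_imm_64 0 $\omega'_A = \nu_X$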
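ζı Name ϱ∆ Mutations
30 store_imm_u8 0 ${\mu'}^{\circlearrowleft}_{\nu_X} = \nu_Y \bmod 2^8$
31 store_imm_u16 0 ${\mu'}^{\circlearrowleft}_{\nu_X \cdots + 2} = \mathcal{E}_2(\nu_Y \bmod 2^{16})$
32 store_imm_u32 0 ${\mu'}^{\circlearrowleft}_{\nu_X \cdots + 4} = \mathcal{E}_4(\nu_Y \bmod 2^{32})$
33 store_imm_u64 0 ${\mu'}^{\circlearrowleft}_{\nu_X \cdots + 8} = \mathcal{E}_8(\nu_Y)$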
ζı Name ϱ∆ Mutations
40 jump 0 branch(νX , ⊺)
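ζı Name ϱ∆ Mutations
51 load_imm 0 $\omega'_A = \nu_X$
52 load_u8 0 $\omega'_A = \mu^{\circlearrowleft}_{\nu_X}$
53 load_i8 0 $\omega'_A = X_1(\mu^{\circlearrowleft}_{\nu_X})$
54 load_u16 0 $\omega'_A = \mathcal{E}_2^{-1}(\mu^{\circlearrowleft}_{\nu_X \cdots + 2})$
55 load_i16 0 $\omega'_A = X_2(\mathcal{E}_2^{-1}(\mu^{\circlearrowleft}_{\nu_X \cdots + 2}))$
56 load_u32 0 $\omega'_A = \mathcal{E}_4^{-1}(\mu^{\circlearrowleft}_{\nu_X \cdots + 4})$
57 load_i32 0 $\omega'_A = X_4(\mathcal{E}_4^{-1}(\mu^{\circlearrowleft}_{\nu_X \cdots + 4}))$
58 load_u64 0 $\omega'_A = \mathcal{E}_8^{-1}(\mu^{\circlearrowleft}_{\nu_X \cdots + 8})$
59 store_u8 0 ${\mu'}^{\circlearrowleft}_{\nu_X} = \omega_A \bmod 2^8$
60 store_u16 0 ${\mu'}^{\circlearrowleft}_{\nu_X \cdots + 2} = \mathcal{E}_2(\omega_A \bmod 2^{16})$
61 store_u32 0 ${\mu'}^{\circlearrowleft}_{\nu_X \cdots + 4} = \mathcal{E}_4(\omega_A \bmod 2^{32})$
62 store_u64 0 ${\mu'}^{\circlearrowleft}_{\nu_X \cdots + 8} = \mathcal{E}_8(\omega_A)$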
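ζı Name ϱ∆ Mutations
70 store_imm_ind_u8 0 ${\mu'}^{\circlearrowleft}_{\omega_A + \nu_X} = \nu_Y \bmod 2^8$
71 store_imm_ind_u16 0 ${\mu'}^{\circlearrowleft}_{\omega_A + \nu_X \cdots + 2} = \mathcal{E}_2(\nu_Y \bmod 2^{16})$
72 store_imm_ind_u32 0 ${\mu'}^{\circlearrowleft}_{\omega_A + \nu_X \cdots + 4} = \mathcal{E}_4(\nu_Y \bmod 2^{32})$
73 store_imm_ind_u64 0 ${\mu'}^{\circlearrowleft}_{\omega_A + \nu_X \cdots + 8} = \mathcal{E}_8(\nu_Y)$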
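A.5.8. Instructions with Arguments of One Register, One Immediate and One Offset.
$$\begin{aligned} &\text{let } r_A = \min(12, \zeta_{\imath+1} \bmod 16),\quad \omega_A \equiv \omega_{r_A},\quad \omega'_A \equiv \omega'_{r_A} \\ &\text{let } l_X = \min\left(4, \left\lfloor \frac{\zeta_{\imath+1}}{16} \right\rfloor \bmod 8\right),\quad \nu_X = X_{l_X}(\mathcal{E}_{l_X}^{-1}(\zeta_{\imath+2 \cdots + l_X})) \\ &\text{let } l_Y = \min(4, \max(0, \ell - l_X - 1)),\quad \nu_Y = \imath + Z_{l_Y}(\mathcal{E}_{l_Y}^{-1}(\zeta_{\imath+2+l_X \cdots + l_Y})) \end{aligned} \tag{A.21}$$
ζı Name ϱ∆ Mutations
80 load_imm_jump 0 $\text{branch}(\nu_Y, \top),\ \omega'_A = \nu_X$
81 branch_eq_imm 0 $\text{branch}(\nu_Y, \omega_A = \nu_X)$
82 branch_ne_imm 0 $\text{branch}(\nu_Y, \omega_A \neq \nu_X)$
83 branch_lt_u_imm 0 $\text{branch}(\nu_Y, \omega_A < \nu_X)$
84 branch_le_u_imm 0 $\text{branch}(\nu_Y, \omega_A \leq \nu_X)$
85 branch_ge_u_imm 0 $\text{branch}(\nu_Y, \omega_A \geq \nu_X)$
86 branch_gt_u_imm 0 $\text{branch}(\nu_Y, \omega_A > \nu_X)$
87 branch_lt_s_imm 0 $\text{branch}(\nu_Y, Z_8(\omega_A) < Z_8(\nu_X))$
88 branch_le_s_imm 0 $\text{branch}(\nu_Y, Z_8(\omega_A) \leq Z_8(\nu_X))$
89 branch_ge_s_imm 0 $\text{branch}(\nu_Y, Z_8(\omega_A) \geq Z_8(\nu_X))$
90 branch_gt_s_imm 0 $\text{branch}(\nu_Y, Z_8(\omega_A) > Z_8(\nu_X))$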
ζı Name ϱ∆ Mutations
100 move_reg 0 $\omega'_D = \omega_A$
101 sbrk 0 $\omega'_D \equiv \min(x \in \mathbb{N}_R) : x \geq h,\ \mathbb{N}_{x \cdots + \omega_A} \not\subseteq \mathbb{V}_\mu,\ \mathbb{N}_{x \cdots + \omega_A} \subseteq \mathbb{V}^*_{\mu'}$
Note, the term h above refers to the beginning of the heap, the second major section of memory as defined in equation A.34 as 2Z_Z + Q(∣o∣). If the sbrk instruction is invoked on a pvm instance which does not have such a memory layout, then h = 0.
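A.5.10. Instructions with Arguments of Two Registers & One Immediate.
$$\begin{aligned} &\text{let } r_A = \min(12, \zeta_{\imath+1} \bmod 16),\quad \omega_A \equiv \omega_{r_A},\quad \omega'_A \equiv \omega'_{r_A} \\ &\text{let } r_B = \min\left(12, \left\lfloor \frac{\zeta_{\imath+1}}{16} \right\rfloor\right),\quad \omega_B \equiv \omega_{r_B},\quad \omega'_B \equiv \omega'_{r_B} \\ &\text{let } l_X = \min(4, \max(0, \ell - 1)),\quad \nu_X \equiv X_{l_X}(\mathcal{E}_{l_X}^{-1}(\zeta_{\imath+2 \cdots + l_X})) \end{aligned} \tag{A.23}$$
ζı Name ϱ∆ Mutations
110 store_ind_u8 0 ${\mu'}^{\circlearrowleft}_{\omega_B + \nu_X} = \omega_A \bmod 2^8$
111 store_ind_u16 0 ${\mu'}^{\circlearrowleft}_{\omega_B + \nu_X \cdots + 2} = \mathcal{E}_2(\omega_A \bmod 2^{16})$
112 store_ind_u32 0 ${\mu'}^{\circlearrowleft}_{\omega_B + \nu_X \cdots + 4} = \mathcal{E}_4(\omega_A \bmod 2^{32})$
113 store_ind_u64 0 ${\mu'}^{\circlearrowleft}_{\omega_B + \nu_X \cdots + 8} = \mathcal{E}_8(\omega_A)$
114 load_ind_u8 0 $\omega'_A = \mu^{\circlearrowleft}_{\omega_B + \nu_X}$
115 load_ind_i8 0 $\omega'_A = Z_8^{-1}(Z_1(\mu^{\circlearrowleft}_{\omega_B + \nu_X}))$
116 load_ind_u16 0 $\omega'_A = \mathcal{E}_2^{-1}(\mu^{\circlearrowleft}_{\omega_B + \nu_X \cdots + 2})$
117 load_ind_i16 0 $\omega'_A = Z_8^{-1}(Z_2(\mathcal{E}_2^{-1}(\mu^{\circlearrowleft}_{\omega_B + \nu_X \cdots + 2})))$
118 load_ind_u32 0 $\omega'_A = \mathcal{E}_4^{-1}(\mu^{\circlearrowleft}_{\omega_B + \nu_X \cdots + 4})$
119 load_ind_i32 0 $\omega'_A = Z_8^{-1}(Z_4(\mathcal{E}_4^{-1}(\mu^{\circlearrowleft}_{\omega_B + \nu_X \cdots + 4})))$
120 load_ind_u64 0 $\omega'_A = \mathcal{E}_8^{-1}(\mu^{\circlearrowleft}_{\omega_B + \nu_X \cdots + 8})$
121 add_imm_32 0 $\omega'_A = X_4((\omega_B + \nu_X) \bmod 2^{32})$
122 and_imm 0 $\forall i \in \mathbb{N}_{64} : B_8(\omega'_A)_i = B_8(\omega_B)_i \wedge B_8(\nu_X)_i$
123 xor_imm 0 $\forall i \in \mathbb{N}_{64} : B_8(\omega'_A)_i = B_8(\omega_B)_i \oplus B_8(\nu_X)_i$
124 or_imm 0 $\forall i \in \mathbb{N}_{64} : B_8(\omega'_A)_i = B_8(\omega_B)_i \vee B_8(\nu_X)_i$
125 mul_imm_32 0 $\omega'_A = X_4((\omega_B \cdot \nu_X) \bmod 2^{32})$
126 set_lt_u_imm 0 $\omega'_A = \omega_B < \nu_X$
127 set_lt_s_imm 0 $\omega'_A = Z_8(\omega_B) < Z_8(\nu_X)$
128 shlo_l_imm_32 0 $\omega'_A = X_4((\omega_B \cdot 2^{\nu_X \bmod 32}) \bmod 2^{32})$
129 shlo_r_imm_32 0 $\omega'_A = X_4(\lfloor \omega_B \bmod 2^{32} \div 2^{\nu_X \bmod 32} \rfloor)$
130 shar_r_imm_32 0 $\omega'_A = Z_8^{-1}(\lfloor Z_4(\omega_B \bmod 2^{32}) \div 2^{\nu_X \bmod 32} \rfloor)$
131 neg_add_imm_32 0 $\omega'_A = X_4((\nu_X + 2^{32} - \omega_B) \bmod 2^{32})$
132 set_gt_u_imm 0 $\omega'_A = \omega_B > \nu_X$
133 set_gt_s_imm 0 $\omega'_A = Z_8(\omega_B) > Z_8(\nu_X)$
134 shlo_l_imm_alt_32 0 $\omega'_A = X_4((\nu_X \cdot 2^{\omega_B \bmod 32}) \bmod 2^{32})$
135 shlo_r_imm_alt_32 0 $\omega'_A = X_4(\lfloor \nu_X \div 2^{\omega_B \bmod 32} \rfloor)$
136 shar_r_imm_alt_32 0 $\omega'_A = Z_8^{-1}(\lfloor Z_4(\nu_X) \div 2^{\omega_B \bmod 32} \rfloor)$
137 cmov_iz_imm 0 $\omega'_A = \begin{cases} \nu_X & \text{if } \omega_B = 0 \\ \omega_A & \text{otherwise} \end{cases}$
138 cmov_nz_imm 0 $\omega'_A = \begin{cases} \nu_X & \text{if } \omega_B \neq 0 \\ \omega_A & \text{otherwise} \end{cases}$
139 add_imm_64 0 $\omega'_A = (\omega_B + \nu_X) \bmod 2^{64}$
140 mul_imm_64 0 $\omega'_A = (\omega_B \cdot \nu_X) \bmod 2^{64}$
141 shlo_l_imm_64 0 $\omega'_A = X_8((\omega_B \cdot 2^{\nu_X \bmod 64}) \bmod 2^{64})$
142 shlo_r_imm_64 0 $\omega'_A = X_8(\lfloor \omega_B \div 2^{\nu_X \bmod 64} \rfloor)$
143 shar_r_imm_64 0 $\omega'_A = Z_8^{-1}(\lfloor Z_8(\omega_B) \div 2^{\nu_X \bmod 64} \rfloor)$
144 neg_add_imm_64 0 $\omega'_A = (\nu_X + 2^{64} - \omega_B) \bmod 2^{64}$
145 shlo_l_imm_alt_64 0 $\omega'_A = (\nu_X \cdot 2^{\omega_B \bmod 64}) \bmod 2^{64}$
146 shlo_r_imm_alt_64 0 $\omega'_A = \lfloor \nu_X \div 2^{\omega_B \bmod 64} \rfloor$
147 shar_r_imm_alt_64 0 $\omega'_A = Z_8^{-1}(\lfloor Z_8(\nu_X) \div 2^{\omega_B \bmod 64} \rfloor)$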
ζı Name ϱ∆ Mutations
160 load_imm_jump_ind 0 $\text{djump}((\omega_B + \nu_Y) \bmod 2^{32}),\ \omega'_A = \nu_X$
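ζı Name ϱ∆ Mutations
170 add_32 0 $\omega'_D = X_4((\omega_A + \omega_B) \bmod 2^{32})$
171 sub_32 0 $\omega'_D = X_4((\omega_A + 2^{32} - (\omega_B \bmod 2^{32})) \bmod 2^{32})$
172 mul_32 0 $\omega'_D = X_4((\omega_A \cdot \omega_B) \bmod 2^{32})$
173 div_u_32 0 $\omega'_D = \begin{cases} 2^{64} - 1 & \text{if } \omega_B \bmod 2^{32} = 0 \\ \lfloor (\omega_A \bmod 2^{32}) \div (\omega_B \bmod 2^{32}) \rfloor & \text{otherwise} \end{cases}$
174 div_s_32 0 $\omega'_D = \begin{cases} 2^{64} - 1 & \text{if } b = 0 \\ a & \text{if } a = -2^{31} \wedge b = -1 \\ Z_8^{-1}(\lfloor a \div b \rfloor) & \text{otherwise} \end{cases}$ where $a = Z_4(\omega_A \bmod 2^{32})$, $b = Z_4(\omega_B \bmod 2^{32})$
175 rem_u_32 0 $\omega'_D = \begin{cases} X_4(\omega_A) & \text{if } \omega_B \bmod 2^{32} = 0 \\ X_4((\omega_A \bmod 2^{32}) \bmod (\omega_B \bmod 2^{32})) & \text{otherwise} \end{cases}$
176 rem_s_32 0 $\omega'_D = \begin{cases} Z_8^{-1}(a) & \text{if } b = 0 \\ 0 & \text{if } a = -2^{31} \wedge b = -1 \\ Z_8^{-1}(a \bmod b) & \text{otherwise} \end{cases}$ where $a = Z_4(\omega_A \bmod 2^{32})$, $b = Z_4(\omega_B \bmod 2^{32})$
177 shlo_l_32 0 $\omega'_D = X_4((\omega_A \cdot 2^{\omega_B \bmod 32}) \bmod 2^{32})$
178 shlo_r_32 0 $\omega'_D = X_4(\lfloor (\omega_A \bmod 2^{32}) \div 2^{\omega_B \bmod 32} \rfloor)$
179 shar_r_32 0 $\omega'_D = Z_8^{-1}(\lfloor Z_4(\omega_A \bmod 2^{32}) \div 2^{\omega_B \bmod 32} \rfloor)$
180 add_64 0 $\omega'_D = (\omega_A + \omega_B) \bmod 2^{64}$
181 sub_64 0 $\omega'_D = (\omega_A + 2^{64} - \omega_B) \bmod 2^{64}$
182 mul_64 0 $\omega'_D = (\omega_A \cdot \omega_B) \bmod 2^{64}$
183 div_u_64 0 $\omega'_D = \begin{cases} 2^{64} - 1 & \text{if } \omega_B = 0 \\ \lfloor \omega_A \div \omega_B \rfloor & \text{otherwise} \end{cases}$
184 div_s_64 0 $\omega'_D = \begin{cases} 2^{64} - 1 & \text{if } \omega_B = 0 \\ \omega_A & \text{if } Z_8(\omega_A) = -2^{63} \wedge Z_8(\omega_B) = -1 \\ Z_8^{-1}(\lfloor Z_8(\omega_A) \div Z_8(\omega_B) \rfloor) & \text{otherwise} \end{cases}$
185 rem_u_64 0 $\omega'_D = \begin{cases} \omega_A & \text{if } \omega_B = 0 \\ \omega_A \bmod \omega_B & \text{otherwise} \end{cases}$
186 rem_s_64 0 $\omega'_D = \begin{cases} \omega_A & \text{if } \omega_B = 0 \\ 0 & \text{if } Z_8(\omega_A) = -2^{63} \wedge Z_8(\omega_B) = -1 \\ Z_8^{-1}(Z_8(\omega_A) \bmod Z_8(\omega_B)) & \text{otherwise} \end{cases}$
187 shlo_l_64 0 $\omega'_D = (\omega_A \cdot 2^{\omega_B \bmod 64}) \bmod 2^{64}$
188 shlo_r_64 0 $\omega'_D = \lfloor \omega_A \div 2^{\omega_B \bmod 64} \rfloor$
189 shar_r_64 0 $\omega'_D = Z_8^{-1}(\lfloor Z_8(\omega_A) \div 2^{\omega_B \bmod 64} \rfloor)$
190 and 0 $\forall i \in \mathbb{N}_{64} : B_8(\omega'_D)_i = B_8(\omega_A)_i \wedge B_8(\omega_B)_i$
191 xor 0 $\forall i \in \mathbb{N}_{64} : B_8(\omega'_D)_i = B_8(\omega_A)_i \oplus B_8(\omega_B)_i$
192 or 0 $\forall i \in \mathbb{N}_{64} : B_8(\omega'_D)_i = B_8(\omega_A)_i \vee B_8(\omega_B)_i$
193 mul_upper_s_s 0 $\omega'_D = Z_8^{-1}(\lfloor (Z_8(\omega_A) \cdot Z_8(\omega_B)) \div 2^{64} \rfloor)$
194 mul_upper_u_u 0 $\omega'_D = \lfloor (\omega_A \cdot \omega_B) \div 2^{64} \rfloor$
195 mul_upper_s_u 0 $\omega'_D = Z_8^{-1}(\lfloor (Z_8(\omega_A) \cdot \omega_B) \div 2^{64} \rfloor)$
196 set_lt_u 0 $\omega'_D = \omega_A < \omega_B$
197 set_lt_s 0 $\omega'_D = Z_8(\omega_A) < Z_8(\omega_B)$
198 cmov_iz 0 $\omega'_D = \begin{cases} \omega_A & \text{if } \omega_B = 0 \\ \omega_D & \text{otherwise} \end{cases}$
199 cmov_nz 0 $\omega'_D = \begin{cases} \omega_A & \text{if } \omega_B \neq 0 \\ \omega_D & \text{otherwise} \end{cases}$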
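To make the table rows concrete, a handful of the three-register mutations can be sketched directly from their formulas. This is an illustration under assumptions, not the prototype's code: the helper names (`x4`, `z4`, `z8_inv`) are hypothetical labels for X_4, Z_4 and Z_8^-1, registers are modelled as Python integers in the range 0 to 2^64 − 1, and only rows whose rounding behaviour is unambiguous are shown.

```python
def x4(x):
    """X_4: sign-extend a 32-bit value to 64 bits."""
    return x + (x >> 31) * (2 ** 64 - 2 ** 32)


def z4(a):
    """Z_4: 32-bit unsigned to signed."""
    return a if a < 2 ** 31 else a - 2 ** 32


def z8_inv(a):
    """Z_8^-1: signed to 64-bit unsigned."""
    return (2 ** 64 + a) % 2 ** 64


def add_32(wa, wb):
    return x4((wa + wb) % 2 ** 32)


def sub_32(wa, wb):
    return x4((wa + 2 ** 32 - (wb % 2 ** 32)) % 2 ** 32)


def div_u_64(wa, wb):
    # Division by zero yields the all-ones register rather than trapping.
    return 2 ** 64 - 1 if wb == 0 else wa // wb


def shar_r_32(wa, wb):
    # Flooring a signed value divided by a power of two is an arithmetic shift.
    return z8_inv(z4(wa % 2 ** 32) // 2 ** (wb % 32))


def cmov_iz(wa, wb, wd):
    return wa if wb == 0 else wd


results = (add_32(0x7FFFFFFF, 1), sub_32(0, 1), div_u_64(7, 0), shar_r_32(0x80000000, 31))
```

The 32-bit results are sign-extended into the 64-bit register, which is why an overflowing `add_32` produces a value with all upper bits set.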
A.6. Host Call Definition. An extended version of the pvm invocation which is able to progress an inner host-call
state-machine in the case of a host-call halt condition is defined as ΨH :
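$$\Psi_H\colon \left\{ \begin{aligned} (\mathbb{Y}, \mathbb{N}_R, \mathbb{N}_G, \llbracket\mathbb{N}_R\rrbracket_{13}, \mathbb{M}, \Omega\langle X\rangle, X) &\to (\{\lightning, \infty, \blacksquare\} \cup \{F\} \times \mathbb{N}_R,\ \mathbb{N}_R,\ \mathbb{Z}_G,\ \llbracket\mathbb{N}_R\rrbracket_{13},\ \mathbb{M},\ X) \\ (\mathbf{c}, \imath, \varrho, \omega, \mu, f, x) &\mapsto \begin{cases} \text{let } (\varepsilon', \imath', \varrho', \omega', \mu') = \Psi(\mathbf{c}, \imath, \varrho, \omega, \mu) : \\ (\varepsilon', \imath', \varrho', \omega', \mu', x) & \text{if } \varepsilon' \in \{\blacksquare, \lightning, \infty\} \cup \{F\} \times \mathbb{N}_R \\ (F \times a, \imath', \varrho', \omega', \mu', x) & \text{if } \varepsilon' = \hbar \times h \wedge F \times a = f(h, \varrho', \omega', \mu', x) \\ \Psi_H(\mathbf{c}, \imath'', \varrho'', \omega'', \mu'', f, x'') & \text{if } \varepsilon' = \hbar \times h \wedge (\blacktriangleright, \varrho'', \omega'', \mu'', x'') = f(h, \varrho', \omega', \mu', x) \\ & \quad \text{where } \imath'' = \imath' + 1 + \text{skip}(\imath') \\ (\varepsilon'', \imath', \varrho'', \omega'', \mu'', x'') & \text{if } \varepsilon' = \hbar \times h \wedge (\varepsilon'', \varrho'', \omega'', \mu'', x'') = f(h, \varrho', \omega', \mu', x) \wedge \varepsilon'' \in \{\lightning, \blacksquare, \infty\} \end{cases} \end{aligned} \right. \tag{A.27}$$
$$\Omega\langle X\rangle \equiv (\mathbb{N}, \mathbb{N}_G, \llbracket\mathbb{N}_R\rrbracket_{13}, \mathbb{M}, X) \to (\{\blacktriangleright, \blacksquare, \lightning, \infty\} \cup \{F\} \times \mathbb{N}_R,\ \mathbb{N}_G,\ \llbracket\mathbb{N}_R\rrbracket_{13},\ \mathbb{M},\ X) \tag{A.28}$$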
On exit, the instruction counter ı′ references the instruction which caused the exit. Should the machine be invoked
again using this instruction counter and code, then the same instruction which caused the exit would be executed. This
is sensible when the instruction is one which necessarily needs re-executing such as in the case of an out-of-gas or page
fault reason.
However, when the exit reason to Ψ is a host-call h̵, then the resultant instruction-counter has the value of the host-call instruction and resuming with this state would immediately exit with the same result. Re-invoking would therefore
require both the post-host-call machine state and the instruction counter value for the instruction following the one which
resulted in the host-call exit reason. This is always one greater plus the relevant argument skip distance. Resuming the
machine with this instruction counter will continue beyond the host-call instruction.
We use both values of instruction-counter for the definition of ΨH since if the host-call results in a page fault we need
to allow the outer environment to resolve the fault and re-try the host-call. Conversely, if we successfully transition state
according to the host-call, then on resumption we wish to begin with the instruction directly following the host-call.
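The control flow of ΨH described above can be sketched as a loop around the inner machine. Everything here is a simplified model under stated assumptions: `run` stands in for Ψ, `host` for the dispatch function f, `skip_of` for the skip function, and exit reasons are modelled as strings or `("host", h)` / `("fault", address)` pairs; none of these names come from the specification.

```python
def psi_h(run, host, skip_of, pc, gas, regs, mem, ctx):
    """Run the machine, servicing host-calls until a non-host-call exit occurs."""
    while True:
        reason, pc, gas, regs, mem = run(pc, gas, regs, mem)
        if not (isinstance(reason, tuple) and reason[0] == "host"):
            return reason, pc, gas, regs, mem, ctx  # halt, panic, out-of-gas or fault
        outcome, new_gas, new_regs, new_mem, new_ctx = host(reason[1], gas, regs, mem, ctx)
        if isinstance(outcome, tuple) and outcome[0] == "fault":
            # Keep the pre-host-call state and counter so the call can be retried.
            return outcome, pc, gas, regs, mem, ctx
        gas, regs, mem, ctx = new_gas, new_regs, new_mem, new_ctx
        if outcome != "continue":
            return outcome, pc, gas, regs, mem, ctx
        pc = pc + 1 + skip_of(pc)  # resume at the instruction after the host-call


# Hypothetical machine: one host-call at index 10, then a halt.
calls = []

def fake_run(pc, gas, regs, mem):
    calls.append(pc)
    if len(calls) == 1:
        return ("host", 7), 10, gas - 5, regs, mem
    return "halt", pc, gas - 1, regs, mem

def fake_host(h, gas, regs, mem, ctx):
    return "continue", gas - 10, regs, mem, ctx + [h]

result = psi_h(fake_run, fake_host, lambda pc: 1, 0, 100, [0] * 13, {}, [])
```

An iterative loop is used instead of the recursion in the formal definition; the two are equivalent here since the recursive call is in tail position.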
A.7. Standard Program Initialization. The software programs which will run in each of the four instances where
the pvm is utilized in the main document have a very typical setup pattern characteristic of an output of a compiler and
linker. This means that ram has sections for program-specific read-only data, read-write (heap) data and the stack. An
adjunct to this, very typical of our usage patterns is an extra read-only section via which invocation-specific data may
be passed (i.e. arguments). It thus makes sense to define this properly in a single initializer function. These sections are
quantized into major zones, and one major zone is always left unallocated between sections in order to reduce accidental
overrun. Sections are padded with zeroes to the nearest pvm memory page boundary.
We thus define the standard program code format p, which includes not only the instructions and jump table (previ-
ously represented by the term c), but also information on the state of the ram at program start. Given some p which
is appropriately encoded together with some argument data a, we can define program code c, registers ω and ram µ
through the standard initialization decoder function Y :
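$$Y\colon \left\{ \begin{aligned} \mathbb{Y} &\to (\mathbb{Y}, \llbracket\mathbb{N}_R\rrbracket_{13}, \mathbb{M})? \\ \mathbf{p} &\mapsto \begin{cases} (\mathbf{c}, \omega, \mu) & \text{if } \exists!\,(\mathbf{c}, \mathbf{o}, \mathbf{w}, z, s) \text{ which satisfy equation A.30} \\ \emptyset & \text{otherwise} \end{cases} \end{aligned} \right. \tag{A.29}$$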
With conditions:
Thus, if the above conditions cannot be satisfied with unique values, then the result is ∅, otherwise it is a tuple of c as
above and µ, ω such that:
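$$\forall i \in \mathbb{N}_{2^{32}} : ((\mu_V)_i, (\mu_A)_{\lfloor i / Z_P \rfloor}) = \begin{cases} (V \blacktriangleright \mathbf{o}_{i - Z_Z},\ A \blacktriangleright R) & \text{if } Z_Z \leq i < Z_Z + |\mathbf{o}| \\ (0, R) & \text{if } Z_Z + |\mathbf{o}| \leq i < Z_Z + P(|\mathbf{o}|) \\ (\mathbf{w}_{i - (2Z_Z + Z(|\mathbf{o}|))}, W) & \text{if } 2Z_Z + Z(|\mathbf{o}|) \leq i < 2Z_Z + Z(|\mathbf{o}|) + |\mathbf{w}| \\ (0, W) & \text{if } 2Z_Z + Z(|\mathbf{o}|) + |\mathbf{w}| \leq i < 2Z_Z + Z(|\mathbf{o}|) + P(|\mathbf{w}|) + zZ_P \\ (0, W) & \text{if } 2^{32} - 2Z_Z - Z_I - P(s) \leq i < 2^{32} - 2Z_Z - Z_I \\ (\mathbf{a}_{i - (2^{32} - Z_Z - Z_I)}, R) & \text{if } 2^{32} - Z_Z - Z_I \leq i < 2^{32} - Z_Z - Z_I + |\mathbf{a}| \\ (0, R) & \text{if } 2^{32} - Z_Z - Z_I + |\mathbf{a}| \leq i < 2^{32} - Z_Z - Z_I + P(|\mathbf{a}|) \\ (0, \emptyset) & \text{otherwise} \end{cases} \tag{A.34}$$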
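$$\forall i \in \mathbb{N}_{13} : \omega_i = \begin{cases} 2^{32} - 2^{16} & \text{if } i = 0 \\ 2^{32} - 2Z_Z - Z_I & \text{if } i = 1 \\ 2^{32} - Z_Z - Z_I & \text{if } i = 7 \\ |\mathbf{a}| & \text{if } i = 8 \\ 0 & \text{otherwise} \end{cases} \tag{A.35}$$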
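The initial register file of equation A.35 is simple enough to sketch directly. The function name and parameters are illustrative; Z_Z and Z_I are passed in as arguments because their values are not stated in this section, and the numbers used in the example call are placeholders only.

```python
def initial_registers(arg_len, z_z, z_i):
    """Registers at program start: all zero except indices 0, 1, 7 and 8."""
    regs = [0] * 13
    regs[0] = 2 ** 32 - 2 ** 16          # address which halts when dynamically jumped to
    regs[1] = 2 ** 32 - 2 * z_z - z_i    # upper bound of the stack section
    regs[7] = 2 ** 32 - z_z - z_i        # start of the argument data
    regs[8] = arg_len                    # length of the argument data
    return regs


# Placeholder zone sizes, chosen only to make the example concrete.
regs = initial_registers(arg_len=5, z_z=2 ** 16, z_i=2 ** 24)
```

Register 0 holds the same value, 2^32 − 2^16, which equation A.13 treats as the halting address for dynamic jumps.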
A.8. Argument Invocation Definition. The four instances where the pvm is utilized each expect to be able to pass
argument data in and receive some return data back. We thus define the common pvm program-argument invocation
function ΨM :
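$$\Psi_M\colon \left\{ \begin{aligned} (\mathbb{Y}, \mathbb{N}_R, \mathbb{N}_G, \mathbb{Y}_{:Z_I}, \Omega\langle X\rangle, X) &\to (\mathbb{N}_G, \mathbb{Y} \cup \{\lightning, \infty\}, X) \\ (\mathbf{p}, \imath, \varrho, \mathbf{a}, f, x) &\mapsto \begin{cases} (\varrho, \lightning, x) & \text{if } Y(\mathbf{p}) = \emptyset \\ R(\Psi_H(\mathbf{c}, \imath, \varrho, \omega, \mu, f, x)) & \text{if } Y(\mathbf{p}) = (\mathbf{c}, \omega, \mu) \end{cases} \end{aligned} \right. \tag{A.36}$$
$$\text{where } R\colon (\varepsilon, \imath', \varrho', \omega', \mu', x) \mapsto \begin{cases} (\varrho', \infty, x) & \text{if } \varepsilon = \infty \\ (\varrho', \mu'_{\omega'_{10} \cdots + \omega'_{11}}, x) & \text{if } \varepsilon = \blacksquare \wedge \mathbb{Z}_{\omega'_{10} \cdots + \omega'_{11}} \subset \mathbb{V}_{\mu'} \\ (\varrho', [], x) & \text{if } \varepsilon = \blacksquare \wedge \mathbb{Z}_{\omega'_{10} \cdots + \omega'_{11}} \not\subset \mathbb{V}_{\mu'} \\ (\varrho', \lightning, x) & \text{otherwise} \end{cases} \tag{A.37}$$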
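$$\Psi_I\colon \left\{ \begin{aligned} (\mathbb{P}, \mathbb{N}_C) &\to \mathbb{Y} \cup \mathbb{J} \\ (p, c) &\mapsto r \text{ where } (g, r, \emptyset) = \Psi_M(p_c, 0, G_I, \mathcal{E}(p, c), F, \emptyset) \end{aligned} \right. \tag{B.1}$$
$$F \in \Omega\langle\{\}\rangle\colon (n, \varrho, \omega, \mu) \mapsto \begin{cases} \Omega_G(\varrho, \omega, \mu) & \text{if } n = \mathtt{gas} \\ (\blacktriangleright, \varrho - 10, [\omega_0, \dots, \omega_6, \mathtt{WHAT}, \omega_8, \dots], \mu) & \text{otherwise} \end{cases} \tag{B.2}$$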
Note for the Is-Authorized host-call dispatch function F in equation B.2, we elide the host-call context since, being
essentially stateless, it is always ∅.
B.3. Refine Invocation. We define the Refine service-account invocation function as ΨR . It has no general access to
the state of the Jam chain, with the slight exception being the ability to make a historical lookup. Beyond this it is able
to create inner instances of the pvm and dictate pieces of data to export.
The historical-lookup host-call function, ΩH , is designed to give the same result regardless of the state of the chain for
any time when auditing may occur (which we bound to be less than two epochs from being accumulated). The lookup
anchor may be up to L timeslots before the recent history and therefore adds to the potential age at the time of audit.
We therefore set D = 4,800, a safe amount of eight hours.
The inner pvm invocation host-calls, meanwhile, depend on an integrated pvm type, which we shall denote M. It
holds some program code, instruction counter and ram:
$$\mathbf{M} \equiv \left\lgroup \mathbf{p} \in \mathbb{Y},\ \mathbf{u} \in \mathbb{M},\ i \in \mathbb{N}_R \right\rgroup \tag{B.3}$$
The Export host-call depends on two pieces of context: one is a sequence of segments (blobs of length WG) to which it
may append, and the other is an argument passed to the invocation function to dictate the number of segments which
may be assumed to have already been appended. The latter value ensures that an accurate segment index can be provided
to the caller.
Unlike the other invocation functions, the Refine invocation function implicitly draws upon some recent service account
state item δ. The specific block from which this comes is not important, as long as it is no earlier than its work-package’s
lookup-anchor block. It explicitly accepts the work payload, y, together with the service index which is the subject of
refinement s, the prediction of the hash of that service’s code c at the time of reporting, the hash of the containing
work-package p, the refinement context c, the authorizer hash a and its output o, and an export segment offset ς, the
import segments and extrinsic data blobs as dictated by the work-item, i and x. It results in either some error J or a
pair of the refinement output blob and the export sequence. Formally:
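\begin{equation}\tag{B.4}
\Psi_R\colon\begin{cases}
(\mathbb{H}, \mathbb{N}_G, \mathbb{N}_S, \mathbb{H}, \mathbb{Y}, \mathbb{X}, \mathbb{H}, \mathbb{Y}, \llbracket\mathbb{G}\rrbracket, \llbracket\mathbb{Y}\rrbracket, \mathbb{N}) \to (\mathbb{Y}\cup\mathbb{J}, \llbracket\mathbb{Y}\rrbracket)\\
(c, g, s, p, y, \mathbf{c}, a, o, i, \mathbf{x}, \varsigma) \mapsto \begin{cases}
(\mathtt{BAD}, []) & \text{if } s \notin \mathcal{K}(\delta) \vee \Lambda(\delta[s], \mathbf{c}_t, c) = \varnothing\\
(\mathtt{BIG}, []) & \text{otherwise if } |\Lambda(\delta[s], \mathbf{c}_t, c)| > W_C\\
\text{otherwise:}\\
\quad\text{let } \mathbf{a} = \mathcal{E}(s, y, p, \mathbf{c}, a, \updownarrow o, \updownarrow[\updownarrow x \mid x \leftarrow \mathbf{x}]),\\
\quad\text{and } (g, r, (m, e)) = \Psi_M(\Lambda(\delta[s], \mathbf{c}_t, c), 0, g, \mathbf{a}, F, (\varnothing, []))\colon\\
\quad(r, []) & \text{if } r \in \{\infty, \text{☇}\}\\
\quad(r, e) & \text{otherwise}
\end{cases}
\end{cases}
\end{equation}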
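\begin{equation}\tag{B.5}
F \in \Omega\langle(\mathbb{D}\langle\mathbb{N}\to\mathbb{M}\rangle, \llbracket\mathbb{Y}\rrbracket)\rangle\colon (n, \varrho, \omega, \mu, (m, e)) \mapsto \begin{cases}
\Omega_H(\varrho, \omega, \mu, (m, e), s, \delta, \mathbf{c}_t) & \text{if } n = \mathtt{historical\_lookup}\\
\Omega_Y(\varrho, \omega, \mu, (m, e), i) & \text{if } n = \mathtt{import}\\
\Omega_E(\varrho, \omega, \mu, (m, e), \varsigma) & \text{if } n = \mathtt{export}\\
\Omega_G(\varrho, \omega, \mu, (m, e)) & \text{if } n = \mathtt{gas}\\
\Omega_M(\varrho, \omega, \mu, (m, e)) & \text{if } n = \mathtt{machine}\\
\Omega_P(\varrho, \omega, \mu, (m, e)) & \text{if } n = \mathtt{peek}\\
\Omega_Z(\varrho, \omega, \mu, (m, e)) & \text{if } n = \mathtt{zero}\\
\Omega_O(\varrho, \omega, \mu, (m, e)) & \text{if } n = \mathtt{poke}\\
\Omega_V(\varrho, \omega, \mu, (m, e)) & \text{if } n = \mathtt{void}\\
\Omega_K(\varrho, \omega, \mu, (m, e)) & \text{if } n = \mathtt{invoke}\\
\Omega_X(\varrho, \omega, \mu, (m, e)) & \text{if } n = \mathtt{expunge}\\
(\blacktriangleright, \varrho - 10, \omega', \mu) & \text{otherwise}\\
\quad\text{where } \omega' = \omega \text{ except } \omega'_7 = \mathtt{WHAT}
\end{cases}
\end{equation}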
B.4. Accumulate Invocation. Since this is a transition which can directly affect a substantial amount of on-chain
state, our invocation context is accordingly complex. It is a tuple with elements for each of the aspects of state which
can be altered through this invocation. Beyond the account of the service itself, it includes the deferred transfer list and
several dictionaries for alterations to preimage lookup state, core assignments, validator key assignments, newly created
accounts and alterations to account privilege levels.
Formally, we define our result context to be X, and our invocation context to be a pair of these contexts, X × X,
with one dimension being the regular dimension and generally named x and the other being the exceptional dimension
and being named y. The only function which actually alters this second dimension is checkpoint, ΩC and so it is rarely
seen.
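\begin{equation}\tag{B.6}
\mathbb{X} \equiv \left\lgroup d \in \mathbb{D}\langle\mathbb{N}_S\to\mathbb{A}\rangle,\ s \in \mathbb{N}_S,\ u \in \mathbb{U},\ i \in \mathbb{N}_S,\ t \in \llbracket\mathbb{T}\rrbracket \right\rgroup
\end{equation}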
For all such contexts, we define a convenience equivalence, xs , which is the accumulating service account, as found in
the dictionary of (xu )d at the index xs .
We track both regular and exceptional dimensions within our context mutator, but collapse the result of the invocation
to one or the other depending on whether the termination was regular or exceptional (i.e. out-of-gas or panic).
We define ΨA , the Accumulation invocation function as:
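\begin{equation}\tag{B.8}
\Psi_A\colon\begin{cases}
\left\lgroup\mathbb{U}, \mathbb{N}_T, \mathbb{N}_S, \mathbb{N}_G, \llbracket\mathbb{O}\rrbracket\right\rgroup \to \left\lgroup\mathbb{U}, \llbracket\mathbb{T}\rrbracket, \mathbb{H}?, \mathbb{N}_G\right\rgroup\\
(u, t, s, g, o) \mapsto \begin{cases}
\left\lgroup I(u, s)_u, [], \varnothing, 0\right\rgroup & \text{if } u_d[s]_c = \varnothing\\
C(\Psi_M(u_d[s]_c, 5, g, \mathcal{E}(t, s, \updownarrow o), F, \left\lgroup I(u, s), I(u, s)\right\rgroup)) & \text{otherwise}
\end{cases}
\end{cases}
\end{equation}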
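\begin{equation}\tag{B.9}
I\colon\begin{cases}
(\mathbb{U}, \mathbb{N}_S) \to \mathbb{X}\\
(u, s) \mapsto \left\lgroup d\,{\blacktriangleright\!\blacktriangleright}\,\mathbf{d}\setminus\{s\},\ s,\ u\,{\blacktriangleright\!\blacktriangleright}\left\lgroup d\,{\blacktriangleright\!\blacktriangleright}\,\{s\mapsto\mathbf{d}[s]\},\ x\,{\blacktriangleright\!\blacktriangleright}\,u_x,\ i\,{\blacktriangleright\!\blacktriangleright}\,u_i,\ q\,{\blacktriangleright\!\blacktriangleright}\,u_q\right\rgroup,\ i,\ t\,{\blacktriangleright\!\blacktriangleright}\,[]\right\rgroup\\
\quad\text{where } i = \text{check}\big((\mathcal{E}_4^{-1}(\mathcal{H}(\mathcal{E}(s, \eta'_0, H_t))) \bmod (2^{32} - 2^9)) + 2^8\big)
\end{cases}
\end{equation}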
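\begin{equation}\tag{B.10}
F \in \Omega\langle(\mathbb{X}, \mathbb{X})\rangle\colon (n, \varrho, \omega, \mu, (x, y)) \mapsto \begin{cases}
G(\Omega_R(\varrho, \omega, \mu, \mathbf{s}, x_s, \mathbf{d}), (x, y)) & \text{if } n = \mathtt{read}\\
G(\Omega_W(\varrho, \omega, \mu, \mathbf{s}, x_s), (x, y)) & \text{if } n = \mathtt{write}\\
G(\Omega_L(\varrho, \omega, \mu, \mathbf{s}, x_s, \mathbf{d}), (x, y)) & \text{if } n = \mathtt{lookup}\\
\Omega_G(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{gas}\\
G(\Omega_I(\varrho, \omega, \mu, x_s, \mathbf{d}), (x, y)) & \text{if } n = \mathtt{info}\\
\Omega_B(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{bless}\\
\Omega_A(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{assign}\\
\Omega_D(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{designate}\\
\Omega_C(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{checkpoint}\\
\Omega_N(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{new}\\
\Omega_U(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{upgrade}\\
\Omega_T(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{transfer}\\
\Omega_Q(\varrho, \omega, \mu, (x, y)) & \text{if } n = \mathtt{quit}\\
\Omega_S(\varrho, \omega, \mu, (x, y), H_t) & \text{if } n = \mathtt{solicit}\\
\Omega_F(\varrho, \omega, \mu, (x, y), H_t) & \text{if } n = \mathtt{forget}\\
(\blacktriangleright, \varrho - 10, [\omega_0, \dots, \omega_6, \mathtt{WHAT}, \omega_8, \dots], \mu, x) & \text{otherwise}\\
\quad\text{where } \mathbf{d} = (x_u)_d \cup x_d,\ \mathbf{s} = (x_u)_d[x_s]
\end{cases}
\end{equation}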
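\begin{equation}\tag{B.11}
G\colon\begin{cases}
((\{\blacktriangleright, \blacksquare, \text{☇}, \infty\}, \mathbb{N}_G, \llbracket\mathbb{N}_R\rrbracket_{13}, \mathbb{M}, \mathbb{A}), (\mathbb{X}, \mathbb{X})) \to (\{\blacktriangleright, \blacksquare, \text{☇}, \infty\}, \mathbb{N}_G, \llbracket\mathbb{N}_R\rrbracket_{13}, \mathbb{M}, (\mathbb{X}, \mathbb{X}))\\
((\varepsilon, \varrho, \omega, \mu, \mathbf{s}), (x, y)) \mapsto (\varepsilon, \varrho, \omega, \mu, (x^*, y))\\
\quad\text{where } x^* = x \text{ except } (x^*_u)_d[x^*_s] = \mathbf{s}
\end{cases}
\end{equation}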
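\begin{equation}\tag{B.12}
C\colon\begin{cases}
(\mathbb{N}_G, \mathbb{Y}\cup\{\infty, \text{☇}\}, (\mathbb{X}, \mathbb{X})) \to (\mathbb{U}, \llbracket\mathbb{T}\rrbracket, \mathbb{H}?, \mathbb{N}_G)\\
(g, o, (x, y)) \mapsto \begin{cases}
\left\lgroup x_u, x_t, o, g\right\rgroup & \text{if } o \in \mathbb{H}\\
\left\lgroup x_u, x_t, \varnothing, g\right\rgroup & \text{if } o \in \mathbb{Y}\setminus\mathbb{H}\\
\left\lgroup y_u, y_t, \varnothing, g\right\rgroup & \text{if } o \in \{\infty, \text{☇}\}
\end{cases}
\end{cases}
\end{equation}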
The mutator F governs how this context will alter for any given parameterization, and the collapse function C selects
one of the two dimensions of context depending on whether the virtual machine’s halt was regular or exceptional.
The initializer function I maps some service account s along with its index s to yield a mutator context such that no
alterations to state are implied (beyond those already inherent in s) in either exit scenario. Note that the component i
utilizes the random accumulator η0 and the block’s timeslot Ht to create a deterministic sequence of identifiers which
are extremely likely to be unique.
Concretely, we create the identifier from the Blake2 hash of the identifier of the creating service, the current random
accumulator η0 and the block’s timeslot. Thus, within a service’s accumulation it is almost certainly unique, but it is
not necessarily unique across all services, nor at all times in the past. We utilize a check function to find the first such
index in this sequence which does not already represent a service:
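\begin{equation}\tag{B.13}
\text{check}(i \in \mathbb{N}_S) \equiv \begin{cases}
i & \text{if } i \notin \mathcal{K}(u_d)\\
\text{check}((i - 2^8 + 1) \bmod (2^{32} - 2^9) + 2^8) & \text{otherwise}
\end{cases}
\end{equation}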
nb In the highly unlikely event that a block executes to find that a single service index has inadvertently been attached
to two different services, then the block is considered invalid. Since no service can predict the identifier sequence ahead
of time, they cannot intentionally disadvantage the block author.
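The index search of equation B.13 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names `initial_candidate`, `check` and `existing` are hypothetical, the recursion of B.13 is unrolled into a loop, and we assume that the inverse four-octet decoding in B.9 is applied to the first four octets of the hash.

```python
SERVICE_ID_FLOOR = 2**8          # lowest index the sequence can yield
SERVICE_ID_SPAN = 2**32 - 2**9   # modulus appearing in equations B.9 and B.13


def initial_candidate(digest: bytes) -> int:
    """First candidate index, following the where-clause of equation B.9.

    Assumption: the little-endian decoding uses the first four octets of the digest.
    """
    return int.from_bytes(digest[:4], "little") % SERVICE_ID_SPAN + SERVICE_ID_FLOOR


def check(i: int, existing: set[int]) -> int:
    """Return the first index of the sequence starting at i that is not in use (B.13)."""
    while i in existing:
        # Step to the next index, wrapping around inside the permitted range.
        i = (i - SERVICE_ID_FLOOR + 1) % SERVICE_ID_SPAN + SERVICE_ID_FLOOR
    return i
```

Under these assumptions every returned index lies between 2^8 and 2^32 − 2^8 − 1 inclusive, and the search wraps from the top of that range back to 2^8.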
B.5. On-Transfer Invocation. We define the On-Transfer service-account invocation function as ΨT; it is somewhat
similar to the Accumulation Invocation except that the only state alterations it facilitates are basic alterations to the
storage of the subject account. No further transfers may be made, no privileged operations are possible, no new accounts
may be created, nor may other operations be done on the subject account itself. The function is defined as:
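\begin{equation}\tag{B.14}
\Psi_T\colon\begin{cases}
(\mathbb{D}\langle\mathbb{N}_S\to\mathbb{A}\rangle, \mathbb{N}_T, \mathbb{N}_S, \llbracket\mathbb{T}\rrbracket) \to \mathbb{A}\\
(\mathbf{d}, t, s, \mathbf{t}) \mapsto \begin{cases}
\mathbf{s} & \text{if } \mathbf{s}_c = \varnothing \vee \mathbf{t} = []\\
\mathbf{s}' \text{ where } (g, r, \mathbf{s}') = \Psi_M(\mathbf{s}_c, 10, \sum_{r\in\mathbf{t}}(r_g), \mathcal{E}(t, s, \updownarrow\mathbf{t}), F, \mathbf{s}) & \text{otherwise}
\end{cases}
\end{cases}
\end{equation}
\begin{equation}\tag{B.15}
\text{where } \mathbf{s} = \mathbf{d}[s] \text{ except } \mathbf{s}_b = \mathbf{d}[s]_b + \sum_{r\in\mathbf{t}} r_a
\end{equation}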
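\begin{equation}\tag{B.16}
F \in \Omega\langle\mathbb{A}\rangle\colon (n, \varrho, \omega, \mu, \mathbf{s}) \equiv \begin{cases}
\Omega_L(\varrho, \omega, \mu, \mathbf{s}, s, \mathbf{d}) & \text{if } n = \mathtt{lookup}\\
\Omega_R(\varrho, \omega, \mu, \mathbf{s}, s, \mathbf{d}) & \text{if } n = \mathtt{read}\\
\Omega_W(\varrho, \omega, \mu, \mathbf{s}, s) & \text{if } n = \mathtt{write}\\
\Omega_G(\varrho, \omega, \mu) & \text{if } n = \mathtt{gas}\\
\Omega_I(\varrho, \omega, \mu, s, \mathbf{d}) & \text{if } n = \mathtt{info}\\
(\blacktriangleright, \varrho - 10, [\omega_0, \dots, \omega_6, \mathtt{WHAT}, \omega_8, \dots], \mu, \mathbf{s}) & \text{otherwise}
\end{cases}
\end{equation}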
B.6. General Functions. We come now to defining the host functions which are utilized by the pvm invocations.
Generally, these map some pvm state, including invocation context, possibly together with some additional parameters,
to a new pvm state.
The general functions are all broadly of the form (ϱ′ ∈ ZG , ω ′ ∈ ⟦NR ⟧13 , µ′ , s′ ) = Ω◻ (ϱ ∈ NG , ω ∈ ⟦NR ⟧13 , µ ∈ M, s ∈ A, . . . ).
Functions which have a result component which is equivalent to the corresponding argument may have said components
elided in the description. Functions may also depend upon particular additional parameters.
Unlike the Accumulate functions in appendix B.7, these do not mutate an accumulation context, but merely a service
account s.
The gas function, ΩG has a parameter list suffixed with an ellipsis to denote that any additional parameters may be
taken and are provided transparently into its result. This allows it to be easily utilized in multiple pvm invocations.
Other than the gas-counter which is explicitly defined, elements of pvm state are each assumed to remain unchanged
by the host-call unless explicitly specified.
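\begin{equation}\tag{B.17}
\varrho' \equiv \varrho - g
\end{equation}
\begin{equation}\tag{B.18}
(\varepsilon', \omega', \mu', \mathbf{s}') \equiv \begin{cases}
(\infty, \omega, \mu, \mathbf{s}) & \text{if } \varrho < g\\
(\blacktriangleright, \omega, \mu, \mathbf{s}) \text{ except as indicated below} & \text{otherwise}
\end{cases}
\end{equation}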
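As a small illustration of equations B.17 and B.18, the gas bookkeeping shared by the general functions can be sketched in Python. The function name `charge` and the string stand-ins for the exit markers ▸ and ∞ are assumptions made for this sketch only.

```python
from typing import Tuple

CONTINUE = "continue"       # stands in for the marker ▸
OUT_OF_GAS = "out-of-gas"   # stands in for the marker ∞


def charge(gas: int, cost: int) -> Tuple[str, int]:
    """Deduct a host-call cost g from the gas counter.

    The new counter is always the old counter minus the cost (B.17), so it may
    become negative; the call is out-of-gas exactly when the old counter was
    smaller than the cost (B.18).
    """
    remaining = gas - cost
    status = OUT_OF_GAS if gas < cost else CONTINUE
    return status, remaining
```

For example, `charge(100, 10)` yields the continue marker with 90 gas remaining, while `charge(5, 10)` yields the out-of-gas marker with a counter of −5.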
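Function: ΩG(ϱ, ω, . . . )
Identifier: gas = 0
Gas usage: g = 10
Mutations:
\begin{align*}
&\omega'_7 \equiv \varrho'
\end{align*}

Function: ΩL(ϱ, ω, µ, s, s, d)
Identifier: lookup = 1
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } \mathbf{a} = \begin{cases}\mathbf{s} & \text{if } \omega_7 \in \{s, 2^{64} - 1\}\\ \mathbf{d}[\omega_7] & \text{otherwise}\end{cases}\\
&\text{let } [h_o, b_o, b_z] = \omega_{8..11}\\
&\text{let } h = \begin{cases}\mathcal{H}(\mu_{h_o\cdots+32}) & \text{if } \mathbb{Z}_{h_o\cdots+32} \subset \mathbb{V}_\mu\\ \nabla & \text{otherwise}\end{cases}\\
&\text{let } v = \begin{cases}\mathbf{a}_p[h] & \text{if } \mathbf{a} \neq \varnothing \wedge h \in \mathcal{K}(\mathbf{a}_p)\\ \varnothing & \text{otherwise}\end{cases}\\
&\forall i \in \mathbb{N}_{\min(b_z, |v|)}\colon \mu'_{b_o+i} \equiv \begin{cases}v_i & \text{if } v \neq \varnothing \wedge \mathbb{Z}_{b_o\cdots+b_z} \subset \mathbb{V}^*_\mu\\ \mu_{b_o+i} & \text{otherwise}\end{cases}\\
&\omega'_7 \equiv \begin{cases}\left.\begin{cases}\mathtt{NONE} & \text{if } v = \varnothing\\ |v| & \text{otherwise}\end{cases}\right\} & \text{if } h \neq \nabla \wedge \mathbb{Z}_{b_o\cdots+b_z} \subset \mathbb{V}^*_\mu\\ \mathtt{OOB} & \text{otherwise}\end{cases}
\end{align*}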
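Function: ΩR(ϱ, ω, µ, s, s, d)
Identifier: read = 2
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } \mathbf{a} = \begin{cases}\mathbf{s} & \text{if } \omega_7 \in \{s, 2^{64} - 1\}\\ \mathbf{d}[\omega_7] & \text{otherwise if } \omega_7 \in \mathcal{K}(\mathbf{d})\\ \varnothing & \text{otherwise}\end{cases}\\
&\text{let } [k_o, k_z, b_o, b_z] = \omega_{8..12}\\
&\text{let } k = \begin{cases}\mathcal{H}(\mathcal{E}_4(s) \frown \mu_{k_o\cdots+k_z}) & \text{if } \mathbb{Z}_{k_o\cdots+k_z} \subset \mathbb{V}_\mu\\ \nabla & \text{otherwise}\end{cases}\\
&\text{let } v = \begin{cases}\mathbf{a}_s[k] & \text{if } \mathbf{a} \neq \varnothing \wedge k \in \mathcal{K}(\mathbf{a}_s)\\ \varnothing & \text{otherwise}\end{cases}\\
&\forall i \in \mathbb{N}_{\min(b_z, |v|)}\colon \mu'_{b_o+i} \equiv \begin{cases}v_i & \text{if } v \neq \varnothing \wedge \mathbb{Z}_{b_o\cdots+b_z} \subset \mathbb{V}^*_\mu\\ \mu_{b_o+i} & \text{otherwise}\end{cases}\\
&\omega'_7 \equiv \begin{cases}\left.\begin{cases}\mathtt{NONE} & \text{if } v = \varnothing\\ |v| & \text{otherwise}\end{cases}\right\} & \text{if } k \neq \nabla \wedge \mathbb{Z}_{b_o\cdots+b_z} \subset \mathbb{V}^*_\mu\\ \mathtt{OOB} & \text{otherwise}\end{cases}
\end{align*}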
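Function: ΩI(ϱ, ω, µ, s, d)
Identifier: info = 4
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } \mathbf{t} = \begin{cases}\mathbf{d}[s] & \text{if } \omega_7 = 2^{64} - 1\\ \mathbf{d}[\omega_7] & \text{otherwise}\end{cases}\\
&\text{let } o = \omega_8\\
&\text{let } m = \begin{cases}\mathcal{E}(\mathbf{t}_c, \mathbf{t}_b, \mathbf{t}_t, \mathbf{t}_g, \mathbf{t}_m, \mathbf{t}_l, \mathbf{t}_i) & \text{if } \mathbf{t} \neq \varnothing\\ \varnothing & \text{otherwise}\end{cases}\\
&\forall i \in \mathbb{N}_{|m|}\colon \mu'_{o+i} \equiv \begin{cases}m_i & \text{if } m \neq \varnothing \wedge \mathbb{Z}_{o\cdots+|m|} \subset \mathbb{V}^*_\mu\\ \mu_{o+i} & \text{otherwise}\end{cases}\\
&\omega'_7 \equiv \begin{cases}\mathtt{OK} & \text{if } m \neq \varnothing \wedge \mathbb{Z}_{o\cdots+|m|} \subset \mathbb{V}^*_\mu\\ \mathtt{NONE} & \text{if } m = \varnothing\\ \mathtt{OOB} & \text{otherwise}\end{cases}
\end{align*}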
B.7. Accumulate Functions. This defines a number of functions broadly of the form (ϱ′ ∈ ZG , ω ′ ∈ ⟦NR ⟧13 , µ′ , (x′ , y′ )) =
Ω◻ (ϱ ∈ NG , ω ∈ ⟦NR ⟧13 , µ ∈ M, (x ∈ X, y ∈ X), . . . ). Functions which have a result component which is equivalent to the
corresponding argument may have said components elided in the description. Functions may also depend upon particular
additional parameters.
Other than the gas-counter which is explicitly defined, elements of pvm state are each assumed to remain unchanged
by the host-call unless explicitly specified.
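\begin{equation}\tag{B.19}
\varrho' \equiv \varrho - g
\end{equation}
\begin{equation}\tag{B.20}
(\varepsilon', \omega', \mu', x', y') \equiv \begin{cases}
(\infty, \omega, \mu, x, y) & \text{if } \varrho < g\\
(\blacktriangleright, \omega, \mu, x, y) \text{ except as indicated below} & \text{otherwise}
\end{cases}
\end{equation}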
Gas usage: g = 10
Mutations:
\begin{align*}
&(\omega'_7, (x'_u)_x) = \begin{cases}
(\mathtt{OOB}, (x_u)_x) & \text{if } \mathbf{g} = \nabla\\
(\mathtt{WHO}, (x_u)_x) & \text{otherwise if } (m, a, v) \notin (\mathbb{N}_S, \mathbb{N}_S, \mathbb{N}_S)\\
(\mathtt{OK}, \left\lgroup m, a, v, \mathbf{g}\right\rgroup) & \text{otherwise}
\end{cases}
\end{align*}
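Function: ΩA(ϱ, ω, µ, (x, y))
Identifier: assign = 6
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } o = \omega_8\\
&\text{let } c = \begin{cases}[\mu_{o+32i\cdots+32} \mid i \leftarrow \mathbb{N}_Q] & \text{if } \mathbb{Z}_{o\cdots+32Q} \subset \mathbb{V}_\mu\\ \nabla & \text{otherwise}\end{cases}\\
&(\omega'_7, (x'_u)_q[\omega_7]) = \begin{cases}
(\mathtt{OOB}, (x_u)_q[\omega_7]) & \text{if } c = \nabla\\
(\mathtt{CORE}, (x_u)_q[\omega_7]) & \text{otherwise if } \omega_7 \geq C\\
(\mathtt{OK}, c) & \text{otherwise}
\end{cases}
\end{align*}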
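Function: ΩD(ϱ, ω, µ, (x, y))
Identifier: designate = 7
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } o = \omega_7\\
&\text{let } v = \begin{cases}[\mu_{o+336i\cdots+336} \mid i \leftarrow \mathbb{N}_V] & \text{if } \mathbb{Z}_{o\cdots+336V} \subset \mathbb{V}_\mu\\ \nabla & \text{otherwise}\end{cases}\\
&(\omega'_7, (x'_u)_i) = \begin{cases}
(\mathtt{OOB}, (x_u)_i) & \text{if } v = \nabla\\
(\mathtt{OK}, v) & \text{otherwise}
\end{cases}
\end{align*}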
Function
Identifier Mutations
Gas usage
⎪
⎪∇ otherwise
⎩
ΩT (ϱ, ω, µ, (x, y))
let b = (xs )b − a
transfer = 11
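g = 10 + ω9
\begin{align*}
&(\omega'_7, x'_t, (x'_s)_b) \equiv \begin{cases}
(\mathtt{OOB}, x_t, (x_s)_b) & \text{if } t = \nabla\\
(\mathtt{WHO}, x_t, (x_s)_b) & \text{otherwise if } d \notin \mathcal{K}(\mathbf{d})\\
(\mathtt{LOW}, x_t, (x_s)_b) & \text{otherwise if } l < \mathbf{d}[d]_m\\
(\mathtt{CASH}, x_t, (x_s)_b) & \text{otherwise if } b < (x_s)_t\\
(\mathtt{OK}, x_t \mathbin{+\!\!+} t, b) & \text{otherwise}
\end{cases}
\end{align*}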
⎪
▸ ▸
quit = 12 ⎪
⎪
⎪
⎩∇ otherwise
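g = 10
\begin{align*}
&(\varepsilon', \omega'_7, (x'_u)_d, x'_t) \equiv \begin{cases}
(\blacksquare, \mathtt{OK}, (x_u)_d \setminus \{x_s\}, x_t) & \text{if } t = \varnothing\\
(\blacktriangleright, \mathtt{OOB}, (x_u)_d, x_t) & \text{otherwise if } t = \nabla\\
(\blacktriangleright, \mathtt{WHO}, (x_u)_d, x_t) & \text{otherwise if } d \notin \mathcal{K}(\mathbf{d})\\
(\blacktriangleright, \mathtt{LOW}, (x_u)_d, x_t) & \text{otherwise if } g < \mathbf{d}[d]_m\\
(\blacksquare, \mathtt{OK}, (x_u)_d \setminus \{x_s\}, x_t \mathbin{+\!\!+} t) & \text{otherwise}
\end{cases}
\end{align*}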
Function
Identifier Mutations
Gas usage
B.8. Refine Functions. These assume some refine context pair (m, e) ∈ (D⟨N → M⟩, ⟦G⟧), which are both initially
empty. Other than the gas-counter which is explicitly defined, elements of pvm state are each assumed to remain
unchanged by the host-call unless explicitly specified.
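\begin{equation}\tag{B.21}
\varrho' \equiv \varrho - g
\end{equation}
\begin{equation}\tag{B.22}
(\varepsilon', \omega', \mu') \equiv \begin{cases}
(\infty, \omega, \mu) & \text{if } \varrho < g\\
(\blacktriangleright, \omega, \mu) \text{ except as indicated below} & \text{otherwise}
\end{cases}
\end{equation}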
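Function: ΩH(ϱ, ω, µ, (m, e), s, d, t)
Identifier: historical_lookup = 15
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } \mathbf{a} = \begin{cases}\mathbf{d}[s] & \text{if } \omega_7 = 2^{64} - 1 \wedge s \in \mathcal{K}(\mathbf{d})\\ \mathbf{d}[\omega_7] & \text{if } \omega_7 \in \mathcal{K}(\mathbf{d})\\ \varnothing & \text{otherwise}\end{cases}\\
&\text{let } [h_o, b_o, b_z] = \omega_{8..11}\\
&\text{let } h = \begin{cases}\mathcal{H}(\mu_{h_o\cdots+32}) & \text{if } \mathbb{Z}_{h_o\cdots+32} \subset \mathbb{V}_\mu\\ \nabla & \text{otherwise}\end{cases}\\
&\text{let } v = \Lambda(\mathbf{a}, t, h)\\
&\forall i \in \mathbb{N}_{\min(b_z, |v|)}\colon \mu'_{b_o+i} \equiv \begin{cases}v_i & \text{if } v \neq \varnothing \wedge \mathbb{Z}_{b_o\cdots+b_z} \subset \mathbb{V}^*_\mu\\ \mu_{b_o+i} & \text{otherwise}\end{cases}\\
&\omega'_7 \equiv \begin{cases}\mathtt{OOB} & \text{if } h = \nabla \vee \mathbb{Z}_{b_o\cdots+b_z} \not\subset \mathbb{V}^*_\mu\\ \mathtt{NONE} & \text{otherwise if } v = \varnothing\\ |v| & \text{otherwise}\end{cases}
\end{align*}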
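\begin{align*}
&\text{let } v = \begin{cases}\mathbf{i}_{\omega_7} & \text{if } \omega_7 < |\mathbf{i}|\\ \varnothing & \text{otherwise}\end{cases}\\
&\text{let } o = \omega_8
\end{align*}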
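Function: ΩE(ϱ, ω, µ, (m, e), ς)
Identifier: export = 17
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } p = \omega_7\\
&\text{let } z = \min(\omega_8, W_G)\\
&\text{let } x = \begin{cases}\mathcal{P}_{W_G}(\mu^{\circlearrowleft}_{p\cdots+z}) & \text{if } \mathbb{N}_{p\cdots+z} \subseteq \mathbb{V}_\mu\\ \nabla & \text{otherwise}\end{cases}\\
&(\omega'_7, e') \equiv \begin{cases}
(\mathtt{OOB}, e) & \text{if } x = \nabla\\
(\mathtt{FULL}, e) & \text{otherwise if } \varsigma + |e| \geq W_M\\
(\varsigma + |e|, e \mathbin{+\!\!+} x) & \text{otherwise}
\end{cases}
\end{align*}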
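\begin{align*}
&(\omega'_7, m) \equiv \begin{cases}
(\mathtt{OOB}, m) & \text{if } p = \nabla\\
(n, m \cup \{n \mapsto \left\lgroup p, u, i\right\rgroup\}) & \text{otherwise}
\end{cases}
\end{align*}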
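Function: ΩP(ϱ, ω, µ, (m, e))
Identifier: peek = 19
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } [n, o, s, z] = \omega_{7\dots11}\\
&\text{let } \mathbf{s} = \begin{cases}\varnothing & \text{if } n \notin \mathcal{K}(m)\\ (m[n]_u)_{s\cdots+z} & \text{if } \mathbb{N}_{s\cdots+z} \subseteq \mathbb{V}_{m[n]_u} \wedge \mathbb{N}_{o\cdots+z} \subseteq \mathbb{V}^*_\mu\\ \nabla & \text{otherwise}\end{cases}\\
&(\omega'_7, \mu') \equiv \begin{cases}
(\mathtt{OOB}, \mu) & \text{if } \mathbf{s} = \nabla\\
(\mathtt{WHO}, \mu) & \text{if } \mathbf{s} = \varnothing\\
(\mathtt{OK}, \mu') \text{ where } \mu' = \mu \text{ except } \mu'^{\circlearrowleft}_{o\cdots+z} = \mathbf{s} & \text{otherwise}
\end{cases}
\end{align*}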
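Function: ΩX(ϱ, ω, µ, (m, e))
Identifier: expunge = 24
Gas usage: g = 10
Mutations:
\begin{align*}
&\text{let } n = \omega_7\\
&(\omega'_7, m') \equiv \begin{cases}
(\mathtt{WHO}, m) & \text{if } n \notin \mathcal{K}(m)\\
(m[n]_i, m \setminus n) & \text{otherwise}
\end{cases}
\end{align*}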
(C.1) E(∅) ≡ []
(C.2) E(x ∈ Y) ≡ x
(C.3) E(⎧a, b, . . .⎫) ≡ E(a) ⌢ E(b) ⌢ . . .
Passing multiple arguments to the serialization functions is equivalent to passing a tuple of those arguments. Formally:
C.1.2. Integer Encoding. We first define the trivial natural number serialization functions which are subscripted by the
number of octets of the final sequence. Values are encoded in a regular little-endian fashion. This is utilized for almost
all integer encoding across the protocol. Formally:
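\begin{equation}\tag{C.5}
\mathcal{E}_{l\in\mathbb{N}}\colon\begin{cases}
\mathbb{N}_{2^{8l}} \to \mathbb{Y}_l\\
x \mapsto \begin{cases}
[] & \text{if } l = 0\\
[x \bmod 256] \frown \mathcal{E}_{l-1}\left(\left\lfloor \frac{x}{256} \right\rfloor\right) & \text{otherwise}
\end{cases}
\end{cases}
\end{equation}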
We define general natural number serialization, able to encode naturals of up to 264 , as:
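\begin{equation}\tag{C.6}
\mathcal{E}\colon\begin{cases}
\mathbb{N}_{2^{64}} \to \mathbb{Y}_{1:9}\\
x \mapsto \begin{cases}
[0] & \text{if } x = 0\\
\left[2^8 - 2^{8-l} + \left\lfloor \frac{x}{2^{8l}} \right\rfloor\right] \frown \mathcal{E}_l(x \bmod 2^{8l}) & \text{if } \exists l \in \mathbb{N}_8 : 2^{7l} \leq x < 2^{7(l+1)}\\
[2^8 - 1] \frown \mathcal{E}_8(x) & \text{otherwise if } x < 2^{64}
\end{cases}
\end{cases}
\end{equation}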
Note that at present this is utilized only in encoding the length prefix of variable-length sequences.
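The two integer encodings of equations C.5 and C.6 can be illustrated with a short Python sketch. The function names `encode_fixed` and `encode_general` are hypothetical; the logic is a direct transcription of the two definitions, relying on Python's built-in little-endian conversion for the fixed-length case.

```python
def encode_fixed(x: int, l: int) -> bytes:
    """Little-endian encoding of x into exactly l octets (equation C.5)."""
    if not 0 <= x < 2 ** (8 * l):
        raise ValueError("x does not fit into l octets")
    return x.to_bytes(l, "little")


def encode_general(x: int) -> bytes:
    """Variable-length natural encoding of one to nine octets (equation C.6)."""
    if not 0 <= x < 2**64:
        raise ValueError("x must lie in [0, 2**64)")
    if x == 0:
        return b"\x00"
    for l in range(8):  # l ranges over {0, ..., 7}
        if 2 ** (7 * l) <= x < 2 ** (7 * (l + 1)):
            # The prefix octet carries l leading one-bits and the top bits of x.
            prefix = 2**8 - 2 ** (8 - l) + x // 2 ** (8 * l)
            return bytes([prefix]) + encode_fixed(x % 2 ** (8 * l), l)
    # Values of 2**56 and above use a full eight-octet payload.
    return bytes([2**8 - 1]) + encode_fixed(x, 8)
```

In this sketch values below 128 occupy a single octet, and each additional seven bits of magnitude costs one further octet until the nine-octet form is reached.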
C.1.3. Sequence Encoding. We define the sequence serialization function E(⟦T ⟧) for any T which is itself a subset of the
domain of E. We simply concatenate the serializations of each element in the sequence in turn:
Thus, conveniently, fixed length octet sequences (e.g. hashes H and its variants) have an identity serialization.
C.1.4. Discriminator Encoding. When we have sets of heterogeneous items such as a union of different kinds of tuples
or sequences of different length, we require a discriminator to determine the nature of the encoded item for successful
deserialization. Discriminators are encoded as a natural and are encoded immediately prior to the item.
We generally use a length discriminator when serializing sequence terms which have variable length (e.g. general blobs
Y or unbound numeric sequences ⟦N⟧) (though this is omitted in the case of fixed-length terms such as hashes H).19 In
this case, we simply prefix the term with its length prior to encoding. Thus, for some term y ∈ ⎧x ∈ Y, . . .⎫, we would
generally define its serialized form to be E(∣x∣) ⌢ E(x) ⌢ . . . . To avoid repetition of the term in such cases, we define
the notation ↕x to mean that the term of value x is variable in size and requires a length discriminator. Formally:
(C.8) ↕x ≡ ⎧∣x∣, x⎫ thus E(↕x) ≡ E(∣x∣) ⌢ E(x)
We also define a convenient discriminator operator ¿x specifically for terms defined by some serializable set in union
with ∅ (generally denoted for some set S as S?):
\begin{equation}\tag{C.9}
¿x \equiv \begin{cases}
0 & \text{if } x = \varnothing\\
(1, x) & \text{otherwise}
\end{cases}
\end{equation}
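The two discriminator notations of equations C.8 and C.9 can be sketched in Python as follows. The helper names are hypothetical, the item is taken to be an already-serialized blob, and the general natural encoding of equation C.6 is repeated in compact form so that the block stands on its own.

```python
from typing import Optional


def encode_natural(x: int) -> bytes:
    """General natural encoding of equation C.6 (compact restatement)."""
    if x == 0:
        return b"\x00"
    for l in range(8):
        if 2 ** (7 * l) <= x < 2 ** (7 * (l + 1)):
            head = 2**8 - 2 ** (8 - l) + x // 2 ** (8 * l)
            return bytes([head]) + (x % 2 ** (8 * l)).to_bytes(l, "little")
    return b"\xff" + x.to_bytes(8, "little")


def encode_with_length(blob: bytes) -> bytes:
    """Length-discriminated form: the encoded length followed by the blob (C.8)."""
    return encode_natural(len(blob)) + blob


def encode_optional(blob: Optional[bytes]) -> bytes:
    """Optional form: 0 for the empty case, otherwise 1 followed by the item (C.9)."""
    if blob is None:
        return encode_natural(0)
    return encode_natural(1) + blob
```

For instance, a three-octet blob gains a single leading octet of value 3, and an absent optional item serializes to the single octet 0.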
C.1.5. Bit Sequence Encoding. A sequence of bits b ∈ B is a special case since encoding each individual bit as an octet
would be very wasteful. We instead pack the bits into octets in order of least significant to most, and arrange into an
octet stream. In the case of a variable length sequence, then the length is prefixed as in the general case.
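\begin{equation}\tag{C.10}
\mathcal{E}(b \in \mathbb{B}) \equiv \begin{cases}
[] & \text{if } b = []\\
\left[\sum_{i=0}^{\min(8, |b|)} b_i \cdot 2^i\right] \frown \mathcal{E}(b_{8\dots}) & \text{otherwise}
\end{cases}
\end{equation}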
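A minimal Python sketch of the bit packing of equation C.10 follows. The name `encode_bits` is hypothetical, and we read the sum as covering the first min(8, ∣b∣) bits of the remaining sequence, so that each group of up to eight bits fills one octet from the least significant bit upwards.

```python
def encode_bits(bits: list[int]) -> bytes:
    """Pack a sequence of bits (each 0 or 1) into octets, least significant bit first."""
    out = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]  # the final chunk may hold fewer than eight bits
        out.append(sum(bit << i for i, bit in enumerate(chunk)))
    return bytes(out)
```

A sequence of nine one-bits therefore yields two octets, the second holding only its least significant bit.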
C.1.6. Dictionary Encoding. In general, dictionaries are placed in the Merkle trie directly (see appendix E for details).
However, small dictionaries may reasonably be encoded as a sequence of pairs ordered by the key. Formally:
C.1.7. Set Encoding. For any values which are sets and don’t already have a defined encoding above, we define the
serialization of a set as the serialization of the set’s elements in proper order. Formally:
19Note that since specific values may belong to both sets which would need a discriminator and those that would not then we are sadly
unable to introduce a function capable of serializing corresponding to the term’s limitation. A more sophisticated formalism than basic
set-theory would be needed, capable of taking into account not simply the value but the term from which or to which it belongs in order
to do this succinctly.
JAM: JOIN-ACCUMULATE MACHINE DRAFT 0.5.3 - December 20, 2024 54
C.2. Block Serialization. A block B is serialized as a tuple of its elements in regular order, as implied in equations
4.2, 4.3 and 5.1. For the header, we define both the regular serialization and the unsigned serialization EU . Formally:
Note the use of O above to succinctly encode the result of a work item and the slight transformations of EG and
EP to take account of the fact their inner tuples contain variable-length sequence terms a and p which need length
discriminators.
D.1. Serialization. The serialization of state primarily involves placing all the various components of σ into a single
mapping from 32-octet sequence state-keys to octet sequences of indefinite length. The state-key is constructed from a
hash component and a chapter component, equivalent to either the index of a state component or, in the case of the
inner dictionaries of δ, a service index.
We define the state-key constructor functions C as:
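\begin{equation}\tag{D.1}
C\colon\begin{cases}
\mathbb{N}_{2^8} \cup (\mathbb{N}_{2^8}, \mathbb{N}_S) \cup \left\lgroup\mathbb{N}_S, \mathbb{Y}\right\rgroup \to \mathbb{H}\\
i \in \mathbb{N}_{2^8} \mapsto [i, 0, 0, \dots]\\
(i, s \in \mathbb{N}_S) \mapsto [i, n_0, 0, n_1, 0, n_2, 0, n_3, 0, 0, \dots] \quad\text{where } n = \mathcal{E}_4(s)\\
(s, h) \mapsto [n_0, h_0, n_1, h_1, n_2, h_2, n_3, h_3, h_4, h_5, \dots, h_{27}] \quad\text{where } n = \mathcal{E}_4(s)
\end{cases}
\end{equation}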
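The three forms of the state-key constructor in equation D.1 can be sketched in Python. The function names are hypothetical; each returns a 32-octet key, and the service index is interleaved after little-endian encoding into four octets.

```python
def state_key_from_index(i: int) -> bytes:
    """Key for a top-level state component: the index followed by zeroes."""
    return bytes([i]) + bytes(31)


def state_key_from_index_and_service(i: int, s: int) -> bytes:
    """Key for a per-service entry: the index, then the service octets interleaved with zeroes."""
    n = s.to_bytes(4, "little")
    return bytes([i, n[0], 0, n[1], 0, n[2], 0, n[3]]) + bytes(24)


def state_key_from_service_and_hash(s: int, h: bytes) -> bytes:
    """Key for an inner-dictionary entry: service octets interleaved with the first octets of h."""
    if len(h) < 28:
        raise ValueError("h must supply at least 28 octets")
    n = s.to_bytes(4, "little")
    return bytes([n[0], h[0], n[1], h[1], n[2], h[2], n[3], h[3]]) + h[4:28]
```

Only the first 28 octets of the second argument contribute to the key in the last form, which matches the index range h0 to h27 in the equation.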
The state serialization is then defined as the dictionary built from the amalgamation of each of the components.
Cryptographic hashing ensures that there will be no duplicate state-keys given that there are no duplicate inputs to C.
Formally, we define T which transforms some state σ into its serialized form:
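\begin{equation}\tag{D.2}
T(\sigma) \equiv \left\{\begin{aligned}
&C(1) \mapsto \mathcal{E}([\updownarrow x \mid x \leftarrow \alpha]),\\
&C(2) \mapsto \mathcal{E}(\varphi),\\
&C(3) \mapsto \mathcal{E}(\updownarrow[(h, \mathcal{E}_M(b), s, \updownarrow p) \mid (h, b, s, p) \leftarrow \beta]),\\
&C(4) \mapsto \mathcal{E}\left(\gamma_k, \gamma_z, \begin{cases}0 & \text{if } \gamma_s \in \llbracket\mathbb{C}\rrbracket_E\\ 1 & \text{if } \gamma_s \in \llbracket\mathbb{H}_B\rrbracket_E\end{cases}, \gamma_s, \updownarrow\gamma_a\right),\\
&C(5) \mapsto \mathcal{E}(\updownarrow[x \mathrel{\text{\^{}\^{}}} x \in \psi_g], \updownarrow[x \mathrel{\text{\^{}\^{}}} x \in \psi_b], \updownarrow[x \mathrel{\text{\^{}\^{}}} x \in \psi_w], \updownarrow[x \mathrel{\text{\^{}\^{}}} x \in \psi_o]),\\
&C(6) \mapsto \mathcal{E}(\eta),\\
&C(7) \mapsto \mathcal{E}(\iota),\\
&C(8) \mapsto \mathcal{E}(\kappa),\\
&C(9) \mapsto \mathcal{E}(\lambda),\\
&C(10) \mapsto \mathcal{E}([¿(w, \mathcal{E}_4(t)) \mid (w, t) \leftarrow \rho]),\\
&C(11) \mapsto \mathcal{E}_4(\tau),\\
&C(12) \mapsto \mathcal{E}_4(\chi_m, \chi_a, \chi_v) \frown \mathcal{E}(\chi_g),\\
&C(13) \mapsto \mathcal{E}([\mathcal{E}_4^{\#}(i) \mid i \leftarrow \pi]),\\
&C(14) \mapsto \mathcal{E}([\updownarrow i \mid i \leftarrow \vartheta]),\\
&C(15) \mapsto \mathcal{E}([\updownarrow i \mid i \leftarrow \xi]),\\
&\forall (s \mapsto a) \in \delta\colon C(255, s) \mapsto a_c \frown \mathcal{E}_8(a_b, a_g, a_m, a_l) \frown \mathcal{E}_4(a_i),\\
&\forall (s \mapsto a) \in \delta, (k \mapsto v) \in a_s\colon C(s, \mathcal{E}_4(2^{32} - 1) \frown k_{0\dots28}) \mapsto v,\\
&\forall (s \mapsto a) \in \delta, (h \mapsto p) \in a_p\colon C(s, \mathcal{E}_4(2^{32} - 2) \frown h_{1\dots29}) \mapsto p,\\
&\forall (s \mapsto a) \in \delta, (\left\lgroup h, l\right\rgroup \mapsto t) \in a_l\colon C(s, \mathcal{E}_4(l) \frown \mathcal{H}(h)_{2\dots30}) \mapsto \mathcal{E}(\updownarrow[\mathcal{E}_4(x) \mid x \leftarrow t])
\end{aligned}\right.
\end{equation}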
Note that most rows describe a single mapping between a key derived from a natural and the serialization of a state
component. However, the final four rows each define sets of mappings since these items act over all service accounts and,
in the case of the final three rows, the keys of a nested dictionary within the service.
Also note that all non-discriminator numeric serialization in state is done in fixed-length according to the size of the
term.
D.2. Merklization. With T defined, we now define the rest of Mσ which primarily involves transforming the serialized
mapping into a cryptographic commitment. We define this commitment as the root of the binary Patricia Merkle Trie
with a format optimized for modern compute hardware, primarily by optimizing sizes to fit succinctly into typical memory
layouts and reducing the need for unpredictable branching.
D.2.1. Node Encoding and Trie Identification. We identify (sub-)tries as the hash of their root node, with one exception:
empty (sub-)tries are identified as the zero-hash, H0 .
Nodes are fixed in size at 512 bits (64 bytes). Each node is either a branch or a leaf. The first bit discriminates between
these two types.
In the case of a branch, the remaining 511 bits are split between the two child node hashes, using the last 255 bits of
the 0-bit (left) sub-trie identity and the full 256 bits of the 1-bit (right) sub-trie identity.
Leaf nodes are further subdivided into embedded-value leaves and regular leaves. The second bit of the node
discriminates between these.
In the case of an embedded-value leaf, the remaining 6 bits of the first byte are used to store the embedded value size.
The following 31 bytes are dedicated to the first 31 bytes of the key. The last 32 bytes are defined as the value, filling
with zeroes if its length is less than 32 bytes.
In the case of a regular leaf, the remaining 6 bits of the first byte are zeroed. The following 31 bytes store the first 31
bytes of the key. The last 32 bytes store the hash of the value.
Formally, we define the encoding functions B and L:
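\begin{equation}\tag{D.3}
B\colon\begin{cases}
(\mathbb{H}, \mathbb{H}) \to \mathbb{B}_{512}\\
(l, r) \mapsto [0] \frown \text{bits}(l)_{1\dots} \frown \text{bits}(r)
\end{cases}
\end{equation}
\begin{equation}\tag{D.4}
L\colon\begin{cases}
(\mathbb{H}, \mathbb{Y}) \to \mathbb{B}_{512}\\
(k, v) \mapsto \begin{cases}
[1, 0] \frown \text{bits}(\mathcal{E}_1(|v|))_{2\dots} \frown \text{bits}(k)_{\dots248} \frown \text{bits}(v) \frown [0, 0, \dots] & \text{if } |v| \leq 32\\
[1, 1, 0, 0, 0, 0, 0, 0] \frown \text{bits}(k)_{\dots248} \frown \text{bits}(\mathcal{H}(v)) & \text{otherwise}
\end{cases}
\end{cases}
\end{equation}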
We may then define the basic Merklization function Mσ as:
\begin{equation}\tag{D.5}
M_\sigma(\sigma) \equiv M(\{(\text{bits}(k) \mapsto \left\lgroup k, v\right\rgroup) \mid (k \mapsto v) \in T(\sigma)\})
\end{equation}
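\begin{equation}\tag{D.6}
M(d : \mathbb{D}\langle\mathbb{B}\to(\mathbb{H}, \mathbb{Y})\rangle) \equiv \begin{cases}
\mathbb{H}_0 & \text{if } |d| = 0\\
\mathcal{H}(\text{bits}^{-1}(L(k, v))) & \text{if } \mathcal{V}(d) = \{(k, v)\}\\
\mathcal{H}(\text{bits}^{-1}(B(M(l), M(r)))) & \text{otherwise}\\
\quad\text{where } \forall b, p : (b \mapsto p) \in d \Leftrightarrow (b_{1\dots} \mapsto p) \in \begin{cases} l & \text{if } b_0 = 0\\ r & \text{if } b_0 = 1\end{cases}
\end{cases}
\end{equation}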
E.1.2. Constant-Depth Tree. We define the constant-depth binary Merkle function as M. We define two corresponding
functions for working with subtree pages, Jx and Lx . The latter provides a single page of leaves, themselves hashed,
prefixed data. The former provides the Merkle path to a single page. Both assume size-aligned pages of size 2x and
accept page indices.
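\begin{equation}\tag{E.4}
\mathcal{M}\colon\begin{cases}
(\llbracket\mathbb{Y}\rrbracket, \mathbb{Y}\to\mathbb{H}) \to \mathbb{H}\\
(v, H) \mapsto N(C(v, H), H)
\end{cases}
\end{equation}
\begin{equation}\tag{E.5}
\mathcal{J}_x\colon\begin{cases}
(\llbracket\mathbb{Y}\rrbracket, \mathbb{N}_{|v|}, \mathbb{Y}\to\mathbb{H}) \to \llbracket\mathbb{H}\rrbracket\\
(v, i, H) \mapsto T(C(v, H), 2^x i, H)_{\dots\max(0, \lceil\log_2(\max(1, |v|)) - x\rceil)}
\end{cases}
\end{equation}
\begin{equation}\tag{E.6}
\mathcal{L}_x\colon\begin{cases}
(\llbracket\mathbb{Y}\rrbracket, \mathbb{N}_{|v|}, \mathbb{Y}\to\mathbb{H}) \to \llbracket\mathbb{H}\rrbracket\\
(v, i, H) \mapsto [H(\$\text{leaf} \frown l) \mid l \leftarrow v_{2^x i\dots\min(2^x i + 2^x, |v|)}]
\end{cases}
\end{equation}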
For the latter justification Jx to be acceptable, we must assume the target observer also knows not merely the value
of the item at the given index, but also all other leaves within its 2x size subtree, given by Lx .
As above, we may assume a default value for H of the Blake 2b hash function, H.
For justifications and Merkle root calculations, a constancy preprocessor function C is applied which hashes all data
items with a fixed prefix "leaf" and then pads the overall size to the next power of two with the zero hash H0:
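\begin{equation}\tag{E.7}
C\colon\begin{cases}
(\llbracket\mathbb{Y}\rrbracket, \mathbb{Y}\to\mathbb{H}) \to \llbracket\mathbb{H}\rrbracket\\
(v, H) \mapsto v' \quad\text{where } \begin{cases}
|v'| = 2^{\lceil\log_2(\max(1, |v|))\rceil}\\
v'_i = \begin{cases}H(\$\text{leaf} \frown v_i) & \text{if } i < |v|\\ \mathbb{H}_0 & \text{otherwise}\end{cases}
\end{cases}
\end{cases}
\end{equation}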
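The constancy preprocessor of equation E.7 can be sketched in Python as below. The names are hypothetical, and three details are assumptions of this sketch: the default hash is taken to be Blake2b with a 32-octet digest, the prefix is taken to be the ASCII string "leaf", and the zero hash is taken to be 32 zero octets.

```python
import hashlib
from typing import Callable, List, Sequence

ZERO_HASH = bytes(32)  # assumed representation of the zero hash


def blake2b_256(data: bytes) -> bytes:
    """Assumed default hash: Blake2b with a 32-octet digest."""
    return hashlib.blake2b(data, digest_size=32).digest()


def constancy_preprocess(
    items: Sequence[bytes], hash_fn: Callable[[bytes], bytes] = blake2b_256
) -> List[bytes]:
    """Hash every item with the "leaf" prefix, then pad to a power-of-two length."""
    size = 1
    while size < len(items):  # smallest power of two that is at least max(1, len(items))
        size *= 2
    hashed = [hash_fn(b"leaf" + item) for item in items]
    return hashed + [ZERO_HASH] * (size - len(items))
```

With three input items the result has four entries, the last being the zero hash; an empty input yields a single zero hash.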
E.2. Merkle Mountain Ranges. The Merkle mountain range (mmr) is an append-only cryptographic data structure
which yields a commitment to a sequence of values. Appending to an mmr and proof of inclusion of some item within
it are both O(log(N )) in time and space for the size of the set.
We define a Merkle mountain range as being within the set ⟦H?⟧, a sequence of peaks, each peak the root of a
Merkle tree containing 2i items where i is the index in the sequence. Since we support set sizes which are not always
powers-of-two-minus-one, some peaks may be empty, ∅ rather than a Merkle root.
Since the sequence of hashes is somewhat unwieldy as a commitment, Merkle mountain ranges are themselves generally
hashed before being published. Hashing them removes the possibility of further appending so the range itself is kept on
the system which needs to generate future proofs.
We define the append function A as:
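\begin{equation}\tag{E.8}
\begin{aligned}
&\mathcal{A}\colon\begin{cases}
(\llbracket\mathbb{H}?\rrbracket, \mathbb{H}, \mathbb{Y}\to\mathbb{H}) \to \llbracket\mathbb{H}?\rrbracket\\
(r, l, H) \mapsto P(r, l, 0, H)
\end{cases}\\
&\text{where } P\colon\begin{cases}
(\llbracket\mathbb{H}?\rrbracket, \mathbb{H}, \mathbb{N}, \mathbb{Y}\to\mathbb{H}) \to \llbracket\mathbb{H}?\rrbracket\\
(r, l, n, H) \mapsto \begin{cases}
r \mathbin{+\!\!+} l & \text{if } n \geq |r|\\
R(r, n, l) & \text{if } n < |r| \wedge r_n = \varnothing\\
P(R(r, n, \varnothing), H(r_n \frown l), n + 1, H) & \text{otherwise}
\end{cases}
\end{cases}\\
&\text{and } R\colon\begin{cases}
(\llbracket T\rrbracket, \mathbb{N}, T) \to \llbracket T\rrbracket\\
(s, i, v) \mapsto s' \quad\text{where } s' = s \text{ except } s'_i = v
\end{cases}
\end{aligned}
\end{equation}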
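\begin{equation}\tag{E.10}
\mathcal{M}_R\colon\begin{cases}
\llbracket\mathbb{H}?\rrbracket \to \mathbb{H}\\
b \mapsto \begin{cases}
\mathbb{H}_0 & \text{if } |h| = 0\\
h_0 & \text{if } |h| = 1\\
\mathcal{H}_K(\$\text{node} \frown \mathcal{M}_R(h_{\dots|h|-1}) \frown h_{|h|-1}) & \text{otherwise}
\end{cases}\\
\quad\text{where } h = [h \mid h \leftarrow b, h \neq \varnothing]
\end{cases}
\end{equation}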
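The append function and the peak-folding function MR above can be sketched in Python. This is an illustrative sketch under stated assumptions: empty peaks are modelled as `None`, the hash function is passed in by the caller (MR is specified with HK, which the Python standard library does not provide directly), the prefix is taken to be the ASCII string "node", and the zero hash is taken to be 32 zero octets. The recursion of P is unrolled into a loop.

```python
from typing import Callable, List, Optional

Peak = Optional[bytes]           # None models an empty peak
HashFn = Callable[[bytes], bytes]


def mmr_append(peaks: List[Peak], leaf: bytes, hash_fn: HashFn) -> List[Peak]:
    """Append one leaf hash to a range, returning the new sequence of peaks."""
    peaks = list(peaks)          # leave the caller's list untouched
    n = 0
    while True:
        if n >= len(peaks):      # ran past the end: the carried hash forms a new peak
            peaks.append(leaf)
            return peaks
        if peaks[n] is None:     # free slot: place the carried hash here
            peaks[n] = leaf
            return peaks
        # Occupied slot: merge it with the carried hash and move one level up.
        leaf = hash_fn(peaks[n] + leaf)
        peaks[n] = None
        n += 1


def mmr_super_peak(peaks: List[Peak], hash_fn: HashFn) -> bytes:
    """Fold all non-empty peaks into a single commitment."""
    present = [p for p in peaks if p is not None]
    if not present:
        return bytes(32)         # assumed representation of the zero hash
    acc = present[0]
    for peak in present[1:]:
        acc = hash_fn(b"node" + acc + peak)
    return acc
```

Appending behaves like incrementing a binary counter: after four appends the first two slots are empty and the third holds the root over all four leaves.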
Appendix F. Shuffling
The Fisher-Yates shuffle function is defined formally as:
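\begin{equation}\tag{F.1}
\forall T, l \in \mathbb{N} : \mathcal{F}\colon\begin{cases}
(\llbracket T\rrbracket_l, \llbracket\mathbb{N}\rrbracket_{l:}) \to \llbracket T\rrbracket_l\\
(s, r) \mapsto \begin{cases}
[s_{r_0 \bmod l}] \frown \mathcal{F}(s'_{\dots l-1}, r_{1\dots}) \text{ where } s' = s \text{ except } s'_{r_0 \bmod l} = s_{l-1} & \text{if } s \neq []\\
[] & \text{otherwise}
\end{cases}
\end{cases}
\end{equation}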
Since it is often useful to shuffle a sequence based on some random seed in the form of a hash, we provide a secondary
form of the shuffle function F which accepts a 32-byte hash instead of the numeric sequence. We define Q, the numeric-
sequence-from-hash function, thus:
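\begin{equation}\tag{F.2}
\forall l \in \mathbb{N} : \mathcal{Q}_l\colon\begin{cases}
\mathbb{H} \to \llbracket\mathbb{N}_l\rrbracket\\
h \mapsto [\mathcal{E}_4^{-1}(\mathcal{H}(h \frown \mathcal{E}_4(\lfloor i/8\rfloor))_{4i \bmod 32\cdots+4}) \mid i \leftarrow \mathbb{N}_l]
\end{cases}
\end{equation}
\begin{equation}\tag{F.3}
\forall T, l \in \mathbb{N} : \mathcal{F}\colon\begin{cases}
(\llbracket T\rrbracket_l, \mathbb{H}) \to \llbracket T\rrbracket_l\\
(s, h) \mapsto \mathcal{F}(s, \mathcal{Q}_l(h))
\end{cases}
\end{equation}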
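Both forms of the shuffle can be sketched in Python as follows. The function names are hypothetical, the recursion of equation F.1 is unrolled into a loop, and the hash in equation F.2 is assumed to be Blake2b with a 32-octet digest.

```python
import hashlib
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shuffle(items: Sequence[T], randoms: Sequence[int]) -> List[T]:
    """Fisher-Yates shuffle driven by an explicit sequence of naturals (F.1)."""
    if len(randoms) < len(items):
        raise ValueError("need at least one random number per item")
    pool = list(items)
    out: List[T] = []
    for r in randoms[:len(items)]:
        idx = r % len(pool)
        out.append(pool[idx])
        pool[idx] = pool[-1]     # move the last item into the vacated slot
        pool.pop()               # and shrink the pool by one
    return out


def numbers_from_hash(seed: bytes, l: int) -> List[int]:
    """Derive l naturals from a 32-octet seed, eight per hash invocation (F.2)."""
    out = []
    for i in range(l):
        digest = hashlib.blake2b(seed + (i // 8).to_bytes(4, "little"), digest_size=32).digest()
        offset = (4 * i) % 32
        out.append(int.from_bytes(digest[offset:offset + 4], "little"))
    return out


def shuffle_with_hash(items: Sequence[T], seed: bytes) -> List[T]:
    """Shuffle using naturals derived from a hash seed (F.3)."""
    return shuffle(items, numbers_from_hash(seed, len(items)))
```

The output is always a permutation of the input, and the same seed always produces the same ordering.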
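\begin{equation}\tag{G.1}
F^{m\in\mathbb{Y}}_{k\in\mathbb{H}_B}\langle c \in \mathbb{H}\rangle \subset \mathbb{Y}_{96} \equiv \{x \mid x \in \mathbb{Y}_{96}, \text{verify}(k, c, m, \text{decode}(x_{\dots32}), \text{decode}(x_{32\dots})) = \top\}
\end{equation}
\begin{equation}\tag{G.2}
\mathcal{Y}(s \in F^m_k\langle c\rangle) \in \mathbb{H} \equiv \text{hashed\_output}(\text{decode}(x_{\dots32}) \mid x \in F^m_k\langle c\rangle)
\end{equation}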
The singly-contextualized Bandersnatch Ringvrf proofs F̄mr ⟨c⟩ are a zk-snark-enabled analogue utilizing the Pedersen
vrf, also defined by Hosseini and Galassi 2024 and further detailed by Jeffrey Burdges et al. 2023.
Note that in the case a key HB has no corresponding Bandersnatch point when constructing the ring, then the
Bandersnatch padding point as stated by Hosseini and Galassi 2024 should be substituted.
Segment encoding/decoding may be done using the same functions albeit with a constant k = 6.
H.2. Code Word representation. For the sake of brevity we call each octet pair a word. The code words (including
the message words) are treated as elements of the finite field $\mathbb{F}_{2^{16}}$. The field is generated as an extension of $\mathbb{F}_2$ using the
irreducible polynomial:
\begin{equation}\tag{H.8}
x^{16} + x^5 + x^3 + x^2 + 1
\end{equation}
Hence:
\begin{equation}\tag{H.9}
\mathbb{F}_{16} \equiv \frac{\mathbb{F}_2[x]}{x^{16} + x^5 + x^3 + x^2 + 1}
\end{equation}
We name the generator of $\mathbb{F}_{16}$ over $\mathbb{F}_2$, the root of the above polynomial, α, such that $\mathbb{F}_{16} = \mathbb{F}_2(\alpha)$.
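As an illustration of arithmetic in this field, the following Python sketch multiplies two elements given in the standard basis, with bit j of an integer holding the coefficient of the j-th power of α. The names `FIELD_POLY`, `gf_add` and `gf_mul` are hypothetical, and the sketch deliberately does not use the alternative basis representation discussed in the text.

```python
FIELD_POLY = 0x1002D  # x^16 + x^5 + x^3 + x^2 + 1 as a bit mask


def gf_add(a: int, b: int) -> int:
    """Addition of two field elements is a bitwise exclusive-or."""
    return a ^ b


def gf_mul(a: int, b: int) -> int:
    """Shift-and-add multiplication of two 16-bit field elements, reducing by FIELD_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by the generator
        if a & 0x10000:          # degree reached 16: subtract the field polynomial
            a ^= FIELD_POLY
    return result
```

Multiplying the fifteenth power of the generator (`0x8000`) by the generator itself (`0x0002`) wraps around to `0x002D`, which is the low part of the irreducible polynomial.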
Instead of using the standard basis {1, α, α2 , . . . , α15 }, we opt for a representation of F16 which performs more efficiently
for the encoding and the decoding process. To that aim, we name this specific representation of F16 as F̃16 and define it
as a vector space generated by the following Cantor basis:
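\begin{align*}
v_0 &= 1\\
v_1 &= \alpha^{15} + \alpha^{13} + \alpha^{11} + \alpha^{10} + \alpha^7 + \alpha^6 + \alpha^3 + \alpha\\
v_2 &= \alpha^{13} + \alpha^{12} + \alpha^{11} + \alpha^{10} + \alpha^3 + \alpha^2 + \alpha\\
v_3 &= \alpha^{12} + \alpha^{10} + \alpha^9 + \alpha^5 + \alpha^4 + \alpha^3 + \alpha^2 + \alpha\\
v_4 &= \alpha^{15} + \alpha^{14} + \alpha^{10} + \alpha^8 + \alpha^7 + \alpha\\
v_5 &= \alpha^{15} + \alpha^{14} + \alpha^{13} + \alpha^{11} + \alpha^{10} + \alpha^8 + \alpha^5 + \alpha^3 + \alpha^2 + \alpha\\
v_6 &= \alpha^{15} + \alpha^{12} + \alpha^8 + \alpha^6 + \alpha^3 + \alpha^2\\
v_7 &= \alpha^{14} + \alpha^4 + \alpha\\
v_8 &= \alpha^{14} + \alpha^{13} + \alpha^{11} + \alpha^{10} + \alpha^7 + \alpha^4 + \alpha^3\\
v_9 &= \alpha^{12} + \alpha^7 + \alpha^6 + \alpha^4 + \alpha^3\\
v_{10} &= \alpha^{14} + \alpha^{13} + \alpha^{11} + \alpha^9 + \alpha^6 + \alpha^5 + \alpha^4 + \alpha\\
v_{11} &= \alpha^{15} + \alpha^{13} + \alpha^{12} + \alpha^{11} + \alpha^8\\
v_{12} &= \alpha^{15} + \alpha^{14} + \alpha^{13} + \alpha^{12} + \alpha^{11} + \alpha^{10} + \alpha^8 + \alpha^7 + \alpha^5 + \alpha^4 + \alpha^3\\
v_{13} &= \alpha^{15} + \alpha^{14} + \alpha^{13} + \alpha^{12} + \alpha^{11} + \alpha^9 + \alpha^8 + \alpha^5 + \alpha^4 + \alpha^2\\
v_{14} &= \alpha^{15} + \alpha^{14} + \alpha^{13} + \alpha^{12} + \alpha^{11} + \alpha^{10} + \alpha^9 + \alpha^8 + \alpha^5 + \alpha^4 + \alpha^3\\
v_{15} &= \alpha^{15} + \alpha^{12} + \alpha^{11} + \alpha^8 + \alpha^4 + \alpha^3 + \alpha^2 + \alpha
\end{align*}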
Every message word $m_i = m_{i,15} \dots m_{i,0}$ consists of 16 bits. As such it can be regarded as a binary vector of length 16:
\begin{equation}\tag{H.10}
m_i = (m_{i,0} \dots m_{i,15})
\end{equation}
where $m_{i,0}$ is the least significant bit of message word $m_i$. Accordingly we consider the field element
$\tilde{m}_i = \sum_{j=0}^{15} m_{i,j} v_j$ to represent that message word.
Similarly, we assign a unique index between 0 and 1,022 to each validator and we represent validator i with the field
element:
\begin{equation}\tag{H.11}
\tilde{i} = \sum_{j=0}^{15} i_j v_j
\end{equation}
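After finding p(y) with such properties, we evaluate p at the following points:
\begin{equation}\tag{H.13}
\begin{aligned}
\tilde{r}_{342} &:= p(\widetilde{342})\\
\tilde{r}_{343} &:= p(\widetilde{343})\\
&\ \,\vdots\\
\tilde{r}_{1022} &:= p(\widetilde{1022})
\end{aligned}
\end{equation}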
We then distribute the message words and the extra code words among the validators according to their corresponding
indices.
I.2. Functions.
∆: The accumulation function; certain subscripts are used to denote helper functions:
∆1 : The single-step accumulation function.
∆∗ : The parallel accumulation function.
∆+ : The full sequential accumulation function.
Λ: The historical lookup function. See equation 9.7.
Ξ: The work result computation function. See equation 14.11.
Υ: The general state transition function. See equations 4.1, 4.5.
Φ: The key-nullifier function. See equation 6.14.
Ψ: The whole-program pvm machine state-transition function. See equation A.
W: The sequence of work-reports which have now become available and ready for accumulation. See equation
11.16.
Without any superscript, the block is assumed to be the block being imported or, if no block is being imported, the head
of the best chain (see section 19). Explicit block-contextualizing superscripts include:
B♮ : The latest finalized block. See equation 19.
B♭ : The block at the head of the best chain. See equation 19.
I.4.2. State components. Here, the prime annotation indicates posterior state. Individual components may be identified
with a letter subscript.
α: The core αuthorizations pool. See equation 8.1.
β: Information on the most recent βlocks.
γ: State concerning Safrole. See equation 6.3.
γa : The sealing lottery ticket accumulator.
γk : The keys for the validators of the next epoch, equivalent to those keys which constitute γz .
γs : The sealing-key sequence of the current epoch.
γz : The Bandersnatch root for the current epoch’s ticket submissions.
δ: The (prior) state of the service accounts.
δ † : The post-preimage integration, pre-accumulation intermediate state.
δ ‡ : The post-accumulation, pre-transfer intermediate state.
η: The eηtropy accumulator and epochal raηdomness.
ι: The validator keys and metadata to be drawn from next.
κ: The validator κeys and metadata currently active.
λ: The validator keys and metadata which were active in the prior epoch.
ρ: The ρending reports, per core, which are being made available prior to accumulation.
ρ† : The post-judgment, pre-guarantees-extrinsic intermediate state.
ρ‡ : The post-guarantees-extrinsic, pre-assurances-extrinsic, intermediate state.
σ: The σverall state of the system. See equations 4.1, 4.4.
τ: The most recent block’s τimeslot.
φ: The authorization queue.
ψ: Past judgments on work-reports and validators.
ψb : Work-reports judged to be incorrect.
ψg : Work-reports judged to be correct.
ψw : Work-reports whose validity is judged to be unknowable.
ψo : Validators who made a judgment found to be incorrect.
χ: The privileged service indices.
χm : The index of the blessed service.
χv : The index of the designate service.
χa : The index of the assign service.
χg : The always-accumulate service indices and their basic gas allowance.
π: The activity statistics for the validators.
ϑ: The accumulation queue.
ξ: The accumulation history.
I.4.4. Constants.
A = 8: The period, in seconds, between audit tranches.
BI = 10: The additional minimum balance required per item of elective service state.
BL = 1: The additional minimum balance required per octet of elective service state.
BS = 100: The basic minimum balance which all services require.
C = 341: The total number of cores.
D = 28,800: The period in timeslots after which an unreferenced preimage may be expunged.
E = 600: The length of an epoch in timeslots.
F = 2: The audit bias factor, the expected number of additional validators who will audit a work-report in the
following tranche for each no-show in the previous.
GA = 10,000,000: The gas allocated to invoke a work-report’s Accumulation logic.
GI = 50,000,000: The gas allocated to invoke a work-package’s Is-Authorized logic.
GR = 5,000,000,000: The gas allocated to invoke a work-package’s Refine logic.
GT = 3,500,000,000: The total gas allocated across all Accumulation. Should be no smaller than $G_A \cdot C + \sum_{g \in \mathcal{V}(\chi_g)} (g)$.
H = 8: The size of recent history, in blocks.
I = 4: The maximum number of work items in a package.
J = 8: The maximum sum of dependency items in a work-report.
K = 16: The maximum number of tickets which may be submitted in a single extrinsic.
L = 14,400: The maximum age in timeslots of the lookup anchor.
N = 2: The number of ticket entries per validator.
O = 8: The maximum number of items in the authorizations pool.
P = 6: The slot period, in seconds.
Q = 80: The number of items in the authorizations queue.
R = 10: The rotation period of validator-core assignments, in timeslots.
S = 1024: The maximum number of entries in the accumulation queue.
U = 5: The period in timeslots after which reported but unavailable work may be replaced.
V = 1023: The total number of validators.
WB = 12 ⋅ 2^20: The maximum size of an encoded work-package together with its extrinsic data and import implications, in octets.
WC = 4,000,000: The maximum size of service code in octets.
WE = 684: The basic size of erasure-coded pieces in octets. See equation H.6.
WG = WP WE = 4104: The size of a segment in octets.
WM = 2^11: The maximum number of entries in a work-package manifest.
WP = 6: The number of erasure-coded pieces in a segment.
WR = 48 ⋅ 2^10: The maximum total size of all output blobs in a work-report, in octets.
WT = 128: The size of a transfer memo in octets.
X: Context strings, see below.
Y = 500: The number of slots into an epoch at which ticket-submission ends.
ZA = 2: The pvm dynamic address alignment factor. See equation A.13.
ZI = 2^24: The standard pvm program initialization input data size. See equation A.7.
ZP = 2^12: The pvm memory page size. See equation 4.24.
ZZ = 2^16: The standard pvm program initialization zone size. See section A.7.
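Several of the constants above are related by simple arithmetic, which the following sketch makes explicit. The Python names and the two helper functions are illustrative assumptions; the numeric values are those listed above, with the exponents written out and GT read as 3,500,000,000.

```python
# Values transcribed from the constants list; names are illustrative.
C = 341                # total number of cores
V = 1023               # total number of validators
E = 600                # epoch length, in timeslots
P = 6                  # slot period, in seconds
D = 28_800             # preimage expunge period, in timeslots
L = 14_400             # maximum lookup-anchor age, in timeslots
G_A = 10_000_000       # gas per work-report Accumulation
G_T = 3_500_000_000    # total gas across all Accumulation
W_E = 684              # erasure-coded piece size, in octets
W_P = 6                # pieces per segment
W_G = W_P * W_E        # segment size, in octets
W_B = 12 * 2**20       # maximum encoded work-package size, in octets
W_M = 2**11            # maximum work-package manifest entries
W_R = 48 * 2**10       # maximum total output blob size, in octets
Z_I = 2**24            # pvm initialization input data size
Z_P = 2**12            # pvm memory page size
Z_Z = 2**16            # pvm initialization zone size


def slots_to_seconds(slots: int) -> int:
    """Convert a duration in timeslots to seconds using the slot period P."""
    return slots * P


def always_accumulate_headroom() -> int:
    """Gas left in G_T after every core receives G_A.

    The stated bound on G_T implies that the summed gas allowances of
    the always-accumulate services must fit within this remainder.
    """
    return G_T - G_A * C


assert W_G == 4104                  # matches the listed segment size
assert V == 3 * C                   # three validators per core
assert always_accumulate_headroom() >= 0
```

Under these values an epoch lasts `slots_to_seconds(E)` = 3,600 seconds, the lookup anchor may be up to 24 hours old, an unreferenced preimage may be expunged after 48 hours, and 90,000,000 gas remains for always-accumulate services.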