
RISC-V Memory Consistency Model Tutorial

Dan Lustig
May 7, 2018
ABOUT ME

• Senior Research Scientist at NVIDIA in Santa Clara, CA

• PhD from Princeton in 2015
• Chair of RISC-V Memory Consistency Model task group
• Co-responsible for NVIDIA GPU memory consistency model

OUTLINE
• Setting the Stage
• Litmus Tests
• RISC-V Weak Memory Ordering (“RVWMO”)
• Extensions: “Zam” and “Ztso”
• Documentation and Tools
• Conclude

WHAT IS A MEMORY CONSISTENCY MODEL?

Specifies the values that can be returned by loads

WHY DO WE NEED A MEMORY MODEL?

…to give everyone a headache?

For the same reason we need any other technical specification: it is (one specific part of) the contract between the software and the implementation about the set of legal behaviors.

The other parts of the contract are defined by the rest of the ISA specification (including the ISA Formal Specification; see that TG’s tutorial later today).

A WIDE RANGE OF MEMORY MODELS
[Figure: a spectrum of memory models. From the end labeled "Hard for Implementers" and "Low Performance" to the end labeled "Hard for Programmers", the models shown are: Sequential Consistency, Total Store Ordering (TSO), RISC-V (RVWMO), IBM Power, NVIDIA GPUs.]
Annotation on the figure: there is a big cliff here called “multi-copy-atomicity”.
Note: diagram obviously not to scale, just a rough picture.

SOME CASES ARE EASY (RELATIVELY)…
Initial condition on both harts: s0 == address x; s1 == address y.
Initial conditions in memory: all locations initialized to 0.

Hart 0:
    li    t1, 1
    sw    t1, 0(s0)
    fence w,w
    sw    t1, 0(s1)

Hart 1:
    loop:
    lw    a0, 0(s1)
    beqz  a0, loop
    fence r,r
    lw    a1, 0(s0)

Final output: what are the possible final values of a0 and a1 on hart 1?
The only possible outcome is a0 == a1 == 1.

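The reasoning behind that answer can be checked with a small Python sketch. It assumes a deliberately simplified model: each memory operation takes effect at a single point in one global order, only the final (loop-exiting) iteration of the load of y is modeled, and the two fences are represented as ordering constraints. The names `OPS` and `outcomes` are illustrative and do not come from the specification or its tools.

```python
from itertools import permutations

# Memory operations of the test: (name, kind, address, value written).
OPS = [
    ("Wx", "store", "x", 1),   # hart 0: sw t1, 0(s0)
    ("Wy", "store", "y", 1),   # hart 0: sw t1, 0(s1)
    ("Ry", "load", "y", None),  # hart 1: lw a0, 0(s1)  (last loop iteration)
    ("Rx", "load", "x", None),  # hart 1: lw a1, 0(s0)
]

def outcomes(ordered_pairs):
    """Return every (a0, a1) reachable over total orders respecting ordered_pairs."""
    results = set()
    for order in permutations(OPS):
        pos = {op[0]: i for i, op in enumerate(order)}
        if any(pos[a] > pos[b] for a, b in ordered_pairs):
            continue  # this order violates one of the required orderings
        mem = {"x": 0, "y": 0}
        regs = {}
        for name, kind, addr, val in order:
            if kind == "store":
                mem[addr] = val
            else:
                regs[name] = mem[addr]
        results.add((regs["Ry"], regs["Rx"]))
    return results

# fence w,w orders Wx before Wy; fence r,r orders Ry before Rx.
with_fences = outcomes([("Wx", "Wy"), ("Ry", "Rx")])
without_fences = outcomes([])

# The loop only exits once a0 is nonzero, so keep only those outcomes.
exits_with = {o for o in with_fences if o[0] == 1}
exits_without = {o for o in without_fences if o[0] == 1}
print(exits_with, exits_without)
```

With both fences in place the only loop-exiting outcome is (1, 1). In this simplified model, dropping the fences also admits (1, 0), which is the stale read the fences are there to prevent.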
SOME CASES ARE HARD…
• Should this outcome be permitted or forbidden? We’re not even sure ourselves…

ARCHITECTURE VS. MICROARCHITECTURE

An implementation can do anything it wants under the covers, as long as the load return values satisfy RVWMO.

I.e., implementations can speculate past a lot of these rules, as long as they make sure to, e.g., squash and replay whenever the violation might actually become observable.

OPERATIONAL VS. AXIOMATIC
In modern practice, at ISA level, two common modeling approaches:
Axiomatic: define a set of criteria (“axioms”) to be satisfied
• Executions permitted unless they fail one or more axioms
Operational: define a golden abstract machine model
• Executions forbidden unless producible when executing this model

Ideally: figure out how to meet in the middle (can be difficult!)

• lots of gray area, obscure code, etc.
SEQUENTIAL CONSISTENCY [LAMPORT ‘79]

Axiomatic
1. There is a total order on all memory operations. The order is non-deterministic. (Global memory order)
2. That total order respects program order. (Preserved Program Order, PPO)
3. Loads return the value written by the latest store to the same address in the total order. (Load Value Axiom)

Operational
1. Harts take turns executing instructions. The order is non-deterministic.
2. Each hart executes its own instructions in order.
3. Loads return the value written by the most recent preceding store to the same address.

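The operational description of sequential consistency is small enough to run directly. The sketch below is only an illustration: it uses the classic store-buffering litmus test, which is a standard example rather than one taken from these slides, and the helper names `interleavings` and `sc_outcomes` are made up for this purpose. It enumerates every way two harts can take turns and collects the values the loads return.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list in its own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Store-buffering test: each hart stores 1 to one location, then loads the other.
HART0 = [("store", "x", 1), ("load", "y", "r0")]
HART1 = [("store", "y", 1), ("load", "x", "r1")]

def sc_outcomes():
    """Return the set of (r0, r1) results over all sequentially consistent schedules."""
    results = set()
    for schedule in interleavings(HART0, HART1):
        mem = {"x": 0, "y": 0}
        regs = {}
        for kind, addr, arg in schedule:
            if kind == "store":
                mem[addr] = arg          # stores update memory immediately
            else:
                regs[arg] = mem[addr]    # loads read the most recent store
        results.add((regs["r0"], regs["r1"]))
    return results

SC_RESULTS = sc_outcomes()
print(sorted(SC_RESULTS))
```

Under this model the outcome r0 == r1 == 0 never appears: whichever store runs first is already in memory by the time the other hart's load executes.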
GLOBAL MEMORY ORDER

A total order over all memory operations in a program

A memory operation “performs” (enters the global memory order) when:
• a load determines its return value
• a store becomes globally visible

TOTAL STORE ORDERING (SPARC, X86, RVTSO)

Axiomatic
1. There is a total order on all memory operations. The order is non-deterministic. (Global memory order)
2. That total order respects program order, except Store→Load ordering. (Preserved Program Order, PPO)
3. Loads return the value written by the latest store to the same address in program or memory order (whichever is later). (Load Value Axiom)

Operational
1. Harts take turns executing steps. The order is non-deterministic.
2. Each hart executes its own instructions in order.
3. Stores execute in two steps: 1) enter store buffer, 2) drain to memory.
4. Loads first try to forward from the store buffer. If that fails, they return the value written by the most recent preceding store to the same address.

ADDING A STORE BUFFER
If a load bypasses a store in the (FIFO) store buffer, then the load appears before the store in global memory order. The load determines its return value before the store becomes globally visible.

The performance win is too important…the model needs to be changed to account for this!

[Sewell et al., CACM ‘10]

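The four operational steps for TSO can be sketched as a tiny state-space search. This is a rough model under stated assumptions, not the formal x86-TSO or RVTSO definition: each hart has one FIFO store buffer, a step is either "execute the next instruction" or "drain the oldest buffered store", and the program is the standard store-buffering litmus test (chosen here for illustration, not taken from the slides). The function name `tso_outcomes` is hypothetical.

```python
def tso_outcomes():
    """Explore every schedule of the store-buffering test on a store-buffer machine."""
    programs = (
        (("store", "x", 1), ("load", "y", "r0")),   # hart 0
        (("store", "y", 1), ("load", "x", "r1")),   # hart 1
    )
    results = set()
    # State: (program counters, store buffers, memory, registers), all as tuples.
    start = ((0, 0), ((), ()), (("x", 0), ("y", 0)), ())
    stack, seen = [start], set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem, regs = state
        if all(pcs[h] == len(programs[h]) and not bufs[h] for h in (0, 1)):
            final = dict(regs)
            results.add((final["r0"], final["r1"]))
            continue
        for h in (0, 1):
            # Step kind 1: execute the hart's next instruction, in program order.
            if pcs[h] < len(programs[h]):
                kind, addr, arg = programs[h][pcs[h]]
                new_pcs = tuple(pc + 1 if i == h else pc for i, pc in enumerate(pcs))
                if kind == "store":
                    # The store enters the hart's own buffer, not memory.
                    new_bufs = tuple(b + ((addr, arg),) if i == h else b
                                     for i, b in enumerate(bufs))
                    stack.append((new_pcs, new_bufs, mem, regs))
                else:
                    # Forward from the youngest matching buffered store, else read memory.
                    pending = [v for a, v in bufs[h] if a == addr]
                    value = pending[-1] if pending else dict(mem)[addr]
                    new_regs = tuple(sorted(regs + ((arg, value),)))
                    stack.append((new_pcs, bufs, mem, new_regs))
            # Step kind 2: drain the oldest buffered store to memory.
            if bufs[h]:
                addr, val = bufs[h][0]
                new_mem = dict(mem)
                new_mem[addr] = val
                new_bufs = tuple(b[1:] if i == h else b for i, b in enumerate(bufs))
                stack.append((pcs, new_bufs, tuple(sorted(new_mem.items())), regs))
    return results

TSO_RESULTS = tso_outcomes()
print(sorted(TSO_RESULTS))
```

Unlike the sequentially consistent machine, this one can produce r0 == r1 == 0: both stores sit in their buffers while both loads read the old values from memory. That is the Store→Load relaxation described above.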
RISC-V WEAK MEMORY ORDERING (RVWMO)

Axiomatic
1. There is a total order on all memory operations. The order is non-deterministic. (Global memory order)
2. That total order respects thirteen specific patterns (next slide). (Preserved Program Order, PPO)
3. Loads return the value written by the latest store to the same address in program or memory order (whichever is later). (Load Value Axiom)

Operational
1. Harts take turns executing steps. The order is non-deterministic.
2. Each hart executes its own instructions in order.
3. Multiple steps for each instruction (see spec Appendix B).
4. Loads first try to forward from the store buffer. If that fails, they return the value written by the most recent preceding store to the same address.

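RVWMO PPO RULES IN A NUTSHELL
• Preserved Program Order: if A appears before B in program order, and A and B match one of the patterns below, then A appears before B in global memory order.

Rule 1: A and B overlap, and B is a store
Rule 2: A and B are overlapping loads, except “rsw” and “fri;rfi”
Rule 3: A is an AMO or SC, and B is an overlapping load
Rule 4: A and B are separated by a fence, with pr/pw/sr/sw set appropriately
Rule 5: A has .aq
Rule 6: B has .rl
Rule 7: A has .rl and B has .aq (RCsc)
Rule 8: A is an LR and B is the paired SC
Rules 9-11: B has an address, control, or data dependency on A, except control dependencies where B is a load
Rules 12-13: “(addr|data);rfi” or “addr;po”
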
PPO RULE 1
[Diagram: A, overlap, B is a store]

If A and B access the same address (or have any overlapping footprint), and B is a store, then A must appear before B in global memory order:
• A load A must determine its value before B becomes globally visible
• A store A must become globally visible before B becomes globally visible

PPO RULE 3
[Diagram: A is an AMO or SC, overlap, B is a load]

A load B cannot determine its return value by forwarding from an Atomic Memory Operation or Store-Conditional operation that has not yet become globally visible.

(Recall: this defines the architectural rules. Implementations can do whatever they want, as long as all outcomes are legal.)

PPO RULE 4
[Diagram: A, fence, B, with pr/pw/sr/sw set appropriately]

fence [i][o][r][w], [i][o][r][w]
Orders operations in the predecessor set before operations in the successor set
PR: previous reads. SR: subsequent reads
PW: previous writes. SW: subsequent writes
PI: previous I/O reads. SI: subsequent I/O reads
PO: previous I/O writes. SO: subsequent I/O writes

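As a rough illustration of the fence rule, the following Python sketch computes which pairs of operations in one hart's program are ordered by a fence. The encoding is an assumption made for this example: each operation is tagged with one of the letters r, w, i, o, and a fence is written as a tuple `("fence", pred, succ)` whose predecessor and successor sets are strings of those letters. `fence_ppo` is a hypothetical helper, not part of any RISC-V tool.

```python
def fence_ppo(program):
    """Return the (A, B) index pairs that a fence orders under PPO rule 4.

    program is a list in program order whose entries are either
    (kind, label) with kind in "r", "w", "i", "o", or ("fence", pred, succ).
    """
    pairs = set()
    for f, instr in enumerate(program):
        if instr[0] != "fence":
            continue
        _, pred, succ = instr
        for a in range(f):                       # everything before the fence
            if program[a][0] == "fence":
                continue
            for b in range(f + 1, len(program)):  # everything after the fence
                if program[b][0] == "fence":
                    continue
                if program[a][0] in pred and program[b][0] in succ:
                    pairs.add((a, b))
    return pairs

# Two stores separated by fence w,w: the stores are ordered.
producer = [("w", "x"), ("fence", "w", "w"), ("w", "y")]
# fence r,rw orders the earlier load, but not the earlier store, before later accesses.
mixed = [("r", "a"), ("w", "b"), ("fence", "r", "rw"), ("r", "c"), ("w", "d")]

PRODUCER_PAIRS = fence_ppo(producer)
MIXED_PAIRS = fence_ppo(mixed)
print(PRODUCER_PAIRS, MIXED_PAIRS)
```

The second example shows why the predecessor and successor sets matter: the store to b is not in the predecessor set of `fence r,rw`, so this fence places no ordering on it.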
PPO RULES 5-7
[Diagram: A.aq before B; A before B.rl; A.rl before B.aq (RCsc)]

AMOs and LR/SC have optional acquire and release annotations for release consistency
• All operations following an acquire in program order also follow it in global memory order
• All operations preceding a release in program order also precede it in global memory order
• A release that precedes an acquire in program order also precedes it in global memory order
  • i.e., the RCsc variant of release consistency

PPO RULES 9-11
[Diagram: A, addr/ctrl/data dependency, B, except ctrl deps. where B is a load]

If B has a syntactic address, control, or data dependency on A, then A precedes B in global memory order
• Except control dependencies where B is a load
• Address dependency: the result of A is used to determine the address accessed by B
• Control dependency: the result of A feeds a branch that determines whether B is executed at all
• Data dependency: the result of A is used to determine the value written by store B
Note: ordering maintained regardless of actual values!

PPO RULES 12-13
A precedes B in global memory order in either of these cases:
1. B follows M in program order, and M has an address dependency on A (“addr;po”)
2. B returns a value from an earlier store M in the same hart, and M has an address or data dependency on A (“(addr|data);rfi”)

Most processors will maintain these naturally, yet most programmers won’t ever use them anyway
We made them explicit rules so that the operational and axiomatic models all agree
• And also for Linux, which has similar rules too

PPO RULE 8
[Diagram: A is an LR, B is the paired SC]

A load-reserved operation determines its value before the paired store-conditional becomes globally visible

(Mostly redundant with rules 1 and 11, except in rare cases of mismatched addresses and no data dependency)

PPO RULE 2
[Diagram: A is a load, overlap, B is a load, except “rsw” and “fri;rfi”]

Same-address load-load ordering is also maintained, with two exceptions:
1. Both return values come from the same store (“rsw”)
   • A form of architecturally-visible speculation
   • Common in many implementations
2. B forwards from a store M between A and B in program order (“fri;rfi”)
   • B can determine its value from the store buffer while A is still fetching an older value from memory

ATOMICITY OF AMO AND LR/SC
AMOs grab an old value in memory, perform an arithmetic operation (except for
swap), and write the new value to memory, all in one single atomic operation
• One node in the global memory order

LR grabs a reservation. SC performs a store if the reservation is still valid, and then
releases the reservation.

• A reservation can be killed for any reason. A reservation must be killed if there is
a store to the reserved address range from any other hart.

• Certain constrained LR/SC sequences guaranteed to eventually succeed (see spec)

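The reservation behavior described above can be mimicked with a toy Python class. This is only a sketch under simplifying assumptions: memory is a single word, each hart holds at most one reservation, and a reservation is killed only by another hart's store (a real implementation may kill it for any reason). The class `Memory` and its methods are illustrative names. The return convention of `sc` (0 on success, nonzero on failure) follows the RISC-V SC instruction.

```python
class Memory:
    """Toy model: one word of memory plus a set of harts holding reservations."""

    def __init__(self, value=0):
        self.value = value
        self.reserved = set()

    def lr(self, hart):
        """Load-reserved: return the current value and grab a reservation."""
        self.reserved.add(hart)
        return self.value

    def store(self, hart, value):
        """Ordinary store: kills the reservation of every other hart."""
        self.value = value
        self.reserved &= {hart}

    def sc(self, hart, value):
        """Store-conditional: return 0 on success, 1 on failure."""
        if hart not in self.reserved:
            return 1
        self.store(hart, value)
        self.reserved.discard(hart)  # the reservation is released either way
        return 0

mem = Memory(5)
old = mem.lr(0)
mem.store(1, 9)                  # hart 1 writes the reserved location
first = mem.sc(0, old + 1)       # fails: the reservation was killed
old = mem.lr(0)                  # retry, as an LR/SC loop would
second = mem.sc(0, old + 1)      # succeeds this time
print(first, second, mem.value)
```

The failed first attempt is what keeps the increment atomic: hart 0 never overwrites hart 1's store with a value computed from stale data.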
PROGRESS AXIOM

No operation can be preceded in the global memory order by an infinite sequence of operations from other harts
• Very intentionally the weakest forward progress guarantee
that is needed to make the memory model work
• Does not imply any stronger notion of fairness!

…AND THAT’S IT!

MEMORY MODEL ISA EXTENSIONS

• “Zam” extends “A” by permitting misaligned AMOs

• “A” without “Zam” now forbids misaligned AMOs or LR/SC pairs

• “Ztso” strengthens the baseline memory model to TSO

• TSO-only code is not backwards-compatible with RVWMO

ONGOING/FUTURE WORK
• Mixed-size, partially-overlapping memory accesses
• Formalize instruction fetches and FENCE.I, TLB flushes and SFENCE.VMA, etc.
• Integration with other extensions (V, J, N, T, …)
• Integration with the ISA formalization task group’s effort
• Cache flush/writeback/etc. operations
• (The task group logistics for all this are still TBD)
DOCUMENTATION & TOOLS
• Appendix A: two dozen pages explaining the details in plain English
• Appendix B: two axiomatic models and one operational model, with associated tools (Alloy, herd, rmem)

• More than 7000 litmus tests online

• (also to be used to test compliance)

MEMORY MODEL RATIFICATION TIMELINE
• Released for public review on 5/2/18
• Foundation requires at least 45 days for public review. This will end no earlier than 6/16/18.
• If you have comments or feedback:
• send to isa-dev
• send as a PR or issue on riscv-isa-manual GitHub repo
• send to me directly
TOTAL STORE ORDERING (SPARC, X86, RVTSO)
Axiomatic:
ppo := (program order) \ (W→R)
acyclic(ppo ∪ rfe ∪ co ∪ fr ∪ fence)
acyclic(po_loc ∪ rf ∪ co ∪ fr)
Operational: [diagram not recoverable]

[Alglave et al., TOPLAS ‘09] [Sewell et al., CACM ‘10]
RVWMO
Axiomatic (App. B.2):
ppo := (13 rules, on next slide)
acyclic(ppo ∪ rfe ∪ co ∪ fr)
acyclic(po_loc ∪ rf ∪ co ∪ fr)
Operational (App. B.3): [diagram not recoverable]

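To make the first acyclicity condition concrete, here is a minimal Python sketch that checks one candidate execution. The execution is the message-passing test from earlier in the deck with the outcome a0 == 1, a1 == 0. The relation contents below are written out by hand for this example (the names `ppo`, `rfe`, `fr` mirror the slide, and `co` is empty because each location has a single store besides its initial value). `is_acyclic` is an illustrative helper, not the herd or Alloy implementation.

```python
def is_acyclic(edges):
    """Return True if the directed graph given as (src, dst) pairs has no cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    state = {}  # node -> "visiting" while on the DFS stack, "done" afterwards

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # reached a node already on the stack: cycle
        state[node] = "visiting"
        ok = all(visit(nxt) for nxt in graph[node])
        state[node] = "done"
        return ok

    return all(visit(node) for node in graph)

# Candidate execution: hart 1 sees the new y (a0 == 1) but the old x (a1 == 0).
ppo = {("Wx", "Wy"), ("Ry", "Rx")}   # from fence w,w and fence r,r
rfe = {("Wy", "Ry")}                 # Ry reads from hart 0's store to y
fr = {("Rx", "Wx")}                  # Rx reads the initial x, which Wx overwrites
co = set()

forbidden_with_fences = not is_acyclic(ppo | rfe | co | fr)
allowed_without_fences = is_acyclic(rfe | co | fr)
print(forbidden_with_fences, allowed_without_fences)
```

With the fence edges included, the four operations form the cycle Wx, Wy, Ry, Rx, Wx, so the execution is rejected. Without them the graph is acyclic and this condition no longer rules the outcome out.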
MULTI-COPY ATOMICITY
A load may only return a value from:
• An earlier store from the same hart (“hardware thread”)
• A store that is globally visible

In other words, a load may not “peek” into a neighbor hart’s private store buffer

WHO FEELS THE PAIN?
[Diagram: the C/C++ memory model, the Java memory model, and the Linux memory model each reach the RISC-V ISA Memory Consistency Model through a canonical mapping; synchronization libraries reach it through a hand mapping; and so on.]

• Misconception: end users will have to deal with the memory model
• Reality: end users rarely interact with the ISA memory model directly
• Burden falls instead on library/compiler writers and microarchitects
MEMORY MODEL TASK GROUP PROGRESS
• May 2017 Workshop: Formed the task group
(…debate…)

• November 2017 Workshop: Settled on the basics

• RVWMO baseline, and optional RVTSO extension
(…refinement…)

• May 2018 Workshop: released for ratification!

• Public review period runs May 2 through June 16
RISC-V MEMORY MODEL SPECIFICATION

• Chapter 6: RISC-V Weak Memory Ordering (“RVWMO”)

• Chapter 20: “Zam” Std. Extension for Misaligned AMOs

• Chapter 21: “Ztso” Std. Extension for Total Store Ordering

• Appendix A: Explanatory Material and Litmus Tests

• Appendix B: Formal Memory Model Specifications
RVWMO RULES IN A NUTSHELL

• Load Value Axiom: each byte of each load i returns the value written to that byte by the store that is the latest in global memory order among the following stores:
  1. Stores that write that byte and that precede i in the global memory order
  2. Stores that write that byte and that precede i in program order

• Atomicity Axiom: no store from another hart can appear in the global memory order between a paired LR and successful SC
  • (this axiom simplified here for clarity…see spec for complete definition)

• Progress Axiom: no memory operation may be preceded in the global memory order by an infinite sequence of other memory operations
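The Load Value Axiom can be read almost directly as code. The sketch below is a simplification that treats every access as a single byte and assumes locations start at 0, as in the litmus test earlier in the deck. The function `load_value` and its parameters are hypothetical names introduced for this example.

```python
def load_value(load, addr, stores, gmo, po, initial=0):
    """Value returned by `load` from `addr` under a single-byte Load Value Axiom.

    stores: dict mapping store name -> (address, value)
    gmo:    names of all operations, in global memory order
    po:     names of the operations of the load's own hart, in program order
    """
    before_in_gmo = set(gmo[:gmo.index(load)])
    before_in_po = set(po[:po.index(load)])
    # Candidate stores: same address, and earlier in either of the two orders.
    candidates = [s for s in gmo
                  if s in stores and stores[s][0] == addr
                  and (s in before_in_gmo or s in before_in_po)]
    # gmo is already ordered, so the last candidate is the latest one.
    return stores[candidates[-1]][1] if candidates else initial

stores = {"S": ("x", 1)}

# Same hart: the load bypasses its own buffered store in global memory order,
# yet still sees it because the store precedes the load in program order.
forwarded = load_value("L", "x", stores, gmo=["L", "S"], po=["S", "L"])

# Another hart: the store follows the load in global memory order and is
# not in that hart's program order, so the load returns the initial value.
remote = load_value("L2", "x", stores, gmo=["L2", "S"], po=["L2"])
print(forwarded, remote)
```

The two calls differ only in whose program order the store belongs to, which is exactly the "program or memory order (whichever is later)" distinction that makes store-buffer forwarding legal.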
