Porting NetBSD on the open source LatticeMico32 CPU
Yann Sionneau
M-Labs
@ EHSM 2014
About me
• Yann Sionneau
• Embedded software developer
• Working at Sequans Communications
• M-Labs contributor
• @yannsionneau on Twitter
• Email: yann.sionneau@gmail.com
I’m going to talk about…
How to run NetBSD and EdgeBSD on the Milkymist One
Agenda
• I) The hardware part: the MMU
–What an MMU is and how it works
• II) The software part
–How to port NetBSD to a new CPU
Milkymist One?!
The Milkymist One uses an FPGA
What’s an FPGA??
• A chip
FPGA internals
Milkymist System-on-Chip
LatticeMico32 CPU
• 32-bit Harvard architecture RISC
• Big-endian
• 6-stage pipeline
• Fully bypassed
• Optional configurable I/D caches
– Direct mapped or
– 2-way set associative
• Wishbone on-chip bus
LatticeMico32, Good points
• Small
• Portable (works with several FPGA vendors)
• Fast (~100 MHz on Spartan 6)
• Actually works
• GCC/Binutils/GDB/QEMU/uClinux/OpenWrt support
• OPEN SOURCE
LatticeMico32, Bad points
• No Memory Management Unit… yet!
– Done!
Used in…
• Closed source commercial ASICs
• Open source projects
• Can achieve 800 MHz in a TSMC 90 nm standard cell process
LatticeMico32 pipeline
What’s a pipeline?
• « In computing, a pipeline is a set of data processing elements connected in series, where the output of one element is the input of the next one. »
-- Pipeline (computing), Wikipedia
IN -> [Data processing element 1] -> [Data processing element 2] -> [Data processing element 3] -> OUT
$ cat .bash_history | grep 'cat' | wc -l
6
What’s a CPU pipeline?
Pipelined instruction execution
Instr. number    Pipeline stage
1                A F D X M W
2                  A F D X M W
3                    A F D X M
4                      A F D X
Clock cycle      1 2 3 4 5 6 7
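The staircase in this diagram follows a simple rule, sketched below in C. This is only an illustration under the assumption of an ideal pipeline (one instruction issued per cycle, no stalls); the function name `stage_at` is made up for this example and does not come from the talk.

```c
#include <assert.h>

/* Stage letters as used on the slides, in pipeline order. */
static const char STAGES[] = "AFDXMW";
#define NUM_STAGES (sizeof(STAGES) - 1)

/*
 * Return the stage letter occupied by instruction `instr` (1-based)
 * during clock cycle `cycle` (1-based), or ' ' when the instruction is
 * not in the pipeline during that cycle.
 * Assumption: ideal pipeline, one instruction issued per cycle, no stalls.
 */
static char stage_at(unsigned instr, unsigned cycle)
{
    assert(instr >= 1 && cycle >= 1);

    if (cycle < instr)
        return ' ';                     /* not issued yet */

    unsigned stage = cycle - instr;     /* 0 means first stage */
    if (stage >= NUM_STAGES)
        return ' ';                     /* already left the pipeline */

    return STAGES[stage];
}
```

Calling `stage_at(row, column)` for instructions 1 to 4 and cycles 1 to 7 gives back the table: once the pipeline is full, every stage works on a different instruction in the same cycle.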
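[Diagram, before: the CPU internals send physical addresses (PA) straight to main memory]
[Diagram, after: the CPU internals issue virtual addresses, which are translated into physical addresses before reaching main memory; the translation step can raise an exception]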
What’s the MMU’s job?
• Translate « virtual addresses » into « physical addresses »
• Protect memory against unwanted code execution or data writes (e.g. caused by a software bug or a security issue)
– Memory access rights management
[Diagram: the CPU pipeline issues a VA, which is translated into a PA before it reaches main memory]
VA: Virtual Address
PA: Physical Address
How does the MMU know the VA -> PA translation?
• Page Table
Why « Page »?
• 0x00000004 -> 0x10000000
• 0x00000005 -> 0x10000001
• 0x00000006 -> 0x10000002
Etc…
This is WRONG!!! Addresses are not translated one by one but page by page:
• 0x00000*** -> 0x10000***
• 0x00001*** -> 0x10001***
• 0x00002*** -> 0x10002***
Etc…
[Diagram: the same VA -> PA path between the CPU pipeline and main memory, now with a page table and a TLB]
TLB: Translation Lookaside Buffer
• The TLB gets its information from the page table
• The operating system updates the page table
• The operating system updates the TLB
Features?
• Page size
–Only 4 kB
32-bit physical address:
xxxxxxxx xxxxxxxx xxxx | xxxx xxxxxxxx
How many bits of an address indicate the offset within a given page?
– Page number [31:12]: 20 bits
– Offset [11:0]: 12 bits
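The split between page number and offset is plain bit arithmetic. The following is a minimal sketch, assuming 4 kB pages as stated above; the macro and function names are illustrative and not taken from the NetBSD/lm32 sources.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                    /* 4 kB pages: 2^12 bytes */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1u)       /* selects the low 12 bits */

/* Bits [31:12] of the address: which page. */
static uint32_t page_number(uint32_t addr)
{
    return addr >> PAGE_SHIFT;
}

/* Bits [11:0] of the address: where inside the page. */
static uint32_t page_offset(uint32_t addr)
{
    return addr & PAGE_MASK;
}

/* Rebuild an address from a 20-bit page number and a 12-bit offset. */
static uint32_t make_address(uint32_t pn, uint32_t offset)
{
    assert(pn < (1u << 20) && offset < PAGE_SIZE);
    return (pn << PAGE_SHIFT) | offset;
}
```

Translating an address then means replacing the page number and keeping the offset untouched, which is why a single mapping covers 4096 consecutive bytes.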
Features?
• 2 TLBs (Translation Lookaside Buffers)
–ITLB
–DTLB
• Each TLB contains 1024 entries
–How many bits needed to index the TLB?
10 bits!
Features?
• No hardware page-tree walker
– i.e. TLB is software assisted
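A software-assisted TLB can be modelled roughly as follows. This is a simplified sketch, not the LM32 hardware interface nor the NetBSD code: the flat `page_table` array, the structure layouts and all function names are assumptions made for illustration, and the TLB stores the whole virtual page number instead of a tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12u
#define TLB_ENTRIES 1024u
#define PT_ENTRIES  16u     /* tiny hypothetical page table for the sketch */

struct tlb_entry { bool valid;   uint32_t vpn; uint32_t ppn; };
struct pte       { bool present; uint32_t vpn; uint32_t ppn; };

static struct tlb_entry tlb[TLB_ENTRIES];
static struct pte page_table[PT_ENTRIES];   /* owned by the OS */

/* Hardware side: consult the TLB only, never the page table. */
static bool tlb_lookup(uint32_t va, uint32_t *pa)
{
    assert(pa != NULL);
    uint32_t vpn = va >> PAGE_SHIFT;
    const struct tlb_entry *e = &tlb[vpn % TLB_ENTRIES];

    if (!e->valid || e->vpn != vpn)
        return false;                       /* miss: raise an exception */

    *pa = (e->ppn << PAGE_SHIFT) | (va & ((1u << PAGE_SHIFT) - 1u));
    return true;
}

/* OS side: TLB-miss exception handler, walks the page table in software. */
static bool tlb_miss_handler(uint32_t va)
{
    uint32_t vpn = va >> PAGE_SHIFT;

    for (uint32_t i = 0; i < PT_ENTRIES; i++) {
        if (page_table[i].present && page_table[i].vpn == vpn) {
            struct tlb_entry *e = &tlb[vpn % TLB_ENTRIES];
            e->valid = true;
            e->vpn = vpn;
            e->ppn = page_table[i].ppn;
            return true;                    /* retry the faulting access */
        }
    }
    return false;                           /* no mapping: real page fault */
}
```

The first access to a page misses, the handler installs the translation, and the retried access hits. The hardware never needs to know how the page table is organised.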
[Diagram: the MMU seen as a box with inputs and outputs]
Inputs:
• Virtual address
• Load or store?
• Instruction or data?
Outputs:
• Physical address
• Access granted/denied
• Or: « I don’t know! »
Let’s have a look inside
The TLB
Tag [10]   Physical page number [20]   Read-only [1]   Valid [1]
0xABC      0xABC00                     0               0
0x280      0xB0001                     1               1
0x300      0x00001                     0               1

VA = 0xA0001004
• Page offset = low 12 bits of the VA = 0x004 = 4
• Virtual page number (VPN) = 0xA0001 -> 1010 0000 00 | 00 0000 0001
• TLB index = low 10 bits of the VPN = 00 0000 0001 = 1, used to select a TLB line
• Tag = high 10 bits of the VPN = 1010 0000 00 = 0x280, equal to the tag stored in the selected line
• Physical page number = 0xB0001
• Physical Address = 0xB0001004
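The lookup walked through above can be written down as a short C model. It is a sketch only: the structure layout, the names and the rule that a store to a read-only page is refused are assumptions for illustration, not a description of the actual LM32 MMU registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT     12u
#define TLB_INDEX_BITS 10u
#define TLB_ENTRIES    (1u << TLB_INDEX_BITS)    /* 1024 lines */

struct tlb_line {
    uint32_t tag;        /* high 10 bits of the virtual page number */
    uint32_t ppn;        /* 20-bit physical page number */
    bool     read_only;
    bool     valid;
};

enum tlb_result { TLB_HIT, TLB_MISS, TLB_WRITE_DENIED };

/* Direct-mapped lookup: the index selects one line, the tag confirms it. */
static enum tlb_result tlb_translate(const struct tlb_line *tlb, uint32_t va,
                                     bool is_store, uint32_t *pa)
{
    assert(tlb != NULL && pa != NULL);

    uint32_t offset = va & ((1u << PAGE_SHIFT) - 1u);   /* VA[11:0]  */
    uint32_t vpn    = va >> PAGE_SHIFT;                 /* VA[31:12] */
    uint32_t index  = vpn & (TLB_ENTRIES - 1u);         /* low 10 bits of VPN */
    uint32_t tag    = vpn >> TLB_INDEX_BITS;            /* high 10 bits of VPN */
    const struct tlb_line *line = &tlb[index];

    if (!line->valid || line->tag != tag)
        return TLB_MISS;
    if (is_store && line->read_only)
        return TLB_WRITE_DENIED;        /* assumed protection rule */

    *pa = (line->ppn << PAGE_SHIFT) | offset;
    return TLB_HIT;
}
```

With line 1 holding tag 0x280 and physical page number 0xB0001, a load from 0xA0001004 hits and yields 0xB0001004, as on the slide.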
Porting NetBSD
• 1°) NetBSD cross compilation toolchain
– build.sh
– Makefiles here and there
– Arch-specific directories
This allows running:
$ ./build.sh -U -m lm32 tools
Porting NetBSD
• 2°) Support for built-ins in libkern
– NetBSD kernel is
• Not linked against libgcc
• Linked against libkern
– Need to implement basic arithmetic functions emitted by gcc in object code
– Implementation in sys/lib/libkern/arch/lm32
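As an example of such a built-in: when the target has no hardware divider, GCC emits calls to helper routines such as `__udivsi3` (unsigned 32-bit division). The sketch below shows one possible shift-and-subtract implementation. The `sketch_` names are hypothetical and chosen to avoid reserved identifiers; this is not the code found in sys/lib/libkern/arch/lm32.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Restoring (shift-and-subtract) division, one quotient bit per iteration.
 * Only uses shifts, compares and subtractions, so it needs no divide
 * instruction. Division by zero is rejected by an assert in this sketch.
 */
static uint32_t sketch_udivmod(uint32_t num, uint32_t den, uint32_t *rem_out)
{
    uint32_t quot = 0;
    uint32_t rem = 0;

    assert(den != 0);

    for (int bit = 31; bit >= 0; bit--) {
        rem = (rem << 1) | ((num >> bit) & 1u);  /* bring down next bit */
        if (rem >= den) {
            rem -= den;
            quot |= 1u << bit;
        }
    }
    if (rem_out != 0)
        *rem_out = rem;
    return quot;
}

/* Same role as __udivsi3: unsigned 32-bit quotient. */
static uint32_t sketch_udivsi3(uint32_t num, uint32_t den)
{
    return sketch_udivmod(num, den, 0);
}

/* Same role as __umodsi3: unsigned 32-bit remainder. */
static uint32_t sketch_umodsi3(uint32_t num, uint32_t den)
{
    uint32_t rem;
    (void)sketch_udivmod(num, den, &rem);
    return rem;
}
```

Because the kernel is linked against libkern instead of libgcc, every helper the compiler may emit for the target has to be provided this way, otherwise the kernel fails to link.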
Porting NetBSD
• 3°) Building my first kernel
– Create sys/arch/lm32 and sys/arch/milkymist
– Populate
• sys/arch/<cpu|soc>/include
• sys/arch/<cpu|soc>/conf
– Stub, stub, stub…
This allows running:
$ ./build.sh -m milkymist -U kernel=GENERIC
Porting NetBSD
• 4°) Write basic console driver for early prints
struct consdev milkymist_com_cons = {
[…]
milkymist_com_cngetc, /* cn_getc: kernel getchar interface */
milkymist_com_cnputc, /* cn_putc: kernel putchar interface */
[…]
};
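An early console "putchar" usually just polls the UART until it can accept a byte. The sketch below illustrates the idea with a hypothetical register block: the `sketch_uart` layout and the status bit are invented for this example and do not describe the real Milkymist UART, and the real `cn_putc` hook takes a device argument instead of a register pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UART register block (NOT the real Milkymist layout). */
struct sketch_uart {
    volatile uint32_t tx_data;   /* write: byte to transmit */
    volatile uint32_t status;    /* bit 0 set when the transmitter is idle */
};

#define SKETCH_UART_TX_IDLE 0x1u

/* Polled output: no interrupts needed, so it works very early at boot. */
static void sketch_cnputc(struct sketch_uart *uart, int c)
{
    assert(uart != NULL);

    while ((uart->status & SKETCH_UART_TX_IDLE) == 0)
        ;                        /* busy-wait until the transmitter is free */

    uart->tx_data = (uint32_t)(unsigned char)c;
}
```

Polling is wasteful but has no dependency on the interrupt or clock code, which at this stage of the port does not exist yet.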
Porting NetBSD
• 5°) Implement exception handlers
• 6°) Call milkymist_startup() C code
– Initialize console driver
• -> consinit() -> milkymist_uart_cnattach()
• cn_tab = &milkymist_com_cons;
– Initialize virtual memory subsystem
• Call MD pmap_bootstrap()
– Let the kernel boot
• Call NetBSD MI main()
Porting NetBSD
• 7°) Implement pmap(9)
pmap -- machine-dependent portion of the virtual memory system
– pmap_bootstrap()
– pmap_init(), pmap_create(), pmap_destroy() …
– SW-managed TLB? -> sys/uvm/pmap/ (used by PowerPC Book E and LM32)
Porting NetBSD
• 8°) Implement copyin/copyout
• 9°) Implement atomic operations
– No atomic instruction -> RAS (Restartable Atomic Sequence) CAS (Compare And Swap)
– Other atomic ops built around this CAS
RAS CAS
int _atomic_cas_32(volatile uint32_t *val, uint32_t old, uint32_t new);

_atomic_cas_32:
_atomic_cas_ras_start:
    lw   r4, (r1+0)     /* load *val into r4 */
    bne  r4, r2, 1f     /* if *val != old (r2), skip the store */
    sw   (r1+0), r3     /* *val == old: store new (r3) into *val */
_atomic_cas_ras_end:
1:
    mv   r1, r4         /* return the value read from *val */
    ret
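Two ideas from this slide can be sketched in C: how another atomic operation is built around the CAS, and how a restartable sequence is "restarted". Everything below is a model for illustration. The `model_` and `ras_fixup_pc` names are hypothetical, and the C version of the CAS is only atomic if the kernel applies the PC fix-up whenever the sequence is interrupted.

```c
#include <assert.h>
#include <stdint.h>

/* C model of the assembly above: returns the value read from *val. */
static uint32_t model_atomic_cas_32(volatile uint32_t *val, uint32_t old,
                                    uint32_t new_val)
{
    uint32_t cur = *val;
    if (cur == old)
        *val = new_val;
    return cur;
}

/* Atomic add built around the CAS: retry until nobody raced with us. */
static uint32_t model_atomic_add_32_nv(volatile uint32_t *val, uint32_t delta)
{
    uint32_t old, new_val;

    do {
        old = *val;
        new_val = old + delta;
    } while (model_atomic_cas_32(val, old, new_val) != old);

    return new_val;                 /* "nv": return the new value */
}

/*
 * RAS fix-up, run when an interrupt or exception returns: if the
 * interrupted PC lies inside [ras_start, ras_end), resume at ras_start so
 * the load/compare/store sequence is executed again from the beginning.
 */
static uintptr_t ras_fixup_pc(uintptr_t pc, uintptr_t ras_start,
                              uintptr_t ras_end)
{
    assert(ras_start <= ras_end);
    return (pc >= ras_start && pc < ras_end) ? ras_start : pc;
}
```

The end label sits right after the store, so a sequence interrupted once the store has completed is not restarted.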
Porting NetBSD
• 10°) Add support for interrupts
– Write a function to register interrupt handlers
• 11°) Have a running system clock
– Write cpu_initclocks()
– Write clock irq handler
• Call hardclock()
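Registering and dispatching interrupt handlers can be sketched with a simple table indexed by interrupt line. The names, the 32-line limit and the pending-bitmask interface are assumptions for this example. A real clock handler would call hardclock(), which is replaced here by a tick counter so the sketch stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NIRQ 32u            /* assumed number of interrupt lines */

typedef void (*sketch_irq_fn)(void *arg);

static struct {
    sketch_irq_fn fn;
    void *arg;
} irq_table[SKETCH_NIRQ];

/* Register a handler for one interrupt line; returns 0 on success. */
static int sketch_intr_establish(unsigned irq, sketch_irq_fn fn, void *arg)
{
    if (irq >= SKETCH_NIRQ || fn == NULL || irq_table[irq].fn != NULL)
        return -1;                 /* bad line, bad handler or line taken */
    irq_table[irq].fn = fn;
    irq_table[irq].arg = arg;
    return 0;
}

/* Called from the interrupt exception handler with the pending bitmask. */
static unsigned sketch_intr_dispatch(uint32_t pending)
{
    unsigned handled = 0;

    for (unsigned irq = 0; irq < SKETCH_NIRQ; irq++) {
        if ((pending & (1u << irq)) != 0 && irq_table[irq].fn != NULL) {
            irq_table[irq].fn(irq_table[irq].arg);
            handled++;
        }
    }
    return handled;
}

/* Stand-in for the clock irq handler; the real one would call hardclock(). */
static void sketch_clock_intr(void *arg)
{
    assert(arg != NULL);
    (*(unsigned *)arg)++;
}
```

In this model, cpu_initclocks() would program the timer and register `sketch_clock_intr` on the timer's interrupt line.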
Other functions to write
• Switch context from one thread to another
– cpu_switchto(9)
• Copy data and abort on page fault
– kcopy(9)
• Save current context
– setfault()
• Low level code to finish up fork() operation
– cpu_lwp_fork(9)
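The relationship between kcopy() and setfault() can be modelled in user space with setjmp()/longjmp(). This is only an analogy under stated assumptions: the fault is injected by a flag instead of coming from the MMU, and `sketch_onfault`, `sketch_page_fault` and `sketch_kcopy` are hypothetical names, not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <setjmp.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-thread hook: where to resume if a page fault occurs. */
static jmp_buf *sketch_onfault;

/* Stand-in for the page fault handler: jump back to the saved context. */
static void sketch_page_fault(void)
{
    if (sketch_onfault != NULL)
        longjmp(*sketch_onfault, 1);
}

/*
 * Copy like memcpy(), but return EFAULT instead of crashing if a fault
 * happens during the copy. `inject_fault` simulates the fault here.
 */
static int sketch_kcopy(const void *src, void *dst, size_t len,
                        int inject_fault)
{
    jmp_buf env;

    assert(src != NULL && dst != NULL);

    sketch_onfault = &env;              /* role played by setfault() */
    if (setjmp(env) != 0) {
        sketch_onfault = NULL;          /* we got here through a fault */
        return EFAULT;
    }

    if (inject_fault)
        sketch_page_fault();            /* pretend the copy faulted */
    memcpy(dst, src, len);

    sketch_onfault = NULL;
    return 0;
}
```

The saved context is what lets the fault handler unwind straight back into the copy routine, which then reports the error to its caller.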
Other functions to write
• Block interrupts to protect critical sections
– spl(9)
• Init CPU and print copyright message
– cpu_startup(9)
• Determine the root file system device
– cpu_rootconf(9)
• Etc…
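The spl(9) idiom (raise the level, run the critical section, restore the saved level) can be modelled like this. The level values and the `sketch_` names are assumptions for illustration. On real hardware, raising the level would also mask the corresponding interrupts in the CPU.

```c
#include <assert.h>

#define SKETCH_IPL_NONE 0
#define SKETCH_IPL_HIGH 7          /* assumed highest priority level */

static int sketch_ipl = SKETCH_IPL_NONE;   /* current priority level */

/* Raise the level (never lower it) and return the previous one. */
static int sketch_splraise(int level)
{
    int old = sketch_ipl;
    if (level > old)
        sketch_ipl = level;
    return old;
}

/* Block all interrupts; returns the level to restore later. */
static int sketch_splhigh(void)
{
    return sketch_splraise(SKETCH_IPL_HIGH);
}

/* Restore a level previously returned by a raise. */
static void sketch_splx(int saved)
{
    assert(saved >= SKETCH_IPL_NONE && saved <= SKETCH_IPL_HIGH);
    sketch_ipl = saved;
}

/* An interrupt is delivered only if its level is above the current one. */
static int sketch_irq_allowed(int irq_level)
{
    return irq_level > sketch_ipl;
}
```

Typical use is `int s = sketch_splhigh(); /* critical section */ sketch_splx(s);`, which nests correctly because each caller restores the level it found.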
Porting NetBSD
• To boot user space
– Create dummy ramdisk with /sbin/init
– Build kernel with MFS
– Insert ramdisk with mdsetimage
– Boot it!
Porting NetBSD
DEMO
Thank you!
Sébastien Bourdeauducq, Michael Walle, Robert Swindells, Stefan Kristiansson, Lars-Peter Clausen, Pierre Pronchery, Radoslaw Kujawa, Youri Mouton, Matt Thomas, tech-kern@, M-Labs mailing list, and many more
Questions?
NetBSD/milkymist Memory Layout
[Diagram: the 32-bit address space from 0 to 0xffffffff, with boundaries marked at 0xc0000000 and 0xc8000000. Labels: user space, user stack, kernel space, kernel stack, RAM window]
DDR SDRAM: 128 MB

Editor's Notes

  • #5: Say « Memory Management Unit »
  • #6: Electronic device aimed at generating artistic video performances at parties and concerts
  • #7: You can capture live dancers, apply video effects such as rotations, zooms and translations, and project the result onto a screen or a wall. It reacts in real time, synchronized to the audio input, and can be controlled via a MIDI keyboard or DMX (a protocol used to control stage lighting and effects)
  • #10: Array of configurable logic blocks, linked together by a programmable switch matrix
  • #41: Previously I said it’s slow to access main memory. Here the MMU is accessing the PT (in RAM) each time to get translations. Aren’t we slowing our CPU down?
  • #42: TLB: clever word for « cache for PVA -> PPA translations ». The first time you translate a page -> go to the PT in RAM. The next time you translate the same page -> TLB hit in 1 cycle
  • #43: In LM32, like MIPS or PowerPC Book E, the MMU does not read the page table itself to refill the TLB (no hardware page-tree walker). Instead the MMU raises an exception and lets the OS update the TLB. The TLB is entirely managed by SW.
  • #71: kcopy: copies data like memcpy, aborts on page fault. setfault: saves the current context for later restoring if we take a page fault. cpu_lwp_fork() is the machine-dependent portion of fork1() which finishes a fork operation
  • #72: cpu_startup: init CPU, print copyright message. spl: raise and lower the interrupt priority level, used by kernel code to block interrupts in critical sections. cpu_rootconf: determine the root file system device
  • #75: Thank you for attending, and thanks to all those who helped with this work