Microkernel-based and Capability-based Operating Systems
Martin Děcký

March 2021
About the Speaker
Charles University
Research scientist at D3S (2008 – 2017)
Graduated (Ph.D.) in 2015
Co-author of the HelenOS (http://www.helenos.org/) microkernel multiserver operating system
Huawei Technologies
Senior Research Engineer, Munich Research Center (2017 – 2018)
Principal Research Engineer, Dresden Research Center (2019 – present)

Martin Děcký, March 25th 2021 Microkernel-based and Capability-based Operating Systems 2
[Figure: world map of Huawei research centers (RC) and city R&D centers. The headquarters are in Shenzhen; the European sites shown include Munich and Dresden.]
Huawei Dresden Research Center (DRC)
Since 2019, ~20 employees (plus a virtualization team in Munich)

Huawei Dresden Research Center (DRC) (2)
Focuses on R&D in the domain of operating systems
Microkernels, hypervisors
Collaboration with the OS Kernel Lab in Huawei HQ
Collaboration with TU Dresden, MPI-SWS, ETH Zürich and other institutions
Formal verification of correctness, weak memory architectures
Safety and security certification
Many-core scalability, heterogeneous hardware
Flexible OS architecture

We Are Hiring
Operating System Engineer / Researcher (Dresden)
https://apply.workable.com/huawei-16/j/3BAC3458E6/
Formal Verification Engineer / Researcher (Dresden)
https://apply.workable.com/huawei-16/j/95CCAD4EC5/
Virtualization Engineer / Researcher (Munich)
https://apply.workable.com/huawei-16/j/51F90678EA/
Industrial Ph.D. Student (Dresden)
In collaboration with TU Dresden

Systems Software Innovations Summit 2021
March 30th – 31st 2021
https://huawei-events.de/, online, no participation fee
Microkernels

Microkernel-based Operating Systems
Motivation
Safety, security, reliability, dependability
Proper software architecture
Formal verification of correctness
Modularity, customization
Virtualization, paravirtualization
Tasks and virtual machines are quite similar types of entities
Partitioning, support for mixed criticality

Monolithic OS Design Is Flawed
Biggs S., Lee D., Heiser G.: The Jury Is In: Monolithic OS Design Is
Flawed: Microkernel-based Designs Improve Security, ACM 9th Asia-Pacific
Workshop on Systems (APSys), 2018
“While intuitive, the benefits of the small TCB have not been quantified to
date. We address this by a study of critical Linux CVEs, where we examine
whether they would be prevented or mitigated by a microkernel-based
design. We find that almost all exploits are at least mitigated to less than
critical severity, and 40 % completely eliminated by an OS design based
on a verified microkernel, such as seL4.”
https://dl.acm.org/doi/10.1145/3265723.3265733
Some Data Points from History
Compatible Time-Sharing System (CTSS)
Fernando J. Corbató et al., MIT Computation Center, 1961
Probably one of the earliest “real” operating systems
Not just a loader, jobs manager or batch manager
RC 4000 Multiprogramming System
Per Brinch Hansen, Regnecentralen, 1969
Separation of mechanism and policy, modularity via isolated concurrently running
processes, message passing
Multics
MIT, General Electric, Bell Labs, 1969
Traceable influence on UNIX

Some Data Points from History (2)
HYDRA
William Wulf, Carnegie Mellon University, 1971
Capability-based, object-oriented, separation of mechanism and policy
Probably the earliest peer-reviewed publication of the design principles
UNIX
Ken Thompson, Dennis Ritchie, Brian Kernighan et al., Bell Labs, 1973
Architecture and design traceable in many current monolithic systems
VMS
Digital Equipment, 1977
Architecture and design traceable in Microsoft Windows

Some Data Points from History (3)
EUMEL / L2
Jochen Liedtke, University of Bielefeld, 1979
Proto-microkernel based on bitcode virtual machines
QNX
Gordon Bell, Dan Dodge, 1982
Earliest commercially successful microkernel multiserver OS
Still in active use and development today
CMU Mach
Richard Rashid, Avie Tevanian, Carnegie Mellon University, 1985
Arguably the most widespread microkernel code base
Still a core part of macOS, iOS and other OS clones by Apple today (but not in a microkernel configuration)
Despite its well-publicized shortcomings, it remains highly influential

Microkernel-based Operating Systems
Definition
Operating system that follows specific design principles that, in effect,
minimize the amount of code running in the privileged (kernel) mode
Hence the name
Every microkernel-based OS follows slightly different specific design
principles
Two design principles are probably universally common
Minimality principle
Split of mechanism and policy principle

Minimality Principle
The obvious criterion
The kernel needs to implement the functionality that cannot possibly be
implemented in user space
On typical commodity hardware, this includes
Bootstrapping
Fundamental part of hardware exception and interrupt handling
Configuration of certain control registers (possibly including MMU)
Fundamental part of mode switching (e.g. related to hardware virtualization,
trusted execution environments, etc.)

Minimality Principle (2)
The necessary criterion
The kernel needs to implement the functionality that cannot be
delegated only to a trusted user space component without also
delegating it to any untrusted user space component (which would undermine
the fundamental guarantees that the operating system provides)
On typical commodity hardware, this includes
Configuration of the forced preemption mechanism (e.g. timer interrupt
routing)
Fundamental part of interacting with a hypervisor, firmware and some hardware
components
– Hardware components are tricky: Without IOMMU, almost any interaction
with hardware might potentially undermine the OS guarantees

Minimality Principle (3)
The practicality criterion
The kernel might also implement the functionality that would be
impractical (while still technically possible) to delegate safely to user
space
This is where microkernels differ, but there are still some universal examples
Context switching
Basic scheduling
System timer configuration
Observability and (optional) debugging support

Split of Mechanism and Policy Principle
Orthogonal to the minimality principle
The microkernel is not an indivisible entity
Composed of instructions, basic blocks, language constructs, etc.
The code inevitably follows some patterns that form architecture, design,
abstractions, parametrization, etc.
Separation of concerns
The kernel implements only pure and universal mechanisms (“the what”)
while the policies (“the how/when”) are delegated to user space
This is where microkernels differ
– Does “arbitrary policy” equal “no policy”?
– Is it fine to have a default (but replaceable) policy?
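
The split can be illustrated with a minimal C sketch. It is not taken from any particular microkernel, and every name in it (thread_t, sched_policy_t, switch_to_next and the two policies) is hypothetical. The mechanism only validates a decision and acts on it; the decision itself comes from a replaceable policy function, which in a real system would live in user space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread descriptor, for illustration only. */
typedef struct {
    int id;
    int priority;
    int runnable;
} thread_t;

/* Policy ("the how/when"): returns the index of the thread to run next,
 * or count if nothing should run. */
typedef size_t (*sched_policy_t)(const thread_t *threads, size_t count);

/* Mechanism ("the what"): validates the policy decision and switches to
 * the chosen thread. The context switch is replaced by returning the id. */
int switch_to_next(const thread_t *threads, size_t count, sched_policy_t policy)
{
    assert(threads != NULL && policy != NULL);
    size_t i = policy(threads, count);
    if (i >= count || !threads[i].runnable)
        return -1; /* the mechanism does not trust the policy blindly */
    return threads[i].id;
}

/* One possible policy: the first runnable thread wins. */
size_t policy_first_runnable(const thread_t *threads, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (threads[i].runnable)
            return i;
    return count;
}

/* Another possible policy: the runnable thread with the highest priority. */
size_t policy_highest_priority(const thread_t *threads, size_t count)
{
    size_t best = count;
    for (size_t i = 0; i < count; i++) {
        if (!threads[i].runnable)
            continue;
        if (best == count || threads[i].priority > threads[best].priority)
            best = i;
    }
    return best;
}
```

Swapping the policy changes which thread runs without touching switch_to_next. A kernel that ships policy_first_runnable as a default but accepts another function would be an instance of the "default (but replaceable) policy" question above.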

Practical Differences
Monolithic kernel
Configurability via compile-time options and parametrization
Modularity via run-time dynamic linking
Tight module coupling, weak module cohesion
Structure is implicit and not enforced (especially at run time)
Microkernel
Configurability via different use (policy in user space)
Modularity via extension in user space
Loose module coupling, strong module cohesion
Structure is explicit and enforced (even at run time)

Design Space of Operating Systems
[Figure: the design space drawn along three axes: monolithic components vs. fine-grained components, raw performance vs. safety via isolation, and static deployment vs. dynamic deployment. Placed in this space are: unikernel OS, separation kernel, hypervisor, monolithic kernel OS, microkernel single-server OS and microkernel multiserver OS.]
Architecture of User Space
Monolithic OS
[Figure: applications run in unprivileged mode. A single monolithic kernel runs in privileged mode directly on the hardware and contains memory mgmt, scheduler, IPC, device drivers, file system drivers, user mgmt, network stack, ...]
Architecture of User Space
Single-server microkernel OS
[Figure: applications and a single system server (device drivers, file system drivers, user mgmt, network stack, ...) run in unprivileged mode. The microkernel (memory mgmt, scheduler, IPC) runs in privileged mode on the hardware.]
Architecture of User Space
Multiserver microkernel OS
[Figure: applications and many separate servers run in unprivileged mode: network stack, security server, device multiplexer, file system multiplexer, naming server, location server, several device driver servers, several file system driver servers, ... The microkernel (memory mgmt, scheduler, IPC) runs in privileged mode on the hardware.]
Architecture of User Space
Type-1 hypervisor
[Figure: three guest operating systems, each with several apps in unprivileged mode and its own kernel in privileged mode. The hypervisor (memory mgmt, scheduler, comm) runs in hyper-privileged mode on the hardware.]
Architecture of User Space
Type-1 hypervisor (in common deployment)
[Figure: three guest operating systems, each running a single app in unprivileged mode on its own kernel in privileged mode. The hypervisor (memory mgmt, scheduler, comm) runs in hyper-privileged mode on the hardware.]
Architecture of User Space
Hypervisor with unikernels
[Figure: three unikernels, each combining one app with kernel components and running entirely in privileged mode. The hypervisor (memory mgmt, scheduler, comm) runs in hyper-privileged mode on the hardware.]
Architecture of User Space
Multiserver microkernel with unikernels for device drivers
[Figure: the multiserver layout (applications, network stack, security server, device multiplexer, file system multiplexer, naming server, location server, device driver servers, file system driver servers, ...) in unprivileged mode, extended by a unikernel that packages a kernel component together with a device driver. The microkernel (memory mgmt, scheduler, IPC) runs in privileged mode on the hardware.]
Architecture of User Space
Multikernel

[Figure: applications and servers run in unprivileged mode. In privileged mode, a separate kernel instance runs on each CPU.]
Capabilities

Capabilities
Motivation
A universal and pure mechanism in the kernel to safely manage (all)
operating system resources
Without implementing any specific management policy in the kernel
(i.e. delegating the management policy completely to user space)
Potential secondary goal
Possibility to grant or delegate (parts of) the authority over resources from
the original owner of a resource to other users
In a controllable fashion (i.e. including the possibility of revocation)

Capabilities
Definition
Capability
Object (instance of a given object type) identifying some specific (operating system) resource
Kernel object identifying a kernel-managed resource
Kernel object (proxy object) identifying a user space resource
User space object identifying a user space resource
Capability reference
Unforgeable identifier (handle) to a capability
Might be associated with permissions (e.g. permissible operations, methods) and ownership
Capability space
Each capability reference is local to a specific namespace (typically associated with a specific
task, process) and does not have any meaning in other namespaces
Akin to (virtual) address space
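
The three terms can be made concrete with a small C sketch. It assumes a hypothetical flat, array-based capability space; the names cap_t, cref_t, cspace_t, cspace_insert and cspace_lookup are illustrative and not taken from any specific kernel. The capability reference is a plain integer, and it is unforgeable only in the sense that the table it indexes is kept by the kernel and is private to one task.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CSPACE_SLOTS 8u
#define CREF_INVALID UINT32_MAX

typedef enum { CAP_NONE = 0, CAP_FILE, CAP_ENDPOINT } cap_type_t;

/* Capability: typed object identifying a resource, with permissions. */
typedef struct {
    cap_type_t type;
    void *resource;
    uint32_t perms;
} cap_t;

/* Capability reference: a handle that is meaningful in one space only. */
typedef uint32_t cref_t;

/* Capability space: per-task namespace (assumed to be a flat array here). */
typedef struct {
    cap_t slots[CSPACE_SLOTS];
} cspace_t;

/* Stores a capability in the first free slot and returns its reference. */
cref_t cspace_insert(cspace_t *cs, cap_type_t type, void *resource,
                     uint32_t perms)
{
    assert(cs != NULL && type != CAP_NONE);
    for (cref_t i = 0; i < CSPACE_SLOTS; i++) {
        if (cs->slots[i].type == CAP_NONE) {
            cs->slots[i] = (cap_t){ type, resource, perms };
            return i;
        }
    }
    return CREF_INVALID;
}

/* Resolves a reference within one space and checks the needed permissions. */
const cap_t *cspace_lookup(const cspace_t *cs, cref_t ref, uint32_t needed)
{
    if (cs == NULL || ref >= CSPACE_SLOTS)
        return NULL;
    const cap_t *cap = &cs->slots[ref];
    if (cap->type == CAP_NONE || (cap->perms & needed) != needed)
        return NULL;
    return cap;
}
```

The same numeric reference resolves to a capability in the space where it was inserted and to nothing in any other space, which is the sense in which a capability space is akin to a virtual address space.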

What Are Capabilities, Anyway?
read(0, ...);
[Figure: in user space, the file descriptor 0 passed to read() is a capability reference. In kernel space, the file descriptor table (slots 0 1 2 3) is the capability space holding the capabilities. Each capability refers to an operating system resource, here a vfs_file_t (open file).]
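Capability Granting
Sender (grants the file descriptor fd over a UNIX domain socket):
struct msghdr msg;
// ... (set up msg.msg_control and an SCM_RIGHTS control message header)
struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
memmove(CMSG_DATA(cmsg), &fd, sizeof(fd));
sendmsg(socket, &msg, 0);

Receiver:
struct msghdr msg;
// ... (set up msg.msg_control with room for one file descriptor)
recvmsg(socket, &msg, 0);
// The control message is valid only after recvmsg() has filled the buffer
struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
int fd;
memmove(&fd, CMSG_DATA(cmsg), sizeof(fd));

[Figure: user space above, kernel space below. The sender's file descriptor table (0 1 2 3) and the receiver's table (0 1 2 3 4) each hold an entry that refers to the same vfs_file_t.]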
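
The slide elides the control message setup. A fuller sketch of the same technique, using the standard POSIX SCM_RIGHTS ancillary message, might look as follows. The helper names send_fd and recv_fd are hypothetical, and sending one dummy payload byte alongside the descriptor is a common convention assumed here rather than something stated on the slide.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Sends the descriptor fd over the UNIX domain socket sock. */
int send_fd(int sock, int fd)
{
    assert(fd >= 0);
    char byte = 0; /* dummy payload accompanying the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } ctl;
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(fd));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receives one descriptor from sock; returns it, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    /* Inspect the control message only after recvmsg() has filled it in. */
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}
```

The receiver gets a new descriptor number in its own table, which is why the receiving table in the figure grows by one slot while both entries refer to the same open file.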

Chicken & Egg Problem
What if we want to represent all resources as capabilities?
Even the resource (memory) needed to store the capabilities and
capability references is a capability
We start with some basic capability (untyped capability) that represents
(physical) memory
Encapsulated capability vs. naked capability
This capability can be retyped to a different capability or converted to multiple
capabilities
– Allocating kernel objects
– Allocating capability nodes that bind capability references to capabilities
Bookkeeping objects (e.g. memory for page tables) might also be represented as
capabilities
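
Retyping can be sketched in C as carving typed objects out of an untyped capability. The sketch assumes a simple bump allocation inside each untyped region, and the names obj_type_t, cap_t and cap_retype are hypothetical. Real kernels track considerably more state, for example the derivation relationship needed for revocation.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OBJ_UNTYPED, OBJ_CNODE, OBJ_TCB, OBJ_PAGE_TABLE } obj_type_t;

/* Capability describing a run of physical pages of a given type. */
typedef struct {
    obj_type_t type;
    size_t base_page; /* first physical page covered */
    size_t pages;     /* size of the region in pages */
    size_t used;      /* pages already handed out (untyped only) */
} cap_t;

/* Carves pages out of an untyped parent and produces a child capability
 * of the requested type. Only untyped capabilities can be retyped. */
int cap_retype(cap_t *parent, obj_type_t type, size_t pages, cap_t *child)
{
    assert(parent != NULL && child != NULL);
    if (parent->type != OBJ_UNTYPED)
        return -1;
    if (pages == 0 || pages > parent->pages - parent->used)
        return -1; /* not enough untyped memory left */
    *child = (cap_t){ type, parent->base_page + parent->used, pages, 0 };
    parent->used += pages;
    return 0;
}
```

Because the memory for every kernel object comes from an untyped capability that user space already holds, the kernel itself never needs an allocation policy of its own.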

Capability Derivation Tree
Permissible ways of retyping capabilities
[Figure: a derivation tree. The root is an untyped capability of 10 pages. It is split into untyped capabilities of 2, 6, 1 and 1 pages, which are in turn split into 1-page untyped capabilities or retyped into L1 PT, L2 PT, cnode and TCB capabilities.]
Representing Capability Space
Effective and efficient storage for capability nodes
Criteria
Low memory overhead and fragmentation even for sparse capability spaces
Fast lookup of capability references (typically the most frequent operation)
Reasonably fast creation and removal of new capability references
Possibility to store metadata (e.g. permissions, ownership/delegations) and even
actual kernel objects (up to a certain size) in-line
Typical candidates
Arrays
Hash tables
Radix trees
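
As an illustration of the last candidate, the following C sketch implements a fixed-depth radix tree in which each level consumes 10 bits of the capability reference. The depth of three levels and the names cnode_t, slot_t, cspace_bind and cspace_lookup are assumptions made for the example. Error handling is minimal and allocated nodes are never freed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LEVEL_BITS  10u
#define LEVEL_SLOTS (1u << LEVEL_BITS)
#define LEVELS      3u /* assumed depth: 30 bits of the reference are used */

typedef enum { SLOT_EMPTY = 0, SLOT_CNODE, SLOT_OBJECT } slot_kind_t;

typedef struct {
    slot_kind_t kind;
    void *target; /* next-level cnode or the referenced object */
} slot_t;

typedef struct cnode {
    slot_t slots[LEVEL_SLOTS];
} cnode_t;

typedef uint32_t cref_t;

/* Extracts the 10-bit index that the given level uses from the reference. */
static unsigned level_index(cref_t ref, unsigned level)
{
    unsigned shift = (LEVELS - 1u - level) * LEVEL_BITS;
    return (ref >> shift) & (LEVEL_SLOTS - 1u);
}

/* Binds ref to object, creating intermediate cnodes on demand. */
int cspace_bind(cnode_t *root, cref_t ref, void *object)
{
    assert(root != NULL);
    if ((ref >> (LEVELS * LEVEL_BITS)) != 0)
        return -1; /* reference uses bits beyond the tree depth */
    cnode_t *node = root;
    for (unsigned level = 0; level + 1 < LEVELS; level++) {
        slot_t *slot = &node->slots[level_index(ref, level)];
        if (slot->kind == SLOT_EMPTY) {
            cnode_t *child = calloc(1, sizeof(*child));
            if (child == NULL)
                return -1;
            slot->kind = SLOT_CNODE;
            slot->target = child;
        } else if (slot->kind != SLOT_CNODE) {
            return -1;
        }
        node = slot->target;
    }
    slot_t *leaf = &node->slots[level_index(ref, LEVELS - 1u)];
    if (leaf->kind != SLOT_EMPTY)
        return -1; /* reference already bound */
    leaf->kind = SLOT_OBJECT;
    leaf->target = object;
    return 0;
}

/* Resolves ref to the bound object, or NULL if it is not bound. */
void *cspace_lookup(const cnode_t *root, cref_t ref)
{
    if (root == NULL || (ref >> (LEVELS * LEVEL_BITS)) != 0)
        return NULL;
    const cnode_t *node = root;
    for (unsigned level = 0; level + 1 < LEVELS; level++) {
        const slot_t *slot = &node->slots[level_index(ref, level)];
        if (slot->kind != SLOT_CNODE)
            return NULL;
        node = slot->target;
    }
    const slot_t *leaf = &node->slots[level_index(ref, LEVELS - 1u)];
    return leaf->kind == SLOT_OBJECT ? leaf->target : NULL;
}
```

Lookup cost is bounded by the fixed depth, and only the subtrees that actually contain references are allocated, which addresses the sparse-space criterion at the price of one table per populated prefix.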

Hierarchical Capability Space
[Figure: a cref_t (shown as the index sequence 00 01 11) held in user space is resolved in kernel space through a hierarchy of cnode_t tables (10 bit index each) that together form the cspace. Cnode capabilities lead to the next level; the other slots hold untyped, endpoint and page capabilities, which refer to resources such as a mem_region_t.]
Capability Operations
Actions that can be performed with capabilities
The permissible set of operations might be defined/restricted by the capability
reference itself
Each capability reference might permit different methods despite pointing to the
same object
Invoke
Executing some “business logic” operation on the target object
Clone
Creating a duplicate capability reference
Mint
Creating a duplicate capability reference, but with restricted permissions
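
A minimal C sketch of the three operations, assuming permissions are a bit mask stored in the capability reference. The names capref_t, object_t, cap_clone, cap_mint and the two invoke helpers are hypothetical. It shows the key rule of minting: permissions can only be narrowed, never widened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PERM_READ = 1 << 0, PERM_WRITE = 1 << 1 };

/* Hypothetical target object with a trivial piece of "business logic". */
typedef struct {
    int value;
} object_t;

/* Capability reference: the target object plus the permitted operations. */
typedef struct {
    object_t *object;
    uint32_t perms;
} capref_t;

/* Clone: an exact duplicate of the reference. */
capref_t cap_clone(capref_t src)
{
    return src;
}

/* Mint: a duplicate whose permissions are a subset of the source's. */
int cap_mint(capref_t src, uint32_t perms, capref_t *out)
{
    assert(out != NULL);
    if ((perms & ~src.perms) != 0)
        return -1; /* minting must not add permissions */
    out->object = src.object;
    out->perms = perms;
    return 0;
}

/* Invoke (read method): allowed only if the reference permits reading. */
int cap_invoke_read(capref_t ref, int *out)
{
    if (ref.object == NULL || out == NULL || !(ref.perms & PERM_READ))
        return -1;
    *out = ref.object->value;
    return 0;
}

/* Invoke (write method): allowed only if the reference permits writing. */
int cap_invoke_write(capref_t ref, int value)
{
    if (ref.object == NULL || !(ref.perms & PERM_WRITE))
        return -1;
    ref.object->value = value;
    return 0;
}
```

Two references to the same object can therefore permit different methods, as stated above: a minted read-only reference fails on the write method while the original still succeeds.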

Capability Operations (2)
Derive
Retyping the capability to a different capability type or converting it to multiple
capabilities
Permissible retyping/conversions defined by the capability derivation tree
Delegate
Passing the ownership of the capability reference to different capability space
Grant
Creating a duplicate capability reference (possibly with restricted permissions) in a
different capability space (while keeping ownership)
Might be done only once or recursively
Revoke
Removing a granted capability reference from a different capability space
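
The difference between Delegate, Grant and Revoke can be sketched in C with two small capability spaces. The sketch assumes that each slot records which space granted it, handles only a single level of granting (no recursive grants), and uses hypothetical names throughout (slot_t, cspace_t, cap_grant, cap_revoke, cap_delegate).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 4

typedef struct {
    void *object;    /* NULL marks a free slot */
    uint32_t perms;
    int granted_by;  /* id of the granting space, or -1 if owned */
} slot_t;

typedef struct {
    int id;
    slot_t slots[SLOTS];
} cspace_t;

static int find_free(const cspace_t *cs)
{
    for (int i = 0; i < SLOTS; i++)
        if (cs->slots[i].object == NULL)
            return i;
    return -1;
}

static int valid(const cspace_t *cs, int idx)
{
    return cs != NULL && idx >= 0 && idx < SLOTS &&
           cs->slots[idx].object != NULL;
}

/* Grant: duplicate into another space, possibly with fewer permissions,
 * while the granting space keeps ownership. Returns the new slot index. */
int cap_grant(cspace_t *from, int src, cspace_t *to, uint32_t perms)
{
    if (!valid(from, src) || to == NULL)
        return -1;
    if ((perms & ~from->slots[src].perms) != 0)
        return -1;
    int dst = find_free(to);
    if (dst < 0)
        return -1;
    to->slots[dst] = (slot_t){ from->slots[src].object, perms, from->id };
    return dst;
}

/* Revoke: remove the references that owner granted into target.
 * Returns the number of removed references. */
int cap_revoke(const cspace_t *owner, int src, cspace_t *target)
{
    if (!valid(owner, src) || target == NULL)
        return -1;
    int removed = 0;
    for (int i = 0; i < SLOTS; i++) {
        slot_t *s = &target->slots[i];
        if (s->object == owner->slots[src].object &&
            s->granted_by == owner->id) {
            *s = (slot_t){ NULL, 0, -1 };
            removed++;
        }
    }
    return removed;
}

/* Delegate: move an owned reference, including ownership, to another
 * space. The source slot becomes free. Returns the new slot index. */
int cap_delegate(cspace_t *from, int src, cspace_t *to)
{
    if (!valid(from, src) || to == NULL || from->slots[src].granted_by != -1)
        return -1;
    int dst = find_free(to);
    if (dst < 0)
        return -1;
    to->slots[dst] = from->slots[src];
    from->slots[src] = (slot_t){ NULL, 0, -1 };
    return dst;
}
```

Recording the granting space in each slot is what makes revocation possible in this sketch. Supporting recursive grants would require tracking the full chain of grants instead of a single id.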

Get to Know Microkernels
HelenOS Microkernel Functional Blocks
[Figure: block diagram of the HelenOS microkernel. Architecture-independent blocks include the kernel console, debug unit, tests, ELF loader, kernel log, system information, tracing, lists, trees and bitmaps, resource allocator, synchronization, interrupt and syscall dispatch, hardware resource mgmt, string and misc routines, concurrent hash table, read-copy-update, thread and task mgmt, IPC, slab allocator, address space mgmt, memory reservation, work queues, wait queues, scheduler, capabilities, memory backends, memory zones, frame allocator, cache coherency and spinlocks. Below the hardware abstraction layer, architecture-dependent and platform-dependent blocks cover bootstrap routines, CPU mgmt, context switching, interrupt handling, platform drivers, memory mgmt, atomics and barriers, and page table, hash table and debugging support.]
HelenOS User Space Architecture
[Figure: HelenOS user space components running on top of the kernel.
Human interface: remote console, remote framebuffer, console, compositor, vterm, bdsh, clipboard, audio, client session, input, output
File system drivers: TMPFS, Location FS, ISO 9660, UDF, MINIX FS, FAT, exFAT, ext4
Networking: slip, loopip, ethip (link layer), tcp, udp (transport layer), nconfsrv, dnsrsrv, dhcp (management protocols), inetsrv
Core services: vfs, device manager, device drivers, location service, logger, klog, naming service, loader, task monitor, init]
HelenOS User Space Device Drivers
[Figure: HelenOS user space device drivers by category.
Partitions: mbr, guid
Block device drivers: file disk, ahci, ddisk, USB mass storage, ata, RAM disk
USB class drivers: UHCI root hub, USB hub, USB MID, USB HID
Audio drivers: sb16, hdaudio
Network interface drivers: ne2000, rtl8139, rtl8169, ar9271, e1000
Character device drivers: s3c24xx, isdv4, xtkbd, ps2, i8042, pl050, ns8250
Bus drivers: uhci, xhci, ohci, ehci, pci, isa, amba
Clock drivers: cmos-rtc
Interrupt controller drivers: apic, i8259, obio, icp-ic
Framebuffer drivers: kfb, amdm37x
Platform drivers: msim, leon3, amdm37x, mac, icp, pc, malta
Architecture root drivers: virtual]
Genode OS Framework

[1] Feske N.: Introducing kernel-agnostic Genode executables, Genode Labs, FOSDEM 2017,
https://fosdem.org/2017/schedule/event/microkernel_kernel_agnostic_genode_executables/
Further Reading
Du D., Hua Z., Xia Y., Zang B., Chen H.: XPC: Architectural Support for
Secure and Efficient Cross Process Call, ACM/IEEE 46th Annual
International Symposium on Computer Architecture (ISCA), 2019
https://ieeexplore.ieee.org/abstract/document/8980352
Matthias Lange: The impact of Meltre and Specdown on microkernel
systems (*), Microkernel Devroom, FOSDEM, 2019
(*) Deliberate misspelling of Meltdown and Spectre
https://archive.fosdem.org/2019/schedule/event/meltre_specdown/
Q&A

Thank You!
