AKIHIRO SUDA
NTT Corporation
Hardening Docker
daemon with
Rootless mode
About me
● Software Engineer at NTT
● Maintainer of Moby, containerd, and BuildKit
● Docker Tokyo Community Leader
Rootless Docker
● Run Docker as a non-root user on the host
● Protect the host from potential Docker vulns
and misconfiguration
[Diagram: root vs. non-root]
Demo
Don’t confuse with..
$ sudo docker
$ usermod -aG docker penguin
Image: https://xkcd.com/149/
Rootless Docker
$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 May 1 12:00 /var/run/docker.sock
$ sudo usermod -aG docker penguin
Non-root username: “penguin”
Image: https://twitter.com/llegaspacheco/status/1111783777372639232
Don’t confuse with..
$ sudo docker
$ usermod -aG docker penguin
$ docker run --user 42
$ dockerd --userns-remap
All of them run the daemon as root!
Rootless Docker
● Rootless Docker refers to running the Docker daemon
(and containers of course) as a non-root user
● Even if the daemon is compromised, the attacker cannot
gain root on the host
(unless you have sudo configured with NOPASSWD)
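The NOPASSWD caveat can be checked before relying on Rootless mode. The following is a minimal sketch, not part of the talk: the function name `sudo_is_passwordless` is hypothetical, and it relies on `sudo -n`, which fails instead of prompting when a password would be required. Note that it also succeeds if sudo credentials happen to be cached from a recent login.

```shell
#!/bin/sh
# sudo_is_passwordless (hypothetical helper): succeed if sudo can run a
# command without asking for a password. "sudo -n" never prompts; it
# fails when a password would be needed.
sudo_is_passwordless() {
    command -v sudo >/dev/null 2>&1 && sudo -n true 2>/dev/null
}

if sudo_is_passwordless; then
    echo "WARNING: sudo works without a password for $(id -un)"
else
    echo "OK: sudo needs a password (or is not installed) for $(id -un)"
fi
```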
Some caveats apply..
● No OverlayFS (except on Ubuntu)
● Limited network performance by default
● TCP/UDP port numbers below 1024 can’t be listened on
● No cgroup
○ docker run: --memory and --cpu-* flags are
ignored
○ docker top: does not work
You can install it under your $HOME
right now!
● sudo is not required
● But /etc/subuid and /etc/subgid need to be
configured to contain your username
○ configured by default on recent distros
curl -fsSL https://get.docker.com/rootless | sh
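Before running the installer, the /etc/subuid and /etc/subgid requirement can be verified by hand. This is a rough sketch under the assumption that entries use the `USER:START:COUNT` form keyed by username; the helper name `has_subid_range` is made up for illustration.

```shell
#!/bin/sh
# has_subid_range FILE USER (hypothetical helper): succeed if FILE has a
# "USER:START:COUNT" line with numeric START and COUNT.
has_subid_range() {
    awk -F: -v u="$2" '
        $1 == u && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { found = 1 }
        END { exit !found }
    ' "$1"
}

user=$(id -un)
for f in /etc/subuid /etc/subgid; do
    if [ -r "$f" ] && has_subid_range "$f" "$user"; then
        echo "$f: entry found for $user"
    else
        echo "$f: no entry for $user"
    fi
done
```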
● The installer shows a helpful error if /etc/sub[ug]id is
unconfigured
○ Thanks to Tõnis Tiigi and Tibor Vass!
● Feel free to ask me after this session if it doesn’t work
Katacoda scenario available!
https://www.katacoda.com/courses/docker/rootless
Motivation
Harden containers
● Docker has a lot of features for hardening containers, so
root-in-container is still contained by default
○ namespaces, capabilities
○ seccomp, AppArmor, SELinux...
● But there is no such thing as vulnerability-free software;
root-in-container could break out with an exploit
○ CVE-2019-5736 runc breakout (Feb 11, 2019)
Harden containers
● And people often misconfigure things!
● “We found 3,822 Docker hosts with the remote API
exposed publicly.”
-- Vitaly Simonovich and Ori Nakar (March 4, 2019)
https://www.imperva.com/blog/hundreds-of-vulnerable-docker-hosts-exploited-by-cryptocurrency-miners/
Harden containers
● Rootless mode per se doesn’t fix vulns and
misconfigurations - but it can mitigate attacks
● Attacker won’t be able to:
○ access files owned by other users
○ modify firmware and kernel (→ undetectable malware)
○ ARP spoofing
Caution: not a panacea!
● If Docker had a vuln, attackers still might be able to:
○ Mine cryptocurrencies
○ Springboard-attack to other hosts
● Not effective for potential vulns on
kernel / VM / HW side
High-performance Computing (HPC)
● HPC users are typically not allowed to gain root on the
host
● Good news: GPU (and perhaps FPGA devices) are
known to work with Rootless mode
Docker-in-Docker
● There are a lot of valid use cases to allow a Docker
container to call Docker API
○ FaaS
○ CI
○ Build images
○ ...
Docker-in-Docker
$ docker run -v /var/run/docker.sock:/var/run/docker.sock
$ docker run --privileged docker:dind
● Two types of Docker-in-Docker; both were unsafe
without Rootless mode
How it works
Pretend to be the root
● User namespaces allow non-root users to pretend to be
the root
● Root-in-UserNS can have fake UID 0 and also create
other namespaces (MountNS, NetNS..)
Pretend to be the root
● But Root-in-UserNS cannot gain the real root
○ Inaccessible files still remain inaccessible
○ Kernel modules cannot be loaded
○ System cannot be rebooted
Pretend to be the root
$ id -u
1001
$ ls -ln
-rw-rw---- 1 1001 1001 42 May 1 12:00 foo
Pretend to be the root
$ docker run -v $(pwd):/mnt -it alpine
/ # id -u
0
/ # ls -ln /mnt
-rw-rw---- 1 0 0 42 May 1 12:00 foo
Still owned by 1001 on the host
Still running as 1001 on the host
Pretend to be the root
$ docker run -v /:/host -it alpine
/ # ls -ln /host/dev/sda
brw-rw---- 1 65534 65534 8, 0 May 1 12:00 /host/dev/sda
/ # cat /host/dev/sda
cat: can’t open ‘/host/dev/sda’: Permission denied
Still owned by root(0) on the host
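The same "fake root" effect can be observed without Docker. The sketch below is not from the talk: it assumes the util-linux `unshare` tool is installed and that the kernel permits unprivileged user namespaces, and it prints a fallback message otherwise. The function name `show_fake_root` is hypothetical.

```shell
#!/bin/sh
# show_fake_root (hypothetical helper): enter a new user namespace in
# which the current user is mapped to UID 0, then print the UID and the
# UID mapping as seen from inside the namespace.
show_fake_root() {
    unshare --user --map-root-user sh -c '
        echo "uid inside UserNS: $(id -u)"
        echo "uid_map: $(cat /proc/self/uid_map)"
    ' 2>/dev/null || echo "user namespaces are not available here"
}

echo "uid on host: $(id -u)"
show_fake_root
```

Each uid_map line reads "UID inside the namespace, UID on the host, length of the range", so a single-entry map shows that only the caller's own UID is reachable.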
Sub-users (and sub-groups)
● Put users in your user account so you can be a user
while you are a user
● Sub-users are used as non-root users in a container
○ USER in Dockerfile
○ docker run --user
Sub-users (and sub-groups)
● If /etc/subuid contains “1001:100000:65536”:
○ primary user: host UID 1001 ↔ UID 0 in the UserNS
○ sub-users: host UIDs 100000-165535 (start 100000, len 65536) ↔ UIDs 1-65536 in the UserNS
● Having 65536 sub-users should be enough for most
containers
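The mapping for the “1001:100000:65536” example can be worked through as arithmetic. This is an illustrative sketch only: the function name `to_host_uid` is made up, and the values are hard-coded from the slide rather than read from /etc/subuid.

```shell
#!/bin/sh
# Values from the slide's example entry "1001:100000:65536".
primary=1001
start=100000
len=65536

# to_host_uid NS_UID (hypothetical helper): print the host UID behind a
# UID seen inside the user namespace, or fail if it is not mapped.
to_host_uid() {
    if [ "$1" -eq 0 ]; then
        echo "$primary"                 # fake root is the primary user
    elif [ "$1" -ge 1 ] && [ "$1" -le "$len" ]; then
        echo $((start + $1 - 1))        # sub-users start at NS UID 1
    else
        echo "unmapped"
        return 1
    fi
}

to_host_uid 0       # prints 1001
to_host_uid 42      # prints 100041
to_host_uid 65536   # prints 165535
```

So `docker run --user 42` would run as host UID 100041 in this example, which is an ordinary unprivileged UID on the host.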
Snapshotting
● A container has a mutable copy of the image
● Copying files takes time and wastes disk space
● Rootful Docker uses OverlayFS to avoid the extra copies
[Diagram: docker run creates several containers from one image]
Snapshotting
● OverlayFS is currently unavailable for Rootless mode
(unless you have Ubuntu’s kernel patch)
● On ext4, files are just copied instead; Slow and wasteful
● But on XFS “reflink” is used to deduplicate files
○ copy_file_range(2)
○ Slow but not wasteful
Networking
● Non-root user can create NetNS but cannot create a
vEth pair across the host and a NetNS
● VPNKit is used instead of vEth pair
○ User-mode network stack based on MirageOS TCP/IP
○ Also used by Docker for Mac/Win
Practical Tips
systemd service
● The unit file is in your home:
~/.config/systemd/user/docker.service
● To enable user services on system startup:
$ sudo loginctl enable-linger penguin
$ systemctl --user start docker
$ systemctl --user stop docker
Enable OverlayFS
● The vanilla kernel disallows mounting OverlayFS in user
namespaces
● But if you install Ubuntu kernel, you can get support for
OverlayFS
https://lists.ubuntu.com/archives/kernel-team/2014-February/038091.html
Enable XFS reflink
● If OverlayFS is not available, use XFS to deduplicate files
○ efficient for dedupe but slow
○ otherwise (i.e. ext4) all files are duplicated per layer
● Make sure to format with `mkfs.xfs -m reflink=1`
● ~/.config/docker/daemon.json:
{"storage-driver": "vfs",
"data-root": "/mnt/xfs/foo"}
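Whether a directory actually supports reflink copies can be probed before pointing data-root at it. This is a rough sketch, not something shown in the talk: it assumes GNU coreutils `cp`, whose `--reflink=always` fails rather than falling back to a full copy. The helper name `supports_reflink` and the `DATA_ROOT` variable are hypothetical.

```shell
#!/bin/sh
# supports_reflink DIR (hypothetical helper): succeed if a reflink copy
# works inside DIR. Returns 2 if no scratch file can be created there.
supports_reflink() {
    src=$(mktemp "$1/reflink-src.XXXXXX") || return 2
    printf 'test\n' > "$src"
    if cp --reflink=always "$src" "$src.copy" 2>/dev/null; then
        rc=0
    else
        rc=1
    fi
    rm -f "$src" "$src.copy"   # leave no scratch files behind
    return "$rc"
}

dir=${DATA_ROOT:-.}   # DATA_ROOT is an illustrative variable name
if supports_reflink "$dir"; then
    echo "$dir: reflink copies work"
else
    echo "$dir: no reflink support detected"
fi
```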
Change network stack: slirp4netns
● The default network stack (VPNKit) is slow
● Install slirp4netns (v0.3.0+) to get better throughput
○ iperf3 benchmark (container to host):
514Mbps → 9.21 Gbps
○ still slow compared to native vEth 52.1 Gbps
Benchmark: https://fosdem.org/2019/schedule/event/containers_k8s_rootless/
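To put the benchmark figures in proportion, the ratios can be computed directly. This is only a worked calculation on the numbers quoted above; the helper name `ratio` is made up.

```shell
#!/bin/sh
# ratio A B (hypothetical helper): print A/B rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# 9.21 Gbps is 9210 Mbps, so both arguments are in Mbps here.
echo "slirp4netns vs VPNKit: $(ratio 9210 514)x"
# Both arguments in Gbps.
echo "native vEth vs slirp4netns: $(ratio 52.1 9.21)x"
```

On these figures slirp4netns is roughly 18 times faster than VPNKit, while native vEth is still between 5 and 6 times faster than slirp4netns.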
Change network stack: slirp4netns
● https://github.com/rootless-containers/slirp4netns
● ./configure && make && make install
● RPM/DEB is also available for most distros (but
sometimes outdated)
● If slirp4netns is installed on $PATH, Docker automatically
picks it up
Change network stack: lxc-user-nic
● Or install lxc-user-nic to get native performance
○ SETUID binary (executed as root)
■ could result in root privilege escalation
if lxc-user-nic had a vuln
$ sudo apt-get install liblxc-common
Change network stack: lxc-user-nic
● /etc/lxc/lxc-usernet needs to be configured:
# USERNAME TYPE BRIDGE COUNT
penguin veth lxcbr0 1
○ COUNT is the count of dockerd and LXC containers
(not the count of Docker containers)
● $DOCKERD_ROOTLESS_ROOTLESSKIT_NET needs to be
set to lxc-user-nic
Exposing TCP/UDP ports below 1024
● Exposing port numbers below 1024 requires
CAP_NET_BIND_SERVICE
$ sudo setcap cap_net_bind_service=ep ~/bin/rootlesskit
$ docker run -p 80:80 ...
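Which published ports need the capability follows from the 1024 threshold stated above. The sketch below only illustrates that rule: the helper name `is_privileged_port` is hypothetical and the threshold is taken from the slide, not read from the running system.

```shell
#!/bin/sh
# is_privileged_port PORT [THRESHOLD] (hypothetical helper): succeed if
# PORT is below THRESHOLD. THRESHOLD defaults to 1024, as on the slide.
is_privileged_port() {
    [ "$1" -lt "${2:-1024}" ]
}

for port in 80 443 8080; do
    if is_privileged_port "$port"; then
        echo "$port: needs CAP_NET_BIND_SERVICE on rootlesskit"
    else
        echo "$port: can be published without extra capabilities"
    fi
done
```

Publishing on a high host port instead, for example `docker run -p 8080:80 ...`, avoids the need for the capability.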
Future work
Docker 19.09? 20.03?
FUSE-OverlayFS
● FUSE-OverlayFS can emulate OverlayFS without root
privileges on any distro (requires Kernel 4.18)
● Faster than XFS dedupe but slightly slower than real
OverlayFS
● containerd will be able to support FUSE-OverlayFS
● Docker will be able to use containerd snapshotter
https://github.com/moby/moby/pull/38738
OverlayFS
● There has also been discussion about pushing Ubuntu’s patch
to the real OverlayFS upstream
● Likely to take more time?
cgroup2
cgroup2 is needed for safely supporting rootless cgroup
[Diagram: the stack Docker / containerd / runc / systemd / Linux Kernel, each layer marked as "Already support cgroup2", "Work in progress", or "TODO"]
cgroup2
● runc doesn’t support cgroup2 yet, but “crun” already
supports cgroup2 https://github.com/giuseppe/crun
● OCI (Open Containers Initiative) is working on bringing
proper cgroup2 support to OCI Runtime Spec and runc
https://github.com/opencontainers/runtime-spec/issues/1002
LDAP
● Configuring /etc/subuid and /etc/subgid might be
painful on LDAP environments
● NSS module is under discussion for LDAP environments
https://github.com/shadow-maint/shadow/issues/154
○ No need to configure /etc/subuid and /etc/subgid
LDAP
● Another way: emulate sub-users using a single user
● runROOTLESS: An OCI Runtime Implementation with
sub-users emulation https://github.com/rootless-containers/runrootless
○ Uses Ptrace and Xattr for emulating syscalls
○ 2-15 times performance overhead
https://github.com/rootless-containers/runrootless/issues/14
LDAP
● seccomp could be used for accelerating ptrace, but we
are still facing implementation issues
● We are also looking into the possibility of using
“Seccomp Trap To Userspace” (introduced in Kernel 5.0)
○ Modern replacement for ptrace
Join us at Open Source Summit !
● Thursday, May 2, 12:30 PM - 02:30 PM
● Room 2020
● Three BuildKit talks
including this →
Questions?
get.docker.com/rootless

DCSF19 Hardening Docker daemon with Rootless mode
