
Welcome to "Learn Docker"

Wikipedia defines Docker as:

an open-source project that automates the deployment of software applications inside containers by providing an additional layer of abstraction and automation of OS-level virtualization on Linux.

Wow! That's a mouthful. In simpler words, Docker is a tool that allows developers, sysadmins, etc. to easily deploy their applications in a sandbox (called a container) to run on the host operating system, i.e. Linux. The key benefit of Docker is that it allows users to package an application with all of its dependencies into a standardized unit for software development. Unlike virtual machines, containers do not have high overhead and hence enable more efficient usage of the underlying system and resources.
Installing Docker
Let's download and install Docker Desktop.
Install on macOS
Install on Windows (or WSL 2)
Install on Linux
There are a couple of moving parts to keep in mind when it
comes to using Docker on your local machine:
1. The "Docker server" or "Docker Daemon". This listens to
requests from the desktop app and executes them. If
this isn't running nothing else will work.
2. The "Docker Desktop" GUI. Starting the GUI should start
the server, at least that's how I usually ensure the server
is running. The GUI is the visual way to interact with
Docker.
3. The Docker CLI. As a developer, most of your work will
be interacting with Docker via the CLI. I'd recommend
using the GUI to visualize what's going on with Docker,
but executing most of your commands through the
command line.
Make Sure Your Installation Worked
If you haven't yet, run the "Docker Desktop" application. Make
sure that in the bottom-left of the screen there is a green box
with a whale icon in it. This indicates that the docker server is
running locally and that you are connected to it. Then, make sure
you're on Docker Desktop version 4+. At the time of writing, I'm
on 4.14.0.
Next, run docker version in your command line to make sure the
CLI was installed.
Run and submit the tests.

What Is Docker?

To put it more simply: Docker allows us to deploy our applications inside "containers," which can be thought of as very lightweight virtual machines. Instead of just shipping an application, we can ship an application and the environment it's meant to run within.
What Is a Container?

We've had virtual machines for a long time. The trouble with virtual machines is that they are very heavy on resources. Booting one up often takes longer than booting up your physical machine. If you're curious about VMs, you can play around with VirtualBox for free.
A container gives us 95% of the benefits that virtual machines offer (at least as back-end developers), but is super lightweight. Containers boot up in seconds, while virtual machines can take minutes.
Virtual Machine Architecture
Container (Docker) Architectures
What Makes Such a Big Difference in Performance?
Virtual machines virtualize hardware; they emulate what a physical computer does at a very low level. Containers virtualize at the operating system level. Isolation between containers running on the same machine is still really good. For the most part, it appears to each container as if it has its own operating system and filesystem. In reality, a lot of resources are being shared, but they're being shared securely through namespaces.
Docker Hub
Docker Hub is the official cloud service for storing and sharing
Docker images. We're going to use Docker Hub in this course,
but it's important to understand that there are other popular
alternatives, and they're usually coupled with cloud service
providers. For example, when using AWS, my team used ECR to
store our images. Now that I'm on GCP, I use Container
Registry.
Go ahead and create a free account on Docker Hub if you don't
already have one.
Once you have an account, open up your Docker Desktop app
and click "Sign in". If it worked, you should see "Connected to
Hub" at the bottom of the client. Having your local Docker
environment connected to a Docker Hub account will make
publishing your images much easier. We'll cover that later in
this course!
Command Line
Help
Run docker help in your command line interface. It should
spit out a giant help menu. As you can see there are a lot
of commands. We're going to learn the most relevant
ones.

Running a Pre-Built Sample Container

Docker hosts a "getting started" image for users to play with.
Run this command on your CLI:
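docker run -d -p 80:80 docker/getting-started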

Be sure to give it a bit of time to download the image, because you probably don't already have it stored locally. You should see the container running in the "Containers" tab of Docker Desktop. Next, run:
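docker ps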

This will list running containers in your command line. In the PORTS column you should see something like this:
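0.0.0.0:80->80/tcp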
This is saying that port 80 on your local "host" machine is
being forwarded to port 80 on the running container. Port
80 is conventionally used to indicate HTTP web traffic.
Navigate to http://localhost:80 and you should get a webpage describing the container you're running!

Stopping a Container
In case you don't believe me that this webpage is being
hosted by you on your local machine, allow me to prove it.
Docker has two commands that we can use to stop a
Docker container from running:

docker stop
This stops the container by issuing a SIGTERM signal to the container. You'll typically want to use docker stop. If it doesn't work you can always try the harsher docker kill.
docker kill
This stops the container by issuing a SIGKILL signal to the container.

Stopping Your Container


1. Run docker ps again, and copy the container ID.
2. Run docker stop CONTAINER_ID where CONTAINER_ID
is the ID of your container
3. Refresh the webpage
You should get some kind of "This site can’t be reached"
error message in your browser! The container isn't running
anymore, and therefore the webpage isn't available!
Images vs. Containers
We've been throwing around the terms "docker image" and
"docker container", let's take a step back and make sure we
understand what these terms mean.
A "docker image" is the read-only definition of a container
A "docker container" is a virtualized read-write environment
A container is essentially just an image that's actively running.
Run this command:
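docker images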

You should see an image with the name (repository) "docker/getting-started". That's a static image that we downloaded from Docker Hub. It describes to Docker how to start a new container. When we run:
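docker run -d -p 80:80 docker/getting-started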

We're starting a new container from the "docker/getting-started" image.
Exec
Let's get a little more hands-on with running containers.
List your running containers:
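docker ps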

If you don't have a running container of the docker/getting-started image, start it up again:
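docker run -d -p 80:80 docker/getting-started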

Running One-Off Commands in the Container

Let's start with something simple. Run this:
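docker exec CONTAINER_ID ls

(Replace CONTAINER_ID with the ID you got from docker ps.)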

This executes an ls (list directory) command in the running container and returns the result to our current shell session. You should get a list of all the files and directories in the working directory of the container.
Exec Find Process
Let's try one more command using exec. I'm curious about what software this container is using to serve a webpage.
The netstat command shows us which programs are bound to which ports. We're looking for the process bound to port 80, that is, the one serving the webpage. Let's get an interactive shell in the container so we can run netstat inside it.
Run this command:
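docker exec -it CONTAINER_ID /bin/sh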

-i makes the exec command interactive
-t gives us a tty (keyboard) interface
/bin/sh is the path to the command we're running. After all, a command line shell is just a program. sh is a more basic version of bash.
Once you have a session in the container, print your working directory:
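pwd

To find the webserver, you can also run netstat -ltnp from this shell and look for the program listening on port 80.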
When you're done playing around in the container you can get back to your host
machine by running the exit command.
Multiple Containers
Now that we have some experience with a single
container, let's run multiple containers!
Remember, Docker is very lightweight. It's normal to run
many containers on a single host machine.
Run this command several times, but replace XX with different ports each time, e.g. 81, 82, 83...
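docker run -d -p XX:80 docker/getting-started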

In the -p XX:YY flag, the XX is the host port, while YY is the port within the container. We keep using port 80 within each container because that's the port Nginx serves the webpage on within the container.
We have to use different host ports for different
containers because two processes can't bind to the same
port on the same operating system. Don't believe me? Try
binding two containers to the same port.
You should now be able to load the webpage on different
URLs:
http://localhost:80
http://localhost:81
http://localhost:82
...
Keep in mind, even though it's the same web page, they're each being
served from different containers and different running processes.
Run this command:
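docker stats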

Run the tests against http://localhost:82.

When you're done looking at the stats and running the tests, you can use ctrl + c to get your shell back.

Statefulness
Many docker containers are "stateless", or at least
stateless in the persistent sense. That is, when you create
a new container from the same image, it won't store any
information from before. When you restart the
docker/getting-started container, for example, you're
starting from scratch.
That said, Docker does have ways to support a "persistent
state", and the recommended way is through storage
volumes.
Volumes Are Independent of Containers

In a future exercise, we're going to install the Ghost blogging software on your machine through Docker. As you can imagine, it would be fairly useless to have blogging software that doesn't save your blog posts, so we'll need to use volumes. For now, just create a new volume:
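docker volume create ghost-vol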
Ghost CMS
Cool, we've got a named volume ready to go. Now we need to install Ghost; here's a link to its image on Docker Hub.

Download the Image
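docker pull ghost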

Start the Container

docker run -d -e NODE_ENV=development -e url=http://localhost:3001 -p 3001:2368 -v ghost-vol:/var/lib/ghost/content ghost
-e NODE_ENV=development sets an environment variable within the container. This tells Ghost to run in "development" mode (rather than "production", for instance).
-e url=http://localhost:3001 sets another environment variable; this one tells Ghost that we want to be able to access Ghost via a URL on our host machine.
We've used -p before. -p 3001:2368 does some port-forwarding between the container and our host machine.
-v ghost-vol:/var/lib/ghost/content mounts the ghost-vol volume that we created before to the /var/lib/ghost/content path in the container. Ghost will use the /var/lib/ghost/content directory to persist stateful data (files) between runs.
Creating a Website
Navigate to Ghost's admin panel: http://localhost:3001/ghost/#/setup and create a new website, then publish your first post.
If you navigate back to the homepage of the website, you should see your new post: http://localhost:3001/.

Let's See If Our Volume Is Working

Remember, if we hadn't created and bound a volume to this container, then when the container stops, we would lose any persistent file data (like our new blog post!).
Let's ensure our volume is working by restarting the container. Get your container ID:
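docker ps

Then restart it:

docker restart CONTAINER_ID

When Ghost comes back up at http://localhost:3001/, your post should still be there.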
Deleting the Volume
Now that we're done playing with Ghost, let's save the space
on our host machine by deleting the volume.
1. Use docker ps -a to see all containers, even those that
aren't running.
2. Stop the running Ghost container
3. Remove the ghost container. Use docker --help to find the
right command.
4. Remove the ghost-vol volume. Use docker volume --help
to find the right command.
Now that it's gone, let's see what happens if we try to start the Ghost container back up and attach it to a volume that doesn't exist.

docker run -d -e NODE_ENV=development -e url=http://localhost:3001 -p 3001:2368 -v ghost-vol:/var/lib/ghost/content ghost

Navigate to http://localhost:3001/ in your browser, and you should see a fresh CMS. That's weird, why no errors?
Run:
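docker volume ls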

The ghost-vol is back from the dead!?! It turns out the -v ghost-vol:/var/lib/ghost/content flag binds to a "ghost-vol" volume if it exists; otherwise, it creates one automatically!
So, we now have a fresh installation. Our post that was on the old volume is gone, but this new volume will persist if we don't delete it.
Clean up All of Your Resources
Before we move on, let's clean up all the
resources we've been playing with.
Use docker ps -a to find all your containers and
remove them.
Use docker volume ls to find all of your volumes
and remove them.

Ping
We've already done just a bit of networking in Docker.
Specifically, we've exposed containers to the host network
on various ports and accessed web traffic.
Now, let's force a container into offline mode!

Offline Mode
You might be thinking, "Why would I want to turn off networking?" Well, usually for security reasons. You might want to remove the network connection from a container in one of these scenarios:
You're running 3rd party code that you don't trust, and it shouldn't need
network access
You're building an e-learning site, and you're allowing students to
execute code on your machines
You know a container has a virus that's sending malicious requests over
the internet, and you want to do an audit
Using the "ping" Utility
The ping command allows you to check for connectivity to a
host by sending small packets of data. Try pinging google.com:
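ping google.com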

Press ctrl+c to kill the command after the first few pings.

Break the Network

Now that you've seen how you can ping Google successfully, let's quarantine a container and make sure that we can't reach Google.
Run the "Getting Started" Container
Start the getting started container, but do it in --network
none mode. This removes the network interface from the
container.
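docker run -d --network none docker/getting-started

To verify the quarantine, exec a ping from inside the container; this time it should fail:

docker exec CONTAINER_ID ping google.com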
Load Balancers
Let's try something a bit more complex: let's configure a load balancer!

What Is a Load Balancer?

A load balancer behaves as advertised: it balances a load of network traffic. Think of a huge website like Google.com. There's no way that a single server (literally a single computer) could handle all of the Google searches for the entire world. Google uses load balancers to route requests to different servers.
How Does a Load Balancer Work?
A central server, called the "load balancer", receives traffic from
users (aka clients), then routes those raw requests to different
back-end application servers. In the case of Google, this splits the
world's traffic across potentially many different thousands of
computers.
How Does It "Balance" the Traffic?
A good load balancer sends more traffic to servers that have
unused resources (CPU and memory). The goal is to "balance the
load" evenly. We don't want any individual server to fail due to
too much traffic. There are many strategies that load balancers
use, but a simple strategy is the "round robin". Requests are
simply routed one after the other to different back-end servers.

Example of "Round Robin" Load


Balancing
Request 1 -> Server 1
Request 2 -> Server 2
Request 3 -> Server 3
Request 4 -> Server 1
Request 5 -> Server 2
...
Application Servers
First, we need to start some application servers so that we have something to load balance! We'll be using Caddy, an awesome open-source load balancer/web server. Nginx and Apache are other popular alternatives that do similar things, but Caddy is a modern version written in Go, so I think it will be cool to play with.

What Will Our Application Servers Do?

Each application server will serve a slightly different HTML webpage. The reason they're different is just so that we can see load balancing in action!

1. Pull Down the caddy Image
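docker pull caddy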

2. Create an index1.html File in Your Working Directory
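Any simple page will do, so long as the text identifies server 1; for example:

<html>
  <body>
    <h1>Hello from server 1</h1>
  </body>
</html>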


3. Create an index2.html File in Your Working Directory
Same as above, but with "Hello from server 2" in the heading.

4. Run Caddy Containers to Serve the HTML
Run a container for index1.html on port 8001:
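One way to do it (the official caddy image serves files from /usr/share/caddy, so we mount our HTML file there as the index):

docker run -p 8001:80 -v $PWD/index1.html:/usr/share/caddy/index.html caddy

And the same for index2.html on port 8002:

docker run -p 8002:80 -v $PWD/index2.html:/usr/share/caddy/index.html caddy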

You can run them in separate terminal sessions, or you can run them in detached mode with -d, whichever you prefer.
Navigate to localhost:8001 in a browser. You should see "Hello from server 1". Next, navigate to localhost:8002 and hopefully, you'll see "Hello from server 2"!

Custom Network
Docker allows us to create custom bridge networks so that our
containers can communicate with each other if we want them to,
but remain otherwise isolated. We're going to build a system
where the application servers are hidden within a custom
network, and only our load balancer is exposed to the host.
Let's create a custom bridge network called "caddytest".
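docker network create caddytest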
You can see if it worked by listing all the networks:
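docker network ls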

Restart Your Application Servers on the Network

Stop and restart your caddy application servers, but this time, make sure you attach them to the caddytest network and give them names:
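Assuming the same volume mounts as before (note that we no longer publish host ports; we'll reach these servers through the bridge network instead):

docker run -d --network caddytest --name caddy1 -v $PWD/index1.html:/usr/share/caddy/index.html caddy
docker run -d --network caddytest --name caddy2 -v $PWD/index2.html:/usr/share/caddy/index.html caddy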

Contacting the Caddy Servers Through the Bridge

To make sure it's working, let's get a shell session inside a "getting started" container on the custom network:

By giving our containers some names, caddy1 and caddy2, and providing a bridge network, Docker has set up name resolution for us! The container names resolve to the individual containers from all other containers on the network. Within your docker/getting-started container shell, curl the first container:
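curl caddy1

Then curl the second one with curl caddy2. You should see "Hello from server 1" and "Hello from server 2" respectively.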
Once you get the HTML responses that you expect, exit out of your
shell session within the "getting started" container and then run and
submit the tests.
Note that if you need to restart your caddy application servers after
naming them, you can use: docker start caddy1 and docker start
caddy2.

Configuring the Load Balancer

We've confirmed that we have 2 application servers (Caddy) working properly on a custom bridge network. Let's create a load balancer that balances network requests between the two! We'll use a round-robin balancing strategy, so each request should route back and forth between the servers.
If you haven't yet, you can stop any containers that aren't the 2 caddy servers we're working on currently.
Caddyfile for the Load Balancer
Caddy works great as a file server, which is what our little HTML
servers are, but it also works great as a load balancer! To use Caddy
as a load balancer we need to create a custom Caddyfile to tell
Caddy how we want it to balance traffic.
Create a new file in your local directory called Caddyfile:
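Based on the description below, the file should look something like this (reverse_proxy with a round-robin policy):

localhost:80

reverse_proxy caddy1:80 caddy2:80 {
    lb_policy round_robin
}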

This tells Caddy to run on localhost:80, and to round robin any incoming traffic to caddy1:80 and caddy2:80.
Remember, this only works because we're going to run the load balancer within the same network, so caddy1 and caddy2 will automatically resolve to our application servers' containers.
Run the Load Balancer
Instead of providing an index.html to this Caddy server, we're
going to give it our custom Caddyfile.
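The official caddy image reads its config from /etc/caddy/Caddyfile, so we mount ours there and attach the container to the caddytest network:

docker run --network caddytest -p 8080:80 -v $PWD/Caddyfile:/etc/caddy/Caddyfile caddy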

Now you can hit the load balancer on http://localhost:8080/! You should get a response from either server 1 or server 2, and if you hard refresh the page, it will swap back and forth.
If it's not swapping properly, try using curl instead. Your browser might be caching the HTML.
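curl localhost:8080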

Each time you run curl, you should get a response from a different
server!
Building Images
I've used Docker in the past to install third-party software, both on my local machine and on the production servers of companies I've worked for. However, as a back-end developer, I've more often used it to build images of my software.

Dockerfiles

Dockerfiles are amazing because they allow us to define the environment our applications are meant to use in code. We can even commit the Dockerfiles to Git alongside our source code.

Creating a Dockerfile
Create a single file called Dockerfile in your working directory. If you're using VS
Code, I'd recommend installing the Docker extension. It will give you some nice
syntax highlighting among other features.
Inside the Dockerfile add these lines of text:
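A two-line Dockerfile is enough here; this sketch assumes the same slim Debian base we'll use later, and just echoes "hello world":

# Base the image on a slim Debian image
FROM debian:stable-slim
# Print "hello world" when the container starts
CMD ["echo", "hello world"]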
Build a new image from the Dockerfile:
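docker build . -t helloworld:latest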

-t helloworld:latest tags the image with the name "helloworld" and the "latest" tag. Names are used to organize your images, and tags are used to keep track of different versions.
Run your image in a new container:
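docker run helloworld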

If all went well, you'll see "hello world" printed to the


console!
Next, run docker ps. You'll notice that your container isn't running anymore! All it did was print and exit. Just like regular programs, Docker containers can execute simple commands that exit quickly, or they can run servers that stay up until killed.
You can see the stopped container by running docker ps -a. Feel free to delete the Dockerfile; we don't need it anymore.
Dockerizing a Server
Now that we've built a very simple image from scratch,
let's do something a bit more realistic: let's dockerize a
web server. I'll provide some working Go code for you at
the bottom of this page.
Now that you know how to run your server manually, let's
run it in Docker! First, we need to create a Dockerfile in
the root of your server's repo. Let's start with a simple
lightweight Debian Linux OS again.
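A minimal sketch, assuming you've compiled the server into a binary named goserver in the repo root (the binary name is a placeholder):

# Start from a simple lightweight Debian Linux OS again
FROM debian:stable-slim

# Copy the compiled server binary into the image
COPY goserver /bin/goserver

# Run the server when the container starts
CMD ["/bin/goserver"]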

Dockerizing Python
We need to add the entire Python runtime to our image to
be able to run Python code in a container!
Before the first COPY line, let's install Python. We'll use
the RUN command, and because this is a Debian/Linux
image, we'll use apt to get some dependencies, then we'll
build the Python source code.
I've included the full Dockerfile you'll need here, along
with some annotations about how it works.
# Build from a slim Debian/Linux image
FROM debian:stable-slim

# Update apt
RUN apt update
RUN apt upgrade -y

# Install build tooling
RUN apt install -y build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev libsqlite3-dev wget libbz2-dev

# Download the Python interpreter source and unpack it
RUN wget https://www.python.org/ftp/python/3.10.8/Python-3.10.8.tgz
RUN tar -xf Python-3.10.*.tgz

# Build the Python interpreter
RUN cd Python-3.10.8 && ./configure --enable-optimizations && make && make altinstall

# Copy our code into the image
COPY main.py main.py

# Copy our data dependencies
COPY books/ books/

# Run our Python script
CMD ["python3.10", "main.py"]
Rebuild the Image
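Assuming you name the image after the project:

docker build . -t bookbot:latest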

It might take a few minutes to build the image. Again, this is the beauty of a language like Go. Many other programming languages either have long build times or long configuration steps.

Run the Image
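docker run bookbot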

If Bookbot ran, then you did it correctly! We've bundled up Bookbot, its required data, and its required runtime all into a nice little container image!
Run and submit the tests.
Publishing to Docker Hub
Hopefully, you've already created and signed up for a Docker Hub account, but let's review a bit about what Docker Hub is.
Docker Hub is the official cloud service for storing and sharing Docker images. We call these kinds of services "registries". Other popular image registries include:
AWS ECR
GCP Container Registry
GitHub Container Registry
Harbor
Azure ACR
Deployment Pipelines
Publishing new versions of Docker images is a very common method of deploying cloud-native back-end servers. The deployment pipeline of many production systems (including the server that powers the Boot.dev site you're on currently) is described step by step under "The Deployment Process" below.
Build and Run Your New Image
Let's tag the new image with a new minor version. We're fans of semantic versioning here, though it's a common convention to not include the v in Docker tags.
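For example, with USERNAME standing in for your Docker Hub username and 0.1.0 as a placeholder version:

docker build . -t USERNAME/bookbot:0.1.0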

Should I Use "latest"?


The convention I'm familiar with is to use semantic versioning on
all your images, but to also push to the "latest" tag. That way you
can keep all of your old versions around, but the latest tag still
always points to the latest version.
You can build and push for multiple tags like this:
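Again with USERNAME as a placeholder:

docker build . -t USERNAME/bookbot:0.1.0 -t USERNAME/bookbot:latest
docker push USERNAME/bookbot:0.1.0
docker push USERNAME/bookbot:latest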

We're done, great work!


The Bigger Picture
Before we wrap up this Docker course, I want to reiterate
how Docker fits into the software development lifecycle,
particularly at modern "DevOpsy" tech companies,
because it's really important to understand.

The Deployment Process


1. The developer (you) writes some new code
2. The developer commits the code to Git
3. The developer pushes a new branch to GitHub
4. The developer opens a pull request to the main branch
5. A teammate reviews the PR and approves it (if it looks good)
6. The developer merges the pull request
7. Upon merging, an automated script, perhaps a GitHub Action, is started
8. The script builds the code (if it's a compiled language)
9. The script builds a new docker image with the latest program
10. The script pushes the new image to Docker Hub
11. The server that runs the containers, perhaps a Kubernetes cluster,
is told there is a new version
12. The k8s cluster pulls down the latest image
13. The k8s cluster shuts down old containers as it spins up new
containers of the latest image
