Docker Introduction
Docker is a lightweight form of computer virtualization that provides valuable performance benefits.
# Concepts
Unlike classic forms of virtualization, Docker introduces new concepts along with their own vocabulary.
Take a look at two of them: containers and images.
# Containers
You know how a computer operates: it uses a disk and memory to run an operating system and the programs within it.
You know how a virtual machine works: its host computer offers it a virtual disk and virtual memory, and the virtual machine uses both to run its own operating system and programs. Virtualization mimics a full computer, which requires disk space and memory. It also takes time to start because the virtualized operating system needs to boot from scratch.
Docker uses Linux features called namespaces and control groups. With these features, Linux can launch programs (even low-level ones of the operating system) in an environment where the programs believe they are in a complete Linux environment of their own, i.e. sandboxed. This is also called containerization. A container does not have full disk or full memory access to its host (the Linux environment that started it), only to a subset of it. A Linux system can run multiple containers at a time.
As opposed to proper virtualization, containerization is fast:
- You do not need to start a whole operating system; you only need to isolate a new container and run its programs. You now count the start-up time in seconds instead of minutes.
- The memory used by the container is only that of the programs it runs, while still benefiting from the rest of the Linux system in an isolated way.
# Images
Instead of a virtual disk with a full operating system, a container starts from an image. The image contains the files that are specific to the container and differ from its Linux host. Think of it as tracing paper laid on top of the host's file system, recording only the differences. These differences can be very small.
You can either create images locally via a `Dockerfile` containing the commands that create the image, or you can use an existing image.
There exist registries that store and/or reference images, much like your regular package manager. Docker Hub is the main one, and your local Docker knows how to download images directly from it.
This introduces an interesting concept whereby a machine is described as a file, which is very useful for reproducibility and DevOps. Further, the images are optimized so that each image is a diff of a parent one.
Images can be versioned and even referenced by their content hash so that you can be sure to use the expected one. For instance, Node.js has a long list of images. Do you want Node 19.1, or Node 19.1 built specifically on Debian Buster?
Because images contain files to be executed, the files also need to have been compiled for the CPU architecture of your Linux machine. This is why images are often uploaded in different "OS/ARCH" versions, as can be seen on the Node 19.1 page.
For the avoidance of doubt:
- An image is read-only, and when your container starts its read-write file system is a separate entity.
- More than one container can be started from the same image at the same time.
# How to use it
What if you do not have a Linux operating system? Not to worry: Docker simplifies your life by installing and running a virtual machine with a barebones Linux on your host computer.
After installation, when you want to use Docker, you start Docker, which in turn starts the virtual machine running Linux; this is the part that takes time. After it has started, you can use commands to run containers. When you no longer need Docker, you can stop it and regain the memory it used.
In these tutorials, you will come across a lot of Docker commands, so it makes sense to familiarize yourself with them.
First, install it. Next, start Docker.
When it has started, you can run your first container. For instance, with Node.js' `lts-slim` image:
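A minimal invocation that matches this description looks like:

```sh
# Run a container from the node:lts-slim image, interactively
docker run -it node:lts-slim
```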
`-it` is short for `--interactive --tty` and means "with input and output", instead of a fully detached container. Learn more with `docker run --help`. This should return you a Node.js prompt:
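The banner looks roughly like this; the exact version depends on which LTS release the image currently ships:

```txt
Welcome to Node.js v18.x.x.
Type ".help" for more information.
>
```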
Version 18 is a long-term support (`lts`) version. If you type `.help`, it will tell you what you need. Exit with `.exit`.
That was fast. What happened?
Docker downloaded the image from the hub and launched it. By default, the image is configured so that the container launches `node`, which is what you got. You did not do anything interesting, though. Yet.
# Open a shell
What if you want to connect to your container with a shell? After all, this is running Linux. Because this image defines Node as its entrypoint, you need to override it:
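The entrypoint can be overridden on the command line, for instance by asking for `bash` instead:

```sh
# Replace the image's Node entrypoint with a bash shell
docker run -it --entrypoint bash node:lts-slim
```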
Now you see something different:
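The prompt now belongs to a shell inside the container; the hostname portion is the container's ID, so it differs every run:

```txt
root@<container-id>:/#
```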
You are `root`! But you are `root` only in the container, not in the Linux host. Typically, programs running in a container are left running as `root`, as the container takes care of the isolation.
The image `node:lts-slim`, as the `slim` part indicates, does not have much else besides Node. Try:
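For instance, ask for `curl`, the tool that the next section installs:

```sh
# Inside the container's shell opened above
curl --version
```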
It should tell you:
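Since `curl` is not installed in this image, `bash` answers along these lines:

```txt
bash: curl: command not found
```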
This means that you need to pick your image carefully, and even then sometimes you will also have to install the tools you need. Exit with a regular `exit`.
# Your own image
Suppose you need a Node container with the `curl` command available. For that, you need to build your image. It so happens that `node:lts-slim` is built on Debian, so you can use `apt-get` to install new programs.
Create a new file named `Dockerfile` with:
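A minimal `Dockerfile` matching the description that follows (refresh the package registry, then install `curl`) looks like:

```Dockerfile
# Start from the same slim Node image used so far
FROM node:lts-slim
# Refresh the package registry, then install curl
RUN apt-get update && apt-get install -y curl
```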
You have to `update` the package registry because the image is kept slim by not even having a local copy. Build your new image with a name of your choosing:
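The image name used here, `node-with-curl`, is only an example; any name works:

```sh
# Build an image from the Dockerfile in the current folder and tag it
docker build . -t node-with-curl
```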
Now your image is ready.
Can you call up `curl`? Type:
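One way is to override the entrypoint again, this time with `curl` itself, using the example image name from above:

```sh
# Run curl --version inside a container built from the new image
docker run -it --entrypoint curl node-with-curl --version
```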
It should return:
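The exact version string depends on the Debian release underneath, but it is along these lines:

```txt
curl 7.x.x (x86_64-pc-linux-gnu) libcurl/7.x.x ...
```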
So yes, you now have `curl` and Node in the same container.
# Hello World
What if you wanted to use Node to print `Hello World`? Create the JavaScript file that can do that:
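A one-line `test.js` is enough:

```javascript
// test.js: print a greeting and exit
console.log("Hello World")
```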
Now pass it to your container:
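Using the `lts-slim` image again (the `node-with-curl` image built above would behave the same), the first attempt simply passes the file name as an argument:

```sh
# This passes only the words "test.js" to the container, not the file itself
docker run -it node:lts-slim test.js
```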
This does not work:
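Node cannot find any such file inside the container, so the error's key line reads roughly:

```txt
Error: Cannot find module '/test.js'
```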
This is because Docker took the words `test.js` and passed them to Node as a string within the context of the container, which has no `test.js`. In effect, the container ran `node test.js`. The file is currently only on your host computer.
Try again:
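One option is to redirect the file into the container:

```sh
# Redirect test.js into the container's standard input
docker run -it node:lts-slim < test.js
```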
Here the content of the file is being passed via StdIn. It should still complain:
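With `-t` still present, Docker refuses to attach a pseudo-terminal to a redirected input:

```txt
the input device is not a TTY
```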
So remove `-t`:
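That leaves `-i`, which keeps StdIn forwarded:

```sh
# Interactive, but without a pseudo-terminal
docker run -i node:lts-slim < test.js
```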
This time it prints:
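```txt
Hello World
```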
# Sharing folders
Shared folders are called volumes. Instead of sending the content of `test.js` to the container via StdIn, it could be more judicious to let the container have access to your file directly. Try again, this time by sharing your local folder (`pwd`) with the container, and mounting it at `/root/temp` inside the container:
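Assuming a Unix-like shell where `$(pwd)` expands to your current folder, the command becomes:

```sh
# Mount the current folder at /root/temp inside the container
docker run -it -v $(pwd):/root/temp node:lts-slim test.js
```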
It should complain again:
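The file is now visible inside the container, but Node is still launched from `/`, so it keeps looking in the wrong place:

```txt
Error: Cannot find module '/test.js'
```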
To progress, you also have to tell it to work (`-w`) in the right folder: `/root/temp`, which you created to access your local files:
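Adding the working-directory flag to the previous command:

```sh
# Mount the current folder and make it the working directory
docker run -it -v $(pwd):/root/temp -w /root/temp node:lts-slim test.js
```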
This time it returns:
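```txt
Hello World
```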
# Clean up
You ran quite a few commands. Where did your containers go?
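You can list them all, including the stopped ones:

```sh
# List running and stopped containers
docker ps --all
```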
You should see a long list like this:
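The IDs, timestamps, and auto-generated names differ on every machine, but the shape of the list is:

```txt
CONTAINER ID     IMAGE            COMMAND                  CREATED   STATUS             PORTS     NAMES
<container-id>   node:lts-slim    "docker-entrypoint.s…"   …         Exited (0) … ago             <generated-name>
<container-id>   node-with-curl   "docker-entrypoint.s…"   …         Exited (0) … ago             <generated-name>
…
```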
All have stopped and exited. When you create a container, it is not removed by default so that it can be reused (see `docker exec --help`). For now, you should clean up these pointless containers. However, if you have containers that are not part of this introduction, do not run the command:
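One way to remove every stopped container in one go is Docker's built-in prune subcommand, which asks for confirmation before deleting anything:

```sh
# Remove ALL stopped containers (prompts for confirmation)
docker container prune
```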
Having many stopped containers is not ideal, which is why, when you want to run a container for a single command, the practice is to add `--rm`, like so:
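For instance, repeating the earlier Hello World command:

```sh
# --rm removes the container as soon as it exits
docker run --rm -it -v $(pwd):/root/temp -w /root/temp node:lts-slim test.js
```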
Including `--rm` automatically removes the container at the moment it is stopped and exited. You can confirm that there are no remaining containers.
What about the images?
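They have their own listing command:

```sh
# List locally stored images
docker images
```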
There you should see your images too:
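Again, the IDs, dates, and sizes depend on your machine; the list looks something like:

```txt
REPOSITORY       TAG        IMAGE ID     CREATED   SIZE
node-with-curl   latest     <image-id>   …         …
node             lts-slim   <image-id>   …         …
```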
You can delete them, in any order, with:
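For instance, using the example image name chosen earlier:

```sh
# Remove the local copies of the images
docker image rm node-with-curl node:lts-slim
```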
This concludes your introduction to Docker.
To summarize, this section has explored:
- What Docker images and containers are.
- How to use a container.
- How to share folders via volumes.
- How to create an image.
- How to clean up.
With these basics, you are equipped to handle the Docker examples of the tutorials.