In this article, we will look into how we can persist data in Docker with Docker volumes.
Let’s think about a database such as MySQL where we store data. MySQL usually stores its data inside the /var/lib/mysql directory. If we run MySQL as a container in Docker, this data will be stored at the same path inside the container’s own file system.
But what would happen if we destroy the container? Since data is saved inside the container, data will also be destroyed along with it. This is certainly not what we want for a stateful application such as a database where we need data to persist. We need a way to preserve the data even if the container is destroyed. Docker handles this problem with volumes.
What we do with Docker volumes is basically mount a directory on the Docker host into a directory in the container. When the container writes to that directory, the data actually lands in the mounted directory on the Docker host. Because of this, even if the container is destroyed, the data persists on the Docker host and we can mount the same directory into a new container.
There are three storage options we will look at in this article.
- Bind Mounts (referred to as “host volumes” in some places)
- Anonymous Volumes
- Named Volumes
Bind Mounts
With bind mounts, we decide exactly which directory or file is mounted by explicitly giving its absolute path on the host when running the container. Look at the example below.
docker run -d --name db \
-e MYSQL_ROOT_PASSWORD=charith \
-e MYSQL_DATABASE=photo_app \
-e MYSQL_USER=charith \
-e MYSQL_PASSWORD=charith \
-p 53306:3306 \
-v /var/lib/mysql:/var/lib/mysql \
mysql
Notice line 7 of the command (the -v flag), where I have specified the absolute path. The part to the left of the colon is the location on the Docker host, while the part to the right of the colon is the directory within the Docker container.
With the -v flag, the path does not need to exist on the Docker host already. If it does not exist yet, it is created on demand, always as a directory. Bind mounts are very performant, but they rely on the host machine’s file system having a specific directory structure available.
Another important thing to remember is that bind mounts give the container access to files on the Docker host. This means that, with the right permissions, the container can read and even modify the host file system. Therefore it is important to be aware of the security implications when using bind mounts.
That is not to say bind mounts should be avoided at all costs. One important use case is controlling Docker from inside a Docker container when necessary, which is commonly done by bind-mounting the host’s Docker socket (/var/run/docker.sock) into the container.
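One way to limit the exposure mentioned above is to make a bind mount read-only by appending :ro to it. The snippet below is only a sketch: the function name, the container name db_ro and the host directory are my own placeholders, and I am assuming the official mysql image picks up extra configuration files from /etc/mysql/conf.d.

```shell
# Start MySQL with a host config directory bind-mounted read-only.
# $1: absolute path of a host directory holding *.cnf files (placeholder).
run_mysql_with_readonly_config() {
  conf_dir="$1"
  case "$conf_dir" in
    /*) ;;  # absolute path: treated as a bind mount
    *)  # a bare name here would be treated as a volume name instead
        echo "host path must be absolute: $conf_dir" >&2
        return 1 ;;
  esac
  # The trailing :ro means the container can read the files but not change them.
  docker run -d --name db_ro \
    -e MYSQL_ROOT_PASSWORD=charith \
    -v "$conf_dir":/etc/mysql/conf.d:ro \
    mysql
}
```

You would call it as run_mysql_with_readonly_config /srv/mysql-conf, where /srv/mysql-conf is whatever directory holds your configuration files.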
Anonymous Volumes
Have a look at this example, specifically at line number 7.
docker run -d --name db2 \
-e MYSQL_ROOT_PASSWORD=charith \
-e MYSQL_DATABASE=photo_app \
-e MYSQL_USER=charith \
-e MYSQL_PASSWORD=charith \
-p 43306:3306 \
-v /var/lib/mysql \
mysql
In this case, we don’t specify which directory on the Docker host is mounted. Instead we just reference the directory within the container with the -v flag. The rest is taken care of by Docker.
Volumes created in this manner usually live in the /var/lib/docker/volumes directory on Linux. The volume will be named with a random hash.
On Windows with Docker Desktop and WSL, you can find volumes in the \\wsl$\docker-desktop-data\version-pack-data\community\docker\volumes\ directory (the exact path can vary between Docker Desktop versions).
If I list everything inside the directory, I can find these various volumes belonging to different containers.
charith@ubuntu-ch:~$ sudo ls /var/lib/docker/volumes
65d50b717c42c121366521a2b8922f9ae2e137956889b2d8ff1f179712e54849
9299c8b4e8482ba4cb425eb8a3e6c38dbbb855ba3f2c3d0bf001fcdf13b454e4
backingFsBlockDev
fa5cdf104d53d07ea010728ce6a894480a8661e0e4049e6660bc56e89d20551c
jenkins_home
metadata.db
Since the volumes are named with a random hash, it can be quite difficult to pick out the right one when you need to mount it to a different container later. In my opinion, there’s absolutely no reason to use this type of volume. If you know any use cases though, let me know in the comments. 🙂
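If you do end up with anonymous volumes, you can at least ask Docker which hash belongs to which container. Below is a small sketch using docker inspect with a Go template. The function name is just something I made up for illustration.

```shell
# Print "<volume name> -> <path in container>" for every volume a container uses.
# $1: container name, e.g. db2 from the example above.
list_container_volumes() {
  docker inspect \
    --format '{{ range .Mounts }}{{ if eq .Type "volume" }}{{ println .Name "->" .Destination }}{{ end }}{{ end }}' \
    "$1"
}
```

Running list_container_volumes db2 should print the random hash next to /var/lib/mysql. Bind mounts are skipped because their type is not "volume".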
Named Volumes
And finally the most popular of all, named volumes. Have a look at the example below.
docker run -d --name db3 \
-e MYSQL_ROOT_PASSWORD=charith \
-e MYSQL_DATABASE=photo_app \
-e MYSQL_USER=charith \
-e MYSQL_PASSWORD=charith \
-p 33306:3306 \
-v mysql_volume:/var/lib/mysql \
mysql
With named volumes, we decide the name of the volume and Docker automatically creates a directory for it and manages it. Unlike a bind mount, the container is not given access to an arbitrary path on the host, and the volume is easy to reference, back up and use across containers. Out of the three methods above, this is the preferred one.
In the example above, I have named the volume mysql_volume, and if I look at the /var/lib/docker/volumes directory, I can find it there.
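To see the data actually surviving a container, we can throw away db3 and start a new container on top of the same named volume. This is only a sketch: the function name, the new container name db4 and the host port 33307 are placeholders I picked. I am also assuming the official mysql image skips its first-time setup when the data directory already contains a database, which is why the MYSQL_* variables are left out.

```shell
# Replace container db3 with a fresh one that reuses the same named volume.
# db4 and port 33307 are placeholder choices for this sketch.
recreate_with_same_volume() {
  # Removing the container does not remove the named volume.
  docker rm -f db3 &&
  docker run -d --name db4 \
    -p 33307:3306 \
    -v mysql_volume:/var/lib/mysql \
    mysql
}
```

After calling recreate_with_same_volume, the photo_app database created by db3 should still be there when you connect to db4.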
Some Useful Docker Commands Related to Volumes
List Available Volumes
To list available volumes, you can run the docker volume ls command. It lists the volumes as shown below. As you can see, there are both named volumes and anonymous volumes on my Docker host.
charith@ubuntu-ch:~$ docker volume ls
DRIVER VOLUME NAME
local 65d50b717c42c121366521a2b8922f9ae2e137956889b2d8ff1f179712e54849
local 9299c8b4e8482ba4cb425eb8a3e6c38dbbb855ba3f2c3d0bf001fcdf13b454e4
local fa5cdf104d53d07ea010728ce6a894480a8661e0e4049e6660bc56e89d20551c
local mysql_volume
Inspect Volumes
To inspect a volume, you can run the docker volume inspect command followed by the name of the volume.
charith@ubuntu-ch:~$ docker volume inspect mysql_volume
[
{
"CreatedAt": "2021-10-08T15:08:24Z",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/mysql_volume/_data",
"Name": "mysql_volume",
"Options": null,
"Scope": "local"
}
]
The command gives me information such as the mount point, the driver etc.
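When a script only needs one of these fields, the --format flag can pull it out without the surrounding JSON. A minimal sketch, with a function name of my own choosing:

```shell
# Print only the host path where a volume's data lives.
# $1: volume name, e.g. mysql_volume.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

For the volume above, volume_mountpoint mysql_volume would print the Mountpoint value shown in the JSON output.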
Remove Volumes
When you permanently get rid of a particular container, the volumes attached to it won’t be removed automatically. So it is important to remove them yourself once you no longer need them, to reclaim the space they consume.
To remove a volume, we can run the docker volume rm command followed by the volume name.
charith@ubuntu-ch:~$ docker volume rm fa5cdf104d53d07ea010728ce6a894480a8661e0e4049e6660bc56e89d20551c
fa5cdf104d53d07ea010728ce6a894480a8661e0e4049e6660bc56e89d20551c
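Since a removed volume cannot be brought back, it can be worth archiving its contents first. The sketch below uses a throwaway alpine container that mounts the volume read-only and writes a tar archive into a host directory. The function name, the /data and /backup paths inside the container and the archive name are all my own choices. For a database, I would stop the container using the volume before taking the backup so the files are in a consistent state.

```shell
# Archive the contents of a volume into a .tar.gz file in a host directory.
# $1: volume name; $2: absolute host directory for the archive (placeholder).
backup_volume() {
  volume="$1"
  dest_dir="$2"
  # --rm deletes the helper container as soon as tar finishes.
  docker run --rm \
    -v "$volume":/data:ro \
    -v "$dest_dir":/backup \
    alpine tar czf "/backup/$volume.tar.gz" -C /data .
}
```

For example, backup_volume mysql_volume /tmp/backups would leave mysql_volume.tar.gz in /tmp/backups on the host.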
Pruning Volumes
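Pruning is a way to remove all volumes that are not attached to any container. You have to be very careful when pruning though, as it might remove volumes that you don’t want removed just yet. For example, a volume that belonged to a container you have removed but plan to recreate later with the same data. A volume attached to a stopped container still counts as in use and is not pruned. Also note that on recent Docker versions (23.0 and later), docker volume prune removes only unused anonymous volumes by default, and you need the --all flag to include unused named volumes.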
Let’s list down the available volumes first.
charith@ubuntu-ch:~$ docker volume ls
DRIVER VOLUME NAME
local 65d50b717c42c121366521a2b8922f9ae2e137956889b2d8ff1f179712e54849
local 9299c8b4e8482ba4cb425eb8a3e6c38dbbb855ba3f2c3d0bf001fcdf13b454e4
local fa5cdf104d53d07ea010728ce6a894480a8661e0e4049e6660bc56e89d20551c
local mysql_volume
Now we can run the prune command as shown below. You will get a warning and a prompt for confirmation. Enter Y to continue. Once done, you will also be shown the total amount of space reclaimed!
charith@ubuntu-ch:~$ docker volume prune
WARNING! This will remove all local volumes not used by at least one container.
Are you sure you want to continue? [y/N] Y
Deleted Volumes:
65d50b717c42c121366521a2b8922f9ae2e137956889b2d8ff1f179712e54849
9299c8b4e8482ba4cb425eb8a3e6c38dbbb855ba3f2c3d0bf001fcdf13b454e4
fa5cdf104d53d07ea010728ce6a894480a8661e0e4049e6660bc56e89d20551c
Total reclaimed space: 383.5MB
Now let’s list down the volumes again so we can compare what was actually removed by the prune.
charith@ubuntu-ch:~$ docker volume ls
DRIVER VOLUME NAME
local mysql_volume
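If you would rather see what is unused before pruning, docker volume ls accepts a dangling filter. A small sketch, again with a made-up function name:

```shell
# List the names of volumes that no container (running or stopped) references.
list_unused_volumes() {
  docker volume ls --filter dangling=true --format '{{ .Name }}'
}
```

Depending on your Docker version, a plain docker volume prune may remove only the anonymous volumes from this list, so treat the output as the upper bound of what could be deleted.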