Purpose of Docker Volumes
Docker containers are meant to be a drop-in replacement for applications. They are meant to be disposable and easy to replace. This property is, in fact, the cornerstone of many CI/CD pipelines. When a change is pushed to your source repository, it triggers a chain of events: Docker images are automatically built, tested and (sometimes) even deployed right into production, replacing the older versions seamlessly.
But there’s often persistent data that needs to be preserved between different releases of your application. Examples include databases, configuration files for your apps, log files, and security credentials like API keys and TLS certificates.
To allow all this data to persist, we will use Docker Volumes. These are just parts of the Docker Host’s filesystem (a directory or a block device formatted with a filesystem) that can be mounted inside a container at any desired location of the container’s filesystem.
Set Up
To ensure that we are all on the same page, here are the versions of the Docker runtime and Docker-Compose that I am using:
- Docker version 18.09.2, build 6247962
- Docker-compose version 1.23.2, build 1110ad01
- Compose file version 3: Works with 1.13.0 and above
Example: Hosting a Ghost CMS Website
Working with Compose is really straightforward. You write a YAML file describing your deployment and then deploy it using the docker-compose CLI. Let’s start with a simple Ghost CMS deployment.
Create a directory called ComposeSamples and, within it, create a file called docker-compose.yaml.
$ mkdir ComposeSamples
$ cd ComposeSamples
Contents of docker-compose.yaml:
version: "3.0"
services:
  web:
    image: ghost:latest
    ports:
      - "2368:2368"
    volumes:
      - cms-content:/var/lib/ghost/content
volumes:
  cms-content:
This compose file declares a single service, web, which runs the latest image of Ghost CMS from Docker Hub’s official repository. The port exposed is 2368 (more on this a little later), and a volume called cms-content is mounted at /var/lib/ghost/content. You can learn about your particular application and its nuances by looking up that app’s documentation. For example, the Ghost container’s default port 2368 and the default mount point for the website’s contents, /var/lib/ghost/content, are both mentioned in the container’s official documentation.
If you are writing a new application of your own, think about all the persistent data it will need access to and accordingly set the mount points for your Docker volumes.
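As a rough sketch of that planning step, the file below gives a made-up application one named volume per kind of persistent data. The image name, volume names and container paths are all hypothetical and only illustrate the layout; your own application will dictate the real mount points.

```yaml
version: "3.0"
services:
  app:
    # Hypothetical image name, used only for illustration.
    image: example/my-app:1.0
    volumes:
      # Hypothetical paths: one volume per kind of persistent data.
      - app-data:/var/lib/my-app      # database files or uploads
      - app-logs:/var/log/my-app      # log files
      - app-certs:/etc/my-app/certs   # TLS certificates and keys
volumes:
  app-data:
  app-logs:
  app-certs:
```

Keeping each kind of data in its own volume makes it easier to back up or rotate them independently.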
To test that the persistent volume works, start the deployment with docker-compose up -d and then try this:
- Open a browser and enter your Docker Host’s IP, that is, http://DockerHostIP:2368/ghost (or just http://localhost:2368/ghost ) and create an admin account. Modify one of the preexisting posts and save.
- List all the Docker components that are running using the commands: docker ps, docker network ls, docker volume ls
- In the same directory as your compose file, execute the command docker-compose down, then list the Docker containers, networks and volumes again. Interestingly, you will notice that while the container and the network created by docker-compose are removed, the Docker volume is still intact.
- Run docker-compose up -d and you will notice that the modified post is just where you left it. Even your admin login credentials can be used again, so you don’t have to create a new admin account.
- Remove the volumes sections from both the services: web: section and the top level of the file. If you now repeat the above three steps, you will notice that the changes no longer persist: the modified post and the admin account are gone.
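For that last step, the compose file with the volume entries removed would look roughly like the sketch below. It is simply the original file minus its two volumes sections.

```yaml
version: "3.0"
services:
  web:
    image: ghost:latest
    ports:
      - "2368:2368"
    # No volumes: entry here and no top-level volumes: section,
    # so no named volume is attached to this service.
```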
Syntax and Verbosity
The syntax to introduce a volume using docker-compose is pretty straightforward. You start with a service definition, which is something akin to a container, and mention the name of the volume that you want to mount inside it. If you don’t want to name the volume, you can go for a lazy syntax like the one below, which creates an anonymous volume:
services:
  web:
    image: ghost:latest
    ports:
      - "2368:2368"
    volumes:
      - /var/lib/ghost/content
If you want to be a bit more verbose, then you will have to mention the Docker Volume as a top level definition:
services:
  web:
    image: ghost:latest
    ports:
      - "2368:2368"
    volumes:
      - cms-content:/var/lib/ghost/content
## Define that cms-content is in fact a volume.
volumes:
  cms-content:
Although the latter version requires you to type more, it is more explicit. Choose relevant names for your volumes, so your colleagues can understand what’s been done. You can go even further and mention the type of volume (more on this later) and point out the source and target. This long form requires Compose file format 3.2 or above:
- type: volume
  source: cms-content
  target: /var/lib/ghost/content
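Put into a complete file, the long form might look like the sketch below. It assumes the volume is named cms-content, as in the earlier Ghost example, and it bumps the file format to 3.2 because the long volume syntax is not available in 3.0.

```yaml
version: "3.2"   # long volume syntax needs file format 3.2 or newer
services:
  web:
    image: ghost:latest
    ports:
      - "2368:2368"
    volumes:
      - type: volume
        source: cms-content              # must match a top-level volume name
        target: /var/lib/ghost/content   # mount point inside the container
volumes:
  cms-content:
```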
Bind Mounts
Bind mounts are parts of the host file system that can be mounted directly inside the Docker container. To introduce a bind mount, simply mention the host directory you want to share and the mount point inside the Docker container where it ought to be mounted:
- /home/<USER>/projects/ghost:/var/lib/ghost/content
I used the path /home/<USER>/projects/ghost just as an example. You can use whatever path on your Docker host you want, provided you have access to it, of course.
You can also avoid hard-coded absolute paths by using $PWD or ~, but that can easily lead to bugs and disasters in real-world scenarios where you are collaborating with multiple other humans, each with their own Linux environment. On the flip side, sometimes relative paths are actually easier to manage. For example, if your git repo is also supposed to be your bind mount, using a dot (.) to symbolize the current directory may very well be ideal.
New users can clone the repo anywhere on their host system, run docker-compose up -d and get pretty much the same result.
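A minimal sketch of such a repo-relative bind mount is shown below. The ./content directory is a hypothetical folder inside the repository; Compose resolves a path starting with a dot relative to the directory that holds the compose file, which is what makes the setup portable between machines.

```yaml
version: "3.0"
services:
  web:
    image: ghost:latest
    ports:
      - "2368:2368"
    volumes:
      # ./content is a hypothetical directory inside the repo;
      # it is resolved relative to this compose file's location.
      - ./content:/var/lib/ghost/content
```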
If you use a more verbose syntax, this is what your compose file will contain:
- type: bind
  source: /home/USER/projects/ghost
  target: /var/lib/ghost/content
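In context, the verbose bind mount could sit in a complete file like the sketch below. The host path is the same placeholder used in the snippet, and the file format is set to 3.2 because the long syntax is not available in 3.0. Note that a bind mount needs no top-level volumes section.

```yaml
version: "3.2"   # long volume syntax needs file format 3.2 or newer
services:
  web:
    image: ghost:latest
    ports:
      - "2368:2368"
    volumes:
      - type: bind
        source: /home/USER/projects/ghost   # placeholder host directory
        target: /var/lib/ghost/content      # mount point inside the container
```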
Conclusion
Organizing your applications so that the app is separate from the data can be very helpful, and volumes are a sane way to accomplish just that. Provided that they are backed up and secure, you are free to use containers as disposable environments, even in production!
Upgrading from one version of the app to the next, or using different versions of your app for A/B testing, can become very streamlined as long as the way in which data is stored or accessed is the same for both versions.