How Docker containers and persistent data fit together is a frequently discussed problem. I have faced the same confusion since I started learning and working with Docker. In this post I will share what I have learnt so far about Docker containers and how to manage persistent data in a Docker container environment.
“Containers are immutable and ephemeral” — Mantra around Containers.
What does the above mantra mean? Containers are meant to stay unchanged over time and to be short-lived (they can be restarted, stopped or replaced easily). Ideally, containers live in an immutable infrastructure made up of immutable components, where containers are only re-deployed and never changed in place. But this nature of containers does not go hand in hand with databases, which hold unique data. The main objective of this post is to show how Docker's features provide this “separation of concerns” for what is known as “persistent data”.
Docker volumes bypass the union file system used by Docker containers and store data in an alternate location on the host (typically under /var/lib/docker/volumes). Docker volumes have their own management commands under docker volume (see docker volume --help). A volume declared in a Dockerfile can be overridden at run time with docker run -v /path/in/container.
When a volume is created it is given a unique ID, and you can also give it a name.
A volume can be attached to one or more containers and, most importantly, it is not removed until it is explicitly removed. This allows new containers to run with the persisted data regardless of whether earlier containers have been stopped or restarted.
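The volume lifecycle described above can be sketched as a small shell function. This is only a minimal illustration under stated assumptions: the volume name demo_data, the mount path /data, the file name note.txt and the alpine image are all made up for the example and are not part of any particular setup.

```shell
# Sketch: data written by one container survives in a named volume
# and is visible to a second container. All names are illustrative.
volume_persistence_demo() {
    # Create a named volume; Docker keeps it in its own data directory.
    docker volume create demo_data

    # The first container writes a file into the volume and is then removed (--rm).
    docker run --rm -v demo_data:/data alpine \
        sh -c 'echo "written by container 1" > /data/note.txt'

    # A brand-new container mounts the same volume and reads the file back.
    docker run --rm -v demo_data:/data alpine cat /data/note.txt

    # The volume stays around until it is removed explicitly.
    docker volume rm demo_data
}
```

Calling volume_persistence_demo on a machine with Docker installed runs the four steps in order. Dropping the last step leaves the volume in place, where docker volume ls and docker volume inspect demo_data can be used to look at it.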
Data volume containers are containers that do not run any application code but are created specifically to manage volumes. An existing data volume container can be specified when a new container is created, using the --volumes-from containerVolumeName option.
A good example of using volume containers to persist the data of a PostgreSQL database can be found via this link.
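As a rough sketch of the data volume container pattern, the function below creates a container that only owns a volume and then lets two short-lived containers share it. The container name dbstore, the path /dbdata, the file marker.txt and the alpine image are assumptions chosen for illustration.

```shell
# Sketch: a data volume container plus consumers that reuse its volumes.
# Container name, image and path are illustrative assumptions.
volume_container_demo() {
    # Create (but do not start) a container whose only job is to own a volume at /dbdata.
    docker create -v /dbdata --name dbstore alpine /bin/true

    # Any container started with --volumes-from mounts the same /dbdata volume.
    docker run --rm --volumes-from dbstore alpine \
        sh -c 'echo "shared" > /dbdata/marker.txt'
    docker run --rm --volumes-from dbstore alpine cat /dbdata/marker.txt

    # Removing the data container with -v also removes its anonymous volume.
    docker rm -v dbstore
}
```

The data container never needs to run. It only has to exist so that other containers can reference its volumes by name.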
A bind mount simply maps a host file or directory to a file or directory in the container. Basically, it is two locations pointing to the same file(s), one on the host machine and the other in the running container. Like volumes, this option also bypasses the union file system.
The host files always take precedence over the content at the same path in the container. For example, if you have a basic website testing environment running in a Docker container and you push new changes to the code, the changes are instantly available to the container if you have used a bind mount. Unlike volumes, bind mounts cannot be configured in a Dockerfile; they have to be provided at run time. Given below is an example of a bind mount in a docker run command.
docker run -v /home/ravindu/myTestWebsite:/path/container <image>
Compared with volumes, bind mounts carry an extra security risk: the running container can be given access to the host's system directories, and because Docker runs as root by default, it is then able to change or delete their content.
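One common way to limit that risk is to mount the host directory read-only. The function below is a minimal sketch of that idea, not a complete hardening recipe. The container path /site and the alpine image are assumptions, and the host directory is passed in as an argument.

```shell
# Sketch: a read-only bind mount. The container path and image are
# illustrative assumptions; the host path is supplied by the caller.
readonly_bind_mount_demo() {
    host_dir="$1"    # absolute path to a directory on the host

    # The :ro suffix mounts the host directory read-only inside the container,
    # so the container can list and read the files but cannot change or delete them.
    docker run --rm -v "$host_dir":/site:ro alpine ls /site
}
```

For instance, readonly_bind_mount_demo /home/ravindu/myTestWebsite would list the website files from inside the container. Any attempt to write under /site from the container should be rejected as a read-only file system.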
Docker can also use storage provided by external systems through storage plugins. You can find all the available Docker storage plugins in the official documentation. The volume driver has to be specified when the container is created, together with the volume name and the mount point. The specified driver takes care of creating and managing the storage, mounting the file system and making it available to the host system before it is made available to the container.
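A hedged sketch of how a driver can be named at container creation is given below. It uses the --volume-driver flag of docker run together with -v for the volume name and mount point. The volume name shared_data, the path /data and the alpine image are assumptions. The driver defaults to local, Docker's built-in driver, so the sketch does not depend on any particular plugin, and a real plugin would usually need extra driver-specific options that are not shown here.

```shell
# Sketch: naming a volume driver when the container is created.
# Volume name, mount point and image are illustrative assumptions.
driver_volume_demo() {
    # "local" is Docker's built-in driver; pass the name of an installed
    # volume plugin as the first argument to use that plugin instead.
    driver="${1:-local}"

    # The driver creates the volume if needed and mounts it before the
    # container starts; the container then sees it at /data.
    docker run --rm --volume-driver "$driver" -v shared_data:/data alpine ls /data
}
```

The volume shared_data is left in place after the container exits and can be removed with docker volume rm shared_data.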