Persistent Data in Docker: Explanation + Hands-On Demo

Docker containers are ephemeral by nature. They are not designed to run for long periods of time without being deleted and recreated. While you could, theoretically, never delete a container, most of the benefits of Docker are realized when containers are temporary. This creates a problem, though: almost all data is meant to be persistent. Docker volumes solve that problem.

What are Docker volumes?

Docker volumes are a method of ensuring data persistence in a containerized environment. To comply with best practices, you need to make sure that spontaneously losing a container or two affects nothing but the performance of your infrastructure. To do this, all the important data in your containers needs to be stored somewhere that is not linked to the life cycle of the container itself. That somewhere is a volume.

A Docker volume is little more than a place to store data. From the point of view of the container, nothing seems different: the file system operates the way it should. When you delete the container, though, the volume stays. You can then attach this volume to another container, and it will have the same data. Let's try it!

Docker Volumes Demo

Create the volume

We first need to create a volume. To do that, we need to run the following command:

docker volume create volume

Let's break this command down so we can understand what is going on:

  1. docker: This is the CLI tool we are using to interact with Docker.
  2. volume: This is a sub-command of the docker CLI tool that helps you manage volumes. It has several sub-commands of its own.
  3. create: This sub-command of the volume sub-command creates a volume. Here it takes one argument: the name. If you leave the name out, Docker generates a random one.
  4. Name (volume): The final word is the name. This can be anything you like. I chose "volume" for this example.

As you can see above, this command should create a volume named "volume". To verify that it did, let's run this command:

docker volume ls

This command is very similar to the previous one. The only two differences are that we are using the ls sub-command instead of create, and there are no additional arguments.

  1. ls: Lists all the available volumes.

There should be a volume with the name you assigned somewhere in the list. If there isn't, you might have done something wrong in the previous step.
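
If scanning the list by eye feels error-prone, the check can also be scripted. The following is a minimal sketch, assuming the volume is named "volume" as above. The name volume_exists is a hypothetical helper, not part of the docker CLI. It relies on docker volume inspect, which exits with a non-zero status when the volume does not exist.

```shell
# volume_exists is a hypothetical helper, not a docker sub-command.
# It succeeds (exit status 0) only if a volume with the given name exists.
volume_exists() {
    # "docker volume inspect" fails when the volume is missing.
    # Its output is discarded because only the exit status matters here.
    docker volume inspect "$1" >/dev/null 2>&1
}

# Usage (run manually):
#   volume_exists volume && echo "volume found" || echo "volume missing"
```

Wrapping the command in a function keeps the check reusable if you picked a different volume name.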

Create the container

Now that we know our volume exists, let's try using it. We will create a Docker container with that volume mounted onto it. Here is the command:

docker run --name hello-world -v volume:/usr/share/nginx/html -p 8080:80 nginx:alpine

Let's break down this command:

  1. run: This tells Docker to create and start a new container.
  2. --name: This specifies the name we will use to refer to the container for any other operations we want to perform on it. That name, in the example, is hello-world.
  3. -v: This specifies the volume mount. It says that the volume we created previously, volume (change it if this is not the name you went for), should be mounted at /usr/share/nginx/html inside the container. This is the directory where nginx keeps the files it serves. From the container's point of view, the volume IS that directory.
  4. -p: This binds port 8080 on the host to port 80 on the container. Any network traffic sent to port 8080 on the host is forwarded to port 80 on the container, and the responses travel back the same way.
  5. nginx:alpine: This is the name of the image we will be using for the container.

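We should now have a container running with our volume mounted to it. Because the command above was run without the -d flag, the container stays attached to that terminal, so open a second terminal and run the following command to check.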

docker ps

This command will list all the containers you have running. Hopefully, the container you just created is one of them.
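
For a more targeted check than reading the docker ps table, docker inspect can report whether a specific container is running and what is mounted in it. This is a rough sketch, assuming the container is named hello-world as above. The names container_running and show_mounts are hypothetical helpers introduced here for illustration.

```shell
# container_running is a hypothetical helper, not a docker sub-command.
# It succeeds only if the named container exists and is currently running.
container_running() {
    # .State.Running is printed as "true" or "false". If the container
    # does not exist, docker inspect fails and the comparison fails too.
    [ "$(docker inspect --format '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# show_mounts is a hypothetical helper that prints one line per mount of
# the container, in the form "<volume name> -> <path inside container>".
show_mounts() {
    docker inspect --format '{{range .Mounts}}{{println .Name "->" .Destination}}{{end}}' "$1"
}

# Usage (run manually):
#   container_running hello-world && show_mounts hello-world
```

If everything went as planned, the mount list should pair the volume name with /usr/share/nginx/html.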

Testing it

Now that we have everything in place, we can test it and see the power of volumes in action. First, you need to create an HTML file. How you do this depends on your OS, so I will not walk through the steps. Hopefully you know how to do it. After that, open a terminal in the directory where the HTML file is located and run the following command.

docker cp index.html hello-world:/usr/share/nginx/html

You need to replace index.html with whatever the name of your HTML file is. If you named your container something other than hello-world, replace that as well.

In this command, you are simply copying index.html to /usr/share/nginx/html on the container named hello-world. If you remember, that is the same directory that we mounted the volume to. We are essentially writing to the volume.
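
For readers on a POSIX-style shell who do not have an HTML file at hand, one way to produce a minimal one before running the docker cp command above is sketched here. The file name index.html matches the example, and the page content is purely illustrative.

```shell
# Write a minimal, illustrative index.html into the current directory.
# Note: this overwrites any existing index.html in that directory.
cat > index.html <<'EOF'
<!DOCTYPE html>
<html>
  <head>
    <title>Volume demo</title>
  </head>
  <body>
    <h1>Hello from a Docker volume!</h1>
  </body>
</html>
EOF
```

The quoted 'EOF' delimiter keeps the shell from expanding anything inside the file body, so the HTML is written exactly as shown.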

Now that we have written the HTML file to the right directory in the volume, nginx will serve it on the port we previously exposed. Open the following URL in your browser: http://localhost:8080

This simply sends an HTTP request to port 8080 on your computer. As we discussed previously, that port is bound to port 80 on the container, so any traffic to port 8080 on the host, which is your computer, is forwarded to port 80 on the container. Since nginx serves its content on port 80, we are essentially accessing the content we put into that directory, which is the volume. In effect, your computer is serving the content of that volume on port 8080.
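
The same request can be made from a terminal instead of a browser. The following is a small sketch, assuming curl is installed and the port mapping is 8080:80 as in the example. The name fetch_page is a hypothetical helper, not something provided by Docker or nginx.

```shell
# fetch_page is a hypothetical helper. It requests the page through the
# published host port (8080 by default, as in the docker run command).
fetch_page() {
    # --silent hides the progress meter; --fail makes curl exit with a
    # non-zero status on HTTP errors such as 404.
    curl --silent --fail "http://localhost:${1:-8080}/"
}

# Usage (run manually):
#   fetch_page          # uses port 8080
#   fetch_page 9090     # if you published a different host port
```

The output should be the contents of the HTML file that was copied into the volume.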

When you open the web page, you will see the HTML file that you put in there. Now, let's test the persistence of the volume by deleting the container. We need to run the following command:

docker rm -f hello-world

Replace hello-world with whatever you named your container. This should delete it. Once that is done, create a new container using the same docker run command as above. After that, open the same URL in your browser. You will see the same page, because the new container mounts the same volume.
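
The delete-and-recreate sequence described above can be sketched as a single shell function. This assumes the names used throughout this post (container hello-world, volume named "volume"). The name recreate_container is a hypothetical helper, and the -d flag is an addition that is not in the original run command; it starts the new container in the background.

```shell
# recreate_container is a hypothetical helper that repeats the experiment:
# remove the container, confirm the volume survived, start a fresh container.
recreate_container() {
    # Force-remove the old container, even if it is still running.
    docker rm -f hello-world || return 1
    # The volume is independent of the container, so it should still exist.
    docker volume inspect volume >/dev/null || return 1
    # Same run command as before, plus -d (an assumption added here, not
    # part of the original command) so the container runs in the background.
    docker run -d --name hello-world \
        -v volume:/usr/share/nginx/html \
        -p 8080:80 nginx:alpine
}

# Usage (run manually):
#   recreate_container
```

If the volume check in the middle fails, the function stops before starting a new container, which makes it obvious that the data was not preserved.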

Doesn't COPY in the Dockerfile Do the Same Thing?

Short Answer: Mostly no, kind of yes.

Long Answer: For this specific use case, you probably would not have seen a difference between the two methods, but for real-world use cases, there are several things volumes do that images do not.

  1. Volumes can deal with dynamic data. You can have containers manipulate the data, and when you no longer need a container, you can just delete it, safe in the knowledge that all your data and all of its changes have been saved. Images do not do that: files added with COPY are fixed when the image is built.
  2. Volumes can deal with a lot of data. While images can, theoretically, carry an arbitrarily large amount of data, that kind of defeats the point of using containers in the first place. Your containers and images are no longer lightweight, versatile, and portable.

Conclusion

Data persistence and management in Docker is a huge topic. I barely scratched the surface. I will probably make another post about it soon. You can be on the lookout for that.

If you have any more questions about things covered in this article, you know of another thing I should cover, or you think I should just have done something better, please leave a comment. Thank you for reading.
