Hi all, our team recently created a list with [20 Dockerfile best practices](https://sysdig.com/blog/dockerfile-best-practices/).
Including the best-known ones:
👮‍♀️ Do not run images as root.
⚛️ Use distroless base images.
But also others I had never heard of before:
🔢 Use multi-stage builds.
✅ Run linters as early as the development stage.
What I usually find with these listings is that they are really nice on paper, but then there are tons of minor details that make them impossible to implement.
So my questions are:
1. Is there a best practice you are missing from that list?
2. Which ones are harder to implement for you?
I think you should specify the context in which you use Docker.
Using multi-stage builds and implementing linters are both pretty specific to building software applications.
They are irrelevant to someone trying to start a Minecraft image on their Synology NAS, just to reference something that was a recent top post in this sub.
I don’t think there is a single set of best practices. Docker is a generally useful tool across a ton of use cases.
It’s more a list about best practices around Docker in general, not just Dockerfiles or even just images. But yeah, pretty general stuff.
We have a base image for the company that gets regular updates. That image has the basics like the user, certificates, proxy, and basic dependencies.
The rest of the images are based on that one and implement whatever we need.
We try to build a base image for each technology. For example, we have a Python image that is based on the base image. Then we have Django, Flask and ML images that are based on that base Python image. This way we leverage layer caching and reduce the number of images to maintain.
Security patch? No problem, update the base image and schedule a cascade of rebuilds.
We also use Quay, which has vulnerability scans.
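For anyone who wants to picture the layering described above, here is a rough sketch of what one of the leaf images could look like. The registry host, image name and tag are made up for illustration; the only point is that the Flask image builds `FROM` the shared Python base rather than from a public image.

```dockerfile
# Hypothetical chain: company base image -> Python base image -> this Flask image.
# registry.example.com/base/python:3.12 is an invented name, not a real image.
FROM registry.example.com/base/python:3.12

# User, certificates and proxy settings are assumed to be inherited from the base image.
WORKDIR /app

# Install dependencies first so this layer is cached until requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code changes most often, so it goes in the last layers.
COPY . .
```

When the base image gets a security patch, rebuilding this Dockerfile picks up the new base layers without touching the file itself.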
We use multi-stage builds for images that require compilation and have build dependencies.
For example, for Java we build on a large JDK image and copy the build output to a slim JRE image.
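A minimal sketch of that JDK-to-JRE pattern, in case it helps. The image tags, the Maven wrapper and the `app.jar` artifact name are assumptions for the example, not what the poster above actually uses.

```dockerfile
# Stage 1: compile on a full JDK image; build tools never leave this stage.
FROM eclipse-temurin:17-jdk AS build
WORKDIR /src
COPY . .
# Assumes the project ships a Maven wrapper; swap in your own build command.
RUN ./mvnw -q package -DskipTests

# Stage 2: start from a smaller JRE image and copy in only the built artifact.
FROM eclipse-temurin:17-jre
WORKDIR /app
# target/app.jar is a hypothetical artifact path.
COPY --from=build /src/target/app.jar /app/app.jar
# Run as a non-root user (numeric UID, so no user needs to exist in the image).
USER 1000
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```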
What is a “distroless” base image?
Implement image annotations defined at [https://github.com/opencontainers/image-spec/blob/master/annotations.md](https://github.com/opencontainers/image-spec/blob/master/annotations.md). You’ll thank yourself when you find some important container running somewhere and nobody can figure out where it came from, or which specific version it was. Fully labeling your containers at build time gives you very useful breadcrumbs as to whence it came.
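As a rough illustration, the annotation keys from that spec can be set as labels at build time. The build-arg names, the title and the source URL below are placeholders I made up; only the `org.opencontainers.image.*` keys come from the spec.

```dockerfile
FROM alpine:3.19

# Hypothetical build args, passed in by CI, for example:
#   docker build \
#     --build-arg VERSION=1.2.3 \
#     --build-arg REVISION="$(git rev-parse HEAD)" \
#     --build-arg CREATED="$(date -u +%Y-%m-%dT%H:%M:%SZ)" .
ARG VERSION=unknown
ARG REVISION=unknown
ARG CREATED=unknown

# Pre-defined OCI annotation keys; the values here are placeholders.
LABEL org.opencontainers.image.title="example-service" \
      org.opencontainers.image.source="https://example.com/your-org/example-service" \
      org.opencontainers.image.version="${VERSION}" \
      org.opencontainers.image.revision="${REVISION}" \
      org.opencontainers.image.created="${CREATED}"
```

`docker inspect` on the image or a running container then shows these under `Labels`.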
>What I usually find with these listings is that they are really nice on paper, but then there are tons of minor details that make them impossible to implement.
Not really sure what issues you’re having. We use hadolint and have no problems. I actually wish it would catch more things.
Low level best practices (how to write a Dockerfile)
[https://www.docker.com/blog/intro-guide-to-dockerfile-best-practices/](https://www.docker.com/blog/intro-guide-to-dockerfile-best-practices/)
High level best practices (how to use/build Dockerfiles)
[https://codefresh.io/containers/docker-anti-patterns/](https://codefresh.io/containers/docker-anti-patterns/)
I’m a fullstack developer and this is my solution:
If I work on React + Docker, I use a multi-stage build: one stage for the build, one stage for serving the static files.
Reduce the image size as much as possible (use distroless, Debian or Alpine).
DO NOT include env files (especially production env files) in the Dockerfile.
In some cases, a native library is not fully supported on Alpine / Debian, or is deprecated.
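A sketch of the two-stage React setup from the list above, under a few assumptions: the project has a `package-lock.json`, `npm run build` writes to `build/` (some toolchains use `dist/` instead), and nginx does the static serving. The image tags are just examples.

```dockerfile
# Stage 1: build the static bundle with Node.
FROM node:20-alpine AS build
WORKDIR /app
# Copy the manifests first so the dependency layer is cached between code changes.
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: serve only the built files; Node and node_modules are left behind.
FROM nginx:1.27-alpine
# /app/build is an assumption; adjust to your tool's output directory.
COPY --from=build /app/build /usr/share/nginx/html
EXPOSE 80
```

Listing `.env` files in `.dockerignore` keeps `COPY . .` from pulling them into the build stage by accident.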
I have a talk coming out tomorrow for Docker’s Community All Hands event that covers best practices around creating a docker-compose.yml and Dockerfile. It pulls everything together into an example app. It’s a 35 minute live demo.
Your list is good and I follow a lot of those recommendations in practice, but at a glance:
One thing I would change is your health check. I like to define it in my docker-compose.yml file because then it only gets used when using Docker Compose and Swarm, and then if you decide to use Kubernetes you can use its checks. I know technically it disables it, but from a conceptual level it feels right to define it in a spot where it won’t interfere with an orchestrator that doesn’t use it.
Another thing is to use the array (exec) syntax for CMD. It’ll make sure your process in the container runs as PID 1 instead of being wrapped in `sh -c "…"`.
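To make the difference concrete, here is a small sketch. The base image and the command are arbitrary picks for the example.

```dockerfile
FROM python:3.12-slim

# Shell form: Docker runs this as /bin/sh -c "python -m http.server 8000",
# so the shell is the container's main process.
# CMD python -m http.server 8000

# Exec (array) form: python is started directly, runs as PID 1,
# and receives signals such as SIGTERM from `docker stop` itself.
CMD ["python", "-m", "http.server", "8000"]
```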
I also think one thing to remember is that best practices are basically opinions that one or more people informally agree upon, so it’s expected that opinions differ and change over time. I know my Docker best practices blog post from 2018 differs a lot from what I do today.