Docker CE 17.05 (EE 17.06) introduced multi-stage builds, a feature that lets you split the image build process into multiple stages within a single Dockerfile. Artifacts produced in one stage can be reused by a later stage. The main advantage of multi-stage builds is that they help create smaller images.
The old way
Let's take a look at this Dockerfile, which I'm using to build an image of my flight tracker application:
FROM maven:3.5.2-jdk-9
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/src/app/target/flighttracker-1.0.0-SNAPSHOT.jar"]
My app is a Spring Boot application and, as with any Java application, the build phase involves building the app and packaging the generated artifact into an image. I'm using Maven as my build tool, which means it downloads the required dependencies from repositories and keeps them in the image. The number of JARs in the local repository can be significant, depending on the number of dependencies in pom.xml, and this bloats the runtime image unnecessarily. The total size of this image
It is worth mentioning that Spring Boot generates fat JARs, which bundle the application together with its dependencies. This means we could simply use an openjdk image as the base for the production image.
One could also suggest solving this by splitting the Dockerfile into two files. The first file builds the artifact and copies it to a common location using volumes. The second file then picks up the generated artifact and uses the lean base image. The drawback of this approach is that multiple Dockerfiles have to be maintained separately.
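To make the two-file approach concrete, here is a rough sketch of what it might look like. The file names Dockerfile.build and Dockerfile.run, the /out mount point, and the out/ directory are assumptions made up for illustration; they are not taken from my actual setup.

```dockerfile
# Dockerfile.build (hypothetical name): compiles and packages the application
FROM maven:3.5.2-jdk-9
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
# When the container runs, copy the JAR into /out, which is assumed to be
# a directory mounted from the host (for example with -v "$PWD/out:/out")
CMD ["cp", "/usr/src/app/target/flighttracker-1.0.0-SNAPSHOT.jar", "/out/"]
```

```dockerfile
# Dockerfile.run (hypothetical name): lean runtime image
FROM openjdk:9
# Assumes the JAR was placed in out/ inside the build context by the step above
COPY out/flighttracker-1.0.0-SNAPSHOT.jar /usr/app/flighttracker-1.0.0-SNAPSHOT.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/app/flighttracker-1.0.0-SNAPSHOT.jar"]
```

Under these assumptions, every release needs a build of the first file, a run of the resulting container with the volume mounted, and then a build of the second file, so the two files and the glue between them have to be kept in sync by hand.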
Moving on to a multi-stage build
With a multi-stage build, the Dockerfile can contain multiple FROM lines. Each stage starts with a new FROM line and a fresh context. You can copy artifacts from stage to stage, and any artifacts that are not copied over are discarded. This keeps the final image small, since it includes only the relevant artifacts.
The Dockerfile for my application now looks like this:
FROM maven:3.5.2-jdk-9 AS build
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package

FROM openjdk:9
COPY --from=build /usr/src/app/target/flighttracker-1.0.0-SNAPSHOT.jar /usr/app/flighttracker-1.0.0-SNAPSHOT.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/app/flighttracker-1.0.0-SNAPSHOT.jar"]
Notice that there are two FROM instructions, which makes this a two-stage build. maven:3.5.2-jdk-9 is the base image for the first stage, which is named build. This stage builds the fat JAR file for the application.
As for openjdk:9, it is the second and final base image for the build. The JAR file generated in the first stage is copied over to this stage using the COPY --from syntax. This has the great benefit of reducing the overall size of the runtime image, because we can choose a base image for the final stage that matches the runtime needs. Additionally, the build-time cruft stays behind in the intermediate stage and does not end up in the final image. With this update, our production image is only
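Naming a stage is optional. As a small sketch of the alternative, the same build can refer to the first stage by its position instead: stages are numbered starting at 0 in the order their FROM lines appear. The paths and image tags below are the ones used above; only the stage reference changes.

```dockerfile
# Stage 0: no AS name given, so it can only be referenced by its index
FROM maven:3.5.2-jdk-9
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package

# Stage 1: the final image; --from=0 points at the first stage above
FROM openjdk:9
COPY --from=0 /usr/src/app/target/flighttracker-1.0.0-SNAPSHOT.jar /usr/app/flighttracker-1.0.0-SNAPSHOT.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/app/flighttracker-1.0.0-SNAPSHOT.jar"]
```

Named stages are usually the safer choice, since an index silently points at a different stage if someone later inserts or reorders FROM lines.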
There are certainly many other ways to craft your build cycle, but if you are using a Dockerfile to build your artifact, then you should seriously consider multi-stage builds.