Building thin Docker images using multi-stage builds for your Java apps!
The release of Docker CE 17.05 (EE 17.06) introduced a new feature that helps create thin Docker images by making it possible to divide the image building process into multiple stages, or in short: multi-stage builds. This feature allows artifacts produced in one stage to be reused by another stage. The main advantage of the multi-stage build feature is that it helps create smaller images.
The old way
Let's take a look at this Dockerfile that I'm using to build an image of my flight tracker application:
FROM maven:3.5.2-jdk-9
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/src/app/target/flighttracker-1.0.0-SNAPSHOT.jar"]
My app is a Spring Boot application and, as with any Java application, the build phase typically involves building the app and packaging the generated artifact into an image. I'm using Maven as my build tool, which means it will download the required dependencies from repositories and keep them in the image. The number of JARs in the local repository can be significant, depending on the number of dependencies in the pom.xml, and they bloat the runtime image without being needed there. The total size of this image is 980MB!
It's worth mentioning that Spring Boot generates fat JARs. This means we could simply use an openjdk image as the base for my production image.
Also, one could suggest solving this by splitting the Dockerfile into two files. The first file builds the artifact and copies it to a common location using volumes. The second file then picks up the generated artifact and uses the lean base image. The drawback of this approach is that multiple Dockerfiles need to be maintained separately.
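For comparison, the two-file approach described above might look roughly like the sketch below. The file names Dockerfile.build and Dockerfile.run, the image tag flighttracker-build and the /out mount point are made up for illustration, and the docker commands in the comments are just one possible way to glue the two files together by hand.

```dockerfile
# Dockerfile.build (hypothetical name): builds the fat JAR inside a Maven image.
FROM maven:3.5.2-jdk-9
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
# When the container runs, copy the JAR into /out, which is assumed to be
# a volume mounted from the host:
#   docker build -f Dockerfile.build -t flighttracker-build .
#   docker run --rm -v "$PWD/target:/out" flighttracker-build
ENTRYPOINT ["cp", "/usr/src/app/target/flighttracker-1.0.0-SNAPSHOT.jar", "/out/"]
```

The second file then only needs the JAR that the first container left on the host:

```dockerfile
# Dockerfile.run (hypothetical name): packages the prebuilt JAR on a lean base.
# Assumes target/flighttracker-1.0.0-SNAPSHOT.jar already exists on the host:
#   docker build -f Dockerfile.run -t flighttracker .
FROM openjdk:9
COPY target/flighttracker-1.0.0-SNAPSHOT.jar /usr/app/flighttracker-1.0.0-SNAPSHOT.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/app/flighttracker-1.0.0-SNAPSHOT.jar"]
```

It works, but the build now depends on two files plus the commands that connect them, and all of it has to be kept in sync.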
Moving on to a multi-stage build
With a multi-stage build, the Dockerfile can contain multiple FROM lines. Each stage starts with a new FROM line and a fresh filesystem taken from its base image. You can copy artifacts from stage to stage, and the artifacts not copied over are discarded. This keeps the final image smaller, since it includes only the relevant artifacts.
The updated Dockerfile for my application looks like this:
FROM maven:3.5.2-jdk-9 AS build
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
FROM openjdk:9
COPY --from=build /usr/src/app/target/flighttracker-1.0.0-SNAPSHOT.jar /usr/app/flighttracker-1.0.0-SNAPSHOT.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/app/flighttracker-1.0.0-SNAPSHOT.jar"]
Notice that there are two FROM instructions, which makes this a two-stage build. The maven:3.5.2-jdk-9 image is the base image for the first stage, which is named build. This stage builds the fat JAR file for the application.
As for openjdk:9, it's the second and final base image for the build. The JAR file generated in the first stage is copied over to this stage using the COPY --from syntax. This has the great benefit of reducing the overall size of the runtime image, because we can choose a base image for the final stage that covers only the runtime needs. Additionally, the build-time cruft is discarded along with the intermediate stage. With this update, our production image is only 700MB.
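Since the final stage only has to run the JAR, its base image can be swapped without touching the build stage. The sketch below is one possible variation, not a measured result: it assumes a JRE-only tag such as openjdk:9-jre is available in your registry, so check which tags exist for your Java version before relying on it.

```dockerfile
# Build stage: unchanged from the Dockerfile above.
FROM maven:3.5.2-jdk-9 AS build
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package

# Final stage: only a Java runtime is needed to start the fat JAR.
# ASSUMPTION: the openjdk:9-jre tag exists; substitute whichever
# JRE-only tag your registry actually provides.
FROM openjdk:9-jre
COPY --from=build /usr/src/app/target/flighttracker-1.0.0-SNAPSHOT.jar /usr/app/flighttracker-1.0.0-SNAPSHOT.jar
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/app/flighttracker-1.0.0-SNAPSHOT.jar"]
```

Only the second FROM line changes; the COPY --from=build line keeps working because it refers to the stage by name rather than by position.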
Final word
There are certainly many other ways to craft your build cycle, but if you are using a Dockerfile to build your artifact, then you should seriously consider multi-stage builds.
Credit:
Image taken from: https://www.slideshare.net/ozlerhakan/ignite-session-the-journey-of-multi-stage-builds-moby-project-and-linuxkit