
Ultimate Docker Build


Extract application build from Docker

Building containers is a ubiquitous activity with a wide range of facets and tricks. Everyone builds containers and everyone wants them to build faster.

Most efforts to improve build performance focus on layer caching. Steps are carefully ordered and broken apart to maximize the number of steps that run before a cache miss occurs, since every step prior to the miss can be hydrated from cache. Layer cache performance itself can then be tuned, because acquiring the cached layer takes the place of performing the step: if downloading the layer, or reading it from a slow disk, takes longer than the step, the cache is moot.

Although the various techniques employed in CI to provide responsive access to container layers are effective, they fail to optimize the portions of the build that are specific to the application source code.

Layer Cache

Consider the common Dockerfile pattern demonstrated below.

FROM base:stable AS build
RUN base_install_build_dependencies
WORKDIR /src

# These steps cache miss on lock file changes.
COPY package.lock .
RUN app_install_dependencies

# These steps are always a cache miss.
COPY . .
RUN app_build


FROM base:stable
RUN base_install_runtime_dependencies

# This step is always a cache miss.
COPY --from=build /src/bin/app /usr/local/app
ENTRYPOINT [ "/usr/local/app" ]

  • The base_install_build_dependencies step is always cached unless directly changed.
  • The next two steps rerun anytime the lock file is changed.
  • When the application source changes, the COPY . . and subsequent step rerun.

Thus, the last two steps of the build stage effectively run on every change, and the last four run often.

A layer cache avoids the base image download, the base_install_build_dependencies step, and sometimes the app_install_dependencies step, but it has no impact on the application build, which is never cached. The layer cache provides a solid improvement in reliability and performance, but unless your application never changes dependencies and has no compilation step, you are still left with a slow build.

Extract the Build

Instead of performing the application build alongside the container build, what if we think outside the Dockerfile? Most CI runs inside a container, which means the Docker build usually looks something like the following.

build:
  image: docker:stable
  script: docker buildx build .

What if, instead of wasting the outer container on simply running docker:stable, we use the build image from above, inject the RUN steps into CI, and reduce the Dockerfile?

build:
  image: base:stable
  script:
    - base_install_build_dependencies
    - app_install_dependencies
    - app_build
    - docker buildx build .

The final docker buildx build command now effectively performs only the second half of the original Docker build, which can still cache the base_install_runtime_dependencies step and continues to always COPY.
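
A reduced Dockerfile along these lines might look like the sketch below; bin/app is an assumption about where app_build leaves its output, mirroring the earlier example.

FROM base:stable
RUN base_install_runtime_dependencies

# The binary is now built in CI, so it is copied from the build context
# instead of from a build stage. bin/app is an assumed output location.
COPY bin/app /usr/local/app
ENTRYPOINT [ "/usr/local/app" ]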

The base_install_build_dependencies step can be moved into a bespoke CI container, which is rebuilt only when it changes and is cached by the execution environment.
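
A sketch of such a CI image, assuming it is what the pipeline later references as ${CI_IMAGE}, could be as small as:

# Bespoke CI image, rebuilt only when the build dependencies change.
FROM base:stable
RUN base_install_build_dependencies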

Lastly, the application-source-specific steps can benefit from the Intelligent Cache, which was designed to handle the complexities of incremental caching.

build:
  image: ${CI_IMAGE}
  variables:
    CEDARCI_INCREMENTAL: "true"
  script:
    - app_install_dependencies
    - app_build
    - docker buildx build .

The result is a fully optimized CI Docker build. In any scenario, the optimal amount of work is performed.

  • If only the application code is changed, only app_build has work to do.
  • If one package is added, one package is downloaded.

Such a level of optimization can never be achieved with Docker layers that either miss or match entirely.

Pipeline

Taken a step further, the build can be broken out as a separate job and used by other parts of CI, like tests.

build:
  stage: build
  image: ${CI_IMAGE}
  variables:
    CEDARCI_INCREMENTAL: "true"
  script:
    - app_install_dependencies
    - app_build

container:
  stage: build
  image: docker:stable
  script: docker buildx build .
  needs: [ build ]

test:
  stage: test
  image: ${CI_IMAGE}
  script: app_test ${CI_NODE_INDEX} ${CI_NODE_TOTAL}
  parallel: 5
  needs: [ build ]
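
How the build job's compiled output reaches the container and test jobs depends on your setup; if it is not shared through the cache alone, one conventional option is standard GitLab artifacts, sketched below with an illustrative bin/ output path added to the build job.

build:
  stage: build
  image: ${CI_IMAGE}
  variables:
    CEDARCI_INCREMENTAL: "true"
  script:
    - app_install_dependencies
    - app_build
  # Publish the build output; jobs that declare needs: [ build ]
  # download these artifacts automatically. bin/ is an assumed path.
  artifacts:
    paths:
      - bin/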

The shared application build job, powered by the Intelligent Cache, results in the following:

  • Avoids waiting for an application dependency cache to extract in every job.
  • Avoids building the application in every test job and the container build job.
  • Build once, build incrementally, cache everything without overhead.

The only real downside is losing the portability of a self-contained Docker build, but portability can be retained by either maintaining the original Dockerfile or using clever tooling to assemble a Dockerfile from the parts.

Optimize your whole pipeline and see the results for yourself.