In Part 3 we managed to build a Docker image containing the tools required to compile and link C/C++ code destined for our embedded Arm target system. However, we’ve paid little attention to the size of the image. A quick Docker image listing shows it’s grown to a whopping 2.14GB:
$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
feabhas/gcc-arm-scons 1.0 6187455a4bfe 8 days ago 2.14GB
gcc 7.2 7d9419e269c3 2 months ago 1.64GB
In your day-to-day work the size of a Docker image may not bother you as Docker caches images locally on your machine. But after a while you’ll certainly need to prune them.
Apart from freeing up disk space, why else look to reduce the size of an image?
Continuous-Integration (CI)
As previously mentioned, the overriding benefit of using a Docker-based build is consistency and repeatability of the build. But, for modern CI to be effective, we want the build/test cycle to be as quick as possible.
A locally provisioned build server (such as a Linux server running Jenkins) will also cache images after the first build, so image size is less of an issue there.
Cloud-based CI services (such as the previously-mentioned Travis-CI, Bitbucket, GitLab, etc.) are typically costed on build-minutes for a given period, e.g. build-minutes per month. In these cases pulling or building larger images naturally takes longer and costs more.
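To get a feel for the difference, a back-of-the-envelope estimate of pull time is simply image size divided by bandwidth. The sketch below is illustrative only: the 100 Mbit/s link speed and the 500MB "slimmer" image are assumed figures, and because registries transfer compressed layers, real pulls will usually be quicker than this.

```shell
#!/bin/sh
# Rough transfer time for an image pull:
#   seconds = size in MB * 8 bits per byte / bandwidth in Mbit/s
# Integer arithmetic, so results round down.
pull_seconds() {
    size_mb=$1
    mbit_per_s=$2
    echo $(( size_mb * 8 / mbit_per_s ))
}

# 2140MB is the image size from the listing above.
# 500MB and 100 Mbit/s are assumptions chosen purely for illustration.
echo "2.14GB image: $(pull_seconds 2140 100)s"   # prints 171s
echo "500MB image:  $(pull_seconds 500 100)s"    # prints 40s
```

Multiply that difference by the number of builds per month and it soon shows up in the build-minute total.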
Generating Smaller Images
There is plenty of good guidance around, and I’m sure what I’m doing here can be improved upon. Our basic approach to minimising our image consists of these steps:
- Start with a minimal Base image
- Only install what we need
- Remove anything we only needed to help install what we needed!
- Reduce the number of Docker layers
Minimal Base Image
Our base image is gcc:7.2 which comes in at 1.64GB. To that we added Scons and the gcc-arm cross compiler, but for cross-compilation we don’t require the host GCC (x86) compiler.
The most widely used minimal image is called Alpine. Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and Busybox.
If we download the latest Docker image of Alpine:
$ docker pull alpine
and list the Docker images, we can see Alpine is significantly smaller (only 3.97MB) than both gcc:7.2 (1.64GB) and a base image of Ubuntu (123MB):
alpine latest 053cde6e8953 3 weeks ago 3.97MB
ubuntu latest 20c44cd7596f 12 days ago 123MB
We can simply change the first line of our Dockerfile from:
FROM gcc:7.2
to
FROM alpine:3.6
All done? Err no…
Dependencies
In our previous Dockerfile we had the following package dependencies:
- Scons
  - scons and lib32ncurses5
- gcc-arm-none-eabi
  - gcc-arm-none-eabi-6-2017-q2-update-linux.tar.bz2
  - lib32z1 (.bz2 uncompressor)
- general
  - git
  - bzr (this was unnecessary; it was accidentally included due to an old project!)
As we are intending to use Alpine, we need additional packages that come for free with gcc:7.2. The list of available Alpine packages can be found at https://pkgs.alpinelinux.org/packages.
Also note that Alpine uses the apk package manager (apk-tools), rather than the apt used by Debian and Ubuntu.
Scons
Scons is written in Python, so when using Alpine we must explicitly install the python package.
Wget
The next stage in our Dockerfile build is to grab the gcc-arm-none-eabi zipped tarball file from https://developer.arm.com. This involved installing the following packages:
- w3m – this gets us access to wget
However, as the site is served over https, we need to access it using TLS. This involves installing the packages:
- openssl – The toolkit for SSL v2/v3
- ca-certificates – Common CA certificates PEM files (PEM files are the standard format for OpenSSL containing key information)
Unzipping/Untarring
Previously, we’d installed the package lib32z1 to support the unzipping part of the untarring of the file gcc-arm-none-eabi-6-2017-q2-update-linux.tar.bz2 (a .bz2 file is a file compressed using BZIP2). lib32z1 is not available under Alpine; instead we use the package bzip2-dev (or we could have used unzip, but bzip2-dev is 1/5 the size).
When trying to untar the unzipped-downloaded file, I kept getting a strange error of:
tar: invalid tar magic
which I’d never encountered before.
After a bit of Googling I found a posting on StackExchange that answered the problem.
As mentioned, Alpine uses Busybox, which itself is commonly used in Embedded Linux builds. Although Busybox supports a version of tar (a smaller, simplified version), it unfortunately doesn’t support all the features required to extract the files from our tarball.
To extract the Arm cross compiler from the zipped-tarball we, therefore, need the following two packages:
- tar
- bzip2-dev
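If you hit a similar error, it can help to confirm which tar is actually first on the PATH. A minimal sketch follows; the function name tar_flavour is just for illustration, and it relies on GNU tar and Busybox identifying themselves in their --version (or usage) output.

```shell
#!/bin/sh
# Report which tar implementation is first on PATH.
# GNU tar prints "tar (GNU tar) x.y"; Busybox prints a banner containing "BusyBox",
# either as its version or as part of its usage text if --version is not recognised.
tar_flavour() {
    # "|| true" so that a non-zero exit from tar does not abort scripts using set -e
    banner=$(tar --version 2>&1 || true)
    case "$banner" in
        *"GNU tar"*)  echo "gnu" ;;
        *BusyBox*)    echo "busybox" ;;
        *)            echo "other" ;;
    esac
}

tar_flavour
```

Running this before and after installing the tar package in an Alpine container should show the switch from the Busybox applet to GNU tar.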
Running the compiler
Finally, coming to compile the code I ran into another (obvious in hindsight) problem. Alpine supplies the musl standard library (it’s lightweight and fast), in preference to the default GNU C library (glibc).
As the compiler is, after all, just an executing program, when running a compilation the executing process requires certain features from the C Standard Library. It turns out that the GCC-Arm compiler requires glibc to be installed.
There are various options to add glibc to Alpine, but the path-of-least-resistance is to utilise a base image that someone else has already created. In the end, I opted to change my base image from alpine to frolvlad/alpine-glibc:latest. This is an Alpine-based image, only 5MB in size, that contains glibc so that programs compiled against glibc can run on it.
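A quick way to catch this kind of problem early is to smoke-test the compiler inside the container. On a musl-only system a glibc-linked binary typically fails to start, often with a confusing "not found" message, because the dynamic loader it asks for is missing. The helper below is a sketch: check_tool is a hypothetical name, and it assumes the tool accepts --version (as arm-none-eabi-gcc does).

```shell
#!/bin/sh
# Smoke-test that a tool is on PATH and can actually execute.
# Returns 0 if it runs, 1 if it is not on PATH, 2 if it is present but will not start.
check_tool() {
    tool=$1
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: not on PATH"
        return 1
    fi
    if "$tool" --version >/dev/null 2>&1; then
        echo "$tool: runs"
    else
        echo "$tool: on PATH but fails to start (check libc / dynamic loader)"
        return 2
    fi
}

# "|| true" keeps the script going when the toolchain is not installed here
check_tool arm-none-eabi-gcc || true
```

Run inside the image, for example via docker run with sh as the command, this gives a yes/no answer without needing a project to build.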
As you can imagine, there was a certain amount of trial-and-error (supported by Google!) to finally get all the needed packages defined.
Installing the packages
Revisiting our Dockerfile, the top of the file was:
FROM gcc:7.2
ENV REFRESHED_AT 2017-11-21
# Set up a tools dev directory
WORKDIR /home/dev
RUN apt-get -qq update \
&& apt-get -y install git scons lib32z1 lib32ncurses5
This now becomes:
FROM frolvlad/alpine-glibc:latest
ENV REFRESHED_AT 2017-12-01
RUN apk add --update \
bzip2-dev \
ca-certificates \
git \
openssl \
python \
scons \
tar \
w3m
You might have spotted that we have not added an equivalent of the lib32ncurses5 package in Alpine; the Python install brings in the ncurses-libs used by Scons.
Build the Alpine Based Image
We can now build the Docker image:
$ docker build .
and inspect the size:
$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 23e03fd9fb54 About a minute ago 612MB
We can now test it against our default project:
$ docker run --rm -v "$(pwd)":/usr/project 23e03fd9fb54
This all builds successfully, and as a first stab we have reduced the image to a little over a quarter of the original size (612MB against 2.14GB).
Further Optimisations
It would be very easy to stop here, but there are a few good practices we can employ when building our Docker images.
Removing one-time-use packages
Several of the packages we install (w3m, openssl, ca-certificates, tar and bzip2-dev) are only needed during the installation of GCC-Arm and Scons. Once installed, in our built image we have no reason to use them again.
The apk package manager supports a concept called virtual packages. This enables you to group together a set of packages and then remove them during the Docker image construction, so they play no part in the final image size.
In the Dockerfile, we collect together all the intermediate packages, using the directive apk --no-cache add --virtual, as a group called build-dependencies:
RUN apk add --no-cache \
git \
python \
scons \
&& apk --no-cache add --virtual build-dependencies \
bzip2-dev \
ca-certificates \
openssl \
tar \
w3m
RUN wget https://developer.arm.com/-/media/Files/downloads/gnu-rm/6-2017q2/gcc-arm-none-eabi-6-2017-q2-update-linux.tar.bz2 \
&& tar xvf gcc-arm-none-eabi-6-2017-q2-update-linux.tar.bz2 \
&& rm gcc-arm-none-eabi-6-2017-q2-update-linux.tar.bz2 \
&& apk del build-dependencies
Then, after we’ve used the files, we use the command apk del to remove them.
Also, note the use of the --no-cache option: apk does not keep a local copy of the package index, which saves us having to remove it afterwards.
This brings our image size down to a very respectable 586MB.
$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> dabcc749dc26 4 minutes ago 586MB
Some final tweaks
There is a Best Practice guide for writing Dockerfiles. Here are a couple of minor improvements:
- Combine the RUN commands into one
- Delete unnecessary files from GCC-Arm
Each use of RUN in a Dockerfile creates an intermediate layer. It is good practice, wherever possible, to reduce the number of layers.
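As a rough guide to how many filesystem layers a Dockerfile will add, you can count its RUN, COPY and ADD instructions. This is only an approximation (it ignores the layers inherited from the base image, and the function name is purely illustrative); docker history on a built image shows the real picture.

```shell
#!/bin/sh
# Count the instructions in a Dockerfile that add a filesystem layer.
# Continuation lines (those following a trailing backslash) are not counted,
# because they do not start with an instruction keyword.
count_layer_instructions() {
    # -c prints the number of matching lines; -i because instruction names
    # are case-insensitive. grep exits non-zero on zero matches, hence "|| true".
    grep -ciE '^[[:space:]]*(RUN|COPY|ADD)[[:space:]]' "$1" || true
}

if [ -f Dockerfile ]; then
    echo "Layer-creating instructions: $(count_layer_instructions Dockerfile)"
fi
```

A long RUN chained together with && counts once, which is exactly the effect we are after when combining commands.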
As part of the gcc-arm-none-eabi-6-2017-q2-update-linux install, there are superfluous files. For example, the project documentation (gcc-arm-none-eabi-6-2017-q2-update/share/doc) is approximately 59MB.
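To hunt for other candidates, one option is to list the largest entries under the toolchain directory from inside a container. The following is a sketch rather than the method used for the figures above; largest_entries is a hypothetical helper built on du, sort and head.

```shell
#!/bin/sh
# Print the largest entries directly under a directory, biggest first.
# Sizes are in kilobytes, as reported by du -k.
largest_entries() {
    dir=$1
    count=${2:-5}    # how many entries to show (default 5)
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Path taken from the Dockerfile in this post; skipped if it does not exist here.
toolchain=/home/dev/gcc-arm-none-eabi-6-2017-q2-update
if [ -d "$toolchain" ]; then
    largest_entries "$toolchain/share"
fi
```

Anything large that the build never touches is a candidate for an rm -rf in the same RUN command that unpacks the tarball.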
Our final Dockerfile becomes:
FROM frolvlad/alpine-glibc:latest
ENV REFRESHED_AT 2017-12-01
# Set up a tools dev directory
WORKDIR /home/dev
RUN apk add --no-cache \
git \
python \
scons \
&& apk --update --no-cache add --virtual build-dependencies \
bzip2-dev \
ca-certificates \
openssl \
tar \
w3m \
&& wget https://developer.arm.com/-/media/Files/downloads/gnu-rm/6-2017q2/gcc-arm-none-eabi-6-2017-q2-update-linux.tar.bz2 \
&& tar xvf gcc-arm-none-eabi-6-2017-q2-update-linux.tar.bz2 \
&& rm gcc-arm-none-eabi-6-2017-q2-update-linux.tar.bz2 \
&& apk del build-dependencies \
&& rm -rf /home/dev/gcc-arm-none-eabi-6-2017-q2-update/share/doc
# Set up the compiler path
ENV PATH="/home/dev/gcc-arm-none-eabi-6-2017-q2-update/bin:${PATH}"
WORKDIR /usr/project
CMD ["scons"]
This brings our final image size to 457MB down from the original 2.14GB.
$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 9491310da352 6 minutes ago 457MB
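Putting the three stages side by side, the arithmetic can be checked with a few lines of shell. The sizes are those from the image listings in this post; percent_of is just an illustrative helper and uses integer division, so the results round down.

```shell
#!/bin/sh
# Express one size as a whole-number percentage of another (rounds down).
percent_of() {
    echo $(( $1 * 100 / $2 ))
}

original=2140   # MB, the gcc:7.2 based image
for size in 612 586 457; do
    echo "${size}MB is $(percent_of "$size" "$original")% of ${original}MB"
done
# prints:
# 612MB is 28% of 2140MB
# 586MB is 27% of 2140MB
# 457MB is 21% of 2140MB
```

So the switch of base image accounts for most of the saving, with the later tweaks trimming the image to roughly a fifth of its starting size.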
Pushing to Dockerhub
After testing it on our default project, we can now rebuild with a tag, test and push to Dockerhub for use with our CI model.
$ docker build -t="feabhas/gcc-arm-scons-alpine:1.0" .
$ cd ../blog-test-project
$ docker run --rm -v "$(pwd)":/usr/project feabhas/gcc-arm-scons-alpine:1.0
$ docker push feabhas/gcc-arm-scons-alpine:1.0
Finally we can update our test project’s bitbucket-pipelines.yml file to build with our new, leaner, Docker image:
image: feabhas/gcc-arm-scons-alpine:1.0
pipelines:
  default:
    - step:
        script:
          - scons
Summary
As mentioned at the outset, the most compelling reason to start using Docker is consistency of build and test environment.
Spending the time and effort to reduce the image size is worthwhile for automated builds, ultimately reducing costs (the Alpine based image has halved the build time on Bitbucket).
I’m sure the example shown can be further optimised by pruning unnecessary files (especially in the GCC-Arm install); please drop me a comment if you spot obvious ones.
Next up: using Docker images in multi-stage builds…
Co-Founder and Director of Feabhas since 1995.
Niall has been designing and programming embedded systems for over 30 years. He has worked in different sectors, including aerospace, telecomms, government and banking.
His current interests lie in IoT Security and Agile for Embedded Systems.