CMake Part 4 – Windows 10 Host

Introduction

In previous blog posts in this series (Part 1, Part 2 and Part 3), I looked at using CMake on a Linux host to configure a build to cross compile to target hardware such as the STM32F4 Series.

In this post, we’ll work with the GNU Arm Embedded Toolchain on a Windows 10 Host.

The first part of this blog discusses running the Windows hosted versions of CMake, GNU Arm Embedded Toolchain and GNU Make. An alternative approach, briefly discussed at the end of the blog, is to use container technology such as Windows Subsystem for Linux (WSL2) or Docker, or use a full-blown Linux Virtual Machine hosted in VirtualBox or VMware.

CMake on Windows

The first point to make about CMake on Windows is that it defaults to generating build files for Visual Studio and assumes you will be using the Microsoft Visual Studio Toolset.

The second point to make is that the CMake command line is subtly different. Not by much, but enough to confuse us, as some options that work under Linux are not available under Windows.

The third point is around command and file naming conventions. Microsoft uses a .exe suffix to identify executable programs, but there is no requirement to include this suffix when invoking an executable from the command line. Running the C/C++ compiler is a matter of entering the command CL or CL.EXE (case is ignored but is usually shown as uppercase in documentation). CMake may require the full pathname to the compiler executable, including the .exe suffix. Linux executables have no such suffix, so the minor difference between the command name (CL) and the executable file name (CL.EXE) does not arise under Linux.

The fourth and last point is that CMake generates build files for Microsoft NMake. Running an NMake build requires a custom environment to be set up by running the vcvarsall.bat supplied with the Microsoft VS Toolset. This isn’t a big problem, but any automated build script must include this.

We need to modify our Linux CMake configuration and supporting build script to address these portability problems. I’ll assume you’re familiar with the Linux based embedded system project that we’ve used in previous posts and just focus on changes required for Windows.

Toolchain Configuration

By default, the GNU Arm Toolchain for Windows is installed under C:\Program Files (x86)\GNU Arm Embedded Toolchain\ in a subfolder named after the release version. The Arm toolchain does not include the GNU Make command, so we must download this separately, either as a standalone program or as part of a suite of GNU development tools.

Installing GNU development tools on a Linux host is achieved using the system package management commands (apt for Debian/Ubuntu and dnf or yum for Fedora/CentOS/RHEL). However, it’s a little more complex on Windows as there is no official Microsoft package of GNU tools.

We, therefore, need to rely on third-party providers for the GNU development tools, of which MinGW and Cygwin are the most popular.

Additionally, there are various standalone versions of GNU Make ported to Windows, but none of the ones I’ve come across appear to be supported or updated on a regular basis, which makes me wary of using them.

I’m not going to digress into the details of installing GNU Make under Windows but assume that a Windows version of the make command is available. In my case, I use the MSYS2 installer which includes the Mingw-w64 development tools (these must be added to the base MinGW installation).

We already have a working CMake toolchain file for our embedded project (toolchain-STM32F407.cmake), but we will need to make some minor modifications to handle the .exe suffix on the toolchain filenames (not found on Linux). The GitHub project supporting this blog has all the configuration files for Windows.

For CMake we will assume that our Windows environment path variable (%PATH%) is configured to include the directories containing the GNU Arm toolchain (as we have done for the Linux build). This simplifies the toolchain file changes and avoids hard coding filesystem paths into the configuration file. We use a PowerShell script (shown later) to configure the Windows program search path (%PATH%).

In our sample project’s toolchain file (toolchain-STM32F407.cmake) we just add conditional code for including executable suffixes when locating the toolchain executable files:

if (CMAKE_HOST_WIN32)
  set (SUFFIX .exe)
else()
  set (SUFFIX "")
endif()

find_program(CROSS_GCC_PATH arm-none-eabi-gcc${SUFFIX})
get_filename_component(TOOLCHAIN ${CROSS_GCC_PATH} PATH)

set(CMAKE_C_COMPILER ${TOOLCHAIN}/arm-none-eabi-gcc${SUFFIX})
set(CMAKE_CXX_COMPILER ${TOOLCHAIN}/arm-none-eabi-g++${SUFFIX})
set(TOOLCHAIN_AS ${TOOLCHAIN}/arm-none-eabi-as${SUFFIX} CACHE STRING "arm-none-eabi-as")
set(TOOLCHAIN_LD ${TOOLCHAIN}/arm-none-eabi-ld${SUFFIX} CACHE STRING "arm-none-eabi-ld")
set(TOOLCHAIN_SIZE ${TOOLCHAIN}/arm-none-eabi-size${SUFFIX} CACHE STRING "arm-none-eabi-size")

If we are on Windows, the CMAKE_HOST_WIN32 variable is set to true, allowing the script to set a variable with the required host filename suffix. No other changes to the toolchain file toolchain-STM32F407.cmake are required.
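Because every other path in the toolchain file is derived from CROSS_GCC_PATH, it can help to stop early if find_program() comes back empty. The following is a minimal sketch, assuming the same variable names as above; the check and the error text are illustrative and not part of the sample project:

```cmake
# Sketch only: abort configuration if the cross compiler is not on the search path.
find_program(CROSS_GCC_PATH arm-none-eabi-gcc${SUFFIX})

if (NOT CROSS_GCC_PATH)
  # On failure find_program() sets the variable to CROSS_GCC_PATH-NOTFOUND,
  # which evaluates to false in an if() test.
  message(FATAL_ERROR "arm-none-eabi-gcc${SUFFIX} not found: check the search path")
endif()
```

Without a check like this, a missing toolchain tends to show up later as a less obvious compiler detection error.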

As an aside, if we had decided to adopt a simpler approach for our toolchain configuration, where we only required the compiler and linker without the other build tools, then we could have just specified the compiler command names in the appropriate CMAKE variables:

set(CMAKE_C_COMPILER arm-none-eabi-gcc)
set(CMAKE_CXX_COMPILER arm-none-eabi-g++)

In this case, we are working with command names and not the filenames, so there is no requirement to include a .exe suffix. Configuring the toolchain using this approach means no changes are required to the toolchain file to work with Windows rather than Linux. Our changes are required because we are working with filenames, not commands.

Project Configuration

No changes are required to the project file CMakeLists.txt, which shows that CMake can be configured to work with both Linux hosts (including macOS) and Windows with a single version of the configuration files.

But we do need to look at changes required to the cmake command line to generate the build files and run the build itself.

CMake Command Line

On Linux, we typically add our development commands to the standard search path (defined by the $PATH variable). In contrast, on Windows, we tend not to extend the Windows search path to include development tools mainly because we are less likely to be working at a Windows command line.

By not extending the Windows search path, we must use full path names for each toolchain command or temporarily extend the path to include the toolchain folder location. As with Linux, a script to manage the build process is essential.

The first change to the cmake command is to select the GNU Make generator instead of the Visual Studio tools by adding a -G "Unix Makefiles" option.

Not only do we need to use the -G option, but we also need to specify the location of the make command; otherwise we will default to using nmake, which we haven’t installed as we are not using the Microsoft host toolchain.

To make it easier to read and maintain a command script, we define a Windows variable for the location of the make program rather than include it on the search path. But we will extend the search path (%PATH%) to include the required GNU Arm toolchain folder.

A simple set of Windows commands to generate a debug build using msys64 Make and GNU Arm toolchain version 2020-q4-major looks like the following (the ^ symbol is the command continuation character for Windows):

set CMAKE="C:\Program Files\CMake\bin\cmake.exe"
set MAKE="C:\msys64\usr\bin\make.exe"
set ARMTOOLS=C:\Program Files (x86)\GNU Arm Embedded Toolchain\10 2020-q4-major\bin
set PATH=%PATH%;%ARMTOOLS%

%CMAKE% -S . -B build/debug   ^
  -G "Unix Makefiles"         ^
  -DCMAKE_MAKE_PROGRAM=%MAKE% ^
  -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake

Currently, the Windows version of the cmake command does not support the --build option, so we have to invoke the make command directly from the command line. Using the -C option, we can avoid changing directories to work within our project root folder. We can optionally add a VERBOSE=1 to the end of the command to see the build commands as they are executed.

To build the debug version of our project we use:

set MAKE="C:\msys64\usr\bin\make.exe"
%MAKE% -C build/debug VERBOSE=1

A clean build requires adding the clean target to the make command:

set MAKE="C:\msys64\usr\bin\make.exe"
%MAKE% -C build/debug clean

That’s it. All the other CMake command options we used in the Linux build work under Windows. At the end of the blog is an example of this simple command script and a more functional PowerShell script that wraps up the CMake build commands.

An alternative to using the Windows hosted ARM toolchain is to use a virtual environment or container to perform the build under Linux.

WSL and Docker

Both Windows Subsystem for Linux (WSL2) and Docker are containers or self-contained execution environments that run Linux and have access to the Windows file system but typically don’t provide a desktop environment (but could do so). We discuss using Docker containers in our blog An Introduction to Docker for Embedded Developers – Part 1 Getting Started.

For both WSL2 and Docker, the host development toolchain (for the make command) and the GNU Arm Embedded Toolchain (for Linux) need to be installed in the container.

These days both WSL2 and Docker can be accessed from Visual Studio Code running on the Windows host. Microsoft extensions (Remote – WSL and Remote – Containers) are required to access the virtual environments, and there are additional third-party VS Code extensions mainly for Docker but also for WSL. This means you can store and edit the code on Windows filesystem but run the build commands in the WSL2 or Docker container, all from within the Visual Studio Code IDE.

I have found using WSL2 from VS Code an effective mechanism for developing and building our training projects. The resultant ELF file in the Windows filesystem can be downloaded to our target hardware (we use Segger Ozone) or run in our customised version of the XPack QEMU emulator from within Windows.

VirtualBox and VMware

At Feabhas, we use VirtualBox to build self-contained Linux VMs and distribute these for online training as Open Virtualization or OVA files. Both VMware and VirtualBox can import OVA files and be configured to access folders on the host through their Shared folder settings. But both products require additional software to be installed in the Linux guest operating system to gain access to the Windows host filesystem.

We use VirtualBox (without shared folders) to build our training projects and run the compiled ELF image in our custom version of QEMU on the Linux guest. We can also map the JLink USB port from the Windows host into the VM in order to use Ozone (in the VM) to download the ELF images to our target hardware.

VirtualBox and VMware both provide a good environment for developing embedded projects on a Windows Host.

Note: WSL2 (and Docker on Windows 10 Pro) use the Microsoft Windows Hypervisor Platform (Hyper-V) feature which, up until recently, has prevented VirtualBox and VMware VMs from running correctly (see WSL2 FAQ). Recent versions of VMware and VirtualBox (July 2021) can now coexist with WSL2 and the Hyper-V platform and hopefully will continue to do so. There does appear to be a noticeable drop off in the performance of VirtualBox emulation when the Hyper-V platform is enabled, so my personal preference is to work with WSL2.

Summary

While I prefer Linux and macOS, I use Windows; it is my primary development environment. I generally work directly with the Windows version of the Arm and Segger tools for embedded system development. Recent improvements to WSL and VS Code mean that I now find this combination as good as Windows hosted tools and easier to use than working with VirtualBox (or VMware).

Initially, getting CMake to build an embedded (cross compiler) project under Windows was painful: a term often used when discussing CMake, and Windows for that matter. Once I’d worked out that switching to the GNU Toolchain did not also default to using GNU Make rather than NMake, it turned out to be straightforward to create a portable configuration. If we hadn’t wanted to find the paths to the additional GNU Arm build commands (like as and ld) we would not have had to make any changes to the CMake configuration files whatsoever.

But, as with Linux, it is the complex and necessary command line options used to set up the toolchain and build configurations that are the main problem.

It’s a shame that the Windows version of CMake does not support the --build option so that we have to revert to the old-fashioned approach of running make directly.

It would have been easier to work on Windows if the GNU Arm Embedded Toolchain (for Windows) included a version of make so that we didn’t have to find and install it from elsewhere.

And finally, using a Windows command or PowerShell script to simplify using the cmake and make commands is essential.

A later article on CMake Presets describes how to use the presets feature added at CMake 3.19 in 2020.

Postscript – Simple Build Scripts

The GitHub project supporting this blog contains a command script (configure.bat) containing the build commands in this blog. The repo also includes a more functional PowerShell script (build.ps1) with a supporting command script (build.bat) for building debug and release projects under Windows.

Windows Configure Script

A simple command script (configure.bat) based on the examples in the blog:

set CMAKE="C:\Program Files\CMake\bin\cmake.exe"
set MAKE=C:\msys64\usr\bin\make.exe
%CMAKE% -S . -B build/debug   ^
  -G "Unix Makefiles"         ^
  -DCMAKE_MAKE_PROGRAM=%MAKE% ^
  -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake
%MAKE% -C build/debug VERBOSE=1

Windows Build Scripts

A more complex PowerShell script (build.ps1) supports command line options, but this must be invoked via a command script (build.bat) to configure the security permissions.

Windows applies security restrictions to prevent running unsigned PowerShell scripts from the command prompt (or from the Visual Studio Code tasks). The supporting build.bat script is used to start the PowerShell build script without security checks:

powershell.exe -noprofile -executionpolicy bypass -file build.ps1 %*

The build.ps1 script is a port of the shell script (build.sh) for Linux:

Set-StrictMode -version latest

$SCRIPT = Split-Path $PSCommandPath -Leaf;
$USAGE = "Usage: $SCRIPT [-v | --verbose | --rtos] [ reset | clean | debug | release ]"

$CMAKE = 'C:\Program Files\CMake\bin\cmake.exe'
$MAKE = 'C:\msys64\usr\bin\make.exe'
$ARM_TOOLCHAIN = 'C:\Program Files (x86)\GNU Arm Embedded Toolchain\10 2020-q4-major\bin'

$env:PATH += ";$ARM_TOOLCHAIN"

$BUILD= 'build'
$BTYPE = 'DEBUG'
$BUILD_DIR = "$BUILD\debug"
$CLEAN = ''
$RESET = ''
$VERBOSE = ''
$RTOS = ''

switch -regex ($args)
{
  '^(--help|-h|)$'    { Write-Output "$USAGE"; exit 0 }
  '^(--verbose|-v)$'  { $VERBOSE = 'SHELL="/bin/sh -x"'  }
  '^--rtos$'          { $RTOS = '-DUSE_RTOS=ON'  }
  '^debug$'           { $BTYPE = 'DEBUG';   $BUILD_DIR = "$BUILD\debug" }
  '^release$'         { $BTYPE = 'RELEASE'; $BUILD_DIR = "$BUILD\release"  }
  '^clean$'           { $CLEAN = '1'  }
  '^reset$'           { $RESET = '1'  }
  default             { Write-Error "Unknown option $_"; Write-Output "$USAGE"; exit 1 }
}

if ( $RESET -and (Test-Path $BUILD_DIR -PathType Container) ) {
  Remove-Item $BUILD_DIR -Recurse
}

$TOOLCHAIN = '-DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake'
$CMAKE_ARGS = '-G', 'Unix Makefiles', "-DCMAKE_MAKE_PROGRAM=$MAKE"
$BUILD_TYPE = "-DCMAKE_BUILD_TYPE=$BTYPE"

&$CMAKE -S . -B $BUILD_DIR $CMAKE_ARGS `
  --warn-uninitialized $BUILD_TYPE $TOOLCHAIN $RTOS

if ( $CLEAN  -ne '' ) {
  &$MAKE -C $BUILD_DIR clean
}

&$MAKE -C $BUILD_DIR $VERBOSE


C++20 modules with GCC11

Introduction

One of the headline changes of the C++20 standard is the inclusion of modules. Modules promise to significantly change the structure of C++ codebases and possibly signal headers’ ultimate demise (but probably not in my lifetime). It also opens the door to potentially having a unified build system and package manager, similar to Rust’s Cargo package manager; though I imagine standardising a unified build system would be one bloody battle.

Pre-C++20 builds

If you want to start a heated debate on any C++ forum/channel, just state that one particular build system (e.g. Meson, CMake, Bazel, etc.) is better than the others; or that your way of using that build system is the “one, and only one, correct way”. If you are unfamiliar with build systems, I’d recommend reading this post first to understand the challenges.

Start with the Why

There have already been several articles written about Modules (significantly by Microsoft). But my experience, in reading these, is that they focus on ‘how’ modules work in C++20 and seem to miss the ‘Why’. Maybe the authors consider it obvious, but I think it depends on your background. In addition, all I have read use Microsoft MSVC, due to this having the fullest support for Modules among the mainstream toolchains.

First and foremost, when discussing modules, surely, we should be discussing modularity. We already have one form of modularity in C++ with the object/class model. But this is modularity ‘in the small’; modules are addressing ‘modularity in the large’, i.e., program-wide modularity.

So what problem are we trying to solve by adding modules?

Let’s be honest, “Headers are a mess” – they can be (and have been for many decades) used effectively, but so often I see very poorly constructed headers (IMHO). A well-crafted application will, typically, have pairs of files to “mimic” a module, e.g., file.h and file.cpp. But there is no enforcement of this approach; we also need to understand external- and internal-linkage rules to build a modular architecture safely.

The root of the problem with headers is they only exist up to and including pre-processing:

Headers do not exist during the compilation phase

We could happily (okay, maybe not happily) write a complete C++ application without any headers. There would be a lot of code duplication (declarations), and it would be a maintenance nightmare, but that’s how the current build model works (it all stems from the definition of a translation unit).

Modularity (in the large) is typically closely related to application architecture and construction – the files that make up our build.

Before we go on, I need to stress one important aspect – Modules and Namespaces are entirely orthogonal and co-exist as independent aspects (more on this later).

Other languages

Many modern languages tend to build the semantics of a module around all the code for the module existing within a single file, e.g. Java and Python.

They have the concepts of exporting and importing types and behaviours from other modules. It is very similar to the public/private semantics of the class but at the file scope.

Interestingly, older languages, such as Ada and Modula-2, designed in the 1980s, around the same time as the original C++, use a two-file structure for defining modules (or packages in Ada’s case). These designs separate the module interface from the implementation.

The significant benefits of the interface/implementation file structure can be:

  • Improved build times
  • Simplified integration and testing

Though, of course, this is another hotly debated subject.

C++20 Module File Structure

Dare I say it, but C++ being C++, rather than a straightforward way of structuring modules (à la Java), we have been given the Swiss-army-knife approach to module construction. There are numerous ways of doing the same thing and lots of special cases. This, initially, caused me a lot of problems as my mental model (based on other language paradigms) wasn’t aligning with what I was being introduced to.

There is no one way of correctly using C++20 modules

I’m sure, over time, we will come up with new idioms regarding the use of modules, but for now, I can see three obvious uses of modules (think 80:20 rule)

  1. Single file module – the Java/Python model or a complete module
  2. A separate Interface file and Implementation file for a module – the Ada model
  3. Multiple separate files (partitions) combining to define a single module concept – the C++20 model

In C++20, any file containing the module syntax is referred to as a Module Unit. Therefore a Named Module may be made up of one or more Module Units.


CMake Part 3 – Source File Organisation

Introduction

In previous blog posts in this series (Part 1 and Part 2), I looked at using CMake to configure a build for a cross compilation to target hardware such as the STM32F4 Series. In this blog post I will look at how to configure project source code, identify subsystems and use CMake to manage the build for each subsystem.

In our training courses, we have identified two shared subsystems: the bare metal code used to initialise the C/C++ run time system and a middleware layer consisting of a real-time operating system (RTOS).

Before we look at configuring subsystems, we’ll briefly discuss managing a project with multiple source and header files.

Managing Source Files

Any non-trivial project will use separate source files to encapsulate different functional areas of the system. So far, our example project has just used a single main.cpp source file, although the supporting GitHub projects use multiple source files to build a usable ELF image.

From the previous blog, you may remember that, for our build, we use a separate toolchain file (toolchain-STM32F407.cmake) and a project configuration file (CMakeLists.txt). The following is a simplified project configuration file where we have omitted the compiler and linker options as we are now concentrating on source code management:

cmake_minimum_required(VERSION 3.16)
project(target-cortexm LANGUAGES C CXX)

set(CMAKE_C_STANDARD 99)
set(CMAKE_CXX_STANDARD 17)

add_executable(Application
  src/main.cpp
)
set_target_properties(Application PROPERTIES
  SUFFIX .elf
)

You can find the complete configuration files in the GitHub project accompanying this blog.

We can extend our list of source files for the target executable. Let’s say we have two separate modules for our project:

  • hardware devices (gpio.cpp and gpio.h)
  • logic controller (controller.cpp and controller.h).

We just add these source files to the add_executable() definition:

add_executable(Application
  src/main.cpp
  src/gpio.cpp
  src/controller.cpp
)

Remember that CMake will scan the source files looking for dependencies to build a dependency tree for the source files and included header files. We don’t specify the header files as part of the source dependencies.

Although we never list the header files as part of the configuration (as discussed in previous blogs), we need to specify the directories to search for header files by adding entries to the target_include_directories() directive. For example:

target_include_directories(Application PRIVATE
  src
  include
)

Directory locations are relative to the project root but we can use the CMAKE_SOURCE_DIR variable to reinforce this:

add_executable(Application
  ${CMAKE_SOURCE_DIR}/src/main.cpp
  ${CMAKE_SOURCE_DIR}/src/gpio.cpp
  ${CMAKE_SOURCE_DIR}/src/controller.cpp
)

target_include_directories(Application PRIVATE
  ${CMAKE_SOURCE_DIR}/src
  ${CMAKE_SOURCE_DIR}/include
 )

When using subdirectories to organise source code, the generated build commands print out each directory name as it is being processed. You can suppress this directory tracking by passing make's --no-print-directory option after a -- separator on the build command line.

cmake --build build -- --no-print-directory

Source File Wildcards

Teams following agile development models based on Evolutionary Prototyping where the source file structure can change regularly may prefer to use wildcard patterns to specify multiple source files to simplify project administration.

Teams following a more formal methodology usually prefer to specify every source file to avoid accidentally including unwanted sources and to maintain an accurate list of module dependencies.

Software risk analysis usually requires a definitive list of the source files used to build a given project. As part of the build file generation CMake can optionally generate a list of compilation commands by setting the CMAKE_EXPORT_COMPILE_COMMANDS variable:

set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

The well-documented proviso in CMake is that wildcards are evaluated when the build files are generated and not when the build takes place. To get proper wildcard support, the CMake generation step must be rerun whenever there is a change to the source file structure.

Many sites using wildcards simply regenerate the build files whenever a build takes place – it doesn’t take that much time compared to the build. The downside is that old artefacts may be left in the output build folder and used in the build (e.g. linking in an object file that is no longer built from a source file). These sorts of build problems may not be discovered until a later date when rebuilding the whole project from scratch.

Note that using the --target clean option on the CMake command will only delete artefacts defined by the current build configuration and not older unreferenced artefacts. The simple solution to this problem is to delete the entire output folder and regenerate everything.

General advice is to avoid wildcards in a build configuration as the potential problems are more serious than the extra administrative load.

However, we decided our training projects benefited from using wildcards so that we didn’t have to get everyone to edit the build system as we went through the programming exercises. Our supporting build script always regenerates the build files.
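For readers who have not used wildcards in CMake, the following is a minimal sketch of the idea. It assumes all sources live under src/; the APP_SOURCES variable name is hypothetical, and the sample project may be organised differently. The CONFIGURE_DEPENDS flag (available since CMake 3.12) is an alternative to always regenerating, although the CMake documentation warns it may not work reliably with every generator:

```cmake
# Sketch only: collect every C and C++ source file under src/ (assumed layout).
# CONFIGURE_DEPENDS asks the generated build to re-check the patterns and
# rerun CMake if the set of matching files has changed.
file(GLOB_RECURSE APP_SOURCES CONFIGURE_DEPENDS
  ${CMAKE_SOURCE_DIR}/src/*.c
  ${CMAKE_SOURCE_DIR}/src/*.cpp
)

add_executable(Application ${APP_SOURCES})
```

Note that this does nothing about stale artefacts left in the output folder, so deleting the build folder from time to time is still worthwhile.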

Configuring File Dependencies

While we’re looking at source files and wildcards it’s worth pointing out that CMake does not have built in rules for all possible source files for a project. Our embedded linker commands are dependent upon a number of hardware configuration files stored in an ldscripts subdirectory which we need to add as a dependency for the linker stage.

We do this using the LINK_DEPENDS property with the set_target_properties() function, which is used to configure target specific properties. The related set_property() command is used to set other CMake properties. There are a plethora of properties available in CMake, and being aware of these and when to use them is a good example of just how complex and confusing it is to define a CMake build configuration.

We can add our two linker configuration scripts as linker dependencies using the following:

set(LINKER_SCRIPTS 
  ${CMAKE_SOURCE_DIR}/ldscripts/mem.ld 
  ${CMAKE_SOURCE_DIR}/ldscripts/sections.ld
)

set_target_properties(Application PROPERTIES
  SUFFIX .elf
  LINK_DEPENDS "${LINKER_SCRIPTS}"
)

The LINK_DEPENDS option requires a single parameter which is a semi-colon separated list of absolute pathnames to the files; relative filenames do not work so we need to use the CMAKE_SOURCE_DIR to prefix the relative paths to the files.

It isn’t well documented but when expanding a variable containing a list inside a quoted string the list values will be separated by semi-colons. Hence the need to create a list of linker configuration scripts and expand that list in a quoted string after the LINK_DEPENDS keyword.
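The difference between quoted and unquoted list expansion is easy to see in a small standalone script. This is only a sketch: the script name is hypothetical and the list holds bare filenames rather than the absolute paths that LINK_DEPENDS needs:

```cmake
# Sketch only: save as list-demo.cmake (hypothetical name) and run with
#   cmake -P list-demo.cmake
set(LINKER_SCRIPTS mem.ld sections.ld)

# Quoted: the list is passed as one argument with items joined by semi-colons
message("${LINKER_SCRIPTS}")   # prints mem.ld;sections.ld

# Unquoted: each item becomes a separate argument, which message() concatenates
message(${LINKER_SCRIPTS})     # prints mem.ldsections.ld
```

The quoted form is the one that hands LINK_DEPENDS its single semi-colon separated parameter.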

The LINK_DEPENDS option is used to ensure the linker is run to rebuild the image if any linker configuration files change. There is a related CMAKE_CONFIGURE_DEPENDS option to set_property() that can be used to force the build files to be regenerated if one or more files (not known to CMake) have changed.

The CMAKE_CONFIGURE_DEPENDS usage is similar to that of LINK_DEPENDS requiring a semi-colon separated list but this time containing filenames relative to a given directory:

set(FILES config.yml) 
set_property(DIRECTORY ${CMAKE_SOURCE_DIR} 
  APPEND PROPERTY CMAKE_CONFIGURE_DEPENDS "${FILES}"
)

Using Subsystems

Now we have looked at managing source files we can look at subsystems. In our embedded training projects we have identified two subsystems:

  • the bare metal runtime
  • an optional real-time operating system (RTOS)

Each subsystem has its own CMakeLists.txt configuration file and is defined in a subdirectory of our project as shown in the following screen image:

Embedded Project Structure

In the main CMakeLists.txt project file we use add_subdirectory() to add a subsystem to the main build.

The first subsystem to consider is the bare metal runtime and startup code, which has an added complexity in its use of weak linkage.

Bare Metal Runtime Object Library

We created a subdirectory (system) for the Arm CMSIS files, the STMicroelectronics files for the STM32F407xx chipset, and files to support the newlib C/C++ Standard Library.

Each subsystem is a separate project with its own project definition file (system/CMakeLists.txt). In a subsystem we list each dependent source file avoiding wildcards as we want to select exactly which files to use in our build:

cmake_minimum_required(VERSION 3.16)
project(target-system LANGUAGES C CXX)

add_library(system OBJECT
  src/newlib/_syscalls.c
  src/newlib/_startup.c
  src/newlib/_sbrk.c
  src/newlib/assert.c
  src/newlib/__dso_handle.c
  src/newlib/_exit.c
  src/newlib/_write.c
  src/cortexm/exception_handlers.c
  src/cortexm/_reset_hardware.c
  src/cortexm/_initialize_hardware.c
  src/diag/Trace.c
  src/diag/trace_impl.c
  src/cmsis/vectors_stm32f4xx.c
  src/cmsis/system_stm32f4xx.c
)

As with the main project file, we start with the minimum required CMake version and a unique name for our subsystem project (target-system). All our current supporting files are written in C, but we still define this as a C and C++ project in case we decide to add C++ files at a later date. It is worth noting that the default project languages for CMake are C and C++ so this line is redundant, but we think it’s worth including anyway.

We define the library itself (called system) with the add_library() function. The first argument is the library name and must be unique within the project (different from all other libraries and executable targets created by the project). CMake supports conditional configuration so two libraries with the same name can be defined, so long as only one is included in the generated build.
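As an illustration of that last point, here is a minimal sketch in which both branches define a library called middleware, but only one definition is ever processed. The option name, library name and source files are all hypothetical and are not taken from the sample project:

```cmake
# Sketch only: every name below is a hypothetical example.
option(USE_RTOS "Build with the RTOS middleware" OFF)

if (USE_RTOS)
  # Processed when configured with -DUSE_RTOS=ON
  add_library(middleware OBJECT src/rtos_port.c)
else()
  # Processed otherwise; same target name, different sources
  add_library(middleware OBJECT src/no_rtos_stubs.c)
endif()
```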

The second argument to add_library defines the library type. There are several CMake library types which include:

  • SHARED – dynamically linked libraries (.so or .dll files) not supported by the GNU Arm Embedded Toolchain
  • STATIC – statically linked libraries (.a or .lib files)
  • OBJECT – not a single library file but a collection of separate object files (.o or .obj files)

The GNU Arm Linker supports weak linkage, but if we use a static library with the -ffunction-sections compiler option then all weakly linked functions in the library will be linked into the target image file in preference to any strongly linked versions in our code. To ensure the weak linkage mechanism works correctly we have to create an OBJECT library for our target.

The remaining add_library arguments supply a list of source files which CMake will use to generate the build dependencies. In the previous example we have used paths relative to the project directory but the PROJECT_SOURCE_DIR variable can be used to make it clear these are relative to the directory containing the CMakeLists.txt file (… represents omitted entries):

add_library(system OBJECT
  ${PROJECT_SOURCE_DIR}/src/newlib/_syscalls.c
  ${PROJECT_SOURCE_DIR}/src/newlib/_startup.c
...
  ${PROJECT_SOURCE_DIR}/src/cmsis/vectors_stm32f4xx.c
  ${PROJECT_SOURCE_DIR}/src/cmsis/system_stm32f4xx.c
)

For an OBJECT library, the output object files are created in a build directory named after the library (system.dir). In our case, for a debug build, this is the location build/debug/system/CMakeFiles/system.dir/ (this output directory stores object files as a mirror of the directory structure of the source files).

Apart from the object files, our subsystem includes several header files which we need to add to the compiler’s include locations. There are two types of include locations:

  • INTERFACE includes are part of the interface to the subsystem
  • PRIVATE includes are only needed to compile the subsystem

We use target_include_directories() to identify the include file folders:

target_include_directories(system INTERFACE
  ${PROJECT_SOURCE_DIR}/include/cmsis
)

target_include_directories(system PRIVATE
  ${PROJECT_SOURCE_DIR}/include
  ${PROJECT_SOURCE_DIR}/include/cmsis
  ${PROJECT_SOURCE_DIR}/include/cortexm
  ${PROJECT_SOURCE_DIR}/include/diag
)

That’s all we need for our bare metal subsystem. The subsystem compilation will inherit the compiler options and defines from the main project configuration. If we had specific options or defines for the subsystem, we would specify these with target_compile_options and target_compile_definitions. Do not set project-wide compiler (or linker) options and definitions from within a subsystem: you’ll only create trouble for yourself.
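As a minimal sketch of subsystem-specific settings, the following scopes one warning option and one macro to the system library only. Both the -Wno-unused-parameter flag and the SYSTEM_DIAG_LEVEL macro are hypothetical examples chosen for illustration, not settings taken from the training project:

```cmake
# Sketch only: options and defines that apply to the 'system' library
# and nothing else. PRIVATE keeps them out of targets that link to it.
target_compile_options(system PRIVATE
  -Wno-unused-parameter        # hypothetical: relax one warning for vendor code
)

target_compile_definitions(system PRIVATE
  SYSTEM_DIAG_LEVEL=1          # hypothetical macro name and value
)
```

Because both commands name the target explicitly, the settings in the main project file are left untouched.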

We now update the main project’s CMakeLists.txt file to add the subsystem configuration and add a dependency to the build target (Application):

add_subdirectory(system)
target_link_libraries(Application PRIVATE system)

The add_subdirectory command is used to add the subsystem configuration to the project build. We have made the subsystem project name the same as the directory name, but this isn’t necessary.

A subsystem project could create multiple libraries (or even target executables), so we need to tell CMake to generate appropriate linker options to add the required library to our Application target with target_link_libraries.

The target_link_libraries command needs to know if the library is only required to build our target executable or is part of this project’s interface. In our case, which is the most common, the library is only required to build the target, so we specify the PRIVATE link library argument. The other options are PUBLIC (the library file or files are made available to enclosing projects) and INTERFACE (the include locations are made available to enclosing projects). This approach allows CMake to support complex subsystem hierarchies.

In the top-level project configuration file (CMakeLists.txt), the use of PUBLIC/PRIVATE/INTERFACE is normally moot as other projects will not depend on this one. Many CMake examples specify PUBLIC libraries in the top-level configuration file, which can, and has, led to confusion as to which is the correct approach. If in doubt, make libraries PRIVATE as this is usually the right approach; you’ll soon find out when this is wrong because a compilation or link will fail.

One final twist to our training project configuration is that we originally intended to use a common CMake build file for training courses for embedded targets (cross compilation) and hosted courses where we do not add the -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake option to the cmake command line.

We did this by testing for the presence of the subsystem directory (which is not included with our hosted training courses):

if (IS_DIRECTORY ${CMAKE_SOURCE_DIR}/system)
  add_subdirectory(system)
  target_link_libraries(Application PRIVATE system)
endif()

There are advantages and disadvantages to this approach. With the test in place, if the subsystem directory is not present the build files will still be generated, but the link may fail because the subsystem is missing. Without the test, the build file generation stage itself will fail.
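One possible middle ground, shown here only as a sketch, is to keep the directory test but report a missing subsystem explicitly when cross compiling. This assumes the toolchain file sets CMAKE_SYSTEM_NAME, so that CMake sets CMAKE_CROSSCOMPILING to true; the message wording is our own:

```cmake
# Sketch only: fail early for an embedded build, stay quiet for a hosted one.
if (IS_DIRECTORY ${CMAKE_SOURCE_DIR}/system)
  add_subdirectory(system)
  target_link_libraries(Application PRIVATE system)
elseif (CMAKE_CROSSCOMPILING)
  # Assumes CMAKE_SYSTEM_NAME is set in the toolchain file
  message(FATAL_ERROR "Cross compiling but no 'system' directory was found")
else()
  message(STATUS "No 'system' directory: assuming a hosted build")
endif()
```

With this arrangement the problem is reported while the build files are generated rather than as an obscure link error later on.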

RTOS Shared Library

Our second training project subsystem is the RTOS shared library. Again, we create a separate directory (middleware) for this subsystem with its own CMakeLists.txt file defining a separate project (target-middleware). We use FreeRTOS for our middleware RTOS and add the required source files to a STATIC library called middleware (a full list of files is included in the accompanying GitHub project):

cmake_minimum_required(VERSION 3.16)
project(target-middleware LANGUAGES C CXX)

add_library(middleware STATIC
  FreeRTOSv202012.00/FreeRTOS/Source/croutine.c
  FreeRTOSv202012.00/FreeRTOS/Source/event_groups.c
  FreeRTOSv202012.00/FreeRTOS/Source/list.c
  FreeRTOSv202012.00/FreeRTOS/Source/queue.c
  FreeRTOSv202012.00/FreeRTOS/Source/stream_buffer.c
  FreeRTOSv202012.00/FreeRTOS/Source/tasks.c
  FreeRTOSv202012.00/FreeRTOS/Source/timers.c
 
  FreeRTOSv202012.00/FreeRTOS/Source/portable/GCC/ARM_CM3/port.c
  FreeRTOSv202012.00/FreeRTOS/Source/portable/MemMang/heap_3.c
)

This will create a static library file in the build folder. On Linux with a debug build, this will be in the location build/debug/middleware/libmiddleware.a.

As with the object library for the bare metal system we need to specify the locations of the interface header files:

target_include_directories(middleware INTERFACE
  cortex_m4_config
  FreeRTOSv202012.00/FreeRTOS/Source/include
  FreeRTOSv202012.00/FreeRTOS/Source/portable/GCC/ARM_CM3
)

CMake does not assume interface header files are required for the build, so in our case we need to include the same header files for the build:

target_include_directories(middleware PRIVATE
  cortex_m4_config
  FreeRTOSv202012.00/FreeRTOS/Source/include
  FreeRTOSv202012.00/FreeRTOS/Source/portable/GCC/ARM_CM3
)

We could have avoided the duplication of header files by using a variable:

set (MIDDLEWARE_INC
  cortex_m4_config
  FreeRTOSv202012.00/FreeRTOS/Source/include
  FreeRTOSv202012.00/FreeRTOS/Source/portable/GCC/ARM_CM3
)

target_include_directories(middleware INTERFACE
  ${MIDDLEWARE_INC}
)

target_include_directories(middleware PRIVATE
  ${MIDDLEWARE_INC}
)

Note that this is a local variable only used in this subsystem project, so we do not add it to the CMake variable cache (which we did for some variables defined in the toolchain file).
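Another way to avoid the duplication, sketched below, is the PUBLIC keyword, which CMake treats as both PRIVATE and INTERFACE. The directory names are the ones used above; whether this reads better than the variable is a matter of taste:

```cmake
# Sketch only: PUBLIC adds each directory to the library's own build
# (PRIVATE) and to every target that links to it (INTERFACE).
target_include_directories(middleware PUBLIC
  cortex_m4_config
  FreeRTOSv202012.00/FreeRTOS/Source/include
  FreeRTOSv202012.00/FreeRTOS/Source/portable/GCC/ARM_CM3
)
```

The separate PRIVATE and INTERFACE lists remain the better choice when the two sets of directories differ, as they do for the system library.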

Our middleware subsystem requires header files from the bare metal subsystem, so we add this dependency to our project file. The header files are only required to build the middleware library, not to make use of the library, so we make them PRIVATE to this subsystem project:

target_link_libraries(middleware PRIVATE system)

That completes the middleware subsystem configuration so back in our main project we add the middleware static library using the same approach as the system object library:

add_subdirectory(middleware)
target_link_libraries(Application PRIVATE middleware)

We must add this library dependency after the Application target, but it can be placed before or after the system subsystem. CMake will ensure the generated build files will take multiple library dependencies into account.

CMake Options

Not all of our training course exercises use the RTOS features, so we decided to control inclusion of the middleware using a CMake option. We add the controlling variable with its default value as an option (not a variable set command):

option(USE_RTOS "Enable RTOS support" OFF)

We then use this option and the presence of the middleware directory to control adding the library dependency:

if (USE_RTOS AND IS_DIRECTORY ${CMAKE_SOURCE_DIR}/middleware)
  add_subdirectory(${CMAKE_SOURCE_DIR}/middleware)
  target_link_libraries(Application PRIVATE middleware)
endif()

We also use this option to define an RTOS macro for the compiler options so that we can use this to conditionally include code when we are linked with the RTOS middleware:

add_compile_definitions(
  $<$<CONFIG:DEBUG>:DEBUG>
  $<$<CONFIG:DEBUG>:TRACE_ENABLED>
  $<$<BOOL:${USE_RTOS}>:RTOS>
)

Generator expressions (unlike if statements) do not implicitly test the values of variables (options) so we have to test the variable value in a Boolean context.

Back on the command line we need to add a CMake define (-DUSE_RTOS=ON) to override the default value for this option when we want to include the RTOS middleware:

cmake -S . -B build/debug --warn-uninitialized \
    -DCMAKE_BUILD_TYPE=DEBUG \
    -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake \
    -DUSE_RTOS=ON

CMake options are used to configure the generated build files and are not used during the build process itself. If a cmake command line defines a different value for an option, the build files must be regenerated, which is another good reason for always generating the build files when using CMake to manage a project build.

It’s worth emphasizing the point made earlier that CMake command line defines are not passed through to the underlying build commands. The compiler or linker will not see a USE_RTOS macro definition.
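To make the distinction concrete, here is a sketch of the same forwarding written with an ordinary if statement and scoped to a single target. It assumes the Application target has already been defined; the compiler only ever sees the RTOS macro, never USE_RTOS itself:

```cmake
# Sketch only: forward the CMake option to the compiler as the RTOS macro.
option(USE_RTOS "Enable RTOS support" OFF)

if (USE_RTOS)
  # Equivalent in effect to $<$<BOOL:${USE_RTOS}>:RTOS>, but for one target
  target_compile_definitions(Application PRIVATE RTOS)
endif()

# Shown when the build files are generated, not during the build
message(STATUS "USE_RTOS is ${USE_RTOS}")
```

The status message is a cheap way to confirm which value of the option was used to generate the current build files.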

Dynamic Link Library

Although the Arm Toolchain does not support dynamic link libraries, it’s worth mentioning that dynamic link libraries are created in the same manner as static libraries but using the SHARED option to add_library.

The following is a minimal CMakeLists.txt for a subsystem creating a shared library called library-c from a single file helper.c:

cmake_minimum_required(VERSION 3.16)
project(host-c-library LANGUAGES C)

add_library(library-c SHARED
  ${PROJECT_SOURCE_DIR}/helper.c
)

target_include_directories(library-c INTERFACE
  ${PROJECT_SOURCE_DIR}/
)

Note the inclusion of the current folder (PROJECT_SOURCE_DIR) in the interface directories because the header files for the library are in the same subsystem directory as the source files. This target will create a shared library on Linux with a debug build: build/debug/library-c/liblibrary-c.so.

To add a shared library to a project use the same approach as object and static libraries:

add_subdirectory(${CMAKE_SOURCE_DIR}/library-c)
target_link_libraries(Application PRIVATE library-c)

Summary

Most, if not all, projects for embedded targets will consist of readily identifiable functional areas which can be configured as subsystems: for example, low level hardware access, bare metal runtime system, and so on. CMake supports subsystems by treating these as separate projects used to create libraries.

Each subsystem requires its own CMakeLists.txt and therefore has to be defined in a subdirectory: CMake is hard-coded to look for CMakeLists.txt when generating build files for a project. Typically with embedded projects each subsystem generates a static library (.lib or .a file) linked into the target image.

CMake options can be used to configure the build process using command line definitions (-D settings) for the generation of the project build files. These CMake defines are not added to the compiler defines for the build process but can be used in generator expressions to add compiler defines or compiler/linker options.

In the next blog in this series CMake Part 4 – Windows Hosts, I’ll look at how we configured CMake to use the Arm Toolchain on a Windows 10 host system.

A later article on CMake Presets describes how to use the presets feature added in CMake 3.19 in 2020.

Postscript – A Simple Build Script

The GitHub project supporting this blog contains a minimal shell script (build.sh) for building debug and release projects under Linux.

Linux Build Script (bash)

#!/bin/bash
set -o errexit
set -o nounset
USAGE="Usage: $(basename $0) [-v | --verbose | --rtos] [ reset | clean | debug | release ]"

CMAKE=cmake
BUILD=./build
TYPE=DEBUG
BUILD_DIR=$BUILD/debug
CLEAN=
RESET=
VERBOSE=
RTOS=

for arg; do
  case "$arg" in
    --help|-h)    echo $USAGE; exit 0;;
    -v|--verbose) VERBOSE='VERBOSE=1' ;;
    --rtos)       RTOS='-DUSE_RTOS=ON' ;;
    debug)        TYPE=DEBUG; BUILD_DIR=$BUILD/debug ;;
    release)      TYPE=RELEASE; BUILD_DIR=$BUILD/release ;;
    clean)        CLEAN=1 ;;
    reset)        RESET=1 ;;
    *)            echo -e "unknown option $arg\n$USAGE" >&2; exit 1 ;;
  esac
done

[[ -n $RESET && -d $BUILD_DIR ]] && rm -rf $BUILD_DIR
$CMAKE -S . -B $BUILD_DIR --warn-uninitialized \
  -DCMAKE_BUILD_TYPE=$TYPE \
  -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake \
  $RTOS
[[ -n $CLEAN ]] && $CMAKE --build $BUILD_DIR --target clean
$CMAKE --build $BUILD_DIR -- --no-print-directory $VERBOSE

Windows Build Script

Developers working on Windows who install CMake will find that the default build generation targets the Microsoft Build Tools for Visual Studio compilers. Configuring CMake on Windows to cross compile using the Arm Embedded Toolchain is not straightforward and will be the subject of the next blog post CMake Part 4 – Windows Hosts and will include a suitable example build script.


CMake Part 2 – Release and Debug builds

Introduction

In my previous blog post CMake Part 1 – The Dark Arts I discussed how to configure CMake to cross-compile to target hardware such as our STM32F407 Discovery board.

We looked at the minimum requirements to configure the CMake build generator for a cross-compilation project using a project definition file (CMakeLists.txt) and a toolchain definition file (toolchain-STM32F407.cmake). The CMake commands used to generate and build the project are:

cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake
cmake --build build

In the real world, projects are never as simple as this minimal example, and we try to reflect this in our training. To support the different phases and objectives of a Software Development Lifecycle a project will need to differentiate between developing code, testing (in its various forms) and releasing a version for end-use. We usually do this using build configurations.

Outputs from each type of build configuration are usually different. For example, a developer’s build typically includes metadata used by a debugger which is not required for a released version of the project. Therefore, we need to configure our build process to cater for these different output requirements.

Both Visual Studio and Xcode support multiple build configurations, and CMake can generate appropriate build configuration files for these systems.

On the other hand, the Unix/Linux/GNU Make system does not support build configurations. When using CMake to generate different build requirements using make files we take this into account by placing different build configurations in different output directories for each type of build we want to support.

Configuring Debug and Release Builds

CMake refers to different build configurations as a Build Type.  Suggested build types are values such as Debug and Release, but CMake allows any type that is supported by the build tool. The build type specification is case insensitive, so we prefer to be consistent and use all upper case types despite the fact that the CMake documentation refers to capitalised types.

Our underlying build system for training is Make, so we need to create separate output folders for each type of build we require. Unfortunately, this means we have to run two very similar cmake commands to generate different configurations:

cmake -S . -B build/debug -DCMAKE_BUILD_TYPE=DEBUG \
      -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake

cmake -S . -B build/release -DCMAKE_BUILD_TYPE=RELEASE \
      -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake

We also have two separate commands, one for each build type:

cmake --build build/debug
cmake --build build/release

Aside: as a traditional Unix/Linux developer used to typing make I find these long and complex commands irksome and I know I’m not alone in this as it is a common source of criticism of CMake.

At this point, using a shell script, or scripts, to encapsulate the underlying cmake commands and simplify the build system would be advisable. There is an example shell script build.sh in the accompanying GitHub project https://github.com/feabhas/cmake-blog-2.

For developers working with build tools supporting multiple build configurations (like Xcode and Visual Studio), the build type is not passed on the generate command line (using -DCMAKE_BUILD_TYPE=…) but on the cmake build command with the --config option. For example:

cmake -S . -B build
cmake --build build --config Debug

Note: when using Make builds, the --config option is silently ignored, and when using multi-configuration build tools like Visual Studio, the setting for CMAKE_BUILD_TYPE is also silently ignored. This is a source of confusion and criticism when first starting to use CMake.
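A project file can tell the two kinds of build tool apart. The sketch below reads the GENERATOR_IS_MULTI_CONFIG global property (available since CMake 3.9) and supplies a default build type only for single-configuration tools such as Make. The choice of DEBUG as the default and the IS_MULTI_CONFIG variable name are our own assumptions:

```cmake
# Sketch only: place after the project() command.
get_property(IS_MULTI_CONFIG GLOBAL PROPERTY GENERATOR_IS_MULTI_CONFIG)

if (NOT IS_MULTI_CONFIG AND NOT CMAKE_BUILD_TYPE)
  # Single-configuration generator and no -DCMAKE_BUILD_TYPE given
  set(CMAKE_BUILD_TYPE DEBUG CACHE STRING "Build type (DEBUG or RELEASE)" FORCE)
  message(STATUS "No build type given: defaulting to ${CMAKE_BUILD_TYPE}")
endif()
```

For a multi-configuration tool the block does nothing, and the configuration is still chosen with --config on the build command.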

To support multiple build configurations for our training projects we just need to refactor the project and toolchain configuration files to be aware of build types. To do this, we make use of CMake generator expressions, so we need a short digression to discuss this feature of CMake.

CMake Generator Expressions

A generator expression is used to query aspects of the build as the build files are generated, giving us a dynamic view of the build generation process.

A static view of the build generation process is provided by command line definitions and variables defined in the configuration files, which are saved to the build cache file CMakeCache.txt in the build target directory. Note that variables should not change value once the build file generation process begins as this can cause discrepancies in the generated files.

Generator expressions are specified using $< expression > where the expression can take many different forms, whereas variable values are specified using ${ name }. Variables, once set, can be used at any point in the CMake files, whereas generator expressions query the current build generation environment and are only valid in specific contexts.

The use of the generator expression $<TARGET_FILE:Application> resolves to the path to the output file in the build rule for our main application (Application is the target name). This expression is only valid after both the target and the target suffix have been defined:

add_executable(Application src/main.cpp)
set_target_properties(Application PROPERTIES
    SUFFIX .elf
)

For our project this is the absolute path to build/debug/Application.elf.

A generator expression is defined using a $< : > syntax with the entry after the colon defining the value of the expression. The first part before the colon takes different forms such as:

  • a conditional test such as $<CONFIG:DEBUG> is true if this is a debug build type defined by the command line option -DCMAKE_BUILD_TYPE=DEBUG
  • a target-dependent query such as $<TARGET_FILE: name >
  • a string manipulation expression
  • a variable query

The generator expressions manual page describes the complete range of generator expressions.
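Because generator expressions are only evaluated when the build files are written, they cannot be printed with message(). One way to inspect them, sketched here, is a small custom target that echoes their values at build time. The target name show-genex is hypothetical, and the Application target is assumed to exist already:

```cmake
# Sketch only: echo the evaluated expressions when the target is built.
add_custom_target(show-genex
  COMMAND ${CMAKE_COMMAND} -E echo "config: $<CONFIG>"
  COMMAND ${CMAKE_COMMAND} -E echo "target file: $<TARGET_FILE:Application>"
  VERBATIM
)
```

Running cmake --build build/debug --target show-genex then prints the values CMake substituted for that build directory.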

Toolchain Configuration

Refactoring our project toolchain file (toolchain-STM32F407.cmake) requires identifying compilation options only applicable to debug builds:

add_compile_definitions(
  STM32F407xx
  USE_FULL_ASSERT
  $<$<CONFIG:DEBUG>:OS_USE_TRACE_SEMIHOSTING_STDOUT>
  $<$<CONFIG:DEBUG>:OS_USE_SEMIHOSTING>
)

In this example $<CONFIG:DEBUG> is true for a debug build type and similarly $<CONFIG:RELEASE> (not used in the example) is true for a release build. Note that the generator expression is all uppercase regardless of the actual value defined for CMAKE_BUILD_TYPE. The CMake documentation often refers to -DCMAKE_BUILD_TYPE=Debug but the generator expression is always $<CONFIG:DEBUG>.

In our example we have added compiler definitions entries to support using host debugging via a serial port for a debug project.

For our training project we will need to use different runtime support configurations for the debug runtime (rdimon.specs) and a bare metal release (nosys.specs):

add_link_options(
  ${ARM_OPTIONS}
  $<$<CONFIG:DEBUG>:--specs=rdimon.specs>
  $<$<CONFIG:RELEASE>:--specs=nosys.specs>
  $<$<CONFIG:DEBUG>:-u_printf_float>
  $<$<CONFIG:DEBUG>:-u_scanf_float>
  -nostartfiles
  LINKER:--gc-sections
  LINKER:--build-id
)

As an alternative to using $<CONFIG:RELEASE> we could have tested for the absence of debug mode using the more complex syntax:

$<$<NOT:$<CONFIG:DEBUG>>:--specs=nosys.specs>

Here we use an inner generator expression to control the inclusion of an enclosing generator expression.

Build Customisation

With the toolchain correctly configured we will update the project configuration (CMakeLists.txt) to refactor the compiler optimisations and symbol definitions for each build type:

add_compile_options(
  -Wall
  -Wextra
  -Wconversion
  -Wsign-conversion
  $<$<CONFIG:DEBUG>:-g3>
  $<$<CONFIG:DEBUG>:-Og>
  $<$<CONFIG:RELEASE>:-O3>
)

add_compile_definitions(
  $<$<CONFIG:DEBUG>:DEBUG>
)

Note: we need to define the compiler DEBUG symbol ourselves – it doesn’t happen automatically when we select the debug build type. The build type variable CMAKE_BUILD_TYPE is a CMake variable and not a linker or compiler defined symbol. The familiar syntax of using -D on the command line to define CMake variables can be confusing when first using CMake as these are not definitions for the underlying compiler.

As an alternative approach for the build type definition we could have simply inserted the $<CONFIG> generator expression as a compiler pre-processor definition:

add_compile_definitions(
  $<CONFIG>
)

This approach would add the pre-processor build type value as a compiler definition. However in this approach the value used would keep the original letter case so that using the CMake approach of -DCMAKE_BUILD_TYPE=Debug would define a compiler variable called Debug which would not match the expected upper case definition (DEBUG).
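If the letter case is the only obstacle, the $<UPPER_CASE:...> string expression can normalise the value. This is a sketch of a variation on the idea rather than the approach used in our project, and it assumes a build type is always supplied so that the expression does not expand to nothing:

```cmake
# Sketch only: define DEBUG or RELEASE whatever case was used on the
# command line, e.g. -DCMAKE_BUILD_TYPE=Debug still defines DEBUG.
add_compile_definitions(
  $<UPPER_CASE:$<CONFIG>>
)
```

The explicit $<$<CONFIG:DEBUG>:DEBUG> form remains easier to read and makes it obvious which symbols can be defined.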

Our example project does not need any linker options specific to the build type as these were handled in the toolchain file.

Post Build Tools

Often when creating a target, such as our executable program, there are additional actions required after a successful build.

In our cross compiler project, we want to use the objcopy command to generate the hex file used by some flash memory programmers.

We use add_custom_command() function calls to run actions after a successful build of a target. CMake automatically generates a variable (CMAKE_OBJCOPY) for the path of the objcopy program when the C or C++ compiler is specified in the toolchain configuration file (in our case it will be arm-none-eabi-objcopy). We should use this variable in preference to the raw command name:

add_custom_command(
  TARGET Application
  POST_BUILD
  COMMAND ${CMAKE_OBJCOPY} -O ihex $<TARGET_FILE:Application> 
          ${CMAKE_CURRENT_BINARY_DIR}/$<TARGET_NAME:Application>.hex
)

The use of POST_BUILD should be self-explanatory: CMAKE_OBJCOPY is set to the path of the objcopy command (implicitly defined in toolchain-STM32F407.cmake) and CMAKE_CURRENT_BINARY_DIR is the path to the build folder (-B on the command line).

In building the objcopy command line we need to use generator expressions to get the path to the target application ELF file ($<TARGET_FILE:Application>) and the base filename defined by $<TARGET_NAME:Application> because these are specific to that target.
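The same pattern extends to other output formats. As a sketch, the following adds a raw binary image using the objcopy -O binary output format; the .bin file is an addition for illustration and is not generated by the training project:

```cmake
# Sketch only: a second post-build step producing a raw binary image.
add_custom_command(
  TARGET Application
  POST_BUILD
  COMMAND ${CMAKE_OBJCOPY} -O binary $<TARGET_FILE:Application>
          ${CMAKE_CURRENT_BINARY_DIR}/$<TARGET_NAME:Application>.bin
  COMMENT "Generating raw binary image"
  VERBATIM
)
```

Several POST_BUILD commands can be attached to the same target.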

Conditional Tests

One minor complication to using CMAKE_OBJCOPY in the previous section is that the objcopy command may not be part of the toolchain we are using, in which case CMake sets CMAKE_OBJCOPY to the value CMAKE_OBJCOPY-NOTFOUND.

We should test a command path variable to make sure the command exists:

if (EXISTS "${CMAKE_OBJCOPY}")
  add_custom_command(
    TARGET Application
    POST_BUILD
    COMMAND ${CMAKE_OBJCOPY} -O ihex $<TARGET_FILE:Application>
            ${CMAKE_CURRENT_BINARY_DIR}/$<TARGET_NAME:Application>.hex
  )
else()
  message(STATUS "'objcopy' not found: cannot generate .hex file")
endif()

Note the use of parentheses on the else() and endif() functions – everything is a function in CMake. The else() part is optional; we have used it to output a message during the build file generation phase, so the message won’t be displayed in the actual build.

The first (optional) parameter to message() is a type indicator: in our case a STATUS message is output prefixed with two hyphens (--). In contrast, a FATAL_ERROR message will display the message and stop the build generation at that point. Other message types are described in the CMake manual.
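The three most commonly used message types can be sketched side by side. The conditions and wording below are illustrative only, and the final test assumes an embedded-only project where a toolchain file must always be supplied:

```cmake
# Sketch only: informational, warning and fatal messages.
message(STATUS "Build type is ${CMAKE_BUILD_TYPE}")   # shown as "-- Build type is ..."

if (NOT EXISTS "${CMAKE_OBJCOPY}")
  # Generation continues after a WARNING
  message(WARNING "objcopy not found: no .hex file will be generated")
endif()

if (NOT CMAKE_TOOLCHAIN_FILE)
  # Generation stops here and no build files are written
  message(FATAL_ERROR "No toolchain file given: use -DCMAKE_TOOLCHAIN_FILE=...")
endif()
```

All of these are emitted while cmake generates the build files; none appears when the build itself runs.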

It is worth reinforcing the idea that CMake uses whitespace separated arguments to functions so the COMMAND arguments can be given across multiple lines without using a line continuation character (such as \ in shell or Python scripts).

As an aside, you should be aware that CMake does not warn when an undefined variable is used: it simply substitutes nothing. This can be problematic, so we advise using the command line option --warn-uninitialized, which will display a warning message but won’t stop the build. Make sure you check the output from the build generation steps carefully in case you’ve mistyped a variable name.

cmake -S . -B build --warn-uninitialized -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake

There is one downside to adding this warning: when CMake generates the build files and the output directory already contains generated files, CMake does not use the toolchain file if the toolchain file has not been recently modified. In this situation the CMAKE_TOOLCHAIN_FILE is effectively unused and a warning is issued. To suppress this warning, which implies something is wrong when it isn’t, you can simply read the variable in a message:

message(STATUS "Using toolchain file: ${CMAKE_TOOLCHAIN_FILE}")

Custom Commands

While the CMake toolchain includes a few commonly used commands like objcopy and ar, there are often additional project or environment specific commands you need to run post (or pre) build. While you can add these to the CMakeLists.txt file, we think the toolchain file is the right place to configure the custom command paths.

In our cross compilation toolchain file (toolchain-STM32F407.cmake) we added logic to locate additional Arm commands not recognised by CMake:

find_program(CROSS_GCC_PATH "arm-none-eabi-gcc")
if (NOT CROSS_GCC_PATH)
  message(FATAL_ERROR "Cannot find ARM GCC compiler: arm-none-eabi-gcc")
endif()
get_filename_component(TOOLCHAIN ${CROSS_GCC_PATH} PATH)

set(CMAKE_C_COMPILER ${TOOLCHAIN}/arm-none-eabi-gcc)
set(CMAKE_CXX_COMPILER ${TOOLCHAIN}/arm-none-eabi-g++)
set(TOOLCHAIN_AS ${TOOLCHAIN}/arm-none-eabi-as CACHE STRING "arm-none-eabi-as")
set(TOOLCHAIN_LD ${TOOLCHAIN}/arm-none-eabi-ld CACHE STRING "arm-none-eabi-ld")
set(TOOLCHAIN_SIZE ${TOOLCHAIN}/arm-none-eabi-size CACHE STRING "arm-none-eabi-size")

The find_program function searches the host filesystem for the path to a given program which it stores in the variable name given as the first parameter. If the program isn’t found, the variable is set to <name>-NOTFOUND, in our case CROSS_GCC_PATH-NOTFOUND. We can check that the ARM compiler has been found by testing CROSS_GCC_PATH: variable values ending with -NOTFOUND evaluate to false.

Our search is complicated because we haven’t put the Arm toolchain in the standard Linux folders (such as /usr/bin), so we have to extract the directory path part of the arm-none-eabi-gcc command so we can get the toolchain directory location with get_filename_component.

We have prefixed our custom variables defining the paths to the toolchain commands with TOOLCHAIN_ to differentiate them from the standard CMake commands.

We need to store these variables where the main project can reference them, so we add them to the cache file using CACHE STRING followed by a variable description. Each CMake definition file is a separate processing environment, and variables not added to the cache will be discarded after build file processing is finished.

If you are interested, the variable cache is stored in the file CMakeCache.txt in the build folder. An entry for the arm-none-eabi-as command looks like:

//arm-none-eabi-as
TOOLCHAIN_AS:STRING=/opt/gcc-arm-none-eabi-10-2020-q4-major/bin/arm-none-eabi-as

Note that we don’t use strings for the variable values but use what Perl calls bare words which are values without the quotes (so long as we don’t have whitespace characters in the value). We have chosen to set the variable descriptions as strings because they usually contain spaces: in our case, as we have just used the program name as the description, these too could have been bare words.
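An alternative worth knowing about, shown only as a sketch, is to let find_program locate each extra tool. It stores its result in the cache automatically and yields a -NOTFOUND value when the tool is missing. This assumes the TOOLCHAIN variable set above holds the toolchain's bin directory:

```cmake
# Sketch only: look in the toolchain directory first, then the usual paths.
find_program(TOOLCHAIN_SIZE arm-none-eabi-size
  HINTS ${TOOLCHAIN}
  DOC "arm-none-eabi-size"
)

if (NOT TOOLCHAIN_SIZE)
  message(STATUS "arm-none-eabi-size not found")
endif()
```

The trade-off is one find_program call per tool instead of a single search followed by simple set commands.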

Running Post Build Custom Commands

In the project file (CMakeLists.txt) we don’t assume the custom toolchain commands exist because we may be supplying a different toolchain on the command line. As with objcopy we verify we can find the required post build commands:

if (EXISTS "${TOOLCHAIN_SIZE}")
  add_custom_command(
    TARGET Application
    POST_BUILD
    COMMAND ${TOOLCHAIN_SIZE} --format=berkeley $<TARGET_FILE:Application>
            >${CMAKE_CURRENT_BINARY_DIR}/$<TARGET_NAME:Application>.bsz
  )
  add_custom_command(
    TARGET Application
    POST_BUILD
    COMMAND ${TOOLCHAIN_SIZE} --format=sysv -x $<TARGET_FILE:Application>
            >${CMAKE_CURRENT_BINARY_DIR}/$<TARGET_NAME:Application>.ssz
  )
else()
  message(STATUS "'size' not found: cannot generate .[bs]sz files")
endif()

There is nothing in this code that we haven’t seen before.

Summary

Real-world projects are always more complex than the simple examples used in most tutorials. In this post, we’ve looked at how CMake can be configured to generate two separate makefile build configurations using the same project and toolchain definition. This ability to add build configuration types to the GNU Make system is a good reason to use CMake in conjunction with the make command.

We recommend that you use the --warn-uninitialized option when running CMake to generate the build files, and that you check the output from the build generation, as this will help identify mistyped variable names.

A prototype project containing the code shown in this blog can be found in the GitHub project https://github.com/feabhas/cmake-blog-2.

In the next blog CMake Part 3 – Source File Organisation, we’ll look at multiple source and header files for a project and discuss how to organise a more extensive project into subsystems and libraries.

Postscript – A Simple Build Script

The GitHub project supporting this blog contains a minimal shell script (build.sh) for building debug and release projects under Linux.

Linux Build Script (Bash)

#!/bin/bash
set -o errexit
set -o nounset
USAGE="Usage: $(basename $0) [-v | --verbose] [ reset | clean | debug | release ]"

CMAKE=cmake
BUILD=./build
TYPE=DEBUG
BUILD_DIR=$BUILD/debug
CLEAN=
RESET=
VERBOSE=

for arg; do
  case "$arg" in
    --help|-h)    echo $USAGE; exit 0;;
    -v|--verbose) VERBOSE='VERBOSE=1' ;;
    debug)        TYPE=DEBUG; BUILD_DIR=$BUILD/debug ;;
    release)      TYPE=RELEASE; BUILD_DIR=$BUILD/release ;;
    clean)        CLEAN=1 ;;
    reset)        RESET=1 ;;
    *)            echo -e "unknown option $arg\n$USAGE" >&2; exit 1 ;;
  esac
done

[[ -n $RESET && -d $BUILD_DIR ]] && rm -rf $BUILD_DIR

$CMAKE -S . -B $BUILD_DIR --warn-uninitialized -DCMAKE_BUILD_TYPE=$TYPE -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake

[[ -n $CLEAN ]] && $CMAKE --build $BUILD_DIR --target clean

$CMAKE --build $BUILD_DIR -- $VERBOSE

Windows Build Script

Developers working on Windows who install CMake will find that the default build generation targets the Microsoft Build Tools for Visual Studio compilers. Configuring CMake on Windows to cross compile using the Arm Embedded Toolchain is not straightforward and is the subject of a later blog post CMake Part 4 – Windows Hosts including a suitable example PowerShell script.



CMake Part 1 – The Dark Arts

Introduction

In our previous post Why We Need Build Systems we examined the need for Build Systems in modern software development. In this post we will examine how to use CMake to manage the build process for a cross compilation project.

CMake can be described as a marmite application: you either love it or hate it. Here at Feabhas, we find ourselves falling in the latter category, despite the fact that CMake is widely used within the embedded and deeply embedded development community.

But we also know that many of the C/C++ static analysis and code quality tools integrate well with the CMake build system. For this reason, we’ve put aside our prejudices and reconsidered the way we build our example projects used during training by replacing scons with CMake.

This blog post is a mix of musings and advice when using CMake for cross-compiling to the STM32F407 Discovery board that we use for our embedded C and C++ training. It is the first of a small series of posts looking at how we build our training projects comprising application code, supporting library code, real-time operating system and bare metal driver code.

The code and examples used in this blog are from CMake 3.16 on Ubuntu 20.04 LTS using the GNU Arm Embedded Toolchain and can be downloaded from the GitHub project https://github.com/feabhas/cmake-blog-1.

What is CMake

CMake is not a build system like Unix Make but a build system generator. Its purpose is to take your description of a project and generate a set of configuration files to build that project.

As part of the generation of build configuration files CMake also analyses source code to create a dependency graph of components so that when building the project unnecessary recompilation steps can be omitted to reduce build times. For larger projects this can reduce build times down from tens of minutes or hours, to a few minutes, perhaps even less than one minute.

The following schematic overview shows the complexity of building a modern software system with multiple inputs and output artefacts which will help explain why we need to use a build system to manage the process.

CMake supports several hosted build systems such as GNU Make (Linux), Visual Studio (Microsoft Windows), Xcode (OSX) and Ninja (multiple platforms) as well as cross-compilation systems such as Android Studio and IAR Workbench.

This plethora of different build systems adds to the confusion about using CMake. At a fundamental level both Visual Studio and Xcode provide a GUI environment that supports multiple build configurations such as Debug and Release. Make, on the other hand, is command-line based and does not support different build configurations. CMake tries hard to hide these differences but doesn’t always succeed.

CMake was originally developed in 1999, but the release of version 3.0 in 2014 introduced a new style of defining a project which is generally referred to as Modern CMake. This has added to the confusion over using CMake because there are many resources on the web that refer to the legacy style of CMake.

While CMake has extensive documentation, it is very much a guide to the what (descriptions of functions and variables) that lacks the how (examples) and the why. It was difficult for us to find information that helped us understand how CMake works: specifically, an overall understanding of how to configure a cross-compilation project.

Having said all that, CMake does work and achieves its purpose for creating a cross-platform build system that will generate build files that optimise the compilation steps.

A Minimal Host Project

To use CMake, you create a CMakeLists.txt file, usually located in the root folder of your project. This file defines the source configuration, compiler and linker options, plus anything else needed to build and, if required, install your project.

The first thing in the file is the minimum CMake version, followed by a name for the project.

cmake_minimum_required(VERSION 3.16)
project(simple-host)

By default, the project will support a C and C++ toolchain, but we could declare this explicitly with:

project(simple-host LANGUAGES C CXX)

Each CMake configuration requires one or more targets: either an executable program or a library; plus, the source files used to create that target. We’re going to use a single source file, src/main.cpp, to create a host-based executable Application:

add_executable(Application src/main.cpp)

That’s it for a minimal host build. CMake will use the default host toolchain to figure out how to generate the required build files. For our Ubuntu Linux build, it will be GNU Make files using g++, on Windows it would generate a Visual Studio workspace configuration, and Xcode for OSX.

Generate and Build

Using CMake is a two-step process:

  1. Generate the build files
  2. Run the build system

Step one only needs to be run when creating a project, modifying compiler and/or linker options, adding (removing or renaming) source and header files, or making other configuration changes such as inter-file dependencies defined by #include statements.

Step two is run every time the project needs building (recompiling and linking).

We can show this schematically for our project that generates GNU Make files.

Our minimal host CMakeLists.txt file looks like:

cmake_minimum_required(VERSION 3.16)
project(simple-host LANGUAGES C CXX)
add_executable(Application src/main.cpp)

If we were to run cmake from the project root without specifying a separate build folder (for example, cmake .), it would generate the build files in the project root; this is known as an in-source build. This build will intermix the object files, dependency files and executables with the configuration and source files.

The in-source build approach is not a good idea as it is hard to differentiate source files (requiring source code management) from generated files (which should not be added to a source repository).

The best practice is to generate an out-of-source build, which we do by specifying the project source root (-S option) and target build location (-B option) on the command line:

cmake -S . -B build/

With modern CMake we also run the build process via cmake --build (the --build option itself is long established; the -S and -B options used above were introduced with version 3.13 in 2018):

cmake --build build/

The older CMake approach was to change to the build folder to explicitly run the build tool (make) from that folder:

mkdir build
cd build
cmake ..
make

Either way, we now have an executable called Application (in the build folder) that we can run on the host using:

build/Application

Should we want to use a different build system instead of the host default (GNU Make for Linux), we need to tell CMake which build generator to use with the -G command option. For example, to generate Ninja build files we would use:

cmake -S . -B build/ -G Ninja

Source File Dependencies

CMake does more than just generate the build files used to create object files and executable programs. It will generate a dependency file for each source file in the project. For example a main.cpp file will have a generated main.cpp.d file saved in the build folder hierarchy honouring the directory structure of the source files (in our case the file path is build/CMakeFiles/Application.dir/src/main.cpp.d).

For C/C++ source files, CMake will scan each file for #include statements and add these to the list of dependencies for that file. The generated configuration files for the build system (make in our case) will include those dependencies in its build rules. This allows the build system to optimise the compilation steps by not recompiling source files that are unaffected by changes to other files.

The following diagram shows an example system with dependencies to illustrate how CMake can generate optimised build steps.

In this example if we modify the gpio.cpp file this is the only file that is recompiled as there is no other file that depends on it. Obviously, we will always need to link the entire project to create the new executable image.

If, in our example, we now modify gpio.h then by implication display.h is also out of date as it depends on gpio.h. Now we have to recompile:

  • gpio.cpp (depends on gpio.h)
  • display.cpp (depends on display.h and gpio.h)
  • main.cpp (depends on display.h and gpio.h)

This is an example of the generated main.cpp.d file (full path names have been replaced by …):

CMakeFiles/Application.dir/src/main.cpp.obj: \
 .../src/main.cpp .../src/display.h .../src/gpio.h
.../src/display.h:
.../src/gpio.h:

These dependency files could be used by other applications such as static analysis tools.

If we were to manually maintain our make system build files without using CMake we would have to specify all of these dependencies ourselves. This would be a tedious and error-prone process for large projects due to the number of files and inter-dependencies involved. Failure to record the dependencies correctly can result in unnecessary compilations taking place, slowing the build down, or worse, modules not being recompiled when they should be, leading to inconsistencies and potential bugs in the built project.

Furthermore, adding, deleting or modifying #include statements in any file requires us to update the build system dependency graph accordingly. Using CMake to manage the build files means we simply regenerate the build when required rather than having to manually check and update the affected build configuration files ourselves.

Using CMake to generate the build files is a relatively quick operation compared to compilation and linking, so many project administrators choose to always regenerate the build files at the start of a system build. That way any new dependencies (changed #include statements) will automatically be recorded in the generated dependency files.

This automated management of the build dependencies is a very powerful argument for using CMake, especially on larger projects with multiple source and header files where dependencies can quickly become labyrinthine. Even small projects like our training projects with around 40 source files benefit from using CMake to manage the build process.

Toolchains

Perhaps the most significant source of confusion we see in articles and questions on web sites is how CMake uses a toolchain when generating the build files.

A toolchain must be defined before CMake starts processing the CMakeLists.txt file. Unless you provide a command-line argument to tell CMake which toolchain to use, it uses the default toolchain for the current host. Any attempt to modify or override the toolchain from within CMakeLists.txt typically won’t work or is just plain wrong.

Once the toolchain is defined, CMake will then validate the compiler and linker by building and discarding a simple test application. A toolchain configuration must define all the compiler and linker options necessary to perform a successful test build.

Cross Compiling

If the default host toolchain is not suitable, as is the case for cross compiling, then the recommended way of specifying the toolchain details is in a separate toolchain file. In fact this is the only reliable way of overriding the default toolchain due to the lifecycle of the CMake processing steps.

To generate a cross compilation build using CMake, we specify the location and command names of the compiler, linker and other build tools (the toolchain). We also have to define compiler and linker options that will ensure the test build works.

To do this, we add a command-line option to cmake to tell it to read toolchain information from a file using the CMAKE_TOOLCHAIN_FILE variable:

cmake -S . -B build/ -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake

There are no standard naming conventions for toolchain files, but we’ve followed other examples and included the target specification in the file toolchain-STM32F407.cmake.

Toolchain Definition File

In the toolchain-STM32F407.cmake file we define variables for the target system name and version:

set(CMAKE_SYSTEM_NAME Generic)
set(CMAKE_SYSTEM_VERSION Cortex-M4-STM32F407)

CMake has a standard set of known system names (Linux, Windows, OSX, Android and others) but we are using Generic as there is no predefined name for a bare-metal embedded system.

Setting the system name tells CMake that this is a cross compilation project, and it will define the CMAKE_CROSSCOMPILING variable as true.
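
A minimal sketch of how a CMakeLists.txt could test for this, assuming the check is placed after the project() call and using the documented variable name CMAKE_CROSSCOMPILING (the message text is purely illustrative):

```cmake
# Place after project(): the toolchain file has been processed by then.
if(CMAKE_CROSSCOMPILING)
  message(STATUS "Cross compiling for ${CMAKE_SYSTEM_NAME} (${CMAKE_SYSTEM_VERSION})")
else()
  message(STATUS "Native build on ${CMAKE_HOST_SYSTEM_NAME}")
endif()
```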

The system version can be anything we want, and we decided to use it to identify the actual target rather than a version number.

Toolchain Program Paths

The next step is to specify the toolchain programs. We have added the toolchain directory to the search path, so we just need to set the C and C++ compiler command names which are prefixed with arm-none-eabi- for the GNU Arm Embedded Toolchain:

set(CMAKE_C_COMPILER arm-none-eabi-gcc)
set(CMAKE_CXX_COMPILER arm-none-eabi-g++)

CMake has rules for finding the location of the other toolchain commands. Typically, a cross compiler toolchain uses a common command prefix (arm-none-eabi- for GNU Arm) and CMake uses this convention to generate names for the other tools (arm-none-eabi-g++, arm-none-eabi-ar, and so on) if we provide the name of the C compiler. This means we could have just defined CMAKE_C_COMPILER as arm-none-eabi-gcc and CMake would have inferred the name of the C++ compiler as arm-none-eabi-g++.

CMake will use the full pathnames for the tools rather than the command name so that the actual build tool can be run without adding the build tools directory to the search path.

A downside of using full pathnames in the generated build files is that the build configuration must be regenerated if the build tool location changes. This happens when Arm release a new version of their GNU Toolchain as the version number is part of the path. You cannot generate build files that use relative pathnames, even if you use relative pathnames in the toolchain definitions.

If you are interested, you can look at the generated configuration variables in the file CMakeCache.txt in the output build directory. If your build isn’t working as expected, this is one of the files to examine for a misconfiguration. To check the C++ compiler path, look for the line following the comment line containing CXX compiler:

//CXX compiler
CMAKE_CXX_COMPILER:FILEPATH=/opt/gcc-arm-none-eabi-10-2020-q4-major/bin/arm-none-eabi-c++

All that’s left to do in the toolchain file is to provide sufficient compiler and linker options to ensure the test build will compile and link successfully.

At this point we should digress and explain the syntax for CMake functions, arguments, and strings.

CMake Functions and Variables

The CMake configuration language is simply a series of function calls with function arguments (parameters) passed in parentheses (round brackets). Flow control constructs such as if statements and loops are also implemented as functions. Parameters are white space separated and long argument lists are usually split across multiple lines (one argument per line) to aid readability.

There is no need to surround arguments with double quotes unless a space or round bracket is needed in the argument. An argument in quotes defines a string, and CMake can sometimes confuse an empty argument with an empty string. It is best to avoid strings except when using the if() function to test string values.

Multiple arguments form a list and most functions accept arbitrary sized lists. Some functions use context-sensitive keywords (such as PRIVATE shown later in the CMakeLists.txt file) to supply function specific information or partition the list of arguments into different sections.

Variable substitution uses ${…} (the curly brackets are mandatory) – there is no need to wrap variable substitution in a string (even when the variable value contains white space or round brackets).
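
The following short script is a sketch of these rules in action rather than part of the example project. The file name syntax-demo.cmake and the variable names are hypothetical; the script can be run in CMake script mode with cmake -P syntax-demo.cmake:

```cmake
# syntax-demo.cmake (hypothetical name): run with  cmake -P syntax-demo.cmake

# Unquoted, white space separated arguments form a list.
set(SOURCES
  src/main.cpp
  src/gpio.cpp
  src/display.cpp
)

# A quoted argument is a single string, even though it contains spaces.
set(GREETING "Hello from CMake")

# Flow control is written in the same function-call style.
foreach(src ${SOURCES})
  message(STATUS "source: ${src}")
endforeach()

# Quotes are useful when testing string values with if().
if(GREETING STREQUAL "Hello from CMake")
  message(STATUS "${GREETING}")
endif()

# Inside a string the list is shown with its items joined by semicolons.
message(STATUS "all sources: ${SOURCES}")
```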

Toolchain Compiler and Linker Options

Resuming our example of a minimal cross compiler build definition, we have to supply some common compiler and linker options for the Arm target. We’ll put these into a custom CMake variable so we can reuse the values:

set(ARM_OPTIONS -mcpu=cortex-m4 -mfloat-abi=soft --specs=nano.specs)

Cross Compiler Options

We add our common options along with other cross compiler options using the add_compile_options function:

add_compile_options(
  ${ARM_OPTIONS}
  -fmessage-length=0
  -funsigned-char
  -ffunction-sections
  -fdata-sections
  -MMD
  -MP
)

And some required pre-processor defines using add_compile_definitions:

add_compile_definitions(
  STM32F407xx
  USE_FULL_ASSERT
  OS_USE_TRACE_SEMIHOSTING_STDOUT
  OS_USE_SEMIHOSTING
)

We could equally well have added the compiler SEMIHOSTING definitions in our main CMakeLists.txt file, but as they are standard for all cross compilations for the target we’ve put them in the toolchain configuration.

Cross Linker Options

Linker options defined using add_link_options need to include a minimal bare metal C runtime library specification:

add_link_options(
  ${ARM_OPTIONS}
  --specs=rdimon.specs
  -u_printf_float
  -u_scanf_float
  -nostartfiles
  LINKER:--gc-sections
  LINKER:--build-id
)

CMake uses the LINKER: prefix to indicate a linker specific directive. With the GCC compiler driver this generates a -Wl, option, whereas with some other compilers (such as Clang) it generates -Xlinker options.
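
As a rough illustration, assuming the GCC based toolchain used here, the comments below show the driver options we would expect each LINKER: item to become. The map file name out.map is hypothetical:

```cmake
add_link_options(
  LINKER:--gc-sections    # expected on the gcc command line: -Wl,--gc-sections
  LINKER:-Map,out.map     # expected: -Wl,-Map,out.map (commas separate linker arguments)
)
```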

Cross Compiler Search Paths

Finally, we need to tell CMake which locations to search when resolving the absolute paths for toolchain components:

set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)

This is a standard definition that basically says the toolchain commands (programs) are outside the project, but libraries, packages and include file locations are within the project folder hierarchy.

We now have a complete toolchain configuration file which, just to remind you, we must add to the cmake command line only when generating the build files (it isn’t required when we perform the actual build):

cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake
cmake --build build

Compilation Options

In our cross compilation configuration in CMakeLists.txt, as with our hosted projects, we need to define the CMake version and project name:

cmake_minimum_required(VERSION 3.16)
project(target-cortexm LANGUAGES C CXX)

In most projects we will want to override the standard compiler and linker options to configure C/C++ standards compliance and warning levels (at the very least). So, before we define the cross compiler build target using add_executable, we now set C and C++ options to use for all compilations:

set(CMAKE_C_STANDARD 99)
set(CMAKE_CXX_STANDARD 17)

set(CMAKE_C_EXTENSIONS OFF)
set(CMAKE_C_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

The last four lines ensure we use the recommended compiler options such as -std=c++17 instead of the GNU specific versions such as -std=gnu++17; we also enforce ISO C/C++ compiler standards.

We add compiler options and definitions in the same manner as we did in the toolchain file:

add_compile_options(
  -Wall
  -Wextra
  -Wconversion
  -Wsign-conversion
  -g3
  -Og
)

add_compile_definitions(
  DEBUG
)

The options and definitions are cumulative. If there are any conflicts, then the values defined in CMakeLists.txt take precedence.

Adding a Target

As with the host project we need to add an executable target:

add_executable(Application src/main.cpp)

Again it is worth emphasising the add_executable function must define the target before you set any target specific definitions. CMake is generating the build files and must be told what to build first, and then how to define the build steps.

After we have added the project executable, we can set compiler and linker options for the target. For a cross compilation we want the target executable to have a .elf suffix. This is achieved using target specific function calls that require the name of the target (Application) as the first argument.

set_target_properties(Application PROPERTIES
  SUFFIX .elf
)

We must define the target hardware configuration for the linker memory allocation, display memory usage after linking, and generate a map file:

target_link_options(Application PRIVATE
  -T${CMAKE_SOURCE_DIR}/ldscripts/mem.ld
  -T${CMAKE_SOURCE_DIR}/ldscripts/sections.ld
  LINKER:--print-memory-usage
  LINKER:-Map,${CMAKE_CURRENT_BINARY_DIR}/Application.map
)

As a minor digression we’ll point out the duplication of the word Application used for Application.elf and Application.map. We have done this for simplicity while we get the basic concepts sorted. In the next post we’ll look at using CMake generator functions to avoid this repetition.

Although our simple example currently doesn’t include any user defined header files, we normally need to tell CMake which include directories to add to the compiler command line:

target_include_directories(Application PRIVATE
  src
)

The PRIVATE keyword defines the scope of the include directories when using the target. This is more applicable to a library target (discussed in a later post) where we may want to define INTERFACE or PUBLIC includes to be used with the library. As this is an executable program there is no external dependency on the include files, so we mark these as private.

Note that at this point we haven’t included any driver files for our target board, just a single main application file. Additional source files could be added to the source dependencies on the add_executable definition but this doesn’t capture the architecture of our application. To add support files for the target hardware, and possibly a Real Time OS, we will use the target_link_libraries function to define a subsystem in our application in a later blog post.

For now, if you want to view the complete project you can do so in our public Git repo https://github.com/feabhas/cmake-blog-1.

For completeness, if we had any target-specific compiler configuration requirements that are not included in the toolchain file, we’d have used the target_compile_definitions and target_compile_options functions specifying our target name (Application) as the first argument.
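
A minimal sketch of what that might look like is shown below. The macro name TRACE_ENABLED and the choice of -Wpedantic are assumptions made for illustration and are not part of the example project:

```cmake
add_executable(Application src/main.cpp)

# Preprocessor definitions that apply only to the Application target.
target_compile_definitions(Application PRIVATE
  TRACE_ENABLED      # hypothetical macro name
)

# Compiler options that apply only to the Application target.
target_compile_options(Application PRIVATE
  -Wpedantic
)
```

As with the other target specific functions, these calls must come after the add_executable call that defines the target.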

As an aside, CMake automatically sets several variables that reflect the project build environment. We have used:

  • ${CMAKE_SOURCE_DIR} – the project root folder (-S on the cmake command line)
  • ${CMAKE_CURRENT_BINARY_DIR} – the output build directory (-B on the cmake command line)

This configuration will create the build/Application.elf file ready for use by our target loader tools:

cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake
cmake --build build

Tracing the Build Commands

CMake prints out information about the build files as they are generated, and includes in those generated files print statements about what is being built, but not the compilation and linker commands themselves.

To diagnose problems with the generated commands you can add the VERBOSE=1 option to the cmake --build command so that it is passed into the build. This is not a CMake command option, so it must be added after the -- option that marks the end of the CMake options:

cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake 
cmake --build build -- VERBOSE=1
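
If command tracing is wanted on every build, one alternative (not used in the example project, so treat it as a suggestion only) is to set the standard CMAKE_VERBOSE_MAKEFILE variable in CMakeLists.txt:

```cmake
# Optional: ask Makefile generators to echo every compile and link command.
# Remove or set to OFF to return to the quieter default output.
set(CMAKE_VERBOSE_MAKEFILE ON)
```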

Clean Builds

Build systems typically optimise the build process by omitting steps that produce artefacts that are already up to date – in simple terms don’t recompile a file if the source and the dependency files have not changed since the last build.

When the build fails, or the generated artefacts are missing or incorrect, a first step is to force a rebuild of the entire system. You can do this by adding the --target clean option to the build command line and then rerunning the build step:

cmake --build build --target clean
cmake --build build

The clean target will remove the generated files forcing all build steps to be executed on the next build command (cleaning the build does not automatically initiate a new build).

When changing and updating the build configuration itself inconsistencies can arise in the build folder. Frequently obsolete files that are no longer required can be left around in the build sub-folders. A more dramatic clean build is to remove the entire build folder and regenerate the build files. On our Linux system we’d simply run an rm command:

rm -r build
cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=toolchain-STM32F407.cmake
cmake --build build

As these CMake build steps start to get more complex many sites will add a front end script to simplify running the different build steps so developers do not have to learn and enter the potentially long CMake build commands.

Summary

Once you understand that the CMake toolchain must be configured on the command line, problems associated with using a cross compiler should be much easier to resolve.

Cross compiler toolchain configuration is complex enough to require a separate toolchain definition file specified with the -DCMAKE_TOOLCHAIN_FILE command-line option.

If you simply wanted to use a different compiler such as clang you could possibly get away with setting the compiler name or compiler path on the CMake command line:

cmake -S . -B build -DCMAKE_CXX_COMPILER=clang++
cmake --build build

But this approach only defines the C++ compiler command, leaving the C compiler and the other standard toolchain programs with their default names. To use the full Clang toolchain (including its equivalents of the programs often referred to as binutils), you should use a toolchain definition file without defining the CMAKE_SYSTEM_NAME variable, because this won’t be a cross compilation: the target architecture is still the host.
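
A sketch of such a file might look like the following. The file name toolchain-clang.cmake is hypothetical, and we assume the Clang and LLVM tools are installed under these command names and can be found on the search path:

```cmake
# toolchain-clang.cmake (hypothetical file name)
# CMAKE_SYSTEM_NAME is deliberately not set: this is still a host build.
set(CMAKE_C_COMPILER clang)
set(CMAKE_CXX_COMPILER clang++)

# LLVM equivalents of the archive tools (assumed to be installed).
set(CMAKE_AR llvm-ar)
set(CMAKE_RANLIB llvm-ranlib)
```

It would be passed on the command line in the same way as the cross compilation toolchain file, using -DCMAKE_TOOLCHAIN_FILE=toolchain-clang.cmake.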

NOTE: if you read about toolchain configuration on some web pages, you may find references to _CMAKE_TOOLCHAIN_PREFIX or CMAKE_TOOLCHAIN_PREFIX variables. This is a common misconception as these variables do not exist in modern CMake and cannot be used to configure a toolchain by defining a common prefix before the command name.

In the next post CMake Part 2 – Release and Debug builds, we’ll look at using CMake to configure different debug and release builds.

You can download the complete project from our GitHub repository https://github.com/feabhas/cmake-blog-1.


Why We Need Build Systems

Build systems were developed to simplify and automate running the compiler and linker and are an essential part of modern software development. This blog post is a precursor to future posts discussing our experiences refactoring the training projects to use the CMake build generator.

Using Build Systems

Build systems can be standalone command line applications such as Make, Scons and Ninja; or part of an Integrated Development Environment (IDE) like Visual Studio, Xcode or IAR Workbench.

Configuring build systems for a project can be complex and there are a few applications around that will generate the required build files from a simpler project configuration file. The most popular of these tools are CMake, which generates files for several build systems, and Meson, which generates Ninja build files.

A 2021 survey by the Standard C++ Foundation showed that CMake was used by 4 out of 5 of the respondents, while Meson and Scons were each used by fewer than 1 in 20. The survey also showed that Make/nmake and MSBuild (Visual Studio) were used roughly equally by 2 out of 5 people and Ninja by 1 out of every 3. In many cases the respondents will be using more than one build system across multiple projects.

Interestingly, or worryingly, from our perspective of moving our training projects to CMake, about two thirds of the respondents found CMake to be a major or minor pain point. Managing CMake was the third most frustrating aspect of C++ development behind managing libraries and project build times.

We need to use build systems because compiling an application from source code is no longer as simple as running a single compilation command such as:

$ g++ -o Application main.cpp

While this works, it relies on default configuration options for the compiler and linker.

We should point out that referring to g++ as a compiler is misleading. It isn’t very obvious, but g++ is itself a very simplistic build system responsible for running a number of build phases:

  • the preprocessor
  • the compiler
  • the assembler (code generator)
  • the linker

Anyone who has worked with the Microsoft C++ tools will be aware that there is a separate compiler (cl.exe which includes the preprocessor and assembler) and linker (link.exe).

The build process is complex and involves many stages with different requirements, inputs and outputs and can be summarised in the following diagram.

In reality, we use a build system to manage some or all of the following aspects of software development:

  • source code organisation
  • source code inter-dependencies
  • managing third party libraries
  • compilation options
  • code generation options
  • program linker configuration
  • post build processing
  • managing testing

It’s worth looking closely at these steps in order to understand the requirements of a build system and the concept of a development Toolchain.

Source Code Organisation – In Source Builds

Our simple example above generates the intermediary object files and executable program in the current directory. When we store the output files in the same directory as the source files we call this an In-source build: this is generally considered a bad idea. Managing the source code will become problematical when our application gets more complex and requires multiple files.

The output files from a build process can be re-created from the source code so do not need to be saved using a backup regime, whereas source code must be saved which these days is usually achieved using a source code repository such as Git.

The lifecycle management of source code and build output files are independent and should be stored in different locations. Many build tools and tool generators (like Meson) do not even support in-source builds.

Source Code Organisation – Out of Source Builds

All projects should store generated artefacts (object files, executable applications, etc.) in a separate location to the source code.

A typical C/C++ application will separate out logical subsystems into different components and these will usually be stored in a hierarchical directory structure representing the application’s architecture. A build system should manage source code (including header files) stored in multiple directory locations.

There is no standard structure for C/C++ source code and a quick browse of open source projects in GitHub shows nearly as many different source code structures as there are projects (a bit of an exaggeration but it does show there is no single standard).

Many projects intermingle the header files with the source files, which can lead to complex compilation options when the source code is stored in hierarchical directories. Header files define the interface to a module’s functionality, and a best practice approach is to separate interface from implementation. Applying this approach to source code organisation implies that header files should be stored separately from the implementation files.

Back when I was developing code using Unix I used the directory names src for source files and hdr for the header files, with both directories stored in the project’s root directory; but the use of src and inc is the usual practice in embedded systems. Not everyone likes the traditional Unix style of shortening words (often by omitting vowels), preferring to use source and include instead.

What everyone does agree on is that the generated output files go in a separate folder: normally in the project workspace. Typical names are build or target as used by the Maven build system.

No matter how the source code is organised, the compiler must be told where the source and header files are located.
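
With CMake, for example, this might look like the following sketch, which assumes a hypothetical layout with implementation files under src and header files under include (the project name is also illustrative):

```cmake
cmake_minimum_required(VERSION 3.16)
project(layout-example LANGUAGES CXX)

# Implementation files live under src/
add_executable(Application
  src/main.cpp
)

# Header files live under include/, which is added to the compiler search path
target_include_directories(Application PRIVATE
  include
)
```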

Another aspect of source code organisation is integrating the build process with a source code management system. The ability to download the latest committed version of source code before building means the same build system can be used for local development as well as the centralised pipelines of Continuous Integration commonly used in agile development methodologies.

Once the locations of source and header files are determined, a definition of the component files is also required for a build process. These files can be supplied individually or by using wildcard patterns – a good build system will support both.

The benefit of listing each file individually is that the build system provides a definitive list of source dependencies for the build artefacts, which is helpful for general administration tasks and undertaking a risk analysis for business continuity purposes. The drawback is that this list has to be maintained and the extra administration required can nudge developers into including code in existing modules rather than creating a new module, especially if the build configuration is centrally administered in an overly restrictive manner.

A wildcard approach to filenames (e.g. src/*.cpp) superficially seems more straightforward as it doesn’t require the developer to list each file allowing new files to be easily added. The downside is that the build system does not have a definitive list of the source code files for a given artefact, making it harder to track dependencies and understand precisely what components are required. Wildcards also allow spurious files to be included in the build – maybe an older module that has been superseded but not removed from the source folder.

Best practice says to list all source modules individually despite the, hopefully minor, extra workload involved when first configuring the project or adding additional modules as the project evolves.

Source File Dependencies

Larger projects will use multiple source files to break down a large code base into smaller manageable units, probably using directories to group files in component subsystems. There will be interdependencies between these files. For C/C++ projects the dependencies can be identified through the occurrence of #include statements.

Large projects can take a while to compile and link all of the separate files from scratch (a clean build).  A large C project can take 20 or 30 minutes to build even using a fast multi-core server. For C++ projects making heavy use of templates the build times can be measured in hours rather than minutes.

Most build systems will optimise the build process by omitting stages that are already up to date. For C/C++ builds this means omitting the compilation of a source file if neither the source nor any of the files it depends on have been changed since the last build.

But a build optimisation only works correctly if the build configuration correctly captures the dependencies between files. For a simple build system like GNU Make the developer must specify and maintain these dependencies manually. A build system generator like CMake will scan the source files to maintain the dependencies automatically.
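The up-to-date check described above can be sketched in a few lines of C++17. This is only an illustration of the timestamp comparison a Make-style tool performs, not how any particular build system is implemented; the function name needs_rebuild and the decision to treat a missing dependency as "rebuild needed" are assumptions made for the example.

```cpp
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Returns true when `target` must be rebuilt: either it does not exist,
// or at least one dependency is missing or newer than the target.
// (Hypothetical helper; a real tool would report a missing dependency
// as an error rather than quietly rebuilding.)
bool needs_rebuild(const fs::path& target,
                   const std::vector<fs::path>& dependencies) {
    if (!fs::exists(target)) {
        return true;  // nothing has been built yet
    }
    const auto built_at = fs::last_write_time(target);
    for (const auto& dependency : dependencies) {
        if (!fs::exists(dependency) ||
            fs::last_write_time(dependency) > built_at) {
            return true;  // a source or header changed after the last build
        }
    }
    return false;  // target is newer than everything it depends on
}
```

The check is only as good as the dependency list passed in: if a header is left out of the list, a change to that header will not trigger a rebuild, which is exactly the failure mode described above.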

Compilation Options

Compilation options must specify:

  • the source language version, sometimes the source language
  • compilation options
  • include file locations
  • preprocessor symbols (or defines)

Typically, compilation options are provided as command-line parameters but there is no reason why a compiler couldn’t read a configuration from a specification file.

As an example of a compilation option, the GNU g++ compiler uses -std=c++17 for working with C++17 whereas the Microsoft cl compiler uses /std:c++17. It’s also worth pointing out that the GNU Compiler Collection uses different compiler commands for C (gcc) and C++ (g++), but Microsoft supplies one compiler (cl) and uses the filename extension to determine the language (.c or .cpp).

A build file generator such as CMake or Meson must handle these different approaches.

An essential compilation option is setting the correct level of compiler warnings. A C/C++ compiler will attempt to generate code whenever possible; only if code cannot be generated will a compilation error be issued. This means that the compiler sometimes makes assumptions about what the programmer intended when writing the source code statements. The phrase “never assume because you make an ASS out of U and ME” has some significance here.

Using g++, we recommend, as a minimum, using the -Wall and -Wextra options to enable warnings for code use that is generally regarded as questionable (this doesn’t include implicit type conversion). The -pedantic option issues the warnings required for strict ISO (formerly ANSI) language compliance, which is advisable as it will ensure you don’t make use of g++ specific features and pre-empt potential problems if you decide to use a different tool chain in the future. Note that the -ansi option selects the older C90 or C++98 standard, so it should not be combined with a newer -std setting such as -std=c++17.

Other compilation options generate warnings when the compiler has to infer the programmer’s intentions because the source code is not explicit. A good example is the implicit conversion of a signed to an unsigned integer value. Implicit sign conversion can cause subtle problems when working with hardware device registers, so we usually add the -Wconversion and -Wsign-conversion warnings on g++ to identify these situations.
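As a small illustration of the kind of code these warnings are aimed at, the sketch below contrasts an implicit signed-to-unsigned conversion with an explicit one. The function names and the clamping policy are invented for the example.

```cpp
#include <cstdint>

// Implicit conversion: accepted silently by default, but reported by
// -Wsign-conversion because a negative value changes its meaning.
std::uint32_t timeout_implicit(int ticks) {
    std::uint32_t reload = ticks;  // -1 silently becomes 0xFFFFFFFF
    return reload;
}

// Explicit conversion: the range check and the cast make the intent
// visible, so the warning is not triggered.
std::uint32_t timeout_checked(int ticks) {
    if (ticks < 0) {
        return 0u;  // hypothetical policy: clamp negative requests to zero
    }
    return static_cast<std::uint32_t>(ticks);
}
```

Written to a 32-bit timer reload register, the value from timeout_implicit(-1) would request the longest possible delay rather than none at all, which is why we prefer to have the compiler flag the conversion.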

In general, set warnings to the highest level possible and remove all, or at least as many warnings as possible, from the compilation.

Your compilation phase should be augmented by static analysis tools such as clang-tidy, cppcheck or commercial tools such as Coverity, which examine code structure and data flow without executing the code. These tools use heuristic rules to identify potential logic flaws and non-compliance with coding guidelines such as MISRA, which is widely used in embedded systems.

Include File Locations

We find when teaching that the use of the #include preprocessor directive is a common source of confusion, even amongst experienced C/C++ programmers. We’re often asked what the difference is between using angle brackets < > and quotes " ".

Originally angle brackets were used for header files in standard locations known to the compiler, whereas quotes were used to define a string literal specifying the path to the header file (relative to the project workspace). These days life isn’t quite that clear cut, but the general approach is to use < > for library headers and " " for user-defined headers.

The ISO C and C++ standards both say that “The named source file is searched for in an implementation-defined manner” for both < > and " " (see the “Source File Inclusion” section in the relevant standard).

The C++ standard header files are defined with logical module names like <iostream> whereas in C we use the header filenames like <stdio.h>. When using C header files with C++ the logical name is the base part of the filename prefixed by c so <stdio.h> becomes <cstdio>.

The compiler knows where standard header files are located. For example, the Linux host g++ compiler looks in /usr/include whereas the Arm g++ compiler (located at /opt/arm-toolchain/bin/arm-none-eabi-g++) looks in /opt/arm-toolchain/include. There is a full description for hosted GNU compilers in the Search Path section of the manual.

We can tell the compiler to look in other locations using the -I directive which specifies an additional directory to search for include headers. Note that this is just the top-level directory and not a recursive directory search.

We can use multiple include path locations on a compilation so we could include nested include directories as separate -I options. For example, when developing our embedded target code for an STM Discovery board (https://www.st.com/en/evaluation-tools/stm32f4discovery.html) we include standard header files using -Isystem/include/cmsis and -Isystem/include/stm32f4xx separately; we cannot just use -Isystem/include. Note that these are relative paths from the project workspace (not an absolute path like /usr/include).

When using #include directives with string literals we are specifying file pathnames. But this approach can be abused by using a directive such as:

#include "../hdr/mylib.h"

This approach has coupled the organisation of the files on the file system to the C++ program code structure. We could not rename the hdr directory to include without modifying every source file that uses this header file. A maintenance nightmare – so this should be avoided.

To solve the file system dependency, we would add -Ihdr to include the hdr folder in the include file search path (some people prefer to use -I./hdr to make it explicit this is relative to the project workspace). Our include directive now becomes #include "mylib.h" without any path information. It is common to organise header files in sub-directories so it would still be acceptable to use "lib/mysublib.h" as the header file organisation is part of the project structure.

The compiler uses the include locations to resolve all include directives so in the previous example we could also have written #include <mylib.h> but this would not follow the accepted conventions of using string literals for user defined header files.

In resolving include file paths, modern compilers will typically look for the header file relative to the location of the source file containing the #include statement. If the header file is not found that way the compiler will search the specified include locations, in the order given on the command line, until it finds an exact filename match (note that both Windows and OSX are, by default, case insensitive when searching for included filenames).

Some compilers may adopt a different approach such as initially looking for include files relative to the current working directory rather than the location of the source file. Defining and using a compiler’s -I option (or equivalent) in a build system removes the dependency on a compiler’s include file lookup strategy.

The specified order of the include directories is therefore important and a possible source of problems if there are multiple header files with the same name: the first filename match is used and the same filename in subsequent include directories is ignored. Duplicated include header filenames are not treated as an error. The best practice is to use unique header filenames or, failing that, use hierarchical directories to ensure unique paths. A good example of the directory approach is the standard Linux header types.h which is included as <sys/types.h>.

Code Generation Options

Code generation or assembler options are specified on the compilation command line and include general concepts like optimisation levels and architecture-specific options. For example, when cross compiling to an Arm processor, we use the -mcpu=cortex-m4 flag to set the correct architecture and -mfloat-abi=soft to use a software floating-point library, because the QEMU emulator we use for online training does not include support for a hardware FPU.

Code optimisation is typically kept to a minimum during development by using the -Og option, which optimises for the debugging experience (the g indicates debugging, as with gdb, the GNU debugger), but for a released version of the application (see build types described later) we may want to include more optimisation. For example, using -Ofast will optimise for speed at the expense of a potentially larger memory footprint and of strict standards compliance (it enables options such as -ffast-math), whereas -Os will optimise for the smallest memory usage, usually with slower code execution (the s means size).

Usually, we apply the same options to all compilation units (each source file is compiled independently) but sometimes it may be desirable to treat some source files differently. A good example would be targeting an embedded system with a limited amount of memory where we optimise for small size with -Os. However, if one module is a critical performance bottleneck, we may need that compilation unit to be optimised for speed with -Ofast.

Make sure whatever build system you use will support different compilation options for different source files: even if you don’t need it now, you might in the future.

Preprocessor Directives

We use preprocessor directives to configure our source code so that we can use a single code base (one project) to build potentially different applications.

Examples of standard preprocessor symbols are __STDC_VERSION__ and __cplusplus which can be used to verify the compiler options are set to the correct C or C++ language version.

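To ensure we are using C++17 we would add a check in our code such as:

#if __cplusplus < 201703L
#error "This file requires a C++17 compiler"
#endif

The compiler reports the name of the file containing the failing #error directive alongside the message, so there is no need to use the __FILE__ symbol here; macros are not expanded inside a string literal in any case.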

If we knew we were always using a Modern C++ compiler (C++11 or later) we could also have used:

static_assert(__cplusplus >= 201703L, "A C++17 compiler is required");

A good example of user-defined configuration options can be seen in our embedded training projects currently using an STM Discovery board (STM32F407VG) with hardware components configured at specific addresses. We know that we may need to change this to a different board in the future; perhaps STM will stop manufacturing this particular board.

If we need to move to a different board with similar hardware components in the future, these components could be mapped to different physical addresses. We can build this dependency into our source code using conditional compilation based on preprocessor symbol definitions.

To resolve our theoretical problem of differing physical addresses we use pre-processor directives to include the appropriate device header file:

#ifdef STM32F407xx
#include "stm32f407xx.h"
#elif defined(STM32F417xx)
#include "stm32f417xx.h"
…
#endif

Our build configuration would define the appropriate preprocessor symbol on the command line: in this case using -DSTM32F407xx to select the appropriate hardware configuration.
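The same technique can select board-specific constants directly. The sketch below is self-contained so it can be compiled on a host; the addresses, the names led_port_base, board_name and led_register, and the fallback definition are all placeholders invented for illustration and are not taken from any device header or datasheet.

```cpp
#include <cstdint>

// Normally the build system supplies exactly one of these symbols, for
// example with -DSTM32F407xx. This fallback exists only so that the
// sketch compiles on its own.
#if !defined(STM32F407xx) && !defined(STM32F417xx)
#define STM32F407xx
#endif

// The addresses are placeholders for illustration, not datasheet values.
#if defined(STM32F407xx)
constexpr std::uintptr_t led_port_base = 0x40000000u;
constexpr const char* board_name = "STM32F407xx";
#elif defined(STM32F417xx)
constexpr std::uintptr_t led_port_base = 0x50000000u;
constexpr const char* board_name = "STM32F417xx";
#else
#error "No supported board symbol defined"
#endif

// Code from here on uses led_port_base without knowing which board
// was selected. The pointer must only be dereferenced on the target.
inline volatile std::uint32_t* led_register() {
    return reinterpret_cast<volatile std::uint32_t*>(led_port_base);
}
```

In a real project we would leave out the fallback so that forgetting the -D option stops the build at the #error line instead of silently choosing a board.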

Similarly, we can use -DDEBUG to define a debug symbol for code that is only applicable when developing and debugging, typically alongside compiler options such as -Og to optimise for the debugger.
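A minimal sketch of how source code might react to the DEBUG symbol is shown below. The trace helper and the debug_build constant are hypothetical names for this example, and the code assumes a C++17 compiler for if constexpr.

```cpp
#include <cstdio>

// DEBUG is expected to come from the build configuration (-DDEBUG).
#ifdef DEBUG
constexpr bool debug_build = true;
#else
constexpr bool debug_build = false;
#endif

// Hypothetical trace helper: prints only in debug builds and returns
// whether anything was printed.
inline bool trace(const char* message) {
    if constexpr (debug_build) {
        std::fprintf(stderr, "[debug] %s\n", message);
        return true;
    } else {
        return false;  // release build: the call is effectively a no-op
    }
}
```

Converting the preprocessor symbol into a constant once keeps the #ifdef in a single place, so the rest of the code base can test debug_build with ordinary C++.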

Compiler defines are used to support the concept of different Build Configurations which CMake refers to as build type, but an IDE usually calls a build configuration or build target.

Build Configuration

A build configuration can be described as the combination of all the build options that uniquely define the application being built. Many projects have the idea of a debug build optimised for development and a release build optimised for use.

Each build configuration uses a separate output folder for the build artefacts (object files, executable and additional supporting files). This prevents one build configuration from overwriting the output from a different configuration.

Build configurations can be used for many purposes. Our previous example of using two different target hardware boards would be separate builds for each hardware target. Actually, including debug and release versions for the two different hardware targets, that’s four different configurations and four separate build output locations.

We can also use build configurations to install our application in a central location or a web server where end-users can access and use our build artefacts.

A good build system will allow us to automate this deployment and/or installation process.

Program Linker Configuration

Like the compiler, linker options are provided as command line parameters, usually augmented by configuration files and run-time libraries. The linking stage for embedded systems is often more complex than the compilation stage because it is at this point that the run-time libraries and the physical memory architecture have to be resolved to create an image suitable for the target board.

When using the GNU toolchain, the use of linker options can be confusing because the same command (gcc or g++) is used for both compilation and linking. Linker options specified to a compilation are ignored; similarly, compiler options are ignored when linking. Working with Arm g++ we might build a system using a compilation command and a link command:

$ g++ -c -o build/main.o -std=c++17 --specs=rdimon.specs src/main.cpp
$ g++ -o build/Application.elf -std=c++17 --specs=rdimon.specs build/main.o

While this works, it isn’t clear that the -std=c++17 option is probably not used by the linker. Similarly, the --specs option tells the linker to include a debug version of the embedded C standard library, which isn’t required by the compiler.

Microsoft developers have a separate link.exe program used to link the application so the two build steps are clearly differentiated.

A good build system will clearly differentiate between compiler and linker options.

On a host compilation the linker is specific to the host and simply needs to be given the list of optional run time libraries used by the project. In our C++ courses we discuss threading which requires the linker to include the POSIX threading library using the g++ option -lpthread.  The other common linking requirement is for handling libraries: should they be included in the executable image (static linkage), or use dynamic linking to a shared library (.so or .dll file) at application startup.

Using a development toolchain for an embedded (cross compiled) system has more complex linkage requirements. The physical board memory layout must be supplied to the linker: the GNU Arm Embedded Toolchain uses .ld files for this purpose. There will be runtime kernel or executive code that executes at board reset to initialise the hardware and set up the stack, heap and static data sections before calling the main function to start the program.

An embedded system linker needs to include the C/C++ runtime libraries. In the case of the GNU Arm Embedded Toolchain these are provided by the --specs options. For a debug version the rdimon.specs is linked to support semi-hosted debugging (I/O streams are mapped onto the serial port used to flash memory), while a release version uses nosys.specs which has stubs for standard I/O support and unsupported host system functions.

Post Build Processing

The last step in our build (if we ignore deployment and installation) is any additional processing required after a successful build.

For a hosted application, we might strip the generated executable to remove all embedded symbols and other redundant information to reduce the size of the executable image.

For a cross-compilation to target hardware, we usually have to generate additional binary (or hex) files containing the image to load into flash memory. If we have embedded debug support, we will need map files to support post-crash analysis, as they allow the programmer to understand and review the target memory layout.

Managing Testing and Debugging

Testing and debugging is too extensive a subject to cover in any detail in this post, but a build system should be capable of running automated tests at any point of the build cycle. Test management should include unit testing of each source code module, integration testing, and functional testing of the various build artefacts generated by the build process.

Using debugging tools such as OpenOCD and Segger Ozone, while a necessary part of development, may be too much for a build system to manage. Typically build systems like to create artefacts such as object files, executable images or other output files such as generated web pages for reports and build summaries. Interactive debugging does not fit very well with this approach.

Limitations of a Build System

Tools such as CMake have a very extensive “programming language” used to configure the build process. While it may appear attractive to use this programmable capability to incorporate additional steps into the build system, this can end up adding huge complexity to the build instructions. It may be better to keep some aspects of the development lifecycle decoupled from the build system.

Moving away from a build system will introduce problems to do with portability and maintenance of the supporting scripts. Bash scripts are supported on Linux and OSX but on Windows are provided by third party libraries such as MinGW or Cygwin. Recently Microsoft have been integrating Windows Subsystem for Linux (WSL) more closely with Windows 10 and this provides another source for running shell scripts. Using bash scripts for Windows developers is not straightforward. Similarly PowerShell is standard on Windows 10, but has to be installed on Linux and OSX.

While Python is cross platform, some of the standard libraries are only available on Linux-like platforms (usually including OSX) while others are only available on Windows. Python’s popularity is partly down to the extensive range of third-party libraries available, which would have to be installed on the developer’s workstations.

Stepping away from the build tool can provide additional development lifecycle support, but that loses the inherent portability of a build tool like CMake. Developers can find themselves having to configure the build environment before they can even start development work on a project.

Summary

Hopefully, you now know why we need build systems to support modern software development. Applications have complex configuration and build requirements that need to be captured and implemented by a build system. Applications are no longer single source files that can be compiled using a single command line.

In case it isn’t apparent, the build system configuration is part of the project’s source code and should be stored in the same code repository as the program source. Anyone checking out a project source code will then be able to build the project.

This blog post is a precursor to posts discussing what we learned when configuring CMake for our cross compilation and hosted training projects.


Modern Embedded C++ – Deprecation of volatile

Compiling the following, straightforward code:

volatile int x;

int main() {
    x += 10;
}

https://godbolt.org/z/jq83vdvj5

Using g++ with the directive -std=c++17 builds without any warnings or errors. However, change the directive to -std=c++20, and the result is:

<source>: In function 'int main()':
<source>:5:5: warning: compound assignment with 'volatile'-qualified left operand is deprecated [-Wvolatile]
    5 |   x += 10;
      |   ~~^~~~~
Compiler returned: 0

The new C++ standard, C++20, has deprecated volatile! So, what does this mean for the embedded programmer?

We covered the need for, and use of, volatile in a previous posting. That post (written in April 2020) did state that:

In C++20 many general uses of volatile are being deprecated.

The key phrase here is general uses.
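As the warning text says, it is the compound assignment that is flagged, not the volatile qualifier itself. One way to express the same read-modify-write without the deprecated form is to spell out the read and the write, as in this sketch (the add_ten function is a hypothetical wrapper added so the behaviour can be checked):

```cpp
volatile int x;  // same declaration as in the example above

int add_ten() {
    x = x + 10;  // one explicit read of x, then one explicit write
    return x;    // a further read, used here only to report the result
}
```

Writing the statement this way makes it visible that two separate accesses to x take place, which matters when x stands for a hardware register.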

Volatile in embedded


GitHub Codespaces and online development

In our previous posting, we discussed using VSCode’s Dev Container extension to allow running workspaces directly within a Docker container.

In December 2020, I was granted early access to a new feature developed by GitHub called Codespaces. Codespaces offers an online VSCode development environment, enabling you to develop entirely in the cloud.

The great news is that Codespaces uses the same core process, and file structure, as Dev Containers; meaning once we have our .devcontainer folder set up (if you are unfamiliar with Dev Containers it is worth reading the previous blog first) “it just works” online.

TDD in the cloud

Running our example from the previous blog (GoogleTest and Meson) using Codespaces is Simples!

When granted Codespaces access, open the GitHub project and under the Clone or download button you are offered the option Open with Codespaces.


VSCode, Dev Containers and Docker: moving software development forward

Long term readers of this blog will know our devotion to using container-based technology, especially Docker, to significantly improve software quality through repeatable builds.

In the Autumn/fall of 2020, Microsoft introduced a Visual Studio Code (VSCode) extension Remote – Containers. With one quick stroke, this extension allows you to open a VSCode project within a Docker container.

Getting started with Dev Containers and Docker

There are several different approaches to using Dev Containers. In this post, we shall cover three options:

  1. Using an existing Docker image from Docker Hub
  2. Using a pre-built Microsoft container setup
  3. Using a custom Docker image based on a project specific Dockerfile

There are a couple of prerequisites:

Using an existing Docker image – TDD in C with Ceedling

Anyone using or experimenting with Test-Driven-Development in C will probably be aware of Ceedling, unity and CMock.

Whether or not you have Ceedling, or any of its dependencies, such as Ruby, installed, we can begin using Dev Containers with an existing Docker Hub container image. Containerisation ensures we can quickly get up and running with Ceedling in a known environment. In true ‘Blue Peter‘ style, we happen to have a pre-built Ceedling based Docker image on Docker Hub.

  1. Create an empty folder, e.g.
$ mkdir ceedling_test
  2. In the new folder, create another folder called .devcontainer (note the leading .)
$ cd ceedling_test
$ mkdir .devcontainer
  3. In that new folder, add a file called devcontainer.json with the contents
{
    "image": "feabhas/ceedling"
}
  4. Your project structure should now be:
.devcontainer
    └── devcontainer.json
  5. Now open VSCode in the working directory
$ code .
  6. VSCode will detect the Dev Container configuration file and ask if you want to reopen the folder in a container. Click Reopen in Container.
  7. Open a terminal window within VSCode, and you will be presented with a shell prompt #. We are now running within a Docker container based on the image feabhas/ceedling.
  8. Test the container, e.g.
# ceedling new test_project
Welcome to Ceedling!
      create  test_project/project.yml

Project 'test_project' created!
 - Execute 'ceedling help' from test_project to view available test & build tasks

# cd test_project

# ceedling module:create[widget]
File src/widget.c created
File src/widget.h created
File test/test_widget.c created
Generate Complete

# ceedling test

Test 'test_widget.c'
--------------------
Generating runner for test_widget.c...
Compiling test_widget_runner.c...
Compiling test_widget.c...
Compiling unity.c...
Compiling widget.c...
Compiling cmock.c...
Linking test_widget.out...
Running test_widget.out...

--------------------
IGNORED TEST SUMMARY
--------------------
[test_widget.c]
  Test: test_widget_NeedToImplement
  At line (15): "Need to Implement widget"

--------------------
OVERALL TEST SUMMARY
--------------------
TESTED:  1
PASSED:  0
FAILED:  0
IGNORED: 1


After exiting VSCode, all files created will exist in your local file system. Reopening VSCode, you will once again be prompted to reopen in the container.


Introduction to the ARM® Cortex®-M7 Cache – Part 3 Optimising software to use cache

Part 1 Cache Basics

Part 2 Cache Replacement Policy

Caches – Why do we miss?

Cold Start

As stated, both data and instruction caches are required to be invalidated on system start. Therefore, the first load of any object (code or data) cannot be in cache (thus the cold start condition).

One available technique to help with cold-start conditions is the ability to pre-load data into the cache. The ARMv7-M instruction set adds the Preload Data (PLD) instruction. The PLD instruction signals to the memory system that data memory accesses from a specified address are likely in the near future. If the address is cacheable, then the memory system responds by pre-loading the cache line containing the specified address into the cache. Unfortunately, there is currently no CMSIS intrinsic support for the PLD instruction.

It is worth noting that some processor data caches implement an automatic prefetcher (e.g. Cortex-A15). This monitors cache misses, and when a pattern is detected, the automatic prefetcher starts linefills in the background. Unfortunately, the Cortex-M7 data cache does not support automatic prefetch.
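In the absence of a CMSIS intrinsic, one option when using GCC or Clang is the compiler builtin __builtin_prefetch, which the compiler may translate into a PLD instruction on Arm targets that support it (it is worth checking the generated assembly to confirm). The sketch below is an assumption-laden illustration rather than tuned code: it assumes 32-byte cache lines and a line-aligned buffer, and the function name sum_with_preload is invented for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 32-byte cache lines, i.e. eight 32-bit words per line.
constexpr std::size_t words_per_line = 8;

// Sums `count` words, hinting one cache line ahead of the current read.
std::uint32_t sum_with_preload(const std::uint32_t* data, std::size_t count) {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // At the start of each line (assuming `data` is line-aligned),
        // hint that the following line will be read soon.
        if (i % words_per_line == 0 && i + words_per_line < count) {
            __builtin_prefetch(data + i + words_per_line);
        }
        total += data[i];
    }
    return total;
}
```

The builtin is only a hint: the result of the function is the same whether or not the hardware acts on it, so any benefit has to be confirmed by measurement on the target.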

Capacity

The other most obvious reason for misses is that of cache capacity. The larger the cache, the higher the probability of a cache hit and the lower the frequency of eviction. However, all this comes at a cost, not only financial but also power.

A larger cache is, naturally, going to contribute to the overall System-on-Chip (SoC) costs, making the end microprocessor more expensive. In high volume designs, this is always a significant factor in SoC choice.

Among all processor components, the cache and memory subsystem generally consume a large portion of the total microprocessor system power, commonly 30-50% of the total power [Zang13]. Caches, thus, add a further level of complexity for the poor, overworked engineer trying to calculate the design’s power model, and have an impact on all battery-based designs.

Conflict

Finally, misses will occur due to natural eviction followed by a reload. So a simple loop such as:

for(uint32_t i = 0; i < N; ++i) {
   dst[i] = src[i];
}

may result in multiple eviction/reload cycles depending on the memory addresses of dst and src. Also, any dst[i] eviction will result in a memory write as the line is marked dirty. The 4-way data cache goes a long way to help reduce the potential of dst[i] eviction, but because of the pseudo-random replacement policy, it may happen more often than we would expect or like.
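Whether dst and src compete for the same cache lines depends on which set each address maps to. The following sketch shows the usual set-index calculation for a set-associative cache; the 16 KB cache size is an assumption chosen for the example (the 4 ways and 32-byte line size are also assumed here), and the names cache_set and same_set are hypothetical.

```cpp
#include <cstdint>

// Assumed geometry: 16 KB, 4-way set associative, 32-byte lines.
constexpr std::uint32_t cache_bytes = 16u * 1024u;
constexpr std::uint32_t ways = 4u;
constexpr std::uint32_t line_bytes = 32u;
constexpr std::uint32_t sets = cache_bytes / (ways * line_bytes);

static_assert(sets == 128u, "16 KB / (4 ways * 32 bytes)");

// The set that a given address maps to.
constexpr std::uint32_t cache_set(std::uint32_t address) {
    return (address / line_bytes) % sets;
}

// Two addresses compete for the same four ways when their sets match.
constexpr bool same_set(std::uint32_t a, std::uint32_t b) {
    return cache_set(a) == cache_set(b);
}
```

With this geometry, addresses that differ by a multiple of 4 KB (the cache size divided by the number of ways) map to the same set, so a dst buffer placed exactly 4 KB after src would contend with it for the same four lines on every iteration.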

Code Optimizations

There are a number of key areas where we, as software developers, can potentially impact the performance of the cache:

  • Algorithms
  • Data structures
  • Code structures
