
The hokey-cokey* of function calls

January 20th, 2014

Glennan Carnie

Technical Consultant at Feabhas Ltd
Glennan is an embedded systems and software engineer with over 20 years experience, mostly in high-integrity systems for the defence and aerospace industry.

He specialises in C++, UML, software modelling, Systems Engineering and process development.


Functions are the lifeblood of a C program. The program flow is altered by passing parameters to functions, which are then manipulated. Conceptually function parameters are defined as being either:

  • Inputs (Read-only) – client-supplied objects manipulated within the function only
  • Outputs (Write-only) – objects generated by the function for use by the client.
  • Input-Outputs (Read-Write) – client objects that can be manipulated by the function.

Defining the use of a parameter gives vital information not only to the implementer, but (perhaps more importantly) to the user of the function, by more-explicitly specifying the ‘contract’ of the function.

Many programming languages (for example, Ada) support these concepts explicitly. C, however, does not. One has to remember that when Kernighan and Ritchie developed C, structured programming was very much in its infancy and many of these ideas were still being formulated (also remember that one of the C design goals was parsimony).

Even today, though, these concepts are rarely taught to C programmers and that has often led to clumsy, insecure or even downright dangerous APIs.

If C doesn’t support these concepts explicitly, can we simulate them? The answer is (of course) yes, by using some basic language constructs and forming some idioms.

Let’s look at each parameter type in turn.

Input parameters

C specifies a call-by-value or call-by-copy paradigm. That is, when a C function is called the compiler sets up a call frame that holds copies of the function parameters. Therefore, when you pass parameters by value you are – in effect – creating a parameter for the function to use that in no way affects the caller’s data.
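As a minimal sketch of call-by-value (the function names here are illustrative, not from the original listing), the callee below changes only its own copy of the argument; the caller's variable keeps its value.

```c
/* Call-by-value: the parameter is a copy of the caller's argument. */
int increment(int value)
{
    value = value + 1;   /* modifies the local copy only */
    return value;
}

/* Returns the caller's variable after the call, to show it is untouched. */
int caller_value_after_call(void)
{
    int original = 10;
    (void)increment(original);
    return original;     /* the caller's object was never modified */
}
```

Calling `increment(original)` hands the function a copy, so nothing the function does to `value` can reach `original`.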


This is fine for simple types, but what about user-defined types – structs? What’s the problem with passing them by value?


Passing a structure by value means allocating enough memory for the parameter and then copying the contents of the original object into the parameter. In many embedded systems, where memory is at a premium, this could easily overflow the stack – at run-time, where its consequences could be difficult to track.

Strictly, to be explicit you should specify the type of the parameter as a const:


For simple types this is unlikely to add much value; however it may provide some benefit with structures.

If a parameter is passed as a const struct the compiler has the opportunity to perform a lazy evaluation – it passes the address of the structure instead of making a copy.


Note that this optimisation may not be supported by all compilers; or might not occur at all levels of optimisation.

Input-Output parameters

The resolution to the above problem is to explicitly pass a pointer to the structure:


This is clearly more efficient than copying the whole structure. OK, the syntax has got a little messier, but we can live with that.

But hang on: do we still have an Input parameter? Actually, no.

What we’ve got here is an input-output parameter. By passing a ‘raw’ pointer the function can manipulate the caller’s object. To fix this we need to prevent manipulation of the pointed-to object:


Still not quite there, though. What happens below?


Strictly, we should make the pointer itself const to prevent (either accidentally or maliciously) the function manipulating the caller’s object:


This is a very good general rule-of-thumb for functions: make all pointers const.
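The pointer idioms above might look something like the sketch below. The `BigStruct` type and both function names are hypothetical, chosen only to illustrate the two qualifier positions.

```c
#include <stdint.h>

/* Hypothetical large structure, for illustration only. */
typedef struct {
    uint32_t samples[256];
    uint32_t count;
} BigStruct;

/* Input parameter: const pointer to const data.
 * The function can read *in but cannot modify it, nor re-seat the pointer. */
uint32_t sum_samples(const BigStruct * const in)
{
    uint32_t total = 0;
    for (uint32_t i = 0; i < in->count; ++i) {
        total += in->samples[i];
    }
    /* in->count = 0;   would not compile: the pointed-to object is const */
    /* in = 0;          would not compile: the pointer itself is const    */
    return total;
}

/* Input-output parameter: const pointer to non-const data.
 * The function may modify the caller's object, but not re-seat the pointer. */
void clear_samples(BigStruct * const inout)
{
    inout->count = 0;
}
```

Reading the declaration right to left helps: `const BigStruct * const in` is "a const pointer to a BigStruct that is const".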

Output parameters

An output parameter is one that the function can write to, but never read (i.e. write-only). In C the only real mechanism we have for that is the function return value.

Most programmers are happy to return simple types from functions but what about the following code?


Since C performs pass (and return!) by value this would appear very inefficient:


The original object (biiig) is constructed. Then, when makeBigStruct is called space for the return value is allocated. Inside makeBigStruct, temp is allocated. On return temp is copied into the return value then, finally, copied into biiig.

Knowing this, most programmers never return structures from functions; preferring instead to supply them as input-output parameters. However, most modern compilers provide an optimisation which does just this.

Below is the same code but showing the optimisation. Instead of returning the structure the address of the receiving object is (implicitly) passed to the function. At the end of the function the return value is copied into the receiver, negating the need for a temporary return object.


In general, then, it is OK to return a struct from a function by value (unless you’re using an ancient C compiler). If you’re not certain (or your compiler doesn’t support this optimisation) it’s probably safer for you to use input-output parameters instead.
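To make the two options concrete, here is a small sketch. `makeBigStruct` and `temp` follow the names used in the text; the structure layout and `fillBigStruct` are assumptions for illustration.

```c
/* Assumed layout; the original structure definition is not shown. */
typedef struct {
    int data[64];
} BigStruct;

/* Output via return value: relies on the compiler constructing the
 * result directly in the receiver rather than copying it twice. */
BigStruct makeBigStruct(void)
{
    BigStruct temp = { {0} };
    temp.data[0] = 42;
    return temp;
}

/* Alternative: the caller supplies the receiving object explicitly. */
void fillBigStruct(BigStruct * const out)
{
    out->data[0] = 42;
}
```

Both forms leave the caller with the same contents; the second simply makes the hidden pointer of the optimised version visible in the signature.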

Finally, it’s worth noting the small detail that, unlike other languages, a C function can only have one output parameter. You’ll need to use input-output parameters for the rest.

Making the world a better place.

Using these idioms consistently is a very good way to improve the quality of your code. Firstly, it allows the compiler to provide stronger checking on your code. Second, it gives the reader extra information about how to use your functions and what guarantee (or promise) they can expect from them.

You may have noticed I’ve ignored arrays in this article. Check out this blog post for passing arrays to functions.

In summary:



* Or, hokey-pokey if you prefer.

Native C/C++ Application development for the mbed using CMSIS-DAP

April 12th, 2013

If you have been following the Feabhas blog for some time, you may remember that in April of last year I posted about my experiences of using the MQTT protocol. The demonstration code ran on the ARM Cortex-M3 based mbed platform.

For those that are not familiar with the mbed, it is an “Arduino-like” development platform for small microcontroller embedded systems. The variant I’m using is built using an NXP LPC1768 Cortex-M3 device, which offers a plethora of connection options, ranging from simple GPIO, through I2C and SPI, right up to CAN, USB and Ethernet. With a similar conceptual model to Arduino’s, the drivers for all these interfaces are supplied in a well-tested (C++) library. The mbed is connected to a PC via a USB cable (which also powers it), which makes it a great rapid prototyping platform. [I have never been a big fan of the 8-bit Arduino (personal choice, no need to flame me) and have never used the newer ARM Cortex-M based Arduinos, such as the Due.]

However, in its early guise, there were two limitations when targeting an mbed (say compared to the Arduino).

First was the development environment; initially all software development was done through a web-based IDE. This is great for cross-platform support; especially for me being an Apple fanboy. Personally I never had a problem using the online IDE, especially as I am used to using offline environments such as Keil’s uVision, IAR’s Embedded Workbench and Eclipse. Over the years the mbed IDE has evolved and makes life very easy for importing other mbed developers’ libraries and creating your own, and it even has an integrated distributed version control feature. But the need for an Internet connection inhibits the ability to develop code on a long flight or train journey, for example.

Second, the output from the build process is a “.bin” (binary) file, which you save on to the mbed (the PC sees the mbed as a USB storage device). You then press the reset button on the mbed to execute your program. I’m guessing you’re well ahead of me here, but of course that means there are no on-target debug capabilities (breakpoints, single-step, variable and memory viewing, etc.). Now of course one could argue that, as we have a well-defined set of driver libraries, if we followed a Test-Driven-Development (TDD) process we wouldn’t need target debugging (there is support for printf-style debugging via the USB serial mode); but that is a discussion/debate for another session! I would hazard a guess that most embedded developers would prefer at least the option of target-based source code debugging.

Setting up the Cortex-M3/4 (ARMv7-M) Memory Protection Unit (MPU)

February 25th, 2013

An optional part of the ARMv7-M architecture is the support of a Memory Protection Unit (MPU). This is a fairly simplistic device (compared to a fully blown Memory Management Unit (MMU) as found on the Cortex-A family), but if available can be programmed to help capture illegal or dangerous memory accesses.
When first looking at programming the MPU it may seem rather daunting, but in reality it is very straightforward. The added benefit of the ARMv7-M family is the well-defined memory map.
All example code is based around an NXP LPC1768 and the Keil uVision v4.70 development environment. However, as all examples are built using CMSIS, they should work on any Cortex-M3/4 supporting the MPU.
First, let’s take four types of memory access we may want to capture or inhibit:

  1. Trying to read at an address that is reserved in the memory map (i.e. no physical memory of any type there)
  2. Trying to write to Flash/ROM
  3. Stopping areas of memory being accessible
  4. Disable running code located in SRAM (eliminating potential exploit)

Before we start we need to understand the microcontroller’s memory map, so here we can look at the memory map of the NXP LPC1768 as defined in chapter 2 of the LPC17xx User Manual (UM10360).

  • 512kB FLASH @ 0x0000 0000 – 0x0007 FFFF
  • 32kB on-chip SRAM @ 0x1000 0000 – 0x1000 7FFF
  • 8kB boot ROM @ 0x1FFF 0000 – 0x1FFF 1FFF
  • 32kB on-chip SRAM @ 0x2007 C000 [AHB SRAM]
  • GPIO @ 0x2009 C000 – 0x2009 FFFF
  • APB Peripherals  @ 0x4000 0000 – 0x400F FFFF
  • AHB Peripheral @ 0x5000 0000 – 0x501F FFFF
  • Private Peripheral Bus @ 0xE000 0000 – 0xE00F FFFF

Based on the above map we can set up four tests:

  1. Read from location 0x0008 0000 – this is beyond Flash in a reserved area of memory
  2. Write to location 0x0000 4000 – some random location in the flash region
  3. Read the boot ROM at 0x1FFF 0000
  4. Construct a function in SRAM and execute it

The first three tests are pretty easy to set up using pointer indirection, e.g.:

int* test1 = (int*)0x00080000;    // reserved location (just beyond Flash)
x = *test1;                       // try to read from reserved location
int* test2 = (int*)0x00004000;    // flash location
*test2 = x;                       // try to write to flash
int* test3 = (int*)0x1fff0000;    // Boot ROM location
x = *test3;                       // try to read from boot ROM

The fourth takes a little more effort, e.g.

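#include <stdint.h>

// int func(int r0)
// {
//    return r0+1;
// }
uint16_t func[] = { 0x4601, 0x1c48, 0x4770 };     // Thumb opcodes for the function above
typedef int (*funcPtr)(int);                      // pointer to a function taking and returning int

int main(void)
{
   int x = 0;
   funcPtr test4 = (funcPtr)(((uint32_t)func)+1); // setup RAM function (+1 for thumb)
   x = test4(x);                                  // call ram function
}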

Default Behavior

Without the MPU setup the following will happen (output from the previous Fault Handler project):

  • test1 will generate a precise bus error


  • test2 will generate an imprecise bus error


Test3 and test4 will run without any fault being generated.

Setting up the MPU

There are a lot of options when setting up the MPU, but 90% of the time a core set are sufficient. The ARMv7-M MPU supports up to 8 different regions (an address range) that can be individually configured. For each region the core choices are:

  • the start address (e.g. 0x10000000)
  •  the size (e.g. 32kB)
  •  Access permissions (e.g. Read/Write access)
  • Memory type (here we’ll limit to either Normal for Flash/SRAM, Device for NXP peripherals, and Strongly Ordered for the private peripherals)
  • Executable or not (referred to as Execute Never [XN] in MPU speak)

Both access permissions and memory types have many more options than those covered here, but for the majority of cases these will suffice. Here I’m not intending to cover privileged/non-privileged options (don’t worry if that doesn’t make sense, I shall cover it in a later posting).
Based on our previous LPC1768 memory map we could define a region map thus:

No.  Memory             Address       Type      Access Permissions  Size
0    Flash              0x00000000    Normal    Full access, RO    512KB
1    SRAM               0x10000000    Normal    Full access, RW     32KB
2    SRAM               0x2007C000    Normal    Full access, RW     32KB
3    GPIO               0x2009C000    Device    Full access, RW     16KB
4    APB Peripherals    0x40000000    Device    Full access, RW    512KB
5    AHB Peripherals    0x50000000    Device    Full access, RW      2MB
6    PPB                0xE0000000    SO        Full access, RW      1MB

Note that the boot ROM has not been explicitly mapped. This means any access to that region once the MPU has been initialized will get caught as a memory access violation.
To program a region, we need to write to two registers in order:

  • MPU Region Base Address Register (CMSIS: MPU->RBAR)
  • MPU Region Attribute and Size Register (CMSIS: MPU->RASR)

MPU Region Base Address Register

Bits 0..3 specify the region number
Bit 4 needs to be set to make the region valid
Bits 5..31 hold the base address of the region (note the bottom 5 bits are ignored; the base address must also be on a natural boundary, i.e. for a 32kB region the base address must be a multiple of 32kB).

So if we want to program region 1 we would write:

#define VALID 0x10
MPU->RBAR = 0x10000000 | VALID | 1;  // base addr | valid | region no
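The bit layout described above can also be captured in a small helper, which is easy to check on a host before touching hardware. This is only a sketch: `mpu_rbar_value` and the mask names are assumptions and are not part of CMSIS.

```c
#include <stdint.h>

#define RBAR_VALID       (1u << 4)      /* bit 4: region number field is valid */
#define RBAR_REGION_MASK 0x0Fu          /* bits 0..3: region number            */
#define RBAR_ADDR_MASK   0xFFFFFFE0u    /* bits 5..31: region base address     */

/* Build the value to write to the Region Base Address Register. */
uint32_t mpu_rbar_value(uint32_t base_addr, uint32_t region)
{
    return (base_addr & RBAR_ADDR_MASK) | RBAR_VALID | (region & RBAR_REGION_MASK);
}
```

For region 1 at 0x10000000 the helper yields 0x10000011, the same value as the hand-written expression.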

MPU Region Attribute and Size Register

This is slightly more complex, but the key bits are:

bit 0 – Enable the region
bits 1..5 – region size; where size is used as 2**(size+1)
bits 16..21 – Memory type (this is actually divided into 4 separate groups)
bits 24..26 – Access Privilege
bit 28 – XN

So given the following defines:

#define REGION_Enabled  (0x01)
#define REGION_32K      (14 << 1)      // 2**15 == 32k
#define NORMAL          (8 << 16)      // TEX:0b001 S:0b0 C:0b0 B:0b0
#define FULL_ACCESS     (0x03 << 24)   // Privileged Read Write, Unprivileged Read Write
#define NOT_EXEC        (0x01 << 28)   // All Instruction fetches abort

We can configure region 0 thus:
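As a rough sketch of how these defines combine (a host-side helper, not the original listing), the code below builds the RASR value for an enabled, Normal, full-access region and derives the size field from a byte count. The helper names `rasr_size_field` and `rasr_normal_rw` are assumptions; on target the result would be written to MPU->RASR.

```c
#include <stdint.h>

#define REGION_Enabled  (0x01u)
#define NORMAL          (8u << 16)      /* TEX:0b001 S:0b0 C:0b0 B:0b0 */
#define FULL_ACCESS     (0x03u << 24)   /* Privileged RW, Unprivileged RW */

/* SIZE field for a power-of-two region: the hardware uses 2**(SIZE+1) bytes.
 * Returns the field already shifted into bits 1..5.
 * Assumes size_bytes is a power of two, at least 32 and at most 2GB. */
uint32_t rasr_size_field(uint32_t size_bytes)
{
    uint32_t exponent = 0;
    while ((1u << exponent) < size_bytes) {
        ++exponent;
    }
    return (exponent - 1u) << 1;
}

/* RASR value for an enabled, Normal, full-access region of the given size. */
uint32_t rasr_normal_rw(uint32_t size_bytes)
{
    return REGION_Enabled | NORMAL | rasr_size_field(size_bytes) | FULL_ACCESS;
}
```

For a 32kB region the size field comes out as `14 << 1`, matching the REGION_32K define above.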


We can now repeat this for each region, thus:

void lpc1768_mpu_config(void)
{
   /* Disable MPU */
   MPU->CTRL = 0;
   /* Configure region 0 to cover 512KB Flash (Normal, Non-Shared, Executable, Read-only) */
   MPU->RBAR = 0x00000000 | REGION_Valid | 0;
   MPU->RASR = REGION_Enabled | NORMAL | REGION_512K | RO;
   /* Configure region 1 to cover CPU 32KB SRAM (Normal, Non-Shared, Executable, Full Access) */
   MPU->RBAR = 0x10000000 | REGION_Valid | 1;
   /* Configure region 2 to cover AHB 32KB SRAM (Normal, Non-Shared, Executable, Full Access) */
   MPU->RBAR = 0x2007C000 | REGION_Valid | 2;
   /* Configure region 3 to cover 16KB GPIO (Device, Non-Shared, Full Access Device, Full Access) */
   MPU->RBAR = 0x2009C000 | REGION_Valid | 3;
   /* Configure region 4 to cover 512KB APB Peripherals (Device, Non-Shared, Full Access Device, Full Access) */
   MPU->RBAR = 0x40000000 | REGION_Valid | 4;
   /* Configure region 5 to cover 2MB AHB Peripherals (Device, Non-Shared, Full Access Device, Full Access) */
   MPU->RBAR = 0x50000000 | REGION_Valid | 5;
   /* Configure region 6 to cover the 1MB PPB (Privileged, XN, Read-Write) */
   MPU->RBAR = 0xE0000000 | REGION_Valid | 6;
   /* Enable MPU */
   MPU->CTRL = 1;
   /* Make sure the enable has completed and the pipeline is flushed */
   __DSB();
   __ISB();
}

After the MPU has been enabled, ISB and DSB barrier calls have been added to ensure that the pipeline is flushed and no further operations are executed until the memory access that enables the MPU completes.

Using the Keil environment, we can examine the MPU configuration:


Rerunning the tests with MPU enabled

To get useful output we can develop a memory fault handler, building on the Hard Fault handler, e.g.

void printMemoryManagementErrorMsg(uint32_t CFSRValue)
{
   printErrorMsg("Memory Management fault: ");
   CFSRValue &= 0x000000FF; // mask just mem faults
   if((CFSRValue & (1<<5)) != 0) {
      printErrorMsg("A MemManage fault occurred during FP lazy state preservation\n");
   }
   if((CFSRValue & (1<<4)) != 0) {
      printErrorMsg("A derived MemManage fault occurred on exception entry\n");
   }
   if((CFSRValue & (1<<3)) != 0) {
      printErrorMsg("A derived MemManage fault occurred on exception return.\n");
   }
   if((CFSRValue & (1<<1)) != 0) {
      printErrorMsg("Data access violation.\n");
   }
   if((CFSRValue & (1<<0)) != 0) {
      printErrorMsg("MPU or Execute Never (XN) default memory map access violation\n");
   }
   if((CFSRValue & (1<<7)) != 0) {   // MMFAR holds a valid fault address
      static char msg[80];
      sprintf(msg, "SCB->MMFAR = 0x%08x\n", (unsigned int)SCB->MMFAR);
      printErrorMsg(msg);
   }
}

Test 1 – Reading undefined region

Rerunning test one with the MPU enabled gives the following output:


SCB->MMFAR contains the address of the memory access that caused the violation, and the PC guides us towards the offending instruction.


Test 2 – Writing to RO defined region



Test 3 – Reading Undefined Region (where memory exists)



Test 4 – Executing code in XN marked Region


The PC gives us the location of the code (in SRAM) that the core attempted to execute.


The LR indicates the code from which the branch was made.


So, we can see with a small amount of programming we can (a) simplify debugging by quickly being able to establish the offending opcode/memory access, and (b) better defend our code against accidental/malicious access.

Optimizing the MPU programming.

One useful feature of the Cortex-M3/4 MPU is that the Region Base Address Register and Region Attribute and Size Register are aliased three further times. This means up to 4 regions can be programmed at once using a memcpy. So instead of the repeated writes to RBAR and RASR, we can create configuration tables and initialize the MPU using a simple memcpy, thus:

#include <string.h>   /* memcpy */

/* Layout expected by the alias registers: RBAR, RASR, RBAR, RASR, ... */
uint32_t table1[] = {
/* Configure region 0 to cover 512KB Flash (Normal, Non-Shared, Executable, Read-only) */
(0x00000000 | REGION_Valid | 0),
/* Configure region 1 to cover CPU 32KB SRAM (Normal, Non-Shared, Executable, Full Access) */
(0x10000000 | REGION_Valid | 1),
/* Configure region 2 to cover AHB 32KB SRAM (Normal, Non-Shared, Executable, Full Access) */
(0x2007C000 | REGION_Valid | 2),
/* Configure region 3 to cover 16KB GPIO (Device, Non-Shared, Full Access Device, Full Access) */
(0x2009C000 | REGION_Valid | 3),
};

uint32_t table2[] = {
/* Configure region 4 to cover 512KB APB Peripherals (Device, Non-Shared, Full Access Device, Full Access) */
(0x40000000 | REGION_Valid | 4),
/* Configure region 5 to cover 2MB AHB Peripherals (Device, Non-Shared, Full Access Device, Full Access) */
(0x50000000 | REGION_Valid | 5),
/* Configure region 6 to cover the 1MB PPB (Privileged, XN, Read-Write) */
(0xE0000000 | REGION_Valid | 6),
};

void lpc1768_mpu_config_tbl(void)
{
   /* Disable MPU */
   MPU->CTRL = 0;
   memcpy((void*)&( MPU->RBAR), table1, sizeof(table1));
   memcpy((void*)&( MPU->RBAR), table2, sizeof(table2));
   /* Enable MPU */
   MPU->CTRL = 1;
}
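Because the alias window is laid out as RBAR, RASR, RBAR_A1, RASR_A1 and so on, a table written into it needs the two values for each region interleaved. The sketch below shows that layout for the first four regions. The attribute choices follow the comments in the listing above, but the macro values for sizes, `RO` and `DEVICE_NONSHARED`, and the `mpu_load_pairs` helper, are my assumptions rather than the original source; a word-by-word volatile copy is used in place of memcpy so it can be exercised on a host with a plain array.

```c
#include <stddef.h>
#include <stdint.h>

#define REGION_Valid     (0x10u)
#define REGION_Enabled   (0x01u)
#define REGION_16K       (13u << 1)      /* 2**14 */
#define REGION_32K       (14u << 1)      /* 2**15 */
#define REGION_512K      (18u << 1)      /* 2**19 */
#define NORMAL           (8u << 16)      /* TEX:0b001 S:0 C:0 B:0 */
#define DEVICE_NONSHARED (0x10u << 16)   /* TEX:0b010 S:0 C:0 B:0 (assumed name) */
#define FULL_ACCESS      (0x03u << 24)   /* Privileged RW, Unprivileged RW */
#define RO               (0x06u << 24)   /* Privileged RO, Unprivileged RO (assumed value) */

/* RBAR/RASR values interleaved, matching the register alias layout. */
static const uint32_t region_pairs[] = {
    (0x00000000u | REGION_Valid | 0u), (REGION_Enabled | NORMAL | REGION_512K | RO),
    (0x10000000u | REGION_Valid | 1u), (REGION_Enabled | NORMAL | REGION_32K | FULL_ACCESS),
    (0x2007C000u | REGION_Valid | 2u), (REGION_Enabled | NORMAL | REGION_32K | FULL_ACCESS),
    (0x2009C000u | REGION_Valid | 3u), (REGION_Enabled | DEVICE_NONSHARED | REGION_16K | FULL_ACCESS),
};

/* Copy n_words 32-bit values into the window starting at RBAR. */
void mpu_load_pairs(volatile uint32_t *rbar, const uint32_t *pairs, size_t n_words)
{
    for (size_t i = 0; i < n_words; ++i) {
        rbar[i] = pairs[i];
    }
}
```

On target the destination would be the address of MPU->RBAR; on a host, an eight-word array stands in for the register window.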

I hope this is enough to get you started with your ARMv7-M MPU.

Weak linkage in C programming

January 25th, 2013

When linking C programs there are (in general) only a couple of errors you’re likely to see. If, for example, you have two functions in different files, both with external linkage, then the files will compile okay, but when you link you’ll likely see an error along these lines:

weak_linkage.axf: Error: L6200E: Symbol foo multiply defined (by foo.o and foo2.o).
Target not created

Most of the time this makes sense and is as expected; however there is a particular instance where it gets in the way.

If we need to supply a code framework where we need placeholders (stubs) for someone else to fill in at a later date, it can sometimes mean developing complex makefiles and/or conditional compilation to allow new code to be introduced as seamlessly as possible.

However, there is a hidden gem supported by most linkers called “weak linkage”. The principle of weak linkage is that you can define a function and tag it as (surprisingly) weak, e.g.

// foo_weak.c
__weak int foo(void)
{
   // ...
   return 1;
}

This then can be called from the main application:

// main.c
int foo(void);

int main(void)
{
   foo();
   return 0;
}

This project can be built as normal:

compiling main.c…
compiling foo_weak.c…
Program Size: Code=372 RO-data=224 RW-data=4 ZI-data=4196
“weak_linkage.axf” – 0 Error(s), 0 Warning(s).

At some time later we can add another file with the same function signature to the project

// foo.c
int foo(void)
{
   // override weak function
   return 2;
}

If we rebuild, normally we would get the “multiply defined” symbols error, however with weak linkage the linker will now bind the new “strong” function to the call in main.

compiling main.c…
compiling foo_weak.c…
compiling foo.c…
Program Size: Code=372 RO-data=224 RW-data=4 ZI-data=4196
“weak_linkage.axf” – 0 Error(s), 0 Warning(s).

As you can also see, the weak function is optimized away.

A good example of the use of weak linkage is the definition of the default interrupt handlers in CMSIS.

This example code is based on Keil’s uVision v4.60 compiler/linker, however both GCC and IAR also support weak linkage.
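With GCC (and Clang) the same idea is spelled with a function attribute rather than the `__weak` keyword. A minimal single-file sketch follows; `call_foo` is a hypothetical helper added only so the example can be exercised.

```c
/* Default (weak) definition; a non-weak foo() in another object file
 * would replace it at link time without a "multiply defined" error. */
__attribute__((weak)) int foo(void)
{
    return 1;
}

/* Calls whichever definition of foo() the linker selected. */
int call_foo(void)
{
    return foo();
}
```

Built on its own, this file uses the weak definition; adding a second file with an ordinary `int foo(void)` would switch the call to that one.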

The C build process

June 29th, 2012

Glennan Carnie


In this article we look at the C build process – that is, how we get from C source files to executable code, programmed on the target.  It wasn’t so long ago this was common knowledge (the halcyon days of the hand-crafted makefile!) but modern IDEs are making this knowledge ever-more arcane.


The first stage of the build process is compilation.


The compiler is responsible for allocating memory for definitions (static and automatic) and generating opcodes from program statements. A relocatable object file (.o) is produced.  The assembler also produces .o files from assembly-language source.

The compiler works with one translation unit at a time.  A translation unit is a .c file that has passed through the pre-processor.

The compiler and assembler create relocatable object files (.o)

A Librarian facility may be used to take the object files and combine them into a library file.

Compilation stages

Compilation is a multi-stage process; each stage working with the output of the previous.  The Compiler itself is normally broken down into three parts:

  • The front end, responsible for parsing the source code
  • The middle end, responsible for optimisation
  • The back end, responsible for code generation

Front End Processing:


The pre-processor parses the source code file and evaluates pre-processor directives (starting with a #) – for example #define.  A typical function of the pre-processor is to #include function / type declarations from header files.  The input to the pre-processor is known as a pre-processed translation unit; the output from the pre-processor is a post-processed translation unit.

Whitespace removal

C ignores whitespace so the first stage of processing the translation unit is to strip out all whitespace.


A C program is made up of tokens.  A token may be

  • a keyword (for example ‘while’)
  • an operator (for example, ‘*’)
  • an identifier (for example, a variable name)
  • a literal (for example, 10 or “my string”)
  • a comment (which is discarded at this point)

Syntax analysis

Syntax analysis ensures that tokens are organised in the correct way, according to the rules of the language.  If not, the compiler will produce a syntax error at this point.  The output of syntax analysis is a data structure known as a parse tree.

Intermediate Representation

The output from the compiler front end is a functionally equivalent program expressed in some machine-independent form known as an Intermediate Representation (IR).  The IR program is generated from the parse tree.

IR allows the compiler vendor to support multiple different languages (for example C and C++) on multiple targets without having n * m combinations of toolchain.   

There are several IRs in use, for example Gimple, used by GCC.  IRs are typically in the form of an Abstract Syntax Tree (AST) or pseudo-code.

Middle End Processing:

Semantic analysis

Semantic analysis adds further semantic information to the IR AST and performs checks on the logical structure of the program.  The type and amount of semantic analysis performed varies from compiler to compiler but most modern compilers are able to detect potential problems such as unused variables, uninitialized variables,  etc.  Any problems found at this stage are normally presented as warnings, rather than errors.

It is normally at this stage the program symbol table is constructed, and any debug information inserted.


Optimisation

Optimisation transforms the code into a functionally-equivalent, but smaller or faster form.  Optimisation is usually a multi-level process.  Common optimisations include inline expansion of functions, dead code removal, loop unrolling, register allocation, etc.

Back End Processing:

Code generation

Code generation converts the optimised IR code structure into native opcodes for the target platform.

Memory allocation

The C compiler allocates memory for code and data in Sections.  Each section contains a different type of information.  Sections may be identified by name and/or with attributes that identify the type of information contained within.  This attribute information is used by the Linker for locating sections in memory (see later).


Opcodes generated by the compiler are stored in their own memory section, typically known as .code or  .text


Static data

The static data region is actually subdivided into two further sections:

  • one for uninitialized-definitions (int iVar1;).
  • one for initialised-definitions (int iVar2 = 10;)

So it would not be unexpected for the address of iVar1 and iVar2 to not be adjacent to each other in memory.

The uninitialized-definitions’ section is commonly known as the .bss or ZI section. The initialised-definitions’ section is commonly known as the .data or RW section. 
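As a small illustration of the split, using the two definitions from the list above (the accessor functions are hypothetical, added only so the values can be checked):

```c
int iVar1;          /* uninitialised definition: typically placed in .bss */
int iVar2 = 10;     /* initialised definition: typically placed in .data  */

/* Objects in .bss are zeroed before main() runs. */
int read_iVar1(void) { return iVar1; }

/* Objects in .data start with their initialiser value. */
int read_iVar2(void) { return iVar2; }
```

Where each object actually ends up can be confirmed from the linker map file or with a symbol-listing tool such as `nm`.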



Constants

Constants may come in two forms:

  • User-defined constant objects (for example const int c;)
  • Literals (‘magic numbers’, macro definitions or strings)

The traditional C model places user-defined const objects in the .data section, along with non-const statics (so they may not be truly constant – this is why C does not allow a const integer to be used as the size of a static array, for example).
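The distinction between a const object and a true compile-time constant can be sketched as follows. The names are illustrative; the commented-out line is the one standard C rejects.

```c
#include <stddef.h>

const int limit = 16;             /* a const object: read-only, but still an object in memory */
enum { BUFFER_SIZE = 16 };        /* a true compile-time constant */

static char buffer[BUFFER_SIZE];  /* OK: BUFFER_SIZE is a constant expression */
/* static char bad[limit]; */     /* not valid standard C: limit is not a constant expression */

/* Size of the statically allocated buffer, in bytes. */
size_t buffer_size(void)
{
    return sizeof buffer;
}
```

An enumeration constant (or a macro) is the usual workaround when a named size is needed for a static array.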

Literals are commonly placed in the .text / .code section.  Most compilers will optimise numeric literals away and use their values directly where possible.

Many modern C toolchains support a separate .const / .rodata section specifically for constant values.  This section can be placed (in ROM) separate from the .data section.  Strictly, this is a toolchain extension.


Automatic variables

The majority of variables are defined within functions and classed as automatic variables. This also includes parameters and any temporary-returned-object (TRO) from a non-void function.
The default model in general programming is that the memory for these program objects is allocated from the stack. For parameters and TROs the memory is normally allocated by the calling function (by pushing values onto the stack), whereas for local objects, memory is allocated once the function is called. This key feature enables a function to call itself – recursion (though recursion is generally a bad idea in embedded programming as it may cause stack-overflow problems). In this model, automatic memory is reclaimed by popping the stack on function exit.

It is important to note that the compiler does NOT create a .stack segment.  Instead, opcodes are generated that access memory relative to some register, the Stack Pointer, which is configured at program start-up to point to the top of the stack segment (see below).

However, on most modern microcontrollers, especially 32-bit RISC architectures, automatics are stored in scratch registers, where possible, rather than the stack. For example the ARM Architecture Procedure Call Standard (AAPCS) defines which CPU registers are used for function call arguments into, and results from, a function and local variables.


Dynamic data

Memory for dynamic objects is allocated from a section known as the Heap.  As with the Stack, the Heap is not allocated by the compiler at compile time but by the Linker at link-time.


Object files

The compiler produces relocatable object files – .o files.
The object file contains the compiled source code – opcodes and data sections.  Note that the object file only contains the sections for static variables.  At this stage, section locations are not fixed.

The .o file is not (yet) executable because, although some items are set in concrete (for example: instruction opcodes, pc-relative addresses, “immediate” constants, etc.), static and global addresses are known only as offsets from the starts of their relevant sections. Also, addresses defined in other modules are not known at all, except by name.  The object file contains two tables -  Imports and Exports:

  • Exports contains any extern identifiers defined within this translation unit (so no statics!)
  • Imports contains any identifiers declared (and used) within the translation; but not defined within it.
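A short sketch of how linkage maps onto the two tables; all identifiers here are hypothetical.

```c
static int call_count;          /* internal linkage: appears in neither table         */
int last_result;                /* external linkage, defined here: an Export          */
extern int board_revision;      /* declared only: becomes an Import once it is used   */

/* External linkage, defined here: an Export. */
int record_result(int value)
{
    ++call_count;
    last_result = value;
    return call_count;
}
```

Since `board_revision` is declared but never used in this translation unit, it places no demand on the Linker; referencing it would add it to Imports.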

Note the identifier names are in name-mangled form.



Linking

The Linker combines the (compiled) object files into a single executable program.  In order to do that it must perform a number of tasks.


Symbol resolution

The primary function of the Linker (from whence it derives its name) is to resolve references between object files – that is, to ensure each symbol defined by the program has a unique address.

If any references remain unresolved, all specified library/archive (.a) files are searched and the appropriate modules are gathered in order to resolve those references.  This is an iterative process.  If, after this, the Linker still cannot resolve a symbol it will report an ‘unresolved reference’ error.

Similarly, C specifies a ‘one-definition rule’ – that is, each symbol must have a unique and unambiguous address.  If the Linker finds the same symbol defined in two object files it will report a ‘redefinition’ error (be careful, though – some older C compilers assume that the same symbol defined in two translation units must refer to the same object!)

Section concatenation

The Linker then concatenates like-named sections from the input object files.
The combined sections (output sections) are usually given the same names as their input sections.  Program addresses are adjusted to take account of the concatenation.

Section location

To be executable, code and data sections must be located at absolute addresses in memory.  Each section is given an absolute address in memory.  This can be done on a section-by-section basis but more commonly sections are concatenated from some base address.  Normally there is one base address in non-volatile memory for persistent sections (for example code) and one address in volatile memory for non-persistent sections (for example the Stack).

Data initialisation

On an embedded system any initialised data must be stored in non-volatile memory (Flash / ROM).  On startup any non-const data must be copied to RAM.  It is also very common to copy read-only sections like code to RAM to speed up execution (not shown in this example).
In order to achieve this the Linker must create extra sections to enable copying from ROM to RAM. Each section that is to be initialized by copying is divided into two, one for the ROM part (the initialisation section) and one for the RAM part (the run-time location).  The initialisation section generated by the Linker is commonly called a shadow data section – .sdata in our example (although it may have other names).

If manual initialization is not used, the linker also arranges for the startup code to perform the initialization.

The .bss section is also located in RAM but does not have a shadow copy in ROM.  A shadow copy is unnecessary, since the .bss section contains only zeroes.  This section can be initialised algorithmically as part of the startup code.
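The copy-and-zero steps described above can be sketched in C. Treat this as an illustration under stated assumptions rather than real startup code: a genuine implementation obtains the section boundaries from linker-defined symbols whose names vary between toolchains, so here the boundaries are simply passed in as pointers, and both function names are invented.

```c
#include <stdint.h>

/* Hypothetical helper: copy the initialisation image held in ROM to the
   run-time location of .data in RAM. Real startup code takes the three
   addresses from linker-defined symbols, not from parameters. */
static void init_data(const uint32_t *rom_image,
                      uint32_t *ram_start, uint32_t *ram_end)
{
    while (ram_start < ram_end) {
        *ram_start++ = *rom_image++;
    }
}

/* Hypothetical helper: .bss has no image in ROM, so it is simply
   filled with zeroes. */
static void zero_bss(uint32_t *bss_start, uint32_t *bss_end)
{
    while (bss_start < bss_end) {
        *bss_start++ = 0u;
    }
}
```

Both loops must run before main() is entered, otherwise initialised and zero-initialised objects would hold whatever happened to be in RAM at power-up.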

Linker control

The detailed operation of the linker can be controlled by invocation (command-line) options or by a Linker Control File (LCF). 

You may know this file by another name such as linker-script file, linker configuration file or even scatter-loading description file. The LCF file defines the physical memory layout (Flash/SRAM) and placement of the different program regions.  LCF syntax is highly compiler-dependent, so each will have its own format; although the role performed by the LCF is largely the same in all cases.

When an IDE is used, these options can usually be specified in a relatively friendly way.  The IDE then generates the necessary script and invocation options.


The most important thing to control is where the final memory sections are located.  The hardware memory layout must obviously be respected – for most processors, certain things must be in specific places.

Secondly, the LCF specifies the size and location of the Stack and Heap (if dynamic memory is used). It is common practice to locate the Stack and Heap with the Heap at the lower address in RAM and the Stack at a higher address to minimise the potential for the two areas overlapping (remember, the Heap grows up the memory and the Stack grows down) and corrupting each other at run-time.

The linker configuration file shown above leads to a fairly typical memory layout shown here.

  • .cstartup – the system boot code – is explicitly located at the start of Flash.
  • .text and .rodata are located in Flash, since they need to be persistent
  • .stack and .heap are located in RAM.
  • .bss is located in RAM in this case but is (probably) empty at this point.  It will be initialised to zero at start-up.
  • The .data section is located in RAM (for run-time) but its initialisation section, .sdata, is in ROM.



The Linker will perform checks to ensure that your code and data sections will fit into the designated regions of memory.

The output from the locating process is a load file in a platform-independent format, commonly ELF, usually with DWARF debugging information embedded (although there are many other formats).

The ELF file is also used by the debugger when performing source-code debugging.



ELF is a target-independent output file format (DWARF is the debugging information commonly carried inside it).  In order to be loaded onto the target the ELF file must be converted into a native flash / PROM format (typically, .bin or .hex).



Key points

  • The compiler translates source code files into opcodes and data allocations, producing an object file.
  • The compiler works on a single translation unit at a time.
  • The linker concatenates object files and library files to create a program
  • The linker is responsible for allocating stack and free store sections
  • The linker operation is controlled by a configuration file, unique to the target system.
  • Linked files must be translated to a target-dependent format for loading onto the target.

enum ; past, present and future

June 15th, 2011

The enumerated type (enum) is probably one of the simplest and most underused features of C and C++, and one which can make code safer and more readable without compromising performance.

In this posting we shall look at the basic enum from C, how C++ improved on C’s enum, and how C++0X will make them a first class type.

Often I see headers filled with lists of #defines where an enum would be a much better choice. Here is a classic example:

/* adc.h */
#define ADC_Channel_0                               (0x00) 
#define ADC_Channel_1                               (0x01) 
#define ADC_Channel_2                               (0x02) 
#define ADC_Channel_3                               (0x03) 
#define ADC_Channel_4                               (0x04) 
#define ADC_Channel_5                               (0x05) 
#define ADC_Channel_6                               (0x06) 
#define ADC_Channel_7                               (0x07) 
#define ADC_Channel_8                               (0x08) 
#define ADC_Channel_9                               (0x09) 
#define ADC_Channel_10                              (0x0A) 
#define ADC_Channel_11                              (0x0B) 
#define ADC_Channel_12                              (0x0C) 
#define ADC_Channel_13                              (0x0D) 
#define ADC_Channel_14                              (0x0E) 
#define ADC_Channel_15                              (0x0F) 

which probably would be better re-written as:

enum ADC_Channel_no {

Before getting onto the advantages and disadvantages of enums, let’s have a quick review.

Read more »

Declarations and Definitions in C

January 18th, 2010


Please Note: This post focuses on pre-C99. The reason is that it is aimed at the embedded C programmer, who tends to be working with pre-C99 based cross-compilers. Also, I have split it into two as, due to feedback, it became much larger than first anticipated.

On the surface declarations and definitions in C are pretty straight-forward; but once we start introducing the concepts of scope, storage-duration, linkage and namespace life is not so simple.

Program Objects (Variables)

Let’s start with a general rule for variables:
  1. if the statement has an “=” it’s a definition
  2. otherwise, if it has “extern” and no “=” it’s a declaration
  3. otherwise it’s a tentative definition that may become a declaration or an actual definition

Object Definitions

Simply put, a definition allocates storage (memory) e.g.
int ev = 20; /* definition – reserves enough memory to hold an int */
Let’s assume from here-on that an int occupies 32-bits.

Object Declaration

A declaration gives meaning to an identifier; that is, it defines the type information of the identifier. This allows the compiler to generate correct object code to access the variable based on its size (i.e. the number of bytes to read or write).


When compiling a source file, a variable must be declared before it is used or it will result in a compiler error.

int main(void)
{
   ev = 10; /* fails to compile as ev has not been declared */
   return 0;
}

int ev = 20; /* definition – allocates 32-bits */
Importantly, an object declaration does not reserve memory. e.g.
extern int ev; /* declaration – no memory reserved but defines sizeof(ev) */
int main(void)
{
   ev = 10; /* okay to use ev as declared; the compiler knows to write (say) 32-bits */
   return 0;
}

int ev = 20; /* definition – memory reserved here and initialised */
Key point 1:
If no declaration is encountered before the definition, then the definition acts as an implicit declaration.
int ev = 20; /* definition and implicit-declaration: reserves memory */
int main(void)
{
   ev = 10; /* okay to use ev as declared (implicitly) */
   return 0;
}
Key point 2:
In a compiled source file there may be only one definition for an identifier, but there may be multiple declarations (as long as they agree).
extern int ev; /* 1st declaration */
extern int ev; /* 2nd declaration */
int main(void)
{
   ev = 10; /* okay to use ev as declared */
   return 0;
}

int ev = 20; /* definition */
In the examples so far, all definitions have included an initialisation and all declarations have used the “extern” keyword. But there is one further concept we need to examine and that is the concept of a tentative definition (this only applies to variables defined outside of functions – more on that later). Take, for example, the following program snippet:

int ev = 20; /* actual definition    */
int td;      /* tentative definition */
int main(void) { ... return 0; }
With a tentative definition, the following rule applies:

If an actual definition is found later in the source file, then the tentative definition just acts as a declaration. If the end of the source file is reached and no actual definition is found, then the tentative definition acts as an actual definition (and implicit declaration) with an initialisation of 0 (zero).

int ev; /* tentative definition becomes declaration */
int td; /* tentative definition becomes actual definition initialised to 0 */
int main(void)
{
   return 0;
}

int ev = 20; /* actual definition */
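A minimal sketch of the same rule in one source file (the accessor function is invented purely so the values can be observed). It also shows that repeating a tentative definition is harmless, since each one only declares the object until the end of the file is reached.

```c
int td;        /* tentative definition */
int td;        /* a second one: legal, still only a declaration so far */

int ev;        /* tentative definition ... */
int ev = 20;   /* ... which this actual definition turns into a declaration */

/* Hypothetical accessor, used only to observe the two objects. */
int sum_of_objects(void)
{
    /* td never receives an actual definition, so it behaves as if it
       were defined at the end of the file with an initial value of 0 */
    return td + ev;
}
```

Note that this is a C-only idiom; a C++ compiler rejects the repeated definitions.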
I’d like to address two more syntactical items before we move on. First, it is perfectly legal to write:
 extern int ev = 20; /* actual-definition */

I’m sure someone can (and will) tell me why this is useful, but in my 25 years of doing C I’ve never had need to use it. In my view anyone found doing this should be made to sit in the corner wearing a hat with a big ‘D’ on it!

Second, it is highly unusual (so unusual that I’ve never seen it used), but the following is also legal syntax:
 extern int(ev);
 int(ev) = 20;
Before we start looking at such items as scope and linkage let’s address function declarations and definitions.


Function declarations and definitions are in many ways simpler than variables. A function definition includes the function’s body. e.g.

void f(int p) /* definition and implicit-declaration */
{
   // ...
}

int main(void)
{
   f(10); /* okay to call f as declared */
   return 0;
}

A function’s declaration (typically called its prototype) makes the compiler aware there is a valid function with this identifier. e.g.

void f(int p); /* declaration */

int main(void)
{
   f(10); /* okay to call f as declared */
   return 0;
}

void f(int p) /* definition */
{
   // ...
}
On the call to the function “f” in main, the declaration enables the compiler to construct the correct call frame based on three things:
  1. the validity of the identifier
  2. the storage required to pass any parameters (by stack or register)
  3. the storage required for any return information
At the call, the names of function parameters, if any, are irrelevant (to the compiler), so can be omitted from the declaration, e.g.
void f(int); /* declaration */
Also, it is not illegal for parameter names to differ between the declaration and the definition (but it is obviously very bad practice).
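As a small sketch of both points (the function scale and its parameter names are invented for illustration), the following three lines all refer to the same function and compile together without complaint:

```c
int scale(int);          /* declaration: parameter name omitted */
int scale(int value);    /* redeclaration: a name, purely for the reader */

int scale(int v)         /* definition: a different name again */
{
    return v * 2;
}
```

Only the name used in the definition matters inside the function body; the names in the declarations are documentation.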

Before we move on, there are two problem areas we need to cover. First, let’s look at the following snippet:

int main()
{
   f(20); /* call f with no declaration */
   return 0;
}

void f(int i) /* definition and implicit-declaration */
{
   // ...
}

Here we are trying to call a function that hasn’t been declared. As probably expected, this code fails to compile, but not for the reason you probably assume. Earlier I stated that an identifier must be declared before being used otherwise you get a compiler error. Unfortunately this only applies to variables and not functions!

With functions, if no declaration is found before the first call, the compiler creates an implicit declaration. As it cannot determine the return type, it assumes an int return type. So for the call
f(20);
the compiler assumes a declaration of
int f();
The compiler error will actually occur at the definition of function “f” due to the implicit-declaration and definition not agreeing (as the definition is void f(int)). The parts being compared are officially called the function designator. As the two designators don’t match the compiler will generate an error of the form:

error: ‘f’ : redefinition; different basic types

If we change f’s return type to int, then this code will compile quite happily.

int main(void)
{
   f(20); /* call f implicit-designator of int f() */
   return 0;
}

int f(int i) /* definition’s designator matches implicit-designator */
{
   // ...
}
Why int as the return type? This is historical baggage. The original specification of C by Kernighan & Ritchie states, regarding function return types:
If the return type is omitted, int is assumed.

This baggage is still evident today, as the following code should compile successfully:

int main()
{
   f(20); /* call f implicit-designator of int f() */
   return 0;
}

f(int i) /* definition’s designator has implicit return type of int */
{
   // ...
}
Horrible? Yes (and it’s going to get worse) but all is not lost – any modern compiler worth its salt will issue a warning similar to:

warning: 'f' undefined; assuming extern returning int

Never ignore this warning. Some compilers (such as IAR) allow a non-standard extension requiring function prototypes. Note that C++ also requires prototypes, thus closing this loophole.

Can it get worse? Oh yes, much worse.

A very common mistake is for C programmers to assume that an empty parameter list means the same as void in the parameter list. Unfortunately, in some cases it does and in others it doesn’t.

With a function definition, an empty parameter list is the same as void.

void f()       /* definition and implicit-decln of void f(void) */
{
   // ...
}

int main()
{
   f(20);       /* error as call doesn’t match decln */
   return 0;
}
However (and here it comes) for declarations this isn’t the case.
void f();      /* declaration */
void f(void);  /* prototype-declaration – not the same as above */
If a declaration has a parameter list (including void) then it becomes a prototype-declaration. The empty list in a function declarator specifies that no information about the number or types of the parameters is supplied. This has a horrible implication; take for example the following code:

void f(); /* declaration */
int main(void)
{
   f(20); /* okay to call f as declared */
   return 0;
}

void f(int i) /* definition */
{
   // ...
}
This is perfectly legal C code, which will compile and run quite happily. The standard states that the number and types of arguments are not compared with those of the parameters in a function definition that does not include a function prototype (I know, I know, but please don’t shoot the messenger). Simply put, if there is an empty parameter list the compiler assumes that arguments to the call are correct, e.g.

void f(); /* declaration */
int main(void)
{
   f(20); /* okay to call f as declared!!! */
   return 0;
}

void f(void) /* definition */
{
   // ...
}
So what happens above? Well, the standard states that if the number of arguments does not agree with the number of parameters, the behaviour is undefined. In many cases with embedded systems, this actually won’t cause a major problem. On many modern microcontroller architectures (e.g. ARM) arguments are passed in registers. Only once the compiler starts using the stack to pass arguments will problems ensue.

Guideline: For all functions, always supply a function prototype.
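A small sketch of what following the guideline buys you (the functions half and half_of_three are invented for illustration). With a prototype in scope the compiler not only checks the number of arguments, it also converts each argument to the declared parameter type:

```c
double half(double value);   /* prototype: the parameter type is known */

/* Hypothetical caller, for illustration only. */
double half_of_three(void)
{
    /* The int argument 3 is converted to double because the prototype
       is in scope; with only "double half();" the int would be passed
       unconverted and the behaviour would be undefined. */
    return half(3);
}

double half(double value)
{
    return value / 2.0;
}
```

A call such as half(1, 2) is now rejected at compile time instead of quietly producing undefined behaviour.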

So hopefully that lays the groundwork for declarations and definitions; we can now start addressing the concepts of scope, storage-duration, linkage and namespace.


void f()      /* definition and implicit-decln of void f(void) */
{
   // ...
}

int main(void)
{
   f(20);       /* error as call doesn’t match decln */
   return 0;
}
Microsoft compiler bug – this code should fail to compile. Microsoft compiles, whereas both IAR and Keil fail.

Unscrambling C Declarations

December 9th, 2009


Note: Based on some feedback I should clarify that this does not cover C99 syntax

Even though the C programming language has been around since the early 1970s, many programmers still have trouble understanding how C declarations are formed. This is not surprising, given the complexity that can arise when mixing pointer, array and function-pointer declarations.

In this posting we shall look at some complex declarations to try and understand them by considering how they are formed. The intent is not so you can go off and write wonderfully complex declarations, but more hopefully you may actually be able to understand someone else’s code. Finally we shall look at how most complex declarations can be easily simplified.
Here I’m going to focus on object declarations/definitions rather than functions. Also, in this posting I’m not going to examine structure, union or enumeration specifiers. They’ll keep for another day.
How to read a declaration
Very simple ones (specifically those not involving “[]” or “()”) can be read from right-to-left, e.g.
int x
where ‘x’ is an (identifier for an) integer. However, this approach starts to break down very quickly, e.g.
int a[10]
Therefore a more sophisticated approach is needed for complex declarations because of precedence and associativity rules that apply to the differing symbols in the declaration.

Before building a rule-set there are a number of things we can exclude:
  1. A function cannot return a function – () foo()
  2. A function cannot return an array – [] foo ()
  3. An array cannot hold functions – foo[]()
Let’s start with some simple examples:
int x         x is an integer
This can give us:
Rule 1: Read from left to right looking for an identifier.
So ignore types (int, char, etc.), qualifiers (e.g. const, volatile) and the symbols ‘()’, ‘[]’ and ‘*’ until you find the first unique identifier. This is the identifier for the declaration.
Building on this, once the identifier is found we look for either array or function notation, e.g.
int a[10]            a is an array of (ten) integers
void x(int y)    x is a function that takes an integer parameter (y) and returns nothing (void)

Rule 2:    look right from the identifier for postfix operators () or []. If [] then it is an array, else if () then it is a function.

Next we introduce pointer notation:
int * x      x is a pointer to an integer

Rule 3:    look left for prefix pointer asterisk ‘*’. If found the identifier is a pointer.

Finally we can introduce type qualifiers (const / volatile), e.g.
const int x     x is an integer constant

Rule 4:    If a const and/or volatile is next to a type specifier (int, long, etc.) it applies to that specifier

So that gives us a preliminary set of 4 rules.
These hold for the following declarations:
int const x      x is a constant integer (This is identical to the previous declaration. This is part of the confusing syntax of the C programming language, but Rule 4 still applies).
const int * x      x is a pointer to a constant integer. Rule 3 followed by Rule 4
int const * x       x is a pointer to a constant integer (as above – still confused?)
int * x[10]          x is an array of pointers to integers ( Rule 2, Rule 3)
int * x(void)     x is a function that returns a pointer to an integer (Rule 2, Rule 3)
int **x                 x is a pointer to a pointer to an integer (Rule 3, Rule 3)

So far so good? Pretty straightforward? Maybe not the pointer-to-a-pointer, but we still need to add two further rules. The first affects Rule 4. What if we have a const that is not next to the type, as in:
int * const x
Here we need a new rule, which we’ll call Rule 4b (with our previous Rule 4 becoming 4a):   

Rule 4b: if a const and/or volatile is not next to a type then it applies to the pointer asterisk on its immediate left

int * const x      x is a constant pointer to an integer (this means the pointer address is constant)
Combining 4a and 4b gives us:
int const * const x     x is a constant pointer to a constant integer
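To see what Rules 4a and 4b mean in practice, here is a short sketch (the function const_demo and its variables are invented for illustration). The commented-out lines are the ones a compiler would reject:

```c
/* Hypothetical function exercising each form of const placement. */
int const_demo(void)
{
    int a = 1;
    int b = 2;
    const int *p = &a;          /* pointer to a constant integer (4a) */
    int * const q = &a;         /* constant pointer to an integer (4b) */
    int const * const r = &b;   /* constant pointer to a constant integer */

    p = &b;         /* okay: the pointer itself may change */
    /* *p = 5; */   /* error: the integer is read-only through p */

    *q = 10;        /* okay: the integer may change */
    /* q = &b; */   /* error: the pointer is read-only */

    return a + *p + *r;   /* 10 + 2 + 2 */
}
```

Notice that const on the pointee only restricts access through that pointer; a itself is still modified through q.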
We have one final rule required to force precedence. For example we’ve already seen that int * x(void) declares x as a function that returns a pointer to an integer (Rule 2, Rule 3). But what if I wanted to declare a pointer to a function that returns an integer?
The syntax is as follows:
int (*x)(void)    x is a pointer to a function that returns an integer
This gives our final rule, which becomes a new Rule 2 and pushes everything down by one:

Rule 2: If the identifier is within parentheses, then evaluate inside the parentheses first

This rule is required because when we have  *x() then the function parentheses always win. Thus:
void (*x)(int y)     x is a pointer to a function that takes an integer (y) as a parameter and returns void
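The difference between the two forms is easier to see side by side. In this sketch all the names are invented for illustration: one function returns a pointer to an integer, and the other is called through a pointer to a function.

```c
static int value = 42;

int *get_address(void)   /* a function that returns a pointer to an integer */
{
    return &value;
}

int get_value(void)      /* a function that returns an integer */
{
    return value;
}

/* Hypothetical caller showing the parenthesised form in use. */
int call_through_pointer(void)
{
    int (*x)(void) = get_value;   /* x is a pointer to a function returning int */
    return x();                   /* call the function x points to */
}
```

Dropping the parentheses around *x in the local declaration would turn it into a declaration of a function returning int *, which cannot be initialised in this way.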
Rule Summary
  • Rule 1: Read from left to right looking for an identifier
  • Rule 2: If the identifier is within parentheses, then evaluate inside the parentheses first
  • Rule 3:    look right for postfix operators ( ) or [ ]. If [] then it is an array, else if () then it is a function.
  • Rule 4:    look left for prefix pointer asterisk ‘*’. If found the identifier is a pointer.
  • Rule 5a: If a const and/or volatile is next to a type specifier (int, long, etc.) it applies to that specifier
  • Rule 5b: if a const and/or volatile is not next to a type then it applies to the pointer asterisk on its immediate left
Complex Declarations
This core set should decode C program object declarations. Let’s put it to the test on a couple of horrible declarations. First can you work out:
void (*fpa[10])(int)
Have a go before I break it down…
Okay, let’s decompose this:
Rule 1: From left to right find identifier, this gives us fpa
Rule 2: (*fpa[10]) parentheses win, so evaluate inside the parentheses
Rule 3: fpa[10]  postfix [] wins; fpa is a ten element array ($ now represents fpa[10])
Rule 4:    *$    prefix * wins; fpa is an array of pointers. Now we’ve evaluated inside the parentheses we step outside.
Rule 3: $()  postfix () wins; fpa is an array of pointers to functions
Rule 2: void $(int)   parentheses; fpa is an array of pointers to functions each taking an integer parameter and returning void
So the identifier fpa represents an array of ten pointers to functions each of which takes an integer as a parameter and returns void. Phew…
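For what it’s worth, here is a sketch of such an array in use (the handler functions and the driver run_all are invented for illustration):

```c
static int total;   /* records what the handlers did */

static void add(int n)      { total += n; }
static void subtract(int n) { total -= n; }

/* fpa is an array of ten pointers to functions taking an int and
   returning void; elements without an initialiser are null pointers */
static void (*fpa[10])(int) = { add, add, subtract };

/* Hypothetical driver: call every non-null entry with the same argument. */
int run_all(int n)
{
    int i;

    total = 0;
    for (i = 0; i < 10; ++i) {
        if (fpa[i] != 0) {
            fpa[i](n);   /* call through the i-th function pointer */
        }
    }
    return total;
}
```

Tables like this are a common way of building simple dispatchers and state machines in embedded code, which is why the declaration is worth being able to read.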
Okay one last one to try, go to the C standard library and look at the declarations in <signal.h> and you should see:
 void (*signal(int sig, void(*func)(int)))(int);
If you can decode this then I’m really impressed!

Let’s apply our rule-set to this:
First, as always, is Rule 1; signal is the identifier. signal is in parentheses, so based on Rule 2 we must evaluate that first. If we match parentheses then we get:
(*signal(int sig, void(*func)(int)))
Which we can temporarily simplify (by ignoring the function parameters) to:
(*signal())
Based on Rule 3 and then Rule 4, signal is a function that returns a pointer. The question is: a pointer to what?  Using the simplification we can work out the return type as:
void (* signal() )(int)
which becomes
void (*$)(int)
which means the function signal returns a pointer to a function that has an integer parameter and returns void.
So let’s return to the parameters, this gives us:
signal(int sig, void(*func)(int))
So signal takes two parameters
int sig – sig is an integer
void(*func)(int) –  func is a pointer to a function that has an integer parameter and returns void.
To summarise:
  • signal is a function
  • that returns a pointer to a function that has an integer parameter and returns void
  • and takes two parameters of
  • an integer, and
  • a pointer to a function that has an integer parameter and returns void
It doesn’t get much worse than this (and remember this example comes from the standard library, which is shameful!).
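Having decoded the declaration, here is a sketch of signal in use with the standard raise function (the handler on_signal and the helper raise_and_catch are invented for illustration). Note how the value returned by signal is stored in a variable of exactly the type we worked out above:

```c
#include <signal.h>

static volatile sig_atomic_t last_signal = 0;

/* A handler has the type the second parameter asks for:
   it takes an int and returns void. */
static void on_signal(int sig)
{
    last_signal = sig;
}

/* Hypothetical helper: install the handler, raise the signal, then put
   the previous handler back. Returns the signal number the handler saw,
   or -1 if the handler could not be installed. */
int raise_and_catch(int sig)
{
    void (*previous)(int);   /* same type as signal's return value */

    previous = signal(sig, on_signal);
    if (previous == SIG_ERR) {
        return -1;
    }
    raise(sig);              /* runs on_signal before returning */
    signal(sig, previous);   /* restore whatever was installed before */
    return (int)last_signal;
}
```

Calling raise_and_catch(SIGINT) should hand back SIGINT, having left the original handler in place.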
How to avoid complexity in declarations
Avoid by design, as far as possible. If this fails, divide and conquer remembering that typedef is your friend.  A typedef declaration does not introduce a new type, only a synonym for the type specified. For example:
typedef  int  MILES;
MILES  m;   /* m is of type int */
typedef int*  int_ptr;
int_ptr  ip;  /* ip is of type integer pointer int* */
Used well, typedefs make life easier. For example:
typedef void (*FuncPtr)(int);
FuncPtr is a typedef for a pointer to any function which takes an integer parameter and returns void.
In the “signal” example, both function pointers are of this type, so using the typedef, the declaration
void (*signal(int sig, void(*func)(int)))(int)
becomes
FuncPtr signal(int sig, FuncPtr)
and our previous declaration of:
void (*fpa[10])(int)
becomes
FuncPtr  fpa[10]
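You can let the compiler confirm that the long and short forms really are the same type. In this sketch the function select_handler (a stand-in with the same shape as signal) and the handler ignore are invented for illustration; the two declarations are accepted together only because they are identical:

```c
typedef void (*FuncPtr)(int);

/* Both lines declare the same hypothetical function. */
void (*select_handler(int sig, void (*func)(int)))(int);
FuncPtr select_handler(int sig, FuncPtr func);

static void ignore(int sig)
{
    (void)sig;   /* deliberately does nothing */
}

/* Hypothetical definition: return func, or a do-nothing handler
   when no function is supplied. */
FuncPtr select_handler(int sig, FuncPtr func)
{
    (void)sig;
    return (func != 0) ? func : ignore;
}
```

If the two declarations disagreed in any detail the compiler would report conflicting types, so the typedef version is a safe drop-in replacement.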
After that I need to find a dark room to lie down in.

Also check out (thanks @FrankSansC)
