You are currently browsing the archives for the ARM category.

goto fail and embedded C Compilers

February 27th, 2014

I can’t imagine anyone reading this posting hasn’t already read about the Apple “goto fail” bug in SSL. My reaction was one of incredulity; I really couldn’t believe this code could have got into the wild on so many levels.

First we’ve got to consider the testing (or lack thereof) for this codebase. The side effect of the bug was that all SSL certificates passed, even malformed ones. This implies positive testing (i.e. we can demonstrate it works), but no negative testing (i.e. a malformed SSL certificate), or even no dynamic SSL certificate testing at all?

What I haven’t established* is whether the bug came about through code removal (e.g. there was another ‘if’ statement before the second goto) or, due to trial-and-error programming, the extra goto got added (with other code) that then didn’t get removed on a clean-up. There are, of course, some schools of thought that believe it was deliberately put in as part of prism!

Then you have to query regression testing; have they never tested for malformed SSL certificates (I can’t believe that; mind you I didn’t believe Lance was doping!) or did they use a regression-subset for this release which happened to miss this bug? Regression testing vs. product release is always a massive pressure. Automation of regression testing through continuous integration is key, but even so, for very large code bases it is simplistic to say “rerun all tests”; we live in a world of compromises.

Next, if we actually analyse the code then I can imagine the MISRA-C group jumping around saying “look, look, if only they’d followed MISRA-C this couldn’t of happened” (yes Chris, it’s you I’m envisaging) and of course they’re correct. This code breaks a number of MISRA-c:2012 rules, but most notably:

15.6 (Required) The body of an iteration-statement or selection-statement shall be a compound-statement

Which boils down to all if-statements must use a block structure, so the code would go from (ignoring the glaring coding error of two gotos):
Read more »

UK based One-day ARM User Conference (and it’s free!)

September 11th, 2013

For those of you that are not on our company hit list, sorry I mean mailing list, then you may not have heard about next week’s ARM User Conference run by the good folks at Hitex UK.

The event is titled “ARM – Continually Raising the Standard” and is being held at Stoneleigh Park near Coventry on the 19th September 2013. This year there are two streams running to allow a wider choice of presentation.


The event is also preceded by a number of (paid) workshops on the 17th  and 18th.

I shall be presenting the paper on “Developing a Generic Hard-fault Handler for the ARMv7–M Architecture”. Feabhas shall also have a table-top there so if you’re attending please stop by and say hello.

Full details of the event can be found here

ARM TechCon 2013

August 15th, 2013

ARM’s Technical Conference called TechCon™ is running between October 29th and 31st at the Santa Clara Convention Center in California.


This year I shall be making the trip over to present three classes:

For those of you who are regular readers of this blog you’ll recognise the Generic Hard-Fault Handler from a previous post.

The class “Can Existing Embedded Applications Benefit from Multicore Technology?” came about as it seemed that not a day would goe by without an announcement of a major development in multicore technology. With so much press about multicore, I started to wonder whether I should consider using multicore technology in my typical embedded applications? 

From a software developer’s perspective, however, all the code examples seem to demonstrate the (same) massive performance improvements to rendering fractals or ray-tracing programs. The examples always refer to Amdahl’s Law, showing gains when using, say, 16 or 128 cores. This is all very interesting, but not what I, and hopefully most embedded developers, might consider embedded. This class discusses multicore from a more traditional embedded viewpoint.

Many Embedded-C programmers still believe that C++ leads to slow, bloated programs. Though this viewpoint may have had limited foundation over a decade ago, it is misplaced for the core aspects of C++ (classes, inheritance, and dynamic polymorphism). With a modern ARM C++ cross-compiler, it is also misplaced for the more advanced features (templates and exception handling). In the class “Virtual Functions in C++ on the ARM Architecture”, I will be focusing on the performance and memory of the C++ virtual functions, type info and look at the use of multiple inheritance in an ARM embedded environment.

If you are planning to attend TechCon this year then please look me up. I will be making the presentations available via the blog after the event.


Test Driven Development (TDD) with the mbed

May 24th, 2013

One of the most useful fallout’s from the acceptance of Agile techniques is the use of Test-Driven-Development (TDD) and the growth of associated test frameworks, such as GoogleTest and CppUTest, etc.

I won’t get into the details of TDD here as they are well covered elsewhere (I recommend James Grenning’s book “Test Driven Development for Embedded C” for a good coverage of the subject area), but the principle is

  1. Write a test
  2. Develop enough code to compile and build (but will fail the test)
  3. Write the application code to pass the test
  4. repeat until done

Obviously that is massively simplifying the process, but that’s the gist. The key to it all is automation, in that you want to write a new test and then the build-deploy-test-report (BDTR) cycle is automated.

To build a TDD environment with the mbed I needed to solve the following obstacles:

  1. Build – Using a TDD framework and building the project
  2. Deploy –  Download to the mbed
  3. Test – auto-executing the test code on the mbed
  4. Report –  Getting test reports back from the mbed to the host

Read more »

Rehosting ARMCC for the mbed with CMSIS-DAP

April 26th, 2013

In this posting I will look at porting the C standard library output (e.g. puts / printf ) to use a UART rather than the default ARM/Keil semihosting.


In my last post, I looked at getting basic user I/O out from a native-mbed via UART0 to a terminal emulator (e.g. Tera Term). This was driven by the fact that, currently, neither printf (via semihosting) or ITM_SendChar do not function on the mbed. Unfortunately, my solution uses a propriety API, such as init_serial0 and putchar0, etc., rather than puts, printf, etc.

ITM_SendChar uses the ITM (Instrumented Trace Macrocell) on the Cortex-M3 core, which in turn uses a Trace Port (either SWO or 4-pin) to send messages. To get output from a Trace Port you need a debug unit with trace capabilities, e.g. a ULINK or J-Link device. The current implementation of CMSIS-DAP does not support any trace capabilities; however I am led to believe that ARM are planning to add some trace capabilities in future versions or variants of CMSIS-DAP (no timeframes).

To reference the ARM website:

Semihosting is implemented by a set of defined software instructions, for example, SVCs, that generate exceptions from program control. The application invokes the appropriate semihosting call and the debug agent then handles the exception. The debug agent provides the required communication with the host.

On a Cortex-M3 (ARMv7–M) you’d typically see “BKPT 0xAB” opcode instead of SVC’s. For the same reason as the ITM_SendChar, currently semihosting is not supported on the mbed.

So, ideally, it would be nice to still be able to use puts/printf (the greatest debug tool of all) but redirect the output to our UART; i.e. rehosting.

Rehosting in the Keil environment is very easy, once you know how! It is easy to go down a couple of dead ends, which hopefully I’ll help you avoid.

First, in our main, where we’re using printf, we need to include the following pre-processor directive in main.c:

#pragma import(__use_no_semihosting_swi)

Read more »

User I/O from mbed with CMSIS-DAP

April 19th, 2013

Following on from my last posting regarding using native C/C++ on the mbed I have found that I currently cannot get output via the standard CMSIS ITM_SendChar function as used in the Cortex-M hard fault handler (I am currently in dialog with the guys at ARM trying to resolve this).


In the standard mbed environment, the mbed can communicate with a host PC through a “USB Virtual Serial Port” over the same USB cable that is used for programming using printf(), e.g.

#include “mbed.h”
int main()
printf(“Hello World!\n”);

To achieve the same output, an mbed SerialPC object can be defined, e.g.

#include “mbed.h”
Serial pc(USBTX, USBRX); // tx, rx
int main()
pc.printf(“Hello World!\n”);

Currently, with a native minimal project, the semi-hosting of printf is not supported. This can be overcome by “re-targeting the project”, so I’ll cover that in the future, but for now the is a simple way of getting basic user I/O.

User I/O via UART0

Luckily for us, if we push characters out over the UART0 serial interface they are transmitted via the same channel that the mbed SerialPC uses. To test this out I quickly (using the best agile techniques of course) put together a very basic UART driver. Read more »

Native C/C++ Application development for the mbed using CMSIS-DAP

April 12th, 2013

If you have been following the Feabhas blog for some time, you may remember that in April of last year I posted about my experiences of using the MQTT protocol. The demonstration code was ran the ARM Cortex-M3 based mbed platform.mbed-microcontroller-angled

For those that are not familiar with the mbed, it is an “Arduino-like” development platform for small microcontroller embedded systems. The variant I’m using is built using an NXP LPC1768 Cortex-M3 device, which offers a plethora of connection options, ranging from simple GPIO, through I2C and SPI, right up to CAN, USB and Ethernet. With a similar conceptual model to Arduino’s, the drivers for all these drivers are supplied in a well-tested (C++) library. The mbed is connect to a PC via a USB cable (which also powers it), so allows the mbed to act as a great rapid prototyping platform. [I have never been a big fan of the 8-bit Arduino (personal choice no need to flame me  ) and have never used the newer ARM Cortex-M based Arduino’s, such as the Due.]

However, in its early guise, there were two limitations when targeting an mbed (say compared to the Arduino).

First was the development environment; initially all software development was done through a web-based IDE. This is great for cross-platform support; especially for me being an Apple fanboy. Personally I never had a problem using the online IDE, especially as I am used to using offline environments such as Keil’s uVision, IAR’s Embedded Workbench and Eclipse. Over the years the mbed IDE has evolved and makes life very easy for importing other mbed developers libraries, creating your own libraries and even have an integrated distributed version control feature. But the need Internet connection inhibit the ability to develop code on a long flight or train journey for example.

Second, the output from the build process is a “.bin” (binary) file, which you save on to the mbed (the PC sees the mbed as a USB storage device). You then press the reset button on the mbed to execute your program. I guessing you’re well ahead of me here, but of course that means there is no on-target debug capabilities (breakpoints, single-step, variable and memory viewing, etc.). Now of course one could argue, as we have a well-defined set of driver libraries and if we followed aTest-Driven-Development (TDD) process that we don’t need target debugging (there is support for printf style debugging via the USB support serial mode); but that is a discussion/debate for another session! I would hazard a guess most embedded developers would prefer at least the option of target based source code debugging? Read more »

Setting up the Cortex-M3/4 (ARMv7-M) Memory Protection Unit (MPU)

February 25th, 2013

An optional part of the ARMv7-M architecture is the support of a Memory Protection Unit (MPU). This is a fairly simplistic device (compared to a fully blow Memory Management Unit (MMU) as found on the Cortex-A family), but if available can be programmed to help capture illegal or dangerous memory accesses.
When first looking at programming the MPU it may seem rather daunting, but in reality it is very straightforward. The added benefit of the ARMv7-M family is the well-defined memory map.
All example code is based around an NXP LPC1768 and Keil uVision v4.70 development environment. However as all examples are built using CMSIS, then they should work on an Cortex-M3/4 supporting the MPU.
First, let’s take four types of memory access we may want to capture or inhibit:

  1. Tying to read at an address that is reserved in the memory map (i.e. no physical memory of any type there)
  2. Trying to write to Flash/ROM
  3. Stopping areas of memory being accessible
  4. Disable running code located in SRAM (eliminating potential exploit)

Before we start we need to understand the microcontrollers memory map, so here we can look at the memory map of the NXP LPC1768 as defined in chapter 2 of the LPC17xx User Manual (UM10360).

  • 512kB FLASH @ 0x0000 0000 – 0x0007 FFFF
  • 32kB on-chip SRAM @ 0x1000 0000 – 0x1000 7FFFF
  • 8kB boot ROM @ 0x1FFF 0000 – 0x1FFF 1FFF
  • 32kB on-chip SRAM @ 0x2007 C000 [AHB SRAM]
  • GPIO @ 0x2009C000 – 0x2009 FFFF
  • APB Peripherals  @ 0x4000 0000 – 0x400F FFFF
  • AHB Peripheral @ 0x5000 0000 – 0x501F FFFF
  • Private Peripheral Bus @ 0xE000 0000 – 0xE00F FFFF

Based on the above map we can set up four tests:

  1. Read from location 0x0008 0000 – this is beyond Flash in a reserved area of memory
  2. Write to location 0x0000 4000 – some random loaction in the flash region
  3. Read the boot ROM at 0x1FFF 0000
  4. Construct a function in SRAM and execute it

The first three tests are pretty easy to set up using pointer indirection, e.g.:

int* test1 = (int*)0x000004000;   // reserved location
x= *test1;                        // try to read from reserved location
int* test2 = (int*)0x000004000;   // flash location
*test2 = x;                       // try to write to flash
int* test3 = (int*)0x1fff0000 ;   // Boot ROM location
x = *test3 ;                      // try to read from boot ROM

The fourth takes a little more effort, e.g.

// int func(int r0)
// {
//    return r0+1;
// }
uint16_t func[] = { 0x4601, 0x1c48, 0x4770 };
int main(void)
   funcPtr test4= (funcPtr)(((uint32_t)func)+1);  // setup RAM function (+1 for thumb)
   x = test4(x);                                  // call ram function

Default Behavior

Without the MPU setup the following will happen (output from the previous Fault Handler project):

  • test1 will generate a precise bus error


  • test2 will generate an imprecise bus error


Test3 and test4 will run without any fault being generated.

Setting up the MPU

There are a lot of options when setting up the MPU, but 90% of the time a core set are sufficient. The ARMv7-M MPU supports up to 8 different regions (an address range) that can be individually configured. For each region the core choices are:

  • the start address (e.g. 0x10000000)
  •  the size (e.g. 32kB)
  •  Access permissions (e.g. Read/Write access)
  • Memory type (here we’ll limit to either Normal for Flash/SRAM, Device for NXP peripherals, and Strongly Ordered for the private peripherals)
  • Executable or not (refereed to a Execute Never [XN] in MPU speak)

Both access permissions and memory types have many more options than those covered here, but for the majority of cases these will suffice. Here I’m not intending to cover privileged/non-privileged options (don’t worry if that doesn’t make sense, I shall cover it in a later posting).
Based on our previous LPC1768 memory map we could define as region map thus:

No.  Memory             Address       Type      Access Permissions  Size
0    Flash              0x00000000    Normal    Full access, RO    512KB
1    SRAM               0x10000000    Normal    Full access, RW     32KB
2    SRAM               0x2007C000    Normal    Full access, RW     32KB
3    GPIO               0x2009C000    Device    Full access, RW     16KB
4    APB Peripherals    0x40000000    Device    Full access, RW    512KB
5    AHB Peripherals    0x50000000    Device    Full access, RW      2MB
6    PPB                0xE0000000    SO        Full access, RW      1MB

Not that the boot ROM has not been explicitly mapped. This means any access to that region once the MPU has been initialized will get caught as a memory access violation.
To program a region, we need to write to two registers in order:

  • MPU Region Base Address Register (CMSIS: SCB->RBAR)
  • MPU Region Attribute and Size Register (CMSIS: SCB->RASR)

MPU Region Base Address Register

Bits 0..3 specify the region number
Bit 4 needs to be set to make the region valid
bits 5..31 have the base address of the region (note the bottom 5 bits are ignored – base address must also be on a natural boundary, i.e. for a 32kB region the base address must be a multiple of 32kB).

So if we want to program region 1 we would write:

#define VALID 0x10
SCB->RBAR = 0x10000000 | VALID | 1;  // base addr | valid | region no

MPU Region Attribute and Size Register

This is slightly more complex, but the key bits are:

bit 0 – Enable the region
bits 1..5 – region size; where size is used as 2**(size+1)
bits 16..21 – Memory type (this is actually divided into 4 separate groups)
bits 24..26 – Access Privilege
bit 28 – XN

So given the following defines:

#define REGION_Enabled  (0x01)
#define REGION_32K      (14 << 1)      // 2**15 == 32k
#define NORMAL          (8 << 16)      // TEX:0b001 S:0b0 C:0b0 B:0b0
#define FULL_ACCESS     (0x03 << 24)   // Privileged Read Write, Unprivileged Read Write
#define NOT_EXEC        (0x01 << 28)   // All Instruction fetches abort

We can configure region 0 thus:


We can now repeat this for each region, thus:

void lpc1768_mpu_config(void)
   /* Disable MPU */
   MPU->CTRL = 0;
   /* Configure region 0 to cover 512KB Flash (Normal, Non-Shared, Executable, Read-only) */
   MPU->RBAR = 0x00000000 | REGION_Valid | 0;
   MPU->RASR = REGION_Enabled | NORMAL | REGION_512K | RO;
   /* Configure region 1 to cover CPU 32KB SRAM (Normal, Non-Shared, Executable, Full Access) */
   MPU->RBAR = 0x10000000 | REGION_Valid | 1;
   /* Configure region 2 to cover AHB 32KB SRAM (Normal, Non-Shared, Executable, Full Access) */
   MPU->RBAR = 0x2007C000 | REGION_Valid | 2;
   /* Configure region 3 to cover 16KB GPIO (Device, Non-Shared, Full Access Device, Full Access) */
   MPU->RBAR = 0x2009C000 | REGION_Valid | 3;
   /* Configure region 4 to cover 512KB APB Peripherials (Device, Non-Shared, Full Access Device, Full Access) */
   MPU->RBAR = 0x40000000 | REGION_Valid | 4;
   /* Configure region 5 to cover 2MB AHB Peripherials (Device, Non-Shared, Full Access Device, Full Access) */
   MPU->RBAR = 0x50000000 | REGION_Valid | 5;
   /* Configure region 6 to cover the 1MB PPB (Privileged, XN, Read-Write) */
   MPU->RBAR = 0xE0000000 | REGION_Valid | 6;
   /* Enable MPU */
   MPU->CTRL = 1;

After the MPU has been enabled, ISB and DSB barrier calls have been added to ensure that the pipeline is flushed and no further operations are executed until the memory access that enables the MPU completes.

Using the Keil environment, we can examine the MPU configuration:


Rerunning the tests with MPU enabled

To get useful output we can develop a memory fault handler, building on the Hard Fault handler, e.g.

void printMemoryManagementErrorMsg(uint32_t CFSRValue)
   printErrorMsg("Memory Management fault: ");
   CFSRValue &= 0x000000FF; // mask just mem faults
   if((CFSRValue & (1<<5)) != 0) {
      printErrorMsg("A MemManage fault occurred during FP lazy state preservation\n");
   if((CFSRValue & (1<<4)) != 0) {
      printErrorMsg("A derived MemManage fault occurred on exception entry\n");
   if((CFSRValue & (1<<3)) != 0) {
      printErrorMsg("A derived MemManage fault occurred on exception return.\n");
   if((CFSRValue & (1<<1)) != 0) {
      printErrorMsg("Data access violation.\n");
   if((CFSRValue & (1<<0)) != 0) {
      printErrorMsg("MPU or Execute Never (XN) default memory map access violation\n");
   if((CFSRValue & (1<<7)) != 0) {
      static char msg[80];
      sprintf(msg, "SCB->MMFAR = 0x%08x\n", SCB->MMFAR );

Test 1 – Reading undefined region

Rerunning test one with the MPU enabled gives the following output:


The SCB->MMFAR contains the address of the memory that caused the access violation, and the PC guides us towards the offending instruction


Test 2 – Writing to RO defined region



Test 3 – Reading Undefined Region (where memory exists)



Test 4 – Executing code in XN marked Region


The PC gives us the location of the code (in SRAM) that tried to be executed


The LR indicates the code where the branch was executed


So, we can see with a small amount of programming we can (a) simplify debugging by quickly being able to establish the offending opcode/memory access, and (b) better defend our code against accidental/malicious access.

Optimizing the MPU programming.

Once useful feature of the Cortex-M3/4 MPU is that the Region Base Address Register and Region Attribute and Size Register are aliased three further times. This means up to 4 regions can be programmed at once using a memcpy. So instead of the repeated writes to RBAR and RASR, we can create configuration tables and initialize the MPU using a simple memcpy, thus:

uint32_t table1[] = {
/* Configure region 0 to cover 512KB Flash (Normal, Non-Shared, Executable, Read-only) */
(0x00000000 | REGION_Valid | 0),
/* Configure region 1 to cover CPU 32KB SRAM (Normal, Non-Shared, Executable, Full Access) */
(0x10000000 | REGION_Valid | 1),
/* Configure region 2 to cover AHB 32KB SRAM (Normal, Non-Shared, Executable, Full Access) */
(0x2007C000 | REGION_Valid | 2),
/* Configure region 3 to cover 16KB GPIO (Device, Non-Shared, Full Access Device, Full Access) */
(0x2009C000 | REGION_Valid | 3),

uint32_t table2[] = {
/* Configure region 4 to cover 512KB APB Peripherials (Device, Non-Shared, Full Access Device, Full Access) */
(0x40000000 | REGION_Valid | 4),
/* Configure region 5 to cover 2MB AHB Peripherials (Device, Non-Shared, Full Access Device, Full Access) */
(0x50000000 | REGION_Valid | 5),
/* Configure region 6 to cover the 1MB PPB (Privileged, XN, Read-Write) */
(0xE0000000 | REGION_Valid | 6),

void lpc1768_mpu_config_tbl(void)
   /* Disable MPU */
   MPU->CTRL = 0;
   memcpy((void*)&( MPU->RBAR), table1, sizeof(table1));
   memcpy((void*)&( MPU->RBAR), table2, sizeof(table2));
   /* Enable MPU */
   MPU->CTRL = 1;

I hope this is enough to get you started with your ARMv7-M MPU.

Developing a Generic Hard Fault handler for ARM Cortex-M3/Cortex-M4

February 1st, 2013

This posting assumes you that you have a working ARM Cortex-M3 base project in Keil uVision. If not, please see the “howto” video: Creating ARM Cortex-M3 CMSIS Base Project in uVision

Divide by zero error

Given the following C function

int div(int lho, int rho)
    return lho/rho;

called from main with these arguments

int main(void)
   int a = 10;
   int b = 0;
   int c;
   c = div(a, b);
   // other code

You would expect a hardware “divide-by-zero” (div0) error. Possibly surprisingly, by default the Cortex-M3 will not report the error but return zero.

Configuration and Control Register (CCR)

To enable hardware reporting of div0 errors we need to configure the CCR. The CCR is part of the Cortex-M’s System Control Block (SCB) and controls entry trapping of divide by zero and unaligned accesses among other things. The CCR bit assignment for div0 is:

[4] DIV_0_TRP Enables faulting or halting when the processor executes an SDIV or UDIV instruction with a divisor of 0:0 = do not trap divide by 01 = trap divide by 0. When this bit is set to 0, a divide by zero returns a quotient of 0.

So to enable DIV_0_TRP, we can use CMSIS definitions for the SCB and CCR, as in:

SCB->CCR |= 0x10;

If we now build and run the project you will need to stop execution as it will appear to run forever. When execution is stopped you’ll find debugger stopped in the file startup_ARMCM3.s in the CMSIS default Hard Fault exception handler:


Override The Default Hard Fault_Handler

As all the exceptions handlers are build with “Weak” linkage in CMSIS, it is very easy to create your own Hard Fault handler. Simply define a function with the name “HardFault_Handler”, as in:

void HardFault_Handler(void)

If we now build, run and then stop the project, we’ll find the debugger will be looping in our new handler rather than the CMSIS default one (alternatively we could put a breakpoint at the while(1) line in the debugger).


Rather than having to enter breakpoints via your IDE, I like to force the processor to enter debug state automatically if a certain instruction is reached (a sort of debug based assert). Inserting the BKPT (breakpoint) ARM instruction in our code will cause the processor to enter debug state. The immediate following the opcode normally doesn’t matter (but always check) except it shouldn’t be 0xAB (which is used for semihosting).

#include "ARMCM3.h" 
void HardFault_Handler(void)
  __ASM volatile("BKPT #01"); 

If we now build and run, the program execution should break automatically at the BKPT instruction.


Error Message Output

The next step in developing the fault handler is the ability to report the fault. One option is, of course, to use stdio (stderr) and semihosting. However, as the support for semihosting can vary from compiler to compiler, I prefer to use Instrumented Trace Macrocell (ITM) utilizing the CMSIS wrapper function ITM_SendChar, e.g.

void printErrorMsg(const char * errMsg)
   while(*errMsg != ''){



Fault Reporting

Now that we have a framework for the Hard Fault handler, we can start reporting on the actual fault details. Within the Cortex-M3’s System Control Block (SCB) is the HardFault Status Register (SCB->HFSR). Luckily again for use, CMSIS has defined symbols allowing us to access these register:

void HardFault_Handler(void)
   static char msg[80];
   printErrorMsg("In Hard Fault Handler\n");
   sprintf(msg, "SCB->HFSR = 0x%08x\n", SCB->HFSR);
   __ASM volatile("BKPT #01");

Building and running the application should now result in the following output:


By examining the HFSR bit configuration, we can see that the FORCED bit is set.


When this bit is set to 1, the HardFault handler must read the other fault status registers to find the cause of the fault.

Configurable Fault Status Register (SCB->CFSR)

A forced hard fault may be caused by a bus fault, a memory fault, or as in our case, a usage fault. For brevity, here I am only going to focus on the Usage Fault and cover the Bus and Memory faults in a later posting (as these have additional registers to access for details).


So given what we know to date, our basic Fault Handler:

  • checks if the FORCED bit is set ( if ((SCB->HFSR & (1 << 30)) != 0) )
  • prints out the contents of the CFSR
void HardFault_Handler(void)
   static char msg[80];
   printErrorMsg("In Hard Fault Handler\n");
   sprintf(msg, "SCB->HFSR = 0x%08x\n", SCB->HFSR);
   if ((SCB->HFSR & (1 << 30)) != 0) {
       printErrorMsg("Forced Hard Fault\n");
       sprintf(msg, "SCB->CFSR = 0x%08x\n", SCB->CFSR );
   __ASM volatile("BKPT #01");

Running the application should now result in the following output:


The output indicated that bit 25 of the UsageFault Status Register (UFSR) part of the CFSR is set.

UsageFault Status Register

The bit configuration of the UFSR is shown below, and unsurprisingly the outpur shows that bit 9 (DIVBYZERO) is set.


We can now extend the HardFault handler to mask the top half of the CFSR, and if not zero then further report on those flags, as in:

void HardFault_Handler(void)
   static char msg[80];
   printErrorMsg("In Hard Fault Handler\n");
   sprintf(msg, "SCB->HFSR = 0x%08x\n", SCB->HFSR);
   if ((SCB->HFSR & (1 << 30)) != 0) {
       printErrorMsg("Forced Hard Fault\n");
       sprintf(msg, "SCB->CFSR = 0x%08x\n", SCB->CFSR );
       if((SCB->CFSR & 0xFFFF0000) != 0) {
   __ASM volatile("BKPT #01");
void printUsageErrorMsg(uint32_t CFSRValue)
   printErrorMsg("Usage fault: ");
   CFSRValue >>= 16;                  // right shift to lsb
   if((CFSRValue & (1 << 9)) != 0) {
      printErrorMsg("Divide by zero\n");

A run should now result in the following output:


Register dump

One final thing we can do as part of any fault handler is to dump out known register contents as they were at the time of the exception. One really useful feature of the Cortex-M architecture is that a core set of registers are automatically stacked (by the hardware) as part of the exception handling mechanism. The set of stacked registers is shown below:


Using this knowledge in conjunction with AAPCS we can get access to these register values. First we modify our original HardFault handler by:

  • modifying it’s name to “Hard_Fault_Handler”
  • adding a parameter declared as an array of 32–bit unsigned integers.
void Hard_Fault_Handler(uint32_t stack[])

Based on AAPCS rules, we know that the parameter label (stack) will map onto register r0. We now implement the actual HardFault_Handler. This function simply copies the current Main Stack Pointer (MSP) into r0 and then branches to our Hard_Fault_Handler (this is based on ARM/Keil syntax):

__asm void HardFault_Handler(void) 
  MRS r0, MSP
  B __cpp(Hard_Fault_Handler) 

Finally we implement a function to dump the stack values based on their relative offset, e.g.

enum { r0, r1, r2, r3, r12, lr, pc, psr};

void stackDump(uint32_t stack[])
   static char msg[80];
   sprintf(msg, "r0  = 0x%08x\n", stack[r0]);  printErrorMsg(msg);
   sprintf(msg, "r1  = 0x%08x\n", stack[r1]);  printErrorMsg(msg);
   sprintf(msg, "r2  = 0x%08x\n", stack[r2]);  printErrorMsg(msg);
   sprintf(msg, "r3  = 0x%08x\n", stack[r3]);  printErrorMsg(msg);
   sprintf(msg, "r12 = 0x%08x\n", stack[r12]); printErrorMsg(msg);
   sprintf(msg, "lr  = 0x%08x\n", stack[lr]);  printErrorMsg(msg);
   sprintf(msg, "pc  = 0x%08x\n", stack[pc]);  printErrorMsg(msg);
   sprintf(msg, "psr = 0x%08x\n", stack[psr]); printErrorMsg(msg);

This function can then be called from the Fault handler, passing through the stack argument. Running the program should result is the following output:


Examining the output, we can see that the program counter (pc) is reported as being the value 0x00000272, giving us the opcode generating the fault. If we disassemble the image using the command:

fromelf -c CM3_Fault_Handler.axf —output listing.txt

By trawling through the listing (listing.txt) we can see the SDIV instruction at offending line (note also r2 contains 10 and r1 the offending 0).

0x00000270: 4602          MOV r2,r0
0x00000272: fb92f0f1      SDIV r0,r2,r1
0x00000276: 4770          BX lr

Finally, if you’re going to use the privilege/non–privilege model, you’ll need to modify the HardFault_Handler to detect whether the exception happened in Thread mode or Handler mode. This can be achieved by checking bit 3 of the HardFault_Handler’s Link Register (lr) value. Bit 3 determines whether on return from the exception, the Main Stack Pointer (MSP) or Process Stack Pointer (PSP) is used.

__asm void HardFault_Handler(void)
  TST lr, #4     // Test for MSP or PSP
  B __cpp(Hard_Fault_Handler)

In a later post I shall develop specific handler for all three faults.


  1. The initial model for fault handling can be found in Joseph Yiu’s excellent book “The Definitive Guide to the ARM Cortex-M3
  2. The code shown was built using the ARM/Keil MDK-ARM Version 4.60 development environment (a 32Kb size limited evaluation is available from the Keil website)
  3. The code, deliberately, has not been refactored to remove the hard-coded (magic) values.
  4. Code available on GitHub at git://

Weak linkage in C programming

January 25th, 2013

When linking C programs there are (in general) only a couple of errors you’re likely to see. If, for example, you have two functions in different files, both with external linkage, then the files will compile okay, but when you link you’ll likely see an error along these lines:

weak_linkage.axf: Error: L6200E: Symbol foo multiply defined (by foo.o and foo2.o).
Target not created

Most of the time this makes sense and is as expected; however there is a particular instance where it gets in the way.

If we need to supply a code framework where we need placeholders (stubs) for someone else to fill in at a later date, it can sometimes mean developing complex makefiles and/or conditional compilation to allow new code to be introduced as seamlessly as possible.

However, there is a hidden gem supported by most linkers called “weak linkage”. The principle of weak linkage is that you can define a function and tag it as (surprisingly) weak, e.g.

// foo_weak.c
__weak int foo(void)
// ...
return 1;

This then can be called from the main application:

// main.c
int foo(void);

int main(void)

This project can build built as normal:

compiling main.c…
compiling foo_weak.c…
Program Size: Code=372 RO-data=224 RW-data=4 ZI-data=4196
“weak_linkage.axf” – 0 Error(s), 0 Warning(s).

At some time later we can add another file with the same function signature to the project

// foo.c
int foo(void)
// override weak function
return 2;

If we rebuild, normally we would get the “multiply defined” symbols error, however with weak linkage the linker will now bind the new “strong” function to the call in main.

compiling main.c…
compiling foo_weak.c…
compiling foo.c…
Program Size: Code=372 RO-data=224 RW-data=4 ZI-data=4196
“weak_linkage.axf” – 0 Error(s), 0 Warning(s).

As you can also see, the weak function is optimized away.

A good example of the use of weak linkage is the definition of the default interrupt handlers in CMSIS.

This example code is based on Keil’s uVision v4.60 compiler/linker, however both GCC and IAR also support weak linkage.

%d bloggers like this: