Python – The everyman’s language

May 17th, 2013

Python is a very nice language in many respects: enforced white-spacing promotes readability, extensibility and Python’s inbuilt Read-Eval-Print-Loop interpreter combined with its introspection capabilities provides a very easy way to learn and get to grips with the language.

But that can’t be all, can it? Why Python?

One of the reasons behind the success of our course has been customers wanting a good language for developing automated testing scripts and Python fits the bill brilliantly – it’s fast (enough), approachable and has great support for the embedded platforms of today and tomorrow (read: Linux  :) )

In the scripting ring we have a number of contenders (Bash, Perl, Ruby, Lua, JavaScript), but each lacks that certain je ne sais quoi that makes Python so good, or maybe it’s just that the others don’t quite do what I want. Perl has a syntax that makes me want to scratch my eyes out, and Bash is great on the command line but has control structures and compatibility issues that make the baby Jesus cry. Some of the others, though, are worth a look.

Lua is nice. I’m honestly a fan of Lua and have used it in previous projects where Python was just too big to embed (adding in Lua is a ‘tiny’ 400kB), but that’s the issue: Python is a general programming language. I can quickly bring in web services, advanced numerical libraries, GUIs and scientific libraries as well as the built-in things like networking and threading. Lua simply isn’t designed for the vast range of contexts that Python fits, and that’s part of Lua’s design – it’s not a general scripting language.

JavaScript is the in-vogue scripting language of the moment; it’s easy to test and develop in the web browser and it has a C-style syntax that can appeal, but I worry about any language where I can type in the following and not have it shout an error at me…

[nick@zeus ~]$ gjs
gjs> +((+!![]+[])+(!+[]+!![]))
12

12? of course it is. Go home JavaScript. You’re drunk.

I am seeing more and more interest in using JavaScript in the embedded space, one recent example being the new BeagleBone Black, which allows you to interact with the hardware using JavaScript and a Node.js back-end. JavaScript, though, is still too tied to web technologies and less suited as a general system scripting language.

Ruby… well, I just simply haven’t found a good resource for learning about Ruby in the embedded space – that one is on me, sorry but maybe I was just scared by the famous Wat talk (here’s looking at you too JavaScript).

Problem?

One thing that does let down Python, in my opinion, is the lack of a good developer environment. I appreciate that Python is easy to use and the interactivity is a massive boon, but showing IDLE to someone who has used Visual Studio and all its spoon-feeding goodness does make me a little sad.

IDLE

Approachable huh?

Line numbers? Stability? A caret that will allow you to type when you misclick? – why do you need those when you can have… Detachable Menus!
It’s easy to make fun, but IDLE seems unmaintained and could do with some TLC. It’s still useful as a learning tool to bridge the gap between Visual Studio and the command line. On the plus side, the debugger does bring some good insight into the operation of the code for first-timers.

Summary

Whenever I need to script something, mock up an interface, test a design, develop some back-end code or create a full application – Python is always there for me.
Python’s versatility, compatibility and ‘kitchen sink’ approach make Python a fantastic choice for almost everyone, from non-programmers through to the physicists at CERN using it to create black holes.  It truly is the everyman’s (and woman’s) language.

So why not learn something new?

[nick@zeus ~]$ python
Python 2.7.3 (default, Aug 9 2012, 17:23:58)
[GCC 4.7.1 20120720 (Red Hat 4.7.1-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import antigravity

Rehosting ARMCC for the mbed with CMSIS-DAP

April 26th, 2013

In this posting I will look at porting the C standard library output (e.g. puts / printf ) to use a UART rather than the default ARM/Keil semihosting.


In my last post, I looked at getting basic user I/O out from a native mbed via UART0 to a terminal emulator (e.g. Tera Term). This was driven by the fact that, currently, neither printf (via semihosting) nor ITM_SendChar functions on the mbed. Unfortunately, my solution uses a proprietary API, such as init_serial0 and putchar0, etc., rather than puts, printf, etc.

ITM_SendChar uses the ITM (Instrumented Trace Macrocell) on the Cortex-M3 core, which in turn uses a Trace Port (either SWO or 4-pin) to send messages. To get output from a Trace Port you need a debug unit with trace capabilities, e.g. a ULINK or J-Link device. The current implementation of CMSIS-DAP does not support any trace capabilities; however I am led to believe that ARM are planning to add some trace capabilities in future versions or variants of CMSIS-DAP (no timeframes).

To reference the ARM website:

Semihosting is implemented by a set of defined software instructions, for example, SVCs, that generate exceptions from program control. The application invokes the appropriate semihosting call and the debug agent then handles the exception. The debug agent provides the required communication with the host.

On a Cortex-M3 (ARMv7-M) you’d typically see the “BKPT 0xAB” opcode instead of SVCs. For the same reason as ITM_SendChar, semihosting is currently not supported on the mbed.

So, ideally, it would be nice to still be able to use puts/printf (the greatest debug tool of all) but redirect the output to our UART; i.e. rehosting.
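
As a portable illustration of the goal (this is not the Keil retargeting mechanism itself), a printf-style front end can format into a buffer and push each character through a UART output routine. In this minimal sketch, `uart_putc` is a hypothetical stand-in for a routine such as putchar0; here it writes to a capture buffer so the example can run on a PC, and `uart_printf` is likewise an assumed name.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a UART transmit routine such as putchar0():
   it appends to a capture buffer so the sketch can run on a PC. */
static char   uart_capture[128];
static size_t uart_count = 0;

static void uart_putc(char ch)
{
    if (uart_count < sizeof(uart_capture) - 1) {
        uart_capture[uart_count++] = ch;
        uart_capture[uart_count]   = '\0';
    }
}

/* printf-style front end: format into a local buffer, then send each
   character to the UART. Returns the number of characters sent. */
int uart_printf(const char *fmt, ...)
{
    char    buf[96];
    va_list args;
    int     len;
    int     i;

    va_start(args, fmt);
    len = vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);

    if (len < 0) {
        return 0;                       /* formatting error: send nothing */
    }
    if ((size_t)len >= sizeof(buf)) {
        len = (int)sizeof(buf) - 1;     /* output was truncated to fit buf */
    }
    for (i = 0; i < len; ++i) {
        uart_putc(buf[i]);
    }
    return len;
}
```

On the target, `uart_putc` would forward to the real UART routine instead of the capture buffer. The drawback compared with rehosting is that existing calls to puts and printf would have to be changed to the wrapper.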

Rehosting in the Keil environment is very easy, once you know how! It is easy to go down a couple of dead ends, which hopefully I’ll help you avoid.

First, in our main, where we’re using printf, we need to include the following pre-processor directive in main.c:

#pragma import(__use_no_semihosting_swi)


User I/O from mbed with CMSIS-DAP

April 19th, 2013

Following on from my last posting regarding using native C/C++ on the mbed I have found that I currently cannot get output via the standard CMSIS ITM_SendChar function as used in the Cortex-M hard fault handler (I am currently in dialog with the guys at ARM trying to resolve this).


In the standard mbed environment, the mbed can communicate with a host PC through a “USB Virtual Serial Port” over the same USB cable that is used for programming using printf(), e.g.

#include "mbed.h"
int main()
{
    printf("Hello World!\n");
}

To achieve the same output, an mbed SerialPC object can be defined, e.g.

#include "mbed.h"
Serial pc(USBTX, USBRX); // tx, rx
int main()
{
    pc.printf("Hello World!\n");
}

Currently, with a native minimal project, the semihosting of printf is not supported. This can be overcome by “re-targeting the project”, which I’ll cover in the future, but for now there is a simple way of getting basic user I/O.

User I/O via UART0

Luckily for us, if we push characters out over the UART0 serial interface they are transmitted via the same channel that the mbed SerialPC uses. To test this out I quickly (using the best agile techniques of course) put together a very basic UART driver.

Native C/C++ Application development for the mbed using CMSIS-DAP

April 12th, 2013

If you have been following the Feabhas blog for some time, you may remember that in April of last year I posted about my experiences of using the MQTT protocol. The demonstration code was run on the ARM Cortex-M3 based mbed platform.

For those that are not familiar with the mbed, it is an “Arduino-like” development platform for small microcontroller embedded systems. The variant I’m using is built using an NXP LPC1768 Cortex-M3 device, which offers a plethora of connection options, ranging from simple GPIO, through I2C and SPI, right up to CAN, USB and Ethernet. With a similar conceptual model to Arduino’s, the drivers for all these peripherals are supplied in a well-tested (C++) library. The mbed is connected to a PC via a USB cable (which also powers it), which allows the mbed to act as a great rapid prototyping platform. [I have never been a big fan of the 8-bit Arduino (personal choice, no need to flame me) and have never used the newer ARM Cortex-M based Arduinos, such as the Due.]

However, in its early guise, there were two limitations when targeting an mbed (say compared to the Arduino).

First was the development environment; initially all software development was done through a web-based IDE. This is great for cross-platform support, especially for me being an Apple fanboy. Personally I never had a problem using the online IDE, even though I am used to using offline environments such as Keil’s uVision, IAR’s Embedded Workbench and Eclipse. Over the years the mbed IDE has evolved: it makes life very easy for importing other mbed developers’ libraries and creating your own libraries, and it even has an integrated distributed version control feature. But the need for an Internet connection inhibits the ability to develop code on a long flight or train journey, for example.

Second, the output from the build process is a “.bin” (binary) file, which you save on to the mbed (the PC sees the mbed as a USB storage device). You then press the reset button on the mbed to execute your program. I’m guessing you’re well ahead of me here, but of course that means there are no on-target debug capabilities (breakpoints, single-step, variable and memory viewing, etc.). Now of course one could argue that, as we have a well-defined set of driver libraries, if we followed a Test-Driven-Development (TDD) process we wouldn’t need target debugging (there is support for printf-style debugging via the USB serial mode); but that is a discussion/debate for another session! I would hazard a guess that most embedded developers would prefer at least the option of target-based source code debugging.

Setting up the Cortex-M3/4 (ARMv7-M) Memory Protection Unit (MPU)

February 25th, 2013

An optional part of the ARMv7-M architecture is the support of a Memory Protection Unit (MPU). This is a fairly simplistic device (compared to a fully blown Memory Management Unit (MMU) as found on the Cortex-A family), but if available it can be programmed to help capture illegal or dangerous memory accesses.
When first looking at programming the MPU it may seem rather daunting, but in reality it is very straightforward. The added benefit of the ARMv7-M family is the well-defined memory map.
All example code is based around an NXP LPC1768 and the Keil uVision v4.70 development environment. However, as all examples are built using CMSIS, they should work on any Cortex-M3/4 supporting the MPU.
First, let’s take four types of memory access we may want to capture or inhibit:

  1. Trying to read at an address that is reserved in the memory map (i.e. no physical memory of any type there)
  2. Trying to write to Flash/ROM
  3. Stopping areas of memory being accessible
  4. Disable running code located in SRAM (eliminating potential exploit)

Before we start we need to understand the microcontroller’s memory map, so here we can look at the memory map of the NXP LPC1768 as defined in chapter 2 of the LPC17xx User Manual (UM10360).

  • 512kB FLASH @ 0x0000 0000 – 0x0007 FFFF
  • 32kB on-chip SRAM @ 0x1000 0000 – 0x1000 7FFF
  • 8kB boot ROM @ 0x1FFF 0000 – 0x1FFF 1FFF
  • 32kB on-chip SRAM @ 0x2007 C000 [AHB SRAM]
  • GPIO @ 0x2009 C000 – 0x2009 FFFF
  • APB Peripherals @ 0x4000 0000 – 0x400F FFFF
  • AHB Peripherals @ 0x5000 0000 – 0x501F FFFF
  • Private Peripheral Bus @ 0xE000 0000 – 0xE00F FFFF

Based on the above map we can set up four tests:

  1. Read from location 0x0008 0000 – this is beyond Flash in a reserved area of memory
  2. Write to location 0x0000 4000 – some random location in the flash region
  3. Read the boot ROM at 0x1FFF 0000
  4. Construct a function in SRAM and execute it

The first three tests are pretty easy to set up using pointer indirection, e.g.:

int* test1 = (int*)0x00080000;    // reserved location
x = *test1;                       // try to read from reserved location
int* test2 = (int*)0x00004000;    // flash location
*test2 = x;                       // try to write to flash
int* test3 = (int*)0x1fff0000;    // Boot ROM location
x = *test3;                       // try to read from boot ROM

The fourth takes a little more effort, e.g.

// int func(int r0)
// {
//    return r0+1;
// }
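typedef int (*funcPtr)(int);      // pointer to a function taking and returning an int
int x;                            // value passed to and returned from the RAM function
uint16_t func[] = { 0x4601, 0x1c48, 0x4770 };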
int main(void)
{
   funcPtr test4= (funcPtr)(((uint32_t)func)+1);  // setup RAM function (+1 for thumb)
   x = test4(x);                                  // call ram function
   while(1);
}

Default Behavior

Without the MPU setup the following will happen (output from the previous Fault Handler project):

  • test1 will generate a precise bus error

f1

  • test2 will generate an imprecise bus error

f2

Test3 and test4 will run without any fault being generated.

Setting up the MPU

There are a lot of options when setting up the MPU, but 90% of the time a core set are sufficient. The ARMv7-M MPU supports up to 8 different regions (an address range) that can be individually configured. For each region the core choices are:

  • the start address (e.g. 0×10000000)
  •  the size (e.g. 32kB)
  •  Access permissions (e.g. Read/Write access)
  • Memory type (here we’ll limit to either Normal for Flash/SRAM, Device for NXP peripherals, and Strongly Ordered for the private peripherals)
  • Executable or not (referred to as Execute Never [XN] in MPU speak)

Both access permissions and memory types have many more options than those covered here, but for the majority of cases these will suffice. Here I’m not intending to cover privileged/non-privileged options (don’t worry if that doesn’t make sense, I shall cover it in a later posting).
Based on our previous LPC1768 memory map we could define a region map thus:

No.  Memory             Address       Type      Access Permissions  Size
0    Flash              0x00000000    Normal    Full access, RO    512KB
1    SRAM               0x10000000    Normal    Full access, RW     32KB
2    SRAM               0x2007C000    Normal    Full access, RW     32KB
3    GPIO               0x2009C000    Device    Full access, RW     16KB
4    APB Peripherals    0x40000000    Device    Full access, RW    512KB
5    AHB Peripherals    0x50000000    Device    Full access, RW      2MB
6    PPB                0xE0000000    SO        Full access, RW      1MB

Note that the boot ROM has not been explicitly mapped. This means any access to that region once the MPU has been initialized will get caught as a memory access violation.
To program a region, we need to write to two registers in order:

  • MPU Region Base Address Register (CMSIS: MPU->RBAR)
  • MPU Region Attribute and Size Register (CMSIS: MPU->RASR)

MPU Region Base Address Register

Bits 0..3 specify the region number
Bit 4 needs to be set to make the region valid
bits 5..31 have the base address of the region (note the bottom 5 bits are ignored – base address must also be on a natural boundary, i.e. for a 32kB region the base address must be a multiple of 32kB).

So if we want to program region 1 we would write:

#define REGION_Valid 0x10
MPU->RBAR = 0x10000000 | REGION_Valid | 1;  // base addr | valid | region no
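
The RBAR layout and the natural-boundary rule can be checked on a PC with a couple of small helpers. This is a minimal sketch; the names `mpu_rbar` and `mpu_base_is_aligned` and the `RBAR_*` masks are assumptions made for illustration and are not part of CMSIS.

```c
#include <stdint.h>

#define RBAR_REGION_MASK  0xFu           /* bits 0..3: region number */
#define RBAR_VALID        (1u << 4)      /* bit 4: use the region number field */
#define RBAR_ADDR_MASK    0xFFFFFFE0u    /* bits 5..31: base address */

/* Build an RBAR value from a base address and a region number. */
uint32_t mpu_rbar(uint32_t base, uint32_t region)
{
    return (base & RBAR_ADDR_MASK) | RBAR_VALID | (region & RBAR_REGION_MASK);
}

/* A base address is on a natural boundary if it is a multiple of the
   region size; size_bytes is assumed to be a power of two. */
int mpu_base_is_aligned(uint32_t base, uint32_t size_bytes)
{
    return (base & (size_bytes - 1u)) == 0u;
}
```

With these helpers, `mpu_rbar(0x10000000, 1)` gives 0x10000011. The alignment check is worth running over a whole region map: for example, 0x2009C000 is a multiple of 16kB, whereas 0x2007C000 is a multiple of 16kB but not of 32kB, so a 32kB region at that base would not meet the rule as stated.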

MPU Region Attribute and Size Register

This is slightly more complex, but the key bits are:

bit 0 – Enable the region
bits 1..5 – region size; where size is used as 2**(size+1)
bits 16..21 – Memory type (this is actually divided into 4 separate groups)
bits 24..26 – Access Privilege
bit 28 – XN

So given the following defines:

#define REGION_Enabled  (0x01)
#define REGION_32K      (14 << 1)      // 2**15 == 32k
#define NORMAL          (8 << 16)      // TEX:0b001 S:0b0 C:0b0 B:0b0
#define FULL_ACCESS     (0x03 << 24)   // Privileged Read Write, Unprivileged Read Write
#define NOT_EXEC        (0x01 << 28)   // All Instruction fetches abort

We can configure region 1 thus:

MPU->RASR = (REGION_Enabled | NOT_EXEC | NORMAL | REGION_32K | FULL_ACCESS);
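
The size encoding is the part most easily got wrong, so it can help to compute it rather than hand-code it. The following is a minimal sketch that runs on a PC; `mpu_size_field`, `mpu_rasr_sram` and the `RASR_*` names are hypothetical helpers that simply mirror the bit positions listed above.

```c
#include <stdint.h>

/* RASR bit fields as listed above (names local to this sketch). */
#define RASR_ENABLE    (1u << 0)
#define RASR_NORMAL    (8u << 16)    /* TEX=0b001, S=0, C=0, B=0 */
#define RASR_AP_FULL   (3u << 24)    /* privileged and unprivileged read/write */
#define RASR_XN        (1u << 28)    /* execute never */

/* Return the size field for a region of size_bytes, where the region
   covers 2**(field+1) bytes. The size must be a power of two of at
   least 32 bytes; 0 is returned for anything else. */
uint32_t mpu_size_field(uint32_t size_bytes)
{
    uint32_t field = 0u;

    if (size_bytes < 32u || (size_bytes & (size_bytes - 1u)) != 0u) {
        return 0u;
    }
    while ((size_bytes >> (field + 1u)) != 1u) {
        ++field;
    }
    return field;
}

/* RASR value for a non-executable, full-access SRAM region. */
uint32_t mpu_rasr_sram(uint32_t size_bytes)
{
    return RASR_ENABLE | RASR_XN | RASR_NORMAL | RASR_AP_FULL
         | (mpu_size_field(size_bytes) << 1);
}
```

For a 32kB region the size field is 14 and the complete register value is 0x1308001D, which matches the value built from the defines above.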

We can now repeat this for each region, thus:

void lpc1768_mpu_config(void)
{
   /* Disable MPU */
   MPU->CTRL = 0;
   /* Configure region 0 to cover 512KB Flash (Normal, Non-Shared, Executable, Read-only) */
   MPU->RBAR = 0x00000000 | REGION_Valid | 0;
   MPU->RASR = REGION_Enabled | NORMAL | REGION_512K | RO;
   /* Configure region 1 to cover CPU 32KB SRAM (Normal, Non-Shared, Not Executable, Full Access) */
   MPU->RBAR = 0x10000000 | REGION_Valid | 1;
   MPU->RASR = REGION_Enabled | NOT_EXEC | NORMAL | REGION_32K | FULL_ACCESS;
   /* Configure region 2 to cover AHB 32KB SRAM (Normal, Non-Shared, Not Executable, Full Access) */
   MPU->RBAR = 0x2007C000 | REGION_Valid | 2;
   MPU->RASR = REGION_Enabled | NOT_EXEC | NORMAL | REGION_32K | FULL_ACCESS;
   /* Configure region 3 to cover 16KB GPIO (Device, Non-Shared, Full Access) */
   MPU->RBAR = 0x2009C000 | REGION_Valid | 3;
   MPU->RASR = REGION_Enabled | DEVICE_NON_SHAREABLE | REGION_16K | FULL_ACCESS;
   /* Configure region 4 to cover 512KB APB Peripherals (Device, Non-Shared, Full Access) */
   MPU->RBAR = 0x40000000 | REGION_Valid | 4;
   MPU->RASR = REGION_Enabled | DEVICE_NON_SHAREABLE | REGION_512K | FULL_ACCESS;
   /* Configure region 5 to cover 2MB AHB Peripherals (Device, Non-Shared, Full Access) */
   MPU->RBAR = 0x50000000 | REGION_Valid | 5;
   MPU->RASR = REGION_Enabled | DEVICE_NON_SHAREABLE | REGION_2M | FULL_ACCESS;
   /* Configure region 6 to cover the 1MB PPB (Strongly Ordered, Shareable, Full Access) */
   MPU->RBAR = 0xE0000000 | REGION_Valid | 6;
   MPU->RASR = REGION_Enabled | STRONGLY_ORDERED_SHAREABLE | REGION_1M | FULL_ACCESS;
   /* Enable MPU */
   MPU->CTRL = 1;
   __DSB();   /* wait for the write that enables the MPU to complete */
   __ISB();   /* flush the pipeline so later instructions see the new settings */
}

After the MPU has been enabled, DSB and ISB barrier calls are used to ensure that the memory access that enables the MPU has completed and that the pipeline is flushed before any further instructions are executed.
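
The configuration function uses several names whose definitions are not given in this post (REGION_Valid, REGION_16K, REGION_512K, REGION_1M, REGION_2M, DEVICE_NON_SHAREABLE, STRONGLY_ORDERED_SHAREABLE, RO and P_RW_U_NA). The values below are assumptions derived from the ARMv7-M register layout and the region table, not the original header, so treat them as a sketch to be checked against the architecture manual. `mpu_region_bytes` is a hypothetical helper for sanity-checking the size fields on a PC.

```c
#include <stdint.h>

/* Assumed definitions for names used but not defined in the post. */
#define REGION_Valid                (0x10u)
#define REGION_16K                  (13u << 1)      /* 2**14 bytes */
#define REGION_512K                 (18u << 1)      /* 2**19 bytes */
#define REGION_1M                   (19u << 1)      /* 2**20 bytes */
#define REGION_2M                   (20u << 1)      /* 2**21 bytes */
#define DEVICE_NON_SHAREABLE        (0x10u << 16)   /* TEX=0b010 S=0 C=0 B=0 */
#define STRONGLY_ORDERED_SHAREABLE  (0x00u << 16)   /* TEX=0b000 C=0 B=0 */
#define RO                          (0x06u << 24)   /* read-only for both privilege levels */
#define P_RW_U_NA                   (0x01u << 24)   /* privileged RW, no unprivileged access */

/* Decode the size field of a RASR value back into a byte count. */
uint64_t mpu_region_bytes(uint32_t rasr)
{
    uint32_t field = (rasr >> 1) & 0x1Fu;
    return (uint64_t)1 << (field + 1u);
}
```

Decoding each size macro back to bytes, for example `mpu_region_bytes(REGION_512K)` giving 524288, is a quick way to confirm the shift values before they are written to the hardware.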

Using the Keil environment, we can examine the MPU configuration:

f3

Rerunning the tests with MPU enabled

To get useful output we can develop a memory fault handler, building on the Hard Fault handler, e.g.

void printMemoryManagementErrorMsg(uint32_t CFSRValue)
{
   printErrorMsg("Memory Management fault: ");
   CFSRValue &= 0x000000FF; // mask just mem faults
   if((CFSRValue & (1<<5)) != 0) {
      printErrorMsg("A MemManage fault occurred during FP lazy state preservation\n");
   }
   if((CFSRValue & (1<<4)) != 0) {
      printErrorMsg("A derived MemManage fault occurred on exception entry\n");
   }
   if((CFSRValue & (1<<3)) != 0) {
      printErrorMsg("A derived MemManage fault occurred on exception return.\n");
   }
   if((CFSRValue & (1<<1)) != 0) {
      printErrorMsg("Data access violation.\n");
   }
   if((CFSRValue & (1<<0)) != 0) {
      printErrorMsg("MPU or Execute Never (XN) default memory map access violation\n");
   }
   if((CFSRValue & (1<<7)) != 0) {
      static char msg[80];
      sprintf(msg, "SCB->MMFAR = 0x%08x\n", SCB->MMFAR );
      printErrorMsg(msg);
   }
}

Test 1 – Reading undefined region

Rerunning test one with the MPU enabled gives the following output:

f4

The SCB->MMFAR contains the address of the memory that caused the access violation, and the PC guides us towards the offending instruction.

f5

Test 2 – Writing to RO defined region

f6

f7

Test 3 – Reading Undefined Region (where memory exists)

f8

f9

Test 4 – Executing code in XN marked Region

f10

The PC gives us the location of the code (in SRAM) that the processor tried to execute.

f11

The LR indicates the code where the branch was executed

f12

So, we can see with a small amount of programming we can (a) simplify debugging by quickly being able to establish the offending opcode/memory access, and (b) better defend our code against accidental/malicious access.

Optimizing the MPU programming.

One useful feature of the Cortex-M3/4 MPU is that the Region Base Address Register and Region Attribute and Size Register are aliased three further times. This means up to 4 regions can be programmed at once using a memcpy. So instead of the repeated writes to RBAR and RASR, we can create configuration tables and initialize the MPU using a simple memcpy, thus:

uint32_t table1[] = {
/* Configure region 0 to cover 512KB Flash (Normal, Non-Shared, Executable, Read-only) */
(0x00000000 | REGION_Valid | 0),
(REGION_Enabled | NORMAL_OUTER_INNER_NON_CACHEABLE_NON_SHAREABLE | REGION_512K | RO),
/* Configure region 1 to cover CPU 32KB SRAM (Normal, Non-Shared, Not Executable, Full Access) */
(0x10000000 | REGION_Valid | 1),
(REGION_Enabled | NOT_EXEC | NORMAL | REGION_32K | FULL_ACCESS),
/* Configure region 2 to cover AHB 32KB SRAM (Normal, Non-Shared, Not Executable, Full Access) */
(0x2007C000 | REGION_Valid | 2),
(REGION_Enabled | NOT_EXEC | NORMAL_OUTER_INNER_NON_CACHEABLE_NON_SHAREABLE | REGION_32K | FULL_ACCESS),
/* Configure region 3 to cover 16KB GPIO (Device, Non-Shared, Full Access) */
(0x2009C000 | REGION_Valid | 3),
(REGION_Enabled | DEVICE_NON_SHAREABLE | REGION_16K | FULL_ACCESS)
};

uint32_t table2[] = {
/* Configure region 4 to cover 512KB APB Peripherals (Device, Non-Shared, Full Access) */
(0x40000000 | REGION_Valid | 4),
(REGION_Enabled | DEVICE_NON_SHAREABLE | REGION_512K | FULL_ACCESS),
/* Configure region 5 to cover 2MB AHB Peripherals (Device, Non-Shared, Full Access) */
(0x50000000 | REGION_Valid | 5),
(REGION_Enabled | DEVICE_NON_SHAREABLE | REGION_2M | FULL_ACCESS),
/* Configure region 6 to cover the 1MB PPB (Device, XN, Privileged Read-Write only) */
(0xE0000000 | REGION_Valid | 6),
(REGION_Enabled | NOT_EXEC | DEVICE_NON_SHAREABLE | REGION_1M | P_RW_U_NA),
};

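#include <string.h>   /* memcpy */

void lpc1768_mpu_config_tbl(void)
{
   /* Disable MPU */
   MPU->CTRL = 0;
   /* Each table fills RBAR/RASR and their aliases in one go */
   memcpy((void*)&( MPU->RBAR), table1, sizeof(table1));
   memcpy((void*)&( MPU->RBAR), table2, sizeof(table2));
   /* Enable MPU */
   MPU->CTRL = 1;
   __DSB();   /* wait for the write that enables the MPU to complete */
   __ISB();   /* flush the pipeline so later instructions see the new settings */
}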
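
How a table lands in the aliased registers can be tried out on a PC with a mock register block. This is only a sketch: `mock_mpu`, `demo_table` and `mpu_load_table` are hypothetical names, and the word-by-word loop is offered as an alternative to memcpy for cases where each register write should be a single 32-bit access.

```c
#include <stddef.h>
#include <stdint.h>

/* Order of RBAR/RASR and their three alias pairs in the register block. */
enum { RBAR, RASR, RBAR_A1, RASR_A1, RBAR_A2, RASR_A2, RBAR_A3, RASR_A3,
       MPU_ALIAS_WORDS };

/* Mock register block standing in for the real MPU on a PC. */
static volatile uint32_t mock_mpu[MPU_ALIAS_WORDS];

/* Two regions' worth of example values: { RBAR, RASR } pairs. */
static const uint32_t demo_table[4] = {
    0x00000010u, 0x06080025u,    /* region 0 */
    0x10000011u, 0x1308001Du     /* region 1 */
};

/* Copy a table into the register block one 32-bit word at a time.
   At most MPU_ALIAS_WORDS words are written; returns the count written. */
size_t mpu_load_table(volatile uint32_t *regs, const uint32_t *table, size_t words)
{
    size_t i;

    if (words > MPU_ALIAS_WORDS) {
        words = MPU_ALIAS_WORDS;
    }
    for (i = 0; i < words; ++i) {
        regs[i] = table[i];
    }
    return words;
}
```

After `mpu_load_table(mock_mpu, demo_table, 4)`, the second pair of table entries sits in the RBAR_A1 and RASR_A1 slots, which is what makes a single copy program several regions.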

I hope this is enough to get you started with your ARMv7-M MPU.

L-values, r-values, expressions and types

February 18th, 2013

Simple question: Why does this code compile?

image

…and this code doesn’t?

image

The compiler gives the following:

image

L-values

What is this ‘l-value’ thing? When (most of us) were taught C we were told an l-value is a value that can be placed on the left-hand-side of an assignment expression. However, that doesn’t give much of a clue as to what might constitute an l-value; so most of the time we resort to guessing and trial-and-error.

Basically, an l-value is a named object, which may be modifiable or non-modifiable. A named object is a region of memory you’ve given a symbolic name to (in human-speak: a variable). A literal has no name (it’s what we call a value-type) so that can’t be an l-value. A const variable is a non-modifiable l-value, so you can’t put it on the left-hand-side of an assignment. Therefore our rule becomes:

Only modifiable l-values can go on the left-hand-side of an assignment statement.
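
The rule can be tried out with a few lines of C. This is an illustrative sketch rather than the listing from the original post; the variable names and the function `lvalue_demo` are made up for the example, and the rejected lines are left in a comment so the file still compiles.

```c
/* Returns 30 after two legal assignments. */
int lvalue_demo(void)
{
    int       a = 10;    /* a is a modifiable l-value */
    const int b = 20;    /* b is a non-modifiable l-value */
    int       c;

    c = a + b;           /* OK: c is a modifiable l-value */
    a = c;               /* OK: a is a modifiable l-value */

    /* Each of these is rejected by the compiler if uncommented:
         b = 5;          b is a non-modifiable l-value
         10 = a;         a literal is not an l-value
         a + b = c;      a + b does not yield an l-value
    */
    return a;
}
```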

R-values

An r-value, if we take the traditional view, is a value that can go on the right-hand-side of an assignment. The following compiles fine:

image

By our definition, literals are r-values. But what about the variable a? We defined it as a (non-modifiable) l-value, so are l-values also r-values? We need a better definition for r-values:

An r-value is an un-named object; or, if you prefer, a temporary variable.

An r-value can never go on the left-hand-side of an assignment since it will not exist beyond the end of the statement it is created in.

(As an aside: C++ differentiates between modifiable and non-modifiable r-values. C doesn’t make this distinction and treats all r-values as non-modifiable)

Expressions

Where do r-values come from? C is an expression-based language. An expression is a collection of operands and operators that yields a value. In fact, even an operand on its own is considered an expression (which yields its value). The value yielded by an expression is an object (a temporary variable) – an r-value.

Type of an expression

If an r-value is an object (variable) it must, since C is a typed language, have a type. The type of an expression is the type of its r-value.

It’s worth having a look at some examples.

In the case of an assignment expression (y = x) the result of the expression is the value of the left-hand-side after assignment; the type is the type of the left-hand side.

If the expression uses a relational or logical operator (a != b) the result is an integer (an ‘effective Boolean’) with a value 0 or 1.

For mathematical expressions, the type of the expression is the same as the largest operand-type. So, for example:

image

C will promote, or convert, from smaller types to larger types automatically (if possible).
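
The effect of these conversions can be observed with sizeof, which reports the size of an expression’s type without evaluating it. The sketch below uses made-up function names; note that operands narrower than int (such as char) are first promoted to int, so “largest operand-type” is a simplification for small types.

```c
#include <stddef.h>

/* char + char: both operands are promoted to int before the addition. */
size_t size_of_char_sum(void)
{
    char a = 1, b = 2;
    return sizeof(a + b);
}

/* int + long: the int operand is converted to long. */
size_t size_of_int_plus_long(void)
{
    int  i = 1;
    long l = 2;
    return sizeof(i + l);
}

/* float + double: the float operand is converted to double. */
size_t size_of_float_plus_double(void)
{
    float  f = 1.0f;
    double d = 2.0;
    return sizeof(f + d);
}

/* Both operands are int, so the division is done in int and only the
   result is converted to double. */
double average_truncated(int total, int count)
{
    return total / count;
}

/* Converting one operand first makes the whole division a double one. */
double average_promoted(int total, int count)
{
    return (double)total / count;
}
```

With a total of 7 and a count of 2, the first average function returns 3.0 and the second returns 3.5, which is the usual place this rule bites in practice.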

Finally, function calls are considered r-values, with the type being the function’s return type. Thus, a function call can never be placed on the left-hand-side of an assignment. There is a special-case exception to this: functions that take a pointer as a parameter, and then return that pointer as the return value, can (once the result is dereferenced) be treated as l-values. For example:

image

(By the way, I don’t condone writing code like this!)
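
A sketch of the pointer-returning case follows; it is not the listing from the original post, and `pick_larger` and `pointer_return_demo` are hypothetical names. The call itself yields a pointer value, and it is the dereference of that pointer that gives something assignable.

```c
/* Returns whichever pointer refers to the larger value. */
int *pick_larger(int *p, int *q)
{
    return (*p >= *q) ? p : q;
}

/* Zeroes the larger of two local variables through the returned pointer. */
int pointer_return_demo(void)
{
    int a = 3;
    int b = 7;

    *pick_larger(&a, &b) = 0;    /* assigns to b via the returned pointer */

    return a + b;                /* 3 + 0 */
}
```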

Back to the original question, then:

image

This compiles because c is a modifiable l-value; therefore can sit on the left-hand-side of an assignment. a + b is an expression that yields a (non-modifiable) r-value. Because of this, the code below cannot work:

image

And finally…

I’ll leave the following as an exercise for the reader (no prizes, though!)

Why does this compile:

image

…and this doesn’t:

image

Developing a Generic Hard Fault handler for ARM Cortex-M3/Cortex-M4

February 1st, 2013

This posting assumes that you have a working ARM Cortex-M3 base project in Keil uVision. If not, please see the “howto” video: Creating ARM Cortex-M3 CMSIS Base Project in uVision

Divide by zero error

Given the following C function

int div(int lho, int rho)
{
    return lho/rho;
}

called from main with these arguments

int main(void)
{
   int a = 10;
   int b = 0;
   int c;
   c = div(a, b);
   // other code
}

You would expect a hardware “divide-by-zero” (div0) error. Possibly surprisingly, by default the Cortex-M3 will not report the error but return zero.

Configuration and Control Register (CCR)

To enable hardware reporting of div0 errors we need to configure the CCR. The CCR is part of the Cortex-M’s System Control Block (SCB) and controls trapping of divide by zero and unaligned accesses, among other things. The CCR bit assignment for div0 is:

[4] DIV_0_TRP: Enables faulting or halting when the processor executes an SDIV or UDIV instruction with a divisor of 0: 0 = do not trap divide by 0; 1 = trap divide by 0. When this bit is set to 0, a divide by zero returns a quotient of 0.

So to enable DIV_0_TRP, we can use CMSIS definitions for the SCB and CCR, as in:

SCB->CCR |= 0x10;
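
The magic number 0x10 is simply bit 4. A named constant makes the intent clearer, and CMSIS provides SCB_CCR_DIV_0_TRP_Msk for this purpose. The sketch below mirrors that on a PC with a mock register; `mock_ccr`, the `CCR_*` names and the two functions are assumptions made so the example is self-contained.

```c
#include <stdint.h>

#define CCR_UNALIGN_TRP  (1u << 3)    /* trap unaligned accesses */
#define CCR_DIV_0_TRP    (1u << 4)    /* trap divide by zero */

/* Mock register standing in for the real CCR on a PC. */
static volatile uint32_t mock_ccr = 0u;

/* Set the divide-by-zero trap bit, leaving the other bits unchanged.
   Returns the new register value. */
uint32_t enable_div0_trap(volatile uint32_t *ccr)
{
    *ccr |= CCR_DIV_0_TRP;
    return *ccr;
}

/* Report whether the divide-by-zero trap is currently enabled. */
int div0_trap_enabled(const volatile uint32_t *ccr)
{
    return (*ccr & CCR_DIV_0_TRP) != 0u;
}
```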

If we now build and run the project, you will need to stop execution as it will appear to run forever. When execution is stopped you’ll find the debugger stopped in the file startup_ARMCM3.s in the CMSIS default Hard Fault exception handler:

CMSIS_handler

Override the Default HardFault_Handler

As all the exception handlers are built with “Weak” linkage in CMSIS, it is very easy to create your own Hard Fault handler. Simply define a function with the name “HardFault_Handler”, as in:

void HardFault_Handler(void)
{ 
   while(1); 
}


If we now build, run and then stop the project, we’ll find the debugger will be looping in our new handler rather than the CMSIS default one (alternatively we could put a breakpoint at the while(1) line in the debugger).

Empty_handler

Rather than having to enter breakpoints via your IDE, I like to force the processor to enter debug state automatically if a certain instruction is reached (a sort of debug based assert). Inserting the BKPT (breakpoint) ARM instruction in our code will cause the processor to enter debug state. The immediate following the opcode normally doesn’t matter (but always check) except it shouldn’t be 0xAB (which is used for semihosting).

#include "ARMCM3.h" 
void HardFault_Handler(void)
{
  __ASM volatile("BKPT #01"); 
  while(1); 
}

If we now build and run, the program execution should break automatically at the BKPT instruction.

BKPT

Error Message Output

The next step in developing the fault handler is the ability to report the fault. One option is, of course, to use stdio (stderr) and semihosting. However, as the support for semihosting can vary from compiler to compiler, I prefer to use Instrumented Trace Macrocell (ITM) utilizing the CMSIS wrapper function ITM_SendChar, e.g.

void printErrorMsg(const char * errMsg)
{
   while(*errMsg != '\0'){
      ITM_SendChar(*errMsg);
      ++errMsg;
   }
}

InitalErrMsg

Printf1

Fault Reporting

Now that we have a framework for the Hard Fault handler, we can start reporting on the actual fault details. Within the Cortex-M3’s System Control Block (SCB) is the HardFault Status Register (SCB->HFSR). Luckily again for us, CMSIS has defined symbols allowing us to access this register:

void HardFault_Handler(void)
{
   static char msg[80];
   printErrorMsg("In Hard Fault Handler\n");
   sprintf(msg, "SCB->HFSR = 0x%08x\n", SCB->HFSR);
   printErrorMsg(msg);
   __ASM volatile("BKPT #01");
   while(1);
}

Building and running the application should now result in the following output:

HFSR

By examining the HFSR bit configuration, we can see that the FORCED bit is set.

HFSR2

When this bit is set to 1, the HardFault handler must read the other fault status registers to find the cause of the fault.
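
The HFSR has only three defined flags, so decoding it is short. The following is a sketch that runs on a PC; the function `hfsr_describe` and the `HFSR_*` names are made up for the example, with bit positions taken from the ARMv7-M register layout.

```c
#include <stdint.h>

#define HFSR_VECTTBL   (1u << 1)     /* fault on vector table read */
#define HFSR_FORCED    (1u << 30)    /* escalated configurable fault */
#define HFSR_DEBUGEVT  (1u << 31)    /* debug event */

/* Return a short name for the first HFSR flag found set. */
const char *hfsr_describe(uint32_t hfsr)
{
    if ((hfsr & HFSR_FORCED) != 0u) {
        return "FORCED";
    }
    if ((hfsr & HFSR_VECTTBL) != 0u) {
        return "VECTTBL";
    }
    if ((hfsr & HFSR_DEBUGEVT) != 0u) {
        return "DEBUGEVT";
    }
    return "none";
}
```

A value of 0x40000000, as in the divide-by-zero case, decodes to FORCED, which is the cue to go on and read the CFSR.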

Configurable Fault Status Register (SCB->CFSR)

A forced hard fault may be caused by a bus fault, a memory fault, or as in our case, a usage fault. For brevity, here I am only going to focus on the Usage Fault and cover the Bus and Memory faults in a later posting (as these have additional registers to access for details).

CFSR

So given what we know to date, our basic Fault Handler:

  • checks if the FORCED bit is set ( if ((SCB->HFSR & (1 << 30)) != 0) )
  • prints out the contents of the CFSR

void HardFault_Handler(void)
{
   static char msg[80];
   printErrorMsg("In Hard Fault Handler\n");
   sprintf(msg, "SCB->HFSR = 0x%08x\n", SCB->HFSR);
   printErrorMsg(msg);
   if ((SCB->HFSR & (1 << 30)) != 0) {
       printErrorMsg("Forced Hard Fault\n");
       sprintf(msg, "SCB->CFSR = 0x%08x\n", SCB->CFSR );
       printErrorMsg(msg);
   }
   __ASM volatile("BKPT #01");
   while(1);
}

Running the application should now result in the following output:

CFSR

The output indicates that bit 25 of the CFSR is set; this bit lies in the UsageFault Status Register (UFSR) part of the CFSR.

UsageFault Status Register

The bit configuration of the UFSR is shown below, and unsurprisingly the output shows that bit 9 (DIVBYZERO) is set.

UFSR

We can now extend the HardFault handler to mask the top half of the CFSR, and if not zero then further report on those flags, as in:

void printUsageErrorMsg(uint32_t CFSRValue);   // forward declaration, defined below
void HardFault_Handler(void)
{
   static char msg[80];
   printErrorMsg("In Hard Fault Handler\n");
   sprintf(msg, "SCB->HFSR = 0x%08x\n", SCB->HFSR);
   printErrorMsg(msg);
   if ((SCB->HFSR & (1 << 30)) != 0) {
       printErrorMsg("Forced Hard Fault\n");
       sprintf(msg, "SCB->CFSR = 0x%08x\n", SCB->CFSR );
       printErrorMsg(msg);
       if((SCB->CFSR & 0xFFFF0000) != 0) {
         printUsageErrorMsg(SCB->CFSR);
      }
   }
   __ASM volatile("BKPT #01");
   while(1);
}
void printUsageErrorMsg(uint32_t CFSRValue)
{
   printErrorMsg("Usage fault: ");
   CFSRValue >>= 16;                  // right shift to lsb
   if((CFSRValue & (1 << 9)) != 0) {
      printErrorMsg("Divide by zero\n");
   }
}

A run should now result in the following output:

Div0msg
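The handler above only reports the divide-by-zero flag. As a sketch of how the remaining usage faults might be reported, the following table-driven version covers the six UFSR flags defined for the Cortex-M3. The function name, the table and the message strings are my own assumptions, and it writes into a caller-supplied buffer so that it can be tried on a host machine rather than calling printErrorMsg directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One entry per UFSR flag: bit mask within the UFSR and a message. */
typedef struct {
    uint16_t    mask;
    const char *text;
} UsageFlag;

static const UsageFlag usage_flags[] = {
    { 1u << 0, "Undefined instruction" },   /* UNDEFINSTR */
    { 1u << 1, "Invalid state" },           /* INVSTATE   */
    { 1u << 2, "Invalid PC load" },         /* INVPC      */
    { 1u << 3, "No coprocessor" },          /* NOCP       */
    { 1u << 8, "Unaligned access" },        /* UNALIGNED  */
    { 1u << 9, "Divide by zero" },          /* DIVBYZERO  */
};

/* Hypothetical helper: writes a comma-separated list of the usage faults
   flagged in a raw CFSR value into buf and returns how many were found. */
int describeUsageFault(uint32_t cfsr, char *buf, size_t len)
{
    uint32_t ufsr = cfsr >> 16;             /* right shift to lsb */
    int count = 0;
    size_t i;

    if (len == 0) {
        return 0;
    }
    buf[0] = '\0';
    for (i = 0; i < sizeof usage_flags / sizeof usage_flags[0]; ++i) {
        if ((ufsr & usage_flags[i].mask) != 0) {
            size_t used = strlen(buf);
            snprintf(buf + used, len - used, "%s%s",
                     count ? ", " : "", usage_flags[i].text);
            ++count;
        }
    }
    return count;
}
```

The resulting string can then be passed to printErrorMsg in the same way as the other messages.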

Register dump

One final thing we can do as part of any fault handler is to dump out known register contents as they were at the time of the exception. One really useful feature of the Cortex-M architecture is that a core set of registers is automatically stacked (by the hardware) as part of the exception handling mechanism. The set of stacked registers is shown below:

IntEntryStack
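Because the stacking order is fixed by the hardware (r0, r1, r2, r3, r12, lr, pc and xPSR, from the lowest address upwards), the frame can also be sketched as a C struct. The type, field and function names here are my own (hypothetical); only the order is architectural. The static assertions need a C11 compiler, so treat this as a host-side illustration rather than something to drop straight into an older toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Basic eight-word frame pushed by the Cortex-M3 on exception entry.
   Names are illustrative; only the order is fixed by the hardware. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* return address of the interrupted function call */
    uint32_t pc;    /* address of the interrupted/faulting instruction */
    uint32_t psr;   /* xPSR */
} StackFrame;

_Static_assert(sizeof(StackFrame) == 8 * sizeof(uint32_t),
               "frame must be exactly eight words");
_Static_assert(offsetof(StackFrame, pc) == 6 * sizeof(uint32_t),
               "pc is the seventh stacked word");

/* Hypothetical helper: read the stacked pc given the stack pointer value
   captured on exception entry. */
uint32_t stacked_pc(const uint32_t *sp)
{
    return sp[offsetof(StackFrame, pc) / sizeof(uint32_t)];
}
```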

Using this knowledge in conjunction with AAPCS we can get access to these register values. First we modify our original HardFault handler by:

  • modifying its name to “Hard_Fault_Handler”
  • adding a parameter declared as an array of 32–bit unsigned integers.
void Hard_Fault_Handler(uint32_t stack[])

Based on AAPCS rules, we know that the parameter label (stack) will map onto register r0. We now implement the actual HardFault_Handler. This function simply copies the current Main Stack Pointer (MSP) into r0 and then branches to our Hard_Fault_Handler (this is based on ARM/Keil syntax):

__asm void HardFault_Handler(void) 
{
  MRS r0, MSP
  B __cpp(Hard_Fault_Handler) 
}

Finally we implement a function to dump the stack values based on their relative offset, e.g.

enum { r0, r1, r2, r3, r12, lr, pc, psr};

void stackDump(uint32_t stack[])
{
   static char msg[80];
   sprintf(msg, "r0  = 0x%08x\n", stack[r0]);  printErrorMsg(msg);
   sprintf(msg, "r1  = 0x%08x\n", stack[r1]);  printErrorMsg(msg);
   sprintf(msg, "r2  = 0x%08x\n", stack[r2]);  printErrorMsg(msg);
   sprintf(msg, "r3  = 0x%08x\n", stack[r3]);  printErrorMsg(msg);
   sprintf(msg, "r12 = 0x%08x\n", stack[r12]); printErrorMsg(msg);
   sprintf(msg, "lr  = 0x%08x\n", stack[lr]);  printErrorMsg(msg);
   sprintf(msg, "pc  = 0x%08x\n", stack[pc]);  printErrorMsg(msg);
   sprintf(msg, "psr = 0x%08x\n", stack[psr]); printErrorMsg(msg);
}

This function can then be called from the Fault handler, passing through the stack argument. Running the program should result in the following output:

StackDump

Examining the output, we can see that the program counter (pc) is reported as being the value 0x00000272, giving us the address of the instruction that generated the fault. We can disassemble the image using the command:

fromelf -c CM3_Fault_Handler.axf --output listing.txt

Trawling through the listing (listing.txt), we can see the SDIV instruction at the offending address (note also that r2 contains 10 and r1 the offending 0).

.text
div
0x00000270: 4602          MOV r2,r0
0x00000272: fb92f0f1      SDIV r0,r2,r1
0x00000276: 4770          BX lr
.text

Finally, if you’re going to use the privilege/non–privilege model, you’ll need to modify the HardFault_Handler to detect which stack pointer was in use when the exception occurred, because that is the stack holding the saved registers. This can be achieved by checking bit 2 of the HardFault_Handler’s Link Register (lr) value, which holds the EXC_RETURN code on exception entry. Bit 2 determines whether, on return from the exception, the Main Stack Pointer (MSP) or Process Stack Pointer (PSP) is used; the mask for bit 2 is the value 4 used in the TST instruction.

__asm void HardFault_Handler(void)
{
  TST lr, #4     // Test for MSP or PSP
  ITE EQ
  MRSEQ r0, MSP
  MRSNE r0, PSP
  B __cpp(Hard_Fault_Handler)
}
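To make the test on lr a little less magic, here is a host-side sketch that decodes the relevant EXC_RETURN bits. The type and function names are my own (hypothetical); the assumptions are that bit 2 selects the stack (0 = MSP, 1 = PSP) and bit 3 selects the mode returned to (0 = Handler, 1 = Thread), as on the Cortex-M3.

```c
#include <stdint.h>

typedef enum { STACK_MSP, STACK_PSP } StackSelect;

/* Bit 2 of EXC_RETURN: 0 = registers were stacked on MSP, 1 = on PSP.
   This is the same test as TST lr, #4 in the assembler above. */
StackSelect stack_used(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0 ? STACK_PSP : STACK_MSP;
}

/* Bit 3 of EXC_RETURN: 0 = return to Handler mode, 1 = return to Thread mode. */
int returns_to_thread_mode(uint32_t exc_return)
{
    return (exc_return & (1u << 3)) != 0;
}
```

For the three EXC_RETURN values a Cortex-M3 can produce, 0xFFFFFFF1 decodes as Handler mode using MSP, 0xFFFFFFF9 as Thread mode using MSP, and 0xFFFFFFFD as Thread mode using PSP.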

In a later post I shall develop specific handlers for all three faults.

Notes:

  1. The initial model for fault handling can be found in Joseph Yiu’s excellent book “The Definitive Guide to the ARM Cortex-M3”
  2. The code shown was built using the ARM/Keil MDK-ARM Version 4.60 development environment (a 32 KB size-limited evaluation is available from the Keil website)
  3. The code, deliberately, has not been refactored to remove the hard-coded (magic) values.
  4. Code available on GitHub at git://github.com/feabhas/CM3_Fault_Handler.git

Weak linkage in C programming

January 25th, 2013

When linking C programs there are (in general) only a couple of errors you’re likely to see. If, for example, you have two functions with the same name in different files, both with external linkage, then the files will compile okay, but when you link you’ll likely see an error along these lines:

linking…
weak_linkage.axf: Error: L6200E: Symbol foo multiply defined (by foo.o and foo2.o).
Target not created

Most of the time this makes sense and is as expected; however there is a particular instance where it gets in the way.

If we need to supply a code framework where we need placeholders (stubs) for someone else to fill in at a later date, it can sometimes mean developing complex makefiles and/or conditional compilation to allow new code to be introduced as seamlessly as possible.

However, there is a hidden gem supported by most linkers called “weak linkage”. The principle of weak linkage is that you can define a function and tag it as (unsurprisingly) weak, e.g.

// foo_weak.c
__weak int foo(void)
{
   // ...
   return 1;
}

This then can be called from the main application:

// main.c
int foo(void);

int main(void)
{
   foo();
   while(1);
}

This project can be built as normal:

compiling main.c…
compiling foo_weak.c…
linking…
Program Size: Code=372 RO-data=224 RW-data=4 ZI-data=4196
“weak_linkage.axf” – 0 Error(s), 0 Warning(s).

At some later time we can add another file with the same function signature to the project:

// foo.c
int foo(void)
{
   // override weak function
   return 2;
}

If we rebuild, normally we would get the “multiply defined” symbols error, however with weak linkage the linker will now bind the new “strong” function to the call in main.

compiling main.c…
compiling foo_weak.c…
compiling foo.c…
linking…
Program Size: Code=372 RO-data=224 RW-data=4 ZI-data=4196
“weak_linkage.axf” – 0 Error(s), 0 Warning(s).

As you can also see from the unchanged program size, the weak function is optimized away.

A good example of the use of weak linkage is the definition of the default interrupt handlers in CMSIS.

This example code is based on Keil’s uVision v4.60 compiler/linker, however both GCC and IAR also support weak linkage.
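For GCC the spelling is different: rather than the __weak keyword, the weak default is tagged with a function attribute. The following is a minimal sketch assuming a GCC-compatible compiler (GCC or Clang); it reuses the foo example from above.

```c
/* foo_weak.c, GCC-style spelling of the weak default */
__attribute__((weak)) int foo(void)
{
    /* default implementation, used only when no strong foo() is linked in */
    return 1;
}
```

Built on its own, a call to foo() returns 1. Adding the foo.c shown earlier to the link should replace this definition with the strong one, exactly as in the Keil example.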

Sailing the Seven C’s of design*

January 18th, 2013

I’m always looking for nice little mnemonics to help remember the important concepts in design.  Here’s one for model-driven development I call the “Seven C’s”.  It basically enumerates the seven stages a design goes through, from initial idea to code.

CONCEPT
The Concept phase is about understanding the problem.  In other words: requirements analysis.  When you’re in Concept mode your main focus is on validation – am I solving the right problem for my customer?

CREATION
In Creation mode you are synthesising a solution.  At this stage we are building an Ideal model, ignoring many of the complications of the real world.  Our design should be completely concurrent (every object has its own thread of control), completely asynchronous (messaging) and completely flat (no hierarchy).
Your focus should be on:

  • Design verification – if I can’t demonstrate an ideal design works then there’s no way a less-than-ideal design will work!
  • Design evaluation – as with all design: is this the best compromise?

COMPOSITION
Also known as ‘design levelling’ (but that doesn’t start with C!).  Your Ideal model probably has some high-level elements (components, etc) that need further refinement; and plenty of low-level elements (objects) that can be collected together to perform ‘emergent’ behaviours.  Design levelling is the act of (re)organising the design into appropriate levels of abstraction – sub-systems, components, objects, etc.

CONCURRENCY
Our Ideal model treats every element as concurrent, but experience tells us that isn’t practical for a real software system, so we reduce the number of concurrent elements in our system to get the most effective solution with the smallest number of threads.

CONSTRUCTION
Like any other construction project our plans must contain enough information so that skilled implementers can build our dream.  In our case the construction plans are our class diagrams, composite structures, state machines and algorithm descriptions.
Once in the construction stage our focus is on verification – does this design still fulfil its requirements?

CORRUPTION
Our construction models may be complete and may be translatable directly to code.  However, the result of that translation may not be ideal; and may not benefit from many of the features of our chosen implementation language.  Corruption is the act of modifying our design to make the most effective use of our implementation language and platform.

CODE
The end result.

How does this all map onto typical model-driven-design nomenclature?  Well, pretty much like this:

Computationally-Independent Model (CIM)
•    Concept

Platform-Independent Model (PIM)
•    Creation
•    Composition
•    Concurrency
•    Construction

Platform-Specific Model (PSM)
•    Corruption
•    Code

*  Apologies to all the grammar fanatics out there for my use of the “grocer’s apostrophe”

Default construction and initialisation in C++11

November 22nd, 2012

Default constructors

In C++ if you don’t provide a constructor for the class the compiler provides one for you (that does nothing):

image

This, of course, isn’t very useful so typically we write our own constructors to initialise the attributes of our class.  However, as soon as you write a (non-default) constructor the compiler stops providing the default constructor:

image

The normal solution is to write our own default constructor.  In C++11 there is syntax to allow you to explicitly create the compiler-supplied constructor without having to write the definition yourself:

image

So far, this doesn’t seem to have gained us very much; if anything it feels like syntax for the sake of it.

Non-static member initialisation

C++ has, for some time, allowed initialisation of static const integral member variables in the class declaration.

image

This was allowed because the static member is allocated at compile time.  C++11 has extended this notation to include non-static members as well:

image

What this syntax means is: in the absence of any other initialisation the value of data should be set to 100.  This is, in essence, a little bit of syntactic sugar around the member initialiser list (MIL) to save you having to write the following:

image

Putting it all together

Combining non-static member initialisation with default constructors allows us to write code like this:

image

There are possibly two ways to view this:  either it’s a new (and uncomfortable) syntax to learn that just adds more variety (and confusion) to the language; or C++’s default initialisation was ‘broken’ (and we learned to compensate) and this is the syntax that should always have been.
