Making things do stuff – Part 1

C has long been the language of choice for smaller, microcontroller-based embedded systems; particularly for close-to-the-metal hardware manipulation.

C++ was originally conceived with a bias towards systems programming; performance and efficiency being key design highlights.  Traditionally, many of the advancements in compiler technology, optimisation, etc., had centred around generating code for PC-like platforms (Linux, Windows, etc.).  In the last few years C++ compiler support for microcontroller targets has advanced dramatically, to the point where Modern C++ is an increasingly attractive language for embedded systems development.

In this set of articles we will explore how to use Modern C++ to manipulate hardware on a typical embedded microcontroller.

As the articles progress we’ll look at how we can use C++’s features to hide the actual underlying hardware of our target system and provide an abstract hardware API that developers can work to.  We’ll explore the performance (in terms of memory and code size) of these abstractions compared to their C counterparts.

We’ll begin by having a look at the very basics of hardware manipulation – accessing hardware devices and bit manipulation.

Port vs memory-mapped Input/Output (I/O)

Memory-Mapped Input/Output (MMIO) and port Input/Output (also called port-mapped I/O or PMIO) are two complementary methods of performing input/output between the CPU and I/O devices in a target.

PMIO uses a special class of CPU instructions specifically for performing I/O. This is generally found on Intel microprocessors, specifically the IN and OUT instructions which can read and write a single byte to an I/O device. I/O devices have a separate address space from general memory, either accomplished by an extra “I/O” pin on the CPU’s physical interface, or an entire bus dedicated to I/O.

PMIO is usually accessed via special compiler-specific intrinsic functions; that is, non-standard C++.  For that reason, I’ll exclude PMIO from this discussion.

MMIO uses the same bus to address both memory and I/O devices, and the CPU instructions used to read and write to memory are also used in accessing I/O devices. In order to accommodate the I/O devices, areas of CPU addressable space must be reserved for I/O rather than memory. The I/O devices monitor the CPU’s address bus and respond to any CPU access of their assigned address space, mapping the address to their hardware registers.

It is important to note that the two techniques are not normally found on the same architecture; although, for example, on older PCs video RAM would be memory-mapped and all other I/O devices would use port I/O.

Accessing hardware from C++

The problem from a language perspective is that the compiler can only see objects that have been declared; and by definition those objects will be in the memory areas.  There is no direct way of accessing I/O addresses from C++.  The way round this is to indirectly access the memory, using pointers.

There are two things to take into consideration with this method:

  • We must select a pointer type that matches our hardware register
  • We have to ‘force’ an address into the pointer

The type of a pointer tells the compiler how many bytes to access, how to interpret the bits and the valid behaviours that can be performed on that memory location.  In configuration-type registers normally each bit has its own significance; for data-type registers the value is normally held as a ‘raw’ number.  For that reason, hardware access registers should always be qualified as unsigned.  The actual type depends on the size of the hardware register.

unsigned char*  reg_8  { };  // 8-bit
unsigned short* reg_16 { };  // 16-bit
unsigned long*  reg_32 { };  // 32-bit

Note:

A 32-bit register (for example) may be accessible (depending on hardware) as either 8-bits, 16-bits or 32-bits.

However, using a pointer-to-32-bit to access an 8-bit register is undefined.  You risk overwriting adjacent registers; and even reads could have unintended side-effects (for example, reading some hardware registers has the effect of clearing them).
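
To see why the access width matters, here is a host-side sketch.  A four-byte struct stands in for four adjacent 8-bit registers; FakeRegisterBank, store_8, store_32 and width_demo are hypothetical names used only for illustration, not part of any real device API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for four adjacent 8-bit hardware registers.
struct FakeRegisterBank
{
  std::uint8_t regs[4];
};

static_assert(sizeof(FakeRegisterBank) == sizeof(std::uint32_t),
              "the fake bank must be exactly 32 bits wide");

// Models an 8-bit store: only the addressed register changes.
inline void store_8(FakeRegisterBank& bank, std::size_t index, std::uint8_t value)
{
  bank.regs[index] = value;
}

// Models a 32-bit store aimed at the first register: all four bytes are written.
inline void store_32(FakeRegisterBank& bank, std::uint32_t value)
{
  std::memcpy(bank.regs, &value, sizeof value);
}

inline void width_demo()
{
  FakeRegisterBank bank { { 0x11, 0x22, 0x33, 0x44 } };

  store_8(bank, 0, 0x00);        // Neighbouring registers are untouched
  assert(bank.regs[1] == 0x22);

  store_32(bank, 0);             // Neighbouring registers are overwritten too
  assert(bank.regs[1] == 0x00);
}
```

On real hardware the damage is done by the bus access itself; the sketch only models the effect on the neighbouring bytes.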

To be more explicit it’s probably good practice to use specific-width aliases:

#include <cstdint>

int main()
{
  std::uint8_t*  reg_8  { };
  std::uint16_t* reg_16 { };
  std::uint32_t* reg_32 { };
  ...

}
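
The difference between the built-in types and the fixed-width aliases can be checked at compile time.  The sketch below is illustrative only; the constant unsigned_long_is_32_bits is a hypothetical name introduced here.

```cpp
#include <climits>
#include <cstdint>

// A std::uintN_t alias is only provided where an unsigned type of exactly
// N bits (with no padding) exists, so these checks pass wherever this compiles.
static_assert(sizeof(std::uint8_t)  * CHAR_BIT == 8,  "uint8_t is 8 bits");
static_assert(sizeof(std::uint16_t) * CHAR_BIT == 16, "uint16_t is 16 bits");
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits");

// The built-in types only promise minimum widths: unsigned long is at least
// 32 bits, and is 64 bits on typical 64-bit Linux targets.
constexpr bool unsigned_long_is_32_bits { sizeof(unsigned long) * CHAR_BIT == 32 };
```

If unsigned_long_is_32_bits is false on your toolchain, a pointer-to-unsigned-long would not match a 32-bit register.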

Our next task is to ‘force’ a register address into the pointer.  As an integral value cannot be assigned to a pointer, we must cast the value. The correct C++ way is using reinterpret_cast<> (in fact, it’s one of the few uses of reinterpret_cast<>).

(NOTE:  I’ve used an inline function to cast my pointers.  Normally I wouldn’t code like this but I’m limited on screen real-estate and they reduce clutter in the code.)

#include <cstdint>

// Just to reduce code clutter
//
using std::uint8_t;
using std::uint16_t;
using std::uint32_t;

inline
uint8_t* reg08_ptr(uint32_t addr)
{
  return reinterpret_cast<uint8_t*>(addr);
}

inline
uint16_t* reg16_ptr(uint32_t addr)
{
  return reinterpret_cast<uint16_t*>(addr);
}

inline
uint32_t* reg32_ptr(uint32_t addr)
{
  return reinterpret_cast<uint32_t*>(addr);
}

int main()
{
  uint8_t*  reg_8  { reg08_ptr(0x40020000) };
  uint16_t* reg_16 { reg16_ptr(0x40020010) };
  uint32_t* reg_32 { reg32_ptr(0x40020020) };
  ...
}

Now we’ve set up the pointer we can access the register indirectly.

int main()
{
  uint8_t*  reg_8  { reg08_ptr(0x40020000) };
  uint16_t* reg_16 { reg16_ptr(0x40020010) };
  uint32_t* reg_32 { reg32_ptr(0x40020020) };

  uint8_t value { };  

  value = *reg_8;         // Read
  *reg_8 = 0;             // Write
}

Since our hardware registers are at fixed locations in memory (and should never change!) we can improve the optimisation capacity of the compiler by making the pointers constant.

int main()
{
  uint8_t*  const reg_8  { reg08_ptr(0x40020000) };
  uint16_t* const reg_16 { reg16_ptr(0x40020010) };
  uint32_t* const reg_32 { reg32_ptr(0x40020020) };

  uint8_t value { };  

  value = *reg_8;         // Read
  *reg_8 = 0;             // Write; also OK. Register is not const.
}

Remember, the const refers to the pointer, not the object being addressed (the register).

(It might seem compelling at this point to consider making the pointers a constexpr.  After all, the pointer is fixed and can never change.  However, the C++ standard explicitly prohibits the results of reinterpret_cast in constant-expressions.  See here for more details)
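
One way to keep at least the address as a compile-time constant is to store it as an integer and cast only at the point of use.  This is a sketch rather than a recommendation; the names ctrl_addr and ctrl_reg are hypothetical, and std::uintptr_t is my assumption here (instead of uint32_t) so that the cast is also correctly sized on a 64-bit host.

```cpp
#include <cstdint>
#include <type_traits>

// The address is an ordinary integer, so it can be a constant expression.
constexpr std::uintptr_t ctrl_addr { 0x40020000 };

// The cast is not a constant expression, so this function cannot be constexpr.
// It is inline, and an optimiser will normally reduce it to the literal address.
inline volatile std::uint8_t* ctrl_reg()
{
  return reinterpret_cast<volatile std::uint8_t*>(ctrl_addr);
}

static_assert(std::is_same<decltype(ctrl_reg()), volatile std::uint8_t*>::value,
              "ctrl_reg() yields a pointer to a volatile 8-bit object");
```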

Registers and side-effects

One of the defining features of hardware registers is their value is dictated by the (current) state of the hardware; and not necessarily by the actions of the program.  This means, for example:

  • A write to a hardware register, followed by a read may not yield the same value. This is often true for write-only registers
  • Two sequential reads from a register may yield different results

We could therefore state that any access to a hardware register – read or write – might yield a side-effect.

However, the compiler can only reason about objects declared within the program.  It will base all its optimisations on the code that is presented.  For example:

int main()
{
  uint8_t* const ctrl { reg08_ptr(0x40020000) };
  uint8_t* const cfg  { reg08_ptr(0x40020001) };
  uint8_t* const data { reg08_ptr(0x40020002) };
  ...

  while(*data == 0)
  {
    // Wait for data to arrive...
  }
}

The code above could hang at run-time; particularly at higher optimisation levels.

From the compiler’s perspective there is no code to modify the object referenced by data.  Therefore, the compiler is free to optimise out the (apparently) redundant reads and simply read *data before the loop.

Similarly, the compiler is likely to optimise redundant writes; for example:

int main()
{
  uint8_t* const ctrl { reg08_ptr(0x40020000) };
  uint8_t* const cfg  { reg08_ptr(0x40020001) };
  uint8_t* const data { reg08_ptr(0x40020002) };

  *ctrl = 1;   // Enter configuration mode
  *cfg  = 3;   // Configure the device
  *ctrl = 0;   // Enter operational mode.
  ...
}

It is likely that the first write to *ctrl will be optimised away, since there is no read of *ctrl before the second write.

We have to inform the compiler that the object we are referencing via the pointer is a special case and therefore any optimisations (removing redundant reads / writes; or re-orderings) must be disabled for this object.  Enter the volatile qualifier:

int main()
{
  volatile uint8_t* const ctrl { reg08_ptr(0x40020000) };
  volatile uint8_t* const cfg  { reg08_ptr(0x40020001) };
  volatile uint8_t* const data { reg08_ptr(0x40020002) };

  *ctrl = 1;      // Redundant writes not optimised-out.
  *cfg  = 3;
  *ctrl = 0;
  ...

  while(*data == 0)  // Redundant reads not optimised-out
  {
    // Wait for data...
  }
}

It is therefore good practice to volatile-qualify the object addressed by every hardware-access pointer.  There is never a good reason not to.

We’ve now established the basic idiom for accessing hardware via pointers.

An aside: decluttering code

Our pointer declarations are starting to look a bit verbose.  We can use auto type-deduction to make our code cleaner:

// Modifying the casting functions
//
inline
volatile uint8_t* reg08_ptr(uint32_t addr)
{
  return reinterpret_cast<volatile uint8_t*>(addr);
}

int main()
{
  auto const ctrl { reg08_ptr(0x40020000) };
  auto const cfg  { reg08_ptr(0x40020001) };
  auto const data { reg08_ptr(0x40020002) };

  *ctrl = 1;         // Redundant writes not optimised-out.
  *cfg  = 3;
  *ctrl = 0;
  ...

  while(*data == 0)  // Redundant reads not optimised-out
  {
    // Wait for data...
  }
}

There are a few things to note in this new code:

Since auto uses the type of the initialiser to determine the type of the object, we must change the type of the reinterpret_cast to a volatile uint8_t*.

The const qualifier is applied after type-deduction and applies to the deduced type.  So in this case our pointers become

volatile uint8_t* const ctrl;
volatile uint8_t* const cfg;
volatile uint8_t* const data;
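
The deduced type can be confirmed with a compile-time check.  The sketch repeats the reg08_ptr helper, with the parameter changed to std::uintptr_t (my assumption, so the cast is well-sized on a 64-bit host); deduction_demo is a hypothetical name.

```cpp
#include <cstdint>
#include <type_traits>

using std::uint8_t;

inline volatile uint8_t* reg08_ptr(std::uintptr_t addr)
{
  return reinterpret_cast<volatile uint8_t*>(addr);
}

inline bool deduction_demo()
{
  auto const ctrl { reg08_ptr(0x40020000) };

  // auto deduces 'volatile uint8_t*'; const is then applied to the pointer.
  static_assert(std::is_same<decltype(ctrl), volatile uint8_t* const>::value,
                "ctrl is a const pointer to a volatile uint8_t");

  return ctrl != nullptr;   // The pointer is only compared, never dereferenced
}
```

Because the check is a static_assert, a mismatch would be reported by the compiler rather than at run-time.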

Although our idiom is explicit (and for that reason, preferred) all the pointer dereferencing can make code less-than-clean to read.

As a C programmer we might resort to the pre-processor to clean up the code:

// C programmer’s version
//
#define CTRL (*(volatile uint8_t*) 0x40020000)
#define CFG  (*(volatile uint8_t*) 0x40020010)
#define DATA (*(volatile uint8_t*) 0x40020020)

int main(void)
{
  CTRL = 1;
  CFG  = 3;
  CTRL = 0;

  while(DATA == 0)
  {
    //...
  }
}

The macros (CTRL, CFG, DATA) perform an inline cast of an integer to a pointer, then immediately dereference it to get an object.  So any access to CTRL (for example) will be an indirect access to the address 0x40020000.

In C++ we can use references to achieve the same effect.  One advantage of using references is that they will appear in the symbol table, making debugging easier.

// C++ programmer’s version
//
int main(void)
{
  auto& ctrl { * reg08_ptr(0x40020000) };
  auto& cfg  { * reg08_ptr(0x40020010) };
  auto& data { * reg08_ptr(0x40020020) };

  ctrl = 1;
  cfg  = 3;
  ctrl = 0;

  while(data == 0)
  {
    //...
  }
}

Notice the pointer dereference in the initialisers – we want references to objects, not pointers.  Also notice you don’t have to make references const.  References cannot be ‘re-seated’ so are effectively always const.
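
The reference idiom can be tried on a host PC by substituting an ordinary volatile variable for the register.  In this sketch fake_ctrl, fake_ctrl_ptr and reference_demo are hypothetical names, and the variable merely simulates a device.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a hardware register, so the sketch can run on a host.
volatile std::uint8_t fake_ctrl { 0 };

inline volatile std::uint8_t* fake_ctrl_ptr()
{
  return &fake_ctrl;
}

inline std::uint8_t reference_demo()
{
  auto& ctrl { *fake_ctrl_ptr() };   // Dereference: we want the object itself

  // auto& keeps the volatile qualifier of the referenced object.
  static_assert(std::is_same<decltype(ctrl), volatile std::uint8_t&>::value,
                "ctrl is a reference to a volatile uint8_t");

  ctrl = 1;      // Each write goes to the underlying object
  ctrl = 3;
  return ctrl;   // Reads back the last value written to the fake register
}
```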

Although we’ll look at this more in the next article it’s worth mentioning that both the pointer and reference versions shown will generate the same opcodes.  The choice, then, becomes which is clearer to the reader (and maintainer) of the code.  For the rest of these articles I’m going to stick to the pointer version.  From an explanation point of view it is more explicit that we are indirectly accessing the IO memory.

Bit manipulation

One of the distinguishing aspects of hardware manipulation code is that we are often dealing with variables on a bit-by-bit basis.  There are a set of idiomatic operations we’ll need to do regularly:

  • Set a particular bit, or set of bits
  • Clearing bit(s)
  • Check to see if a bit is set

For the purposes of this next section I’m going to assume read-write registers.  That is, the register can be written to and read from.  This is the general case.  We’ll discuss read-only registers and write-only registers at the end.

Setting bits

When setting individual bits we have to leave all the bits we’re not interested in unchanged.  Therefore, a simple assignment is not adequate.  We need bitwise OR (|):

int main()
{
  auto const ctrl { reg08_ptr(0x40020000) };
  auto const cfg  { reg08_ptr(0x40020001) };
  auto const data { reg08_ptr(0x40020002) };

  *ctrl = *ctrl | 0b10000000;  // Set bit 7
  ...
}

This code will set bit 7 of the ctrl register, leaving all others intact, since OR-ing with zero has no effect.

The above code explicitly shows the read-modify-write operation, although idiomatically programmers prefer the syntactic sugar of the OR-assignment operator.

*ctrl |= 0b10000000;

Notice here we’re using C++’s binary literal to specify the bits we want to set.  Hard-coding bit values is fine for simple (8-bit) values but can become tedious – and error-prone – for multiple bits on larger words (For example, what about setting bits 17 and 23 on a 32-bit word?).

We could use hexadecimal:

*ctrl |= 0x80;

Or we can make use of the left-shift operator:

*ctrl |= (1 << 7);

That is, put a 1 in the least-significant bit position then shift left 7 times.  This will put the 1 in bit 7, the rightmost bits guaranteed to be 0.
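
The shift form scales well to wider registers.  As a sketch, here is the ‘bits 17 and 23 of a 32-bit word’ case; the helper names bit and set_bits are hypothetical, and the shift is done on an unsigned value so that bit 31 can also be expressed safely.

```cpp
#include <cstdint>

// Hypothetical helper: a 32-bit mask with only bit n set (n must be 0..31).
constexpr std::uint32_t bit(unsigned n)
{
  return static_cast<std::uint32_t>(std::uint32_t { 1 } << n);
}

// Masks for several bits are combined with bitwise OR.
constexpr std::uint32_t bits_17_and_23 { bit(17) | bit(23) };

static_assert(bits_17_and_23 == 0x00820000, "bits 17 and 23 are set");

// Set the masked bits, leaving every other bit unchanged.
inline void set_bits(volatile std::uint32_t* reg, std::uint32_t mask)
{
  *reg = *reg | mask;   // Explicit read-modify-write
}
```

A call such as set_bits(reg_32, bit(17) | bit(23)) then reads more naturally than a hand-computed hexadecimal constant.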

Clearing a bit

We (obviously?) can’t use bitwise-OR to clear a bit since OR-ing with zero has no effect.  When clearing bits we need to set the offending bits to zero, whilst maintaining the state of all other bits.  For this we use bitwise-AND.

int main()
{
  auto const ctrl { reg08_ptr(0x40020000) };
  auto const cfg  { reg08_ptr(0x40020001) };
  auto const data { reg08_ptr(0x40020002) };

  *ctrl |= (1 << 7);        // Set bit 7
  ...

  *ctrl &= 0b01111111;      // Clear bit 7
  ...
}

To make the code more readable (for larger register sizes) we can make use of the bitwise-NOT operator (~) to invert a mask of the bits we want to clear:

*ctrl &= ~0x80;

Or even:

*ctrl &= ~(1 << 7);
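
The same pattern clears several bits at once on a wider register.  This is a host-runnable sketch: clear_bits and clear_demo are hypothetical names, and a local volatile variable simulates the register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: clear every bit that is set in 'mask'.
inline void clear_bits(volatile std::uint32_t* reg, std::uint32_t mask)
{
  *reg = *reg & ~mask;   // ~mask has zeros only where bits must be cleared
}

inline void clear_demo()
{
  volatile std::uint32_t fake_reg { 0xFFFFFFFF };   // Simulated register, all bits set

  // Build the mask from unsigned shifts: bits 17 and 23.
  clear_bits(&fake_reg, (std::uint32_t { 1 } << 17) | (std::uint32_t { 1 } << 23));

  assert(fake_reg == 0xFF7DFFFF);                   // Only bits 17 and 23 are now zero
}
```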

Checking a bit

To check whether a bit is set we again use bitwise-AND.  A bit value AND-ed with one will retain its original value; AND-ed with zero it becomes zero.

Note:  the result will either be zero if our target bit is not set, or non-zero if it is set.  Therefore always compare the result of the bitwise-AND operation to 0.

int main()
{
  auto const ctrl { reg08_ptr(0x40020000) };
  auto const cfg  { reg08_ptr(0x40020001) };
  auto const data { reg08_ptr(0x40020002) };

  if((*cfg & (1 << 4)) != 0)  // Always compare to zero.
  {
    // ...
  }
}
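
The check can be wrapped in a small helper that returns a bool.  The name is_bit_set is hypothetical; the point of the sketch is that the masked result for bit 4 is 0x10, not 1, which is why the comparison must be against zero.

```cpp
#include <cstdint>

// Hypothetical helper: true if bit n (0..7) of the 8-bit register is set.
// The pointer-to-const makes it usable with read-only registers as well.
inline bool is_bit_set(const volatile std::uint8_t* reg, unsigned n)
{
  return (*reg & (1u << n)) != 0;   // Non-zero means set; the value itself is the mask
}
```

Writing `(*reg & (1u << 4)) == 1` instead would never be true, because the only possible results are 0 and 0x10.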

Read-only registers

As the name suggests a read-only register cannot be written to.  Writing to a read-only register is undefined.

Rather than rely on programmer diligence we can get the compiler to help us by marking our read-only registers as pointers-to-const:

// Define read-only and read-write pointers
// 
inline
const volatile uint8_t* reg08_ptr_RO(uint32_t addr)
{
  return reinterpret_cast<const volatile uint8_t*>(addr);
}

inline
volatile uint8_t* reg08_ptr_RW(uint32_t addr)
{
  return reinterpret_cast<volatile uint8_t*>(addr);
}

int main()
{
  auto const ro_reg { reg08_ptr_RO(0x40020001) };

  auto val = *ro_reg;   // OK   - Read allowed.
  *ro_reg  = 1;         // FAIL - Write not allowed.
}
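
The compiler’s enforcement can itself be checked with a type trait, without having to write code that fails to build.  In this sketch the aliases ro_reg_t and rw_reg_t and the function read_reg are hypothetical names.

```cpp
#include <cstdint>
#include <type_traits>

using ro_reg_t = const volatile std::uint8_t;   // Read-only register object
using rw_reg_t = volatile std::uint8_t;         // Read-write register object

// Assigning through a pointer-to-const is rejected at compile time...
static_assert(!std::is_assignable<ro_reg_t&, std::uint8_t>::value,
              "a read-only register cannot be written");

// ...whereas the read-write form accepts the same assignment.
static_assert(std::is_assignable<rw_reg_t&, std::uint8_t>::value,
              "a read-write register can be written");

// Reading is allowed through either kind of pointer.
inline std::uint8_t read_reg(const volatile std::uint8_t* reg)
{
  return *reg;
}
```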

Hardware abstraction layers like CMSIS provide similar macros to define read-only and read-write registers:

/* IO definitions (access restrictions to peripheral registers) */
/*CMSIS Global Defines
  IO Type Qualifiers are used:
   - to specify the access to peripheral variables.
   - for automatic generation of peripheral register debug 
     information.
*/
#define __I volatile const  /* Defines 'read only' permissions */
#define __O volatile        /* Defines 'write only' permissions */
#define __IO volatile       /* Defines 'read/write' permissions */

Write-only registers

A write-only register can only be written to.  The value read from a write-only register is undefined.  It is likely to be junk, and no reflection of the actual state of the register.

Therefore the code idioms I’ve shown above should not be used with write-only registers.  In fact, they could even be dangerous.  Writing back a (modified) version of a junk value (from a read) could unintentionally enable bits in the register!

When accessing write-only registers only ever use the assignment operator (=).

int main()
{
  // Can we enforce write-only?
  //
  auto const wo_reg { reg08_ptr(0x40020002) };

  auto val = *wo_reg;  // Will compile, but invalid
  *wo_reg  = 0x55;     // Never use |= to set bits
}

Unfortunately, unlike read-only registers there’s no way with pointers of ensuring you never read through the pointer (you can always read an object in C++).  You are reliant on the programmer applying due diligence.  (Notice in the CMSIS code above the definition for write-only is the same as for read-write!)
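
If individual bits of a write-only register must be changed, one commonly used workaround is to keep a ‘shadow’ copy of the last value written in ordinary memory.  This is a different technique from the pointer idiom above, sketched here under assumptions: the class name ShadowedWriteOnlyReg and its members are hypothetical, and it assumes nothing else writes to the register behind the program’s back.

```cpp
#include <cstdint>

// Hypothetical wrapper: the shadow holds the last value written, so the
// register itself is never read.
class ShadowedWriteOnlyReg
{
public:
  explicit ShadowedWriteOnlyReg(volatile std::uint8_t* reg, std::uint8_t initial = 0)
  : reg_ { reg }, shadow_ { initial }
  {
    *reg_ = shadow_;   // Bring the register and the shadow into step
  }

  void set_bits(std::uint8_t mask)
  {
    shadow_ = static_cast<std::uint8_t>(shadow_ | mask);
    *reg_   = shadow_;   // Plain assignment only; no read of the register
  }

  void clear_bits(std::uint8_t mask)
  {
    shadow_ = static_cast<std::uint8_t>(shadow_ & ~mask);
    *reg_   = shadow_;
  }

  std::uint8_t last_written() const { return shadow_; }

private:
  volatile std::uint8_t* const reg_;
  std::uint8_t shadow_;
};
```

The read-modify-write now happens on the shadow, not on the hardware, at the cost of one byte of RAM per register.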

We will explore how C++ can help us enforce register read- and write- characteristics in a later article.

Summary

In this article we’ve looked at the basic concepts and idioms of hardware access in C++.  In the next article we’ll look at applying this to a real-world example.

Glennan Carnie

Technical Consultant at Feabhas Ltd
Glennan is an embedded systems and software engineer with over 20 years experience, mostly in high-integrity systems for the defence and aerospace industry.

He specialises in C++, UML, software modelling, Systems Engineering and process development.

13 Responses to Making things do stuff – Part 1

  1. embdd says:

    Thanks for a perfect introduction to the bit-banding.

  2. We haven't got to bit-banding just yet.

    But we will. 🙂

  3. I'm looking forward to next article in the series! It's always nice to read something about C++ in embedded, without the usual "C++ is too big, too slow, there are no good compilers, it won't work without heap, ..." kind of nonsense. I'm a big fan of C++ myself - I have used only C++11 for embedded firmware (ARM Cortex-M chips) since the days it was still called C++0x and I never needed to look back or "downgrade to C". To confront all the standard C++ myths, I've even started writing my own C++ RTOS for these microcontrollers ( http://distortos.org/ ), which is way easier (and safe) to use than all the other projects written entirely in C, mostly because of the features which are just not possible in C (threads with any number of any arguments, queues for real C++ objects, ...).

    Keep up the good work with these articles, I'll read the following parts for sure!

  4. Really good article! Are you going to make it even "more elegant"? I just tried to create a "Register" class, with the type of the pointer dependent to a template size value to handle everything... Using metaprogramming approaches to handle low-level problems is a lot of fun!

  5. That's exactly where I'm going with these articles, Eduardo. Watch this space!

  6. Rumpel says:

    The problem with using pointers to uint8_t values or similar, may be the generated assembler by the compiler:
    volatile uint8_t* const Reg = reinterpret_cast<volatile uint8_t*>(0x1234);
    *Reg |= 0x32;

    is translated by clang 4.0.0 (with -O3) to

    or byte ptr [4660], 50

    But GCC 6.3 (with -O3) generates

    movzx eax, BYTE PTR ds:4660
    or eax, 50
    mov BYTE PTR ds:4660, al

    If reading from the register has side effects, the clang version works fine, but the GCC is not usable in this situation. Additionally, the GCC version has a race condition.

    But attention! Adding additional code and/or changing the order of the statements may alter the generated output! The only solution is using compiler-specific extensions for stuff like that or even pure assembler.

  7. Wilhelm Meier says:

    I think your reg08_ptr(), etc. functions can't be constexpr, because reinterpret_cast() expressions are non-constexpr by definition.

  8. For your amusement: "C with classes" did have "writeonly" but I had to take it out in the compromise with the C guys at Bell Labs that gave us "const" (instead of "readonly").

  9. D'Oh!

    Well spotted, Wilhelm! And, of course, you are quite correct. I even mention it in the article!

    I've updated the article to remove the mistake.

    That'll teach me for updating blogs on-the-fly. 🙂

    Thanks!

  10. Dónal says:

    > Unfortunately, unlike read-only registers there’s no way with pointers of ensuring you never read the pointer (you can always read an object in C++)

    If you wrap the register in a small struct, you can get some compile-time checking using templates and std::enable_if (or partial specialisation), at the expense of some syntax (namely you have a Get and a Set method for the register, rather than dereferencing a pointer). I'm sure you could tidy it up to overload the dereference operator too, but here's an example: https://ideone.com/TtepQQ I am reasonably sure that with some extra work you should be able to reduce this to almost 0 overhead, especially with inlining.

  11. Dónal says:

    Also, I believe the casts you mentioned at the beginning of this article aren't technically allowed, e.g.

    inline
    uint16_t* reg16_ptr(uint32_t addr)
    {
    return reinterpret_cast<uint16_t*>(addr);
    }

    converting from a uint32_t to a uint16_t* would violate strict aliasing. Even converting from uint32_t to uint8_t* violates it in some cases; unless uint8_t is a typedef to a char (which isn't guaranteed).

  12. Hi Dónal,

    That's pretty much where I'm going with this. Watch this space! 🙂

  13. Unless it's particularly subtle, I'm not seeing any strict aliasing violations in this function.

    Strict aliasing violations occur when references (pointers) to different types access the same address in memory.

    In this example we're simply forcing an address into a pointer. The fact that the address is being held in a uint32_t is an artefact of our 32-bit hardware architecture. The value stored in a uint16_t* is still a 32-bit value (it's an address); it's just what is being referenced is a uint16_t. There is no type conversion happening between pointers.

    The types of optimisation issues associated with type aliasing are going to be largely negated by the fact we are volatile-qualifying the pointed-to object; thus disabling any optimisations on that object.

    That said, ANY cast has the potential to cause us problems (https://blog.feabhas.com/2013/09/casting-what-could-possibly-go-wrong/)
