
Overcoming Name Clashes in Multiple C++ Interfaces

December 23rd, 2011

Interfaces

One of our key design goals is to reduce coupling between objects and classes. By keeping coupling to a minimum a design is more resilient to change imposed by new feature requests or missing requirements[1].

An Interface represents an abstract service. That is, it is the specification of a set of behaviours (operations) that represent a problem that needs to be solved.

An Interface is more than a set of cohesive operations. The Interface can be thought of as a contract between two objects – the client of the interface and the provider of the interface implementation.

The implementer of the Interface guarantees to fulfil the specifications of the Interface. That is, given that operation pre-conditions are met the implementer will fulfil any behavioural requirements, post-conditions, invariants and quality-of-services requirements.

From the client’s perspective it must conform to the operation specifications and fulfil any pre-conditions required by the Interface. Failure to comply on either side may cause a failure of the software.
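As a minimal sketch of such a contract in C++, an Interface can be written as a class of pure virtual functions with the pre- and post-conditions documented on each operation. The names IBuffer, FixedBuffer and try_add are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical interface: an abstract service for a bounded buffer of ints.
class IBuffer
{
public:
  virtual ~IBuffer() {}

  // Pre-condition:  !is_full()
  // Post-condition: size() has increased by one
  virtual void add(int value) = 0;

  virtual bool is_full() const = 0;
  virtual std::size_t size() const = 0;
};

// Hypothetical provider that fulfils the contract.
class FixedBuffer : public IBuffer
{
public:
  FixedBuffer() : count(0) {}

  virtual void add(int value)
  {
    // The provider may detect a broken pre-condition; here it throws.
    if (is_full()) throw std::logic_error("pre-condition violated: buffer full");
    data[count++] = value;
  }
  virtual bool is_full() const { return count == capacity; }
  virtual std::size_t size() const { return count; }

private:
  static const std::size_t capacity = 4;
  int data[capacity];
  std::size_t count;
};

// Client: depends only on the interface and honours the pre-condition.
bool try_add(IBuffer& buffer, int value)
{
  if (buffer.is_full()) return false;
  buffer.add(value);
  return true;
}
```

The client function never names FixedBuffer, so any other provider of IBuffer could be substituted without changing it.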


enum ; past, present and future

June 15th, 2011

The enumerated type (enum) is probably one of the simplest and most underused features of C and C++, yet it can make code safer and more readable without compromising performance.

In this posting we shall look at the basic enum from C, how C++ improved on C’s enum, and how C++0X will make them a first class type.

Often I see headers filled with lists of #defines where an enum would be a much better choice. Here is a classic example:

/* adc.h */
#define ADC_Channel_0                               (0x00) 
#define ADC_Channel_1                               (0x01) 
#define ADC_Channel_2                               (0x02) 
#define ADC_Channel_3                               (0x03) 
#define ADC_Channel_4                               (0x04) 
#define ADC_Channel_5                               (0x05) 
#define ADC_Channel_6                               (0x06) 
#define ADC_Channel_7                               (0x07) 
#define ADC_Channel_8                               (0x08) 
#define ADC_Channel_9                               (0x09) 
#define ADC_Channel_10                              (0x0A) 
#define ADC_Channel_11                              (0x0B) 
#define ADC_Channel_12                              (0x0C) 
#define ADC_Channel_13                              (0x0D) 
#define ADC_Channel_14                              (0x0E) 
#define ADC_Channel_15                              (0x0F) 

which probably would be better re-written as:

enum ADC_Channel_no {
	ADC_Channel_0,
	ADC_Channel_1,
	ADC_Channel_2,
	ADC_Channel_3,
	ADC_Channel_4,
	ADC_Channel_5,
	ADC_Channel_6,
	ADC_Channel_7,
	ADC_Channel_8,
	ADC_Channel_9,
	ADC_Channel_10,
	ADC_Channel_11,
	ADC_Channel_12,
	ADC_Channel_13,
	ADC_Channel_14,
	ADC_Channel_15
};
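One benefit can be sketched straight away. Enumerators without explicit values count up from zero, so they match the #define values above, and in C++ a function that takes the enum type will not silently accept an arbitrary integer. The function adc_channel_mask is a hypothetical example, and the enum is shortened to four channels to keep the sketch small.

```cpp
// Shortened copy of the enum above (first four channels only).
enum ADC_Channel_no {
  ADC_Channel_0,
  ADC_Channel_1,
  ADC_Channel_2,
  ADC_Channel_3
};

// Hypothetical driver helper: the parameter type documents the valid arguments.
unsigned int adc_channel_mask(ADC_Channel_no channel)
{
  return 1u << channel;   // the enumerator converts implicitly to its integer value
}

// adc_channel_mask(2);              // C++: error, no implicit int -> enum conversion
// adc_channel_mask(ADC_Channel_2);  // OK
```

Note that C is more permissive than C++ here: a C compiler accepts the integer argument.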

Before getting onto the advantages and disadvantages of enums, let’s have a quick review.


GNU, and void pointers

April 18th, 2011

Void pointers were introduced in ANSI C as ‘generic’ pointers; or, if you prefer, ‘pointers to no particular type’. They were designed to replace unsigned char* pointers in instances where the type of the object being pointed to could change.

unsigned char* has the least restrictive alignment – it aligns on a byte boundary. This means an unsigned char* pointer could be used to point to any object (with an appropriate cast, of course).

Remember, though, the type of a pointer defines how to interpret the memory at the address held in the pointer. Using an unsigned char* to point to any object is a bit of an abuse of such a pointer (unless, of course, the thing you’re referencing actually IS an unsigned char!)

Hence, the introduction of the void* – a pointer that imposes no requirements on the memory it references.

The void pointer has the same alignment as an unsigned char*; that is, void pointers align on byte boundaries.

Because void pointers are generic (and are effectively useless on their own) you can implicitly convert any pointer to and from a void*.

There is one little wrinkle with void pointers: you cannot perform (pointer) arithmetic on them. The following code fails to compile:

int main (void)
{
  int i;
  void *p = &i;
  p++;  /* But what’s the sizeof(void)?! */
}

When you increment (or perform any other arithmetic) on a pointer it modifies the value of the pointer by the size of the type it references.

In the case of a void pointer it doesn’t point to any type, so the compiler cannot know how to manipulate the pointer value.

Except, it seems, in the GNU compiler.

With GCC’s default options the above code will compile with no errors or warnings (the -Wpointer-arith option, which -pedantic also enables, does report it). GNU have included an extension that treats void* just like unsigned char*. So with a GNU compiler the value of p would increase by one. We can only assume that, since a void* has the same alignment as an unsigned char*, GNU thought its arithmetic should work the same way, too.

This code is highly unportable (and not even standards compliant). If you’re using little tricks like this in your everyday code be prepared for a painful life when you port to another compiler.
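If you genuinely need to step a generic pointer through memory a byte at a time, a portable alternative is to convert it to unsigned char* first and do the arithmetic there. The sketch below shows one way; advance_bytes is a hypothetical helper name.

```cpp
#include <cstddef>

// Advance a generic pointer by n bytes without relying on the
// GNU void* arithmetic extension. (advance_bytes is a hypothetical helper.)
void* advance_bytes(void* p, std::size_t n)
{
  // Arithmetic is well defined on unsigned char*: each step is one byte.
  return static_cast<unsigned char*>(p) + n;
}
```

The cast makes the "treat this as bytes" intent explicit, and the code compiles on any conforming compiler.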

Inheritance, ABCs and Polymorphism

March 14th, 2011

Virtual functions

Virtual functions in C++ exist to maintain the consistent behaviour of polymorphism when accessing derived objects via base class pointers. (If that statement has made your head spin, I’d suggest reading this article before carrying on)

class Base
{
public:
  virtual void v_op();
};

class Derived : public Base
{
public:
  virtual void v_op();
};

I can access either a Base object or a Derived object via a Base pointer; and I should get the appropriate behaviour for the actual type of the object I’m pointed at:

Base* pB;
pB = new Base;	  // Point at Base object…
pB->v_op(); 	  // calls Base::v_op()
pB = new Derived; // Derived object is also a Base object
pB->v_op();	  // this time calls Derived::v_op()

This mechanism is known as dynamic polymorphism.
Dynamic polymorphism is a very powerful mechanism for building flexible, maintainable solutions. If we base the client code not on a particular object but on a more abstract (in the design sense) super class (something known as the Dependency Inversion Principle) we can swap implementations in and out without having to re-factor the client code (using a principle called Substitutability).
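A compilable sketch of substitutability follows. It assumes, purely so the result can be observed, that v_op() returns a string rather than void; the client function is a hypothetical example of code written only against Base.

```cpp
#include <string>

class Base
{
public:
  virtual ~Base() {}
  // Returns a string (an assumption for this sketch) so the call can be observed.
  virtual std::string v_op() { return "Base::v_op"; }
};

class Derived : public Base
{
public:
  virtual std::string v_op() { return "Derived::v_op"; }
};

// Client code: knows about Base only, yet gets the behaviour
// of whatever object it is actually handed.
std::string client(Base& object)
{
  return object.v_op();
}
```

Passing a Derived object to client() selects Derived::v_op() at run time, with no change to the client.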

Abstract Base Classes

Let’s take a common design problem. During design we identify a set of classes with some similar behaviour but some unique behaviour. For example, our design may require concurrent behaviour so we define a set of ‘active’ classes, each one doing a different task:

class ActiveComms
{
public:
  void start();    // Create a thread; and call doStuff()
  void doStuff();  // Perform this class’ behaviour
};

class ActiveDisplay
{
public:
  void start();    // Create a thread; and call doStuff()
  void doStuff();  // Perform this class’ behaviour
};

class ActiveControl
{
public:
  void start();    // Create a thread; and call doStuff()
  void doStuff();  // Perform this class’ behaviour
};

These objects share common behaviour – they all have to create and manage a thread of control in the underlying OS – but they all have (at least one) unique behaviour – the actual work they are doing (for example, managing the display).
Rather than replicating the common code it is put into a base class and the ‘working’ classes inherit from it. We make the unique behaviour virtual so that derived classes can provide their own (overridden) implementation.

class Active
{
public:
  void start();            // Create a thread; and call doStuff()
  virtual void doStuff();  // What should this function do?
};

class ActiveComms : public Active
{
  virtual void doStuff();  // ActiveComms’ unique behaviour
};

In the client code we can use dynamic polymorphism to decouple us from any particular implementation:

Active* pActiveObject;
pActiveObject = new ActiveComms;   // Derived class
pActiveObject->start(); 	   // Assume start() calls
                                   // polymorphic doStuff()

What’s to stop us creating a base class object, though?

Active* pActiveObject;
pActiveObject = new Active;        // Base class object.  Correct, but…
pActiveObject->start(); 	   // what happens here?

What happens when the base class object runs? What does its doStuff() function do? At best it may do some generic (common) behaviour; at worst nothing at all.
In reality we don’t want clients to create objects of the common base type. They’re a convenience to improve intrinsic quality, not really part of the functional behaviour. In order to inhibit creation of base class objects we make the virtual function pure.

class Active
{
public:
  void start();                // As before
  virtual void doStuff() = 0;  // Pure virtual function.
};

Active is now referred to as an Abstract class. You cannot create an instance of an abstract class. This is because a pure virtual function does not require an implementation. (We’ll talk about what happens if you try and call this base class function in a while). Derived classes must implement the doStuff() function; they then become what are called Concrete classes.

// Active* pActiveObject = new Active     // Fails – Active is abstract
Active* pActiveObject = new ActiveComms;  // Concrete class
pActiveObject->start();

What happens if a derived class calls the base class’ (pure) virtual function? The C++ scope resolution operator (::) allows you to call a function outside your current scope; and in fact anywhere in the class hierarchy.

// Class declarations as before.
void ActiveComms::doStuff() 	// virtual function
{
  // Do ActiveComms unique behaviour…
  Active::doStuff();            // Call base class’ (pure) virtual function.
                                // This is legal, but what happens?!
}

What happens when you write code like this depends on your compiler. On most modern compilers – for example GCC, IAR or even Microsoft! – you will get a linker error when you build this code. This is because there is no implementation for the pure virtual function (as expected).
Be careful, though: on some older compilers declaring a function as a pure virtual may cause the compiler to insert a null (zero) into the vtable for the class. The code compiles and links with no errors or warnings. When the code executes and the pure virtual function is called (via the vtable) it will jump to address zero. On many processors this is the reset vector, meaning your call to the base class pure virtual function will reset your system (with no warning!)

Pure Virtual Functions with Implementations

So, can you provide an implementation for a pure virtual function? And why would I want to?
In our example, let’s assume there is some common behaviour that all our concrete classes have to perform as part of their doStuff() function. It makes sense to collect the common behaviour together in the base class – but, we still don’t want clients to create instances of the base class, Active.
Remember, adding (at least one) pure virtual function to a class makes it an abstract class, meaning you cannot create an instance of it. A pure virtual function is not required to have an implementation – but that doesn’t mean you can’t provide one.
The solution to our problem is to keep the Active class exactly as before but add an implementation to its doStuff() pure virtual function that contains all the common behaviour. This common behaviour can be called from the overridden derived classes’ doStuff() function:

class Active
{
public:
  void start();                // As before
  virtual void doStuff() = 0;  // Pure virtual function.
};

void Active::doStuff()		// Pure virtual function implementation!
{
  // Common behaviour goes in here.
}

class ActiveComms : public Active
{
  virtual void doStuff();  // ActiveComms’ unique behaviour
};

void ActiveComms::doStuff() 	// virtual function
{
  // Call base class’ (pure) virtual function.
  // Contains all the common behaviour.
  //
  Active::doStuff();       

  // Do ActiveComms' unique behaviour…
}
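The listing above can be turned into a small self-checking sketch. Two things here are assumptions made for illustration: doStuff() returns a string so the combined behaviour is visible, and start() simply calls doStuff() rather than creating a thread. The static_assert lines use std::is_abstract from <type_traits> (C++11) to confirm that Active remains abstract even though its pure virtual function has a body.

```cpp
#include <string>
#include <type_traits>

class Active
{
public:
  virtual ~Active() {}
  // No thread is created in this sketch; start() just makes the polymorphic call.
  std::string start() { return doStuff(); }
  virtual std::string doStuff() = 0;   // pure virtual function
};

// Implementation of the pure virtual function: the common behaviour.
std::string Active::doStuff() { return "common"; }

class ActiveComms : public Active
{
public:
  virtual std::string doStuff()
  {
    // Explicitly qualified call to the base class implementation,
    // followed by the unique behaviour.
    return Active::doStuff() + "+comms";
  }
};

static_assert(std::is_abstract<Active>::value, "Active cannot be instantiated");
static_assert(!std::is_abstract<ActiveComms>::value, "ActiveComms is concrete");
```

Calling start() on an ActiveComms object through an Active reference runs the common behaviour first and then the unique behaviour.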

The purpose of pure virtual functions is two-fold:

  • To create abstract classes which act as a common interface (or contract) to client code. Derived concrete classes can be substituted for the abstract base class (using dynamic polymorphism). The abstract base class designer must specify the (only) set of services the client can expect.
  • To force derived classes to provide their own implementation of the function. The abstract base class designer must specify which parts of the implementation can be shared and which must be unique.

Use of abstract base classes and substitution, using pure virtual functions and dynamic polymorphism, allows you to build flexible and adaptable solutions, particularly in areas of your system that will be subject to change over the life of the system.

void main(void)–the argument continues…

January 31st, 2011

For what must be years now, the perpetual argument about the legality, or not, of using void as the return type for the main function has kept resurfacing among programmers in various forums.

I generally try to ignore these arguments as it seems such a trivial point, but maybe because yet another birthday has just passed, it’s time to put my two penn’orth in.

Before we start, hopefully we all agree that the following code is an abomination:

main() { }

You would probably be more shocked if I told you that this comes from a fairly recent book (2008) Programming 32-bit Microcontrollers in C. Shame on you all.

The argument against the use of void has weighty backing, notably from the creator of C++, none other than Bjarne Stroustrup himself. The following is taken from his very useful C++ Style and Technique FAQ:

The definition

void main() { /* ... */ }

is not and never has been C++, nor has it even been C. See the ISO C++ standard 3.6.1[2] or the ISO C standard 5.1.2.2.1. A conforming implementation accepts

int main() { /* ... */ }

and

int main(int argc, char* argv[]) { /* ... */ }

He then goes on to state:

A conforming implementation may provide more versions of main(), but they must all have return type int. The int returned by main() is a way for a program to return a value to “the system” that invokes it. On systems that doesn’t provide such a facility the return value is ignored, but that doesn’t make “void main()” legal C++ or legal C. Even if your compiler accepts “void main()” avoid it, or risk being considered ignorant by C and C++ programmers

Ouch; ignorant, moi? I take that as a personal insult ;-)

Unfortunately, what Prof. Stroustrup is failing to recognise (or possibly acknowledge) is the hundreds, if not thousands, of embedded programmers out there. For example, in the text above the key line is:

The int returned by main() is a way for a program to return a value to “the system” that invokes it.

A deeply embedded system doesn’t have a “system” that invokes it, so there is nowhere for main to return to, and thus having an int return type is actually misleading.

However, I hear you shout, the standard says it must be int. Not so.

Both the C and C++ standards define two types of execution environments:

  • Freestanding
  • Hosted

To quote the C standard (5.1.2.1):

In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation defined.

and the C++ standard (3.6.1)

It is implementation-defined whether a program in a freestanding environment is required to define a main function

So there you have it, your honour: embedded systems are freestanding, and as such the startup function can be called anything the implementation allows; therefore void main(void) is perfectly legal.

So I would argue that not understanding that void main(void) is perfectly acceptable in a freestanding environment puts you at risk of being considered ignorant of the finer details of embedded programming.

I know this won’t be the end of it but next time someone starts down this line, just ask them to explain the difference between a freestanding and hosted environment (makes a nice interview question as well!).
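One way to make the distinction visible in code is the predefined macro __STDC_HOSTED__, which a C99 (or C++11) implementation sets to 1 when it is hosted and 0 when it is freestanding. The sketch below simply reports which kind of implementation compiled it; is_hosted_implementation is a hypothetical name.

```cpp
// __STDC_HOSTED__ is predefined by the implementation:
// 1 for a hosted implementation, 0 for a freestanding one.
bool is_hosted_implementation()
{
#if __STDC_HOSTED__
  return true;    // an OS (or similar) invokes main and receives its int result
#else
  return false;   // startup function name and type are implementation-defined
#endif
}
```

With GCC, for example, building with -ffreestanding is expected to flip the macro to 0.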

C++ Overheads

January 14th, 2011

Recently IAR have finally released full support for C++ (adding exceptions and RTTI) to their family of cross compilers. Initially the kickstart (free) version had not had exceptions and RTTI enabled, however with the release of version 6.10.2 this has now been rectified.

We currently use the IAR compilers on our training courses, targeting an NXP LPC2129 (ARM7TDMI) based system. As part of verifying that the previous version’s (v5.41) projects still work with v6.10, I decided to investigate the potential overheads of full C++ in this environment (I’m pleased to say all projects worked under v6.10 without modification – phew).

Here are my preliminary findings and I’ll add to them as I investigate further:

First off, I created a project based on the C++ main project, giving the following code:

int main()
{
return 0;
}

[Yes I know, return 0 isn’t necessary and it’s perfectly legal to have “void main()” in C++ in a freestanding environment]

First off, the default language selection is still “Extended Embedded C++”, so I needed to change this to full C++. All build numbers are based on a Debug setting and I/O set to semi-hosting (I/O via debugger terminal window).


The build then gave (in bytes):

  • code – 296
  • const – 1
  • data – 8704

The data size surprised me, but on checking the default settings in the linker configuration file, I found the default stack size was set to 0x2000 (8192). By changing this to 0x400 (1024) the data requirement dropped to 1536 bytes. I haven’t, as yet, dug down to where the other 512 bytes are coming from (for another day).


C-201x

December 3rd, 2010

The last few years have been dominated by the development of the new C++ standard (still generally referred to as C++0x but now expected in 2011). This will be the second edition of the C++ standard (ISO/IEC 14882), ignoring any TCs and the like (useful C++0x FAQ here).

However, in parallel and generally under the radar, a committee draft has recently been published for the third edition of the C standard (ISO/IEC 9899). This should not really be a surprise, as my understanding is that all standards must be revised or retired every ten years (or thereabouts). But I still find that in the embedded community many people are unaware of the second edition (usually called C99, with the first edition being C90). The support for C99 features has been slow in coming; for example the release notes for the recently released IAR v6.10 state:

The product now uses the current C standard defined in 1999, known as C99, as the default C language

It is also noteworthy that MISRA-C:2004 still uses C90 (okay, C96 if you want to be pedantic) as its Rule 1.1 requirement.

So why hasn’t C99 been widely adopted? I put the question to someone I know very well, who has been involved in various committees over the years, and thought I’d share his response:

In short:-

C90 was needed and it harmonised current compiler practice.

C99 then added a “wish list” of Good Ideas ™ that some people wanted because they were cool. (the ideas not the people :-) However the industry in general did not want the new features, many of which had not been thought through as regards implementation. So the compiler companies did not bother. Only adding things when enough people wanted them. E.g. // for comments.

C1* (formerly C0*) is more of the same. People adding Cool features that the industry is not crying out for and compiler vendors have not added. Though some “Cool Ideas” ™ have already been dropped. Fortunately.

I haven’t yet had time to trawl through the standard, but a couple of key changes that sprang out straight away are:

  • support for multiple threads of execution including an improved memory sequencing model, atomic objects, and thread-local storage
  • static assertions
  • support for bounds-checking interfaces (optional)

Once I’ve had opportunity to investigate these further I shall report back.
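To give a flavour of the static assertions item, here is a minimal sketch using the C++0x spelling, static_assert (in the C draft the keyword is spelled _Static_assert). The Packet type is hypothetical, and the size check assumes the compiler adds no padding to a struct made only of unsigned char members.

```cpp
#include <climits>

// Compile-time checks: the build fails, with the given message,
// if an assumption does not hold. No code is generated.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(int) >= 2, "int must hold at least 16 bits");

// Hypothetical wire-format structure.
struct Packet {
  unsigned char header;
  unsigned char payload[3];
};
static_assert(sizeof(Packet) == 4, "Packet must be exactly 4 bytes");
```

Checks like these replace the older trick of declaring an array with a negative size to force a compile error.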


Importing IAR EW 5.4 Projects into Parasoft C++test

November 17th, 2010

Background

Recently I have been experimenting with Parasoft’s C++test tool for static analysis of C and C++ code. As part of this I went through the process of importing an existing C project developed in IAR’s Embedded Workbench toolset. Even though importing a project and checking it against MISRA-C isn’t too taxing, I thought I would share my notes for doing this.


EMBEDDED PROGRAMMERS’ GUIDE TO THE ARM CORTEX-M ARCHITECTURE

October 13th, 2010

At Embedded Live 2010 I shall be presenting a half-day tutorial entitled “EMBEDDED PROGRAMMERS’ GUIDE TO THE ARM CORTEX-M ARCHITECTURE”.

Feabhas have been training embedded software engineers in languages and architectures for the last 15 years. For the last decade we have been using ARM based target systems for all our programming based courses (C, C++ and testing – ARM7TDMI) and embedded Linux courses (ARM926). However with the development and release of the new generation Cortex micros we are moving our training over to Cortex-M for the languages and Cortex-A for Linux.

As part of this exercise we have to spend lots of time getting to know the Cortex microprocessors in detail, looking at different implementations and various support tools and environments.

The majority of supporting material around the new generation of ARM Cortex-M architectures (M0, M3 & M4), unsurprisingly, focuses heavily on the key hardware specifics of the microcontroller core, with most coding examples being in THUMB2 assembler. However the majority of programming for the Cortex will be in the C programming language (recently a VDC report showed C is still head-and-shoulders above other languages for embedded programming).

Core Features

This class looks at all the really useful features added to the Cortex-M that make it a truly excellent target environment for the embedded software engineer. As a simple example, many embedded processors do not support integer division in hardware (e.g. ARM7), so division is typically handled by an intrinsic library function call or compiler ‘tricks’.

The Cortex-M3 has new signed and unsigned integer division instructions, which can also support the modulo operation ( x % y ).
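From the programmer’s side nothing changes in the source: the sketch below is ordinary C++, and it is the compiler that decides whether / and % become a library call or a hardware divide. On a core with a divide instruction a compiler would typically obtain the remainder as x - (quotient * y), which is the identity noted in the comment. DivResult and divide are hypothetical names.

```cpp
struct DivResult {
  unsigned int quotient;
  unsigned int remainder;
};

// Pre-condition: y != 0
DivResult divide(unsigned int x, unsigned int y)
{
  DivResult r;
  r.quotient  = x / y;
  r.remainder = x % y;   // same value as x - (r.quotient * y)
  return r;
}
```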

There are many other features that I shall cover including unaligned-transfers, bit-banding and the new improved interrupt support architecture (NVIC).
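As a taste of bit-banding, each bit in the Cortex-M3 SRAM bit-band region (starting at 0x20000000) is also visible as a whole 32-bit word in an alias region (starting at 0x22000000), so a single bit can be read or written with one ordinary load or store. The sketch below only computes the alias address, so it can be run on a host PC; bitband_alias_address is a hypothetical helper name.

```cpp
#include <stdint.h>

// Cortex-M3 SRAM bit-band region and its alias region.
const uint32_t SRAM_BITBAND_BASE  = 0x20000000u;
const uint32_t SRAM_BITBAND_ALIAS = 0x22000000u;

// Each byte in the bit-band region occupies 32 bytes of alias space
// (8 bits x one 4-byte word per bit).
uint32_t bitband_alias_address(uint32_t byte_address, unsigned int bit)
{
  uint32_t byte_offset = byte_address - SRAM_BITBAND_BASE;
  return SRAM_BITBAND_ALIAS + (byte_offset * 32u) + (bit * 4u);
}
```

On a target, writing 1 or 0 to the returned address (as a volatile 32-bit word) sets or clears just that bit.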

However, there are three other significant supporting technologies that really help the software engineer.

  1. Cortex Microcontroller Software Interface Standard (CMSIS)
  2. Debug Support
  3. RTOS Support

CMSIS

Simply put, CMSIS is a collection of source files (.c, .h and assembler) to create a minimal board support package (BSP) for Cortex-M series processors. Very usefully, it defines a common way to access peripheral registers and define exception vectors. It also defines the register names of the Core Peripherals and the names of the Core Exception Vectors. So, instead of having to spend time and effort defining structs for register definitions for onboard devices (or hoping your development environment has already done this for you) you can be assured that they already exist. For example, the NXP LPC17xx family of microcontrollers support a watchdog timer. Being CMSIS compliant, the supplied header LPC17xx.h defines the register layout and necessary #defines.
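The general style of such a header can be sketched as a struct of volatile registers plus bit masks. Everything named below (DEMO_WDT_TypeDef, its members, the enable bit and the example address) is hypothetical and is not the real LPC17xx watchdog layout; the peripheral pointer is passed as a parameter so the sketch can run against ordinary memory on a host PC.

```cpp
#include <stdint.h>

// Hypothetical peripheral, NOT the real LPC17xx watchdog register layout.
struct DEMO_WDT_TypeDef
{
  volatile uint32_t MODE;    // offset 0x00: mode/control register
  volatile uint32_t COUNT;   // offset 0x04: timeout value
  volatile uint32_t FEED;    // offset 0x08: feed sequence register
};

const uint32_t DEMO_WDT_MODE_ENABLE = 1u << 0;   // hypothetical enable bit

// On a target the header would map the struct onto a fixed address, e.g.
//   #define DEMO_WDT ((DEMO_WDT_TypeDef*) 0x40000000)   // address is made up
void demo_wdt_start(DEMO_WDT_TypeDef* wdt, uint32_t timeout)
{
  wdt->COUNT = timeout;
  wdt->MODE  = wdt->MODE | DEMO_WDT_MODE_ENABLE;
}
```

Because registers are accessed as named struct members, driver code reads the same way across devices that follow the convention.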

Debug Support

JTAG units, such as the Keil ULINK, have made target programming and source-level debug very affordable. However, for small pin count micros, the 4-wire JTAG is seen as quite an expensive option (in terms of pure pin-count). The Cortex-M core therefore includes support for a new serial-wire interface. The advantage is that it only requires 2 wires, which makes it very easy and affordable to support debug (and power) over a simple USB connection.

At the other end of the spectrum, ARM have added the option for an Embedded Trace Macrocell (ETM) unit, which allows features such as debug of events in real-time systems where the target cannot be halted, software profiling and code coverage.

RTOS Support

For someone who has a long background in Real-Time Operating Systems, I was very interested to discover how ARM has made it simpler and easier for an RTOS vendor to support the Cortex-M.  As you can guess CMSIS is a huge step forward, as it means once an RTOS has been ported using CMSIS, the core aspects will work on, say, all Cortex-M3 implementations.

As a simple example, pretty much every RTOS requires a time-frame reference (the “tick” timer) for timeouts and delays, etc. ARM has integrated this directly into the core (called SysTick) rather than each silicon vendor having to implement their own count-up or count-down variant. There are already 20+ RTOSs running on the Cortex-M.

Also, an optional part of the Cortex-M3/M4 core is a memory-protection unit. An RTOS can make use of this to create a safer multitasking platform without the expense of a full-blown MMU.

Finally, what makes the Cortex-M so attractive from an embedded software engineer’s perspective is the abundance of low cost evaluation kits, such as mbed, LPCXpresso, STM32 Value line Discovery, Energy Micro Gecko Starter Kit, and Actel’s SmartFusion, to name just a small selection.

I hope to see you at Embedded Live 2010. If so please come and say hello.

Scope and Lifetime of Variables in C

September 27th, 2010

In a previous posting we looked at the principles (and peculiarities) of declarations and definitions. Here I would like to address the concepts of scope and lifetime of variables (program objects to be precise).

In the general case:

  • The placement of the declaration affects scope
  • The placement of the definition affects lifetime

Lifetime

The lifetime of an object is the time in which memory is reserved while the program is executing. There are three object lifetimes:

  • static
  • automatic
  • dynamic

Given the following piece of code:

#include <stdlib.h>    /* malloc, free */

int global_a;          /* tentative defn; becomes actual defn, init to 0 */
int global_b = 20;     /* defn and implicit-decl */

int f(int* param_c)
{
   int local_d = 10;
   . . .
   return local_d;
}
int main(void)
{
   int *ptr = malloc(sizeof(int)*100);
   ...
   global_a = f(ptr);
   ...
   free(ptr);
}

global_a and global_b are static
The memory allocated by the call to malloc is dynamic
All others (including param_c, ptr and the return value from function f) are automatic.

Static Objects

The memory for static objects is allocated at compile/link time. Their address is fixed by the linker based on the linker control file (LCF).  You may know this file by another name such as linker-script file, linker configuration file or even scatter-loading description file. The LCF file defines the physical memory layout (Flash/SRAM) and placement of the different program regions.

The static region is actually subdivided into two further sections, one for initialised-definitions (int global_b = 20;) and one for uninitialised-definitions (int global_a;). So it would not be unexpected for the addresses of global_a and global_b not to be adjacent to each other in SRAM. The uninitialised-definitions’ section is commonly known as the .bss or ZI section. The initialised-definitions’ section is commonly known as the .data or RW section.
Finally, the initial value of global_a will be zero (0) and 20 for global_b.

Automatic objects

The majority of variables are defined within functions and classed as automatic variables. This also includes parameters and any temporary-returned-object (TRO) from a non-void function, e.g.

int f(int* param_c)  /* tro(int) and parameter(param_c) */
{  
   int local_d = 10; /* local variable */
   . . .
   return local_d;   /* copy local_d to tro */
}

The default model in general programming is that the memory for these program objects is allocated from the stack. For parameters and TROs the memory is normally allocated by the calling function (by pushing values onto the stack), whereas for local objects, memory is allocated once the function is called. This key feature enables a function to call itself – recursion (though recursion is generally a bad idea in embedded programming as it may cause stack-overflow problems).
In this model, automatic memory is reclaimed by popping the stack on function exit.

Within a function variables may be localised to a block associated with a control structure, e.g.

for(x = 0; x < N; ++x) {
   int block_y = 0;   /* nested local variable */
   . . .
}

Here the memory is allocated on entry to the block and reclaimed on exit.
However, on most modern microcontrollers, especially 32-bit RISC architectures, automatics are stored in scratch registers, where possible, rather than the stack. For example the ARM Architecture Procedure Call Standard (AAPCS) defines which CPU registers are used for function call arguments into, and results from, a function and local variables.

Importantly, if an automatic is not explicitly initialised, then the initial value is indeterminate (thus garbage) and therefore should never be read before being set. If the automatic is explicitly initialised then the memory is reinitialised on each call of the function.
The location and size of the stack are typically defined using the LCF.
Finally, there still are the (historic) keywords auto and register that can be applied to automatics. Both are pretty much redundant in modern programming.

Dynamic Objects

Strictly speaking, the C standard gives dynamically allocated objects their own storage duration (allocated), separate from automatic. It is important to differentiate between this type of object and automatics for two reasons:

  1. The memory is allocated from a different memory area (the heap not the stack)
  2. The lifetime is under the control of the programmer rather than the C run-time system.

When calling on malloc, calloc or realloc, these functions return an address (void*) for a block of dynamically allocated memory. The lifetime of this memory is from allocation until the call to either free or realloc the memory.

The realloc function takes an allocated memory block and expands (or contracts) it to a bigger (or smaller) size. This may involve moving the chunk of memory and copying over the old contents. When this is done, the old contents are automatically freed.

The contents of the memory returned from malloc are indeterminate; whereas for calloc the memory is initialised to all zeros. If realloc expands the allocated memory area, then the contents of the extra expanded area are indeterminate.
The size and location of the heap are also usually defined in the LCF.
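One practical consequence of realloc possibly moving the block is that its result should be captured in a temporary: if the call fails it returns a null pointer and the original block is still allocated, so overwriting the only pointer to it would leak the memory. A minimal sketch, where resize_ints is a hypothetical helper and new_count is assumed to be greater than zero:

```cpp
#include <cstddef>
#include <cstdlib>

// Grow (or shrink) a malloc'd array of int.
// On success *block is updated and true is returned.
// On failure *block is left untouched (and still valid) and false is returned.
bool resize_ints(int** block, std::size_t new_count)
{
  void* moved = std::realloc(*block, new_count * sizeof(int));
  if (moved == NULL) return false;     // original block has NOT been freed
  *block = static_cast<int*>(moved);   // old contents were copied if it moved
  return true;
}
```

After a successful call the old pointer value must not be used again, since the block may now live at a different address.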

Programming errors involving not releasing dynamically allocated memory have been, and still are, a major source of run-time errors (memory leaks). This is why most modern languages use garbage collection (which limits their applicability to many real-time embedded applications) and why many coding standards, such as MISRA-C, ban dynamic memory allocation.

Static local variables

Before we leave lifetimes, there is one further anomaly. The keyword static can be applied to a local variable, e.g.

#include <stdio.h>
void f1(void)
{
   static int slocal = 10;        /* static local */
   int alocal = 10;              /* automatic local */
   printf("In f1: slocal = %d, alocal = %d\n", slocal, alocal);
   ++slocal;
}

int main(void)
{
   f1();
   f1();
   f1();
}

Applying static to a local variable changes the object’s lifetime from automatic to static. This means that the memory is allocated at compile/link time and its address in memory is fixed. As the memory is static, these local variables retain their value from function call to function call. The local static is initialised only once, not on each call of the function. So given the example above, the output is:
In f1: slocal = 10, alocal = 10
In f1: slocal = 11, alocal = 10
In f1: slocal = 12, alocal = 10

Local statics may look useful; however, they cause major problems when porting code to a multi-tasking or multi-threaded environment, because every caller of the function shares the same object. They should generally be avoided where possible.
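
To make the problem concrete, here is a small sketch (the function names label_static and label_reentrant are invented for this example). The first version returns a pointer to a local static buffer, so a second call, whether from the same task or a different one, overwrites the result of the first. The second version avoids the shared state by having the caller supply the storage:

```c
#include <stdio.h>

/* Uses a local static buffer: every call writes to the same memory,
   so the result is shared between all callers (and all threads). */
const char *label_static(int id)
{
    static char buf[16];
    snprintf(buf, sizeof buf, "task-%d", id);
    return buf;
}

/* Re-entrant alternative: the caller owns the storage. */
const char *label_reentrant(int id, char *buf, size_t size)
{
    snprintf(buf, size, "task-%d", id);
    return buf;
}
```

With label_static, two calls return the same address, so holding on to the first result after making the second call silently gives you the second label.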

Scope

The scope of an object is the part of the program where the variable can be accessed (i.e. it is visible). The scope of an object generally falls into one of two general categories:

  • File scope
  • Block scope

As explained in the posting on declarations and definitions, a variable must be declared before it is accessed. Hence the scope of a variable is determined by the placement of its declaration. Returning to the previous example (slightly modified):

#include <stdlib.h>  /* malloc, free */

int global_a;       /* Decln and Defn */

int f(int* param_c)
{
   int local_d = *param_c;      /* automatic local, copy of the pointed-to int */
   static int local_s = 10;     /* static local    */
   . . .
   local_s = global_a;
   . . .
   return local_d;
}

int main(void)
{
   int *ptr = malloc(sizeof(int)*100);
   ...
   global_a = f(ptr);
   ...
   free(ptr);
}

In the example given, identifier global_a has file scope, whereas all other variables have block scope.

File Scope

Any variable declared with file scope can be accessed by any function defined after the declaration (in our example both f and main can access global_a). If global_a were declared after the function f but before main, it would only be accessible within main.

Block Scope

Block scope is defined by the pairing of the curly braces { and } .  A variable declared within a block can only be accessed within that block. For example, local_d has block scope determined by the function-block for f and cannot be accessed outside that function. The variable ptr also has function-block scope limited to the main function. Note also that the local static, local_s, has block scope even though it has static lifetime.
Interestingly, the parameter of function f, param_c, is also classed as having block scope. It can be accessed anywhere within the function it is a parameter of. Personally I would prefer to define this as “function” scope, but that would be incorrect according to the standard!

Within a function further localised (inner) scopes can be introduced, e.g.

for(x = 0; x < N; ++x) {
   int block_y = 0;
   . . .
}

Here, block_y is scoped to within the for-loop (i.e. it cannot be accessed in the for-expression region or outside of the for-block).

In a file and/or function we can have overlapping scopes, e.g.

int k = 20;
int main()
{
   int k = 10;
   printf( "In main, k is %d\n", k);
}

The rule is that an inner scope identifier always hides an outer scope identifier. Hence, the block-scoped identifier k hides the file-scoped identifier k (and thus the value displayed will be ten). Note that the file-scoped k is still in scope but is rendered invisible. It is generally bad practice to have variables with overlapping scopes.
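
The same rule can be seen by putting the two identifiers in separate functions. This is only a sketch, and the function names inner_k and outer_k are hypothetical:

```c
int k = 20;                 /* file scope */

int inner_k(void)
{
    int k = 10;             /* block scope: hides the file-scope k */
    return k;               /* refers to the local k */
}

int outer_k(void)
{
    return k;               /* no local k here, so this is the file-scope k */
}
```

The file-scoped k is never modified by inner_k; it is simply not the k that the name refers to inside that block.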

Good programming practices limit scope as much as possible. By localising scope, the potential for programming errors to creep in is significantly reduced.

Scope of Dynamic Objects

So it can be seen that, in the general case, static objects have file scope and automatic objects have block scope. But what about the scope of dynamic objects?
A dynamic object doesn’t actually have scope, as such. In effect, its scope is dictated by the scope of any pointer holding the address of the dynamically allocated memory. As long as the pointer is in scope it can be dereferenced and the memory accessed.
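
The flip side is that the memory outlives any one pointer to it. In the sketch below (make_counter and destroy_counter are hypothetical names), the local pointer p goes out of scope when the function returns, but the dynamic object it refers to is still allocated, so its address has to be handed to something else that is in scope:

```c
#include <stdlib.h>

/* Hypothetical helper: the pointer p has block scope and disappears on
   return, but the memory it points to stays allocated. */
int *make_counter(int start)
{
    int *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = start;
    }
    return p;       /* pass the address on; otherwise the block is leaked */
}

/* Hypothetical helper: ends the lifetime of the dynamic object. */
void destroy_counter(int *counter)
{
    free(counter);
}
```

If make_counter did not return p (or store it somewhere with a longer lifetime), the memory would still exist but could no longer be reached or freed.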

External and Internal Linkage

Before leaving scope there is one final item to address. By default a variable with file scope has external linkage: it can be accessed by any function in the whole program (including functions in files other than the one where it is defined), as long as a declaration of it is in scope for that function.
If a variable is defined with file scope in one file, but is required in another, then it can be brought into scope using the “extern” storage-class specifier, e.g.

/* file a.c */
int global_a = 10;       /* definition of global_a */

int f(int* param_c)
{
   int local_d = *param_c;
   static int local_s = 10;
   . . .
   local_s = global_a;
   . . .
   return local_d;
}

/* file main.c */
#include <stdlib.h>     /* malloc, free */

extern int global_a;    /* declaration of global_a, now visible */
int f(int*);

int main(void)
{
   int *ptr = malloc(sizeof(int)*100);
   ...
   global_a = f(ptr);  /* global_a is visible so can be accessed */
   ...
   free(ptr);
}

Quite often we need a variable with static lifetime that is not globally accessible (i.e. we want to limit its use to functions in the current file), but we cannot define it as a local static because it is needed by several functions in that file.
To achieve this we can use the keyword static, but this time to affect visibility rather than lifetime. If a file-scoped variable is tagged as static then it has what is called internal linkage, e.g.

/* file a.c */
int global_a = 10;       /* external linkage - global scope */
static int internal_b;   /* internal linkage - this-file scope */

int f(int* param_c)
{
   int local_d = *param_c;  /* block scope, automatic lifetime */
   static int local_s = 10; /* block scope, static lifetime */
   . . .
   local_s = global_a;
   . . .
   return local_d;
}

If another file declared internal_b as extern and then tried to use it, this would result in a link-time error.
Note that internal linkage can also be applied to functions. All functions have external linkage by default, so it is very good practice to declare a function as static if it is only used within the current file.
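
Putting the two together, a file can keep both its data and its helper functions private and expose only a small interface. The following is a minimal sketch; the file name counter.c and all of the identifiers in it are invented for illustration:

```c
/* file counter.c (hypothetical) */
static int count;                 /* internal linkage: visible only in this file */

static int clamp(int value)       /* internal linkage helper function */
{
    return value < 0 ? 0 : value;
}

void counter_add(int delta)       /* external linkage: the public interface */
{
    count = clamp(count + delta);
}

int counter_get(void)             /* external linkage: the public interface */
{
    return count;
}
```

Other files can call counter_add and counter_get, but they cannot name count or clamp, and they are free to define their own identifiers with those names without clashing.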

Next time: Why understanding Scope and Lifetime is important to embedded programming