Scope and Lifetime of Variables in C

In a previous posting we looked at the principles (and peculiarities) of declarations and definitions. Here I would like to address the concepts of scope and lifetime of variables (program objects to be precise).

In the general case:

  • The placement of the declaration affects scope
  • The placement of the definition affects lifetime

Lifetime

The lifetime of an object is the time in which memory is reserved while the program is executing. There are three object lifetimes:

  • static
  • automatic
  • dynamic

Given the following piece of code:

#include <stdlib.h>    /* malloc, free */

int global_a;       /* tentative defn; becomes actual defn, init to 0 */
int global_b = 20;     /* defn and implicit-decl */

int f(int* param_c)
{
   int local_d = 10;
   . . .
   return local_d;
}
int main(void)
{
   int *ptr = malloc(sizeof(int)*100);
   ...
   global_a = f(ptr);
   ...
   free(ptr);
}

  • global_a and global_b are static
  • The memory allocated by the call to malloc is dynamic
  • All others (including param_c, ptr and the return value from function f) are automatic

Static Objects

The memory for static objects is allocated at compile/link time. Their address is fixed by the linker based on the linker control file (LCF). You may know this file by another name such as linker-script file, linker configuration file or even scatter-loading description file. The LCF defines the physical memory layout (Flash/SRAM) and the placement of the different program regions.

The static region is actually subdivided into two further sections, one for initialised definitions (int global_b = 20;) and one for uninitialised definitions (int global_a;). So it would not be unexpected for global_a and global_b not to be adjacent to each other in SRAM. The uninitialised-definitions section is commonly known as the .bss or ZI section. The initialised-definitions section is commonly known as the .data or RW section.
Finally, the initial value of global_a will be zero (0) and that of global_b will be 20.

Automatic objects

The majority of variables are defined within functions and classed as automatic variables. This also includes parameters and any temporary-returned-object (TRO) from a non-void function, e.g.

int f(int* param_c)  /* tro(int) and parameter(param_c) */
{  
   int local_d = 10; /* local variable */
   . . .
   return local_d;   /* copy local_d to tro */
}

The default model in general programming is that the memory for these program objects is allocated from the stack. For parameters and TROs the memory is normally allocated by the calling function (by pushing values onto the stack), whereas for local objects, memory is allocated once the function is called. Because every call gets its own set of automatics, a function can call itself (recursion), though recursion is generally a bad idea in embedded programming as it may cause stack-overflow problems.
In this model, automatic memory is reclaimed by popping the stack on function exit.

Within a function variables may be localised to a block associated with a control structure, e.g.

for(x = 0; x < N; ++x) {
   int block_y = 0;   /* nested local variable */
   . . .
}

Here the memory is allocated on entry to the block and reclaimed on exit.
However, on most modern microcontrollers, especially 32-bit RISC architectures, automatics are held in scratch registers, where possible, rather than on the stack. For example, the Procedure Call Standard for the ARM Architecture (AAPCS) defines which CPU registers carry a function's arguments and its return value, and which are available for local variables.

Importantly, if an automatic is not explicitly initialised, then the initial value is indeterminate (thus garbage) and therefore should never be read before being set. If the automatic is explicitly initialised then the memory is reinitialised on each call of the function.
The location and size of the stack are typically defined using the LCF.
Finally, there still are the (historic) keywords auto and register that can be applied to automatics. Both are pretty much redundant in modern programming.

Dynamic Objects

Strictly speaking, the C standard gives dynamically allocated objects a storage duration of their own (allocated), separate from automatic. They differ from automatics in two important ways:

  1. The memory is allocated from a different memory area (the heap not the stack)
  2. The lifetime is under the control of the programmer rather than the C run-time system.

Each call to malloc, calloc or realloc returns the address (void*) of a block of dynamically allocated memory, or a null pointer if the request cannot be met. The lifetime of this memory runs from allocation until the block is passed to free, or is released by a successful realloc.

The realloc function takes an allocated memory block and expands (or contracts) it to a bigger (or smaller) size. This may involve moving the block and copying over the old contents. When this happens, the old block is freed automatically.

The contents of the memory returned from malloc are indeterminate, whereas for calloc the memory is initialised to all zeros. If realloc expands the allocated memory area, then the contents of the extra expanded area are indeterminate.
The size and location of the heap are also usually defined in the LCF.

Programming errors involving not releasing dynamically allocated memory have been, and still are, a major source of run-time errors (memory leaks). This is why most modern languages use garbage collection (which limits their applicability to many real-time embedded applications) and why many coding standards, such as MISRA-C, ban dynamic memory allocation.

Static local variables

Before we leave lifetimes, there is one further anomaly. The keyword static can be applied to a local variable, e.g.

#include <stdio.h>
void f1(void)
{
   static int slocal = 10;        /* static local */
   int alocal = 10;              /* automatic local */
   printf("In f1: slocal = %d, alocal = %d\n", slocal, alocal);
   ++slocal;
}

int main(void)
{
   f1();
   f1();
   f1();
}

Applying static to a local variable changes the object's lifetime from automatic to static. This means that the memory is allocated at compile/link time and its address in memory is fixed. Because the memory is static, these local variables retain their value from function call to function call. The local static is initialised only once (in C, before the program starts running) rather than on every call of the function. So given the example above, the output is:
In f1: slocal = 10, alocal = 10
In f1: slocal = 11, alocal = 10
In f1: slocal = 12, alocal = 10

Local statics may look useful, however they cause major problems when trying to port code to a multi-task/multi-threading environment, and should generally be avoided where possible.

Scope

The scope of an object is the part of the program where the variable can be accessed (i.e. it is visible). The scope of an object generally falls into one of two categories:

  • File scope
  • Block scope

As explained in the posting on declarations and definitions, a variable must be declared before it is accessed. Hence the scope of a variable is determined by the placement of its declaration. Returning to the previous example (slightly modified):

#include <stdlib.h>    /* malloc, free */

int global_a;       /* Decln and Defn */

int f(int* param_c)
{
   int local_d = *param_c;      /* automatic local */
   static int local_s = 10;     /* static local    */
   . . .
   local_s = global_a;
   . . .
   return local_d;
}

int main(void)
{
   int *ptr = malloc(sizeof(int)*100);
   ...
   global_a = f(ptr);
   ...
   free(ptr);
}

In the example given, identifier global_a has file scope, whereas all other variables have block scope.

File Scope

Any variable declared with file scope can be accessed by any function defined after the declaration (in our example both f and main can access global_a). If global_a were declared after the function f but before main, it would only be accessible within main.

Block Scope

Block scope is defined by the pairing of the curly braces { and } .  A variable declared within a block can only be accessed within that block. For example, local_d has block scope determined by the function-block for f and cannot be accessed outside that function. The variable ptr also has function-block scope limited to the main function. Note also that the local static, local_s, has block scope even though it has static lifetime.
Interestingly the parameter of function f, param_c, is also classed as having block scope. It can be accessed anywhere within the function it is a parameter of. Personally I would prefer to define this as “function” scope, but that would be incorrect according to the standard, which reserves function scope for labels!

Within a function further localised (inner) scopes can be introduced, e.g.

for(x = 0; x < N; ++x) {
   int block_y = 0;
   . . .
}

Here, block_y is scoped to within the for-loop (i.e. it cannot be accessed in the for-expression region or outside of the for-block).

In a file and/or function we can have overlapping scopes, e.g.

#include <stdio.h>

int k = 20;
int main(void)
{
   int k = 10;
   printf( "In main, k is %d\n", k);
}

The rule is that an inner scope identifier always hides an outer scope identifier. Hence, the block-scoped identifier k hides the file-scoped identifier k (and thus the value displayed will be ten). Note that the file-scoped k is still in scope but is rendered invisible. It is generally bad practice to have variables with overlapping scopes.

Good programming practices limit scope as much as possible. By localising scope, the potential for programming errors to creep in is significantly reduced.

Scope of Dynamic Objects

So it can be seen that the general case is that static objects have file scope and automatic objects have block scope. But what about the scope of dynamic objects?
A dynamic object doesn’t actually have scope, as such. In effect, its scope is dictated by the scope of any pointer holding the address of the dynamically allocated memory. As long as the pointer is in scope it can be dereferenced and the memory accessed.

External and Internal Linkage

Before leaving scope there is one final item to address. By default a variable with file scope can be accessed by any function in the whole program (e.g. in other files from where it is defined) as long as it is declared in scope for the function.
If a variable is defined with file scope in one file, but is required in another, then it can be brought into scope using the “extern” storage-class specifier, e.g.

/* file a.c */
int global_a = 10;       /* definition of global_a */

int f(int* param_c)
{
   int local_d = *param_c;
   static int local_s = 10;
   . . .
   local_s = global_a;
   . . .
   return local_d;
}

/* file main.c */
#include <stdlib.h>     /* malloc, free */
extern int global_a;    /* declaration of global_a, now visible */
int f(int*);

int main(void)
{
   int *ptr = malloc(sizeof(int)*100);
   ...
   global_a = f(ptr);  /* global_a is visible so can be accessed */
   ...
   free(ptr);
}

Quite often we have the case where we need a variable with static lifetime, we don’t want it globally accessible (i.e. want to limit its use to functions in the current file), but we don’t want to define it as a local static as it is needed in multiple local functions.
To achieve this we can use the keyword static, but this time to restrict visibility rather than to change lifetime. If a file-scoped variable is tagged as static then it has what is called internal linkage, e.g.

/* file a.c */
int global_a = 10;      /* external linkage – global scope */
static int internal_b;    /* internal linkage – this-file scope  */

int f(int* param_c)
{
   int local_d = *param_c;  /* block scope, auto */
   static int local_s = 10; /* block scope, static */
   . . .
   local_s = global_a;
   . . .
   return local_d;
}

If another file tried to declare internal_b as extern, then this would result in a link-time error.
Note that internal linkage can also be applied to functions. All functions have external linkage by default, so it is very good practice to declare a function as static if it is only being used within the current file.

Next time: Why understanding Scope and Lifetime is important to embedded programming

Posted on September 27th, 2010

5 Comments on “Scope and Lifetime of Variables in C”

  1. Manish says:

    Hello,
    After reading the above article, I have a query regarding the scope of a global variable in the scenarios described in the cases below:

    Case 1:

    /* file one.c */

    int global_a = 10; /* global scope */
    …….
    …….

    /* file two.c */

    int global_a; /* global scope */
    …….
    …….

    In the above scenario, two variables with the same name are declared as globals in 2 separate files. Would this result in a compilation error, or does the C compiler perform some kind of name mangling wherein the internal references to global variables are stored differently?

    Case 2:

    /* file one.c */

    int global_a = 10; /* global scope */
    …….
    …….

    /* file two.c */

    int global_a; /* global scope */
    …….
    …….

    /* file three.c */

    extern int global_a;
    …….
    …….

    In this case, the compiler should definitely throw a Link time error as the variable to be referenced in file three.c is not clear.

    Do share your thoughts on the above scenarios,
    Thanks,
    Manish

  2. Niall Cooling says:

    Manish,
    You’ve brought up a very good point and the short answer is that it’s compiler dependent, where some will generate a link time error and some won’t.
    In your first example, according to the standard, global_a in file one.c is a definition, whereas global_a in file two.c is a tentative definition. As there isn’t a full definition in file two.c, the tentative definition becomes a full definition with an initial value of 0. This should then cause a link time error along the lines of:
    test_arm.axf: Error: L6200E: Symbol global_a multiply defined (by one.o and two.o).
    This is based on ODR (One Definition Rule).
    However, most hosted environments (Linux, Windows, etc.) relax ODR for tentative definitions and rather than being on a per translation unit basis, they look across all translation units. In this case, global_a in file two.c becomes a declaration as a full definition is found in another translation unit (i.e. the tentative definition becomes a declaration, e.g. extern int global_a).
    In summary, if a compiler fails with Case 1 (e.g. Keil, IAR) then it will also fail with Case 2. However, if it allows Case 1 (e.g. GCC, VC++) then it will also allow Case 2 (as file two.c and file three.c are basically the same). In this second case there is only one version of global_a.

    Finally, if a compiler resolves tentative definitions across translation units and there isn’t a full definition (i.e. with an “=”) anywhere, then one global_a is created with an initial value of 0.

  3. Manish Gajjaria says:

    Thanks Niall for throwing some light on this.

    Would the consequences be any different if in file two.c the variable is fully defined as below, with the other files remaining unchanged:

    /* file two.c */

    int global_a = 35; /* global scope */

  4. Niall Cooling says:

    As it is a full definition, then both cases should now fail at link with a “duplicate symbol” error.

  5. abilash says:

    Nice post. Really helpful
