You are currently browsing the Sticky Bits blog archives for December, 2009.

Unscrambling C Declarations

December 9th, 2009
Note: Based on some feedback I should clarify that this does not cover C99 syntax

Even though the C programming language has been around since the early 1970s, many programmers still have trouble understanding how C declarations are formed. This is not surprising given the complexity that can arise when mixing pointer, array and function-pointer declarations.

In this posting we shall look at some complex declarations and try to understand them by considering how they are formed. The intent is not that you go off and write wonderfully complex declarations, but that you can understand someone else’s code. Finally we shall look at how most complex declarations can be simplified.
Here I’m going to focus on object declarations/definitions rather than functions. Also, in this posting I’m not going to examine structure, union or enumeration specifiers. They’ll keep for another day.
How to read a declaration
Very simple ones (specifically those not involving “[]” or “()”) can be read from right to left, e.g.
int x
where ‘x’ is an (identifier for an) integer. However, this approach starts to break down very quickly, e.g.
int a[10]
Therefore a more sophisticated approach is needed for complex declarations because of precedence and associativity rules that apply to the differing symbols in the declaration.

Before building a rule-set there are a number of things we can exclude:
  1. A function cannot return a function – () foo()
  2. A function cannot return an array – [] foo ()
  3. An array cannot hold functions – foo[]()
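Each of these three exclusions has a legal counterpart that goes through a pointer. The following is only a sketch: every name in it (twice, negate, table, get_twice, get_table, ops) is invented for illustration, and it uses the parenthesised forms that are decoded later in this posting.

```c
#include <assert.h>

/* All names below are hypothetical, chosen only for illustration. */
int twice(int n)  { return 2 * n; }
int negate(int n) { return -n; }

int table[3] = { 10, 20, 30 };

/* Illegal: a function returning a function.
   Legal:   a function returning a POINTER to a function. */
int (*get_twice(void))(int) { return twice; }

/* Illegal: a function returning an array.
   Legal:   a function returning a POINTER to an array of 3 ints. */
int (*get_table(void))[3] { return &table; }

/* Illegal: an array of functions.
   Legal:   an array of POINTERS to functions. */
int (*ops[2])(int) = { twice, negate };
```

For instance, get_twice()(21) calls twice through the returned pointer, and (*get_table())[1] reads the second element of table.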
Let’s start with some simple examples:
int x         x is an integer
This can give us:
Rule 1: Read from left to right looking for an identifier.
So ignore types (int, char, etc.), qualifiers (e.g. const, volatile) and the symbols ‘()’, ‘[]’ and ‘*’ until you find the first unique identifier. This is the identifier for the declaration.
Building on this, once the identifier is found we look for either array or function notation, e.g.
int a[10]            a is an array of (ten) integers
void x(int y)    x is a function that takes an integer parameter (y) and returns nothing (void)

Rule 2:    look right from the identifier for postfix operators () or []. If [] then it is an array, else if () then it is a function.

Next we introduce pointer notation:
int * x      x is a pointer to an integer

Rule 3:    look left for prefix pointer asterisk ‘*’. If found the identifier is a pointer.

Finally we can introduce type qualifiers (const / volatile), e.g.
const int x     x is an integer constant

Rule 4:    If a const and/or volatile is next to a type specifier (int, long, etc.) it applies to that specifier

So that gives us a preliminary set of 4 rules.
These hold for the following declarations:
int const x      x is a constant integer (This is identical to the previous declaration. This is part of the confusing syntax of the C programming language, but Rule 4 still applies).
const int * x      x is a pointer to a constant integer. Rule 3 followed by Rule 4
int const * x       x is a pointer to a constant integer (as above – still confused?)
int * x[10]          x is an array of pointers to integers ( Rule 2, Rule 3)
int * x(void)     x is a function that returns a pointer to an integer (Rule 2, Rule 3)
int **x                 x is a pointer to a pointer to an integer (Rule 3, Rule 3)
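To see these declarations in use, here is a small sketch that can be compiled as-is. The identifiers (value, limit, same_limit, p, slots, pp, first_slot, read_through) are made up for this example; only the declaration shapes come from the list above.

```c
#include <assert.h>

/* Hypothetical names; the shapes mirror the declarations listed above. */
int value = 7;
const int limit = 100;        /* limit is a constant integer              */
int const same_limit = 100;   /* same meaning, qualifier on the other side */

const int *p = &limit;        /* p is a pointer to a constant integer     */
int *slots[10];               /* slots is an array of pointers to integers */
int **pp;                     /* pp is a pointer to a pointer to an integer */

int *first_slot(void)         /* function that returns a pointer to an integer */
{
    return slots[0];
}

int read_through(void)
{
    slots[0] = &value;        /* store a pointer in the array             */
    pp = &slots[0];           /* pp points at that stored pointer         */
    p = &same_limit;          /* p itself may be re-pointed...            */
    /* *p = 1; */             /* ...but writing through it is rejected    */
    return **pp + *p + *first_slot();
}
```

Uncommenting the assignment through p should make the compiler complain, which is a quick way to check that the const was read correctly.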

So far so good? Pretty straightforward? Maybe not the pointer-to-a-pointer, but we still need to add two further rules. The first affects Rule 4. What if we have a const that is not next to the type, as in:
int * const x
Here we need a new rule, which we’ll call Rule 4b (with our previous Rule 4 becoming 4a):   

Rule 4b: if a const and/or volatile is not next to a type then it applies to the pointer asterisk on its immediate left

int * const x      x is a constant pointer to an integer (this means the pointer address is constant)
Combining 4a and 4b gives us:
int const * const x     x is a constant pointer to a constant integer
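The practical difference between Rules 4a and 4b is what the compiler will let you modify. A minimal sketch, assuming nothing beyond standard C; the function and variable names (const_demo, to_const, fixed, both) are invented here:

```c
#include <assert.h>

/* Hypothetical example: which assignments each const placement permits. */
int const_demo(void)
{
    int a = 1;
    int b = 2;
    int const *to_const = &a;      /* Rule 4a: the integer is constant   */
    int *const fixed = &a;         /* Rule 4b: the pointer is constant   */
    int const *const both = &b;    /* 4a and 4b: both are constant       */

    to_const = &b;                 /* OK: the pointer may be re-pointed  */
    *fixed = 10;                   /* OK: the integer may be changed     */

    /* *to_const = 5; */           /* rejected: integer is constant      */
    /* fixed = &b;    */           /* rejected: pointer is constant      */
    /* both = &a;     */           /* rejected: pointer is constant      */
    /* *both = 0;     */           /* rejected: integer is constant      */

    return *to_const + *fixed + *both;
}
```

Uncommenting any of the four rejected lines should produce a compile-time diagnostic.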
We have one final rule required to force precedence. For example, we’ve already seen that int * x(void) declares x as a function that returns a pointer to an integer (Rule 2, Rule 3). But what if I wanted to declare a pointer to a function that returns an integer?
The syntax is as follows:
int (*x)(void)    x is a pointer to a function that returns an integer
This gives our final rule, which becomes a new Rule 2 and pushes everything down by one:

Rule 2: If the identifier is within parentheses, then evaluate inside the parentheses first

This rule is required because when we have  *x() then the function parentheses always win. Thus:
void (*x)(int y)     x is a pointer to a function that takes an integer (y) as a parameter and returns void
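The effect of the parentheses is easiest to see with both forms side by side. This is a sketch only; stored, address_of_stored, answer, record, setter and call_all are hypothetical names, while x matches the declaration used above.

```c
#include <assert.h>

int stored = 5;

/* No parentheses around *name: a FUNCTION that returns a pointer to int. */
int *address_of_stored(void) { return &stored; }

int answer(void)   { return 42; }
void record(int y) { stored = y; }

/* Parentheses around *name: a POINTER to a function. */
int (*x)(void) = answer;           /* pointer to function returning int  */
void (*setter)(int y) = record;    /* pointer to function taking an int  */

int call_all(void)
{
    setter(8);                     /* equivalent to (*setter)(8)         */
    return x() + *address_of_stored();
}
```

Because x and setter are ordinary objects, they can be reassigned at run time to any function with a matching signature.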
Rule Summary
  • Rule 1: Read from left to right looking for an identifier
  • Rule 2: If the identifier is within parentheses, then evaluate inside the parentheses first
  • Rule 3:    look right for postfix operators ( ) or [ ]. If [] then it is an array, else if () then it is a function.
  • Rule 4:    look left for prefix pointer asterisk ‘*’. If found the identifier is a pointer.
  • Rule 5a: If a const and/or volatile is next to a type specifier (int, long, etc.) it applies to that specifier
  • Rule 5b: if a const and/or volatile is not next to a type then it applies to the pointer asterisk on its immediate left
Complex Declarations
This core set should decode most C object declarations. Let’s put it to the test on a couple of horrible declarations. First, can you work out:
void (*fpa[10])(int)
Have a go before I break it down…
Okay, let’s decompose this:
Rule 1: From left to right find the identifier; this gives us fpa
Rule 2: (*fpa[10])  parentheses win, so evaluate inside the parentheses
Rule 3: fpa[10]  postfix [] wins; fpa is a ten-element array ($ now represents fpa[10])
Rule 4: *$  prefix *; fpa is an array of pointers. Now we’ve evaluated inside the parentheses we step outside ($ now represents (*fpa[10])).
Rule 3: $()  postfix () wins; fpa is an array of pointers to functions
Finally: void $(int)  parameter and return types; fpa is an array of pointers to functions each taking an integer parameter and returning void
So the identifier fpa represents an array of ten pointers to functions each of which takes an integer as a parameter and returns void. Phew…
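A declaration like this typically turns up as a dispatch table. The sketch below fills and walks such a table; total, add, subtract and run_table are names invented for the example, and only the declaration of fpa is taken from the text.

```c
#include <assert.h>

int total = 0;

void add(int n)      { total += n; }
void subtract(int n) { total -= n; }

/* fpa is an array of ten pointers to functions taking int, returning void */
void (*fpa[10])(int);

int run_table(void)
{
    int i;

    total = 0;
    for (i = 0; i < 10; i++) {
        fpa[i] = (i % 2 == 0) ? add : subtract;  /* even slots add, odd subtract */
    }
    for (i = 0; i < 10; i++) {
        fpa[i](i);                               /* call through each pointer    */
    }
    return total;
}
```

Each element is called exactly like a plain function; the array index simply selects which function runs.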
Okay one last one to try, go to the C standard library and look at the declarations in <signal.h> and you should see:
 void (*signal(int sig, void(*func)(int)))(int);
If you can decode this then I’m really impressed!

Let’s apply our rule-set to this:
First, as always, is Rule 1: signal is the identifier. signal is in parentheses, so based on Rule 2 we must evaluate that first. If we match parentheses then we get:
(*signal(int sig, void(*func)(int)))
Which we can temporarily simplify (by ignoring the function parameters) to:
(*signal())       
Based on Rule 3, signal is a function, and based on Rule 4 it returns a pointer. The question is: a pointer to what? Using the simplification we can work out the return type as:
void (* signal() )(int)
which becomes
void (*$)(int)
which means the function signal returns a pointer to a function that has an integer parameter and returns void.
So let’s return to the parameters, this gives us:
signal(int sig, void(*func)(int))
So signal takes two parameters
int sig – sig is an integer
void(*func)(int) –  func is a pointer to a function that has an integer parameter and returns void.
To summarise:
  • signal is a function
  • that returns a pointer to a function that has an integer parameter and returns void
  • and takes two parameters of
  • an integer, and
  • a pointer to a function that has an integer parameter and returns void
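Seen from the calling side, the decoded shape looks like this. The sketch uses the standard signal and raise functions from <signal.h>; the handler on_signal, the flag last_signal and the wrapper install_and_raise are hypothetical names for the example.

```c
#include <assert.h>
#include <signal.h>

/* Flag written by the handler; sig_atomic_t is the type intended for this. */
volatile sig_atomic_t last_signal = 0;

/* Matches the parameter type void (*func)(int). */
void on_signal(int sig) { last_signal = sig; }

int install_and_raise(void)
{
    /* The return value of signal has the same type as its second parameter. */
    void (*previous)(int);

    previous = signal(SIGINT, on_signal);
    if (previous == SIG_ERR) {
        return -1;
    }
    raise(SIGINT);                 /* runs on_signal before returning      */
    signal(SIGINT, previous);      /* restore whatever was installed before */
    return (int)last_signal;
}
```

The local variable previous needs exactly the pointer-to-function declaration worked out above, which is why the return type of signal is worth decoding.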
It doesn’t get much worse than this (and remember this example comes from the standard library, which is shameful!).
How to avoid complexity in declarations
Avoid by design, as far as possible. If this fails, divide and conquer remembering that typedef is your friend.  A typedef declaration does not introduce a new type, only a synonym for the type specified. For example:
typedef  int  MILES;
MILES  m;   /* m is of type int */
typedef int*  int_ptr;
int_ptr  ip;  /* ip is of type integer pointer int* */
Used well, typedefs make life easier. For example:
typedef void (*FuncPtr)(int);
FuncPtr is a typedef for a pointer to any function which takes an integer parameter and returns void.
In the “signal” example, both function pointers are of this type, so using the typedef, the declaration
void (*signal(int sig, void(*func)(int)))(int)
becomes
FuncPtr signal(int sig, FuncPtr func)
and our previous declaration of:
void (*fpa[10])(int)
becomes
FuncPtr  fpa[10]
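One way to convince yourself that the typedef forms really are the same declarations is to write both in one file and let the compiler check that they agree. This is a sketch under that assumption: set_handler is a made-up stand-in for signal (so as not to clash with the real one), and current, seen and note are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*FuncPtr)(int);

/* Raw and typedef forms of the same function; a mismatch would not compile. */
void (*set_handler(int sig, void (*func)(int)))(int);
FuncPtr set_handler(int sig, FuncPtr func);

/* Raw and typedef forms of the same array. */
extern void (*fpa[10])(int);
FuncPtr fpa[10];

FuncPtr current = NULL;            /* handler currently installed         */
int seen = 0;

void note(int sig) { seen = sig; }

/* Install func and return the handler it replaces, as signal does. */
FuncPtr set_handler(int sig, FuncPtr func)
{
    FuncPtr old = current;

    (void)sig;                     /* unused in this sketch               */
    current = func;
    return old;
}
```

If either pair of declarations disagreed, the compiler would report conflicting types, so a clean build is itself the check.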
After that I need to find a dark room to lie down in.

Also check out http://www.cdecl.org/ (thanks @FrankSansC)
