Unscrambling C Declarations

Note: Based on some feedback I should clarify that this does not cover C99 syntax

Even though the C programming language has been around since the early 1970s, many programmers still have trouble understanding how C declarations are formed. This is not surprising, given the complexity that can arise when mixing pointer, array and function-pointer declarations.

In this posting we shall look at some complex declarations and try to understand them by considering how they are formed. The intent is not that you go off and write wonderfully complex declarations, but that you may actually be able to understand someone else’s code. Finally, we shall look at how most complex declarations can be easily simplified.

Here I’m going to focus on object declarations/definitions rather than functions. Also, in this posting I’m not going to examine structure, union or enumeration specifiers. They’ll keep for another day.

How to read a declaration
Very simple ones (specifically those not involving “[]” or “()”) can be read from right to left, e.g.
int x

where ‘x’ is an (identifier for an) integer. However, this approach starts to break down very quickly, e.g.
int a[10]

Therefore a more sophisticated approach is needed for complex declarations because of precedence and associativity rules that apply to the differing symbols in the declaration.

Before building a rule-set there are a number of things we can exclude:

A function cannot return a function – () foo()
A function cannot return an array – [] foo ()
An array cannot hold functions – foo[]()
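As a rough sketch of what these exclusions mean in practice, the block below keeps the three illegal forms as comments and shows the legal alternative for each, which in every case goes through a pointer. All names (add_one, get_op, get_table, ops, check_alternatives) are invented for illustration and do not come from any library.

```c
#include <assert.h>

/* Hypothetical helper, used only so there is something to point at. */
static int add_one(int v) { return v + 1; }

/* int bad1(void)(int);    rejected: function returning a function */
/* int bad2(void)[3];      rejected: function returning an array   */
/* int bad3[3](int);       rejected: array of functions            */

/* Legal: a function returning a POINTER to a function. */
static int (*get_op(void))(int) { return add_one; }

/* Legal: a function returning a POINTER to the first element of an array. */
static int *get_table(void)
{
    static int table[3] = { 10, 20, 30 };
    return table;
}

/* Legal: an array of POINTERS to functions. */
static int (*ops[1])(int) = { add_one };

/* Exercises the three legal alternatives; returns 1 if all checks pass. */
int check_alternatives(void)
{
    assert(get_op()(1) == 2);
    assert(get_table()[2] == 30);
    assert(ops[0](41) == 42);
    return 1;
}
```

Uncommenting any of the three rejected lines should produce a compile-time error, which is a quick way to confirm the exclusions with your own compiler.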

Let’s start with some simple examples:
int x          x is an integer

This can give us:

Rule 1: Read from left to right looking for an identifier.

So ignore types (int, char, etc.), qualifiers (e.g. const, volatile) and the symbols ‘()’, ‘[]’ and ‘*’ until you find the first unique identifier. This is the identifier for the declaration.

Building on this, once the identifier is found we look for either array or function notation, e.g.
int x[10]          x is an array of (ten) integers
void x(int y)      x is a function that takes an integer parameter (y) and returns nothing (void)

Rule 2: look right from the identifier for postfix operators () or []. If [] then it is an array, else if () then it is a function.

Next we introduce pointer notation:
int * x          x is a pointer to an integer

Rule 3: look left for prefix pointer asterisk ‘*’. If found the identifier is a pointer.

Finally we can introduce type qualifiers (const / volatile), e.g.
const int x          x is a constant integer

Rule 4: If a const and/or volatile is next to a type specifier (int, long, etc.) it applies to that specifier

So that gives us a preliminary set of 4 rules.

These hold for the following declarations:
int const x          x is a constant integer (this is identical to the previous declaration; it is part of the confusing syntax of the C programming language, but Rule 4 still applies)

const int * x     x is a pointer to a constant integer. Rule 3 followed by Rule 4
int const * x      x is a pointer to a constant integer (as above – still confused?)
int * x[10]          x is an array of pointers to integers ( Rule 2, Rule 3)
int * x(void)      x is a function that returns a pointer to an integer (Rule 2, Rule 3)
int **x                 x is a pointer to a pointer to an integer (Rule 3, Rule 3)
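To see these readings hold up under a compiler, here is a minimal sketch that declares each form and checks it behaves as described. The variable and function names (shared_value, get_value, check_simple_declarations and the locals) are placeholders chosen for this example.

```c
#include <assert.h>

static int shared_value = 7;

/* int * f(void): a function that returns a pointer to an integer. */
static int *get_value(void) { return &shared_value; }

/* Declares each form from the list above and checks it; returns 1 on success. */
int check_simple_declarations(void)
{
    int n = 7;
    int *p = &n;
    const int *p1 = &n;   /* pointer to a constant integer            */
    int const *p2 = &n;   /* exactly the same type as p1              */
    int *x[10];           /* array of ten pointers to integers        */
    int **pp = &p;        /* pointer to a pointer to an integer       */

    x[0] = &n;
    p1 = p2;              /* compiles cleanly: the types are identical */

    assert(*p1 == 7);
    assert(*x[0] == 7);
    assert(sizeof x / sizeof x[0] == 10);
    assert(*get_value() == 7);
    assert(**pp == 7);
    return 1;
}
```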

So far so good? Pretty straightforward? Maybe not the pointer-to-a-pointer, but we still need to add two further rules. The first affects Rule 4. What if we have a const that is not next to the type, as in:
int * const x

Here we need a new rule, which we’ll call Rule 4b (with our previous Rule 4 becoming 4a):

Rule 4b: if a const and/or volatile is not next to a type then it applies to the pointer asterisk on its immediate left

int * const x          x is a constant pointer to an integer (this means the pointer address is constant)

Combining 4a and 4b gives us:
int const * const x          x is a constant pointer to a constant integer
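The practical difference between the two const positions is which assignment the compiler refuses. The sketch below shows the permitted operations as live code and the rejected ones as comments; the names are made up for this example.

```c
#include <assert.h>

/* Shows what each const placement permits; returns 1 on success. */
int check_const_placement(void)
{
    int a = 1;
    int b = 2;
    const int *ptr_to_const = &a;   /* const next to the type: the int is read-only  */
    int *const const_ptr = &a;      /* const after the *: the pointer is read-only   */
    int const *const both = &b;     /* both the int and the pointer are read-only    */

    ptr_to_const = &b;              /* allowed: the pointer itself may be re-aimed   */
    /* *ptr_to_const = 5;              rejected: cannot write through this pointer   */

    *const_ptr = 5;                 /* allowed: the pointed-to int may change        */
    /* const_ptr = &b;                 rejected: the pointer cannot be re-aimed      */

    /* *both = 0;  both = &a;          both rejected                                 */

    assert(a == 5);
    assert(*ptr_to_const == 2);
    assert(*both == 2);
    return 1;
}
```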

We have one final rule required to force precedence. For example, we’ve already seen that int * x(void) declares x as a function that returns a pointer to an integer (Rule 2, Rule 3). But what if I wanted to declare a pointer to a function that returns an integer?

The syntax is as follows:
int (*x)(void)          x is a pointer to a function that returns an integer

This gives our final rule, which becomes a new Rule 2 and pushes everything down by one:

Rule 2: If the identifier is within parentheses, then evaluate inside the parentheses first

This rule is required because when we have *x() then the function parentheses always win. Thus:
void (*x)(int y)          x is a pointer to a function that takes an integer (y) as a parameter and returns void
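A short sketch of the difference the parentheses make: one declaration is a function that returns a pointer, the others are pointers to functions. The functions record, next and answer are hypothetical stand-ins.

```c
#include <assert.h>

static int last_seen = 0;
static int counter = 40;

static void record(int y) { last_seen = y; }             /* target for x         */
static int answer(void) { return 42; }                   /* target for g         */
static int *next(void) { counter++; return &counter; }   /* int * f(void) form   */

/* Calls through the pointers; returns 1 on success. */
int check_function_pointer(void)
{
    void (*x)(int y) = record;   /* pointer to a function taking int, returning void */
    int (*g)(void) = answer;     /* pointer to a function returning int              */

    x(3);                        /* call through the pointer                         */
    assert(last_seen == 3);
    (*x)(4);                     /* explicit dereference, same effect                */
    assert(last_seen == 4);
    assert(g() == 42);
    assert(*next() == 41);       /* function call first, then dereference the result */
    return 1;
}
```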

Rule Summary

Rule 1: Read from left to right looking for an identifier
Rule 2: If the identifier is within parentheses, then evaluate inside the parentheses first
Rule 3: look right for postfix operators ( ) or [ ]. If [] then it is an array, else if () then it is a function.
Rule 4: look left for prefix pointer asterisk ‘*’. If found the identifier is a pointer.
Rule 5a: If a const and/or volatile is next to a type specifier (int, long, etc.) it applies to that specifier
Rule 5b: if a const and/or volatile is not next to a type then it applies to the pointer asterisk on its immediate left

Complex Declarations

This core set of rules should decode C program object declarations. Let’s put it to the test on a couple of horrible declarations. First, can you work out:
void (*fpa[10])(int)

Have a go before I break it down…

Okay, let’s decompose this:
Rule 1: From left to right find identifier, this gives us fpa
Rule 2: (*fpa[]) parentheses win, so evaluate inside the parentheses
Rule 3: fpa[10] postfix [] wins; fpa is a ten element array ($ now represents fpa[10])
Rule 4:    *$    prefix * wins; fpa is an array of pointers. Now we’ve evaluated inside the parentheses we step outside.
Rule 3: $()    postfix () wins; fpa is an array of pointers to functions
Finally: void $(int)    read the parameter list and return type; fpa is an array of pointers to functions, each taking an integer parameter and returning void

So the identifier fpa represents an array of ten pointers to functions each of which takes an integer as a parameter and returns void. Phew…
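To make the result concrete, here is a small sketch that declares fpa exactly as above and calls through two of its elements. The functions add and sub, and the running total, are invented for the example.

```c
#include <assert.h>

static int total = 0;

static void add(int n) { total += n; }
static void sub(int n) { total -= n; }

/* Fills two slots of fpa and calls through them; returns 1 on success. */
int check_fpa(void)
{
    /* Elements without an initializer become null pointers. */
    void (*fpa[10])(int) = { add, sub };

    total = 0;
    fpa[0](5);                                   /* calls add(5) */
    fpa[1](2);                                   /* calls sub(2) */

    assert(total == 3);
    assert(sizeof fpa / sizeof fpa[0] == 10);    /* ten elements */
    assert(fpa[9] == 0);                         /* unused slot is null */
    return 1;
}
```

Tables like this are a common way to dispatch on a small integer, for example a command code or state number.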

Okay, one last one to try. Go to the C standard library and look at the declarations in <signal.h>, where you should see:

void (*signal(int sig, void(*func)(int)))(int);

If you can decode this then I’m really impressed!

Let’s apply our rule-set to this:
First, as always, is Rule 1: signal is the identifier. signal is within parentheses, so based on Rule 2 we must evaluate those first. If we match the parentheses we get:
(*signal(int sig, void(*func)(int)))

Which we can temporarily simplify (by ignoring the function parameters) to:
(*signal())

Based on Rule 3, signal is a function, and based on Rule 4 it returns a pointer. The question is: a pointer to what? Using the simplification we can work out the return type as:
void (* signal() )(int)

which becomes
void (*$)(int)

which means the function signal returns a pointer to a function that has an integer parameter and returns void.

So let’s return to the parameters, this gives us:
signal(int sig, void(*func)(int))

So signal takes two parameters
int sig – sig is an integer
void(*func)(int) – func is a pointer to a function that has an integer parameter and returns void.

To summarise:

signal is a function
that returns a pointer to a function that has an integer parameter and returns void
and takes two parameters of
an integer, and
a pointer to a function that has an integer parameter and returns void
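Seen from the calling side, the decoded declaration looks like this. The sketch installs a handler, keeps the returned pointer in a variable of the matching type, and restores it afterwards. It uses only signal, raise, SIGINT, SIG_ERR and sig_atomic_t from <signal.h>; the names on_signal, last_signal and check_signal are made up for the example.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t last_signal = 0;

/* Matches void (*func)(int): takes an integer, returns void. */
static void on_signal(int sig) { last_signal = sig; }

/* Installs, triggers and removes the handler; returns 1 on success. */
int check_signal(void)
{
    /* signal returns a pointer to a function (int) returning void. */
    void (*previous)(int);

    previous = signal(SIGINT, on_signal);
    assert(previous != SIG_ERR);

    raise(SIGINT);                    /* handler runs before raise returns */
    assert(last_signal == SIGINT);

    signal(SIGINT, previous);         /* put the old handler back */
    return 1;
}
```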

It doesn’t get much worse than this (and remember this example comes from the standard library, which is shameful!).

How to avoid complexity in declarations
Avoid by design, as far as possible. If this fails, divide and conquer remembering that typedef is your friend. A typedef declaration does not introduce a new type, only a synonym for the type specified. For example:

typedef int MILES;
MILES m; /* m is of type int */

typedef int*  int_ptr;
int_ptr  ip;  /* ip is of type integer pointer int* */

Used well, typedefs make life easier. For example:
typedef void (*FuncPtr)(int);

FuncPtr is a typedef for a pointer to any function which takes an integer parameter and returns void.

In the “signal” example, both function pointers are of this type, so using the typedef, the declaration
void (*signal(int sig, void(*func)(int)))(int)

becomes
FuncPtr signal(int sig, FuncPtr func)

and our previous declaration of:
void (*fpa[10])(int)

becomes
FuncPtr fpa[10]
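One way to convince yourself that the typedef form really is the same declaration is to declare a function both ways and let the compiler check that the two agree. In this sketch choose is a hypothetical function with the same shape as signal (the real name is avoided so it does not clash with <signal.h>), and handler is a placeholder.

```c
#include <assert.h>

typedef void (*FuncPtr)(int);

static int calls = 0;
static void handler(int n) { calls += n; }

/* The same function declared twice; the compiler accepts both because
   the types are identical. */
FuncPtr choose(int sig, FuncPtr func);
void (*choose(int sig, void (*func)(int)))(int);

/* Returns func when sig is positive, otherwise a null pointer. */
FuncPtr choose(int sig, FuncPtr func)
{
    return sig > 0 ? func : 0;
}

/* Mixes the typedef and raw spellings freely; returns 1 on success. */
int check_typedef(void)
{
    FuncPtr fpa[10] = { handler };   /* same type as void (*fpa[10])(int) */
    void (*raw)(int) = fpa[0];       /* no cast needed                    */

    calls = 0;
    choose(1, raw)(3);               /* call the returned function        */
    assert(calls == 3);
    assert(choose(0, raw) == 0);
    return 1;
}
```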

After that I need to find a dark room to lie down in.

Decoding Rule-set
Rule 1: Read from left to right looking for an identifier
Rule 2: If the identifier is within parentheses, then evaluate inside the parentheses first
Rule 3: look right for postfix operators ( ) or [ ]. If [] then it is an array, else if () then it is a function.
Rule 4: look left for prefix pointer asterisk ‘*’. If found the identifier is a pointer.
Rule 5a: If a const and/or volatile is next to a type specifier (int, long, etc.) it applies to that specifier
Rule 5b: if a const and/or volatile is not next to a type then it applies to the pointer asterisk on its immediate left

Also check out https://www.cdecl.org/ (thanks @FrankSansC)


Niall Cooling

Director at Feabhas Limited

Co-Founder and Director of Feabhas since 1995.
Niall has been designing and programming embedded systems for over 30 years. He has worked in different sectors, including aerospace, telecomms, government and banking.
His current interests lie in IoT Security and Agile for Embedded Systems.


5 Responses to Unscrambling C Declarations

Pavel says:

December 9, 2009 at 7:37 pm

Useful stuff. Thank you!
However, the theme is a bit popular in related books, etc.
PS
Seems to be a little slip: "where ‘x’ is an (identifier for an) integer" should be "where ‘int’ is an (identifier for an) integer" or "'x' declared as of integer (data) type".
PPS
Suppose this will be followed by discussing of 'volatile' and 'const' usage.

pjotr says:

December 10, 2009 at 8:51 am

Sorry to (maybe) disappoint you, but your assumption is wrong about functions that never return an array. An example:

struct ret {
int a[10];
} Function (void);

peterbushellwp says:

December 21, 2009 at 12:05 pm

Pavel, I don't understand your two alternatives to Niall's perfectly lucid "where 'x' is an (identifier for an) integer". Your first alternative is just wrong - 'int' is not an identifier but a reserved word, in C, indicating a type. Your second alternative approximates to what Niall wrote in the first place, but does so less clearly. I feel something has been lost, here, in your translation from English to Russian and back again 🙂

Pjotr, your Function does not return an array; it returns a structure containing an array.

A prerequisite for successful pedantry is meticulous accuracy.

uMinded says:

November 14, 2010 at 11:49 pm

Here is one I still gotta re-re-read when I do not use it for a while.

#define SBIT(port,pin) ((*(volatile struct bits*)&port).b##pin)

Its used to access register bits (via a structure) like a variable so you could say:

#define LED0 SBIT(PORTB,4)
LED0 = 1;
LED0 = 0;
