By Glennan Carnie
Functions are the lifeblood of a C program. The program flow is altered by passing parameters to functions, which are then manipulated. Conceptually, function parameters are defined as being one of:
- Inputs (Read-only) – client-supplied objects manipulated within the function only.
- Outputs (Write-only) – objects generated by the function for use by the client.
- Input-Outputs (Read-Write) – client objects that can be manipulated by the function.
Defining the use of a parameter gives vital information not only to the implementer, but (perhaps more importantly) to the user of the function, by specifying the ‘contract’ of the function more explicitly.
Many programming languages (for example, Ada) support these concepts explicitly. C, however, does not. One has to remember that when Kernighan and Ritchie developed C, structured programming was very much in its infancy and many of these ideas were still being formulated (also remember that one of C’s design goals was parsimony).
Even today, though, these concepts are rarely taught to C programmers and that has often led to clumsy, insecure or even downright dangerous APIs.
If C doesn’t support these concepts explicitly, can we simulate them? The answer is (of course) yes, by using some basic language constructs and forming some idioms.
Let’s look at each parameter type in turn.
C specifies a call-by-value (or call-by-copy) paradigm. That is, when a C function is called, the compiler sets up a call frame that holds copies of the function’s arguments. Therefore, when you pass parameters by value you are, in effect, creating a copy for the function to use; nothing the function does to that copy affects the caller’s data.
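As a minimal sketch of call-by-value (the function names `increment` and `caller_value_after_call` are hypothetical, chosen only for illustration), the function below changes its parameter while the caller's variable stays as it was:

```c
#include <assert.h>

/* 'value' is a copy of the caller's argument. */
int increment(int value)
{
    value = value + 1;   /* modifies the local copy only */
    return value;
}

/* Calls increment() and returns the caller's own variable afterwards,
   to show that it has not been altered by the call. */
int caller_value_after_call(void)
{
    int original = 10;
    int result = increment(original);

    assert(result == original + 1);  /* the copy was incremented */
    return original;                 /* the caller's data is untouched */
}
```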
This is fine for simple types, but what about user-defined types – structs? What’s the problem with passing them by value?
Passing a structure by value means allocating enough memory for the parameter and then copying the contents of the original object into the parameter. In many embedded systems, where memory is at a premium, this could easily overflow the stack at run-time, where the consequences can be difficult to track down.
Strictly, to be explicit you should specify the type of the parameter as a const:
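A sketch of what a const by-value parameter might look like. The `Readings` type, `MAX_SAMPLES` and `sum_readings` are hypothetical names used for illustration, not taken from the original listing:

```c
#include <assert.h>

#define MAX_SAMPLES 64

/* Hypothetical user-defined type. */
typedef struct {
    int samples[MAX_SAMPLES];
    int count;
} Readings;

/* 'r' is a const copy of the caller's object: the function may read it,
   but the compiler rejects any attempt to assign to it or its members. */
int sum_readings(const Readings r)
{
    assert(r.count >= 0 && r.count <= MAX_SAMPLES);

    int total = 0;
    for (int i = 0; i < r.count; ++i) {
        total += r.samples[i];
    }
    /* r.count = 0;   <- would not compile: 'r' is const */
    return total;
}
```

Note that the structure is still (at least notionally) copied on every call; the const only documents and enforces the read-only intent.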
For simple types this is unlikely to add much value; however it may provide some benefit with structures.
If a parameter is passed as a const struct, the compiler has the opportunity to optimise the call: it can pass the address of the structure instead of making a copy.
Note that this optimisation may not be supported by all compilers, and might not occur at all levels of optimisation.
The resolution to the above problem is to explicitly pass a pointer to the structure:
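A minimal sketch of the pointer version, again using the hypothetical `Readings` type and `sum_readings` function (assumptions for illustration only):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLES 64

/* Hypothetical user-defined type. */
typedef struct {
    int samples[MAX_SAMPLES];
    int count;
} Readings;

/* Only the address of the caller's object is passed; the array itself
   is not copied. Nothing here stops the function writing through 'r'. */
int sum_readings(Readings *r)
{
    assert(r != NULL);

    int total = 0;
    for (int i = 0; i < r->count; ++i) {
        total += r->samples[i];
    }
    return total;
}
```

The caller now writes `sum_readings(&readings)` rather than `sum_readings(readings)`.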
This is clearly more efficient than copying the whole structure. OK, the syntax has got a little messier, but we can live with that.
But hang on: do we still have an Input parameter? Actually, no.
What we’ve got here is an input-output parameter. By passing a ‘raw’ pointer the function can manipulate the caller’s object. To fix this we need to prevent manipulation of the pointed-to object:
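One way to sketch this is to qualify the pointed-to type with const. The `Readings` type and `sum_readings` function are hypothetical names used for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLES 64

/* Hypothetical user-defined type. */
typedef struct {
    int samples[MAX_SAMPLES];
    int count;
} Readings;

/* 'r' points to a const Readings: the function can read the caller's
   object through the pointer but cannot modify it. */
int sum_readings(const Readings *r)
{
    assert(r != NULL);

    int total = 0;
    for (int i = 0; i < r->count; ++i) {
        total += r->samples[i];
    }
    /* r->count = 0;   <- would not compile: *r is const */
    return total;
}
```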
Still not quite there, though. What happens below?
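One possibility, sketched here under assumptions (the `Readings` type, the file-scope `defaults` object and `sum_readings` are all hypothetical), is that the function re-points its parameter at a different object. The caller's object is untouched, but the function quietly processes the wrong data:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLES 64

/* Hypothetical user-defined type. */
typedef struct {
    int samples[MAX_SAMPLES];
    int count;
} Readings;

/* Hypothetical file-scope object that the function can reach. */
static const Readings defaults = { {100, 200}, 2 };

int sum_readings(const Readings *r)
{
    assert(r != NULL);

    r = &defaults;   /* legal: the pointed-to object is const, the pointer is not */

    int total = 0;
    for (int i = 0; i < r->count; ++i) {
        total += r->samples[i];   /* sums 'defaults', not the caller's data */
    }
    return total;
}
```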
Strictly, we should also make the pointer itself const to prevent the function (either accidentally or maliciously) re-pointing it at some object other than the caller’s:
This is a very good general rule of thumb for functions: make all pointers const (the pointer itself and, for input parameters, the pointed-to object too).
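A sketch of a const pointer to a const object, using the same hypothetical `Readings` type and `sum_readings` function as an illustration:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLES 64

/* Hypothetical user-defined type. */
typedef struct {
    int samples[MAX_SAMPLES];
    int count;
} Readings;

/* Read right to left: 'r' is a const pointer to a const Readings.
   The function can neither modify the object nor re-point 'r'. */
int sum_readings(const Readings * const r)
{
    assert(r != NULL);

    int total = 0;
    for (int i = 0; i < r->count; ++i) {
        total += r->samples[i];
    }
    /* r->count = 0;   <- would not compile: *r is const          */
    /* r = NULL;       <- would not compile: r itself is const    */
    return total;
}
```

The const on the pointer itself is not part of the function's type as seen by callers; it constrains only the implementation.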
An output parameter is one that the function can write to, but never read (i.e. write-only). In C the only real mechanism we have for that is the function return value.
Most programmers are happy to return simple types from functions but what about the following code?
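A sketch of the kind of code in question. The names `makeBigStruct`, `temp` and `biiig` come from the discussion here; the `BigStruct` type, its contents and the `last_element` helper are assumptions made for illustration:

```c
#include <assert.h>

#define BIG_SIZE 256

/* Hypothetical large structure. */
typedef struct {
    int data[BIG_SIZE];
} BigStruct;

/* Builds a structure locally and returns it by value. */
BigStruct makeBigStruct(void)
{
    BigStruct temp = { {0} };
    for (int i = 0; i < BIG_SIZE; ++i) {
        temp.data[i] = i;
    }
    return temp;
}

/* Hypothetical caller: receives the returned structure by value. */
int last_element(void)
{
    BigStruct biiig = makeBigStruct();

    assert(biiig.data[0] == 0);
    return biiig.data[BIG_SIZE - 1];
}
```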
Since C performs pass (and return!) by value, this would appear very inefficient:
The original object (biiig) is constructed. Then, when makeBigStruct is called, space for the return value is allocated. Inside makeBigStruct, temp is allocated. On return, temp is copied into the return value and then, finally, copied into biiig.
Knowing this, most programmers never return structures from functions, preferring instead to supply them as input-output parameters. However, most modern compilers provide an optimisation that does exactly this for you: the receiving object is handed to the function behind the scenes.
Below is the same code but showing the optimisation. Instead of returning the structure the address of the receiving object is (implicitly) passed to the function. At the end of the function the return value is copied into the receiver, negating the need for a temporary return object.
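The transformation can be written out by hand as a sketch. This is only an illustration of the idea, not what any particular compiler emits; `BigStruct`, `makeBigStruct_lowered` and `last_element` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

#define BIG_SIZE 256

/* Hypothetical large structure. */
typedef struct {
    int data[BIG_SIZE];
} BigStruct;

/* Hand-written equivalent of the optimised call: the address of the
   receiving object arrives as a pointer parameter, which the compiler
   would normally supply implicitly. */
void makeBigStruct_lowered(BigStruct * const result)
{
    assert(result != NULL);

    BigStruct temp = { {0} };
    for (int i = 0; i < BIG_SIZE; ++i) {
        temp.data[i] = i;
    }
    *result = temp;   /* copied straight into the receiver; no temporary return object */
}

/* Hypothetical caller: passes the address of the receiving object. */
int last_element(void)
{
    BigStruct biiig;
    makeBigStruct_lowered(&biiig);
    return biiig.data[BIG_SIZE - 1];
}
```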
In general, then, it is OK to return a struct from a function by value (unless you’re using an ancient C compiler). If you’re not certain (or your compiler doesn’t support this optimisation) it’s probably safer for you to use input-output parameters instead.
Finally, it’s worth noting the small detail that, unlike other languages, a C function can only have one output parameter. You’ll need to use input-output parameters for the rest.
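As a sketch of that pattern (the `divide` function and its parameter names are hypothetical), the return value can carry one result, here a success flag, while pointer parameters deliver the rest:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true on success. The quotient and remainder are delivered
   through pointer parameters because C allows only one return value.
   The pointers themselves are const; the objects they point to are not,
   since the function must write to them. */
bool divide(int numerator, int denominator,
            int * const quotient, int * const remainder)
{
    assert(quotient != NULL && remainder != NULL);

    if (denominator == 0) {
        return false;   /* outputs are left untouched on failure */
    }
    *quotient  = numerator / denominator;
    *remainder = numerator % denominator;
    return true;
}
```

A caller would write something like `divide(17, 5, &q, &r)` and check the returned flag before using `q` and `r`.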
Making the world a better place.
Using these idioms consistently is a very good way to improve the quality of your code. First, it allows the compiler to provide stronger checking on your code. Second, it gives the reader extra information about how to use your functions and what guarantees (or promises) they can expect from them.
You may have noticed I’ve ignored arrays in this article. Check out this blog post for passing arrays to functions.
* Or, hokey-pokey if you prefer.