If you work in high-integrity or safety-critical software then you have a duty-of-care to:
- Ensure your code does what it should
- Ensure your code doesn’t do what it’s not supposed to.
Moreover, you're typically required to both demonstrate and document how you've achieved this duty-of-care. Typical mechanisms for demonstrating compliance include unit testing and static analysis; documentation usually includes function specifications and test results. The amount of testing, static analysis and documentation you (have to) do depends on the integrity level of your system. The more critical the consequences of a failure, the more evidence you have to provide to demonstrate that you've considered, and mitigated, the failure.
I would argue that, even if your code could never kill someone, its failure will still ruin someone’s day. As a professional engineer you always have some duty-of-care to ensure your system works ‘properly’. Therefore, you always should do an ‘appropriate’ level of testing, analysis and documentation.
Writing dependable code is a combination of two things:
- Ensuring your code is implemented correctly (by you)
- Ensuring your code is used correctly (by others)
In my experience, engineers are normally pretty good at implementing functionality correctly. Where they are usually weaker is in taking the time and effort to specify exactly what they should be implementing. In particular, error paths, corner cases and combinations of these tend to get overlooked. That is, less experienced engineers tend to focus on the 'sunny day' scenario, and to be overly optimistic about the other code around them.
Ensuring your code is used correctly is normally an exercise in documentation. However, documentation suffers from a number of challenges:
- We're not given the time to do it well
- No-one likes writing it
- Good documentation is a rarity; probably because of the above
- No-one reads documentation anyway; also probably because of the above
- Your code can't be automatically checked against the documentation
The above means that documentation is often an exercise in futility for the writer, and an exercise in frustration for the reader.
So, in this article I want to explore language features, existing and forthcoming, that can help us improve the quality of our code. For the purposes of this article I’ll concentrate on one aspect of C++ – functions.
In order to be more effective than simple text documentation these features must provide the following characteristics:
- They should guide the implementer as to corner cases, error paths, etc.
- They should establish a (bare minimum) set of unit test cases
- They should specify any requirements imposed on the user
- They should establish guarantees – promises – provided by the code to the client
- They should be automatically checkable – ideally at compile-time
- They should make code more readable
- They should, wherever possible, improve a compiler’s ability to optimise code.
UPDATE:
Since writing this article, Contracts have been dropped from C++20. The code presented here remains incomplete and experimental. Here's hoping this very important feature makes it into a release sometime soon.
When things go wrong
Program errors can be broken down into two main categories:
- Programming errors
- Failures of the C++ abstract machine.
Programming errors are mistakes made by the programmer. We could sub-divide programmer errors further:
- Use of unspecified or undefined language constructs
- Out-of-bounds and out-by-one errors
- Failure to bounds-check parameters
- Incorrect implementation of algorithms
- Missing paths through code
- Incorrect system construction
- Misunderstanding of system requirements
Notice that this set of errors says nothing about the flexibility, maintainability or extensibility of the code (its intrinsic quality). It is merely concerned with code correctness (what we might call its extrinsic quality). We might build a program that avoids all these errors and is still completely unmaintainable, or not reusable!
Some of these programming errors can be caught by tools (for example, the use of unspecified or undefined language constructs is the remit of static analysis tools).
Some programming errors require dynamic testing, for example incorrect implementation of algorithms. White box (coverage) testing is a useful tool for identifying missing paths.
Higher-level programming errors – incorrect system construction and misunderstanding of system requirements – are, ultimately, failures of the design (and requirements analysis) process.
The C++ abstract machine is a conceptual model of the C++ memory model and run-time environment. It assumes, for example, that a C++ program has static memory, a stack, a free store (heap) and so on. A failure of the abstract machine could be, for example:
- running out of space on the free store
- running out of stack memory
- I/O device failure
- etc…
Failures of the abstract machine are not errors we expect to occur in normal operation, but they can be anticipated. We should accept that abstract machine failures could happen, and design contingencies for when they do. This is why, for example, most embedded coding standards ban the use of dynamic memory.
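As a minimal, hypothetical sketch of designing for one of these contingencies (the names here are illustrative, not from any particular system), a function that cannot tolerate uncontrolled free-store exhaustion might check the allocation explicitly rather than letting it fail in an uncontrolled way:

```cpp
#include <new>      // std::nothrow
#include <cstdlib>  // std::abort

struct Sample { int value; };

Sample* acquire_sample_buffer()
{
    // Anticipate free-store exhaustion: nothrow-new returns nullptr
    // instead of throwing std::bad_alloc.
    Sample* buffer = new (std::nothrow) Sample[32];

    if (buffer == nullptr) {
        // Contingency: degrade (for example, fall back to a smaller,
        // statically-allocated pool), or put the system into a safe
        // state and halt.
        std::abort();
    }
    return buffer;
}
```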
When something does go wrong, we have to design an error handling strategy. We’ve got four basic options:
- Ignore it
- Handle it
- Degrade
- Halt
System errors – for example, a user making an incorrect selection; or messages arriving out-of-sequence – are not program errors. Your design should anticipate these and have code to handle them. Not having the handler code is a program error!
In the case of some errors – for example, hardware failures – we cannot simply handle them and carry on. In such cases our only solution may be to degrade the system in some way. That is, continue, but with reduced functionality, or less accuracy or lower performance.
Some programming errors, notably out-of-bounds values or system construction errors, mean that the system is executing outside its designed parameters. The behaviour of the system is ipso facto undefined; we should never expect it to operate correctly. The only viable option on detecting such errors is to halt the system, rebuild it correctly and try again.
In general, the more you err towards the later options in that list, the safer your system is likely to be. However, this will normally come at the price of system availability. Which is more important is governed by your stakeholders.
For example: if every time an error occurs we simply halt the system (after putting it in a quiescent or safe state), there is less chance of it doing something uncontrolled. However, this is likely to lead to frustrated users because the system keeps stopping and requiring a reset.
On the other hand, your stakeholders may require the system keeps functioning even in the face of multiple errors (for example, a life-support system), even if the performance is sub-optimal.
Contracts
The idea of design-by-contract is not new. The term was coined by Bertrand Meyer in 1986 and implemented in his Eiffel programming language. Design-by-contract has its roots in formal specification techniques.
Design-by-contract is based on a metaphor of software modules cooperating under the terms of a contract. The contract specifies:
- A set of obligations the client must fulfil, known as the preconditions. The server benefits by not having to handle cases outside of the precondition.
- A set of guarantees the server will deliver – the postcondition. The server is obliged to guarantee this. Obviously, the client benefits by having guaranteed properties after the call
- Certain states that will be maintained – known as the invariant.
The principle of design-by-contract is that many programming errors can be caught (or even avoided) by specifying contractual obligations for the caller (pre-conditions) and guarantees/promises required of the function itself (post-conditions).
Given enough information (strong typing, permissible variable ranges, etc.) contracts should be checkable at compile-time. Languages such as Ada (and to an extent, Rust) provide such capabilities. For C++, because function pre- and post-condition checking depends on the actual values of program objects, contract checking has to be a run-time facility. This, of course, has a run-time performance impact, so it should be possible to disable contract checking for a particular system build (for example, you will probably want contract checking enabled in debug builds, but disabled for release builds).
In the C++20 proposal, contract specification was formalised using the attributes mechanism of the language.
Assertion
C programmers are no doubt familiar with the assert macro. This has been replaced in C++20 with the assertion attribute.
The assertion attribute takes an expression in the form of a predicate. If the predicate evaluates to true, execution continues. If the predicate evaluates to false, a contract violation occurs (which we discuss below).
```cpp
void func(int a, int b)
{
    // Some code...
    [[ assert: a + b < 256 ]];
}
```
This works just like the C assert macro. However, this isn't a function-like macro, as C's assert is, so it doesn't suffer from the problems of function-like macro expansion. For example, the following won't compile:
```cpp
struct Coordinate {
    int x;
    int y;
};

bool operator==(const Coordinate& lhs, const Coordinate& rhs);

void func(const Coordinate& pos)
{
    assert(pos == Coordinate { 0, 0 });  // <= That comma upsets
                                         //    the preprocessor!
}
```
However, using the assert attribute removes the error
```cpp
void func(const Coordinate& pos)
{
    [[ assert: pos == Coordinate { 0, 0 } ]];  // OK
}
```
The predicate must not have any side-effects; that is, it must not change the state of any objects in the system (either local variables or statics). A predicate with a side-effect is undefined behaviour (and, given that we're trying to reduce the opportunity for programming errors, must be considered A Bad Thing).
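As a hedged illustration (the function and variable names are hypothetical), the first version below has a predicate with a side effect and is therefore undefined behaviour; the second expresses the same intent without modifying any state:

```cpp
// Bad: the predicate modifies call_count - a side effect on a static,
// which makes the assertion undefined behaviour.
void process_bad(int value)
{
    static int call_count = 0;
    [[ assert: ++call_count < 100 ]];
    // ...
}

// Better: update the state first, then assert on it without modifying it.
void process_good(int value)
{
    static int call_count = 0;
    ++call_count;
    [[ assert: call_count < 100 ]];
    // ...
}
```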
A predicate can also include function calls:
```cpp
bool is_valid(int x);

void func(int val)
{
    // ...
    [[ assert: is_valid(val) ]];
    // ...
}
```
Preconditions
Preconditions are specified with the ‘expects’ attribute. The attributes are applied to the function’s declaration, not its definition.
Precondition predicates are evaluated prior to entering the function’s body. If a predicate evaluates to false this is a contract violation.
```cpp
void func(int a)
    [[ expects: a < 100 ]];
```
You can have multiple preconditions. They are evaluated in order of declaration.
```cpp
Error queue_create(
    QUEUE* const queue_handle,
    std::size_t  elem_size,
    std::size_t  queue_size
)
[[ expects: queue_handle != nullptr ]]
[[ expects: elem_size > 0 ]]
[[ expects: queue_size > 0 ]];
```
As with assertions, a precondition predicate can include a function call, but cannot have a side-effect:
```cpp
class Buffer {
public:
    void add(int data);
    int  get();
    bool is_empty() const;
};

void process(const Buffer& buffer)
    [[ expects: !buffer.is_empty() ]];
```
Preconditions are, wherever possible, evaluated at the call site. This is what you want the majority of the time.
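A small, hypothetical sketch of what that means in practice (get_reading is an assumed helper):

```cpp
int get_reading();  // hypothetical helper

void func(int a)
    [[ expects: a < 100 ]];

void client()
{
    int reading = get_reading();

    func(reading);  // conceptually, the check (reading < 100) happens here,
                    // at the call site, before func's body is entered
}
```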
Establishing preconditions on a function, as well as reducing the opportunity for programming errors, also gives the compiler the potential to optimise code, given that there are now constraints on the parameters that will not be violated.
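Here's a hedged sketch of that idea (the lookup function is hypothetical): because the precondition guarantees the index is in range, the defensive check we might otherwise write in the body becomes redundant, and an optimiser is entitled to assume the constraint holds:

```cpp
#include <array>
#include <cstddef>

// The declaration carries the precondition.
int lookup(const std::array<int, 100>& table, std::size_t index)
    [[ expects: index < table.size() ]];

int lookup(const std::array<int, 100>& table, std::size_t index)
{
    // No defensive bounds check is needed here: the precondition guarantees
    // index is in range, and an optimiser is free to assume it holds.
    return table[index];
}
```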
Postconditions
Postconditions are specified with the ‘ensures’ attribute. Postconditions are evaluated at the exit of the function.
```cpp
void allocate_ADT(ADT* ptr)
    [[ expects: ptr == nullptr ]]
    [[ ensures: ptr != nullptr ]];
```
Postcondition predicates can use the return value from the function. Since the return value object doesn't have a name, you can (for the lifetime of the predicate) give it one:
```cpp
int count_items(const vector<int>& v)
    [[ ensures result: result >= 0 ]];
```
As previously, postcondition predicates can call functions, and cannot have side effects. Additionally, the postcondition predicate cannot modify the return value object.
```cpp
bool is_valid(int val);

int func()
    [[ ensures result: is_valid(result) ]];
```
(Yes, the function being called in the postcondition can itself have preconditions. I'll leave it as the perennial 'exercise for the reader' to work out what should happen.)
Finally, if (for some deranged reason) you use automatically-deduced function return types, you cannot put postconditions on the function declaration:
```cpp
auto make_something()
    [[ ensures result: true ]];  // Error - what's the type of result?!
```
Contract violation
Violating a contract causes the implementation to create a std::contract_violation object, as follows:
```cpp
namespace std {
    class contract_violation {
    public:
        uint_least32_t line_number() const noexcept;
        string_view    file_name() const noexcept;
        string_view    function_name() const noexcept;
        string_view    comment() const noexcept;
        string_view    assertion_level() const noexcept;
    };
}
```
Although the details of the contract_violation are implementation-defined, the standard encourages certain responses:
- In the case of a precondition violation implementations are encouraged to report the caller site source location.
- In the case of a postcondition violation, the source location will be the one from the callee (server) site.
- In the case of an assertion, the source location will be the one from the assertion itself.
- The comment should contain text relating to the conditional-expression of the contract that was not satisfied.
The contract_violation object is passed to a violation handler function, of the form:
```cpp
void (*)(const std::contract_violation&)
```
We might expect the output to be something of the form
```
Error: contract violation
In file: /usr/feabhas/project/src/main.cpp  line 130
In call to function 'do_stuff':
precondition violation: 'ptr != nullptr'
assertion level: default
```
(Please note: This output is pure speculation. At the time of writing there are no implementations of contracts)
The API should also allow you to write your own violation handler, to do whatever is appropriate for your system.
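As a minimal sketch, assuming the std::contract_violation interface shown above, a custom handler might simply log the violation. How the handler gets installed (compiler switch, link-time option, etc.) is implementation-defined, so that step isn't shown here:

```cpp
#include <iostream>

// A custom violation handler matching the required signature.
void log_violation(const std::contract_violation& violation)
{
    std::cerr << "Contract violation in "
              << violation.file_name() << ':'
              << violation.line_number()
              << " (" << violation.function_name() << "): "
              << violation.comment() << '\n';
}
```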
Violation continuation mode
A translation unit may be compiled with one of two violation continuation modes: off and on.
With the continuation mode set to off (the default), after the violation handler has completed, std::terminate() is called.
With the continuation mode set to on, after the violation handler has completed your program will continue executing. This provides the opportunity to install a logging handler to instrument a pre-existing code base and fix errors before (formally) enforcing checks.
(Again, as of the time of writing there is no implementation; I can only assume the violation continuation mode will be set via a compilation flag.)
Contract levels
Evaluating contract constraints has the potential to be very expensive, and we don't always need to perform run-time checks of all constraints. To aid in this, each constraint can be given a contract level, one of the following:
default
This is (as the name suggests) the default value. The default contract level is suggested for those contract predicates that are cheap (or rather, not expensive) to perform; and so can be performed without significantly affecting the performance of the system.
audit
This level is recommended for constraints that are expensive to perform and should therefore be limited to development-type builds.
axiom
Constraints marked as axiom are not included in run-time checks. However, it is envisaged that axiom constraints could be subjected to static analysis.
The contract level is specified as an optional parameter on each constraint
```cpp
double square(double x)
    [[ expects axiom: x >= 0 ]];

double sqrt(double x)
    [[ expects default: x >= 0 ]];

void sort(vector<ADT>& v)
    [[ ensures audit: is_sorted(v) ]];
```
Which contract level is currently active is selected via the build level. How the build level is set is implementation-defined (but again, I'm presuming a compiler flag); a sketch of the effect is given after the list below. The build level has one of three values:
- off. No contract violation checking is performed
- default. Contract checking is only performed on those constraints specified as default.
- audit. Contract checking is performed on all constraints specified as either default or audit.
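Taking the three declarations above, a sketch of which constraints would actually be checked under each build level (the calling code is purely illustrative):

```cpp
void example(vector<ADT>& v)
{
    double a = square(-1.0);  // axiom: never checked at run-time;
                              // a candidate for static analysis instead
    double b = sqrt(-1.0);    // default: the violation is caught at build
                              // level 'default' or 'audit', but not at 'off'
    sort(v);                  // audit: the is_sorted postcondition is only
                              // checked at build level 'audit'
}
```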
Function attributes
While we're talking about attributes and extending function specifications, I should probably mention some of the other attributes that can be applied to functions. Some of these have been around since C++11; some were introduced in C++17.
[[ maybe_unused ]] (from C++17)
This instructs the compiler to suppress any warnings about objects (parameters) that may not be used in the function. Normally you shouldn’t have unused parameters in functions (if you’re not going to use them, don’t add them!). This attribute may be useful with interfaces/virtual functions: an interface should represent an abstract service; a particular implementation may not require all parameters (whereas others might).
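A hypothetical sketch of that interface case: the abstract service takes a priority, but this particular implementation has no use for it.

```cpp
// An abstract service: implementations are free to ignore parameters
// they don't need.
class MessageSink {
public:
    virtual ~MessageSink() = default;
    virtual void post(int id, int priority) = 0;
};

class SimpleSink : public MessageSink {
public:
    // This implementation has no concept of priority; the attribute
    // suppresses the 'unused parameter' warning without changing the
    // interface.
    void post(int id, [[maybe_unused]] int priority) override
    {
        handle(id);
    }

private:
    void handle(int id);
};
```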
[[ nodiscard ]] (from C++17)
Emits a diagnostic if the return value from the function is not used (read). For example, if a function returns an error code it is probably good practice to ensure the caller checks that error.
[[ noreturn ]] (from C++11)
Indicates that this function never returns to its caller (for example, because it contains an infinite loop); the compiler will emit a diagnostic if the function has a (reachable) return statement. This attribute is useful for indicating thread functions for run-forever ('static') tasks in concurrent systems, for example.
[[ deprecated ]] (from C++14)
Issues a diagnostic warning that the function should no longer be used, typically because it has been superseded by another.
```cpp
void uses_some(
    int param1,
    int param2 [[ maybe_unused ]],
    int param3 [[ maybe_unused ]]
);

[[ noreturn ]]  void control_loop();

[[ nodiscard ]] Error_type might_fail();

[[ deprecated("Use new_do_stuff instead") ]]
void do_stuff();
```
Contracts versus exceptions
There are situations during a program's execution where, were an error to occur, to continue would be meaningless (or dangerous):
- Failures of the abstract machine – for example, running out of dynamic memory, or a hardware failure
- The program is receiving data, or has state, that is outside the design parameters for the system.
- The system’s construction is incorrect – parts are missing, of the wrong type, or connected together incorrectly
In the first case, we have the potential to handle such situations by disabling functionality, degrading performance, etc. We have to design for these contingencies, hoping they will never happen.
In the second case, our design must be incorrect if the system is acting in a way that we've never considered. Our best option is to stop and re-visit the design.
In the third case, we again have to stop and put the system together correctly.
A rigorous testing regime aims to identify as many of these situations as possible before the system is released into operation.
Until C++20 we had three options for handling this type of program error:
- Manual error handling, with error codes
- Exceptions
- The assert macro
Each of these mechanisms has important considerations:
- Manual error handling clutters application code and complicates testing by creating additional paths through code which all have to be tested.
- Exceptions are a ‘heavyweight’ option. Although the non-exception-case code generally suffers little performance impact, the exception handling mechanism is significant additional code (especially for smaller, embedded systems) and exception propagation is non-deterministic. More significantly, the exception handling architecture is a complex design problem. For these reasons many embedded, and high-integrity architectures forbid the use of exceptions.
- The assert macro is a holdover from C and is somewhat crude. It doesn’t provide any mechanism for degrading systems at run-time.
With C++20’s contracts we now have a more effective mechanism for dealing with a large number of our programming errors:
- As contracts are checked by the C++ run-time we don’t have to provide error-checking code ourselves, leading to cleaner code.
- Specifying contract constraints on code forces developers to actually think about what should happen in a function. This both increases the chances of actually coding the function correctly, but also provides important information for establishing effective unit tests.
- In many instances contract checks can be used in place of exceptions (see the sketch after this list). This is not only a lot more efficient; it makes developing an exception handling architecture simpler.
- Exceptions can now be employed for truly exceptional cases – failures of the abstract machine.
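As a hedged before-and-after sketch (the function names are hypothetical): the defensive throw becomes a precondition, so the error path disappears from both the function body and the caller's exception-handling architecture.

```cpp
#include <stdexcept>

// Exception-based version: the defensive check and the throw are extra
// paths that both caller and callee have to design and test for.
double scale_reading(double reading, double factor)
{
    if (factor <= 0.0) {
        throw std::invalid_argument{"factor must be positive"};
    }
    return reading * factor;
}

// Contract-based version: the obligation moves to the caller as a
// precondition; a violation is a programming error handled by the
// violation handler, and the function body stays clean.
double scale_reading_checked(double reading, double factor)
    [[ expects: factor > 0.0 ]];
```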
Interestingly, the C++ standards committee has acknowledged that C++ exceptions, whilst not broken, are a less-than-optimal way of providing error handling in programs. There are projects to reduce, or even eliminate, exception handling from the standard library. This should mean that many more embedded projects will be able to adopt the C++ standard library, rather than having to 'roll their own'.
Summary
In my view C++20’s contracts (along with concepts) are one of the most important new features in C++. They change the way we specify and write C++ code.
One of the key points that comes out of this, though, is the fundamental requirement to design a system before coding. This might sound blindingly obvious, but in my experience many engineers prefer to code their way out of problems, rather than design them out before they ever become a problem.
By adding these new mechanisms, hopefully we’ll see a step-change in the quality of C++ programs being constructed.
Glennan is an embedded systems and software engineer with over 20 years experience, mostly in high-integrity systems for the defence and aerospace industry.
He specialises in C++, UML, software modelling, Systems Engineering and process development.