Task Synchronisation – Part 2: Multiple Tasks and RTOS APIs

First off, apologies for the delay in this follow up to the previous post Task Synchronisation, it has been a mad couple off weeks with a combination of vacation and work.
In the previous post I looked at the foundation of task synchronization demonstrating there are  range of synchronisation models (bilateral/unilateral, persistent/non-persistent, etc.). In this post I shall look at multi-task synchronisation and then investigate specific RTOS APIs.
MULTIPLE TASKS WAITING
So far we have only dealt with the simple case of two task synchronisation. We now address the case where multiple tasks are waiting at the synchronisation point:

We now have two tasks, Task2 and Task3 blocked waiting on the SO. Task1 now reaches the synchronisation point and signals the event. Again we could have the case of bilateral and unilateral sync, though typically Task1 doesn’t block if no other task is waiting (unilateral model).
The first question is: how many of the waiting tasks synchronise? All the waiting tasks, or only the first one? (And how do we define which task is the first one?).
In many systems we need the ability to broadcast a condition to all waiting tasks (defined as one-to-many or 1:M synchronisation). This is often referred to as a “Barrier”. Barriers many be non-persistent (only those waiting at that time get readied; subsequent tasks block), or persistent (all tasks continue past sync point until the synchronisation is removed with a manual reset).
Alternatively, only one waiting task is allowed past the synchronisation point, the rest remaining blocked. With single task synchronisation there must be a policy regarding which task is synchronised. For a real-time system this is normally based on ordering the waiting queue by task priority. However, as matter of choice, most RTOSs will support queuing based on arrival order (FIFO).
MULTIPLE CONDITION SYNCHRONISATION
In the design of real-time systems it is common for a task to synchronise on a number of different conditions. This many involve a conjunction (AND) or disjunction (OR) of events. For example, a motor fault condition may be specified as:
  • low oil pressure AND > 30sec after start-up
Or a gas system nanoautomate alarm condition as:
  • Isobutane present in outlet line OR
  • Isobutane present in front flush OR
  • Isobutane present in rear flush
RTOS SYNCHRONISATION OBJECTS
Given the array of synchronisation requirements and options how do modern RTOS support synchronisation? The majority of RTOSs support the following:
  • Event Flags / Event Flag Groups
  • Semaphores as Signals
The Semaphore as a Signal is not to be confused with C standard library signals or UNIX signals. A signal is, according to the UNIX terminology, a notification sent to a process that will interrupt that process. In order to catch a signal a process must have a signal handler. This is similar to the behaviour of a software interrupts.

EVENT FLAGS
Typically an Event Flag is implemented as a Manual-reset, Persistent, Unilateral synchronisation object. It is by far the simplest idea and mechanism, and is best suited to one-to-many (1:M) or many-to-one (1:M) synchronisation. An API may consist of the following calls:

  • Set – sets the flag, readying any waiting tasks; can be called from an ISR
  • Clear – clears the flag, if the flag is cleared then any arriving task is blocked
  • Wait – non-consuming pend on a flag
However, event flags are normally bound together to form Event Flag Groups (EFG). Having support for groups of flags allows task the wait either conjunctively or disjunctively . A typical implementation will bind an event flag group to a word in memory (e.g. 32 bits = 32 flags). To support groups, the API is extended to include a further argument, and the wait includes the specification of conjunctive or disjunctive synchronisation.
  • Set(group, bitmask)
  • Clear(group, bitmask)
  • Wait(group, bitmask, AND_OR, timeout)

RTOSs differ in the implementation of Event Flag Groups, with some only supporting M:1 synchronisation and not supporting 1:M or M:M synchronisation. In this case each event flag group is bound to a specific task (i.e. EFGs cannot stand as independent resources), altering the API to:

  • Set(task_id, bitmask)
  • Clear(task_id, bitmask)
  • Wait( bitmask, AND_OR, timeout, &events;)
ISSUES WITH EVENT FLAGS

A surprising number of RTOSs do not support the concept of Event Flags, thus no support for any form of disjunctive or conjunctive synchronisation. The usual argument for not supporting them is that it can be difficult (if not impossible) to make timing deterministic (especially disjunction). Timing typically takes an O(N) form where N is the number of waiting tasks.

In addition, I am not aware of any that support event flag groups being able to do combination logic; e.g. A or B or (C and D).

Some examples of event flag support from commercial RTOSs are:

  • >VxWorks

    • 32 events in an event field
    • Each task has its own events field that can be filled by having tasks and/or ISRs sending events to the task.
  • ThreadX
    • 32 event flags in a group
    • Each event flag group is a public resource
  • Nucleus PLUS
    • 32 event flags in a group
    • Each event flag group is a public resource
  • uC/OS-III
    • 8, 16 or 32 event flags in a group (compile time configured)
    • Each event flag group is a public resource
SEMAPHORE AS SIGNALS
The generic concept of a signal is for synchronisation between two tasks, with a simple API of:
  • signal
  • wait – timeout option
However, the term ‘signal’ is now more commonly used in the context of C and UNIX programming, where it refers to an asynchronous communication mechanism.
Most RTOSs, then, do not support the concept of a signal directly, instead directing the programmer to use the semaphore as a synchronisation object (the Semaphore as a Signal pattern). When using a semaphore as a signal, the SO takes the form of an Auto-reset Persistent Unilateral synchronisation object.
The semaphore was originally designed for support mutual exclusion in multi-tasking systems and pretty much all modern RTOSs support the semaphore.This is thoroughly covered in a previous posting (Mutex vs. Semaphore).
The fact that semaphores do not check the releaser of the semaphore was actually the taker allows the semaphore to be used for unidirectional synchronization. This works by creating the semaphore with a count of zero. When the semaphore count is zero, any task waiting (‘pending’ in our example) will block. When a signal is sent (SemPost), the count is incremented by one. If a task is waiting, it will consume the count token and continue (decrementing the count back to zero). If there is more than one task waiting, then only the first will be signalled – the others remaining blocked (either by priority or FIFO ordering).
Note that the Pend and Post calls are not used as a pair in the same task. In the example, assuming Thread1 calls the OSSemPend function it will block. When Thread2 later calls the OSSemPost function then the unilateral synchronization takes place and both task are ready to run (with the higher priority task actually running).

The Semaphore as a Signal pattern is regularly used to synchronise tasks with ISRs triggered by external events.  This mechanism is favoured since the ISR will never block when posting to the semaphore (thus avoiding the potential to ‘stall’ the system).

An RTOS may support counting semaphores (where the count increments each time it is signalled) and/or binary semaphores (where the count is either zero or one – signalled or unsignalled).  The choice of semaphore can have important implications for the behaviour of the application.

If we have sporadic interrupts, then the ISR may signal the semaphore multiple times before the task waits on it. Does the application need to know only that the event has occurred, or does it need to know the actual number of times an event has occurred? Either is valid, depending on the system requirements. Note that most RTOSs use the counting semaphore as a signal and thus will count the number of events.

Unfortunately using the semaphore as synchronization primitive can be problematic in that it makes debugging harder and increase the potential to miss “accidental release” type problems, as an OSSemPost on its own (i.e. not paired with a OSSemPend) is considered legal code.

As an example, VxWorks does not support signals, but does support both the binary semaphore and the counting semaphore. Either can be used for synchronisation if created EMPTY (0). The following calls can be used by tasks to use the semaphore as a synchronisation object.

  • semGive()
    • Binary – giving a “full” semaphore has no effect
    • Counting – increments count by one
  • semTake()
    • will block if count is zero
  • semFlush()
    • all waiting tasks are unblocked; semaphore left at zero
The semFlush is interesting, in that it allows all waiting tasks past the synchronisation point, thus supporting a 1:M barrier synchronisation.
BILATERAL SYNCHRONIZATION
As already stated, one limitation of bilateral synchronisation (aka the Rendezvous) is that it cannot be used for ISR to task synchronisation . Because of this, bilateral synchronisation is rarely supported by commercial RTOSs. Notable, though, is the ITRON Project, which creates standards for real-time operating systems used in embedded systems (µITRON4.0 Specification). Like the Ada programming language it supports the concept of the Rendezvous for bilateral synchronisation (it actually uses the term Rendezvous Ports).
For those RTOSs that don’t support Bilateral Synchronization, this can be simulated/implemented using a pair of semaphores.

Irrespective of which task reaches the synchronisation point first it will have to wait until the other task to arrives.
MUTUAL EXCLUSION AND SYNCHRONISATION
So far we have considered general synchronisation between two or more tasks. There is one further synchronisation use case we need to examine. Take, for example, the C code for a simple stack shown below. These routines use the variable count to keep track of the stack size. If the stack is empty then pop returns STACK_EMPTY and the caller must check for and take appropriate error handling actions.

Suppose that we do not want to return STACK_EMPTY, but want to wait (synchronise) for the stack to contain data. Since the waiting task owns the mutex no other task will be able to enter the critical region and push an element onto the stack. Thus the waiting task could never be signalled the stack is no longer empty – a deadlock.

A Condition Variable is used in conjunction with mutexes to implement synchronisation for certain conditions to become true (e.g. the stack becoming not empty). A condition variable (also called a condition object), therefore is a unilateral, auto-reset, non-counting, synchronisation object.

SUMMARY
When designing software for embedded systems that is relying on a particular Real-Time Operating System, it is essential to understand that the behaviour of synchronisation primitives differ from RTOS to RTOS. Many questions need answering such as:

  • Does the RTOS support only unilateral synchronisation, or does it include primitives for bilateral synchronisation?
  • If multiple tasks are waiting then how many are readied?
    • If only one, then how is it selected (priority / FIFO)?
  • Is the signal persistent or non-persistent?
  • Is the synchronisation object manual-reset or auto-reset?
  • Does the RTOS support multiple event Conjunction and/or Disjunction?
Finally, where mutual exclusion is required (as in most systems) does the RTOS support the concept of condition objects?
In summary, note that many RTOSs are very weak in terms of supporting different types of synchronisation, most relying on using the counting semaphore as a general synchronisation object.

Posted on November 16th, 2009
» Feed to this thread
» Trackback

1 Comment a “Task Synchronisation – Part 2: Multiple Tasks and RTOS APIs”

  1. Pete says:

    Barriers are not commonly defined as you have stated here. They are usually a generalisation of a rendezvous (and so bilateral) — all (involved) tasks will wait when they reach a barrier. When the last task reaches the barrier, all are released. This is *not* what semFlush does!

    Also, since the last task to reach the barrier does the signalling, and this can be any of the tasks, it is effectively a many-to-many synchronisation.

    I'm glad you added the note about bilateral rendezvous with two semaphores, but it is better to signal (post) first and then wait (pend) in *both* tasks. Mixing the order as you have will result in scenarios with one more context switch than is necessary. It's just an efficiency thing.

    I'm glad to see you mention condition variables, but you didn't say anything about blocking/non-blocking varieties. That specifies whether the 'signal' operation will block or not ('wait' can always block).

Leave a Reply