Mutex vs. Semaphores – Part 2: The Mutex

In Part 1 of this series we looked at the history of the binary and counting semaphore, and then went on to discuss some of the associated problem areas. In this posting I aim to show how a different RTOS construct, the mutex, may overcome some, if not all, of these weaknesses.

To address the problems associated with the semaphore, a new concept was developed during the late 1980s. I have struggled to find its first clear definition, but the major use of the term mutex (another neologism, based around MUTual EXclusion) appears to have been driven through the development of the common programming specification for UNIX-based systems. In 1990 this was formalised by the IEEE as IEEE Std 1003.1, commonly known as POSIX.

The mutex is similar in principle to the binary semaphore, with one significant difference: the principle of ownership. Ownership is the simple concept that when a task locks (acquires) a mutex, only it can unlock (release) it. If a task tries to unlock a mutex it hasn’t locked (and thus doesn’t own), an error condition is encountered and, most importantly, the mutex is not unlocked. If the mutual exclusion object doesn’t have ownership then, irrespective of what it is called, it is not a mutex.

The concept of ownership enables mutex implementations to address the problems discussed in part 1:

  1. Accidental release
  2. Recursive deadlock
  3. Task-Death deadlock
  4. Priority inversion
  5. Semaphore as a signal


Accidental Release
As already stated, ownership stops accidental release of a mutex: a check is made on release, and an error is raised if the current task is not the owner.

Recursive Deadlock
Due to ownership, a mutex can support relocking by the owning task, as long as that task releases it the same number of times.

Priority Inversion
With ownership this problem can be addressed using one of the following priority inheritance protocols:

  • [Basic] Priority Inheritance Protocol
  • Priority Ceiling Protocol

The Basic Priority Inheritance Protocol enables a low-priority task to inherit a higher-priority task’s priority if this higher-priority task becomes blocked waiting on a mutex currently owned by the low-priority task. The low-priority task can now run and unlock the mutex; at this point it is returned to its original priority.
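The bookkeeping described above can be modelled in a few lines. This is a toy model rather than any RTOS's implementation: the types and functions (struct task, struct pi_mutex, pi_lock, pi_unlock) are hypothetical, there is no scheduler or wait queue, and it assumes that a larger number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types, not an RTOS API. Larger number = higher priority. */
struct task {
    int base_priority;     /* priority assigned at creation */
    int active_priority;   /* priority currently used for scheduling */
};

struct pi_mutex {
    struct task *owner;    /* NULL when unlocked */
};

/* Returns 1 if the caller got the mutex, 0 if it would have to block. */
int pi_lock(struct pi_mutex *m, struct task *caller)
{
    if (m->owner == NULL) {
        m->owner = caller;
        return 1;
    }
    /* Caller must wait: the owner inherits the caller's priority if higher. */
    if (caller->active_priority > m->owner->active_priority)
        m->owner->active_priority = caller->active_priority;
    return 0;
}

/* Returns 0 on success, -1 if the caller is not the owner. */
int pi_unlock(struct pi_mutex *m, struct task *caller)
{
    if (m->owner != caller)
        return -1;
    caller->active_priority = caller->base_priority;  /* drop inherited priority */
    m->owner = NULL;
    return 0;
}

/* Walks through the scenario from the text. */
void pi_demo(void)
{
    struct task low = {1, 1}, high = {10, 10};
    struct pi_mutex m = {NULL};

    assert(pi_lock(&m, &low) == 1);     /* low-priority task owns the mutex */
    assert(pi_lock(&m, &high) == 0);    /* high-priority task blocks */
    assert(low.active_priority == 10);  /* low now runs at high's priority */
    assert(pi_unlock(&m, &low) == 0);
    assert(low.active_priority == 1);   /* back to its original priority */
}
```

A real kernel would also have to hand the mutex to the waiting task and deal with a task holding several mutexes, both of which this sketch leaves out.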

The details of the Priority Inheritance Protocol and Priority Ceiling Protocol (a slight variant) will be covered in part 3 of this series.

Death Detection
If a task terminates for any reason, the RTOS can detect if that task currently owns a mutex and signal waiting tasks of this condition. In terms of what happens to the waiting tasks, there are various models, but two dominate:

  • All tasks readied with error condition;
  • Only one task readied; this task is responsible for ensuring integrity of critical region.

When all tasks are readied, these tasks must then assume the critical region is in an undefined state. In this model no task currently has ownership of the mutex. The mutex is in an undefined state (and cannot be locked) and must be reinitialized.

When only one task is readied, ownership of the mutex is passed from the terminated task to the readied task. This task is now responsible for ensuring the integrity of the critical region, and can unlock the mutex as normal.

Mutual Exclusion / Synchronisation
Because of ownership, lock and unlock must be paired within the same task, so a mutex cannot be used for synchronization between tasks. This makes the code cleaner by not confusing the issues of mutual exclusion with synchronization.

Caveat
A specific operating system’s mutex implementation may or may not support the following:

  • Recursion
  • Priority Inheritance
  • Death Detection

Review of some APIs
It should be noted that many Real-Time Operating Systems (or more correctly Real-Time Kernels) do not support the concept of the mutex, only supporting the Counting Semaphore (e.g. MicroC/OS-II). [CORRECTION: The later versions of uC/OS-II do support the mutex; only the original version did not.]

In this section we shall briefly examine three different implementations. I have chosen these as they represent the broad spectrum of APIs offered (Footnote 1):

  • VxWorks Version 5.4
  • POSIX Threads (pThreads) – IEEE Std 1003.1, 2004 Edition
  • Microsoft Windows Win32 – Not .NET

VxWorks from Wind River Systems is among the leading commercial Real-Time Operating Systems used in embedded systems today. POSIX Threads is a widely supported standard, and has become more widely used due to the growth of Embedded Linux. Finally, Microsoft Windows’ common programming API, Win32, is examined. Windows CE, targeted at embedded development, supports this API.

However, before addressing the APIs in detail we need to introduce the concept of a Release Order Policy. In Dijkstra’s original work the concept of task priorities was not part of the problem domain. Therefore it was assumed that if more than one task was waiting on a held semaphore, when released the next task to acquire the semaphore would be chosen on a First-Come-First-Served (First-In-First-Out; FIFO) policy. However, once tasks have priorities, the policy may be:

  • FIFO            – waiting tasks ordered by arrival time
  • Priority        – waiting tasks ordered by priority
  • Undefined    – implementation doesn’t specify
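The difference between the first two policies can be sketched as a selection function over the waiting tasks. This is a hypothetical model, not taken from any of the APIs below: the names (release_policy, struct waiter, pick_next_owner) are invented, and it assumes that a larger number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

enum release_policy { RELEASE_FIFO, RELEASE_PRIORITY };

struct waiter {
    int task_id;
    int priority;   /* larger number = higher priority (an assumption) */
};

/* waiters[] is held in arrival order.
   Returns the index of the task to wake next, or -1 if nobody is waiting. */
int pick_next_owner(const struct waiter *waiters, size_t count,
                    enum release_policy policy)
{
    size_t i, best = 0;

    assert(waiters != NULL || count == 0);
    if (count == 0)
        return -1;
    if (policy == RELEASE_FIFO)
        return 0;                   /* the longest-waiting task */

    for (i = 1; i < count; i++) {
        /* Strict '>' keeps arrival order among tasks of equal priority. */
        if (waiters[i].priority > waiters[best].priority)
            best = i;
    }
    return (int)best;
}
```

With waiters of priority 5, 9 and 9 (in arrival order), the FIFO policy picks the first and the priority policy picks the second. An "undefined" policy simply means application code cannot rely on either answer.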

VxWorks v5.4
VxWorks supports the Binary Semaphore, the Counting Semaphore and the Mutex (called the Mutual-Exclusion Semaphore in VxWorks terminology). They all support a common API for acquiring (semTake) and releasing (semGive) the particular semaphore. For all semaphore types, waiting tasks can be queued by priority or FIFO and can have a timeout specified.

The Binary Semaphore has, as expected, no support for recursion or inheritance, and the taker and giver do not have to be the same task. Some additional points of interest: releasing the semaphore again has no effect; it can be used as a signal (and thus can be created empty); and it supports the idea of a broadcast release (waking all waiting tasks rather than just the first). The Counting Semaphore, as expected, is the same as the Binary Semaphore with the ability to define an initial count.

The Mutual-Exclusion Semaphore is the VxWorks mutex. Only the owning task may successfully call semGive. The VxWorks mutex also has the ability to support both priority inheritance (basic priority inheritance protocol) and deletion safety.

POSIX
POSIX is an acronym for Portable Operating System Interface (the X has no meaning). The current POSIX standard is formally defined by IEEE Std 1003.1, 2004 Edition. The mutex is part of the core POSIX Threads (pThreads) specification (historically referred to as IEEE Std 1003.1c-1995).
POSIX also supports both semaphores and priority-inheritance mutexes as part of what are called Feature Groups. Support for these Feature Groups is optional, but when an implementation claims that a feature is provided, all of its constituent parts must be provided and must comply with this specification. There are two main Feature Groups of interest: the Realtime Group and the Realtime Threads Group.

The semaphore is not part of the core standard but is supported as part of the Realtime Feature Group. The Realtime Semaphore is an implementation of the Counting semaphore.

The default POSIX mutex is non-recursive and has neither priority inheritance support nor death detection.
However, the Pthreads standard allows for non-portable extensions (as long as they are tagged with “_np”). A high proportion of programmers using POSIX threads are programming for Linux. Linux supports four different mutex types through non-portable extensions:

  • Fast mutex                  – non-recursive and will deadlock [default]
  • Error checking mutex – non-recursive but will report error
  • Recursive mutex        – as the name implies
  • Adaptive mutex         – extra fast for multi-processor systems

These are extremely well covered by Chris Simmonds in his posting Mutex mutandis: understanding mutex types and attributes.

Finally the Realtime Threads Feature Group adds mutex support for both priority inheritance and priority ceiling protocols.

Win32 API
Microsoft Windows’ common API is referred to as Win32. This API supports three different primitives:

  • Semaphore            – The counting semaphore
  • Critical Section     – Mutex between threads in the same process; Recursive, no timeout, queuing order undefined
  • Mutex                    – As per critical sections, but can be used by threads in different processes; Recursive, timeout, queuing order undefined

The XP/Win32 mutex API does not support priority inheritance in application code; the WinCE/Win32 API, however, does!

Win32 mutexes do have built-in death detection; if a thread terminates while holding a mutex, then that mutex is said to be abandoned. The mutex is released and a waiting thread takes ownership, with its wait returning the WAIT_ABANDONED code. Note that Critical Sections do not have any form of death detection.

Critical Sections have no timeout ability, whereas mutexes do. However, Critical Sections support a separate non-blocking call, TryEnterCriticalSection. A major weakness of the Win32 API is that the queuing model is undefined (i.e. neither Priority nor FIFO). According to Microsoft this is done to improve performance.

So, what can we gather from this? First and foremost, the term mutex is less well defined than the semaphore. Secondly, the actual implementations vary massively from RTOS to RTOS. I urge you to go back and look at your favourite RTOS and work out what support, if any, you have for the mutex. I’d love to hear from people regarding mutual exclusion support (both semaphores and mutexes) for their RTOS of choice. If you’d like to contact me do so at nsc(at)acm.org.

Finally, Part 3 will look at a couple of problems the mutex doesn’t solve, and how these can be overcome. As part of that it will review the Basic Priority Inheritance Protocol and the Priority Ceiling Protocol.

At a later date I will also address the use of, and problems associated with, the semaphore being used for task synchronisation.

ENDNOTES

  1. Please, I do not want to get into the “that’s not a real-time OS” debate here – let’s save that for another day!
  2. A number of people pointed out that Michael Barr (former editor of Embedded Systems Programming, now president of Netrino) has a good article about the differences between mutexes & semaphores at the following location: http://www.netrino.com/node/202. I urge you to read his posting as well.
  3. Apologies about not having the atom feed sorted – this should all be working now

Posted on September 11th, 2009

16 Comments on “Mutex vs. Semaphores – Part 2: The Mutex”

  1. Michael Barr says:

    Contrary to your parenthetical, Micrium's MicroC/OS-II has a Mutex object that meets all of the requirements you laid out–and knows you can't try for a mutex in an ISR. It is one of the best implementations in the industry, IMHO.

  2. Robert Berger says:

    Hi,
    I can only fully agree with Michael. MicroC/OS-II has a Mutex implementation which can be found in OS_MUTEX.C.

    I find it a very clever and simple way of implementing priority inversion prevention on top of the bitmap scheduler with fixed priorities.

    Regards,

    Robert


    Robert Berger
    Embedded Software Specialist

    Reliable Embedded Systems
    Consulting Training Engineering
    Tel.: (+30) 697 593 3428
    Fax.:(+30) 210 684 7881
    URL: http://www.reliableembeddedsystems.com

  3. Bradley Smith says:

    Very interesting read. I was shocked at the amount of incorrect information being presented by folk with regard to this topic. I look forward to your final chapter.

    Brad

  4. Randell J says:

    Back In The Day…

    Semaphores in the Amiga Exec (which was modeled on Xinu) were by your/modern terminology Mutexes with support for recursion. You can think of them (and recursive Mutexes in general) as Counting Semaphores with the counts going in the opposite direction: acquiring a "Semaphore" increases the count (if unowned (count == 0) or if you already own it).

    On top of that, we implemented "Read/Write Semaphores", where any number of people could acquire the "semaphore" for reading protected data, but only one could for writing to it. This was effectively the equivalent of a "read-permission" Mutex and a Write-permission mutex (though higher performance than a decomposed implementation).

    Roughly (pseudo-code):
    acquire(read):
        if (I own a lock) // if recursion is allowed
            count++ // note: others may also own it
        else if (writer_count != 0)
            add self to queue
        else
            count++ // increase count of (read) owners

    acquire(write):
        if (I own a lock)
            count++
            writer_count++
        else if (count != 0)
            add self to queue
            writer_count++
        else
            count++
            writer_count++ // makes sure readers won't read while we're writing

    The operations for release are obvious (make sure writer_count is dealt with correctly). It can be tricky to implement recursion with this; you can't just store a handle of the 'owner' since you have multiple owners.

    You can extend this with queuing mods, like "a write request jumps ahead of all read requests" (note that you can only have a queue if a writer either owns the resource or is in the queue also, waiting for all the readers to release). I think we had that as an option (and I think I used an internal implementation of that in the filesystem code, which was all coded as coroutines – now there's a paradigm that virtually disappeared, at least formally).

  5. Sticky Bits » Blog Archives » Polymorphism in C++ says:

    [...] example, if we are using the  uC/OS-II RTOS and have developed a Mutex class, [...]

  6. Mutex Vs Semaphore « Roshan Singh says:

    [...] difference: 1. http://blog.feabhas.com/2009/09/mutex-vs-semaphores-%e2%80%93-part-1-semaphores/ 2. http://blog.feabhas.com/2009/09/mutex-vs-semaphores-%e2%80%93-part-2-the-mutex/ 3. [...]

  7. craniumonempty says:

    Still reading, but got caught on:
    “POSIX is an acronym for Portable Operating System Interface (the X has no meaning).”

    I always thought it meant Unix or something and had to track it down:

    http://standards.ieee.org/news/2010/posix.html

    Which says “The IEEE has reaffirmed two standards covering the Portable Operating System Interface for Unix, better known as POSIX®.”

    I know they drop off “for Unix” part, but that was the original meaning of the X as far as I can tell.

  8. Niall Cooling says:

    Thanks for digging this out. Interesting that the IEEE are using the “Unix” reference again, but at least it makes some sense.

  9. Sushma says:

    This is a good explanation of mutex and semaphores. Thanks.

  10. Sergey Oboguev says:

    Just as a purely historical footnote, mutexes (both the term and the primitive) were used in VAX/VMS kernel starting from the original version, as a basic synchronization primitive, so they date back at least to 1975 when VMS development started, and most likely even earlier.

  11. Niall Cooling says:

    @Sergey – being a user of VMS back in the day (early ’80s for me) I never came across this. Do you have any references I could refer to? Thanks.

  12. Sergey Oboguev says:

    The term was not used at user level until PPL (see below), but mutex primitive is used very extensively in VMS kernel. See “VMS Internals and Data Structures”, any edition (still easily findable at amazon) or VMS documentation on driver writing (there are PDFs easily findable online) and/or System Dump Analyzer documentation or, better yet, kernel sources/listings (a little harder to get, but obtainable for motivated people ;-)).

    The relevant functions are SCH$LOCKR, SCH$LOCKW, SCH$UNLOCK, SCH$LOCKRNOWAIT, SCH$LOCKWNOWAIT, SCH$LOCKWEXEC, SCH$LOCKREXEC located in module [SYS.SRC]MUTEX.MAR. Many mutex objects throughout the kernel also bear a word “mutex” in their name, for example I/O database mutex IOC$GL_MUTEX and so on, there are dozens of various mutexes in the system.

    In terms of functionality VMS kernel mutexes are actually closer to modern RWLocks since they allow both exclusive locking (“writer lock”) and shared locking (“reader lock”). They also elevate process IPL to ASTDEL (2), to prevent process deletion while a mutex is held — the process can still be preempted, but control flow cannot be interrupted by ASTs and, in particular, cannot be deleted while any kernel mutex is held.

    At user level, principal VMS mechanism for exclusion were ENQ/DEQ locks, which had a richer semantics than most existing locking primitives, and in addition were cluster-wide, but of course had greater overhead than grandma’s futex.

    Besides VMS kernel mutexes, mutexes (of different design than kernel ones) were also used in user space, internally in run-time libraries for ADA and at some point CRTL, and eventually were released for developers use as a part of Parallel Processing Library which, according to the headers in the sources, dates back to 1986.

  13. Sergey Oboguev says:

    On a somewhat related curious matter, one of the big “discoveries” in the field of synchronization was 1990 discovery that “test-and-set” hogs the interconnect bus with intense traffic and that “test-and-test-and-set” performs much better, since it spins on the cached value and does not issue interlocked instruction until the cache coherence indicates the lock may be available.

    The most quoted article on the issue nowadays is Anderson’s 1990 article “The Performance of Spin Lock Alternatives for Shared—Memory Multiprocessors”, though it was actually preceded by other articles on the matter published in 1990.

    I am sure there are thousands of references to Anderson’s article, indeed google alone finds 16600 references on the web and 800 references in google books. Web also contains 236.000 references to “test and test and set” idea allegedly pioneered by the Anderson’s article.

    Yet, VMS SMP spinlock code (module [SYS.SRC]SPINLOCK.MAR) written 5 years earlier, in 1985, does just that — spins on local cached value, and performs interlocked operation only after the cache is updated, with the code section prepended by the comment: “The busy wait loop below assumes that a cache is present and that cache coherency is maintained on all available processors. Note that if the cache is not present and working, the busy wait is likely to impact the system performance overall by making many references to memory.”

    I would expect many other examples like this could be found, and thus invention and use of mutex in the industry many years before it made a splash in academic publications is by no means anything too unique.

  14. Sergey Oboguev says:

    Back to the origins of the term “mutex”, lookup in google books suggests that the term originated in 1967, originally as a designation for binary semaphore (indeed, the very first use findable in google books appears to be as a name of semaphore variable) and then gradually acquiring a separate meaning.

    The word was introduced apparently not without some resistance to the reader’s ears. One reviewer of J.L. Jolley’s “Data Study” (published in 1968) writing in “New Scientist” (1968) complains:

    “The basic idea is a simple one and many will be irritated by the author’s use of such words as mutex, ambisubterm, homeostasis and idempotency to provide names for many quite ordinary phenomena”.

  15. Multithreading, mutex, semaphore | Agnihotri says:

    [...] http://blog.feabhas.com/2009/09/mutex-vs-semaphores-%e2%80%93-part-1-semaphores/ 2. http://blog.feabhas.com/2009/09/mutex-vs-semaphores-%e2%80%93-part-2-the-mutex/ 3. [...]

  16. mutex vs semaphore | technoless says:

    […] – http://blog.feabhas.com/2009/09/mutex-vs-semaphores-%e2%80%93-part-1-semaphores/ – http://blog.feabhas.com/2009/09/mutex-vs-semaphores-%e2%80%93-part-2-the-mutex/ – […]
