This time I want to look at a seemingly trivial concept in C++ programming: accessing class members, either directly or via a pointer. More than anything it’s an excuse to talk about two of C++’s more obscure operators – .* and ->*
The member access operator
Anyone who’s ever programmed in C++ knows that you access the members of a class-type object using the member access (.) operator.
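As a minimal sketch of what that looks like, here is a hypothetical class `ADT` with one data member and one member function (the names `attr`, `func` and `basic_access` are my own, chosen purely for illustration):

```cpp
#include <cassert>

// Hypothetical class used purely for illustration.
class ADT {
public:
    int attr { 0 };                         // data member (attribute)
    int func() const { return attr * 2; }   // member function
};

// Reads and writes members of an ADT object with the . operator.
int basic_access()
{
    ADT adt { };
    adt.attr = 21;              // member access yields the int member of adt
    assert(adt.attr == 21);
    return adt.func();          // member access first, then the function call
}
```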
This is a concept you’ve almost certainly taken for granted up to this point but let’s explore what’s going on in more detail.
The member access operator is a binary operator which acts as an expression – that is, yields an object that has a type and a value category. (For the purposes of this article I’m going to largely ignore the vagaries of the C++ value category system; feel free to post all your favourite special cases in the comments section)
The left-hand operand of the member access operator is an object of class, structure or union type (and yes, you can have member functions on unions. Go ahead and try. I’ll wait). More strictly, the left-hand operand must be an expression of complete class type. In simple terms this means the compiler has to be able to see the class definition, not just a forward declaration; but more on this later. The left-hand operand expression is always evaluated, even if its value is not needed (for example, when the right-hand operand names a static member).
The right-hand operand of the member access operator must be the name of a data member (attribute) or member function of the left-hand operand’s class.
The result of the member access operator expression depends on the types of the operands. The complete set of rules is complex (not unexpectedly, this being C++) but can be summarised for our purposes as follows:
If the right-hand operand specifies a data member, then the expression yields an object. The value category of the expression is based on the value category of the left-hand operand expression. As a rule-of-thumb, glvalue left-hand operand expressions will result in a glvalue; rvalue expressions will result in an rvalue. The cv-qualifiers of the result will be the union of the left-hand and right-hand expression qualifiers.
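These rules can be checked at compile time. The sketch below relies on the fact that `decltype((expr))` gives `T&` for an lvalue and `T&&` for an xvalue; the `ADT` struct and object names are assumptions of mine, not part of any library.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct ADT {
    int attr { 0 };
};

ADT       adt   { };
const ADT c_adt { };

// An lvalue left-hand operand gives an lvalue result.
static_assert(std::is_same<decltype((adt.attr)), int&>::value,
              "lvalue operand, lvalue result");

// The const on the object is added to the (non-const) member.
static_assert(std::is_same<decltype((c_adt.attr)), const int&>::value,
              "qualifiers of the object carry over to the member");

// An rvalue left-hand operand gives an rvalue (here, an xvalue) result.
static_assert(std::is_same<decltype((std::move(adt).attr)), int&&>::value,
              "rvalue operand, rvalue result");

int read_const_member()
{
    // c_adt.attr is a const int: it can be read but not assigned to.
    assert(c_adt.attr == 0);
    return c_adt.attr;
}
```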
If the right-hand operand specifies a member function, then the expression designates that function. The resultant value is a special case of rvalue: it can only be used as the left-hand operand of a function call operator. In a call such as adt.func(), the function call is actually a separate expression and not part of the member access operator. The function call operator is evaluated after the member access expression.
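Returning to the earlier point that the left-hand operand is always evaluated: one way to see this is with a static member, where the object itself is not needed to find the member. In this sketch `make_adt()`, `calls` and `count` are hypothetical names used only for the demonstration.

```cpp
#include <cassert>

struct ADT {
    static int count;    // static member: shared by all ADT objects
};
int ADT::count = 5;

int calls = 0;           // records how many times make_adt() runs

ADT make_adt()
{
    ++calls;
    return ADT { };
}

int evaluate_left_operand()
{
    // Only ADT::count is read, yet make_adt() is still called.
    int result = make_adt().count;
    assert(calls > 0);
    return result;
}
```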
Indirect member access
If we have indirect access to an object – via a reference or pointer – the above rules remain the same, with a few notable extensions.
A reference, as an alias for an (already-defined) object, can be used as the left-hand operand of the member access operator; the ‘reference expression’ (for lack of a better term) will yield the referent object.
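A short sketch of access through a reference, again using a hypothetical `ADT` type:

```cpp
#include <cassert>

struct ADT {
    int attr { 0 };
};

int access_via_reference()
{
    ADT  adt { };
    ADT& ref = adt;          // ref is an alias for adt
    ref.attr = 7;            // same syntax as direct access
    assert(&ref.attr == &adt.attr);   // both name the same member object
    return adt.attr;
}
```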
With pointers we have to perform an extra operation to get an object:
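Something along these lines (the type and function names are illustrative only):

```cpp
#include <cassert>

struct ADT {
    int attr { 0 };
};

int access_via_pointer()
{
    ADT  adt { };
    ADT* ptr = &adt;

    (*ptr).attr = 9;         // dereference first, then member access
    // *ptr.attr = 9;        // parsed as *(ptr.attr); does not compile

    assert(adt.attr == 9);
    return adt.attr;
}
```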
Notice we now have to dereference the pointer to get the object. The extra parentheses are the result of C++’s precedence rules; the member access operator is higher precedence than the indirection (dereference) operator.
Since the resulting syntax is somewhat clunky C++ sweetens it with some syntactic sugar – the (indirect) member access operator, ->
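For a raw pointer the two spellings are equivalent, as this small sketch (with assumed names) suggests:

```cpp
#include <cassert>

struct ADT {
    int attr { 0 };
};

int access_via_arrow()
{
    ADT  adt { };
    ADT* ptr = &adt;

    ptr->attr = 11;                  // same as (*ptr).attr = 11
    assert((*ptr).attr == ptr->attr);
    return adt.attr;
}
```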
The (indirect) member access operator can be overloaded for a class (in fact, it’s the only member access operator that can be overloaded). The built-in operator works on a pointer-to-type, which is then dereferenced (by the compiler) to yield an object. An overloaded -> operator must be a non-static member function taking no parameters; to be useful it should return either a raw pointer or an object of another class that itself overloads ->. The compiler will repeatedly invoke the -> operator until it yields a pointer-to-type.
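Here is a minimal sketch of that chaining. `SmartPointer` and `SmarterPointer` are hypothetical, non-owning wrappers written only to show the mechanism; they are not meant as a model for a real smart pointer.

```cpp
#include <cassert>

struct ADT {
    int attr { 0 };
};

// Hypothetical wrapper: operator-> returns a raw pointer.
template <typename T>
class SmartPointer {
public:
    explicit SmartPointer(T* p) : raw { p } { }
    T* operator->() const { return raw; }
private:
    T* raw;
};

// Hypothetical wrapper: operator-> returns another class type,
// so the compiler applies -> to the result again.
template <typename T>
class SmarterPointer {
public:
    explicit SmarterPointer(T* p) : inner { p } { }
    SmartPointer<T> operator->() const { return inner; }
private:
    SmartPointer<T> inner;
};

int chained_arrow()
{
    ADT adt { };
    SmarterPointer<ADT> ptr { &adt };

    ptr->attr = 13;          // two operator-> calls, then a built-in ->
    assert(adt.attr == 13);
    return adt.attr;
}
```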
In the above code the call to ptr->attr calls SmarterPointer<ADT>::operator-> which yields a SmartPointer<ADT>. The compiler then calls SmartPointer<ADT>::operator-> which returns an ADT*. This is then dereferenced and the member access operator evaluated.
Incomplete and complete types
While we’re talking about indirect member access it’s worth returning to a point I made earlier: The left-hand operand of the member access operators must be an expression of complete class type. Why is this significant? Because C++ allows pointers – and references – to incomplete class types. To illustrate this, have a look at the code below
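A sketch of the situation, assuming a hypothetical `Client` class that holds a pointer to a `Server` (the member names are my own):

```cpp
#include <cassert>

class Server;                        // forward declaration: incomplete type

class Client {
public:
    explicit Client(Server& s) : server { &s } { }
    int request();                   // defined once Server is complete
private:
    Server* server;                  // OK: pointer to an incomplete type
    // Defining request() here as { return server->handle(); }
    // would fail to compile: Server is still incomplete at this point.
};

class Server {                       // the class definition
public:
    int handle() { return ++requests; }
private:
    int requests { 0 };
};

// Server is now complete, so member access through the pointer is allowed.
int Client::request() { return server->handle(); }

int use_client()
{
    Server server { };
    Client client { server };
    int first = client.request();
    assert(first == 1);
    return client.request();
}
```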
The forward declaration of Server tells the compiler that Server is a valid type name. This is adequate to be able to declare a pointer or reference to that type. However, to access the members of the object the class definition must be visible. You can reason about it this way: The compiler knows how much memory to allocate for a pointer (or reference); it does not need to know the memory layout of the referent object to do this. To access the object’s members, though, the compiler must know the memory layout of the referent object to calculate offsets from the object’s address; similarly, member function declarations must be visible to be able to validate the call-frame of any member functions called.
The pointer-to-member operator
A pointer, fundamentally, holds an address. The type of a pointer defines how to interpret that address. In the case of a pointer-to-object it defines how much memory the referent occupies, the memory organisation and the allowable behaviours; in the case of a pointer-to-function it defines the call-frame for the function – the parameters (and their types), the return value, etc. As programmers we make use of the fact that a pointer’s address can be bound to different objects (at run-time) to increase the flexibility of our code.
C++ extends this capability with pointers-to-members: pointers that can be bound to any member object (or member function) of a given type within a particular class.
Conceptually, this means this new pointer type has to be thought of in a slightly different way:
- For class attributes a pointer-to-member-object can be thought of as an offset from the address of an actual object.
- For a (non-static) member function, the pointer-to-member-function requires the address of the function and the address of an object (the this pointer)
In both cases we need an object as part of the expression. This demands some new syntax – both in the declaration of pointer objects and when we use them.
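The declaration syntax might look like the following sketch. `ADT`, `attr1`, `attr2` and `p_attr` are illustrative names; `rebind()` exists only to show that the pointer can be re-pointed at another member of the same type.

```cpp
#include <cassert>

struct ADT {
    int attr1 { 1 };
    int attr2 { 2 };
};

// p_attr is a pointer to an int member of ADT.
// No ADT object is involved yet.
int ADT::*p_attr = &ADT::attr1;

bool rebind()
{
    p_attr = &ADT::attr2;            // any int member of ADT will do
    assert(p_attr != &ADT::attr1);
    return p_attr == &ADT::attr2;
}
```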
The pointer-to-member-object, p_attr, is declared as a pointer to an int member of type ADT. That means p_attr can hold the address (offset, if you prefer) of any int attribute in the ADT class. In our example that means it can be initialised to the address of ADT::attr1 or ADT::attr2.
Because a pointer-to-member requires an object to act on we need a new operator – the pointer-to-member operator .* (yes, it’s a single operator). The left-hand operand of the pointer-to-member operator must be an expression of complete class type (as before). The right-hand operand must be an expression of pointer-to-member type. The pointer-to-member type must be compatible with the left-hand operand’s type (for fairly obvious reasons).
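Note that, unlike an ordinary function name (which, as in C, decays to a function pointer), you must use the address-of operator (&) with the qualified member name when forming a pointer-to-member.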
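Putting the operator to work, a sketch (with assumed names) of reading and writing data members through a pointer-to-member-object:

```cpp
#include <cassert>

struct ADT {
    int attr1 { 1 };
    int attr2 { 2 };
};

int write_via_pointer_to_member()
{
    ADT adt { };

    int ADT::*p_attr = &ADT::attr1;  // & and the class qualifier are required
    adt.*p_attr = 10;                // writes adt.attr1

    p_attr = &ADT::attr2;
    adt.*p_attr = 20;                // same expression, now writes adt.attr2

    assert(adt.attr1 == 10);
    assert(adt.attr2 == 20);
    return adt.attr1 + adt.attr2;
}
```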
For member functions we need to follow a similar pattern:
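A sketch of the declaration and the call; the class `ADT` and its `start()`, `stop()` and `is_running()` functions are hypothetical, and only the name `ptr_mem_fn` follows the text:

```cpp
#include <cassert>

class ADT {
public:
    void start() { running = true; }
    void stop()  { running = false; }
    bool is_running() const { return running; }
private:
    bool running { false };
};

bool call_via_pointer_to_member()
{
    ADT adt { };

    // Pointer to a member function of ADT: no parameters, returns void.
    void (ADT::*ptr_mem_fn)() = &ADT::start;

    (adt.*ptr_mem_fn)();     // parentheses needed: () binds tighter than .*
    assert(adt.is_running());

    ptr_mem_fn = &ADT::stop; // rebind to another function, same signature
    (adt.*ptr_mem_fn)();
    return !adt.is_running();
}
```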
Like a function pointer we must specify the type of the pointer in terms of the function signature. In the above example ptr_mem_fn can hold the address of any (non-static) function, on class ADT, that takes no parameters and returns nothing.
As with pointer-to-member-objects we have to supply a left-hand operand that is an expression of complete class type (that is, an object). Once again, because of precedence rules on operator evaluation we must enclose the pointer-to-member expression in parentheses.
The indirect pointer-to-member operator
Since the left-hand operand of the pointer-to-member operator is an expression of complete class type we can use any expression that could yield an appropriate object – including indirect access via a pointer.
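As a sketch, consider a function that receives both pointers. The alias name `ADT_mem_fn` and the members of `ADT` are assumptions made for the example.

```cpp
#include <cassert>

class ADT {
public:
    void increment() { ++value; }
    void reset()     { value = 0; }
    int  get() const { return value; }
private:
    int value { 0 };
};

// Alias for "pointer to ADT member function, no parameters, returns void".
using ADT_mem_fn = void (ADT::*)();

void doStuff(ADT* ptr, ADT_mem_fn fn)
{
    ((*ptr).*fn)();          // dereference the object pointer, then apply .*
}

int run_do_stuff()
{
    ADT adt { };
    doStuff(&adt, &ADT::increment);
    doStuff(&adt, &ADT::increment);
    assert(adt.get() == 2);
    doStuff(&adt, &ADT::reset);
    return adt.get();
}
```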
I’ve added a using alias to help clean up the signature of the doStuff() function.
Now, we have two pointers: a pointer to an object and a pointer to a member function. We must dereference the pointer-to-object for the left-hand operand; and dereference the pointer-to-member-function for the right-hand operand. All the previously-established rules about the type and value category of the resulting expression still hold true.
To “simplify” (I’ll use that term here ironically) the syntax – and for symmetry with the member access operators – C++ has an (indirect) pointer-to-member operator, ->*
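The same kind of function written with ->* might look like this sketch (the `ADT` class and the `ADT_mem_fn` alias are illustrative assumptions):

```cpp
#include <cassert>

class ADT {
public:
    void increment() { ++value; }
    int  get() const { return value; }
private:
    int value { 0 };
};

using ADT_mem_fn = void (ADT::*)();

void doStuff(ADT* ptr, ADT_mem_fn fn)
{
    (ptr->*fn)();            // equivalent to ((*ptr).*fn)()
}

int run_do_stuff_arrow()
{
    ADT adt { };
    doStuff(&adt, &ADT::increment);
    doStuff(&adt, &ADT::increment);
    doStuff(&adt, &ADT::increment);
    assert(adt.get() == 3);
    return adt.get();
}
```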
Note, we still need the parentheses around the ->* operator to maintain the correct operator precedence.
Finally, if you have a pointer-to-member you must always supply an object as the left-hand operand. If an object is calling one of its own member functions indirectly (because… you know, reasons) then you must explicitly use the this keyword.
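A sketch of an object calling one of its own member functions through a stored pointer-to-member; `run()`, `step()` and `action` are hypothetical names:

```cpp
#include <cassert>

class ADT {
public:
    int run()
    {
        action = &ADT::step;
        (this->*action)();   // 'this' must be written explicitly
        (this->*action)();
        // action();         // does not compile: no object supplied
        return count;
    }
private:
    void step() { ++count; }

    void (ADT::*action)() = nullptr;   // pointer to a member function of ADT
    int  count { 0 };
};

int indirect_self_call()
{
    ADT adt;
    int result = adt.run();
    assert(result == 2);
    return result;
}
```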
Summary
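C++ provides four member access operators, dependent on whether the object or the member is accessed directly or indirectly. The syntax of these operators can be summarised as follows: object.member (object direct, member direct); pointer->member (object indirect, member direct); object.*pointer_to_member (object direct, member indirect); and pointer->*pointer_to_member (object indirect, member indirect).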
Glennan is an embedded systems and software engineer with over 20 years experience, mostly in high-integrity systems for the defence and aerospace industry.
He specialises in C++, UML, software modelling, Systems Engineering and process development.