C++20 Coroutine Iterators

In my first blog post about C++20 Coroutines I introduced the concepts behind a synchronous or generator style coroutine and developed a template class to support coroutines for any data type.

In this post I’ll add an iterator to the template to support the range-for loop and iterative algorithms. You may want to review that post before reading this one, but the following code should act as a reminder of how to write and use a coroutine that reads pairs of floating-point values into a data structure for subsequent analysis.

struct DataPoint { 
    float timestamp; 
    float data; 
}; 

Future<DataPoint> read_data(std::istream& in) 
{ 
    std::optional<float> first{}; 
    auto raw_data = read_stream(in); 
    while (auto next = raw_data.next()) { 
        if (first) { 
            co_yield DataPoint{*first, *next}; 
            first = std::nullopt; 
        } 
        else { 
            first = next; 
        } 
    } 
}

static constexpr float threshold{25.0};

int main() 
{ 
    std::cout << std::fixed << std::setprecision(2); 
    std::cout << "Time (ms)   Data" << std::endl; 
    auto values = read_data(std::cin); 
    while (auto n = values.next()) { 
        std::cout << std::setw(8) << n->timestamp 
                  << std::setw(8) << n->data 
                  << (n->data > threshold ? " ***Threshold exceeded***" : "") 
                  << std::endl; 
    } 
    return 0; 
}

The full example of this code is in the files future.h and datapoint_demo.cpp in the accompanying GitHub repo coroutines-blog.

Using Iterative For Loops

To add support for the C++11 range-for loop to the Future template described in the first coroutines blog post (which follows the generator approach used by Python and C#), we need to support iterating over the coroutine's sequence of values.

To differentiate this refactored version of the template from the one in the last blog post, the class has been renamed to Generator, which more accurately describes its purpose.

Firstly, to simplify the examples using the coroutines, we add operator << support for DataPoint objects:

static constexpr float threshold{21.0};

std::ostream& operator<<(std::ostream& out, const std::optional<DataPoint>& dp)
{
    if (dp) {
        // Write to the stream we were given rather than always to std::cout
        out << std::fixed << std::setprecision(2);
        out << std::setw(8) << dp->timestamp
            << std::setw(8) << dp->data
            << (dp->data > threshold ? " ***Threshold exceeded***" : "");
    }
    return out;
}

We now want to add support to the generator so that we can refactor our client code to use the range-for loop:

std::cout << "Time (ms)   Data" << std::endl;
for (auto&& dp: read_data(std::cin)) {
    std::cout << dp << std::endl;
}

This is equivalent to the traditional C++ iteration loop:

auto stream = read_data(std::cin);
std::cout << "Time (ms)   Data" << std::endl;
for (auto it = stream.begin(); it != stream.end(); ++it) {
    std::cout << *it << std::endl;
}

Support for range-for loops requires iterator support to be added to the Generator template. To do this we provide a class that encapsulates the ability to iterate (step through) each value in the coroutine and stop at the end of the sequence. This requires an iterator type that supports:

  • accessing the current value in the sequence via operator*
  • moving the iterator forward one value using operator++
  • checking for the end of the sequence with operator==

In the Generator class we add a begin method to create an iterator object and an end method which returns an object used when testing for the end of the sequence (it is a little more complex than that but we want to keep this discussion short and to the point).

Following the style of the standard library containers, we define our iterator type as a nested class named iterator in the Generator class.

Compared with a traditional container such as std::vector, which stores all values in memory and can randomly access those values, our coroutine can only store the latest value in the sequence. In C++ terms our generator supports a simple input_iterator, for which we need only construct one iterator object (in the Generator::begin method) that points to the underlying Promise object. If you are interested you can read more about iterator styles and concept requirements on this iterators page.

To conform to C++20 iterator concepts our nested iterator class has to define a set of type traits and a default constructor:

class iterator
{
public:
    using value_type = Promise::value_type;
    using iterator_category = std::input_iterator_tag;
    using difference_type =  std::ptrdiff_t;
    
    iterator() = default;
    iterator(Generator& generator) : generator{&generator}
    {}

    // iterator methods below

private:
    Generator* generator{};
};

The value_type identifies the type of object that the iterator accesses; in this case a std::optional<DataPoint>, which we already have as a type trait in our promise class. The other two type traits are required to ensure C++ algorithms generate the correct code for an input iterator.

The iterator requires access to the underlying Generator object in order to retrieve the value from the Promise, so we pass the generator as a constructor argument and store a pointer to it in a private variable.

An iterator is considered to be a pointer to the underlying sequence so we supply an operator* method to retrieve the current iteration value. The recommended approach for an iterator is to return a reference to the original data (or a constant reference for readonly access) but we will simply return the value from the promise: it’s a simpler solution and is sufficient for our purposes.

value_type operator*() const { 
    if (generator) {
        return generator->handle.promise().get_value();
    }
    return {}; 
}

We should also provide operator-> so that members of the value can be reached through the iterator:

value_type operator->() const {
    if (generator) {
        return generator->handle.promise().get_value();
    }
    return {};
}

Next we require the ability to move the iterator forward using operator++:

iterator& operator++() {
    if (generator && generator->handle) {
        generator->handle.resume();
    }
    return *this;
}

iterator& operator++(int) {
    return ++(*this);
}

The iterator concepts require both the prefix and postfix versions, even though we are not going to use the postfix version. The increment operators simply resume the coroutine, which executes code up to the next co_yield statement (or the end of the coroutine). Note that a C++ input iterator does not need to allow access to previous data values, which is why we only need the one iterator object.

We can now write our Generator::begin method to create the iterator and step forward to the first sequence value:

template <typename T>
class Generator
{
    class Promise {...}
    class iterator {...}

    iterator begin() {
        auto it = iterator{*this};
        return ++it;
    }

    // omitted code
};

The last requirement of the iterator is to identify the end of the sequence so that the for loop can terminate as shown in the traditional form of iteration:

for (auto it = stream.begin(); it != stream.end(); ++it)

This is a little more complex with our generator’s input iterator than it is with a container iterator (such as std::vector) because we don’t have all the values stored in memory locations. Instead we have to identify when we have reached the end of the coroutine and test for that condition.

Firstly we add a finished method to the Promise class to simplify testing for the end of the coroutine so that our revised Promise class is:

class Promise
{
public:
    using value_type = std::optional<T>;
    // coroutine lifecycle methods

    value_type get_value() {
       return std::move(value);
    }

    bool finished() {
        return !value.has_value();
    }

private:
    value_type value{};
};

Remember that the template supports movable types so we have to make sure that testing for the end of the coroutine does not read (move) and discard the value in the generator.

Prior to C++17 the range-for loop required begin and end to return the same type of object, which meant we had to provide the following comparison method:

// end of iteration comparison prior to C++17
bool operator== (const iterator&) const {
    return generator ? generator->handle.promise().finished() : true;
}

The comparison indicates end of iteration if the coroutine has finished or if there is no generator object because the default constructor was used.

This works but does not properly capture the concept of an input iterator as the following nonsensical test will return false if the coroutine has not ended:

stream.begin() == stream.begin()

To resolve this problem C++17 relaxed the range-for loop so that begin and end may return different types, which allows a separate sentinel type (class) to mark the end of the iteration loop; C++20 then formalised sentinels in its iterator concepts. For our coroutine we define a data type to act as a marker (sentinel) to indicate that the coroutine has finished:

struct end_iterator {};

We provide an operator== for this sentinel type, and this is the only operator== we should provide because our previous comparison between two iterator objects does not make sense for input iterators:

bool operator== (const end_iterator&) const {
    // Use finished() so the test does not move the value out of the promise
    return generator ? generator->handle.promise().finished() : true;
}

C++20 has made some sweeping changes to the way compilers must support comparison operators, and these affect our operator== method.

We no longer need to provide a corresponding operator!= as the compiler will use the inverted value of our operator==. Neither do we need to implement comparison as friend functions using two versions of each operator with the operands swapped around to support equivalent comparisons such as the following:

generator.begin() == generator.end()
generator.end() == generator.begin()

Under C++20 the compiler must consider swapping the operands when expanding our operator== method. In other words the single operator== method is all we now need to supply. You will notice that many C++20 standard library classes have had redundant comparison operator definitions removed.

Now that we have the complete iterator class we can add the sentinel object and the begin and end methods to our Generator:

template <typename T>
class Generator
{
    class Promise {...} 
    
    struct end_iterator {};
    class iterator {...}

    iterator begin() { 
        auto it = iterator{*this}; 
        return ++it; 
    }

    end_iterator end() {
        return end_sentinel;
    }

private:
    end_iterator end_sentinel{};
};

The Generator class now supports the range-for loop we showed earlier:

std::cout << "Time (ms)   Data" << std::endl; 
for (auto&& dp: read_data(std::cin)) { 
    std::cout << dp << std::endl; 
}

The iterator also supports standard algorithms that use an input iterator such as copy or transform. There is only one complication with algorithms and that is our use of an end sentinel object. To maintain backward compatibility with existing code the end sentinel versions of the algorithms are defined in the std::ranges namespace so we can use std::ranges::copy and an ostream_iterator to display our data values:

auto stream = read_data(std::cin);
std::cout << "Time (ms)   Data" << std::endl;
std::ranges::copy(stream.begin(), stream.end(),
    std::ostream_iterator<std::optional<DataPoint>>(std::cout,"\n"));

Sometimes C++ can be elegantly simple like a duck floating on a river; but for coroutines we have to be aware of the frantic paddling under the surface just to hold position.

The full example of this code is in the files generator.h and iterator_demo.cpp in the accompanying GitHub repo coroutines-blog. The repo also contains an example of a generator using a movable type (std::unique_ptr) in the file iterator_move_demo.cpp.

Martin Bond

An independent IT trainer Martin has over 40 years academic and commercial experience in open systems software engineering. He has worked with a range of technologies from real time process controllers, through compilers, to large scale parallel processing systems; and across multiple sectors including industrial systems, semi-conductor manufacturing, telecomms, banking, MoD, and government.


2 Responses to C++20 Coroutine Iterators

  1. Tomáš Hering says:

    Hello Martin,
    thank you for an amazing and well crafted article. I'm struggling to find grounds in the standard for your following claim though: "Prior to C++17 we would also have to have provided an operator-> method as well but that is no longer a requirement as the compiler can implement -> using the * and . operators." Are you sure, this is the case? If so, can you please reference that? Thanks again, looking forward to further reading.

    tom

  2. Martin Bond says:

    Whoops, I got this wrong. I read the comment in an article on iterator concepts and took it at face value - I should know better at my age. I could find no other mention of this as a C++ compiler requirement, including the ISO standards. My test cases show that we do need operator-> as well as operator*. I've amended the article to reflect this. Thanks for the good feedback and spotting the mistake.
