Python Coroutines


How to make coroutines return values.

WHY


Maybe you need to handle a large volume of records and you heard coroutines are a good way of doing it. Or maybe you just started looking into them. Whatever the reason, you probably have noticed that a lot of blog posts and tutorials out there show how to code one to print some values.

If you then tried to actually do something with them, you may have ended up here trying to figure out how to coerce the little buggers into returning those values!

You are not alone 😉 I too found it confusing at first. So I'm going to show you the basics of how to use Python coroutines that return values.

If you're in a hurry, let me give you the secret sauce up front: give your yield-expression a value to emit and assign its result to a variable, as in received = (yield result)! That's all, simple but often overlooked.

For a complete explanation, examples, and walls of text please read on 🙂


WHAT


If you're familiar with the theory (or if you hit Wikipedia), you'd have read that

"Coroutines are computer-program components that generalize subroutines for non-preemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations. Coroutines are well-suited for implementing familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists and pipes."
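which is a very formal way of saying that coroutines are routines that can pause, hand control back to the caller, and later resume where they left off, which lets several of them take turns running.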

In Python, coroutines are indeed implemented as a form of iterator. Specifically, they are a kind of generator that can accept values.

Coroutines were established in PEP 342 -- Coroutines via Enhanced Generators. The PEP is fairly dense, so we'll digest it bit by bit here.

For starters, we find that coroutines only have a handful of extras compared to regular generators:
  • a send() method, which resumes the coroutine and hands it a value at the point where it paused
  • a close() method, which tells the coroutine to stop and exit
  • a throw() method, which lets the caller raise an exception inside the coroutine at the point where it paused
  • and the big ticket item: they allow yield to be used as an expression
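To make those extras concrete, here is a minimal sketch that exercises all three methods on one generator. The `listener` function and the `events` list are made up for this illustration; they are not part of PEP 342.

```python
def listener(events):
    """Hypothetical coroutine that records what happens to it in `events`."""
    try:
        while True:
            try:
                item = (yield)
                events.append(("received", item))
            except ValueError as exc:
                # throw() raises the exception here, at the paused yield
                events.append(("caught", str(exc)))
    finally:
        # close() raises GeneratorExit at the paused yield; finally still runs
        events.append(("closed", None))


events = []
gen = listener(events)
next(gen)                            # advance to the first yield
gen.send("hello")                    # resumes; (yield) evaluates to "hello"
gen.throw(ValueError("bad input"))   # raised inside the generator, which handles it
gen.close()                          # asks the generator to exit
print(events)
```

Note that the exception passed to throw() travels into the coroutine, not out of it; it only comes back to the caller if the coroutine doesn't handle it.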

A trivial generator implementing a counter that goes from 1 up to and including some maximum value would look like this:

def counter(max_value):
    i = 1
    while i <= max_value:
        yield i  # pauses upon encountering yield, generates value
        i += 1

Every time the yield statement is encountered execution pauses, and control returns to the caller. When execution resumes, 'counter' remembers where it was up to, and increments that value.
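If you want to watch the pausing and resuming happen, a quick sketch like this one drives the counter by hand. The variable names are just for illustration.

```python
def counter(max_value):
    i = 1
    while i <= max_value:
        yield i
        i += 1


gen = counter(3)
first = next(gen)     # runs until the first yield
second = next(gen)    # resumes after the yield, increments i, yields again
rest = list(gen)      # drains whatever is left
print(first, second, rest)   # → 1 2 [3]
```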

Our first coroutine


Let's start the way you might have seen before, defining a trivial coroutine. Then we'll gradually increase the level of complexity.

def pointless_coroutine():
    while True: #serve for ever and ever, or until close()
        value = yield # now yield is on the right, which turns it into an expression.
        print("Received value of {v!r}".format(v=str(value)))  # more on print formatting at https://pyformat.info/


So what is going on here?

On line 2 we enter an infinite loop, which lets us emit and receive values forever. Let's not forget that coroutines are both generators and consumers. To exit this loop we have to manually call .close() on the coroutine.

On line 3, value = yield, is where the magic happens. Placing yield on the right side of the equals sign changes it from a statement into an expression, and that's what makes it a coroutine. This yield-expression can receive values the next time execution of the coroutine resumes.

As mentioned earlier, whenever we reach yield in a generator, Python pauses execution and returns the yielded value up to the calling function. In coroutines the exact same thing happens, and then, when the coroutine is resumed with send(), the value sent becomes the result of the yield-expression inside it. We'll see this in more detail later.

Before we use our coroutine, though, we need to be aware of the first gotcha of coroutines. Whilst you can instantiate a generator and use it right away, to use a coroutine you need to kickstart it.

You see, a coroutine has to emit a value before it can receive one, even though there's nothing to emit the first time it's called. This means we have to make the coroutine advance up to the yield statement to be ready to accept values. That is called "priming" or "warming up" the coroutine, and there are two exactly equivalent ways of doing it:

  • call method send() of a coroutine instance with a value of 'None': coroutine_instance.send(None)
  • or call the built-in next() on it: next(coroutine_instance)
    • We list this as a second option because the spelling depends on your Python version. Python 2 generators have a .next() method, which Python 3 renamed to __next__(). The built-in next() function (Python 2.6+) works on both.
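Here is a small sketch of what happens if you skip the priming step, followed by both ways of doing it properly. The `echo` coroutine is a throwaway made up for this demonstration.

```python
def echo():
    """Hypothetical coroutine that just reports what it is sent."""
    while True:
        received = (yield)
        print("got", received)


unprimed = echo()
raised = False
try:
    unprimed.send("too early")    # not paused at a yield yet
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

primed_with_send = echo()
primed_with_send.send(None)       # option 1
primed_with_next = echo()
next(primed_with_next)            # option 2, via the built-in next()

primed_with_send.send("a")
primed_with_next.send("b")
```

Sending anything other than None to a coroutine that hasn't reached its first yield is an error, because there is no yield-expression waiting to receive it yet.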

Let's use our coroutine:

>>> instance = pointless_coroutine()
>>> instance.send(None)
>>> for n in range(0, 5):
...     instance.send(n)
... 
Received value of '0'
Received value of '1'
Received value of '2'
Received value of '3'
Received value of '4'

 

Remember we are trying things step by step; run before you fly and all that, so this is a good moment to point out that "... = yield" is perfectly valid syntax, but not recommended because it is only valid in some places 😲

A yield-expression must always be parenthesized except when it occurs at the top-level expression on the right-hand side of an assignment. 

You're welcome to try remembering that, but I find it easier to always define yield-expressions in my coroutines surrounded by parentheses, like so:

my_variable = (yield)
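To see where the parentheses stop being optional, here's a minimal sketch. The `running_total` coroutine is invented for this example; in it, yield is only one operand of a larger expression, so leaving the parentheses out is a SyntaxError.

```python
def running_total():
    total = 0
    while True:
        # yield is an operand of "+", not the whole right-hand side,
        # so the parentheses are mandatory here
        total = total + (yield total)


acc = running_total()
start = next(acc)            # prime: emits the initial total
after_five = acc.send(5)
after_three = acc.send(3)
print(start, after_five, after_three)   # → 0 5 8

# The same kind of expression without parentheses does not even compile
try:
    compile("def broken():\n    total = 1 + yield 2\n", "<demo>", "exec")
except SyntaxError:
    unparenthesized_is_error = True
else:
    unparenthesized_is_error = False
print(unparenthesized_is_error)
```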
 

Decorator to 'prime' coroutines


It can be a bit repetitive and error prone remembering to prime a coroutine before first use, so the common solution is to use a decorator that takes care of that for us. Let's make one.

import functools

def prime_cr(func):
    "'Prime' (i.e. warm up, start) a coroutine."
    @functools.wraps(func)  # copies __name__, __doc__ and __dict__ over for us
    def kickstart(*args, **kwargs):
        cr = func(*args, **kwargs)
        print('prime_cr is warming up coroutine {0!r}'.format(func.__name__))
        next(cr)  # prime; cr.send(None) does the same
        return cr
    return kickstart


Now we can send values right away without bothering to remember to call send(None) or next() first.


@prime_cr
def pointless_coroutine():
    while True:
        value = yield
        print("Received value of {v!r}".format(v=str(value)))

and we can call it straight away, like we do with generators:

>>> instance = pointless_coroutine()
prime_cr is warming up coroutine 'pointless_coroutine'
>>> for n in range(0, 5):
...     instance.send(n)
... 
Received value of '0'
Received value of '1'
Received value of '2'
Received value of '3'
Received value of '4'


Once we're done working with a coroutine, we should shut it down by calling its 'close()' method:

coroutine_instance.close()
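A short sketch of what closing does in practice, using a made-up `sink` coroutine: close() can be called more than once without harm, and once a coroutine is closed, any further send() raises StopIteration.

```python
def sink():
    """Hypothetical coroutine that swallows whatever it is sent."""
    while True:
        received = (yield)


coroutine_instance = sink()
next(coroutine_instance)            # prime
coroutine_instance.send("first")
coroutine_instance.close()
coroutine_instance.close()          # closing twice is harmless

rejected = False
try:
    coroutine_instance.send("late")
except StopIteration:
    rejected = True
    print("coroutine is closed; nothing more can be sent")
```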

OK, but HOW?


That's all very well, but the point was to show how to return values from coroutines. As I mentioned at the top, we do it by giving the yield-expression a variable to emit, while still assigning its result to another variable.

To illustrate, let's redefine the counter generator from the top as a coroutine:

def counter_coroutine(max_value):
    i = 1
    while i <= max_value: 
        print("Start of iteration. Before yield 'i' contained {0!r}".format(str(i)))
        value = (yield i)
        print("Resuming iteration. After yield 'i' contains {0!r} and 'value' contains {1!r}".format(str(i), str(value))) 
        if value is not None:  # 'is not None' so that sending 0 also counts
            i = value
        else:
            i += 1
        print("...'i' updated to {0!r}".format(str(i)))
    print("Reached max_value and exited loop.")


Note that we don't decorate it so we don't lose the first value, which would be discarded during warm-up when 'prime_cr' calls next() for us.

Line 5 is where we yield a value to the calling function. It goes like this:
  1. Upon reaching yield, execution of 'counter_coroutine' pauses, and the current contents of i are emitted to the function that called it.
  2. When execution of 'counter_coroutine' resumes, any value passed to it via .send() becomes the result of the yield-expression and is stored in value.
    1. If 'counter_coroutine' wasn't sent something, for example when resumed with next(), then value contains None.
  3. i is internally reassigned, either to value or to i + 1, and is ready to emit during the next iteration.
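The steps above can be sketched without the print calls, which makes the emit-then-receive order easier to follow. `quiet_counter` is a hypothetical stripped-down variant of 'counter_coroutine', written only for this illustration.

```python
def quiet_counter(max_value):
    """Same idea as counter_coroutine, minus the print calls."""
    i = 1
    while i <= max_value:
        value = (yield i)       # emit i, pause, then receive into value
        if value is not None:   # something was sent: jump to it
            i = value
        else:                   # resumed by next(): just keep counting
            i += 1


c = quiet_counter(7)
emitted = [next(c), next(c), c.send(5), next(c)]
print(emitted)   # → [1, 2, 5, 6]
```

Both next() and send() hand back whatever the coroutine yields next; send() simply delivers a value on the way in.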

A coroutine is a generator too, so it must emit a value; that's the value we return to the caller. But it has to emit that value before it can receive one, which means 'i' must already hold something the first time it's called. This is why we need to initialise 'i' in line 2.

If we don't initialise it, the first time an instance of 'counter_coroutine' is advanced we'll get an exception:

UnboundLocalError: local variable 'i' referenced before assignment

Returning a value from a coroutine is similar to a function call. If we wrote my_variable = some_function(x) before x was defined, we'd get a similar exception. The big difference is that a yield-expression works in reverse order: it emits first, and only receives the new value when the coroutine is resumed.

You don't always need a variable to emit values from. Simple yield-expressions like "... = (yield)" and "... = (yield None)" emit a value of None back to the caller. We use the first form in our affectionately-named pointless_coroutine.
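A tiny sketch to confirm that both spellings hand None back to the caller. The two coroutines, `bare` and `explicit`, are made up for this comparison.

```python
def bare():
    while True:
        received = (yield)         # emits None


def explicit():
    while True:
        received = (yield None)    # same thing, spelled out


results = []
for factory in (bare, explicit):
    coro = factory()
    results.append(next(coro))          # value emitted while priming
    results.append(coro.send("ping"))   # value emitted after receiving
    coro.close()
print(results)   # → [None, None, None, None]
```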

Ok, let's see it in action!

>>> c = counter_coroutine(7)
>>> next(c)
Start of iteration. Before yield 'i' contained '1'
1
>>> next(c)
Resuming iteration. After yield 'i' contains '1' and 'value' contains 'None'
...'i' updated to '2'
Start of iteration. Before yield 'i' contained '2'
2

So far no difference with regular generators. Every time we call next() we receive the following value. If we continue, we get 3, 4, 5... until eventually we reach max_value (7 in this example) and the coroutine raises a StopIteration exception to tell us it's done.
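If you drive a generator by hand, you need to be ready for that StopIteration. Here's one way to handle it, sketched with the plain 'counter' generator from earlier; the two-argument form of the built-in next() is an alternative that returns a default instead of raising.

```python
def counter(max_value):
    i = 1
    while i <= max_value:
        yield i
        i += 1


c = counter(2)
seen = []
while True:
    try:
        seen.append(next(c))
    except StopIteration:
        print("generator is exhausted")
        break
print(seen)   # → [1, 2]

# next() with a default never raises on an exhausted generator
exhausted_default = next(c, "done")
print(exhausted_default)   # → done
```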

But what if instead of just calling next we sent it a value?

>>> c.send(5)
Resuming iteration. After yield 'i' contains '2' and 'value' contains '5'
...'i' updated to '5'
Start of iteration. Before yield 'i' contained '5'
5
>>> result = next(c)  # verify we can assign returned values
Resuming iteration. After yield 'i' contains '5' and 'value' contains 'None'
...'i' updated to '6'
Start of iteration. Before yield 'i' contained '6'
>>> result
6

Our counter moves to the new value we sent it, and when we call next() again, continues from there, skipping values in between.

For Bonus points.


Probably your head is spinning fast already from all this. But if you're game for more, here's a nice extra.

Coroutines are very lightweight but they still consume a little bit of resources. If you are planning to use them to handle very large amounts of data, as in a log parser for example, it becomes rather important to dispose of them so they can be properly garbage collected.

When a coroutine is closed, Python raises a GeneratorExit exception inside it so it can clean up, but it's up to the programmer to remember to close it. Not to worry, that's why we have context managers in Python! 😁

class open_coro(object):
    """Simple coroutine context manager. 
    """
    def __init__(self, coro_instance):
        self.coroutine=coro_instance
    def __enter__(self):
        print("Starting context manager!")
        return self.coroutine #return coroutine instance
    def __exit__(self, *args):
        self.coroutine.close()
        print("Exiting from context manager -- Goodbye!")
#end open_coro.
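If you'd rather not write the class yourself, the standard library's contextlib.closing does the same job: it calls close() on whatever you give it when the with block ends. A minimal sketch, where the `sink` coroutine and `store` list are made up for the example:

```python
from contextlib import closing


def sink(store):
    """Hypothetical coroutine that appends what it receives to `store`."""
    while True:
        store.append((yield))


store = []
with closing(sink(store)) as coro:
    next(coro)        # prime
    coro.send("a")
    coro.send("b")

# After the with block the coroutine has been closed for us
try:
    coro.send("c")
    still_open = True
except StopIteration:
    still_open = False
print(store, still_open)   # → ['a', 'b'] False
```

The hand-written class is still useful when you want the extra messages on entry and exit.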


Now we can use our coroutines with confidence that they will be closed and ready for cleanup when we're done:

>>> c = counter_coroutine(10)
>>> with open_coro(c) as oc:
...     for i in oc:
...         print("Counted to:{}".format(i))
...         if i == 2:
...             oc.send(9)
... 
Starting context manager!
Start of iteration. Before yield 'i' contained '1'
Counted to:1
Resuming iteration. After yield 'i' contains '1' and 'value' contains 'None'
...'i' updated to '2'
Start of iteration. Before yield 'i' contained '2'
Counted to:2
Resuming iteration. After yield 'i' contains '2' and 'value' contains '9'
...'i' updated to '9'
Start of iteration. Before yield 'i' contained '9'
9
Resuming iteration. After yield 'i' contains '9' and 'value' contains 'None'
...'i' updated to '10'
Start of iteration. Before yield 'i' contained '10'
Counted to:10
Resuming iteration. After yield 'i' contains '10' and 'value' contains 'None'
...'i' updated to '11'
Reached max_value and exited loop.
Exiting from context manager -- Goodbye!


Two interesting things to note here. The first is a message telling us that variable 'i' was updated to 11. However, this happens internally before returning to the top of the loop; when it does return, 'i' exceeds max_value, so the loop exits and 11 is never emitted. We can verify that by checking the loop variable 'i' after execution.
>>> i
10


The second thing is that our coroutine is closed at the end as expected. If we call next() once more we do get a StopIteration exception.

>>> next(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
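One more detail from the session above worth a sketch: the bare 9 in the output is the return value of send(), echoed by the interactive prompt. Because send() itself consumed that value, the for loop never got to see it, which is why there is no "Counted to:9" line. `quiet_counter` is a hypothetical print-free variant of 'counter_coroutine' used only to show this.

```python
def quiet_counter(max_value):
    i = 1
    while i <= max_value:
        value = (yield i)
        i = value if value is not None else i + 1


seen_by_loop = []
returned_by_send = []
c = quiet_counter(10)
for n in c:
    seen_by_loop.append(n)
    if n == 2:
        # send() resumes the coroutine AND returns the next yielded value
        returned_by_send.append(c.send(9))
print(seen_by_loop, returned_by_send)   # → [1, 2, 10] [9]
```

So if you mix a for loop with send(), remember to use the value send() hands back, or it is lost.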


That's quite the infodump, so we'll leave it here.
Hope it helps!
