Python coroutines

Picture of a quad-helix drone

How to make coroutines return values.

WHY


Maybe you need to handle a large volume of records and you heard coroutines are a good way of doing it. Or maybe you just started looking into them. Whatever the reason, you probably have noticed that a lot of blog posts and tutorials out there show how to code one to print some values.

If you then tried to actually do something with them, you may have ended up here trying to figure out how to coerce the little buggers into returning those values!

You are not alone 😉 I too found it confusing at first. So I'm going to show you the basics of how to use Python coroutines that return values.

If you're in a hurry, let me give you the secret sauce up front: define your yield-expression with a variable! That's all, simple but often overlooked.
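Here's a minimal sketch of that tip, assuming it refers to classic generator-based coroutines driven with send(). The running_average name and the numbers are made up for illustration; the point is that a single yield expression both hands a value back to the caller and receives the next one.

```python
def running_average():
    """Generator-based coroutine: takes numbers via send(), hands back the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # The yield expression does double duty: it hands `average` back to
        # the caller and evaluates to whatever the caller sends in next.
        value = yield average
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)               # prime the coroutine: run it up to the first yield
first = averager.send(10)    # average of [10]
second = averager.send(20)   # average of [10, 20]
third = averager.send(60)    # average of [10, 20, 60]
print(first, second, third)  # 10.0 15.0 30.0
```

Note the next() call before the first send(): a freshly created generator has not reached its yield yet, so it has to be advanced once before it can accept a value.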

For a complete explanation, examples, and walls of text please read on 🙂

Multitasking

In a mind-blowing twist, in computer science concurrent and simultaneous aren't synonyms.

Concurrency means several tasks make progress within the same time frame, usually by taking turns on a shared resource; they don't have to run at the same instant. And there are different approaches to achieving this.

To illustrate the difference let's imagine we're making breakfast, and we want eggs and toast with a glass of fresh orange juice.

In Sequence

If we did it in a linear way we might take 2 minutes to cook scrambled eggs; once they're ready we would pop bread in the toaster, wait another two minutes, and spend an extra 30 seconds spreading butter on our toast at the end. Finally we'd move on to the juicer and spend one minute pressing oranges. Total preparation time 5m:30s

Eggs are cold by then!

With multitasking

To solve the cold eggs problem we can use a multitasking approach instead of waiting for each part to complete in turn.

If we cook multithreaded style, we may crack the eggs open on a pan, then turn to the toaster and put the bread in. We then go back and forth between the juicer and our eggs, making sure eggs don't stick to the bottom of the pan, and squeezing oranges.

One minute in we get our juice. Two minutes in, eggs are ready and bread pops out. We spend the same 30 seconds as before applying golden buttery goodness.

In reality all that context switching between juicer, pan, and toaster isn't instant, so let's add 30 seconds more to account for the time it takes to move from one to another. Total preparation time 3m:00s! We saved a whole two and a half minutes over the linear method.
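To make the back-and-forth concrete, here is a rough sketch of one cook juggling tasks, using plain generators as stand-ins for the kitchen chores. The task and round_robin helpers are hypothetical names invented for this illustration, and the step counts are arbitrary; it only shows the interleaving, not real timing.

```python
from collections import deque


def task(name, steps):
    """A hypothetical kitchen task that needs `steps` turns of attention."""
    for step in range(1, steps + 1):
        # Each yield hands control back to the scheduler (the cook).
        yield f"{name} {step}/{steps}"


def round_robin(tasks):
    """Give every unfinished task one turn at a time and record the order."""
    queue = deque(tasks)
    log = []
    while queue:
        current = queue.popleft()
        try:
            log.append(next(current))
        except StopIteration:
            continue  # this task is done, so it doesn't go back in the queue
        queue.append(current)
    return log


log = round_robin([task("eggs", 2), task("juice", 1), task("toast", 2)])
print(log)
# ['eggs 1/2', 'juice 1/1', 'toast 1/2', 'eggs 2/2', 'toast 2/2']
```

Only one task runs at any moment, yet all three make progress in the same stretch of time. That is the concurrent-but-not-simultaneous idea from above.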

If we did it in parallel, we'd ask somebody (let's say a couple of friends) to help out. We could cook the eggs ourselves, while one of them does the toast and the other prepares the juice. Total time goes down to 2m:30s, which is the time of the slowest task (two minutes of toasting plus 30 seconds of buttering).
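The three totals can be checked with a few lines of arithmetic. This is just a back-of-the-envelope sketch using the durations from the breakfast story; the variable names and the fmt helper are made up for the example.

```python
# Durations from the breakfast example, in seconds.
EGGS = 120
TOAST = 120
BUTTER = 30       # buttering can only start once the toast is done
JUICE = 60
SWITCHING = 30    # overhead of moving between pan, toaster and juicer


def fmt(seconds):
    """Format seconds the way the text does, e.g. 330 -> '5m:30s'."""
    return f"{seconds // 60}m:{seconds % 60:02d}s"


# In sequence: every step waits for the previous one.
sequential = EGGS + TOAST + BUTTER + JUICE

# Multitasking: one cook overlaps eggs, toast and juice, then butters,
# and pays the context-switching overhead.
multitasking = max(EGGS, TOAST, JUICE) + BUTTER + SWITCHING

# Parallel: three cooks, so the total is the slowest person's chain of work.
parallel = max(EGGS, TOAST + BUTTER, JUICE)

print(fmt(sequential), fmt(multitasking), fmt(parallel))
# 5m:30s 3m:00s 2m:30s
```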


This distinction becomes important when choosing a multitasking model, as each offers its own advantages and tradeoffs.

Kickoff

I spend considerable amounts of time reading documentation and googling the intertubes in pursuit of learning "that one new thing I need for this".

Frequently, I see people say things like "you can build Amazing Thingamabob (tm)", and "use the combobulator to defenestrate 10x faster!", but they gloss over details which are obvious to someone who already knows the topic, yet impenetrable to first-time newbies. That's led me many a time to ask

OK, but how?


So I decided to share here (every once in a while at least!) what I learned in my deep dives, in hopes that it will save someone a bit of time when looking into those topics 😊

Stay tuned!
