SYNCHRONOUS LAZY LISTS
Lazy lists make it easy to write infinite data sets without doing infinite computations. For instance, the following static method in C# will generate a lazy list that serves up all the integers (actually, all the 32-bit integers in an infinite cycle, but we quibble):
1 2 3 4
Notice that an instance of this
LazyList type has two properties (or fields, if you prefer): a
value and a
rest, which is a function, (lambda expression) that, when invoked, will produce the rest of the infinite stream. That’s the essence of the technique: keep the infinite stuff in a function, recursively, and pick out values only when you need them.
Next is my implementation of the
LazyList class. Of course, I could use C#’s built-in
Lazy type, but I am, on purpose, doing the details myself for deeper understanding and exploration. We’re going to morph these into task-based LazyLists that produce the downstream values asynchronously:
1 2 3 4 5 6 7 8 9 10 11 12 13
In addition to the properties already mentioned, I’ve included an
nth method that walks up the lazy list in O(n) time to get the
n-th item (without the tail-recursion optimization, this code will also consume O(n) space on the stack). This is not designed to fetch the
n-th item directly, as is possible with some spigot algorithms, and it might be very interesting to explore spigots using lazy techniques in C#, but perhaps another time.
Next is a little more meat on the bones: the obligatory factorial:
1 2 3 4 5 6 7 8
This one uses recursion in the production of
value. There are two uses of recursion:
value recursively uses prior values,
rest recursively produces future values. This is pretty neat. Both uses of recursion are completely synchronous and all this code runs on one thread.
We could go on to fibonaccis and more exotic streams; we could anonymize these for remoting, memoize them for efficiency, as explained an earlier post on this blog; but that’s not where we’re going today.
One last thing before going
async, and thats for a bit of code that exercises the above, using LinqPad:
1 2 3 4 5 6 7 8 9 10 11 12 13
Much more with synchronous lazy streams can be found in this wonderful paper: Music of the Streams by Doug McIlroy. But on to asynchrony for us.
ASYNCHRONOUS LAZY LISTS WITH TASK
Let’s start with ‘what-we-want’ in AsyncMain:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
We made a new main because we needed it to be
async and we can’t mark the regular
async. What we want, here, is to
await the computation of
asyncFactorials in the body of
AsyncMain. This can cause various threads to jump around in our code, so we’re careful to print thread id’s after every
await and in the body of every lambda expression that runs asynchronously in
a task, via the following static method:
1 2 3
We’ll also insert a number of artificial random delays to try to entice the thread scheduler to send different threads through our code. None of these
Sleeps are part of the algorithm at all; they’re here to induce interesting behavior.
Now look at the implementation of
1 2 3 4 5 6 7 8 9 10
Looks just like the implementation of lazy integers. The variable
k3 = asyncIntegers(5) in
AsyncMain must be of type
AsyncList. The deeper differences are in
nth, the method we use to walk the lazy list:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
We use only one facet of the multifaceted
Task class. The rule we use is that
Task.Run, when given a
Func<T>, produces a
Task<T>. Awaiting a
Task<T> prduces a
T. So, except for the routing through
Task, it looks like invoking a
Func<T> to produce a
T. This seems like a minimal intrusion of task-ness on our original lazy-list design.
Look at the first branch of
n <= 1. We
Task.Run of a
Func<T> that produces
value, so the result of the
await is of type
T, the type of
T gets returned through an
async method, namely
nth, so it gets converted back into a
Task<T> on the way out. In the body of
AsyncMain, where we are calling
Task<T>, getting back a
T in the end.
Task<T> when run under
Task.Run or when returned from any
async method that would otherwise produce a
awaited (there’s a monad in here somewhere, but that’s for another time, too).
On the other branch of
nth, The new
rest will walk up the lazy list just as before, only awaiting the value of
rest().nth(n-1) to get a
T; and returning it through the
async to get a
Task<T> on the way out (EDIT: thanks to Brian Grunkmeyer for a bug fix here).
On both branches,
nth is of type
Task<T>, just what we need to
await on in the caller to get
Here is the more meaty async factorial, which doesn’t differ in any interesting way from the lazy-list factorial:
1 2 3 4 5 6 7 8 9 10
It may take a few tries to get a lot of thread variety: the thread-pool scheduler seems to have hints from your code and tries to reuse the same thread whenever it can. But eventually, you can get a trace like the following that shows various thread moving around these tasks.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
A gist for this LINQPad script can be found here: https://gist.github.com/4580491.