The Light Cone

Where past and future meet at a point in spacetime

Asynchronous Lazy Lists

SYNCHRONOUS LAZY LISTS

Lazy lists make it easy to write infinite data sets without doing infinite computations. For instance, the following static method in C# will generate a lazy list that serves up all the integers (actually, all the 32-bit integers in an infinite cycle, but we quibble):

A lazy list for an infinite stream of ints
1
2
3
4
public static LazyList<int> integers(int n)
{
    return new LazyList<int> {value = n, rest = () => integers(n + 1)};
}

Notice that an instance of this LazyList type has two properties (or fields, if you prefer): a value and a rest, which is a function, (lambda expression) that, when invoked, will produce the rest of the infinite stream. That’s the essence of the technique: keep the infinite stuff in a function, recursively, and pick out values only when you need them.

Next is my implementation of the LazyList class. Of course, I could use C#’s built-in Lazy type, but I am, on purpose, doing the details myself for deeper understanding and exploration. We’re going to morph these into task-based LazyLists that produce the downstream values asynchronously:

Implementation of the LazyList class
1
2
3
4
5
6
7
8
9
10
11
12
13
public class LazyList<T>
{
    public T value;
    public Func<LazyList<T>> rest;

    public T nth(int n)
    {
        return
            (n <= 1) ?
            value :
            rest().nth(n - 1);
    }
}

In addition to the properties already mentioned, I’ve included an nth method that walks up the lazy list in O(n) time to get the n-th item (without the tail-recursion optimization, this code will also consume O(n) space on the stack). This is not designed to fetch the n-th item directly, as is possible with some spigot algorithms, and it might be very interesting to explore spigots using lazy techniques in C#, but perhaps another time.

Next is a little more meat on the bones: the obligatory factorial:

A LazyList that produces factorials
1
2
3
4
5
6
7
8
public static LazyList<int> factorials(int n)
{
    return new LazyList<int>
    {
        value = (n <= 1) ? 1 : n * factorials(n - 1).value,
        rest = () => factorials(n + 1)
    };
}

This one uses recursion in the production of value. There are two uses of recursion: value recursively uses prior values, rest recursively produces future values. This is pretty neat. Both uses of recursion are completely synchronous and all this code runs on one thread.

We could go on to fibonaccis and more exotic streams; we could anonymize these for remoting, memoize them for efficiency, as explained an earlier post on this blog; but that’s not where we’re going today.

One last thing before going async, and thats for a bit of code that exercises the above, using LinqPad:

Exercising the LazyLists
1
2
3
4
5
6
7
8
9
10
11
12
13
void Main()
{
    var k1 = integers(1);
    // Expect 5
    k1.nth(5).Dump();

    var k2 = factorials(1);
    // Expect 120
    k2.nth(5).Dump();

    // A hint for what's coming up next
    AsyncMain();
}

Much more with synchronous lazy streams can be found in this wonderful paper: Music of the Streams by Doug McIlroy. But on to asynchrony for us.

ASYNCHRONOUS LAZY LISTS WITH TASK

Let’s start with ‘what-we-want’ in AsyncMain:

AsyncMain
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
public static async void AsyncMain()
{
    DumpThreadId("Async main, k3: ");
    var k3 = asyncIntegers(1);
    (await k3.nth(5)).Dump();

    DumpThreadId("Async main, k4: ");
    var k4 = asyncFactorials(1);
    (await k4.nth(5)).Dump();

    var z0 = (await Task.Run(() =>
    {   DumpThreadId("Anonymous task: ");
        Thread.Sleep(ran.Next(200));
        return 42;
    }));

    DumpThreadId("Async main, z0: ");
}

We made a new main because we needed it to be async and we can’t mark the regular Main as async. What we want, here, is to await the computation of asyncIntegers and asyncFactorials in the body of AsyncMain. This can cause various threads to jump around in our code, so we’re careful to print thread id’s after every await and in the body of every lambda expression that runs asynchronously in a task, via the following static method:

How to dump a thread id
1
2
3
public static void DumpThreadId(string msg = "")
{   System.Threading.Thread.CurrentThread.ManagedThreadId.Dump(msg + "Thread Id");
}

We’ll also insert a number of artificial random delays to try to entice the thread scheduler to send different threads through our code. None of these Sleeps are part of the algorithm at all; they’re here to induce interesting behavior.

Now look at the implementation of asyncIntegers:

asyncIntegers
1
2
3
4
5
6
7
8
9
10
public static AsyncList<int> asyncIntegers(int n)
{   return new AsyncList<int>
    {   value = n,
        rest = () =>
        {   DumpThreadId("AsyncIntegers, rest: ");
            Thread.Sleep(ran.Next(200));
            return asyncIntegers(n + 1);
        }
    };
}

Looks just like the implementation of lazy integers. The variable k3 = asyncIntegers(5) in AsyncMain must be of type AsyncList. The deeper differences are in nth, the method we use to walk the lazy list:

AsyncList
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
public class AsyncList<T>
{   public T value;
    public Func<AsyncList<T>> rest;

    public async Task<T> nth(int n)
    {   return
            (n <= 1) ?
            await (Task.Run(() =>
            {   DumpThreadId("AsyncList, nth: ");
                return value;
            })) :
            await Task.Run(() =>
            {   Thread.Sleep(ran.Next(200));
                return rest().nth(n - 1);
            });
    }
}

We use only one facet of the multifaceted Task class. The rule we use is that Task.Run, when given a Func<T>, produces a Task<T>. Awaiting a Task<T> prduces a T. So, except for the routing through Task, it looks like invoking a Func<T> to produce a T. This seems like a minimal intrusion of task-ness on our original lazy-list design.

Look at the first branch of nth, when n <= 1. We await a Task.Run of a Func<T> that produces value, so the result of the await is of type T, the type of value. This T gets returned through an async method, namely nth, so it gets converted back into a Task<T> on the way out. In the body of AsyncMain, where we are calling nth, we await this Task<T>, getting back a T in the end.

T’s become Task<T> when run under Task.Run or when returned from any async method that would otherwise produce a T. Task<T>’s become T’s when awaited (there’s a monad in here somewhere, but that’s for another time, too).

On the other branch of nth, The new rest will walk up the lazy list just as before, only awaiting the value of rest().nth(n-1) to get a T; and returning it through the async to get a Task<T> on the way out (EDIT: thanks to Brian Grunkmeyer for a bug fix here).

On both branches, nth is of type Task<T>, just what we need to await on in the caller to get T’s.

Here is the more meaty async factorial, which doesn’t differ in any interesting way from the lazy-list factorial:

asyncFactorial
1
2
3
4
5
6
7
8
9
10
public static AsyncList<int> asyncFactorials(int n)
{   return new AsyncList<int>
    {   value = (n <= 1) ? 1 : n * asyncFactorials(n - 1).value,
        rest = () =>
        {   DumpThreadId("AsyncFactorials, rest: ");
            Thread.Sleep(ran.Next(200));
            return asyncFactorials(n + 1);
        }
    };
}

It may take a few tries to get a lot of thread variety: the thread-pool scheduler seems to have hints from your code and tries to reuse the same thread whenever it can. But eventually, you can get a trace like the following that shows various thread moving around these tasks.

asyncFactorial
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
5
120
Async main, k3: Thread Id
16

AsyncIntegers, rest: Thread Id
22

AsyncIntegers, rest: Thread Id
37

AsyncIntegers, rest: Thread Id
37

AsyncIntegers, rest: Thread Id
37

AsyncList, nth: Thread Id
37

5
Async main, k4: Thread Id
37

AsyncFactorials, rest: Thread Id
37

AsyncFactorials, rest: Thread Id
37

AsyncFactorials, rest: Thread Id
37

AsyncFactorials, rest: Thread Id
37

AsyncList, nth: Thread Id
26

120
Anonymous task: Thread Id
37

Async main, z0: Thread Id
37

A gist for this LINQPad script can be found here: https://gist.github.com/4580491.