I came across this post yesterday that was talking about the “yield return” statement in C#. It is probably one of the most underused features of C# 2.0! And yes, it was introduced in C# 2.0. 🙂 The author does an excellent job explaining what the “yield return” statement does: it essentially pauses the execution of the method, so that the next time the caller asks for a value (the next call to MoveNext on the enumerator) the method resumes where it left off. It is a very limited implementation of a continuation, where we are effectively interleaving the execution of the method being called with that of the caller.
I started to write a comment on the author’s blog, but then I decided that the comment was getting way too long and I wouldn’t be able to explain what I was saying effectively without showing example code. I wanted to express that I thought that one of the most important aspects of the yield return statement was its delayed execution ability. To show this, let’s use a method that I created for an earlier blog post. It is a Map extension method that operates on an IEnumerable. It just loops through each item in the IEnumerable, passes the item into the supplied function, and then yield returns the result. The method looks like this:
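To make the pause-and-resume behavior concrete, here is a minimal sketch that drives an iterator by hand instead of using foreach. The names YieldDemo, Steps and Trace are made up for this illustration; only GetEnumerator, MoveNext and Current come from the framework.

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo
{
    // Hypothetical iterator, used only for illustration. It records in 'log'
    // when each part of its body actually runs.
    public static IEnumerable<string> Steps(List<string> log)
    {
        log.Add("before first");
        yield return "first";
        log.Add("before second");
        yield return "second";
        log.Add("end");
    }

    // Drives the iterator manually and returns the order in which things happened.
    public static List<string> Trace()
    {
        var log = new List<string>();

        // Calling Steps and GetEnumerator runs none of the iterator body yet.
        using (IEnumerator<string> e = Steps(log).GetEnumerator())
        {
            log.Add("got enumerator");

            // Each MoveNext runs the body up to the next yield return, then pauses.
            while (e.MoveNext())
            {
                log.Add("caller saw " + e.Current);
            }
        }
        return log;
    }
}
```

Calling YieldDemo.Trace() should return the entries in this order: "got enumerator", "before first", "caller saw first", "before second", "caller saw second", "end". The caller and the iterator take turns, which is the interleaving described above.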
public static IEnumerable<TResult> Map<TArg, TResult>(
    this IEnumerable<TArg> list,
    Func<TArg, TResult> func)
{
    foreach (TArg item in list)
    {
        yield return func(item);
    }
}
Very simple. The only complicated part is the method signature where we have two generic parameters TArg and TResult. We are passing in a list of TArg and then a function which takes a TArg and returns a TResult. It is easier if you just ignore the signature and look at the foreach loop. It loops through each item and then yield returns the result of calling “func”. Simple.
So, let’s look at using this method on a list of strings.
var data = new List<string>() { "al", "bob", "bill", "chris", "doug" };
var result = data.Map(n => n.ToUpper());
Here we are just making all the strings in this list uppercase. But what is actually executing here?
The short answer is NOTHING. At this point, until we try to get the results out of our ‘result’ variable nothing is actually executed. So, what happens when we feed this into another method that has a “yield return”?
var data = new List<string>() { "al", "bob", "bill", "chris", "doug" };
var result = data.Map(n => n.ToUpper());
var result2 = result.Map(n => n + " " + n);
Well, again, the short answer is nothing! At this point nothing (in the method) is actually executing.
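One way to check the claim that nothing has run yet is to count how often the lambda is invoked. The following is a small sketch under that idea; the class name DeferredDemo and the method CountCalls are invented for this example, and Map is repeated from above only so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    // Same shape as the Map method shown earlier, repeated for self-containment.
    public static IEnumerable<TResult> Map<TArg, TResult>(
        this IEnumerable<TArg> list,
        Func<TArg, TResult> func)
    {
        foreach (TArg item in list)
        {
            yield return func(item);
        }
    }

    // Returns { calls made before enumerating, calls made after enumerating }.
    public static int[] CountCalls()
    {
        int calls = 0;
        var data = new List<string> { "al", "bob", "bill", "chris", "doug" };

        // Building the chain only creates iterator objects; no lambda runs here.
        var result = data.Map(n => { calls++; return n.ToUpper(); });
        var result2 = result.Map(n => n + " " + n);
        int before = calls;

        // Enumerating is what finally pulls items through both Map calls.
        foreach (string name in result2) { }
        int after = calls;

        return new[] { before, after };
    }
}
```

DeferredDemo.CountCalls() should return 0 for the first element and 5 for the second: the uppercase lambda is not called at all while the chain is being built, and it is called once per item when the chain is enumerated.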
So, when we go to execute this we will just execute the “data.Map” and get all of the strings uppercased and then pass the result into “result.Map” which will go through them all and concatenate each string with itself, right? Wrong. Let’s say we do this:
foreach (string name in result2)
{
    Console.WriteLine(name);
}
First I’ll show you what the final output looks like so there is no guessing:
AL AL
BOB BOB
BILL BILL
CHRIS CHRIS
DOUG DOUG
Now, where does the code go when you trace into it? Well, the first thing that happens is that “al” gets passed into the first “Map” call and gets uppercased. Then it is yield returned, and we find ourselves in the second “Map” call with “AL”, which is concatenated with itself to form “AL AL”, and then that is yield returned. Next you find yourself on the “Console.WriteLine” line with “AL AL” being passed in. Only then does “bob” start its trip through the same steps.
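If you would rather not step through this in a debugger, the order can be recorded in a list. This is only a sketch: TracedMap is a hypothetical variant of Map that takes a label and a log, and the class name TraceDemo is made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class TraceDemo
{
    // Hypothetical variant of Map that notes each item it is about to process.
    public static IEnumerable<TResult> TracedMap<TArg, TResult>(
        this IEnumerable<TArg> list,
        string label,
        List<string> log,
        Func<TArg, TResult> func)
    {
        foreach (TArg item in list)
        {
            log.Add(label + " " + item);
            yield return func(item);
        }
    }

    // Runs a two-stage chain over two names and returns the recorded order.
    public static List<string> Trace()
    {
        var log = new List<string>();
        var data = new List<string> { "al", "bob" };

        var upper = data.TracedMap("upper", log, n => n.ToUpper());
        var doubled = upper.TracedMap("double", log, n => n + " " + n);

        foreach (string name in doubled)
        {
            // Stands in for Console.WriteLine so the order can be inspected.
            log.Add("write " + name);
        }
        return log;
    }
}
```

TraceDemo.Trace() should produce "upper al", "double AL", "write AL AL", "upper bob", "double BOB", "write BOB BOB". Each item travels through the whole chain before the next one is even read from the list.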
Pretty crazy, right? Until the last line is output we haven’t even executed all of our code! So, we don’t execute code unless we need the result of it! How cool is that? One thing this enables is representing an infinite series. Here, we aren’t really representing an infinite series (since int is bounded), but I wanted to keep it simple.
public static IEnumerable<int> InfiniteSeries()
{
    int result = 0;
    while (true)
    {
        yield return result;
        result++;
    }
}
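Okay, so this method just returns each integer in turn. Strictly speaking it never stops: in C#’s default unchecked context, result++ silently wraps around from int.MaxValue to int.MinValue rather than running out of space. But for now we are going to pretend that it is infinite. Then we feed this into this method: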
public static IEnumerable<int> GetEvenNumbers(IEnumerable<int> list)
{
    foreach (int value in list)
    {
        if (value % 2 == 0)
        {
            yield return value;
        }
    }
}
This method just returns all the even numbers. So when we call these like this:
int count = 0;
foreach (int value in GetEvenNumbers(InfiniteSeries()))
{
    // Stop after 10 values ("count > 10" would print 11).
    if (count >= 10)
        break;
    Console.WriteLine(value);
    count++;
}
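Update: Matt Podwysocki reminded me that I can use “Take” here instead of doing this in an imperative way (thanks!). Note that ForEach here has to be a custom extension method, since System.Linq does not define one for IEnumerable<T>: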
GetEvenNumbers(InfiniteSeries()).Take(10).ForEach(Console.WriteLine);
We have a method that returns an infinite series of even numbers until we decide to stop calling it. So you get the infinite list without having to actually fill an infinite list, and there are certainly a few uses for that.
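For readers who only have System.Linq available, here is a sketch that bounds the infinite series with Take and TakeWhile and nothing custom. The class name SeriesDemo and the methods FirstTenEvens and EvensBelow are names invented for this example; InfiniteSeries and GetEvenNumbers are copied from above so the snippet stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SeriesDemo
{
    public static IEnumerable<int> InfiniteSeries()
    {
        int result = 0;
        while (true)
        {
            yield return result;
            result++;
        }
    }

    public static IEnumerable<int> GetEvenNumbers(IEnumerable<int> list)
    {
        foreach (int value in list)
        {
            if (value % 2 == 0)
            {
                yield return value;
            }
        }
    }

    // Take stops pulling from the source after 10 items, so ToList is safe here.
    public static List<int> FirstTenEvens()
    {
        return GetEvenNumbers(InfiniteSeries()).Take(10).ToList();
    }

    // TakeWhile stops at the first item that fails the predicate.
    public static List<int> EvensBelow(int limit)
    {
        return GetEvenNumbers(InfiniteSeries()).TakeWhile(v => v < limit).ToList();
    }
}
```

FirstTenEvens() should give 0, 2, 4 and so on up to 18, and EvensBelow(7) should give 0, 2, 4, 6. Calling ToList directly on the unbounded series, without Take or TakeWhile in front of it, would never finish, so the bounding step has to come first.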
So, I hope you see that the delayed execution of the yield return statement is extremely useful and allows you to chain together different methods without having to execute all intermediate steps up front. This allows a much more efficient execution model, but if you aren’t aware of what is happening you can get a few surprises when debugging.
Good continuation stuff for C# here
http://tomasp.net/blog/csharp-async.aspx
Thanks, I’ll check that out.
It should also be noted that you don’t need to have an explicit ‘break’ in your foreach loop. You can instead use another extension method, .Take(int), so that you only get the desired number of results:
foreach (int value in
    GetEvenNumbers(InfiniteSeries())
        .Take(10))
{
    Console.WriteLine(value);
}
Or use the handy Apply() extension method defined last time (more details here: http://groups.google.com/group/mono-rocks/browse_frm/thread/d68aa4afae092cd0#)
GetEvenNumbers(InfiniteSeries())
    .Take(10)
    .Apply (Console.WriteLine)
    .Apply ();
Don’t forget about Take and TakeWhile which will get rid of any need for having those nasty break statements. Use the plethora of methods that Enumerable gives us.
Matt
@Matt I’ve updated the code to be less break-ful! I agree, that last code block was ugly and had I thought about it more, it probably would have ended up less so. Thanks!
One thing to take note of: if you iterate over "result2" twice in a foreach loop, the whole chain is executed twice as well. Using LINQ to Objects is really nice, but be careful not to do something like this:
var something = somethingElse.Select(thing => new otherThing(thing)).OrderBy(otherThing => otherThing.Name);
foreach (var otherThing in something) {
    // ...do something here
}
// ...do more stuff here
foreach (var otherThing in something) {
    // ...do something different here
}
This will result in the entire "linq query" executing twice. To avoid this, plop a ".ToArray()" or ".ToList()" on the end of the query.
var something = somethingElse.Select(thing=>new otherThing(thing)).OrderBy(otherThing=>otherThing.Name).ToArray();
I ran into this when exposing an IEnumerable<string> property on one of my classes. The IEnumerable was created in the class from a LINQ to Objects block, and every time a collaborating object requested the property the internal LINQ to Objects block executed again.
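The double execution is easy to observe by counting how often the projection runs. Below is a small sketch of that; ReEnumerationDemo and CountProjections are names made up for this illustration, and a plain list of strings stands in for the "somethingElse" collection above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReEnumerationDemo
{
    // Returns { projections run by the deferred query, projections run with ToList }.
    public static int[] CountProjections()
    {
        int calls = 0;
        var names = new List<string> { "bob", "al" };

        // Deferred: the Select lambda runs again on every enumeration.
        IEnumerable<string> query = names
            .Select(n => { calls++; return n.ToUpper(); })
            .OrderBy(n => n);
        foreach (var n in query) { }
        foreach (var n in query) { }
        int deferredCalls = calls;

        // Materialized: ToList runs the query once and keeps the results.
        calls = 0;
        List<string> cached = names
            .Select(n => { calls++; return n.ToUpper(); })
            .OrderBy(n => n)
            .ToList();
        foreach (var n in cached) { }
        foreach (var n in cached) { }
        int cachedCalls = calls;

        return new[] { deferredCalls, cachedCalls };
    }
}
```

With two names in the list, CountProjections() should return 4 for the deferred query (two items, enumerated twice) and 2 for the ToList version. The trade-off is that ToList takes a snapshot: later changes to the source list are not reflected in the cached results.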
The ability to resume execution makes it easy to implement fibers and coroutines. Check more info at http://fxcritic.blogspot.com/2008/05/lightweight-fibercoroutines.html
If you’re interested in how the delayed execution actually works read this:
http://startbigthinksmall.wordpress.com/2008/06/09/behind-the-scenes-of-the-c-yield-keyword/
@Brian Thanks, there is an "Apply()" function (custom, not in .net) that I have been using which forces an IEnumerable to be executed. I will be including this in Dizzy soon.
@Antao Thanks, that looks cool. I’ll have to check it out when I get a minute.
@Lars Yep, I have reflected through all of it, but you have a very nice writeup on it! Kicked!
Well, I meant your readers rather than you 🙂 but thanks!
Very cool stuff, almost feels kind of F#-esque.
Thanks. That made yield clearer. I’m a big fan of delayed execution, or as I call it "pushing it down". 🙂