Closures and lambdas really are a simple concept, but I continue to see definitions that are really confusing. So what is a closure in C#? In this post I’d like to give you a few examples that will clear everything up for you. But first, let’s start with the Wikipedia definition of a closure:
“In computer science, a closure is a first-class function with free variables that are bound in the lexical environment.”
All clear, right? Well, if it is for you, then great… you can stop reading. But if not, and the next time this topic comes up you want to sound like Super Duper Computer Science Guy™ … then keep reading.
First Class Functions – Sir, Your Functions Are First Class
So first, what is a C# “first-class function”? It is simply a function that your language treats as a first-class value. That means you can assign the function to a variable, pass it around, and invoke it… just like you would any normal method. In C# we can create a first-class function using anonymous methods:
Func<string,string> myFunc = delegate(string var1) { return "some value"; };
Or we can do it using a lambda function which is just a shorter syntax:
Func<string,string> myFunc = var1 => "some value";
Both of those are functionally equivalent, and they just create a method that takes a string and returns a string. We can call that method by invoking the variable just like we would any method:
string myVar = myFunc("something");
This means that C# supports first class functions, yay!
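To make the "pass it around" part concrete, here is a minimal sketch. The names `FirstClassDemo`, `ApplyTwice`, and `shout` are made up for this illustration and are not part of any library.

```csharp
using System;

static class FirstClassDemo
{
    // Takes a function as an argument and applies it twice.
    public static string ApplyTwice(Func<string, string> f, string input)
    {
        return f(f(input));
    }
}

class Program
{
    static void Main()
    {
        // A function stored in a variable...
        Func<string, string> shout = s => s + "!";

        // ...and handed to another method like any other value.
        Console.WriteLine(FirstClassDemo.ApplyTwice(shout, "hi")); // prints hi!!
    }
}
```

The function is just a value here: it is created in one place, stored in a variable, and invoked somewhere else entirely.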
Free Variables
And so now we have first-class functions with free variables… And what, pray tell, is a free variable? A free variable is a variable that a function references but that is neither a parameter nor a local variable of that function. Okay, so it might look like this:
var myVar = "this is good";
Func<string,string> myFunc = delegate(string var1) { return var1 + myVar; };
Okay, so the anonymous delegate is referencing a variable that is in its enclosing scope. That variable isn’t a parameter, and it isn’t a local variable. So it is a free variable. So what?
It Has To Close Over It, Son
So, what happens in this case:
static void Main(string[] args)
{
    var inc = GetAFunc();
    Console.WriteLine(inc(5));
    Console.WriteLine(inc(6));
}

public static Func<int,int> GetAFunc()
{
    var myVar = 1;
    Func<int, int> inc = delegate(int var1)
    {
        myVar = myVar + 1;
        return var1 + myVar;
    };
    return inc;
}
Hmmm, stare at that for just a second. When we call “GetAFunc”, we get a method back that increments a local variable of “GetAFunc” itself. You see? “myVar” is a local variable of the enclosing method, but when we return the “inc” delegate, the variable is bound inside of it.
But don’t local variables get created on the stack? Don’t they go away when we finish executing the method? Normally yes. But if we ran this code, this would be the result:
7
9
So, when we passed back the method, the variable now lives along with the method. Outside of its original scope. You see, it got incremented when we called the method twice. Crazy!
But you know what is even more crazy? You just learned what a closure is! You just bound some free variables in the lexical environment! Don’t you just feel super smart now?
You see, it is oh so simple. It is really all about variables getting referenced which might leave scope. So you can say that these delegates are “closed” over these variables, which causes them to live outside of their lexical scope.
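One more small sketch to drive the point home, with made-up names (`CounterDemo`, `MakeCounter`). Each call to the outer method creates a fresh variable, and each returned delegate closes over its own copy:

```csharp
using System;

static class CounterDemo
{
    // Each call creates a fresh 'count'; the returned delegate closes over it.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }
}

class Program
{
    static void Main()
    {
        Func<int> a = CounterDemo.MakeCounter();
        Func<int> b = CounterDemo.MakeCounter();

        Console.WriteLine(a()); // 1
        Console.WriteLine(a()); // 2
        Console.WriteLine(b()); // 1, because b has its own captured 'count'
    }
}
```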
How Does It Work?
Well, the next question I always ask is, how does it work. And normally when people ask me this question I say “very carefully”, but that is only because I am annoying. In reality this is implemented in a more straightforward manner than you might think. What we are really doing here is binding up a method with some data and passing it around. Geez, I sure do wish we had something like that in C#… oh wait, we do, it is called a class!
You see, the C# compiler detects when a delegate captures local variables, and it promotes the delegate’s body and the captured variables into a compiler-generated class. This way, it simply needs a bit of compiler trickery to pass around an instance of that class, so each time we invoke the delegate we are actually calling a method on this class. Once we are no longer holding a reference to the delegate, the instance can be garbage collected and it all works exactly as it is supposed to!
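Here is roughly what that looks like if we write it by hand for the “GetAFunc” example. This is only a sketch: the class name `GetAFuncClosure` and the method name `Inc` are my own inventions, and the real compiler output uses its own generated names and may differ in detail.

```csharp
using System;

// Hand-written stand-in for the compiler-generated class.
// Name and layout are illustrative assumptions.
class GetAFuncClosure
{
    public int myVar;           // the captured local, now a field on a heap object

    public int Inc(int var1)    // the body of the anonymous delegate
    {
        myVar = myVar + 1;
        return var1 + myVar;
    }
}

static class ClosureByHand
{
    public static Func<int, int> GetAFunc()
    {
        var closure = new GetAFuncClosure();
        closure.myVar = 1;
        // The delegate references 'closure', which keeps myVar alive.
        return closure.Inc;
    }
}

class Program
{
    static void Main()
    {
        var inc = ClosureByHand.GetAFunc();
        Console.WriteLine(inc(5)); // 7
        Console.WriteLine(inc(6)); // 9
    }
}
```

The “local variable” never lived on the stack in the first place. It is a field on an ordinary object, and the delegate points at that object.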
Easy Peasy
Now, the C# compiler team would probably spit in my face for simplifying it this much, but at least at the conceptual level it is a fairly straightforward process. My head hurts a little bit thinking of all of the complex interactions and edge cases that would accompany something like this though.
Mr Etheredge, you humble me sir. Beautifully explained, many thanks.
what an interesting concept! actually makes me miss college and all this crazy computer sciencey stuff was still new. thanks for the insight!
Thanks for the writeup.
I’ve got a couple questions, could you expand a bit on what "lexical environment" means?
Also, what are some examples/ideas, in the C# world, where it would make good sense to use a closure?
Excellent! That’s one less thing off of my "things to learn when I have the time" list!
Pretty nice explanation. Thank You.
@Andy H: A "lexical environment" means "scope".
As for where it would make sense, that depends on what you’re doing. 🙂
A cursory grep of some sources I have shows this use of closures:
var data = new int [7];
var result = new int [] { 6, 7, 8, 9, 10, 11, 12 };
int j = 0;
6.UpTo (12).ForEach (i => { data [j++] = i; });
AssertAreSame (result, data);
This is to test an Int32.UpTo(int, int) extension method, and uses an IEnumerable<T>.ForEach(Action<T>) extension method for the testing. Since we can’t provide additional parameters to .ForEach(), we rely on a closure over the ‘data’ and ‘j’ variables, allowing for `data[j++]=i`, where `i` is a parameter to the lambda.
A simpler (but likely sillier) example would be this:
Func<int, int> IncrementBy (int by)
{
return value => value + by;
}
In the above, `by` is captured in a closure, allowing:
Func<int, int> inc1 = IncrementBy (1);
Console.WriteLine (inc1 (2)); // == 3
Func<int, int> inc9 = IncrementBy (9);
Console.WriteLine (inc9 (2)); // == 11
The primary motivation for closures is that they allow for more succinct code, as you need to modify fewer "intermediate" layers to provide additional "contextual" data; consider a Linq-to-SQL query that does name matching:
IQueryable<Person> GetManagersWithName (IQueryable<Person> people, string name)
{
return from p in people
where p.Name == name && // implicit closure over ‘name’
p.PersonType == PersonType.Manager
select p;
}
Without closures, we’d either need to copy&paste the entire body into each code site (yay maintenance! so much for DRY!) or complicate the calling code. (There are times in C++ where, since I couldn’t use a closure, I had to add additional member variables to a class and change the argument lists of multiple methods so that the contextual data I needed was "passed down" through all the intermediate methods. It’s NOT FUN.)
Very wonderfully and simply explained! Love it! I’m going to forward to the folks in our C# teams where I work as there’s a lot of folks who scratch their heads at closure and I think this may be just the thing to help!
Excellent. Couldn’t have said it myself better 🙂
What an elegant description!
Thank you both Justin and Jonathan for the wonderful tips.
Excellent writeup. Like your style.
I read a lot of programming blogs/websites. This is one of the best posts on anything I’ve ever read. Heading to the other parts of the site now…
Here’s a nice use for closures I found a few days ago: resource cleanup at exit.
var excel = new Excel.Application() { Visible = false };
AppDomain.CurrentDomain.ProcessExit += (o, args) => excel.Quit();
This code ensures that the Excel application is stopped as this process exits.
Closures were added in version C# 2. It was VB that didn’t get them until .NET 3.
http://joe.truemesh.com/blog/000390.html
Another example for using closures is to do Undo/Redo operations.
For example in an Editor, you can have a function DeleteLines() that would delete all lines in a given range, but the function would return to you a closure, that you can call later to restore the state. Same for InsertLines() – it would return you back a function that would restore the state.
All you have to do then, is just keep a list of function pointers (no arguments, no extra data) – everything else is "hidden".
It’s not the most efficient way for undo/redo, but this gives you some prototyping powers, where the performance/memory are not that important, but feature completeness is.
Here is an example in Lua: (Sorry, I don’t have much experience in C#)
http://gist.github.com/428272
Also look in google for "undo using closures"
Obviously there are many many other good examples.
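For readers who would rather see the undo idea in C#, here is a minimal sketch. It is not a port of the linked Lua code; `UndoDemo`, `DeleteLines`, and the list-of-strings "editor" are assumptions made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

static class UndoDemo
{
    // Removes 'count' lines starting at 'start' and returns a closure that restores them.
    public static Action DeleteLines(List<string> lines, int start, int count)
    {
        List<string> removed = lines.GetRange(start, count); // copy of what is about to be deleted
        lines.RemoveRange(start, count);
        // The closure captures 'lines', 'start' and 'removed'; the caller sees none of them.
        return () => lines.InsertRange(start, removed);
    }
}

class Program
{
    static void Main()
    {
        var lines = new List<string> { "one", "two", "three", "four" };
        var undoStack = new Stack<Action>();

        undoStack.Push(UndoDemo.DeleteLines(lines, 1, 2));
        Console.WriteLine(string.Join(",", lines)); // one,four

        undoStack.Pop()();                          // run the undo closure
        Console.WriteLine(string.Join(",", lines)); // one,two,three,four
    }
}
```

The undo stack only holds parameterless `Action` delegates, and everything needed to restore the state travels inside each closure.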
@Jonathan Thanks, I’m aware. When I said functional programming goodness I was referring to all of the additional functionality we got with C# 3.0, which makes closures much more commonly used. I don’t know about you, but I never really saw people using anonymous methods very often. 🙂
But anyways, you are correct, anonymous methods were added in C# 2.0 and you can create closures with them.
@Dimiter Excellent example, thanks!
Please try the following two examples; the behavior of "closing over" the free variable is inconsistent:
============================================
class Program
{
static void Main(string[] args)
{
var multiple = 2;
Func<int, int> fun_double = GetAFunc(multiple);
multiple = 3;
Func<int, int> fun_triple = GetAFunc(multiple);
Console.WriteLine(fun_double(5));
Console.WriteLine(fun_triple(5));
return;
}
public static Func<int,int> GetAFunc(int multiple)
{
Func<int, int> functor = delegate(int var1)
{
return var1 * multiple;
};
return functor;
}
}
===============================================
class Program
{
static void Main(string[] args)
{
var multiple = 2;
Func<int, int> fun_double = delegate(int var1)
{
return var1 * multiple;
};
multiple = 3;
Func<int, int> fun_triple = delegate(int var1)
{
return var1 * multiple;
};
Console.WriteLine(fun_double(5));
Console.WriteLine(fun_triple(5));
return;
}
}
@abc I wouldn’t say that is inconsistent. In the first example you are closing over the parameter to the method. So you are creating two closures which are binding to two different instances of a value type. So they are getting "bound" separately. I’m pretty sure that if you were using a mutable reference type, you would get the behavior you are expecting, but I’m not sure I’d say it is inconsistent. It is leveraging the model that .net has always used.
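The difference between the two programs above can be boiled down to the sketch below (the names `CaptureDemo`, `MultiplyBy`, and `SharedVariable` are hypothetical). A closure captures the variable itself, not a snapshot of its value: a method parameter is a new variable on every call, while two delegates created in the same method share that method's local.

```csharp
using System;

static class CaptureDemo
{
    // 'multiple' is a new variable on each call, so each delegate gets its own.
    public static Func<int, int> MultiplyBy(int multiple)
    {
        return x => x * multiple;
    }

    // Both delegates close over the same local and see its latest value when invoked.
    public static int[] SharedVariable()
    {
        int multiple = 2;
        Func<int, int> first = x => x * multiple;
        multiple = 3;
        Func<int, int> second = x => x * multiple;
        return new[] { first(5), second(5) };
    }
}

class Program
{
    static void Main()
    {
        int m = 2;
        var dbl = CaptureDemo.MultiplyBy(m);
        m = 3;                                  // does not affect dbl's captured parameter
        var tpl = CaptureDemo.MultiplyBy(m);
        Console.WriteLine(dbl(5));              // 10
        Console.WriteLine(tpl(5));              // 15

        int[] shared = CaptureDemo.SharedVariable();
        Console.WriteLine(shared[0]);           // 15
        Console.WriteLine(shared[1]);           // 15
    }
}
```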
This is the first explanation of closures I’ve read that actually make sense 🙂
Thank you so much, I am, for the first time, confident that I get the concept (bow)
Before I understood it, I had this:
int n = 1;
var f = new Func<int, int>(a => a += (n += 1));
Console.WriteLine(f(5));
Console.WriteLine(f(6));
which does the same thing 🙂
This is the best explanation of this I’ve ever read
@Justin, @abc is *correct*. From a functional programmer’s point of view C# has actually done closures *wrong*. A closure is properly created using the value bound in the lexical environment – yet the value that C# binds in the lambda’s scope is in fact a reference to the value.
In fairness, doing such a thing is not strictly *incorrect*, but it has been tried in various contexts and found wanting. The most obvious one being the original semantics for a Smalltalk block expression. What C# has created is a language that is lexically scoped, *except* for lambdas which are dynamically scoped. Consistent lexical scoping is actually preferable, as you can always arrange some way to communicate changes, whereas freezing the data in time turns out to be what is wanted about 85% of the time.
@thay I don’t understand why ‘lambdas are dynamically scoped’. In fact I don’t think they are. How can we temporarily redefine a variable that’s been closed over?
http://en.wikipedia.org/wiki/Scope_(programming)#Lexical_versus_dynamic_scoping
Isn’t the reason this has confused the functional programmers because C# isn’t a functional language? In functional languages we expect no side effects, and variables are usually immutable, so we wouldn’t expect to be able to give ‘multiple’ a different value in @abc’s example. Since we CAN do this in C# this seems a logical behaviour to me. Whether it’s actually useful or even appropriate in an OO language is another question.
just wanted so say: WOW, beautifully explained 🙂
Thanks man! keep it up! 🙂
thanks man…this tutorial really helped me too 🙂
I have to thank you from keeping me from reading through thousands of confusing explanations of something so simple. I usually like to remember my explanations in a simple way like this, it’s just getting there that is the problem. You deserve a freaking medal!
beautiful …. man… you rock so much….
best explanation on closure i have ever seen.
Thanks, great explanation!
Great article with simple to understand facts 🙂 thanks for sharing the knowledge in an understandable manner…
Great, easy to understand. Thank you.
btw, you’re the first thing that google returns from “C# Closure”.
So needed this. Thanks!
I saw a question on StackOverflow about closures and had no idea what they were. I googled it and found this beautiful and simple explanation. Thank you!
Thanks @Justin, great explanation.
In Visual Studio 2012, the output of your program is 7 and 8. Looks like they have changed something with the closures in .NET 4.5
Thanks.
Now I worry that the GC can never collect the local variables used in a closure. Is this a kind of memory leak? Maybe we should avoid using external local/instance variables in a closure?
For instance, I often use a delegate as a callback function. Because of the fear above, I always try to make the callback function static, so that the class which holds the callback function can be GCed.
Does that make sense?
No, that doesn’t make sense because any local variables used in a closure are promoted to a compiler created anonymous class that contains the method and the data. Once all references to this object are gone, this object will be collected just like any other object.
Thanks for the reply, Justin. In theory I guess you are right. I am thinking about one scenario: I pass a callback delegate as state to a static method that does an async HTTP post. When the async response read finishes, it gets the callback from the state and fires it.
Since it is async I/O, the main thread has finished, so those resources should be considered collectable from the main thread’s point of view.
I don’t know when the I/O will complete, or in case of a network error, it might never complete. So if the callback is an instance method or refers to any local/instance variable, those resources still cannot be GCed.
Thanks!
Very good explanation
Thanks
So, why are the two snippets below different?
List<Action> l = new List<Action>();
for (int i = 0; i < …; i++)
{
int value = i;
l.Add(() => { Console.WriteLine(value); });
}
foreach (var item in l)
{
item();
}
—————————————————————–
List<Action> l = new List<Action>();
for (int i = 0; i < …; i++)
{
l.Add(() => { Console.WriteLine(i); });
}
foreach (var item in l)
{
item();
}
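A sketch of what is most likely going on in the question above, using hypothetical names (`LoopCaptureDemo`, `CaptureCopy`, `CaptureLoopVariable`) and a loop bound of 3 chosen only for the demo. A `for` loop has a single `i` shared by every iteration, so all the lambdas that capture `i` see its final value once the loop has finished. A variable declared inside the loop body is a new variable on each iteration.

```csharp
using System;
using System.Collections.Generic;

static class LoopCaptureDemo
{
    // Each iteration declares a fresh 'value', so each lambda captures its own variable.
    public static List<Func<int>> CaptureCopy(int n)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < n; i++)
        {
            int value = i;
            funcs.Add(() => value);
        }
        return funcs;
    }

    // The 'for' loop has one 'i' for all iterations, so every lambda shares it.
    public static List<Func<int>> CaptureLoopVariable(int n)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < n; i++)
        {
            funcs.Add(() => i);
        }
        return funcs;
    }
}

class Program
{
    static void Main()
    {
        foreach (var f in LoopCaptureDemo.CaptureCopy(3))
            Console.Write(f() + " ");           // 0 1 2
        Console.WriteLine();

        foreach (var f in LoopCaptureDemo.CaptureLoopVariable(3))
            Console.Write(f() + " ");           // 3 3 3
        Console.WriteLine();
    }
}
```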
Very well explained, Justin. Even though the article was posted years ago, it is still actively read by so many users. Thanks.
Shortest accumulator implementation. Do I win?
private static Func<int, int> accumulate(int n)
{
return i => n += i;
}
“But don’t local variables get created on the stack?”
Not in static methods, right? Because they’re static.
Very nice article. I was used to closures in JavaScript but it’s nice to see the implementation in C#.
Thanks!
Great job Justin. Informative and laugh-ative :))).
Very good explanation.
Very nice. One of the best simplification of a mind twisting topic I’ve seen. Kudos!
Does this mean that if we pass a method of a class as a callback, the class instance will not be garbage collected?
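A small sketch that bears on this question (the `Worker` and `CallbackDemo` names are invented for the example). A delegate created from an instance method stores a reference to that instance in its `Target` property, so while the delegate is reachable the instance is reachable too. A delegate created from a static method has a null `Target`.

```csharp
using System;

class Worker
{
    public void OnDone() { }
}

static class CallbackDemo
{
    // True when the delegate's Target is the instance whose method was passed.
    public static bool HoldsInstance(Worker w)
    {
        Action callback = w.OnDone;     // delegate bound to an instance method
        return object.ReferenceEquals(callback.Target, w);
    }

    // True when a delegate bound to a static method carries no instance.
    public static bool StaticHasNoTarget()
    {
        Action callback = StaticHandler;
        return callback.Target == null;
    }

    static void StaticHandler() { }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(CallbackDemo.HoldsInstance(new Worker())); // True
        Console.WriteLine(CallbackDemo.StaticHasNoTarget());         // True
    }
}
```

So whoever holds the callback also keeps the instance alive, and once the callback itself is dropped the instance becomes collectable like any other object.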
Very good explanation! You should write a book!
Thanks – this got me headed in the right direction when I needed to write an event handler which invokes a method. But the use of Func over-complicates the explanation IMHO. If you do use Func you need to explain that the last type parameter is the return type and if the return type is void, use Action instead.
But I think you also need to explain what Func is doing:
delegate string MyFuncType(string s);
…
string enclosed = "My Enclosure";
MyFuncType MyFunc = delegate(string s) {return enclosed;};
Which also makes it clear that you can re-use the “type” definition. Admittedly the fact that the “type” cannot be declared in local scope or defined anonymously in the variable declaration is an ugly kludge in C# syntax – but so are Func and Action! Why can’t we just write:
var MyFunc = new delegate string(string) {return enclosed;};
Beautifully explained
Beautiful explanation, really well done.
Hmm… could it be clearer by not reusing “inc”?
If I understand this correctly:
static void Main(string[] args)
{
var inc = GetAFunc();
this “inc” is NOT the same as this inc:
Func<int, int> inc = delegate(int var1)
The first is a variable that holds a function the second is a function. Perhaps the first should be:
var incFuncReference = GetAFunc();
Maybe not the best name but it should be a unique name. I always believe that when giving an example ALL unique entities should have unique names. Even if the context makes them different that contextual difference should be driven home with the entity name.
Excellent, very well explained, thanks 🙂
Definitively the world needs more people like you!
Great article, son