I like the C# programming language. It feels like C++ done right, divesting itself of much of the C legacy that complicates matters so much. When I do programming, I prefer to use this language. Having to go back and deal with C++ just doesn’t give me that warm feeling like it used to.
But there is one thing I miss from C++, and its absence has become a pet peeve.
Quick recap. Here’s a brief C++ function…
#include <string>

void MyOtherFunction(std::string s); /* Defined elsewhere. */

void MyFunction()
{
    std::string x;
    MyOtherFunction(x); /* Pass by value. */
}
Doesn’t look like much is going on, but there are five function calls in there.
- A constructor function is called to build x.
- A copy constructor function is called to copy x for the function call.
- ‘MyOtherFunction’ is called.
- A destructor function is called to tidy up the copy of x.
- The same destructor function is called to tidy up x itself.
The clever bit is that the compiler has worked out when objects go “out of scope” and inserted calls to each object’s destructor in exactly the right place. Even “anonymous” objects are tidied up. Say a function is called that returns an object, but the caller just ignores the return value. The compiler spots it and inserts the destructor call in just the right place.
C# doesn’t do this. Instead, from the very beginning of the language, unused objects are “garbage collected”. Every so often, some code will run that goes over everything built by the program and sees if it’s being used anywhere. Anything that can’t be traced to running code is removed. Doing it this way allows the programmer to share objects between two different areas of code, without having to worry about which one has responsibility for tidying up.
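To make those calls visible, here is a minimal sketch that swaps string for a made-up type that logs each call. The names Traced, g_log, MakeTraced, RunMyFunction and RunIgnoredReturn are invented for illustration and are not part of any library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log of lifecycle events, in the order they happen.
static std::vector<std::string> g_log;

// Traced is an illustrative stand-in for string that records each call.
struct Traced {
    Traced() { g_log.push_back("construct"); }
    Traced(const Traced&) { g_log.push_back("copy"); }
    ~Traced() { g_log.push_back("destroy"); }
};

// Takes its argument by value, so the caller has to make a copy.
void MyOtherFunction(Traced) { g_log.push_back("call"); }

// Returns an object that the caller is free to ignore.
Traced MakeTraced() { return Traced(); }

void MyFunction() {
    Traced x;            // constructor
    MyOtherFunction(x);  // copy constructor, the call, then the copy's destructor
}                        // destructor for x itself

// Runs MyFunction and returns the events it produced.
std::vector<std::string> RunMyFunction() {
    g_log.clear();
    MyFunction();
    return g_log;
}

// Ignores a returned object; the compiler still destroys the temporary.
std::vector<std::string> RunIgnoredReturn() {
    g_log.clear();
    MakeTraced();  // return value deliberately discarded
    return g_log;
}
```

Calling RunMyFunction() should give the five events in the order listed above: construct, copy, call, destroy, destroy. RunIgnoredReturn() shows the “anonymous” case, where the log starts with a construct and ends with a destroy even though nobody named the object.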
I imagine that when the very clever people at Microsoft designed the C# language, they had already decided to use garbage collection, and so concluded that this messing about with destructors was no longer needed. No need to insert a function call into code, just leave the object lying around and the garbage collector will deal with it.
This would all be great if memory were the only resource we had to keep track of. But there are also open file handles, database connections and so on. All must be closed in a deterministic manner, instead of at some unknown time in the future when memory is about to run out.
Microsoft didn’t leave us completely out on a limb. Classes that need to be tidied up can be written to implement the IDisposable interface, which allows the using block to work.
using (SqlConnection con = new SqlConnection(db))
{
/* Use con. */
} /* Dispose con. */
With the using block, just like with the C++ destructors shown above, the compiler inserts a call to the tidy-up function at the end of the block. Even if there’s a return or throw statement in the middle, it’ll make sure everything is tidied up when the code leaves the using block.
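For comparison, this is roughly how the same guarantee looks on the C++ side, where no block is needed. It is only a sketch: Connection, g_events, UseConnection and RunWithFailure are hypothetical names, and the “connection” does nothing except log when it opens and closes.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log so the order of operations can be inspected.
static std::vector<std::string> g_events;

// Connection is an illustrative resource; its destructor plays the role of Dispose.
class Connection {
public:
    Connection() { g_events.push_back("open"); }
    ~Connection() { g_events.push_back("close"); }
    Connection(const Connection&) = delete;             // one owner only
    Connection& operator=(const Connection&) = delete;
};

void UseConnection(bool fail) {
    Connection con;  // opened here, no surrounding block required
    if (fail) {
        throw std::runtime_error("query failed");  // con is closed during unwinding
    }
    g_events.push_back("work");
}                    // con is closed here on the normal path

// Triggers the failure path and returns what happened, in order.
std::vector<std::string> RunWithFailure() {
    g_events.clear();
    try {
        UseConnection(true);
    } catch (const std::runtime_error&) {
        g_events.push_back("caught");
    }
    return g_events;
}
```

On the failure path the log reads open, close, caught: the destructor has already run by the time the handler sees the exception, which is the same promise the using block makes.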
But why have the using block at all? If you forget to include the using block, the tidy-up code won’t be called (unless you invoke it manually) and you won’t even get a compiler warning. (You don’t get the warning for very good reasons which I won’t go into right now.)
Even when you use using correctly, adding a using block to a function means introducing an additional block, with all the block-visibility issues and additional indenting that implies.
Structs to the rescue
Fortunately, C# and .NET come with a type of object called structs. These are similar to classes except they are solid value types rather than references to data floating in the ether. The practical difference is that when a struct value is copied (such as when passed into a function as a parameter) and you change the value of one of the copies, the other copy stays the same.
In contrast, when you copy a class value, you’re instead just making a copy of the reference, so both point to the same data. Change the contents of one, and the other changes value too, because there is no “other”.
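The difference can be sketched in C++ terms, with a plain struct standing in for a C# struct and a shared_ptr standing in for a C# class reference. That mapping is an analogy of mine, not how the CLR works internally, and Point, CopyByValue and CopyByReference are made-up names.

```cpp
#include <cassert>
#include <memory>

// A small illustrative type with one field.
struct Point {
    int x = 0;
};

// Copying a value: the two variables hold independent data.
int CopyByValue() {
    Point a;
    Point b = a;  // b is a separate copy of a
    b.x = 42;     // only b changes
    return a.x;   // a still holds its original value
}

// Copying a reference: both variables refer to one shared object.
int CopyByReference() {
    auto a = std::make_shared<Point>();
    auto b = a;   // copies the pointer, not the Point
    b->x = 42;    // changes the single shared Point
    return a->x;  // the change is visible through a as well
}
```

CopyByValue() returns 0 and CopyByReference() returns 42, which is the struct versus class behaviour described above.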
So what if a struct, whenever it appears in code, came with an automatic using block attached? That way, we could open files or database connections just by introducing one in code and it would be tidied up in a deterministic way.
To complete the job, we would need mechanisms to support copy constructors and assignment as well as the final destructor call, just like the C++ people are used to.
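For anyone who hasn’t met them, these are the three C++ mechanisms in question. This is a bare sketch under stated assumptions: Handle and g_open_handles are hypothetical, and the “resource” is just a counter so the effect can be observed.

```cpp
#include <cassert>

// Count of currently open (hypothetical) handles.
static int g_open_handles = 0;

class Handle {
public:
    // Constructor: acquire a handle.
    Handle() { ++g_open_handles; }

    // Copy constructor: the copy acquires a handle of its own.
    Handle(const Handle&) { ++g_open_handles; }

    // Assignment: both sides already own a handle, so the count is unchanged.
    // A real type would release its old resource and duplicate the other one.
    Handle& operator=(const Handle&) { return *this; }

    // Destructor: release the handle at a known point in the code.
    ~Handle() { --g_open_handles; }
};

// Returns how many handles are open while three objects are alive.
int OpenHandlesInsideScope() {
    Handle a;
    Handle b = a;  // copy constructor
    Handle c;
    c = a;         // assignment
    return g_open_handles;  // read before a, b and c are destroyed
}
```

OpenHandlesInsideScope() reports three open handles, and once it returns the count is back to zero without the caller doing anything. That pairing of copy, assignment and destruction is what a struct with an automatic using block would need.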
I’ve been nursing this peeve and whining about it for so long that I’m even boring myself. I plan this to be my last word on the topic and in future I’ll just post links to this article. Enjoy.
Automatic boxing conversions from struct to object completely ruin this plan, as far as I can see.
Doug…
Hmmm, I hadn't considered that, but I don't think this completely kills the plan.
The mechanism is only useful when an instance is on the stack. The point of it is defeated should one become a member of a class or be boxed. I would be inclined to prohibit such uses in the compiler.
(Alternatively, defer them to the GC, but I don't like that idea for reasons I can't quite put my finger on as I'm writing this.)
Anyway, thank you for that. I may do an update.
my next need would be nullable ? for all objects:D
You should really look into Managed C++. It allows you to write something like:
void MyFunc()
{
    // SqlConnection is a ref class implemented in C#.
    SqlConnection con(db);
    SqlConnection^ con2 = gcnew SqlConnection(db);
    /* con is disposed of automatically here. */
    /* con2 is not disposed but will be finalized eventually. */
}
Yes, managed C++ allows fully deterministic disposal of managed objects.
I remember reading that they considered adding some form of destructors for structs (called "value class" in managed C++) but found that the CLR relies on the ability to make bit-level copies of structs and play with them.