Just to be clear from the very beginning: This is not going to be a Torvalds-ish rant against C++ from the point of view of a die-hard C programmer.
I've been using C++ my whole professional career and it's still my language of choice for most projects.
Naturally, when I started the ZeroMQ project back in 2007, I opted for C++. The main reasons were:
- A library of data structures and algorithms (STL) is part of the language. With C I would have to either depend on a 3rd party library or write basic algorithms of my own in 1970s fashion.
- C++ enforces some basic consistency in the coding style. For example, the implicit 'this' parameter prevents passing the pointer to the object being worked on via several disparate mechanisms, as often happens in C projects. The same applies to explicit marking of member variables as private and many other features of the language.
- This point is actually a subset of the previous one, but it's worth an explicit mention: Implementing virtual functions in C is pretty complex and tends to be slightly different for each class, which makes understanding and managing the code a pain.
- And finally: Everybody loves destructors being invoked automatically at the end of the block.
Now, almost 5 years later, I would like to publicly admit that using C++ was a poor choice and explain why I believe it is so.
First, it's important to take into account that ZeroMQ was intended to be a piece of infrastructure with continuous uptime. It should never fail and never exhibit undefined behaviour. Thus, the error handling was of utmost importance. It had to be very explicit and unforgiving.
C++ exceptions just didn't fill the bill. They are great for guaranteeing that a program doesn't fail — just wrap the main function in a try/catch block and you can handle all the errors in a single place.
However, what's great for avoiding straightforward failures becomes a nightmare when your goal is to guarantee that no undefined behaviour happens. The decoupling between raising an exception and handling it, which makes avoiding failures so easy in C++, makes it virtually impossible to guarantee that the program never runs into undefined behaviour.
With C, raising an error and handling it are tightly coupled and reside at the same place in the source code. This makes it easy to understand what happens when an error occurs:
int rc = fx ();
if (rc != 0)
handle_error ();
With C++ you just throw the error. What happens then is not at all obvious:
int rc = fx ();
if (rc != 0)
throw std::exception ();
The problem with that is that you have no idea who is going to handle the exception, or where. As long as the handling code is in the same function the error handling often remains more or less sane, although not very readable:
try {
...
int rc = fx ();
if (rc != 0)
throw std::runtime_error ("Error!");
...
}
catch (std::exception &e) {
handle_exception ();
}
However, consider what happens when there are two different errors thrown in the same function:
class my_exception1 {};
class my_exception2 {};
try {
...
if (condition1)
throw my_exception1 ();
...
if (condition2)
throw my_exception2 ();
...
}
catch (my_exception1 &e) {
handle_exception1 ();
}
catch (my_exception2 &e) {
handle_exception2 ();
}
Compare that to its C equivalent:
...
if (condition1)
handle_exception1 ();
...
if (condition2)
handle_exception2 ();
...
It's far more readable and — as a bonus — the compiler is likely to produce more efficient code.
However, it doesn't end there. Consider the case when the exception is not handled in the function that raises it. In such a case the handling of the error can happen anywhere, depending on where the function is called from.
While the possibility of handling exceptions differently in different contexts may seem appealing at first sight, it quickly turns into a nightmare.
As you fix individual bugs you'll find out that you are replicating almost the same error handling code in many places. Adding a new function call to the code introduces the possibility that different types of exceptions will bubble up to the calling function, where they are not yet properly handled. Which means new bugs.
If you don't give up on the "no undefined behaviour" principle, you'll have to introduce new exception types all the time to distinguish between different failure modes. However, adding a new exception type means that it can bubble up to different places. Pieces of code have to be added to all those places, otherwise you end up with undefined behaviour.
At this point you may be screaming: That's what exception specifications are for!
Well, the problem is that exception specifications are just a tool to handle the exponential growth of exception handling code in a more systematic manner; they don't solve the problem itself. It can even be said they make it worse, as now you have to write code for the new exception types, new exception handling code *and* new exception specifications.
Taking the problems described above into account, I decided to use C++ minus exceptions. That's exactly what ZeroMQ and Crossroads I/O look like today.
Unfortunately, the problems don't end there…
Consider what happens when initialisation of an object can fail. Constructors have no return values, so failure can be reported only by throwing an exception. However, I've decided not to use exceptions. So we have to go for something like this:
class foo
{
public:
foo ();
int init ();
...
};
When you create an instance of the class, the constructor is called (which cannot fail) and then you explicitly call the init function (which can fail).
This is more complex than what you would do with C:
struct foo
{
...
};
int foo_init (struct foo *self);
However, the really bad thing about the C++ version of the code is what happens when developers put some actual code into the constructor instead of systematically keeping the constructors empty.
If that's the case, a special new object state comes into being: the 'semi-initialised' state, when the object has been constructed but the init function hasn't been called yet. The object (and specifically the destructor) should be modified in such a way as to decently handle the new state. Which in the end means adding a new condition to every method.
Now you say: But that's just a consequence of your artificial restriction of not using exceptions! If an exception is thrown in a constructor, the C++ runtime cleans up the object as appropriate and there is no 'semi-initialised' state whatsoever!
Fair enough. However, it's beside the point. If you start using exceptions you have to handle all the exception-related complexity as described in the beginning. And that is not a reasonable option for an infrastructure component with the need to be very robust in the face of failures.
Moreover, even if initialisation wasn't a problem, termination definitely is. You can't really throw exceptions in a destructor. Not because of some self-imposed artificial restriction, but because if the destructor is invoked in the process of unwinding the stack and it happens to throw an exception, it crashes the entire process.
Thus, if termination can fail, you need two separate functions to handle it:
class foo
{
public:
...
int term ();
~foo ();
};
Now we are back to the problem we've had with the initialisation: There's a new 'semi-terminated' state that we have to handle somehow, add new conditions to individual member functions etc.
class foo
{
public:
foo () : state (semi_initialised)
{
...
}
int init ()
{
if (state != semi_initialised)
handle_state_error ();
...
state = initialised;
}
int term ()
{
if (state != initialised)
handle_state_error ();
...
state = semi_terminated;
}
~foo ()
{
if (state != semi_terminated)
handle_state_error ();
...
}
int bar ()
{
if (state != initialised)
handle_state_error ();
...
}
};
Compare the above to the C implementation. There are only two states. An uninitialised object/memory, where all bets are off and the structure can contain random data. And an initialised state, where the object is fully functional. Thus, there's no need to incorporate a state machine into the object:
struct foo
{
...
};
int foo_init (struct foo *self)
{
...
}
int foo_term (struct foo *self)
{
...
}
int foo_bar (struct foo *self)
{
...
}
Now consider what happens when you add inheritance to the mix. C++ allows initialising base classes as part of the derived class's constructor. Throwing an exception will destruct the parts of the object that were already successfully initialised:
class foo : public bar
{
public:
foo () : bar () {}
...
};
However, once you introduce separate init functions, the number of states starts to grow. In addition to uninitialised, semi-initialised, initialised and semi-terminated states you encounter combinations of the states. As an example you can imagine a fully initialised base class with semi-initialised derived class.
With objects like these it's almost impossible to guarantee predictable behaviour. There are a lot of different combinations of semi-initialised and semi-terminated parts of the object, and given that the failures that cause them are often very rare, most of the related code probably goes into production untested.
To summarise the above, I believe that the requirement for fully-defined behaviour breaks the object-oriented programming model. The reasoning is not specific to C++. It applies to any object-oriented language with constructors and destructors.
Consequently, it seems that object-oriented languages are better suited for environments where the need for rapid development beats the requirement for no undefined behaviour.
There's no silver bullet here. Systems programming will have to live on with C.
By the way, I've started experimenting with translating ZeroMQ into C lately. The code looks great!
EDIT: The endeavour evolved into a new project called nanomsg in the meantime. Check it here.
Read part II of the article.
Martin Sústrik, May 10th, 2012
Martin, in your new C implementation, how are you addressing point #1? Did you find a nice library or are you rolling your own data structures and utility classes?
I've just started to play with it, so there was no need for sophisticated algorithms so far. In any case, in the original ZeroMQ codebase a lot of STL functionality was explicitly re-implemented to get better performance. What remains are pieces out of the critical path — I'll still have to decide how to handle those. Any suggestions are welcome.
I have a similar history with C++. Learned it as my 2nd major programming language and after several years of using it, I got tired of the same kinds of issues you bring up.
For std::map, I'd recommend GNU libavl. I'd link it but your blog commenting system doesn't trust me to not be a link spammer. The main site for libavl links to the tarball for v2.0.2a, which isn't actually the most recent release. Check the ftp_gnu_org* libavl file index, which has v2.0.3.
For std::list/std::vector/std::set, implementing them should be fairly easy in C. And for std::sort, there's always qsort(). What STL features do you find yourself needing most?
* I hate blog comment systems.
We have a handful of C projects, and so we factored out our data structure and utility classes into a standalone library. I've tried to make it so that it's easily embeddable into other codebases, so you wouldn't have to link to the library at runtime. All of the configuration choices are handled by the C preprocessor, so it should (in theory) be easy to incorporate into whatever build scripts you're using. BSD licensed.
It also has some error handling code that's loosely based on standard POSIX int error codes and glib's GError API.
Fake URL (since I can't link in the comment) is github redjack libcork. Documentation is linked to from there.
Hi Steven, I hate the commenting systems as well. This one doesn't allow me to reply to your post, for example :| Anyway, I'll check out the library you mention.
Hi Douglas, thanks for the link. I'll check it out!
For list and tree algorithms you can take a look at the BSD macros for queues and trees as included in libbsd (a project on freedesktop). They are used throughout BSD projects, so there's also lots of code to examine for their use.
Thanks!
"For std::list/std::vector/std::set, implementing them should be fairly easy in C"
Not really, std::set is much more similar to std::map in complexity when it comes to implementation.
"And for std::sort, there's always qsort()"
qsort is infamous for being actually slower than std::sort.
What about Apple Core Foundation?
Martin, would you consider releasing the implementations as a separate utility library?
If you think C++ is a bad choice for ZeroMQ, why don't you rewrite it using C?
I did: nanomsg.org
Reading this comment on a three year-old post and seeing you reply to it within 13min is kind of funny.
I agree that half-initialized objects are bad. I think in Java world, coders tend to accomplish this more safely using the builder pattern, which is why we get all these boilerplate FactoryFactoryFactory classes. Do C++ programmers sometimes use builder patterns? Can an entire class be marked as private/internal?
In our project (a small game) we constantly use the builder pattern to fight half-initialized objects. Also you can use the less boilerplate-heavy static constructor pattern:
class FooError { /* code, description, etc. */ };

template <typename T>
using FooResult = std::tuple<T, FooError>;

class Foo {
public:
  static FooResult<std::unique_ptr<Foo>> Create (params) {
    auto *foo = new (std::nothrow) Foo (params);
    if (foo == nullptr)
      return {nullptr, NoMemory};
    if (foo->LastError ().IsFailed ()) {
      FooError err = foo->LastError ();
      delete foo;  // don't leak the failed instance
      return {nullptr, err};
    }
    return {std::unique_ptr<Foo> (foo), FooError ()};
  }
private:
  Foo (params) noexcept;
  FooError LastError () const noexcept {
    // Gets error code which was set via constructor.
  }
  …
Unfortunately, problems with error handling in destructor are still here.
static inline constexpr FooError NoMemory {ENOMEM, "Can't allocate memory for Foo instance"};
And do not forget to free Foo's memory in case of failure, too.
Do not write comments when falling asleep like me.
Interesting read. I am no expert but Go's error handling seems to address some of your issues with exceptions pretty well. Go is no C regarding portability and performance yet but it sure seems to be heading in the right direction in terms of language features.
hey, yeah, i totally signed up to comment just to say you'd love Go. make sure to check out Go's defer keyword, in addition to how it handles errors (no exceptions, but multiple return values, etc).
I'll give it a look. However it's not a viable option as the long-term goal with Crossroads is to move the functionality to the kernel space.
+1
I'm also Go-fan too (though I wouldn't recommend it for 0mq).
In fact Go's attitude to error handling is close-to-perfect:
The last point means you use exceptions when you detect actual bugs (rather like assert() in C), or inside a single module, where you know exactly what exceptions can occur and you catch them before they bubble out.
You can actually do all of this in C++, but the nice thing is that Go doesn't need constructors or destructors - except in the same sense that C has them.
Late to the party, as always, but since I can still add my .02, I will. And I don't insist on these points, I believe that if you fall in love with a language and it meets your needs, go forth and multiply. But to Martin's objections I would raise these points:
C++ Exceptions: Nothing is stopping you from using return codes in your app/system. You can in fact turn exceptions completely off with a command-line option on some compilers, and simply not use them in the rest.
Object Initializers: Use them where they make sense, and just use structs instead of objects for the rest of the cases and where the compiler-supplied constructors are not useful. The STL is pretty damn useful, I don't see the point of doing without it for the points you raised here. They are valid, but C++ is a buffet table: use what you want, ignore the stuff you don't.
And to Adrian above: "…This is the C way, but with a better standard library." Fine, but you don't explain how Go's "standard library" is better than C's. And by standard library, do you in fact mean C's standard runtime library? Yes, I can agree with you that some of C's "standard" runtime libs are lacking, but you're not limited to using them. I do embedded programming and using GNU libc is usually not optimal in most cases, I often use eglibc, and if I think C++ is a better fit for a problem I'll use uSTL.
Go is a good language, a real contender as an all-around replacement for C in general for a compiled solution. But in the specific case of embedded programming, most of the companies I deal with will probably not even look at it for a long time, if ever. Aside from the support issues, it's still slower than C.
I know this is a very late comment but for what it's worth:
Another alternative is Rust, being developed by Mozilla. Its focus is on being very fast and very safe. The first stable version is due to be released within the next few months, and from the little bit I've played around with it, it looks promising as an alternative to C that can be used to build system programs that need to be robust.
You can still return a structure in C if you want more detailed errors. For instance, when parsing a file it is useful to have a line number attached to the error message:
struct foo { char *error; int line; };
Granted, the syntax is not as pretty as Go's, but it is not that bad. I use this pattern in my code here and there and it works well while still keeping the code readable.
Hello Martin,
I think some issues with exceptions could be addressed by a smart hierarchy of exception classes.
And, of course, 'catch (…)' does not leave room for undefined behaviour.
What I agree with you about:
C++ is complex, exceptions are roots of many problems.
Exception specifications are good for nothing.
What I do not agree with you about:
Your article is written in 2012, not in 2002. C++'s issues have been discovered and solutions worked out.
Fast things like templates and inlining, and powerful C++ libraries like Boost, do not allow going back to plain C.
Forget it.
Move to C++11, not to plain C.
Hi Ilya! Nice to hear from you again. I must admit I haven't read the C++11 spec in detail, but AFAIK there are no significant changes to the error handling.
Handle errors however you want, C-like if necessary. Everything should be for one's benefit, not for respecting the C or C++ book. What shocks me is that you treat C and C++ as distinct, when they aren't. Having C++ means having choices; that much should have been obvious.
The article advocates deliberately restricting the choices to get simpler and more manageable codebase.
That's what Java did.
Interesting, Java is arguably the most diluted language ever. Not that Microsoft had anything to do with that, but still…
Like language advocacy there, douche! C++ is cancer, the judge has already passed sentence. Anyone still sane has vowed never again to use it..
I have to agree with others that C++ is still a better bet, and to touch on Ilja's remark:
Move to C++14, not to plain C. C++ is an extension of C -> you can use any part of C in C++ -> including C syntax. In fact, most beginner C++ courses don't even teach you any C++ at all - it's all C syntax run under a C++ compiler. So in essence it doesn't even make sense to say C would have been better than C++….
The true trick to any lower level programming language is architecture. How you design your application will make it or break it -> regardless of the actual language used.
I'm disappointed seeing all the comments about replacement languages for C; why aren't people working on replacing POSIX APIs with C++ methods over C methods? Why isn't anyone working on replacing C altogether with C++ rather than just letting it be an extension? You know…making a true C++ runtime in Assembly instead of requiring a C runtime over Assembly? Bet ya that'd be much faster than Go and we wouldn't have to learn any new languages or APIs (well, for the most part - syntax would certainly change a bit for some system methods).
Donate me 100k and I'll do it myself - just give me a few years…
The father of C++ isn't dead yet, it's only getting better and better and better and better and better.
You just need to take the time to work it out. It takes YEARS to LEARN programming, not days -> if you are looking for a single-day answer switch to .NET or some other auto-generated nonsense and let the quality speak for itself (but I'm sure you know this already ^^).
The article is talking about idiomatic C++ vs. idiomatic C.
If it was talking about C vs. C++ it would make no sense. C is (more or less) a subset of C++.
As for C++11, C++14 etc., by growing the feature set they're making the problem even more grave. Idiomatic C++ today is definitely a bigger mess than it used to be in 1995. Soon enough it will be as messy as Java :(
I like your reasoning about exception handling and crippled initializations. May I ask though, why do you say Java is messy?
The article is talking about idiomatic C++ vs idiomatic C. As for C++11 and C++14 etc., by growing the feature set there are new, better ways to do things in C++ than there were before. This means that growing the feature set will change what is considered idiomatic in C++.
I don't really buy your reasoning. To begin with, exceptions don't cause undefined behaviour by themselves. That only happens if you invoke undefined behaviour in your destructors, which is a very bad thing to do.
If you mean that they cause behaviour that is difficult to reason about, that's a very different thing, but also not really worse than the C version. You can, after all, catch all the exceptions in a block with a single statement.
The case of failing destructors is also curious. What are you doing in a destructor that could possibly fail? If you can't delete your object, then what course of action is there to take other than to crash? If your destructor is doing things like sending messages over pipes or whatever that might fail, handle that failure inside the destructor itself. This is also not solved or even mitigated by C; if anything, it's worse in C because it forces you to expose the error handling in your deinitialization.
On a general note, if you have to choose between a total crash and burn, or just ignoring the failure and causing a memory leak, handle leak or unreleased resources… then yeah, I would rather keep going than crash. It's a choice of two evils, and one of them can still get you home or keep you alive.
Pretend your program is a lifeform; a total crash, like an epileptic fit, is the worst choice.
The problem is that when something unexpected happens you don't know whether it will cause memory leak or format your hard disk. That's the very nature of "unexpected".
Thus, you have to clean up the mess and try again.
However, both the programmer and the stack unwind algorithm are notoriously bad at cleaning up after unexpected failure.
The OS cleanup, on the other hand, while not perfect, tends to clean up nicely when a process is aborted and restarted.
The OS does just fine as long as you don't create files. I seem to remember some fun with 0mq, unix domain sockets, and remove(). And that didn't even involve error cases.
Grr.
Yep. There are shortcomings. The design of unix domain sockets is broken. There's no way to close the socket and unlink the associated file in a single atomic step. All kinds of race conditions, hang-ups etc. arise.
If you are using Linux, use sockets in the anonymous namespace and you won't have these problems.
Unfortunately, ZeroMQ is a multi-platform app, so you have to go with what POSIX offers.
I don’t think the problem is really C++ – or even exceptions – here. It’s the wrong use of exceptions. Eric Lippert has a nice post about the correct and erroneous use of exceptions (which I cannot link to, but which can be found by googling for “vexing exceptions”). But even his analysis contains a flaw (i.e. C++ stream classes show how to circumvent those “vexing” exceptions).
Fundamentally, you show that handling proximal failure isn’t best solved using reflections. This is true, and all very well. Exceptions still might make sense to put error-handling on a different layer.
Regarding constructors, this might still be a good place to use exceptions, or a pattern as employed by the C++ file streams (i.e. have the object indicate whether it’s in a valid state) or (my favourite) use a facultatively initialised object (maybe<foo>, or a smart pointer).
You argue that C makes initialisation easier but that’s not true, even if you end up using the init function (ugh!) – just encapsulate constructor call + init into a free function which returns either a valid pointer to an object or a null pointer (which should of course be a smart pointer). In fact, how would your C code using foo_init indicate failure?
Which brings us to destructors. Those are indeed a sore spot when it comes to failure handling. But once again, this problem exists in C, too – it’s just hidden by the lack of exceptions.
Of course the ZeroMQ project is enormous and complex but with all due respect I still think your analysis of the difficulties is wrong, not fundamentally caused by the use of exceptions nor by C++’ use of classes, and I doubt that using C offers a better solution.
It appears to me that you do not understand exceptions. Not that I'm going to be nasty about this, but that's my fundamental opinion.
"The problem with that is that you have no idea of who and where is going to handle the exception."
That's the *point*. The *point* of exceptions is that you can choose the appropriate point to handle the error. The simple fact is that return codes are almost never *not* propagated. Consider a file opening function. If the file does not exist, what are you gonna do? You can't handle the error there. Only the caller knows if it's OK for that file to not exist. User settings? Default them. Critical resource? Program failure. So you have to propagate it. In fact, it was probably already propagated to that function from an OS function. This happens all the time with return codes, and exceptions are no different, except the compiler propagates it for you.
The key about RAII is that it should not matter to you where the exception is caught. If you throw an exception, it's not your problem where it's caught. It's the caller's problem to handle that. Just clean up and forget about it. And if you're super-worried about new exception types, then you can use catch(…).
Having exceptions be well-defined is easy. It's easier than the DRY-violation-o-licious return codes.
DeadMG, I believe you exaggerate Martin's ignorance ;)
Martin, I am sorry to ask, but didn't you write the text on April 1 ?
Nope. We've implemented RFC3514 in linux kernel this year's April the 1st :)
I started writing up a similar response. The article suggests that catching exceptions within the function that threw them is the norm. Occasionally it is done, but the real benefits of exceptions is that a relatively simple syntax lets the stack get efficiently unwound to the place in the code where a problem can be handled. Deciding where in your call hierarchy to handle an error condition is a challenge in any language, but it's easier and cleaner to implement in a language that supports exceptions.
Regarding destructors: The caller of a function that throws an exception doesn't care if secondary resource-release failures occur in the context of an exception. Masking (and logging) exceptions within destructors is a valid approach. If you encounter an error upon releasing resources, you certainly have bigger problems that will soon rear their ugly heads. If you really want a code path that specifically handles exceptional behavior on resource release, you can add an explicit exception-throwing method for releasing resources. This tends to complicate things by adding state (isClosed), and is rarely of much benefit.
Hmm, maybe I'll make a fork of 0MQ using exceptions! :-)
That would be a nice experiment!
However, I think most of the posters above have misunderstood what I was saying.
The argument was that decoupling the raising of an error from its handling creates an enormous amount of undefined or semi-defined — and in practice mostly untested — states.
A simple change to the program, like adding a single method invocation, can change the global picture of error flows: which error can occur where, etc.
In the long run such a system is unmanageable, because any change alters the behaviour of almost any other part of the system and thus requires a full round of testing.
Exceptions are of course great when you don't care too much about what happens if error occurs. You handle most common cases and handle all the remaining cases with a generic handler, say opening a dialog box with the error description.
I suspect that if you spent some time on a well-written project in Java or C# (these can be hard to find!), you might find that your concerns are overstated. In my experience, properly implemented exceptions actually tend to localize problems rather than globalize them, reduce error-handling complexity, and reduce the appearance of indeterminate states.
And of course, "a full round of testing" should be a normal part of development, not something considered unmanageable!
Given the quality of 0MQ, I'm sure you can do a clean and robust rewrite in C with no problem. It does actually seem like a good language for 0MQ, but IMO that's mostly due to being more portable than C++ (for embedded dev, etc).
Wow. How does Java/C# use exceptions to localise the error handling?
As for a full round of testing, I believe we have a different view of what robustness means. For me "full round of testing" means checking that the delayed TCP acks on the OpenVMS on VAX platform still work as expected. And that when this specific kind of network switch failure happens, the program can ultimately recover. That this particular timing of the incoming network events (at microsecond precision) doesn't introduce a hang-up. Etc.
If you consider that to be normal part of development you must be surely working for IBM :)
I don't know what you mean by how they handle localized errors, but you can surround any piece of code with try/catch and handle it there or have the method throw an exception that should be handled somewhere up the stack.
In the example you showed us, you'd just wrap the code in a local try/catch in Java.
Why do you consider this not localized enough?
Yes, that's nice. However, it's the same thing as the C-style error handling.
Sorry I was unclear. Java and C# were mentioned because those are languages where exception handling is pervasive. The standard libraries and even the runtime can raise exceptions, so code is always developed with exceptions in mind. Returning error codes isn't something anyone does much of in these languages. And yet, even in large projects written in these languages, your concerns do not manifest themselves in any way that I have seen.
If you have a lot of experience with exceptions in Java/Python/Ruby/etc and disagree, that's fine. If not, I merely suggest that there is something to be learned from how exception handling plays out in projects written in these languages.
I guess there's not much system-level programming going on in Java or C#. So the problem is never hit. And as I already said in the blog post, exceptions are great for rapid development, which is exactly what Java and C# are used for.
If what you're saying is that C++ (and any other high-level language) is no good if you want to have extremely tightly specified behavior in every possible edge case, you are probably right. But do we really want to do that, except in some extreme cases?
I'm sure assembly is an even better option. After all, you can even deal with stack corruption, stack overflow and weird register state, which you will not correctly accomplish in C.
The problem you are creating here is that anybody who does not know every last bit of your code will not code decent responses to error conditions. In fact they may ignore them and proceed to use the partially initialized structures you've created. At that point you can tell them "hey, I didn't let you do that, see, I told you what happened by setting the return code of foo_init to X". But the end result of this will be a segfault, because nobody checks the C function's return values: that's a boatload of uninteresting work ! In practice, with inexperienced programmers, serious damage will be done before the program bails out (and let's be fair : it happens to experienced ones often enough as well). And by inexperienced I mean practically every programmer who's not a seasoned kernel coder (and even they make error handling mistakes resulting in crashes).
Just so you know, I once drafted a post about the correct way of opening a tcp connection to a remote host, correctly handling all possible error cases in C. It was 150+ lines of code. It took over 10 attempts, and each time a code review brought a few things I hadn't thought of to light. And frankly, it still only handled ipv4 and ipv6 (and a hostname lookup can actually return more than just those 2), still did not correctly initialize the connection in the SCTP case, and I'm sure there are a few old unixes that have even more options.
150 lines of code. For writing "hello, world !" to a tcp port.
150.
The reason for the massive length is that one has to repeat large sections of the program based on what the different cases are. And there's the case where DNS returns an IPv6 address but the host does not actually have IPv6 connectivity (and the code doesn't even handle the case where the host is on an IPv6 island, like most hosts are today, which can probably add another 15 lines of code easily).
And frankly, there are several stupidities that are still not handled. Like running out of stack space (does 0mq handle that correctly in all cases ? I'd be extremely impressed if it does).
In python it's 5 lines of code. In Java about 15.
Now first, you're right. That python code can fail with almost a dozen possible exceptions. Making it handle the 6to4 degradation gracefully, the only error case that can be meaningfully recovered from, makes it 10 lines of code. Properly abstracted, it's maybe 15 lines of code.
That code has lots of failure cases that I don't really care about, which essentially all need the sysadmin to fix things.
The problem is not just that the C stuff is 150 lines of code, but the fact that that is so extremely limiting. If merely sending a few bytes correctly is such a huge effort, imagine if I was copying a file to a tcp connection. Dozens of new failure scenarios (and all are uniformly poorly documented). Suppose I was writing a simple fuse filesystem.
The problem is that the error state space explodes, just like you say. But that happens in C just as it happens in every other language. If we ever want to get to actually implementing meaningful logic on top, we will need some way to do it that doesn't involve the programmer figuring out all possible error cases.
And it is absolutely true that this will mean we lose some control over what happens if weird errors do occur. It essentially guarantees that the programmer will not have thought about, say, the IPv6->IPv4 degradation hinted at above.
But given the fact that that programmer was probably implementing, say, a web store … is that really such a bad thing ?
150 lines of code to correctly handle TCP connection? Doesn't seem likely. You must have missed a lot of corner cases. 500-1000 would be a more reasonable number.
As for the stack overflow, Crossroads has limited stack depth, i.e. you should never run out of stack space unless your OS allocates an unreasonably short stack by default.
As for the error state space exploding, it doesn't happen if you handle any error immediately when it occurs rather than passing the error up the stack. That way you convert the 'exceptional' state into a regular, well-defined codepath. If you pass it on, on the other hand, you have no control over who handles it and how. Also, if you write a handler you have no idea of what exactly caused the error. In such an environment your error handling code is just guesswork at best.
You can't always handle an error immediately. If I am implementing a complex API, and a function deep inside the API fails in a way that means the top-level API call has failed, the error must be propagated up the stack somehow. In C, it propagates up with stuff like "if deepFunction() == -1 return -1;" (or goto fail or whatever), all the way up to the top. The stack must unwind somehow, and the error must propagate up.
"If you write a handler you have no idea of what exactly caused the error."
Wow, this is simply incorrect. Not only can you specify an exception with any degree of granularity and with whatever information you want when you throw it, but most languages and some C++ exception libraries even automatically attach a stack trace to an exception when thrown, so you can easily analyze both the exact line where the failure occurred and the code path that led to it. As I recommended before, get some experience with a quality codebase that uses exceptions and you'll quickly find that the problems you worry about do not actually occur.
You need to know what the exception is at *development time* so that you can write a correct handler. The stack trace received at runtime doesn't help in any way.
That's a good answer for errors that can be handled locally, but that's hardly ever true. And there's nothing in exceptions that prevents you from handling errors locally.
What's irritating in C++ is that you always have a mix of exceptions and C-style error checking because, well, you don't have a choice, do you ?
But most errors cannot be handled in the code. Suppose a DNS lookup fails. What are you going to do to handle the error locally ? You're going to pass on a failure up the stack, that's what you're going to do. You argue in favor of doing it with return values, but those are easy to ignore, and they can take on surprising values.
You can easily create a well-defined codepath by catching any exception in the calling function and re-throwing a new one with a getCause method. And the only function that needs to be able to handle partially initialised state is the destructor and even then only if you have external resources to worry about. Or you can redesign your classes to contain resource management class instances that manage one external resource and close it correctly on destruction. That way, you can nearly always rewrite pretty much any destructor to become empty at the cost of having a few trivial classes (much like in Java).
With an exception you're sure that either it's handled, or the program dies. Both outcomes are preferable to proceeding with crippled backends. Return values are too easy to ignore, and therefore they usually are simply ignored.
What Martin prefers is a more "functional" approach to error handling, where an error is actually another kind of outcome of some particular function. There's no reason not to be able to handle the error locally, and propagating an error is always a _convenience_ practice that DOES have its many pitfalls. It's easier to create predictable error paths if the caller always handles the error at the call site. With exceptions and propagation, you need to know all the exceptions a particular function throws in order to spot unhandled ones, requiring static analysis (which doesn't always give correct results), and you still have the problem of not knowing which function threw the exception (which is in many cases needed) if exceptions are not handled at the call site. The big problem with the functional approach is that it IS at odds with rapid development, and in many cases the code is a lot harder to read. Exceptions make the code clearer and easier to write at the expense of creating many hidden error paths, and for some projects that's unacceptable. One approach to the problem could be to use C++ without exceptions.
Yes. If you want to do rapid development use exceptions. If you want to do systems programming, forget about it.
Btw, the discussion does have a philosophical aspect. It is about the concept of the "unknown" (a.k.a. the "unexpected").
My argument was that with systems programming you should avoid the "unknown" as much as possible. By handling errors as soon as they happen you basically say that while it is an error, it is not an "unknown" or "unexpected" state. It's just a different, perfectly valid and well-understood codepath.
With exceptions you take the error, you acknowledge its "unknown" nature by converting it into a generic exception, you break away from the standard codepath and pass the responsibility for dealing with the unknown to someone else. This is great when you are OK with 80% or 90% solutions. Errors are sparse, and handling them correctly in 90% of cases can mean the problem will never be hit, especially if the program is not used very widely.
This point you make IS valid:
"This is great when you are OK with 80% or 90% solutions. Errors are sparse, and handling them correctly in 90% of cases can mean the problem will never be hit, especially if the program is not used very widely."
However, even a hello world application doesn't have a <10% fault ratio (we're saying that only 10% or less of the logic in the application doesn't handle every single error that can happen absolutely perfectly…); not even the UNIX kernel has a <10% fault ratio… There's always a way to break stuff, so why fix it until it needs to be fixed? At which point you know the issue and how to handle it… I can hack Windows and activate it any day of the week -> switching to another language isn't going to stop that.
Interesting article. Couldn't you just make constructors and destructors private, then give your classes static create and destroy functions?
class Foo {
public:
    static int create(Foo** foo) { *foo = new Foo(); return 0; }
    static int destroy(Foo* foo) { delete foo; return 0; }
private:
    Foo() {}
    virtual ~Foo() {}
};
Sure. You can. But you still have to care about the semi-initialised state inside the class, especially if the termination function performs non-trivial operations on the class. If it calls any other classes you are back to the original problem: the object is in a semi-defined state while processing is going on. Once you add callbacks into the mix you'll be completely lost.
I share these concerns, but how does C improve on that over C++? Those issues are just moved to foo_init() and foo_term().
create() can fail and not return a new object, if it couldn't be constructed with a well-defined state. In that case, every live instance is guaranteed to be valid, and the destructor can also assume that.
It doesn't improve anything. It's just explicit and authoritative about not using the problematic constructs.
The big problem with C++'s exception handling is that it always unwinds the stack to the try-catch frame before doing anything with the exception. A lot of context is lost as a result.
As far back as the late '80s, languages like Common Lisp allowed conditions (CL's name for exceptions) to be handled in the context where they occurred, at the top of the stack. I'm disappointed that this model hasn't been implemented in a C-like language.
Seems to me like the conclusion is, "I can't save people from themselves or enforce easily enough good design paradigms. So, I'm going to do something perhaps a little more convoluted initially just so people don't break RAII or use exceptions incorrectly. I must save them from themselves or save me from constant clean up trouble."
All of these problems can be surpassed if everyone was on-board with a mix of exceptions/classic error handling/proper RAII/consistent design, and you'd still get some of the, arguable, programming benefits of C++. Now, I have immense respect for ZeroMQ. So, I don't disagree with the choice, and I unfortunately know the perils of large projects and the existence of inconsistency. So, I don't begrudge you for the choice, and I respect that it seems you know these things can be overcome; it's just not easy to enforce or keep things consistent. Many times I've saved myself from myself by just writing it in C so that I don't get so grand on design that I lose focus - because for what I do on a daily basis, whatever produces the fastest code is almost always what I have to ultimately go with.
I write more C than C++ for the same reasons, but it's sort of sad it comes to this at times…
I would say the conclusion is "there's no good design paradigm for exhaustively handling errors once exceptions get involved".
You cannot avoid exponential growth of the number of failure modes unless, of course, you enclose every function invocation in its own try/catch block.
Or, you could try to consciously limit the distance between where an exception is thrown and where it is caught. For example, if function foo() throws exceptions, we can enforce that every call to foo() is in a try/catch block that handles every exception foo() throws. That is, we could by policy disallow exceptions from propagating out of foo(): foo() has to handle all the exceptions raised by the functions it calls and, if it has to, turn them into foo()'s own exception. I think that is the only way to avoid the exponential growth of exceptions.
A similar effect goes for objects. We should limit the layers of objects. Only those parts of the code that have a very clean and limited interface should be wrapped in a class. Otherwise we sign up for endless code bureaucracy.
Wrt exceptions: Yes. But that's exactly what you would do in C by checking the error codes. And C syntax is much nicer IMO.
You can certainly build something like conditions, but the problem is you end up with zero encapsulation. Exception objects should act as an encapsulated closure around the original error, capturing all relevant state that ought to be exposed to any bits of code that might catch it. The end result is far cleaner code.
I'm a bit skeptical about these conclusions, except the one about exception specifications being bogus. It sounds like you have constructors and destructors that do too much and therefore can fail in undesirable ways. From what I've seen of the zeromq code (nice project btw) your ctors tend to be pretty lean, except things like ctx. So I'm not sure what problems you've had there.
Regarding calling code that throws exceptions: I'd guess maybe they raise exceptions for unexceptional things? Archetypal examples like exceptions for EOF versus handling that in a reading loop are the typical patterns of exceptions that get increasingly impossible to deal with. Exceptions should truly be for the things that are unexpected and unhandlable at the place they happen. Normal "expected" problems should be dealt with using C-style return codes.
As an example, I'd rather have a socket/thread/message class raise a std::bad_alloc error than assert and stop. I can catch a bad_alloc higher up and log that we're low on system resources. Of course, the logging has to be written so that it can log things without memory allocations, and that's also rough and time-consuming to get right.
Your comment that "Adding a new function call to the code introduces the possibility that different types of exceptions will bubble up to the calling function where they are not yet properly handled. Which means new bugs." particularly irks me. C-style handling makes it easy to forget to check a return value, causing bugs that are rare and non-trivial to reproduce/debug. With an unhandled exception, you can at least get a stack trace and see where it was missed.
I've done systems programming for many years in C++, with and without exceptions. Without, it degrades to C-style checks and such like you describe, and with, well, you have to be careful and there's a lot of rope to hang yourself with. But at the end of the day, I've found it easier and simpler, as long as you don't use Java-style exceptions for every little conceivable "error". Less is more when it comes to exceptions: keep them for the exceptional cases.
But, if you want to try going back to the C path, I wish you the best of luck with that project, don't forget to free your memory and close your fds :-D
Well, I personally find explicitly handling the error codes quite easy (a 4-year-old could do that). All it requires is a bit of discipline.
However, understanding how your local patch changes global error handling is close to impossible, especially when the patch doesn't deal with the exception handling directly, just happens to be on the path of the stack unwind.
If you trust the entire development team, and somehow trust the API users and their ability to write callbacks, return codes are fine — but only for things that are more or less optional to check. For non-optional and exceptional things, use exceptions.
I can't quite see many places in zeromq where exceptions would improve life, except for the no-throw new. That's one place where I'd rather get a std::bad_alloc than risk a missing assert.
There's not much you can do with bad_alloc other than terminating the process anyway.
Looks like there's a little typo in the second code example.
I think you meant to test the value of rc in the conditional.
uh, C++ has huge advantages. What you really want is C + two or three C++ features. or 100, really. But not exception handling, not RTTI, not STL, not all the Java-esque shit. You have to understand that the people who designed it are basically morons and the people who implemented stl are retarded morons, but underneath it all there is still C with vastly improved namespacing and data hiding.
I use these flags at home:
-Wall -Werror -Wunused -Wundef -Wsign-compare -Wshadow -Wold-style-cast -Woverloaded-virtual -Wsign-promo -Wsynth -fno-exceptions -fno-rtti -fcheck-new -g -ggdb -O -D_GNU_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -pthread
sometimes g++ (or gcc) needs "-g -O" to actually make all the checks work.
C with templates would be nice. I never understood why nobody has thought of doing that.
At least, one reason is that C doesn't support name mangling, and templates *imply* that, since they produce functions with the same name that differ only in their input parameters.
Example: do(T) can be instantiated to be do(This_t) and do(That_t).
Fair enough, unless templates are always evaluated as inlines.
Hmm, I'm currently deciding whether to use C or C++ for my project, and talk like that makes me look at the whole article with a renewed sense of scepticism. It seems you're not happy with exceptions, but you don't understand the whole slew of problems that comes with C.
When reading through the described problem it feels like you are looking for "inversion of control", e.g. http://code.google.com/p/iocc/wiki/GettingStarted
To be frank, I believe that the whole concept of inversion of control is one of the more horrible ideas to be found in computer science. Just to give an example why it's problematic: My domain-specific code can happily use two libraries. However, would you ever dare to embed your domain-specific code into two frameworks in parallel?
Sigh.
Exceptions are just big returns. And the catch block is the call point.
You should have ONE exception class in your project, subclassed from std::exception. In that class you have members that hold the OS error code, an application-specific error code, a string that describes the error, and possibly the file and line where the exception was thrown.
At the catch point you can decide how to handle the problem. In 95% of cases all you need to do is keep running, possibly logging a warning or error.
If you are down on exception syntax, create a macro:
#define IFTHROW(CODE) try { CODE } catch (const std::exception &e)
That will act just like an old-fashioned "if (errno)" check.
If you think C++ causes undefined behavior, just wait until you use C.
This is actually what I dislike about C++. All the time I am getting into situations where I have to write extra code (the macro in this case) just to get rid of C++ features. I'd rather use a language that lacks the features in question in the first place.
As for C, I've already re-written ZeroMQ in C (see nanomsg project) and the code is simpler, more readable, behaviour is fully defined etc.
Fixed. Thanks!
Exceptions and undefined behaviour have nothing to do with each other. Nothing. I won't comment on all the remaining fluff about constructors having non-empty bodies, half-initialised objects, and such. You don't seem to know anything about the "Rule of Zero"; your reasoning about separate init functions makes absolutely no sense, I'm afraid.
The very fact that you have to rely on such rules is a failure of the language.
I once heard it being put like this: "Design pattern is a sign of problem in the language."