Error Handling with Error Code Objects #
Introduction #
I have used the code and the mechanism described in this article for almost 20 years now and, so far, I haven’t found a better method for error handling in large C++ projects. The original idea is taken from an article that appeared in Dr. Dobb’s Journal back in 2000. I have added a few bits and pieces to make it easier to use in a production environment.
Background #
Every C++ programmer learns that there are two traditional methods to deal with abnormal conditions. The first, inherited from good old C, is to return an error code and hope that the caller will test it and take appropriate action. The second is to throw an exception and hope that a surrounding block has provided a catch handler for that exception. The C++ FAQ strongly advocates the second method, arguing that it leads to safer code.
However, using exceptions also has its own drawbacks. Code tends to become more complicated and users have to be aware of all the exceptions that can be thrown. This is why older versions of the C++ standard allowed an “exception specification” to be added to function declarations. In addition, exceptions tend to make code less efficient.
Error code objects (erc) are designed to be returned by functions like the traditional C error codes. The big difference is that, when not tested, they throw an exception.
Let us take a small example and see what the different implementations would look like. First the “classical C” approach with traditional error codes:
#include <cassert>
#include <cmath>
#include <cstdio>

// takes a double& so that it can be called with the double declared in main
int my_sqrt (double& value) {
  if (value < 0)
    return -1;
  value = sqrt (value);
  return 0;
}

int main () {
  double val = -1;
  // careful coding verifies result of my_sqrt
  if (my_sqrt (val) == -1)
    printf ("square root of negative number");
  // someone forgot to check the result
  my_sqrt (val);
  // disaster can strike here because we didn't check the return value
  assert (val >= 0);
}
If the result isn’t checked, bad things can happen, and we may need to rely on traditional debugging tools to find out what went wrong.
Using “traditional” C++ exceptions the same code could look like this:
#include <cassert>
#include <cmath>
#include <cstdio>
#include <exception>

void my_sqrt (double& value) {
  if (value < 0)
    throw std::exception ();
  value = sqrt (value);
}

int main () {
  double val = -1;
  // careful coding verifies result of my_sqrt
  try {
    my_sqrt (val);
  } catch (const std::exception& x) {
    printf ("square root of negative number");
  }
  // someone forgot to check the result
  my_sqrt (val);
  // program terminates abnormally because there is an uncaught exception
  assert (val >= 0);
}
This works great in a small example like this because we can see what the my_sqrt function does and pepper the code with try...catch blocks. If, however, the function is buried deep in a library, you might not know what exceptions it might throw. Note that the signature of my_sqrt doesn’t give any clue as to what, if any, exceptions it might throw.
And now… drumroll… here are the erc objects in action:
erc my_sqrt (double& value) {
  if (value < 0)
    return -1;
  value = sqrt (value);
  return 0;
}

int main () {
  double val = -1;
  // careful coding verifies result of my_sqrt
  if (my_sqrt (val) == -1) // (1)
    printf ("square root of negative number");
  // if you are in love with exceptions still use them
  try {
    my_sqrt (val);
  } catch (erc& x) {
    printf ("square root of negative number");
  }
  // someone forgot to check the result
  my_sqrt (val); // (2)
  // program terminates abnormally because there is an uncaught exception
  assert (val >= 0);
}
A few observations before going into how this works:
First, a question of terminology: to distinguish between traditional “C” error codes and the new error code objects, in the rest of this article “error code” means an error code object. When I need to refer to traditional “C” error codes, I’m going to call them “C error codes”.
The signature of my_sqrt clearly indicates that it will return an error code. In the C++ exception case there is no indication that an exception could be thrown. Once upon a time C++98 had dynamic exception specifications, but they were deprecated in C++11 and removed in C++17. You can find a longer discussion on this subject in Raymond Chen’s post The sad history of the C++ throw(…) exception specifier. The solution with C error codes doesn’t make it obvious that the integer value returned is an error code.
First Look at Error Code Objects #
When an erc object is created, it has a numerical value (like any C error code) and an activity flag that is initially set.
class erc
{
public:
  erc (int val) : value (val), active (true) {}
  //...
private:
  int value;
  bool active;
};
If the object is destructed and the activity flag is set, the destructor throws an exception.
class erc
{
public:
  erc (int val) : value (val), active (true) {}
  ~erc () noexcept(false) {if (active) throw *this;}
  //...
private:
  int value;
  bool active;
};
So this is just an object throwing an exception, albeit doing it during the destructor execution. Nowadays this is frowned upon: since C++11 destructors are implicitly noexcept, and for this reason we have to decorate the destructor declaration with noexcept(false).
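A small compile-time check can make the effect of the decoration visible. This is only an illustration: the struct names `quiet` and `loud` are made up for this sketch and are not part of the article's code.

```cpp
#include <type_traits>

// Destructor without an exception specification: implicitly noexcept since C++11.
struct quiet {
  ~quiet () {}
};

// Destructor explicitly allowed to throw, as in the erc class.
struct loud {
  ~loud () noexcept(false) {}
};

static_assert (std::is_nothrow_destructible<quiet>::value,
               "destructors are noexcept by default");
static_assert (!std::is_nothrow_destructible<loud>::value,
               "noexcept(false) permits the destructor to throw");
```

Without `noexcept(false)`, an exception escaping the destructor would call `std::terminate` instead of reaching a catch handler.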
The integer conversion operator returns the numerical value of the erc object and resets the activity flag:
class erc
{
public:
  erc (int val) : value (val), active (true) {}
  ~erc () noexcept(false) {if (active) throw *this;}
  operator int () {active = false; return value;}
  //...
private:
  int value;
  bool active;
};
Because the activity flag has been reset, the destructor will no longer throw an exception when the object goes out of scope. Typically, the integer conversion operator is invoked when the error code is tested against a certain value.
Looking back at the simple usage example, at the comment marked (1):
if (my_sqrt (val) == -1) // (1)
printf ("square root of negative number");
Here the erc object returned by the function my_sqrt is compared with an integer value. This invokes the integer conversion operator and the activity flag is reset. Consequently, the destructor doesn’t throw an exception.
Looking at the comment marked (2):
my_sqrt (val); // (2)
Here the returned erc object is a temporary that is destroyed at the end of the statement, right after my_sqrt() returns, and, because its activity flag is set, the destructor throws an exception.
The value 0 is reserved to indicate success, following the well-established Unix convention (as Aristotle was saying, there is only one way to succeed). An erc with a value of 0 never throws an exception; any other value indicates failure and, if not tested, generates an exception.
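The pieces described so far can be put together in a minimal, self-contained sketch. It is not the article's production class and makes a few assumptions of its own: a passive copy constructor (so that the exception object created by `throw *this` does not throw again when it is destroyed), a hypothetical `code()` accessor, a hypothetical helper `ignored_result_throws`, and a C++17 compiler so that returning an `erc` creates no intermediate copy.

```cpp
#include <cassert>
#include <cmath>

// Simplified sketch of an error code object.
class erc {
public:
  // 0 means success, so such an object is never active.
  erc (int val) : value (val), active (val != 0) {}
  // Assumption of this sketch: copies are passive.
  erc (const erc& other) : value (other.value), active (false) {}
  // An untested failure code throws when it is destroyed.
  ~erc () noexcept(false) { if (active) throw *this; }
  // Testing the code deactivates the object.
  operator int () { active = false; return value; }
  // Hypothetical accessor that reads the value without touching the flag.
  int code () const { return value; }
private:
  int value;
  bool active;
};

erc my_sqrt (double& value) {
  if (value < 0)
    return -1;
  value = std::sqrt (value);
  return 0;
}

// Returns true if ignoring the result of my_sqrt (v) produced an exception.
bool ignored_result_throws (double v) {
  try {
    my_sqrt (v);   // result not tested: the temporary is destroyed here
  } catch (const erc& e) {
    assert (e.code () == -1);
    return true;
  }
  return false;
}
```

With this sketch, `ignored_result_throws (-1.0)` reports an exception while `ignored_result_throws (4.0)` does not, and a tested call such as `my_sqrt (x) == 0` never throws.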
This is the essence of the whole idea of error code objects as presented in the Dr. Dobb’s Journal article. However, I couldn’t resist the temptation to take a simple idea and make it more complicated; keep reading!
More Details #
The “big picture” presentation has ignored some details that are needed to make error codes easier to integrate in large-scale projects. First, we need a move constructor and a move assignment operator that take over the activity flag from the source object and deactivate the source object. This ensures that we have only one active erc object.
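A possible shape for these two operations is sketched below. The member names follow the simplified class shown earlier; the passive copy constructor, the `is_active()` accessor and the `transfer_demo` function are assumptions added for this illustration and may differ from the attached source code.

```cpp
#include <cassert>
#include <utility>

class erc {
public:
  erc (int val = 0) : value (val), active (val != 0) {}
  // Assumption: copies are passive, so `throw *this` yields a harmless copy.
  erc (const erc& other) : value (other.value), active (false) {}
  // Move construction: take over the flag and leave the source passive.
  erc (erc&& other) noexcept : value (other.value), active (other.active) {
    other.active = false;
  }
  // Move assignment: same hand-over; the previous content of *this is dropped.
  erc& operator= (erc&& other) noexcept {
    if (this != &other) {
      value = other.value;
      active = other.active;
      other.active = false;
    }
    return *this;
  }
  ~erc () noexcept(false) { if (active) throw *this; }
  operator int () { active = false; return value; }
  bool is_active () const { return active; }   // hypothetical, for the demo
private:
  int value;
  bool active;
};

// Follows the activity flag through a move construction and a move assignment.
int transfer_demo () {
  erc a (5);                 // failure code: active
  erc b (std::move (a));     // b takes over, a becomes passive
  assert (!a.is_active () && b.is_active ());
  erc c;                     // success: passive
  c = std::move (b);         // now c is the only active object
  assert (!b.is_active () && c.is_active ());
  return c;                  // converting to int tests and deactivates c
}
```

At every point in `transfer_demo` exactly one object is active, so the failure is reported once and only once.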
We also need a mechanism to group classes of error codes together for easy handling. This mechanism is implemented through error facility objects (errfac). In addition to the value and activity flag attributes, erc objects also have a facility and a severity level. The erc destructor does not throw an exception directly as shown before; instead it invokes the errfac::raise function of the associated facility object. The raise function compares the severity level of the erc object against a log level and a throw level associated with each facility. If the error code’s severity is higher than the facility’s log level, errfac::raise() invokes errfac::log() to generate an error message; if it is higher than the throw level, errfac::raise() throws the exception. The names of the severity levels are borrowed from the UNIX syslog function:
Name | Value | Action |
---|---|---|
ERROR_PRI_SUCCESS | 0 | never logged, never thrown |
ERROR_PRI_INFO | 1 | by default not logged, not thrown |
ERROR_PRI_NOTICE | 2 | by default not logged, not thrown |
ERROR_PRI_WARNING | 3 | by default logged, not thrown |
ERROR_PRI_ERROR | 4 | by default logged, thrown |
ERROR_PRI_CRITICAL | 5 | by default logged, thrown |
ERROR_PRI_ALERT | 6 | by default logged, thrown |
ERROR_PRI_EMERG | 7 | always logged, always thrown |
By default, error codes are associated with a default facility, but one can create different facilities to group classes of errors. For instance, you can create a specialized error facility for all socket errors that knows how to translate the numerical error codes into meaningful messages.
Having different error levels can be useful for testing or debugging purposes when one can vary the throwing or logging level for a class of errors.
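To make the decision logic concrete, here is a rough sketch of a facility. Everything in it is hypothetical: the `failure` struct, the setter names, the in-memory message list and the `raise_throws` helper are inventions of this sketch, and the default thresholds are inferred from the table above rather than taken from the attached source code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Severity levels from the table above.
enum {
  ERROR_PRI_SUCCESS, ERROR_PRI_INFO, ERROR_PRI_NOTICE, ERROR_PRI_WARNING,
  ERROR_PRI_ERROR, ERROR_PRI_CRITICAL, ERROR_PRI_ALERT, ERROR_PRI_EMERG
};

// What this sketch throws; the real erc destructor deals with erc objects.
struct failure {
  int value;
  int severity;
};

class errfac {
public:
  explicit errfac (const std::string& name) : name_ (name) {}

  // Thresholds can be changed per facility, e.g. for testing or debugging.
  void log_level (int lvl) { log_level_ = lvl; }
  void throw_level (int lvl) { throw_level_ = lvl; }

  // Would be called by the erc destructor for an untested error code.
  void raise (int value, int severity) {
    assert (severity >= ERROR_PRI_SUCCESS && severity <= ERROR_PRI_EMERG);
    if (severity == ERROR_PRI_SUCCESS)
      return;                                  // never logged, never thrown
    const bool forced = (severity == ERROR_PRI_EMERG);
    if (forced || severity > log_level_)
      log (value, severity);
    if (forced || severity > throw_level_)
      throw failure {value, severity};
  }

  const std::vector<std::string>& messages () const { return messages_; }

private:
  // A specialized facility could translate the value into a real message.
  void log (int value, int severity) {
    messages_.push_back (name_ + ": error " + std::to_string (value)
                         + " (severity " + std::to_string (severity) + ")");
  }

  std::string name_;
  int log_level_ = ERROR_PRI_NOTICE;     // assumed default: warnings are logged
  int throw_level_ = ERROR_PRI_WARNING;  // assumed default: errors are thrown
  std::vector<std::string> messages_;
};

// Helper for experiments: reports whether raise () threw.
bool raise_throws (errfac& fac, int value, int severity) {
  try {
    fac.raise (value, severity);
  } catch (const failure&) {
    return true;
  }
  return false;
}
```

Raising the throw level of a facility, for example with `throw_level (ERROR_PRI_EMERG)`, turns most of its errors into log entries only, which is the kind of tuning the previous paragraph has in mind.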
A More Realistic Example #
A blog article by Andrzej Krzemieński discusses cascading cancellations when an error occurs. It shows the basic layout of an HTTP client program:
Status get_data_from_server(HostName host)
{
  open_socket();
  if (failed)
    return failure();
  resolve_host();
  if (failed)
    return failure();
  connect();
  if (failed)
    return failure();
  send_data();
  if (failed)
    return failure();
  receive_data();
  if (failed)
    return failure();
  close_socket(); // potential resource leak
  return success();
}
The issue here is that an early return can produce a resource leak because the socket is not closed. Let’s see how error codes could be used in this situation.
If we want to use exceptions the code could look like this:
// function declarations
erc open_socket ();
erc resolve_host ();
erc connect ();
erc send_data ();
erc receive_data ();
erc close_socket ();

erc get_data_from_server(HostName host)
{
  erc result;
  try {
    // the first operation that fails triggers an exception
    open_socket ();
    resolve_host ();
    connect ();
    send_data ();
    receive_data ();
  } catch (erc& x) {
    result = x; // return the failure code to our caller
  }
  close_socket (); // cleanup
  return result;
}
The same code can be written without using exceptions:
// function declarations
erc open_socket ();
erc resolve_host ();
erc connect ();
erc send_data ();
erc receive_data ();
erc close_socket ();

erc get_data_from_server(HostName host)
{
  erc result;
  // evaluation stops at the first call that returns a non-zero code
  (result = open_socket ())
    || (result = resolve_host ())
    || (result = connect ())
    || (result = send_data ())
    || (result = receive_data ());
  close_socket (); // cleanup
  result.reactivate ();
  return result;
}
In the fragment above, result has been converted to an integer because it has to participate in the logical OR expression. The OR expression short-circuits, so evaluation stops at the first function that returns a non-zero value. The conversion resets the activity flag, so we have to explicitly turn it on again by calling the reactivate() function. If all the functions have been successful, result is 0 and, by convention, it will not throw an exception.
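The short-circuit behaviour can be tried out with a self-contained sketch. The socket functions below are stand-ins that only record that they were called; `step`, `trace` and `failing_step` are hypothetical helpers for the demo, and the erc class is a simplified version with assumed move operations, a passive copy constructor and an assumed `reactivate()` implementation.

```cpp
#include <cassert>
#include <string>

class erc {
public:
  erc (int val = 0) : value (val), active (val != 0) {}
  erc (const erc& other) : value (other.value), active (false) {}
  erc (erc&& other) noexcept : value (other.value), active (other.active) {
    other.active = false;
  }
  erc& operator= (erc&& other) noexcept {
    value = other.value;
    active = other.active;
    other.active = false;
    return *this;
  }
  ~erc () noexcept(false) { if (active) throw *this; }
  operator int () { active = false; return value; }
  // Turn the flag back on, but only for failure codes.
  void reactivate () { active = (value != 0); }
private:
  int value;
  bool active;
};

std::string trace;      // records which steps ran (demo only)
int failing_step = 0;   // 0: everything succeeds; n: step n returns code n

// Stand-in for a real operation: logs its name and maybe fails.
erc step (int n, const char* name) {
  assert (n >= 1 && n <= 6);   // the demo has six steps
  trace += name;
  trace += ' ';
  return (n == failing_step) ? n : 0;
}

erc open_socket ()  { return step (1, "open"); }
erc resolve_host () { return step (2, "resolve"); }
erc connect ()      { return step (3, "connect"); }
erc send_data ()    { return step (4, "send"); }
erc receive_data () { return step (5, "receive"); }
erc close_socket () { return step (6, "close"); }

erc get_data_from_server () {
  erc result;
  (result = open_socket ())
    || (result = resolve_host ())
    || (result = connect ())
    || (result = send_data ())
    || (result = receive_data ());
  close_socket ();        // cleanup always runs
  result.reactivate ();   // the caller must still test the result
  return result;
}
```

Setting `failing_step` to 3 makes `connect` fail: the send and receive steps are skipped, the socket is still closed, and the caller receives an active error code with the value 3.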
Final Touches #
The attached source code is production quality and reasonably well optimized. Hopefully that doesn’t make it much harder to use. The demo project is a C++ wrapper for the popular SQLite database. It is much bigger in size just because it includes the latest version (as of this writing) of the SQLite code. Both the source code and the demo project include Doxygen documentation.