Error Handling with Error Code Objects

It is possible to fail in many ways…while to succeed is possible only in one way

Introduction #

I have used the code and the mechanism described in this article for almost 20 years now, and so far I haven’t found a better method for error handling in large C++ projects. The original idea is taken from an article that appeared in Dr. Dobb’s Journal back in 2000. I have added a few bits and pieces to make it easier to use in a production environment.

The impulse to write this article was a posting on Andrzej’s C++ blog. As we’ll see later in this article, using error code objects can produce significantly cleaner and easier-to-maintain code.

Background #

Every C++ programmer learns that there are two traditional methods to deal with abnormal conditions: one, inherited from good old C, is to return an error code and hope that the caller will test it and take appropriate action; the second one is to throw an exception and hope that a surrounding block has provided a catch handler for that exception. The C++ FAQ strongly advocates the second method, arguing that it leads to safer code.

Using exceptions, however, also has its own drawbacks. Code tends to become more complicated and users have to be aware of all the exceptions that can be thrown. This is why older C++ specifications added “exception specifications” to function declarations. In addition, exceptions tend to make code less efficient.

Error code objects (erc) are designed to be returned by functions like the traditional C error codes. The big difference is that, when not tested, they throw an exception.

Let us take a small example and see what the different implementations would look like. First the “classical C” approach with traditional error codes:

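  #include <cassert>
  #include <cmath>
  #include <cstdio>

  // takes a double& so that it can be called with the double in main
  int my_sqrt (double& value) {
      if (value < 0)
        return -1;
      value = std::sqrt(value);
      return 0;
  }

  int main () {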
    double val = -1;

    // careful coding verifies result of my_sqrt
    if (my_sqrt (val) == -1)
      printf ("square root of negative number");

    // someone forgot to check the result 
    my_sqrt (val);

    // disaster can strike here because we didn't check the return value
    assert (val >= 0);
  }

If the result is not checked, all kinds of bad things can happen and we have to be prepared to use all the traditional debugging tools to find out what went wrong.

Using “traditional” C++ exceptions the same code could look like this:

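  #include <cassert>
  #include <cmath>
  #include <cstdio>
  #include <exception>

  // takes a double& so that it can be called with the double in main
  void my_sqrt (double& value) {
      if (value < 0)
        throw std::exception ();
      value = std::sqrt(value);
  }

  int main () {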
    double val = -1;

    // careful coding verifies result of my_sqrt
    try {
      my_sqrt (val);
    } catch (std::exception& x) {
      printf ("square root of negative number");
    }

    // someone forgot to check the result 
    my_sqrt (val);

    // program terminates abnormally because there is an uncaught exception
    assert (val >= 0);
  }

This works great in a small example like this because we can see what the my_sqrt function does and pepper the code with try...catch blocks. If however the function is buried deep in a library, you might not know what exceptions it might throw. Note that the signature of my_sqrt doesn’t give any clue as to what, if any, exceptions it might throw.

And now… drumroll… here are the erc objects in action:

  // erc is declared in the source code attached to this article
  erc my_sqrt (double& value) {
      if (value < 0)
        return -1;
      value = std::sqrt(value);
      return 0;
  }

  int main () {
    double val = -1;

    // careful coding verifies result of my_sqrt
    if (my_sqrt (val) == -1)                    // (1)
      printf ("square root of negative number");

    // if you are in love with exceptions still use them
    try {
      my_sqrt (val);
    } catch (erc& x) {
      printf ("square root of negative number");
    }

    // someone forgot to check the result 
    my_sqrt (val);                              // (2)

    // program terminates abnormally because there is an uncaught exception
    assert (val >= 0);
  }

A few observations before diving into the magic of how this works:

  • First a question of terminology: to distinguish between traditional “C” error codes and my error code objects, in the rest of this article I’m going to call “error code” my error code objects. When I need to refer to traditional “C” error codes, I’m going to call them “C error codes”.

  • The signature of my_sqrt clearly indicates that it will return an error code. In the C++ exception case there is no indication that an exception could be thrown. Once upon a time C++98 had those exception specifications, but they were deprecated in C++11 and removed in C++17. You can find a longer discussion about that in Raymond Chen’s post The sad history of the C++ throw(…) exception specifier.

  • The C error codes solution also doesn’t make it obvious that the integer value returned is an error code.

A first look at error code objects #

For a “big picture” presentation we are going to ignore some details but we’ll get back to those in a moment.

When an erc object is created it has a numerical value (like any C error code) and an activity flag that is initially set.

  class erc
  {
  public:
    erc (int val) : value (val), active (true) {}
  //...
  private:
    int value;
    bool active;
  };

If the object is destructed and the activity flag is set, the destructor throws an exception.

  class erc
  {
  public:
    erc (int val) : value (val), active (true) {}
    ~erc () noexcept(false) {if (active) throw *this;}
  //...
  private:
    int value;
    bool active;
  };

So far, still nothing very special: this is an object throwing an exception, albeit doing it during the destructor execution. This is frowned upon nowadays: since C++11, destructors are implicitly noexcept, and that is why we have to decorate the destructor declaration with noexcept(false).
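The effect of that decoration can be checked at compile time. The fragment below is only a small illustration; the two struct names are made up for the occasion and are not part of the erc code.

```cpp
#include <type_traits>

// Hypothetical type with an ordinary destructor.
// Since C++11 such a destructor is implicitly noexcept(true).
struct plain_dtor {
  ~plain_dtor() {}
};

// Hypothetical type that opts out, the way erc does.
struct throwing_dtor {
  ~throwing_dtor() noexcept(false) {}
};

static_assert(std::is_nothrow_destructible<plain_dtor>::value,
              "an undecorated destructor is implicitly noexcept");
static_assert(!std::is_nothrow_destructible<throwing_dtor>::value,
              "noexcept(false) allows exceptions to leave the destructor");
```

Without noexcept(false), an exception leaving the destructor would end in a call to std::terminate instead of reaching a catch handler.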

The integer conversion operator returns the numerical value of the erc object and resets the activity flag:

  class erc
  {
  public:
    erc (int val) : value (val), active (true) {}
    ~erc () noexcept(false) {if (active) throw *this;}
    operator int () {active = false; return value;}
  //...
  private:
    int value;
    bool active;
  };

Because the activity flag has been reset the destructor will no longer throw an exception when the object goes out of scope. Typically the integer conversion operator is invoked when the error code is tested against a certain value.

Looking back at the simple usage example, at the comment marked (1), the erc object returned by the function my_sqrt is compared with an integer value and this invokes the integer conversion operator. As a result, the activity flag is reset and the destructor doesn’t throw. At the comment marked (2), the returned erc object is destroyed after my_sqrt() returns and, because its activity flag is set, the destructor throws an exception.

Following a well-established Unix convention, and because, as Aristotle said, there is only one way to succeed, the value ‘0’ is reserved to indicate success. An erc with a value of 0 never throws an exception. Any other value indicates failure and generates an exception (if not tested).

This is the essence of the whole idea of error code objects as presented in the Dr. Dobb’s Journal article. However, I couldn’t resist the temptation to take a simple idea and make it more complicated; keep reading!
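Putting the pieces together, a minimal sketch of the mechanism could look like the fragment below. It is not the attached production class. The copy constructor that creates inactive copies is my assumption for this sketch (it keeps the copy made by throw from throwing a second time), the two helper functions are hypothetical, and the code assumes a C++17 compiler so that the object created by return is the one the caller receives.

```cpp
#include <cmath>

// Minimal sketch of an error code object.
class erc {
public:
  erc(int val) : value(val), active(true) {}

  // Assumption: copies start inactive, so the copy made by `throw *this`
  // does not throw again when the exception object is destroyed.
  erc(const erc& other) : value(other.value), active(false) {}

  // Throw only if the code signals failure and nobody has tested it.
  ~erc() noexcept(false) {
    if (active && value != 0)
      throw *this;
  }

  // Testing the code resets the activity flag.
  operator int() {
    active = false;
    return value;
  }

private:
  int value;
  bool active;
};

erc my_sqrt(double& value) {
  if (value < 0)
    return -1;
  value = std::sqrt(value);
  return 0;
}

// Hypothetical helper: drops the result of my_sqrt and reports
// whether doing so raised an exception.
bool ignoring_throws(double val) {
  try {
    my_sqrt(val);  // the temporary erc is destroyed at the end of this statement
  } catch (erc&) {
    return true;
  }
  return false;
}

// Hypothetical helper: the conversion to int tests the code,
// so the destructor stays quiet.
int tested_code(double val) {
  return my_sqrt(val);
}
```

With these definitions, ignoring_throws(-1) reports an exception while ignoring_throws(4) does not, and tested_code(-1) simply returns -1.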

More details #

The “big picture” presentation has ignored some details that are needed to make error codes more functional and to integrate them in large-scale projects. First we need a move constructor and a move assignment operator that take over the activity flag from the source object and deactivate the source object. This ensures that we have only one active erc object.

We also need a mechanism for grouping classes of error codes together for easy handling. This mechanism is implemented through error facility objects (errfac). In addition to the value and activity flag attributes, erc objects also have a facility and a severity level. The erc destructor does not throw an exception directly, as shown before; instead it invokes the errfac::raise function of the associated facility object. The raise function compares the severity level of the erc object against a log level and a throw level associated with each facility. If the error code’s severity is higher than the facility’s log level, errfac::raise() invokes errfac::log() to generate an error message; likewise, it throws the exception only if the severity is higher than the facility’s throw level. The severity levels are borrowed from the UNIX syslog function:
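Here is a sketch of how such move operations might look, under the same simplifications as before. The is_active() accessor, the inactive copy constructor and the two helper functions are hypothetical additions that exist only to make the behavior observable; the attached code may well do this differently.

```cpp
#include <utility>

class erc {
public:
  erc(int val) : value(val), active(true) {}

  // Assumption: copies are inactive (used for the copy made by `throw *this`).
  erc(const erc& other) : value(other.value), active(false) {}

  // The new object takes over the activity flag; the source is deactivated.
  erc(erc&& other) noexcept : value(other.value), active(other.active) {
    other.active = false;
  }

  erc& operator=(erc&& other) noexcept {
    if (this != &other) {
      value = other.value;
      active = other.active;
      other.active = false;
    }
    return *this;
  }

  ~erc() noexcept(false) {
    if (active && value != 0)
      throw *this;
  }

  operator int() {
    active = false;
    return value;
  }

  bool is_active() const { return active; }  // hypothetical, for inspection only

private:
  int value;
  bool active;
};

// Counts the exceptions raised when a moved error code goes out of scope.
int throws_after_move() {
  int throws = 0;
  try {
    erc a(-1);
    erc b(std::move(a));  // b is now the only active object
  } catch (erc&) {
    ++throws;             // thrown by b; a is destroyed quietly during unwinding
  }
  return throws;
}

// Checks that move assignment transfers the flag, then tests the code.
bool assignment_transfers_flag() {
  erc src(-2);
  erc dst(0);
  dst = std::move(src);
  bool transferred = !src.is_active() && dst.is_active();
  int code = dst;         // testing dst keeps it from throwing at scope exit
  return transferred && code == -2;
}
```

If both a and b were active in throws_after_move(), the second destructor would throw while the first exception is still propagating and the program would be terminated, which is why having a single active object matters.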

  Name                 Value  Action
  ERROR_PRI_SUCCESS    0      always not logged, not thrown
  ERROR_PRI_INFO       1      default not logged, not thrown
  ERROR_PRI_NOTICE     2      default not logged, not thrown
  ERROR_PRI_WARNING    3      default logged, not thrown
  ERROR_PRI_ERROR      4      default logged, thrown
  ERROR_PRI_CRITICAL   5      default logged, thrown
  ERROR_PRI_ALERT      6      default logged, thrown
  ERROR_PRI_EMERG      7      always logged, thrown
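To make the interplay between erc and errfac more concrete, here is a much simplified sketch. Only the names errfac, raise, log and the ERROR_PRI_ constants come from the description above. The member names, the default levels (chosen so that the defaults in the table come out right), the in-memory message list and the helper function are assumptions of this sketch, and it does not enforce the two "always" rows of the table.

```cpp
#include <string>
#include <vector>

// Severity names and values from the table above.
enum {
  ERROR_PRI_SUCCESS, ERROR_PRI_INFO, ERROR_PRI_NOTICE, ERROR_PRI_WARNING,
  ERROR_PRI_ERROR, ERROR_PRI_CRITICAL, ERROR_PRI_ALERT, ERROR_PRI_EMERG
};

class erc;

class errfac {
public:
  explicit errfac(const std::string& name) : name_(name) {}
  virtual ~errfac() {}

  // Assumed defaults: severities above NOTICE are logged,
  // severities above WARNING are thrown.
  int log_level = ERROR_PRI_NOTICE;
  int throw_level = ERROR_PRI_WARNING;

  void raise(const erc& e);
  virtual void log(const erc& e);

  std::vector<std::string> messages;  // stands in for a real log sink

private:
  std::string name_;
};

class erc {
public:
  erc(int val, errfac& fac, int sev = ERROR_PRI_ERROR)
    : value(val), severity(sev), facility(&fac), active(true) {}

  // Assumption: copies are inactive so the thrown copy does not raise again.
  erc(const erc& other)
    : value(other.value), severity(other.severity),
      facility(other.facility), active(false) {}

  // The destructor delegates the decision to the facility.
  ~erc() noexcept(false) {
    if (active && value != 0)
      facility->raise(*this);
  }

  operator int() { active = false; return value; }
  int code() const { return value; }
  int priority() const { return severity; }

private:
  int value;
  int severity;
  errfac* facility;
  bool active;
};

void errfac::raise(const erc& e) {
  if (e.priority() > log_level)
    log(e);
  if (e.priority() > throw_level)
    throw e;
}

void errfac::log(const erc& e) {
  messages.push_back(name_ + ": error " + std::to_string(e.code()));
}

// Hypothetical helper: lets an untested erc go out of scope and
// reports whether the facility decided to throw.
bool dropped_code_throws(errfac& fac, int severity) {
  try {
    erc e(-1, fac, severity);
  } catch (erc&) {
    return true;
  }
  return false;
}
```

Because log() is virtual, a derived facility could override it to produce more meaningful messages, and raising throw_level on one facility silences the exceptions of that group of errors without touching the others.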

By default the error codes are associated with a default facility, but one can create different facilities to group classes of errors. For instance, you can create a specialized error facility for all socket errors that knows how to translate the numerical error codes into meaningful messages.

Having different error levels can be useful for test or debugging purposes when one can vary the throwing or logging level for a class of errors.

A More Realistic Example #

The blog article mentioned before shows the basic layout of an HTTP client program:

  Status get_data_from_server(HostName host)
  {
    open_socket();
    if (failed)
      return failure();

    resolve_host();
    if (failed)
      return failure();

    connect();
    if (failed)
      return failure();

    send_data();
    if (failed)
      return failure();

    receive_data();
    if (failed)
      return failure();

    close_socket(); // potential resource leak
    return success();
  }

The issue here is that an early return can produce a resource leak because the socket is not closed. Let’s see how error codes could be used in this situation.

If we want to use exceptions the code could look like this:

  // functions declarations
  erc open_socket ();
  erc resolve_host ();
  erc connect ();
  erc send_data ();
  erc receive_data ();
  erc close_socket ();

  erc get_data_from_server(HostName host)
  {
    erc result; 
    try {
      //the first operation that fails triggers an exception
      open_socket ();
      resolve_host ();
      connect ();
      send_data ();
      receive_data ();
    } catch (erc& x) {
      result = x;         //return the failure code to our caller
    }

    close_socket ();      //cleanup
    return result;
  }

Without exceptions the same code can be written as:

  // functions declarations
  erc open_socket ();
  erc resolve_host ();
  erc connect ();
  erc send_data ();
  erc receive_data ();
  erc close_socket ();

  erc get_data_from_server(HostName host)
  {
    erc result; 
    
    (result = open_socket ())
    || (result = resolve_host ())
    || (result = connect ())
    || (result = send_data ())
    || (result = receive_data ());

    close_socket ();      //cleanup
    result.reactivate ();
    return result;
  }

In the fragment above, result has been converted to an integer because it has to participate in the logical OR expression. This conversion resets the activity flag so we have to explicitly turn it on again by calling the reactivate() function. If all the functions have been successful, result is 0 and, by convention, it will not throw an exception.
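For readers who want to try the chained version, here is a self-contained sketch with stubbed-out network functions. The simplified erc class, the demo namespace, the trace vector, the step() and run() helpers and the simulated failure of connect() are all assumptions made for this illustration; only the shape of get_data_from_server() and the reactivate() call follow the fragment above.

```cpp
#include <string>
#include <vector>

namespace demo {

// Simplified erc, sufficient for this example only.
class erc {
public:
  erc(int val = 0) : value(val), active(true) {}
  erc(const erc& other) : value(other.value), active(false) {}
  erc(erc&& other) noexcept : value(other.value), active(other.active) {
    other.active = false;
  }
  erc& operator=(erc&& other) noexcept {
    if (this != &other) {
      value = other.value;
      active = other.active;
      other.active = false;
    }
    return *this;
  }
  ~erc() noexcept(false) {
    if (active && value != 0)
      throw *this;
  }
  operator int() { active = false; return value; }
  void reactivate() { active = true; }

private:
  int value;
  bool active;
};

std::vector<std::string> trace;  // records which steps were executed

erc step(const char* name, int code) {
  trace.push_back(name);
  return code;
}

// Stubs: every step succeeds except connect().
erc open_socket()  { return step("open_socket", 0); }
erc resolve_host() { return step("resolve_host", 0); }
erc connect()      { return step("connect", -3); }  // simulated failure
erc send_data()    { return step("send_data", 0); }
erc receive_data() { return step("receive_data", 0); }
erc close_socket() { return step("close_socket", 0); }

erc get_data_from_server() {
  erc result;

  // A non-zero (failed) result converts to true and stops the chain.
  (result = open_socket())
  || (result = resolve_host())
  || (result = connect())
  || (result = send_data())
  || (result = receive_data());

  close_socket();       // cleanup always runs
  result.reactivate();  // the conversions above reset the activity flag
  return result;
}

// Runs the sequence and tests the returned code.
int run() {
  trace.clear();
  return get_data_from_server();
}

}  // namespace demo
```

In this sketch demo::run() returns -3, and the trace shows that send_data() and receive_data() were skipped while close_socket() still ran.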

Final Touches #

The attached source code is production quality and reasonably well optimized. Hopefully that doesn’t make it much harder to use. The demo project is a C++ wrapper for the popular SQLite database. It is much bigger in size just because it includes the latest version (as of this writing) of the SQLite code. Both the source code and the demo project include Doxygen documentation.

History #

12-Nov-2019 Initial version