An improved assert()

assert() is a very useful tool for ensuring that pre- and post-conditions, as well as invariants, are met upon calling or exiting a function. If you have never used assertions before, start using them now – they will help you find bugs in your own code, and quickly highlight when code is not used in the way originally intended.

Still, there are a few bits missing in the standard assert(), which is why we will try to build an improved version of it today.

Specifically, the features I miss in the standard assert() are the following:

  • No way to output a (formatted) message to the user.
  • When a debugger is connected, assert() does not halt execution on the line where it fired.
  • No way to show the values of the variables used in the assert(); only the whole condition is shown.

Consider the following example which will demonstrate the above:

// original assert:
assert(GetSize() > 0);

// improved assert:
ME_ASSERT(GetSize() > 0, "Cannot pop a value from an empty FIFO.")(GetSize(), m_end, m_read, m_fillCount);

Using the standard assert(), all you know is that the “Assertion GetSize() > 0 failed.” If the assert fired on anybody else’s PC, this is not really helpful. How large was the FIFO? What did GetSize() yield? Was the FIFO really empty, or were the internal pointers messed up because of e.g. a memory stomp? All questions unanswered.

What we would like is an improved assert() which is able to fill in those gaps, and provide answers to exactly these questions.

ME_ASSERT

The format I came up with for assertions in the Molecule Engine is as follows:

ME_ASSERT(condition, message, optional comma-separated list of message parameters)(optional comma-separated list of variables/values);

I tried a few different formats in the past, but always ran into problems with at least one feature – either it was impossible to trigger breakpoints after the message and variables had been logged, or it was impossible to completely get rid of all code in retail builds.

Let’s take a look at a few examples to see how we would like to use our assertion macro:

ME_ASSERT(from >= std::numeric_limits<TO>::min(), "Number to cast exceeds numeric limits.")(from);
ME_ASSERT(false, "Key %s could not be found.", key.c_str())();
ME_ASSERT(a < b, "a was not less than b")(a, b);

Before we worry about how to put that into a macro, let’s try to come up with the equivalent C++ code first.

The first piece of the puzzle is a mechanism which allows us to log a formatted message, and an optional, unlimited number of variables, in that order. A temporary class instance nicely fits the bill:

class Assert
{
public:
    // logs the formatted message
    Assert(const SourceInfo& sourceInfo, const char* format, ...);
};

// example usage
Assert(ME_SOURCEINFO, "Key %s could not be found.", key.c_str());

Keep in mind that you can still call methods on the temporary instance, so the following is perfectly valid code, and allows us to log an unlimited number of variables:

class Assert
{
public:
    // logs the formatted message
    Assert(const SourceInfo& sourceInfo, const char* format, ...);

    Assert& Variable(const char* const name, bool var);
    Assert& Variable(const char* const name, char var);
    Assert& Variable(const char* const name, short var);
    Assert& Variable(const char* const name, int var);
    // more overloads for built-in types...

    // generic
    template <typename T>
    Assert& Variable(const char* const name, const T& value);
};

// example usage
Assert(ME_SOURCEINFO, "a was not less than b.").Variable("a", a).Variable("b", b);
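
To make the chaining concrete, here is a minimal sketch, not the engine's actual code. It assumes a hypothetical AssertLog() string in place of the real logger, leaves out SourceInfo and printf-style formatting, and only implements the generic Variable() overload:

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the engine's logger: collects output in a string.
inline std::string& AssertLog()
{
    static std::string log;
    return log;
}

class AssertSketch
{
public:
    // logs the message (SourceInfo and formatting omitted for brevity)
    explicit AssertSketch(const char* message)
    {
        AssertLog() += message;
        AssertLog() += '\n';
    }

    // generic overload: logs "name = value" and returns *this so calls can be chained
    template <typename T>
    AssertSketch& Variable(const char* const name, const T& value)
    {
        std::ostringstream stream;
        stream << "  o Variable " << name << " = " << value << '\n';
        AssertLog() += stream.str();
        return *this;
    }
};

// Builds a temporary and chains two Variable() calls on it.
inline std::string LogTwoVariables(int a, int b)
{
    AssertLog().clear();
    AssertSketch("a was not less than b.").Variable("a", a).Variable("b", b);
    return AssertLog();
}
```

Because every Variable() call returns a reference to the same temporary, any number of calls can be appended in a single expression.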

Adding a breakpoint after everything has been logged is now surprisingly easy:

Assert(ME_SOURCEINFO, "a was not less than b.").Variable("a", a).Variable("b", b), ME_BREAKPOINT;

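The comma operator guarantees that its left operand is fully evaluated before the right one, so the Assert constructor and all Variable() calls have run before the breakpoint is triggered. Hence both the formatted message and the variables (and their values) will have been logged already. Note that the temporary itself is only destroyed at the end of the full expression, i.e. after the breakpoint.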
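
If you want to check the order of events yourself, the following sketch records them. Tracer and TraceBreakpoint() are hypothetical stand-ins for the Assert class and ME_BREAKPOINT. It shows that the constructor and the chained call run before the right-hand side of the comma, while the temporary's destructor runs last, at the end of the full expression. This means the logging has to happen in the constructor and in Variable(), not in the destructor:

```cpp
#include <string>
#include <vector>

// Records the order of events so it can be inspected afterwards.
inline std::vector<std::string>& Events()
{
    static std::vector<std::string> events;
    return events;
}

// Hypothetical stand-in for the Assert class.
struct Tracer
{
    Tracer()  { Events().push_back("construct"); }
    ~Tracer() { Events().push_back("destruct"); }
    Tracer& Variable() { Events().push_back("variable"); return *this; }
};

// Hypothetical stand-in for ME_BREAKPOINT.
inline void TraceBreakpoint() { Events().push_back("breakpoint"); }

inline std::vector<std::string> RunCommaExpression()
{
    Events().clear();
    // left operand of the comma first, then the right one, then the destructor
    Tracer().Variable(), TraceBreakpoint();
    return Events();
}
```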

One thing we need to make sure of is that the temporary Assert instance is created and the breakpoint triggered only if the condition is not met. This could be done by simply using an if-statement, but this wouldn’t allow us to “return” a value from an assertion, like the following:

int* ptr = ME_ASSERT_NOT_NULL(otherPtr);

So instead of using an if-statement, we use the conditional (?:) operator:

(a < b) ? (void)0 : (Assert(ME_SOURCEINFO, "a was not less than b.").Variable("a", a).Variable("b", b), ME_BREAKPOINT);

Note the extra parentheses before the Assert and after the breakpoint! Without these, ME_BREAKPOINT would not be part of the third operand of the conditional operator, but rather the right-hand side of a separate comma expression, meaning that a breakpoint would always be triggered, no matter if the condition was met or not (the comma operator has lower precedence than the conditional operator).
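
The effect of the parentheses can be seen in a small sketch. NoOpLog() and CountingBreakpoint() are hypothetical stand-ins for the Assert temporary and ME_BREAKPOINT:

```cpp
// Counts how often the stand-in breakpoint was hit.
inline int& BreakpointHits()
{
    static int hits = 0;
    return hits;
}

inline void CountingBreakpoint() { ++BreakpointHits(); }
inline void NoOpLog() {}

// With parentheses: logging and breakpoint both belong to the "false" branch.
inline int HitsWithParentheses(bool condition)
{
    BreakpointHits() = 0;
    condition ? (void)0 : (NoOpLog(), CountingBreakpoint());
    return BreakpointHits();
}

// Without parentheses, this parses as
// (condition ? (void)0 : NoOpLog()), CountingBreakpoint();
inline int HitsWithoutParentheses(bool condition)
{
    BreakpointHits() = 0;
    condition ? (void)0 : NoOpLog(), CountingBreakpoint();
    return BreakpointHits();
}
```

With the parentheses, the breakpoint is hit only when the condition is false. Without them, it is hit on every call.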

Well, this part of implementing our own assert was easy. Now we have to turn it into a macro, which is arguably a bit harder and requires some preprocessor trickery.

First, let’s not worry about the variables and the breakpoint yet. Let’s try to stuff the left part of the example above into a macro:

#define ME_ASSERT(condition, format, ...)        (condition) ? (void)0 : Assert(ME_SOURCEINFO, "Assertion \"" #condition "\" failed. " format, __VA_ARGS__)

As you can surely see, this will expand into the following:

// macro
ME_ASSERT(a < b, "a was not less than b, %s", "sadly");

// expansion
(a < b) ? (void)0 : Assert(ME_SOURCEINFO, "Assertion \"a < b\" failed. a was not less than b, %s", "sadly");

Simple.

Now let’s turn our attention to the optional list of variables. Because it supports printf-style formatted messages, our macro is already variadic (note the ... in the parameter list), so how can we offer an additional, variable number of arguments?

The following clearly doesn’t work:

#define ME_ASSERT(condition, format, ..., ...)  // huh?

There’s no way to distinguish where one list of arguments ends, and where the next starts. That is, a variable number of arguments (…) must always be the last argument to a macro. Bummer.

But by introducing an additional pair of parentheses, we can “start” the expansion of another (variadic) macro. You can think of the preprocessor as running additional passes as long as new function-like macros are detected, like in the following example:

#define ME_ASSERT(condition, format, ...)    (condition) ? (void)0 : Assert(ME_SOURCEINFO, "Assertion \"" #condition "\" failed. " format, __VA_ARGS__) ME_ASSERT_IMPL_VARS
#define ME_ASSERT_IMPL_VARS(...)             // whatever

See what we just did? Put on your preprocessor hat:

  • In the first pass, ME_ASSERT() gets expanded into some source code, ending in ME_ASSERT_IMPL_VARS. So if you write “ME_ASSERT(foo, bar, a, b, c)()” (note the parentheses), it will get expanded into “some_code ME_ASSERT_IMPL_VARS()”.
  • In the second pass, the preprocessor finds ME_ASSERT_IMPL_VARS() and recognizes it as another function macro. Hence, you can put a variable number of arguments into the parentheses, and they will be the arguments to the ME_ASSERT_IMPL_VARS macro.
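
The two steps above can be tried in isolation with a pair of toy macros. SKETCH_FIRST and SKETCH_SECOND are hypothetical names used only for illustration; each one simply turns its argument list into a string:

```cpp
#include <string>

// First macro: consumes its own argument list and ends in the *name* of a
// second function-like macro, without invoking it.
#define SKETCH_FIRST(...)   std::string(#__VA_ARGS__) + SKETCH_SECOND

// Second macro: picks up whatever parenthesized list follows in the source.
#define SKETCH_SECOND(...)  std::string(" | " #__VA_ARGS__)

inline std::string TwoLists()
{
    // expands to: std::string("a, b") + std::string(" | " "x, y, z")
    return SKETCH_FIRST(a, b)(x, y, z);
}
```

Each macro sees only its own parenthesized list, which is what lets a single "call" carry two independent variadic argument lists.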

Nifty! This way, we can have a variable number of arguments to the formatted message, and a variable number of arguments for our list of variables. The rest of the macro implementation is as follows:

#define ME_ASSERT_IMPL_VAR(variable)         .Variable(ME_PP_STRINGIZE(variable), variable)
#define ME_ASSERT_IMPL_VARS(...)             ME_PP_EXPAND_ARGS ME_PP_PASS_ARGS(ME_ASSERT_IMPL_VAR, __VA_ARGS__), ME_BREAKPOINT)
#define ME_ASSERT(condition, format, ...)    (condition) ? ME_UNUSED(true) : (Assert(ME_SOURCEINFO, "Assertion \"" #condition "\" failed. " format, __VA_ARGS__) ME_ASSERT_IMPL_VARS

The last piece of magic missing is the ME_PP_EXPAND_ARGS macro, which is part of the engine’s preprocessor library. It allows you to expand a variable number of arguments into anything you like, by “calling” another macro on each argument. Looking at the source will make it clear (ME_PP_NUM_ARGS was introduced in another post):

#define ME_PP_EXPAND_ARGS_1(op, a1)                op(a1)
#define ME_PP_EXPAND_ARGS_2(op, a1, a2)            op(a1) op(a2)
#define ME_PP_EXPAND_ARGS_3(op, a1, a2, a3)        op(a1) op(a2) op(a3)
#define ME_PP_EXPAND_ARGS_4(op, a1, a2, a3, a4)    op(a1) op(a2) op(a3) op(a4)
// and so on...

// variadic macro "dispatching" the arguments to the correct macro.
// the number of arguments is found by using ME_PP_NUM_ARGS(__VA_ARGS__)
#define ME_PP_EXPAND_ARGS(op, ...)        ME_PP_JOIN(ME_PP_EXPAND_ARGS_, ME_PP_NUM_ARGS(__VA_ARGS__)) ME_PP_PASS_ARGS(op, __VA_ARGS__)

Note that the argument op can be a macro itself! This means you can turn the arguments “(a, b, c)” into “.Variable("a", a).Variable("b", b).Variable("c", c)” using the following:

#define ME_ASSERT_IMPL_VAR(variable)         .Variable(ME_PP_STRINGIZE(variable), variable)
ME_PP_EXPAND_ARGS_3 ME_PP_PASS_ARGS(ME_ASSERT_IMPL_VAR, __VA_ARGS__)

Which is exactly what we use for automatically expanding our list of variables into Variable()-calls on the temporary Assert instance.
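
Since the helper macros are not shown here, the following is a self-contained sketch of the same dispatch idea. All SK_* names and the VariableLog class are hypothetical and not the engine's ME_PP_* implementation. The sketch assumes a standard-conforming preprocessor (GCC, Clang, or MSVC with /Zc:preprocessor), which is why it needs no equivalent of ME_PP_PASS_ARGS, and it only handles one to four arguments:

```cpp
#include <sstream>
#include <string>

// Hypothetical helpers; the engine's ME_PP_* macros are not shown in this post.
#define SK_STRINGIZE(x)          #x
#define SK_JOIN_IMPL(a, b)       a##b
#define SK_JOIN(a, b)            SK_JOIN_IMPL(a, b)

// Counts 1 to 4 arguments: the arguments shift a descending number list so
// that the right count lands in the "count" slot.
#define SK_NUM_ARGS_IMPL(a1, a2, a3, a4, count, ...)   count
#define SK_NUM_ARGS(...)         SK_NUM_ARGS_IMPL(__VA_ARGS__, 4, 3, 2, 1, 0)

#define SK_EXPAND_ARGS_1(op, a1)              op(a1)
#define SK_EXPAND_ARGS_2(op, a1, a2)          op(a1) op(a2)
#define SK_EXPAND_ARGS_3(op, a1, a2, a3)      op(a1) op(a2) op(a3)
#define SK_EXPAND_ARGS_4(op, a1, a2, a3, a4)  op(a1) op(a2) op(a3) op(a4)

// Dispatches to SK_EXPAND_ARGS_<N>, where N is the number of arguments.
#define SK_EXPAND_ARGS(op, ...)  SK_JOIN(SK_EXPAND_ARGS_, SK_NUM_ARGS(__VA_ARGS__))(op, __VA_ARGS__)

// Minimal receiver for the generated .Variable() calls.
class VariableLog
{
public:
    template <typename T>
    VariableLog& Variable(const char* const name, const T& value)
    {
        m_stream << name << "=" << value << ";";
        return *this;
    }

    std::string Text() const { return m_stream.str(); }

private:
    std::ostringstream m_stream;
};

#define SK_VAR(variable)   .Variable(SK_STRINGIZE(variable), variable)

inline std::string ExpandThree(int a, int b, int c)
{
    VariableLog log;
    // expands to: log .Variable("a", a) .Variable("b", b) .Variable("c", c);
    log SK_EXPAND_ARGS(SK_VAR, a, b, c);
    return log.Text();
}
```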

And that’s about it! We can have as many parameters for formatted messages as we want, have an additional number of variables which will be expanded by the preprocessor, and trigger a breakpoint after the assert has fired and everything has been logged.
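
Putting the pieces together, a stripped-down version of the whole mechanism might look like the sketch below. It is an approximation under several assumptions, not the Molecule Engine code:

  • All MINI_* and Mini* names are hypothetical.
  • The logger and the breakpoint are replaced by a string and a counter.
  • SourceInfo is omitted.
  • At least one and at most three variables must be listed.
  • The format string is taken as the first variadic argument and must be a string literal, which sidesteps the trailing comma when there are no format parameters.
  • A standard-conforming preprocessor is assumed.

```cpp
#include <cstdarg>
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the engine's logger and ME_BREAKPOINT.
inline std::string& MiniLog() { static std::string log; return log; }
inline int& MiniBreaks() { static int count = 0; return count; }
inline void MiniBreakpoint() { ++MiniBreaks(); }

class MiniAssert
{
public:
    // logs the printf-style formatted message
    MiniAssert(const char* format, ...)
    {
        char buffer[512];
        va_list args;
        va_start(args, format);
        std::vsnprintf(buffer, sizeof(buffer), format, args);
        va_end(args);
        MiniLog() += buffer;
        MiniLog() += '\n';
    }

    // logs one variable and returns *this for chaining
    template <typename T>
    MiniAssert& Variable(const char* const name, const T& value)
    {
        std::ostringstream stream;
        stream << "  o Variable " << name << " = " << value << '\n';
        MiniLog() += stream.str();
        return *this;
    }
};

#define MINI_STRINGIZE(x)                   #x
#define MINI_JOIN_IMPL(a, b)                a##b
#define MINI_JOIN(a, b)                     MINI_JOIN_IMPL(a, b)
#define MINI_NUM_ARGS_IMPL(a1, a2, a3, count, ...)   count
#define MINI_NUM_ARGS(...)                  MINI_NUM_ARGS_IMPL(__VA_ARGS__, 3, 2, 1, 0)
#define MINI_EXPAND_ARGS_1(op, a1)          op(a1)
#define MINI_EXPAND_ARGS_2(op, a1, a2)      op(a1) op(a2)
#define MINI_EXPAND_ARGS_3(op, a1, a2, a3)  op(a1) op(a2) op(a3)
#define MINI_EXPAND_ARGS(op, ...)           MINI_JOIN(MINI_EXPAND_ARGS_, MINI_NUM_ARGS(__VA_ARGS__))(op, __VA_ARGS__)

// The opening parenthesis in MINI_ASSERT is closed by MINI_ASSERT_VARS.
#define MINI_ASSERT_VAR(variable)   .Variable(MINI_STRINGIZE(variable), variable)
#define MINI_ASSERT_VARS(...)       MINI_EXPAND_ARGS(MINI_ASSERT_VAR, __VA_ARGS__), MiniBreakpoint())
#define MINI_ASSERT(condition, ...) (condition) ? (void)0 : (MiniAssert("Assertion \"" #condition "\" failed. " __VA_ARGS__) MINI_ASSERT_VARS

// Fires (or not) the sketch assertion and returns what was logged.
inline std::string FireMiniAssert(int a, int b)
{
    MiniLog().clear();
    MiniBreaks() = 0;
    MINI_ASSERT(a < b, "a was not less than b, %s", "sadly")(a, b);
    return MiniLog();
}
```

When the condition holds, nothing is logged and the stand-in breakpoint is not touched. When it fails, the message and both variables are logged before the breakpoint counter is incremented.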

Finally, this is what the output looks like in Visual Studio after an assert has fired:

..\src\Core\Platform\ProcessorInfo.cpp(417): [Assert] (ASSERT) Assertion "a > b" failed. a was not less than b, sadly
..\src\Core\Platform\ProcessorInfo.cpp(417): [Assert] (ASSERT)   o Variable a = 10 (int)
..\src\Core\Platform\ProcessorInfo.cpp(417): [Assert] (ASSERT)   o Variable b = 20 (int)
molecule_core_d.exe has triggered a breakpoint

The macro magic introduced in this post might be a bit overwhelming for people not used to “programming using the preprocessor”, so let me know if I should post a follow-up, explaining parts of Molecule’s preprocessor library. A good introduction can be found here.

6 thoughts on “An improved assert()”

  1. Thx a lot! I was writing my own Assert class a month ago but I never saw one before… This really helps me a lot in optimizing my own!
    Thx again!
    Jakub

  2. I really like this implementation and your logging implementation. Since their interfaces are very similar, I was thinking of doing a DebugMessage class that had a similar interface to Assert. Then the Assert class and logging class could take in a DebugMessage. Any thoughts on that?

    I was also wondering what the ME_BREAKPOINT expanded to? I’m guessing signal. Also is the assert macro itself ever called, I’m guessing in the destructor if anywhere.

    Another great article on your end!

    • Hmmm, I’m not really sure I understand your first question. What is the purpose of DebugMessage? Could you maybe provide some pseudo-code?

      Answering your other two questions, ME_BREAKPOINT triggers a breakpoint when a debugger is connected, and otherwise does nothing. Also it’s completely disabled in retail builds.

      It goes something like this:
      #if !ME_MASTER
      # define ME_BREAKPOINT (IsDebuggerConnected() ? __debugbreak() : ME_UNUSED(true))
      #else
      # define ME_BREAKPOINT ME_UNUSED(true)
      #endif

      IsDebuggerConnected() is a platform-dependent implementation which checks whether a debugger is connected or not (uses IsDebuggerPresent() on Windows).

      The assert() macro is never used, the Assert class forwards everything to the logger(s) and the installed assertion handler(s).

      And thanks for the feedback, greatly appreciated!

  3. Saw that you have a variant of this post for #AltDev. Glad to see you’re showing your older posts. They deserve a bigger audience.

    Anyways when I was reading your articles on logging and then assert I was thinking about the similarities in their interfaces.

    class Logger
    {
    virtual void Log(size_t channel, size_t type, size_t verbosity, const SourceInfo& SourceInfo, const char* format, va_list args) = 0;
    };

    class Assert
    {
    Assert(const SourceInfo& sourceInfo, const char* format, ...);
    };

    And I was thinking the handling of the variable number of arguments could be brought out into a separate class. So a DebugMessage would look like this, which was just lifted from part of the Assert interface.

    class DebugMessage
    {
    public:
    // logs the formatted message
    DebugMessage(const SourceInfo& sourceInfo, const char* format, ...);

    DebugMessage& Variable(const char* const name, bool var);
    DebugMessage& Variable(const char* const name, char var);
    DebugMessage& Variable(const char* const name, short var);
    DebugMessage& Variable(const char* const name, int var);
    // more overloads for built-in types…

    // generic
    template <typename T>
    DebugMessage& Variable(const char* const name, const T& value);
    };

    Then the const SourceInfo& sourceInfo, const char* format, … could be replaced by const DebugMessage& message.

    Sorry for the lack of source code initially. I was on my phone which makes it hard to spit out code.

    • Ah, that makes it a lot clearer :).
      Sounds reasonable to me, however I would not make the Variable() methods part of the DebugMessage interface. That’s really specific to the Assert implementation.

      The reason I’ve not done something similar is that my logging macros call a free function in a namespace (which takes …), which in turn forwards the arguments to the different registered loggers. Because you cannot forward an ellipsis (…) to another function taking an ellipsis (at least not in a portable way), I decided not to use a class like DebugMessage. But if it fits your needs, it’s certainly a good way to get rid of interface duplication.

      • I was thinking the logging could be done macro style to get the variable style declarations. That would also make the logging easier to rip out if need be. That would fill out all the variable stuff as well.

        I was always a fan of how Python does logging. So a free function you pass a string to which gets the logger based on the name passed. From there you act on the logger.

        Never understood the usage of channels in a logger. Seems like this could be solved better by just having multiple loggers.
