C++ Unit Testing

Unit testing in managed languages such as Java and C# is easy. The general theory is that you create a Test annotation and apply it to your test functions. Next, you create a series of assert functions that throw an AssertFailed exception if a value doesn’t match what you expect. Finally, you create a test runner that uses reflection to scan your assemblies or JARs for functions marked with your Test annotation and invokes them. The test runner just needs to catch any exceptions thrown by the test functions and report them somewhere. It doesn’t have to care too much whether the exception is a failed assert or something more serious such as a null pointer exception, as it can handle both in pretty much the same way. Tools such as NUnit or TestNG provide all this for you, so you will rarely need to write any of it yourself.

With C++, things aren’t quite so easy. There isn’t really any form of annotations or reflection so discovering tests is harder. You do have exceptions, but you might not be able to use them due to the environment you are using and you don’t get call stacks with them either. And anyway, you could just get a fatal crash deep in the code you’re testing before you ever get the chance to throw an exception from one of your assertions.

This doesn’t mean that you can’t get C++ unit testing frameworks with a similar level of functionality to the ones for managed languages. Google Test is a pretty good one, for example, and CppUnitLite2 is another very portable framework. I want to take a look at how a C++ unit testing framework could be implemented as I find it an interesting problem.

Goals

  • Easy to implement test functions that can be discovered by the test runner.
  • Assert functions that will tell me what test has failed along with a call stack.
  • Fatal application crashes won’t kill the test runner but are reported as failed tests along with a call stack.
  • Possible to plug into a continuous integration build server so that it can evaluate if a build is stable or not.

For my example framework, I’ll only be targeting Unix type platforms (Linux, Mac OS) as the methods I’ll be using are cleaner to implement there, which makes the theory easier to explain. This also allows me to provide a sample that will work on Ideone so you can have a play with the framework and see it running without needing to download any code.

The framework I present here takes its inspiration from Google Test so I highly recommend taking a look at that.

The Sample Framework

You can try out my sample framework on Ideone. Due to Ideone being primarily a code fiddle site to try out ideas, all your code must live in a single source file so don’t judge the structure too harshly! Normally you would separate everything out a bit and have clear interfaces between the test runner and your tests.

Test Function Registration

This is achieved by defining a macro to generate the test function declaration. The macro also creates a static object that contains the test function details and registers itself with the test runner in its constructor. The test function details contain the name of the function, the source file, line number and a pointer to the function to execute. They can then be stored in a simple linked list for the test runner to iterate over when it comes to run the tests. By using static objects, we can ensure that all our tests are registered automatically before main() is executed. This saves us from maintaining a set up function that explicitly lists every test function and has to be updated as new tests are added.

Test Reference Class

//-----------------------------------------------------------------------------------------//
// Class for storing reference details for a test function
// Test references are stored in a simple linked list
//-----------------------------------------------------------------------------------------//
// Type def for test function pointer
typedef void (*pfnTestFunc)(void);

// Class to store test reference data
class TestRef
{
public:
	TestRef(const char * testName, const char * filename, int lineNumber, pfnTestFunc func)
	{
		function = func;
		name = testName;
		module = filename;
		line = lineNumber;
		next = NULL;

		// Register this test function to be run by the main process
		registerTest(this);
	}

	pfnTestFunc function;	// Pointer to test function
	const char * name;		// Test name
	const char * module;	// Module name
	int line;				// Module line number
	TestRef * next;			// Pointer to next test reference in the linked list
};

// Linked list to store test references
static TestRef * s_FirstTest = NULL;
static TestRef * s_LastTest = NULL;

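This is a pretty simple class as it doesn’t need to do much more than register itself. In my sample, registerTest() is a global function that just adds the object to the linked list. In a real source file, registerTest() needs at least a forward declaration ahead of the class so that the constructor can call it.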

// Add a new test to the linked list
void registerTest(TestRef * test)
{
	if (s_FirstTest == NULL)
	{
		s_FirstTest = test;
		s_LastTest = test;
	}
	else
	{
		s_LastTest->next = test;
		s_LastTest = test;
	}
}

Test Registration Macro

// Macro to register a test function
#define TEST(name)													\
static void name();													\
static TestRef s_Test_ ## name(#name, __FILE__, __LINE__, name);	\
static void name()

It simply declares the test function prototype, constructs a static test reference object passing the function pointer into the constructor and then declares the first line of the function implementation. Here’s an example of using it:

TEST(MyTest)
{
    // Your test implementation...
}

When the macro is expanded by the preprocessor, it effectively becomes:

static void MyTest();
static TestRef s_Test_MyTest("MyTest", "example.cpp", 1, MyTest);
static void MyTest()
{
    // Your test implementation...
}

I’ve inserted line breaks to make it easier to read.

Test Execution

This isn’t the exact code I’ve used in my sample, but it’s doing pretty much the same thing.

TestRef * test = s_FirstTest;
while (test != NULL)
{
    test->function();
    // Report success or failure...
    test = test->next;
}

Assert Function

In my sample, I’ve just used an assert macro similar to one you’re probably already using in your own code.

// Assert macro for tests
#define TESTASSERT(cond)											\
do {																\
	if (!(cond)) {													\
		assertHandler(__FILE__, __LINE__, #cond);					\
	}																\
} while (0) // No trailing semicolon here: the caller supplies it

If the assert condition fails, the macro turns the condition into a string and passes it, along with the current file and line number, to an assert handler function that actually reports the failure.

This actually isn’t the best example for a unit testing framework as it’s really only testing for a true condition. If you were developing a fully featured framework, you would probably want more assert functions along the lines of ASSERT_EQUALS(actual,expected) and ASSERT_NOTEQUALS(actual,notexpected) so that you can report how the actual result from a test differs from what was expected. Implementing these types of functions isn’t too hard, so I won’t dwell on that now.
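As a rough illustration of the kind of assert described above, here is a minimal sketch of an equality check that reports both values. It is not part of the sample framework: the names TESTASSERT_EQUALS, describeMismatch() and reportFailure() are hypothetical, reportFailure() is only a stand-in for assertHandler() so the snippet runs on its own, and the macro assumes C++11 (for auto) and types that can be streamed with operator<<.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical helper: builds the message for a failed equality check.
// Works for any types that can be streamed with operator<<.
template <typename A, typename E>
std::string describeMismatch(const char * actualExpr, const char * expectedExpr,
                             const A & actual, const E & expected)
{
	std::ostringstream out;
	out << actualExpr << " == " << expectedExpr
	    << " (actual: " << actual << ", expected: " << expected << ")";
	return out.str();
}

// Stand-in for assertHandler(): it only prints and counts the failure
// so that this sketch can be run without the rest of the framework.
static int s_FailureCount = 0;

static void reportFailure(const char * file, int line, const std::string & message)
{
	std::fprintf(stdout, "\nAssert failed (%s:%d):\n    %s\n", file, line, message.c_str());
	++s_FailureCount;
}

// Hypothetical equality assert. Each argument is evaluated exactly once.
#define TESTASSERT_EQUALS(actual, expected) \
	do { \
		const auto & testActual_ = (actual); \
		const auto & testExpected_ = (expected); \
		if (!(testActual_ == testExpected_)) { \
			reportFailure(__FILE__, __LINE__, \
				describeMismatch(#actual, #expected, testActual_, testExpected_)); \
		} \
	} while (0)
```

In the process-per-test design, reportFailure() would dump the stack and call _exit(1) just as assertHandler() does, rather than counting failures.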

Assert Handler

// Handler for failed asserts
void assertHandler(const char * file, int line, const char * message)
{
	fprintf(stdout, "\nAssert failed (%s:%d):\n", file, line);
	fprintf(stdout, "    %s\n", message);
	fflush(stdout);

	dumpStack(1);

	_exit(1);
}

The function reports the location of the failed assert along with the failed condition before dumping a stack trace and exiting. The reason for calling _exit() is that my framework actually runs tests in a child process separate from the test runner (more on that later). This is also why I’ve used fprintf() with the stdout file handle rather than just using printf(). The child and parent processes share the same file handles, so I need to be explicit about where my output is going and when buffers are flushed so that I don’t get overlapping test output.

Dumping the Call Stack

For this, I’ve used a feature of glibc which is one of the reasons my sample is written for *nix.

// Dump a stack trace to stdout
void dumpStack(int topFunctionsToSkip)
{
	topFunctionsToSkip += 1; // We always want to skip this dumpStack() function
	void * array[64];
	int size = backtrace(array, 64); // backtrace() returns the number of frames captured as an int
	if (size > topFunctionsToSkip)
	{
		// Adjust the array pointer to skip n elements at top of stack; file descriptor 1 is stdout
		backtrace_symbols_fd(array + topFunctionsToSkip, size - topFunctionsToSkip, 1);
	}
}

I provide the ability to skip a number of calls at the top of the stack so that the assert and stack dumping functions aren’t reported in the call stack. The call stack is then written to stdout directly.

The function backtrace_symbols_fd() will attempt to resolve function symbols when it outputs the stack trace, but it can be a bit hit or miss with getting the names and will be affected by optimisation level. For the best chance of getting symbols out, you need to compile with the -g option and link with -rdynamic if using gcc. When I compile and run the sample on my Raspberry Pi, I get the following call stack for a failed assert:

Assert failed (main.cpp:89):
    1 == false
./a.out[0x8c44]
./a.out(_Z7runTestP7TestRef+0x70)[0x8ed8]
./a.out(main+0xb8)[0x8d40]
/lib/libc.so.6(__libc_start_main+0x11c)[0x403cc538]

As you can see, it’s managed to find the symbols for some functions but not the one at the very top of the call stack which is where our assert failed. Fortunately, we can use the addr2line tool to look up this function:

pi@raspberrypi:~/Devs/cunit_sample$ addr2line -e a.out -f -s 0x8c44
TestAssertFailed
main.cpp:90

Calling addr2line by hand can become quite tedious, so if you need to do this regularly it’s worth writing a script (e.g. in Python) to feed stack traces into addr2line, which is something I’ve done in the past.
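To give an idea of what such a helper has to do, here is a small sketch, written in C++ to match the rest of the sample rather than as a script. It is only an illustration: parseFrame() and addr2lineCommand() are hypothetical names, and it assumes the line format shown in the trace above (module path, optional symbol in parentheses, address in square brackets). It uses the raw address as printed, which suits frames like the ./a.out ones above; frames in shared libraries or position independent executables would probably need the address adjusting first.

```cpp
#include <string>

// Hypothetical helper: splits one backtrace line such as
//   ./a.out(_Z7runTestP7TestRef+0x70)[0x8ed8]
// into the module path and the address inside the square brackets.
// Returns false if the line doesn't look like a stack frame.
bool parseFrame(const std::string & line, std::string & module, std::string & address)
{
	const std::string::size_type open = line.rfind('[');
	const std::string::size_type close = line.rfind(']');
	if (open == std::string::npos || close == std::string::npos || close < open + 2)
	{
		return false;
	}

	address = line.substr(open + 1, close - open - 1);

	// The module path ends at the first '(' if there is one, otherwise at the '['
	const std::string::size_type paren = line.find('(');
	const std::string::size_type moduleEnd =
		(paren != std::string::npos && paren < open) ? paren : open;
	module = line.substr(0, moduleEnd);
	return !module.empty();
}

// Builds the addr2line command for one frame, using the same options as the
// manual call above. Returns an empty string for lines that aren't frames.
std::string addr2lineCommand(const std::string & line)
{
	std::string module;
	std::string address;
	if (!parseFrame(line, module, address))
	{
		return std::string();
	}
	return "addr2line -e " + module + " -f -s " + address;
}
```

Each line of captured test output could be passed through addr2lineCommand() and the resulting command run for the frames that are missing symbols.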

Sample Test Function

TEST(TestPass1)
{
	int x = 1;
	int y = 2;
	int z = x + y;
	TESTASSERT(z == 3);
}

Nothing too shocking there and hopefully very easy to implement.

Handling Fatal Crashes

The first C++ testing framework I wrote had all the tests running in the same process. If everything was working, this wouldn’t be a problem as all the tests would pass without incident. However, if there was a fatal crash (e.g. attempting to use a null pointer), the entire test application would crash, halting all the tests and making it very difficult to assess the overall code health. This can be resolved by signal handlers that wait for crash conditions and attempt to gracefully clear up so that the test runner can keep on running. However, I still ran into bugs that could screw up the stack or heap in fatal ways, leaving me no better off in these situations.

In this sample framework, I’ve borrowed an idea from Google Chrome in that I run each test in its own process. This way a test can mess up its own process as much as it wants and it’s completely isolated from any of the other tests. It also enforces good practice with your tests as you can’t have one test depending on the side effects of another test. Each test is completely independent and can be guaranteed to run in any order which makes them much easier to debug. In addition, it makes my crash handling code much simpler as I don’t need to do any more than report the error and exit the process. Simpler code is good in my opinion.

Signal Handler

// Handler for exception signals
void crashHandler(int sig)
{

	fprintf(stdout, "\nGot signal %d in crash handler\n", sig);
	fflush(stdout);

	dumpStack(1);

	_exit(1);
}

The handler uses the same stack dumping code as the assert handler and exits with a non-zero exit code to notify the parent test runner application that the test has failed.

The handler is registered with the following code in main():

	// Register crash handlers
	signal(SIGFPE, crashHandler);
	signal(SIGILL, crashHandler);
	signal(SIGSEGV, crashHandler);

Here, I’ve used the antiquated signal() interface when really, I should be using sigaction(). I’ve probably not registered all the signals that could indicate a fatal code bug either. This is something I may address in the future, but for now it provides a simple example of what I’m trying to achieve.
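For comparison, a registration based on sigaction() might look something like the sketch below. This is not what the sample does: installCrashHandlers() is a hypothetical helper, and the inclusion of SIGBUS and SIGABRT is an assumption about which extra signals are worth catching. SA_RESETHAND restores the default action once the handler has been entered, so a second crash inside the handler terminates the child instead of re-entering the handler.

```cpp
#include <signal.h>
#include <string.h>

// Hypothetical helper: installs `handler` for a set of fatal signals using
// sigaction(). Returns true only if every registration succeeded.
bool installCrashHandlers(void (*handler)(int))
{
	// SIGBUS and SIGABRT are additions to the three signals used in the sample (an assumption)
	static const int signals[] = { SIGFPE, SIGILL, SIGSEGV, SIGBUS, SIGABRT };

	struct sigaction action;
	memset(&action, 0, sizeof(action));
	action.sa_handler = handler;
	sigemptyset(&action.sa_mask);   // Don't block any extra signals while the handler runs
	action.sa_flags = SA_RESETHAND; // Back to the default action after the first delivery

	for (size_t i = 0; i < sizeof(signals) / sizeof(signals[0]); ++i)
	{
		if (sigaction(signals[i], &action, NULL) != 0)
		{
			return false;
		}
	}
	return true;
}
```

With this in place, the three signal() calls in main() would be replaced by a single installCrashHandlers(crashHandler) call.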

Spawning the Child Test Process

For simplicity, my test runner just forks itself as that’s one of the easiest ways to launch a child process on *nix. It also has the advantage of not needing to do much configuration in the child process in order to run the test.

I’ve wrapped the forking and running of a single test in a function to keep all the logic in one place:

// Wrapper function to run the test in a child process
bool runTest(TestRef * test)
{
	// Fork the process, the test will actually be run by the child process
	pid_t pid = fork();

	switch (pid)
	{
	case -1:
		fprintf(stderr, "Failed to spawn child process, %d\n", errno);
		exit(1); // No point running any further tests

	case 0:
		// We're in the child process so run the test
		test->function();
		exit(0); // Test passed, so exit the child with a success code

	default:{
		// Parent process, wait for the child to exit
		int stat_val;
		pid_t child_pid = wait(&stat_val);

		if (WIFEXITED(stat_val))
		{
			// Child exited normally so check the return code
			if (WEXITSTATUS(stat_val) == 0)
			{
				// Test passed
				return true;
			}
			else
			{
				// Test failed
				return false;
			}
		}
		else
		{
			// Child process crashed in a way we couldn't handle!
			fprintf(stdout, "Child exited abnormally!\n");
			return false;
		}

		break;}
	}
}

After the process is forked, the child process calls the test function referenced in the passed in TestRef object. If the function completes without incident, the child exits with a zero exit code to indicate success. The parent process waits for the child process to exit and then logs success or failure of the test based on the exit code of the child process.
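When the child exits abnormally, the status value also says which signal ended it, which can be handy when a crash gets past the signal handlers. The sketch below shows one way that could be reported. It is a variation on runTest(), not the sample’s code: describeChildStatus() and runAndDescribe() are hypothetical names, and waitpid() is used instead of wait() so that only the child that was just forked is waited for.

```cpp
#include <signal.h>    // Signal numbers such as SIGSEGV
#include <sstream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: turns a wait status into a short description
std::string describeChildStatus(int status)
{
	std::ostringstream out;
	if (WIFEXITED(status))
	{
		out << "exited with code " << WEXITSTATUS(status);
	}
	else if (WIFSIGNALED(status))
	{
		out << "killed by signal " << WTERMSIG(status);
	}
	else
	{
		out << "stopped or in an unknown state";
	}
	return out.str();
}

// Hypothetical variant of runTest(): runs `function` in a child process and
// returns a description of how the child ended instead of a pass/fail flag.
std::string runAndDescribe(void (*function)())
{
	pid_t pid = fork();
	if (pid == -1)
	{
		return "fork failed";
	}
	if (pid == 0)
	{
		// Child process: run the function and exit with a success code
		function();
		_exit(0);
	}

	// Parent process: wait for this specific child
	int status = 0;
	if (waitpid(pid, &status, 0) != pid)
	{
		return "wait failed";
	}
	return describeChildStatus(status);
}
```

A test would count as passed only when the description is "exited with code 0"; anything else can be printed next to the test name.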

The main test runner loop is:

	int testCount = 0;
	int testPasses = 0;

	// Loop round all the tests in the linked list
	TestRef * test = s_FirstTest;
	while (test != NULL)
	{
		// Print out the name of the test we're about to run
		fprintf(stdout, "%s:%s... ", test->module, test->name);
		fflush(stdout);

		testCount++;

		bool passed = runTest(test);
		if (passed == true)
		{
			testPasses++;
			fprintf(stdout, "Ok\n");
		}
		else
		{
			fprintf(stdout, "FAILED\n");
		}

		// Get the next test and loop again
		test = test->next;
	}

Plugging Into Build Server

This is just a case of following the Unix principle of your process returning 0 if you’re happy or non-zero if not. In my main() function, I keep a count of the number of tests run and the number of tests passed. I then have the following at the end of main():

	// Print out final report
	int exitCode;
	if (testPasses == testCount)
	{
		fprintf(stdout, "\n*** TEST SUCCESS ***\n");
		exitCode = 0;
	}
	else
	{
		fprintf(stdout, "\n*** TEST FAILED ***\n");
		exitCode = 1;
	}
	fprintf(stdout, "%d/%d Tests Passed\n", testPasses, testCount);

	return exitCode;

Pretty much every build server has the ability to launch external processes as part of a build and report a build failure if that process doesn’t exit with a zero code. It’s just a case of building your test framework as part of your normal build process and then executing it as a post build step. Everyone should be doing it!

Other Platforms and Future Improvements

As I mentioned earlier, this sample will only work on *nix platforms. However with a bit of work, most of these ideas can be ported to other platforms.

Call Stack Dumping

Although there is no standard way to get a stack trace, it’s been possible on every platform I’ve used so far, though some platforms make it easier than others.

For Windows, here’s one example. There’s also the CaptureStackBackTrace() function in the Windows API.

Fatal Exception Handling

I’ve already mentioned that I should switch to using sigaction() rather than signal() for registering my crash handlers.

On Windows, you could use Structured Exception Handling (SEH) to detect access violations and other fatal errors. Here’s a Stack Overflow question that covers some of the pros and cons of SEH. This is something that’s always going to be very platform specific so you may have to research this for yourself if you’re using something a bit more esoteric.

Child Process Spawning

This is one area I could put a lot more effort in. Currently, I’m only using fork() which isn’t available on all platforms and only gives me limited control over the child process. If instead I launched the child processes as completely separate processes that I attached to using stdout/stderr and specified which test to run using command line arguments, I’d have a much more portable solution. It would make debugging individual tests much easier as I could launch the test process directly from my debugger without needing to run through a complete test cycle. This would also give me more options over how I implemented my test runner as I could develop a GUI application in a completely different language if I wanted or implement distributed tests across multiple machines if my test cycle could take a long time. Finally, reading test output such as failed assertions and call stacks from stdout of the child process rather than letting the child write directly to stdout of the parent process would allow the test runner to present the output in a much nicer way or redirect it to a separate log file that only contained info about failed tests.

If I were to develop this sample further, this is an area I would certainly put more effort into.
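As a starting point for that idea, the sketch below shows how a separately launched test process might pick the test to run from its command line. Everything here is an assumption for illustration: the --run-test option, findTest() and runNamedTest() are hypothetical, and the TestRef struct is a cut-down stand-in for the class shown earlier.

```cpp
#include <cstring>

typedef void (*pfnTestFunc)(void);

// Cut-down stand-in for the TestRef class (no self-registration here)
struct TestRef
{
	const char * name;     // Test name
	pfnTestFunc function;  // Pointer to test function
	TestRef * next;        // Pointer to next test reference in the linked list
};

// Hypothetical lookup: returns the test with the given name, or NULL if there isn't one
TestRef * findTest(TestRef * first, const char * name)
{
	for (TestRef * test = first; test != NULL; test = test->next)
	{
		if (std::strcmp(test->name, name) == 0)
		{
			return test;
		}
	}
	return NULL;
}

// Hypothetical child entry point for a command line of the form:
//   runner --run-test <name>
// Returns the process exit code: 0 if the test ran to completion,
// 2 for bad usage or an unknown test name.
int runNamedTest(TestRef * first, int argc, char ** argv)
{
	if (argc != 3 || std::strcmp(argv[1], "--run-test") != 0)
	{
		return 2;
	}

	TestRef * test = findTest(first, argv[2]);
	if (test == NULL)
	{
		return 2;
	}

	test->function(); // A failed assert or crash exits the process with a non-zero code
	return 0;
}
```

The parent runner would then launch its own executable once per test with the test name as the argument, and the same command line could be used from a debugger to run a single test.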

More Restrictive Platforms

A few platforms I’ve worked on have only supported running a single process at a time. Launching another process results in the current running process being unloaded and completely replaced by the child process. This makes running tests in a background child process completely impossible. In these situations, I’d have the runTest() function run the test directly in the current process. The assert and crash handlers would also need to be updated to return control to the test runner in the case of a test failure. Your best bet would be to use normal C++ exceptions for this, but if you really don’t want to use them, you could use setjmp()/longjmp(). Whichever way you go, fatal application crashes are likely to halt your test cycles early.
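A minimal sketch of the exception route, assuming exceptions are available on the platform, could look like this. TestFailure, assertHandlerInProcess() and runTestInProcess() are hypothetical names and not part of the sample.

```cpp
#include <cstdio>

// Hypothetical exception type thrown by the assert handler instead of calling _exit()
struct TestFailure
{
	const char * file;
	int line;
	const char * message;
};

// In-process version of the assert handler: unwinds back to the test runner
void assertHandlerInProcess(const char * file, int line, const char * message)
{
	TestFailure failure = { file, line, message };
	throw failure;
}

// In-process version of runTest(): returns true if the test ran to completion
bool runTestInProcess(void (*function)())
{
	try
	{
		function();
		return true;
	}
	catch (const TestFailure & failure)
	{
		std::fprintf(stdout, "\nAssert failed (%s:%d):\n    %s\n",
			failure.file, failure.line, failure.message);
		return false;
	}
	catch (...)
	{
		// Any other exception escaping the test also counts as a failure
		std::fprintf(stdout, "\nUnexpected exception thrown by test\n");
		return false;
	}
}
```

Note that this only covers failed asserts and stray exceptions. A genuine crash such as a null pointer access still takes the whole process down.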

If possible, I’d try to get the code I was testing to also compile on another platform such as Windows or Linux and perform most of my testing there. If you get to the point where all your tests are passing, running the tests on your target platform should just be a formality to make sure the platform specific parts of your code are also working.

Before/After Method Events

Something that I haven’t implemented in this sample but would be very easy to add would be before and after method events so that common set up and tear down code could be automatically called by the test runner. This is a standard feature of just about every other framework, so I wouldn’t consider a framework I wrote complete without it.
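One possible shape for this, sketched under the assumption that set up and tear down are plain functions like the tests themselves, is shown below. TestFixture and runWithFixture() are hypothetical names.

```cpp
#include <cstddef>

typedef void (*pfnTestFunc)(void);

// Hypothetical pair of hooks shared by a group of tests. Either may be NULL.
struct TestFixture
{
	pfnTestFunc setUp;     // Called before the test function
	pfnTestFunc tearDown;  // Called after the test function returns
};

// Runs one test with its set up and tear down hooks around it
void runWithFixture(const TestFixture & fixture, pfnTestFunc test)
{
	if (fixture.setUp != NULL)
	{
		fixture.setUp();
	}

	test();

	if (fixture.tearDown != NULL)
	{
		fixture.tearDown();
	}
}
```

In the process-per-test design, the child process would call runWithFixture() instead of calling the test function directly. A failed assert exits the child before the tear down hook runs, which is harmless for memory owned by that process but matters for anything external such as temporary files.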
