Interim Unit Testing Guide

* Unit tests declared in the module-level CMakeLists.txt file are now built and executed as part of the commit-level builds performed by Jenkins. (12 November 2015)

Current Status --- Shows the status of tests declared in each module, indicating how many remain in the legacy state versus how many have been converted to use casa_add_unit_test. Eventually it will show test coverage as well.

Introduction

Testing is a large area within software development. As with anything so heavily studied, there is a plethora of terms and definitions for the various aspects of testing; because there are many different schools of thought, the definitions are sometimes at odds with one another, particularly when terms from different schools are mixed together. Here is a simple, operational hierarchy that should serve us. At the top is acceptance or validation testing, which checks whether the software does what the end user wanted; this is what our scientific testers perform for us. Next down are system tests, which do end-to-end testing of the software; we've been calling those regression tests (however, the accepted definition of regression testing is any testing done to determine whether errors have been injected into previously working software). The next level down can be called integration testing: it ensures that the various modules or components work together. At the bottom is unit testing, which focuses on a very small section of the software, often a small set of classes, a single class, or even a set of methods within a class. During initial development, a unit test can be used to verify that a small part of the software functions as expected, while during maintenance the existing unit tests serve as regression tests to warn that errors have been introduced.

A good suite of unit tests will make CASA easier to maintain and is an industry-wide best practice (see "Working Effectively with Legacy Code" by Michael Feathers). As is usual in computer science, the exact definition of "unit test" depends a bit on who you ask. However, the basic idea is that a unit test should be written by the developer, have a very small scope, and run quickly (see Unit Test by Martin Fowler for a slightly more in-depth discussion). The small scope can be a class, a very small set of classes, or even a subset of a class; the idea is that a single unit test suite should allow the developer to test a very limited and tightly related section of the code (i.e., you don't want to waste time testing code that is unrelated to the modifications you are currently making). The scope also relates to speed, since the tests could be run at a frequency near that of compilation. It is true that not all useful, limited-scope tests can be done quickly; the speed requirement does not mean that such tests shouldn't be created. Instead, these slow tests should be grouped into a second suite which can be run at a different time than the speedy unit tests. For CASA purposes we'll define these slower tests as module tests, since they are slow and/or have a somewhat larger scope than a unit test. It's a cost/benefit tradeoff: the speedy tests should catch a lot of the obvious errors while not being expensive to run, while the slower tests will run less frequently while catching important but less likely errors.

Related to both speed and scope is the level of isolation between the unit (e.g., class) under test and other units. Some schools of thought (the Mockists) believe that all unit tests should be entirely isolated from other units; to accomplish this, their unit tests substitute mock objects (i.e., dummy implementations) for the real external objects; these mock objects provide well-defined behavior for the units not under test. The idea is that when using mock objects any test failures are entirely due to the unit under test rather than possibly originating from other code units (assuming the mocks themselves were written correctly, of course). Speed can also argue for the use of mock objects; for example, a mock visibility iterator could provide very specific simulated data (a ramp, a delta function, etc.) to a class performing a unit test on a higher level of functionality such as imaging or flagging. The simulated data would be provided without requiring either a saved external MS or requiring that the external MS be created before running the unit test. I would lean toward using mock objects to replace external data and complicated objects, and otherwise rely on the unit tests for the other classes to catch errors in those classes.

When modifying a section of CASA, particularly one that you don't work in on a regular basis, it's easy to introduce a regression bug. As part of our process modernization, we want to make better use of unit testing. The ultimate goal is to provide a suite of unit tests which covers all parts of CASA. Of course, that's not going to happen overnight. As a first step, a mechanism is being implemented to better support declaring tests to the build and test system so that they can be run and reported automatically. At the same time, we want to see if we can make use of the existing unit tests that appear in the various "test" directories in the codebase. Unfortunately, some of these do not build, and it's likely that some may no longer work once built. A set of new cmake macros has been defined which should help with both problems.

Going Forward

The first step is to get the existing unit/module tests online and provide a framework where new tests can be created and integrated into the CASA build and test system and the CASA software development process. Once the existing tests are up and running, we can start looking at test coverage (see also: amusing fable on test coverage levels) with an eye towards increasing the unit test coverage. We will also look at the tests to determine whether they are properly categorized as "unit" or "module" tests. When the testing infrastructure is in place, the unit tests will be run at commit time while the module tests will be run somewhat less frequently (e.g., nightly).

Google does a lot of unit testing and has studied it quite a bit. They report unit test coverage (in percent of possible branches) ranging from 56.6% for software written in C++ to 84.2% for Python; the weaker type system of Python probably explains the higher coverage for Python software. As with any metric, coverage numbers need to be taken in context, since there is a quality aspect to coverage as well as a quantity: if a small uncovered section of the code contains important logic, then the set of unit tests may not be very effective even with 90% coverage.

Plan Summary

  • Get current tests building and running or mark them as not useful.
  • Routinely add more unit tests when doing significant work on a part of a module.
  • Separate tests into module and unit categories and provide for automatic build and execution.
  • Make regular running of the relevant unit and module tests part of the development workflow.
  • Refactor some module tests to use mock objects to allow better separation of functionality and to eliminate I/O. Ideally, this may allow some module tests to become unit tests.
  • Determine unit test coverage and embark on development tasks (perhaps during the housekeeping phase?) to add test coverage in areas where it is lacking.

Legacy Tests

CASA has a number of legacy unit tests that are defined in the module CMakeLists.txt files (these are in the various code/module directories, such as code/flagging). These utilize the macros casa_add_test and casa_add_assay. The difference between the two macros is that a casa_add_test test is run directly by ctest, whereas casa_add_assay runs the test via a CASA shell script (code/install/assay); the assay script can do a number of things, but the most common use is to run the test and compare its output to a saved, "golden" version of the output. The build of these tests is excluded from the normal "make" default target ("all"); they are built and executed when make unit_test is done.

Because many of the tests do not build, the two macros have been redefined to keep the tests out of the new unit_test make target. Instead, a warning will be issued when running cmake indicating that the unit test is defined but not ready for production use. When a developer has fixed a unit test, they should change the definition in the CMakeLists.txt file to use the new macro casa_add_unit_test (see below).
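
For example, a conversion in a module's CMakeLists.txt might look something like this (the module and test names here are illustrative, not actual files in the tree):

```cmake
# Before: legacy declaration; the redefined macro only issues a warning at
# cmake time and keeps the test out of the unit_test target.
casa_add_test (flagging tFlagger.cc)

# After: the test has been vetted and now participates in "make unit_test".
casa_add_unit_test (MODULES flagging SOURCES tFlagger.cc)
```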

The existing tests can be built by using make unit_test_unready or make unit_test_unready_module (e.g., make unit_test_unready_synthesis). For now, this will only build them; to run them, find the executables below code/build/module.

Unit Tests

Two macros have been created which allow defining new unit tests to the build and test process: casa_add_unit_test and casa_add_google_test (syntax below). The Google Test macro ensures that tests built using Google Test can find the appropriate includes and libraries, but delegates most functionality to casa_add_unit_test. The casa_add_unit_test macro normally adds logic to build the test and run it when either "make unit_test" or "make unit_test_MODULE" is done.

Syntax

casa_add_unit_test (MODULES module [submodule [subsubmodule ...]] SOURCES source1 [source2 [source3 ...]] [LIBRARIES lib1 ...] [INCLUDE_DIRS dir1 ...] [NOT_READY])

Adds a unit test. The name of the test will be the basename of the first source file. Usually there will be a single source file and no libraries or include directories added (a feature used by other macros such as casa_add_google_test). The NOT_READY option is used to keep an unproven unit test out of the production unit test build and test process (usually used for legacy unit tests awaiting vetting). The next most complicated use would be to include other C++ source files that provide functionality possibly shared with other unit tests, etc.

Example

casa_add_unit_test (msvis VisibilityIterator_Test.cc)

Google Test

New unit tests should be written using the Google Test unit testing framework (start out by reading the provided primer). It's fairly easy to use and the documentation is pretty good.
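
For a flavor of what a test looks like, here is a minimal sketch (the function and test names are hypothetical; when the executable is linked against gtest_main, no main() function is needed):

```cpp
#include <gtest/gtest.h>

// Hypothetical function under test.
int addOne (int x) { return x + 1; }

TEST (AddOneTest, HandlesSmallValues)
{
    EXPECT_EQ (2, addOne (1));   // non-fatal: on failure the test is marked
                                 // as failed but execution continues
    ASSERT_EQ (42, addOne (41)); // fatal: on failure the test stops here
}
```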

Google Test has the concepts of errors and fatal errors. An error causes the result of the test case to be "fail" but does not prematurely end the test case's execution. A fatal error causes the result of the test case to be "fail" as well as stopping the execution of the test case. You should generate a fatal error when further execution of the test case is (or is likely to be) pointless or worse. The classic example is an operation that fails and leaves a pointer with a null value; continuing execution will likely result in dereferencing the pointer and cause a seg fault, so stopping the test case execution when the null pointer is encountered is the best approach. Fatal errors are caused by the ASSERT_* macros, while the EXPECT_* macros only cause errors.

If a fatal error is declared in a method, the method declaring the fatal error stops execution immediately. This is because the ASSERT_* macros set a testing state to indicate test case failure and then execute a return. If the method calling the routine which discovered the fatal error also needs to stop after the call, then it must wrap the call in ASSERT_NO_FATAL_FAILURE (theMethodCall), which will cause a return after theMethodCall:

void testingMethod ()
{
    Table * tablePointer;
    setupTablePointer (tablePointer);

    tablePointer->doWhatever();
}

void setupTablePointer (Table * & tablePointer)
{
    ASSERT_FALSE (todayIsMonday());

    tablePointer = new Table();

    tablePointer->initialize();
}

The above sequence will not seg fault in setupTablePointer, but it can cause testingMethod to seg fault: ASSERT_FALSE changes the global state to indicate failure and then does a return, leaving tablePointer uninitialized when testingMethod dereferences it. To avoid the seg fault, rewrite as:

void testingMethod ()
{
    Table * tablePointer;
    ASSERT_NO_FATAL_FAILURE (setupTablePointer (tablePointer));

    tablePointer->doWhatever();
}

void setupTablePointer (Table * & tablePointer)
{
    ASSERT_FALSE (todayIsMonday());

    tablePointer = new Table();

    tablePointer->initialize();
}

Module Tests

Module tests are tests whose scope is limited to a single module; they are expected to test parts of the module's functionality more extensively, and typically run longer than is appropriate for a unit test. They can be added with the macro casa_add_module_test; the syntax is otherwise the same as for casa_add_unit_test. Module tests are built whenever CASA code is built, but they are only run when the make targets module_test or module_test_MODULE (e.g., module_test_msvis) are used. The corresponding macro for Google Test based module tests is casa_add_google_module_test.

Adding a Google Test

Google tests are defined using the macro:

casa_add_google_test (MODULES module SOURCES source1 [source2] ... [LIBS library1 [library2]...])

Adds the google test to the specified module. The test will be built using the specified source files. The first source file will be used to name the test and will also be the name of the executable created. If any unusual libraries are needed they can be specified using the LIBS option.
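
For instance, a declaration in a module's CMakeLists.txt might read as follows (the source file name is hypothetical):

```cmake
# Builds the test executable from the listed source, links in the Google Test
# libraries, and registers the test with the msvis unit test targets.
casa_add_google_test (MODULES msvis SOURCES VisibilityIterator_GTest.cc)
```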

Demonstration Applications

Some CASA modules have demonstration applications that document the workings of a module by providing a working example. These applications reside in the test directory of the module, and the main file is named with an initial lowercase "d" (e.g., dDemoApp.cc). Originally, these were made known to the CASA build system by using the casa_add_assay macro, which was also used to define unit tests. The demo apps have now been converted over to use the casa_add_demo macro. When defined using this macro, the demo app will be built, but unlike unit tests there is no facility to execute them during any of CASA's automatic testing processes; this is appropriate since the intent of these applications is documentation, not testing. They are automatically built to help keep them in sync with the current codebase.

Useful make targets

  • unit_test — Builds and runs all of the unit tests
  • unit_test_module — Builds and runs the unit tests associated with the module. (e.g., make unit_test_msvis)
  • module — makes the specified modules plus any of the modules it depends on (e.g., make flagging)
  • unit_test_unready — attempts to build all of the legacy unit tests; does not attempt to run them
  • unit_test_unready_module — attempts to build all of the legacy unit tests for module; does not attempt to run them (e.g., make unit_test_unready_flagging)
  • SomeTest — makes the specified test executable (e.g., make VisibilityIterator_Gtest); no paths are necessary

Another useful trick is to change to the code directory or a module directory and issue ctest -N; this will list the tests known for the entire project or the module, respectively.

For a module with a large number of tests, it can be helpful to create a script (e.g., myUnitTests) which runs the subset of unit tests relevant to the files being worked on. This allows the important tests to be run easily and frequently. The full module suite of tests should be run before committing, at a minimum, and probably occasionally during the modification process.
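
Such a script might look like the following sketch (the build directory and test names are hypothetical; adjust them to the tests you actually care about):

```shell
#!/bin/sh
# myUnitTests: run just the unit tests relevant to the files being modified.
set -u

BUILD_DIR=${BUILD_DIR:-./build/msvis}    # hypothetical build location

for test in VisibilityIterator_Test MsRows_Test; do
    exe="$BUILD_DIR/$test"
    if [ -x "$exe" ]; then
        echo "Running $test"
        "$exe" || echo "FAILED: $test"
    else
        echo "Skipping $test (not built)"
    fi
done

STATUS=done
echo "myUnitTests: $STATUS"
```

Keeping the script under version control alongside the module makes it easy to rerun the same subset before every commit.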

-- JimJacobs - 2015-08-28
Topic revision: r22 - 2016-02-24, JimJacobs