This document describes the operation of the test scheduling framework in the pounder30 package. This document reflects pounder30 as of 2011-8-09. Authors: Darrick Wong <djwong@us.ibm.com> Lucy Liang <lgliang@us.ibm.com> Copyright (C) 2011 IBM. Contents ======== 1. Overview 2. Test Files 3. Build Scripts 4. Test Scripts 5. Scheduling Tests 6. Running Tests Repeatedly 7. The Provided Test Schedulers 8. Creating Your Own Test Scheduler 9. Including and Excluding Tests Overview ======== The scheduler in the original pounder release was too simplistic--it would kick off every test at once, simultaneously. There was no attempt to ramp up the machine's stress levels test by test, or to run only certain combinations, or even run the tests one by one before beginning the real load testing. In addition, the test scripts had a very simple pass/fail mechanism--failure was defined by a kernel panic/oops/bug, and passing was defined by the lack of that condition. There was no attempt to find soft failures--situations where a test program would fail, but without bringing the machine down. The test suite would not alert the user that these failures had occurred. Consequently, Darrick Wong rewrote the test scheduling framework to achieve several goals--first, to separate the test automation code from the tests themselves, to allow for more intelligent scheduling of tests, to give better summary reports of what passed (and what didn't), and finally to improve the load testing that this suite could do. Test Files ========== Each test should only need to provide three files: 1) build_scripts/<testname> - The build_scripts/ directory contains scripts that take care of checking for system requirements, downloading the relevant packages and binaries, and building any code necessary to run the subtests. See the "Build Scripts" section below for more information. 2) test_scripts/<testname> - The test_script/ directory contains scripts that take care of running the actual tests. See the "Test Scripts" section below for more information. 3) tests/.../[T|D]XX<testname> - The tests/ directory represents our unpackaged "test scheduler" (if your tests/ directory is empty, that means you haven't unpacked any test schedulers yet and will need run "make install" to unpack a scheduler - see "The Provided Test Schedulers" section for more information. The test_repo/ directory also provides an example of what an unpacked test scheduler should look like). The files in the tests/ directory are usually symlinks that point to files in test_scripts/. The order in which the subtests are run upon starting pounder depends on how the files in tests/ are named and organized. See the "Scheduling Tests" section below for more information. Note: <testname> should be the same in the build_scripts/, test_scripts/, and tests/ folders. (Example: build_scripts/subtest1, test_scripts/subtest1, and tests/D99subtest1 would be valid. build_scripts/subtest1, test_scripts/subtest1_different, and tests/D99subtest1 would not.) See "Scheduling Tests" below for a detailed description of naming rules for files in the tests/ directory. Build Scripts ============= As the name implies, a script in build_scripts/ is in charge of downloading and building whatever bits of code are necessary to make the test run. Temporary files needed to run a test should go in $POUNDER_TMPDIR. Third party source, packages, binaries should go in $POUNDER_OPTDIR. Third party packages can be fetched from the web or from a user-created cache, a web-accessible directory containing cached tarballs and files used for whatever it is you'll need to build. (see "$POUNDER_CACHE" in doc/CONFIGURATION for more information) Should there be a failure in the build script that is essential to the ability to run a test, the build script should exit with error to halt the main build process immediately. Also, be aware that distributing pre-built binary tarballs is not always a good idea. Though one could cache pre-built binary tarballs rather than source, it may not be a good idea because distros are not always good at ABI/library path compatibility, despite the efforts of LSB, FHS, etc. It is always safest to build your subtests from source on your target system. The build_scripts/ directory provides some examples. Test Scripts ============ A script in test_scripts/ is in charge of running the actual test. The requirements on test scripts are pretty light. First, the building of the test ought to go in the build script unless it's absolutely necessary to build a test component at run time. Any checking for system requirements should also go in the build script. Second, the script must catch SIGTERM and clean up after itself. SIGTERM is used by the test scheduler to stop tests. The third requirement is much more stringent: Return codes. The script should return 0 to indicate success, 1-254 to indicate failure (the common use is to signify the number of failures), and -1 or 255 to indicate that the there was a failure that cannot be fixed. Note: If a test is being run in a timed or infinite loop (see the "Running Tests Repeatedly" section below for details), returning -1 or 255 has the effect of cancelling all subsequent loops. Quick map of return codes to what gets reported: 0 = "PASS" -1 = "ABORT" 255 = "ABORT" anything else = "FAIL" Also note: If a test is killed by an unhandled signal, the test is reported as failing. Put any temporary files created during test run in $POUNDER_TMPDIR. The test_scripts/ directory provides some examples. Scheduling Tests ================ Everything under the tests/ directory is used for scheduling purposes. The current test scheduler borrows a System V rc script-like structure for specifying how and when tests should be run. Files under tests/ should have names that follow the this standard: [type][sequence number][name] "type" is the type of test. Currently, there are two types, 'D' and 'T'. 'T' signifies a test, which means that the scheduler starts the test, waits for the test to complete, and reports on its exit status. 'D' signifies a daemon "test", which is to say that the scheduler will start the test, let it run in the background, and kill it when it's done running all the tests in that directory. The "sequence number" dictates the order in which the test are run. 00 goes first, 99 goes last. Tests with the same number are started simultaneously, regardless of the type. "name" is just a convenient mnemonic to distinguish between tests. However, it should be the same as the corresponding name using in build_scripts and test_scripts. (A test with build script "build_scripts/subtest" and test script "test_scripts/subtest" should be defined as something like "tests/T00subtest" as opposed to "tests/T00whatever_i_feel_like") Test names must be unique! File system objects under the tests/ directory can be nearly anything-- directories, symbolic links, or files. The test scheduler will not run anything that doesn't have the execute bit set. If a FS object is a directory, then the contents of the directory are executed sequentially. Example: Let's examine the following test scheduler hierarchy: tests/ D00stats T01foo T01bar T02dir/ T00gav -> ../../test_scripts/gav T01hic -> ../../test_scripts/hic T03lat Let's see how the tests are run. The test scheduler will start off by scanning the tests/ directory. First it spawns D00stats and lets it run in the background. Next, T01foo and T01bar are launched at the same time; the scheduler will wait for both of them to complete before proceeding. Since T01foo is a file and not just a symbolic link, there is a fair chance that T01foo runs some test in a loop for a certain amount of time. In any case, the scheduler next sees T02dir and proceeds into it. In the T02dir/, we find two test scripts. First T00gav runs, followed by T01hic. Now there are no more tests to run in T02dir/, so the scheduler heads back up to the parent directory. T03lat is forked and allowed to run to completion, after which D00stats is killed, and the test suite exits. Running Tests Repeatedly ======================== Two helper programs are provided to run tests repeatedly, timed_loop and infinite_loop. (This version of pounder currently also includes a fancy_timed_loop.c file, but it's only meant to be used for the random_syscall and will most likely be merged with timed_loop.c in the future, so we will ignore it here for now.) 1. timed_loop timed_loop [-m max_failures] duration_in_seconds command [arguments] This program will run "command" with the given arguments repeated until the number of seconds given as "duration" has passed or the command has failed a total of "max_failures" times, whichever comes first. If the $MAX_FAILURES variable is set (defined in config, see CONFIGURATION for details), then the program will run until command has failed a total of $MAX_FAILURES time (as long as it's not overridden by the -m option). 2. infinite_loop infinite_loop [-m max_failures] command [arguments] This program runs "command" repeatedly until sent SIGTERM or the command has failed a total of "max_failures" times. If the $MAX_FAILURES variable is set (defined in config, see CONFIGURATION for details), then the program will run until command has failed a total of $MAX_FAILURES time (as long as it's not overridden by the -m option). Examples: 1. test_repo/T90ramp/D02build_kernel contains the following line: "$POUNDER_HOME/infinite_loop $POUNDER_HOME/test_scripts/build_kernel" which will run the build_kernel test script repeatedly until sent SIGTERM or until it has failed a total of $MAX_FAILURES times. "$POUNDER_HOME/infinite_loop -m 10 $POUNDER_HOME/test_scripts/build_kernel" would run the build_kernel test script repeatedly until sent SIGTERM or until it has failed 10 times, regardless of what $MAX_FAILURES is. 2. test_scripts/time_drift contains the following line: "$POUNDER_HOME/timed_loop 900 "$POUNDER_SRCDIR/time_tests/drift-test.py" $NTP_SERVER $FREQ" which will run the drift-test.py script ($NTP_SERVER and $FREQ are some args passed to drift-test.py) for 15 minutes or until it has failed a total of $MAX_FAILURES times. "$POUNDER_HOME/timed_loop -m 10 900 "$POUNDER_SRCDIR/time_tests/drift-test.py" $NTP_SERVER $FREQ" would run the drift-test.py script for 15 minutes or until it has failed 10 times, regardless of what $MAX_FAILURES is. The Provided Test Schedulers ============================ This version of pounder provides 3 test schedulers: the "default," "fast," and "test" test schedulers. The tarred versions can be found in the schedulers/ directory as default-tests.tar.gz, fast-tests.tar.gz, and test-tests.tar.gz respectively. To unpack a test scheduler, run "make install" in the pounder/ directory and enter the name of the scheduler you would like to unpack at the first prompt. Example of unpacking the "fast" test scheduler: # make install ./Install Looking for tools...make g++ lex gcc python wget sudo diff patch egrep rm echo test which cp mkdir . All tools were found. WHICH TEST SCHEDULER SETUP DO YOU WANT TO UNPACK? [Choose from: default-tests.tar.gz fast-tests.tar.gz test-tests.tar.gz] [Or simply press ENTER for the default scheduler] Scheduler selection: fast Descriptions of the provided test schedulers: 1. default - provides a general purpose stress test, runs for 48 hours unless the -d option is used when starting pounder. 2. fast - basically the same as default, except it runs for 12 hours by default. 3. test - provides a set of useless tests. Each test simply passes, fails, aborts, or sleeps for some period of time. They don't do anything useful but can be used to see how the test scheduling setup works. Creating Your Own Test Schedulers ================================= From the pounder directory, place the desired tests in the tests/ directory according to the rules described in the "Scheduling Tests" section above. Then run the following command: ./pounder -c name_of_scheduler to create a new test scheduler, which will be tarred as name_of_scheduler-tests.tar.gz and placed in the schedulers/ directory. Example Usage: # ls ./schedulers default-tests.tar.gz fast-tests.tar.gz test-tests.tar.gz # ls ./tests T00hwinfo # ./pounder -c new_sched # ls ./schedulers default-tests.tar.gz fast-tests.tar.gz new_sched-tests.tar.gz test-tests.tar.gz After unpacking the "new_sched" test scheduler during install, the tests/ directory should contain the T00hwinfo subtest along with a tests/excluded/ directory (see the "Including and Excluding Tests" section below for details regarding the tests/excluded directory). Including and Excluding Tests ============================= After unpacking the test scheduler and building each individual test, running "./pounder" will automatically run every test included in the tests folder. If you would like to run only ONE test, run "./pounder ./tests/<some subtest>". If you would like to run a portion of tests, you can use the "./pounder -e" option to exclude certain subtests from subsequent pounder runs: Example: Suppose you have already ran "make install" and unpacked the default test scheduler. The tests/ directory should now contain the subtests to be run 1) ./pounder -l - lists all of the subtests that came with the currently active test scheduler. The output should look something like: ------------------ #./pounder -l Included subtests: ... .../ltp-full-xxxxxxxx/tools/pounder/tests/T10single/T00xterm_stress .../ltp-full-xxxxxxxx/tools/pounder/tests/T00hwinfo ... Excluded subtests: [NONE] ------------------ 2) ./pounder -e "tests/T10single/T00xterm_stress tests/T00hwinfo" - will exclude T00xterm_stress and T00hwinfo from any subsequent pounder runs. This command essentially moves the two tests from the "tests" folder to the "tests/excluded" folder for temporary storage, where they will remain until re-included back into the test scheduler (this is also why all test names should be unique). A file "tests/excluded/testlist" keeps track of which tests have been excluded from the test scheduler and what their original paths were. 3) ./pounder -l - should now output something like: ------------------ #./pounder -l Included subtests: ... Excluded subtests: T00xterm_stress T00hwinfo ------------------ 4) ./pounder -i "T00xterm_stress T00hwinfo" - will re-include these subtests back into the test scheduler. They will be moved from the tests/excluded folder back into the tests folder under their original paths.