335 lines
13 KiB
Plaintext
335 lines
13 KiB
Plaintext
|
The standard purpose of regression testing is to avoid getting the same
|
||
|
bug twice. When a bug is found, the programmer fixes the bug and adds a
|
||
|
test to the test suite. The test should fail before the fix and pass
|
||
|
after the fix. When a new version is about to be released, all the tests
|
||
|
in the regression test suite are run and if an old bug reappears, this
|
||
|
will be seen quickly since the appropriate test will fail.
|
||
|
|
||
|
The regression testing in GNU Go is slightly different. A typical test
|
||
|
case involves specifying a position and asking the engine what move it
|
||
|
would make. This is compared to one or more correct moves to decide
|
||
|
whether the test case passes or fails. It is also stored whether a test
|
||
|
case is expected to pass or fail, and deviations in this status signify
|
||
|
whether a change has solved some problem and/or broken something
|
||
|
else. Thus the regression tests both include positions highlighting some
|
||
|
mistake being done by the engine, which are waiting to be fixed, and
|
||
|
positions where the engine does the right thing, where we want to detect
|
||
|
if a change breaks something.
|
||
|
|
||
|
@menu
|
||
|
* Regression Testing:: Regression Testing in GNU Go
|
||
|
* Test Suites:: Test Suites
|
||
|
* Running the Regressions:: Running the Regression Tests
|
||
|
* Running regress.pike:: Running regress.pike
|
||
|
* Viewing with Emacs:: Viewing tests with Emacs
|
||
|
* HTML Views:: HTML Views
|
||
|
@end menu
|
||
|
|
||
|
@node Regression Testing
|
||
|
@section Regression testing in GNU Go
|
||
|
|
||
|
Regression testing is performed by the files in the @file{regression/}
|
||
|
directory. The tests are specified as GTP commands in files with the
|
||
|
suffix @file{.tst}, with corresponding correct results and expected
|
||
|
pass/fail status encoded in GTP comments following the test. To run a
|
||
|
test suite the shell scripts @file{test.sh}, @file{eval.sh}, and
|
||
|
@file{regress.sh} can be used. There are also Makefile targets to do
|
||
|
this. If you @command{make all_batches} most of the tests are run. The
|
||
|
Pike script @file{regress.pike} can also be used to run all tests or a
|
||
|
subset of the tests.
|
||
|
|
||
|
Game records used by the regression tests are stored in the
|
||
|
directory @file{regression/games/} and its subdirectories.
|
||
|
|
||
|
@node Test Suites
|
||
|
@section Test suites
|
||
|
|
||
|
The regression tests are grouped into suites and stored in files as GTP
|
||
|
commands. A part of a test suite can look as follows:
|
||
|
@example
|
||
|
@group
|
||
|
# Connecting with ko at B14 looks best. Cutting at D17 might be
|
||
|
# considered. B17 (game move) is inferior.
|
||
|
loadsgf games/strategy25.sgf 61
|
||
|
90 gg_genmove black
|
||
|
#? [B14|D17]
|
||
|
|
||
|
# The game move at P13 is a suicidal blunder.
|
||
|
loadsgf games/strategy25.sgf 249
|
||
|
95 gg_genmove black
|
||
|
#? [!P13]
|
||
|
|
||
|
loadsgf games/strategy26.sgf 257
|
||
|
100 gg_genmove black
|
||
|
#? [M16]*
|
||
|
@end group
|
||
|
@end example
|
||
|
|
||
|
Lines starting with a hash sign, or in general anything following a hash
|
||
|
sign, are interpreted as comments by the GTP mode and thus ignored by
|
||
|
the engine. GTP commands are executed in the order they appear, but only
|
||
|
those on numbered lines are used for testing. The comment lines starting
|
||
|
with @code{#?} are magical to the regression testing scripts and
|
||
|
indicate correct results and expected pass/fail status. The string
|
||
|
within brackets is matched as a regular expression against the response
|
||
|
from the previous numbered GTP command. A particular useful feature of
|
||
|
regular expressions is that by using @samp{|} it is possible to specify
|
||
|
alternatives. Thus @code{B14|D17} above means that if either @code{B14}
|
||
|
or @code{D17} is the move generated in test case 90, it passes. There is
|
||
|
one important special case to be aware of. If the correct result string
|
||
|
starts with an exclamation mark, this is excluded from the regular
|
||
|
expression but afterwards the result of the matching is negated. Thus
|
||
|
@code{!P13} in test case 95 means that any move except @code{P13} is
|
||
|
accepted as a correct result.
|
||
|
|
||
|
In test case 100, the brackets on the @code{#?} line is followed by an
|
||
|
asterisk. This means that the test is expected to fail. If there is no
|
||
|
asterisk, the test is expected to pass. The brackets may also be
|
||
|
followed by a @samp{&}, meaning that the result is ignored. This is
|
||
|
primarily used to report statistics, e.g. how many tactical reading
|
||
|
nodes were spent while running the test suite.
|
||
|
|
||
|
@node Running the Regressions
|
||
|
@section Running the Regression Tests
|
||
|
|
||
|
@code{./test.sh blunder.tst} runs the tests in @file{blunder.tst} and
|
||
|
prints the results of the commands on numbered lines, which may look
|
||
|
like:
|
||
|
|
||
|
@example
|
||
|
1 E5
|
||
|
2 F9
|
||
|
3 O18
|
||
|
4 B7
|
||
|
5 A4
|
||
|
6 E4
|
||
|
7 E3
|
||
|
8 A3
|
||
|
9 D9
|
||
|
10 J9
|
||
|
11 B3
|
||
|
12 C6
|
||
|
13 C6
|
||
|
@end example
|
||
|
|
||
|
This is usually not very informative, however. More interesting is
|
||
|
@code{./eval.sh blunder.tst} which also compares the results above
|
||
|
against the correct ones in the test file and prints a report for each
|
||
|
test on the form:
|
||
|
|
||
|
@example
|
||
|
1 failed: Correct '!E5', got 'E5'
|
||
|
2 failed: Correct 'C9|H9', got 'F9'
|
||
|
3 PASSED
|
||
|
4 failed: Correct 'B5|C5|C4|D4|E4|E3|F3', got 'B7'
|
||
|
5 PASSED
|
||
|
6 failed: Correct 'D4', got 'E4'
|
||
|
7 PASSED
|
||
|
8 failed: Correct 'B4', got 'A3'
|
||
|
9 failed: Correct 'G8|G9|H8', got 'D9'
|
||
|
10 failed: Correct 'G9|F9|C7', got 'J9'
|
||
|
11 failed: Correct 'D4|E4|E5|F4|C6', got 'B3'
|
||
|
12 failed: Correct 'D4', got 'C6'
|
||
|
13 failed: Correct 'D4|E4|E5|F4', got 'C6'
|
||
|
@end example
|
||
|
|
||
|
The result of a test can be one of four different cases:
|
||
|
|
||
|
@itemize @bullet
|
||
|
@item @code{passed}: An expected pass
|
||
|
|
||
|
This is the ideal result.
|
||
|
|
||
|
@item @code{PASSED}: An unexpected pass
|
||
|
|
||
|
This is a result that we are hoping for when we fix a bug. An old test
|
||
|
case that used to fail is now passing.
|
||
|
|
||
|
@item @code{failed}: An expected failure
|
||
|
|
||
|
The test failed but this was also what we expected, unless we were
|
||
|
trying to fix the particular mistake highlighted by the test case.
|
||
|
These tests show weaknesses of the GNU Go engine and are good places to
|
||
|
search if you want to detect an area which needs improvement.
|
||
|
|
||
|
@item @code{FAILED}: An unexpected failure
|
||
|
|
||
|
This should nominally only happen if something is broken by a
|
||
|
change. However, sometimes GNU Go passes a test, but for the wrong
|
||
|
reason or for a combination of wrong reasons. When one of these reasons
|
||
|
is fixed, the other one may shine through so that the test suddenly
|
||
|
fails. When a test case unexpectedly fails, it is necessary to make a
|
||
|
closer examination in order to determine whether a change has broken
|
||
|
something.
|
||
|
|
||
|
@end itemize
|
||
|
|
||
|
If you want a less verbose report, @code{./regress.sh . blunder.tst}
|
||
|
does the same thing as the previous command, but only reports unexpected
|
||
|
results. The example above is compressed to
|
||
|
|
||
|
@example
|
||
|
3 unexpected PASS!
|
||
|
5 unexpected PASS!
|
||
|
7 unexpected PASS!
|
||
|
@end example
|
||
|
|
||
|
For convenience the tests are also available as makefile targets. For
|
||
|
example, @code{make blunder} runs the tests in the blunder test suite by
|
||
|
executing @code{eval.sh blunder.tst}. @code{make all_batches} runs all
|
||
|
test suites in a sequence using the @code{regress.sh} script.
|
||
|
|
||
|
@node Running regress.pike
|
||
|
@section Running regress.pike
|
||
|
|
||
|
A more powerful way to run regressions is with the script
|
||
|
@file{regress.pike}. This requires that you have Pike
|
||
|
(@url{http://pike.ida.liu.se}) installed.
|
||
|
|
||
|
Executing @code{./regress.pike} without arguments will run all
|
||
|
testsuites that @code{make all_batches} would run. The difference is
|
||
|
that unexpected results are reported immediately when they have been
|
||
|
found (instead of after the whole file has been run) and that statistics
|
||
|
of time consumption and node usage is presented for each test file and
|
||
|
in total.
|
||
|
|
||
|
To run a single test suite do e.g. @code{./regress.pike nicklas3.tst} or
|
||
|
@code{./regress.pike nicklas3}. The result may look like:
|
||
|
@example
|
||
|
nicklas3 2.96 614772 3322 469
|
||
|
Total nodes: 614772 3322 469
|
||
|
Total time: 2.96 (3.22)
|
||
|
Total uncertainty: 0.00
|
||
|
@end example
|
||
|
The numbers here mean that the test suite took 2.96 seconds of processor
|
||
|
time and 3.22 seconds of real time. The consumption of reading nodes was
|
||
|
614772 for tactical reading, 3322 for owl reading, and 469 for
|
||
|
connection reading. The last line relates to the variability of the
|
||
|
generated moves in the test suite, and 0 means that none was decided by
|
||
|
the randomness contribution to the move valuation. Multiple testsuites
|
||
|
can be run by e.g. @code{./regress.pike owl ld_owl owl1}.
|
||
|
|
||
|
It is also possible to run a single testcase, e.g. @code{./regress.pike
|
||
|
strategy:6}, a number of testcases, e.g. @code{./regress.pike
|
||
|
strategy:6,23,45}, a range of testcases, e.g. @code{./regress.pike
|
||
|
strategy:13-15} or more complex combinations e.g. @code{./regress.pike
|
||
|
strategy:6,13-15,23,45 nicklas3:602,1403}.
|
||
|
|
||
|
There are also command line options to choose what engine to run, what
|
||
|
options to send to the engine, to turn on verbose output, and to use a
|
||
|
file to specify which testcases to run. Run @code{./regress.pike --help}
|
||
|
for a complete and up to date list of options.
|
||
|
|
||
|
@node Viewing with Emacs
|
||
|
@section Viewing tests with Emacs
|
||
|
|
||
|
To get a quick regression view, you may use the graphical
|
||
|
display mode available with Emacs (@pxref{Emacs}). You will
|
||
|
want the cursor in the regression buffer when you enter
|
||
|
@command{M-x gnugo}, so that GNU Go opens in the correct
|
||
|
directory. A good way to be in the right directory is to
|
||
|
open the window of the test you want to investigate. Then
|
||
|
you can cut and past GTP commands directly from the test to
|
||
|
the minibuffer, using the @command{:} command from
|
||
|
Emacs. Although Emacs mode does not have a coordinate grid,
|
||
|
you may get an ascii board with the coordinate grid using
|
||
|
@command{: showboard} command.
|
||
|
|
||
|
@node HTML Views
|
||
|
@section HTML Regression Views
|
||
|
|
||
|
Extremely useful HTML Views of the regression tests may be
|
||
|
produced using two perl scripts @file{regression/regress.pl}
|
||
|
and @file{regression/regress.plx}.
|
||
|
|
||
|
@enumerate
|
||
|
@item The driver program (regress.pl) which:
|
||
|
@itemize @bullet
|
||
|
@item Runs the regression tests, invoking GNU Go.
|
||
|
@item Captures the trace output, board position, and pass/fail status,
|
||
|
sgf output, and dragon status information.
|
||
|
@end itemize
|
||
|
@item The interface to view the captured output (regress.plx) which:
|
||
|
@itemize @bullet
|
||
|
@item Never invokes GNU Go.
|
||
|
@item Displays the captured output in helpful formats (i.e. HTML).
|
||
|
@end itemize
|
||
|
@end enumerate
|
||
|
|
||
|
@subsection Setting up the HTML regression Views
|
||
|
|
||
|
There are many ways configuring Apache to permit CGI scripts, all of them are
|
||
|
featured in Apache documentation, which can be found at
|
||
|
@url{http://httpd.apache.org/docs/2.0/howto/cgi.html}
|
||
|
|
||
|
Below you will find one example.
|
||
|
|
||
|
This documentation assumes an Apache 2.0 included in Fedora Core distribution,
|
||
|
but it should be fairly close to the config for other distributions.
|
||
|
|
||
|
First, you will need to configure Apache to run CGI scripts in the directory
|
||
|
you wish to serve the html views from. In @file{/etc/httpd/conf/httpd.conf}
|
||
|
there should be a line:
|
||
|
|
||
|
@code{DocumentRoot "/var/www/html"}
|
||
|
|
||
|
Search for a line @code{<Directory "/path/to/directory">}, where
|
||
|
@code{/path/to/directory} is the same as provided in @code{DocumentRoot},
|
||
|
then add @code{ExecCGI} to list of @code{Options}.
|
||
|
The whole section should look like:
|
||
|
|
||
|
@example
|
||
|
<Directory "/var/www/html">
|
||
|
...
|
||
|
Options ... ExecCGI
|
||
|
...
|
||
|
</Directory>
|
||
|
@end example
|
||
|
|
||
|
This allows CGI scripts to be executed in the directory used by regress.plx.
|
||
|
Next, you need to tell Apache that @file{.plx} is a CGI script ending. Your
|
||
|
@file{httpd.conf} file should contain a line:
|
||
|
|
||
|
@code{AddHandler cgi-script ...}
|
||
|
|
||
|
If there isn't already, add it; add @file{.plx} to the list of extensions,
|
||
|
so line should look like:
|
||
|
|
||
|
@code{AddHandler cgi-script ... .plx}
|
||
|
|
||
|
You will also need to make sure you have the necessary modules loaded to run
|
||
|
CGI scripts; mod_cgi and mod_mime should be sufficient. Your @file{httpd.conf}
|
||
|
should have the relevant @code{LoadModule cgi_module modules/mod_cgi.so} and
|
||
|
@code{LoadModule mime_module modules/mod_mime.so} lines; uncomment them if
|
||
|
necessary.
|
||
|
|
||
|
Next, you need to put a copy of @file{regress.plx} in the @code{DocumentRoot}
|
||
|
directory @code{/var/www/html} or it subdirectories where you plan to serve the
|
||
|
html views from.
|
||
|
|
||
|
You will also need to install the Perl module GD
|
||
|
(@url{http://search.cpan.org/dist/GD/}), available from CPAN.
|
||
|
|
||
|
Finally, run @file{regression/regress.pl} to create the xml data used to
|
||
|
generate the html views (to do all regression tests run
|
||
|
@file{regression/regress.pl -a 1}); then, copy the @file{html/} directory to
|
||
|
the same directory as @file{regress.plx} resides in.
|
||
|
|
||
|
At this point, you should have a working copy of the html regression views.
|
||
|
|
||
|
Additional notes for Debian users: The Perl GD module can be installed
|
||
|
by @code{apt-get install libgd-perl}. It may suffice to add this to
|
||
|
the apache2 configuration:
|
||
|
|
||
|
@example
|
||
|
<Directory "/var/www/regression">
|
||
|
Options +ExecCGI
|
||
|
AddHandler cgi-script .plx
|
||
|
RedirectMatch ^/regression$ /regression/regress.plx
|
||
|
</Directory>
|
||
|
@end example
|
||
|
|
||
|
and then make a link from @file{/var/www/regression} to the GNU Go
|
||
|
regression directory. The @code{RedirectMatch} statement is only
|
||
|
needed to set up a shorter entry URL.
|