Included gomill framework for SGF and GTP support, and sketched out SGF game-loading code.
This commit is contained in:
parent
700a6a2f32
commit
692dc294d6
119 changed files with 27458 additions and 3 deletions
376
gomill/docs/cem_tuner.rst
Normal file
376
gomill/docs/cem_tuner.rst
Normal file
|
@ -0,0 +1,376 @@
|
|||
.. |ce| replace:: :ref:`[CE] <ce_paper>`
|
||||
|
||||
The cross-entropy tuner
|
||||
^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
:setting:`competition_type` string: ``"ce_tuner"``.
|
||||
|
||||
The cross-entropy tuner uses the :dfn:`cross-entropy method` described in
|
||||
|ce|:
|
||||
|
||||
.. _ce_paper:
|
||||
|
||||
| [CE] G.M.J-B. Chaslot, M.H.M Winands, I. Szita, and H.J. van den Herik.
|
||||
| Cross-entropy for Monte-Carlo Tree Search. ICGA Journal, 31(3):145-156.
|
||||
| http://www.personeel.unimaas.nl/g-chaslot/papers/crossmcICGA.pdf
|
||||
|
||||
.. caution:: The cross-entropy tuner is experimental. It can take a very large
|
||||
number of games to converge.
|
||||
|
||||
|
||||
.. contents:: Page contents
|
||||
:local:
|
||||
:backlinks: none
|
||||
|
||||
|
||||
The tuning algorithm
|
||||
""""""""""""""""""""
|
||||
|
||||
The algorithm is not described in detail in this documentation. See |ce|
|
||||
section 3 for the description. The tuner always uses a Gaussian distribution.
|
||||
The improvement suggested in section 5 is not implemented.
|
||||
|
||||
|
||||
.. _ce parameter model:
|
||||
|
||||
The parameter model
|
||||
"""""""""""""""""""
|
||||
|
||||
The parameter values taken from the Gaussian distribution are floating-point
|
||||
numbers known as :dfn:`optimiser parameters`.
|
||||
|
||||
These parameters can be transformed before being used to configure the
|
||||
candidate (see 3.3 *Normalising Parameters* in |ce|). The transformed values
|
||||
are known as :dfn:`engine parameters`. The transformation is implemented using
|
||||
a Python :ce-setting:`transform` function defined in the control file.
|
||||
|
||||
Reports show engine parameters (see the :ce-setting:`format` parameter
|
||||
setting), together with the mean and variance of the corresponding optimiser
|
||||
parameter distribution in the form :samp:`{mean}~{variance}`.
|
||||
|
||||
|
||||
.. _the cem tuning algorithm:
|
||||
|
||||
.. _sample_cem_control_file:
|
||||
|
||||
Sample control file
|
||||
"""""""""""""""""""
|
||||
|
||||
Here is a sample control file, illustrating most of the available settings for
|
||||
a cross-entropy tuning event::
|
||||
|
||||
competition_type = "ce_tuner"
|
||||
|
||||
description = """\
|
||||
This is a sample control file.
|
||||
|
||||
It illustrates the available settings for the cross entropy tuner.
|
||||
"""
|
||||
|
||||
players = {
|
||||
'gnugo-l10' : Player("gnugo --mode=gtp --chinese-rules "
|
||||
"--capture-all-dead --level=10"),
|
||||
}
|
||||
|
||||
def fuego(max_games, additional_commands=[]):
|
||||
commands = [
|
||||
"go_param timelimit 999999",
|
||||
"uct_max_memory 350000000",
|
||||
"uct_param_search number_threads 1",
|
||||
"uct_param_player reuse_subtree 0",
|
||||
"uct_param_player ponder 0",
|
||||
"uct_param_player max_games %d" % max_games,
|
||||
]
|
||||
return Player(
|
||||
"fuego --quiet",
|
||||
startup_gtp_commands=commands+additional_commands)
|
||||
|
||||
FUEGO_MAX_GAMES = 1000
|
||||
|
||||
def exp_10(f):
|
||||
return 10.0**f
|
||||
|
||||
parameters = [
|
||||
Parameter('rave_weight_initial',
|
||||
# Mean and variance are in terms of log_10 (rave_weight_initial)
|
||||
initial_mean = -1.0,
|
||||
initial_variance = 1.5,
|
||||
transform = exp_10,
|
||||
format = "I: %4.2f"),
|
||||
|
||||
Parameter('rave_weight_final',
|
||||
# Mean and variance are in terms of log_10 (rave_weight_final)
|
||||
initial_mean = 3.5,
|
||||
initial_variance = 1.5,
|
||||
transform = exp_10,
|
||||
format = "F: %4.2f"),
|
||||
]
|
||||
|
||||
def make_candidate(rwi, rwf):
|
||||
return fuego(
|
||||
FUEGO_MAX_GAMES,
|
||||
["uct_param_search rave_weight_initial %f" % rwi,
|
||||
"uct_param_search rave_weight_final %f" % rwf])
|
||||
|
||||
board_size = 9
|
||||
komi = 7.5
|
||||
opponent = 'gnugo-l10'
|
||||
candidate_colour = 'w'
|
||||
|
||||
number_of_generations = 5
|
||||
samples_per_generation = 100
|
||||
batch_size = 10
|
||||
elite_proportion = 0.1
|
||||
step_size = 0.8
|
||||
|
||||
|
||||
|
||||
.. _cem_control_file_settings:
|
||||
|
||||
Control file settings
|
||||
"""""""""""""""""""""
|
||||
|
||||
The following settings can be set at the top level of the control file:
|
||||
|
||||
All :ref:`common settings <common settings>` (the :setting:`players`
|
||||
dictionary is required, though it is used only to define the opponent).
|
||||
|
||||
The following game settings (only :setting:`!board_size` and :setting:`!komi`
|
||||
are required):
|
||||
|
||||
- :setting:`board_size`
|
||||
- :setting:`komi`
|
||||
- :setting:`handicap`
|
||||
- :setting:`handicap_style`
|
||||
- :setting:`move_limit`
|
||||
- :setting:`scorer`
|
||||
|
||||
|
||||
The following additional settings (they are all required):
|
||||
|
||||
.. ce-setting:: candidate_colour
|
||||
|
||||
String: ``"b"`` or ``"w"``
|
||||
|
||||
The colour for the candidates to take in every game.
|
||||
|
||||
|
||||
.. ce-setting:: opponent
|
||||
|
||||
Identifier
|
||||
|
||||
The :ref:`player code <player codes>` of the player to use as the
|
||||
candidates' opponent.
|
||||
|
||||
|
||||
.. ce-setting:: parameters
|
||||
|
||||
List of :ce-setting-cls:`Parameter` definitions (see :ref:`ce parameter
|
||||
configuration`).
|
||||
|
||||
Describes the parameters that the tuner will work with. See :ref:`ce
|
||||
parameter model` for more details.
|
||||
|
||||
The order of the :ce-setting-cls:`Parameter` definitions is used for the
|
||||
arguments to :ce-setting:`make_candidate`, and whenever parameters are
|
||||
described in reports or game records.
|
||||
|
||||
|
||||
.. ce-setting:: make_candidate
|
||||
|
||||
Python function
|
||||
|
||||
Function to create a :setting-cls:`Player` from its engine parameters.
|
||||
|
||||
This function is passed one argument for each candidate parameter, and must
|
||||
return a :setting-cls:`Player` definition. Each argument is the output of
|
||||
the corresponding Parameter's :ce-setting:`transform`.
|
||||
|
||||
The function will typically use its arguments to construct command line
|
||||
options or |gtp| commands for the player. For example::
|
||||
|
||||
def make_candidate(param1, param2):
|
||||
return Player(["goplayer", "--param1", str(param1),
|
||||
"--param2", str(param2)])
|
||||
|
||||
def make_candidate(param1, param2):
|
||||
return Player("goplayer", startup_gtp_commands=[
|
||||
["param1", str(param1)],
|
||||
["param2", str(param2)],
|
||||
])
|
||||
|
||||
|
||||
.. ce-setting:: number_of_generations
|
||||
|
||||
Positive integer
|
||||
|
||||
The number of times to repeat the tuning algorithm (*number of iterations*
|
||||
or *T* in the terminology of |ce|).
|
||||
|
||||
|
||||
.. ce-setting:: samples_per_generation
|
||||
|
||||
Positive integer
|
||||
|
||||
The number of candidates to make in each generation (*population_size* or
|
||||
*N* in the terminology of |ce|).
|
||||
|
||||
|
||||
.. ce-setting:: batch_size
|
||||
|
||||
Positive integer
|
||||
|
||||
The number of games played by each candidate.
|
||||
|
||||
|
||||
.. ce-setting:: elite_proportion
|
||||
|
||||
Float between 0.0 and 1.0
|
||||
|
||||
The proportion of candidates to select from each generation as 'elite' (the
|
||||
*selection ratio* or *ρ* in the terminology of |ce|). A value between 0.01
|
||||
and 0.1 is recommended.
|
||||
|
||||
|
||||
|
||||
.. ce-setting:: step_size
|
||||
|
||||
Float between 0.0 and 1.0
|
||||
|
||||
The rate at which to update the distribution parameters between generations
|
||||
(*α* in the terminology of |ce|).
|
||||
|
||||
.. caution:: I can't find anywhere in the paper the value they used for
|
||||
this, so I don't know what to recommend.
|
||||
|
||||
|
||||
.. _ce parameter configuration:
|
||||
|
||||
Parameter configuration
|
||||
"""""""""""""""""""""""
|
||||
|
||||
.. ce-setting-cls:: Parameter
|
||||
|
||||
A :ce-setting-cls:`!Parameter` definition has the same syntax as a Python
|
||||
function call: :samp:`Parameter({arguments})`. Apart from :ce-setting:`!code`,
|
||||
the arguments should be specified using keyword form (see
|
||||
:ref:`sample_cem_control_file`).
|
||||
|
||||
The :ce-setting:`code`, :ce-setting:`initial_mean`, and
|
||||
:ce-setting:`initial_variance` arguments are required.
|
||||
|
||||
The arguments are:
|
||||
|
||||
|
||||
.. ce-setting:: code
|
||||
|
||||
Identifier
|
||||
|
||||
A short string used to identify the parameter. This is used in error
|
||||
messages, and in the default for :ce-setting:`format`.
|
||||
|
||||
|
||||
.. ce-setting:: initial_mean
|
||||
|
||||
Float
|
||||
|
||||
The mean value for the parameter in the first generation's distribution.
|
||||
|
||||
|
||||
.. ce-setting:: initial_variance
|
||||
|
||||
Float >= 0
|
||||
|
||||
The variance for the parameter in the first generation's distribution.
|
||||
|
||||
|
||||
.. ce-setting:: transform
|
||||
|
||||
Python function (default identity)
|
||||
|
||||
Function mapping an optimiser parameter to an engine parameter; see :ref:`ce
|
||||
parameter model`.
|
||||
|
||||
Examples::
|
||||
|
||||
def exp_10(f):
|
||||
return 10.0**f
|
||||
|
||||
Parameter('p1', initial_mean = …, initial_variance = …,
|
||||
transform = exp_10)
|
||||
|
||||
If the :ce-setting:`!transform` is not specified, the optimiser parameter is
|
||||
used directly as the engine parameter.
|
||||
|
||||
|
||||
.. ce-setting:: format
|
||||
|
||||
String (default :samp:`"{parameter_code}: %s"`)
|
||||
|
||||
Format string used to display the parameter value. This should include a
|
||||
short abbreviation to indicate which parameter is being displayed, and also
|
||||
contain ``%s``, which will be replaced with the engine parameter value.
|
||||
|
||||
You can use any Python conversion specifier instead of ``%s``. For example,
|
||||
``%.2f`` will format a floating point number to two decimal places. ``%s``
|
||||
should be safe to use for all types of value. See `string formatting
|
||||
operations`__ for details.
|
||||
|
||||
.. __: http://docs.python.org/release/2.7/library/stdtypes.html#string-formatting-operations
|
||||
|
||||
Format strings should be kept short, as screen space is limited.
|
||||
|
||||
Examples::
|
||||
|
||||
Parameter('parameter_1',
|
||||
initial_mean = 0.0, initial_variance = 1.0,
|
||||
format = "p1: %.2f")
|
||||
|
||||
Parameter('parameter_2',
|
||||
initial_mean = 5000, initial_variance = 250000,
|
||||
format = "p2: %d")
|
||||
|
||||
|
||||
Reporting
|
||||
"""""""""
|
||||
|
||||
Currently, there aren't any sophisticated reports.
|
||||
|
||||
The standard report shows the parameters of the current Gaussian distribution,
|
||||
and the number of wins for each candidate in the current generation.
|
||||
|
||||
After each generation, the details of the candidates are written to the
|
||||
:ref:`history file <logging>`. The candidates selected as elite are marked
|
||||
with a ``*``.
|
||||
|
||||
|
||||
Changing the control file between runs
|
||||
""""""""""""""""""""""""""""""""""""""
|
||||
|
||||
Some settings can safely be changed between runs of the same cross-entropy
|
||||
tuning event:
|
||||
|
||||
:ce-setting:`batch_size`
|
||||
safe to increase
|
||||
|
||||
:ce-setting:`samples_per_generation`
|
||||
not safe to change
|
||||
|
||||
:ce-setting:`number_of_generations`
|
||||
safe to change
|
||||
|
||||
:ce-setting:`elite_proportion`
|
||||
safe to change
|
||||
|
||||
:ce-setting:`step_size`
|
||||
safe to change
|
||||
|
||||
:ce-setting:`make_candidate`
|
||||
safe to change, but don't alter play-affecting options
|
||||
|
||||
:ce-setting:`transform`
|
||||
not safe to change
|
||||
|
||||
:ce-setting:`format`
|
||||
safe to change
|
||||
|
Loading…
Add table
Add a link
Reference in a new issue