4.4. Explorations

4.4.1. Exploration Base Class

class rlgraph.components.explorations.exploration.Exploration(epsilon_spec=None, noise_spec=None, scope='exploration', **kwargs)

Bases: rlgraph.components.component.Component

A Component that can be plugged on top of a Policy’s output to produce action choices. It includes noise and/or epsilon-based exploration options as well as an out-Socket to draw actions from the Policy’s distribution - either by sampling or by deterministically choosing the max-likelihood value.
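The action-choice logic described above can be illustrated with a minimal standalone sketch. It does not use RLgraph itself: the function name `choose_action`, its parameters, and the restriction to a discrete (categorical) distribution are assumptions made for illustration, and noise-based exploration is left out.

```python
import random


def choose_action(probs, do_explore, sample=True, rng=random):
    """Pick an action index from a categorical distribution `probs`.

    Hypothetical helper (not part of RLgraph):
      do_explore=True -> uniformly random action (epsilon-style exploration)
      sample=True     -> draw an action from the distribution
      sample=False    -> deterministic max-likelihood (argmax) action
    """
    n = len(probs)
    if do_explore:
        # Ignore the policy output and pick any action with equal probability.
        return rng.randrange(n)
    if sample:
        # Stochastic choice, weighted by the policy's probabilities.
        return rng.choices(range(n), weights=probs, k=1)[0]
    # Greedy choice: the index with the highest probability.
    return max(range(n), key=lambda i: probs[i])
```

For example, `choose_action([0.1, 0.7, 0.2], do_explore=False, sample=False)` returns index 1, the max-likelihood action.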

check_input_spaces(input_spaces, action_space=None)

Should check the Spaces of all of this Component’s in-Sockets. This method is called automatically by the Model once all of these Spaces are known at the Model’s build time.

Args:
    input_spaces (Dict[str,Space]): A dict with Space/shape information. Keys are in-Socket names (str); values are the associated Spaces.
    action_space (Optional[Space]): The action Space of the Agent/GraphBuilder. Can be used to construct and connect more Components (which rely on this information). This eliminates the need to pass the action Space information into many Components’ constructors.

4.4.2. EpsilonExploration Helper Class

class rlgraph.components.explorations.epsilon_exploration.EpsilonExploration(decay_spec=None, scope='epsilon-exploration', **kwargs)

Bases: rlgraph.components.component.Component

A component to handle epsilon-exploration functionality. It takes the current time step and outputs a bool indicating whether to explore (pick uniformly at random) or not (act greedily or sample). The time step is passed to an epsilon-decay component, which determines the current epsilon value between 1.0 and 0.0. That epsilon value is the probability with which the component outputs “True” (meaning: do explore) rather than “False” (meaning: do not explore).
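The step from time step to epsilon to a boolean decision can be sketched in plain Python. This is only an illustration under stated assumptions: the linear schedule, the names `linear_epsilon`, `num_timesteps`, `from_` and `to_`, and the default horizon of 10000 steps are hypothetical; in RLgraph the actual decay behavior is whatever `decay_spec` configures.

```python
import random


def linear_epsilon(time_step, num_timesteps=10000, from_=1.0, to_=0.0):
    """Linearly anneal epsilon from `from_` to `to_` over `num_timesteps`.

    Hypothetical schedule; time steps outside [0, num_timesteps] are clamped.
    """
    fraction = min(max(time_step, 0), num_timesteps) / num_timesteps
    return from_ + fraction * (to_ - from_)


def do_explore(time_step, rng=random, **decay_kwargs):
    """Return True with probability epsilon(time_step), otherwise False."""
    epsilon = linear_epsilon(time_step, **decay_kwargs)
    # rng.random() is uniform on [0.0, 1.0), so this is True with
    # probability epsilon.
    return rng.random() < epsilon
```

With these defaults, the decision is always to explore at time step 0 (epsilon is 1.0) and never to explore from time step 10000 onward (epsilon is 0.0); in between, exploration becomes steadily less likely.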

API:
    ins:
        time_step (int): The current time step.
    outs:
        do_explore (bool): The decision whether to explore (do_explore=True; pick uniformly at random) or to use a sample (or max-likelihood value) from a distribution (do_explore=False).

check_input_spaces(input_spaces, action_space=None)

Should check the Spaces of all of this Component’s in-Sockets. This method is called automatically by the Model once all of these Spaces are known at the Model’s build time.

Args:
    input_spaces (Dict[str,Space]): A dict with Space/shape information. Keys are in-Socket names (str); values are the associated Spaces.
    action_space (Optional[Space]): The action Space of the Agent/GraphBuilder. Can be used to construct and connect more Components (which rely on this information). This eliminates the need to pass the action Space information into many Components’ constructors.

do_explore(*args, **kwargs)
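As a rough illustration of the check_input_spaces contract for this component, the sketch below validates the Space attached to the `time_step` in-Socket. It is standalone and does not import RLgraph: `IntSpace` and `FloatSpace` are hypothetical stand-ins for real Space classes, and the exact checks performed by the library may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntSpace:
    """Hypothetical stand-in for an integer Space (not an RLgraph class)."""
    shape: tuple = ()


@dataclass(frozen=True)
class FloatSpace:
    """Hypothetical stand-in for a float Space (not an RLgraph class)."""
    shape: tuple = ()


def check_input_spaces(input_spaces, action_space=None):
    """Validate the Space of the `time_step` in-Socket.

    input_spaces maps in-Socket names (str) to their Spaces.
    action_space is accepted for signature parity but unused here.
    """
    space = input_spaces.get("time_step")
    if space is None:
        raise KeyError("missing in-Socket 'time_step'")
    # The time step is a single integer counter, so require a scalar int Space.
    if not isinstance(space, IntSpace) or space.shape != ():
        raise TypeError(
            "'time_step' must be a scalar integer Space, got %r" % (space,)
        )
```

Calling `check_input_spaces({"time_step": IntSpace()})` passes silently, while passing a `FloatSpace` for `time_step` raises a `TypeError`.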