4.4. Explorations
4.4.1. Exploration Base Class
class rlgraph.components.explorations.exploration.Exploration(epsilon_spec=None, noise_spec=None, scope='exploration', **kwargs)
Bases: rlgraph.components.component.Component
A Component that can be plugged on top of a Policy’s output to produce action choices. It includes noise and/or epsilon-based exploration options as well as an out-Socket to draw actions from the Policy’s distribution - either by sampling or by deterministically choosing the max-likelihood value.
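The action choice described above can be illustrated with a minimal sketch in plain Python. It is not RLgraph's implementation: the function `choose_action`, its parameters, and the use of a list of probabilities as a stand-in for the Policy's distribution are all assumptions made for illustration, and only discrete actions are covered.

```python
import random


def choose_action(probs, do_explore, deterministic=False, rng=random):
    """Pick an action index from a categorical policy output (hypothetical helper).

    probs: list of action probabilities, standing in for the Policy's distribution.
    do_explore: if True, ignore the policy and pick an action uniformly at random.
    deterministic: if True (and not exploring), take the max-likelihood action;
        otherwise sample from the distribution.
    """
    num_actions = len(probs)
    if do_explore:
        # Epsilon-style exploration: every action is equally likely.
        return rng.randrange(num_actions)
    if deterministic:
        # Max-likelihood choice: index of the largest probability.
        return max(range(num_actions), key=lambda i: probs[i])
    # Stochastic choice: sample an index in proportion to its probability.
    return rng.choices(range(num_actions), weights=probs, k=1)[0]
```

For example, `choose_action([0.1, 0.7, 0.2], do_explore=False, deterministic=True)` returns index 1, whereas `do_explore=True` returns any of the three indices with equal probability.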
check_input_spaces(input_spaces, action_space=None)
Should check the nature of the Spaces of all in-Sockets of this Component. This method is called automatically by the Model once all these Spaces are known, during the Model’s build time.
Args:
- input_spaces (Dict[str,Space]): A dict with Space/shape information. Keys are in-Socket names (str); values are the associated Spaces.
- action_space (Optional[Space]): The action Space of the Agent/GraphBuilder. Can be used to construct and connect more Components (which rely on this information). This eliminates the need to pass the action Space information into many Components’ constructors.
4.4.2. EpsilonExploration Helper Class
class rlgraph.components.explorations.epsilon_exploration.EpsilonExploration(decay_spec=None, scope='epsilon-exploration', **kwargs)
Bases: rlgraph.components.component.Component
A component that handles epsilon-exploration. It takes the current time step and outputs a bool indicating whether to explore (pick uniformly at random) or not (greedy or sampling). An epsilon-decay component uses the time step to determine the current epsilon value, between 1.0 and 0.0. This value is the probability with which the output is “True” (meaning: do explore) rather than “False” (meaning: do not explore).
API:
ins:
- time_step (int): The current time step.
outs:
- do_explore (bool): The decision whether to explore (do_explore=True; pick uniformly at random) or to use a sample (or max-likelihood value) from a distribution (do_explore=False).
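As a rough illustration of the time_step to do_explore mapping, the sketch below assumes a linear decay from 1.0 to 0.0. The actual schedule depends on `decay_spec` and is not specified here. The names `linear_epsilon`, `should_explore`, and `num_timesteps` are hypothetical and are not part of RLgraph.

```python
import random


def linear_epsilon(time_step, num_timesteps=10000, from_=1.0, to_=0.0):
    """Assumed linear decay of epsilon from `from_` to `to_` over `num_timesteps`."""
    # Clamp the time step so epsilon stays within [to_, from_].
    clamped = min(max(time_step, 0), num_timesteps)
    fraction = clamped / num_timesteps
    return from_ + fraction * (to_ - from_)


def should_explore(time_step, rng=random, **decay_kwargs):
    """Return True with probability epsilon(time_step), else False."""
    epsilon = linear_epsilon(time_step, **decay_kwargs)
    # rng.random() lies in [0.0, 1.0), so epsilon=1.0 always explores
    # and epsilon=0.0 never does.
    return rng.random() < epsilon
```

Under these assumptions, a call at `time_step=0` always explores, a call at or beyond `num_timesteps` never does, and the probability falls off linearly in between.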
check_input_spaces(input_spaces, action_space=None)
Should check the nature of the Spaces of all in-Sockets of this Component. This method is called automatically by the Model once all these Spaces are known, during the Model’s build time.
Args:
- input_spaces (Dict[str,Space]): A dict with Space/shape information. Keys are in-Socket names (str); values are the associated Spaces.
- action_space (Optional[Space]): The action Space of the Agent/GraphBuilder. Can be used to construct and connect more Components (which rely on this information). This eliminates the need to pass the action Space information into many Components’ constructors.
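The kind of check this hook is meant to perform might look like the sketch below for the time_step in-Socket. `FakeSpace` is a hypothetical stand-in and not RLgraph's Space class. The requirement that time_step be a scalar is also an assumption, since the source only states that it is an int.

```python
from dataclasses import dataclass


@dataclass
class FakeSpace:
    """Hypothetical stand-in for a Space: just a dtype name and a shape."""
    dtype: str
    shape: tuple = ()


def check_time_step_space(input_spaces):
    """Validate the Space connected to the 'time_step' in-Socket (illustrative only)."""
    space = input_spaces.get("time_step")
    if space is None:
        raise KeyError("No Space found for in-Socket 'time_step'.")
    if not space.dtype.startswith("int"):
        raise TypeError(
            "'time_step' must be an int Space, got dtype {}.".format(space.dtype)
        )
    if space.shape != ():
        # Assumption: a time step is a single scalar value.
        raise ValueError(
            "'time_step' must be a scalar, got shape {}.".format(space.shape)
        )
```

Running such checks once at build time means a mismatched connection fails early with a clear message, before any data flows through the Component.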
do_explore(*args, **kwargs)