4.9. Neural Networks

4.9.1. Stack Class

class rlgraph.components.neural_networks.stack.Stack(*sub_components, **kwargs)[source]

Bases: rlgraph.components.component.Component

A component container stack that incorporates one or more sub-components, some of whose API-methods (default: only apply) are automatically connected with each other (in the sequence in which the sub-Components are passed to the constructor), resulting in the API of the Stack. All sub-components’ API-methods need to match in the number of input and output values. E.g., the number of return values of the third sub-component’s API-method has to match the number of input parameters of the fourth sub-component’s API-method.
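
The chaining rule described above can be illustrated with a minimal pure-Python sketch. It does not use RLgraph and is not how Stack is implemented; MiniStack, Split and Add are hypothetical names invented for this example. It only shows why the number of return values of one sub-component must equal the number of inputs of the next.

```python
class MiniStack:
    """Toy stand-in for a Stack (hypothetical, not RLgraph code).

    Chains the `apply` methods of its sub-components in constructor order.
    """

    def __init__(self, *sub_components):
        self.sub_components = sub_components

    def apply(self, *inputs):
        values = inputs
        for component in self.sub_components:
            result = component.apply(*values)
            # Normalize to a tuple so the next component can unpack it.
            values = result if isinstance(result, tuple) else (result,)
        # Return a bare value for a single output, a tuple otherwise.
        return values[0] if len(values) == 1 else values


class Split:
    """One input, two outputs."""

    def apply(self, x):
        return x, x * 2


class Add:
    """Two inputs, one output: matches Split's two return values."""

    def apply(self, a, b):
        return a + b


stack = MiniStack(Split(), Add())
result = stack.apply(3)  # Split returns (3, 6); Add returns 3 + 6
```

In this sketch, swapping the order to MiniStack(Add(), Add()) fails with a TypeError on the second call, because Add produces one value while the next Add expects two.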

classmethod from_spec(spec=None, **kwargs)[source]

Uses the given spec to create an object. If spec is a dict, an optional “type” key can be used as a “constructor hint” to specify a certain class of the object. If spec is not a dict, spec’s value is used directly as the “constructor hint”.

The rest of spec (if it’s a dict) will be used as kwargs for the (to-be-determined) constructor. Additional keys in **kwargs will always have precedence (overwrite keys in spec (if a dict)). Also, if the spec-dict or **kwargs contains the special key “_args”, it will be popped from the dict and used as *args list to be passed separately to the constructor.

The following constructor hints are valid:
  • None: Use cls as constructor.
  • An already instantiated object: Will be returned as is; no constructor call.
  • A string or an object that is a key in cls’s __lookup_classes__ dict: The value in __lookup_classes__ for that key will be used as the constructor.
  • A python callable: Use that as constructor.
  • A string: Either a json filename or the name of a python module+class (e.g. “rlgraph.components.Component”).

Args:
spec (Optional[dict]): The specification dict.
Keyword Args:
kwargs (any): Optional way to pass the constructor arguments here and use spec as the type-only info, so that the call looks like from_spec([type]?, [**kwargs for ctor]). If spec is already a dict, kwargs will be merged with spec (overwriting keys in spec) after “type” has been popped out of spec. If a constructor of a Specifiable needs an *args list of items, the special key _args can be passed inside kwargs with a list-type value (e.g. kwargs={“_args”: [arg1, arg2, arg3]}).
Returns:
The object generated from the spec.
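
The merge rules for spec and kwargs can be sketched in plain Python. This is only an illustration of the precedence described above, not the library's code: resolve_spec is a hypothetical helper, and the "dense" type and "units" key are made-up values. The lookup of the constructor from the hint is deliberately left out.

```python
def resolve_spec(spec=None, **kwargs):
    """Hypothetical helper: returns (constructor_hint, args, ctor_kwargs)."""
    if isinstance(spec, dict):
        merged = dict(spec)             # copy so the caller's dict is untouched
        hint = merged.pop("type", None)  # optional "type" key is the hint
        merged.update(kwargs)           # kwargs overwrite keys from spec
    else:
        hint = spec                     # a non-dict spec is the hint itself
        merged = dict(kwargs)
    # Positional constructor arguments travel under the special "_args" key.
    args = list(merged.pop("_args", []))
    return hint, args, merged


hint, args, ctor_kwargs = resolve_spec(
    {"type": "dense", "units": 32, "_args": [1, 2]}, units=64
)
```

Here hint is "dense", args is [1, 2], and ctor_kwargs is {"units": 64}, since the keyword argument takes precedence over the value in the spec dict.
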
rlgraph.components.neural_networks.stack.force_tuple(elements=None, *, to_tuple=True)

Makes sure elements is returned as a list or tuple (a tuple by default, since to_tuple=True), whether elements is a single item, already a list, or a tuple.

Args:
elements (Optional[any]): The inputs as single item, list, or tuple to be converted into a list/tuple.
If None, returns empty list/tuple.

to_tuple (bool): Whether to use tuple (instead of list).

Returns:
Union[list,tuple]: All given elements in a list/tuple depending on to_tuple’s value. If elements is None,
returns an empty list/tuple.
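
The documented behaviour can be reproduced with a few lines of plain Python. The function below is a hypothetical re-implementation named force_sequence, written only from the description above. How the real function treats other iterables (strings, sets, dicts) is not stated here, so the sketch assumes they count as single items.

```python
def force_sequence(elements=None, *, to_tuple=True):
    """Sketch of the documented contract (not the RLgraph source)."""
    ctor = tuple if to_tuple else list
    if elements is None:
        return ctor()                # empty tuple/list for None
    if isinstance(elements, (list, tuple)):
        return ctor(elements)        # convert between list and tuple
    return ctor([elements])          # wrap a single item


single = force_sequence(5)
from_list = force_sequence([1, 2])
as_list = force_sequence((1, 2), to_tuple=False)
empty = force_sequence(None)
```
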

4.9.2. PreprocessorStack

class rlgraph.components.neural_networks.preprocessor_stack.PreprocessorStack(*preprocessors, **kwargs)[source]

Bases: rlgraph.components.neural_networks.stack.Stack

A special Stack that only carries PreprocessLayer Components and bundles all their reset output ops into one exposed reset output op. Otherwise, it behaves like a Stack, feeding the outputs of one sub-Component to the inputs of the next sub-Component.

API:
preprocess(input_): Outputs the preprocessed input after sending it through all sub-Components of this Stack.
reset(): An op to trigger all PreprocessLayers of this Stack to be reset.
get_preprocessed_space(space)[source]

Returns the Space obtained after pushing the input through all layers of this Stack.

Args:
space (Space): The incoming Space object.
Returns:
Space: The Space after preprocessing.
reset(*args, **kwargs)
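
As a rough illustration of the three API pieces above (chained preprocessing, one bundled reset, and propagating the input space through all layers), here is a pure-Python sketch. It is not RLgraph code: the layer classes, the output_shape method, and the use of plain shape tuples in place of Space objects are all assumptions made for the example.

```python
class Flatten:
    """Hypothetical layer: flattens a list of rows into one list."""

    def apply(self, x):
        return [v for row in x for v in row]

    def output_shape(self, shape):
        size = 1
        for dim in shape:
            size *= dim
        return (size,)

    def reset(self):
        return "flatten-reset"


class Scale:
    """Hypothetical layer: multiplies every value by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def apply(self, x):
        return [v * self.factor for v in x]

    def output_shape(self, shape):
        return shape  # scaling leaves the shape unchanged

    def reset(self):
        return "scale-reset"


class MiniPreprocessorStack:
    def __init__(self, *layers):
        self.layers = layers

    def preprocess(self, input_):
        for layer in self.layers:
            input_ = layer.apply(input_)
        return input_

    def reset(self):
        # One exposed reset that triggers every layer's reset.
        return [layer.reset() for layer in self.layers]

    def get_preprocessed_shape(self, shape):
        # Push the incoming shape through all layers in order.
        for layer in self.layers:
            shape = layer.output_shape(shape)
        return shape


stack = MiniPreprocessorStack(Flatten(), Scale(0.5))
out = stack.preprocess([[2, 4], [6, 8]])
shape = stack.get_preprocessed_shape((2, 2))
```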

4.9.3. DictPreprocessorStack

class rlgraph.components.neural_networks.dict_preprocessor_stack.DictPreprocessorStack(preprocessors, **kwargs)[source]

Bases: rlgraph.components.neural_networks.preprocessor_stack.PreprocessorStack

A generic PreprocessorStack that can handle Dict/Tuple Spaces and preprocess different Spaces in parallel within different (and separate) single PreprocessorStack components. The output is again a dict of preprocessed inputs.

API:
preprocess(input_): Outputs the preprocessed input after sending it through all sub-Components of this Stack.
reset(): An op to trigger all PreprocessorStacks of this Stack to be reset.
get_preprocessed_space(space)[source]

Returns the Space obtained after pushing the input through all layers of this Stack.

Args:
space (Dict): The incoming Space object.
Returns:
Space: The Space after preprocessing.
reset(*args, **kwargs)
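
The per-key behaviour can be sketched in plain Python as follows. This is an illustration only: preprocess_dict is a hypothetical function, each per-key stack is modelled as a list of callables, and passing keys without a configured stack through unchanged is an assumption that the text above does not confirm.

```python
def preprocess_dict(inputs, preprocessors):
    """Apply a separate chain of callables to each key of a dict input."""
    outputs = {}
    for key, value in inputs.items():
        # Assumption: keys without a configured chain pass through unchanged.
        for fn in preprocessors.get(key, []):
            value = fn(value)
        outputs[key] = value
    return outputs


result = preprocess_dict(
    {"image": [0, 255], "velocity": -3.0, "id": 7},
    {
        "image": [lambda pixels: [p / 255 for p in pixels]],
        "velocity": [abs, lambda v: v * 2],
    },
)
```

The result is again a dict with the same keys as the input, each value having gone through its own chain.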

4.9.4. NeuralNetwork

class rlgraph.components.neural_networks.neural_network.NeuralNetwork(*layers, **kwargs)[source]

Bases: rlgraph.components.neural_networks.stack.Stack

A NeuralNetwork is a Stack in which the apply method is defined either by a custom API-method or by connecting all sub-Components’ apply methods. In both cases, a dict should be returned with at least the output key set. Further keys are possible, e.g. last_internal_states for RNN-based NNs.
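
The return convention can be illustrated with a small stand-alone sketch. It is not RLgraph code: run_network, RunningSum, and the is_recurrent attribute are hypothetical, and the layers are plain callables. The point is only the shape of the result, a dict that always has an output key and carries last_internal_states when a recurrent layer is present.

```python
class RunningSum:
    """Hypothetical recurrent layer: adds the input to its carried state."""

    is_recurrent = True

    def __call__(self, x, state):
        new_state = (state or 0) + x
        return new_state, new_state  # (layer output, updated state)


def run_network(layers, x, internal_states=None):
    result = {}
    for layer in layers:
        if getattr(layer, "is_recurrent", False):
            x, internal_states = layer(x, internal_states)
            result["last_internal_states"] = internal_states
        else:
            x = layer(x)
    result["output"] = x  # the output key is always set
    return result


def double(v):
    return v * 2


feed_forward = run_network([double], 3)
recurrent = run_network([double, RunningSum()], 3, internal_states=10)
```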

has_rnn()[source]
rlgraph.components.neural_networks.neural_network.force_tuple(elements=None, *, to_tuple=True)

Makes sure elements is returned as a list or tuple (a tuple by default, since to_tuple=True), whether elements is a single item, already a list, or a tuple.

Args:
elements (Optional[any]): The inputs as single item, list, or tuple to be converted into a list/tuple.
If None, returns empty list/tuple.

to_tuple (bool): Whether to use tuple (instead of list).

Returns:
Union[list,tuple]: All given elements in a list/tuple depending on to_tuple’s value. If elements is None,
returns an empty list/tuple.

4.9.5. Policy

class rlgraph.components.neural_networks.policy.Policy(network_spec, action_space=None, action_adapter_spec=None, max_likelihood=True, scope='policy', **kwargs)[source]

Bases: rlgraph.components.component.Component

A Policy is a wrapper Component that contains a NeuralNetwork, an ActionAdapter and a Distribution Component.

get_action(*args, **kwargs)
get_action_layer_output(*args, **kwargs)
get_entropy(*args, **kwargs)
get_logits_probabilities_log_probs(*args, **kwargs)
get_max_likelihood_action(*args, **kwargs)
get_nn_output(*args, **kwargs)
get_stochastic_action(*args, **kwargs)

4.9.6. ActorComponent

class rlgraph.components.neural_networks.actor_component.ActorComponent(preprocessor_spec, policy_spec, exploration_spec, max_likelihood=None, **kwargs)[source]

Bases: rlgraph.components.component.Component

A Component that incorporates an entire pipeline from env state to an action choice. Includes preprocessor, policy and exploration sub-components.

API:
get_preprocessed_state_and_action(state, time_step, use_exploration) ->
get_preprocessed_state_action_and_action_probs(*args, **kwargs)
get_preprocessed_state_and_action(*args, **kwargs)
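
The pipeline from env state to action choice can be sketched without RLgraph as three chained steps. Everything below is hypothetical: MiniActor, the three callables, and the returned (preprocessed_state, action) pair are assumptions based on the method name and parameter list shown above, not on the library source. The exploration step is a deterministic toy so that the example is reproducible.

```python
class MiniActor:
    """Toy pipeline: preprocessor -> policy -> optional exploration."""

    def __init__(self, preprocess, policy, explore):
        self.preprocess = preprocess
        self.policy = policy
        self.explore = explore

    def get_preprocessed_state_and_action(self, state, time_step,
                                          use_exploration=True):
        preprocessed = self.preprocess(state)
        action = self.policy(preprocessed)
        if use_exploration:
            action = self.explore(action, time_step)
        return preprocessed, action


actor = MiniActor(
    preprocess=lambda s: [v / 10 for v in s],
    # Greedy policy: index of the largest preprocessed value.
    policy=lambda s: max(range(len(s)), key=s.__getitem__),
    # Toy exploration: shift the action during the first 100 time steps.
    explore=lambda a, t: (a + 1) % 3 if t < 100 else a,
)

greedy = actor.get_preprocessed_state_and_action([1, 5, 2], 0, False)
explored = actor.get_preprocessed_state_and_action([1, 5, 2], 0, True)
```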