.. Copyright 2018 The RLgraph authors. All Rights Reserved.
..
.. Licensed under the Apache License, Version 2.0 (the "License");
.. you may not use this file except in compliance with the License.
.. You may obtain a copy of the License at
..
..     http://www.apache.org/licenses/LICENSE-2.0
..
.. Unless required by applicable law or agreed to in writing, software
.. distributed under the License is distributed on an "AS IS" BASIS,
.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. See the License for the specific language governing permissions and
.. limitations under the License.
.. ============================================================================

.. image:: images/rlcore-logo-full.png
   :scale: 25%
   :alt:

What is an RLgraph Component?
=============================

Components are the basic building blocks from which you assemble any machine learning or reinforcement learning model. A component is the smallest unit that can be run and tested on its own via RLgraph's different executor and testing classes. RLgraph components range from simple, single neural network layers to highly complex policy networks, memories, optimizers, and mathematical components such as loss functions.

Each component contains:

- ... any number of sub-components, each of which may again contain its own sub-components (also sometimes called "child components"). Hence components are arbitrarily nestable inside each other.
- ... at least one API-method, so that clients of the component (in the end this will be our reinforcement learning agent) can use it.

.. figure:: images/dense_layer_component.png
   :alt: A DenseLayer component (1) with two API-methods (2), one graph function (3) and two variables (kernel and bias) (4).
   :scale: 60%

   Above: A DenseLayer component (1) with two API-methods (2), one graph function (3) and two variables (kernel and bias) (4).

- ... any number of "graph functions", which are special component methods that contain the actual computation code. These are the only places where you will find backend-specific (tensorflow, pytorch, etc.) code.
- ... any number of variables for the component to use in its computations (graph functions).

On the `following page `_, we will walk through building our own custom component, which will include all of the above items. But let's first talk in some more detail about RLgraph's Component base class.

The Component Base Class
------------------------

The `Component` base class contains the core functionality that every RLgraph component inherits. Its most important methods are listed below; a minimal usage sketch follows the list. For a more detailed overview, please take a look at the `Component reference documentation `_.

#. `add_components`: Adds an arbitrary number of sub-components to the component itself.
#. `check_input_spaces`: Can be used to sanity-check the incoming spaces (see the `documentation on RLgraph's Space classes `_) of all API-method call arguments.
#. `create_variables`: Called automatically by the RLgraph build system; can be implemented in order to create an arbitrary number of variables used by the component's computation functions ("graph functions").
#. `copy`: Copies the component and returns a new Component object that is identical to the original. This is useful, for example, to create a target network as a copy of the main policy network in a DQN-type agent.
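
As a rough, hedged sketch of how `add_components` and an API-method typically interact, the following hypothetical wrapper component registers one sub-component and exposes a single API-method that delegates to it. The import path for `Component`, the `scope` constructor argument, and the sub-component's `apply` method are assumptions made for illustration here, not verified RLgraph API details.

.. code-block:: python

    from rlgraph.components import Component  # assumed import path
    from rlgraph.utils.decorators import rlgraph_api


    class WrapperComponent(Component):
        """Hypothetical component that wraps a single sub-component."""

        def __init__(self, sub_component, scope="wrapper-component", **kwargs):
            # The `scope` constructor argument is an assumption for this sketch.
            super(WrapperComponent, self).__init__(scope=scope, **kwargs)
            self.sub_component = sub_component
            # Register the child so the build system can traverse into it.
            self.add_components(self.sub_component)

        @rlgraph_api
        def apply(self, inputs):
            # Delegate to the sub-component's API-method (assumed to be named `apply`).
            return self.sub_component.apply(inputs)

An agent could later call `copy()` on such a component, for example to derive a target network from its main policy network, as mentioned in the list above.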
API-Methods
+++++++++++

A component's API-methods are its outside-facing handles through which clients of the component (either another component or an agent that contains the component in question) can access and control its behavior. For example, a typical memory component would need an `insert_records` API-method to insert some data into the memory, a `get_records` method to retrieve a certain number of already stored records, and maybe a `clear` method to wipe all stored information from the memory.

API-methods are normal class methods, but must be tagged with the `@rlgraph_api` decorator, which can be imported as follows:

.. code-block:: python

    from rlgraph.utils.decorators import rlgraph_api

An API-method can take any arbitrary combination of regular python args and kwargs, and may define default values for some of them. For example:

.. code-block:: python

    # inside some component class ...
    ...

    @rlgraph_api
    def my_api_method(self, a, b=5, c=None):
        ...  # do and return something

Calling the above API-method (e.g. from its parent component) requires the call argument `a` to be provided, whereas `b` and `c` are optional.

As you may recall from the `spaces chapter `_, information in RLgraph is passed around between components within fixed space constraints. In fact, each API-method call argument (`a`, `b`, and `c` in our example above) has a dedicated space once the final graph has been built from all of its components.

**Important note:** For now, if an API-method is called more than once by the component's client(s), the spaces of the provided call arguments (e.g. the space of `a`) have to match across the different calls. So if `a` is an IntBox in the first call, it has to be an IntBox in the second call as well. This is because the component's variables (see below) may depend on these "input spaces". We will try to loosen this restriction in future releases and only require it when RLgraph knows for sure that the space of the argument in question is used to define variables of the component.

Variables
+++++++++

Variables hold the data that a component stores for the lifetime of the computation graph. A variable has a fixed data type and shape, hence a fixed RLgraph space. As a matter of fact, variables are often created directly from `Space` instances via the practical `Space.get_variable()` method. Variables can be accessed inside graph functions (see below), where they can be both read from and written to.

Examples of variables are:

- The buffer of a memory that stores a certain part of a memory record, for example an image (a rank-3 uint8 tensor).
- A memory component's index pointer (which record should we retrieve next?). This is usually a single int scalar.
- The weight matrix of some neural network layer. This is typically a rank-2 float tensor.

Variables are created in a component's `create_variables` method, which gets called automatically once all input spaces of the component (the spaces of all its API-method arguments) are known to the RLgraph build system. In the next paragraph, we explain how this stage of "input-completeness" is reached and why it is important for the component.

Input Spaces and the concept of "input-completeness"
+++++++++++++++++++++++++++++++++++++++++++++++++++++

Let's look at a component's API-method and its variable-generating code to understand the concept of "input-completeness".

.. code-block:: python

    # inside some component class ...
    ...
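    # Note: the space of the `record` argument below is not fixed when the method
    # is defined. It only becomes known once a parent component actually calls
    # `insert()`; at that point, `create_variables()` (further down) can look it
    # up via `input_spaces["record"]` and create a matching storage variable.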
    @rlgraph_api
    def insert(self, record):
        # Call a graph function that will take care of the assignment.
        return self._graph_fn_insert(record)

    def create_variables(self, input_spaces, action_space=None):
        """
        Override this base class method to create variables based on the spaces that underlie
        each API-method's call argument (in our case, this is only the call arg `record` of
        the `insert` API-method).
        """
        # Look up the input space by the name of the API-method's call arg ("record").
        in_space = input_spaces["record"]
        self.storage_buffer = in_space.get_variable(trainable=False)  # ... other options

A component reaches input-completeness once the spaces of all its unique call parameters (identified by their names) are known. The space of a call argument (e.g. `record`) becomes known once the respective API-method (here: `insert`) is called by a client (a parent component). Only the outermost component, also called the "root", needs its input spaces to be provided manually by the user, since its API-methods are only executed (called) at graph-execution time. If a component has several API-methods whose call arguments share a name (e.g. the API-methods `one(a, b)` and `two(a, c)` both take an argument `a`), the space provided for that shared argument must be the same across all calls.

A client of this component (a parent component or the RL agent directly) will eventually make a call to the component's API-method `insert()`. At that point, the space of the `record` argument becomes known. Since the component above only has that one API-method, and hence only that one API-method call argument (`record`), it is then input-complete.

Graph Functions
+++++++++++++++

Every component serves a certain computational purpose within a machine learning model. A neural network layer maps input data to output data via, for example, a matrix-matrix multiplication (plus maybe adding a bias). An optimizer calculates the gradient of a loss function with respect to the weights of a trainable layer and applies the resulting gradients in a certain way to these weights. All of these calculation steps happen inside a component's graph functions, the only place in RLgraph where we can find backend-specific code, such as calls to tensorflow's static graph-building functions or computations on pytorch tensors.

Unlike API-methods, graph functions can only be called from within the same component that owns them (not by parents or grandparents of the component). These calls happen from within the component's different API-methods (similar to calling another API-method).

Graph functions are - similar to API-methods - regular python class methods, but must be tagged with the `@graph_fn` decorator as follows:

.. code-block:: python

    # inside some component class ...
    ...

    @graph_fn
    def _graph_fn_do_some_computation(self, a, b):
        # All backend-specific code in RLgraph goes into graph functions.
        if get_backend() == "tf":
            # Do some computation in tf.
            some_result = tf.add(a, b)
        elif get_backend() == "pytorch":
            # Do some computation in pytorch.
            some_result = a + b
        return some_result

Inside a graph function, any kind of backend-specific computation may be coded. A graph function then returns the result of the computation, or several results as a tuple.
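
To tie the pieces together, here is a minimal, hedged sketch of a complete toy component with one API-method that routes its call arguments into one graph function. The import paths for `Component`, `graph_fn`, and `get_backend` are assumed to mirror the `rlgraph_api` import shown earlier and are not verified here.

.. code-block:: python

    from rlgraph import get_backend                              # assumed import path
    from rlgraph.components import Component                     # assumed import path
    from rlgraph.utils.decorators import rlgraph_api, graph_fn   # graph_fn location assumed

    if get_backend() == "tf":
        import tensorflow as tf


    class Adder(Component):
        """Toy component whose only job is to add two inputs."""

        @rlgraph_api
        def add(self, a, b):
            # The API-method contains no backend code; it only routes its
            # (space-checked) call arguments into the graph function.
            return self._graph_fn_add(a, b)

        @graph_fn
        def _graph_fn_add(self, a, b):
            # Backend-specific computation lives here and nowhere else.
            if get_backend() == "tf":
                return tf.add(a, b)
            elif get_backend() == "pytorch":
                return a + b

A parent component (or the agent) would only ever call `add()`; the underscored graph function remains an internal detail of the component that owns it.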