4.5. Helper Components Reference

class rlgraph.components.helpers.MemSegmentTree(values, capacity, operator=operator.add)

Bases: object

In-memory Segment tree for prioritized replay.

Note: The pure TensorFlow segment tree is much slower because updating variables is expensive. In scenarios like Ape-X, memory and update run as separate processes, so there is little to gain from inserting the tree into the graph.

get(index)

Reads an item from the segment tree.

Args:
index (int): Index of the item to read.

Returns: The element.

get_min_value(start=0, stop=None)

Returns the minimum value of the storage variable.

get_sum(start=0, stop=None)

Returns the sum of the values of the storage variable.

index_of_prefixsum(prefix_sum)

Identifies the highest index for which the sum over all elements from 0 up to the index is <= prefix_sum.

Args:
prefix_sum (float): Upper bound on the prefix sum we are allowed to select.

Returns:
int: Index/indices satisfying the prefix sum condition.
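The prefix-sum lookup is what makes proportional sampling for prioritized replay efficient. The following is a minimal sketch of the idea on an array-backed sum tree. It is not RLgraph's implementation: the function names `build_sum_tree` and `find_prefixsum_index` are hypothetical, and the sketch assumes the condition means "the sum of all elements strictly before the returned index is <= prefix_sum".

```python
def build_sum_tree(priorities):
    """Builds an array-backed sum tree (hypothetical helper, not RLgraph API).

    Node 1 is the root; leaves live at indices [capacity, 2 * capacity).
    """
    capacity = 1
    while capacity < len(priorities):
        capacity *= 2
    tree = [0.0] * (2 * capacity)
    tree[capacity:capacity + len(priorities)] = priorities
    # Each inner node stores the sum of its two children.
    for node in range(capacity - 1, 0, -1):
        tree[node] = tree[2 * node] + tree[2 * node + 1]
    return tree, capacity


def find_prefixsum_index(tree, capacity, prefix_sum):
    """Returns the highest leaf index i with sum(priorities[:i]) <= prefix_sum."""
    node = 1
    while node < capacity:
        left = 2 * node
        if tree[left] > prefix_sum:
            # The target lies inside the left subtree.
            node = left
        else:
            # Skip the left subtree and search the remainder on the right.
            prefix_sum -= tree[left]
            node = left + 1
    return node - capacity


tree, capacity = build_sum_tree([1.0, 2.0, 3.0, 4.0])
first = find_prefixsum_index(tree, capacity, 0.5)
last = find_prefixsum_index(tree, capacity, 6.0)
```

To sample proportionally to priority, one would draw a value uniformly from the range between 0 and the total sum (stored at the root, `tree[1]` in this sketch) and pass it as `prefix_sum`; the lookup then costs O(log capacity) instead of a linear scan.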

insert(index, element)

Inserts an element into the segment tree by determining its position in the tree.

Args:
index (int): Insertion index.
element (any): Element to insert.

reduce(start, limit, reduce_op=operator.add)

Applies a reduce operation to the specified segment.

Args:
start (int): Start index to apply reduction to.
limit (int): End index to apply reduction to.
reduce_op (Union(operator.add, min, max)): Reduce op to apply.

Returns:
Number: Result of the reduce operation.
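To illustrate how insertion and segment reduction interact, here is a minimal, self-contained sketch of an array-backed segment tree. The class `TinySegmentTree` is hypothetical and only mirrors the method names documented above; it assumes a power-of-two capacity and treats `limit` as exclusive, neither of which is confirmed for `MemSegmentTree` itself.

```python
import operator


class TinySegmentTree:
    """Hypothetical array-backed segment tree; not RLgraph's MemSegmentTree."""

    def __init__(self, capacity, op=operator.add, neutral=0.0):
        # A power-of-two capacity lets the leaves fill exactly one tree level.
        assert capacity > 0 and capacity & (capacity - 1) == 0
        self.capacity = capacity
        self.op = op
        self.neutral = neutral
        # Node 1 is the root; leaves live at [capacity, 2 * capacity).
        self.values = [neutral] * (2 * capacity)

    def insert(self, index, element):
        # Write the leaf, then recompute every ancestor up to the root.
        node = index + self.capacity
        self.values[node] = element
        node //= 2
        while node >= 1:
            self.values[node] = self.op(self.values[2 * node], self.values[2 * node + 1])
            node //= 2

    def get(self, index):
        return self.values[index + self.capacity]

    def reduce(self, start, limit):
        """Reduces over the half-open index range [start, limit)."""
        result_left = self.neutral
        result_right = self.neutral
        lo = start + self.capacity
        hi = limit + self.capacity
        while lo < hi:
            if lo & 1:
                # lo is a right child: take it and move past its parent.
                result_left = self.op(result_left, self.values[lo])
                lo += 1
            if hi & 1:
                # hi is exclusive: step left onto a node fully inside the range.
                hi -= 1
                result_right = self.op(self.values[hi], result_right)
            lo //= 2
            hi //= 2
        return self.op(result_left, result_right)


sum_tree = TinySegmentTree(4)
min_tree = TinySegmentTree(4, op=min, neutral=float("inf"))
for i, priority in enumerate([1.0, 2.0, 3.0, 4.0]):
    sum_tree.insert(i, priority)
    min_tree.insert(i, priority)
```

Keeping one tree per operator (sum and min) is a common arrangement for prioritized replay: the sum tree drives sampling, while the min tree yields the smallest priority needed for importance-weight normalization.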

class rlgraph.components.helpers.SegmentTree(storage_variable, capacity=1048)

Bases: object

TensorFlow Segment tree for prioritized replay.

get(index)

Reads an item from the segment tree.

Args:
index (int): Index of the item to read.

Returns: The element.

get_min_value()

Returns the minimum value of the storage variable.

get_sum()

Returns the sum of the values of the storage variable.

index_of_prefixsum(prefix_sum)

Identifies the highest index for which the sum over all elements from 0 up to the index is <= prefix_sum.

Args:
prefix_sum (float): Upper bound on the prefix sum we are allowed to select.

Returns:
int: Index/indices satisfying the prefix sum condition.

insert(index, element, insert_op=tf.add)

Inserts an element into the segment tree by determining its position in the tree.

Args:
index (int): Insertion index.
element (any): Element to insert.
insert_op (Union(tf.add, tf.minimum, tf.maximum)): Insert operation on the tree.

reduce(start, limit, reduce_op=tf.add)

Applies a reduce operation to the specified segment.

Args:
start (int): Start index to apply reduction to.
limit (int): End index to apply reduction to.
reduce_op (Union(tf.add, tf.minimum, tf.maximum)): Reduce op to apply.

Returns:
Number: Result of the reduce operation.

class rlgraph.components.helpers.SoftMax(scope='softmax', **kwargs)

Bases: rlgraph.components.component.Component

A simple softmax component that translates logits into probabilities (and log-probabilities).

API:
apply(logits) -> returns probabilities (softmaxed) and log-probabilities.
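
As a rough illustration of what such a component computes, the sketch below derives probabilities and log-probabilities from a single vector of logits in plain Python. The function name `softmax_with_log_probs` is hypothetical, and the max-subtraction trick is an assumption about how one would keep the computation numerically stable, not a statement about RLgraph's internals.

```python
import math


def softmax_with_log_probs(logits):
    """Returns (probabilities, log-probabilities) for one vector of logits."""
    # Subtracting the max logit leaves the result unchanged but avoids overflow in exp().
    max_logit = max(logits)
    shifted = [x - max_logit for x in logits]
    log_normalizer = math.log(sum(math.exp(x) for x in shifted))
    # Computing log-probabilities first avoids taking log() of tiny probabilities.
    log_probs = [x - log_normalizer for x in shifted]
    probs = [math.exp(lp) for lp in log_probs]
    return probs, log_probs


probs, log_probs = softmax_with_log_probs([2.0, 1.0, 0.1])
```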

class rlgraph.components.helpers.VTraceFunction(rho_bar=1.0, rho_bar_pg=1.0, c_bar=1.0, device='/device:CPU:0', scope='v-trace-function', **kwargs)

Bases: rlgraph.components.component.Component

A Helper Component that contains a graph_fn to calculate V-trace values from importance ratios (rhos). Based on [1] and coded analogously to: https://github.com/deepmind/scalable_agent

[1] IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures - Espeholt, Soyer, Munos et al. - 2018 (https://arxiv.org/abs/1802.01561)
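
For orientation, the V-trace recursion from [1] can be sketched in plain Python for a single trajectory. This is a simplified illustration, not the Component's graph_fn: the function name `vtrace_from_ratios` and its argument layout are hypothetical, a single scalar `discount` is assumed (no episode terminals), and only the clipping thresholds `rho_bar`, `rho_bar_pg` and `c_bar` correspond to the constructor arguments above.

```python
def vtrace_from_ratios(rhos, rewards, values, bootstrap_value, discount=0.99,
                       rho_bar=1.0, rho_bar_pg=1.0, c_bar=1.0):
    """Computes V-trace targets and policy-gradient advantages for one trajectory.

    rhos[t] is the importance ratio pi(a_t|x_t) / mu(a_t|x_t);
    values[t] is V(x_t); bootstrap_value is V of the state after the last step.
    """
    num_steps = len(rewards)
    clipped_rhos = [min(rho_bar, r) for r in rhos]
    cs = [min(c_bar, r) for r in rhos]
    values_next = values[1:] + [bootstrap_value]
    # Importance-weighted temporal-difference errors.
    deltas = [clipped_rhos[t] * (rewards[t] + discount * values_next[t] - values[t])
              for t in range(num_steps)]

    # Backward recursion: v_s - V(x_s) = delta_s + discount * c_s * (v_{s+1} - V(x_{s+1})).
    vs_minus_v = [0.0] * num_steps
    acc = 0.0
    for t in reversed(range(num_steps)):
        acc = deltas[t] + discount * cs[t] * acc
        vs_minus_v[t] = acc
    vs = [values[t] + vs_minus_v[t] for t in range(num_steps)]

    # Advantages use the V-trace target of the next step as the bootstrap.
    vs_next = vs[1:] + [bootstrap_value]
    clipped_pg_rhos = [min(rho_bar_pg, r) for r in rhos]
    pg_advantages = [clipped_pg_rhos[t] * (rewards[t] + discount * vs_next[t] - values[t])
                     for t in range(num_steps)]
    return vs, pg_advantages


# On-policy data (all ratios equal to 1): the targets reduce to n-step returns.
vs, pg_advantages = vtrace_from_ratios(
    rhos=[1.0, 1.0, 1.0], rewards=[1.0, 1.0, 1.0],
    values=[2.0, 4.0, 8.0], bootstrap_value=16.0, discount=0.5)
```

With all ratios at or above the thresholds, clipping makes the result identical to the on-policy case, which is a convenient sanity check when experimenting with the thresholds.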

check_input_spaces(input_spaces, action_space=None)

Should check the nature of all in-Socket Spaces of this Component. This method is called automatically by the Model when all these Spaces are known during the Model’s build time.

Args:
input_spaces (Dict[str,Space]): A dict with Space/shape information. keys=in-Socket name (str); values=the associated Space.
action_space (Optional[Space]): The action Space of the Agent/GraphBuilder. Can be used to construct and connect more Components (which rely on this information). This eliminates the need to pass the action Space information into many Components’ constructors.