tensortrade.env.rewards.pbr module

class tensortrade.env.rewards.pbr.PBR(*args, **kwargs)

Bases: AbstractRewardScheme

A reward scheme for position-based returns.

Let \(p_t\) denote the price at time t.
Let \(x_t\) denote the position at time t.
Let \(R_t\) denote the reward at time t.

Then the reward is defined as \(R_{t} = (p_{t} - p_{t-1}) \cdot x_{t}\).

Parameters: price (Stream) – The price stream to use for computing rewards.
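
The reward definition above can be illustrated with a minimal, library-independent sketch. It does not use the PBR class or the Stream API; the helper name `pbr_rewards` and the position encoding (+1 for long, -1 for short) are assumptions made only for this example.

```python
from typing import List, Sequence


def pbr_rewards(prices: Sequence[float], positions: Sequence[float]) -> List[float]:
    """Hypothetical helper: R_t = (p_t - p_{t-1}) * x_t for t = 1 .. len(prices) - 1."""
    if len(prices) != len(positions):
        raise ValueError("prices and positions must have the same length")
    # No reward is produced for t = 0 because there is no previous price.
    return [
        (prices[t] - prices[t - 1]) * positions[t]
        for t in range(1, len(prices))
    ]


prices = [100.0, 102.0, 101.0, 105.0]
positions = [1, 1, -1, 1]  # assumed encoding: +1 long, -1 short
rewards = pbr_rewards(prices, positions)
print(rewards)  # [2.0, 1.0, 4.0]
```

In this sketch the reward is positive whenever the position agrees with the direction of the price change: a short position (-1) during the drop from 102 to 101 yields +1.0.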

on_action(action: int) → None

registered_name = 'pbr'

reset() → None

Resets the position and feed of the reward scheme.

reward() → float

Computes the reward for the current step of an episode.

Returns: The computed reward.
Return type: float