tensortrade.env.rewards.pbr module

class tensortrade.env.rewards.pbr.PBR(*args, **kwargs)

Bases: AbstractRewardScheme

A reward scheme for position-based returns.

  • Let \(p_t\) denote the price at time t.

  • Let \(x_t\) denote the position at time t.

  • Let \(R_t\) denote the reward at time t.

The reward is then defined as \(R_{t} = (p_{t} - p_{t-1}) \cdot x_{t}\), that is, the one-step price change multiplied by the position held at time t.
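The definition above can be checked by hand with a small standalone sketch. It does not use tensortrade; the helper name `pbr_rewards` and the encoding of positions as +1 (long) and -1 (short) are assumptions made for illustration only.

```python
from typing import List, Sequence


def pbr_rewards(prices: Sequence[float], positions: Sequence[float]) -> List[float]:
    """Return R_t = (p_t - p_{t-1}) * x_t for t = 1, ..., len(prices) - 1.

    Hypothetical helper, not part of tensortrade.
    """
    if len(prices) != len(positions):
        raise ValueError("prices and positions must have the same length")
    # There is no reward at t = 0 because p_{-1} does not exist.
    return [
        (prices[t] - prices[t - 1]) * positions[t]
        for t in range(1, len(prices))
    ]


prices = [100.0, 102.0, 101.0, 105.0]
positions = [1.0, 1.0, -1.0, 1.0]  # assumed encoding: +1 long, -1 short
rewards = pbr_rewards(prices, positions)
print(rewards)  # [2.0, 1.0, 4.0]
```

In this example the short position at t = 2 earns a positive reward because the price fell from 102 to 101.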

Parameters:

price (Stream) – The price stream to use for computing rewards.

on_action(action: int) → None

registered_name = 'pbr'

reset() → None

Resets the position and feed of the reward scheme.

reward() → float

Computes the reward for the current step of an episode.

Returns:

The computed reward.

Return type:

float
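As a rough illustration of how the three methods might fit together over an episode, here is a standalone stand-in class. It is not the tensortrade implementation: the class name `ToyPBR`, the action-to-position mapping (0 for short, anything else for long), the initial short position, and the zero reward on the first step are all assumptions that this page does not confirm.

```python
from typing import Iterable


class ToyPBR:
    """Hypothetical stand-in mirroring the documented method names."""

    def __init__(self, prices: Iterable[float]) -> None:
        # A plain list stands in for the price Stream (assumption).
        self._prices = list(prices)
        self.reset()

    def on_action(self, action: int) -> None:
        # Assumed mapping: 0 -> short (-1), any other action -> long (+1).
        self.position = -1 if action == 0 else 1

    def reset(self) -> None:
        # Assumed initial state: short position, price cursor at the start.
        self.position = -1
        self._t = 0

    def reward(self) -> float:
        """Advance one step and return (p_t - p_{t-1}) * x_t."""
        t = self._t
        self._t += 1
        if t == 0:
            return 0.0  # no previous price yet (assumed convention)
        return (self._prices[t] - self._prices[t - 1]) * self.position


scheme = ToyPBR([100.0, 102.0, 101.0])
episode = []
for action in (1, 1, 0):
    scheme.on_action(action)  # set the position for this step
    episode.append(scheme.reward())
print(episode)  # [0.0, 2.0, 1.0]

scheme.reset()  # back to the initial position and the first price
```

The sketch calls `on_action` before `reward` on every step so that the position used in the product is the one chosen for that step. Calling `reward` more times than there are prices raises an `IndexError` in this toy version.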