return_calculator module

Author: Banafsheh Rafiee

Description: ReturnCalculator samples time steps while following the behavior policy and computes the return for each sampled time step. To compute the return at a sampled time step, it switches from the behavior policy to the target policy.
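
The procedure described above can be illustrated with a minimal sketch. It is not the module's implementation. The names `discounted_return`, `estimate_returns`, `step`, `gamma`, and `horizon` are hypothetical and chosen only for illustration. The sketch also assumes a simulated environment whose state can be branched for the target-policy rollout and a return that is truncated after a fixed number of steps; the actual ReturnCalculator may handle both differently.

```python
import random


def discounted_return(cumulants, gamma):
    """Return sum_k gamma**k * cumulants[k] for a finite list of cumulants."""
    g = 0.0
    for c in reversed(cumulants):
        g = c + gamma * g
    return g


def estimate_returns(step, behavior_policy, target_policy, gamma,
                     num_steps, sample_steps, horizon, state=0):
    """Follow behavior_policy; at each sampled step, roll out target_policy.

    `step(state, action)` is a hypothetical environment function that
    returns (next_state, cumulant). Rollouts stop after `horizon` steps.
    """
    returns = {}
    for t in range(num_steps):
        if t in sample_steps:
            # Switch to the target policy, starting from the current state.
            s, cumulants = state, []
            for _ in range(horizon):
                s, c = step(s, target_policy(s))
                cumulants.append(c)
            returns[t] = discounted_return(cumulants, gamma)
        # Continue with the behavior policy from the pre-rollout state.
        state, _ = step(state, behavior_policy(state))
    return returns


def step(state, action):
    # Toy chain: the action is +1 or -1; the cumulant is 1.0 for action +1.
    return state + action, 1.0 if action == 1 else 0.0


random.seed(0)
behavior = lambda s: random.choice([-1, 1])  # exploratory behavior policy
target = lambda s: 1                         # target policy: always +1

result = estimate_returns(step, behavior, target, gamma=0.5,
                          num_steps=10, sample_steps={2, 7}, horizon=3)
print(result)  # {2: 1.75, 7: 1.75}
```

In this toy setting every target-policy rollout collects the cumulants 1, 1, 1, so each sampled return is 1 + 0.5 + 0.25 = 1.75 regardless of where the behavior policy has wandered.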

class return_calculator.ReturnCalculator(time_scale, gvf, num_features, features_to_use, behavior_policy, target_policy)
compute_return(sample_number)
create_state()
run()
take_action(action)
update_return_buffers(index, observations)
return_calculator.start_return_calculator(time_scale, GVF, num_features, features_to_use, behavior_policy, target_policy)