gvf module
- Authors: Banafsheh Rafiee, Niko Yasui
class gvf.GVF(cumulant, gamma, target_policy, num_features, alpha0, alpha, name, learner, feature_indices, use_MSRE=False, **kwargs)
- Implements General Value Functions.
- A General Value Function poses a question, defined by the cumulant, gamma, and the target policy, whose answer is learned by a learning algorithm, here called the learner.
- Parameters:
- cumulant (fun) – Function of observation that gives a float value.
- gamma (fun) – Function of observation that gives a float value. Together with cumulant, defines the return that the agent tries to predict.
- target_policy (Policy) – Policy under which the agent makes its predictions. Can be the same as the behavior policy.
- num_features (int) – Number of features that are used.
- alpha0 (float) – Value used to calculate beta0 for RUPEE.
- alpha (float) – Value used to calculate alpha for RUPEE.
- name (str) – Name of the GVF for recording data.
- learner – Class instance with a predict and update function, and theta, tderr_elig, and delta attributes. For example, GTD.
- feature_indices (numpy array of bool) – Indices of the features to use.
- use_MSRE (bool) – Whether or not to calculate MSRE.
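
The parameters above can be illustrated with a minimal, self-contained sketch. It does not import the gvf module and is not the module's implementation. The observation format (a dict with a "bump" key), the helper discounted_return, the stand-in class LinearTD0, and its update signature are all hypothetical names chosen for illustration. The sketch also assumes that num_features equals the number of True entries in feature_indices, which the documentation above does not state.

```python
import numpy as np


# Hypothetical observation format: a dict of sensor readings (an assumption).
def cumulant(observation):
    """Signal to accumulate: 1.0 when the (hypothetical) bump sensor fires."""
    return float(observation["bump"])


def gamma(observation):
    """Continuation: terminate (0.0) on a bump, otherwise discount by 0.9."""
    return 0.0 if observation["bump"] else 0.9


def discounted_return(observations):
    """One common convention: G = c_1 + gamma_1 * (c_2 + gamma_2 * (...))."""
    g = 0.0
    for obs in reversed(observations):
        g = cumulant(obs) + gamma(obs) * g
    return g


class LinearTD0:
    """Toy stand-in for a learner (NOT the library's GTD).

    It only mirrors the interface named in the parameter list:
    predict, update, theta, tderr_elig, and delta.
    """

    def __init__(self, num_features, step_size=0.1):
        self.theta = np.zeros(num_features)       # weight vector
        self.tderr_elig = np.zeros(num_features)  # TD error times trace
        self.delta = 0.0                          # most recent TD error
        self.step_size = step_size

    def predict(self, phi):
        return float(self.theta @ phi)

    def update(self, phi, phi_next, cumulant_value, gamma_value):
        # Hypothetical signature. With no eligibility trace, the trace is phi.
        self.delta = (cumulant_value
                      + gamma_value * self.predict(phi_next)
                      - self.predict(phi))
        self.tderr_elig = self.delta * phi
        self.theta += self.step_size * self.tderr_elig


# The return that cumulant and gamma define over a short trajectory.
trajectory = [{"bump": False}, {"bump": False}, {"bump": True}, {"bump": False}]
print(discounted_return(trajectory))  # approximately 0.81 (0.9 * 0.9 * 1.0)

# feature_indices as a boolean mask over the full feature vector.
feature_indices = np.array([True, False, True, True])
phi = np.array([1.0, 5.0, 0.0, 1.0])[feature_indices]       # [1., 0., 1.]
phi_next = np.array([0.0, 2.0, 1.0, 1.0])[feature_indices]  # [0., 1., 1.]

learner = LinearTD0(num_features=int(feature_indices.sum()))
bump_obs = {"bump": True}
learner.update(phi, phi_next, cumulant(bump_obs), gamma(bump_obs))
print(learner.delta, learner.predict(phi))
```

In this sketch the bump observation ends the return (gamma is 0.0), so nothing after the third step contributes to the value of the trajectory. An actual GVF would be constructed with a real learner such as GTD and a Policy instance for target_policy, neither of which is reproduced here.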