wall_demo_example module

Runs the demo that learns to avoid walls.

This module describes an agent that learns to avoid walls. It specifies the agent’s learning algorithm, parameters, policies, features, and actions. It also interfaces with the learning_foreground and action_manager modules to run the main learning loop and to publish actions, respectively.

All parameters are set in the if __name__ == "__main__" block.

Authors:
Michele Albach, Shibhansh Dohare, Banafsheh Rafiee, Parash Rahman, Niko Yasui.
class wall_demo_example.GoForward(action_space, fwd_action_index, *args, **kwargs)[source]

Bases: policy.Policy

Target Policy.

Constant policy that only goes forward.

action_space

numpy array of action – Numpy array containing all actions available to any agent.

fwd_action_index

int – Index of action_space containing the forward action.

update(phi, observation, *args, **kwargs)[source]
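A constant "go forward" policy like GoForward can be sketched as follows. This is a minimal illustration, not the project's actual code: the Policy base class shown here, its pi attribute, and the action labels are all assumptions.

```python
import numpy as np

class Policy:
    """Minimal stand-in for the project's policy.Policy base class (assumed)."""
    def __init__(self, action_space, *args, **kwargs):
        self.action_space = np.asarray(action_space)
        # pi holds the probability of selecting each action
        self.pi = np.ones(len(self.action_space)) / len(self.action_space)

class GoForward(Policy):
    """Constant target policy that always selects the forward action."""
    def __init__(self, action_space, fwd_action_index, *args, **kwargs):
        super().__init__(action_space, *args, **kwargs)
        self.fwd_action_index = fwd_action_index
        # place all probability mass on the forward action
        self.pi = np.zeros(len(self.action_space))
        self.pi[self.fwd_action_index] = 1.0

    def update(self, phi, observation, *args, **kwargs):
        # The policy is constant, so pi never changes.
        pass

policy = GoForward(action_space=np.array(["forward", "left", "right"]),
                   fwd_action_index=0)
print(policy.pi)  # [1. 0. 0.]
```

Because the policy is constant, update is a no-op; the class exists so the target policy exposes the same interface as learned behavior policies.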
class wall_demo_example.PavlovSoftmax(time_scale, action_space, value_function, feature_indices, *args, **kwargs)[source]

Bases: policy.Policy

Behavior Policy.

Softmax policy that forces the agent to select a “turn” action if the bump sensor is on.

time_scale

float – Number of seconds in a learning timestep.

action_space

numpy array of action – Numpy array containing all actions available to any agent.

value_function

callable – A function used by the policy to update the values of pi. This is usually a value function learned by a GVF.

feature_indices

numpy array of bool – Indices of the feature vector corresponding to indices used by the value_function.

update(phi, observation, *args, **kwargs)[source]

Updates pi.
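The update can be sketched as a softmax over learned action values, with the Pavlovian bump override described above. Everything specific here is an assumption rather than the project's implementation: the "bump" observation key, the turn_indices argument, the temperature parameter, and the value_function signature.

```python
import numpy as np

def pavlov_softmax_update(phi, observation, value_function, feature_indices,
                          action_space, turn_indices, temperature=1.0):
    """Sketch of a PavlovSoftmax-style update of pi (assumed semantics)."""
    pi = np.zeros(len(action_space))
    if observation.get("bump"):
        # Pavlov-style override: if the bump sensor is on, force a turn by
        # splitting all probability mass uniformly over the turn actions.
        pi[turn_indices] = 1.0 / len(turn_indices)
        return pi

    # Otherwise act softmax over the action values, using only the
    # features the value function was learned on (feature_indices).
    q = np.array([value_function(phi[feature_indices], a) for a in action_space])
    exp_q = np.exp((q - q.max()) / temperature)  # subtract max for stability
    return exp_q / exp_q.sum()

# Toy usage with a hypothetical value function.
actions = np.array([0, 1, 2])          # e.g. forward, turn left, turn right
features = np.ones(4)
indices = np.array([True, True, False, False])
value_fn = lambda x, a: float(a)       # dummy: prefers higher-numbered actions

pi_free = pavlov_softmax_update(features, {"bump": False}, value_fn,
                                indices, actions, turn_indices=[1, 2])
pi_bump = pavlov_softmax_update(features, {"bump": True}, value_fn,
                                indices, actions, turn_indices=[1, 2])
```

When the bump sensor is off, pi is a proper softmax distribution over all actions; when it is on, the forward action gets zero probability and the turn actions share the mass equally.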