wall_demo_example module

Runs the demo that learns to avoid walls.

This module describes an agent that learns to avoid walls. It specifies the agent’s learning algorithm, parameters, policies, features, and actions. The module also interfaces with the learning_foreground module to run the main learning loop and with the action_manager module to publish actions.

All parameters are set in the if __name__ == "__main__": block.

Authors: Michele Albach, Shibhansh Dohare, Banafsheh Rafiee, Parash Rahman, Niko Yasui.

class wall_demo_example.GoForward(action_space, fwd_action_index, *args, **kwargs)

Bases: policy.Policy

Target Policy.

Constant policy that only goes forward.

action_space: numpy array of action – Numpy array containing all actions available to any agent.

fwd_action_index: int – Index of action_space containing the forward action.

update(phi, observation, *args, **kwargs)
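To illustrate what a constant "go forward" policy amounts to, here is a minimal sketch. It is not the module's implementation: GoForwardSketch and choose_action are hypothetical names, the class does not subclass policy.Policy (whose interface is not described here), and storing pi as one probability per action is an assumption.

```python
import numpy as np


class GoForwardSketch:
    """Hypothetical stand-in for GoForward; not the module's actual class."""

    def __init__(self, action_space, fwd_action_index):
        self.action_space = np.asarray(action_space)
        self.fwd_action_index = fwd_action_index
        # Assumption: pi holds one probability per action in action_space.
        self.pi = np.zeros(len(self.action_space))
        self.pi[fwd_action_index] = 1.0

    def update(self, phi, observation, *args, **kwargs):
        # A constant policy ignores both the features and the observation,
        # so pi is simply reset to put all probability on "forward".
        self.pi[:] = 0.0
        self.pi[self.fwd_action_index] = 1.0

    def choose_action(self):
        # Hypothetical helper: the forward action is always selected.
        return self.action_space[self.fwd_action_index]
```

Because update never reads phi or observation, such a policy behaves identically in every state, which is what makes it usable as a fixed target policy.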

class wall_demo_example.PavlovSoftmax(time_scale, action_space, value_function, feature_indices, *args, **kwargs)

Bases: policy.Policy

Behavior Policy.

Softmax policy that forces the agent to select a “turn” action if the bump sensor is on.

time_scale: float – Number of seconds in a learning timestep.

action_space: numpy array of action – Numpy array containing all actions available to any agent.

value_function: A function used by the policy to update values of pi. This is usually a value function learned by a GVF.

feature_indices: numpy array of bool – Indices of the feature vector corresponding to indices used by the value_function.

update(phi, observation, *args, **kwargs): Updates pi.
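The idea of a softmax policy with a forced "turn" on bump can be sketched as follows. This is an illustration under assumptions, not the class's actual update: pavlov_softmax_pi, turn_action_index, and temperature are hypothetical names, and the sketch takes per-action values directly rather than deriving them from value_function, feature_indices, or time_scale, since that derivation is not described here.

```python
import numpy as np


def pavlov_softmax_pi(action_values, bump, turn_action_index, temperature=1.0):
    """Return action probabilities, forcing a turn when the bump sensor is on.

    Hypothetical helper; every parameter name here is an assumption.
    """
    action_values = np.asarray(action_values, dtype=float)

    if bump:
        # Bump sensor on: put all probability on the turn action.
        pi = np.zeros(len(action_values))
        pi[turn_action_index] = 1.0
        return pi

    # Otherwise use a standard softmax over the action values.
    scaled = action_values / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    exp_values = np.exp(scaled)
    return exp_values / exp_values.sum()
```

With equal action values and no bump, the sketch returns a uniform distribution; with the bump flag set, the values are ignored and the turn action receives probability 1.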