learning_foreground module

Ties together the learning environment and main learning loop.

Authors:
Michele Albach, Shibhansh Dohare, David Quail, Parash Rahman, Niko Yasui.
class learning_foreground.LearningForeground(time_scale, gvfs, features_to_use, behavior_policy, stats, control_gvf=None, cumulant_counter=None, reset_episode=None, custom_stats=None)[source]

Connects the environment, through sensors and ROS, with the learning algorithms.

Parameters:
  • time_scale (float) – Length of a time step in seconds.
  • gvfs (list of GVF) – List of GVFs to learn.
  • features_to_use (set of str) – Union of the features that each GVF uses to learn its respective prediction.
  • behavior_policy (Policy) – Policy for the robot to follow.
  • stats (list of str) – List of statistics to record and publish.
  • control_gvf (GVF) – GVF that needs to be reset when the episode restarts.
  • cumulant_counter (multiprocessing Value) – Record for the number of times the cumulant is non-zero. Could be incorporated into the Evaluator.
  • reset_episode (fun) – Function that returns whether the episode should be reset.
  • custom_stats (dict of str to fun) – The custom topics defined by the user, mapping each topic name to the function that computes its value.
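For illustration, the constructor arguments that are not plain lists might be assembled as below. This is a hedged sketch: the statistic name, the argument each custom-stat function receives, and the reset condition are all assumptions, not the real calling convention.

```python
from multiprocessing import Value

# Shared counter for non-zero cumulants, matching the cumulant_counter
# parameter's type (a multiprocessing Value).
cumulant_counter = Value('i', 0)

# Hypothetical custom statistic: the topic name and the fact that the
# function receives a list of recent errors are assumptions.
custom_stats = {
    "mean_abs_td_error": lambda errors: sum(abs(e) for e in errors) / len(errors),
}

# A reset predicate for reset_episode; resetting after 100 steps is
# an arbitrary example condition.
reset_episode = lambda step: step >= 100
```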
COLLECT_DATA_FLAG

bool – Whether or not to save data in bags.

vis

bool – Whether or not to use the visualizer.

to_replay_experience

bool – Whether or not to use experience replay.

recent

dict of queue – Dictionary mapping topic names to the queue of recent values from their respective topics.

publishers

dict of ROS publishers – Dictionary mapping topic names to the publishers for the data we want to publish.

create_state(*args, **kw)[source]

Uses data from recent to create the state representation.

  1. Reads data from the recent dictionary.

  2. Processes the data into a format to pass to the state_representation module.

  3. Passes the data to get_phi() and get_observation().

Returns:
  Feature vector and ancillary state information.
Return type:
  (numpy array, dict)
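The three steps might be sketched as follows. get_phi() and get_observation() here are stand-ins for the real state_representation module, and the topic names and queue layout are illustrative assumptions.

```python
from collections import deque
import numpy as np

# Stand-ins for state_representation's functions (names taken from the
# docstring; bodies and signatures are illustrative only).
def get_phi(sensor_values):
    # Build a binary feature vector from processed sensor data.
    return np.array([1.0 if v > 0 else 0.0 for v in sensor_values])

def get_observation(sensor_values):
    # Ancillary state information that is not part of the feature vector.
    return {"bump": sensor_values[0] > 0}

def create_state(recent):
    # 1. Read the latest value for each topic from the `recent` dict.
    latest = {topic: q[-1] for topic, q in recent.items() if q}
    # 2. Process the data into a flat format.
    sensor_values = [latest[t] for t in sorted(latest)]
    # 3. Pass it to get_phi() and get_observation().
    return get_phi(sensor_values), get_observation(sensor_values)

recent = {"bump": deque([1], maxlen=10), "ir": deque([0], maxlen=10)}
phi, obs = create_state(recent)
```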
read_source(source, history=False)[source]

Reads from a topic and returns its most recent value.
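A minimal sketch of what such a read might look like, assuming the `recent` layout described above (topic name to queue of values); the behavior of the history flag is an assumption.

```python
from collections import deque

# Assumed layout of the `recent` attribute: topic name -> queue of values.
recent = {"odom": deque([0.1, 0.2, 0.3], maxlen=10)}

def read_source(source, history=False):
    # Return the whole recorded history if requested, else the newest value.
    values = list(recent[source])
    return values if history else values[-1]
```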

run()[source]

Main learning loop.

Repeat:
  1. Get new state.
  2. Take an action.
  3. Learn.
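The loop above, paced by time_scale, might look roughly like this; the state, action, and learning steps are no-op stand-ins for create_state(), take_action(), and update_gvfs().

```python
import time

def run(steps=3, time_scale=0.01):
    # Illustrative main loop; each numbered step is a stand-in.
    trace = []
    for t in range(steps):
        start = time.time()
        phi, obs = [float(t)], {"step": t}   # 1. Get the new state.
        action = "forward"                    # 2. Take an action.
        trace.append((phi, action))           # 3. Learn from the transition.
        # Sleep out the remainder of the time step so each iteration
        # takes (at least) time_scale seconds.
        time.sleep(max(0.0, time_scale - (time.time() - start)))
    return trace

trace = run()
```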
take_action(action)[source]
update_gvfs(*args, **kw)[source]

Calls the GVF update function for each GVF and publishes their updated statistics.

Parameters:
  • phi_prime (numpy array) – Feature vector for timestep t+1.
  • observation (dict) – Ancillary state information.
  • action (action) – Action taken at time t+1.
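A hedged sketch of the per-GVF update loop; the GVF.update() signature is inferred from the parameters above, and the stand-in class and published-statistics format are assumptions.

```python
import numpy as np

class GVF:
    # Minimal stand-in: a real GVF would hold weights, traces, etc.
    def __init__(self, name):
        self.name = name
        self.updates = 0

    def update(self, phi_prime, observation, action):
        # A real update would apply the learning rule (e.g. GTD(lambda)).
        self.updates += 1
        return {"prediction": float(np.dot(phi_prime, phi_prime))}

def update_gvfs(gvfs, phi_prime, observation, action):
    # Update each GVF and collect its statistics for publishing.
    stats = {}
    for gvf in gvfs:
        stats[gvf.name] = gvf.update(phi_prime, observation, action)
    return stats

gvfs = [GVF("distance_to_wall"), GVF("time_to_bump")]
stats = update_gvfs(gvfs, np.array([1.0, 0.0]), {}, "forward")
```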
learning_foreground.start_learning_foreground(time_scale, GVFs, topics, policy, stats, control_gvf=None, cumulant_counter=None, reset_episode=None, custom_stats=None)[source]

Function to call with multiprocessing or multithreading.
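Since the function is meant to be called with multiprocessing or multithreading, usage might look like the following; the function body here is a stand-in with the documented signature, and the argument values are illustrative.

```python
import threading

def start_learning_foreground(time_scale, GVFs, topics, policy, stats,
                              control_gvf=None, cumulant_counter=None,
                              reset_episode=None, custom_stats=None):
    # Stand-in body: the real function constructs a LearningForeground
    # and runs its main learning loop.
    results.append(len(GVFs))

results = []
t = threading.Thread(
    target=start_learning_foreground,
    args=(0.1, ["gvf1", "gvf2"], {"image"}, "policy", ["rupee"]),
)
t.start()
t.join()
```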