learning_foreground module

Ties together the learning environment and main learning loop.

Authors: Michele Albach, Shibhansh Dohare, David Quail, Parash Rahman, Niko Yasui.

class learning_foreground.LearningForeground(time_scale, gvfs, features_to_use, behavior_policy, stats, control_gvf=None, cumulant_counter=None, reset_episode=None, custom_stats=None)

Connects the environment, through sensors and ROS, with the learning algorithms.

Parameters:

time_scale (float) – Length of a time step in seconds.
gvfs (list of GVF) – List of GVFs to learn.
features_to_use (set of str) – Union of the features that the GVFs use to learn their respective predictions.
behavior_policy (Policy) – Policy for the robot to follow.
stats (list of str) – List of statistics to record and publish.
control_gvf (GVF) – GVF that needs to be reset when the episode restarts.
cumulant_counter (multiprocessing Value) – Record for the number of times the cumulant is non-zero. Could be incorporated into the Evaluator.
reset_episode (function) – Function that determines whether the episode should be reset.
custom_stats (dictionary[string, lambda]) – The custom topics defined by the user.

COLLECT_DATA_FLAG: bool – Whether or not to save data in bags.

vis: bool – Whether or not to use the visualizer.

to_replay_experience: bool – Whether or not to use experience replay.

recent: dict of queue – Dictionary mapping topic names to the queue of recent values from their respective topics.

publishers: dict of ROS publishers – Publishers for each of the data items we want to publish.

create_state(*args, **kw)

Uses data from recent to create the state representation.

Reads data from the recent dictionary.
Processes data into a format to pass to the state_representation module.
Passes data to get_phi() and get_observation().

Returns:	Feature vector and ancillary state information.
Return type:	(numpy array, dict)

read_source(source, history=False): Reads from the topics and returns the most recent value.
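As a minimal sketch of reading the most recent value, assume recent maps topic names to bounded queues. The sketch uses collections.deque as a stand-in, since the actual queue type is not stated here. The helper name read_latest and the topic names are hypothetical, and the sketch does not model the history argument.

```python
from collections import deque


def read_latest(recent, source):
    """Return the newest value buffered for `source`, or None if nothing is buffered."""
    queue = recent.get(source)
    if not queue:
        # Unknown topic or empty queue: there is no value to report yet.
        return None
    return queue[-1]


# Hypothetical topics; values are appended on the right as they arrive.
recent = {
    "/odom": deque([0.1, 0.2, 0.3], maxlen=10),
    "/bump": deque(maxlen=10),
}

latest_odom = read_latest(recent, "/odom")
latest_bump = read_latest(recent, "/bump")
```

A bounded deque drops the oldest entries automatically, which keeps memory use constant if a topic publishes faster than the learner consumes.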

run()

Main learning loop.

Repeat:

Get new state.
Take an action.
Learn.
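The three steps above can be sketched as a plain loop. This is an illustration only, not the module's implementation: run_loop and its callable arguments are hypothetical stand-ins, and the sketch runs a fixed number of steps without ROS or any timing based on time_scale.

```python
def run_loop(get_state, choose_action, take_action, learn, num_steps):
    """Repeat: get new state, take an action, learn."""
    for _ in range(num_steps):
        # 1. Get new state: a feature vector plus ancillary information.
        phi, observation = get_state()
        # 2. Take an action chosen by the behavior policy.
        action = choose_action(phi, observation)
        take_action(action)
        # 3. Learn from the new state and the action taken.
        learn(phi, observation, action)


# Toy stand-ins that only record what happened, in order.
log = []
run_loop(
    get_state=lambda: ([1.0, 0.0], {"bump": False}),
    choose_action=lambda phi, observation: "forward",
    take_action=lambda action: log.append(("act", action)),
    learn=lambda phi, observation, action: log.append(("learn", action)),
    num_steps=2,
)
```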

take_action(action)

update_gvfs(*args, **kw)

Calls the GVF update function for each GVF and publishes their updated statistics.

Parameters:

phi_prime (numpy array) – Feature vector for timestep t+1.
observation (dict) – Ancillary state information.
action (action) – Action taken at time t+1.
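The update-then-publish pattern described for update_gvfs might look roughly like the sketch below. ToyGVF, its update method, update_all, and the publish callback are all hypothetical stand-ins. The real GVF class and the ROS publishers are not reproduced here, and a plain list stands in for the numpy feature vector.

```python
class ToyGVF:
    """Stand-in for a GVF that only counts its updates (not the project's class)."""

    def __init__(self, name):
        self.name = name
        self.num_updates = 0

    def update(self, phi_prime, observation, action):
        # A real GVF would adjust its prediction weights here.
        self.num_updates += 1
        return {"updates": self.num_updates}


def update_all(gvfs, phi_prime, observation, action, publish):
    """Update every GVF with the same transition, then publish its statistics."""
    for gvf in gvfs:
        stats = gvf.update(phi_prime, observation, action)
        publish(gvf.name, stats)


published = []
gvfs = [ToyGVF("distance"), ToyGVF("bump")]
update_all(
    gvfs,
    phi_prime=[1.0, 0.0],
    observation={"bump": False},
    action="forward",
    publish=lambda name, stats: published.append((name, stats)),
)
```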

learning_foreground.start_learning_foreground(time_scale, GVFs, topics, policy, stats, control_gvf=None, cumulant_counter=None, reset_episode=None, custom_stats=None): Function to call with multiprocessing or multithreading.
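As a hedged illustration of the multithreading case, the sketch below starts a worker thread that shares a multiprocessing Value with its caller, in the spirit of cumulant_counter. The function toy_foreground is a hypothetical stand-in for start_learning_foreground, which needs a ROS environment and is not called here.

```python
import threading
from multiprocessing import Value


def toy_foreground(counter, num_steps):
    """Stand-in worker: increments the shared counter once per step."""
    for _ in range(num_steps):
        # get_lock() guards the read-modify-write on the shared value.
        with counter.get_lock():
            counter.value += 1


# "i" declares a shared signed integer, initialised to 0.
counter = Value("i", 0)

worker = threading.Thread(target=toy_foreground, args=(counter, 5))
worker.start()
worker.join()  # wait for the worker before reading the result

final_count = counter.value
```

The same target and arguments could be passed to multiprocessing.Process instead of threading.Thread. A multiprocessing Value is used here because it can be shared in both cases.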