Tasks
This module demonstrates how to build environments for various tasks with Vista. The environments roughly follow the OpenAI Gym interface for reinforcement learning, with member functions such as reset and step, and return values such as observation, reward, done, and info.
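For instance, a typical interaction loop looks roughly like the sketch below, where env is an already-constructed task environment and sample_action is a hypothetical user-defined helper that returns an action for each agent:

>>> observation = env.reset()
>>> done = {agent_id: False for agent_id in observation}
>>> while not all(done.values()):
...     actions = {agent_id: sample_action(agent_id) for agent_id in observation}
...     observation, reward, done, info = env.step(actions)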
- class vista.tasks.lane_following.LaneFollowing(trace_paths: List[str], trace_config: Dict, car_config: Dict, sensors_configs: Optional[List[Dict]] = [], task_config: Optional[Dict] = {}, logging_level: Optional[str] = 'WARNING')

  This class defines a simple lane-following task in Vista. It handles the vehicle state update of the ego car, rendering of the specified sensors, and determination of the reward and terminal condition. The default terminal condition is triggered by (1) going out of the lane, (2) exceeding the maximal rotation, or (3) reaching the end of the trace.
  - Parameters
    trace_paths (List[str]) – A list of trace paths.
    trace_config (Dict) – Configuration of the trace.
    car_config (Dict) – Configuration of the ego car.
    sensors_configs (List[Dict]) – Configuration of the sensors on the ego car.
    task_config (Dict) – Configuration of the task, which specifies the reward function and terminal condition. For more details, please check the source code.
    logging_level (str) – Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL); defaults to WARNING.
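Putting the constructor arguments above together, a construction sketch might look like the following; the trace path and all keys inside trace_config, car_config, and the camera sensor config are illustrative placeholders, not authoritative defaults:

>>> from vista.tasks.lane_following import LaneFollowing
>>> env = LaneFollowing(
...     trace_paths=['/path/to/trace'],                # placeholder path
...     trace_config={'road_width': 4},                # placeholder keys/values
...     car_config={'length': 5., 'width': 2.,
...                 'wheel_base': 2.78, 'steering_ratio': 14.7},
...     sensors_configs=[{'type': 'camera', 'name': 'camera_front',
...                       'size': (200, 320)}],
...     task_config={},
...     logging_level='WARNING')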
- reset()

  Reset the environment. This resets the World object in Vista, which resets all (here, the only) agents in the world.

  - Returns
    A dictionary with agent IDs as keys and per-agent observations as values; each observation is itself a dictionary with sensor IDs as keys and sensory measurements as values.
  - Return type
    Dict
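The nested return structure can be traversed as in this small sketch (agent and sensor IDs are whatever the environment assigns):

>>> observation = env.reset()
>>> for agent_id, agent_obs in observation.items():
...     for sensor_id, measurement in agent_obs.items():
...         print(agent_id, sensor_id, getattr(measurement, 'shape', type(measurement)))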
- step(action, dt=0.03333333333333333)

  Step the environment. This involves updating the agent's state based on the given action and determining the reward and termination.

  - Parameters
    action (Dict[str, np.ndarray]) – A dictionary with agent IDs as keys and actions to be executed as values, used to interact with the environment and other agents.
    dt (float) – Elapsed time in seconds; defaults to 1/30.
  - Returns
    A tuple (dict_a, dict_b, dict_c, dict_d), where dict_a is the observation, dict_b is the reward, dict_c is whether the episode terminates, and dict_d is additional information for each agent; the keys of every dictionary are agent IDs.
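A single step and the unpacking of its return tuple then look roughly like this; the zero action array is only a placeholder, since the actual action layout depends on the task:

>>> import numpy as np
>>> actions = {agent_id: np.zeros(2) for agent_id in observation}  # placeholder action
>>> observation, reward, done, info = env.step(actions, dt=1 / 30.)
>>> ego_id = list(observation.keys())[0]
>>> print(reward[ego_id], done[ego_id], info[ego_id])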
- property config

  Configuration of this task.

- property world

  World of this task.

- property seed

  Random seed for the task and the associated World.
- class vista.tasks.multi_agent_base.MultiAgentBase(trace_paths: List[str], trace_config: Dict, car_configs: List[Dict], sensors_configs: List[List[Dict]], task_config: Optional[Dict] = {}, logging_level: Optional[str] = 'WARNING')

  This class builds a simple environment with multiple cars in the scene. It randomly initializes ado cars in front of the ego car, checks collisions between cars, handles meshes for all virtual agents, and determines the terminal condition.
  - Parameters
    trace_paths (List[str]) – A list of trace paths.
    trace_config (Dict) – Configuration of the trace.
    car_configs (List[Dict]) – Configuration of every car.
    sensors_configs (List[List[Dict]]) – Configuration of the sensors on every car.
    task_config (Dict) – Configuration of the task. An example (the default) is,

    >>> DEFAULT_CONFIG = {
    ...     'n_agents': 1,
    ...     'mesh_dir': None,
    ...     'overlap_threshold': 0.05,
    ...     'max_resample_tries': 10,
    ...     'init_dist_range': [5., 10.],
    ...     'init_lat_noise_range': [-1., 1.],
    ...     'init_yaw_noise_range': [-0.0, 0.0],
    ...     'reward_fn': default_reward_fn,
    ...     'terminal_condition': default_terminal_condition
    ... }

    Note that both reward_fn and terminal_condition have the function signature f(task, agent_id, **kwargs) -> (value, dict). For more details, please check the source code. A construction sketch with a custom reward function is shown after this parameter list.
    logging_level (str) – Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL); defaults to WARNING.
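As referenced above, a construction sketch with a custom reward function following the documented f(task, agent_id, **kwargs) -> (value, dict) signature might look like the following; the paths and config keys are illustrative placeholders:

>>> from vista.tasks.multi_agent_base import MultiAgentBase
>>> def my_reward_fn(task, agent_id, **kwargs):
...     return 1.0, {}  # placeholder: +1 for every surviving step
>>> env = MultiAgentBase(
...     trace_paths=['/path/to/trace'],                    # placeholder path
...     trace_config={'road_width': 4},                    # placeholder keys/values
...     car_configs=[{'length': 5., 'width': 2.}] * 2,     # ego + one ado car
...     sensors_configs=[[{'type': 'camera', 'name': 'camera_front'}], []],
...     task_config={'n_agents': 2,
...                  'mesh_dir': '/path/to/meshes',
...                  'reward_fn': my_reward_fn})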
- reset() → Dict

  Reset the environment. This involves the regular world reset, randomly initializing ado agents in front of the ego agent, and resetting the mesh library for all virtual agents.

  - Returns
    A dictionary with agent IDs as keys and per-agent observations as values; each observation is itself a dictionary with sensor IDs as keys and sensory measurements as values.
  - Return type
    Dict
- step(actions, dt=0.03333333333333333)

  Step the environment. This includes updating the agents' states, synthesizing the agents' observations, checking terminal conditions, and computing rewards.

  - Parameters
    actions (Dict[str, np.ndarray]) – A dictionary with agent IDs as keys and actions to be executed as values, used to interact with the environment and other agents.
    dt (float) – Elapsed time in seconds; defaults to 1/30.
  - Returns
    A tuple (dict_a, dict_b, dict_c, dict_d), where dict_a is the observation, dict_b is the reward, dict_c is whether the episode terminates, and dict_d is additional information for each agent; the keys of every dictionary are agent IDs.
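A multi-agent rollout mirrors the single-agent loop, with one action per agent; the zero actions are again placeholders:

>>> import numpy as np
>>> observation = env.reset()
>>> done = {agent_id: False for agent_id in observation}
>>> while not any(done.values()):
...     actions = {agent_id: np.zeros(2) for agent_id in observation}
...     observation, reward, done, info = env.step(actions, dt=1 / 30.)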
- property config

  Configuration of this task.

- property ego_agent

  Ego agent.

- property world

  World of this task.

- property seed

  Random seed for the task and the associated World.