Tasks

This module demonstrates how to build environments for various tasks with Vista. The environments roughly follow the OpenAI Gym interface for reinforcement learning (member functions such as reset and step, and return values such as observation, reward, done, and info).
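
As a rough illustration of this interface, the following is a minimal sketch of an interaction loop, assuming an already-constructed task environment env and a user-provided policy function (both are placeholders, not part of this module):

    # Minimal sketch of the Gym-like reset/step loop used by Vista tasks.
    # `env` and `policy` are placeholders supplied by the user.
    observations = env.reset()  # Dict: agent ID -> {sensor ID: measurement}
    done = False
    while not done:
        # One action per agent; `policy` is a hypothetical controller.
        actions = {agent_id: policy(obs)
                   for agent_id, obs in observations.items()}
        observations, rewards, dones, infos = env.step(actions)
        done = any(dones.values())  # stop once any agent terminates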

class vista.tasks.lane_following.LaneFollowing(trace_paths: List[str], trace_config: Dict, car_config: Dict, sensors_configs: Optional[List[Dict]] = [], task_config: Optional[Dict] = {}, logging_level: Optional[str] = 'WARNING')[source]

This class defines a simple lane-following task in Vista. It handles the vehicle state update of the ego car, rendering of the specified sensors, and determining the reward and terminal condition. The default terminal condition is triggered by (1) going out of the lane, (2) exceeding the maximal rotation, or (3) reaching the end of the trace.

Parameters
  • trace_paths (List[str]) – A list of trace paths.

  • trace_config (Dict) – Configuration of the trace.

  • car_config (Dict) – Configuration of the ego car.

  • sensors_configs (List[Dict]) – Configurations of the sensors on the ego car.

  • task_config (Dict) – Configuration of the task, which specifies reward function and terminal condition. For more details, please check the source code.

  • logging_level (str) – Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL); default set to WARNING.
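
A hedged construction example is sketched below. The trace path and every key/value inside trace_config and car_config are illustrative assumptions only; consult the trace, car, and sensor documentation for the keys that are actually supported.

    from vista.tasks.lane_following import LaneFollowing

    # Illustrative construction of a lane-following task; all config values
    # below are placeholder assumptions, not documented defaults.
    env = LaneFollowing(
        trace_paths=['/path/to/trace'],       # placeholder trace directory
        trace_config={'road_width': 4},       # assumed key, for illustration
        car_config={
            'length': 5.0,                    # assumed keys describing the ego car
            'width': 2.0,
            'wheel_base': 2.78,
            'steering_ratio': 14.7,
        },
    )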

reset()[source]

Reset the environment. This resets the World object in Vista, which in turn resets all agents (here, the only agent) in the world.

Returns

A dictionary mapping agent IDs to each agent's observation; each observation is itself a dictionary mapping sensor IDs to sensory measurements.

Return type

Dict

step(action, dt=0.03333333333333333)[source]

Step the environment. This involves updating the agent's state based on the given action and determining the reward and termination.

Parameters
  • action (Dict[str, np.ndarray]) – A dictionary mapping agent IDs to the actions to be executed in the environment.

  • dt (float) – Elapsed time in seconds; defaults to 1/30.

Returns

Return a tuple (dict_a, dict_b, dict_c, dict_d), where dict_a is the observation, dict_b the reward, dict_c whether the episode terminates, and dict_d additional information for each agent; the keys of every dictionary are agent IDs.
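
For illustration, a single step call and the handling of the returned per-agent dictionaries could look as follows; the [curvature, speed] action layout below is an assumption, not a documented contract:

    import numpy as np

    observations = env.reset()
    agent_id = next(iter(observations))            # the single ego agent
    action = {agent_id: np.array([0.0, 10.0])}     # assumed [curvature, speed] layout
    observations, rewards, dones, infos = env.step(action, dt=1 / 30.)
    sensor_obs = observations[agent_id]            # Dict: sensor ID -> measurement
    print(rewards[agent_id], dones[agent_id])      # every dict is keyed by agent ID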

set_seed(seed) → None[source]

Set random seed.

Parameters

seed (int) – Random seed.

property config

Configuration of this task.

property world

World of this task.

property seed

Random seed for the task and the associated World.

class vista.tasks.multi_agent_base.MultiAgentBase(trace_paths: List[str], trace_config: Dict, car_configs: List[Dict], sensors_configs: List[List[Dict]], task_config: Optional[Dict] = {}, logging_level: Optional[str] = 'WARNING')[source]

This class builds a simple environment with multiple cars in the scene. It randomly initializes ado cars in front of the ego car, checks collisions between cars, handles meshes for all virtual agents, and determines the terminal condition.

Parameters
  • trace_paths (List[str]) – A list of trace paths.

  • trace_config (Dict) – Configuration of the trace.

  • car_configs (List[Dict]) – Configuration of each car.

  • sensors_configs (List[List[Dict]]) – Configurations of the sensors on each car.

  • task_config (Dict) –

    Configuration of the task. An example (default) is,

    >>> DEFAULT_CONFIG = {
            'n_agents': 1,
            'mesh_dir': None,
            'overlap_threshold': 0.05,
            'max_resample_tries': 10,
            'init_dist_range': [5., 10.],
            'init_lat_noise_range': [-1., 1.],
            'init_yaw_noise_range': [-0.0, 0.0],
            'reward_fn': default_reward_fn,
            'terminal_condition': default_terminal_condition
        }
    

    Note that both reward_fn and terminal_condition have the function signature f(task, agent_id, **kwargs) -> (value, dict); a hedged sketch of a custom hook follows this parameter list. For more details, please check the source code.

  • logging_level (str) – Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL); default set to WARNING.
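
As a sketch of plugging in a custom hook with the documented signature f(task, agent_id, **kwargs) -> (value, dict), the reward below simply returns a constant survival bonus; its body is illustrative only and does not reflect the default implementation.

    # Hypothetical custom reward following the documented signature.
    def my_reward_fn(task, agent_id, **kwargs):
        reward = 1.0                      # placeholder: constant reward per step
        info = {'agent_id': agent_id}     # extra diagnostics returned alongside
        return reward, info

    task_config = {
        'n_agents': 3,
        'reward_fn': my_reward_fn,        # overrides default_reward_fn shown above
    }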

reset() → Dict[source]

Reset the environment. This involves the regular world reset, randomly initializing ado agents in front of the ego agent, and resetting the mesh library for all virtual agents.

Returns

A dictionary mapping agent IDs to each agent's observation; each observation is itself a dictionary mapping sensor IDs to sensory measurements.

Return type

Dict

step(actions, dt=0.03333333333333333)[source]

Step the environment. This includes updating agents’ states, synthesizing agents’ observations, checking terminal conditions, and computing rewards.

Parameters
  • actions (Dict[str, np.ndarray]) – A dictionary mapping agent IDs to the actions to be executed to interact with the environment and the other agents.

  • dt (float) – Elapsed time in seconds; defaults to 1/30.

Returns

Return a tuple (dict_a, dict_b, dict_c, dict_d), where dict_a is the observation, dict_b the reward, dict_c whether the episode terminates, and dict_d additional information for each agent; the keys of every dictionary are agent IDs.
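
A short multi-agent rollout could then be sketched as below, where env is an already-constructed MultiAgentBase task and policy a hypothetical per-agent controller (both placeholders):

    observations = env.reset()
    total_reward = {agent_id: 0.0 for agent_id in observations}
    for _ in range(100):                           # fixed horizon, for illustration
        actions = {agent_id: policy(obs)
                   for agent_id, obs in observations.items()}
        observations, rewards, dones, infos = env.step(actions, dt=1 / 30.)
        for agent_id, r in rewards.items():
            total_reward[agent_id] += r            # accumulate per-agent reward
        if all(dones.values()):                    # stop once every agent terminates
            break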

set_seed(seed) → None[source]

Set random seed.

Parameters

seed (int) – Random seed.

property config

Configuration of this task.

property ego_agent

Ego agent.

property world

World of this task.

property seed

Random seed for the task and the associated World.