Sensors
class vista.entities.sensors.BaseSensor.BaseSensor(attach_to: vista.entities.Entity.Entity, config: Optional[Dict] = None)
Base class of all sensors.

Parameters:
- attach_to (Entity) – The car to attach this sensor to.
- config (dict) – Configuration of the sensor.
capture(timestamp: float, **kwargs) → Any
Run sensor synthesis based on the current timestamp and the transformation between the novel viewpoint to be simulated and the nominal viewpoint from the pre-collected dataset.

Parameters:
- timestamp (float) – Timestamp used to retrieve a pointer into the dataset for data-driven simulation (e.g., synthesizing a point cloud from a real LiDAR sweep).
update_scene_object(name: str, scene_object: Any, pose: Any) → None
Update an object in the scene for rendering. This is only used when virtual objects are placed in the scene.

Parameters:
- name (str) – Name of the scene object.
- scene_object (Any) – The scene object.
- pose (Any) – The pose of the scene object.
property name
The name of the sensor.

property id
The identifier of this entity.

property parent
The parent of this entity.
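A custom sensor can be defined by subclassing BaseSensor and overriding capture. A minimal sketch (the PoseSensor class below is hypothetical, not part of VISTA):

>>> from typing import Any
>>> from vista.entities.sensors.BaseSensor import BaseSensor
>>> class PoseSensor(BaseSensor):
...     """Toy sensor that echoes the query timestamp."""
...     def capture(self, timestamp: float, **kwargs) -> Any:
...         # A real sensor would synthesize data from the dataset here.
...         return {'timestamp': timestamp, 'sensor': self.name}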
class vista.entities.sensors.MeshLib.MeshLib(root_dirs: List[str])
Handles all meshes of actors (as opposed to the scene/background itself). It reads in all meshes with the .obj extension, calibrates them so that they are centered at the origin, and converts them to pyrender.Mesh objects for later use. Note that this class is written specifically for one set of meshes (carpack01); for a custom set of meshes, you would need to change how meshes are read from the root directory and how they are calibrated.

Parameters:
- root_dirs (List[str]) – A list of root directories that contain meshes.

Raises:
- AttributeError – If there is no mesh in the directory.
The source code also contains a main function that uses this class standalone to render meshes in pyrender.

reset(n_agents: int, random: Optional[bool] = True) → None
Reset agents' meshes by sampling n_agents meshes from the mesh library.

Parameters:
- n_agents (int) – Number of agents.
- random (bool) – Whether to randomly sample n_agents meshes from the entire mesh library; defaults to True.
property fpaths
Paths to all meshes.

property tmeshes
A list of trimesh objects.

property n_tmeshes
Number of trimeshes.

property agents_meshes
A list of meshes for all agents.

property agents_meshes_dim
The dimensions (width, length) of agents' meshes.
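For illustration, a typical workflow might look like the following (the ./carpack01 path is a placeholder):

>>> from vista.entities.sensors.MeshLib import MeshLib
>>> mesh_lib = MeshLib(['./carpack01'])      # root directory containing .obj meshes
>>> mesh_lib.reset(n_agents=3, random=True)  # sample meshes for 3 agents
>>> len(mesh_lib.agents_meshes)              # one mesh per agent
3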
RGB Camera
class vista.entities.sensors.Camera.Camera(attach_to: vista.entities.Entity.Entity, config: Dict)
An RGB camera sensor object that synthesizes RGB images locally around the dataset given a viewpoint (potentially different from the dataset) and a timestamp.

Parameters:
- attach_to (Entity) – A parent object (Car) to attach to.
- config (Dict) – Configuration of the sensor. An example (default) is:

>>> DEFAULT_CONFIG = {
...     'depth_mode': 'FIXED_PLANE',
...     'znear': ZNEAR,
...     'zfar': ZFAR,
...     'use_lighting': False,
...     'directional_light_intensity': 10,
...     'recoloring_factor': 0.5,
...     'use_synthesizer': True,
... }

Check the ViewSynthesis object for more details about the configuration.
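In practice, a Camera is usually created through a Car rather than instantiated directly. A sketch of the standard VISTA workflow (the trace path and config values are placeholders, and the agent config is abbreviated):

>>> import vista
>>> world = vista.World(['./traces/my_trace'])
>>> car = world.spawn_agent(config={'length': 5., 'width': 2.})
>>> camera = car.spawn_camera(config={'size': (200, 320)})
>>> world.reset()
>>> frame = car.observations[camera.name]    # synthesized RGB image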
reset() → None
Reset the RGB camera sensor by initiating an RGB data stream based on the current reference pointer to the dataset.
capture(timestamp: float, **kwargs) → numpy.ndarray
Synthesize an RGB image based on the current timestamp and the transformation between the novel viewpoint to be simulated and the nominal viewpoint from the pre-collected dataset. Note that if optical flow data exists in the trace directory, the Camera object uses it to interpolate across frames to the exact timestamp, as opposed to retrieving the RGB frame with the closest timestamp in the dataset.

Parameters:
- timestamp (float) – Timestamp used to retrieve a pointer into the dataset for data-driven simulation (synthesizing an RGB image from real RGB video).

Returns:
A synthesized RGB image.

Return type:
np.ndarray
update_scene_object(name: str, scene_object: pyrender.mesh.Mesh, pose: Any) → None
Update a pyrender mesh object in the scene for rendering.

Parameters:
- name (str) – Name of the scene object.
- scene_object (pyrender.Mesh) – The scene object.
- pose (Any) – The pose of the scene object.
property config
Configuration of the RGB camera sensor.

property camera_param
Camera parameters of the virtual camera.

property streams
Data stream of the RGB image/video dataset to be simulated from.

property flow_streams
Data stream of optical flow (if any).

property flow_meta
Metadata of optical flow (if any).

property view_synthesis
View synthesizer object.

property id
The identifier of this entity.

property name
The name of the sensor.

property parent
The parent of this entity.
class vista.entities.sensors.camera_utils.ViewSynthesis.ViewSynthesis(camera_param: vista.entities.sensors.camera_utils.CameraParams.CameraParams, config: Dict, init_with_bg_mesh: Optional[bool] = True)
An RGB synthesizer that simulates RGB images at novel viewpoints around a pre-collected RGB image/video dataset. Conceptually, it (1) projects a reference 2D RGB image to a 3D colored mesh using camera projection matrices and (approximated) depth, (2) places virtual objects in the scene, and (3) renders an RGB image at the novel viewpoint.

Parameters:
- camera_param (CameraParams) – Camera parameter object of the virtual camera.
- config (Dict) – Configuration of the synthesizer.
- init_with_bg_mesh (bool) – Whether to initialize with the background mesh; defaults to True.
synthesize(trans: numpy.ndarray, rot: numpy.ndarray, imgs: Dict[str, numpy.ndarray], depth: Optional[Dict[str, numpy.ndarray]] = None) → Tuple[numpy.ndarray, numpy.ndarray]
Synthesize an RGB image at the novel viewpoint specified by trans and rot with respect to the nominal viewpoint that corresponds to a set of RGB images and depth maps.

Parameters:
- trans (np.ndarray) – Translation vector.
- rot (np.ndarray) – Rotation vector in Euler angles.
- imgs (Dict[str, np.ndarray]) – A set of images (potentially from multiple cameras).
- depth (Dict[str, np.ndarray]) – A set of depth maps corresponding to imgs.

Returns:
A tuple (array_1, array_2), where array_1 is the synthesized RGB image and array_2 is the corresponding depth image.
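As a sketch, querying the nominal viewpoint itself amounts to a zero offset (here vs is assumed to be an already-constructed ViewSynthesis, frame a nominal RGB frame, and 'camera_front' a placeholder camera name):

>>> import numpy as np
>>> trans = np.zeros((3,))     # no translation offset
>>> rot = np.zeros((3,))       # no rotation offset (Euler angles)
>>> rgb, depth = vs.synthesize(trans, rot, {'camera_front': frame})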
update_object_node(name: str, mesh: pyrender.mesh.Mesh, trans: numpy.ndarray, quat: numpy.ndarray) → None
Update a virtual object in the scene.

Parameters:
- name (str) – Name of the virtual object.
- mesh (pyrender.Mesh) – Mesh of the virtual object.
- trans (np.ndarray) – Translation vector.
- quat (np.ndarray) – Quaternion vector.
add_bg_mesh(camera_param: vista.entities.sensors.camera_utils.CameraParams.CameraParams) → None
Add a background mesh to the scene based on the camera projection and the initial depth. The color of the mesh is updated at every synthesize call; if not using the ground-plane depth approximation, the geometry of the mesh is also updated.

Parameters:
- camera_param (CameraParams) – Camera parameters of the virtual camera.
property bg_mesh_names
Names of all background meshes in the scene.

property object_nodes
Pyrender nodes of all virtual objects added to the scene.

property config
Configuration of the view synthesizer.
class vista.entities.sensors.camera_utils.CameraParams.CameraParams(rig_path: str = None, name: str = None, params: dict = None)
The CameraParams object stores information pertaining to a single physical camera mounted on the car. It encapsulates the relevant calibration information for easy access in other modules.

Parameters:
- rig_path (str) – Path to a RIG.xml file that specifies the camera parameters.
- name (str) – Name of the camera identifier to initialize. Must be a valid topic name present inside the RIG.xml file. Can also be None to automatically grab the first named camera in the RIG.xml file.
- params (dict) – Dictionary of camera parameters to instantiate with. If not provided, rig_path is used.

Raises:
- ValueError – If name is provided but not found in the rig file.
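For illustration, a typical usage might be (the path and camera name are placeholders):

>>> from vista.entities.sensors.camera_utils.CameraParams import CameraParams
>>> cam = CameraParams(rig_path='./RIG.xml', name='camera_front')
>>> cam.resize(200, 320)       # rescale intrinsics to 200x320 images
>>> K = cam.get_K()            # (3, 3) intrinsic matrix after resizing
>>> h, w = cam.get_height(), cam.get_width()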
resize(height: int, width: int) → None
Scale the camera object and adjust its internal parameters so that it projects images of the given size.

Parameters:
- height (int) – New height of the camera images in pixels.
- width (int) – New width of the camera images in pixels.
crop(i1: int, j1: int, i2: int, j2: int) → None
Crop the camera object to a region of interest specified by the coordinates of the top-left (i1, j1) and bottom-right (i2, j2) corners.

Parameters:
- i1 (int) – Top row of the ROI.
- j1 (int) – Left column of the ROI.
- i2 (int) – Bottom row of the ROI.
- j2 (int) – Right column of the ROI.
get_height() → int
Get the raw pixel height of images captured by the camera.

Returns:
Height in pixels.

Return type:
int

get_width() → int
Get the raw pixel width of images captured by the camera.

Returns:
Width in pixels.

Return type:
int
get_K() → numpy.ndarray
Get the intrinsic calibration matrix.

Returns:
Intrinsic matrix (3, 3).

Return type:
np.array

get_K_inv() → numpy.ndarray
Get the inverse intrinsic calibration matrix.

Returns:
Inverse intrinsic matrix (3, 3).

Return type:
np.array

get_distortion() → numpy.ndarray
Get the distortion coefficients of the camera.

Returns:
Distortion coefficients (-1,).

Return type:
np.array
get_position() → numpy.ndarray
Get the 3D position of the camera.

Returns:
3D position of the camera.

Return type:
np.array

get_quaternion() → numpy.ndarray
Get the rotation of the camera as a quaternion.

Returns:
Rotation of the camera as a quaternion.

Return type:
np.array

get_yaw() → float
Get the yaw of the camera relative to the frame of reference.

Returns:
Yaw of the camera [rad].

Return type:
float
get_ground_plane() → List[float]
Get the equation of the ground plane. The ground plane is given by Ax + By + Cz = D and is computed from the position and orientation of the camera.

Returns:
Parameterization of the ground plane: [A, B, C, D].

Return type:
List[float]
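For intuition, the ground plane can be used to approximate the depth at a pixel by intersecting its back-projected ray with the plane. A sketch that ignores lens distortion and assumes the plane is expressed in the camera frame:

>>> import numpy as np
>>> A, B, C, D = cam.get_ground_plane()
>>> n = np.array([A, B, C])                      # plane normal
>>> u, v = 160, 100                              # an arbitrary pixel
>>> ray = cam.get_K_inv() @ np.array([u, v, 1.])
>>> point_3d = (D / (n @ ray)) * ray             # ray-plane intersection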
get_roi(axis: Optional[str] = 'ij') → List[int]
Get the region of interest of the images captured by the camera.

Parameters:
- axis (str) – Axis order to return the coordinates in; can be 'ij' (default) or 'xy'.

Returns:
Coordinates of the ROI box.

Return type:
List[int]

Raises:
- ValueError – If axis is not valid.
get_roi_angle() → float
Get the angle of the region of interest.

Returns:
The rotation of the ROI box.

Return type:
float
3D LiDAR
class vista.entities.sensors.Lidar.Lidar(attach_to: vista.entities.Entity.Entity, config: Dict)
A LiDAR sensor object that synthesizes LiDAR measurements locally around the dataset given a viewpoint (potentially different from the dataset) and a timestamp.

Parameters:
- attach_to (Entity) – The car to attach this sensor to.
- config (dict) – Configuration of the LiDAR sensor. An example (default) is:

>>> DEFAULT_CONFIG = {
...     'name': 'lidar_3d',
...     'yaw_fov': None,
...     'pitch_fov': None,
...     'culling_r': 1,
...     'use_synthesizer': True,
... }

Check the LidarSynthesis object for more details about the configuration.
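Analogous to the Camera example above, a LiDAR is typically spawned from a Car (a sketch; config values are placeholders):

>>> lidar = car.spawn_lidar(config={'yaw_fov': (-180., 180.)})
>>> world.reset()
>>> pcd = car.observations[lidar.name]    # synthesized point cloud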
reset() → None
Reset the LiDAR sensor by initiating a LiDAR data stream based on the current reference pointer to the dataset.
capture(timestamp: float, **kwargs) → numpy.ndarray
Synthesize a LiDAR point cloud based on the current timestamp and the transformation between the novel viewpoint to be simulated and the nominal viewpoint from the pre-collected dataset.

Parameters:
- timestamp (float) – Timestamp used to retrieve a pointer into the dataset for data-driven simulation (synthesizing a point cloud from a real LiDAR sweep).

Returns:
Synthesized point cloud.

Return type:
np.ndarray
update_scene_object(name: str, scene_object: Any, pose: Any) → None
Adding virtual objects in LiDAR synthesis is not yet implemented.
property config
Configuration of the LiDAR sensor.

property streams
Data stream of the LiDAR dataset to be simulated from.

property id
The identifier of this entity.

property name
The name of the sensor.

property parent
The parent of this entity.

property view_synthesis
View synthesizer object for the first trace.
class vista.entities.sensors.lidar_utils.LidarSynthesis.LidarSynthesis(input_yaw_fov: Tuple[float, float], input_pitch_fov: Tuple[float, float], yaw_fov: Optional[Tuple[float, float]] = None, pitch_fov: Optional[Tuple[float, float]] = None, yaw_res: float = 0.1, pitch_res: float = 0.1, culling_r: int = 1, load_model: bool = True, **kwargs)
A LiDAR synthesizer that simulates a point cloud at a novel viewpoint around a pre-collected LiDAR sweep. At a high level, it involves (1) performing a rigid transformation on the point cloud based on the given viewpoint change, (2) projecting the 3D point cloud to 2D image space with angle coordinates, (3) densifying the sparse 2D image, (4) culling occluded regions, (5) masking out some points/pixels to simulate the sparse pattern of a LiDAR sweep, and (6) reprojecting back to a 3D point cloud or rays.

Parameters:
- input_yaw_fov (Tuple[float, float]) – Input LiDAR field of view in the yaw axis; can be read from the params.xml file.
- input_pitch_fov (Tuple[float, float]) – Input LiDAR field of view in the pitch axis; can be read from the params.xml file.
- yaw_fov (Tuple[float, float]) – Output LiDAR field of view in the yaw axis; defaults to input_yaw_fov.
- pitch_fov (Tuple[float, float]) – Output LiDAR field of view in the pitch axis; defaults to input_pitch_fov.
- yaw_res (float) – Resolution in the yaw axis; defaults to 0.1.
- pitch_res (float) – Resolution in the pitch axis; defaults to 0.1.
- culling_r (int) – The radius (from the origin) for culling occluded points.
- load_model (bool) – Whether to load the LiDAR densifier model; defaults to True.
synthesize(trans: numpy.ndarray, rot: numpy.ndarray, pcd: numpy.ndarray) → Tuple[vista.entities.sensors.lidar_utils.Pointcloud.Pointcloud, numpy.ndarray]
Apply a rigid transformation to a dense point cloud and return both a sparse point cloud and a dense 2D representation.

Parameters:
- trans (np.ndarray) – Translation vector.
- rot (np.ndarray) – Rotation matrix.
- pcd (np.ndarray) – Point cloud.

Returns:
A tuple (pointcloud, array), where pointcloud is the synthesized point cloud with the viewpoint change from the transform (trans, rot), and array is the dense depth map in 2D image space.
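For intuition on step (2) above, a 3D point maps to angle-image coordinates roughly as follows (a simplified sketch of the binning, not the library's exact implementation):

>>> import numpy as np
>>> xyz = np.random.randn(1000, 3)                       # stand-in point cloud
>>> dist = np.linalg.norm(xyz, axis=1)
>>> yaw = np.degrees(np.arctan2(xyz[:, 1], xyz[:, 0]))
>>> pitch = np.degrees(np.arcsin(xyz[:, 2] / dist))
>>> u = ((yaw + 180.) / 0.1).astype(int)                 # yaw bin at 0.1 deg resolution
>>> v = ((pitch + 90.) / 0.1).astype(int)                # pitch bin (offset is illustrative)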
class vista.entities.sensors.lidar_utils.Pointcloud.Point(value)
Point feature, including x, y, z, intensity, depth, and mask.
class vista.entities.sensors.lidar_utils.Pointcloud.Pointcloud(xyz: Union[torch.Tensor, numpy.ndarray], intensity: Optional[Union[torch.Tensor, numpy.ndarray]] = None)
A helper class that makes handling point clouds easier, with functionality for transforming a point cloud and extracting features/properties from it. A Pointcloud can be built from either numpy.ndarray or torch.Tensor data; methods maintain the same data type.

Parameters:
- xyz (tensor) – x, y, z positions of the point cloud, with shape (N, 3).
- intensity (tensor) – Intensity of the point cloud, with shape (N,).
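A short sketch of typical usage (random data stands in for a real sweep; whether transform mutates in place or returns a new cloud is not assumed here):

>>> import numpy as np
>>> from vista.entities.sensors.lidar_utils.Pointcloud import Pointcloud
>>> pcd = Pointcloud(np.random.rand(100, 3), intensity=np.random.rand(100))
>>> moved = pcd.transform(R=np.eye(3), trans=np.array([1., 0., 0.]))
>>> pcd.num_points, pcd.dist.shape
(100, (100,))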
transform(R: Optional[Union[torch.Tensor, numpy.ndarray]] = None, trans: Optional[Union[torch.Tensor, numpy.ndarray]] = None)
Transform the point cloud.

Parameters:
- R (tensor) – Rotation matrix with shape (3, 3).
- trans (tensor) – Translation vector with shape (3,).

Raises:
- AssertionError – If the rotation matrix is not (3, 3) or the translation is not (3,).
get(feature: vista.entities.sensors.lidar_utils.Pointcloud.Point) → numpy.ndarray
Get a feature (x, y, z, intensity, depth, mask) of the point cloud.

Parameters:
- feature (Point) – Feature to extract from the point cloud.

Returns:
Point feature.

Return type:
np.ndarray

Raises:
- ValueError – Unrecognized Point feature.
numpy()
Return a copy of the torch point cloud built using numpy. If the point cloud is already in numpy format, a copy is returned.
property num_points
Number of points.

property x
The x component of all points.

property y
The y component of all points.

property z
The z component of all points.

property xyz
The xyz coordinates of all points.

property intensity
The intensity of all points.

property dist
Distance to the origin of all points.

property yaw
Yaw angle (radians) of each point in the cloud.

property pitch
Pitch angle (radians) of each point in the cloud.
Event Camera
class vista.entities.sensors.EventCamera.EventCamera(attach_to: vista.entities.Entity.Entity, config: Dict)
An event camera sensor object that synthesizes event data locally around the RGB dataset, given a viewpoint (potentially different from the dataset) and a timestamp, using video interpolation and an event emission model.

Parameters:
- attach_to (Entity) – A parent object (Car) to attach to.
- config (Dict) – Configuration of the sensor. An example (default) config is:

>>> DEFAULT_CONFIG = {
...     'rig_path': None,
...     # Event camera
...     'name': 'event_camera_front',
...     'original_size': (480, 640),
...     'size': (240, 320),
...     'optical_flow_root': '../data_prep/Super-SloMo',
...     'checkpoint': '../data_prep/Super-SloMo/ckpt/SuperSloMo.ckpt',
...     'lambda_flow': 0.5,
...     'max_sf': 16,
...     'use_gpu': True,
...     'positive_threshold': 0.1,
...     'sigma_positive_threshold': 0.02,
...     'negative_threshold': -0.1,
...     'sigma_negative_threshold': 0.02,
...     'reproject_pixel': False,
...     'subsampling_ratio': 0.5,
...     # RGB rendering
...     'base_camera_name': 'camera_front',
...     'base_size': (600, 960),
...     'depth_mode': 'FIXED_PLANE',
...     'use_lighting': False,
... }

Note that event camera simulation requires a third-party dependency and a pretrained checkpoint for video interpolation.
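Assuming the same world/car setup as in the Camera example above, spawning follows the same pattern (a sketch; config values are placeholders):

>>> event_camera = car.spawn_event_camera(config={'name': 'event_camera_front'})
>>> world.reset()
>>> events = car.observations[event_camera.name]    # synthesized event data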
reset() → None
Reset the event camera sensor by initiating an RGB data stream (events are simulated from RGB data) based on the current reference pointer to the dataset.
capture(timestamp: float, update_rgb_frame_only: Optional[bool] = False) → numpy.ndarray
Synthesize event data based on the current timestamp and the transformation between the novel viewpoint to be simulated and the nominal viewpoint from the pre-collected RGB dataset. At a high level, it performs video interpolation across consecutive RGB frames, extracts events with an event emission model, and projects the events into the event camera space. Note that the simulation runs a deep network (SuperSloMo) for video interpolation.

Parameters:
- timestamp (float) – Timestamp used to retrieve a pointer into the dataset for data-driven simulation (event data is synthesized from real RGB video).

Returns:
Synthesized event data.

Return type:
np.ndarray
property config
Configuration of this sensor.

property streams
Data stream of the RGB image/video dataset to be simulated from.

property camera_param
Camera parameters of the virtual event camera.

property base_camera_param
Camera parameters of the RGB camera.

property view_synthesis
View synthesizer object.

property id
The identifier of this entity.

property name
The name of the sensor.

property parent
The parent of this entity.

property prev_frame
Previous RGB frame.

update_scene_object(name: str, scene_object: Any, pose: Any) → None
Update an object in the scene for rendering. This is only used when virtual objects are placed in the scene.

Parameters:
- name (str) – Name of the scene object.
- scene_object (Any) – The scene object.
- pose (Any) – The pose of the scene object.

property prev_timestamp
Previous timestamp.