kaolin.render.camera¶

Kaolin provides an extensive camera API. For an overview, see the Camera class docs.

API¶

Classes¶

Camera Conversions¶

Aligning camera conventions across different codebases can take time and care. Kaolin ships with converters between kaolin.render.camera.Camera and the camera conventions of several popular codebases, including the nerfstudio gsplat library, the INRIA gaussian splats codebase, and polyscope.

Community contributions are welcome to expand this set.

kaolin.render.camera.kaolin_camera_to_gsplat_nerfstudio(kal_camera: Camera)¶

Convert Kaolin Camera to nerfstudio gsplat library camera parameters, as expected by gsplat.rendering.rasterization. Batched conversion is supported. Only the pinhole camera model is covered.

Note

This has been tested with the version gsplat==1.4.0.

Parameters

kal_camera (Camera) – camera to convert.

Returns

A dict with the following keys:

"Ks" (torch.Tensor): intrinsics matrix of shape \((C, 3, 3)\).
"viewmats" (torch.Tensor): view matrix of shape \((C, 4, 4)\).
"width" (int): image width from the source camera.
"height" (int): image height from the source camera.
"camera_model" (str): always "pinhole".

C is the number of cameras.

Return type

(dict)

Raises

RuntimeError – if kal_camera does not use a pinhole intrinsics model.

kaolin.render.camera.gsplat_nerfstudio_camera_to_kaolin(Ks, viewmats, width=None, height=None, camera_model='pinhole', near_plane: float = 0.01, far_plane: float = 100.0) → Camera¶

Convert nerfstudio gsplat library camera parameters, as expected by gsplat.rendering.rasterization, to Kaolin Camera. Batched conversion is supported.

Parameters

Ks (torch.Tensor) – (C, 3, 3) matrix
viewmats (torch.Tensor) – (C, 4, 4) matrix
width (optional, int) – if not set, will guess value from Ks
height (optional, int) – if not set, will guess value from Ks
camera_model (optional, str) – currently only pinhole is supported
near_plane (optional, float) – near clipping plane, defines the min depth of the view frustum.
far_plane (optional, float) – far clipping plane, defines the max depth of the view frustum.

Returns

converted Kaolin camera.

Return type

(Camera)
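
The width and height fallback described above can be pictured with a small sketch. It assumes the principal point sits at the image center, so that the width is roughly twice cx and the height roughly twice cy. The helper guess_image_size is hypothetical and written in plain Python; the heuristic Kaolin actually applies may differ.

```python
def guess_image_size(K):
    """Guess (width, height) from a 3x3 pinhole intrinsics matrix.

    Assumes the principal point (cx, cy) is the image center. This is an
    assumption of the sketch, not a documented Kaolin guarantee.
    """
    cx, cy = K[0][2], K[1][2]
    return int(round(2 * cx)), int(round(2 * cy))


# Intrinsics for a 640x480 image with a focal length of 500 pixels.
K = [[500.0, 0.0, 320.0],
     [0.0, 500.0, 240.0],
     [0.0, 0.0, 1.0]]
width, height = guess_image_size(K)
```

Passing width and height explicitly avoids relying on any such guess when the principal point is off-center.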

kaolin.render.camera.kaolin_camera_to_gsplat_inria(kal_camera: Camera, gs_cam_cls)¶

Converts Kaolin Camera to INRIA gaussian splats camera (gsplats.scene.cameras.Camera).

Note

This has been tested with commit 472689c.

Parameters

kal_camera (Camera) – camera to convert.
gs_cam_cls (class) – This is the gsplats Camera class, usually located in gsplats/scene/cameras.py.

Returns

converted INRIA gaussian splats camera.

Return type

(gsplats.scene.cameras.Camera)

kaolin.render.camera.gsplat_inria_camera_to_kaolin(gs_camera) → Camera¶

Convert INRIA gaussian splats camera (gsplats.scene.cameras.Camera) to Kaolin Camera.

Note

This has been tested with commit 472689c.

Parameters: gs_camera (gsplats.scene.cameras.Camera) – camera to convert.
Returns: converted Kaolin camera.
Return type: (Camera)

kaolin.render.camera.kaolin_camera_to_polyscope(camera: Camera)¶

Converts Kaolin Camera to a polyscope camera (polyscope.core.CameraParameters). Polyscope cameras are always assumed to exist on a cpu device. The converted information includes the camera extrinsics, and intrinsics for the field of view.

Parameters: camera (Camera) – camera to convert.
Returns: A polyscope camera object.
Return type: (ps.core.CameraParameters)

kaolin.render.camera.polyscope_camera_to_kaolin(ps_camera, width: int, height: int, near: float = 0.01, far: float = 100.0, dtype: dtype = torch.float32, device: Union[device, str] = 'cpu') → Camera¶

Converts a polyscope camera (polyscope.core.CameraParameters) to Kaolin Camera. The converted information includes the camera extrinsics, the image plane dimensions and the field of view. Parameters that Kaolin cameras require but polyscope does not provide, such as the near and far planes and the device, can be passed explicitly if needed.

Parameters

ps_camera (ps.core.CameraParameters) – A polyscope camera object.
width (int) – Image plane width in pixels.
height (int) – Image plane height in pixels.
near (optional, float) – near clipping plane, defines the min depth of the view frustum.
far (optional, float) – far clipping plane, defines the max depth of the view frustum.
dtype (optional, torch.dtype) – Datatype of the kaolin camera, converted from polyscope float32 precision.
device (optional, torch.device or str) – the device on which camera parameters will be allocated. Default: cpu

Returns

A kaolin camera object.

Return type

(Camera)

Functions¶

class kaolin.render.camera.CameraFOV(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶

Bases: IntEnum

Camera’s field-of-view can be defined by either of the directions

DIAGONAL = 2¶

HORIZONTAL = 0¶

VERTICAL = 1¶

kaolin.render.camera.allclose(input: Camera, other: Camera, rtol: float = 1e-05, atol: float = 1e-08, equal_nan: bool = False) → bool¶

This function checks whether the extrinsics and intrinsics of two cameras are close, using torch.allclose().

Parameters

input (Camera) – first camera to compare
other (Camera) – second camera to compare
atol (float, optional) – absolute tolerance. Default: 1e-08
rtol (float, optional) – relative tolerance. Default: 1e-05
equal_nan (bool, optional) – if True, then two NaNs will be considered equal. Default: False

Returns

Result of the comparison

Return type

(bool)
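
The tolerance test used by torch.allclose() is |input - other| <= atol + rtol * |other|, applied elementwise. The sketch below restates that rule in plain Python for flat lists of floats; values_close and params_allclose are hypothetical helpers, and the flat lists merely stand in for camera parameter tensors.

```python
import math


def values_close(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    """Scalar version of the closeness rule used by torch.allclose()."""
    if math.isnan(a) or math.isnan(b):
        # NaNs only match each other, and only when equal_nan is set.
        return equal_nan and math.isnan(a) and math.isnan(b)
    if math.isinf(a) or math.isinf(b):
        # Infinities are close only to an identical infinity.
        return a == b
    return abs(a - b) <= atol + rtol * abs(b)


def params_allclose(first, second, **tolerances):
    """True when every pair of parameters passes the closeness rule."""
    return len(first) == len(second) and all(
        values_close(a, b, **tolerances) for a, b in zip(first, second)
    )


same = params_allclose([1.0, 2.0], [1.0, 2.00001])
different = params_allclose([1.0, 2.0], [1.0, 2.1])
```

Note that the rule is not symmetric: the relative tolerance scales with the magnitude of other, not input.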

kaolin.render.camera.blender_coords()¶

Blender world coordinates are right-handed, with the z axis pointing upwards:

Z      Y
^    /
|  /
|---------> X

kaolin.render.camera.camera_path_generator(trajectory: List[Camera], frames_between_cameras: int = 60, interpolation: str = 'catmull_rom') → Iterator[Camera]¶

A finite generator function that returns camera objects along a continuous path interpolated from a trajectory of cameras.

This generator is exhausted after it returns the last point on the path. If interpolation is ‘polynomial’ - the trajectory is assumed to have a list of at least 2 cameras. If interpolation is ‘catmull_rom’ - the trajectory is assumed to have a list of at least 4 cameras.

Parameters

trajectory (List[kaolin.render.camera.Camera]) – A trajectory of camera nodes, used to form a continuous path.
frames_between_cameras (int) – Number of interpolated points generated between each pair of cameras on the trajectory. In essence, this value controls how detailed, or smooth the path is.
interpolation (str) – Type of interpolation function used: ‘polynomial’ uses a smoothstep polynomial function which tends to overshoot around the keyframes. This interpolator is fitting for paths orbiting an object of interest. ‘catmull_rom’ uses a spline defined by 4 control points, guaranteed to pass precisely through the keyframes.

Returns

An interpolated camera object formed by the cameras trajectory.

Return type

(Iterator[kaolin.render.camera.Camera])
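
The reason ‘catmull_rom’ needs at least 4 cameras and passes exactly through the keyframes can be seen from the uniform Catmull-Rom formula. The sketch below interpolates scalar values that stand in for camera parameters; catmull_rom and sample_path are hypothetical helpers, and how Kaolin treats the first and last cameras of a trajectory is not specified here.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom spline between p1 and p2 for t in [0, 1]."""
    return 0.5 * (
        2.0 * p1
        + (-p0 + p2) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t ** 2
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t ** 3
    )


def sample_path(keyframes, frames_between):
    """Yield values along the interior segments, then stop (finite generator)."""
    for i in range(1, len(keyframes) - 2):
        p0, p1, p2, p3 = keyframes[i - 1:i + 3]
        for frame in range(frames_between):
            yield catmull_rom(p0, p1, p2, p3, frame / frames_between)
    # Close the path on the last interior keyframe.
    yield keyframes[-2]


keyframes = [0.0, 1.0, 3.0, 2.0]
path = list(sample_path(keyframes, frames_between=4))
```

At t = 0 the spline evaluates to p1 and at t = 1 to p2, which is why the path touches every interior keyframe; the outer two control points only shape the tangents.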

kaolin.render.camera.down_from_homogeneous(homogeneous_vectors: Tensor)¶

Performs perspective division by dividing each vector by its w coordinate.
Down-projects vectors from 4D homogeneous space to 3D space.

Parameters: homogeneous_vectors (torch.Tensor) – the input vectors, of shape \((..., 4)\)
Returns: the 3D vectors, of the same shape as the inputs except that the last dimension is 3
Return type: (torch.Tensor)
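
As a minimal illustration of perspective division, the sketch below divides the first three components of each vector by w. It uses plain Python lists rather than tensors, and down_from_homogeneous_sketch is a hypothetical stand-in, not the Kaolin function.

```python
def down_from_homogeneous_sketch(vectors):
    """Map each (x, y, z, w) to (x / w, y / w, z / w)."""
    return [[x / w, y / w, z / w] for x, y, z, w in vectors]


points = down_from_homogeneous_sketch([
    [2.0, 4.0, 6.0, 2.0],
    [1.0, 0.0, -1.0, 1.0],
])
```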

kaolin.render.camera.generate_centered_custom_resolution_pixel_coords(img_width, img_height, res_x=None, res_y=None, device=None)¶

Creates a pixel grid with a custom resolution, with the rays spaced out according to the scale. The scale is determined by the ratio of \(\text{img_width / res_x, img_height / res_y}\). The ray grid is of resolution \(\text{res_x} \times \text{res_y}\).

Parameters

img_width (int) – width of camera image plane.
img_height (int) – height of camera image plane.
res_x (int) – x resolution of pixel grid to be created
res_y (int) – y resolution of pixel grid to be created
device (torch.device, optional) – Device on which the grid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{s, s, ..., s})\) up to \((\text{height-s, height-s... height-s})\).

Tensor 1 contains repeated rows of indices: \((\text{s, s+1, ..., width-s})\).

\(\text{s}\) is \(\text{scale/2}\) where \(\text{scale}\) is \((\text{img_width / res_x, img_height / res_y})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)
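
One way to read the description above is that the samples start at s and are spaced one scale apart along each axis. The sketch below follows that reading with nested Python lists; the spacing is an assumption, and centered_custom_grid is a hypothetical helper rather than the Kaolin implementation.

```python
def centered_custom_grid(img_width, img_height, res_x, res_y):
    """Return (grid_y, grid_x) with res_y rows and res_x columns each."""
    scale_x = img_width / res_x
    scale_y = img_height / res_y
    # Sample centers: s, s + scale, ..., size - s, with s = scale / 2.
    xs = [(j + 0.5) * scale_x for j in range(res_x)]
    ys = [(i + 0.5) * scale_y for i in range(res_y)]
    grid_y = [[y] * res_x for y in ys]   # each row repeats one y value
    grid_x = [list(xs) for _ in ys]      # every row holds all x values
    return grid_y, grid_x


grid_y, grid_x = centered_custom_grid(4, 4, res_x=2, res_y=2)
```

With a 4 x 4 image plane sampled at 2 x 2, the scale is 2 and s is 1, so the samples land at 1 and 3 on each axis.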

kaolin.render.camera.generate_centered_pixel_coords(img_width, img_height, device=None)¶

Creates a pixel grid with rays intersecting the center of each pixel. The ray grid is of resolution img_width x img_height.

Parameters

img_width (int) – width of image.
img_height (int) – height of image.
device (torch.device, optional) – Device on which the grid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{0.5, 0.5, ..., 0.5})\) up to \((\text{height-0.5, height-0.5... height-0.5})\).

Tensor 1 contains repeated rows of indices: \((\text{0.5, 1.5, ..., width-0.5})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)
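
The layout of the two returned grids can be reproduced with nested Python lists. This is only an illustration of the values described above; centered_pixel_coords is a hypothetical helper, and the real function returns tensors.

```python
def centered_pixel_coords(img_width, img_height):
    """Return (grid_y, grid_x), each with img_height rows and img_width columns."""
    # Grid 0: row i repeats the value i + 0.5.
    grid_y = [[i + 0.5] * img_width for i in range(img_height)]
    # Grid 1: every row is 0.5, 1.5, ..., img_width - 0.5.
    grid_x = [[j + 0.5 for j in range(img_width)] for _ in range(img_height)]
    return grid_y, grid_x


grid_y, grid_x = centered_pixel_coords(img_width=3, img_height=2)
```

Dropping the 0.5 offset gives the integer layout described for generate_default_grid().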

kaolin.render.camera.generate_default_grid(width, height, device=None)¶

Creates a pixel grid of integer coordinates with resolution width x height.

Parameters

width (int) – width of image.
height (int) – height of image.
device (torch.device, optional) – Device on which the meshgrid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{0, 0, ..., 0})\) up to \((\text{height-1, height-1... height-1})\).

Tensor 1 contains repeated rows of indices: \((\text{0, 1, ..., width-1})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_ortho_rays(camera: Camera, coords_grid: Optional[Tensor] = None)¶

Default ray generation function for ortho cameras.

Parameters

camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).
coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). Coordinates integer parts represent the pixel \((\text{i, j})\) coords, and the fraction part of \([\text{0,1}]\) represents the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated ortho rays for the camera, as ray origins and ray direction tensors of \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_perspective_projection(fovyangle, ratio=1.0, dtype=torch.float32)¶

Generate perspective projection matrix for a given camera fovy angle.

Parameters

fovyangle (float) – field of view angle of y axis, \(\tan(\frac{fovy}{2}) = \frac{y}{f}\).
ratio (float) – aspect ratio \((\frac{width}{height})\). Default: 1.0.

Returns

camera projection matrix, of shape \((3, 1)\).

Return type

(torch.FloatTensor)
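
The relation \(\tan(\frac{fovy}{2}) = \frac{y}{f}\) can be worked through numerically. The sketch below assumes that y is half the image height in pixels and that the angle is given in radians; both are assumptions, and focal_from_fovy is a hypothetical helper.

```python
import math


def focal_from_fovy(fovy, height):
    """Focal length in pixels, from tan(fovy / 2) = (height / 2) / f."""
    return (height / 2.0) / math.tan(fovy / 2.0)


# A 90 degree vertical field of view over 480 rows: tan(45 deg) is 1,
# so the focal length equals half the image height.
focal = focal_from_fovy(math.radians(90.0), height=480)
```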

kaolin.render.camera.generate_pinhole_rays(camera: Camera, coords_grid: Optional[Tensor] = None)¶

Default ray generation function for pinhole cameras.

This function assumes that the principal point (the pinhole location) is specified by a displacement (camera.x0, camera.y0) in pixel coordinates from the center of the image.

The Kaolin camera class does not enforce a coordinate space for how the principal point is specified, so users will need to make sure that the correct principal point conventions are followed for the cameras passed into this function.

Parameters

camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).
coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). Coordinates integer parts represent the pixel \((\text{i, j})\) coords, and the fraction part of \([\text{0,1}]\) represents the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated pinhole rays for the camera, as ray origins and ray direction tensors of \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)
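
The principal point convention described above can be sketched for a single pixel. The code assumes a camera that looks along -z with y pointing up (an OpenGL-like convention), and a principal point given as a pixel displacement (x0, y0) from the image center. The axis directions and the helper pinhole_ray_direction are assumptions of this sketch, not the Kaolin implementation.

```python
import math


def pinhole_ray_direction(px, py, width, height, focal_x, focal_y, x0=0.0, y0=0.0):
    """Unit ray direction in camera space through pixel coordinate (px, py)."""
    # Principal point: image center plus the displacement (x0, y0).
    cx = width / 2.0 + x0
    cy = height / 2.0 + y0
    dx = (px - cx) / focal_x
    dy = -(py - cy) / focal_y  # image rows grow downward, camera y grows upward (assumed)
    dz = -1.0                  # camera looks along -z (assumed)
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    return (dx / norm, dy / norm, dz / norm)


# The ray through the image center points straight along the viewing axis.
center_ray = pinhole_ray_direction(2.0, 2.0, 4, 4, focal_x=100.0, focal_y=100.0)
# A pixel to the right of the center tilts the ray toward +x.
right_ray = pinhole_ray_direction(3.5, 2.0, 4, 4, focal_x=100.0, focal_y=100.0)
```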

kaolin.render.camera.generate_rays(camera, coords_grid: Optional[Tensor] = None)¶

Default ray generation function for unbatched kaolin cameras. The camera lens type determines the exact raygen logic that runs (e.g. pinhole, ortho).

Parameters

camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).
coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). Coordinates integer parts represent the pixel \((\text{i, j})\) coords, and the fraction part of \([\text{0,1}]\) represents the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated camera rays according to the camera lens type, as ray origins and ray direction tensors of \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_rotate_translate_matrices(camera_position, look_at, camera_up_direction)¶

Generate rotation and translation matrix for given camera parameters.

Formula is \(\text{P_cam} = \text{rot_mtx} * (\text{P_world} - \text{trans_mtx})\)

Parameters

camera_position (torch.FloatTensor) – camera positions of shape \((\text{batch_size}, 3)\), i.e. where the cameras are located.
look_at (torch.FloatTensor) – the points the cameras are looking at, of shape \((\text{batch_size}, 3)\).
camera_up_direction (torch.FloatTensor) – camera up directions of shape \((\text{batch_size}, 3)\), generally [0, 1, 0].

Returns

the camera rotation matrix of shape \((\text{batch_size}, 3, 3)\) and the camera transformation matrix of shape \((\text{batch_size}, 3)\)

Return type

(torch.FloatTensor, torch.FloatTensor)
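
The formula \(\text{P_cam} = \text{rot_mtx} * (\text{P_world} - \text{trans_mtx})\) can be illustrated with a standard look-at construction for a single camera. The sketch assumes the camera z axis points from the target back toward the camera, which is an assumption about the convention; look_at_rotation and world_to_camera are hypothetical helpers written in plain Python.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def look_at_rotation(camera_position, look_at, up):
    """Rows of the result are the camera x, y and z axes in world space."""
    cam_z = normalize([p - t for p, t in zip(camera_position, look_at)])
    cam_x = normalize(cross(up, cam_z))
    cam_y = cross(cam_z, cam_x)
    return [cam_x, cam_y, cam_z]


def world_to_camera(point, rot, trans):
    """P_cam = rot * (P_world - trans)."""
    shifted = [p - t for p, t in zip(point, trans)]
    return [sum(r * s for r, s in zip(row, shifted)) for row in rot]


position = [0.0, 0.0, 5.0]
rot = look_at_rotation(position, look_at=[0.0, 0.0, 0.0], up=[0.0, 1.0, 0.0])
origin_in_cam = world_to_camera([0.0, 0.0, 0.0], rot, position)
```

Under this convention the world origin ends up five units in front of the camera, at z = -5 in camera space.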

kaolin.render.camera.generate_transformation_matrix(camera_position, look_at, camera_up_direction)¶

Generate transformation matrix for given camera parameters.

Formula is \(\text{P_cam} = \text{P_world} * \text{transformation_mtx}\), with \(\text{P_world}\) being the points coordinates padded with 1.

Parameters

camera_position (torch.FloatTensor) – camera positions of shape \((\text{batch_size}, 3)\), i.e. where the cameras are located.
look_at (torch.FloatTensor) – the points the cameras are looking at, of shape \((\text{batch_size}, 3)\).
camera_up_direction (torch.FloatTensor) – camera up directions of shape \((\text{batch_size}, 3)\), generally [0, 1, 0].

Returns

The camera transformation matrix of shape \((\text{batch_size}, 4, 3)\).

Return type

(torch.FloatTensor)
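
A \((4, 3)\) matrix applied to points padded with 1 can express the same mapping as a rotation R and translation T with \(\text{P_cam} = R * (\text{P_world} - T)\): the top three rows hold R transposed and the last row holds \(-R * T\). The sketch below checks this for one point in plain Python; transformation_matrix and apply are hypothetical helpers, and whether Kaolin builds its matrix exactly this way is an assumption.

```python
def transformation_matrix(rot, trans):
    """Build a 4x3 matrix M such that [p, 1] * M equals rot * (p - trans)."""
    # Top 3x3 block: rot transposed.
    top = [[rot[c][r] for c in range(3)] for r in range(3)]
    # Last row: -(rot * trans).
    bottom = [-sum(rot[c][k] * trans[k] for k in range(3)) for c in range(3)]
    return top + [bottom]


def apply(point, mtx):
    """Multiply the row vector [x, y, z, 1] by a 4x3 matrix."""
    padded = list(point) + [1.0]
    return [sum(padded[r] * mtx[r][c] for r in range(4)) for c in range(3)]


rot = [[0.0, 1.0, 0.0],
       [-1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0]]
trans = [1.0, 2.0, 3.0]
mtx = transformation_matrix(rot, trans)
p_cam = apply([2.0, 2.0, 2.0], mtx)
```

Here P_world - T is (1, 0, -1), and rotating it gives (0, -1, -1), the same value the padded multiplication produces.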

kaolin.render.camera.gsplats_camera_to_kaolin(gs_camera)¶

Deprecated function name for INRIA camera conversion.

Use gsplat_inria_camera_to_kaolin() instead.

kaolin.render.camera.kaolin_camera_to_gsplats(kal_camera, gs_cam_cls)¶

Deprecated function name for INRIA camera conversion.

For the INRIA Gaussian Splats codebase, use: kaolin_camera_to_gsplat_inria()
For the NerfStudio gsplat package, use: kaolin_camera_to_gsplat_nerfstudio()

kaolin.render.camera.loop_camera_path_generator(trajectory: List[Camera], frames_between_cameras: int = 60, interpolation: str = 'polynomial', repeat: Optional[int] = None) → Iterator[Camera]¶

A generator function that returns camera objects along a smoothed, continuous path interpolated from a trajectory of cameras.

The trajectory is assumed to have a list of at least 2 cameras, where the first and last cameras form a looped path. Certain interpolation modes (e.g. catmull_rom) may require additional cameras. If repeat is None, this generator is never exhausted, and can be invoked indefinitely to generate continuous camera motion. Otherwise the loop will repeat a finite number of times.

Parameters

trajectory (List[kaolin.render.camera.Camera]) – A trajectory of camera nodes, used to form a continuous path.
frames_between_cameras (int) – Number of interpolated points generated between each pair of cameras on the trajectory. In essence, this value controls how detailed, or smooth the path is.
interpolation (str) – Type of interpolation function used: ‘polynomial’ uses a smoothstep polynomial function which tends to overshoot around the keyframes. This interpolator is fitting for paths orbiting an object of interest. ‘catmull_rom’ uses a spline defined by 4 control points, guaranteed to pass precisely through the keyframes.
repeat (int, Optional) – If specified, will limit the number of loops. Passing None results in an infinite loop.

Returns

An interpolated camera object formed by the cameras trajectory.

Return type

(Iterator[kaolin.render.camera.Camera])
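
The looping behaviour and the effect of repeat can be sketched with a plain Python generator. Scalar keyframes stand in for cameras, and blending neighbouring keyframes with a smoothstep weight is an assumption about how the ‘polynomial’ mode behaves; smoothstep and loop_path are hypothetical helpers.

```python
import itertools


def smoothstep(t):
    """Smoothstep polynomial 3t^2 - 2t^3 on [0, 1]."""
    return t * t * (3.0 - 2.0 * t)


def loop_path(keyframes, frames_between, repeat=None):
    """Yield values along a closed path; endless when repeat is None."""
    loops = itertools.count() if repeat is None else range(repeat)
    count = len(keyframes)
    for _ in loops:
        for i in range(count):
            # The last keyframe connects back to the first one.
            start, end = keyframes[i], keyframes[(i + 1) % count]
            for frame in range(frames_between):
                weight = smoothstep(frame / frames_between)
                yield (1.0 - weight) * start + weight * end


one_loop = list(loop_path([0.0, 10.0], frames_between=2, repeat=1))
# With repeat=None the generator never ends, so take a finite slice.
endless_head = list(itertools.islice(loop_path([0.0, 10.0], frames_between=2), 6))
```

An endless generator like this should be consumed with a bounded construct such as itertools.islice or an explicit break.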

kaolin.render.camera.opengl_coords()¶

Contemporary OpenGL doesn’t enforce a specific handedness on world coordinates. However, it is a common convention to define OpenGL world coordinates as right-handed, with the y axis pointing upwards (cartesian):

   Y
   ^
   |
   |---------> X
  /
Z
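
To relate this convention to the Blender one (z up, y pointing away from the viewer), a point can be re-expressed by a rotation of -90 degrees about the x axis. The sketch below is a hand-written illustration of that change of basis; blender_to_opengl is a hypothetical helper and says nothing about what blender_coords() or opengl_coords() return.

```python
def blender_to_opengl(point):
    """Re-express a Blender (z up) point in OpenGL (y up) coordinates."""
    x, y, z = point
    return (x, z, -y)


up = blender_to_opengl((0.0, 0.0, 1.0))       # Blender up becomes OpenGL +y
forward = blender_to_opengl((0.0, 1.0, 0.0))  # Blender +y becomes OpenGL -z
```

Both systems are right-handed, so the mapping is a pure rotation and needs no axis flip.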

kaolin.render.camera.perspective_camera(points, camera_proj)¶

Projects 3D points on 2D images in perspective projection mode.

Parameters

points (torch.FloatTensor) – 3D points in camera coordinate, of shape \((\text{batch_size}, \text{num_points}, 3)\).
camera_proj (torch.FloatTensor) – projection matrix of shape \((3, 1)\).

Returns

2D points on image plane of shape \((\text{batch_size}, \text{num_points}, 2)\).

Return type

(torch.FloatTensor)
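
One reading consistent with a projection "matrix" of shape \((3, 1)\) is a per-axis scale vector followed by division by the scaled depth. The sketch below follows that reading for a single point, with a camera looking along -z; the exact scale values and sign convention are assumptions, and perspective_scales and project are hypothetical helpers.

```python
import math


def perspective_scales(fovy, ratio=1.0):
    """Per-axis scales derived from the vertical field of view (assumed form)."""
    tan_half = math.tan(fovy / 2.0)
    return [1.0 / (ratio * tan_half), 1.0 / tan_half, -1.0]


def project(point, scales):
    """Scale each axis, then divide x and y by the scaled depth."""
    sx, sy, sz = (p * s for p, s in zip(point, scales))
    return (sx / sz, sy / sz)


# A point two units in front of a camera with a 90 degree vertical fov.
uv = project((1.0, 0.5, -2.0), perspective_scales(math.radians(90.0)))
```

With a 90 degree field of view the scales are close to (1, 1, -1), so the point lands near (0.5, 0.25) on the image plane.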

kaolin.render.camera.register_backend(name: str)¶

Registers a representation backend class with a unique name.

CameraExtrinsics can switch between registered representations dynamically (see switch_backend()).

kaolin.render.camera.rotate_translate_points(points, camera_rot, camera_trans)¶

Rotate and translate 3D points based on a rotation matrix and a translation matrix.

Formula is \(\text{P_new} = R * (\text{P_old} - T)\)

Parameters

points (torch.FloatTensor) – 3D points, of shape \((\text{batch_size}, \text{num_points}, 3)\).
camera_rot (torch.FloatTensor) – rotation matrix, of shape \((\text{batch_size}, 3, 3)\).
camera_trans (torch.FloatTensor) – translation matrix, of shape \((\text{batch_size}, 3, 1)\).

Returns

3D points in the new rotation, of the same shape as points.

Return type

(torch.FloatTensor)

kaolin.render.camera.up_to_homogeneous(vectors: Tensor)¶

Up-projects vectors to homogeneous coordinates of four dimensions. If the vectors are already in homogeneous coordinates, this function returns the inputs.

Parameters: vectors (torch.Tensor) – the input vectors to project, of shape \((..., 3)\)
Returns: The projected vectors, of the same shape as the inputs except that the last dimension is 4
Return type: (torch.Tensor)
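
A minimal sketch of the up-projection appends w = 1 to each 3D vector and leaves 4D vectors untouched. Treating a last dimension of size 4 as "already homogeneous" is an assumption of this sketch, and up_to_homogeneous_sketch is a hypothetical stand-in operating on plain lists.

```python
def up_to_homogeneous_sketch(vectors):
    """Append w = 1 to 3D vectors; pass 4D vectors through unchanged."""
    return [list(v) if len(v) == 4 else list(v) + [1.0] for v in vectors]


padded = up_to_homogeneous_sketch([[1.0, 2.0, 3.0]])
unchanged = up_to_homogeneous_sketch([[1.0, 2.0, 3.0, 0.5]])
```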