kaolin.render.camera

Kaolin provides an extensive camera API. For an overview, see the Camera class docs.

API

Classes

Functions

class kaolin.render.camera.CameraFOV(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: IntEnum

A camera’s field of view can be defined along any one of the following directions:

DIAGONAL = 2
HORIZONTAL = 0
VERTICAL = 1
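
The three directions describe the same frustum, so one field-of-view angle can be converted into another once the aspect ratio is known. The following is a minimal plain-Python sketch of that relationship for an ideal pinhole camera; the helper names are hypothetical and are not part of Kaolin.

```python
import math

def vertical_to_horizontal_fov(fov_y, aspect_ratio):
    """Convert a vertical FOV (radians) to a horizontal FOV.

    aspect_ratio is width / height. Uses tan(h / 2) = ratio * tan(v / 2).
    """
    return 2.0 * math.atan(aspect_ratio * math.tan(fov_y / 2.0))

def vertical_to_diagonal_fov(fov_y, aspect_ratio):
    """Convert a vertical FOV (radians) to a diagonal FOV.

    The half-diagonal of the image plane is sqrt(1 + ratio^2) times the half-height.
    """
    return 2.0 * math.atan(math.hypot(1.0, aspect_ratio) * math.tan(fov_y / 2.0))

fov_y = math.radians(60.0)
fov_x = vertical_to_horizontal_fov(fov_y, 16.0 / 9.0)
fov_d = vertical_to_diagonal_fov(fov_y, 16.0 / 9.0)
print(math.degrees(fov_x), math.degrees(fov_d))
```

For a wide image (aspect ratio above 1) the vertical angle is the smallest of the three and the diagonal angle is the largest.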
kaolin.render.camera.allclose(input, other, rtol=1e-05, atol=1e-08, equal_nan=False)

Checks whether the extrinsics and intrinsics of two cameras are close, using torch.allclose().

Parameters
  • input (Camera) – first camera to compare

  • other (Camera) – second camera to compare

  • atol (float, optional) – absolute tolerance. Default: 1e-08

  • rtol (float, optional) – relative tolerance. Default: 1e-05

  • equal_nan (bool, optional) – if True, then two NaNs will be considered equal. Default: False

Returns

Result of the comparison

Return type

(bool)
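
The tolerance rule behind torch.allclose() is |input - other| <= atol + rtol * |other|, applied to every element. The sketch below reproduces that rule in plain Python on two short lists standing in for camera parameters; the function names and sample values are illustrative only, and NaN handling (equal_nan) is left out.

```python
def is_close(a, b, rtol=1e-05, atol=1e-08):
    """Elementwise tolerance test: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)

def all_close(xs, ys, rtol=1e-05, atol=1e-08):
    """True only if both sequences have equal length and every pair passes."""
    return len(xs) == len(ys) and all(
        is_close(x, y, rtol, atol) for x, y in zip(xs, ys)
    )

# Hypothetical focal lengths of two cameras, differing by 0.01 in one entry.
focal_a = [512.0, 512.0]
focal_b = [512.01, 512.0]

print(all_close(focal_a, focal_b))             # False with the default tolerances
print(all_close(focal_a, focal_b, atol=1e-2))  # True with a looser absolute tolerance
```

Because the relative term scales with |other|, parameters with large magnitudes tolerate larger absolute differences than parameters near zero.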

kaolin.render.camera.blender_coords()

Blender world coordinates are right handed, with the z axis pointing upwards

Z      Y
^    /
|  /
|---------> X
kaolin.render.camera.down_from_homogeneous(homogeneous_vectors)
  1. Performs perspective division by dividing each vector by its w coordinate.

  2. Down-projects vectors from 4D homogeneous space to 3D space.

Parameters
  • homogeneous_vectors (torch.Tensor) – the input vectors, of shape \((..., 4)\)

Returns

The 3D vectors, of the same shape as the inputs except that the last dimension is 3.

Return type

(torch.Tensor)
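
The two steps listed above amount to dividing by w and dropping the last coordinate. A minimal sketch for a single vector, written in plain Python rather than with tensors (the function name is hypothetical):

```python
def down_from_homogeneous_sketch(vec4):
    """Perspective-divide a 4D homogeneous vector and return its 3D part."""
    x, y, z, w = vec4
    # Step 1: perspective division by w. Step 2: keep only x, y, z.
    return [x / w, y / w, z / w]

print(down_from_homogeneous_sketch([2.0, 4.0, 6.0, 2.0]))  # [1.0, 2.0, 3.0]
```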

kaolin.render.camera.generate_centered_custom_resolution_pixel_coords(img_width, img_height, res_x=None, res_y=None, device=None)

Creates a pixel grid with a custom resolution, with the rays spaced out according to the scale. The scale is determined by the ratio of \(\text{img_width / res_x, img_height / res_y}\). The ray grid is of resolution \(\text{res_x} \times \text{res_y}\).

Parameters
  • img_width (int) – width of camera image plane.

  • img_height (int) – height of camera image plane.

  • res_x (int) – x resolution of pixel grid to be created

  • res_y (int) – y resolution of pixel grid to be created

  • device (torch.device, optional) – Device on which the grid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{s, s, ..., s})\) up to \((\text{height-s, height-s... height-s})\).

Tensor 1 contains repeated rows of indices: \((\text{s, s+1, ..., width-s})\).

\(\text{s}\) is \(\text{scale/2}\) where \(\text{scale}\) is \((\text{img_width / res_x, img_height / res_y})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)
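
To make the scale arithmetic concrete, here is a plain-Python sketch of such a grid. It assumes that samples start at scale/2 and are spaced one scale apart along each axis, and that the grids have res_y rows and res_x columns; both points are assumptions made for illustration, and the function name is hypothetical.

```python
def centered_custom_grid(img_width, img_height, res_x, res_y):
    """Build (grid_y, grid_x) as nested lists with res_y rows and res_x columns."""
    scale_x = img_width / res_x
    scale_y = img_height / res_y
    # Sample positions: first sample at scale / 2, then one scale apart.
    xs = [scale_x / 2 + i * scale_x for i in range(res_x)]
    ys = [scale_y / 2 + j * scale_y for j in range(res_y)]
    grid_y = [[y] * res_x for y in ys]   # each row repeats one y value
    grid_x = [list(xs) for _ in ys]      # every row holds the same x values
    return grid_y, grid_x

# An 8x4 image plane sampled with a 4x2 grid: scale is (2, 2), so s is (1, 1).
grid_y, grid_x = centered_custom_grid(8, 4, res_x=4, res_y=2)
print(grid_x[0])                   # [1.0, 3.0, 5.0, 7.0]
print([row[0] for row in grid_y])  # [1.0, 3.0]
```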

kaolin.render.camera.generate_centered_pixel_coords(img_width, img_height, device=None)

Creates a pixel grid with rays intersecting the center of each pixel. The ray grid is of resolution img_width x img_height.

Parameters
  • img_width (int) – width of image.

  • img_height (int) – height of image.

  • device (torch.device, optional) – Device on which the grid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{0.5, 0.5, ..., 0.5})\) up to \((\text{height-0.5, height-0.5... height-0.5})\).

Tensor 1 contains repeated rows of indices: \((\text{0.5, 1.5, ..., width-0.5})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)
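
The layout of the two returned grids can be illustrated without tensors. The sketch below builds the same values as nested Python lists for a 3x2 image; the function name is hypothetical.

```python
def centered_pixel_grid(width, height):
    """Return (grid_y, grid_x), each with `height` rows and `width` columns."""
    # Tensor 0 analogue: every entry of row r is r + 0.5.
    grid_y = [[row + 0.5] * width for row in range(height)]
    # Tensor 1 analogue: every row is 0.5, 1.5, ..., width - 0.5.
    grid_x = [[col + 0.5 for col in range(width)] for _ in range(height)]
    return grid_y, grid_x

grid_y, grid_x = centered_pixel_grid(width=3, height=2)
print(grid_y)  # [[0.5, 0.5, 0.5], [1.5, 1.5, 1.5]]
print(grid_x)  # [[0.5, 1.5, 2.5], [0.5, 1.5, 2.5]]
```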

kaolin.render.camera.generate_default_grid(width, height, device=None)

Creates a pixel grid of integer coordinates with resolution width x height.

Parameters
  • width (int) – width of image.

  • height (int) – height of image.

  • device (torch.device, optional) – Device on which the meshgrid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{0, 0, ..., 0})\) up to \((\text{height-1, height-1... height-1})\).

Tensor 1 contains repeated rows of indices: \((\text{0, 1, ..., width-1})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_ortho_rays(camera, coords_grid=None)

Default ray generation function for ortho cameras.

Parameters
  • camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).

  • coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate represents the pixel \((\text{i, j})\) coordinates, and the fractional part in \([\text{0,1}]\) represents the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated ortho rays for the camera, as ray origin and ray direction tensors of shape \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_perspective_projection(fovyangle, ratio=1.0, dtype=torch.float32)

Generate perspective projection matrix for a given camera fovy angle.

Parameters
  • fovyangle (float) – field of view angle of y axis, \(\tan(\frac{fovy}{2}) = \frac{y}{f}\).

  • ratio (float) – aspect ratio \((\frac{width}{height})\). Default: 1.0.

Returns

camera projection matrix, of shape \((3, 1)\).

Return type

(torch.FloatTensor)
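
As an illustration of how a \((3, 1)\) projection can follow from the field-of-view relation above, the sketch below assumes the three entries are per-axis scale factors: 1 / (ratio * tan(fovy / 2)) for x, 1 / tan(fovy / 2) for y, and -1 for z. That layout is an assumption made for this example, and the function is a plain-Python stand-in rather than the Kaolin implementation.

```python
import math

def perspective_projection_sketch(fovyangle, ratio=1.0):
    """Return a 3x1 nested list of assumed per-axis scale factors."""
    tanfov = math.tan(fovyangle / 2.0)
    return [
        [1.0 / (ratio * tanfov)],  # x scale, narrowed by the aspect ratio
        [1.0 / tanfov],            # y scale, from tan(fovy / 2) = y / f
        [-1.0],                    # z scale, assuming the camera looks down -z
    ]

# A 90-degree vertical field of view gives tan(fovy / 2) close to 1.
proj = perspective_projection_sketch(math.pi / 2, ratio=2.0)
print(proj)
```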

kaolin.render.camera.generate_pinhole_rays(camera, coords_grid=None)

Default ray generation function for pinhole cameras.

This function assumes that the principal point (the pinhole location) is specified by a displacement (camera.x0, camera.y0) in pixel coordinates from the center of the image.

The Kaolin camera class does not enforce a coordinate space for how the principal point is specified, so users will need to make sure that the correct principal point conventions are followed for the cameras passed into this function.

Parameters
  • camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).

  • coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate represents the pixel \((\text{i, j})\) coordinates, and the fractional part in \([\text{0,1}]\) represents the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated pinhole rays for the camera, as ray origins and ray direction tensors of \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)
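
Intrinsics obtained from other tools often store the principal point as absolute pixel coordinates measured from the image corner. The sketch below shows one way to convert such values into a displacement from the image center, as this function expects. The helper name is hypothetical, and the sign convention (displacement equals absolute coordinate minus half the image size, on both axes) is an assumption that should be checked against the source of the intrinsics.

```python
def displacement_from_center(cx, cy, width, height):
    """Convert an absolute principal point (cx, cy) to an offset from the image center."""
    return cx - width / 2, cy - height / 2

# A principal point exactly at the center of a 640x480 image has zero displacement.
print(displacement_from_center(320.0, 240.0, 640, 480))  # (0.0, 0.0)
# A principal point 10 px right of and 5 px above the center.
print(displacement_from_center(330.0, 235.0, 640, 480))  # (10.0, -5.0)
```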

kaolin.render.camera.generate_rays(camera, coords_grid=None)

Default ray generation function for unbatched Kaolin cameras. The camera lens type determines the exact ray generation logic that runs (e.g. pinhole or ortho).

Parameters
  • camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).

  • coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate represents the pixel \((\text{i, j})\) coordinates, and the fractional part in \([\text{0,1}]\) represents the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated camera rays according to the camera lens type, as ray origins and ray direction tensors of \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_rotate_translate_matrices(camera_position, look_at, camera_up_direction)

Generate rotation and translation matrices for the given camera parameters.

Formula is \(\text{P_cam} = \text{rot_mtx} * (\text{P_world} - \text{trans_mtx})\)

Parameters
  • camera_position (torch.FloatTensor) – positions of the cameras, of shape \((\text{batch_size}, 3)\).

  • look_at (torch.FloatTensor) – points the cameras are looking at, of shape \((\text{batch_size}, 3)\).

  • camera_up_direction (torch.FloatTensor) – up directions of the cameras, of shape \((\text{batch_size}, 3)\), generally [0, 1, 0].

Returns

the camera rotation matrix of shape \((\text{batch_size}, 3, 3)\) and the camera transformation matrix of shape \((\text{batch_size}, 3)\)

Return type

(torch.FloatTensor, torch.FloatTensor)
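
A look-at construction of this kind can be sketched for a single camera in plain Python. The version below assumes a right-handed camera frame that looks down its negative z axis, with the rotation rows being the camera's right, up, and backward axes; those conventions and all helper names are assumptions for illustration. The last function applies the formula \(\text{P_cam} = \text{rot_mtx} * (\text{P_world} - \text{trans_mtx})\).

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def look_at_rotation(camera_position, look_at, up):
    """Rows are the camera axes expressed in world coordinates."""
    forward = normalize([l - c for l, c in zip(look_at, camera_position)])
    right = normalize(cross(forward, up))
    true_up = cross(right, forward)
    # Assumed convention: the camera looks down -z, so the third axis is -forward.
    return [right, true_up, [-f for f in forward]]

def world_to_camera(point, rot, trans):
    """P_cam = rot * (P_world - trans)."""
    shifted = [p - t for p, t in zip(point, trans)]
    return [sum(r * s for r, s in zip(row, shifted)) for row in rot]

cam_pos = [0.0, 0.0, 5.0]
rot = look_at_rotation(cam_pos, look_at=[0.0, 0.0, 0.0], up=[0.0, 1.0, 0.0])
p_cam = world_to_camera([0.0, 0.0, 0.0], rot, cam_pos)
print(p_cam)  # the look-at target lies 5 units in front of the camera, along -z
```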

kaolin.render.camera.generate_transformation_matrix(camera_position, look_at, camera_up_direction)

Generate a transformation matrix for the given camera parameters.

Formula is \(\text{P_cam} = \text{P_world} * \text{transformation_mtx}\), with \(\text{P_world}\) being the points coordinates padded with 1.

Parameters
  • camera_position (torch.FloatTensor) – positions of the cameras, of shape \((\text{batch_size}, 3)\).

  • look_at (torch.FloatTensor) – points the cameras are looking at, of shape \((\text{batch_size}, 3)\).

  • camera_up_direction (torch.FloatTensor) – up directions of the cameras, of shape \((\text{batch_size}, 3)\), generally [0, 1, 0].

Returns

The camera transformation matrix of shape \((\text{batch_size}, 4, 3)\).

Return type

(torch.FloatTensor)
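
To see why a \((4, 3)\) matrix suffices, note that a rotation R and a camera position t with P_cam = R (P_world - t) can be folded into one matrix acting on row vectors padded with 1: the top three rows are the transpose of R and the last row is -(R t). The sketch below checks this for a single camera in plain Python. It is a derivation aid under that stated relation, with hypothetical helper names, not a description of how Kaolin builds the matrix.

```python
def transformation_matrix_4x3(rot, trans):
    """Fold P_cam = rot * (P_world - trans) into a single 4x3 matrix."""
    rot_t = [[rot[r][c] for r in range(3)] for c in range(3)]  # transpose of rot
    last_row = [-sum(rot[r][k] * trans[k] for k in range(3)) for r in range(3)]
    return rot_t + [last_row]

def apply_transformation(point, mtx):
    """P_cam = [P_world, 1] * mtx, with the point treated as a row vector."""
    padded = list(point) + [1.0]
    return [sum(padded[k] * mtx[k][c] for k in range(4)) for c in range(3)]

# Example: rotation about the z axis, camera placed at (1, 2, 3).
rot = [[0.0, -1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0]]
trans = [1.0, 2.0, 3.0]
mtx = transformation_matrix_4x3(rot, trans)

# rot * ((2, 2, 3) - (1, 2, 3)) = rot * (1, 0, 0) = (0, 1, 0)
print(apply_transformation([2.0, 2.0, 3.0], mtx))  # [0.0, 1.0, 0.0]
```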

kaolin.render.camera.gsplats_camera_to_kaolin(gs_camera)

Convert INRIA gaussian splats camera to Kaolin camera.

Note

This has been tested with commit 472689c.

Parameters

gs_camera (gsplats.scene.cameras.Camera) – camera to convert.

Returns

converted Kaolin camera.

Return type

(Camera)

kaolin.render.camera.kaolin_camera_to_gsplats(kal_camera, gs_cam_cls)

Convert Kaolin Camera to INRIA gaussian splats camera.

Note

This has been tested with commit 472689c.

Parameters
  • kal_camera (Camera) – camera to convert.

  • gs_cam_cls (class) – This is the gsplats Camera class, usually located in gsplats/scene/cameras.py.

Returns

converted INRIA gsplats camera.

Return type

(gsplats.scene.cameras.Camera)

kaolin.render.camera.opengl_coords()

Contemporary OpenGL doesn’t enforce a specific handedness on world coordinates. However, it is a common convention to define OpenGL world coordinates as right handed, with the y axis pointing upwards (Cartesian):

   Y
   ^
   |
   |---------> X
  /
Z
kaolin.render.camera.perspective_camera(points, camera_proj)

Projects 3D points on 2D images in perspective projection mode.

Parameters
  • points (torch.FloatTensor) – 3D points in camera coordinates, of shape \((\text{batch_size}, \text{num_points}, 3)\).

  • camera_proj (torch.FloatTensor) – projection matrix of shape \((3, 1)\).

Returns

2D points on image plane of shape \((\text{batch_size}, \text{num_points}, 2)\).

Return type

(torch.FloatTensor)
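
One way a \((3, 1)\) projection can act on points is as three per-axis scale factors followed by a division by the scaled depth. The plain-Python sketch below assumes exactly that behavior for an unbatched list of points; the function name and the interpretation of camera_proj are assumptions made for illustration.

```python
def project_points(points, camera_proj):
    """Scale x, y, z by the three projection entries, then divide x and y by depth."""
    sx, sy, sz = (row[0] for row in camera_proj)
    projected = []
    for x, y, z in points:
        depth = z * sz  # positive for points in front of a camera looking down -z
        projected.append([x * sx / depth, y * sy / depth])
    return projected

camera_proj = [[1.0], [1.0], [-1.0]]  # unit scales; the z entry flips the sign of depth
points = [[1.0, 2.0, -4.0]]           # a point 4 units in front of the camera
print(project_points(points, camera_proj))  # [[0.25, 0.5]]
```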

kaolin.render.camera.register_backend(name)

Registers a representation backend class with a unique name.

CameraExtrinsics can switch between registered representations dynamically (see switch_backend()).

Parameters

name (str) –

kaolin.render.camera.rotate_translate_points(points, camera_rot, camera_trans)

Rotate and translate 3D points based on a rotation matrix and a translation matrix.

Formula is \(\text{P_new} = R * (\text{P_old} - T)\)

Parameters
  • points (torch.FloatTensor) – 3D points, of shape \((\text{batch_size}, \text{num_points}, 3)\).

  • camera_rot (torch.FloatTensor) – rotation matrix, of shape \((\text{batch_size}, 3, 3)\).

  • camera_trans (torch.FloatTensor) – translation matrix, of shape \((\text{batch_size}, 3, 1)\).

Returns

The rotated and translated 3D points, of the same shape as points.

Return type

(torch.FloatTensor)

kaolin.render.camera.up_to_homogeneous(vectors)

Up-projects vectors to homogeneous coordinates of four dimensions. If the vectors are already in homogeneous coordinates, this function returns the inputs unchanged.

Parameters

vectors (torch.Tensor) – the input vectors to project, of shape \((..., 3)\)

Returns

The projected vectors, of the same shape as the inputs except that the last dimension is 4.

Return type

(torch.Tensor)
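
For a single vector, the behavior described above can be sketched in plain Python as follows. The sketch assumes the appended w coordinate is 1, which is the usual choice for points; the function name is hypothetical.

```python
def up_to_homogeneous_sketch(vec):
    """Append w = 1 to a 3D vector; return 4D inputs unchanged."""
    if len(vec) == 4:
        return list(vec)
    return list(vec) + [1.0]

print(up_to_homogeneous_sketch([1.0, 2.0, 3.0]))       # [1.0, 2.0, 3.0, 1.0]
print(up_to_homogeneous_sketch([1.0, 2.0, 3.0, 2.0]))  # [1.0, 2.0, 3.0, 2.0]
```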