kaolin.render.camera.CameraIntrinsics

API

class kaolin.render.camera.CameraIntrinsics(width, height, params, near, far)

Bases: ABC

Holds the intrinsics parameters of a camera: how it should project from camera space to normalized screen / clip space.

The intrinsics are determined by the camera type, meaning the parameters may differ according to the lens structure. Typical computer graphics systems commonly assume the intrinsics of a pinhole camera (see: PinholeIntrinsics class).

One implication is that some camera types do not use a linear projection (e.g. a fisheye lens). There are therefore several ways to use CameraIntrinsics subclasses:

1. Access the intrinsic parameters directly. This typically benefits use cases such as ray generators.

2. The transform() method is supported by all CameraIntrinsics subclasses, for both linear and non-linear transformations, to project vectors from camera space to normalized screen space. This method is implemented using differentiable PyTorch operations.

3. Certain CameraIntrinsics subclasses which perform linear projections may expose the transformation matrix via dedicated methods. For example, PinholeIntrinsics exposes a projection_matrix() method. This is typically useful for rasterization-based rendering pipelines (e.g. OpenGL vertex shaders).

This class is batched and may hold information from multiple cameras.

Parameters are stored as a single tensor of shape \((\text{num_cameras}, K)\) where K is the number of intrinsic parameters.
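
Example

A minimal sketch of access patterns 1 and 3 above (pattern 2, transform(), is illustrated under its own entry below). Here intrinsics is assumed to be an existing instance of a linear subclass such as PinholeIntrinsics; its construction is not shown.

>>> params = intrinsics.parameters()        # raw parameters buffer, shape (num_cameras, K)
>>> proj = intrinsics.projection_matrix()   # explicit projection matrix (linear subclasses only)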

Parameters

  • width (int) – width of the camera resolution.

  • height (int) – height of the camera resolution.

  • params (torch.Tensor) – the intrinsics parameters buffer, of shape \((\text{num_cameras}, K)\).

  • near (float) – near clipping plane.

  • far (float) – far clipping plane.

aspect_ratio()

Returns the aspect ratio of the cameras held by this object.

Return type

float

classmethod cat(cameras)

Concatenate multiple CameraIntrinsics objects.

Assumes all cameras use the same width, height, near and far planes.

Parameters

cameras (Sequence of CameraIntrinsics) – the cameras to concatenate.

Returns

The concatenated cameras as a single CameraIntrinsics.

Return type

(CameraIntrinsics)
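
Example

A short usage sketch, assuming intr_a and intr_b are existing PinholeIntrinsics instances that share the same width, height, near and far planes:

>>> merged = PinholeIntrinsics.cat([intr_a, intr_b])
>>> # merged.parameters() now stacks the per-camera parameter rows of both inputs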

clip_mask(depth)

Creates a boolean mask for clipping depth values which fall out of the view frustum.

Parameters

depth (torch.Tensor) – depth values

Returns

a mask marking whether each depth value is within the view frustum, of the same shape as depth.

Return type

(torch.BoolTensor)
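
Example

A hedged usage sketch, assuming intrinsics is an existing CameraIntrinsics instance:

>>> import torch
>>> depth = torch.tensor([0.01, 1.5, 50.0, 5000.0])   # arbitrary depth values
>>> mask = intrinsics.clip_mask(depth)                 # BoolTensor of the same shape as depth
>>> depth_in_frustum = depth[mask]                     # keep only values inside the view frustum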

cpu()
Return type

CameraIntrinsics

cuda()
Return type

CameraIntrinsics

property device: str

the torch device of the parameters tensor

double()
Return type

CameraIntrinsics

property dtype

the torch dtype of the parameters tensor

property far: float
float()
Return type

CameraIntrinsics

gradient_mask(*args)

Creates a gradient mask, which allows backpropagation only through params designated as trainable.

This function does not consider the requires_grad field when creating this mask.

Parameters

*args (Union[str, IntEnum]) – A vararg list of the intrinsic params that should allow gradient flow. This function also supports conversion of params from their string names (e.g. ‘focal_x’ will convert to PinholeParamsDefEnum.focal_x).

Return type

Tensor

Example

>>> # equivalent to:   mask = intrinsics.gradient_mask(IntrinsicsParamsDefEnum.focal_x,
>>> #                                                  IntrinsicsParamsDefEnum.focal_y)
>>> mask = intrinsics.gradient_mask('focal_x', 'focal_y')
>>> intrinsics.params.register_hook(lambda grad: grad * mask.float())
>>> # intrinsics will now allow gradient flow only for PinholeParamsDefEnum.focal_x and
>>> # PinholeParamsDefEnum.focal_y.
half()
Return type

CameraIntrinsics

property height: int
abstract property lens_type: str
named_params()

Get a descriptive list of named parameters per camera.

Returns

The named parameters.

Return type

(list of dict)
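
Example

A small sketch of how the returned list might be inspected, assuming intrinsics is an existing instance:

>>> for cam_idx, cam_params in enumerate(intrinsics.named_params()):
...     print(f'camera {cam_idx}: {cam_params}')   # dict of parameter name -> value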

property ndc_max: float
property ndc_min: float
property near: float
param_count()
Returns

number of intrinsic parameters managed per camera

Return type

(int)

abstract classmethod param_types()
Returns

an enum describing each of the intrinsic parameters managed by the subclass. This enum also defines the order in which values are kept within the params buffer.

Return type

(IntrinsicsParamsDefEnum)

parameters()

Returns

the intrinsics parameters buffer

Return type

Tensor

projection_matrix()
property requires_grad: bool

True if the current intrinsics object allows gradient flow

set_ndc_range(ndc_min, ndc_max)

Warning

This method is not implemented

to(*args, **kwargs)

Returns an instance of this object with the parameters tensor on the given device. If the specified device is the same as this object's device, this object is returned. Otherwise, a new object with a copy of the parameters tensor on the requested device is created.

See also

torch.Tensor.to()

Return type

CameraIntrinsics
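
Example

A hedged sketch, assuming intrinsics is an existing instance:

>>> import torch
>>> if torch.cuda.is_available():
...     intrinsics = intrinsics.to('cuda')   # a new object is returned only if the device changes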

abstract transform(vectors)

Projects the vectors from view space / camera space to NDC (normalized device coordinates) space. The NDC space used by kaolin is a left-handed coordinate system which uses OpenGL conventions:

Y      Z
^    /
|  /
|---------> X

The coordinates returned by this method are not concerned with clipping, and therefore the values returned by this transformation are not guaranteed to lie within \([-1, 1]\).

To support a wide range of lenses, this function is compatible with both linear and non-linear transformations (the latter are not representable by matrices). CameraIntrinsics subclasses should always implement this method using differentiable PyTorch operations.

Parameters

vectors (torch.Tensor) – the vectors to be transformed; can be homogeneous, of shape \((\text{num_vectors}, 4)\) or \((\text{num_cameras}, \text{num_vectors}, 4)\), or non-homogeneous, of shape \((\text{num_vectors}, 3)\) or \((\text{num_cameras}, \text{num_vectors}, 3)\)

Returns

the transformed vectors, of the same shape as vectors but with a last dimension of 3

Return type

(torch.Tensor)
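
Example

A hedged sketch of both accepted input layouts, assuming intrinsics is an existing CameraIntrinsics instance of any lens type:

>>> import torch
>>> points_cam = torch.tensor([[0.0, 0.0, -1.0],
...                            [0.5, 0.5, -2.0]])          # non-homogeneous, shape (num_vectors, 3)
>>> points_ndc = intrinsics.transform(points_cam)
>>> points_hom = torch.nn.functional.pad(points_cam, (0, 1), value=1.0)   # homogeneous, shape (num_vectors, 4)
>>> points_ndc_hom = intrinsics.transform(points_hom)      # the output's last dimension is always 3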

viewport_matrix(vl=0, vr=None, vb=0, vt=None, min_depth=0.0, max_depth=1.0)

Constructs a viewport matrix which transforms coordinates from NDC space to pixel space. This is the general matrix form of glViewport, familiar from OpenGL.

NDC coordinates are expected to be in:

  • [-1, 1] for the (x, y) coordinates.

  • [ndc_min, ndc_max] for the (z) coordinate.

Pixel coordinates are in:

  • [vl, vr] for the (x) coordinate.

  • [vb, vt] for the (y) coordinate.

  • [0, 1] for the (z) coordinate (yielding normalized depth).

When used in conjunction with a projection_matrix(), a transformation from camera view space to window space can be obtained.

Note that for the purpose of rendering with OpenGL shaders, this matrix is not required, as viewport transformation is already applied by the hardware.

By default, this matrix assumes the NDC and screen spaces have the y axis pointing up. Under this assumption, and a [-1, 1] NDC space, the default values of this method are compatible with OpenGL's glViewport.

See also

glViewport() at https://registry.khronos.org/OpenGL-Refpages/gl4/html/glViewport.xhtml and https://en.wikibooks.org/wiki/GLSL_Programming/Vertex_Transformations#Viewport_Transformation

projection_matrix() which converts coordinates from camera view space to NDC space.

Note

  1. This matrix changes form depending on the NDC space used.

  2. Returned values are floating point, rather than integers (thus this method is compatible with antialiasing ops).

Parameters
  • vl (int) – Viewport left (pixel coordinates x) - where the viewport starts. Default is 0.

  • vr (int) – Viewport right (pixel coordinates x) - where the viewport ends. Default is camera width.

  • vb (int) – Viewport bottom (pixel coordinates y) - where the viewport starts. Default is 0.

  • vt (int) – Viewport top (pixel coordinates y) - where the viewport ends. Default is camera height.

  • min_depth (float) – Minimum of output depth range. Default is 0.0.

  • max_depth (float) – Maximum of output depth range. Default is 1.0.

Returns

the viewport matrix, of shape \((1, 4, 4)\).

Return type

(torch.Tensor)
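
Example

A hedged sketch of the camera view space to window space composition mentioned above. It assumes intrinsics is an instance with a linear lens (e.g. PinholeIntrinsics), so that projection_matrix() is available, and it assumes a column-vector convention for applying the matrices; adjust if your convention differs:

>>> import torch
>>> proj = intrinsics.projection_matrix()[0]          # camera view space -> NDC (clip) space, camera 0
>>> viewport = intrinsics.viewport_matrix()[0]        # NDC space -> pixel space, shape (4, 4)
>>> point_cam = torch.tensor([0.2, -0.1, -2.0, 1.0])  # homogeneous camera-space point
>>> point_clip = proj @ point_cam
>>> point_ndc = point_clip / point_clip[-1]           # perspective divide
>>> point_win = viewport @ point_ndc                  # pixel (x, y) plus normalized depth (z)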

property width: int
abstract zoom(amount)

Applies a zoom on the camera by adjusting the lens.

Parameters

amount – Amount of adjustment