kaolin.ops.gaussian

Gaussian Splats are a 3D representation consisting of a collection of optimizable 3D Gaussian particles that carry values such as alpha and radiance.

kaolin does not cover the optimization and rendering of this representation, which other frameworks already handle. Rather, it provides additional operations that other common packages do not support.

To remain compatible with other frameworks, kaolin makes minimal assumptions about the exact fields tracked by the 3D Gaussians. At the bare minimum, Gaussians are expected to keep track of their mean, covariance rotation and scale components, and opacity.

Densification

The marriage of high quality reconstructions available with Gaussian Splats and physics simulations now supported by kaolin paves the way to new and exciting interactive opportunities.

To improve the accuracy of such simulations, kaolin includes a CUDA-based densification module which attempts to sample additional points within the volume of shapes represented with 3D Gaussians.

API

kaolin.ops.gaussian.sample_points_in_volume(xyz, scale, rotation, opacity, mask=None, num_samples=None, octree_level=8, opacity_threshold=0.35, post_scale_factor=1.0, jitter=True, clip_samples_to_input_bbox=True, viewpoints=None)

Logic for sampling additional points inside a shape represented by sparse 3D Gaussian Splats. Reconstructions based on 3D Gaussian Splats result in shells of objects, leaving the internal volume unfilled. In addition, the resulting shell is not a watertight shell, but rather, a collection of anisotropic gaussian samples on it.

Certain applications require additional samples within the volume. For example, physics simulations are more accurate when using volumetric mass. This function approximates the volumetric interior by injecting additional points into the interior of objects.

The algorithm attempts to voxelize the Gaussians and predict an approximate surface, and then samples additional points within it, making sure the volume is evenly sampled (i.e. no voxel is sampled more than once). Note that for reconstructions of poor quality, the quality of the resulting samples may vary.

Note

Choosing Densifier Args.

The densifier is a non-learned method, described in detail below. Consequently, models of varying quality may require different parameter settings. As a rule of thumb, octree_level controls the density of volume samples. A higher level yields more points within the volume, but may also expose holes in the shape shell. opacity_threshold controls how aggressively low-opacity cells are culled. For lower quality models, consider lowering this parameter, down to 0.0 if necessary, to avoid exposing holes. jitter randomizes the position of each returned point within its voxel; whether to enable it depends on the application. The default viewpoints provide adequate coverage for common objects, but more complex objects with many cavities may benefit from a more specialized set of viewpoints. post_scale_factor downscales the returned points, using their mean as the center of scale, to ensure they reside within the shape shell. It is recommended to keep this value at or slightly below 1.0: for concave shapes, downscaling too much may cause the points to drift away from the shape shell.
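To make the effect of post_scale_factor concrete, the sketch below shrinks a point set toward its mean in plain Python. It is based only on the description above and is not kaolin's implementation; the helper name rescale_about_mean is hypothetical.

```python
def rescale_about_mean(points, factor):
    """Scale 3D points toward their mean (factor < 1 shrinks); hypothetical helper."""
    n = len(points)
    mean = [sum(p[axis] for p in points) / n for axis in range(3)]
    return [
        tuple(mean[axis] + factor * (p[axis] - mean[axis]) for axis in range(3))
        for p in points
    ]


# Two points 2 units apart end up 1 unit apart, centered on the same mean.
shrunk = rescale_about_mean([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], factor=0.5)
```

Because the center of scale is the mean of the points, a concave shape whose mean falls outside the solid region will have its points pulled toward empty space, which is consistent with the advice to stay close to 1.0.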

Note

Implementation Details. The object sampling takes place in 2 steps.

  1. The set of 3D Gaussians is converted to voxels using a novel hierarchical algorithm which builds on kaolin’s Structured Point Cloud (SPC), which functions as an octree. The axis-aligned bounding box of the Gaussians is enclosed in a cubical root node of an octree. This node is subdivided in an 8-way split, and a list of overlapping Gaussian IDs is maintained for each sub-node. Nodes that overlap at least one Gaussian are subdivided again and tested for overlap. This process repeats until the resolution set by the octree level is reached. The nodes at the frontier of the octree are a voxelization of the Gaussians, represented by an SPC. The opacity_threshold parameter may cause some cells to be culled if they have not accumulated enough density from the 3D Gaussians overlapping them. At the end of this step, the SPC does not include voxels ‘inside’ the object represented by the Gaussians, but only voxels that represent the shape shell.

  2. The voxelized shell is filled by carving the space of voxels using rendered depth maps. This is achieved by ray-tracing the SPC from an icosahedral collection of viewpoints to create a depth map for each view. These depth maps are fused together into a second sparse SPC using a novel algorithm that maintains the occupancy state for each node of the full octree. These states are: empty, occupied, or unseen. Finally, the occupancy state of each point in a regular grid is determined by querying this SPC. The union of the sets of occupied and unseen points serves as a sampling of the solid object.
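The two steps above can be illustrated with a heavily simplified 2D analogue. This is a sketch under several assumptions, not kaolin's algorithm: discs stand in for anisotropic Gaussians, a quadtree over the unit square replaces the octree, opacity accumulation is ignored, and axis-aligned rays from the four sides replace the icosahedral viewpoints. All names (voxelize_2d, carve_2d, and so on) are hypothetical.

```python
import math

EMPTY, OCCUPIED, UNSEEN = 0, 1, 2


def disc_overlaps_box(disc, x0, y0, size):
    """True if a disc (cx, cy, r) touches the square [x0, x0+size] x [y0, y0+size]."""
    cx, cy, r = disc
    nx = min(max(cx, x0), x0 + size)  # closest point of the box to the disc center
    ny = min(max(cy, y0), y0 + size)
    return (cx - nx) ** 2 + (cy - ny) ** 2 <= r * r


def voxelize_2d(discs, level):
    """Step 1: subdivide the unit square, keeping per-node lists of overlapping disc IDs."""
    frontier = {(0, 0): list(range(len(discs)))}  # the root node sees every disc
    for depth in range(1, level + 1):
        size = 1.0 / (1 << depth)
        children = {}
        for (ix, iy), ids in frontier.items():
            for dx in (0, 1):
                for dy in (0, 1):
                    cx, cy = 2 * ix + dx, 2 * iy + dy
                    hits = [i for i in ids
                            if disc_overlaps_box(discs[i], cx * size, cy * size, size)]
                    if hits:  # only non-empty nodes are kept and subdivided further
                        children[(cx, cy)] = hits
        frontier = children
    return set(frontier)


def carve_2d(shell, res):
    """Step 2: cells a ray crosses before its first shell hit become EMPTY."""
    state = {(x, y): UNSEEN for x in range(res) for y in range(res)}
    for cell in shell:
        state[cell] = OCCUPIED
    for k in range(res):
        rays = (
            [(i, k) for i in range(res)],            # looking along +x
            [(i, k) for i in reversed(range(res))],  # looking along -x
            [(k, i) for i in range(res)],            # looking along +y
            [(k, i) for i in reversed(range(res))],  # looking along -y
        )
        for ray in rays:
            for cell in ray:
                if state[cell] == OCCUPIED:
                    break  # the ray's "depth": nothing behind this cell is seen
                state[cell] = EMPTY
    return state


LEVEL = 4
# A ring of 64 small overlapping discs: a closed shell with a hollow interior.
discs = [(0.5 + 0.3 * math.cos(2 * math.pi * k / 64),
          0.5 + 0.3 * math.sin(2 * math.pi * k / 64), 0.03) for k in range(64)]
shell = voxelize_2d(discs, LEVEL)
state = carve_2d(shell, 1 << LEVEL)
# The solid object is approximated by the occupied and unseen cells together.
solid = {cell for cell, s in state.items() if s in (OCCUPIED, UNSEEN)}
```

In this toy setup the cells enclosed by the ring are never reached by any ray, so they stay unseen and are counted as part of the solid, while the cells outside the ring that a ray reaches are carved away as empty.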

Post-process: at this point the Gaussians have been converted to dense voxels that include the interior volume. A point is sampled at the center of each voxel. If jitter is true, a small perturbation is also applied to each point, small enough that the point remains within its voxel. Each voxel contains at most one point by the end of this phase. If num_samples is specified, the points are randomly subsampled so that at most num_samples points are returned.
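The post-process can be sketched in plain Python as follows. This is an illustration of the description above rather than kaolin's code: the helper name sample_voxel_points, the normalized unit-cube coordinates, the seed argument, and the jitter amplitude of 0.45 voxel widths are all assumptions.

```python
import random


def sample_voxel_points(voxels, resolution, jitter=True, num_samples=None, seed=0):
    """One point per voxel inside the unit cube; hypothetical helper."""
    rng = random.Random(seed)
    size = 1.0 / resolution  # voxel edge length
    points = []
    for voxel in sorted(voxels):
        point = []
        for index in voxel:
            # Assumed amplitude: at most 0.45 of a voxel away from the center,
            # so a jittered point cannot leave its voxel.
            offset = rng.uniform(-0.45, 0.45) if jitter else 0.0
            point.append((index + 0.5 + offset) * size)
        points.append(tuple(point))
    if num_samples is not None and num_samples < len(points):
        # Random subsample without replacement, capping the number of points.
        points = rng.sample(points, num_samples)
    return points


voxels = {(0, 0, 0), (1, 2, 3)}
centers = sample_voxel_points(voxels, resolution=4, jitter=False)
jittered = sample_voxel_points(voxels, resolution=4, jitter=True)
capped = sample_voxel_points(voxels, resolution=4, num_samples=1)
```

With jitter disabled the points lie on a regular grid of voxel centers; with jitter enabled each point still maps back to the voxel it was sampled from.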

Parameters
  • xyz (torch.FloatTensor) – A tensor of shape \((\text{N, 3})\) containing the Gaussian means. For example, using the original Inria codebase, this corresponds to GaussianModel.get_xyz.

  • scale (torch.FloatTensor) – A tensor of shape \((\text{N, 3})\) containing the Gaussian covariance scale components, in a format of a 3D scale vector per Gaussian. The scale is assumed to be post-activation. For example, using the original Inria codebase, this corresponds to GaussianModel.get_scaling.

  • rotation (torch.FloatTensor) – A tensor of shape \((\text{N, 4})\) containing the Gaussian covariance rotation components, in a format of a 4D quaternion per Gaussian. The rotation is assumed to be post-activation. For example, using the original Inria codebase, this corresponds to GaussianModel.get_rotation.

  • opacity (torch.FloatTensor) – A tensor of shape \((\text{N, 1})\) or \((\text{N,})\) containing the Gaussian opacities. For example, using the original Inria codebase, this corresponds to GaussianModel.get_opacity.

  • mask (optional, torch.BoolTensor) – An optional \((\text{N,})\) boolean mask which selects only a subset of the Gaussians to use for predicting the shell. Useful if some Gaussians are suspected to be noise. By default, the mask is assumed to be a tensor of ones, which selects all Gaussians.

  • num_samples (optional, int) – An optional upper cap on the number of points sampled in the predicted volume. If not specified, the volume is evenly sampled in space according to the octree resolution.

  • octree_level (int) – A Structured Point Cloud of cubic resolution \((\text{2**level})\) will be constructed to voxelize and fill the volume with points. A single point will be sampled within each voxel. Higher values require more memory, and may suffer from holes in the shell. At the same time, higher values provide more points within the shape volume. The supported range for octree_level is \([\text{6, 10}]\).

  • opacity_threshold (float) – The densification algorithm starts by voxelizing space using the Gaussian responses and their associated opacities. Each cell accumulates the opacity induced by the Gaussians overlapping it. Voxels with accumulated opacity below this threshold will be masked away. If \(\text{opacity_threshold} = 0.0\), no culling will take place.

  • post_scale_factor (float) – Postprocess: if \(\text{post_scale_factor} < 1.0\), the returned pointcloud will be rescaled to ensure it fits inside the hull of the input points. It is recommended to avoid values significantly lower than 1 with concave or multi-part objects.

  • jitter (bool) – If true, applies a small jitter to the returned volume points. If false, the returned points lie on an equally distanced grid.

  • clip_samples_to_input_bbox (bool) – If true, the densifier will compute a bounding box out of the input gaussian means. Any points sampled outside of this bounding box will be rejected. For most purposes, it is recommended to leave this “safety mechanism” toggled on.

  • viewpoints (optional, torch.Tensor) – Collection of viewpoints used to ‘carve’ out seen voxel space around the shell after it’s voxelized. This is a \((\text{C, 3})\) tensor of camera viewpoints facing the center, chosen based on empirical heuristics. If not specified, kaolin will opt to use its own set of default views.

Returns

A tensor of \((\text{K, 3})\) points sampled inside the approximated shape volume.

Return type

(torch.FloatTensor)
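Example

A minimal usage sketch follows. Only sample_points_in_volume and its keyword arguments come from the signature above; the helpers octree_resolution and densify are hypothetical, and the commented call assumes a trained model object from the original Inria codebase. Running densify requires a kaolin installation with CUDA support.

```python
def octree_resolution(octree_level):
    """Voxels per axis for a given octree_level; hypothetical helper."""
    if not 6 <= octree_level <= 10:
        raise ValueError("octree_level is documented as supported in [6, 10]")
    return 2 ** octree_level  # each octree level halves the cell size along every axis


def densify(xyz, scale, rotation, opacity, octree_level=8, **kwargs):
    """Thin wrapper around the documented call; hypothetical helper."""
    # Imported lazily so that octree_resolution works without kaolin installed.
    from kaolin.ops.gaussian import sample_points_in_volume

    octree_resolution(octree_level)  # fail early on unsupported levels
    return sample_points_in_volume(
        xyz=xyz, scale=scale, rotation=rotation, opacity=opacity,
        octree_level=octree_level, **kwargs,
    )


# With a trained model from the original Inria codebase (assumed variable `model`):
# points = densify(model.get_xyz, model.get_scaling, model.get_rotation,
#                  model.get_opacity, opacity_threshold=0.35, jitter=True)
# `points` would then be a (K, 3) tensor of samples inside the approximated volume.
```

At the default octree_level of 8 the grid has 256 cells per axis, so moving up one level multiplies the number of candidate voxels by eight, which is worth keeping in mind given the memory note on octree_level.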