kaolin.render.mesh

API

kaolin.render.mesh.deftet_sparse_render(pixel_coords, render_ranges, face_vertices_z, face_vertices_image, face_features, knum=300)

Fully differentiable volumetric renderer devised by Gao et al. in Learning Deformable Tetrahedral Meshes for 3D Reconstruction, NeurIPS 2020.

This rasterizes a mesh w.r.t. a list of pixel coordinates, but instead of rendering only the closest intersection, it renders all the intersections sorted by depth order, returning the interpolated features and the indices of the intersected faces for each intersection in padded arrays.

Note

The function is not differentiable w.r.t pixel_coords.

Note

If face_vertices_image and face_vertices_z are produced by camera functions in kaolin.render.camera, then the expected range of values for pixel_coords is [-1., 1.] and the expected range of values for render_ranges is [-inf, 0.].

Parameters
  • pixel_coords (torch.Tensor) – Image coordinates to render, of shape \((\text{batch_size}, \text{num_pixels}, 2)\).

  • render_ranges (torch.Tensor) – Depth ranges within which intersections are rendered, of shape \((\text{batch_size}, \text{num_pixels}, 2)\).

  • face_vertices_z (torch.Tensor) – Depth (z) values of the face vertices in camera coordinates, of shape \((\text{batch_size}, \text{num_faces}, 3)\). Values in front of the camera are expected to be negative, with higher values being closer to the camera.

  • face_vertices_image (torch.Tensor) – 2D positions of the face vertices on the image plane, of shape \((\text{batch_size}, \text{num_faces}, 3, 2)\). Note that the face vertices are projected onto the image plane (z=-1) to form face_vertices_image.

  • face_features (torch.Tensor or list of torch.Tensor) – Features (per-vertex per-face) to be drawn, of shape \((\text{batch_size}, \text{num_faces}, 3, \text{feature_dim})\), where feature_dim is the feature dimension, for instance feature_dim=3 for vertex colors (R, G, B) and feature_dim=2 for texture coordinates (X, Y); or a list of such tensors, of shapes \((\text{batch_size}, \text{num_faces}, 3, \text{feature_dim[i]})\).

  • knum (int) – Maximum number of faces that influence one pixel. Default: 300.

Returns

  • The rendered features, of shape \((\text{batch_size}, \text{num_pixels}, \text{knum}, \text{feature_dim})\). If face_features is a list of torch.Tensor, a list of torch.Tensor is returned instead, of shapes \((\text{batch_size}, \text{num_pixels}, \text{knum}, \text{feature_dim[i]})\).

  • The rendered face index, -1 is void, of shape \((\text{batch_size}, \text{num_pixels}, \text{knum})\).

Return type

(torch.Tensor or list of torch.Tensor, torch.LongTensor)
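
To make the padded output concrete, here is a minimal plain-Python sketch of what the description above implies for a single pixel: keep the intersections inside the render range, order them by depth, and pad the face indices with -1 up to knum. It is an illustration only, not kaolin's implementation. The helper name sparse_hits_for_pixel, the closest-first ordering and the inclusive range test are assumptions of this sketch.

```python
def sparse_hits_for_pixel(hits, render_range, knum):
    """Hypothetical helper: pad one pixel's intersections to length knum.

    hits: list of (z, face_idx) pairs, with z negative in front of the camera.
    render_range: (z_min, z_max) depth interval to keep (assumed inclusive).
    """
    z_min, z_max = render_range
    kept = [(z, f) for z, f in hits if z_min <= z <= z_max]
    # Higher z is closer to the camera, so sort by descending z
    # (closest-first ordering is an assumption of this sketch).
    kept.sort(key=lambda hit: hit[0], reverse=True)
    kept = kept[:knum]
    face_idx = [f for _, f in kept]
    # -1 marks void slots, matching the convention of the returned face index.
    return face_idx + [-1] * (knum - len(face_idx))


hits = [(-2.5, 7), (-0.8, 3), (-4.0, 1), (-1.6, 5)]
padded = sparse_hits_for_pixel(hits, render_range=(-3.0, 0.0), knum=5)
print(padded)  # [3, 5, 7, -1, -1]
```

The face at depth -4.0 falls outside the range and is dropped; the two unused slots are filled with -1.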

kaolin.render.mesh.dibr_rasterization(height, width, face_vertices_z, face_vertices_image, face_features, face_normals_z, sigmainv=7000, boxlen=0.02, knum=30, multiplier=1000)

Fully differentiable DIB-R renderer implementation, that renders 3D triangle meshes with per-vertex per-face features to generalized feature “images”, soft foreground masks, depth and face index maps.

See for usage with textures and lighting.

Originally proposed by Chen, Wenzheng, et al. in Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer, NeurIPS 2019.

Parameters
  • height (int) – the height of the rendered images

  • width (int) – the width of the rendered images

  • face_vertices_z (torch.FloatTensor) – Depth (z) values of the face vertices in camera coordinates, of shape \((\text{batch_size}, \text{num_faces}, 3)\).

  • face_vertices_image (torch.FloatTensor) – 2D positions of the face vertices on the image plane, of shape \((\text{batch_size}, \text{num_faces}, 3, 2)\). Note that face_vertices_camera is projected onto the image plane (z=-1) to form face_vertices_image. The coordinates of face_vertices_image are in [-1, 1], which corresponds to normalized image pixels.

  • face_features (torch.FloatTensor or list of torch.FloatTensor) – Features (per-vertex per-face) to be drawn, of shape \((\text{batch_size}, \text{num_faces}, 3, \text{feature_dim})\), where feature_dim is the feature dimension, for instance feature_dim=3 for vertex colors (R, G, B) and feature_dim=2 for texture coordinates (X, Y); or a list of such tensors, of shapes \((\text{batch_size}, \text{num_faces}, 3, \text{feature_dim[i]})\).

  • face_normals_z (torch.FloatTensor) – Normal directions along the z axis, of shape \((\text{batch_size}, \text{num_faces})\). Only faces with normal z >= 0 will be drawn.

  • sigmainv (int) – Smoothness term for the soft mask; the higher the value, the sharper the mask. The range is [1/3e-4, 1/3e-5]. Default: 7000.

  • boxlen (float) – We assume a pixel is only influenced by nearby faces, and boxlen controls the size of that area. The range is [0.05, 0.2]. Default: 0.02.

  • knum (int) – Maximum number of faces that influence one pixel. The range is [20, 100]. Default: 30. Note that the larger boxlen is, the larger knum should be.

  • multiplier (int) – To avoid numerical issues, the coordinates are enlarged by a multiplier. Default: 1000.

Returns

  • The rendered features, of shape \((\text{batch_size}, \text{height}, \text{width}, \text{num_features})\), num_features being the feature dimension of the input. If face_features is a list of torch.FloatTensor, a list of torch.FloatTensor is returned, of shapes \((\text{batch_size}, \text{height}, \text{width}, \text{num_features[i]})\).

  • The rendered soft mask, of shape \((\text{batch_size}, \text{height}, \text{width})\). It is generally used in an IoU loss to deform the shape.

  • The rendered face index, of shape \((\text{batch_size}, \text{height}, \text{width})\). 0 is void and face indices start from 1.

Return type

(torch.FloatTensor, torch.FloatTensor, torch.LongTensor)
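
The soft mask is described above as typically feeding an IoU loss. The source does not say which formulation is meant, so the following is only a sketch of one common soft IoU, written in plain Python on flattened masks so that it runs without a GPU. The function name soft_iou_loss and the eps term are assumptions of this sketch.

```python
def soft_iou_loss(soft_mask, target_mask, eps=1e-8):
    """1 - IoU between a soft mask in [0, 1] and a target mask (flat lists).

    Intersection and union use the product and the probabilistic sum,
    which reduce to the usual set operations for binary masks.
    """
    intersection = sum(m * t for m, t in zip(soft_mask, target_mask))
    union = sum(m + t - m * t for m, t in zip(soft_mask, target_mask))
    # eps (an assumption of this sketch) avoids division by zero on empty masks.
    return 1.0 - intersection / (union + eps)


soft_mask = [0.9, 0.6, 0.1, 0.0]   # e.g. one flattened 2x2 rendered soft mask
target = [1.0, 1.0, 0.0, 0.0]      # ground-truth silhouette
loss = soft_iou_loss(soft_mask, target)
```

Because the soft mask varies smoothly with the face positions, a loss of this kind can pass gradients to pixels near, but outside, the silhouette.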

kaolin.render.mesh.prepare_vertices(vertices, faces, camera_proj, camera_rot=None, camera_trans=None, camera_transform=None)

Wrapper function to move and project vertices to cameras then index them with faces.

Parameters
  • vertices (torch.Tensor) – the vertices of the meshes, of shape \((\text{batch_size}, \text{num_vertices}, 3)\).

  • faces (torch.LongTensor) – the faces of the meshes, of shape \((\text{num_faces}, \text{face_size})\).

  • camera_proj (torch.Tensor) – the camera projection vector, of shape \((3, 1)\).

  • camera_rot (torch.Tensor, optional) – the camera rotation matrices, of shape \((\text{batch_size}, 3, 3)\).

  • camera_trans (torch.Tensor, optional) – the camera translation vectors, of shape \((\text{batch_size}, 3)\).

  • camera_transform (torch.Tensor, optional) – the camera transformation matrices, of shape \((\text{batch_size}, 4, 3)\). Replaces camera_trans and camera_rot.

Returns

  • The vertices in camera coordinates indexed by faces, of shape \((\text{batch_size}, \text{num_faces}, \text{face_size}, 3)\).

  • The vertices in camera plane coordinates indexed by faces, of shape \((\text{batch_size}, \text{num_faces}, \text{face_size}, 2)\).

  • The face normals, of shape \((\text{batch_size}, \text{num_faces})\).

Return type

(torch.Tensor, torch.Tensor, torch.Tensor)
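
The "index them with faces" step is what turns a \((\text{num_vertices}, 3)\) array into a \((\text{num_faces}, \text{face_size}, 3)\) one. The plain-Python sketch below shows that gather for a single mesh. The helper name gather_face_vertices is hypothetical and this is not the library's code; the camera transform and projection steps are deliberately left out.

```python
def gather_face_vertices(vertices, faces):
    """Hypothetical helper: look up the coordinates of each face's vertices.

    vertices: list of num_vertices [x, y, z] points (one mesh, no batch).
    faces: list of num_faces lists of vertex indices (face_size each).
    Returns a nested list of shape (num_faces, face_size, 3).
    """
    return [[list(vertices[i]) for i in face] for face in faces]


vertices = [
    [0.0, 0.0, -1.0],
    [1.0, 0.0, -1.0],
    [0.0, 1.0, -1.0],
    [1.0, 1.0, -2.0],
]
faces = [[0, 1, 2], [1, 3, 2]]  # two triangles sharing an edge

face_vertices = gather_face_vertices(vertices, faces)
# face_vertices[1] holds the coordinates of vertices 1, 3 and 2, in that order.
```

Shared vertices are duplicated in the result, which is why the per-face tensors above carry an explicit face_size dimension.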

kaolin.render.mesh.spherical_harmonic_lighting(imnormal, lights)

Creates lighting effects.

Follows convention set by Wojciech Jarosz in Efficient Monte Carlo Methods for Light Transport in Scattering Media.

Parameters
  • imnormal (torch.FloatTensor) – per pixel normal, of shape \((\text{batch_size}, \text{height}, \text{width}, 3)\)

  • lights (torch.FloatTensor) – spherical harmonic lighting parameters, of shape \((\text{batch_size}, 9)\)

Returns

The lighting effect, of shape \((\text{batch_size}, \text{height}, \text{width})\)

Return type

(torch.FloatTensor)

https://cs.dartmouth.edu/~wjarosz/publications/dissertation/appendixB.pdf
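
With 9 lighting parameters per batch item, the lighting value at a pixel can be read as a weighted sum of the first nine real spherical harmonic basis functions evaluated at that pixel's normal. The sketch below uses the standard real basis for bands 0 to 2 in plain Python. The ordering and sign convention of the nine terms are assumptions of this sketch and may not match the order in which the function reads lights, so check the referenced appendix before relying on it.

```python
import math

# Normalization constants of the real spherical harmonics, bands 0 to 2.
C0 = 0.5 * math.sqrt(1.0 / math.pi)
C1 = math.sqrt(3.0 / (4.0 * math.pi))
C2_OFF = 0.5 * math.sqrt(15.0 / math.pi)    # xy, yz, xz terms
C2_Z = 0.25 * math.sqrt(5.0 / math.pi)      # 3z^2 - 1 term
C2_XY = 0.25 * math.sqrt(15.0 / math.pi)    # x^2 - y^2 term


def sh9_basis(normal):
    """Evaluate the nine basis functions at a unit normal (assumed ordering)."""
    x, y, z = normal
    return [
        C0,
        C1 * y, C1 * z, C1 * x,
        C2_OFF * x * y, C2_OFF * y * z,
        C2_Z * (3.0 * z * z - 1.0),
        C2_OFF * x * z,
        C2_XY * (x * x - y * y),
    ]


def sh_lighting(normal, lights):
    """Scalar lighting for one pixel: dot product of lights and the basis."""
    return sum(c * b for c, b in zip(lights, sh9_basis(normal)))


ambient_only = [1.0] + [0.0] * 8
value = sh_lighting((0.0, 0.0, 1.0), ambient_only)  # equals C0 for any normal
```

Applying this at every pixel of a normal map yields one scalar per pixel, consistent with the \((\text{batch_size}, \text{height}, \text{width})\) output shape.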

kaolin.render.mesh.texture_mapping(texture_coordinates, texture_maps, mode='nearest')

Interpolates texture_maps by dense or sparse texture_coordinates. This function supports sampling texture coordinates for (1) an entire 2D image and (2) a sparse point cloud of texture coordinates.

Parameters
  • texture_coordinates (torch.FloatTensor) – dense image texture coordinates, of shape \((\text{batch_size}, h, w, 2)\), or sparse texture coordinates for points, of shape \((\text{batch_size}, \text{num_points}, 2)\). Coordinates are expected to be normalized between [0, 1]. Note that OpenGL texture coordinates differ from PyTorch's coordinates: OpenGL coordinates range from 0 to 1, the y axis goes from bottom to top, and circular mode is supported (-0.1 is the same as 0.9); PyTorch coordinates range from -1 to 1, the y axis goes from top to bottom, and circular mode is not supported. Filtering is the same as the mode parameter for torch.nn.functional.grid_sample.

  • texture_maps (torch.FloatTensor) –

    textures of shape \((\text{batch_size}, \text{num_channels}, h', w')\). Here, \(h'\) & \(w'\) are the height and width of texture maps.

    If texture_coordinates are image texture coordinates: for each pixel in the rendered image of height \(h\) and width \(w\), we use the coordinates in texture_coordinates to query the corresponding value in the texture maps. Note that the height \(h\) and width \(w\) of the rendered image can differ from \(h'\) & \(w'\).

    If texture_coordinates are sparse texture coordinates - For each point in texture_coordinates we query the corresponding value in texture_maps.

Returns

interpolated texture of shape \((\text{batch_size}, h, w, \text{num_channels})\) or interpolated texture of shape \((\text{batch_size}, \text{num_points}, \text{num_channels})\)

Return type

(torch.FloatTensor)
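
The difference between the two coordinate conventions described above can be written down as a small conversion: wrap into [0, 1) for circular mode, rescale to [-1, 1], and flip the y axis. The plain-Python sketch below shows that mapping for a single coordinate pair. The helper name opengl_uv_to_grid is hypothetical, and the sketch is not necessarily what texture_mapping does internally.

```python
def opengl_uv_to_grid(u, v):
    """Hypothetical helper: OpenGL-style (u, v) to grid_sample-style (x, y).

    OpenGL: range [0, 1], y from bottom to top, circular wrap-around.
    PyTorch grid_sample: range [-1, 1], y from top to bottom, no wrap-around.
    """
    # Emulate circular mode: -0.1 wraps to 0.9 (Python's % is non-negative here).
    u_wrapped = u % 1.0
    v_wrapped = v % 1.0
    x = 2.0 * u_wrapped - 1.0
    # Flip the vertical axis: v = 0 (bottom) maps to y = +1 (bottom row).
    y = 1.0 - 2.0 * v_wrapped
    return x, y


x, y = opengl_uv_to_grid(-0.1, 0.25)   # same result as for u = 0.9
x_ref, y_ref = opengl_uv_to_grid(0.9, 0.25)
```

Note that the wrap sends u = 1.0 to 0.0, which is consistent with circular addressing but differs from clamping at the border.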