kaolin.io.dataset¶
API¶
- class kaolin.io.dataset.Cache(func, cache_dir, cache_key)¶
Bases:
object
Caches the results of a function to disk. If already cached, data is returned from disk. Otherwise, the function is executed. Output tensors are always on CPU device.
Deprecated since version 0.13.0: Cache is deprecated.
- Parameters
- class kaolin.io.dataset.CachedDataset(dataset, cache_dir=None, save_on_disk=False, num_workers=0, force_overwrite=False, cache_at_runtime=False, progress_message=None, ignore_diff_error=False, transform=None)¶
Bases:
Dataset
A wrapper dataset that caches the data to disk or RAM depending on save_on_disk.
For all dataset[i] with i from 0 to len(dataset), the output is stored in RAM or on disk depending on save_on_disk.
The base dataset, or the preprocessing_transform if defined, should have a __getitem__(idx) method that returns a dictionary.
Note
If CUDA is used in preprocessing, num_workers must be set to 0.
- Parameters
dataset (torch.utils.data.Dataset or Sequence) – The base dataset to use.
cache_dir (optional, str) – Path where the data must be saved. Must be given if save_on_disk is not False.
save_on_disk (optional, bool or Sequence[str]) – If True, all the preprocessed outputs are stored on disk; if False, all the preprocessed outputs are stored in RAM; if it is a sequence of strings, the corresponding fields are stored on disk.
num_workers (optional, int) – Number of processes used in parallel for preprocessing. Default: 0 (run in main process).
force_overwrite (optional, bool) – If True, force overwriting on disk even if files already exist. Default: False.
cache_at_runtime (optional, bool) – If True, instead of preprocessing everything at construction of the dataset, each new __getitem__ call will cache if necessary. Default: False.
progress_message (optional, str) – Message to be displayed during preprocessing. This is unused with cache_at_runtime=True. Default: don’t show any message.
transform (optional, Callable) – If defined, called on the data at __getitem__. The result of this function is not cached. Default: don’t apply any transform.
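The following is a minimal usage sketch under stated assumptions, not a definitive recipe. SquaresDataset and build_cached are hypothetical names introduced for illustration. The keyword arguments are taken from the signature documented above. How kaolin stores non-tensor values such as plain integers is an assumption here. The kaolin import is done lazily inside the helper so the base dataset can be tried without kaolin installed.

```python
from collections.abc import Sequence


class SquaresDataset(Sequence):
    """Hypothetical base dataset: every item is a dictionary, as required above."""

    def __init__(self, size):
        self._size = size

    def __len__(self):
        return self._size

    def __getitem__(self, idx):
        if not 0 <= idx < self._size:
            raise IndexError(idx)
        # Each key is a "field" that save_on_disk can select by name.
        return {"index": idx, "square": idx * idx}


def build_cached(dataset, cache_dir):
    """Hypothetical helper wrapping a dataset with the documented arguments."""
    # Lazy import: kaolin (and torch) are only needed when this helper is called.
    from kaolin.io.dataset import CachedDataset

    return CachedDataset(
        dataset,
        cache_dir=cache_dir,      # must be given because save_on_disk is not False
        save_on_disk=["square"],  # only this field is written to disk
        num_workers=0,            # preprocess in the main process
    )
```

With kaolin installed, `build_cached(SquaresDataset(4), "/tmp/squares_cache")` would be the intended call; the cache path is a placeholder.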
- class kaolin.io.dataset.CombinationDataset(datasets)¶
Bases:
KaolinDataset
Dataset combining a list of datasets into a unified dataset object.
Deprecated since version 0.13.0: CombinationDataset is deprecated. See ProcessedDatasetV2.
Useful when multiple output representations are needed from a common base representation (e.g. when a mesh is to be served as both a pointcloud and a voxelgrid).
The output of get_attributes will be a tuple of all the get_attributes of the dataset list.
The output of get_data will be a tuple of all the get_data of the dataset list.
If a dataset does not have get_data, __getitem__ will be used instead.
The output of get_cache_key will be the cache key of the first dataset. If that dataset does not provide get_cache_key, the index will be used instead.
- Parameters
datasets – list or tuple of datasets
- get_attributes(index)¶
Returns the attributes at the given index. Attributes are usually not transformed by wrappers such as ProcessedDataset.
- get_cache_key(index)¶
- get_data(index)¶
Returns the data at the given index.
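The combination rules described for this class can be sketched in plain Python. This is an illustrative stand-in under stated assumptions, not kaolin's implementation: CombinedSketch, PointsToy and VoxelsToy are hypothetical names, and the sketch only covers the tuple outputs, the __getitem__ fallback for get_data, and the cache-key rule.

```python
class CombinedSketch:
    """Illustrative stand-in for the documented combination rules."""

    def __init__(self, datasets):
        self.datasets = list(datasets)

    def get_data(self, index):
        # Tuple of every dataset's get_data; use __getitem__ when get_data is missing.
        return tuple(
            d.get_data(index) if hasattr(d, "get_data") else d[index]
            for d in self.datasets
        )

    def get_attributes(self, index):
        # Tuple of every dataset's get_attributes.
        return tuple(d.get_attributes(index) for d in self.datasets)

    def get_cache_key(self, index):
        # Cache key of the first dataset, or the index if it provides none.
        first = self.datasets[0]
        if hasattr(first, "get_cache_key"):
            return first.get_cache_key(index)
        return index


class PointsToy:
    """Hypothetical dataset that provides all three methods."""

    def get_data(self, index):
        return f"points-{index}"

    def get_attributes(self, index):
        return {"name": f"mesh-{index}"}

    def get_cache_key(self, index):
        return f"key-{index}"


class VoxelsToy:
    """Hypothetical dataset without get_cache_key."""

    def get_data(self, index):
        return f"voxels-{index}"

    def get_attributes(self, index):
        return {"resolution": 32}
```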
- class kaolin.io.dataset.KaolinDataset¶
Bases:
Dataset
A dataset that supports separating data from attributes and combines them in its __getitem__. The return value of __getitem__ is a named tuple containing the return values of both get_data and get_attributes. The difference between get_data and get_attributes is that data can be transformed or preprocessed (for example with ProcessedDataset), while attributes generally are not.
Deprecated since version 0.13.0: KaolinDataset is deprecated. Datasets should always output a dictionary to be compatible with ProcessedDataset.
- abstract get_attributes(index)¶
Returns the attributes at the given index. Attributes are usually not transformed by wrappers such as ProcessedDataset.
- abstract get_data(index)¶
Returns the data at the given index.
- class kaolin.io.dataset.KaolinDatasetItem(data, attributes)¶
Bases:
tuple
- attributes¶
Alias for field number 1
- data¶
Alias for field number 0
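To make the field order concrete, here is a small sketch using only the standard library. ItemSketch and ToyDataset are hypothetical stand-ins, not the kaolin classes. They mirror the documented layout, where data is field 0 and attributes is field 1, and a __getitem__ that combines get_data and get_attributes.

```python
from collections import namedtuple

# Same field order as documented: data is field 0, attributes is field 1.
ItemSketch = namedtuple("ItemSketch", ["data", "attributes"])


class ToyDataset:
    """Hypothetical dataset that keeps data and attributes separate."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def get_data(self, index):
        # The part a wrapper would be allowed to transform or preprocess.
        return self.values[index]

    def get_attributes(self, index):
        # Metadata that is generally left untouched.
        return {"index": index}

    def __getitem__(self, index):
        # Combine both parts into one named tuple.
        return ItemSketch(
            data=self.get_data(index),
            attributes=self.get_attributes(index),
        )
```

Because the item is a tuple, it can be read by name (`item.data`), by position (`item[0]`), or unpacked as `data, attributes = item`.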
- class kaolin.io.dataset.ProcessedDataset(dataset, preprocessing_transform=None, cache_dir=None, num_workers=None, transform=None, no_progress=False)¶
Bases:
KaolinDataset
- Parameters
no_progress (bool) –
- get_attributes(index)¶
Returns the attributes at the given index. Attributes are usually not transformed by wrappers such as ProcessedDataset.
- get_cache_key(index)¶
- get_data(index)¶
Returns the data at the given index.