dire_rapids.utils module
Utility classes and functions for the dire-rapids package.
This module provides:
- ReducerConfig: Configuration dataclass for dimensionality reduction algorithms
- ReducerRunner: General-purpose runner for dimensionality reduction benchmarking
- Dataset loading utilities for sklearn, cytof, DiRe geometric datasets, and more
- dire_rapids.utils.rand_point_disk(n_features, n_samples=1, rng=None)
Generate uniformly distributed points in the n-dimensional unit disk.
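A common way to sample uniformly inside a unit ball is to draw a uniform direction and then a radius with density proportional to r^(n-1). The sketch below illustrates that technique with NumPy. `sample_unit_disk` is a hypothetical helper, not the dire-rapids implementation, and the (n_samples, n_features) output shape is an assumption.

```python
import numpy as np

def sample_unit_disk(n_features, n_samples=1, rng=None):
    """Hypothetical helper: uniform points inside the unit ball in R^n_features."""
    rng = np.random.default_rng(rng)
    # Normalized Gaussian vectors give uniformly distributed directions.
    x = rng.standard_normal((n_samples, n_features))
    directions = x / np.linalg.norm(x, axis=1, keepdims=True)
    # The volume inside radius r scales as r**n, so r = U**(1/n)
    # yields a uniform density over the ball.
    radii = rng.random((n_samples, 1)) ** (1.0 / n_features)
    return directions * radii

disk_points = sample_unit_disk(4, n_samples=100, rng=0)
```

Taking the n-th root of the uniform variate matters: using the raw uniform value as the radius would concentrate points near the center.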
- dire_rapids.utils.rand_point_sphere(n_features, n_samples=1, rng=None)
Generate uniformly distributed points on the n-dimensional unit sphere.
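One standard way to draw uniform points on a unit sphere is to normalize standard Gaussian vectors. The sketch below shows that technique with NumPy. `sample_unit_sphere` is a hypothetical helper, not the dire-rapids implementation, and the (n_samples, n_features) output shape is an assumption.

```python
import numpy as np

def sample_unit_sphere(n_features, n_samples=1, rng=None):
    """Hypothetical helper: uniform points on the unit sphere in R^n_features."""
    # Accepts None, an integer seed, or an existing numpy Generator.
    rng = np.random.default_rng(rng)
    # An isotropic Gaussian has a rotation-invariant direction.
    x = rng.standard_normal((n_samples, n_features))
    # Scaling each row to unit length projects it onto the sphere.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

sphere_points = sample_unit_sphere(3, n_samples=5, rng=0)
```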
- class dire_rapids.utils.elgen(a)
Bases: object
Ellipsoid generator that transforms sphere points into ellipsoid points.
- dire_rapids.utils.rand_point_ell(semi_axes, n_features, n_samples=1, rng=None)
Generate uniformly distributed points on an n-dimensional ellipsoid with the given semi-axes.
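The sphere-to-ellipsoid mapping can be illustrated by scaling each coordinate of a unit-sphere point by the matching semi-axis. The sketch below assumes that simple scaling. `sphere_to_ellipsoid` is a hypothetical helper, and whether dire-rapids applies any further correction is not stated here.

```python
import numpy as np

def sphere_to_ellipsoid(points, semi_axes):
    """Hypothetical helper: scale unit-sphere points onto an ellipsoid."""
    a = np.asarray(semi_axes, dtype=float)
    # Broadcasting multiplies coordinate i of every point by semi-axis a_i.
    return points * a

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))
sphere = x / np.linalg.norm(x, axis=1, keepdims=True)

semi_axes = [1.0, 2.0, 0.5]
ell_points = sphere_to_ellipsoid(sphere, semi_axes)

# Each mapped point satisfies the ellipsoid equation sum((x_i / a_i)**2) == 1.
residual = np.sum((ell_points / np.asarray(semi_axes)) ** 2, axis=1)
```

Plain scaling places every point on the ellipsoid, but it does not by itself preserve uniformity with respect to surface area when the semi-axes differ.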
- class dire_rapids.utils.ReducerConfig(name: str, reducer_class: type, reducer_kwargs: dict, visualize: bool = False, categorical_labels: bool = True, max_points: int = 10000)
Bases: object
Configuration for a dimensionality reduction algorithm.
- All fields are mutable and can be changed after creation:
    config.visualize = True
    config.categorical_labels = False
    config.max_points = 20000
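The construct-then-mutate pattern can be tried without dire-rapids installed by using a stand-in dataclass that mirrors the documented signature. In the sketch below, `ReducerConfigSketch` and `DummyReducer` are hypothetical names introduced for illustration only; they are not part of the package.

```python
from dataclasses import dataclass

@dataclass
class ReducerConfigSketch:
    """Stand-in mirroring the documented ReducerConfig fields (not the real class)."""
    name: str
    reducer_class: type
    reducer_kwargs: dict
    visualize: bool = False
    categorical_labels: bool = True
    max_points: int = 10000

class DummyReducer:
    """Hypothetical placeholder for a real reducer class."""
    def __init__(self, n_components=2):
        self.n_components = n_components

config = ReducerConfigSketch(
    name="dummy",
    reducer_class=DummyReducer,
    reducer_kwargs={"n_components": 2},
)

# Fields of a non-frozen dataclass can be reassigned after creation.
config.visualize = True
config.max_points = 20000

# A runner would typically instantiate the reducer from the stored class and kwargs.
reducer = config.reducer_class(**config.reducer_kwargs)
```

With the package installed, the same keyword arguments would presumably be passed to `dire_rapids.utils.ReducerConfig` directly.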
- class dire_rapids.utils.ReducerRunner(config: ReducerConfig)
Bases: object
General-purpose runner for dimensionality reduction algorithms.
Supports:
- DiRe (create_dire, DiRePyTorch, DiRePyTorchMemoryEfficient, DiReCuVS)
- cuML (UMAP, TSNE)
- scikit-learn (any TransformerMixin-compatible class)
- Parameters:
config (ReducerConfig) – Configuration object containing reducer_class, reducer_kwargs, name, and visualize flag.
- config: ReducerConfig
- run(dataset, *, dataset_kwargs=None, transform=None)
Run dimensionality reduction on the specified dataset.
- Parameters:
- Returns:
Results containing:
- embedding: reduced data
- labels: data labels
- reducer: fitted reducer instance
- fit_time_sec: time taken for fit_transform
- dataset_info: dataset metadata
- Return type:
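The flow described for run (instantiate the reducer, time fit_transform, collect the results) can be sketched with the standard library alone. `run_sketch` and `FirstTwoColumns` are hypothetical names, the result is assumed to be a plain dict, and the contents of `dataset_info` are an illustrative guess rather than the package's actual metadata.

```python
import time

class FirstTwoColumns:
    """Hypothetical toy reducer: keeps the first n_components columns."""
    def __init__(self, n_components=2):
        self.n_components = n_components

    def fit_transform(self, X):
        return [row[: self.n_components] for row in X]

def run_sketch(reducer_class, reducer_kwargs, X, labels):
    """Illustrative runner: time fit_transform and bundle the documented keys."""
    reducer = reducer_class(**reducer_kwargs)
    start = time.perf_counter()
    embedding = reducer.fit_transform(X)
    fit_time_sec = time.perf_counter() - start
    return {
        "embedding": embedding,
        "labels": labels,
        "reducer": reducer,
        "fit_time_sec": fit_time_sec,
        # Assumed metadata fields, chosen only for this example.
        "dataset_info": {"n_samples": len(X), "n_features": len(X[0])},
    }

X = [[1, 2, 3], [4, 5, 6]]
result = run_sketch(FirstTwoColumns, {"n_components": 2}, X, labels=[0, 1])
```

Timing only the fit_transform call keeps dataset loading and reducer construction out of the measured figure, which matches the documented meaning of fit_time_sec.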
- static available_sklearn()
Return available sklearn dataset loaders, fetchers, and generators.
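scikit-learn names its dataset functions with the prefixes load_, fetch_, and make_, so one plausible way to enumerate them is to filter a module's attributes by prefix. The sketch below shows that idea on a stand-in namespace so it runs without scikit-learn. `list_dataset_functions` is a hypothetical helper, and the actual return format of available_sklearn is not documented here.

```python
import types

PREFIXES = ("load_", "fetch_", "make_")

def list_dataset_functions(module):
    """Hypothetical helper: names of callables that follow the dataset prefixes."""
    return sorted(
        name
        for name in dir(module)
        if name.startswith(PREFIXES) and callable(getattr(module, name))
    )

# Stand-in for sklearn.datasets, with one non-dataset function mixed in.
fake_datasets = types.SimpleNamespace(
    load_iris=lambda: None,
    fetch_openml=lambda: None,
    make_blobs=lambda: None,
    clear_data_home=lambda: None,
)

names = list_dataset_functions(fake_datasets)
```

With scikit-learn installed, passing `sklearn.datasets` instead of the stand-in should list its loaders, fetchers, and generators.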
- __init__(config: ReducerConfig) → None