dire_rapids
PyTorch and RAPIDS (cuVS/cuML) accelerated dimensionality reduction.
This package provides high-performance dimensionality reduction using the DiRe algorithm with multiple backend implementations:
DiRePyTorch: Standard PyTorch implementation for general use
DiRePyTorchMemoryEfficient: Memory-optimized PyTorch implementation for large datasets
DiReCuVS: RAPIDS cuVS/cuML accelerated implementation for massive datasets
The package automatically selects the best available backend based on system capabilities and dataset characteristics.
Additionally, the package provides comprehensive metrics for evaluating dimensionality reduction quality through the metrics module, which includes:
Distortion metrics: stress, neighborhood preservation
Context metrics: SVM/kNN classification accuracy preservation
Topological metrics: persistent homology, Betti curves, Wasserstein/bottleneck distances
Examples
Basic usage with automatic backend selection:
from dire_rapids import create_dire
# Create reducer with optimal backend
reducer = create_dire()
# Fit and transform data
embedding = reducer.fit_transform(X)
Force a specific backend:
from dire_rapids import DiRePyTorch, DiReCuVS
# Use PyTorch backend
reducer = DiRePyTorch(n_neighbors=32)
# Use RAPIDS backend (requires RAPIDS installation)
reducer = DiReCuVS(use_cuvs=True)
Evaluate embedding quality:
from dire_rapids.metrics import evaluate_embedding
# Comprehensive evaluation
results = evaluate_embedding(data, embedding, labels)
print(f"Stress: {results['local']['stress']:.4f}")
print(f"SVM accuracy: {results['context']['svm'][1]:.4f}")
- class dire_rapids.DiRePyTorch(n_components=2, n_neighbors=16, init='pca', max_iter_layout=128, min_dist=0.01, spread=1.0, cutoff=42.0, n_sample_dirs=8, sample_size=16, neg_ratio=8, verbose=True, random_state=None, use_exact_repulsion=False, metric=None)[source]
Bases:
TransformerMixin
Memory-efficient PyTorch/PyKeOps implementation of DiRe dimensionality reduction.
This class provides a high-performance implementation of the DiRe algorithm using PyTorch as the computational backend. It features adaptive memory management for large datasets and automatic GPU optimization.
Features
Chunked k-NN computation prevents GPU out-of-memory errors (see the sketch after the use-case list below)
Memory-aware force computation with automatic chunk sizing
Attraction forces between k-NN neighbors only
Repulsion forces from random sampling for efficiency
Automatic FP16 optimization for memory and speed
Optional PyKeOps integration for low-dimensional data
Best suited for
Large datasets (>50K points) on CUDA GPUs
Production environments requiring reliable memory usage
High-performance dimensionality reduction workflows
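The chunked k-NN feature can be pictured with a short sketch. This is an illustrative reimplementation under stated assumptions, not the library's internal code; chunked_knn, its chunk size, and the no-duplicate-points assumption are hypothetical:
import torch

def chunked_knn(X, k, chunk_size=4096):
    # Illustrative chunked k-NN: only a (chunk, n) distance block is ever
    # materialized, never the full (n, n) matrix.
    n = X.shape[0]
    indices = torch.empty(n, k, dtype=torch.long, device=X.device)
    for start in range(0, n, chunk_size):
        end = min(start + chunk_size, n)
        d = torch.cdist(X[start:end], X)
        # Take k+1 nearest, then drop column 0 (the self-match at distance 0,
        # assuming no duplicate points).
        idx = d.topk(k + 1, largest=False).indices
        indices[start:end] = idx[:, 1:]
    return indices

# Example: 10k points in 64-D, 16 neighbors each.
X = torch.randn(10_000, 64)
print(chunked_knn(X, k=16).shape)  # torch.Size([10000, 16])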
- param n_components:
Number of dimensions in the target embedding space.
- type n_components:
int, default=2
- param n_neighbors:
Number of nearest neighbors to use for attraction forces.
- type n_neighbors:
int, default=16
- param init:
Method for initializing the embedding. 'pca' uses PCA initialization, 'random' uses random projection.
- type init:
{'pca', 'random'}, default='pca'
- param max_iter_layout:
Maximum number of optimization iterations.
- type max_iter_layout:
int, default=128
- param min_dist:
Minimum distance between points in the embedding.
- type min_dist:
float, default=1e-2
- param spread:
Controls how tightly points are packed in the embedding.
- type spread:
float, default=1.0
- param cutoff:
Distance cutoff for repulsion forces.
- type cutoff:
float, default=42.0
- param n_sample_dirs:
Number of sampling directions (used by derived classes).
- type n_sample_dirs:
int, default=8
- param sample_size:
Size of samples for force computation (used by derived classes).
- type sample_size:
int, default=16
- param neg_ratio:
Ratio of negative samples to positive samples for repulsion.
- type neg_ratio:
int, default=8
- param verbose:
Whether to print progress information.
- type verbose:
bool, default=True
- param random_state:
Random seed for reproducible results.
- type random_state:
int or None, default=None
- param use_exact_repulsion:
If True, use exact all-pairs repulsion (memory intensive, for testing only).
- type use_exact_repulsion:
bool, default=False
- param metric:
Custom distance metric for k-NN computation only (layout forces remain Euclidean):
None or 'euclidean'/'l2': Use fast built-in Euclidean distance
str: String expression evaluated with x and y tensors (e.g., '(x - y).abs().sum(-1)' for L1)
callable: Custom function taking (x, y) tensors and returning a distance matrix
Examples: '(x - y).abs().sum(-1)' (L1), '1 - (x*y).sum(-1)/(x.norm()*y.norm() + 1e-8)' (cosine).
- type metric:
str, callable, or None, default=None
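To make the metric contract concrete, the following hedged illustration shows how a string expression such as '(x - y).abs().sum(-1)' yields a pairwise distance matrix under broadcasting; the exact tensor shapes the library passes are an assumption here:
import torch

x = torch.randn(5, 1, 3)    # 5 query points, broadcast dim, 3 features (assumed shapes)
y = torch.randn(1, 8, 3)    # 8 reference points

l1 = (x - y).abs().sum(-1)  # the L1 string example from above
print(l1.shape)             # torch.Size([5, 8]) -- a 5x8 distance matrix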
- device
The PyTorch device being used (CPU or CUDA).
- Type:
torch.device
- logger
Instance-specific logger for this reducer.
- Type:
loguru.Logger
Examples
Basic usage:
from dire_rapids import DiRePyTorch
import numpy as np

# Create sample data
X = np.random.randn(10000, 100)

# Create and fit reducer
reducer = DiRePyTorch(n_neighbors=32, verbose=True)
embedding = reducer.fit_transform(X)

# Visualize results
fig = reducer.visualize()
fig.show()
With custom parameters:
reducer = DiRePyTorch(
    n_components=3,
    n_neighbors=50,
    max_iter_layout=200,
    min_dist=0.1,
    random_state=42
)
embedding = reducer.fit_transform(X)
With custom distance metric:
# Using L1 (Manhattan) distance for k-NN
reducer = DiRePyTorch(
    metric='(x - y).abs().sum(-1)',
    n_neighbors=32
)
embedding = reducer.fit_transform(X)

# Using custom callable metric
def cosine_distance(x, y):
    return 1 - (x * y).sum(-1) / (x.norm(dim=-1, keepdim=True) * y.norm(dim=-1, keepdim=True) + 1e-8)

reducer = DiRePyTorch(metric=cosine_distance)
embedding = reducer.fit_transform(X)
- __init__(n_components=2, n_neighbors=16, init='pca', max_iter_layout=128, min_dist=0.01, spread=1.0, cutoff=42.0, n_sample_dirs=8, sample_size=16, neg_ratio=8, verbose=True, random_state=None, use_exact_repulsion=False, metric=None)[source]
Initialize DiRePyTorch reducer with specified parameters.
- Parameters:
n_components (int, default=2) – Number of dimensions in the target embedding space.
n_neighbors (int, default=16) – Number of nearest neighbors to use for attraction forces.
init ({'pca', 'random'}, default='pca') – Method for initializing the embedding.
max_iter_layout (int, default=128) – Maximum number of optimization iterations.
min_dist (float, default=1e-2) – Minimum distance between points in the embedding.
spread (float, default=1.0) – Controls how tightly points are packed in the embedding.
cutoff (float, default=42.0) – Distance cutoff for repulsion forces.
n_sample_dirs (int, default=8) – Number of sampling directions (reserved for future use).
sample_size (int, default=16) – Size of samples for force computation (reserved for future use).
neg_ratio (int, default=8) – Ratio of negative samples to positive samples for repulsion.
verbose (bool, default=True) – Whether to print progress information.
random_state (int or None, default=None) – Random seed for reproducible results.
use_exact_repulsion (bool, default=False) – If True, use exact all-pairs repulsion (memory intensive, testing only).
metric (str, callable, or None, default=None) – Custom distance metric for k-NN computation. See class docstring for details.
- fit_transform(X, y=None)[source]
Fit the DiRe model and transform data to low-dimensional embedding.
This method performs the complete dimensionality reduction pipeline:
1. Computes k-nearest neighbors graph
2. Fits kernel parameters
3. Initializes embedding with PCA or random projection
4. Optimizes layout using force-directed algorithm
- Parameters:
X (array-like of shape (n_samples, n_features)) – High-dimensional input data to transform.
y (array-like of shape (n_samples,), optional) – Ignored. Present for scikit-learn API compatibility.
- Returns:
Low-dimensional embedding of the input data.
- Return type:
numpy.ndarray of shape (n_samples, n_components)
Examples
Transform high-dimensional data:
import numpy as np
from dire_rapids import DiRePyTorch

X = np.random.randn(1000, 100)
reducer = DiRePyTorch(n_neighbors=16)
embedding = reducer.fit_transform(X)
print(embedding.shape)  # (1000, 2)
- fit(X: ndarray, y=None)[source]
Fit the DiRe model to data without transforming.
This method fits the model by computing the k-NN graph, kernel parameters, and optimized embedding, but it exists primarily for scikit-learn compatibility. For practical use, fit_transform() is recommended.
- Parameters:
X (numpy.ndarray of shape (n_samples, n_features)) – High-dimensional data to fit the model to.
y (array-like of shape (n_samples,), optional) – Ignored. Present for scikit-learn API compatibility.
- Returns:
self – The fitted DiRe instance.
- Return type:
DiRePyTorch
Notes
This method calls fit_transform() internally. The embedding result is stored in self._layout and can be accessed after fitting.
- visualize(labels=None, point_size=2, title=None, max_points=10000, **kwargs)[source]
Create an interactive visualization of the embedding.
Uses WebGL rendering (Scattergl) for performance and automatically subsamples to max_points if the dataset is larger.
- Parameters:
labels (array-like of shape (n_samples,), optional) – Labels for coloring points in the visualization.
point_size (int, default=2) – Size of points in the scatter plot.
title (str, optional) – Title for the plot. If None, a default title is generated.
max_points (int, default=10000) – Maximum number of points to display. Subsamples if larger.
**kwargs (dict) – Additional keyword arguments passed to plotly.express plotting functions.
- Returns:
Interactive Plotly figure, or None if no embedding is available.
- Return type:
plotly.graph_objects.Figure or None
Examples
Basic visualization:
fig = reducer.visualize()
fig.show()
With labels and custom styling:
fig = reducer.visualize(
    labels=y,
    point_size=3,
    title="My Embedding",
    max_points=20000,
    width=800,
    height=600
)
fig.show()
Notes
Requires a fitted model with available embedding (self._layout). Only supports 2D and 3D visualizations.
- class dire_rapids.DiRePyTorchMemoryEfficient(*args, use_fp16=True, use_pykeops_repulsion=True, pykeops_threshold=50000, memory_fraction=0.25, **kwargs)[source]
Bases:
DiRePyTorch
Memory-optimized PyTorch implementation of DiRe for large-scale datasets.
This class extends DiRePyTorch with enhanced memory management capabilities, making it suitable for processing very large datasets that would otherwise cause out-of-memory errors with the standard implementation.
Key Improvements over DiRePyTorch
FP16 Support: Uses half-precision by default for 2x memory reduction
Dynamic Chunking: Automatically adjusts chunk sizes based on available memory (sketched after the use cases below)
Aggressive Cleanup: More frequent garbage collection and cache clearing
PyKeOps Integration: Optional LazyTensors for memory-efficient exact repulsion
Memory Monitoring: Real-time memory usage tracking and warnings
Point-wise Processing: Falls back to point-by-point computation when needed
Best Use Cases
Datasets with >100K points
High-dimensional data (>500 features)
Memory-constrained environments
Production systems requiring reliable memory usage
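How memory_fraction might translate into a chunk size can be sketched as follows; pick_chunk_size and its formula are hypothetical simplifications under stated assumptions, not the library's actual heuristic:
import torch

def pick_chunk_size(n_points, bytes_per_el=2, memory_fraction=0.25):
    # bytes_per_el=2 assumes FP16 tensors; use 4 for FP32.
    if torch.cuda.is_available():
        free_bytes, _total = torch.cuda.mem_get_info()
    else:
        free_bytes = 8 * 1024**3  # assume 8 GiB of headroom on CPU
    budget = free_bytes * memory_fraction
    # A (chunk, n_points) pairwise-distance block dominates memory use;
    # solve budget >= chunk * n_points * bytes_per_el for chunk.
    chunk = int(budget // (n_points * bytes_per_el))
    return max(1, min(chunk, n_points))

print(pick_chunk_size(n_points=500_000))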
- param *args:
Positional arguments passed to DiRePyTorch parent class.
- param use_fp16:
Enable FP16 precision for memory efficiency (recommended). Provides 2x memory reduction and significant speed improvements.
- type use_fp16:
bool, default=True
- param use_pykeops_repulsion:
Use PyKeOps LazyTensors for repulsion when beneficial. Automatically disabled if PyKeOps unavailable or dataset too large.
- type use_pykeops_repulsion:
bool, default=True
- param pykeops_threshold:
Maximum dataset size for PyKeOps all-pairs computation. Above this threshold, random sampling is used instead.
- type pykeops_threshold:
int, default=50000
- param memory_fraction:
Fraction of available memory to use for computations. Lower values are more conservative but may be slower.
- type memory_fraction:
float, default=0.25
- param **kwargs:
Additional keyword arguments passed to DiRePyTorch parent class. Includes: n_components, n_neighbors, init, max_iter_layout, min_dist, spread, cutoff, neg_ratio, verbose, random_state, use_exact_repulsion, metric (custom distance function for k-NN computation).
Examples
Memory-efficient processing of large dataset:
from dire_rapids import DiRePyTorchMemoryEfficient
import numpy as np

# Large dataset
X = np.random.randn(500000, 512)

# Memory-efficient reducer
reducer = DiRePyTorchMemoryEfficient(
    use_fp16=True,
    memory_fraction=0.3,
    verbose=True
)
embedding = reducer.fit_transform(X)
Custom memory settings:
reducer = DiRePyTorchMemoryEfficient(
    use_pykeops_repulsion=False,  # Disable PyKeOps
    memory_fraction=0.15,         # Use less memory
    pykeops_threshold=20000       # Lower PyKeOps threshold
)
With custom distance metric:
# L1 metric for k-NN with memory efficiency
reducer = DiRePyTorchMemoryEfficient(
    metric='(x - y).abs().sum(-1)',
    n_neighbors=32,
    use_fp16=True,
    memory_fraction=0.2
)
embedding = reducer.fit_transform(X)
- __init__(*args, use_fp16=True, use_pykeops_repulsion=True, pykeops_threshold=50000, memory_fraction=0.25, **kwargs)[source]
Initialize memory-efficient DiRe reducer.
- Parameters:
*args – Positional arguments passed to DiRePyTorch parent class.
use_fp16 (bool, default=True) – Enable FP16 precision for memory efficiency. Provides 2x memory reduction and significant speed improvements on modern GPUs.
use_pykeops_repulsion (bool, default=True) – Use PyKeOps LazyTensors for memory-efficient repulsion computation when dataset size is below pykeops_threshold.
pykeops_threshold (int, default=50000) – Maximum dataset size for PyKeOps all-pairs computation. Above this threshold, random sampling is used instead.
memory_fraction (float, default=0.25) – Fraction of available memory to use for computations. Lower values are more conservative but may be slower.
**kwargs –
Additional keyword arguments passed to DiRePyTorch parent class. See DiRePyTorch documentation for available parameters including:
n_components, n_neighbors, init, max_iter_layout, min_dist, spread
cutoff, neg_ratio, verbose, random_state, use_exact_repulsion
metric: Custom distance metric for k-NN (str, callable, or None)
- fit_transform(X, y=None)[source]
Fit the model and transform data with memory-efficient processing.
This method extends the parent implementation with memory-optimized data handling, enhanced logging, and aggressive cleanup procedures.
- Parameters:
X (array-like of shape (n_samples, n_features)) – High-dimensional input data to transform.
y (array-like of shape (n_samples,), optional) – Ignored. Present for scikit-learn API compatibility.
- Returns:
Low-dimensional embedding of the input data.
- Return type:
numpy.ndarray of shape (n_samples, n_components)
Notes
Memory Optimizations:
Automatic FP16 conversion for large datasets on GPU
Strategic backend selection based on dataset characteristics
Aggressive memory cleanup after processing
Real-time memory monitoring and reporting
Examples
Process large dataset with memory monitoring:
import numpy as np
from dire_rapids import DiRePyTorchMemoryEfficient

# Large high-dimensional dataset
X = np.random.randn(200000, 1000)

reducer = DiRePyTorchMemoryEfficient(
    use_fp16=True,
    memory_fraction=0.3,
    verbose=True
)
embedding = reducer.fit_transform(X)
- dire_rapids.create_dire(backend='auto', memory_efficient=False, **kwargs)[source]
Create DiRe instance with automatic backend selection.
This factory function automatically selects the optimal DiRe implementation based on available hardware and software, or allows manual backend selection. It provides a convenient interface for creating DiRe instances without importing specific backend classes.
- Parameters:
backend ({'auto', 'cuvs', 'pytorch', 'pytorch_gpu', 'pytorch_cpu'}, default='auto') –
Backend selection strategy:
'auto': Automatically select best available backend based on hardware
'cuvs': Force RAPIDS cuVS backend (requires RAPIDS installation)
'pytorch': Force PyTorch backend with automatic device selection
'pytorch_gpu': Force PyTorch backend on GPU (requires CUDA)
'pytorch_cpu': Force PyTorch backend on CPU only
memory_efficient (bool, default=False) –
If True, use memory-efficient PyTorch implementation which provides:
Reduced memory usage for large datasets
FP16 support for additional memory savings
Enhanced chunking strategies
More aggressive memory cleanup
**kwargs (dict) – Additional keyword arguments passed to the DiRe constructor. See individual backend documentation for available parameters. Common parameters include: n_components, n_neighbors, metric, max_iter_layout, min_dist, spread, verbose, random_state.
- Returns:
An instance of the selected DiRe backend (DiRePyTorch, DiRePyTorchMemoryEfficient, or DiReCuVS) configured with the specified parameters.
- Return type:
DiRe instance
- Raises:
RuntimeError – If a specific backend is requested but requirements are not met (e.g., requesting cuVS without RAPIDS, or GPU without CUDA).
ValueError – If an unknown backend name is specified.
Examples
Auto-select optimal backend:
from dire_rapids import create_dire

# Will use cuVS if available, otherwise PyTorch with GPU if available
reducer = create_dire(n_neighbors=32, verbose=True)
embedding = reducer.fit_transform(X)
Force memory-efficient mode for large datasets:
reducer = create_dire(
    memory_efficient=True,
    n_neighbors=50,
    max_iter_layout=200
)
Force specific backend:
# CPU-only processing
reducer = create_dire(backend='pytorch_cpu')

# GPU processing with cuVS acceleration
reducer = create_dire(backend='cuvs', use_cuvs=True)

# With custom distance metric
reducer = create_dire(
    metric='(x - y).abs().sum(-1)',  # L1 distance
    n_neighbors=32,
    verbose=True
)
Notes
Backend Selection Priority (when backend='auto'):
1. RAPIDS cuVS (if available and CUDA GPU present)
2. PyTorch with CUDA (if CUDA GPU available)
3. PyTorch with CPU (fallback)
The function automatically handles import errors and missing dependencies, falling back to available alternatives when possible.
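A hedged sketch of the 'auto' priority described above; the import probes below are illustrative and may differ from the package's real capability checks:
import importlib.util
import torch

def pick_backend():
    # Probe for a CUDA GPU and for the RAPIDS cuVS Python package.
    has_cuda = torch.cuda.is_available()
    has_cuvs = importlib.util.find_spec("cuvs") is not None
    if has_cuvs and has_cuda:
        return "cuvs"
    if has_cuda:
        return "pytorch_gpu"
    return "pytorch_cpu"

print(pick_backend())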
- class dire_rapids.ReducerRunner(config: ReducerConfig)[source]
Bases:
object
General-purpose runner for dimensionality reduction algorithms.
Supports:
DiRe (create_dire, DiRePyTorch, DiRePyTorchMemoryEfficient, DiReCuVS)
cuML (UMAP, TSNE)
scikit-learn (any TransformerMixin-compatible class)
- Parameters:
config (ReducerConfig) – Configuration object containing reducer_class, reducer_kwargs, name, and visualize flag.
- config: ReducerConfig
- run(dataset, *, dataset_kwargs=None, transform=None)[source]
Run dimensionality reduction on specified dataset.
- Parameters:
dataset – Dataset to load and reduce (e.g., one of the scikit-learn loaders reported by available_sklearn()).
dataset_kwargs (dict, optional) – Keyword arguments forwarded to the dataset loader.
transform (callable, optional) – Optional transform applied to the loaded data before reduction.
- Returns:
Results containing:
embedding: reduced data
labels: data labels
reducer: fitted reducer instance
fit_time_sec: time taken for fit_transform
dataset_info: dataset metadata
- Return type:
dict
- static available_sklearn()[source]
Return available sklearn dataset loaders, fetchers, and generators.
- __init__(config: ReducerConfig) → None
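ReducerRunner has no Examples section above, so here is a hedged usage sketch; the 'load_digits' dataset string and dict-style access to the results are assumptions based on the run() description:
from dire_rapids import DiRePyTorch, ReducerConfig, ReducerRunner

# Configure which reducer to run and how.
config = ReducerConfig(
    name="dire-pytorch",
    reducer_class=DiRePyTorch,
    reducer_kwargs={"n_neighbors": 16},
)
runner = ReducerRunner(config)

# 'load_digits' is assumed to be among the loaders reported by
# ReducerRunner.available_sklearn(); substitute any supported dataset.
results = runner.run("load_digits")
print(results["fit_time_sec"], results["embedding"].shape)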
- class dire_rapids.ReducerConfig(name: str, reducer_class: type, reducer_kwargs: dict, visualize: bool = False, categorical_labels: bool = True, max_points: int = 10000)[source]
Bases:
object
Configuration for a dimensionality reduction algorithm.
- All fields are mutable and can be changed after creation:
config.visualize = True
config.categorical_labels = False
config.max_points = 20000
- class dire_rapids.DiReCuVS(*args, use_cuvs=None, use_cuml=None, cuvs_index_type='auto', cuvs_build_params=None, cuvs_search_params=None, **kwargs)[source]
Bases:
DiRePyTorch
RAPIDS cuVS/cuML accelerated implementation of DiRe for massive datasets.
This class extends DiRePyTorch with optional RAPIDS cuVS (CUDA Vector Search) integration for GPU-accelerated k-nearest neighbors computation and cuML integration for GPU-accelerated PCA initialization. It provides substantial performance improvements for large-scale datasets.
Performance Advantages over PyTorch/PyKeOps
10-100x faster k-NN: For large datasets (>100K points)
Massive scale support: Handles 10M+ points efficiently
High accuracy: Approximate k-NN with >95% recall
Multi-GPU ready: Supports extreme scale processing
GPU-accelerated PCA: cuML PCA/SVD for initialization
Automatic Fallback
Falls back to PyTorch backend if cuVS is not available, ensuring compatibility across different environments.
- param use_cuvs:
Whether to use cuVS for k-NN computation. If None, automatically detected based on availability and hardware.
- type use_cuvs:
bool or None, default=None
- param use_cuml:
Whether to use cuML for PCA initialization. If None, automatically detected based on availability and hardware.
- type use_cuml:
bool or None, default=None
- param cuvs_index_type:
Type of cuVS index to build:
'auto': Automatically select based on data characteristics
'ivf_flat': Inverted file index without compression
'ivf_pq': Inverted file index with product quantization
'cagra': Graph-based index for very large datasets
'flat': Brute-force exact search
- type cuvs_index_type:
{'auto', 'ivf_flat', 'ivf_pq', 'cagra', 'flat'}, default='auto'
- param cuvs_build_params:
Custom parameters for cuVS index building. Overrides defaults.
- type cuvs_build_params:
dict, optional
- param cuvs_search_params:
Custom parameters for cuVS search. Overrides defaults.
- type cuvs_search_params:
dict, optional
- param *args:
Positional arguments passed to DiRePyTorch parent class.
- param **kwargs:
Additional keyword arguments passed to DiRePyTorch parent class. Includes: n_components, n_neighbors, init, max_iter_layout, min_dist, spread, cutoff, neg_ratio, verbose, random_state, use_exact_repulsion, metric (custom distance function for k-NN computation).
Examples
Basic usage with automatic backend selection:
from dire_rapids import DiReCuVS
import numpy as np

# Large dataset
X = np.random.randn(100000, 512)

# Auto-detect cuVS/cuML availability
reducer = DiReCuVS()
embedding = reducer.fit_transform(X)
Force cuVS with custom index parameters:
reducer = DiReCuVS(
    use_cuvs=True,
    cuvs_index_type='ivf_pq',
    cuvs_build_params={'n_lists': 2048, 'pq_dim': 64}
)
Massive dataset processing:
# 10M points, 1000 dimensions
X = np.random.randn(10_000_000, 1000)

reducer = DiReCuVS(
    use_cuvs=True,
    use_cuml=True,
    cuvs_index_type='cagra',  # Best for very large datasets
    n_neighbors=32
)
embedding = reducer.fit_transform(X)
With custom distance metric:
# cuVS with L1 distance for k-NN computation
reducer = DiReCuVS(
    use_cuvs=True,
    metric='(x - y).abs().sum(-1)',  # L1/Manhattan distance
    n_neighbors=32,
    cuvs_index_type='ivf_flat'
)
embedding = reducer.fit_transform(X)
Notes
Requirements:
RAPIDS cuVS: Follow the installation instructions at https://docs.rapids.ai/install/
CUDA-capable GPU with compute capability >= 6.0
Index Selection Guidelines:
< 50K points: 'flat' (exact search)
50K-500K points: 'ivf_flat'
500K-5M points: 'ivf_pq'
> 5M points: 'cagra' (if dimensions <= 500)
Memory Considerations:
cuVS requires float32 precision (no FP16 support)
Index building requires additional GPU memory
'cagra' uses more memory but provides best performance for huge datasets
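The index selection guidelines above amount to a simple decision rule; the sketch below encodes them with a hypothetical helper (choose_index_type is not a library function):
def choose_index_type(n_points, n_dims):
    # Thresholds taken directly from the guidelines above.
    if n_points < 50_000:
        return 'flat'       # exact search
    if n_points < 500_000:
        return 'ivf_flat'
    if n_points < 5_000_000:
        return 'ivf_pq'
    # 'cagra' is recommended above 5M points when dimensionality is modest.
    return 'cagra' if n_dims <= 500 else 'ivf_pq'

print(choose_index_type(1_000_000, 256))  # ivf_pq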
- __init__(*args, use_cuvs=None, use_cuml=None, cuvs_index_type='auto', cuvs_build_params=None, cuvs_search_params=None, **kwargs)[source]
Initialize DiReCuVS with cuVS and cuML backend configuration.
- Parameters:
*args – Positional arguments passed to DiRePyTorch parent class.
use_cuvs (bool or None, default=None) – Whether to use cuVS for k-NN computation:
None: Auto-detect based on availability and GPU presence
True: Force cuVS usage (raises error if unavailable)
False: Disable cuVS, use PyTorch backend
use_cuml (bool or None, default=None) – Whether to use cuML for PCA initialization:
None: Auto-detect based on availability and GPU presence
True: Force cuML usage (raises error if unavailable)
False: Disable cuML, use sklearn backend
cuvs_index_type ({'auto', 'ivf_flat', 'ivf_pq', 'cagra', 'flat'}, default='auto') – Type of cuVS index to build:
'auto': Automatically select optimal index based on data size/dimensionality
'ivf_flat': Inverted file index without compression (good balance)
'ivf_pq': Inverted file with product quantization (memory efficient)
'cagra': Graph-based index (best for very large datasets)
'flat': Brute-force exact search (small datasets only)
cuvs_build_params (dict, optional) – Custom parameters for cuVS index building. These override the automatically determined parameters. See cuVS documentation for index-specific parameters.
cuvs_search_params (dict, optional) – Custom parameters for cuVS search operations. These override the automatically determined parameters. See cuVS documentation for index-specific search parameters.
**kwargs – Additional keyword arguments passed to DiRePyTorch parent class. See DiRePyTorch documentation for available parameters including: n_components, n_neighbors, init, max_iter_layout, min_dist, spread, cutoff, neg_ratio, verbose, random_state, use_exact_repulsion, metric (custom distance function for k-NN computation).
- Raises:
ImportError – If cuVS or cuML are requested but not available.
RuntimeError – If GPU is required but not available.
- fit_transform(X, y=None)[source]
Fit the model and transform data with cuVS/cuML acceleration.
This method extends the parent implementation with intelligent backend selection and logging to inform users about the acceleration being used.
- Parameters:
X (array-like of shape (n_samples, n_features)) – High-dimensional input data to transform.
y (array-like of shape (n_samples,), optional) – Ignored. Present for scikit-learn API compatibility.
- Returns:
Low-dimensional embedding of the input data.
- Return type:
numpy.ndarray of shape (n_samples, n_components)
Notes
Backend Selection Logic:
Uses cuVS for k-NN if dataset is large enough and cuVS is available
Uses cuML for PCA initialization if available and init='pca'
Falls back to PyTorch implementations automatically
Performance Benefits:
cuVS k-NN: 10-100x speedup for large datasets
cuML PCA: 5-50x speedup for high-dimensional initialization
Examples
Large dataset with cuVS acceleration:
import numpy as np
from dire_rapids import DiReCuVS

# 500K points, 1000 dimensions
X = np.random.randn(500000, 1000)

reducer = DiReCuVS(verbose=True)  # Will log backend selection
embedding = reducer.fit_transform(X)
# Output: "Using cuVS-accelerated backend for 500000 points"
Submodules
- dire_rapids.dire_pytorch module
- dire_rapids.dire_pytorch_memory_efficient module
- dire_rapids.dire_cuvs module
- dire_rapids.metrics module
get_available_persistence_backends()
set_persistence_backend()
get_persistence_backend()
welford_update_gpu()
welford_finalize_gpu()
welford_gpu()
threshold_subsample_gpu()
make_knn_graph_gpu()
make_knn_graph_cpu()
compute_stress()
compute_neighbor_score()
compute_local_metrics()
compute_svm_accuracy()
compute_knn_accuracy()
compute_svm_score()
compute_knn_score()
compute_context_measures()
compute_h0_h1_knn()
compute_persistence_diagrams_fast()
compute_persistence_diagrams()
betti_curve()
compute_dtw()
compute_twed()
compute_emd()
compute_wasserstein()
compute_bottleneck()
compute_global_metrics()
evaluate_embedding()
- dire_rapids.atlas_cpu module
- dire_rapids.atlas_gpu module