dire_rapids.dire_pytorch module

PyTorch/PyKeOps backend for DiRe dimensionality reduction.

This implementation features:

- Memory-efficient chunked k-NN computation for large datasets (>100K points)
- Attraction forces applied only between k-NN neighbors
- Repulsion forces computed from random samples
- Automatic GPU memory management with adaptive chunk sizing
- Designed for high-performance processing on CUDA GPUs

Performance characteristics:

- Best for datasets >50K points on CUDA GPUs
- Memory-aware processing up to millions of points
- Chunked computation prevents GPU out-of-memory errors
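The idea behind chunked k-NN can be illustrated with a small brute-force sketch. This is not the library's implementation: it uses NumPy rather than PyTorch/PyKeOps, and the function name chunked_knn, the chunk_size argument, and the choice to exclude each point from its own neighbor list are all assumptions made for illustration. The point is only that the full n x n distance matrix never has to exist at once.

```python
import numpy as np

def chunked_knn(X, k, chunk_size=1024):
    """Brute-force Euclidean k-NN, one block of query rows at a time (illustrative only)."""
    X = np.asarray(X, dtype=np.float64)
    n = X.shape[0]
    sq_norms = (X ** 2).sum(axis=1)              # ||x||^2 per point, reused by every chunk
    indices = np.empty((n, k), dtype=np.int64)
    distances = np.empty((n, k), dtype=np.float64)

    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # Squared distances from this chunk to all points: shape (stop - start, n)
        d2 = sq_norms[start:stop, None] - 2.0 * X[start:stop] @ X.T + sq_norms[None, :]
        np.maximum(d2, 0.0, out=d2)              # clamp small negatives caused by round-off
        # Assumption of this sketch: a point is not its own neighbor
        d2[np.arange(stop - start), np.arange(start, stop)] = np.inf

        # Select the k smallest entries per row, then sort those k by distance
        part = np.argpartition(d2, k - 1, axis=1)[:, :k]
        part_d = np.take_along_axis(d2, part, axis=1)
        order = np.argsort(part_d, axis=1)
        indices[start:stop] = np.take_along_axis(part, order, axis=1)
        distances[start:stop] = np.sqrt(np.take_along_axis(part_d, order, axis=1))

    return indices, distances
```

Peak memory for the distance block is chunk_size * n values instead of n * n, which is what makes the chunked form usable on large inputs.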

class dire_rapids.dire_pytorch.DiRePyTorch(n_components=2, n_neighbors=16, init='pca', max_iter_layout=128, min_dist=0.01, spread=1.0, cutoff=42.0, n_sample_dirs=8, sample_size=16, neg_ratio=8, verbose=True, random_state=None, use_exact_repulsion=False, metric=None)

Bases: TransformerMixin

Memory-efficient PyTorch/PyKeOps implementation of DiRe dimensionality reduction.

This class provides a high-performance implementation of the DiRe algorithm using PyTorch as the computational backend. It features adaptive memory management for large datasets and automatic GPU optimization.
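As a rough illustration of what adaptive chunk sizing involves, the sketch below derives a chunk size from a memory budget. The helper rows_per_chunk, its safety factor, and its defaults are hypothetical and not taken from the library; in a PyTorch setting the free_bytes value could come from torch.cuda.mem_get_info(), which returns free and total device memory in bytes.

```python
def rows_per_chunk(n_points, free_bytes, bytes_per_value=4, safety=0.5, min_rows=1):
    """Rows of a (rows x n_points) distance block that fit in a memory budget.

    Hypothetical helper: the safety factor and defaults are illustrative assumptions.
    """
    budget = int(free_bytes * safety)            # leave headroom for other allocations
    bytes_per_row = n_points * bytes_per_value   # one row stores a distance to every point
    rows = budget // bytes_per_row
    return max(min_rows, min(rows, n_points))    # never exceed the dataset, never go below min_rows


# One million points with 8 GiB free: FP32 vs FP16 storage
fp32_rows = rows_per_chunk(1_000_000, 8 * 1024**3)                      # 1073
fp16_rows = rows_per_chunk(1_000_000, 8 * 1024**3, bytes_per_value=2)   # 2147
```

Halving the bytes per value doubles the rows that fit in each chunk, which is one reason reduced precision helps on large datasets.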

Features

Chunked k-NN computation prevents GPU out-of-memory errors
Memory-aware force computation with automatic chunk sizing
Attraction forces between k-NN neighbors only
Repulsion forces from random sampling for efficiency
Automatic FP16 optimization for memory and speed
Optional PyKeOps integration for low-dimensional data
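To show why sampled repulsion is cheaper than all-pairs repulsion, here is a minimal NumPy sketch. It assumes that the number of repulsion targets per point equals n_neighbors * neg_ratio, which is one plausible reading of the neg_ratio parameter and not a confirmed detail of the library. The function name sample_repulsion_targets is hypothetical.

```python
import numpy as np

def sample_repulsion_targets(n_points, n_neighbors=16, neg_ratio=8, seed=None):
    """Draw random repulsion targets for every point (illustrative assumption).

    Returns an integer array of shape (n_points, n_neighbors * neg_ratio).
    A point may occasionally draw itself; a real implementation may handle that differently.
    """
    rng = np.random.default_rng(seed)
    n_neg = n_neighbors * neg_ratio
    return rng.integers(0, n_points, size=(n_points, n_neg))


targets = sample_repulsion_targets(n_points=1000, seed=0)

# Pair counts for 100,000 points: sampled vs. exact all-pairs repulsion
n = 100_000
sampled_pairs = n * 16 * 8      # 12,800,000
exact_pairs = n * (n - 1)       # 9,999,900,000
```

Under these assumptions the sampled variant touches roughly 0.1% of the pairs that exact repulsion would, which is consistent with use_exact_repulsion being described as memory intensive.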

Best suited for

Large datasets (>50K points) on CUDA GPUs
Production environments requiring reliable memory usage
High-performance dimensionality reduction workflows

Parameters:

n_components (int, default=2) – Number of dimensions in the target embedding space.
n_neighbors (int, default=16) – Number of nearest neighbors to use for attraction forces.
init ({'pca', 'random'}, default='pca') – Method for initializing the embedding. 'pca' uses PCA initialization, 'random' uses random projection.
max_iter_layout (int, default=128) – Maximum number of optimization iterations.
min_dist (float, default=1e-2) – Minimum distance between points in the embedding.
spread (float, default=1.0) – Controls how tightly points are packed in the embedding.
cutoff (float, default=42.0) – Distance cutoff for repulsion forces.
n_sample_dirs (int, default=8) – Number of sampling directions (used by derived classes).
sample_size (int, default=16) – Size of samples for force computation (used by derived classes).
neg_ratio (int, default=8) – Ratio of negative samples to positive samples for repulsion.
verbose (bool, default=True) – Whether to print progress information.
random_state (int or None, default=None) – Random seed for reproducible results.
use_exact_repulsion (bool, default=False) – If True, use exact all-pairs repulsion (memory intensive, for testing only).

metric (str, callable, or None, default=None) – Custom distance metric for k-NN computation only (layout forces remain Euclidean):
- None or 'euclidean'/'l2': Use the fast built-in Euclidean distance
- str: Expression evaluated with x and y tensors (e.g., '(x - y).abs().sum(-1)' for L1)
- callable: Custom function taking (x, y) tensors and returning a distance matrix
Examples: '(x - y).abs().sum(-1)' (L1), '1 - (x*y).sum(-1)/(x.norm(dim=-1)*y.norm(dim=-1) + 1e-8)' (cosine). The norms are taken over the last (feature) axis so that they match the shape produced by sum(-1).
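The following sketch shows how metric expressions of this form produce a distance matrix. It assumes that x and y arrive in broadcastable shapes, here (n, 1, d) and (1, m, d); the documentation does not state the exact shapes the library passes, so treat that layout as an assumption. The function names are illustrative.

```python
import torch

def l1_metric(x, y):
    # Sum of absolute coordinate differences over the feature axis
    return (x - y).abs().sum(-1)

def cosine_metric(x, y):
    # Norms are reduced over the feature axis so they broadcast like the dot product
    dot = (x * y).sum(-1)
    return 1 - dot / (x.norm(dim=-1) * y.norm(dim=-1) + 1e-8)

A = torch.tensor([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
B = torch.tensor([[1.0, 0.0], [0.0, 1.0]])

# Assumed layout: add singleton axes so every row of A meets every row of B
x = A[:, None, :]   # shape (3, 1, 2)
y = B[None, :, :]   # shape (1, 2, 2)

l1 = l1_metric(x, y)        # shape (3, 2)
cos = cosine_metric(x, y)   # shape (3, 2)
```

With these inputs, l1 is [[0, 2], [3, 1], [5, 5]]. Both functions reduce only the last axis, so the result has one entry per (row of A, row of B) pair.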

device

The PyTorch device being used (CPU or CUDA).

Type: torch.device

logger

Instance-specific logger for this reducer.

Type: loguru.Logger

Examples

Basic usage:

from dire_rapids import DiRePyTorch
import numpy as np

# Create sample data
X = np.random.randn(10000, 100)

# Create and fit reducer
reducer = DiRePyTorch(n_neighbors=32, verbose=True)
embedding = reducer.fit_transform(X)

# Visualize results
fig = reducer.visualize()
fig.show()

With custom parameters:

reducer = DiRePyTorch(
    n_components=3,
    n_neighbors=50,
    max_iter_layout=200,
    min_dist=0.1,
    random_state=42
)
embedding = reducer.fit_transform(X)

With custom distance metric:

# Using L1 (Manhattan) distance for k-NN
reducer = DiRePyTorch(
    metric='(x - y).abs().sum(-1)',
    n_neighbors=32
)
embedding = reducer.fit_transform(X)

# Using a custom callable metric (cosine distance)
def cosine_distance(x, y):
    # Reduce both norms over the feature axis so their shape matches (x * y).sum(-1)
    return 1 - (x * y).sum(-1) / (x.norm(dim=-1) * y.norm(dim=-1) + 1e-8)

reducer = DiRePyTorch(metric=cosine_distance)
embedding = reducer.fit_transform(X)

__init__(n_components=2, n_neighbors=16, init='pca', max_iter_layout=128, min_dist=0.01, spread=1.0, cutoff=42.0, n_sample_dirs=8, sample_size=16, neg_ratio=8, verbose=True, random_state=None, use_exact_repulsion=False, metric=None)

Initialize DiRePyTorch reducer with specified parameters.

Parameters:

n_components (int, default=2) – Number of dimensions in the target embedding space.
n_neighbors (int, default=16) – Number of nearest neighbors to use for attraction forces.
init ({'pca', 'random'}, default='pca') – Method for initializing the embedding.
max_iter_layout (int, default=128) – Maximum number of optimization iterations.
min_dist (float, default=1e-2) – Minimum distance between points in the embedding.
spread (float, default=1.0) – Controls how tightly points are packed in the embedding.
cutoff (float, default=42.0) – Distance cutoff for repulsion forces.
n_sample_dirs (int, default=8) – Number of sampling directions (reserved for future use).
sample_size (int, default=16) – Size of samples for force computation (reserved for future use).
neg_ratio (int, default=8) – Ratio of negative samples to positive samples for repulsion.
verbose (bool, default=True) – Whether to print progress information.
random_state (int or None, default=None) – Random seed for reproducible results.
use_exact_repulsion (bool, default=False) – If True, use exact all-pairs repulsion (memory intensive, testing only).
metric (str, callable, or None, default=None) – Custom distance metric for k-NN computation. See class docstring for details.

fit_transform(X, y=None)

Fit the DiRe model and transform data to low-dimensional embedding.

This method performs the complete dimensionality reduction pipeline:

1. Computes k-nearest neighbors graph
2. Fits kernel parameters
3. Initializes embedding with PCA or random projection
4. Optimizes layout using force-directed algorithm
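The initialization step can be sketched in NumPy as follows. This is a generic illustration of PCA and Gaussian random-projection initialization, not the library's code; the function names pca_init and random_projection_init and the scaling of the random matrix are assumptions.

```python
import numpy as np

def pca_init(X, n_components=2):
    """Project centered data onto its leading principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def random_projection_init(X, n_components=2, seed=None):
    """Project data through a random Gaussian matrix (scaling is an illustrative choice)."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], n_components)) / np.sqrt(n_components)
    return X @ R


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
Y_pca = pca_init(X)                           # shape (200, 2)
Y_rand = random_projection_init(X, seed=0)    # shape (200, 2)
```

PCA gives a deterministic starting layout that already reflects the dominant directions of variance, while a random projection is cheaper and depends on the seed.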

Parameters:

X (array-like of shape (n_samples, n_features)) – High-dimensional input data to transform.
y (array-like of shape (n_samples,), optional) – Ignored. Present for scikit-learn API compatibility.

Returns:

Low-dimensional embedding of the input data.

Return type:

numpy.ndarray of shape (n_samples, n_components)

Examples

Transform high-dimensional data:

import numpy as np
from dire_rapids import DiRePyTorch

X = np.random.randn(1000, 100)
reducer = DiRePyTorch(n_neighbors=16)
embedding = reducer.fit_transform(X)
print(embedding.shape)  # (1000, 2)

fit(X: ndarray, y=None)

Fit the DiRe model to data without transforming.

This method computes the k-NN graph, kernel parameters, and optimized embedding, but exists mainly for scikit-learn compatibility. In practice, prefer fit_transform(), which returns the embedding directly.

Parameters:

X (numpy.ndarray of shape (n_samples, n_features)) – High-dimensional data to fit the model to.
y (array-like of shape (n_samples,), optional) – Ignored. Present for scikit-learn API compatibility.

Returns:

self – The fitted DiRe instance.

Return type:

DiRePyTorch

Notes

This method calls fit_transform() internally. The embedding result is stored in self._layout and can be accessed after fitting.
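The relationship between fit() and fit_transform() described in this note follows a common scikit-learn pattern, sketched below with a hypothetical stand-in class. ToyReducer is not part of dire_rapids, and its "embedding" is a placeholder that simply keeps the first columns of the input.

```python
import numpy as np

class ToyReducer:
    """Hypothetical stand-in that mirrors the fit / fit_transform relationship."""

    def __init__(self, n_components=2):
        self.n_components = n_components
        self._layout = None   # filled in by fit_transform

    def fit_transform(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Placeholder "embedding": keep the first n_components columns
        self._layout = X[:, : self.n_components].copy()
        return self._layout

    def fit(self, X, y=None):
        self.fit_transform(X, y)   # do the full work, ignore the returned array
        return self                # scikit-learn convention: fit returns the estimator


reducer = ToyReducer().fit(np.ones((5, 4)))
embedding = reducer._layout        # shape (5, 2), available after fitting
```

Returning self from fit() is what allows chaining and use inside scikit-learn pipelines.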

visualize(labels=None, point_size=2, title=None, max_points=10000, **kwargs)

Create an interactive visualization of the embedding.

Uses WebGL rendering (Scattergl) for performance and automatically subsamples to max_points if the dataset is larger.

Parameters:

labels (array-like of shape (n_samples,), optional) – Labels for coloring points in the visualization.
point_size (int, default=2) – Size of points in the scatter plot.
title (str, optional) – Title for the plot. If None, a default title is generated.
max_points (int, default=10000) – Maximum number of points to display. Subsamples if larger.
**kwargs (dict) – Additional keyword arguments passed to plotly.express plotting functions.

Returns:

Interactive Plotly figure, or None if no embedding is available.

Return type:

plotly.graph_objects.Figure or None

Examples

Basic visualization:

fig = reducer.visualize()
fig.show()

With labels and custom styling:

fig = reducer.visualize(
    labels=y,
    point_size=3,
    title="My Embedding",
    max_points=20000,
    width=800,
    height=600
)
fig.show()

Notes

Requires a fitted model with available embedding (self._layout). Only supports 2D and 3D visualizations.
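The max_points behavior can be pictured with the sketch below. It assumes uniform random subsampling without replacement; the documentation does not say which sampling strategy visualize() uses, and the helper subsample_for_plot is hypothetical.

```python
import numpy as np

def subsample_for_plot(embedding, labels=None, max_points=10000, seed=0):
    """Return at most max_points rows, keeping labels aligned with the kept rows."""
    n = embedding.shape[0]
    if n <= max_points:
        return embedding, labels          # small enough: plot everything
    rng = np.random.default_rng(seed)
    # Uniform sampling without replacement (an assumption of this sketch)
    keep = np.sort(rng.choice(n, size=max_points, replace=False))
    sub_labels = None if labels is None else np.asarray(labels)[keep]
    return embedding[keep], sub_labels


emb = np.zeros((25000, 2))
lab = np.arange(25000)
sub_emb, sub_lab = subsample_for_plot(emb, lab, max_points=10000)   # 10000 rows each
```

Indexing the embedding and the labels with the same index array keeps colors attached to the right points.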

dire_rapids.dire_pytorch.create_dire(backend='auto', memory_efficient=False, **kwargs)

Create DiRe instance with automatic backend selection.

This factory function automatically selects the optimal DiRe implementation based on available hardware and software, or allows manual backend selection. It provides a convenient interface for creating DiRe instances without importing specific backend classes.

Parameters:

backend ({'auto', 'cuvs', 'pytorch', 'pytorch_gpu', 'pytorch_cpu'}, default='auto') –
Backend selection strategy:
- 'auto': Automatically select the best available backend based on hardware
- 'cuvs': Force RAPIDS cuVS backend (requires RAPIDS installation)
- 'pytorch': Force PyTorch backend with automatic device selection
- 'pytorch_gpu': Force PyTorch backend on GPU (requires CUDA)
- 'pytorch_cpu': Force PyTorch backend on CPU only
memory_efficient (bool, default=False) –
If True, use memory-efficient PyTorch implementation which provides:
- Reduced memory usage for large datasets
- FP16 support for additional memory savings
- Enhanced chunking strategies
- More aggressive memory cleanup
**kwargs (dict) – Additional keyword arguments passed to the DiRe constructor. See individual backend documentation for available parameters. Common parameters include: n_components, n_neighbors, metric, max_iter_layout, min_dist, spread, verbose, random_state.

Returns:

An instance of the selected DiRe backend (DiRePyTorch, DiRePyTorchMemoryEfficient, or DiReCuVS) configured with the specified parameters.

Return type:

DiRe instance

Raises:

RuntimeError – If a specific backend is requested but requirements are not met (e.g., requesting cuVS without RAPIDS, or GPU without CUDA).
ValueError – If an unknown backend name is specified.

Examples

Auto-select optimal backend:

from dire_rapids import create_dire

# Will use cuVS if available, otherwise PyTorch with GPU if available
reducer = create_dire(n_neighbors=32, verbose=True)
embedding = reducer.fit_transform(X)

Force memory-efficient mode for large datasets:

reducer = create_dire(
    memory_efficient=True,
    n_neighbors=50,
    max_iter_layout=200
)

Force specific backend:

# CPU-only processing
reducer = create_dire(backend='pytorch_cpu')

# GPU processing with cuVS acceleration
reducer = create_dire(backend='cuvs', use_cuvs=True)

# With custom distance metric
reducer = create_dire(
    metric='(x - y).abs().sum(-1)',  # L1 distance
    n_neighbors=32,
    verbose=True
)

Notes

Backend selection priority (when backend='auto'):

1. RAPIDS cuVS (if available and a CUDA GPU is present)
2. PyTorch with CUDA (if a CUDA GPU is available)
3. PyTorch with CPU (fallback)

The function automatically handles import errors and missing dependencies, falling back to available alternatives when possible.
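The selection rules and error conditions documented for create_dire can be summarized in a small pure-Python sketch. The helper pick_backend and its has_cuvs and has_cuda flags are hypothetical and only mirror the behavior described on this page; in practice the CUDA flag could be obtained from torch.cuda.is_available().

```python
def pick_backend(backend="auto", has_cuvs=False, has_cuda=False):
    """Resolve a backend name to 'cuvs', 'pytorch_gpu' or 'pytorch_cpu' (hypothetical helper)."""
    known = {"auto", "cuvs", "pytorch", "pytorch_gpu", "pytorch_cpu"}
    if backend not in known:
        raise ValueError(f"Unknown backend: {backend!r}")

    if backend == "auto":
        # Priority: cuVS, then PyTorch on GPU, then PyTorch on CPU
        if has_cuvs and has_cuda:
            return "cuvs"
        return "pytorch_gpu" if has_cuda else "pytorch_cpu"

    if backend == "cuvs":
        if not (has_cuvs and has_cuda):
            raise RuntimeError("cuVS backend requires RAPIDS cuVS and a CUDA GPU")
        return "cuvs"

    if backend == "pytorch_gpu":
        if not has_cuda:
            raise RuntimeError("pytorch_gpu backend requires CUDA")
        return "pytorch_gpu"

    if backend == "pytorch_cpu":
        return "pytorch_cpu"

    # backend == "pytorch": automatic device selection
    return "pytorch_gpu" if has_cuda else "pytorch_cpu"
```

For example, pick_backend("auto", has_cuvs=False, has_cuda=True) resolves to 'pytorch_gpu', while pick_backend("cuvs") with no RAPIDS installation raises RuntimeError.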