dire_rapids.betti_curve module
Betti curve computation using filtered edge addition.
Computes Betti numbers (β₀, β₁) at multiple filtration thresholds by: 1. Building full atlas complex once with kNN graph 2. Recording edge and triangle filtration values 3. Progressively adding edges and triangles by threshold 4. Updating β₀ with union-find and rank(B₂) over GF(2) incrementally
Contains both CPU and GPU implementations with automatic backend selection.
- class dire_rapids.betti_curve.IncrementalF2Rank[source]
Bases:
objectIncremental sparse rank over GF(2), represented as Python integer bitsets.
- class dire_rapids.betti_curve.UnionFind(n)[source]
Bases:
objectDisjoint-set with path compression and union by rank.
- dire_rapids.betti_curve.compute_betti_curve_cpu(data, k_neighbors=20, density_threshold=0.8, overlap_factor=1.5, n_steps=50)[source]
CPU implementation of filtered Betti curve computation.
- Parameters:
- Returns:
dict ({) – ‘filtration_values’: array of filtration thresholds, ‘beta_0’: array of H0 Betti numbers, ‘beta_1’: array of H1 Betti numbers, ‘n_edges_active’: array of active edge counts, ‘n_triangles_active’: array of active triangle counts
}
- dire_rapids.betti_curve.compute_betti_curve_fast(data, k_neighbors=20, density_threshold=0.8, overlap_factor=1.5, n_steps=50)[source]
Fast Betti curve computation using an incremental atlas filtration.
- Adds each edge and triangle once in filtration order:
β₀ = connected components via union-find, O(E·α(E)) β₁ = E - rank(B1) - rank(B2) rank(B1) = V - β₀ rank(B2) is maintained incrementally over GF(2) using sparse bitsets
- Parameters:
- Returns:
dict
- Return type:
Same structure as compute_betti_curve_cpu
- dire_rapids.betti_curve.compute_betti_curve_gpu(data, k_neighbors=20, density_threshold=0.8, overlap_factor=1.5, n_steps=50)[source]
GPU implementation of filtered Betti curve computation.
Uses GPU kNN (cuVS/cuML) for graph construction, then the shared incremental atlas computation (union-find for β₀ and GF(2) bitset elimination for rank(B₂)). Atlas building runs on CPU because the set-heavy merge step is faster and simpler there.
- Parameters:
- Returns:
dict
- Return type:
Same structure as compute_betti_curve_cpu
- dire_rapids.betti_curve.compute_betti_curve_ripser(data, n_steps=50, maxdim=1, thresh=None)[source]
Betti curves via ripser’s persistent-homology reduction.
Much faster than the eigsh and dense-SVD paths at every scale: empirically ~4000× on MNIST n=3000 (eigsh path hangs for hours; ripser: ~6 s). The cost grows roughly O(N²) for maxdim=1 and O(N³) for maxdim=2, but the constants are small thanks to cohomology + apparent-pairs sparsification.
Persistent-homology view: ripser returns, for each dimension, a list of (birth, death) intervals. The Betti curve is β_d(t) = #{intervals with birth ≤ t < death} — “number of d-cycles alive at filtration t”.
- Parameters:
data (array-like of shape (n_samples, n_features)) – Point cloud.
n_steps (int, default=50) – Number of filtration samples at which to evaluate each Betti curve.
maxdim (int, default=1) – Highest homology dimension to compute. maxdim=1 returns β_0 and β_1; maxdim=2 additionally computes β_2 at O(N³) cost.
thresh (float or None, default=None) – Maximum filtration radius. None lets ripser run to its default (≈ the enclosing radius of the data). Cap this on large data if you only care about small-scale topology.
- Returns:
dict –
- ‘filtration_values’n_steps evenly spaced values from 0 to the
longest finite death time seen (plus a 5% margin).
’beta_0’, ‘beta_1’ : length-n_steps arrays. ‘beta_2’ : present only if maxdim >= 2. ‘n_edges_active’, ‘n_triangles_active’ : filled with NaN — ripser
does not expose these. Callers depending on the counts should use compute_betti_curve_fast instead.
- Return type:
same shape as the other compute_betti_curve_* backends.
Notes
Requires
ripser: install viapip install dire-rapids[persistent].
- dire_rapids.betti_curve.compute_betti_curve(data, k_neighbors=20, density_threshold=0.8, overlap_factor=1.5, n_steps=50, use_gpu=True, prefer_ripser=True)[source]
Compute filtered Betti curve (backend selector).
- Preference order (if available):
ripser (if
prefer_ripserandripseris installed) → GPU kNN + incremental atlas path (compute_betti_curve_gpu) ifuse_gpu- → fast CPU incremental atlas path (
compute_betti_curve_fast) as fallback.
- → fast CPU incremental atlas path (
Ripser is the preferred default when available; the atlas paths provide a dependency-light fallback for environments without ripser.
- Parameters:
data (array-like) – Point cloud data (n_samples, n_features).
k_neighbors (int) – Size of local neighborhood (ignored by the ripser path, which uses Vietoris-Rips filtration on the full distance matrix up to
thresh).density_threshold (float) – Percentile threshold for edge inclusion, 0-1 (ignored by ripser).
overlap_factor (float) – Factor for expanding local neighborhoods (ignored by ripser).
n_steps (int) – Number of filtration steps.
use_gpu (bool) – Whether to try the GPU kNN atlas path if ripser is not picked.
prefer_ripser (bool) – Prefer ripser when available. Set to False to skip ripser and use the atlas backends only (e.g. for correctness comparison).
- Returns:
dict ({) – ‘filtration_values’: array of filtration thresholds, ‘beta_0’: array of H0 Betti numbers, ‘beta_1’: array of H1 Betti numbers, ‘n_edges_active’: array of active edge counts (NaN under ripser), ‘n_triangles_active’: array of active triangle counts (NaN under ripser)
}