Quick Start Guide
This guide will get you up and running with GraphEm in just a few minutes.
Installation
Install GraphEm using pip:
pip install graphem-jax
For GPU/TPU acceleration (optional but recommended for large graphs), see the JAX installation guide.
Your First Graph Embedding
Let’s start with a simple example of embedding a random graph:
import graphem as ge
import numpy as np
# Generate a random graph
edges = ge.erdos_renyi_graph(n=200, p=0.05)
# Create an embedder
embedder = ge.GraphEmbedder(
    edges=edges,
    n_vertices=200,
    dimension=3,   # 3D embedding
    L_min=10.0,    # Minimum edge length
    k_attr=0.5,    # Attraction force
    k_inter=0.1,   # Repulsion force
    knn_k=15       # Nearest neighbors
)
# Compute the embedding
embedder.run_layout(num_iterations=40)
# Visualize the result
embedder.display_layout(edge_width=0.5, node_size=5)
Understanding the Parameters
dimension: Embedding space dimension (2D or 3D)
L_min: Controls minimum distance between connected nodes
k_attr: Strength of attractive forces between connected nodes
k_inter: Strength of repulsive forces between all nodes
knn_k: Number of nearest neighbors for efficient force computation
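To build intuition for how these parameters interact, here is a minimal sketch of one force-directed update step in plain Python. It is not GraphEm's implementation: the spring-like attraction toward a rest length `L_min`, the all-pairs repulsion that weakens with distance, and the names `layout_step` and `step_size` are assumptions made purely for illustration.

```python
import math

def layout_step(pos, edges, L_min=1.0, k_attr=0.5, k_inter=0.1, step_size=0.1):
    """One illustrative update: springs along edges, repulsion between all pairs."""
    n, dim = len(pos), len(pos[0])
    force = [[0.0] * dim for _ in range(n)]

    # Attraction (assumed model): each edge acts as a spring with rest length L_min.
    for u, v in edges:
        delta = [pos[v][d] - pos[u][d] for d in range(dim)]
        dist = math.sqrt(sum(x * x for x in delta)) or 1e-9
        pull = k_attr * (dist - L_min) / dist
        for d in range(dim):
            force[u][d] += pull * delta[d]
            force[v][d] -= pull * delta[d]

    # Repulsion (assumed model): every pair pushes apart, more weakly at long range.
    # Looping over all pairs costs O(n^2); limiting the loop to nearby nodes
    # is the kind of saving a nearest-neighbor parameter such as knn_k suggests.
    for i in range(n):
        for j in range(i + 1, n):
            delta = [pos[j][d] - pos[i][d] for d in range(dim)]
            dist_sq = sum(x * x for x in delta) or 1e-9
            push = k_inter / dist_sq
            for d in range(dim):
                force[i][d] -= push * delta[d]
                force[j][d] += push * delta[d]

    # Move every node a small step along its net force.
    return [[pos[i][d] + step_size * force[i][d] for d in range(dim)]
            for i in range(n)]
```

In this toy model, raising `k_attr` pulls connected nodes toward distance `L_min` more strongly, while raising `k_inter` spreads all nodes further apart.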
Graph Generation
GraphEm provides various graph generators:
# Scale-free network (Barabási–Albert)
edges = ge.generate_ba(n=500, m=3)
# Small-world network (Watts–Strogatz)
edges = ge.generate_ws(n=500, k=6, p=0.1)
# Stochastic block model
edges = ge.generate_sbm(n_per_block=100, num_blocks=3, p_in=0.1, p_out=0.01)
# Random regular graph
edges = ge.generate_random_regular(n=300, d=4)
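Before embedding a generated graph, it can be useful to sanity-check the edge list. The sketch below computes a few basic statistics. It assumes `edges` can be iterated as `(u, v)` integer pairs with vertices numbered from 0, which this guide does not state explicitly; `edge_list_summary` is a hypothetical helper, not part of GraphEm.

```python
from collections import Counter

def edge_list_summary(edges, n_vertices):
    """Return basic statistics for an undirected edge list."""
    degree = Counter()
    seen = set()
    self_loops = 0
    for u, v in edges:
        u, v = int(u), int(v)
        if u == v:
            self_loops += 1
            continue
        key = (min(u, v), max(u, v))  # treat (u, v) and (v, u) as the same edge
        if key in seen:
            continue  # skip duplicate edges
        seen.add(key)
        degree[u] += 1
        degree[v] += 1
    return {
        "n_edges": len(seen),
        "self_loops": self_loops,
        "isolated": n_vertices - len(degree),
        "max_degree": max(degree.values(), default=0),
        "mean_degree": 2 * len(seen) / n_vertices if n_vertices else 0.0,
    }
```

For example, `edge_list_summary(edges, n_vertices=300)` on the random regular graph above would be expected to report a maximum degree of 4 if the generator behaves as its name suggests.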
Working with Real Data
Load and analyze real-world networks:
# Load a dataset (GraphEm includes several network datasets)
vertices, edges = ge.load_dataset('snap-ca-GrQc') # Collaboration network
n_vertices = len(vertices)
# Create embedder for larger networks
embedder = ge.GraphEmbedder(
    edges=edges,
    n_vertices=n_vertices,
    dimension=2,
    knn_k=20,          # More neighbors for denser graphs
    sample_size=512,   # Larger sample for accuracy
    batch_size=2048    # Larger batches for efficiency
)
embedder.run_layout(num_iterations=100)
embedder.display_layout()
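If you bring your own edge list instead of a bundled dataset, the vertex identifiers may be arbitrary labels rather than integers from 0 to n-1. The following sketch relabels them. It assumes the embedder expects contiguous integer indices, which is suggested by the `n_vertices` argument but not confirmed in this guide; `relabel_edges` is a hypothetical helper.

```python
def relabel_edges(raw_edges):
    """Map arbitrary hashable vertex IDs to contiguous integers 0..n-1."""
    index = {}   # original ID -> new integer ID, in order of first appearance
    edges = []
    for u, v in raw_edges:
        for node in (u, v):
            if node not in index:
                index[node] = len(index)
        edges.append((index[u], index[v]))
    return edges, index
```

After relabeling, `len(index)` gives the vertex count to pass as `n_vertices`, and `index` lets you map embedding rows back to the original identifiers.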
Influence Maximization
Find the most influential nodes in a network:
import networkx as nx
# Convert to NetworkX for influence analysis
G = nx.Graph()
G.add_nodes_from(range(n_vertices))
G.add_edges_from(edges)
# Method 1: GraphEm-based selection (uses embedding)
seeds_graphem = ge.graphem_seed_selection(embedder, k=10, num_iterations=20)
# Method 2: Greedy selection (traditional approach)
seeds_greedy = ge.greedy_seed_selection(G, k=10)
# Estimate influence spread
influence, iterations = ge.ndlib_estimated_influence(
    G, seeds_graphem, p=0.1, iterations_count=200
)
print(f"GraphEm method: {influence} nodes influenced ({influence/n_vertices:.2%})")
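For intuition about what a spread estimate measures, the sketch below runs a Monte Carlo simulation of the independent cascade model, one common diffusion model in which each newly activated node gets a single chance to activate each inactive neighbor with probability `p`. It is a self-contained illustration, not the code behind `ge.ndlib_estimated_influence`, and this guide does not say which diffusion model that function uses; the function name and arguments here are assumptions.

```python
import random

def simulate_independent_cascade(adjacency, seeds, p=0.1, runs=200, rng_seed=0):
    """Average number of activated nodes over `runs` cascade simulations."""
    rng = random.Random(rng_seed)
    total = 0
    for _ in range(runs):
        active = set(seeds)
        frontier = list(active)
        while frontier:
            next_frontier = []
            for node in frontier:
                for nb in adjacency.get(node, ()):
                    # Each active node tries once to activate each inactive neighbor.
                    if nb not in active and rng.random() < p:
                        active.add(nb)
                        next_frontier.append(nb)
            frontier = next_frontier
        total += len(active)
    return total / runs
```

An adjacency dictionary can be built from the NetworkX graph with `{n: list(G.neighbors(n)) for n in G}`, which makes it easy to compare the two seed sets under the same simulation.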
Benchmarking and Analysis
Compare different centrality measures:
from graphem.benchmark import benchmark_correlations
from graphem.visualization import report_full_correlation_matrix
# Run comprehensive benchmark
results = benchmark_correlations(
    graph_generator=ge.generate_ba,
    graph_params={'n': 300, 'm': 3},
    dim=3,
    num_iterations=50
)
# Display correlation matrix
correlation_matrix = report_full_correlation_matrix(
    results['radii'],        # Embedding-based centrality
    results['degree'],       # Degree centrality
    results['betweenness'],  # Betweenness centrality
    results['eigenvector'],  # Eigenvector centrality
    results['pagerank'],     # PageRank
    results['closeness'],    # Closeness centrality
    results['node_load']     # Load centrality
)
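Comparisons between centrality measures are often made on ranks rather than raw values. As a reference for what such a comparison involves, here is a small pure-Python Spearman rank correlation. Which coefficient `report_full_correlation_matrix` actually reports is not specified in this guide, so treat this as an illustrative assumption; `average_ranks`, `pearson`, and `spearman` are hypothetical helpers.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

A call such as `spearman(list(results['radii']), list(results['degree']))` would then measure how closely two measures agree on the ordering of nodes, assuming both entries are equal-length numeric sequences.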
Performance Tips
For Large Graphs (>10k nodes):
embedder = ge.GraphEmbedder(
    edges=edges,
    n_vertices=n_vertices,
    dimension=2,       # 2D is faster than 3D
    knn_k=10,          # Fewer neighbors = faster
    sample_size=256,   # Smaller samples = faster
    batch_size=4096,   # Larger batches = more efficient
    verbose=False      # Disable progress bars
)
GPU Acceleration:
GraphEm runs on the GPU automatically when JAX detects a CUDA device:
import jax
print("Available devices:", jax.devices()) # Check for GPU
# Force CPU usage if needed
with jax.default_device(jax.devices('cpu')[0]):
    embedder.run_layout(num_iterations=50)
Memory Management:
For very large graphs, reduce memory use by lowering the neighbor count, sample size, and batch size:
# For graphs with >100k nodes, consider reducing parameters
embedder = ge.GraphEmbedder(
    edges=edges,
    n_vertices=n_vertices,
    knn_k=5,           # Minimum viable k
    sample_size=128,   # Smaller sample
    batch_size=1024    # Smaller batches
)
Next Steps
Explore the Tutorials for detailed examples
Check the API Reference for complete documentation
See examples for real-world use cases
Read Contributing to GraphEm to help improve the project