Clustrix Documentation
Clustrix is a Python package for running Python functions on remote compute clusters. With a single decorator, you can execute a function remotely on cluster resources; Clustrix handles dependency management, environment setup, and result collection automatically.
Features
- Simple Decorator Interface: Just add @cluster to any function
- Multiple Cluster Support: SLURM, PBS, SGE, Kubernetes, and SSH
- Automatic Dependency Management: Captures and replicates your exact Python environment
- Loop Parallelization: Automatically distributes loops across cluster nodes
- Local Parallelization: Multi-core execution for development and testing
- Flexible Configuration: Easy setup with config files or environment variables
- Error Handling: Comprehensive error reporting and job monitoring
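The loop-distribution idea in the feature list can be illustrated with a small standard-library sketch. This is not Clustrix's implementation: `split_range` and `parallel_sum` are hypothetical helpers that only show the general technique of cutting a loop's index range into contiguous chunks and handing each chunk to a worker. Threads are used here to keep the example self-contained; CPU-bound work would normally use processes or cluster nodes instead.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n_items, n_workers):
    """Split range(n_items) into at most n_workers contiguous, near-equal chunks."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        # The first `extra` workers each take one additional item.
        size = base + (1 if w < extra else 0)
        if size == 0:
            break  # more workers than items; stop creating empty chunks
        chunks.append(range(start, start + size))
        start += size
    return chunks


def parallel_sum(func, n_items, n_workers=4):
    """Apply func to every index in range(n_items) and sum the results,
    processing one chunk of indices per worker."""
    def run_chunk(chunk):
        return sum(func(i) for i in chunk)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(run_chunk, split_range(n_items, n_workers)))


def square(i):
    return i * i


# Sum of squares of 0..9, computed in three chunks: 0-3, 4-6, 7-9.
total = parallel_sum(square, n_items=10, n_workers=3)
print(total)
```

Because each chunk is independent, the same partitioning could in principle be sent to separate cluster jobs and the partial results combined afterwards.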
Quick Start
Installation
pip install clustrix
Basic Usage
import clustrix

# Configure your cluster
clustrix.configure(
    cluster_type='slurm',
    cluster_host='your-cluster.example.com',
    username='your-username',
    default_cores=4,
    default_memory='8GB'
)

# Decorate your function
@clustrix.cluster(cores=8, memory='16GB', time='02:00:00')
def expensive_computation(data, iterations=1000):
    # Imported in the function body so the function is self-contained
    import numpy as np
    # Convert to an array first: a plain Python list does not support ** 2
    values = np.asarray(data)
    result = 0
    for i in range(iterations):
        result += np.sum(values ** 2)
    return result

# Execute on cluster
data = [1, 2, 3, 4, 5]
result = expensive_computation(data, iterations=10000)
print(f"Result: {result}")
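To give a sense of what the `cores`, `memory`, and `time` arguments correspond to on a SLURM cluster, the sketch below renders them as `#SBATCH` directives. The source does not show how Clustrix generates job scripts, so `render_sbatch_header` is a hypothetical helper and the mapping is an assumption; only the `sbatch` options themselves (`--job-name`, `--ntasks`, `--cpus-per-task`, `--mem`, `--time`) are standard SLURM.

```python
def render_sbatch_header(job_name, cores, memory, time):
    """Return the header of a SLURM batch script for a single-task job.

    Hypothetical helper for illustration; not part of the Clustrix API.
    """
    # SLURM documents memory suffixes as K, M, G, or T, so '16GB' becomes '16G'.
    mem = memory.upper()
    if mem.endswith("B"):
        mem = mem[:-1]
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --ntasks=1",
        f"#SBATCH --cpus-per-task={cores}",
        f"#SBATCH --mem={mem}",
        f"#SBATCH --time={time}",
    ])


# Same resource requests as the decorator in the example above.
header = render_sbatch_header("expensive_computation", cores=8, memory="16GB", time="02:00:00")
print(header)
```

Other schedulers (PBS, SGE) express the same requests with different directive syntax, which is one reason a scheduler-independent decorator is convenient.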
Table of Contents
User Guide
Interactive Notebooks
- Complete Clustrix API Demonstration
- SLURM Cluster Tutorial
  - Prerequisites
  - Installation and Setup
  - Basic SLURM Configuration
  - Example 1: Simple Mathematical Computation
  - Example 2: Machine Learning Model Training
  - Example 3: Parallel Data Processing with Automatic Loop Distribution
  - Example 4: Scientific Computing - Numerical Integration
  - Example 5: Bioinformatics - Sequence Analysis
  - Advanced SLURM Features
  - Monitoring and Debugging
  - Configuration Best Practices
  - Summary
- PBS/Torque Cluster Tutorial
  - Prerequisites
  - Installation and Setup
  - PBS Cluster Configuration
  - Example 1: Bioinformatics - DNA Sequence Analysis
  - Example 2: Materials Science - Molecular Dynamics Simulation
  - Example 3: Environmental Science - Climate Data Analysis
  - PBS Queue Management and Resource Selection
  - PBS Job Arrays for Parameter Studies
  - Monitoring PBS Jobs
  - PBS Configuration Best Practices
  - Summary
- SGE (Sun Grid Engine) Tutorial
- Kubernetes Tutorial
- SSH Remote Execution Tutorial
- Clustrix Basic Usage Tutorial
API Reference
Supported Cluster Types
| Cluster Type | Status | Notes |
|---|---|---|
| SLURM | ✅ Full Support | Production ready |
| PBS/Torque | ✅ Full Support | Production ready |
| SSH | ✅ Full Support | Direct execution |
| SGE | ⚡ Nearly Ready | Job submit works, status pending |
| Kubernetes | ⚡ Nearly Ready | Job submit works, status pending |
Links
GitHub Repository: https://github.com/ContextLab/clustrix
PyPI Package: https://pypi.org/project/clustrix/
Issue Tracker: https://github.com/ContextLab/clustrix/issues
Discussions: https://github.com/ContextLab/clustrix/discussions