Clustrix Documentation

Clustrix is a Python package that enables seamless distributed computing on clusters. With a simple decorator, you can execute any Python function remotely on cluster resources while automatically handling dependency management, environment setup, and result collection.

Features

  • Simple Decorator Interface: Just add @cluster to any function

  • Multiple Cluster Support: SLURM, PBS, SGE, Kubernetes, and SSH

  • Automatic Dependency Management: Captures and replicates your exact Python environment

  • Loop Parallelization: Automatically distributes loops across cluster nodes

  • Local Parallelization: Multi-core execution for development and testing

  • Flexible Configuration: Easy setup with config files or environment variables

  • Error Handling: Comprehensive error reporting and job monitoring
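The loop parallelization idea in the list above can be illustrated with a small standalone sketch. It does not use Clustrix and is not how Clustrix is implemented; the helper names (split_range, partial_sum, parallel_sum_of_squares) are hypothetical, and local threads stand in for cluster nodes. It only shows the general pattern: split the iteration space into chunks, process each chunk independently, then combine the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n_items, n_chunks):
    """Split range(n_items) into at most n_chunks contiguous, near-equal ranges."""
    base, extra = divmod(n_items, n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # fewer items than chunks: skip empty chunks
        chunks.append(range(start, start + size))
        start += size
    return chunks


def partial_sum(indices):
    """Work done for one chunk (hypothetical loop body: sum of squares)."""
    return sum(i * i for i in indices)


def parallel_sum_of_squares(n_items, n_workers=4):
    """Run each chunk in a local worker thread and combine the partial results."""
    chunks = split_range(n_items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(parallel_sum_of_squares(10))
```

On a real cluster each chunk would become a separate job or task rather than a thread, but the split, compute, and combine steps stay the same.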

Quick Start

Installation

pip install clustrix

Basic Usage

import clustrix

# Configure your cluster
clustrix.configure(
    cluster_type='slurm',
    cluster_host='your-cluster.example.com',
    username='your-username',
    default_cores=4,
    default_memory='8GB'
)

# Decorate your function
@clustrix.cluster(cores=8, memory='16GB', time='02:00:00')
def expensive_computation(data, iterations=1000):
    import numpy as np
    # Convert to an array so plain lists work; ** is not defined for list
    data = np.asarray(data)
    result = 0
    for i in range(iterations):
        result += np.sum(data ** 2)
    return result

# Execute on cluster
data = [1, 2, 3, 4, 5]
result = expensive_computation(data, iterations=10000)
print(f"Result: {result}")
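For readers unfamiliar with parameterized decorators, the sketch below shows the general pattern that a call such as @clustrix.cluster(cores=8, ...) follows: the outer function receives the resource options and returns a decorator that wraps the original function. This is a toy stand-in, not Clustrix's implementation; toy_cluster and its resources attribute are hypothetical, and the wrapped call runs in a local worker thread instead of on a remote cluster.

```python
import functools
from concurrent.futures import ThreadPoolExecutor


def toy_cluster(cores=1):
    """Hypothetical stand-in for a cluster decorator (runs locally in a thread)."""
    def decorator(func):
        @functools.wraps(func)  # keep the original name and docstring
        def wrapper(*args, **kwargs):
            # A real system would serialize the call and submit it to a scheduler;
            # here it is simply handed to a local worker and awaited.
            with ThreadPoolExecutor(max_workers=cores) as pool:
                future = pool.submit(func, *args, **kwargs)
                return future.result()
        # Record the requested resources for inspection (assumed attribute name).
        wrapper.resources = {"cores": cores}
        return wrapper
    return decorator


@toy_cluster(cores=2)
def sum_of_squares(values):
    return sum(v * v for v in values)


if __name__ == "__main__":
    print(sum_of_squares([1, 2, 3]))
```

The caller's code does not change when the execution backend does, which is the main appeal of the decorator interface.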

Supported Cluster Types

Cluster Type    Status             Notes
SLURM           ✅ Full Support     Production ready
PBS/Torque      ✅ Full Support     Production ready
SSH             ✅ Full Support     Direct execution
SGE             ⚡ Nearly Ready     Job submit works, status pending
Kubernetes      ⚡ Nearly Ready     Job submit works, status pending
