Clustrix Documentation

Clustrix is a Python package that enables seamless distributed computing on clusters. With a simple decorator, you can execute any Python function remotely on cluster resources while automatically handling dependency management, environment setup, and result collection.


Features

  • Simple Decorator Interface: Just add @cluster to any function

  • Advanced Function Packaging: AST-based dependency analysis works around the limitations of pickle

  • Interactive Jupyter Widget: %%clusterfy magic command with GUI configuration manager

  • Multiple Cluster Support: SLURM, PBS, SGE, Kubernetes, and SSH

  • Unified Filesystem Utilities: Work with files seamlessly across local and remote clusters

  • Shared Storage Optimization: Automatic detection and optimization for HPC shared filesystems

  • Native Cost Monitoring: Built-in cost tracking for AWS, GCP, Azure, and Lambda Cloud

  • Automatic Dependency Management: Captures and replicates your exact Python environment

  • Loop Parallelization: Automatically distributes loops across cluster nodes

  • Local Parallelization: Multi-core execution for development and testing

  • Flexible Configuration: Easy setup with config files, environment variables, or interactive widget

  • Error Handling: Comprehensive error reporting and job monitoring
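To make the "AST-based dependency analysis" idea concrete, here is a minimal sketch that uses only Python's standard `ast` module to list the modules a function imports. It is an illustration of the general technique, not Clustrix's actual implementation; the helper name `find_imports` and the sample source are hypothetical.

```python
import ast


def find_imports(source: str) -> set[str]:
    """Return the top-level module names imported anywhere in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b as c` -> record "a"
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from a.b import c` -> record "a"; relative imports are skipped
            modules.add(node.module.split(".")[0])
    return modules


# Hypothetical function source, analysed as text (it is never executed here).
SOURCE = '''
def expensive_computation(data, iterations=1000):
    import numpy as np
    from os.path import join
    return np.sum(data)
'''

print(sorted(find_imports(SOURCE)))
```

Because the source is parsed rather than executed, the modules it names do not need to be installed on the machine doing the analysis.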

Quick Start

Installation

pip install clustrix

Basic Usage

import clustrix

# Configure your cluster
clustrix.configure(
    cluster_type='slurm',
    cluster_host='your-cluster.example.com',
    username='your-username',
    default_cores=4,
    default_memory='8GB'
)

# Decorate your function
@clustrix.cluster(cores=8, memory='16GB', time='02:00:00')
def expensive_computation(data, iterations=1000):
    import numpy as np
    # Convert to an array first: a plain list does not support `data ** 2`
    values = np.asarray(data)
    result = 0
    for i in range(iterations):
        result += np.sum(values ** 2)
    return result

# Execute on cluster
data = [1, 2, 3, 4, 5]
result = expensive_computation(data, iterations=10000)
print(f"Result: {result}")
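The call pattern above (a decorator that takes resource arguments and returns a wrapped function) can be sketched in plain Python. The `fake_cluster` decorator below is a hypothetical stand-in that runs the function locally and only records the requested resources. It shows the general decorator-with-arguments pattern and is not how Clustrix is implemented.

```python
import functools


def fake_cluster(cores=1, memory="1GB", time="00:10:00"):
    """Hypothetical stand-in for a resource-requesting decorator; runs locally."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # A real backend would package `func`, submit a job with the
            # requested resources, wait, and fetch the result.
            # This sketch simply calls the function in-process.
            return func(*args, **kwargs)

        # Keep the resource request inspectable on the wrapped function.
        wrapper.resources = {"cores": cores, "memory": memory, "time": time}
        return wrapper
    return decorator


@fake_cluster(cores=8, memory="16GB", time="02:00:00")
def sum_of_squares(data, iterations=1000):
    # Pure-Python version of the loop above, so no third-party imports are needed.
    return sum(sum(x * x for x in data) for _ in range(iterations))


print(sum_of_squares.resources)
print(sum_of_squares([1, 2, 3, 4, 5], iterations=10))  # 55 * 10 = 550
```

A local stand-in like this can be handy for checking a function's logic on small inputs before requesting cluster time.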

Jupyter Notebook Integration

For Jupyter notebook users, Clustrix provides an interactive configuration widget:

import clustrix  # Auto-loads the magic command and displays the widget

%%clusterfy
# Run this in its own cell: a cell magic must be the first line of the cell.
# Interactive widget appears with:
# - Dropdown to select configurations
# - Forms to create/edit cluster setups
# - One-click configuration application
# - Save/load configurations to files

Interactive Configuration Widget

The Clustrix widget provides a comprehensive GUI for managing cluster configurations directly in Jupyter notebooks.

Default View

When you first import clustrix or use the %%clusterfy magic command, the widget displays with a default “Local Single-core” configuration:

Default widget view showing Local Single-core configuration

Configuration Templates

The dropdown menu includes pre-built templates for various cluster types and cloud providers:

Configuration dropdown showing available templates

HPC Cluster Configuration

For traditional HPC clusters like SLURM, the widget provides all essential configuration fields:

SLURM configuration with basic settings

The advanced settings accordion reveals additional options for modules, environment variables, and custom commands:

SLURM advanced configuration options

Cloud Provider Support

Cloud providers have dynamic field visibility showing only relevant options:

Google Cloud Platform:

GCP VM configuration interface

Lambda Cloud GPU Instances:

Lambda Cloud with GPU instance dropdown

The widget includes templates for AWS, Google Cloud, Azure, SLURM, Kubernetes, Lambda Cloud, and HuggingFace Spaces.
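The widget's save/load feature comes down to writing a configuration to a file and reading it back. The sketch below does this with JSON from the standard library, reusing the field names from the `clustrix.configure(...)` call under Basic Usage. The JSON format, the file name, and the helpers `save_config` and `load_config` are assumptions for illustration; the widget's real file format may differ.

```python
import json
import tempfile
from pathlib import Path

# Same fields as the configure() call shown under Basic Usage.
config = {
    "cluster_type": "slurm",
    "cluster_host": "your-cluster.example.com",
    "username": "your-username",
    "default_cores": 4,
    "default_memory": "8GB",
}


def save_config(cfg: dict, path: Path) -> None:
    """Write the configuration as human-readable JSON."""
    path.write_text(json.dumps(cfg, indent=2))


def load_config(path: Path) -> dict:
    """Read a configuration previously written by save_config."""
    return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cluster_config.json"  # hypothetical file name
    save_config(config, path)
    restored = load_config(path)

print(restored == config)
```

Since the restored dictionary has the same keys as the keyword arguments used under Basic Usage, it could presumably be passed along as `clustrix.configure(**restored)`.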

Table of Contents

Interactive Notebooks

Cloud Platform Tutorials

API Reference

Supported Cluster Types

Traditional HPC Schedulers

Cluster Type    Status             Notes
SLURM           ✅ Full Support    Production ready
PBS/Torque      ✅ Full Support    Production ready
SGE             ✅ Full Support    Production ready
SSH             ✅ Full Support    Direct execution

Container Orchestration

Platform        Status             Notes
Kubernetes      ✅ Full Support    Native K8s API with auto-deps
AWS EKS         ✅ Full Support    Kubernetes + AWS integration
Azure AKS       ✅ Full Support    Kubernetes + Azure integration
Google GKE      ✅ Full Support    Kubernetes + GCP integration

Cloud Computing Platforms

Platform        Status             Notes
AWS EC2         ✅ Full Support    Auto-provisioning + cost monitor
Azure VMs       ✅ Full Support    Auto-provisioning + cost monitor
Google Cloud    ✅ Full Support    Auto-provisioning + cost monitor
Lambda Cloud    ✅ Full Support    GPU-optimized instances
HF Spaces       ✅ Full Support    Hugging Face Spaces integration
