Configuration API

Clustrix settings for the cluster connection, authentication, and execution preferences can be supplied programmatically, from a configuration file, or through environment variables.

class clustrix.config.ClusterConfig(api_key=None, username=None, password=None, key_file=None, cluster_type='slurm', cluster_host=None, cluster_port=22, default_cores=4, default_memory='8GB', default_time='01:00:00', default_partition=None, default_queue=None, remote_work_dir='/tmp/clustrix', local_cache_dir='~/.clustrix/cache', conda_env_name=None, python_executable='python', auto_parallel=True, max_parallel_jobs=100, job_poll_interval=30, cleanup_on_success=True, prefer_local_parallel=False, local_parallel_threshold=1000, environment_variables=None, module_loads=None, pre_execution_commands=None)

Bases: object

Configuration settings for cluster execution.

api_key: Optional[str] = None
username: Optional[str] = None
password: Optional[str] = None
key_file: Optional[str] = None
cluster_type: str = 'slurm'
cluster_host: Optional[str] = None
cluster_port: int = 22
default_cores: int = 4
default_memory: str = '8GB'
default_time: str = '01:00:00'
default_partition: Optional[str] = None
default_queue: Optional[str] = None
remote_work_dir: str = '/tmp/clustrix'
local_cache_dir: str = '~/.clustrix/cache'
conda_env_name: Optional[str] = None
python_executable: str = 'python'
auto_parallel: bool = True
max_parallel_jobs: int = 100
job_poll_interval: int = 30
cleanup_on_success: bool = True
prefer_local_parallel: bool = False
local_parallel_threshold: int = 1000
environment_variables: Optional[Dict[str, str]] = None
module_loads: Optional[list] = None
pre_execution_commands: Optional[list] = None
__init__(api_key=None, username=None, password=None, key_file=None, cluster_type='slurm', cluster_host=None, cluster_port=22, default_cores=4, default_memory='8GB', default_time='01:00:00', default_partition=None, default_queue=None, remote_work_dir='/tmp/clustrix', local_cache_dir='~/.clustrix/cache', conda_env_name=None, python_executable='python', auto_parallel=True, max_parallel_jobs=100, job_poll_interval=30, cleanup_on_success=True, prefer_local_parallel=False, local_parallel_threshold=1000, environment_variables=None, module_loads=None, pre_execution_commands=None)
clustrix.config.configure(**kwargs)

Configure Clustrix settings.

Parameters:

**kwargs – Configuration parameters matching ClusterConfig fields

Return type:

None

clustrix.config.load_config(config_path)

Load configuration from a file (JSON or YAML).

Parameters:

config_path (str) – Path to configuration file

Return type:

None

clustrix.config.save_config(config_path)

Save current configuration to a file.

Parameters:

config_path (str) – Path of the file to write the configuration to

Return type:

None
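
A save followed by a load should give back the same settings. The sketch below shows that round trip for the JSON case using only the standard library. It is an illustration under assumptions, not the library's code: `SketchConfig`, `sketch_save`, and `sketch_load` are hypothetical names, and unlike load_config (which returns None) the stand-in loader returns the object it reads so the result can be inspected.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in mirroring three ClusterConfig fields."""
    cluster_type: str = "slurm"
    cluster_host: Optional[str] = None
    default_memory: str = "8GB"


def sketch_save(config: SketchConfig, config_path: str) -> None:
    """Write every field of the settings object to a JSON file."""
    with open(config_path, "w", encoding="utf-8") as fh:
        json.dump(asdict(config), fh, indent=2)


def sketch_load(config_path: str) -> SketchConfig:
    """Read a JSON file back into a settings object."""
    with open(config_path, encoding="utf-8") as fh:
        return SketchConfig(**json.load(fh))


original = SketchConfig(cluster_host="cluster.example.com", default_memory="16GB")
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "clustrix.json")  # file name chosen for the sketch
    sketch_save(original, path)
    restored = sketch_load(path)
```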

clustrix.config.get_config()

Get current configuration.

Return type:

ClusterConfig

Configuration Methods

Programmatic Configuration

import clustrix

clustrix.configure(
    cluster_type='slurm',
    cluster_host='cluster.example.com',
    username='myuser',
    key_file='~/.ssh/id_rsa',
    default_cores=8,
    default_memory='16GB'
)

Configuration File

Create a clustrix.yml file:

cluster_type: slurm
cluster_host: cluster.example.com
username: myuser
key_file: ~/.ssh/id_rsa

default_cores: 8
default_memory: 16GB
default_time: "02:00:00"
default_partition: gpu

remote_work_dir: /scratch/myuser/clustrix
conda_env_name: myproject

auto_parallel: true
max_parallel_jobs: 50
cleanup_on_success: true

module_loads:
  - python/3.9
  - cuda/11.2

environment_variables:
  CUDA_VISIBLE_DEVICES: "0,1"
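
Because load_config is documented as accepting JSON as well as YAML, the same settings can also be kept in a JSON file. The sketch below builds the JSON text for the values in the YAML example with the standard json module. How Clustrix decides which format a file is in is not stated here, so any file name and extension you choose for the output is an assumption to check against your installation.

```python
import json

# Same values as the YAML example above, expressed as a Python dict.
settings = {
    "cluster_type": "slurm",
    "cluster_host": "cluster.example.com",
    "username": "myuser",
    "key_file": "~/.ssh/id_rsa",
    "default_cores": 8,
    "default_memory": "16GB",
    "default_time": "02:00:00",
    "default_partition": "gpu",
    "remote_work_dir": "/scratch/myuser/clustrix",
    "conda_env_name": "myproject",
    "auto_parallel": True,
    "max_parallel_jobs": 50,
    "cleanup_on_success": True,
    "module_loads": ["python/3.9", "cuda/11.2"],
    "environment_variables": {"CUDA_VISIBLE_DEVICES": "0,1"},
}

# Serialise to JSON text; write this string to a file to use it.
config_text = json.dumps(settings, indent=2)
print(config_text)
```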

Environment Variables

Set configuration via environment variables:

export CLUSTRIX_CLUSTER_TYPE=slurm
export CLUSTRIX_CLUSTER_HOST=cluster.example.com
export CLUSTRIX_USERNAME=myuser
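
The three variables above suggest a naming pattern: the prefix CLUSTRIX_ followed by the field name in upper case. The sketch below applies that pattern in reverse to collect overrides from an environment mapping. The helper `settings_from_env` is hypothetical, and the pattern itself is inferred from the examples rather than confirmed for every field.

```python
from typing import Dict, Mapping

PREFIX = "CLUSTRIX_"


def settings_from_env(environ: Mapping[str, str]) -> Dict[str, str]:
    """Collect CLUSTRIX_* variables as lower-case field names.

    Values are left as strings; converting them to int or bool is
    outside the scope of this sketch.
    """
    return {
        name[len(PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(PREFIX)
    }


example_env = {
    "CLUSTRIX_CLUSTER_TYPE": "slurm",
    "CLUSTRIX_CLUSTER_HOST": "cluster.example.com",
    "CLUSTRIX_USERNAME": "myuser",
    "HOME": "/home/myuser",  # ignored: no CLUSTRIX_ prefix
}
overrides = settings_from_env(example_env)
```

In a running process the same helper could be called with `os.environ` instead of the example dictionary.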

Configuration Options

Authentication

  • username: SSH username

  • password: SSH password (not recommended)

  • key_file: Path to SSH private key file
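
Since password authentication is marked as not recommended, a caller that holds both credentials would reasonably try the key file first. The sketch below encodes that preference. `pick_auth` is a hypothetical helper, and the order in which Clustrix itself tries credentials is an assumption, not something this page states.

```python
import os
from typing import Optional, Tuple


def pick_auth(key_file: Optional[str], password: Optional[str]) -> Tuple[str, str]:
    """Return (method, credential), preferring the key file when both are set."""
    if key_file:
        # Expand a leading "~", as in '~/.ssh/id_rsa'.
        return "key_file", os.path.expanduser(key_file)
    if password:
        return "password", password
    # Raising here is a choice made for this sketch.
    raise ValueError("set key_file or password")


method, credential = pick_auth("/keys/id_rsa", "secret")
```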

Cluster Settings

  • cluster_type: Type of cluster: slurm, pbs, sge, kubernetes, or ssh (default: slurm)

  • cluster_host: Hostname of cluster head node

  • cluster_port: SSH port (default: 22)

Resource Defaults

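  • default_cores: Default number of CPU cores (default: 4)

  • default_memory: Default memory allocation, as a string such as '16GB' (default: '8GB')

  • default_time: Default time limit, as a string such as '02:00:00' (default: '01:00:00')

  • default_partition: Default partition/queue (default: None)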
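
Memory and time limits are given as strings, so it can help to sanity-check them before submitting work. The sketch below converts both to numbers. The helpers are hypothetical, and two assumptions are built in: memory units are treated as binary multiples, and the time string is read as HH:MM:SS. The scheduler behind your cluster may interpret either differently.

```python
import re

# Binary multiples; an assumption of this sketch.
_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def memory_to_bytes(text: str) -> int:
    """Convert a string such as '8GB' to bytes."""
    match = re.fullmatch(r"(\d+)\s*([KMGT]B)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised memory string: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def time_to_seconds(text: str) -> int:
    """Convert an 'HH:MM:SS' string such as '01:00:00' to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


default_memory_bytes = memory_to_bytes("8GB")
default_time_seconds = time_to_seconds("01:00:00")
```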

Execution Preferences

  • auto_parallel: Enable automatic loop parallelization (default: True)

  • max_parallel_jobs: Maximum number of parallel jobs (default: 100)

  • prefer_local_parallel: Prefer local over remote parallel execution (default: False)

  • cleanup_on_success: Clean up remote files after successful execution (default: True)
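
One way to picture a cap such as max_parallel_jobs is as a limit on how many work items are in flight at once. The sketch below splits a list of items into consecutive waves of at most that size. This reading is an assumption: `batches` is a hypothetical helper, and Clustrix may apply the limit differently, for example by dividing the work into at most that many chunks.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[int], max_parallel_jobs: int) -> Iterator[List[int]]:
    """Yield consecutive slices holding at most max_parallel_jobs items."""
    if max_parallel_jobs < 1:
        raise ValueError("max_parallel_jobs must be at least 1")
    for start in range(0, len(items), max_parallel_jobs):
        yield list(items[start:start + max_parallel_jobs])


# 250 items with the documented default cap of 100.
waves = list(batches(range(250), max_parallel_jobs=100))
sizes = [len(wave) for wave in waves]
```

With 250 items and a cap of 100, the sketch produces three waves of 100, 100, and 50 items.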