Configuration API
Clustrix provides flexible configuration management for cluster settings, authentication, and execution preferences.
- class clustrix.config.ClusterConfig(api_key=None, username=None, password=None, key_file=None, cluster_type='slurm', cluster_host=None, cluster_port=22, k8s_namespace='default', k8s_image='python:3.11-slim', k8s_service_account=None, k8s_pull_policy='IfNotPresent', k8s_job_ttl_seconds=3600, k8s_backoff_limit=3, k8s_remote=False, cloud_provider='manual', cloud_region=None, cloud_auto_configure=False, auto_provision_k8s=False, k8s_provider='aws', k8s_from_scratch=True, k8s_auto_cleanup=True, k8s_cluster_name=None, k8s_node_count=2, k8s_node_type=None, k8s_version='1.28', k8s_region=None, aws_profile=None, aws_access_key_id=None, aws_secret_access_key=None, aws_access_key=None, aws_secret_key=None, aws_session_token=None, aws_instance_type=None, aws_cluster_type=None, eks_cluster_name=None, aws_region=None, azure_subscription_id=None, azure_resource_group=None, azure_tenant_id=None, azure_client_id=None, azure_client_secret=None, azure_instance_type=None, aks_cluster_name=None, azure_region=None, gcp_project_id=None, gcp_zone=None, gcp_service_account_key=None, gcp_instance_type=None, gke_cluster_name=None, gcp_region=None, lambda_instance_type=None, lambda_api_key=None, hf_hardware=None, hf_token=None, hf_username=None, hf_sdk=None, default_cores=4, default_memory='8GB', default_time='01:00:00', default_partition=None, default_queue=None, remote_work_dir='/tmp/clustrix', local_work_dir=None, local_cache_dir='~/.clustrix/cache', conda_env_name=None, python_executable='python', package_manager='pip', auto_parallel=True, auto_gpu_parallel=True, max_parallel_jobs=100, max_gpu_parallel_jobs=8, job_poll_interval=30, cleanup_on_success=True, prefer_local_parallel=False, local_parallel_threshold=1000, async_submit=False, use_two_venv=True, venv_setup_timeout=300, cost_monitoring=False, use_env_password=False, password_env_var='', cache_credentials=True, credential_cache_ttl=300, ssh_port=22, environment_variables=None, module_loads=None, pre_execution_commands=None, 
cluster_packages=None, venv_post_install_commands=None, gpu_detection_enabled=True, auto_gpu_packages=True, cuda_version_preference=None, gpu_memory_fraction=0.9, prefer_gpu_execution=True, gpu_requirements=None, rapids_ecosystem=False, venv_info=None)
Bases: object
Configuration settings for cluster execution.
- classmethod load_from_file(config_path)
Load configuration from a file and return a new instance.
- Return type:
- __init__(api_key=None, username=None, password=None, key_file=None, cluster_type='slurm', cluster_host=None, cluster_port=22, k8s_namespace='default', k8s_image='python:3.11-slim', k8s_service_account=None, k8s_pull_policy='IfNotPresent', k8s_job_ttl_seconds=3600, k8s_backoff_limit=3, k8s_remote=False, cloud_provider='manual', cloud_region=None, cloud_auto_configure=False, auto_provision_k8s=False, k8s_provider='aws', k8s_from_scratch=True, k8s_auto_cleanup=True, k8s_cluster_name=None, k8s_node_count=2, k8s_node_type=None, k8s_version='1.28', k8s_region=None, aws_profile=None, aws_access_key_id=None, aws_secret_access_key=None, aws_access_key=None, aws_secret_key=None, aws_session_token=None, aws_instance_type=None, aws_cluster_type=None, eks_cluster_name=None, aws_region=None, azure_subscription_id=None, azure_resource_group=None, azure_tenant_id=None, azure_client_id=None, azure_client_secret=None, azure_instance_type=None, aks_cluster_name=None, azure_region=None, gcp_project_id=None, gcp_zone=None, gcp_service_account_key=None, gcp_instance_type=None, gke_cluster_name=None, gcp_region=None, lambda_instance_type=None, lambda_api_key=None, hf_hardware=None, hf_token=None, hf_username=None, hf_sdk=None, default_cores=4, default_memory='8GB', default_time='01:00:00', default_partition=None, default_queue=None, remote_work_dir='/tmp/clustrix', local_work_dir=None, local_cache_dir='~/.clustrix/cache', conda_env_name=None, python_executable='python', package_manager='pip', auto_parallel=True, auto_gpu_parallel=True, max_parallel_jobs=100, max_gpu_parallel_jobs=8, job_poll_interval=30, cleanup_on_success=True, prefer_local_parallel=False, local_parallel_threshold=1000, async_submit=False, use_two_venv=True, venv_setup_timeout=300, cost_monitoring=False, use_env_password=False, password_env_var='', cache_credentials=True, credential_cache_ttl=300, ssh_port=22, environment_variables=None, module_loads=None, pre_execution_commands=None, cluster_packages=None, 
venv_post_install_commands=None, gpu_detection_enabled=True, auto_gpu_packages=True, cuda_version_preference=None, gpu_memory_fraction=0.9, prefer_gpu_execution=True, gpu_requirements=None, rapids_ecosystem=False, venv_info=None)
Configuration Methods
Programmatic Configuration
import clustrix

clustrix.configure(
    cluster_type='slurm',
    cluster_host='cluster.example.com',
    username='myuser',
    key_file='~/.ssh/id_rsa',
    default_cores=8,
    default_memory='16GB'
)
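With this many keyword arguments, a misspelled option name is easy to miss. The following is a minimal sketch, not part of Clustrix, that checks a settings dictionary against a hand-copied subset of the parameter names from the ClusterConfig signature before the dictionary is passed on. The helper name find_unknown_options and the KNOWN_OPTIONS set are hypothetical and exist only for this illustration.

```python
import difflib

# Hypothetical helper data: a subset of parameter names copied from the
# ClusterConfig signature. Extend it with any further options you use.
KNOWN_OPTIONS = {
    "cluster_type", "cluster_host", "cluster_port", "username", "password",
    "key_file", "default_cores", "default_memory", "default_time",
    "default_partition", "remote_work_dir", "max_parallel_jobs",
}


def find_unknown_options(settings):
    """Map each unrecognized key to its closest known name, or None."""
    unknown = {}
    for key in settings:
        if key not in KNOWN_OPTIONS:
            matches = difflib.get_close_matches(key, sorted(KNOWN_OPTIONS), n=1)
            unknown[key] = matches[0] if matches else None
    return unknown


settings = {
    "cluster_type": "slurm",
    "cluster_host": "cluster.example.com",
    "default_core": 8,  # deliberate typo for default_cores
}
problems = find_unknown_options(settings)
print(problems)
```

If problems comes back empty, the dictionary can be unpacked into the call shown above as clustrix.configure(**settings).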
Configuration File
Create a clustrix.yml file:
cluster_type: slurm
cluster_host: cluster.example.com
username: myuser
key_file: ~/.ssh/id_rsa
default_cores: 8
default_memory: 16GB
default_time: "02:00:00"
default_partition: gpu
remote_work_dir: /scratch/myuser/clustrix
conda_env_name: myproject
auto_parallel: true
max_parallel_jobs: 50
cleanup_on_success: true
module_loads:
  - python/3.9
  - cuda/11.2
environment_variables:
  CUDA_VISIBLE_DEVICES: "0,1"
Environment Variables
Set configuration via environment variables:
export CLUSTRIX_CLUSTER_TYPE=slurm
export CLUSTRIX_CLUSTER_HOST=cluster.example.com
export CLUSTRIX_USERNAME=myuser
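The three variables above suggest a naming pattern: the prefix CLUSTRIX_ followed by the option name in upper case. Only these three names appear here, and how Clustrix itself reads them is not described, so the sketch below is an assumption-based illustration of that pattern, not the library's implementation. The function name settings_from_env is hypothetical.

```python
import os

PREFIX = "CLUSTRIX_"  # assumed from the three example variables


def settings_from_env(environ=None):
    """Collect CLUSTRIX_* variables into a dict keyed by lowercase option name.

    Values are returned as strings; converting them to int or bool is not
    attempted here.
    """
    environ = os.environ if environ is None else environ
    return {
        name[len(PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }


# A stand-in environment, so the example does not depend on the real shell.
example_env = {
    "CLUSTRIX_CLUSTER_TYPE": "slurm",
    "CLUSTRIX_CLUSTER_HOST": "cluster.example.com",
    "CLUSTRIX_USERNAME": "myuser",
    "HOME": "/home/myuser",
}
env_settings = settings_from_env(example_env)
print(env_settings)
```

Passing an explicit mapping instead of reading os.environ directly keeps the helper easy to test.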
Configuration Options
Authentication
- username: SSH username
- password: SSH password (not recommended)
- key_file: Path to SSH private key file
Cluster Settings
- cluster_type: Type of cluster (slurm, pbs, sge, kubernetes, ssh)
- cluster_host: Hostname of cluster head node
- cluster_port: SSH port (default: 22)
Resource Defaults
- default_cores: Default number of CPU cores (default: 4)
- default_memory: Default memory allocation (default: '8GB')
- default_time: Default time limit (default: '01:00:00')
- default_partition: Default partition/queue (default: None)
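The memory and time defaults are plain strings such as '8GB' and '01:00:00'. The sketch below shows one way to sanity-check such values before putting them in a configuration. It assumes that memory is an integer followed by KB, MB, GB, or TB (interpreted here as binary multiples) and that time uses an HH:MM:SS layout. Neither format is spelled out here beyond the example values, and the helper names memory_to_bytes and time_to_seconds are hypothetical.

```python
import re

# Assumption: units are binary multiples (1 GB = 1024**3 bytes).
_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def memory_to_bytes(text):
    """Convert a string like '8GB' to a byte count, or raise ValueError."""
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB|TB)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized memory value: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def time_to_seconds(text):
    """Convert an 'HH:MM:SS' string to seconds, or raise ValueError."""
    match = re.fullmatch(r"(\d+):([0-5]\d):([0-5]\d)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


print(memory_to_bytes("8GB"))
print(time_to_seconds("01:00:00"))
```

A malformed value such as '8 gigs' or '1h' raises ValueError locally, before a job is ever submitted.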
Execution Preferences
- auto_parallel: Enable automatic loop parallelization (default: True)
- max_parallel_jobs: Maximum number of parallel jobs (default: 100)
- prefer_local_parallel: Prefer local over remote parallel execution (default: False)
- cleanup_on_success: Clean up remote files after successful execution (default: True)