SSH Key Setup for Remote Clusters

Clustrix requires SSH access to remote clusters for job submission and management. This guide provides detailed instructions for setting up SSH keys with different cluster environments and authentication methods.

Overview

Clustrix supports multiple authentication methods:

  • SSH Key Authentication (Recommended): Most secure and convenient

  • Password Authentication: Simple but less secure

  • SSH Agent: For managing multiple keys and passphrases

  • Custom Key Files: For specific cluster configurations

SSH Key Authentication Setup

1. Generate SSH Key Pair

Generate a new SSH key pair specifically for cluster access:

# Generate RSA key (recommended for compatibility)
ssh-keygen -t rsa -b 4096 -f ~/.ssh/clustrix_key

# Or generate an Ed25519 key (shorter keys, faster; needs OpenSSH 6.5 or later on both ends)
ssh-keygen -t ed25519 -f ~/.ssh/clustrix_ed25519

Important: Use a strong passphrase for additional security.

2. Copy Public Key to Cluster

Upload your public key to the cluster:

# Using ssh-copy-id (easiest method)
ssh-copy-id -i ~/.ssh/clustrix_key.pub username@cluster.hostname.edu

# Manual method if ssh-copy-id is not available
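# (also sets the permissions sshd expects on the remote side)
cat ~/.ssh/clustrix_key.pub | ssh username@cluster.hostname.edu "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"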
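
With the manual method, a truncated or mis-pasted key line is a common cause of later "Permission denied" errors. The following is a minimal sketch of a sanity check on a public key line before it is appended to authorized_keys. It relies on the OpenSSH public key layout (the base64 data begins with a length-prefixed copy of the key type). The function name parse_public_key_line is hypothetical and not part of Clustrix.

```python
import base64
import struct


def parse_public_key_line(line):
    """Validate one OpenSSH public key line and return (key_type, comment)."""
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64> [comment]'")
    key_type, blob_b64 = parts[0], parts[1]
    comment = parts[2] if len(parts) == 3 else ""
    # b64decode raises a ValueError subclass when the data is not valid base64
    blob = base64.b64decode(blob_b64, validate=True)
    if len(blob) < 4:
        raise ValueError("key data is too short")
    # The blob starts with a 4-byte big-endian length followed by the key type
    (length,) = struct.unpack(">I", blob[:4])
    embedded_type = blob[4:4 + length].decode("ascii", errors="replace")
    if embedded_type != key_type:
        raise ValueError(f"type mismatch: {key_type!r} vs {embedded_type!r}")
    return key_type, comment
```

A typical use would be to read ~/.ssh/clustrix_key.pub, pass its single line to the function, and only upload the key if no ValueError is raised.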

3. Configure SSH Client

Create or edit ~/.ssh/config to define cluster-specific settings:

# Example SSH config for a SLURM cluster
Host my-slurm-cluster
    HostName slurm.university.edu
    User myusername
    IdentityFile ~/.ssh/clustrix_key
    ForwardAgent yes
    ServerAliveInterval 60
    ServerAliveCountMax 3

# Example for a PBS cluster with specific port
Host my-pbs-cluster
    HostName pbs.cluster.org
    Port 2222
    User researcher
    IdentityFile ~/.ssh/clustrix_key
    ProxyJump gateway.cluster.org

# Example for SGE cluster with compression
Host my-sge-cluster
    HostName sge.hpc.gov
    User scientist
    IdentityFile ~/.ssh/clustrix_key
    Compression yes
    TCPKeepAlive yes
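
If you maintain entries for several clusters, generating the Host blocks from data can keep them consistent. This is a small sketch, not a Clustrix feature: render_host_block is a hypothetical helper that formats a dictionary of ssh_config options into one stanza, which you could then append to ~/.ssh/config yourself.

```python
def render_host_block(alias, options):
    """Render one `Host` stanza for ~/.ssh/config from a dict of options."""
    lines = [f"Host {alias}"]
    for key, value in options.items():
        # ssh_config expects yes/no rather than Python booleans
        if isinstance(value, bool):
            value = "yes" if value else "no"
        lines.append(f"    {key} {value}")
    return "\n".join(lines) + "\n"


block = render_host_block(
    "my-slurm-cluster",
    {
        "HostName": "slurm.university.edu",
        "User": "myusername",
        "IdentityFile": "~/.ssh/clustrix_key",
        "ServerAliveInterval": 60,
        "ServerAliveCountMax": 3,
    },
)
print(block)
```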

4. Test SSH Connection

Verify your SSH setup works correctly:

# Test basic connection
ssh my-slurm-cluster "hostname && whoami"

# Test specific commands that Clustrix will use
ssh my-slurm-cluster "which sbatch && squeue --version"
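
The same checks can be scripted. The sketch below wraps the ssh command line in Python with BatchMode=yes, so a missing key fails immediately instead of waiting at a password prompt, which is closer to how an unattended tool would connect. The helper names build_ssh_check and check_connection are illustrative only and do not come from Clustrix.

```python
import subprocess


def build_ssh_check(host, remote_command="hostname && whoami", timeout=10):
    """Build a non-interactive ssh command line for a quick connectivity check."""
    return [
        "ssh",
        "-o", "BatchMode=yes",              # fail instead of prompting for a password
        "-o", f"ConnectTimeout={timeout}",  # give up if the host does not answer
        host,
        remote_command,
    ]


def check_connection(host, remote_command="hostname && whoami", timeout=10):
    """Run the check and return (ok, output). Needs a reachable host to succeed."""
    try:
        proc = subprocess.run(
            build_ssh_check(host, remote_command, timeout),
            capture_output=True,
            text=True,
            timeout=timeout + 5,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, str(exc)
    output = proc.stdout if proc.returncode == 0 else proc.stderr
    return proc.returncode == 0, output.strip()


# Example (requires the SSH config alias defined earlier):
# ok, output = check_connection("my-slurm-cluster", "which sbatch")
```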

Clustrix Configuration

Configure Clustrix to use your SSH setup:

Python Configuration

from clustrix import configure

# Using SSH config host alias (recommended)
configure(
    cluster_type="slurm",
    cluster_host="my-slurm-cluster",  # matches SSH config
    username="myusername",           # optional if in SSH config
)

# Using specific key file
configure(
    cluster_type="pbs",
    cluster_host="pbs.cluster.org",
    username="researcher",
    key_file="~/.ssh/clustrix_key"
)

# Using password authentication (not recommended)
configure(
    cluster_type="sge",
    cluster_host="sge.hpc.gov",
    username="scientist",
    password="your_password"  # Use environment variable instead
)

Configuration File

Create ~/.clustrix/config.yml:

# SLURM cluster with SSH key
cluster_type: "slurm"
cluster_host: "my-slurm-cluster"
username: "myusername"
key_file: "~/.ssh/clustrix_key"
remote_work_dir: "/scratch/myusername/clustrix"

# Default resource settings
default_cores: 4
default_memory: "8GB"
default_time: "02:00:00"

# Environment setup
module_loads:
  - "python/3.11"
  - "gcc/11.2"

environment_variables:
  OMP_NUM_THREADS: "4"

Environment Variables

Set environment variables for sensitive information:

# In your shell profile (~/.bashrc, ~/.zshrc)
export CLUSTRIX_HOST="slurm.university.edu"
export CLUSTRIX_USERNAME="myusername"
export CLUSTRIX_KEY_FILE="$HOME/.ssh/clustrix_key"  # "~" is not expanded inside double quotes

# For password auth (not recommended in scripts)
export CLUSTRIX_PASSWORD="your_password"
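
This guide does not say whether Clustrix reads these variables automatically, so the sketch below takes the conservative route and collects them explicitly into keyword arguments for configure(). The helper settings_from_env is hypothetical; the keyword names (cluster_host, username, key_file, password) are the ones used in the configure() examples in this guide.

```python
import os


def settings_from_env(env=None):
    """Collect CLUSTRIX_* variables into keyword arguments for configure()."""
    env = os.environ if env is None else env
    mapping = {
        "CLUSTRIX_HOST": "cluster_host",
        "CLUSTRIX_USERNAME": "username",
        "CLUSTRIX_KEY_FILE": "key_file",
        "CLUSTRIX_PASSWORD": "password",
    }
    # Skip variables that are unset or empty
    settings = {kwarg: env[name] for name, kwarg in mapping.items() if env.get(name)}
    if "key_file" in settings:
        # A "~" written inside quotes reaches Python unexpanded, so expand it here
        settings["key_file"] = os.path.expanduser(settings["key_file"])
    return settings


# Example, assuming the variables above are exported:
# from clustrix import configure
# configure(cluster_type="slurm", **settings_from_env())
```

Keeping the password out of the source file this way means it never lands in version control, although key-based authentication remains the better option.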

Advanced SSH Configurations

SSH Agent Integration

For managing multiple keys and passphrases:

# Start SSH agent
eval "$(ssh-agent -s)"

# Add your cluster key
ssh-add ~/.ssh/clustrix_key

# Verify keys are loaded
ssh-add -l

Configure Clustrix to use SSH agent:

configure(
    cluster_type="slurm",
    cluster_host="my-cluster",
    username="myuser"
    # No key_file specified - will use SSH agent
)
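
Because no key_file is given in this mode, a connection can only succeed if the agent actually holds a key. The following sketch checks that from Python by running ssh-add -l and parsing its output; it assumes the usual "bits fingerprint comment (TYPE)" line format. The helpers parse_agent_keys and agent_keys are illustrative names, not Clustrix functions.

```python
import os
import subprocess


def parse_agent_keys(listing):
    """Parse `ssh-add -l` output into (bits, fingerprint, comment, key_type) tuples."""
    keys = []
    for line in listing.splitlines():
        parts = line.split()
        # Expected shape: "<bits> <fingerprint> <comment...> (<TYPE>)"
        if len(parts) < 3 or not parts[0].isdigit():
            continue
        key_type = parts[-1].strip("()")
        comment = " ".join(parts[2:-1])
        keys.append((int(parts[0]), parts[1], comment, key_type))
    return keys


def agent_keys():
    """Return the keys held by the running agent, or [] if none or no agent."""
    if not os.environ.get("SSH_AUTH_SOCK"):
        return []  # no agent advertised in this shell
    try:
        proc = subprocess.run(["ssh-add", "-l"], capture_output=True, text=True)
    except OSError:
        return []  # ssh-add is not installed
    # ssh-add exits 0 when keys are listed and non-zero otherwise
    return parse_agent_keys(proc.stdout) if proc.returncode == 0 else []
```

Calling agent_keys() before configure() gives an early, readable failure ("no keys loaded") rather than an authentication error later on.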

Jump Hosts and Bastion Servers

For clusters behind firewalls:

# SSH config with jump host
Host cluster-gateway
    HostName gateway.cluster.org
    User myuser
    IdentityFile ~/.ssh/gateway_key

Host cluster-internal
    HostName internal.cluster.local
    User myuser
    IdentityFile ~/.ssh/cluster_key
    ProxyJump cluster-gateway

# Clustrix configuration for jump host setup
configure(
    cluster_type="slurm",
    cluster_host="cluster-internal",  # Uses SSH config
    username="myuser"
)

Multiple Cluster Management

Manage multiple clusters with different keys:

# Define cluster configurations
clusters = {
    "slurm_cluster": {
        "cluster_type": "slurm",
        "cluster_host": "slurm.university.edu",
        "key_file": "~/.ssh/slurm_key"
    },
    "pbs_cluster": {
        "cluster_type": "pbs",
        "cluster_host": "pbs.research.org",
        "key_file": "~/.ssh/pbs_key"
    }
}

# Switch between clusters
from clustrix import configure, cluster

# Use SLURM cluster
configure(**clusters["slurm_cluster"])

@cluster(cores=8)
def slurm_task():
    return "Running on SLURM"

# Switch to PBS cluster
configure(**clusters["pbs_cluster"])

@cluster(cores=4)
def pbs_task():
    return "Running on PBS"

Security Best Practices

Key Management

  1. Use Strong Passphrases: Always protect private keys with passphrases

  2. Separate Keys per Cluster: Use different keys for different environments

  3. Regular Key Rotation: Replace keys periodically (every 6-12 months)

  4. Backup Keys Securely: Store encrypted backups of important keys
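
To make the rotation guideline actionable, a script can flag keys that are past a chosen age. This sketch uses the key file's modification time as a proxy for its creation date, which is an assumption: copying or restoring a key resets that timestamp. The function names and the 365-day default are illustrative.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def age_in_days(mtime, now):
    """Convert a pair of POSIX timestamps into an age in days."""
    return (now - mtime) / SECONDS_PER_DAY


def key_age_days(path, now=None):
    """Return the age of a key file in days, based on its modification time."""
    now = time.time() if now is None else now
    mtime = Path(path).expanduser().stat().st_mtime
    return age_in_days(mtime, now)


def needs_rotation(path, max_age_days=365, now=None):
    """True when the key file is older than max_age_days."""
    return key_age_days(path, now) > max_age_days


# Example: needs_rotation("~/.ssh/clustrix_key", max_age_days=180)
```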

File Permissions

Ensure correct SSH file permissions:

# Set correct permissions
chmod 700 ~/.ssh
chmod 600 ~/.ssh/config
chmod 600 ~/.ssh/clustrix_key
chmod 644 ~/.ssh/clustrix_key.pub
chmod 600 ~/.ssh/authorized_keys  # on remote cluster
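
A quick audit can confirm the local permissions without changing anything. The sketch below mirrors the chmod values above and reports any path that grants more access than expected; it only inspects the local machine, so authorized_keys on the cluster is not covered. EXPECTED_MODES, extra_bits and audit_permissions are hypothetical names.

```python
import stat
from pathlib import Path

# Maximum permission bits per path, mirroring the chmod commands above
EXPECTED_MODES = {
    "~/.ssh": 0o700,
    "~/.ssh/config": 0o600,
    "~/.ssh/clustrix_key": 0o600,
    "~/.ssh/clustrix_key.pub": 0o644,
}


def extra_bits(mode, allowed):
    """Return the permission bits set in mode that allowed does not permit."""
    return stat.S_IMODE(mode) & ~allowed


def audit_permissions(expected=EXPECTED_MODES):
    """Return {path: actual_mode} for existing paths that are too permissive."""
    problems = {}
    for name, allowed in expected.items():
        path = Path(name).expanduser()
        if not path.exists():
            continue  # nothing to check for this entry
        mode = path.stat().st_mode
        if extra_bits(mode, allowed):
            problems[str(path)] = oct(stat.S_IMODE(mode))
    return problems


# Example: print(audit_permissions())  # an empty dict means nothing is too open
```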

Network Security

  1. Use SSH Config: Centralize connection settings

  2. Enable Compression: For large data transfers

  3. Configure Timeouts: Prevent hanging connections

  4. Use Port Forwarding: For additional services when needed

Troubleshooting

Common Issues

Connection Refused

# Show verbose output to see at which stage the connection fails
ssh -v username@cluster.hostname.edu

# Try a non-default port if your cluster uses one (2222 is only an example)
ssh -p 2222 username@cluster.hostname.edu
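
A refused connection often means nothing is listening on the port at all, or a firewall is in the way. As a rough first check, independent of keys and SSH configuration, the following sketch tests whether a TCP connection to the SSH port can be opened. port_open is an illustrative helper; the host name in the example is the placeholder used throughout this guide.

```python
import socket


def port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures
        return False


# Example: port_open("cluster.hostname.edu", 22)
```

If this returns False for every candidate port, the problem is at the network level and no change to keys or ~/.ssh/config will help.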

Permission Denied

# Verify key permissions
ls -la ~/.ssh/

# Check authorized_keys on remote
ssh username@cluster.hostname.edu "ls -la ~/.ssh/authorized_keys"

# Debug SSH connection
ssh -vvv username@cluster.hostname.edu

Clustrix Connection Errors

# Test Clustrix SSH connection
from clustrix.executor import ClusterExecutor
from clustrix.config import get_config

config = get_config()
executor = ClusterExecutor(config)

try:
    executor.connect()
    print("SSH connection successful!")
    executor.disconnect()
except Exception as e:
    print(f"Connection failed: {e}")

Debug Mode

Enable verbose logging for troubleshooting:

import logging
logging.basicConfig(level=logging.DEBUG)

from clustrix import configure, cluster

configure(
    cluster_type="slurm",
    cluster_host="my-cluster",
    username="myuser"
)

Cluster-Specific Setup

SLURM Clusters

Typical SLURM cluster requirements:

configure(
    cluster_type="slurm",
    cluster_host="slurm.hpc.edu",
    username="researcher",
    key_file="~/.ssh/slurm_key",
    remote_work_dir="/scratch/researcher/jobs",
    module_loads=["python/3.11", "gcc/11"],
    default_partition="compute"
)

PBS/Torque Clusters

PBS cluster configuration:

configure(
    cluster_type="pbs",
    cluster_host="pbs.cluster.org",
    username="scientist",
    key_file="~/.ssh/pbs_key",
    remote_work_dir="/home/scientist/clustrix",
    default_queue="normal"
)

SGE Clusters

Sun Grid Engine setup:

configure(
    cluster_type="sge",
    cluster_host="sge.grid.edu",
    username="user",
    key_file="~/.ssh/sge_key",
    remote_work_dir="/tmp/user/clustrix"
)

Complete Example

Here’s a complete working example:

# 1. Generate SSH key
ssh-keygen -t rsa -b 4096 -f ~/.ssh/my_cluster_key

# 2. Copy to cluster
ssh-copy-id -i ~/.ssh/my_cluster_key.pub myuser@cluster.university.edu

# 3. Create SSH config
echo "Host my-cluster
    HostName cluster.university.edu
    User myuser
    IdentityFile ~/.ssh/my_cluster_key" >> ~/.ssh/config

# 4. Test connection
ssh my-cluster "hostname"

# 5. Configure Clustrix
from clustrix import configure, cluster

configure(
    cluster_type="slurm",
    cluster_host="my-cluster",
    remote_work_dir="/scratch/myuser/clustrix"
)

# 6. Use Clustrix
@cluster(cores=4, memory="8GB")
def compute_task(n):
    return sum(i**2 for i in range(n))

# This will execute on the remote cluster
result = compute_task(1000)
print(f"Result: {result}")

This setup provides secure, reliable SSH access for Clustrix to manage your cluster computing jobs.