SSH Key Setup for Remote Clusters¶
Clustrix requires SSH access to remote clusters for job submission and management. This guide provides detailed instructions for setting up SSH keys with different cluster environments and authentication methods.
Overview¶
Clustrix supports multiple authentication methods:
SSH Key Authentication (Recommended): Most secure and convenient
Password Authentication: Simple but less secure
SSH Agent: For multiple key management
Custom Key Files: For specific cluster configurations
SSH Key Authentication Setup¶
1. Generate SSH Key Pair¶
Generate a new SSH key pair specifically for cluster access:
# Generate RSA key (recommended for compatibility)
ssh-keygen -t rsa -b 4096 -f ~/.ssh/clustrix_key
# Or generate an Ed25519 key (smaller and faster; requires OpenSSH 6.5 or later on both ends)
ssh-keygen -t ed25519 -f ~/.ssh/clustrix_ed25519
Important: Use a strong passphrase for additional security.
2. Copy Public Key to Cluster¶
Upload your public key to the cluster:
# Using ssh-copy-id (easiest method)
ssh-copy-id -i ~/.ssh/clustrix_key.pub username@cluster.hostname.edu
# Manual method if ssh-copy-id is not available
cat ~/.ssh/clustrix_key.pub | ssh username@cluster.hostname.edu "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
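Note that the manual `cat >>` method appends the key every time it runs, so repeated runs leave duplicate entries in `authorized_keys`. The following minimal sketch shows the append-once logic on plain strings. `add_public_key` is a hypothetical helper written for illustration; it is not part of Clustrix or OpenSSH, and the key text in the usage line is a made-up placeholder.

```python
def add_public_key(authorized_keys_text, public_key_line):
    """Return authorized_keys content with the key appended at most once."""
    new_key = public_key_line.strip()
    existing = [line.strip() for line in authorized_keys_text.splitlines()]
    if new_key in existing:
        return authorized_keys_text  # already present; nothing to do
    prefix = authorized_keys_text
    if prefix and not prefix.endswith("\n"):
        prefix += "\n"  # make sure the previous entry is terminated
    return prefix + new_key + "\n"


# Placeholder key text, for illustration only.
EXAMPLE_KEY = "ssh-ed25519 AAAAEXAMPLEKEY user@laptop"
once = add_public_key("", EXAMPLE_KEY)
twice = add_public_key(once, EXAMPLE_KEY)
print(once == twice)
```

Running the function a second time with the same key leaves the content unchanged.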
3. Configure SSH Client¶
Create or edit ~/.ssh/config to define cluster-specific settings:
# Example SSH config for a SLURM cluster
Host my-slurm-cluster
    HostName slurm.university.edu
    User myusername
    IdentityFile ~/.ssh/clustrix_key
    ForwardAgent yes
    ServerAliveInterval 60
    ServerAliveCountMax 3

# Example for a PBS cluster with specific port
Host my-pbs-cluster
    HostName pbs.cluster.org
    Port 2222
    User researcher
    IdentityFile ~/.ssh/clustrix_key
    ProxyJump gateway.cluster.org

# Example for SGE cluster with compression
Host my-sge-cluster
    HostName sge.hpc.gov
    User scientist
    IdentityFile ~/.ssh/clustrix_key
    Compression yes
    TCPKeepAlive yes
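To make the alias lookup concrete, the following minimal sketch parses text in the format above and returns the settings stored under one alias. It is illustrative only: `parse_ssh_config` is a hypothetical helper, it handles only exact `Host` names with `Keyword value` lines (no wildcards, `Match` blocks, `Include`, or `Keyword=value` syntax), and it is not how OpenSSH or Clustrix read the file.

```python
def parse_ssh_config(text):
    """Parse simple 'Host' blocks into {alias: {lowercased keyword: value}}."""
    hosts = {}
    current = []  # option dicts for the aliases named on the latest Host line
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "host":
            current = [hosts.setdefault(alias, {}) for alias in value.split()]
        else:
            for options in current:
                # Keep the first value seen for a keyword, as OpenSSH does.
                options.setdefault(keyword, value)
    return hosts


SAMPLE = """
# Example for a PBS cluster with specific port
Host my-pbs-cluster
    HostName pbs.cluster.org
    Port 2222
    User researcher
    IdentityFile ~/.ssh/clustrix_key
    ProxyJump gateway.cluster.org
"""

settings = parse_ssh_config(SAMPLE)["my-pbs-cluster"]
print(settings["hostname"], settings["port"], settings["user"])
```

For the real, fully resolved settings of an alias, `ssh -G my-pbs-cluster` prints what the OpenSSH client would actually use.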
4. Test SSH Connection¶
Verify your SSH setup works correctly:
# Test basic connection
ssh my-slurm-cluster "hostname && whoami"
# Test specific commands that Clustrix will use
ssh my-slurm-cluster "which sbatch && squeue --version"
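The same checks can be scripted. This is a minimal sketch, assuming the `ssh` client is on your `PATH` and the alias exists in `~/.ssh/config`. `build_check_command` and `check_cluster` are hypothetical helper names, not part of Clustrix. `BatchMode=yes` makes `ssh` fail instead of prompting, so a broken key setup shows up as a non-zero exit code.

```python
import subprocess


def build_check_command(host_alias, remote_command, timeout_seconds=10):
    """Return the argument list for a non-interactive ssh check."""
    return [
        "ssh",
        "-o", "BatchMode=yes",  # never prompt; fail if key auth does not work
        "-o", f"ConnectTimeout={timeout_seconds}",
        host_alias,
        remote_command,
    ]


def check_cluster(host_alias, remote_command="hostname && whoami"):
    """Run the check and return (succeeded, combined output)."""
    result = subprocess.run(
        build_check_command(host_alias, remote_command),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, (result.stdout + result.stderr).strip()
```

A call such as `ok, output = check_cluster("my-slurm-cluster", "which sbatch")` then reports whether the scheduler command is reachable. If the key has a passphrase, load it into an SSH agent first, because `BatchMode` also disables the passphrase prompt.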
Clustrix Configuration¶
Configure Clustrix to use your SSH setup:
Python Configuration¶
import os

from clustrix import configure

# Using SSH config host alias (recommended)
configure(
    cluster_type="slurm",
    cluster_host="my-slurm-cluster",  # matches SSH config
    username="myusername",  # optional if in SSH config
)

# Using specific key file
configure(
    cluster_type="pbs",
    cluster_host="pbs.cluster.org",
    username="researcher",
    key_file="~/.ssh/clustrix_key",
)

# Using password authentication (not recommended)
configure(
    cluster_type="sge",
    cluster_host="sge.hpc.gov",
    username="scientist",
    password=os.environ["CLUSTRIX_PASSWORD"],  # read from the environment, never hardcode
)
Configuration File¶
Create ~/.clustrix/config.yml:
# SLURM cluster with SSH key
cluster_type: "slurm"
cluster_host: "my-slurm-cluster"
username: "myusername"
key_file: "~/.ssh/clustrix_key"
remote_work_dir: "/scratch/myusername/clustrix"
# Default resource settings
default_cores: 4
default_memory: "8GB"
default_time: "02:00:00"
# Environment setup
module_loads:
  - "python/3.11"
  - "gcc/11.2"
environment_variables:
  OMP_NUM_THREADS: "4"
Environment Variables¶
Set environment variables to keep connection details and credentials out of your code:
# In your shell profile (~/.bashrc, ~/.zshrc)
export CLUSTRIX_HOST="slurm.university.edu"
export CLUSTRIX_USERNAME="myusername"
export CLUSTRIX_KEY_FILE="$HOME/.ssh/clustrix_key"  # "~" is not expanded inside double quotes
# For password auth (not recommended: this stores the password in plain text in your profile)
export CLUSTRIX_PASSWORD="your_password"
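If you prefer to read these variables explicitly in your own script, the following minimal sketch collects them into keyword arguments. `settings_from_env` is a hypothetical helper, and the mapping from variable names to keyword names is an assumption based on the `configure()` examples in this guide.

```python
import os


def settings_from_env(environ=None):
    """Collect CLUSTRIX_* variables into a dict of keyword arguments."""
    environ = os.environ if environ is None else environ
    # Variable-to-keyword mapping: an illustrative assumption, not a Clustrix API.
    mapping = {
        "CLUSTRIX_HOST": "cluster_host",
        "CLUSTRIX_USERNAME": "username",
        "CLUSTRIX_KEY_FILE": "key_file",
        "CLUSTRIX_PASSWORD": "password",
    }
    settings = {}
    for variable, keyword in mapping.items():
        value = environ.get(variable)
        if value:  # skip unset or empty variables
            settings[keyword] = value
    if "key_file" in settings:
        # Expand a leading "~" in case the shell stored it literally.
        settings["key_file"] = os.path.expanduser(settings["key_file"])
    return settings
```

The result can be passed on as `configure(cluster_type="slurm", **settings_from_env())`. Unset variables are left out, so the password keyword only appears when `CLUSTRIX_PASSWORD` is defined.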
Advanced SSH Configurations¶
SSH Agent Integration¶
For managing multiple keys and passphrases:
# Start SSH agent
eval "$(ssh-agent -s)"
# Add your cluster key
ssh-add ~/.ssh/clustrix_key
# Verify keys are loaded
ssh-add -l
Configure Clustrix to use SSH agent:
configure(
    cluster_type="slurm",
    cluster_host="my-cluster",
    username="myuser",
    # No key_file specified - will use SSH agent
)
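Before relying on the agent, it can help to confirm that one is reachable from the process that runs your script. This is a minimal sketch, assuming the standard `SSH_AUTH_SOCK` variable and the `ssh-add` command. `agent_socket` and `loaded_key_count` are hypothetical helper names.

```python
import os
import subprocess


def agent_socket(environ=None):
    """Return the agent socket path from SSH_AUTH_SOCK, or None if unset."""
    environ = os.environ if environ is None else environ
    return environ.get("SSH_AUTH_SOCK") or None


def loaded_key_count():
    """Count the keys listed by 'ssh-add -l'; 0 if no agent or no keys."""
    if agent_socket() is None:
        return 0
    try:
        result = subprocess.run(["ssh-add", "-l"], capture_output=True, text=True)
    except FileNotFoundError:
        return 0  # ssh-add is not installed
    if result.returncode != 0:
        return 0  # non-zero exit: no identities loaded or agent unreachable
    return len(result.stdout.strip().splitlines())
```

If `loaded_key_count()` returns 0, run `ssh-add ~/.ssh/clustrix_key` in the same shell session before starting your script.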
Jump Hosts and Bastion Servers¶
For clusters behind firewalls:
# SSH config with jump host
Host cluster-gateway
    HostName gateway.cluster.org
    User myuser
    IdentityFile ~/.ssh/gateway_key

Host cluster-internal
    HostName internal.cluster.local
    User myuser
    IdentityFile ~/.ssh/cluster_key
    ProxyJump cluster-gateway

# Clustrix configuration for jump host setup (Python)
configure(
    cluster_type="slurm",
    cluster_host="cluster-internal",  # Uses SSH config
    username="myuser",
)
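The hop order implied by `ProxyJump` can be sketched in a few lines. The example below is illustrative only: `jump_chain` is a hypothetical helper, the `HOSTS` dictionary is a hand-written copy of the two entries above, and only a single alias per `ProxyJump` is handled (OpenSSH also accepts comma-separated hops and `user@host:port` forms).

```python
def jump_chain(hosts, alias, max_hops=10):
    """Return aliases in connection order, ending with the target alias."""
    chain = [alias]
    current = alias
    while True:
        jump = hosts.get(current, {}).get("ProxyJump")
        if jump is None:
            break
        if jump in chain or len(chain) > max_hops:
            raise ValueError(f"ProxyJump loop or too many hops at {jump!r}")
        chain.append(jump)
        current = jump
    return list(reversed(chain))


HOSTS = {
    "cluster-gateway": {"HostName": "gateway.cluster.org", "User": "myuser"},
    "cluster-internal": {
        "HostName": "internal.cluster.local",
        "User": "myuser",
        "ProxyJump": "cluster-gateway",
    },
}

print(" -> ".join(jump_chain(HOSTS, "cluster-internal")))
```

The connection goes through `cluster-gateway` first, so the key for the gateway and the key for the internal host must both be usable from your local machine.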
Multiple Cluster Management¶
Manage multiple clusters with different keys:
# Define cluster configurations
clusters = {
    "slurm_cluster": {
        "cluster_type": "slurm",
        "cluster_host": "slurm.university.edu",
        "key_file": "~/.ssh/slurm_key",
    },
    "pbs_cluster": {
        "cluster_type": "pbs",
        "cluster_host": "pbs.research.org",
        "key_file": "~/.ssh/pbs_key",
    },
}

# Switch between clusters
from clustrix import configure, cluster

# Use SLURM cluster
configure(**clusters["slurm_cluster"])

@cluster(cores=8)
def slurm_task():
    return "Running on SLURM"

# Switch to PBS cluster
configure(**clusters["pbs_cluster"])

@cluster(cores=4)
def pbs_task():
    return "Running on PBS"
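When several clusters share settings such as the username, it can be convenient to keep those in one place and merge them with the per-cluster entries. This is a minimal sketch of that pattern. `settings_for` is a hypothetical helper, and the shared values are placeholders in the style of the examples in this guide.

```python
# Settings common to every cluster (placeholder values).
SHARED = {"username": "myuser", "remote_work_dir": "/scratch/myuser/clustrix"}

# Per-cluster settings; these override SHARED on conflict.
CLUSTERS = {
    "slurm_cluster": {
        "cluster_type": "slurm",
        "cluster_host": "slurm.university.edu",
        "key_file": "~/.ssh/slurm_key",
    },
    "pbs_cluster": {
        "cluster_type": "pbs",
        "cluster_host": "pbs.research.org",
        "key_file": "~/.ssh/pbs_key",
    },
}


def settings_for(name, clusters=CLUSTERS, shared=SHARED):
    """Return the merged keyword arguments for one named cluster."""
    if name not in clusters:
        known = ", ".join(sorted(clusters))
        raise KeyError(f"unknown cluster {name!r}; known clusters: {known}")
    return {**shared, **clusters[name]}
```

Switching clusters then becomes `configure(**settings_for("pbs_cluster"))`, and a misspelled name fails early with a message that lists the known clusters.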
Security Best Practices¶
Key Management¶
Use Strong Passphrases: Always protect private keys with passphrases
Separate Keys per Cluster: Use different keys for different environments
Regular Key Rotation: Replace keys periodically (every 6-12 months)
Backup Keys Securely: Store encrypted backups of important keys
File Permissions¶
Ensure correct SSH file permissions:
# Set correct permissions
chmod 700 ~/.ssh
chmod 600 ~/.ssh/config
chmod 600 ~/.ssh/clustrix_key
chmod 644 ~/.ssh/clustrix_key.pub
chmod 600 ~/.ssh/authorized_keys # on remote cluster
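The local permissions can also be checked from Python. This is a minimal sketch, assuming the file names used in this guide. `permission_problems` is a hypothetical helper that reports any path granting more access than the expected mode, and it treats a stricter mode (for example 400 on a private key) as acceptable.

```python
import os
import stat


def permission_problems(expected):
    """Return (path, actual_mode, expected_mode) for paths that are too open.

    'expected' maps a path to an octal mode such as 0o600.
    Paths that do not exist are skipped.
    """
    problems = []
    for path, wanted in expected.items():
        full_path = os.path.expanduser(path)
        if not os.path.exists(full_path):
            continue
        actual = stat.S_IMODE(os.stat(full_path).st_mode)
        if actual & ~wanted:  # any permission bit beyond the expected ones
            problems.append((path, actual, wanted))
    return problems


EXPECTED = {
    "~/.ssh": 0o700,
    "~/.ssh/config": 0o600,
    "~/.ssh/clustrix_key": 0o600,
    "~/.ssh/clustrix_key.pub": 0o644,
}

for path, actual, wanted in permission_problems(EXPECTED):
    print(f"{path}: mode {actual:03o}, expected at most {wanted:03o}")
```

No output means every existing path is at least as restrictive as the modes listed above.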
Network Security¶
Use SSH Config: Centralize connection settings
Enable Compression: For large data transfers
Configure Timeouts: Prevent hanging connections
Use Port Forwarding: For additional services when needed
Troubleshooting¶
Common Issues¶
Connection Refused
# Use verbose output to see whether the server answers on the default port (22)
ssh -v username@cluster.hostname.edu
# Test different ports
ssh -p 2222 username@cluster.hostname.edu
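A plain TCP check separates network problems from SSH authentication problems: if the port does not accept connections at all, no key or password setting will help. This is a minimal sketch using only the standard library. `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port=22, timeout_seconds=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `port_is_open("cluster.hostname.edu")` checks port 22 and `port_is_open("cluster.hostname.edu", 2222)` checks the alternative port. A `False` result points to a firewall, a VPN requirement, a wrong port, or a stopped SSH service rather than a key problem.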
Permission Denied
# Verify key permissions
ls -la ~/.ssh/
# Check authorized_keys on remote
ssh username@cluster.hostname.edu "ls -la ~/.ssh/authorized_keys"
# Debug SSH connection
ssh -vvv username@cluster.hostname.edu
Clustrix Connection Errors
# Test Clustrix SSH connection
from clustrix.executor import ClusterExecutor
from clustrix.config import get_config

config = get_config()
executor = ClusterExecutor(config)
try:
    executor.connect()
    print("SSH connection successful!")
    executor.disconnect()
except Exception as e:
    print(f"Connection failed: {e}")
Debug Mode
Enable verbose logging for troubleshooting:
import logging
logging.basicConfig(level=logging.DEBUG)

from clustrix import configure, cluster

configure(
    cluster_type="slurm",
    cluster_host="my-cluster",
    username="myuser",
)
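Setting the root logger to `DEBUG` also enables debug output from every other library in the process. As an alternative, the following minimal sketch raises the level for selected loggers only. `enable_debug` is a hypothetical helper, and the logger name `"clustrix"` is an assumption based on the usual convention of naming loggers after the package; adjust it if your installation logs under a different name.

```python
import logging


def enable_debug(logger_names=("clustrix",), log_file=None):
    """Enable DEBUG output for the named loggers and return the handler used."""
    handler = logging.FileHandler(log_file) if log_file else logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    for name in logger_names:
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)
    return handler
```

Call `enable_debug()` once before `configure(...)`. Passing `log_file="clustrix-debug.log"` writes the messages to a file that you can attach to a bug report.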
Cluster-Specific Setup¶
SLURM Clusters¶
A typical SLURM cluster configuration:
configure(
    cluster_type="slurm",
    cluster_host="slurm.hpc.edu",
    username="researcher",
    key_file="~/.ssh/slurm_key",
    remote_work_dir="/scratch/researcher/jobs",
    module_loads=["python/3.11", "gcc/11"],
    default_partition="compute",
)
PBS/Torque Clusters¶
PBS cluster configuration:
configure(
    cluster_type="pbs",
    cluster_host="pbs.cluster.org",
    username="scientist",
    key_file="~/.ssh/pbs_key",
    remote_work_dir="/home/scientist/clustrix",
    default_queue="normal",
)
SGE Clusters¶
Sun Grid Engine setup:
configure(
    cluster_type="sge",
    cluster_host="sge.grid.edu",
    username="user",
    key_file="~/.ssh/sge_key",
    remote_work_dir="/tmp/user/clustrix",
)
Complete Example¶
Here’s a complete working example:
# 1. Generate SSH key
ssh-keygen -t rsa -b 4096 -f ~/.ssh/my_cluster_key
# 2. Copy to cluster
ssh-copy-id -i ~/.ssh/my_cluster_key.pub myuser@cluster.university.edu
# 3. Create SSH config
echo "Host my-cluster
    HostName cluster.university.edu
    User myuser
    IdentityFile ~/.ssh/my_cluster_key" >> ~/.ssh/config
# 4. Test connection
ssh my-cluster "hostname"

# 5. Configure Clustrix (steps 5 and 6 are Python, not shell)
from clustrix import configure, cluster

configure(
    cluster_type="slurm",
    cluster_host="my-cluster",
    remote_work_dir="/scratch/myuser/clustrix",
)

# 6. Use Clustrix
@cluster(cores=4, memory="8GB")
def compute_task(n):
    return sum(i**2 for i in range(n))

# This will execute on the remote cluster
result = compute_task(1000)
print(f"Result: {result}")
This setup provides secure, reliable SSH access for Clustrix to manage your cluster computing jobs.