
Trainwave CLI Guide

The Trainwave Command Line Interface (CLI) is your primary tool for managing machine learning jobs from your terminal. This guide covers all available commands with practical examples.

Installation

Install the CLI using pip:

pip install trainwave-cli

Verify the installation:

wave --version

Quick Start

# Log in to your account
wave auth login
 
# Create a new configuration
wave config
 
# Launch a training job
wave jobs launch
 
# Monitor the job
wave jobs logs -f

Authentication

Login

# Interactive browser login (recommended)
wave auth login
 
# Log in with an API key
wave auth set-token your-api-key
 
# Verify login
wave auth whoami

Token Management

# Create a new API token
wave auth create-token --name "CI/CD Token"
 
# List active tokens
wave auth list-tokens
 
# Revoke a token
wave auth revoke-token token-id

Job Management

Launching Jobs

# Launch with default config
wave jobs launch
 
# Launch with specific config file
wave jobs launch --config custom-config.toml
 
# Launch with overrides
wave jobs launch --gpu-type "RTX A5000" --gpus 2
 
# Launch in a specific project
wave jobs launch --project p-abc123

Monitoring Jobs

# List all jobs
wave jobs list
 
# List running jobs
wave jobs list --status running
 
# List jobs for a specific project
wave jobs list --project p-abc123
 
# Get detailed job status
wave jobs status j-xyz789
 
# Stream logs in real-time
wave jobs logs -f j-xyz789
 
# View GPU metrics
wave jobs metrics j-xyz789 --type gpu

Job Control

# Cancel a job
wave jobs cancel j-xyz789
 
# Stop a job gracefully
wave jobs stop j-xyz789
 
# Restart a failed job
wave jobs restart j-xyz789

Configuration

Interactive Configuration

# Open web UI for configuration
wave config
 
# Create config in specific location
wave config --output custom-config.toml

Validate Configuration

# Validate current config
wave config validate
 
# Validate specific config file
wave config validate --file custom-config.toml

Resource Management

GPU Management

# List available GPU types
wave gpus list
 
# Show GPU pricing
wave gpus pricing
 
# Check GPU availability
wave gpus availability

Storage Management

# List artifacts
wave storage list
 
# Download artifacts
wave storage download j-xyz789 --output ./artifacts
 
# Clean up old artifacts
wave storage cleanup --older-than 30d

Secrets Management

Setting Secrets

# Set a single secret
wave secrets set API_KEY=xyz123
 
# Set multiple secrets
wave secrets set \
  WANDB_API_KEY=abc123 \
  HUGGINGFACE_TOKEN=def456
 
# Set secret from file
wave secrets set-file GITHUB_KEY=./github_key.txt

Managing Secrets

# List all secrets
wave secrets list
 
# List secrets for specific project
wave secrets list --project p-abc123
 
# Delete a secret
wave secrets unset API_KEY
 
# Rotate a secret
wave secrets rotate API_KEY

Project Management

Project Operations

# List projects
wave projects list
 
# Create new project
wave projects create "MNIST Training"
 
# Set default project
wave projects set-default p-abc123

Advanced Usage

Batch Operations

# Cancel all running jobs (assumes `wave jobs list` prints one job ID per line)
wave jobs list --status running | xargs -I {} wave jobs cancel {}
 
# Download artifacts from multiple jobs (same assumption)
wave jobs list --status completed | xargs -I {} wave storage download {}
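
If the list output includes a header row or extra columns, the pipelines above would pass non-ID text to xargs. A minimal sketch of a filter that keeps only ID-shaped tokens, assuming job IDs always look like `j-` followed by letters and digits (as in j-xyz789). `extract_job_ids` is a hypothetical helper, not part of the CLI:

```shell
# Hypothetical helper: keep only tokens shaped like job IDs (e.g. j-xyz789).
# Assumes IDs match j-[A-Za-z0-9]+; adjust the pattern if yours differ.
# Each ID is printed once, in order of first appearance.
extract_job_ids() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^j-[A-Za-z0-9]+$/ && !seen[$i]++) print $i
  }'
}

# Usage (not run here):
#   wave jobs list --status running | extract_job_ids | xargs -I {} wave jobs cancel {}
```

Any other column value that happens to match the same pattern (for example a job named j-test) would also pass the filter, so review the ID list before piping it into a destructive command.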

CI/CD Integration

# Non-interactive authentication
export TRAINWAVE_API_KEY=your-api-key
wave jobs launch --non-interactive
 
# Wait for job completion
wave jobs wait j-xyz789 --timeout 1h
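
In a pipeline it can help to fail early with a clear message when the API key is missing, rather than relying on whatever the CLI does in that case (which this guide does not specify). A minimal sketch; `require_api_key` and `ci_launch` are hypothetical wrapper names, and only the `--non-interactive` flag shown above is used:

```shell
# Hypothetical guard: succeed only if TRAINWAVE_API_KEY is set and non-empty.
require_api_key() {
  if [ -z "${TRAINWAVE_API_KEY:-}" ]; then
    echo "TRAINWAVE_API_KEY is not set" >&2
    return 1
  fi
}

# Hypothetical wrapper: check the key, then launch non-interactively.
# Extra arguments are passed through to `wave jobs launch` unchanged.
ci_launch() {
  require_api_key || return 1
  wave jobs launch --non-interactive "$@"
}

# Usage (not run here):
#   ci_launch --config train.toml
```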

Custom Scripts

# Get job info in JSON format
wave jobs status j-xyz789 --json
 
# Use with jq for parsing
wave jobs status j-xyz789 --json | jq .status

Environment Variables

The CLI respects the following environment variables:

| Variable | Description | Example |
| --- | --- | --- |
| TRAINWAVE_API_KEY | API key for authentication | export TRAINWAVE_API_KEY=xyz123 |
| TRAINWAVE_PROJECT | Default project ID | export TRAINWAVE_PROJECT=p-abc123 |
| TRAINWAVE_CONFIG | Custom config file path | export TRAINWAVE_CONFIG=./config.toml |
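
When debugging a script it is useful to see which of these variables are set in the current shell without printing the API key itself. A small sketch; `show_trainwave_env` is a hypothetical helper that only reads the three variables listed above and does not call the CLI:

```shell
# Hypothetical helper: report the Trainwave variables in the current shell.
# The API key value is never printed, only whether it is set.
show_trainwave_env() {
  if [ -n "${TRAINWAVE_API_KEY:-}" ]; then
    echo "TRAINWAVE_API_KEY=(set, hidden)"
  else
    echo "TRAINWAVE_API_KEY=(unset)"
  fi
  echo "TRAINWAVE_PROJECT=${TRAINWAVE_PROJECT:-(unset)}"
  echo "TRAINWAVE_CONFIG=${TRAINWAVE_CONFIG:-(unset)}"
}
```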

Common Workflows

1. Training Workflow

# Start training
JOB_ID=$(wave jobs launch --config train.toml --json | jq -r .id)
 
# Monitor progress
wave jobs logs -f "$JOB_ID"
 
# Download results
wave storage download "$JOB_ID" --output ./results

2. Experiment Management

# Launch multiple experiments
for lr in 0.001 0.01 0.1; do
  wave jobs launch --config base.toml \
    --env LEARNING_RATE=$lr \
    --name "experiment-lr-${lr}"
done
 
# Monitor all experiments
wave jobs list --format wide
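
The loop above can be wrapped in a reusable function that stops at the first failed launch. A minimal sketch, assuming `wave jobs launch` exits non-zero on failure (not confirmed in this guide); `launch_sweep` is a hypothetical name, and the flags are the ones used in the loop above:

```shell
# Hypothetical helper: launch one job per learning rate.
#   $1      config file
#   $2...   learning rates to sweep over
launch_sweep() {
  local config=$1
  shift
  local lr
  for lr in "$@"; do
    wave jobs launch --config "$config" \
      --env "LEARNING_RATE=$lr" \
      --name "experiment-lr-$lr" || {
        echo "launch failed for lr=$lr" >&2
        return 1
      }
  done
}

# Usage (not run here):
#   launch_sweep base.toml 0.001 0.01 0.1
```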

3. Resource Cleanup

# Cancel old running jobs
wave jobs list --status running --older-than 24h | \
  xargs -I {} wave jobs cancel {}
 
# Clean up old artifacts
wave storage cleanup --older-than 7d

Troubleshooting

Common Issues

  1. Authentication Failed

    # Verify authentication
    wave auth whoami

    # Re-authenticate
    wave auth login --force

  2. Job Failed

    # Check detailed status
    wave jobs status j-xyz789 --verbose

    # View error logs
    wave jobs logs j-xyz789 --tail 100

  3. Resource Issues

    # Check GPU availability
    wave gpus availability

    # View resource quotas
    wave quota show
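
The account-level checks above (authentication, GPU availability, quota) can be run in one pass before filing a support request. A rough sketch, assuming each command exits non-zero when something is wrong (not confirmed in this guide); `wave_doctor` is a hypothetical helper name:

```shell
# Hypothetical helper: run every check, report failures, and return 1 if any failed.
# All three checks run even when an earlier one fails.
wave_doctor() {
  local failed=0
  wave auth whoami       || { echo "auth check failed" >&2; failed=1; }
  wave gpus availability || { echo "GPU availability check failed" >&2; failed=1; }
  wave quota show        || { echo "quota check failed" >&2; failed=1; }
  return "$failed"
}
```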

Support