
Environment Variables and Secrets Management

Learn how to securely manage environment variables and secrets in your Trainwave jobs. This guide covers best practices, examples, and security considerations.

Quick Start

# Set a secret in your organization
wave secrets set OPENAI_API_KEY=sk-...
 
# Use it in your configuration
cat trainwave.toml
[env_vars]
OPENAI_API_KEY = "${OPENAI_API_KEY}"
BATCH_SIZE = "32"

Variable Types

Trainwave supports three types of variables:

  1. Configuration Variables: Defined in trainwave.toml
  2. Project Secrets: Scoped to specific projects
  3. Organization Secrets: Available across all projects

Configuration Variables

Basic Usage

Define variables directly in your trainwave.toml:

[env_vars]
# Fixed values
BATCH_SIZE = "32"
LEARNING_RATE = "0.001"
NUM_EPOCHS = "100"
 
# Local environment interpolation
WANDB_API_KEY = "${WANDB_API_KEY}"
HUGGINGFACE_TOKEN = "${HF_TOKEN}"
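
Every value in [env_vars] is a string, so the job has to convert numeric settings itself. The following is a minimal sketch of one way to do that inside a training script; the helper name env_value is hypothetical and not part of Trainwave.

```python
import os
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def env_value(name: str, cast: Callable[[str], T], default: Optional[T] = None) -> T:
    """Read an environment variable and convert it with `cast`.

    Hypothetical helper: raises KeyError when the variable is unset and no
    default is given, and ValueError when the raw string cannot be converted.
    """
    raw = os.environ.get(name)
    if raw is None:
        if default is None:
            raise KeyError(f"required environment variable {name} is not set")
        return default
    try:
        return cast(raw)
    except ValueError as exc:
        raise ValueError(f"{name}={raw!r} is not a valid {cast.__name__}") from exc


# Defaults mirror the values shown in the trainwave.toml example above.
batch_size = env_value("BATCH_SIZE", int, default=32)
learning_rate = env_value("LEARNING_RATE", float, default=0.001)
num_epochs = env_value("NUM_EPOCHS", int, default=100)
```

Failing on a malformed value at startup is usually cheaper than discovering it partway through a training run.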

Dynamic Configuration

Use ${shell:...} to interpolate the output of a shell command, optionally combined with literal text:

[env_vars]
# Date-based versioning
MODEL_VERSION = "${shell:date +%Y%m%d}"
 
# Git information
GIT_COMMIT = "${shell:git rev-parse HEAD}"
GIT_BRANCH = "${shell:git branch --show-current}"
 
# Custom formatting
EXPERIMENT_NAME = "train-${shell:date +%Y%m%d-%H%M%S}"
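
When the same script also runs outside Trainwave, for example on a laptop, EXPERIMENT_NAME may not be set. The sketch below is one possible fallback, assuming the script should produce a name in the same train-YYYYMMDD-HHMMSS shape; the function name is hypothetical.

```python
import os
from datetime import datetime
from typing import Optional


def experiment_name(now: Optional[datetime] = None) -> str:
    """Return EXPERIMENT_NAME if it is set, otherwise build a timestamped name."""
    configured = os.environ.get("EXPERIMENT_NAME")
    if configured:
        return configured
    now = now or datetime.now()
    # Same format as `date +%Y%m%d-%H%M%S` in the config example.
    return now.strftime("train-%Y%m%d-%H%M%S")
```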

Secrets Management

Organization Secrets

Available to all projects in your organization:

# Set organization secrets
wave secrets set \
  OPENAI_API_KEY=sk-... \
  WANDB_API_KEY=abc... \
  AWS_ACCESS_KEY=xyz...
 
# List organization secrets
wave secrets list --scope org

Project Secrets

Scoped to specific projects:

# Set project-specific secrets
wave secrets set \
  --project p-abc123 \
  DB_PASSWORD=xyz... \
  API_TOKEN=abc...
 
# List project secrets
wave secrets list --project p-abc123

Secret Files

Store entire files as secrets:

# Store SSH key
wave secrets set-file \
  SSH_KEY=~/.ssh/id_rsa \
  --project p-abc123
 
# Store JSON config
wave secrets set-file \
  GCP_CREDENTIALS=./gcp-credentials.json

Security Best Practices

1. Use Appropriate Scope

✅ DO:

# Project-specific credentials
wave secrets set --project p-abc123 DB_PASSWORD=xyz
 
# Organization-wide API keys
wave secrets set --scope org GITHUB_TOKEN=abc

❌ DON’T:

# Don't store project-specific secrets at org level
wave secrets set --scope org DB_PASSWORD=xyz

2. Rotate Secrets Regularly

# Generate new secret
NEW_KEY=$(openssl rand -hex 32)
 
# Update secret
wave secrets set "API_KEY=$NEW_KEY"
 
# Verify update
wave secrets list | grep API_KEY
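
If rotation is driven from a script rather than a shell, the key can be generated with Python's standard secrets module. This is a sketch of the generation step only; passing the value to wave secrets set is left to the caller, and new_api_key is a hypothetical name.

```python
import secrets


def new_api_key(num_bytes: int = 32) -> str:
    """Return a random hex string, comparable to `openssl rand -hex 32`."""
    # token_hex draws from the OS cryptographic random source and
    # returns two hex digits per byte.
    return secrets.token_hex(num_bytes)
```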

3. Use Least Privilege

# Restrict secret access
wave secrets set --project p-abc123 \
  --restrict-to-job-type training \
  API_KEY=xyz

4. Audit Secret Usage

# View secret access logs
wave secrets audit API_KEY --last 7d
 
# Monitor secret usage
wave secrets usage --project p-abc123

Common Patterns

1. Machine Learning Training

[env_vars]
# Authentication
WANDB_API_KEY = "${WANDB_API_KEY}"
HUGGINGFACE_TOKEN = "${HF_TOKEN}"
 
# Training parameters
BATCH_SIZE = "32"
LEARNING_RATE = "0.001"
NUM_EPOCHS = "100"
 
# Resource configuration
PYTORCH_CUDA_ALLOC_CONF = "max_split_size_mb:512"
OMP_NUM_THREADS = "4"

2. Distributed Training

[env_vars]
# Distribution setup
MASTER_ADDR = "localhost"
MASTER_PORT = "29500"
WORLD_SIZE = "4"
RANK = "0"
 
# Framework configuration
NCCL_DEBUG = "INFO"
NCCL_P2P_DISABLE = "1"
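
On the job side, these variables can be read and sanity-checked before the process group is initialized. The sketch below assumes the four variable names from the example above; the DistributedEnv class is hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DistributedEnv:
    master_addr: str
    master_port: int
    world_size: int
    rank: int

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "DistributedEnv":
        """Build the config from environment variables and validate the ranges."""
        env = os.environ if env is None else env
        cfg = cls(
            master_addr=env["MASTER_ADDR"],
            master_port=int(env["MASTER_PORT"]),
            world_size=int(env["WORLD_SIZE"]),
            rank=int(env["RANK"]),
        )
        if not 0 < cfg.master_port < 65536:
            raise ValueError(f"MASTER_PORT out of range: {cfg.master_port}")
        if not 0 <= cfg.rank < cfg.world_size:
            raise ValueError(f"RANK {cfg.rank} must be in [0, {cfg.world_size})")
        return cfg
```

Note that a fixed RANK = "0" gives every process the same rank. Launchers such as torchrun set RANK per process, so in multi-process jobs it is usually left to the launcher rather than hard-coded.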

3. Cloud Integration

[env_vars]
# AWS credentials
AWS_ACCESS_KEY_ID = "${AWS_ACCESS_KEY_ID}"
AWS_SECRET_ACCESS_KEY = "${AWS_SECRET_ACCESS_KEY}"
AWS_DEFAULT_REGION = "us-west-2"
 
# GCP credentials
GOOGLE_APPLICATION_CREDENTIALS = "/secrets/gcp-credentials.json"
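
A job can confirm that a file-based credential is actually present before any cloud client tries to use it. The sketch below assumes only that GOOGLE_APPLICATION_CREDENTIALS holds a path to a JSON file, as in the example above; how Trainwave places secret files on disk is not covered here, and load_json_credentials is a hypothetical helper.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def load_json_credentials(
    var_name: str = "GOOGLE_APPLICATION_CREDENTIALS",
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Load the JSON file whose path is stored in the given environment variable."""
    env = os.environ if env is None else env
    path_str = env.get(var_name)
    if not path_str:
        raise KeyError(f"{var_name} is not set")
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(f"{var_name} points to {path}, which does not exist")
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```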

Variable Precedence

Variables are resolved in the following order (highest to lowest priority):

  1. Job-specific variables (in trainwave.toml)
  2. Project secrets
  3. Organization secrets
  4. System environment variables

Example:

[env_vars]
# This will override any project/org level API_KEY
API_KEY = "job-specific-key"
 
# This will use project/org level WANDB_API_KEY
WANDB_API_KEY = "${WANDB_API_KEY}"
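
The precedence order can be modeled as a layered lookup in which the first layer that defines a name wins. The sketch below illustrates the rule with collections.ChainMap; it is a model of the order listed above, not Trainwave's actual resolver, and all names and values in it are made up.

```python
from collections import ChainMap
from typing import Mapping, Optional

# Layer labels, highest priority first, matching the list above.
LAYER_NAMES = ("job", "project", "organization", "system")


def resolve(
    job: Mapping[str, str],
    project: Mapping[str, str],
    org: Mapping[str, str],
    system: Mapping[str, str],
) -> ChainMap:
    """Combine the four layers; ChainMap searches them left to right."""
    return ChainMap(dict(job), dict(project), dict(org), dict(system))


def source_of(name: str, resolved: ChainMap) -> Optional[str]:
    """Return the label of the layer that supplies `name`, or None if unset."""
    for label, layer in zip(LAYER_NAMES, resolved.maps):
        if name in layer:
            return label
    return None


resolved = resolve(
    job={"API_KEY": "job-specific-key"},
    project={"API_KEY": "project-key", "WANDB_API_KEY": "project-wandb"},
    org={"WANDB_API_KEY": "org-wandb", "GITHUB_TOKEN": "org-token"},
    system={"HOME": "/root"},
)
```

Here resolved["API_KEY"] comes from the job layer, and WANDB_API_KEY comes from the project layer because no job-level value shadows it.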

Troubleshooting

Common Issues

  1. Missing Variable

    # Check if variable exists
    wave secrets list | grep MISSING_VAR

    # Verify variable in job
    wave jobs logs j-xyz789 | grep MISSING_VAR

  2. Permission Issues

    # Check access rights
    wave secrets access-check API_KEY

    # Request access
    wave secrets request-access API_KEY

  3. Variable Conflicts

    # List all variables with their source
    wave secrets list --show-source

    # Check variable resolution
    wave jobs debug-env j-xyz789

Best Practices Summary

  1. Security

    • Never commit secrets to version control
    • Rotate secrets regularly
    • Use appropriate scoping
    • Audit secret usage
  2. Organization

    • Use consistent naming conventions
    • Document variable requirements
    • Keep secrets organized by project
    • Clean up unused secrets
  3. Development

    • Use local .env files for development
    • Test with dummy secrets in CI/CD
    • Validate required variables before job start
    • Log variable presence (not values) for debugging
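
The last two development practices can be combined in a short startup check. This is a minimal sketch with hypothetical function names; it logs only whether each variable is set and never its value.

```python
import logging
import os
from typing import Iterable, List, Mapping, Optional

logger = logging.getLogger("env_check")


def check_required(names: Iterable[str], env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Log presence (not values) of each variable and return the missing names."""
    env = os.environ if env is None else env
    missing = []
    for name in names:
        present = bool(env.get(name))  # an empty string counts as missing
        logger.info("%s: %s", name, "set" if present else "MISSING")
        if not present:
            missing.append(name)
    return missing


def require(names: Iterable[str], env: Optional[Mapping[str, str]] = None) -> None:
    """Raise before the job does any real work if a required variable is absent."""
    missing = check_required(names, env)
    if missing:
        raise RuntimeError("missing required environment variables: " + ", ".join(missing))
```

Calling require(["WANDB_API_KEY", "BATCH_SIZE"]) at the top of the training script turns a missing secret into an immediate, readable failure.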
