Trainwave API Reference

The Trainwave REST API lets you create, monitor, and stop machine learning jobs programmatically. This reference describes the available endpoints, authentication, response formats, and rate limits, with example requests and responses.

Quick Start

# Authentication using API key
curl -H "Accept: application/json" \
     -H "X-API-KEY: your-api-key" \
     https://backend.trainwave.ai/api/v1/jobs/
 
# Create a new job
curl -X POST \
     -H "Accept: application/json" \
     -H "X-API-KEY: your-api-key" \
     -H "Content-Type: application/json" \
     -d '{
       "name": "mnist-training",
       "project": "p-abc123",
       "gpu_type": "RTX A5000",
       "gpus": 1
     }' \
     https://backend.trainwave.ai/api/v1/jobs/

Base URL

All API requests should be made to: https://backend.trainwave.ai/api/v1/

Authentication

See the Authentication Guide for detailed information about securing your API requests.

Response Format

All responses follow this standard format:

{
    "success": true,
    "data": {
        // Response data here
    },
    "meta": {
        "request_id": "req_abc123",
        "timestamp": "2024-03-21T12:00:00Z"
    }
}

Error responses:

{
    "success": false,
    "error": {
        "code": "error_code",
        "message": "Human-readable error message",
        "details": {
            // Additional error details
        }
    },
    "meta": {
        "request_id": "req_xyz789",
        "timestamp": "2024-03-21T12:00:00Z"
    }
}
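
Because success and error responses share one envelope, a client can unwrap both in a single place. The following is a minimal sketch, not part of any official SDK: the names unwrap and TrainwaveAPIError are hypothetical, and the field names are taken from the two examples above.

```python
from typing import Any, Optional


class TrainwaveAPIError(Exception):
    """Hypothetical exception type used only for this illustration."""

    def __init__(self, code: str, message: str,
                 request_id: Optional[str], details: Optional[dict] = None):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id
        self.details = details or {}


def unwrap(body: dict) -> Any:
    """Return body["data"] on success; raise TrainwaveAPIError otherwise."""
    # meta.request_id is present in both envelopes, so keep it for debugging.
    request_id = body.get("meta", {}).get("request_id")
    if body.get("success"):
        return body.get("data")
    error = body.get("error", {})
    raise TrainwaveAPIError(
        code=error.get("code", "unknown"),
        message=error.get("message", ""),
        request_id=request_id,
        details=error.get("details"),
    )
```

With the requests library, this could be called as unwrap(response.json()), so that calling code only ever sees the data payload or an exception carrying the request_id.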

Rate Limits

Free tier: 100 requests per minute
Pro tier: 1000 requests per minute
Enterprise tier: Custom limits

Rate limit headers are included in all responses:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1616876400
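
A client can read these headers to decide whether to pause before the next request. The sketch below assumes that X-RateLimit-Reset is a Unix timestamp in seconds, which the example value suggests but this reference does not state; seconds_until_reset is a hypothetical helper name.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to wait before sending another request.

    Returns 0.0 while quota remains. Assumes X-RateLimit-Reset is a Unix
    timestamp in seconds (an assumption, not confirmed by this reference).
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - current)
```

The headers mapping of a requests response is case-insensitive; with a plain dict, the keys must match the header names exactly as written here.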

Available Endpoints

Jobs

List Jobs

GET /api/v1/jobs/

Query parameters:

org (string): Organization ID
project (string): Project ID
status (string): Filter by status (running, completed, failed)
limit (integer, default: 20): Number of results per page
offset (integer): Pagination offset

Example request:

curl -H "X-API-KEY: your-api-key" \
     "https://backend.trainwave.ai/api/v1/jobs/?project=p-abc123&status=running"

Example response:

{
    "success": true,
    "data": {
        "count": 25,
        "next": "https://backend.trainwave.ai/api/v1/jobs/?offset=20",
        "previous": null,
        "results": [
            {
                "id": "j-789xyz",
                "rid": "training-job-1",
                "created_at": "2024-03-21T09:00:00Z",
                "state": "RUNNING",
                "project": "p-abc123",
                "owner": {
                    "id": "u-abc123",
                    "email": "user@example.com",
                    "username": "johndoe"
                },
                "total_cost": 25.5,
                "gpu_hours": 2.5,
                "metrics": {
                    "gpu_utilization": 95.2,
                    "memory_usage": 14.3
                }
            }
        ]
    }
}
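
The count, next, and previous fields make it possible to walk through all pages without computing offsets by hand. The following sketch assumes that next is null on the last page (inferred from previous being null on the first page); iter_results and fetch_page are hypothetical names, and fetch_page stands for any function that takes a URL and returns the parsed JSON body.

```python
from typing import Any, Callable, Iterator, Optional


def iter_results(fetch_page: Callable[[str], dict],
                 first_url: str) -> Iterator[Any]:
    """Yield every item from a paginated list endpoint.

    Follows data.next until it is null or missing (assumed to mark the
    last page).
    """
    url: Optional[str] = first_url
    while url:
        data = fetch_page(url)["data"]
        yield from data["results"]
        url = data.get("next")
```

With requests, fetch_page could be a small function that calls requests.get(url, headers=headers, timeout=30).json() with the X-API-KEY header set as in the example request above.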

Create Job

POST /api/v1/jobs/

Request body:

{
    "name": "mnist-training",
    "project": "p-abc123",
    "description": "Training MNIST classifier",
    "gpu_type": "RTX A5000",
    "gpus": 1,
    "cpu_cores": 4,
    "memory_gb": 16,
    "hdd_size_mb": 51200,
    "image": "trainwave/pytorch:2.3.1",
    "setup_command": "pip install -r requirements.txt",
    "run_command": "python train.py",
    "env_vars": {
        "WANDB_API_KEY": "xxx",
        "PYTORCH_CUDA_ALLOC_CONF": "max_split_size_mb:512"
    },
    "expires": "4h",
    "compliance_soc2": true
}

Example using Python:

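import requests

api_key = "your-api-key"
headers = {
    "Accept": "application/json",
    "X-API-KEY": api_key,
    "Content-Type": "application/json"
}

job_config = {
    "name": "mnist-training",
    "project": "p-abc123",
    "gpu_type": "RTX A5000",
    "gpus": 1,
    # ... other configuration
}

# Send the job configuration as a JSON body; the timeout avoids hanging forever.
response = requests.post(
    "https://backend.trainwave.ai/api/v1/jobs/",
    headers=headers,
    json=job_config,
    timeout=30
)

body = response.json()
if response.status_code == 201 and body.get("success"):
    job = body["data"]
    print(f"Created job: {job['id']}")
else:
    # Error responses carry a message and a request_id that help with debugging.
    error = body.get("error", {})
    request_id = body.get("meta", {}).get("request_id")
    print(f"Job creation failed: {error.get('message')} (request_id={request_id})")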

Get Job Details

GET /api/v1/jobs/{job_id}/

Example response:

{
    "success": true,
    "data": {
        "id": "j-789xyz",
        "name": "mnist-training",
        "state": "RUNNING",
        "created_at": "2024-03-21T09:00:00Z",
        "started_at": "2024-03-21T09:01:00Z",
        "finished_at": null,
        "project": "p-abc123",
        "gpu_type": "RTX A5000",
        "gpus": 1,
        "cpu_cores": 4,
        "memory_gb": 16,
        "cost_per_hour": 2.5,
        "total_cost": 5.0,
        "metrics": {
            "gpu_utilization": 95.2,
            "memory_usage": 14.3,
            "network_rx_bytes": 1024000,
            "network_tx_bytes": 512000
        },
        "artifacts": {
            "model": "s3://bucket/model.pt",
            "logs": "s3://bucket/logs.txt"
        }
    }
}
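
A common use of this endpoint is polling until a job finishes. The sketch below is one way to do that under stated assumptions: only the RUNNING and COMPLETED states appear in this reference, FAILED is inferred from the status filter values on List Jobs, and any other terminal states (for example after a stop request) would need to be added. The names wait_for_job and get_job are hypothetical; get_job stands for any function that returns the data object shown above.

```python
import time
from typing import Callable

# Assumed terminal states. FAILED is inferred from the List Jobs status
# filter and may not match the API's exact spelling.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def wait_for_job(get_job: Callable[[], dict],
                 poll_seconds: float = 30.0,
                 max_polls: int = 120,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll get_job() until the job reaches a terminal state.

    Raises TimeoutError if the job is still not terminal after max_polls.
    """
    if max_polls < 1:
        raise ValueError("max_polls must be at least 1")
    for attempt in range(max_polls):
        job = get_job()
        if job["state"] in TERMINAL_STATES:
            return job
        # Do not sleep after the final poll.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"job still {job['state']} after {max_polls} polls")
```

A generous poll interval keeps this well within the rate limits; the sleep parameter exists so the loop can be tested without waiting.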

Stop Job

POST /api/v1/jobs/{job_id}/stop/

Example using curl:

curl -X POST \
     -H "X-API-KEY: your-api-key" \
     https://backend.trainwave.ai/api/v1/jobs/j-789xyz/stop/

Projects

List Projects

GET /api/v1/projects/

Query parameters:

org (string): Organization ID
limit (integer, default: 20): Results per page
offset (integer): Pagination offset

Example response:

{
    "success": true,
    "data": {
        "count": 2,
        "results": [
            {
                "id": "p-abc123",
                "name": "MNIST Classification",
                "description": "Image classification research",
                "created_at": "2024-03-01T12:00:00Z",
                "organization": "org-xyz789",
                "active_job_count": 2,
                "total_job_count": 15,
                "total_cost": 150.25
            }
        ]
    }
}

Organizations

List Organizations

GET /api/v1/organizations/

Example response:

{
    "success": true,
    "data": {
        "count": 2,
        "results": [
            {
                "id": "org-xyz789",
                "name": "Research Team",
                "created_at": "2024-01-15T08:00:00Z",
                "credit_balance": 1000.5,
                "member_count": 5,
                "project_count": 3
            }
        ]
    }
}

Metrics

Get Job Metrics

GET /api/v1/metrics/{metric_name}/?job_id={job_id}

Available metrics:

cpu: CPU utilization
memory: Memory usage
network: Network I/O
gpu_utilization: GPU utilization
gpu_memory: GPU memory usage
disk: Disk I/O

Example request:

curl -H "X-API-KEY: your-api-key" \
     "https://backend.trainwave.ai/api/v1/metrics/gpu_utilization/?job_id=j-789xyz"

Example response:

{
    "success": true,
    "data": {
        "metric": "gpu_utilization",
        "job_id": "j-789xyz",
        "values": [
            [1711027200, 95.2],
            [1711027260, 94.8],
            [1711027320, 96.1]
        ],
        "unit": "percent",
        "interval": "60s"
    }
}
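
Each entry in values is a two-element pair. The sketch below assumes the first element is a Unix timestamp in seconds (UTC), which is consistent with the 60-second spacing and the "60s" interval in the example but is not stated explicitly; parse_series and average are hypothetical helper names.

```python
from datetime import datetime, timezone
from statistics import mean
from typing import List, Tuple


def parse_series(data: dict) -> List[Tuple[datetime, float]]:
    """Convert [[unix_seconds, value], ...] into (datetime, value) pairs.

    Assumes timestamps are Unix seconds in UTC (not confirmed by this
    reference).
    """
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc), value)
        for ts, value in data["values"]
    ]


def average(data: dict) -> float:
    """Return the arithmetic mean of the metric values."""
    return mean(value for _, value in data["values"])
```

Both functions take the data object from the response, so they can be applied directly to the payload returned for any of the metrics listed above.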

SDKs and Libraries

Official SDKs:

Webhooks

Trainwave can send webhooks for important events, such as job completion (job.completed). Configure webhooks in your organization settings.

Example webhook payload:

{
    "event": "job.completed",
    "job": {
        "id": "j-789xyz",
        "state": "COMPLETED",
        "exit_code": 0,
        "duration": 3600,
        "cost": 25.5
    },
    "timestamp": "2024-03-21T10:00:00Z",
    "webhook_id": "wh_abc123"
}
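
A receiver typically parses the payload and dispatches on the event field. The following is a minimal sketch of that step only, with no web framework and no signature verification. It assumes that duration is measured in seconds and that an exit_code of 0 means success; neither is stated in this reference, and job.completed is the only event name documented here. The name summarize_event is hypothetical.

```python
import json


def summarize_event(raw_body: bytes) -> str:
    """Parse a webhook request body and return a one-line summary."""
    payload = json.loads(raw_body)
    event = payload["event"]
    job = payload.get("job", {})
    if event == "job.completed":
        # Assumption: exit_code 0 means success, duration is in seconds.
        status = "succeeded" if job.get("exit_code") == 0 else "exited with an error"
        return f"Job {job['id']} {status} after {job['duration']}s (cost: {job['cost']})"
    # Unknown events are reported rather than dropped silently.
    return f"Unhandled event type: {event}"
```

Returning a summary for unknown event types, instead of raising, keeps the receiver tolerant of event types added later.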

Best Practices

Rate Limiting

- Implement exponential backoff
- Cache responses when appropriate
- Use bulk operations when possible

Error Handling

- Always check the success field
- Log the request_id for debugging
- Handle rate limits gracefully

Security

- Never expose API keys in client-side code
- Rotate API keys regularly
- Use environment variables for sensitive data

Monitoring

- Track rate limit headers
- Monitor webhook delivery status
- Log API response times
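
The exponential backoff recommended above can be sketched as a small retry wrapper. This is an illustration under assumptions, not a prescribed implementation: RateLimited and with_backoff are hypothetical names, and the caller decides when to raise RateLimited (for example on HTTP 429, the conventional rate-limit status, which this reference does not confirm).

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal raised by the caller's request function."""


def with_backoff(call: Callable[[], T],
                 max_attempts: int = 5,
                 base_delay: float = 1.0,
                 max_delay: float = 60.0,
                 sleep: Callable[[float], None] = time.sleep,
                 jitter: Callable[[], float] = random.random) -> T:
    """Run call(), retrying on RateLimited with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            # Give up after the final attempt and let the caller handle it.
            if attempt == max_attempts - 1:
                raise
            # The delay doubles on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter scales the delay to between 50% and 100% of its value.
            sleep(delay * (0.5 + 0.5 * jitter()))
    raise AssertionError("unreachable")
```

The jitter spreads retries from many clients over time so they do not all hit the API at the same instant after a shared rate-limit window resets.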

Support

Technical issues: support@trainwave.ai
API status: status.trainwave.ai
Developer community: Discord
