
Trainwave API Reference

The Trainwave REST API lets you programmatically control and monitor your machine learning jobs.

Quick Start

# List your jobs
curl -H "Accept: application/json" \
     -H "X-API-KEY: your-api-key" \
     https://backend.trainwave.ai/api/v1/jobs/

Base URL

All API requests should be made to: https://backend.trainwave.ai/api/v1/

Authentication

Pass your API key in the X-API-KEY header on every request. See the Authentication Guide for details on generating and using API keys.

Response Format

Successful responses follow this standard format:

{
    "success": true,
    "data": {
        // Response data here
    }
}

Error responses:

{
    "success": false,
    "error": {
        "code": "error_code",
        "message": "Human-readable error message"
    }
}
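
Because every response shares this envelope, the success check can live in one place. The following is a minimal sketch, assuming only the fields shown above; `TrainwaveAPIError` and `unwrap` are hypothetical names used for illustration, not part of an official client.

```python
class TrainwaveAPIError(Exception):
    """Hypothetical exception carrying the code and message from an error envelope."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(payload):
    """Return payload["data"] on success, or raise TrainwaveAPIError otherwise."""
    if payload.get("success"):
        return payload.get("data")
    # Fall back to placeholder values if the error object is missing or partial
    error = payload.get("error") or {}
    raise TrainwaveAPIError(
        error.get("code", "unknown"),
        error.get("message", "no message provided"),
    )
```

With requests, this could be called as `unwrap(response.json())` so that callers only ever see the data object or an exception.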

Available Endpoints

Jobs

List Jobs

GET /api/v1/jobs/

Query parameters:

  • org (string): Organization ID
  • project (string): Project ID
  • status (string): Filter by status (running, completed, failed)
  • limit (integer, default: 20): Number of results per page
  • offset (integer): Pagination offset

Example request:

curl -H "X-API-KEY: your-api-key" \
     "https://backend.trainwave.ai/api/v1/jobs/?project=p-abc123&status=running"

Example response:

{
    "success": true,
    "data": {
        "count": 25,
        "next": "https://backend.trainwave.ai/api/v1/jobs/?offset=20",
        "previous": null,
        "results": [
            {
                "id": "j-789xyz",
                "rid": "training-job-1",
                "created_at": "2024-03-21T09:00:00Z",
                "state": "RUNNING",
                "project": "p-abc123",
                "owner": {
                    "id": "u-abc123",
                    "email": "user@example.com",
                    "username": "johndoe"
                },
                "total_cost": 25.5,
                "gpu_hours": 2.5,
                "metrics": {
                    "gpu_utilization": 95.2,
                    "memory_usage": 14.3
                }
            }
        ]
    }
}
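
The count, next, and previous fields in the response above suggest that you can collect all results by following next until it is empty. The sketch below assumes that next is null on the last page (the example only shows previous as null on the first page). `iter_jobs` and `fetch_json` are hypothetical names; `fetch_json` is any function you supply that takes a URL and returns the parsed JSON envelope.

```python
def iter_jobs(first_url, fetch_json):
    """Yield every job across pages by following the `next` links."""
    url = first_url
    while url:
        payload = fetch_json(url)
        if not payload.get("success"):
            raise RuntimeError(f"Request failed: {payload.get('error')}")
        page = payload["data"]
        yield from page["results"]
        # Assumption: `next` is None (JSON null) on the last page, ending the loop
        url = page.get("next")
```

With requests, `fetch_json` could be `lambda url: requests.get(url, headers={"X-API-KEY": "your-api-key"}).json()`. Keeping the HTTP call outside the generator makes the paging logic easy to test with canned pages.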

Create Job

POST /api/v1/jobs/

Request body:

{
    "name": "mnist-training",
    "project": "p-abc123",
    "description": "Training MNIST classifier",
    "gpu_type": "RTX A5000",
    "gpus": 1,
    "cpu_cores": 4,
    "memory_gb": 16,
    "hdd_size_mb": 51200,
    "image": "trainwave/pytorch:2.3.1",
    "setup_command": "pip install -r requirements.txt",
    "run_command": "python train.py",
    "env_vars": {
        "WANDB_API_KEY": "xxx",
        "PYTORCH_CUDA_ALLOC_CONF": "max_split_size_mb:512"
    },
    "expires": "4h"
}

Example using Python:

import requests
 
headers = {
    "Accept": "application/json",
    "X-API-KEY": "your-api-key",
    "Content-Type": "application/json"
}
 
job_config = {
    "name": "mnist-training",
    "project": "p-abc123",
    "gpu_type": "RTX A5000",
    "gpus": 1,
    # ... other configuration
}
 
response = requests.post(
    "https://backend.trainwave.ai/api/v1/jobs/",
    headers=headers,
    json=job_config
)
 
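if response.status_code == 201:
    job = response.json()["data"]
    print(f"Created job: {job['id']}")
else:
    # Error responses carry a code and message under "error"
    error = response.json().get("error", {})
    print(f"Job creation failed ({response.status_code}): {error.get('message')}")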

Get Job Details

GET /api/v1/jobs/{job_id}/

Example response:

{
    "success": true,
    "data": {
        "id": "j-789xyz",
        "name": "mnist-training",
        "state": "RUNNING",
        "created_at": "2024-03-21T09:00:00Z",
        "started_at": "2024-03-21T09:01:00Z",
        "finished_at": null,
        "project": "p-abc123",
        "gpu_type": "RTX A5000",
        "gpus": 1,
        "cpu_cores": 4,
        "memory_gb": 16,
        "cost_per_hour": 2.5,
        "total_cost": 5.0,
        "metrics": {
            "gpu_utilization": 95.2,
            "memory_usage": 14.3
        }
    }
}
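
A common use of this endpoint is to poll a job until it stops running. The sketch below is one way to do that under stated assumptions: RUNNING is the only state value that appears in this reference, so any other state is treated as a reason to stop polling (adjust this if the API reports queued or starting states). `wait_for_job` and `get_job` are hypothetical names; `get_job` is a function you supply that maps a job ID to the data object returned by this endpoint.

```python
import time


def wait_for_job(job_id, get_job, poll_seconds=30, timeout_seconds=3600, sleep=time.sleep):
    """Poll until the job's state is no longer RUNNING, then return the job dict."""
    waited = 0
    while True:
        job = get_job(job_id)
        # Assumption: any state other than RUNNING means the job has stopped
        if job["state"] != "RUNNING":
            return job
        if waited >= timeout_seconds:
            raise TimeoutError(f"Job {job_id} still RUNNING after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
```

The `sleep` parameter exists only so the loop can be exercised in tests without real delays; in normal use the default is fine.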

Stop Job

POST /api/v1/jobs/{job_id}/stop/

Example request:

curl -X POST \
     -H "X-API-KEY: your-api-key" \
     https://backend.trainwave.ai/api/v1/jobs/j-789xyz/stop/

Projects

List Projects

GET /api/v1/projects/

Query parameters:

  • org (string): Organization ID
  • limit (integer, default: 20): Results per page
  • offset (integer): Pagination offset
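
The list endpoints take the same style of optional query parameters, so a small URL builder can serve all of them. This is a minimal sketch; `build_url` is a hypothetical helper and the `o-abc123` organization ID in the usage note is a made-up value.

```python
from urllib.parse import urlencode

BASE_URL = "https://backend.trainwave.ai/api/v1/"


def build_url(path, **params):
    """Join BASE_URL with a path and append any query parameters that are not None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = BASE_URL + path.lstrip("/")
    return f"{url}?{query}" if query else url
```

For example, `build_url("projects/", org="o-abc123", limit=50)` produces a URL for the first 50 projects in that organization, and parameters left as None are simply omitted.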

Organizations

List Organizations

GET /api/v1/organizations/

Metrics

Get Job Metrics

GET /api/v1/metrics/{metric_name}/?job_id={job_id}

Available metrics:

  • cpu: CPU utilization
  • memory: Memory usage
  • network: Network I/O
  • gpu_utilization: GPU utilization
  • gpu_memory: GPU memory usage
  • disk: Disk I/O
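
Since the metric name is part of the URL path, it can help to validate it against the list above before sending a request. The following sketch only builds the URL; `metric_url` and `METRIC_NAMES` are hypothetical names, and the shape of the metrics response is not covered here because this reference does not show it.

```python
from urllib.parse import urlencode

BASE_URL = "https://backend.trainwave.ai/api/v1/"
# Metric names copied from the list of available metrics
METRIC_NAMES = {"cpu", "memory", "network", "gpu_utilization", "gpu_memory", "disk"}


def metric_url(metric_name, job_id):
    """Build the metrics URL for one job, rejecting unknown metric names early."""
    if metric_name not in METRIC_NAMES:
        raise ValueError(f"Unknown metric: {metric_name!r}")
    query = urlencode({"job_id": job_id})
    return f"{BASE_URL}metrics/{metric_name}/?{query}"
```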

Best Practices

  1. Error Handling: Check the success field on every response, and handle the error code and message when it is false
  2. Security: Never expose API keys in client-side code; load them from environment variables instead
  3. Monitoring: Log API response times and track error rates
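
The security and monitoring points above can be sketched in a few lines of Python. The environment variable name `TRAINWAVE_API_KEY`, the logger name, and the helpers `load_api_key` and `timed_call` are all assumptions made for illustration; use whatever naming your project already follows.

```python
import logging
import os
import time

logger = logging.getLogger("trainwave_client")


def load_api_key(env_var="TRAINWAVE_API_KEY"):
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API")
    return key


def timed_call(label, func, *args, **kwargs):
    """Run func, log how long it took, and return its result."""
    start = time.perf_counter()
    try:
        return func(*args, **kwargs)
    finally:
        # Logged even when func raises, so failed calls are timed too
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s took %.1f ms", label, elapsed_ms)
```

A request could then be wrapped as `timed_call("list jobs", requests.get, url, headers={"X-API-KEY": load_api_key()})`.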

Support