vastai-ci-integrationClaude Skill

Configure Vast.ai CI/CD integration with GitHub Actions and testing.

1.9k Stars
259 Forks
2025/10/10

Install & Download

Linux / macOS:

请登录后查看安装命令

Windows (PowerShell):

请登录后查看安装命令

Download and extract to ~/.claude/skills/

namevastai-ci-integration
descriptionConfigure Vast.ai CI/CD integration with GitHub Actions and automated GPU testing. Use when setting up automated testing on GPU instances, or integrating Vast.ai provisioning into CI/CD pipelines. Trigger with phrases like "vastai CI", "vastai github actions", "vastai automated testing", "vastai pipeline".
allowed-toolsRead, Write, Edit, Bash(vastai:*), Grep
version1.0.0
licenseMIT
authorJeremy Longshore <jeremy@intentsolutions.io>
compatible-withclaude-code, codex, openclaw
tags["saas","vast-ai","ci-cd"]

Vast.ai CI Integration

Overview

Integrate Vast.ai GPU provisioning into CI/CD pipelines. Run GPU-accelerated tests, model validation, and benchmarks as part of your automated workflow using GitHub Actions with the Vast.ai CLI.

Prerequisites

  • GitHub repository with Actions enabled
  • VASTAI_API_KEY stored as GitHub Actions secret
  • Docker image for GPU workload published to a registry

Instructions

Step 1: GitHub Actions Workflow

# .github/workflows/gpu-test.yml
name: GPU Tests
on:
  push:
    branches: [main]
  pull_request:

jobs:
  gpu-test:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      - name: Install Vast.ai CLI
        run: |
          pip install vastai
          vastai set api-key ${{ secrets.VASTAI_API_KEY }}

      - name: Provision GPU Instance
        id: provision
        run: |
          # Search for cheapest reliable GPU
          OFFER_ID=$(vastai search offers \
            'num_gpus=1 gpu_ram>=8 reliability>0.95 dph_total<=0.25' \
            --order dph_total --raw --limit 1 \
            | python3 -c "import sys,json; print(json.load(sys.stdin)[0]['id'])")

          # Create instance
          INSTANCE_ID=$(vastai create instance $OFFER_ID \
            --image ghcr.io/${{ github.repository }}/gpu-test:latest \
            --disk 20 --raw \
            | python3 -c "import sys,json; print(json.load(sys.stdin)['new_contract'])")

          echo "instance_id=$INSTANCE_ID" >> $GITHUB_OUTPUT

          # Wait for running
          for i in $(seq 1 30); do
            STATUS=$(vastai show instance $INSTANCE_ID --raw \
              | python3 -c "import sys,json; print(json.load(sys.stdin).get('actual_status','loading'))")
            echo "Status: $STATUS"
            [ "$STATUS" = "running" ] && break
            sleep 10
          done

      - name: Run GPU Tests
        run: |
          INSTANCE_ID=${{ steps.provision.outputs.instance_id }}
          SSH_INFO=$(vastai show instance $INSTANCE_ID --raw \
            | python3 -c "import sys,json; i=json.load(sys.stdin); print(f'{i[\"ssh_host\"]} {i[\"ssh_port\"]}')")
          SSH_HOST=$(echo $SSH_INFO | cut -d' ' -f1)
          SSH_PORT=$(echo $SSH_INFO | cut -d' ' -f2)

          ssh -p $SSH_PORT -o StrictHostKeyChecking=no root@$SSH_HOST \
            "cd /workspace && python -m pytest tests/gpu/ -v --tb=short"

      - name: Cleanup
        if: always()
        run: |
          vastai destroy instance ${{ steps.provision.outputs.instance_id }} || true

Step 2: Cost-Controlled CI

# scripts/ci_gpu_test.py — wrapper with budget controls
import subprocess, json, time, sys, os

MAX_COST = float(os.environ.get("CI_GPU_BUDGET", "1.00"))  # $1 max per run
MAX_DURATION = int(os.environ.get("CI_GPU_TIMEOUT", "1800"))  # 30 min

def ci_gpu_test(test_command):
    # Search for cheapest offer
    offers = json.loads(subprocess.run(
        ["vastai", "search", "offers",
         "num_gpus=1 gpu_ram>=8 reliability>0.90 dph_total<=0.20",
         "--order", "dph_total", "--raw", "--limit", "1"],
        capture_output=True, text=True, check=True).stdout)

    if not offers:
        print("No GPU offers available — skipping GPU tests")
        return 0

    cost_per_hour = offers[0]["dph_total"]
    max_hours = MAX_COST / cost_per_hour
    print(f"GPU: {offers[0]['gpu_name']} at ${cost_per_hour:.3f}/hr "
          f"(budget allows {max_hours:.1f}hrs)")

    # Provision, run, destroy (with timeout)
    # ... (use managed_instance pattern from sdk-patterns)

Step 3: Mock Mode for Non-GPU CI

# conftest.py — skip GPU tests when no API key available
import pytest, os

def pytest_collection_modifyitems(config, items):
    if not os.environ.get("VASTAI_API_KEY"):
        skip_gpu = pytest.mark.skip(reason="VASTAI_API_KEY not set")
        for item in items:
            if "gpu" in item.keywords:
                item.add_marker(skip_gpu)

Output

  • GitHub Actions workflow with GPU instance lifecycle
  • Cost-controlled CI with budget limits
  • Automatic cleanup on success or failure
  • Mock mode for non-GPU CI runs

Error Handling

ErrorCauseSolution
No offers in CIAll cheap GPUs rentedIncrease dph_total limit or retry later
Instance timeout in CISlow Docker pullUse pre-cached images or smaller base images
SSH fails in CIGitHub runner IP blockedUse Vast.ai API for remote execution instead
Cleanup skippedJob cancelledUse if: always() on cleanup step

Resources

Next Steps

For deployment patterns, see vastai-deploy-integration.

Examples

PR validation: Run GPU tests on every PR with a $0.50 budget cap. Skip GPU tests on draft PRs.

Nightly benchmarks: Schedule a nightly workflow that provisions an A100, runs benchmarks, saves results as artifacts, and posts a cost report.

Similar Claude Skills & Agent Workflows