Rate Limits and Errors

This page describes how Vast.ai public API errors and rate limits work, along with practical retry guidance.

Error Responses

Error responses vary slightly by endpoint. The most common error response shape is:

{
  "success": false,
  "error": "invalid_args",
  "msg": "Human-readable description of the problem."
}

Some endpoints omit the boolean success field. Others omit error and return only msg or message.
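Because the shape varies, client code may want to normalize error bodies before acting on them. The following is a minimal sketch, not an official client: the helper name parse_error and the rule "treat any HTTP status of 400 or above, or success set to false, as an error" are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def parse_error(status: int, body: Dict[str, Any]) -> Optional[Dict[str, Optional[str]]]:
    """Normalize an error body into {"code", "message"}; return None if it is not an error.

    Hypothetical helper. Assumes an error is signalled by an HTTP status >= 400
    or by "success": false in the JSON body.
    """
    if status < 400 and body.get("success") is not False:
        return None
    # "error" may be absent; the description may live under "msg" or "message".
    code = body.get("error")
    message = body.get("msg") or body.get("message")
    return {"code": code, "message": message}
```

For example, parse_error(400, {"success": False, "error": "invalid_args", "msg": "bad id"}) yields a dict with code "invalid_args", while a body containing only message yields code None.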

Rate Limits

How rate limits are applied

Vast.ai applies rate limits per endpoint and per identity. Unlike many other services, the limit is enforced as a minimum interval between requests for a given endpoint and identity. Enforcement is not a hard cutoff: whether a request that arrives too soon is rejected is determined probabilistically. The identity is determined by the bearer token, session user, and api_key query parameter, and falls back to the client IP.
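Since the limit is a minimum interval per endpoint and identity, one way to stay under it is to pace requests on the client with the same key. This is a sketch under assumptions: the class name MinIntervalPacer is hypothetical, and the interval value is something you choose, since this page does not list per-endpoint intervals.

```python
import time
from typing import Callable, Dict, Tuple


class MinIntervalPacer:
    """Client-side pacing: enforce a minimum gap between calls per (identity, endpoint)."""

    def __init__(
        self,
        min_interval: float,  # seconds; an assumed value, not one published by the API
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Dict[Tuple[str, str], float] = {}

    def wait(self, identity: str, endpoint: str) -> float:
        """Block until the next call for this key is allowed; return seconds slept."""
        key = (identity, endpoint)
        now = self._clock()
        last = self._last.get(key)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when this call is actually released.
        self._last[key] = now + delay
        return delay
```

Call pacer.wait(identity, endpoint) immediately before each request. Different endpoints are tracked separately, so pacing one does not delay another.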

Rate limit response and recommended retry behavior

When you hit a rate limit, you will receive HTTP 429. The response body typically includes the threshold for that endpoint, in seconds:

API requests too frequent: endpoint threshold=...

We recommend retrying the call after waiting for the threshold given in the response.
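The retry guidance can be sketched as follows. Several details here are assumptions: that the threshold appears in the body as a plain number after threshold=, the fallback delay used when no number can be parsed, and the helper names parse_threshold and call_with_retry. The send callable stands in for whatever HTTP client you use and returns a (status, body_text) pair.

```python
import re
import time
from typing import Callable, Optional, Tuple

# Assumed format: a decimal number directly after "threshold=".
_THRESHOLD_RE = re.compile(r"threshold=(\d+(?:\.\d+)?)")


def parse_threshold(body_text: str) -> Optional[float]:
    """Extract the threshold in seconds from a 429 body, or None if not found."""
    match = _THRESHOLD_RE.search(body_text)
    return float(match.group(1)) if match else None


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    fallback_delay: float = 1.0,  # assumed default when the body has no number
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send(); on HTTP 429, wait for the reported threshold and try again."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return on any non-429 result, or when attempts are exhausted.
        if status != 429 or attempt == max_attempts:
            return status, body
        delay = parse_threshold(body)
        sleep(delay if delay is not None else fallback_delay)
    raise AssertionError("unreachable")
```

If the last attempt still returns 429, the function hands that response back to the caller rather than raising, so the caller decides whether to give up or queue the work for later.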

How to reduce rate limit errors

Batch requests where supported, rather than calling many single-item endpoints.
Reduce polling: use longer polling intervals, or cache results client-side.
Spread traffic over time: avoid bursts; use a queue or scheduler.
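As an illustration of the caching suggestion above, a small time-to-live cache can keep repeated polls from reaching the API more often than needed. This is a generic sketch, not part of any Vast.ai SDK: the class name TTLCache and the fetch callable are hypothetical, and the TTL is a value you pick for your own use case.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache results for ttl seconds so repeated polls reuse the last response."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch() only if it is missing or stale."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

Here fetch would wrap the actual API request; within the TTL window, callers receive the stored result and no request is sent.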

If you need higher limits for legitimate production usage, contact support with the endpoint(s), your expected call rate, and your account details.
