Aqta

Rate Limits

Aqta enforces rate limits per API key to ensure fair usage and system stability. Limits vary by tier.


Limits by tier

TierRequests / monthRate limitBurst
Free5005 / min10
Starter10,000100 / min200
Pro100,0001,000 / min2,000
EnterpriseUnlimitedCustomCustom

Burst limit

Burst lets you briefly exceed the per-minute rate — useful for spiky workloads. For example, Free tier allows 10 requests in a short burst even though the sustained rate is 5 / min.

Model availability by tier

Free — cost-effective models only:

  • GPT-4o mini, GPT-3.5 Turbo
  • Claude 3 Haiku
  • Gemini 1.5 Flash

Starter, Pro, Enterprise — all models:

  • GPT-4o, GPT-4 Turbo
  • Claude 3.5 Sonnet, Claude 3 Opus
  • Gemini 1.5 Pro, Gemini 2.0 Flash
  • Perplexity Sonar Pro

How rate limiting works

Per-minute window (sliding)

Requests are counted in a rolling 60-second window:

Time (s):    0    10   20   30   40   50   60
Requests:    3     2    0    0    0    0    0
In window:   3     5    5    5    5    5    5  ← all still within last 60s

Once you hit the limit, requests return 429 until the window clears.

Monthly limit

Resets on the 1st of each month at 00:00 UTC.

If you exhaust your monthly quota mid-month, all requests return 429 until the next cycle. Upgrading your tier restores access immediately.


Rate limit headers

Every response includes:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 94
X-RateLimit-Reset: 1743465600
HeaderDescription
X-RateLimit-LimitMax requests per minute for your tier
X-RateLimit-RemainingRequests left in the current window
X-RateLimit-ResetUnix timestamp when the window resets

Handling 429 errors

Error response

{
  "error": {
    "message": "Rate limit exceeded. Retry after 30 seconds.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded",
    "retry_after": 30
  }
}

Retry with backoff (Python)

import time
from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://api.aqta.ai/v1",
    api_key="sk-aqta-your-key-here",
)

def chat_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
            )
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise
            wait = 2 ** attempt  # 1s, 2s, 4s
            print(f"Rate limited, retrying in {wait}s...")
            time.sleep(wait)

Retry with backoff (JavaScript)

import OpenAI, { RateLimitError } from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.aqta.ai/v1',
  apiKey: 'sk-aqta-your-key-here',
});

async function chatWithRetry(messages: OpenAI.ChatCompletionMessageParam[], maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.chat.completions.create({ model: 'gpt-4o', messages });
    } catch (err) {
      if (err instanceof RateLimitError && attempt < maxRetries - 1) {
        const wait = Math.pow(2, attempt) * 1000;
        await new Promise(r => setTimeout(r, wait));
      } else {
        throw err;
      }
    }
  }
}

Best practices

Watch the headers — check X-RateLimit-Remaining before it hits zero:

# After each request, inspect remaining quota
response = client.chat.completions.create(...)
# Access via response.headers if using raw HTTP, or monitor in the dashboard

Exponential backoff — don't hammer the API after a 429. Start at 1s, double each retry.

Batch when possible — one request with multiple items is better than many individual requests.

Cache repeated prompts — identical prompts return the same result; cache at the application layer to avoid redundant calls.


Limits and edge cases

  • Streaming requests count as one request, regardless of stream duration.
  • Failed requests (4xx, 5xx) still count toward your rate limit.
  • Rate limits are per API key, not per account. Create multiple keys to distribute load across services.

Upgrading

Visit app.aqta.ai/pricing to upgrade. New limits apply immediately — no downtime.

Need a temporary limit increase for a launch or batch job? Email hello@aqta.ai.


Next steps


Questions? hello@aqta.ai