RESILIENCE AND RATE LIMITS
Neura combines Redis-backed rate limiting, Opossum circuit breakers, and structured logging to keep the API responsive under load. This document summarizes the protections and offers guidance for client-side retry logic.
Rate limiting
Global budget: 300 requests per second per agent, measured by route grouping.
Penalty window: Exceeding the budget blocks the offending agent for 5 seconds.
HTTP response: 429 Too Many Requests with JSON body:
{
  "success": false,
  "error": "Rate limit exceeded",
  "retryAfter": 5,
  "timestamp": "2025-01-15T12:34:56.789Z"
}
Headers: The middleware does not override Retry-After, but the JSON payload exposes retryAfter in seconds.
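In TypeScript terms, the payload can be modeled directly. The interface and guard below are illustrative sketches of the shape shown above, not types exported by the API:

// TypeScript view of the 429 payload documented above.
interface RateLimitError {
  success: false;
  error: string;
  retryAfter: number;   // seconds until the penalty window ends
  timestamp: string;    // ISO 8601
}

// Narrow an unknown JSON body to the rate limit shape.
function isRateLimitError(body: unknown): body is RateLimitError {
  return typeof body === 'object' && body !== null &&
    (body as RateLimitError).success === false &&
    typeof (body as RateLimitError).retryAfter === 'number';
}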
Best practices
Implement exponential backoff with jitter. Start with 1 second and cap near the retryAfter value (a sketch follows this list).
Throttle discovery calls; invoice responses count towards rate limits even if unpaid.
Spread polling across the /solana and /base prefixes if you operate separate clients per network.
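A minimal backoff-with-jitter sketch, assuming Node 18+ for the global fetch; the URL argument and the 5 attempt cap are placeholders:

// Retry on 429 with exponential backoff and full jitter.
async function fetchWithBackoff(url: string, maxAttempts = 5): Promise<Response> {
  let delayMs = 1000; // start at 1 second, per the guidance above
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetch(url);
    if (res.status !== 429) return res;

    // Cap the wait near the server-reported retryAfter (seconds).
    const body = await res.json();
    const capMs = (typeof body.retryAfter === 'number' ? body.retryAfter : 5) * 1000;

    // Full jitter: sleep a random duration up to the current delay.
    const sleepMs = Math.min(Math.random() * delayMs, capMs);
    await new Promise((resolve) => setTimeout(resolve, sleepMs));
    delayMs = Math.min(delayMs * 2, capMs);
  }
  throw new Error(`Still rate limited after ${maxAttempts} attempts`);
}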
Circuit breakers
Each upstream data call is protected by an Opossum breaker with the following configuration:
Timeout: 30 seconds
Error threshold: 50 percent
Volume threshold: 10 requests
Reset timeout: 30 seconds
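Expressed as Opossum options, those thresholds map onto a breaker like the sketch below; upstreamCall is a stand-in for whatever data fetch the API actually protects:

import CircuitBreaker from 'opossum';

// Stand-in for the upstream data call the breaker protects.
async function upstreamCall(route: string): Promise<unknown> {
  const res = await fetch(route);
  if (!res.ok) throw new Error(`upstream ${res.status}`);
  return res.json();
}

// The documented thresholds expressed as Opossum options.
const breaker = new CircuitBreaker(upstreamCall, {
  timeout: 30_000,              // fail calls that exceed 30 seconds
  errorThresholdPercentage: 50, // open at a 50 percent failure rate
  volumeThreshold: 10,          // require 10 calls before tripping
  resetTimeout: 30_000,         // try half-open after 30 seconds
});

breaker.on('open', () => console.warn('breaker open'));
breaker.on('halfOpen', () => console.info('breaker half-open'));

// Calls go through the breaker rather than directly to the upstream.
const data = await breaker.fire('https://example.com/upstream');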
When the breaker opens, the API immediately returns a 503-style error, which the router surfaces as a 500 with a descriptive message. The breaker transitions to half-open after the reset window and closes once calls succeed again.
Client guidance
Treat repeated 500 errors with the same timestamp as transient. Retry with a 10 to 30 second delay.
Tune your monitoring to alert if breaker-related errors exceed normal baselines.
Retries
The API retries upstream calls up to three times with exponential backoff (1 to 10 seconds). If all attempts fail, the error cascades to the client with success: false.
Client applications should:
Avoid immediate retries if the payload already failed after backend retries (see the sketch after this list).
Surface error messages to operators; they include enough context to determine whether the issue is user input or infrastructure.
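A rough classification sketch tying this guidance together; the regular expression heuristic and the handleFailure name are illustrative assumptions, not part of the API:

// Hypothetical error payload after backend retries are exhausted.
interface ApiError {
  success: false;
  error: string;
  timestamp: string;
}

// Backend retries already ran, so only schedule a delayed retry for
// infrastructure-looking failures; everything else goes to an operator.
function handleFailure(body: ApiError): 'retry-later' | 'escalate' {
  const transient = /timeout|unavailable|breaker/i.test(body.error);
  if (transient) {
    console.warn(`transient failure at ${body.timestamp}: ${body.error}`);
    return 'retry-later'; // wait 10 to 30 seconds before retrying
  }
  console.error(`operator attention needed: ${body.error}`);
  return 'escalate';
}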
Observability
Structured logs
Logging uses Pino. Set LOG_LEVEL=debug to capture retry and breaker events locally. Production deployments commonly use info to reduce noise while retaining key resilience events.
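A minimal sketch of that setup using the standard pino package; the logger name is hypothetical:

import pino from 'pino';

// Honor LOG_LEVEL, defaulting to the quieter production setting.
const logger = pino({
  name: 'neura-api', // hypothetical service name
  level: process.env.LOG_LEVEL ?? 'info',
});

logger.debug('retry scheduled'); // emitted only at LOG_LEVEL=debug
logger.info('breaker closed');   // emitted at info and debug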
Health endpoints
GET /health
Returns uptime, memory usage, and deployment version. Use for liveness checks.
GET /health/resilience
Returns rate limiter counters for tracked keys in the form rateLimit_neura:key along with remaining points and msBeforeNext.
Sample GET /health/resilience response:
{
"status": "healthy",
"timestamp": "2025-01-15T12:34:56.789Z",
"resilience": {
"rateLimit_neura:token": {
"consumedPoints": 42,
"remainingPoints": 258,
"msBeforeNext": 1200
}
}
}

The key prefixes follow internal naming conventions. Treat them as opaque identifiers and focus on the remaining point totals.
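As an example, a watchdog could poll this endpoint and warn when headroom runs low. The base URL and the 10 percent threshold below are assumptions, not documented defaults:

const BASE_URL = 'http://localhost:3000'; // placeholder deployment URL

interface LimiterState {
  consumedPoints: number;
  remainingPoints: number;
  msBeforeNext: number;
}

// Warn when any tracked rate limit key is close to exhaustion.
async function checkResilience(): Promise<void> {
  const res = await fetch(`${BASE_URL}/health/resilience`);
  const body = (await res.json()) as {
    status: string;
    resilience: Record<string, LimiterState>;
  };
  for (const [key, state] of Object.entries(body.resilience)) {
    const budget = state.consumedPoints + state.remainingPoints;
    if (state.remainingPoints < budget * 0.1) {
      console.warn(`${key}: ${state.remainingPoints} points left, resets in ${state.msBeforeNext} ms`);
    }
  }
}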
Production checklist
Monitor 429 rates and circuit breaker warnings.
Scale Redis with sufficient throughput to handle spikes.
Mirror resilience configuration across /solana and /base if you run separate clusters.
Automate invoice replay with retries, respecting the guidance above to avoid flapping breakers.