Principles

The core design principles that guide how BluesMinds works.

1. OpenAI Compatibility First

BluesMinds is built to be a drop-in replacement for any OpenAI client. Change only your base_url — no other code changes required. This means:

All OpenAI SDK methods work unchanged
Response shapes match the OpenAI specification
Error codes and types mirror OpenAI conventions
Streaming uses the same SSE format (data: {json} … data: [DONE])

2. Unified, Single Endpoint

Instead of juggling multiple API keys, SDKs, and billing accounts across providers, BluesMinds exposes one URL for everything:

https://api.bluesminds.com/v1

Your application code never needs to know which underlying provider is serving a request.

3. Automatic Failover & High Availability

If a provider is degraded or rate-limited, BluesMinds automatically reroutes traffic to an available alternative — with no action required from you. This delivers:

Higher effective uptime than any single provider
Transparent failover (same response format)
Per-model fallback chains configured server-side

4. Cost Optimization

BluesMinds selects the most cost-effective provider for a given model when multiple are available. You benefit from competitive pricing without manually comparing provider rates.

5. Privacy by Default

Request and response bodies are not stored by default
You can explicitly opt in to logging via the Management API for audit/debug purposes
API keys are scoped and revocable — never embed them in client apps
The Management API uses short-lived session tokens, separate from LLM API keys

6. Standard Rate Limiting

Rate limits are enforced per API key:

Plan	RPM
Free	20
Trial Pack	15
10-Day Pass	15
Unlimited	15
Enterprise	Custom

On 429 Too Many Requests, use jittered exponential backoff and honor the Retry-After header. Do not retry 401, 403, or 404 without fixing the underlying issue.

7. Explicit Model Selection

BluesMinds does not silently substitute models. If you request gpt-4o and it is unavailable, you receive a 404 — not a silent downgrade to a cheaper model. Always:

Call GET /v1/models to discover available models
Keep your model allowlist in sync with the live model list
Handle 404 in your application to present a meaningful user error

8. Separation of LLM and Management APIs

Two distinct API surfaces with different authentication:

Surface	Base Path	Auth Method
LLM (inference)	`/v1/*`	`Bearer sk-...` (API key)
Management	`/api/*`	Session token from login

Never use an API key for management calls, or vice versa. This separation limits blast radius if a key is compromised.

1. OpenAI Compatibility First​

2. Unified, Single Endpoint​

3. Automatic Failover & High Availability​

4. Cost Optimization​

5. Privacy by Default​

6. Standard Rate Limiting​

7. Explicit Model Selection​

8. Separation of LLM and Management APIs​