# Python SDK

The official Python client for Alveare. Supports sync and async, automatic retries, and typed responses.
## Installation

Requires Python 3.8 or later.

```bash
pip install alveare
```
## Quick start

```python
from alveare import Alveare

client = Alveare(api_key="alv_live_abc123...")

response = client.infer(
    specialist="summarise",
    prompt="Quarterly revenue increased 23% year-over-year...",
    max_tokens=256,
)

print(response.result)
print(f"Tokens used: {response.tokens_used}")
print(f"Latency: {response.latency_ms}ms")
```
## Client initialization

The client accepts the following parameters, all optional in code: `api_key` must be provided either directly or via the `ALVEARE_API_KEY` environment variable, and the rest fall back to the defaults below.
| Parameter | Env variable | Default |
|---|---|---|
| api_key | ALVEARE_API_KEY | None (required) |
| base_url | ALVEARE_BASE_URL | https://api.alveare.ai |
| timeout | -- | 30.0 seconds |
| max_retries | -- | 3 |
```python
import os

from alveare import Alveare

# Reads ALVEARE_API_KEY from the environment automatically
client = Alveare()

# Or pass everything explicitly
client = Alveare(
    api_key=os.environ["MY_ALVEARE_KEY"],
    base_url="https://api.alveare.ai",
    timeout=60.0,
    max_retries=5,
)
```
## `client.infer()`

Call the native Alveare inference endpoint.
```python
# Classification
result = client.infer(
    specialist="classify",
    prompt="I need to cancel my subscription immediately",
    temperature=0.3,
)
print(result.result)  # "cancellation_request"

# Extraction with JSON output
result = client.infer(
    specialist="extract",
    prompt="""Extract contact info from this email:
Hi, I'm Jane Smith from Acme Corp.
Reach me at jane@acme.com or 555-0123.""",
    max_tokens=256,
)
print(result.result)
# {"name": "Jane Smith", "company": "Acme Corp", "email": "jane@acme.com", "phone": "555-0123"}
```
## `client.chat.completions.create()`

OpenAI-compatible chat completions. Returns the same response shape as the OpenAI Python SDK.
```python
response = client.chat.completions.create(
    model="alveare-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is a cognitive hive?"},
    ],
    max_tokens=512,
    temperature=0.7,
)

print(response.choices[0].message.content)
print(f"Usage: {response.usage.total_tokens} tokens")
```
## `client.models.list()`

List all specialists available to your account.
```python
models = client.models.list()

for model in models.data:
    print(f"{model.id} — {model.owned_by}")

# alveare-classify — alveare
# alveare-summarise — alveare
# alveare-extract — alveare
# ...
```
## `client.usage()`

Get usage statistics for the current billing period.
```python
usage = client.usage()

print(f"Requests: {usage.requests_used} / {usage.requests_limit}")
print(f"Tokens: {usage.tokens_used}")
print(f"Period: {usage.period_start} to {usage.period_end}")

# Per-specialist breakdown
for name, count in usage.by_specialist.items():
    print(f"  {name}: {count}")
```
## Async client

Use `AsyncAlveare` for async/await support. The interface is identical to the sync client.
```python
import asyncio

from alveare import AsyncAlveare

async def main():
    client = AsyncAlveare(api_key="alv_live_abc123...")

    # Run multiple inferences concurrently
    tasks = [
        client.infer(specialist="classify", prompt="Great product!"),
        client.infer(specialist="classify", prompt="Terrible experience."),
        client.infer(specialist="classify", prompt="It was okay I guess."),
    ]
    results = await asyncio.gather(*tasks)
    for r in results:
        print(r.result)

    # Chat completions work too
    chat = await client.chat.completions.create(
        model="alveare-chat",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(chat.choices[0].message.content)

asyncio.run(main())
```
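`asyncio.gather` fires all requests at once, which can trip rate limits on large batches. One way to cap in-flight requests is `asyncio.Semaphore`; the sketch below uses a stand-in coroutine in place of `client.infer` so the pattern is clear, but the real SDK call drops in unchanged:

```python
import asyncio

async def fake_infer(prompt: str) -> str:
    # Stand-in for client.infer(specialist=..., prompt=prompt)
    await asyncio.sleep(0.01)
    return f"label-for:{prompt}"

async def bounded_gather(prompts, limit: int = 3):
    sem = asyncio.Semaphore(limit)

    async def one(prompt):
        async with sem:  # at most `limit` requests in flight at a time
            return await fake_infer(prompt)

    return await asyncio.gather(*(one(p) for p in prompts))

results = asyncio.run(bounded_gather([f"text {i}" for i in range(10)]))
print(results[0])  # label-for:text 0
```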
## Error handling

The SDK raises typed exceptions for different error conditions.
```python
from alveare import Alveare
from alveare.errors import (
    AlveareError,         # Base class for all errors
    AuthenticationError,  # 401 — invalid or missing API key
    RateLimitError,       # 429 — rate limit exceeded
    BadRequestError,      # 400 — invalid parameters
    PermissionError,      # 403 — plan does not allow this (note: importing it shadows Python's built-in PermissionError)
    ServerError,          # 500 — internal server error
)

client = Alveare()

try:
    result = client.infer(
        specialist="summarise",
        prompt="Some long text...",
    )
except AuthenticationError as e:
    print(f"Check your API key: {e.message}")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except AlveareError as e:
    print(f"API error: {e.message} ({e.code})")
```
## Retries and timeouts

The client automatically retries on transient errors (429, 500, connection errors) with exponential backoff. Configure the behavior at initialization.
```python
client = Alveare(
    max_retries=5,  # Retry up to 5 times (default: 3)
    timeout=60.0,   # 60-second timeout (default: 30)
)

# Disable retries entirely
client = Alveare(max_retries=0)

# Per-request timeout override
result = client.infer(
    specialist="code",
    prompt="Write a complex algorithm...",
    max_tokens=4096,
    timeout=120.0,
)
```
The retry logic respects the `Retry-After` header on rate-limit responses: for 429 errors, the SDK waits the server-recommended duration rather than using its own backoff.
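The exact backoff schedule is internal to the SDK, but the behavior described above can be sketched roughly as follows. The base delay, cap, and jitter-free doubling here are illustrative assumptions, not the SDK's actual constants:

```python
from typing import Optional

def retry_delay(attempt: int, retry_after: Optional[float] = None,
                base: float = 0.5, cap: float = 8.0) -> float:
    """Delay in seconds before retry number `attempt` (0-indexed).

    If the server sent a Retry-After value (429), honor it; otherwise
    use capped exponential backoff: base * 2**attempt.
    """
    if retry_after is not None:
        return retry_after
    return min(cap, base * (2 ** attempt))

print(retry_delay(0))                    # 0.5
print(retry_delay(3))                    # 4.0
print(retry_delay(10))                   # 8.0 (capped)
print(retry_delay(2, retry_after=12.0))  # 12.0 (server wins)
```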