# Python SDK

The official Python client for Alveare. Supports sync and async, automatic retries, and typed responses.

## Installation

Requires Python 3.8+.

```bash
pip install alveare
```

## Quick start

```python
from alveare import Alveare

client = Alveare(api_key="alv_live_abc123...")

response = client.infer(
    specialist="summarise",
    prompt="Quarterly revenue increased 23% year-over-year...",
    max_tokens=256,
)

print(response.result)
print(f"Tokens used: {response.tokens_used}")
print(f"Latency: {response.latency_ms}ms")
```

## Client initialization

The client accepts the following parameters. `api_key` and `base_url` fall back to environment variables when not passed explicitly; `timeout` and `max_retries` have built-in defaults.

| Parameter | Env variable | Default |
| --- | --- | --- |
| `api_key` | `ALVEARE_API_KEY` | None (required) |
| `base_url` | `ALVEARE_BASE_URL` | `https://api.alveare.ai` |
| `timeout` | — | 30.0 seconds |
| `max_retries` | — | 3 |

```python
import os
from alveare import Alveare

# Reads ALVEARE_API_KEY from environment automatically
client = Alveare()

# Or pass everything explicitly
client = Alveare(
    api_key=os.environ["MY_ALVEARE_KEY"],
    base_url="https://api.alveare.ai",
    timeout=60.0,
    max_retries=5,
)
```
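
The precedence between explicit arguments, environment variables, and defaults can be illustrated with a small standalone sketch (`resolve` is a hypothetical helper for illustration, not part of the SDK; the names mirror the parameter table above):

```python
import os

def resolve(explicit, env_var, default=None):
    """Mimic the client's settings precedence: an explicit argument wins,
    then the environment variable, then the built-in default."""
    if explicit is not None:
        return explicit
    return os.environ.get(env_var, default)

base_url = resolve(None, "ALVEARE_BASE_URL", "https://api.alveare.ai")
```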

## `client.infer()`

Call the native Alveare inference endpoint.

```python
# Classification
result = client.infer(
    specialist="classify",
    prompt="I need to cancel my subscription immediately",
    temperature=0.3,
)
print(result.result)   # "cancellation_request"

# Extraction with JSON output
result = client.infer(
    specialist="extract",
    prompt="""Extract contact info from this email:
Hi, I'm Jane Smith from Acme Corp.
Reach me at jane@acme.com or 555-0123.""",
    max_tokens=256,
)
print(result.result)
# {"name": "Jane Smith", "company": "Acme Corp", "email": "jane@acme.com", "phone": "555-0123"}
```
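
When a specialist returns structured output, the result may arrive as a JSON-encoded string. A defensive decode like the following keeps non-JSON results usable too (a sketch; it assumes `result.result` is a `str`, which the SDK's typed responses may already handle for you):

```python
import json

def parse_result(raw):
    """Decode a specialist's JSON output, falling back to the raw string."""
    try:
        return json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return raw

data = parse_result('{"name": "Jane Smith", "email": "jane@acme.com"}')
```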

## `client.chat.completions.create()`

OpenAI-compatible chat completions. Returns the same response shape as the OpenAI Python SDK.

```python
response = client.chat.completions.create(
    model="alveare-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is a cognitive hive?"},
    ],
    max_tokens=512,
    temperature=0.7,
)

print(response.choices[0].message.content)
print(f"Usage: {response.usage.total_tokens} tokens")
```

## `client.models.list()`

List all specialists available to your account.

```python
models = client.models.list()

for model in models.data:
    print(f"{model.id} — {model.owned_by}")

# alveare-classify — alveare
# alveare-summarise — alveare
# alveare-extract — alveare
# ...
```

## `client.usage()`

Get usage statistics for the current billing period.

```python
usage = client.usage()

print(f"Requests: {usage.requests_used} / {usage.requests_limit}")
print(f"Tokens:   {usage.tokens_used}")
print(f"Period:   {usage.period_start} to {usage.period_end}")

# Per-specialist breakdown
for name, count in usage.by_specialist.items():
    print(f"  {name}: {count}")
```
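
These fields make it easy to guard against quota exhaustion before a batch job. A minimal check (hypothetical helper, not part of the SDK; `requests_used` and `requests_limit` are the fields shown above):

```python
def quota_remaining(requests_used, requests_limit):
    """Fraction of the request quota still available this period.
    Treats a missing or zero limit as unlimited."""
    if not requests_limit:
        return 1.0
    return max(0.0, 1.0 - requests_used / requests_limit)

# Warn when less than 10% of the period's requests remain
low = quota_remaining(950, 1000) < 0.10
```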

## Async client

Use `AsyncAlveare` for async/await support. The interface is identical to the sync client.

```python
import asyncio
from alveare import AsyncAlveare

async def main():
    client = AsyncAlveare(api_key="alv_live_abc123...")

    # Run multiple inferences concurrently
    tasks = [
        client.infer(specialist="classify", prompt="Great product!"),
        client.infer(specialist="classify", prompt="Terrible experience."),
        client.infer(specialist="classify", prompt="It was okay I guess."),
    ]

    results = await asyncio.gather(*tasks)

    for r in results:
        print(r.result)

    # Chat completions work too
    chat = await client.chat.completions.create(
        model="alveare-chat",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(chat.choices[0].message.content)

asyncio.run(main())
```
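
Note that `asyncio.gather` fires every request at once. For large batches you may want to cap the number of in-flight calls so you stay under your rate limit; a generic helper like this (hypothetical, not part of the SDK) bounds concurrency with a semaphore:

```python
import asyncio

async def gather_bounded(coro_fns, limit=5):
    """Run zero-argument coroutine factories with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def run(fn):
        async with sem:
            return await fn()

    return await asyncio.gather(*(run(fn) for fn in coro_fns))

# Usage with the async client (sketch):
# results = await gather_bounded(
#     [lambda p=p: client.infer(specialist="classify", prompt=p) for p in prompts],
#     limit=5,
# )
```

Passing factories (`lambda: ...`) rather than coroutine objects matters: a coroutine is scheduled the moment it is awaited, so deferring creation is what keeps the bound effective.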

## Error handling

The SDK raises typed exceptions for different error conditions.

```python
from alveare import Alveare
from alveare.errors import (
    AlveareError,          # Base class for all errors
    AuthenticationError,   # 401 — invalid or missing API key
    RateLimitError,        # 429 — rate limit exceeded
    BadRequestError,       # 400 — invalid parameters
    PermissionError,       # 403 — plan does not allow this (note: shadows Python's builtin PermissionError)
    ServerError,           # 500 — internal server error
)

client = Alveare()

try:
    result = client.infer(
        specialist="summarise",
        prompt="Some long text...",
    )
except AuthenticationError as e:
    print(f"Check your API key: {e.message}")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except AlveareError as e:
    print(f"API error: {e.message} ({e.code})")
```

## Retries and timeouts

The client automatically retries on transient errors (429, 500, connection errors) with exponential backoff. Configure the behavior at initialization.

```python
client = Alveare(
    max_retries=5,       # Retry up to 5 times (default: 3)
    timeout=60.0,        # 60 second timeout (default: 30)
)

# Disable retries entirely
client = Alveare(max_retries=0)

# Per-request timeout override
result = client.infer(
    specialist="code",
    prompt="Write a complex algorithm...",
    max_tokens=4096,
    timeout=120.0,
)
```

The retry logic respects the `Retry-After` header from rate limit responses. For 429 errors, the SDK waits the recommended duration rather than using its own backoff.
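
The SDK's exact backoff schedule is internal, but the behavior described above can be sketched roughly as follows (illustrative only; the base delay, cap, and jitter here are assumptions, not the SDK's actual constants):

```python
import random

def retry_delay(attempt, retry_after=None, base=0.5, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-indexed).

    A server-supplied Retry-After value (as on 429 responses) takes
    priority; otherwise use capped exponential backoff with jitter.
    """
    if retry_after is not None:
        return float(retry_after)
    delay = min(cap, base * (2 ** attempt))
    return delay * random.uniform(0.5, 1.0)  # jitter avoids synchronized retries
```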