
Project description

CostWise MCP SDK

Automatically reduce AI costs without changing your workflow.

The CostWise Managed Cost Policy (MCP) SDK analyzes your LLM requests, classifies task complexity, and recommends cheaper models and token limits — saving up to 90% on AI costs.

Install

pip install costwise-mcp

Quick Start

from costwise_mcp import CostPolicy

policy = CostPolicy(
    api_key="cw_your_api_key",
    backend_url="https://app.cost-wise.dev",  # your CostWise instance
)

decision = policy.optimize(
    prompt="Translate 'hello' to French",
    model="gpt-5.4",
)

print(decision.recommended_model)  # "gpt-5.4-mini" (70% cheaper)
print(decision.max_tokens)         # 256
print(decision.estimated_cost)     # 0.001154 (USD)

# Call your LLM with the optimized params
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model=decision.recommended_model,
    max_tokens=decision.max_tokens,
    messages=[{"role": "user", "content": "Translate 'hello' to French"}],
)

# Report actual usage (async, non-blocking, optional)
policy.report(decision, actual_tokens=response.usage.total_tokens)

Team Policies (AI Governance)

Enforce model access and budgets per team:

from costwise_mcp import CostPolicy

policy = CostPolicy(api_key="cw_...", backend_url="https://app.cost-wise.dev")

# Interns: restricted to budget models (tier 3), max $50/month
decision = policy.optimize(
    prompt="Write a summary",
    model="gpt-5.4",      # they asked for premium
    team="interns",        # policy enforces tier 3
)
# decision.recommended_model = "gpt-5.4-nano"
# decision.message = "Team 'interns' restricted to tier 3 models"

# Engineers: standard tier, $500/month budget
decision = policy.optimize(
    prompt="Debug this code",
    model="gpt-5.4",
    team="engineers",      # policy enforces tier 2
)
# decision.recommended_model = "gpt-5.4-mini"

# AI Research: full access, no budget limit
decision = policy.optimize(
    prompt="Analyze this architecture...",
    model="gpt-5.4",
    team="ai-team",        # tier 1 = all models allowed
)
# decision.recommended_model = "gpt-5.4" (kept for complex tasks)

Team tier levels

| Tier | Access | Example teams |
| --- | --- | --- |
| 1 — Premium | All models (gpt-5.4, claude-opus-4-6, etc.) | AI Research, CTO |
| 2 — Standard | Mid-tier models (gpt-5.4-mini, claude-sonnet-4-6) | Engineers, Data Science |
| 3 — Budget | Cheapest models only (gpt-5.4-nano, claude-haiku-4-5) | Interns, Support, QA |

Configure team policies in CostWise: Settings > MCP Teams.
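The tier rules above can be sketched as a simple lookup. This is an illustrative, self-contained sketch only: the real enforcement happens in the CostWise backend, and the team names, model-tier assignments, and fallback choices here are assumptions drawn from the examples above.

```python
# Illustrative sketch of tier-based model enforcement; the actual logic
# lives in the CostWise backend and may differ.
TEAM_TIERS = {"ai-team": 1, "engineers": 2, "interns": 3}

# Cheapest allowed model per tier (tier 1 keeps the requested model).
TIER_FALLBACKS = {1: None, 2: "gpt-5.4-mini", 3: "gpt-5.4-nano"}

MODEL_TIERS = {"gpt-5.4": 1, "gpt-5.4-mini": 2, "gpt-5.4-nano": 3}

def enforce_tier(requested_model: str, team: str) -> str:
    """Downgrade the requested model if it sits above the team's tier."""
    max_tier = TEAM_TIERS.get(team, 3)           # unknown teams -> budget tier
    model_tier = MODEL_TIERS.get(requested_model, 1)
    if model_tier >= max_tier:                   # higher number = cheaper tier
        return requested_model                   # already within policy
    return TIER_FALLBACKS[max_tier]

print(enforce_tier("gpt-5.4", "interns"))    # gpt-5.4-nano
print(enforce_tier("gpt-5.4", "ai-team"))    # gpt-5.4
```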

Error Handling

from costwise_mcp import (
    CostPolicy,
    BudgetExceededError,
    ModelBlockedError,
    TierRestrictionError,
    PolicyViolationError,
)

policy = CostPolicy(api_key="cw_...", backend_url="https://app.cost-wise.dev")

decision = policy.optimize(
    prompt="Generate a report",
    model="gpt-5.4",
    team="interns",
)

if not decision.allowed:
    print(f"Blocked: {decision.message}")
    # "Team 'interns' monthly budget exceeded ($50.42 / $50.00)"
else:
    # Proceed with the LLM call
    print(f"Use {decision.recommended_model}, max {decision.max_tokens} tokens")

Error types

| Error | When | Fields |
| --- | --- | --- |
| BudgetExceededError | Team monthly/daily budget exceeded | current_spend, budget_limit |
| ModelBlockedError | Model is on the team's blocklist | blocked_model, alternative |
| TierRestrictionError | Model tier above team's limit | model_tier, max_tier |
| PolicyViolationError | Any team policy violation (base) | team, reason |
| InvalidAPIKeyError | API key invalid or expired | — |
| RateLimitError | Backend rate limit exceeded | retry_after |
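Because PolicyViolationError is documented as the base of the team-policy errors, a single except clause can catch all of them. Below is a minimal sketch of that pattern using stand-in classes (the real classes live in costwise_mcp and carry the fields listed above; the handler strings are just examples):

```python
# Stand-in exception classes mirroring the documented hierarchy; the real
# classes live in costwise_mcp and carry the fields listed above.
class PolicyViolationError(Exception):
    pass

class BudgetExceededError(PolicyViolationError):
    pass

class TierRestrictionError(PolicyViolationError):
    pass

def handle(exc: Exception) -> str:
    """Branch on the specific error, falling back to the common base."""
    try:
        raise exc
    except BudgetExceededError:
        return "defer until the budget resets"
    except PolicyViolationError:   # also catches tier/blocklist violations
        return "retry with a cheaper model"

print(handle(BudgetExceededError()))   # defer until the budget resets
print(handle(TierRestrictionError()))  # retry with a cheaper model
```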

Framework Integration

The SDK works with any AI framework — it runs before the LLM call, not instead of it.

OpenAI SDK

from costwise_mcp import CostPolicy
from openai import OpenAI

policy = CostPolicy(api_key="cw_...", backend_url="https://app.cost-wise.dev")
client = OpenAI()

prompt = "Summarize this article"
decision = policy.optimize(prompt, model="gpt-5.4", team="engineers")
response = client.chat.completions.create(
    model=decision.recommended_model,
    max_tokens=decision.max_tokens,
    messages=[{"role": "user", "content": prompt}],
)
policy.report(decision, actual_tokens=response.usage.total_tokens)

Anthropic SDK

from costwise_mcp import CostPolicy
import anthropic

policy = CostPolicy(api_key="cw_...", backend_url="https://app.cost-wise.dev")
client = anthropic.Anthropic()

prompt = "Compare these two design documents"
decision = policy.optimize(prompt, model="claude-opus-4-6", team="ai-team")
message = client.messages.create(
    model=decision.recommended_model,
    max_tokens=decision.max_tokens,
    messages=[{"role": "user", "content": prompt}],
)
policy.report(decision, output_tokens=message.usage.output_tokens)

LangChain

from costwise_mcp import CostPolicy
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

policy = CostPolicy(api_key="cw_...", backend_url="https://app.cost-wise.dev")

prompt = "Summarize our release notes"
decision = policy.optimize(prompt, model="gpt-5.4", team="engineers")
llm = ChatOpenAI(model=decision.recommended_model, max_tokens=decision.max_tokens)
result = llm.invoke([HumanMessage(content=prompt)])

LlamaIndex

from costwise_mcp import CostPolicy
from llama_index.llms.openai import OpenAI

policy = CostPolicy(api_key="cw_...", backend_url="https://app.cost-wise.dev")

prompt = "Summarize this document"
decision = policy.optimize(prompt, model="gpt-5.4", team="engineers")
llm = OpenAI(model=decision.recommended_model, max_tokens=decision.max_tokens)
response = llm.complete(prompt)

Google Gemini

from costwise_mcp import CostPolicy
import google.generativeai as genai

policy = CostPolicy(api_key="cw_...", backend_url="https://app.cost-wise.dev")

prompt = "Describe this dataset"
decision = policy.optimize(prompt, model="gemini-2.5-pro", team="engineers")
model = genai.GenerativeModel(decision.recommended_model)
response = model.generate_content(
    prompt,
    generation_config={"max_output_tokens": decision.max_tokens},
)

Custom Configuration

from costwise_mcp import CostPolicy, PolicyConfig

policy = CostPolicy(
    api_key="cw_...",
    backend_url="https://app.cost-wise.dev",
    config=PolicyConfig(
        max_tokens_simple=128,      # override default 256
        max_tokens_medium=512,      # override default 1024
        max_tokens_complex=2048,    # override default 4096
        auto_downgrade=True,        # auto-select cheaper models for simple tasks
        blocked_models=["o1-pro"],  # local blocklist (in addition to team policy)
        telemetry_batch_size=100,   # send every 100 events (default: 50)
        telemetry_flush_interval=60, # send every 60 seconds (default: 30)
    ),
    project_id="my-chatbot",
)
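The telemetry_batch_size and telemetry_flush_interval settings describe a standard batch-or-timeout flush pattern. Here is an illustrative sketch of that pattern (not the SDK's internals; the class and field names are assumptions):

```python
import time

class TelemetryBuffer:
    """Flush when the batch fills up or the interval elapses, whichever comes first."""

    def __init__(self, batch_size=50, flush_interval=30.0):
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.events = []
        self.last_flush = time.monotonic()
        self.flushed = []                 # stands in for the network send

    def add(self, event):
        self.events.append(event)
        now = time.monotonic()
        if (len(self.events) >= self.batch_size
                or now - self.last_flush >= self.flush_interval):
            self.flush(now)

    def flush(self, now=None):
        if self.events:
            self.flushed.append(list(self.events))   # one POST per batch
            self.events.clear()
        self.last_flush = now if now is not None else time.monotonic()

buf = TelemetryBuffer(batch_size=3)
for i in range(7):
    buf.add({"event": i})
print(len(buf.flushed))   # 2 full batches sent; 1 event still buffered
```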

Supported Models (75 models, 10 providers)

| Provider | Models |
| --- | --- |
| OpenAI (18) | gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-4.1, gpt-4o, gpt-4o-mini, o1, o1-mini, o3, o3-mini, o3-pro, o4-mini |
| Anthropic (23) | claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5, claude-opus-4-5, claude-sonnet-4-5, claude-3.5-sonnet, claude-3.5-haiku |
| Google (7) | gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash |
| Mistral (7) | mistral-large, mistral-small, open-mistral-nemo, codestral, pixtral-large |
| Meta/Llama (6) | llama-3.3-70b, llama-3.1-405b, llama-4-scout, llama-4-maverick |
| Cohere (4) | command-r-plus, command-r, command-light, command-a |
| xAI/Grok (3) | grok-3, grok-3-mini, grok-2 |
| AWS Nova (3) | nova-pro, nova-lite, nova-micro |
| DeepSeek (2) | deepseek-chat, deepseek-reasoner |
| AI21 (2) | jamba-1.5-large, jamba-1.5-mini |

Privacy

  • Prompts are never sent to the CostWise backend
  • Only metadata: token counts, model name, cost, team ID
  • Telemetry is optional: PolicyConfig(telemetry_enabled=False)
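Given the guarantees above, a telemetry event carries only metadata along these lines. This is an illustrative shape, not the SDK's actual wire format; the field names are assumptions.

```python
# Illustrative metadata-only telemetry event; field names are assumptions,
# not the SDK's actual wire format.
event = {
    "model": "gpt-5.4-mini",
    "input_tokens": 12,
    "output_tokens": 48,
    "cost_usd": 0.001154,
    "team": "engineers",
}

# The prompt text itself is never part of the payload.
assert "prompt" not in event
print(sorted(event))
```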

API Reference

CostPolicy(api_key, backend_url, config, project_id)

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| api_key | str | Yes | CostWise API key (cw_...). Get one at Settings > MCP SDK. |
| backend_url | str | No | Your CostWise instance URL. Default: https://app.cost-wise.dev |
| config | PolicyConfig | No | Custom token limits, budgets, blocked models |
| project_id | str | No | Group analytics by project |

policy.optimize(prompt, model, task_type, max_budget, team) → Decision

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | str | Yes | Prompt text (for token estimation + complexity classification) |
| model | str | Yes | Model you intend to use (e.g. "gpt-5.4") |
| task_type | TaskComplexity | No | Override auto-classification: SIMPLE, MEDIUM, COMPLEX |
| max_budget | float | No | Max cost in USD for this request |
| team | str | No | Team ID for policy enforcement (e.g. "interns") |
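When task_type is omitted, the SDK classifies complexity automatically. As a rough illustration of what such a classifier might look at, here is a toy length-and-keyword heuristic; the real classifier is part of CostWise and almost certainly more sophisticated, so treat every threshold and keyword below as an assumption.

```python
def classify(prompt: str) -> str:
    """Toy length/keyword heuristic; the SDK's real classifier differs."""
    words = prompt.split()
    # Keywords suggesting analytical work push the task upward.
    if any(k in prompt.lower() for k in ("analyze", "architecture", "debug")):
        return "COMPLEX" if len(words) > 20 else "MEDIUM"
    return "SIMPLE" if len(words) <= 10 else "MEDIUM"

print(classify("Translate 'hello' to French"))   # SIMPLE
print(classify("Debug this code"))               # MEDIUM
```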

policy.report(decision, actual_tokens, output_tokens, latency_ms)

Report actual usage after the LLM call. Asynchronous and non-blocking.
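Because report() is documented as asynchronous and non-blocking, it behaves like a fire-and-forget call. A minimal sketch of that pattern with a background worker thread (illustrative only, not the SDK's implementation; all names here are assumptions):

```python
import queue
import threading

# Illustrative fire-and-forget reporting; the SDK's own report() is
# documented as async and non-blocking, but its internals may differ.
events = queue.Queue()
sent = []                              # stands in for the CostWise backend

def sender():
    while True:
        event = events.get()
        if event is None:              # shutdown sentinel
            break
        sent.append(event)             # a real sender would POST here
        events.task_done()

worker = threading.Thread(target=sender, daemon=True)
worker.start()

def report(decision_id, actual_tokens):
    """Enqueue the usage event and return immediately."""
    events.put({"decision": decision_id, "tokens": actual_tokens})

report("dec_123", actual_tokens=57)
events.join()                          # demo only; callers never wait
print(sent)
```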

Decision object

| Field | Type | Description |
| --- | --- | --- |
| recommended_model | str | Model to use |
| max_tokens | int | Token limit |
| estimated_cost | float | Cost in USD |
| original_model | str | Originally requested model |
| input_tokens | int | Estimated input tokens |
| complexity | TaskComplexity | SIMPLE, MEDIUM, or COMPLEX |
| allowed | bool | Whether the request is within budget/policy |
| savings_pct | float | Percentage saved vs. the original model |
| estimated_savings | float | USD saved |
| message | str | Human-readable recommendation |

Get Your API Key

  1. Sign in to your CostWise instance
  2. Go to Settings > MCP SDK
  3. Click Generate API Key
  4. Copy the cw_... key
  5. Set up team policies at Settings > MCP Teams

Download files

Download the file for your platform.

Source Distribution

costwise_mcp-0.2.0.tar.gz (20.0 kB)

Uploaded Source

Built Distribution


costwise_mcp-0.2.0-py3-none-any.whl (19.5 kB)

Uploaded Python 3

File details

Details for the file costwise_mcp-0.2.0.tar.gz.

File metadata

  • Download URL: costwise_mcp-0.2.0.tar.gz
  • Upload date:
  • Size: 20.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for costwise_mcp-0.2.0.tar.gz
Algorithm Hash digest
SHA256 6db10830b5408a96e1337fb5b9a0077fa50a7b8f73c582862cf96904958cad17
MD5 2aa33dbc76ef7b00edb71f11e86d63a0
BLAKE2b-256 06f5b6ca8d9ddd7de186b4f5ebe7902c9793bcdc4a19f2146ca047390c2c65c8


File details

Details for the file costwise_mcp-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: costwise_mcp-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 19.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for costwise_mcp-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b0a6afb622bd485896427093545c4100744d280f723098d7e0824c2a970287c2
MD5 ba894b3297708fac63f48b122b549f4d
BLAKE2b-256 63d373950d9425878576154928cdd136cc65d366d348612bdf525fccbe40e380

