Semvec — patent-pending persistent semantic state engine


Constant-cost semantic memory for LLM agents — drop-in alternative to mem0, Letta, and LangChain Memory.

Semvec replaces unbounded conversation history with a fixed-size semantic state vector (384-d by default; the dimension is configurable) plus a tiered, content-aware memory. The cost of every LLM call stays constant — turn 10 and turn 10 000 carry the same input footprint — and the agent still has structured access to decisions, invariants, error patterns, and prior context across sessions.
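The fixed-size update can be pictured as an exponential moving average over embedding vectors. This is a toy sketch of the idea only — the actual engine internals are proprietary and patent-pending, and `alpha` here is an assumed smoothing factor, not a semvec parameter:

```python
def ema_update(state, embedding, alpha=0.1):
    """Fold one turn's embedding into a fixed-size state vector.

    The state never grows: turn 10 and turn 10 000 produce a vector of
    the same dimension, so the per-call input footprint stays O(1).
    """
    return [(1 - alpha) * s + alpha * e for s, e in zip(state, embedding)]

state = [0.0] * 384          # fixed 384-d state
for turn_embedding in ([1.0] * 384, [0.5] * 384):
    state = ema_update(state, turn_embedding)
assert len(state) == 384     # dimension is constant regardless of turn count
```

The point of the sketch is the shape of the operation, not its quality: no matter how many turns are folded in, the serialized context derived from the state costs the same number of input tokens.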

pip install semvec

from semvec import SemvecState, SemvecConfig
from semvec.token_reduction import SemvecStateSerializer

state = SemvecState(config=SemvecConfig(dimension=768))
# `conversation`: any iterable of (text, embedding) pairs
for text, embedding in conversation:
    state.update(embedding, text)

context = SemvecStateSerializer().serialize(state, query_text="what did we decide?")
# `context` is a 150–350-token block — paste it into any LLM system prompt.

Architectural differences vs. mem0, Letta, LangChain Memory

This table compares architectural properties, not measured performance. The benchmarks below were run head-to-head against mem0 only.

| Property | semvec | mem0 | Letta (MemGPT) | LangChain Memory |
|---|---|---|---|---|
| Per-turn input footprint | O(1) — fixed-size state | O(retrieved records) | O(in-context blocks) | depends on class (buffer ≈ O(n); summary ≈ bounded) |
| LLM calls during ingest | 0 (deterministic EMA) | LLM fact-extraction per turn | LLM-managed page-in/out | varies (none for buffer/vector; LLM for summary classes) |
| Recall procedure | Deterministic (vector + literal cache) | LLM-extracted facts | LLM-managed swap | Deterministic retrieval (when vector-based) |
| Numeric / exact-value safety | Verbatim cache with Decimal | Embedded → lossy | Embedded → lossy | Not addressed by the framework |
| Deployment options | Proprietary; self-hosted, air-gapped, or Versino-managed hosting | OSS, self-hosted | OSS, self-hosted | OSS, self-hosted |
| Patent protection | Applications pending (EP, US) | — | — | — |
| Multi-agent coordination | Built-in (Cortex) | Manual | Manual | Manual |

→ Deep-dive comparisons: vs mem0 · vs Letta · vs LangChain Memory
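The "verbatim cache with Decimal" row is about exact-value safety: a number that only survives inside an embedding is lossy, while a literal cache can keep it exact. A minimal illustration — the cache layout is invented for the example; only the `Decimal` behaviour is standard Python:

```python
from decimal import Decimal

# Float arithmetic is lossy — the classic failure mode for values that
# were parsed to float or only survive inside an embedding:
assert 0.1 + 0.2 != 0.3

# A verbatim cache keyed on the raw string keeps values exact:
literal_cache = {}
literal_cache["q3_revenue_eur"] = Decimal("1049.99")   # stored as written

assert literal_cache["q3_revenue_eur"] + Decimal("0.01") == Decimal("1050.00")
assert str(literal_cache["q3_revenue_eur"]) == "1049.99"  # round-trips verbatim
```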

Benchmarks (where we measured head-to-head)

  • LOCOMO 10-conv (1986 QAs, gpt-4o, T = 0.0) — semvec BM25-hybrid + cross-encoder rerank scores F1 0.495 including adversarial Cat 5 (vs 0.469 for the dense-only baseline, +2.6 pp). Strongest single-category lift: multi-hop +5.3 pp. Rank 2 of 8 on the published LOCOMO leaderboard, beating RAG @ k=5 (0.433), claude-3-sonnet 200K (0.428), gemini-1.0-pro 32K (0.391), and gpt-3.5-turbo 16K (0.359).
  • vs. mem0 on LOCOMO — under the mem0 LLM-as-Judge prompt (verbatim), semvec clears mem0 by ~12 pp at the aggregate level (mem0 paper excludes Cat 5; the LOCOMO paper does not). Wall-clock is 17× shorter (2.77 h vs. ~47 h on the same suite) — semvec issues zero LLM calls during ingest.
  • Token efficiency on the same LOCOMO config: ~93 % fewer input tokens per turn vs gpt-4-turbo Full-Context 128K (~1.5–2 k vs ~26 k).

Reproduce with pip install "semvec[benchmarks,hybrid,api,mem0]" and benchmarks/run_locomo.py. We have not benchmarked against Letta or LangChain Memory directly; the comparison pages above describe the architectural differences, not measured performance gaps.


What you get

| Capability | What it solves |
|---|---|
| Constant-size compressed context | Per-call LLM input cost stops growing with conversation length. ~93 % fewer input tokens per turn on LOCOMO vs gpt-4-turbo full-context. |
| Tiered memory with selective forgetting | Three tiers (short / medium / long term) with retention scoring — frequently-accessed older memories outlive never-touched newer ones. |
| Domain anchors + resonance triggers | Bias retrieval toward known domains or specific keywords without re-training. Lifts precision@3 from 86 % → 91.7 % on mixed-domain workloads. |
| Drop-in chat proxy | Wrap any OpenAI-compatible LLM and get compressed context for free. Works with vLLM, LiteLLM, OpenRouter, Ollama out of the box. |
| Multi-agent coordination (Cortex) | Run several agents that share an aggregated view, vote on proposals, and exchange checksummed state vectors. |
| Coding-agent compaction | Persistent memory across coding sessions — design decisions, invariants, error patterns, code-pointer index, anti-resonance checks. MCP server for Claude Code & Cursor included. |
| REST API server | semvec serve exposes the full surface over FastAPI: sessions, clusters, regions, observer, network, literal cache, Prometheus metrics. |
| Compliance pack | Append-only event store, deterministic replay, GDPR Art. 17 forget with signed certificates, HMAC request signing, RS256 user JWTs. |
| Bring-your-own embedder | Anything exposing get_embedding(text) → np.ndarray and get_dimension() → int works. SentenceTransformers, OpenAI, ONNX int8 — see the embedders guide. |
| One wheel, all platforms | Python 3.10–3.14 via stable ABI. Pre-built wheels for Linux glibc + Alpine musl (x86_64 + aarch64), macOS (x86_64 + arm64), Windows (x86_64). |
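The "selective forgetting" row boils down to evicting by retention score rather than by age. A hypothetical sketch — class and field names are illustrative, not the semvec API:

```python
class BoundedTier:
    """Fixed-capacity memory tier that evicts the lowest-scoring entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []                      # (retention_score, text) pairs

    def add(self, text, retention_score):
        self.items.append((retention_score, text))
        if len(self.items) > self.capacity:
            # Evict by score, not by insertion order: an old but
            # frequently-accessed memory survives a fresh, never-read one.
            self.items.remove(min(self.items))

tier = BoundedTier(capacity=2)
tier.add("old decision, read every session", 0.9)
tier.add("new note, never accessed", 0.1)
tier.add("recent error pattern", 0.5)
surviving = [text for _, text in tier.items]
assert "old decision, read every session" in surviving
assert "new note, never accessed" not in surviving
```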

What changed in 0.6.1 — cleanup release

Bugfix-only release on top of 0.6.0. No behaviour change for any documented API; the patch ships:

  • Dead slowapi scaffolding removed from semvec.api.routes and semvec.api.app. The limiter was instantiated but never bound to any endpoint — actual per-SemvecState rate-limiting lives in the Rust core (RateLimitError) and is unchanged. HTTP-edge DoS protection is expected at a reverse proxy in front of semvec serve; see licensing.
  • RateLimitError → HTTP 429 mapped through a dedicated FastAPI exception handler; previously the bucket exhaust leaked through as a generic 500.
  • Documentation overhaul (patent-safe) — the public docs site is rewritten end-to-end so that user-visible interface and behaviour are fully covered while engine internals remain inside the patent boundary. Quickstart, REST reference, and the Cortex / compliance pages are now in sync with the installed wheel: env-var names, type-stub signatures, and code examples have all been re-verified against pip install semvec==0.6.1.

New in 0.6.0 — sharpening release

Production-shaped knobs across the REST API stack. Defaults keep the /v1/run pipeline byte-identical to 0.5.6 — every feature below is strictly additive and opt-in.

| Surface | What's new |
|---|---|
| Sidecar embedder daemon | semvec serve --embedder-mode sidecar loads the model once and shares it across every worker over UDS or TCP. Eliminates the per-worker model load on --workers > 1. Python is the default daemon; an opt-in Rust binary trades throughput for cold-start latency and a ~10× smaller RSS. See the embedders guide. |
| BM25-hybrid retrieval | pip install "semvec[hybrid]" + SEMVEC_HYBRID_BM25=1 adds a per-session BM25 index fused with dense cosine via Reciprocal Rank Fusion before an optional cross-encoder rerank stage. +2.6 pp weighted F1 on LOCOMO 10-conv (0.469 → 0.495); strongest single-category lift: multi-hop +5.3 pp. |
| Cross-encoder rerank | SEMVEC_RERANK_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2 adds the reranker between RRF fusion and the final top-K. Tunables: SEMVEC_RERANK_FETCH_K, SEMVEC_RERANK_BATCH, SEMVEC_RERANK_FP16, SEMVEC_RERANK_THREADS. |
| Weighted RRF fusion | SEMVEC_RRF_WEIGHTS="1.0,0.4" biases the fusion (dense, BM25) — useful when BM25 hurts single-fact precision on your domain. |
| Tunable retrieval at /v1/run | Four env vars replace wheel-baked defaults: SEMVEC_RUN_TOP_K (default 5), SEMVEC_MMR_FETCH_K (default 0 = MMR off), SEMVEC_MMR_LAMBDA (0.5), SEMVEC_CONTEXT_BUDGET_CHARS (4000, sum-as-you-go, replaces the legacy per-memory 150-char cap). |
| Session lifecycle | SEMVEC_MAX_SESSIONS (10 000) + SEMVEC_SESSION_IDLE_TTL_S (1 800 s) + SEMVEC_SESSION_SWEEP_S (60 s) cap the in-memory session table. SIGTERM drains in-flight requests, closes the embedder cleanly, then empties the table — zero-error rolling restarts behind a load balancer. |
| Embedder LRU + in-flight dedup | SEMVEC_EMBEDDER_CACHE_SIZE=10000 wraps the active embedder. Cache hits skip the model; concurrent submits for the same text dedup onto one in-flight future. ~2.9× throughput on repeat-heavy chat traffic. |
| Async-native /v1/run | Parallel embed of query + last-response, ASGI-middleware bypass for license verify, LRU-cached Ed25519 verify (256 entries), CORS middleware skipped when no origins configured, threadpool 200. End-to-end: +63 % cumulative throughput, +772 % on the QA-only flow vs 0.5.6. |
| LOCOMO LLM-as-Judge --judge | benchmarks/run_locomo.py --judge re-scores predictions with the mem0 paper's judge prompt (verbatim). requests-backed OpenAI-compat adapter — no openai SDK dependency. |
| Performance hot path | Single-pass safe_cosine_similarity (+91 % turn-rate), norm() hoisted out of K-means inner loops (+57 %), retrieval-projection matrices as ndarray::Array2. Prometheus high-cardinality leak fixed. |
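Weighted Reciprocal Rank Fusion (the SEMVEC_RRF_WEIGHTS knob) follows the standard RRF formula: each retriever contributes `w / (k + rank)` per document. A self-contained sketch — `k = 60` is the common RRF constant from the literature; semvec's exact constant is not documented here:

```python
def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists: score(d) = sum_i weights[i] / (k + rank_i(d))."""
    scores = {}
    for ranking, w in zip(rankings, weights):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["mem-a", "mem-b", "mem-c"]   # dense-cosine ranking
bm25  = ["mem-b", "mem-c", "mem-a"]   # lexical BM25 ranking

# Down-weighting BM25 ("1.0,0.4") lets the dense ranking dominate:
assert weighted_rrf([dense, bm25], [1.0, 0.4])[0] == "mem-a"
# With equal weights the consensus document wins instead:
assert weighted_rrf([dense, bm25], [1.0, 1.0])[0] == "mem-b"
```

This is exactly the lever described for SEMVEC_RRF_WEIGHTS: when BM25 hurts single-fact precision on your domain, shrinking its weight pulls the fused order back toward the dense ranking.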

Strictly additive — every 0.5.x call site keeps working untouched. Removed: the LongBench, MT-Bench, longmemeval, scaling / load / k6 / cortex / consensus / coding-replay runners and their datasets (LOCOMO is the single publication-grade bench now). Full picture in the changelog.


Installation

# Core only
pip install semvec

# With multi-agent coordination
pip install "semvec[cortex]"

# With coding-agent compaction (FastMCP server, Claude Code hooks)
pip install "semvec[coding]"

# Compliance pack (event store, retention, DSGVO forget, HMAC, RS256)
pip install "semvec[compliance]"
# When you also want the FastAPI compliance routes + middleware:
pip install "semvec[api,compliance]"

# REST API server
pip install "semvec[api]"
semvec serve --host 0.0.0.0 --port 8080

# Benchmark harness dependencies (SentenceTransformers, datasets, psutil)
pip install "semvec[benchmarks]"

# BM25-hybrid retrieval (LOCOMO +2.6 pp F1)
pip install "semvec[hybrid]"

# Optional Mem0 head-to-head baseline for benchmarks
pip install "semvec[mem0]"

# Developer tooling (ruff, mypy, pre-commit, pytest, httpx)
pip install "semvec[dev]"

# mkdocs-material for the documentation site
pip install "semvec[docs]"

# Everything the developers use
pip install "semvec[cortex,coding,api,compliance,hybrid,benchmarks,dev,docs]"

| Extra | Pulls in | When you need it |
|---|---|---|
| [cortex] | — (marker only) | multi-agent coordination is always available; the extra marks intent for future pip resolvers |
| [coding] | fastmcp>=2.0 | MCP server + Claude Code hooks |
| [compliance] | cryptography>=42 | Event store, retention sweeper, deletion-certificate signer, HMAC + RS256 signing. FastAPI routes need [api] on top. See the Compliance guide. |
| [jwt] | pyjwt>=2.9 | Stand-alone licence-JWT decoding without the full [api] extra — handy for build pipelines or short scripts that only need to inspect a token. |
| [api] | fastapi, uvicorn[standard], sqlalchemy, prometheus-client, pydantic | REST API server (semvec serve) |
| [benchmarks] | sentence-transformers | Running the LOCOMO bench runners under benchmarks/ |
| [hybrid] | bm25s>=0.2, nltk>=3.8 | BM25-hybrid retrieval — required to reproduce the LOCOMO +2.6 pp lift |
| [mem0] | mem0ai>=0.1, faiss-cpu>=1.7 | Head-to-head mem0 comparison |
| [dev] | ruff, mypy, pre-commit, pytest, httpx | Contributing — includes the FastAPI TestClient transport |
| [docs] | mkdocs>=1.6, mkdocs-material>=9.5, pymdown-extensions | Building the documentation site (mkdocs serve) |

Embedder requirement

Semvec is embedder-agnostic and refuses silent hash-based fallbacks — you bring your own. Any object exposing get_embedding(text) → np.ndarray and get_dimension() → int works.

pip install sentence-transformers

Choose the embedder dimension carefully — Semvec's retrieval quality is bounded by what the embedder can separate. Measured on 80 mixed-domain notes:

| Embedder | Dimension | precision@3 | Usable for |
|---|---|---|---|
| all-MiniLM-L6-v2 | 384 | 66.67 % | English-only, tight-domain prototypes only |
| paraphrase-multilingual-mpnet-base-v2 | 768 | 86.11 % | German / multilingual mixed-domain (recommended) |

The 384-dim MiniLM is the easy default but on multilingual or domain-mixed text it confuses generic terms (e.g. "filter" → coffee filter vs. data filter). For German content, mixed-domain corpora, or anything where you need ≥ 80 % precision@3, use multilingual mpnet 768 d minimum.

from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer(
    "sentence-transformers/paraphrase-multilingual-mpnet-base-v2"
)
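The contract is duck-typed: any object with `get_embedding` and `get_dimension` satisfies it. A sketch of the protocol shape with a toy stand-in so the example runs without downloading a model — the class is ours, not part of semvec, and its hash-bucket "embedding" is deliberately useless for real retrieval (semvec refuses hash fallbacks for exactly that reason; real deployments wrap SentenceTransformer.encode() and return np.ndarray):

```python
class ToyEmbedder:
    """Satisfies the semvec embedder protocol: get_embedding + get_dimension.

    Illustration only — the hash-bucket vector below is NOT a semantic
    embedding; it merely has the right shape. Production code would wrap
    a real model and return np.ndarray.
    """

    def __init__(self, dimension=768):
        self._dim = dimension

    def get_embedding(self, text):
        vec = [0.0] * self._dim
        for token in text.lower().split():
            vec[hash(token) % self._dim] += 1.0
        return vec

    def get_dimension(self):
        return self._dim

embedder = ToyEmbedder(dimension=768)
assert embedder.get_dimension() == 768
assert len(embedder.get_embedding("OData filter syntax")) == 768
```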

Choose your use case

| You want to… | Jump to |
|---|---|
| Compress conversation history for any LLM | Token-reduced LLM context |
| Drop-in replacement for openai.chat.completions | Drop-in chat proxy |
| Coordinate many agents (analyst + planner + critic …) | Multi-agent coordination |
| Give Claude Code / Cursor persistent memory across sessions | Coding-agent compaction |
| Run as a service, talk to it over HTTP | REST API server |
| Process regulated data (GDPR, audit, retention) | Compliance pack |

Token-reduced LLM context

The single most-used path: produce a compact system-prompt block from any conversation, regardless of length.

import openai

from semvec import SemvecState, SemvecConfig
from semvec.token_reduction import SemvecStateSerializer

state = SemvecState(config=SemvecConfig(dimension=768))

for text, embedding in conversation:
    state.update(embedding, text)

serializer = SemvecStateSerializer()
context = serializer.serialize(state, query_text="what did we decide about auth?")

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": context},
        {"role": "user",   "content": "what did we decide about auth?"},
    ],
)

Compared to raw history concatenation, the compressed context does not grow with conversation length — input cost converges to a constant. The serializer fits prior context into a 150–350-token block sized for a system prompt.

The truncation budget is caller-controlled. Pass SerializerConfig(max_memory_chars=N) for any positive N (e.g. 10_000 to effectively disable per-memory truncation), and set full_first=True to keep the highest-ranked retrieved memory verbatim while the rest stay short:

from semvec.token_reduction import SerializerConfig

cfg = SerializerConfig(top_k=5, max_memory_chars=200, full_first=True)
context = SemvecStateSerializer(cfg).serialize(state, query_text="...")
# Entry 1: full text. Entries 2..5: capped at 200 chars.

The same pattern is exposed on the REST API as ?max_text_chars=N&full_first=true on GET /v1/state/context.
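The truncation policy above is simple to state precisely. A standalone sketch — the function name is ours, and the real serializer also ranks and formats entries:

```python
def truncate_memories(memories, max_memory_chars, full_first=False):
    """Apply the SerializerConfig truncation policy to ranked memories
    (best-first): optionally keep entry 1 verbatim, hard-cap the rest."""
    out = []
    for i, text in enumerate(memories):
        if full_first and i == 0:
            out.append(text)                     # highest-ranked: verbatim
        else:
            out.append(text[:max_memory_chars])  # the rest: char cap
    return out

ranked = ["x" * 300, "y" * 300, "z" * 120]
shaped = truncate_memories(ranked, max_memory_chars=200, full_first=True)
assert [len(m) for m in shaped] == [300, 200, 120]  # entry 1 full, rest capped
```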

Lift retrieval quality with anchors and triggers

The passive ingest above gives you retrieval that already beats sliding-window concatenation. To bias retrieval toward known domains or specific cues, register anchors and resonance triggers:

from semvec import SemvecState, SemvecConfig

state = SemvecState(config=SemvecConfig(
    dimension=768,
    enable_topic_switch=True,
    auto_anchor_on_topic_switch=True,   # opt-in (default off)
))

# Anchors — bias retrieval toward your known domains.
for prototype in [
    "SAP Business One Service Layer OData REST API",
    "Python MCP Model Context Protocol Server",
    "italienische Kueche Kochen Pasta Pizza",
    "Kaffee Espresso Roesterei Brewing",
]:
    state.add_anchor(embed(prototype))

# Triggers — boost memories on a keyword OR vector match.
state.create_resonance_trigger(
    keyword="security review",
    embedding=embed("security audit threat model"),
    threshold=0.7,
)

for text, vec in conversation:
    state.update(vec, text)

# Retrieval is now anchor-biased: candidates aligned with one of
# your domain anchors win the tie-break against generic phrases.
top = state.memory.get_relevant_memories(embed("OData filter syntax"), top_k=3)

What each piece adds (measured on mpnet 768 d, 80 mixed German notes):

| Variant | precision@3 |
|---|---|
| passive update() only | 86.11 % |
| + 4 domain anchors | 91.67 % (+5.56 pp) |
| + 4 resonance triggers | 86.11 % |
| anchors + triggers | 91.67 % |

Without anchors, the retrieval boost is a no-op — flipping these features on costs nothing if you do not need them. Anchors and triggers compete for the same boost slot (max(...), not addition), so redundant signals do not double-count.

Tuning rule of thumb: keep anchor_retrieval_boost ≥ trigger_retrieval_boost, both in the [0.1, 0.6] range. Pushing either past 0.7 mostly stops moving the needle — spend your budget on better anchor prototypes or sharper trigger thresholds rather than dialling the boosts higher.
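The max-not-sum rule is what keeps redundant signals from double-counting, so it is worth pinning down. A sketch — the scoring function is illustrative; semvec's real ranking has more terms:

```python
def boosted_score(base_similarity, anchor_boost=0.0, trigger_boost=0.0):
    """Anchors and triggers compete for one boost slot: max(), not +."""
    return base_similarity + max(anchor_boost, trigger_boost)

# A memory matching both an anchor (0.4) and a trigger (0.3) gets +0.4,
# not +0.7 — the weaker, redundant signal is absorbed by the max():
assert abs(boosted_score(0.80, anchor_boost=0.4, trigger_boost=0.3) - 1.20) < 1e-9
# With neither signal, the score is just the cosine similarity:
assert boosted_score(0.80) == 0.80
```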


Drop-in chat proxy

SemvecChatProxy wraps any callable LLM behind compressed context and tracks both compressed and full-history token counts per turn:

from semvec.token_reduction import SemvecChatProxy, create_llm_client

llm = create_llm_client("openai")  # reads OPENAI_BASE_URL/MODEL/API_KEY from env
proxy = SemvecChatProxy(
    llm_call=llm,
    system_prompt="You are a helpful assistant.",
    embedding_service=my_embedder,
)

for question in ["summarise Q3", "compare with Q2", "biggest miss?"]:
    result = proxy.chat(question)
    print(f"turn {result.turn_number}: {result.response}")
    print(f"  compressed tokens: {result.tokens.compressed}")
    print(f"  full-history tokens: {result.tokens.full_history}")

print(proxy.get_summary())

Built-in clients: OpenAIClient (works with the OpenAI API and any compatible endpoint such as vLLM, LiteLLM, OpenRouter), OllamaClient. You can pass any callable (list[ChatMessage]) -> str.

Break-even is around ten turns. The compressed prompt carries a constant ~110-token header. For very short conversations (≤ 5 turns) plain history concatenation is cheaper; from ~10 turns onward the proxy undercuts naive concatenation, and the gap widens linearly with conversation length. Measured on the LOCOMO 10-conv suite: ~93 % fewer input tokens per turn vs a stateless full-history baseline.
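The ~10-turn break-even follows from simple arithmetic: the compressed prompt is roughly constant while concatenated history grows linearly. A back-of-envelope model with assumed numbers — ~360 compressed tokens (the ~110-token header plus a mid-range context block), and 40 tokens per history message is our assumption, not a measured figure:

```python
def break_even_turn(tokens_per_message=40, compressed_tokens=360):
    """First turn at which full-history concatenation costs more input
    tokens than the constant-size compressed prompt."""
    n = 1
    while tokens_per_message * n <= compressed_tokens:
        n += 1
    return n

assert break_even_turn() == 10   # 10 × 40 = 400 > 360 — crossover at turn 10
# Shorter messages push the crossover out; the compressed side never moves,
# so past the crossover the gap widens linearly with conversation length.
assert break_even_turn(tokens_per_message=20) == 19
```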


Multi-agent coordination

Run several agents (analyst, planner, critic, …) that share an aggregated view, vote on proposals, and exchange checksummed state vectors.

from semvec.cortex import SemvecAgentNetwork, AttentionAggregation

network = SemvecAgentNetwork(
    aggregation_strategy=AttentionAggregation(dimension=768),
    dimension=768,
)
network.add_local_instance("analyst")
network.add_local_instance("planner")

network.process_input("analyst", "quarterly revenue is up 23%")
network.process_input("planner", "we should redirect Q4 spend to retention")

state = network.get_network_state()
print(f"active agents: {state['active_instances']}/{state['total_instances']}")

# Pull per-agent feedback for the next turn (consensus-aware)
feedback = network.get_feedback_for_agent("analyst")

Aggregation strategies: WeightedAverageAggregation, AttentionAggregation. ConsensusEngine adds proposal voting with five levels (SIMPLE_MAJORITY, QUALIFIED_MAJORITY, UNANIMOUS, WEIGHTED_VOTE, ADAPTIVE_THRESHOLD); quorum is measured against the registered voter pool, not just votes-cast-so-far. StateVectorPacket round-trips bit-exactly via serialize()/deserialize() and verify_integrity() confirms byte equality.
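The quorum rule deserves emphasis: it is measured against the registered voter pool, so abstentions count against a proposal. A minimal sketch of that semantics — the level names follow the ConsensusEngine levels listed above, but the implementation is illustrative:

```python
def has_quorum(yes_votes, registered_voters, level="SIMPLE_MAJORITY"):
    """Quorum against the registered pool — not just votes cast so far."""
    pool = len(registered_voters)
    needed = {
        "SIMPLE_MAJORITY": pool // 2 + 1,
        "QUALIFIED_MAJORITY": -(-2 * pool // 3),   # ceil(2/3 · pool)
        "UNANIMOUS": pool,
    }[level]
    return len(yes_votes) >= needed

agents = ["analyst", "planner", "critic", "reviewer", "executor"]
# 2 yes out of 2 votes cast looks unanimous so far, but 2 < 3 of 5 registered:
assert not has_quorum(["analyst", "planner"], agents)
assert has_quorum(["analyst", "planner", "critic"], agents)
```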

See the Cortex API reference for the full surface, the Cortex overview for the in-process / service / REST decision tree, and Cortex over REST API for the cluster / region / observer / network endpoints with curl + httpx examples.


Coding-agent compaction

Persistent memory across coding sessions for Claude Code, Cursor, Aider — code pointers, anti-resonance error patterns, structured handoff context.

→ Full integration guides: Claude Code (MCP + automatic SessionStart / PreCompact hooks) · Cursor (MCP + project rule). The high-level Coding overview lays out the three usage paths (MCP, in-process API, REST API) and when to pick which.

from semvec.coding import CodingEngine

engine = CodingEngine(state_dir="~/.semvec/project-x", embedder=my_embedder)
engine.ingest_transcript("path/to/claude_code_session.jsonl")

context = engine.get_compacted_context(
    "implement password reset flow",
    invariants=["never log plaintext passwords"],
)

Multi-session memory via LiteralCache

Below the high-level CodingEngine, state.literal_cache is a structured memory of design decisions, error patterns, invariants, and per-checkpoint test results — anything you want to survive across sessions verbatim:

import semvec

state = semvec.SemvecState(semvec.SemvecConfig(dimension=768))
cache = state.literal_cache

cache.record_decision("Use mpnet 768d for German content", checkpoint=1)
cache.record_error_pattern(
    pattern="catastrophic recency bias on blocked-domain ingest",
    example="500-note 4-domain blocked sequence",
    fix="raise long_term_size and use tier weights 1.0/0.95/0.9",
    checkpoint=1,
)
cache.add_invariant("State must round-trip via to_dict/from_dict")
cache.record_test_results(
    checkpoint=1,
    passed_tests=["test_a", "test_b", "test_c"],
    failed_tests=[],
)

# Build the LLM hand-off context for the next session
ctx = cache.build_handoff_context(next_checkpoint=2)
# ### INVARIANTS — Do NOT break these:
# - State must round-trip via to_dict/from_dict
#
# ### Test Status (CP1: 100%, 3/3)
#
# ### Known Error Patterns
# - `catastrophic recency bias on blocked-domain ingest` (x1): raise long_term_size...
#
# ### Design Decisions
# - [CP1] Use mpnet 768d for German content

# Persist + restore — round-trip preserves decisions, error_patterns,
# invariants, test_history, code_structures.
blob = state.to_bytes()
restored = semvec.SemvecState.from_bytes(blob)
assert restored.literal_cache.build_handoff_context(2) == ctx

build_handoff_context() produces a Markdown block ready for the system prompt of the next session. See the Coding API reference for the full surface.

Claude Code integration (MCP + hooks)

Wire it directly into Claude Code via the bundled FastMCP server and two lifecycle hooks. The settings below give you the bare minimum; for the full walk-through (what each hook does, CLAUDE.md setup, troubleshooting, end-to-end example session) see the Claude Code guide.

Add to .claude/settings.json:

{
  "mcpServers": {
    "semvec": {
      "command": "python",
      "args": ["-m", "semvec.coding.mcp_server"],
      "env": {
        "SEMVEC_STATE_DIR": ".semvec",
        "SEMVEC_EMBED_MODEL": "all-MiniLM-L6-v2"
      }
    }
  },
  "hooks": {
    "PreCompact":  [{"command": "python -m semvec.coding.hooks.pre_compact",  "timeout": 30000}],
    "SessionStart":[{"command": "python -m semvec.coding.hooks.session_start", "timeout": 10000}]
  }
}

The MCP server exposes six tools — pss_get_context, pss_update, pss_check_anti_resonance, pss_register_code, pss_record_error, pss_save. FastMCP is installed automatically via the [coding] extra.

The same FastMCP server plugs into Cursor via .cursor/mcp.json plus a Cursor Rule that replaces Claude Code's lifecycle hooks. Full step-by-step in the Cursor guide.

For multi-tenant server-side use (literal-cache endpoints over HTTP, JWT-gated, Postgres-backed metadata), see the REST API reference.


REST API server

pip install "semvec[api]"

# Dev mode — anonymous community-tier auth, in-memory SQLite
SEMVEC_ALLOW_ANONYMOUS=1 semvec serve --host 0.0.0.0 --port 8080

# Production — license JWT required, Postgres-backed metadata
export SEMVEC_LICENSE_KEY="eyJhbGciOiJFZERTQSI..."
export DATABASE_URL="postgresql://user:pw@host/semvec"
semvec serve --host 0.0.0.0 --port 8080

Talk HTTP:

# Health check (no auth)
curl http://localhost:8080/v1/health

# Single turn
curl -X POST http://localhost:8080/v1/run \
  -H "Authorization: Bearer $SEMVEC_LICENSE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "demo", "query": "what was the Q3 miss?"}'

# Retrieve compressed context
curl "http://localhost:8080/v1/state/context?session_id=demo&top_k=5" \
  -H "Authorization: Bearer $SEMVEC_LICENSE_KEY"

Endpoint groups: sessions (CRUD + run/store/context), session-control (resonance triggers, anchors, isolation, export/import/verify), clusters, regions (consensus-driven realignment), global observer (anomaly detection across regions), network (state transfer, user partitioning, trust-based consensus), literal cache, Prometheus /metrics.

Auth is via Authorization: Bearer <jwt> or X-API-Key: <jwt> — same Ed25519-signed JWT as the in-process licensing system.

See the REST API reference for every endpoint and the CLI reference for semvec serve flags.


Persistence

state.to_dict() is a JSON-safe checkpoint with embedded SHA-256 checksum — best when the snapshot has to round-trip through systems that only speak JSON.

state.to_bytes(compress=True) is the compact binary equivalent (gzip-compressed JSON, magic header, SHA-256 corruption check) — best for cold-storage checkpoints. state.to_bytes(compress=False) is the speed-optimised variant: same byte footprint as JSON, but kept as a self-describing binary blob with corruption check — best for hot-path persistence. Both paths preserve the full state on round-trip:

  • the semantic state and its rolling histories
  • all three memory tiers
  • domain anchors and topic-switch history
  • the complete LiteralCache: entities, decisions, error patterns, invariants, test history, code structures

Restore with SemvecState.from_bytes(blob); the version byte distinguishes the two to_bytes modes automatically.
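The described container (magic header, version byte distinguishing the two modes, SHA-256 check, optional gzip) is easy to picture. A sketch of an equivalent self-describing blob — the magic bytes and byte layout here are invented for the example, not semvec's actual wire format:

```python
import gzip
import hashlib
import json

MAGIC = b"DEMO"  # invented for this sketch; NOT semvec's real header

def to_bytes(state: dict, compress: bool = True) -> bytes:
    payload = json.dumps(state, sort_keys=True).encode()
    digest = hashlib.sha256(payload).hexdigest().encode()   # 64 hex bytes
    body = gzip.compress(payload) if compress else payload
    version = b"\x01" if compress else b"\x00"              # mode marker
    return MAGIC + version + digest + body

def from_bytes(blob: bytes) -> dict:
    assert blob[:4] == MAGIC, "not a checkpoint blob"
    compressed = blob[4:5] == b"\x01"     # version byte picks the mode
    digest, body = blob[5:69], blob[69:]
    payload = gzip.decompress(body) if compressed else body
    if hashlib.sha256(payload).hexdigest().encode() != digest:
        raise ValueError("checkpoint corrupted: SHA-256 mismatch")
    return json.loads(payload)

state = {"anchors": [0.1, 0.2], "decisions": ["use mpnet 768d"]}
for mode in (True, False):
    assert from_bytes(to_bytes(state, compress=mode)) == state
```

The design choice mirrors the trade-off in the sizing table below: the compressed mode pays gzip once for a smaller cold-storage footprint, while the uncompressed mode keeps the hot path close to plain `json.dumps` speed.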

Practical sizing on mpnet 768 d:

| Memories | JSON | to_bytes(compress=True) | to_bytes(compress=False) |
|---|---|---|---|
| 110 (small) | 18 ms / 8.8 kB per memory | 157 ms / 3.7 kB per memory | 36 ms / 8.8 kB per memory |
| 1 000 (extrapolated) | ~0.2 s / 9 MB | ~1.4 s / 3.7 MB | ~0.3 s / 9 MB |
| 100 000 | ~17 s / 1.7 GB | ~2.5 min / 400 MB | ~30 s / 1.7 GB |

Pick the variant by use case:

  • Cold-storage checkpoint (occasional, durability matters) → compress=True. ~ 2.4× smaller than JSON; pay the gzip cost once.
  • Hot-path persistence (every-turn or per-request) → compress=False. Same size as JSON, only ~ 1.9× slower than json.dumps, but kept as a self-describing binary blob with corruption check.

For very large footprints (> 100 k memories) wrap your own NPZ/Parquet around the embedding payload to save another factor.


Configuration & environment variables

| Variable | Default | Used by |
|---|---|---|
| SEMVEC_LICENSE_KEY | — | Pro/Enterprise gates; REST API auth |
| SEMVEC_ALLOW_ANONYMOUS | unset | REST API: bypass auth (dev only) |
| SEMVEC_STATE_DIR | .semvec | CodingEngine state persistence |
| SEMVEC_EMBED_MODEL | all-MiniLM-L6-v2 | MCP server / hooks default embedder (consider overriding to paraphrase-multilingual-mpnet-base-v2 for German/multilingual) |
| SEMVEC_EMBED_DEVICE | cpu | MCP server / hooks: cpu or cuda |
| DATABASE_URL | sqlite:///semvec.db | REST API persistence (also accepts postgresql://…) |
| METRICS_USER / METRICS_PASSWORD | — | Basic Auth on Prometheus /metrics |
| OPENAI_BASE_URL, OPENAI_API_KEY, OPENAI_MODEL | — | OpenAIClient |
| OLLAMA_BASE_URL, OLLAMA_MODEL | http://localhost:11434, — | OllamaClient |

Error handling

import time
from semvec import RateLimitError, LicenseExpiredError, ConfigurationError

try:
    result = state.update(embedding, text)
except RateLimitError as e:
    # e.retry_after is a datetime.timedelta; e.upgrade_url is set
    time.sleep(e.retry_after.total_seconds())
    result = state.update(embedding, text)
except LicenseExpiredError as e:
    # Hard fail — re-import won't help. Renew at e.upgrade_url.
    logger.error("semvec license expired — renew at %s", e.upgrade_url)
    raise
except ConfigurationError as e:
    # Wrong dimension, missing embedder, malformed config, etc.
    raise

All Semvec exceptions inherit from SemvecError. License-related exceptions (RateLimitError, LicenseExpiredError) inherit from LicenseError, which in turn inherits from SemvecError.


Licensing

Three tiers; Community works without a key, Pro and Enterprise require a signed Ed25519 JWT:

| Tier | Rate limit | Retrieval modes |
|---|---|---|
| Community (no key) | 5 QPS sustained / 50 burst | Base retrieval |
| Pro | 200 / 2000 QPS | Extended |
| Enterprise | Unthrottled | All |

JWTs have a 30-day TTL. Expiry is a hard fail — the next gated call raises LicenseExpiredError with the renewal URL in the message. Rate-limit exhaustion raises RateLimitError whose message names the tier and the retry-after delay.

The limiter is a token bucket per SemvecState. Both update() and the three calculate_* methods draw from the same bucket — the QPS budget is the combined operations-per-second on that state. Burst capacity gives every legitimate dev workload (conversational chat, MCP servers, smoke-tests, small pytest suites) plenty of headroom; sustained heavy load above the Community 5 QPS belongs in Pro. For background on the bucket plus the secondary probe-defence sliding window, see the Licensing guide.
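The bucket semantics (5 QPS sustained, 50 burst for Community) can be sketched in a few lines — a model of the documented behaviour, not the Rust implementation:

```python
class TokenBucket:
    """Sustained `rate` ops/s with `burst` headroom. In semvec, update()
    and the calculate_* methods would all draw from this one bucket."""

    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens = burst          # start full: burst headroom up front
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                 # the caller would raise RateLimitError

bucket = TokenBucket(rate=5, burst=50)            # Community-tier numbers
assert all(bucket.allow(0.0) for _ in range(50))  # burst absorbed at once
assert not bucket.allow(0.0)                      # 51st call: throttled
assert bucket.allow(0.2)                          # 0.2 s later: 1 token refilled
```

Passing an explicit clock makes the behaviour easy to reason about: a smoke-test that fires 50 calls instantly sails through on burst capacity, while sustained load settles at 5 ops/s.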

export SEMVEC_LICENSE_KEY="eyJhbGciOiJFZERTQSI..."

Limitations & non-goals

Honest list of what Semvec does not do:

  • Not a vector database. Long-term memory is bounded; if you need recall over a million documents, run a dedicated vector store and treat Semvec as a conversational compressor on top.
  • Not a drop-in for stateless completion. The whole point is persistent state; if you only do single-shot prompts, you do not need Semvec.
  • No silent embedder fallback. If you do not pass an embedder, methods that need one raise a descriptive RuntimeError. Intentional — silent hash fallbacks gave surprising failure modes in earlier iterations.
  • License gate is a licensing feature, not a hard security boundary. Use it to enforce subscription tiers, not to keep determined adversaries out.
  • No mobile / WASM build today. abi3-py310 Linux/macOS/Windows only.
  • REST API persistence is metadata-only. Hot semantic state lives in-memory per process; only session/cluster/member/region/audit metadata is persisted. Plan accordingly for restarts.

FAQ

Is this RAG? Not in the usual sense. RAG retrieves documents at query time. Semvec compresses the conversation itself into a fixed-size state. They compose well — many users run Semvec for conversational signal + a vector DB for document retrieval.

Does the state ever grow? No, the state vector itself is fixed-size. The associated memory tiers are bounded by configured capacities — when full, the lowest-scoring entry is evicted (not the oldest).

Can I run it offline / air-gapped? Yes for Community tier. Pro/Enterprise tiers verify Ed25519 JWT signatures locally — no network call to a license server at runtime. Contact vertrieb@versino.de for offline-issued JWTs with custom TTLs.

How fast is it? Per-turn update() is sub-millisecond on a recent x86_64 CPU at dimension 384, dominated by NumPy/Rust matrix ops, not Python overhead. The whole point of the Rust port was to keep the math out of the GIL.

Is the source available? Compiled wheels are public on PyPI; the Rust source is held closed. Source access for Enterprise terms — contact vertrieb@versino.de.

GPU support? Embedders run on whatever device you configure (cuda, mps, cpu); the Semvec core itself is CPU-only — the math is small enough that GPU offload would lose more in transfer than it gains.


Telemetry

None. Semvec does not phone home. There is no init ping, no per-call event, no usage tracking, no machine pseudonym, no diversity sketch. License-JWT verification, inference, state updates, and retrieval all run locally — the package only contacts the network when you explicitly call something that does (the optional REST API server, the OpenAI / Ollama clients, your own embedder).

If you need install counts, PyPI download statistics (pypistats overall semvec) give you that without any client-side telemetry.

Earlier 0.x releases (≤ 0.5.2) shipped an opt-out anonymous init ping and a HyperLogLog "diversity sketch" intended to detect surrogate-cloning attempts. Both were removed in 0.5.3 — the trade-off was wrong, the GDPR Art. 6(1)(f) "patent-enforcement" basis was untenable, and the architecture matched the pattern of commercial spyware regardless of intent. If you're still on ≤ 0.5.2, upgrading to 0.5.3 removes the ping; you can also delete ~/.semvec/telemetry-salt (it is no longer used).


Support

  • Documentation: https://semvec-docs.pages.dev
  • Pricing & licensing: https://www.semvec.io
  • Pro / Enterprise support: support@versino.de (priority response)
  • Security disclosures: security@versino.de — please do not open public issues for vulnerabilities; coordinated disclosure with 48 h acknowledgement, fix-or-mitigation in 30 days for high-severity issues

License & patents

Proprietary — all rights reserved. Commercial use requires a Pro or Enterprise license. The full license text ships inside the wheel as LICENSE; for procurement, see https://www.semvec.io.

Patent applications pending for the Semvec engine algorithms and functional mechanisms:

  • EP 25 188 105 — European Patent Office (filed 2025; pending)
  • EP 26 160 795 — European Patent Office (pending)
  • US 19/269,195 — United States Patent and Trademark Office (pending)

Until grant, references to "patent-protected" features describe claims of pending applications, not enforceable exclusive rights.

Copyright © 2026 Michael Neuberger · Versino PsiOmega GmbH.



