Content-addressed block store with chunking and deduplication

Project description

Content-Addressed Block Store

A content-addressed storage system for Python. It splits large files into content-defined chunks and deduplicates at the chunk level, so a chunk that appears in several files is stored only once.

Features

  • Content-Defined Chunking: An approximation of Rabin fingerprinting places chunk boundaries based on the data itself rather than at fixed offsets
  • Automatic Deduplication: Chunk-level deduplication with reference counting
  • Optional Compression: zlib compression for space efficiency
  • SHA-256 Content Addressing: Immutable, verifiable content identifiers
  • Garbage Collection: Reference-counted cleanup of unreferenced chunks
  • SQLite Metadata: Robust metadata storage with proper indexing
  • Thread-Safe: Reentrant lock protection for concurrent access
  • Integrity Verification: Hash-based blob verification
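
As an illustration of the content-defined chunking feature, here is a minimal sketch of the general technique. It is not the package's code: a Rabin-Karp style polynomial rolling hash stands in for the package's fingerprint, and the window size, mask, and size limits are assumed values chosen for the example. It shows that inserting bytes at the front of a file leaves most chunks, and therefore most chunk hashes, unchanged.

```python
import hashlib
import random

# All constants below are assumptions for this sketch, not the package's settings.
WINDOW = 48                       # bytes covered by the rolling hash
MASK = (1 << 11) - 1              # cut when the low 11 bits are zero (~2 KiB average)
MIN_SIZE, MAX_SIZE = 512, 8192    # hard limits on chunk length
BASE, MOD = 257, (1 << 61) - 1
POW = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window


def chunk(data: bytes):
    """Yield chunks whose boundaries depend on content, not on offsets."""
    start, h = 0, 0
    for i, byte in enumerate(data):
        length = i - start + 1
        if length > WINDOW:
            # Slide the window: remove the oldest byte's contribution.
            h = (h - data[i - WINDOW] * POW) % MOD
        h = (h * BASE + byte) % MOD
        if (length >= MIN_SIZE and (h & MASK) == 0) or length >= MAX_SIZE:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]        # trailing partial chunk


def digests(data: bytes) -> set:
    """SHA-256 hex digests of every chunk in data."""
    return {hashlib.sha256(c).hexdigest() for c in chunk(data)}


original = random.Random(0).randbytes(200_000)   # needs Python 3.9+
edited = b"PREFIX!" + original                   # shift every byte by 7
shared = digests(original) & digests(edited)
print(len(digests(original)), "chunks,", len(shared), "unchanged after the edit")
```

With fixed-size chunks the same 7-byte insertion would shift every boundary and change every chunk hash, which is why deduplicating stores tend to prefer content-defined boundaries.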

Installation

pip install content-addressed-store

Usage

from content_addressed_store import ContentAddressedStore

store = ContentAddressedStore("./my_store", enable_compression=True)

# Store data (returns content hash)
blob_hash = store.put(b"Hello, World!" * 1000, metadata={"type": "text"})

# Retrieve data
data = store.get(blob_hash)

# Verify integrity
assert store.verify(blob_hash)

# Get statistics
print(store.stats())
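
To make the put/get/verify calls above more concrete, the following toy in-memory model sketches how chunk-level deduplication, reference counting, and garbage collection can fit together. It is a sketch under stated assumptions and not the package's implementation: `ToyStore`, `delete`, and `collect_garbage` are hypothetical names, chunks are fixed-size for brevity, and dictionaries stand in for the SQLite metadata.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # fixed-size chunks keep the sketch short (assumption)


class ToyStore:
    """Hypothetical in-memory model of a deduplicating content-addressed store."""

    def __init__(self):
        self.chunks = {}  # chunk hash -> zlib-compressed chunk bytes
        self.refs = {}    # chunk hash -> number of references from blobs
        self.blobs = {}   # blob hash -> ordered list of chunk hashes

    def put(self, data: bytes) -> str:
        blob_hash = hashlib.sha256(data).hexdigest()
        if blob_hash in self.blobs:
            return blob_hash              # identical blob already stored
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            piece = data[i:i + CHUNK_SIZE]
            h = hashlib.sha256(piece).hexdigest()
            if h not in self.chunks:      # store each distinct chunk once
                self.chunks[h] = zlib.compress(piece)
            self.refs[h] = self.refs.get(h, 0) + 1
            recipe.append(h)
        self.blobs[blob_hash] = recipe
        return blob_hash

    def get(self, blob_hash: str) -> bytes:
        recipe = self.blobs[blob_hash]
        return b"".join(zlib.decompress(self.chunks[h]) for h in recipe)

    def verify(self, blob_hash: str) -> bool:
        # Re-hash the reassembled data and compare with its identifier.
        return hashlib.sha256(self.get(blob_hash)).hexdigest() == blob_hash

    def delete(self, blob_hash: str) -> None:
        for h in self.blobs.pop(blob_hash):
            self.refs[h] -= 1             # chunk data stays until collected

    def collect_garbage(self) -> int:
        dead = [h for h, n in self.refs.items() if n == 0]
        for h in dead:
            del self.chunks[h], self.refs[h]
        return len(dead)


store = ToyStore()
first = store.put(b"A" * 8192 + b"B" * 4096)
second = store.put(b"A" * 4096 + b"C" * 4096)
print(len(store.chunks))         # distinct chunks held for both blobs
store.delete(first)
print(store.collect_garbage())   # chunks no remaining blob refers to
```

Because the "A" chunk is shared by both blobs, deleting the first blob only lowers its reference count. Garbage collection then removes just the chunk that no remaining blob uses.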

License

MIT

Download files

Source Distribution

content_addressed_store-1.0.0.tar.gz (6.0 kB view details)

Uploaded Source

Built Distribution

content_addressed_store-1.0.0-py3-none-any.whl (6.3 kB view details)

Uploaded Python 3

File details

Details for the file content_addressed_store-1.0.0.tar.gz.

File metadata

  • Download URL: content_addressed_store-1.0.0.tar.gz
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for content_addressed_store-1.0.0.tar.gz
Algorithm Hash digest
SHA256 84f04124c48d472a915ee8dc3a8099b30d55332aa34d84c96acfdea3bfbddf85
MD5 a759c9fca2a5905145120e2b204c690f
BLAKE2b-256 ff831fd1097e4933cd3ca686f5a782bf735e640cc80993e871eee4723fd3ca12

File details

Details for the file content_addressed_store-1.0.0-py3-none-any.whl.

File hashes

Hashes for content_addressed_store-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6eb74e681171f56952aeb8535e8efc399b5a8dd104754a31be5d2a7334dd579e
MD5 7a580379750c9b3fc84109e96f1c9f81
BLAKE2b-256 00ec1509136de4b3b6e6a263d692df868b5556efa57d070587c5e1236110b19c
