# ∴ Sparse

**Delta Compression for Fine-tuned Models and Datasets**
Compress your 14GB fine-tune to 1.4GB (lossless) or 50MB (LoRA-equivalent). Reconstruct in 4 seconds.
Verified: GPT-2 compression → reconstruction → identical inference output ✅
[Quick Start](#quick-start) • [How It Works](#how-it-works) • [CLI](#cli-reference) • [Python API](#python-api)
## What Sparse Does
Sparse compresses fine-tuned models and datasets as deltas from their base versions.
| Compression Mode | Size (7B model) | Quality | Use Case |
|---|---|---|---|
| Lossless | ~1.4 GB | 100% | Production, quality-critical |
| Lossy (SVD) | ~50 MB | ~95-99% | Sharing, size-critical |
| Dataset Delta | 60-80% savings | 100% | Derivative datasets |
**Key benefit:** Works on models you've already trained; no LoRA setup is required during training.

**Works with:** full fine-tunes, RLHF checkpoints, model merges, and translated or augmented datasets.
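Conceptually, a model delta is just an element-wise difference of weight tensors. A minimal PyTorch sketch of the idea (not the library's actual implementation, which adds sparse + INT8 encoding on top):

```python
import torch

# Toy example: two "state dicts" with a single 2x2 layer.
base = {"layer.weight": torch.zeros(2, 2)}
tuned = {"layer.weight": torch.full((2, 2), 0.01)}

# Delta = fine-tune minus base. After fine-tuning, most entries are
# small or zero, which is what makes sparse/quantized encodings compact.
delta = {k: tuned[k] - base[k] for k in base}

# Reconstruction = base plus delta, recovering the fine-tune exactly.
restored = {k: base[k] + delta[k] for k in base}
assert torch.equal(restored["layer.weight"], tuned["layer.weight"])
```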
## Quick Start

```bash
pip install sparse-llm
```

### Compress a Fine-tune

```bash
# Lossless compression (~1.4 GB for a 7B model)
sparse compress meta-llama/Llama-2-7b-hf ./my-finetune -o ./my-delta

# OR: lossy compression (~50 MB, LoRA-equivalent quality)
sparse compress-lossy meta-llama/Llama-2-7b-hf ./my-finetune -o ./my-delta --rank 16
```
### Reconstruct from a Delta

```bash
# From a lossless delta
sparse reconstruct meta-llama/Llama-2-7b-hf ./my-delta -o ./reconstructed-model

# From a lossy delta
sparse reconstruct-lossy meta-llama/Llama-2-7b-hf ./my-delta -o ./reconstructed-model
```
### Dataset Delta

```bash
# Compress a derivative dataset
sparse dataset-compress squad squad_v2 -o ./squad_v2_delta

# Reconstruct
sparse dataset-reconstruct ./squad_v2_delta
```
## How It Works

```
Fine-tuned Model (14 GB) - Base Model (14 GB) = Delta (1.4 GB or 50 MB)
                                                      ↓
                                        Reconstruct: Base + Delta
```
- **Lossless:** sparse + INT8 encoding → ~10% of the original size, 100% quality
- **Lossy (SVD):** low-rank approximation → ~0.4% of the original size, ~95-99% quality (see the sketch below)
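To illustrate the lossy path, here is a minimal low-rank approximation of a single delta matrix using `torch.linalg.svd`. The layer shape and the random delta are assumptions for the example; the real pipeline applies this per weight matrix:

```python
import torch

# Hypothetical delta for one 4096x4096 projection matrix.
delta = torch.randn(4096, 4096)

rank = 16
U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
A = U[:, :rank] * S[:rank]   # (4096, 16) left factor, scaled by singular values
B = Vh[:rank, :]             # (16, 4096) right factor
approx = A @ B               # rank-16 reconstruction of the delta

# Storage: only A and B are kept -- 2 * 4096 * 16 = 131,072 values vs
# 4096 * 4096 = 16,777,216 for the dense delta, roughly 0.8% per matrix
# before any quantization. Applied model-wide, this is how the delta
# shrinks to tens of megabytes.
```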
## CLI Reference

```bash
# Lossless compression (100% quality)
sparse compress <base> <finetune> -o <output>
sparse reconstruct <base> <delta> [-o <output>]

# Lossy compression (~50 MB, LoRA-equivalent quality)
sparse compress-lossy <base> <finetune> -o <output> [--rank 16]
sparse reconstruct-lossy <base> <delta> [-o <output>]

# Dataset commands
sparse dataset-compress <base> <derivative> -o <output>
sparse dataset-reconstruct <delta_dir>
sparse dataset-estimate <base> <derivative>

# Info
sparse info <path>
```
## Python API

```python
from core import compress_delta, reconstruct_from_delta
from core import compress_delta_svd_full, reconstruct_from_svd_delta

# Lossless compression
manifest = compress_delta(
    base_model_id="meta-llama/Llama-2-7b-hf",
    finetune_model_id="./my-finetune",
    output_path="./my-delta",
)
print(f"Compression: {manifest.compression_ratio:.1f}x")  # ~10x

# Lossy compression (SVD, LoRA-equivalent)
manifest = compress_delta_svd_full(
    base_model_id="meta-llama/Llama-2-7b-hf",
    finetune_model_id="./my-finetune",
    output_path="./my-svd-delta",
    rank=16,  # analogous to a LoRA rank
)
print(f"Compression: {manifest.compression_ratio:.1f}x")  # ~280x

# Reconstruct (lossless)
model = reconstruct_from_delta("meta-llama/Llama-2-7b-hf", "./my-delta")

# Reconstruct (lossy, from the SVD delta)
model = reconstruct_from_svd_delta("meta-llama/Llama-2-7b-hf", "./my-svd-delta")
```
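Assuming the reconstruction returns a standard `transformers` model (consistent with the GPT-2 verification above), a quick smoke test of the rebuilt model might look like this; the prompt is a placeholder:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
inputs = tokenizer("The capital of France is", return_tensors="pt")

# `model` is the object returned by reconstruct_from_delta above.
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```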
## Dataset API

```python
from core import compress_dataset_delta, reconstruct_from_dataset_delta

# Compress
manifest = compress_dataset_delta("squad", "squad_v2", "./squad_v2_delta")
print(f"Savings: {manifest['size_stats']['savings_pct']:.1f}%")

# Reconstruct
dataset = reconstruct_from_dataset_delta("./squad_v2_delta")
```
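To sanity-check a round trip, you can compare the reconstruction against the original pulled from the Hub. A sketch, assuming the return value mirrors a `datasets` object with the usual splits:

```python
from datasets import load_dataset

original = load_dataset("squad_v2")

# `dataset` is the object returned by reconstruct_from_dataset_delta above.
assert len(dataset["train"]) == len(original["train"])
print("Row counts match")
```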
## Performance

All optimizations are automatic; no configuration is needed:

- **Rust SIMD acceleration:** 5-10x faster compression
- **Base model caching:** ~20 s saved per compression
- **Smart heuristics:** 10-20% better compression ratios
- **GPU reconstruction:** 2-3x faster on CUDA
- **Lazy loading:** 50-70% less memory for 30B+ models

Typical speedup: ~60 s → ~8-12 s (5-8x faster).
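To measure the speedup on your own hardware, a simple wall-clock timing around the compression call (using the API shown above; paths are placeholders) is enough:

```python
import time
from core import compress_delta

start = time.perf_counter()
manifest = compress_delta(
    base_model_id="meta-llama/Llama-2-7b-hf",
    finetune_model_id="./my-finetune",
    output_path="./my-delta",
)
elapsed = time.perf_counter() - start
print(f"Compressed in {elapsed:.1f}s ({manifest.compression_ratio:.1f}x)")
```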
📚 **Advanced optimizations:** see `API_REFERENCE.md` for `MmapDeltaStorage`, `DifferentialCompressor`, and other utilities.
## Sparse vs LoRA

| | LoRA/PEFT | Sparse |
|---|---|---|
| When applied | During training | After training |
| Works on existing models | ❌ | ✅ |
| Lossless option | ❌ | ✅ |
**Key insight:** `sparse compress-lossy` gives you LoRA-sized files (~50 MB) from models that weren't trained with LoRA.
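As a back-of-envelope check on the ~50 MB figure, you can count the rank-16 SVD factors across Llama-2-7B's projection matrices. The architecture figures used here (32 layers, 4096 hidden size, 11008 MLP size) are public model facts, not values read from the library:

```python
rank, layers = 16, 32
hidden, mlp_dim = 4096, 11008

attn = 4 * (hidden + hidden) * rank   # q/k/v/o: A and B factors each
mlp = 3 * (mlp_dim + hidden) * rank   # gate/up/down: A and B factors each
params = layers * (attn + mlp)

print(f"{params:,} params -> ~{params * 2 / 1e6:.0f} MB at fp16")
# ~40M params -> ~80 MB at fp16; quantizing the factors to int8 halves
# that, landing in the same ballpark as the ~50 MB quoted above.
```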
## Requirements

- Python 3.9+
- PyTorch 2.0+
- transformers
- Rust components prebuilt in the wheel (no toolchain setup needed)
## License
Apache 2.0 - See LICENSE for details.
Free for personal and commercial use.