Experimental CLI tool for converting TIFF raster data to FLAC format
FLAC-Raster: Experimental Raster to FLAC Converter
An experimental CLI tool that converts TIFF raster data files into FLAC audio format while preserving all geospatial metadata, CRS, and bounds information. This proof-of-concept explores using FLAC's lossless compression for geospatial data storage and introduces revolutionary HTTP range streaming for efficient geospatial data access - "Netflix for Geospatial Data".
NEW: Netflix-Style Streaming for Geospatial Data
FLAC-Raster now supports true streaming exactly like Netflix and Spotify - each tile is a complete, self-contained FLAC file that can be decoded independently!
Two Streaming Formats:
- Raw Frames Format (15MB) - High compression, full file download only
- Streaming Format (185MB) - Netflix-style independent tiles, perfect for HTTP range streaming
Streaming Features:
- Netflix-style tiles: Each tile is a complete, independent FLAC file
- HTTP range streaming: Stream individual tiles via precise byte range requests
- Instant access: Decode any tile without downloading the full file
- 99%+ bandwidth savings: Download only what you need (0.8MB vs 185MB)
- Geographic precision: Query specific areas with pixel-perfect accuracy
- Web-native: Works with any HTTP server, CDN, or browser
- URL support: Query remote FLAC files directly via HTTPS URLs
- Smart indexing: Spatial metadata for instant tile discovery
Features
- Bidirectional conversion: TIFF → FLAC and FLAC → TIFF
- Complete metadata preservation: CRS, bounds, transform, data type, nodata values
- Embedded metadata: All geospatial metadata stored directly in FLAC files (no sidecar files!)
- Spatial tiling: Convert rasters to tiled FLAC with bbox metadata per tile
- HTTP range streaming: Query and stream data by bounding box with 90%+ bandwidth savings
- Exceptional compression: 7-15× file size reduction while maintaining lossless quality
- Intelligent audio parameters: Automatically selects sample rate and bit depth based on raster properties
- Multi-band support: Seamlessly handles multi-band rasters (RGB, multispectral) as multi-channel audio
- Lossless compression: Perfect reconstruction verified - no data loss
- FLAC chunking: Uses FLAC's frame-based compression (4096 samples/frame)
- Comprehensive logging: Verbose mode with detailed progress tracking
- Colorful CLI: Built with Typer and Rich for an intuitive experience
Installation
Prerequisites
First, install pixi:
# Install pixi (cross-platform package manager)
curl -fsSL https://pixi.sh/install.sh | bash
# or via conda: conda install -c conda-forge pixi
Clone and Setup
git clone https://github.com/Youssef-Harby/flac-raster.git
cd flac-raster
pixi install # Install all dependencies
Install the CLI tool
# For regular use:
pixi run pip install .
# For development (editable install):
pixi run pip install -e .
Alternative: Direct pip installation
pip install rasterio numpy typer rich tqdm pyflac mutagen
# Or install from PyPI (when published):
pip install flac-raster
Usage
Basic Commands
After installation, you can use the CLI directly:
- Convert TIFF to FLAC:
  flac-raster convert input.tif -o output.flac
- Convert FLAC back to TIFF:
  flac-raster convert input.flac -o output.tif
- Get file information:
  flac-raster info file.tif
  flac-raster info file.flac
- Compare two TIFF files:
  flac-raster compare original.tif reconstructed.tif
Spatial Tiling & Netflix-Style Streaming
- Create spatial FLAC with tiling:
  # Raw frames format (high compression, 15MB)
  flac-raster convert input.tif --spatial -o spatial.flac
  # Streaming format (Netflix-style tiles, 185MB)
  flac-raster create-streaming input.tif --tile-size=1024 --output=streaming.flac
  # Custom tile size (256x256)
  flac-raster convert input.tif --spatial --tile-size 256 -o spatial.flac
- Query spatial FLAC by bounding box:
  # Query local file (raw frames)
  flac-raster query spatial.flac --bbox "xmin,ymin,xmax,ymax"
  # Query streaming FLAC (local or remote)
  python test_streaming.py local_streaming.flac --bbox "34.1,28.6,34.3,28.8"
  # Stream from remote URL (Netflix-style!)
  python test_streaming.py "https://example.com/streaming.flac" --tile-id=120
  # Example with real coordinates
  flac-raster query spatial.flac --bbox "-105.3,40.3,-105.1,40.5"
- Extract tiles from Netflix-style streaming FLAC:
  # Extract center tile from remote streaming FLAC
  flac-raster extract-streaming "https://example.com/streaming.flac" --center --output=center.tif
  # Extract last tile
  flac-raster extract-streaming "local_streaming.flac" --last --output=last_tile.tif
  # Extract by tile ID
  flac-raster extract-streaming "streaming.flac" --tile-id=60 --output=tile_60.tif
  # Extract by bounding box
  flac-raster extract-streaming "https://cdn.example.com/data.flac" --bbox="602380,3090240,609780,3097640" --output=bbox_tile.tif
- View spatial index information:
  flac-raster spatial-info spatial.flac  # Raw frames format only
  # For streaming format, use extract-streaming with analysis
Live Demo: Real Remote Streaming
Try our live streaming FLAC files hosted on Storj DCS with real Sentinel-2 B04 band data:
Single Tile Extraction (99%+ Bandwidth Savings)
# Extract center tile (coordinates: 554880, 3145140)
flac-raster extract-streaming \
"https://link.storjshare.io/raw/ju6tov7vffpleabbilqgxfpxz5cq/truemaps-public/flac-raster/B04_streaming.flac" \
--center --output=center_1km.tif
# Downloads: 1.5 MB | Result: 1024×1024 center tile
# Extract last tile (southeast corner)
flac-raster extract-streaming \
"https://link.storjshare.io/raw/ju6tov7vffpleabbilqgxfpxz5cq/truemaps-public/flac-raster/B04_streaming.flac" \
--last --output=southeast_corner.tif
# Downloads: 0.8 MB | Result: 740×740 edge tile
# Extract specific tile by ID (northwest corner)
flac-raster extract-streaming \
"https://link.storjshare.io/raw/ju6tov7vffpleabbilqgxfpxz5cq/truemaps-public/flac-raster/B04_streaming.flac" \
--tile-id=0 --output=northwest_corner.tif
# Downloads: 1.5 MB | Result: 1024×1024 first tile
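The tile sizes in the comments above follow from simple grid arithmetic. The sketch below reproduces them for the 10,980×10,980 demo raster with 1024-pixel tiles. It assumes tiles are numbered row by row starting at the northwest corner, which is consistent with tile 0 being the northwest corner and the last tile the southeast one; `tile_grid` is a hypothetical helper and not part of the flac-raster API.

```python
import math

def tile_grid(width, height, tile_size):
    """Return (columns, rows, edge tile width, edge tile height) for a raster."""
    cols = math.ceil(width / tile_size)
    rows = math.ceil(height / tile_size)
    # Edge tiles keep whatever pixels remain after the full-size tiles.
    edge_w = width - (cols - 1) * tile_size
    edge_h = height - (rows - 1) * tile_size
    return cols, rows, edge_w, edge_h

cols, rows, edge_w, edge_h = tile_grid(10980, 10980, 1024)
total_tiles = cols * rows
# Assumed row-major numbering: tile_id = row * cols + col.
center_id = (rows // 2) * cols + (cols // 2)

print(cols, rows, total_tiles)  # 11 11 121
print(edge_w, edge_h)           # 740 740
print(center_id)                # 60
```

Under these assumptions the grid has 121 tiles, the southeast tile is 740×740 pixels, and the center tile has ID 60.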
Geographic Bounding Box Extraction
# Extract specific geographic area (1 km² in center)
flac-raster extract-streaming \
"https://link.storjshare.io/raw/ju6tov7vffpleabbilqgxfpxz5cq/truemaps-public/flac-raster/B04_streaming.flac" \
--bbox="554380,3144640,555380,3145640" \
--output=center_1km_bbox.tif
# Downloads: 1.5 MB | Result: Exact geographic area
# Extract southeast corner area (last tile region)
flac-raster extract-streaming \
"https://link.storjshare.io/raw/ju6tov7vffpleabbilqgxfpxz5cq/truemaps-public/flac-raster/B04_streaming.flac" \
--bbox="602380,3090240,609780,3097640" \
--output=southeast_bbox.tif
# Downloads: 0.8 MB | Result: 740×740 edge region
# Extract northwest area (first tile region)
flac-raster extract-streaming \
"https://link.storjshare.io/raw/ju6tov7vffpleabbilqgxfpxz5cq/truemaps-public/flac-raster/B04_streaming.flac" \
--bbox="499980,3189800,510220,3200040" \
--output=northwest_bbox.tif
# Downloads: 1.5 MB | Result: 1024×1024 corner region
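The corner bounding boxes above line up exactly with the tile grid. The sketch below derives a tile's bounds from its ID. It assumes a north-up raster, row-major tile numbering, a 10 m pixel size (the native resolution of Sentinel-2 band B04), and an upper-left corner of (499980, 3200040) inferred from the northwest bbox above. `tile_bounds` is a hypothetical helper, not a flac-raster function.

```python
def tile_bounds(tile_id, width, height, tile_size, origin_x, origin_y, pixel_size):
    """Return (xmin, ymin, xmax, ymax) of a tile in map coordinates."""
    cols = -(-width // tile_size)  # ceiling division
    row, col = divmod(tile_id, cols)
    # Pixel window of the tile, clipped to the raster edge.
    px0, py0 = col * tile_size, row * tile_size
    px1 = min(px0 + tile_size, width)
    py1 = min(py0 + tile_size, height)
    # North-up raster: x grows eastward, y shrinks as rows go south.
    xmin = origin_x + px0 * pixel_size
    xmax = origin_x + px1 * pixel_size
    ymax = origin_y - py0 * pixel_size
    ymin = origin_y - py1 * pixel_size
    return xmin, ymin, xmax, ymax

# Raster size, tile size, inferred origin, and pixel size (assumptions, see above).
grid = (10980, 10980, 1024, 499980, 3200040, 10)
print(tile_bounds(0, *grid))    # (499980, 3189800, 510220, 3200040)
print(tile_bounds(120, *grid))  # (602380, 3090240, 609780, 3097640)
```

Tile 0 and tile 120 reproduce the northwest and southeast `--bbox` values used in the commands above.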
Full Dataset Access
# Download full dataset (use raw frames format for efficiency)
flac-raster convert \
"https://link.storjshare.io/raw/juxc544kagtqgkvhezix6wzia5yq/truemaps-public/flac-raster/B04_spatial.flac" \
--output=full_sentinel_B04.tif
# Downloads: 15 MB | Result: Complete 10,980×10,980 Sentinel-2 dataset
# Note: For full datasets, use the raw frames format (15MB) instead of
# streaming format (177MB) for better compression efficiency
Performance Comparison
| Use Case | Command | Download Size | Output | Savings |
|---|---|---|---|---|
| Single tile | --center | 1.5 MB | 1024×1024 | 99.2% |
| Corner tile | --last | 0.8 MB | 740×740 | 99.5% |
| Bbox query | --bbox="..." | 0.8-1.5 MB | Exact area | 99%+ |
| Full dataset | Raw frames format | 15 MB | 10,980×10,980 | 91.5% |
| Full streaming | All 121 tiles | 177 MB | 10,980×10,980 | 0% |
Netflix-Style Benefits:
- Instant metadata: 21KB spatial index loaded once
- Precision targeting: Download only needed geographic areas
- Perfect quality: Pixel-perfect GeoTIFF output with full metadata
- Massive savings: 99%+ bandwidth reduction for area-specific queries
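The savings column in the comparison table appears to be measured against the 177 MB full streaming file. A minimal sketch of that arithmetic (`bandwidth_savings` is an illustrative name, not a flac-raster function):

```python
def bandwidth_savings(downloaded_mb, full_mb):
    """Percentage of the full file that did not have to be downloaded."""
    return (1 - downloaded_mb / full_mb) * 100

FULL_STREAMING_MB = 177  # full streaming file size taken from the table

for label, size_mb in [("center tile", 1.5), ("corner tile", 0.8), ("raw frames", 15)]:
    print(f"{label}: {bandwidth_savings(size_mb, FULL_STREAMING_MB):.1f}%")
# center tile: 99.2%
# corner tile: 99.5%
# raw frames: 91.5%
```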
Alternative: Use python main.py if you haven't installed the package:
python main.py convert input.tif # Direct script usage
Options
Convert command:
- --output, -o: Specify output file path (auto-generates if not provided)
- --compression, -c: FLAC compression level 0-8 (default: 5)
- --force, -f: Overwrite existing output files
- --verbose, -v: Enable verbose logging for detailed progress
- --spatial, -s: Enable spatial tiling (raw frames format)
- --tile-size: Size of spatial tiles in pixels (default: 512x512)
Extract-tile command (for raw frames format):
- --bbox, -b: Bounding box as 'xmin,ymin,xmax,ymax'
- --output, -o: Output TIFF file path (required)
Extract-streaming command (for Netflix-style streaming format):
- --bbox, -b: Bounding box as 'xmin,ymin,xmax,ymax'
- --tile-id: Extract specific tile by ID number
- --center: Extract center tile automatically
- --last: Extract last tile
- --output, -o: Output TIFF file path (required)
Query command:
- --bbox, -b: Bounding box as 'xmin,ymin,xmax,ymax' (required)
- --output, -o: Output file for extracted data
- --format, -f: Output format: 'ranges' (default) or 'data'
Streaming test commands:
- --tile-id: Extract specific tile by ID number
- --bbox: Extract tile by geographic bounding box
- --last: Extract last tile (default)
- --savings: Show bandwidth savings analysis
Compare command:
- --show-bands/--no-bands: Show per-band statistics (default: True)
- --export, -e: Export comparison results to JSON file
- --help: Show help message
Example Workflow
# Create sample data
python examples/create_test_data.py
# Convert DEM to FLAC
flac-raster convert test_data/sample_dem.tif -v
# Check the FLAC file info
flac-raster info test_data/sample_dem.flac
# Convert back to TIFF
flac-raster convert test_data/sample_dem.flac -o test_data/dem_reconstructed.tif
# Compare original and reconstructed
flac-raster compare test_data/sample_dem.tif test_data/dem_reconstructed.tif
# Export comparison to JSON
flac-raster compare test_data/sample_dem.tif test_data/dem_reconstructed.tif --export comparison.json
# Test with multi-band data
flac-raster convert test_data/sample_rgb.tif
flac-raster convert test_data/sample_rgb.flac -o test_data/rgb_reconstructed.tif
flac-raster compare test_data/sample_rgb.tif test_data/rgb_reconstructed.tif
# Open in QGIS to verify
# The reconstructed files should be viewable in QGIS with all metadata intact
How It Works
TIFF to FLAC Conversion
- Read raster data and extract all metadata (CRS, bounds, transform, etc.)
- Spatial tiling (if enabled): Divide raster into configurable tile sizes
- Calculate audio parameters:
  - Sample rate: Based on image resolution (44.1kHz to 192kHz)
  - Bit depth: Matches the raster's bit depth (16 or 24-bit, minimum 16-bit due to FLAC decoder limitations)
- Normalize data to audio range (-1 to 1)
- Reshape data: Bands become audio channels, pixels become samples
  - Single-band → Mono audio
  - Multi-band (RGB, multispectral) → Multi-channel audio
- Encode to FLAC with configurable compression
- Embed metadata directly in FLAC using VORBIS_COMMENT blocks
- Generate spatial index with bbox and byte range information for each tile
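The reshape step above can be illustrated without any dependencies. This is only a sketch of the idea (bands become channels, pixels become samples); the row-by-row flattening order and the helper names `bands_to_frames` and `frames_to_bands` are assumptions, and the project itself lists numpy among its dependencies for this kind of work.

```python
def bands_to_frames(bands):
    """Interleave per-band pixel grids into audio frames (one sample per channel)."""
    # Flatten every band row by row so its pixels become a 1-D sample stream.
    channels = [[px for row in band for px in row] for band in bands]
    # zip(*channels) groups the n-th sample of every channel: one frame per pixel.
    return [list(frame) for frame in zip(*channels)]

def frames_to_bands(frames, width, height):
    """Inverse of bands_to_frames: rebuild each band as a list of rows."""
    channels = [list(channel) for channel in zip(*frames)]
    return [
        [channel[r * width:(r + 1) * width] for r in range(height)]
        for channel in channels
    ]

rgb = [
    [[1, 2], [3, 4]],     # red band, 2x2 pixels
    [[5, 6], [7, 8]],     # green band
    [[9, 10], [11, 12]],  # blue band
]
frames = bands_to_frames(rgb)
print(frames)  # [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
print(frames_to_bands(frames, width=2, height=2) == rgb)  # True
```

A single-band raster yields one-element frames (mono), and a three-band raster yields three-channel frames.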
FLAC to TIFF Conversion
- Decode FLAC file and extract audio samples
- Load metadata from embedded FLAC metadata (with JSON sidecar fallback)
- Reconstruct spatial index for tiled data
- Reshape audio back to raster dimensions
  - Mono → Single-band raster
  - Multi-channel → Multi-band raster
- Denormalize to original data range
- Write GeoTIFF with all original metadata preserved
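For integer rasters, a min/max normalization stays exactly reversible as long as the value range fits into the chosen bit depth (hi - lo ≤ 2^bits - 1). The sketch below shows one such scheme that maps values onto signed integer sample values, the integer equivalent of the -1 to 1 audio range. It is not necessarily the scheme flac-raster implements; `normalize` and `denormalize` are hypothetical names.

```python
def normalize(values, lo, hi, bits=16):
    """Map integers in [lo, hi] onto signed `bits`-bit sample values."""
    full_scale = 2 ** bits - 1
    offset = 2 ** (bits - 1)
    span = (hi - lo) or 1  # avoid division by zero for constant rasters
    return [round((v - lo) / span * full_scale) - offset for v in values]

def denormalize(samples, lo, hi, bits=16):
    """Invert normalize() using the stored original min/max values."""
    full_scale = 2 ** bits - 1
    offset = 2 ** (bits - 1)
    span = (hi - lo) or 1
    return [round((s + offset) / full_scale * span) + lo for s in samples]

elevations = [-120, 0, 35, 2400]  # made-up int16 DEM values
lo, hi = min(elevations), max(elevations)
samples = normalize(elevations, lo, hi)

# Every sample fits the signed 16-bit range and the round trip is exact.
print(min(samples), max(samples))  # -32768 32767
print(denormalize(samples, lo, hi) == elevations)  # True
```

This is why the original min/max values must travel with the file: without them the decoder cannot undo the scaling.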
Metadata Preservation
The tool preserves all geospatial metadata directly embedded in FLAC files:
- Width and height dimensions
- Number of bands
- Data type (uint8, int16, float32, etc.)
- Coordinate Reference System (CRS)
- Geospatial transform (affine transformation matrix)
- Bounding box coordinates
- Original data min/max values
- NoData values
- Spatial index: Compressed tile bbox and byte range information
- Original driver information
Embedded Metadata Format
Metadata is stored in FLAC VORBIS_COMMENT blocks:
GEOSPATIAL_CRS=EPSG:4326
GEOSPATIAL_WIDTH=1201
GEOSPATIAL_HEIGHT=1201
GEOSPATIAL_SPATIAL_INDEX=<base64(gzip(spatial_index_json))>
...
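The `base64(gzip(json))` encoding of the spatial index can be reproduced with the standard library alone. A minimal sketch: the layout of the index dictionary (`tiles`, `bbox`, `byte_range`) is an assumption made for illustration, and only the `GEOSPATIAL_SPATIAL_INDEX` key comes from the format shown above.

```python
import base64
import gzip
import json

def encode_spatial_index(index):
    """Serialize an index dict as base64(gzip(json)) for a text comment field."""
    raw = json.dumps(index, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")

def decode_spatial_index(value):
    """Reverse encode_spatial_index()."""
    return json.loads(gzip.decompress(base64.b64decode(value)))

# Hypothetical index layout, for illustration only.
index = {
    "tiles": [
        {"id": 0, "bbox": [499980, 3189800, 510220, 3200040],
         "byte_range": [48152, 73513]},
    ]
}
comment = f"GEOSPATIAL_SPATIAL_INDEX={encode_spatial_index(index)}"

# Split on the first "=" only, since base64 padding may also contain "=".
key, _, value = comment.partition("=")
print(key)                                   # GEOSPATIAL_SPATIAL_INDEX
print(decode_spatial_index(value) == index)  # True
```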
Lazy Loading & HTTP Range Streaming for Web GIS
Concept: "Zarr for Geospatial Data using Audio Compression"
The lazy loading feature transforms FLAC-Raster into a web-native geospatial format that enables efficient HTTP range request streaming:
FLAC URL: https://cdn.example.com/elevation.flac
↓
Lazy Load: Download first 1MB for metadata only
↓
Query Spatial Index: Find intersecting tiles for bbox
↓
HTTP Range Request: bytes=48152-73513,87850-113211
↓
Smart Download: Only 76KB instead of 189KB (60% savings!)
↓
Decode FLAC: Get pixels for visible area only
Lazy Loading Workflow
- Metadata First: Download only 1MB to read embedded spatial index
- On-Demand Streaming: Query specific geographic areas
- Precise Downloads: HTTP Range requests for intersecting tiles only
- Progressive Loading: Cache tiles for repeated access
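The "query spatial index" step of this workflow boils down to a rectangle intersection test per tile. The sketch below assumes an index made of `bbox` and `byte_range` entries (a hypothetical layout) and a made-up 2×2 tile grid; two of its byte ranges are borrowed from the diagram above, the other two are invented.

```python
def intersects(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def query_ranges(spatial_index, bbox):
    """Byte ranges of every tile whose bbox intersects the query bbox."""
    return [
        tuple(tile["byte_range"])
        for tile in spatial_index["tiles"]
        if intersects(tile["bbox"], bbox)
    ]

# Hypothetical 2x2 tile grid; the layout and two of the ranges are invented.
spatial_index = {"tiles": [
    {"bbox": (0, 10, 10, 20), "byte_range": (48152, 73513)},     # top-left
    {"bbox": (10, 10, 20, 20), "byte_range": (73514, 87849)},    # top-right
    {"bbox": (0, 0, 10, 10), "byte_range": (87850, 113211)},     # bottom-left
    {"bbox": (10, 0, 20, 10), "byte_range": (113212, 138573)},   # bottom-right
]}

ranges = query_ranges(spatial_index, (2, 5, 8, 15))
header = "bytes=" + ",".join(f"{start}-{end}" for start, end in ranges)
print(header)  # bytes=48152-73513,87850-113211
```

A query that touches only the left column selects two non-adjacent byte ranges, which is the situation shown in the diagram.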
Use Cases
- Interactive Web Maps
  - Progressive loading as users pan/zoom
  - Only download visible area data
  - Works with any HTTP server/CDN
- Cloud-Native GIS
  - Stream large rasters without specialized servers
  - Compatible with S3, CloudFront, etc.
  - No need for complex tiling servers
- Bandwidth-Constrained Applications
  - Mobile mapping apps
  - Satellite/field data collection
  - IoT sensor networks
Web Server Integration
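// JavaScript lazy loading client example.
// extractEmbeddedMetadata, calculateRanges, and decodeFLACTiles are
// placeholders that the application must supply.
async function loadRasterData(bbox) {
  const flacUrl = '/data/elevation.flac';

  // 1. Lazy load: get metadata only (first 1MB)
  const metadataResponse = await fetch(flacUrl, {
    headers: { 'Range': 'bytes=0-1048575' }
  });
  // Read the body before parsing; the Response object itself holds no bytes yet
  const metadataBytes = await metadataResponse.arrayBuffer();
  const spatialIndex = extractEmbeddedMetadata(metadataBytes);

  // 2. Find byte ranges for bbox
  const ranges = calculateRanges(bbox, spatialIndex);

  // 3. Stream only needed tiles via HTTP ranges
  //    (a multi-range response arrives as multipart/byteranges, so the
  //    decoder has to split the parts; not every server supports this)
  const rangeHeader = ranges.map(r => `${r.start}-${r.end}`).join(',');
  const dataResponse = await fetch(flacUrl, {
    headers: { 'Range': `bytes=${rangeHeader}` }
  });

  // 4. Decode FLAC data for bbox
  return decodeFLACTiles(await dataResponse.arrayBuffer(), bbox);
}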
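A Python counterpart of the range request can be written with the standard library. This is a sketch, not the client shipped with flac-raster: `build_range_request` and `fetch_range` are hypothetical helpers, and the URL is the example one from the workflow diagram.

```python
import urllib.request

def build_range_request(url, start, end):
    """Create a GET request for the inclusive byte range [start, end]."""
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})

def fetch_range(url, start, end, timeout=30):
    """Download one byte range; raise if the server ignores the Range header."""
    request = build_range_request(url, start, end)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 Partial Content confirms that the server honored the range.
        if response.status != 206:
            raise RuntimeError(f"expected 206 Partial Content, got {response.status}")
        return response.read()

# Building the request does not touch the network.
request = build_range_request("https://cdn.example.com/elevation.flac", 0, 1048575)
print(request.get_header("Range"))  # bytes=0-1048575
```

Checking for status 206 matters because a server without range support answers 200 and sends the whole file.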
Technical Details
Netflix-Style Streaming Architecture
FLAC-Raster implements two distinct formats for different use cases:
Raw Frames Format (Legacy)
- FLAC frames: Raw frame chunks within single FLAC file
- Compression: Exceptional compression (15MB for 185MB streaming equivalent)
- Use case: Full file downloads, highest compression ratio
- Limitation: Cannot stream individual tiles
Streaming Format (Netflix-Style)
- Self-contained tiles: Each tile is a complete, independent FLAC file
- HTTP range ready: Perfect byte boundaries for range requests
- Instant decode: Any tile can be decoded without full file context
- Format structure:
[4 bytes index size][JSON spatial index][Complete FLAC Tile 1][Complete FLAC Tile 2]...[Complete FLAC Tile N]
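A container with this layout can be written and read back in a few lines. The sketch below assumes the 4-byte size is an unsigned little-endian integer and that tile offsets are stored relative to the end of the index; neither detail is stated above, so check the actual files before relying on them. The tile payloads are placeholders, not valid FLAC streams.

```python
import json
import struct

def build_container(index, tiles):
    """Concatenate [4-byte index size][JSON index][tile 1]...[tile N]."""
    index_bytes = json.dumps(index).encode("utf-8")
    # "<I" = unsigned 32-bit little-endian (byte order is an assumption).
    return struct.pack("<I", len(index_bytes)) + index_bytes + b"".join(tiles)

def read_index(blob):
    """Return the parsed index and the offset at which tile data starts."""
    (index_size,) = struct.unpack_from("<I", blob, 0)
    index = json.loads(blob[4:4 + index_size])
    return index, 4 + index_size

tiles = [b"fLaC-tile-0", b"fLaC-tile-1"]  # placeholder payloads
index = {"tiles": [
    {"offset": 0, "size": len(tiles[0])},
    {"offset": len(tiles[0]), "size": len(tiles[1])},
]}
blob = build_container(index, tiles)

parsed, data_start = read_index(blob)
second = parsed["tiles"][1]
start = data_start + second["offset"]
print(blob[start:start + second["size"]])  # b'fLaC-tile-1'
```

Because the index comes first and has a known length, a client can read it with one small range request and then address any tile directly.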
Core Technologies
- Complete FLAC segments: Each tile includes full FLAC headers and metadata
- HTTP byte ranges: Precise byte offsets enable partial downloads
- Embedded metadata: All geospatial info stored in FLAC VORBIS_COMMENT blocks
- Spatial indexing: JSON metadata with bbox coordinates and byte ranges
- Multi-band support: Each raster band becomes an audio channel (up to 8 channels supported by FLAC)
- Lossless conversion: Data is normalized but the process is completely reversible
- Exceptional compression: Leverages FLAC's compression algorithms (7-15ร size reduction)
- Self-contained files: No external dependencies or sidecar files required
- Data type mapping:
  - uint8 → 16-bit FLAC (due to decoder limitations)
  - int16/uint16 → 16-bit FLAC
  - int32/uint32/float32 → 24-bit FLAC
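The data type mapping and the channel limit from the list above can be expressed as a small lookup. This is a sketch of the stated rules only; `flac_bit_depth` and `check_band_count` are illustrative names rather than functions from the package.

```python
# Bit depth per raster dtype, as listed in the data type mapping above.
FLAC_BIT_DEPTH = {
    "uint8": 16,   # stored as 16-bit because of the decoder limitation
    "int16": 16,
    "uint16": 16,
    "int32": 24,
    "uint32": 24,
    "float32": 24,
}
MAX_FLAC_CHANNELS = 8  # FLAC supports at most 8 channels

def flac_bit_depth(dtype):
    """Look up the FLAC bit depth for a raster dtype name."""
    try:
        return FLAC_BIT_DEPTH[dtype]
    except KeyError:
        raise ValueError(f"no FLAC bit depth listed for dtype {dtype!r}") from None

def check_band_count(bands):
    """Reject rasters with more bands than FLAC has channels."""
    if not 1 <= bands <= MAX_FLAC_CHANNELS:
        raise ValueError(f"{bands} bands cannot map to 1-{MAX_FLAC_CHANNELS} channels")
    return bands

print(flac_bit_depth("uint8"), flac_bit_depth("float32"))  # 16 24
print(check_band_count(6))  # 6
```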
Performance Examples
Results from the testing documented in report.md:
Compression Results
- DEM file (1201×1201, int16): 2.8 MB → 185 KB FLAC (15.25× compression)
- Multispectral (200×200×6, uint8): 235 KB → 32 KB FLAC (7.38× compression)
- RGB (256×256×3, uint8): 193 KB → 27 KB FLAC (7.26× compression)
HTTP Range Streaming Efficiency
- Small area queries: Up to 98.8% bandwidth savings vs full download
- Geographic precision: Query exact areas with pixel-perfect accuracy
- Optimized ranges: Smart merging of contiguous tiles reduces HTTP requests
All conversions are perfectly lossless (verified with numpy array comparison)
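Merging contiguous tile ranges, as mentioned under streaming efficiency, is a standard interval merge. A minimal sketch with a hypothetical `merge_ranges` helper; the byte offsets are illustrative.

```python
def merge_ranges(ranges):
    """Merge inclusive byte ranges that touch or overlap into fewer requests."""
    merged = []
    for start, end in sorted(ranges):
        # A range starting right after the previous end is contiguous with it.
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]

tile_ranges = [(87850, 113211), (48152, 73513), (73514, 87849), (200000, 210000)]
print(merge_ranges(tile_ranges))
# [(48152, 113211), (200000, 210000)]
```

Three adjacent tiles collapse into a single request here, while the distant tile keeps its own range.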
Limitations
- Maximum 8 bands (FLAC channel limitation)
- Minimum 16-bit encoding (pyflac decoder limitation)
- Large rasters may take time to process
- FLAC format limitations apply (specific bit depths: 16, 24-bit)
- Requires mutagen library for embedded metadata support
- Experimental: Not recommended for production use without thorough testing
Project Structure
flac-raster/
├── src/flac_raster/                  # Main package
│   ├── __init__.py                   # Package initialization
│   ├── cli.py                        # Command-line interface
│   ├── converter.py                  # Core conversion logic
│   ├── spatial_encoder.py            # Spatial tiling & HTTP range streaming
│   ├── metadata_encoder.py           # Embedded metadata handling
│   └── compare.py                    # Comparison utilities
├── examples/                         # Example scripts
│   ├── create_test_data.py           # Generate test datasets
│   └── spatial_streaming_example.py  # HTTP range streaming demo
├── test_data/                        # Test datasets
│   ├── dem-raw.tif                   # Large DEM for testing
│   ├── sample_multispectral.tif      # 6-band multispectral
│   └── sample_rgb.tif                # RGB test data
├── report.md                         # Comprehensive analysis & benchmarks
├── main.py                           # Main entry point
├── pyproject.toml                    # Project configuration
├── README.md                         # This file
└── pixi.toml                         # Pixi package configuration
CI/CD & Publishing
This project uses GitHub Actions for:
- Continuous Integration: Tests on Python 3.9-3.12 across Windows, macOS, and Linux
- Automated Building: Package building and validation
- PyPI Publishing: Automatic publishing on release creation
- Quality Assurance: Integration testing via CLI commands
Publishing to PyPI
See PUBLISHING.md for detailed instructions on publishing releases.
Contributing
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make your changes and test them
- Commit your changes: git commit -am 'Add feature'
- Push to the branch: git push origin feature-name
- Create a Pull Request
Future Improvements
- Adaptive tiling: Variable tile sizes based on data complexity
- Temporal support: Time-series data with temporal indexing
- Band selection: Spectral subsetting for multispectral data
- Compression tuning: Automatic optimization of FLAC parameters
- Caching strategy: Intelligent tile caching for frequently accessed areas
- JavaScript client: Browser-based FLAC decoder for web mapping
- Parallel processing: Multi-threaded encoding/decoding
- More formats: Support for HDF5, NetCDF, Zarr integration
- Performance optimization: Memory usage and processing speed improvements