pytortilla

The file format behind TACO.

GitHub: https://github.com/tacofoundation/tortilla-python 🌐

PyPI: https://pypi.tw.martin98.com/project/pytortilla/ 🛠️

Tortilla 🫓

Hello! I'm a Tortilla, a format to serialize your EO data 🤗.

pytortilla is a Python package that simplifies creating and managing .tortilla files. These files encapsulate metadata, dataset information, and links to relevant files in remote sensing or AI workflows.

This package is re-exported by tacotoolbox as tacotoolbox.tortilla, so if you already have tacotoolbox installed you can use the same functionality from there.

Goals

Metadata handling: Defines classes (Sample, Samples) to describe and structure your data’s information.
Dataset structuring: Easily generate training, validation, and testing splits, and store them in .tortilla files.
Internal validation: Validate your dataset’s integrity (e.g., opening each file with rasterio).
Integration with Earth Engine (ee): Combines local data operations with Google Earth Engine (GEE) functionality.
Unified usage with tacotoolbox: Load and manipulate these datasets with tacoreader and other helper functions from tacotoolbox.

Table of Contents

Installation
Usage guide
Creating samples
Validation and adding metadata
Generating the .tortilla file
Loading and using the .tortilla File

Installation

pip install pytortilla

or from source:

git clone https://github.com/tacofoundation/tortilla-python.git
cd tortilla-python
pip install .

Note: You may also install it as part of tacotoolbox, where pytortilla is included as a dependency.
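
To confirm the installation from Python, you can ask the standard library which version is present. This is a minimal sketch; the helper name `installed_version` is illustrative and not part of pytortilla.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report on pytortilla and the optional tacotoolbox wrapper
for name in ("pytortilla", "tacotoolbox"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```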

Usage guide

This guide walks through creating a .tortilla file step by step, with tips and best practices along the way. Start with the imports used throughout:

import pathlib
import rasterio
import pandas as pd
from sklearn.model_selection import train_test_split
import pytortilla

If you need Earth Engine:

import ee
ee.Initialize()  # Requires prior authentication if not done already

Files

Download the demo files from Hugging Face to your local machine:

import os

# URL path to the Hugging Face repository
path = "https://huggingface.co/datasets/tacofoundation/tortilla_demo/resolve/main/"

# List of demo files to download
files = [
    "demo/high__test__ROI_0010__20190125T112341_20190125T112624_T28QFG.tif",
    "demo/high__test__ROI_0011__20190130T103251_20190130T104108_T31REP.tif",
    "demo/high__test__ROI_0011__20190830T102029_20190830T102552_T31REP.tif",
    "demo/high__test__ROI_0064__20190317T015619_20190317T020354_T51JVH.tif",
    "demo/high__test__ROI_0120__20191219T045209_20191219T045214_T45TXE.tif",
    "demo/high__test__ROI_0141__20190316T141049_20190316T142437_T19FDE.tif",
    "demo/high__test__ROI_0159__20200403T143721_20200403T144642_T19HBV.tif",
    "demo/high__test__ROI_0235__20200402T053639_20200402T053638_T44UNV.tif"
]

# Create a local folder called 'demo' (if it does not already exist)
os.makedirs("demo", exist_ok=True)

# Download each file to the 'demo' folder
for file in files:
    os.system(f"wget {path}{file} -O {file}")

Note: Depending on your environment, you might prefer using requests or urllib instead of os.system for downloading files.

At this point, you should have a demo/ folder populated with several .tif files.
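
If wget is not available (for example on Windows), the same download can be sketched with the standard library. The helper below is an illustration, not part of pytortilla: `download_files` and its `dest_root` parameter are hypothetical names, and it simply appends each file name to the base URL, as the loop above does.

```python
import urllib.request
from pathlib import Path


def download_files(base_url: str, files: list[str], dest_root: str = ".") -> list[Path]:
    """Download base_url + name for each name, skipping files already present."""
    saved = []
    for name in files:
        target = Path(dest_root) / name
        # Create parent folders such as 'demo/' on any operating system
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            urllib.request.urlretrieve(base_url + name, str(target))
        saved.append(target)
    return saved


# Usage with the variables defined above (performs network requests):
# download_files(path, files)
```

Skipping files that already exist makes the step safe to re-run after an interrupted download, although a partially written file would need to be deleted by hand first.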

Creating samples

Next, split the files into train, validation, and test sets and describe each one as a pytortilla Sample:

import pathlib
import pandas as pd
from sklearn.model_selection import train_test_split
import rasterio
from pytortilla.datamodel import Sample, Samples

# Define the local path containing the TIFF files
demo_path = pathlib.Path("./demo")

# Collect all .tif files in the demo folder
all_files = list(demo_path.glob("*.tif"))

# Split into train, val, and test
train_files, test_files = train_test_split(all_files, test_size=0.2, random_state=42)
train_files, val_files = train_test_split(train_files, test_size=0.2, random_state=42)

train_df = pd.DataFrame({"path": train_files, "split": "train"})
val_df = pd.DataFrame({"path": val_files, "split": "validation"})
test_df = pd.DataFrame({"path": test_files, "split": "test"})
dataset_full = pd.concat([train_df, val_df, test_df], ignore_index=True)

# Build a list of Sample objects
samples_list = []
for _, row in dataset_full.iterrows():
    with rasterio.open(row.path) as src:
        metadata = src.profile
        sample_obj = Sample(
            id=row.path.stem,
            path=str(row.path),
            file_format="GTiff",
            data_split=row.split,
            stac_data={
                "crs": str(metadata["crs"]),
                "geotransform": metadata["transform"].to_gdal(),
                "raster_shape": (metadata["height"], metadata["width"])
            }
        )
        samples_list.append(sample_obj)

samples_obj = Samples(samples=samples_list)
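
Note that applying test_size=0.2 twice does not produce an 80/10/10 split: the second call takes 20% of the remaining 80%, so the result is roughly 64% train, 16% validation, and 20% test. The stdlib-only sketch below mimics that two-stage logic to make the counts visible. It is not scikit-learn's implementation (the shuffling differs, so the assignment of files to splits will not match), `two_stage_split` is a hypothetical helper, and rounding each held-out set up is an assumption meant to mirror how scikit-learn treats a fractional test_size.

```python
import math
import random


def two_stage_split(items, test_size=0.2, val_size=0.2, seed=42):
    """Shuffle once, hold out a test set, then hold out a validation set from the rest."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    # Stage 1: hold out the test set (rounded up)
    n_test = math.ceil(test_size * len(shuffled))
    test, rest = shuffled[:n_test], shuffled[n_test:]

    # Stage 2: hold out validation from what remains (rounded up)
    n_val = math.ceil(val_size * len(rest))
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


# Eight placeholder names stand in for the eight demo files
names = [f"file_{i}.tif" for i in range(8)]
train, val, test = two_stage_split(names)
print(len(train), len(val), len(test))  # prints: 4 2 2
```

With only eight files, the validation and test sets hold two files each. To target a specific overall ratio such as 80/10/10, express the second fraction relative to the remaining data (0.125 of the remaining 80%).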

Validation and adding metadata

Validate each .tif file by trying to open it:

samples_obj.deep_validator(read_function=lambda x: rasterio.open(x))
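
Conceptually, this kind of validation attempts to read every file and reports the ones that fail. The sketch below illustrates the idea with the standard library only. It is not pytortilla's implementation, and `find_unreadable` is a hypothetical helper.

```python
from typing import Callable, Dict, Iterable


def find_unreadable(paths: Iterable[str], read_function: Callable[[str], object]) -> Dict[str, str]:
    """Try to read each path; return {path: error message} for the failures."""
    failures = {}
    for p in paths:
        try:
            handle = read_function(str(p))
            # Close the handle when the reader returns one (file objects, rasterio datasets)
            close = getattr(handle, "close", None)
            if callable(close):
                close()
        except Exception as exc:  # any read error marks the file as invalid
            failures[str(p)] = f"{type(exc).__name__}: {exc}"
    return failures


# Example with the built-in open(); swap in rasterio.open for GeoTIFFs
bad = find_unreadable(["definitely_missing_file.tif"], lambda p: open(p, "rb"))
print(bad)
```

Collecting failures instead of stopping at the first one gives a complete list of broken files in a single pass.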

If you need RAI metadata (or any other additional metadata) in your workflow, you can include it:

samples_obj = samples_obj.include_rai_metadata(
    sample_footprint=5120,  # Example footprint value
    cache=False,
    quiet=False
)

Generating the `.tortilla` file

Use pytortilla.create.main.create() (or the equivalent tacotoolbox.tortilla.create if you have tacotoolbox installed):

from pytortilla.create.main import create

# Generate the .tortilla file
output_file = create(
    samples=samples_obj,
    output="demo_dataset.tortilla"
)

print(f"Tortilla file generated: {output_file}")

For large datasets, the output may be split into multiple part files (.0000.part.tortilla, and so on).

Loading and using the `.tortilla` file

Finally, load the .tortilla file (or its parts) with tacoreader:

import itertools
import pathlib

import pandas as pd
import tacoreader

dataset_chunks = []

single_file = pathlib.Path("demo_dataset.tortilla")
if single_file.exists():
    # The dataset was small enough to be written as one file
    dataset_chunks.append(tacoreader.load(str(single_file)))
else:
    # Otherwise load the numbered parts in order until one is missing
    for i in itertools.count():
        part_file = pathlib.Path(f"demo_dataset.{i:04d}.part.tortilla")
        if not part_file.exists():
            break  # Stop if no more parts
        dataset_chunks.append(tacoreader.load(str(part_file)))

if dataset_chunks:
    dataset = pd.concat(dataset_chunks, ignore_index=True)
    print(dataset.head())
else:
    print("No tortilla files found.")

Project details

Release history

0.5.1 (Mar 28, 2025), this version
0.5.0 (Dec 29, 2024)
0.5.0b2 (Dec 28, 2024), pre-release
0.5.0b1 (Dec 27, 2024), pre-release
0.5.0b0 (Dec 27, 2024), pre-release
0.5.0a0 (Dec 26, 2024), pre-release
0.4.2 (Dec 22, 2024)
0.4.1 (Dec 17, 2024)
0.4.0 (Dec 16, 2024)
0.3.3 (Dec 15, 2024)
0.3.2 (Dec 15, 2024)
0.3.1 (Dec 15, 2024)
0.3.0 (Dec 15, 2024)
0.2.1 (Dec 15, 2024)
0.2.0 (Dec 14, 2024)
0.1.2 (Dec 13, 2024)
0.1.1 (Dec 11, 2024)
0.1.0 (Dec 10, 2024)
0.1.0a7 (Dec 10, 2024), pre-release
0.1.0a6 (Dec 10, 2024), pre-release
0.1.0a5 (Dec 9, 2024), pre-release
0.1.0a4 (Dec 9, 2024), pre-release
0.1.0a3 (Dec 9, 2024), pre-release
0.1.0a2 (Dec 9, 2024), pre-release
0.1.0a1 (Dec 9, 2024), pre-release
0.1.0a0 (Dec 9, 2024), pre-release
0.0.5 (Nov 30, 2024)
0.0.3 (Oct 2, 2024)
0.0.2 (Oct 2, 2024)
0.0.1 (Sep 28, 2024)

Download files

Source Distribution

pytortilla-0.5.1.tar.gz (672.2 kB view details)

Uploaded Mar 28, 2025 Source

Built Distribution

pytortilla-0.5.1-py3-none-any.whl (668.6 kB view details)

Uploaded Mar 28, 2025 Python 3

File details

Details for the file pytortilla-0.5.1.tar.gz.

File metadata

Download URL: pytortilla-0.5.1.tar.gz
Upload date: Mar 28, 2025
Size: 672.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/7.0.1 keyring/24.3.1 pkginfo/1.9.6 readme-renderer/34.0 requests-toolbelt/1.0.0 requests/2.32.3 rfc3986/1.5.0 tqdm/4.67.1 urllib3/2.3.0 CPython/3.10.12

File hashes

Hashes for pytortilla-0.5.1.tar.gz
Algorithm	Hash digest
SHA256	`cc37f4c9182afabf3fe99eccecb30e11e39209a24b0727b413770ba89fe2093c`
MD5	`e1ed8ea707bab3d77d87546e0af35785`
BLAKE2b-256	`1edbb2592e430f8c2a04005909d72b2f8ef8e551b86f75052dd63fbf26974e7a`

File details

Details for the file pytortilla-0.5.1-py3-none-any.whl.

File metadata

Download URL: pytortilla-0.5.1-py3-none-any.whl
Upload date: Mar 28, 2025
Size: 668.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/7.0.1 keyring/24.3.1 pkginfo/1.9.6 readme-renderer/34.0 requests-toolbelt/1.0.0 requests/2.32.3 rfc3986/1.5.0 tqdm/4.67.1 urllib3/2.3.0 CPython/3.10.12

File hashes

Hashes for pytortilla-0.5.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`27b1d66db576a02c95070978c23e5410fbb8576938bf8c6da92c43c3106f367e`
MD5	`39a35b68bbb1268dc839c9fbf6be3e83`
BLAKE2b-256	`7296543816b82935f09d33038ea35210b5aaa47ba1b1ef5b2cebf277506d40cb`
