A framework for reproducible experiments with pipelines, treatments, and hypotheses.

Crystallize 🧪✨

⚠️ Pre-Alpha Notice
This project is in an early experimental phase. Breaking changes may occur at any time. Use at your own risk.


Rigorous, reproducible, and clear data science experiments.

Crystallize is an elegant, lightweight Python framework designed to help data scientists, researchers, and machine learning practitioners turn hypotheses into crystal-clear, reproducible experiments.


Why Crystallize?

  • Clarity from Complexity: Easily structure your experiments, making it straightforward to follow best scientific practices.
  • Repeatability: Built-in support for reproducible results through immutable contexts, lockfiles, and robust pipeline management.
  • Statistical Rigor: Hypothesis-driven experiments with integrated statistical verification.

Core Concepts

Crystallize revolves around several key abstractions:

  • DataSource: Flexible data fetching and generation.
  • Pipeline & PipelineSteps: Deterministic data transformations. Steps may be synchronous or async functions and are awaited automatically.
  • Hypothesis & Treatments: Quantifiable assertions and experimental variations.
  • Statistical Tests: Built-in support for rigorous validation of experiment results.
  • Optimizer: Iterative search over treatments using an ask/tell loop.
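To make the "synchronous or async" point about pipeline steps concrete, here is a minimal sketch in plain Python. It does not use Crystallize's classes; run_steps, double, and add_one are hypothetical names chosen for illustration, and the real Pipeline may handle this differently.

```python
import asyncio
import inspect
from typing import Any, Callable, Sequence


async def run_steps(data: Any, steps: Sequence[Callable[[Any], Any]]) -> Any:
    """Apply each step in order, awaiting any result that is awaitable."""
    for step in steps:
        result = step(data)
        # An async step returns a coroutine; a sync step returns a plain value.
        if inspect.isawaitable(result):
            result = await result
        data = result
    return data


def double(values: list) -> list:
    # Synchronous step (hypothetical).
    return [v * 2 for v in values]


async def add_one(values: list) -> list:
    # Asynchronous step (hypothetical); the sleep stands in for real I/O.
    await asyncio.sleep(0)
    return [v + 1 for v in values]


output = asyncio.run(run_steps([1, 2, 3], [double, add_one]))
print(output)  # [3, 5, 7]
```

The caller never needs to know which steps are async, which is the convenience the PipelineSteps bullet describes.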

Getting Started

Crystallize can be used in two ways: through the interactive CLI for managing file-based experiments, or as a Python library for full programmatic control.

Installation

Install the framework and its CLI from PyPI using pip:

pip install crystallize-ml

Option 1: The Interactive CLI (Recommended Workflow)

This is the fastest way to create, manage, and run a suite of experiments.

Launch the interactive terminal UI:

crystallize

Scaffold a new experiment:

Inside the UI, press the n key to open the "Create New Experiment" screen. Fill out the details to generate a new experiment folder under experiments/.

Run your experiment:

The UI will automatically discover your new experiment. Highlight it in the list and press Enter to run it.

Option 2: The Python Library (Programmatic Workflow)

Use the library directly in your Python scripts for advanced use cases and integrations.

from crystallize import (
    DataSource,
    Experiment,
    Pipeline,
    Treatment,
    Hypothesis,
    SeedPlugin,
    ParallelExecution,
)

# Define your datasource, pipeline, treatments, etc.
pipeline = Pipeline([...])
datasource = DataSource(...)
treatment = Treatment(...)
hypothesis = Hypothesis(...)

# Build and run the experiment programmatically
experiment = Experiment(
    datasource=datasource,
    pipeline=pipeline,
    plugins=[SeedPlugin(seed=42), ParallelExecution(max_workers=4)],
)
result = experiment.run(
    treatments=[treatment],
    hypotheses=[hypothesis],
    replicates=10,
)
print(result.metrics)
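
The snippet above leaves the treatment, hypothesis, and replicate ideas abstract. The following is a conceptual sketch in plain Python, not Crystallize's implementation: run_condition, the "shift" parameter, and the simple mean comparison are all assumptions made for illustration. It shows why a fixed seed per replicate makes results repeatable and how a treatment is just a variation on the baseline parameters.

```python
import random
import statistics

BASE_SEED = 42
REPLICATES = 10


def run_condition(params: dict, replicate: int) -> float:
    """Run one replicate under the given parameters and return a metric."""
    # Seeding per replicate makes every run repeatable.
    rng = random.Random(BASE_SEED + replicate)
    samples = [rng.gauss(0.0, 1.0) + params["shift"] for _ in range(50)]
    return statistics.fmean(samples)


baseline = {"shift": 0.0}
treatment = {**baseline, "shift": 0.5}  # a treatment overrides baseline parameters

baseline_metrics = [run_condition(baseline, r) for r in range(REPLICATES)]
treatment_metrics = [run_condition(treatment, r) for r in range(REPLICATES)]

# Toy hypothesis check (not a statistical test): the treatment raises the mean metric.
supported = statistics.fmean(treatment_metrics) > statistics.fmean(baseline_metrics)
print(supported)
```

A real hypothesis would replace the final comparison with a proper statistical test over the per-replicate metrics.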

Command Line Interface

The crystallize command opens a terminal UI for browsing and executing experiments. Highlight an experiment or graph to view its details and press Enter to run it. The details panel includes a live config editor so you can adjust values directly in config.yaml.

Experiments can define a cli section in config.yaml to control grouping and style:

cli:
  group: 'Data Preprocessing'
  priority: 1
  icon: '📊'
  color: '#85C1E9'
  hidden: false
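
As a rough illustration of how such cli sections could drive a menu, the sketch below groups already-parsed sections and orders them. The experiment names, the "lower priority sorts first" rule, the treatment of hidden entries, and the "Ungrouped" fallback are all assumptions for the example, not documented Crystallize behavior.

```python
from collections import defaultdict

# Hypothetical parsed `cli` sections, keyed by experiment name.
cli_sections = {
    "clean_data": {"group": "Data Preprocessing", "priority": 1, "hidden": False},
    "split_data": {"group": "Data Preprocessing", "priority": 2, "hidden": False},
    "scratch": {"group": "Data Preprocessing", "priority": 0, "hidden": True},
    "train_model": {"group": "Modeling", "priority": 1, "hidden": False},
}


def visible_by_group(sections: dict) -> dict:
    """Group visible experiments and order each group by priority."""
    groups = defaultdict(list)
    for name, cli in sections.items():
        if cli.get("hidden", False):
            continue  # assumed: hidden experiments are left out of the list
        group = cli.get("group", "Ungrouped")
        groups[group].append((cli.get("priority", 0), name))
    # Assumed ordering: groups alphabetically, then ascending priority.
    return {
        group: [name for _, name in sorted(items)]
        for group, items in sorted(groups.items())
    }


menu = visible_by_group(cli_sections)
print(menu)
```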

You can also run experiments without the UI:

python -m experiments.<experiment_name>.main

Project Structure

crystallize/
├── datasources/
├── experiments/
├── pipelines/
├── plugins/
└── utils/

Key classes and decorators are re-exported from the top-level crystallize package for concise imports:

from crystallize import Experiment, Pipeline, ArtifactPlugin

This layout keeps implementation details organized while exposing a clean, flat public API.


Roadmap

  • Advanced features: Adaptive experimentation, intelligent meta-learning
  • Collaboration: Experiment sharing, templates, and community contributions

Contributing

Contributions are very welcome! Please see CONTRIBUTING.md for guidelines.

Use code2prompt to bundle the codebase into a single prompt for LLM-assisted documentation:

code2prompt crystallize --exclude="*.lock" --exclude="**/docs/src/content/docs/reference/*" --exclude="**package-lock.json" --exclude="**CHANGELOG.md"

License

Crystallize is licensed under the Apache 2.0 License. See LICENSE for details.
