Sibyl
Renamed from `DeepTS-Flow-Wheat` (v0.x). PyPI: `sibyl-ml`. Python module: `sibyl`. CLI: `sibyl-train` / `sibyl-bench` / `sibyl-genex`.
Configuration-driven deep learning experiment platform — 12 time-series forecasting models from 2017–2025 plus a multi-omics gene-expression pipeline, all one Hydra config away.
Sibyl covers two task domains on a single Hydra+PyTorch core: time-series forecasting with a unified, fair-comparison protocol across 12 baselines, and gene-expression prediction from DNA sequence + epigenetic marks via the CrossMark architecture. One Trainer, one config system, one CLI. Drop in a new model file, add a YAML, send a PR — LazyModelDict auto-discovers it. AutoML mode profiles your data, recommends the top-3 models, trains them, and returns an ensemble report. Resume, mixed precision, GPU peak tracking, deterministic seeding, and Optuna sweeps come built in.
Quickstart (30 seconds, CPU-only)
```bash
git clone https://github.com/leehom0123/sibyl
cd sibyl
pip install -e ".[dev]"
python main.py experiment=demo
```
The shipped 12 KB toy dataset (data/demo_sine.npz, regenerable via python scripts/forecast/make_demo_data.py) trains a DLinear for 5 epochs on CPU. Expect outputs/forecast/<timestamp>_demo/ containing training curves, predictions, error analysis, SHAP attributions, and a leaderboard row — all in well under a minute.
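Before training anything, you can peek at the toy archive; it is a plain NumPy `.npz`, so its array names can be enumerated rather than assumed:

```python
import numpy as np

# List every array in the demo archive without assuming its key names
npz = np.load("data/demo_sine.npz")
for name in npz.files:
    print(name, npz[name].shape, npz[name].dtype)
```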
Models
Time-series forecasting (12)
| Year | Model | Paper | Type |
|---|---|---|---|
| 2017 | Transformer | Vaswani et al. (NeurIPS) | Encoder-decoder |
| 2021 | Autoformer | Wu et al. (NeurIPS) | Decomposition + auto-correlation |
| 2022 | FEDformer | Zhou et al. (ICML) | Frequency-domain attention |
| 2023 | DLinear | Zeng et al. (AAAI) | Pure linear |
| 2023 | PatchTST | Nie et al. (ICLR) | Patch + channel-independent |
| 2023 | TimesNet | Wu et al. (ICLR) | 2D periodicity |
| 2024 | iTransformer | Liu et al. (ICLR) | Inverted attention |
| 2024 | TimeMixer | Wang et al. (ICLR) | Multi-scale MLP |
| 2024 | TimeXer | Wang et al. (ICML) | Patch + exogenous |
| 2024 | SOFTS | Han et al. (NeurIPS) | STar aggregation |
| 2025 | TimeFilter | Hu et al. (ICLR) | Filter-based |
| 2025 | DUET | Qiu et al. (KDD) | Dual clustering + Mahalanobis mask |
All 12 models are ported from each paper's reference implementation (10 from Time-Series-Library; SOFTS and DUET from their official author repos).
Gene expression / multi-omics (1)
| Year | Model | Description | Task |
|---|---|---|---|
| 2025 | CrossMark | Multimodal cross-attention over DNA sequence + epigenetic marks | Multi-omics gene expression |
Run a benchmark
```bash
# Single time-series experiment
python main.py experiment=etth1 model=patchtst

# Full forecast benchmark (10 datasets x 12 models)
python scripts/forecast/run_benchmark.py --epochs 50

# Multi-seed (mean ± std reporting)
python scripts/forecast/run_benchmark.py --seeds 42 2024 2025 --datasets etth1 --models dlinear

# Hyperparameter search (Optuna, resumable from SQLite)
python main.py -m hparams_search=dam_optuna

# AutoML: profile data → recommend top-3 → train → ensemble report
python main.py mode=auto
```
Resume an interrupted run by re-running the same command — COMPLETED markers and saved random states keep the resume bit-deterministic.
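For reference, the mean ± std figures in the multi-seed leaderboard are the usual sample statistics over per-seed metrics; a standalone illustration (the MSE values below are invented placeholders, not Sibyl results, and whether the internal callback uses ddof 0 or 1 is an implementation detail):

```python
import numpy as np

# Hypothetical per-seed test MSEs from --seeds 42 2024 2025
per_seed_mse = {42: 0.372, 2024: 0.368, 2025: 0.375}
vals = np.array(list(per_seed_mse.values()))
print(f"MSE: {vals.mean():.3f} ± {vals.std(ddof=1):.3f}")  # mean ± sample std
```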
Gene expression (multi-omics)
```bash
# Smoke run on the spike dataset (1 epoch, single mark)
python main.py launcher=gene_expr experiment=spike model=crossmark \
    model.active_marks='[ATAC]'

# 32-combination interpretability sweep (5-fold CV)
python scripts/gene_expr/run_sweep.py --n_folds 5 --epochs 200 --resume

# Post-sweep Shapley + interaction analysis
python scripts/gene_expr/run_analysis.py
```
The gene-expression pipeline reuses the same Trainer, callbacks, and config groups as the forecasting pipeline; only the Task, Model, and DataProvider differ. See docs/gene_expr.md for the data layout and how to extend to a new organ or species without code changes.
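The 32-combination sweep is consistent with toggling five marks on and off (2^5 = 32). A quick enumeration sketch; only `ATAC` appears in the commands above, and the other four mark names are common assays used purely as illustrative placeholders:

```python
from itertools import combinations

# Five marks -> 2**5 = 32 subsets, matching the 32-combination sweep.
# Only ATAC is confirmed above; the rest are illustrative examples.
marks = ["ATAC", "H3K4me3", "H3K27ac", "H3K27me3", "H3K9me3"]
subsets = [list(c) for r in range(len(marks) + 1) for c in combinations(marks, r)]
print(len(subsets))  # 32, including the empty (sequence-only) combination
```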
Adding a new model (5 steps)
- Create `sibyl/models/<name>.py` with `class Model(BaseModel)` (a minimal skeleton follows this list).
- Create `configs/model/<name>.yaml`.
- Run `python main.py experiment=etth1 model=<name>`.
- Add `tests/unit/test_<name>.py`.
- Send the PR; `LazyModelDict` auto-discovers it, and `ModelRecommender` picks up your `recommender:` rules.
See CONTRIBUTING.md for the full path, including code style and PR conventions.
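A minimal skeleton for the first step, with heavy caveats: the `BaseModel` import path, the config field, and the `forward` signature are assumptions made for illustration, not Sibyl's documented interface.

```python
# sibyl/models/lastvalue.py (hypothetical file; interface details assumed)
import torch

from sibyl.models.base import BaseModel  # assumed import path


class Model(BaseModel):
    """Naive baseline: repeat the last observed value across the horizon."""

    def __init__(self, cfg):
        super().__init__(cfg)
        self.pred_len = cfg.pred_len  # assumed config field

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len, channels] -> [batch, pred_len, channels]
        return x[:, -1:, :].repeat(1, self.pred_len, 1)
```

A matching `configs/model/lastvalue.yaml` would expose `pred_len` plus any `recommender:` rules, and the auto-discovery described above does the rest.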
Architecture (one-liner)
main.py → Hydra composes cfg.launcher → Launcher builds Task + Model + DataProvider → shared Trainer.fit() / .test() → callbacks (visualisation, leaderboard, SHAP, attention export, …) write everything under outputs/<pipeline>/. Every component is auto-discovered from its directory; adding one is a YAML and a Python file — zero registration.
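The same composition can be reproduced from Python with Hydra's compose API, which is handy for debugging config groups in a notebook. A sketch; the `config_path` and `config_name` here are guesses at the repo layout, not documented values.

```python
from hydra import compose, initialize

# Build the same cfg tree that main.py receives (path/name are assumptions)
with initialize(version_base=None, config_path="configs"):
    cfg = compose(config_name="main", overrides=["experiment=demo"])
print(list(cfg.keys()))  # top-level groups, e.g. launcher / model / training
```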
Reproducibility
- Unified protocol across all 12 forecasting baselines: 50 epochs, patience 10, cosine annealing, Adam lr=1e-4, batch_size=32, seed=2021. One protocol, one set of architecture hyperparameters per model — performance differences come from data, not from per-dataset tuning.
- Multi-seed benchmarking via `--seeds 42 2024 2025` (auto-suffixed run dirs and leaderboard rows).
- Deterministic resume: Python / NumPy / PyTorch / CUDA random states are all serialised into `checkpoint_last.pt` alongside the scheduler state (sketched below).
- AMP off by default (cuFFT fp16 doesn't support the non-power-of-2 seq_len=96); flip `training.use_amp=true` once your model has no FFT-on-fp16 path.
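The deterministic-resume bullet comes down to snapshotting every RNG stream into the checkpoint and restoring them before the first post-resume batch. A self-contained sketch; the actual key names inside `checkpoint_last.pt` are Sibyl internals and may differ.

```python
import random

import numpy as np
import torch

# Snapshot all four RNG streams next to the model/scheduler state
state = {
    "python_rng": random.getstate(),
    "numpy_rng": np.random.get_state(),
    "torch_rng": torch.get_rng_state(),
    "cuda_rng": torch.cuda.get_rng_state_all() if torch.cuda.is_available() else None,
}
torch.save(state, "checkpoint_last.pt")

# On resume, restore before the first batch (weights_only=False because the
# RNG states are pickled Python objects, not tensors)
state = torch.load("checkpoint_last.pt", weights_only=False)
random.setstate(state["python_rng"])
np.random.set_state(state["numpy_rng"])
torch.set_rng_state(state["torch_rng"])
if state["cuda_rng"] is not None:
    torch.cuda.set_rng_state_all(state["cuda_rng"])
```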
Documentation
- Quickstart tutorial (forecast)
- Gene-expression tutorial
- AutoML guide
- Adding components
- Server deployment
Contributing
See CONTRIBUTING.md. New model contributions especially welcome — the bar is one model file, one config, and one test.
License
Apache-2.0. See LICENSE.
Citing
If you use Sibyl in academic work, please cite via CITATION.cff.
Acknowledgements
- Time-Series-Library (THUML) for the reference implementations of the 10 earlier forecasting models.
- The SOFTS and DUET authors for releasing the official codebases from which those two models were ported.
- The Hydra and PyTorch communities for the foundations the platform sits on.