Gaya

Simple data quality checks that just work.

Gaya helps you catch data issues early with sensible defaults and zero ceremony.

pip install gaya
gaya init
gaya run

What Gaya Checks

Out of the box, Gaya runs common, practical data quality checks with clear thresholds.

Check                     Default Behavior
Null rate per column      Warn > 10%, fail > 25%
Required columns          Zero nulls allowed
Primary key uniqueness    Zero duplicates
Row count change          Warn > 20%, fail > 40%
Schema drift              Warn on column add, fail on removal

All thresholds are configurable in gaya.yml.
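
To make the warn/fail semantics concrete, here is a minimal Python sketch of how a two-level threshold could be evaluated. It is not Gaya's internal code: the `Threshold` class and `classify` function are hypothetical names, and the sketch assumes the thresholds are strict "greater than" comparisons, as the table's `>` signs suggest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical container for a warn/fail pair, expressed as fractions."""
    warn: float
    fail: float


# Defaults copied from the table above (10% -> 0.10, and so on).
NULL_RATE = Threshold(warn=0.10, fail=0.25)
ROW_COUNT_CHANGE = Threshold(warn=0.20, fail=0.40)


def classify(value: float, threshold: Threshold) -> str:
    """Return "fail", "warn", or "pass" for a measured fraction.

    Assumption: comparisons are strict, so a value exactly at a
    threshold does not trip it.
    """
    if value > threshold.fail:
        return "fail"
    if value > threshold.warn:
        return "warn"
    return "pass"


print(classify(0.12, NULL_RATE))         # null rate of 12%
print(classify(0.45, ROW_COUNT_CHANGE))  # row count change of 45%
```

Under these assumptions a 12% null rate is a warning and a 45% row count change is a failure.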


Quick Configuration

Define your data sources and tables in a simple YAML file.

datasources:
  main_db:
    type: postgres
    host: localhost
    database: app_db
    user: app_user
    password: env:DB_PASSWORD

tables:
  orders:
    source: main_db
    layer: staging
    primary_key: order_id
    not_null:
      - order_id
      - customer_id
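
As an illustration of what the `primary_key` and `not_null` settings ask for, the following Python sketch applies the same two rules to a handful of in-memory rows. It is only a sketch of the idea: Gaya's actual implementation runs against the database and may work differently, and the function names and sample rows here are made up.

```python
from collections import Counter

# Mirrors the `orders` entry in the config above.
ORDERS_RULES = {"primary_key": "order_id", "not_null": ["order_id", "customer_id"]}


def null_violations(rows, columns):
    """Count, per required column, the rows where the value is missing or None."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


def duplicate_keys(rows, key):
    """Return the key values that appear in more than one row, sorted."""
    counts = Counter(row.get(key) for row in rows if row.get(key) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Made-up sample rows, for illustration only.
sample_rows = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": None},
    {"order_id": 2, "customer_id": 11},
]

print(null_violations(sample_rows, ORDERS_RULES["not_null"]))
print(duplicate_keys(sample_rows, ORDERS_RULES["primary_key"]))
```

With the defaults listed earlier, both findings would count as failures: required columns allow zero nulls, and primary keys allow zero duplicates.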

Example Output

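Clear, readable output that explains what failed and why it matters. In the example below a 38% row count drop is reported as a failure, which implies a fail threshold configured below the 40% default.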

  ──────────────────────────────────────────────────────
  ✖  staging.orders  FAILED
     ✖  row count dropped 38% (1.2M → 740K)
          → A drop this large usually means a failed upstream load.

  ──────────────────────────────────────────────────────
  1 table(s) · 1 failed · 7 passed
  Finished in 2.3s
  ──────────────────────────────────────────────────────
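
The percentage in the sample output can be reproduced with a short calculation. The sketch below assumes the change is measured relative to the previous row count; `relative_change` is a hypothetical helper, not part of Gaya.

```python
def relative_change(previous: int, current: int) -> float:
    """Return the change from previous to current as a signed fraction."""
    if previous <= 0:
        raise ValueError("previous row count must be positive")
    return (current - previous) / previous


# Row counts taken from the sample output: 1.2M -> 740K.
change = relative_change(1_200_000, 740_000)
print(f"{change:.0%}")
```

A negative result is a drop, and its magnitude is what gets compared against the warn and fail thresholds.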

Exit Codes

Designed to integrate cleanly with CI/CD pipelines.

Code  Meaning
0     All checks passed
1     Warnings only
2     One or more checks failed
3     Gaya error (config or connection)
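
The mapping from check results to an exit code can be sketched as follows. This is an assumed reading of the table, not Gaya's source: the `ExitCode` enum and `exit_code_for` function are hypothetical, and code 3 is left out of the function because it signals a tool error rather than a check result.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes from the table above."""
    PASSED = 0
    WARNINGS = 1
    FAILED = 2
    ERROR = 3  # config or connection problem, raised outside check evaluation


def exit_code_for(statuses) -> ExitCode:
    """Collapse per-check statuses ("pass", "warn", "fail") into one exit code.

    Assumption: any failure outranks any number of warnings.
    """
    statuses = list(statuses)
    if "fail" in statuses:
        return ExitCode.FAILED
    if "warn" in statuses:
        return ExitCode.WARNINGS
    return ExitCode.PASSED


print(int(exit_code_for(["pass", "warn", "pass"])))
print(int(exit_code_for(["pass", "warn", "fail"])))
```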

CI Integration

# GitHub Actions
- name: Run data quality checks
  run: gaya run --quiet
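
Note that GitHub Actions treats any non-zero exit code as a failed step, so with the step above a warnings-only run (exit code 1) also fails the job. If warnings should not block a build, one option is a small wrapper. The sketch below is an illustration only: `ci_status` and `main` are hypothetical names, and the script is not shipped with Gaya.

```python
import subprocess


def ci_status(gaya_code: int, allow_warnings: bool = True) -> int:
    """Translate Gaya's exit code into the status the CI step should report.

    Warnings (1) are downgraded to success when allow_warnings is set;
    failures (2) and Gaya errors (3) are passed through unchanged.
    """
    if gaya_code == 1 and allow_warnings:
        return 0
    return gaya_code


def main() -> int:
    # Runs the same command as the workflow step above.
    result = subprocess.run(["gaya", "run", "--quiet"])
    return ci_status(result.returncode)
```

Saved as a script that ends with `raise SystemExit(main())`, this could replace the `run:` command in the workflow step.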

Supported Sources

  • Postgres

Additional connectors are planned.


Project Status

Gaya is an early-stage project. The core check logic, the Postgres adapter, and the CLI are working. The API and configuration format may change, but the goal stays the same: checks that are simple, predictable, and easy to reason about.

Feedback and contributions are welcome.
