Prevent statistically invalid analyses from being shipped

stat-guard

stat-guard is a production-grade statistical assumption validation library for experiments such as A/B tests and controlled studies.

It acts as a guardrail, validating data integrity and statistical assumptions before any analysis is performed.

If stat-guard reports an error, the experiment must not be analyzed. Warnings are diagnostic and do not block analysis.


🚦 Why stat-guard exists

Most statistical failures do not come from incorrect formulas.
They come from broken data and violated assumptions:

  • Duplicate users counted multiple times
  • Users appearing in both control and treatment
  • Samples too small to be meaningful
  • Imbalanced or biased groups
  • Metrics with zero variance
  • Silent assumption violations in production pipelines

These issues often surface after results are shipped.

stat-guard is designed to catch them before any analysis runs.
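
To make the first two failure modes concrete, here is a minimal sketch of how duplicate and leaked units could be counted with plain pandas. It is an illustration only, not stat-guard's implementation; the helper name `unit_integrity_issues` and the `user_id` column are assumptions made for this example.

```python
import pandas as pd


def unit_integrity_issues(df: pd.DataFrame, unit_col: str, group_col: str) -> dict:
    """Count basic unit-integrity problems without modifying the data.

    Hypothetical helper for illustration; not part of stat-guard's API.
    """
    ids = df[unit_col]
    # Rows whose unit ID is missing.
    missing = int(ids.isna().sum())
    known = df[ids.notna()]
    # Rows that repeat an earlier (unit, group) pair.
    duplicates = int(known.duplicated(subset=[unit_col, group_col]).sum())
    # Units that appear in more than one group (leakage).
    groups_per_unit = known.groupby(unit_col)[group_col].nunique()
    leaked = sorted(groups_per_unit[groups_per_unit > 1].index.tolist())
    return {
        "missing_ids": missing,
        "duplicate_rows": duplicates,
        "leaked_units": leaked,
    }


data = pd.DataFrame({
    "user_id": ["u1", "u2", "u2", "u3", "u3", None],
    "group": ["control", "control", "control", "control", "treatment", "treatment"],
})
issues = unit_integrity_issues(data, unit_col="user_id", group_col="group")
print(issues)
```

In this toy frame, `u2` is counted twice in control, `u3` appears in both groups, and one row has no ID. The sketch only reports these problems and leaves the data untouched.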


🧠 What stat-guard does

  • Validates unit integrity (missing IDs, duplicates, leakage)
  • Checks minimum sample size
  • Detects group imbalance
  • Measures covariate balance (SMD)
  • Flags zero-variance metrics
  • Diagnoses distribution issues (skewness, normality)
  • Separates errors (blocking) from warnings (diagnostic)
  • Produces deterministic, machine-readable reports
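
As a rough illustration of the covariate balance check, the standardized mean difference (SMD) between two groups can be computed as below. This sketch uses one common definition (difference in means divided by the square root of the average of the two sample variances). Which variant stat-guard uses, and what threshold it applies, is not stated here, so treat the function name and formula choice as assumptions.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(control, treatment):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2), with sample variances.

    Hypothetical helper for illustration; not part of stat-guard's API.
    """
    pooled_sd = sqrt((variance(control) + variance(treatment)) / 2)
    if pooled_sd == 0:
        # Both groups are constant, so the SMD is undefined.
        raise ValueError("zero variance in both groups; SMD is undefined")
    return (mean(treatment) - mean(control)) / pooled_sd


control_age = [30, 32, 34]
treatment_age = [31, 33, 35]
smd = standardized_mean_difference(control_age, treatment_age)
print(f"SMD = {smd:.2f}")
```

An absolute SMD above 0.1 is a commonly cited rule of thumb for meaningful imbalance, though the appropriate cutoff depends on the experiment.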

Designed for:

  • CI/CD pipelines
  • Experiment gating systems
  • Production data workflows
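
One way to wire such a check into a pipeline is to map the report to a process exit code, so a failed validation stops the job. The sketch below assumes only that the report exposes an `is_valid` attribute, as in the quick example; `FakeReport` and `gate_exit_code` are hypothetical names used for illustration.

```python
import sys
from dataclasses import dataclass


@dataclass
class FakeReport:
    """Stand-in for the object returned by validate(); only is_valid is assumed."""
    is_valid: bool


def gate_exit_code(report) -> int:
    """Map a validation report to an exit code: 0 = proceed, 1 = block."""
    if report.is_valid:
        return 0
    print(f"Experiment blocked by validation: {report}", file=sys.stderr)
    return 1


# In a real pipeline the report would come from stat_guard.validate(...),
# and the entry point would finish with sys.exit(exit_code).
exit_code = gate_exit_code(FakeReport(is_valid=False))
```

Most CI systems treat a non-zero exit code as a failed step, which is what keeps an invalid experiment from reaching the analysis stage.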

🚫 What stat-guard does NOT do

stat-guard is not a statistics engine.

It deliberately does not:

  • ❌ Run hypothesis tests
  • ❌ Modify or auto-fix data
  • ❌ Apply transformations
  • ❌ Guess intent or apply heuristics

This keeps behavior explicit, transparent, and reproducible.


🧱 Core Philosophy

  • Explicit over implicit
  • No automatic corrections
  • Errors invalidate experiments; warnings do not
  • Deterministic, reproducible behavior
  • Production-first, not notebook-first
  • Simple, readable, maintainable code
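
The split between blocking errors and diagnostic warnings can be pictured with a small data structure like the one below. This is a sketch of the idea, not stat-guard's actual report class: `is_valid` appears in the quick example, while the `errors` and `warnings` fields and the `to_json` method are assumptions.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SketchReport:
    """Hypothetical report: errors block the experiment, warnings do not."""
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        # Only errors invalidate the experiment.
        return not self.errors

    def to_json(self) -> str:
        # Sorted messages and sorted keys give identical output for identical input.
        payload = {
            "is_valid": self.is_valid,
            "errors": sorted(self.errors),
            "warnings": sorted(self.warnings),
        }
        return json.dumps(payload, sort_keys=True)


warned = SketchReport(warnings=["metric is highly skewed"])
blocked = SketchReport(errors=["user appears in both groups"])
print(warned.is_valid, blocked.is_valid)
print(warned.to_json())
```

Sorting both the keys and the messages is one simple way to get the deterministic, machine-readable output described above, since the serialized report then does not depend on the order in which checks happened to run.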

📦 Installation

From GitHub (current)

stat-guard is currently distributed via GitHub:

pip install git+https://github.com/aaryansolankii/stat-guard.git

## 🚀 Quick example

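```python
import pandas as pd
from stat_guard import validate

# Toy experiment: one metric value per row, labelled by group.
data = pd.DataFrame({
    "metric": [10, 12, 11, 13, 15, 14],
    "group": ["control", "control", "control", "treatment", "treatment", "treatment"]
})

# Run the validation checks before any analysis.
report = validate(
    data,
    target_col="metric",
    group_col="group"
)

# Stop here if validation failed; the experiment must not be analyzed.
if not report.is_valid:
    raise RuntimeError(report)
```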
