Lightweight dataset quality checks for CSV files

Project description

DataGuard

DataGuard is a lightweight Python package for basic quality checks on CSV datasets. It can be used as a Python API or from the command line.

Features

  • Missing value detection
  • Duplicate row detection
  • Optional target column analysis
  • JSON report export
  • Command line interface
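To make the first three checks concrete, here is a minimal standard-library sketch of what missing-value detection, duplicate-row detection, and target column analysis can look like. It is an illustration only, not DataGuard's implementation: the function name `check_csv`, the report keys, and the rule that an empty or whitespace-only cell counts as missing are all assumptions made for this example.

```python
import csv
import io
import json
from collections import Counter


def check_csv(text, target=None):
    """Run three basic checks on CSV text and return a plain dict.

    Hypothetical helper for illustration; not part of DataGuard.
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])

    # Missing values: treat empty or whitespace-only cells as missing
    # (an assumption; a real tool may also recognize markers such as "NA").
    missing = {
        col: sum(1 for row in rows if not (row[col] or "").strip())
        for col in columns
    }

    # Duplicate rows: every occurrence after the first counts as a duplicate.
    seen = Counter(tuple(row[col] for col in columns) for row in rows)
    duplicates = sum(count - 1 for count in seen.values())

    report = {"rows": len(rows), "missing": missing, "duplicates": duplicates}

    # Optional target analysis: count how often each target value occurs.
    if target is not None:
        report["target_counts"] = dict(Counter(row[target] for row in rows))
    return report


sample = "id,label\n1,a\n2,\n1,a\n"
print(json.dumps(check_csv(sample, target="label"), indent=2))
```

The sample has three data rows, one empty `label` cell, and one repeated row, so each check has something to report.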

Installation

From TestPyPI (the distribution is published as dataguard-lite; the import name and CLI command are dataguard):

pip install -i https://test.pypi.org/simple/ dataguard-lite

Local development:

pip install -e .

Usage (Python API)

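from dataguard import validate_csv

# Validate the CSV; `target` names the optional target column to analyze
report = validate_csv("data.csv", target="label")
# Summarize the findings
report.summary()
# Export the report as JSON
report.to_json("report.json")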
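Once `report.json` has been written, it can be read back with the standard library. The layout of the exported JSON is not documented here, so the sketch below assumes nothing about its keys and only lists whatever top-level entries it finds. The helper names `load_report` and `describe` are hypothetical and not part of DataGuard.

```python
import json
from pathlib import Path


def load_report(path):
    """Load a JSON report file and return the parsed object."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def describe(report):
    """Return one line per top-level entry without assuming any schema."""
    if isinstance(report, dict):
        return [f"{key}: {type(value).__name__}" for key, value in report.items()]
    return [f"(top level is a {type(report).__name__})"]


# Only runs if a report has already been exported to the working directory.
if Path("report.json").exists():
    for line in describe(load_report("report.json")):
        print(line)
```

Inspecting the structure first is a cautious way to find out which fields are available before writing code that depends on them.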

Usage (CLI)

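# Run the default checks
dataguard data.csv
# Also analyze the target column named "label"
dataguard data.csv --target label
# Export the report as JSON
dataguard data.csv --json report.json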
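The same commands can be driven from a Python script, for example inside a data pipeline. The sketch below builds the argument list from the flags shown above and runs it with `subprocess`. The helpers `build_command` and `run_dataguard` are hypothetical. Whether `--target` and `--json` can be combined in one call, and what exit codes the CLI returns, are not stated in this document, so treat both as assumptions.

```python
import shutil
import subprocess


def build_command(csv_path, target=None, json_path=None):
    """Build the argument list for the dataguard CLI (hypothetical helper)."""
    args = ["dataguard", csv_path]
    if target is not None:
        args += ["--target", target]
    if json_path is not None:
        # Assumption: --json may be combined with --target in one call.
        args += ["--json", json_path]
    return args


def run_dataguard(csv_path, target=None, json_path=None):
    """Run the CLI and return the CompletedProcess for inspection."""
    if shutil.which("dataguard") is None:
        raise FileNotFoundError("dataguard executable not found on PATH")
    return subprocess.run(
        build_command(csv_path, target, json_path),
        capture_output=True,
        text=True,
    )


# Show the command that would be run, without executing it.
print(build_command("data.csv", target="label"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.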

Download files

Source Distribution

dataguard_lite-0.4.0.tar.gz (4.8 kB)

Uploaded Source

Built Distribution

dataguard_lite-0.4.0-py3-none-any.whl (6.3 kB)

Uploaded Python 3

File details

Details for the file dataguard_lite-0.4.0.tar.gz.

File metadata

  • Download URL: dataguard_lite-0.4.0.tar.gz
  • Upload date:
  • Size: 4.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for dataguard_lite-0.4.0.tar.gz
Algorithm    Hash digest
SHA256       b590a385096ea728febb5db89a9ac0651e38528b01e461118b6ece9135cffd45
MD5          bd0cd108c2ad2d7f1c0172a267b5cd80
BLAKE2b-256  70bcaffded414ef9dad7c483ececb8ff3a3a98e4d11e3f616a0959445bb06552

File details

Details for the file dataguard_lite-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: dataguard_lite-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 6.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for dataguard_lite-0.4.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       ad19798de0e683dc1b17c99d3940da68a1c4170ef9299a1ef44058125d79fa77
MD5          da9b101b97928cf903bf841a9f4d0baa
BLAKE2b-256  53168c9bfc7184f5c36cdd5dfada142f694023052b3b665745e0704e00f9dc12
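A downloaded file can be checked against the SHA256 digests listed above using `hashlib`. This is a minimal sketch: the helper names `sha256_of` and `matches_published` are hypothetical, and the file is assumed to have been saved locally under its published name.

```python
import hashlib
import hmac

# SHA256 digests copied from the file details above.
PUBLISHED_SHA256 = {
    "dataguard_lite-0.4.0.tar.gz":
        "b590a385096ea728febb5db89a9ac0651e38528b01e461118b6ece9135cffd45",
    "dataguard_lite-0.4.0-py3-none-any.whl":
        "ad19798de0e683dc1b17c99d3940da68a1c4170ef9299a1ef44058125d79fa77",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, filename):
    """Compare a local file's digest with the published one for `filename`."""
    return hmac.compare_digest(sha256_of(path), PUBLISHED_SHA256[filename])


# Example (assumes the sdist was downloaded into the working directory):
# print(matches_published("dataguard_lite-0.4.0.tar.gz",
#                         "dataguard_lite-0.4.0.tar.gz"))
```

SHA256 is used here rather than MD5 because MD5 is no longer considered collision resistant.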
