Complex Object Metric - A library for measuring complexity of Python objects

Cobjectric

Complex Object Metric - A Python library for computing metrics on complex objects (JSON, dictionaries, lists, etc.).

📖 Description

Cobjectric is a library designed to help developers calculate metrics on complex objects such as JSON, dictionaries, and arrays. It was originally created for Machine Learning projects where comparing and evaluating generated JSON structures against ground truth data was a repetitive manual task.

📦 Installation

pip install cobjectric

🚀 Core Features

Cobjectric provides three main features for analyzing complex structured data:

1. Fill Rate - Measure Data Completeness

Compute how "complete" your data is by measuring which fields are filled vs missing.

from cobjectric import BaseModel

class Person(BaseModel):
    name: str
    age: int
    email: str

person = Person.from_dict({
    "name": "John Doe",
    "age": 30,
    # email is missing
})

result = person.compute_fill_rate()
print(result.fields.name.value)   # 1.0 (present)
print(result.fields.age.value)    # 1.0 (present)
print(result.fields.email.value)  # 0.0 (missing)
print(result.mean())              # 0.667 (2 out of 3 fields filled)

Use cases: Data quality assessment, completeness scoring, field-level statistics.
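
To make the metric concrete, here is a minimal sketch of the same idea on a plain dictionary, using only the standard library. It is not Cobjectric's implementation: the helper name `fill_rate` is hypothetical, and treating a `None` value as missing is an assumption made for illustration.

```python
from statistics import mean


def fill_rate(data: dict, fields: list[str]) -> dict[str, float]:
    """Score each expected field: 1.0 if present and not None, else 0.0.

    Hypothetical helper for illustration only (not part of Cobjectric).
    Assumption: a value of None counts as missing.
    """
    return {name: 1.0 if data.get(name) is not None else 0.0 for name in fields}


rates = fill_rate({"name": "John Doe", "age": 30}, ["name", "age", "email"])
overall = round(mean(list(rates.values())), 3)

print(rates)    # {'name': 1.0, 'age': 1.0, 'email': 0.0}
print(overall)  # 0.667
```

The per-field scores stay separate from the aggregate, which mirrors how the example above reads `result.fields.<name>.value` and `result.mean()` independently.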

2. Fill Rate Accuracy - Compare Completeness States

Compare the completeness of two models (got vs expected). The comparison looks only at each field's state (filled or missing), not at its actual value.

got = Person.from_dict({"name": "John", "age": 30})           # email missing
expected = Person.from_dict({"name": "Jane", "age": 25, "email": "jane@example.com"})

accuracy = got.compute_fill_rate_accuracy(expected)
print(accuracy.fields.name.value)   # 1.0 (both filled)
print(accuracy.fields.age.value)    # 1.0 (both filled)
print(accuracy.fields.email.value)  # 0.0 (got missing, expected filled)
print(accuracy.mean())              # 0.667 (2 out of 3 states match)

Note: Fill Rate Accuracy compares state only (field present/missing), not values. To validate actual values, use Similarity.

Use cases: Validation pipelines, comparing generated vs expected data structures, quality control.
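
The state comparison can be sketched on plain dictionaries as follows. This is an illustration under stated assumptions rather than the library's code: the function name `fill_state_accuracy` is hypothetical, `None` is treated as missing, and two fields that are both missing are assumed to count as a match.

```python
def fill_state_accuracy(got: dict, expected: dict, fields: list[str]) -> dict[str, float]:
    """Score 1.0 when both sides agree on filled/missing for a field, else 0.0.

    Hypothetical helper for illustration only (not part of Cobjectric).
    Values themselves are never compared, only their presence.
    """

    def is_filled(data: dict, name: str) -> bool:
        # Assumption: None counts as missing.
        return data.get(name) is not None

    return {
        name: 1.0 if is_filled(got, name) == is_filled(expected, name) else 0.0
        for name in fields
    }


got_data = {"name": "John", "age": 30}  # email missing
expected_data = {"name": "Jane", "age": 25, "email": "jane@example.com"}

scores = fill_state_accuracy(got_data, expected_data, ["name", "age", "email"])
print(scores)  # {'name': 1.0, 'age': 1.0, 'email': 0.0}
```

Note that `name` scores 1.0 even though "John" and "Jane" differ, because only the filled state is compared.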

3. Similarity - Compare Values with Fuzzy Matching

Compare field values between two models with support for fuzzy text matching via rapidfuzz and intelligent list alignment strategies.

from cobjectric import BaseModel, Spec, ListCompareStrategy
from cobjectric.similarity import fuzzy_similarity_factory

# Tag is not defined in the original snippet; this minimal model is an
# assumed placeholder so that the Person annotation below resolves.
class Tag(BaseModel):
    label: str

class Person(BaseModel):
    name: str = Spec(similarity_func=fuzzy_similarity_factory("WRatio"))
    tags: list[Tag] = Spec(list_compare_strategy=ListCompareStrategy.OPTIMAL_ASSIGNMENT)

got = Person.from_dict({"name": "John Doe", "tags": [...]})
expected = Person.from_dict({"name": "john doe", "tags": [...]})

similarity = got.compute_similarity(expected)
print(similarity.fields.name.value)  # 0.99 (fuzzy match despite case difference)
print(similarity.fields.tags.mean()) # Uses optimal assignment for best matching

Key features:

  • Fuzzy text matching via rapidfuzz: handles typos, case differences, word order
  • List alignment strategies:
    • PAIRWISE: Compare by index (default)
    • LEVENSHTEIN: Order-preserving alignment based on similarity
    • OPTIMAL_ASSIGNMENT: Hungarian algorithm for best one-to-one matching
  • Numeric similarity: Gradual similarity based on difference thresholds

Use cases: ML model evaluation, fuzzy matching, comparing generated text with ground truth, list item matching.
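
The difference between comparing by index and finding the best one-to-one matching can be illustrated with a small standard-library sketch. It does not reproduce Cobjectric's internals: `difflib.SequenceMatcher` stands in for rapidfuzz, the optimal matching is found by brute force over permutations instead of the Hungarian algorithm (fine for short lists only), and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import permutations


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in for a rapidfuzz scorer."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pairwise_score(got: list[str], expected: list[str]) -> float:
    """Compare items by index; items without a counterpart contribute 0."""
    size = max(len(got), len(expected))
    if size == 0:
        return 1.0
    return sum(text_similarity(g, e) for g, e in zip(got, expected)) / size


def optimal_assignment_score(got: list[str], expected: list[str]) -> float:
    """Best one-to-one matching, found by brute force (not the Hungarian algorithm)."""
    size = max(len(got), len(expected))
    if size == 0:
        return 1.0
    shorter, longer = sorted((got, expected), key=len)
    best = max(
        sum(text_similarity(s, l) for s, l in zip(shorter, candidate))
        for candidate in permutations(longer, len(shorter))
    )
    return best / size


got_tags = ["red", "green", "blue"]
expected_tags = ["blue", "red", "green"]

pairwise = pairwise_score(got_tags, expected_tags)
optimal = optimal_assignment_score(got_tags, expected_tags)
print(optimal)             # 1.0 (same items, different order)
print(pairwise < optimal)  # True
```

An index-based comparison penalizes reordered lists, while an assignment-based one does not, which is why the choice of strategy matters when list order carries no meaning.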

Additional Features

  • Pre-defined Specs: Optimized Specs for common types (KeywordSpec, TextSpec, NumericSpec, BooleanSpec, DatetimeSpec)
  • Contextual Normalizers: Normalizers that receive field context for intelligent type coercion
  • Statistical Aggregation: mean(), std(), var(), min(), max(), quantile() on all results
  • Nested Models: Recursive computation on complex structures
  • List Aggregation: Access aggregated statistics across list items via items.aggregated_fields.name.mean()
  • Path Access: result["address.city"] or result["items[0].name"]
  • Custom Functions: Define your own fill rate, accuracy, or similarity functions per field
  • Field Normalizers: Transform values before validation

See the documentation for complete details.
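
As an illustration of the path syntax listed above, the following sketch resolves paths such as "address.city" and "items[0].name" against nested dictionaries and lists. It is a minimal stand-in written for this README, not Cobjectric's result object: the helper `get_path` and the sample data are hypothetical.

```python
import re

# Matches either a key segment (e.g. "address") or a list index (e.g. "[0]").
_SEGMENT = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def get_path(data, path: str):
    """Walk nested dicts/lists following a dotted path with [index] segments.

    Hypothetical helper for illustration only (not part of Cobjectric).
    Raises KeyError or IndexError if a segment does not exist.
    """
    current = data
    for key, index in _SEGMENT.findall(path):
        current = current[int(index)] if index else current[key]
    return current


record = {
    "address": {"city": "Paris"},
    "items": [{"name": "pen"}, {"name": "notebook"}],
}

print(get_path(record, "address.city"))   # Paris
print(get_path(record, "items[0].name"))  # pen
```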

📚 Full Documentation

📖 https://cobjectric.nigiva.com

🛠️ Development

Getting Started

Prerequisites

  • Python 3.13.9 or higher
  • uv - Fast Python package installer

  1. Install dependencies with uv (including optional extras for testing):

uv sync --dev --all-extras

  2. Install pre-commit hooks:

uv run pre-commit install --hook-type pre-push

Available Commands

The project uses invoke for task management.

To see all available commands:

uv run inv --list
# or shorter:
uv run inv -l

To get help on a specific command:

uv run inv --help <command>
# Example:
uv run inv --help precommit

Release Guide

See the RELEASE.md file for the release guide.

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

Citing Cobjectric

If you use Cobjectric in your research or projects, please consider citing it:

@software{cobjectric2025,
  author = {Nigiva},
  title = {Cobjectric: A Library for Computing Metrics on Complex Objects},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nigiva/cobjectric}},
  version = {3.0.0}
}
