A package for spherical clustering and probabilistic modeling

spheroids

High-performance spherical clustering with PyTorch and C++


The spheroids package provides Poisson kernel-based distributions (PKBD) and spherical Cauchy distributions. Unlike many other spherical distributions, these avoid complicated normalizing constants involving hypergeometric functions and hence do not require iterative evaluations. Instead, they rely primarily on matrix multiplication, which makes them well suited to GPU-accelerated computing.
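To make the closed-form claim concrete, here is a minimal NumPy sketch of the two log-densities as they are usually written in the literature, for unit vectors on the sphere S^(d-1) with location mu (a unit vector) and concentration rho in [0, 1). It is an illustration only, not the package's C++ or PyTorch code, and the function names are hypothetical. The only data-dependent operation is the product X @ mu.

```python
import math
import numpy as np

def log_surface_area(d):
    """Log of the surface area of the unit sphere S^(d-1) in R^d."""
    return math.log(2.0) + 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d)

def pkbd_logpdf(X, mu, rho):
    """PKBD log-density for rows of X (N x d, unit norm). Illustrative sketch."""
    d = X.shape[1]
    cross = X @ mu  # the only data-dependent step: a matrix-vector product
    log_denominator = np.log1p(rho**2 - 2.0 * rho * cross)
    return math.log1p(-rho**2) - log_surface_area(d) - 0.5 * d * log_denominator

def spcauchy_logpdf(X, mu, rho):
    """Spherical Cauchy log-density for rows of X (N x d, unit norm). Illustrative sketch."""
    d = X.shape[1]
    cross = X @ mu
    log_ratio = math.log1p(-rho**2) - np.log1p(rho**2 - 2.0 * rho * cross)
    return (d - 1) * log_ratio - log_surface_area(d)
```

On the circle (d = 2) the two formulas coincide, which is a quick sanity check for either implementation.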

Beyond traditional applications, spheroids is particularly useful for clustering of modern embeddings (e.g., semantic embeddings generated by large language models, image embeddings, or any high-dimensional feature representations). By leveraging high-performance matrix operations on GPUs, it can efficiently group large-scale embedding datasets while benefiting from the flexibility of the deep learning approach when covariates or additional contextual information are included. This way the user can control for the effects of covariates rather than rediscover them using clustering.

The package provides two EM-based estimation methods:

  • A direct approach (C++ backend) when no covariates are available
  • A deep learning approach (PyTorch backend) for model-based clustering in an embedding space with covariates

Furthermore, spheroids includes options to regularize the number of clusters using an L1 norm (via a Hadamard product approach inspired by Ziyin and Wang) and can dynamically drop clusters whose total weight falls below a user-specified threshold (min_weight).
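The thresholding idea can be sketched in a few lines of plain Python. This is an assumption about how such a rule might look, not the package's implementation: the helper name `drop_small_clusters`, the renormalization of the surviving weights, and the fallback that always keeps the heaviest cluster are all hypothetical.

```python
def drop_small_clusters(weights, min_weight):
    """Return the indices of kept clusters and their renormalized weights."""
    kept = [k for k, w in enumerate(weights) if w >= min_weight]
    if not kept:
        # Assumed fallback: never drop every cluster, keep the heaviest one.
        kept = [max(range(len(weights)), key=lambda k: weights[k])]
    total = sum(weights[k] for k in kept)
    return kept, [weights[k] / total for k in kept]

# Example: the third cluster falls below the threshold and is removed.
kept, new_weights = drop_small_clusters([0.60, 0.37, 0.03], min_weight=0.05)
```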

Key Features

🚀 High Performance

  • Core computations implemented in C++ with Armadillo
  • GPU acceleration via PyTorch
  • Efficient batch processing

🎯 Multiple Distributions

  • Poisson kernel-based distribution (PKBD)
  • Spherical Cauchy distribution
  • Extensible architecture for new distributions

📊 Clustering Capabilities

  • Automatic cluster number selection
  • Robust parameter estimation
  • Support for high-dimensional data

Installation

Quick Install (Recommended)

You can install spheroids directly from PyPI with precompiled wheels:

pip install spheroids
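To confirm which version pip installed, one option is to query the package metadata with the standard library. The helper name `installed_version` is hypothetical; `importlib.metadata` itself ships with Python 3.8 and later.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Prints the version string if spheroids is installed, otherwise None.
print(installed_version("spheroids"))
```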

Advanced Installation (Local Compilation)

For users who want to build the package locally (e.g., to modify the codebase), follow these steps:

Prerequisites

  • Python ≥3.8
  • C++ compiler with C++17 support
  • Armadillo installed

Steps

On Linux

# Install required libraries
sudo apt-get update
sudo apt-get install -y libarmadillo-dev libomp-dev

# Clone the repository
git clone https://github.com/lsablica/spheroids.git
cd spheroids

# Build and install
pip install -e .

On macOS

# Install required libraries
brew update
brew install armadillo libomp

# Configure compiler paths (if necessary)
export CXXFLAGS="-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include -I/opt/homebrew/opt/armadillo/include"
export LDFLAGS="-L/opt/homebrew/opt/libomp/lib -lomp -L/opt/homebrew/opt/armadillo/lib"

# Clone the repository
git clone https://github.com/lsablica/spheroids.git
cd spheroids

# Build and install
pip install -e .

On Windows

# Clone vcpkg for managing C++ libraries
git clone https://github.com/microsoft/vcpkg.git C:\vcpkg
cd C:\vcpkg
.\bootstrap-vcpkg.bat -disableMetrics
.\vcpkg.exe install armadillo

# Clone the repository
git clone https://github.com/lsablica/spheroids.git
cd spheroids

# Build and install
pip install -e .

Quick Start

import torch
from spheroids import SphericalClustering

# Prepare your data (normalize to unit sphere)
X = torch.randn(1000, 3)
X = X / torch.norm(X, dim=1, keepdim=True)
Y = torch.randn(1000, 2)
Y = Y / torch.norm(Y, dim=1, keepdim=True)

# Create model
model = SphericalClustering(
    num_covariates=3,
    response_dim=2,
    num_clusters=3,
    distribution="pkbd"
)

# Fit model
ll = model.fit(X, Y, num_epochs=100)
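Embeddings exported from other tools often arrive as NumPy arrays and may contain all-zero rows, which the plain division used above would turn into NaNs. The following helper is a small sketch of a guarded projection onto the unit sphere. It is not part of spheroids, and the name `to_unit_sphere` and the `eps` tolerance are assumptions made for illustration.

```python
import numpy as np

def to_unit_sphere(A, eps=1e-12):
    """Scale each row of A to unit Euclidean norm, rejecting (near-)zero rows."""
    A = np.asarray(A, dtype=float)
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    if np.any(norms < eps):
        raise ValueError("rows with (near-)zero norm cannot be projected onto the sphere")
    return A / norms

# Example: two 2-dimensional vectors scaled to length one.
U = to_unit_sphere([[3.0, 4.0], [0.0, 2.0]])
```

The result can be converted with torch.from_numpy(U).float() before it is passed to a PyTorch model.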

Using C++ Implementations

Access optimized C++ implementations directly:

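import numpy as np
from spheroids import PKBD

# Location (unit vector) and concentration parameters
mu = np.array([1.0, 0.0])
rho = 0.5

# Generate random samples
samples = PKBD.random_sample(n=100, rho=rho, mu=mu)

# Calculate the log-likelihood of the samples
loglik = PKBD.log_likelihood(samples, mu, rho)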

API Reference

SphericalClustering

SphericalClustering(
    num_covariates: int,     # Number of input features
    response_dim: int,       # Dimension of response variables
    num_clusters: int,       # Initial number of clusters
    distribution: str,       # "pkbd" or "spcauchy"
    min_weight: float = 0.05 # Minimum cluster weight
)

Key Methods

# Fit the model
model.fit(
    X: torch.Tensor,        # Input features (N x num_covariates)
    Y: torch.Tensor,        # Response variables (N x response_dim)
    num_epochs: int = 100,  # Number of training epochs
    lr: float = 1e-3       # Learning rate
)

# Get cluster predictions
pred = model.predict(X)

Examples

Basic Clustering Example

import numpy as np
import torch
from spheroids import SphericalClustering

# Load data
Y = np.load('spheroids/spheroids/datasets/pkbd_Y.npy')

# Create model
model = SphericalClustering(num_covariates=1,
                            response_dim=4,
                            num_clusters=3,
                            device="cpu",
                            min_weight=0.02,
                            distribution="pkbd")

# Fit without covariates
mu, rho = model.fit_no_covariates(Y, num_epochs=200, tol=1e-8)

Usage of C++ API

import numpy as np
from spheroids import PKBD, spcauchy

# Location (unit vector) and concentration parameters
mu = np.array([1.0, 0.0])
rho = 0.5

# PKBD distribution
pkbd_samples = PKBD.random_sample(1000, rho, mu)
pkbd_loglik = PKBD.log_likelihood(pkbd_samples, mu, rho)

# Spherical Cauchy distribution
scauchy_samples = spcauchy.random_sample(1000, rho, mu)
scauchy_loglik = spcauchy.log_likelihood(scauchy_samples, mu, rho)

Contributing

We welcome contributions! Here's how you can help:

  1. 🐛 Report bugs
  2. 💡 Suggest features

License

This project is licensed under the GPL-3.0 License - see the LICENSE file for details.

Citation

If you use spheroids in your research, please cite:

@software{spheroids,
  title = {spheroids: A Python Package for Spherical Clustering Models},
  author = {Lukas Sablica},
  year = {2025},
  url = {https://github.com/lsablica/spheroids}
}
