A package for spherical clustering and probabilistic modeling
spheroids
High-performance spherical clustering with PyTorch and C++
The spheroids package provides Poisson kernel-based (PKBD) and spherical Cauchy distributions. Unlike many other spherical distributions, these avoid complicated normalizing constants involving hypergeometric functions and therefore need no iterative evaluation. Instead, they rely primarily on matrix multiplication, which makes them well suited to GPU-accelerated computing.
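To make the closed-form claim concrete, here is a minimal sketch of the two log-densities on the unit sphere in R^d, written in plain Python with only the standard library. It uses the standard textbook parameterization (mean direction mu, concentration 0 <= rho < 1); the function names are illustrative and are not part of the spheroids API. The only data-dependent quantity is the inner product between x and mu, so evaluating many points against many clusters reduces to a single matrix product.

```python
import math

def log_surface_area(d):
    """Log of the surface area of the unit sphere S^(d-1) embedded in R^d."""
    return math.log(2.0) + 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d)

def _inner(x, mu):
    """Inner product x'mu; for batches this is one row of a matrix product."""
    return sum(a * b for a, b in zip(x, mu))

def pkbd_logpdf(x, mu, rho):
    """PKBD log-density at unit vector x (hypothetical helper, not the package API)."""
    d = len(x)
    denom = 1.0 + rho * rho - 2.0 * rho * _inner(x, mu)
    return math.log1p(-rho * rho) - 0.5 * d * math.log(denom) - log_surface_area(d)

def spcauchy_logpdf(x, mu, rho):
    """Spherical Cauchy log-density at unit vector x (hypothetical helper)."""
    d = len(x)
    denom = 1.0 + rho * rho - 2.0 * rho * _inner(x, mu)
    return (d - 1) * (math.log1p(-rho * rho) - math.log(denom)) - log_surface_area(d)
```

Both normalizing constants involve nothing beyond a log-gamma term that depends on the dimension alone. On the circle (d = 2) the two densities coincide, which is a convenient sanity check.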
Beyond traditional applications, spheroids is particularly useful for clustering modern embeddings (e.g., semantic embeddings generated by large language models, image embeddings, or any other high-dimensional feature representation). By leveraging high-performance matrix operations on GPUs, it can efficiently group large-scale embedding datasets, and the deep learning approach adds flexibility when covariates or additional contextual information are included. This lets the user control for the effects of covariates rather than rediscover them through clustering.
The package provides two EM-based estimation methods:
- A direct approach (C++ backend) when no covariates are available
- A deep learning approach (PyTorch backend) for model-based clustering in an embedding space with covariates
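Both methods share the same EM structure. As a rough illustration, the sketch below shows a generic mixture-model E-step and mixing-weight update in plain Python. It is the textbook EM recipe, not the package's C++ or PyTorch code; the names `e_step` and `update_weights` are hypothetical, and all mixing weights are assumed to be strictly positive.

```python
import math

def e_step(log_dens, weights):
    """Posterior cluster probabilities (responsibilities) and log-likelihood.

    log_dens[i][k] is the log-density of observation i under cluster k,
    weights[k] is the (strictly positive) mixing weight of cluster k.
    """
    resp, loglik = [], 0.0
    for row in log_dens:
        scores = [math.log(w) + ld for w, ld in zip(weights, row)]
        m = max(scores)  # log-sum-exp shift for numerical stability
        lse = m + math.log(sum(math.exp(s - m) for s in scores))
        resp.append([math.exp(s - lse) for s in scores])
        loglik += lse
    return resp, loglik

def update_weights(resp):
    """M-step for the mixing weights: average responsibility per cluster."""
    n, k = len(resp), len(resp[0])
    return [sum(r[j] for r in resp) / n for j in range(k)]
```

The M-step for the distribution parameters (mean directions and concentrations) depends on the chosen distribution and is omitted here.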
Furthermore, spheroids includes options to regularize the number of clusters using an L1 norm (via a Hadamard product approach inspired by Ziyin and Wang) and can dynamically drop clusters whose total weight falls below a user-specified threshold (min_weight).
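The cluster-dropping rule can be pictured as follows. This is a minimal sketch that assumes "total weight" means the mixing proportion of a cluster and that clusters at or above the threshold are kept; the package's exact rule may differ, and `drop_small_clusters` is a hypothetical helper, not part of the spheroids API.

```python
def drop_small_clusters(weights, min_weight):
    """Return indices of surviving clusters and their renormalized weights."""
    kept = [k for k, w in enumerate(weights) if w >= min_weight]
    if not kept:
        raise ValueError("min_weight is too large: every cluster would be dropped")
    total = sum(weights[k] for k in kept)
    # Renormalize so the surviving mixing weights sum to one again
    return kept, [weights[k] / total for k in kept]

# Example: the third cluster falls below the 0.05 threshold and is removed
kept, new_weights = drop_small_clusters([0.60, 0.37, 0.03], min_weight=0.05)
```

After such a step, the parameters of the removed clusters would be discarded as well, and estimation continues with the smaller mixture.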
Key Features
🚀 High Performance
- Core computations implemented in C++ with Armadillo
- GPU acceleration via PyTorch
- Efficient batch processing
🎯 Multiple Distributions
- Poisson kernel-based distribution (PKBD)
- Spherical Cauchy distribution
- Extensible architecture for new distributions
📊 Clustering Capabilities
- Automatic cluster number selection
- Robust parameter estimation
- Support for high-dimensional data
Installation
Quick Install (Recommended)
You can install spheroids directly from PyPI with precompiled wheels:
pip install spheroids
Advanced Installation (Local Compilation)
For users who want to build the package locally (e.g., to modify the codebase), follow these steps:
Prerequisites
- Python ≥3.8
- C++ compiler with C++17 support
- Armadillo installed
Steps
On Linux
# Install required libraries
sudo apt-get update
sudo apt-get install -y libarmadillo-dev libomp-dev
# Clone the repository
git clone https://github.com/lsablica/spheroids.git
cd spheroids
# Build and install
pip install -e .
On macOS
# Install required libraries
brew update
brew install armadillo libomp
# Configure compiler paths (if necessary)
export CXXFLAGS="-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include -I/opt/homebrew/opt/armadillo/include"
export LDFLAGS="-L/opt/homebrew/opt/libomp/lib -lomp -L/opt/homebrew/opt/armadillo/lib"
# Clone the repository
git clone https://github.com/lsablica/spheroids.git
cd spheroids
# Build and install
pip install -e .
On Windows
# Clone vcpkg for managing C++ libraries
git clone https://github.com/microsoft/vcpkg.git C:\vcpkg
cd C:\vcpkg
.\bootstrap-vcpkg.bat -disableMetrics
.\vcpkg.exe install armadillo
# Clone the repository
git clone https://github.com/lsablica/spheroids.git
cd spheroids
# Build and install
pip install -e .
Quick Start
import torch
from spheroids import SphericalClustering
# Covariates, shape (N, num_covariates)
X = torch.randn(1000, 3)
X = X / torch.norm(X, dim=1, keepdim=True)
# Responses on the unit sphere, shape (N, response_dim)
Y = torch.randn(1000, 2)
Y = Y / torch.norm(Y, dim=1, keepdim=True)
# Create the model
model = SphericalClustering(
    num_covariates=3,
    response_dim=2,
    num_clusters=3,
    distribution="pkbd"
)
# Fit the model
ll = model.fit(X, Y, num_epochs=100)
Using C++ Implementations
Access optimized C++ implementations directly:
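import numpy as np
from spheroids import PKBD
# Parameters: unit-norm mean direction and concentration rho
mu = np.array([1.0, 0.0])
rho = 0.5
# Generate random samples
samples = PKBD.random_sample(
    n=100,
    rho=rho,
    mu=mu
)
# Calculate the log-likelihood of the samples under the same parameters
loglik = PKBD.log_likelihood(samples, mu, rho)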
API Reference
SphericalClustering
SphericalClustering(
num_covariates: int, # Number of input features
response_dim: int, # Dimension of response variables
num_clusters: int, # Initial number of clusters
distribution: str, # "pkbd" or "spcauchy"
min_weight: float = 0.05 # Minimum cluster weight
)
Key Methods
# Fit the model
model.fit(
X: torch.Tensor, # Input features (N x num_covariates)
Y: torch.Tensor, # Response variables (N x response_dim)
num_epochs: int = 100, # Number of training epochs
lr: float = 1e-3 # Learning rate
)
# Get cluster predictions
pred = model.predict(X)
Examples
Basic Clustering Example
import numpy as np
import torch
from spheroids import SphericalClustering
# Load data
Y = np.load('spheroids/spheroids/datasets/pkbd_Y.npy')
# Create model
model = SphericalClustering(num_covariates=1,
                            response_dim=4,
                            num_clusters=3,
                            device="cpu",
                            min_weight=0.02,
                            distribution="pkbd")
# Fit without covariates
mu, rho = model.fit_no_covariates(Y, num_epochs=200, tol=1e-8)
Usage of C++ API
import numpy as np
from spheroids import PKBD, spcauchy
# Shared parameters: unit-norm mean direction and concentration
mu = np.array([1.0, 0.0, 0.0])
rho = 0.5
# PKBD distribution
pkbd_samples = PKBD.random_sample(1000, rho, mu)
pkbd_loglik = PKBD.log_likelihood(pkbd_samples, mu, rho)
# Spherical Cauchy distribution
scauchy_samples = spcauchy.random_sample(1000, rho, mu)
scauchy_loglik = spcauchy.log_likelihood(scauchy_samples, mu, rho)
Contributing
We welcome contributions!
License
This project is licensed under the GPL-3.0 License - see the LICENSE file for details.
Citation
If you use spheroids in your research, please cite:
@software{spheroids,
title = {spheroids: A Python Package for Spherical Clustering Models},
author = {Lukas Sablica},
year = {2025},
url = {https://github.com/lsablica/spheroids}
}