Portable C++ avatar runtime — Python bindings via pybind11. Powers the bitHuman Essence pipeline cross-platform.
bithuman
This is the Python flavor of Layer 3: a platform-specific library for app developers. It wraps the Layer 1 libessence engine. For the CLI tool see docs/CLI.md.
┌─────────────────────────────────────────────────────────────┐
│ Layer 3: Platform-specific libraries (app developers) │
│ - Python wheel pip install bithuman ◄──── you are here
│ - Swift package SwiftPM Bithuman │
│ - Kotlin AAR ai.bithuman:sdk │
│ - (future) Rust crate, JS/TS, Go, ... │
└─────────────────────────────────────────────────────────────┘
▼ embeds
┌─────────────────────────────────────────────────────────────┐
│ Layer 2: bithuman CLI (end-user tool) │
│ - one cross-platform binary on macOS / Linux / Windows │
│ - brew install bithuman · curl-pipe installer │
└─────────────────────────────────────────────────────────────┘
▼ links
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: libessence engine (cross-platform C++ core) │
│ - portable C ABI, same source on every target │
│ - macOS · iOS · Android · Linux · Windows │
│ - never imported directly by app developers │
└─────────────────────────────────────────────────────────────┘
Python bindings for the bitHuman SDK — the portable C++ avatar engine
(libessence) that powers our cross-platform lipsync pipeline. The wheel
ships a native pybind11 module that talks directly to libessence,
so you get the same per-frame cost as our Swift and Kotlin clients with
none of the GIL noise.
On an Apple M5 with 24 GB unified memory we measure ~640 FPS sustained compose (1.56 ms/frame mean, 2.03 ms p99) for a 1248×704 avatar, with ~206 MB peak RSS end-to-end. Cold load is ~14 ms for the fixture and ~400 ms for the first compose tick (lazy ONNX init).
This package is namespace-isolated from the v0 bithuman SDK; you can
install both side-by-side.
Install
pip install bithuman
Current PyPI version: 0.1.1 (matches the libessence-v0.1.1 tag).
Compatibility
- Platforms: macOS arm64, Linux x86_64, Linux arm64 — all ship as wheels. Windows is tracked for a follow-up.
- Python: 3.10 – 3.13 (cp310, cp311, cp312, cp313). CPython only.
- ABI: wraps libessence ABI v4 (auth + auto-fit canvas).
- Auth: ships with live heartbeat against api.bithuman.ai baked into libessence. Avatar.load(api_secret=...) is the entry point; the BITHUMAN_API_SECRET env var works too. Set BITHUMAN_UNMETERED=1 for dev / parity-test runs.
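The credential lookup described in the Auth bullet can be wrapped in a small helper. This is a minimal sketch, not the package's own logic: the helper name resolve_api_secret is hypothetical, and both the precedence (explicit argument first, then BITHUMAN_API_SECRET) and the idea that BITHUMAN_UNMETERED=1 lets a run proceed without a secret are assumptions.

```python
import os


def resolve_api_secret(explicit=None, environ=None):
    """Return the secret to pass as Avatar.load(api_secret=...), or None.

    Hypothetical helper: the precedence below is an assumption, not
    documented bithuman behavior.
    """
    env = os.environ if environ is None else environ
    secret = explicit or env.get("BITHUMAN_API_SECRET")
    if secret:
        return secret
    if env.get("BITHUMAN_UNMETERED") == "1":
        # Dev / parity-test run: assumed to need no secret.
        return None
    raise RuntimeError(
        "no API secret: pass one explicitly or set BITHUMAN_API_SECRET"
    )


# Usage (needs the bithuman wheel, so it is left commented out):
# from bithuman import Avatar
# avatar = Avatar.load("model.imx", api_secret=resolve_api_secret())
```

Whether Avatar.load accepts api_secret=None is not stated here, so in an unmetered run it may be safer to omit the keyword entirely.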
What you get
The package exposes three API tiers (all importable from bithuman):
| Tier | Types | Use when… |
|---|---|---|
| Async | AsyncAvatar, AudioChunk, VideoControl, VideoFrame | Hosting a service / parity with legacy AsyncBithuman |
| Sync facade | Avatar, ComposedFrame, EP | Offline / batch / CLI rendering |
| Low-level | Fixture, Runtime, EP_CPU/EP_AUTO/EP_COREML/EP_NNAPI/EP_QNN | Direct C ABI access, custom audio pipeline |
Error types: BithumanError (base), TokenError /
TokenExpiredError / TokenValidationError / TokenRequestError /
AccountStatusError (auth), ModelError / ModelNotFoundError /
ModelLoadError / ModelSecurityError / ExpressionModelNotSupported
(fixture), RuntimeNotReadyError.
Version info: bithuman.__version__ (Python package),
bithuman.__core_version__ (linked libessence), bithuman.__abi_version__.
Quickstart
import cv2  # OpenCV, used here only to display frames
from bithuman import Avatar

with Avatar.load("model.imx") as avatar:
    for frame in avatar.compose("speech.wav"):
        # frame.bgr is a (H, W, 3) uint8 numpy array in BGR pixel order
        cv2.imshow("avatar", frame.bgr)
        cv2.waitKey(40)  # 40 ms per frame, i.e. 25 fps playback
Avatar.compose accepts a 16 kHz float32 mono numpy array OR a path to
any WAV / MP3 / FLAC / OGG file (decoded and resampled via
soundfile when needed).
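If you already hold 16-bit PCM and want to build the float32 buffer yourself, the conversion could look like the sketch below. It uses only the standard library; the function name load_pcm16_as_float32 is hypothetical, and scaling by 1/32768 into [-1, 1) is an assumed convention, since the expected amplitude range is not stated here.

```python
import array
import sys
import wave


def load_pcm16_as_float32(path):
    """Read a 16 kHz mono 16-bit WAV and return float32 samples in [-1, 1)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        if wav.getframerate() != 16000:
            raise ValueError("expected 16 kHz audio; resample first")
        raw = wav.readframes(wav.getnframes())
    ints = array.array("h")  # signed 16-bit integers
    ints.frombytes(raw)
    if sys.byteorder == "big":
        ints.byteswap()  # WAV sample data is little-endian
    # Typecode "f" stores 32-bit floats, matching the float32 requirement.
    return array.array("f", (s / 32768.0 for s in ints))
```

The result exposes the buffer protocol, so np.frombuffer(samples, dtype=np.float32) should give a numpy view suitable for Avatar.compose without copying.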
CLI
An essence-render console script ships with the wheel:
pip install 'bithuman[cli]'
essence-render \
--model ~/.cache/bithuman/models/sample-avatar.imx \
--audio speech.wav \
--output out.mp4
Pass --output - to stream raw BGR24 frames to stdout (handy for piping
into a separate ffmpeg pipeline or a custom encoder). Other flags:
| Flag | Default | Description |
|---|---|---|
| --fps | 25 | Output FPS for the MP4 container. |
| --quality | 80 | libx264 quality 1..100 (higher = better). |
| --ep | cpu | Execution provider hint (cpu/auto/coreml/…). |
| --threads | 1 | ORT intra-op thread count. |
| --no-audio | – | Skip audio muxing; produce a silent video. |
Example end-to-end run (5 s sine sweep):
essence-render 0.1.0: model=sample-avatar.imx audio=sine_sweep_5s.wav ep=cpu threads=1
essence-render: loaded fixture in 14.9 ms — 1248x704 @ 25 fps, 183 clusters, 202 src frames
essence-render: composed 122 frames in 1.83s (14.96 ms/frame, 66.8 fps)
essence-render: wrote /tmp/sine_sweep_5s.mp4
(Throughput here is bounded by H.264 encode, not Essence inference. Use
--output - if you want to measure raw compose speed.)
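A consumer of the --output - stream has to split the bytes into frames itself. The sketch below assumes the stream is tightly packed BGR24 with no header or row padding, so each frame is width × height × 3 bytes; the function names are hypothetical.

```python
import io


def iter_bgr24_frames(stream, width, height):
    """Yield one bytes object per frame from a raw BGR24 byte stream."""
    frame_size = width * height * 3  # 3 bytes per pixel: B, G, R
    while True:
        chunk = stream.read(frame_size)
        if len(chunk) < frame_size:
            # End of stream; a trailing partial frame is dropped.
            return
        yield chunk


def bgr_to_rgb(frame):
    """Swap the B and R channels of a packed 3-byte-per-pixel frame."""
    rgb = bytearray(frame)
    rgb[0::3] = frame[2::3]
    rgb[2::3] = frame[0::3]
    return bytes(rgb)


# Demo on synthetic data: two 2x2 frames followed by one stray byte.
demo = io.BytesIO(bytes(range(12)) * 2 + b"\x00")
frames = list(iter_bgr24_frames(demo, width=2, height=2))
```

In a real pipe you would pass sys.stdin.buffer as the stream, with the dimensions printed in the load line (1248x704 in the run above).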
Low-level API
If you need finer control or want to swap in a custom audio pipeline, the C ABI is exposed directly:
import numpy as np
from bithuman import Fixture, Runtime, EP_CPU
fx = Fixture("model.imx", preferred_ep=EP_CPU, intra_op_threads=1)
rt = Runtime(fx)
pcm = np.fromfile("speech.f32", dtype=np.float32) # 16 kHz mono float32
cluster_idx, bgr = rt.tick_compose(pcm, frame_idx_hint=-1)
# bgr.shape == (fx.frame_height, fx.frame_width, 3), dtype uint8
Pass the entire pcm buffer to each tick_compose call; the runtime
maintains an internal cursor and advances one tick per call until the
audio is exhausted.
Zero-alloc hot path (since 1.13.0)
For tight render loops, pre-allocate the BGR buffer once and pass it
via out=. The runtime writes into it in place and returns just the
cluster_idx. This drops wrapper overhead to within ~3 % of raw
libessence (vs ~8 % for the alloc-per-tick path):
out = np.empty((fx.frame_height, fx.frame_width, 3), dtype=np.uint8)
for _ in range(num_ticks):
    cluster_idx = rt.tick_compose(pcm, -1, out=out)
    # `out` now holds this tick's frame; read it before the next call.
The same out= keyword works on tick_compose_to_size. See
docs/ARCHITECTURE.md §9 for the cross-wrapper perf table.
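The loop above leaves num_ticks undefined. One way to estimate it, assuming each tick consumes 1/25 s of audio (the fixture log reports 25 fps), is 640 samples per tick at 16 kHz. The helper name is hypothetical, and the result is best treated as an upper bound: the CLI example run composed 122 frames from a 5 s clip, where this estimate gives 125.

```python
def estimate_ticks(num_samples, sample_rate=16000, fps=25):
    """Upper-bound estimate of compose ticks for a PCM buffer."""
    samples_per_tick = sample_rate // fps  # 640 samples at 16 kHz / 25 fps
    return num_samples // samples_per_tick


# A 5 s clip at 16 kHz holds 80000 samples.
ticks_for_5s = estimate_ticks(80000)
```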
Build from source
You need the prebuilt parent C++ archive at
cpp/build/libessence.a (run the parent CMake build first), plus
the runtime deps from Homebrew (onnxruntime, webp, ffmpeg,
hdf5, jpeg-turbo).
cd cpp/bindings/python
uv pip install -e '.[cli,test]' --no-build-isolation
The CMake glue links the prebuilt static archive directly; it does NOT re-run the parent build, so you can iterate on the bindings without paying the C++ rebuild cost.
Performance
Measured with tests/bench.py against the v1 compose path
(audio → composited BGR frame) on Apple M5 24 GB, libessence 1.13.0:
| Metric | Alloc per tick | out= reuse buffer |
|---|---|---|
| Steady-state mean | 1.53 ms / frame | 1.45 ms / frame |
| p99 | 1.66 ms | 1.53 ms |
| Sustained throughput | 655 FPS | 692 FPS |
| Overhead vs raw libessence | +8.3 % | +2.6 % |
| Peak RSS (proc) | 192 MB | 182 MB |
Wrapper overhead is within 5 % of raw libessence on the out= path;
see docs/ARCHITECTURE.md §9 for the apples-to-apples methodology and
the cross-wrapper comparison. Reproduce with:
scripts/bench-wrappers.sh
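For readers timing their own loops, the table's metrics can be derived from a list of per-frame latencies roughly as follows. This is a sketch under stated assumptions, not the method used by tests/bench.py: it uses a nearest-rank p99 and derives FPS from the mean, and the function name is hypothetical.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Return mean, nearest-rank p99, and mean-derived FPS for frame timings."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    mean_ms = statistics.fmean(ordered)
    rank = (99 * n + 99) // 100  # ceil(0.99 * n) in integer arithmetic, 1-based
    p99_ms = ordered[rank - 1]
    return {"mean_ms": mean_ms, "p99_ms": p99_ms, "fps": 1000.0 / mean_ms}
```

By this arithmetic a 1.53 ms mean corresponds to about 654 FPS, close to the 655 FPS in the table; small gaps like this can come from rounding or from how throughput is averaged.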
Linux wheels
Pre-built manylinux_2_28 wheels ship for x86_64 + aarch64 across cp310
through cp313 — 8 wheels in total, all auditwheel-repaired with the
full dep tree bundled (ORT, FFmpeg, HDF5, libjpeg-turbo, libwebp,
libcurl, OpenSSL).
To rebuild them locally:
# One-time: build the dep-baked Docker images (~10 min each).
docker build --platform linux/amd64 -t libessence/manylinux-x86_64:0.1 \
-f scripts/Dockerfile.manylinux-x86_64 scripts/
docker build --platform linux/arm64/v8 -t libessence/manylinux-aarch64:0.1 \
-f scripts/Dockerfile.manylinux-aarch64 scripts/
# Per wheel build (~2 min):
docker run --rm --platform linux/amd64 -v "$REPO":/src \
-e PYTAG=cp311 -e ARCH_INSIDE=x86_64 \
libessence/manylinux-x86_64:0.1 \
bash /src/cpp/bindings/python/scripts/build-wheel-in-container.sh
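The eight-wheel matrix (two architectures times four CPython tags) can be enumerated before scripting the per-wheel docker run calls. This is an illustrative sketch: wheel_matrix is a hypothetical helper, and using aarch64 as the ARCH_INSIDE value for the arm64 image is an assumption, since only the x86_64 invocation is shown above.

```python
from itertools import product

PYTAGS = ("cp310", "cp311", "cp312", "cp313")
# Architecture name -> docker --platform value used in the commands above.
ARCHES = {"x86_64": "linux/amd64", "aarch64": "linux/arm64/v8"}


def wheel_matrix():
    """Return one (pytag, arch, docker_platform) tuple per wheel to build."""
    return [(tag, arch, ARCHES[arch]) for arch, tag in product(ARCHES, PYTAGS)]


matrix = wheel_matrix()
```

Each tuple would map onto the PYTAG and ARCH_INSIDE environment variables and the --platform flag of one docker run.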
Limitations
- Windows wheels not yet built — tracked for v0.2.
- The CLI's output framerate is fixed at 25 fps to match the model's internal rate. Pass --output - and pipe to your own encoder if you need temporal resampling.
- preferred_ep=COREML/NNAPI/QNN is accepted but currently no-ops to CPU in the v0.1 build.
License
Commercial. Contact hello@bithuman.ai.
See also
- Root README.md: install matrix
- cpp/README.md: libessence engine internals + C ABI
- docs/CLI.md: bithuman CLI reference
- cpp/bindings/swift/README.md: Swift binding
- cpp/bindings/kotlin/README.md: Kotlin/Android binding
- docs/BUILD_AND_RELEASE.md: release flow