Client library for Palabra AI's real-time speech translation, dubbing, and voice synthesis APIs across 25+ languages.
Palabra AI Python SDK
Python SDK for Palabra AI's real-time speech-to-speech translation API. Break down language barriers and enable seamless communication across 25+ languages.
Overview
The Palabra AI Python SDK provides a high-level API for integrating real-time speech-to-speech translation into your Python applications.
What can Palabra.ai do?
- Real-time speech-to-speech translation with minimal latency
- Auto voice cloning: speak any language in your own voice
- Two-way simultaneous translation for live discussions
- Developer API/SDK for building your own apps
- Works everywhere: Zoom, streams, events, any platform
- Zero data storage: your conversations stay private
This SDK focuses on making real-time translation simple and accessible:
- Uses WebRTC and WebSockets under the hood
- Hides the transport and streaming details behind a small API
- Simple configuration with source/target languages
- Supports multiple input/output adapters (microphones, speakers, files, buffers)
How it works:
- You configure input/output adapters
- The SDK handles the entire pipeline
- Transcription, translation, and synthesis run automatically
- A real-time audio stream is ready for playback
All with just a few lines of code!
Installation
From PyPI
pip install palabra-ai
macOS SSL Certificate Setup
If you encounter an SSL certificate error on macOS such as the following, apply one of the two options below:
SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate
Option 1: Install Python certificates (recommended)
/Applications/Python\ $(python3 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')")/Install\ Certificates.command
Option 2: Use system certificates
pip install pip-system-certs
This will configure Python to use your system's certificate store.
Quick Start
Real-time microphone translation
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        EN, ES, DeviceManager)
palabra = PalabraAI()
dm = DeviceManager()
mic, speaker = dm.select_devices_interactive()
cfg = Config(SourceLang(EN, mic), [TargetLang(ES, speaker)])
palabra.run(cfg)
Before running the example, set your API credentials as environment variables:
export PALABRA_CLIENT_ID=your_client_id
export PALABRA_CLIENT_SECRET=your_client_secret
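A missing variable is easier to diagnose before the SDK starts than after a failed connection. The sketch below is a minimal pre-flight check using only the standard library. The helper name `missing_credentials` is hypothetical and not part of the SDK; only the two variable names come from this README.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("PALABRA_CLIENT_ID", "PALABRA_CLIENT_SECRET")


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Typical use before constructing the client:
#   missing = missing_credentials()
#   if missing:
#       raise SystemExit("Missing: " + ", ".join(missing))
```

Returning the list of missing names, rather than raising, leaves the choice of error handling to the caller.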
Examples
File-to-file translation
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES)
palabra = PalabraAI()
reader = FileReader("./speech/es.mp3")
writer = FileWriter("./es2en_out.wav")
cfg = Config(SourceLang(ES, reader), [TargetLang(EN, writer)])
palabra.run(cfg)
Multiple target languages
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES, FR, DE)
palabra = PalabraAI()
config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[
        TargetLang(ES, FileWriter("spanish.wav")),
        TargetLang(FR, FileWriter("french.wav")),
        TargetLang(DE, FileWriter("german.wav"))
    ]
)
palabra.run(config)
Customizable output
Add transcriptions of the source and translated speech. Configure the output to provide:
- Audio only
- Transcriptions only
- Both audio and transcriptions
from palabra_ai import (
    PalabraAI,
    Config,
    SourceLang,
    TargetLang,
    FileReader,
    EN,
    ES,
)
from palabra_ai.base.message import TranscriptionMessage
async def print_translation_async(msg: TranscriptionMessage):
    print(repr(msg))
def print_translation(msg: TranscriptionMessage):
    print(str(msg))
palabra = PalabraAI()
cfg = Config(
    source=SourceLang(
        EN,
        FileReader("speech/en.mp3"),
        print_translation  # Callback for source transcriptions
    ),
    targets=[
        TargetLang(
            ES,
            # The audio writer is optional; omit it to receive transcriptions only
            # FileWriter("./test_output.wav"),
            on_transcription=print_translation_async  # Callback for translated transcriptions
        )
    ],
    silent=True,  # Set to True to disable verbose logging to console
)
palabra.run(cfg)
Transcription output options:
1. Audio only (default):
TargetLang(ES, FileWriter("output.wav"))
2. Transcription only:
TargetLang(ES, on_transcription=your_callback_function)
3. Audio and transcription:
TargetLang(ES, FileWriter("output.wav"), on_transcription=your_callback_function)
The transcription callbacks receive TranscriptionMessage objects containing the transcribed text and metadata.
Callbacks can be either synchronous or asynchronous functions.
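Accepting both plain functions and coroutine functions usually means checking whether the callback's return value is awaitable. The sketch below shows one common way to do that with the standard library. It illustrates the general pattern only and is not the SDK's actual implementation; `dispatch` and the two sample callbacks are hypothetical names, and plain strings stand in for TranscriptionMessage objects.

```python
import asyncio
import inspect


async def dispatch(callback, message):
    """Call `callback(message)`; await the result if it is awaitable."""
    result = callback(message)
    if inspect.isawaitable(result):
        result = await result
    return result


received = []


def on_text_sync(msg):
    # Plain function: runs to completion inside the call.
    received.append(("sync", msg))


async def on_text_async(msg):
    # Coroutine function: calling it returns a coroutine that must be awaited.
    received.append(("async", msg))


async def demo():
    await dispatch(on_text_sync, "hello")
    await dispatch(on_text_async, "hola")


asyncio.run(demo())
```

Because a synchronous callback runs inside the event loop in this pattern, long blocking work there would delay other tasks; an async callback can yield while it waits.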
Integrate with FFmpeg (streaming)
import io
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN, RunAsPipe)
ffmpeg_cmd = [
    'ffmpeg',
    '-i', 'speech/ar.mp3',
    '-f', 's16le',  # 16-bit PCM
    '-acodec', 'pcm_s16le',
    '-ar', '48000',  # 48kHz
    '-ac', '1',  # mono
    '-'  # output to stdout
]
pipe_buffer = RunAsPipe(ffmpeg_cmd)
en_buffer = io.BytesIO()  # receives the translated English audio
palabra = PalabraAI()
reader = BufferReader(pipe_buffer)
writer = BufferWriter(en_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)
print(f"Translated audio written to buffer with size: {en_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(en_buffer.getbuffer())
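If several inputs need the same decoding settings, the FFmpeg argument list from the example can be produced by a small helper. This is a sketch: `pcm_ffmpeg_cmd` is a hypothetical name, and the flags are the same ones used in the example above (raw signed 16-bit little-endian PCM written to stdout).

```python
def pcm_ffmpeg_cmd(input_path, sample_rate=48000, channels=1):
    """Build an FFmpeg argument list that decodes `input_path` to raw PCM on stdout."""
    return [
        "ffmpeg",
        "-i", input_path,
        "-f", "s16le",            # raw 16-bit little-endian container
        "-acodec", "pcm_s16le",   # 16-bit PCM codec
        "-ar", str(sample_rate),  # sample rate in Hz
        "-ac", str(channels),     # channel count (1 = mono)
        "-",                      # write to stdout
    ]
```

With the defaults, `pcm_ffmpeg_cmd("speech/ar.mp3")` yields the same list as `ffmpeg_cmd` in the example.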
Using buffers
import io
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN)
from palabra_ai.internal.audio import convert_any_to_pcm16
# ar_buffer holds the Arabic input, en_buffer receives the English output
ar_buffer, en_buffer = io.BytesIO(), io.BytesIO()
with open("speech/ar.mp3", "rb") as f:
    ar_buffer.write(convert_any_to_pcm16(f.read()))
palabra = PalabraAI()
reader = BufferReader(ar_buffer)
writer = BufferWriter(en_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)
print(f"Translated audio written to buffer with size: {en_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(en_buffer.getbuffer())
Using default audio devices
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, DeviceManager, EN, ES
dm = DeviceManager()
reader, writer = dm.get_default_readers_writers()
if reader and writer:
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, reader),
        targets=[TargetLang(ES, writer)]
    )
    palabra.run(config)
Async Translation
import asyncio
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, FileReader, FileWriter, EN, ES
async def translate():
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, FileReader("input.mp3")),
        targets=[TargetLang(ES, FileWriter("output.wav"))]
    )
    result = await palabra.arun(config)
    # Result contains: result.ok, result.exc, result.log_data
if __name__ == "__main__":
    asyncio.run(translate())
Synchronous Translation
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, FileReader, FileWriter, EN, ES
# Synchronous execution (blocks until complete)
palabra = PalabraAI()
config = Config(
    source=SourceLang(EN, FileReader("input.mp3")),
    targets=[TargetLang(ES, FileWriter("output.wav"))]
)
result = palabra.run(config)
# Result contains: result.ok, result.exc, result.log_data
Signal Handling
# Enable Ctrl+C signal handlers (disabled by default)
result = palabra.run(config, signal_handlers=True)
# Default behavior (signal handlers disabled)
result = palabra.run(config) # signal_handlers=False by default
Result Handling
Both run() and arun() return a RunResult object with status information:
result = palabra.run(config)
# or: result = await palabra.arun(config)
if result.ok:
    print("Translation completed successfully!")
    if result.log_data:
        print(f"Processing stats: {result.log_data}")
    if result.eos:
        print("End of stream signal received")
else:
    print(f"Translation failed: {result.exc}")
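In scripts and CI jobs it is often useful to turn the result into a process exit code. The sketch below relies only on the `ok` and `exc` attributes described above. `exit_code` is a hypothetical helper, and `SimpleNamespace` objects stand in for a real RunResult so the snippet runs without the SDK.

```python
from types import SimpleNamespace


def exit_code(result):
    """Map a result-like object (with .ok and .exc) to a process exit code."""
    if result.ok:
        return 0
    print(f"Translation failed: {result.exc!r}")
    return 1


# Stand-ins for illustration only; real code would pass the value
# returned by palabra.run(config) or await palabra.arun(config).
succeeded = SimpleNamespace(ok=True, exc=None)
failed = SimpleNamespace(ok=False, exc=RuntimeError("connection lost"))
```

A script could then end with `raise SystemExit(exit_code(result))`.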
I/O Adapters & Mixing
Available adapters
The Palabra AI SDK provides flexible I/O adapters that can be combined freely:
- FileReader/FileWriter: Read from and write to audio files
- DeviceReader/DeviceWriter: Use microphones and speakers
- BufferReader/BufferWriter: Work with in-memory buffers
- RunAsPipe: Run a command and expose its output as a pipe (e.g., FFmpeg stdout)
Mixing examples
Combine any input adapter with any output adapter:
Microphone to file (record translations)
config = Config(
    source=SourceLang(EN, mic),
    targets=[TargetLang(ES, FileWriter("recording_es.wav"))]
)
File to speaker (play translations)
config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[TargetLang(ES, speaker)]
)
Microphone to multiple outputs
config = Config(
    source=SourceLang(EN, mic),
    targets=[
        TargetLang(ES, speaker),  # Play Spanish through speaker
        TargetLang(ES, FileWriter("spanish.wav")),  # Save Spanish to file
        TargetLang(FR, FileWriter("french.wav"))  # Save French to file
    ]
)
Buffer to buffer (for integration)
input_buffer = io.BytesIO(audio_data)
output_buffer = io.BytesIO()
config = Config(
    source=SourceLang(EN, BufferReader(input_buffer)),
    targets=[TargetLang(ES, BufferWriter(output_buffer))]
)
FFmpeg pipe to speaker
pipe = RunAsPipe(ffmpeg_cmd)  # ffmpeg_cmd is a command list, as in the FFmpeg example above
config = Config(
    source=SourceLang(EN, BufferReader(pipe)),
    targets=[TargetLang(ES, speaker)]
)
Benchmarking
The SDK includes a benchmarking module for performance analysis and quality testing. It reports detailed metrics and latency measurements and can export trace data.
# Quick benchmark
uv run python -m palabra_ai.benchmark examples/speech/en.mp3 en es --out ./results
# With Docker
make bench -- examples/speech/en.mp3 en es --out ./results
See the Benchmarking Guide for complete documentation, including configuration options, output files, and advanced usage.
Features
Real-time translation
Translate audio streams in real time with minimal latency. Suited to live conversations, conferences, and meetings.
Voice cloning
Preserve the original speaker's voice characteristics in translations. Enable voice cloning in the configuration.
Device management
Select devices through interactive prompts or programmatic access:
dm = DeviceManager()
# Interactive selection
mic, speaker = dm.select_devices_interactive()
# Get devices by name
mic = dm.get_mic_by_name("Blue Yeti")
speaker = dm.get_speaker_by_name("MacBook Pro Speakers")
# List all devices
input_devices = dm.get_input_devices()
output_devices = dm.get_output_devices()
Audio Configuration
Sample Rates by Protocol
The SDK automatically handles audio sample rates based on the connection protocol:
WebSocket (WS) Mode
- Input (to API): Always 16kHz mono PCM
- Output (from API): Always 24kHz mono PCM
WebRTC Mode
- Input (to API): 48kHz mono PCM
- Output (from API): 48kHz mono PCM
The SDK automatically resamples audio to match these requirements regardless of your input/output device capabilities.
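To make these rates concrete, the sketch below computes how many bytes a chunk of audio occupies in each mode. It assumes 16-bit (2-byte) samples, which matches the s16le format in the FFmpeg example but is not stated in this section. `RATES_HZ` and `pcm_bytes` are illustrative names, not part of the SDK.

```python
BYTES_PER_SAMPLE = 2  # assumption: 16-bit PCM

# Sample rates from the list above, keyed by (protocol, direction).
RATES_HZ = {
    ("ws", "input"): 16_000,
    ("ws", "output"): 24_000,
    ("webrtc", "input"): 48_000,
    ("webrtc", "output"): 48_000,
}


def pcm_bytes(mode, direction, milliseconds, channels=1):
    """Return the byte size of `milliseconds` of PCM audio for a protocol and direction."""
    rate = RATES_HZ[(mode, direction)]
    samples = rate * milliseconds // 1000
    return samples * channels * BYTES_PER_SAMPLE


print(pcm_bytes("ws", "input", 1000))    # one second sent over WebSocket
print(pcm_bytes("webrtc", "input", 10))  # a 10 ms WebRTC frame
```

Under the 16-bit assumption, one second of WebSocket input is 32,000 bytes and a 10 ms WebRTC frame is 960 bytes.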
Supported languages
Speech recognition languages (Source)
Arabic (AR), Bashkir (BA), Belarusian (BE), Bulgarian (BG), Bengali (BN), Catalan (CA), Czech (CS), Welsh (CY), Danish (DA), German (DE), Greek (EL), English (EN), Esperanto (EO), Spanish (ES), Estonian (ET), Basque (EU), Persian (FA), Finnish (FI), French (FR), Irish (GA), Galician (GL), Hebrew (HE), Hindi (HI), Croatian (HR), Hungarian (HU), Interlingua (IA), Indonesian (ID), Italian (IT), Japanese (JA), Korean (KO), Lithuanian (LT), Latvian (LV), Mongolian (MN), Marathi (MR), Malay (MS), Maltese (MT), Dutch (NL), Norwegian (NO), Polish (PL), Portuguese (PT), Romanian (RO), Russian (RU), Slovak (SK), Slovenian (SL), Swedish (SV), Swahili (SW), Tamil (TA), Thai (TH), Turkish (TR), Uyghur (UG), Ukrainian (UK), Urdu (UR), Vietnamese (VI), Chinese (ZH)
Translation languages (Target)
Arabic (AR), Azerbaijani (AZ), Belarusian (BE), Bulgarian (BG), Bosnian (BS), Catalan (CA), Czech (CS), Welsh (CY), Danish (DA), German (DE), Greek (EL), English (EN), English Australian (EN_AU), English Canadian (EN_CA), English UK (EN_GB), English US (EN_US), Spanish (ES), Spanish Mexican (ES_MX), Estonian (ET), Finnish (FI), Filipino (FIL), French (FR), French Canadian (FR_CA), Galician (GL), Hebrew (HE), Hindi (HI), Croatian (HR), Hungarian (HU), Indonesian (ID), Icelandic (IS), Italian (IT), Japanese (JA), Kazakh (KK), Korean (KO), Lithuanian (LT), Latvian (LV), Macedonian (MK), Malay (MS), Dutch (NL), Norwegian (NO), Polish (PL), Portuguese (PT), Portuguese Brazilian (PT_BR), Romanian (RO), Russian (RU), Slovak (SK), Slovenian (SL), Serbian (SR), Swedish (SV), Swahili (SW), Tamil (TA), Turkish (TR), Ukrainian (UK), Urdu (UR), Vietnamese (VI), Chinese (ZH), Chinese Simplified (ZH_HANS), Chinese Traditional (ZH_HANT)
Available language constants
from palabra_ai import (
    # English variants - 1.5+ billion speakers (including L2)
    EN, EN_AU, EN_CA, EN_GB, EN_US,
    # Chinese variants - 1.3+ billion speakers
    ZH, ZH_HANS, ZH_HANT,  # ZH_HANS and ZH_HANT for translation only
    # Hindi & Indian languages - 800+ million speakers
    HI, BN, MR, TA, UR,
    # Spanish variants - 500+ million speakers
    ES, ES_MX,
    # Arabic variants - 400+ million speakers
    AR, AR_AE, AR_SA,
    # French variants - 280+ million speakers
    FR, FR_CA,
    # Portuguese variants - 260+ million speakers
    PT, PT_BR,
    # Russian & Slavic languages - 350+ million speakers
    RU, UK, PL, CS, SK, BG, HR, SR, SL, MK, BE,
    # Japanese & Korean - 200+ million speakers combined
    JA, KO,
    # Southeast Asian languages - 400+ million speakers
    ID, VI, MS, FIL, TH,
    # Germanic languages - 150+ million speakers
    DE, NL, SV, NO, DA, IS,
    # Romance languages (other) - 100+ million speakers
    IT, RO, CA, GL,
    # Turkic & Central Asian languages - 200+ million speakers
    TR, AZ, KK, UG,
    # Baltic languages - 10+ million speakers
    LT, LV, ET,
    # Other European languages - 50+ million speakers
    EL, HU, FI, EU, CY, MT,
    # Middle Eastern languages - 50+ million speakers
    HE, FA,
    # African languages - 100+ million speakers
    SW,
    # Asian languages (other) - 50+ million speakers
    MN, BA,
    # Constructed languages
    EO, IA,
    # Other languages
    GA, BS
)
Note: Source languages (for speech recognition) and target languages (for translation) have different support. The SDK automatically validates language compatibility when creating SourceLang and TargetLang objects.
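The difference between the two lists can be checked up front with plain string codes. The sketch below derives the source-only and target-only codes from the lists in this README and reports mismatches. It is an illustration of the rule, not the SDK's own validation, and `check_pair` is a hypothetical helper that works on strings and not on the SDK's language constants.

```python
# Codes that appear only in the speech recognition (source) list above.
SOURCE_ONLY = frozenset({
    "BA", "BN", "EO", "EU", "FA", "GA", "IA", "MN", "MR", "MT", "TH", "UG",
})

# Codes that appear only in the translation (target) list above.
TARGET_ONLY = frozenset({
    "AZ", "BS", "EN_AU", "EN_CA", "EN_GB", "EN_US", "ES_MX", "FIL", "FR_CA",
    "IS", "KK", "MK", "PT_BR", "SR", "ZH_HANS", "ZH_HANT",
})


def check_pair(source, targets):
    """Return a list of problems for a source code and a list of target codes."""
    problems = []
    if source in TARGET_ONLY:
        problems.append(f"{source} is listed as a translation target only")
    for target in targets:
        if target in SOURCE_ONLY:
            problems.append(f"{target} is listed as a recognition source only")
    return problems


print(check_pair("EN", ["ES", "FR"]))  # no problems expected
print(check_pair("ZH_HANS", ["TH"]))   # both codes are used on the wrong side
```

An empty list means the pair is consistent with the lists in this README; the SDK still performs its own validation when the objects are created.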
Development status
Current status
- ✅ Core SDK functionality
- ✅ GitHub Actions CI/CD
- ✅ Docker packaging
- ✅ Python 3.11, 3.12, 3.13 support
- ✅ PyPI publication
- ✅ Documentation site (coming soon)
- ⏳ Code coverage reporting (setup required)
Current dev roadmap
- ⏳ TODO: global timeout support for long-running tasks
- ⏳ TODO: support for multiple source languages in a single run
- ⏳ TODO: fine cancelling on cancel_all_tasks()
- ⏳ TODO: error handling improvements
Build status
- Tests: Running on Python 3.11, 3.12, 3.13
- Release: Automated releases with Docker images
- Coverage: Tests implemented, reporting setup needed
Requirements
- Python 3.11+
- Palabra AI API credentials (get them at palabra.ai)
Support
- Documentation: https://docs.palabra.ai
- Issues: GitHub Issues
- Email: info@palabra.ai
License
This project is licensed under the MIT License - see the LICENSE file for details.
© Palabra.ai, 2025 | Breaking down language barriers with AI