Automatic Speech Recognition Model
TrorYong ASR Model
TrorYongASR is an Automatic Speech Recognition model implemented by KrorngAI.
TrorYong (ត្រយ៉ង) is the Khmer word for the giant ibis, the bird that symbolises Cambodia.
Support My Work
While this work comes truly from the heart, each project represents a significant investment of time, from deep-dive research and code preparation to the final narrative and editing process. I am incredibly passionate about sharing this knowledge, but maintaining this level of quality is a major undertaking. If you find my work helpful and are in a position to do so, please consider supporting it with a donation. Your generosity is a huge encouragement and helps ensure that I can continue creating in-depth, valuable content for you.
Installation
You can install tror-yong-asr with pip:
pip install tror-yong-asr
TrorYongASR depends on three packages: transformers, safetensors, and torchaudio.
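Before loading a model, it can help to confirm that these packages are importable. The following is a minimal sketch using only the standard library; the helper name `missing_dependencies` is hypothetical and is not part of tror-yong-asr.

```python
from importlib.util import find_spec

# Import names of the dependencies listed above.
REQUIRED = ("transformers", "safetensors", "torchaudio")


def missing_dependencies(names=REQUIRED):
    """Return the names from `names` that cannot be found as importable modules."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```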
Usage
Get started with the code below:
from transformers import AutoProcessor
from tror_yong_asr import TrorYongASRModel, transcribe, translate, detect_language

# Load the processor and the pre-trained model.
model_id = "KrorngAI/TrorYongASR-tiny"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = TrorYongASRModel.from_pretrained(model_id)

# Detect the language spoken in the audio file.
result1 = detect_language('/path/to/audio_file.mp3', model, processor)
print(result1)

# Transcribe the audio file.
result2 = transcribe('/path/to/audio_file.mp3', model, processor, max_tokens=64)
print(result2)

# Translate the audio file.
result3 = translate('/path/to/audio_file.mp3', model, processor, max_tokens=64)
print(result3)
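To run the same call over several files, a loop along the following lines is one option. This is only a sketch: the helper `transcribe_many` is hypothetical, and it takes the transcription function as an argument so that it assumes nothing about what `transcribe` returns.

```python
from pathlib import Path


def transcribe_many(paths, transcribe_fn):
    """Apply `transcribe_fn` to each path and collect the results by file name.

    A failure on one file is recorded instead of stopping the whole batch.
    """
    results = {}
    for path in paths:
        name = Path(path).name
        try:
            results[name] = transcribe_fn(str(path))
        except Exception as exc:  # keep going and store the error message
            results[name] = f"ERROR: {exc}"
    return results
```

With `model` and `processor` loaded as shown earlier, it could be called as `transcribe_many(files, lambda p: transcribe(p, model, processor, max_tokens=64))`.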
TrorYongASR has two sets of pre-trained weights, both of which support Khmer and English:
- Tiny version with model_id=KrorngAI/TrorYongASR-tiny
- Small version with model_id=KrorngAI/TrorYongASR-small
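If a script needs to switch between the two checkpoints, a small lookup keeps the identifiers in one place. The sketch below is illustrative only; `MODEL_IDS` and `resolve_model_id` are hypothetical names, not part of the package.

```python
# Checkpoint names as listed above.
MODEL_IDS = {
    "tiny": "KrorngAI/TrorYongASR-tiny",
    "small": "KrorngAI/TrorYongASR-small",
}


def resolve_model_id(size):
    """Map a size name ("tiny" or "small") to its model id."""
    key = size.strip().lower()
    if key not in MODEL_IDS:
        raise ValueError(f"Unknown size {size!r}; expected one of {sorted(MODEL_IDS)}")
    return MODEL_IDS[key]
```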
Evaluation
TrorYongASR was evaluated on the test split of google/fleurs (language code km-kh) for Khmer and on librispeech.clean for English.
WER Comparison with Whisper:
| Tiny | Parameters | Khmer (fleurs) | English (librispeech.clean) |
|---|---|---|---|
| TrorYongASR | 29M | 75.88% | 54.33% |
| Whisper | 39M | 100.6% | 7.6% |
| Small | Parameters | Khmer (fleurs) | English (librispeech.clean) |
|---|---|---|---|
| TrorYongASR | 135M | 50.46% | 21.75% |
| Whisper | 244M | 104.4% | 3.4% |
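For reference, word error rate is commonly defined as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words, which is why a score can exceed 100% when a hypothesis contains many inserted words. The sketch below shows that standard definition. It is not the evaluation script used for the tables above, and it omits text normalization such as casing, punctuation handling, and Khmer word segmentation.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, a one-word reference paired with a four-word hypothesis that contains it has three insertions and therefore a WER of 3.0, that is, 300%.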
Fine-tune TrorYongASR
The fine-tuning tutorial is provided as a notebook.
If you speak Khmer, you can also watch my YouTube video, which explains each step of the fine-tuning process.
Note: from version v.1.1 onward, you can use the push_to_hub, save_pretrained, and from_pretrained functions as with any transformers model.
from transformers import AutoProcessor
from tror_yong_asr import TrorYongASRModel

# Load the original processor and model.
original_model_id = "KrorngAI/TrorYongASR-tiny"
processor = AutoProcessor.from_pretrained(original_model_id, trust_remote_code=True)
model = TrorYongASRModel.from_pretrained(original_model_id)

# Push both to your own Hugging Face repository.
new_model_id = "your_hf_repo"
processor.push_to_hub(new_model_id)
model.push_to_hub(new_model_id)
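The note above suggests that a local save-and-reload round trip should work in the same way. The following is a minimal sketch, assuming `save_pretrained` and `from_pretrained` accept a directory path as they do for standard transformers classes. The helper `save_and_reload` is hypothetical, and the source does not confirm whether a locally saved processor reloads without access to the original repository.

```python
from pathlib import Path


def save_and_reload(model, processor, model_cls, processor_cls, out_dir):
    """Save model and processor to `out_dir`, then load both back from disk."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Write both objects to the same directory.
    processor.save_pretrained(str(out_dir))
    model.save_pretrained(str(out_dir))
    # Reload from the local directory instead of the Hub.
    reloaded_processor = processor_cls.from_pretrained(str(out_dir), trust_remote_code=True)
    reloaded_model = model_cls.from_pretrained(str(out_dir))
    return reloaded_model, reloaded_processor
```

With the objects from the snippet above, this could be called as `save_and_reload(model, processor, TrorYongASRModel, AutoProcessor, "./tror-yong-asr-local")`, where the directory name is only an example.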
Source Distribution
File details
Details for the file tror_yong_asr-0.1.1.tar.gz.
File metadata
- Download URL: tror_yong_asr-0.1.1.tar.gz
- Upload date:
- Size: 19.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 640b882f29af89969d28dee2a2074622032c3fb7472e871cd776a8e053a6634c |
| MD5 | 11e04cbe69fdcd6ccccf6787b04c1658 |
| BLAKE2b-256 | 1b03e3230a3fddb24507f0495e75878400097e507ba8a10ecfb3f0702a61245a |