Implements Sinapsis templates to perform optical character recognition on images
Sinapsis OCR
Templates for Optical Character Recognition (OCR) in images or PDFs
🐍 Installation • ⚠️ Compatibility • 📦 Packages • 🚀 Features • 📚 Usage example • 🌐 Webapp • 📙 Documentation • 🔍 License
Sinapsis OCR provides powerful and flexible implementations for extracting text from images using different OCR engines. It enables users to easily configure and run OCR tasks with minimal setup.
🐍 Installation
This mono repo consists of different packages for OCR:
- sinapsis-deepseek-ocr
- sinapsis-doctr
- sinapsis-easyocr
- sinapsis-glm-ocr

Install using your package manager of choice. We encourage the use of uv.
⚠️ Transformers Version Compatibility
DeepSeek OCR and GLM OCR have different transformers version requirements. They cannot be used together in the same environment:
| Package | Transformers Version | Notes |
|---|---|---|
| sinapsis-deepseek-ocr | ==4.46.3 (pinned) | DeepSeek models require this exact version |
| sinapsis-glm-ocr | >=4.46.3 (flexible) | GLM-OCR works with >=5.1.0 |
When installing from PyPI:
# DeepSeek OCR - installs transformers==4.46.3
uv pip install sinapsis-deepseek-ocr[all] --extra-index-url https://pypi.sinapsis.tech
# GLM OCR - installs latest transformers (5.x)
uv pip install sinapsis-glm-ocr[all] --extra-index-url https://pypi.sinapsis.tech
To upgrade the transformers version, use:
uv pip install --upgrade transformers
Important: Installing both sinapsis-deepseek-ocr and sinapsis-glm-ocr in the same environment may force transformers==4.46.3, which will cause GLM OCR to fail. Use separate virtual environments if you need both.
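The constraints in the table above can be checked before loading a model. The following is a minimal sketch, not part of Sinapsis OCR: the helper names `parse_version` and `compatible_packages` are hypothetical, and the GLM minimum of 5.1.0 is taken from the Notes column rather than from the declared `>=4.46.3` requirement.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '4.46.3' into (4, 46, 3); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad short versions such as '5.1' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


# Constraints copied from the compatibility table (assumed to be current).
DEEPSEEK_PIN = (4, 46, 3)  # sinapsis-deepseek-ocr: transformers==4.46.3
GLM_MINIMUM = (5, 1, 0)    # sinapsis-glm-ocr: "works with >=5.1.0"


def compatible_packages(transformers_version: str) -> list[str]:
    """List the OCR packages expected to work with this transformers version."""
    version = parse_version(transformers_version)
    supported = []
    if version == DEEPSEEK_PIN:
        supported.append("sinapsis-deepseek-ocr")
    if version >= GLM_MINIMUM:
        supported.append("sinapsis-glm-ocr")
    return supported


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed in this environment")
    else:
        print(installed, "->", compatible_packages(installed))
```

No single version satisfies both checks, which is why separate virtual environments are recommended when both packages are needed.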
Example with uv:
uv pip install sinapsis-doctr --extra-index-url https://pypi.sinapsis.tech
or with raw pip:
pip install sinapsis-doctr --extra-index-url https://pypi.sinapsis.tech
Change the name of the package for the one you want to install.
[!IMPORTANT] Templates in each package may require extra dependencies. For development, we recommend installing the package with all the optional dependencies:
with uv:
uv pip install sinapsis-doctr[all] --extra-index-url https://pypi.sinapsis.tech
or with raw pip:
pip install sinapsis-doctr[all] --extra-index-url https://pypi.sinapsis.tech
[!TIP] You can also install all the packages within this project:
uv pip install sinapsis-ocr[all] --extra-index-url https://pypi.sinapsis.tech
📦 Packages
Packages summary
- Sinapsis DeepSeek OCR
  - Uses the DeepSeek OCR model for high-quality OCR
  - Supports optional grounding for bounding box extraction
  - Multiple inference modes (tiny, small, base, large, gundam)
- Sinapsis DocTR
  - Uses the DocTR library for high-quality OCR with modern deep learning models
  - Supports multiple detection and recognition architectures
  - Provides detailed text extraction with bounding boxes and confidence scores
- Sinapsis EasyOCR
  - Leverages the EasyOCR library for simple yet effective OCR
  - Supports multiple languages
  - Extracts text with bounding boxes and confidence scores
- Sinapsis GLM OCR
  - Uses Zhipu AI's GLM-OCR model for high-quality OCR
  - Supports document parsing (text, formula, table) and structured information extraction via JSON schema
  - Batch inference support for faster processing of multiple images
[!TIP] Use the CLI command
sinapsis info --all-template-names
to show a list of all the available Template names installed with Sinapsis OCR.
[!TIP] Use the CLI command
sinapsis info --example-template-config TEMPLATE_NAME
to produce an example Agent config for the Template specified in TEMPLATE_NAME.
For example, for DocTROCRPrediction use sinapsis info --example-template-config DocTROCRPrediction to produce an example config.
📚 Usage example
DocTR Example
agent:
  name: doctr_prediction
  description: agent that reads images, runs DocTR recognition and saves the results

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    data_dir: dataset/input

- template_name: DocTROCRPrediction
  class_name: DocTROCRPrediction
  template_input: FolderImageDatasetCV2
  attributes:
    recognized_characters_as_labels: True

- template_name: BBoxDrawer
  class_name: BBoxDrawer
  template_input: DocTROCRPrediction
  attributes:
    draw_confidence: True
    draw_extra_labels: True

- template_name: ImageSaver
  class_name: ImageSaver
  template_input: BBoxDrawer
  attributes:
    save_dir: output
    root_dir: dataset
EasyOCR Example
agent:
  name: easyocr_inference
  description: agent that reads images, runs EasyOCR recognition and saves the results

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    data_dir: dataset/input

- template_name: EasyOCR
  class_name: EasyOCR
  template_input: FolderImageDatasetCV2
  attributes: {}

- template_name: BBoxDrawer
  class_name: BBoxDrawer
  template_input: EasyOCR
  attributes:
    draw_confidence: True
    draw_extra_labels: True

- template_name: ImageSaver
  class_name: ImageSaver
  template_input: BBoxDrawer
  attributes:
    save_dir: output
    root_dir: dataset
DeepSeek OCR Example
agent:
  name: deepseek_ocr_inference
  description: agent to run inference with DeepSeek OCR

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    data_dir: dataset/input

- template_name: DeepSeekOCRInference
  class_name: DeepSeekOCRInference
  template_input: FolderImageDatasetCV2
  attributes:
    prompt: "Convert the document to markdown."
    enable_grounding: true
    mode: base

- template_name: BBoxDrawer
  class_name: BBoxDrawer
  template_input: DeepSeekOCRInference
  attributes:
    draw_confidence: True
    draw_extra_labels: True

- template_name: ImageSaver
  class_name: ImageSaver
  template_input: BBoxDrawer
  attributes:
    save_dir: output
    root_dir: dataset
GLM OCR Example
agent:
  name: glm_ocr_table_agent
  description: "Agent to read images, perform GLM OCR for table recognition."

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    load_on_init: True
    root_dir: "."
    data_dir: "artifacts"
    pattern: "expense.jpg"

- template_name: GLMOCRInference
  class_name: GLMOCRInference
  template_input: FolderImageDatasetCV2
  attributes:
    prompt: "Table Recognition:"
    init_args:
      pretrained_model_name_or_path: zai-org/GLM-OCR
      torch_dtype: auto
      attn_implementation: kernels-community/flash-attn2
      device_map: auto
    generation_config:
      max_new_tokens: 8192
      do_sample: false
To run, simply use:
sinapsis run name_of_the_config.yml
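In each example config, every template names its upstream template through template_input, forming a linear chain from InputTemplate to the last step. A minimal sketch of a pre-flight check for that wiring follows. It is not part of the Sinapsis CLI: `check_template_chain` is a hypothetical helper, the config is written as a plain Python list instead of being parsed from YAML, and the rule that a template_input must refer to an earlier entry is an assumption drawn from the examples.

```python
def check_template_chain(templates: list[dict]) -> list[str]:
    """Return a list of wiring problems; an empty list means the chain resolves."""
    problems = []
    seen = set()
    for template in templates:
        name = template["template_name"]
        source = template.get("template_input")
        # Assumption: an input must refer to a template declared earlier.
        if source is not None and source not in seen:
            problems.append(f"{name}: unknown template_input {source!r}")
        if name in seen:
            problems.append(f"{name}: duplicate template_name")
        seen.add(name)
    return problems


# The DocTR example, transcribed by hand into Python data.
DOCTR_TEMPLATES = [
    {"template_name": "InputTemplate", "class_name": "InputTemplate"},
    {"template_name": "FolderImageDatasetCV2", "class_name": "FolderImageDatasetCV2",
     "template_input": "InputTemplate"},
    {"template_name": "DocTROCRPrediction", "class_name": "DocTROCRPrediction",
     "template_input": "FolderImageDatasetCV2"},
    {"template_name": "BBoxDrawer", "class_name": "BBoxDrawer",
     "template_input": "DocTROCRPrediction"},
    {"template_name": "ImageSaver", "class_name": "ImageSaver",
     "template_input": "BBoxDrawer"},
]

if __name__ == "__main__":
    print(check_template_chain(DOCTR_TEMPLATES) or "template chain looks consistent")
```

A check like this catches a common copy-and-paste slip when swapping OCR engines: replacing DocTROCRPrediction with EasyOCR but leaving BBoxDrawer pointing at the old name.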
🌐 Webapp
The webapp provides a simple interface to extract text from images using OCR. Upload your image, and the app will process it and display the detected text with bounding boxes.
[!IMPORTANT] To run the app you first need to clone this repository:
git clone https://github.com/Sinapsis-ai/sinapsis-ocr.git
cd sinapsis-ocr
[!NOTE] If you'd like to enable external app sharing in Gradio, run:
export GRADIO_SHARE_APP=True
[!TIP] The agent configuration can be changed using the AGENT_CONFIG_PATH environment variable. By default, the app uses the EasyOCR config, but this can be changed with:
AGENT_CONFIG_PATH=/app/packages/sinapsis_doctr/src/sinapsis_doctr/configs/doctr_demo.yaml
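The two environment variables described here, GRADIO_SHARE_APP and AGENT_CONFIG_PATH, follow the usual pattern of an optional override with a fallback. The sketch below illustrates that pattern only; it is not the webapp's actual code. `webapp_settings` and `DEFAULT_AGENT_CONFIG` are hypothetical names, the default path is a placeholder because the real EasyOCR config path is not given here, and the case-insensitive "true" parsing is an assumption.

```python
import os
from collections.abc import Mapping

# Hypothetical placeholder: the real default EasyOCR config path is not stated.
DEFAULT_AGENT_CONFIG = "path/to/easyocr_demo.yaml"


def webapp_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve the agent config path and the Gradio share flag from the environment."""
    config_path = env.get("AGENT_CONFIG_PATH", DEFAULT_AGENT_CONFIG)
    # Assumption: sharing is enabled only when the value reads "true" in any letter case.
    share = env.get("GRADIO_SHARE_APP", "False").strip().lower() == "true"
    return {"agent_config_path": config_path, "share": share}


if __name__ == "__main__":
    print(webapp_settings())
```

Passing the environment in as a mapping keeps the function easy to exercise with a plain dictionary, without touching the real process environment.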
🐳 Docker
[!IMPORTANT] This Docker image depends on the sinapsis:base image. Please refer to the official sinapsis instructions to Build with Docker.
- Build the sinapsis-ocr image:
docker compose -f docker/compose.yaml build
- Start the app container:
docker compose -f docker/compose_app.yaml up
- Check the status:
docker logs -f sinapsis-ocr-app
- The logs will display the URL to access the webapp, e.g.:
Running on local URL: http://127.0.0.1:7860
NOTE: The URL may be different; check the output of the logs.
- To stop the app:
docker compose -f docker/compose_app.yaml down
💻 UV
To run the webapp using the uv package manager, please:
- Create the virtual environment and sync the dependencies:
uv sync --frozen
- Install packages:
uv pip install sinapsis-ocr[all] --extra-index-url https://pypi.sinapsis.tech
- Run the webapp:
uv run webapps/gradio_ocr.py
- The terminal will display the URL to access the webapp, e.g.:
Running on local URL: http://127.0.0.1:7860
NOTE: The URL may be different; check the output of the terminal.
- To stop the app, press Control + C in the terminal.
📙 Documentation
Documentation for this and other sinapsis packages is available on the sinapsis website.
Tutorials for different projects within sinapsis are available on the sinapsis tutorials page.
🔍 License
This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.
For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.