OmniSkill
A super skill generator that turns CSV and Markdown datasets into ready-to-use Agentic-RAG skills with a single command.
Overview
OmniSkill analyzes your dataset directory and generates:
- SKILL.md: Skill specification document that tells LLMs how to use the knowledge base
- search.py: Standalone Python script for BM25-based retrieval against the dataset
- datasets/: Symlinked source data for the generated skill
It also provides a CLI for manually creating skill scaffolding and searching knowledge bases.
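The overview mentions BM25-based retrieval without showing how scoring works. The following is a minimal sketch of Okapi BM25 in pure Python, not OmniSkill's actual implementation. The function names, the whitespace tokenizer, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "REST API design best practices",
    "database indexing strategies",
    "API versioning and pagination",
]
scores = bm25_scores("API design", docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print([docs[i] for i in ranked])
```

Documents that match more query terms, and rarer ones, rank higher; a document with no matching term scores zero.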
Installation
Using uv (Recommended)
uv add omniskill
Using pip
pip install omniskill
From Source
git clone https://github.com/longcipher/omni-skill.git
cd omni-skill
uv sync --all-groups
Quick Start
Generate a Skill from a Dataset
Point OmniSkill at any directory containing CSV and/or Markdown files:
omniskill generate examples/backend-api-master
This analyzes the dataset files and produces a complete skill under skills/<dataset-name>/:
skills/backend-api-master/
  SKILL.md     # Skill specification for LLMs
  search.py    # Standalone search script
  datasets/    # Symlink to source data
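To make the layout concrete, here is a rough sketch that builds the same three entries by hand with `pathlib`. The helper `make_skill_layout` is hypothetical and writes placeholder files; it only illustrates the symlink idea and is not how OmniSkill's generator is implemented. Creating symlinks may need extra privileges on Windows.

```python
import tempfile
from pathlib import Path


def make_skill_layout(dataset_dir, output_dir):
    """Create SKILL.md, search.py, and a datasets/ symlink (hypothetical helper)."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # Placeholder contents; a real generator would write complete files.
    (output_dir / "SKILL.md").write_text(f"# {output_dir.name}\n", encoding="utf-8")
    (output_dir / "search.py").write_text("# search script placeholder\n", encoding="utf-8")

    link = output_dir / "datasets"
    if link.is_symlink():
        link.unlink()
    # Point datasets/ at the source data instead of copying it.
    link.symlink_to(dataset_dir.resolve(), target_is_directory=True)
    return output_dir


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "backend-api-master"
    source.mkdir()
    (source / "rules.csv").write_text("id,rule\n1,Use nouns for resources\n", encoding="utf-8")

    skill = make_skill_layout(source, Path(tmp) / "skills" / "backend-api-master")
    created = sorted(p.name for p in skill.iterdir())
    # Files in the source directory are visible through the symlink.
    link_ok = (skill / "datasets" / "rules.csv").exists()
    print(created, link_ok)
```

Because `datasets/` is a link rather than a copy, edits to the source data are visible to the skill without regenerating it.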
Use the Generated Skill
The generated search.py can be run directly:
python skills/backend-api-master/search.py "API design best practices"
Or use the CLI:
omniskill search "API design best practices" --skill-dir skills/backend-api-master
Custom Options
# Custom skill name and output directory
omniskill generate my-datasets/ --name my-skill --output out/my-skill/
# Verbose mode shows dataset analysis details
omniskill generate my-datasets/ --verbose
Architecture
graph TD
subgraph "1. Dataset Input"
CSV[CSV Files] --> GEN[omniskill generate]
MD[Markdown Files] --> GEN
end
subgraph "2. Generator"
GEN --> ANALYZE[Analyze Dataset]
ANALYZE --> GEN_SCRIPT[Generate search.py]
ANALYZE --> GEN_MD[Generate SKILL.md]
ANALYZE --> LINK[Symlink datasets/]
end
subgraph "3. Generated Skill"
GEN_SCRIPT --> SEARCH_PY[search.py]
GEN_MD --> SKILL_MD[SKILL.md]
LINK --> DATASETS[datasets/]
end
subgraph "4. Runtime"
SEARCH_PY --> ENGINE[SearchEngine]
ENGINE --> INDEXER[CSV / MD Indexer]
ENGINE --> BM25[BM25 Searcher]
BM25 --> ASM[PromptAssembler]
ASM --> |XML / Markdown / llms.txt| OUTPUT[Formatted Context]
end
SKILL_MD -. "instructs LLM" .-> SEARCH_PY
CLI Reference
generate — Generate a Skill from Datasets
omniskill generate <dataset-dir> [options]
Arguments:
- dataset-dir: Path to a directory containing CSV and/or Markdown files
Options:
- --name, -n: Skill name (defaults to the dataset directory name)
- --output, -o: Output directory (defaults to skills/<skill-name>/)
- --verbose, -v: Show dataset analysis details
Example:
omniskill generate data/api-specs/ --name api-assistant --output skills/api-assistant/
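The default rules for the skill name and the output directory can be sketched with `argparse`. This parser is hypothetical: it mirrors only the documented flags, and the README does not say which CLI framework OmniSkill actually uses.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser mirroring the documented generate options."""
    parser = argparse.ArgumentParser(prog="omniskill-generate-sketch")
    parser.add_argument("dataset_dir", type=Path)
    parser.add_argument("--name", "-n", default=None)
    parser.add_argument("--output", "-o", type=Path, default=None)
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser


def resolve_options(argv):
    """Apply the documented defaults and return (name, output, verbose)."""
    args = build_parser().parse_args(argv)
    # --name falls back to the dataset directory name.
    name = args.name or args.dataset_dir.name
    # --output falls back to skills/<skill-name>/.
    output = args.output or Path("skills") / name
    return name, output, args.verbose


print(resolve_options(["data/api-specs/"]))
print(resolve_options(["data/api-specs/", "--name", "api-assistant", "-o", "out/x"]))
```

Resolving the name first matters, since the default output path is derived from it.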
create — Create Skill Scaffolding
Create an empty skill directory with template files:
omniskill create <skill-name> [--force]
Options:
- --force, -f: Overwrite existing skill directory
search — Search a Knowledge Base
omniskill search <query> --skill-dir <path> [options]
Options:
- --skill-dir, -d: Path to the skill directory (required)
- --format, -f: Output format: xml, markdown (default: xml)
- --limit, -l: Maximum number of results (default: 10)
- --type, -t: Filter by document type: csv, markdown
- --tag: Filter by tag (AND logic, repeatable)
- --metadata: Include BM25 scores in output
- --verbose, -v: Enable verbose output
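The type filter, the AND semantics of repeated `--tag` flags, and the result limit can be illustrated as a post-processing step over scored hits. The `Hit` dataclass and `filter_hits` function below are hypothetical names for this sketch; OmniSkill's internal result type is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    """A scored search result (hypothetical shape)."""
    source: str
    doc_type: str  # "csv" or "markdown"
    score: float
    tags: set = field(default_factory=set)


def filter_hits(hits, doc_type=None, tags=(), limit=10):
    """Keep hits of the given type that carry every requested tag."""
    required = set(tags)
    kept = [
        h for h in hits
        if (doc_type is None or h.doc_type == doc_type)
        and required <= h.tags  # AND logic: all requested tags must be present
    ]
    # Highest score first, then truncate to the limit.
    kept.sort(key=lambda h: h.score, reverse=True)
    return kept[:limit]


hits = [
    Hit("a.csv", "csv", 2.1, {"api", "rest"}),
    Hit("b.md", "markdown", 3.4, {"api"}),
    Hit("c.md", "markdown", 1.2, {"api", "rest"}),
    Hit("d.md", "markdown", 0.9, {"api", "rest", "auth"}),
]
both_tags = filter_hits(hits, tags=["api", "rest"])
top_markdown = filter_hits(hits, doc_type="markdown", tags=["api", "rest"], limit=1)
print([h.source for h in both_tags], [h.source for h in top_markdown])
```

Here `b.md` has the highest score but is dropped because it lacks the `rest` tag, which is what AND logic implies.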
Python API
Generate a Skill Programmatically
from omniskill.core.generator import generate_skill
analysis = generate_skill(
    dataset_dir="data/my-datasets",
    skill_name="my-skill",
    output_dir="skills/my-skill",
)
print(f"Generated {analysis.skill_name} with {analysis.total_documents} documents")
SearchEngine
Index and search directories of CSV/Markdown files:
from omniskill.core.engine import SearchEngine
from omniskill.core.assembler import OutputFormat, PromptAssembler
engine = SearchEngine()
engine.index_directory("skills/my-skill/datasets")
results = engine.search("API design", limit=10)
assembler = PromptAssembler()
print(assembler.assemble(results, output_format=OutputFormat.XML))
Dataset Analysis
Analyze a dataset directory without generating files:
from omniskill.core.generator import analyze_dataset
analysis = analyze_dataset("data/my-datasets")
print(f"CSV files: {len(analysis.csv_files)}")
print(f"Markdown files: {len(analysis.markdown_files)}")
print(f"Total documents: ~{analysis.total_documents}")
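For intuition about what such an analysis might compute, here is a standalone sketch that does not import OmniSkill. It assumes one document per CSV data row and one per Markdown file; how `analyze_dataset` really counts documents is not stated in this README, and `AnalysisSketch` and `sketch_analyze` are hypothetical names.

```python
import csv
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class AnalysisSketch:
    """Hypothetical stand-in for the analysis result."""
    csv_files: list
    markdown_files: list
    total_documents: int


def sketch_analyze(dataset_dir):
    """Count CSV and Markdown files and estimate the number of documents."""
    csv_files = sorted(dataset_dir.rglob("*.csv"))
    markdown_files = sorted(dataset_dir.rglob("*.md"))
    total = len(markdown_files)  # assumption: one document per Markdown file
    for path in csv_files:
        with path.open(newline="", encoding="utf-8") as handle:
            # assumption: one document per data row (header excluded)
            total += sum(1 for _ in csv.DictReader(handle))
    return AnalysisSketch(csv_files, markdown_files, total)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "rules.csv").write_text(
        "id,rule\n1,Use nouns\n2,Version your API\n", encoding="utf-8"
    )
    (root / "guide.md").write_text("# Guide\n\nPrefer pagination.\n", encoding="utf-8")
    result = sketch_analyze(root)
    print(len(result.csv_files), len(result.markdown_files), result.total_documents)
```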
Output Formats
OmniSkill supports three output formats for assembled search results:
| Format | CLI Flag | Description |
|---|---|---|
| XML | --format xml | Structured <context_injection> with <rules> and <reference> sections |
| Markdown | --format markdown | Human-readable sections with source attribution |
| llms.txt | (in generated scripts) | Follows the llms.txt spec for LLM consumption |
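As a rough picture of the XML format, the sketch below builds a `<context_injection>` element with `<rules>` and `<reference>` sections using the standard library. Only those three element names come from the table above. The child elements `<rule>` and `<document>`, the `source` attribute, and the function `assemble_xml` are assumptions for illustration and may not match what `PromptAssembler` emits.

```python
import xml.etree.ElementTree as ET


def assemble_xml(rules, references):
    """Build a context_injection document (child element names are assumed)."""
    root = ET.Element("context_injection")

    rules_el = ET.SubElement(root, "rules")
    for text in rules:
        ET.SubElement(rules_el, "rule").text = text  # <rule> is an assumed name

    ref_el = ET.SubElement(root, "reference")
    for source, text in references:
        # <document> and its source attribute are assumed names.
        doc = ET.SubElement(ref_el, "document", source=source)
        doc.text = text

    ET.indent(root)  # pretty-print; available in Python 3.9+
    return ET.tostring(root, encoding="unicode")


xml_text = assemble_xml(
    rules=["Use nouns for resource names"],
    references=[("guide.md", "Prefer cursor pagination for large <lists>.")],
)
print(xml_text)
```

Building the tree with `ElementTree` rather than string concatenation means special characters in retrieved text are escaped automatically.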
Contributing
Development Setup
git clone https://github.com/longcipher/omni-skill.git
cd omni-skill
uv sync --all-groups
Common Commands
just format # Format code
just lint # Run linter
just test # Run unit tests
just bdd # Run BDD tests
just test-all # Run all tests
just build # Build package
just typecheck # Run type checker
Adding New Features
- Write a failing Gherkin scenario in features/*.feature
- Write a failing pytest test for the inner domain logic
- Implement the feature
- Re-run just test and just bdd to verify
License
Apache-2.0