Llama CPP

This is a Python package for Llama CPP ( https://github.com/ggml-org/llama.cpp ).
Installation
You can install the package from PyPI with pip, download a pre-built wheel from the releases page, or build it from source.
pip install llama-cpp-pydist
Usage
This section provides a basic overview of how to use the llama_cpp_pydist library.
Deploying Windows Binaries
If you are on Windows, the package attempts to automatically deploy pre-compiled binaries. You can also manually trigger this process.
from llama_cpp import deploy_windows_binary

# Specify the target directory for the binaries.
# This is typically within your Python environment's site-packages,
# or a custom location if you prefer.
target_dir = "./my_llama_cpp_binaries"

if deploy_windows_binary(target_dir):
    print(f"Windows binaries deployed successfully to {target_dir}")
else:
    print("Failed to deploy Windows binaries or no binaries were found for your system.")

# Once deployed, you would typically add the directory containing llama.dll (or similar)
# to your system's PATH or ensure your application can find it.
# For example, if llama.dll is in target_dir/bin:
# import os
# os.environ["PATH"] += os.pathsep + os.path.join(target_dir, "bin")
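The PATH manipulation sketched in the comments above can be wrapped in a small helper. The function below is illustrative only, not part of the llama_cpp_pydist API; the directory name mirrors the example above. Note that on Python 3.8+ on Windows, dependent DLLs loaded through ctypes or extension modules are no longer resolved via PATH, so the directory should also be registered with os.add_dll_directory:

```python
import os

def make_binaries_discoverable(bin_dir: str) -> None:
    """Prepend bin_dir to PATH and, on Windows with Python 3.8+,
    register it as a DLL search directory.

    Illustrative helper, not part of llama_cpp_pydist's API.
    """
    bin_dir = os.path.abspath(bin_dir)
    # Prepend so our copy of llama.dll wins over any other on PATH.
    os.environ["PATH"] = bin_dir + os.pathsep + os.environ.get("PATH", "")
    # Python 3.8+ on Windows no longer consults PATH when resolving
    # dependent DLLs; os.add_dll_directory covers that case.
    if hasattr(os, "add_dll_directory") and os.path.isdir(bin_dir):
        os.add_dll_directory(bin_dir)

make_binaries_discoverable("./my_llama_cpp_binaries/bin")
```

Call this once at startup, before the first import or ctypes load that needs the native library.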
Conversion Library Installation
To perform Hugging Face to GGUF model conversions, you need to install additional Python libraries. You can install them via pip:
pip install transformers numpy torch safetensors sentencepiece
Alternatively, you can install them programmatically in Python:
from llama_cpp.install_conversion_libs import install_conversion_libs

if install_conversion_libs():
    print("Conversion libraries installed successfully.")
else:
    print("Failed to install conversion libraries.")
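A helper like this typically shells out to pip using the current interpreter. The following is a minimal sketch of that pattern, not the package's actual implementation; the function name is hypothetical and the package list mirrors the pip command above:

```python
import subprocess
import sys

# The conversion dependencies listed in the section above.
CONVERSION_LIBS = ("transformers", "numpy", "torch", "safetensors", "sentencepiece")

def install_packages(packages=CONVERSION_LIBS) -> bool:
    """Run `python -m pip install <packages>` with the current interpreter.

    Returns True when pip exits cleanly. Sketch only; the real
    install_conversion_libs helper may behave differently.
    """
    result = subprocess.run(
        [sys.executable, "-m", "pip", "install", *packages],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Using `sys.executable -m pip` rather than a bare `pip` ensures the packages land in the same environment that will run the conversion.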
Converting Hugging Face Models to GGUF
This package provides a utility to convert Hugging Face models (including those using Safetensors) into the GGUF format, which is used by llama.cpp. This process leverages the conversion scripts from the underlying llama.cpp submodule.
1. Install Conversion Libraries:
Before converting models, ensure you have the necessary Python libraries. You can install them using a helper function:
from llama_cpp import install_conversion_libs

if install_conversion_libs():
    print("Conversion libraries installed successfully.")
else:
    print("Failed to install conversion libraries. Please check the output for errors.")
2. Convert the Model:
Once the dependencies are installed, you can use the convert_hf_to_gguf function:
from llama_cpp import convert_hf_to_gguf

# Specify the Hugging Face model name or local path
model_name_or_path = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # Example: a small model from Hugging Face Hub
# Or a local path: model_name_or_path = "/path/to/your/hf_model_directory"

output_directory = "./converted_gguf_models"  # Directory to save the GGUF file
output_filename = "tinyllama_1.1b_chat_q8_0.gguf"  # Optional: specify a filename
quantization_type = "q8_0"  # Example: 8-bit quantization. Common types: "f16", "q4_0", "q4_K_M", "q5_K_M", "q8_0"

print(f"Starting conversion for model: {model_name_or_path}")

success, result_message = convert_hf_to_gguf(
    model_path_or_name=model_name_or_path,
    output_dir=output_directory,
    output_filename=output_filename,  # Can be None to auto-generate
    outtype=quantization_type,
)

# `result_message` contains the path to the GGUF file on success,
# or an error message on failure.
if success:
    print(f"Model converted successfully! GGUF file saved at: {result_message}")
else:
    print(f"Model conversion failed: {result_message}")
This function will download the model from Hugging Face Hub if a model name is provided and it's not already cached locally by Hugging Face transformers. It then invokes the convert_hf_to_gguf.py script from llama.cpp.
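Because the function wraps the upstream convert_hf_to_gguf.py script, an equivalent invocation can be assembled by hand. The sketch below assumes a local llama.cpp checkout; the paths and function name are illustrative, while --outfile and --outtype are standard options of the upstream script:

```python
import sys
from pathlib import Path

def build_convert_command(repo_dir: str, model_dir: str,
                          outfile: str, outtype: str = "q8_0") -> list:
    """Assemble the argv for llama.cpp's convert_hf_to_gguf.py.

    Illustrative only; llama_cpp_pydist performs this step internally.
    """
    script = Path(repo_dir) / "convert_hf_to_gguf.py"
    return [sys.executable, str(script), model_dir,
            "--outfile", outfile, "--outtype", outtype]

cmd = build_convert_command("llama.cpp", "/path/to/your/hf_model_directory",
                            "./converted_gguf_models/model_q8_0.gguf")
# Execute with: subprocess.run(cmd, check=True)
```

Running the script through `sys.executable` keeps the conversion in the same Python environment where the conversion libraries were installed.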
For more detailed examples and advanced usage, please refer to the documentation of the underlying llama.cpp project and explore the examples provided there.
Building and Development
For instructions on how to build the package from source, update the llama.cpp submodule, or other development-related tasks, please see BUILDING.md.
File details
Details for the file llama_cpp_pydist-0.7.0.tar.gz.
File metadata
- Download URL: llama_cpp_pydist-0.7.0.tar.gz
- Upload date:
- Size: 36.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 429c41d1ac6fae3c34f1c5c0feb71588a94cb9f4dcb5d42175c7fc5d4e56b545 |
| MD5 | f43353497fc4e7d297570b491b1cd3b8 |
| BLAKE2b-256 | a4fcb9875c8d9a9ce262a60aebf0bbd5423b84b7019fbeaf159bd84829e92daf |
File details
Details for the file llama_cpp_pydist-0.7.0-py3-none-any.whl.
File metadata
- Download URL: llama_cpp_pydist-0.7.0-py3-none-any.whl
- Upload date:
- Size: 37.4 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 192a67b69d21a41521c21dd615656e25e4613ebc0578298b8221d7926f8ce31b |
| MD5 | 99c89f2690dc84fb847d3f68a7981b42 |
| BLAKE2b-256 | 3439586c0ceefc576a788c3dfffde8a872266057e4f92fb74f166ab2940a6341 |