pto-kernels

A collection of high-performance custom kernels for Ascend NPUs, built on top of pto-isa — the Parallel Tile Operation virtual instruction set architecture designed by Ascend CANN.

PTO focuses on tile-level operations, enabling efficient, composable kernel development targeting Huawei's Ascend AI processors.


Prerequisites

  • A configured torch-npu environment
  • Ascend toolkit installed at /usr/local/Ascend/ascend-toolkit
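Both prerequisites can be checked before running the setup. The following is a minimal sketch, not part of the repository: the helper name `check_prerequisites` is hypothetical, it assumes that torch-npu is importable under the module name `torch_npu`, and it only tests that the toolkit directory and its `set_env.sh` script exist, not that the environment is configured correctly.

```python
import importlib.util
from pathlib import Path

# Default toolkit location, taken from the prerequisites list above.
DEFAULT_TOOLKIT_DIR = "/usr/local/Ascend/ascend-toolkit"


def check_prerequisites(toolkit_dir=DEFAULT_TOOLKIT_DIR):
    """Return coarse presence checks as a dict (hypothetical helper)."""
    toolkit = Path(toolkit_dir)
    return {
        # Assumption: torch-npu is importable as the module "torch_npu".
        "torch_npu_found": importlib.util.find_spec("torch_npu") is not None,
        # The toolkit directory itself.
        "toolkit_dir_found": toolkit.is_dir(),
        # The environment script that the build step sources.
        "set_env_found": (toolkit / "set_env.sh").is_file(),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A missing entry only means the file or module was not found at the expected place; a toolkit installed under a different prefix can be checked by passing that path as `toolkit_dir`.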

Run the one-time setup before building:

make setup_once

Install repository using pip

The repository is pip-installable, so the package can be built and installed directly from GitHub:

export CMAKE_GENERATOR="Unix Makefiles" && pip install -v git+https://github.com/huawei-csl/pto-kernels.git

Build

source /usr/local/Ascend/ascend-toolkit/set_env.sh
pip3 install -r requirements.txt
make build_wheel

This produces an installable Python wheel, where X.Y.Z stands for the package version:

pto_kernels-X.Y.Z-*.whl

Installation

pip install --force-reinstall pto_kernels-*.whl
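After installing, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the standard library. It assumes the distribution is registered under the name `pto_kernels` (inferred from the wheel file name), and `installed_version` is a hypothetical helper, not part of the package.

```python
from importlib import metadata


def installed_version(dist_name="pto_kernels"):
    """Return the installed version of dist_name, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"pto_kernels version: {version or 'not installed'}")
```

This reads the installed package metadata only; it does not import the package or touch the NPU.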

Testing

make test

Repository Structure

pto-kernels/
├── csrc/                  # C++ kernel source files
├── python/pto_kernels/    # Python bindings and utilities
├── examples/jit_cpp/      # JIT compilation examples
├── tests/                 # Test suite
├── scripts/               # Helper scripts
├── doxygen/               # API documentation config
└── CMakeLists.txt         # CMake build configuration

Contributing

Contributions are welcome! Please read CONTRIBUTING.md before opening a pull request.


License

BSD-3-Clause-Clear — see LICENSE for details.

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

pto_kernels-0.1.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_27_x86_64.whl (199.5 kB)

CPython 3.11; manylinux: glibc 2.17+ x86-64; manylinux: glibc 2.27+ x86-64

pto_kernels-0.1.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_27_x86_64.whl (196.8 kB)

CPython 3.10; manylinux: glibc 2.17+ x86-64; manylinux: glibc 2.27+ x86-64
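The wheel file names above follow the standard wheel naming convention: name, version, an optional build tag, Python tag, ABI tag, and platform tag, separated by dashes, with multiple tags in one position joined by dots. The following sketch splits a file name into those parts to show which interpreter and platforms a wheel targets. The helper `parse_wheel_filename` is hypothetical and handles only well-formed names.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        # Each tag position may hold several dot-separated alternatives.
        "python": python_tag.split("."),
        "abi": abi_tag.split("."),
        "platforms": platform_tag.split("."),
    }


if __name__ == "__main__":
    info = parse_wheel_filename(
        "pto_kernels-0.1.2-cp311-cp311-"
        "manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_27_x86_64.whl"
    )
    print(info["version"], info["python"], info["platforms"])
```

For the CPython 3.11 wheel, the platform position holds three alternatives, which is why the file is listed with both the glibc 2.17+ and glibc 2.27+ labels.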

File details

Hashes for pto_kernels-0.1.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_27_x86_64.whl

Algorithm    Hash digest
SHA256       29ab916cab38a2192698ce16024c71ff7adcb3655f2d5d494e256d36cdf3edc1
MD5          37f4a148024d6cf20c2c40b41ffbc639
BLAKE2b-256  12c9302b30157a6c7fd0203b608c3144fa29ac1da4073f1786b17b090086f35b

File details

Hashes for pto_kernels-0.1.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_27_x86_64.whl

Algorithm    Hash digest
SHA256       b1c7f56ff976b78a41a7980aeb8d8f75bcbab35727e0855b78519aea85468983
MD5          b2a7dce273d4d88ff39655723f60616b
BLAKE2b-256  cdb729c007d567a3092d8edfb2117a2e8c6c716cb8d8ad0d62f9dea7cf35fa92
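A downloaded wheel can be checked against the SHA256 digests listed for each file. The following is a minimal sketch using the standard library's `hashlib`. The helper names `sha256_of_file` and `matches_sha256` are hypothetical, and the wheel path in the usage section is only a placeholder for wherever the file was saved.

```python
import hashlib
import sys


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python verify_wheel.py <path-to-wheel> <expected-sha256>
    wheel_path, expected = sys.argv[1], sys.argv[2]
    print("match" if matches_sha256(wheel_path, expected) else "MISMATCH")
```

A mismatch means the file differs from the one that was published, for example because of a truncated download, and it should not be installed.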
