
A small example package

Project description

Python ML skeleton project

A generic skeleton for a machine learning project with Python, Hydra, pytest, Sphinx, GitHub Actions, etc., with dummy functionalities! It is mostly oriented toward geospatial projects.


Why this project?

The goal of this project is to present a standard architecture for a Python repository/package, including a full CI/CD pipeline to document, test, and deploy your project with the standard methods of 2022. It can be used as a starting point for any project without reinventing the wheel.

The code itself is of no interest!

The code of this project is totally dummy: it performs simple mathematical operations like addition and subtraction! The next iteration will make the operations more interesting by using multi-layer perceptrons, and will try to add a complete example of a Hydra configuration.
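
For illustration, the dummy operations might look like the following. This is a minimal sketch; the actual signatures in pmps/core/add.py and pmps/core/subtract.py may differ:

```python
# Hypothetical sketch of the dummy "core" operations; the real
# functions in pmps/core may differ.
def add(a: float, b: float) -> float:
    """Return the sum of a and b."""
    return a + b


def subtract(a: float, b: float) -> float:
    """Return the difference of a and b."""
    return a - b
```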

In the near future, it will serve as a demonstration by example of a standard ML pipeline for experimentation and production.

Installation

Install requirements

As GDAL dependencies are present, it is preferable to install the dependencies via conda before installing the package:

  git clone https://github.com/samysung/python_ml_project_skeleton
  cd python_ml_project_skeleton/packaging
  conda env create -f package_env.yml

From pip:

  pip install pmps
  pip install pmps==vx.x  # for a specific version

Other installation options

From source:

python setup.py install

From source with symbolic links:

pip install -e .

From source using pip:

pip install git+https://github.com/samysung/python_ml_project_skeleton

Project Architecture

├── CHANGELOG.rst
├── .codecov.yml
├── deploy
│   └── dockerfile
├── docs
│   ├── add.rst
│   ├── build.sh
│   ├── changelog.rst
│   ├── conf.py
│   ├── deploy.sh
│   ├── index.rst
│   ├── Makefile
│   ├── readme_link.md
│   └── _static
│       └── img
├── .github
│   └── workflows
│       ├── publish.yml
│       ├── test_code.yml
│       ├── test_docs.yml
│       ├── test_packaging.yml
│       └── test_publish.yml
├── .gitignore
├── LICENSE
├── packaging
│   ├── doc_env.yml
│   ├── doc_requirements.txt
│   ├── package_env.yml
│   ├── requirements.txt
│   ├── test_env.yml
│   └── test_requirements.txt
├── pmps
│   ├── api
│   │   ├── add.py
│   │   ├── __init__.py
│   │   └── subtract.py
│   ├── core
│   │   ├── add.py
│   │   ├── __init__.py
│   │   └── subtract.py
│   └── __init__.py
├── .pre-commit-config.yaml
├── .pylintrc
├── README.md
├── readthedocs.yml
├── setup.cfg
├── setup.py
├── tests
│   ├── api
│   │   ├── __init__.py
│   │   ├── test_add.py
│   │   └── test_subtract.py
│   └── __init__.py
└── VERSION

Architecture component overview

| Component | Path | Description |
|---|---|---|
| Python Package | `pmps/` | Where the Python executable code lives. It is your root package, as it is the first directory to contain an `__init__.py`, and its name is generally the one you choose for your published package (the one built and published on a forge like PyPI, conda, etc.). Don't forget to add an `__init__.py` module to any subpackage to declare it as a Python package. NB: separating core and api into different subpackages is a design choice, not a standard; it comes from the Java world, and many Python projects prefer declaring private Python modules instead. |
| Documentation | `docs/` | The source code of your documentation: `conf.py` is where you configure your Sphinx doc, and `_static/` holds your additional static files (images, text, icons, videos, etc.). The doc is built under `docs/_build/html`, but this can be changed in the Makefile. |
| Tests Package | `tests/` | Where you organize the test code for your executable code. Your unit tests (pytest is the library used) should at least test what you expose to your clients; you can add static analysis of your test code with extensions like mypy and flake8. Use the pytest-cov extension to produce test coverage reporting. |
| Python Env | `packaging/` | Place for your conda environment files and requirement files. |
| Deployment | `deploy/` | Place for Dockerfiles or any other deployment solution. |
| CI/CD workflows | `.github/` | GitHub workflow configuration files (details below). |
| CD (documentation publishing) | `.readthedocs.yml` | Configuration of the documentation publication on Read the Docs (see the Read the Docs link). |
| CI (test coverage publishing) | `.codecov.yml` | Configuration of the code coverage publication on Codecov (see Codecov). |
| CI (static analysis publishing) | `.pre-commit-config.yaml` | Configuration of the pre-commit publication (see pre-commit). |
| CD (packaging) | `setup.cfg` and `setup.py` | Configuration files for packaging on PyPI, locally, etc. (see the Python docs). |
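
As a sketch of what a unit test under tests/api/ might contain (the function under test is a hypothetical stand-in; the real tests/api/test_add.py may differ):

```python
# Hypothetical sketch of a unit test such as tests/api/test_add.py might
# contain (written with plain asserts so it also runs outside pytest;
# the real pmps API may differ).
import math


def add(a: float, b: float) -> float:  # stand-in for the real pmps.api.add
    return a + b


def test_add_integers():
    assert add(2, 3) == 5


def test_add_floats():
    # floating-point sums need a tolerance, not exact equality
    assert math.isclose(add(0.1, 0.2), 0.3)
```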

CI/CD pipeline

The first and essential goal is to have a skeleton that is quickly adaptable to many kinds of projects, with a big emphasis on continuous integration and continuous deployment. Here is a schematic view of the CI/CD pipeline targeted for an open-source Python project, largely inspired by other well-known projects:

[CI/CD pipeline diagram]

Github Workflows

test code workflow (.github/workflows/test_code.yml):

Used to run unit tests (and functional tests, if implemented) on pull request events or pushes to the main branch. It publishes coverage results on codecov.io. Uses the packaging/test_env.yml conda environment file, the GitHub cache action, and codecov/codecov-action.
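
A minimal sketch of what such a workflow file could look like (triggers, step names, and action versions are illustrative, not the actual contents of test_code.yml):

```yaml
# Illustrative sketch only -- not the repository's actual test_code.yml
name: test code
on:
  push:
    branches: [main]
  pull_request:
jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run unit tests with coverage
        run: pytest --cov=pmps --cov-report=xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
```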

test docs workflow (.github/workflows/test_docs.yml):

Used to test the build of the Sphinx documentation. Runs on pull request events or pushes to the main branch. Uses the packaging/doc_env.yml conda environment file and the GitHub cache action.

publish workflow (.github/workflows/publish.yml):

Used to publish the package on PyPI when a new tagged version or release is published. Uses the packaging/package_env.yml conda environment file, the GitHub cache action, the GitHub download and upload artifact actions, and gh-action-pypi-publish.

test publish workflow (.github/workflows/test_publish.yml):

Same workflow as above, but run on a test branch against the test.pypi.org forge, for testing improvements to the deployment recipe.

test packaging workflow (.github/workflows/test_packaging.yml):

Workflow triggered by a cron event (see crontab.guru), every n hours. Used to test that the package has been published and that the latest version is working.
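
A cron trigger in a GitHub workflow file looks like this (the schedule value is purely illustrative; the actual interval in test_packaging.yml may differ):

```yaml
# Illustrative cron trigger -- the actual schedule may differ.
on:
  schedule:
    - cron: "0 */6 * * *"  # at minute 0, every 6 hours
```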

GitHub workflows based on GitHub webhooks or GitHub Apps

Some workflow tasks are handled by third-party applications, like the Read the Docs publication or the online pre-commit static analysis.

Pre-commit

The pre-commit action is launched via a GitHub App (pre-commit.ci) on every commit pushed to the remote. It is configured via the .pre-commit-config.yaml file.
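
A minimal .pre-commit-config.yaml sketch (the hook choices here are illustrative, not the repository's actual configuration):

```yaml
# Illustrative sketch -- not the repository's actual .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
```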

Read-the-docs publication

Read the Docs publishes a new documentation version via a webhook subscribed to push and commit events. You can configure which kind of push triggers the process in the readthedocs.org configuration section. See the Read the Docs documentation for more details.

Download files

Download the file for your platform.

Source Distribution

pmps-0.2.tar.gz (12.9 kB)

Uploaded Source

Built Distribution


pmps-0.2-py3-none-any.whl (11.0 kB)

Uploaded Python 3

File details

Details for the file pmps-0.2.tar.gz.

File metadata

  • Download URL: pmps-0.2.tar.gz
  • Upload date:
  • Size: 12.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.8.13

File hashes

Hashes for pmps-0.2.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | ea2ff61e0b7101c47d7c2d2cc10192c869442349115f7bfe872fb0c38bfb1414 |
| MD5 | 4088f2c61a6f751554cfee9af07aa4dc |
| BLAKE2b-256 | 17c137ae9211a87bb2586ff80af27d15801ef2ed4e2bb2c6723b56305509fbe8 |

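
To check a downloaded file against the published digest, you can compute its SHA-256 locally. This is a generic sketch using only the Python standard library:

```python
# Generic sketch: compute a file's SHA-256 and compare it to the
# digest published on PyPI.
import hashlib


def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


# Example: sha256_of("pmps-0.2.tar.gz") should equal the SHA256 value above.
```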

File details

Details for the file pmps-0.2-py3-none-any.whl.

File metadata

  • Download URL: pmps-0.2-py3-none-any.whl
  • Upload date:
  • Size: 11.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.8.13

File hashes

Hashes for pmps-0.2-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 039e4294ea3cda46cbeecb2315d2ad7e49db68be4d493da5fecb247867949b9c |
| MD5 | db48da758e206b339b751cd27f2b3cdb |
| BLAKE2b-256 | 585e3f16d670dd6f53a4348add68da2756e40bb39842e82070d3843fc7cf3f2a |

