Project description

WordLift Python SDK

A Python toolkit for orchestrating WordLift imports: fetch URLs from sitemaps, Google Sheets, or explicit lists, filter out already-imported pages, enqueue Search Console jobs, push RDF graphs, and call the WordLift APIs to import web pages.

Features

URL sources: XML sitemaps (with optional regex filtering), Google Sheets (url column), or Python lists.
Change detection: skips URLs that are already imported unless OVERWRITE is enabled; re-imports when lastmod is newer.
Web page imports: sends URLs to WordLift with embedding requests, output types, retry logic, and pluggable callbacks.
Search Console refresh: triggers analytics imports when top queries are stale.
Graph templates: renders .ttl.liquid templates under data/templates with account data and uploads the resulting RDF graphs.
Extensible: override protocols via WORDLIFT_OVERRIDE_DIR without changing the library code.
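
The change-detection rule in the list above can be pictured as a small predicate. This is a minimal sketch and not the SDK's implementation: the function name should_import, its parameters, and the handling of missing dates are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def should_import(
    sitemap_lastmod: Optional[datetime],
    imported_at: Optional[datetime],
    overwrite: bool = False,
) -> bool:
    """Decide whether a URL should be (re-)imported.

    Hypothetical helper: names and edge-case handling are assumptions.
    """
    if overwrite:
        return True  # OVERWRITE forces a re-import
    if imported_at is None:
        return True  # the URL has never been imported
    if sitemap_lastmod is None:
        return False  # already imported, nothing to compare against
    return sitemap_lastmod > imported_at  # re-import only when lastmod is newer


imported = datetime(2025, 11, 1, tzinfo=timezone.utc)
newer = datetime(2025, 12, 1, tzinfo=timezone.utc)

needs_import = should_import(newer, imported)
unchanged = should_import(imported, newer)
forced = should_import(None, imported, overwrite=True)
```

With these inputs, a page whose lastmod is later than the recorded import is selected, an older one is skipped, and overwrite=True selects the page regardless of dates.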

Installation

pip install wordlift-sdk
# or
poetry add wordlift-sdk

Requires Python 3.10–3.13.

Configuration

Settings are read in order: config/default.py (or a custom path you pass to ConfigurationProvider.create), environment variables, then (when available) Google Colab userdata.
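
To make the lookup order concrete, here is a minimal sketch of a layered settings resolver. It assumes that the first source defining a key wins, which the text above does not state explicitly; make_resolver and the two dictionaries are hypothetical stand-ins, not part of the SDK.

```python
from typing import Any, Callable, Optional, Sequence

# A source is any callable that returns a value for a key, or None if unset.
Source = Callable[[str], Optional[Any]]


def make_resolver(sources: Sequence[Source]) -> Callable[..., Any]:
    """Build a lookup that tries each source in order (assumed precedence)."""

    def get(key: str, default: Any = None) -> Any:
        for source in sources:
            value = source(key)
            if value is not None:
                return value
        return default

    return get


# Stand-ins for config/default.py and the process environment.
file_settings = {"SITEMAP_URL": "https://example.com/sitemap.xml"}
env_settings = {
    "WORDLIFT_KEY": "your-api-key",
    "SITEMAP_URL": "https://example.com/other.xml",
}

get_setting = make_resolver([file_settings.get, env_settings.get])

sitemap_url = get_setting("SITEMAP_URL")
api_key = get_setting("WORDLIFT_KEY")
api_url = get_setting("API_URL", "https://api.wordlift.io")
```

In a real setup the second source could be os.environ.get; a plain dictionary is used here so the example behaves the same on every machine.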

Common options:

WORDLIFT_KEY (required): WordLift API key.
API_URL: WordLift API base URL, defaults to https://api.wordlift.io.
SITEMAP_URL: XML sitemap to crawl; SITEMAP_URL_PATTERN is an optional regex used to filter its URLs.
SHEETS_URL, SHEETS_NAME, SHEETS_SERVICE_ACCOUNT: use a Google Sheet as the source; SHEETS_SERVICE_ACCOUNT points to the credentials file.
URLS: list of URLs (e.g., ["https://example.com/a", "https://example.com/b"]).
OVERWRITE: re-import URLs even if already present (default False).
WEB_PAGE_IMPORT_WRITE_STRATEGY: WordLift write strategy (default createOrUpdateModel).
EMBEDDING_PROPERTIES: list of schema properties to embed.
WEB_PAGE_TYPES: output schema types, defaults to ["http://schema.org/Article"].
GOOGLE_SEARCH_CONSOLE: enable/disable Search Console handler (default True).
CONCURRENCY: max concurrent handlers, defaults to min(cpu_count(), 4).
WORDLIFT_OVERRIDE_DIR: folder containing protocol overrides (default app/overrides).
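
As an illustration of what a CONCURRENCY limit means in an asyncio program, the sketch below computes the documented default and uses it to bound the number of handlers running at once. The helper names run_bounded and fake_handler are hypothetical, and the SDK may enforce the limit differently.

```python
import asyncio
import os
from typing import Awaitable, Callable, List, Sequence


def default_concurrency() -> int:
    # os.cpu_count() may return None, so fall back to 1 in that case.
    return min(os.cpu_count() or 1, 4)


async def run_bounded(
    urls: Sequence[str],
    handler: Callable[[str], Awaitable[str]],
    concurrency: int,
) -> List[str]:
    """Run handler for every URL with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(url: str) -> str:
        async with semaphore:
            return await handler(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(url) for url in urls))


async def fake_handler(url: str) -> str:
    await asyncio.sleep(0)  # stand-in for a network call
    return url.upper()


results = asyncio.run(run_bounded(["a", "b", "c"], fake_handler, default_concurrency()))
```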

Example config/default.py:

WORDLIFT_KEY = "your-api-key"
SITEMAP_URL = "https://example.com/sitemap.xml"
SITEMAP_URL_PATTERN = r"^https://example\.com/article/.*$"  # dots escaped so they match literally
GOOGLE_SEARCH_CONSOLE = True
WEB_PAGE_TYPES = ["http://schema.org/Article"]
EMBEDDING_PROPERTIES = [
    "http://schema.org/headline",
    "http://schema.org/abstract",
    "http://schema.org/text",
]
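
To check a SITEMAP_URL_PATTERN before running an import, the pattern can be tried against a few sample URLs with the standard re module. This sketch assumes the pattern is applied with re.search semantics, which may not match how the SDK applies it; filter_urls is a hypothetical helper. The dots in the host name are escaped so they match literally.

```python
import re
from typing import Iterable, List

SITEMAP_URL_PATTERN = r"^https://example\.com/article/.*$"


def filter_urls(urls: Iterable[str], pattern: str) -> List[str]:
    """Keep only the URLs that match the given regular expression."""
    compiled = re.compile(pattern)
    return [url for url in urls if compiled.search(url)]


urls = [
    "https://example.com/article/hello-world",
    "https://example.com/about",
    "https://example.com/article/",
]

selected = filter_urls(urls, SITEMAP_URL_PATTERN)
```

Here the two URLs under /article/ are kept and the /about page is dropped.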

Running the import workflow

import asyncio
from wordlift_sdk import run_kg_import_workflow

if __name__ == "__main__":
    asyncio.run(run_kg_import_workflow())

The workflow:

Renders and uploads RDF graphs from data/templates/*.ttl.liquid using account info.
Builds the configured URL source and filters out unchanged URLs (unless OVERWRITE is enabled).
Sends each URL to WordLift for import with retries and optional Search Console refresh.
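
The retry behaviour in the last step can be pictured with a generic retry-with-backoff helper. This is a sketch under stated assumptions, not the SDK's retry logic: import_with_retries, the attempt count, the delays, and the choice of ConnectionError as the retryable error are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def import_with_retries(
    send: Callable[[str], Awaitable[str]],
    url: str,
    max_attempts: int = 3,
    base_delay: float = 0.01,
) -> str:
    """Call send(url), retrying with exponential backoff on ConnectionError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await send(url)
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")


attempts = {"count": 0}


async def flaky_send(url: str) -> str:
    # Stand-in for an API call that fails twice before succeeding.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return f"imported {url}"


result = asyncio.run(import_with_retries(flaky_send, "https://example.com/a"))
```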

You can build components yourself when you need more control:

import asyncio
from wordlift_sdk.container.application_container import ApplicationContainer

async def main():
    container = ApplicationContainer()
    workflow = await container.create_kg_import_workflow()
    await workflow.run()

asyncio.run(main())

Custom callbacks and overrides

Override the web page import callback by placing web_page_import_protocol.py with a WebPageImportProtocol class under WORDLIFT_OVERRIDE_DIR (default app/overrides). The callback receives a WebPageImportResponse and can push to graph_queue or entity_patch_queue.
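
For readers curious how a file-based override mechanism like this can work in general, the sketch below loads a class from a directory with importlib. It is not the SDK's loader: load_override is a hypothetical helper, and the callback method name inside the generated class is an assumption, since the text above does not give the protocol's method signature.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Optional


def load_override(override_dir: str, module_name: str, class_name: str) -> Optional[type]:
    """Return class_name from <override_dir>/<module_name>.py, or None if absent."""
    path = Path(override_dir) / f"{module_name}.py"
    if not path.is_file():
        return None
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, class_name, None)


with tempfile.TemporaryDirectory() as override_dir:
    # Write a throwaway override file; "callback" is a hypothetical method name.
    source = (
        "class WebPageImportProtocol:\n"
        "    async def callback(self, response):\n"
        "        return response\n"
    )
    (Path(override_dir) / "web_page_import_protocol.py").write_text(source)

    found = load_override(override_dir, "web_page_import_protocol", "WebPageImportProtocol")
    missing = load_override(override_dir, "other_protocol", "OtherProtocol")
```

The lookup returns the class when the file exists and None otherwise, so a caller can fall back to a built-in default.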

Templates

Add .ttl.liquid files under data/templates. Templates render with account fields available (e.g., {{ account.dataset_uri }}) and are uploaded before URL handling begins.
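
To show what a rendered template might look like, the sketch below substitutes {{ account.<field> }} placeholders in a small Turtle snippet. The regex-based render function is only a stand-in for a real Liquid engine, and the template contents, the dataset URI value, and the /organization/example path are made up for illustration.

```python
import re
from types import SimpleNamespace

# Hypothetical contents of a file such as data/templates/organization.ttl.liquid
TEMPLATE = """\
@prefix schema: <http://schema.org/> .

<{{ account.dataset_uri }}/organization/example> a schema:Organization ;
    schema:name "Example" .
"""


def render(template: str, account: object) -> str:
    """Replace {{ account.<field> }} placeholders (stand-in for Liquid rendering)."""

    def replace(match: "re.Match[str]") -> str:
        return str(getattr(account, match.group(1)))

    return re.sub(r"\{\{\s*account\.(\w+)\s*\}\}", replace, template)


account = SimpleNamespace(dataset_uri="https://data.example.org/dataset")
rendered = render(TEMPLATE, account)
```

After rendering, the subject IRI starts with the account's dataset URI and no placeholder braces remain.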

Testing

poetry install --with dev
poetry run pytest

Project details

This version: 2.7.1 (Dec 2, 2025)

Download files

Source Distribution

wordlift_sdk-2.7.1.tar.gz (154.3 kB)

Uploaded Dec 2, 2025 Source

Built Distribution

wordlift_sdk-2.7.1-py3-none-any.whl (183.9 kB)

Uploaded Dec 2, 2025 Python 3

File details

Details for the file wordlift_sdk-2.7.1.tar.gz.

File metadata

Download URL: wordlift_sdk-2.7.1.tar.gz
Upload date: Dec 2, 2025
Size: 154.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.1.1 CPython/3.12.12

File hashes

Hashes for wordlift_sdk-2.7.1.tar.gz
Algorithm	Hash digest
SHA256	`4d4affdc9c166707f6e4d887a050034446f9c624aa943c02131d7a60237df8ea`
MD5	`cccb7f0b5786571496c0c7169130d3a5`
BLAKE2b-256	`3eac9c720b51cff8c3d70d56e2c55574017825510778fbc37cdc7c850fb06c45`

File details

Details for the file wordlift_sdk-2.7.1-py3-none-any.whl.

File metadata

Download URL: wordlift_sdk-2.7.1-py3-none-any.whl
Upload date: Dec 2, 2025
Size: 183.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.1.1 CPython/3.12.12

File hashes

Hashes for wordlift_sdk-2.7.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bba5576e12f57999033e2a7c2da8802909f72b1e0689d03c1ef2d75d84195525`
MD5	`53632e5e1ebba0d33608ec78857c3083`
BLAKE2b-256	`f605f9ef517453de75489f69d5eb789f5baa4fcd0ec6d4dbe1c2a9d31d8b1654`
