Extreme Quantile Regression Neural Networks for insurance pricing — covariate-dependent GPD tail modelling

insurance-eqrn

Extreme Quantile Regression Neural Networks for insurance pricing.

The problem

Your EVT model gives you the 1-in-200 claim for the portfolio. EQRN gives you the 1-in-200 claim for the Kensington flat vs the Somerset farmhouse. That difference is your reinsurance margin.

The standard approach to extreme severity modelling (fit a GPD to all claims above a threshold, read off the 99.5th percentile) pools everything together. It gives you one shape parameter and one scale parameter for the whole book. If your third-party bodily injury (TPBI) claims have a heavier tail for younger injured parties and a lighter one for older ones, the pooled model averages those tails away. Your per-segment VaR is wrong and your XL pricing is wrong.

The solution is covariate-dependent GPD parameters: xi(x) and sigma(x) as functions of risk characteristics, not pooled scalars. This is what EQRN does.

EQRN (Pasche & Engelke 2024, Annals of Applied Statistics) is the first method to estimate covariate-dependent GPD parameters using a neural network. This library is the first Python implementation.

What this library provides

  • EQRNModel — two-step fitting: LightGBM intermediate quantile + GPD neural network
  • EQRNDiagnostics — QQ plot, threshold stability, calibration, xi scatter
  • Out-of-fold intermediate quantile estimation (prevents leakage into GPD step)
  • Orthogonal GPD reparameterisation for stable gradient training
  • predict_quantile — conditional VaR at any extreme level (0.99, 0.995, ...)
  • predict_tvar — conditional TVaR / expected shortfall
  • predict_exceedance_prob — P(claim > threshold | risk profile)
  • predict_xl_layer — expected loss in per-risk XL layer (attachment, limit)

Install

pip install insurance-eqrn

PyTorch is required. For CPU-only:

pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install insurance-eqrn

Quickstart

import numpy as np
from insurance_eqrn import EQRNModel, EQRNDiagnostics

# X_train, X_val, X_test: covariate matrices (e.g. risk characteristics), prepared beforehand
# y_train, y_val: claim severity values (above basic threshold)
model = EQRNModel(
    tau_0=0.85,             # intermediate quantile level
    hidden_sizes=(32, 16, 8),
    n_epochs=300,
    shape_fixed=False,      # covariate-dependent xi
    seed=42,
)
model.fit(X_train, y_train, X_val=X_val, y_val=y_val)

# Per-segment 99.5th percentile severity
var_995 = model.predict_quantile(X_test, q=0.995)

# TVaR for reinsurance pricing
tvar_99 = model.predict_tvar(X_test, q=0.99)

# XL layer: £500k xs £500k
xl_loss = model.predict_xl_layer(X_test, attachment=500_000, limit=500_000)

# Fitted GPD parameters per observation
params = model.predict_params(X_test)
# DataFrame with columns: xi, sigma, nu, threshold

The two-step method

Step 1: Intermediate quantile (LightGBM, out-of-fold)

Fits a quantile regression at level tau_0 (default 0.8) using K-fold cross-validation. Out-of-fold predictions are mandatory here. If you use in-sample predictions, the GPD network in Step 2 sees artificially clean thresholds and learns the wrong exceedance set.
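
The out-of-fold idea can be sketched in a few lines. This is only an illustration under loud assumptions: a constant empirical quantile stands in for the LightGBM quantile regression (so no covariates are used), and the function names are hypothetical, not part of insurance-eqrn.

```python
import random


def empirical_quantile(values, level):
    """Linear-interpolation empirical quantile of a non-empty list."""
    ordered = sorted(values)
    pos = level * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def out_of_fold_thresholds(y, tau_0=0.8, n_folds=5, seed=0):
    """Give each claim a threshold estimated without using that claim.

    Stand-in model (assumption): the tau_0 empirical quantile of the
    other folds. The library fits a LightGBM quantile regression instead.
    """
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]

    thresholds = [None] * len(y)
    for held_out in folds:
        held = set(held_out)
        # Fit on every fold except the held-out one.
        train_values = [y[i] for i in idx if i not in held]
        q = empirical_quantile(train_values, tau_0)
        # Predict only for the observations the fit never saw.
        for i in held_out:
            thresholds[i] = q
    return thresholds
```

Each threshold comes from a fit that excluded the claim it is applied to, so the exceedance set passed to Step 2 is not flattered by in-sample fitting.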

Step 2: GPD neural network on exceedances

Identifies observations above their predicted threshold (~20% of training data at tau_0=0.8). Trains a feedforward network mapping (X, Q_hat(tau_0)) → (nu(x), xi(x)) using the orthogonal GPD deviance loss.

The orthogonal parameterisation (nu = sigma * (xi + 1)) makes the Fisher information matrix diagonal, which stabilises Adam training substantially compared to the direct (sigma, xi) parameterisation.
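
As a rough sketch of what the network output means, the snippet below converts (nu, xi) back to sigma and evaluates the GPD negative log-likelihood of a single exceedance. The function names are hypothetical, and the loss the library actually uses may differ in scaling, regularisation or numerical safeguards.

```python
import math


def sigma_from_orthogonal(nu, xi):
    """Recover the GPD scale from the orthogonal pair: sigma = nu / (xi + 1)."""
    if xi <= -1.0:
        raise ValueError("xi must be greater than -1")
    return nu / (xi + 1.0)


def gpd_nll_orthogonal(z, nu, xi, eps=1e-8):
    """Negative log-likelihood of one exceedance z >= 0 under GPD(sigma, xi),
    with sigma expressed through the orthogonal parameter nu."""
    sigma = sigma_from_orthogonal(nu, xi)
    if abs(xi) < eps:
        # Exponential limit of the GPD as xi -> 0.
        return math.log(sigma) + z / sigma
    t = 1.0 + xi * z / sigma
    if t <= 0.0:
        # z lies outside the GPD support for these parameters.
        return math.inf
    return math.log(sigma) + (1.0 + 1.0 / xi) * math.log(t)
```

Averaging this quantity over the exceedances z = y - Q_hat(tau_0) gives a training objective in (nu, xi) rather than (sigma, xi).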

Prediction

For a new observation x at target level tau > tau_0:

Q_x(tau) = Q_hat_x(tau_0) + sigma(x)/xi(x) * [((1-tau_0)/(1-tau))^xi(x) - 1]

At xi ≈ 0 (exponential limit), this is Q_hat + sigma * log((1-tau_0)/(1-tau)).
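
The extrapolation formula above, including the exponential limit, can be written as a small standalone function. This is a sketch for checking numbers by hand; the function name and the eps cut-off are assumptions, not the library's API.

```python
import math


def gpd_extreme_quantile(q_tau0, sigma, xi, tau_0, tau, eps=1e-8):
    """Extrapolate from the intermediate quantile q_tau0 to level tau >= tau_0."""
    if not (0.0 < tau_0 <= tau < 1.0):
        raise ValueError("need 0 < tau_0 <= tau < 1")
    ratio = (1.0 - tau_0) / (1.0 - tau)
    if abs(xi) < eps:
        # Exponential limit: sigma * log(ratio).
        return q_tau0 + sigma * math.log(ratio)
    return q_tau0 + sigma / xi * (ratio ** xi - 1.0)


# Illustrative inputs only (not fitted values).
q_995 = gpd_extreme_quantile(q_tau0=100_000.0, sigma=50_000.0, xi=0.3,
                             tau_0=0.8, tau=0.995)
```

At tau = tau_0 the ratio is 1 and the function returns the intermediate quantile unchanged, which is a quick sanity check on any implementation.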

Parameters

Parameter       Default       Description
tau_0           0.8           Intermediate quantile level. A higher value leaves fewer exceedances for the GPD step, so lower it on small datasets
hidden_sizes    (32, 16, 8)   Network hidden layer widths
n_epochs        500           Maximum training epochs
patience        50            Early stopping patience
shape_fixed     False         If True, xi is a scalar. Start here before fitting the full model
l2_pen          1e-4          L2 weight decay
shape_penalty   0             Penalty on variance of xi(x); smooths the shape surface
p_drop          0             Dropout probability. Try 0.1–0.2 for small datasets
n_folds         5             Number of folds for the out-of-fold (OOF) intermediate quantile
seed            None          Random seed

Diagnostics

from insurance_eqrn import EQRNDiagnostics

diag = EQRNDiagnostics(model)

# GPD QQ plot — should track the diagonal if the tail model is correct
diag.qq_plot(X_test, y_test)

# Predicted vs empirical coverage at each quantile level
diag.calibration_plot(X_test, y_test, levels=[0.9, 0.95, 0.99, 0.995])

# Mean residual life plot — linearity onset shows where GPD approximation holds
diag.mean_residual_life_plot(y_train)

# Threshold stability — fit shape_fixed models at each tau_0, look for plateau
diag.threshold_stability_plot(X_train, y_train)

# Summary table: predicted vs empirical exceedance rates
diag.summary_table(X_test, y_test)
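
The quantity behind a calibration check is simple enough to compute by hand. The helper below is a hypothetical sketch, not part of EQRNDiagnostics: it measures the share of observed claims at or below their predicted quantile, which should be close to the nominal level if the model is calibrated.

```python
def empirical_coverage(y_true, q_pred):
    """Share of observations at or below their predicted quantile."""
    if len(y_true) != len(q_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for y, q in zip(y_true, q_pred) if y <= q)
    return hits / len(y_true)


# Toy data: a constant predicted quantile of 8.5 covers 8 of 10 claims.
claims = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
coverage = empirical_coverage(claims, [8.5] * len(claims))
```

Comparing this number with the nominal level (for example 0.995) at several levels gives the predicted-versus-empirical comparison described above. At extreme levels the estimate is noisy unless the test set is large.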

Insurance applications

Motor TPBI (Third-Party Bodily Injury)

Young injured parties have longer annuity streams and heavier tails. EQRN lets you model xi(x) as a function of injured party age, claim type, solicitor involvement. Output: P(claim > £500k | risk profile) per policy.
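
For intuition, the exceedance probability follows from the fitted GPD parameters: the chance of passing the intermediate quantile is 1 - tau_0, and the GPD survival function gives the chance of going further. The sketch below assumes that decomposition and uses a hypothetical function name; it is not the library's predict_exceedance_prob.

```python
import math


def exceedance_probability(y, threshold, sigma, xi, tau_0, eps=1e-8):
    """P(claim > y | risk profile) for y at or above the intermediate quantile.

    threshold, sigma and xi are the per-risk values Q_hat(tau_0), sigma(x), xi(x).
    """
    if y < threshold:
        raise ValueError("the tail model only covers y >= threshold")
    z = (y - threshold) / sigma
    if abs(xi) < eps:
        tail = math.exp(-z)          # exponential limit
    else:
        t = 1.0 + xi * z
        # t <= 0 means y is beyond the upper end point (only possible for xi < 0).
        tail = t ** (-1.0 / xi) if t > 0.0 else 0.0
    return (1.0 - tau_0) * tail
```

Evaluated at y equal to the model's tau-quantile, this returns 1 - tau, so it is the inverse of the quantile extrapolation.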

Property large loss

Commercial property fire severity varies by construction class, sum insured, sprinkler status. EQRN provides 1-in-200 loss conditional on risk characteristics — input to CAT reinsurance models.

Per-risk XL pricing

# Price layer: £1M xs £500k, conditional on risk
xl = model.predict_xl_layer(X_test, attachment=500_000, limit=1_000_000)
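
For reference, the expected loss to a layer can be derived from the GPD tail by integrating the survival function across the layer. The sketch below is one way to do that in closed form, assuming 0 < xi < 1 and an attachment at or above the intermediate quantile. Whether predict_xl_layer uses this formula is an assumption, and the function name is hypothetical.

```python
def expected_layer_loss(attachment, limit, threshold, sigma, xi, tau_0):
    """E[min(max(Y - attachment, 0), limit)] under a GPD tail, for 0 < xi < 1.

    Uses E[layer loss] = integral of P(Y > y) over [attachment, attachment + limit],
    with P(Y > y) = (1 - tau_0) * (1 + xi * (y - threshold) / sigma) ** (-1 / xi).
    """
    if not 0.0 < xi < 1.0:
        raise ValueError("this sketch only handles 0 < xi < 1")
    if attachment < threshold:
        raise ValueError("attachment must be at or above the threshold")

    def tail_integral(y):
        # Integral of the survival function from y to infinity.
        t = 1.0 + xi * (y - threshold) / sigma
        return (1.0 - tau_0) * sigma / (1.0 - xi) * t ** (1.0 - 1.0 / xi)

    return tail_integral(attachment) - tail_integral(attachment + limit)
```

The result is an expected severity per claim, averaged over claims of all sizes with the given risk profile. Multiplying by an expected claim count from a separate frequency model would give a layer loss cost.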

Solvency II SCR

EQRN provides per-segment 99.5th percentile severity, which is the correct input for simulation-based SCR calculations on heterogeneous portfolios. Relative to pooled EVT, segment-level conditional VaR is higher for heavy-tailed segments and lower for light-tailed ones, so capital is not averaged across them.

When not to use EQRN

  • Frequency modelling: EQRN models severity above a threshold. Frequency is a separate model.
  • Attritional claims: Claims below the intermediate quantile Q_hat(tau_0) are not modelled by EQRN.
  • Small books (n_exceedances < 200): Set shape_fixed=True as a minimum. Below ~100 exceedances, fall back to marginal EVT.
  • No covariates: Use insurance-evt directly.
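
The small-book guidance above can be turned into a quick pre-flight check. This helper is hypothetical (not part of the library) and approximates the exceedance count as n_claims * (1 - tau_0), which is what the out-of-fold threshold targets on average.

```python
def suggest_tail_model(n_claims, tau_0=0.8):
    """Return (approx. exceedance count, suggested setup) from the rules of thumb:
    under ~100 exceedances -> marginal EVT, under 200 -> shape_fixed=True."""
    n_exceedances = round(n_claims * (1.0 - tau_0))
    if n_exceedances < 100:
        return n_exceedances, "marginal_evt"
    if n_exceedances < 200:
        return n_exceedances, "shape_fixed"
    return n_exceedances, "full"


# 750 claims at tau_0 = 0.8 leave about 150 exceedances.
n_exc, setup = suggest_tail_model(750, tau_0=0.8)
```

The realised exceedance count depends on the fitted thresholds, so treat the estimate as a planning figure and check the actual count after Step 1.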

Reference

Pasche, O.C. & Engelke, S. (2024). "Neural networks for extreme quantile regression with an application to forecasting of flood risk." Annals of Applied Statistics, 18(4), 2818–2839. DOI:10.1214/24-AOAS1907.

R reference implementation: opasche/EQRN (CRAN, March 2025).
