Python client for India's National Statistical Office (NSO/MoSPI) data portal

esankhyiki

Requires Python 3.9+. License: MIT. PRs welcome.


Access 500+ statistical indicators across 22 datasets covering employment, prices, industry, GDP, health, education, environment, trade, and more - directly from Python.

Installation

pip install mospi-esankhyiki

Quick Start

import esankhyiki

# Step 1: Discover available datasets
datasets = esankhyiki.list_datasets()

# Step 2: Get indicators for a dataset
indicators = esankhyiki.get_indicators("PLFS")

# Step 3: Get valid filter values for an indicator
metadata = esankhyiki.get_metadata("PLFS", indicator_code=1, frequency_code=1)

# Step 4: Fetch the data
data = esankhyiki.get_data("PLFS", {
    "indicator_code": 1,
    "frequency_code": 1,
    "year": "2023-24",
    "state_code": 99,
    "gender_code": 3,
    "age_code": 1,
    "sector_code": 3,
})

The 4-Step Workflow

The API follows a sequential discovery workflow. Filter codes are dataset-specific - always discover them through the workflow instead of guessing.

list_datasets()  ->  get_indicators()  ->  get_metadata()  ->  get_data()
      |                    |                    |                  |
  Find the right      See what's          Get valid          Fetch the
    dataset           measured          filter values       actual data

Why this order matters: get_metadata() returns the exact values (state codes, year strings, gender codes, etc.) that get_data() expects. Passing arbitrary values will raise InvalidFilterError.
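One way to act on this is to check a filter dict against the discovered values before calling get_data(). The sketch below is a hypothetical helper, not part of esankhyiki. It assumes you have already reduced the get_metadata() response to a plain mapping of filter name to accepted values; the real response layout is not described here, so that step is left to the reader, and the sample values are illustrative only.

```python
def find_invalid_filters(filters, allowed):
    """Return {name: value} for every filter that `allowed` does not permit.

    `allowed` maps a filter name to the collection of values accepted for it.
    This layout is an assumption; build it from the get_metadata() response.
    """
    problems = {}
    for name, value in filters.items():
        if name not in allowed:
            problems[name] = value      # unknown filter name
        elif value not in allowed[name]:
            problems[name] = value      # known name, unsupported value
    return problems


# Hypothetical allowed values, for illustration only.
allowed = {"year": {"2022-23", "2023-24"}, "state_code": {99}}
filters = {"year": "2023-24", "state_code": 5, "bogus_param": 1}
problems = find_invalid_filters(filters, allowed)
print(problems)
```

An empty result means every filter name and value was found in the mapping. It does not guarantee that the endpoint will return rows.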


API Reference

list_datasets(format="dict")

Returns an overview of all 22 MoSPI statistical datasets.

datasets = esankhyiki.list_datasets()
datasets_df = esankhyiki.list_datasets(format="df")

get_indicators(dataset, format="dict")

Returns available indicators for a given dataset.

indicators = esankhyiki.get_indicators("PLFS")

Notes by dataset:

  • PLFS, ASUSE - indicators are grouped by frequency_code (1=Annual, 2=Quarterly, 3=Monthly for PLFS)
  • NSS79 - indicators are grouped by survey_code (1=CAMS, 2=AYUSH)
  • CPI - returns available base years instead of named indicators
  • IIP, WPI - these datasets have no sub-indicators; call get_metadata() directly

get_metadata(dataset, ..., format="dict")

Returns valid filter values (states, years, quarters, gender codes, etc.) for a given dataset and indicator. The returned filter_values show exactly what values are accepted by get_data().

Full signature:

esankhyiki.get_metadata(
    dataset,
    indicator_code=None,        # int - required for most datasets
    base_year=None,             # str - required for CPI, IIP, NAS
    level=None,                 # str - required for CPI ("Group" or "Item")
    frequency=None,             # str - required for IIP ("Annually" or "Monthly")
    classification_year=None,   # str - required for ASI (e.g. "2008")
    frequency_code=None,        # int - required for PLFS and ASUSE (1=Annual, 2=Quarterly)
    series=None,                # str - for CPI and NAS ("Current" or "Back")
    use_of_energy_balance_code=None,  # int - for ENERGY (1=Supply, 2=Consumption)
    sub_indicator_code=None,    # int - for RBI (alternative to indicator_code)
    format="dict",
)

Required params by dataset:

Dataset   Required params
PLFS      indicator_code, frequency_code
CPI       base_year, level
IIP       base_year, frequency
ASI       classification_year
NAS       indicator_code, base_year, frequency_code
WPI       (none)
ENERGY    indicator_code
AISHE     indicator_code
ASUSE     indicator_code, frequency_code
GENDER    indicator_code
NFHS      indicator_code
ENVSTATS  indicator_code
RBI       indicator_code or sub_indicator_code
NSS77     indicator_code
NSS78     indicator_code
CPIALRL   indicator_code
HCES      indicator_code
TUS       indicator_code
EC        indicator_code (1=EC6, 2=EC5, 3=EC4)
NSS79     indicator_code
UDISE     indicator_code
MNRE      indicator_code (1=Solar, 2=Wind, 3=Hydro, 4=Bio, 5=Total)
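The table above can also be encoded as data, so that a missing parameter is caught before any request is made. This is a minimal sketch: REQUIRED_PARAMS, DEFAULT_REQUIRED and missing_params are hypothetical names, not part of the package, and the helper does not check whether the dataset name itself is valid.

```python
# Requirements copied from the table above. Each inner tuple lists
# alternatives: at least one name in it must be supplied.
REQUIRED_PARAMS = {
    "PLFS": [("indicator_code",), ("frequency_code",)],
    "CPI": [("base_year",), ("level",)],
    "IIP": [("base_year",), ("frequency",)],
    "ASI": [("classification_year",)],
    "NAS": [("indicator_code",), ("base_year",), ("frequency_code",)],
    "WPI": [],
    "ASUSE": [("indicator_code",), ("frequency_code",)],
    "RBI": [("indicator_code", "sub_indicator_code")],
}
# Every other dataset in the table needs only indicator_code.
DEFAULT_REQUIRED = [("indicator_code",)]


def missing_params(dataset, **params):
    """Return the unmet requirements for a get_metadata() call."""
    supplied = {name for name, value in params.items() if value is not None}
    required = REQUIRED_PARAMS.get(dataset, DEFAULT_REQUIRED)
    return [" or ".join(group) for group in required
            if not supplied.intersection(group)]


print(missing_params("CPI", base_year="2024"))
print(missing_params("RBI", sub_indicator_code=1))
```

A non-empty list names what still has to be passed, for example level for the CPI call above.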

get_data(dataset, filters, format="dict")

Fetches statistical data. Use filter values returned by get_metadata().

data = esankhyiki.get_data("PLFS", {
    "indicator_code": 1,
    "frequency_code": 1,
    "year": "2023-24",
    "state_code": 99,
    "gender_code": 3,
    "age_code": 1,
    "sector_code": 3,
})

Pagination (where supported):

# `filters` holds the dataset filters, e.g. the PLFS dict shown above
data = esankhyiki.get_data("PLFS", {
    **filters,
    "limit": 50,
    "page": 2,
})
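To walk through every page, a small generator can request pages until one comes back empty. The sketch below is hypothetical and independent of the package: iter_pages takes any callable that returns the rows of a given page as a list. How rows are extracted from a get_data() response is not specified here. Since the error contract says a valid request with zero rows raises NoDataError, a real callable would probably need to catch that and return an empty list; that is an assumption.

```python
def iter_pages(fetch_page, start=1, max_pages=100):
    """Yield non-empty pages until `fetch_page` returns an empty one.

    `fetch_page(page)` must return the rows of that page as a list.
    `max_pages` is a safety cap so a misbehaving source cannot loop forever.
    """
    for page in range(start, start + max_pages):
        rows = fetch_page(page)
        if not rows:
            break
        yield rows


# Stand-in data source; a real fetch_page would call get_data() with
# "limit" and "page" added to the filters and return the row list.
fake_pages = {1: ["row-a", "row-b"], 2: ["row-c"]}
all_rows = [row
            for rows in iter_pages(lambda page: fake_pages.get(page, []))
            for row in rows]
print(all_rows)
```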

Output Formats

All four functions accept a format parameter:

Value                  Returns
"dict" (default)       Python dict
"df" or "dataframe"    pandas DataFrame
"csv"                  CSV string

# `filters` is a dict of filter values, as in the get_data() example above

# Default dict
data = esankhyiki.get_data("PLFS", filters)

# DataFrame
df = esankhyiki.get_data("PLFS", filters, format="df")

# CSV (named csv_text so it does not shadow the stdlib csv module)
csv_text = esankhyiki.get_data("PLFS", filters, format="csv")
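If pandas is not available, the CSV string can be parsed with the standard library. This is a minimal sketch: csv_to_records is a hypothetical helper, it assumes the string begins with a header row, and the sample columns and values are made up for illustration.

```python
import csv
import io


def csv_to_records(csv_text):
    """Parse a CSV string with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Hypothetical sample; real column names depend on the dataset.
sample = "year,state,value\n2023-24,All India,1.0\n"
records = csv_to_records(sample)
print(records[0]["value"])
```

Every value comes back as a string, so numeric columns need an explicit conversion.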

Datasets

Dataset   Name                                 Coverage
PLFS      Periodic Labour Force Survey         Jobs, unemployment, wages
CPI       Consumer Price Index                 Retail inflation, price indices
IIP       Index of Industrial Production       Manufacturing, mining output
ASI       Annual Survey of Industries          Factory financials, employment
NAS       National Accounts Statistics         GDP, GVA, national income
WPI       Wholesale Price Index                Wholesale inflation
ENERGY    Energy Statistics                    Energy production and consumption
AISHE     Higher Education Survey              Universities, enrolment, GER
ASUSE     Unincorporated Enterprises           Informal sector, MSMEs
GENDER    Gender Statistics                    147 indicators across all domains
NFHS      National Family Health Survey        Health, fertility, mortality
ENVSTATS  Environment Statistics               Climate, biodiversity, pollution
RBI       RBI Statistics                       Trade, forex, exchange rates
NSS77     NSS 77th Round                       Agricultural households
NSS78     NSS 78th Round                       Living conditions
CPIALRL   CPI for Rural Labourers              Rural inflation
HCES      Household Consumption                Spending, poverty, Gini
TUS       Time Use Survey                      Time allocation, unpaid work
EC        Economic Census                      District-wise establishments
NSS79     NSS 79th Round                       Education, health, digital literacy (CAMS/AYUSH)
UDISE     Unified District Information System  School education statistics
MNRE      Renewable Energy (MNRE)              State-wise installed capacity for solar, wind, hydro, bio, and total renewable power

Examples

Unemployment Rate (PLFS)

import esankhyiki

# Discover valid filter values first
meta = esankhyiki.get_metadata("PLFS", indicator_code=3, frequency_code=1)

df = esankhyiki.get_data("PLFS", {
    "indicator_code": 3,      # Unemployment Rate
    "frequency_code": 1,      # Annual
    "year": "2023-24",
    "state_code": 99,         # All India
    "gender_code": 3,         # Person (all genders combined)
    "age_code": 1,
    "sector_code": 3,         # Rural + Urban combined
}, format="df")

GDP / National Accounts (NAS)

meta = esankhyiki.get_metadata(
    "NAS",
    indicator_code=1,
    base_year="2022-23",
    frequency_code=1,
    series="Current",
)

df = esankhyiki.get_data("NAS", {
    "indicator_code": 1,
    "base_year": "2022-23",
    "series": "Current",
    "frequency_code": 1,
}, format="df")

Consumer Price Index (CPI)

# CPI is auto-routed: pass level="Group" for group-level, level="Item" for item-level
meta = esankhyiki.get_metadata("CPI", base_year="2024", level="Group", series="Current")

df = esankhyiki.get_data("CPI", {
    "base_year": "2024",
    "year": "2026",
    "series": "Current",
}, format="df")

Industrial Production (IIP)

# IIP is auto-routed: annual if no month_code, monthly if month_code is present
meta = esankhyiki.get_metadata("IIP", base_year="2011-12", frequency="Annually")

df = esankhyiki.get_data("IIP", {
    "base_year": "2011-12",
    "financial_year": "2023-24",
}, format="df")
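The routing rule in the comment above can be mirrored in a one-line helper, for instance to choose the matching frequency argument for get_metadata(). iip_frequency is a hypothetical name, and the month_code value used below is illustrative only.

```python
def iip_frequency(filters):
    """Mirror the routing rule: monthly when month_code is present, else annual."""
    return "Monthly" if "month_code" in filters else "Annually"


annual = iip_frequency({"base_year": "2011-12", "financial_year": "2023-24"})
monthly = iip_frequency({"base_year": "2011-12", "month_code": 4})
print(annual, monthly)
```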

Wholesale Price Index (WPI)

# WPI requires no indicator_code
meta = esankhyiki.get_metadata("WPI")

df = esankhyiki.get_data("WPI", {
    "base_year": "2011-12",
    "year": "2023-24",
    "series": "All Commodities",
}, format="df")

Annual Survey of Industries (ASI)

meta = esankhyiki.get_metadata("ASI", classification_year="2008")

df = esankhyiki.get_data("ASI", {
    "classification_year": "2008",
    "indicator_code": 1,
    "year": "2021-22",
}, format="df")

Economic Census (District-wise, EC)

The EC dataset has two modes: ranking (top/bottom N districts) and detail (row-level records).

# Check available filters
meta = esankhyiki.get_metadata("EC", indicator_code=1)  # EC6 (2013-14)

# Ranking mode - top 5 districts in Assam by establishments
data = esankhyiki.get_data("EC", {
    "indicator_code": 1,      # EC6
    "state": "18",            # Assam
    "mode": "ranking",
    "top5opt": "2",           # Top 5
})

# Detail mode - paginated row-level data (20 rows per page)
data = esankhyiki.get_data("EC", {
    "indicator_code": 1,
    "state": "18",
    "mode": "detail",
    "pageNum": "1",
})

Gender Statistics

indicators = esankhyiki.get_indicators("GENDER")
meta = esankhyiki.get_metadata("GENDER", indicator_code=1)

df = esankhyiki.get_data("GENDER", {
    "indicator_code": 1,
    "year": "2021",
}, format="df")

RBI / Trade Statistics

# RBI uses sub_indicator_code internally; pass indicator_code and it maps automatically
meta = esankhyiki.get_metadata("RBI", indicator_code=1)

df = esankhyiki.get_data("RBI", {
    "indicator_code": 1,
    "year": "2023-24",
}, format="df")

Renewable Energy Capacity (MNRE)

# type_of_renewable_energy_code: 1=Solar, 2=Wind, 3=Hydro, 4=Bio, 5=Total
meta = esankhyiki.get_metadata("MNRE", indicator_code=1)  # Solar — returns valid category_codes

df = esankhyiki.get_data("MNRE", {
    "type_of_renewable_energy_code": 1,  # Solar Power
    "state_code": 36,                    # All India
    "year": "2023",
}, format="df")

Error Handling

import esankhyiki
from esankhyiki.exceptions import (
    InvalidDatasetError,
    InvalidFilterError,
    APIError,
    NoDataError,
)

try:
    data = esankhyiki.get_data("PLFS", {
        "bogus_param": 1,
        "indicator_code": 1,
        "frequency_code": 1,
    })
except InvalidFilterError as e:
    print(e)  # Shows invalid params and valid alternatives
except InvalidDatasetError as e:
    print(e)  # Shows valid dataset names
except NoDataError as e:
    print(e)  # Valid request, but no matching rows
except APIError as e:
    print(e)  # Upstream API / network / server failure

Error contract:

Exception            When raised
InvalidDatasetError  Dataset name is not one of the 22 known datasets
InvalidFilterError   A filter param is missing, invalid, or not accepted by the endpoint
NoDataError          Request was valid but returned zero rows
APIError             Network failure, timeout, or upstream 5xx error
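Because an empty result is reported as NoDataError, callers that treat "no rows" as a normal outcome may want a thin wrapper. The sketch below is hypothetical: fetch_or_none is not part of the package, and the get_data and no_data_error parameters exist only so the helper can be exercised without network access. When they are omitted, it falls back to esankhyiki.get_data and esankhyiki.exceptions.NoDataError.

```python
def fetch_or_none(dataset, filters, get_data=None, no_data_error=None):
    """Return the fetched data, or None when the request matched no rows.

    Other errors (invalid filters, API failures) are left to propagate.
    """
    if get_data is None or no_data_error is None:
        # Imported lazily so the helper can be tested with stand-ins.
        import esankhyiki
        from esankhyiki.exceptions import NoDataError
        get_data = get_data or esankhyiki.get_data
        no_data_error = no_data_error or NoDataError
    try:
        return get_data(dataset, filters)
    except no_data_error:
        return None
```

Typical use would be result = fetch_or_none("PLFS", filters), followed by a check for None.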

Live API Notes

  • This package wraps live MoSPI endpoints. Some datasets may temporarily return No Data Found or upstream 5xx errors even when the code and filters are valid.
  • Those cases surface explicitly as NoDataError or APIError rather than silently returning empty results.
  • The recommended workflow is always: list_datasets() -> get_indicators() -> get_metadata() -> get_data().
  • The client retries on 429, 500, 502, 503, and 504 status codes (up to 3 times with backoff).
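For readers unfamiliar with the pattern, the retry behaviour described above can be sketched as follows. This illustrates retry with exponential backoff in general and is not the package's own code. The delay values, the doubling schedule, and the reading of "up to 3 times" as three retries after the first attempt are all assumptions.

```python
import time

RETRY_STATUSES = {429, 500, 502, 503, 504}


def call_with_retries(send, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call `send()` and retry while it returns a retryable status code.

    `send` returns a (status_code, body) tuple. One initial call plus up to
    `retries` further attempts are made, doubling the wait each time.
    """
    for attempt in range(retries + 1):
        status, body = send()
        if status not in RETRY_STATUSES or attempt == retries:
            return status, body
        sleep(base_delay * 2 ** attempt)


# Simulated endpoint: two 503 responses, then success.
responses = iter([(503, None), (503, None), (200, "ok")])
waits = []
result = call_with_retries(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

Passing sleep as a parameter keeps the example fast to run: the simulated call records the waits instead of actually sleeping.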

Testing

# Non-network contract tests only
pytest -q -m "not network"

# All tests including live endpoint calls
pytest -q

To verify release artifacts locally:

python -m build


Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

License

MIT License. See LICENSE for details.


Acknowledgments

Made in partnership with Bharat Digital in pursuit of modernising and humanising how governments use technology in service of the public.
