StreamDiffusionV2 offline and online video diffusion inference.
StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation (MLSys 2026)
Tianrui Feng1, Zhi Li2, Shuo Yang2, Haocheng Xi2, Muyang Li3, Xiuyu Li1, Lvmin Zhang4, Keting Yang5, Kelly Peng6, Song Han7, Maneesh Agrawala4, Kurt Keutzer2, Akio Kodaira8, Chenfeng Xu†,1
1UT Austin, 2UC Berkeley, 3Nunchaku AI, 4Stanford University, 5Independent Researcher, 6First Intelligence, 7MIT, 8Shizhuku AI
† Project lead; correspondence to xuchenfeng@utexas.edu
Overview
StreamDiffusionV2 is an open-source interactive diffusion pipeline for real-time streaming applications. It scales across diverse GPU setups, supports flexible denoising steps, and delivers high FPS for creators and platforms. Further details are available on our project homepage.
News
- [2026-03-27] Added optional TAEHV-VAE support for offline and online inference via --use_taehv and USE_TAEHV=1.
- [2026-03-06] Updated the ring-buffer KV cache for efficient sliding window attention.
- [2026-01-26] 🎉 StreamDiffusionV2 is accepted by MLSys 2026!
- [2025-11-10] 🚀 We have released our paper on arXiv. Check it for more details!
- [2025-10-18] Released our model checkpoint on Hugging Face.
- [2025-10-06] 🔥 Our StreamDiffusionV2 is publicly released! Check our project homepage for more details.
Prerequisites
- OS: Linux with NVIDIA GPU
- CUDA-compatible GPU and drivers
Installation
conda create -n streamdiffusionv2 python=3.10
conda activate streamdiffusionv2
# PyPI
pip install streamdiffusionv2
# Optional but recommended for better throughput
pip install "streamdiffusionv2[flash-attn]"
If you are installing from a local checkout of this repository instead of PyPI:
conda create -n streamdiffusionv2 python=3.10
conda activate streamdiffusionv2
pip install .
# Optional but recommended for better throughput
pip install ".[flash-attn]"
The package install includes the Python dependencies required for both offline inference and the demo backend. The demo frontend still requires Node.js 18 as described in demo/README.md.
Download Checkpoints
# 1.3B Model
huggingface-cli download --resume-download Wan-AI/Wan2.1-T2V-1.3B --local-dir wan_models/Wan2.1-T2V-1.3B
huggingface-cli download --resume-download jerryfeng/StreamDiffusionV2 --local-dir ./ckpts --include "wan_causal_dmd_v2v/*"
# 14B Model
huggingface-cli download --resume-download Wan-AI/Wan2.1-T2V-14B --local-dir wan_models/Wan2.1-T2V-14B
huggingface-cli download --resume-download jerryfeng/StreamDiffusionV2 --local-dir ./ckpts --include "wan_causal_dmd_v2v_14b/*"
We use the 14B model from CausVid-Plus for the offline inference demo.
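Before launching inference it can help to confirm that the downloads landed where the later commands expect them. The following is a minimal sketch, not part of the package: report_missing is a hypothetical helper name, and it only checks that each path exists, not that the files inside are complete.

```shell
# Hypothetical helper (not shipped with StreamDiffusionV2): print every
# path that does not exist and return non-zero if any are missing.
report_missing() (
    missing=0
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    return "$missing"
)
```

For the 1.3B setup above, a call might look like: report_missing wan_models/Wan2.1-T2V-1.3B ckpts/wan_causal_dmd_v2v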
Optional: TAEHV-VAE Checkpoint
If you want to enable the lightweight TAEHV decoder, download its checkpoint once:
curl -L https://github.com/madebyollin/taehv/raw/main/taew2_1.pth -o ckpts/taew2_1.pth
The offline inference code can also download this file automatically on first use, but keeping it in ckpts/taew2_1.pth avoids that extra startup step.
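If the download step is part of a setup script that may run more than once, it can be wrapped so the file is fetched only when absent. This is a sketch under stated assumptions: fetch_if_missing is a hypothetical helper name, and the temporary .part suffix is an illustrative choice rather than anything the project prescribes.

```shell
# Hypothetical helper: download URL to DEST unless DEST already exists
# and is non-empty.
fetch_if_missing() (
    url=$1
    dest=$2
    if [ -s "$dest" ]; then
        echo "found $dest, skipping download"
        return 0
    fi
    mkdir -p "$(dirname "$dest")"
    # Download to a temporary name first so an interrupted transfer never
    # leaves a truncated file at the final path.
    # -f fails on HTTP errors, -L follows redirects.
    curl -fL "$url" -o "$dest.part" && mv "$dest.part" "$dest"
)
```

With the URL and path from the command above, the call would be: fetch_if_missing https://github.com/madebyollin/taehv/raw/main/taew2_1.pth ckpts/taew2_1.pth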
Offline Inference
All offline inference entrypoints are unified under run_v2v.sh.
Choose one mode first:
- single: single-GPU streaming inference
- single-wo: single-GPU inference without Stream-batch
- pipe: multi-GPU pipeline inference
Quick start:
./run_v2v.sh single
./run_v2v.sh single-wo
./run_v2v.sh pipe
./run_v2v.sh pipe --profile
Use --profile only when you want synchronized throughput measurements.
The legacy wrappers v2v.sh, v2v_wo.sh, and pipe_v2v.sh still work, but they now forward to the same shared entrypoint.
Common Arguments
The most important options are:
- --config_path: model config YAML
- --checkpoint_folder: checkpoint directory
- --video_path: input video
- --prompt_file_path: prompt text file
- --output_folder: output directory
- --height and --width: output resolution
- --fps: target output FPS
- --step: number of denoising steps used during inference
- --use_taehv: use Wan stream encode with the TAEHV decoder for faster VAE decoding
You can pass overrides either as CLI flags or as environment variables. For example:
OUTPUT_FOLDER=outputs/run_single ./run_v2v.sh single
VIDEO_PATH=examples/original.mp4 PROMPT_FILE_PATH=examples/prompt.txt ./run_v2v.sh single-wo
NPROC_PER_NODE=2 MASTER_PORT=29511 ./run_v2v.sh pipe
./run_v2v.sh single --use_taehv
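The environment-variable form makes it easy to keep per-machine defaults in one place while still allowing one-off overrides. Below is a minimal sketch, assuming only the variable names shown in the examples above; with_defaults is a hypothetical wrapper, and the default values are simply the ones used in those examples.

```shell
# Hypothetical wrapper: fill in defaults for variables the caller has not
# set, then run the given command with them exported.
# The body runs in a subshell, so the caller's environment is untouched.
with_defaults() (
    # ":=" assigns only when the variable is unset or empty.
    : "${OUTPUT_FOLDER:=outputs/run_single}"
    : "${VIDEO_PATH:=examples/original.mp4}"
    : "${PROMPT_FILE_PATH:=examples/prompt.txt}"
    export OUTPUT_FOLDER VIDEO_PATH PROMPT_FILE_PATH
    "$@"
)
```

A call such as with_defaults ./run_v2v.sh single uses the defaults, while exporting OUTPUT_FOLDER beforehand overrides just that one value.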
Single GPU
This is the standard offline path when you run on one GPU.
./run_v2v.sh single \
--config_path configs/wan_causal_dmd_v2v.yaml \
--checkpoint_folder ckpts/wan_causal_dmd_v2v \
--output_folder outputs/ \
--prompt_file_path examples/prompt.txt \
--video_path examples/original.mp4 \
--height 480 \
--width 832 \
--fps 16 \
--step 2
To enable the TAEHV decoder in this mode:
./run_v2v.sh single --use_taehv
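To compare quality and speed across denoising step counts, one option is to generate one command per --step value, each writing to its own output folder. This is a sketch only: print_step_sweep is a hypothetical helper, the outputs/step_N folder naming is an assumption, and the function prints the commands rather than running them so they can be reviewed first.

```shell
# Hypothetical helper: print one run_v2v.sh command per step count.
# Usage: print_step_sweep MODE STEP [STEP...]
print_step_sweep() (
    mode=$1
    shift
    for step in "$@"; do
        # Each run gets its own output folder so results are not overwritten.
        echo "./run_v2v.sh $mode --step $step --output_folder outputs/step_$step"
    done
)
```

For example, print_step_sweep single 1 2 4 lists three commands; piping that output to sh would execute them in order.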
Multi-GPU
Use this mode when you want to split inference across multiple GPUs.
./run_v2v.sh pipe \
--config_path configs/wan_causal_dmd_v2v.yaml \
--checkpoint_folder ckpts/wan_causal_dmd_v2v \
--output_folder outputs/ \
--prompt_file_path examples/prompt.txt \
--video_path examples/original.mp4 \
--height 480 \
--width 832 \
--fps 16 \
--step 2
# --schedule_block # optional: enable block scheduling
To enable the TAEHV decoder in pipeline mode:
./run_v2v.sh pipe --use_taehv
Notes:
- --schedule_block is optional and can improve throughput on some multi-GPU setups.
- Adjust NPROC_PER_NODE, --height, --width, and --fps to match your hardware and target workload.
- ./run_v2v.sh pipe --profile is intended for profiling runs, not normal benchmarking or deployment.
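When choosing NPROC_PER_NODE, a starting point is the number of GPUs visible to the process. The sketch below assumes one process per visible GPU is wanted, which the project does not state explicitly; count_device_list and suggest_nproc are hypothetical helper names. It relies on the standard CUDA_VISIBLE_DEVICES variable and on nvidia-smi -L, which lists one GPU per line.

```shell
# Count entries in a comma-separated device list such as "0,1,3".
count_device_list() (
    list=$1
    [ -n "$list" ] || { echo 0; return 0; }
    # Turn commas into newlines and count the non-empty lines.
    printf '%s\n' "$list" | tr ',' '\n' | grep -c . || true
)

# Suggest a process count: honour CUDA_VISIBLE_DEVICES when it is set
# (an empty value means no GPUs are visible), otherwise ask nvidia-smi.
suggest_nproc() (
    if [ "${CUDA_VISIBLE_DEVICES+set}" = set ]; then
        count_device_list "$CUDA_VISIBLE_DEVICES"
    elif command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L | grep -c '^GPU ' || true
    else
        echo 0
    fi
)
```

A launch could then read: NPROC_PER_NODE=$(suggest_nproc) ./run_v2v.sh pipe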
Online Inference (Web UI)
A minimal web demo is available under demo/. For setup and startup, please refer to demo/README.md.
- Access in a browser after startup: http://0.0.0.0:7860 or http://localhost:7860
- To enable the TAEHV decoder in the web demo, start it with USE_TAEHV=1.
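Model loading can take a while, so a script that opens the demo may want to wait until the port answers. The following is a minimal sketch, assuming the demo responds to a plain HTTP GET on the address above; wait_for_http is a hypothetical helper and the retry count and one-second delay are arbitrary choices.

```shell
# Hypothetical helper: poll URL until it answers or TRIES attempts fail.
# Usage: wait_for_http URL [TRIES]
wait_for_http() (
    url=$1
    tries=${2:-30}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -s hides progress output, -f treats HTTP errors as failure,
        # --max-time bounds each attempt to two seconds.
        if curl -sf --max-time 2 -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        # Pause between attempts, but not after the last one.
        [ "$i" -lt "$tries" ] && sleep 1
    done
    return 1
)
```

For example: wait_for_http http://localhost:7860 60 && echo "demo is up"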
To-do List
- Demo and inference pipeline.
- Dynamic scheduler for various workloads.
- Training code.
- FP8 support.
- TensorRT support.
Acknowledgements
StreamDiffusionV2 is inspired by the prior works StreamDiffusion and StreamV2V. Our Causal DiT builds upon CausVid, and the rolling KV cache design is inspired by Self-Forcing.
We are grateful to the team members of StreamDiffusion for their support. We also thank First Intelligence and the Daydream team for their great feedback.
We especially thank the Daydream team for the great collaboration and for incorporating our StreamDiffusionV2 pipeline into their cool Demo UI.
Citation
If you find this repository useful in your research, please consider giving a star ⭐ or a citation.
@article{feng2025streamdiffusionv2,
title={StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation},
author={Feng, Tianrui and Li, Zhi and Yang, Shuo and Xi, Haocheng and Li, Muyang and Li, Xiuyu and Zhang, Lvmin and Yang, Keting and Peng, Kelly and Han, Song and others},
journal={arXiv preprint arXiv:2511.07399},
year={2025}
}