Last released Aug 13, 2025
Open-source pipeline that converts human-written rubrics into LLM-based reward functions for RL and RLHF training
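As a rough illustration of the idea (not this project's actual API), a rubric can be treated as a list of weighted criteria, with a judge scoring each criterion for a given prompt and response. The names `Criterion`, `make_reward_fn`, and `stub_judge` are hypothetical, and the judge here is a keyword-matching stand-in for what would presumably be an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Criterion:
    """One human-written rubric line and its relative weight (hypothetical schema)."""
    text: str
    weight: float = 1.0


# A judge maps (criterion text, prompt, response) to a score in [0, 1].
# In a real pipeline this would wrap an LLM call; here it is any callable.
Judge = Callable[[str, str, str], float]


def make_reward_fn(rubric: Sequence[Criterion], judge: Judge) -> Callable[[str, str], float]:
    """Build a scalar reward: the weighted mean of per-criterion judge scores."""
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric must have positive total weight")

    def reward(prompt: str, response: str) -> float:
        weighted = 0.0
        for c in rubric:
            # Clamp the judge output so a misbehaving judge cannot push
            # the reward outside [0, 1].
            score = min(1.0, max(0.0, judge(c.text, prompt, response)))
            weighted += c.weight * score
        return weighted / total_weight

    return reward


def stub_judge(criterion: str, prompt: str, response: str) -> float:
    """Toy stand-in for an LLM judge: passes if the criterion's last word appears in the response."""
    keyword = criterion.split()[-1].lower()
    return 1.0 if keyword in response.lower() else 0.0


rubric = [
    Criterion("Response mentions units", weight=2.0),
    Criterion("Response mentions sources", weight=1.0),
]
reward = make_reward_fn(rubric, stub_judge)
```

The resulting `reward(prompt, response)` callable returns a single float, which is the shape most RL training loops expect from a reward function. Swapping `stub_judge` for a function that prompts a model leaves the aggregation logic unchanged.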
Supported by