Overview

This session covers reinforcement learning (RL) fundamentals and practical training insights for post-training and reasoning-oriented LLM pipelines.

Paper 1: ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper 2: Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper 3: JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Slides