Overview

This session covers reinforcement learning (RL) fundamentals and practical training insights for post-training and reasoning-oriented LLM pipelines.

Papers

  1. ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
  2. Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
  3. JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Slides