Now enrolling · 2025

Learn Reinforcement Learning.

Build, visualize, and train intelligent agents. A modern curriculum designed for students in grades 7–10 — rigorous, free, and actually fun.

Apply Now — It's Free
How RL Works
🤖
Agent
Observes state → chooses action
↓  action sent to environment
🌍
Environment
Returns reward + next state
↓  reward drives learning
📈
Policy Update
Agent improves over time
↺  Repeat until optimal
12+
Hands-on modules
7–10
Grade levels served
$0
Completely free
Future potential
Built for curious young minds.
Every concept is taught through a problem. Students don't learn RL to pass a test — they learn it to build something real.
🤖
Train Real Agents
Students build agents and watch them learn through trial and error, from grid worlds to CartPole, so they see RL working instead of reading theory off a slide.
📊
Visual Learning
Complex concepts like the Bellman equation and Q-tables are brought to life through animated visualizations and interactive demos.
🧠
Deep Concepts, Simple Language
From explore vs. exploit to deep Q-networks, we translate cutting-edge AI research into something a 7th grader can genuinely understand.
💻
Python-First Projects
Students write real Python code, not drag-and-drop blocks. They leave with working projects they can show off, build on, and present.
🚀
Built by a Student
ReinforceLearn was created by a high school student who actually does this work — so the curriculum meets students exactly where they are.
🎯
Problem-Centered Design
Every module starts with a challenge. Students are curious before they're taught — which makes everything stick.

A 12-week journey into AI.

From zero to training intelligent agents. Each module builds on the last — no prior AI experience needed, just curiosity. Three phases: foundations, core RL, and advanced deep RL.


Phase 1 — Foundations

01
Week 1 · Introduction
What Is Reinforcement Learning?
The big picture: what makes RL different from supervised and unsupervised learning. Students explore how agents interact with environments through actions and observations, and why reward-driven learning is so powerful.
Agent & Environment · Reward Signal · RL vs ML · Real-World Examples
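The agent-environment loop from Week 1 can be sketched in a few lines of Python. This is only an illustration, not curriculum material: the `Corridor` environment, its `reset` and `step` methods, and the `run_episode` helper are hypothetical names invented for this example.

```python
class Corridor:
    """Hypothetical toy environment: start in cell 0, reach the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action: 0 = move left, 1 = move right (walls clamp the position)
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=50):
    """Observe state, choose action, receive reward and next state, repeat."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


def always_right(state):
    return 1


def always_left(state):
    return 0


print(run_episode(Corridor(), always_right))  # reaches the goal: 1.0
print(run_episode(Corridor(), always_left))   # never reaches it: 0.0
```

Neither policy here learns anything yet. The point is only the shape of the loop that every later module builds on.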
02
Week 2 · Core Concepts
States, Actions & the Markov Property
We formalize the RL framework with Markov Decision Processes. Students learn what a state really is, how action spaces are defined, and why the Markov property is the mathematical foundation everything else is built on.
MDPsState SpacesAction SpacesMarkov Property
03
Week 3 · Rewards & Goals
Reward Design & Discount Factors
Designing reward functions is one of the hardest parts of RL. Students learn what makes a good (and bad) reward, why the trade-off between short-term and long-term rewards matters, and how the discount factor γ controls how far ahead an agent thinks.
Reward Shaping · Discount Factor γ · Sparse Rewards · Return G_t
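As a rough illustration of how γ changes what an agent cares about, the return G_t can be computed as the sum of future rewards, each weighted by one more factor of γ. The function name `discounted_return` and the sample reward list are assumptions made for this sketch.

```python
def discounted_return(rewards, gamma):
    """Return r_1 + gamma * r_2 + gamma^2 * r_3 + ... for a list of rewards."""
    g = 0.0
    # Work backwards so each step is just: reward now + gamma * return later.
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g


sparse_rewards = [0.0, 0.0, 1.0]  # the only reward arrives on the third step

for gamma in (0.0, 0.5, 0.9, 1.0):
    print(gamma, round(discounted_return(sparse_rewards, gamma), 4))
```

With γ = 0 the agent ignores the delayed reward entirely. With γ = 1 it values the reward as if it arrived immediately.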

Phase 2 — Core RL

04
Week 4 · Value Functions
Value Functions & the Bellman Equation
The mathematical heart of RL. Students derive the Bellman equation from first principles, implement value and action-value functions, and see how iterating the Bellman backup leads to optimal policies.
V(s) and Q(s,a) · Bellman Equation · Optimal Policy · Policy Evaluation
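One way to see the Bellman backup in action is value iteration on a tiny deterministic world. The sketch below assumes a hypothetical five-cell corridor where reaching the last cell pays a reward of 1. The names `step` and `value_iteration` and the constants are illustrative, not taken from the course.

```python
GAMMA = 0.9
N_STATES = 5           # cells 0..4
GOAL = N_STATES - 1    # cell 4 is terminal
ACTIONS = (-1, +1)     # move left or move right


def step(state, action):
    """Deterministic model: returns (next_state, reward)."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def value_iteration(tol=1e-8):
    values = [0.0] * N_STATES  # the terminal cell keeps value 0
    while True:
        delta = 0.0
        for s in range(GOAL):  # sweep every non-terminal state
            candidates = []
            for a in ACTIONS:
                s2, r = step(s, a)
                candidates.append(r + GAMMA * values[s2])
            best = max(candidates)  # Bellman optimality backup
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


values = value_iteration()
print([round(v, 3) for v in values])  # [0.729, 0.81, 0.9, 1.0, 0.0]
```

Each cell's value is the goal reward discounted once per extra step needed to reach it, so the greedy policy with respect to these values always moves right.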
05
Week 5 · Tabular RL
Q-Learning & Q-Tables
Students build their first Q-learning agent from scratch. They construct a Q-table, implement the TD update rule, and watch the agent gradually learn to navigate a grid world — seeing RL click in real time.
Q-Tables · TD Learning · Learning Rate α · Grid World Lab
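A minimal sketch of the tabular Q-learning update, assuming a hypothetical one-dimensional corridor in place of the course's grid world. The helper names (`env_step`, `q_learning_update`, `train`) and the hyperparameter values are assumptions for illustration.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = left, 1 = right
GOAL = 4          # cells 0..4, cell 4 ends the episode


def env_step(state, action):
    next_state = min(max(state + (1 if action == 1 else -1), 0), GOAL)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def q_learning_update(q, state, action, reward, next_state, done,
                      actions, alpha=0.1, gamma=0.9):
    """TD update: move Q(s, a) a fraction alpha toward the TD target."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # the Q-table: (state, action) -> estimated value
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # random behaviour; Q-learning is off-policy
            next_state, reward, done = env_step(state, action)
            q_learning_update(q, state, action, reward, next_state, done, ACTIONS)
            state = next_state
    return q


q = train()
greedy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(greedy)  # the learned greedy action for each non-terminal cell
```

Even though the agent behaves randomly while training, the greedy policy read off the table afterwards heads toward the goal.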
06
Week 6 · Exploration
Explore vs. Exploit
Should your agent try new things or stick with what works? Students experiment with epsilon-greedy strategies, decaying exploration schedules, and the multi-armed bandit problem — building deep intuition for one of RL's core dilemmas.
Epsilon-Greedy · Exploration Decay · Multi-Armed Bandit · UCB Strategy
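Epsilon-greedy with a decaying exploration schedule can be sketched on a simple multi-armed bandit. The win probabilities, decay rate, and function names below are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the best-looking arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda arm: estimates[arm])


def run_bandit(win_probs, steps=5000, eps_start=1.0, eps_min=0.05,
               decay=0.995, seed=0):
    rng = random.Random(seed)
    estimates = [0.0] * len(win_probs)  # running average reward per arm
    pulls = [0] * len(win_probs)
    epsilon = eps_start
    for _ in range(steps):
        arm = epsilon_greedy(estimates, epsilon, rng)
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        pulls[arm] += 1
        # Incremental mean: nudge the estimate toward the new reward.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        # Explore a lot early on, then mostly exploit.
        epsilon = max(eps_min, epsilon * decay)
    return estimates, pulls


estimates, pulls = run_bandit([0.1, 0.5, 0.9])
print([round(e, 2) for e in estimates], pulls)
```

Early pulls are spread across all arms. As epsilon decays, most pulls shift to the arm with the highest estimated payoff.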
07
Week 7 · Policy Methods
Policy Iteration & SARSA
Students learn the difference between on-policy and off-policy learning. We compare SARSA (on-policy) to Q-learning (off-policy), implement both, and explore how policy iteration converges to optimal solutions.
SARSA · On-policy vs Off-policy · Policy Iteration · Convergence
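The on-policy versus off-policy distinction comes down to which next-step value goes into the update target. A small sketch, with made-up state names and Q-values used purely for illustration:

```python
def sarsa_target(q, reward, next_state, next_action, gamma=0.9):
    """On-policy: bootstrap from the action the agent will actually take next."""
    return reward + gamma * q[(next_state, next_action)]


def q_learning_target(q, reward, next_state, actions, gamma=0.9):
    """Off-policy: bootstrap from the best next action, whatever is taken."""
    return reward + gamma * max(q[(next_state, a)] for a in actions)


# Hypothetical Q-table entries for a single next state "s1".
q = {("s1", "left"): 0.2, ("s1", "right"): 1.0}
actions = ("left", "right")

# Suppose exploration makes the agent pick "left" next.
print(round(sarsa_target(q, 0.0, "s1", "left"), 2))        # 0.18
print(round(q_learning_target(q, 0.0, "s1", actions), 2))  # 0.9
```

SARSA's target reflects the exploratory move the agent is about to make, while Q-learning's target assumes the agent will act greedily from the next state onward.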

Phase 3 — Deep RL

08
Week 8 · Neural Networks
Neural Networks for RL
Before building DQNs, students need to understand neural networks. Layers, activation functions, backpropagation, and function approximation — everything that sets the stage for Deep RL.
Feedforward Networks · Activation Functions · Backpropagation · Function Approximation
09
Week 9 · Deep Q-Networks
DQNs — When Q-Tables Aren't Enough
When state spaces are too large for a table, we use neural networks. Students learn the DQN architecture, experience replay, and target networks — the innovations that powered DeepMind's Atari-playing agents.
DQN Architecture · Experience Replay · Target Networks · Loss Functions
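Experience replay is the easiest of these ideas to sketch without a deep learning library. The `ReplayBuffer` class below is a hypothetical minimal version built on the standard library. A real DQN would sample batches like this and feed them to a neural network, which is omitted here.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size memory of past transitions, sampled at random for training."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop off first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Random sampling breaks up the correlation between consecutive steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


buffer = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buffer.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)

print(len(buffer))       # 3: only the most recent transitions are kept
print(buffer.sample(2))  # two transitions drawn without replacement
```

The capacity and batch size here are tiny so the eviction behaviour is easy to see. Practical buffers hold far more transitions.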
10
Week 10 · OpenAI Gym
Training Agents in Real Environments
Students put everything together using OpenAI Gym (now maintained as Gymnasium): training agents on CartPole and MountainCar, tuning hyperparameters, and interpreting training curves.
OpenAI Gym · CartPole · Hyperparameter Tuning · Training Curves
11
Week 11 · Advanced Topics
Policy Gradients & Actor-Critic
A peek at the frontier. Students are introduced to policy gradient methods and the actor-critic architecture that combines the best of value-based and policy-based approaches.
REINFORCE Algorithm · Policy Gradients · Actor-Critic · Advantage Function
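For a taste of policy gradients, here is a rough REINFORCE-style sketch on a two-armed bandit with a softmax policy. The setup is an assumption chosen for brevity: the rewards are fixed, each episode is a single step, and there is no baseline or critic.

```python
import math
import random


def softmax(prefs):
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards=(0.0, 1.0), steps=200, alpha=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)  # policy parameters, one preference per arm
    arms = list(range(len(rewards)))
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(arms, weights=probs)[0]
        ret = rewards[action]  # one-step episode, so the return is the reward
        for k in arms:
            # Gradient of log pi(action) with respect to preference k.
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            prefs[k] += alpha * ret * grad_log_pi
    return softmax(prefs)


probs = reinforce_bandit()
print([round(p, 3) for p in probs])  # probability mass shifts toward arm 1
```

Unlike Q-learning, nothing here estimates a value table. The policy's own parameters are pushed directly toward actions that earned reward.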
12
Week 12 · Capstone
Build & Present Your Own Agent
Students choose a challenge environment, design their agent architecture, train it from scratch, and present results. It's a project they can show at science fairs, in competitions, and on college applications.
Full RL Pipeline · Environment of Choice · Performance Analysis · Project Showcase

AI education shouldn't wait for college.

ReinforceLearn was built on a simple frustration: reinforcement learning — one of the most powerful areas of modern AI — is almost completely absent from K-12 education. Most students don't encounter it until graduate school, if ever.

We believe that's backwards. The concepts behind RL are deeply intuitive. They map onto how humans and animals learn naturally. Students in 7th grade are ready to understand them.

So we built the curriculum we wished existed. Rigorous enough to be meaningful. Approachable enough to be fun. Built by a student, for students.

🤖
Agent
Observes state, chooses action
↓ action
🌍
Environment
Returns reward + new state
↓ reward
📈
Policy Update
Agent learns to maximize reward
↺ repeat until optimal
Our Values
🔬
Rigor Without Gatekeeping
We don't water down the math — we make it accessible. Real concepts, real code, taught with patience and clarity.
🌱
Curiosity Over Credentials
No prerequisites. No barriers. If you're curious about AI, you belong here. We start from zero, together.
🤝
Student-Built, Student-First
This curriculum was designed by someone who went through the same confusion. We meet students exactly where they are.

Real results from real students.

Don't take our word for it. Here's what students, parents, and teachers are saying about ReinforceLearn.

★★★★★
"I went in thinking reinforcement learning was something only PhD students could understand. By Week 3, I had a working Q-learning agent navigating a grid world on my own laptop. The way everything connects (the math, the code, the intuition), it just clicks. I've never felt smarter in my life."
Jordan K.
8th Grade Student, Portland OR
★★★★★
"My daughter came home explaining the explore-exploit tradeoff using a restaurant analogy she invented herself. The curriculum teaches kids to think, not just follow steps."
Maya P.
Parent of 9th Grader
★★★★★
"As a CS teacher, I've tried a dozen AI curricula. ReinforceLearn is the only one where students are genuinely building something from scratch, and feeling proud of it."
T. Reyes
CS Teacher, Washington Middle School
★★★★★
"The Bellman equation used to intimidate me. Now I think about it like updating a game strategy. The visual explanations completely changed how I understand AI."
Aiden S.
10th Grade Student
★★★★★
"The capstone project was the highlight of my son's school year. He stayed up until midnight training his CartPole agent and woke me up to show me when it finally worked."
L. Washington
Parent of 7th Grader

Got questions? We've got answers.

Everything you need to know about ReinforceLearn. If you don't see your question here, reach out through the contact page.

Let's build the future of AI education.

Whether you're a student ready to enroll, a teacher looking to bring ReinforceLearn to your classroom, or just curious — we'd love to hear from you.

📧
Email
reinforcelearn@gmail.com
📍
Location
Portland, Oregon
Response Time
Usually within 24 hours
