Reinforcement Learning Masterclass
Published 5/2025
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 8.16 GB | Duration: 20h 3m
Master Reinforcement Learning: From Basics to Advanced Applications
What you'll learn
Understand the key concepts and components of reinforcement learning, including MDPs, policies, rewards, and value functions
Apply algorithms like SARSA, Q-Learning, REINFORCE, PPO, TRPO, SAC, and DQN in Python
Use modern libraries like Stable-Baselines3 and TF-Agents to solve real-world problems with RL
Implement actor-critic and policy gradient methods using neural networks
Understand how to apply reinforcement learning in multi-agent and multi-objective environments
Build end-to-end projects such as inventory management, recommendation systems, and resource allocation with RL
Requirements
A basic understanding of Python and NumPy is recommended. Familiarity with probability, linear algebra, or machine learning will help but is not mandatory; the course starts from the foundations and builds up gradually.
Description
Welcome to the Reinforcement Learning Course! This course is designed to take you from the basics of Reinforcement Learning (RL) to advanced techniques and applications. Whether you're a data scientist, researcher, software developer, or simply curious about AI, this course will provide you with valuable insights and hands-on experience in the field of RL.
In this course, you will:
Understand the fundamentals of Reinforcement Learning: Learn about the core components of RL, including agents, environments, actions, rewards, and states.
Explore Markov Decision Processes (MDPs): Study the concepts of policies, value functions, and solving MDPs using dynamic programming.
Solve Multi-Armed Bandit Problems: Understand ε-greedy actions, Thompson sampling, and the exploration-exploitation trade-off.
Master Temporal-Difference Learning: Learn about TD learning, SARSA, and Q-Learning.
Learn Deep Q-Learning: Discover Deep Q-Networks (DQN), experience replay, and target networks.
Apply Policy Gradient Methods: Explore algorithms like REINFORCE, Advantage Actor-Critic (A2C), and Asynchronous Advantage Actor-Critic (A3C).
Implement Advanced Techniques: Learn about Proximal Policy Optimization (PPO), Trust Region Policy Optimization (TRPO), and more.
Understand Evolution Strategies and Genetic Algorithms: Get an introduction to these powerful optimization techniques.
Explore Model-Based RL: Learn about dynamic programming and the Dyna-Q algorithm.
Investigate Hierarchical RL: Study hierarchical policies, the options framework, and MAXQ value function decomposition.
Examine Curiosity-Driven Exploration: Understand intrinsic motivation in RL and curiosity-driven agents.
Learn Bayesian Methods in RL: Study Bayesian optimization with Gaussian processes and Thompson sampling.
Discover Distributed RL: Explore scalable RL architectures and distributed experience replay.
Understand Meta-Reinforcement Learning: Learn about learning to learn and gradient-based meta-RL.
Explore Multi-Agent RL: Study multi-agent systems, cooperative vs. competitive scenarios, and advanced algorithms like MADDPG and MAPPO.
Focus on Safe RL: Learn about safety constraints, constrained policy optimization, and risk-aware RL.
Study Inverse RL: Understand the basics, applications, and reward shaping in inverse RL.
Perform Off-Policy Evaluation: Learn about importance sampling, doubly robust estimators, and other methods.
Use Function Approximation in RL: Discover linear function approximation and the role of neural networks in RL.
Optimize with Sequential Model-Based Techniques: Learn about Bayesian optimization and Gaussian processes in RL.
Balance Multiple Objectives in RL: Study multi-objective RL and Pareto optimality.
Understand Deep Recurrent Q-Networks (DRQN): Learn about memory-augmented neural networks and applications in partially observable environments.
Explore Implicit Quantile Networks (IQN): Study distributional RL and quantile regression.
Investigate Neural Episodic Control (NEC): Understand episodic memory in RL and the NEC algorithm.
Implement Policy Iteration with Function Approximation: Learn about iterative policy evaluation and generalized policy iteration.
Apply RL in Various Fields: Study applications of RL in robotics, autonomous systems, finance, supply chain management, and marketing.
By the end of this course, you will have a thorough understanding of Reinforcement Learning and be equipped to apply it to solve complex problems in various domains. Join us and become proficient in this cutting-edge field!
Overview
Section 1: Introduction
Lecture 1 Introduction
Lecture 2 How You Should Study This Course?
Lecture 3 Curriculum
Lecture 4 What's Reinforcement Learning?
Lecture 5 Components of Reinforcement Learning
Section 2: Mathematical Foundations
Lecture 6 Probability Theory Essentials
Lecture 7 Markov Decision Processes
Lecture 8 Markov Decision Processes - Case
Lecture 9 Markov Decision Processes - Python
Lecture 10 Markov Decision Processes Code Output
Lecture 11 Dynamic Programming Principles
Lecture 12 Dynamic Programming - Case
Lecture 13 Dynamic Programming - Mathematical Model
Lecture 14 Dynamic Programming - Python Code
Lecture 15 Dynamic Programming - Output
Lecture 16 Probability Distributions - Theory
Section 3: Dynamic Programming
Lecture 17 Policy Evaluation
Lecture 18 Iterative Policy Evaluation Algorithm with Python
Section 4: Monte Carlo Methods
Lecture 19 Blackjack - Intro
Lecture 20 Blackjack Python
Lecture 21 Blackjack Output
Section 5: Temporal Difference Learning
Lecture 22 What is SARSA?
Lecture 23 SARSA - Taxi Implementation
Lecture 24 SARSA - Taxi & Visual
Lecture 25 Q-Learning Intro
Lecture 26 Frozen Lake
Lecture 27 Frozen Lake Python
Lecture 28 Cliff Walking Python
Section 6: Function Approximation
Lecture 29 Function Approximation in RL
Lecture 30 Neural Networks in Reinforcement Learning
Section 7: Policy Gradient Methods
Lecture 31 What is REINFORCE?
Lecture 32 REINFORCE - Python
Lecture 33 Generalized Advantage Estimation (GAE)
Lecture 34 Generalized Advantage Estimation (GAE) - Python
Lecture 35 Advantage Actor-Critic (A2C)
Lecture 36 Asynchronous Advantage Actor-Critic (A3C)
Lecture 37 Deterministic Policy Gradient (DPG)
Lecture 38 DDPG (Deep Deterministic Policy Gradient)
Lecture 39 TD3 (Twin Delayed DDPG)
Lecture 40 SAC (Soft Actor-Critic)
Lecture 41 TRPO Intro
Lecture 42 Trust Region Policy Optimization (TRPO) - Python 1
Lecture 43 Trust Region Policy Optimization (TRPO) - Python 2
Lecture 44 Trust Region Policy Optimization (TRPO) - Python 3
Lecture 45 Trust Region Policy Optimization (TRPO) - Python 4
Lecture 46 TRPO - Output
Lecture 47 Proximal Policy Optimization
Lecture 48 ME-TRPO
Section 8: Deep Q-Networks
Lecture 49 DQN Intro
Section 9: Hierarchical Reinforcement Learning
Lecture 50 Hierarchical Reinforcement Learning: Intro
Lecture 51 HRL Python - 1
Lecture 52 HRL Python - 2
Lecture 53 HRL Python - Output
Section 10: Imitation Learning & Inverse Reinforcement Learning
Lecture 54 Intro
Section 11: Stable-Baselines3 Projects
Lecture 55 CartPole-v1 - Proximal Policy Optimization
Section 12: Pyqlearning Projects
Lecture 56 Simulated Annealing - Traveling Salesman Problem
Section 13: Multi-Agent Reinforcement Learning
Lecture 57 Introduction to Multi-Agent Reinforcement Learning
Lecture 58 MARL Types
Lecture 59 MARL Training
Lecture 60 MARL Challenges
Lecture 61 MARL - Predator & Prey
Lecture 62 MARL - Predator & Prey Animated Outputs
Section 14: Multi-Objective Reinforcement Learning
Lecture 63 MORL Intro
Lecture 64 MORL Python - 1
Lecture 65 MORL Python - 2
Lecture 66 MORL Python - Output
Section 15: TF-Agents Projects
Lecture 67 What is CartPole
Lecture 68 CartPole with DQN
Section 16: Safe Reinforcement Learning
Lecture 69 Safe RL with Python
Section 17: Sequential Decision Analytics
Lecture 70 Sequential Decision Making Intro
Lecture 71 SDA Project with Julia - 1
Lecture 72 Dynamic Inventory Management - Python
Lecture 73 Adaptive Market Planning
Lecture 74 Portfolio Management
Lecture 75 Airline Pricing with Python - Code
Lecture 76 Airline Pricing - Output
Lecture 77 SDA Project with Julia - 2
Section 18: Advanced Topics in Reinforcement Learning
Lecture 78 Recurrent Replay Distributed DQN (R2D2) with Python
Lecture 79 C51
Section 19: Real-World Applications
Lecture 80 RL in Resource Management
Lecture 81 RL in Network Optimization - Part 1
Lecture 82 RL in Network Optimization - Part 2
Lecture 83 RL in Recommendation System
Lecture 84 RL in Inventory Management
Section 20: Goodbye!
Lecture 85 Closure
This course is for anyone who wants to learn reinforcement learning from scratch and apply it to real-world problems — whether you're a data scientist, engineer, researcher, or an advanced student aiming to master RL from both theoretical and practical angles.