Advanced Observability: Real-World Monitoring And Logging
Published 3/2025
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 12.71 GB | Duration: 12h 13m
Master observability with hands-on projects, real-world scenarios, and job-ready skills for SREs, DevOps, and Cloud.
What you'll learn
Build a complete Observability Stack using open-source tools – deploy and integrate Prometheus for metrics, Grafana/Loki for logs
Live Labs from Scratch: Follow along with live lab setups built from the ground up. With access to a cloud account, you can replicate every environment and expe
Diagnose real production issues with observability data – practice troubleshooting latency, errors, and failures in complex systems by correlating logs, metrics
Design Service Level Objectives (SLOs) and Alerts – define reliability targets (SLO/Error Budgets) and configure alerting rules that inform you of problems.
Implement Observability in Cloud & Kubernetes environments – monitor containerized applications, use Kubernetes event logs and metrics, and deploy in k8s
Best Practices & Performance Tuning – learn advanced tips (optimizing logging levels, handling high-cardinality metrics, and minimizing observability overhead)
Requirements
Basic Knowledge of DevOps & Containers: Familiarity with Docker and fundamental DevOps concepts will help in understanding the deployment of observability tools.
General Cloud Understanding: Experience with any cloud provider (AWS, Azure, GCP) is recommended. You should have access to a free-tier or personal cloud account for optional cloud-based lab exercises.
Linux Command Line Basics: Ability to navigate and run simple commands on a Linux terminal is needed, since many tools are deployed on Linux or via Docker.
Prior Monitoring/Logging Exposure (Optional): Basic experience with monitoring or logging (even at a beginner level) will be beneficial but not strictly required – we will recap the fundamentals before diving deep.
Hardware/Software: A computer (Windows/Mac/Linux) capable of running multiple Docker containers for labs, and an internet connection. We will use free/open-source software throughout the course.
Eagerness to Learn by Doing: A willingness to set up environments, experiment with configurations, and troubleshoot will greatly enhance your learning experience in this hands-on course.
Description
Observability is more than a buzzword – it's a critical skill set for today’s SREs and DevOps engineers. “Advanced Observability” is a project-based course that goes beyond theory, immersing you in real-world scenarios. In this course, you will build and break things on purpose: instrument applications with telemetry, set up a full observability stack, and troubleshoot complex systems just as you would on the job. Our approach is formal in coverage yet slightly conversational in tone, making advanced concepts accessible and engaging.
Throughout the course, we emphasize learning by doing. Each module centers around a realistic project or problem scenario – from debugging a microservices outage to optimizing performance in a live environment. You won't just learn definitions or passively watch tool demos; you'll actively implement logs and metrics in a hands-on lab environment. By working through guided labs and challenges, you’ll gain confidence using industry-standard tools (like Prometheus, Grafana, Loki, etc.) and tie them together into a cohesive observability platform.
By the end of this course, you’ll have job-ready observability expertise. You will know how to proactively monitor distributed systems, quickly pinpoint issues across complex architectures, and improve reliability using data-driven insights. Whether you’re aiming to excel in an SRE/DevOps role or to bring observability best practices to your team, this course delivers practical experience that translates directly to real-world success. Get ready to elevate your skills through immersive projects and become an observability champion in your organization!
Overview
Section 1: Introduction
Lecture 1 Introduction
Section 2: Getting Started with Observability
Lecture 2 01. Observability in Distributed Environments
Lecture 3 02. Prometheus Architecture
Lecture 4 03. Setting up and Configuring Prometheus
Lecture 5 04. Monitoring an External Machine
Lecture 6 05. Q&A
Section 3: Collecting Node data Using Exporters (Node Exporter)
Lecture 7 06. Summary - Setting up Nginx and Node Exporter
Lecture 8 07. nginx exporter
Lecture 9 08. Default Labels
Section 4: Visualising the Data using Grafana Dashboards
Lecture 10 09. Visualizing the Data in Prometheus
Lecture 11 10. Setting Up and Configuring Grafana
Lecture 12 11. Creating Dashboards in Grafana
Lecture 13 12. Adding Memory Panels to Grafana Dashboards
Section 5: Grafana Dashboards Advanced Topics
Lecture 14 13. Grafana Dashboard Library
Lecture 15 14. Managing Dashboards like Code
Section 6: Generating Metrics on your Nodes
Lecture 16 15. Generating metrics
Lecture 17 16. Push gateways
Lecture 18 17. Grafana Dynamic Dashboards and Variables
Section 7: Prometheus Stack on Kubernetes
Lecture 19 18. Running Prometheus on Kubernetes
Lecture 20 19. Running Prometheus on Kubernetes - The Right Way
Lecture 21 20. Install Prometheus Helm Package
Lecture 22 21. Configuring installed components
Lecture 23 22. Accessing the kubernetes services locally
Lecture 24 23. Default Monitoring Configurations
Section 8: Monitoring Your Custom Applications on Kubernetes
Lecture 25 24. Monitoring Your Custom Applications
Lecture 26 25. Prometheus Rules
Section 9: SLI, SLO, SLA
Lecture 27 26. SLI SLO SLA
Lecture 28 27. Using Sloth for generating SLO based rules
Lecture 29 28. Applying and visualizing the SLO based rules
Lecture 30 29. Best Practices
Section 10: Advanced Hands-On Logging Practices with Grafana Loki
Lecture 31 30. Grafana Loki
Section 11: The Final Project
Lecture 32 30. Project
Who this course is for
Site Reliability Engineers (SREs) who want to master the art of observability and take their incident response skills to the next level.
DevOps Engineers aiming to implement comprehensive monitoring and tracing in CI/CD pipelines and production systems.
Software Developers who wish to understand what happens to their code in production and how to instrument applications for better debuggability.
System Administrators & IT Operations staff looking to proactively detect and resolve infrastructure issues using logs, metrics, and alerts.
Technical Support Engineers who troubleshoot production issues and need deeper insight into system behavior to resolve customer problems faster.
QA and Test Engineers interested in observing system performance and reliability during testing, and learning how to use observability data to validate features under load.
Cloud Architects/Engineers designing modern microservices or cloud-native architectures, who need to embed observability for distributed systems at scale.
Enthusiastic Learners who have basic ops/dev experience and are eager to step into an SRE, DevOps, or reliability-focused role by acquiring in-demand observability skills.