Observability 2.0 - More Than Just Logs, Metrics & Traces
Sun Feb 01 2026
Join us as Neel explores how observability is evolving beyond traditional logs, metrics, and traces into a predictive, AI-powered discipline.
Neel walks through the evolution of observability, demonstrating how OpenTelemetry, machine learning, and LLMs are changing the way we monitor and maintain modern applications. You'll learn about dynamic sampling techniques that reduce costs while maintaining visibility, how ML algorithms detect anomalies before they cause outages, and practical implementations using tools like the OpenTelemetry Collector. This episode covers real-world scenarios, from reducing massive log volumes to predicting system failures before they impact customers.
Timestamps
0:00 Welcome & Introduction
4:29 Neel's Background & Community Work
5:03 The Evolution of Observability
6:29 The 2 AM Production Incident Scenario
8:13 OpenTelemetry's Role in Modern Observability
12:45 Dynamic Sampling Techniques
18:22 ML & AI in Anomaly Detection
24:16 LLM Observability Explained
28:32 Cost Optimization Strategies
30:04 Context Windows & Token Management
32:00 Self-Healing Systems Discussion
34:15 Edge Cases: When Dynamic Sampling Doesn't Work
36:27 Wrap-up & Resources
How to find Neel:
https://www.linkedin.com/in/neelcshah/
https://bento.me/neelshah
Links from the show:
https://neelshah.dev/blogs/observability-2
https://opentelemetry.io/
https://middleware.io/blog/observability-2-0/