
SREpath Podcast

Technology, Podcasts, Business, EN, united-states
Rating unavailable
Software reliability is a tough topic for engineers in many organizations. The Reliability Enablers (Ash Patel and Sebastian Vietz) know this from experience. Join us as we demystify reliability jargon like SRE, DevOps, and more. We interview experts and share practical insights. Our mission is to help you boost your success in reliability-enabling areas like observability, incident response, release engineering, and more. read.srepath.com (https://read.srepath.com/s/podcast?utm_medium=podcast)
Top 40.1% by pitch volume (Rank #20026 of 50,000)
Data updated Feb 10, 2026

Key Facts

Publishes: N/A
Episodes: 70
Founded: N/A
Category: Technology
Number of listeners: Private (hidden on public pages)

Pitch this podcast
Get the guest pitch kit.
Book a quick demo to unlock the outreach details you actually need before you hit send.
  • Verified contact + outreach fields
  • Exact listener estimates (not just bands)
  • Reply rate + response timing signals
10 minutes. Friendly walkthrough. No pressure.
Public snapshot
Audience: Under 4K / month
Canonical: https://podpitch.com/podcasts/srepath-podcast
Reply rate: Under 2%

Latest Episodes

You (and AI) can't automate reliability away

Tue Dec 02 2025

What if the hardest part of reliability has nothing to do with tooling or automation? Jennifer Petoff explains why real reliability comes from the human workflows wrapped around the engineering work.

Everyone seems to think AI will automate reliability away. I keep hearing the same story: “Our tooling will catch it.” “Copilots will reduce operational load.” “Automation will mitigate incidents before they happen.”

But here’s a hard truth to swallow: AI only automates the mechanical parts of reliability, the machine in the machine. The hard parts haven’t changed at all.

  • You still need teams with clarity on system boundaries.
  • You still need consistent approaches to resolution.
  • You still need postmortems that drive learning rather than blame.

AI doesn’t fix any of that. If anything, it exposes every organizational gap we’ve been ignoring. And that’s exactly why I wanted today’s guest on.

Jennifer Petoff is Director of Program Management for Google Cloud Platform and Technical Infrastructure education. Every day, she works with SREs at Google, as well as with SREs at other companies through her public speaking and Google Cloud customer engagements. Even if you have never touched GCP, you have still been influenced by her work at some point in your SRE career: she is co-editor of Google’s original Site Reliability Engineering book from 2016. Yeah, that one!

It was my immense pleasure to have her join me to discuss the internal dynamics behind successful reliability initiatives. Here are 5 highlights from our talk:

3 issues stifling individual SREs’ work

To start, I wanted to know from Jennifer the kinds of challenges she has seen individual SREs face when attempting to introduce or reinforce reliability improvements within their teams or the broader organization.
She grouped these challenges into 3 main categories:

  • Cultural issues (with a look into Westrum’s typology of organizational culture)
  • Insufficient buy-in from stakeholders
  • Inability to communicate the value of reliability work

A key highlight from this topic came from her look at DORA research, an annual survey of thousands of tech professionals and the research upon which the book Accelerate is based. It showed that organizations with generative cultures have 30% better organizational performance. In other words, the best technology, tools, and processes can get you good results, but culture raises the bar further. A generative culture also makes it easier to implement the more technical aspects of DevOps or SRE that are associated with improved organizational performance.

Hands-on is the best kind of training

We then explored structured approaches that ensure consistency, build capability, and deliberately shape reliability culture. As they say, culture eats strategy for breakfast!

One key example Jennifer gave was the hands-on approach they take at Google. She believes that adults learn by doing. In other words, SREs gain confidence by doing hands-on work. Where possible, training programs should move away from passive listening to lectures toward hands-on exercises that mimic real SRE work, especially troubleshooting.

One specific exercise that Google has built internally is Simulating Production Breakages. Engineers undergoing that training have a chance to troubleshoot a real system built for this purpose in a safe environment. The results have been profound: Jennifer’s team saw a tremendous amount of confidence in survey results. This confidence is focused on job-related behaviors, which, when repeated over time, reinforce that culture of reliability.
Reliability is mandatory for everybody

Another thing Jennifer told me Google did differently was making reliability a mandatory part of every engineer’s curriculum, not only that of SREs. In her words:

“When we first spun up the SRE Education team, our focus was squarely on our SREs. However, that’s like preaching to the choir. SREs are usually bought into reliability. A few years in, our leadership was interested in propagating the reliability-focused culture of SRE to all of Google’s development teams, a challenge an order of magnitude greater than training SREs.”

How did they achieve this mandate?

  • They developed a short and engaging (and mandatory) production safety training
  • That training has now been taken by tens of thousands of Googlers
  • Jennifer attributes this initiative’s success to how they “SRE’ed the program”: “We ran a canary followed by a progressive roll-out. We instituted monitoring and set up feedback loops so that we could learn and drive continuous improvement.”

The result of this massive effort? A very respectable 80%+ net promoter score with open text feedback: “best required training ever.”

What made this program successful is that Jennifer and her team SRE’d its design and iterative improvement. You can learn more about “How to SRE anything” (from work to life) using her rubric: https://www.reliablepgm.com/how-to-sre-anything/

Reliability gets rewarded just like feature work

Jennifer then talked about how Google tackles a problem that I think every reliability engineer wishes could be solved at their organization: getting great reliability work rewarded at the same level as great feature work.

For development and operations teams alike at Google, this means making sure “grungy work” like tech debt reduction, automation, and other activities that improve reliability is rewarded equally to shiny new product features. Organizational reward programs that recognize outstanding work typically have committees.
These committees not only look for excellent feature development work, but also reward and celebrate foundational activities that improve reliability. This is explicitly built into the rubric for judging award submissions.

Keep a scorecard of reliability performance

Jennifer gave another example of how Google judges reliability performance, this time specifically for SRE teams. Google’s Production Excellence (ProdEx) program was created in 2015 to assess and improve production excellence (aka reliability improvements) across SRE teams. ProdEx acts as a central scorecard that aggregates metrics from various production health domains to give a comprehensive overview of an SRE team’s health and the reliability of the services they manage.

Here are some specifics from the program:

  • Domains include SLOs, on-call workload, alerting quality, and postmortem discipline
  • Reviews are conducted live every few quarters by senior SREs (directors or principal engineers) who are not part of the team’s direct leadership
  • There is a focus on coaching and accountability without shame (to foster psychological safety)

ProdEx serves various levels of the SRE organization by:

  • providing leadership with strategic situational awareness of organizational and system health
  • keeping forward momentum around reliability and surfacing team-level issues early to support engineers in addressing them

Wrapping up

Having had an inside view of reliability mechanisms within a few large organizations, I know that few are actively doing all, or sometimes any, of the reliability enhancers that Google uses and Jennifer has graciously shared with us.

It’s time to get the ball rolling. What will you do today to make it happen?

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit read.srepath.com

Key Metrics

Pitches sent: 17 (from PodPitch users)
Rank: #20026 (Top 40.1% by pitch volume, Rank #20026 of 50,000)
Average rating: N/A (ratings count may be unavailable)
Reviews: N/A (written reviews, when available)
Publish cadence: N/A
Episode count: 70
Data updated: Feb 10, 2026
Social followers: N/A

Public Snapshot

Country: United States
Language: English
Language (ISO):
Release cadence: N/A
Latest episode date: Tue Dec 02 2025

Audience & Outreach (Public)

Audience range: Under 4K / month (public band)
Reply rate band: Under 2% (public band)
Response time band: Private (hidden on public pages)
Replies received: Private (hidden on public pages)

Public ranges are rounded for privacy. Unlock the full report for exact values.

Presence & Signals

Social followers: N/A
Contact available: Yes (masked on public pages)
Sponsors detected: Private (hidden on public pages)
Guest format: Private (hidden on public pages)

Social links

No public profiles listed.

Demo to Unlock Full Outreach Intelligence

We publicly share enough context for discovery. For actionable outreach data, unlock the private blocks below.

Audience & Growth
Demo to unlock
Monthly listeners: 49,360
Reply rate: 18.2%
Avg response: 4.1 days
See audience size and growth. Demo to unlock.
Contact preview
s***@hidden
Get verified host contact details. Demo to unlock.
Sponsor signals
Demo to unlock
Sponsor mentions: Likely
Ad-read history: Available
View sponsorship signals and ad read history. Demo to unlock.

How To Pitch SREpath Podcast

Want to get booked on podcasts like this?

Become the guest your future customers already trust.

PodPitch helps you find shows, draft personalized pitches, and hit send faster. We share enough public context for discovery; for actionable outreach data, unlock the private blocks.

  • Identify shows that match your audience and offer.
  • Write pitches in your voice (nothing sends without you).
  • Move from “maybe later” to booked interviews faster.
  • Unlock deeper outreach intelligence with a quick demo.

This show is Rank #20026 by pitch volume, with 17 pitches sent by PodPitch users.

Rating unavailable
Ratings: N/A
Written reviews: N/A

We summarize public review counts here; full review text aggregation is not shown on PodPitch yet.

Frequently Asked Questions About SREpath Podcast

What is SREpath Podcast about?

Software reliability is a tough topic for engineers in many organizations. The Reliability Enablers (Ash Patel and Sebastian Vietz) know this from experience. Join us as we demystify reliability jargon like SRE, DevOps, and more. We interview experts and share practical insights. Our mission is to help you boost your success in reliability-enabling areas like observability, incident response, release engineering, and more. read.srepath.com (https://read.srepath.com/s/podcast?utm_medium=podcast)

How often does SREpath Podcast publish new episodes?

SREpath Podcast publishes on a variable schedule.

How many listeners does SREpath Podcast get?

PodPitch shows a public audience band (like "Under 4K / month"). Book a demo to unlock exact audience estimates and how we calculate them.

How can I pitch SREpath Podcast?

Use PodPitch to access verified outreach details and pitch recommendations for SREpath Podcast. Start at https://podpitch.com/try/1.

Which podcasts are similar to SREpath Podcast?

This page includes internal links to similar podcasts. You can also browse the full directory at https://podpitch.com/podcasts.

How do I contact SREpath Podcast?

Public pages only show a masked contact preview. Book a demo to unlock verified email and outreach fields.
