SRE – Cloud2030 Podcast

Compliance Death Curve [Working Session 1]

The compliance death curve is an evolving concept I’ve been working on that tries to explain how companies fight compliance, governance, and standardization efforts, all of which are critical to platform teams and infrastructure operations.

Today we try to decompose some of the mathematics that I’ve been using into more universal, more easily understood components. We also built a compliance flywheel that I found really fascinating; you can see an example of that work in our podcast description.

It could also be helpful to check out my previously recorded compliance death curve talk that has been released.

Resources:
www.youtube.com/watch?v=4RUKsakKZI0

Transcript: otter.ai/u/k9q5ZZ81Hm-EAAtfkV…?utm_source=copy_url

API Consumption [TechOps 003]

TechOps series episode 3 covers how to automate against APIs. We discuss the ways you can use APIs effectively and the ways you can run into trouble. We also discuss how we should be consuming APIs, both as consumers and as producers of APIs ourselves. Many of the ideas discussed came from learning how people consume our APIs and what we can do to help make them better and safer.

Enjoy this broader TechOps series, where we dive deep into tips and techniques that improve your journey as an Automator.

otter.ai/u/5akxcG83FBS1m9PBUn…?utm_source=copy_url
Image by Dall-E

Platform Engineering on API Abstractions

https://soundcloud.com/user-410091210/pt1-devops-ll-230124

Our mini episode today is a short discussion of API delineation and abstractions for platform engineering.

This was a short intro discussion, and it is especially interesting because platform is a major topic we will be exploring in the coming year. We highlight the challenges of finding the right abstraction points as well as building front end and back end automation.

Transcript: otter.ai/u/5gzoEliQ5H7N5LnGSP6sdOFNjv8
Image: www.pexels.com/photo/white-paper…te-table-7897470/

Platform Engineering Makes You Angry?

Platform engineering is a topic that seems to be generating a lot of interest going into 2023. It’s sure to be one of those things that enterprises spend a lot of time arguing about and telling each other that they’re doing it wrong.

In this podcast, we dissect why platform engineering seems to be so controversial, and what we can do to help make it more understandable.

We break it down into DevOps components, team components, Dev components, operations components, and ultimately talk about long term trajectories of how all this stuff is going.

Image: www.pexels.com/photo/person-skat…ard-ramp-1527241/
Transcript: otter.ai/u/SAAMNdHZh9lEeHrBxwcWmUxuDhs


Rob’s Hot Take:

In the December 13th DevOps Lunch and Learn on the Cloud 2030 podcast, Rob Hirschfeld explores the concept of platform engineering emerging from enterprises grappling with the challenges of enabling developers while rationalizing operations. The discussion introduces the idea of operational entropy or infrastructure entropy, emphasizing how platform engineering teams can effectively manage the constant changes, security vulnerabilities, and evolving environments, relieving developers of this burden. By shifting entropy management to a shared and collaborative task, platform engineering teams have the potential to enhance how they function, offering opportunities for improvement across the industry. For those intrigued by these discussions, the full episode is available at the2030.cloud, inviting participation in ongoing conversations.

Events And Monitoring [bonus Complexity chat]

How do you build GitOps, infrastructure, and systems that rely on events and monitoring? When do you need to fall back to a polling loop, or augment a polling loop with an event system?

Today, we drill into concrete technical details about events and monitoring. We also offer practical advice on how GitOps works, how systems work, and how you can build a resilient system.

Stick around for a bonus at the end of the discussion, where we talk a little bit about complexity!

Image: www.pexels.com/photo/green-and-b…ug-on-air-905905/
Transcript: otter.ai/u/udK3y3upQMszo2IVtbrdGigmehE

Rob’s Hot Take:

In the July 26th DevOps Lunch and Learn episode, Rob Hirschfeld delves into the intricacies of monitoring and events, highlighting the importance of eventing systems for scalability. The discussion explores the intersection between building a resilient standalone system using polling and enhancing responsiveness through eventing to create a comprehensive and adaptable solution. The key takeaway emphasizes the need for systems that can effectively integrate both polling and eventing to ensure durability and improved performance. For a detailed exploration of these concepts, tune in to the full podcast on monitoring and eventing from July 26th at the2030.cloud.
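
The combination of polling and eventing described above can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not code from the episode: the function names `wait_for_trigger` and `run_reconciler`, the `"poll"` label, and the sample event string are all hypothetical. The idea is that the loop stays correct on polling alone, and events only make it respond sooner.

```python
import queue


def wait_for_trigger(events, poll_interval):
    """Return the next event if one arrives in time, else fall back to "poll".

    Hypothetical helper: `events` is a queue.Queue fed by some event source.
    """
    try:
        # An event wakes the loop early and improves responsiveness.
        return events.get(timeout=poll_interval)
    except queue.Empty:
        # No event arrived: the polling fallback keeps the system durable
        # even if events are lost or the event source is down.
        return "poll"


def run_reconciler(events, reconcile, poll_interval, max_cycles):
    """Run `reconcile(reason)` once per cycle, triggered by an event or a poll.

    `max_cycles` bounds the loop so the sketch terminates; a real service
    would more likely run until it is asked to stop.
    """
    for _ in range(max_cycles):
        reconcile(wait_for_trigger(events, poll_interval))


if __name__ == "__main__":
    pending = queue.Queue()
    pending.put("event:config-changed")  # hypothetical event name
    reasons = []
    run_reconciler(pending, reasons.append, poll_interval=0.01, max_cycles=3)
    print(reasons)
```

Because every cycle runs the same reconcile step regardless of what triggered it, a missed event costs at most one polling interval of delay rather than a missed update.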

Improving Automation Safety

Making automation safe is essential to making it usable at scale. How do we make automation safe? We found a lot of great insights by drawing on spacecraft design, aircraft design, and other systems where safety is critical.

Automation is a force multiplier. If we don’t factor in safety when we build it, then we could create a lot of harm in systems, from wasteful spending to actual injury. These designs have very real implications.

Transcript: otter.ai/u/p9w4aKOqm3rpHhbDtRTaLgN3GIA
Image: www.pexels.com/photo/toddler-usi…-on-road-1642055/

Rob’s Hot Take:

In the Cloud 2030 Podcast on March 15th, Rob Hirschfeld underscores the critical importance of automation safety in system design. Emphasizing the need for thorough testing, he discusses how safety, especially in complex systems like airplanes and spacecraft, requires continuous testing and monitoring. The conversation delves into the significance of not just completing tasks but also exercising and testing systems in various scenarios to ensure their safety. To explore these insights further, listen to the full episode on March 15th at the2030.cloud and participate in the ongoing discussions.

Super Deep Dive into DHCP, PXE Boot, and Remote Installation

The RackN team provides an in-depth overview of how DHCP and PXE boot work, along with highlights of their remote booting on a Raspberry Pi.

Chris Short on SRE, DevSecOps, Pipelines, Immutability, and Kubernetes

Joining us this week is Chris Short, Senior DevOps Advocate, SJ Technologies. Chris is also a CNCF Ambassador managing an excellent newsletter, DevOps’ish.

Highlights
• Site Reliability Engineering & DevOps relationship & philosophy
• SRE details in budgets, toil, and security
• Pipeline infrastructure, configuration management, and immutability
• Cultural aspects of DevOps
• Why Kubernetes? Ecosystems? Build for Kubernetes apps
• SaaS vs Licensing models (answer to all things software)

Christine Yen on 2nd Wave of DevOps and Listening to Users at a Startup

Joining us this week is Christine Yen, Co-founder at Honeycomb, in a recording from SRECon Americas in March 2018 at the Santa Clara Convention Center Hyatt.

Highlights
• Understanding of what developer tools are today
• Observability vs Monitoring
• Instrumenting Apps for Diagnostics to help Developers do More
• Tools to build not just better engineers but better teams to support customers
• Brief history of Honeycomb and where it came from (Parse and Facebook)
• How do you debug containers that are most likely gone by the time a problem arises?
• AI / Machine Learning – can it really help today?
• 2nd Wave of DevOps
• Impact of listening to users at a startup – people problems vs technology

Mark Imbriaco on SRE, Edge, and Open Source Sustainability

Joining us this week is Mark Imbriaco, Global CTO DevOps, Pivotal. Mark’s view of ops and open source from a platform perspective as it relates to SRE offers listeners a high-level approach to these concepts that is not often heard.

Highlights
• Site Reliability Engineering – Introduction and Advanced Discussion
• Edge Computing from Platform View
• Open Source Projects vs Products and Sustainability
• Monetization of Open Source Matters