Containers Manager [TechOps]

In this episode, we continue our TechOps series, diving deep into the topic of container management. As containers become increasingly mainstream, effectively managing and orchestrating these lightweight, purpose-built environments is crucial.

We’ll explore the distinctions between container management and orchestration, discussing the different tools, techniques and trade-offs involved. We’ll also hear insights from the RackN team on how they’ve approached container lifecycle management within their own infrastructure management platform, Digital Rebar.

This is a rich discussion that touches on everything from Kubernetes to system design trade-offs. So let’s jump in and learn how to wrangle those containers!

Supply Chain Security [TechOps]

In this episode, we dive deep into a recent and highly sophisticated backdoor attack targeting SSH on Linux systems. We’ll discuss how the attackers were able to inject a backdoor into a critical compression library, leveraging social engineering tactics to become a trusted maintainer over several years.

Advanced SSH [TechOps]

SSH (Secure Shell) is one of those topics that people take for granted because it is a ubiquitous way to log in and access systems. True to form for the TechOps series, though, we break it down into much more detailed and granular components.

We talk about how to secure it and what the best practices are. We also discuss how to use it for tunneling, or, more specifically, why not to use it for tunneling, and why all of this matters to your operations environment. Listen for the new things we’re doing that avoid the need for network access at all.

Transcript: otter.ai/u/XSRBfnifZOF0-nlNU5…?utm_source=copy_url

High Availability [TechOps Series]

Is high availability always a good thing? Today our discussion takes an operations perspective. We look at places where you are over- or under-committing to high availability, where you are confusing disaster recovery with high availability, and where you may even be securing the wrong service or looking at it the wrong way. We cover all of these scenarios with practical, hands-on examples that I know you will get a lot out of.

This is good prep for talking about HA clusters, because the idea of coordinating and monitoring systems is core to HA and HA clusters. In our journey with RackN, a lot of customers who thought they needed very aggressive HA systems, once confronted with the overhead of maintaining one, have had to ask whether they really need it. We started with an active/passive HA implementation that used third-party monitoring to detect when the system failed and spin up the second system, with a live-streaming backup to the failover system.

Transcript: otter.ai/u/vOVZadHvRTFCZGqcI2…?utm_source=copy_url

UEFI Trust & Secure Boot Issue

We explore the UEFI certificate issue in which secure boot is potentially compromised. Certificates that are included in most UEFI BIOSes have been compromised in ways that could easily be used as an attack vector. This is a very significant flaw and something that should be on your radar to fix and patch.

We’re going to talk about what the issue is, why it’s important, how secure boot works, and what you can do to mitigate this problem in your own infrastructure. An important episode for anybody running or managing desktops, data centers or infrastructure of any type.

Transcript: otter.ai/u/H15Z2NZDom8Hta8gHJ…?utm_source=copy_url

AI Platform Consolidation & Walled Gardens

We discuss the impact that AI, data sovereignty, and data protection will have on platforms and on the consolidated management of your data, such as in Office by Microsoft or Google, as well as on-premises systems. This includes a whole bunch of data that you will want to use to train AI models to improve your day-to-day operations, but you probably don’t want a lot of vendors pulling that data apart and transiting it. We have a fascinating discussion about how these forces are impacting the market.

Transcript: otter.ai/u/gpyotjoYz5ev-1Q2yv…?utm_source=copy_url

Two But Rule by John Wolpert [Book Discussion]

This episode is one of our book club episodes, featuring John Wolpert, who wrote The Two But Rule, which is very tongue in cheek while also very serious about momentum thinking and using a negative, bouncy discussion pattern.

I like to think of it as a bouncy discussion pattern to really explore ideas and drive ideation in a positive way by challenging people’s ideas constructively.

Transcript: otter.ai/u/2-CzhoZXo1U9URwEc3…?utm_source=copy_url

CrowdStrike vs Operations Responsibility

This episode explores the intersection of infrastructure automation and security through the lens of the CrowdStrike outage. We’ll discuss the tension between maintaining stable, reliable data center infrastructure and the need to embrace change and innovation.

Recent events like the CrowdStrike outage demonstrate the paradox that infrastructure teams face. We’ll dive into the importance of having multiple control planes and standardized processes that can adapt to rapid industry changes.

Transcript: otter.ai/u/Wos9IOPfpSGPOYNT-A4muQccA5w
