Defending Against Complexity With Exercise

How do you manage complexity? Something we talk about a lot in Cloud2030 is how challenging it is to understand complexity, measure it and cope with it.

Richard Cook wrote a paper called “How Complex Systems Fail” (how.complexsystems.fail), and in it he talks about complex systems having strong defense mechanisms against failure. That’s what we talked about today. How do we build defense mechanisms for complex systems, not by making them simpler, but by exercising them and testing them?

In this conversation, we discuss the importance of testing, validation, and layers of abstraction, including testing the layers themselves. If you deal with complex systems, this discussion will be fascinating and actionable.

Transcript: otter.ai/u/SP-z7OAJWAmJlql8Dh62rNk2hlo
Image: www.pexels.com/photo/man-woman-m…ng-young-4058411/

Rob’s Hot Take:

In the May 24th DevOps lunch and learn, Rob Hirschfeld delves into the concept of making complex systems defensible by exercising and testing them thoroughly. Emphasizing the importance of shared automation and collaborative efforts within communities, he cites examples like Kubernetes and OpenStack as complex systems made more defensible through widespread testing and shared code. While complexity cannot be eliminated, actively exercising systems enhances their defensibility. Join the ongoing discussions and explore the intricacies of complexity management at the2030.cloud.

Distributed Ledger Drives Distributed Infrastructure

How is data center infrastructure adapting to edge distributed ledger technology workloads?

We think through whether those demands (blockchain, proof-of-stake coins, etc.) are changing the way we look at data center infrastructure, and the short answer is yes. We also explore the impact of the type of workloads that we’re running and how we distribute them, rather than the type of equipment that we need to buy.

This conversation quickly becomes one about what we want to do with our infrastructure, not what the infrastructure is.

Transcript: otter.ai/u/KcT3ZF8ELbg5M3FrZmTSL7ycpX8
Image: www.pexels.com/photo/vehicle-on-the-road-3593923/

Rob’s Hot Take:

In the May 24th Cloud 2030 Podcast episode, Rob Hirschfeld explores how distributed ledger technologies like blockchains could impact application design and workload distribution across infrastructure. The discussion shifts from the impact on data centers to the potential for distributed applications that are more portable and capable of running in smaller data centers. While acknowledging missing pieces in building such applications, the conversation highlights the opportunity for more portable and cost-effective workloads. Join the comprehensive discussions at the2030.cloud to delve deeper into this transformative intersection of distributed ledgers and infrastructure.

Why Jenkins in DevOps?

What kind of orchestration systems does the industry use for infrastructure automation and for controlling day-to-day operations?

In today’s episode, we talk about infrastructure pipelines at the tooling level, and specifically the use of Jenkins and other CI pipelining tools for ops and orchestration. We dig into why and how you would do this, and what pieces are missing from the system. That conversation leads us into larger day-to-day challenges.

If you are doing infrastructure ops and DevOps automation, you will get a lot out of this session.

Transcript: otter.ai/u/dbTdHdYTIt5bU1G8SFghKSijhU0
Image: www.pexels.com/photo/barista-wit…d-tattoo-6205639/

Rob’s Hot Take:

In the May 19th Cloud 2030 Podcast episode, Rob Hirschfeld delves into the intersection of payment systems, PCI V4, NFTs, blockchain, virtual reality, and the metaverse. The discussion highlights the often overlooked XRP, or Ripple, specification, which enables banks to transfer funds outside the SWIFT system. It introduces alternative ways for banks to exchange fiat currency, with significant impacts on credit, microtransactions, and blockchain conversions. The episode emphasizes the importance of understanding seemingly esoteric elements that can shape the future landscape and influence how it evolves. Explore the full conversation for insights into this intriguing combination of PCI V4, crypto, and the metaverse.

Green Data Centers

What’s going on with green data centers, why does it matter, and how do we think about it in a wider context? In this short conversation, we discuss green data centers and creating carbon neutral infrastructure.

This isn’t just about servers using electrons – the actual conversation about making our infrastructure carbon neutral includes thinking about all of the components that go into our infrastructure.

We also have an upcoming series of conversations on green data centers and carbon neutral infrastructure.

Transcript: otter.ai/u/IYsPlr4r570MmOOW3WWLjenDlWk
Image: www.pexels.com/photo/clear-light…ray-rock-1108572/

APIs With Composable State

What makes APIs complex? In this episode, we talk about how we compose APIs into higher level systems, and how we think about the design elements that go into building durable, reusable APIs.

This is a classic topic for us, and in this discussion we looked beyond the API itself and started talking about the state of the system and how you manage that state.

Transcript: otter.ai/u/Oae5e_ay0d_l48TmWk3PO3lpIDU
Image: www.pexels.com/photo/a-nacho-chi…ng-sauce-5848731/

Rob’s Hot Take:

In the Cloud 2030 podcast on April 21st, Rob Hirschfeld delves into the complexity of APIs, emphasizing the layered and nested nature of API systems. The discussion unveils the challenges of managing distributed state within APIs, where each layer needs to be aware of and interact with the state of adjacent or underlying APIs. The key insight is that without a well-understood distributed state model at the architectural level, building resilient APIs becomes inherently complex. Join the conversation at the2030.cloud for a comprehensive exploration of API design challenges and solutions.

Orchestration Automation Workflow [with Terraform]

Building reliable automation at scale for infrastructure presents challenges. In this episode, we discuss orchestration, workflow automation, and the reconciler pattern in the context of Terraform.

We refer to the pattern of Terraform, automation, and orchestration systems as “TACOS” and today we dig into how you test it and check it against drift. These are real topics of operational concern for anybody building any type of infrastructure.

Transcript: otter.ai/u/w-NA0HBsTc5NRaqWQQwlWUj4Whw
Image: www.pexels.com/photo/person-hold…ith-food-8448079/

Rob’s Hot Take:

In the April 5th Cloud 2030 Podcast episode, Rob Hirschfeld discusses orchestration, automation, and workflow, focusing on Terraform and introducing the “Terraform Automation and Orchestration” (TACO) pattern. The conversation emphasizes that while Terraform is a valuable tool, the broader patterns of reconciliation, GitOps, and event-driven automation are crucial for building and maintaining complex systems over time. Hirschfeld encourages listeners to view tools like Terraform and Ansible as initial steps in a journey, prompting consideration of scaling, building orchestration systems, and understanding the importance of comprehensive system development. For more in-depth discussions, explore the full episode on orchestration, automation, and workflow from April 5th, and join the ongoing conversations at the2030.cloud.

Everything As Code!

What makes Everything as Code and Infrastructure as Code interesting? In today’s episode, we discuss what makes something code-like and the idea of Everything as Code, based on Patrick Debois’ article “In depth research and trends analyzed from 50+ different concepts as code.”

Reference: www.jedi.be/blog/2022/02/23/tre…0-as-code-concepts/

Some of our conclusions were practical: if a concept is a process that is reproducible and auditable, that’s what makes it code-like. Another possible conclusion was that it’s just marketing because it makes everything programmable. The reality is somewhere in the middle.

Transcript: otter.ai/u/E1TezO2XutwJyS-vCNetslwWO4A
Image: www.pexels.com/photo/man-in-grey…icky-note-879109/

Rob’s Hot Take:

In the Cloud 2030 Podcast episode on March 29th, Rob Hirschfeld provides insights on the “everything as code” discussion. While acknowledging the term’s playful exaggeration, Hirschfeld emphasizes the underlying desire for reproducibility, auditability, and code-like experiences in various aspects of operational and infrastructure activities. Despite the term’s potential for marketing hype, the aspiration to apply code principles to different facets of infrastructure management remains significant, influencing how we build and manage systems. To delve into this engaging discussion, check out the full episode on March 29th, available on the2030.cloud.

How Lock-in Creates Risk

Organizations take a risk when they get locked into a vendor. In today’s episode, we talk a lot about the risks of lock-in, both in general and in the context of Oracle.

That discussion takes us into a question of insurance, and whether insurance policies could ultimately drive people to reduce lock-in exposure. This was a fascinating discussion, not only about lock-in but about what would drive organizations to fix their lock-in problems.

Transcript: otter.ai/u/zJf0WMUwJgamk7IpscHCsL2vsV4
Image: www.pexels.com/photo/closed-white-door-3119977/

Rob’s Hot Take:

In the Cloud 2030 Podcast episode on March 31st, Rob Hirschfeld discusses the intricate aspects of vendor lock-in, focusing on the risks associated with relying on a single provider, such as an authentication service like Okta. The conversation delves into the challenges of migrating away from tightly integrated platforms and emphasizes the importance of assessing and mitigating lock-in risks. The broader theme within Cloud 2030 discussions seems to revolve around identifying and understanding various risk factors in building complex infrastructures, aiming to drive market dynamics by addressing and managing these risks. To explore this insightful discussion further, check out the full episode on March 31st at the2030.cloud and become part of these engaging conversations.

Improving Automation Safety

Making automation safe is essential to making it usable at scale. How do we make automation safe? We found a lot of great insights drawing from spacecraft design, aircraft design, and other systems where safety is super important.

Automation is a force multiplier. If we don’t factor in safety when we build it, then we could create a lot of harm, from wasteful spending to actual injury. These designs have very real implications.

Transcript: otter.ai/u/p9w4aKOqm3rpHhbDtRTaLgN3GIA
Image: www.pexels.com/photo/toddler-usi…-on-road-1642055/

Rob’s Hot Take:

In the Cloud 2030 Podcast on March 15th, Rob Hirschfeld underscores the critical importance of automation safety in system design. Emphasizing the need for thorough testing, he discusses how safety, especially in complex systems like airplanes and spacecraft, requires continuous testing and monitoring. The conversation delves into the significance of not just completing tasks but also exercising and testing systems in various scenarios to ensure their safety. To explore these insights further, listen to the full episode on March 15th at the2030.cloud and participate in the ongoing discussions.

Data Center Users: Majors vs Miners

Majors versus miners means enterprise data centers versus blockchain, Bitcoin, and distributed ledger data centers. We dive into the differences in processing and environmental requirements for those two different use cases.

While blockchain and distributed ledgers generate very different computational profiles, what we’re building keeps coming back to the same point: the design of a data center is the design of a data center. The exception is proof of work, like Bitcoin. In those cases, it’s really just how many CPUs you can run.

For this episode, we focus on proof of stake data center infrastructure. This podcast is helpful for understanding the difference between proof of work and proof of stake. There’s clear consensus on the call that proof of work is not environmentally sustainable, so proof of stake is much more interesting.

Transcript: otter.ai/u/uuPJSF_nWeDLF64lZlsGLOY8JWw
Image: www.pexels.com/photo/man-holding-shovel-3285094/

Rob’s Hot Take:

In the Cloud 2030 Podcast, Rob Hirschfeld explores the distinctions between majors and miners in data center design, specifically comparing traditional enterprise workloads with proof of stake (PoS) and proof of work (PoW) data centers used for distributed ledgers and blockchains. The discussion reveals that the transition to PoS aligns more closely with enterprise data center needs, emphasizing reliability, performance, and security. In contrast, PoW environments prioritize cost efficiency but face environmental challenges. This assessment suggests that PoS is likely to drive a resurgence in traditional data center designs. For a more in-depth exploration, join the ongoing conversations at the2030.cloud.