The cloud2030 Tech Ops series is an ongoing discussion where we create what I think of as 200-level content for tech and operations leaders, exploring complex, deep topics in a thoughtful way to extend your knowledge base and capabilities in the data center and infrastructure space.
Today’s episode talks about GitOps and immutability. What we’re doing here is connecting the operational concepts between controls and desired-state communications and how that gets executed in infrastructure in an operations sense. Rather than a developer approach, this takes an operations approach. So if you are interested in how to manage immutability and what that means in infrastructure, this discussion is for you.
With a mix of excitement and apprehension, the RackN team has been watching physical deployment of immutable operating systems like CoreOS Container Linux and RancherOS. Overall, we like the idea of a small locked (aka immutable) in-memory image for servers; however, the concept does not map perfectly to hardware.
Note: if you want to provision these operating systems in a production way, we can help you!
These operating systems work on a “less is more” approach that strips everything out of the images to make them small and secure.
This is great for cloud-first approaches where VM size has a material impact in cost. It’s particularly matched for container platforms where VMs are constantly being created and destroyed. In these cases, the immutable image is easy to update and saves money.
So, why does that not work as well on physical?
First: HA DHCP?! The model is not as good a fit for physical systems, where operating system overhead is already minimal. It requires orchestrated rebooting of your hardware, and that in turn means you need a highly available (HA) PXE provisioning infrastructure (like we’re building with Digital Rebar).
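As a sketch of what that provisioning layer involves, a minimal single-node PXE service using dnsmasq might look like the following (the address range and paths are hypothetical); the HA requirement means running more than one of these in a coordinated way:

```
# /etc/dnsmasq.conf – minimal PXE boot service (illustrative, single node, not HA)
# hand out addresses on a hypothetical provisioning network
dhcp-range=192.168.124.100,192.168.124.200,12h
# point PXE clients at the network bootloader
dhcp-boot=pxelinux.0
# serve boot files over TFTP from this host
enable-tftp
tftp-root=/var/lib/tftpboot
```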
Second: Configuration. Since the images are locked, these operating systems must rely on cloud-init injected configuration. In a physical environment, there is no way to create cloud-init-like injections without integrating with the kickstart systems (a feature of Digital Rebar Provision). Further, hardware has many more configuration options (like hard drives and network interfaces) than VMs. That means we need a robust, system-by-system way to manage these configurations.
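To make those cloud-init-like injections concrete, here is a sketch of the per-machine cloud-config a provisioner would render and serve for each system (the hostname, key, and file contents are hypothetical):

```yaml
#cloud-config
# illustrative per-system configuration injected at provision time
hostname: rack1-node07
ssh_authorized_keys:
  - ssh-rsa AAAAB3NzaC1yc2E... ops@example.com
write_files:
  - path: /etc/environment
    content: |
      # hypothetical per-node settings rendered by the provisioner
      NODE_RACK=1
      NODE_SLOT=7
```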
Third: No SSH. Yet another problem with these minimal images is that they are supposed to eliminate SSH. Ideally, their image and configuration provide everything required to run without additional administration. Unfortunately, many applications assume post-boot configuration, so people often re-enable SSH to use tools like Ansible. If it did not conflict with the very nature of the “do-not-configure-the-server” immutable model, I would suggest that SSH is a perfectly reasonable requirement for operators running physical infrastructure.
In summary, even with those issues, we are excited about the positive impact this immutable approach can have on data center operations.
With tooling like Digital Rebar, it’s possible to manage the issues above. If this appeals to you, let us know!
It’s been a banner year for container awareness and adoption, so we wanted to recap 2015. For RackN, container acceleration is near to our heart because we both enable and use them in fundamental ways. Look for Rob’s 2016 predictions on his blog.
The RackN team has truly deep and broad experience with containers in practical use. In the summer, we delivered multiple container orchestration workloads including Docker Swarm, Kubernetes, Cloud Foundry, StackEngine and others. In the fall, we refactored Digital Rebar to use Docker Compose with dramatic results. And we’ve been using Docker since 2013 (yes, “way back”) for ops provisioning and development.
To make it easier to review that experience, we are consolidating a list of our container-related posts for 2015.
Nearly 10 TIMES faster system resets – that’s the result of fully enabling a multi-container immutable deployment on Digital Rebar.
I’ve been having a “containers all the way down” month since we launched Digital Rebar deployment using Docker Compose. I don’t want to imply that we rubbed Docker on the platform and magic happened. The RackN team spent nearly a year building up the Consul integration and service wrappers for our platform before we were ready to fully migrate.
During the Digital Rebar migration, we took our already service-oriented code base and broke it into microservices. Specifically, the Digital Rebar parts (the API and engine) now run in their own container, and each service (DNS, DHCP, Provisioning, Logging, NTP, etc.) also has a dedicated container. Likewise, supporting items like Consul and PostgreSQL are, surprise, managed in dedicated containers too. Altogether, that’s over nine containers, and we continue to partition out services.
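For a rough picture of that layout, a Compose file partitions the services along these lines (this is an illustration, not the actual Digital Rebar manifest; the service names and images are invented):

```yaml
# docker-compose.yml – illustrative microservice layout (hypothetical images)
rebar-api:
  image: example/rebar-api
  links:
    - consul
    - postgres
dns:
  image: example/dns-service
dhcp:
  image: example/dhcp-service
provisioner:
  image: example/provisioner
consul:
  image: example/consul
postgres:
  image: postgres
```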
We use Docker Compose to coordinate the start-up and Consul to wire everything together. Both play a role, but Consul is the critical glue that allows Digital Rebar components to find each other. These were not random choices: we’ve been using a Docker package for over two years and Consul service registration as an architectural choice for over a year.
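To show what that glue looks like, here is a sketch of the kind of Consul service definition a container could register (the “dns” service name and health check are hypothetical):

```json
{
  "service": {
    "name": "dns",
    "port": 53,
    "check": {
      "script": "nc -z localhost 53",
      "interval": "10s"
    }
  }
}
```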
Service registration plays a major role in the functional ops design because we’ve been wrapping data center services like DNS with APIs. Consul acts as the separation between providing and consuming a service. Our previous design required us to track the running service ourselves, which worked until customers asked for pluggable services (and every customer needs pluggable services as they scale).
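The separation shows up on the consuming side too: instead of tracking the provider ourselves, a consumer asks Consul through its standard DNS or HTTP interfaces (using the hypothetical “dns” service from the sketch above):

```sh
# find where the dns service is currently running via Consul's DNS interface
dig @127.0.0.1 -p 8600 dns.service.consul SRV

# or query the HTTP catalog API for the same answer
curl http://127.0.0.1:8500/v1/catalog/service/dns
```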
Besides faster environment resets, there are several additional wins:
more transparent in how it operates – it’s obvious which containers provide each service and easy to monitor them as individuals.
easier to distribute services in the environment – Consul registration tells us where each service runs, so we don’t have to manage placement ourselves.
possible to have redundant services – it’s easy to spin up new instances, even on the same system (see the sketch after this list).
make services pluggable – as long as the service registers and there’s an API, we can replace the implementation.
no concern about which distribution is used – all our containers are Ubuntu user space but the host can be anything.
changes to components are more isolated – changing one service does not require a lot of downloading.
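To picture that redundancy win: with a Compose layout like the earlier sketch, a second instance of a service can be started on the same host, assuming it does not publish a fixed host port and each instance registers itself with Consul:

```sh
# start a second copy of the hypothetical dns service on the same host
docker-compose scale dns=2

# if both instances register, consumers now see two healthy providers
dig @127.0.0.1 -p 8600 dns.service.consul SRV
```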
Docker and microservices are not magic but the benefits are real. Be prepared to make architectural investments to realize the gains.