
From Simple to HA: A Learning Path for Kubernetes on Apple Silicon

Most Kubernetes tutorials stop at kubectl get nodes. Here’s the structured path from a 6-VM simple cluster to an 11-VM production-grade HA setup — with working code at every level.

The problem

Why most Kubernetes learning paths are broken

There are two extremes in Kubernetes education. On one end: minikube, kind, and Docker Desktop — single-node setups that abstract away everything interesting. You learn kubectl commands but nothing about how the cluster actually works. On the other end: “Kubernetes the Hard Way” by Kelsey Hightower — brilliant but intimidating, and it drops you into the deep end with no intermediate steps.

What’s missing is the middle. A structured path that starts simple enough to build confidence but complex enough to teach real concepts, then progressively adds the production patterns that actually matter — HA control planes, proper PKI, etcd clustering, bastion architecture. Each level builds on the previous one, and at every step there’s working code you can run.

That’s what this project provides. Six GitHub repos across three virtualization tools (UTM, Vagrant, OrbStack) at two complexity levels (Simple and HA). Same architecture, same Ansible automation, same component versions — the only variables are the virtualization layer and the complexity level. This post maps the learning path through all of it.

The map

Two levels, three tools, six repos

Every repo in this project shares the same foundation: Kubernetes 1.32.0 installed from raw binaries (no kubeadm), HashiCorp Vault for PKI certificate management, Ansible for automation, and Ubuntu 24.04 ARM64 as the base OS. The difference is scope.

Tool	Simple (6 VMs)	HA (11 VMs)
UTM	k8s-utm-simple	k8s-utm-ha
Vagrant	k8s-vagrant-simple	k8s-vagrant-ha
OrbStack	k8s-orbstack-simple	k8s-orbstack-ha

Not sure which tool to pick? The UTM vs Vagrant vs OrbStack comparison covers deployment times, resource consumption, networking differences, and when each tool makes sense. Short version: OrbStack for the easiest start with the lowest resource footprint, UTM for maximum production realism, Vagrant for declarative infrastructure-as-code.

Figure: Side-by-side architecture, with the 6-VM Simple cluster on the left and the 11-VM HA cluster on the right. From 6 VMs to 11: the five additions (HAProxy, master-2, etcd-2, etcd-3, worker-3), highlighted in yellow, eliminate every single point of failure while keeping the same Ansible automation, same Vault PKI, and same component versions.

Level 1

Simple cluster: learn the fundamentals (6 VMs)

Start here. The simple setup is deliberately constrained (one master, one etcd node, two workers), but it’s not a toy. Every simple cluster includes a dedicated HashiCorp Vault server for PKI, a jump/bastion server, and Kubernetes installed the hard way from raw binaries. This is already more sophisticated than most homelab tutorials.

What you’re building

VM	Role	What you learn
vault	PKI & Secrets	Certificate management, Vault operations, PKI hierarchy
jump	Bastion / Ansible	Bastion pattern, SSH ProxyJump, Ansible automation
etcd-1	Key-value store	etcd basics, TLS configuration, data storage
master-1	Control plane	API server, controller-manager, scheduler — how they connect
worker-1/2	Worker nodes	kubelet, kube-proxy, containerd, pod scheduling

Concepts you’ll understand after Level 1

How Kubernetes components connect. The API server is the hub — everything talks to it. The controller-manager and scheduler connect as clients. Kubelets on worker nodes register with it. etcd sits behind it as the data store. Understanding this topology is fundamental, and the simple setup makes it visible because each component runs on a separate, identifiable VM.

Why certificates matter. Even the simple cluster uses Vault PKI with a 3-tier CA hierarchy: a Root CA, an Intermediate CA, and separate issuing CAs for Kubernetes, etcd, and the front proxy. Every connection between components is authenticated with TLS certificates. You’ll see firsthand what happens when a certificate is wrong, expired, or signed by the wrong CA.
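The "signed by the wrong CA" case can be reproduced without a cluster. The sketch below is a demo only: it creates two throwaway self-signed CAs and one leaf certificate with openssl in a temp directory. It is a flat two-level setup, not the Vault-managed 3-tier hierarchy, and the function names `make_ca` and `check` are invented for this example.

```shell
#!/usr/bin/env bash
# Demo only: throwaway CAs in a temp directory, not the repos' Vault PKI.
workdir="$(mktemp -d)"

# make_ca <name>: create a self-signed CA key and certificate.
make_ca() {
  openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -keyout "$workdir/$1.key" -out "$workdir/$1.crt" \
    -subj "/CN=$1" 2>/dev/null
}

make_ca etcd-ca
make_ca kubernetes-ca

# Issue one leaf certificate, signed by etcd-ca.
openssl req -newkey rsa:2048 -nodes \
  -keyout "$workdir/leaf.key" -out "$workdir/leaf.csr" \
  -subj "/CN=etcd-1" 2>/dev/null
openssl x509 -req -in "$workdir/leaf.csr" \
  -CA "$workdir/etcd-ca.crt" -CAkey "$workdir/etcd-ca.key" \
  -CAcreateserial -days 1 -out "$workdir/leaf.crt" 2>/dev/null

# check <ca-name>: report whether the leaf chains to the given CA.
check() {
  if openssl verify -CAfile "$workdir/$1.crt" "$workdir/leaf.crt" 2>&1 \
      | grep -q ': OK$'; then
    echo "trusted by $1"
  else
    echo "rejected by $1"
  fi
}

check etcd-ca         # the CA that issued the leaf
check kubernetes-ca   # a different CA
```

The first check should report the leaf as trusted and the second as rejected. That is the same failure a component hits when it is handed a certificate from the wrong CA.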

What “the hard way” actually means. No kubeadm, no abstractions. Every binary is downloaded individually. Every systemd unit file is written from scratch. Every kubeconfig is generated with explicit certificate references. When something breaks, you know exactly which config file to check because you wrote it.

The bastion pattern. Your Mac only connects to the jump server. From jump, you reach every other node. This is how production environments restrict access — a single hardened entry point instead of every node being directly SSH-accessible.
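One way to express the bastion pattern on the Mac side is an OpenSSH client config that uses `ProxyJump`. The sketch below only prints such a fragment. The function name `render_ssh_config`, the address `192.0.2.10` (a documentation-only IP), and the user `ubuntu` are placeholders, and it assumes the jump server can resolve the node names; the repos may wire this up differently.

```shell
#!/usr/bin/env bash
# render_ssh_config <jump-address> <user>
# Prints an ssh_config fragment that routes every internal node via "jump".
render_ssh_config() {
  local jump_addr="$1" user="$2"
  cat <<EOF
Host jump
    HostName ${jump_addr}
    User ${user}
    IdentityFile ~/.ssh/k8slab.key

Host vault etcd-* master-* worker-*
    User ${user}
    IdentityFile ~/.ssh/k8slab.key
    ProxyJump jump
EOF
}

# Placeholder values; substitute the real jump address and login user.
render_ssh_config 192.0.2.10 ubuntu
```

With a fragment like this appended to `~/.ssh/config`, `ssh master-1` would open the connection through the jump server instead of reaching the node directly.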

Ansible as infrastructure automation. The entire deployment is driven by Ansible playbooks and roles. You’ll learn how idempotent tasks work, how inventory files map to real machines, and how roles encapsulate reusable automation. Every playbook can be run multiple times safely — the second run changes nothing.
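A quick way to check the "second run changes nothing" claim is to inspect the PLAY RECAP that `ansible-playbook` prints at the end of a run. This is a rough sketch: the helper name `recap_unchanged` is made up, and the playbook and inventory names in the usage comment are placeholders rather than the repos' actual files.

```shell
#!/usr/bin/env bash
# Reads ansible-playbook output on stdin and succeeds only if no
# recap line reports a non-zero "changed=" counter.
recap_unchanged() {
  ! grep -Eq 'changed=[1-9]'
}

# Usage (file names are placeholders):
#   ansible-playbook -i inventory.ini site.yml    # first run: changes expected
#   ansible-playbook -i inventory.ini site.yml | recap_unchanged \
#     && echo "second run changed nothing"
```

The helper only looks at the `changed=` counters, so it complements rather than replaces checking the playbook's own exit status.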

Deployment times (Simple)

UTM Simple 5m 57s

OrbStack Simple 5m 59s

Vagrant Simple 6m 33s

From cold start to kubectl get nodes showing all nodes Ready.

→ UTM Simple repo → Vagrant Simple repo → OrbStack Simple repo

The gap

What’s missing from your simple cluster

The simple cluster works. Pods deploy, services route, kubectl responds. But hold it up against a production checklist and it fails on several critical items. Understanding why it fails is exactly where the most valuable learning happens.

Single point of failure: the control plane

One master node. Reboot it and kubectl stops responding. No new pods get scheduled. Existing workloads on workers keep running but can’t be managed, scaled, or healed. In production, losing the control plane means losing all operational capability.

Single point of failure: etcd

One etcd node stores the entire cluster state: every pod, service, secret, configmap, and RBAC policy. If the etcd process crashes, the API server loses its data store; if the disk fails and there is no backup, the state is gone for good. No quorum, no consensus, no fault tolerance. With a single node you also never see how Raft leader election works.

No load balancer for the API server

Every kubeconfig points directly to master-1’s IP. If you add a second master later, clients don’t know about it. There’s no abstraction layer between API server clients and the actual API server instances.

Only two workers

Two workers means limited scheduling decisions. Pod anti-affinity, topology spread constraints, and node failure scenarios are harder to explore. Three workers give the scheduler meaningful choices.

For the full production-readiness audit — including certificate rotation, etcd mutual TLS, network policies, and monitoring — see Why Your Homelab K8s Cluster Isn’t Production-Ready (And How to Fix It).

Level 2

HA cluster: production patterns on your laptop (11 VMs)

Level 2 closes the four gaps above by adding five new VMs. The architecture goes from “works” to “would survive a basic production review.” Here’s what changes and, more importantly, why each change matters.

What’s added in HA

New VM	Role	Why it exists
haproxy	API server load balancer	Abstracts away individual master IPs. All clients point to HAProxy. If a master dies, traffic routes to the survivor within seconds.
master-2	Second control plane	Eliminates the control-plane single point of failure. Both masters run identical components. Controller-manager and scheduler use leader election: only one instance of each is active, and the standby takes over within seconds once the leader’s lease expires.
etcd-2	Second etcd node	Three etcd nodes form a Raft consensus cluster. Quorum requires a majority (2 of 3), so the cluster tolerates one node failure. You learn leader election, log replication, and what happens during a network partition.
etcd-3	Third etcd node	Same as etcd-2: the third member of the Raft cluster.
worker-3	Third worker	Meaningful scheduling: pod anti-affinity, topology spread, and realistic node failure scenarios with workload redistribution.

The full 11-VM architecture

VM	Role	Simple	HA
haproxy	Load balancer	—	✓
vault	PKI & Secrets	✓	✓
jump	Bastion / Ansible	✓	✓
etcd-1	etcd	✓	✓
etcd-2/3	etcd cluster	—	✓
master-1	Control plane	✓	✓
master-2	Control plane	—	✓
worker-1/2	Worker nodes	✓	✓
worker-3	Worker node	—	✓

Deployment times (HA)

UTM HA 6m 13s

OrbStack HA 7m 26s

Vagrant HA 8m 10s

From cold start to full 11-VM HA cluster with all nodes Ready and Calico CNI installed.

→ UTM HA repo → Vagrant HA repo → OrbStack HA repo

Deep dives

The concepts that make HA meaningful

Moving from Simple to HA isn’t just adding more VMs. Each new component introduces a concept that matters in production. Here’s a primer on the three most important ones.

etcd quorum and the Raft consensus protocol

A single etcd node is a database. Three etcd nodes are a distributed consensus cluster. The difference is fundamental.

etcd uses the Raft protocol to maintain consistency across nodes. One node is elected leader — all writes go through it. The leader replicates each write to the followers, and the write is only committed once a majority (quorum) acknowledges it. With 3 nodes, quorum is 2. This means one node can fail completely and the cluster continues operating normally.

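This is why the magic number is 3, not 2. A 2-node etcd cluster has a quorum of 2, so both nodes must be healthy and losing either one halts writes. That makes two nodes worse than one for availability: twice as many machines that can fail, with no added fault tolerance. Clusters are sized at odd numbers (three, five, seven) because quorum is floor(n/2)+1: 3→2, 5→3, 7→4. An even member count raises the quorum without raising the number of failures the cluster can survive.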
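The quorum arithmetic can be checked with a few lines of shell. This is only an illustrative sketch: the function names `quorum` and `tolerated_failures` are invented for this example, and it relies on shell integer division rounding down.

```shell
#!/usr/bin/env bash
# Quorum for an n-member Raft cluster: floor(n/2) + 1.
# Shell arithmetic truncates, so $(( n / 2 )) is already the floor.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority still remains.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 2 3 4 5 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

The loop shows that 2 members tolerate no more failures than 1, and 4 no more than 3, which is the case for odd cluster sizes.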

In the HA setup, you can test this yourself. SSH to the jump server, stop etcd on one node (sudo systemctl stop etcd), and confirm the cluster still works.
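A possible shape for that experiment, using `etcdctl endpoint health` before and after stopping a member. Treat it as a sketch: the helper names are made up, the certificate paths are placeholders rather than the repos' actual layout, and 2379 is etcd's default client port, which the repos may or may not keep.

```shell
#!/usr/bin/env bash
# Build a comma-separated client endpoint list from etcd hostnames.
etcd_endpoints() {
  local list="" host
  for host in "$@"; do
    list="${list:+$list,}https://${host}:2379"
  done
  echo "$list"
}

# Report per-member health over mutual TLS.
# The default certificate paths below are placeholders (assumptions).
etcd_health() {
  etcdctl \
    --endpoints="$(etcd_endpoints etcd-1 etcd-2 etcd-3)" \
    --cacert="${ETCD_CACERT:-/etc/etcd/ca.crt}" \
    --cert="${ETCD_CERT:-/etc/etcd/client.crt}" \
    --key="${ETCD_KEY:-/etc/etcd/client.key}" \
    endpoint health
}

# Suggested sequence, run from the jump server:
#   etcd_health                             # all three members healthy
#   ssh etcd-2 sudo systemctl stop etcd     # take one member down
#   etcd_health                             # etcd-2 unhealthy, others fine
#   kubectl get nodes                       # API server still answers
#   ssh etcd-2 sudo systemctl start etcd    # restore the member
```

Stopping a second member as well would drop the cluster below quorum, which is worth trying once to see how the API server reacts.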

All three etcd nodes communicate using mutual TLS. Both peer and server certificates are signed by the dedicated etcd CA, separate from the Kubernetes CA. The CA separation is covered in the Vault PKI deep dive.

HAProxy and API server load balancing

In the simple setup, every kubeconfig points to https://master-1:6443. Add a second master and clients don’t know about it.

HAProxy sits in front of both masters as a TCP load balancer on port 6443. Every client points to https://haproxy:6443. HAProxy distributes connections round-robin and health-checks each master, so a failed master is taken out of rotation.
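A TCP-mode HAProxy configuration for this pattern can be very small. The sketch below prints one for a given list of masters. The function name `render_haproxy_cfg`, the section names, and the timeout values are illustrative assumptions, not the configuration shipped in the repos.

```shell
#!/usr/bin/env bash
# render_haproxy_cfg <master>...
# Prints a minimal haproxy.cfg that forwards TCP 6443 to each master.
render_haproxy_cfg() {
  cat <<'EOF'
defaults
    mode tcp
    timeout connect 5s
    timeout client  60s
    timeout server  60s

frontend kube_apiserver
    bind *:6443
    default_backend kube_masters

backend kube_masters
    balance roundrobin
EOF
  local host
  for host in "$@"; do
    # "check" enables periodic health checks for this server.
    echo "    server ${host} ${host}:6443 check"
  done
}

render_haproxy_cfg master-1 master-2
```

Because HAProxy runs in `mode tcp`, it passes the TLS stream through untouched, so the API server still terminates TLS and sees the client certificates.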

Only one controller-manager and one scheduler are active at any time via leader election. The API server itself doesn’t need leader election — both instances serve requests simultaneously.
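To see which instance is currently active, one option is to read the Lease objects that the controller-manager and scheduler use for leader election in the `kube-system` namespace. The helper name `lease_holder` is invented for this sketch, and it assumes the components run with their default lease names.

```shell
#!/usr/bin/env bash
# lease_holder <lease-name>
# Prints the identity of the instance holding the given leader lease.
lease_holder() {
  kubectl -n kube-system get lease "$1" \
    -o jsonpath='{.spec.holderIdentity}{"\n"}'
}

# Usage, from a machine with a working kubeconfig:
#   lease_holder kube-controller-manager
#   lease_holder kube-scheduler
# Stop the component on the reported master, wait for the lease to
# expire, and run the command again to watch the holder change.
```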

Vault PKI: the 3-tier CA hierarchy

Both Simple and HA clusters use a 3-tier Certificate Authority hierarchy with separate CAs for Kubernetes, etcd, and the front proxy. HA makes the consequences of good certificate design more visible.

The certificate count grows from about 15 in Simple to over 25 in HA. The Vault PKI deep dive covers the full hierarchy and the three Ansible roles that automate it.

Shared foundation

Same components, same versions, every repo

All six repos share identical component versions: Kubernetes 1.32.0, etcd 3.5.12, containerd 1.7.24, runc 1.2.4, Calico CNI 3.28.0, Vault 1.15.4, Ubuntu 24.04 (Noble) ARM64.

The Ansible roles are also shared. Improvements to any role benefit all six repos immediately.

The right tool

Which virtualization tool at each level

The full comparison post covers this in depth.

Starting out? Use OrbStack Simple.

Lowest barrier to entry. 6 VMs use ~10 GB disk. Your Mac stays cool.

Ready for HA? Your Mac’s RAM decides.

32 GB or less: OrbStack HA. 48 GB or more: UTM HA or Vagrant HA.

Want infrastructure-as-code?

Vagrant. Everything in a Vagrantfile. See the Vagrant deep dive.

Quick start

Get running in about 6 minutes

# Install OrbStack from orbstack.dev, then:

ssh-keygen -t ed25519 -f ~/.ssh/k8slab.key -C "k8s-homelab" -N ""

git clone https://github.com/labitlearnit/k8s-orbstack-simple-homelab.git

cd k8s-orbstack-simple-homelab

bash scripts/k8s-orbstack-simple-homelab.sh

