OrbStack is the lightest way to run Kubernetes on Apple Silicon — shared kernel, instant machine creation, minimal resource consumption. But “lightest” comes with its own set of surprises. This post covers what breaks, what confuses, and what behaves differently when building K8s clusters with OrbStack on an M-series Mac, drawn from real deployment experience with both the Simple cluster and the full 11-machine HA deployment.
Each gotcha follows the same format: what happens, why it happens, and how to fix it. For gotchas that apply to all three tools (UTM, Vagrant, OrbStack), see the HA-Specific Gotchas post. For UTM and Vagrant-specific issues, see the UTM Gotchas and Vagrant Gotchas posts.
Gotcha #1: Swap Can’t Be Disabled — And That’s Fine
What happens: You run swapoff -a inside an OrbStack machine, but swap stays active and free -h continues to show swap space. By default the kubelet refuses to start when it detects swap (failSwapOn defaults to true), so the cluster fails to come up.
Why it happens: OrbStack machines share the host kernel, and OrbStack uses zram swap that’s managed at the host level. There’s no per-machine swap configuration — you can’t disable it because it’s not your kernel to configure. The swapoff -a command appears to succeed but has no lasting effect because the zram device is managed by the host.
How to fix it: Configure the kubelet to tolerate swap instead of trying to disable it. Kubernetes has supported running with swap since v1.22 (alpha), with the feature graduating to beta in v1.28 and reaching GA in v1.34. This project uses Kubernetes 1.32.0, where the NodeSwap feature gate is beta and enabled by default.
The kubelet configuration sets:
# kubelet-config.yaml
failSwapOn: false
This tells the kubelet to start normally even with swap present. The default swap behavior (NoSwap) means Kubernetes workloads won’t actually use swap — they just won’t be blocked from starting because of it. This isn’t a workaround; it’s the intended configuration for environments where swap can’t be turned off at the host level.
On UTM and Vagrant, VMs have their own kernels and swapoff -a works as expected. This gotcha is unique to OrbStack’s shared-kernel architecture.
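Before settling on failSwapOn: false, it can help to confirm what a machine actually reports. The following is a minimal sketch that reads SwapTotal from /proc/meminfo (a standard Linux file). The helper names swap_total_kb and swap_report are hypothetical and are not part of the deploy scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the repo): report whether swap is visible.

# Print SwapTotal in kB from a meminfo-formatted file (default: /proc/meminfo).
swap_total_kb() {
  awk '/^SwapTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Print a one-line verdict based on SwapTotal.
swap_report() {
  local kb
  kb="$(swap_total_kb "${1:-/proc/meminfo}")"
  if [ "${kb:-0}" -gt 0 ]; then
    echo "swap present (${kb} kB): kubelet needs failSwapOn: false"
  else
    echo "no swap detected"
  fi
}
```

Sourcing this inside a machine and calling swap_report with no arguments checks the live /proc/meminfo. On OrbStack the first branch is the expected outcome.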
Gotcha #2: kube-proxy conntrack Permission Denied
What happens: kube-proxy starts but logs errors about failing to set net.netfilter.nf_conntrack_max or conntrack.maxPerCore. You see “permission denied” in the kube-proxy logs when it tries to modify sysctl parameters.
Why it happens: The shared kernel means certain sysctl parameters can’t be modified from within an OrbStack machine, because a write would affect every machine running on that kernel. Conntrack settings are kernel-wide, so changing them from inside one machine would change them for all machines. The kernel correctly denies the write.
How to fix it: Set conntrack.maxPerCore to 0 in the kube-proxy configuration, which tells kube-proxy to skip the sysctl modification and use whatever the host kernel provides:
# kube-proxy-config.yaml
conntrack:
  maxPerCore: 0
  min: 0
Setting it to 0 doesn’t mean “zero connections allowed” — it means “don’t try to set this value, use the kernel default.” The OrbStack host kernel has a perfectly reasonable conntrack limit already configured. The Ansible roles handle this automatically, but if you’re configuring kube-proxy manually, this is a required change for OrbStack.
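If you maintain the kube-proxy config by hand, a small pre-flight check can catch a missing or non-zero value before kube-proxy starts logging errors. This is a sketch only: the function names are hypothetical, and the parsing assumes the simple "maxPerCore: N" layout shown above rather than arbitrary YAML.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not from the repo).

# Print the maxPerCore value found in a kube-proxy config file.
# Assumes the key appears on its own line as "maxPerCore: N".
conntrack_max_per_core() {
  awk '$1 == "maxPerCore:" { print $2; exit }' "$1"
}

# Succeed only when maxPerCore is explicitly set to 0.
conntrack_is_hands_off() {
  [ "$(conntrack_max_per_core "$1")" = "0" ]
}
```

For example, conntrack_is_hands_off kube-proxy-config.yaml || echo "fix maxPerCore before deploying". To see the limit the shared kernel already provides, reading /proc/sys/net/netfilter/nf_conntrack_max inside a machine should work where the conntrack module is loaded.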
Gotcha #3: The Dual-IP Problem — hostname -I Returns the Wrong IP
What happens: A Kubernetes component (kubelet, etcd, API server) binds to the wrong IP address. Running hostname -I inside an OrbStack machine returns two IPs, and the first one might not be the static IP you configured via cloud-init.
Why it happens: OrbStack assigns two IP addresses to each machine’s eth0 interface: the static IP from your cloud-init config (192.168.139.x) and a dynamic IP that OrbStack uses internally for its networking layer. Unlike Vagrant, which puts the two IPs on separate interfaces (eth0 and eth1), OrbStack stacks both on the same interface. hostname -I returns both, and which one is listed first is not guaranteed.
How to fix it: The same principle as the Vagrant dual-NIC problem: never rely on auto-detection. Always specify bind addresses explicitly in every component configuration:
# Check both IPs on eth0
ip addr show eth0
# You'll see two inet entries on the same interface

# Verify which IP the Ansible inventory is using
cat ~/k8s-orbstack-ha-homelab/ansible/inventory/homelab.yml
# All IPs should be 192.168.139.x addresses

# If a component bound to the wrong IP, check its config
ss -tlnp | grep 6443   # API server
ss -tlnp | grep 2379   # etcd
ss -tlnp | grep 10250  # kubelet
The Ansible playbooks handle this by always specifying the exact bind address rather than relying on auto-detection. If you’re adding custom services to the cluster, remember this dual-IP behavior and bind explicitly.
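For custom services, one way to bind explicitly is to select the address by subnet prefix instead of by position in the hostname -I output. The sketch below assumes the static subnet is a /24 whose first three octets you already know. pick_static_ip is a hypothetical helper, not something the Ansible roles provide.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repo).
# Usage: pick_static_ip PREFIX IP [IP...]
# Prints the first IP that starts with "PREFIX." and returns 0,
# or prints nothing and returns 1 when no address matches.
pick_static_ip() {
  local prefix="$1"
  shift
  local ip
  for ip in "$@"; do
    case "$ip" in
      "$prefix".*)
        echo "$ip"
        return 0
        ;;
    esac
  done
  return 1
}
```

Inside a machine this could be called as pick_static_ip 192.168.139 $(hostname -I), left unquoted on purpose so each address becomes its own argument. The result can then be passed to whatever bind option the service exposes, for example the kubelet’s --node-ip flag.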
Gotcha #4: File Copy to OrbStack Machines Is Slow
What happens: The Ansible deployment phase takes longer than expected. Specifically, tasks that distribute binaries (Kubernetes binaries, etcd tarball, containerd tarball) across 11 machines are noticeably slower than the same tasks on UTM or Vagrant.
Why it happens: File transfer to OrbStack machines is slower than to full QEMU VMs. This is a known characteristic of OrbStack’s I/O path. Each binary distribution involves SCP/rsync from the jump server to 10 other machines, and the per-file overhead adds up when you’re distributing ~500 MB of binaries across 11 machines.
How to fix it: This is the primary reason OrbStack HA (7m 26s) is over a minute slower than UTM HA (6m 13s) despite machines starting almost instantly. You can’t eliminate the overhead, but the deploy script minimizes it by:
# Pre-caching all binaries on the jump server first
# This happens once, then Ansible distributes from jump to nodes
# over the fast local network

# The alternative (having each machine download from the internet)
# would be even slower since all 11 machines share the same
# OrbStack network path to the Mac's internet connection

# To skip the binary distribution on re-runs:
./scripts/k8s-orbstack-ha-homelab.sh --from-step 7  # Resume after caching
If speed is your top priority, UTM gives you the fastest total deployment time (6m 13s). OrbStack’s advantage is elsewhere — machine creation speed, memory efficiency, and daily driver convenience.
Gotcha #5: VS Code Terminal Can’t Reach OrbStack IPs
What happens: SSH commands to OrbStack machines fail from VS Code’s integrated terminal with “Connection refused” or “No route to host,” but the same commands work perfectly from macOS Terminal.app or iTerm2.
Why it happens: VS Code’s integrated terminal may not inherit the correct network routing for OrbStack’s static IP subnet (192.168.139.0/24). This appears to be related to how VS Code handles network interfaces and DNS resolution internally, particularly when OrbStack’s network configuration is set up through its own networking layer rather than standard macOS interfaces.
How to fix it: Use macOS Terminal.app or iTerm2 for cluster management instead of VS Code’s integrated terminal:
# Test from VS Code terminal
ssh jump  # If this fails...

# Try from macOS Terminal
ssh jump  # ...and this works, use Terminal for cluster work

# Or use VS Code's remote SSH extension to connect to jump
# and run commands from there
This gotcha doesn’t affect UTM or Vagrant since their networking goes through standard macOS vmnet interfaces that VS Code handles correctly.
Gotcha #6: OrbStack Machines Are Not VMs — Implications
What happens: You try to do something that requires kernel-level access (load a custom kernel module, modify kernel parameters, use a different kernel version) and it doesn’t work. Or you’re troubleshooting a network policy issue and the behavior doesn’t match what you’d see on a real VM.
Why it happens: OrbStack machines share the host kernel (6.17.8-orbstack). They’re lightweight Linux environments, not full VMs with their own kernels. For 95% of Kubernetes learning — deployments, services, RBAC, Helm, monitoring, CI/CD — this is indistinguishable from a full VM. The 5% where differences surface includes: custom kernel module loading, kernel-level security policies (AppArmor/SELinux), low-level syscall behavior, and network namespace isolation edge cases.
How to fix it: You don’t fix this — you understand it and work within it. For the vast majority of K8s learning and experimentation, OrbStack behaves identically to full VMs. If you hit an edge case that requires real kernel isolation, switch to UTM or Vagrant for that specific investigation — the same Ansible roles work across all three tools.
# Check the kernel version inside an OrbStack machine
uname -r
# Returns something like: 6.17.8-orbstack
# This is the shared host kernel, not a per-machine kernel

# Check if a kernel module is available
lsmod | grep overlay
# If the module isn't loaded and you can't modprobe it,
# that's a shared-kernel limitation
Gotcha #7: orb create Silently Fails with Bad Cloud-Init
What happens: orb create ubuntu:noble machine-name completes successfully, but the machine doesn’t have the expected hostname, user, SSH key, or static IP. Everything from the cloud-init config was silently ignored.
Why it happens: If the cloud-init YAML has a syntax error (bad indentation, missing colons, invalid YAML), orb create creates the machine anyway but skips the cloud-init configuration. There’s no error message indicating the cloud-init config was invalid. This is more forgiving than UTM (where a bad cloud-init ISO prevents boot) but also more dangerous because the machine appears to work.
How to fix it: Validate your cloud-init YAML before passing it to orb create:
# Validate YAML syntax
python3 -c "import yaml; yaml.safe_load(open('cloud-init/jump.yaml'))"

# After machine creation, verify cloud-init ran
orb run jump -- cloud-init status
# Should say "done" with no errors

# Check if the k8s user was created
orb run jump -- id k8s
# If "no such user", cloud-init didn't apply

# Check cloud-init logs for details
orb run jump -- cat /var/log/cloud-init-output.log | tail -30
If cloud-init didn’t run, delete the machine (orb delete machine-name), fix the YAML, and recreate. The deploy script’s cloud-init configs are tested and working — this gotcha mainly hits when you’re customizing configs.
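A YAML syntax check alone won’t catch a missing header: cloud-init only treats user data as cloud-config when the first line is exactly #cloud-config, and to a YAML parser that line is just a comment. The sketch below adds that check. The function name is hypothetical, and the file path in the usage note is the one used in the validation example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repo).
# Succeeds only when the first line of the file is exactly "#cloud-config".
has_cloud_config_header() {
  local first
  first="$(head -n 1 "$1")"
  [ "$first" = "#cloud-config" ]
}
```

For example, has_cloud_config_header cloud-init/jump.yaml || echo "missing #cloud-config header". For a deeper check, cloud-init ships a schema validator (cloud-init schema --config-file FILE) that can be run anywhere cloud-init is installed, such as inside an existing machine.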
Gotcha #8: OrbStack Network Prefix Changes Between Installations
What happens: You reinstall OrbStack or upgrade to a new version, and the network prefix changes from 192.168.139 to a different subnet. Existing cloud-init configs, Ansible inventory files, and /etc/hosts entries all reference the old subnet.
Why it happens: OrbStack’s subnet is configured in its settings and can change between installations. The deploy script auto-detects it using orb config show | grep network.subnet4, but hardcoded references in cloud-init configs or manually edited inventory files won’t update automatically.
How to fix it: The deploy script handles this by auto-detecting the prefix and generating all configs dynamically. If you’ve hardcoded IPs anywhere, update them after an OrbStack reinstall:
# Check the current OrbStack subnet
orb config show | grep network.subnet4

# If it changed, re-run the full deploy script
# It will auto-detect the new prefix and regenerate everything
bash scripts/k8s-orbstack-ha-homelab.sh

# Or destroy and recreate if the old machines have wrong IPs
bash scripts/destroy-vms.sh
bash scripts/k8s-orbstack-ha-homelab.sh
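To find hardcoded references after a subnet change, it may help to derive the three-octet prefix from a CIDR string and search your own files for the old one. This is a sketch under two assumptions: the subnet is a /24, and you pass in the CIDR value yourself (it does not parse the output of orb config show, whose exact format is not assumed here). Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the repo). Assumes a /24 subnet.

# "192.168.139.0/24" -> "192.168.139"
subnet_prefix() {
  local addr="${1%%/*}"   # drop the "/24" suffix
  echo "${addr%.*}"       # drop the last octet
}

# List file:line matches that still mention OLD_PREFIX under DIR.
# Usage: find_stale_ips OLD_PREFIX DIR
find_stale_ips() {
  grep -rnF -- "$1." "$2"
}
```

For example, find_stale_ips "$(subnet_prefix 192.168.139.0/24)" ansible/inventory lists every inventory line that still points at the old subnet. The directory name here is illustrative.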
Quick Reference: OrbStack Diagnostics
When something goes wrong with OrbStack machines, these commands help narrow down the issue:
# List all OrbStack machines
orb list

# Check machine status
orb info jump

# Run a command inside a machine without SSH
orb run jump -- hostname -I

# Check both IPs on eth0
orb run jump -- ip addr show eth0

# Check cloud-init status
orb run jump -- cloud-init status

# Check swap status (it will always show active)
orb run worker-1 -- free -h

# Check kube-proxy logs for conntrack errors
ssh jump 'kubectl logs -n kube-system -l k8s-app=kube-proxy --tail=20'

# Check kernel version (shared host kernel)
orb run jump -- uname -r

# Check OrbStack network config
orb config show | grep network

# Destroy a single machine
orb delete machine-name

# Destroy all machines
bash scripts/destroy-vms.sh
Where to Go Next
These gotchas cover the OrbStack-specific issues. For problems that hit during the Ansible deployment phase — Vault seal/unseal, certificate SANs, etcd quorum, Calico initialization — see the HA-Specific Gotchas post, which covers cross-tool issues that apply regardless of whether you’re running UTM, Vagrant, or OrbStack.
For the full deployment walkthrough, see the OrbStack HA deep dive. For the full roadmap from simple to HA, see From Simple to HA: A Learning Path for Kubernetes on Apple Silicon.
Big tech, small lab. One reel at a time.
Questions, corrections, or want to share how you’re using these repos?
labitlearnit@gmail.com