This repository was born out of the need to better manage an ever-growing homelab environment. After starting with a simple single-node Docker Compose setup, the increasing number of services began to make maintenance and updates more challenging.
As the complexity grew, it became clear that a more structured, Infrastructure-as-Code approach was needed to:
- keep configurations versioned and thus better documented
- make deployments more consistently repeatable and reliable
- simplify the process of adding new services without
- enable easier backup and disaster recovery
- provide better scalability and resilience beyond a single node
I decided to take this opportunity to properly learn Kubernetes hands-on, embracing the complexity and "feeling the pain" that comes with it rather than just having the theoretical knowledge. This repo serves as both documentation of my setup and a real-world learning experience in managing, as code, infrastructure that I rely upon.
PS: This setup is mature enough to be girlfriend-approved. 😉
At the highest possible level, this repo and HaC workflow consist of three parts:
- cloud-init contains the stage 1 bootstrapping for the cluster nodes. This includes only the very basic OS-level configuration required for the other stages of this workflow. The contained shell script creates all files required to install the OS via network boot and without user interaction. Triggering the network-boot installation is out of scope for the moment. After completion of the cloud-init autoinstall, all nodes reboot and are ready to accept SSH connections.
- ansible contains the stage 2 system configuration for the cluster nodes. This covers a range of tasks including power management, networking setup, and most importantly bootstrapping the Kubernetes cluster using kubeadm. The contained Ansible playbook and roles perform the required tasks on the nodes via SSH, using a dedicated ansible user created in the previous step. After completion of this stage, the Kubernetes cluster is set up with HA control planes, joined worker nodes, dual-stack CNI, almost working OIDC authn, and last but not least a bootstrapped GitOps setup that is ready to start reconciling.
- flux contains the final stage 3 GitOps cluster configuration. This includes everything running inside Kubernetes in the cluster and ranges from basic system infrastructure like load balancer, ingress, and CSI to more user-style applications such as password manager and file management apps. The contained Flux kustomizations are automatically installed and/or reconciled on the cluster without* user interaction. This process is staggered since some of the components inherently depend on others. After completion of this stage, the cluster is fully set up and ready for use.
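
The staggering in stage 3 can be pictured as grouping components into waves, where each wave depends only on earlier ones. The following is a minimal Python sketch of that idea, not how Flux works internally (Flux expresses ordering through `dependsOn` on its Kustomization objects and retries until dependencies are ready). The component names are taken from the tables in this README, but the dependency edges in `DEPENDS_ON` and the helper `reconcile_waves` are hypothetical and chosen purely for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key depends on the names in its value set.
# The edges are illustrative assumptions, NOT this repo's actual configuration.
DEPENDS_ON = {
    "metallb": set(),
    "cert-manager": set(),
    "longhorn": set(),
    "ingress-nginx": {"metallb", "cert-manager"},
    "cloudnative-pg": {"longhorn"},
    "nextcloud": {"ingress-nginx", "cloudnative-pg"},
}


def reconcile_waves(depends_on):
    """Group components into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        # Everything whose dependencies are already done can start together.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(reconcile_waves(DEPENDS_ON), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

With the assumed edges above, infrastructure without prerequisites lands in the first wave and user-facing applications come last, which mirrors the order in which the cluster becomes usable.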
Component | Purpose | Notes |
---|---|---|
Ubuntu Server 24.04 | Base Operating System | |
cloud-init | Headless OS Installation | see cloud-init/README.md |
Ansible | OS Configuration | |
kubeadm | k8s Distribution / Install Mechanism | stacked HA controlplanes |
containerd | OCI Runtime | |
Calico | CNI | dual-stack nodes and services |
kube-vip | Virtual IP for controlplane Nodes | used in L2/ARP mode |
Flux2 | GitOps Automation inside the Cluster | |
SOPS | Secrets Management | age rather than pgp, but not any more user-friendly |
Name | Purpose | Notes | |
---|---|---|---|
metallb | Cloud-Native Service LoadBalancer | used in L2/ARP mode, so only VIP rather than true LB | |
external-dns | DNS Management Automation | split-horizon realized using opnsense webhook | |
cert-manager | Automated Certificate Management | Let's Encrypt via ACME DNS | |
ingress-nginx | Ingress Controller | ||
Renovate Bot | Dependency Update Automation | used for multiple repos, not just this one | |
longhorn | Cloud-Native Distributed Block Storage CSI | ||
democratic-csi | CSI for Common External Storage Systems | using the freenas-nfs implementation | |
stash | Cloud-Native Backup/Restore | freemium/open core, but really good | |
CloudNativePG | Cloud-Native PostgreSQL Operator | ||
Grafana | Monitoring and Observability | ||
Prometheus | Metrics Aggregation and Storage | ||
Loki | Log Aggregation and Storage | ||
descheduler | Pod Eviction for Node Balancing | ||
reloader | Hot-Reload for ALL Workloads | ||
Dex | OIDC Provider | used for api-server authentication | |
metrics-server | Metrics API | ||
Goldilocks | Resource Recommendation Engine | ||
Vertical Pod Autoscaler | Workload Resource Scaler | used exclusively for Goldilocks recommendations | |
Name | Purpose | Notes | |
---|---|---|---|
Pi-hole | Filtering DNS Proxy | ||
Nextcloud | File Storage and Management | ||
Vaultwarden | API-compatible Password Manager | ||
Immich | Photo/Video Storage and Management | ||
Paperless-ngx | Document Management System | ||
Firefly III | Personal Finance Manager | ||
Homepage | Application Dashboard | ||
FreshRSS | RSS Aggregator | ||
RSS-Bridge | Unofficial RSS Feeds of ANY Source | any as long as you know some PHP | |
Stirling PDF | Swiss-Army Knife for PDFs | ||
Overleaf | LaTeX Editor | ||
UniFi Network Application | AP Administration and Management | ||
n8n | Workflow Automation | freemium/open core | |
Jellyfin | Media Streaming and Management | ||
Gluetun | VPN Gateway | ||
qBittorrent | Torrent Client | ||
SABnzbd | Usenet Client | ||
Prowlarr | Torrent & Usenet Indexer Engine | ||
Radarr | Movie Management | ||
Sonarr | TV Show Management | ||
Lidarr | Music Management | ||
FlareSolverr | Cloudflare Protection Bypass | ||
While the ultimate goal is to have as self-sufficient a setup as possible, some external services are still required for proper operation.
Service | Purpose | Notes |
---|---|---|
GitHub | Git Repository Hosting, GitOps Source | |
INWX | Domain Registrar | |
Cloudflare | Public DNS Auth Hosting | |
netcup | Public Reverse-Proxy for Relevant Services | not yet managed here since the number of public services is tiny |
Backblaze | Cloud Storage for Backups | the offsite "1" in 3-2-1 for the really important data |
Tailscale | Overlay VPN | used for split-horizon and a direct connection back home |
VPN Provider | VPN Gateway | different external IP for all the Linux ISOs |