diff --git a/cluster-v2-design.md b/cluster-v2-design.md
new file mode 100644
index 0000000..723a3df
--- /dev/null
+++ b/cluster-v2-design.md
@@ -0,0 +1,277 @@
+# Design for Second Iteration of Cluster/Homelab
+
+## Context
+
+Current cluster was set up just to run CI builds as a
+trial.
+
+I'm now sold that k8s is a good approach and would like
+to move more of my services to it.
+
+This document will track my design for cluster v2.
+
+
+## Investigation
+
+### Host OS
+
+Debian:
+- on laptop
+- already on most of my systems
+- stable
+- not officially tested by k3s
+- Will be using apt at work
+
+Stream:
+- Tried with k3s and had to disable systemd...
+  - On second try seemed to work even with error I saw before.
+- Cockpit is nice when managing servers.
+- Want to like RHEL
+- More stable than Fedora
+- RPMs are easier to work with
+- Using on VM host
+
+Fedora:
+- Want to like RHEL
+- Tested with k3s
+- Latest podman and friends
+- Really fast for something stable...
+- Cockpit is nice
+- Fedora minimal can't be installed on
+  cockpit.
+
+Decision: Stream
+
+Put /var/lib/rancher on a separate partition.
+
+### K3S Distro
+
+RKE2:
+  - no Debian support
+  - 4GB Minimum
+  - 2 CPU
+  - cilium and nginx not default
+
+k3s:
+  - k3d is a thing
+  - documentation online is good
+  - 512 MB of RAM
+  - 1 CPU
+  - easy installation
+
+Decision: k3s
+
+### How many clusters?
+
+Decision: Exactly two (one for "need to work" services, one for CI and messing around).
+The mess with longhorn scared me... it wouldn't be that big a deal if it only affected
+CI, but it also affected Kanboard and git.
+
+### Files
+
+Decision: Host local.
+
+Files are not something I want to have to think about.
+The longhorn mess scared me.
+NFS not working with postgres is annoying.
+
+### How many nodes per cluster?
+
+The current cluster has lots of small VMs, with VMs added with more
+CPUs/RAM as the requirements grew.
+
+I'd rather limit myself to fewer, more powerful VMs, and let the VM OS manage
+CPU and memory.
+
+More nodes would be useful if they were on different base hardware.
+Realistically I'm never going to pay for more than the Ingress VM...
+
+Decision:
+1 big VM per cluster.
+Both VMs hosted on current hardware.
+If we add hardware, can add an additional node at that time.
+
+### Networking
+
+Status quo is flannel (vxlan) with Traefik, Klipper LB, and CoreDNS.
+
+#### DNS
+
+CoreDNS is great.
+
+#### Load Balancer
+
+Klipper works fine now.
+MetalLB is the other option, but it is more complicated and doesn't
+seem to give much, particularly with a single-node cluster.
+
+Decision: Klipper
+
+#### Ingress
+
+Traefik:
+- Status Quo.
+- Works fine.
+- I don't like it outside of k8s.
+
+ingress-nginx:
+- Kubernetes community project (Google)
+- Used by a lot of people.
+- Nothing sexy or risky.
+- auth exposed in annotations
+
+nginx-ingress:
+- nginx upstream.
+- extra features like stream support that I'm using
+  on lightsail now.
+- full blown virtual server support.
+- maybe too complicated?
+- exposes same features as I have on lightsail through
+  annotations, which could be a thing to get Keycloak to
+  work.
+- auth in annotations is behind paywall, but available through
+  a virtual server
+
+Decision: ingress-nginx
+Use LB for stuff I would use the virtual server for.
+
+#### CNI
+
+flannel vxlan
+- status quo
+- works fine
+
+cilium
+- label based network policies
+- leaning toward this plus multus though I doubt
+  I'll ever write a policy
+- I want the ability to write a policy...
+- if set up with a different pod cidr can do multi-cluster later
+  - cluster name and cluster id at install time
+- can do transparent encryption (not worth it...)
+
+cilium multi-cluster networking:
+- not worth the complexity
+- will manage connections with ingress/egress methods.
+
+flannel wg
+- encrypt traffic and set up overlay if I want to interact with
+  cloud machines
+- can do the same with a manual wireguard network...
+
+istio
+- I dislike side car containers
+- Traffic I'm interested in is mainly not L7.
+- blessed by Air Force
+
+Decision: flannel vxlan
+Not worth the extra complexity of cilium.
+
+## What goes on each cluster/VM?
+
+Lightsail:
+1. Wireguard
+2. Apt/RPM repos
+3. Main NGINX Proxy
+
+Infra Cluster:
+- On Host:
+  1. CoreDNS
+  2. Wireguard
+- On Cluster:
+  1. Keycloak
+  2. Kanboard
+  3. OneDev
+  4. Harbor
+
+Main Cluster:
+- On Host:
+  1. Wireguard
+- On Cluster:
+  1. Tekton
+  2. MQTT Broker
+  3. Squid
+  4. j7s-os-deployment
+
+## Deployments
+
+Manually kubectl apply:
+- Easy to reason about
+- running apply is fun
+- using flux has chicken and egg problem if git is also
+  deployed from flux
+
+Flux:
+- More GitOps-y
+- chicken and egg problem is conquerable, in a maybe
+  confusing way
+
+Decision:
+1. Infra:
+   1. kubectl apply/helm everything
+   2. Drop keycloak image manually in k3s either using cri or
+      placing it in the magic place after k3s install.
+   3. Use helm with values for onedev.
+   4. Get rid of Kanboard custom image.
+      Use kubectl apply.
+2. Test:
+   1. Mostly kubectl apply for tekton.
+   2. Use flux for:
+      1. MQTT
+      2. j7s-os-deploy
+      3. squid
+
+## VM Resources
+
+Lightsail:
+- Leave alone
+
+Infra Cluster:
+- On Host:
+  1. CoreDNS
+  2. Wireguard
+- On Cluster:
+  1. Keycloak
+  2. Kanboard
+  3. OneDev
+  4. Harbor
+
+Main Cluster:
+- On Host:
+  1. Wireguard
+- On Cluster:
+  1. Tekton
+  2. MQTT Broker
+  3. Squid
+  4. j7s-os-deployment
+
+## Stuff to experiment with
+[ ] Manually placing keycloak image in k3s through k3s thing
+    and/or through cri.
+
+[ ] Keycloak ssl passthrough.
+
+[ ] Fedora 37 Server install with k3s.
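
For the first checkbox above, here is a minimal sketch of the two approaches. It assumes the "k3s thing" is the agent images directory that k3s scans for image tarballs when it starts, and that "through cri" means the `ctr` client bundled with k3s. The tarball name `keycloak.tar` and the helper function names are hypothetical placeholders, not something from this design.

```shell
#!/bin/sh
# Sketch only: the tarball name is a hypothetical placeholder.
IMAGE_TAR="${IMAGE_TAR:-keycloak.tar}"
# Directory k3s scans for image tarballs when the service starts.
K3S_IMAGES_DIR="${K3S_IMAGES_DIR:-/var/lib/rancher/k3s/agent/images}"

# Option 1: stage the tarball so k3s imports it on its next start.
stage_image() {
    mkdir -p "$K3S_IMAGES_DIR" && cp "$1" "$K3S_IMAGES_DIR/"
}

# Option 2: import straight into the running containerd via the bundled ctr.
import_image() {
    k3s ctr images import "$1"
}

# Example (as root, after `podman save -o keycloak.tar <image>`):
#   stage_image "$IMAGE_TAR" && systemctl restart k3s
#   import_image "$IMAGE_TAR"
```

Staging the file survives a reinstall of the node as long as the directory is kept, while the `ctr` import takes effect immediately but has to be repeated if the containerd image store is wiped.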
+
+## Experiments
+
+### k3s with cilium and nginx on CentOS Stream 9
+
+```
+# Disable firewalld so it does not interfere with cluster networking
+systemctl disable firewalld --now
+# Skip bundled Traefik, flannel, and network policy so cilium and nginx can replace them
+export INSTALL_K3S_EXEC="server --disable traefik --flannel-backend=none --disable-network-policy --selinux"
+curl -sfL https://get.k3s.io | sh -s -
+```
+I see an error about SELinux policies conflicting, but I'm not sure if it matters.
+
+Install cilium following instructions here:
+https://docs.cilium.io/en/v1.12/gettingstarted/k3s/
+
+Install nginx with:
+```
+# Installs (or upgrades) the ingress-nginx chart into its own namespace
+helm upgrade --install ingress-nginx ingress-nginx \
+  --repo https://kubernetes.github.io/ingress-nginx \
+  --namespace ingress-nginx --create-namespace
+```
+
+### k3s with nginx on Fedora Server