First pass of cluster v2 design.
parent 40750e12eb
commit 8f1d8ef784

# Design for Second Iteration of Cluster/Homelab

## Context

The current cluster was set up just to run CI builds as a
trial.

I'm now sold that k8s is a good approach and would like
to move more of my services to it.

This document will track my design for cluster v2.

## Investigation

### Host OS

Debian:
- on laptop
- already on most of my systems
- stable
- not officially tested by k3s
- Will be using apt at work

CentOS Stream:
- Tried with k3s and had to disable systemd...
- On second try it seemed to work, even with the error I saw before.
- Cockpit is nice when managing servers.
- Want to like RHEL
- More stable than Fedora
- RPMs are easier to work with
- Using on VM host

Fedora:
- Want to like RHEL
- Tested with k3s
- Latest podman and friends
- Really fast for something stable...
- Cockpit is nice
- Fedora minimal can't be installed on
  cockpit.

Decision: CentOS Stream

Put /var/lib/rancher on a separate partition.

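The partition note can be checked after install with a small helper. This is only a sketch: it assumes "var/rancher" means `/var/lib/rancher` (where k3s keeps its data by default) and that GNU `stat` is available. The function name `is_own_mount` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when the directory sits on a different
# filesystem than its parent (i.e. it is its own mount point),
# exit 1 when it shares the parent's filesystem, exit 2 on error.
is_own_mount() {
    dir=$1
    [ -d "$dir" ] || return 2
    # %d prints the device number of the filesystem holding the path.
    dev_self=$(stat -c %d "$dir") || return 2
    dev_parent=$(stat -c %d "$dir/..") || return 2
    [ "$dev_self" != "$dev_parent" ]
}
```

Something like `is_own_mount /var/lib/rancher && echo "separate partition"` would be the intended use. A bind mount from the same filesystem would not be detected by this check.
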
### K3S Distro

RKE2:
- no Debian support
- 4 GB RAM minimum
- 2 CPUs
- cilium and nginx not default

k3s:
- k3d is a thing
- documentation online is good
- 512 MB of RAM
- 1 CPU
- easy installation

Decision: k3s

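The minimums listed above can be turned into a quick pre-flight check when sizing a VM. This is a sketch only: the thresholds are the ones from the two lists, and the function name `meets_minimum` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: meets_minimum <distro> <memory in MB> <cpu count>
# Thresholds are the figures noted above (k3s: 512 MB / 1 CPU,
# RKE2: 4 GB / 2 CPUs). Exit 0 if met, 1 if not, 2 for an unknown distro.
meets_minimum() {
    distro=$1
    mem_mb=$2
    cpus=$3
    case $distro in
        k3s)  need_mem=512;  need_cpu=1 ;;
        rke2) need_mem=4096; need_cpu=2 ;;
        *)    echo "unknown distro: $distro" >&2; return 2 ;;
    esac
    [ "$mem_mb" -ge "$need_mem" ] && [ "$cpus" -ge "$need_cpu" ]
}
```

On a Linux host the arguments could come from `/proc/meminfo` and `nproc`, for example `meets_minimum k3s "$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)" "$(nproc)"`.
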
### How many clusters?

Decision: Exactly two (one for "need to work" services, one for CI and messing around).
The mess with Longhorn scared me... it wouldn't be that big a deal if it only affected
CI, but it also affected Kanboard and git.

### Files

Decision: Host local.

Files are not something I want to have to think about.
The Longhorn mess scared me.
NFS not working with Postgres is annoying.

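With k3s, "host local" can be had from the bundled local-path provisioner, which exposes a StorageClass named `local-path`. The sketch below writes a claim against it. The function name, the claim name, and the size are placeholders, not something this design has settled on.

```shell
#!/bin/sh
# Hypothetical helper: write_local_pvc <name> <size> <output file>
# Writes a PersistentVolumeClaim that uses k3s's bundled local-path
# provisioner, so the data lives on the node's own disk.
write_local_pvc() {
    name=$1
    size=$2
    out=$3
    cat > "$out" <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: ${size}
EOF
}
```

Usage would be along the lines of `write_local_pvc postgres-data 5Gi pvc.yaml` followed by `kubectl apply -f pvc.yaml`. A local-path volume is tied to the node that holds it, which is fine for a single-node cluster.
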
### How many nodes per cluster?

The current cluster has lots of small VMs, with larger VMs (more
CPUs/RAM) added as the requirements grew.

I'd rather limit myself to fewer, more powerful VMs, and let the VM OS manage
CPU and memory.

More nodes would be useful if they were on different base hardware.
Realistically I'm never going to pay for more than the Ingress VM...

Decision:
1 big VM per cluster.
Both VMs hosted on current hardware.
If we add hardware, we can add an additional node at that time.

### Networking

Status quo is flannel (vxlan backend) with Traefik, Klipper LB, and CoreDNS.

#### DNS

CoreDNS is great.

#### Load Balancer

Klipper works fine now.
MetalLB is the other option, but it is more complicated and doesn't
seem to offer much, particularly with a single-node cluster.

Decision: Klipper

#### Ingress

Traefik:
- Status quo.
- Works fine.
- I don't like it outside of k8s.

ingress-nginx (Kubernetes community project):
- Google
- Used by a lot of people.
- Nothing sexy or risky.
- auth exposed in annotations

nginx-ingress (NGINX Inc.):
- nginx upstream.
- extra features like stream support that I'm using
  on Lightsail now.
- full-blown virtual server support.
- maybe too complicated?
- exposes the same features as I have on Lightsail through
  annotations, which could be a way to get Keycloak to
  work.
- auth in annotations is behind a paywall, but available through
  a virtual server

Decision: ingress-nginx
Use LB for stuff I would use the virtual server for.

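"Use LB" for non-HTTP traffic could look like the sketch below: a plain `LoadBalancer` Service that Klipper then exposes on the node. It uses the MQTT broker as the example. The function name, the Service name, and the `app:` selector label are assumptions, and 1883 is just the standard unencrypted MQTT port.

```shell
#!/bin/sh
# Hypothetical helper: write_tcp_lb_service <name> <port> <output file>
# Writes a Service of type LoadBalancer for a raw TCP port. The selector
# assumes the pods carry an "app: <name>" label.
write_tcp_lb_service() {
    name=$1
    port=$2
    out=$3
    cat > "$out" <<EOF
apiVersion: v1
kind: Service
metadata:
  name: ${name}
spec:
  type: LoadBalancer
  selector:
    app: ${name}
  ports:
    - name: tcp
      protocol: TCP
      port: ${port}
      targetPort: ${port}
EOF
}
```

For example, `write_tcp_lb_service mqtt 1883 mqtt-svc.yaml` followed by `kubectl apply -f mqtt-svc.yaml`.
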
#### CNI

flannel vxlan
- status quo
- works fine

cilium
- label-based network policies
- leaning toward this plus multus, though I doubt
  I'll ever write a policy
- I want the ability to write a policy...
- if set up with a different pod CIDR, can do multi-cluster later
- cluster name and cluster id are set at install time
- can do transparent encryption (not worth it...)

cilium multi-cluster networking:
- not worth the complexity
- will manage connections with ingress/egress methods.

flannel wg
- encrypt traffic and set up overlay if I want to interact with
  cloud machines
- can do the same with a manual WireGuard network...

istio
- I dislike sidecar containers
- Traffic I'm interested in is mainly not L7.
- blessed by Air Force

Decision: flannel vxlan
Not worth the extra complexity of cilium.

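The networking decisions so far could be pinned in the k3s config file (`/etc/rancher/k3s/config.yaml`), whose keys mirror the k3s server flags. The sketch below writes one. Klipper and CoreDNS are bundled defaults and need no entry. Giving each cluster its own pod and service CIDRs is optional and only keeps the multi-cluster door open. The function name and the non-default CIDR values in the usage note are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: write_k3s_config <output file> [cluster-cidr] [service-cidr]
# Defaults are k3s's own defaults (10.42.0.0/16 pods, 10.43.0.0/16 services).
write_k3s_config() {
    out=$1
    cluster_cidr=${2:-10.42.0.0/16}
    service_cidr=${3:-10.43.0.0/16}
    cat > "$out" <<EOF
# flannel with the vxlan backend (the k3s default, stated explicitly)
flannel-backend: vxlan
# drop the bundled Traefik so an nginx ingress can be installed instead
disable:
  - traefik
cluster-cidr: ${cluster_cidr}
service-cidr: ${service_cidr}
EOF
}
```

For instance, the first cluster could keep the defaults with `write_k3s_config infra.yaml`, and the second could use `write_k3s_config other.yaml 10.44.0.0/16 10.45.0.0/16`. The file then has to be in place as `/etc/rancher/k3s/config.yaml` before the installer runs.
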
## What goes on each cluster/VM?

Lightsail:
1. WireGuard
2. Apt/RPM repos
3. Main NGINX Proxy

Infra Cluster:
- On Host:
    1. CoreDNS
    2. WireGuard
- On Cluster:
    1. Keycloak
    2. Kanboard
    3. OneDev
    4. Harbor

Main Cluster:
- On Host:
    1. WireGuard
- On Cluster:
    1. Tekton
    2. MQTT Broker
    3. Squid
    4. j7s-os-deployment

## Deployments

Manually kubectl apply:
- Easy to reason about
- running apply is fun
- using Flux has a chicken-and-egg problem if git is also
  deployed from Flux

Flux:
- More GitOps-y
- chicken-and-egg problem is conquerable, in a maybe
  confusing way

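One way to sidestep the chicken-and-egg problem is to bootstrap Flux only on a cluster that does not host the git server itself. A minimal sketch using `flux bootstrap git` follows. The function name is hypothetical, and the repository URL, branch, and path in the usage note are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: bootstrap_flux <git repo url> <path inside the repo>
# Runs Flux's pre-flight checks, then installs Flux on the current
# kubectl context and points it at the given repository path.
bootstrap_flux() {
    repo_url=$1
    cluster_path=$2
    flux check --pre &&
    flux bootstrap git \
        --url="$repo_url" \
        --branch=main \
        --path="$cluster_path"
}
```

It would be called as, say, `bootstrap_flux ssh://git@git.example.com/homelab/fleet.git clusters/test`, with the git server already up and reachable from that cluster.
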
Decision:
1. Infra:
    1. kubectl apply/helm everything
    2. Drop Keycloak image manually in k3s either using CRI or
       placing in magic place after k3s install.
    3. Use helm with values for OneDev.
    4. Get rid of Kanboard custom image.
       Use kubectl apply.
2. Test:
    1. Mostly kubectl apply for Tekton.
    2. Use Flux for:
        1. MQTT
        2. j7s-os-deploy
        3. squid

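Both ways of getting the Keycloak image into k3s without a registry can be sketched as below. This assumes the "magic place" is k3s's agent images directory (`/var/lib/rancher/k3s/agent/images` by default), which k3s scans for image tarballs at startup, and that the CRI route is `k3s ctr images import`. The function names and the tarball name are hypothetical.

```shell
#!/bin/sh
# Directory k3s scans for image tarballs at startup (default location).
K3S_IMAGES_DIR=${K3S_IMAGES_DIR:-/var/lib/rancher/k3s/agent/images}

# Option 1: copy a saved image tarball where k3s will pick it up on start.
stage_image() {
    tarball=$1
    mkdir -p "$K3S_IMAGES_DIR" &&
    cp "$tarball" "$K3S_IMAGES_DIR/"
}

# Option 2: import the tarball straight into the running k3s containerd.
import_image() {
    k3s ctr images import "$1"
}
```

The tarball itself would come from something like `podman save -o keycloak.tar <image>`. A pod that uses the image would also need `imagePullPolicy: IfNotPresent` or `Never`, so the kubelet does not try to pull it.
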
## VM Resources

Lightsail:
- Leave alone

Infra Cluster:
- On Host:
    1. CoreDNS
    2. WireGuard
- On Cluster:
    1. Keycloak
    2. Kanboard
    3. OneDev
    4. Harbor

Main Cluster:
- On Host:
    1. WireGuard
- On Cluster:
    1. Tekton
    2. MQTT Broker
    3. Squid
    4. j7s-os-deployment

## Stuff to experiment with
[ ] Manually placing Keycloak image in k3s through k3s thing
and/or through CRI.

[ ] Keycloak SSL passthrough.

[ ] Fedora 37 Server install with k3s.

## Experiments

### k3s with cilium and nginx on CentOS Stream 9

```
# Turn off firewalld so it does not interfere with cluster networking.
systemctl disable firewalld --now
# Skip the bundled Traefik, flannel, and network policy controller; cilium and nginx replace them.
export INSTALL_K3S_EXEC="server --disable traefik --flannel-backend=none --disable-network-policy --selinux"
curl -sfL https://get.k3s.io | sh -s -
```
I see an error about SELinux policies conflicting, but I'm not sure if it matters?

Install cilium following instructions here:
https://docs.cilium.io/en/v1.12/gettingstarted/k3s/

Install nginx with:
```
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace
```

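A quick sanity check after the installs above might look like this sketch. The `kubectl wait` selector and timeout follow the ingress-nginx quick start, and the function name `verify_cluster` is made up.

```shell
#!/bin/sh
# Hypothetical helper: confirm the node registered and the ingress-nginx
# controller pod reports ready.
verify_cluster() {
    kubectl get nodes -o wide &&
    kubectl wait --namespace ingress-nginx \
        --for=condition=ready pod \
        --selector=app.kubernetes.io/component=controller \
        --timeout=120s
}
```

k3s writes its kubeconfig to `/etc/rancher/k3s/k3s.yaml`, so `export KUBECONFIG=/etc/rancher/k3s/k3s.yaml` is probably needed before `helm` or a standalone `kubectl` can reach the cluster.
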
### k3s with nginx on Fedora Server