feat!: Drop Metallb in favor of Cilium L2Announcements (#821)
onedr0p authored Jul 3, 2023
1 parent 4905990 commit 467af18
Showing 34 changed files with 111 additions and 243 deletions.
23 changes: 6 additions & 17 deletions .config.sample.env
@@ -45,22 +45,11 @@ export BOOTSTRAP_CLOUDFLARE_ACCOUNT_TAG=""
export BOOTSTRAP_CLOUDFLARE_TUNNEL_SECRET=""
export BOOTSTRAP_CLOUDFLARE_TUNNEL_ID=""

# Pick a range of unused IPs that are on the same network as your nodes
# You don't need many IPs, just choose 10 IPs to start with
# e.g. 192.168.1.220-192.168.1.230
export BOOTSTRAP_METALLB_LB_RANGE=""
# The load balancer IP for k8s_gateway, choose from one of the available IPs above
# e.g. 192.168.1.220
export BOOTSTRAP_METALLB_K8S_GATEWAY_ADDR=""
# The load balancer IP for the ingress controller, choose from one of the available IPs above
# that doesn't conflict with any other IP addresses here
# e.g. 192.168.1.221
export BOOTSTRAP_METALLB_INGRESS_ADDR=""
# The IP Address to use with kube-vip
# Pick a unused IP that is on the same network as your nodes
# and outside the ${BOOTSTRAP_METALLB_LB_RANGE} range
# and doesn't conflict with any other IP addresses here
# e.g. 192.168.1.254
# The load balancer IP for k8s_gateway, choose an available IP on your nodes' network that is not already in use
export BOOTSTRAP_K8S_GATEWAY_ADDR=""
# The load balancer IP for ingress-nginx, choose an available IP on your nodes' network that is not already in use
export BOOTSTRAP_INGRESS_NGINX_ADDR=""
# The IP address to use with kube-vip, choose an available IP on your nodes' network that is not already in use
export BOOTSTRAP_KUBE_VIP_ADDR=""
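Filled in, the three replacement variables might look like this (hypothetical addresses on a 192.168.1.0/24 node network — substitute unused IPs from your own network):

```sh
# Hypothetical example values; pick IPs that nothing else on your network uses
export BOOTSTRAP_K8S_GATEWAY_ADDR="192.168.1.220"
export BOOTSTRAP_INGRESS_NGINX_ADDR="192.168.1.221"
export BOOTSTRAP_KUBE_VIP_ADDR="192.168.1.254"
```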
# Choose your cluster or service cidrs
# Leave this unchanged unless you know what you are doing
@@ -84,7 +73,7 @@ export BOOTSTRAP_ANSIBLE_DEFAULT_NODE_HOSTNAME_PREFIX="k8s-" # NOTE: Must only c
# incrementing the last digit on the variable name for each node
#

# Host IP Address to the control plane node
# IP Address to the node
# That doesn't conflict with any other IP addresses here
# e.g. 192.168.1.200
export BOOTSTRAP_ANSIBLE_HOST_ADDR_0=""
7 changes: 3 additions & 4 deletions README.md
@@ -19,7 +19,6 @@ The following components will be installed in your [k3s](https://k3s.io/) cluste

- [flux](https://toolkit.fluxcd.io/) - GitOps operator for managing Kubernetes clusters from a Git repository
- [kube-vip](https://kube-vip.io/) - Load balancer for the Kubernetes control plane nodes
- [metallb](https://metallb.universe.tf/) - Load balancer for Kubernetes services
- [cert-manager](https://cert-manager.io/) - Operator to request SSL certificates and store them as Kubernetes resources
- [cilium](https://cilium.io/) - Container networking interface for inter pod and service networking
- [external-dns](https://github.com/kubernetes-sigs/external-dns) - Operator to publish DNS records to Cloudflare (and other providers) based on Kubernetes ingresses
@@ -430,16 +429,16 @@ You should start to see your applications using the new certificate.
The `external-dns` application created in the `networking` namespace will handle creating public DNS records. By default, `echo-server` and the `flux-webhook` are the only public sub-domains exposed. In order to make additional applications public you must set an ingress annotation (`external-dns.alpha.kubernetes.io/target`) like done in the `HelmRelease` for `echo-server`.
For split DNS to work it is required to have `${SECRET_DOMAIN}` point to the `${METALLB_K8S_GATEWAY_ADDR}` load balancer IP address on your home DNS server. This will ensure DNS requests for `${SECRET_DOMAIN}` will only get routed to your `k8s_gateway` service thus providing **internal** DNS resolution to your cluster applications/ingresses from any device that uses your home DNS server.
For split DNS to work it is required to have `${SECRET_DOMAIN}` point to the `${K8S_GATEWAY_ADDR}` load balancer IP address on your home DNS server. This will ensure DNS requests for `${SECRET_DOMAIN}` will only get routed to your `k8s_gateway` service thus providing **internal** DNS resolution to your cluster applications/ingresses from any device that uses your home DNS server.
For an example with Pi-hole, apply the following file and restart dnsmasq:
```sh
# /etc/dnsmasq.d/99-k8s-gateway-forward.conf
server=/${SECRET_DOMAIN}/${METALLB_K8S_GATEWAY_ADDR}
server=/${SECRET_DOMAIN}/${K8S_GATEWAY_ADDR}
```
Now try to resolve an internal-only domain with `dig @${pi-hole-ip} hajimari.${SECRET_DOMAIN}` it should resolve to your `${METALLB_INGRESS_ADDR}` IP.
Now try to resolve an internal-only domain with `dig @${pi-hole-ip} hajimari.${SECRET_DOMAIN}`; it should resolve to your `${INGRESS_NGINX_ADDR}` IP.
If having trouble you can ask for help in [this](https://github.com/onedr0p/flux-cluster-template/discussions/719) Github discussion.
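The forward rule can be rendered from the same variables before it is dropped into place; a sketch with hypothetical values (in a real setup these come from your cluster settings):

```sh
# Hypothetical values for illustration
SECRET_DOMAIN="example.com"
K8S_GATEWAY_ADDR="192.168.1.220"
# Render the dnsmasq forward rule; redirect the output to
# /etc/dnsmasq.d/99-k8s-gateway-forward.conf on the Pi-hole host
printf 'server=/%s/%s\n' "${SECRET_DOMAIN}" "${K8S_GATEWAY_ADDR}"
```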
2 changes: 1 addition & 1 deletion ansible/inventory/group_vars/kubernetes/kubernetes.yml
@@ -10,7 +10,7 @@ k3s_install_hard_links: true
k3s_become: true
k3s_etcd_datastore: true
k3s_use_unsupported_config: true
k3s_registration_address: "{{ kubevip_address }}"
k3s_registration_address: "{{ kube_vip_addr }}"
k3s_server_manifests_urls:
# Kube-vip RBAC
- url: https://raw.githubusercontent.com/kube-vip/kube-vip/main/docs/manifests/rbac.yaml
8 changes: 4 additions & 4 deletions ansible/inventory/group_vars/master/kubernetes.yml
@@ -6,15 +6,15 @@ k3s_control_node: true
k3s_server:
node-ip: "{{ ansible_host }}"
tls-san:
- "{{ kubevip_address }}"
- "{{ kube_vip_addr }}"
docker: false
flannel-backend: "none" # This needs to be in quotes
disable:
- coredns # Disable coredns - replaced with Helm Chart
- flannel # Disable flannel - replaced with Cilium
- coredns # Disable coredns - replaced with Coredns Helm Chart
- flannel # Disable flannel - replaced with Cilium Helm Chart
- local-storage # Disable local-path-provisioner - installed with Flux
- metrics-server # Disable metrics-server - installed with Flux
- servicelb # Disable servicelb - replaced with metallb and installed with Flux
- servicelb # Disable servicelb - replaced with Cilium Helm Chart
- traefik # Disable traefik - replaced with ingress-nginx and installed with Flux
disable-network-policy: true
disable-cloud-controller: true
8 changes: 5 additions & 3 deletions ansible/playbooks/cluster-installation.yml
@@ -47,9 +47,11 @@
loop:
- { name: cilium, kind: HelmChart, namespace: kube-system }
- { name: coredns, kind: HelmChart, namespace: kube-system }
- { name: podmonitors.monitoring.coreos.com, kind: CustomResourceDefinition }
- { name: prometheusrules.monitoring.coreos.com, kind: CustomResourceDefinition }
- { name: servicemonitors.monitoring.coreos.com, kind: CustomResourceDefinition }
- { name: policy, kind: CiliumL2AnnouncementPolicy }
- { name: pool, kind: CiliumLoadBalancerIPPool }
- { name: podmonitors, kind: CustomResourceDefinition }
- { name: prometheusrules, kind: CustomResourceDefinition }
- { name: servicemonitors, kind: CustomResourceDefinition }

- name: Coredns
when:
22 changes: 22 additions & 0 deletions ansible/playbooks/templates/cilium-l2.yaml.j2
@@ -0,0 +1,22 @@
---
# https://docs.cilium.io/en/latest/network/l2-announcements
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
name: policy
spec:
loadBalancerIPs: true
# NOTE: This might need to be set if you have more than one active NIC on your nodes
# interfaces:
# - ^eno[0-9]+
nodeSelector:
matchLabels:
kubernetes.io/os: linux
---
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
name: pool
spec:
cidrs:
- cidr: "{{ (ansible_default_ipv4.network + '/' + ansible_default_ipv4.netmask) | ansible.utils.ipaddr('network/prefix') }}"
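The Jinja expression above derives the node's own network in CIDR form from Ansible facts, so every address in that subnet becomes announceable by the pool. A rough shell equivalent of what `ansible.utils.ipaddr('network/prefix')` computes for an IPv4 address/netmask pair (an illustrative sketch, not part of the playbook):

```sh
#!/usr/bin/env bash
# Sketch of the 'network/prefix' computation: AND the address with the
# netmask to get the network, and popcount the mask to get the prefix length.
network_prefix() {
  local ip=$1 mask=$2 i bits prefix=0
  local -a o m net
  IFS=. read -r -a o <<< "$ip"
  IFS=. read -r -a m <<< "$mask"
  for i in 0 1 2 3; do
    net[i]=$(( o[i] & m[i] ))      # network = address AND netmask
    bits=${m[i]}
    while (( bits > 0 )); do       # count set bits in this mask octet
      (( prefix += bits & 1 ))
      (( bits >>= 1 )) || true
    done
  done
  printf '%d.%d.%d.%d/%d\n' "${net[@]}" "$prefix"
}

network_prefix 192.168.1.42 255.255.255.0   # -> 192.168.1.0/24
```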
23 changes: 20 additions & 3 deletions ansible/playbooks/templates/custom-cilium-helmchart.yaml.j2
@@ -1,4 +1,5 @@
---
# https://docs.k3s.io/helm
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
@@ -8,25 +9,41 @@ spec:
repo: https://helm.cilium.io/
chart: cilium
# renovate: datasource=helm depName=cilium registryUrl=https://helm.cilium.io
version: "1.13.4"
version: "1.14.0-rc.0"
targetNamespace: kube-system
bootstrap: true
valuesContent: |-
autoDirectNodeRoutes: true
bpf:
masquerade: true
bgp:
enabled: false
cluster:
name: home-cluster
id: 1
containerRuntime:
integration: containerd
socketPath: /var/run/k3s/containerd/containerd.sock
endpointRoutes:
enabled: true
hubble:
enabled: false
ipam:
mode: kubernetes
k8sServiceHost: "{{ kubevip_address }}"
ipv4NativeRoutingCIDR: "{{ k3s_server['cluster-cidr'] }}"
k8sServiceHost: "{{ kube_vip_addr }}"
k8sServicePort: 6443
kubeProxyReplacement: strict
kubeProxyReplacementHealthzBindAddr: 0.0.0.0:10256
l2announcements:
enabled: true
loadBalancer:
algorithm: maglev
mode: dsr
localRedirectPolicy: true
operator:
replicas: 1
rollOutPods: true
rollOutCiliumPods: true
securityContext:
privileged: true
tunnel: disabled
2 changes: 1 addition & 1 deletion ansible/playbooks/templates/kube-vip-static-pod.yaml.j2
@@ -15,7 +15,7 @@ spec:
args: ["manager"]
env:
- name: address
value: "{{ kubevip_address }}"
value: "{{ kube_vip_addr }}"
- name: vip_arp
value: "true"
- name: port
87 changes: 12 additions & 75 deletions configure
@@ -28,10 +28,9 @@ main() {
verify_binaries
verify_master_count
verify_ansible_hosts
verify_metallb
verify_lbs
verify_kubevip
verify_cluster_service_cidrs
verify_addressing
verify_age
verify_git_repository
verify_cloudflare
@@ -139,66 +138,6 @@ _has_valid_ip() {
fi
}

verify_addressing() {
local found_kube_vip="false"
local found_k8s_gateway="false"
local found_ingress="false"
# Verify the metallb min and metallb ceiling are in the same network
metallb_subnet_min=$(echo "${BOOTSTRAP_METALLB_LB_RANGE}" | cut -d- -f1 | cut -d. -f1,2,3)
metallb_subnet_ceil=$(echo "${BOOTSTRAP_METALLB_LB_RANGE}" | cut -d- -f2 | cut -d. -f1,2,3)
if [[ "${metallb_subnet_min}" != "${metallb_subnet_ceil}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The provided MetalLB IP range '${BOOTSTRAP_METALLB_LB_RANGE}' is not in the same subnet"
exit 1
fi
# Verify the node IP addresses are on the same network as the metallb range
for var in "${!BOOTSTRAP_ANSIBLE_HOST_ADDR_@}"; do
node_subnet=$(echo "${!var}" | cut -d. -f1,2,3)
if [[ "${node_subnet}" != "${metallb_subnet_min}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The subnet for node '${!var}' is not in the same subnet as the provided metallb range '${BOOTSTRAP_METALLB_LB_RANGE}'"
exit 1
fi
done
# Verify the kube-vip IP is in the same network as the metallb range
kubevip_subnet=$(echo "${BOOTSTRAP_KUBE_VIP_ADDR}" | cut -d. -f1,2,3)
if [[ "${kubevip_subnet}" != "${metallb_subnet_min}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The subnet for kupe-vip '${BOOTSTRAP_KUBE_VIP_ADDR}' is not the same subnet as the provided metallb range '${BOOTSTRAP_METALLB_LB_RANGE}'"
exit 1
fi
# Depending on the IP address, verify if it should be in the metallb range or not
metallb_octet_min=$(echo "${BOOTSTRAP_METALLB_LB_RANGE}" | cut -d- -f1 | cut -d. -f4)
metallb_octet_ceil=$(echo "${BOOTSTRAP_METALLB_LB_RANGE}" | cut -d- -f2 | cut -d. -f4)
for (( octet=metallb_octet_min; octet<=metallb_octet_ceil; octet++ )); do
addr="${metallb_subnet_min}.${octet}"
if [[ "${addr}" == "${BOOTSTRAP_KUBE_VIP_ADDR}" ]]; then
found_kube_vip="true"
fi
if [[ "${addr}" == "${BOOTSTRAP_METALLB_K8S_GATEWAY_ADDR}" ]]; then
found_k8s_gateway="true"
fi
if [[ "${addr}" == "${BOOTSTRAP_METALLB_INGRESS_ADDR}" ]]; then
found_ingress="true"
fi
for var in "${!BOOTSTRAP_ANSIBLE_HOST_ADDR_@}"; do
if [[ "${!var}" == "${addr}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The IP for node '${!var}' should NOT be in the provided metallb range '${BOOTSTRAP_METALLB_LB_RANGE}'"
exit 1
fi
done
done
if [[ "${found_kube_vip}" == "true" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The IP for kube-vip '${BOOTSTRAP_KUBE_VIP_ADDR}' should NOT be in the provided metallb range '${BOOTSTRAP_METALLB_LB_RANGE}'"
exit 1
fi
if [[ "${found_k8s_gateway}" == "false" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The IP for k8s_gateway '${BOOTSTRAP_METALLB_K8S_GATEWAY_ADDR}' should be in the provided metallb range '${BOOTSTRAP_METALLB_LB_RANGE}'"
exit 1
fi
if [[ "${found_ingress}" == "false" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The IP for ingress '${BOOTSTRAP_METALLB_INGRESS_ADDR}' should be in the provided metallb range '${BOOTSTRAP_METALLB_LB_RANGE}'"
exit 1
fi
}

verify_age() {
_has_envar "BOOTSTRAP_AGE_PUBLIC_KEY"
_has_envar "SOPS_AGE_KEY_FILE"
@@ -245,14 +184,12 @@ verify_kubevip() {
_has_valid_ip "${BOOTSTRAP_KUBE_VIP_ADDR}" "BOOTSTRAP_KUBE_VIP_ADDR"
}

verify_metallb() {
_has_envar "BOOTSTRAP_METALLB_LB_RANGE"
_has_envar "BOOTSTRAP_METALLB_K8S_GATEWAY_ADDR"
_has_envar "BOOTSTRAP_METALLB_INGRESS_ADDR"
_has_valid_ip "$(echo "${BOOTSTRAP_METALLB_LB_RANGE}" | cut -d- -f1)" "BOOTSTRAP_METALLB_LB_RANGE"
_has_valid_ip "$(echo "${BOOTSTRAP_METALLB_LB_RANGE}" | cut -d- -f2)" "BOOTSTRAP_METALLB_LB_RANGE"
_has_valid_ip "${BOOTSTRAP_METALLB_K8S_GATEWAY_ADDR}" "BOOTSTRAP_METALLB_K8S_GATEWAY_ADDR"
_has_valid_ip "${BOOTSTRAP_METALLB_INGRESS_ADDR}" "BOOTSTRAP_METALLB_INGRESS_ADDR"
verify_lbs() {
# TODO: Validate IPs aren't already in use with arp
_has_envar "BOOTSTRAP_K8S_GATEWAY_ADDR"
_has_envar "BOOTSTRAP_INGRESS_NGINX_ADDR"
_has_valid_ip "${BOOTSTRAP_K8S_GATEWAY_ADDR}" "BOOTSTRAP_K8S_GATEWAY_ADDR"
_has_valid_ip "${BOOTSTRAP_INGRESS_NGINX_ADDR}" "BOOTSTRAP_INGRESS_NGINX_ADDR"
}
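The TODO in `verify_lbs` could be addressed with an ARP probe; a hedged sketch (it assumes iputils `arping` is installed and the script can open raw sockets — neither is guaranteed by the template, so the check degrades to a no-op when `arping` is missing):

```sh
# Hypothetical helper for the TODO above: treat an address as free
# only if nothing on the local segment answers ARP for it.
_ip_unused() {
  local addr=$1
  command -v arping >/dev/null 2>&1 || return 0   # skip when arping is unavailable
  if arping -q -c 2 -w 3 "${addr}" 2>/dev/null; then
    echo "ERROR: '${addr}' already answers ARP" >&2
    return 1
  fi
  return 0
}
```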

verify_cluster_service_cidrs() {
@@ -352,16 +289,16 @@ verify_ansible_hosts() {
_has_envar "${node_password}" "true"
_has_envar "${node_control}"
_has_optional_envar "${node_hostname}"
if [[ "${!node_addr}" == "${BOOTSTRAP_KUBE_VIP_ADDR}" && "${BOOTSTRAP_KUBE_VIP_ENABLED}" == "true" ]]; then
if [[ "${!node_addr}" == "${BOOTSTRAP_KUBE_VIP_ADDR}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The kube-vip IP '${BOOTSTRAP_KUBE_VIP_ADDR}' should not be the same as the IP for node '${!node_addr}'"
exit 1
fi
if [[ "${!node_addr}" == "${BOOTSTRAP_METALLB_K8S_GATEWAY_ADDR}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The k8s-gateway load balancer IP '${BOOTSTRAP_METALLB_K8S_GATEWAY_ADDR}' should not be the same as the IP for node '${!node_addr}'"
if [[ "${!node_addr}" == "${BOOTSTRAP_K8S_GATEWAY_ADDR}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The k8s-gateway load balancer IP '${BOOTSTRAP_K8S_GATEWAY_ADDR}' should not be the same as the IP for node '${!node_addr}'"
exit 1
fi
if [[ "${!node_addr}" == "${BOOTSTRAP_METALLB_INGRESS_ADDR}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The ingress load balancer IP '${BOOTSTRAP_METALLB_INGRESS_ADDR}' should not be the same as the IP for node '${!node_addr}'"
if [[ "${!node_addr}" == "${BOOTSTRAP_INGRESS_NGINX_ADDR}" ]]; then
_log "ERROR(${FUNCNAME[0]})" "The ingress load balancer IP '${BOOTSTRAP_INGRESS_NGINX_ADDR}' should not be the same as the IP for node '${!node_addr}'"
exit 1
fi
if ssh -q -o BatchMode=yes -o ConnectTimeout=5 "${!node_username}"@"${!var}" "true"; then
@@ -5,7 +5,7 @@ metadata:
name: cert-manager
namespace: cert-manager
spec:
interval: 15m
interval: 30m
chart:
spec:
chart: cert-manager
2 changes: 1 addition & 1 deletion kubernetes/apps/default/echo-server/app/helmrelease.yaml
@@ -5,7 +5,7 @@ metadata:
name: echo-server
namespace: default
spec:
interval: 15m
interval: 30m
chart:
spec:
chart: app-template
2 changes: 1 addition & 1 deletion kubernetes/apps/default/hajimari/app/helmrelease.yaml
@@ -5,7 +5,7 @@ metadata:
name: hajimari
namespace: default
spec:
interval: 15m
interval: 30m
chart:
spec:
chart: hajimari
@@ -5,7 +5,7 @@ metadata:
name: weave-gitops
namespace: flux-system
spec:
interval: 15m
interval: 30m
chart:
spec:
chart: weave-gitops