Kubernetes

Install

Prerequisites

from https://kubernetes.io/docs/setup/production-environment/container-runtimes/

  • OS commands
apt-get -y install socat conntrack > /dev/null
apt-get -y clean > /dev/null
  • OS configs
cat > /etc/modules-load.d/k8s.conf << 'EOF'
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
cat > /etc/sysctl.d/k8s.conf << 'EOF'
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sysctl -w net.bridge.bridge-nf-call-iptables=1
sysctl -w net.bridge.bridge-nf-call-ip6tables=1
sysctl -w net.ipv4.ip_forward=1
  • Containerd

from https://github.com/containerd/containerd/blob/main/docs/getting-started.md

ARCH="amd64"
# or ARCH="arm64"
# containerd
VERSION="1.7.3"
curl -fsSL "https://github.com/containerd/containerd/releases/download/v${VERSION}/containerd-${VERSION}-linux-${ARCH}.tar.gz" | tar -C /usr/local/ -xz
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# runc
VERSION="1.1.8"
curl -fsSL -o /usr/local/sbin/runc "https://github.com/opencontainers/runc/releases/download/v${VERSION}/runc.${ARCH}"
chmod 0755 /usr/local/sbin/runc
# cni
VERSION="1.3.0"
mkdir -p /opt/cni/bin
curl -fsSL "https://github.com/containernetworking/plugins/releases/download/v${VERSION}/cni-plugins-linux-${ARCH}-v${VERSION}.tgz" | tar -C /opt/cni/bin -xz
# containerd service
mkdir -p /usr/local/lib/systemd/system/
curl -fsSL -o /usr/local/lib/systemd/system/containerd.service "https://raw.githubusercontent.com/containerd/containerd/main/containerd.service"
mkdir -p /etc/cni/net.d
systemctl daemon-reload
systemctl enable containerd
systemctl start containerd

kube binaries

from https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

and https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/

see https://dl.k8s.io/release/stable.txt
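Optionally, instead of hard-coding the kubeadm/kubelet/kubectl version below, the current stable release can be queried from that URL (a small sketch; the leading v is stripped because the download URLs below add it back):

VERSION="$(curl -fsSL https://dl.k8s.io/release/stable.txt | sed 's/^v//')"
echo "${VERSION}"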

ARCH="amd64"
# or ARCH="arm64"
mkdir -p /usr/local/bin
cd /usr/local/bin
# crictl
VERSION="1.27.0"
curl -fsSL "https://github.com/kubernetes-sigs/cri-tools/releases/download/v${VERSION}/crictl-v${VERSION}-linux-${ARCH}.tar.gz" | tar -xz
chmod 0755 crictl
chown root:root crictl
# kubeadm / kubelet
VERSION="1.27.4"
curl -fsSL --remote-name-all "https://dl.k8s.io/release/v${VERSION}/bin/linux/${ARCH}/{kubeadm,kubelet}"
chmod 0755 {kubeadm,kubelet}
chown root:root {kubeadm,kubelet}
# kubernetes services
VERSION="0.15.1"
curl -fsSL "https://raw.githubusercontent.com/kubernetes/release/v${VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:/usr/local/bin:g" > /etc/systemd/system/kubelet.service
mkdir -p /etc/systemd/system/kubelet.service.d
curl -fsSL "https://raw.githubusercontent.com/kubernetes/release/v${VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:/usr/local/bin:g" > /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# kubectl
VERSION="1.27.4"
curl -fsSL -O "https://dl.k8s.io/release/v${VERSION}/bin/linux/${ARCH}/kubectl"
chmod 0755 kubectl
chown root:root kubectl

Cluster setup

Now, if you want a High Availability (HA) Kubernetes cluster, you need an external load balancer for the API server. See https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/ .
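For example, if you run your own load balancer with HAProxy, it just has to forward TCP port 6443 to every control-plane node. This is only an illustrative sketch, not part of the kubeadm docs; the haproxy package, the node addresses and the timeouts are assumptions to adapt:

cat > /etc/haproxy/haproxy.cfg << 'EOF'
# minimal TCP passthrough to the control-plane nodes (hypothetical addresses)
defaults
    mode tcp
    timeout connect 5s
    timeout client  1h
    timeout server  1h
frontend kubernetes-api
    bind *:6443
    default_backend control-plane-nodes
backend control-plane-nodes
    balance roundrobin
    server cp1 10.0.0.11:6443 check
    server cp2 10.0.0.12:6443 check
    server cp3 10.0.0.13:6443 check
EOF
systemctl restart haproxy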

So, choose your setup to keep going:

NON HA cluster (no load balancer required)

# enable the kubelet service but do not start it; kubeadm init will create the initial config and start it itself
# note the use of the --ignore-preflight-errors=NumCPU,Mem option when machines have less than 2 cores or 1.7 GB of RAM
systemctl enable kubelet.service
kubeadm init --ignore-preflight-errors=NumCPU,Mem --cri-socket unix:///run/containerd/containerd.sock --pod-network-cidr=192.168.0.0/16
...
# PLEASE TAKE NOTE OF THE OUTPUT COMMAND TO JOIN MORE WORKERS TO THE CLUSTER
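The kubeadm init output also explains how to point kubectl at the new cluster; as root, the minimal approach (also used later in this guide) is:

export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get nodes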

HA cluster (load balancer REQUIRED)

  • please note, a load balancer domain name is useful here, but at a minimum you need a load balancer that answers on an internal IP and is correctly configured to forward the Kubernetes control-plane HTTPS requests. It must be passed with the --control-plane-endpoint option and must forward the requests to the control-plane nodes of the cluster.
# enable the kubelet service but do not start it; kubeadm init will create the initial config and start it itself
systemctl enable kubelet.service
kubeadm init --ignore-preflight-errors=NumCPU,Mem --cri-socket unix:///run/containerd/containerd.sock --pod-network-cidr=192.168.0.0/16 --control-plane-endpoint "LOADBALANCER_INTERNAL_IP_OR_NAME:6443" --upload-certs
# PLEASE TAKE NOTE OF THE OUTPUT COMMAND TO JOIN MORE CONTROL PLANES TO THE CLUSTER

add a pod network

  • if you want to use the flannel pod network:

from https://github.com/flannel-io/flannel

export KUBECONFIG=/etc/kubernetes/admin.conf
VERSION="0.22.1"
curl -fsSL "https://github.com/flannel-io/flannel/releases/download/v${VERSION}/kube-flannel.yml" | sed "s:10.244.0.0/16:192.168.0.0/16:g" | kubectl create -f -
# wait ready state
kubectl get nodes
  • ALTERNATIVELY, if you want to use the Calico pod network:

from https://docs.tigera.io/calico/latest/getting-started/kubernetes/quickstart

see also https://blog.palark.com/calico-for-kubernetes-networking-the-basics-examples/

export KUBECONFIG=/etc/kubernetes/admin.conf
VERSION="3.26.1"
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v${VERSION}/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v${VERSION}/manifests/custom-resources.yaml
# wait ready state
kubectl get nodes

add more nodes to the cluster

For each node,

  • again, enable (without starting) the kubelet service
# enable the kubelet service but do not start it; kubeadm join will create the initial config and start it itself
systemctl enable kubelet.service
  • use the command output by kubeadm init to add nodes to the cluster

remember to add the --ignore-preflight-errors=NumCPU,Mem option if your machines have less than 2 cores or 1.7 GB of RAM

kubeadm join --ignore-preflight-errors=NumCPU,Mem ...
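If the join command printed by kubeadm init was lost, or its token expired, a new one can be generated on a control-plane node:

kubeadm token create --print-join-command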
  • NOTE: if you are configuring an HA cluster and you want the control-plane nodes to also run your workloads, you must allow that by running the following command on the first node:
kubectl get nodes # note that the node ROLES is control-plane
kubectl taint nodes --all node-role.kubernetes.io/control-plane-

Exposing cluster applications outside the cluster

Usually we can find docs or tutorials that use the Kubernetes Service of type LoadBalancer to expose an app outside the cluster. If I understand it correctly, a LoadBalancer Service requires a direct integration with a cloud provider, so that Kubernetes itself will set up a cloud load balancer in that provider. If we do not have (or do not want) such a direct integration, we cannot use Services of type LoadBalancer.

Usually, however, we still want a load balancer to provide some level of High Availability to our applications. It can be a homemade solution or a free cloud load balancer (like the one I am using, an Oracle Network Load Balancer), so we must find a way to integrate that load balancer with the cluster.

For example, we can set up the load balancer to forward the incoming requests to all of our cluster nodes on a specific port (30080 in this example). Then we deploy a proxy application (one that implements the ingress controller specification, available in this list) that listens directly on that port on the node where it is deployed. This can be done with the hostPort: 30080 directive, or we can put a Kubernetes Service of type NodePort in front of it using a specific nodePort: 30080. There are pros and cons to each solution (see for instance the answers on stackoverflow or reddit). In this example we will use hostPort: 30080, and we will deploy the proxy application as a DaemonSet (instead of a Deployment) to have an easy way to run one proxy instance on each node of our cluster.
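For reference, the NodePort alternative would look roughly like the following sketch (not used in the rest of this guide; it assumes the proxy pods carry the stack: traefik label defined in the next section):

apiVersion: v1
kind: Service
metadata:
  name: traefik-nodeport-srv
spec:
  type: NodePort
  selector:
    stack: traefik
  ports:
    - name: web
      protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30080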

Note that, if I understand correctly, in both cases we cannot use a Kubernetes Ingress object in front of the proxy to change the routes to the proxy itself, because we are skipping the Ingress layer and making our external load balancer talk directly to the proxy. However, we can configure an Ingress that defines a route to a Service that forwards the request back to the proxy. It sounds strange, but it can make sense for requests that should be answered by the proxy itself (like the proxy healthcheck) yet must be managed by an Ingress to do some route rewriting (e.g. to expose the Traefik path /ping, where it reports its health status, under a path like /healthcheck that makes more sense for us).
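A sketch of that idea, assuming the traefik-srv Service and the entrypoint names defined in the next section (the /healthcheck path and the resource names here are just examples):

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: healthcheck-replacepath
spec:
  replacePath:
    path: /ping

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: healthcheck-ing
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
    traefik.ingress.kubernetes.io/router.middlewares: default-healthcheck-replacepath@kubernetescrd
spec:
  rules:
    - http:
        paths:
          - path: /healthcheck
            pathType: Exact
            backend:
              service:
                name: traefik-srv
                port:
                  number: 8080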

Using Traefik as ingress controller

from https://doc.traefik.io/traefik/v3.0/providers/kubernetes-crd/

To use Traefik middlewares (useful to modify requests, e.g. to replace paths or strip prefixes) we need to enable the Traefik kubernetescrd provider and add the Traefik CRD resources:

  • Traefik Resource Definitions 00-kubernetes-crd-definition-v1.yml
curl -SsL https://raw.githubusercontent.com/traefik/traefik/v3.0/docs/content/reference/dynamic-configuration/kubernetes-crd-definition-v1.yml -o 00-kubernetes-crd-definition-v1.yml
kubectl apply -f 00-kubernetes-crd-definition-v1.yml
  • Traefik RBAC 01-kubernetes-crd-rbac.yml
curl -SsL https://raw.githubusercontent.com/traefik/traefik/v3.0/docs/content/reference/dynamic-configuration/kubernetes-crd-rbac.yml -o 01-kubernetes-crd-rbac.yml
kubectl apply -f 01-kubernetes-crd-rbac.yml

Then we set up Traefik with the kubernetescrd provider for middlewares and the kubernetesingress provider, so the classic Ingress object can be used to define routes. We also add a Kubernetes Service of type ClusterIP so that Traefik can be referenced as an Ingress backend:

  • Traefik application resource 02-traefik.yml
cat > 02-traefik.yml << 'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: traefik-sa
  labels:
    stack: traefik

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: traefik-crb
  labels:
    stack: traefik
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-ingress-controller
subjects:
  - kind: ServiceAccount
    name: traefik-sa
    namespace: default

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: traefik-dashboard-ing
  labels:
    stack: traefik
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: traefik-srv
                port:
                  number: 8080
          - path: /dashboard
            pathType: Prefix
            backend:
              service:
                name: traefik-srv
                port:
                  number: 8080

---
apiVersion: v1
kind: Service
metadata:
  name: traefik-srv
  labels:
    stack: traefik
spec:
  type: ClusterIP
  selector:
    stack: traefik
  ports:
    - protocol: TCP
      port: 80
      name: web
      targetPort: 80
    - protocol: TCP
      port: 443
      name: websecure
      targetPort: 443
    - protocol: TCP
      port: 8080
      name: traefik
      targetPort: 8080

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: traefik-dst
  labels:
    stack: traefik
spec:
  selector:
    matchLabels:
      stack: traefik
  template:
    metadata:
      labels:
        stack: traefik
    spec:
      serviceAccountName: traefik-sa
      containers:
        - name: traefik-cnt
          image: docker.io/traefik:v3.0
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /ping
              port: 8080
            failureThreshold: 1
            initialDelaySeconds: 10
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /ping
              port: 8080
            failureThreshold: 3
            initialDelaySeconds: 10
            periodSeconds: 10
          args:
            - --log.level=INFO
            - --accesslog=true
            - --entrypoints.web.address=:80/tcp
            - --entrypoints.websecure.address=:443/tcp
            - --entrypoints.traefik.address=:8080/tcp
            - --ping=true
            - --ping.entryPoint=traefik
            - --api=true
            - --api.dashboard=true
            - --api.insecure=true
            - --providers.kubernetesingress=true
            - --providers.kubernetescrd=true
          ports:
            - name: web
              containerPort: 80
              hostPort: 30080
            - name: websecure
              containerPort: 443
              hostPort: 30443
            - name: traefik
              containerPort: 8080
EOF
kubectl apply -f 02-traefik.yml
  • see all
watch kubectl get all,ing,cm,secret -A -o wide
  • see traefik definition
kubectl describe daemonset traefik-dst
  • see traefik logs
kubectl logs -f --all-containers --prefix=true -l stack=traefik
  • delete all
kubectl delete -f 00-kubernetes-crd-definition-v1.yml -f 01-kubernetes-crd-rbac.yml -f 02-traefik.yml

Configure Cert-manager with letsencrypt

  • Add cert-manager resources:
curl -SsL https://github.com/cert-manager/cert-manager/releases/download/v1.14.4/cert-manager.yaml -o 00-cert-manager.yaml
kubectl apply -f 00-cert-manager.yaml

Test a staging certificate

Test a staging certificate to validate the Let's Encrypt integration.

(Note: Let's Encrypt enforces rate limits on the production environment, so it is better to test the first setup against the staging one.)

  • Create the issuer

Please replace <YOUR_EMAIL_FOR_NOTIFICATION> with your email; it is used to send notifications about problems with certificate renewals.

cat > 01-letsencrypt-issuer-staging.yaml << 'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
  labels:
    stack: cert-manager
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: <YOUR_EMAIL_FOR_NOTIFICATION>
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - http01:
          ingress:
            ingressTemplate:
              metadata:
                annotations:
                  "traefik.ingress.kubernetes.io/router.entrypoints": "web"
EOF
kubectl apply -f 01-letsencrypt-issuer-staging.yaml
  • Add a staging certificate

Please, replace <YOUR_CERT_NAME> with a name for your certificate, e.g. www-example-com-staging-cert

Please, replace <YOUR_DOMAIN> with your domain name, e.g. www.example.com

cat > 02-certificate-staging.yml << 'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: <YOUR_CERT_NAME>-staging-cert
  labels:
    stack: cert-manager
spec:
  secretName: <YOUR_CERT_NAME>-staging-cert
  commonName: <YOUR_DOMAIN>
  dnsNames:
    - <YOUR_DOMAIN>
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-staging
EOF
kubectl apply -f 02-certificate-staging.yml
  • validate staging certificate
kubectl get clusterissuer
kubectl describe clusterissuer letsencrypt-staging

kubectl get certificate
kubectl describe certificate <YOUR_CERT_NAME>-staging-cert
kubectl describe certificaterequest <YOUR_CERT_NAME>-staging-cert-XXXXX

kubectl describe order <YOUR_CERT_NAME>-staging-cert-XXXXX-XXXXXXXXXX
kubectl describe challenge <YOUR_CERT_NAME>-staging-cert-XXXXX-XXXXXXXXXX-XXXXXXXXXX

kubectl get ingress
kubectl describe ingress cm-acme-http-solver-XXXXX
kubectl describe service cm-acme-http-solver-XXXXX
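  • optionally, inspect the issued (staging) certificate directly from its secret (a sketch, assuming the secret name used above)
kubectl get secret <YOUR_CERT_NAME>-staging-cert -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -issuer -subject -dates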

Create a production certificate

  • Create the issuer

Please replace <YOUR_EMAIL_FOR_NOTIFICATION> with your email; it is used to send notifications about problems with certificate renewals.

cat > 03-letsencrypt-issuer-production.yaml << 'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <YOUR_EMAIL_FOR_NOTIFICATION>
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
      - http01:
          ingress:
            ingressTemplate:
              metadata:
                annotations:
                  "traefik.ingress.kubernetes.io/router.entrypoints": "web"
EOF
kubectl apply -f 03-letsencrypt-issuer-production.yaml
  • Add a production certificate

Please, replace <YOUR_CERT_NAME> with a name for your certificate, e.g. www-example-com-production-cert

Please, replace <YOUR_DOMAIN> with your domain name, e.g. www.example.com

cat > 04-certificate-production.yml << 'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: <YOUR_CERT_NAME>-production-cert
spec:
  secretName: <YOUR_CERT_NAME>-production-cert
  commonName: <YOUR_DOMAIN>
  dnsNames:
  - <YOUR_DOMAIN>
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-production
EOF
kubectl apply -f 04-certificate-production.yml
  • validate production certificate
kubectl get clusterissuer
kubectl describe clusterissuer letsencrypt-production

kubectl get certificate
kubectl describe certificate <YOUR_CERT_NAME>-production-cert
kubectl describe certificaterequest <YOUR_CERT_NAME>-production-cert-XXXXX

kubectl describe order <YOUR_CERT_NAME>-production-cert-XXXXX-XXXXXXXXXX
kubectl describe challenge <YOUR_CERT_NAME>-production-cert-XXXXX-XXXXXXXXXX-XXXXXXXXXX

kubectl get ingress
kubectl describe ingress cm-acme-http-solver-XXXXX
kubectl describe service cm-acme-http-solver-XXXXX

Add a whoami stateless application

Then we can add example resources to test adding a new application to our cluster and to configure the HTTP route to the application:

  • testing whoami resource 01-whoami.yml

Please, replace <YOUR_CERT_NAME> with a name for your certificate, e.g. www-example-com-production-cert

Please, replace <YOUR_DOMAIN> with your domain name, e.g. www.example.com

cat > 01-whoami.yml << 'EOF'

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami-ing
  labels:
    stack: whoami
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls: 'true'
    cert-manager.io/cluster-issuer: letsencrypt-production
spec:
  rules:
    - http:
        paths:
          - path: /whoami
            pathType: Exact
            backend:
              service:
                name:  whoami-srv
                port:
                  number: 80
  tls:
    - hosts:
        - <YOUR_DOMAIN>
      secretName: <YOUR_CERT_NAME>-production-cert

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami-http-ing
  labels:
    stack: whoami
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
    traefik.ingress.kubernetes.io/router.middlewares: default-whoami-redirectscheme@kubernetescrd
spec:
  rules:
    - http:
        paths:
          - path: /whoami
            pathType: Exact
            backend:
              service:
                name:  traefik-srv
                port:
                  number: 80

---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: whoami-redirectscheme
  labels:
    stack: whoami
spec:
  redirectScheme:
    scheme: https
    permanent: true

---
apiVersion: v1
kind: Service
metadata:
  name: whoami-srv
  labels:
    stack: whoami
spec:
  ports:
    - name: http
      port: 80
  selector:
    stack: whoami

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: whoami-dpl
  labels:
    stack: whoami
spec:
  replicas: 1
  selector:
    matchLabels:
      stack: whoami
  template:
    metadata:
      labels:
        stack: whoami
    spec:
      containers:
        - name: whoami-cnt
          image: docker.io/traefik/whoami
          imagePullPolicy: Always
          ports:
            - containerPort: 80

EOF
kubectl apply -f 01-whoami.yml
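  • test the route from outside the cluster (a sketch, assuming your external load balancer forwards ports 80/443 to the traefik hostPorts 30080/30443; -k is only needed while the certificate is a staging or not-yet-trusted one)
curl -ik "http://<YOUR_DOMAIN>/whoami"
curl -k "https://<YOUR_DOMAIN>/whoami"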

Add a stateful application

Kubernetes allows local storage volumes to be used by your apps, but it does not take decisions about the lifecycle of the data. If you choose local storage, the persistent volumes are claimed when used and released when no longer needed, as usual, but they will remain in the Released state with the data untouched. It is the responsibility of a human admin (or of third-party software) to decide what to do with the data, remove the claim to the volume, and put the volume back into a usable state.

Usually, you may need to:

  • remove the pvc:
kubectl delete persistentvolumeclaim/YOUR_PVC_NAME
  • manually clean or reset volume data if needed (rm -rf)
  • mark the pv as reusable:
kubectl patch persistentvolume/YOUR_PV_NAME -p '{"spec":{"claimRef": null}}'

from https://medium.com/building-the-open-data-stack/reclaiming-persistent-volumes-in-kubernetes-5e035ba8c770

and https://kubernetes.io/docs/concepts/storage/persistent-volumes/#lifecycle-of-a-volume-and-claim

Postgresql example

from https://sweetcode.io/how-to-use-kubernetes-to-deploy-postgres/

and https://medium.com/@suyashmohan/setting-up-postgresql-database-on-kubernetes-24a2a192e962

A stateful application requires a persistent volume to store the data. In this example, we will use a local persistent volume to deploy a StatefulSet app. See also https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/ for an example.

  • Set up a StorageClass of type local, a PersistentVolume with a maximum size mapped to a local path, and add a Postgres StatefulSet app claiming part (or all) of the persistent volume size

Please note that you need to pin the local volume to a node, so change the node name tf-instance2 to your node name and change the path /var/local/kube-local-storage/postgres-pv-2 to an existing path
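For example, to create that path on the chosen node (values as in the manifest below; adjust to your own):

# run on node tf-instance2
mkdir -p /var/local/kube-local-storage/postgres-pv-2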

cat > postgresql.yml << 'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
#  annotations:
#    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv-2
  labels:
    app: postgres
    type: local
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /var/local/kube-local-storage/postgres-pv-2
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - tf-instance2

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-configuration
  labels:
    app: postgres
data:
  POSTGRES_DB: mydb
  POSTGRES_USER: mydbuser
  POSTGRES_PASSWORD: mydbpass

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-statefulset
  labels:
    app: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  volumeClaimTemplates:
  - metadata:
      name: postgres-pv-claim-template
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: local-storage
      resources:
        requests:
          storage: 8Gi
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        envFrom:
        - configMapRef:
            name: postgres-configuration
        ports:
        - containerPort: 5432
          name: postgresdb
        volumeMounts:
        - name: postgres-pv-claim-template
          mountPath: /var/lib/postgresql/data

---
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  type: ClusterIP
  selector:
    app: postgres
  ports:
    - protocol: TCP
      port: 5432
      name: postgresdb
EOF
  • apply
kubectl apply -f postgresql.yml
  • logs
kubectl logs -f -l app=postgres
  • see volumes in your node
kubectl get pv
  • delete
kubectl delete -f postgresql.yml
  • if a volume is stuck in the Terminating state, edit it and remove its finalizer
  finalizers:
  - kubernetes.io/pv-protection

using the command

kubectl edit pv
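Alternatively, the finalizer can be removed non-interactively (use with care, only on a volume you really intend to release):

kubectl patch persistentvolume/YOUR_PV_NAME -p '{"metadata":{"finalizers":null}}'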

Add postgres High Availability stateful app

from https://www.postgresql.org/download/products/3-clusteringreplication/

Using Kubegres

from https://www.kubegres.io/doc/getting-started.html

see also https://cloud.google.com/kubernetes-engine/docs/tutorials/stateful-workloads/postgresql

  • Add kubegres operator
curl -SsL https://raw.githubusercontent.com/reactive-tech/kubegres/v1.17/kubegres.yaml -o 00-kubegres.yml
kubectl apply -f 00-kubegres.yml
  • Set up a storage class of type local and a Persistent Volume for each node, and remember to create the Persistent Volume paths on the local nodes

First setup a storage class:

cat > 01-storage-class.yml << 'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kubegres-storage-class
  labels:
    stack: kubegres
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
EOF
kubectl apply -f 01-storage-class.yml

Next, for each node instance:

Please, first define the following env vars:

STORAGE_PATH=<YOUR_INSTANCE_STORAGE_PATH> # example /var/local/kubernetes-storage/kubegres-persistent-volume
STORAGE_SIZE=<YOUR_INSTANCE_STORAGE_SIZE> # example 8Gi
INSTANCE_NAME=<YOUR_INSTANCE_NAME_N>      # example tf-instance0

Then setup the 02-persistent-volume-${INSTANCE_NAME}.yml resource file:

cat > 02-persistent-volume-${INSTANCE_NAME}.yml << EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kubegres-persistent-volume-${INSTANCE_NAME}
  labels:
    stack: kubegres
spec:
  capacity:
    storage: ${STORAGE_SIZE}
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: kubegres-storage-class
  local:
    path: ${STORAGE_PATH}
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
              - ${INSTANCE_NAME}
EOF
kubectl apply -f 02-persistent-volume-${INSTANCE_NAME}.yml
  • create the Postgres cluster

Please, first define the following env vars:

SUPERUSER_PASSWORD=<YOUR_SUPERUSER_PASSWORD>
REPLICAUSER_PASSWORD=<YOUR_REPLICAUSER_PASSWORD>
REPLICAS=<YOUR_PERSISTENT_VOLUME_COUNT>          # example 4
STORAGE_SIZE=<YOUR_INSTANCE_STORAGE_SIZE>        # example 8Gi

Then, create a secret to store the passwords:

cat > 03-secret.yml << EOF
apiVersion: v1
kind: Secret
metadata:
  name: kubegres-cluster-scr
  labels:
    stack: kubegres
type: Opaque
stringData:
  superUserPassword: ${SUPERUSER_PASSWORD}
  replicationUserPassword: ${REPLICAUSER_PASSWORD}
EOF
kubectl apply -f 03-secret.yml

Finally, create the cluster:

cat > 04-kubegres-cluster.yml << EOF
apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: kubegres-cluster
  labels:
    stack: kubegres
spec:
  replicas: ${REPLICAS}
  image: docker.io/postgres:16
  port: 5432
  database:
    size: ${STORAGE_SIZE}
    storageClassName: kubegres-storage-class
    volumeMount: /var/lib/postgresql/data
  resources:
    limits:
      memory: 2Gi
    requests:
      memory: 200Mi
  env:
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: kubegres-cluster-scr
          key: superUserPassword
    - name: POSTGRES_REPLICATION_PASSWORD
      valueFrom:
        secretKeyRef:
          name: kubegres-cluster-scr
          key: replicationUserPassword
EOF
kubectl apply -f 04-kubegres-cluster.yml
  • To see kubegres resources:
kubectl get all,ing,sc,pv,cm,secret -A -o wide --show-labels | grep gres
  • To check node logs:
kubectl logs --all-containers --prefix=true -l app=kubegres-cluster | sort
  • To access postgres database:
kubectl exec $(kubectl get pod -l app=kubegres-cluster,replicationRole=primary -o name) -it -- bash
su - postgres
psql
...
  • To solve problems with claimed but unused persistent volumes

Kubegres creates claims on persistent volumes (i.e. PersistentVolumeClaims, a.k.a. PVCs) to run its pods as StatefulSet apps.

When something goes wrong with a pod, it may decide to drop it and create a new one. If there is an available persistent volume, it will use that one. But it cannot reuse the old claim (not sure if this happens in some cases or in all cases) or the previously used PV, so if no PV is available the StatefulSet app will be unable to create the pod. To unlock the situation, you need to remove the PVC, clean the data if needed, and mark the volume as reusable, as explained before.
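For example (a sketch; the PVC and PV names are placeholders and must be taken from the kubectl get output):

kubectl get pvc,pv | grep kubegres
kubectl delete persistentvolumeclaim/YOUR_STUCK_PVC_NAME
# if needed, clean the data directory on the node owning the volume (rm -rf ...)
kubectl patch persistentvolume/YOUR_RELEASED_PV_NAME -p '{"spec":{"claimRef": null}}'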

Using CloudNativePG

from https://cloudnative-pg.io/docs/

see also https://awslabs.github.io/data-on-eks/docs/blueprints/distributed-databases/cloudnative-postgres

and https://medium.com/supportsages/set-up-cloudnativepg-postgresql-operator-for-kubernetes-on-aks-2dbc00c770ca

  • Add cloudnative-pg operator
curl -SsL https://github.com/cloudnative-pg/cloudnative-pg/releases/download/v1.21.3/cnpg-1.21.3.yaml -o 00-cloudnative-pg.yml
kubectl apply -f 00-cloudnative-pg.yml
  • Set up a storage class of type local and a Persistent Volume for each node, and remember to create the Persistent Volume paths on the local nodes

First setup a storage class:

cat > 01-storage-class.yml << 'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cloudnative-pg-storage-class
  labels:
    stack: cloudnative-pg
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
EOF
kubectl apply -f 01-storage-class.yml

Next, for each node instance:

Please, first define the following env vars:

STORAGE_PATH=<YOUR_INSTANCE_STORAGE_PATH> # example /var/local/kubernetes-storage/cloudnative-pg-persistent-volume
STORAGE_SIZE=<YOUR_INSTANCE_STORAGE_SIZE> # example 8Gi
INSTANCE_NAME=<YOUR_INSTANCE_NAME_N>      # example tf-instance0

Then setup the 02-persistent-volume-${INSTANCE_NAME}.yml resource file:

cat > 02-persistent-volume-${INSTANCE_NAME}.yml << EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cloudnative-pg-persistent-volume-${INSTANCE_NAME}
  labels:
    stack: cloudnative-pg
spec:
  capacity:
    storage: ${STORAGE_SIZE}
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: cloudnative-pg-storage-class
  local:
    path: ${STORAGE_PATH}
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
              - ${INSTANCE_NAME}
EOF
kubectl apply -f 02-persistent-volume-${INSTANCE_NAME}.yml
  • create the Postgres cluster

Please, first define the following env vars:

SUPERUSER_PASSWORD=<YOUR_SUPERUSER_PASSWORD>
INSTANCES=<YOUR_INSTANCES_COUNT>                 # example 3
STORAGE_SIZE=<YOUR_INSTANCE_STORAGE_SIZE>        # example 8Gi

Then, create a secret to store the passwords:

cat > 03-secret.yml << EOF
apiVersion: v1
kind: Secret
metadata:
  name: cloudnative-pg-cluster-scr
  labels:
    stack: cloudnative-pg
type: Opaque
stringData:
  password: ${SUPERUSER_PASSWORD}
  username: postgres
EOF
kubectl apply -f 03-secret.yml

Finally, create the cluster:

cat > 04-cloudnative-pg-cluster.yml << EOF
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cloudnative-pg-cluster
  labels:
    stack: cloudnative-pg
spec:
  instances: ${INSTANCES}
  imageName: ghcr.io/cloudnative-pg/postgresql:16.1
  primaryUpdateStrategy: unsupervised
  enableSuperuserAccess: true
  superuserSecret:
    name: cloudnative-pg-cluster-scr
  storage:
    pvcTemplate:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: ${STORAGE_SIZE}
      storageClassName: cloudnative-pg-storage-class
      volumeMode: Filesystem
  resources:
    limits:
      memory: 2Gi
    requests:
      memory: 200Mi
EOF
kubectl apply -f 04-cloudnative-pg-cluster.yml
  • To see cloudnative-pg resources:
kubectl get all,ing,sc,pv,cm,secret -A -o wide --show-labels | grep cloudnative
  • To check node logs:
kubectl logs --all-containers --prefix=true -l cnpg.io/cluster=cloudnative-pg-cluster | sort
  • To access postgres database:
kubectl exec $(kubectl get pod -l cnpg.io/cluster=cloudnative-pg-cluster,cnpg.io/instanceRole=primary -o name) -it -- psql -U postgres
...

Notes

Kubectl

  • To see most of the resources:
kubectl get all,ing,sc,pv,pvc,cm,secret,cronjob -A -o wide
  • Example to run a command in a standalone container
# bash inside a debian container (the pod name tmp-shell is arbitrary)
kubectl run tmp-shell --rm -it --image debian:12.1-slim -- bash
# psql inside a postgres container (note: psql uses -h for the host)
kubectl run tmp-psql --rm -it --image=postgres:15.4 -- psql -h <MY_POSTGRES_HOST> -U postgres
  • You can convert Helm charts to plain Kubernetes manifest files; this is an example that renders the hydra Helm chart:
./helm template hydra ory/hydra --output-dir ./hydra
  • kubectl apply does a three-way diff

from https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#kubectl-apply

kubectl apply does a three-way diff between the previous configuration, the provided input and the current configuration of the resource, in order to determine how to modify the resource.
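To preview what an apply would change before running it, kubectl diff can be used against the same manifest, for example one of the files created earlier in this guide:

kubectl diff -f 02-traefik.yml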

Containerd

Note that, if you follow this guide, you are using containerd instead of Docker, so some usual commands change, for instance:

  • list node images in the kubernetes namespace
ctr -n k8s.io i ls
  • remove all images containing a text in the name
ctr -n k8s.io i rm $(ctr -n k8s.io i ls -q | grep PARTIALNAME)
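  • crictl (installed earlier) can also be used to inspect the node through the CRI socket; a couple of examples (a sketch)
crictl --runtime-endpoint unix:///run/containerd/containerd.sock images
crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps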

References

  • Free Oracle Cloud Kubernetes cluster with Terraform
  • A Kubernetes guide for Docker Swarm lovers