DockerSwarm

From campisano.org

Install

Prerequisites

from https://kubernetes.io/docs/setup/production-environment/container-runtimes/

  • OS commands
apt-get -y install docker.io docker-compose > /dev/null
apt-get -y clean > /dev/null
  • OS configs
mkdir -p /srv/local/swarm
useradd --system --uid 500 --gid nogroup --groups docker --no-create-home --home-dir /srv/local/swarm --shell /bin/bash swarm
chown swarm:nogroup /srv/local/swarm
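
A quick sanity check for the steps above could look like the sketch below. It only assumes that the standard `id` command is available; the function name `check_swarm_user` is made up for this example.

```shell
# Report whether a user belongs to the docker group
# (needed to talk to the docker socket without sudo).
# Defaults to the "swarm" user created above.
check_swarm_user() {
    user="${1:-swarm}"
    groups="$(id -nG "$user" 2>/dev/null)" || {
        echo "user ${user} does not exist" >&2
        return 1
    }
    # pad with spaces so that only whole group names match
    case " ${groups} " in
        *" docker "*)
            echo "${user} can use docker"
            return 0
            ;;
        *)
            echo "${user} is not in the docker group" >&2
            return 1
            ;;
    esac
}
```

Run `check_swarm_user` as root after the `useradd` step; a non-zero exit status means the group membership is missing.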

Cluster setup

  • to create the leader node
su - swarm
docker swarm init
docker node ls

to get a token to add more manager nodes:

docker swarm join-token manager

to get a token to add worker nodes:

docker swarm join-token worker
  • to add more nodes

for each node:

su - swarm
docker swarm join --token <TOKEN_FROM_PREVIOUS_COMMANDS> <HOST:PORT_FROM_PREVIOUS_COMMANDS>

Note: use the manager or worker tokens to add nodes as manager or worker, respectively
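
If you prefer to script the join step, the following sketch prints a ready-to-paste join command from a manager node. It relies on `docker swarm join-token -q` (prints only the token) and on the `Swarm.NodeAddr` field of `docker info`; the function name is hypothetical, and port 2377 is assumed to be the unchanged default swarm management port.

```shell
# Print the join command for the given role (manager or worker).
# Must be run on a manager node by a user allowed to use docker.
swarm_join_command() {
    role="$1"
    case "$role" in
        manager|worker) ;;
        *)
            echo "usage: swarm_join_command manager|worker" >&2
            return 2
            ;;
    esac
    token="$(docker swarm join-token -q "$role")" || return 1
    addr="$(docker info --format '{{.Swarm.NodeAddr}}')" || return 1
    # assumption: the manager listens on the default port 2377
    printf 'docker swarm join --token %s %s:2377\n' "$token" "$addr"
}
```

For example, `swarm_join_command worker` prints one line that can be executed as the swarm user on the new node.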

Exposing cluster applications outside the cluster

Using Traefik as edge proxy

from https://blog.creekorful.org/2020/01/how-to-expose-traefik-2-dashboard-securely-docker-swarm/

and https://juliensalinas.com/en/traefik-reverse-proxy-docker-compose-docker-swarm-nlpcloud/

and https://www.rockyourcode.com/traefik-2-docker-swarm-setup-with-docker-socket-proxy-and-more/

and https://www.cometari.com/case-study/container-orchestration-with-docker-swarm-and-traefik

  • from any manager node

from https://community.traefik.io/t/ping-endpoint-working-but-docker-healthcheck-fails/9845

NOTE: replace <YOUR_HOST> with the hostname of your load balancer

su - swarm
docker network create --driver=overlay --subnet=10.1.0.0/16 --attachable traefik-net
cat > traefik.yml << 'EOF'
version: '3.7'

networks:
  traefik-net:
    external: true

services:
  traefik:
    image: traefik:v3.0
    healthcheck:
      test: traefik healthcheck --ping
      interval: 3s
      timeout: 1s
      retries: 3
      start_period: 1s
    networks:
      - traefik-net
    ports:
      - '30080:80'
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
    command:
      - '--log.level=INFO'
      - '--accesslog=true'
      - '--entryPoints.web.address=:80'
      - '--ping=true'
      - '--providers.swarm=true'
      - '--providers.swarm.watch=true'
      - '--providers.swarm.exposedbydefault=false'
      - '--providers.swarm.network=traefik-net'
      - '--providers.swarm.endpoint=unix:///var/run/docker.sock'
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == manager
      restart_policy:
        condition: any
        delay: 5s
      labels:
        # enable traefik for this service, because services are not exposed by default (exposedbydefault=false)
        - 'traefik.enable=true'

        # dummy service required by the swarm provider port detection; it is never used, so the port can be any integer value
        - 'traefik.http.services.traefik-srv.loadbalancer.server.port=888'

        # add additional ping route
        - 'traefik.http.routers.ping-rt.rule=Path(`/healthcheck`)'
        - 'traefik.http.routers.ping-rt.service=ping@internal'

        # add metrics route
        - 'traefik.http.routers.metrics-rt.rule=Host(`<YOUR_HOST>`) && PathPrefix(`/metrics`)'
        - 'traefik.http.routers.metrics-rt.service=prometheus@internal'

EOF
docker stack deploy -c traefik.yml traefik

see service logs with

docker service logs -f traefik_traefik


Note: the traefik logging and environment statements (used in the more complete example in the next section) are useful to capture log data with tools like fluentbit, see Grafana
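
To confirm that the stack answers after the deploy, you can poll the `/healthcheck` route defined above through the published port 30080. This is only a sketch: the function name, the number of attempts and the delay are arbitrary choices, and it assumes `curl` is installed on the machine running the check.

```shell
# Poll http://<host>:30080/healthcheck until it answers
# or the attempts run out.
# usage: wait_for_traefik <host> [attempts] [delay_seconds]
wait_for_traefik() {
    host="$1"
    attempts="${2:-10}"
    delay="${3:-3}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl return non-zero on HTTP errors
        if curl -fsS -o /dev/null "http://${host}:30080/healthcheck"; then
            echo "traefik is healthy after ${i} attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "traefik did not answer on ${host}:30080" >&2
    return 1
}
```

For example: `wait_for_traefik <YOUR_HOST> 20 3`.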

More examples

A more complete traefik config with dashboard, metrics and tracing enabled, as well as useful logging labels, middlewares and a tcp proxy for a squid application:

  • traefik

NOTE: replace <YOUR_HOST> with the hostname of your load balancer

NOTE: please replace <YOUR_HTPASSWD_OUTPUT> with the output of "echo $(htpasswd -nb <YOUR_USER> <YOUR_PASS>) | sed -e s/\\$/\\$\\$/g"

cat > traefik.yml << 'EOF'
version: '3.7'

networks:
  traefik-net:
    external: true

services:
  traefik:
    image: traefik:v3.0
    healthcheck:
      test: traefik healthcheck --ping
      interval: 3s
      timeout: 2s
      retries: 2
      start_period: 10s
    networks:
      - traefik-net
    ports:
      - '30080:80'
      - '33128:3128'
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
    command:
      - '--log.level=INFO'
      - '--accesslog=true'
      - '--entryPoints.web.address=:80'
      - '--entryPoints.tcp3128.address=:3128'
      - '--ping=true'
      - '--api=true'
      - '--api.dashboard=true'
      - '--providers.swarm=true'
      - '--providers.swarm.watch=true'
      - '--providers.swarm.exposedbydefault=false'
      - '--providers.swarm.network=traefik-net'
      - '--providers.swarm.endpoint=unix:///var/run/docker.sock'
      - '--metrics.prometheus=true'
      - '--metrics.prometheus.addEntryPointsLabels=true'
      - '--metrics.prometheus.addServicesLabels=true'
      - '--metrics.prometheus.buckets=0.1,0.3,0.5,1.0,5.0'
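      # NOTE: these Jaeger-specific options come from Traefik v2; Traefik v3 replaced them with OpenTelemetry (--tracing.otlp), so adapt or drop them when using traefik:v3.0
      - '--tracing.jaeger=true'
      - '--tracing.jaeger.samplingParam=0'
      - '--tracing.jaeger.traceContextHeaderName=X-Request-ID'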
    environment:
      - 'LOG_ENABLED=true'
      - 'LOG_SERVICE={{.Service.Name}}'
      - 'LOG_TASK={{.Task.Name}}'
      - 'LOG_IMAGE={{index .Service.Labels "com.docker.stack.image"}}'
      - 'LOG_HOST={{.Node.Hostname}}'
    logging:
      driver: 'json-file'
      options:
        env: 'LOG_ENABLED,LOG_SERVICE,LOG_TASK,LOG_IMAGE,LOG_HOST'
        max-size: '32m'
        max-file: '1'
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == manager
      restart_policy:
        condition: any
        delay: 5s
      labels:
        # enable traefik for this service, because services are not exposed by default (exposedbydefault=false)
        - 'traefik.enable=true'

        # dummy service required by the swarm provider port detection; it is never used, so the port can be any integer value
        - 'traefik.http.services.traefik-srv.loadbalancer.server.port=888'

        # define middlewares
        - 'traefik.http.middlewares.gzip.compress=true'
        - 'traefik.http.middlewares.gzip.compress.excludedcontenttypes=image/png, image/jpeg, font/woff2'
        # basicauth created with 'echo $(htpasswd -nb <YOUR_USER> <YOUR_PASS>) | sed -e s/\\$/\\$\\$/g'
        - 'traefik.http.middlewares.traefik-auth.basicauth.users=<YOUR_HTPASSWD_OUTPUT>'

        # add additional ping route
        - 'traefik.http.routers.ping-rt.rule=Path(`/healthcheck`)'
        - 'traefik.http.routers.ping-rt.service=ping@internal'

        # add dashboard route
        - 'traefik.http.routers.dashboard-rt.rule=Host(`<YOUR_HOST>`) && (PathPrefix(`/api`) || PathPrefix(`/dashboard`))'
        - 'traefik.http.routers.dashboard-rt.service=api@internal'
        - 'traefik.http.routers.dashboard-rt.middlewares=traefik-auth'

        # add metrics route
        - 'traefik.http.routers.metrics-rt.rule=Host(`<YOUR_HOST>`) && PathPrefix(`/metrics`)'
        - 'traefik.http.routers.metrics-rt.service=prometheus@internal'
EOF
docker stack deploy -c traefik.yml traefik
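
The `sed` step in the htpasswd note above doubles every `$` because a single `$` in a stack file starts a variable substitution, while `$$` yields a literal dollar sign. The same transformation as a small helper (the function name is invented for this sketch):

```shell
# Double every "$" so the value can be pasted into a
# compose/stack file without triggering variable interpolation.
escape_for_compose() {
    printf '%s\n' "$1" | sed -e 's/\$/$$/g'
}
```

For example, `escape_for_compose "$(htpasswd -nb <YOUR_USER> <YOUR_PASS>)"` prints the value to use in place of <YOUR_HTPASSWD_OUTPUT>.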
  • squid

NOTE: replace <YOUR_USER> and <YOUR_PASS> with the username and password you want to use to access the squid proxy

#
# squid admin password
#
htpasswd -c -b passwords <YOUR_USER> <YOUR_PASS>
#
# squid config
#
cat > squid.conf << 'EOF'
# listen on the following port
http_port 3128

# configure auth
auth_param basic program /usr/lib/squid/basic_ncsa_auth /etc/squid/passwords
auth_param basic realm Test proxy
auth_param basic children 5
auth_param basic credentialsttl 24 hours

# require auth
acl auth proxy_auth REQUIRED
http_access allow auth

# allow access to squid cache manager app from localhost
http_access allow localhost manager
http_access deny manager

# finally allow auth users and deny any other
acl authenticated proxy_auth REQUIRED
http_access allow authenticated
http_access deny all

# disable all logging
access_log none
cache_log /dev/null

# disable storage cache
cache_mem 8 MB
cache deny all
cache_dir null /tmp

# avoid use of ipv6
dns_v4_first on


# make it anonymous
forwarded_for off
request_header_access Allow allow all
request_header_access Authorization allow all
request_header_access WWW-Authenticate allow all
request_header_access Proxy-Authorization allow all
request_header_access Proxy-Authenticate allow all
request_header_access Cache-Control allow all
request_header_access Content-Encoding allow all
request_header_access Content-Length allow all
request_header_access Content-Type allow all
request_header_access Date allow all
request_header_access Expires allow all
request_header_access Host allow all
request_header_access If-Modified-Since allow all
request_header_access Last-Modified allow all
request_header_access Location allow all
request_header_access Pragma allow all
request_header_access Accept allow all
request_header_access Accept-Charset allow all
request_header_access Accept-Encoding allow all
request_header_access Accept-Language allow all
request_header_access Content-Language allow all
request_header_access Mime-Version allow all
request_header_access Retry-After allow all
request_header_access Title allow all
request_header_access Connection allow all
request_header_access Proxy-Connection allow all
request_header_access User-Agent allow all
request_header_access Cookie allow all
request_header_access All deny all
EOF
#
# squid stack
#
cat > squid.yml << 'EOF'
version: "3.7"

configs:
  squid_config:
    file: ./squid.conf
  squid_pass:
    file: ./passwords

networks:
  traefik-net:
    external: true

services:
  squid:
    image: ubuntu/squid:5.2-22.04_beta
    environment:
      - 'TZ=UTC'
    configs:
      - source: squid_config
        target: /etc/squid/squid.conf
      - source: squid_pass
        target: /etc/squid/passwords
    networks:
      - traefik-net
    deploy:
      replicas: 1
      labels:
        - 'traefik.enable=true'

        - 'traefik.tcp.services.squid-backend-srv.loadbalancer.server.port=3128'

        - 'traefik.tcp.routers.squid-backend-tcp-rl.rule=HostSNI(`*`)'
        - 'traefik.tcp.routers.squid-backend-tcp-rl.entrypoints=tcp3128'
        - 'traefik.tcp.routers.squid-backend-tcp-rl.service=squid-backend-srv'
EOF
docker stack deploy -c squid.yml squid
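
Once both stacks are running, the proxy should be reachable through the traefik TCP entrypoint published on port 33128. A minimal check, assuming `curl` is available and using http://example.com/ only as a placeholder target (the function name is hypothetical):

```shell
# Request a URL through the squid proxy exposed by traefik
# and print the resulting HTTP status code.
# usage: check_squid_proxy <host> <user> <pass> [url]
check_squid_proxy() {
    host="$1"
    user="$2"
    pass="$3"
    url="${4:-http://example.com/}"   # placeholder target, change at will
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
        -x "http://${host}:33128" -U "${user}:${pass}" "$url")"
    echo "$code"
    # succeed only when the request was served
    [ "$code" = "200" ]
}
```

A 407 status usually means that the credentials do not match the `passwords` file created above.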

Notes

Placeholders to get docker information

from https://docs.docker.com/engine/swarm/services/#create-services-using-templates

and https://forums.docker.com/t/example-usage-of-docker-swarm-template-placeholders/73859

For example, you can use placeholders to put docker information into environment variables:

version: '3'

services:
  myservice:
    image: 'myimage'
    environment:
      - 'TASK_NAME={{.Task.Name}}'
      - 'TASK_SLOT={{.Task.Slot}}'
      - 'SERVICE_ID={{.Service.ID}}'
      - 'SERVICE_NAME={{.Service.Name}}'
      - 'SERVICE_LABELS={{.Service.Labels}}'
      - 'SERVICE_IMAGE={{index .Service.Labels "com.docker.stack.image"}}'
      - 'STACK_NAMESPACE={{index .Service.Labels "com.docker.stack.namespace"}}'
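
To see what the placeholders resolve to, you can inspect the containers of a running service on the node where they are scheduled. The sketch below relies on the `com.docker.swarm.service.name` label that swarm sets on task containers; the function name is made up for this example.

```shell
# Print the resolved environment of every container of a
# service running on the local node.
# usage: print_service_env <stack>_<service>
print_service_env() {
    service="$1"
    for id in $(docker ps -q --filter "label=com.docker.swarm.service.name=${service}"); do
        docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$id"
    done
}
```

For example, `print_service_env mystack_myservice | grep '^TASK_'` shows only the task-related variables (here `mystack` is a hypothetical stack name).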

Move a Volume between nodes

  • from the source host:
mkdir bkp
docker run -it --rm -v <VOLUME_NAME>:/mnt/volume:ro -v $(pwd)/bkp:/bkp debian:stable-slim
# the next two commands run inside the container started above
tar -czf /bkp/volume.tgz /mnt/volume
exit
  • copy the file from the source to the dest host
  • from the dest host:
docker volume create --driver local --label com.docker.stack.namespace=<OPTIONAL_LABEL> <VOLUME_NAME>
mkdir bkp
mv volume.tgz bkp/
docker run -it --rm -v <VOLUME_NAME>:/mnt/volume:rw -v $(pwd)/bkp:/bkp debian:stable-slim
# inside the container: the archive stores paths as mnt/volume/..., so extract it from /
tar -xzf /bkp/volume.tgz -C /
exit
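
The same procedure can be run without an interactive shell. The functions below are a sketch with invented names; note that they archive the volume content relative to the volume root (`-C /mnt/volume .`), so their archives are not interchangeable with the one produced by the manual steps above. The backup directory must be an absolute path, because it is used as a bind mount.

```shell
# Save the content of a volume into <dir>/volume.tgz.
# usage: backup_volume <volume_name> <absolute_dir>
backup_volume() {
    volume="$1"
    dir="$2"
    mkdir -p "$dir" || return 1
    docker run --rm \
        -v "${volume}:/mnt/volume:ro" \
        -v "${dir}:/bkp" \
        debian:stable-slim \
        tar -czf /bkp/volume.tgz -C /mnt/volume .
}

# Restore <dir>/volume.tgz into an existing volume.
# usage: restore_volume <volume_name> <absolute_dir>
restore_volume() {
    volume="$1"
    dir="$2"
    if [ ! -f "${dir}/volume.tgz" ]; then
        echo "missing ${dir}/volume.tgz" >&2
        return 1
    fi
    docker run --rm \
        -v "${volume}:/mnt/volume:rw" \
        -v "${dir}:/bkp:ro" \
        debian:stable-slim \
        tar -xzf /bkp/volume.tgz -C /mnt/volume
}
```

For example, run `backup_volume <VOLUME_NAME> "$(pwd)/bkp"` on the source host, copy `bkp/volume.tgz` to the dest host, create the volume there and run `restore_volume <VOLUME_NAME> "$(pwd)/bkp"`.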

Tunneling an internal container port

  • first, create a new container attached to the same network as the service you want to tunnel, and publish a new port on a specific host node. In this example, port 5672 of rabbitmq_service_name (reachable in the rabbit_network network) is published on host port 55555
  socat-tunneling:
    image: jonlabelle/network-tools
    entrypoint: socat -dd TCP-LISTEN:55555,fork TCP:rabbitmq_service_name:5672
    networks:
      - rabbit_network
    ports:
      - target: 55555
        published: 55555
        mode: host
    deploy:
      endpoint_mode: dnsrr
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.hostname == NODE_HOSTNAME
  • then, tunnel the remote host port to a local port as usual
ssh -v -C -N -L 0.0.0.0:5672:localhost:55555 NODE_HOSTNAME
  • you can also use this port in a local container (e.g. to replace a local service) with:
  rabbitmq-emulated:
    image: jonlabelle/network-tools
    entrypoint: socat -dd TCP-LISTEN:5672,fork TCP:host.docker.internal:5672
    extra_hosts:
      - "host.docker.internal:host-gateway"
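
Once the ssh tunnel is open, a quick way to confirm that the forwarded port accepts connections is a plain TCP probe. This sketch assumes a netcat build that supports the `-z` (scan only) and `-w` (timeout) options; the function name is hypothetical.

```shell
# Report whether a TCP port accepts connections.
# usage: check_tunnel [host] [port]
check_tunnel() {
    host="${1:-localhost}"
    port="${2:-5672}"
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
        echo "${host}:${port} is reachable"
        return 0
    fi
    echo "${host}:${port} is not reachable" >&2
    return 1
}
```

With the tunnel from the example above, `check_tunnel localhost 5672` should succeed on the local machine, and `check_tunnel localhost 55555` on NODE_HOSTNAME.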