Netdata on Kubernetes


  • Staff

    Both latest and v0.2.1 have a linux/arm64 platform:

    https://hub.docker.com/r/netdata/agent-sd/tags

    agent-sd is optional and can be disabled in values.yaml:
    https://github.com/netdata/helmchart/blob/7c0d22bd415daa52ce6d82b6d0c8e12270bac186/values.yaml#L10-L15



  • @Rybue Thanks for the reply!
    OK, so the main container is set to latest and the sd one is set to v0.2.1
    Here’s the output:

    luis@pi-node1:~/k8s/netdata$ kubectl get po
    NAME                             READY   STATUS    RESTARTS   AGE
    netdata-parent-cfb988d65-rkz5m   0/1     Running   0          14s
    netdata-child-zq2vl              1/2     Error     1          15s
    luis@pi-node1:~/k8s/netdata$
    luis@pi-node1:~/k8s/netdata$ kubectl describe po netdata-child-zq2vl
    Name:         netdata-child-zq2vl
    Namespace:    default
    Priority:     0
    Node:         pi-node1/192.168.178.81
    Start Time:   Thu, 17 Sep 2020 21:47:31 +0100
    Labels:       app=netdata
                  controller-revision-hash=65778dd95d
                  pod-template-generation=1
                  release=netdata
                  role=child
    Annotations:  checksum/config: dbf27785c04d58fa098895f1e45be1b72b4ea76b283ec2d0d373412977e44329
                  container.apparmor.security.beta.kubernetes.io/netdata: unconfined
    Status:       Running
    IP:           192.168.178.81
    IPs:
      IP:           192.168.178.81
    Controlled By:  DaemonSet/netdata-child
    Init Containers:
      init-nodeuid:
        Container ID:  containerd://6f5ff79976b7eb57caeb397cd4780746fc6f6d7074d4b69cc3f7c805197a8a66
        Image:         netdata/wget
        Image ID:      docker.io/netdata/wget@sha256:44e7a2be59451de7fda0bef7f35caeeb34a5e9c96949b17069ec7b62d7545af2
        Port:          <none>
        Host Port:     <none>
        Command:
          /bin/sh
        Args:
          -c
           TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token); URL="https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/api/v1/nodes/${MY_NODE_NAME}"; HEADER="Authorization: Bearer ${TOKEN}";
          DATA=$(wget -q -T 5 --no-check-certificate --header "${HEADER}" -O - "${URL}"); [ -z "${DATA}" ] && exit 1;
          UID=$(echo "${DATA}" | grep -m 1 uid | grep -o ":.*" | tr -d ": \","); [ -z "${UID}" ] && exit 1;
          echo -n "${UID}" > /nodeuid/netdata.public.unique.id;
        State:          Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Thu, 17 Sep 2020 21:47:35 +0100
          Finished:     Thu, 17 Sep 2020 21:47:35 +0100
        Ready:          True
        Restart Count:  0
        Environment:
          MY_NODE_NAME:   (v1:spec.nodeName)
        Mounts:
          /nodeuid from nodeuid (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from netdata-token-mbdkj (ro)
    Containers:
      netdata:
        Container ID:   containerd://f75510bcd3b0c280208e144d1479d0a23e0128d10c0e16f18afdf8dd35b79504
        Image:          netdata/netdata:latest
        Image ID:       docker.io/netdata/netdata@sha256:06ca7394e515561613324e6700b49deb1bb92de787f9f78bc98b76bc5d2a7462
        Port:           19999/TCP
        Host Port:      19999/TCP
        State:          Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Thu, 17 Sep 2020 21:47:43 +0100
          Finished:     Thu, 17 Sep 2020 21:47:44 +0100
        Last State:     Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Thu, 17 Sep 2020 21:47:39 +0100
          Finished:     Thu, 17 Sep 2020 21:47:40 +0100
        Ready:          False
        Restart Count:  1
        Liveness:       http-get http://:http/api/v1/info delay=0s timeout=1s period=30s #success=1 #failure=3
        Readiness:      http-get http://:http/api/v1/info delay=0s timeout=1s period=30s #success=1 #failure=3
        Environment:
          MY_POD_NAME:                     netdata-child-zq2vl (v1:metadata.name)
          MY_NODE_NAME:                     (v1:spec.nodeName)
          MY_POD_NAMESPACE:                default (v1:metadata.namespace)
          NETDATA_PLUGINS_GOD_WATCH_PATH:  /etc/netdata/go.d/sd/go.d.yml
        Mounts:
          /etc/netdata/go.d.conf from config (rw,path="go.d")
          /etc/netdata/go.d/k8s_kubelet.conf from config (rw,path="kubelet")
          /etc/netdata/go.d/k8s_kubeproxy.conf from config (rw,path="kubeproxy")
          /etc/netdata/go.d/sd/ from sd-shared (rw)
          /etc/netdata/netdata.conf from config (rw,path="netdata")
          /etc/netdata/stream.conf from config (rw,path="stream")
          /host/proc from proc (ro)
          /host/sys from sys (rw)
          /var/lib/netdata/registry/ from nodeuid (rw)
          /var/run/docker.sock from run (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from netdata-token-mbdkj (ro)
      sd:
        Container ID:   containerd://5c652252424cd16b7d37b47b5559b3a00d7ca3c49e71b337ba20ed2a08b26426
        Image:          netdata/agent-sd:v0.2.1
        Image ID:       docker.io/netdata/agent-sd@sha256:31cdb9c2c6b4e87deb075e1c620f8cb03c4ae9627f0c21cfebdbb998f5a325fa
        Port:           <none>
        Host Port:      <none>
        State:          Running
          Started:      Thu, 17 Sep 2020 21:47:40 +0100
        Ready:          True
        Restart Count:  0
        Environment:
          NETDATA_SD_CONFIG_MAP:  netdata-child-sd-config-map:config.yml
          MY_POD_NAMESPACE:       default (v1:metadata.namespace)
          MY_NODE_NAME:            (v1:spec.nodeName)
        Mounts:
          /export/ from sd-shared (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from netdata-token-mbdkj (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      proc:
        Type:          HostPath (bare host directory volume)
        Path:          /proc
        HostPathType:
      run:
        Type:          HostPath (bare host directory volume)
        Path:          /var/run/docker.sock
        HostPathType:
      sys:
        Type:          HostPath (bare host directory volume)
        Path:          /sys
        HostPathType:
      config:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      netdata-conf-child
        Optional:  false
      nodeuid:
        Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
        SizeLimit:  <unset>
      sd-shared:
        Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
        SizeLimit:  <unset>
      netdata-token-mbdkj:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  netdata-token-mbdkj
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     :NoSchedule
                     node.kubernetes.io/disk-pressure:NoSchedule
                     node.kubernetes.io/memory-pressure:NoSchedule
                     node.kubernetes.io/network-unavailable:NoSchedule
                     node.kubernetes.io/not-ready:NoExecute
                     node.kubernetes.io/pid-pressure:NoSchedule
                     node.kubernetes.io/unreachable:NoExecute
                     node.kubernetes.io/unschedulable:NoSchedule
    Events:
      Type     Reason     Age                From               Message
      ----     ------     ----               ----               -------
      Normal   Scheduled  <unknown>          default-scheduler  Successfully assigned default/netdata-child-zq2vl to pi-node1
      Normal   Pulling    25s                kubelet, pi-node1  Pulling image "netdata/wget"
      Normal   Pulled     24s                kubelet, pi-node1  Successfully pulled image "netdata/wget"
      Normal   Created    24s                kubelet, pi-node1  Created container init-nodeuid
      Normal   Started    23s                kubelet, pi-node1  Started container init-nodeuid
      Normal   Pulling    19s                kubelet, pi-node1  Pulling image "netdata/agent-sd:v0.2.1"
      Normal   Started    18s                kubelet, pi-node1  Started container sd
      Normal   Pulled     18s                kubelet, pi-node1  Successfully pulled image "netdata/agent-sd:v0.2.1"
      Normal   Created    18s                kubelet, pi-node1  Created container sd
      Normal   Pulling    17s (x2 over 22s)  kubelet, pi-node1  Pulling image "netdata/netdata:latest"
      Normal   Pulled     16s (x2 over 21s)  kubelet, pi-node1  Successfully pulled image "netdata/netdata:latest"
      Normal   Created    16s (x2 over 20s)  kubelet, pi-node1  Created container netdata
      Normal   Started    15s (x2 over 19s)  kubelet, pi-node1  Started container netdata
      Warning  BackOff    13s                kubelet, pi-node1  Back-off restarting failed container
    

    @ilyam8
    What is the sd container used for? If I disable it, what won’t work?
    It would be good to have a short description on the Docker page 🙂



  • OK, it looks like the problem is no longer with pulling the image (the containers get created successfully), but something goes wrong when the container starts.
    Could you post logs from both containers in netdata-child-zq2vl pod?

    kubectl logs netdata-child-zq2vl -c sd
    kubectl logs netdata-child-zq2vl -c netdata
    


  • Also, the issue may be that the netdata child tries to connect to the parent, but the parent is not actually serving any connections, as we can see from this line: netdata-parent-cfb988d65-rkz5m 0/1 Running
    It looks like the readiness probe is failing there.

    You could also post events from the parent pod 🙂
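
The READY column quoted above (0/1 for the parent, 1/2 for the child) is the quickest signal of which pods to inspect. As a rough sketch only, the following Python snippet picks out pods whose READY count is incomplete from pasted `kubectl get po` text. The function name `not_ready_pods` is made up for illustration, and the parsing assumes the default column layout shown earlier in this thread.

```python
def not_ready_pods(get_po_output: str) -> list[str]:
    """Return names of pods whose READY column is not n/n.

    Assumes the default `kubectl get po` layout: a header line,
    then NAME in the first column and READY (e.g. "1/2") in the second.
    """
    pods = []
    lines = get_po_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a pod row in the expected shape
        ready, total = fields[1].split("/", 1)
        if ready != total:
            pods.append(fields[0])
    return pods


# Sample text copied from the output posted earlier in the thread.
SAMPLE = """\
NAME                             READY   STATUS    RESTARTS   AGE
netdata-parent-cfb988d65-rkz5m   0/1     Running   0          14s
netdata-child-zq2vl              1/2     Error     1          15s
"""

if __name__ == "__main__":
    print(not_ready_pods(SAMPLE))
```

Each name it returns is a candidate for `kubectl describe po` and `kubectl logs`, as requested in the posts above.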


  • Staff

    It is https://github.com/netdata/agent-service-discovery#service-discovery

    Its purpose is to identify applications running inside the containers and create configuration files that are used by netdata plugins.

    I see now that it is the netdata container that is failing to start 😃



  • luis@pi-node1:~/k8s/netdata$ kubectl logs netdata-child-zq2vl -c sd
    {"level":"info","component":"pipeline manager","time":"2020-09-17 20:47:40","message":"instance is started"}
    {"level":"info","component":"k8s config provider","time":"2020-09-17 20:47:40","message":"instance is started"}
    {"level":"info","component":"export manager","time":"2020-09-17 20:47:40","message":"registered: '[file exporter (/export/go.d.yml)]'"}
    {"level":"info","component":"discovery manager","time":"2020-09-17 20:47:40","message":"registered: [k8s discovery manager]"}
    {"level":"info","component":"pipeline manager","time":"2020-09-17 20:47:40","message":"received a new config, starting a new pipeline ('k8s/cmap/default/netdata-child-sd-config-map:config.yml')"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:40","message":"instance is started"}
    {"level":"info","component":"export manager","time":"2020-09-17 20:47:40","message":"instance is started"}
    {"level":"info","component":"discovery manager","time":"2020-09-17 20:47:40","message":"instance is started"}
    {"level":"info","component":"file export","time":"2020-09-17 20:47:40","message":"instance is started"}
    {"level":"info","component":"k8s discovery manager","time":"2020-09-17 20:47:40","message":"registered: [k8s pod discovery]"}
    {"level":"info","component":"k8s pod discovery","time":"2020-09-17 20:47:40","message":"instance is started"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"received '8' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"processing group 'k8s/pod/kube-system/traefik-758cd5fc85-b9bdt' with 5 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"processing group 'k8s/pod/kube-system/local-path-provisioner-6d59f47c7-96h7q' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"processing group 'k8s/pod/kube-system/coredns-7944c66d8d-4v9q6' with 3 target(s)"}
    {"level":"info","component":"build manager","time":"2020-09-17 20:47:45","message":"built 1 config(s) for target 'kube-system_coredns-7944c66d8d-4v9q6_coredns_tcp_9153'"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"group 'k8s/pod/kube-system/coredns-7944c66d8d-4v9q6': new/stale config(s) 1/0"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"processing group 'k8s/pod/kube-system/helm-install-traefik-fsk4c' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"processing group 'k8s/pod/kube-system/svclb-traefik-tkfnn' with 2 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"processing group 'k8s/pod/default/netdata-parent-cfb988d65-rkz5m' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:47:45","message":"processing group 'k8s/pod/kube-system/metrics-server-7566d596c8-82vtg' with 1 target(s)"}
    {"level":"info","component":"file export","time":"2020-09-17 20:47:46","message":"wrote 1 config(s) to '/export/go.d.yml'"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:00","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:00","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:05","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:05","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:15","message":"received '2' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:15","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:15","message":"processing group 'k8s/pod/default/netdata-parent-cfb988d65-rkz5m' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:30","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:30","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:45","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:48:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:49:20","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:49:20","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:49:25","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:49:25","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:49:40","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:49:40","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:50:55","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:50:55","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:51:10","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:51:10","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:53:35","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:53:35","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:53:40","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:53:40","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:53:45","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:53:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"received '8' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"processing group 'k8s/pod/kube-system/metrics-server-7566d596c8-82vtg' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"processing group 'k8s/pod/kube-system/traefik-758cd5fc85-b9bdt' with 5 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"processing group 'k8s/pod/kube-system/local-path-provisioner-6d59f47c7-96h7q' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"processing group 'k8s/pod/kube-system/coredns-7944c66d8d-4v9q6' with 3 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"processing group 'k8s/pod/kube-system/helm-install-traefik-fsk4c' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"processing group 'k8s/pod/kube-system/svclb-traefik-tkfnn' with 2 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:57:45","message":"processing group 'k8s/pod/default/netdata-parent-cfb988d65-rkz5m' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:58:45","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:58:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:59:00","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 20:59:00","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:03:50","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:03:50","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:04:05","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:04:05","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"received '8' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"processing group 'k8s/pod/kube-system/metrics-server-7566d596c8-82vtg' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"processing group 'k8s/pod/kube-system/traefik-758cd5fc85-b9bdt' with 5 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"processing group 'k8s/pod/kube-system/local-path-provisioner-6d59f47c7-96h7q' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"processing group 'k8s/pod/kube-system/coredns-7944c66d8d-4v9q6' with 3 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"processing group 'k8s/pod/kube-system/helm-install-traefik-fsk4c' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"processing group 'k8s/pod/kube-system/svclb-traefik-tkfnn' with 2 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:07:45","message":"processing group 'k8s/pod/default/netdata-parent-cfb988d65-rkz5m' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:09:00","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:09:00","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:09:15","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:09:15","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:14:10","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:14:10","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:14:15","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:14:15","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:14:30","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:14:30","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"received '8' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"processing group 'k8s/pod/kube-system/metrics-server-7566d596c8-82vtg' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"processing group 'k8s/pod/kube-system/traefik-758cd5fc85-b9bdt' with 5 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"processing group 'k8s/pod/kube-system/local-path-provisioner-6d59f47c7-96h7q' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"processing group 'k8s/pod/kube-system/coredns-7944c66d8d-4v9q6' with 3 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"processing group 'k8s/pod/kube-system/helm-install-traefik-fsk4c' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"processing group 'k8s/pod/kube-system/svclb-traefik-tkfnn' with 2 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:17:45","message":"processing group 'k8s/pod/default/netdata-parent-cfb988d65-rkz5m' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:19:20","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:19:20","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:19:30","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:19:30","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:24:30","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:24:30","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:24:35","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:24:35","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:24:45","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:24:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"received '8' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"processing group 'k8s/pod/kube-system/svclb-traefik-tkfnn' with 2 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"processing group 'k8s/pod/default/netdata-parent-cfb988d65-rkz5m' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"processing group 'k8s/pod/kube-system/metrics-server-7566d596c8-82vtg' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"processing group 'k8s/pod/kube-system/traefik-758cd5fc85-b9bdt' with 5 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"processing group 'k8s/pod/kube-system/local-path-provisioner-6d59f47c7-96h7q' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"processing group 'k8s/pod/kube-system/coredns-7944c66d8d-4v9q6' with 3 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:27:45","message":"processing group 'k8s/pod/kube-system/helm-install-traefik-fsk4c' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:29:45","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:29:45","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:29:55","message":"received '1' group(s)"}
    {"level":"info","component":"pipeline","time":"2020-09-17 21:29:55","message":"processing group 'k8s/pod/default/netdata-child-zq2vl' with 1 target(s)"}
    
    kubectl logs netdata-child-zq2vl -c netdata
    Netdata entrypoint script starting
    2020-09-17 21:29:42: netdata INFO  : MAIN : CONFIG: cannot load cloud config '/var/lib/netdata/cloud.d/cloud.conf'. Running with internal defaults.
    2020-09-17 21:29:42: netdata INFO  : MAIN : Found 0 legacy dbengines, setting multidb diskspace to 256MB
    2020-09-17 21:29:42: netdata INFO  : MAIN : Created file '/var/lib/netdata/dbengine_multihost_size' to store the computed value
    2020-09-17 21:29:42: netdata INFO  : MAIN : Using host prefix directory '/host'
    2020-09-17 21:29:42: netdata INFO  : MAIN : SIGNAL: Not enabling reaper
    2020-09-17 21:29:42: netdata ERROR : MAIN : LISTENER: Invalid listen port 0 given. Defaulting to 19999. (errno 22, Invalid argument)
    2020-09-17 21:29:42: netdata ERROR : MAIN : LISTENER: IPv4 bind() on ip '0.0.0.0' port 19999, socktype 1 failed. (errno 98, Address in use)
    2020-09-17 21:29:42: netdata ERROR : MAIN : LISTENER: Cannot bind to ip '0.0.0.0', port 19999
    2020-09-17 21:29:42: netdata ERROR : MAIN : LISTENER: IPv6 bind() on ip '::' port 19999, socktype 1 failed. (errno 98, Address in use)
    2020-09-17 21:29:42: netdata ERROR : MAIN : LISTENER: Cannot bind to ip '::', port 19999
    2020-09-17 21:29:42: netdata FATAL : MAIN : LISTENER: Cannot listen on any API socket. Exiting... # : Invalid argument
    
    2020-09-17 21:29:42: netdata INFO  : MAIN : EXIT: netdata prepares to exit with code 1...
    2020-09-17 21:29:42: netdata INFO  : MAIN : EXIT: cleaning up the database...
    2020-09-17 21:29:42: netdata INFO  : MAIN : Cleaning up database [0 hosts(s)]...
    2020-09-17 21:29:42: netdata INFO  : MAIN : EXIT: all done - netdata is now exiting - bye bye...
    

    Please note that I am running netdata on the k8s/k3s host node… 😁



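  • Yeah, that seems to be the issue. I’m not sure how your Kubernetes is configured, but it looks like the netdata pod is conflicting with another process on the same port: the child container uses host port 19999 (Host Port: 19999/TCP in your describe output), and your log shows bind() on port 19999 failing with “Address in use”.
    You can try reconfiguring your host netdata to run on a different port to see if that solves the issue 🙂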
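
The bind() failures in the log above (errno 98, Address in use) mean that something on the node already holds port 19999. As a minimal sketch, the Python snippet below performs the same kind of check before deploying. The helper name `port_in_use` is hypothetical, and it only tests IPv4 TCP on the machine where it runs, so it would need to be run on the node itself.

```python
import errno
import socket


def port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE.

    Other errors (for example, permission denied on low ports) are re-raised
    rather than being reported as "in use".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
        return False


if __name__ == "__main__":
    # 19999 is the port from the log output in this thread.
    print("port 19999 in use:", port_in_use(19999))
```

If this reports the port as taken on a node where a host-level netdata is running, that matches the conflict described above.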


  • Staff

    Luis, keep us updated! @rybue, thanks again for chiming in. You are helping a lot in this community 🙂



  • OK, that fixed it. I changed the listen port from 19999 to 19998 on the physical host in /etc/netdata/netdata.conf

    Looks good so far!! 😀

    So, I’m getting my head around how this works:

    I’m guessing from my playtime so far that this makes the agent on the host itself redundant, since each child pod looks to be showing all the same information (plus more)… Is that the idea?
    If so, what happens if I hook this up to send stats to my tenant in Netdata Cloud and then re-deploy the helm chart a few times? Am I going to wind up with a consistent node identity, or will I end up with either lots of orphaned nodes with the same name or a bunch of nodes with the same name but incremented numbers attached to them?
    Happy to try it, of course, but just curious, as I’ve got my workspaces up there set up nicely now.

    One curious thing though: I spun up another node and added it to the cluster (the child service came up fine with the modified host port), and I noticed the “k8s kubelet” and “k8s kubeproxy” menus on the right, but those didn’t appear on the original node that was deployed to. Seems a bit odd given that the first node was and still is the only master…

    Is there a way for me to specify certain settings in the values.yaml for the Web UI? For example, I like having my charts always refresh rather than the default of “On Focus”. If I set it in the running UI, the setting is reverted as soon as I switch to a different child node and back. Ideally, could we get the config stored in a Persistent Volume or something?

    Also, do you guys have changes planned for representing/navigating the sections on each child node dedicated to specific pods? I ask because I have only circa 8 containers per node and the UI is rather cluttered: I can imagine a whole lot of scrolling and stuttering of the browser on a production system. I’ve felt like that right-side pane needed a search box and maybe this is the requirement for one?


  • Staff

    Hey @Luis-Johnstone ,

    To force the refresh of the Dashboard, you only need to append the update_always=true argument to the URL:
    http://192.168.1.150:19999/#menu_system_submenu_cpu;theme=slate;help=true;update_always=true
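
The example URL above keeps its options after the `#`, separated by semicolons. As an illustrative sketch only, the Python helper below appends or replaces one such option. The function name `with_dashboard_option` is invented for this example, and the only assumption is the `key=value;key=value` layout visible in that URL.

```python
def with_dashboard_option(url: str, key: str, value: str) -> str:
    """Return url with key=value set in the semicolon-separated fragment.

    An existing entry for the same key is dropped first, so the option
    appears exactly once (at the end of the fragment).
    """
    base, _, fragment = url.partition("#")
    parts = [p for p in fragment.split(";") if p]
    parts = [p for p in parts if not p.startswith(key + "=")]
    parts.append(f"{key}={value}")
    return base + "#" + ";".join(parts)


if __name__ == "__main__":
    print(with_dashboard_option(
        "http://192.168.1.150:19999/#menu_system_submenu_cpu;theme=slate;help=true",
        "update_always",
        "true",
    ))
```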

    We intend to offer proper support for Kubernetes, including better visualization optimized for the unique experience Kubernetes offers (e.g. ephemeral nodes). However, this is not on the committed roadmap, so we can’t say in good conscience when it’s going to be shipped or give more details about it.

    If I understand you correctly: the streaming functionality is intended so that the child nodes replicate their database to the parent node, so that the parent can not only offer the same metrics but also apply alarms to them. Depending on your use case, this setup might make sense for you, or you might prefer to have the data live on each child node and access it through Netdata Cloud, leveraging the extra functionality, such as custom dashboards or metric correlations.

    I hope that I helped!

    Keep the feedback coming, we can’t get enough of it 💪

