Scaling Kubernetes Deployments with InfluxDB and Flux

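This article was written by InfluxDB community member and InfluxAce David Flanagan.

Eighteen hours ago, I met with some colleagues to discuss our Kubernetes initiatives and our grand plans for improving the integration and support for InfluxDB running on Kubernetes. During that meeting, I laid out what I felt was missing for InfluxDB to really shine on Kubernetes. I won't bore you with the details, but one thing I insisted we needed was a metrics server integration that provides horizontal pod autoscaling (HPA) based on data inside InfluxDB. As I was laying out options for how we could get this started quickly, my brilliant colleague Giacomo chimed in:

“This already exists.”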

TL;DR

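  • You can deploy kube-metrics-adapter to your cluster; it lets you annotate your HPA resources with a Flux query that controls the scaling of your Deployment resources.
  • InfluxData has a Helm charts repository that includes a chart for InfluxDB 2.
  • Telegraf can be used as a sidecar for local metric collection.
  • InfluxDB 2 has a component called pkger that provides a declarative interface, through manifests (like Kubernetes), for creating and managing InfluxDB resources.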

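Scaling your deployments with Flux

Giacomo went on to give a wonderful explanation of what had been built, but I'll keep it short. It turns out that one of our former colleagues, Lorenzo Affetti, submitted some PRs to Zalando's kube-metrics-adapter project earlier this year. Those pull requests have since been merged, so we can use this project to scale our deployments by annotating the HPA resource that targets each deployment with a Flux query.

How does it work? It's pretty simple. Let me show you.

Deploying InfluxDB

This article assumes you already have InfluxDB 2 running in your cluster. If you don't, you can deploy InfluxDB in 30 seconds with our Helm chart. I'm starting the clock now…

If you're feeling brave enough, you can paste these commands into your terminal and hope for the best.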

kubectl create namespace monitoring 
helm repo add influxdata https://helm.influxdata.com/ 
helm upgrade --install influxdb --namespace=monitoring influxdata/influxdb2

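Deploying the Metrics Adapter

The first thing we need to do is deploy the metrics adapter to our Kubernetes cluster. Zalando doesn't provide a Helm chart for it, but Banzai Cloud does. Unfortunately, the Banzai Cloud chart needs a few tweaks to support the InfluxDB collector, so today we'll deploy it with a custom manifest. I know that's not great, but you only need to do it once.

The manifest

Before you blindly copy and paste this into your cluster, note that there are 3 hard-coded variables in the args section of the Deployment resource. If you plan to take this to production, use Secrets and mount them as files or environment variables rather than taking the cavalier approach I use in this demo.

The 3 hard-coded variables are:

  • InfluxDB URL
  • Organization name
  • Token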
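
---
# The Namespace comes first so that the namespaced resources below can be created in a single apply.
apiVersion: v1
kind: Namespace
metadata:
  name: custom-metrics-server
---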
apiVersion: v1
kind: ServiceAccount
metadata:
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-server-resources
rules:
  - apiGroups:
      - custom.metrics.k8s.io
    resources:
      - "*"
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-metrics-server-resources
rules:
  - apiGroups:
      - external.metrics.k8s.io
    resources:
      - "*"
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-reader
rules:
  - apiGroups:
      - ""
    resources:
      - namespaces
      - pods
      - services
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-collector
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - list
  - apiGroups:
      - apps
    resources:
      - deployments
      - statefulsets
    verbs:
      - get
  - apiGroups:
      - extensions
      - networking.k8s.io
    resources:
      - ingresses
    verbs:
      - get
  - apiGroups:
      - autoscaling
    resources:
      - horizontalpodautoscalers
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-custom-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-server-resources
subjects:
  - kind: ServiceAccount
    name: horizontal-pod-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-external-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-metrics-server-resources
subjects:
  - kind: ServiceAccount
    name: horizontal-pod-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: custom-metrics-apiserver
    namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
  - kind: ServiceAccount
    name: custom-metrics-apiserver
    namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics-resource-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-resource-collector
subjects:
  - kind: ServiceAccount
    name: custom-metrics-apiserver
    namespace: custom-metrics-server
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: kube-metrics-adapter
    namespace: custom-metrics-server
  version: v1beta1
  versionPriority: 100
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: kube-metrics-adapter
    namespace: custom-metrics-server
  version: v1beta1
  versionPriority: 100
---
apiVersion: v1
kind: Service
metadata:
  name: kube-metrics-adapter
  namespace: custom-metrics-server
spec:
  ports:
    - port: 443
      targetPort: 443
  selector:
    app: kube-metrics-adapter
---
apiVersion: v1
kind: Namespace
metadata:
  name: custom-metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: kube-metrics-adapter
  name: kube-metrics-adapter
  namespace: custom-metrics-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-metrics-adapter
  template:
    metadata:
      labels:
        app: kube-metrics-adapter
    spec:
      containers:
        - args:
            - --influxdb-address=http://influxdb.monitoring.svc:9999
            - --influxdb-token=secret-token
            - --influxdb-org=InfluxData
          image: registry.opensource.zalan.do/teapot/kube-metrics-adapter:v0.1.5
          name: kube-metrics-adapter
      serviceAccountName: custom-metrics-apiserver
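
The hard-coded token can be moved into a Secret, as the note before the manifest suggests. The following is a minimal sketch rather than a tested deployment: the Secret name influxdb-auth, its key token, and the INFLUXDB_TOKEN variable are hypothetical names chosen for illustration. The function only prints a manifest, so you can inspect it before piping it to kubectl.

```shell
#!/bin/sh
# Print a Secret manifest that holds the InfluxDB token.
# Assumptions (not from the original article): the Secret is named
# "influxdb-auth", it stores the token under the key "token", and the
# token is supplied through the INFLUXDB_TOKEN environment variable.
render_influxdb_secret() {
  namespace="$1"
  if [ -z "${INFLUXDB_TOKEN:-}" ]; then
    echo "INFLUXDB_TOKEN must be set" >&2
    return 1
  fi
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: influxdb-auth
  namespace: ${namespace}
type: Opaque
stringData:
  token: "${INFLUXDB_TOKEN}"
EOF
}

# Example usage (needs access to a cluster):
#   export INFLUXDB_TOKEN=secret-token
#   render_influxdb_secret custom-metrics-server | kubectl apply -f -
```

With a Secret like this in place, the adapter container could expose the token as an environment variable through valueFrom.secretKeyRef and pass it as --influxdb-token=$(INFLUXDB_TOKEN), because Kubernetes expands $(VAR) references in a container's args.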

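The big demo

Now that we have InfluxDB and the metrics adapter running in our cluster, let's scale some pods!

To make this demo fairly complete, I'll cover using Telegraf as a sidecar to scrape metrics from nginx, and using pkger to create a bucket for our metrics through a Kubernetes concept called initContainers. To accomplish both steps, we need to inject a ConfigMap that provides the Telegraf configuration file and a pkger manifest. Our nginx configuration, which enables the status page, is included as well.

Be sure to read the comments above each file key in the YAML.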

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-hpa
data:
  # This is our nginx configuration. It enables the status (/nginx_status) page to be scraped from Telegraf over the shared interface within the pod.
  default.conf: |
    server {
        listen       80;
        listen  [::]:80;
        server_name  localhost;

        location / {
            root   /usr/share/nginx/html;
            index  index.html index.htm;
        }

        location /nginx_status {
          stub_status;
          allow 127.0.0.1;	#only allow requests from localhost
          deny all;		#deny all other hosts
        }

        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   /usr/share/nginx/html;
        }
    }

  # This is our Telegraf configuration. It has the same hard coded values we mentioned earlier. You'll want to move them to secrets for a production deployment,
  # but I'm keeping that out of scope for this demo. We configure Telegraf to pull metrics from nginx and write to our local InfluxDB 2 instance.
  telegraf.conf: |
    [agent]
      interval = "2s"
      flush_interval = "2s"

    [[inputs.nginx]]
      urls = ["http://localhost/nginx_status"]
      response_timeout = "1s"

    [[outputs.influxdb_v2]]
      urls = ["http://influxdb.monitoring.svc:9999"]
      bucket = "nginx-hpa"
      organization = "InfluxData"
      token = "secret-token"

  # Finally, we need a bucket to store our metrics. You don't need a long retention, as it's only used for HPA.
  buckets.yaml: |
    apiVersion: influxdata.com/v2alpha1
    kind: Bucket
    metadata:
      name: nginx-hpa
    spec:
      description: Nginx HPA Example Bucket
      retentionRules:
      - type: expire
        everySeconds: 900

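Now I'll deploy nginx to the cluster. I chose nginx because it's easy to trigger scaling events with the many HTTP load testing tools available; I'll be using baton.

Our nginx manifest looks like this. Once again, remember to extract the hard-coded values and use Secrets!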

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa
spec:
  selector:
    matchLabels:
      app: nginx-hpa
  template:
    metadata:
      labels:
        app: nginx-hpa
    spec:
      volumes:
        - name: influxdb-config
          configMap:
            name: nginx-hpa
      initContainers:
        - name: influxdb
          image: quay.io/influxdb/influxdb:2.0.0-beta
          volumeMounts:
            - mountPath: /etc/influxdb
              name: influxdb-config
          command:
            - influx
          args:
            - --host
            - http://influxdb.monitoring.svc:9999
            - --token
            - secret-token
            - pkg
            - --file
            - /etc/influxdb/buckets.yaml
            - -o
            - InfluxData
            - --force
            - "true"
      containers:
        - name: nginx
          image: nginx:latest
          volumeMounts:
            - mountPath: /etc/nginx/conf.d/default.conf
              name: influxdb-config
              subPath: default.conf
          ports:
            - containerPort: 80
        - name: telegraf
          image: telegraf:1.16
          volumeMounts:
            - mountPath: /etc/telegraf/telegraf.conf
              name: influxdb-config
              subPath: telegraf.conf

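Finally, let's look at the HorizontalPodAutoscaler manifest that completes our demo.

We add an annotation, metric-config.external.flux-query.influxdb/http_requests, that lets us specify the Flux query to execute in order to fetch the metric that determines whether this deployment should scale up. The final segment of the annotation name, http_requests, is the query name that the query-name label in the metric selector refers to. Our Flux query fetches the waiting field from our nginx measurement; a value greater than zero is a strong indication that we need to scale horizontally to handle the current traffic.

Our goal is to keep the waiting number as close to 0 or 1 as possible. We can also use another annotation, metric-config.external.flux-query.influxdb/interval, to define how often we want to check the traffic and trigger scaling events. We'll use a 5 second interval.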

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  annotations:
    metric-config.external.flux-query.influxdb/interval: "5s"
    metric-config.external.flux-query.influxdb/http_requests: |
      from(bucket: "nginx-hpa")
        |> range(start: -30s)
        |> filter(fn: (r) => r._measurement == "nginx")
        |> filter(fn: (r) => r._field == "waiting")
        |> group()
        |> max()
        // Rename "_value" to "metricvalue" for letting the metrics server properly unmarshal the result.
        |> rename(columns: {_value: "metricvalue"})
        |> keep(columns: ["metricvalue"])
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: External
      external:
        metric:
          name: flux-query
          selector:
            matchLabels:
              query-name: http_requests
        target:
          type: Value
          value: "1"
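
To see how the target value of 1 translates into replica counts, the sketch below applies the scaling rule described in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetValue), and clamps the result to minReplicas and maxReplicas. It is a simplification that assumes whole-number metric values and ignores the controller's tolerance and stabilization behaviour; the function name desired_replicas is made up for illustration.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS METRIC TARGET MIN MAX
# Hypothetical helper: approximates the HPA replica calculation for a
# metric with a target of type Value, using integer arithmetic only.
desired_replicas() {
  current="$1"; metric="$2"; target="$3"; min="$4"; max="$5"
  # Integer ceiling of (current * metric) / target.
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then
    desired="$min"
  fi
  if [ "$desired" -gt "$max" ]; then
    desired="$max"
  fi
  echo "$desired"
}

# With the manifest above (target 1, minReplicas 1, maxReplicas 4):
#   desired_replicas 1 3 1 1 4   # 1 replica, 3 waiting connections -> 3
#   desired_replicas 2 5 1 1 4   # 2 replicas, 5 waiting -> capped at 4
#   desired_replicas 3 0 1 1 4   # no waiting connections -> floor of 1
```

Under this rule, even a few waiting connections push the deployment towards maxReplicas quickly, which is why the target is kept at 1.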

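That's it! Easy when you know how, right?

If you want to explore this in more detail, or learn more about monitoring Kubernetes with InfluxDB, check out my example repository, which has plenty more goodies for you to peruse.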