Scaling Kubernetes Deployments with InfluxDB and Flux
By: Community / Product, Use Cases, Developer
October 28, 2020
This article was written by InfluxDB community member and InfluxAce David Flanagan.
Eighteen hours ago, I met with some colleagues to discuss our Kubernetes initiatives and our grand plans for improving the integration and support of InfluxDB running on Kubernetes. During that meeting, I laid out what I felt was missing for InfluxDB to really shine on Kubernetes. I won't bore you with the details, but one thing I insisted we needed was metrics server integration to provide horizontal pod autoscaling (HPA) based on data inside InfluxDB. As I presented options for getting this bootstrapped quickly, my brilliant colleague Giacomo chimed in:
"That already exists."
TL;DR
- You can deploy kube-metrics-adapter to your cluster, which supports annotating your HPA resources with Flux queries to control the scaling of Deployment resources.
- InfluxData has a Helm Charts repository that includes a chart for InfluxDB 2.
- Telegraf can be used as a sidecar for local metrics collection.
- InfluxDB 2 has a component called pkger that provides a declarative interface, via manifests (much like Kubernetes), for creating and managing InfluxDB resources.
Scaling Your Deployments with Flux
Giacomo went on to give a great explanation of what had already been built, but I'll keep it short. It turns out that a former colleague of ours, Lorenzo Affetti, submitted some PRs to Zalando's metrics-adapter project earlier this year. His pull requests have since been merged, and we can use that project to scale our deployments by annotating them with Flux queries.
How does it work? It's pretty simple. Let me show you.
Deploying InfluxDB
This article assumes you already have InfluxDB 2 running inside your cluster. If you don't, you can use our Helm chart to have InfluxDB deployed in under 30 seconds. Starting the timer now…
If you're feeling brave enough, you can throw these commands at your terminal and hope for the best.
kubectl create namespace monitoring
helm repo add influxdata https://helm.influxdata.com/
helm upgrade --install influxdb --namespace=monitoring influxdata/influxdb2
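To confirm the chart really did come up that fast, here's a quick sanity check; the monitoring namespace matches the commands above:

# The InfluxDB pod should reach Running within a few seconds
kubectl get pods --namespace monitoring --watch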
Deploying the Metrics Adapter
The first thing we need to do is deploy the metrics-adapter to our Kubernetes cluster. Zalando doesn't provide a Helm chart for this, but Banzai Cloud does. Unfortunately, the Banzai Cloud chart needs a few tweaks to support the InfluxDB collector, so today we'll deploy it with custom manifests. I know that's not great, but you only need to do it once.
The Manifests
Before you blindly copy and paste this into your cluster, be warned: there are three hard-coded values in the args section of the Deployment resource. If you plan to take this to production, please use Secrets and mount them as files or environment variables rather than the slapdash approach I'm using for this demo (see the sketch after the list below).
The three hard-coded values are:
- InfluxDB URL
- Organization name
- Token
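As a minimal sketch of that hardening, assuming you keep the token in a Secret: Kubernetes will expand $(VAR) references in a container's args from its environment, so nothing adapter-specific is needed. The names influxdb-hpa and INFLUXDB_TOKEN below are my own placeholders.

kubectl --namespace custom-metrics-server \
  create secret generic influxdb-hpa --from-literal=token=secret-token

# Then, in the adapter Deployment's container spec, define an environment
# variable from the Secret and let Kubernetes expand it into the args:
env:
  - name: INFLUXDB_TOKEN
    valueFrom:
      secretKeyRef:
        name: influxdb-hpa
        key: token
args:
  - --influxdb-token=$(INFLUXDB_TOKEN)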
---
# The Namespace comes first so that everything below can be created into it
# in a single kubectl apply.
apiVersion: v1
kind: Namespace
metadata:
  name: custom-metrics-server
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-server-resources
rules:
  - apiGroups:
      - custom.metrics.k8s.io
    resources:
      - "*"
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-metrics-server-resources
rules:
  - apiGroups:
      - external.metrics.k8s.io
    resources:
      - "*"
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-reader
rules:
  - apiGroups:
      - ""
    resources:
      - namespaces
      - pods
      - services
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-collector
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - list
  - apiGroups:
      - apps
    resources:
      - deployments
      - statefulsets
    verbs:
      - get
  - apiGroups:
      - extensions
      - networking.k8s.io
    resources:
      - ingresses
    verbs:
      - get
  - apiGroups:
      - autoscaling
    resources:
      - horizontalpodautoscalers
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-custom-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-server-resources
subjects:
  - kind: ServiceAccount
    name: horizontal-pod-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-external-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-metrics-server-resources
subjects:
  - kind: ServiceAccount
    name: horizontal-pod-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: custom-metrics-apiserver
    namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
  - kind: ServiceAccount
    name: custom-metrics-apiserver
    namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics-resource-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-resource-collector
subjects:
  - kind: ServiceAccount
    name: custom-metrics-apiserver
    namespace: custom-metrics-server
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: kube-metrics-adapter
    namespace: custom-metrics-server
  version: v1beta1
  versionPriority: 100
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: kube-metrics-adapter
    namespace: custom-metrics-server
  version: v1beta1
  versionPriority: 100
---
apiVersion: v1
kind: Service
metadata:
  name: kube-metrics-adapter
  namespace: custom-metrics-server
spec:
  ports:
    - port: 443
      targetPort: 443
  selector:
    app: kube-metrics-adapter
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: kube-metrics-adapter
  name: kube-metrics-adapter
  namespace: custom-metrics-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-metrics-adapter
  template:
    metadata:
      labels:
        app: kube-metrics-adapter
    spec:
      containers:
        - args:
            - --influxdb-address=http://influxdb.monitoring.svc:9999
            - --influxdb-token=secret-token
            - --influxdb-org=InfluxData
          image: registry.opensource.zalan.do/teapot/kube-metrics-adapter:v0.1.5
          name: kube-metrics-adapter
      serviceAccountName: custom-metrics-apiserver
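With the manifests saved to a file (the name kube-metrics-adapter.yaml is my own), apply them and confirm that the adapter registered itself as the metrics API:

kubectl apply -f kube-metrics-adapter.yaml

# Both APIService registrations should eventually report Available=True
kubectl get apiservice v1beta1.custom.metrics.k8s.io v1beta1.external.metrics.k8s.io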
The Big Demo
Now that we have InfluxDB and the metrics adapter running in our cluster, let's scale some pods!
To make this demo reasonably complete, I'm going to cover using Telegraf as a sidecar to scrape metrics from nginx, as well as using pkger with a Kubernetes concept called initContainers to create the bucket for our metrics. To handle both of these steps, we need to inject a ConfigMap that provides the Telegraf configuration file and a pkger manifest. Our nginx configuration is included too; it enables the status page.
You should read the comments above each file key in the YAML.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-hpa
data:
  # This is our nginx configuration. It enables the status page (/nginx_status),
  # which Telegraf scrapes over the shared interface within the pod.
  default.conf: |
    server {
        listen       80;
        listen  [::]:80;
        server_name  localhost;

        location / {
            root   /usr/share/nginx/html;
            index  index.html index.htm;
        }

        location /nginx_status {
            stub_status;
            allow 127.0.0.1; # only allow requests from localhost
            deny all;        # deny all other hosts
        }

        error_page  500 502 503 504  /50x.html;
        location = /50x.html {
            root   /usr/share/nginx/html;
        }
    }
  # This is our Telegraf configuration. It has the same hard-coded values we
  # mentioned earlier. You'll want to move them to Secrets for a production
  # deployment, but I'm keeping that out of scope for this demo. We configure
  # Telegraf to pull metrics from nginx and write them to our local InfluxDB 2
  # instance.
  telegraf.conf: |
    [agent]
      interval = "2s"
      flush_interval = "2s"

    [[inputs.nginx]]
      urls = ["http://localhost/nginx_status"]
      response_timeout = "1s"

    [[outputs.influxdb_v2]]
      urls = ["http://influxdb.monitoring.svc:9999"]
      bucket = "nginx-hpa"
      organization = "InfluxData"
      token = "secret-token"
  # Finally, we need a bucket to store our metrics. You don't need a long
  # retention period, as the data is only used for the HPA.
  buckets.yaml: |
    apiVersion: influxdata.com/v2alpha1
    kind: Bucket
    metadata:
      name: nginx-hpa
    spec:
      description: Nginx HPA Example Bucket
      retentionRules:
        - type: expire
          everySeconds: 900
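The nginx Deployment below mounts this ConfigMap by name, so apply it first; the file name here is my own:

kubectl apply -f nginx-hpa-configmap.yaml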
Now I'll deploy nginx to the cluster. I've picked nginx because it's easy to trigger scaling events with any of the plethora of HTTP load-testing tools available; I'll be using baton.

Our nginx manifest looks like this. Again, remember to extract those hard-coded values and use Secrets!
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa
spec:
  selector:
    matchLabels:
      app: nginx-hpa
  template:
    metadata:
      labels:
        app: nginx-hpa
    spec:
      volumes:
        - name: influxdb-config
          configMap:
            name: nginx-hpa
      initContainers:
        # Runs `influx pkg` once, before the other containers start, to create
        # the bucket described in buckets.yaml.
        - name: influxdb
          image: quay.io/influxdb/influxdb:2.0.0-beta
          volumeMounts:
            - mountPath: /etc/influxdb
              name: influxdb-config
          command:
            - influx
          args:
            - --host
            - http://influxdb.monitoring.svc:9999
            - --token
            - secret-token
            - pkg
            - --file
            - /etc/influxdb/buckets.yaml
            - -o
            - InfluxData
            - --force
            - "true"
      containers:
        - name: nginx
          image: nginx:latest
          volumeMounts:
            - mountPath: /etc/nginx/conf.d/default.conf
              name: influxdb-config
              subPath: default.conf
          ports:
            - containerPort: 80
        - name: telegraf
          image: telegraf:1.16
          volumeMounts:
            - mountPath: /etc/telegraf/telegraf.conf
              name: influxdb-config
              subPath: telegraf.conf
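Apply it, then check that the init container finished its pkger run before nginx and Telegraf came up; once more, the file name is my own:

kubectl apply -f nginx-hpa-deployment.yaml

# READY shows 2/2 once the influxdb init container has completed
kubectl get pods --selector app=nginx-hpa

# The init container's logs should show the nginx-hpa bucket being applied
kubectl logs --selector app=nginx-hpa -c influxdb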
Finally, let's take a look at the HorizontalPodAutoscaler manifest that completes our demo.
We add an annotation, metric-config.external.flux-query.influxdb/http_requests, that lets us specify the Flux query we want executed to fetch the metric that decides whether this deployment should scale up. Our Flux query fetches the waiting field from our nginx measurement; any value above zero is a strong indication that we need to scale horizontally to handle the current traffic.

We aim to keep that waiting number as close to 0 or 1 as possible. We can also use another annotation, metric-config.external.flux-query.influxdb/interval, to define how frequently we want to check for traffic and scaling events. We'll use a 5-second interval.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  annotations:
    metric-config.external.flux-query.influxdb/interval: "5s"
    metric-config.external.flux-query.influxdb/http_requests: |
      from(bucket: "nginx-hpa")
        |> range(start: -30s)
        |> filter(fn: (r) => r._measurement == "nginx")
        |> filter(fn: (r) => r._field == "waiting")
        |> group()
        |> max()
        // Rename "_value" to "metricvalue" so the metrics server can properly
        // unmarshal the result.
        |> rename(columns: {_value: "metricvalue"})
        |> keep(columns: ["metricvalue"])
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: External
      external:
        metric:
          name: flux-query
          selector:
            matchLabels:
              query-name: http_requests
        target:
          type: Value
          value: "1"
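Apply the HPA and generate some load to watch it react. Rather than guess at baton's flags from memory, here's a plain curl loop through a port-forward as a stand-in; the file name and local port are my own choices, and any HTTP load tool will do:

kubectl apply -f nginx-hpa-autoscaler.yaml

# Expose nginx locally
kubectl port-forward deploy/nginx-hpa 8080:80 &

# Drive traffic at nginx; run a few of these in parallel to push the
# "waiting" connection count above the target
while true; do curl -s http://localhost:8080/ > /dev/null; done &

# Watch the deployment scale out toward maxReplicas
kubectl get hpa nginx-hpa --watch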
That's it! Easy when you know how, right?
If you'd like to explore this in more detail, or want to learn more about monitoring Kubernetes with InfluxDB, check out my examples repository, which contains more goodies for you to peruse.