Scaling Kubernetes Deployments with InfluxDB & Flux
By Community / Product, Use Cases, Developer
Oct 28, 2020
This article was written by InfluxDB community member and InfluxAce David Flanagan.
Eighteen hours ago, I was in a meeting with some of my colleagues to discuss our Kubernetes plans and our grand ambitions for improving the integrations and support for InfluxDB running on Kubernetes. During that meeting, I laid out what I believe to be the missing pieces for InfluxDB to really shine on Kubernetes. I won't go into all the details, but one of them was a metrics server integration that we need in order to provide horizontal pod autoscaling (HPA) based on data inside InfluxDB. As I presented the options we could pursue to fast-track this feature, my amazing colleague Giacomo shouted out:
"This already exists."
TL;DR
- You can deploy kube-metrics-adapter to your cluster, which supports annotating your HPA resources with a Flux query to control the scaling of your Deployment resources.
- InfluxData has a Helm Charts repository that includes a chart for InfluxDB 2.
- Telegraf can be used as a sidecar for local metrics collection.
- InfluxDB 2 has a component called pkger that provides a declarative interface for creating and managing InfluxDB resources through manifests, much like Kubernetes itself (a quick CLI example follows this list).
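We'll use pkger later in the demo, via an initContainer, to create our metrics bucket. As a quick taste, a bucket manifest can be applied with the influx CLI like so; the host, token, and org values are the same demo placeholders used throughout this post:

influx --host http://influxdb.monitoring.svc:9999 \
  --token secret-token \
  pkg --file buckets.yaml -o InfluxData --force true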
Deployment Scaling with Flux
Giacomo went on to explain in more detail what had been built, but I'll give you the abridged version. It turns out that our former colleague, Lorenzo Affetti, submitted a couple of PRs to Zalando's metrics-adapter project earlier this year. Those PRs were merged, and we can actually use this project to scale our deployments through the addition of a Flux query.
How does it work? Simple. Let me show you.
Deploying InfluxDB
This article assumes you already have InfluxDB 2 running inside your cluster. If you don't, you can use our Helm chart to get InfluxDB deployed in under 30 seconds. Starting the timer now…
If you're feeling brave, you can throw this at your terminal and hope for the best.
kubectl create namespace monitoring
helm repo add influxdata https://helm.influxdata.com/
helm upgrade --install influxdb --namespace=monitoring influxdata/influxdb2
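To confirm the chart came up (assuming the release name and namespace from the commands above), check the pods:

kubectl get pods --namespace=monitoring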
Deploying the Metrics Adapter
First, we need to deploy the metrics adapter to our Kubernetes cluster. Zalando doesn't provide a Helm chart for this, but Banzai Cloud does. Unfortunately, the Banzai Cloud chart needs a few tweaks to work with the InfluxDB collector, so today we'll deploy it with our own custom manifests. I know this isn't perfect, but you only need to do it once.
Manifests
Before you blindly copy and paste the following into your cluster, be warned: there are three hardcoded values in the args section of the Deployment resource. If you plan to deploy this for production use, please use Secrets, mounted as files or environment variables, rather than the lazy approach I've adopted for this demo; a sketch of the Secret-based approach follows the list below.

The three hardcoded values are:
- InfluxDB URL
- Organization name
- Token
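As an illustration of the Secret-based approach (the secret name here is hypothetical; the namespace is the one created by the manifests below), the token could be stored like this and then referenced from the Deployment through an environment variable with valueFrom.secretKeyRef, or mounted as a file:

# Illustrative only: keep the InfluxDB token out of the pod spec.
# Run this after the custom-metrics-server namespace exists.
kubectl create secret generic influxdb-hpa-credentials \
  --namespace=custom-metrics-server \
  --from-literal=token=secret-token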
---
# The namespace comes first so that the namespaced resources below can be
# created in a single apply.
apiVersion: v1
kind: Namespace
metadata:
  name: custom-metrics-server
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-server-resources
rules:
- apiGroups:
  - custom.metrics.k8s.io
  resources:
  - "*"
  verbs:
  - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-metrics-server-resources
rules:
- apiGroups:
  - external.metrics.k8s.io
  resources:
  - "*"
  verbs:
  - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-reader
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  - services
  verbs:
  - get
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-collector
rules:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list
- apiGroups:
  - apps
  resources:
  - deployments
  - statefulsets
  verbs:
  - get
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - get
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-custom-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-server-resources
subjects:
- kind: ServiceAccount
  name: horizontal-pod-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-external-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-metrics-server-resources
subjects:
- kind: ServiceAccount
  name: horizontal-pod-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics-resource-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-resource-collector
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: custom-metrics-server
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: kube-metrics-adapter
    namespace: custom-metrics-server
  version: v1beta1
  versionPriority: 100
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: kube-metrics-adapter
    namespace: custom-metrics-server
  version: v1beta1
  versionPriority: 100
---
apiVersion: v1
kind: Service
metadata:
  name: kube-metrics-adapter
  namespace: custom-metrics-server
spec:
  ports:
  - port: 443
    targetPort: 443
  selector:
    app: kube-metrics-adapter
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: kube-metrics-adapter
  name: kube-metrics-adapter
  namespace: custom-metrics-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-metrics-adapter
  template:
    metadata:
      labels:
        app: kube-metrics-adapter
    spec:
      containers:
      # These args are the three hardcoded values mentioned above: the
      # InfluxDB URL, token, and organization name.
      - args:
        - --influxdb-address=http://influxdb.monitoring.svc:9999
        - --influxdb-token=secret-token
        - --influxdb-org=InfluxData
        image: registry.opensource.zalan.do/teapot/kube-metrics-adapter:v0.1.5
        name: kube-metrics-adapter
      serviceAccountName: custom-metrics-apiserver
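Assuming you've saved the manifests above to a file (the filename here is arbitrary), apply them and verify that the metrics APIs were registered; both APIService registrations should eventually report Available=True:

kubectl apply -f kube-metrics-adapter.yaml
kubectl get apiservice v1beta1.external.metrics.k8s.io v1beta1.custom.metrics.k8s.io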
The Grand Demo
Now that InfluxDB and the metrics adapter are running inside our cluster, let's scale some pods!
To make this demo as complete as possible, I'm going to cover scraping metrics from nginx with Telegraf as a sidecar, as well as using Kubernetes initContainers to create our metrics bucket with pkger. To handle both of those steps, we need to inject a ConfigMap that provides the Telegraf configuration file and the pkger manifest. Our nginx configuration, which enables the status page, is included too.
You MUST read the comments above each key in the YAML file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-hpa
data:
  # This is our nginx configuration. It enables the status page (/nginx_status)
  # to be scraped by Telegraf over the shared network interface within the pod.
  default.conf: |
    server {
        listen 80;
        listen [::]:80;
        server_name localhost;

        location / {
            root /usr/share/nginx/html;
            index index.html index.htm;
        }

        location /nginx_status {
            stub_status;
            allow 127.0.0.1; # only allow requests from localhost
            deny all;        # deny all other hosts
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /usr/share/nginx/html;
        }
    }
  # This is our Telegraf configuration. It has the same hardcoded values we
  # mentioned earlier; you'll want to move them to Secrets for a production
  # deployment, but I'm keeping that out of scope for this demo. We configure
  # Telegraf to pull metrics from nginx and write them to our local InfluxDB 2
  # instance. Note that nginx only listens on plain HTTP inside the pod, so
  # the input URL uses http, not https.
  telegraf.conf: |
    [agent]
      interval = "2s"
      flush_interval = "2s"

    [[inputs.nginx]]
      urls = ["http://127.0.0.1/nginx_status"]
      response_timeout = "1s"

    [[outputs.influxdb_v2]]
      urls = ["http://influxdb.monitoring.svc:9999"]
      bucket = "nginx-hpa"
      organization = "InfluxData"
      token = "secret-token"
  # Finally, we need a bucket to store our metrics. You don't need a long
  # retention period, as the data is only used for the HPA.
  buckets.yaml: |
    apiVersion: influxdata.com/v2alpha1
    kind: Bucket
    metadata:
      name: nginx-hpa
    spec:
      description: Nginx HPA Example Bucket
      retentionRules:
      - type: expire
        everySeconds: 900
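Apply the ConfigMap before the Deployment that mounts it; the filename here is arbitrary:

kubectl apply -f nginx-hpa-configmap.yaml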
Now let's get nginx deployed to the cluster. I've chosen nginx because it's easy to cause scaling events with any of the plethora of HTTP load-testing tools available; I'll be using baton.

Our nginx manifest is below. Again: please pull out the hardcoded values and use Secrets!
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa
spec:
  selector:
    matchLabels:
      app: nginx-hpa
  template:
    metadata:
      labels:
        app: nginx-hpa
    spec:
      volumes:
      - name: influxdb-config
        configMap:
          name: nginx-hpa
      # The init container runs pkger through the influx CLI to create our
      # metrics bucket from the buckets.yaml manifest in the ConfigMap.
      initContainers:
      - name: influxdb
        image: quay.io/influxdb/influxdb:2.0.0-beta
        volumeMounts:
        - mountPath: /etc/influxdb
          name: influxdb-config
        command:
        - influx
        args:
        - --host
        - http://influxdb.monitoring.svc:9999
        - --token
        - secret-token
        - pkg
        - --file
        - /etc/influxdb/buckets.yaml
        - -o
        - InfluxData
        - --force
        - "true"
      containers:
      - name: nginx
        image: nginx:latest
        volumeMounts:
        - mountPath: /etc/nginx/conf.d/default.conf
          name: influxdb-config
          subPath: default.conf
        ports:
        - containerPort: 80
      # Telegraf runs as a sidecar, scraping nginx over localhost and writing
      # the metrics to InfluxDB every couple of seconds.
      - name: telegraf
        image: telegraf:1.16
        volumeMounts:
        - mountPath: /etc/telegraf/telegraf.conf
          name: influxdb-config
          subPath: telegraf.conf
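Apply the Deployment and check the pod: READY 2/2 means both nginx and the Telegraf sidecar came up, and that the init container ran pkger to completion. The filename is again arbitrary:

kubectl apply -f nginx-hpa-deployment.yaml
kubectl get pods --selector=app=nginx-hpa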
Finally, let's take a look at the HorizontalPodAutoscaler manifest that completes our demo.
We add an annotation, metric-config.external.flux-query.influxdb/http_requests, that lets us provide the Flux query to execute in order to fetch the metric that decides whether this deployment should be scaled. Our Flux query fetches the waiting field from the nginx measurement; any value greater than zero is a strong indicator that we need to scale horizontally to cope with the current volume of traffic. Our aim is to keep the number of waiting connections as close to 0 or 1 as possible.

There's a second annotation, metric-config.external.flux-query.influxdb/interval, that defines how often we want to check for traffic and scaling events. We'll use a 5-second interval.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  annotations:
    metric-config.external.flux-query.influxdb/interval: "5s"
    metric-config.external.flux-query.influxdb/http_requests: |
      from(bucket: "nginx-hpa")
        |> range(start: -30s)
        |> filter(fn: (r) => r._measurement == "nginx")
        |> filter(fn: (r) => r._field == "waiting")
        |> group()
        |> max()
        // Rename "_value" to "metricvalue" so the metrics server can properly
        // unmarshal the result.
        |> rename(columns: {_value: "metricvalue"})
        |> keep(columns: ["metricvalue"])
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: External
    external:
      metric:
        name: flux-query
        selector:
          matchLabels:
            query-name: http_requests
      target:
        type: Value
        value: "1"
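Apply the HPA, and that's everything deployed. Rather than assume anything about baton's command-line flags, here's a tool-agnostic way to trigger a scaling event: port-forward to the Deployment, generate load with a simple shell loop, and watch the HPA react. All the names below assume the manifests above.

# Terminal 1: expose nginx locally.
kubectl port-forward deployment/nginx-hpa 8080:80

# Terminal 2: generate sustained load against nginx.
while true; do curl --silent --output /dev/null http://localhost:8080/; done

# Terminal 3: watch the HPA scale towards maxReplicas, then back down once
# the load stops.
kubectl get hpa nginx-hpa --watch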
That's it! Pretty simple when you know how, right?
If you'd like to see this in more detail, or want to learn more about monitoring Kubernetes with InfluxDB, check out my examples repository, which has plenty more for you to explore.