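Powerful performance, unlimited scale
Collect, organize, and act on massive volumes of high-velocity data. Any data becomes more valuable when you treat it as time series data. InfluxDB is a leading time series platform built to scale alongside Telegraf.
See ways to get started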
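Input and output integration overview
The Docker input plugin collects metrics from Docker containers through the Docker Engine API, improving visibility into and monitoring of containerized applications.
The OpenSearch output plugin sends metrics directly to an OpenSearch instance over HTTP, so the data can be managed and analyzed within the OpenSearch ecosystem.
Integration details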
Docker
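The Telegraf Docker input plugin collects metrics from the Docker Engine API, giving insight into running containers. It uses the official Docker client to talk to the Engine API and reports container state, resource allocation, and performance metrics. Containers can be filtered by name and state, and the tags and Docker labels attached to the metrics are customizable, so the plugin fits both local systems and orchestration platforms such as Kubernetes. Because the plugin needs permission to access the Docker daemon, that access must be configured correctly, particularly when deploying in a containerized environment.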
OpenSearch
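The OpenSearch Telegraf plugin integrates with OpenSearch over HTTP to collect and store metrics. It is designed for OpenSearch 2.x and later; compatibility with 1.x is provided through the original Elasticsearch plugin. The plugin creates and manages indexes in OpenSearch and can manage the index template automatically, so that data is structured for analysis. Configuration options cover the index name, authentication, health checks, and value handling, which lets the plugin be adapted to different operational requirements.
Configuration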
Docker
[[inputs.docker]]
## Docker Endpoint
## To use TCP, set endpoint = "tcp://[ip]:[port]"
## To use environment variables (ie, docker-machine), set endpoint = "ENV"
endpoint = "unix:///var/run/docker.sock"
## Set to true to collect Swarm metrics(desired_replicas, running_replicas)
## Note: configure this in one of the manager nodes in a Swarm cluster.
## configuring in multiple Swarm managers results in duplication of metrics.
gather_services = false
## Only collect metrics for these containers. Values will be appended to
## container_name_include.
## Deprecated (1.4.0), use container_name_include
container_names = []
## Set the source tag for the metrics to the container ID hostname, eg first 12 chars
source_tag = false
## Containers to include and exclude. Collect all if empty. Globs accepted.
container_name_include = []
container_name_exclude = []
## Container states to include and exclude. Globs accepted.
## When empty only containers in the "running" state will be captured.
# container_state_include = []
# container_state_exclude = []
## Objects to include for disk usage query
## Allowed values are "container", "image", "volume"
## When empty disk usage is excluded
storage_objects = []
## Timeout for docker list, info, and stats commands
timeout = "5s"
## Whether to report for each container per-device blkio (8:0, 8:1...),
## network (eth0, eth1, ...) and cpu (cpu0, cpu1, ...) stats or not.
## Usage of this setting is discouraged since it will be deprecated in favor of 'perdevice_include'.
## Default value is 'true' for backwards compatibility, please set it to 'false' so that 'perdevice_include' setting
## is honored.
perdevice = true
## Specifies for which classes a per-device metric should be issued
## Possible values are 'cpu' (cpu0, cpu1, ...), 'blkio' (8:0, 8:1, ...) and 'network' (eth0, eth1, ...)
## Please note that this setting has no effect if 'perdevice' is set to 'true'
# perdevice_include = ["cpu"]
## Whether to report for each container total blkio and network stats or not.
## Usage of this setting is discouraged since it will be deprecated in favor of 'total_include'.
## Default value is 'false' for backwards compatibility, please set it to 'true' so that 'total_include' setting
## is honored.
total = false
## Specifies for which classes a total metric should be issued. Total is an aggregate of the 'perdevice' values.
## Possible values are 'cpu', 'blkio' and 'network'
## Total 'cpu' is reported directly by Docker daemon, and 'network' and 'blkio' totals are aggregated by this plugin.
## Please note that this setting has no effect if 'total' is set to 'false'
# total_include = ["cpu", "blkio", "network"]
## docker labels to include and exclude as tags. Globs accepted.
## Note that an empty array for both will include all labels as tags
docker_label_include = []
docker_label_exclude = []
## Which environment variables should we use as a tag
tag_env = ["JAVA_HOME", "HEAP_SIZE"]
## Optional TLS Config
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
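
The comments above describe an interaction between the legacy `perdevice` / `total` switches and the newer `perdevice_include` / `total_include` lists. As a minimal sketch using only options from the sample configuration, the following fragment sets the legacy switches so that both lists are honored. The chosen classes are illustrative, not a recommendation.

```toml
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"

  # perdevice_include has no effect while perdevice is true,
  # so the legacy switch is turned off here.
  perdevice = false
  perdevice_include = ["cpu"]

  # total_include has no effect while total is false,
  # so the legacy switch is turned on here.
  total = true
  total_include = ["cpu", "blkio", "network"]
```

With these values, per-device metrics are limited to the `cpu` class, while totals are issued for all three classes.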
OpenSearch
[[outputs.opensearch]]
## URLs
## The full HTTP endpoint URL for your OpenSearch instance. Multiple URLs can
## be specified as part of the same cluster, but only one URL is used to
## write during each interval.
urls = ["http://node1.os.example.com:9200"]
## Index Name
## Target index name for metrics (OpenSearch will create it if it does not exist).
## This is a Golang template (see https://pkg.go.dev/text/template)
## You can also specify the metric name (`{{.Name}}`), a tag value (`{{.Tag "tag_name"}}`),
## a field value (`{{.Field "field_name"}}`), or the timestamp (`{{.Time.Format "xxxxxxxxx"}}`).
## If the tag does not exist, the default tag value will be the empty string "".
## For example: "telegraf-{{.Time.Format \"2006-01-02\"}}-{{.Tag \"host\"}}" would set it to telegraf-2023-07-27-HostName
index_name = ""
## Timeout
## OpenSearch client timeout
# timeout = "5s"
## Sniffer
## Set to true to ask OpenSearch for a list of all cluster nodes,
## thus it is not necessary to list all nodes in the urls config option
# enable_sniffer = false
## GZIP Compression
## Set to true to enable gzip compression
# enable_gzip = false
## Health Check Interval
## Set the interval to check if the OpenSearch nodes are available
## Setting to "0s" will disable the health check (not recommended in production)
# health_check_interval = "10s"
## Set the timeout for periodic health checks.
# health_check_timeout = "1s"
## HTTP basic authentication details.
# username = ""
# password = ""
## HTTP bearer token authentication details
# auth_bearer_token = ""
## Optional TLS Config
## Set to true/false to enforce TLS being enabled/disabled. If not set,
## enable TLS only if any of the other options are specified.
# tls_enable =
## Trusted root certificates for server
# tls_ca = "/path/to/cafile"
## Used for TLS client certificate authentication
# tls_cert = "/path/to/certfile"
## Used for TLS client certificate authentication
# tls_key = "/path/to/keyfile"
## Send the specified TLS server name via SNI
# tls_server_name = "kubernetes.example.com"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
## Template Config
## Manage templates
## Set to true if you want telegraf to manage its index template.
## If enabled it will create a recommended index template for telegraf indexes
# manage_template = true
## Template Name
## The template name used for telegraf indexes
# template_name = "telegraf"
## Overwrite Templates
## Set to true if you want telegraf to overwrite an existing template
# overwrite_template = false
## Document ID
## If set to true a unique ID hash will be sent as
## sha256(concat(timestamp,measurement,series-hash)) string. It will enable
## data resend and update metric points avoiding duplicated metrics with
## different id's
# force_document_id = false
## Value Handling
## Specifies the handling of NaN and Inf values.
## This option can have the following values:
## none -- do not modify field-values (default); will produce an error
## if NaNs or infs are encountered
## drop -- drop fields containing NaNs or infs
## replace -- replace with the value in "float_replacement_value" (default: 0.0)
## NaNs and inf will be replaced with the given number, -inf with the negative of that number
# float_handling = "none"
# float_replacement_value = 0.0
## Pipeline Config
## To use an ingest pipeline, set this to the name of the pipeline you want to use.
# use_pipeline = "my_pipeline"
## Pipeline Name
## Additionally, you can specify a tag name using the notation (`{{.Tag "tag_name"}}`)
## which will be used as the pipeline name (e.g. "{{.Tag \"os_pipeline\"}}").
## If the tag does not exist, the default pipeline will be used as the pipeline.
## If no default pipeline is set, no pipeline is used for the metric.
# default_pipeline = ""
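
The value-handling and document-ID options are easiest to read side by side. The fragment below is a sketch that uses only options from the sample configuration. It assumes the upstream data may contain NaN or Inf values and that points may be resent; the URL is the placeholder from the sample.

```toml
[[outputs.opensearch]]
  urls = ["http://node1.os.example.com:9200"]
  index_name = "telegraf-{{.Time.Format \"2006-01-02\"}}"

  # Replace NaN and +Inf with float_replacement_value, and -Inf with its
  # negative, instead of producing an error (the behavior of "none").
  float_handling = "replace"
  float_replacement_value = 0.0

  # Send a document ID derived from timestamp, measurement, and series hash,
  # so that a resent point updates the existing document rather than
  # creating a duplicate.
  force_document_id = true
```

Setting `float_handling = "drop"` instead would discard the affected fields rather than substitute a value, which may be preferable when a replacement number could be mistaken for a real reading.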
Input and output integration examples
Docker
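- Monitor the performance of containerized applications: Use the Docker input plugin to track the CPU, memory, disk I/O, and network activity of applications running in Docker containers. With these metrics, DevOps teams can manage resource allocation proactively, troubleshoot performance bottlenecks, and keep application performance consistent across environments.
- Integrate with Kubernetes: Use the plugin to collect metrics from Docker containers orchestrated by Kubernetes. By filtering out unneeded Kubernetes labels and focusing on key metrics, teams can simplify their monitoring setup and build dashboards that show the overall health of the microservices running in a Kubernetes cluster.
- Capacity planning and resource optimization: Use the metrics collected by the Docker input plugin to plan capacity for Docker deployments. Analyzing usage patterns helps identify underutilized resources and over-provisioned containers, and guides scale-up or scale-down decisions based on actual usage trends.
- Automated alerting on container anomalies: Define alert rules on the metrics collected by the Docker plugin to notify teams of unusual spikes in resource usage or of service interruptions. This proactive monitoring helps maintain service reliability and the performance of containerized applications.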
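
The Kubernetes scenario above mostly comes down to the include and exclude filters. The fragment below is a sketch built from options in the sample configuration. The container name glob `web-*` and the label glob `annotation.kubernetes*` are assumptions chosen for illustration and should be adapted to the names and labels present in your own cluster.

```toml
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  timeout = "5s"

  # Hypothetical name filter: only containers whose name matches the glob.
  container_name_include = ["web-*"]

  # Capture stopped containers as well; when this list is empty,
  # only containers in the "running" state are captured.
  container_state_include = ["running", "exited"]

  # Assumed label pattern: keep Kubernetes annotation labels from
  # becoming tags. Empty include and exclude lists would turn every
  # label into a tag.
  docker_label_exclude = ["annotation.kubernetes*"]
```

Excluding labels by glob keeps the tag set small, which tends to matter when every label would otherwise become a tag on each metric.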
OpenSearch
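- Dynamic indexing of time series data: Use the OpenSearch Telegraf plugin to create indexes for time series metrics dynamically, so that data is stored in an organized way that suits time-based queries. By defining the index pattern with a Go template, users can create daily or monthly indexes, which simplifies data management and retrieval over time.
- Centralized logging for multi-tenant applications: Deploy the OpenSearch plugin in a multi-tenant application so that each tenant's logs are sent to a separate index. This allows targeted analysis and monitoring per tenant while keeping tenant data isolated. With index name templating, tenant-specific indexes can be created automatically.
- Integration with machine learning for anomaly detection: Combine the OpenSearch plugin with machine learning tools to detect anomalies in metric data automatically. Once the plugin sends real-time metrics to OpenSearch, models can be applied to the incoming data to identify outliers or unusual patterns, supporting proactive monitoring and fast remediation.
- Enhanced monitoring dashboards with OpenSearch: Use the metrics stored in OpenSearch to build real-time dashboards that show system performance. By feeding metrics into OpenSearch, organizations can visualize key performance indicators in OpenSearch Dashboards, so that operations teams can assess health and performance quickly and make data-driven decisions.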
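
The dynamic-indexing and multi-tenant scenarios above both rely on the `index_name` Go template. The following is a minimal sketch; the tag name `tenant` is hypothetical and assumes each metric already carries such a tag, and the URL is the placeholder from the sample configuration.

```toml
[[outputs.opensearch]]
  urls = ["http://node1.os.example.com:9200"]

  # One index per tenant per month. "2006-01" is the Go reference-time
  # layout for year and month; "2006-01-02" would give daily indexes.
  # The "tenant" tag is an assumption, not a tag Telegraf adds by default.
  index_name = "telegraf-{{.Tag \"tenant\"}}-{{.Time.Format \"2006-01\"}}"
```

Because a missing tag renders as an empty string, metrics without a `tenant` tag would land in an index named like `telegraf--2023-07`, so it is worth making sure the tag is always set before relying on this pattern for isolation.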