目录
强大的性能,无限的扩展
收集、组织和处理海量高速数据。当您将任何数据视为时间序列数据时,它会更有价值。借助 InfluxDB,排名第一的时间序列平台,旨在与 Telegraf 一起扩展。
查看入门方法
输入和输出集成概述
此插件允许您从 Kafka 主题实时收集指标,从而增强 Telegraf 设置中的数据监控和收集功能。
Telegraf Elasticsearch 插件无缝地将指标发送到 Elasticsearch 服务器。该插件处理模板创建和动态索引管理,并支持各种 Elasticsearch 特定的功能,以确保数据格式正确,以便存储和检索。
集成详情
Kafka
Kafka Telegraf 插件旨在从 Kafka 主题读取数据,并使用支持的输入数据格式创建指标。作为服务输入插件,它持续监听传入的指标和事件,这与以固定间隔运行的标准输入插件不同。此特定插件可以使用各种 Kafka 版本的功能,并且能够使用 SASL 等配置的安全凭证,以及使用消息偏移量和消费者组管理消息处理,从而使用指定主题中的消息。此插件的灵活性使其能够处理各种消息格式和用例,使其成为依赖 Kafka 进行数据摄取的应用程序的宝贵资产。
Elasticsearch
此插件将指标写入 Elasticsearch,Elasticsearch 是一个分布式、RESTful 的搜索和分析引擎,能够以近乎实时的速度存储大量数据。它旨在处理 Elasticsearch 5.x 到 7.x 版本,并利用其动态模板功能来正确管理数据类型映射。该插件支持高级功能,例如模板管理、动态索引命名以及与 OpenSearch 的集成。它还允许配置身份验证和 Elasticsearch 节点的运行状况监控。
配置
Kafka
[[inputs.kafka_consumer]]
## Kafka brokers.
brokers = ["localhost:9092"]
## Set the minimal supported Kafka version. Should be a string contains
## 4 digits in case if it is 0 version and 3 digits for versions starting
## from 1.0.0 separated by dot. This setting enables the use of new
## Kafka features and APIs. Must be 0.10.2.0(used as default) or greater.
## Please, check the list of supported versions at
## https://pkg.go.dev/github.com/Shopify/sarama#SupportedVersions
## ex: kafka_version = "2.6.0"
## ex: kafka_version = "0.10.2.0"
# kafka_version = "0.10.2.0"
## Topics to consume.
topics = ["telegraf"]
## Topic regular expressions to consume. Matches will be added to topics.
## Example: topic_regexps = [ "*test", "metric[0-9A-z]*" ]
# topic_regexps = [ ]
## When set this tag will be added to all metrics with the topic as the value.
# topic_tag = ""
## The list of Kafka message headers that should be pass as metric tags
## works only for Kafka version 0.11+, on lower versions the message headers
## are not available
# msg_headers_as_tags = []
## The name of kafka message header which value should override the metric name.
## In case when the same header specified in current option and in msg_headers_as_tags
## option, it will be excluded from the msg_headers_as_tags list.
# msg_header_as_metric_name = ""
## Set metric(s) timestamp using the given source.
## Available options are:
## metric -- do not modify the metric timestamp
## inner -- use the inner message timestamp (Kafka v0.10+)
## outer -- use the outer (compressed) block timestamp (Kafka v0.10+)
# timestamp_source = "metric"
## Optional Client id
# client_id = "Telegraf"
## Optional TLS Config
# enable_tls = false
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
## Period between keep alive probes.
## Defaults to the OS configuration if not specified or zero.
# keep_alive_period = "15s"
## SASL authentication credentials. These settings should typically be used
## with TLS encryption enabled
# sasl_username = "kafka"
# sasl_password = "secret"
## Optional SASL:
## one of: OAUTHBEARER, PLAIN, SCRAM-SHA-256, SCRAM-SHA-512, GSSAPI
## (defaults to PLAIN)
# sasl_mechanism = ""
## used if sasl_mechanism is GSSAPI
# sasl_gssapi_service_name = ""
# ## One of: KRB5_USER_AUTH and KRB5_KEYTAB_AUTH
# sasl_gssapi_auth_type = "KRB5_USER_AUTH"
# sasl_gssapi_kerberos_config_path = "/"
# sasl_gssapi_realm = "realm"
# sasl_gssapi_key_tab_path = ""
# sasl_gssapi_disable_pafxfast = false
## used if sasl_mechanism is OAUTHBEARER
# sasl_access_token = ""
## SASL protocol version. When connecting to Azure EventHub set to 0.
# sasl_version = 1
# Disable Kafka metadata full fetch
# metadata_full = false
## Name of the consumer group.
# consumer_group = "telegraf_metrics_consumers"
## Compression codec represents the various compression codecs recognized by
## Kafka in messages.
## 0 : None
## 1 : Gzip
## 2 : Snappy
## 3 : LZ4
## 4 : ZSTD
# compression_codec = 0
## Initial offset position; one of "oldest" or "newest".
# offset = "oldest"
## Consumer group partition assignment strategy; one of "range", "roundrobin" or "sticky".
# balance_strategy = "range"
## Maximum number of retries for metadata operations including
## connecting. Sets Sarama library's Metadata.Retry.Max config value. If 0 or
## unset, use the Sarama default of 3,
# metadata_retry_max = 0
## Type of retry backoff. Valid options: "constant", "exponential"
# metadata_retry_type = "constant"
## Amount of time to wait before retrying. When metadata_retry_type is
## "constant", each retry is delayed this amount. When "exponential", the
## first retry is delayed this amount, and subsequent delays are doubled. If 0
## or unset, use the Sarama default of 250 ms
# metadata_retry_backoff = 0
## Maximum amount of time to wait before retrying when metadata_retry_type is
## "exponential". Ignored for other retry types. If 0, there is no backoff
## limit.
# metadata_retry_max_duration = 0
## When set to true, this turns each bootstrap broker address into a set of
## IPs, then does a reverse lookup on each one to get its canonical hostname.
## This list of hostnames then replaces the original address list.
## resolve_canonical_bootstrap_servers_only = false
## Strategy for making connection to kafka brokers. Valid options: "startup",
## "defer". If set to "defer" the plugin is allowed to start before making a
## connection. This is useful if the broker may be down when telegraf is
## started, but if there are any typos in the broker setting, they will cause
## connection failures without warning at startup
# connection_strategy = "startup"
## Maximum length of a message to consume, in bytes (default 0/unlimited);
## larger messages are dropped
max_message_len = 1000000
## Max undelivered messages
## This plugin uses tracking metrics, which ensure messages are read to
## outputs before acknowledging them to the original broker to ensure data
## is not lost. This option sets the maximum messages to read from the
## broker that have not been written by an output.
##
## This value needs to be picked with awareness of the agent's
## metric_batch_size value as well. Setting max undelivered messages too high
## can result in a constant stream of data batches to the output. While
## setting it too low may never flush the broker's messages.
# max_undelivered_messages = 1000
## Maximum amount of time the consumer should take to process messages. If
## the debug log prints messages from sarama about 'abandoning subscription
## to [topic] because consuming was taking too long', increase this value to
## longer than the time taken by the output plugin(s).
##
## Note that the effective timeout could be between 'max_processing_time' and
## '2 * max_processing_time'.
# max_processing_time = "100ms"
## The default number of message bytes to fetch from the broker in each
## request (default 1MB). This should be larger than the majority of
## your messages, or else the consumer will spend a lot of time
## negotiating sizes and not actually consuming. Similar to the JVM's
## `fetch.message.max.bytes`.
# consumer_fetch_default = "1MB"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"
Elasticsearch
[[outputs.elasticsearch]]
## The full HTTP endpoint URL for your Elasticsearch instance
## Multiple urls can be specified as part of the same cluster,
## this means that only ONE of the urls will be written to each interval
urls = [ "http://node1.es.example.com:9200" ] # required.
## Elasticsearch client timeout, defaults to "5s" if not set.
timeout = "5s"
## Set to true to ask Elasticsearch a list of all cluster nodes,
## thus it is not necessary to list all nodes in the urls config option
enable_sniffer = false
## Set to true to enable gzip compression
enable_gzip = false
## Set the interval to check if the Elasticsearch nodes are available
## Setting to "0s" will disable the health check (not recommended in production)
health_check_interval = "10s"
## Set the timeout for periodic health checks.
# health_check_timeout = "1s"
## HTTP basic authentication details.
## HTTP basic authentication details
# username = "telegraf"
# password = "mypassword"
## HTTP bearer token authentication details
# auth_bearer_token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
## Index Config
## The target index for metrics (Elasticsearch will create if it not exists).
## You can use the date specifiers below to create indexes per time frame.
## The metric timestamp will be used to decide the destination index name
# %Y - year (2016)
# %y - last two digits of year (00..99)
# %m - month (01..12)
# %d - day of month (e.g., 01)
# %H - hour (00..23)
# %V - week of the year (ISO week) (01..53)
## Additionally, you can specify a tag name using the notation {{tag_name}}
## which will be used as part of the index name. If the tag does not exist,
## the default tag value will be used.
# index_name = "telegraf-{{host}}-%Y.%m.%d"
# default_tag_value = "none"
index_name = "telegraf-%Y.%m.%d" # required.
## Optional Index Config
## Set to true if Telegraf should use the "create" OpType while indexing
# use_optype_create = false
## Optional TLS Config
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
## Template Config
## Set to true if you want telegraf to manage its index template.
## If enabled it will create a recommended index template for telegraf indexes
manage_template = true
## The template name used for telegraf indexes
template_name = "telegraf"
## Set to true if you want telegraf to overwrite an existing template
overwrite_template = false
## If set to true a unique ID hash will be sent as sha256(concat(timestamp,measurement,series-hash)) string
## it will enable data resend and update metric points avoiding duplicated metrics with different id's
force_document_id = false
## Specifies the handling of NaN and Inf values.
## This option can have the following values:
## none -- do not modify field-values (default); will produce an error if NaNs or infs are encountered
## drop -- drop fields containing NaNs or infs
## replace -- replace with the value in "float_replacement_value" (default: 0.0)
## NaNs and inf will be replaced with the given number, -inf with the negative of that number
# float_handling = "none"
# float_replacement_value = 0.0
## Pipeline Config
## To use a ingest pipeline, set this to the name of the pipeline you want to use.
# use_pipeline = "my_pipeline"
## Additionally, you can specify a tag name using the notation {{tag_name}}
## which will be used as part of the pipeline name. If the tag does not exist,
## the default pipeline will be used as the pipeline. If no default pipeline is set,
## no pipeline is used for the metric.
# use_pipeline = "{{es_pipeline}}"
# default_pipeline = "my_pipeline"
#
# Custom HTTP headers
# To pass custom HTTP headers please define it in a given below section
# [outputs.elasticsearch.headers]
# "X-Custom-Header" = "custom-value"
## Template Index Settings
## Overrides the template settings.index section with any provided options.
## Defaults provided here in the config
# template_index_settings = {
# refresh_interval = "10s",
# mapping.total_fields.limit = 5000,
# auto_expand_replicas = "0-1",
# codec = "best_compression"
# }
输入和输出集成示例
Kafka
-
实时数据处理:使用 Kafka 插件将来自 Kafka 主题的实时数据馈送到监控系统。这对于需要即时反馈性能指标或用户活动的应用程序尤其有用,使企业能够更快地对其环境中的变化条件做出反应。
-
动态指标收集:利用此插件根据 Kafka 中发生的事件动态调整正在捕获的指标。例如,通过与其他服务集成,用户可以让插件即时重新配置自身,从而确保始终根据业务或应用程序的需求收集相关指标。
-
集中式日志记录和监控:实施集中式日志记录系统,使用 Kafka Consumer Plugin 将来自多个服务的日志聚合到统一的监控仪表板中。此设置可以帮助识别不同服务之间的问题,并提高整体系统可观察性和故障排除能力。
-
异常检测系统:将 Kafka 与机器学习算法结合使用以进行实时异常检测。通过不断分析流数据,此设置可以自动识别异常模式,从而更有效地触发警报和缓解潜在问题。
Elasticsearch
-
基于时间的索引:使用此插件将指标存储在 Elasticsearch 中,以根据收集时间索引每个指标。例如,CPU 指标可以存储在名为
telegraf-2023.01.01
的每日索引中,从而可以轻松进行基于时间的查询和保留策略。 -
动态模板管理:利用模板管理功能自动创建针对您的指标量身定制的自定义模板。这允许您定义如何索引和分析不同的字段,而无需手动配置 Elasticsearch,从而确保用于查询的最佳数据结构。
-
OpenSearch 兼容性:如果您正在使用 AWS OpenSearch,您可以通过激活兼容性模式来配置此插件以无缝工作,从而确保您现有的 Elasticsearch 客户端保持功能并与较新的集群设置兼容。
反馈
感谢您成为我们社区的一份子!如果您有任何一般性反馈或在这些页面上发现任何错误,我们欢迎并鼓励您提出意见。请在 InfluxDB 社区 Slack 中提交您的反馈。
强大的性能,无限的扩展
收集、组织和处理海量高速数据。当您将任何数据视为时间序列数据时,它会更有价值。借助 InfluxDB,排名第一的时间序列平台,旨在与 Telegraf 一起扩展。
查看入门方法