Kafka and IoTDB Integration

Powerful performance and easy integration, powered by Telegraf, the open source data connector built by InfluxData.

info

This is not the recommended configuration for real-time queries at scale. For query and compression optimization, high-speed ingest, and high availability, you may want to consider Kafka and InfluxDB.

Powerful Performance, Limitless Scale

Collect, organize, and act on massive volumes of high-velocity data. Any data is more valuable when you think of it as time series data. With InfluxDB, the #1 time series platform, built to scale with Telegraf.

See Ways to Get Started

Input and Output Integration Overview

This plugin allows you to collect metrics from Kafka topics in real time, enhancing the data monitoring and collection capabilities of your Telegraf setup.

This plugin saves Telegraf metrics to an Apache IoTDB backend, supporting session connections and data insertion.

Integration Details

Kafka

The Kafka Telegraf plugin is designed to read from Kafka topics and create metrics using supported input data formats. As a service input plugin, it listens continuously for incoming metrics and events, unlike standard input plugins that run at fixed intervals. It works with features across a range of Kafka versions, consuming messages from specified topics, applying configuration such as SASL security credentials, and managing message processing through message offsets and consumer-group options. This flexibility lets the plugin handle a wide variety of message formats and use cases, making it a valuable asset for applications that rely on Kafka for data ingestion.
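
As a quick orientation before the full reference configuration below, a minimal working consumer can be as short as the following sketch; the broker address, topic, and group name are placeholders, not required values:

[[inputs.kafka_consumer]]
  ## Placeholder broker and topic; adjust to your environment.
  brokers = ["localhost:9092"]
  topics = ["telegraf"]
  ## Consumers sharing a group name split the topic's partitions between them.
  consumer_group = "telegraf_metrics_consumers"
  ## Start from the oldest available message on first run.
  offset = "oldest"
  ## Parse each message as InfluxDB line protocol, e.g.:
  ##   cpu,host=web-01 usage_idle=87.5 1700000000000000000
  data_format = "influx"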

IoTDB

Apache IoTDB (Database for Internet of Things) is an IoT-native database with high performance for data management and analytics, deployable on the edge and in the cloud. Its lightweight architecture, high performance, and rich feature set are well suited to massive data storage, high-speed data ingestion, and complex analytics in the industrial IoT domain. IoTDB integrates deeply with Apache Hadoop, Spark, and Flink, further strengthening its ability to handle large-scale data and sophisticated processing tasks.

Configuration

Kafka


[[inputs.kafka_consumer]]
  ## Kafka brokers.
  brokers = ["localhost:9092"]

  ## Set the minimal supported Kafka version. Should be a dot-separated string
  ## with 4 digits for 0.x versions and 3 digits for versions from 1.0.0 onward.
  ## This setting enables the use of new Kafka features and APIs.
  ## Must be 0.10.2.0 (used as default) or greater.
  ## Please check the list of supported versions at
  ## https://pkg.go.dev/github.com/Shopify/sarama#SupportedVersions
  ##   ex: kafka_version = "2.6.0"
  ##   ex: kafka_version = "0.10.2.0"
  # kafka_version = "0.10.2.0"

  ## Topics to consume.
  topics = ["telegraf"]

  ## Topic regular expressions to consume.  Matches will be added to topics.
  ## Example: topic_regexps = [ "*test", "metric[0-9A-z]*" ]
  # topic_regexps = [ ]

  ## When set, this tag will be added to all metrics with the topic as the value.
  # topic_tag = ""

  ## The list of Kafka message headers that should be passed as metric tags.
  ## Works only for Kafka version 0.11+; on lower versions the message headers
  ## are not available.
  # msg_headers_as_tags = []

  ## The name of the Kafka message header whose value should override the metric name.
  ## If the same header is specified both in this option and in the
  ## msg_headers_as_tags option, it will be excluded from the msg_headers_as_tags list.
  # msg_header_as_metric_name = ""

  ## Set metric(s) timestamp using the given source.
  ## Available options are:
  ##   metric -- do not modify the metric timestamp
  ##   inner  -- use the inner message timestamp (Kafka v0.10+)
  ##   outer  -- use the outer (compressed) block timestamp (Kafka v0.10+)
  # timestamp_source = "metric"

  ## Optional Client id
  # client_id = "Telegraf"

  ## Optional TLS Config
  # enable_tls = false
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

  ## Period between keep alive probes.
  ## Defaults to the OS configuration if not specified or zero.
  # keep_alive_period = "15s"

  ## SASL authentication credentials.  These settings should typically be used
  ## with TLS encryption enabled.
  # sasl_username = "kafka"
  # sasl_password = "secret"

  ## Optional SASL mechanism:
  ## one of: OAUTHBEARER, PLAIN, SCRAM-SHA-256, SCRAM-SHA-512, GSSAPI
  ## (defaults to PLAIN)
  # sasl_mechanism = ""

  ## Used if sasl_mechanism is GSSAPI.
  # sasl_gssapi_service_name = ""
  ## One of: KRB5_USER_AUTH and KRB5_KEYTAB_AUTH
  # sasl_gssapi_auth_type = "KRB5_USER_AUTH"
  # sasl_gssapi_kerberos_config_path = "/"
  # sasl_gssapi_realm = "realm"
  # sasl_gssapi_key_tab_path = ""
  # sasl_gssapi_disable_pafxfast = false

  ## Used if sasl_mechanism is OAUTHBEARER.
  # sasl_access_token = ""

  ## SASL protocol version.  When connecting to Azure EventHub set to 0.
  # sasl_version = 1

  ## Disable Kafka metadata full fetch.
  # metadata_full = false

  ## Name of the consumer group.
  # consumer_group = "telegraf_metrics_consumers"

  ## Compression codec represents the various compression codecs recognized by
  ## Kafka in messages.
  ##  0 : None
  ##  1 : Gzip
  ##  2 : Snappy
  ##  3 : LZ4
  ##  4 : ZSTD
  # compression_codec = 0

  ## Initial offset position; one of "oldest" or "newest".
  # offset = "oldest"

  ## Consumer group partition assignment strategy; one of "range", "roundrobin" or "sticky".
  # balance_strategy = "range"

  ## Maximum number of retries for metadata operations, including connecting.
  ## Sets the Sarama library's Metadata.Retry.Max config value. If 0 or unset,
  ## uses the Sarama default of 3.
  # metadata_retry_max = 0

  ## Type of retry backoff. Valid options: "constant", "exponential"
  # metadata_retry_type = "constant"

  ## Amount of time to wait before retrying. When metadata_retry_type is
  ## "constant", each retry is delayed this amount. When "exponential", the
  ## first retry is delayed this amount, and subsequent delays are doubled. If 0
  ## or unset, uses the Sarama default of 250 ms.
  # metadata_retry_backoff = 0

  ## Maximum amount of time to wait before retrying when metadata_retry_type is
  ## "exponential". Ignored for other retry types. If 0, there is no backoff
  ## limit.
  # metadata_retry_max_duration = 0

  ## When set to true, this turns each bootstrap broker address into a set of
  ## IPs, then does a reverse lookup on each one to get its canonical hostname.
  ## This list of hostnames then replaces the original address list.
  # resolve_canonical_bootstrap_servers_only = false

  ## Strategy for making connections to Kafka brokers. Valid options: "startup",
  ## "defer". If set to "defer", the plugin is allowed to start before making a
  ## connection. This is useful if the broker may be down when Telegraf is
  ## started, but any typos in the broker setting will then cause connection
  ## failures without a warning at startup.
  # connection_strategy = "startup"

  ## Maximum length of a message to consume, in bytes (default 0/unlimited);
  ## larger messages are dropped.
  max_message_len = 1000000

  ## Max undelivered messages
  ## This plugin uses tracking metrics, which ensure messages are read to
  ## outputs before acknowledging them to the original broker to ensure data
  ## is not lost. This option sets the maximum messages to read from the
  ## broker that have not been written by an output.
  ##
  ## This value needs to be picked with awareness of the agent's
  ## metric_batch_size value as well. Setting max undelivered messages too high
  ## can result in a constant stream of data batches to the output, while
  ## setting it too low may mean the broker's messages are never flushed.
  # max_undelivered_messages = 1000

  ## Maximum amount of time the consumer should take to process messages. If
  ## the debug log prints messages from Sarama about 'abandoning subscription
  ## to [topic] because consuming was taking too long', increase this value to
  ## longer than the time taken by the output plugin(s).
  ##
  ## Note that the effective timeout could be between 'max_processing_time' and
  ## '2 * max_processing_time'.
  # max_processing_time = "100ms"

  ## The default number of message bytes to fetch from the broker in each
  ## request (default 1MB). This should be larger than the majority of
  ## your messages, or else the consumer will spend a lot of time
  ## negotiating sizes and not actually consuming. Similar to the JVM's
  ## `fetch.message.max.bytes`.
  # consumer_fetch_default = "1MB"

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options; read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "influx"

IoTDB

[[outputs.iotdb]]
  ## Configuration of IoTDB server connection
  host = "127.0.0.1"
  # port = "6667"

  ## Configuration of authentication
  # user = "root"
  # password = "root"

  ## Timeout to open a new session.
  ## A value of zero means no timeout.
  # timeout = "5s"

  ## Configuration of type conversion for 64-bit unsigned integers
  ## IoTDB currently DOES NOT support unsigned integers (version 13.x).
  ## 32-bit unsigned integers are safely converted into 64-bit signed integers by the plugin;
  ## however, this is not true for 64-bit values in general, as overflows may occur.
  ## The following setting specifies how 64-bit unsigned integers are handled.
  ## Available values are:
  ##   - "int64"       --  convert to 64-bit signed integers and accept overflows
  ##   - "int64_clip"  --  convert to 64-bit signed integers and clip the values on overflow to 9,223,372,036,854,775,807
  ##   - "text"        --  convert to the string representation of the value
  # uint64_conversion = "int64_clip"

  ## Configuration of timestamps
  ## Timestamps are always stored as 64-bit integers; timestamp_precision
  ## specifies the unit of the timestamp.
  ## Available values are:
  ##   "second", "millisecond", "microsecond", "nanosecond" (default)
  # timestamp_precision = "nanosecond"

  ## Handling of tags
  ## Tags are not fully supported by IoTDB.
  ## A guide with suggestions on how to handle tags can be found here:
  ##     https://iotdb.apache.org/UserGuide/Master/API/InfluxDB-Protocol.html
  ##
  ## Available values are:
  ##   - "fields"     --  convert tags to fields in the measurement
  ##   - "device_id"  --  attach tags to the device ID
  ##
  ## For example, a metric named "root.sg.device" with the tags `tag1: "private"` and `tag2: "working"` and
  ##  fields `s1: 100`  and `s2: "hello"` will result in the following representations in IoTDB
  ##   - "fields"     --  root.sg.device, s1=100, s2="hello", tag1="private", tag2="working"
  ##   - "device_id"  --  root.sg.device.private.working, s1=100, s2="hello"
  # convert_tags_to = "device_id"

  ## Handling of unsupported characters
  ## Some characters are not supported in path names, depending on the IoTDB version.
  ## A guide with suggestions on valid paths can be found here:
  ## for IoTDB 0.13.x          -> https://iotdb.apache.org/UserGuide/V0.13.x/Reference/Syntax-Conventions.html#identifiers
  ## for IoTDB 1.x.x and above -> https://iotdb.apache.org/UserGuide/V1.3.x/User-Manual/Syntax-Rule.html#identifier
  ##
  ## Available values are:
  ##   - "1.0", "1.1", "1.2", "1.3"  -- enclose words containing forbidden characters
  ##                                    (such as @ $ # : [ ] { } ( ) space) in backticks
  ##   - "0.13"                      -- enclose words containing forbidden characters
  ##                                    (such as space) in backticks
  ##
  ## Keep this section commented if you don't want to sanitize the path
  # sanitize_tag = "1.3"
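
Putting the two halves together, a minimal end-to-end pipeline that reads line protocol from Kafka and writes it to IoTDB might look like the following sketch; the broker address, topic name, and IoTDB credentials are placeholder assumptions, not required values:

[[inputs.kafka_consumer]]
  ## Placeholder broker and topic.
  brokers = ["localhost:9092"]
  topics = ["telegraf"]
  data_format = "influx"

[[outputs.iotdb]]
  ## Placeholder connection and credentials (IoTDB ships with root/root by default).
  host = "127.0.0.1"
  port = "6667"
  user = "root"
  password = "root"
  ## Fold tags into the device path, as described in the tag-handling notes above.
  convert_tags_to = "device_id"
  ## Escape characters that are invalid in IoTDB 1.3 path names.
  sanitize_tag = "1.3"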

Input and Output Integration Examples

Kafka

  1. Real-Time Data Processing: Use the Kafka plugin to feed real-time data from Kafka topics into a monitoring system. This is especially useful for applications that require immediate feedback on performance metrics or user activity, enabling businesses to react faster to changing conditions in their environment.

  2. Dynamic Metric Collection: Use this plugin to adjust which metrics are captured dynamically, based on events occurring in Kafka. For example, by integrating with other services, users can have the plugin reconfigure itself on the fly, ensuring that the metrics collected always match the needs of the business or application.

  3. Centralized Logging and Monitoring: Implement a centralized logging system with the Kafka Consumer plugin to aggregate logs from multiple services into a unified monitoring dashboard (see the configuration sketch after this list). This setup helps identify issues across services and improves overall system observability and troubleshooting.

  4. Anomaly Detection Systems: Combine Kafka with machine learning algorithms for real-time anomaly detection. By continuously analyzing streaming data, this setup can automatically identify unusual patterns, triggering alerts and mitigating potential issues more effectively.
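
For the centralized-logging scenario in item 3, one plausible setup consumes every per-service log topic via a regular expression and tags each metric with the topic it came from; the topic pattern and tag name here are illustrative assumptions:

[[inputs.kafka_consumer]]
  brokers = ["localhost:9092"]
  ## Hypothetical pattern matching per-service topics such as "logs.auth".
  topic_regexps = [ "logs\\..*" ]
  ## Record the originating topic on each metric so dashboards can group by service.
  topic_tag = "source_topic"
  consumer_group = "telegraf_log_consumers"
  data_format = "influx"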

IoTDB

  1. Real-Time IoT Monitoring: Use the IoTDB plugin to collect sensor data from a variety of IoT devices and save it in an Apache IoTDB backend, enabling real-time monitoring of environmental conditions such as temperature and humidity (a configuration sketch follows this list). This use case lets organizations analyze trends over time and make informed decisions based on historical data, while taking advantage of IoTDB's efficient storage and query capabilities.

  2. Smart Agriculture Data Collection: Use the IoTDB plugin to gather metrics from smart-agriculture sensors deployed in the field. By streaming moisture levels, nutrient content, and atmospheric conditions to IoTDB, farmers gain detailed insight into optimal planting and watering schedules, improving crop yields and resource management.

  3. Energy Consumption Analytics: Use the IoTDB plugin to track energy consumption metrics from smart meters across a utility grid. This integration supports analysis that identifies usage peaks and predicts future consumption patterns, ultimately enabling energy-saving measures and better utility management.

  4. Automated Industrial Equipment Monitoring: Use this plugin to collect operational metrics from machinery on a factory floor and store them in IoTDB for analysis. This setup helps identify inefficiencies, predictive-maintenance needs, and operational anomalies, ensuring peak performance and minimizing unplanned downtime.
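
For the real-time IoT monitoring scenario in item 1, a sketch of the output side might fold a sensor's location tags into the IoTDB device path and clip unsigned counters instead of letting them overflow; the metric and tag names are illustrative assumptions:

[[outputs.iotdb]]
  host = "127.0.0.1"
  port = "6667"
  ## With this setting, a metric named "root.sg.sensors" with tags site="plant1"
  ## and unit="line3" is stored under the device path root.sg.sensors.plant1.line3.
  convert_tags_to = "device_id"
  ## Clip 64-bit unsigned counters at INT64 max rather than overflowing.
  uint64_conversion = "int64_clip"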

Feedback

Thank you for being part of our community! If you have any general feedback or have found any bugs on these pages, we welcome and encourage your input. Please submit your feedback in the InfluxDB Community Slack.

Related Integrations

HTTP and InfluxDB Integration

The HTTP plugin collects metrics from one or more HTTP(S) endpoints. It supports various authentication methods and configuration options for data formats.

See Integration

Kafka and InfluxDB Integration

This plugin reads messages from Kafka and allows the creation of metrics based on those messages. It supports various configurations, including different Kafka settings and message-handling options.

See Integration

Kinesis and InfluxDB Integration

The Kinesis plugin enables reading metrics from AWS Kinesis streams. It supports multiple input data formats and offers checkpointing with DynamoDB for reliable message processing.

See Integration