Learning Flux (#fluxlang) Is About as Difficult as Learning an API

By Paul Dix / Developer, Product
August 28, 2018

// Here are some basic parts of the language. This is a comment

// you can assign variables, like a string
s = "this is a string"

// or an int64
i = 1

// or a float64
f = 2.0


// or an object
o = {foo: "bar", asdf: 2.0, jkl: s}
// now access one of those properties
foo = o.foo
asdf = o["asdf"]

// you can also create an object with a shorthand notation
// where the keys are the same as the variable names
o = {s, i, f}
// equivalent to o = {s: s, i: i, f: f}


// here's an array
a = ["a", "b", "c"]

// here's a duration
d = 2h10m5s
// they're actually represented as separate counts of seconds, days,
// and months. this is because the length of a day or a month can
// vary with time zones, the Gregorian calendar and all that.
// Here's one with months and days
d = 1mo7d

// here's a time
t = 2018-08-28T10:20:00Z

// define a function that takes an argument named n. Flux only has
// named arguments (no positional ones)
square = (n) => {
  // the standard math operators are in the language
  return n * n
}
// call that function
num = square(n: 23)

// or if a function is one statement you can exclude the braces
square = (n) => n * n

// Now let's do a query. The functions in this query work
// with a stream of data. The stream is made up of tables
// that have columns and records (like in CSV).
// Conceptually, functions are applied to each table in
// the stream

// start with getting data from the InfluxDB server telegraf DB
from(host: "https://127.0.0.1:9070", bucket:"telegraf/default")
    // here's the pipe-forward operator. It says to send the
    // output of the previous function to the next one (range)
    // range will filter by time. You can also pass start as
    // a time, so functions have polymorphic arguments
    |> range(start:-1h)

    // now filter specific data. Here we pass an anonymous function.
    // note that braces and a return statement aren't required if
    // the function is a single statement. Also note that we have
    // comparison and boolean operators
    |> filter(fn: (r) => r._measurement == "cpu" and r.host == "serverA")

    // now group all records from all tables into a table for
    // each region and service that we have. This converts
    // however many tables we have into one table for each
    // unique region, service pair in the data.
    |> group(keys: ["region", "service"])

    // now compute some aggregates in 10 minute buckets of time.
    // each of these aggregates will get applied to each table.
    // Note that we're passing in an array of function pointers.
    // That is min, max, and mean are all functions that are
    // pipe-forwardable and can be applied to streams.
    |> applyWindow(aggregates: [min, max, mean], every: 10m)

    // And we can iterate over the rows in a table and add or 
    // modify the returned result with the map function. This
    // will add a new column in every table called spread.
    |> map(fn: (r) => {return {spread: r.max - r.min}})

    // Return the output of the function chain as a stream called
    // "result". This isn't necessary if there is only one function
    // chain in the script.
    |> yield(name:"result")

// Finally, here's how we can define a function
// that is pipe-forwardable:
// Filter the stream to a specific set of measurements. This
// can be used with other filter functions and they'll combine.
filterMeasurements = (names, tables=<-) => {
  return tables |> filter(fn: (r) => r._measurement in names)
}
// Note that tables=<- defines a parameter called tables that
// can be passed in as a function argument, or pipe-forwarded
// from another function.
// Now we can call it through a pipe-forward operation
from(bucket:"telegraf/default") |> range(start: 2018-08-27)
    |> filterMeasurements(names: ["cpu", "mem"])
    |> yield(name:"cpu_mem")

// Or we can pass the tables input as an argument like this
filterMeasurements(names: ["cpu", "mem"], tables: (
    from(bucket:"telegraf/default")
        |> range(start:2018-07-27)
    ))
  |> yield(name:"cpu_mem")

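This short script introduces the main syntactic constructs of the language and shows an example of querying data from InfluxDB. The language has a few other features, but these are enough to get many things done in Flux. The rest of the learning curve is all API. That is, you need to know which functions exist, what their arguments are, what they do, and what they return.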

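In practice, learning Flux is mostly about learning the API you need to get a task done. The same would be true if we had chosen Lua, JavaScript, or SQL with custom functions. The syntactic elements of the language should feel fairly familiar to anyone with some JavaScript experience. The oddest of them is the pipe-forward operator, but you get used to it quickly.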
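For readers who find the pipe-forward operator unfamiliar, here is a rough Python sketch of the idea. It is only an analogy, not how Flux is implemented: `pipe`, `filter_rows`, and `add_column` are hypothetical helpers invented for this illustration. The point is that piping a value into a function is the same as passing that value as the function's `tables` argument.

```python
from functools import partial


def pipe(value, *steps):
    """Feed value through each step in order, like chaining |> in Flux."""
    for step in steps:
        # Each step receives the previous output as its `tables` argument.
        value = step(tables=value)
    return value


def filter_rows(fn, tables):
    """Keep only the rows for which fn returns True."""
    return [row for row in tables if fn(row)]


def add_column(name, fn, tables):
    """Return copies of the rows with one extra computed column."""
    return [{**row, name: fn(row)} for row in tables]


rows = [
    {"host": "serverA", "min": 1.0, "max": 4.0},
    {"host": "serverB", "min": 2.0, "max": 3.0},
]

# Pipe-forward style: reads top to bottom, in the order the work happens.
piped = pipe(
    rows,
    partial(filter_rows, fn=lambda r: r["host"] == "serverA"),
    partial(add_column, name="spread", fn=lambda r: r["max"] - r["min"]),
)

# Nested style: the same computation, read from the inside out.
nested = add_column(
    name="spread",
    fn=lambda r: r["max"] - r["min"],
    tables=filter_rows(fn=lambda r: r["host"] == "serverA", tables=rows),
)
```

Both forms produce the same rows. The piped form simply keeps the steps in the order they run, which is the readability argument for the operator.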

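Learning the API means learning what kind of data can be fed into a function, what its arguments are, what the function does, and what it outputs. That matters no matter which language the user works in. A builder user interface that has all of this information would help introduce new users to the language without requiring them to learn these constructs first.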

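Conceptually, the functions in Flux fall into four main categories. First, there are input functions, which read data from InfluxDB, a CSV file, Prometheus, or somewhere else. Next, there are functions that combine or split the tables in the result stream. Grouping, joining, and windowing functions fit this role. Then there are functions that are applied to each table, such as aggregates, selectors, and sorting. Finally, there are output functions, which send results to some data sink. Yield returns results to the user, while other outputs send results to InfluxDB, Kafka, S3, a file, and so on.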
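The four categories can be sketched in Python with a stream modeled as a list of tables and each table as a list of records. This is a toy model under stated assumptions, not Flux's API: `from_csv`, `group`, `mean_value`, and `yield_result` are hypothetical names, and the CSV data is made up for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample data for the sketch.
CSV_DATA = """region,service,value
east,api,1.0
east,api,3.0
west,db,10.0
"""


def from_csv(text):
    """Input: read records into a stream that holds a single table."""
    reader = csv.DictReader(io.StringIO(text))
    rows = [{**r, "value": float(r["value"])} for r in reader]
    return [rows]


def group(tables, keys):
    """Shaping: regroup all records into one table per unique key tuple."""
    grouped = defaultdict(list)
    for table in tables:
        for row in table:
            grouped[tuple(row[k] for k in keys)].append(row)
    return list(grouped.values())


def mean_value(tables, keys):
    """Per-table: reduce each table to a single record holding the mean."""
    out = []
    for table in tables:
        record = {k: table[0][k] for k in keys}
        record["value"] = mean(r["value"] for r in table)
        out.append([record])
    return out


def yield_result(tables, name, results):
    """Output: hand the stream to a sink, here a plain dict."""
    results[name] = tables
    return tables


results = {}
keys = ["region", "service"]
stream = from_csv(CSV_DATA)               # input
stream = group(stream, keys)              # combine or split tables
stream = mean_value(stream, keys)         # applied to each table
yield_result(stream, "result", results)   # output
```

After this runs, `results["result"]` holds two tables with one record each, one per region and service pair. Swapping the sink for a file or a message queue would only change the last step.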

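If you've read this far, you know enough Flux to handle most queries and data tasks, as long as you have the API reference at hand. Not bad for less than 1,000 words of prose and code.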

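Now, at the risk of angering the SQL fans, let's revisit the idea of learning Flux versus using SQL when you already know SQL. Among the developers I know, familiarity with SQL varies widely. It usually comes down to how often they actually write SQL statements. In many cases, developers work with relational databases every day but never write any SQL, because they have an ORM such as ActiveRecord. They rarely write SQL, which means their knowledge fades quickly.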

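For me personally, I learned SQL in 2001 and 2008, and I can say with certainty that I've forgotten more SQL than I currently know. Anything more complex than basic SELECT, INNER JOIN, WHERE, and HAVING statements sends me to the documentation and reference material. So even though I already "know" SQL, that helps me very little when it comes to actually getting something done. I think a large number of programmers are in the same position.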

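For more complex time series and data analysis queries, you will almost certainly need to look up window functions, stored procedures, and other things I've forgotten. Add to that the fact that SQL's syntax doesn't look like any other language and often reads like Yoda querying data, and there's a real learning curve even if you've been exposed to it before. Unless you write queries regularly, you'll probably need to consult the documentation for the syntax.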

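One of our goals for Flux is to make it easy to read and understand, even for someone new to the language. We want developers to carry as little cognitive load as possible when using the language itself, so they can focus on thinking about what to do with their data. Making the language look like many other popular languages was a specific design goal. We've iterated on parts of it in public, but we're always happy to hear more feedback.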

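Over the next few months, we'll be releasing early versions of Flux that are better suited for everyone to use. We'll include it in the open source release of InfluxDB 1.7, in the alpha releases of InfluxDB 2.0, and as a standalone Flux executable that can be run like a Ruby or Python interpreter. In the meantime, you can look at the Flux-specific issues, the Flux language spec, more on our motivations for building Flux, or the Flux talk I gave at InfluxDays London in June.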
