codingjunkie-feed

Lessons Learned Writing a Book

Sun, 27 Oct 2024 22:46:00 +0000

Earlier this year, I completed the 2nd edition of Kafka Streams in Action. Even though it’s a second edition, there were several things I wanted to change from the first edition, so it ended up being essentially a complete rewrite. Writing a book is a significant undertaking, and while it’s probably the most challenging task I’ve ever undertaken, it was gratifying. I don’t regret the amount of time I spent working on it. I learned several lessons along the way that I’d like to share in the hope that someone else will benefit from my experience (if nothing else, writing about them will help solidify the lessons for me). I have several points I’d like to discuss:

Yes, Virginia, you must have a plan
Have a good yardstick
The magic of 100 words
Objects in the mirror are larger than they appear
No matter what, never block

With the introduction out of the way, let’s get started.

Yes, Virginia, you must have a plan

The need for planning was probably my biggest surprise during the process. While the need for planning might seem obvious to some, to me, it was a revelation. I held the mistaken belief that I would simply sit down with the intent to write, the inspiration would start flowing, and my fingers would begin to furiously pound on the keyboard.

Additionally, I was sure that my productivity was linear; the more time I had, the more I’d get written. I quickly found those beliefs couldn’t be further from the truth. My actual experience looked like sitting down, distracted, and not knowing what I wanted to say. I’d get a few words down, my mind would drift, and then inevitably, I’d be surfing the internet looking at things that couldn’t be further from what I was writing about (“What happened to the cast of Gilligan’s Island?”). It seemed the more time I had, the less I accomplished.

This large block of time/low productivity scenario could be attributed to an issue known as Parkinson’s Law:

Work expands so as to fill the time available for its completion – C. Northcote Parkinson¹

In other words, if you allow yourself a large block of time without specific plans or goals, there’s a high chance you’ll use the entire time and achieve very little.

So what’s the solution to this less-than-ideal working situation? Planning and setting a schedule with deadlines. While many of us resist setting deadlines, the bottom line is they work. Consider the following illustration:

Schedule defined blocks of time for specific tasks vs. large blocks with no goal

The main idea is that instead of a big block of time to get something done, you need to set smaller amounts of time with a defined outcome. The size of the work blocks and break time shown here are arbitrary. Still, I would recommend blocks of 60-90 minutes. It’s also essential to allow for some time at the beginning to get into the “flow”; it’s nearly impossible to sit down to write and be productive immediately. A good resource for establishing a work-break process is the Pomodoro® technique.

Have a good yardstick

Along with having a good schedule, I found concrete markers for measuring my progress invaluable. Being able to gauge your progress goes hand-in-hand with establishing a good schedule. For me, the marker was the number of words written (exclusive of code examples). For Kafka Streams in Action, the publisher guidelines set each chapter at roughly 30 pages. At 30 pages, I’d have about 9,000 words. My goal was to produce a new chapter every 3 weeks. To achieve that pace, I determined I’d need to write 500 words daily, 6 days a week (one day off is essential). At that pace, I’d produce 3,000 words weekly, resulting in a new chapter every 3 weeks. While my progress was not always linear, the word count allowed me to assess my productivity accurately. More importantly, it helped me adjust my schedule when the inevitable life interruptions occurred. Not able to write for 2 days this week? It’s not a big deal. I’d just raise my word count to 1,000 on two other days, or to 750 words per day for four days, and I’d still be on track. Of all my heuristics, standardizing how I measured progress was one of the most important. The ability to “commoditize” your time into blocks allows you to schedule more effectively and fit writing into your schedule vs. having writing take over your schedule.

The magic of 100 words

No matter how motivated you are as a writer or excited about your topic, inevitably, you’ll face times when you flat out don’t feel like writing. While it’s natural to take a break here and there, if you give in to that feeling too often, you can quickly fall further behind on your schedule, which can lead to increased discouragement. The problem with waiting until you feel like doing something is that the feeling may never come. Willpower can sometimes help with motivation, but it’s not a long-term solution. I tried a few different approaches, and what ended up working for me was to set a very small goal: just write 100 words. Setting this small goal helped me overcome my writing inertia and have a productive session. Now, it doesn’t always need to be a word-count goal, but the idea is to set a small, achievable goal before giving in to the “I just don’t feel like writing” vibe. More often than not, this little trick will help you get going when the motivation is lacking.

Objects in the mirror are larger than they appear

Writing a book is a considerable project. There’s so much ground to cover, spanning several months or even years. When taking on something of this size, it’s easy to get overwhelmed by the volume of work that needs to be done. But thinking about the entire scope of work is counterproductive; it’s easy to slip into a “deer in the headlights” mode where you’re overwhelmed and not making progress. This is true of any large project, not only writing a book. In a bit of a twist on everything we’ve spoken about so far, the key is to break things down into smaller, manageable pieces. Switching your focus to each minor part makes things look much more manageable. Then, before you know it, you’ll have strung together several smaller parts into a larger whole.

No matter what, never block

This section could alternatively be called “Always make progress”. Of all the topics I’ve touched on in this blog post, this one could be the most important. Over time, you’ll find that even when applying different strategies, there will be periods when it’s tough to get anything done. Whether it’s work/family obligations or something else, life will get in the way. When you encounter such a period, it’s imperative for you to continue to make progress. The significant danger with stopping is that it’s much harder to get going again.

So when those tough times come, it is far better to write something small daily and continue to make some progress until you can get back to devoting more time to writing. Other times, you’ll be the obstacle. Let’s face it: you can apply all the strategies and still be out of steam. At those moments, it’s important to remember that making progress without writing is still possible. Maybe you have some illustrations to work on or some additional research to do. Since I was writing a technical book, I would write code for examples or lay out a concept for something I wanted to cover later. Whatever you do isn’t that important as long as you continue to move forward.

Acknowledgements

I would be remiss in my writing if I didn’t acknowledge some sources for this blog post. First, I must mention my excellent editor Frances Lefkowitz, who taught me a lot about these writing concepts. Secondly, the book The Clockwork Muse is an invaluable resource.

¹ Parkinson, Cyril Northcote (19 November 1955). “Parkinson’s Law”. The Economist. London.

Mastering Stream Processing - Testing Flink SQL windowed applications

Tue, 07 May 2024 16:46:00 +0000

We’ve covered a lot of territory in this blog series about windowing aggregations. Here are the previous posts:

In this final installment, we will cover testing a windowed application. While it may seem obvious, tests are essential to validate your logic. Usually, when testing an application, you’ll assert that N input records result in an expected result. The time semantics post emphasized that event timestamps drive the window action. So, it’s not enough to feed the application records and assert results; you must provide timestamps to advance the window correctly. In this post, we will cover how to test Apache Flink® SQL to ensure your streaming windowed applications produce correct results.

Now, let’s move on to testing Flink SQL windowed queries.

Testing Flink SQL windowed aggregations

The Flink SQL client provides an interactive environment for trying different queries, but it isn’t suited to automated tests. Fortunately, the Flink Table API allows you to execute Flink SQL statements programmatically. So, we’ll use the Table API to create integration tests you can run in JUnit. The first step to running Flink SQL in a test is to create a StreamTableEnvironment that you’ll use to drive the test:

Setting up Flink SQL test with the Table API

    StreamTableEnvironment streamTableEnv;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(1);
    env.getConfig().setRestartStrategy(RestartStrategies.noRestart());
    env.setStateBackend(new EmbeddedRocksDBStateBackend());
    streamTableEnv = StreamTableEnvironment.create(env, EnvironmentSettings.newInstance().inStreamingMode().build());

Now that you have the StreamTableEnvironment, the following steps are to create a table, populate it with data, execute the SQL statement under test, and assert the results. First, let’s make the table for the test:

Table under test

    String table = String.format("CREATE TABLE iot_readings (\n" +
    "     device_id STRING,\n" +
    "     reading DOUBLE,\n" +
    "     reading_time TIMESTAMP(3),\n" +
    "     WATERMARK FOR reading_time AS reading_time\n" +
    ") WITH (\n" +
    "    'connector' = 'kafka',\n" +
    "    'topic' = 'iot_readings',\n" +
    "    'properties.bootstrap.servers' = 'localhost:%s',\n" + <1>
    "    'scan.startup.mode' = 'earliest-offset',\n" +
    "    'scan.bounded.mode' = 'latest-offset',\n" +
    "    'key.format' = 'raw',\n" +
    "    'key.fields' = 'device_id',\n" +
    "    'value.format' = 'json',\n" +
    "    'value.fields-include' = 'EXCEPT_KEY'\n" +
    ");", kafkaPort); <2>

<1> A placeholder for String.format to set the Apache Kafka® port
<2> The variable containing the Kafka port determined by the KafkaContainer

Here, the WITH clause tells Flink to use the Kafka connector. The test will also use Testcontainers to provide the running Kafka instance. I won’t go into using Testcontainers in a test, but you can get the details from this blog post by AtomicJar.

Now that you have the table definition along with the kafka-sql-connector configuration, the next step is to execute this statement to create the table:

Running the CREATE TABLE statement

    streamTableEnv.executeSql(table).await();

You’ll use the await() method to ensure that the test doesn’t progress until Flink completes creating the table. The next step is to populate the table with data:

Inserting the testing sensor data

    String insertStatement = "INSERT INTO iot_readings VALUES\n" +
    "    ('south_sensor', 5.0, TO_TIMESTAMP('2024-04-09 01:01:00')),\n" +
    "    ('south_sensor', 10.0, TO_TIMESTAMP('2024-04-09 01:03:00')),\n" +
    "    ('south_sensor', 9.7, TO_TIMESTAMP('2024-04-09 01:04:00')),\n" +
    "    ('south_sensor', 12.0, TO_TIMESTAMP('2024-04-09 01:06:00')),\n" +
    "    ('south_sensor', 11.9, TO_TIMESTAMP('2024-04-09 01:07:00')),\n" +
    "    ('south_sensor', 8.2, TO_TIMESTAMP('2024-04-09 01:09:00'));";

    streamTableEnv.executeSql(insertStatement).await();

Here, we’re simply using the familiar SQL INSERT statement to get data into the table, and again, we see the await method in use to make sure the test only progresses after the data inserts are finished. With the data inserted into the table, the next step is to execute the windowed query and compare the results against what we expect them to be:

Executing the query and comparing results

    String query = "SELECT device_id,\n" +
    "    MAX(reading) AS max_reading,\n" +
    "    window_start,\n" +
    "    window_end\n" +
    "FROM TABLE(TUMBLE(TABLE iot_readings, DESCRIPTOR(reading_time), INTERVAL '5' MINUTES))\n" +
    "GROUP BY device_id, window_start, window_end;";

    TableResult tableResult = streamTableEnv.executeSql(query);  <1>

    List<Row> actualResults = rowObjectsFromTableResult(tableResult); <2>

    List<Row> expectedRowResults = Arrays.asList(Row.ofKind(           <3>
                                                          RowKind.INSERT,
                                                          "south_sensor",
                                                          10.0,
                                                          parseTS("2024-04-09 01:00:00"),
                                                          parseTS("2024-04-09 01:05:00")),
                                                 Row.ofKind(
                                                          RowKind.INSERT,
                                                          "south_sensor",
                                                          12.0,
                                                          parseTS("2024-04-09 01:05:00"),
                                                          parseTS("2024-04-09 01:10:00"))
                                                       );
       assertEquals(expectedRowResults, actualResults); <4>

<1> Executing the query
<2> Extracting the results into an ArrayList
<3> Creating the expected results
<4> Asserting the actual results match the expected ones.

So, the final step is straightforward. The test executes the query and compares the returned results to what it expects them to be. Notice that since we’ve inserted simple data, it’s trivial to construct the predicted list of results. For completeness, here’s the code for the rowObjectsFromTableResult method:

Extracting the Row objects

     public List<Row> rowObjectsFromTableResult(TableResult tableResult) throws Exception {
        try (CloseableIterator<Row> closeableIterator = tableResult.collect()) {
          List<Row> rows = new ArrayList<>();
          closeableIterator.forEachRemaining(rows::add);
          return rows;
        }
      }

The method uses the TableResult.collect() method to gather the query results from the CloseableIterator and put them in a container more suitable for comparison at the end of the test.
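The expected rows in the test also call a parseTS helper that the listings don’t show. Here’s a minimal sketch of what such a helper might look like, assuming it turns a yyyy-MM-dd HH:mm:ss string into a java.time.LocalDateTime (to my understanding, the class Flink uses by default for TIMESTAMP columns in a Row). The class name TimestampParsing and the return type are assumptions, not the actual test code:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the class name and the LocalDateTime return type are assumptions
final class TimestampParsing {

    // Matches the 'yyyy-MM-dd HH:mm:ss' strings used in the INSERT statement
    private static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Parses a timestamp string into a LocalDateTime so it can be compared with a Row field
    static LocalDateTime parseTS(String timestamp) {
        return LocalDateTime.parse(timestamp, FORMATTER);
    }
}
```

With a helper like this, the expected window_start and window_end values can be written in the same format as the test data.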

There’s one last point I’d like to make before we move on. The last record of the test data has a time of 01:09:00. With this being the last record, the watermark wouldn’t reach 01:10:00, so how can the second window produce results that satisfy the test? It goes back to the configuration for the Flink SQL Kafka connector, specifically the scan.bounded.mode=latest-offset configuration. The scan.bounded.mode configuration determines where the stream ends; with latest-offset, the source stops after reading up to the latest offsets in the topic. In other words, it sets a bound on the stream from Kafka; otherwise, the table is considered unbounded. Using the bounded mode is especially useful in tests, since reaching the end of the input flushes out all pending results from any windows, joins, etc. that would otherwise hang, waiting for watermarks that aren’t going to come.
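To make the flushing behavior concrete, here’s a toy model in plain Java. It isn’t Flink code; it assumes a window fires once the watermark reaches the window’s last timestamp (end minus 1 ms), and that reaching the end of a bounded input advances the watermark to its maximum value, which is my understanding of how Flink behaves. The class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of watermark-driven window firing; not Flink code
final class BoundedWatermarkSketch {

    static final long WINDOW_SIZE_MS = 5 * 60 * 1000L;

    // Pending windows keyed by window start, holding the max reading seen so far
    private final TreeMap<Long, Double> pending = new TreeMap<>();
    // Max readings of the windows that have fired, in firing order
    private final List<Double> fired = new ArrayList<>();
    private long watermark = Long.MIN_VALUE;

    void onRecord(long eventTimeMs, double reading) {
        long windowStart = eventTimeMs - (eventTimeMs % WINDOW_SIZE_MS);
        pending.merge(windowStart, reading, Math::max);
        // Mirrors "WATERMARK FOR reading_time AS reading_time": the watermark follows the event time
        onWatermark(Math.max(watermark, eventTimeMs));
    }

    void onWatermark(long newWatermark) {
        watermark = newWatermark;
        // Fire every window whose last timestamp (end - 1 ms) the watermark has reached
        while (!pending.isEmpty()
                && pending.firstKey() + WINDOW_SIZE_MS - 1 <= watermark) {
            Map.Entry<Long, Double> window = pending.pollFirstEntry();
            fired.add(window.getValue());
        }
    }

    // Assumption: the end of a bounded input pushes the watermark to its maximum value
    void endOfInput() {
        onWatermark(Long.MAX_VALUE);
    }

    List<Double> fired() {
        return fired;
    }
}
```

Feeding in the six test readings (at minutes 1, 3, 4, 6, 7, and 9) fires only the first window; the second one fires when endOfInput() is called, which is the role the bounded mode plays in the test.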

This blog post has been a quick tour of testing Flink SQL windowed queries, but it can serve as the basis for writing future tests against other windowed SQL queries.

Resources

Mastering Stream Processing - Testing Kafka Streams windowed applications

Sat, 16 Mar 2024 20:55:00 +0000

In this blog series about windowing aggregations we’ve covered a lot of territory. Here are the previous posts:

In this final installment, we will cover testing a windowed application. Testing is essential to validate your logic. Typically, when testing an application, you’ll assert that N input records result in an expected result. The time semantics post emphasized that event timestamps drive the window action. So, when testing, it’s not enough to feed the application records and assert results; you must provide timestamps to advance the window correctly. In the next two posts, we will cover how to test both Kafka Streams and Flink SQL to ensure your streaming windowed applications produce correct results. I originally planned to cover testing in one post, but it grew too long, so I split it in half. This post focuses on testing in Kafka Streams.

Testing Kafka Streams windowed aggregations

Kafka Streams provides the TopologyTestDriver (TTD) for testing a topology without the need for a live broker. As a result, TTD tests execute very fast and allow you to thoroughly test all topologies, from the simple to the complex. Generally speaking, a TTD test will involve using a TestInputTopic to push some records through your topology and then capturing the results with a TestOutputTopic. It will look something like the following code listing (I’ve left out several details for clarity):

A TopologyTestDriver test of a Kafka Streams application

    try(TopologyTestDriver driver = new TopologyTestDriver(topology)) {

       TestInputTopic<String, String> inputTopic = driver.createInputTopic(INPUT_TOPIC...);

       TestOutputTopic<String, String> outputTopic = driver.createOutputTopic(OUTPUT_TOPIC...);

       inputTopic.pipeInput("foo");
       inputTopic.pipeInput("bar");

       List<String> expectedOutput = Arrays.asList("FOO", "BAR");
       List<String> actualOutput = outputTopic.readValuesToList();
       assertEquals(expectedOutput, actualOutput);
    }

As the listing above shows, testing a Kafka Streams application is straightforward. One thing that’s not obvious from this example is the use of timestamps. Under the covers, the TTD will create timestamps for all the input records. This approach is acceptable for a topology without windowing, since it doesn’t require timestamps to calculate results.

But once you have a topology that requires advancing timestamps, i.e., a windowed aggregation, you’ll want to take another approach and supply custom time values. For a unit test, you’ll want to use just enough records to validate your application, and with generated timestamps, the difference in time between those records will be so slight that it won’t drive the window behavior. To solve this issue, the TestInputTopic provides pipeInput method overloads accepting a java.time.Instant, allowing you to effectively advance a windowed operation with a small number of input records.

For example, assume you have a one-minute tumbling window aggregation (no grace period) of string keys and double values. Let’s take a look at the test code where you set timestamps to advance the window to contain a small number of expected values:

Providing timestamps to drive a windowed operation

    try(TopologyTestDriver driver = new TopologyTestDriver(tumblingWindows.topology(properties))) {
        TestInputTopic<String, Double> testInputTopic = driver.createInputTopic(inputTopic,
                                                                               stringSerializer,
                                                                               doubleSerializer);

       LocalDate localDate = LocalDate.ofInstant(Instant.now(), ZoneId.systemDefault());<1>
       LocalDateTime localDateTime = LocalDateTime.of(localDate.getYear(),
                                                      localDate.getMonthValue(),
                                                      localDate.getDayOfMonth(), 12, 0, 18);

       Instant instant = localDateTime.toInstant(ZoneOffset.UTC); 

       testInputTopic.pipeInput("deviceOne", 10.0, instant);
       testInputTopic.pipeInput("deviceOne", 35.0, instant.plusSeconds(20)); <2> 
       testInputTopic.pipeInput("deviceOne", 45.0, instant.plusSeconds(40));
       testInputTopic.pipeInput("deviceOne", 15.0, instant.plusSeconds(70)); <3>

    }

Let’s walk through this code:

<1> You’re creating an Instant from a LocalDateTime object
<2> Next, you advance the time 20 and 40 seconds with the second and third record inputs.
<3> With the fourth record, you advance the time by more than 1 minute, so Kafka Streams will close the first window containing records 1-3 and create a new one containing the fourth record.

I want to discuss the block of code setting the LocalDateTime:

Setting current date time for the test

    LocalDate localDate = LocalDate.ofInstant(Instant.now(), ZoneId.systemDefault());
    LocalDateTime localDateTime = LocalDateTime.of(localDate.getYear(),
                                                   localDate.getMonthValue(),
                                                   localDate.getDayOfMonth(), 12, 0, 18);

In this case, the first record has a timestamp of 12:00:18, so we’ll have a window starting at 12:00:00 (tumbling windows are aligned to the epoch). The 2nd and 3rd records advance the timestamp value, not the window start time. Choosing the initial timestamp is essential because you’ll want to ensure you have enough room for subsequent advances that align with your testing assertions.
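To see how epoch alignment plays out for the timestamps in this test, here’s a small sketch in plain Java (no Kafka Streams dependency). The class and method names are made up for illustration; the calculation just rounds a timestamp down to the nearest multiple of the window size:

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative only: computes the start of the epoch-aligned tumbling window for a timestamp
final class TumblingWindowAlignment {

    static Instant windowStart(Instant timestamp, Duration size) {
        long sizeMs = size.toMillis();
        long timestampMs = timestamp.toEpochMilli();
        // Round down to the nearest multiple of the window size since the epoch
        return Instant.ofEpochMilli(timestampMs - Math.floorMod(timestampMs, sizeMs));
    }
}
```

For a one-minute window, a record at 12:00:18 and one 40 seconds later (12:00:58) both map to the window starting at 12:00:00, while the record 70 seconds later (12:01:28) maps to the window starting at 12:01:00.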

If you have record payloads that contain timestamps, and assuming you’re using a custom TimestampExtractor, then you’ll follow the same approach placing timestamps on the values you’re piping through the test.

Now, let’s walk through asserting windowed results. The result of the windowed aggregation is an IotSensorAggregation object that tracks the number of readings taken, the highest value seen, and the sum of readings to calculate an average. We’ll use this information to validate our aggregation code:

Validating the windowed aggregation

    List<KeyValue<Windowed<String>, IotSensorAggregation>> results = testOutputTopic.readKeyValuesToList();
    IotSensorAggregation firstWindowAggregation = results.get(0).value;
    IotSensorAggregation lastWindowAggregation = results.get(2).value;
    IotSensorAggregation secondWindowAggregation = results.get(3).value;

    assertEquals(10.0, firstWindowAggregation.highestSeen());
    assertEquals(1, firstWindowAggregation.numberReadings());
    assertEquals(10.0, firstWindowAggregation.averageReading());

    assertEquals(45.0, lastWindowAggregation.highestSeen());
    assertEquals(3, lastWindowAggregation.numberReadings());
    assertEquals(30.0, lastWindowAggregation.averageReading());

    assertEquals(15.0, secondWindowAggregation.highestSeen());
    assertEquals(1, secondWindowAggregation.numberReadings());
    assertEquals(15.0, secondWindowAggregation.averageReading());

This testing code asserts that our windowed aggregation is operating correctly. The first window should only have 1 reading, and the average should equal the highest seen value since it’s the first record. It then asserts that the last aggregation of the 1-minute window should contain 3 readings and the correct average. Finally, it asserts that the last record input should be in a new window since its timestamp advanced over 1 minute, so it should have a similar state to the first window in that it contains 1 reading.
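The indexes 0, 2, and 3 in the assertions make sense if you assume the test driver emits one updated aggregation per input record. The following sketch replays the four test readings in plain Java under that assumption. ReadingStats is a hypothetical stand-in for IotSensorAggregation (whose source isn’t shown here), and the offsets are seconds past 12:00:00:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for IotSensorAggregation; field and class names are assumptions
final class ReadingStats {
    final long numberReadings;
    final double highestSeen;
    final double sum;

    ReadingStats(long numberReadings, double highestSeen, double sum) {
        this.numberReadings = numberReadings;
        this.highestSeen = highestSeen;
        this.sum = sum;
    }

    // Returns a new aggregation that includes the given reading
    ReadingStats add(double reading) {
        return new ReadingStats(numberReadings + 1, Math.max(highestSeen, reading), sum + reading);
    }

    double averageReading() {
        return sum / numberReadings;
    }
}

final class WindowedUpdatesSketch {

    static final long WINDOW_SIZE_SECONDS = 60;

    // Replays (offset, reading) pairs and records one updated aggregation per input record
    static List<ReadingStats> replay(long[] offsetsSeconds, double[] readings) {
        Map<Long, ReadingStats> windows = new HashMap<>();
        List<ReadingStats> updates = new ArrayList<>();
        for (int i = 0; i < readings.length; i++) {
            long windowStart = offsetsSeconds[i] - (offsetsSeconds[i] % WINDOW_SIZE_SECONDS);
            ReadingStats previous = windows.get(windowStart);
            ReadingStats updated = previous == null
                    ? new ReadingStats(1, readings[i], readings[i])
                    : previous.add(readings[i]);
            windows.put(windowStart, updated);
            updates.add(updated);
        }
        return updates;
    }
}
```

Replaying offsets 18, 38, 58, and 88 with readings 10.0, 35.0, 45.0, and 15.0 yields four updates: the first three belong to the first window (the third holding all 3 readings with an average of 30.0), and the fourth starts the second window with a single reading.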

Resources

Mastering Stream Processing - Viewing and analyzing results

Fri, 15 Mar 2024 16:40:00 +0000

This is the sixth blog in a series on windowing in event stream processing. Here’s a list of the previous posts:

In this post, we’ll move on from covering the specific window implementations and discuss the viewing and analysis techniques for windowed results.

Flink SQL results

Given that Flink SQL renders results in a table, viewing the details of windowed operations is straightforward. Flink SQL creates 3 columns for windowing-TVF queries: window_start, window_end, and window_time (not shown here for brevity). Flink SQL calculates the window_time field by subtracting 1 ms from the window_end value. So, results from a query with the window columns will look similar to the following:

Table results of the query

+------------------+-------+------+------------------+------------------+
|          bidtime | price | item |     window_start |       window_end |
+------------------+-------+------+------------------+------------------+
| 2020-04-15 08:05 |  4.00 | C    | 2020-04-15 08:00 | 2020-04-15 08:10 |
| 2020-04-15 08:07 |  2.00 | A    | 2020-04-15 08:00 | 2020-04-15 08:10 |
| 2020-04-15 08:09 |  5.00 | D    | 2020-04-15 08:00 | 2020-04-15 08:10 |
| 2020-04-15 08:11 |  3.00 | B    | 2020-04-15 08:10 | 2020-04-15 08:20 |
| 2020-04-15 08:13 |  1.00 | E    | 2020-04-15 08:10 | 2020-04-15 08:20 |
| 2020-04-15 08:17 |  6.00 | F    | 2020-04-15 08:10 | 2020-04-15 08:20 |
+------------------+-------+------+------------------+------------------+
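The window columns in the table above can be reproduced by hand. Here’s a short sketch in plain Java, assuming 10-minute tumbling windows aligned to the epoch and timestamps treated as UTC; the class and method names are invented for illustration and aren’t part of Flink:

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Illustrative only: derives window_start, window_end, and window_time for an event time
final class WindowColumnsSketch {

    // Rounds the event time down to the window boundary (sub-second precision is dropped)
    static LocalDateTime windowStart(LocalDateTime eventTime, Duration size) {
        long sizeSeconds = size.getSeconds();
        long epochSeconds = eventTime.toEpochSecond(ZoneOffset.UTC);
        long startSeconds = epochSeconds - Math.floorMod(epochSeconds, sizeSeconds);
        return LocalDateTime.ofEpochSecond(startSeconds, 0, ZoneOffset.UTC);
    }

    static LocalDateTime windowEnd(LocalDateTime windowStart, Duration size) {
        return windowStart.plus(size);
    }

    // window_time is the window end minus 1 ms
    static LocalDateTime windowTime(LocalDateTime windowEnd) {
        return windowEnd.minus(1, ChronoUnit.MILLIS);
    }
}
```

For the bid at 2020-04-15 08:11, this gives a window_start of 08:10, a window_end of 08:20, and a window_time of 08:19:59.999.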

So, the results plainly show the time for each event tracked by the window. But directly running a windowing query and displaying the results to the console has one drawback: the results aren’t shared, and anyone else interested in them would need to craft and run their own queries. While running queries from a console is suitable for prototyping and testing different SQL statements, it doesn’t lend itself to organization-wide distribution of results. For that, a better approach would be to store the results of a windowed query in another table whose schema and existence can be circulated within an organization.

For example, consider a query that will generate an alert when an average reading exceeds a given threshold. Going back to the blog post on OVER aggregations, you first generated a query to perform an aggregation per row:

SQL OVER aggregation with aggregated results per row

    SELECT location, device_id, report_time,
       AVG(temp_reading) OVER (
         PARTITION BY location
         ORDER BY report_time
          RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW
     ) AS one_minute_location_temp_averages
    FROM readings;

This query does all the work of generating an average over a sliding range of 1 minute from the readings table. We have some work to do to make it capture the alerting state we’re interested in. At this point, we have two possible approaches: creating a new table with the aggregations or creating a new table with only the alert data.
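If the RANGE clause feels abstract, the following plain Java sketch mimics what the query computes for a single location: for each row, the average of every reading whose report time falls within the preceding minute, boundaries included. It’s a rough model of the semantics, not how Flink evaluates the query, and the names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Rough model of AVG(...) OVER (... RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW)
final class OverRangeAverageSketch {

    static final long RANGE_SECONDS = 60;

    // All rows are assumed to belong to one partition (a single location)
    static List<Double> rollingAverages(long[] reportTimesSeconds, double[] temps) {
        List<Double> averages = new ArrayList<>();
        for (int i = 0; i < temps.length; i++) {
            long upper = reportTimesSeconds[i];
            long lower = upper - RANGE_SECONDS;
            double sum = 0;
            int count = 0;
            // Include every row whose report time is within [lower, upper]
            for (int j = 0; j < temps.length; j++) {
                if (reportTimesSeconds[j] >= lower && reportTimesSeconds[j] <= upper) {
                    sum += temps[j];
                    count++;
                }
            }
            averages.add(sum / count);
        }
        return averages;
    }
}
```

For readings of 10, 20, 30, and 40 taken at 0, 30, 60, and 120 seconds, the averages come out as 10, 15, 20, and 35: each row gets its own result, and older readings drop out once they’re more than a minute behind the current row.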

Let’s create a table with all the aggregations generated by this query:

Create a Table with all aggregations

    CREATE TABLE reading_averages (location STRING,
                                   device_id STRING,
                                   report_time TIMESTAMP(3),
                                   reading_averages DOUBLE);

Now you’ll write a persistent query that will continually update this table with the results of our OVER aggregation:

Populating a table with OVER aggregation results

    INSERT INTO reading_averages
      SELECT location, device_id, report_time,
       AVG(temp_reading) OVER (
         PARTITION BY location
         ORDER BY report_time
          RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW
         ) AS one_minute_location_temp_averages
      FROM readings;

Here, we’ve wrapped our OVER aggregation with an INSERT statement to push the results into our table for later analysis. To take this further, imagine we want a table containing only our alerted state. We can still use our table schema, but make a slight change to the column names:

Create a Table with only alert aggregations

    CREATE TABLE reading_alerts (location STRING,
                                 device_id STRING,
                                 report_time TIMESTAMP(3),
                                 reading_alerts DOUBLE);

Now you’ll follow a similar approach but with 2 nested queries: the first performs the OVER aggregation, and the second pulls out the records meeting the alerting threshold:

Extracting only the alert results

    SELECT location, device_id, report_time, avg_temps
     FROM (
            SELECT location, device_id, report_time, Avg(reading)
                 OVER (ORDER BY report_time
                        RANGE BETWEEN INTERVAL '15' MINUTE PRECEDING AND CURRENT ROW
                     ) AS avg_temps
            FROM readings
    )
    WHERE avg_temps > SOME_VALUE;

At this point, we’re almost there; we now need to get this data into a table. We’ll take the same approach as before and wrap this SELECT statement with an INSERT to push the results to a table:

Pushing only alert results to a table

    INSERT INTO reading_alerts
        SELECT location, device_id, report_time, avg_temps
           FROM (
                  SELECT location, device_id, report_time, Avg(reading)
                     OVER (ORDER BY report_time
                           RANGE BETWEEN INTERVAL '15' MINUTE PRECEDING AND CURRENT ROW
                      ) AS avg_temps
              FROM readings
            )
           WHERE avg_temps > SOME_VALUE;

Now we have a table containing only the alert results that anyone can access.

Kafka Streams results

With Kafka Streams, you have a couple of choices when it comes to evaluating the windowed results. Let’s revisit the sliding window example:

Sliding windows in Kafka Streams

    Serde<Windowed<String>> windowedSerde =
            WindowedSerdes.timeWindowedSerdeFrom(String.class,
                                                  60_000L
                                                );

    KStream<String,Double> iotHeatSensorStream =
      builder.stream("heat-sensor-input",
        Consumed.with(stringSerde, doubleSerde));

    iotHeatSensorStream.groupByKey()
          .windowedBy(
                      SlidingWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(1)) 
                      )
            .aggregate(() -> new IotSensorAggregation(tempThreshold),
             aggregator,
             Materialized.with(stringSerde, aggregationSerde))
             .toStream().to("sensor-agg-output",
               Produced.with(windowedSerde, aggregationSerde));

As it stands here, this will produce results to Apache Kafka®. To analyze the windowed results, i.e., consume from the output topic, you would need to use the Serde<Windowed<String>> class to get the deserializer for the key, which means you’re leaking specific details of the streaming application. Additionally, I find it challenging to have the window start and end separated in the key vs. the value. Kafka Streams needs to store the window in the key during processing to ensure it’s appropriately handled as time advances. Still, once it emits a windowed aggregation, we don’t need to maintain it in the key.

Instead, I’d propose mapping a new value that contains the start and end times. Taking this a step further, I suggest adding two long fields to your aggregation so that adding the window times is simple. From there, you’ll update the topology to use a map operation to extract the window information and place it in the aggregation value. But before we do that, let’s create a KeyValueMapper that will know how to extract the window start and end:

KeyValueMapper to get window start and end

    public class WindowTimeToAggregateMapper implements KeyValueMapper<Windowed<String>,
                                                                      IotSensorAggregation,
                                                                      KeyValue<String, IotSensorAggregation>> {
        @Override
        public KeyValue<String, IotSensorAggregation> apply(Windowed<String> windowed,
                                                            IotSensorAggregation iotSensorAggregation) {
            long start = windowed.window().start(); <1>
            long end = windowed.window().end();

            iotSensorAggregation.setWindowStart(start); <2>
            iotSensorAggregation.setWindowEnd(end);

            return KeyValue.pair(windowed.key(), iotSensorAggregation);
        }
    }

<1> Extracting the window start and end times
<2> Setting the window start and end time on the aggregation object

Since KeyValueMapper is a single abstract method (SAM) interface, we could define it inline as a lambda in the Kafka Streams topology, but it’s useful to create a concrete class for testing. Now you need to plug this into the Kafka Streams DSL:

Adding the KeyValue mapper into the topology

    KStream<String,Double> iotHeatSensorStream =
      builder.stream("heat-sensor-input",
        Consumed.with(stringSerde, doubleSerde));

    iotHeatSensorStream.groupByKey()
          .windowedBy(
                      SlidingWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(1))
                      )
            .aggregate(() -> new IotSensorAggregation(tempThreshold),
             aggregator,
             Materialized.with(stringSerde, aggregationSerde))
             .toStream()
             .map(new WindowTimeToAggregateMapper()) 
             .to("sensor-agg-output",
               Produced.with(stringSerde, aggregationSerde));

Applying the KeyValueMapper to extract the window starting and closing time.

Now, with the addition of the KStream.map operator with the new KeyValueMapper, you’ve updated your aggregation to include the start and end of the window. Since you’ve also pulled the underlying key out, you’ll switch the Serde for Produced to reflect the change in types. When you analyze the aggregation result, you’ll have direct access to the window starting and ending times.

Kafka Streams also allows you to directly observe the results of the aggregation from its state store via Interactive Queries. I won’t go into those details here, but you can view a presentation on building an Interactive Query service from the 2022 Kafka Summit and take a look at the accompanying source code.

Resources

Mastering Stream Processing - Time semantics

Thu, 29 Feb 2024 16:40:00 +0000

In the previous blog in this series, we wrapped up coverage of the different windowing types. Here is the list of earlier installments in this series:

In this post, we’ll move on from specific code examples and discuss the time semantics of window advancement and the forwarding of results. We’ve now discussed the different window types, how they function, and potential best use cases. But we’ve left some crucial questions unanswered. In this post, we’ll address the following questions:

Time semantics and determining what time to use
What determines when a window starts and ends?
Extracting timestamps and how to handle time advancement?
How do you handle out-of-order records?

Determining the timestamp to use

Kafka Streams and Flink SQL use event timestamps, so they’re based on event time, not system time (both systems can be configured to use system time, but I won’t discuss that here). An event timestamp is the time the event occurred, and it’s part of the record. System time is the current time of the stream processing engine and provides processing-time semantics. We will focus on event-time semantics; later in this blog, we’ll discuss the mechanics of how the stream processing systems extract the event time and some related details. But for now, it’s enough to say the event timestamps drive windowing operations.

Window start and end time

Record event timestamps are at the heart of windowing, but their involvement depends on the window type. Hopping, tumbling, and cumulating windows are aligned to the epoch, starting on January 1, 1970. What does aligned to the epoch mean exactly? Let’s answer that question with the help of an illustration:

Windows aligned to the epoch collect records that fit into the correct slot

From this illustration, a five-minute tumbling window starts at 00:00:00 on January 1, 1970. Then, every five minutes, a new window is created (logically) up to the present moment. So when a new record arrives, its event timestamp isn’t the window start time but determines *which* window it belongs to. Time advances in these windows as the event timestamps of the incoming records increase. The following graphic helps demonstrate this process:

Time for a window advances as event timestamps increase

So, once a record arrives with a timestamp greater than the current window end, a new window is built, either by an advance time or window size, depending on whether the window is hopping or tumbling. This advancement description does not account for out-of-order records; we’ll get to that later.
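To make the epoch alignment concrete, here is a minimal sketch in plain Java (no Kafka Streams or Flink dependency) of how a timestamp maps to its window. The class and method names are hypothetical, and the arithmetic is only my illustration of the idea described above, not code lifted from either engine:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps event timestamps to epoch-aligned windows.
class EpochAlignedWindows {

    // Start of the tumbling window holding the timestamp:
    // round the timestamp down to a multiple of the window size.
    static long tumblingWindowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Start of every hopping window [start, start + size) that contains
    // the timestamp, newest window first.
    static List<Long> hoppingWindowStarts(long timestampMs, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        long latestStart = timestampMs - (timestampMs % advanceMs);
        for (long start = latestStart;
             start > timestampMs - sizeMs && start >= 0;
             start -= advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long fiveMinutes = Duration.ofMinutes(5).toMillis();
        long oneMinute = Duration.ofMinutes(1).toMillis();
        long eventTime = Instant.parse("2024-02-29T16:42:10Z").toEpochMilli();

        // The record's timestamp is not the window start; it only selects the window.
        System.out.println(Instant.ofEpochMilli(tumblingWindowStart(eventTime, fiveMinutes)));
        // A hopping window with a smaller advance places the record in several windows.
        System.out.println(hoppingWindowStarts(eventTime, fiveMinutes, oneMinute).size());
    }
}
```

The point of the sketch is that the window boundaries exist independently of the data; a record's event timestamp only decides which slot (or slots, for hopping windows) it lands in.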

Session and sliding windows (Kafka Streams version of sliding windows) are more behavior-driven and have different semantics. Session windows use the event timestamps to start and close windows. When the first record arrives for a session window, its timestamp becomes the start of the window. Once a record arrives where the timestamp difference exceeds the inactivity gap (accounting for any grace period), a new session starts. The following picture will help in understanding this process:

Session windows use event timestamps for opening and closing times

From this illustration, the session start time is the timestamp of the first record, and the ending time is the timestamp of the last record included in the session. To be more precise, with session windows in Kafka Streams, when a new record arrives, it creates a new window for the record. Kafka Streams then looks to merge that new session window with an existing session. If the new session window’s start timestamp is within the existing session window’s ending timestamp plus the inactivity gap, Kafka Streams will merge the new session into the existing one. This process of merging sessions is how they continue to grow with events inside the inactivity gap. This merging process accounts for out-of-order records that could connect two older sessions into a single larger one.
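The merging behavior is easier to see in code. What follows is a simplified sketch in plain Java, not the Kafka Streams implementation: the SessionMerger class is hypothetical, it handles a single key, and it assumes a record exactly one inactivity gap away from a session still merges into it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-key model of session merging.
class SessionMerger {
    private final long gapMs;
    // Each session is a two-element array: {start, end}, both event timestamps.
    private final List<long[]> sessions = new ArrayList<>();

    SessionMerger(long gapMs) {
        this.gapMs = gapMs;
    }

    void add(long timestampMs) {
        // Every record begins as its own single-record session.
        long start = timestampMs;
        long end = timestampMs;
        List<long[]> untouched = new ArrayList<>();
        for (long[] session : sessions) {
            // Assumption: the gap check is inclusive on both sides.
            boolean withinGap = timestampMs >= session[0] - gapMs
                    && timestampMs <= session[1] + gapMs;
            if (withinGap) {
                // Absorb the existing session into the new, larger one.
                start = Math.min(start, session[0]);
                end = Math.max(end, session[1]);
            } else {
                untouched.add(session);
            }
        }
        untouched.add(new long[] {start, end});
        sessions.clear();
        sessions.addAll(untouched);
    }

    List<long[]> sessions() {
        return sessions;
    }

    public static void main(String[] args) {
        SessionMerger merger = new SessionMerger(60_000L);
        merger.add(0L);
        merger.add(30_000L);
        merger.add(140_000L);   // too far from the first session, so a second one starts
        System.out.println(merger.sessions().size());
        merger.add(85_000L);    // out-of-order record within the gap of both sessions
        System.out.println(merger.sessions().size());
    }
}
```

The last add call is the interesting one: an out-of-order record that falls within the gap of two existing sessions bridges them into a single larger session.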

Sliding windows in Kafka Streams have a fixed size, specified by the maximum time difference between incoming events. But, like session windows, they use event timestamps for the window start and end.

Now that we’ve discussed how the opening and closing of windows operate, let’s move on to time advancement.

Time Advancement

For a window’s time to advance, there needs to be some mechanism to extract an event timestamp from a record and apply it so that time will move forward. Kafka Streams and Flink SQL handle this differently, but the results are the same.

Kafka Streams time advancement

Kafka Streams uses a TimestampExtractor to get the event timestamp. By default, it uses the timestamp embedded in each consumed record, typically set by the producer. If you prefer to use a timestamp embedded in the record payload, you can write a custom TimestampExtractor, which “knows” which field to grab and use for the timestamp. Kafka Streams keeps track of the highest observed timestamp on a per-partition basis. This current highest timestamp is known as “streamtime” and only moves forward. When an out-of-order record arrives, the streamtime remains unchanged. We’ll discuss out-of-order records later on. Let’s look at an illustration of the concept of streamtime:

Kafka Streams keeps track of timestamps known as stream time

So, as Kafka Streams consumes records, it checks the timestamp of the current record, and if it exceeds the current streamtime, Kafka Streams updates streamtime. Kafka Streams shares the timestamp of the current record via a record context that accompanies each record as it flows through the topology. Each window operator keeps track of streamtime itself, and when it advances due to the event timestamps, Kafka Streams will close existing windows where time has advanced beyond their size and/or create new windows. For session windows, if the difference between streamtime and the current timestamp exceeds the inactivity gap, Kafka Streams will not merge the new session into the existing one but will use it to start a new session.
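The "only moves forward" rule boils down to taking a running maximum. Here is a tiny sketch in plain Java; the StreamTimeTracker class is hypothetical and only models the idea for a single partition:

```java
// Hypothetical model of stream time for one partition.
class StreamTimeTracker {
    // No records observed yet.
    private long streamTime = -1L;

    // Observes a record's event timestamp and returns the resulting stream time.
    long observe(long recordTimestampMs) {
        // Stream time is the highest timestamp seen so far; it never moves backward.
        streamTime = Math.max(streamTime, recordTimestampMs);
        return streamTime;
    }

    long streamTime() {
        return streamTime;
    }

    public static void main(String[] args) {
        StreamTimeTracker tracker = new StreamTimeTracker();
        System.out.println(tracker.observe(100L));
        System.out.println(tracker.observe(250L));
        // An out-of-order record leaves stream time unchanged.
        System.out.println(tracker.observe(180L));
    }
}
```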

For stateful operations in Kafka Streams, including windowed aggregations, results are buffered and released incrementally, either on commit or when the local cache is full. If you only want to receive a final result, you can set the EmitStrategy on the window to ON_WINDOW_CLOSE. Here’s the tumbling window example configured only to emit a final result:

Tumbling window with only final results

    KStream<String,Double> iotHeatSensorStream =
      builder.stream("heat-sensor-input",
        Consumed.with(stringSerde, doubleSerde));
    iotHeatSensorStream.groupByKey()
          .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
            .emitStrategy(EmitStrategy.onWindowClose())  // <1>
            .aggregate(() -> new IotSensorAggregation(tempThreshold),
             aggregator,
             Materialized.with(stringSerde, aggregationSerde))
             .toStream().to("sensor-agg-output",
               Produced.with(windowedSerde, aggregationSerde));

Specifying to emit results only after the window closes

Using EmitStrategy.onWindowClose() is an efficient approach for working with sliding windows, which would otherwise emit many updates due to their 1 ms window advancement.

Now, let’s look at time advancement in Flink SQL.

Flink SQL time advancement

Logically, Flink time advancement works in the same way. It looks to extract the event timestamp from the incoming records. To do this event timestamp extraction, you provide a TimestampAssigner. But instead of the concept of streamtime for event time progression, Flink uses watermarks. A watermark is an assertion that the stream is now complete up through the timestamp the watermark carries. Here’s an illustration showing the watermark process:

Flink SQL uses watermarks to indicate to downstream operators what the current event time is

So, as you can see in this illustration, Flink operators will use the watermark timestamp to advance windows or close and open new ones. Flink only emits windowed results after a window closes due to a watermark advancing the window beyond its configured size.

In the Flink Data Stream API, you’ll set the TimestampAssigner and WatermarkGenerator together with a WatermarkStrategy. But we’re focusing on Flink SQL, so you’ll specify a watermark strategy as a statement when you issue a CREATE TABLE statement.

Consider this table definition tracking movie ratings entered by a user on a review site:

Flink SQL table definition with watermark strategy

    CREATE TABLE ratings (
        rating_id INT,
        title STRING,
        release_year INT,
        rating DOUBLE,
        rating_time TIMESTAMP(3), <1> 
        WATERMARK FOR rating_time AS rating_time <2>
    )

Timestamp of the movie rating event
Specifying to use the rating_time column as the watermark timestamp

In Flink SQL, the watermark strategy takes the form WATERMARK FOR <column with timestamp> AS <watermark strategy expression>. The strategy depicted here is strictly ascending timestamps, which forwards the maximum observed timestamp. Flink SQL evaluates the watermark strategy for each incoming record and periodically emits a watermark at the interval defined by pipeline.auto-watermark-interval, which has a default value of 200 ms. Flink will only emit a new watermark if it is larger than the previous one; it doesn’t forward one that is smaller (or null). It’s worth noting the timestamp column needs to be a TIMESTAMP(3) or TIMESTAMP_LTZ(3) type, where the 3 represents the precision of fractional seconds. In this case, it’s millisecond precision.

Another common element with stream time and watermarks is that advancing time affects all keys in a partition (Kafka Streams) or task slot (Flink SQL). Let’s take a look at an illustration to help understand what this means:

Advancing time affects all keys per partition or task slot

As you can see, we have three keys, A, B, and C, with roughly the same start time for their respective windows. When time advances due to an incoming record for C, its window closes, and so do the windows for A and B, even though they do not have any new records. This action is an example of time advancement influencing all keys per partition/task slot.

Under ideal circumstances (an evenly distributed key space and a consistent flow of records), this time advancement should be fine because all the keys equally drive time advancement. However, an uneven distribution of keys combined with out-of-order records could cause a window to advance and close without including records that would otherwise belong in it. Configuring a grace period is one way to keep the more active keys from closing windows before valid records for other keys arrive: even though those records arrive out of order, the grace period ensures their inclusion in the windows. We’ll cover grace periods and out-of-order data in the next section.

Out of order data

So far, with our discussion of windowing, I’ve assumed the happy path of records arriving in order, that is, strictly ascending timestamps. But in practice, a record may arrive out-of-order. Let’s look at an illustration to help explain what an out-of-order record is:

An out-of-order record would have been included in a window if it arrived in order

This illustration shows us that an out-of-order record would have been included in a now-closed window had it arrived in order. Given that we want our windowed results to have as complete a picture as possible, making allowances for out-of-order data makes sense. You should allow a grace period, where a window can include records it would otherwise reject. In Kafka Streams, you can explicitly add a grace period to a window definition:

Adding grace to Kafka Streams window operator

    KStream<String,Double> iotHeatSensorStream =
      builder.stream("heat-sensor-input",
        Consumed.with(stringSerde, doubleSerde));
    iotHeatSensorStream.groupByKey()
          .windowedBy(TimeWindows.ofSizeAndGrace(Duration.ofMinutes(1),Duration.ofSeconds(30))) // <1>
            .emitStrategy(EmitStrategy.onWindowClose())
            .aggregate(() -> new IotSensorAggregation(tempThreshold),
             aggregator,
             Materialized.with(stringSerde, aggregationSerde))
             .toStream().to("sensor-agg-output",
               Produced.with(windowedSerde, aggregationSerde));

Defining a tumbling window of one minute with thirty seconds grace.

The grace period works like this. When a window operator evaluates if it should include the current record, it will subtract the grace period time from the current stream time value. It will be included if the record’s timestamp fits into the grace-adjusted time for the window. Here’s a quick depiction of a grace period in action:

Grace period in action

So, by defining a grace period, you can include records that arrive out of order. Any records arriving after the grace period expiration are considered late and are discarded.
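Here is the grace period arithmetic as a small sketch in plain Java. The GracePeriodCheck class is hypothetical and assumes an epoch-aligned tumbling window; it mirrors the description above (subtract the grace period from stream time, then compare against the window end) rather than reproducing the Kafka Streams source:

```java
// Hypothetical model of the grace period check for a tumbling window.
class GracePeriodCheck {

    // Returns true when the record's window is still open for it.
    static boolean accepts(long recordTimestampMs, long windowSizeMs,
                           long graceMs, long streamTimeMs) {
        // Epoch-aligned window [windowStart, windowEnd) that owns the record.
        long windowStart = recordTimestampMs - (recordTimestampMs % windowSizeMs);
        long windowEnd = windowStart + windowSizeMs;
        // Grace-adjusted stream time: windows ending at or before this are closed.
        long closeTime = streamTimeMs - graceMs;
        return windowEnd > closeTime;
    }

    public static void main(String[] args) {
        long size = 60_000L;   // one-minute window
        long grace = 30_000L;  // thirty seconds of grace

        // Record at 0:55 arrives when stream time is 1:20; still within grace.
        System.out.println(accepts(55_000L, size, grace, 80_000L));
        // Same record arrives when stream time is 1:35; grace has expired.
        System.out.println(accepts(55_000L, size, grace, 95_000L));
    }
}
```

With the one-minute window and thirty-second grace from the example above, a record stamped 0:55 is still accepted when stream time is 1:20 but is treated as late once stream time reaches 1:35.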

Flink SQL also makes provisions for out-of-order records that operate in a similar manner. You would adjust the watermark strategy expression to allow for out-of-order records:

Flink SQL table definition with watermark strategy expression with a grace period

    CREATE TABLE ratings (
        rating_id INT,
        title STRING,
        release_year INT,
        rating DOUBLE,
        rating_time TIMESTAMP(3),
        WATERMARK FOR rating_time AS rating_time - INTERVAL '30' SECOND <1>
    )

This watermark strategy allows records in the ratings table to be as much as 30 seconds out-of-order.

Updating the watermark strategy this way makes it “Bounded out of orderness timestamps”. So when a record arrives out of order, but its timestamp still fits inside the current watermark, it will be included in the windowed calculation.
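To show what the rating_time - INTERVAL '30' SECOND expression does, here is a simplified sketch in plain Java. The BoundedOutOfOrdernessWatermark class is hypothetical and is not Flink's implementation; it only models the two rules described earlier: the watermark trails the highest observed timestamp by a fixed bound, and a watermark is emitted only when it is larger than the previous one.

```java
import java.util.OptionalLong;

// Hypothetical model of a bounded out-of-orderness watermark strategy.
class BoundedOutOfOrdernessWatermark {
    private final long maxOutOfOrdernessMs;
    private long maxTimestamp = Long.MIN_VALUE;
    private long lastEmitted = Long.MIN_VALUE;
    private boolean seenEvent = false;

    BoundedOutOfOrdernessWatermark(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
    }

    // Called for every incoming record.
    void onEvent(long eventTimestampMs) {
        seenEvent = true;
        maxTimestamp = Math.max(maxTimestamp, eventTimestampMs);
    }

    // Called on the periodic emit interval; empty when there is nothing new to emit.
    OptionalLong onPeriodicEmit() {
        if (!seenEvent) {
            return OptionalLong.empty();
        }
        long candidate = maxTimestamp - maxOutOfOrdernessMs;
        if (candidate <= lastEmitted) {
            // Watermarks never move backward, so skip smaller or equal values.
            return OptionalLong.empty();
        }
        lastEmitted = candidate;
        return OptionalLong.of(candidate);
    }

    public static void main(String[] args) {
        BoundedOutOfOrdernessWatermark watermark = new BoundedOutOfOrdernessWatermark(30_000L);
        watermark.onEvent(100_000L);
        System.out.println(watermark.onPeriodicEmit());
        watermark.onEvent(90_000L); // out of order, so the watermark does not change
        System.out.println(watermark.onPeriodicEmit());
    }
}
```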

Time and Low traffic partitions

Before wrapping up, I’d like to cover one more angle of time semantics that applies equally to Kafka Streams (stream time) and Flink SQL (watermarks): behavior with low-traffic partitions. I had mentioned before that with both Kafka Streams and Flink SQL, the event time of the incoming records drives the progress of the event stream.

But what happens when you have low or infrequent traffic? You’ll not observe windowed results regularly without new events to push time advancement. Flink SQL has the concept of “idleness” that allows time to advance when faced with a task slot not receiving regular new events. The table.exec.source.idle-timeout or sql.tables.scan.idle-timeout on Confluent Cloud lets you specify an upper bound on the amount of time to wait for new records before considering a task slot as idle. Setting this configuration (the default is 0, which turns off detecting idleness) allows downstream operators to advance watermarks, providing windowed results without waiting for new records from the event source. You can offer similar functionality in Kafka Streams with a bit of manual work using the KStream.process method. This method provides for a mixin of the Processor API, opening the door to scheduling a punctuation, which would allow you to retrieve and forward windowed results with an idle partition.

Resources

Mastering Stream Processing - Session and Cumulating windows

Mon, 26 Feb 2024 21:00:00 +0000

In the third installment of this windowing blog series, you’ll learn about cumulating and session windows. In previous posts, we’ve covered hopping and tumbling windows and sliding windows and the Flink SQL equivalent - OVER aggregations. The cumulate window is unique to Flink SQL. The session window has been available in Kafka Streams since version 0.10.2 and is going to be available in the newest version (1.19) of Flink SQL as part of its stable windowing table-valued functions (TVFs).

Before jumping in, you might ask yourself what cumulating is and how it relates to accumulating. The difference between cumulate and accumulate is that the latter is a more intentional gathering, while cumulate means to gather together what you already have.

Now, let’s get into cumulating windows.

Cumulating windows

The cumulate window is also part of Flink SQL’s windowing TVF stable and has a fixed size and steps that advance it. Each advance includes the data from the window start and each previous advance. Once the advances reach the window size, the data resets to only what is available at the beginning of the new window. This concept is probably best understood with an illustration:

Cumulating windows have a fixed size with advances smaller than the length of the window

So, from looking at this picture, each advance of the window includes all previous records from the window start. So, each advance accumulates the results up to the window end. Then, the window advances reset from the beginning of the newest window. Another way to think about the cumulate window is a tumbling window where you get updates at regular intervals.

This explanation could still leave some doubt about what the cumulating window does, so let’s look at one more illustration with values:

A Cumulate window with a sum function

Each event in the above illustration represents a purchase transaction, and for simplicity of the example, let’s say each transaction is $5. At the window start, there is a transaction; with each step, there’s an additional one. Since we have a window size of 1 minute with a 15-second step, our results are w1 = $5, w2 = $10, w3 = $15, w4 = $20. Each step includes the previous events from the overall window start. Once the window reaches its size, the results would reset to the beginning of the next window.
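The arithmetic from the illustration can be reproduced with a few lines of plain Java. This is only a sketch with a hypothetical CumulateWindowSum class; it computes the sum for each step [window start, window start + k * step) and is not how Flink evaluates the CUMULATE function internally:

```java
import java.util.Arrays;

// Hypothetical model of a cumulate window with a sum aggregate.
class CumulateWindowSum {

    // Returns one sum per step; step k covers [windowStart, windowStart + k * stepMs).
    static long[] cumulativeSums(long[] eventTimesMs, long[] amounts,
                                 long windowStartMs, long sizeMs, long stepMs) {
        int steps = (int) (sizeMs / stepMs);
        long[] sums = new long[steps];
        for (int k = 1; k <= steps; k++) {
            long stepEnd = windowStartMs + k * stepMs;
            for (int i = 0; i < eventTimesMs.length; i++) {
                // Every step starts from the overall window start, so earlier events stay in.
                if (eventTimesMs[i] >= windowStartMs && eventTimesMs[i] < stepEnd) {
                    sums[k - 1] += amounts[i];
                }
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        // One $5 purchase at the window start and one in each 15-second step,
        // plus one more that belongs to the next one-minute window.
        long[] times = {0L, 15_000L, 30_000L, 45_000L, 60_000L};
        long[] amounts = {5L, 5L, 5L, 5L, 5L};

        System.out.println(Arrays.toString(cumulativeSums(times, amounts, 0L, 60_000L, 15_000L)));
        // The next window starts over and only sees the event at 60 seconds.
        System.out.println(Arrays.toString(cumulativeSums(times, amounts, 60_000L, 60_000L, 15_000L)));
    }
}
```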

To use this functionality, specify the window type by using the reserved function name CUMULATE inside the TABLE function:

Specifying the CUMULATE function

    SELECT window_start,
           window_end,
           user_id,
           sum(page_view) AS page_views
    FROM TABLE(CUMULATE   <1>
                   (TABLE user_visits,   <2>
                         DESCRIPTOR(visit_time), <3>   
                         INTERVAL '15' SECONDS,  <4>
                         INTERVAL '1' MINUTE   <5>
                   ))
    GROUP BY window_start,
             window_end,
             user_id

Let’s break this query down:

Using the CUMULATE function
Specifying the table for the function source
Timestamp column providing time attributes for the windows
The step size of each advance
Maximum size of the window

In an earlier post, I discussed how Kafka Streams windows emit updates regularly and Flink SQL windows only emit on closing. The functionality of a cumulating window is logically similar to the Kafka Streams windowing model since it emits updates before the final one. But there’s a difference: Kafka Streams updates are tied to committing or cache eviction events and are not configurable.

Now, let’s move on to session windows.

Session windows

Session windows differ significantly from the previous ones we’ve seen so far in that they don’t have a fixed size. Instead, session windows define an inactivity period, and as long as records arrive within the inactivity period, a session window continues to grow. A new window starts only when a new record arrives and its timestamp is equal to or greater than the inactivity period plus the end timestamp of the current session. Due to the nature of session windows, the record timestamps determine the start and end of the window.

Let’s review this process in the following depiction of session windows:

Session windows continue to grow until the gap between the latest and incoming timestamps exceeds the inactivity period

So, following along with this illustration, the session window continues to grow until a record arrives with a timestamp 1 minute or more past the ending timestamp of the current session. The arrival of a record with a timestamp outside the inactivity gap results in the creation of a new session.

Kafka Streams Session Window

Here’s how you define a session window in Kafka Streams:

Session windows in Kafka Streams

     Serde<Windowed<String>> sessionWindowSerde =
          WindowedSerdes.sessionWindowedSerdeFrom(String.class); // <1>
     builder.stream(inputTopic, Consumed.with(Serdes.String(), clicksSerde))
            .groupByKey()
            .windowedBy(SessionWindows.ofInactivityGapWithNoGrace ( // <2>
                                                      Duration.ofMinutes(1) // <3>
                                                  )
                        )
            .count()
            .toStream()
            .to(outputTopic, Produced.with(sessionWindowSerde, Serdes.Long()));

Let’s break it down step by step (by the way, Serde here refers to Serializer/Deserializer; welcome back to the distributed systems world):

Creating a Serde for session windows
Specifying to use session windows for the aggregation. Here, we’re choosing not to use a grace period so that Kafka Streams will drop out-of-order records. In this series, we’ll discuss grace periods in the blog post on time semantics.
The amount of inactivity, 1 minute, between events before the current session terminates and a new session starts.

Session Windows in Flink SQL

To define a session window in Flink SQL, you’ll use the windowing TVF format. This example assumes the use case of tracking a click stream on a website:

Session Windows in Flink SQL

    SELECT window_start,
           window_end,
           COUNT(click) AS total_clicks
      FROM TABLE(SESSION     <1>
                   (TABLE page_views,  <2>
                    DESCRIPTOR(click_time), <3>
                    INTERVAL '1' MINUTES    <4>
                    )
                )
    GROUP BY window_start, window_end;

Let’s break it down step by step:

Specifying the SESSION windowing TVF function
Source table for events
Time attribute column
The inactivity gap for defining when to start a new session

Use cases

Cumulating Window

For the cumulate window, any windowed aggregation where you do a count or sum is a candidate use case.

We can generalize the cumulating window use case as “Give me <aggregate> over the last N period, updated every Y.” Instead of waiting for the window to close, you can get updates to help you understand the trends leading up to the final window result.

Session Window

For the session window, we could say, “Show me <aggregate> of events occurring within <inactivity period>.” Since the session window starts and ends with event timestamps and continues to grow with incoming records within the inactivity gap, it lends itself well to tracking behavior.

Things like tracking a click stream come to mind. The first click event starts the window, and the session continues to grow until it doesn’t receive more events within the inactivity timeout. Then, when more events come in, a new session starts.

Resources

Mastering Stream Processing - Sliding windows and OVER aggregations

Wed, 14 Feb 2024 13:23:23 +0000

In the second installment of this windowing blog series, you’ll learn about sliding windows and a bit of SQL. In the previous post, we covered hopping and tumbling windows, both of which Kafka Streams and Flink SQL provide. In this installment, we will discuss sliding windows, which Kafka Streams supports directly and for which Flink SQL provides a logical equivalent. Let’s jump into sliding windows.

Sliding windows

Sliding windows in Kafka Streams combine attributes of the previous windows we’ve seen in this blog series. Like the hopping or tumbling variants, a sliding window has a fixed size determined by the maximum time difference between records. But record timestamps determine the start and end times of the window, like a session window. Another difference with the sliding window is that both start and end times are inclusive as opposed to only the start time as with the other windows.

You can think of a sliding window as one that continually "slides" over an event stream, with new records entering the front and older records falling out the back.

While you could emulate sliding windows in Kafka Streams by defining a hopping window with a 1 ms advance, the sliding window has some distinct advantages. First, the sliding window start and end times are inclusive, unlike the hopping window, where only the start time is inclusive. Second, sliding windows are more efficient as they only calculate distinct windows. A new window is created only when a record enters or drops out of the window. A hopping window with a small advance is less efficient as it will perform its calculation for every window regardless of whether the windows contain different events.

As each record arrives, Kafka Streams creates a new window, including any previous records that fit within the maximum time difference defined by the window. This "look back" feature is unique to the sliding window behavior. Let’s look at an illustration of sliding windows in action:

Sliding windows create a new window for new records, and when records drop out of a window

So, from our simple illustration, we can see how incoming records create a new window and include previous records that fit within the (inclusive) time difference.

Here’s how you define a Kafka Streams sliding window:

Sliding windows in Kafka Streams

    KStream<String,Double> iotHeatSensorStream =
      builder.stream("heat-sensor-input",
        Consumed.with(stringSerde, doubleSerde));

    iotHeatSensorStream.groupByKey()
          .windowedBy(
                      SlidingWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(1))
                      )
            .aggregate(() -> new IotSensorAggregation(tempThreshold),
             aggregator,
             Materialized.with(stringSerde, aggregationSerde))
             .toStream().to("sensor-agg-output",
               Produced.with(windowedSerde, aggregationSerde));

Using a sliding window with a time difference of one minute, when a new record arrives, previous records within the time difference are included in the window.
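To illustrate the "look back", here is a simplified sketch in plain Java. The SlidingWindowLookBack class is hypothetical and only models the window that ends at a given record's timestamp, with both bounds inclusive; Kafka Streams itself also maintains the windows created when records drop out, which this sketch leaves out:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the look-back for a single sliding window.
class SlidingWindowLookBack {

    // Timestamps that fall in [windowEndMs - timeDifferenceMs, windowEndMs], both inclusive.
    static List<Long> recordsInWindow(long[] timestampsMs, long windowEndMs, long timeDifferenceMs) {
        List<Long> inWindow = new ArrayList<>();
        long windowStartMs = windowEndMs - timeDifferenceMs;
        for (long timestamp : timestampsMs) {
            if (timestamp >= windowStartMs && timestamp <= windowEndMs) {
                inWindow.add(timestamp);
            }
        }
        return inWindow;
    }

    public static void main(String[] args) {
        long[] timestamps = {0L, 20_000L, 60_000L, 61_000L};
        long oneMinute = 60_000L;

        // Window ending at the record stamped 60s reaches back to 0s, inclusive.
        System.out.println(recordsInWindow(timestamps, 60_000L, oneMinute));
        // One second later, the record at 0s has dropped out the back.
        System.out.println(recordsInWindow(timestamps, 61_000L, oneMinute));
    }
}
```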

Now, let’s move on to Flink SQL.

OVER Aggregation

While Flink SQL doesn’t have an exact one-to-one match with the Kafka Streams sliding window, it does provide essentially the same functionality with OVER aggregations. Using the OVER clause in Flink SQL allows you to perform an aggregation over a range of rows, but what makes it unique is that, unlike a GROUP BY aggregation, the OVER aggregation does not reduce the results; it includes all the rows in the aggregation range. Note that you could do something similar in Kafka Streams using the Processor API.

There’s a subtle difference in the results of GROUP BY and an OVER aggregation with a PARTITION BY. The easiest way to show the differences between the two will be with illustrations. Consider the following table of data as the basis for our comparison:

Table of temperature readings

Now let’s look at the results of a GROUP BY aggregation first:

GROUP BY Aggregates collapse the details into singular results

The results here are what we’ve all come to expect: the original rows are reduced into a single row per location with the average reading. Now contrast that with the OVER approach:

OVER Aggregates return all rows in the range

The results of an OVER (PARTITION BY…) aggregation contain all the rows of the range. Each row contains the same value for the average by location, but you have all the other information available to view. This demonstrates the differences between GROUP BY and OVER (PARTITION BY…) aggregations. Both clauses group things together, but a PARTITION BY does not combine the rows in the results; each row remains distinct. It’s important to note here that although results are shown here for each row in the table, it’s only for demonstration purposes. An OVER aggregation only returns results for rows that fall into the specified range.

So, in what may be an oversimplification, an OVER aggregation allows you to perform aggregates and group the results but still view the individual rows, while a GROUP BY will collapse the rows and provide a single-row result per grouping.
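If SQL isn’t your first language, the contrast may be clearer in plain Java. This is a sketch with a hypothetical GroupByVersusOver class and made-up sample readings; like the illustrations above, it averages over every row in the partition for demonstration purposes and ignores the range:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical comparison of GROUP BY and OVER (PARTITION BY ...) results.
class GroupByVersusOver {

    static final class Reading {
        final String location;
        final double temp;

        Reading(String location, double temp) {
            this.location = location;
            this.temp = temp;
        }
    }

    // Made-up sample data for the example.
    static List<Reading> sampleRows() {
        return Arrays.asList(
                new Reading("north", 70.0),
                new Reading("north", 74.0),
                new Reading("south", 80.0));
    }

    // GROUP BY location: rows collapse into one average per location.
    static Map<String, Double> groupByAverage(List<Reading> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> r.location, TreeMap::new, Collectors.averagingDouble(r -> r.temp)));
    }

    // OVER (PARTITION BY location): every input row survives and carries the average.
    static List<String> overAverage(List<Reading> rows) {
        Map<String, Double> averages = groupByAverage(rows);
        List<String> output = new ArrayList<>();
        for (Reading r : rows) {
            output.add(r.location + "," + r.temp + "," + averages.get(r.location));
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(groupByAverage(sampleRows())); // two entries, one per location
        overAverage(sampleRows()).forEach(System.out::println); // three rows, one per reading
    }
}
```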

Let’s jump into an example query now. Let’s say you have a fleet of IoT sensors deployed in different parts of a manufacturing process, and monitoring the temperature is essential to spot problems and keep the process running smoothly. So you’ll want a query that will give you the average temps per location over the last minute:

OVER Aggregation in Flink SQL

    SELECT device_id, report_time,
       AVG(temp_reading) OVER ( <1>
         PARTITION BY location  <2>
         ORDER BY report_time   <3>
         RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW <4>
     ) AS one_minute_location_temp_averages <5>
    FROM readings;

The OVER clause
Partitioning by the location
Ordering results by the report_time column
A range definition specifying the range to go back 1 minute in results
The name of the average calculation column

So, this query will give us a running average of temperatures partitioned by location while keeping all rows. You can also specify the range as a count of rows from the current row. In Flink SQL, the ORDER BY is required and only works with ascending time attributes. The range definitions come in two forms:

A RANGE interval dependent on the time attribute defined by the ORDER BY column
A ROWS interval, which is count-based and specifies how many rows the result will contain. A ROWS interval looks like this: ROWS BETWEEN N PRECEDING AND CURRENT ROW, including N+1 result rows (the N preceding rows plus the current row). The CURRENT ROW is the starting point for a specified range determined by the PARTITION BY clause.

The choice of which range definition to apply depends on your use case. The RANGE interval will drop older records as new records advance the window, but you’ll always know the records in the aggregation are within a given time. The ROWS interval ensures that you’ll always have a fixed number of records (N+1, once enough rows have arrived) making up your computation.
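Here is the difference between the two range definitions as a sketch in plain Java. The RangeVersusRows class is hypothetical, works on a single partition, and assumes the rows are already sorted by ascending time:

```java
// Hypothetical model of RANGE and ROWS intervals over one partition.
class RangeVersusRows {

    // RANGE BETWEEN INTERVAL rangeMs PRECEDING AND CURRENT ROW:
    // averages rows whose time is within rangeMs of the current row's time.
    static double rangeAverage(long[] timesMs, double[] values, int current, long rangeMs) {
        double sum = 0.0;
        int count = 0;
        for (int i = 0; i <= current; i++) {
            if (timesMs[i] >= timesMs[current] - rangeMs) {
                sum += values[i];
                count++;
            }
        }
        return sum / count;
    }

    // ROWS BETWEEN preceding PRECEDING AND CURRENT ROW:
    // averages a fixed count of rows, no matter how old they are.
    static double rowsAverage(double[] values, int current, int preceding) {
        int from = Math.max(0, current - preceding);
        double sum = 0.0;
        for (int i = from; i <= current; i++) {
            sum += values[i];
        }
        return sum / (current - from + 1);
    }

    public static void main(String[] args) {
        long[] times = {0L, 30_000L, 60_000L, 200_000L};
        double[] temps = {10.0, 20.0, 30.0, 40.0};

        // The last reading arrives long after the others, so a one-minute RANGE sees only itself.
        System.out.println(rangeAverage(times, temps, 3, 60_000L));
        // ROWS with 2 preceding still averages the last three readings.
        System.out.println(rowsAverage(temps, 3, 2));
    }
}
```

The last row shows the trade-off: the RANGE version only averages readings that are recent relative to the current row, while the ROWS version keeps a steady sample size even when the older readings are stale.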

Another point of consideration is that the PARTITION BY clause is optional. By leaving it off, you’ll get an overall aggregation of records in the range vs. aggregations segmented by the partition column.

There’s another way to express an OVER aggregation in Flink SQL using the WINDOW clause. Let’s rework our OVER aggregation example to use this format.

OVER aggregation using the WINDOW clause

SELECT location, reading, report_time,
 Avg(reading) OVER win as avg_temps
 FROM readings
 WINDOW win AS (   
    PARTITION BY location
    ORDER BY report_time
    RANGE BETWEEN INTERVAL '15' MINUTE PRECEDING AND CURRENT ROW
 );

The Avg aggregation function over the readings
Using the WINDOW clause to specify the window over a range of data

This query has the same structure as the previous OVER aggregation example, here with a 15-minute range. So the question of "which one" naturally comes to mind, to which there are a couple of answers. First, the WINDOW form has a more explicit window definition, making it easier to understand. Second, defining the OVER aggregation this way opens the door to reusing the window definition for multiple aggregates. For example, suppose you want to keep track of the maximum temperature and the average. You could do so with this query:

OVER aggregation with a Window clause and multiple aggregations

SELECT location, reading, report_time,
 Avg(reading) OVER win as avg_temps,
 MAX(reading) OVER win as max_temp
 FROM readings
 WINDOW win AS (
    PARTITION BY location
    ORDER BY report_time
    RANGE BETWEEN INTERVAL '15' MINUTE PRECEDING AND CURRENT ROW
 );

So by explicitly using the WINDOW form, you can easily add more aggregations, but keep in mind this increases the state for Flink SQL to keep.

Finally, the OVER aggregation query is the basis for other analytical queries like the Top-N query. I won’t go into more detail about the OVER aggregation type of query now, but I’ll have a post that goes deeper into it and other analytical queries soon.

Comparing Sliding windows to OVER aggregations

At the blog’s beginning, I mentioned that the Kafka Streams and Flink SQL OVER aggregation were logically similar. With the sliding window, when a new record arrives, Kafka Streams creates a new window for it, and there’s a "look back" to see what records have timestamps within the max difference. As records continue to arrive and the windows advance, new records come into the front, and older records drop out the back. Much the same can be said of the OVER aggregation; a new record results in a new row, and the RANGE includes records within the time range. Over time, new records are at the top, and older records drop off the back of the range.

Use cases

Sliding Windows

Logically, a sliding window flows continually over an event stream, which makes it an excellent fit for a running average.

Also, a sliding window could be used for alerting when a given event occurs N times within the timeframe of one window.

OVER Aggregations

Similarly, an OVER aggregation can provide the same type of functionality, a running average or count, watching for a value to exceed a given threshold.

You can also wrap your OVER query with an outer one to only select values that meet your alerting criteria:

Selecting only values that exceed the alerting threshold

SELECT location, reading, report_time, avg_temps
 FROM (
        SELECT location, reading, report_time, Avg(reading)
             OVER (ORDER BY report_time
                    RANGE BETWEEN INTERVAL '15' MINUTE PRECEDING AND CURRENT ROW
                 ) AS avg_temps
        FROM readings
)
WHERE avg_temps > N;

Resources

Mastering Stream Processing - Hopping and Tumbling windows

Thu, 08 Feb 2024 00:23:23 +0000

In the first post of this series, we discussed what event streaming windowing is, and we examined in detail the structure of a windowed aggregate in Kafka Streams and Flink SQL.

In this post, we’ll dive into two specific windowing implementations: hopping and tumbling windows.

Hopping windows

A hopping window has a fixed time length, and it moves forward or "hops" at a time interval smaller than the window’s length. For example, a hopping window can be one minute long and advance every ten seconds. The following illustration demonstrates the concept of a hopping window:

Hopping windows have a fixed size with advances smaller than the length of the window

So, from the illustration above, hopping windows can produce overlapping results. A hop forward can include results contained in the previous window. Let’s look at another illustration demonstrating this concept:

Hopping windows of one minute with a 30-second advance will share 30 seconds of data with the following window

Walking through the picture:

Window one starts at 12:00:00 PM and will collect data until 12:01:00 PM (end time is exclusive).
At 12:00:30 PM, due to the thirty-second advance, window two starts gathering data.
Window one and window two will share data for thirty seconds from the start of window two until the end of window one. The process continues with each window advance.
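The overlap described in the walkthrough can be sketched with a little arithmetic. The following plain-Java example is only an illustration: the class and method names are hypothetical, it assumes non-negative timestamps in milliseconds, and it assumes window starts are aligned to multiples of the advance counted from time zero.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: computes which hopping windows contain a timestamp.
class HoppingWindowSketch {

    // Returns the start times (ms) of every window [start, start + sizeMs)
    // that contains timestampMs, newest window first.
    static List<Long> windowStartsFor(long timestampMs, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start at or before the timestamp, aligned to the advance.
        long start = timestampMs - (timestampMs % advanceMs);
        // Step back one advance at a time while the window still covers the timestamp.
        while (start >= 0 && start + sizeMs > timestampMs) {
            starts.add(start);
            start -= advanceMs;
        }
        return starts;
    }
}
```

Treating 12:00:00 PM as time zero, a record at 12:00:45 PM with a one-minute size and a thirty-second advance gives windowStartsFor(45_000L, 60_000L, 30_000L), which returns [30000, 0]: the record lands in both window two and window one. A record at 12:00:15 PM only lands in window one.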

Let’s show how you would implement a hopping window in Kafka Streams and Flink SQL.

Kafka Streams hopping window

For a hopping windowed aggregation in Kafka Streams, you’ll use one of the factory methods in the TimeWindows class:

A Kafka Streams hopping window example

KStream<String,Double> iotHeatSensorStream =
  builder.stream("heat-sensor-input",
    Consumed.with(stringSerde, doubleSerde));


iotHeatSensorStream.groupByKey()
      .windowedBy(
                  TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)) 
                             .advanceBy(Duration.ofSeconds(30)) 
                  )
        .aggregate(() -> new IotSensorAggregation(tempThreshold),
         aggregator,
         Materialized.with(stringSerde, aggregationSerde))
         .toStream().to("sensor-agg-output",
           Produced.with(windowedSerde, aggregationSerde))

TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)) sets the window size at one minute; the withNoGrace part means Kafka Streams will drop any out-of-order records that would have been included in the window had they arrived in order. We’ll get into grace periods more in the blog post on windowing time semantics.
The .advanceBy(Duration.ofSeconds(30)) call makes this a hopping window. It creates a window that is one minute in size and advances every thirty seconds.

Next, let’s move on to hopping windows with Flink SQL.

Flink SQL hopping window

Note that Flink hopping windows can also be referred to as sliding windows. Kafka Streams offers a sliding window variant that behaves differently from its hopping window offering. So, for clarity, we’ll only refer to Flink windows with an advance smaller than the window size as hopping windows.

Hopping window average with Flink SQL

SELECT window_start,
       window_end,
       device_id,
       AVG(reading) AS avg_reading
FROM TABLE(HOP 
               (TABLE device_readings,   
                     DESCRIPTOR(ts),    
                     INTERVAL '30' SECONDS,  
                     INTERVAL '1' MINUTES  
               ))
GROUP BY window_start,
         window_end,
         device_id

Specifying hopping windows by passing the HOP function to the TABLE function.
The table you’ll use as the source for the hopping window aggregation.
The DESCRIPTOR is the column with the time attribute used for the window.
This first INTERVAL is the amount of "hop" or advance of the window.
The second INTERVAL is the size of the window.

Now, let’s move on to tumbling windows.

Tumbling windows

A tumbling window has a fixed length in size and has an advance that is the same amount of time as the size. Tumbling windows are considered a specialized case of hopping windows due to the advance equalling the window size.

A tumbling window collects data for the window size, then "tumbles over" to start a new window.

Since a tumbling window starts a new one when the previous one ends, they don’t share any data. You won’t find records from one window in another one; the following illustration helps clarify this process:

Tumbling windows have an advance equal to the size of the window and tumble to start a new one with no overlap in data

Stepping through this illustration:

Window one starts at 12:00:00 PM and will collect data until it ends. The end time of the window is exclusive.
At 12:01:00 PM, window two starts collecting data.
Since each window starts collecting data after the previous window ended, there are no shared results.
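Because the advance equals the size, every timestamp maps to exactly one tumbling window. Here is a minimal plain-Java sketch of that mapping; the class and method names are hypothetical, and it assumes non-negative millisecond timestamps with windows aligned to multiples of the size.

```java
// Illustrative model only: maps a timestamp to its single tumbling window.
class TumblingWindowSketch {

    // Inclusive start of the window containing timestampMs.
    static long windowStartFor(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Exclusive end of the window containing timestampMs.
    static long windowEndFor(long timestampMs, long sizeMs) {
        return windowStartFor(timestampMs, sizeMs) + sizeMs;
    }
}
```

With a one-minute window, a record at 59,999 ms falls in the window starting at 0, while a record at exactly 60,000 ms falls in the next window, which is what the exclusive end time means in practice.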

Kafka Streams tumbling window

For tumbling windows in Kafka Streams, you’ll use the TimeWindows class:

A Kafka Streams tumbling window example

KStream<String,Double> iotHeatSensorStream =
  builder.stream("heat-sensor-input",
    Consumed.with(stringSerde, doubleSerde));
  
iotHeatSensorStream.groupByKey()
      .windowedBy(
                  TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)) 
                  )
        .aggregate(() -> new IotSensorAggregation(tempThreshold),
         aggregator,
         Materialized.with(stringSerde, aggregationSerde))
         .toStream().to("sensor-agg-output",
           Produced.with(windowedSerde, aggregationSerde))

Using TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)) without the .advanceBy clause automatically makes this a tumbling window. Since you didn’t specify an advance, Kafka Streams will add one equal to the size. You could add the advanceBy clause with the same amount of time if you choose to skip the shortened version.

Flink SQL tumbling window

Tumbling windows in Flink SQL are defined similarly to the hopping variety, with a couple of differences:

Tumbling window average with Flink SQL

SELECT window_start,
       window_end,
       device_id,
       AVG(reading) AS avg_reading
FROM TABLE(TUMBLE 
               (TABLE device_readings,   
                     DESCRIPTOR(ts),    
                     INTERVAL '1' MINUTES  
               ))
GROUP BY window_start,
         window_end,
         device_id

Specifying tumbling windows by passing the TUMBLE function to the TABLE function.
The table you’ll use as the source for the tumbling window aggregation.
The DESCRIPTOR is the column with the time attribute used for the window.
By passing a single INTERVAL parameter, Flink SQL will utilize this as the advance and the size.

You define tumbling windows in Flink SQL similarly to Kafka Streams in that you only provide one time parameter.

Use cases

Hopping Windows

Let’s look at an illustration for the use case of hopping windows:

Hopping window use case of collecting data for an aggregation for some time reported at intervals less than window

So, from looking at this image, we could generalize hopping windows as "every <time-period of window advance> give me <aggregate> over the last <window size> period". From our examples here, it would be "every 30 seconds, give me the average temp reading over the last minute". So, any problem domain where you want to closely monitor changes over time or compare changes relative to the previous reading could be a good fit for the hopping window.
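As a rough illustration of "every 30 seconds, give me the average temp reading over the last minute", here is a plain-Java sketch that computes the average for each hopping window. It is a model of the result, not Kafka Streams or Flink code; the names are hypothetical, and it assumes non-negative millisecond timestamps with window starts aligned to multiples of the advance.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: average reading per hopping window, keyed by window start.
class HoppingAverageSketch {

    static Map<Long, Double> averagePerWindow(long[] timestampsMs, double[] readings,
                                              long sizeMs, long advanceMs) {
        // For each window start, track {sum, count}.
        Map<Long, double[]> sumAndCount = new TreeMap<>();
        for (int i = 0; i < timestampsMs.length; i++) {
            long start = timestampsMs[i] - (timestampsMs[i] % advanceMs);
            // A reading contributes to every window that still covers its timestamp.
            while (start >= 0 && start + sizeMs > timestampsMs[i]) {
                double[] acc = sumAndCount.computeIfAbsent(start, k -> new double[2]);
                acc[0] += readings[i];
                acc[1] += 1;
                start -= advanceMs;
            }
        }
        Map<Long, Double> averages = new TreeMap<>();
        for (Map.Entry<Long, double[]> entry : sumAndCount.entrySet()) {
            averages.put(entry.getKey(), entry.getValue()[0] / entry.getValue()[1]);
        }
        return averages;
    }
}
```

For readings of 10, 20, and 30 at 10, 40, and 70 seconds, with a one-minute size and thirty-second advance, the windows starting at 0, 30, and 60 seconds report averages of 15, 25, and 30. The reading of 20 shows up in two of those averages, which is the overlap at work.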

It’s worth noting that Kafka Streams will emit periodic results before a window closes, while Flink SQL will only emit results when it’s closed. I’ll go into more details about this in the fourth installment of this blog series.

Tumbling Windows

For the tumbling window, we have another illustration:

Tumbling window use case collecting data for aggregation and reporting it at regular intervals

This illustration for tumbling windows can be summarized as "give me <aggregate> every <window size period>" or restated to fit the examples in this post: "give me the average temp reading every minute". Since tumbling windows don’t share any records, any situation requiring a unique count of events per window period would be a reason to use a tumbling window.
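The unique-count use case can be sketched in a few lines of plain Java. Again, this only models the outcome; the class and method names are hypothetical, and it assumes non-negative millisecond timestamps with windows aligned to multiples of the size.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: counts events per tumbling window, keyed by window start.
class TumblingCountSketch {

    static Map<Long, Long> countPerWindow(long[] timestampsMs, long sizeMs) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long timestampMs : timestampsMs) {
            // Each event belongs to exactly one window, so it is counted exactly once.
            long windowStart = timestampMs - (timestampMs % sizeMs);
            counts.merge(windowStart, 1L, Long::sum);
        }
        return counts;
    }
}
```

Events at 5, 20, 59.999, 60, and 125 seconds with a one-minute window produce counts of 3, 1, and 1 for the windows starting at 0, 60, and 120 seconds. The counts add up to the number of events, which would not hold with overlapping hopping windows.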

Resources

Mastering Stream Processing - Introduction to windowing.

Fri, 02 Feb 2024 00:23:23 +0000

Stream processing is the best way to work with event data. While batch processing still has its use cases, and probably always will, only stream processing offers the ability to respond in real-time to events.

But if we zoom in, what does it look like to respond to events? By now, I’m sure you’re familiar with the oft-quoted fraud scenario - a person with nefarious intent gets a hold of an unaware consumer’s credit card number. Still, thanks to the bank’s responsive event processing system, the fraudulent charge gets declined.

Other uses of stream processing require an immediate response but are not tied to one single event. Consider monitoring the heat of a manufacturing process; if the average temperature reaches a certain threshold in a given period, then the monitoring process should generate an alert. But this isn’t about one temperature spike. It’s about a consistent upward trend. In other words, what are the temperature readings doing during a fixed period?

I’m talking about windowing in event streams, if you have not guessed by now. While aggregations (an aggregation is a grouping of events by a common attribute) are a vital tool to leverage an event stream, an aggregation over all time doesn’t shed any light on specific periods of activity. Consider the following illustration:

Coarse-grained average temperature readings

The average temperature reading has increased somewhat over time, but that doesn’t tell the whole story. Now let’s take a look at capturing the average temp readings over specific intervals:

Windowed average temp readings

Now by getting readings at specific intervals (windows) you can spot the issue with a large jump in the average value.

This is not to say that an aggregation over all time isn’t helpful, but that, in many cases, you’ll want to aggregate over specific intervals. In other cases, you’ll want an aggregation not defined by fixed time boundaries but by behavior, e.g., session windows whose boundaries are based on periods of inactivity. We’ll get into session windows in a post later in the blog series.

This blog post marks the first in a series about windowing in the two dominant stream processing technologies today: Kafka Streams and Flink (specifically Flink SQL). It’s important to note that the point of this blog series is not a direct comparison between the two APIs. Instead, it is a resource for windowed operations in Kafka Streams and Flink SQL. While comparing the two in a competitive analysis is natural, it’s not the main focus here.

The blog series will discuss:

The different types of windowing, semantics, and potential use cases.
Time semantics
Interpretation of the results
Testing windowed applications

I will assume basic familiarity with Kafka Streams and Flink SQL, so the examples will start by covering windowing.

But before we get into windowing, let’s discuss how Kafka Streams and Flink SQL structure windowing applications. We’ll only cover this level of detail in this initial post, and subsequent ones will assume knowledge of how to assemble the program and focus on the windowing aspect.

Kafka Streams windowing

You’ll need to specify an aggregation to do any windowing in Kafka Streams. An aggregation is a function that combines smaller components into a larger composition, clustered around some attribute, which in Kafka Streams is the key of the key-value pair. You can also perform a reduce, a specialized form of aggregation in which the result has the same type as its inputs. Generally, an aggregation can return a completely different type from its inputs. But since windowing operates the same for either a reduce or an aggregation, we’ll use an aggregation for our examples throughout the blog series.
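If the distinction between a reduce and an aggregation feels abstract, here is an analogy using the JDK’s java.util.stream API. To be clear, this is not Kafka Streams code, and the class name is hypothetical; it only shows the difference in types.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;

// Analogy only: uses java.util.stream, not the Kafka Streams API.
class ReduceVsAggregateSketch {

    // Reduce: Double readings in, a Double out (same type as the inputs).
    static double highest(List<Double> readings) {
        return readings.stream().reduce(Double.NEGATIVE_INFINITY, Math::max);
    }

    // Aggregation: Double readings in, a different type out that carries
    // count, min, max, and average together.
    static DoubleSummaryStatistics summarize(List<Double> readings) {
        return readings.stream().mapToDouble(Double::doubleValue).summaryStatistics();
    }
}
```

For readings of 10, 20, and 30, highest returns 30.0, while summarize returns an object whose getAverage() is 20.0 and whose getMin() and getMax() are 10.0 and 30.0.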

A Kafka Streams windowed aggregation

KStream<String,Double> iotHeatSensorStream =
  builder.stream("heat-sensor-input",
    Consumed.with(stringSerde, doubleSerde));
iotHeatSensorStream.groupByKey() 
      .windowedBy(<window specification>)
        .aggregate(() -> new IotSensorAggregation(tempThreshold), 
         aggregator,
         Materialized.with(stringSerde, aggregationSerde))
         .toStream().to("sensor-agg-output",
           Produced.with(windowedSerde, sensorAggregationSerde))

Let’s walk through the essential points of setting up the Kafka Streams window aggregation:

The first step is to group all records by key; this is required before performing any aggregation. Here you’re using KStream.groupByKey which assumes the underlying key-value pairs have the correct keys needed for clustering together. If not, you could use the KStream.groupBy function where you pass a KeyValueMapper instance that maps the current key-value pair into a new one which allows you to create a new key suitable for the aggregation grouping. Note that changing the key for a group-by will lead to a re-partitioning of the records.
You are specifying the windowing - we’ll cover the specific types in later posts.
Point three is where you’re specifying how to aggregate records. The first parameter is an Initializer represented as a lambda function, which provides the initial value. The second parameter is the Aggregator instance, which performs the aggregation action you specify. Here, it’s a simple average and tracking the highest and lowest values seen. The third parameter is a Materialized instance specifying how to store the aggregation. Since the value type differs from the incoming value, you must provide the appropriate Serde instance for Kafka Streams to use when (de)serializing records.
The final point is where you provide the Serde instances for producing the results back to Kafka. The key Serde is a different type as Kafka Streams wraps the incoming record key in a Windowed instance.
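The post doesn’t show the IotSensorAggregation class itself, so here is a hypothetical stand-in that tracks an average plus the highest and lowest values, as described in point three. The field names, method names, and the way the threshold is used are all my assumptions for illustration.

```java
// Hypothetical stand-in for the IotSensorAggregation class used in the example above.
class SensorAggregationSketch {
    private final double tempThreshold;
    private double sum;
    private long count;
    private double highest = Double.NEGATIVE_INFINITY;
    private double lowest = Double.POSITIVE_INFINITY;

    SensorAggregationSketch(double tempThreshold) {
        this.tempThreshold = tempThreshold;
    }

    // Folds one reading into the aggregate and returns it, which is the job
    // the Aggregator performs for each incoming record.
    SensorAggregationSketch add(double reading) {
        sum += reading;
        count++;
        highest = Math.max(highest, reading);
        lowest = Math.min(lowest, reading);
        return this;
    }

    double average() { return count == 0 ? 0.0 : sum / count; }

    double highest() { return highest; }

    double lowest() { return lowest; }

    // Assumed meaning of the threshold: flag when the running average exceeds it.
    boolean averageExceedsThreshold() { return count > 0 && average() > tempThreshold; }
}
```

In a real topology, the Aggregator’s apply(key, value, aggregate) method would call something like aggregate.add(value) and return the aggregate, and the Initializer lambda would create the empty instance.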

What’s not apparent from this aggregation example is where the timestamps for the window are. But there’s a big hint in the explanation of the aggregation example. At point four of the aggregation description, Kafka Streams wraps the original key in a Windowed object.

Windowed object

As shown in this illustration, the Windowed object contains the original key and the Window instance for the aggregation values. The Window object has the start and end time for the aggregation window. It doesn’t contain the window size, but you can easily calculate the size by subtracting the start time from the end. We’ll cover reporting and analyzing the aggregation window times in a follow-on blog post.

Wrapping the original key in a Windowed object changes the type, meaning you’ll have to tell Kafka Streams how to serialize the results. Fortunately, Kafka Streams provides the WindowedSerdes utility class, making it easy to get the correct Serde for producing results back to Kafka:

Using the WindowedSerdes class to get a Serde for Windowed keys

Serde<Windowed<String>> windowedSerde =
        WindowedSerdes.timeWindowedSerdeFrom(String.class, 
                                              60_000L 
                                            );

KStream<String,Double> iotHeatSensorStream =
  builder.stream("heat-sensor-input",
    Consumed.with(stringSerde, doubleSerde));
  iotHeatSensorStream.groupByKey() 
         .windowedBy(<window specification>)
         .aggregate(() -> new IotSensorAggregation(tempThreshold),
              aggregator,
               Materialized.with(stringSerde, aggregationSerde))
         .toStream().to("sensor-agg-output",
           Produced.with(windowedSerde, sensorAggregationSerde))

The class type for the original key
The size of the window in milliseconds
Providing the Serde for the Windowed key

So, by using the WindowedSerdes class, you provide the proper serialization strategy for Kafka Streams to produce windowed results back to Kafka. Producing windowed results to a topic implies downstream consumers will need to know how to handle the windowed results as well. We’ll cover that situation in a subsequent post in this series on reporting.

Now, let’s move on to Flink SQL aggregation windows.

Flink SQL windowing

Flink offers windowing for event stream data as windowing table-valued functions (TVF). The Flink TVFs implement the SQL 2016 standard Polymorphic Table Functions (PTF). In a nutshell, PTFs allow for user-defined functions on a table that returns a table.

PTF table function returning a table

The exciting thing about PTF is that the schema of the table returned by the function is dynamic; it’s determined at runtime by the function output. So, the PTFs enable windowing and aggregation functions on existing tables, precisely what we get with the Flink SQL windowing. The windowing TVFs in Flink replace the now deprecated Group Window Functions. Window TVFs provide more powerful window-based calculations like Window TopN and Window Deduplication.

Now, let’s move on to how you execute a windowed aggregation in Flink SQL. As with the Kafka Streams example, we’ll review the structure of a windowed aggregation, with specific window implementations covered in later posts.

Structure of Flink SQL windowed aggregation

SELECT window_start,
       window_end,
       device_id,
       AVG(reading) AS avg_reading   

FROM TABLE(
           <Window Function> ( 
                              TABLE device_readings,   
                              DESCRIPTOR(ts),    
                              INTERVAL '5' MINUTES,  
                              [INTERVAL '10' MINUTES]
                            )
           )
GROUP BY window_start, 
         window_end,
         device_id

Here’s the breakdown of the query:

Selecting the columns and the aggregation using the Flink SQL AVG function and providing a descriptive name; these columns form the schema of the returned table.
The TABLE function
Here, you give a specific window function: HOP, TUMBLE, or CUMULATE. Support for a SESSION type is coming soon. We’ll cover the specific types in later posts.
Next are the parameters for the window function, starting with the table to use for the input
The DESCRIPTOR is the time attribute column the function uses for the window.
Depending on the window function, the following 1 or 2 parameters determine the window advance and size or just the size.
As with standard SQL aggregate functions, the non-aggregated columns in the SELECT clause must also appear in the GROUP BY clause.

Flink SQL inserts three additional columns into windowed operations, window_start, window_end, and window_time. Flink SQL determines window_time by subtracting 1ms from the window_end value.

This concludes our introduction to the structure of windowing applications in Kafka Streams and Flink SQL. In the next edition, we’ll cover hopping and tumbling windows.

Resources

Completable Futures - Error Handling.

Wed, 31 Oct 2018 00:23:23 +0000

Some time ago, over 2 years, I started a 3 part series on the CompletableFuture. I’m only now getting around to part two. The long delay in completing this series was due to working on my book, Kafka Streams in Action. But now that’s done, I can get back to doing some blogging again.

Earlier this year I started a series on a new class introduced in Java 8, the CompletableFuture class. Since the CompletableFuture is such a feature rich class, I decided to break the coverage up into three stages. The first post covered the creation of CompletableFuture tasks and how to specify followup tasks to execute when the original one completes. The examples in the first post only dealt with the happy path scenarios, however. Today we are going to go over dealing with failures and errors including specifying actions to take when an error is encountered.

Why Have Separate Error Handling Methods

CompletableFutures give you the ability to define functionality that executes asynchronously, and you can come back later and extract the result of the task. But there is one drawback: what do you do when errors occur? You can use try-catch blocks, but you lose the conciseness of lambda syntax. You also lose flexibility because you have to handle errors the same way every time; you can’t use a CompletableFuture with error handling strategy A and then, five lines later, use a different error handling strategy (assuming you are passing the same lambda with different parameters). What we need is a pluggable solution where different functions can be specified at any point to handle errors in the way that best fits the specific situation.

Error Handling Strategies

When it comes to handling errors with CompletableFutures, there are two approaches. The first, CompletableFuture.exceptionally, runs a given function when an Exception occurs. The second, CompletableFuture.handle, takes a BiFunction with the expected result type and a Throwable as parameters. If an error occurred, the Throwable instance is not null, and you can take action at that point. While not error handling strategies, there are also two methods that force an exception to be thrown on any attempt to call the CompletableFuture.get method.

The two approaches differ in when the supplied function runs. Use CompletableFuture.exceptionally when you don’t want to take any further action with the result; if the future completes normally, the returned result is good enough. The only way the provided function executes is in the event of an error. The CompletableFuture.handle method is different. If the future completes without error, the result is available for extra processing. Otherwise, the Throwable parameter is not null, and you can react at that point. The key point here is that the function passed to CompletableFuture.handle is always executed.

Functions That Run On Error

The first strategy for handling errors is the CompletableFuture.exceptionally method. The exceptionally method takes a Function that expects to receive an instance of Throwable and returns the same type as the original CompletableFuture. In the case of normal completion, the result of the CompletableFuture is returned to the caller. But if there are any errors, then the supplied function is executed, and that result is returned instead. In other words, the function passed to CompletableFuture.exceptionally only runs when there is an exception.
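Here is a minimal sketch of exceptionally in action. The readSensor method, its fail flag, and the fallback value are hypothetical and exist only so both the success and failure paths can be observed.

```java
import java.util.concurrent.CompletableFuture;

class ExceptionallySketch {

    // Hypothetical async task: fails on demand so both paths can be observed.
    static CompletableFuture<String> readSensor(boolean fail) {
        CompletableFuture<String> reading = CompletableFuture.supplyAsync(() -> {
            if (fail) {
                throw new IllegalStateException("sensor offline");
            }
            return "72.5";
        });
        // The function runs only if the future completed exceptionally;
        // on success the original value passes through untouched.
        return reading.exceptionally(error -> "unavailable");
    }
}
```

Calling readSensor(false).join() returns "72.5", and readSensor(true).join() returns "unavailable". Note that the Throwable handed to the function may arrive wrapped in a CompletionException, so inspect getCause() if you need the original exception.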

Handling Success or Failure

The second method we have for error handling is CompletableFuture.handle. In contrast to exceptionally, the handle method always executes the function parameter. We can see the difference in execution by the types the two methods accept. The exceptionally method requires a Function returning the same type as the CompletableFuture. On the other hand, the handle method takes a BiFunction where the first parameter is the result of the CompletableFuture computation and the second parameter is a Throwable. So in our function, if the Throwable is not null, we know an error occurred and take the appropriate action. Otherwise, we return the result of the CompletableFuture. Of course, we can perform additional operations on a successful result as long as both branches return the same type.
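A minimal sketch of handle might look like the following. The doubleOrDefault method and the -1 default are hypothetical choices for illustration; the point is that the BiFunction runs on both the success and the failure path.

```java
import java.util.concurrent.CompletableFuture;

class HandleSketch {

    // Doubles a successful result, or substitutes -1 when the source future failed.
    static CompletableFuture<Integer> doubleOrDefault(CompletableFuture<Integer> source) {
        return source.handle((result, error) -> {
            if (error != null) {
                // Failure path: result is null here, so fall back to a default.
                return -1;
            }
            // Success path: error is null, so the result is safe to process further.
            return result * 2;
        });
    }
}
```

Passing CompletableFuture.completedFuture(21) yields 42, while a future completed with completeExceptionally yields -1.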

Choosing a Strategy

We have two strategies for asynchronous error handling, so the question is which type to use? While there are no hard rules here’s some quick advice:

When the result stands alone, use CompletableFuture.exceptionally.
If the result requires more processing, use CompletableFuture.handle instead.

Conclusion

We have reached the end of our coverage of CompletableFuture error handling. The takeaway here is: don’t add error handling inside your CompletableFuture. Instead, rely on the error handling process provided by the class. In the next post on the CompletableFuture, we’ll cover canceling and forcing completion.