From Tables to Streams Apache Flink for SQL Developers Viktor Gamov · @gamussa · Amsterdam JUG X / Bluesky: @gamussa

Same SQL. Different semantics.

Viktor Gamov · @gamussa
I build streaming things (and the toy you just saw: selectstar.stream)
Flink, Kafka, real-time data

Everything from tonight lives here ↓
speaking.gamov.io · selectstar.stream · sql.selectstar.stream
scan → speaking.gamov.io

Apache Flink
The open-source engine for stateful stream processing, and batch.
Distributed · event-time aware · exactly-once · driven by SQL.

Sources → Flink → Sinks
Kafka, databases, files in. Results out. Flink is the processing in between.

Unbounded data · event-time · state · exactly-once · scale
The hard parts of streaming are handled, so you write SQL.

SQL: declared dead since 1974 (SEQUEL, IBM Research).

Table = the harbor (data at rest)
Stream = the open sea (data in motion, never done)
You stop being the harbormaster. You become the captain.

You don’t need a Kafka PhD. You need SELECT.

SELECT / JOIN / GROUP BY, on water that never stops.

⎯ ship’s log ⎯ GROUP BY, two ways
    SELECT card_id, COUNT(*) AS n
    FROM txns
    GROUP BY card_id;
batch → card_3, 3 (one final row)
stream → +I (card_3, 1)  -U (card_3, 1)  +U (card_3, 2)  -U (card_3, 2)  +U (card_3, 3)  … never done
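The difference between the two outputs can be sketched in plain Java, without Flink on the classpath. This is a minimal illustration, not Flink's implementation: the class name `ChangelogSketch` and both method names are hypothetical, and the input is just a list of card ids. The batch version reads everything and emits one row per key; the streaming version emits an insert (+I) the first time it sees a key and a retract/update pair (-U, +U) every time after that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: mimics the two result shapes of GROUP BY card_id, COUNT(*).
class ChangelogSketch {

    // Batch: consume the whole (bounded) input, then emit one final count per key.
    static Map<String, Long> batchCount(List<String> cardIds) {
        Map<String, Long> counts = new HashMap<>();
        for (String id : cardIds) {
            counts.merge(id, 1L, Long::sum);
        }
        return counts;
    }

    // Stream: emit a change for every incoming row, because the input never ends.
    static List<String> streamCount(List<String> cardIds) {
        Map<String, Long> counts = new HashMap<>();
        List<String> changelog = new ArrayList<>();
        for (String id : cardIds) {
            Long previous = counts.get(id);
            long updated = (previous == null) ? 1L : previous + 1L;
            if (previous == null) {
                // First row for this key: plain insert.
                changelog.add("+I (" + id + ", " + updated + ")");
            } else {
                // Later rows: retract the old result, then emit the new one.
                changelog.add("-U (" + id + ", " + previous + ")");
                changelog.add("+U (" + id + ", " + updated + ")");
            }
            counts.put(id, updated);
        }
        return changelog;
    }

    public static void main(String[] args) {
        List<String> txns = List.of("card_3", "card_3", "card_3");
        System.out.println("batch  -> " + batchCount(txns));
        for (String change : streamCount(txns)) {
            System.out.println("stream -> " + change);
        }
    }
}
```

Three rows for card_3 give one batch row but five changelog entries, which is why a downstream sink has to understand updates and not only appends.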

Windows. Late data. Watermarks.

A watermark says: “I’ll wait a reasonable while for stragglers, then I sail without you.”
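The "reasonable while" can be sketched as a bounded-delay watermark in plain Java. This is a simplification under stated assumptions, not Flink's exact semantics: the watermark is taken as the highest event time seen so far minus a fixed allowed delay, and an event counts as late when its timestamp is below the watermark at the moment it arrives. The class name `WatermarkSketch`, the method `classify`, and the delay of 5 time units are all made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a bounded-delay watermark (simplified, not Flink's code).
class WatermarkSketch {

    // Labels each event "on-time" or "late" given a fixed allowed delay.
    static List<String> classify(long[] eventTimes, long maxDelay) {
        long maxSeen = Long.MIN_VALUE;   // highest event time observed so far
        List<String> labels = new ArrayList<>();
        for (long ts : eventTimes) {
            // Watermark before this event: "nothing older than this is expected".
            long watermark = (maxSeen == Long.MIN_VALUE)
                    ? Long.MIN_VALUE
                    : maxSeen - maxDelay;
            labels.add(ts < watermark ? "late" : "on-time");
            maxSeen = Math.max(maxSeen, ts);
        }
        return labels;
    }

    public static void main(String[] args) {
        long[] eventTimes = {10, 12, 9, 20, 14, 15};  // arrival order, not time order
        List<String> labels = classify(eventTimes, 5);
        for (int i = 0; i < eventTimes.length; i++) {
            System.out.println("t=" + eventTimes[i] + " -> " + labels.get(i));
        }
    }
}
```

With a delay of 5, the event at t=9 is still accepted (the watermark is 7 at that point), while t=14 arrives after t=20 has pushed the watermark to 15 and is treated as late. A larger delay trades latency for completeness.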

⎯ ship’s log ⎯ temporal join
    SELECT t.txn_id, t.amount, r.rate_to_eur,
           ROUND(t.amount * r.rate_to_eur, 2) AS amount_eur
    FROM txn_events AS t
    JOIN fx_rates FOR SYSTEM_TIME AS OF t.event_time AS r
      ON t.ccy = r.ccy;
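What FOR SYSTEM_TIME AS OF asks for is "the rate that was valid at the transaction's event time", not the latest rate. A minimal plain-Java sketch of that lookup follows, assuming each currency keeps a sorted history of rate versions keyed by the time they became valid. The class `TemporalJoinSketch`, its methods, and the rates in `main` are hypothetical, and this is not how Flink stores versioned tables internally.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of an "as of event time" lookup against versioned rates.
class TemporalJoinSketch {

    // ccy -> (valid-from time -> rate_to_eur), sorted by valid-from time.
    private final Map<String, TreeMap<Long, BigDecimal>> versions = new HashMap<>();

    void addRate(String ccy, long validFrom, String rateToEur) {
        versions.computeIfAbsent(ccy, k -> new TreeMap<>())
                .put(validFrom, new BigDecimal(rateToEur));
    }

    // Returns amount * rate valid at eventTime, rounded to 2 decimals,
    // or null when no rate version exists yet (an inner join would drop the row).
    BigDecimal amountEur(String ccy, String amount, long eventTime) {
        TreeMap<Long, BigDecimal> history = versions.get(ccy);
        if (history == null) {
            return null;
        }
        // floorEntry: the latest version whose valid-from time is <= eventTime.
        Map.Entry<Long, BigDecimal> version = history.floorEntry(eventTime);
        if (version == null) {
            return null;
        }
        return new BigDecimal(amount)
                .multiply(version.getValue())
                .setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        TemporalJoinSketch fxRates = new TemporalJoinSketch();
        fxRates.addRate("USD", 100L, "0.90");  // made-up rates for illustration
        fxRates.addRate("USD", 200L, "0.95");
        System.out.println("txn at t=150 -> " + fxRates.amountEur("USD", "10.00", 150L));
        System.out.println("txn at t=250 -> " + fxRates.amountEur("USD", "10.00", 250L));
    }
}
```

The transaction at t=150 is converted with the older rate even though a newer one exists, which is what keeps replayed or late transactions consistent with the rate in force when they happened.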

When SQL runs out of road · and real-time AI

⎯ ship’s log ⎯ Table API
    table.groupBy($("card_id"))
         .select($("card_id"), $("amount").sum());
    // same semantics, programmatic control

Scoring a 3am batch isn’t real-time. It’s a horoscope.

Card in Amsterdam. 4 seconds later, Tokyo.
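The check behind that example can be sketched in plain Java as keyed state: remember the last transaction per card and raise an alert when the next one comes from a different city within a short interval. Everything here is an assumption made for illustration (the class `ImpossibleTravelSketch`, the `Txn` fields, the 60-second threshold, and input already ordered by event time); it is not taken from the talk's demo.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: per-card state used to spot "impossible travel".
class ImpossibleTravelSketch {

    static final class Txn {
        final String cardId;
        final String city;
        final long epochSeconds;

        Txn(String cardId, String city, long epochSeconds) {
            this.cardId = cardId;
            this.city = city;
            this.epochSeconds = epochSeconds;
        }
    }

    // Assumes txns arrive ordered by event time; keeps one Txn of state per card.
    static List<String> alerts(List<Txn> txns, long windowSeconds) {
        Map<String, Txn> lastByCard = new HashMap<>();
        List<String> alerts = new ArrayList<>();
        for (Txn current : txns) {
            // put() returns the previous transaction for this card, if any.
            Txn previous = lastByCard.put(current.cardId, current);
            if (previous == null) {
                continue;
            }
            long gap = current.epochSeconds - previous.epochSeconds;
            if (!previous.city.equals(current.city) && gap <= windowSeconds) {
                alerts.add(current.cardId + ": " + previous.city + " -> "
                        + current.city + " in " + gap + "s");
            }
        }
        return alerts;
    }

    public static void main(String[] args) {
        List<Txn> txns = List.of(
                new Txn("card_3", "Amsterdam", 0L),
                new Txn("card_3", "Tokyo", 4L));
        alerts(txns, 60L).forEach(System.out::println);
    }
}
```

The decision only helps if it is made while the second transaction is still pending, which is the case for evaluating it on the stream instead of in a nightly batch.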

Stream when freshness changes the decision. Batch when the answer can wait.

Same syntax. Different semantics. Know both.

Play now: selectstar.stream · sql.selectstar.stream
Learn: Apache Flink docs · Flink SQL cookbook
All of it: speaking.gamov.io

as always, have a nice day