From Tables to Streams
Apache Flink for SQL Developers
Viktor Gamov · @gamussa · Amsterdam JUG
X / Bluesky: @gamussa
Slide 2
Same SQL. Different semantics.
Slide 3
Viktor Gamov · @gamussa
I build streaming things (and the toy you just saw: selectstar.stream)
Flink, Kafka, real-time data
Slide 4
Everything from tonight lives here ↓
speaking.gamov.io · selectstar.stream · sql.selectstar.stream
scan → speaking.gamov.io
Slide 5
Apache Flink
The open-source engine for stateful stream processing, and batch.
Distributed · event-time aware · exactly-once · driven by SQL.
Slide 6
Sources → Flink → Sinks
Kafka, databases, files in. Results out. Flink is the processing in between.
Slide 7
Unbounded data · event-time · state · exactly-once · scale
The hard parts of streaming — handled, so you write SQL.
Slide 8
SQL — declared dead since 1974 (SEQUEL, IBM Research).
Slide 9
Table = the harbor (data at rest)
Stream = the open sea (data in motion, never done)
You stop being the harbormaster. You become the captain.
Slide 10
You don’t need a Kafka PhD. You need SELECT.
Slide 11
SELECT / JOIN / GROUP BY — on water that never stops.
Slide 12
⎯ ship’s log ⎯ GROUP BY, two ways

SELECT card_id, COUNT(*) AS n
FROM txns
GROUP BY card_id;

batch → card_3, 3 (one final row)
stream →
  +I (card_3, 1)
  -U (card_3, 1)
  +U (card_3, 2)
  -U (card_3, 2)
  +U (card_3, 3)
  … never done
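Why the stream side emits five rows for three transactions can be sketched in plain Java. This is only an illustration of the changelog idea, not Flink code: the class and method names (GroupByChangelog, countChangelog) are made up for this example, and the row format simply mirrors the +I / -U / +U notation on the slide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-JDK illustration of a running COUNT(*) per key. Not Flink's implementation.
final class GroupByChangelog {

    // For each incoming key, emit the changelog rows a continuously updated count would produce:
    // +I for the first row of a key, then -U (retract old value) and +U (new value) on every update.
    static List<String> countChangelog(List<String> keys) {
        Map<String, Long> counts = new HashMap<>();
        List<String> changelog = new ArrayList<>();
        for (String key : keys) {
            Long previous = counts.get(key);
            long updated = (previous == null) ? 1L : previous + 1L;
            if (previous == null) {
                changelog.add("+I (" + key + ", " + updated + ")");
            } else {
                changelog.add("-U (" + key + ", " + previous + ")");
                changelog.add("+U (" + key + ", " + updated + ")");
            }
            counts.put(key, updated);
        }
        return changelog;
    }

    public static void main(String[] args) {
        // Three transactions on the same card, as on the slide.
        countChangelog(List.of("card_3", "card_3", "card_3")).forEach(System.out::println);
    }
}
```

The batch answer is just the last +U row; the stream never gets to declare a row final, so it keeps correcting itself.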
Slide 13
Slide 14
Windows. Late data. Watermarks.
Slide 15
A watermark says: “I’ll wait a reasonable while for stragglers — then I sail without you.”
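The "reasonable while" can be sketched as a bounded-delay watermark: the highest event time seen so far, minus a fixed allowance. The plain-Java class below is a simplified illustration under that assumption, not Flink's watermark generator. WatermarkSketch, accept, and the 5-second delay used in main are hypothetical, and Flink's own strategies differ in details such as boundary handling.

```java
// Plain-JDK illustration of a bounded-delay watermark. Not Flink's implementation.
final class WatermarkSketch {
    private final long maxDelayMillis;
    private long maxEventTime = Long.MIN_VALUE; // no events seen yet

    WatermarkSketch(long maxDelayMillis) {
        this.maxDelayMillis = maxDelayMillis;
    }

    // Watermark = highest event time seen so far minus the allowed delay.
    long currentWatermark() {
        return maxEventTime == Long.MIN_VALUE ? Long.MIN_VALUE : maxEventTime - maxDelayMillis;
    }

    // Returns true for an on-time event, false for a straggler that is behind the watermark.
    boolean accept(long eventTimeMillis) {
        if (eventTimeMillis < currentWatermark()) {
            return false; // the ship has sailed
        }
        maxEventTime = Math.max(maxEventTime, eventTimeMillis);
        return true;
    }

    public static void main(String[] args) {
        WatermarkSketch wm = new WatermarkSketch(5_000L); // wait up to 5 seconds (assumed value)
        long[] eventTimes = {10_000L, 7_000L, 20_000L, 12_000L};
        for (long t : eventTimes) {
            System.out.println(t + " accepted=" + wm.accept(t) + " watermark=" + wm.currentWatermark());
        }
    }
}
```

A larger delay tolerates more disorder but holds results back longer; that trade-off is the whole design choice.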
Slide 16
⎯ ship’s log ⎯ temporal join

SELECT t.txn_id, t.amount, r.rate_to_eur,
       ROUND(t.amount * r.rate_to_eur, 2) AS amount_eur
FROM txn_events AS t
JOIN fx_rates FOR SYSTEM_TIME AS OF t.event_time AS r
  ON t.ccy = r.ccy;
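What FOR SYSTEM_TIME AS OF asks for is "the rate that was valid when the transaction happened", not the latest one. A plain-Java sketch of that lookup, assuming rates are kept as versions keyed by the time they became valid: TemporalJoinSketch, addRate, and amountEur are hypothetical names, and a null result stands in for "no matching row" in the inner join. This is an illustration of the idea, not how Flink stores versioned tables.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Plain-JDK illustration of an "as of event time" lookup. Not Flink's implementation.
final class TemporalJoinSketch {
    // Per currency: rate versions ordered by the time each one became valid.
    private final Map<String, TreeMap<Long, BigDecimal>> ratesByCcy = new HashMap<>();

    void addRate(String ccy, long validFromMillis, BigDecimal rateToEur) {
        ratesByCcy.computeIfAbsent(ccy, k -> new TreeMap<>()).put(validFromMillis, rateToEur);
    }

    // Converts using the rate version valid at the transaction's event time.
    // Returns null when no version exists yet (the inner join would emit no row).
    BigDecimal amountEur(String ccy, BigDecimal amount, long eventTimeMillis) {
        TreeMap<Long, BigDecimal> versions = ratesByCcy.get(ccy);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, BigDecimal> version = versions.floorEntry(eventTimeMillis);
        if (version == null) {
            return null;
        }
        // Mirrors ROUND(t.amount * r.rate_to_eur, 2) from the query.
        return amount.multiply(version.getValue()).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        TemporalJoinSketch join = new TemporalJoinSketch();
        join.addRate("USD", 1_000L, new BigDecimal("0.90")); // made-up rates
        join.addRate("USD", 2_000L, new BigDecimal("0.95"));
        System.out.println(join.amountEur("USD", new BigDecimal("100"), 1_500L));
        System.out.println(join.amountEur("USD", new BigDecimal("100"), 2_500L));
    }
}
```

The same transaction amount converts differently depending on when it happened, which is exactly what a regular join against the latest rate would get wrong.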
Slide 17
When SQL runs out of road · and real-time AI
Slide 18
⎯ ship’s log ⎯ Table API

table.groupBy($("card_id")).select($("card_id"), $("amount").sum());
// same semantics, programmatic control
Slide 19
Scoring a 3am batch isn’t realtime. It’s a horoscope.
Slide 20
Card in Amsterdam. 4 seconds later, Tokyo.
Slide 21
Stream when freshness changes the decision. Batch when the answer can wait.
Slide 22
Same syntax. Different semantics. Know both.
Slide 23
Play now: selectstar.stream · sql.selectstar.stream
Learn: Apache Flink docs · Flink SQL cookbook
All of it: speaking.gamov.io