One Does Not Simply Query a Stream! Viktor Gamov, Confluent @gamussa Zurich Kafka Meetup, Switzerland 2025 @gamussa | @confluentinc | @apacheflink


Viktor GAMOV Principal Developer Advocate | Confluent Twitter/X: @gamussa

Simpler times Monolith @gamussa | gamov.dev/rel | @ConfluentInc

Simpler analytics ETL and CDC

DWH->Hadoop Mobile Era

Data Pipelines Streaming data pipelines and Microservices

LOG


OLTP vs. OLAP in Streams: OLTP streams, OLAP streams


Our Options
• Connect/Relational DB
• Kafka Streams
• Streaming SQL
• Data Warehouse
• Data Lake
• Real-Time OLAP Database

Kafka Connect

Connect/RDBMS
[Diagram: Data Source -> Kafka Connect -> Cluster (Broker, Broker, Broker) -> Kafka Connect -> Data Sink]

Connect/RDBMS
• Suitable for smaller data
• Transactional
• Familiar to users

Kafka Streams

Kafka Streams (transactional)
• Ingests directly from a topic
• KTable
• Forms an in-memory key/value store suitable for querying by topic key
• Scalable across members of a consumer group
• Readable through Interactive Queries

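Kafka Streams (transactional)
    // Read the input topic as a stream of String key/value records.
    // (builder, inputTopic and stringSerde are defined elsewhere in the application.)
    final KStream<String, String> stream =
        builder.stream(inputTopic, Consumed.with(stringSerde, stringSerde));
    // Reinterpret the stream as a table, materialized under a queryable store name.
    final KTable<String, String> convertedTable =
        stream.toTable(Materialized.as("stream-converted-to-table"));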
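
The stream-to-table conversion on this slide can be pictured with a small plain-Java model. This is only a sketch and does not use the Kafka Streams API: the class `StreamToTableSketch`, the `Event` type, and the sample data are hypothetical names made up for illustration. It assumes table semantics where the latest value per key wins and a null value acts as a delete (tombstone); skipping records without a key is also an assumption of this model.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual model only: plain Java, NOT the Kafka Streams API. */
final class StreamToTableSketch {

    /** One stream record; a null value is treated as a delete (tombstone). */
    static final class Event {
        final String key;
        final String value;

        Event(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Replays the records in order and keeps only the latest value per key. */
    static Map<String, String> toTable(List<Event> stream) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Event e : stream) {
            if (e.key == null) {
                continue; // assumption: a record without a key cannot be upserted
            }
            if (e.value == null) {
                table.remove(e.key);       // tombstone: the key leaves the table
            } else {
                table.put(e.key, e.value); // upsert: the newer value replaces the older one
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<Event> stream = Arrays.asList(
                new Event("alice", "Zurich"),
                new Event("bob", "Bern"),
                new Event("alice", "Basel"), // overwrites alice's earlier value
                new Event("bob", null));     // deletes bob
        Map<String, String> table = toTable(stream);
        System.out.println(table.get("alice"));       // Basel
        System.out.println(table.containsKey("bob")); // false
    }
}
```

The lookup by key at the end is the kind of read the slides describe as an Interactive Query. In a real application the store is maintained by Kafka Streams rather than by a local map.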

Kafka Streams (analytical)
• Full-featured Java stream processing API
• Arbitrary streaming computation
• Can emit new streams (not this talk)
• KTables queryable by key
• Every read pattern requires its own topology
• Interactive Queries again

Kafka Streams (analytical)
    // Split each line into lowercase words, group by word, and keep a running count per word.
    KTable<String, Long> wordCounts = textLines
        .flatMapValues(textLine -> Arrays.asList(textLine.toLowerCase().split("\\W+")))
        .groupBy((key, word) -> word)
        .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("counts-store"));
    // Publish every count update to an output topic.
    wordCounts.toStream().to("WordsWithCountsTopic", Produced.with(Serdes.String(), Serdes.Long()));
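
To make the shape of this aggregation concrete, here is a plain-Java model of the same word count. It is a sketch, not the Kafka Streams API: `WordCountChangelogSketch`, `countWithChangelog`, and the sample lines are hypothetical names for illustration. It assumes every count update is emitted downstream as its own record; a real Kafka Streams application may cache and collapse consecutive updates to the same key before emitting them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Conceptual model only: plain Java, NOT the Kafka Streams API. */
final class WordCountChangelogSketch {

    /**
     * Counts words line by line. The map plays the role of the "counts-store" table,
     * and the returned list holds every update in order, formatted as "word=count".
     */
    static List<String> countWithChangelog(List<String> textLines, Map<String, Long> countsStore) {
        List<String> changelog = new ArrayList<>();
        for (String line : textLines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split() yields an empty first token for lines that start with punctuation
                }
                long updated = countsStore.merge(word, 1L, Long::sum); // update the table
                changelog.add(word + "=" + updated);                    // emit the update downstream
            }
        }
        return changelog;
    }

    public static void main(String[] args) {
        Map<String, Long> countsStore = new HashMap<>();
        List<String> changelog = countWithChangelog(
                Arrays.asList("Hello Kafka", "hello streams"), countsStore);
        System.out.println(changelog);                // [hello=1, kafka=1, hello=2, streams=1]
        System.out.println(countsStore.get("hello")); // 2
    }
}
```

The map answers "what is the count for this word right now?", while the list is the stream of changes that the slide's code writes to the output topic. A different question, such as the top ten words, would need a different computation, which is the point behind "every read pattern requires its own topology".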

Streaming SQLs

Streaming SQL
• Materialize
• DeltaStream
• RisingWave
• ksqlDB

Why not Flink?

Materialize
• Replacement data warehouse
• Integrates with Kafka, Postgres, dbt
• The Materialized View is the central abstraction
• Views are persistent and queryable
• Postgres wire-compatible
• Positioned as an analytics solution

DeltaStream
• Cloud-native streaming SQL
• Serverless, BYOC
• Kafka, Kinesis integration
• Materialized views and streaming pipelines
• Streaming database and streaming analytics

RisingWave
• Distributed SQL streaming database
• Cloud and OSS versions
• Implementation of Flink in Rust
• Kafka, Pulsar, Kinesis integrations
• Flink + persistent views
• Postgres wire-compatible

ksqlDB
• «Streaming Database»
• Provides persistent TABLE abstraction
• Pull and Push queries
• Like Kafka Streams, but in SQL
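
The difference between pull and push queries can be sketched in plain Java. This is a conceptual model, not the ksqlDB API or its SQL syntax: `PullPushSketch`, its methods, and the order data are hypothetical names for illustration. It assumes a pull query is a one-shot lookup of the current table state, and a push query is a subscription that keeps receiving changes made after it was registered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Conceptual model only: plain Java, NOT the ksqlDB API. */
final class PullPushSketch {
    private final Map<String, Long> table = new HashMap<>();
    private final List<BiConsumer<String, Long>> subscribers = new ArrayList<>();

    /** Pull query: return the current value for a key, then finish. */
    Long pull(String key) {
        return table.get(key);
    }

    /** Push query: register a callback that receives every later change. */
    void push(BiConsumer<String, Long> onChange) {
        subscribers.add(onChange);
    }

    /** A new event arrives: update the table, then notify all push subscribers. */
    void apply(String key, long value) {
        table.put(key, value);
        for (BiConsumer<String, Long> subscriber : subscribers) {
            subscriber.accept(key, value);
        }
    }

    public static void main(String[] args) {
        PullPushSketch orders = new PullPushSketch();
        List<String> seen = new ArrayList<>();
        orders.apply("order-1", 10L);                 // happens before anyone subscribes
        orders.push((k, v) -> seen.add(k + "=" + v)); // push: receive later changes
        orders.apply("order-1", 25L);
        orders.apply("order-2", 7L);
        System.out.println(orders.pull("order-1"));   // 25
        System.out.println(seen);                     // [order-1=25, order-2=7]
    }
}
```

The pull call behaves like a request/response lookup against the persistent table, while the push subscription never "completes" on its own: it sees each change as it happens.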

Real-Time Analytics Database

Real-Time OLAP
• Designed for high concurrency, low latency queries
• Ingests from streaming and batch sources
• Intimate integration with Kafka
• Conventional tables and SQL

Real-Time OLAP
• Analytics shaped like real-time data
• Analytics when users are decision makers

Cloud Data Warehouses

Cloud Data Warehouses
• The cloud-based heir of legacy DWH
• Ingest from batch and streaming sources
• Biased towards structured data and batch access

Data Lake

Data Lake Anything else We’ll figure this out

Data Lakes
• Started as the HDFS cluster
• Became S3
• That didn’t help…
• ELT vs. ETL
• Iceberg/Hudi/Delta Lake

Data Lakes
• Storage and compute are radically decoupled
• Structure is relatively less important
• Reads are slow
• Streaming is historically difficult

Technology Selection: No Solutions, only Trade-Offs

Sometimes you go with what you know

This is not bad!

Performance

Community/Adoption

Differentiated Application Code Area of Exploration Kafka
