ksqlDB • «Streaming Database» • Provides persistent TABLE abstraction • Pull and Push queries • Like Kafka Streams, but in SQL
@gamussa | @confluentinc | @apacheiceberg
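To make the pull/push distinction concrete, here is a minimal sketch in plain Python rather than ksqlDB SQL. `CountTable` and its methods are hypothetical names invented for illustration, not part of any ksqlDB API. A pull query reads the table's current state once; a push query subscribes to every subsequent change.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Change = Tuple[str, int]  # (key, new count)


class CountTable:
    """Toy stand-in for a persistent TABLE derived from a stream of events."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = defaultdict(int)
        self._subscribers: List[Callable[[Change], None]] = []

    def apply(self, key: str) -> None:
        """Consume one event: update the row for `key`, then notify subscribers."""
        self._counts[key] += 1
        change = (key, self._counts[key])
        for callback in self._subscribers:
            callback(change)

    def pull(self, key: str) -> int:
        """Pull query: return the current value once and finish."""
        return self._counts.get(key, 0)

    def push(self, callback: Callable[[Change], None]) -> None:
        """Push query: register to receive every future change to the table."""
        self._subscribers.append(callback)


table = CountTable()
seen: List[Change] = []
table.push(seen.append)          # subscribe before any events arrive
for user in ["alice", "bob", "alice"]:
    table.apply(user)

print(table.pull("alice"))  # 2
print(seen)                 # [('alice', 1), ('bob', 1), ('alice', 2)]
```

The pull query returns a point-in-time answer, while the push subscriber sees the full changelog of the table.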
Slide 20
Materialize • Replacement data warehouse • Integrates with Kafka, Postgres, dbt • The Materialized View is the central abstraction • Views are persistent and queryable • Postgres wire-compatible • Positioned as an analytics solution
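The idea of a persistent, queryable view can be sketched as incremental view maintenance: each change to the input adjusts the stored result instead of triggering a full recomputation. The snippet below is a toy model in Python, assuming a view equivalent to `SUM(amount) GROUP BY customer`. `RevenueByCustomer` is a hypothetical class for illustration and says nothing about how Materialize's engine is actually built.

```python
from typing import Dict, Tuple


class RevenueByCustomer:
    """Toy materialized view: SUM(cents) GROUP BY customer, updated per change."""

    def __init__(self) -> None:
        # customer -> (number of contributing rows, running total in cents)
        self._groups: Dict[str, Tuple[int, int]] = {}

    def on_insert(self, customer: str, cents: int) -> None:
        count, total = self._groups.get(customer, (0, 0))
        self._groups[customer] = (count + 1, total + cents)

    def on_delete(self, customer: str, cents: int) -> None:
        count, total = self._groups[customer]
        if count == 1:
            # Last row for this group is gone, so the group leaves the view.
            del self._groups[customer]
        else:
            self._groups[customer] = (count - 1, total - cents)

    def query(self) -> Dict[str, int]:
        """Reading the view is a lookup, not a rescan of the base data."""
        return {c: total for c, (_, total) in sorted(self._groups.items())}


view = RevenueByCustomer()
view.on_insert("acme", 500)
view.on_insert("acme", 250)
view.on_insert("zeta", 100)
view.on_delete("zeta", 100)

print(view.query())  # {'acme': 750}
```

Each insert or delete costs constant work per affected group, which is why reads against the view stay cheap as the input grows.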
Real-Time OLAP • Designed for high concurrency, low latency queries • Ingests from streaming and batch sources • Intimate integration with Kafka • Conventional tables and SQL
Slide 25
Real-Time OLAP • Analytics shaped like real-time data • Analytics when users are decision makers
Slide 26
Cloud Data Warehouses
Slide 27
Cloud Data Warehouses • The cloud-based heir of legacy DWH • Ingest from batch and streaming sources • Biased towards structured data and batch access
Slide 28
Data Lake
Slide 29
Data Lake
Anything else
We’ll figure this out
Slide 30
Data Lakes • Storage and compute are radically decoupled • Structure is relatively less important • Reads are slow • Streaming is historically difficult
Slide 31
Data Lakes • Started as the HDFS cluster • Became S3 • That didn’t help… • ELT vs. ETL • Iceberg/Hudi/DeltaLake
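The ELT vs. ETL bullet can be illustrated with a small Python sketch. The sample records and the function names `transform`, `etl`, and `elt` are made up for illustration. ETL cleans data before it is stored, so only well-formed rows land; ELT stores the raw input as-is and applies the transformation later, at read time.

```python
import json
from typing import Dict, List, Optional, Union

Row = Dict[str, Union[str, int]]

# Hypothetical raw input: one well-formed record and one with a bad amount.
raw_events = [
    '{"user": "alice", "amount": "12"}',
    '{"user": "bob", "amount": "oops"}',
]


def transform(line: str) -> Optional[Row]:
    """Parse one raw line into a typed row, or None if it cannot be cleaned."""
    record = json.loads(line)
    try:
        return {"user": record["user"], "amount": int(record["amount"])}
    except ValueError:
        return None


def etl(lines: List[str]) -> List[Row]:
    """Transform first, then load: only clean rows reach storage."""
    return [row for row in map(transform, lines) if row is not None]


def elt(lines: List[str]) -> List[str]:
    """Load first: store raw lines untouched and defer the transform."""
    return list(lines)


warehouse = etl(raw_events)   # structured at write time
lake = elt(raw_events)        # raw at rest
lake_view = [row for row in map(transform, lake) if row is not None]

print(warehouse)   # [{'user': 'alice', 'amount': 12}]
print(len(lake))   # 2
```

Both paths can yield the same clean rows, but the lake keeps the malformed record, so a different transformation can be applied to it later.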