Crossing the Streams: Kafka for Spring Developers @gamussa @ @NYJavaSig @confluentinc

The agenda
- Quick intro to Streaming Platform
- Gentle intro to Stream processing
- Wire everything nicely with Spring

https://cnfl.io/streams-movie

Who am I?
- Solutions Architect
- Developer Advocate
- @gamussa in internetz
Hey you, yes, you, go follow me on Twitter ©

Kafka & Confluent

Origins in Stream Processing
- Kafka Streams / KSQL
- Kafka
- Serving Layer (Cassandra, KV-storage etc.)
- High Throughput Messaging
- Continuous Computation
- API based clustering

Kafka is a Streaming Platform
[Diagram: Producer, Consumer, Connectors, The Log, Streaming Engine]

Streaming

What exactly is Stream Processing?
[Diagram: authorization_attempts, possible_fraud]

What exactly is Stream Processing?
[Diagram: authorization_attempts, possible_fraud]
CREATE STREAM possible_fraud AS
  SELECT card_number, count(*)
  FROM authorization_attempts
  WINDOW TUMBLING (SIZE 5 MINUTE)
  GROUP BY card_number
  HAVING count(*) > 3;
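
To make the query's semantics concrete, here is a minimal plain-Java sketch of the same idea: bucket each attempt into a 5-minute tumbling window, count per card, and keep only counts above 3. It is an illustration, not how KSQL executes the statement. The names `FraudWindowSketch`, `Attempt`, and `Suspect` are hypothetical, and the whole input is assumed to fit in one in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of a tumbling-window count per card; not the KSQL engine. */
final class FraudWindowSketch {
    /** WINDOW TUMBLING (SIZE 5 MINUTE), expressed in milliseconds. */
    static final long WINDOW_MS = 5 * 60 * 1000L;

    /** One authorization attempt: card number plus event time in epoch milliseconds. */
    static final class Attempt {
        final String cardNumber;
        final long timestampMs;

        Attempt(String cardNumber, long timestampMs) {
            this.cardNumber = cardNumber;
            this.timestampMs = timestampMs;
        }
    }

    /** One flagged result: card, start of its window, and attempts counted in that window. */
    static final class Suspect {
        final String cardNumber;
        final long windowStartMs;
        final long attempts;

        Suspect(String cardNumber, long windowStartMs, long attempts) {
            this.cardNumber = cardNumber;
            this.windowStartMs = windowStartMs;
            this.attempts = attempts;
        }
    }

    static List<Suspect> possibleFraud(List<Attempt> attempts) {
        // window start -> (card number -> count); TreeMap keeps the output ordered.
        Map<Long, Map<String, Long>> counts = new TreeMap<>();
        for (Attempt a : attempts) {
            // Tumbling windows do not overlap: each event belongs to exactly one window.
            long windowStart = a.timestampMs - Math.floorMod(a.timestampMs, WINDOW_MS);
            counts.computeIfAbsent(windowStart, w -> new TreeMap<>())
                  .merge(a.cardNumber, 1L, Long::sum); // GROUP BY card_number, count(*)
        }
        List<Suspect> suspects = new ArrayList<>();
        for (Map.Entry<Long, Map<String, Long>> window : counts.entrySet()) {
            for (Map.Entry<String, Long> card : window.getValue().entrySet()) {
                if (card.getValue() > 3) { // HAVING count(*) > 3
                    suspects.add(new Suspect(card.getKey(), window.getKey(), card.getValue()));
                }
            }
        }
        return suspects;
    }
}
```

Note that four attempts split evenly across a window boundary are not flagged here, because each window is counted separately.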

What is a Streaming Platform?
[Diagram: Producer, Consumer, Connectors, The Log, Streaming Engine]

Kafka’s Distributed Log
[Diagram: Producer, Consumer, Connectors, The Log, Streaming Engine]

The log - durable messaging system
Similar to a traditional messaging system (ActiveMQ, Rabbit) but with:
(a) Far better scalability
(b) Built in fault tolerance / HA
(c) Storage

The log is a simple idea
Messages are added at the end of the log
[Diagram: a log running from Old to New]

Consumers have a position all of their own
[Diagram: a log running from Old to New; George, Fred, and Sally are each at a different position, and each scans from there]

Only Sequential Access
Read to offset & scan
[Diagram: a log running from Old to New]
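
The three log slides above (append at the end, a position per consumer, read to offset and scan) can be sketched in a few lines of plain Java. This is an in-memory illustration only, not Kafka's storage format or client API. `AppendOnlyLogSketch` and its methods are hypothetical names, and consumers are identified by simple strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory illustration of an append-only log with independent consumer positions. */
final class AppendOnlyLogSketch {
    private final List<String> records = new ArrayList<>();
    private final Map<String, Integer> positions = new HashMap<>();

    /** Appends at the end of the log and returns the new record's offset. */
    int append(String message) {
        records.add(message);
        return records.size() - 1;
    }

    /** Moves one consumer to an offset; its next poll scans forward from there. */
    void seek(String consumer, int offset) {
        if (offset < 0 || offset > records.size()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        positions.put(consumer, offset);
    }

    /** Returns up to max records from the consumer's position and advances only that consumer. */
    List<String> poll(String consumer, int max) {
        int from = positions.getOrDefault(consumer, 0); // new consumers start at the oldest record
        int to = Math.min(records.size(), from + Math.max(0, max));
        List<String> batch = new ArrayList<>(records.subList(from, to));
        positions.put(consumer, to);
        return batch;
    }
}
```

Reading never removes anything from the log, which is why George, Fred, and Sally can each be at a different offset without affecting one another.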

Shard data to get scalability
- Messages are sent to different partitions
- Partitions live on different machines
[Diagram: Producer (1), Producer (2), Producer (3), Cluster of machines]
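
One way to picture "messages are sent to different partitions" is a key-hash partitioner: the same key always lands in the same partition, and different keys spread across partitions. The sketch below uses `String.hashCode` as a stand-in; Kafka's own default partitioner hashes the serialized key with murmur2, so the actual partition numbers will differ. `PartitionedTopicSketch` is a hypothetical name, and the partitions are plain in-memory lists rather than logs on separate machines.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory illustration of key-based sharding across partitions. */
final class PartitionedTopicSketch {
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedTopicSketch(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Maps a key to a partition; floorMod keeps the result non-negative for negative hashes. */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Appends the value to the partition chosen by the key and returns that partition's index. */
    int send(String key, String value) {
        int p = partitionFor(key, partitions.size());
        partitions.get(p).add(value);
        return p;
    }

    /** Returns a copy of one partition's contents, in append order. */
    List<String> partition(int index) {
        return new ArrayList<>(partitions.get(index));
    }
}
```

Because ordering is only kept within a partition, choosing the key (for example a card number) decides which messages stay in order relative to each other.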

Replicate to get fault tolerance
[Diagram: the leader on Machine A receives msg and replicates it to Machine B]

Replication provides resiliency
A ‘replica’ takes over on machine failure
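
The replication slides can be sketched as a toy model: every write is copied to each live machine, and when the leader's machine fails, a surviving replica takes over with the data intact. This is a deliberately simplified illustration. Real Kafka replication involves in-sync replica tracking and a controller-driven leader election, neither of which is modeled here. `ReplicatedPartitionSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one replicated partition with leader failover. */
final class ReplicatedPartitionSketch {
    private final List<List<String>> replicas = new ArrayList<>(); // index = machine id
    private final boolean[] alive;
    private int leader = 0;

    ReplicatedPartitionSketch(int machines) {
        if (machines <= 0) {
            throw new IllegalArgumentException("need at least one machine");
        }
        alive = new boolean[machines];
        for (int i = 0; i < machines; i++) {
            replicas.add(new ArrayList<>());
            alive[i] = true;
        }
    }

    /** Writes to the leader and copies to every live follower (synchronously, for simplicity). */
    void write(String msg) {
        for (int i = 0; i < alive.length; i++) {
            if (alive[i]) {
                replicas.get(i).add(msg);
            }
        }
    }

    /** Marks a machine as failed; if it was the leader, the first live replica takes over. */
    void fail(int machine) {
        alive[machine] = false;
        if (machine != leader) {
            return;
        }
        for (int i = 0; i < alive.length; i++) {
            if (alive[i]) {
                leader = i;
                return;
            }
        }
        throw new IllegalStateException("no live replica left");
    }

    int leader() {
        return leader;
    }

    /** Reads are served from the current leader's copy. */
    List<String> read() {
        return new ArrayList<>(replicas.get(leader));
    }
}
```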

Linearly Scalable Architecture
Single topic:
- Many producer machines
- Many consumer machines
- Many broker machines
No Bottleneck!!
[Diagram: Producers, Consumers]

Worldwide, localized views
[Diagram: London, Tokyo, and NY linked by Replicator]

The Connect API
[Diagram: Producer, Consumer, Connectors, The Log, Streaming Engine]

Ingest / Egest into any data source
[Diagram: Kafka Connect, Kafka Connect]

Ingest/Egest data from/to data sources
Amazon S3, Elasticsearch, HDFS, JDBC, Couchbase, Cassandra, Oracle, SAP, Vertica, Blockchain, JMX, Kinesis, MongoDB, MQTT, NATS, Postgres, Rabbit, Redis, Twitter, Bintray, DynamoDB, FTP, GitHub, BigQuery, Google Pub Sub, RethinkDB, Salesforce, Solr, Splunk

Kafka Streams and KSQL
[Diagram: Producer, Consumer, Connectors, The Log, Streaming Engine]

Before

After

Things Kafka Streams Does
- Runs everywhere
- Clustering done for you
- Integrated database
- Exactly-once processing
- Joins, windowing, aggregation
- Event-time processing
- S/M/L/XL/XXL/XXXL sizes

Streams to Tables

Stream/Table Duality
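
The duality can be shown with two small functions: replaying a stream of changes in order yields a table (the latest value per key), and walking a table yields a stream of changes that rebuilds the same table. This is a plain-Java sketch, not the Kafka Streams `KStream`/`KTable` API. `DualitySketch` and `Change` are hypothetical names, and a `null` value is assumed to mean "delete this key".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of turning a stream into a table and back. */
final class DualitySketch {
    /** One change event: a key and its new value (null is treated as a delete). */
    static final class Change {
        final String key;
        final String value;

        Change(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Stream -> table: replay the changes in order, keeping the latest value per key. */
    static Map<String, String> toTable(List<Change> stream) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Change c : stream) {
            if (c.value == null) {
                table.remove(c.key);
            } else {
                table.put(c.key, c.value);
            }
        }
        return table;
    }

    /** Table -> stream: emit one change per current row. */
    static List<Change> toStream(Map<String, String> table) {
        List<Change> stream = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            stream.add(new Change(row.getKey(), row.getValue()));
        }
        return stream;
    }
}
```

For example, the changes (alice, Berlin), (bob, Lima), (alice, Rome) replay into a table with two rows, alice = Rome and bob = Lima.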

Join Streams and Tables
[Diagram: in Kafka, a Topic and a Compacted Topic; in Kafka Streams / KSQL, the Topic is read as a Stream, the Compacted Topic as a Table, and the two are joined]
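
A stream-table join enriches each event in the stream with the table row that has the same key. The plain-Java sketch below shows the inner-join case, where events without a matching row are dropped. It simplifies in one important way: the table is a fixed snapshot, whereas in Kafka Streams the table keeps changing and each event is joined against the table's state at that moment. `StreamTableJoinSketch` and `Event` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of an inner join between a stream of events and a lookup table. */
final class StreamTableJoinSketch {
    /** One stream event: a key and a payload. */
    static final class Event {
        final String key;
        final String value;

        Event(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Joins each event with the table row for its key; events with no row are skipped. */
    static List<String> join(List<Event> stream, Map<String, String> table) {
        List<String> joined = new ArrayList<>();
        for (Event e : stream) {
            String row = table.get(e.key);
            if (row != null) {
                joined.add(e.key + ": " + e.value + " + " + row);
            }
        }
        return joined;
    }
}
```

The output keeps the order of the stream, since the table is only looked up, never iterated.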

Kafka is a complete Streaming Platform
[Diagram: Producer, Consumer, Connectors, The Log, Streaming Engine]

https://www.confluent.io/download/

Will it fly? Let’s try!

One more thing…

A Major New Paradigm

www.kafka-summit.org promo: Gamov20

Thanks!
@gamussa
viktor@confluent.io
We are hiring! https://www.confluent.io/careers/