Walking up the Spring for Apache Kafka Stack
Gary Russell @gprussell Pivotal
Viktor Gamov @gamussa Confluent
A presentation at SpringOne Platform in September 2018 in Washington, DC, USA by Viktor Gamov
Unless otherwise indicated, these slides are © 2013-2018 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Gary Russell @gprussell Pivotal
Viktor Gamov @gamussa Confluent
Origins in Stream Processing
[Diagram: Spring Boot / Kafka Streams; Serving Layer (Cassandra, Elastic, etc.); High Throughput Streaming platform; Continuous Computation; API based clustering]
What is a Streaming Platform?
[Diagram: Producer, Consumer, The Log, Connectors, Streaming Engine]
Kafka’s Distributed Log
[Diagram: Producer, Consumer, The Log, Connectors, Streaming Engine]
The log is a simple idea
Messages are added at the end of the log
[Diagram: the log, ordered from Old to New]
Consumers have a position all of their own
[Diagram: the log, ordered from Old to New; "George is here", "Fred is here", "Sally is here"; each consumer scans from its own position]
Only Sequential Access
Read to offset & scan
[Diagram: the log, ordered from Old to New]
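The three log slides above (append at the end, one position per consumer, read to an offset and scan) can be sketched in a few lines of plain Java. This is a toy in-memory model only, not Kafka's client API: the class `LogSketch` and its methods `append`, `seek`, `poll`, and `position` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only log with one read position per consumer (illustration only, not Kafka). */
final class LogSketch {
    private final List<String> records = new ArrayList<>();
    private final Map<String, Integer> positions = new HashMap<>();

    /** Adds a message at the end of the log and returns its offset. */
    int append(String message) {
        records.add(message);
        return records.size() - 1;
    }

    /** Moves a consumer to an offset; later reads scan forward from there. */
    void seek(String consumer, int offset) {
        if (offset < 0 || offset > records.size()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        positions.put(consumer, offset);
    }

    /** Reads up to max records sequentially from the consumer's position and advances it. */
    List<String> poll(String consumer, int max) {
        if (max < 0) {
            throw new IllegalArgumentException("max must not be negative");
        }
        int from = position(consumer);
        int to = Math.min(records.size(), from + max);
        positions.put(consumer, to);
        return new ArrayList<>(records.subList(from, to));
    }

    /** Returns the consumer's current position (0 if it has never read). */
    int position(String consumer) {
        return positions.getOrDefault(consumer, 0);
    }

    public static void main(String[] args) {
        LogSketch log = new LogSketch();
        for (String message : List.of("m0", "m1", "m2", "m3")) {
            log.append(message);
        }
        log.seek("Sally", 3); // Sally starts near the new end of the log
        System.out.println("George reads " + log.poll("George", 2));
        System.out.println("Fred reads " + log.poll("Fred", 4));
        System.out.println("Sally reads " + log.poll("Sally", 4));
    }
}
```

Each consumer only moves its own position, so one consumer reading slowly does not affect the others.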
Shard data to get scalability
Messages are sent to different partitions
Partitions live on different machines
[Diagram: Producer (1), Producer (2), Producer (3); cluster of machines]
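A minimal sketch of the sharding idea above: each message key is mapped to one of a fixed number of partitions, so records with the same key always land in the same partition. It assumes `String.hashCode` as a stand-in hash (Kafka's own default partitioner uses a different hash function), and the names `PartitionSketch`, `partitionFor`, and `distribute` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy key-based partitioning (illustration only; not Kafka's partitioner). */
final class PartitionSketch {

    /** Maps a key to a partition index in [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Distributes keys over partitions; each inner list stands for one partition. */
    static List<List<String>> distribute(List<String> keys, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String key : keys) {
            partitions.get(partitionFor(key, numPartitions)).add(key);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("order-1", "order-2", "order-3", "order-1");
        List<List<String>> partitions = distribute(keys, 3);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("partition " + i + ": " + partitions.get(i));
        }
    }
}
```

Because the mapping depends only on the key and the partition count, any producer computes the same partition for the same key without coordinating with other producers.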
Linearly Scalable Architecture
Single topic:
- Many producer machines
- Many consumer machines
- Many broker machines
No Bottleneck!!
[Diagram: Producers, Consumers]
Consumer Group
[Diagram: Consumer Group Coordinator, Consumers]
Talk is cheap! Show me code!
https://cnfl.io/streams-movie
The Connect API
[Diagram: Producer, Consumer, The Log, Connectors, Streaming Engine]
Ingest / Output to practically any data source
[Diagram: Kafka Connect, Kafka, Kafka Connect]
Ingest/Output from/to many data sources
DynamoDB, FTP, GitHub, BigQuery, Google Pub Sub, RethinkDB, Salesforce, Solr, Splunk, Amazon S3, Elasticsearch, HDFS, JDBC, Couchbase, Cassandra, Oracle, SAP, Vertica, Blockchain, JMX, Kinesis, MongoDB, MQTT, NATS, Postgres, Rabbit, Redis, Twitter
Stream Processing in Kafka
[Diagram: Producer, Consumer, The Log, Connectors, Streaming Engine]
[Diagram: App, Streams API]
Not running inside brokers!
Same app, many instances
[Diagram: three App instances, each with the Streams API]
Brokers? Nope!
Before
[Diagram: Processing Cluster, Shared Database, Dashboard, Your Job]
As developers, we want to build APPS not INFRASTRUCTURE
After
[Diagram: Dashboard, APP, Streams API]
Things Kafka Streams Does
• Runs everywhere
• Integrated database
• Clustering done for you
• Exactly-once processing
• Event-time processing
• Joins, windowing, aggregation
• S/M/L/XL/XXL/XXXL sizes
Table-Stream Duality
Do you think that’s a table you are querying?
TABLE                       STREAM           TABLE
Gary 1                      (“Gary”, 1)      Gary 1
Gary 1, Viktor 1            (“Viktor”, 1)    Gary 1, Viktor 1
Gary 2, Viktor 1            (“Gary”, 2)      Gary 2, Viktor 1
Gary 2, Viktor 1, Soby 1    (“Soby”, 1)      Gary 2, Viktor 1, Soby 1
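The table/stream slide above can be replayed in plain Java. This sketch counts names, emits one changelog record per update (the STREAM column), then rebuilds the table by replaying that changelog. It uses only the JDK, not the Kafka Streams API, and the names `TableStreamSketch`, `toChangelog`, and `toTable` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy table-stream duality: a count table and its changelog (illustration only). */
final class TableStreamSketch {

    /** Counts names and records every table update as a (name, newCount) change. */
    static List<Map.Entry<String, Integer>> toChangelog(List<String> names) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        List<Map.Entry<String, Integer>> changelog = new ArrayList<>();
        for (String name : names) {
            int updated = counts.merge(name, 1, Integer::sum);
            changelog.add(Map.entry(name, updated));
        }
        return changelog;
    }

    /** Rebuilds the table by replaying the changelog; the latest value per key wins. */
    static Map<String, Integer> toTable(List<Map.Entry<String, Integer>> changelog) {
        Map<String, Integer> table = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> change : changelog) {
            table.put(change.getKey(), change.getValue());
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Gary", "Viktor", "Gary", "Soby");
        List<Map.Entry<String, Integer>> changelog = toChangelog(names);
        System.out.println("stream: " + changelog);
        System.out.println("table:  " + toTable(changelog));
    }
}
```

Replaying only a prefix of the changelog yields the table as it looked at that point, which matches the intermediate rows on the slide.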
Join Streams and Tables
[Diagram: in Kafka, a Topic and a Compacted Topic; in Kafka Streams, a Stream joined with a Table]
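A rough sketch of the join pictured above, in plain Java rather than the Kafka Streams API. A compacted topic is modelled as "latest value per key", and each stream record is enriched by looking up its key in that table. The names `JoinSketch`, `compact`, and `join`, the sample keys, and the output format are all assumptions made for illustration. The sketch also simplifies by compacting the whole topic first, whereas a running join would see the table as it stands when each record arrives.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy stream-table join (illustration only; not the Kafka Streams API). */
final class JoinSketch {

    /** Models a compacted topic as a table: only the latest value per key is kept. */
    static Map<String, String> compact(List<Map.Entry<String, String>> topic) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Map.Entry<String, String> record : topic) {
            latest.put(record.getKey(), record.getValue());
        }
        return latest;
    }

    /** Inner join: stream records whose key is absent from the table are dropped. */
    static List<String> join(List<Map.Entry<String, String>> stream, Map<String, String> table) {
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, String> event : stream) {
            String tableValue = table.get(event.getKey());
            if (tableValue != null) {
                joined.add(event.getKey() + ":" + event.getValue() + "@" + tableValue);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Hypothetical data: a compacted topic of user locations and a stream of user events.
        Map<String, String> table = compact(List.of(
                Map.entry("u1", "A"), Map.entry("u2", "B"), Map.entry("u1", "C")));
        List<Map.Entry<String, String>> stream = List.of(
                Map.entry("u1", "click"), Map.entry("u3", "click"), Map.entry("u2", "view"));
        System.out.println(join(stream, table));
    }
}
```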
Talk is cheap! Show me code!
What’s next?
Spring for Apache Kafka - The Full Stack
[Diagram: Application; Spring Cloud Stream (Kafka Binder, Streams Binder); Spring Boot; Spring Integration (Kafka Extension); Spring for Apache Kafka; kafka-clients; Kafka Streams; Reactor Kafka; Spring Framework]
What’s New in Spring for Apache Kafka 2.2?
• Container Factory - no longer just for @KafkaListener
• Create any container - e.g. for Spring Integration Adapter/Gateway
• @KafkaListener - autoStartup and concurrency property overrides
• Recoverable SeekToCurrentErrorHandler/AfterRollbackProcessor
• Optional Publishing to a Dead-Letter Topic
• ErrorHandlingDeserializer
• Embedded Kafka (test) Broker - added JUnit5 support
• 2.2.0.RC1 release candidate available - last chance for requests!!
www.kafka-summit.org promo: Gamov20 @gamussa @confluentinc @S1P
Stay Connected.
https://cnfl.io/streams-movie
https://github.com/garyrussell/spring-kafka-demos
@gprussell @gamussa @s1p #springone