Crossing the Streams: Rethinking Stream Processing with Kafka Streams and KSQL
TORONTO KAFKA MEETUP, NOVEMBER 2018

https://twitter.com/gAmUssA/status/1048258981595111424

Streaming is the toolset for dealing with events as they move! @gamussa #TorontoKafka @confluentinc

Streaming platform: high throughput; continuous computation; API-based clustering; Java apps / Kafka Streams; serving layer (Cassandra, Elastic, etc.)

Stream Processing by Analogy
$ cat < in.txt | grep "ksql" | tr a-z A-Z > out.txt
Connect API -> Stream Processing -> Connect API (Kafka Cluster)
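The shell analogy can be mirrored in a few lines of plain Python. This is only a sketch of the idea (source, filter, transform, sink as chained lazy stages), not Kafka Streams code; the function names `source`, `keep_matching`, `to_upper` and `run_pipeline` are made up for illustration.

```python
from typing import Iterable, Iterator, List


def source(lines: Iterable[str]) -> Iterator[str]:
    """Plays the role of `cat < in.txt`: emit records one at a time."""
    for line in lines:
        yield line


def keep_matching(records: Iterable[str], needle: str) -> Iterator[str]:
    """Plays the role of `grep "ksql"`: pass through matching records only."""
    return (r for r in records if needle in r)


def to_upper(records: Iterable[str]) -> Iterator[str]:
    """Plays the role of `tr a-z A-Z` (str.upper also handles non-ASCII)."""
    return (r.upper() for r in records)


def run_pipeline(lines: Iterable[str]) -> List[str]:
    """Plays the role of `> out.txt`: drain the pipeline into a list."""
    return list(to_upper(keep_matching(source(lines), "ksql")))
```

Each stage handles one record at a time and never needs the whole input, which is the property the pipe analogy is meant to highlight.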

Streaming Platform Architecture: Applications (native client library, Kafka Streams); Load Balancer *; REST Proxy; Schema Registry; Kafka Connect; Kafka Brokers; Zookeeper Nodes

https://twitter.com/monitoring_king/status/1048264580743479296

LET’S TALK ABOUT THIS FRAMEWORK OF YOURS. I THINK IT’S GOOD, EXCEPT IT SUCKS

SO LET ME WRITE THE FRAMEWORK THAT’S WHY IT MIGHT BE REALLY GOOD

Every framework wants to be, when it grows up: Scalable, Elastic, Stateful, Fault-tolerant, Distributed

https://twitter.com/157rahul/status/1050505569746841600

The log is a simple idea: messages are added at the end of the log (Old -> New)

Consumers have a position all of their own: George is here, Fred is here, Sally is here. Each one scans the log (Old -> New) from its own position.

Only sequential access: read to offset & scan (Old -> New)
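The three log slides (append at the end, one position per consumer, seek to an offset and scan) fit in a small in-memory model. This is a toy sketch, not the Kafka client API; the `Log` and `Consumer` classes and their methods are hypothetical names chosen for illustration.

```python
from typing import Any, Iterator, List, Tuple


class Log:
    """Append-only sequence of records, addressed by offset."""

    def __init__(self) -> None:
        self._records: List[Any] = []

    def append(self, record: Any) -> int:
        """Add a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int) -> Iterator[Tuple[int, Any]]:
        """Seek to an offset, then scan sequentially to the end."""
        for i in range(offset, len(self._records)):
            yield i, self._records[i]


class Consumer:
    """Each consumer tracks its own position; the log stores none of it."""

    def __init__(self, name: str, log: Log, offset: int = 0) -> None:
        self.name = name
        self.log = log
        self.offset = offset

    def poll(self) -> List[Any]:
        """Return every record from the current position and advance it."""
        batch = []
        for offset, record in self.log.read_from(self.offset):
            batch.append(record)
            self.offset = offset + 1
        return batch
```

Because positions live with the consumers, George, Fred and Sally can read the same log at different speeds without affecting each other.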

Shard data to get scalability: Producer (1), Producer (2), Producer (3). Messages are sent to different partitions. Partitions live on different machines in a cluster of machines.
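One way to picture "messages are sent to different partitions" is hashing the message key and taking it modulo the partition count. The sketch below assumes key-based partitioning and uses CRC32 purely as a deterministic stand-in; Kafka's default partitioner uses a different hash, and `Topic` and `partition_for` are hypothetical names.

```python
import zlib
from typing import Any, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition; the same key always maps to the same one."""
    # CRC32 is an illustrative stand-in, not the hash Kafka actually uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Topic:
    """A topic modeled as a list of independent append-only partitions."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions: List[List[Tuple[str, Any]]] = [
            [] for _ in range(num_partitions)
        ]

    def send(self, key: str, value: Any) -> int:
        """Append the message to the partition chosen by its key."""
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p
```

Since every partition is its own log, partitions can live on different machines, and ordering is preserved per key rather than across the whole topic.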

CONSUMERS, CONSUMER GROUP, CONSUMER GROUP COORDINATOR

Linearly Scalable Architecture
Single topic:
- Many producer machines
- Many consumer machines
- Many broker machines
No bottleneck!!
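How a consumer group spreads a topic's partitions over its members can be sketched as a simple round-robin assignment. This is a simplification for illustration only: real Kafka assignment is negotiated through the group coordinator with pluggable strategies, and the `assign` function here is hypothetical.

```python
from typing import Dict, List


def assign(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Deal partitions out to the group's consumers round-robin.

    Each partition goes to exactly one consumer in the group, so adding
    consumers (up to the number of partitions) spreads the work.
    """
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    if not consumers:
        return assignment
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

For example, `assign([0, 1, 2, 3], ["c1", "c2"])` gives each consumer two partitions, while a group with more consumers than partitions leaves the extra consumers idle.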

Talk is cheap! Show me code! https://cnfl.io/streams-movie-demo

As developers, we want to build APPS not INFRASTRUCTURE

the KAFKA STREAMS API is a JAVA API to BUILD REAL-TIME APPLICATIONS

App + Streams API: not running inside brokers!

Same app, many instances: App + Streams API, App + Streams API, App + Streams API. Brokers? Nope!

Before: Processing Cluster (running Your Job), Shared Database, Dashboard

After: APP + Streams API, Dashboard

this means you can DEPLOY your app ANYWHERE using WHATEVER TECHNOLOGY YOU WANT

So many places to run your app! ...and many more...

Things Kafka Streams Does
- Enterprise Support
- Open Source
- Powerful Processing incl. Filters, Transforms, Joins, Aggregations, Windowing
- Runs Everywhere
- Supports Streams and Tables
- Elastic, Scalable, Fault-tolerant
- Exactly-Once Processing
- Kafka Security Integration
- Event-Time Processing

Table-Stream Duality

TABLE                            STREAM              TABLE
Gwen 1                           (“Gwen”, 1)         Gwen 1
Gwen 1, Matthias 1               (“Matthias”, 1)     Gwen 1, Matthias 1
Gwen 2, Matthias 1               (“Gwen”, 2)         Gwen 2, Matthias 1
Gwen 2, Matthias 1, Viktor 1     (“Viktor”, 1)       Gwen 2, Matthias 1, Viktor 1
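The duality on this slide can be replayed in plain Python: every change to a table yields a stream record, and replaying those records rebuilds the table. This sketch assumes the values are running counts per name, which is how the slide's numbers read; `changelog_from_events` and `table_from_changelog` are hypothetical helper names, not Kafka Streams calls.

```python
from typing import Dict, Iterable, List, Tuple


def changelog_from_events(names: Iterable[str]) -> List[Tuple[str, int]]:
    """Table -> stream: count each name and emit one record per table change."""
    table: Dict[str, int] = {}
    changelog: List[Tuple[str, int]] = []
    for name in names:
        table[name] = table.get(name, 0) + 1
        changelog.append((name, table[name]))
    return changelog


def table_from_changelog(changelog: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Stream -> table: replay the records, keeping the latest value per key."""
    table: Dict[str, int] = {}
    for key, value in changelog:
        table[key] = value
    return table


# The sequence of names shown on the slide.
changelog = changelog_from_events(["Gwen", "Matthias", "Gwen", "Viktor"])
rebuilt = table_from_changelog(changelog)
```

Here `changelog` holds the four records from the STREAM column and `rebuilt` matches the last row of the TABLE column. Replaying only a prefix of the changelog gives the table as it looked at that point in time.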

Do you think that’s a table you are querying?

Talk is cheap! Show me code!

What’s next?

https://twitter.com/IDispose/status/1048602857191170054

KSQL #FTW
1. CLI (ksql>)
2. UI
3. REST (POST /query)
4. Headless

Interaction with Kafka
- KSQL (processing): does not run on Kafka brokers
- JVM application with Kafka Streams (processing): does not run on Kafka brokers
- Kafka (data)

Fault-Tolerance, powered by Kafka

Standing on the shoulders of Streaming Giants: KSQL and KSQL UDFs (ease of use), powered by Kafka Streams, powered by Producer, Consumer APIs (flexibility)

Thanks! @gamussa viktor@confluent.io We are hiring! https://www.confluent.io/careers/