8
New World Streaming first • DB/DWH + Many more distributed data systems • Monolith -> Microservices • Batch -> Real-time
@gamussa
|
#NDCPorto
|
@ConfluentINc
32
Linearly Scalable Architecture Producers
Single topic: - Many producers machines - Many consumer machines - Many Broker machines No Bottleneck!!
Consumers @gamussa
|
#NDCPorto
|
@ConfluentINc
Slide 33
33
From/to other systems: Kafka Connect
and more
Tip: Great option to gradually move workloads to Kafka while keeping production running!
Slide 34
34
Kafka Connect ● Deployed standalone (development) or as a distributed cluster (production) ● Elastic service that works on bare-metal, VMs, containers, Kubernetes, … ● The individual ‘Connector’ determines delivery guarantees, e.g., exactly-once
VM
VM
Slide 35
35
Single Message Transforms for real-time ETL Ingress: modify an Event before storing ●Obfuscate sensitive information, e.g. PII ●Add origin of event for lineage tracking ●Remove unnecessary data fields ●… and more
{ user: ab123, gender: female, ip: 1.2.3.95 }
Egress: modify an Event on its way out ●Route high-priority events to faster stores ●Direct events to different Elasticsearch indexes ●Cast data types to match destination ●… and more
{ user: ab123, ip: 1.2.3.XXX }
Slide 36
36
Replicate to get fault tolerance leader
msg Machine B
Machine A
@gamussa
replicate msg |
#NDCPorto
|
@ConfluentINc
40
The log is a type of durable messaging system Similar to a traditional messaging system (ActiveMQ, Rabbit etc) but with: (a) Far better scalability (b) Built in fault tolerance / HA (c) Storage
43
Streaming is the toolset for dealing with events as they move! @gamussa
|
#NDCPorto
|
@ConfluentINc
Slide 44
44
What exactly is Stream Processing?
authorization_attempts
@gamussa
possible_fraud
|
#NDCPorto
|
@ConfluentINc
Slide 45
45
What exactly is Stream Processing? possible_fraud
authorization_attempts
CREATE STREAM possible_fraud AS SELECT card_number, count() FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count() > 3; @gamussa
|
#NDCPorto
|
@ConfluentINc
Slide 46
46
What exactly is Stream Processing? possible_fraud
authorization_attempts
CREATE STREAM possible_fraud AS SELECT card_number, count() FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count() > 3; @gamussa
|
#NDCPorto
|
@ConfluentINc
Slide 47
47
What exactly is Stream Processing? possible_fraud
authorization_attempts
CREATE STREAM possible_fraud AS SELECT card_number, count() FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count() > 3; @gamussa
|
#NDCPorto
|
@ConfluentINc
Slide 48
48
What exactly is Stream Processing? possible_fraud
authorization_attempts
CREATE STREAM possible_fraud AS SELECT card_number, count() FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count() > 3; @gamussa
|
#NDCPorto
|
@ConfluentINc
Slide 49
49
What exactly is Stream Processing? possible_fraud
authorization_attempts
CREATE STREAM possible_fraud AS SELECT card_number, count() FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count() > 3; @gamussa
|
#NDCPorto
|
@ConfluentINc
Slide 50
50
What exactly is Stream Processing? possible_fraud
authorization_attempts
CREATE STREAM possible_fraud AS SELECT card_number, count() FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count() > 3; @gamussa
|
#NDCPorto
|
@ConfluentINc
Slide 51
51
Coding Sophistication
Lower the bar to enter the world of streaming Core developers who use Java/Scala
streams Core developers who don’t use Java/Scala, e.g. .NET, Go
Data engineers, architects, DevOps/SRE
BI analysts
User Population @gamussa
|
#NDCPorto
|
@ConfluentINc
53
Interaction with Kafka KSQL (processing)
Application
Kafka
(processing) Java/KStreams, .NET
(data)
Does not run on Kafka brokers @gamussa
Does not run on Kafka brokers |
#NDCPorto
|
@ConfluentINc
Slide 54
54
Find your local Meetup Group https://cnfl.io/kafka-meetups Grab Stream Processing books https://cnfl.io/book-bundle Join us in Slack http://cnfl.io/slack @gamussa
|
#NDCPorto
|
@ConfluentINc