Who’s tweeting about #ScalaLove

A presentation at Scala Love in April 2020 by Viktor Gamov

Slide 1

Who’s tweeting about #scalalove @GAMUSSA | #SCALALOVE | @CONFLUENTINC

Slide 2

Slide 3

I build highly scalable Hello World apps

Slide 4

Apache Kafka
Producer, Consumer, The Log, Connectors, Streaming processing

Slide 5

{
  "reading_ts": "2020-02-14T12:19:27Z",
  "sensor_id": "aa-101",
  "production_line": "w01",
  "widget_type": "acme94",
  "temp_celcius": 23,
  "widget_weight_g": 100
}
Photo by Franck V. on Unsplash

Slide 6

Streams of events
(diagram axis: Time)

Slide 7

Stream Processing with ksqlDB
Stream: widgets
Stream: widgets_red

Slide 8

Stream Processing with Kafka Streams
Stream: widgets
    // stringSerde and widgetsSerde are assumed to be defined elsewhere
    final StreamsBuilder builder = new StreamsBuilder();
    builder.stream("widgets", Consumed.with(stringSerde, widgetsSerde))
           .filter((key, widget) -> widget.getColour().equals("RED"))
           .to("widgets_red", Produced.with(stringSerde, widgetsSerde));
Stream: widgets_red

Slide 9

Stream Processing with ksqlDB
Stream: widgets
ksqlDB
    CREATE STREAM widgets_red AS
      SELECT * FROM widgets WHERE color='RED';
Stream: widgets_red

Slide 10

{
  "reading_ts": "2020-02-14T12:19:27Z",
  "sensor_id": "aa-101",
  "production_line": "w01",
  "widget_type": "acme94",
  "temp_celcius": 23,
  "widget_weight_g": 100
}
Photo by Franck V. on Unsplash

Slide 11

{
  "reading_ts": "2020-02-14T12:19:27Z",
  "sensor_id": "aa-101",
  "production_line": "w01",
  "widget_type": "acme94",
  "temp_celcius": 23,
  "widget_weight_g": 100
}
    SELECT COUNT(*) FROM WIDGETS GROUP BY PRODUCTION_LINE
    SELECT AVG(TEMP_CELCIUS) AS TEMP FROM WIDGETS GROUP BY SENSOR_ID HAVING TEMP>20
    SELECT * FROM WIDGETS WHERE WEIGHT_G > 120
    CREATE SINK CONNECTOR dw WITH (
      'connector.class' = 'S3Connector',
      'topics' = 'widgets');
Object store, data warehouse, RDBMS
Photo by Franck V. on Unsplash

Slide 12

ksqlDB
The event streaming database purpose-built for stream processing applications.

Slide 13

Stream Processing with ksqlDB
Source stream

Slide 14

Stream Processing with ksqlDB
Source stream

Slide 15

Stream Processing with ksqlDB
Source stream

Slide 16

Stream Processing with ksqlDB
Source stream
Analytics

Slide 17

Stream Processing with ksqlDB
Source stream
Applications / Microservices

Slide 18

Stream Processing with ksqlDB
Source stream
…SUM(TXN_AMT) GROUP BY AC_ID
Applications / Microservices
AC_ID=42
AC_ID=42, BALANCE=94.00

Slide 19

DEMO
https://gamov.dev/scala-love-demo
Photo by Raoul Droog on Unsplash

Slide 20

Interacting with ksqlDB
Photo by Tim Mossholder on Unsplash

Slide 21

ksqlDB - Confluent Control Center

Slide 22

ksqlDB - REST API

Slide 23

ksqlDB - Native client (coming soon)

Slide 24

What else can ksqlDB do?
Photo by Sereja Ris on Unsplash

Slide 25

Message transformation with ksqlDB
ORDERS
{
  "ordertime": 1560070133853,
  "orderid": 67,
  "itemid": "Item_9",
  "orderunits": 5,
  "address": {
    "street": "243 Utah Way",
    "city": "Orange",
    "state": "California"
  }
}
Convert this timestamp to human-readable format
Drop these address fields

Slide 26

Message transformation with ksqlDB
ORDERS
{
  "ordertime": 1560070133853,
  "orderid": 67,
  "itemid": "Item_9",
  "orderunits": 5,
  "address": {
    "street": "243 Utah Way",
    "city": "Orange",
    "state": "California"
  }
}
ksqlDB
    CREATE STREAM ORDERS_NO_ADDRESS_DATA AS
      SELECT ORDERTIME, ORDERID, ITEMID, ORDERUNITS
      FROM ORDERS;

Slide 27

Message transformation with ksqlDB
ORDERS
{
  "ordertime": 1560070133853,
  "orderid": 67,
  "itemid": "Item_9",
  "orderunits": 5,
  "address": {
    "street": "243 Utah Way",
    "city": "Orange",
    "state": "California"
  }
}
ksqlDB
    CREATE STREAM ORDERS_NO_ADDRESS_DATA AS
      SELECT TIMESTAMPTOSTRING(ROWTIME, 'yyyy-MM-dd HH:mm:ss') AS ORDER_TIMESTAMP,
             ORDERID, ITEMID, ORDERUNITS
      FROM ORDERS;
ORDERS_NO_ADDRESS_DATA
{
  "order_ts": "2020-02-14 15:10:58",
  "orderid": 67,
  "itemid": "Item_9",
  "orderunits": 5
}
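
As a rough illustration of the kind of conversion TIMESTAMPTOSTRING performs, the Java sketch below formats an epoch-millisecond value with the same pattern string. The class and method names are hypothetical, and UTC is an assumption. It formats the sample ordertime field, whereas the slide's query formats ROWTIME, so the date differs from the one shown in ORDERS_NO_ADDRESS_DATA.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of ksqlDB: formats epoch milliseconds
// using the same pattern string as the slide's TIMESTAMPTOSTRING call.
final class TimestampSketch {
    // Assumption: timestamps are rendered in UTC.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    static String format(long epochMillis) {
        return FORMAT.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // ordertime value from the sample ORDERS message
        System.out.println(format(1560070133853L));
    }
}
```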

Slide 28

Lookups and Joins with ksqlDB
ORDERS
{"ordertime": 1560070133853, "orderid": 67, "itemid": "Item_9", "orderunits": 5}

Slide 29

Lookups and Joins with ksqlDB
ITEMS
{ "id": "Item_9", "make": "Boyle-McDermott", "model": "Apiaceae", "unit_cost": 19.9 }
ORDERS
{"ordertime": 1560070133853, "orderid": 67, "itemid": "Item_9", "orderunits": 5}

Slide 30

Lookups and Joins with ksqlDB
ITEMS
{ "id": "Item_9", "make": "Boyle-McDermott", "model": "Apiaceae", "unit_cost": 19.9 }
ORDERS
{"ordertime": 1560070133853, "orderid": 67, "itemid": "Item_9", "orderunits": 5}
ksqlDB
    CREATE STREAM ORDERS_ENRICHED AS
      SELECT O.*, I.*,
             O.ORDERUNITS * I.UNIT_COST AS TOTAL_ORDER_VALUE
      FROM ORDERS O
           INNER JOIN ITEMS I
           ON O.ITEMID = I.ID;

Slide 31

Lookups and Joins with ksqlDB
ITEMS
{ "id": "Item_9", "make": "Boyle-McDermott", "model": "Apiaceae", "unit_cost": 19.9 }
ORDERS
{"ordertime": 1560070133853, "orderid": 67, "itemid": "Item_9", "orderunits": 5}
ksqlDB
    CREATE STREAM ORDERS_ENRICHED AS
      SELECT O.*, I.*,
             O.ORDERUNITS * I.UNIT_COST AS TOTAL_ORDER_VALUE
      FROM ORDERS O
           INNER JOIN ITEMS I
           ON O.ITEMID = I.ID;
ORDERS_ENRICHED
{
  "ordertime": 1560070133853,
  "orderid": 67,
  "itemid": "Item_9",
  "orderunits": 5,
  "make": "Boyle-McDermott",
  "model": "Apiaceae",
  "unit_cost": 19.9,
  "total_order_value": 99.5
}
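
The enrichment above can be pictured as a per-event lookup against reference data. The following Java sketch imitates that with plain maps. It is an in-memory stand-in under simplifying assumptions, not how ksqlDB executes the join, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the ORDERS-to-ITEMS lookup join.
final class EnrichSketch {

    // Sample ORDERS event from the slide.
    static Map<String, Object> sampleOrder() {
        Map<String, Object> order = new LinkedHashMap<>();
        order.put("ordertime", 1560070133853L);
        order.put("orderid", 67);
        order.put("itemid", "Item_9");
        order.put("orderunits", 5);
        return order;
    }

    // Sample ITEMS reference data from the slide, keyed by item id.
    static Map<String, Map<String, Object>> sampleItems() {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("make", "Boyle-McDermott");
        item.put("model", "Apiaceae");
        item.put("unit_cost", 19.9);
        Map<String, Map<String, Object>> items = new HashMap<>();
        items.put("Item_9", item);
        return items;
    }

    // Inner join: returns null when the order's itemid has no matching item.
    static Map<String, Object> enrich(Map<String, Object> order,
                                      Map<String, Map<String, Object>> itemsById) {
        Map<String, Object> item = itemsById.get(order.get("itemid"));
        if (item == null) {
            return null;
        }
        Map<String, Object> enriched = new LinkedHashMap<>(order);
        enriched.putAll(item);
        int units = (Integer) order.get("orderunits");
        double unitCost = (Double) item.get("unit_cost");
        enriched.put("total_order_value", units * unitCost);
        return enriched;
    }

    public static void main(String[] args) {
        System.out.println(enrich(sampleOrder(), sampleItems()));
    }
}
```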

Slide 32

Connecting ksqlDB to other systems
Photo by Mak on Unsplash

Slide 33

Connecting ksqlDB to other systems
syslog, Google BigQuery, Amazon S3

Slide 34

Connecting ksqlDB to other systems
syslog
    CREATE SOURCE CONNECTOR SOURCE_MYSQL_01 WITH (
      'connector.class' = 'i.d.c.mysql.MySqlConnector',
      'database.hostname' = 'mysql',
      'table.whitelist' = 'demo.customers');

Slide 35

Connecting ksqlDB to other systems
    CREATE SINK CONNECTOR SINK_ELASTIC_01 WITH (
      'connector.class' = '…ElasticsearchSinkConnector',
      'connection.url' = 'http://elasticsearch:9200',
      'topics' = 'orders');
Google BigQuery, Amazon S3

Slide 36

Streams & Tables

Slide 37

Streams and Tables
Kafka topic (k/v bytes)
{ "event_ts": "2020-02-17T15:22:00Z", "person": "robin", "location": "Leeds" }
{ "event_ts": "2020-02-17T17:23:00Z", "person": "robin", "location": "London" }
{ "event_ts": "2020-02-17T22:23:00Z", "person": "robin", "location": "Wakefield" }
{ "event_ts": "2020-02-18T09:00:00Z", "person": "robin", "location": "Leeds" }
ksqlDB Stream
+--------------------+-------+----------+
|EVENT_TS            |PERSON |LOCATION  |
+--------------------+-------+----------+
|2020-02-17 15:22:00 |robin  |Leeds     |
|2020-02-17 17:23:00 |robin  |London    |
|2020-02-17 22:23:00 |robin  |Wakefield |
|2020-02-18 09:00:00 |robin  |Leeds     |
Stream: Topic + Schema
ksqlDB Table
+-------+----------+
|PERSON |LOCATION  |
+-------+----------+
|robin  |Leeds     |
(the LOCATION value is overwritten as each event arrives: Leeds, London, Wakefield, Leeds)
Table: state for given key
Topic + Schema
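
One way to picture the difference: the stream keeps every event, while the table keeps only the latest value per key. The Java sketch below replays the four sample events into a map. It is an in-memory analogy with hypothetical names, not ksqlDB's implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the slide's stream and table views.
final class StreamTableSketch {

    // The four movement events from the slide: {event_ts, person, location}.
    static List<String[]> sampleEvents() {
        return Arrays.asList(
            new String[] {"2020-02-17T15:22:00Z", "robin", "Leeds"},
            new String[] {"2020-02-17T17:23:00Z", "robin", "London"},
            new String[] {"2020-02-17T22:23:00Z", "robin", "Wakefield"},
            new String[] {"2020-02-18T09:00:00Z", "robin", "Leeds"});
    }

    // Table view: replay the stream and keep only the latest location per person.
    static Map<String, String> toTable(List<String[]> events) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] event : events) {
            table.put(event[1], event[2]);
        }
        return table;
    }

    public static void main(String[] args) {
        List<String[]> stream = sampleEvents();
        System.out.println("stream rows: " + stream.size());
        System.out.println("table: " + toTable(stream));
    }
}
```

Replaying only a prefix of the events gives the table as it stood at that point in time.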

Slide 38

Stateful aggregations in ksqlDB
Kafka topic
{ "event_ts": "2020-02-17T15:22:00Z", "person": "robin", "location": "Leeds" }
{ "event_ts": "2020-02-17T17:23:00Z", "person": "robin", "location": "London" }
{ "event_ts": "2020-02-17T22:23:00Z", "person": "robin", "location": "Wakefield" }
{ "event_ts": "2020-02-18T09:00:00Z", "person": "robin", "location": "Leeds" }
    SELECT PERSON, COUNT(*) FROM MOVEMENTS GROUP BY PERSON;
+-------+-----------------+
|PERSON |LOCATION_CHANGES |
+-------+-----------------+
|robin  |4                |
(the count steps through 1, 2, 3, 4 as events arrive)
    SELECT PERSON, COUNT_DISTINCT(LOCATION) FROM MOVEMENTS GROUP BY PERSON;
+-------+-----------------+
|PERSON |UNIQUE_LOCATIONS |
+-------+-----------------+
|robin  |3                |
(the count steps through 1, 2, 3 as events arrive)
Aggregations can be across the entire input, or windowed (TUMBLING, HOPPING, SESSION)
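
To make the two aggregations concrete, the Java sketch below computes the same per-person count and distinct count over the sample movements in memory. The names are hypothetical, and it covers only the whole-input case, not windowing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory version of the two GROUP BY PERSON aggregations.
final class AggregationSketch {

    // Sample movements from the slide: {person, location}.
    static List<String[]> sampleMovements() {
        return Arrays.asList(
            new String[] {"robin", "Leeds"},
            new String[] {"robin", "London"},
            new String[] {"robin", "Wakefield"},
            new String[] {"robin", "Leeds"});
    }

    // Like COUNT(*) ... GROUP BY PERSON: one running count per person.
    static Map<String, Long> countByPerson(List<String[]> movements) {
        Map<String, Long> counts = new HashMap<>();
        for (String[] m : movements) {
            counts.merge(m[0], 1L, Long::sum);
        }
        return counts;
    }

    // Like COUNT_DISTINCT(LOCATION) ... GROUP BY PERSON: the state is a set per person.
    static Map<String, Integer> countDistinctLocations(List<String[]> movements) {
        Map<String, Set<String>> seen = new HashMap<>();
        for (String[] m : movements) {
            seen.computeIfAbsent(m[0], k -> new HashSet<>()).add(m[1]);
        }
        Map<String, Integer> result = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet()) {
            result.put(e.getKey(), e.getValue().size());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(countByPerson(sampleMovements()));
        System.out.println(countDistinctLocations(sampleMovements()));
    }
}
```

Both functions carry state between events (a counter, a set), which is what makes the aggregation stateful.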

Slide 39

Pull and Push queries in ksqlDB
             Pull query            Push query
Tells you:   Point in time value   All value changes
Exits:       Immediately           Never
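
The contrast can be sketched with a small in-memory table in Java: a pull returns the current value and finishes, while a push registers a listener that keeps receiving changes. This is an analogy with hypothetical names, not the ksqlDB client API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical in-memory table illustrating pull versus push semantics.
final class PullPushSketch {
    private final Map<String, String> state = new HashMap<>();
    private final List<BiConsumer<String, String>> subscribers = new ArrayList<>();

    // Pull: return the point-in-time value for a key, then finish.
    String pull(String key) {
        return state.get(key);
    }

    // Push: register a listener that is called on every later change.
    void push(BiConsumer<String, String> listener) {
        subscribers.add(listener);
    }

    // Apply a new event: update the state, then notify all push subscribers.
    void update(String key, String value) {
        state.put(key, value);
        for (BiConsumer<String, String> subscriber : subscribers) {
            subscriber.accept(key, value);
        }
    }

    public static void main(String[] args) {
        PullPushSketch table = new PullPushSketch();
        table.push((person, location) -> System.out.println("push: " + person + " -> " + location));
        table.update("robin", "Leeds");
        table.update("robin", "London");
        System.out.println("pull: " + table.pull("robin"));
    }
}
```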

Slide 40

Under the covers of ksqlDB
Photo by Vinicius de Moraes on Unsplash

Slide 41

Kafka, consume, ksqlDB, produce

Slide 42

JVM, Kafka, consume, ksqlDB, produce, RocksDB, Kafka Streams

Slide 43

Slide 44

Fully Managed Kafka & KSQL as a Service

Slide 45

Running ksqlDB - self-managed
ksqlDB Server (JVM process)
DEB, RPM, ZIP, TAR downloads: http://confluent.io/download
Docker images: confluentinc/ksqldb-server
…and many more…

Slide 46

Scaling ksqlDB
Kafka cluster
ksqlDB

Slide 47

Scaling ksqlDB
Kafka cluster
ksqlDB cluster: ksqlDB, ksqlDB, ksqlDB
Work split by partition
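
To give a feel for "work split by partition", the Java sketch below spreads topic partitions over server instances round-robin. The round-robin rule and all names are assumptions for illustration; in practice the assignment is negotiated by Kafka's group coordination and need not follow this pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of dividing partitions among ksqlDB instances.
final class PartitionSplitSketch {

    // Assumed rule: partition p is handled by instance p % instanceCount.
    static Map<Integer, List<Integer>> assign(int partitionCount, int instanceCount) {
        if (instanceCount <= 0) {
            throw new IllegalArgumentException("need at least one instance");
        }
        Map<Integer, List<Integer>> assignment = new TreeMap<>();
        for (int i = 0; i < instanceCount; i++) {
            assignment.put(i, new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            assignment.get(p % instanceCount).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Six partitions on one instance, then on three.
        System.out.println(assign(6, 1));
        System.out.println(assign(6, 3));
    }
}
```

In this sketch, adding instances beyond the partition count leaves the extra instances with nothing to do, so the partition count bounds the useful parallelism.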

Slide 48

Slide 49

Think Applications, not database instances
Kafka cluster
ksqlDB cluster: Inventory
ksqlDB cluster: Orders
ksqlDB cluster: Fraud

Slide 50

ksqlDB or Kafka Streams?
Photo by Ramiz Dedaković on Unsplash

Slide 51

ksqlDB Builds on Streams
ksqlDB
Kafka Streams
Consumer, Producer

Slide 52

Want to learn more?
CTAs, not CATs (sorry, not sorry)
Photo by Tucker Good on Unsplash

Slide 53

Learn Kafka.
Start building with Apache Kafka at Confluent Developer.
developer.confluent.io

Slide 54

Confluent Community Slack group
cnfl.io/slack

Slide 55

Free Books!
https://cnfl.io/book-bundle

Slide 56
