Who’s tweeting about #DSGTech and #KSQL?
Using Apache Kafka, Kafka Connect, KSQL and Kubernetes
February 2019 | @gamussa

    apiVersion: cluster.confluent.com/v1alpha1
    kind: KafkaCluster
    spec:
      image: confluent-docker.jfrog.io/confluent-operator/kafka
      jvmConfig:
        heapSize: 4G
      metricReporter:
        bootstrapEndpoint: kafka:9071
        enabled: true
        internal: false
        publishMs: 30000

#devkafkaops @gamussa | #DSGTech | @confluentinc

https://twitter.com/kelseyhightower/status/963413508300812295

https://twitter.com/kelseyhightower/status/963414038603427840

Don’t despair…
“… not even over the fact that you don’t despair. Just when everything seems over with, new forces come marching up, and precisely that means that you are alive”
Franz Kafka

Kafka Streaming Architecture Fundamentals

Event Streaming Platform Architecture
[Diagram: applications (native client library, Kafka Streams), KSQL, Load Balancer *, REST Proxy, Schema Registry, Kafka Connect, Kafka Brokers, Zookeeper Nodes]

Kubernetes Fundamentals

Microservices, Docker, Kubernetes, Monolith

https://twitter.com/sahrizv/status/1018184792611827712

Orchestration
- Compute
- Networking
- Storage
- Service Discovery

Kubernetes
- Schedules and allocates resources
- Networking between Pods
- Storage
- Service Discovery

Refresher - Kubernetes Architecture
[Architecture diagram, including kubectl. Source: https://thenewstack.io/kubernetes-an-overview/]

Pod
Basic unit of deployment in Kubernetes. A collection of containers sharing:
- Namespace
- Network
- Volumes

Storage: Persistent Volume (PV) & Persistent Volume Claim (PVC)
- Both PV and PVC are ‘resources’
- A PV is a piece of storage, provisioned dynamically or statically, whose lifecycle is independent of any individual pod that uses the PV
- A PVC is a request for storage by a user
- PVCs consume PVs

Stateful Workloads

StatefulSet
- Relies on a Headless Service to provide network identity
- Ideal for highly available stateful workloads
[Diagram: a Headless Service in front of Pod-0, Pod-1 and Pod-2, each with its own Containers and Volumes]
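
To make "network identity" concrete: each StatefulSet pod gets a stable name of the form `<statefulset>-<ordinal>`, and the headless Service gives it a DNS entry under that name. The sketch below only builds those names as strings; the StatefulSet name `kafka`, the Service name `kafka-headless`, and the namespace `default` are hypothetical, and `cluster.local` is assumed as the cluster domain.

```python
def stateful_pod_dns_names(statefulset, service, namespace, replicas,
                           cluster_domain="cluster.local"):
    """Return the stable DNS name of every pod in a StatefulSet.

    Pattern: <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster_domain>
    All argument values used below are illustrative assumptions.
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical 3-broker StatefulSet behind a headless Service.
for name in stateful_pod_dns_names("kafka", "kafka-headless", "default", 3):
    print(name)
```

Because the names survive pod restarts, clients and peers can keep addressing the same broker by name, which is what makes the pattern a fit for stateful workloads.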

Workloads Deployment

Helm Charts

https://cnfl.io/helm_video

Helm Charts
- Package Manager
- Package multiple K8s resources into one deployment unit: Chart

Kafka deployment checklist
ZooKeeper (ZK):
- Headless Svc
- StatefulSet for 3-node zk
- PVC for Storage
- Optional Pod Anti-Affinity to spread the ZK ensemble across nodes
- ConfigMap for Prometheus JMX exporter
Kafka (uses ZK):
- StatefulSet for n-node Kafka
- Headless Service
- PVC for Storage
- A group of NodePort Services for external traffic
- ConfigMap for Prometheus JMX exporter

Basic components are not enough

Meet Kubernetes Operator

Kubernetes Operator
Embedded with operational knowledge of both data software and Kubernetes:
- Backup/restore
- Scale up/down
- Rebalance data
- Regular health checks

Controller
Brain behind Kubernetes resources, e.g. replication controller, namespace controller, etc.
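
The "brain" idea can be pictured as a control loop: compare the desired state with the observed state and emit the actions that close the gap. The following is a toy sketch of that pattern, not Kubernetes controller code; the function name `reconcile` and the pod names are assumptions made for illustration.

```python
def reconcile(desired_replicas, running_pods):
    """Return the actions that move the observed state to the desired one.

    running_pods is an ordered list of hypothetical pod names such as
    ["pod-0", "pod-1"]. Nothing is created or deleted here; the function
    only computes the plan.
    """
    actions = []
    current = len(running_pods)
    if current < desired_replicas:
        # Too few pods: create the missing ordinals in ascending order.
        for ordinal in range(current, desired_replicas):
            actions.append(("create", f"pod-{ordinal}"))
    elif current > desired_replicas:
        # Too many pods: remove the highest ordinals first.
        for name in reversed(running_pods[desired_replicas:]):
            actions.append(("delete", name))
    return actions


print(reconcile(3, ["pod-0"]))
print(reconcile(1, ["pod-0", "pod-1", "pod-2"]))
```

A real controller runs this comparison continuously against the API server rather than once against a list.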

Custom Resource Definition (CRD)
- Extends the existing Kubernetes API
- Usually works together with a Custom Controller
- Users can create and access Custom Resources with kubectl, just as they do for built-in resources like pods.
[Diagram: API (StatefulSet, ReplicaSet, …, CRD); Controller (StatefulSet Controller, ReplicaSet Controller, …, Custom Controller); instances (StatefulSet, ReplicaSet, …, Custom Resource Instance)]

Operator
Deploy and manage your production streaming platform with Confluent Operator.
- Automated Provisioning
- Platform Operations
- Resiliency
- Monitoring

Confluent Platform Reference Architecture
Each Confluent Platform component has specific characteristics:
- Security (SSL certificates)
- DNS names and zones
- Host selection
- Fault tolerance
- Scaling
[Diagram: applications (native client library, Kafka Streams), Load Balancer *, REST Proxy, Schema Registry, Kafka Connect, Kafka Brokers, Zookeeper Nodes]

Confluent Operator: Automated Provisioning
[Diagram: Load Balancer, three Kafka Pods, Storage]

Confluent Operator: Scale Horizontally
Automate scaling:
- Spin up new broker pod(s)
- Distribute partitions to the new broker(s)
- Determine balancing plan
- Execute balancing plan
- Monitor resources
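
As an illustration of what "determine balancing plan" could involve, here is a deliberately simplified sketch that evens out partition counts after new brokers join. It is not the algorithm Confluent Operator uses: it ignores replicas, leaders, disk and network load, and the function name `plan_rebalance`, the partition names, and the broker ids are all hypothetical.

```python
def plan_rebalance(assignment, brokers):
    """Plan partition moves so every broker ends up with an even share.

    assignment maps partition name -> current broker id; brokers lists all
    broker ids, including newly added empty ones. Every broker referenced
    in assignment must appear in brokers.
    Returns a list of (partition, source_broker, target_broker) moves.
    """
    by_broker = {broker: [] for broker in brokers}
    for partition, broker in sorted(assignment.items()):
        by_broker[broker].append(partition)

    # Even share per broker; the currently fullest brokers keep the remainder.
    base, extra = divmod(len(assignment), len(brokers))
    fullest_first = sorted(brokers, key=lambda b: -len(by_broker[b]))
    target = {b: base + (1 if i < extra else 0)
              for i, b in enumerate(fullest_first)}

    # Collect partitions from brokers above their target.
    surplus = []
    for broker in brokers:
        while len(by_broker[broker]) > target[broker]:
            surplus.append((by_broker[broker].pop(), broker))

    # Hand them to brokers below their target.
    moves = []
    for broker in brokers:
        while len(by_broker[broker]) < target[broker]:
            partition, source = surplus.pop()
            by_broker[broker].append(partition)
            moves.append((partition, source, broker))
    return moves


# Six partitions on brokers 1 and 2; broker 3 has just been added.
current = {"t-0": 1, "t-1": 1, "t-2": 1, "t-3": 2, "t-4": 2, "t-5": 2}
for move in plan_rebalance(current, [1, 2, 3]):
    print(move)
```

Executing the plan and monitoring resources, the remaining steps on the slide, would sit on top of such a plan.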

Confluent Operator: Rolling Upgrade
Automated rolling upgrade with no downtime for Kafka.
1. Stop broker
2. Wait for leader election to complete
3. Start broker with new version
4. Wait for zero under-replicated partitions
5. Repeat
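
The rolling-upgrade steps can be sketched as a loop over brokers. The code below runs against an in-memory stand-in, so it shows only the ordering of the steps; `FakeCluster`, its methods, and the version strings `v1` and `v2` are assumptions made for this example and do not correspond to any Confluent or Kafka API.

```python
class FakeCluster:
    """In-memory stand-in for a Kafka cluster (hypothetical, for illustration)."""

    def __init__(self, broker_ids, version):
        self.versions = {broker: version for broker in broker_ids}
        self.running = set(broker_ids)
        self.log = []

    def stop_broker(self, broker):
        self.running.discard(broker)
        self.log.append(f"stop {broker}")

    def start_broker(self, broker, version):
        self.versions[broker] = version
        self.running.add(broker)
        self.log.append(f"start {broker} {version}")

    def leader_election_complete(self):
        # The fake cluster elects leaders instantly.
        return True

    def under_replicated_partitions(self):
        # Fake rule: replicas are in sync once every broker is running.
        return 0 if len(self.running) == len(self.versions) else 1


def wait_until(condition, attempts=10):
    """Poll a condition; a real implementation would sleep between polls."""
    for _ in range(attempts):
        if condition():
            return
    raise TimeoutError("condition not met")


def rolling_upgrade(cluster, new_version):
    """Upgrade one broker at a time, following the steps on the slide."""
    for broker in sorted(cluster.versions):
        cluster.stop_broker(broker)
        wait_until(cluster.leader_election_complete)
        cluster.start_broker(broker, new_version)
        wait_until(lambda: cluster.under_replicated_partitions() == 0)


cluster = FakeCluster([0, 1, 2], "v1")
rolling_upgrade(cluster, "v2")
print(cluster.log)
```

Only one broker is ever down at a time, and the loop does not advance until replication has caught up, which is what keeps the upgrade free of downtime.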

Will it fly? Let’s see

Confluent Operator
- Automate provisioning
- Scale your Kafkas and CP clusters elastically
- Monitor SLAs through Confluent Control Center or Prometheus
- Operate at scale with enterprise support from Confluent

Advanced use cases vs.

Don’t despair!

Lower the bar to enter the world of streaming
[Chart: Coding Sophistication against User Population]
- Core developers who use Java/Scala streams
- Core developers who don’t use Java/Scala
- Data engineers, architects, DevOps/SRE
- BI analysts

KSQL #FTW
1. UI
2. CLI (ksql>)
3. REST (POST /query)
4. Headless
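
For the REST option, a request to the `/query` endpoint might be assembled as below. This is a sketch under assumptions: the server address `http://localhost:8088`, the stream name `tweets`, and the exact body and content-type shape should all be checked against the KSQL REST API documentation for the version in use. The request is only built, not sent.

```python
import json
from urllib.request import Request


def build_ksql_query_request(server_url, statement):
    """Build (but do not send) a POST /query request for a KSQL server.

    The JSON body shape and content type are assumptions to verify
    against the KSQL REST API reference.
    """
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return Request(
        url=f"{server_url}/query",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


# Hypothetical server address and stream name.
request = build_ksql_query_request("http://localhost:8088", "SELECT * FROM tweets;")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out because it needs a running KSQL server.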

Interaction with Kafka
- Kafka (data)
- KSQL (processing): does not run on Kafka brokers
- JVM application with Kafka Streams (processing): does not run on Kafka brokers

Fault-Tolerance, powered by Kafka

Differences

| | KSQL | Kafka Streams |
|---|---|---|
| You write… | KSQL statements | JVM applications |
| UI included for human interaction | Yes, in Confluent Platform | No |
| CLI included for human interaction | Yes | No |
| Data formats | Avro, JSON, CSV (today) | Any data format, including Avro, JSON, CSV, Protobuf, XML |
| REST API included | Yes | No, but you can DIY |
| Runtime included | Yes, the KSQL server | Not needed, applications run as standard JVM processes |
| Queryable state | Not yet | Yes |

Standing on the shoulders of Streaming Giants
[Diagram, on an axis from ease of use to flexibility: KSQL and KSQL UDFs, powered by Kafka Streams, powered by Producer and Consumer APIs]

One last thing…

https://kafka-summit.org
Gamov30

Resources and Next Steps
- https://cnfl.io/helm_video
- https://cnfl.io/cp-helm
- https://cnfl.io/k8s
- https://slackpass.io/confluentcommunity #kubernetes

Thanks!
@gamussa
viktor@confluent.io