Apache Mesos Archives - Kai Waehner

Kafka Operator for Kubernetes – Confluent Operator to establish a Cloud-Native Apache Kafka Platform

Kai Waehner — Mon, 29 Jul 2019 08:03:01 +0000

Confluent Operator is now GA for production deployments (Download Confluent Operator for Kafka here). This is a Kafka Operator for Kubernetes which provides automated provisioning and operations of an Apache Kafka cluster and its whole ecosystem (Kafka Connect, Schema Registry, KSQL, etc.) on any Kubernetes infrastructure.

I want to share a slide deck which explains:

Why Kubernetes is getting more and more traction to build a cloud-native infrastructure
Why this is relevant for Apache Kafka and Confluent Platform
The challenges of running Kafka on Kubernetes
How Confluent Operator solves these problems by providing a powerful Kafka Operator for Kubernetes

Cloud-Native vs. SaaS / Serverless

Software as a Service (SaaS) and Serverless Platforms provide software and services in the public cloud as a managed service. Cloud-native infrastructures allow you to leverage the features of SaaS / Serverless in your own self-managed infrastructure (either on premises or in the public cloud, and without vendor lock-in).

What is Cloud Native? Many different definitions exist on the web. Two definitions which I like are “The Twelve-Factor App” and “10 key characteristics of The New Stack”.

Some of the key benefits of cloud-native infrastructures:

Scalable
Flexible
Agile
Elastic
Automated

This is very different from traditional bare metal or VM infrastructures. Even if you use containers like Docker, you don’t automatically get the above benefits. Providing cloud-native infrastructure is a key requirement to build a DevOps infrastructure and culture. Note that technology is just one part of a fully successful DevOps mentality, of course.

Kubernetes Won the Container War

In the beginning, many cloud-native container platforms built their own cloud-native technology and infrastructure. Many of these solutions were open source, but only one took over. Just take a look at these Google Trends of the last five years:

In the meantime, most cloud-native infrastructure providers (such as Red Hat OpenShift, Mesosphere, Pivotal Cloud Foundry) have shifted their whole strategy toward supporting Kubernetes. These vendors enhance the user experience and add features to differentiate themselves from vanilla Kubernetes. OpenShift made this decision a few years earlier than most others; take a look at how the above trends reflect this decision. Furthermore, Kubernetes is now also available as a managed service on all major cloud providers (AWS, Azure, GCP).

Stateful Kubernetes Deployments using Operator Pattern

Kubernetes was mainly used for stateless deployments in the early phases (for instance to deploy REST microservices). Today, people deploy everything on Kubernetes because it adds a lot of value – as discussed in the section about cloud-native infrastructure above. This includes the Kafka backend and clients.

Stateful deployments of backend services leverage the Kubernetes Operator pattern. This applies to many infrastructure components, like databases, messaging systems, search engines, etc. The implementation of the Operator pattern includes standard Kubernetes objects like StatefulSets, ConfigMaps, Secrets and Persistent Volumes. However, the secret sauce is the custom Kubernetes Controller and the Custom Resource Definitions (CRDs) which implement unique application functionality for the specific stateful deployment.
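
To make the controller idea more concrete, here is a minimal sketch of a reconcile step in plain Python. It does not use the Kubernetes API or any Confluent Operator code. The names KafkaClusterSpec, ClusterState and reconcile are hypothetical, and the "cluster" is just an in-memory list. The point is only the pattern: compare the desired state declared in a custom resource with the observed state, and act on the difference.

```python
from dataclasses import dataclass, field


@dataclass
class KafkaClusterSpec:
    """Desired state, as a user might declare it in a custom resource (hypothetical fields)."""
    name: str
    replicas: int


@dataclass
class ClusterState:
    """Observed state: the names of the broker pods that currently exist."""
    brokers: list = field(default_factory=list)


def reconcile(spec: KafkaClusterSpec, state: ClusterState) -> list:
    """Move the observed state toward the desired state and return the actions taken."""
    actions = []
    # Scale up: create brokers with stable, ordered names, similar to a StatefulSet.
    while len(state.brokers) < spec.replicas:
        broker = f"{spec.name}-{len(state.brokers)}"
        state.brokers.append(broker)
        actions.append(f"create {broker}")
    # Scale down: remove the highest-numbered broker first.
    # A real operator would have to move partitions off the broker before this step.
    while len(state.brokers) > spec.replicas:
        broker = state.brokers.pop()
        actions.append(f"delete {broker}")
    return actions


spec = KafkaClusterSpec(name="kafka", replicas=3)
state = ClusterState()
print(reconcile(spec, state))  # first run creates the missing brokers
print(reconcile(spec, state))  # second run finds nothing to do
```

Calling reconcile repeatedly is safe because it only acts on the difference between the two states. That idempotence is what allows a controller to run in a loop and recover after a crash.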

Challenges running Kafka on Kubernetes

Apache Kafka has become the de facto standard for event streaming platforms. Apache Kafka and its ecosystem provide a powerful option to build reliable, scalable, mission-critical distributed systems. As you can imagine, this makes it harder to operate than a traditional messaging system or database, which does not scale elastically without downtime and just uses active/passive for high availability.

Kubernetes environments are similar: very powerful but not easy to operate. Hence, combining Kafka and Kubernetes does not make things easier. Here are some challenges of running the Apache Kafka ecosystem on Kubernetes:

Translating an existing architecture to Kubernetes
Failover handling
Data rebalancing
Communication between ZooKeeper, Kafka Brokers, Clients (Java, REST, Connect, KSQL), Schema Registry, etc.
External access from / to outside Kubernetes cluster
Persistent storage options on premise and in the cloud
Security configuration
Rolling upgrades
Etc.

This is the secret sauce which a Kubernetes Operator has to implement and automate. Consequently, a Kafka Operator sounds like a very good and valuable component.
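
As an illustration of one item from the list, rolling upgrades, the following sketch restarts brokers one at a time and only moves on once the cluster reports no under-replicated partitions. This is a simplified assumption about how such a gate could look, not a description of how Confluent Operator implements it. The callbacks restart and under_replicated_partitions are hypothetical stand-ins for calls to Kubernetes and to Kafka metrics.

```python
def rolling_restart(brokers, restart, under_replicated_partitions, max_checks=10):
    """Restart brokers one by one, waiting for full replication in between.

    `restart` and `under_replicated_partitions` are hypothetical callbacks.
    A real implementation would also sleep between checks.
    """
    restarted = []
    for broker in brokers:
        restart(broker)
        # Gate: do not touch the next broker until the cluster has caught up.
        for _ in range(max_checks):
            if under_replicated_partitions() == 0:
                break
        else:
            raise RuntimeError(
                f"cluster still under-replicated after restarting {broker}"
            )
        restarted.append(broker)
    return restarted


# Usage with fake callbacks: each restart leaves two partitions under-replicated,
# and every check lets one of them catch up.
pending = {"count": 0}


def fake_restart(broker):
    pending["count"] = 2


def fake_under_replicated():
    current = pending["count"]
    pending["count"] = max(0, current - 1)
    return current


print(rolling_restart(["kafka-0", "kafka-1", "kafka-2"], fake_restart, fake_under_replicated))
```

Restarting only one broker at a time keeps the remaining replicas available, which is why the gate between restarts matters more than the restart itself.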

Confluent Operator as Kafka Operator to establish a Cloud-Native Kafka Platform

Confluent has long experience running Kafka on Kubernetes:

Confluent Cloud runs on Kubernetes using a Kafka Operator to offer “Serverless Kafka”: Confluent Cloud provides mission-critical SLAs on all three major cloud providers (Google GCP, Microsoft Azure, Amazon AWS), consumption-based pricing and throughput of several GB / sec using a single Kafka cluster. Seems like running Kafka on Kubernetes using a Kafka Operator is not a bad idea.

Slide Deck: Confluent Operator for Kafka Ecosystem on Kubernetes

My slide deck describes the journey and the features of Confluent Operator to deploy and operate Kafka in a cloud-native way, similar to how Kafka and its ecosystem (like Kafka Connect, Schema Registry, KSQL) are deployed in Confluent Cloud.

Confluent Operator enables you to:

Provision, manage and operate Confluent Platform (including ZooKeeper, Apache Kafka, Kafka Connect, KSQL, Schema Registry, REST Proxy, Control Center)
Deploy on any Kubernetes Platform (Vanilla K8s, OpenShift, Rancher, Mesosphere, Cloud Foundry, Amazon EKS, Azure AKS, Google GKE, etc.)
Automate provisioning of Kafka pods in minutes
Monitor SLAs through Confluent Control Center or Prometheus
Scale Kafka elastically, handle fail-over and automate rolling updates
Automate security configuration

Confluent Operator is built on our first-hand knowledge of running Confluent at scale and is fully supported for production usage.

Here is the agenda of the slide deck:

Cloud Native vs. SaaS / Serverless Kafka
The Emergence of Kubernetes
Kafka on K8s Deployment Challenges
Confluent Operator as Kafka Operator

Also check out the documentation for Confluent Operator.

Please let me know if you have any comments or feedback.

The post Kafka Operator for Kubernetes – Confluent Operator to establish a Cloud-Native Apache Kafka Platform appeared first on Kai Waehner.

Visualisation from my Apache Kafka + Mesos Session at OOP 2018

Kai Waehner — Sun, 18 Feb 2018 19:13:44 +0000

I gave some talks about “Apache Kafka + Apache Mesos = Highly Scalable Microservices” in recent months… See my blog post with notes and slides from MesosCon Europe.

I presented an updated version at the OOP 2018 conference in Munich. The conference organizers invited some great people who do live drawings during some of the talks. The result of the live whiteboard drawing of my session is really awesome. Take a look:

Thanks to the guys from Remarker. Great visualisation! Love it…

Apache Kafka + Kafka Streams + Mesos = Highly Scalable Microservices

Kai Waehner — Fri, 12 Jan 2018 16:34:12 +0000

My latest article about Apache Kafka, Kafka Streams and Apache Mesos was published on Confluent’s blog:

Apache Mesos, Apache Kafka and Kafka Streams for Highly Scalable Microservices

This blog post discusses how to build a highly scalable, mission-critical microservice infrastructure with Apache Kafka, Kafka Streams, and Apache Mesos, or with their vendor-supported platforms from Confluent and Mesosphere.

https://www.confluent.io/blog/apache-mesos-apache-kafka-kafka-streams-highly-scalable-microservices/

Have fun reading it and let me know if you have any feedback…

Apache Kafka + Kafka Streams + Mesos / DCOS = Scalable Microservices

Kai Waehner — Fri, 27 Oct 2017 08:05:16 +0000

I gave a talk at MesosCon 2017 Europe in Prague about building highly scalable, mission-critical microservices with Apache Kafka, Kafka Streams and Apache Mesos / DC/OS. I would like to share the slides and a video recording of the live demo.

Abstract

Microservices offer many benefits like agile, flexible development and deployment of business logic. However, a microservice architecture also creates many new challenges. These include increased communication between distributed instances, the need for orchestration, new fail-over requirements, and resiliency design patterns.

This session discusses how to build a highly scalable, performant, mission-critical microservice infrastructure with Apache Kafka, Kafka Streams and Apache Mesos or DC/OS. Apache Kafka brokers are used as a powerful, scalable, distributed message backbone. Kafka’s Streams API allows you to embed stream processing directly into any external microservice or business application, without the need for a dedicated streaming cluster. Apache Mesos can be used as scalable infrastructure for both the Apache Kafka brokers and external applications using the Kafka Streams API, to leverage the benefits of a cloud-native platform like service discovery, health checks, or fail-over management.

A live demo shows how to develop real-time applications for your core business with Kafka messaging brokers and the Kafka Streams API. You will see how to deploy, manage and scale them on a DC/OS cluster using different deployment options.

Key takeaways

Successful microservice architectures require a highly scalable messaging infrastructure combined with a cloud-native platform which manages distributed microservices
Apache Kafka offers a highly scalable, mission-critical infrastructure for distributed messaging and integration
Kafka’s Streams API allows you to embed stream processing into any external application or microservice
Mesos and DC/OS allow management of both Kafka brokers and external applications using the Kafka Streams API, to leverage many built-in benefits like health checks, service discovery or fail-over control of microservices
See a live demo which combines the Apache Kafka streaming platform and DC/OS

Architecture: Kafka Brokers + Kafka Streams on Kubernetes and DC/OS

The following picture shows the architecture. You can either run Kafka Brokers and Kafka Streams microservices natively on DC/OS via Marathon or leverage Kubernetes as a Docker container orchestration tool (which is also supported by Mesosphere in the meantime).
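
For the Marathon option, a deployment is described by an app definition in JSON. The sketch below builds such a definition in Python for a Dockerized Kafka Streams microservice. It is an illustration and not the definition used in my demo. The app id, the image name, the resource values and the /health endpoint are all hypothetical (a Kafka Streams application does not expose an HTTP endpoint by default). The field names follow Marathon's app definition format as I know it, so please check them against the Marathon version you run.

```python
import json


def marathon_app(app_id, image, instances, health_path="/health"):
    """Build a Marathon app definition for a Dockerized Kafka Streams microservice.

    All concrete values are placeholders chosen for illustration.
    """
    return {
        "id": app_id,
        "instances": instances,  # Marathon keeps this many instances running
        "cpus": 0.5,
        "mem": 512,
        "container": {"type": "DOCKER", "docker": {"image": image}},
        "healthChecks": [
            {
                "protocol": "HTTP",
                "path": health_path,  # hypothetical endpoint in the microservice
                "portIndex": 0,
                "gracePeriodSeconds": 60,
                "intervalSeconds": 10,
                "maxConsecutiveFailures": 3,
            }
        ],
    }


app = marathon_app(
    "/kafka-streams-demo",
    "example/kafka-streams-demo:latest",
    instances=3,
)
print(json.dumps(app, indent=2))
```

Scaling the microservice then means changing the instances value. Since all instances of a Kafka Streams application share the same application id, Kafka rebalances the partitions across them.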

Slides

Here are the slides from my talk:

Live Demo

The following video shows the live demo. It is built on AWS using Mesosphere’s CloudFormation script to set up a DC/OS cluster in ten minutes.

Here, I deployed both – Kafka brokers and Kafka Streams microservices – directly to DC/OS without leveraging Kubernetes. I expect to see many people continue to deploy Kafka brokers directly on DC/OS. For microservices many teams might move to the following stack: Microservice –> Docker –> Kubernetes –> DC/OS.

Do you also use Apache Mesos or DC/OS to run Kafka? Only the brokers or also Kafka clients (producers, consumers, Streams, Connect, KSQL, etc.)? Or do you prefer another tool like Kubernetes (maybe on DC/OS)?
