Apache Kafka in the Automotive Industry https://www.kai-waehner.de/blog/2019/11/22/apache-kafka-automotive-industry-industrial-iot-iiot/ Fri, 22 Nov 2019

In November 2019, I had the pleasure to visit “Motor City” Detroit. I met with several automotive companies, suppliers, startups and cloud providers to discuss use cases and architectures around Apache Kafka. I have worked with companies in the German automotive industry for many years, so it was great to see the ideas and current status of projects running overseas in the US.

Kai in Detroit

I am really excited about the role of Apache Kafka and its ecosystem in the automotive industry. Kafka has become the central nervous system of many applications in various areas of the automotive industry. Machine Learning also has a growing impact on these use cases.

A Long Journey – From Car Production and Sales to Digital Services

My trip to Detroit started with some private events. First, I watched the amazing college football game of the Michigan Wolverines hosting the Michigan State Spartans in front of 111,000 fans in Ann Arbor. The next day I went to the Ford Piquette Plant in Detroit where Henry Ford and his team manufactured the first cars. The Henry Ford Museum was another great visit to learn more about the mastermind Henry Ford and the history of car manufacturing in the US.

The Car Industry is Changing…

I talked to many local people. The main excitement in the state of Michigan was around three topics: universities, sports and cars. I have to admit that talking about car companies created a much more frustrated and negative mood among many people who work in the factories and offices of car companies. They were happier discussing sports and universities. Many people are worried about the future of the automotive industry.

Thus, let’s think in more detail about the car business… A car company produces and sells cars. A supplier produces parts for the car. A vendor shop or independent company provides maintenance and repair services.

That was the situation for many decades. Welcome to a new era where car companies have to do much more than just manufacturing and selling cars to stay competitive.

Car business - From Manufacturing and Sales to Digital Services

I think people should not be worried, though. This is very exciting. While many old jobs will be done by machines and robots in the future, new tasks and job roles will be created for everybody.

Customer Behaviour has Changed…

Customer behaviour has changed significantly:

  • Customers expect digital services and integration with their own IT ecosystem (smartphone, music streaming service, smart home, and many other apps)
  • Customers look for repair services away from the car vendors because independent shops are much cheaper but still provide good service
  • Repair shops use 3rd party suppliers for car parts because the original supplier is too expensive and does not add any value or better product quality
  • Tesla, Uber, Volkswagen, Daimler and many other automotive and non-automotive companies work on autonomous driving (in several steps and releases: automatic distance measurement on the highway —> automatic parking in the parking lot —> driving back to the entrance of the supermarket when you call your car via your smartphone —> real autonomous driving to the final destination)
  • Zipcar, car2go, drivenow and other companies provide car sharing services where car usage is paid by minutes and miles: You use your smartphone to locate the next car at the airport, drive to your home in the city, and leave the car in front of your house so that the next customer can pick it up
  • Uber, Lyft, Grab, Free Now (the most ridiculous company name you could have for ride sharing) and other services provide cheap ride sharing with a fantastic user experience: You order a taxi via mobile app, select guest services (like choice of music, or whether you want to chat with the driver), see the estimated time of arrival and expected cost, monitor the taxi’s location in real time, get transported, pay automatically via the integrated payment service, and collect points in the integrated, partnered loyalty systems (like Hilton in the case of Lyft or Miles & More in the case of Free Now, to name a few examples)

In short, companies in the automotive industry have to change their business model fundamentally to stay competitive. Otherwise, they will become the next “hardware supplier” for a tech giant. This would reduce profit, establish tough dependencies and create market pressure. Even the existence of the company is at risk, no matter whether it has been on the market for 10 or 100 years already.

Digital Use Cases in Automotive Industry

This background on the history of and new requirements for the automotive industry is important to understand. Every car-related company can add more value and innovation to its own business model.

The use cases can be separated into infrastructure, manufacturing and customer interaction (with some overlaps, of course).

Infrastructure for Connected Cars

A fully connected infrastructure with real time communication is required to build innovative use cases such as:

  • Connected cars: The “Hello World” example for discussing cutting edge use cases in the automotive industry. It includes more or less all the challenges you have to solve in Internet of Things (IoT) scenarios: connectivity to millions of devices, large scale throughput, reliable communication despite bad and low network connectivity, and real time processing requirements. Ingest and process sensor information from millions of cars in real time to correlate events. Provide bi-directional communication between cars and other applications. Do big data analytics in the backend to get new insights. Build additional services for the manufacturer, customer and partners. A minimal telemetry producer sketch follows this list.
  • Fleet Management: A specific example of real time correlation of events from various different systems like mobile apps, hardware integrated into cars and trucks, and backend systems like CRMs and payment systems. Various examples exist in different industries, including logistics of truck fleets, track&trace of package delivery, ridesharing, food services, and many more.
  • Emergency system: If your car shows abnormal behaviour or crashes into a tree, it automatically sends a crash notification (including details about the crash scenario) to the nearest police station and hospital so that help is sent immediately.
  • Smart City and Smart Driving: Connect and correlate data from cars with other devices like traffic lights. Automotive companies partner with cities and other data providers to offer better security and a more comfortable driving experience. Collect and share basic safety messages. Monitor cars and send speed and slow-down warnings, wrong-way detection and congestion alerts. Provide crowdsourced data to applications such as Waze, Google Maps, Twitter and Nextdoor.
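
All of these scenarios start with producing events into Kafka. Here is a minimal sketch of a car telemetry producer in Java (as referenced in the connected cars item above); the broker address, topic name and JSON payload are hypothetical placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CarTelemetryProducer {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "kafka:9092"); // hypothetical broker address
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("acks", "all"); // trade a bit of latency for durability of safety-relevant events

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      String carId = "car-4711";
      String event = "{\"carId\":\"" + carId + "\",\"engineTempC\":93.5,\"speedKmh\":87}";
      // Using the car ID as key keeps all events of one car ordered on one partition
      producer.send(new ProducerRecord<>("car-sensors", carId, event));
    }
  }
}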

Manufacturing and Industrial IoT (IIoT)

Producing cars is a very complex task. This includes the production of the car itself (the “hardware”) and the intelligent logic (the “software”) of the car. Both integrate deeply with each other to implement use cases such as:

  • Industrial IoT (IIoT) and smart manufacturing: Predictive maintenance, early part scrapping and improved product quality are some examples of how to improve the quality of the manufactured parts and reduce costs and risks. The analysis of car sensor data also allows you to implement additional digital services on top of the plain car parts.
  • Autonomous Driving: One of the most interesting use cases. Real time processing of big data sets using cutting edge technologies such as neural networks allows car companies to build cars with autonomous driving features. This is still at an early stage and will probably take many years before it is available in Europe, but some US states and China are already pressing ahead with self-driving cars on public streets. This will take some more years in most places. In the meantime, you can already use less powerful but still impressive features like Tesla’s “summon car” feature. It does not work perfectly today, but that is just a matter of time.

Customer Interactions and Data Analytics

Connected infrastructures and self-driving cars are awesome. Another added value comes through the possibility to communicate directly with the drivers to provide additional services, for instance:

  • Remote Diagnostics: Analytics can happen at the edge (e.g. in the car) or in a data center / cloud. Autonomous driving needs to detect persons in real time, so this analytics has to run in the car. Cross selling is an example where you correlate data between different backend systems and remote users or apps; therefore, this should happen in the backend, and you just send the result of the correlation back to the car or mobile app of the driver. Predictive maintenance can happen directly in the car or machine, or you correlate the events in the backend and just send the alerts to the car or machine to stop it before the engine breaks. No matter where you deploy the analytics logic, it typically has to happen in real time: usually within milliseconds or seconds, though sometimes minutes or even hours are sufficient.
  • Identity Management: The days are over where you use a key (a piece of hardware) to open your car and start your engine. Car sharing services provide a mobile app, which is used for registration, payments and handing over the key (for the time of usage). Security (authentication, authorization and encryption of data) is getting more and more important in IoT. This is even true in Industrial IoT, where security discussions did not exist at all in the past. Car sharing is a great example where this is a key requirement to deploy an infrastructure and service successfully. The next step several companies are working on (long live China!) is image recognition instead of using a physical key or smartphone to open your car and start the engine. Other use cases include improving the customer experience in the car. For instance, the default configuration of seat, radio and air conditioning can be based on the driver’s historical behaviour (stored forever in your favorite database) and the outside ecosystem (e.g. integration with a weather service to know how warm it is). You will probably also not need to carry your driver’s license with you in the future because the police will just scan your face. I was pretty impressed when I just had to scan my face at Global Entry entering the US border from Germany last week, instead of scanning my passport and entering all the annoying details about my travel and private information.
  • Aftersales: Car companies and manufacturers need to sell additional services to make more money and / or to make customers happy and loyal. This includes use cases like upselling (provide 100 extra horsepower via digital download for 24 hours to have some fun on the German Autobahn) and cross-selling (provide a 20% coupon for the partnered steak house at the next stop, 10 miles away from your current driving location).
  • Payment Integration: In the US, it is common to pay for everything by credit card. In Europe, many shops still only accept cash. In the future, payments will be done automatically: You just go to the gas station or electric charging unit, refuel or charge your car, and leave, similar to exiting an Uber or Lyft car today. The payment is done via the integrated partner payment system of your choice. Once again, loyalty systems are integrated, too. Authentication and authorization are done via cameras and neural networks doing image recognition.
  • Data Monetization: Connected car infrastructures produce a lot of data. Depending on law, privacy options and other regulations, automotive companies can and will do their own analytics, but also sell the data or parts of the data to partners (hopefully anonymized properly). Like Google or Facebook today, the automotive industry will be able to make a fortune with the data of the cars and all the connected partner systems. Most customers will probably agree to this “feature” because they get added value out of it, too (like discounts or “free” additional services).

Wow, this is a lot of use cases for digitalization of the automotive business, isn’t it? And there are many more to come…

Challenges and Requirements to Build Automotive Infrastructure at Scale

Let’s now quickly discuss the challenges and requirements that come with all those exciting use cases:

  • Integration between many different consumer apps and backend systems (including many legacy and proprietary interfaces, and different communication paradigms like real time streaming, request-response, batch processing)
  • Big Data – sensors and millions of mobile apps from users produce a large set of (structured and unstructured) data continuously
  • Hybrid integration and deployments at the edge (e.g. cars), local proxies in different regions, backend applications in data centers or cloud, and SaaS applications
  • Global scale with zero downtime, rolling out the applications on different infrastructures (there is no AWS / Azure / GCP in China, only Alibaba, and there is no public cloud at all in Russia, just private clouds on premises)
  • Real time information is required in most scenarios to provide a good digital experience and added business value

That is a long list of challenges to solve. If you already know Apache Kafka well (and maybe also other traditional middleware), then you can probably guess why Kafka is such a good fit here.

Apache Kafka in the Automotive Industry

Now that we understand the various innovative use cases and their related challenges in the automotive industry, let’s quickly discuss why so many automotive companies and suppliers use Apache Kafka as the central nervous system for these use cases.

Apache Kafka as Event Stream Platform and Central Nervous System

Kafka for Real Time Streaming at Scale

Car sensors produce data continuously. The more cars you connect, the more data you get. Kafka provides messaging and streaming at scale on a reliable, highly scalable infrastructure with zero downtime. Rolling upgrades, backwards compatibility, up- and downscaling at runtime, and other features are built in.

Kafka for Handling Backpressure and Decoupling of Services

Cars are decoupled from the backend systems and mobile apps. Kafka provides storage for backpressure handling and decoupling of services. It is not just a messaging system! Kafka also stores data as long as you want. For instance, one Kafka topic may be stored for a few days for log analytics, while another Kafka topic is stored for years to analyze customer and payment transactions of the past. Order guarantees, timestamps and high availability are provided out-of-the-box. Connected applications and data stores can build a materialized view to leverage the data.
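
Retention is configured per topic. Here is a minimal sketch with the Kafka AdminClient, assuming hypothetical topic names: a short-lived topic for log analytics next to a payment topic that is kept forever:

import java.util.Arrays;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class RetentionPerTopic {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put("bootstrap.servers", "kafka:9092"); // hypothetical broker address

    try (AdminClient admin = AdminClient.create(props)) {
      NewTopic logs = new NewTopic("car-logs", 6, (short) 3)
          .configs(Collections.singletonMap("retention.ms", "604800000")); // keep 7 days
      NewTopic payments = new NewTopic("payments", 6, (short) 3)
          .configs(Collections.singletonMap("retention.ms", "-1"));        // keep forever

      admin.createTopics(Arrays.asList(logs, payments)).all().get();
    }
  }
}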

Kafka for Integration with Legacy and Modern Technologies

Car data needs to be integrated. Kafka is the backbone for backend systems (legacy systems like ERP and mainframe plus modern SaaS like cloud CRM and big data analytics). Connectivity via Kafka Connect provides Kafka-native integration at scale with any legacy or modern technology and communication paradigm: either directly from Kafka clients, or via other technologies acting as a proxy (like MQTT to cars, WebSockets to mobile apps, proprietary protocols like Siemens S7 or open standards like OPC-UA to machines in plants). Check out my comparison between Kafka and Middleware (MQ, ETL, ESB) for more details.

Kafka for Continuous Stream Processing with Kafka Streams and ksqlDB

Car data needs to be processed and correlated in real time at scale with other backend databases and applications. Kafka Streams and ksqlDB provide Kafka-native capabilities to process data continuously, for use cases like streaming ETL, stateful aggregations (e.g. sliding windows) and building business applications with their own state (leveraging RocksDB under the hood, so there is no need for another database).
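
As a minimal sketch of such continuous processing, the following Kafka Streams topology counts critical engine temperature readings per car in five-minute windows; the topic name, data types and threshold are assumptions:

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.TimeWindows;

public class OverheatDetector {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "overheat-detector");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092"); // hypothetical

    StreamsBuilder builder = new StreamsBuilder();
    builder.stream("engine-temperature", Consumed.with(Serdes.String(), Serdes.Double()))
        .filter((carId, temp) -> temp > 110.0)             // keep only critical readings
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofMinutes(5))) // stateful 5-minute windows
        .count()                                           // window state lives in RocksDB
        .toStream()
        .foreach((windowedCarId, count) ->
            System.out.println(windowedCarId.key() + " overheated " + count + " times"));

    new KafkaStreams(builder.build(), props).start();
  }
}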

Machine Learning and Kafka in Automotive Use Cases

Kafka is used more and more in Machine Learning infrastructures. Some examples from tech giants are Uber, Netflix and Paypal. The big challenge with Machine Learning is deploying it at scale in a reliable way (for both model training and predictions). I covered this topic in detail in various blog posts, videos and demos. Check out this blog post to get started: “How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka“.

Apache Kafka Open Source Ecosystem as Infrastructure for Machine Learning

Kafka for Data Engineering, Model Training + Deployment and Monitoring

In the Automotive Industry, Machine Learning needs to be applied at scale. Predictions often need to happen in real time. Therefore, more and more automotive-related use cases discussed in the above section leverage Kafka together with Machine Learning for different aspects like:

  • Data ingestion and processing of data with Kafka from cars and other systems for both model training of historical data and real time predictions on new events – ideally using the same ingestion and pre-processing pipeline for both.
  • Streaming model training directly from Kafka (e.g. leveraging TensorFlow I/O and its native Kafka integration), without an additional database or data lake (e.g. HDFS, AWS S3) in the middle for model training.
  • Model deployment with Kafka at the edge (e.g. in autonomous driving for stopping the car via image recognition because a person is on the street in front of the car) or in the data center / cloud (e.g. predictions for cross selling by correlation of real time behaviour together with correlated information from the loyalty and CRM systems in backend). Check out the video and slides discussing “Event-driven Model Serving: Stream Processing vs. RPC” from Kafka Summit for more details.
  • Monitoring with Kafka (e.g. real time alerting, or distributed tracing to detect anomalies and the root cause of problems).

One huge advantage of using Kafka for Machine Learning infrastructures is solving the impedance mismatch between the data scientist (loving rapid prototyping with Python and Jupyter) and the software developer / production engineer (loving scalable and reliable Java applications).

Kafka Architecture – From Edge Deployments to Global Replication

The architecture for Kafka deployments depends on the use cases, SLAs and many other requirements. Automotive companies and suppliers typically have to think globally:

Global Kafka Architecture with Edge Deployments

Using Kafka as a global nervous system for streaming data typically means you spin up several Kafka clusters. The following scenarios are very common in the automotive industry:

  • Local edge Kafka clusters in the factories: Each factory has its own Kafka cluster to integrate with the machines, sensors and assembly lines, but also with ERP systems, SCADA monitoring tools, and mobile devices of the workers. This is typically a very small Kafka cluster with e.g. three brokers (which can still process ~100+ MB/sec). Sometimes, just one single Kafka broker is deployed. This is fine if you do not need high availability and prefer low cost and very simple operations.
  • Central regional Kafka clusters: Kafka clusters are deployed in different regions. Each Kafka cluster is used to ingest, process and aggregate data from different factories in that region or from all cars within a region. These Kafka clusters are bigger than the local Kafka clusters, as they need to integrate data from several edge Kafka clusters. The integration can be realized easily and reliably with Confluent Replicator, or in the future maybe with MirrorMaker 2 (don’t use MirrorMaker 1 at all; you can find many good reasons on the web). Another option is to directly connect Kafka clients deployed at the edge to a central regional Kafka cluster, either with a Kafka client using Java, C, C++, Python, Go or another programming language, or using a proxy in the middle, like Confluent REST Proxy, Confluent MQTT Proxy, or any MQTT broker outside the Kafka environment (see the producer configuration sketch after this list). Find out more details about comparing different MQTT and HTTP-based IoT integration options for Kafka here.
  • Global Kafka clusters: You can deploy one Kafka cluster in each region or continent and replicate the data between them (uni- or bidirectionally) in real time using Confluent Replicator. Or you can leverage Confluent’s multi-data center replication feature for Kafka to spin up one logical cluster across different regions.
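
For the option of connecting edge clients directly to a regional cluster, the producer configuration matters: compression and batching save WAN bandwidth, and aggressive retries ride out flaky links. A minimal configuration sketch (broker address hypothetical):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;

public class EdgeProducerConfig {
  public static KafkaProducer<String, String> create() {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "regional-kafka:9092"); // hypothetical
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip"); // save WAN bandwidth
    props.put(ProducerConfig.LINGER_MS_CONFIG, "100");         // batch more events per request
    props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE)); // survive flaky links
    props.put(ProducerConfig.ACKS_CONFIG, "all");              // don't lose events on broker failover
    return new KafkaProducer<>(props);
  }
}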

This is just a quick summary of deployment options for Kafka clusters at the edge, on premises or in the cloud. You typically combine different options to deploy a hybrid and global Kafka infrastructure. I will speak about different “architecture patterns for distributed, hybrid and global Apache Kafka deployments” at DevNexus in Atlanta in February 2020.

For example, you might leverage Confluent Cloud, a fully managed Kafka service with usage-based pricing and enterprise-ready SLAs, in Europe and the US on Azure, AWS or GCP. But you need to use Alibaba Cloud in China and run self-managed Kafka there, deployed to Kubernetes and operated by Confluent Operator. In Russia, no public cloud is available at all; you need to deploy on premises and manage the cluster by yourself. Then you replicate and aggregate some anonymous sensor data from all continents to get new aggregated insights, while some specific user data might always stay in the country of origin and local region.

Live Demo – 100000 Connected Cars

We built a live demo (available on Github) to show how easy it is to set up a connected car infrastructure at scale (including a cutting edge streaming machine learning example for predictive maintenance in real time): Streaming Machine Learning at Scale from 100,000 IoT Devices with HiveMQ, Apache Kafka and TensorFlow:

Machine Learning at Scale in IoT with Kafka, MQTT, TensorFlow and Kubernetes

Such a setup can be the foundation for most of the discussed use cases in this blog post.

You can also take a look at the following link to see the slide deck and video recording. They discuss the demo architecture and walk you through the live demo:

Please let me know your thoughts about Apache Kafka in the automotive industry. Try out the demo and share your feedback. Let’s build some exciting new and innovative use cases for the automotive industry.

IoT Live Demo – 100.000 Connected Cars with Kubernetes, Kafka, MQTT, TensorFlow https://www.kai-waehner.de/blog/2019/11/08/live-demo-iot-100-000-connected-cars-kubernetes-kafka-mqtt-tensorflow/ Fri, 08 Nov 2019

You want to see an Internet of Things (IoT) example at huge scale? Not just 100 or 1000 devices producing data, but a really scalable demo with millions of messages from tens of thousands of devices? This is the right demo for you! We leverage Kubernetes, Apache Kafka, MQTT and TensorFlow.

The demo shows how you can integrate with tens or hundreds of thousands of IoT devices and process the data in real time. The demo use case is predictive maintenance (i.e. anomaly detection) in a connected car infrastructure to predict motor engine failures:

IoT Use Case - Kafka MQTT TensorFlow and Kubernetes

IoT Infrastructure – MQTT and Kafka on Kubernetes

We deploy Kubernetes, Kafka, MQTT and TensorFlow in a scalable, cloud-native infrastructure to integrate and analyse sensor data from 100,000 cars in real time. The infrastructure is built with Terraform. We use GCP, but you could do the same on AWS, Azure, Alibaba or on premises.

Data processing and analytics is done in real time at scale with GCP GKE, HiveMQ, Confluent and TensorFlow I/O for streaming machine learning / deep learning and bi-directional communication in a scalable, elastic and reliable infrastructure:

IoT Architecture - Kafka MQTT TensorFlow and Kubernetes

Github Project – 100000 Connected Cars

The project is available on Github. You can set the demo up in ~30min by just installing a few CLI tools and executing two or three shell scripts.

Check out the Github project “Streaming Machine Learning at Scale from 100000 IoT Devices with HiveMQ, Apache Kafka and TensorFlow“.

Please try out the demo. Feedback and PRs are welcome.

20min Live Demo – IoT at Scale on GCP with GKE, Confluent, HiveMQ and TensorFlow IO

Here is the video recording of the live demo:

If your area of interest is Industrial IoT (IIoT), you might also check out the following example. It covers the integration of machines and PLCs like Siemens S7, Modbus or Beckhoff in factories and shop floors:

Apache Kafka, KSQL and Apache PLC4X for IIoT Data Integration and Processing

Service Mesh and Cloud-Native Microservices with Apache Kafka, Kubernetes and Envoy, Istio, Linkerd https://www.kai-waehner.de/blog/2019/09/24/cloud-native-apache-kafka-kubernetes-envoy-istio-linkerd-service-mesh/ Tue, 24 Sep 2019

Microservice architectures are not a free lunch! Microservices need to be decoupled, flexible, operationally transparent, data aware and elastic. Most material from recent years only discusses point-to-point architectures with tightly coupled and non-scalable technologies like REST / HTTP. This blog post takes a look at cutting edge technologies like Apache Kafka, Kubernetes, Envoy, Linkerd and Istio to implement a cloud-native service mesh that solves these challenges and brings microservices to the next level of scale, speed and efficiency.

Here are the key requirements for building a scalable, reliable, robust and observable microservice architecture:

Key Requirements for Microservices

Before we go into more detail, let’s take a look at the key takeaways first:

  • Apache Kafka decouples services, including event streams and request-response
  • Kubernetes provides a cloud-native infrastructure for the Kafka ecosystem
  • Service Mesh helps with security and observability at ecosystem / organization scale
  • Envoy and Istio sit in the layer above Kafka and are orthogonal to the goals Kafka addresses

The following sections cover some more thoughts about this. The end of the blog post contains a slide deck and video recording with more detailed explanations.

Microservices, Service Mesh and Apache Kafka

Apache Kafka became the de facto standard for microservice architectures. It goes far beyond reliable and scalable high-volume messaging. The distributed storage allows high availability and real decoupling between the independent microservices. In addition, you can leverage Kafka Connect for integration and the Kafka Streams API for building lightweight stream processing microservices in autonomous teams.
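
That decoupling shows up directly in client code: independent services read the same events at their own pace simply by using different consumer group IDs. A minimal sketch, with broker address, group and topic names as hypothetical placeholders:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BillingService {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "kafka:9092");
    props.put("group.id", "billing-service"); // a "shipping-service" group would read the same events independently
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(Collections.singletonList("orders"));
      while (true) {
        for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
          System.out.println("Billing order " + record.key()); // business logic goes here
        }
      }
    }
  }
}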

A Service Mesh complements the architecture. It describes the network of microservices that make up such applications and the interactions between them. Its requirements can include discovery, load balancing, failure recovery, metrics, and monitoring. A service mesh also often has more complex operational requirements, like A/B testing, canary rollouts, rate limiting, access control, and end-to-end authentication.

I explore the problem of distributed microservice communication and how both Apache Kafka and service mesh solutions address it. This blog post takes a look at some approaches for combining the two to build a reliable and scalable microservice architecture with decoupled and secure microservices.

Kafka Service Mesh Domain Driven Design

Discussions and architectures include various open source technologies like Apache Kafka, Kafka Connect, Kubernetes, HAProxy, Envoy, LinkerD and Istio.

Learn more about decoupling microservices with Kafka in this related blog post about “Microservices, Apache Kafka, and Domain-Driven Design (DDD)“.

Cloud-Native Kafka with Kubernetes

Cloud-native infrastructures are scalable, flexible, agile, elastic and automated. Kubernetes became the de facto standard. Deployment of stateless services is pretty easy and straightforward. Though, deploying stateful and distributed applications like Apache Kafka is much harder. A lot of manual operations are required. Kubernetes does NOT automatically solve Kafka-specific challenges like rolling upgrades, security configuration or data balancing between brokers. The Kafka Operator – implemented with K8s Custom Resource Definitions (CRDs) – can help here!

The Operator pattern for Kubernetes aims to capture the key aim of a human operator who is managing a service or set of services. Human operators who look after specific applications and services have deep knowledge of how the system ought to behave, how to deploy it, and how to react if there are problems.

People who run workloads on Kubernetes often like to use automation to take care of repeatable tasks. The Operator pattern captures how you can write code to automate a task beyond what Kubernetes itself provides.

Different implementations of a Kafka Operator for Kubernetes exist: Confluent Operator, IBM’s / Red Hat’s Strimzi, and Banzai Cloud’s. I won’t go into more detail about the characteristics and advantages of a K8s Kafka Operator here. I already explained it in detail in another blog post (and the video below also discusses this topic):

Service Mesh with Kubernetes-based Technologies like Envoy, Linkerd or Istio

Service Mesh is a microservice pattern to move visibility, reliability, and security primitives for service-to-service communication into the infrastructure layer, out of the application layer.

A great, detailed explanation of the design pattern “service mesh” can be found here, including the following diagram which shows the relation between a control plane and the microservices with proxy sidecars:

design pattern service mesh

You can find much more great content about service mesh concepts and its implementations from the creators of frameworks like Envoy or Linkerd. Check out these two links or just use Google for more information about the competing alternatives and their trade-offs.

(Potential) Features for Apache Kafka and Service Mesh

An event streaming platform like Apache Kafka and a service mesh on top of Kubernetes are cloud-native, orthogonal and complementary. Together they solve the key requirements for building a scalable, reliable, robust and observable microservice architecture:

Key Requirements for Microservices Solved with Kafka Kubernetes Envoy Istio

Companies use Kafka together with service mesh implementations like Envoy, Linkerd or Istio already today. You can easily combine them to add security, enforce rate limiting, or implement other related use cases. Banzai Cloud published one of the most interesting architectures: They use Istio for adding security to Kafka Brokers and ZooKeeper via proxies using Envoy.

However, in the meantime, the support gets even better: The pull request for Kafka support in Envoy was merged in May 2019. This means you now have native Kafka protocol support in Envoy. The very interesting discussions about its challenges and potential features of implementing a Kafka protocol filter are also worth reading.

With native Kafka protocol support, you can do many more interesting things beyond L4 TCP filtering. Here are just some ideas (partly from above Github discussion) of what you could do with L7 Kafka protocol support in a Service Mesh:

Protocol conversion from HTTP / gRPC to Kafka

  • Tap feature to dump to a Kafka stream
  • Protocol parsing for observability (stats, logging, and trace linking with HTTP RPCs)
  • Shadow requests to a Kafka stream instead of HTTP / gRPC shadow
  • Integrate with Kafka Connect and its whole ecosystem of connectors

Proxy features

  • Dynamic Routing
  • Rate limiting at both the L4 connection and L7 message level
  • Filter, add compression, …
  • Automatic topic name conversion (e.g. for canary release or blue/green deployment)

Monitoring and Tracing

  • Request logs and stats
  • Data lineage / audit log
  • Audit log by taking request logs and enriching them with the user info.
  • Client specific metrics (Byte rate per client id / per consumer groups, versions of the client libraries, consumer lag monitoring for the entire data center)

Security

  • SSL Termination
  • Mutual TLS (mTLS)
  • Authorization

Validation of Events

  • Serialization format (JSON, Avro, Protobuf, etc.)
  • Message schema
  • Headers, attributes, etc.

That’s awesome, isn’t it?

Microservices, Kafka and Service Mesh – Slide Deck and Video Recording

Let’s take a look at my slide deck and video recording to understand the requirements, challenges and opportunities of building a Service Mesh with Apache Kafka, its ecosystem, Kubernetes and Service Mesh technologies in more detail…

Here is the slide deck:

The video recording walks you through the slide deck:

Any thoughts or feedback? Please let me know via a comment or Tweet or let’s connect on LinkedIn.
Kafka Operator for Kubernetes – Confluent Operator to establish a Cloud-Native Apache Kafka Platform https://www.kai-waehner.de/blog/2019/07/29/confluent-kafka-operator-cloud-native-apache-kafka-platform-kubernetes/ Mon, 29 Jul 2019

Confluent Operator is now GA for production deployments (Download Confluent Operator for Kafka here). This is a Kafka Operator for Kubernetes which provides automated provisioning and operations of an Apache Kafka cluster and its whole ecosystem (Kafka Connect, Schema Registry, KSQL, etc.) on any Kubernetes infrastructure.

Confluent Operator Kafka Operator for Kubernetes Download

I want to share a slide deck which explains:

  • Why Kubernetes is getting more and more traction to build a cloud-native infrastructure
  • Why this is relevant for Apache Kafka and Confluent Platform
  • The challenges running Kafka on Kubernetes
  • How Confluent Operator solves these problems providing a powerful Kafka Operator for Kubernetes

Cloud-Native vs. SaaS / Serverless

Software as a Service (SaaS) and serverless platforms provide software and services in the public cloud as a managed service. Cloud-native infrastructures allow you to leverage the characteristics of SaaS / serverless in your own self-managed infrastructure (either on premises or in the public cloud, and without vendor lock-in).

What is Cloud Native? Many different definitions exist on the web. Two definitions which I like are “The Twelve-Factor App” and “10 key characteristics of The New Stack“.

Some of the key benefits of cloud-native infrastructures:

  • Scalable
  • Flexible
  • Agile
  • Elastic
  • Automated

This is very different from traditional bare metal or VM infrastructures. Even if you use containers like Docker, you don’t automatically get the above benefits. Providing cloud-native infrastructure is a key requirement for building a DevOps infrastructure and culture. Note that technology is just one part of a fully successful DevOps mentality, of course.

Kubernetes Won the Container War

In the beginning, many cloud-native container platforms built their own cloud-native technology and infrastructure. Many of these solutions were open source, but only one took over. Just take a look at these Google Trends of the last five years:

Google Trends for Kubernetes (Mesosphere, Cloud Foundry, OpenShift)

In the meantime, most cloud-native infrastructure providers (such as Red Hat OpenShift, Mesosphere, Pivotal Cloud Foundry) moved their whole strategy to supporting Kubernetes. These vendors enhance the user experience and add additional features to differentiate from vanilla Kubernetes. OpenShift made this decision a few years earlier than most others; take a look at how the above trends reflect this decision. Furthermore, Kubernetes is now also available as a managed service on all major cloud providers (AWS, Azure, GCP).

Stateful Kubernetes Deployments using Operator Pattern

Kubernetes was mainly used for stateless deployments in the early phases (for instance to deploy REST microservices). Today, people deploy everything on Kubernetes because it adds a lot of value – as discussed in the section about cloud-native infrastructure above. This includes the Kafka backend and clients.

Stateful deployments of backend services leverage the Kubernetes Operator pattern, for many infrastructure components like databases, messaging systems, search engines, etc. The implementation of the Operator pattern includes standard Kubernetes objects like StatefulSets, ConfigMaps, Secrets and Persistent Volumes. However, the secret sauce is the custom Kubernetes controller and Custom Resource Definitions (CRDs) which implement unique application functionality for the specific stateful deployment.

Challenges running Kafka on Kubernetes

Apache Kafka became the de facto standard for event streaming platforms. Apache Kafka and its ecosystem provide a powerful option to build reliable, scalable, mission-critical distributed systems. Therefore, as you can imagine, it is harder to operate than a traditional messaging system or database which does not scale elastically without downtime and just uses active/passive for high availability.

Kubernetes environments are similar: Very powerful but not easy to operate. Hence the combination of both, Kafka and Kubernetes, does not make it easier. Here are some challenges running the Apache Kafka ecosystem on Kubernetes:

  • Translating an existing architecture to Kubernetes
  • Failover handling
  • Data rebalancing
  • Communication between ZooKeeper, Kafka Brokers, Clients (Java, REST, Connect, KSQL), Schema Registry, etc.
  • External access from / to outside Kubernetes cluster
  • Persistent storage options on premise and in the cloud
  • Security configuration
  • Rolling upgrades
  • Etc.

This is the secret sauce which a Kubernetes Operator has to implement and automate. Consequently, a Kafka Operator sounds like a very good and valuable component.

Confluent Operator as Kafka Operator to establish a Cloud-Native Kafka Platform

Confluent has long experience running Kafka on Kubernetes:

Confluent Experience for Kafka on Kubernetes

Confluent Cloud runs on Kubernetes using a Kafka Operator to offer “Serverless Kafka”: Confluent Cloud provides mission-critical SLAs on all three major cloud providers (Google GCP, Microsoft Azure, Amazon AWS), consumption-based pricing and throughput of several GB/sec using a single Kafka cluster. It seems like running Kafka on Kubernetes using a Kafka Operator is not a bad idea.

Slide Deck: Confluent Operator for Kafka Ecosystem on Kubernetes

My slide deck describes the journey and the features of Confluent Operator to deploy and operate Kafka in a cloud-native way similar to how Kafka and its ecosystem (like Kafka Connect, Schema Registry, KSQL) is deployed in Confluent Cloud.

Confluent Operator - A Kafka Operator for Kubernetes

Confluent Operator enables you to:

  • Provision, manage and operate Confluent Platform (including ZooKeeper, Apache Kafka, Kafka Connect, KSQL, Schema Registry, REST Proxy, Control Center)
  • Deploy on any Kubernetes platform (vanilla K8s, OpenShift, Rancher, Mesosphere, Cloud Foundry, Amazon EKS, Azure AKS, Google GKE, etc.)
  • Automate provisioning of Kafka pods in minutes
  • Monitor SLAs through Confluent Control Center or Prometheus
  • Scale Kafka elastically, handle failover and automate rolling updates
  • Automate security configuration
  • Build on Confluent’s first-hand knowledge of running Kafka at scale
  • Get full support for production usage

Here is the Agenda of the slide deck:

  • Cloud Native vs. SaaS / Serverless Kafka
  • The Emergence of Kubernetes
  • Kafka on K8s Deployment Challenges
  • Confluent Operator as Kafka Operator

Also check out the documentation for Confluent Operator.

Please let me know if you have any comments or feedback.

Deep Learning at Extreme Scale with the Apache Kafka Open Source Ecosystem https://www.kai-waehner.de/blog/2018/05/09/deep-learning-at-extreme-scale-%e2%80%a8with-apache-kafka-open-source-ecosystem/ Wed, 09 May 2018

I presented a new talk at “Codemotion Amsterdam 2018” this week. I discussed the relation between Apache Kafka and Machine Learning to build a Machine Learning infrastructure at extreme scale.

Long version of the title:

Deep Learning at Extreme Scale (in the Cloud) with the Apache Kafka Open Source Ecosystem – How to Build a Machine Learning Infrastructure with Kafka, Connect, Streams, KSQL, etc.

As always, I want to share the slide deck. The talk was also recorded; I will share the video as soon as it is published by the organizer.

The room was full. No free seats. A lot of interest in this topic. I talked to many attendees who have huge challenges bringing analytic models into mission-critical production. A lack of scalability and missing flexibility were other challenges many people had when using e.g. just a Python environment to build an analytic model.

Apache Kafka as Key Component in a Machine Learning Infrastructure

The open source Apache Kafka ecosystem helps in many stages of a machine learning process: data integration, data ingestion, data preprocessing, model deployment, monitoring, etc. Many companies have already built a Kafka ML infrastructure. Take a look at tech giants like Netflix with Meson or Uber with Michelangelo, to mention two examples.

Tech giants and many other companies already use Apache Kafka at very large scale, so you do not need to worry about this. You just need to evaluate how to integrate it into your existing environment and projects. See the examples from LinkedIn (> 4.5 trillion messages per day) or Netflix (peak of 6 petabytes per day):

Machine Learning and Apache Kafka are both part of their core infrastructure. I see similar scenarios at most traditional companies (like banks, telcos, retailer) these days. They all build a central nervous system around Apache Kafka and apply analytic models in several scenarios and business processes.

Abstract of the Talk: Machine Learning and Deep Learning with Apache Kafka

This talk shows how to build Machine Learning models at extreme scale and how to productionize the built models in mission-critical real time applications by leveraging open source components in the public cloud. The session discusses the relation between TensorFlow and the Apache Kafka ecosystem – and why this is a great fit for machine learning at extreme scale.

The Machine Learning architecture includes: Kafka Connect for continuous high volume data ingestion into the public cloud, TensorFlow leveraging Deep Learning algorithms to build an analytic model on powerful GPUs, Kafka Streams for model deployment and inference in real time, and KSQL for real time analytics of predictions, alerts and model accuracy.

Sensor analytics for predictive alerting in real time is used as real world example from Internet of Things scenarios. A live demo shows the out-of-the-box integration and dynamic scalability of these components on Google Cloud.

Machine Learning Infrastructure based on the Apache Kafka and Confluent open source ecosystem:

Apache Kafka and Confluent Open Source Ecosystem for Machine Learning and Deep Learning

Key takeaways for the audience

  • Data Scientist and Developers have to work together continuously (org + tech!)
  • Mission critical, scalable production infrastructure is key for success of Machine Learning projects
  • Apache Kafka Ecosystem + Cloud = Machine Learning at Extreme Scale (Ingestion, Processing, Training, Inference, Monitoring)

Slide Deck: Apache Kafka + Machine Learning at Extreme Scale

Here is the slide deck of my talk:

As always, I appreciate any feedback.

Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL https://www.kai-waehner.de/blog/2018/03/13/rethinking-stream-processing-with-apache-kafka-streams-and-ksql/ Tue, 13 Mar 2018

I presented at JavaLand 2018 in Brühl recently. A great developer conference with over 1800 attendees. The location is also awesome: a theme park, Phantasialand. My talk: “New Era of Stream Processing with Apache Kafka’s Streams API and KSQL“. I just want to share the slide deck…

Kai Speaking at JavaLand 2018 about Kafka Streams and KSQL

Abstract

Stream Processing is a concept used to act on real-time streaming data. This session shows and demos how teams in different industries leverage the innovative Streams API from Apache Kafka to build and deploy mission-critical streaming real time applications and microservices.

The session discusses important Streaming concepts like local and distributed state management, exactly once semantics, embedding streaming into any application, deployment to any infrastructure. Afterwards, the session explains key advantages of Kafka’s Streams API like distributed processing and fault-tolerance with fast failover, no-downtime rolling deployments and the ability to reprocess events so you can recalculate output when your code changes.
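
As a small illustration of how lightweight the exactly-once semantics mentioned above are in practice, enabling them in a Kafka Streams application is a single configuration entry (application ID and broker address are hypothetical):

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class ExactlyOnceSettings {
  public static Properties create() {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "payment-processing"); // hypothetical
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");      // hypothetical
    // One line enables transactional, exactly-once processing (available since Kafka 0.11)
    props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
    return props;
  }
}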

A demo shows how to combine any custom code with your streams application – by an example using an analytic model built with any machine learning framework like Apache Spark ML or TensorFlow.

The end of the session introduces KSQL – the open source Streaming SQL Engine for Apache Kafka. Write “simple” SQL streaming queries with the scalability, throughput and fail-over of Kafka Streams under the hood.

Slide Deck

Here we go:
Video Recording – Apache Kafka as Event-Driven Open Source Streaming Platform (Voxxed Zurich 2018) https://www.kai-waehner.de/blog/2018/03/13/video-recording-apache-kafka-as-event-driven-open-source-streaming-platform-voxxed-zurich-2018/ Tue, 13 Mar 2018

I spoke at Voxxed Zurich 2018 about Apache Kafka as an event-driven open source streaming platform. The talk includes an intro to Apache Kafka and its open source ecosystem (Kafka Streams, Connect, KSQL, Schema Registry, etc.). I just want to share the video recording of my talk.

Abstract

This session introduces Apache Kafka, an event-driven open source streaming platform. Apache Kafka goes far beyond scalable, high volume messaging. In addition, you can leverage Kafka Connect for integration and the Kafka Streams API for building lightweight stream processing microservices in autonomous teams. The open source Confluent Platform adds further components such as KSQL, Schema Registry, REST Proxy, clients for different programming languages and connectors for different technologies and databases. Live demos included.

Video Recording

Visualisation from my Apache Kafka + Mesos Session at OOP 2018 https://www.kai-waehner.de/blog/2018/02/18/visualization-apache-kafka-mesos-oop-session/ Sun, 18 Feb 2018

I did some talks about “Apache Kafka + Apache Mesos = Highly Scalable Microservices” in the last months… See my blog post with notes and slides from MesosCon Europe.

I did an updated version at the OOP 2018 conference in Munich. The conference organizers invited some great people who do live drawings during some of the talks. The result of the live whiteboard drawing of my session is really awesome. Take a look:

Whiteboard Drawing Kafka Streams Mesos Microservices

Thanks to the guys from Remarker. Great visualisation! Love it…

Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem https://www.kai-waehner.de/blog/2018/02/13/machine-learning-trends-of-2018-combined-with-apache-kafka-ecosystem/ Tue, 13 Feb 2018

At the OOP 2018 conference in Munich, I presented an updated version of my talk about building scalable, mission-critical microservices with the Apache Kafka ecosystem and Deep Learning frameworks like TensorFlow, DeepLearning4J or H2O. I want to share the updated slide deck and discuss a few of the newest trends which I incorporated into the talk.

The main story is the same as in my Confluent blog post about the Apache Kafka ecosystem and Machine Learning: How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka. But I focused more on Deep Learning / Neural Networks. I also discussed a few innovations in the ecosystem of Apache Kafka and trends in ML in the last months: KSQL, ONNX, AutoML, and ML platforms from Uber and Netflix. Let’s take a look at these interesting topics and how they relate to each other.

KSQL – A Streaming SQL Language on top of Apache Kafka

“KSQL is a streaming SQL engine for Apache Kafka. KSQL lowers the entry bar to the world of stream processing, providing a simple and completely interactive SQL interface for processing data in Kafka. You no longer need to write code in a programming language such as Java or Python! KSQL is open-source (Apache 2.0 licensed), distributed, scalable, reliable, and real-time. It supports a wide range of powerful stream processing operations including aggregations, joins, windowing, sessionization, and much more.” More details here: “Introducing KSQL: Open Source Streaming SQL for Apache Kafka“.

You can write SQL-like queries to deploy scalable, mission-critical stream processing apps (which leverage Kafka Streams under the hood). Definitely a highlight in the Kafka open source ecosystem.

KSQL and Machine Learning

KSQL is built on top of Kafka Streams and therefore allows you to build scalable, mission-critical services. Machine Learning models, including neural networks, can easily be embedded by building a User Defined Function (UDF). These days I am preparing an example where I apply a neural network – more precisely, an autoencoder – for sensor analytics to detect anomalies (i.e. critical values in health checks) of hospital guests in real time, in order to send an alert to the doctor.
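
Such a UDF is plain Java. A minimal sketch, assuming the annotation-based UDF API that shipped in later KSQL releases; the scoring logic is a trivial stand-in for a real autoencoder’s reconstruction error:

import io.confluent.ksql.function.udf.Udf;
import io.confluent.ksql.function.udf.UdfDescription;

@UdfDescription(name = "anomaly", description = "Scores a sensor reading against an embedded model")
public class AnomalyUdf {

  @Udf(description = "Returns the anomaly score of a health-check value")
  public double anomaly(final double sensorValue) {
    // Stand-in for a real model call: distance from an expected baseline value
    return Math.abs(sensorValue - 80.0);
  }
}

Once the UDF is registered, a continuous KSQL query can call anomaly(sensor_value) on every incoming event and stream the scores to an alerting topic.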

Let’s now talk about some interesting new developments in the machine learning ecosystem.

ONNX – An Open Format to Represent Deep Learning Models

“ONNX is an open format to represent deep learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them.”

This sounds similar to PMML (Predictive Model Markup Language, see “What is PMML” on KDnuggets) and PFA (Portable Format for Analytics), two other standards to define and share machine learning models. However, ONNX differs in a few aspects:

  • focuses on Deep Learning
  • has several huge tech companies (AWS, Microsoft, Facebook) and hardware vendors (AMD, NVidia, Intel, Qualcomm, etc.) behind it
  • already supports many leading open source frameworks (including TensorFlow, PyTorch, MXNet)

ONNX is already GA in version 1.0 and production ready (as announced by Amazon, Microsoft and Facebook in December 2017). There is also a nice getting started guide for different frameworks.

ONNX and the Apache Kafka ecosystem

Unfortunately, ONNX has no Java support yet. Therefore, there is no native support yet for embedding it into the Kafka Streams Java API; the only workarounds are doing a REST call or embedding a JNI binding. But I am very sure this is only a matter of time, because the Java platform is so important in many enterprises for deploying mission-critical applications.

Right now, you could use Kafka’s Java API or other Kafka clients. Confluent provides official clients for several programming languages, e.g. for Python or Go, which are both perfect for Machine Learning applications, too.
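
A minimal sketch of the REST workaround from a Kafka Streams application: each event is posted to a hypothetical model server and the returned prediction is forwarded to an output topic. The endpoint, topics and payload format are assumptions, and a synchronous remote call per event limits throughput compared to an embedded model:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Properties;
import java.util.Scanner;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class RestScoringApp {

  static String score(String json) {
    try {
      HttpURLConnection con =
          (HttpURLConnection) new URL("http://model-server:8080/predict").openConnection(); // hypothetical endpoint
      con.setRequestMethod("POST");
      con.setDoOutput(true);
      try (OutputStream os = con.getOutputStream()) {
        os.write(json.getBytes("UTF-8"));
      }
      try (Scanner s = new Scanner(con.getInputStream(), "UTF-8")) {
        return s.useDelimiter("\\A").next(); // read the whole response body
      }
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "rest-scoring");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092"); // hypothetical

    StreamsBuilder builder = new StreamsBuilder();
    builder.stream("sensor-events", Consumed.with(Serdes.String(), Serdes.String()))
        .mapValues(RestScoringApp::score) // blocking RPC per event: simple, but not the fastest option
        .to("predictions", Produced.with(Serdes.String(), Serdes.String()));
    new KafkaStreams(builder.build(), props).start();
  }
}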

Automated Machine Learning (aka AutoML)

“Automated machine learning (AutoML) is a hot new field with the goal of making it easy to select different machine learning algorithms, their parameter settings, and the pre-processing methods that improve their ability to detect complex patterns in big data” as stated here.

With AutoML, you can build analytic models without any knowledge about Machine Learning. AutoML implementations use different algorithms such as decision trees, clustering, neural networks, etc. to build and compare different models out-of-the-box. You just upload or connect your historical data set and click a few buttons to start the process. Maybe not perfect for every use case, but you can easily improve many existing processes without the need for a rare and expensive data scientist.

DataRobot and Google’s AutoML are two of many well-known cloud offerings in this space. H2O’s AutoML is integrated into its open source ML framework, but the company also offers a nice UI-focused commercial product called “Driverless AI“. I highly recommend spending 30 minutes with any AutoML tool. It is really fascinating to see how AI tools develop these days.

AutoML and the Apache Kafka ecosystem

Most AutoML tools offer deployment of their models. You can access the analytic models e.g. via a REST interface. This is not a perfect solution for a scalable, event-driven architecture like Kafka. The good news: Many AutoML solutions also allow you to export their generated models so that you can deploy them into your application. For example, AutoML in H2O’s open source framework is just one of many options. You simply use another operation in the programming language of your choice (R, Python, Scala, Web UI):

# Train and compare many models automatically with H2O AutoML (R API)
library(h2o)
h2o.init()
aml <- h2o.automl(x = x, y = y,                # feature and target columns
                  training_frame = train,      # training data
                  leaderboard_frame = test,    # data used to rank the models
                  max_runtime_secs = 30)       # time budget for the model search

This is similar to what you would do to build a Linear Regression, Decision Tree or Neural Network. The result is generated Java code which you can easily embed into your Kafka Streams microservice or any other Kafka application. AutoML enables you to build and deploy highly scalable machine learning without deep ML knowledge.
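Here is a minimal sketch of such an embedding, using H2O's generated model classes and its EasyPredictModelWrapper helper; the generated class name (automl_leader), topic names and feature columns are placeholders for illustration:

import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;
import org.apache.kafka.streams.StreamsBuilder;

// The generated Java class (here: 'automl_leader') is just another class
// on the classpath – no model server, no extra cluster.
EasyPredictModelWrapper model = new EasyPredictModelWrapper(new automl_leader());

StreamsBuilder builder = new StreamsBuilder();
builder.<String, String>stream("flight-events")
       .mapValues(csv -> {
           String[] fields = csv.split(",");
           RowData row = new RowData();
           row.put("Origin", fields[0]);      // feature columns of the model
           row.put("Dest", fields[1]);
           try {
               BinomialModelPrediction p = model.predictBinomial(row);
               return p.label;                // e.g. flight delayed: YES / NO
           } catch (Exception e) {
               return "SCORING_ERROR";
           }
       })
       .to("flight-predictions");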

ML Platforms: Uber’s Michelangelo; Netflix’ Meson

Tech giants are typically some years ahead of “traditional enterprises”. They already built years ago what you build today or tomorrow. ML Platforms are no different. Writing the ML source code to train an analytic model is just a very small part of a real-world ML infrastructure. You need to think about the whole development process. The following picture shows the “Hidden Technical Debt in Machine Learning Systems”:

Hidden Technical Debt in Machine Learning Systems

You will probably build several analytic models with different technologies. Not everything will be built in your Spark or Flink cluster or in a single cloud infrastructure. You might run TensorFlow on some big, expensive GPU in the public cloud to build powerful neural networks. Or use H2O to build some small, but very efficient and performant decision trees which do inference in a few microseconds… ML has many use cases.

That’s why many tech giants have built their own ML platforms, like Uber’s Michelangelo or Netflix’ Meson. These ML platforms allow them to build and monitor powerful, scalable analytic models while staying flexible to choose the right ML technology for each use case.

Apache Kafka ecosystem for ML Platforms

One of the reasons why Apache Kafka is so successful is its huge adoption by many tech giants. Almost all great Silicon Valley companies like LinkedIn, Netflix, Uber, eBay, “you name it” blog and speak about their usage of Kafka as the event-driven central nervous system for their mission-critical applications. Many focus on the distributed streaming platform for messaging, but we also see more and more adoption of add-ons like Kafka Connect, Kafka Streams, REST Proxy, Schema Registry and KSQL.

If you look at the above picture again and then think about Kafka: isn’t it a perfect fit for an ML platform? Training, monitoring, deployment, inference, configuration, A/B testing, and so on. That’s probably why Uber, Netflix and many others already use Kafka as a central component in their ML infrastructure.

Apache Kafka Ecosystem for Machine Learning

And again, you are not forced to use just one specific technology. One of the great design concepts of Kafka is that you can re-process data again and again from its distributed commit log. This means you can either build different models with one technology as a Kafka sink (let’s say Apache Flink or Spark), or connect different technologies like scikit-learn for local testing, TensorFlow running on Google Cloud GPUs for powerful deep learning, an on-premise installation of H2O nodes for AutoML, and some other Kafka Streams ML apps deployed in Docker containers or Kubernetes. All of these ML applications consume the data in parallel, at their own pace, and as often as they need to.
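This parallel, decoupled consumption is simply Kafka's consumer group mechanics. As a small sketch (topic and group names are examples), each ML application uses its own group.id and can re-read the log from the beginning:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "tensorflow-training");   // another app would use e.g. "h2o-automl"
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");     // re-process the full history
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("sensor-events"));

Because each consumer group tracks its own offsets, adding a new ML application later does not affect the existing ones at all.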

Here is a great example of how to automate training and deployment of a scalable ML microservice with Kafka and Kafka Streams. No need to add another big data cluster. That’s one of the key differences of using Kafka Streams or KSQL for your ML applications instead of other Stream Processing frameworks.

Apache Kafka and Deep Learning – Slide Deck from OOP

Finally, after all these discussions about the Apache Kafka ecosystem and new trends in Machine Learning / Deep Learning, here are my updated slides from my talk at the OOP 2018 conference:

I have also built a few examples using Apache Kafka, Kafka Streams and different open source ML frameworks like H2O, TensorFlow and DeepLearning4j (DL4J). The GitHub project shows how easy it is to deploy analytic models to a highly scalable, fault-tolerant, mission-critical Kafka microservice. A KSQL demo will also come soon.

Please share your feedback. Do you already use Kafka in the Machine Learning space? What components in addition to Kafka core do you use? Feel free to contact me to discuss this in more detail.

The post Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem appeared first on Kai Waehner.

Apache Kafka + Kafka Streams + Mesos / DCOS = Scalable Microservices https://www.kai-waehner.de/blog/2017/10/27/mesos-kafka-streams-scalable-microservices/ Fri, 27 Oct 2017 08:05:16 +0000 http://www.kai-waehner.de/blog/?p=1208 Apache Kafka + Kafka Streams + Apache Mesos = Highly Scalable Microservices. Mission-critical deployments via DC/OS and Confluent on premise or public cloud.

I gave a talk at MesosCon 2017 Europe in Prague about building highly scalable, mission-critical microservices with Apache Kafka, Kafka Streams and Apache Mesos / DC/OS. I would like to share the slides and a video recording of the live demo.

Abstract

Microservices bring many benefits, such as agile, flexible development and deployment of business logic. However, a microservice architecture also creates many new challenges, including increased communication between distributed instances, the need for orchestration, new fail-over requirements, and resiliency design patterns.

This session discusses how to build a highly scalable, performant, mission-critical microservice infrastructure with Apache Kafka, Kafka Streams and Apache Mesos respectively DC/OS. Apache Kafka brokers are used as a powerful, scalable, distributed messaging backbone. Kafka's Streams API allows you to embed stream processing directly into any microservice or business application, without the need for a dedicated streaming cluster. Apache Mesos can be used as scalable infrastructure for both the Apache Kafka brokers and the applications using the Kafka Streams API, to leverage the benefits of a cloud-native platform: service discovery, health checks, and fail-over management.
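To make the “no dedicated streaming cluster” point concrete, here is a fragment of such a microservice (topic names and the filter logic are placeholders; the config object is assumed to hold the usual application id and bootstrap servers):

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;

// The complete stream processing "deployment" is just this, inside your
// normal Java application – no separate streaming cluster involved:
StreamsBuilder builder = new StreamsBuilder();
builder.<String, String>stream("orders")
       .filter((key, value) -> value.contains("VALID"))   // placeholder business logic
       .to("validated-orders");

new KafkaStreams(builder.build(), config).start();

Exactly this kind of process is what you then hand over to Marathon or Kubernetes to schedule, scale and restart.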

A live demo shows how to develop real-time applications for your core business with Kafka messaging brokers and the Kafka Streams API. You also see how to deploy, manage and scale them on a DC/OS cluster using different deployment options.

Key takeaways

  • Successful microservice architectures require a highly scalable messaging infrastructure combined with a cloud-native platform which manages distributed microservices
  • Apache Kafka offers a highly scalable, mission-critical infrastructure for distributed messaging and integration
  • Kafka’s Streams API allows you to embed stream processing into any external application or microservice
  • Mesos respectively DC/OS allow management of both Kafka brokers and external applications using the Kafka Streams API, leveraging many built-in benefits like health checks, service discovery and fail-over control of microservices
  • See a live demo which combines the Apache Kafka streaming platform and DC/OS

Architecture: Kafka Brokers + Kafka Streams on Kubernetes and DC/OS

The following picture shows the architecture. You can either run Kafka brokers and Kafka Streams microservices natively on DC/OS via Marathon or leverage Kubernetes as Docker container orchestration tool (which is also supported by Mesosphere in the meantime).


Architecture - Kafka Streams, Kubernetes and Mesos / DCOS

Slides

Here are the slides from my talk:

Live Demo

The following video shows the live demo. It is built on AWS using Mesosphere’s CloudFormation script to set up a DC/OS cluster in ten minutes.

Here, I deployed both – Kafka brokers and Kafka Streams microservices – directly to DC/OS without leveraging Kubernetes. I expect many people to continue deploying Kafka brokers directly on DC/OS. For microservices, many teams might move to the following stack: Microservice –> Docker –> Kubernetes –> DC/OS.
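For illustration, a Marathon app definition for such a Kafka Streams microservice could look roughly like this (the image name and resource values are placeholders):

{
  "id": "/kafka-streams-microservice",
  "instances": 3,
  "cpus": 1,
  "mem": 1024,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "my-registry/kafka-streams-app:1.0"
    }
  }
}

Scaling out is then just a matter of increasing the instances count; Marathon restarts failed containers automatically, which gives you the fail-over behavior mentioned above.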

Do you also use Apache Mesos respectively DC/OS to run Kafka? Only the brokers, or also Kafka clients (producers, consumers, Streams, Connect, KSQL, etc.)? Or do you prefer another tool like Kubernetes (maybe on DC/OS)?


The post Apache Kafka + Kafka Streams + Mesos / DCOS = Scalable Microservices appeared first on Kai Waehner.
