API Archives - Kai Waehner
https://www.kai-waehner.de/blog/category/api/
Technology Evangelist - Big Data Analytics - Middleware - Apache Kafka

Agentic AI with the Agent2Agent Protocol (A2A) and MCP using Apache Kafka as Event Broker
https://www.kai-waehner.de/blog/2025/05/26/agentic-ai-with-the-agent2agent-protocol-a2a-and-mcp-using-apache-kafka-as-event-broker/
Mon, 26 May 2025 05:32:01 +0000

Agentic AI is emerging as a powerful pattern for building autonomous, intelligent, and collaborative systems. To move beyond isolated models and task-based automation, enterprises need a scalable integration architecture that supports real-time interaction, coordination, and decision-making across agents and services. This blog explores how the combination of Apache Kafka, the Model Context Protocol (MCP), and Google’s Agent2Agent (A2A) protocol forms the foundation for Agentic AI in production. By replacing point-to-point APIs with event-driven communication as the integration layer, enterprises can achieve decoupling, flexibility, and observability—unlocking the full potential of AI agents in modern enterprise environments.

Agentic AI is gaining traction as a design pattern for building more intelligent, autonomous, and collaborative systems. Unlike traditional task-based automation, agentic AI involves intelligent agents that operate independently, make contextual decisions, and collaborate with other agents or systems—across domains, departments, and even enterprises.

In the enterprise world, agentic AI is more than just a technical concept. It represents a shift in how systems interact, learn, and evolve. But unlocking its full potential requires more than AI models and point-to-point APIs—it demands the right integration backbone.

That’s where Apache Kafka as an event broker for true decoupling comes into play, together with two emerging AI standards: Google’s Agent2Agent (A2A) protocol and Anthropic’s Model Context Protocol (MCP), in an enterprise architecture for Agentic AI.

Agentic AI with Apache Kafka as Event Broker Combined with MCP and A2A Protocol

Inspired by my colleague Sean Falconer’s blog post, “Why Google’s Agent2Agent Protocol Needs Apache Kafka”, this blog post explores the adoption of Agentic AI in enterprises and how an event-driven architecture with Apache Kafka fits into the AI architecture.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including various AI examples across industries.

Business Value of Agentic AI in the Enterprise

For enterprises, the promise of agentic AI is compelling:

  • Smarter automation through self-directed, context-aware agents
  • Improved customer experience with faster and more personalized responses
  • Operational efficiency by connecting internal and external systems more intelligently
  • Scalable B2B interactions that span suppliers, partners, and digital ecosystems

But none of this works if systems are coupled by brittle point-to-point APIs, slow batch jobs, or disconnected data pipelines. Autonomous agents need continuous, real-time access to events, shared state, and a common communication fabric that scales across use cases.

Model Context Protocol (MCP) + Agent2Agent (A2A): New Standards for Agentic AI

The Model Context Protocol (MCP), introduced by Anthropic, offers a standardized, model-agnostic interface for context exchange between AI agents and external systems. Whether the interaction is streaming, batch, or API-based, MCP abstracts how agents retrieve inputs, send outputs, and trigger actions across services. This enables real-time coordination between models and tools—improving autonomy, reusability, and interoperability in distributed AI systems.

Model Context Protocol MCP by Anthropic
Source: Anthropic

Google’s Agent2Agent (A2A) protocol complements this by defining how autonomous software agents can interact with one another in a standard way. A2A enables scalable agent-to-agent collaboration—where agents discover each other, share state, and delegate tasks without predefined integrations. It’s foundational for building open, multi-agent ecosystems that work across departments, companies, and platforms.

Agent2Agent A2A Protocol by Google and MCP
Source: Google

Why Apache Kafka Is a Better Fit Than an API (HTTP/REST) for A2A and MCP

Most enterprises today use HTTP-based APIs to connect services—ideal for simple, synchronous request-response interactions.

In contrast, Apache Kafka is a distributed event streaming platform designed for asynchronous, high-throughput, and loosely coupled communication—making it a much better fit for multi-agent (A2A) and agentic AI architectures.

API-Based Integration         | Kafka-Based Integration
------------------------------|------------------------------------------------
Synchronous, blocking         | Asynchronous, event-driven
Point-to-point coupling       | Loose coupling with pub/sub topics
Hard to scale to many agents  | Supports multiple consumers natively
No shared memory              | Kafka retains and replays event history
Limited observability         | Full traceability with schema registry & DLQs

Kafka serves as the decoupling layer. It becomes the place where agents publish their state, subscribe to updates, and communicate changes—independently and asynchronously. This enables multi-agent coordination, resilience, and extensibility.

MCP + Kafka = Open, Flexible Communication

As the adoption of Agentic AI accelerates, there’s a growing need for scalable communication between AI agents, services, and operational systems. The Model Context Protocol (MCP) is emerging as a standard to structure these interactions—defining how agents access tools, send inputs, and receive results. But a protocol alone doesn’t solve the challenges of integration, scaling, or observability.

This is where Apache Kafka comes in.

By combining MCP with Kafka, agents can interact through a Kafka topic—fully decoupled, asynchronous, and in real time. Instead of direct, synchronous calls between agents and services, all communication happens through Kafka topics, using structured events based on the MCP format.

This model supports a wide range of implementations and tech stacks. For instance:

  • A Python-based AI agent deployed in a SaaS environment
  • A Spring Boot Java microservice running inside a transactional core system
  • A Flink application deployed at the edge performing low-latency stream processing
  • An API gateway translating HTTP requests into MCP-compliant Kafka events

Regardless of where or how an agent is implemented, it can participate in the same event-driven system. Kafka ensures durability, replayability, and scalability. MCP provides the semantic structure for requests and responses.
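
To make this concrete, here is a minimal sketch (not from the original post) of an agent publishing a request event to a Kafka topic with the plain Java producer API. The broker address, the topic name `agent.requests`, and the JSON payload are assumptions for illustration; the payload is only loosely inspired by MCP-style tool invocation and is not the official MCP wire format.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class AgentRequestProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        // Simplified JSON payload loosely inspired by MCP-style tool invocation;
        // NOT the official MCP wire format.
        String event = "{\"agentId\":\"fraud-agent-1\"," +
                       "\"tool\":\"credit-score-lookup\"," +
                       "\"input\":{\"customerId\":\"4711\"}}";

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by agent id keeps all events of one agent in the same partition (ordering).
            producer.send(new ProducerRecord<>("agent.requests", "fraud-agent-1", event));
            producer.flush();
        }
    }
}
```

Any other agent or service can subscribe to the same topic with its own consumer group and react asynchronously, which is exactly the decoupling described above.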

Agentic AI with Apache Kafka as Event Broker

The result is a highly flexible, loosely coupled architecture for Agentic AI—one that supports real-time processing, cross-system coordination, and long-term observability. This combination is already being explored in early enterprise projects and will be a key building block for agent-based systems moving into production.

Stream Processing as the Agent’s Companion

Stream processing technologies like Apache Flink or Kafka Streams allow agents to:

  • Filter, join, and enrich events in motion
  • Maintain stateful context for decisions (e.g., real-time credit risk)
  • Trigger new downstream actions based on complex event patterns
  • Apply AI directly within the stream processing logic, enabling real-time inference and contextual decision-making with embedded models or external calls to a model server, vector database, or any other AI platform

Agents don’t need to manage all logic themselves. The data streaming platform can pre-process information, enforce policies, and even trigger fallback or compensating workflows—making agents simpler and more focused.
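
As an illustration, the following Kafka Streams sketch shows how such pre-processing could look: events are filtered and routed before any agent sees them. The application id, topic names, and the string-based filter are assumptions; a real deployment would typically use typed serdes and a schema instead of substring matching.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import java.util.Properties;

public class AgentEventRouter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "agent-event-router"); // assumption
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumption
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> requests = builder.stream("agent.requests"); // hypothetical topic

        // Pre-filter events so downstream agents only see what they need,
        // e.g. route high-risk requests to a dedicated topic.
        requests.filter((agentId, json) -> json.contains("\"risk\":\"high\""))
                .to("agent.requests.high-risk");                             // hypothetical topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```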

Technology Flexibility for Agentic AI Design with Data Contracts

One of the biggest advantages of a Kafka-based, event-driven, and decoupled backend for agentic systems is that agents can be implemented in any stack:

  • Languages: Python, Java, Go, etc.
  • Environments: Containers, serverless, JVM apps, SaaS tools
  • Communication styles: Event streaming, REST APIs, scheduled jobs

The Kafka topic is the stable data contract for quality and policy enforcement. Agents can evolve independently, be deployed incrementally, and interoperate without tight dependencies.

Microservices, Data Products, and Reusability – Agentic AI Is Just One Piece of the Puzzle

To be effective, Agentic AI needs to connect seamlessly with existing operational systems and business workflows.

Kafka topics enable the creation of reusable data products that serve multiple consumers—AI agents, dashboards, services, or external partners. This aligns perfectly with data mesh and microservice principles, where ownership, scalability, and interoperability are key.

Agent2Agent Protocol (A2A) and MCP via Apache Kafka as Event Broker for Truly Decoupled Agentic AI

A single stream of enriched order events might be consumed via a single data product by:

  • A fraud detection agent
  • A real-time alerting system
  • An agent triggering SAP workflow updates
  • A lakehouse for reporting and batch analytics

This one-to-many model is the opposite of traditional REST designs and crucial for enabling agentic orchestration at scale.
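
A hedged sketch of this fan-out with the plain Java consumer API: each consumer group maintains its own offsets on the same topic, so a fraud detection agent and an alerting service read the full stream independently. The topic name `orders.enriched`, the group ids, and the broker address are assumptions.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class OrderEventsConsumer {
    // Each group id gets its own independent view of the full stream.
    static KafkaConsumer<String, String> consumerFor(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("group.id", groupId);
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("orders.enriched")); // hypothetical data product topic
        return consumer;
    }

    public static void main(String[] args) {
        // Start one instance with "fraud-detection-agent" and another with "real-time-alerting";
        // both receive every enriched order event without interfering with each other.
        try (KafkaConsumer<String, String> consumer = consumerFor("fraud-detection-agent")) {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("order %s: %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```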

Agentic AI Needs Integration with Core Enterprise Systems

Agentic AI is not a standalone trend—it’s becoming an integral part of broader enterprise AI strategies. While this post focuses on architectural foundations like Kafka, MCP, and A2A, it’s important to recognize how this infrastructure complements the evolution of major AI platforms.

Leading vendors such as Databricks, Snowflake, and others are building scalable foundations for machine learning, analytics, and generative AI. These platforms often handle model training and serving. But to bring agentic capabilities into production—especially for real-time, autonomous workflows—they must connect with operational, transactional systems and other agents at runtime. (See also: Confluent + Databricks blog series | Apache Kafka + Snowflake blog series)

This is where Kafka as the event broker becomes essential: it links these analytical backends with AI agents, transactional systems, and streaming pipelines across the enterprise.

At the same time, enterprise application vendors are embedding AI assistants and agents directly into their platforms:

  • SAP Joule / Business AI – Embedded AI for finance, supply chain, and operations
  • Salesforce Einstein / Copilot Studio – Generative AI for CRM and sales automation
  • ServiceNow Now Assist – Predictive automation across IT and employee services
  • Oracle Fusion AI / OCI – ML for ERP, HCM, and procurement
  • Microsoft Copilot – Integrated AI across Dynamics and Power Platform
  • IBM watsonx, Adobe Sensei, Infor Coleman AI – Governed, domain-specific AI agents

Each of these solutions benefits from the same architectural foundation: real-time data access, decoupled integration, and standardized agent communication.

Whether deployed internally or sourced from vendors, agents need reliable event-driven infrastructure to coordinate with each other and with backend systems. Apache Kafka provides this core integration layer—supporting a consistent, scalable, and open foundation for agentic AI across the enterprise.

Agentic AI Requires Decoupling – Apache Kafka Supports A2A and MCP as an Event Broker

To deliver on the promise of agentic AI, enterprises must move beyond point-to-point APIs and batch integrations. They need a shared, event-driven foundation that enables agents (and other enterprise software) to work independently and together—with shared context, consistent data, and scalable interactions.

Apache Kafka provides exactly that. Combined with MCP and A2A for standardized Agentic AI communication, Kafka unlocks the flexibility, resilience, and openness needed for next-generation enterprise AI.

It’s not about picking one agent platform—it’s about giving every agent the same, reliable interface to the rest of the world. Kafka is that interface.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including various AI examples across industries.

Request-Response with REST/HTTP vs. Data Streaming with Apache Kafka – Friends, Enemies, Frenemies?
https://www.kai-waehner.de/blog/2022/08/12/request-response-with-rest-http-vs-data-streaming-with-apache-kafka/
Fri, 12 Aug 2022 06:28:02 +0000

Request-response communication with REST / HTTP is simple, well understood, and supported by most technologies, products, and SaaS cloud services. Data streaming with Apache Kafka, in contrast, is a fundamental shift toward processing data continuously. HTTP and Kafka complement each other in various ways. This post explores the architectures and use cases for leveraging request-response together with data streaming in the control plane for management or in the data plane for producing and consuming events.

Kafka versus HTTP REST API

Request-response (HTTP) versus data streaming (Apache Kafka)

Prior to discussing the relationship between HTTP/REST and Apache Kafka, let’s explore the concepts behind both. Traditionally, request-response and data streaming are two different paradigms.

Request-response with REST/HTTP

The following characteristics make HTTP so prevalent in software engineering for request-response (aka request-reply) communication:

  • The foundation of data communication for the World Wide Web
  • The standard application layer protocol in the internet protocol suite, commonly known as TCP/IP
  • Simple and well understood
  • Supported by most open source frameworks, proprietary products, and SaaS cloud services
  • Pre-defined API with GET, POST, PUT, and DELETE methods
  • Typically synchronous communication, but chunked transfer encoding (i.e., streaming) is also possible
  • Point-to-point message exchange between two applications (like a client and server or two independent microservices)

Data streaming with Apache Kafka

HTTP is about communication between two applications. In contrast, data streaming is much more than just data communication between a client and a server. Hence, data streaming platforms like Apache Kafka have very different characteristics:

  • No official web standard like HTTP
  • Plenty of open-source and proprietary implementations exist
  • An event-driven system with asynchronous communication using truly decoupled producers and consumers due to the storage of the streaming platform
  • General-purpose events instead of pre-defined APIs – contract management using schemas is crucial in larger projects for API enforcement and data governance
  • Continuous data processing in real-time at any scale – a fundamental change for developers that are used to web services and databases for building applications
  • Backpressure handling, slow consumers, and replayability of historical events are core concepts built-in out-of-the-box
  • Data integration and data processing capabilities are built into the data streaming platform, i.e., it is not just a message queue

Please note that this article specifically discusses Apache Kafka as it is the established de facto standard for data streaming. It powers most data streaming distributions like Confluent, Red Hat, IBM, Cloudera, TIBCO, and many more, plus cloud services like Confluent Cloud and Amazon MSK. Nevertheless, other frameworks and cloud services like Apache Pulsar, Redpanda, Apache Flink, AWS Kinesis, and many other data streaming technologies follow the same principles. Just be aware of the technical differences and trade-offs between data streaming products.

Request-response and data streaming are complementary

Most architectures need request-response for point-to-point communication (e.g., between a server and mobile app) and data streaming for continuous event processing.

Event sourcing with CQRS is the better design for most data streaming scenarios. However, developers can implement the request-response message exchange pattern natively with Apache Kafka.

Nevertheless, direct HTTP communication with Kafka is the easier and often better approach for appropriate use cases. With this in mind, let’s look at use cases where HTTP is used with Kafka and how they complement each other.

Why is HTTP / REST so popular?

Most developers and administrators are familiar with REST APIs. They are the natural option for many best practices and security guidelines. Here are some good reasons why this will not change in the future:

  • Avoiding technology lock-in: Sometimes, you want to embed the communication or proxy it with a more agnostic API.
  • Familiarity with a known technology: Developers are familiar with REST endpoints, and if they are under pressure or need a quick result, using them is quicker than learning a new API.
  • Supported by almost all products: Most open-source frameworks, commercial products, and SaaS cloud provide HTTP APIs.
  • Security: HTTP ports are much easier to open by security teams compared to the TCP ports of the Kafka-native protocol used by client APIs from programming languages such as Java, Go, C++, or Python. For instance, in DMZ pass-through requirements, InfoSec owns the F5 proxies in the DMZ. A Kafka REST Proxy makes the integration easier.
  • Domain-driven design (DDD): Often, HTTP/REST and Kafka are combined to leverage the best of both worlds: Kafka for decoupling and HTTP for synchronous client-server communication. A service mesh using Kafka with REST APIs is a common architecture.

Other great implementations exist for request-response communication. For instance:

  • gRPC: Efficient request-response communication via a cross-platform open source high-performance Remote Procedure Call framework
  • GraphQL: Data query and manipulation language for APIs and a runtime for fulfilling queries with existing data

Nevertheless, HTTP is the first choice in most projects. gRPC, GraphQL, or other implementations are chosen for specific problems if HTTP is not good enough.

Use cases for HTTP / REST APIs with Apache Kafka

RESTful interfaces to an Apache Kafka cluster make it easy to produce and consume messages, view the cluster’s metadata, and perform administrative actions using standard HTTP(S) instead of the native TCP-based Kafka protocol or clients.

Each scenario differs significantly in its purpose. Some use cases are implemented out of convenience, while others are required because of technical specifications.

There are two major categories of use cases for combining HTTP with Kafka. In terms of cloud-native architectures, this can be divided into the management plane (i.e., administration and configuration) and the data plane (i.e., producing and consuming data).

Management plane with HTTP and Kafka

The management and administration of a Kafka cluster involve various tasks, such as:

  • Cluster management: Creation of a cluster, expanding or shrinking a cluster, etc.
  • Cluster configuration: Management of Kafka topics, consumer groups, key management, role-based access control (RBAC), etc.
  • CI/CD and DevOps integration: HTTP APIs are the most popular way to build delivery pipelines and automate administration, instead of using Python or other alternative scripting options.
  • Data governance: Tools for data lineage, data catalogs, and policy enforcement need to be configured using APIs.
  • 3rd party monitoring integration: Connect metrics APIs, alerts, and other notifications into systems like Datadog, Slack, etc.
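
For example, a simple management-plane call can be an HTTP GET against the REST Proxy. The sketch below assumes a Confluent REST Proxy (or compatible HTTP gateway) listening on localhost:8082 and uses the v2-style `GET /topics` endpoint to list topic names; verify the exact path against the proxy version you run.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ListTopics {
    public static void main(String[] args) throws Exception {
        // Assumption: REST Proxy (or a compatible gateway) runs on localhost:8082.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8082/topics"))
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Expected: a JSON array of topic names, e.g. ["orders","payments"]
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```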

Data plane with HTTP and Kafka

Many scenarios require or prefer the usage of REST APIs for producing and consuming messages to/from Kafka, such as

  • Natural request-response applications such as mobile apps: These applications and their frameworks almost always require integration via HTTP and request-response. WebSockets, Server-Sent Events (SSE), and similar concepts are a better fit for data streaming with Kafka, but they are often not supported by the client frameworks.
  • Legacy application and third-party tool integration: Legacy applications, standard software, and traditional middleware are often proprietary. The only integration capabilities are HTTP/REST. Extract, transform, load (ETL), enterprise service bus (ESB), and other third-party tools are complementary to data streaming with Kafka. Mainframe integration using REST APIs from COBOL to Kafka is another example.
  • API gateway: Most API management tools do not provide native support for data streaming and Kafka today and only work on top of REST interfaces. Kafka (via the REST interface) and API management are still very complementary for some use cases, such as service monetization or integration with partner systems.
  • Other programming languages: Kafka provides Java and Scala clients. Confluent provides and supports additional clients, including Python, .NET, C, C++, and Go. More Kafka clients exist from the community, including Erlang, Kotlin, Node.js, PHP, Ruby, and Rust. Many of these community clients are not battle tested or supported. Therefore, calling the REST API from your favorite programming language is sometimes the better and easier option. Others, such as COBOL on the mainframe, don’t even provide a Kafka client at all. Hence, a REST API is the only viable solution.

Example: HTTP + Kafka with Confluent REST Proxy

The Confluent REST Proxy has been around for a long time and is available under the Confluent Community License. Many companies use it in production as a management plane and data plane as a self-managed component in conjunction with open source Apache Kafka, Confluent Platform, or Confluent Cloud.

I am not a lawyer, but the short version is that you can use the Confluent REST Proxy for free – even for your production workloads at any scale – as long as you don’t build a competitive cloud service with it (say, e.g., the “AWS Kafka HTTP Proxy”) and charge per hour or volume for the serverless offering.

The Confluent REST Proxy and REST APIs are separated into both a data plane and a management plane:

Data plane: Produce, Consume | Management Plane: Brokers, Topics, Consumer Groups, ACLs

While some applications require both, in many scenarios, only one or the other is used.

The management plane is typically used for very low throughput and a few API calls. The data plane, on the other hand, varies. Many applications produce and consume data continuously. The biggest limitation of the REST Proxy data plane is that it is a synchronous request-response protocol.
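
As a data-plane example, here is a minimal sketch of producing a JSON record through the REST Proxy with Java's built-in HTTP client. The proxy address, topic name, and payload are assumptions; the request shape follows the v2-style produce endpoint (POST /topics/{topic} with a vendor-specific content type), so verify the details against your proxy version.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestProxyProduce {
    public static void main(String[] args) throws Exception {
        // JSON-embedded records as expected by the v2-style produce endpoint.
        String body = "{\"records\":[{\"key\":\"order-42\",\"value\":{\"status\":\"CREATED\"}}]}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8082/topics/orders")) // assumption: local proxy, topic "orders"
                .header("Content-Type", "application/vnd.kafka.json.v2+json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The synchronous response contains partition and offset metadata per record.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```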

What scale and volumes does a REST Proxy for Kafka support?

Don’t underestimate the power of the REST Proxy as a data plane: Kafka provides batch capabilities, and you can scale out to many parallel REST Proxy instances. There are deployments where four REST Proxy instances handle ~20,000 events per second, which is sufficient for many use cases.

Many HTTP use cases do not require millions of events per second. Hence, the Confluent REST Proxy is often good enough. This is even true for many IoT use cases I have seen in the wild where devices or machines connect to Kafka via HTTP.

How does HTTP’s streaming data transfer fit into the architecture?

Please note that chunked transfer encoding is a streaming data transfer mechanism available in version 1.1 of HTTP. In chunked transfer encoding, the data stream is divided into a series of non-overlapping “chunks”. The chunks are sent out and received independently of one another.

Some Kafka REST Produce APIs support a streaming mode that allows sending multiple records over a single stream. The stream remains open unless explicitly terminated. The streaming mode can be achieved by setting an additional header “Transfer-Encoding: chunked” on the initial request. Check if your favorite Kafka proxy or cloud API supports the HTTP streaming mode.
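
As a rough illustration of the mechanism (not tied to a specific product), the sketch below pushes several records over one HTTP/1.1 connection with Java's HTTP client; because the body length is unknown, the client sends the request with chunked transfer encoding. The endpoint path and record format are hypothetical; replace them with whatever streaming-capable produce API your proxy or cloud service documents.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;

public class ChunkedProduce {
    public static void main(String[] args) throws Exception {
        // Several records streamed over a single connection; the total length is unknown,
        // so HTTP/1.1 chunked transfer encoding is used automatically.
        List<InputStream> chunks = List.of(
                new ByteArrayInputStream("{\"value\":{\"sensor\":1,\"temp\":21.5}}\n".getBytes(StandardCharsets.UTF_8)),
                new ByteArrayInputStream("{\"value\":{\"sensor\":1,\"temp\":21.7}}\n".getBytes(StandardCharsets.UTF_8)));
        InputStream body = new SequenceInputStream(Collections.enumeration(chunks));

        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1) // chunked encoding is an HTTP/1.1 concept
                .build();

        HttpRequest request = HttpRequest.newBuilder()
                // Hypothetical streaming-capable produce endpoint - check your proxy or cloud API docs.
                .uri(URI.create("http://localhost:8082/v3/clusters/my-cluster/topics/sensors/records"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofInputStream(() -> body))
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```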

Architecture options for Kafka + REST Proxy

Different deployment options exist for the Confluent REST Proxy:

REST Proxy Deployment Options for Apache Kafka and HTTP

The self-managed REST Proxy instance or cluster of instances (as a “dedicated node”) is still decoupled from the open source Kafka broker or commercial Confluent Server. This is the ideal option for a data plane to produce and consume messages.

The management plane is also embedded as a unified REST API into Confluent Server (as a “broker plugin”) and Confluent Cloud for administrative operations. This simplifies the architecture because no additional nodes are required for using the administration APIs.

In some deployments, both approaches may be combined: The management plane is used via the embedded REST APIs in Confluent Server or in Confluent Cloud. Meanwhile, data plane use cases are decoupled into their own REST Proxy instances to easily handle scalability and be independent of the server side.

The developer does not have to care about the infrastructure or architecture for REST APIs in Confluent Cloud. The HTTP interfaces are fully managed by the vendor.

The REST APIs of the self-managed REST Proxy and Confluent Cloud are compatible. Hybrid architectures and cloud migration are possible without implementing any breaking changes.

Data governance for data streaming and HTTP services with a Schema Registry

Data governance is an important part of most data streaming projects. Kafka deployments usually include various decoupled producers and consumers, often following the DDD principle for microservice architectures. Hence, Confluent Schema Registry is used in most projects for schema enforcement and versioning.

Any Kafka client built by Confluent can leverage the Schema Registry using Avro, Protobuf, or JSON Schema. This includes programming APIs like Java, Python, or Go, but also Kafka Connect sources and sinks, Kafka Streams, ksqlDB, and the Confluent REST Proxy. Like the REST Proxy, Schema Registry is available under the Confluent Community License.

Schema Registry lives separately from your Kafka brokers. Confluent REST Proxy still talks to Kafka to publish and read data (messages) to topics. Concurrently, the REST Proxy can also talk to Schema Registry to send and retrieve schemas that describe the data models for the messages.

Schema Registry provides a serving layer for your metadata and enables data governance and schema enforcement for all events. It provides a RESTful interface for storing and retrieving Avro, JSON Schema, and Protobuf schemas. The Schema Registry stores a versioned history of all schemas based on a specified subject name strategy, provides multiple compatibility settings, and allows schemas to evolve according to the configured compatibility settings, with expanded support for these schema types. It also provides serializers that plug into Kafka clients and handle schema storage and retrieval for Kafka messages sent in any of the supported formats.


Schema enforcement happens on the client side. Additionally, Confluent Platform and Confluent Cloud provide server-side schema validation. The latter is helpful if incorrect or malicious client applications send messages to Kafka without using the client-side Schema Registry integration.
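
For illustration, here is a minimal sketch of client-side schema enforcement with the Confluent Avro serializer: the producer is configured with `schema.registry.url`, and the serializer registers and validates the schema on produce. Broker and registry addresses, the topic, and the Avro schema are assumptions; the example additionally requires the kafka-avro-serializer dependency.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class AvroOrderProducer {
    private static final String ORDER_SCHEMA =
            "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[" +
            "{\"name\":\"id\",\"type\":\"string\"}," +
            "{\"name\":\"amount\",\"type\":\"double\"}]}";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                  // assumption
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        // The Confluent Avro serializer talks to Schema Registry on serialization.
        props.put("value.serializer",
                  "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");         // assumption

        Schema schema = new Schema.Parser().parse(ORDER_SCHEMA);
        GenericRecord order = new GenericData.Record(schema);
        order.put("id", "order-42");
        order.put("amount", 99.95);

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-42", order)); // hypothetical topic
        }
    }
}
```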

API Management and data sharing

API Management is a term from the Open API world that puts a layer on top of HTTP / REST APIs for the management, monitoring, and monetization of APIs. Solutions include Apigee, MuleSoft Anypoint, Kong, IBM API Connect, TIBCO Mashery, and many more.

Features of API gateways and API management products

API Gateway and API Management Tools provide many outstanding features:

  • API Portal for creating and publishing APIs
  • Enforcing usage policies and controlling access
  • Technical features for data transformations
  • Nurturing the subscriber community
  • Collecting and analyzing usage statistics
  • Reporting on performance
  • Monetization and billing

These features are unavailable in data streaming platforms like Kafka, Pulsar, Kinesis, et al. On the other side, API tools like MuleSoft or Kong are not built for processing real-time data at scale with low latency.

API Management and data streaming are complementary

Hence, API Management and data streaming are complementary, not competitive! The blog post “Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies?” explores this:

Integration of Kafka and API Management Tools using REST HTTP

API == REST/HTTP for most API Management products and related API gateways. Vendors are starting to integrate either the Kafka API or event standards like AsyncAPI to get into event-based architectures. That’s great news!

Sharing of streaming data requires a stream data exchange instead of HTTP APIs

Data sharing becomes crucial to modern and flexible enterprise architectures that build on concepts like microservices and data mesh. Real-time data beats slow data. That’s not just true for applications but also for data replication across business units, organizations, B2B, clouds, hybrid environments, and other scenarios. Therefore, the next generation of data sharing is built on top of data streaming.

HTTP APIs make little sense in many data streaming scenarios, especially if you expect high volumes or require low latency. Hence, data sharing in real-time by linking Kafka clusters or using a stream data exchange is the much better approach:

Stream Data Exchange and Data Sharing with Apache Kafka instead of REST and HTTP

I won’t go into more detail here. The dedicated blog post “Streaming Data Exchange with Kafka for Real-Time Data Sharing” explores the idea, its trade-offs, and some real-world examples.

Not one or the other, instead combine Kafka and HTTP/REST in your projects!

Various use cases employ HTTP/REST with Apache Kafka as a management plane or data plane. This combination will not go away in the future.

The Confluent REST Proxy can be used for HTTP(S) communication with your favorite client interface. No matter if you run open-source Apache Kafka, Confluent Platform, or Confluent Cloud. Check out the source code on GitHub or get started with an intuitive tutorial.

How do you combine data streaming with request-response principles? How do you combine HTTP and Kafka? What proxy are you using? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Apache Kafka in the Public Sector – Part 2: Smart City
https://www.kai-waehner.de/blog/2021/10/12/apache-kafka-public-sector-government-part-2-smart-city-iot-transportation-mobility-services/
Tue, 12 Oct 2021 07:48:48 +0000

The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores how the public sector leverages data in motion powered by Apache Kafka to add value for innovative new applications and to modernize legacy IT infrastructures. This post is part 2: Use cases and architectures for a Smart City.

Apache Kafka in the Public Sector for Smart City Infrastructure

Blog series: Apache Kafka in the Public Sector and Government

This blog series explores why many governments and public infrastructure sectors leverage event streaming for various use cases. Learn about real-world deployments and different architectures for Kafka in the public sector:

  1. Life is a Stream of Events
  2. Smart City (THIS POST)
  3. Citizen Services
  4. Energy and Utilities
  5. National Security

Subscribe to my newsletter to get updates immediately after publication. I will also update the above list with direct links to the posts of this blog series once they are published.

As a side note: if you wonder why healthcare is not on the above list, that’s because healthcare is another blog series on its own. While the government can provide public health care through national healthcare systems, it is part of the private sector in many other cases.

Real-time is Mandatory for a Smart City Everywhere

I wrote a lot about event streaming and Apache Kafka for smart city infrastructure and use cases. I won’t repeat myself. Check out the posts “Event Streaming with Kafka as Foundation for a Smart City” and “Apache Kafka and MQTT for the Last Mile IoT Integration in a Smart City”.

This post dives deeper into architectural questions and how collaboration with 3rd party services can look from the perspective of the government and the public administration of a smart city.

The Need for Real-time Data Processing Everywhere in a Smart City and how Kafka helps

A smart city is a very complex beast. I am glad that I only cover technology and not regulatory or political discussions. However, even the technology standpoint is not straightforward. A smart city needs to correlate data across data centers, devices, vehicles, and many other things. This scenario is a true Internet of Things (IoT) environment and therefore includes plenty of different technologies, communication paradigms, and infrastructures:

Hybrid Edge Cloud Architecture for a Smart City with Apache Kafka

Smart city projects require the integration of various 1st party and 3rd party services. Most use cases only work well if that data is correlated in real-time; think about traffic routing, emergency alerts, predictive monitoring and maintenance, mobility services such as ride-hailing, and other fancy smart city use cases. Without real-time data processing, the use case is either a bad user experience or not cost-efficient. Hence, Kafka is adopted more and more for these scenarios.

Low Latency and 5G Networks for (some) Data Streaming Use Cases

The term “real-time” needs to be defined. Processing data in a few seconds is good enough in most use cases and a significant game-changer compared to hourly, daily, or weekly batch processing.

Having said this, some use cases like location-based upselling in retail or condition monitoring in equipment and manufacturing require lower latency, meaning sub-second end-to-end data processing.

Here is an example of leveraging 5G networks for low latency. The demo was built by the AWS Wavelength team, Verizon, and Confluent:

Connected Hybrid Services and Low Latency via Open API

Most real-world deployments use separation of concerns: Low-latency use cases run at the edge and everything else in the regular data center or public cloud region. Read the article “Low Latency Data Streaming with Apache Kafka and Cloud-Native 5G Infrastructure” for more details.

At this point, it is important to remind everybody that Kafka (and any IT software) is not hard real-time and not built for the OT world and embedded systems. Learn more in the article “Kafka is NOT hard real-time but soft real-time“. Also, (soft) real-time does not compete with batch processing and data warehouse/data lake architectures. As you can learn in “Serverless Kafka in a Cloud-native Data Lake Architecture”, it is complementary.

Collaboration between Government, City, and 3rd Party via Open API

Real-time data processing is crucial in implementing smart city use cases. Additionally, most smart city projects require collaboration between different teams, infrastructures, and 3rd party services.

Let’s take a look at three very different real-world event streaming deployments to see the broad spectrum of use cases and integration challenges:

  • Ohio Department of Transportation’s government-owned event streaming platform
  • Deutsche Bahn’s single source of truth for customer communication in real-time and 3rd party integration with the Google Maps API
  • Free Now’s mobility service in the cloud for real-time data correlation in compliance with regional laws and independent vehicles/drivers.

Ohio Department of Transportation (ODOT) – A Government-Owned Event Streaming Platform

Ohio Department of Transportation (ODOT) has an exciting initiative: DriveOhio. It aims to organize and accelerate smart vehicle and connected vehicle projects in the State of Ohio. DriveOhio offers to be the single point of contact for policymakers, agencies, researchers, and private companies to collaborate with one another on intelligent transportation efforts around the state.

ODOT presented their real-time transportation data platform at the last Kafka Summit Americas:

Apache Kafka in Public Sector Government and Smart City at Ohio Department of Transportation

The whole Kafka ecosystem powers ODOT’s cloud-native Event Streaming Platform (ESP). The platform enables continuous data integration and stream processing for transactional and analytical workloads. The ESP runs on Kubernetes to provide an elastic, flexible, and scalable infrastructure for real-time data processing.

Deutsche Bahn – Single Source of Truth and Google Maps Integration in Real-time

Deutsche Bahn is a German railway company. It is a private joint-stock company (AG), with the Federal Republic of Germany being its single shareholder. I already talked about their real-time traveler information system in another blog post: “Mobility Services and Transportation powered by Apache Kafka“.

They leverage the Apache Kafka ecosystem powered by Confluent because it combines several characteristics that you would have to integrate with different technologies otherwise:

  • Real-time messaging
  • Data integration
  • Data correlation
  • Storage and caching
  • Replication and high availability
  • Elastic scalability

This example is excellent for this blog. It shows how an existing solution needs connectivity to other internal applications and 3rd party services to provide a better customer experience and expand the customer base.

Recently, Deutsche Bahn integrated its platform with Google Maps via Google’s Open API. In addition to a better customer experience, the railway company can reach out to many new end-users to expand their business. The Railway-News has a good article about this integration. Here is my summary:

Mobility Service for Traveler Information at Deutsche Bahn with Apache Kafka and Google Maps Integration

Free Now – Mobility Service in the Cloud Connected to Regional Laws and Vehicles

Free Now (formerly MyTaxi) is a mobility service. Their app uses mobile and GPS technology to match taxi drivers with passengers based on availability and proximity. Mobility services need to integrate with other 3rd party services for routing, payment, tax implications, and many different use cases.

Here is one example from Free Now’s Kafka Summit talk where they explain the added value of continuous stream processing for calculating context-specific dynamic pricing:

FREE NOW my taxi Data in Motion with Kafka and Confluent Cloud for Stateful Streaming Analytics

The public administration is always involved when a new mobility service is released to the public. While some cities build their mobility services, the reality is that most governments provide the infrastructure together with the Telco providers, and 3rd party vendors provide the mobility service. The specific relationship between the government, city, and mobility service provider differs across regions, countries, and continents.

Almost every mobility service uses Kafka as its backbone. Google for your favorite mobility service across the globe and add “Kafka” to the search. Chances are very high that you find some excellent blog posts, conferences talks, or at least job offers from the mobility service’s recruiting page. Here are just a few examples that posted great content about their Kafka usage: Uber, Lyft, Grab, Otonomo, Here Technologies, and many more.

Data in Motion with Kafka for a Connected and Innovative Smart City

Smart City is a vast topic. Many stakeholders are involved. Collaboration and Open APIs are critical for success. In most cases, governments work together with telco providers, infrastructure providers such as the cloud hyperscalers, and software vendors (including an event streaming platform like Kafka).

Most valuable and innovative smart city use cases require data processing in real-time. The use cases require data integration, storage, backpressure handling, and data correlation. Event streaming is the ideal technology for these use cases. Examples from the Ohio Department of Transportation, Deutsche Bahn and its Google Maps integration, and Free Now showed a few different angles to realize successful smart city projects.

How do you leverage event streaming in the public sector? Are you working on smart city projects? What technologies and architectures do you use? What projects did you already work on or are in the planning? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Kafka API is the De Facto Standard API for Event Streaming like Amazon S3 for Object Storage
https://www.kai-waehner.de/blog/2021/05/09/kafka-api-de-facto-standard-event-streaming-like-amazon-s3-object-storage/
Sun, 09 May 2021 14:32:49 +0000

Real-time beats slow data in most use cases across industries. The rise of event-driven architectures and data in motion powered by Apache Kafka enables enterprises to build real-time infrastructure and applications. This blog post explores why the Kafka API became the de facto standard API for event streaming like Amazon S3 for object storage, and the tradeoffs of these standards and corresponding frameworks, products, and cloud services.

De Facto Standard API - Amazon S3 for Object Storage and Apache Kafka for Event Streaming

Event-Driven Architecture: This Time It’s Not A Fad

The Forbes article “Event-Driven Architecture: This Time It’s Not A Fad” from April 2021 explained why enterprises are not just talking about event-driven real-time applications, but finally building them. Here are some arguments:

  • REST limitations can limit your business strategy
  • Data needs to be fluid and real-time
  • Microservices and serverless need event-driven architectures

Real-time Data in Motion beats Slow Data

Use cases for event-driven architectures exist across industries. Some examples:

  • Transportation: Real-time sensor diagnostics, driver-rider match, ETA updates
  • Banking: Fraud detection, trading, risk systems, mobile applications/customer experience
  • Retail: Real-time inventory, real-time POS reporting, personalization
  • Entertainment: Real-time recommendations, a personalized news feed, in-app purchases
  • The list goes on across verticals…

Real-time data in motion beats data at rest in databases or data lakes in most scenarios. There are a few exceptions that require batch processing:

  • Reporting (traditional business intelligence)
  • Batch analytics (processing high volumes of data in a bundle, for instance, Hadoop and Spark’s map-reduce, shuffling, and other data processing only make sense in batch mode)
  • Model training as part of a machine learning infrastructure (while model scoring and monitoring often requires real-time predictions, the model training is batch in almost all currently available ML algorithms)

Beyond these exceptions, almost everything is better in real-time than batch.

Be aware that real-time data processing is more than just sending data from A to B in real-time (aka messaging or pub/sub). Real-time data processing requires integration and processing capabilities. If you send data into a database or data lake in real-time but have to wait until it is processed there in batch, it does not solve the problem.

With the ideas around real-time in mind, let’s explore what a de facto standard API is.

What is a (De Facto) Standard API?

The answer is longer than you might expect and needs to be separated into three sections:

  • API
  • Standard API
  • De facto standard API

What is an API?

An application programming interface (API) is an interface that defines interactions between multiple software applications or mixed hardware-software intermediaries. It defines the kinds of calls or requests that can be made, how to make them, the data formats that should be used, the conventions to follow, etc. It can also provide extension mechanisms so that users can extend existing functionality in various ways and to varying degrees.

An API can be entirely custom, specific to a component, or designed based on an industry-standard to ensure interoperability. Through information hiding, APIs enable modular programming, allowing users to use the interface independently of the implementation.

What is a Standard API?

Industry consortiums or other industry-neutral (often global) groups or organizations specify standard APIs. A few characteristics show the trade-offs:

  • Vendor-agnostic interfaces
  • Slow evolution and long specification process
  • Most vendors add proprietary features because a) the standard specification process is too slow or, more often, b) they want to differentiate their commercial offering
  • Acceptance and success depend on the complexity and added value (this sounds obvious but is often the key blocker for success)

Examples for Standard APIs

Here are some examples of standard APIs. I also add my thoughts if I think they are successful or not (but I fully understand that there are good arguments against my opinion).

Generic Standards
  • SQL: Domain-specific language used in programming and designed for managing data held in a relational database management system. Successful as almost every database somehow supports SQL or tries to build a similar syntax. A good example is ksqlDB, the Kafka-native streaming SQL engine. ksqlDB (like most other streaming SQL engines) is not ANSI SQL, but still understood easily by people that know SQL.
  • J2EE / Java EE / Jakarta EE: Successful as most vendors adopted at least parts of it for Java frameworks. While early versions were very heavyweight and complex, the current APIs and implementations are much more lightweight and user-friendly. JMS is a great example where vendors added proprietary add-ons to add features and differentiate. No vendor lock-in is only true in theory!
  • HTTP: Successful as application layer protocol for distributed, collaborative, hypermedia information systems. While not 100% correct, people typically interpret HTTP as REST Web Services. HTTP is often misused for things it is not built for.
  • SOAP / WSDL: Partly successful in providing XML-based web service standard specifications. Some vendors built good tooling around it. However, this is typically only true for the basic standards such as SOAP and WSDL, not so much for all the other complex add-ons (often called WS-* hell).
Standards for a Specific Problem or Industry
  • OPC-UA for Industrial IoT (IIoT): Partly successful machine-to-machine communication protocol developed for industrial automation. Adopted by almost every vendor in the industrial space. The drawback (similarly to HTTP) is that it is often misused. For instance, MQTT is a much better and more lightweight choice in some scenarios. OPC-UA is a great example where the core is successful, but the industry-specific add-ons are not prevalent and not supported by tools. Also, OPC-UA is too heavyweight for many of the use cases it is used in.
  • PMML for Machine Learning: Not successful as an XML-based predictive model interchange format. The idea is great: Train an analytic model once and then deploy it across platforms and programming languages. In practice, it did not work. Too many limitations and unnecessary complexity for a project. Most real-world machine learning deployments I have seen in the wild avoid it and deploy models to production with a standard wrapper. ONNX and other successors are not more prevalent yet either.

In summary, some standard APIs are successful and adopted well; many others are not. Contrary to these standards specified by consortiums, there is another category emerging: De Facto Standard APIs.

What is a De Facto Standard API?

De facto standard APIs originate from an existing successful solution (that can be an open-source framework, a commercial product, or a cloud service). There are two ways these de facto standard APIs emerge:

  • Driven by a single vendor (often proprietary), for example: Amazon S3 for object storage.
  • Driven by a huge community around a successful open-source project, for example: Apache Kafka for event streaming.

No matter how a de facto standard API originated, they typically have a few characteristics in common:

  • Creation of a new category of software, something that did not exist before
  • Adoption by other frameworks, products, or cloud services as the API because it became the de facto standard
  • No complex, formal, long-running standard processes; hence innovation is possible in a relatively flexible and agile way
  • Practical processes and rules are in place to ensure good quality and consensus (either controlled by the owner company for a proprietary standard API or across the open source community)

Let’s now explore two de facto standard APIs: Amazon S3 and Apache Kafka. Both are very successful but very different regarding being a standard. Hence, the trade-offs are very different.

Amazon S3: De Facto Standard API for Object Storage

Amazon S3 or Amazon Simple Storage Service is a service offered by Amazon Web Services (AWS) that provides object storage through a web service interface in the public AWS cloud. It uses the same scalable storage infrastructure that Amazon.com uses to run its global e-commerce network. Amazon S3 can be employed to store any type of object, which allows for uses like storage for internet applications, backup and recovery, disaster recovery, data archives, data lakes for analytics, and hybrid cloud storage. Additionally, S3 on Outposts provides on-premises object storage for on-premises applications that require high-throughput local processing.

“Amazon CTO on Past, Present, Future of S3” is a great read about the evolution of this fully-managed cloud service. While the public API was kept stable, the internal backend architecture under the hood changed several times significantly. Plus, new features were developed on top of the API, for instance, AWS Athena for analytics and interactive queries using standard SQL. I really like how Werner Vogels describes his understanding of a good cloud service:

Vogels doesn’t want S3 users to even think for a moment about spindles or magnetic hardware. He doesn’t want them to care about understanding what’s happening in those data centers at all. It’s all about the services, the interfaces, and the flexibility of access, preferably with the strongest consistency and lowest latency when it really matters.

So, we are talking about a very successful proprietary cloud service by AWS. Hence, what’s the point?

Most Object Storage Vendors Support the Amazon S3 API

Many enterprises use the Amazon S3 API. Hence, it became the de facto standard. If other storage vendors want to sell object storage, supporting the S3 interface is often crucial to get through the evaluations and RFPs. If you don’t support the S3 API, it is much harder for companies to adopt the storage and implement the integration (as most companies already use Amazon S3 and have built tools, scripts, testing around this API).

For this reason, many applications have been built to support the Amazon S3 API natively. This includes applications that write data to Amazon S3 and Amazon S3-compatible object stores.

S3 compatible solutions include client backup, file browser, server backup, cloud storage, cloud storage gateway, sync&share, hybrid storage, on-premises storage, and more.

Many vendors sell S3-compatible products: Oracle, EMC, Microsoft, NetApp, Western Digital, MinIO, Pure Storage, and many more. Check out the Amazon S3 site from Wikipedia for a more detailed and complete list.
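
To illustrate what “S3-compatible” means in practice, here is a hedged sketch using the AWS SDK for Java v2 against a non-AWS object store by overriding the endpoint. The endpoint, credentials, and bucket are placeholders; depending on the store, you may also need to enable path-style access.

```java
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import java.net.URI;

public class S3CompatibleUpload {
    public static void main(String[] args) {
        // Pointing the standard AWS SDK at a different endpoint is the usual way to talk
        // to S3-compatible stores (MinIO, on-premises appliances, etc.).
        S3Client s3 = S3Client.builder()
                .endpointOverride(URI.create("http://localhost:9000"))         // assumption: local S3-compatible store
                .region(Region.US_EAST_1)                                      // required by the SDK, often ignored by the store
                .credentialsProvider(StaticCredentialsProvider.create(
                        AwsBasicCredentials.create("accessKey", "secretKey"))) // placeholder credentials
                .build();

        s3.putObject(PutObjectRequest.builder()
                        .bucket("backups")                                     // hypothetical bucket
                        .key("2021-05-09/report.json")
                        .build(),
                RequestBody.fromString("{\"status\":\"ok\"}"));
    }
}
```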

So why has the S3 API become so ubiquitous?

The creation of a new software category is a dream for every vendor! Let’s understand how and why Amazon was successful in establishing S3 for object storage. The following is a quote from Chris Evans’ great article from 2016: “Has S3 become the de facto API standard?”

So why has the S3 API become so ubiquitous?  I suspect there are a number of reasons.  These include:

  • First to market – When S3 was launched in 2006, most enterprises were familiar with object storage as “content addressable storage” through EMC’s Centera platform.  Other than that, applications were niche and not widely adopted except for specific industries like High Performance Computing where those users were used to coding to and for the hardware.  S3 quickly became a platform everyone could use with very little investment.  That made it easy to consume and experiment with.  By comparison, even today the leaders in object storage (as ranked by the major analysts) still don’t make it easy (or possible) to download and evaluate their products, even though most are software only implementations.
  • Documentation – following on from the previous point, S3 has always been well documented, with examples on how to run API commands.  There’s a document history listing changes over the past 6-7 years that shows exactly how the API has evolved.
  • A Single Agenda – the S3 API was designed to fit a single agenda – that of storing and retrieving objects from S3.  As such, Amazon didn’t have to design by committee and could implement the features they required and evolve from there.  Contrast that with the CDMI (Cloud Data Management Interface) from SNIA.  The SNIA website is difficult to navigate, the standard itself is only on the 4th published iteration in six years, while the documentation runs to 264 pages! (Note that the S3 API runs into more pages, but is infinitely more consumable, with simple examples from page 11 onwards).

Cons of a Proprietary De Facto Standard like Amazon S3

Many people might say: “Better a proprietary standard than no standard.” I partly agree with this. The possibility to learn one API and use it across multi-cloud and on-premise systems and vendors is great. However, Amazon S3 has several disadvantages as it is NOT an open standard:

  • Other vendors (have to) build their implementation on a best guess about the behavior of the API. There is no official standard specification they can rely on.
  • Customers cannot be sure what they buy. At the very least, they should not expect a 3rd-party S3 implementation to behave exactly the same as Amazon S3 on AWS.
  • Amazon can change APIs and features as it likes. Other vendors need to “reverse engineer the API” and adjust their products.
  • Amazon could sue competitors for using S3 API branding – even though this is not likely to happen as the benefits are probably bigger (I am not a lawyer; hence this statement might be wrong and is just my personal opinion)

Let’s now look at an open-source de facto standard: Kafka.

Kafka API: De Facto Standard API for Event Streaming

Apache Kafka is mainstream today! The Kafka API became the de facto standard for event-driven architectures and event streaming. Two proof points, covered in the following sections: the open Kafka protocol itself, and the broad ecosystem of frameworks, products, and cloud services that implement it.

The Kafka API (aka Kafka Protocol)

Kafka became the de facto event streaming API, just like the S3 API became the de facto standard for object storage. Actually, the situation is even better for the Kafka API: the S3 API is a proprietary protocol from AWS, whereas the Kafka API and protocol are open source under the Apache 2.0 license.

The Kafka protocol covers the wire protocol implemented in Kafka. It defines the available requests, their binary format, and the proper way to make use of them to implement a client.

One of my favorite characteristics of the Kafka protocol is backward compatibility. Kafka has a “bidirectional” client compatibility policy. In other words, new clients can talk to old servers, and old clients can talk to new servers. This allows users to upgrade either clients or servers without experiencing any downtime or data loss. This makes Kafka ideal for microservice architectures and domain-driven design (DDD). Kafka really decouples applications from each other, in contrast to web service/REST-based architectures.
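To illustrate what programming against the Kafka protocol looks like, here is a minimal Java producer sketch using the standard Apache Kafka client library. The broker address and topic name are placeholders; thanks to the bidirectional compatibility policy, such client code typically works unchanged against older and newer broker versions.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class MinimalProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder address - any broker or service that speaks the Kafka protocol
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Send a single event; the wire protocol handles batching, retries, and API versioning
                producer.send(new ProducerRecord<>("demo-topic", "key-1", "hello kafka api"));
                producer.flush();
            }
        }
    }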

Pros of an Open Source De Facto Standard like the Kafka API

The huge benefit of an open-source de facto standard API is that it is open and usually follows a collaborative standardized process to make changes to the API. This brings various benefits to the community and software vendors.

The following facts about the Kafka API make many developers and enterprises happy:

  • Changes occur in a visible process enforced by a committee. For Apache Kafka, the Apache Software Foundation (ASF) is the relevant organization. Apache projects are managed using a collaborative, consensus-based process with members from various countries and enterprises. Check out how it works if you don’t know it yet.
  • Frameworks and vendors can implement against the open protocol and validate the implementation. That is significantly different from proprietary de facto standards like Amazon S3. Having said this, not every product that claims to use the Kafka API is 100% compatible; such products are often limited in their feature set or behave differently.
  • Developers can test the underlying behavior against the same API. Hence, unit and performance tests for different implementations can use the same code.
  • The Apache 2.0 license makes sure that the user does not have to worry about infringing any patents by using the software.

Frameworks, Products, and Cloud Services using the Kafka API

Many frameworks and vendors adopted the Kafka API. Let’s take a look at a few very different alternatives available today that use the Kafka API:

  • Open-source Apache Kafka from the Apache website
  • Self-managed Kafka-based vendor solutions for on-premises or cloud deployments from Confluent, Cloudera, Red Hat
  • Partially managed Kafka-based cloud offerings from Amazon MSK, Red Hat, Azure HDInsight’s Kafka, Aiven, CloudKarafka, Instaclustr.
  • Fully managed Kafka cloud offerings such as Confluent Cloud – actually, there is no other serverless, fully compatible Kafka SaaS offering on the market today (even though many marketing departments try to sell it like this)
  • Partly protocol-compatible, self-managed solutions such as Apache Pulsar (with a simple, very limited Kafka wrapper class) or Redpanda for embedded / WebAssembly (WASM) use cases
  • Partly protocol-compatible, fully managed offerings like Azure EventHubs

Just be aware that the devil is in the details. Many offerings only implement a fraction of the Kafka API. Additionally, many offerings only support the core messaging concept, but exclude key features such as Kafka Connect for data integration, Kafka Streams for stream processing, or exactly-once semantics (EOS) for building transactional systems.
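A practical consequence of the open protocol: client code usually stays the same, and only connection and security settings change between offerings. The following Java sketch illustrates this with two sets of client properties; all endpoints and credentials are placeholders, and, as noted above, feature coverage still differs between offerings.

    import java.util.Properties;

    public class KafkaEndpointConfig {

        // Self-managed Apache Kafka on-premises (plaintext, for illustration only)
        static Properties selfManaged() {
            Properties p = new Properties();
            p.put("bootstrap.servers", "broker1.internal:9092");
            return p;
        }

        // A fully managed, Kafka-compatible cloud service (placeholder endpoint and credentials)
        static Properties managedCloud() {
            Properties p = new Properties();
            p.put("bootstrap.servers", "<cluster-endpoint>:9092");
            p.put("security.protocol", "SASL_SSL");
            p.put("sasl.mechanism", "PLAIN");
            p.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"<api-key>\" password=\"<api-secret>\";");
            return p;
        }
    }

The producer and consumer code on top of these properties stays identical; only the endpoint and security configuration differ.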

The Kafka API Dominates the Event Streaming Landscape

If you look at the current event streaming landscape, you see that more and more frameworks and products adopt the Kafka API. Even though the following is not a complete list (and other non-Kafka offerings exist), it is imposing:

Event Streaming Landscape with Apache Kafka Related Products and Cloud Services including Confluent Cloudera Red Hat IBM Amazon MSK Redpanda Pulsar Azure Event Hubs

If you want to learn more about the different Kafka offerings on the market, check out my Kafka vendor comparison. It is crucial to understand what Kafka offering is right for you. Do you want to focus on business logic and consume the Kafka infrastructure as a service? Or do you want to implement security, integration, monitoring, etc., by yourself?

The Kafka API is here to stay…

The Kafka API became the de facto standard API for event streaming. The usage of an open protocol creates huge benefits for corresponding frameworks, products, and cloud services leveraging the Kafka API.

Vendors can implement against the open standard and validate their implementation. End users can choose the best solution for their business problem. Migration between different Kafka services is also relatively easy – as long as each vendor is compliant with the Kafka protocol and implements it completely and correctly.

Are you using the Kafka API today? Open source Kafka (“car engine”), a commercial self-managed offering (“complete car”), or the serverless Confluent Cloud (“self-driving car”) to focus on business problems? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Kafka API is the De Facto Standard API for Event Streaming like Amazon S3 for Object Storage appeared first on Kai Waehner.

Apache Kafka in the Airline, Aviation and Travel Industry https://www.kai-waehner.de/blog/2021/02/19/apache-kafka-aviation-airline-aerospaceindustry-airport-gds-loyalty-customer/ Fri, 19 Feb 2021 11:21:55 +0000 https://www.kai-waehner.de/?p=3170 Aviation and travel are notoriously vulnerable to social, economic, and political events, as well as the ever-changing expectations of consumers. Coronavirus is just a piece of the challenge. This post explores use cases, architectures, and references for Apache Kafka in the aviation industry, including airline, airports, global distribution systems (GDS), aircraft manufacturers, and more.

The post Apache Kafka in the Airline, Aviation and Travel Industry appeared first on Kai Waehner.

Aviation and travel are notoriously vulnerable to social, economic, and political events, as well as the ever-changing expectations of consumers. Coronavirus is just a piece of the challenge. This post explores use cases, architectures, and references for Apache Kafka in the aviation industry, including airline, airports, global distribution systems (GDS), aircraft manufacturers, and more. Kafka was relevant pre-covid and will become even more important post-covid.

Apache Kafka in Aviation Industry including Airlines Airports Manufacturing Retail GDS

Airlines and Aviation are Changing – Beyond Covid-19!

Aviation and travel are notoriously vulnerable to social, economic, and political events. These months have been particularly testing due to the global Covid-19 pandemic. But change is coming not just because of the Coronavirus, but also because of the ever-changing expectations of consumers.

Right now is the time to lay the ground for the future of the aviation and travel industry.

Consumer behaviors and expectations are changing. Whole industries are being disrupted, and the aviation industry is not immune to these sweeping forces of change.

The future business of airlines and airports will be digitally integrated into the ecosystem of partners and suppliers. Companies will provide more personalized customer experiences and be enabled by a new suite of the latest technologies, including automation, robotics, and biometrics.

For instance, new customer notification mobile apps provide customers with relevant and timely updates throughout their journeys. Other major improvements support the front-line service teams at various touchpoints throughout the airport and the end-to-end travel journey.

Apache Kafka in the Airline Industry

Apache Kafka is the de facto standard for event streaming use cases across industries. Many use cases can be applied to the aviation industry, too. Concepts like payment, customer experience, and manufacturing differ in detail. But in the end, it is about integrating systems and processing data in real-time at scale.

For instance, omnichannel retail with Apache Kafka applies to airline, airports, global distribution systems (GDS), and other aviation industry sectors.

However, it is always easier to learn from other companies in the same industry. Therefore, the following explores a few public Apache Kafka success stories from the aviation industry.

Lufthansa – Kafka Unified Streaming Cloud Operations

Lufthansa talks about the benefits of using Apache Kafka instead of traditional messaging queues (TIBCO EMS, IBM MQ) for data processing.

The journey started with the question of whether Lufthansa could do data processing better, cheaper, and faster.

Lufthansa’s Kafka architecture does not have any surprises. A key lesson learned from many companies: The real added value is created when you leverage not just Kafka’s messaging, but its entire ecosystem, including different clients/proxies, connectors, stream processing, and data governance.

The result at Lufthansa: A better, cheaper, and faster infrastructure for real-time data processing at scale.

My two favorite statements (once again: not really a surprise, as I see the same at many other customers):

  • “Scaling Kafka is really inexpensive”
  • “Kafka adopted and integrated within 3 months”

Watch the full talk from Marcos Carballeira Rodríguez from Lufthansa recorded at the Confluent Streaming Days 2020 to see all the architectures and quotes from Lufthansa.

And check out this exciting video recording of Lufthansa discussing their Kafka use cases for middleware modernization and machine learning:

Data Streaming with Apache Kafka at Airlines - Lufthansa Case Study

Singapore Airlines – Predictive Maintenance with Kafka Connect, Kafka Streams, and ksqlDB

Singapore Airlines is an early adopter of KSQL to continuously process sensor data and apply analytic models to the events. They already talked about their Kafka ecosystem usage (including Kafka Connect, Kafka Streams, and KSQL) back in 2018. The use case is predictive maintenance with a scalable real-time infrastructure, as you can see in my summary slide:

Singapore Airlines leveraging Apache Kafka Connect Streams ksqlDB for Predictive Maintenance

Check out the complete slide deck from Singapore Airlines for more details.

Air France Hop – Scalable Real-Time Microservices

I really like the Kafka Summit talk title: “Hop! Airlines Jets to Real-Time”. Air France Hop leverages Change Data Capture (CDC) with HVR and Kafka for real-time data processing and integration with legacy monoliths. A pretty common pattern for integrating the old and the new software and IT worlds:

AirFrance Hop leveraging Apache Kafka for Real Time Event Streaming

The complete slide deck and on-demand video recording about this case study are available on the Kafka Summit page.

Amadeus – Real-Time and Batch Log Processing

As I said initially, Kafka is not just relevant for airlines, airports, and aircraft manufacturers. The global distribution system (GDS) from Amadeus is one of the world’s biggest (competing mainly with Sabre). The passenger name record (PNR) is a record in the computer reservation system (CRS) and a crucial part of any GDS vendor’s business. While many end users don’t even know about Amadeus, the aviation industry could not survive without them. Their workloads are mission-critical and need to run 24/7 in real-time, plus connect to their partners’ systems (like an airline) in a very stable and mature manner!

Amadeus is relying on Apache Kafka for both real-time and batch data processing, as they explain on the official Apache Kafka website:

Amadeus GDS powered by Apache Kafka

Streaming Data Exchange for the Travel Industry

After looking at some examples, let’s now cover one more key topic: Data integration and correlation between partners in the aviation industry. Airline, airports, GDS, travel companies, and many other companies need to integrate very well. Obviously, this is already implemented. Otherwise, there is no way to operate flights with passengers and cargo. At least in theory. Honestly, one of the most significant pain points of the travel industry for customers is bad integration across companies. Some examples:

  • Late or (even worse) no notification about a delay or cancellation
  • Issues with the display of available seats or upgrade
  • Broken booking processes on the website because of different flight numbers, connecting flights, etc.
  • Booking class issues for upgrades or rebookings
  • Display of technical error messages instead of business information (for instance, I can’t count how often I have seen an “IBM WebSphere” error message when I tried to book a flight on the website of my most commonly used airline)
  • The list goes on and on and on… No matter which airline you pick. That’s my experience as a frequent traveler across all continents and time zones.

There are reasons for these issues. The aviation network is very complex. For instance, the Lufthansa Group sells tickets for all of its own brands (like Swiss or Austrian Airlines), plus tickets from Star Alliance partners (such as United or Singapore Airlines). Hence, airlines, airports, GDS, and many partner systems have to work together. 24/7. In real-time. For this reason, more and more companies in the aviation industry rely on Kafka internally.

But that’s only half of the story… How do you integrate with partners?

Event Streaming vs. REST / HTTP APIs

I explored the discussion around event streaming with Kafka vs. RESTful web services with HTTP in much more detail in another article: “Comparison: Apache Kafka vs. API Management / API Gateway tools like Mulesoft or Kong“. In short: Kafka and REST APIs have their trade-offs. Both are complementary and used together in many architectures. API Management is a great add-on for many applications and microservices, no matter if they are built with HTTP or Kafka under the hood.

But one point is clear: If you need a scalable real-time integration with a partner system, then HTTP is not the right choice. You can either pick gRPC as a request-response alternative or use Kafka natively for the integration with partners, as you use it internally already anyway:

Streaming Aviation Data Exchange for Airlines Airports GDS with Apache Kafka

Kafka-native replication between partners works very well, no matter what Kafka vendor and version you and your partner are running. Obviously, the biggest challenge is security (not from a technical but from an organizational perspective). Kafka requires TCP, and getting approval to open TCP ports to a partner is much harder than for HTTP ports.

But from a technical point of view, streaming replication often makes much more sense. I have seen the first customers implementing integration via tools like Confluent Replicator. I am sure that we will see this pattern much more in the future and with better out-of-the-box tool support from vendors.

Data Integration and Correlation at an Airport with Airline Data using KSQL

So, let’s assume that you have the data streams connected at an airport – no matter if just internal data or also partner data. Data correlation is what adds the business value. Sönke Liebau from OpenCore presented a great airport demo with Kafka and KSQL at a Kafka Summit.

Let’s take a look at some events at an airport:

 

Events at an Airport

These events exist in various structures and with different technologies and formats. Some data streams arrive in real-time. However, some other data sets come from a monolithic mainframe in batch via a file integration. Kafka Connect is a Kafka-native middleware to implement this integration.

Afterward, all this data needs to be correlated with historical data from a loyalty system or relational database. This is where stream processing comes into play: This concept enables the continuous data correlation in real-time at scale. Kafka-native technologies like Kafka Streams or ksqlDB exist to build streaming ETL pipelines or business applications.

The following example correlates the gate information from the airport with the airline’s flight information to send a delay notification to the customer who is waiting for the connecting flight:

Event Streaming with KSQL at an Airport
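To make this more concrete, here is a minimal Kafka Streams sketch of such a correlation. The topic names, string payloads, and join window are hypothetical; a real implementation would use proper schemas (e.g., Avro) and configured serdes instead of plain strings.

    import java.time.Duration;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.JoinWindows;
    import org.apache.kafka.streams.kstream.KStream;

    public class DelayNotificationTopology {

        // Assumes default String serdes are configured in the Kafka Streams application properties
        public static void buildTopology(StreamsBuilder builder) {
            // Hypothetical topics: gate events from the airport and flight status events from the airline,
            // both keyed by flight number so the streams can be joined
            KStream<String, String> gateEvents = builder.stream("airport-gate-events");
            KStream<String, String> flightEvents = builder.stream("airline-flight-status");

            // Correlate both streams within a 30-minute window and build a notification text
            KStream<String, String> delayNotifications = gateEvents.join(
                flightEvents,
                (gate, flight) -> "Delay update for your connecting flight: " + flight + " (gate info: " + gate + ")",
                JoinWindows.of(Duration.ofMinutes(30)));

            // A downstream service (e.g., push notifications) consumes this topic
            delayNotifications.to("customer-delay-notifications");
        }
    }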

Tons of use cases exist to leverage event streams from different systems (and partners) in real-time. Some examples from an airport perspective:

  • Location-based services while the customer is walking through the airport and waiting for the flight. Example: Coupons for a restaurant (with many empty seats or food reserves that would have to be thrown away if not sold during the day)
  • Airline services such as free or points-based discounted lounge entrance (because the lounge tracking system knows that it is almost empty right now anyway)
  • Partner services like notifying the airport hotel that the guest can stay longer in the room because of a long delay of the upcoming flight

The list of opportunities is almost endless. However, most use cases are only possible if all systems are integrated and data is continuously correlated in real-time. If you need some more inspiration, check out the two blogs “Kafka at the Edge in a Smart Retail Store” and “Kafka in a Train for Improved Customer Experience“. All these use cases are a perfect fit for airline, airports, and their partner ecosystem.

Slides – Apache Kafka in the Aviation, Airline and Travel Industry

The following slide deck goes into more detail:

Kafka for Improved Operations and Customer Experience in the Aviation Industry

This post explored various use cases for event streaming with Apache Kafka in the aviation industry. Airline, airports, aerospace, flight safety, manufacturing, GDS, retail, and many more partners rely on Apache Kafka.

No question: Kafka is becoming mainstream in the aviation industry these months. Serverless and consumption-based offerings such as Confluent Cloud boost the adoption even more. A streaming data exchange between partners is the next step I see on the horizon. I am looking forward to Kafka-native interfaces in enterprises’ Open APIs, better support for streaming interfaces in API Management tools, and COTS solutions from software vendors.

What are your experiences and plans for event streaming in the aviation industry? Did you already build applications with Apache Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka in the Airline, Aviation and Travel Industry appeared first on Kai Waehner.

Event Streaming with Kafka as Foundation for a Smart City https://www.kai-waehner.de/blog/2021/02/14/apache-kafka-smart-city-public-sector-use-cases-architectures-edge-hybrid-cloud/ Sun, 14 Feb 2021 10:11:44 +0000 https://www.kai-waehner.de/?p=3115 A smart city is an urban area that uses different types of electronic Internet of Things (IoT) sensors to collect data and then use insights gained from that data to manage assets, resources, and services efficiently. This blog post explores how Apache Kafka fits into the smart city architecture and the benefits and use cases.

The post Event Streaming with Kafka as Foundation for a Smart City appeared first on Kai Waehner.

A smart city is an urban area that uses different types of electronic Internet of Things (IoT) sensors to collect data and then use insights gained from that data to manage assets, resources, and services efficiently. This blog post explores how Apache Kafka fits into the smart city architecture and the benefits and use cases.

Event Streaming with Apache Kafka as Foundation for a Smart City and Public Sector

What is a Smart City?

That’s a good question. Here is a sophisticated definition:

William Shakespeare - What is a city but the people

Okay, that’s too easy, right? No worries… Additionally, many more definitions exist for the term “Smart City”. Hence, make sure to define the term before discussing it. The following is a great summary described in the book “Smart Cities for Dummies“, authored by Jonathan Reichental:

A smart city is an approach to urbanization that uses innovative technologies to enhance community services and economic opportunities, improves city infrastructure, reduce costs and resource consumption, and increases civic engagement.

A smart city provides many benefits for citizens and city management. Some of the goals are:

  • Improved Pedestrian Safety
  • Improved Vehicle Safety
  • Proactively Engaged First Responders
  • Reduced Traffic Congestion
  • Connected / Autonomous Vehicles
  • Improved Customer Experience
  • Automated Business Processes

The research company IoT Analytics describes the top use cases of 2020:

10 smart analytics use cases

These use cases indicate a significant detail about smart cities: The need for collaboration between various stakeholders.

Collaboration is Key for Success in a Smart City

A smart city requires the collaboration of many stakeholders. For this reason, smart city initiatives usually involve cities, vendors, and operators, such as the following:

  • Public sector (including government, cities, communes) providing laws, regulations, initiatives, and often budgets.
  • Hardware makers (including cars, scooters, drones, traffic lights, etc.) and their suppliers providing smart things.
  • Telecom industry providing stable networks with high bandwidth and low latency (including 5G, Wi-Fi hotspots, etc.)
  • Mobility services providing innovative apps and solutions like ride-hailing, car-sharing, map routing, etc.
  • Cloud providers like AWS, GCP, Azure, Alibaba providing cloud-native infrastructure for the backend applications and services
  • Software providers providing SaaS and/or edge and mobile applications for CRM, analytics, payment, location-based services, and many other use cases.

Obviously, different stakeholders are often in competition or coopetition. For instance, cities have a huge interest in building their own mobility services, as this is the main gateway to the end users.

Funding is another issue, as IoT Analytics states: “Cities typically rely either on public or private funding for realizing their Smart City projects. To overcome funding-related limitations, successful Smart Cities tend to encourage and facilitate collaboration between the public and private sectors (i.e., Public-Private-Partnerships (PPPs)) in the development and implementation of Smart City projects.”

Digital Infrastructure as Prerequisite for a Smart City

A smart city is not possible without a digital infrastructure. No way around this. That includes various components:

Digital Infrastructure - A Prerequisite for a Smart City

A digital infrastructure enables building a smart city. Hence, let’s take a look at how to do that next…

Event Streaming as Foundation for a Smart City

A smart city has to work with various interfaces, data structures, and technologies. Many data streams have to be integrated, correlated, and processed in real-time. Many data streams are high volume. Hence, scalability and an elastic infrastructure are essential for success. Many data streams contain mission-critical workloads. Therefore, characteristics like reliability, zero data loss, and persistence are crucial.

An event streaming platform based on Apache Kafka and its ecosystem provides all these capabilities:

Event Streaming with Apache Kafka - The Foundation for a Smart City

Smart city use cases often include hybrid architectures. Some parts have to run at the edge, i.e., closer to the streets, buildings, cameras, and many other interfaces, for high availability, low latency, and lower cost. Check out the “use cases for Kafka in edge and hybrid architectures” for more details.

Public Sector Use Cases for Event Streaming

Smart cities and the public sector are usually looked at together as they are closely related. Here are a few use cases that can be improved significantly by leveraging event streaming:

Citizen Services

  • Health services, e.g., hospital modernization, track&trace – Covid distance control
  • Efficient and digital citizen engagement, e.g., the personal ID application process
  • Open exchange, e.g., mobility services (working with partners such as car makers)

Smart City

  • Smart driving, parking, buildings, environment
  • Waste management
  • Mobility services

Energy and Utilities

  • Smart grid and utility infrastructure (energy distribution, smart home, smart meters, smart water, etc.)

Security

  • Law enforcement, surveillance
  • Defense, military
  • Cybersecurity

Obviously, this is just a small number of possible scenarios for event streaming. Additionally, many use cases from other industries can also be applied to the public sector and smart cities. Check out “real-life examples across industries for use cases and architectures leveraging Apache Kafka” to learn about real-world deployments from Audi, BMW, Disney, Generali, Paypal, Tesla, Unity, Walmart, William Hill, and many more.

Here is a great example from the public sector: The Norwegian Labour and Welfare Administration (NAV) has modeled the life of its citizens as a stream of events:

A life is a stream of events at NAV

The consequence is a digital twin of citizens. This enables various use cases for real-time and batch processing of citizen data. For instance, new ID applications can be processed and monitored in real-time. At the same time, anonymized citizen data allows aggregations for improving public services (e.g., by hiring the correct number of people for each department).

Obviously, such a use case is only possible with security and data governance in mind. Authentication, authorization, encryption, role-based access control, audit logs, data lineage, and other concepts need to be applied end-to-end using the event streaming platform.

The Smarter City Nervous System

A smart city requires more than real-time data integration and real-time messaging. Many use cases are only possible if the data is also processed continuously in real-time. That’s where Kafka-native stream processing frameworks like Kafka Streams and ksqlDB come into play. Here is an example of receiving images or videos from surveillance cameras for monitoring and alerting in real-time at scale:

The Smarter City Nervous System with Apache Kafka
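As a rough sketch of what such continuous processing could look like with Kafka Streams, the following filters camera metadata events that an upstream analytic model flagged as anomalies and routes them to an alert topic. The topic names and payload format are hypothetical; the raw video itself would typically be stored elsewhere and only referenced by the events.

    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.KStream;

    public class CameraAlertTopology {

        // Assumes default String serdes are configured in the application properties
        public static void buildTopology(StreamsBuilder builder) {
            // Hypothetical topic with camera metadata events (camera id as key, JSON payload as value)
            KStream<String, String> cameraEvents = builder.stream("city-camera-events");

            // Continuously filter events flagged as anomalies by an upstream model
            // and route them to a dedicated alert topic for first responders or dashboards
            cameraEvents
                .filter((cameraId, payload) -> payload.contains("\"anomaly\":true"))
                .to("city-camera-alerts");
        }
    }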

For more details about data integration and stream processing with the Kafka ecosystem, check out the post “Stream Processing in a Smart City with Kafka Connect and KSQL“.

Kafka Enables a Scalable Real-Time Smart City Infrastructure

The public sector and smart city architectures leverage event streaming for various use cases. The reasons are the same as in all other industries: Kafka provides an open, scalable, elastic infrastructure. Additionally, it is battle-tested and runs in every infrastructure (edge, data center, cloud, bare metal, containers, Kubernetes, fully-managed SaaS such as Confluent Cloud). But event streaming is not the silver bullet for every problem. Therefore, Kafka is very complementary to other technologies such as MQTT for edge integration or a cloud data lake for batch analytics.

What are your experiences and plans for event streaming in smart city use cases? Did you already build applications with Apache Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

 

The post Event Streaming with Kafka as Foundation for a Smart City appeared first on Kai Waehner.

Apache Kafka in the Financial Services Industry https://www.kai-waehner.de/blog/2021/01/18/apache-kafka-financial-services-industry-open-banking-api-finserv-payment-fraud-middleware-messaging-transactions/ Mon, 18 Jan 2021 13:24:41 +0000 https://www.kai-waehner.de/?p=3041 The rise of event streaming in financial services is growing like crazy. Continuous real-time data integration and processing are mandatory for many use cases. Apache Kafka is deployed across the financial services business departments for mission-critical transactional workloads and big data analytics. High scalability, high reliability, and an elastic open infrastructure are the key reasons for the success of Kafka. This blog post explores different use cases, architectures, and real-world examples in the FinServ sector.

The post Apache Kafka in the Financial Services Industry appeared first on Kai Waehner.

The rise of event streaming in financial services is growing like crazy. Continuous real-time data integration and processing are mandatory for many use cases. Many business departments in the financial services sector deploy Apache Kafka for mission-critical transactional workloads and big data analytics. High scalability, high reliability, and an elastic open infrastructure are the key reasons for Kafka’s success. This blog post explores different use cases, architectures, and real-world examples in the FinServ sector.

Innovation in Financial Services and Open Banking with Apache Kafka

FinServ Enterprise Reality: Innovate or be disrupted!

There is no way around it: The new business reality is very different from the last decades:

  • In the past: Technology was a support function. Innovation was required for growth. Running on yesterday’s data was “good enough”.
  • Today: Technology is the business. Innovation is required for survival. Yesterday’s data = failure. A modern, real-time data infrastructure is required.

Only two options exist for enterprises in the finance sector: Innovate or be disrupted!

Event Streaming and The Rise of Apache Kafka in Financial Services and Banking

The New FinServ Enterprise Reality  – Every Company is a Software Company

Please take a look at your favorite traditional bank and how its market cap looks compared to new FinTech companies such as Robinhood, Stripe, Square, or Revolut.

Some traditional companies re-invented themselves to focus on innovative new products and great customer experience to stay competitive. Software is eating the world, including the finance sector.

Here are a few examples:

  • Capital One: 10,000 of 40,000 employees are software engineers
  • Goldman Sachs: 1.5B (billion!) lines of code across 7,000+ applications
  • JPMorgan Chase & Co: Employs over 50,000 people in technology and has $10B+ technology spend.

Most successful post-modern companies in the finance sector heavily rely on Apache Kafka. This is true for emerging fintechs but also for (some) traditional banks.

Apache Kafka in Financial Services

Various use cases emerged to deploy event streaming with Apache Kafka in the finance industry. This includes mission-critical transactional workloads like payment processing or regulatory reporting and big data analytics projects leveraging Machine Learning, data lakes, etc.

Examples for Real-World Deployments

Here are a few companies leveraging Apache Kafka for banking projects:

Event Streaming with Apache Kafka in Financial Services

Check past Kafka Summit video recordings and slides for details about use cases and architectures of these companies from the finance sector.

Here are a few concrete examples:

  • Capital One: Becoming truly event-driven – offering a service other parts of the bank can use.
  • ING: Significantly improved customer experience – as a differentiator + Fraud detection and cost savings.
  • Nordea: Able to meet strict regulatory requirements around real-time reporting + cost savings.
  • Paypal: Processing 400+ Billion events per day for user behavioral tracking, merchant monitoring, risk & compliance, fraud detection, and other use cases.
  • Royal Bank of Canada (RBC): Mainframe off-load, better CX & fraud detection – brought many parts of the bank together
  • 10X Banking: Cloud-native and open core-banking platform to implement a next-generation FinServ platform
  • Robinhood: Commission-free stock trading using a mobile app and website.

This is just a concise list of companies in the financial sector using Apache Kafka as an event streaming platform at the heart of their business. Plenty of other examples are available from dozens of global banks leveraging Apache Kafka for many use cases.

Kafka makes your Business Real-Time

The huge advantage is that Kafka allows decoupling your applications and infrastructure in a domain-driven design (DDD). Each microservice can use its own technology or product but leverage the same data (with security and privacy in mind, of course):

Event Streaming and The Rise of Apache Kafka in Banking
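A minimal sketch of this decoupling with the plain Java consumer API: several independent consumer groups (for example, fraud detection and regulatory reporting) read the same topic at their own pace without affecting each other. The topic name, group names, and broker address are placeholders.

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class PaymentsConsumerExample {

        // Each domain team runs its own consumer group and reads the same topic independently
        static KafkaConsumer<String, String> createConsumer(String groupId) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            props.put("group.id", groupId);
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(List.of("payments")); // hypothetical topic name
            return consumer;
        }

        public static void main(String[] args) {
            // e.g., the fraud team; a "regulatory-reporting" group would consume the same events separately
            try (KafkaConsumer<String, String> fraud = createConsumer("fraud-detection")) {
                while (true) {
                    for (ConsumerRecord<String, String> record : fraud.poll(Duration.ofMillis(500))) {
                        System.out.println("Scoring payment " + record.key());
                    }
                }
            }
        }
    }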

It is great to see that many FinServ companies do not just leverage Kafka in their applications but also contribute to the community. For instance, Robinhood published Faust: A stream processing library, porting Kafka Streams’ ideas from Java to Python.

This is a great example of a microservice architecture and the freedom of technology choice: Robinhood did not want to use Java for (some) applications and chose Python instead. No problem with Kafka as the brokers are dumb. The data processing and business logic happen in the clients in your favorite programming language.

And to be clear: Financial services are important in every company! Payments, orders, fraud, and similar transactional and analytical data rely on FinServ applications and the integration with partners.

Slides – The Rise of Event Streaming in FinServ

The following slide deck goes into more detail. Learn about the rise of event streaming in the financial services industry. Kafka is adopted in more and more scenarios:

Kafka in Banking for Middleware, Mainframe, Machine Learning, Open API, and more

The following links share additional content related to many banking and FinServ use cases and architectures:

Please check them out to learn more about the usage of event streaming with Kafka and its ecosystem across various business units in the Finserv sector.

Last but not least, please be aware that the term “real-time” is used in many contexts and can have different meanings. Read “Kafka is NOT hard real-time” to understand why Kafka is used in most banking projects, but not for the specific use case of trading in microseconds.

Software is Eating the Banks and FinServ Industry

Software is eating the world, including financial services. Continuous real-time data integration and processing are mandatory for many use cases. Apache Kafka is deployed across industries for mission-critical transactional workloads and big data analytics. No matter if you need to integrate with legacy systems, process mission-critical payment data, or build batch reports and analytic models, Kafka is a predominant choice as part of the architecture. Hybrid, edge, and multi-cloud deployments of Kafka are the new black.

What are your experiences and plans for event streaming in the financial services industry? Did you already build applications with Apache Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka in the Financial Services Industry appeared first on Kai Waehner.

Kafka and XML Messages – Transformation, Connector, Middleware https://www.kai-waehner.de/blog/2020/09/25/kafka-xml-messages-transformation-connector-middleware-comparison-connect-smt-esb-etl-web-services-soap-wsdl-schema/ Fri, 25 Sep 2020 07:20:15 +0000 https://www.kai-waehner.de/?p=2689 XML messages and XML Schema are not very common in the Apache Kafka and Event Streaming world! Why?…

The post Kafka and XML Messages – Transformation, Connector, Middleware appeared first on Kai Waehner.

XML messages and XML Schema are not very common in the Apache Kafka and Event Streaming world! Why? Many people call XML legacy. It is complex, verbose, and often associated with the ugly WS-* Hell (SOAP, WSDL, etc.). On the other hand, every company older than five years uses XML. It is well understood, provides a good structure, and is human- and machine-readable.

This post does not want to start another flame war between XML and other technologies such as JSON (which also provides JSON Schema now), Avro, or Protobuf. Instead, I will walk you through the three main approaches to integrate between Kafka and XML messages as there is still a vast demand for implementing this integration today (often for integrating legacy applications and middleware).

XML and XML Schema

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium’s XML 1.0 Specification of 1998 and several other related specifications – all of them free open standards – define XML.

The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services. Several schema systems exist to aid in defining XML-based languages, while programmers have developed many application programming interfaces (APIs) to assist the processing of XML data.

SOAP / WSDL Web Services – The WS-* Hell

Web Services use XML messages that follow the SOAP standard and have been popular with traditional enterprises for many years. In such systems, there is often a machine-readable description of the operations offered by the service written in the Web Services Description Language (WSDL). Web Services are one of the predominant use cases for XML integration. Some people call this the “WS-* Hell”:

XML Web Service Hell - WS-*

Kafka for any Data Format (JSON, XML, Avro, Protobuf, …)

Kafka can store and process anything, including XML. The Kafka brokers are dumb. They don’t care about data formats. The implementation of Kafka under the hood stores and processes only byte arrays. This approach follows the design principle of dumb pipes and smart endpoints (coined by Martin Fowler for microservice architectures). Dumb brokers are one of the architectural reasons why Kafka scales and performs so well.

As Kafka supports any data format, XML is no problem at all. It accepts any serializable input format. XML is just text so that plain string serializers can be used. However, if you want additional validation before pushing messages into Kafka (like checking the content is actually XML or doing Schema Validation using XML Schema), then you need to write your own XML Serializer/Deserializer implementation.
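Here is a minimal sketch of such a custom serializer, using only the JDK’s built-in XML parser for a well-formedness check; full XML Schema validation would plug in a javax.xml.validation.Schema instead. The class and its behavior are illustrative, not a ready-made library.

    import java.io.ByteArrayInputStream;
    import java.nio.charset.StandardCharsets;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.apache.kafka.common.errors.SerializationException;
    import org.apache.kafka.common.serialization.Serializer;

    // Validates that the payload is well-formed XML before handing the bytes to Kafka
    public class ValidatingXmlSerializer implements Serializer<String> {

        @Override
        public byte[] serialize(String topic, String xml) {
            if (xml == null) {
                return null;
            }
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                factory.newDocumentBuilder()
                       .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            } catch (Exception e) {
                throw new SerializationException("Message is not well-formed XML", e);
            }
            return xml.getBytes(StandardCharsets.UTF_8);
        }
    }

The producer is configured with this class as the value serializer; invalid payloads then fail fast in the client instead of polluting the topic.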

XML mappings can be very complex, including referencing other documents and using plenty of ugly open standards. XML includes generic standards such as WS-Security or industry-specific standards such as XBRL for regulatory processing or HL7 for healthcare. It is a mess. Period. Even mature tools struggle as soon as any part of such a standard is adjusted to a company’s own needs (even though the standards support customization).

Kafka works with any programming language, and Confluent also provides a REST Proxy for HTTP(S) communication with Kafka. But these clients require the developer to implement all the complex XML mapping and processing. Hence, let’s now talk about the most common approaches to implement the integration between any XML-based application and Apache Kafka: 3rd party middleware (ETL / ESB tools) and Kafka-native Kafka Connect.

Kafka and XML via Middleware (ETL, ESB)

Apache Kafka and traditional middleware (ETL, ESB) are frenemies (friends and enemies). Check out this blog and video recording/slide deck for a more in-depth discussion and comparison. If these legacy integration tools such as TIBCO BusinessWorks or Software AG webMethods do one thing well, then it is graphical mappings of complex XML structures, including good and mature (but not 100%) support of the ugly WS-* web service standards.

XML Kafka Integration with 3rd Party Middleware - ESB ETL Tools

Pros of 3rd Party Middleware for XML-Kafka Integration
  • Visual coding for a more straightforward mapping experience (especially crucial for very complex structures) – for all coders: Trust me, this is really easier and more time-efficient than writing, testing, and debugging source code!
  • Mature (10-20 years old technologies don’t have many bugs anymore – if they are still alive)
  • Support of complex XML Schema structures (but yet often issues with import, UI, and export)
  • (Often) already in place (implemented, tested, and deployed)
  • Kafka integration exists for any middleware which is still alive (i.e., maintained and supported by the vendor)
Cons of 3rd Party Middleware for XML-Kafka Integration
  • Products are legacy – as old as the source systems
  • Monolithic, inflexible architecture
  • Separate infrastructure to operate, test, maintain, and pay for
  • End-to-end integration is more challenging (from a technical and support perspective) as there are two systems in the middle instead of just one
  • Licensing
  • Point-to-point integration and tight coupling instead of event-based streaming with real decoupling
  • (Often) proprietary solution

Traditional middleware (such as TIBCO, IBM, Software AG, or Mulesoft) complements Kafka deployments. If you have the middleware already running and licensed (and do not plan to migrate away from it), then this is a viable approach to integrate between XML messages from legacy systems and Kafka.

Kafka and XML via Kafka Connect

The open Kafka ecosystem provides Kafka-native support for XML integration leveraging Kafka Connect. Kafka Connect is a Kafka-native tool for scalable and reliable streaming data integration between Apache Kafka and other data systems. It makes it simple to quickly define connectors that move large data sets into and out of Kafka. Think about it as ESB-on-Kafka.

Use cases include

  • Messaging integration (MQ)
  • Mainframe offloading
  • File outputs from batch processes
  • Web services (SOAP / WSDL / WS-*)
  • Legacy applications
Pros
  • Kafka-native (leveraging Kafka under the hood for scalability, throughput, high availability, exactly-once semantics, low latency, etc.)
  • Decoupled design (Domain-driven Microservice approach instead of tight coupling)
  • Open ecosystem and flexible integration with any data sources and sinks – check out Confluent Hub for hundreds of open source and commercial connectors
  • Cloud-native to be deployed in any edge / on-premise or cloud infrastructure such as Kubernetes
Cons
  • Limited support for complex XML Schemas and standards – not all ugly documents will work well
  • No visual coding – unfortunately, no Kafka-native visual coding tools exist in 2020. Let’s go, Confluent! 🙂

Let’s take a look at two Kafka Connect approaches in more detail: A dedicated XML Connector and an SMT (Single Message Transformation) embedded into any Kafka Connect source or sink connector.

Kafka Connect Connector for XML Files

An XML connector directly accesses the XML file to parse and transform the content:

Kafka XML Integration with Kafka Connect XML Connector

Connect FilePulse is an open-source Kafka Connect connector built by streamthoughts to parse, transform, and stream any XML file. Other file formats are also supported. But as many other tools support modern data formats such as JSON, CSV, Avro, or Protobuf, I really think the highlight of this connector is the XML support.

Features:

  • Support for recursive scanning of local directories.
  • Reading and writing files into Kafka line by line.
  • Support for multiple input file formats (e.g., CSV, JSON, Avro, XML).
  • Parsing and transforming data using built-in or custom processing filters
  • Error handler definition
  • Monitoring files while they are written into Kafka
  • Support for pluggable strategies to clean up completed files

Here is an excellent tutorial for using this XML connector for Kafka Connect: Streaming data into Kafka – Loading an XML file.

The Connect FilePulse Kafka Connector is the right choice for direct integration between XML files and Kafka.

SMT for Embedding XML Transformations into ANY Kafka Connect Connector

An SMT (Single Message Transformation) is part of the Kafka Connect framework. SMTs are applied to messages as they flow through Kafka Connect. They transform inbound messages after a source connector has produced them, but before they are written to Kafka. SMTs transform outbound messages before they are sent to a sink connector.

An SMT can be embedded into any Kafka Connect source or sink connector. Hence, the XML SMT for Kafka Connect allows direct integration with any interface and mapping XML messages without the need for storing the file or using a specific XML connector.

Kafka XML Integration with SMT and ANY Source Sink Connector

SMTs even allow adding or changing metadata, e.g., by adding a new header in addition to the key and value of the message.

Here is an example: Receive XML messages from JMS-based messaging platforms and convert the XML payload to JSON, Avro, or Protobuf for further processing and integration into the rest of the (modern) enterprise architecture. For instance, Confluent provides a generic JMS connector but also dedicated connectors for various legacy MQ products such as IBM MQ (often running on the mainframe), TIBCO EMS, and ActiveMQ. The XML SMT allows on-the-fly transformation of the incoming XML messages. I have seen integration with and later replacement of these MQ tools across the globe in all kinds of industries.
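For illustration, the following is a strongly simplified custom SMT sketch (not the product mentioned above): it parses a String value containing XML and replaces it with the text content of the root element. A real transformation would map the full document to a structured schema; class name and behavior here are hypothetical.

    import java.io.ByteArrayInputStream;
    import java.nio.charset.StandardCharsets;
    import java.util.Map;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.connect.connector.ConnectRecord;
    import org.apache.kafka.connect.data.Schema;
    import org.apache.kafka.connect.errors.DataException;
    import org.apache.kafka.connect.transforms.Transformation;
    import org.w3c.dom.Document;

    // Simplified single message transformation: parses an XML string value and replaces it
    // with the text content of the document's root element
    public class XmlRootTextTransform<R extends ConnectRecord<R>> implements Transformation<R> {

        @Override
        public R apply(R record) {
            if (!(record.value() instanceof String)) {
                return record; // pass through anything that is not an XML string
            }
            try {
                Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                        ((String) record.value()).getBytes(StandardCharsets.UTF_8)));
                String extracted = doc.getDocumentElement().getTextContent();
                return record.newRecord(record.topic(), record.kafkaPartition(),
                    record.keySchema(), record.key(),
                    Schema.STRING_SCHEMA, extracted, record.timestamp());
            } catch (Exception e) {
                throw new DataException("Value is not valid XML", e);
            }
        }

        @Override
        public ConfigDef config() {
            return new ConfigDef(); // no configuration options in this sketch
        }

        @Override
        public void configure(Map<String, ?> configs) {
        }

        @Override
        public void close() {
        }
    }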

Dead Letter Queue (DLQ) for Handling Bad XML Messages

Just transforming messages is often not sufficient. A Dead Letter Queue (DLQ), aka Dead Letter Channel, is an Enterprise Integration Pattern (EIP) for handling bad messages. This design pattern is complementary to XML integration. For instance, a DLQ can store XML messages that failed XSD validation in the transformation. Here is an example of how to implement the rerouting to a DLQ using the above SMT.
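For sink connectors, Kafka Connect ships built-in error-handling options that implement this pattern via configuration. A sketch with a placeholder DLQ topic name:

    errors.tolerance=all
    errors.deadletterqueue.topic.name=dlq-xml-input
    errors.deadletterqueue.context.headers.enable=true
    errors.deadletterqueue.topic.replication.factor=3

With errors.tolerance=all, records that fail in the converter or transformation stage are routed to the DLQ topic (together with error context headers) instead of stopping the connector.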

Confluent Schema Registry for Data Governance

The Confluent Schema Registry is a complementary (optional) tool. It provides a smart implementation of data format and content validation (including enforcement, versioning, and other features). I see it used in ~70% of Kafka projects across the globe. As soon as you do more than just data ingestion into a data lake like HDFS or S3, the added value is enormous. Today, Confluent Schema Registry supports JSON Schema, Avro, and Protobuf.

The Schema Registry provides an open interface and is pluggable. For example, some users have asked for Schema Registry to support XML. Now, you can add XML support to Schema Registry directly, and use the Schema Registry to store both XML and Avro at the same time. For more on how to add your schema formats, please refer to the documentation. The workaround is to do the discussed XML-to-another-format transformation first and start your event streaming data governance from that point.

Summary

XML is predominant in most enterprises and mostly used for legacy applications, batch processing, and SOAP / WSDL web services. A digital transformation can only be successful if the old world is connected well to the new world (as you might have learned in my example about how to integrate between Kafka and Mainframes).

This post explored the three most common options for integration between Kafka and XML:

  • XML integration with a 3rd party middleware
  • Kafka Connect connector for integration with XML files
  • Kafka Connect SMT to embed the transformation into ANY other Kafka Connect source or sink connector to transform XML messages on the fly

What are your experiences with XML integration for Kafka? Which implementation did you choose? What challenges did you face, and how did you or do you plan to solve this? What is your strategy? Let’s connect on LinkedIn and discuss it!

 

The post Kafka and XML Messages – Transformation, Connector, Middleware appeared first on Kai Waehner.

Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? https://www.kai-waehner.de/blog/2020/05/25/api-management-gateway-apache-kafka-comparison-mulesoft-kong-apigee/ Mon, 25 May 2020 10:42:10 +0000 https://www.kai-waehner.de/?p=2308 Event Streaming with Apache Kafka and API Management / API Gateway solutions (Apigee, Mulesoft Anypoint, Kong, TIBCO Mashery,…

The post Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? appeared first on Kai Waehner.

Event Streaming with Apache Kafka and API Management / API Gateway solutions (Apigee, Mulesoft Anypoint, Kong, TIBCO Mashery, etc.) are complementary, not competitive! Read this blog post to understand the relation between these two components in your enterprise architecture.

API Management has been relevant for many years. I talked about “A New Front for SOA: Open API and API Management as Game Changer” in 2014 when SOAP Web Services and Service-Oriented Architectures (SOA) were cutting-edge technologies and concepts. Exposing APIs and monetization were still in their infancy at that time. EDI / EDIFACT and similar complex technologies were used for B2B communication. B2C communication was just starting with smartphones and mobile apps. Internally, billing was done with estimations and Excel sheets instead of automated and accurate information systems.

Let’s start this blog post with an overview of the current market situation. Use cases and the relation between event streaming with Apache Kafka and API Management with tools like Mulesoft Anypoint Platform are discussed afterwards. The last part of the post explores the future of API Management for streaming technologies (and how you can even solve this use case today already).

Market Situation – One Middleware Tool to Solve All your Problems?

Microservices became the new black in enterprise architectures. APIs provide functions to other applications or end users. Even if your architecture uses another pattern than microservices, like SOA (Service-Oriented Architecture) or Client-Server communication, APIs are used between the different applications and end users.

Apache Kafka plays a key role in modern architectures to build open, scalable, flexible and decoupled real time applications. API Management complements Kafka by providing a way to implement and govern the full life cycle of the APIs. This blog post explores how event streaming with Apache Kafka and API Management (including API Gateway and Service Mesh technologies) complement each other, and why they are still not always a perfect match.

In the middleware market, every software vendor is the best one and puts itself into the middle of the enterprise architecture; at least if you trust marketing graphics. No matter which vendor’s website you visit, you will see something similar to this:

Middleware API Management

 

Middleware, Event Streaming and API Management Vendors

Here are some examples of global middleware vendors providing software to glue together applications and to provide APIs:

  • Universal Players offer various products. Vendors like Red Hat / IBM, Oracle, Software AG, TIBCO even offer different overlapping and competing solutions. For instance, IBM has 10+ products for integration middleware (not included are the rebranded product names).
  • Cloud Providers like AWS, GCP, Azure and Alibaba provide a vast number of services for gluing together applications and services.
  • Some companies focus just on Messaging, for instance Solace or Synadia (the company behind nats.io).
  • Event Streaming Platforms like Confluent or Streamlio (the company behind Pulsar; acquired by Splunk recently) are relatively new on the market (compared to the above categories), but gain more and more traction these days.
  • API Management solutions like Mulesoft, Apigee, or Kong focus on the creation, lifecycle management, and monetization of APIs.
  • New startups focus on specific niches or cutting-edge technologies, like solo.io providing an API Gateway on top of the Envoy Proxy Service Mesh.

MQ, ETL, ESB, Kafka, API Management – When to use which Tool(s)?

Obviously this market situation creates an important question: When to use which tool(s)? How do they overlap with each other? When are they complementary?

I covered the discussion about traditional middleware and Kafka already in detail. Check out “Event streaming with Apache Kafka vs. traditional middleware using MQ, ETL, ESB“.

It is also relatively easy to explain the relation between traditional middleware and API Management: Build a SOAP- or REST-based application (aka web service) and put an API Gateway or API Management tool in front of it to manage its lifecycle and monetize it.

Important pointer here: Some platforms like Mulesoft provide an ESB and API Management. You can use just one of them, or both together. Just make sure to compare the right things (to Kafka). For Mule ESB (vs. Kafka), check out the above link. For Mulesoft’s Anypoint Platform for API Management (vs. Kafka), read the below content… Read both if you need integration middleware and API Management.

How do Apache Kafka and API Management relate to each other? This question is harder to answer because both solve very different problems based on different technologies. Let’s discuss this topic in more detail in the following.

Use Cases for Event Streaming and Apache Kafka

First of all, it is very important to understand what ‘Event Streaming’ is and why this is different from the “traditional API approach” providing REST or SOAP web services.

Apache Kafka is used in all Industries and Verticals

Some use cases can also be done with other technologies, but it is easier and a simpler architecture with Kafka. That is true for integration layers and microservice architectures – and all the use cases around this like real time monitoring or customer 360.

Some other use cases cannot be done easily with other technologies because others don’t provide the combination of messaging + storage + processing in one single platform in a scalable, reliable and fault tolerant way – which is e.g. required to build a connected car infrastructure or sensor processing and analytics at scale in real time.

In the early era of Apache Kafka, many companies just used it for data ingestion into Hadoop or another data lake. The significant difference today – and this is what I would define as innovative – is that companies now use Apache Kafka as an Event Streaming Platform to build mission-critical infrastructures and core operations platforms.

To be fair, Kafka is not the best solution for every problem. If you need point-to-point messaging, use something like RabbitMQ or IBM MQ. If you need to transfer large files, evaluate the market for MFT (Managed File Transfer) products. And… If you need to manage and monetize APIs, then evaluate API Management solutions.

Kafka’s Ecosystem to Build Mission-Critical and Scalable Platforms and Real-Time Applications

Apache Kafka is more than just data ingestion or messaging. Apache Kafka (which includes Kafka Connect and Kafka Streams) and its open ecosystem (Schema Registry, ksqlDB, etc.) established a complete event streaming platform for many innovative use cases.

Here are some examples:

Event Streaming Platform with Apache Kafka - Value per Use Case

An interesting trend can be seen here: More and more Kafka deployments are mission-critical and focus on business transactions. These deployments cannot be down for even an hour, because the company behind them would be in huge trouble.

Many more use cases from companies in almost all existing verticals and industries can be found at the Kafka Summit website. Videos and slides from all past talks are available for free. This includes success stories from tech giants, traditional companies and cutting edge startups.

Why Event Streaming with Apache Kafka?

Kafka has a few unique characteristics:

  • Combination of messaging, storage, integration and processing of data
  • Event-based architecture for real time processing, supporting modern design patterns like Event Sourcing and CQRS
  • Built for high availability, high throughput and cloud-native DevOps and CI/CD integration
  • Open source with a huge community and ecosystem

For these and other reasons, Kafka became the de facto standard for microservice architectures and many other application infrastructures. Many of these use cases cannot be built with traditional middleware due to limitations in scalability, inflexible architectures, or simply the high cost of building a highly available deployment.

So, what is the relation between event streaming with Kafka and API Management? Let’s explore this in the next section.

What is an API and its Relation to Event Streaming?

Event Streaming changes from the ground up how applications are built: more scalable, more reliable, decoupled, real time. In many new innovative use cases, there is no way around using event streaming instead of web services and traditional APIs.

This brings up several questions. Why do we still need to create and manage APIs? Does it make sense to put an API on top of streaming data? What technology and interface should this API use?

Let’s cover the basics first…

API (Application Programming Interface)

An API (application programming interface) is a computing interface which defines interactions between multiple software intermediaries. It defines the kinds of calls or requests that can be made, how to make them, the data formats that should be used, the conventions to follow, etc.

API Technologies

From a technical perspective, most people and products mean REST (HTTP) or SOAP (XML) web services when talking about APIs. Most API Gateway and API Management tools only support these technologies.

These two technologies have been established in most companies for many years and are very mature. Some people prefer one, some the other. Some don’t like either but have to use them because REST and SOAP web services are the de facto standard in enterprises today.

In fact, many other API technologies are available. Many of these other APIs do not use synchronous request-response patterns, but asynchronous communication.

Examples: WebSocket, MQTT, Server-Sent Events (SSE), or the Kafka protocol (the underlying wire protocol implemented in Kafka). So why are more and more technologies emerging?

Synchronous Request-Response vs. Asynchronous Event Streaming

Two very different communication paradigms exist: Request-response and event streaming.

Request-Response communication has the following characteristics:

  • Low latency
  • Typically synchronous
  • Point-to-point
  • “Bespoke API”
  • e.g. HTTP, SOAP, gRPC

Event streams are based on these concepts:

  • Messaging / Pub Sub (sending data from A to B and C)
  • Continuous data processing (filtering, transformations, aggregations, business logic)
  • Often asynchronous
  • Event-driven, supporting patterns like Event Sourcing and CQRS
  • General-purpose events
  • e.g. Apache Kafka

Both approaches have their trade-offs. Most architectures need both request-response and event streams! Read the great article by Gregor Hohpe (author of the famous Enterprise Integration Patterns) from 2004: “Starbucks Does Not Use Two-Phase Commit“. This article explains very well why both synchronous and asynchronous communication make sense (together).

REST and SOAP web services typically use synchronous communication. This is not the full story; you could, for example, also use JMS-based SOAP communication. But the reality in most cases is synchronous request-response. Event streaming is asynchronous, but you can implement request-reply patterns on top of it, too.
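
To make the difference tangible, here is a minimal Java sketch of both paradigms. It assumes a local Kafka broker on localhost:9092, a hypothetical HTTP endpoint and an 'orders' topic; it illustrates the two communication styles, not a production implementation:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class RequestResponseVsStreaming {
    public static void main(String[] args) throws Exception {
        // Synchronous request-response: the caller blocks until the service answers.
        HttpClient http = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/orders/42")) // hypothetical endpoint
                .GET()
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Synchronous reply: " + response.body());

        // Asynchronous event streaming: the producer hands the event to Kafka and moves on.
        // Any number of (unknown) consumers can process it later, independently.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "42", "{\"orderId\":42,\"status\":\"CREATED\"}"));
        }
    }
}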

Event Streaming instead of REST / SOAP Web Services

So what are the most important reasons why event streaming with technologies like Apache Kafka is often used for new projects instead of REST / SOAP web services?

REST / SOAP web services do not provide the characteristics needed to build a scalable, reliable real-time infrastructure for a high throughput of events. Period!

The other big advantage of Kafka is that it decouples microservices from each other. Kafka’s storage and asynchronous (i.e., decoupled) communication keep every microservice independent. Microservice A does not need to know Microservice B, but they can still communicate with each other, even if one of them is down while the other one is producing data. There can still be a contract (a term used a lot in API Management) between producers and consumers, for instance using the Confluent Schema Registry.
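
Here is a minimal sketch of that decoupling, reusing the hypothetical 'orders' topic and local broker from the sketch above: a consumer in its own consumer group reads the stored events at its own pace, without the producer knowing it exists, and can replay them after downtime because Kafka retains the data:

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DecoupledOrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "fraud-check-service");  // independent consumer group, unknown to the producer
        props.put("auto.offset.reset", "earliest");    // replay stored events, e.g. after downtime
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println("Processing order event: " + record.value());
                }
            }
        }
    }
}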

One thing to point out here: Most API Management solutions and API Gateways today unfortunately do not support event streams, but only web service APIs.

But let’s go one step back first and understand what API Management actually is.

What is API Management?

API management is the process of creating and publishing web application programming interfaces (APIs), enforcing their usage policies, controlling access, nurturing the subscriber community, collecting and analyzing usage statistics, and reporting on performance. API Management components provide mechanisms and tools to support the developer and subscriber community.

Gartner’s Magic Quadrant 2019 for Full Life Cycle API Management shows the various vendors in this market:

Gartner Magic Quadrant 2020 for Full Life Cycle API Management

Use Cases for API Management

API Management can be used for different scenarios:

  • Open API: Developer portal and API Gateway
  • Partner Gateway: Access control for well-known external parties
  • Mobile App Gateway: Access control for apps deployed externally
  • Cloud Integration Gateway: Governance and mediation control for SaaS
  • Internal Governance: Manage, monetize and bill internal services and applications

Various API business models are possible, as John Musser explained very well back in 2013:

API Business Models

What has changed since 2013? Not that much! The main idea is the same: APIs are provided for the public, external partners or internal teams. However, technically speaking, more and more of these interfaces need a technology for streaming real-time data at scale. REST APIs, with their limitations regarding scalability, are not ideal and sometimes not even possible at all.

No matter if the API Management solution supports just REST / SOAP web services or modern streaming technologies, the API development workflow looks like this:

API Development Workflow

While API Management solutions vary, components that provide the following functionalities are typically found in products:

API Gateway

A server that acts as an API front-end, receives API requests, enforces throttling and security policies, passes requests to the back-end service and then passes the response back to the requester. A gateway often includes a transformation engine to orchestrate and modify the requests and responses on the fly. A gateway can also provide functionality such as collecting analytics data and providing caching. The gateway can provide functionality to support authentication, authorization, security, audit and regulatory compliance.

API Life Cycle Management and Publishing Tools

A collection of tools that API providers use to define APIs (for instance using the OpenAPI or RAML specifications), generate API documentation, manage access and usage policies for APIs, test and debug the execution of APIs (including security testing and automated generation of tests and test suites), deploy APIs into production, staging and quality assurance environments, and coordinate the overall API lifecycle.

Developer Portal / API Store

A community site, typically branded by an API provider, that encapsulates in a single convenient source the information and functionality API users need: documentation, tutorials, sample code, software development kits, an interactive API console and sandbox to trial APIs, the ability to subscribe to the APIs and manage subscription keys such as OAuth2 Client ID and Client Secret, and support from the API provider and the user community.

Reporting and Analytics

Functionality to monitor API usage and load (overall hits, completed transactions, number of data objects returned, amount of compute time and other internal resources consumed, volume of data transferred). This can include real-time monitoring of the API with alerts being raised directly or via a higher-level network management system, for instance, if the load on an API has become too great, as well as functionality to analyze historical data, such as transaction logs, to detect usage trends. Functionality can also be provided to create synthetic transactions that can be used to test the performance and behavior of API endpoints. The information gathered by the reporting and analytics functionality can be used by the API provider to optimize the API offering within an organization’s overall continuous improvement process and for defining software Service-Level Agreements for APIs.

Monetization and Billing

Functionality to support charging for access to commercial APIs. This functionality can include support for setting up pricing rules, based on usage, load and functionality, issuing invoices and collecting payments including multiple types of credit card payments.

As you can see: An API Management solution has some exciting features to build and operate APIs! So what is the relation to Kafka? As discussed earlier, many innovative use cases require a scalable, reliable event streaming platform. That’s what Kafka is.

Kafka and API Management – Friends, Enemies or Frenemies?

To be very clear:

  • Apache Kafka does not provide out-of-the-box capabilities of an API Management solution.
  • API Management solutions do not provide event streaming capabilities to continuously send, process, store and handle millions of events in real time (aka stream processing / streaming analytics).

Therefore, the combination of Kafka and an API Management solution makes a lot of sense in many scenarios. It is NOT a competitive situation (like many people think – or are “taught” by some vendors).

Unique API Management Features

Some of the unique features of API Management products are:

  • API Developer Portal and Publishing Tools
  • API Life Cycle Management
  • Billing and Monetization

These components can be provided as standalone services or products (e.g. from a cloud provider) or as part of a complete platform (like the Mulesoft Anypoint Platform).

Domain-Driven Design (DDD), Decoupling and Anti-Patterns

Some features of API Management tools overlap with other solutions. You should question whether API Management is the right place for them. This is not a ‘yes or no’ discussion. But I think that, in many cases, the API Management solution should not be used for tasks where other platforms provide better capabilities regarding scalability, tooling, reliability, performance, and other characteristics.

A clear separation of concerns is important for a simple and flexible enterprise architecture. Don’t couple things too tightly. This was a key issue with ESB deployments in the past. Don’t make the same mistake with API Management. It is no surprise, and should be a warning, that several vendors even built their API Management product on top of their ESB to couple things together.

Martin Fowler taught us several years ago “not to recreate ESB Anti Patterns with Kafka“. Keep this in mind for your API strategy, too! My article “Microservices, Apache Kafka, and Domain-Driven Design” should also help you understand how important separation of concerns and decoupling are for your enterprise architecture. This is true for Kafka, APIs and other business applications.

Overlapping Features between Kafka and API Management

Kafka provides a messaging and storage solution for event-based processing as its core. In addition, Kafka Connect (for integration) and Kafka Streams (for stream processing) are part of the open source project.

API Management exists for completely different use cases as discussed in detail in the above section: To create, publish, manage and monetize APIs.

Nevertheless, some overlapping features exist between Kafka and API Gateways and API Management solutions. Here are some examples:

  • Protocol conversion: One consumer or client requires JSON while the other one can only process Avro, Protobuf or XML.
  • ETL (Extract Transform Load): Transformations, filtering, sorting and similar tasks.
  • Connectivity: Integration with back-end systems like databases, data warehouses, data lakes, messaging systems, business applications.
  • API Gateway: Routing, public endpoints, single entry point, access control, encryption, throttling, etc. are common features. This can either be configured / implemented by a dedicated API Gateway (like Amazon API Gateway) or with a Kafka-based platform (like Confluent Platform, providing features such as RBAC, REST Proxy, etc.).

Who should solve these overlapping tasks? The Event Streaming Platform or the API Management solution? Well, each vendor will tell you that they can do it the best way. Think about your architecture and requirements. What makes most sense? As so often: It depends!

If you want to build a scalable, reliable integration pipeline, Kafka is probably the better choice. If you need to provide a flexible Gateway interface for REST web services with routing configurations, a dedicated API Gateway is probably the best choice. Try to keep the architecture as simple as possible.
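
To illustrate where such ETL-style work can live inside the event streaming platform instead of the API layer, here is a minimal Kafka Streams sketch; the topic names and the simplistic JSON string filter are hypothetical:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class PaymentEtlApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "payment-etl");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> payments = builder.stream("payments");
        payments
                .filter((key, value) -> value != null && value.contains("\"currency\":\"EUR\"")) // keep EUR payments only
                .mapValues(value -> value.toUpperCase())                                         // trivial transformation
                .to("payments-eur-normalized");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

The same filtering and routing could be configured in an API Gateway or ESB, but the stream processing runs continuously, scales with Kafka partitions, and keeps the transformation close to the data.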

Let’s now take a look at an architecture to understand how Kafka and API Management solutions play together very well.

Microservices, API Management (Mulesoft Anypoint) and Event Streaming (Kafka)

The following example shows a microservices architecture leveraging Event Streaming and API Management. It uses a combination of Confluent Platform for the event-based nervous system and Mulesoft Anypoint Platform for API Management and integration with some legacy applications:

Kafka and API Management in the Enterprise Architecture with Confluent and Mulesoft

There are different options to combine Kafka and Event Streaming with API Management solutions:

  • Event Streaming is used to process data continuously at scale in real time
  • Event Streaming is used to directly integrate with various data sources and data sinks (databases, messaging systems, business applications, etc.)
  • The heart of many companies is Event Streaming, gluing together streaming applications with batch, request-response and other platforms.
  • API Management is used to provide an API interface (including lifecycle management, monetization, etc.) on top of Kafka applications, e.g. using services via Confluent REST Proxy, the REST API of Confluent Cloud to provision a new Kafka cluster, or the REST API running on top of a custom Kafka Streams / ksqlDB application or microservice using Interactive Queries (see the sketch after this list).
  • Kafka is used as backend infrastructure. A proxy or business application is used in between Kafka and business applications. API Management is not directly used with Kafka interfaces, but one layer higher on top of the applications which use Kafka under the hood.
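
As a sketch of the REST Proxy option mentioned above, assuming a Confluent REST Proxy running locally on port 8082 and its documented v2 produce endpoint: an external client, governed by an API Gateway or API Management layer, could publish an event to a Kafka topic with a plain HTTP call like this:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestProxyProduceExample {
    public static void main(String[] args) throws Exception {
        // Payload in the format the Confluent REST Proxy v2 API expects for JSON records.
        String body = "{\"records\":[{\"value\":{\"orderId\":42,\"status\":\"CREATED\"}}]}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8082/topics/orders")) // assumption: local REST Proxy setup
                .header("Content-Type", "application/vnd.kafka.json.v2+json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("REST Proxy response: " + response.body());
    }
}

The API Management tool then sits in front of this HTTP endpoint to handle access control, throttling, lifecycle and billing, while Kafka handles the event streaming behind it.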

Most enterprise architectures require event streaming, request-response and API management. If you have read this far, I hope you agree and now understand why Apache Kafka and API Management platforms are complementary, not competitive.

But it is also clear that event streaming and today’s API Management tools don’t fit together perfectly because in many cases it does not make sense to put a REST or SOAP API on top of event streaming data.

The Missing Killer Feature: Native Kafka Integration in API Management and API Gateway

The last section explored options for how Kafka and API Management can work together very well.

In an ideal world, an API could be put directly on top of the Kafka protocol. In the real world, almost all API Management products today only support REST / SOAP web services. This means you (have to) build a web service on top of event streaming to provide the API Management capabilities.
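
A common workaround today looks like the following minimal sketch, which assumes Kafka 2.5+ (for the StoreQueryParameters API), a hypothetical 'prices' topic and a local broker: a Kafka Streams application materializes the stream into a state store and exposes a small HTTP lookup endpoint via Interactive Queries, and that HTTP endpoint is what the API Management tool then manages:

import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class PriceLookupService {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "price-lookup-service");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Materialize the latest price per product key into a queryable state store.
        StreamsBuilder builder = new StreamsBuilder();
        builder.table("prices", Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("prices-store"));
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Minimal HTTP wrapper: GET /prices/{productId} looks up the state store via Interactive Queries.
        // The API Management tool manages, secures and monetizes this HTTP endpoint, not Kafka itself.
        HttpServer server = HttpServer.create(new InetSocketAddress(7070), 0);
        server.createContext("/prices/", exchange -> {
            String productId = exchange.getRequestURI().getPath().substring("/prices/".length());
            // Note: in production, handle the case where the store is not yet queryable (rebalancing).
            ReadOnlyKeyValueStore<String, String> store = streams.store(
                    StoreQueryParameters.fromNameAndType("prices-store", QueryableStoreTypes.keyValueStore()));
            String value = store.get(productId);
            byte[] body = (value != null ? value : "not found").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(value != null ? 200 : 404, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

This works, but it is exactly the pattern criticized below: a synchronous HTTP wrapper in front of a streaming application, only because the API Management layer cannot speak the Kafka protocol.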

Envoy proxy, one of the established proxies for building a service mesh, actually supports the Kafka protocol natively at the TCP level, so there is no need to use HTTP REST APIs. This is huge from a scalability and performance perspective. HTTP / synchronous request-response is an anti-pattern for streaming data and will not work if the streaming application requires large scale. Check out “Service Mesh and Cloud-Native Microservices with Apache Kafka, Kubernetes and Envoy, Istio, Linkerd” for more details on this topic.

Unfortunately, examples like Envoy’s support for the Kafka protocol are very rare today. What if you get native Kafka support in your API Management solution?

Streaming-based API Management for Cross Companies Communication

API Management using REST or SOAP web services is not appropriate for streaming data and large scale use cases. Therefore, more and more enterprises build streaming applications. How strange is it that almost all of these enterprises use the anti-pattern of providing a request-response based REST API on top of the streaming services for API Management?

Support for the Kafka protocol would be very helpful to make API Management even more complementary than it is today. Think about the huge opportunities if you could build life cycle management and monetization / billing on top of a streaming Kafka service.

A great example of a Kafka-native API is HERE Technologies, a company majority-owned by a consortium of German automotive companies (namely Audi, BMW, and Daimler) providing mapping and location services. Their real-time APIs recommend using the native Kafka interface (as all their backend services run on Kafka for the reasons discussed in this post) instead of an optional HTTP wrapper endpoint.

Cross-Company Streaming Replication

Even without proper support for event streaming in most API Management tools, I have seen many customers doing Kafka-native real time communication at scale between different business units or projects. Check out “Architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments” to understand various different options.

Here is the most exciting use case: Streaming replication between different enterprises:

Cross-Company Kafka Integration - Special Case of Hybrid Integration

Different tools enable streaming replication between business units, regions or companies:

  • MirrorMaker 1
  • MirrorMaker 2
  • Confluent Replicator
  • uReplicator (Uber)
  • Mirus (Salesforce)
  • Brooklin (LinkedIn)
  • Custom Replication

If you want to rely on a mature and battle-tested product, then Confluent Replicator is the way to go today in 2020 for real-time streaming replication. MirrorMaker 1 should never be an option. MirrorMaker 2 will be a great option in a few quarters, but today it is very new and probably not the best choice for a mission-critical project yet. All other options are only recommended if you want to dive deep into the project.

Tools like Confluent Schema Registry provide governance for the “streaming API interface”. Technologies like Avro, Protobuf or JSON Schema are used to define the “API contract” and process large data volumes efficiently and in real time.
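
As a minimal, hedged sketch of such a contract, assuming a local Schema Registry at http://localhost:8081, the Confluent Avro serializer on the classpath and a hypothetical 'payments' topic: the Avro schema is the contract, and the serializer registers it in the Schema Registry and enforces it on every produce:

import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ContractBasedProducer {
    // The Avro schema is the "API contract" shared between producing and consuming teams or companies.
    private static final String PAYMENT_SCHEMA =
            "{\"type\":\"record\",\"name\":\"Payment\",\"fields\":["
          + "{\"name\":\"id\",\"type\":\"string\"},"
          + "{\"name\":\"amount\",\"type\":\"double\"}]}";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // The Confluent Avro serializer registers the schema in Schema Registry and
        // rejects records that do not match the contract.
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        Schema schema = new Schema.Parser().parse(PAYMENT_SCHEMA);
        GenericRecord payment = new GenericData.Record(schema);
        payment.put("id", "tx-1001");
        payment.put("amount", 99.95);

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("payments", "tx-1001", payment));
        }
    }
}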

Event Streaming Internally and REST API to the Outside World?

A cross-company streaming architecture has one key drawback: Information security and politics are your biggest enemy! 🙂 But I have seen customers running this setup in production with a partner company. So it is doable, and even without API Management in the middle, you can leverage event streaming at scale with your partner. Think about use cases like airline ticketing, retail transactions or financial services.

Why would you build everything in real time at scale internally, but only provide a non-scalable synchronous HTTP interface to the outside world? And your external partners are asking themselves exactly the same question…

API Management for event streaming would make this easier from a security and monetization / billing perspective. I hope this feature will be implemented soon by various API Management software vendors.

The Future – Streaming-based API Management for Apache Kafka?

Most architectures require request-response based communication (typically REST) and event streaming (typically Kafka). API Management helps make applications accessible, no matter if the heart of the infrastructure is event-based or point-to-point communication.

I think (and hope) the future will provide streaming-based API Management solutions for Apache Kafka. Envoy’s support for the Kafka protocol is a first example. A few other frameworks also provide some “first hacks” already.

I hope this blog post helped you understand the relationship between Event Streaming with Apache Kafka and API Management solutions such as Kong or Mulesoft Anypoint Platform. They are complementary, not competitive.

How do you think about API Management in conjunction with event streaming and Apache Kafka? What is your strategy? Let’s connect on LinkedIn and discuss! Stay informed about new blog posts by subscribing to my newsletter.


The post Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? appeared first on Kai Waehner.

]]>