microservices Archives - Kai Waehner https://www.kai-waehner.de/blog/tag/microservices/

The State of Data Streaming for Telco https://www.kai-waehner.de/blog/2023/06/02/the-state-of-data-streaming-for-telco-in-2023/ Fri, 02 Jun 2023

This blog post explores the state of data streaming for the telco industry. The evolution of telco infrastructure, customer services, and new business models requires real-time end-to-end visibility, fancy mobile apps, and integration with pioneering technologies like 5G for low latency or augmented reality for innovation. Data streaming allows integrating and correlating data in real-time at any scale to improve most telco workloads.

I look at trends in the telecommunications sector to explore how data streaming helps as a business enabler, including customer stories from Dish Network, British Telecom, Globe Telecom, Swisscom, and more. A complete slide deck and on-demand video recording are included.

The State of Data Streaming for Telco in 2023

The Telco industry is fundamental for growth and innovation across all industries.

The global spending on telecom services is expected to reach 1.595 trillion U.S. dollars by 2024 (Source: Statista, Jul 2022).

Cloud-native infrastructure and digitalization of business processes are critical enablers. 5G network capabilities and telco marketplaces enable entirely new business models.

5G enables new business models

Presentation by Amdocs / Mavenir:

5G Use Cases with Amdocs and Mavenir

A report from McKinsey & Company says, “74 percent of customers have a positive or neutral feeling about their operators offering different speeds to mobile users with different needs”. The potential for increasing the average revenue per user (ARPU) with 5G use cases is enormous for telcos:

Potential from 5G monetization

Telco marketplace

Many companies across industries are trying to build a marketplace these days. The telecom sector is especially well positioned here because it sits at the interface between infrastructure, B2B, partners, and end users for sales and marketing.

tmforum has a few good arguments for why communication service providers (CSP) should build a marketplace for B2C and B2B2X:

  • Operating the marketplace keeps CSPs in control of the relationship with customers
  • A marketplace is a great sales channel for additional revenue
  • Operating the marketplace helps CSPs monetize third-party (over-the-top) content
  • The only other option is to be relegated to being a connectivity provider
  • Enterprise customers have decided this is their preferred method of engagement
  • CSPs can take a cut of all sales
  • Participating in a marketplace prevents any one company from owning the customer

Data streaming in the telco industry

Adopting trends like network monitoring, personalized sales, and cybersecurity is only possible if enterprises in the telco industry can provide and correlate information at the right time in the proper context. Real-time, meaning the information is used within milliseconds, seconds, or minutes, is almost always better than processing data later (whatever “later” means):

Real-Time Data Streaming in the Telco Industry

Data streaming combines the power of real-time messaging at any scale with storage for true decoupling, data integration, and data correlation capabilities. Apache Kafka is the de facto standard for data streaming.
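To make the messaging side tangible, here is a minimal sketch with the Java producer client. The broker address, topic name, and payload are assumptions for illustration only; it shows just the publishing part of a data streaming platform:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TelemetryProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker address, topic, and payload are made up for this sketch.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key = cell tower id, value = a JSON payload with the measurement.
            producer.send(new ProducerRecord<>("network-telemetry",
                    "tower-42", "{\"latencyMs\": 12, \"packetLoss\": 0.001}"));
            producer.flush();
        }
    }
}
```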

“Use Cases for Apache Kafka in Telco” is a good article for starting with an industry-specific point of view on data streaming. “Apache Kafka for Telco-OTT and Media Applications” explores over-the-top B2B scenarios.

Data streaming with the Apache Kafka ecosystem and cloud services is used throughout the supply chain of the telco industry. Search my blog for various articles related to this topic: Search Kai’s blog.

From Telco to TechCo: Next-generation architecture

Deloitte describes the target architecture for telcos very well:

Requirements for the next generation telco architecture

Data streaming provides these characteristics: Open, scalable, reliable, and real-time. This unique combination of capabilities made Apache Kafka so successful and widely adopted.

Kafka decouples applications and is the perfect technology for microservices across a telco’s enterprise architecture. Deloitte’s diagram shows this transition across the entire telecom sector:

Cloud-native Microservices and Data Mesh in the Telecom Sector

This is a massive shift for telcos:

  • From purpose-built hardware to generic hardware and elastic scale
  • From monoliths to decoupled, independent services

Digitalization with modern concepts helps a lot in designing the future of telcos.

Open Digital Architecture (ODA)

tmforum describes Open Digital Architecture (ODA) as follows:

“Open Digital Architecture is a standardized cloud-native enterprise architecture blueprint for all elements of the industry from Communication Service Providers (CSPs), through vendors to system integrators. It accelerates the delivery of next-gen connectivity and beyond – unlocking agility, removing barriers to partnering, and accelerating concept-to-cash.

ODA replaces traditional operations and business support systems (OSS/BSS) with a new approach to building software for the telecoms industry, opening a market for standardized, cloud-native software components, and enabling communication service providers and suppliers to invest in IT for new and differentiated services instead of maintenance and integration.”

Open Digital Architecture ODA tmforum

If you look at the architecture trends and customer stories for data streaming in the next section, you realize that real-time data integration and processing at scale is required to provide most modern use cases in the telecommunications industry.

The telco industry applies various trends for enterprise architectures for cost, flexibility, security, and latency reasons. The three major topics I see these days at customers are:

  • Hybrid architectures with synchronization between edge and cloud in real-time
  • End-to-end network and infrastructure monitoring across multiple layers
  • Proactive service management and context-specific customer interactions

Let’s look deeper into some enterprise architectures that leverage data streaming for telco use cases.

Hybrid 5G architecture with data streaming

Most telcos have a cloud-first strategy to set up modern infrastructure for network monitoring, sales and marketing, loyalty, innovative new OTT services, etc. However, edge computing gets more relevant for use cases like pre-processing for cost reduction, innovative location-based 5G services, and other real-time analytics scenarios:

Hybrid 5G Telco Infrastructure with Data Streaming

My blog post on architecture patterns for Apache Kafka that may require multi-cluster solutions shows real-world examples with their specific requirements and trade-offs. It explores scenarios such as disaster recovery, aggregation for analytics, cloud migration, mission-critical stretched deployments, and global Kafka.

Edge deployments for data streaming come with their own challenges. In separate blog posts, I covered use cases for Kafka at the edge and provided an infrastructure checklist for edge data streaming.

End-to-end network and infrastructure monitoring

Data streaming enables unifying telemetry data from various sources such as Syslog, TCP, files, REST, and other proprietary application interfaces:

Telemetry Network Monitoring with Data Streaming
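One common way to implement such ingestion is Kafka Connect. The sketch below registers a syslog source connector through Connect’s standard REST interface; the connector class, port, listener type, and topic name are assumptions for illustration and must be adapted to the connector you actually deploy:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterSyslogConnector {
    public static void main(String[] args) throws Exception {
        // Hypothetical connector class and settings; check the connector's
        // documentation for the actual configuration keys.
        String connectorConfig = """
            {
              "name": "syslog-source",
              "config": {
                "connector.class": "io.confluent.connect.syslog.SyslogSourceConnector",
                "syslog.port": "5454",
                "syslog.listener": "TCP",
                "kafka.topic": "network-syslog"
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors")) // Connect REST endpoint (assumed local)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(connectorConfig))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```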

End-to-end visibility into the telco networks enables massive cost reductions and, as a bonus, a better customer experience. For instance, proactive service management informs customers about a network outage:

Proactive Service Management across OSS and BSS

Context-specific sales and digital lifestyle services

Customers expect a great customer experience across devices (like a web browser or mobile app) and human interactions (e.g., in a telco store). Data streaming enables a context-specific omnichannel sales experience by correlating real-time and historical data at the right time in the proper context:

Omnichannel Retail in the Telco Industry with Data Streaming

“Omnichannel Retail and Customer 360 in Real Time with Apache Kafka” goes into more detail. But one thing is clear: Most innovative use cases require both historical and real-time data. In summary, correlating historical and real-time information is possible with data streaming out-of-the-box because of the underlying append-only commit log and replayability of events. A cloud-native Tiered Storage Kafka infrastructure to separate compute from storage makes such an enterprise architecture more scalable and cost-efficient.
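As a hedged sketch of what such a correlation can look like with Kafka Streams (topic names and payload formats are hypothetical), real-time customer interactions are enriched with historical profile data held in a compacted topic:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class Customer360Enrichment {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "customer-360-enrichment");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Real-time events keyed by customer id (hypothetical topic).
        KStream<String, String> interactions = builder.stream("customer-interactions");
        // Historical profile data, modeled as a table over a compacted topic.
        KTable<String, String> profiles = builder.table("customer-profiles");

        // Correlate both: each interaction is enriched with the latest known profile.
        interactions
                .join(profiles, (interaction, profile) -> interaction + " | " + profile)
                .to("enriched-interactions");

        new KafkaStreams(builder.build(), props).start();
    }
}
```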

The article “Fraud Detection with Apache Kafka, KSQL and Apache Flink” explores stream processing for real-time analytics in more detail, shows an example with embedded machine learning, and covers several real-world case studies.

New customer stories for data streaming in the telco industry

So much innovation is happening in the telecom sector. Automation and digitalization change how telcos monitor networks, build customer relationships, and create completely new business models.

Most telecommunication service providers use a cloud-first approach to improve time-to-market, increase flexibility, and focus on business logic instead of operating IT infrastructure. And elastic scalability gets even more critical with all the growing networks and 5G workloads.

Here are a few customer stories from worldwide telecom companies:

  • Dish Network: Cloud-native 5G Network with Kafka as the central communications hub between all the infrastructure interfaces and IT applications. The standalone 5G infrastructure in conjunction with data streaming enables new business models for customers across all industries, such as retail, automotive, or energy.
  • Verizon: MEC use cases for low-latency 5G stream processing, such as autonomous drone-in-a-box-based monitoring and inspection solutions or Vehicle-to-Everything (V2X).
  • Swisscom: Network monitoring and incident management with real-time data at scale to inform customers about outages, root cause analysis, and much more. The solution relies on Apache Kafka and Apache Druid for real-time analytics use cases.
  • British Telecom (BT): Hybrid multi-cloud data streaming architecture for proactive service management. BT extracts more value from its data and prioritizes real-time information and better customer experiences.
  • Globe Telecom: Industrialization of event streaming for various use cases. Two examples: digital personalized reward points based on customer purchases, and airtime loans that are easier to operationalize in real-time (vs. batch, where the top-up cash is often already spent again).

Resources to learn more

This blog post is just the starting point. Learn more about data streaming in the telco industry in the following on-demand webinar recording, the related slide deck, and further resources, including pretty cool lightboard videos about use cases.

On-demand video recording

The video recording explores the telecom industry’s trends and architectures for data streaming. The primary focus is the data streaming case studies. Check out our on-demand recording:

The State of Data Streaming for Telco in 2023

Slides

If you prefer learning from slides, check out the deck used for the above recording:

Fullscreen Mode

Case studies and lightboard videos for data streaming in telco

The state of data streaming for telco is fascinating. New use cases and case studies come up every month. This includes better data governance across the entire organization, real-time data collection and processing from network infrastructure and mobile apps, data sharing and B2B partnerships with OTT players for new business models, and many more scenarios.

We recorded lightboard videos showing the value of data streaming simply and effectively. These five-minute videos explore the business value of data streaming, related architectures, and customer stories. Stay tuned; I will update the links in the next few weeks and publish a separate blog post for each story and lightboard video.

And this is just the beginning. Every month, we will talk about the status of data streaming in a different industry. Manufacturing was the first. Financial services second, then retail, telcos, gaming, and so on…

Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.

Apache Kafka for Data Consistency (and Real-Time Data Streaming) https://www.kai-waehner.de/blog/2022/12/27/apache-kafka-for-data-consistency-and-real-time-data-streaming/ Tue, 27 Dec 2022

Real-time data beats slow data in almost all use cases. But as essential is data consistency across all systems, including non-real-time legacy systems and modern request-response APIs. Apache Kafka’s most underestimated feature is the storage component based on the append-only commit log. It enables loose coupling for domain-driven design with microservices and independent data products in a data mesh. This blog post explores how Kafka enables data consistency with a real-world case study from financial services.

Apache Kafka for Data Consistency (and Real-Time Data Streaming)

Apache Kafka = Real-time data streaming

Real-time beats slow data. It is that simple in almost all use cases. Ask any executive or business person: What is better, consuming and using information now or later? The value of data goes down over time:

Forrester - Business Value of Real Time Data

Apache Kafka is the de facto standard for real-time data streaming. Check out the data streaming landscape 2023 to learn more about Kafka-related products and cloud services.

So far, so good. However, one valid question always comes up: “Why is Apache Kafka different from a real-time message broker like RabbitMQ, IBM MQ, NATS, or Amazon SQS?”

TL;DR: Message brokers provide real-time messaging capabilities to produce and consume messages. Apache Kafka is a data streaming platform that combines messaging, storage, data integration, and stream processing capabilities.

My comparison of message brokers and data streaming explored the differences using ten characteristics. It is all about the storage component of Apache Kafka in the discussion of data consistency. Let’s explore why…

Real-time means many things…

… from deterministic systems with hard real-time up to minutes or even hours. Always define your requirements for real-time data processing and the end-to-end latency (not just the messaging component).

I clarified when to use Apache Kafka for real-time workloads in separate blog posts.

Data consistency = The biggest challenge of the enterprise architecture

Data consistency refers to whether the same data kept at different places does or does not match. The data is processed in many ways across the enterprise architecture:

  • Real-time: Message brokers or data streaming platforms transfer or process data when it is in motion.
  • Near real-time: Platforms ingest data into data lakes and data warehouses in seconds or minutes.
  • Batch: Reporting and analytics of historical data.
  • Request-response: Interactive API or SQL queries to collect specific information.
  • A point-in-time replay of historical data: Troubleshooting, incident management, regulatory reporting, and similar scenarios.

The applications and data platforms use very different (old and new) technologies, products, cloud services, and APIs. Integration and data consistency across the different communication paradigms is a massive challenge within the spaghetti architecture:

Integration Mess in a Spaghetti Enterprise Architecture

The consequence of inconsistent data is obvious:

  • Bad customer experience, e.g., late notification about flight delays or cancellations.
  • Revenue loss, e.g., inventory not up-to-date, missed or too late detection of fraud.
  • Increased cost, e.g., slow or wrong decisions in logistics across the supply chain.
  • Increased risk, e.g., unrecognized data breaches, compliance issues.

This is where the storage component of Apache Kafka makes the difference…

Apache Kafka = Streaming platform to decouple any application and communication paradigm

Kafka is an append-only commit log. Consumers are independent of each other and independent of producers. They interact at their own pace with their own communication paradigm and pull the information from the Kafka log.

This enables independent consumption and processing of data consistently. It does not matter what technology or communication paradigm the downstream consumer application uses:

Data Consistency with Real-Time, Batch or Request Response Consumption

This example of a real-time locating system (RTLS) for asset tracking can be built straightforwardly with Kafka. Downstream applications consume events in real-time, near real-time, batch, or via request-response HTTP/REST APIs. With other technologies, like real-time message queues, you need to add additional platforms for storage, integration, and data processing. Data streaming provides a single (scalable and reliable) platform.
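The decoupling boils down to consumer groups: every downstream application is simply another consumer group reading the same topic at its own pace. A minimal Java sketch, with hypothetical topic and group names:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class AssetTrackingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // A second application would use a different group.id and consume the
        // very same topic independently, without affecting this consumer.
        props.put("group.id", "rtls-dashboard");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("asset-positions")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("asset %s at %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```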

Apache Kafka enables domain-driven design and data mesh architectures

Domain-driven Design (DDD) needs decoupled applications. Push-based message brokers or HTTP/REST web services enforce tight coupling between the systems. This creates the above spaghetti architecture.

On the other side, Apache Kafka truly decouples the domains, no matter what technologies or communication paradigms each domain uses. Loose coupling is the norm with Kafka:

Decentralized Data Mesh powered by data streaming and Apache Kafka

This is why the storage component using the combination of real-time messaging and a distributed commit log enables data consistency across technologies and communication paradigms.

With Tiered Storage for Kafka, storage and compute are separated to enable long-term storage in Kafka for the replayability of historical data. This is not needed for every use case. Kafka does not replace your favorite database. But it is beneficial for some use cases, e.g., model training with TensorFlow and Kafka to leverage machine learning with a Kappa architecture.
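Replay itself is a client-side concern: a new consumer group (or one that seeks back) simply starts at the earliest retained offset and reprocesses the history. A small sketch with hypothetical topic and group names, assuming the topic still retains the data you need:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class HistoricalReplay {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // A brand-new consumer group has no committed offsets yet...
        props.put("group.id", "model-training-replay");
        // ...so this setting makes it start at the earliest retained offset.
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("payment-events")); // hypothetical topic
            // Reprocess the retained history, e.g., to (re)train an ML model.
            ConsumerRecords<String, String> history = consumer.poll(Duration.ofSeconds(5));
            history.forEach(r -> System.out.println(r.offset() + ": " + r.value()));
        }
    }
}
```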

Domain-driven design is the foundation of modern microservice architecture or data mesh. For that reason, many cloud-native enterprise architectures are built using Kafka as the heart of the infrastructure for real-time data sharing plus loose coupling between the data products.

Let’s look at a practical real-world example where the added value of Kafka was not its real-time capability but enabling data consistency across systems…

Erste Group Bank: A case study for data consistency with Apache Kafka

Erste Group Bank AG (Erste Group) is an Austrian financial service provider in Central and Eastern Europe serving 15.7 million clients in over 2,700 branches in seven countries. The bank presented its data streaming journey at Confluent’s Data in Motion 2022 tour in Zurich.

Having a strong mission to increase its customer experience, Erste Group is putting more and more data into action. Making this happen can be summarized as a race for consistency across its channels, squaring the circle between latency and data volumes.

The digital transformation at Erste Group required challenging integration across different technologies and communication paradigms. The following sections describe why Apache Kafka was chosen.

Hyper-personalized mobile banking

A great user experience in the new mobile app “George” is a crucial strategic component of Erste Group’s digital transformation to increase customer experience and revenue.

Here is how Erste Group promotes its mobile app: “For 8 million people in 6 countries, banking has a name. George. George empowers everyone to understand, manage, and improve their financial health. Simple. Intelligent. Personal. Unique.”

The following diagram shows positive and negative customer reviews of Erste Group’s mobile app “George” and competing banking apps:

Erste Bank Mobile App Reviews and Feedback
Source: Erste Group Bank AG

A great mobile app user experience requires the combination of many technologies in the backend. Data streaming enables the foundation in the backend to build an intuitive mobile app across various European countries.

Scalability and accurate information at the right time in the right context make the difference:

Erste Bank Data Platform Landscape
Source: Erste Group Bank AG

Fully managed data streaming for omnichannel data consistency

Here comes the surprising part of why I chose this case study for the blog post: While the real-time capability of Apache Kafka at any scale is essential, the critical aspect of the technology choice was data consistency across platforms and communication paradigms:

Real Time and Data Consistency with Apache Kafka and Confluent Cloud at Erste Bank Group
Source: Erste Group Bank AG

Erste Group built an enterprise architecture with data streaming to enable asynchronous decoupled domains with event sourcing. The responsibility is split across cross-functional teams like Digital and Business Intelligence. However, consistent data is served in different ways for various downstream consumers:

  • Stream processing for real-time subscriptions
  • A serving layer for API integration via request-response protocols like HTTP/REST
  • Tiered Storage for long-term replayability of historical data to enable analytical queries

Data Consistency with Stream Processing powered by Apache Kafka
Source: Erste Group Bank AG

The infrastructure is fully managed in Confluent Cloud to enable focusing on business problems and innovation. DevOps and MLOps automate the development and monitoring lifecycle of the applications.

Data consistency is as critical as real-time data

Apache Kafka is the de facto standard for real-time data streaming. In addition, most enterprise architectures leverage the append-only commit log for loose coupling to enable agile and elastic microservice architectures. The vision of building data products in a decentralized data mesh is made possible with Apache Kafka.

This post showed the case study of Erste Group to enable data consistency across domains and technologies with fully managed Kafka in the cloud. Obviously, we just explored the foundation. Data sharing across organizations and enforced data governance, including access control, encryption, and audit logging, are mandatory to realize a data mesh in the real world. I discussed these topics in my overview of the top 5 trends for data streaming in 2023.

Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.

The Heart of the Data Mesh Beats Real-Time with Apache Kafka https://www.kai-waehner.de/blog/2022/07/28/the-heart-of-the-data-mesh-beats-real-time-with-apache-kafka/ Thu, 28 Jul 2022

If there were a buzzword of the hour, it would undoubtedly be “data mesh“! This new architectural paradigm unlocks analytic and transactional data at scale and enables rapid access to an ever-growing number of distributed domain datasets for various usage scenarios. The data mesh addresses the most common weaknesses of the traditional centralized data lake or data platform architecture. And the heart of a decentralized data mesh infrastructure must be real-time, reliable, and scalable. Learn how the de facto standard for data streaming, Apache Kafka, plays a crucial role in building a data mesh.

The Heart of the Data Mesh Beats Real Time with Apache Kafka

There is no single technology or product for a data mesh!

This post explores how Apache Kafka, as an open and scalable decentralized real-time platform, can be the basis of a data mesh infrastructure and – complemented by many other data platforms like a data warehouse, data lake, and lakehouse – solve real business problems.

There is no silver bullet or single technology/product/cloud service for implementing a data mesh. The key outcome of a data mesh architecture is the ability to build data products with the right tool for the job. A good data mesh combines data streaming technology like Apache Kafka or Confluent Cloud with cloud-native data warehouse and data lake architectures from Snowflake, Databricks, Google BigQuery, et al.

What is a data mesh?

I won’t write yet another article describing the concepts of a data mesh. Zhamak Dehghani coined the term in 2019. The following data mesh architecture from a 30,000-foot view explains the basic idea well:

Data Mesh for Domain Driven Design Microservices Data Mart Data Streaming

I summarize data mesh with the following bullet points:

  • An architecture paradigm with several historical influences (domain-driven design, microservices, data marts, data streaming)
  • Not specific to a single technology or product; no single vendor can implement a data mesh alone
  • Handling data as a product is a fundamental change, enabling a more flexible architecture and independent solving of separate business problems
  • Decentralized services, not just analytics but also transactional workloads

Why handle data as a product?

Talking about innovative technology is insufficient to introduce a new architectural paradigm. Consequently, measuring the business value of the enterprise architecture is critical, too.

McKinsey finds that “when companies instead manage data like a consumer product—be it digital or physical—they can realize near-term value from their data investments and pave the way for quickly getting more value tomorrow. Creating reusable data products and patterns for piecing together data technologies enables companies to derive value from data today and tomorrow”:

McKinsey - Why Handle Data as a Product

For McKinsey, the benefits of this approach can be significant:

  • New business use cases can be delivered as much as 90 percent faster.
  • The total cost of ownership, including technology, development, and maintenance, can decline by 30 percent.
  • The risk and data-governance burden can be reduced.

What is data streaming with Apache Kafka and its relation to data mesh?

A data mesh enables flexibility through decentralization and best-of-breed data products. The heart of data sharing requires reliable real-time data at any scale between data producers and data consumers. Additionally, true decoupling between the decentralized data products is key to the success of the data mesh paradigm. Each domain must have access to shared data but also the ability to choose the right tool (i.e., technology, API, product, or SaaS) to solve its business problems.

That’s where data streaming fits into the data mesh story:

Flexibility through Decentralization and Best-of-Breed with Data Streaming

The de facto standard for data streaming is Apache Kafka. A cloud-native data streaming infrastructure that can link clusters with each other out-of-the-box enables building a modern data mesh. No Data Mesh will use just one technology or vendor. Learn from inspiring posts from your favorite data products vendors like AWS, Snowflake, Databricks, Confluent, and many more to successfully define and build your custom Data Mesh. Data Mesh is a journey, not a big bang. A data warehouse or data lake (or in modern days, a lakehouse) cannot be the only infrastructure for data mesh and data products.

I covered how to leverage the capabilities of Apache Kafka and its ecosystem like Kafka Connect, ksqlDB, Cluster Linking, etc. to build the heart of a data mesh in a separate blog post: Streaming Data Exchange with Kafka and a Data Mesh in Motion.

Example: Real-time data fabric in hybrid cloud

Here is one example spanning a streaming Data Mesh across multiple cloud providers like AWS, Azure, GCP, or Alibaba, and on-premise / edge sites:

Hybrid Cloud Streaming Data Mesh powered by Apache Kafka and Cluster Linking

This example shows all the characteristics discussed in the above sections for a Data Mesh:

  • Decentralized real-time infrastructure across domains and infrastructures
  • True decoupling between domains within and between the clouds
  • Several communication paradigms, including data streaming, RPC, and batch
  • Data integration with legacy and cloud-native technologies
  • Continuous stream processing where it adds value, and batch processing in some analytics sinks

Presentation: Building a decentralized data mesh with data streaming at its heart

The following slide deck walks you through the motivation, principles, and architectures of building a real-time data mesh powered by Apache Kafka using the Kappa architecture, hybrid cloud, and stream data sharing:

The data mesh provides flexibility and freedom of technology choice for each data product

The heart of a decentralized data mesh infrastructure must be real-time, reliable, and scalable. As the de facto standard for data streaming, Apache Kafka plays a crucial role in a cloud-native data mesh architecture. Nevertheless, data mesh is not bound to a specific technology. The beauty of the decentralized architecture is the freedom of technology choice for each business unit when building its data products.

Data sharing between domains within and across organizations is another aspect where data streaming helps in a data mesh. Real-time data beats slow data. That is not just true for most business problems across industries but also for replicating data between data centers, clouds, regions, or organizations. A streaming data exchange enables data sharing in real-time to build a data mesh in motion.

Did you already start building your Data Mesh? What does the enterprise architecture look like? What frameworks, products, and cloud services do you use? Is the heart of your data mesh real-time in motion or some lakehouse at rest? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Streaming Data Exchange with Kafka and a Data Mesh in Motion https://www.kai-waehner.de/blog/2021/11/14/streaming-data-exchange-data-mesh-apache-kafka-in-motion/ Sun, 14 Nov 2021

Data Mesh is a new architecture paradigm that gets a lot of buzz these days. Every data and platform vendor describes how to build the best Data Mesh with their platform. The Data Mesh story includes cloud providers like AWS, data analytics vendors like Databricks and Snowflake, and Event Streaming solutions like Confluent. This blog post looks deeper into this principle to explore why no single technology is the perfect fit to build a Data Mesh. Examples show why an open and scalable decentralized real-time platform like Apache Kafka is often the heart of the Data Mesh infrastructure, complemented by many other data platforms, to solve business problems.

Streaming Data Exchange with Apache Kafka and Data Mesh in Motion

Data at Rest vs. Data in Motion

Before we get into the Data Mesh discussion, it is crucial to clarify the difference and relevance of Data at Rest and Data in Motion:

  • Data at Rest: Data is ingested and stored in a storage system (database, data warehouse, data lake). Business logic and queries execute against the storage. Everyday use cases include reporting with business intelligence tools, model training in machine learning, and complex batch analytics like map/reduce and shuffle operations. As the data is at rest, the processing is too late for real-time use cases.
  • Data in Motion: Data is processed and correlated continuously while new events are fed into the platform. Business logic and queries execute in real-time. Common real-time use cases include inventory management, order processing, fraud detection, predictive maintenance, and many other use cases.

Real-time Data Beats Slow Data

Real-time beats slow data in almost all use cases across industries. Hence, ask yourself or your business team how they want or need to consume and process the data in the next project. Data at Rest and Data in Motion have trade-offs. Therefore, both concepts are complementary. For this reason, modern cloud infrastructures leverage both in their architecture. Serverless Event Streaming with Kafka combined with the AWS Lakehouse is a great resource to learn more.

However, while connecting a batch system to a real-time nervous system is possible, the other way round – connecting a real-time consumer to batch storage – is not possible. The Kappa vs. Lambda Architecture discussion gives more insights into this.

Kafka is a database. So, it is also possible to use it for data at rest. For instance, the replayability of historical events in guaranteed ordering is essential and helpful for many use cases. However, long-term storage in Kafka has several limitations, like limited query capabilities. Hence, for many use cases, event streaming and other storage systems are complementary, not competitive.

Data Mesh – An Architecture Paradigm

Data mesh is an implementation pattern (not unlike microservices or domain-driven design) but applied to data. Thoughtworks coined the term. You will find tons of resources on the web. Zhamak Dehghani gave a great introduction about “How to build the Data Mesh Foundation and its Relation to Event Streaming” at the Kafka Summit Europe 2021.

Domain-driven + Microservices + Event Streaming

Data Mesh is not an entirely new paradigm. It has several historical influences:

Data Mesh Architecture with Microservices Domain-driven Design Data Marts and Event Streaming

The architectural paradigm unlocks analytical data at scale, rapidly unlocking access to an ever-growing number of distributed domain data sets for a proliferation of consumption scenarios such as machine learning, analytics, or data-intensive applications across the organization. A data mesh addresses the common failure modes of the traditional centralized data lake or data platform architecture.

Data Mesh is a Logical View, not Physical!

Data mesh shifts to a paradigm that draws from modern distributed architecture: considering domains as the first-class concern, applying platform thinking to create a self-serve data infrastructure, treating data as a product, and implementing open standardization to enable an ecosystem of interoperable distributed data products.

Here is an example of a Data Mesh:

Data Mesh with Apache Kafka

TL;DR: Data Mesh combines existing paradigms, including Domain-driven Design, Data Marts, Microservices, and Event Streaming.

Data as the Product

However, the differentiating aspect focuses on product thinking (“Microservice for Data”) with data as a first-class product. Data products are a perfect fit for Event Streaming with Data in Motion to build innovative new real-time use cases.

Data Product - The Domain Driven Microservice for Data

A Data Mesh with Event Streaming

Why is Event Streaming a good fit for data mesh?

Streams are real-time, so you can propagate data throughout the mesh immediately, as soon as new information is available. Streams are also persisted and replayable, so they let you capture both real-time AND historical data with one infrastructure. And because they are immutable, they make for a great source of record, which is helpful for governance.

Data in Motion is crucial for most innovative use cases. And as discussed before, real-time data beats slow data in almost all scenarios. Hence, it makes sense that the heart of a Data Mesh architecture is an Event Streaming platform. It provides true decoupling, scalable real-time data processing, and highly reliable operations across the edge, data center, and multi-cloud.

Kafka Streaming API – The De Facto Standard for Data in Motion

The Kafka API is the de facto standard for Event Streaming. I won’t repeat this discussion here; a few earlier posts cover it in depth. Let’s move on to the “Kafka + Data Mesh” content…

A Kafka-powered Data Mesh

I highly recommend watching Ben Stopford’s and Michael Noll’s talk about “Apache Kafka and the Data Mesh“. Several of the screenshots in this post are from that presentation, too. Kudos to my two colleagues! The talk explores the key concepts of a Data Mesh and how they are related to Event Streaming:

  • Domain-driven Decentralization
  • Data as a Self-serve Product
  • First-class Data Platform
  • Federated Governance

Let’s now explore how Event Streaming with Kafka fits into the Data Mesh architecture and how other solutions like a database or data lake complement it.

Data Exchange for Input and Output within a Data Mesh using Kafka

Data product, a “microservice for the data world”:

  • A node on the data mesh, situated within a domain.
  • Produces and possibly consumes high-quality data within the mesh.
  • Encapsulates all the elements required for its function, namely data plus code plus infrastructure.

A Data Mesh is not just one Technology!

The heart of a Data Mesh infrastructure must be real-time, decoupled, reliable, and scalable. Kafka is a modern cloud-native enterprise integration platform (also often called iPaaS today). Therefore, Kafka provides all the capabilities for the foundation of a Data Mesh.

However, not all components can or should be Kafka-based. Choose the right tool for a problem. Let’s explore in the following subsections how Kafka-native technologies and other solutions are used in a Data Mesh together.

Stream Processing within the Data Product with Kafka Streams and ksqlDB

An event-based data product aggregates and correlates information from one or more data sources in real-time. Stateless and stateful stream processing is implemented with Kafka-native tools such as Kafka Streams or ksqlDB:

Event Streaming within the Data Product with Stream Processing Kafka Streams and ksqlDB
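As a rough illustration (not a reference implementation), the following Kafka Streams sketch combines a stateless filter with a stateful windowed aggregation inside a hypothetical “orders” data product. Topic names, the payload check, and the window size are assumptions:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

public class OrdersDataProduct {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-data-product");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> orders = builder.stream("orders"); // input data port

        orders
                // Stateless step: keep only completed orders (hypothetical payload check).
                .filter((customerId, order) -> order.contains("\"status\":\"COMPLETED\""))
                // Stateful step: count completed orders per customer in 1-hour windows.
                .groupByKey()
                .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofHours(1)))
                .count()
                .toStream()
                .map((windowedKey, count) ->
                        KeyValue.pair(windowedKey.key(), String.valueOf(count)))
                // Output data port of the data product.
                .to("orders-per-customer-hourly");

        new KafkaStreams(builder.build(), props).start();
    }
}
```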

Variety of Protocols and Communication Paradigms within the Data Product – HTTP, gRPC, MQTT, and more

Obviously, not every application uses just Event Streaming as a technology and communication paradigm. The above picture shows how one consumer application could also use a request/response technology like HTTP or gRPC to execute a pull query. In contrast, another application continuously consumes the streaming push query with a native Kafka consumer in any programming language, such as Java, Scala, C, C++, Python, Go, etc.

The data product often includes complementary technologies. For instance, if you build a connected car infrastructure, you likely use MQTT for the last-mile integration, ingest the data into Kafka, and do further processing with Event Streaming. The “Kafka + MQTT Blog Series” is an excellent example from the IoT space to learn about building a data product with complementary technologies.

Variety of Solutions within the Data Product – Event Streaming, Data Warehouse, Data Lake, and more

The beauty of microservice architectures is that every application can choose the right technologies. An application might or might not include databases, analytics tools, or other complementary components. The input and output data ports of the data product should be independent of the chosen solutions:

Data Stores within the Data Product with Snowflake MongoDB Oracle et al

Kafka Connect is the right Kafka-native technology to connect other technologies and communication paradigms with the Event Streaming platform. Evaluate if you need another integration middleware (like an ETL or ESB) or if the Kafka infrastructure is the better enterprise integration platform (iPaaS) for your data product within the data mesh.

A Global Streaming Data Exchange

The Data Mesh concept is relevant for global deployments, not just within a single project or region. Multiple Kafka clusters are the norm, not an exception. I wrote about customers using Event Streaming with Kafka in global architectures a long time ago.

Various architectures exist to deploy Kafka across data centers and multiple clouds. Some use cases require low latency and deploy some Kafka instances at the edge or in a 5G zone. Other use cases replicate data between regions, countries, or continents across the globe for disaster recovery, aggregation, or analytics use cases.

Here is one example spanning a streaming Data Mesh across multiple cloud providers like AWS, Azure, GCP, or Alibaba, and on-premise / edge sites:

Hybrid Cloud Streaming Data Mesh powered by Apache Kafka and Cluster Linking

This example shows all the characteristics discussed in the above sections for a Data Mesh:

  • Decentralized real-time infrastructure across domains and infrastructures
  • True decoupling between domains within and between the clouds
  • Several communication paradigms, including data streaming, RPC, and batch
  • Data integration with legacy and cloud-native technologies
  • Continuous stream processing where it adds value, and batch processing in some analytics sinks

Example: A Streaming Data Exchange across Domains in the Automotive Industry

The following example from the automotive industry shows how independent stakeholders (= domains in different enterprises) use a cross-company streaming data exchange:

Streaming Data Exchange with Data Mesh in Motion using Apache Kafka and Cluster Linking

Innovation does not stop at a company’s own borders. Streaming replication is relevant for all use cases where real-time is better than slow data (which is valid for most scenarios). A few examples:

  • End-to-end supply chain optimization from suppliers to the OEM to the intermediary to aftersales
  • Track and trace across countries
  • Integration of 3rd party add-on services into a company’s own digital product
  • Open APIs for embedding and combining external services to build a new product

I could go on and on with the list. Many data products need to be accessible by 3rd parties in real-time at scale. Some API gateway or API management tool comes into play in such a situation.

A real-world example of a streaming data exchange powered by Kafka is the mobility service Here Technologies. They expose the Kafka API to directly consume streaming data from their mapping services (as an alternative option to their HTTP API):

Here Technologies Mobility Service Apache Kafka Open API

However, even if all collaborating partners use Kafka under the hood in their architecture, exposing the Kafka API directly to the outside world does not always make sense. Some technical capabilities (e.g., access control or connectivity to thousands of devices) and missing business functions (e.g., for monetization or reporting) of the Kafka ecosystem bring an API layer on top of the Event Streaming infrastructure into play in many real-world deployments.

Open API for 3rd Party Integration and Streaming API Management

API Gateways and API Management tools exist in many varieties, including open-source frameworks, commercial products, and SaaS cloud offerings. Features include technical routing, access control, monetization, and reporting.

However, most people still implement the Open API concept with RPC in mind. I guess 95+% still use HTTP(S) to make APIs accessible to other stakeholders (e.g., other business units or external parties). RPC makes little sense in a streaming Data Mesh architecture if the data needs to be processed at scale in real-time.

There is still an impedance mismatch between Event Streaming and API Management. But it gets better these days. Specifications like AsyncAPI, calling itself the “industry standard for defining asynchronous APIs”, and similar approaches bring Open API to the data streaming world. My post “Kafka versus API Management with tools like MuleSoft, Kong, or Apigee” is still pretty much accurate if you want to dive deeper into this discussion. IBM API Connect was one of the first vendors that integrated Kafka via Async API.

A great example of the evolution from RPC to streaming APIs is the machine learning space. “Streaming Machine Learning with Kafka-native Model Deployment” explores how model servers such as Seldon enhance their product with a native Kafka API besides HTTP and gRPC request-response communication:

Kafka-native Machine Learning Model Server Seldon

Journey to the Streaming Data Mesh with Kafka

The paradigm shift is enormous. Data Mesh is not a free lunch. The same was and still is true for microservice architectures, domain-driven design, Event Streaming, and other modern design principles.

In analogy to Confluent’s maturity model for Event Streaming, our team has described the journey for deploying a streaming Data Mesh:

Data Mesh Journey with Event Streaming and Apache Kafka

The efforts likely take a few years in most scenarios. The shift is not just about technologies; adjustments to organizations and business processes are just as necessary. I guess most companies are still in a very early stage. Let me know where you are on this journey!

Streaming Data Exchange as Foundation for a Data Mesh

A Data Mesh is an implementation pattern, not a specific technology. However, most modern enterprise architectures require a decentralized streaming data infrastructure to build valuable and innovative data products in independent, truly decoupled domains. Hence, Kafka, being the de facto standard for Event Streaming,  comes into play in many Data Mesh architectures.

Many Data Mesh architectures span many domains in various regions or even continents. The deployments run at the edge, on-prem, and multi-cloud. The integration connects to many solutions and technologies with different communication paradigms.

A cloud-native Event Streaming infrastructure with the capability to link clusters with each other out-of-the-box enables building a modern Data Mesh. No Data Mesh will use just one technology or vendor. Learn from the inspiring posts from your favorite data products vendors like AWS, Snowflake, Databricks, Confluent, and many more to define and build your custom Data Mesh successfully. Data Mesh is a journey, not a big bang.

Did you already start building your Data Mesh? What does the enterprise architecture look like? What frameworks, products, and cloud services do you use? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Apache Kafka for Smart Grid, Utilities and Energy Production https://www.kai-waehner.de/blog/2021/01/14/apache-kafka-smart-grid-energy-production-edge-iot-oil-gas-green-renewable-sensor-analytics/ Thu, 14 Jan 2021

The energy industry is changing from system-centric to smaller-scale and distributed smart grids and microgrids. A smart grid requires a flexible, scalable, elastic, and reliable cloud-native infrastructure for real-time data integration and processing. This post explores use cases, architectures, and real-world deployments of event streaming with Apache Kafka in the energy industry to implement a smart grid and real-time end-to-end integration.

Smart Grid Energy Production and Distribution with Apache Kafka

Smart Grid – The Energy Production and Distribution of the Future

The energy sector includes corporations that are primarily in the business of producing or supplying energy, such as fossil fuels or renewables.

What is a Smart Grid?

A smart grid is an electrical grid that includes a variety of operation and energy measures, including smart meters, smart appliances, renewable energy resources, and energy-efficient resources. Electronic power conditioning and control of the production and distribution of electricity are important aspects of the smart grid.

The European Union Commission Task Force for Smart Grids provides the following smart grid definition:

“A Smart Grid is an electricity network that can cost-efficiently integrate the behavior and actions of all users connected to it – generators, consumers and those that do both – to ensure economically efficient, sustainable power system with low losses and high levels of quality and security of supply and safety. A smart grid employs innovative products and services, together with intelligent monitoring, control, communication, and self-healing technologies to:

  1. Better facilitate the connection and operation of generators of all sizes and technologies.
  2. Allow consumers to play a part in optimizing the operation of the system.
  3. Provide consumers with greater information and options for how they use their supply.
  4. Significantly reduce the environmental impact of the whole electricity supply system.
  5. Maintain or even improve the existing high levels of system reliability, quality, and security of supply.
  6. Maintain and improve existing services efficiently.”

Technologies and Evolution of a Smart Grid

The roll-out of smart grid technology also implies a fundamental re-engineering of the electricity services industry, although typical usage of the term is focused on the technical infrastructure. Key smart grid technologies include solar power, smart meters, microgrids, and self-optimizing systems:

Smart Grid - Energy Industry

Requirements for a Smart Grid and Modern Energy Infrastructure

  • Reliability: The smart grid uses technologies such as state estimation which improve fault detection and allow self-healing of the network without the intervention of technicians. This will ensure a more reliable supply of electricity and reduced vulnerability to natural disasters or attack.
  • Flexibility in network topology: Next-generation transmission and distribution infrastructure will be better able to handle possible bidirectional energy flows, allowing for distributed generation such as from photovoltaic panels on building roofs, but also charging to/from the batteries of electric cars, wind turbines, pumped hydroelectric power, the use of fuel cells, and other sources.
  • Efficiency: Numerous contributions to the overall improvement of the efficiency of energy infrastructure are anticipated from the deployment of smart grid technology, in particular including demand-side management, for example, turning off air conditioners during short-term spikes in electricity price, reducing the voltage when possible on distribution lines through Voltage/VAR Optimization (VVO), eliminating truck-rolls for meter reading, and reducing truck-rolls by improved outage management using data from Advanced Metering Infrastructure systems. The overall effect is less redundancy in transmission and distribution lines and greater utilization of generators, leading to lower power prices.
  • Sustainability: The smart grid’s improved flexibility permits greater penetration of highly variable renewable energy sources such as solar power and wind power, even without the addition of energy storage.
  • Market-enabling: The smart grid allows for systematic communication between suppliers (their energy price) and consumers (their willingness to pay). It permits both the suppliers and the consumers more flexible and sophisticated operational strategies.
  • Cybersecurity: Provide a secure infrastructure with encrypted and authenticated communication and real-time anomaly detection at scale across the supply chain.

Architectures with Kafka for a Smart Grid

From a technical perspective, use cases such as load adjustment/load balancing or peak curtailment/leveling and time-of-use pricing cannot be implemented with the traditional, monolithic software used in the past in the energy industry.

A smart grid requires a cloud-native infrastructure that is flexible, scalable, elastic, and reliable. All of that in combination with real-time data integration and data processing. These requirements explain why more and more energy companies rely heavily on event streaming with Apache Kafka and its ecosystem.

Energy Production and Distribution with a Hybrid Architecture

Many energy companies have a cloud-first strategy. They build new innovative applications in the cloud. Especially in the energy industry, this makes a lot of sense. The need for elastic and scalable data processing is key to success. However, not everything can run in the cloud. Most energy-related use cases require data processing at the edge, too. This is true for energy production and energy distribution.

Here is an example architecture leveraging Apache Kafka in the cloud and at the edge:

Smart Grid Energy Production and Distribution with Apache Kafka in a Hybrid Architecture

The integration in the cloud requires connecting to modern technologies such as Snowflake data warehouse, Google’s Machine Learning services based on TensorFlow, or 3rd party SaaS services like Salesforce. The edge is different. Connectivity is required for machines, equipment, sensors, PLCs, SCADA systems, and many other systems. Kafka Connect is a perfect, Kafka-native tool to implement these integrations in real-time at scale.

Replication in real-time between the edge sites and the cloud is another important case where tools like MirrorMaker 2 or Confluent’s Cluster Linking fit perfectly.

The continuous processing of data (aka stream processing) is possible with Kafka-native components like Kafka Streams or ksqlDB. Using an external tool such as Apache Flink is also a good fit.

Event Streaming for Energy Production at the Edge with a 5G Campus Network

Kafka at the edge is the new black. Energy production is a great example:

Event Streaming with Apache Kafka for Energy Production and Smart Grid at the Edge with a 5G Campus Network

More details about deploying Kafka at edge sites are described in the post “Building a Smart Factory with Apache Kafka and 5G Campus Networks“.

The edge is often disconnected from the cloud or remote data centers. Mission-critical applications have to run 24/7 in a decoupled way even if the internet connection is not available or not stable:

Energy Production and Smart Grid at the Disconnected Edge with Apache Kafka

Example: Real-Time Outage Management with Kafka in the Utilities Sector

Let’s take a look at an example implemented in the utilities sector: Continuous processing of smart meter sensor data with Kafka and ksqlDB:

Smart Meters - High Frequency Noise Filter with Apache Kafka
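To make this more concrete, here is a minimal Kafka Streams sketch of such an edge filter (the picture above shows the ksqlDB variant). The topic names, application id, and plausibility threshold are assumptions for this example, not taken from the referenced implementation:

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class SmartMeterNoiseFilter {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "smart-meter-noise-filter"); // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "edge-broker:9092");      // Kafka cluster at the edge site

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("smart-meter-raw", Consumed.with(Serdes.String(), Serdes.Double()))
               // drop implausible spikes / high-frequency noise before the data leaves the edge
               .filter((meterId, kwh) -> kwh != null && kwh >= 0 && kwh < 1_000)
               .to("smart-meter-filtered", Produced.with(Serdes.String(), Serdes.Double()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

The same logic can be expressed in ksqlDB as a persistent query (CREATE STREAM ... AS SELECT ... WHERE ...) if you prefer SQL over Java.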

The preprocessing and filtering happens at the edge, as shown in the above picture. However, the aggregation and monitoring of all the assets of the smart grid (including smart home, smart buildings, powerlines, switches, etc.) happen in the cloud:

Cloud Aggregator for Field Management and Smart Grid with Apache Kafka
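A corresponding sketch for the cloud side could aggregate the filtered readings per grid region in time windows. Again, the topic names, the region lookup, and the five-minute window are illustrative assumptions:

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.TimeWindows;

public class GridRegionAggregator {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "grid-region-aggregator"); // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "cloud-broker:9092");   // Kafka cluster in the cloud

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("smart-meter-filtered", Consumed.with(Serdes.String(), Serdes.Double()))
               // re-key from meter id to grid region (simplified placeholder lookup)
               .selectKey((meterId, kwh) -> regionOf(meterId))
               .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
               .windowedBy(TimeWindows.of(Duration.ofMinutes(5)))
               .reduce(Double::sum) // total consumption per region and window
               .toStream()
               // a real application would write this to an output topic instead of printing it
               .foreach((windowedRegion, totalKwh) ->
                       System.out.printf("%s -> %.2f kWh%n", windowedRegion.key(), totalKwh));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    private static String regionOf(String meterId) {
        // placeholder; a real deployment would look this up from a master data topic or table
        return meterId.substring(0, Math.min(2, meterId.length()));
    }
}
```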

Real-time data processing is not just important for operations. Huge added value comes from the improved customer experience. For instance, the outage management operations tool can alert a customer in real-time:

Real-Time Outage Management for a Better Customer Experience with Apache Kafka

Let’s now take a look at a few real-world examples of energy-related use cases.

Kafka Real-World Deployments in the Energy Sector

This section explores three very different deployments and architectures of Kafka-based infrastructure to build smart grids and energy production-related use cases: EON, WPX Energy, and Tesla.

EON – Smart Grid for Energy Production and Distribution with Kafka

The EON Streaming Platform is built on the following paradigms to provide a cloud-native smart grid infrastructure:

  • IoT scale capabilities of public cloud providers
  • EON microservices that are independent of cloud providers
  • Real-time data integration and processing powered by Apache Kafka

Kafka at EON Cloud Streaming Platform

WPX Energy – Kafka at the Edge for Integration and Analytics

WPX Energy (now merged into Devon Energy) is a company in the oil & gas industry. The digital transformation creates many opportunities to improve processes and reduce costs in this vertical. WPX leverages Confluent Platform on Hivecell edge hardware to realize edge processing and replication to the cloud in real-time at scale.

Concept of automation in oil and gas industry

The solution is designed for true real-time decision-making and potential future closed-loop control optimization. WPX conducts edge stream processing to enable real-time decision-making at the well site. They also replicate business-relevant data streams produced by machine learning models and analytical preprocessed data from the well site to the cloud, enabling WPX to harness the full power of its real-time events.

Kafka deployments at the edge (i.e., outside a data center) come up more and more in the energy industry, but also in factories, restaurants, retail stores, banks, and hospitals.

Tesla – Kafka-based Data Platform for Trillions of Data Points per Day

Tesla is not just a car maker. Tesla is a tech company writing a lot of innovative and cutting-edge software. They provide an energy infrastructure for cars with their Tesla Superchargers, solar energy production at their Gigafactories, and much more. Processing and analyzing the data from their smart grids and integrating with the rest of the IT backend services in real-time is a key piece of their success:

Tesla Factory

Tesla has built a Kafka-based data platform infrastructure “to support millions of devices and trillions of data points per day”. Tesla showed an interesting history and evolution of their Kafka usage at a Kafka Summit in 2019:

History of Kafka Usage at Tesla

Kafka + OT Middleware (OSIsoft PI, Siemens MindSphere, et al)

A common and very relevant question is the relation between Apache Kafka and traditional OT middleware such as OSIsoft PI or Siemens MindSphere.

TL;DR: Apache Kafka and OT middleware complement each other. Kafka is NOT an IoT platform. OT middleware is not built for the integration and correlation of OT data with the rest of the enterprise IT. Kafka and OT middleware connect to each other very well. Both sides provide integration options, including REST APIs, native Kafka Connect connectors, and more. Hence, in many cases, enterprises leverage and integrate both technologies instead of choosing just one of them.

Apache Kafka and OT Middleware such as OSIsoft PI or Siemens MindSphere

Please check out the following blogs/slides/videos to understand how Apache Kafka and OT middleware complement each other very well:

Slides: Kafka-based Smart Grid and Energy Use Cases and Architectures

The following slide deck goes into more details about this topic:

The Future of Kafka for the Energy Sector and Smart Grid

Kafka is relevant in many use cases for building an elastic and scalable smart grid infrastructure. Even beyond, many projects utilize Kafka heavily for customer 360, payment processing, and many other use cases. Check out the various “Use Cases and Architectures for Apache Kafka across Industries“. Energy companies can apply many of these use cases, too.

If you have read this far and wonder what “real-time” actually means in the context of Kafka and the OT/IT convergence, please check out the post “Kafka is NOT hard real-time“.

What are your experiences and plans for event streaming in the energy industry? Did you already build a smart grid with Apache Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka for Smart Grid, Utilities and Energy Production appeared first on Kai Waehner.

]]>
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka https://www.kai-waehner.de/blog/2020/12/16/top-5-event-streaming-apache-kafka-use-cases-2021-edge-hybrid-cloud-cybersecurity-machine-learning-service-mesh/ Wed, 16 Dec 2020 08:16:58 +0000 https://www.kai-waehner.de/?p=2885 TOP 5 Event Streaming Architectures and Use Cases for 2021: Edge deployments, hybrid and multi-cloud architectures, service mesh-based microservices, streaming machine learning, and cybersecurity.

The post Top 5 Event Streaming Use Cases for 2021 with Apache Kafka appeared first on Kai Waehner.

]]>
Apache Kafka and Event Streaming are two of the most relevant buzzwords in tech these days. Ever wonder what the predicted TOP 5 Event Streaming Architectures and Use Cases for 2021 are? Check out the following presentation. Learn about edge deployments, hybrid and multi-cloud architectures, service mesh-based microservices, streaming machine learning, and cybersecurity.
Apache Kafka and Event Streaming Top Use Cases for 2021

After preparing the presentation, I also discovered Gartner’s top strategic technology trends for 2021:

Gartner Top Strategic Technology Trends for 2021

It is really funny (but not surprising) how much these Gartner predictions overlap and complement the five use cases I focus on for Event Streaming and Apache Kafka. The tech industry’s key trends are all about data correlation, real-time processing, analytics, and integration between various systems and technologies—all of that globally and securely. Hence, here you go with the top 5 trends around Apache Kafka for 2021…

Top 5 Apache Kafka Use Cases for 2021

Here are the five use cases I see coming up more and more in conversations with customers across the globe:
  • Edge deployments outside the data center: It’s time to challenge the normality of limited hardware and disconnected infrastructure.
  • Hybrid and multi-cloud architectures: Discover how these span multiple sites across regions, continents, data centers, and clouds with real-time information at scale to connect legacy and modern infrastructures.
  • Service mesh-based microservices architectures: Learn what becomes possible when organizations can provide a cloud-native event-based infrastructure.
  • Streaming machine learning: In 2021, many companies will move to streaming machine learning in production without the need for a data lake that enables scalable real-time analytics.
  • Cybersecurity: While security never goes out of style, in 2021, we will see cybersecurity in real-time at scale with openness and flexibility at its core.

Hybrid and Global Apache Kafka and Event Streaming Use Case

Slides and Video for Event Streaming Use Cases in 2021

Here are the slides and the on-demand video recording for this talk.

Slides

On-Demand Video Recording

Kafka Use Cases Top 5 2021 including Cybersecurity Hybrid Edge Multi Cloud Machine Learning Service Mesh

What are your most relevant and interesting use cases for Event Streaming and Apache Kafka in 2021? What are your strategy and timeline? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Top 5 Event Streaming Use Cases for 2021 with Apache Kafka appeared first on Kai Waehner.

]]>
Kafka and XML Messages – Transformation, Connector, Middleware https://www.kai-waehner.de/blog/2020/09/25/kafka-xml-messages-transformation-connector-middleware-comparison-connect-smt-esb-etl-web-services-soap-wsdl-schema/ Fri, 25 Sep 2020 07:20:15 +0000 https://www.kai-waehner.de/?p=2689 XML messages and XML Schema are not very common in the Apache Kafka and Event Streaming world! Why?…

The post Kafka and XML Messages – Transformation, Connector, Middleware appeared first on Kai Waehner.

]]>
XML messages and XML Schema are not very common in the Apache Kafka and Event Streaming world! Why? Many people call XML legacy. It is complex, verbose, and often associated with the ugly WS-* Hell (SOAP, WSDL, etc.). On the other hand, every company older than five years uses XML. It is well understood, provides a good structure, and is human- and machine-readable.

This post does not want to start another flame war between XML and other technologies such as JSON (which also provides JSON Schema now), Avro, or Protobuf. Instead, I will walk you through the three main approaches to integrate between Kafka and XML messages as there is still a vast demand for implementing this integration today (often for integrating legacy applications and middleware).

XML and XML Schema

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium’s XML 1.0 Specification of 1998 and several other related specifications – all of them free open standards – define XML.

The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services. Several schema systems exist to aid in defining XML-based languages, while programmers have developed many application programming interfaces (APIs) to assist the processing of XML data.

SOAP / WSDL Web Services – The WS-* Hell

Web Services use XML messages that follow the SOAP standard and have been popular with traditional enterprises for many years. In such systems, there is often a machine-readable description of the operations offered by the service written in the Web Services Description Language (WSDL). Web Services are one of the predominant use cases for XML integration. Some people call this the “WS-* Hell”:

XML Web Service Hell - WS-*

Kafka for any Data Format (JSON, XML, Avro, Protobuf, …)

Kafka can store and process anything, including XML. The Kafka brokers are dumb. They don’t care about data formats. The implementation of Kafka under the hood stores and processes only byte arrays. This approach follows the design principle of dumb pipes and smart endpoints (coined by Martin Fowler for microservice architectures). Dumb brokers are one of the architectural reasons why Kafka scales and performs so well.

As Kafka supports any data format, XML is no problem at all. It accepts any serializable input format. XML is just text so that plain string serializers can be used. However, if you want additional validation before pushing messages into Kafka (like checking the content is actually XML or doing Schema Validation using XML Schema), then you need to write your own XML Serializer/Deserializer implementation.
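For illustration, here is a minimal sketch of such a custom serializer. It validates the XML payload against an XSD before writing it to Kafka as plain UTF-8 bytes; the "xml.schema.location" config key is made up for this example:

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Serializer;

// Minimal sketch of a custom Kafka serializer that validates XML against an XSD before producing it.
public class ValidatingXmlSerializer implements Serializer<String> {

    private Schema xsd;

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // "xml.schema.location" is a hypothetical config key pointing to the XSD file or URL
            xsd = factory.newSchema(new StreamSource((String) configs.get("xml.schema.location")));
        } catch (Exception e) {
            throw new SerializationException("Could not load XML Schema", e);
        }
    }

    @Override
    public byte[] serialize(String topic, String xml) {
        if (xml == null) {
            return null;
        }
        try {
            Validator validator = xsd.newValidator(); // Validator is not thread-safe, so create one per call
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            // invalid XML never reaches the broker; the producer can route it to a dead letter topic instead
            throw new SerializationException("XML message does not match the XSD", e);
        }
        return xml.getBytes(StandardCharsets.UTF_8);
    }
}
```

The producer would reference such a class via the value.serializer property; a matching deserializer for the consumer side follows the same pattern.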

XML mappings can be very complex, including referencing other documents and using plenty of ugly open standards. XML includes generic standards such as WS-Security or industry-specific standards such as XBRL for regulatory processing or HL7 for healthcare. It is a mess. Period. Even mature tools struggle as soon as any part of such a standard is adjusted to specific needs (even though the standards support customization).

Kafka works with any programming language, and Confluent also provides a REST Proxy for HTTP(S) communication with Kafka. But these clients require the developer to implement all the complex XML mapping and processing. Hence, let’s now talk about the most common approaches to implement the integration between any XML-based application and Apache Kafka: 3rd party middleware (ETL / ESB tools) and Kafka-native Kafka Connect.

Kafka and XML via Middleware (ETL, ESB)

Apache Kafka and traditional middleware (ETL, ESB) are frenemies (friends and enemies). Check out this blog and video recording/slide deck for a more in-depth discussion and comparison. If these legacy integration tools such as TIBCO BusinessWorks or Software AG webMethods do one thing well, then it is graphical mappings of complex XML structures, including good and mature (but not 100%) support of the ugly WS-* web service standards.

XML Kafka Integration with 3rd Party Middleware - ESB ETL Tools

Pros of 3rd Party Middleware for XML-Kafka Integration
  • Visual coding for a more straightforward mapping experience (especially crucial for very complex structures) – for all coders: Trust me, this is really easier and more time-efficient than writing, testing, and debugging source code!
  • Mature (10-20 years old technologies don’t have many bugs anymore – if they are still alive)
  • Support of complex XML Schema structures (but yet often issues with import, UI, and export)
  • (Often) already in place (implemented, tested, and deployed)
  • Kafka integration exists for any middleware which is still alive (i.e., maintained and supported by the vendor)
Cons of 3rd Party Middleware for XML-Kafka Integration
  • Products are legacy – as old as the source systems
  • Monolithic, inflexible architecture
  • Separate infrastructure to operate, test, maintain and pay for
  • End-to-end integration is more challenging (from a technical and support perspective) as there are two systems in the middle instead of just one
  • Licensing
  • Point-to-point and tight coupling, and not event-based streaming with real decoupling
  • (Often) proprietary solution

Traditional middleware (such as TIBCO, IBM, Software AG, or Mulesoft) complements Kafka deployments. If you have the middleware already running and licensed (and do not plan to migrate away from it), then this is a viable approach to integrate between XML messages from legacy systems and Kafka.

Kafka and XML via Kafka Connect

The open Kafka ecosystem provides Kafka-native support for XML integration leveraging Kafka Connect. Kafka Connect is a Kafka-native tool for scalable and reliable streaming data integration between Apache Kafka and other data systems. It makes it simple to quickly define connectors that move large data sets into and out of Kafka. Think about it as ESB-on-Kafka.

Use cases include

  • Messaging integration (MQ)
  • Mainframe offloading
  • File outputs from batch processes
  • Web services (SOAP / WSDL / WS-*)
  • Legacy applications
Pros
  • Kafka-native (leveraging Kafka under the hood for scalability, throughput, high availability, exactly-once semantics, low latency, etc.)
  • Decoupled design (Domain-driven Microservice approach instead of tight coupling)
  • Open ecosystem and flexible integration with any data sources and sinks – check out Confluent Hub for hundreds of open source and commercial connectors
  • Cloud-native to be deployed in any edge / on-premise or cloud infrastructure such as Kubernetes
Cons
  • Limited support for complex XML Schemas and standards – not all ugly documents will work well
  • No visual coding – unfortunately, no Kafka-native visual coding tools exist in 2020. Let’s go, Confluent! 🙂

Let’s take a look at two Kafka Connect approaches in more detail: A dedicated XML Connector and an SMT (Single Message Transformation) embedded into any Kafka Connect source or sink connector.

Kafka Connect Connector for XML Files

An XML connector directly accesses the XML file to parse and transform the content:

Kafka XML Integration with Kafka Connect XML Connector

Connect FilePulse is an open-source Kafka Connect connector built by streamthoughts to parse, transform and stream any XML file. Other file formats are also supported. But since many other tools already support modern data formats such as JSON, CSV, Avro, or Protobuf, I really think the highlight of this connector is its XML support.

Features:

  • Support for recursive scanning of local directories.
  • Reading and writing files into Kafka line by line.
  • Support for multiple input file formats (e.g., CSV, JSON, Avro, XML).
  • Parsing and transforming data using built-in or custom processing filters
  • Error handler definition
  • Monitoring files while they are written into Kafka
  • Support for pluggable strategies to clean up completed files

Here is an excellent tutorial for using this XML connector for Kafka Connect: Streaming data into Kafka – Loading an XML file.

The Connect FilePulse Kafka Connector is the right choice for direct integration between XML files and Kafka.

SMT for Embedding XML Transformations into ANY Kafka Connect Connector

An SMT (Single Message Transformation) is part of the Kafka Connect framework. SMTs are applied to messages as they flow through Kafka Connect. They transform inbound messages after a source connector has produced them, but before they are written to Kafka. SMTs transform outbound messages before they are sent to a sink connector.

An SMT can be embedded into any Kafka Connect source or sink connector. Hence, the XML SMT for Kafka Connect allows direct integration with any interface and mapping XML messages without the need for storing the file or using a specific XML connector.

Kafka XML Integration with SMT and ANY Source Sink Connector

SMTs even allow you to add or change metadata, e.g., by adding a new header in addition to the key and value of the message.

Here is an example: Receive XML messages from JMS-based messaging platforms and convert the XML payload to JSON, AVRO, or Protobuf for further processing and integration into the rest of the (modern) enterprise architecture. For instance, Confluent provides a generic JMS connector but also dedicated connectors for various legacy MQ products such as IBM MQ (often running on the mainframe), TIBCO EMS, and ActiveMQ. The XML SMT allows on-the-fly transformation of the incoming XML messages. I have seen integration and later replacement of these MQ tools across the globe in any kind of industry.
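To give an idea of what such a transformation looks like under the hood, here is a bare-bones sketch of a custom SMT. It only hints at the XML-to-JSON conversion (the helper method is a placeholder for an XML-to-JSON mapper of your choice); the existing XML SMT mentioned above is the production-ready option:

```java
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.transforms.Transformation;

// Skeleton of a custom Single Message Transformation (SMT) that rewrites the record value.
public class XmlToJson<R extends ConnectRecord<R>> implements Transformation<R> {

    @Override
    public void configure(Map<String, ?> configs) {
        // read SMT configuration here, e.g. which fields to map (omitted in this sketch)
    }

    @Override
    public R apply(R record) {
        Object value = record.value();
        if (value == null) {
            return record;
        }
        String json = convertXmlToJson(value.toString());
        // newRecord(...) keeps topic, partition, key, and timestamp, and swaps in the new value
        return record.newRecord(record.topic(), record.kafkaPartition(),
                record.keySchema(), record.key(),
                null, json, record.timestamp());
    }

    private String convertXmlToJson(String xml) {
        // placeholder: plug in an XML-to-JSON mapper here
        return xml;
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public void close() {
    }
}
```

The SMT is then activated in the source or sink connector configuration via the standard transforms setting of Kafka Connect.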

Dead Letter Queue (DLQ) for Handling Bad XML Messages

Just transforming messages is often not sufficient. A Dead Letter Queue (DLQ), aka Dead Letter Channel, is an Enterprise Integration Pattern (EIP) to handle bad messages. This design pattern is complementary to XML integration. For instance, a DLQ can store badly processed XML that didn’t fit the XSD in the transform. Here is an example of how to implement the rerouting to a DLQ using the above SMT.

Confluent Schema Registry for Data Governance

The Confluent Schema Registry is a complementary (optional) tool. It provides a smart implementation of data format and content validation (including enforcement, versioning, and other features). I see it used in ~70% of Kafka projects across the globe. As soon as you do more than just data ingestion into a data lake like HDFS or S3, the added value is enormous. Today, Confluent Schema Registry supports JSON Schema, Avro, and Protobuf.

The Schema Registry provides an open interface and is pluggable. For example, some users have asked for Schema Registry to support XML. Now, you can add XML support to Schema Registry directly, and use the Schema Registry to store both XML and Avro at the same time. For more on how to add your schema formats, please refer to the documentation. The workaround is to do the discussed XML-to-another-format transformation first and start your event streaming data governance from that point.
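As a small sketch of that workaround: once the XML payload has been converted, the producer can write Avro records that Schema Registry validates and versions. The topic, schema, and field names below are illustrative assumptions:

```java
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerAfterXmlConversion {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Confluent's Avro serializer registers and validates the schema against Schema Registry
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        // Simplified target schema for the converted XML payload (field names are made up)
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
                + "{\"name\":\"orderId\",\"type\":\"string\"},"
                + "{\"name\":\"amount\",\"type\":\"double\"}]}");

        GenericRecord order = new GenericData.Record(schema);
        order.put("orderId", "4711");
        order.put("amount", 42.0);

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "4711", order));
        }
    }
}
```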

Summary

XML is predominant in most enterprises and mostly used for legacy applications, batch processing, and SOAP / WSDL web services. A digital transformation can only be successful if the old world is connected well to the new world (as you might have learned in my example about how to integrate between Kafka and Mainframes).

This post explored the three most common options for integration between Kafka and XML:

  • XML integration with a 3rd party middleware
  • Kafka Connect connector for integration with XML files
  • Kafka Connect SMT to embed the transformation into ANY other Kafka Connect source or sink connector to transform XML messages on the fly

What are your experiences with XML integration for Kafka? Which implementation did you choose? What challenges did you face, and how did you or do you plan to solve this? What is your strategy? Let’s connect on LinkedIn and discuss it!

The post Kafka and XML Messages – Transformation, Connector, Middleware appeared first on Kai Waehner.

]]>
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? https://www.kai-waehner.de/blog/2020/05/25/api-management-gateway-apache-kafka-comparison-mulesoft-kong-apigee/ Mon, 25 May 2020 10:42:10 +0000 https://www.kai-waehner.de/?p=2308 Event Streaming with Apache Kafka and API Management / API Gateway solutions (Apigee, Mulesoft Anypoint, Kong, TIBCO Mashery,…

The post Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? appeared first on Kai Waehner.

]]>
Event Streaming with Apache Kafka and API Management / API Gateway solutions (Apigee, Mulesoft Anypoint, Kong, TIBCO Mashery, etc.) are complementary, not competitive! Read this blog post to understand the relation between these two components in your enterprise architecture.

API Management has been relevant for many years already. I talked about “A New Front for SOA: Open API and API Management as Game Changer” in 2014 when SOAP Web Services and Service-Oriented Architectures (SOA) were cutting-edge technologies and concepts. Exposing APIs and monetization were still in their infancy at that time. EDI / EDIFACT and similar complex technologies were used for B2B communication. B2C communication was just starting with smartphones and mobile apps. Internally, billing was done with estimations and Excel sheets instead of automated and accurate information systems.

Let’s start this blog post with an overview of the current market situation. Use cases and the relation between event streaming with Apache Kafka and API Management with tools like Mulesoft Anypoint Platform are discussed afterwards. The last part of the post explores the future of API Management for streaming technologies (and how you can already solve this use case today).

Market Situation – One Middleware Tool to Solve All your Problems?

Microservices became the new black in enterprise architectures. APIs provide functions to other applications or end users. Even if your architecture uses a pattern other than microservices, like SOA (Service-Oriented Architecture) or client-server communication, APIs are used between the different applications and end users.

Apache Kafka plays a key role in modern architectures to build open, scalable, flexible and decoupled real time applications. API Management complements Kafka by providing a way to implement and govern the full life cycle of the APIs. This blog post explores how event streaming with Apache Kafka and API Management (including API Gateway and Service Mesh technologies) complement each other, and why they are still not always a perfect match.

In the middleware market, every software vendor is the best one and puts itself into the middle of the enterprise architecture; at least if you trust marketing graphics. No matter which vendor’s website you visit, you will see something similar to this:

Middleware API Management

Middleware, Event Streaming and API Management Vendors

Here are some examples of global middleware vendors providing software to glue together applications and to provide APIs:

  • Universal Players offer various products. Vendors like Red Hat / IBM, Oracle, Software AG, TIBCO even offer different overlapping and competing solutions. For instance, IBM has 10+ products for integration middleware (not included are the rebranded product names).
  • Cloud Providers like AWS, GCP, Azure and Alibaba provide a vast number of services for gluing together applications and services.
  • Some companies focus just on Messaging, for instance Solace or Synadia (the company behind nats.io).
  • Event Streaming Platforms like Confluent or Streamlio (the company behind Pulsar; acquired by Splunk recently) are relatively new on the market (compared to the above categories), but get more and more traction these days.
  • API Management solutions like Mulesoft, Apigee or Kong focus on the creation, life cycle management and monetization of APIs.
  • New startups focus on specific niches or cutting-edge technologies, like solo.io providing an API Gateway on top of the Envoy Proxy Service Mesh.

MQ, ETL, ESB, Kafka, API Management – When to use which Tool(s)?

Obviously this market situation creates an important question: When to use which tool(s)? How do they overlap with each other? When are they complementary?

I covered the discussion about traditional middleware and Kafka already in detail. Check out “Event streaming with Apache Kafka vs. traditional middleware using MQ, ETL, ESB“.

It is also relatively easy to explain the relation between traditional middleware and API Management: Build a SOAP- or REST-based application (aka web service) and put an API Gateway or API Management tool in front of it to manage its lifecycle and monetize it.

Important pointer here: Some platforms like Mulesoft provide an ESB and API Management. You can use just one of them, or both together. Just make sure to compare the right things (to Kafka). For Mule ESB (vs. Kafka), check out the above link. For Mulesoft’s Anypoint Platform for API Management (vs. Kafka), read the below content… Read both if you need integration middleware and API Management.

How do Apache Kafka and API Management relate to each other? This question is harder to answer because both solve very different problems based on different technologies. Let’s discuss this topic in more detail in the following.

Use Cases for Event Streaming and Apache Kafka

First of all, it is very important to understand what ‘Event Streaming’ is and why this is different from the “traditional API approach” providing REST or SOAP web services.

Apache Kafka is used in all Industries and Verticals

Some use cases can also be done with other technologies, but it is easier and a simpler architecture with Kafka. That is true for integration layers and microservice architectures – and all the use cases around this like real time monitoring or customer 360.

Some other use cases cannot be done easily with other technologies because others don’t provide the combination of messaging + storage + processing in one single platform in a scalable, reliable and fault tolerant way – which is e.g. required to build a connected car infrastructure or sensor processing and analytics at scale in real time.

In the early era of Apache Kafka, many companies just used it for data ingestion into Hadoop or another data lake. The significant difference today – and this is what I would define as innovative – is that companies today use Apache Kafka as an Event Streaming Platform to build mission-critical infrastructures and core operations platforms.

To be fair, Kafka is not the best solution for every problem. If you need point-to-point messaging, use something like RabbitMQ or IBM MQ. If you need to transfer large files, evaluate the market for MFT (Managed File Transfer) products. And… If you need to manage and monetize APIs, then evaluate API Management solutions.

Kafka’s Ecosystem to Build Mission-Critical and Scalable Platforms and Real-Time Applications

Apache Kafka is more than just data ingestion or messaging. Apache Kafka (which includes Kafka Connect and Kafka Streams) and its open ecosystem (Schema Registry, ksqlDB, etc.) established a complete event streaming platform for many innovative use cases.

Here are some examples:

Event Streaming Platform with Apache Kafka - Value per Use Case

An interesting trend can be seen here: More and more Kafka deployments are mission-critical focusing on business transactions. These deployments cannot be down for an hour because the company behind it would be in huge trouble then.

Many more use cases from companies in almost all existing verticals and industries can be found at the Kafka Summit website. Videos and slides from all past talks are available for free. This includes success stories from tech giants, traditional companies and cutting edge startups.

Why Event Streaming with Apache Kafka?

Kafka has a few unique characteristics:

  • Combination of messaging, storage, integration and processing of data
  • Event-based architecture for real time processing, supporting modern design patterns like Event Sourcing and CQRS
  • Built for high availability, high throughput and cloud-native DevOps and CI/CD integration
  • Open source with a huge community and ecosystem

For these and other reasons, Kafka became the de facto standard for Microservice architectures and many other application infrastructures. Many of these use cases cannot be built with traditional middleware due to various limitations of scalability, non-flexible architectures or simply too high cost for building a highly available deployment.

So, what is the relation between event streaming with Kafka and API Management? Let’s explore this in the next section.

What is an API and its Relation to Event Streaming?

Event Streaming is changing from ground up how applications are built. More scalable, more reliable, decoupled, real time. In many new innovative use cases, there is no way around using event streaming instead of web services and traditional APIs.

This brings up several questions. Why do we still need to create and manage APIs? Does it make sense to put an API on top of streaming data? What technology and interface should this API use?

Let’s cover the basics first…

API (Application Programming Interface)

An API (application programming interface) is a computing interface which defines interactions between multiple software intermediaries. It defines the kinds of calls or requests that can be made, how to make them, the data formats that should be used, the conventions to follow, etc.

API Technologies

From a technical perspective, most people and products mean  REST (HTTP) or SOAP (XML) web services when talking about APIs. Most API Gateway and API Management tools just support these technologies.

These two technologies are established in most companies for many years and are very mature. Some people prefer the one, some the other. Some people don’t like either one but have to use them because REST and SOAP web services are the de facto standard in enterprises today.

In fact, many other API technologies are available. Many of these other APIs do not use synchronous request-response patterns, but asynchronous communication.

Examples: WebSocket, MQTT, Server-Sent Events (SSE), or the Kafka protocol (the underlying wire protocol implemented in Kafka). So why are more and more technologies emerging?

Synchronous Request-Response vs. Asynchronous Event Streaming

Two very different communication paradigms exist: Request-response and event streaming.

Request-Response communication has the following characteristics:

  • Low latency
  • Typically synchronous
  • Point-to-point
  • “Bespoke API”
  • e.g. HTTP, SOAP, gRPC

Event streams are based on these concepts:

  • Messaging / Pub Sub (sending data from A to B and C)
  • Continuous data processing (filtering, transformations, aggregations, business logic)
  • Often asynchronous
  • Event-driven, supporting patterns like Event Sourcing and CQRS
  • General-purpose events
  • e.g. Apache Kafka

Both approaches have their trade-offs. Most architectures need request-response and event streams! Read the great article from Gregor Hohpe (author of the famous Enterprise Integration Patterns) from 2004: “Starbucks Does Not Use Two-Phase Commit“. This article explains very well why both synchronous and asynchronous communication make sense (together).

REST and SOAP Web Services typically use synchronous communication. This is not the full story, you could e.g. also use JMS-based SOAP communication, but the reality in most cases is synchronous request-response. Event streaming is asynchronous, but you can implement request-reply patterns, too.

Event Streaming instead of REST / SOAP Web Services

So what are the most important reasons why event streaming with technologies like Apache Kafka is often used for new projects instead of REST / SOAP web services?

REST / SOAP web services do not provide characteristics to build a scalable, reliable real time infrastructure for a high throughput of events. Period!

The other big advantage of Kafka is that it decouples microservices from each other. The storage of Kafka and the asynchronous (i.e. decoupled) communication keep the microservices independent of each other. Microservice A does not need to know Microservice B, but they can still communicate with each other, even if one of them is down while the other one is producing data. There can still be a contract (a term used a lot in API Management) between the producers and consumers, for instance using the Confluent Schema Registry.

One thing to point out here is that most API Management solutions and API Gateways today don’t support Event Streams but only Web Service APIs, unfortunately.

But let’s go one step back first and understand what API Management actually is.

What is API Management?

API management is the process of creating and publishing web application programming interfaces (APIs), enforcing their usage policies, controlling access, nurturing the subscriber community, collecting and analyzing usage statistics, and reporting on performance. API Management components provide mechanisms and tools to support the developer and subscriber community.

Gartner’s Magic Quadrant 2019 for Full Life Cycle API Management shows the various vendors in this market:

Gartner Magic Quadrant 2020 for Full Life Cycle API Management

Use Cases for API Management

API Management can be used for different scenarios:

  • Open API: Developer portal and API Gateway
  • Partner Gateway: Access control for well-known external parties
  • Mobile App Gateway: Access control for apps deployed externally
  • Cloud Integration Gateway: Governance and mediation control for SaaS
  • Internal Governance: Manage, monetize and bill internal services and applications

Various different API business models are possible as John Musser explained very well in 2013 already:

API Business Models

What changed since 2013? Not that much! The main idea is the same: APIs are provided for the public, external partners or internal teams. However, technically speaking, more and more of these interfaces need to use a technology for real-time streaming data at scale. REST APIs are not ideal or sometimes not even possible at all due to their limitations regarding scalability.

No matter if the API Management solution supports just REST / SOAP web services or modern streaming technologies, the API development workflow looks like this:

API Development Workflow

While API Management solutions vary, components that provide the following functionalities are typically found in products:

API Gateway

A server that acts as an API front-end, receives API requests, enforces throttling and security policies, passes requests to the back-end service and then passes the response back to the requester. A gateway often includes a transformation engine to orchestrate and modify the requests and responses on the fly. A gateway can also provide functionality such as collecting analytics data and providing caching. The gateway can provide functionality to support authentication, authorization, security, audit and regulatory compliance.

API Life Cycle Management and Publishing Tools

A collection of tools that API providers use to define APIs, for instance using the OpenAPI or RAML specifications, generate API documentation, manage access and usage policies for APIs, test and debug the execution of API, including security testing and automated generation of tests and test suites, deploy APIs into production, staging, and quality assurance environments, and coordinate the overall API lifecycle.

Developer Portal / API Store

A community site, typically branded by an API provider, that encapsulates information and functionality for API users in a single convenient source, including documentation, tutorials, sample code, software development kits, an interactive API console and sandbox to trial APIs, the ability to subscribe to the APIs and manage subscription keys such as an OAuth2 Client ID and Client Secret, and support from the API provider and the user community.

Reporting and Analytics

Functionality to monitor API usage and load (overall hits, completed transactions, number of data objects returned, amount of compute time and other internal resources consumed, volume of data transferred). This can include real-time monitoring of the API with alerts being raised directly or via a higher-level network management system, for instance, if the load on an API has become too great, as well as functionality to analyze historical data, such as transaction logs, to detect usage trends. Functionality can also be provided to create synthetic transactions that can be used to test the performance and behavior of API endpoints. The information gathered by the reporting and analytics functionality can be used by the API provider to optimize the API offering within an organization’s overall continuous improvement process and for defining software Service-Level Agreements for APIs.

Monetization and Billing

Functionality to support charging for access to commercial APIs. This functionality can include support for setting up pricing rules, based on usage, load and functionality, issuing invoices and collecting payments including multiple types of credit card payments.

As you can see: An API Management solution has some exciting features to build and operate APIs! So what is the relation to Kafka? As discussed earlier, many innovative use cases require a scalable, reliable event streaming platform. That’s what Kafka is.

Kafka and API Management – Friends, Enemies or Frenemies?

To be very clear:

  • Apache Kafka does not provide out-of-the-box capabilities of an API Management solution.
  • API Management solutions do not provide event streaming capabilities to continuously send, process, store and handle millions of events in real time (aka stream processing / streaming analytics).

Therefore, the combination of Kafka and API Management solution makes a lot of sense in many scenarios. It is NOT a competitive situation (like many people think – or are “taught” by some vendors).

Unique API Management Features

Some of the unique features of API Management products are:

  • API Developer Portal and Publishing Tools
  • API Life Cycle Management
  • Billing and Monetization

These components can be provided as standalone services or products (e.g. from a cloud provider) or within a complete platform (like Mulesoft Anypoint Platform).

Domain-Driven Design (DDD), Decoupling and Anti-Patterns

Some features from API Management tools overlap with other solutions. You should question if API Management is the right spot for doing this. This is not a ‘yes or no’ discussion. But I think in many cases, the API Management solution should not be used for tasks where other platforms provide the better capabilities regarding scalability, tooling, reliability, performance, and other characteristics.

A clear separation of concerns is important for a simple and flexible enterprise architecture. Don’t couple things too tightly. This was a key issue of ESB deployments in the past. Don’t make the same mistake with API Management. It is not a surprise and should be a warning that several vendors even built their API Management product on top of their ESB to couple things together.

Martin Fowler taught us several years ago “not to recreate ESB Anti Patterns with Kafka“. Keep this in mind for your API strategy, too! My article “Microservices, Apache Kafka, and Domain-Driven Design” should also help you understand how important the separation of concerns and decoupling is for your enterprise architecture. This is true for Kafka, APIs and other business applications.

Overlapping Features between Kafka and API Management

Kafka provides a messaging and storage solution for event-based processing as its core. In addition, Kafka Connect (for integration) and Kafka Streams (for stream processing) are part of the open source project.

API Management exists for completely different use cases as discussed in detail in the above section: To create, publish, manage and monetize APIs.

Nevertheless, some overlapping features exist between Kafka and API Gateways and API Management solutions. Here are some examples:

  • Protocol conversion: One consumer or client requires JSON while the other one can only process Avro, Protobuf or XML.
  • ETL (Extract Transform Load): Transformations, filtering, sorting and similar tasks.
  • Connectivity: Integration with back-end systems like databases, data warehouses, data lakes, messaging systems, business applications.
  • API Gateway: Routing, public endpoints, single entry point, access control, encryption, throttling, etc. are common features. This can either be configured / implemented by a dedicated API Gateway (like Amazon API Gateway) or with a Kafka-based platform (like Confluent Platform providing features such as RBAC, Rest Proxy, etc).

Who should solve these overlapping tasks? The Event Streaming Platform or the API Management solution? Well, each vendor will tell you that they can do it the best way. Think about your architecture and requirements. What makes most sense? As so often: It depends!

If you want to build a scalable, reliable integration pipeline, Kafka is probably the better choice. If you need to provide a flexible Gateway interface for REST web services with routing configurations, a dedicated API Gateway is probably the best choice. Try to keep the architecture as simple as possible.

Let’s now take a look at an architecture to understand how Kafka and API Management solutions play together very well.

Microservices, API Management (Mulesoft Anypoint) and Event Streaming (Kafka)

The following example shows a microservices architecture leveraging Event Streaming and API Management. It uses a combination of Confluent Platform for the event-based nervous system and Mulesoft Anypoint Platform for API Management and integration with some legacy applications:

Kafka and API Management in the Enterprise Architecture with Confluent and Mulesoft

There are different options to combine Kafka and Event Streaming with API Management solutions:

  • Event Streaming is used to process data continuously at scale in real time
  • Event Streaming is used to directly integrate with various data sources and data sinks (databases, messaging systems, business applications, etc.)
  • The heart of many companies is Event Streaming, gluing together streaming applications with batch, request-response and other platforms.
  • API Management is used to provide an API interface (including lifecycle management, monetization, etc.) on top of Kafka applications, e.g. using services via Confluent REST Proxy, the REST API of Confluent Cloud to provision a new Kafka cluster, or the REST API running on top of a custom Kafka Streams / ksqlDB application or microservice using Interactive Queries (see the sketch after this list).
  • Kafka is used as backend infrastructure. A proxy or business application is used in between Kafka and business applications. API Management is not directly used with Kafka interfaces, but one layer higher on top of the applications which use Kafka under the hood.
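To make the Interactive Queries option more tangible, here is a minimal, single-instance sketch of a Kafka Streams application that exposes a state store behind a plain HTTP endpoint; an API Management solution or API Gateway would then be put in front of this REST interface. Topic, store, and endpoint names are assumptions, and a real deployment would also have to route queries to the instance that currently hosts the key:

```java
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

import com.sun.net.httpserver.HttpServer;

public class OrdersPerCustomerApi {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-per-customer-api"); // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Count events per key and materialize the result as a queryable state store
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> orders = builder.stream("orders"); // hypothetical topic keyed by customer id
        orders.groupByKey()
              .count(Materialized.as("orders-per-customer"));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Minimal HTTP endpoint on top of the state store (Interactive Queries);
        // queries only succeed once the application is RUNNING and owns the key locally
        HttpServer server = HttpServer.create(new InetSocketAddress(7070), 0);
        server.createContext("/orders", exchange -> {
            String customerId = exchange.getRequestURI().getQuery(); // e.g. GET /orders?4711
            ReadOnlyKeyValueStore<String, Long> store = streams.store(
                    StoreQueryParameters.fromNameAndType("orders-per-customer",
                            QueryableStoreTypes.keyValueStore()));
            Long count = customerId == null ? null : store.get(customerId);
            byte[] body = String.valueOf(count == null ? 0 : count).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
    }
}
```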

Most enterprise architectures require event streaming, request-response and API management. I hope if you read this far in this blog post, you agree and now understand why Apache Kafka and API Management platforms are complementary, not competitive.

But it is also clear that event streaming and today’s API Management tools don’t fit together perfectly because in many cases it does not make sense to put a REST or SOAP API on top of event streaming data.

The Missing Killer Feature: Native Kafka Integration in API Management and API Gateway

The last section explored options how Kafka and API Management work together very well.

In an ideal world, an API could be put directly on top of the Kafka protocol. In the real world, almost all API Management products today only support REST / SOAP web services. This means you (have to) build a web service on top of event streaming to provide the API Management capabilities.

Envoy proxy, one of the established proxies for building a Service Mesh, actually supports the Kafka protocol natively. It works on the TCP level; there is no need to use HTTP REST APIs. This is huge from a scalability and performance perspective. HTTP / synchronous request-response is an anti-pattern for streaming data and will not work if large scale is required for the streaming application. Check out “Service Mesh and Cloud-Native Microservices with Apache Kafka, Kubernetes and Envoy, Istio, Linkerd” for more details on this topic.

Unfortunately, examples like Envoy’s support for the Kafka protocol are very rare today. What if you get native Kafka support in your API Management solution?

Streaming-based API Management for Cross Companies Communication

API Management using REST or SOAP web services is not appropriate for streaming data and large scale use cases. Therefore, more and more enterprises build streaming applications. How strange is it that almost all of these enterprises use the anti-pattern of providing a request-response based REST API on top of the streaming services for API Management?

Support for the Kafka protocol would be very helpful to make API Management even more complementary than it is today. Think about the huge opportunities if you could build life cycle management and monetization / billing on top of a streaming Kafka service.

A great example of a Kafka-native API is HERE Technologies, a company majority-owned by a consortium of German automotive companies (namely Audi, BMW, and Daimler) providing mapping and location services. Their real-time APIs recommend using the native Kafka interface (as all their backend services run on Kafka for the reasons discussed in this post) instead of an optional HTTP wrapper endpoint.

Cross-Company Streaming Replication

Even without proper support for event streaming in most API Management tools, I have seen many customers doing Kafka-native real time communication at scale between different business units or projects. Check out “Architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments” to understand various different options.

Here is the most exciting use case: Streaming replication between different enterprises:

Cross-Company Kafka Integration - Special Case of Hybrid Integration

Different tools enable streaming replication between business units, regions or companies:

  • MirrorMaker 1
  • MirrorMaker 2
  • Confluent Replicator
  • uReplicator (Uber)
  • Mirus (Salesforce)
  • Brooklin (LinkedIn)
  • Custom Replication

If you want to rely on a mature and battle-tested product, then Confluent Replicator is the way to go today in 2020 for real time streaming replication. MirrorMaker 1 should never be an option. MirrorMaker 2 will be a great option in some quarters, but today it is very new and probably not the best option for a mission-critical project yet. All other options are only recommended if you want to dive deep into the project.

Tools like Confluent Schema Registry provide governance for the “streaming API interface”. Technologies like Avro, Protobuf or JSON Schema are used to define the “API contract” and process large data volumes efficiently and in real time.

Event Streaming Internally and REST API to the Outside World?

A cross-company streaming architecture has one key drawback: Information security and politics are your biggest enemy! 🙂 But I have seen customers running this setup in production with a partner company. So it is doable, and even without API Management in the middle, you can leverage event streaming at scale with your partner. Think about use cases like airline ticketing, retail transactions or financial services.

Why would you build everything in real time at scale internally, but only provide a non-scalable synchronous HTTP interface to the outside world? And your external partners are asking themselves exactly the same question…

API Management for event streaming would make this easier from a security and monetization / billing perspective. I hope this feature will be implemented soon by various API Management software vendors.

The Future – Streaming-based API Management for Apache Kafka?

Most architectures require request-response based communication (typically REST) and event streaming (typically Kafka). API Management helps make applications accessible, no matter if the heart of the infrastructure is event-based or point-to-point communication.

I think (and hope) the future will provide streaming-based API Management solutions for Apache Kafka. Envoy’s support for the Kafka protocol is a first example. A few other frameworks also provide some “first hacks” already.

I hope this blog post helped you understanding the relation between Event Streaming with Apache Kafka and API Management solutions such as Kong or Mulesoft Anypoint Platform. They are complementary, not competitive.

How do you think about API Management in conjunction with event streaming and Apache Kafka? What is your strategy? Let’s connect on LinkedIn and discuss! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? appeared first on Kai Waehner.

]]>
Mainframe Integration, Offloading and Replacement with Apache Kafka https://www.kai-waehner.de/blog/2020/04/24/mainframe-offloading-replacement-apache-kafka-connect-ibm-db2-mq-cdc-cobol/ Fri, 24 Apr 2020 14:55:10 +0000 https://www.kai-waehner.de/?p=2232 Time to get more innovative; even with the mainframe! This blog post covers the steps I have seen…

The post Mainframe Integration, Offloading and Replacement with Apache Kafka appeared first on Kai Waehner.

]]>
Time to get more innovative; even with the mainframe! This blog post covers the steps I have seen in projects where enterprises started offloading data from the mainframe to Apache Kafka with the final goal of replacing the old legacy systems.

“Mainframes are still hard at work, processing over 70 percent of the world’s most important computing transactions every day. Organizations like banks, credit card companies, medical facilities, stock brokerages, and others that can absolutely not afford downtime and errors depend on the mainframe to get the job done. Nearly three-quarters of all Fortune 500 companies still turn to the mainframe to get the critical processing work completed” (BMC).

Mainframe Offloading and Replacement with Apache Kafka and CDC

Cost, monolithic architectures and missing experts are the key challenges for mainframe applications. Mainframes are used in several industries, but I want to start this post by thinking about the current situation at banks in Germany; to understand the motivation why so many enterprises ask for help with mainframe offloading and replacement…

Finance Industry in 2020 in Germany

Germany is where I live, so I get the most news from the local newspapers. But the situation is very similar in other countries across the world…

Here is the current situation in Germany.

Challenges for Traditional Banks

  • Traditional banks are in trouble. More and more branches are getting closed every year.
  • Financial numbers are pretty bad. Market capitalization went down significantly (before Corona!). Deutsche Bank and Commerzbank – former flagships of the German finance industry – are in the news every week. Usually for bad news.
  • Commerzbank had to leave the DAX in 2019 (the blue chip stock market index consisting of the 30 major German companies trading on the Frankfurt Stock Exchange); replaced by Wirecard, a modern global internet technology and financial services provider offering its customers electronic payment transaction services and risk management. UPDATE Q3 2020: Wirecard is now insolvent due to the Wirecard scandal. A really horrible scandal for the German financial market and government.
  • Traditional banks talk a lot about creating a “cutting edge” blockchain consortium and distributed ledger to improve their processes in the future and to be “innovative”. For example, some banks plan to improve the ‘Know Your Customer’ (KYC) process with a joint distributed ledger to reduce costs. Unfortunately, this is years away from being implemented, and it is not clear whether it makes sense and adds real business value. Blockchain has still not proven its added value in most scenarios.

Neobanks and FinTechs Emerging

  • Neobanks like Revolut or N26 provide an innovative mobile app and great customer experience, including a great KYC process and experience (without the need for a blockchain). In my personal experience, international bank transactions typically take less than 24 hours. In contrast, the mobile app and website of my private German bank are a real pain in the ass. And it feels like getting worse instead of better. To be fair: Neobanks do not shine with customer service if there are problems (like fraud; they sometimes simply freeze your account and do not provide good support).
  • International Fintechs are also arriving in Germany to change the game. Paypal is already the norm everywhere. German initiatives for a “German Paypal alternative” failed. Local retail stores already accept Apple Pay (the local banks are “forced” to support it in the meantime). Even the Chinese service Alipay is supported by more and more shops.

Traditional banks should be concerned. The same is true for other industries, of course. However, as said above, many enterprises still rely heavily on the 50+ year old mainframe while competing with innovative competitors. It feels like these companies relying on mainframes have a harder job to change than let’s say airlines or the retail industry.

Digital Transformation with(out) the Mainframe

I had a meeting with an insurance company a few months ago. The team presented me with an internal paper quoting their CEO: “This has to be the last 20M mainframe contract with IBM! In five years, I expect to not renew this.” That’s what I call ‘clear expectations and goals’ for the IT team… 🙂

Why does Everybody Want to Get Rid of the Mainframe?

Many large, established companies, especially those in the financial services and insurance space still rely on mainframes for their most critical applications and data. Along with reliability, mainframes come with high operational costs, since they are traditionally charged by MIPS (million instructions per second). Reducing MIPS results in lowering operational expense, sometimes dramatically.

Many of these same companies are currently undergoing architecture modernization, including cloud migration, moving from monolithic applications to microservices, and embracing open systems.

Modernization with the Mainframe and Modern Technologies

This modernization doesn’t easily embrace the mainframe, but would benefit from being able to access this data. Kafka can be used to keep a more modern data store in real-time sync with the mainframe, while at the same time persisting the event data on the bus to enable microservices, and deliver the data to other systems such as data warehouses and search indexes.
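To make this pattern a bit more concrete, here is a minimal Java sketch of a consumer that keeps a modern data store in sync with offloaded mainframe records. The topic name, the business key and the in-memory map standing in for the target store are all hypothetical; a real pipeline would more likely combine a CDC source connector with a sink connector (e.g. for a database or search index) instead of hand-written code.

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

public class MainframeSyncConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "modern-store-sync"); // independent consumer group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // Stand-in for the "modern data store" (database, search index, cache, ...)
        Map<String, String> modernStore = new ConcurrentHashMap<>();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("mainframe.customer")); // hypothetical offloading topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Upsert the latest mainframe state by business key
                    modernStore.put(record.key(), record.value());
                }
            }
        }
    }
}
```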

This will not only reduce operational expenses but also provide a path to architecture modernization and agility that isn’t available from the mainframe alone.

As a final step, and as the ultimate vision of most enterprises, the mainframe might be replaced by new applications using modern technologies.

Kafka in Financial Services and Insurance Companies

I recently wrote about how payment applications can be improved leveraging Kafka and Machine Learning for real time scoring at scale. Here are a few concrete examples from banking and insurance companies leveraging Apache Kafka and its ecosystem for innovation, flexibility and reduced costs:

  • Capital One: Becoming truly event driven – offering a service other parts of the bank can use.
  • ING: Significantly improved customer experience – as a differentiator + Fraud detection and cost savings.
  • Freeyou: Real time risk and claim management in their auto insurance applications.
  • Generali: Connecting legacy databases and the modern world with a modern integration architecture based on event streaming.
  • Nordea: Able to meet strict regulatory requirements around real-time reporting + cost savings.
  • Paypal: Processing 400+ Billion events per day for user behavioral tracking, merchant monitoring, risk & compliance, fraud detection, and other use cases.
  • Royal Bank of Canada (RBC): Mainframe off-load, better user experience and fraud detection – brought many parts of the bank together.

The last example brings me back to the topic of this blog post: Companies can save millions of dollars “just” by offloading data from their mainframes to Kafka for further consumption and processing. Mainframe replacement might be the long term goal; but just offloading data is a huge $$$ win.

Domain-Driven Design (DDD) for Your Integration Layer

So why use Kafka for mainframe offloading and replacement? There are hundreds of middleware tools available on the market for mainframe integration and migration.

Well, in addition to being open and scalable (I won’t cover all the characteristics of Kafka in this post), one key characteristic is that Kafka enables decoupling applications better than any other messaging or integration middleware.

Legacy Migration and Cloud Journey

Legacy migration is a journey. Mainframes cannot be replaced in a single project. A big bang will fail. This has to be planned long-term. The implementation happens step by step.

Almost every company on this planet has a cloud strategy in the meantime. The cloud has many benefits, like scalability, elasticity, innovation, and more. Of course, there are trade-offs: Cloud is not always cheaper. New concepts need to be learned. Security is very different (I am NOT saying worse or better, just different). And hybrid will be the norm for most enterprises. Only enterprises younger than 10 years are cloud-only.

Therefore, I use the term ‘cloud’ in the following. But the same phases would exist if you “just” want to move away from mainframes but stay in your own on premises data centers.

On a very high level, mainframe migration could contain three phases:

  • Phase 1 – Cloud Adoption: Replicate data to the cloud for further analytics and processing. Mainframe offloading is part of this.
  • Phase 2 – Hybrid Cloud: Bidirectional replication of data in real time between the cloud and application in the data center, including mainframe applications.
  • Phase 3 – Cloud-First Development: All new applications are built in the cloud. Even the core banking system (or at least parts of it) runs in the cloud (or on modern infrastructure in the data center).

Journey from Legacy to Hybrid and Cloud Infrastructure

How do we get there? As I said, a big bang will fail. Let’s take a look at a very common approach I have seen at various customers…

Status Quo: Mainframe Limitations and $$$

The following is the status quo: Applications are consuming data from the mainframe; creating expensive MIPS:

Direct Legacy Mainframe Communication to App

The mainframe is running and core business logic is deployed. It works well (24/7, mission-critical). But it is really tough or even impossible to make changes or add new features. Scalability often becomes an issue, too. And well, let’s not talk about cost. Mainframes cost millions.

Mainframe Offloading

Mainframe offloading means data is replicated to Kafka for further analytics and processing by other applications. Data continues to be written to the mainframe via the existing legacy applications:

Mainframe Offloading

Offloading data is the easy part as you do not have to change the code on the mainframe. Therefore, the first project is often just about reading data.

Writing to the mainframe is much harder because the business logic on the mainframe must be understood before making changes. Therefore, many projects avoid this huge (and sometimes unsolvable) challenge by keeping the writes as they are… This is not ideal, of course. You cannot add new features or changes to the existing application.

People are not able or do not want to take the risk of changing it. There are no DevOps or CI/CD pipelines and no A/B testing infrastructure behind your mainframe, dear university graduates! 🙂

Mainframe Replacement

Everybody wants to replace mainframes due to their technical limitations and cost. But enterprises cannot simply shut down the mainframe because of all the mission-critical business applications.

Therefore, COBOL, the main programming language for the mainframe, is still widely used in applications for large-scale batch and transaction processing jobs. Read the blog post ‘Brush up your COBOL: Why is a 60 year old language suddenly in demand?‘ for a nice ‘year 2020 introduction to COBOL’.

Enterprises have the following options:

Option 1: Continue to develop actively on the mainframe

Train or hire more (expensive) COBOL developers to extend and maintain your legacy applications. Well, this is exactly what you want to move away from. If you forgot why: cost, cost, cost. And technical limitations. I find it amazing how mainframes even support “modern technologies” such as web services. Mainframes do not stand still. Having said this, even if you use more modern technologies and standards, you are still running on the mainframe with all its drawbacks.

Option 2: Replace the COBOL code with a modern application using a migration and code generation tool

Various vendors provide tools which take COBOL code and automatically migrate it to generated source code in a modern programming language (typically Java). The new source code can be refactored and changed (as Java uses static typing, errors show up in the IDE and changes can be applied to all related dependencies in the code structure). That is the theory. The more complex your COBOL code is, the harder it gets to migrate and the more custom coding the migration requires. This results in the same problem the COBOL developer had on the mainframe: if you don’t understand the code routines, you cannot change them without risking errors, data loss or downtime in the new version of the application.

Option 3: Develop a new future-ready application with modern technologies to provide an open, flexible and scalable architecture

This is the ideal approach. Keep the old legacy mainframe running as long as needed. A migration is not a big bang. Replace use cases and functionality step-by-step. The new applications will have very different feature requirements anyway. Therefore, take a look at the Fintech and Insurtech companies. Starting green field has many advantages. The big drawback is high development and migration costs.

In reality, a mix of these three options can also happen. No matter what works for you, let’s now think how Apache Kafka fits into this story.

Domain-Driven Design for your Integration Layer

Kafka is not just a messaging system. It is an event streaming platform that provides storage capabilities. In contrast to MQ systems or web service / API based architectures, Kafka really decouples producers and consumers:

Domain-Driven Design for the Core Banking Integration Layer with Apache Kafka

With Kafka and a domain-driven design, every application / service / microservice / database / “you-name-it” is completely independent and loosely coupled from the others, yet scalable, highly available and reliable! The blog post “Apache Kafka, Microservices and Domain-Driven Design (DDD)” goes much deeper into this topic.
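As a small illustration of this decoupling, consider a producer that publishes domain events without any knowledge of its consumers. The topic name and payload below are made up for this sketch; the point is that fraud detection, reporting or a brand-new microservice can each attach later with their own consumer group and read the retained history independently.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class PaymentEventProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The producer only knows the topic ("payments"), not who consumes it.
            // Any number of services can subscribe later with their own consumer group
            // and process the retained events independently and at their own pace.
            producer.send(new ProducerRecord<>("payments", "account-4711",
                "{\"amount\": 42.00, \"currency\": \"EUR\"}"));
            producer.flush();
        }
    }
}
```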

Sberbank, the biggest bank in Russia, is a great example for their domain-driven approach: Sberbank built their new core banking system on top of the Apache Kafka ecosystem. This infrastructure is open and scalable, but also ready for mission-critical payment use cases like instant-payment or fraud detection, but also for innovative applications like Chat Bots in customer service or Robo-advising for trading markets.

Why is this decoupling so important? Because mainframe integration, migration and replacement is a (long!) journey.

Event Streaming Platform and Legacy Middleware

An Event Streaming Platform gives applications independence. No matter if an application uses a new modern technology or legacy, proprietary, monolithic components. Apache Kafka provides the freedom to tap into and manage shared data, no matter if the interface is real time messaging, a synchronous REST / SOAP web service, a file based system, a SQL or NoSQL database, a Data Warehouse, a data lake or anything else:

Integration between Kafka and Legacy Middleware

Apache Kafka as Integration Middleware

The Apache Kafka ecosystem is a highly scalable, reliable infrastructure and allows high throughput in real time. Kafka Connect provides integration with any modern or legacy system, be it Mainframe, IBM MQ, Oracle Database, CSV Files, Hadoop, Spark, Flink, TensorFlow, or anything else. More details here:

Kafka Ecosystem for Security and Data Governance

The Apache Kafka ecosystem provides additional capabilities. This is important as Kafka is not just used as messaging or ingestion layer, but a platform for your most mission-critical use cases. Remember the examples from banking and insurance industry I showed in the beginning of this blog post. These are just a few of many more mission-critical Kafka deployments across the world. Check past Kafka Summit video recordings and slides for many more mission-critical use cases and architectures.

I will not go into detail in this blog post, but the Apache Kafka ecosystem provides features for security and data governance like Schema Registry (+ Schema Evolution), Role Based Access Control (RBAC), Encryption (on message or field level), Audit Logs, Data Flow analytics tools, etc…
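For a rough idea of what this looks like from the client side, here is a hedged configuration sketch for a producer talking to a secured cluster with Schema Registry. Endpoints, credentials and file paths are placeholders, and the Avro serializer assumes the Confluent serializer dependency is on the classpath; RBAC and audit logging are configured on the platform side, not in this snippet.

```java
import org.apache.kafka.clients.producer.ProducerConfig;

import java.util.Properties;

public class SecureClientConfig {

    // Sketch of client-side settings for an encrypted, authenticated cluster with Schema Registry.
    // All endpoints, user names and secrets below are placeholders.
    public static Properties build() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9093");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"svc-payments\" password=\"<secret>\";");

        // Avro serialization with Schema Registry (assumes the Confluent serializer on the classpath)
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
            "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "https://schema-registry:8081");
        return props;
    }
}
```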

Mainframe Offloading and Replacement in the Next 5 Years

With all the theory in mind, let’s now take a look at a practical example. The following is a journey many enterprises walk through these days. Some enterprises are just in the beginning of this journey, while others already saved millions by offloading or even replacing the mainframe.

I will walk you through a realistic approach which takes ~5 years in this example (but this can easily take 10+ years depending on the complexity of your deployments and organization). What matters is quick wins and a successful step-by-step approach, no matter how long it takes.

Year 0: Direct Communication between App and Mainframe

Let’s get started. Here is again the current situation: Applications directly communicate with the mainframe.

Direct Legacy Mainframe Communication to App

Let’s reduce cost and dependencies between the mainframe and other applications.

Year 1: Kafka for Decoupling between Mainframe and App

Offloading data from the mainframe enables existing applications to consume the data from Kafka instead of creating high load and cost for direct mainframe access:

Kafka for Decoupling between Mainframe and App

This saves a lot of MIPS (million instructions per second). MIPS is a way to measure the cost of computing on mainframes. Less MIPS, less cost.

After offloading the data from the mainframe, new applications can also consume the mainframe data from Kafka. No additional MIPS cost! This allows building new applications on the mainframe data. Gone are the days when developers could not build innovative new applications because MIPS cost was the main blocker to accessing the mainframe data.

Kafka can store the data as long as it needs to be stored. This might be just 60 minutes for one Kafka Topic for log analytics or 10 years for customer interactions for another Kafka Topic. With Tiered Storage for Kafka, you can even reduce costs and increase elasticity by separating processing from storage with a back-end storage like AWS S3. This discussion is out of scope of this article. Check out “Is Apache Kafka a Database?” for more details.
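Retention is configured per topic. The following sketch uses the Kafka AdminClient to create two topics with very different retention settings, matching the examples above; topic names, partition and replication counts are placeholders.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class RetentionSetup {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Short-lived topic: keep log analytics events for 60 minutes only
            NewTopic logs = new NewTopic("mainframe.log-analytics", 6, (short) 3)
                .configs(Map.of("retention.ms", String.valueOf(60L * 60 * 1000)));

            // Long-lived topic: keep customer interactions for ~10 years
            NewTopic interactions = new NewTopic("mainframe.customer-interactions", 6, (short) 3)
                .configs(Map.of("retention.ms", String.valueOf(10L * 365 * 24 * 60 * 60 * 1000)));

            admin.createTopics(List.of(logs, interactions)).all().get();
        }
    }
}
```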

Change Data Capture (CDC) for Mainframe Offloading to Kafka

The most common option for mainframe offloading is Change Data Capture (CDC): Transaction log-based CDC pushes data changes (insert, update, delete) from the mainframe to Kafka. The advantages:

  • Real time push updates to Kafka
  • Eliminate disruptive full loads, i.e. minimize production impact
  • Reduce MIPS consumption
  • Integrate with any mainframe technology (DB2 z/OS, VSAM, IMS/DB, CICS, etc.)
  • Full support from commercial vendors

On the first connection, the CDC tool reads a consistent snapshot of all of the tables that are whitelisted. When that snapshot is complete, the connector continuously streams the changes that were committed to the DB2 database for all whitelisted tables in capture mode. This generates corresponding insert, update and delete events. All of the events for each table are recorded in a separate Kafka topic, where they can be easily consumed by applications and services.
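Downstream, a consumer simply subscribes to the per-table topic and reacts to the change events. The sketch below is deliberately simplified: the topic name is hypothetical, and the 'op' codes ('c' for insert, and so on) follow Debezium-style conventions, so the actual payload layout depends on the CDC tool you use.

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class Db2CdcEventHandler {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "cdc-event-handler");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // start with the initial snapshot
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // One topic per captured DB2 table, e.g. "db2.ACCOUNTS" (naming depends on the CDC tool)
            consumer.subscribe(List.of("db2.ACCOUNTS"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    // Simplified: a real consumer would parse the JSON/Avro change event properly
                    if (record.value() == null) {
                        System.out.println("DELETE " + record.key()); // tombstone-style delete
                    } else if (record.value().contains("\"op\":\"c\"")) {
                        System.out.println("INSERT " + record.key());
                    } else {
                        System.out.println("UPDATE " + record.key());
                    }
                }
            }
        }
    }
}
```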

The big disadvantage of CDC is high licensing costs. Therefore, other integration options exist…

Integration Options for Kafka and Mainframe

While CDC is typically the preferred choice, there are more alternatives. Here are the integration options I have seen being discussed in the field for mainframe offloading:

  1. IBM InfoSphere Data Replication (IIDR) CDC solution for Mainframe
  2. 3rd party commercial CDC solution (e.g. Attunity or HVR)
  3. Open-source CDC solution (e.g. Debezium – but you still need an IIDR license; this is the same challenge as with Oracle and GoldenGate CDC)
  4. Create interface tables + Kafka Connect + JDBC connector
  5. IBM MQ interface + Kafka Connect’s IBM MQ connector
  6. Confluent REST Proxy and HTTP(S) calls from the mainframe
  7. Kafka Clients on the mainframe

Evaluate the trade-offs and make your decision. Requiring a fully supported solution often eliminates several options quickly 🙂 In my experience, most people use CDC with IIDR or a 3rd party tool, followed by IBM MQ. Other options are more theoretical in most cases.
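To show how little code such an integration needs on the Kafka side, here is a sketch for option 4 (interface tables plus the JDBC source connector), registered via the Kafka Connect REST API. The connector name, JDBC URL, table and column names are placeholders, and the exact connector properties should be checked against the connector’s documentation.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterInterfaceTableConnector {

    public static void main(String[] args) throws Exception {
        // Connector name, URLs, credentials, table and column names are placeholders for this sketch.
        String connectorJson = """
            {
              "name": "db2-interface-tables-source",
              "config": {
                "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                "connection.url": "jdbc:db2://mainframe-host:446/LOCATION",
                "connection.user": "svc_offload",
                "connection.password": "<secret>",
                "mode": "incrementing",
                "incrementing.column.name": "EVENT_ID",
                "table.whitelist": "INTERFACE.CUSTOMER_EVENTS",
                "topic.prefix": "mainframe.",
                "poll.interval.ms": "5000"
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://connect-worker:8083/connectors")) // Kafka Connect REST API
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(connectorJson))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```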

Year 2 to 4: New Projects and Applications

Now it is time to build new applications:

New Projects and Applications

This can be agile, lightweight microservices using any technology. Or it can be external solutions like a data lake or a cloud service. Pick what you need. You are flexible. The heart of your infrastructure allows this. It is open, flexible and scalable. Welcome to a new, modern IT world! 🙂

Year 5: Mainframe Replacement

At some point, you might wonder: What about the old mainframe applications? Can we finally replace them (or at least some of them)? When the new application based on a modern technology is ready, switch over:

Mainframe Replacement

As this is a step-by-step approach, the risk is limited. First of all, the mainframe can keep running. If the new application does not work, switch back to the mainframe (which is still up to date, as you did not stop applying the updates to its database in parallel to the new application).

As soon as the new application is battle-tested and proven, the mainframe application can be shut down. The $$$ budgeted for the next mainframe renewal can be used for other innovative projects. Congratulations!

Mainframe Offloading and Replacement is a Journey… Kafka can Help!

Here is the big picture of our mainframe offloading and replacement story:

Mainframe Offloading and Replacement with Apache Kafka

Hybrid Cloud Infrastructure with Apache Kafka and Mainframe

Remember the three phases I discussed earlier: Phase 1 – Cloud Adoption, Phase 2 – Hybrid Cloud, Phase 3 – Cloud-First Development. I did not tackle this in detail in this blog post. Having said this, Kafka is an ideal candidate for this journey. Therefore, this is a perfect combination for offloading and replacing mainframes in a hybrid and multi-cloud strategy. More details here:

Slides and Video Recording for Kafka and Mainframe Integration

Here are the slides:

And the video walking you through the slides:

What are your experiences with mainframe offloading and replacement? Did you or do you plan to use Apache Kafka and its ecosystem? What is your strategy? Let’s connect on LinkedIn and discuss! Stay informed about new blog posts by subscribing to my newsletter.

The post Mainframe Integration, Offloading and Replacement with Apache Kafka appeared first on Kai Waehner.

]]>
Service Mesh and Cloud-Native Microservices with Apache Kafka, Kubernetes and Envoy, Istio, Linkerd https://www.kai-waehner.de/blog/2019/09/24/cloud-native-apache-kafka-kubernetes-envoy-istio-linkerd-service-mesh/ Tue, 24 Sep 2019 15:50:45 +0000 http://www.kai-waehner.de/blog/?p=1585 This blog post takes a look at cutting edge technologies like Apache Kafka, Kubernetes, Envoy, Linkerd and Istio to implement a cloud-native service mesh for a scalable, robust and observable microservice architecture.

The post Service Mesh and Cloud-Native Microservices with Apache Kafka, Kubernetes and Envoy, Istio, Linkerd appeared first on Kai Waehner.

]]>
Microservice architectures are not a free lunch! Microservices need to be decoupled, flexible, operationally transparent, data aware and elastic. Most material from recent years only discusses point-to-point architectures with tightly coupled and non-scalable technologies like REST / HTTP. This blog post takes a look at cutting-edge technologies like Apache Kafka, Kubernetes, Envoy, Linkerd and Istio to implement a cloud-native service mesh to solve these challenges and bring microservices to the next level of scale, speed and efficiency.

Here are the key requirements for building a scalable, reliable, robust and observable microservice architecture:

Key Requirements for Microservices

Before we go into more detail, let’s take a look at the key takeaways first:

  • Apache Kafka decouples services, including event streams and request-response
  • Kubernetes provides a cloud-native infrastructure for the Kafka ecosystem
  • Service Mesh helps with security and observability at ecosystem / organization scale
  • Envoy and Istio sit in the layer above Kafka and are orthogonal to the goals Kafka addresses

The following sections cover some more thoughts about this. The end of the blog post contains a slide deck and video recording with more detailed explanations.

Microservices, Service Mesh and Apache Kafka

Apache Kafka became the de facto standard for microservice architectures. It goes far beyond reliable and scalable high-volume messaging. The distributed storage allows high availability and real decoupling between the independent microservices. In addition, you can leverage Kafka Connect for integration and the Kafka Streams API for building lightweight stream processing microservices in autonomous teams.
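As an example of such a lightweight stream processing microservice, here is a minimal Kafka Streams sketch that reads one topic, applies a trivial filter as a stand-in for real business logic, and writes to another topic. The topic names and the filter condition are made up for illustration.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class PaymentFilterService {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "payment-filter-service"); // also the consumer group
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> payments = builder.stream("payments"); // hypothetical input topic
        payments
            .filter((key, value) -> value.contains("\"currency\":\"EUR\"")) // trivial stand-in for business logic
            .to("payments.eur");                                            // hypothetical output topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```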

A Service Mesh complements the architecture. It describes the network of microservices that make up such applications and the interactions between them. Its requirements can include discovery, load balancing, failure recovery, metrics, and monitoring. A service mesh also often has more complex operational requirements, like A/B testing, canary rollouts, rate limiting, access control, and end-to-end authentication.

I explore the problem of distributed Microservices communication and how both Apache Kafka and Service Mesh solutions address it. This blog post takes a look at some approaches for combining both to build a reliable and scalable microservice architecture with decoupled and secure microservices.

Kafka Service Mesh Domain Driven Design

Discussions and architectures include various open source technologies like Apache Kafka, Kafka Connect, Kubernetes, HAProxy, Envoy, LinkerD and Istio.

Learn more about decoupling microservices with Kafka in this related blog post about “Microservices, Apache Kafka, and Domain-Driven Design (DDD)“.

Cloud-Native Kafka with Kubernetes

Cloud-native infrastructures are scalable, flexible, agile, elastic and automated. Kubernetes became the de facto standard. Deployment of stateless services is pretty easy and straightforward. However, deploying stateful and distributed applications like Apache Kafka is much harder. A lot of manual operations are required. Kubernetes does NOT automatically solve Kafka-specific challenges like rolling upgrades, security configuration or data balancing between brokers. A Kafka Operator – implemented with Kubernetes Custom Resource Definitions (CRDs) – can help here!

The Operator pattern for Kubernetes aims to capture the key aim of a human operator who is managing a service or set of services. Human operators who look after specific applications and services have deep knowledge of how the system ought to behave, how to deploy it, and how to react if there are problems.

People who run workloads on Kubernetes often like to use automation to take care of repeatable tasks. The Operator pattern captures how you can write code to automate a task beyond what Kubernetes itself provides.

Different implementations of a Kafka Operator for Kubernetes exist: Confluent Operator, IBM’s / Red Hat’s Strimzi, and Banzai Cloud. I won’t go into more detail about the characteristics and advantages of a Kubernetes Kafka Operator here. I already explained it in detail in another blog post (and the video below also discusses this topic):

Service Mesh with Kubernetes-based Technologies like Envoy, Linkerd or Istio

Service Mesh is a microservice pattern to move visibility, reliability, and security primitives for service-to-service communication into the infrastructure layer, out of the application layer.

A great, detailed explanation of the design pattern “service mesh” can be found here, including the following diagram which shows the relation between a control plane and the microservices with proxy sidecars:

design pattern service mesh

You can find much more great content about service mesh concepts and its implementations from the creators of frameworks like Envoy or Linkerd. Check out these two links or just use Google for more information about the competing alternatives and their trade-offs.

(Potential) Features for Apache Kafka and Service Mesh

An event streaming platform like Apache Kafka and a service mesh on top of Kubernetes are cloud-native, orthogonal and complementary. Together they solve the key requirements for building a scalable, reliable, robust and observable microservice architecture:

Key Requirements for Microservices Solved with Kafka Kubernetes Envoy Istio

Companies use Kafka together with service mesh implementations like Envoy, Linkerd or Istio already today. You can easily combine them to add security, enforce rate limiting, or implement other related use cases. Banzai Cloud published one of the most interesting architectures: They use Istio for adding security to Kafka Brokers and ZooKeeper via proxies using Envoy.

However, in the meantime, the support has become even better: the pull request for Kafka support in Envoy was merged in May 2019. This means you now have native Kafka protocol support in Envoy. The discussion about the challenges and potential features of implementing a Kafka protocol filter is also very interesting and worth reading.

With native Kafka protocol support, you can do many more interesting things beyond L4 TCP filtering. Here are just some ideas (partly from above Github discussion) of what you could do with L7 Kafka protocol support in a Service Mesh:

Protocol conversion from HTTP / gRPC to Kafka

  • Tap feature to dump to a Kafka stream
  • Protocol parsing for observability (stats, logging, and trace linking with HTTP RPCs)
  • Shadow requests to a Kafka stream instead of HTTP / gRPC shadow
  • Integrate with Kafka Connect and its whole ecosystem of connectors

Proxy features

  • Dynamic Routing
  • Rate limiting at both the L4 connection and L7 message level
  • Filter, add compression, …
  • Automatic topic name conversion (e.g. for canary release or blue/green deployment)

Monitoring and Tracing

  • Request logs and stats
  • Data lineage / audit log
  • Audit log by taking request logs and enriching them with the user info.
  • Client specific metrics (Byte rate per client id / per consumer groups, versions of the client libraries, consumer lag monitoring for the entire data center)

Security

  • SSL Termination
  • Mutual TLS (mTLS)
  • Authorization

Validation of Events

  • Serialization format (JSON, Avro, Protobuf, etc.)
  • Message schema
  • Headers, attributes, etc.

That’s awesome, isn’t it?
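To put the security items above into perspective: without a service mesh, the Kafka client itself has to handle encryption and mutual authentication. The hedged sketch below shows typical client-side TLS settings; once a sidecar proxy terminates SSL / mTLS, exactly these concerns move out of the application code. Endpoints and file paths are placeholders.

```java
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SslConfigs;

import java.util.Properties;

public class TlsClientConfig {

    // Without a service mesh, the Kafka client itself handles encryption and authentication.
    // A sidecar proxy with SSL termination / mTLS moves exactly these concerns out of the application.
    public static Properties build() {
        Properties props = new Properties();
        props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9093"); // placeholder endpoint
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/etc/kafka/client.truststore.jks");
        props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "<secret>");
        // Mutual TLS: the client also presents its own certificate
        props.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG, "/etc/kafka/client.keystore.jks");
        props.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, "<secret>");
        props.put(SslConfigs.SSL_KEY_PASSWORD_CONFIG, "<secret>");
        return props;
    }
}
```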

Microservices, Kafka and Service Mesh – Slide Deck and Video Recording

Let’s take a look at my slide deck and video recording to understand the requirements, challenges and opportunities of building a Service Mesh with Apache Kafka, its ecosystem, Kubernetes and Service Mesh technologies in more detail…

Here is the slide deck:

The video recording walks you through the slide deck:

Any thoughts or feedback? Please let me know via a comment or Tweet or let’s connect on LinkedIn.

The post Service Mesh and Cloud-Native Microservices with Apache Kafka, Kubernetes and Envoy, Istio, Linkerd appeared first on Kai Waehner.

]]>