Hybrid Archives - Kai Waehner
https://www.kai-waehner.de/blog/tag/hybrid/

Modernizing OT Middleware: The Shift to Open Industrial IoT Architectures with Data Streaming
https://www.kai-waehner.de/blog/2025/03/17/modernizing-ot-middleware-the-shift-to-open-industrial-iot-architectures-with-data-streaming/
Mon, 17 Mar 2025

Legacy OT middleware is struggling to keep up with real-time, scalable, and cloud-native demands. As industries shift toward event-driven architectures, companies are replacing vendor-locked, polling-based systems with Apache Kafka, MQTT, and OPC-UA for seamless OT-IT integration. Kafka serves as the central event backbone, MQTT enables lightweight device communication, and OPC-UA ensures secure industrial data exchange. This approach enhances real-time processing, predictive analytics, and AI-driven automation, reducing costs and unlocking scalable, future-proof architectures.

Operational Technology (OT) has traditionally relied on legacy middleware to connect industrial systems, manage data flows, and integrate with enterprise IT. However, these monolithic, proprietary, and expensive middleware solutions struggle to keep up with real-time, scalable, and cloud-native architectures.

Just as mainframe offloading modernized enterprise IT, offloading and replacing legacy OT middleware is the next wave of digital transformation. Companies are shifting from vendor-locked, heavyweight OT middleware to real-time, event-driven architectures using Apache Kafka and Apache Flink—enabling cost efficiency, agility, and seamless edge-to-cloud integration.

This blog explores why and how organizations are replacing traditional OT middleware with data streaming, the benefits of this shift, and architectural patterns for hybrid and edge deployments.

Replacing OT Middleware with Data Streaming using Kafka and Flink for Cloud-Native Industrial IoT with MQTT and OPC-UA

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including architectures and customer stories for hybrid IT/OT integration scenarios.

Why Replace Legacy OT Middleware?

Industrial environments have long relied on OT middleware like OSIsoft PI, proprietary SCADA systems, and industry-specific data buses. These solutions were designed for polling-based communication, siloed data storage, and batch integration. But today’s real-time, AI-driven, and cloud-native use cases demand more.

Challenges: Proprietary, Monolithic, Expensive

  • High Costs – Licensing, maintenance, and scaling expenses grow exponentially.
  • Proprietary & Rigid – Vendor lock-in restricts flexibility and data sharing.
  • Batch & Polling-Based – Limited ability to process and act on real-time events.
  • Complex Integration – Difficult to connect with cloud and modern IT systems.
  • Limited Scalability – Not built for the massive data volumes of IoT and edge computing.

Just as PLCs are transitioning to virtual PLCs, eliminating hardware constraints and enabling software-defined industrial control, OT middleware is undergoing a similar shift. Moving from monolithic, proprietary middleware to event-driven, streaming architectures with Kafka and Flink allows organizations to scale dynamically, integrate seamlessly with IT, and process industrial data in real time—without vendor lock-in or infrastructure bottlenecks.

Data streaming is NOT a direct replacement for OT middleware, but it serves as the foundation for modernizing industrial data architectures. With Kafka and Flink, enterprises can offload or replace OT middleware to achieve real-time processing, edge-to-cloud integration, and open interoperability.

Event-driven Architecture with Data Streaming using Kafka and Flink in Industrial IoT and Manufacturing

While Kafka and Flink provide real-time, scalable, and event-driven capabilities, last-mile integration with PLCs, sensors, and industrial equipment still requires OT-specific SDKs, open interfaces, or lightweight middleware. This includes support for MQTT, OPC UA, or open-source solutions like Apache PLC4X to ensure seamless connectivity with OT systems.
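To make this last-mile integration concrete, here is a minimal sketch of a bridge that subscribes to MQTT sensor topics and forwards readings into Kafka. It is a hedged example rather than a production blueprint: broker addresses, topic names, and the payload format are illustrative assumptions, and the Python clients (paho-mqtt and confluent-kafka) stand in for whichever OT SDK or gateway is actually used.

```python
# Minimal sketch: bridge MQTT sensor messages into a Kafka topic.
# Assumptions: a local Mosquitto broker, Kafka on localhost:9092,
# illustrative topic names, JSON payloads.
import json

import paho.mqtt.client as mqtt
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def on_message(client, userdata, msg):
    # Normalize the MQTT payload and forward it to Kafka,
    # keyed by the device topic for per-device ordering.
    event = {"source_topic": msg.topic, "payload": msg.payload.decode("utf-8")}
    producer.produce("iot.sensor.raw", key=msg.topic, value=json.dumps(event))
    producer.poll(0)  # serve delivery callbacks

mqtt_client = mqtt.Client()  # paho-mqtt 1.x style client
mqtt_client.on_message = on_message
mqtt_client.connect("localhost", 1883)
mqtt_client.subscribe("factory/+/temperature")
mqtt_client.loop_forever()
```

In production, dedicated MQTT-Kafka connectors or brokers with native Kafka bridges handle scale-out, security, and buffering; the sketch only illustrates the pattern of normalizing device messages into one event stream.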

Apache Kafka: The Backbone of Real-Time OT Data Streaming

Kafka acts as the central nervous system for industrial data to ensure low-latency, scalable, and fault-tolerant event streaming between OT and IT systems.

  • Aggregates and normalizes OT data from sensors, PLCs, SCADA, and edge devices.
  • Bridges OT and IT by integrating with ERP, MES, cloud analytics, and AI/ML platforms.
  • Operates seamlessly in hybrid, multi-cloud, and edge environments, ensuring real-time data flow.
  • Works with open OT standards like MQTT and OPC UA, reducing reliance on proprietary middleware solutions.

And just to be clear: Apache Kafka and similar technologies support “IT real-time” (meaning milliseconds of latency, with occasional latency spikes). This is NOT about hard real-time in the OT world for embedded systems or safety-critical applications.

Apache Flink powers real-time analytics, complex event processing, and anomaly detection for streaming industrial data.
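As an illustration of this kind of stream processing, the following PyFlink sketch flags out-of-range temperature readings as they arrive. Topic names, the schema, and the threshold are assumptions for the example, and the Flink SQL Kafka connector jar needs to be on the classpath:

```python
# Hedged PyFlink sketch: flag temperature readings above a threshold in real time.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Source: raw sensor events from Kafka (illustrative topic and schema).
t_env.execute_sql("""
    CREATE TABLE sensor_readings (
        device_id STRING,
        temperature DOUBLE,
        ts TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'iot.sensor.raw',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'latest-offset',
        'format' = 'json'
    )
""")

# Sink: anomaly events back to Kafka for alerting and downstream consumers.
t_env.execute_sql("""
    CREATE TABLE anomalies (
        device_id STRING,
        temperature DOUBLE,
        ts TIMESTAMP(3)
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'iot.sensor.anomalies',
        'properties.bootstrap.servers' = 'localhost:9092',
        'format' = 'json'
    )
""")

# Continuous query: every reading above 90 degrees is treated as an anomaly.
t_env.execute_sql("""
    INSERT INTO anomalies
    SELECT device_id, temperature, ts
    FROM sensor_readings
    WHERE temperature > 90.0
""")
```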

Condition Monitoring and Predictive Maintenance with Data Streaming using Apache Kafka and Flink

By leveraging Kafka and Flink, enterprises can process OT and IT data only once, ensuring a real-time, unified data architecture that eliminates redundant processing across separate systems. This approach enhances operational efficiency, reduces costs, and accelerates digital transformation while still integrating seamlessly with existing industrial protocols and interfaces.

Unifying Operational (OT) and Analytical (IT) Workloads

As industries modernize, a shift-left architecture approach ensures that operational data is not just consumed for real-time operational OT workloads but is also made available for transactional and analytical IT use cases—without unnecessary duplication or transformation overhead.

The Shift-Left Architecture: Bringing Advanced Analytics Closer to Industrial IoT

In traditional architectures, OT data is first collected, processed, and stored in proprietary or siloed middleware systems before being moved later to IT systems for analysis. This delayed, multi-step process leads to inefficiencies, including:

  • High latency between data collection and actionable insights.
  • Redundant data storage and transformations, increasing complexity and cost.
  • Disjointed AI/ML pipelines, where models are trained on outdated, pre-processed data rather than real-time information.

A shift-left approach eliminates these inefficiencies by bringing analytics, AI/ML, and data science closer to the raw, real-time data streams from the OT environments.

Shift Left Architecture with Data Streaming into Data Lake Warehouse Lakehouse

Instead of waiting for batch pipelines to extract and move data for analysis, a modern architecture integrates real-time streaming with open table formats to ensure immediate usability across both operational and analytical workloads.

Open Table Format with Apache Iceberg / Delta Lake for Unified Workloads and Single Storage Layer

By integrating open table formats like Apache Iceberg and Delta Lake, organizations can:

  • Unify operational and analytical workloads to enable both real-time data streaming and batch analytics in a single architecture.
  • Eliminate data silos, ensuring that OT and IT teams access the same high-quality, time-series data without duplication.
  • Ensure schema evolution and ACID transactions to enable robust and flexible long-term data storage and retrieval.
  • Enable real-time and historical analytics, allowing engineers, business users, and AI/ML models to query both fresh and historical data efficiently.
  • Reduce the need for complex ETL pipelines, as data is written once and made available for multiple workloads simultaneously. This also avoids the anti-pattern of Reverse ETL.
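Here is a hedged sketch of the write-once idea using Flink SQL with Apache Iceberg: the processed stream is inserted into a single Iceberg table that both streaming and batch consumers can query. Catalog type, warehouse path, and table names are illustrative assumptions, and the sketch presumes a Kafka-backed source table (like the sensor_readings table from the previous example) is already registered:

```python
# Hedged sketch: write the processed stream once into an Apache Iceberg table.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Iceberg catalog backed by a file-system warehouse (object storage in practice).
t_env.execute_sql("""
    CREATE CATALOG lakehouse WITH (
        'type' = 'iceberg',
        'catalog-type' = 'hadoop',
        'warehouse' = 'file:///tmp/iceberg-warehouse'
    )
""")

t_env.execute_sql("CREATE DATABASE IF NOT EXISTS lakehouse.ot")

t_env.execute_sql("""
    CREATE TABLE IF NOT EXISTS lakehouse.ot.sensor_history (
        device_id STRING,
        temperature DOUBLE,
        ts TIMESTAMP(3)
    )
""")

# Assumes a Kafka-backed table `sensor_readings` is registered in the session.
# The stream is persisted once; real-time and analytical workloads query the
# same Iceberg table instead of maintaining separate copies.
t_env.execute_sql("""
    INSERT INTO lakehouse.ot.sensor_history
    SELECT device_id, temperature, ts FROM sensor_readings
""")
```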

The Result: An Open, Cloud-Native, Future-Proof Data Historian for Industrial IoT

This open, hybrid OT/IT architecture allows organizations to maintain real-time industrial automation and monitoring with Kafka and Flink, while ensuring structured, queryable, and analytics-ready data with Iceberg or Delta Lake. The shift-left approach ensures that data streams remain useful beyond their initial OT function, powering AI-driven automation, predictive maintenance, and business intelligence in near real-time rather than relying on outdated and inconsistent batch processes.

Open and Cloud Native Data Historian in Industrial IoT and Manufacturing with Data Streaming using Apache Kafka and Flink

By adopting this unified, streaming-first architecture to build an open and cloud-native data historian, organizations can:

  • Process data once and make it available for both real-time decisions and long-term analytics.
  • Reduce costs and complexity by eliminating unnecessary data duplication and movement.
  • Improve AI/ML effectiveness by feeding models with real-time, high-fidelity OT data.
  • Ensure compliance and historical traceability without compromising real-time performance.

This approach future-proofs industrial data infrastructures, allowing enterprises to seamlessly integrate IT and OT, while supporting cloud, edge, and hybrid environments for maximum scalability and resilience.

Key Benefits of Offloading OT Middleware to Data Streaming

  • Lower Costs – Reduce licensing fees and maintenance overhead.
  • Real-Time Insights – No more waiting for batch updates; analyze events as they happen.
  • One Unified Data Pipeline – Process data once and make it available for both OT and IT use cases.
  • Edge and Hybrid Cloud Flexibility – Run analytics at the edge, on-premise, or in the cloud.
  • Open Standards & Interoperability – Support MQTT, OPC UA, REST/HTTP, Kafka, and Flink, avoiding vendor lock-in.
  • Scalability & Reliability – Handle massive sensor and machine data streams continuously without performance degradation.

A Step-by-Step Approach: Offloading vs. Replacing OT Middleware with Data Streaming

Companies transitioning from legacy OT middleware can choose between several strategies, leveraging data streaming as an integration and migration platform:

  1. Hybrid Data Processing
  2. Lift-and-Shift
  3. Full OT Middleware Replacement

1. Hybrid Data Streaming: Process Once for OT and IT

Why?

Traditional OT architectures often duplicate data processing across multiple siloed systems, leading to higher costs, slower insights, and operational inefficiencies. Many enterprises still process data inside expensive legacy OT middleware, only to extract and reprocess it again for IT, analytics, and cloud applications.

A hybrid approach using Kafka and Flink enables organizations to offload processing from legacy middleware while ensuring real-time, scalable, and cost-efficient data streaming across OT, IT, cloud, and edge environments.

Offloading from OT Middleware like OSIsoft PI to Data Streaming with Kafka and Flink

How?

Connect to the existing OT middleware via:

  • A Kafka Connector (if available).
  • HTTP APIs, OPC UA, or MQTT for data extraction.
  • Custom integrations for proprietary OT protocols.
  • Lightweight edge processing to pre-filter data before ingestion.

Use Kafka for real-time ingestion, ensuring all OT data is available in a scalable, event-driven pipeline.

Process data once with Flink to:

  • Apply real-time transformations, aggregations, and filtering at scale.
  • Perform predictive analytics and anomaly detection before storing or forwarding data.
  • Enrich OT data with IT context (e.g., adding metadata from ERP or MES; see the sketch after this list).

Distribute processed data to the right destinations, such as:

  • Time-series databases for historical analysis and monitoring.
  • Enterprise IT systems (ERP, MES, CMMS, BI tools) for decision-making.
  • Cloud analytics and AI platforms for advanced insights.
  • Edge and on-prem applications that need real-time operational intelligence.
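The enrichment step referenced above can be as simple as a consume-enrich-produce loop. This minimal sketch assumes the illustrative topic names from earlier examples and uses a static dictionary as a stand-in for ERP/MES context:

```python
# Minimal sketch: consume raw OT events, enrich with IT context, produce once.
# Topic names and the static lookup table are illustrative assumptions.
import json

from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "ot-enrichment",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["iot.sensor.raw"])
producer = Producer({"bootstrap.servers": "localhost:9092"})

# Stand-in for ERP/MES context; in practice this would come from a changelog
# topic or a lookup service.
erp_metadata = {"press-01": {"plant": "Bamberg", "line": "A"}}

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    event["erp"] = erp_metadata.get(event.get("device_id"), {})
    # The enriched event is written once; all downstream systems consume it.
    producer.produce("iot.sensor.enriched", value=json.dumps(event))
    producer.poll(0)
```

The enriched topic then feeds all destinations listed above, so the transformation runs once instead of once per target system.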

Result?

  • Eliminate redundant processing across OT and IT, reducing costs.
  • Real-time data availability for analytics, automation, and AI-driven decision-making.
  • Unified, event-driven architecture that integrates seamlessly with on-premise, edge, hybrid, and cloud environments.
  • Flexibility to migrate OT workloads over time, without disrupting current operations.

By offloading costly data processing from legacy OT middleware, enterprises can modernize their industrial data infrastructure while maintaining interoperability, efficiency, and scalability.

2. Lift-and-Shift: Reduce Costs While Keeping Existing OT Integrations

Why?

Many enterprises rely on legacy OT middleware like OSIsoft PI, proprietary SCADA systems, or industry-specific data hubs for storing and processing industrial data. However, these solutions come with high licensing costs, limited scalability, and an inflexible architecture.

A lift-and-shift approach provides an immediate cost reduction by offloading data ingestion and storage to Apache Kafka while keeping existing integrations intact. This allows organizations to modernize their infrastructure without disrupting current operations.

How?

Use the Strangler Fig Design Pattern as a gradual modernization approach where new systems incrementally replace legacy components, reducing risk and ensuring a seamless transition:

Strangler Fig Pattern to Integrate, Migrate, Replace

“The most important reason to consider a strangler fig application over a cut-over rewrite is reduced risk.” (Martin Fowler)

Replace expensive OT middleware for ingestion and storage:

  • Deploy Kafka as a scalable, real-time event backbone to collect and distribute data.
  • Offload sensor, PLC, and SCADA data from OSIsoft PI, legacy brokers, or proprietary middleware.
  • Maintain connectivity with existing OT applications to prevent workflow disruption.

Streamline OT data processing:

  • Store and distribute data in Kafka instead of proprietary, high-cost middleware storage.
  • Leverage schema-based data governance to ensure compatibility across IT and OT systems (see the sketch after this list).
  • Reduce data duplication by ingesting once and distributing to all required systems.
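For the schema governance bullet above, here is a hedged sketch using Confluent Schema Registry: sensor events are serialized against a registered Avro schema so every OT and IT consumer shares one contract. Registry URL, topic name, and the schema itself are assumptions:

```python
# Hedged sketch: schema-governed producer with Confluent Schema Registry.
from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import MessageField, SerializationContext

schema_str = """
{
  "type": "record",
  "name": "SensorReading",
  "fields": [
    {"name": "device_id", "type": "string"},
    {"name": "temperature", "type": "double"},
    {"name": "ts", "type": "long"}
  ]
}
"""

registry = SchemaRegistryClient({"url": "http://localhost:8081"})
serializer = AvroSerializer(registry, schema_str)
producer = Producer({"bootstrap.servers": "localhost:9092"})

reading = {"device_id": "press-01", "temperature": 72.4, "ts": 1735689600000}
producer.produce(
    topic="iot.sensor.avro",
    value=serializer(reading, SerializationContext("iot.sensor.avro", MessageField.VALUE)),
)
producer.flush()
```

Any event that does not match the registered schema fails at serialization time, which keeps incompatible data out of the shared pipeline.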

Maintain existing IT and analytics integrations:

  • Keep connections to ERP, MES, and BI platforms via Kafka connectors.
  • Continue using existing dashboards and reports while transitioning to modern analytics platforms.
  • Avoid vendor lock-in and enable future migration to cloud or hybrid solutions.

Result?

  • Immediate cost savings by reducing reliance on expensive middleware storage and licensing fees.
  • No disruption to existing workflows, ensuring continued operational efficiency.
  • Scalable, future-ready architecture with the flexibility to expand to edge, cloud, or hybrid environments over time.
  • Real-time data streaming capabilities, paving the way for predictive analytics, AI-driven automation, and IoT-driven optimizations.

A lift-and-shift approach serves as a stepping stone toward full OT modernization, allowing enterprises to gradually transition to a fully event-driven, real-time architecture.

3. Full OT Middleware Replacement: Cloud-Native, Scalable, and Future-Proof

Why?

Legacy OT middleware systems were designed for on-premise, batch-based, and proprietary environments, making them expensive, inflexible, and difficult to scale. As industries embrace cloud-native architectures, edge computing, and real-time analytics, replacing traditional OT middleware with event-driven streaming platforms enables greater flexibility, cost efficiency, and real-time operational intelligence.

A full OT middleware replacement eliminates vendor lock-in, outdated integration methods, and high-maintenance costs while enabling scalable, event-driven data processing that works across edge, on-premise, and cloud environments.

How?

Use Kafka and Flink as the Core Data Streaming Platform

  • Kafka replaces legacy data brokers and middleware storage by handling high-throughput event ingestion and real-time data distribution.
  • Flink provides advanced real-time analytics, anomaly detection, and predictive maintenance capabilities.
  • Process OT and IT data in real-time, eliminating batch-based limitations.

Replace Proprietary Connectors with Lightweight, Open Standards

  • Deploy MQTT or OPC UA gateways to enable seamless communication with sensors, PLCs, SCADA, and industrial controllers.
  • Eliminate complex, costly middleware like OSIsoft PI with low-latency, open-source integration.
  • Leverage Apache PLC4X for industrial protocol connectivity, avoiding proprietary vendor constraints.
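Apache PLC4X itself is a Java library. As an illustrative Python alternative for the OPC UA path, the open-source asyncua client can poll a tag and forward readings to Kafka; the endpoint URL and node id below are assumptions:

```python
# Hedged sketch: poll an OPC UA node and forward readings to Kafka.
import asyncio
import json

from asyncua import Client
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

async def main():
    # Endpoint and node id are illustrative; a real deployment would also
    # configure security policies and certificates.
    async with Client(url="opc.tcp://localhost:4840/freeopcua/server/") as client:
        node = client.get_node("ns=2;i=2")  # e.g. a temperature tag
        while True:
            value = await node.read_value()
            producer.produce("iot.opcua.temperature", value=json.dumps({"value": value}))
            producer.poll(0)
            await asyncio.sleep(1)

asyncio.run(main())
```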

Adopt a Cloud-Native, Hybrid, or On-Premise Storage Strategy

  • Store time-series data in scalable, purpose-built databases like InfluxDB or TimescaleDB (see the sketch after this list).
  • Enable real-time query capabilities for monitoring, analytics, and AI-driven automation.
  • Ensure data availability across on-premise infrastructure, hybrid cloud, and multi-cloud deployments.
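A minimal sketch of the time-series sink mentioned in the list above: a Kafka consumer writes enriched events into TimescaleDB over the plain PostgreSQL wire protocol. Table layout and connection details are assumptions; in production, a Kafka Connect JDBC sink connector is the more common path:

```python
# Minimal sketch: Kafka consumer writing time-series rows into TimescaleDB.
import json

import psycopg2
from confluent_kafka import Consumer

conn = psycopg2.connect("dbname=iot user=postgres password=postgres host=localhost")
conn.autocommit = True
cur = conn.cursor()

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "timescale-sink",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["iot.sensor.enriched"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    # Assumes a hypertable: sensor_readings(device_id, temperature, ts).
    cur.execute(
        "INSERT INTO sensor_readings (device_id, temperature, ts) "
        "VALUES (%s, %s, to_timestamp(%s))",
        (event["device_id"], event["temperature"], event["ts"] / 1000.0),
    )
```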

Journey from Legacy OT Middleware to Hybrid Cloud

Modernize IT and Business Integrations

  • Enable seamless OT-to-IT integration with ERP, MES, BI, and AI/ML platforms.
  • Stream data directly into cloud-based analytics services, digital twins, and AI models.
  • Build real-time dashboards and event-driven applications for operators, engineers, and business stakeholders.

OT Middleware Integration, Offloading and Replacement with Data Streaming for IoT and IT/OT

Result?

  • Fully event-driven and cloud-native OT architecture that eliminates legacy bottlenecks.
  • Real-time data streaming and processing across all industrial environments.
  • Scalability for high-throughput workloads, supporting edge, hybrid, and multi-cloud use cases.
  • Lower operational costs and reduced maintenance overhead by replacing proprietary, heavyweight OT middleware.
  • Future-ready, open, and extensible architecture built on Kafka, Flink, and industry-standard protocols.

By fully replacing OT middleware, organizations gain real-time visibility, predictive analytics, and scalable industrial automation, unlocking new business value while ensuring seamless IT/OT integration.

Helin is an excellent example of a cloud-native IT/OT data solution powered by Kafka and Flink, focusing on real-time data integration and analytics in industrial and operational environments. Its industry focus is the maritime and energy sector, but the approach is relevant across all IIoT industries.

Why This Matters: The Future of OT is Real-Time & Open for Data Sharing

The next generation of OT architectures is being built on open standards, real-time streaming, and hybrid cloud.

  • Most new industrial sensors, machines, and control systems are now designed with Kafka, MQTT, and OPC UA compatibility.
  • Modern IT architectures demand event-driven data pipelines for AI, analytics, and automation.
  • Edge and hybrid computing require scalable, fault-tolerant, real-time processing.

Industrial IoT Data Streaming Everywhere Edge Hybrid Cloud with Apache Kafka and Flink

Use Kafka Cluster Linking for seamless bi-directional data replication and command and control, ensuring low-latency, high-availability data synchronization across on-premise, edge, and cloud environments.

Enable multi-region and hybrid edge-to-cloud architectures with real-time data mirroring to allow organizations to maintain data consistency across global deployments while ensuring business continuity and failover capabilities.

It’s Time to Move Beyond Legacy OT Middleware to Open Standards like MQTT, OPC-UA, Kafka

The days of expensive, proprietary, and rigid OT middleware are numbered (at least for new deployments). Industrial enterprises need real-time, scalable, and open architectures to meet the growing demands of automation, predictive maintenance, and industrial IoT. By embracing open IoT and data streaming technologies, companies can seamlessly bridge the gap between Operational Technology (OT) and IT, ensuring efficient, event-driven communication across industrial systems.

MQTT, OPC-UA, and Apache Kafka are a match made in heaven for industrial IoT:

  • MQTT enables lightweight, publish-subscribe messaging for industrial sensors and edge devices.
  • OPC-UA provides secure, interoperable communication between industrial control systems and modern applications.
  • Kafka acts as the high-performance event backbone, allowing data from OT systems to be streamed, processed, and analyzed in real time.

Whether lifting and shifting, optimizing hybrid processing, or fully replacing legacy middleware, data streaming is the foundation for the next generation of OT and IT integration. With Kafka at the core, enterprises can decouple systems, enhance scalability, and unlock real-time analytics across the entire industrial landscape.

Stay ahead of the curve! Subscribe to my newsletter for insights into data streaming and connect with me on LinkedIn to continue the conversation. And make sure to download my free book about data streaming use cases and industry success stories.

Apache Kafka in Manufacturing at Automotive Supplier Brose for Industrial IoT Use Cases
https://www.kai-waehner.de/blog/2024/06/13/apache-kafka-in-manufacturing-at-automotive-supplier-brose-for-industrial-iot-use-cases/
Thu, 13 Jun 2024

Data streaming unifies OT/IT workloads by connecting information from sensors, PLCs, robotics and other manufacturing systems at the edge with business applications and the big data analytics world in the cloud. This blog post explores how the global automotive supplier Brose deploys a hybrid industrial IoT architecture using Apache Kafka in combination with Eclipse Kura, OPC-UA, MuleSoft and SAP.

Data Streaming with Apache Kafka for Industrial IoT in the Automotive Industry at Brose

Data Streaming and Industrial IoT / Industry 4.0

Data streaming with Apache Kafka plays a critical role in Industrial IoT by enabling real-time data ingestion, processing, and analysis from various industrial devices and sensors. Kafka’s high throughput and scalability ensure that it can reliably handle and integrate massive streams of data from IoT devices into analytics platforms for valuable insights. This real-time capability enhances predictive maintenance, operational efficiency, and overall automation in industrial settings.

Here is an exemplary hybrid industrial IoT architecture with data streaming at the edge in the factory and 5G supply chain environments synchronizing in real-time with business applications and analytics / AI platforms in the cloud:

Brose – A Global Automotive Supplier

Brose is a global automotive supplier headquartered in beautiful Franconia, Bavaria, Germany. The company has a global presence with 70 locations in 25 countries across 5 continents and about 30,000 employees.

Brose specializes in mechatronic systems for vehicle doors, seats, and electric motors. They develop and manufacture innovative products that enhance vehicle comfort, safety, and efficiency, serving major car manufacturers worldwide.

Brose Automotive Supplier Product Portfolio
Source: Brose

Brose’s Hybrid Architecture for Industry 4.0 with Eclipse Kura, OPC UA, Kafka, SAP and MuleSoft

Brose is an excellent example of combining data streaming using Confluent with other technologies: open-source Eclipse Kura and OPC-UA on the OT and edge side, and IT infrastructure and cloud software like SAP, Splunk, SQL Server, AWS Kinesis, and MuleSoft:

Brose IoT Architecture with Apache Kafka Eclipse Kura OPC UA SAP Mulesoft
Source: Brose

Here is how it works according to Sven Matuschzik, Director of IT-Platforms and Databases at Brose:

Regional Kafka on-premise clusters are embedded within the IIoT and production platform, facilitating seamless data flow from the shop floor to the business world in combination with other integration tools. This hybrid IoT streaming architecture connects machines to the IT infrastructure, mastering various technologies, and ensuring zero trust security with micro-segmentation. It manages latencies between sites and central IT, enables two-way communication between machines and the IT world, and maintains high data quality from the shop floor.
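The two-way communication between machines and the IT world can be sketched as a small edge service: IT applications publish commands to a Kafka topic, and a consumer at the plant forwards them to the machine, committing offsets only after successful execution. Topic names and the machine call are illustrative assumptions:

```python
# Hedged sketch: executing IT-issued commands on the shop floor via Kafka.
import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "plant-command-executor",
    "enable.auto.commit": False,  # commit only after the machine accepted it
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["plant.commands"])

def send_to_machine(command: dict) -> None:
    # Placeholder for the actual OT call (OPC UA write, PLC driver, ...).
    print(f"executing {command}")

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    send_to_machine(json.loads(msg.value()))
    consumer.commit(msg)  # at-least-once delivery towards the shop floor
```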

For more insights from Brose (and Siemens) about IoT and data streaming with Apache Kafka, listen to the following interactive discussion.

Interactive Discussion with Siemens and Brose about Data Streaming and IoT

Brose and Siemens discussed with me:

  • the practical strategies employed by Brose and Siemens to integrate data streaming in IoT for real-time data utilization.
  • the challenges faced by both companies in embracing data streaming, and reveal how they overcame barriers to maximize their potential with a hybrid cloud infrastructure.
  • how these enterprise architectures will be expanded, including real-time data sharing with customers, partners, and suppliers, and the potential impact of artificial intelligence (AI), including GenAI, on data streaming efforts, providing valuable insights to drive business outcomes and operational efficiency.
  • the significance of event-driven architectures and data streaming for enhanced manufacturing processes to improve overall equipment effectiveness (OEE) and seamlessly integrate with existing IT systems like SAP ERP and Salesforce CRM to optimize their operations.

Here is the video recording with Stefan Baer from Siemens and Sven Matuschzik from Brose:

Brose Industrial IoT Webinar with Kafka Confluent
Source: Confluent

Data Streaming with Apache Kafka to Unify Industrial IoT Workloads from Edge to Cloud

Many manufacturers leverage data streaming powered by Apache Kafka to unify the OT/IT world from edge sites like factories to the data center or public cloud for analytics and business applications.

I wrote a lot about data streaming with Apache Kafka and Flink in Industry 4.0, Industrial IoT and OT/IT modernization. Here are a few of my favourite articles:

What does your IoT architecture look like? Do you already use data streaming? What are the use cases and enterprise architecture? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The State of Data Streaming for Healthcare with Apache Kafka and Flink
https://www.kai-waehner.de/blog/2023/11/27/the-state-of-data-streaming-for-healthcare-in-2023/
Mon, 27 Nov 2023

This blog post explores the state of data streaming for the healthcare industry powered by Apache Kafka and Apache Flink. IT modernization and innovation with pioneering technologies like sensors, telemedicine, or AI/machine learning are explored. I look at enterprise architectures and customer stories from Humana, Recursion, BHG (former Bankers Healthcare Group), and more. A complete slide deck and on-demand video recording are included.

This blog post explores the state of data streaming for the healthcare industry. The digital disruption combined with growing regulatory requirements and IT modernization efforts require a reliable data infrastructure, real-time end-to-end observability, fast time-to-market for new features, and integration with pioneering technologies like sensors, telemedicine, or AI/machine learning. Data streaming allows integrating and correlating legacy and modern interfaces in real-time at any scale to improve most business processes in the healthcare sector much more cost-efficiently.

I look at trends in the healthcare industry to explore how data streaming helps as a business enabler, including customer stories from Humana, Recursion, BHG (former Bankers Healthcare Group), Evernorth Health Services, and more. A complete slide deck and on-demand video recording are included.

The State of Data Streaming for Healthcare in 2023 with Apache Kafka and Flink

The digitalization of the healthcare sector and its disruptive use cases are exciting. Countries where healthcare is not part of the public administration innovate quickly. However, regulation and data privacy are crucial across the world. And even innovative technologies and cloud services need to comply with the law and, in parallel, connect to legacy platforms and protocols.

Regulation and interoperability

Healthcare often does not have a choice. Regulations by the government must be implemented by a specific deadline. IT modernization, adoption of new technologies, and integration with the legacy world are mandatory. Many regulations demand Open APIs and interfaces. But even if not enforced, the public sector does itself a favour by adopting open technologies for data sharing between different sectors and the members.

A concrete example: The Interoperability and Patient Access final rule (CMS-9115-F), as explained by the US government, “aims to put patients first, giving them access to their health information when they need it most and in a way they can best use it.

  • Interoperability = the ability of two or more systems to exchange health information and use the information once it is received.
  • Patient Access = the ability of consumers to access their health care records.

Lack of seamless data exchange in healthcare has historically detracted from patient care, leading to poor health outcomes and higher costs. The CMS Interoperability and Patient Access final rule establishes policies that break down barriers in the nation’s health system to enable better patient access to their health information, improve interoperability and unleash innovation while reducing the burden on payers and providers.

Patients and their healthcare providers will be more informed, which can lead to better care and improved patient outcomes, while reducing burden. In a future where data flows freely and securely between payers, providers, and patients, we can achieve truly coordinated care, improved health outcomes, and reduced costs.”

Digital disruption and automated workflows

Gartner has a few interesting insights about the evolution of the healthcare sector. The digital disruption is required to handle revenue reduction and revenue reinvention because of economic pressure, scarce and expensive talent, and supply challenges:

Challenges for the Digital Disruption in the Health System

Gartner points out that real-time workflows and automation are critical across the entire health process to enable an optimal experience:

Real Time Automated Interoperable Data and Workflows

Therefore, data streaming is very helpful in implementing new digitalized healthcare processes.

Data streaming in the healthcare industry

Adopting healthcare trends like telemedicine, automated member service with Generative AI (GenAI), or automated claim processing is only possible if enterprises in the healthcare sector can provide and correlate information at the right time in the proper context. Real-time, which means using the information in milliseconds, seconds, or minutes, is almost always better than processing data later:

Use Cases for Real-Time Data Streaming in the Healthcare Industry with Apache Kafka and Flink

Data streaming combines the power of real-time messaging at any scale with storage for true decoupling, data integration, and data correlation capabilities. Apache Kafka is the de facto standard for data streaming.

The following blog series about data streaming with Apache Kafka in the healthcare industry is a great starting point to learn more about data streaming in the health sector, including a few industry-specific and Kafka-powered case studies:

The healthcare industry applies various software development and enterprise architecture trends for cost, elasticity, security, and latency reasons. The three major topics I see these days at customers are:

  • Event-driven architectures (in combination with request-response communication) to enable domain-driven design and flexible technology choices
  • Data mesh for building new data products and real-time data sharing with internal platforms and partner APIs
  • Fully managed SaaS (whenever doable from compliance and security perspective) to focus on business logic and faster time-to-market

Let’s look deeper into some enterprise architectures that leverage data streaming for healthcare use cases.

Event-driven architecture for integration and IT modernization

IT modernization requires integration between legacy and modern applications. The integration challenges include different protocols (often proprietary and complex), various communication paradigms (asynchronous, request-response, batch), and SLAs (transactions, analytics, reporting).

Here is an example of a data integration workflow combining clinical health data and claims in EDI / EDIFACT format, data from legacy databases, and modern microservices:

Public Healthcare Data Automation with Data Streaming

One of the biggest problems in IT modernization is data consistency between files, databases, messaging platforms, and APIs. That is a sweet spot for Apache Kafka: Providing data consistency between applications no matter what technology, interface or API they use.
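A hedged sketch of that integration idea: a legacy, pipe-delimited claim record (a simplified stand-in for EDI/EDIFACT) is parsed into a canonical event and published once to Kafka, where files, databases, and APIs all consume the same consistent view. The record layout and topic name are illustrative assumptions:

```python
# Hedged sketch: canonicalize legacy claim records into a Kafka topic.
import json

from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def parse_legacy_claim(line: str) -> dict:
    # Illustrative flat record, e.g. "CLM|12345|2024-05-01|987.50";
    # real EDI/EDIFACT parsing would use a dedicated library or connector.
    _, claim_id, service_date, amount = line.strip().split("|")
    return {"claim_id": claim_id, "service_date": service_date, "amount": float(amount)}

for line in ["CLM|12345|2024-05-01|987.50", "CLM|12346|2024-05-02|120.00"]:
    event = parse_legacy_claim(line)
    producer.produce("claims.canonical", key=event["claim_id"], value=json.dumps(event))

producer.flush()
```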

Data mesh for real-time data sharing and consistency

Data sharing across business units is important for any organization. The healthcare industry has to combine very different data sets, like clinical and IoT sensor data, claims and payment transactions, and 3rd party interfaces.

 

Data Streaming with Apache Kafka and Flink in the Healthcare Sector

Data consistency is one of the most challenging problems in the healthcare sector. Apache Kafka ensures data consistency across all applications and databases, whether these systems operate in real-time, near-real-time, or batch.

One sweet spot of data streaming is that you can easily connect new applications to the existing infrastructure or modernize existing interfaces, like migrating from an on-premise data warehouse to a cloud SaaS offering.

New customer stories for data streaming in the healthcare sector

Innovation is often slower in the healthcare sector. Automation and digitalization change how healthcare companies process member data, execute claim processing, integrate payment processors, or create new business models with telemedicine or sensor data in hospitals.

Most healthcare companies use a hybrid cloud approach to improve time-to-market, increase flexibility, and focus on business logic instead of operating all IT infrastructure on premises. The integration between legacy protocols like EDIFACT and modern applications is still one of the toughest challenges.

Here are a few customer stories from healthcare organizations for IT modernization and innovation with new technologies:

  • BHG Financial (formerly: Bankers Healthcare Group): Direct lender for healthcare professionals offering loans, credit cards, and insurance
  • Evernorth Health Services: Hybrid integration between on-premise mainframe and microservices on AWS cloud
  • Humana: Data integration and analytics at the point of care
  • Recursion: Accelerating drug discovery with a hybrid machine learning architecture

Resources to learn more

This blog post is just the starting point. Learn more about data streaming with Apache Kafka and Apache Flink in the healthcare industry in the following on-demand webinar recording, the related slide deck, and further resources, including pretty cool lightboard videos about use cases.

On-demand video recording

The video recording explores the healthcare industry’s trends and architectures for data streaming. The primary focus is the data streaming architectures and case studies.

I am excited to have presented this webinar in my interactive light board studio:

Lightboard Video about Apache Kafka and Flink in Healthcare

This creates a much better experience, especially in a post-pandemic time where many people suffer from “Zoom fatigue”.

Check out our on-demand recording:

Video: Data Streaming in Real Life in Healthcare

Slides

If you prefer learning from slides, check out the deck used for the above recording:


Case studies and lightboard videos for data streaming in the healthcare industry

The state of data streaming for healthcare in 2023 is interesting. IT modernization is the most important initiative across most healthcare companies and organizations. This includes cost reduction by migrating from legacy infrastructure like the mainframe, hybrid cloud architectures with bi-directional data sharing, and innovative new use cases like telehealth.

We recorded lightboard videos showing the value of data streaming simply and effectively. These five-minute videos explore the business value of data streaming, related architectures, and customer stories. Here is an example of cost reduction through mainframe offloading.

Healthcare is just one of many industries that leverages data streaming with Apache Kafka and Apache Flink. Every month, we talk about the status of data streaming in a different industry. Manufacturing was first, financial services second, then retail, telcos, gaming, and so on. Check out my other blog posts.

How do you modernize IT infrastructure in the healthcare sector? Do you already leverage data streaming with Apache Kafka and Apache Flink? Maybe even in the cloud as a serverless offering? Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.

Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
https://www.kai-waehner.de/blog/2022/06/27/data-warehouse-vs-data-lake-vs-data-streaming-friends-enemies-frenemies/
Mon, 27 Jun 2022

The concepts and architectures of a data warehouse, a data lake, and data streaming are complementary to solving business problems. Storing data at rest for reporting and analytics requires different capabilities and SLAs than continuously processing data in motion for real-time workloads. Many open-source frameworks, commercial products, and SaaS cloud services exist. Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched for wrong use cases by vendors. Let’s explore this dilemma in a blog series. Learn how to build a modern data stack with cloud-native technologies. This is part 1: Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?

Data Warehouse vs Data Lake vs Data Streaming Comparison

Blog Series: Data Warehouse vs. Data Lake vs Data Streaming

This blog series explores concepts, features, and trade-offs of a modern data stack using a data warehouse, data lake, and data streaming together:

  1. THIS POST: Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
  2. Data Streaming for Data Ingestion into the Data Warehouse and Data Lake
  3. Data Warehouse Modernization: From Legacy On-Premise to Cloud-Native Infrastructure
  4. Case Studies: Cloud-native Data Streaming for Data Warehouse Modernization
  5. Lessons Learned from Building a Cloud-Native Data Warehouse

Stay tuned for a dedicated blog post for each topic as part of this blog series. I will link the blogs here as soon as they are available (in the next few weeks). Subscribe to my newsletter to get an email after each publication (no spam or ads).

I also created a short ten-minute video that explains the differences between data streaming and a lakehouse (and why they are complementary):

The value of data: Transactional vs. analytical workloads

The last decade offered many articles, blogs, and presentations about data becoming the new oil. Today, nobody questions that data-driven business processes change the world and enable innovation across industries.

Data-driven business processes require both real-time data processing and batch processing. Think about the following flow of events across applications, domains, and organizations:

A stream of events - the business process for transactions and analytics

An event is business information or technical information. Events happen all the time. A business process in the real world requires the correlation of various events.

How critical is an event?

The criticality of an event defines the outcome. Potential impacts can be increased revenue, reduced risk, reduced cost, or improved customer experience.

  • Business transaction: Ideally, zero downtime and zero data loss. Example: Payments need to be processed exactly once (see the sketch after this list).
  • Critical analytics: Ideally, zero downtime. Data loss of a single sensor event might be okay. Alerting on the aggregation of events is more critical. Example: Continuous monitoring of IoT sensor data and a (predictive) machine failure alert.
  • Non-critical analytics: Downtime and data loss are not good but do not kill the whole business. It is an accident, but not a disaster. Example: Reporting and business intelligence to forecast demand.
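For the exactly-once business transaction case above, Kafka’s transactional producer is the relevant building block. Here is a minimal sketch with illustrative topic and transaction id (transactions matter most in read-process-write pipelines, but the API is the same):

```python
# Hedged sketch: atomic, exactly-once visible write of a payment event.
import json

from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "transactional.id": "payments-service-1",  # illustrative id
    "enable.idempotence": True,
})
producer.init_transactions()

payment = {"payment_id": "p-42", "amount": 99.95, "currency": "EUR"}

producer.begin_transaction()
producer.produce("payments", key=payment["payment_id"], value=json.dumps(payment))
producer.commit_transaction()  # the event becomes visible exactly once, or not at all
```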

When to process an event?

Real-time usually means end-to-end processing within milliseconds or seconds. If you don’t need real-time decisions, batch processing (i.e., after minutes, hours, days) or on-demand (i.e., request-reply) is sufficient.

  • Business transactions are often real-time: A transaction like a payment usually requires real-time processing (e.g., before the customer leaves the store; before you ship the item; before you leave the ride-hailing car).
  • Critical analytics is usually real-time: Critical analytics very often requires real-time processing (e.g., detecting fraud before it happens; predicting a machine failure before it breaks; upselling to customers before they leave the store).
  • Non-critical analytics is usually not real-time: Finding insights in historical data is usually done in a batch process using paradigms like complex SQL queries, map-reduce, or complex algorithms (e.g., reporting; model training with machine learning algorithms; forecasting).

With these basics about processing events, let’s understand why storing all events in a single, central data lake is not the solution to all problems.

Flexibility through decentralization and best-of-breed

The traditional data warehouse or data lake approach is to ingest all data from all sources into a central storage system for centralized data ownership. The sky (and your budget) is the limit with current big data and cloud technologies.

However, architectural concepts like domain-driven design, microservices, and data mesh show that decentralized ownership is the right choice for modern enterprise architecture:

Decentralization of the Data Warehouse and Data Lake using Data Streaming

No worries. The data warehouse and the data lake are not dead but more relevant than ever before in a data-driven world. Both make sense for many use cases. Even in one of these domains, larger organizations don’t use a single data warehouse or data lake. Selecting the right tool for the job (in your domain or business unit) is the best way to solve the business problem.

There are good reasons people are pleased with Databricks for batch ETL, machine learning, and now even data warehouse, but still prefer a lightweight cloud SQL database like AWS RDS (fully managed PostgreSQL) for some use cases.

There are good reasons happy Splunk users also ingest some data into Elasticsearch instead. And why Cribl is getting more and more traction in this space, too.

There are good reasons some projects leverage Apache Kafka as the database. Storing data long-term in Kafka only makes sense for specific use cases (like compacted topics, key/value queries, streaming analytics). Kafka does not replace other databases or data lakes.
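For the compacted-topic case just mentioned, here is a hedged sketch: log compaction tells Kafka to retain the latest value per key, which is what enables key/value lookups. Topic name and sizing are illustrative assumptions (replication factor 3 presumes a three-broker cluster):

```python
# Hedged sketch: create a log-compacted topic via the Kafka Admin API.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

topic = NewTopic(
    "customer-profiles",
    num_partitions=6,
    replication_factor=3,
    config={"cleanup.policy": "compact"},  # keep the newest record per key
)

for name, future in admin.create_topics([topic]).items():
    future.result()  # raises on failure
    print(f"created compacted topic {name}")
```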

Choose the right tool for the job with decentralized data ownership!

With this in mind, let’s explore the use cases and added value of a modern data warehouse (and how it relates to data lakes and the new buzz lakehouse).

Data warehouse: Reporting and business intelligence with data at rest

A data warehouse (DWH) provides capabilities for reporting and data analysis. It is considered a core component of business intelligence.

Use cases for data at rest

No matter if you use a product called a data warehouse, data lake, or lakehouse. The data is stored at rest for further processing:

  • Reporting and Business Intelligence: Fast and flexible availability of reports, statistics, and key figures, for example, to identify correlations between market and service offering
  • Data Engineering: Integration of data from differently structured and distributed data sets to enable the identification of hidden relationships between data
  • Big Data Analytics and AI / Machine Learning: Global view of the source data and thus overarching evaluations to find unknown insights to improve business processes and interrelationships.

Some readers might say: Only the first is a use case for a data warehouse, and the other two are for a data lake or lakehouse! It all depends on the definition.

Data warehouse architecture

DWHs are central repositories of integrated data from disparate sources. They store historical data in one storage system. The data is stored at rest, i.e., saved for later analysis and processing. Business users analyze the data to find insights.

Data Warehouse Architecture

The data is uploaded from operational systems, such as IoT data, ERP, CRM, and many other applications. Data cleansing and data quality assurance are crucial pieces in the DWH pipeline. Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) are the two main approaches to building a data warehouse system. Data marts help to focus on a single subject or line of business within the data warehouse ecosystem.

The relation of the data warehouse to data lake and lakehouse

The focus of a data warehouse is reporting and business intelligence using structured data. In contrast, the data lake is a synonym for storing and processing raw big data. A data lake was built with technologies like Hadoop, HDFS, and Hive in the past. Today, the data warehouse and data lake have merged into a single solution. A cloud-native DWH supports big data. Similarly, a cloud-native data lake needs business intelligence with traditional tools.

Databricks: The evolution from the data lake to the data warehouse

That’s true for almost all vendors. For example, look at the history of one of the leading big data vendors: Databricks, known for being THE Apache Spark company. The company started as a commercial vendor behind Apache Spark, a big data batch processing platform. The platform was enhanced with (some) real-time workloads using micro-batching. A few milestones later, Databricks is an entirely different company today, focusing on cloud, data analytics, and data warehousing. Databricks’ strategy changed from:

  • open source to the cloud
  • self-managed software to fully-managed serverless offerings
  • focus on Apache Spark to AI / Machine Learning and later added data warehousing capabilities
  • a single product to a vast product portfolio around data analytics, including standardized data formats (“Delta Lake”), governance, ETL tooling (Delta Live Tables), and more.

Vendors like Databricks and AWS also coined a new buzzword for this merger of the data lake, data warehouse, business intelligence, and real-time capabilities: the Lakehouse.

The Lakehouse (sometimes called Data Lakehouse) is nothing new. It combines characteristics of separate platforms. I wrote an article about building a cloud-native serverless lakehouse on AWS using Kafka in conjunction with AWS analytics platforms.

Snowflake: The evolution from the data warehouse to the data lake

Snowflake came from the other direction. It was the first genuinely cloud-native data warehouse available on all major clouds. Today, Snowflake provides many more capabilities beyond the traditional business intelligence spectrum. For instance, data and software engineers have features to interact with Snowflake’s data lake through other technologies and APIs. The data engineer requires a Python interface to analyze historical data, while the software engineer prefers real-time data ingestion and analysis at any scale.

No matter if you build a data warehouse, data lake, or lakehouse: The crucial point is understanding the difference between data in motion and data at rest to find the right enterprise architecture and components for your solution. The following sections explore why a good data warehouse architecture requires both and how they complement each other very well.

Transactional real-time workloads should not run within the data warehouse or data lake! Separation of concerns is critical due to different uptime SLAs, regulatory and compliance laws, and latency requirements.

Data streaming: Supplementing the modern data warehouse with data in motion

Let’s clarify: Data streaming is NOT the same as data ingestion! You can use a data streaming technology like Apache Kafka for data ingestion into a data warehouse or data lake. Most companies do this. Fine and valuable.

BUT: A data streaming platform like Apache Kafka is so much more than just an ingestion layer. Hence, it differs significantly from ingestion engines like AWS Kinesis, Google Pub/Sub, and similar tools.

Data streaming is NOT the same as data ingestion

Data streaming provides messaging, persistence, integration, and processing capabilities. High scalability for millions of messages per second, high availability including backward compatibility and rolling upgrades for mission-critical workloads, and cloud-native features are some of the built-in features.

The de facto standard for data streaming is Apache Kafka. Therefore, I mainly use Kafka for data streaming architectures and use cases.

Transactional and analytics use cases for data streaming with Apache Kafka

The number of different use cases for data streaming is almost endless. Please remember that data streaming is NOT just a message queue for data ingestion! While data ingestion into a data lake was the first prominent use case, it represents <5% of actual Kafka deployments. Business applications, streaming ETL middleware, real-time analytics, and edge/hybrid scenarios are some of the other examples:

Use Cases for Data Streaming with Apache Kafka by Business Value

The persistence layer of Kafka enables decentralized microservice architectures for agile and truly decoupled applications.

Keep in mind that Apache Kafka supports transactional and analytics workloads. Both usually have very different uptime, latency, and data loss SLAs. Check out this post and slide deck to learn more about data streaming use cases across industries powered by Apache Kafka.

The next post of this blog series explores why data streaming with tools like Apache Kafka is the prevalent technology for data ingestion into data warehouses and data lakes.

Don’t (try to) use the data warehouse or data lake for data streaming

This blog post explored the difference between data at rest and data in motion:

  • A data warehouse is excellent for reporting and business intelligence.
  • A data lake is perfect for big data analytics and AI / Machine Learning.
  • Data streaming enables real-time use cases.
  • A decentralized, flexible enterprise architecture is required to build a modern data stack around microservices and data mesh.

None of the technologies is a silver bullet. Choose the right tool for the problem. Monolithic architectures do not solve today’s business problems. Storing all data only at rest does not help with the demand for real-time use cases.

The Kappa architecture is a modern approach for real-time and batch workloads to avoid a much more complex infrastructure using the Lambda architecture. Data streaming complements the data warehouse and data lake. The connectivity between these systems is available out-of-the-box if you choose the right vendors (that are often strategic partners, not competitors, as some people think).

For more details, browse to other posts of this blog series:

  1. THIS POST: Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
  2. Data Streaming for Data Ingestion into the Data Warehouse and Data Lake
  3. Data Warehouse Modernization: From Legacy On-Premise to Cloud-Native Infrastructure
  4. Case Studies: Cloud-native Data Streaming for Data Warehouse Modernization
  5. Lessons Learned from Building a Cloud-Native Data Warehouse

How do you combine data warehouse and data streaming today? Is Kafka just your ingestion layer into the data lake? Do you already leverage data streaming for additional real-time use cases? Or is Kafka already the strategic component in the enterprise architecture for decoupled microservices and a data mesh? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Legacy Modernization and Hybrid Multi-Cloud with Kafka in Healthcare
https://www.kai-waehner.de/blog/2022/03/30/legacy-modernization-and-hybrid-multi-cloud-with-kafka-in-healthcare/
Wed, 30 Mar 2022

IT modernization and innovative new technologies change the healthcare industry significantly. This blog series explores how data streaming with Apache Kafka enables real-time data processing and business process automation. Real-world examples show how traditional enterprises and startups increase efficiency, reduce cost, and improve the human experience across the healthcare value chain, including pharma, insurance, providers, retail, and manufacturing. This is part two: Legacy modernization and hybrid multi-cloud. Examples include Optum / UnitedHealth Group, Centene, and Bayer.

Legacy Modernization and Hybrid Multi Cloud with Apache Kafka in Healthcare

Blog Series – Kafka in Healthcare

Many healthcare companies leverage Kafka today. Use cases exist in every domain across the healthcare value chain. Most companies deploy data streaming in different business domains. Use cases often overlap. I tried to categorize real-world deployments into different technical scenarios and added a few examples:

Stay tuned for a dedicated blog post for each of these topics as part of this blog series. I will link the blogs here as soon as they are available (in the next few weeks). Subscribe to my newsletter to get an email after each publication (no spam or ads).

Legacy Modernization and Hybrid Multi-Cloud with Kafka

Application modernization benefits from the Apache Kafka ecosystem for hybrid integration scenarios.

Most enterprises require a reliable and scalable integration between legacy systems such as IBM Mainframe, Oracle, SAP ERP, and modern cloud-native applications like Snowflake, MongoDB Atlas, or AWS Lambda.

I already explored “architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments” some time ago:

Hybrid Cloud Architecture with Apache Kafka

TL;DR: Various alternatives exist to deploy Apache Kafka across data centers, regions, and continents. There is no single best architecture. It always depends on characteristics such as RPO / RTO, SLAs, latency, throughput, etc.

Some deployments focus on on-prem to cloud integration. Others link Kafka clusters on multiple cloud providers. Technologies such as Apache Kafka’s MirrorMaker 2, Confluent Replicator, Confluent Multi-Region Clusters, and Confluent Cluster Linking help build such an infrastructure.

Let’s look at a few real-world deployments in the healthcare sector.

Optum (UnitedHealth Group) – Cloud-native Kafka-as-a-Service

Optum is an American pharmacy benefit manager and health care provider. It is a subsidiary of UnitedHealth Group. The Apache Kafka infrastructure is provided as an internal service, centrally managed, and used by over 200 internal application teams.

Optum built a repeatable, scalable, cost-efficient way to standardize data. They leverage the whole Kafka ecosystem (see the sketch after this list):

  • Data ingestion from multiple resources (Kafka Connect)
  • Data enrichment (Table Joins & Streaming API)
  • Aggregation and metrics calculation (Kafka Streams API)
  • Sinking data to database (Kafka Connect)
  • Near real-time APIs to serve the data
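
Optum’s exact topology is not public, but the enrichment and aggregation steps from the list above map to a few lines of Kafka Streams. A minimal sketch, assuming hypothetical topic names and simple String payloads:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;

import java.time.Duration;
import java.util.Properties;

public class ClaimsEnrichment {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical topics: raw claim events plus a compacted member master-data topic
        KStream<String, String> claims = builder.stream("claims-raw");
        KTable<String, String> members = builder.table("members");

        // Enrichment: join each claim with the matching member record (table join)
        KStream<String, String> enriched =
            claims.join(members, (claim, member) -> claim + "|" + member);
        enriched.to("claims-enriched");

        // Metrics: count claims per member in 5-minute tumbling windows
        enriched.groupByKey()
                .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
                .count()
                .toStream()
                .map((windowedKey, count) -> KeyValue.pair(windowedKey.key(), count.toString()))
                .to("claims-metrics");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "claims-enrichment");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        new KafkaStreams(builder.build(), props).start();
    }
}
```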

Optum’s Kafka Summit talk explored the journey and maturity curve for their data streaming evolution:

Optum United Healthcare Apache Kafka Journey

As you can see, the journey started with a self-managed Kafka cluster on-premises. Over time, they migrated to a cloud-native Kubernetes environment and built an internal Kafka-as-a-Service offering. Right now, Optum works on a multi-cloud enterprise architecture to deploy across multiple cloud service providers.

Centene – Data Integration for M&A across Infrastructures

Centene is the largest Medicaid and Medicare Managed Care Provider in the US. The healthcare insurer acts as an intermediary for government-sponsored and privately insured healthcare programs. Centene’s mission is to “help people live healthier lives and to help make the health system work better for everyone”.

Centene’s critical challenge is an interesting one: growth! Many mergers and acquisitions happened in the last decade: Envolve, HealthNet, Fidelis, and Wellcare.

Data integration and processing at scale in real-time between various systems, infrastructures, and cloud environments is a considerable challenge. Kafka provides Centene with valuable capabilities, as they explained in an online talk:

  • Highly scalable
  • High autonomy/decoupling
  • High availability & data resiliency
  • Real-time data transfer
  • Complex stream processing

The event-driven integration architecture leverages Apache Kafka with MongoDB:

Centene Cloud Architecture with Kafka MongoDB ETL Pipeline
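
Centene’s exact configuration is not public, but the Kafka-to-MongoDB leg of such an ETL pipeline is typically wired up with the MongoDB sink connector for Kafka Connect rather than custom code. A hedged sketch of the connector configuration, expressed here as a Java map (in practice, the equivalent JSON is POSTed to the Connect REST API); topic, database, collection, and connection details are hypothetical:

```java
import java.util.Map;

public class MongoSinkConfig {
    // Hypothetical settings for the MongoDB Kafka sink connector
    static final Map<String, String> CONFIG = Map.of(
        "connector.class", "com.mongodb.kafka.connect.MongoSinkConnector",
        "topics", "member-events",
        "connection.uri", "mongodb+srv://user:pass@cluster0.example.mongodb.net",
        "database", "healthcare",
        "collection", "member_events",
        "key.converter", "org.apache.kafka.connect.storage.StringConverter",
        "value.converter", "org.apache.kafka.connect.json.JsonConverter",
        "value.converter.schemas.enable", "false"
    );
}
```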

Bayer – Hybrid Multi-Cloud Data Streaming

Bayer AG is a German multinational pharmaceutical and life sciences company and one of the largest in the world. They leverage Kafka in various use cases and business domains. The following scenario is from Monsanto.

Bayer adopted a cloud-first strategy and started a multi-year transition to the cloud to provide real-time data flows across hybrid and multi-cloud infrastructures.

The Kafka-based cross-data center DataHub facilitated the migration and the shift to real-time stream processing. It has seen strong enterprise adoption and supports a myriad of use cases. The Apache Kafka ecosystem is the “middleware” to build a bi-directional streaming replication and integration architecture between on-premises data centers and multiple cloud providers:

From legacy on premise to hybrid multi cloud at Bayer with Apache Kafka

The Kafka journey of Bayer started on AWS. Afterward, some project teams worked on GCP. In parallel, DevOps and cloud-native technologies modernized the underlying infrastructure. Today, Bayer operates a multi-cloud infrastructure with mature, reliable, and scalable stream processing use cases:

Bayer AG using Apache Kafka for Hybrid Cloud Architecture and Integration

Learn about Bayer’s journey and how they built their hybrid and multi-cloud Enterprise DataHub with Apache Kafka and its ecosystem: Bayer’s Kafka Summit talk.

Data Streaming with Kafka across Hybrid and Multi-cloud Infrastructures

Think about IoT sensor analytics, cybersecurity, patient communication, insurance, research, and many other domains. Real-time data beats slow data in the healthcare supply chain almost everywhere.

This blog post explored the value of data streaming with Apache Kafka to modernize IT infrastructure and build hybrid multi-cloud architectures. Real-world deployments from Optum, Centene, and Bayer showed how enterprises deploy Kafka successfully for different use cases in the enterprise architecture.

How do you leverage data streaming with Apache Kafka in the healthcare industry? What architecture does your platform use? Which products do you combine with data streaming? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Kafka for Real-Time Replication between Edge and Hybrid Cloud
https://www.kai-waehner.de/blog/2022/01/26/kafka-cluster-linking-for-hybrid-replication-between-edge-cloud/ (26 Jan 2022)

Not all workloads should go to the cloud! Low latency, cybersecurity, and cost-efficiency require a suitable combination of edge computing and cloud integration. This blog post explores architectures, design patterns, and software and hardware considerations for deploying hybrid data streaming with Apache Kafka anywhere. A live demo shows data synchronization from the edge to the public cloud across continents with Kafka on Hivecell edge hardware and serverless Confluent Cloud.

Real-Time Edge Computing and Hybrid Cloud with Apache Kafka Confluent and Hivecell

Not every workload should go into the cloud

By now, almost every company has a cloud-first strategy. Nevertheless, not all workloads should be deployed in the public cloud. A few reasons why IT applications still run at the edge or in a local data center:

  • Cost-efficiency: The more data produced at the edge, the more costly it is to transfer everything to the cloud. Transferring high volumes of raw sensor and telemetry data often makes no economic sense.
  • Low latency: Some use cases require data processing and correlation in real-time in milliseconds. Communication with remote locations increases the response time significantly.
  • Bad, unstable internet connection: Some environments do not provide good connectivity to the cloud or are disconnected entirely, either permanently or for parts of the day.
  • Cybersecurity with air-gapped environments: The disconnected Edge is common in safety-critical environments. Controlled data replication is only possible via unidirectional hardware gateways or manual human copy tasks within the site.

Here is a great recent example of why not all workloads should go to the cloud: an AWS outage created enormous issues for visitors to Disney World because the mobile app’s features run online in the cloud. Business continuity is not possible if the connection to the cloud is offline:

AWS Outage at Disney World

The Edge is…

To be clear: The term ‘edge’ needs to be defined at the beginning of every conversation. I define the edge as having the following characteristics and needs:

  • Edge is NOT a data center
  • Offline business continuity
  • Often 100+ locations
  • Low-footprint and low-touch
  • Hybrid integration

Hybrid cloud for Kafka is the norm, not an exception

Multi-cluster and cross-data center deployments of Apache Kafka have become the norm rather than an exception. Several scenarios require multi-cluster solutions. Real-world examples have different requirements and trade-offs, including disaster recovery, aggregation for analytics, cloud migration, mission-critical stretched deployments, and global Kafka.

Global Event Streaming with Apache Kafka Confluent Multi Region Clusters Replication and Confluent Cloud

I posted about this in the past. Check out “architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments“.

Apache Kafka at the Edge

From a Kafka perspective, the edge can mean two things:

  • Kafka clients at the edge connecting directly to the Kafka cluster in a remote data center or public cloud, either via a native client (Java, C++, Python, etc.) or a proxy (MQTT Proxy, HTTP / REST Proxy)
  • Kafka clients AND the Kafka broker(s) deployed at the edge, not just the client applications

Both alternatives are acceptable and have their trade-offs. This post is about the whole Kafka infrastructure at the edge (potentially replicating to another remote Kafka cluster via MirrorMaker, Confluent Replicator, or Cluster Linking).
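
Whichever alternative you choose, the client side is a standard Kafka producer, ideally tuned with batching and compression for constrained edge environments. A minimal sketch, assuming a hypothetical local edge broker and topic:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class EdgeTelemetryProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Local edge broker; replication to the cloud happens separately
        // (MirrorMaker 2, Confluent Replicator, or Cluster Linking)
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "edge-broker:9092"); // hypothetical
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        // Batch and compress to keep bandwidth and footprint small at the edge
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        props.put(ProducerConfig.LINGER_MS_CONFIG, "100");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("sensor-telemetry", "pump-42", "{\"psi\":87.5}"));
        }
    }
}
```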

Check out my Infrastructure Checklist for Apache Kafka at the edge for more details.

I also covered various Apache Kafka use cases for the edge and hybrid cloud across industries like manufacturing, transportation, energy, mining, retail, entertainment, etc.

Hardware for edge computing and analytics

Edge hardware has some specific requirements to be successful in a project:

  • No special equipment for power, air conditioning, or networking
  • No technicians are required on-site to install, configure, or maintain the hardware and software
  • Start with the smallest footprint possible to show ROI
  • Easily add more compute power as workload expands
  • Deploy and operate simple or distributed software for containers, middleware, event streaming, business applications, and machine learning
  • Monitor, manage and upgrade centrally via fleet management and automation, even when behind a firewall

Devon Energy: Edge and Hybrid Cloud with Kafka, Confluent, and Hivecell

Devon Energy (formerly WPX Energy) is a company in the oil & gas industry. The digital transformation creates many opportunities to improve processes and reduce costs in this vertical. The company leverages Confluent Platform on Hivecell edge hardware to realize edge processing and replication to the cloud in real-time at scale.

The solution is designed for real-time decision-making and future closed-loop control optimization. Devon Energy conducts edge stream processing to enable real-time decision-making at the well sites. They also replicate business-relevant data streams produced by machine learning models, as well as analytically preprocessed data, from the well site to the cloud, enabling Devon Energy to harness the full power of its real-time events:

Devon Energy Apache Kafka and Confluent at the Edge with Hivecell and Cluster Linking to the Cloud

A few interesting notes about this hybrid Edge to cloud deployment:

  • Improved drilling and well completion operations
  • Edge stream processing / analytics + closed-loop control ready
  • Vendor agnostic (pumping, wireline, coil, offset wells, drilling operations, producing wells)
  • Replication to the cloud in real-time at scale
  • Cloud agnostic (AWS, GCP, Azure)

Live Demo – How to deploy a Kafka Cluster in production on your desk or anywhere

Confluent and Hivecell delivered on the promise of bringing a piece of Confluent Cloud right to your desk: managed Kafka on a cloud-native Kubernetes cluster at the edge. For the first time, Kafka deployments run at scale at the edge, enabling local Kafka clusters at oil drilling sites, on ships, in factories, or in quick-service restaurants.

In this webinar, we showed how it works during a live demo, where we deployed an edge Confluent cluster, streamed edge data, and synchronized it with Confluent Cloud across regions and even continents:

Hybrid Kafka Edge to Cloud Replication with Confluent and Hivecell

Dominik had his Hivecell cluster at his home in Texas, USA, and mine was in Bavaria, Germany. We synchronized events across continents to a central Confluent Cloud cluster and simulated errors and cloud-native self-healing by “killing” one of my Hivecell nodes in Germany.

The webinar covered the following topics:

  • Edge computing as the next significant paradigm shift in IT
  • Apache Kafka and Confluent use cases at the Edge in IoT environments
  • An easy way of setting up Kafka clusters wherever you need them, including fleet management and error handling
  • Hands-on examples of Kafka cluster deployment and data synchronization

Slides and on-demand video recording

Here are the slides:

And the on-demand video recording:

Video Recording - Kafka, Confluent and Hivecell at the Edge and Hybrid Cloud

Real-time data streaming everywhere requires hybrid edge-to-cloud data streaming!

A cloud-first strategy makes sense. Elastic scaling, agile development, and cost-efficient infrastructure allow innovation. However, not all workloads should go to the cloud for latency, cost, or security reasons.

Apache Kafka can be deployed everywhere. Essential for most projects is the successful deployment and management at the edge, plus uni- or bidirectional real-time synchronization between the edge and the cloud. This post showed how Confluent Cloud, Kafka at the edge on Hivecell hardware, and Cluster Linking enable hybrid streaming data exchanges.

How do you use Apache Kafka? Do you deploy in the public cloud, in your data center, or at the edge outside a data center? How do you process and replicate the data streams? What other technologies do you combine with Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Apache Kafka Landscape for Automotive and Manufacturing
https://www.kai-waehner.de/blog/2022/01/12/apache-kafka-landscape-for-automotive-and-manufacturing/ (12 Jan 2022)

Before the Covid pandemic, I had the pleasure of visiting “Motor City” Detroit in November 2019. I met with several automotive companies, suppliers, startups, and cloud providers to discuss use cases and architectures around Apache Kafka. A lot has happened. Since then, I have also met several OEMs and suppliers in Europe and Asia. As I finally go back to Detroit this January 2022 to meet customers again, I thought it would be a good time to update the status quo of event streaming and Apache Kafka in the automotive and manufacturing industry.

Today, in 2022, Apache Kafka is the central nervous system of many applications in various areas related to the automotive and manufacturing industry for processing analytical and transactional data in motion across edge, hybrid, and multi-cloud deployments. This article explores the automotive event streaming landscape, including connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models.

Automotive and Manufacturing Landscape for Apache Kafka

The Event Streaming Landscape for Automotive and Manufacturing

Every business domain leverages Event Streaming with Apache Kafka in the automotive and manufacturing industry. Data in motion helps everywhere. The infrastructure and deployment differ depending on the use case and requirements. I have seen everything at carmakers and manufacturers across the globe:

  • Cloud-first strategy with all new business applications in the public cloud deployed and connected across regions and even continents
  • Hybrid integration scenarios between legacy applications in the data center and modern cloud-native services in the public cloud
  • Edge computing in a smart factory for low latency, cost-efficient data processing, and cybersecurity
  • Embedded Kafka brokers in machines and vehicles at the disconnected edge

This spread of use cases is impressive. The following diagram depicts a high-level overview:

Automotive and Manufacturing Landscape for Apache Kafka with Edge and Hybrid Cloud

The following sections describe the automotive and manufacturing landscape for event streaming in more detail:

  • Manufacturing 4.0
  • Supply Chain Optimization
  • Mobility Services
  • New Business Models

If you are mainly interested in real-world Kafka deployments with examples from BMW, Porsche, Audi, Tesla, and other OEMs, check out the article “Real-World Deployments of Kafka in the Automotive Industry“.

If you want to understand why Kafka makes such a difference in automotive and manufacturing, check out the article “Apache Kafka in the Automotive Industry“. This article explores the business motivation for these game-changing concepts of data in motion for the digitalization of the automotive industry.

Before you start reading the below section, I want to clearly emphasize that Kafka is not the silver bullet for every problem. “When NOT to use Apache Kafka?” digs deep into this discussion.

I keep the following sections relatively short to give a high-level overview. Each section contains links to more deep-dive articles about the topics.

Manufacturing 4.0

Industrial IoT (IIoT) respectively Industry 4.0 changes how the shop floor and production lines produce goods. Automation, process efficiency, and a much better Overall Equipment Effectiveness (OEE) enable cost reduction and flexibility in the production process:

Manufacturing and Industrial IoT with Apache Kafka

Smart Factory

A smart factory is not necessarily a newly built building like a Tesla Gigafactory. Many enterprises install smart technology like networked sensors for temperature or vibration measurements into old factories. Improving the Overall Equipment Effectiveness (OEE) is the primary goal of most use cases. Many scenarios leverage Kafka for continuously processing sensor and telemetry data in motion:
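
For example, a continuous filter on vibration readings that emits maintenance alerts is a few lines of Kafka Streams. A minimal sketch, assuming hypothetical topics, payloads, and threshold:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class VibrationAlerts {
    // Hypothetical topics: raw vibration amplitudes in, maintenance alerts out
    static void buildTopology(StreamsBuilder builder) {
        KStream<String, Double> vibrations = builder.stream(
            "machine-vibration", Consumed.with(Serdes.String(), Serdes.Double()));

        // Continuously flag readings above a (hypothetical) critical amplitude
        vibrations.filter((machineId, amplitude) -> amplitude > 7.5)
                  .mapValues(amplitude -> "Check machine: amplitude " + amplitude)
                  .to("maintenance-alerts", Produced.with(Serdes.String(), Serdes.String()));
    }
}
```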

Legacy Modernization with Open APIs and Hybrid Cloud

Factories exist for decades after they are built. Digitalization and the modernization of legacy technologies are some of the biggest challenges in IIoT projects. Such an initiative usually includes several tasks:

Continuous Data-driven Engineering and Product Development

Last but not least, an opportunity many people underestimate: Continuous data streaming with Kafka enables new possibilities in software engineering and product development for IoT and automotive projects.

For instance, developing and deploying the “big loop” for machine learning of advanced driver-assistance systems (ADAS) or self-driving functions based on sensor data from the fleet is a new way of software engineering. Tesla’s Kafka-based data platform is a fantastic example. A related use case in engineering is the ingest of sensor data during and after test drives.

Supply Chain Optimization

Supply chain processes and solutions are very complex. The Covid pandemic showed how only flexible enterprises could survive, stay profitable, and provide a great customer experience, even in disastrous external events.

Here are the top 5 critical challenges of supply chains:

  • Time Frames are Shorter
  • Rapid Change
  • Zoo of Technologies and Products
  • Historical Models are No Longer Viable
  • Lack of Visibility

Only real-time data streaming and correlation solve these supply chain challenges end-to-end across regions and companies:

Supply Chain Optimization in Automotive at the Edge and in the Cloud with Apache Kafka

In a detailed blog post, I covered Supply Chain Optimization (SCM) with Apache Kafka. Check it out to learn about real-world supply chain use cases from Bosch, BMW, Walmart, and other companies.

Intra-logistics and Global Distribution Networks

Logistics and supply chains within a factory, distribution center, or store require real-time data integration and processing to provide efficient handling of goods and a great customer experience. Batch processes or manual interaction by human workers cannot implement these use cases. Examples include:

Track & Trace and Fleet Management

Real-time logistics is a game-changer for fleet management and track & trace use cases.

  • Commercial motor vehicles such as cars, vans, trucks, specialist vehicles (such as mobile construction machinery), forklifts, and trailers
  • Private vehicles used for work (the ‘grey fleet’)
  • Aviation machinery such as aircraft (planes and helicopters)
  • Ships
  • Rail cars
  • Non-powered assets such as generators, tanks, gearboxes

All the following aspects are not new. The difference is that event streaming allows these tasks to be executed continuously in real-time to act on new information in motion:

  • Visualization
  • Location-based services
  • Routing and navigation
  • Estimated time of arrival
  • Alerting
  • Proactive recalculation
  • Monitoring of the assets and mechanical components of a vehicle

Most companies have a cloud-first strategy for building such a platform. However, some cases require edge computing, either via a local 5G network for low-latency use cases or via embedded Kafka brokers for disconnected data collection and analytics within the vehicles.

Streaming Data Exchange for B2B Collaboration with Partners

Real-time data is not just relevant within a company. OEMs and Tier 1 and Tier 2 suppliers benefit in the same way from data streams. The same is true for car dealerships, end customers, and any other consumer of the data. Hence, a clear trend in the market is the emergence of a Kafka-based streaming data exchange across companies to build a data mesh.

I have often seen this situation in the past: The OEM leverages event streaming. The Tier 1 supplier leverages event streaming. The ERP solution in use is built on Kafka, too. All leverage the capabilities of scalable real-time data streaming. It makes little sense to integrate with partners and software vendors via web service APIs, such as SOAP or HTTP/REST. Instead, a streaming interface is a natural choice to hand streaming data to partners.

The following example from the automotive industry shows how independent stakeholders (= domains in different enterprises) use a cross-company streaming data exchange:

Streaming Data Exchange with Data Mesh in Motion using Apache Kafka and Cluster Linking

Mobility Services

Every OEM, supplier, or innovative startup in the automotive space thinks about providing a mobility service either on top of the goods they sell or as an independent service.

Most mobility services used today on mobile apps, whether for business or privately, are only possible because of a scalable real-time backbone powered by event streaming:

Mobility Services and Connected Cars with Event Streaming and Apache Kafka

The possibilities for mobility services are endless. A few examples that are mainstream today already:

  • Omnichannel retail and aftersales to buy additional car features online, for instance, more power, seat heater, up-to-date navigation, self-driving software (okay, the latter one is not mainstream yet, but Tesla shows where it goes)
  • Connected Cars for ride-hailing, scooter rental, taxi services, food delivery
  • 3rd party integration for adding services that a company does not want to build by themselves

Today’s most successful and widely adopted mobility services are independent of a specific carmaker or supplier.

Examples of prominent Kafka-powered consumer mobility services are Uber and Lyft in the US, Grab in Asia, and FREENOW in Europe. Here Technologies is an excellent example of a B2B mobility service providing mapping information so that companies can build new or improve existing applications on top of it.

A good starting point to learn more is my blog post about Apache Kafka and MQTT for mobility services and transportation.

New Business Models

The access to real-time data enables companies to build entirely new business models on top of their existing products:

New Automotive Business Models enabled by Event Streaming with Apache Kafka

A few examples:

  • Next-generation car rental with excellent customer experience, context-specific coupons, loyalty platform, and car rental fleets with other services from the carmaker.
  • Reinventing car insurance with driver-specific pricing based on real-time analysis of driving behavior, instead of legacy approaches using statistical models with attributes like driver age or the number of past accidents.
  • Data monetization enables other companies to build new business models with your car data – for instance, working with a government to build a smart city traffic system, or with a mobility service startup to analyze and correlate car data across OEMs.

This evolution is just the beginning of the usage of streaming data. I have seen many customers build a first streaming pipeline for one use case. However, new business divisions will leverage the data for innovations once the platform is there.

The Data is in Motion in Automotive and Manufacturing

The landscape for Apache Kafka in the automotive and manufacturing industry showed that Apache Kafka is the central nervous system of many applications in various areas for processing analytical and transactional data in motion.

This article explored use cases such as connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models. The possibilities for data in motion are almost endless. The automotive and manufacturing industry is still in the very early stages of leveraging data in motion.

Where do you use Apache Kafka and its ecosystem in the automotive and manufacturing industry? Do you deploy in the public cloud, in your data center, or at the edge outside a data center? What other technologies do you combine with Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Streaming Data Exchange with Kafka and a Data Mesh in Motion
https://www.kai-waehner.de/blog/2021/11/14/streaming-data-exchange-data-mesh-apache-kafka-in-motion/ (14 Nov 2021)

Data Mesh is a new architecture paradigm that gets a lot of buzz these days. Every data and platform vendor describes how to build the best Data Mesh with their platform. The Data Mesh story includes cloud providers like AWS, data analytics vendors like Databricks and Snowflake, and Event Streaming solutions like Confluent. This blog post looks deeper into this principle to explore why no single technology is the perfect fit to build a Data Mesh. Examples show why an open and scalable decentralized real-time platform like Apache Kafka is often the heart of the Data Mesh infrastructure, complemented by many other data platforms, to solve business problems.

Streaming Data Exchange with Apache Kafka and Data Mesh in Motion

Data at Rest vs. Data in Motion

Before we get into the Data Mesh discussion, it is crucial to clarify the difference and relevance of Data at Rest and Data in Motion:

  • Data at Rest: Data is ingested and stored in a storage system (database, data warehouse, data lake). Business logic and queries execute against the storage. Everyday use cases include reporting with business intelligence tools, model training in machine learning, and complex batch analytics like map-reduce processing. As the data is at rest, the processing is too late for real-time use cases.
  • Data in Motion: Data is processed and correlated continuously while new events are fed into the platform. Business logic and queries execute in real-time. Common real-time use cases include inventory management, order processing, fraud detection, predictive maintenance, and many other use cases.

Real-time Data Beats Slow Data

Real-time beats slow data in almost all use cases across industries. Hence, ask yourself or your business team how they want or need to consume and process the data in the next project. Data at Rest and Data in Motion have trade-offs. Therefore, both concepts are complementary. For this reason, modern cloud infrastructures leverage both in their architecture. Serverless Event Streaming with Kafka combined with the AWS Lakehouse is a great resource to learn more.

However, while connecting a batch system to a real-time nervous system is possible, the other way round – connecting a real-time consumer to batch storage – is not possible. The Kappa vs. Lambda Architecture discussion gives more insights into this.

Kafka is a database. So, it is also possible to use it for data at rest. For instance, the replayability of historical events in guaranteed ordering is essential and helpful for many use cases. However, long-term storage in Kafka has several limitations, like limited query capabilities. Hence, for many use cases, event streaming and other storage systems are complementary, not competitive.
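
For illustration, replaying history with the standard Java consumer boils down to looking up the offset for a timestamp and seeking there. A minimal sketch, assuming a hypothetical topic:

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "replay-demo");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition partition = new TopicPartition("orders", 0); // hypothetical topic
            consumer.assign(List.of(partition));

            // Find the offset closest to "7 days ago" and rewind there
            long sevenDaysAgo = Instant.now().minus(Duration.ofDays(7)).toEpochMilli();
            OffsetAndTimestamp offset =
                consumer.offsetsForTimes(Map.of(partition, sevenDaysAgo)).get(partition);
            if (offset != null) {
                consumer.seek(partition, offset.offset());
            }
            // Replay historical events in guaranteed per-partition order
            consumer.poll(Duration.ofSeconds(5))
                    .forEach(record -> System.out.println(record.offset() + ": " + record.value()));
        }
    }
}
```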

Data Mesh – An Architecture Paradigm

Data mesh is an implementation pattern (not unlike microservices or domain-driven design) but applied to data. Thoughtworks coined the term. You will find tons of resources on the web. Zhamak Dehghani gave a great introduction about “How to build the Data Mesh Foundation and its Relation to Event Streaming” at the Kafka Summit Europe 2021.

Domain-driven + Microservices + Event Streaming

Data Mesh is not an entirely new paradigm. It has several historical influences:

Data Mesh Architecture with Microservices Domain-driven Design Data Marts and Event Streaming

The architectural paradigm unlocks analytical data at scale, rapidly enabling access to an ever-growing number of distributed domain data sets for a proliferation of consumption scenarios such as machine learning, analytics, or data-intensive applications across the organization. A data mesh addresses the common failure modes of the traditional centralized data lake or data platform architecture.

Data Mesh is a Logical View, not Physical!

Data mesh shifts to a paradigm that draws from modern distributed architecture: considering domains as the first-class concern, applying platform thinking to create a self-serve data infrastructure, treating data as a product, and implementing open standardization to enable an ecosystem of interoperable distributed data products.

Here is an example of a Data Mesh:

Data Mesh with Apache Kafka

TL;DR: Data Mesh combines existing paradigms, including Domain-driven Design, Data Marts, Microservices, and Event Streaming.

Data as the Product

However, the differentiating aspect focuses on product thinking (“Microservice for Data”) with data as a first-class product. Data products are a perfect fit for Event Streaming with Data in Motion to build innovative new real-time use cases.

Data Product - The Domain Driven Microservice for Data

A Data Mesh with Event Streaming

Why is Event Streaming a good fit for data mesh?

Streams are real-time, so you can propagate data throughout the mesh immediately, as soon as new information is available. Streams are also persisted and replayable, so they let you capture both real-time AND historical data with one infrastructure. And because they are immutable, they make for a great source of record, which is helpful for governance.

Data in Motion is crucial for most innovative use cases. And as discussed before, real-time data beats slow data in almost all scenarios. Hence, it makes sense that the heart of a Data Mesh architecture is an Event Streaming platform. It provides true decoupling, scalable real-time data processing, and highly reliable operations across the edge, data center, and multi-cloud.

Kafka Streaming API – The De Facto Standard for Data in Motion

The Kafka API is the de facto standard for Event Streaming. I won’t rehash this discussion here. Here are a few references before we move on to the “Kafka + Data Mesh” content…

A Kafka-powered Data Mesh

I highly recommend watching Ben Stopford’s and Michael Noll’s talk about “Apache Kafka and the Data Mesh“. Several of the screenshots in this post are from that presentation, too. Kudos to my two colleagues! The talk explores the key concepts of a Data Mesh and how they are related to Event Streaming:

  • Domain-driven Decentralization
  • Data as a Self-serve Product
  • First-class Data Platform
  • Federated Governance

Let’s now explore how Event Streaming with Kafka fits into the Data Mesh architecture and how other solutions like a database or data lake complement it.

Data Exchange for Input and Output within a Data Mesh using Kafka

Data product, a “microservice for the data world”:

  • A node on the data mesh, situated within a domain.
  • Produces and possibly consumes high-quality data within the mesh.
  • Encapsulates all the elements required for its function, namely data plus code plus infrastructure.

A Data Mesh is not just one Technology!

The heart of a Data Mesh infrastructure must be real-time, decoupled, reliable, and scalable. Kafka is a modern cloud-native enterprise integration platform (also often called iPaaS today). Therefore, Kafka provides all the capabilities for the foundation of a Data Mesh.

However, not all components can or should be Kafka-based. Choose the right tool for a problem. Let’s explore in the following subsections how Kafka-native technologies and other solutions are used in a Data Mesh together.

Stream Processing within the Data Product with Kafka Streams and ksqlDB

An event-based data product aggregates and correlates information from one or more data sources in real-time. Stateless and stateful stream processing is implemented with Kafka-native tools such as Kafka Streams or ksqlDB:

Event Streaming within the Data Product with Stream Processing Kafka Streams and ksqlDB
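
As a minimal sketch of such a stateful data product, the following Kafka Streams topology maintains a running total per customer in a named state store and publishes it as an output stream; topic, store, and domain names are hypothetical:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.state.KeyValueStore;

public class PaymentsDataProduct {
    // Hypothetical data product: running total of payment amounts per customer
    static void buildTopology(StreamsBuilder builder) {
        KTable<String, Double> totals = builder
            .stream("payments", Consumed.with(Serdes.String(), Serdes.Double()))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
            .aggregate(
                () -> 0.0,
                (customerId, amount, total) -> total + amount,
                Materialized.<String, Double, KeyValueStore<Bytes, byte[]>>as("payment-totals")
                    .withValueSerde(Serdes.Double()));

        // Output port of the data product: a stream other domains can consume
        totals.toStream().to("customer-payment-totals",
            Produced.with(Serdes.String(), Serdes.Double()));
    }
}
```

The named state store could additionally serve pull queries via Kafka Streams interactive queries, which connects nicely to the request/response access patterns discussed next.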

Variety of Protocols and Communication Paradigms within the Data Product – HTTP, gRPC, MQTT, and more

Obviously, not every application uses just Event Streaming as a technology and communication paradigm. The above picture shows how one consumer application could also use a request/response technology like HTTP or gRPC for pull queries. In contrast, another application continuously consumes the streaming push query with a native Kafka consumer in any programming language, such as Java, Scala, C, C++, Python, Go, etc.

The data product often includes complementary technologies. For instance, if you built a connected car infrastructure, you likely use MQTT for the last-mile integration, ingest the data into Kafka, and further processing with Event Streaming. The “Kafka + MQTT Blog Series” is an excellent example from the IoT space to learn about building a data product with complementary technologies.
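
A hedged sketch of such a last-mile bridge, using the Eclipse Paho MQTT client to forward vehicle telemetry into Kafka; broker addresses and topics are hypothetical, and in production a Kafka Connect MQTT connector or Confluent’s MQTT Proxy would usually replace this hand-rolled glue code:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.eclipse.paho.client.mqttv3.MqttClient;

import java.util.Properties;

public class MqttToKafkaBridge {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical Kafka cluster
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // Subscribe to the MQTT last mile and forward every message into Kafka
        MqttClient mqtt = new MqttClient("tcp://mqtt-broker:1883", "car-bridge"); // hypothetical
        mqtt.connect();
        mqtt.subscribe("cars/+/telemetry", (topic, message) ->
            producer.send(new ProducerRecord<>("car-telemetry", topic,
                new String(message.getPayload()))));
    }
}
```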

Variety of Solutions within the Data Product – Event Streaming, Data Warehouse, Data Lake, and more

The beauty of microservice architectures is that every application can choose the right technologies. An application might or might not include databases, analytics tools, or other complementary components. The input and output data ports of the data product should be independent of the chosen solutions:

Data Stores within the Data Product with Snowflake MongoDB Oracle et al

Kafka Connect is the right Kafka-native technology to connect other technologies and communication paradigms with the Event Streaming platform. Evaluate if you need another integration middleware (like an ETL or ESB) or if the Kafka infrastructure is the better enterprise integration platform (iPaaS) for your data product within the data mesh.
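
For example, pulling a legacy database into the data product is usually a few lines of connector configuration rather than custom code. A hedged sketch for Confluent’s JDBC source connector, expressed as a Java map (the equivalent JSON goes to the Connect REST API); connection details, table, and topic prefix are hypothetical:

```java
import java.util.Map;

public class JdbcSourceConfig {
    // Hypothetical settings for a JDBC source connector feeding the data product
    static final Map<String, String> CONFIG = Map.of(
        "connector.class", "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url", "jdbc:postgresql://legacy-db:5432/crm", // hypothetical database
        "connection.user", "connect",
        "connection.password", "secret",
        "mode", "incrementing",                  // capture new rows by an incrementing ID
        "incrementing.column.name", "id",
        "table.whitelist", "customers",
        "topic.prefix", "crm-"                   // rows land in the topic "crm-customers"
    );
}
```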

A Global Streaming Data Exchange

The Data Mesh concept is relevant for global deployments, not just within a single project or region. Multiple Kafka clusters are the norm, not an exception. I wrote about customers using Event Streaming with Kafka in global architectures a long time ago.

Various architectures exist to deploy Kafka across data centers and multiple clouds. Some use cases require low latency and deploy some Kafka instances at the edge or in a 5G zone. Other use cases replicate data between regions, countries, or continents across the globe for disaster recovery, aggregation, or analytics use cases.

Here is one example spanning a streaming Data Mesh across multiple cloud providers like AWS, Azure, GCP, or Alibaba, and on-premise / edge sites:

Hybrid Cloud Streaming Data Mesh powered by Apache Kafka and Cluster Linking

This example shows all the characteristics discussed in the above sections for a Data Mesh:

  • Decentralized real-time infrastructure across domains and infrastructures
  • True decoupling between domains within and between the clouds
  • Several communication paradigms, including data streaming, RPC, and batch
  • Data integration with legacy and cloud-native technologies
  • Continuous stream processing where it adds value, and batch processing in some analytics sinks

Example: A Streaming Data Exchange across Domains in the Automotive Industry

The following example from the automotive industry shows how independent stakeholders (= domains in different enterprises) use a cross-company streaming data exchange:

Streaming Data Exchange with Data Mesh in Motion using Apache Kafka and Cluster Linking

Innovation does not stop at a company’s borders. Streaming replication is relevant for all use cases where real-time is better than slow data (valid for most scenarios). A few examples:

  • End-to-end supply chain optimization from suppliers to the OEM to the intermediary to the aftersales
  • Track and trace across countries
  • Integration of 3rd party add-on services to the own digital product
  • Open APIs for embedding and combining external services to build a new product

I could go on and on with the list. Many data products need to be accessible by 3rd parties in real-time at scale. An API gateway or API management tool comes into play in such a situation.

A real-world example of a streaming data exchange powered by Kafka is the mobility service Here Technologies. They expose the Kafka API to directly consume streaming data from their mapping services (as an alternative option to their HTTP API):

Here Technologies Mobility Service Apache Kafka Open API
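
Consuming such an externally exposed Kafka API looks like any other Kafka consumer, plus authentication. A minimal sketch with SASL/PLAIN over TLS; the endpoint, credentials, and topic are hypothetical and do not reflect Here’s actual API:

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ExternalStreamConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka.provider.example:9093"); // hypothetical endpoint
        props.put("group.id", "mapping-consumer");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Authenticate against the provider over TLS with SASL/PLAIN credentials
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
            + "username=\"apiKey\" password=\"apiSecret\";"); // hypothetical credentials

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("traffic-events")); // hypothetical topic
            while (true) {
                consumer.poll(Duration.ofSeconds(1))
                        .forEach(record -> System.out.println(record.value()));
            }
        }
    }
}
```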

However, even if all collaborating partners use Kafka under the hood in their architecture, exposing the Kafka API directly to the outside world does not always make sense. Some technical capabilities (e.g., access control or connectivity to thousands of devices) and missing business functions (e.g., for monetization or reporting) of the Kafka ecosystem bring an API layer on top of the Event Streaming infrastructure into play in many real-world deployments.

Open API for 3rd Party Integration and Streaming API Management

API Gateways and API Management tools exist in many varieties, including open-source frameworks, commercial products, and SaaS cloud offerings. Features include technical routing, access control, monetization, and reporting.

However, most people still implement the Open API concept with RPC in mind. I guess 95+% still use HTTP(S) to make APIs accessible to other stakeholders (e.g., other business units or external parties). RPC makes little sense in a streaming Data Mesh architecture if the data needs to be processed at scale in real-time.

There is still an impedance mismatch between Event Streaming and API Management. But it gets better these days. Specifications like AsyncAPI, calling itself the “industry standard for defining asynchronous APIs”, and similar approaches bring Open API to the data streaming world. My post “Kafka versus API Management with tools like MuleSoft, Kong, or Apigee” is still pretty much accurate if you want to dive deeper into this discussion. IBM API Connect was one of the first vendors that integrated Kafka via Async API.

A great example of the evolution from RPC to streaming APIs is the machine learning space. “Streaming Machine Learning with Kafka-native Model Deployment” explores how model servers such as Seldon enhance their product with a native Kafka API besides HTTP and gRPC request-response communication:

Kafka-native Machine Learning Model Server Seldon

Journey to the Streaming Data Mesh with Kafka

The paradigm shift is enormous. Data Mesh is not a free lunch. The same was and still is true for microservice architectures, domain-driven design, Event Streaming, and other modern design principles.

In analogy to Confluent’s maturity model for Event Streaming, our team has described the journey for deploying a streaming Data Mesh:

Data Mesh Journey with Event Streaming and Apache Kafka

The efforts likely take a few years in most scenarios. The shift is not just about technologies; adjustments to organizations and business processes are just as necessary. I guess most companies are still in a very early stage. Let me know where you are on this journey!

Streaming Data Exchange as Foundation for a Data Mesh

A Data Mesh is an implementation pattern, not a specific technology. However, most modern enterprise architectures require a decentralized streaming data infrastructure to build valuable and innovative data products in independent, truly decoupled domains. Hence, Kafka, being the de facto standard for Event Streaming, comes into play in many Data Mesh architectures.

Many Data Mesh architectures span many domains in various regions or even continents. The deployments run at the edge, on-prem, and multi-cloud. The integration connects many solutions and technologies with different communication paradigms.

A cloud-native Event Streaming infrastructure with the capability to link clusters with each other out-of-the-box enables building a modern Data Mesh. No Data Mesh will use just one technology or vendor. Learn from the inspiring posts from your favorite data product vendors like AWS, Snowflake, Databricks, Confluent, and many more to define and build your custom Data Mesh successfully. Data Mesh is a journey, not a big bang.

Did you already start building your Data Mesh? How does the enterprise architecture look like? What frameworks, products, and cloud services do you use? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Is Apache Kafka an iPaaS or is Event Streaming its own Software Category?
https://www.kai-waehner.de/blog/2021/11/03/apache-kafka-cloud-native-ipaas-versus-mq-etl-esb-middleware/ (03 Nov 2021)

Enterprise integration is more challenging than ever before. The IT evolution requires the integration of more and more technologies. Applications are deployed across edge, hybrid, and multi-cloud architectures. Traditional middleware such as MQ, ETL, and ESB tools does not scale well enough or only processes data in batches instead of real-time. This post explores why Apache Kafka is the new black for integration projects, how Kafka fits into the discussion around cloud-native iPaaS solutions, and why event streaming is a new software category. A concrete real-world example shows the difference between event streaming and traditional integration platforms or iPaaS offerings.

Apache Kafka as iPaaS or Event Streaming as new Software Category

What is iPaaS (Enterprise Integration Platform as a Service)?

iPaaS (Enterprise Integration Platform as a Service) is a term coined by Gartner. Here is the official Gartner definition: “Integration Platform as a Service (iPaaS) is a suite of cloud services enabling development, execution, and governance of integration flows connecting any combination of on-premises and cloud-based processes, services, applications, and data within individual or across multiple organizations.” The acronym eiPaaS (Enterprise Integration Platform as a Service) is used in some reports as a replacement for iPaaS.

The Gartner Magic Quadrant for iPaaS shows various vendors:

Gartner Magic Quadrant for iPaaS 2021

Three points stand out for me:

  • Many very different vendors provide a broad spectrum of integration solutions.
  • Many vendors (have to) list various products to provide an iPaaS; this means different technologies, codebases, support teams, etc.
  • No Kafka offering (like Confluent, Cloudera, Amazon MSK) is in the magic quadrant.

The last bullet point makes me wonder: should Kafka-based solutions be considered an iPaaS or not?

Is Apache Kafka an iPaaS?

I don’t know. It depends on the definition of the term “iPaaS”. Yes, Kafka solutions fit into the iPaaS category, but that is just one piece of the event streaming success story.

Kafka is an event streaming platform. Use cases differ from traditional middleware like MQ, ETL, ESB, or iPaaS. Check out real-world Kafka deployments across industries if you don’t know the use cases yet.

Kafka does not directly compete with ETL tools like Talend or Informatica, MQ frameworks like IBM MQ or RabbitMQ, API Management platforms like MuleSoft, and cloud-based iPaaS like Boomi or TIBCO. At least not if people understand the differences between Kafka and traditional middleware. For that reason, many people (including me) think that Event Streaming should have its own Magic Quadrant.

Having said this, all these very different vendors are in the iPaaS Magic Quadrant. So, should Kafka respectively its vendors be in here? I think so because I have seen hundreds of customers leverage the Kafka ecosystem as a cloud-native, scalable, event-driven integration platform, often in hybrid and multi-cloud architectures. And that’s an iPaaS.

What’s different with Kafka as an Integration Platform?

If you are new to this discussion, check out the article “Apache Kafka vs. MQ, ETL, ESB” or the related slides and video. Here is my summary on why Kafka is unique for integration scenarios and therefore adopted everywhere:

Why Kafka as iPaaS instead of Traditional Middleware like MQ ETL ESB

A unique selling point for event streaming is the ability to leverage a single platform. In contrast, other iPaaS solutions require different products (including codebases, support teams, integration between the additional tech, etc.).

Kafka as Cloud-native and Serverless iPaaS

Fundamental differences exist between modern iPaaS solutions and traditional legacy middleware; this includes the software architecture, the scalability and operations of the platform, and data processing capabilities. On a high level, a “Kafka iPaaS” requires the following characteristics:

  • Cloud-native Infrastructure: Elasticity is vital for success in the modern IT world. The ability to scale up and down (= expanding and shrinking Kafka clusters) is mandatory. This capability enables starting with a small pilot project and scaling up or handling load spikes (like Christmas business in retail).
  • Automated Operations: Truly serverless SaaS should always be the preferred option if the software runs in a public cloud. Orchestration tools like Kubernetes and related tools (like a Kafka operator for Kubernetes) are the next best option in a self-managed data center or at the edge outside a data center.
  • Complete Platform: An integration platform requires real-time messaging and storage for backpressure handling and long-running processes. Data integration and continuous data processing are mandatory, too. Hence, a “Kafka iPaaS” is only possible if you have access to various pre-built Kafka-native connectors to open standards, legacy systems, and modern SaaS interfaces. Otherwise, Kafka needs to be combined with another middleware like an iPaaS or an ETL tool like Apache Nifi.
  • Single Solution: It sounds trivial, but most other middleware solutions use several codebases and products under the hood. Just look at stacks from traditional players such as IBM and Oracle or open-source-driven Cloudera. The complex software stack makes it much harder to provide end-to-end integration, 24/7 operations, and mission-critical support. Don’t get me wrong: Kafka-native solutions like Confluent Cloud also include different products with additional pricing (like a fully-managed connector or data governance add-ons), but all run on a single Kafka-native platform.

From this perspective, some Kafka solutions are a modern, cloud-native, scalable iPaaS. Having said this, would I be happy if you considered some Kafka solutions as just an iPaaS on your technology radar? No, not really!

Event Streaming is its own Software Category!

While some Kafka solutions can be used as an iPaaS, this is only one of many usage scenarios for event streaming. However, as explained above, Kafka-based solutions differ greatly from other iPaaS solutions in the Gartner Magic Quadrant. Hence, event streaming deserves its own software category.

If you still wonder what I mean, check out event streaming use cases across industries to understand the difference between Kafka and traditional iPaaS, MQ, ETL, ESB, API tools. Here is a relatively old but still fantastic diagram that summarizes the broad spectrum of use cases for event streaming:

Use Cases for Apache Kafka and Event Streaming

TL;DR: Kafka provides capabilities for various use cases, not just integration scenarios. Many new innovative business applications are built with Kafka. It is not just an integration platform but a unique suite of capabilities for end-to-end data processing in real-time at scale.

New concepts like Data Mesh also prove the evolution. The basic principles are not unique: domain-driven design, microservices, true decoupling of services, but now with much more focus on data as a product. The latter means data turns from a cost center into a profit center that enables innovative new services. Event streaming is a critical component of a data mesh-powered enterprise architecture, as real-time almost always beats slow data across use cases.

The Non-Existing Event Streaming Gartner Magic Quadrant or Forrester Wave

Unfortunately, a Gartner Magic Quadrant or Forrester Wave for Event Streaming does NOT exist today. While some event streaming solutions fit into some of these reports (like the Gartner iPaaS Magic Quadrant or the Forrester Wave for Streaming Analytics), it is still an apples-to-oranges comparison.

Event Streaming is its own category. Many software vendors built their entire business around this category. Confluent is the leader in this space – note that I am biased as a Confluent employee, but I guess there is no question around this statement 🙂 Many other companies have emerged around Kafka, around the Kafka protocol, or around competitive event streaming offerings such as Amazon Kinesis or Apache Pulsar.

The following Event Streaming Landscape 2021 summarizes the current status:

Event Streaming Landscape with Apache Kafka Pulsar Kinesis Event Hubs

I hope to see a Gartner Magic Quadrant for Event Streaming and a Forrester Wave for Event Streaming soon, too.

Open Source vs. Partially Managed vs. Fully-Managed Event Streaming

One more aspect to point out: You might have noticed that I said, “some event streaming solutions can be considered an iPaaS”. The word “some” is a crucial detail. Just providing an open-source framework is not enough.

iPaaS requires a complete offering, ideally as fully-managed services. Many vendors for event streaming use Kafka, Pulsar, or another framework but do NOT provide a complete offering with operations tooling, commercial 24/7 support, user interfaces, etc. The following resources should help you learn more about the event streaming landscape in 2021:

TL;DR: Evaluate the various offerings. A lot of capabilities are just marketing! Many “fully-managed services” are only partially managed instead of serverless, and come with very limited SLAs and support. Some other offerings provide plenty of cool features but are more an alpha version and overselling than a mature, battle-tested solution. A counterexample is the complexity in T-Mobile’s report about upgrading Amazon MSK. This shows the difference between “promoting and selling a fully managed service” and the “not at all fully-managed reality”. A truly fully-managed offering does NOT require the end user to upgrade the infrastructure.

Kafka as Event Streaming iPaaS at Deutsche Bahn (German Railway)

Let’s now look at a practicable example to understand why a traditional iPaaS cannot help in use cases that require event streaming and why this combination of capabilities in a single technology sets a new software category.

This section explores a real-world use case with the journey of Deutsche Bahn (German Railway) providing a great customer experience to their customers. This example uses Event Streaming as iPaaS (regarding my definition of these terms).

Use Case: Improve Customer Experience with Real-time Notifications

The use case sounds very simple: Improve the customer experience by providing real-time information and notifications to customers across various channels like websites, smartphones, and displays at train stations.

Delays and cancellations happen in a complex rail network like Germany’s. Frequent travelers accept this downside. Nobody can do anything about lousy weather, suicides on the tracks, or technical defects.

However, customers at least expect real-time information and notifications so that they can wait in a coffee shop or lounge instead of freezing at the station platform for minutes or even hours. The reality at Deutsche Bahn was a different sequence of notifications: 10min delay, 20min delay, 30min delay, train canceled – take the next train.

The goal of the project Reisendeninformation (= traveler information system) was to improve the day-to-day experience of millions of travelers across Germany by delivering up-to-date, accurate, and consistent travel information in any location.

Initial Project: A mess of different Integration Technologies

Again, the use case sounds simple. Let’s send real-time notifications to customers if a train is delayed or canceled. Every magic black iPaaS box can do this:

Deutsche Bahn Reisendeninformation Kafka

Is this true or just the marketing of all the integration vendors? Can each black box integrate all the different systems to correlate events in real-time?

Deutsche Bahn started with an open-source messaging platform to send real-time notifications. Unfortunately, the team quickly found out that not every piece of information arrived in real-time. So, a caching and storage system was added to the project to handle the late-arriving information from some legacy batch or file-based systems. Next, the data from the legacy systems needed to be integrated. Hence, an integration framework was installed to connect to files, databases, and other applications. Finally, the data needed to be processed, correlating real-time and non-real-time data from the various systems. A stream processing engine can do this.

The pilot project included several different frameworks. A few skeptical questions came up:

  • How to scale this combination of very different technologies?
  • How to get end-to-end support across various platforms?
  • Is this cost-efficient?
  • What is the time-to-market for new features?

Deutsche Bahn re-evaluated their tech stack and found out that Apache Kafka provides all the required capabilities out-of-the-box within one platform.

The Migration to Cloud-native Kafka

The team at Deutsche Bahn re-architected their pilot project. Here is the new solution leveraging Kafka as the single point of truth between various systems, technologies, and communication paradigms:

A traditional iPaaS can implement this scenario, too – but only with several codebases, technologies, and clusters, even if you select a single software vendor! Some iPaaS might even do well in the beginning but struggle to scale up. Only event streaming allows you to start small and scale up without the need to re-architect the infrastructure.

Today, the project is in production. Check the DB Navigator mobile app to get real-time updates about all trains in Germany. Living in Germany, I appreciate this new service and the much better traveling experience.

Learn more about the project from the Deutsche Bahn team. They gave several public talks at different conferences and wrote on the Confluent Blog about their Kafka journey. However, the journey did not end there 🙂 As described in their blog post, Deutsche Bahn is now evaluating the migration from a self-managed Confluent Platform deployment in the public cloud to the fully-managed, truly serverless Confluent Cloud offering to reduce TCO and improve time-to-market.

Complete Platform: Kafka is more than just Messaging

A project like the one described above is only possible with a complete platform. Many people still think of Kafka as an ingestion layer into a data lake or data warehouse, as this was one of the first prominent Kafka use cases. Data ingestion is still an excellent use case today, and many projects already use more than just the core of Kafka to implement it. Kafka Connect provides out-of-the-box connectivity between Kafka and the data store. If you are in the public cloud, you even get this integration in a fully-managed, serverless manner, whether you need to connect to a 1st party cloud service like Amazon S3, Google Cloud BigQuery, or Azure Cosmos DB, or to 3rd party SaaS like MongoDB Atlas, Snowflake, or Databricks.

Continuous Kafka-native stream processing is the next level of a complete platform. For instance, Deutsche Bahn leverages Kafka Streams a lot for its real-time data correlations at scale. Other companies use ksqlDB as a fully-managed function in Confluent Cloud. The enormous benefit is that you don’t need yet another platform or service for streaming analytics. A complete platform makes the architecture more cost-efficient, and end-to-end integration is easier from an SLA and support perspective.
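To make this more tangible, here is a minimal Kafka Streams sketch of such a real-time correlation: a stream of delay events joined against a continuously updated table of train metadata to produce enriched notification events. All topic names and the payload format are invented for illustration; Deutsche Bahn’s actual implementation is not public.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class TravelerNotificationApp {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Real-time delay events, keyed by train number (hypothetical topic).
        KStream<String, String> delays = builder.stream("train-delay-events");

        // Train metadata (route, next stops) as a continuously updated table.
        KTable<String, String> trains = builder.table("train-metadata");

        // Correlate each delay event with the latest metadata of that train
        // and emit an enriched notification event for apps and displays.
        delays.join(trains, (delay, metadata) ->
                "{\"delay\": " + delay + ", \"train\": " + metadata + "}")
              .to("traveler-notifications");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "traveler-information");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        new KafkaStreams(builder.build(), props).start();
    }
}
```

The point is that the join, the storage backing it, and the delivery to downstream channels all live in one codebase and one cluster – no separate messaging, caching, integration, and processing products.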

A complete platform requires many additional services “on top”, like visualization of data flows, data governance, role-based access control, audit logs, self-service capabilities, user interfaces, visual coding/code generation tools, etc. Visual coding is one area where traditional middleware and iPaaS tools are still stronger today than event streaming offerings.

3rd Party integration via Open API and non-Kafka Tools

So far, you learned why event streaming is its own software category and why Deutsche Bahn is an excellent example of it. However, event streaming is NOT the silver bullet for every problem! When exploring whether Kafka and MQ/ETL/ESB are friends, enemies, or frenemies, I already pointed this out. For instance, MQ or an ESB can complement event streaming in an integration project, depending on your project requirements.

Let’s go back to Deutsche Bahn. As mentioned, their real-time traveler information platform is live, with Kafka as the single point of truth. Recently, Deutsche Bahn announced a partnership with Google and 3rd Party Integration with Google Maps:

Deutsche Bahn Google Maps Open API Integration

Real-time Schedule Updates to 3rd Party Google Maps API

The integration provides real-time train schedule updates to Google Maps users:

Google Maps Deutsche Bahn Real-time Integration

The integration allows Deutsche Bahn to reach new people and expand its business. Users can buy train tickets with one click from the Google Maps page.

I don’t know what technology or product this 3rd party integration uses. The point is that Deutsche Bahn’s real-time infrastructure enables innovative new business models and collaboration with partners.

Likely, this integration between Deutsche Bahn and Google Maps does not directly use the Kafka protocol (even though this is done sometimes, for instance, see Here Technologies Open API for their mapping service).

Event Streaming is complementary to other services. In this example, the project team might have used an API Management platform to provide internal APIs to external consumers, including access control, billing, and reporting. The article “Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies?” explores the relationship between event streaming and API Management.

Event Streaming Everywhere – Cloud, Data Center, Edge

Real-time beats slow data everywhere. This is a new software category because we don’t just send events into another database via a messaging system. Instead, we use and correlate data from different data sources in real-time. That’s the real added value and game changer in innovative projects.

Hence, event streaming must be possible in every location. While cloud-first is a viable strategy for many IT projects, edge and hybrid scenarios are and will continue to be very relevant.

Think about a project related to the Deutsche Bahn example above (but completely fictional): a hybrid architecture with real-time applications in the cloud and edge computing within the trains:

Hybrid Architecture with Kafka at the edge and in the cloud

I covered this in other articles, including “Edge Use Cases for Apache Kafka Across Industries“. TL;DR: Leverage the open architecture of event streaming for real-time data processing everywhere, including multi-cloud, data centers, and edge deployments (i.e., outside a data center). The enterprise architecture does not need separate iPaaS, ETL tools, ESBs, and MQ systems to implement real-time data processing and integration.

However, once again, it is crucial to understand how event streaming fits into the enterprise architecture. For instance, Kafka is often combined with IoT technologies such as MQTT for the last mile integration with IoT devices in these edge scenarios.
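As a rough illustration of such a last-mile integration, the following sketch bridges MQTT messages into Kafka with the Eclipse Paho client and a plain Kafka producer. Broker addresses and topic names are made up; in a real project, you would more likely use Kafka Connect or an MQTT broker with a native Kafka bridge instead of hand-rolled glue code.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.eclipse.paho.client.mqttv3.MqttClient;

public class MqttToKafkaBridge {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // Subscribe to all device telemetry (hypothetical MQTT topic hierarchy)
        // and forward each message into a Kafka topic, keyed by the MQTT topic.
        // The bridge runs indefinitely, so the producer is never closed here.
        MqttClient mqtt = new MqttClient("tcp://localhost:1883", "kafka-bridge");
        mqtt.connect();
        mqtt.subscribe("vehicles/+/telemetry", (topic, message) ->
            producer.send(new ProducerRecord<>("vehicle-telemetry",
                topic, new String(message.getPayload()))));
    }
}
```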

Slide Deck and Video for “Apache Kafka vs. Cloud-native iPaaS Middleware”

Here are the related slides for this topic:

And the video recording of this presentation:

Kafka is a cloud-native iPaaS, and much more!

Kafka is the new black for integration projects across industries because of its unique combination of capabilities. Some Kafka solutions are part of the iPaaS category, with trade-offs like any other integration platform.

However, event streaming is its own software category. Hence, iPaaS is just one usage of Kafka or similar event streaming platforms. Real-time data beats slow data. For that reason, event streaming is the backbone of many projects to process data in motion (but also to integrate with other systems that store data at rest for reporting, model training, and other use cases).

How do you leverage event streaming and Kafka as an integration platform? What projects did you already work on or are planning? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Is Apache Kafka an iPaaS or is Event Streaming its own Software Category? appeared first on Kai Waehner.

Condition Monitoring and Predictive Maintenance with Apache Kafka https://www.kai-waehner.de/blog/2021/10/25/apache-kafka-condition-monitoring-predictive-maintenance-industrial-iot-digital-twin/ Mon, 25 Oct 2021 14:46:26 +0000 https://www.kai-waehner.de/?p=3888 The manufacturing industry is moving away from just selling machinery, devices, and other hardware. Software and services increase revenue and margins. Equipment-as-a-Service (EaaS) even outsources the maintenance to the vendor. This paradigm shift is only possible with reliable and scalable real-time data processing leveraging an event streaming platform such as Apache Kafka. This post explores how Kafka-native Condition Monitoring and Predictive Maintenance help with this innovation.

The post Condition Monitoring and Predictive Maintenance with Apache Kafka appeared first on Kai Waehner.

The manufacturing industry is moving away from just selling machinery, devices, and other hardware. Software and services increase revenue and margins. A former cost center becomes a profit center for innovation. Equipment-as-a-Service (EaaS) even outsources the maintenance to the vendor. This paradigm shift is only possible with reliable and scalable real-time data processing leveraging an event streaming platform such as Apache Kafka. This post explores how the next generation of software for Condition Monitoring and Predictive Maintenance can help build new innovative products and improve the OEE for customers.

Apache Kafka for Condition Monitoring and Predictive Maintenance in Industrial IoT

Condition Monitoring and Predictive Maintenance

Let’s define the two terms first, as no standard definition exists. Some literature sees condition monitoring as a major component of predictive maintenance. However, others interpret the latter as a more modern software approach leveraging machine learning. Both terms are sometimes used as synonyms, too.

Modern Maintenance Strategies and Goals

The main goal of modern maintenance strategies is a more efficient and optimized usage of machines and resources. Reactive maintenance and time-based/usage-based preventive measures are suboptimal. Therefore, modern condition-based maintenance strategies take over.

Industrial IoT / Industry 4.0 enables several benefits at the shop floor level:

  • Maintain instead of repair
  • No (un)planned downtime
  • Maintenance optimizations and no unnecessary work
  • No negative financial impact
  • Optimized productivity
  • Improved overall equipment effectiveness (OEE)
  • Move from an isolated to a company-wide view

The machine operator is interested in the following questions:

  • Is the machine running normally? (Detect anomalies, classify errors)
  • How long can the engine still run? (Remaining useful life – RUL, time to the first failure)
  • Why does the machine run abnormally? (Sensor monitoring, root cause analysis)

Defining Condition Monitoring and Predictive Maintenance

Condition Monitoring is the process of monitoring a parameter of condition in machinery (vibration, temperature, etc.) to identify a significant change indicative of a developing fault. It is a substantial component of predictive maintenance. Condition monitoring allows scheduling maintenance, or taking other actions, to prevent consequential damage. It has a unique benefit: It addresses conditions that shorten the expected lifespan before they develop into a major failure.

Predictive maintenance techniques help determine the condition of in-service equipment to estimate when maintenance is necessary. The central promise of predictive maintenance is to allow convenient scheduling of corrective maintenance and prevent unexpected equipment failures.

TL;DR: Both approaches promise cost savings over routine or time-based preventive maintenance because maintenance tasks only are performed when warranted. However, modern maintenance means digitalization. That does not come for free.

Condition monitoring and predictive maintenance only work well if the infrastructure and software are reliable, scalable, and real-time. The main trade-off is a reasonable risk and costs analysis to plan the total cost of ownership (TCO) and return on investment (ROI).

Equipment as a Service (EaaS) as new Business Model

Equipment-as-a-Service (EaaS) is a business model that involves renting out equipment to end-users and collecting periodic subscription payments for using the equipment.

This service-driven business model, also known as Machine-as-a-Service, provides a variety of benefits to both sides:

  • The EaaS provider (OEMs and machine builders) can improve the product design (R&D, digital twin, etc.), plan recurring revenue, and provide predictive maintenance services.
  • The customer (manufacturers) can optimize machine utilization and productivity (with the help of the EaaS software) and reduce the overall cost (moving Capital Expenditures (CapEx) to Operating Expenses (OpEx) and reducing operations costs).

EaaS is only a successful business model if condition monitoring and predictive maintenance are stable 24/7 and continuously collect, process, and analyze incoming data streams.

Apache Kafka for Industrial IoT / Industry 4.0

Apache Kafka is the de facto standard for event streaming. Industrial IoT / Industry 4.0 projects across the globe use event streaming in edge and hybrid cloud deployments. Here is an example of a smart factory architecture that combines event streaming in the public cloud, factories, and at the edge:

Hybrid Edge to Cloud Architecture for Low Latency with 5G Kafka and AWS Wavelength

Kafka is an information technology (IT). It collects data from operational technology (OT) devices and machines at the edge. Kafka is soft real-time and not suitable for hard real-time embedded systems or robotics. If you wonder about the relationship, read the post “Apache Kafka is NOT Hard Real-Time BUT Used Everywhere in Automotive and Industrial IoT“.

Nevertheless, Kafka is suitable for mission-critical low-latency use cases such as condition monitoring and predictive maintenance, where the end-to-end latency is a few milliseconds. Here is an example leveraging 5G together with Kafka and ksqlDB on Kubernetes:
Low Latency 5G Use Cases with AWS Wavelength based on AWS Outposts and Confluent

Data in Motion with Event Streaming and Stream Processing

Condition monitoring and predictive maintenance require an event-based architecture to collect, process, and analyze data in motion. Traditional IIoT platforms are proprietary, inflexible, often not scalable, and hard to integrate across different vendors and various standards. On the contrary, Kafka-native stream processing is an open, flexible, and scalable technology to implement data integration and processing across IoT interfaces.

Let’s look at two examples: stateless condition monitoring with Kafka Streams and predictive maintenance with ksqlDB and TensorFlow. To be clear: These are just examples. Any other technology can be integrated (with its pros and cons), like Apache Flink for stream processing, cloud-based ML platforms, proprietary IoT edge platforms for the last-mile integration, etc.

Here is the basic setup to build condition monitoring and predictive maintenance with Kafka:
Sensor Events from Machines PLCs Scada IoT

On the left side, we see the Kafka log that stores and forwards events. On the right side, various machines produce sensor data that is ingested in real-time. This architecture works at any scale and in real-time. Some Confluent customers leverage Confluent Cloud to process 10GB and more per second.
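
As a minimal sketch of the right side of this picture, a gateway process could publish machine sensor readings into Kafka with the standard producer API. The topic name and JSON payload are assumptions for illustration only.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SensorGateway {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // One temperature reading, keyed by machine ID so that all events
            // of a machine land in the same partition and keep their order.
            String reading = "{\"machineId\":\"press-42\",\"sensor\":\"temperature\","
                + "\"value\":87.3,\"timestamp\":" + System.currentTimeMillis() + "}";
            producer.send(new ProducerRecord<>("machine-sensor-data", "press-42", reading));
        }
    }
}
```
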
The IoT integration between machines, PLCs, sensors, etc. is implemented either with Kafka Connect or with other APIs for MQTT, OPC-UA, REST/HTTP, files, or any other open or proprietary interface. The details of that last-mile integration are not the topic of this post; “Kafka and PLC4x for Industrial IoT Integration” and “Kafka as a Modern Data Historian” are great resources to learn more. Let’s now explore the two examples.

Stateless Condition Monitoring with Kafka Streams

The following diagram shows Kafka-native condition monitoring analyzing temperature spikes in real-time:
Stateless Condition Monitoring with Kafka Streams

The example is implemented with Kafka Streams, a Java-based library that can be embedded into any application. The business logic continuously monitors the sensor data. High volumes of data are processed in real-time. However, only relevant events showing temperature spikes over 100 degrees are forwarded to another Kafka topic. Any interested consumer gets them, for instance, a real-time alerting system or a batch report.

The application is stateless. It processes one event at a time. This capability alone is compelling for streaming ETL such as filtering or transformations. Any complex business logic is also possible within the application.
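
A minimal Kafka Streams implementation of this stateless filter could look like the following sketch. The topic names and the plain-double payload are assumptions; real projects typically use Avro, Protobuf, or JSON with a schema.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class ConditionMonitoringApp {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // All raw temperature readings, keyed by machine ID.
        KStream<String, Double> temperatures = builder.stream(
            "machine-temperatures", Consumed.with(Serdes.String(), Serdes.Double()));

        // Stateless, event-by-event: forward only spikes over 100 degrees.
        temperatures
            .filter((machineId, temperature) -> temperature != null && temperature > 100.0)
            .to("temperature-spikes", Produced.with(Serdes.String(), Serdes.Double()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "condition-monitoring");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        new KafkaStreams(builder.build(), props).start();
    }
}
```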

Stateful Predictive Maintenance with ksqlDB

While stateless stream processing is already powerful, stateful stream processing solves even more business problems. The following example shows how a Kafka-native ksqlDB microservice implements stateful stream processing to detect anomalies continuously:
Stateful Predictive Maintenance with Kafka and ksqlDB

A one-hour sliding window continuously aggregates the temperature spikes from the sensors. Consumers use the data in real-time to proactively act on defined thresholds. For instance, the data science team could have analyzed historical data and determined that more than ten temperature spikes with an average of over 100 degrees significantly increase the risk of an outage. In that case, the machine operator is alerted in real-time to do maintenance.
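
The diagram describes this logic as a ksqlDB query. For readers who prefer code, here is a roughly equivalent Kafka Streams sketch using a hopping one-hour window as an approximation of the sliding window described above; topic names and the alert threshold are assumptions.

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class PredictiveMaintenanceApp {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("temperature-spikes", Consumed.with(Serdes.String(), Serdes.Double()))
               .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
               // One-hour window per machine, advancing every five minutes.
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofHours(1))
                                      .advanceBy(Duration.ofMinutes(5)))
               .count()
               // More than ten spikes within an hour: alert the machine operator.
               .toStream((windowedMachineId, count) -> windowedMachineId.key())
               .filter((machineId, count) -> count != null && count > 10)
               .to("maintenance-alerts", Produced.with(Serdes.String(), Serdes.Long()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "predictive-maintenance");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        new KafkaStreams(builder.build(), props).start();
    }
}
```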

Applied Machine Learning in Real-time with Kafka and TensorFlow

Simple business logic already solves many problems and improves the OEE and maintenance processes. Machine Learning adds additional “magic” to make condition monitoring and predictive maintenance even better.

The great news is that the architecture does not need to change. Analytic models can be embedded into a Kafka application like any other business logic. I have talked about Kafka and Artificial Intelligence (AI) / Machine Learning (ML) / Deep Learning (DL) a lot in the past; check out my related posts to learn more.

Here is an example with ksqlDB and an embedded TensorFlow model:
Real Time Machine Learning with Kafka KSQL and TensorFlow

A ksqlDB user-defined function (UDF) embeds the model. This model uses an unsupervised autoencoder for anomaly detection in real-time within the Kafka application. Supervised algorithms are possible the same way.
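
ksqlDB UDFs are implemented in Java. The following skeleton shows the general shape of such a function; the scoring logic here is a trivial stand-in (squared deviation from a learned baseline), where the real application would invoke the embedded TensorFlow autoencoder and return its reconstruction error.

```java
import io.confluent.ksql.function.udf.Udf;
import io.confluent.ksql.function.udf.UdfDescription;
import io.confluent.ksql.function.udf.UdfParameter;

@UdfDescription(name = "anomaly_score",
    description = "Scores a sensor reading; higher values indicate anomalies.")
public class AnomalyScoreUdf {

    // Stand-in for model state. A real implementation would load the trained
    // TensorFlow autoencoder once (e.g., a SavedModel from disk) and reuse it.
    private static final double LEARNED_BASELINE = 85.0;

    @Udf(description = "Anomaly score for a single temperature reading.")
    public double anomalyScore(@UdfParameter(value = "temperature") final double temperature) {
        // Real logic: feed the reading into the autoencoder and return the
        // reconstruction error. Here: squared deviation from the baseline.
        double deviation = temperature - LEARNED_BASELINE;
        return deviation * deviation;
    }
}
```

Once packaged and dropped into the ksqlDB extensions directory, the function can be called like any built-in function within a streaming query.
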
This architecture intelligently solves the impedance mismatch between the data science team and the production engineers. Data scientists use Python and a Jupyter notebook for rapid prototyping and model development. The production team deploys the ksqlDB query in a cluster for real-time scoring at scale. An excellent GitHub project implements this separation of concerns with a Kappa architecture for a connected car infrastructure, doing predictive maintenance with MQTT and Kafka:
Kappa Architecture with Apache Kafka MQTT Kubernetes and Tensorflow for Streaming Machine Learning

Equipment-as-a-Service with Fully-Managed Kafka

Many manufacturers created a new business model: Equipment-as-a-Service (EaaS). Think about it: Many buyers do not want to operate machines and worry about maintenance. McKinsey published an excellent report about industry trends that shows why manufacturers want to provide machinery and devices as a service and achieve good margins:
McKinsey Report about Equipment as a Service

EaaS takes over this burden from the buyer. The machine vendor continuously monitors whether the engine or other components need maintenance. Late maintenance means an irreparable engine. Early maintenance means higher costs. The solution is to determine the service life of the engine and use optimal maintenance times. Hence, the machine vendor has to provide this subscription maintenance service the best way it can, in its own interest and for a better customer experience.
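
The trade-off boils down to a simple decision rule: schedule maintenance as soon as the predicted remaining useful life no longer safely covers the service lead time. The following toy calculation (all numbers invented) illustrates the idea.

```java
public class MaintenanceScheduler {

    public static void main(String[] args) {
        // All values are invented for illustration.
        double estimatedRulHours = 340.0;    // remaining useful life from the predictive model
        double serviceLeadTimeHours = 72.0;  // time to get technician and parts on site
        double safetyMarginHours = 48.0;     // buffer against model uncertainty

        // Too late means an irreparable engine, too early means wasted cost.
        // Trigger maintenance when the predicted RUL approaches the lead time.
        if (estimatedRulHours <= serviceLeadTimeHours + safetyMarginHours) {
            System.out.println("Schedule maintenance now.");
        } else {
            System.out.println("No maintenance needed yet; keep monitoring.");
        }
    }
}
```
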
Many manufacturers use Kafka and event streaming for their next-generation software solutions that run on top of the machinery or in the cloud connected to it. Many modern IIoT services leverage a fully-managed and truly serverless Kafka solution like Confluent Cloud. The vendors want and need to focus on the business problems, not on operating the infrastructure for event streaming.

Digital Twins play a vital role in this discussion, no matter if you use the buzzword or just the concepts behind it 🙂 I have written a few related articles about fully-managed Kafka for building machine-as-a-service offerings with Digital Twins.

Video Recording – Apache Kafka in Industrial IoT

Here is a video recording walking you through the condition monitoring and predictive maintenance use case with the Kafka ecosystem:


Event Streaming for Next-Generation IoT Platforms and Equipment Services

This post showed how event streaming with the Kafka ecosystem enables new business models for manufacturers that sell machinery. Kafka-native stream processing allows using a single technology for different use cases such as condition monitoring or predictive maintenance. Stateless and stateful streaming analytics are beneficial for making proactive and predictive decisions in real-time at scale. This architecture is possible everywhere: in one or multiple clouds and/or regions, on-premise in data centers, at the edge outside the data center, or any combination of hybrid architectures.

Of course, other use cases not covered here but necessary include integration with ERP and MES systems, like direct connectivity between Kafka and SAP. Also, when you think about condition monitoring and predictive maintenance, not all data comes from sensors and interfaces such as OPC-UA or MQTT. Image, video, and sound processing are part of many scenarios. Kafka can handle large messages (with some trade-offs). Learn how and where this makes sense in a dedicated blog post.

How do you leverage event streaming at the shop floor level for condition monitoring and predictive maintenance? What technologies and architectures do you use? What projects did you already work on or are planning? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Condition Monitoring and Predictive Maintenance with Apache Kafka appeared first on Kai Waehner.
