Industrial IoT Middleware for Edge and Cloud OT/IT Bridge powered by Apache Kafka and Flink

As industries continue to adopt digital transformation, the convergence of Operational Technology (OT) and Information Technology (IT) has become essential. The OT/IT Bridge is a key concept in industrial automation to connect real-time operational processes with business-oriented IT systems, ensuring seamless data flow and coordination. This integration plays a critical role in the Industrial Internet of Things (IIoT). It enables industries to monitor, control, and optimize their operations through real-time data synchronization and to improve the Overall Equipment Effectiveness (OEE). By leveraging IIoT middleware and data streaming technologies like Apache Kafka and Flink, businesses can achieve a unified approach to managing both production processes and higher-level business operations to drive greater efficiency, predictive maintenance, and streamlined decision-making.

Industrial IoT Middleware OT IT Bridge between Edge and Cloud with Apache Kafka and Flink

Industrial Automation – The OT/IT Bridge

An OT/IT Bridge in industrial automation refers to the integration between Operational Technology (OT) systems, which manage real-time industrial processes, and Information Technology (IT) systems, which handle data, business operations, and analytics. This bridge is crucial for modern Industrial IoT (IIoT) environments, as it enables seamless data flow between machines, sensors, and industrial control systems (PLC, SCADA) on the OT side, and business management applications (ERP, MES) on the IT side.

The OT/IT Bridge facilitates real-time data synchronization. It allows industries to monitor and control their operations more efficiently, implement condition monitoring/predictive maintenance, and perform advanced analytics. The OT/IT bridge helps overcome the traditional siloing of OT and IT systems by integrating real-time data from production environments with business decision-making tools. Data streaming frameworks like Kafka and Flink, often combined with specialized platforms for the last-mile IoT integration, act as intermediaries to ensure data consistency, interoperability, and secure communication across both domains.

This bridge enhances overall productivity and improves the OEE by providing actionable insights that help optimize performance and reduce downtime across industrial processes.
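
To make the data flow concrete, here is a minimal, hypothetical sketch of such a bridge in Python with the confluent-kafka client: it consumes raw events from an OT topic, enriches them with business context, and forwards them to a topic consumed by IT systems. The broker address, topic names, and event fields are assumptions for illustration.

```python
import json
from confluent_kafka import Consumer, Producer

# Hypothetical broker and topics; adjust for a real deployment.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "ot-it-bridge",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})

consumer.subscribe(["ot.sensor.raw"])
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    # Enrich the raw OT event with business context before handing it to IT.
    event["plant"] = "plant-01"                       # hypothetical metadata
    event["oee_relevant"] = event.get("state") == "RUNNING"
    producer.produce("it.production.events", json.dumps(event).encode())
    producer.poll(0)  # serve delivery callbacks
```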

OT/IT Hierarchy – Different Layers based on ISA-95 and the Purdue Model

The OT/IT Levels 0-5 framework is commonly used to describe the different layers in industrial automation and control systems, often following the ISA-95 or Purdue model:

  • Level 0: Physical Process: This is the most basic level, consisting of the physical machinery, equipment, sensors, actuators, and production processes. It represents the actual processes being monitored or controlled in a factory or industrial environment.
  • Level 1: Sensing and Actuation: At this level, sensors, actuators, and field devices gather data from the physical processes. This includes things like temperature sensors, pressure gauges, motors, and valves that interact directly with the equipment at Level 0.
  • Level 2: Control Systems: Level 2 includes real-time control systems such as Programmable Logic Controllers (PLCs) and Distributed Control Systems (DCS). These systems interpret the data from Level 1 and make real-time decisions to control the physical processes.
  • Level 3: Manufacturing Operations Management (MOM): This level manages and monitors production workflows. It includes systems like Manufacturing Execution Systems (MES), which ensure that production runs smoothly and aligns with the business’s operational goals. It bridges the gap between the physical operations and higher-level business planning.
  • Level 4: Business Planning and Logistics: This is the IT layer that includes systems for business management, enterprise resource planning (ERP), and supply chain management (SCM). These systems handle business logistics such as order processing, materials procurement, and long-term planning.
  • Level 5: Enterprise Integration: This level encompasses corporate-wide IT functions such as financial systems, HR, sales, and overall business strategy. It ensures the alignment of all operations with the broader business goals.

In summary, Levels 0-2 focus on the OT (Operational Technology) side—real-time control and monitoring of industrial processes, while Levels 3-5 focus on the IT (Information Technology) side—managing data, logistics, and business operations.

While the modern, cloud-native IIoT world is not strictly hierarchical anymore (e.g. there is also lots of edge computing like sensor analytics), these layers are still often used to separate functions and responsibilities. Industrial IoT data platforms, including the data streaming platform, often connect to several of these layers in a decoupled hub and spoke architecture.
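
As an illustration of such a decoupled hub-and-spoke setup, a simple and purely hypothetical convention could map each ISA-95 layer to its own set of Kafka topics, so producers and consumers on different levels never couple point-to-point:

```python
# Hypothetical topic naming per ISA-95 level: each layer publishes to its
# own topics and any other layer subscribes without direct coupling.
TOPICS_BY_LEVEL = {
    1: "plant01.level1.sensors",  # field devices (sensing and actuation)
    2: "plant01.level2.plc",      # control systems (PLC, DCS)
    3: "plant01.level3.mes",      # manufacturing operations management
    4: "plant01.level4.erp",      # business planning and logistics
}
```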

Industrial IoT Middleware

Industrial IoT (IIoT) Middleware is a specialized software infrastructure designed to manage and facilitate the flow of data between connected industrial devices and enterprise systems. It acts as a mediator that connects various industrial assets, such as machines, sensors, and controllers, with IT applications and services such as MES or ERP, often in a cloud or on-premises environment.

This middleware provides a unified interface for managing the complexities of data integration, protocol translation, and device communication to enable seamless interoperability among heterogeneous systems. It often includes features like real-time data processing, event management, scalability to handle large volumes of data, and robust security mechanisms to protect sensitive industrial operations.
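
A minimal sketch of this protocol translation, assuming an MQTT subscriber (paho-mqtt 1.x callback API) that forwards device payloads into Kafka; broker addresses and the topic filter are hypothetical:

```python
import paho.mqtt.client as mqtt
from confluent_kafka import Producer

kafka = Producer({"bootstrap.servers": "localhost:9092"})

def on_message(client, userdata, msg):
    # Map the MQTT topic into a Kafka topic name and forward the payload.
    kafka.produce(msg.topic.replace("/", "."), msg.payload,
                  key=msg.topic.encode())
    kafka.poll(0)

mqttc = mqtt.Client()
mqttc.on_message = on_message
mqttc.connect("localhost", 1883)
mqttc.subscribe("factory/+/telemetry")  # '+' matches one topic level
mqttc.loop_forever()
```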

In essence, IIoT Middleware is critical for enabling the smart factory concept, where connected devices and systems can communicate effectively, allowing for automated decision-making, predictive maintenance, and optimized production processes in industrial settings.

By providing these services, IIoT Middleware enables industrial organizations to optimize operations, enhance Overall Equipment Effectiveness (OEE), and improve system efficiency through seamless integration and real-time data analytics.

Relevant Industries for IIoT Middleware

Industrial IoT Middleware is essential across various industries that rely on connected equipment, sensors, or vehicles and data-driven processes to optimize operations. Some key industries where IIoT Middleware is particularly needed include:

  • Manufacturing: For smart factories, IIoT Middleware enables real-time monitoring of production lines, predictive maintenance, and automation of manufacturing processes. It supports Industry 4.0 initiatives by integrating machines, robotics, and enterprise systems.
  • Energy and Utilities: IIoT Middleware is used to manage data from smart grids, power plants, and renewable energy sources. It helps in optimizing energy distribution, monitoring infrastructure health, and improving operational efficiency.
  • Oil and Gas: In this industry, IIoT Middleware facilitates the remote monitoring of pipelines, drilling rigs, and refineries. It enables predictive maintenance, safety monitoring, and optimization of extraction and refining processes.
  • Transportation and Logistics: IIoT Middleware is critical for managing fleet operations, tracking shipments, and monitoring transportation infrastructure. It supports real-time data analysis for route optimization, fuel efficiency, and supply chain management.
  • Healthcare: In healthcare, IIoT Middleware connects medical devices, patient monitoring systems, and healthcare IT systems. It enables real-time monitoring of patient vitals, predictive diagnostics, and efficient management of medical equipment.
  • Agriculture: IIoT Middleware is used in precision agriculture to connect sensors, drones, and farm equipment. It helps in monitoring soil conditions, weather patterns, and crop health, leading to optimized farming practices and resource management.
  • Aerospace and Defense: IIoT Middleware supports the monitoring and maintenance of aircraft, drones, and defense systems. It ensures the reliability and safety of critical operations by integrating real-time data from various sources.
  • Automotive: In the automotive industry, IIoT Middleware connects smart vehicles, assembly lines, and supply chains. It enables connected car services, autonomous driving, and the optimization of manufacturing processes.
  • Building Management: For smart buildings and infrastructure, IIoT Middleware integrates systems like HVAC, lighting, and security. It enables real-time monitoring and control, energy efficiency, and enhanced occupant comfort.
  • Pharmaceuticals: In pharmaceuticals, IIoT Middleware helps monitor production processes, maintain regulatory compliance, and ensure the integrity of the supply chain.

These industries benefit from IIoT Middleware by gaining better visibility into their operations. The digitalization of shop floor and business processes improves decision-making and drives efficiency through automation and real-time data analysis.

Industrial IoT Middleware Layers in OT/IT

While modern, cloud-native IoT architectures don’t always use a hierarchical model anymore, Industrial IoT (IIoT) middleware typically operates at Level 3 (Manufacturing Operations Management) and Level 2 (Control Systems) in the OT/IT hierarchy.

At Level 3, IIoT middleware integrates data from control systems, sensors, and other devices, coordinating operations, and connecting these systems to higher-level IT layers such as MES and ERP systems. At Level 2, the middleware handles real-time data exchange between industrial control systems (like PLCs) and IT infrastructure, ensuring data flow and interoperability between the OT and IT layers.

This middleware acts as a bridge between the operational technology (OT) at Levels 0-2 and the business-oriented IT systems at Levels 4-5.

Edge and Cloud Vendors for Industrial IoT

The industrial IoT space provides many solutions from various software vendors. Let’s explore the different options and their trade-offs.

Traditional “Legacy” Solutions

Traditional Industrial IoT (IIoT) solutions are often characterized by proprietary, monolithic architectures that can be inflexible and expensive to implement and maintain. These traditional platforms, offered by established industrial vendors like PTC ThingWorx, Siemens MindSphere, GE Predix, and OSIsoft PI, are typically designed to meet specific industry needs but may lack the scalability, flexibility, and cost-efficiency required for modern industrial applications. However, while these solutions are often called “legacy”, they do a solid job integrating with proprietary PLCs, SCADA systems, and data historians. They still operate the shop floor in most factories worldwide.

Emerging Cloud Solutions

In contrast to legacy systems, emerging cloud-based IIoT solutions offer elastic, scalable, and (hopefully) cost-efficient alternatives that are fully managed by cloud service providers. These platforms, such as AWS IoT Core, enable industrial organizations to quickly deploy and scale IoT applications while benefiting from the cloud’s inherent flexibility, reduced operational overhead, and integration with other cloud services.

However, emerging cloud solutions for IIoT can face challenges:

  • Latency and real-time processing limitations, making them less suitable for time-sensitive industrial applications.
  • High network transfer cost from the edge to the cloud.
  • Security and compliance concerns arise when transferring sensitive operational data to the cloud, particularly in regulated industries.
  • Dependence on reliable internet connectivity, which can be a significant drawback in remote or unstable environments.
  • Very limited connectivity to proprietary (legacy) protocols such as Siemens S7 or Modbus.

The IIoT Enterprise Architecture is a Mix of Vendors and Platforms

There is no black and white when comparing different solutions. The current IIoT landscape in real-world deployments features a mix of traditional industrial vendors and new cloud-native solutions. Platforms like Schneider Electric’s EcoStruxure still provide robust industrial capabilities, while newer entrants like AWS IoT Core are gaining traction due to their modern, cloud-centric approaches. The shift towards cloud solutions reflects the growing demand for more agile and scalable IIoT infrastructures.

The reality in the industrial space is that:

  • OT/IT is usually hybrid edge to cloud, not just cloud
  • Most cloud-only solutions do not provide the right security, SLAs, latency, cost
  • IoT is a complex space. “Just” an OPC-UA or MQTT connector is not sufficient in most scenarios.

Data streaming with Apache Kafka and Flink is a powerful approach that enables the continuous flow and processing of real-time data across various systems. However, to be clear: Data streaming is NOT a silver bullet. It is complementary to other IoT middleware. And some modern, cloud-native industrial software is built on top of data streaming technologies like Kafka and Flink under the hood.

In the context of Industrial IoT, data streaming plays a crucial role by seamlessly integrating and processing data from numerous IoT devices, equipment, PLCs, MES and ERP in real-time. This capability enhances decision-making processes and operational efficiency by providing continuous insights, allowing industries to optimize their operations and respond proactively to changing conditions. The last-mile integration is usually done by complementary IIoT technologies providing sophisticated connectivity to OPC-UA, MQTT and proprietary legacy protocols like S7 or Modbus.
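
As a sketch of what such last-mile integration can look like, the following hypothetical example polls a value from an OPC-UA server with the python-opcua library and publishes it to Kafka; the endpoint URL, node id, and topic name are assumptions for illustration.

```python
import json
import time

from opcua import Client
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})
plc = Client("opc.tcp://192.168.1.10:4840")  # hypothetical OPC-UA endpoint
plc.connect()
try:
    temperature = plc.get_node("ns=2;i=1001")  # hypothetical node id
    while True:
        reading = {"ts": time.time(), "temp_c": temperature.get_value()}
        producer.produce("ot.plc.temperature", json.dumps(reading).encode())
        producer.poll(0)
        time.sleep(1.0)  # simple 1 Hz polling loop
finally:
    plc.disconnect()
```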

In data center and cloud settings, Kafka and Flink are used to provide continuous processing and data consistency across IT applications including sales and marketing, B2B communication with partners, and eCommerce. Data streaming facilitates data integration, processing and analytics to enhance the efficiency and responsiveness of IT operations and business; no matter if data sources or sinks are real-time, batch or request-response APIs.

Apache Kafka as an OT/IT Bridge

Kafka serves as a critical bridge between Operational Technology (OT) and Information Technology (IT) by enabling real-time data synchronization at scale. This integration ensures data consistency across different systems, supporting seamless communication and coordination between industrial operations and business systems.

At the edge of operational technology, Kafka and Flink provide a robust backbone for use cases such as condition monitoring and predictive maintenance. By processing data locally and in real-time, these technologies improve the Overall Equipment Effectiveness (OEE), and support advanced analytics and decision-making directly within industrial environments.
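
Conceptually, such condition monitoring boils down to stateful logic over a stream, e.g. a rolling average over vibration readings with a threshold alert. In production this would typically run as a Flink job; the hypothetical sketch below uses a plain Python Kafka consumer for brevity, and all topic names, fields, and the threshold are assumptions.

```python
import json
from collections import deque

from confluent_kafka import Consumer, Producer

consumer = Consumer({"bootstrap.servers": "localhost:9092",
                     "group.id": "condition-monitor",
                     "auto.offset.reset": "latest"})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["ot.plc.vibration"])

window = deque(maxlen=60)  # rolling window over the last 60 readings
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    window.append(json.loads(msg.value())["rms"])
    avg = sum(window) / len(window)
    if avg > 4.5:  # hypothetical vibration threshold (mm/s)
        alert = {"machine": "press-07", "avg_rms": avg}
        producer.produce("alerts.maintenance", json.dumps(alert).encode())
        producer.poll(0)
```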

IoT Success Story: Industrial Edge Intelligence with Helin and Confluent

Helin is a company that specializes in advanced data solutions with a focus on real-time data integration and analytics, particularly in industrial and operational environments. Its industry focus is on the maritime and energy sectors, but the approach is relevant across all IIoT industries.

Helin presented its Industrial Edge Intelligence Platform at Confluent’s Data in Motion Tour in Utrecht, Netherlands, in 2024. The IIoT platform includes capabilities for data streaming, processing, and visualization to help organizations leverage their data more effectively for decision-making and operational improvements.

Helin - Industrial IoT Edge Intelligence Platform
Source: Helin

Helin’s platform bridges the OT and IT worlds by seamlessly integrating industrial edge analytics with multi-tenant cloud solutions:

Helin - Edge to Cloud IIoT Architecture
Source: Helin

The above architecture diagram shows how Helin maps to the OT/IT hierarchy:

  • OT – Levels 0-3
    • Level 1: Sensors, Actuators, Field Devices
    • Level 2: Remote I/O
    • Level 3: Controllers
  • DMZ / Gateway – Level 3.5
  • BIZ (= IT) – Levels 4-5
    • Level 4: OT applications (MES, SCADA, etc.)
    • Level 5 (outside of Helin): IT applications (ERP, CRM, DWH, etc.)

The strategy and value of Helin’s IoT platform is relevant for most industrial organizations: making dumb assets smart by extracting data in real-time and utilizing AI to transform it into significant business value and actionable insights for the maritime and energy sectors.

Business Value: Fuel Reduction, Increased Revenue, Saving Human Lives

Helin presented three success stories with huge business value:

  • 8% fuel reduction: Helin’s platform reduced fuel consumption for Boskalis by 8% by delivering real-time insights to vessel operators offshore.
  • 20% revenue increase: The platform increased revenue for Sunrock’s solar parks by 20% by optimizing their assets.
  • Saving human lives: Optimization of drilling operations while increasing the safety of the crew on oil rigs by reducing human errors.

Why does the Helin IoT Platform use Kafka? Helin brought up a few powerful arguments:

  • Flexibility towards the integration between the edge and the cloud
  • Different data streams at different velocity
    • Slow cold storage data
    • Real time streams for analytics
    • Database endpoint for visualization
  • Multi-cloud with a standardized streaming protocol
    • Reduced code overhead by not having to build adapters
    • Open platform so that customers can land their data anywhere
    • Failover baked in

Helin’s Data Streaming Journey from Self-Managed Kafka to Serverless Confluent Cloud

Helin started with self-managed Kafka and cumbersome Python scripts…

Self-Managed Apache Kafka
Source: Helin

… and transitioned to fully managed Kafka in Confluent Cloud:

Fully Managed Apache Kafka and Flink Confluent Cloud
Source: Helin

As a next step, Helin is migrating from cumbersome and unreliable Python mappings to Apache Flink for scalable and reliable data processing.

Please note that the last-mile IoT connectivity at the edge (SCADA, PLC, etc.) is implemented with technologies like OPC-UA, MQTT or custom integrations. You can see a common best practice: Choose and combine the right tools for the job.

Data streaming plays a crucial role in bridging OT and IT in industrial automation. By enabling continuous data flow between the edge and the cloud, Kafka and Flink ensure that both operational data from sensors and machinery, and IT applications like ERP and MES, remain synchronized in real-time. Additionally, data consistency with non-real-time systems like a legacy batch system or a cloud-native data lakehouse are guaranteed out-of-the-box.

The real-time integration powered by Kafka and Flink improves the Overall Equipment Effectiveness (OEE) and enables specific use cases such as predictive maintenance and condition monitoring. As industries increasingly adopt edge computing alongside cloud solutions, these data streaming tools provide the scalability, flexibility, and low-latency performance needed to drive Industrial IoT initiatives forward.

Helin’s Industrial Edge Intelligence platform is an excellent example of IIoT middleware. It leverages Apache Kafka and Flink to integrate real-time data from industrial assets, enabling predictive analytics and operational optimization. By using this platform, companies like Boskalis achieved 8% fuel savings, and Sunrock increased revenue by 20%. These real-world scenarios demonstrate the platform’s ability to drive significant business value through real-time insights and decision-making in industrial projects.

What does your OT/IT integration look like today? Do you plan to optimize the infrastructure with data streaming? What does the hybrid architecture look like? What are the use cases? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Energy Trading with Apache Kafka and Flink

Energy trading and data streaming are connected because real-time data helps traders make better decisions in the fast-moving energy markets. This data includes things like price changes, supply and demand, smart IoT meters and sensors, and weather, which help traders react quickly and plan effectively. As a result, data streaming with Apache Kafka and Apache Flink makes the market clearer, speeds up information sharing, and improves forecasting and risk management. This blog post explores the use cases and architectures for scalable and reliable real-time energy trading, including real-world deployments from Uniper, re.alto and Powerledger.

Energy Trading with Apache Kafka and Flink at Uniper ReAlto Powerledger

What is Energy Trading?

Energy trading is the process of buying and selling energy commodities in order to manage risk, optimize costs, and ensure the efficient distribution of energy. Commodities traded include:

  • Electricity: Traded in wholesale markets to balance supply and demand.
  • Natural Gas: Bought and sold for heating, electricity generation, and industrial use.
  • Oil: Crude oil and refined products like gasoline and diesel are traded globally.
  • Renewable Energy Certificates (RECs): Represent proof that energy was generated from renewable sources.

Market Participants:

  • Producers: Companies that extract or generate energy.
  • Utilities: Entities that distribute energy to consumers.
  • Industrial Consumers: Large energy users that purchase energy directly.
  • Traders and Financial Institutions: Participants that buy and sell energy contracts for profit or risk management.

Objectives of Energy Trading

The objectives for energy trading are risk management (hedging against price volatility and supply disruptions), cost optimization (securing energy at the best possible prices) and revenue generation (profiting from price differences in different markets).

Market types include:

  • Spot Markets: Immediate delivery and payment of energy commodities.
  • Futures Markets: Contracts to buy or sell a commodity at a future date, helping manage price risks.
  • Over-the-Counter (OTC) Markets: Direct trades between parties, often customized contracts.
  • Exchanges: Platforms like the New York Mercantile Exchange (NYMEX) and Intercontinental Exchange (ICE) where standardized contracts are traded.

Energy trading is subject to extensive regulation to ensure fair practices, prevent market manipulation, and protect consumers.

Data streaming with Apache Kafka and Flink provides a unique combination of capabilities:

  • Real-time messaging at scale for analytical and transactional workloads.
  • Event store for durability, true decoupling, and the ability to travel back in time for replayability of events with guaranteed ordering (see the sketch after this list).
  • Data integration with any data source and sink (real-time, near real-time, batch, request-response APIs, files, etc.)
  • Stream processing for stateless and stateful correlations of data for streaming ETL and business applications
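
The event store property, for example, means a new consumer group can replay the complete, ordered history of a topic, e.g. to backtest a trading strategy. A hypothetical sketch; the topic, group id, and processing function are assumptions:

```python
from confluent_kafka import Consumer

def process_historical_trade(raw: bytes) -> None:
    # Hypothetical strategy evaluation; replace with real backtest logic.
    print(raw)

backtest = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "strategy-backtest-2024-06",  # fresh group id => full replay
    "auto.offset.reset": "earliest",          # start from the oldest event
})
backtest.subscribe(["energy.trades.executed"])
while True:
    msg = backtest.poll(1.0)
    if msg is None or msg.error():
        continue
    # Events arrive in their original per-partition order, so the strategy
    # can be re-evaluated exactly as the market unfolded.
    process_historical_trade(msg.value())
```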

Data Streaming Platform with Apache Kafka and Apache Flink for Energy Trading

Trading Architecture with Apache Kafka

Many trading markets use data streaming with Apache Kafka under the hood to integrate with internal systems, external exchanges and data providers, clearing houses and regulators:

Trading in Financial Services with Data Streaming using Apache Kafka
Source: Confluent

For instance, NASDAQ combines critical stock exchange trading with low-latency streaming analytics. This is not much different for energy trading, even though the interfaces and challenges differ a bit because various additional IoT data sources are involved.

Data streaming with Apache Kafka and Apache Flink is highly beneficial for energy trading for several reasons across the end-to-end business process and data pipelines.

Energy Trading Business Processing with Data in Motion

Here is why these technologies are often used in the energy sector:

Real-Time Data Processing

Real-Time Analytics: Energy trading relies on real-time data to make informed decisions. Kafka and Flink can process data streams in real-time, providing immediate insights into market conditions, energy consumption and production levels.

Immediate Response: Real-time processing allows traders to respond instantly to market changes, such as price fluctuations or sudden changes in supply and demand, optimizing trading strategies and mitigating risks.

Scalability and Performance

Scalability: Both Kafka and Flink handle high-throughput data streams. This scalability is crucial for energy markets, which generate vast amounts of data from multiple sources, including sensors, smart meters, and market feeds.

High Performance: Data streaming enables fast data processing and analysis. Kafka ensures low-latency data ingestion, while Flink provides efficient, distributed stream processing.

Fault Tolerance and Reliability

Fault Tolerance: Kafka’s distributed architecture ensures data durability and fault tolerance, essential for the continuous operation of energy trading systems.

Reliability: Flink offers exactly-once processing semantics, ensuring that each piece of data is processed accurately without loss or duplication, which is critical for maintaining data integrity in trading operations.

Integration and Flexibility

Integration Capabilities: Kafka can integrate with various data sources and sinks via Kafka Connect or client APIs for Java, C, C++, Python, JavaScript or REST/HTTP. This makes it versatile for collecting data from different energy systems. Flink can process this data in real-time and output the results to various storage systems or dashboards.

Flexible Data Processing: Flink supports complex event processing, windowed computations, and machine learning, allowing for advanced analytics and predictive modeling in energy trading.

Event-Driven Architecture (EDA)

Event-Driven Processing: Energy trading can benefit from an event-driven architecture where trading decisions and alerts are triggered by specific events, such as market price thresholds or changes in energy production. Kafka and Flink facilitate this approach by efficiently handling event streams.
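
A hypothetical sketch of such an event-driven trigger: watch a spot price stream and emit a trading signal when a threshold is crossed. In practice this logic would typically run as a Flink job; plain Python keeps the idea visible, and all topics, fields, and the threshold are assumptions.

```python
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({"bootstrap.servers": "localhost:9092",
                     "group.id": "price-watcher",
                     "auto.offset.reset": "latest"})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["market.prices.spot"])

THRESHOLD_EUR_MWH = 120.0  # hypothetical trigger level
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    tick = json.loads(msg.value())
    if tick["price"] > THRESHOLD_EUR_MWH:
        signal = {"action": "SELL", "market": tick["market"],
                  "price": tick["price"]}
        producer.produce("trading.signals", json.dumps(signal).encode())
        producer.poll(0)
```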

Energy Trading at Uniper

Uniper is a global energy company headquartered in Germany that focuses on power generation, energy trading, and energy storage solutions, providing electricity, natural gas, and other energy services to customers worldwide.

Uniper - The beating of energy
Source: Uniper

Uniper’s Business Value of Data Streaming

Why has Uniper chosen to use the Apache Kafka and Apache Flink ecosystem? If you look at the trade lifecycle in the energy sector, you probably can find out by yourself:

Energy Trading Lifecycle
Source: Uniper

The underlying process is much more complex than the above picture shows. For instance, pre-trading includes aspects like capacity management. If you trade energy between the Netherlands and Germany, the transportation of the energy needs to be planned while executing the trade. Uniper explains the process in much more detail in the webinar recording below.

Here are Uniper’s benefits of implementing the trade lifecycle with data streaming using Kafka and Flink, as they described them:

Business-driven:

  • Increase of trading volumes
  • More messages per day
  • Faster processing of data

Architecture-driven:

  • Decoupling of applications
  • Faster processing of data – batch vs. streaming data
  • Reusability of data

Uniper’s IT Landscape

Uniper’s enterprise architecture leverages data streaming as central nervous systems between various technical platforms (integrated via Kafka Connect or Apache Camel) and business applications (e.g., algorithmic trading, dispatch and invoicing systems).

Apache Kafka and Flink integrated into the Uniper IT landscape
Source: Uniper

Uniper runs mission-critical workloads through Kafka. Confluent Cloud provides the right scale, elasticity, and SLAs for such use cases. Apache Flink serves ETL use cases for continuous stream processing.

Kafka Connect provides many connectors for direct integration with (non)streaming interfaces. Apache Camel is used for some other protocols that do not fit well into a native Kafka connector. Camel is an integration framework with native Kafka integration.

Fun fact, if you did not know: I have a history with Apache Camel, too. I worked a lot with this open source framework as an independent consultant and at Talend with its Enterprise Service Bus (ESB) powered by Apache Camel. Hence, my blog has some articles about Apache Camel, too, including “When to use Apache Camel vs. Apache Kafka?”.

The following on-demand webinar recording explores the relation between data streaming and energy trading in more detail. Uniper’s Alex Esseling (Platform & Security Architect, Sales & Trading IT) discusses Apache Kafka and Flink inside energy trading at Uniper:

Energy Trading Webinar with Confluent and Uniper
Source: Confluent

IoT Data for Energy Trading

Energy trading differs a bit from traditional trading on Nasdaq and similar financial markets as IoT data is an important additional data source for several key reasons:

1. Real-Time Market Insights

  • Live Data Feed: IoT devices, such as smart meters and sensors, provide real-time data on energy production, consumption, and grid status, enabling traders to make informed decisions based on the latest market conditions.
  • Demand Forecasting: Accurate demand forecasting relies on real-time consumption data, which IoT devices supply continuously, helping traders anticipate market movements and adjust their strategies accordingly.

2. Enhanced Decision Making

  • Predictive Analytics: IoT data allows for sophisticated predictive analytics, helping traders forecast price trends, identify potential supply disruptions, and optimize trading positions.
  • Risk Management: Continuous monitoring of energy infrastructure through IoT sensors helps in identifying and mitigating risks, such as equipment failures or grid imbalances, which could affect trading decisions.

A typical use case in energy trading might involve:

  1. Data Collection: IoT devices across the energy grid collect data on energy production from renewable sources, consumption patterns in residential and commercial areas, and grid stability metrics.
  2. Data Analysis: This data is streamed and processed in real-time using platforms like Apache Kafka and Flink, enabling immediate analysis and visualization.
  3. Trading Decisions: Traders use the insights derived from this analysis to make informed decisions about buying and selling energy, optimizing their strategies based on current and predicted market conditions.

In summary, IoT data is essential in energy trading for providing real-time insights, enhancing decision-making, optimizing grid operations, ensuring compliance, and integrating renewable energy sources, ultimately leading to a more efficient and responsive energy market.

Data Streaming to Ingest IoT Data into Energy Trading

Data streaming with Kafka and Flink is deployed in various edge and hybrid cloud energy use cases.

Data Streaming with Kafka at the Edge and Hybrid Cloud in the Energy Industry

As discussed above, some of the IoT data is very helpful for energy trading, not just for OT and operational workloads. Read more about data streaming in the IoT space in my related articles.

Powerledger – Energy Trading with Kafka, MongoDB and Blockchain

Powerledger is another excellent success story for energy trading powered by data streaming with Apache Kafka. The technology company uses blockchain to enable decentralized energy trading. Their platform allows users to trade energy directly with each other, manage renewable energy certificates, and track the origin and movement of energy in real-time.

The platform provides:
  • Tracking, tracing and trading of renewable energy
  • Blockchain-based energy trading platform
  • Facilitating peer-to-peer (P2P) trading of excess electricity from rooftop solar power installations and virtual power plants
  • Traceability with non-fungible tokens (NFTs) representing renewable energy certificates (RECs)

Powerledger uses a decentralised rather than the conventional unidirectional market. Benefits include reduced customer acquisition costs, increased customer satisfaction, better prices for buyers and sellers (compared with feed-in and supply tariffs), and provision for cross-retailer trading.

Powerledger uses Apache Kafka via Confluent Cloud as a core piece of infrastructure, specifically to ingest data from smart electricity meters and feed it into the trading system.

Wondering why to combine Kafka and Blockchain? Learn more here: “Apache Kafka and Blockchain – Comparison and a Kafka-native Implementation“.

re.alto – Solar Trading: Insights into the Production of Solar Plants

re.alto is a company that provides a digital marketplace for energy data and services. Their platform connects energy providers, consumers, and developers, facilitating the exchange of data and APIs (application programming interfaces) to optimize energy usage and distribution. By enabling seamless access to diverse energy-related data, re.alto supports innovation, enhances energy efficiency, and helps create smarter, more flexible energy systems.

re.alto presented their data streaming use cases at the Data in Motion Tour in Brussels, Belgium:

  • Xenn: Real-time monitoring of energy costs
  • Smart charging: Schedule the charging of an electric vehicle to reduce costs or environmental impact
  • Solar trading: Insights into the production of solar plants

Let’s explore solar trading in more detail. re.alto’s platform provides connectivity and APIs for market pricing data, but also IoT integration with smart meters, grid data, batteries, SCADA systems, etc.:

re.alto Platform with Energy Market and IIoT Connectivity to Smart Meters and SCADA Systems
Source: re.alto

Solar trading includes three steps:

  1. Data collection from sources such as SMA, FIMER, Huawei, and SolarEdge
  2. Data processing with data streaming, time series analytics, and overvoltage detection (sketched below)
  3. Providing data via EPEX SPOT and APIs / marketplace
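
As a highly simplified sketch of the overvoltage detection step, the following hypothetical consumer flags inverter readings above a grid voltage limit and publishes alerts. Topic names, the payload field, and the 253 V limit (230 V nominal + 10%) are assumptions for illustration.

```python
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({"bootstrap.servers": "localhost:9092",
                     "group.id": "overvoltage-detector",
                     "auto.offset.reset": "latest"})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["solar.inverter.telemetry"])

OVERVOLTAGE_LIMIT_V = 253.0  # hypothetical limit: 230 V nominal + 10%
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    reading = json.loads(msg.value())
    if reading["grid_voltage"] > OVERVOLTAGE_LIMIT_V:
        # Forward the offending reading unchanged to an alert topic.
        producer.produce("solar.alerts.overvoltage", msg.value())
        producer.poll(0)
```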

Solar Trading with Data Streaming using Apache Kafka and Timeseries Analytics at Energy Company re.alto

Energy Trading Needs Reliable Real-Time Data Feeds and Connectivity

Energy trading requires scalable and reliable real-time data processing. In contrast to trading in financial markets, the energy sector additionally integrates IoT data sources like smart meters and sensors.

Uniper, re.alto and Powerledger are excellent examples of how to build a reliable energy trading platform powered by data streaming.

What does your enterprise architecture look like for energy trading? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

MQTT Market Trends: Cloud, Unified Namespace, Sparkplug, Kafka Integration

The lightweight and open IoT messaging protocol MQTT gets adopted more widely across industries. This blog post explores relevant market trends for MQTT: cloud deployments and fully managed services, data governance with unified namespace and Sparkplug B, MQTT vs. OPC-UA debates, and the integration with Apache Kafka for OT/IT data processing in real-time.

MQTT Market Trends for 2024 including Sparkplug Data Governance Kafka Cloud

MQTT Summit in Munich

In December 2023, I attended the MQTT Summit Connack. HiveMQ sponsored the event. The agenda included various industry experts. The talks covered industrial IoT deployments, unified namespace, Sparkplug B, security and fleet management, and use cases for Kafka combined with MQTT like connected vehicles or smart city (my talk).

It was a pleasure to meet many industry peers of the MQTT community, independent consultants, and software vendors. I learned a lot about the adoption of MQTT in the real world, best practices, and a few trade-offs of Sparkplug B. The following sections summarize my MQTT trends from this event, combined with experiences from customer meetings around the world this year.

Special thanks to Kudzai Manditereza of HiveMQ for organizing this great event with many international attendees across industries:

Connack IoT Summit 2023 in Munich organized by HiveMQ

What is MQTT?

MQTT stands for Message Queuing Telemetry Transport. MQTT is a lightweight and open messaging protocol designed for small sensors and mobile devices with high-latency or unreliable networks. IBM originally developed MQTT in the late 1990s, and it later became an open standard.

MQTT follows a publish/subscribe model, where devices (or clients) communicate through a central message broker. The key components in MQTT are:

  1. Client: The device or application that connects to the MQTT broker to send or receive messages.
  2. Broker: The central hub that manages the communication between clients. It receives messages from publishing clients and routes them to subscribing clients based on topics.
  3. Topic: A hierarchical string that acts as a label for a message. Clients subscribe to topics to receive messages and publish messages to specific topics.
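
A minimal publish/subscribe round trip with the paho-mqtt client (1.x callback API) shows these three components together; the broker host and topic are hypothetical.

```python
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # Subscribe and publish only after the connection is acknowledged.
    client.subscribe("plant/line1/temperature")
    client.publish("plant/line1/temperature", "21.7")

def on_message(client, userdata, msg):
    print(f"{msg.topic}: {msg.payload.decode()}")

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.com", 1883)  # hypothetical broker
client.loop_forever()
```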

When to use MQTT?

The publish/subscribe model allows for efficient communication between devices. When a client publishes a message to a specific topic, all other clients subscribed to that topic receive the message. This decouples the sender and receiver, enabling a scalable and flexible communication system.

The MQTT standard is known for its simplicity, low bandwidth usage, and support for unreliable networks. These characteristics make it well-suited for Internet of Things (IoT) applications, where devices often have limited resources and may operate under challenging network conditions. Good MQTT implementations provide a scalable and reliable platform for IoT projects.

MQTT has gained widespread adoption in various industries for IoT deployments, home automation, and other scenarios requiring lightweight and efficient communication.

I discuss the following four market trends for MQTT in the sections below. These have a huge impact on adoption and on the decision to choose MQTT:

  1. MQTT in the Public Cloud
  2. Data Governance for MQTT
  3. MQTT vs. OPC-UA Debates
  4. MQTT and Apache Kafka for OT/IT Data Processing

Trend 1: MQTT in the Public Cloud

Most companies have a cloud-first strategy. Go serverless if you can! Focus on business problems; faster time-to-market and an elastic infrastructure are the consequence.

Mature MQTT cloud services exist. At Confluent, we work a lot with HiveMQ. The combination even provides a fully managed integration between both cloud offerings.

Having said that, not everything can or should go to the (public) cloud. Security, latency and cost often make a deployment in the data center or at the edge (e.g., in a smart factory) the preferred or mandatory option. Hybrid architectures allow the combination of both options for building the most cost-efficient but also reliable and secure IoT infrastructure. I talked about zero-trust and air-gapped environments leveraging unidirectional hardware for the most critical use cases in another blog post.

Automation and Security are the Typical Blockers for Public Cloud

Key for success, especially in hybrid architectures, is automation and fleet management with CI/CD and GitOps for multi-cluster management. Many projects leverage Kubernetes as a cloud-native infrastructure for the edge and private cloud. However, in the public cloud, the first option should always be a fully managed service (if security and other requirements allow it).

Be careful when adopting fully managed MQTT cloud services: support for MQTT is not equal across cloud vendors. Many vendors do not implement the entire protocol, miss features, and impose usage limitations. HiveMQ wrote a great article showing this. The article is a bit outdated (and opinionated, of course, as a competing MQTT vendor). But it shows very well how some vendors provide offerings that are far away from a good MQTT cloud solution.

The hardest problem for public cloud adoption of MQTT is security! Double-check the requirements early. Latency, availability or specific features are usually not the problem. The deployment and integration need to be compliant and follow the cloud strategy. As Industrial IoT projects always have to include some kind of edge story, it is a tougher discussion than for sales or marketing projects.

Trend 2: Data Governance for MQTT

Data governance is crucial across the enterprise. From an IoT and MQTT perspective, the two main topics are unified namespace as the concept and Sparkplug B as the technology.

Unified Namespace for Industrial IoT

In the context of Industrial Internet of Things (IIoT), a unified namespace (UNS) typically refers to a standardized and cohesive way of naming and organizing devices, data, and resources within an industrial network or ecosystem. The goal is to provide a consistent naming structure that facilitates interoperability, data sharing, and management of IIoT devices and systems.

The term Unified Namespace (in Industrial IoT) was coined and popularized by Walker Reynolds, an expert and content creator for Industrial IoT.

Concepts of Unified Namespace

Here are some key aspects of a unified namespace in Industrial IoT:

  1. Device Naming: Devices in an IIoT environment may come from various manufacturers and have different functionalities. A unified namespace ensures that devices are named consistently, making it easier for administrators, applications, and other devices to identify and interact with them.
  2. Data Naming and Tagging: IIoT involves the generation and exchange of vast amounts of data. A unified namespace includes standardized naming conventions and tagging mechanisms for data points, variables, or attributes associated with devices. This consistency is crucial for applications that need to access and interpret data across different devices.
  3. Interoperability: A unified namespace promotes interoperability by providing a common framework for devices and systems to communicate. When devices and applications follow the same naming conventions, it becomes easier to integrate new devices into existing systems or replace components without causing disruptions.
  4. Security and Access Control: A well-defined namespace contributes to security by enabling effective access control mechanisms. Security policies can be implemented based on the standardized names and hierarchies, ensuring that only authorized entities can access specific devices or data.
  5. Management and Scalability: In large-scale industrial environments, having a unified namespace simplifies device and resource management. It allows for scalable solutions where new devices can be added or replaced without requiring extensive reconfiguration.
  6. Semantic Interoperability: Beyond just naming, a unified namespace may include semantic definitions and standards. This helps in achieving semantic interoperability, ensuring that devices and systems understand the meaning and context of the data they exchange.

Overall, a unified namespace in Industrial IoT is about establishing a common and standardized structure for naming devices, data, and resources, providing a foundation for efficient, secure, and scalable IIoT deployments. Standards organizations and industry consortia often play a role in developing and promoting these standards to ensure widespread adoption and compatibility across diverse industrial ecosystems.
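
In code, a unified namespace often boils down to one shared topic-building convention. A hypothetical helper like the following enforces a consistent ISA-95-style hierarchy for every publisher (the level names and example values are assumptions):

```python
def uns_topic(enterprise: str, site: str, area: str,
              line: str, cell: str, tag: str) -> str:
    """Build a topic path following one consistent naming convention."""
    parts = [enterprise, site, area, line, cell, tag]
    return "/".join(p.lower().replace(" ", "-") for p in parts)

print(uns_topic("acme", "munich", "press-shop", "line-2", "robot-4", "temperature"))
# -> acme/munich/press-shop/line-2/robot-4/temperature
```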

Sparkplug B: Interoperability and Standardized Communication for MQTT Topics and Payloads

Unified Namespace is the theoretical concept for interoperability. The standardized implementation for payload structure enforcement is Sparkplug B. This is a specification created at the Eclipse Foundation and later turned into an ISO standard.

Sparkplug B provides a set of conventions for organizing data and defining a common language for devices to exchange information. Here is an example from HiveMQ depicting how a unified namespace makes communication between devices, systems, and sites easier:

HiveMQ Unified Namespace
Source: HiveMQ

Key features of Sparkplug B include:

  1. Payload Structure: Sparkplug B defines a specific format for the payload of MQTT messages. This format includes fields for information such as timestamp, data type, and value. This standardized payload structure ensures that devices can consistently understand and interpret the data being exchanged.
  2. Topic Namespace: The specification defines a standardized topic namespace for MQTT messages. This helps in organizing and categorizing messages, making it easier for devices to discover and subscribe to relevant information.
  3. Birth and Death Certificates: Sparkplug B introduces the concept of “Birth” and “Death” certificates for devices. When a device comes online, it sends a Birth certificate with information about itself. Conversely, when a device goes offline, it sends a Death certificate. This mechanism aids in monitoring the status of devices within the IIoT network.
  4. State Management: The specification includes features for managing the state of devices. Devices can publish their current state, and other devices can subscribe to receive updates. This helps in maintaining a synchronized view of device states across the network.

Sparkplug B is intended to enhance the interoperability, scalability, and efficiency of IIoT deployments by providing a standardized framework for MQTT communication in industrial environments. Its adoption can simplify the integration of diverse devices and systems within an industrial ecosystem, promoting seamless communication and data exchange.
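
The standardized topic namespace follows the pattern spBv1.0/<group_id>/<message_type>/<edge_node_id>[/<device_id>], with message types such as NBIRTH/NDEATH (node birth and death certificates), DBIRTH/DDEATH, and NDATA/DDATA. A small sketch parsing such a topic; the example values are hypothetical:

```python
def parse_sparkplug_topic(topic: str) -> dict:
    """Split a Sparkplug B topic into its standardized components."""
    parts = topic.split("/")
    assert parts[0] == "spBv1.0", "not a Sparkplug B topic"
    return {
        "group_id": parts[1],
        "message_type": parts[2],  # e.g. NBIRTH, DDATA, NDEATH
        "edge_node_id": parts[3],
        "device_id": parts[4] if len(parts) > 4 else None,
    }

print(parse_sparkplug_topic("spBv1.0/plant01/DDATA/gateway-3/sensor-42"))
# {'group_id': 'plant01', 'message_type': 'DDATA',
#  'edge_node_id': 'gateway-3', 'device_id': 'sensor-42'}
```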

Limitations of Sparkplug B

Sparkplug B has a few limitations, such as:

  • Only supports Quality of Service (QoS) 0, providing at-most-once message delivery guarantees.
  • Limits in the structure of topic namespaces.
  • Very device-centric (but MQTT is for many “things”).

Understand the pros and cons of Sparkplug B. It is perfect for some use cases, but the above limitations are blockers for others. In particular, supporting only QoS 0 is a huge limitation for mission-critical use cases.

Trend 3: MQTT vs. OPC-UA Debates

MQTT has many benefits compared to other industrial protocols. However, OPC-UA is another standard in the IoT space that gets at least as much traction in the market as MQTT. The debate about choosing the right IoT standard is controversial, often led by emotions and opinions, and still absolutely valid to discuss.

OPC-UA (Open Platform Communications Unified Architecture) is a machine-to-machine communication protocol for industrial automation. It enables seamless and secure communication and data exchange between devices and systems in various industrial settings.

OPC UA has become a widely adopted standard in the industrial automation and control domain, providing a foundation for secure and interoperable communication between devices, machines, and systems. Its open nature and support from industry organizations contribute to its widespread use in applications ranging from manufacturing and process control to energy management and more.

If you look at the promises of MQTT and OPC-UA, a lot of overlap exists:

  • Scalable
  • Reliable
  • Real-time
  • Open
  • Standardized

All of them are true for both standards. Still, trade-offs exist. I won’t start a flame war here. Just search for “MQTT vs. OPC-UA”. You will find many blog posts, articles and videos. Most are very opinionated (and often driven by a vendor). Reality is that the industry adopted both MQTT and OPC-UA widely.

And while the above characteristics might all be true for both standards in general, the details make the difference for specific implementations. For instance, if you try to connect plenty of Siemens S7 PLCs via OPC-UA, then you quickly realize that the number of parallel connections is not as scalable as the OPC-UA standard specification tells you.

When to Choose MQTT vs. OPC-UA?

The clear recommendation is to start with the business problem, not the technology. Evaluate both standards and their implementations, supported interfaces, vendors, cloud services, etc. Then choose the right technology.

Here is what I use as a simplified rule of thumb if you have to start a technical discussion:

  • MQTT: Use cases for connected IoT devices, vehicles, and other interfaces with support for lightweight infrastructure, large number of connections, and/or bad networks.
  • OPC-UA: Use cases for industrial automation to connect heavy equipment, PLCs, SCADA systems, data historians, etc.

This is just a rule of thumb. And the situation changes. Modern PLCs and other equipment add support for multiple protocols to be more flexible. But, nowadays, you rarely have an option anyway because specific equipment, devices, or vehicles only support one or the other. And you can still be happy: otherwise, you need to use another IIoT platform to connect to proprietary legacy protocols like S7, Modbus, et al.

MQTT and OPC-UA Gotchas

A few additional gotchas I realized from various customer conversations around the world in the past quarters:

  • In theory, MQTT and OPC-UA work well together, i.e., MQTT is the underlying transportation protocol for OPC-UA. I have not seen this yet in the real world (no statistical evidence, just my personal experience). But what I do see is the combination of OPC-UA for the last-mile integration to the PLC and then forwarding the data to other consumers via MQTT. All in a single gateway, usually a proprietary IoT platform.
  • OPC-UA defines many sub-standards for different industries or use cases. In theory, this is great. In practice, I see this more like the WS-* hell in the SOAP/WSDL web service world, where most projects moved to much simpler HTTP/REST architectures. Similarly, most integrations I see to OPC-UA use simple, custom-coded clients in Java or other programming languages – because the tools don’t support the complex standards.
  • IoT vendors pitch any possible integration scenario in marketing. I am amazed that MQTT and OPC-UA platforms directly integrate with MES and ERP systems like SAP, and any data warehouse and data lake, like Google BigQuery, Snowflake, or Databricks. But that’s only the theory. Should you really do this? And did you ever try to connect SAP ECC to MQTT or OPC-UA? Good luck from a technical, and even harder, from an organizational perspective. And do you want tight coupling and point-to-point communication between the OT world and the ERP? In most cases, it is a good thing to have a clear separation of concerns between different business units, domains, and use cases. Choose the right tool and enterprise architecture; not just for the POC and first pipeline, but for the entire long-term strategy and vision.

The last point brings me to another growing trend: The combination of MQTT for IoT / OT workloads and data streaming with Apache Kafka for the integration with the IT world.

Trend 4: MQTT and Apache Kafka for OT/IT Data Processing

Contrary to MQTT, Apache Kafka is NOT an IoT platform. Instead, Kafka is an event streaming platform, used as the underpinning of event-driven architectures for various use cases across industries. It provides a scalable, reliable, and elastic real-time platform for messaging, storage, data integration, and stream processing. Apache Kafka and MQTT are a perfect combination for many IoT use cases.

Manufacturing with MQTT, Sparkplug B, Apache Kafka and SAP ERP for the Smart Factory

Let’s explore the pros and cons of both technologies from the IoT perspective.

Trade-offs of MQTT

MQTT’s pros:

  • Lightweight
  • Built for thousands of connections
  • All programming languages supported
  • Built for poor connectivity / high latency scenarios
  • High scalability and availability (depending on broker implementation)
  • ISO standard
  • Most popular IoT protocol (competing with OPC UA)

MQTT’s cons:

  • Adoption mainly in IoT use cases
  • Only pub/sub, not stream processing
  • No reprocessing of events

Trade-offs of Apache Kafka

Kafka’s pros:

  • Stream processing, not just pub/sub
  • High throughput
  • Large scale
  • High availability
  • Long-term storage and buffering
  • Reprocessing of events
  • Good integration to rest of the enterprise

Kafka’s cons:

  • Not built for tens of thousands of connections
  • Requires stable network and good infrastructure
  • No IoT-specific features like keep alive, last will, or testament

Use Cases, Architectures and Case Studies for MQTT and Kafka

I wrote a blog series about MQTT in conjunction with Apache Kafka with many more technical details and real-world case studies across industries.

The first blog post explores the relation between MQTT and Apache Kafka. Afterward, the other four blog posts discuss various use cases, architectures, and reference deployments.

  • Part 1 – Overview: Relation between Kafka and MQTT, pros and cons, architectures
  • Part 2 – Connected Vehicles: MQTT and Kafka in a private cloud on Kubernetes; use case: remote control and command of a car
  • Part 3 – Manufacturing: MQTT and Kafka at the edge in a smart factory; use case: Bidirectional OT-IT integration with Sparkplug B between PLCs, IoT Gateways, Data Historian, MES, ERP, Data Lake, etc.
  • Part 4 – Mobility Services: MQTT and Kafka leveraging serverless cloud infrastructure; use case: Traffic jam prediction service using machine learning
  • Part 5 – Smart City: MQTT at the edge connected to fully-managed Kafka in the public cloud; use case: Intelligent traffic routing by combining and correlating different 1st and 3rd party services

The following presentation is from my talk at the MQTT Summit. It explores various use cases and reference architectures for MQTT and Apache Kafka.


If you have a bad network, tens of thousands of clients, or the need for a lightweight push-based messaging solution, then MQTT is the right choice. Otherwise, Kafka, a powerful event streaming platform, is probably the right choice for real-time messaging, data integration, and data processing. In many IoT use cases, the architecture combines both technologies. And even in the industrial space, various projects use Kafka for use cases like building a cloud-native data historian or real-time condition monitoring and predictive maintenance.

Data Governance for MQTT with Sparkplug and Kafka (and Beyond)

The Unified Namespace and its concrete implementation with Sparkplug B are excellent for data governance in IoT workloads with MQTT. In a similar way, the Schema Registry defines the data contracts for Apache Kafka data pipelines.

Schema Registry should be the foundation of any Kafka project! Data contracts (aka Schemas, similar to Swagger in REST/HTTP APIs) enforce good data quality and interoperability between independent microservices in the Kafka ecosystem. Each business unit and its data products can choose any technology or API. But data sharing with others works only with good (enforced) data quality.
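
To make the idea of an enforced data contract concrete, here is a minimal sketch of a Java producer using Confluent’s Avro serializer together with Schema Registry. The topic name, schema, and URLs are illustrative assumptions, not taken from a real project:

    import java.util.Properties;
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SensorProducer {
        // The data contract: every consumer can rely on this structure being enforced
        private static final String SENSOR_SCHEMA =
            "{\"type\":\"record\",\"name\":\"SensorReading\",\"fields\":["
            + "{\"name\":\"machineId\",\"type\":\"string\"},"
            + "{\"name\":\"temperature\",\"type\":\"double\"},"
            + "{\"name\":\"timestamp\",\"type\":\"long\"}]}";

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");          // assumption: local cluster
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            // The Avro serializer registers and validates the schema against Schema Registry
            props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
            props.put("schema.registry.url", "http://localhost:8081"); // assumption: local registry

            Schema schema = new Schema.Parser().parse(SENSOR_SCHEMA);
            GenericRecord reading = new GenericData.Record(schema);
            reading.put("machineId", "press-07");
            reading.put("temperature", 81.4);
            reading.put("timestamp", System.currentTimeMillis());

            try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
                // The send fails fast if the record violates the registered data contract
                producer.send(new ProducerRecord<>("machine.telemetry", "press-07", reading));
            }
        }
    }

A producer with an incompatible schema is rejected before bad data ever reaches downstream consumers – that is the whole point of a data contract.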

You can see the issue: Each technology uses its own data governance technology. If you add your favorite data lake, you will add another concept, like Apache Iceberg, to define the data tables for analytics storage systems. And that’s okay! Each data governance suite is optimized for its workloads and requirements. Company-wide master data management initiatives failed over the last two decades because each software category has different requirements.

Hence, one clear trend I see is an enterprise-wide data governance strategy across the different systems (with technologies like Collibra or Azure Purview). It has open interfaces and integrates with specific data contracts like Sparkplug B for MQTT, Schema Registry for Kafka, Swagger for HTTP/REST applications,  or Iceberg for data lakes. Don’t try to solve the entire enterprise-wide data governance strategy with a single technology. It will fail! We have seen this before…

Legacy PLC (S7, Modbus, BACnet, etc.) with MQTT or Kafka?

MQTT and Kafka enable reliable and scalable end-to-end data pipelines between IoT and IT systems. At least, if you can use modern APIs and standards. Most IoT projects today are still brownfield. A lot of legacy PLCs, SCADA systems, and data historians only support proprietary protocols like Siemens S7, Modbus, BACnet, and so on.

Neither MQTT nor Kafka supports these legacy protocols out-of-the-box. Additional middleware is required. Usually, enterprises choose a dedicated IoT platform for this. That means more cost, more complexity, and slower projects.

In the Kafka world, Apache PLC4X is a great open source option if you want to build a modern, cloud-native data historian with Kafka. The framework provides integration with many legacy protocols. And it offers a Kafka Connect connector. The main issue is that there is no official vendor support behind it. Companies cannot buy support with a 24/7 business model for mission-critical applications. And that’s typically a blocker for any industrial deployment.
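
To give an impression of what such an integration looks like, here is a minimal sketch using the classic PLC4X Java API (newer versions renamed some builder methods) to read a value from a Siemens S7 PLC. The driver URL and S7 address are illustrative assumptions; a real deployment would forward the values to a Kafka topic or run the PLC4X Kafka Connect connector instead:

    import org.apache.plc4x.java.PlcDriverManager;
    import org.apache.plc4x.java.api.PlcConnection;
    import org.apache.plc4x.java.api.messages.PlcReadResponse;

    public class S7Reader {
        public static void main(String[] args) throws Exception {
            // Assumption: an S7 PLC reachable at this address; the field syntax is illustrative
            try (PlcConnection plc = new PlcDriverManager().getConnection("s7://192.168.0.10")) {
                PlcReadResponse response = plc.readRequestBuilder()
                        .addItem("motor-current", "%DB1.DBW20:INT") // logical name -> PLC address
                        .build()
                        .execute()
                        .get();
                // In a real pipeline, this value would be produced to a Kafka topic
                System.out.println("motor-current = " + response.getInteger("motor-current"));
            }
        }
    }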

As MQTT is only a pub/sub message broker, it cannot help with legacy protocol integration. HiveMQ tries to solve this challenge with a new framework called HiveMQ Edge: a software-based industrial edge protocol converter. It is a young project that is just kicking off. The core is open source. The first supported legacy protocol is Modbus. I think this is an excellent product strategy. I hope the project gets traction and evolves to support many other legacy IIoT technologies to modernize the brownfield shop floor. The project also supports OPC-UA. We will see how much demand that feature creates, too.

MQTT and Sparkplug Adoption Grows Year-By-Year for IoT Use Cases

In the IoT world, MQTT and OPC UA have established themselves as open and platform-independent standards for data exchange in Industrial IoT and Industry 4.0 use cases. Data Streaming with Apache Kafka is the data hub for integrating and processing massive volumes of data at any scale in real-time. “The Trinity of Data Streaming in IoT” explores the combination of MQTT, OPC-UA, and Apache Kafka in more detail.

MQTT adoption grows year by year with the need for more scalable, reliable and open IoT communication between devices, equipment, vehicles, and the IT backend. The sweet spots of MQTT are unreliable networks, lightweight (but reliable and scalable) communication and infrastructure, and connectivity to thousands of things.

Maturing trends like the Unified Namespace with Sparkplug B, fully managed cloud services, and combined usage with Apache Kafka make MQTT one of the most relevant IoT standards across verticals like manufacturing, automotive, aviation, logistics, and smart city.

But don’t get fooled by architecture pictures and theory. For example, most diagrams for MQTT and Sparkplug show integrations with the ERP (e.g., SAP) and Data Lake (e.g., Snowflake). Should you really integrate directly from the OT world into the analytics platform? Most times, the answer is no because of cost, decoupling of business units, legal issues, and other reasons. This is where the combination of MQTT and Kafka (or another integration platform) shines.

How do you use MQTT and Sparkplug today? What are the use cases? Do you combine it with other technologies, like Apache Kafka, for end-to-end integration across the OT/IT pipeline? Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.

The post MQTT Market Trends: Cloud, Unified Namespace, Sparkplug, Kafka Integration appeared first on Kai Waehner.

A cloud-native SCADA System for Industrial IoT built with Apache Kafka https://www.kai-waehner.de/blog/2022/10/04/cloud-native-scada-system-for-industrial-iot-with-apache-kafka/ Tue, 04 Oct 2022 11:10:06 +0000 https://www.kai-waehner.de/?p=4874

Industrial IoT and Industry 4.0 enable digitalization and innovation. SCADA control systems are a vital component of IT/OT modernization. The SCADA evolution started with monolithic applications and moved to networked and web-based platforms. This blog post explores building the 5th generation: a cloud-native SCADA infrastructure with Apache Kafka. A real-world case study of a German transmission system operator for electricity shows how the journey toward open and scalable real-time workloads and edge-to-cloud integration progressed.

Cloud Native SCADA Industrial IoT with Apache Kafka Data Streaming

What is a SCADA system?

Supervisory control and data acquisition (SCADA) is a control system architecture comprising computers, networked data communications, and graphical user interfaces for high-level supervision of machines and processes. It also covers sensors and other devices, such as programmable logic controllers, which interface with process plants or machinery.

While many people refer to specific commercial products, SCADA is a concept or architecture. It can include various components, functions, and products (from different vendors) on different levels:

Functional levels of a Distributed Control System aka SCADA

Wikipedia has a detailed article explaining the terms, history, components, and functions of SCADA. The evolution describes four generations of SCADA systems:

  1. First generation: Monolithic
  2. Second generation: Distributed
  3. Third generation: Networked
  4. Fourth generation: Web-based

The evolution did not stop here. The following explores the 5th generation: cloud-native and open SCADA systems.

How does Apache Kafka help in Industrial IoT?

Industrial IoT (IIoT) and Industry 4.0 create a few new challenges across industries:

  • The need for a much bigger scale
  • The demand for real-time information
  • Hybrid architectures with mission-critical workloads at the edge and analytics in elastic public cloud infrastructure.
  • A flexible Open API culture and data sharing across OT/IT environments and between partners (e.g., suppliers, OEMs, and mobility services).

Apache Kafka is unique in its characteristics for IoT infrastructures, being very scalable (for transactional and analytical requirements and SLAs), reliable, and open. Hence, many new Industrial IoT projects adopt Apache Kafka for various use cases, including data hub between OT and IT, global integration of smart factories for analytics, predictive maintenance, customer 360, and many other scenarios.

Cloud-native data historian powered by Apache Kafka (operating at the edge or in the cloud)

Data Historian is a well-known concept in Industrial IoT. It helps to ensure and improve the Overall Equipment Effectiveness (OEE). The term often overlaps with SCADA. Some people even use it as a synonym.

Apache Kafka can be used as a component of a Data Historian to improve the OEE and reduce/eliminate the most common causes of equipment-based productivity loss in manufacturing (aka Six Big Losses):

Apache Kafka as open scalable Data Historian for IIoT with MQTT and OPC UA

Continuous real-time data ingestion, processing, and monitoring 24/7 at scale is a crucial requirement for thriving Industry 4.0 initiatives. Data Streaming with Apache Kafka and its ecosystem brings enormous value to implementing these modern IoT architectures.

Let’s explore a concrete example of a cloud-native SCADA system.

50hertz: A case study for a cloud-native SCADA system built with Apache Kafka

50hertz is a transmission system operator for electricity in Germany. The company secures electricity supply to 18 million people in northern and eastern Germany.

The infrastructure must operate 24 hours a day, seven days a week. Various shift teams and a mission-critical SCADA infrastructure supervise and control the OT systems.

50hertz presented their OT/IT and SCADA modernization leveraging data streaming with Apache Kafka at the Confluent Data in Motion tour 2021. The on-demand video recording is available (the speech is in German, unfortunately).

The Journey of 50hertz in a big picture

Look at this fantastic picture of 50hertz’s digital transformation journey from monolithic and proprietary legacy technology to a modern cloud-native integration platform powered by Kafka to modernize their IoT ecosystem, such as SCADA systems:

50hertz Journey OT IT Modernization
Source: 50hertz

Notice the details in the above picture:

  • The legacy infrastructure on the left side glues and patches together different components. It is close to collapsing. No changes are possible to existing components.
  • The new infrastructure on the right side is based on flexible, standardized containers. It is easy to scale, add, or remove applications. The communication happens via standardized interfaces and schemas.
  • The bridge in the middle shows the journey. This is a brownfield approach where the old and new world have to communicate with each other for many years. Over time, the company can shut down more and more of the legacy infrastructure.

A great example of innovation in the energy sector! Let’s explore the details of building a cloud-native SCADA system with Apache Kafka:

Challenges of the monolithic legacy IoT infrastructure

The old IT/OT infrastructure and SCADA system are monolithic, proprietary, not scalable, and lack open APIs based on standard interfaces:

50hertz Legacy Monolith Modular Control Center System
Source: 50hertz

A very common infrastructure setup. Most existing OT/IT infrastructures have exactly the same challenges. This is how factories and production lines were built in the past decades.

The consequence is inflexibility regarding software updates, hardware changes, security fixes, and no option for scalability or innovation. Applications run in disconnected mode and are air-gapped from the internet because the old Windows servers are not even supported and no longer get security patches.

Digital transformation in the industrial space requires modernization. Legacy infrastructure still needs to be integrated in most scenarios. Not every company starts like Tesla, building brand-new factories designed with automation and digitalization from scratch.

Cloud-native SCADA with Kafka to enable innovation (and legacy integration)

50hertz next-generation Modular Control Center System (MCCS) leverages a central, scalable, event-based integration platform based on Confluent:

Cloud-native SCADA system built with Apache Kafka at 50hertz
Source: 50hertz

The first four containers include the Supervisory & Control (SCADA), Load Frequency Control (LFC), and Time Series Management & Forecasting applications. Each container can have multiple services/functions that follow the event-based microservices pattern.

50hertz provides central governance for security, protocols, and data schemas (CIM compliant) between platform containers/modules. The cloud-native 24/7 SCADA system is developed in the cloud and deployed in safety-critical edge environments.

More on data streaming and Industrial IoT

If you want to learn more about real-world case studies, use cases, and technical architectures for data streaming with Apache Kafka in IIoT scenarios, check out the related articles on this blog.

If this is insufficient, please let me know what else you need to know… 🙂

Cloud-native architectures and Open API are the future of Industrial IoT

50hertz is a tremendous real-world case study about the modernization of the OT/IT world. A modern SCADA architecture requires real-time data processing at any scale, true decoupling between data producers and consumers (no matter what API these apps use), and open interfaces to integrate with any other application like MES, ERP, cloud services, and so on.

From the IT side, this is nothing new. The last decade brought up scalable open source technologies like Kafka, Spark, Flink, Iceberg, and many more, plus related fully managed, elastic cloud services like Confluent Cloud, Databricks, Snowflake, and so on.

However, the OT side has to change. Instead of using monolithic legacy systems, unsupported and unstable Windows servers, and proprietary protocols, next-generation SCADA systems need to use the same cloud-native IT systems, adopt modern OT hardware/software combinations, and integrate the old and new world to enable digitalization and innovation in industry verticals like manufacturing, automotive, military, energy, and so on.

What role does data streaming play in your Industrial IoT environments and OT/IT modernization? Do you run everything around Kafka in the cloud or operate hybrid edge scenarios? What tasks does Kafka take over – is it “just” the data hub, or are IoT use cases built with it, too? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post A cloud-native SCADA System for Industrial IoT built with Apache Kafka appeared first on Kai Waehner.

OPC UA, MQTT, and Apache Kafka – The Trinity of Data Streaming in IoT https://www.kai-waehner.de/blog/2022/02/11/opc-ua-mqtt-apache-kafka-the-trinity-of-data-streaming-in-industrial-iot/ Fri, 11 Feb 2022 03:44:16 +0000 https://www.kai-waehner.de/?p=4221

In the IoT world, MQTT (Message Queuing Telemetry Transport) and OPC UA (OPC Unified Architecture) have established themselves as open and platform-independent standards for data exchange in Industrial IoT (IIoT) and Industry 4.0 use cases. Data Streaming with Apache Kafka is the data hub for integrating and processing massive volumes of data at any scale in real-time. This blog post explores the relationship between Kafka and the IoT protocols, when to use which technology, and why sometimes HTTP/REST is the better choice. The end explores real-world case studies from Audi and BMW.

The Trinity of Data Streaming in Industrial IoT - Apache Kafka MQTT OPC UA

Industry 4.0: Data streaming platforms increase overall plant effectiveness and connect equipment

Machine data must be transformed and made available across the enterprise as soon as it is generated to extract the most value from the data. As a result, operations can avoid critical failures and increase the effectiveness of their overall plant.

Automotive manufacturers such as BMW and Tesla have already recognized the potential of data streaming platforms to get their data moving with the power of the Apache Kafka ecosystem. Let’s explore the benefits of data streaming and how this technology enriches data-driven manufacturing companies.

The goals of increasing digitization and automation of the manufacturing sector are many:

  • Make production processes more efficient
  • Make production faster and cheaper overall
  • Minimize error rates

Manufacturers are also striving to increase overall equipment effectiveness (OEE) in their production facilities – from product design and manufacturing to maintenance operations. This confronts them with equally diverse challenges. Industry 4.0 and Industrial IoT (IIoT) mean that the amount of data generated daily is increasing and needs to be transported, processed, analyzed, and made available to systems in near real-time.

Complicating matters further is that legacy IT environments continue to live in today’s manufacturing facilities. This limits manufacturers’ ability to efficiently integrate data across operations. Therefore, most manufacturers require a hybrid data replication and synchronization strategy.

An adaptive manufacturing strategy starts with real-time data

Automation.com published an excellent article explaining the need for real-time processes and monitoring to provide a flexible production line. TL;DR: Processes should be real-time when possible, but real-time is not always possible, even within an application. Think about just-in-time production fighting with supply chain issues caused by the Covid pandemic and the Suez Canal blockage in 2021.

The theory of just-in-time production does not work with supply chain issues! You need to provide flexibility and be able to switch between different approaches:

  • Just-in-time (JIT) vs. make to forecast
  • Fixed vs. variable price contracts
  • Build vs. buy plant capacity
  • Staffed vs. lights-out third shift
  • Linking vs. not linking prices for materials and finished goods

Kappa architecture for a real-time IoT data hub

Real-time production and process monitoring data are essential for success! This evolution is only possible with a real-time Kappa architecture. A Lambda architecture with batch workloads either completely fails or makes things much more complex and costly from an IT infrastructure and OEE perspective.

For clarification, when I speak about real-time, I talk about millisecond latency. This is not hard real-time and deterministic like in safety-critical and embedded environments. The post “Apache Kafka is NOT Hard Real-Time BUT Used Everywhere in Automotive and Industrial IoT” elaborates on this topic.

In IoT, MQTT and OPC UA are established, platform-independent open standards for data exchange. See what the combination of these IoT protocols and Kafka looks like in a smart factory.

When to use Kafka vs. MQTT and OPC UA?

Kafka is a fantastic data streaming platform for messaging, storage, data integration, and data processing in real-time at scale. However, it is not a silver bullet for every problem!

Kafka is NOT…

  • A proxy for millions of clients (like mobile apps) – but Kafka-native proxies (like REST or MQTT) exist for some use cases.
  • An API Management platform – but these tools are usually complementary and used for the creation, life cycle management, or the monetization of Kafka APIs.
  • A database for complex queries and batch analytics workloads – but good enough for transactional queries and relatively simple aggregations (especially with ksqlDB).
  • An IoT platform with features such as device management – but direct Kafka-native integration with (some) IoT protocols such as MQTT or OPC-UA is possible and the appropriate approach for (some) use cases.
  • A technology for hard real-time applications such as safety-critical or deterministic systems – but that’s true for any other IT framework, too. Embedded systems are different software!

For these reasons, Kafka is complementary, not competitive, to MQTT and OPC UA. Choose the right tool for the job and combine them! I wrote a detailed blog post exploring when NOT to use Apache Kafka. The above was just the summary.

You should also think about this question from the other side to understand when a message broker is not the right choice. For instance, United Manufacturing Hub is an open-source manufacturing data infrastructure that recently migrated from MQTT as messaging infrastructure to Kafka as the central nervous system because of its storage capabilities, higher throughput, and guaranteed ordering. However, to be clear, this update is not replacing but complementing MQTT with Kafka.

Meeting the challenges of Industry 4.0 through data streaming and data mesh

Machine-to-machine communications and the (Industrial) Internet of Things enable automation, data-driven monitoring, and the use of intelligent machines that can, for example, identify defects and vulnerabilities on their own.

For all these scenarios, large volumes of data must be processed in near real-time and made available across plants, companies, and, under certain circumstances, worldwide via a stream data exchange:

Hybrid and Global Apache Kafka and Event Streaming Use Case

This novel design approach is often implemented with Apache Kafka as a decentralized data mesh for data streaming.

The essential requirement here is integrating various systems, such as edge and IoT devices and business software, and execution independent of the underlying infrastructure (edge, on-premises as well as public, multi-, and hybrid cloud).

Therefore, an open, elastic, and flexible architecture is essential to integrate with the legacy environment while taking advantage of modern cloud-native applications.

Event-driven, open, and elastic data streaming platforms such as Apache Kafka serve precisely these requirements. They collect relevant sensor and telemetry data alongside data from information technology systems and process it while it is in motion. That concept is called “data in motion”. This fundamental change differs significantly from processing “data at rest”, meaning you store events in a database and wait until someone else looks at them later. The latter is a “too late architecture” in many IoT use cases.

Separation of concerns in the OT/IT world with domain-driven design and true decoupling

Data integration with legacy and modern systems takes place in near real-time – target systems can use relevant data immediately. It doesn’t matter what infrastructure the plant’s IT landscape is built on. Besides the continuous flow of data, the decoupling of systems also allows messages to be stored until the target systems need them.

That feature of true decoupling with backpressure handling and replayability of data is a unique differentiator compared to other messaging systems like RabbitMQ in the IT space or MQTT in the IoT space. Kafka is also highly available and fail-safe, which is critical in the production environment. “Domain-driven design (DDD) with Apache Kafka” dives deeper into this benefit:

Domain Driven Design DDD with Kafka for Industrial IoT MQTT and OPC UA

How to choose between OPC UA and MQTT with Kafka?

Three de facto standards exist for open and standardized IoT architectures: two IoT-specific protocols, plus REST / HTTP as a simple (and often good enough) option. Modern proprietary protocols compete in this space, too:

  • OPC UA (Open Platform Communications Unified Architecture)
  • MQTT (Message Queuing Telemetry Transport)
  • REST / HTTP
  • Proprietary protocols and IoT platforms

These options are a great alternative to the legacy, proprietary, monolithic world of the last decades in the OT/IT and IoT space.

MQTT vs. OPC UA (vs. HTTP vs. Proprietary)

First of all, this discussion is only relevant if you have the choice. If you buy and install a new machine or PLC on your shop floor and that one only offers a specific interface, then you have to use it. However, new software like IoT gateways provides different options to choose from.

How to compare these communication protocols?

Well, frankly, it is challenging as most literature is opinionated and often includes FUD about the “competing protocols”. Every alternative has its sweet spots. Hence, it is more of an apples and oranges comparison.

More or less randomly, I googled “OPC UA vs MQTT” and found the following interesting comparison from Skynet’s proprietary DataHub Transfer Protocol (DHTP). The vendor pitches its commercial product against the open standards (and added AMQP as an additional alternative):

IIoT protocol comparison

Each comparison on the web differs. The above comparison is valid (and some people will disagree with some points). And sometimes, proprietary solutions provide the better choice from a TCO and ROI perspective, too.

Hint: Look at different comparisons. Understand if the publication is related to a specific vendor and standard. Evaluate several solutions and vendors to understand the differences and added value.

Decision tree for evaluating IoT protocols

My recommendation for comparing the different IoT protocols is to use open standards whenever possible. Choose the right tool for the job and combine them in a best-of-breed approach as needed.

Let’s take a look at a simple decision tree to decide between OPC UA, MQTT, HTTP, and proprietary IIoT protocols (note: this is just a very simplified point of view, and you can build your own opinion with different decisions, of course):

Decision Tree for Industrial IoT - MQTT, OPC UA, HTTP REST

A few notes on the reasoning for how I built this decision tree:

  • HTTP / REST is perfect for simple use cases (keep it as simple as possible). HTTP is supported almost everywhere, well understood, and simple to use. No additional tooling, APIs, or middleware is needed. Communication is synchronous request-response. Conversations with security teams are much easier if you just need to open port 80 or 443 for HTTP(S) instead of TCP ports, like most other protocols. HTTP is unidirectional communication (e.g., a connected car needs an HTTP server to get data pushed from the cloud – pub/sub is the right choice instead of HTTP here).
  • MQTT is perfect for intermittent networks, respectively limited bandwidth and/or connecting tens or hundreds of thousands of devices (e.g., connected car infrastructure). Communication is asynchronous publish/subscribe using an MQTT broker as the middleman. MQTT uses no standard data format. But developers can use Sparkplug as an add-on built for this purpose. MQTT is incredibly lightweight. Features like Quality of Service (QoS), last will, and testament solve many requirements for IoT use cases out-of-the-box. MQTT is excellent for IT use cases and can easily be used for bidirectional communication (e.g., connected cars <–> cloud communication). LoRaWAN and other low-power wide-area networks are great for MQTT, too.
  • OPC UA is perfect for industrial automation (e.g., machines at the production line). Communication is usually client/server today, but publish/subscribe is also supported. It uses standard data formats and provides a rich (= powerful but also complex) set of features, components, and industry-specific data formats. OPC UA is excellent for OT/IT integration scenarios. OPC UA TSN (time-sensitive networking), one optional component, is an Ethernet communication standard that provides open, deterministic, hard real-time communication.
  • Proprietary protocols suit specific problems that standard-based implementations cannot solve similarly. These protocols have various trade-offs. Often powerful and performant, but also expensive and proprietary.

Choosing between OPC UA, MQTT, and other protocols isn’t an either/or decision. Each protocol plays its role and excels at certain use cases. An optimal modern industrial network uses OPC UA and MQTT for modern applications. Both together combine the strengths of each and mitigate their downsides. Legacy applications and proprietary SCADA systems or other data historians are usually integrated with other existing proprietary middleware.

Many IIoT platforms, such as Siemens, OSIsoft, or Inductive Automation, support various modern and legacy protocols. Some smaller vendors focus on a specific sweet spot, like HiveMQ for MQTT or OPC Router for OPC-UA.

Integration between MQTT / OPC UA and Kafka

A few integration options between equipment, machines, and devices that support MQTT or OPC UA and Kafka are:

  • Kafka Connect connectors: Native Kafka integration on protocol level. Check Confluent Hub for a few alternatives. Some enterprises built their custom Kafka Connect connectors.
  • Custom integration: Integration via a low-level MQTT / OPC UA API (e.g., using Kafka’s HTTP / REST Proxy) or a Kafka client (e.g., .NET / C++ for Windows environments) – see the sketch after this list.
  • Modern and open 3rd party IoT middleware: Generic open source integration middleware (e.g., Apache Camel with its IoT connectors), IoT-specific frameworks (like Apache PLC4X or Eclipse Ditto), or proprietary 3rd party IoT middleware with open and standards-based APIs
  • Commercial IoT platforms: Best fit for existing historical deployments and glue code with legacy protocols such as Modbus, Siemens S7, et al. Traditional data historians with proprietary protocols, monolith architectures, limited scalability, and batch ETL platforms work well for these workloads to connect the past with the future of the OT/IT world and to create a bridge between on-premise and cloud. Almost all IoT platforms added connectors for MQTT, OPC UA, and Kafka in the meantime.
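
As a concrete illustration of the “custom integration” option, the following minimal sketch bridges MQTT messages into Kafka with the Eclipse Paho client and a Kafka producer. Broker addresses and topic names are made-up assumptions; for production, a Kafka Connect connector is usually the better choice because it adds scaling, offset management, and fault tolerance:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.eclipse.paho.client.mqttv3.MqttClient;

    public class MqttToKafkaBridge {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumption: local Kafka cluster
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props);

            MqttClient mqtt = new MqttClient("tcp://localhost:1883", "kafka-bridge"); // assumption
            mqtt.connect();
            // Forward every sensor message; the MQTT topic becomes the Kafka record key,
            // so all readings of one device land in the same partition (ordering guarantee)
            mqtt.subscribe("sensors/#", (topic, message) ->
                    producer.send(new ProducerRecord<>("iot.sensors", topic, message.getPayload())));
        }
    }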

OEE scenarios that benefit from data streaming

Data streaming platforms act as the central nervous system in various use cases to increase overall plant effectiveness. These include connectivity via industry standards such as OPC UA or MQTT, visualization of multiple devices and assets in digital twins, and modern maintenance in the form of condition monitoring and predictive maintenance.

Connectivity to machines and equipment with OPC UA or MQTT

OPC UA and MQTT are not designed for data processing and integration. Instead, their strength is establishing bidirectional “last mile communication” to devices, machines, PLCs, IoT gateways, or vehicles in real-time.

As discussed above, both standards have different “sweet spots” and can also be combined: OPC UA is supported by almost all modern machines, PLCs, and IoT gateways for the smart factory. MQTT is used primarily in poor networks and/or also for thousands and hundreds of thousands of devices.

These data streams are then streamed into the data streaming platforms via connectors. The streaming platform can either be deployed in parallel with an IoT platform ‘at the edge’ or combined in hybrid or cloud scenarios.

The data streaming platform is a flexible data hub for data integration and processing between OT and IT applications. Besides OPC UA and MQTT on the OT side, various IT applications such as MES, ERP, CRM, data warehouse, or data lake are connected in real-time, regardless of whether they are operated ‘at the edge’, on-premise, or in the cloud.

Apache Kafka as open scalable Data Historian for IIoT with MQTT and OPC UA

More details: Apache Kafka as Data Historian – an IIoT / Industry 4.0 Real-Time Data Lake.

Digital twins for development and predictive simulation

By continuously streaming data and processing and integrating sensor data, data streaming platforms enable the creation of an open, scalable, and highly available infrastructure for the deployment of Digital Twins.

Digital Twins combine IoT, artificial intelligence, machine learning, and other technologies to create a virtual simulation of, for example, physical components, devices, and processes. They can also consider historical data and update themselves as soon as the data generated by the physical counterpart changes.

Kafka is the leading system in the following digital twin example:

 

Apache Kafka as Digital Twin for Industry 4 0 and Industrial IoT

Kafka is combined with other technologies to build a digital twin most of the time. For instance, Eclipse Ditto is a project combining Kafka with IoT protocols. And some teams built a custom digital twin with Kafka and a database like MongoDB.

“IoT Architectures for Digital Twin with Apache Kafka” provides more details about different digital twin architectures.

Industry 4.0 benefits from digital twins, as they allow detailed insight into the lifecycle of the elements they simulate or monitor. For example, product and process optimization can be carried out, individual parts or entire systems can be tested for their functionality and performance, or forecasts can be made about energy consumption and wear and tear.

Condition monitoring and predictive maintenance

For modern maintenance, machine operators mainly ask themselves these questions: Are all devices functioning as intended? How long will these devices usually function before maintenance work is necessary? What are the causes of anomalies and errors?

On the one hand, Digital Twins can also be used here for monitoring and diagnostics. They correlate current sensor data with historical data, which makes it possible to identify the causes of faults and anticipate maintenance measures.

On the other hand, production facilities can also benefit from data streaming in this area. A prerequisite for Modern Maintenance is a reliable and scalable infrastructure that enables the processing, analysis, and integration of data streams. This allows the detection of critical changes in plants, such as severe temperature fluctuations or vibrations, in near real-time, after which operators can initiate measures to maintain plant effectiveness.

Above all, more efficient predictive maintenance scheduling saves manufacturing companies valuable resources by ensuring equipment and facilities are serviced only when necessary. In addition, operators avoid costly downtime periods when machines are not productive for a while.
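
As a simple illustration of stateless condition monitoring, the following Kafka Streams sketch filters temperature readings and emits alerts to a downstream topic. Topic names, the threshold, and the plain-string payload format are illustrative assumptions; a stateful, windowed aggregation or an embedded ML model would be the next step toward predictive maintenance:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    public class ConditionMonitoring {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "condition-monitoring");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            // Assumption: readings arrive as plain strings ("81.4"), keyed by machine id
            KStream<String, String> readings = builder.stream("machine.temperature");
            readings
                    .filter((machineId, value) -> Double.parseDouble(value) > 90.0)
                    .to("machine.alerts"); // downstream consumers trigger maintenance workflows

            new KafkaStreams(builder.build(), props).start();
        }
    }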

Stateless Condition Monitoring and Stateful and Predictive Maintenance with Apache Kafka ksqlDB and TensorFlow

More details: Condition Monitoring and Predictive Maintenance with Apache Kafka.

Connected cars and streaming machine learning

A connected car is a car that can communicate bidirectionally with other systems outside of the vehicle. This allows the car to share internet access and data with other devices and applications inside and outside the car. The possibilities are endless! MQTT in conjunction with Kafka is more or less a de facto standard architecture for connected car use cases and infrastructures.

The following shows how to integrate with tens or hundreds of thousands of IoT devices and process the data in real-time. The demo use case is predictive maintenance (i.e., anomaly detection) in a connected car infrastructure to predict motor engine failures:

Kappa Architecture with Apache Kafka MQTT Kubernetes and Tensorflow for Streaming Machine Learning

The blog post “IoT Live Demo – 100.000 Connected Cars with Kubernetes, Kafka, MQTT, TensorFlow” explores the architecture and implementation in more detail. The source code is available on GitHub.

BMW case study: Manufacturing 4.0 with smart factory and cloud

I spoke with Felix Böhm, responsible for BMW Plant Digitalization and Cloud Transformation, at our Data in Motion tour in Germany in 2021. We talked about their journey towards data in motion in manufacturing and the use cases and architectures. He also talked to Confluent CEO Jay Kreps at the Kafka Summit EU 2021.

Kafka and OPC UA as real-time data hub between equipment at the edge and applications in the cloud

Let’s explore this BMW success story from a technical perspective.

Decoupled IoT Data and Manufacturing

BMW connects workloads from their global smart factories and replicates them in real-time in the public cloud. The team uses an OPC UA connector to directly communicate with Confluent Cloud in Azure.

Kafka provides decoupling, transparency, and innovation. Confluent adds stability via products and expertise. The latter is critical for success in manufacturing. Each minute of downtime costs a fortune. Read my related article “Apache Kafka as Data Historian – an IIoT / Industry 4.0 Real-Time Data Lake” to understand how Kafka improves the Overall Equipment Effectiveness (OEE) in manufacturing.

Logistics and supply chain in global plants

The discussed use case covered optimized supply chain management in real-time.

The solution provides information about the right stock in place, both physically and in ERP systems like SAP. “Just in time, just in sequence” is crucial for many critical applications.

Things BMW couldn’t do before

  • Get IoT data without interfering with others, and get it to the right place
  • Collect once, process, and consume several times (by different consumers at different times with varying paradigms of communication like real-time, batch, request-response)
  • Enable scalable real-time processing and improve time-to-market with new applications

The true decoupling between different interfaces is a unique advantage of Kafka vs. other messaging platforms such as IBM MQ, RabbitMQ, or MQTT brokers. I also explored this in my article about Domain-driven Design (DDD) with Kafka.

Check out “Apache Kafka Landscape for Automotive and Manufacturing” for more Kafka architectures and use cases in this industry.

Audi case study – Connected cars for swarm intelligence

Audi has built a connected car infrastructure with Apache Kafka. Their Kafka Summit keynote explored the use cases and architecture:

Use cases include real-time data analysis, swarm intelligence, collaboration with partners, and predictive AI.

Depending on how you define the term and buzzword “Digital Twin”, this is a perfect example: All sensor data from the connected cars are processed in real-time and stored for historical analysis and reporting. Read more about “Kafka for Digital Twin Architectures” here.

To learn more, I wrote a whole blog series with many more practical use cases and architectures for Apache Kafka and MQTT.

Serverless data streaming enables focusing on IoT business applications and improving OEE

An event-driven data streaming platform is elastic and highly available. It represents an opportunity to increase production facilities’ overall asset effectiveness significantly.

With its data processing and integration capabilities, data streaming complements machine connectivity via MQTT, OPC UA, HTTP, and other protocols. This allows streams of sensor data to be transported throughout the plant and to the cloud in near real-time. This is the basis for the use of Digital Twins as well as Modern Maintenance such as Condition Monitoring and Predictive Maintenance. The increased overall plant effectiveness enables manufacturing companies not only to work more productively and avoid potential disruptions, but also to save time and costs.

I did not talk about operating the infrastructure for data streaming and IoT. TL;DR: Go serverless if you can. That enables you to focus on solving business problems. The BMW example above had exactly this motivation and leverages Confluent Cloud to roll out smart factory use cases across the globe. “Serverless Kafka” is your best choice for data streaming if connectivity and the network infrastructure allow it in your IoT projects.

Do you use MQTT or OPC UA with Apache Kafka today? What use cases? Or do you rely on the HTTP protocol because it is good enough and simpler to integrate? How do you decide which protocol to choose? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post OPC UA, MQTT, and Apache Kafka – The Trinity of Data Streaming in IoT appeared first on Kai Waehner.

Apache Kafka for Industrial IoT and Manufacturing 4.0 https://www.kai-waehner.de/blog/2021/05/19/apache-kafka-industrial-iot-manufacturing-4-0-automotive-energy-logistics/ Wed, 19 May 2021 08:47:24 +0000 https://www.kai-waehner.de/?p=3422

This post explores use cases and architectures for processing data in motion with Apache Kafka in Industrial IoT (IIoT) across verticals such as automotive, energy, steel manufacturing, oil&gas, cybersecurity, shipping, and logistics. Use cases include predictive maintenance, quality assurance, track and trace, real-time locating systems (RTLS), asset tracking, customer 360, and more. Examples include BMW, Bosch, Baader, Intel, Porsche, and Devon.

Apache Kafka for Industrial IoT and Manufacturing 4.0

Why Kafka Is a Key Piece of the Evolution for Industrial IoT and Manufacturing

Industrial IoT was a mess of monolithic and proprietary technologies in the last decades. Modbus, Siemens S7, SCADA, and similar “concepts” controlled the industry. Vendors locked in enterprises by intentionally building incompatible products without open interfaces. These systems still run on Windows XP or similarly unsupported, outdated operating systems, built without security in mind.

Fortunately, this is completely changing. Apache Kafka and its ecosystem play a key role in the IIoT evolution. System integration and data processing get an open architecture with a scalable, reliable infrastructure.

I speak to customers in this industry every week across the globe. Very different challenges, use cases, and innovative ideas come up. I have covered this topic a lot in the past already.

Check out my other related blog posts for Kafka in IIoT and Manufacturing. Learn about use cases and architecture for deployments at the edge (i.e., outside the data center), the relation between Kafka and other IoT standards like MQTT or OPC-UA, and how to build a modern, open and scalable data historian.

I want to highlight one post, as it is super important for any discussion around shop floors, PLCs, machines, robots, cars, and any other embedded systems: Kafka and other IT software are NOT hard real-time.

This post here “just” shares my latest presentation on this topic, including the slide deck and on-demand video recording. Before we get there, let’s summarize the current scenarios for Kafka in Industrial IoT in one concrete example.

Requirements for Industrial IoT: Everywhere, Complete, Cloud-native!

Let’s take a look at one specific example. The following picture depicts the usage of event streaming in combination with other OT and IT technologies in the shipping industry:

Apache Kafka in the Shipping Industry for Marine, Oil Transport, Vessel Fleet, Shipping Line, Drones

This is an interesting example because it shows many challenges and requirements of many Industrial IoT real-world scenarios across verticals:

  • Everywhere: Industrial IoT is not possible only in the cloud. The edge is impossible to avoid because manufacturing produces tangible goods. Integration between the (often disconnected) edge and the data center is essential for many use cases.
  • Complete: Industrial IoT is mission-critical. Stability with zero downtime, security, and safety are crucial across verticals. The only realistic option is a robust, battle-tested enterprise-grade solution to realize IIoT use cases.
  • Cloud-native: Automation, scalability, decoupled agile applications, and flexibility regarding technologies and products are required for enterprises to stay competitive. Not just in the cloud, but also at the edge! Not all use cases require a critical, scalable solution, though. For instance, a single broker for data processing and storage is sufficient in a disconnected drone.

A unique value of Kafka is that you can use one single technology for scalable real-time messaging, storage and caching, continuous stateless and stateful data processing, and data integration with the OT and IT world. This is especially important at the edge, where the hardware is constrained and the network is limited. It is much easier to operate and much more cost-efficient to deploy one single infrastructure instead of gluing together a best-of-breed stack like you often do in the cloud.

With this introduction, let’s now share the slide deck and video recording to talk about all these points in much more detail.

Slide Deck: Kafka for Industrial IoT and Manufacturing 4.0

Here is the slide deck:

Video Recording: Connect All the Things

Here is the video recording:

Video - Apache Kafka for Industrial IoT and Manufacturing 4.0

 

Apache Kafka for an open, scalable, flexible IIoT Architecture

Industrial IoT was a mess of monolithic and proprietary technologies in the last decades. Fortunately, Apache Kafka is completely changing many industrial environments. An open architecture with a scalable, reliable infrastructure changes how systems are integrated and how data is processed in the future.

What are your experiences and plans in IIoT projects? What use case and architecture did you implement? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka for Industrial IoT and Manufacturing 4.0 appeared first on Kai Waehner.

Apache Kafka and MQTT (Part 3 of 5) – Manufacturing 4.0 and Industrial IoT https://www.kai-waehner.de/blog/2021/03/22/apache-kafka-mqtt-part-3-of-5-manufacturing-industrial-iot-industry-4-0/ Mon, 22 Mar 2021 09:18:19 +0000 https://www.kai-waehner.de/?p=3270

Apache Kafka and MQTT are a perfect combination for many Industrial IoT use cases. This blog series covers the pros and cons of both technologies. Various use cases across industries, including connected vehicles, manufacturing, mobility services, and smart city are explored. The examples use different architectures, including lightweight edge scenarios, hybrid integrations, and serverless cloud solutions. This post is part three: Manufacturing, Industrial IoT, and Industry 4.0.

MQTT and Kafka for Manufacturing, Industrial IoT, and Industry 4.0

Apache Kafka + MQTT Blog Series

The first blog post explores the relation between MQTT and Apache Kafka. Afterward, the other four blog posts discuss various use cases, architectures, and reference deployments.

  • Part 1 – Overview: Relation between Kafka and MQTT, pros and cons, architectures
  • Part 2 – Connected Vehicles: MQTT and Kafka in a private cloud on Kubernetes; use case: remote control and command of a car
  • Part 3 – Manufacturing (THIS POST): MQTT and Kafka at the edge in a smart factory; use case: Bidirectional OT-IT integration with Sparkplug between PLCs, IoT Gateways, Data Historian, MES, ERP, Data Lake, etc.
  • Part 4 – Mobility Services: MQTT and Kafka leveraging serverless cloud infrastructure; use case: Traffic jam prediction service using machine learning
  • Part 5 – Smart City: MQTT at the edge connected to fully-managed Kafka in the public cloud; use case: Intelligent traffic routing by combining and correlating 3rd party services

Subscribe to my newsletter to get updates immediately after publication. Besides, I will also update the above list with direct links to the posts of this blog series as soon as they are published.

Use Case:  Manufacturing 4.0 and Industrial IoT with Kafka

The following list shows different examples where Kafka is used as a strategic platform for various manufacturing use cases to implement Industry 4.0 initiatives:

  1. Track&Trace / Production Control / Plant Logistics
  2. Quality Assurance / Yield Management
  3. Predictive Maintenance
  4. Supply Chain Management
  5. Cybersecurity
  6. Servitization leveraging Digital Twins
  7. Additive Manufacturing
  8. Augmented Reality
  9. Many more…

I already covered this topic in detail recently. Hence, check out the blog post “Apache Kafka for Manufacturing and Industrial IoT“. That post includes a detailed slide deck and video recording.

Let’s look at specific examples for Kafka and MQTT in Manufacturing 4.0 in the following sections.

Architecture:  Smart Factory and Industry 4.0 with Kafka

The following diagram shows the architecture of a smart factory. Both MQTT and Kafka infrastructure are deployed at the edge in the factory for security, latency, and cost reasons:

MQTT and Kafka in a Smart Factory for Manufacturing 4.0 and Industrial IoT

This example connects to modern PLCs and other gateways via MQTT and Sparkplug B. The benefit of this combination is a lightweight communication protocol based on an open standard (see the sketch below).
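
For readers new to Sparkplug B: it defines a fixed MQTT topic namespace and a Protobuf-based payload format on top of plain MQTT. The following minimal sketch subscribes to the Sparkplug namespace with the Eclipse Paho Java client; the group and node ids are made up for illustration, and a real application would decode the payloads with a Sparkplug library such as Eclipse Tahu:

    import org.eclipse.paho.client.mqttv3.MqttClient;

    public class SparkplugListener {
        public static void main(String[] args) throws Exception {
            MqttClient client = new MqttClient("tcp://localhost:1883", "sparkplug-listener"); // assumption
            client.connect();
            // Sparkplug B namespace: spBv1.0/<group_id>/<message_type>/<edge_node_id>/[<device_id>]
            // e.g., spBv1.0/plant1/DDATA/line3-gateway/press-07 carries device telemetry
            client.subscribe("spBv1.0/plant1/#", (topic, message) -> {
                // Payloads are Protobuf-encoded Sparkplug B messages, not plain text
                System.out.println(topic + " -> " + message.getPayload().length + " bytes");
            });
        }
    }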

An OT middleware such as OSIsoft PI is required if legacy and proprietary protocols such as Modbus or Siemens S7 need to be integrated. Most plants today are brownfield. Hence both proprietary integration platforms and open MQTT or OPC-UA integration must communicate with Kafka from the OT side. “Apache Kafka as Data Historian” explores the integration options in more detail.

The IT components (such as SAP ERP) and the integration platforms (HiveMQ, Confluent) run in the factory. Obviously, many other architectures are possible if latency and security allow it. Check out “Building a Smart Factory with Apache Kafka and 5G Campus Networks” for a hybrid cloud architecture. Additionally, most smart factories are not completely independent from the central IT world running in a remote data center or public cloud. Various “architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments” exist to replicate data bi-directionally between smart factories and data centers.

Example: MQTT for Critical Manufacturing @ Daimler

Manufacturing processes in the automotive industry cannot go down. Hence, Daimler built a Vehicle Diagnostic System (VDS) to efficiently share information between test devices on the factory floor and enterprise IT systems. The VDS fulfills some core functionality in the manufacturing process for E/E components, such as calibrating sensors controlled by an ECU, flashing new firmware, personalizing the key to the car, and testing to make sure each ECU works properly.

MQTT works in bad networks. Therefore, test devices behave properly even if the network connection is dropped and reconnected.

The system is rolled out to 24 factories around the world. 10,000 testing devices are connected. The devices generate 470 million messages/month.

The complete case study from HiveMQ explores the use case in more detail.

Example: Kafka in the Cloud for Business Critical Supply Chain Operations @ Baader

BAADER is a worldwide manufacturer of innovative machinery for the food processing industry. They run an IoT-based and data-driven food value chain on Confluent Cloud:

Food Supply Chain at Baader with Apache Kafka and Confluent Cloud

The Kafka-based infrastructure provides a single source of truth across the factories and regions of the food value chain. Business-critical operations are available 24/7 for tracking, calculations, alerts, etc.

The event streaming platform runs on Confluent Cloud. Hence, Baader can focus on building new innovative business applications. The serverless Kafka infrastructure provides mission-critical SLAs and consumption-based pricing for all required capabilities: Messaging, storage, data integration, and data processing.

MQTT provides connectivity to machines and GPS data from vehicles at the edge. Kafka Connect connectors integrate MQTT and other IT systems such as Elasticsearch, MongoDB, and AWS S3. ksqlDB processes the data in motion continuously. Stream processing or streaming analytics are other terms for this concept.

Kafka + MQTT = Manufacturing 4.0

In conclusion, Apache Kafka and MQTT are a perfect combination for Manufacturing, Industrial IoT, and Industry 4.0.

Follow the blog series to learn about use cases such as connected vehicles, manufacturing, mobility services, and smart city. Every blog post also includes real-world deployments from companies across industries. It is key to understand the different architectural options to make the right choice for your project.

What are your experiences and plans in IoT projects? What use case and architecture did you implement? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka and MQTT (Part 3 of 5) – Manufacturing 4.0 and Industrial IoT appeared first on Kai Waehner.

Apache Kafka and MQTT (Part 2 of 5) – V2X and Connected Vehicles https://www.kai-waehner.de/blog/2021/03/19/apache-kafka-mqtt-part-2-of-5-v2x-connected-vehicles-edge-hybrid-cloud/ Fri, 19 Mar 2021 08:00:31 +0000 https://www.kai-waehner.de/?p=3250

Apache Kafka and MQTT are a perfect combination for many IoT use cases. This blog series covers the pros and cons of both technologies. Various use cases across industries, including connected vehicles, manufacturing, mobility services, and smart city are explored. The examples use different architectures, including lightweight edge scenarios, hybrid integrations, and serverless cloud solutions. This post is part two: Connected Vehicles and V2X applications.

MQTT and Kafka for Connected Vehicles and V2X Use Cases

Apache Kafka + MQTT Blog Series

The first blog post explores the relation between MQTT and Apache Kafka. Afterward, the other four blog posts discuss various use cases, architectures, and reference deployments.

  • Part 1 – Overview: Relation between Kafka and MQTT, pros and cons, architectures
  • Part 2 – Connected Vehicles (THIS POST): MQTT and Kafka in a private cloud on Kubernetes; use case: remote control and command of a car
  • Part 3 – Manufacturing: MQTT and Kafka at the edge in a smart factory; use case: Bidirectional OT-IT integration with Sparkplug between PLCs, IoT Gateways, Data Historian, MES, ERP, Data Lake, etc.
  • Part 4 – Mobility Services: MQTT and Kafka leveraging serverless cloud infrastructure; use case: Traffic jam prediction service using machine learning
  • Part 5 – Smart City: MQTT at the edge connected to fully-managed Kafka in the public cloud; use case: Intelligent traffic routing by combining and correlating 3rd party services

Subscribe to my newsletter to get updates immediately after publication. I will also update the above list with direct links to the posts of this blog series as soon as they are published.

Use Case: Connected Vehicles and V2X

Vehicle-to-everything (V2X) is communication between a vehicle and any entity that may affect, or may be affected by, the vehicle. It is a vehicular communication system that incorporates other, more specific types of communication such as V2I (vehicle-to-infrastructure), V2N (vehicle-to-network), V2V (vehicle-to-vehicle), V2P (vehicle-to-pedestrian), V2D (vehicle-to-device), and V2G (vehicle-to-grid). The main motivations for V2X are road safety, traffic efficiency, energy savings, and a better driver experience.

V2X includes various use cases. The following picture from 3G4G shows some examples:

V2X Use Cases for Kafka and MQTT

Business Point of View for Connected Vehicles

From a business perspective, the following diagram from Frost & Sullivan explains the use cases for connected vehicles very well:

Use Cases for Connected Vehicles

Technical Point of View for V2X and Connected Vehicles

A few things to point out from a technical perspective:

  • MQTT + Kafka provides a scalable real-time infrastructure for high volumes of data in motion, with end-to-end processing latencies of roughly 10 to 20 milliseconds. This is good enough for the integration with backend IT systems and almost all mobility services (see the producer sketch after this list).
  • MQTT and Kafka are not used for hard real-time and deterministic embedded systems.
  • Some safety-critical V2X use cases require other communication technologies such as 5G New Radio (NR) / NR C-V2X sidelink to directly connect vehicles to each other or to local infrastructure (e.g., traffic lights), without an intermediary cellular network or radio access network (RAN).
  • Example: A self-driving car executes all its algorithms, like image processing and decision-making, within the car in embedded systems. These use cases require deterministic behavior and hard real-time. Communication with 3rd parties such as emergency services, traffic routing, or parking connects to backend systems for data correlation (close to the edge or far away in a cloud data center). Real-time in milliseconds – or sometimes even seconds – is good enough in these cases.
  • Not every application is for tens or hundreds of thousands of connected vehicles. For instance, a real-time locating system (RTLS) is a perfect example for realizing use cases in logistics and transportation. This can be geofencing within a plant or regional/global track & trace. “Real-Time Locating System (RTLS) with Apache Kafka for Transportation and Logistics” explores this use case in more detail.
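
To illustrate the latency point above, here is a minimal Java producer sketch tuned for low end-to-end latency. The topic, key, and payload are hypothetical; linger.ms=0 disables client-side batching, and acks=1 waits only for the partition leader, trading a bit of durability for speed:

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.util.Properties;

    public class LowLatencyTelemetryProducer {

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            // Latency-oriented settings: send immediately instead of batching,
            // and wait only for the partition leader's acknowledgement.
            props.put(ProducerConfig.LINGER_MS_CONFIG, "0");
            props.put(ProducerConfig.ACKS_CONFIG, "1");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                long start = System.nanoTime();
                producer.send(new ProducerRecord<>("vehicle-telemetry", "vin-123", "{\"speed\":87}"),
                        (metadata, exception) -> {
                            // Broker ack latency is a rough lower bound for end-to-end latency.
                            long ackMs = (System.nanoTime() - start) / 1_000_000;
                            System.out.println("Acked after " + ackMs + " ms");
                        });
            }
        }
    }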

The following sections focus on use cases that require real-time (but not hard real-time) data integration and processing at scale with 24/7 uptime between vehicles, networks, infrastructure, and applications.

Architecture: MQTT and Kafka for Connected Vehicles

Let’s take a look at an example: Remote control and command of a car. This can be simple scenarios like opening your car trunk from a remote location with your digital key for the mailman or more sophisticated use cases like the payment process for buying a new feature via OTA (over the air) update.

The following diagram shows an architecture for V2X leveraging MQTT and Kafka:

Connected Vehicles - Kafka and MQTT Reference Architecture

A few notes on the above architecture:

  • The MQTT and Kafka clusters run in a Kubernetes environment.
  • Kubernetes allows the deployment across data centers and multiple cloud providers with a single “template”.
  • Bi-directional communication is guaranteed in reliable, scalable infrastructure end-to-end in real-time.
  • The MQTT clients in cars and on mobile devices communicate with the MQTT cluster. This allows connecting hundreds of thousands of interfaces and supports unreliable networks.
  • Kafka is the integration backbone for connected vehicles and mobile devices. Use cases include streaming ETL, correlation of the data in stateful business applications, or ingestion into other IT applications, databases, and cloud services.
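
As a concrete illustration of the command path, here is a hypothetical Java sketch of a backend service publishing a remote command to Kafka. The topic name, VIN, and payload are made up; in the architecture above, the command would then be bridged to the car via the MQTT cluster:

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.util.Properties;

    public class RemoteCommandPublisher {

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            // Idempotence avoids duplicate commands on retries - important for
            // actions that physically affect a vehicle.
            props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Keying by VIN keeps all commands for one vehicle in order on the same partition.
                String command = "{\"command\":\"OPEN_TRUNK\",\"issuedBy\":\"mobile-app\"}";
                producer.send(new ProducerRecord<>("car-commands", "vin-123", command));
                producer.flush();
            }
        }
    }

Keying by VIN guarantees ordering of all commands per vehicle, and idempotence prevents a retried send from opening the trunk twice.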

V2X with MQTT and Kafka in a 5G Infrastructure

The following diagram shows the above use cases around connected vehicles from the V2X perspective:

V2X Infrastructure with Kafka MQTT HTTP Edge Hybrid Cloud

The infrastructure is separated into three categories and networks:

  • The edge (vehicles, devices) using local processing and remote integration via 5G.
  • MEC (multi-access edge computing) region for low-latency use cases. This example leverages AWS Wavelength for combining the power of 5G with cloud services and Confluent Platform for processing data in motion at scale.
  • The public cloud infrastructure using AWS and Confluent Cloud for all other cloud-native applications.

The integration between the edge and the IT world depends on the requirements. In this example, we use mostly MQTT but also HTTP for the integration with the Kafka cluster. The connectivity to other IT applications happens via Kafka-native interfaces such as Kafka clients, Kafka Connect, or Confluent’s Cluster Linking (for the bi-directional replication between the AWS Wavelength zone and the AWS cloud region).

Direct communication between vehicles or vehicles and pedestrians requires deterministic behavior and ultra-low latency. Hence, this communication does not use technologies like MQTT or Kafka. Technologies like 5G Sidelink were invented for these requirements.

Let’s now look at two real-world examples of connected vehicles.

Example: MQTT and Kafka for Millions of Connected Cars @ Autonomic

Autonomic built the Transportation Mobility Cloud (TMC), a standard way of accessing connected vehicle data and sending remote commands. This platform provides the foundation to build smart mobility applications related to driver safety, preventive maintenance, and fleet management.

Autonomic built a solution with MQTT and Kafka to connect millions of cars. MQTT forwards the car data in real-time to Kafka to distribute the messages to the different microservices and applications in the platform.

This is a great example of combining the benefits of MQTT and Kafka. Read the complete case study from HiveMQ for more details.

Example: Kafka as Car Data Collector @ Audi

Audi started its journey for connected cars a long time ago to collect data from hundreds of thousands of cars in real-time. The car data is collected and processed in real-time with Apache Kafka. The following diagram shows the idea:

Audi Data Collector

As you can imagine, dozens of potential use cases exist to reduce cost, improve the customer experience, and increase revenue. The following is an example of a real-time service for finding a free parking spot:

Audi Data Collector for Mobility Services Built with Apache Kafka

Watch Audi’s Kafka Summit keynote for more details about the infrastructure and use cases.

Slide Deck – Kafka for Connected Vehicles and V2X

A slide deck embedded in the original blog post covers this topic in more detail.

Kafka + MQTT = Connected Vehicles and V2X

In conclusion, Apache Kafka and MQTT are a perfect combination for V2X and connected vehicles. It makes so many new IoT use cases possible!

Follow this blog series to learn about use cases such as connected vehicles, manufacturing, mobility services, and smart city. Every blog post also includes real-world deployments from companies across industries. It is key to understand the different architectural options to make the right choice for your project.

What are your experiences and plans in IoT projects? What use case and architecture did you implement? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.


The post Apache Kafka and MQTT (Part 2 of 5) – V2X and Connected Vehicles appeared first on Kai Waehner.

]]>
Apache Kafka for Smart Grid, Utilities and Energy Production https://www.kai-waehner.de/blog/2021/01/14/apache-kafka-smart-grid-energy-production-edge-iot-oil-gas-green-renewable-sensor-analytics/ Thu, 14 Jan 2021 09:59:09 +0000 https://www.kai-waehner.de/?p=3018 The energy industry is changing from system-centric to smaller-scale and distributed smart grids and microgrids. These smart grids require a flexible, scalable, elastic, and reliable cloud-native infrastructure for real-time data integration and processing. This post explores use cases, architectures, and real-world deployments of event streaming with Apache Kafka in the energy industry to implement smart grids and real-time end-to-end integration.

The post Apache Kafka for Smart Grid, Utilities and Energy Production appeared first on Kai Waehner.

]]>
The energy industry is changing from system-centric to smaller-scale and distributed smart grids and microgrids. A smart grid requires a flexible, scalable, elastic, and reliable cloud-native infrastructure for real-time data integration and processing. This post explores use cases, architectures, and real-world deployments of event streaming with Apache Kafka in the energy industry to implement a smart grid and real-time end-to-end integration.

Smart Grid Energy Production and Distribution with Apache Kafka

Smart Grid – The Energy Production and Distribution of the Future

The energy sector includes corporations that are primarily in the business of producing or supplying energy, such as fossil fuels or renewables.

What is a Smart Grid?

A smart grid is an electrical grid that includes a variety of operation and energy measures, including smart meters, smart appliances, renewable energy resources, and energy-efficient resources. Electronic power conditioning and control of the production and distribution of electricity are important aspects of the smart grid.

The European Union Commission Task Force for Smart Grids provides the following smart grid definition:

“A Smart Grid is an electricity network that can cost-efficiently integrate the behavior and actions of all users connected to it – generators, consumers and those that do both – to ensure economically efficient, sustainable power system with low losses and high levels of quality and security of supply and safety. A smart grid employs innovative products and services, together with intelligent monitoring, control, communication, and self-healing technologies to:

  1. Better facilitate the connection and operation of generators of all sizes and technologies.
  2. Allow consumers to play a part in optimizing the operation of the system.
  3. Provide consumers with greater information and options for how they use their supply.
  4. Significantly reduce the environmental impact of the whole electricity supply system.
  5. Maintain or even improve the existing high levels of system reliability, quality, and security of supply.
  6. Maintain and improve existing services efficiently.”

Technologies and Evolution of a Smart Grid

The roll-out of smart grid technology also implies a fundamental re-engineering of the electricity services industry, although typical usage of the term focuses on the technical infrastructure. Key smart grid technologies include solar power, smart meters, microgrids, and self-optimizing systems:

Smart Grid - Energy Industry

Requirements for a Smart Grid and Modern Energy Infrastructure

  • Reliability: The smart grid uses technologies such as state estimation which improve fault detection and allow self-healing of the network without the intervention of technicians. This will ensure a more reliable supply of electricity and reduced vulnerability to natural disasters or attack.
  • Flexibility in network topology: Next-generation transmission and distribution infrastructure will be better able to handle possible bidirectional energy flows, allowing for distributed generation such as from photovoltaic panels on building roofs, but also charging to/from the batteries of electric cars, wind turbines, pumped hydroelectric power, the use of fuel cells, and other sources.
  • Efficiency: Numerous contributions to the overall improvement of the efficiency of energy infrastructure are anticipated from the deployment of smart grid technology, in particular including demand-side management, for example, turning off air conditioners during short-term spikes in electricity price, reducing the voltage when possible on distribution lines through Voltage/VAR Optimization (VVO), eliminating truck-rolls for meter reading, and reducing truck-rolls by improved outage management using data from Advanced Metering Infrastructure systems. The overall effect is less redundancy in transmission and distribution lines and greater utilization of generators, leading to lower power prices.
  • Sustainability: The smart grid’s improved flexibility permits greater penetration of highly variable renewable energy sources such as solar power and wind power, even without the addition of energy storage.
  • Market-enabling: The smart grid allows for systematic communication between suppliers (their energy price) and consumers (their willingness-to-pay). It permits both suppliers and consumers to apply more flexible and sophisticated operational strategies.
  • Cybersecurity: Provide a secure infrastructure with encrypted and authenticated communication and real-time anomaly detection at scale across the supply chain.

Architectures with Kafka for a Smart Grid

From a technical perspective, use cases such as load adjustment/load balancing, peak curtailment/leveling, and time-of-use pricing cannot be implemented with the traditional, monolithic software used in the past in the energy industry.

A smart grid requires a cloud-native infrastructure that is flexible, scalable, elastic, and reliable. All of that in combination with real-time data integration and data processing. These requirements explain why more and more energy companies rely heavily on event streaming with Apache Kafka and its ecosystem.

Energy Production and Distribution with a Hybrid Architecture

Many energy companies have a cloud-first strategy. They build new innovative applications in the cloud. Especially in the energy industry, this makes a lot of sense. The need for elastic and scalable data processing is key to success. However, not everything can run in the cloud. Most energy-related use cases require data processing at the edge, too. This is true for both energy production and energy distribution.

Here is an example architecture leveraging Apache Kafka in the cloud and at the edge:

Smart Grid Energy Production and Distribution with Apache Kafka in a Hybrid Architecture

The integration in the cloud requires connecting to modern technologies such as Snowflake data warehouse, Google’s Machine Learning services based on TensorFlow, or 3rd party SaaS services like Salesforce. The edge is different. Connectivity is required for machines, equipment, sensors, PLCs, SCADA systems, and many other systems. Kafka Connect is a perfect, Kafka-native tool to implement these integrations in real-time at scale.

Replication in real-time between the edge sites and the cloud is another important use case where tools like MirrorMaker 2 or Confluent’s Cluster Linking fit perfectly.

The continuous processing of data (aka stream processing) is possible with Kafka-native components like Kafka Streams or ksqlDB. Using an external tool such as Apache Flink is also a good fit.
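
As a small example of such continuous processing, the following Kafka Streams sketch in Java computes the peak load per substation per minute. Topic names and serialization are assumptions for illustration, not details from a specific deployment:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.Produced;
    import org.apache.kafka.streams.kstream.TimeWindows;

    import java.time.Duration;
    import java.util.Properties;

    public class GridLoadAggregator {

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "grid-load-aggregator");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.Double().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            // "substation-load" is a hypothetical topic of per-substation load
            // readings in kW, keyed by substation ID and serialized as doubles.
            builder.<String, Double>stream("substation-load")
                   .groupByKey()
                   .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
                   .reduce(Double::max)  // peak load per substation per minute
                   .toStream()
                   .map((windowedKey, peak) -> KeyValue.pair(windowedKey.key(), peak))
                   .to("substation-peak-load", Produced.with(Serdes.String(), Serdes.Double()));

            new KafkaStreams(builder.build(), props).start();
        }
    }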

Event Streaming for Energy Production at the Edge with a 5G Campus Network

Kafka at the edge is the new black. Energy production is a great example:

Event Streaming with Apache Kafka for Energy Production and Smart Grid at the Edge with a 5G Campus Network

Deploying Kafka at edge sites is described in more detail in the post “Building a Smart Factory with Apache Kafka and 5G Campus Networks“.

The edge is often disconnected from the cloud or remote data centers. Mission-critical applications have to run 24/7 in a decoupled way even if the internet connection is not available or not stable:

Energy Production and Smart Grid at the Disconnected Edge with Apache Kafka

Example: Real-Time Outage Management with Kafka in the Utilities Sector

Let’s take a look at an example implemented in the utilities sector: continuous processing of smart meter sensor data with Kafka and ksqlDB:

Smart Meters - High Frequency Noise Filter with Apache Kafka
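
The following is a simplified sketch of such an edge filter, using the plain Java consumer and producer APIs instead of ksqlDB. The topic names and the 0-100 kW plausibility band are illustrative assumptions:

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class SmartMeterNoiseFilter {

        public static void main(String[] args) {
            Properties consumerProps = new Properties();
            consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "meter-noise-filter");
            consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");

            Properties producerProps = new Properties();
            producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
                 KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
                consumer.subscribe(List.of("meter-readings-raw"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                    for (ConsumerRecord<String, String> record : records) {
                        double kw = Double.parseDouble(record.value());
                        // Forward only plausible readings to the clean topic;
                        // implausible spikes never leave the edge site.
                        if (kw >= 0.0 && kw <= 100.0) {
                            producer.send(new ProducerRecord<>("meter-readings-clean",
                                    record.key(), record.value()));
                        }
                    }
                }
            }
        }
    }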

The preprocessing and filtering happen at the edge, as shown in the picture above. However, the aggregation and monitoring of all the assets of the smart grid (including smart homes, smart buildings, power lines, switches, etc.) happen in the cloud:

Cloud Aggregator for Field Management and Smart Grid with Apache Kafka


Real-time data processing is not just important for operations. Huge added value comes from the improved customer experience. For instance, the outage management operations tool can alert a customer in real-time:

Real-Time Outage Management for a Better Customer Experience with Apache Kafka

Let’s now take a look at a few real-world examples of energy-related use cases.

Kafka Real-World Deployments in the Energy Sector

This section explores three very different deployments and architectures of Kafka-based infrastructure to build smart grids and energy production-related use cases: EON, WPX Energy, and Tesla.

EON – Smart Grid for Energy Production and Distribution with Kafka

The EON Streaming Platform is built on the following paradigms to provide a cloud-native smart grid infrastructure:

  • IoT scale capabilities of public cloud providers
  • EON microservices that are independent of cloud providers
  • Real-time data integration and processing powered by Apache Kafka

Kafka at EON Cloud Streaming Platform

WPX Energy – Kafka at the Edge for Integration and Analytics

WPX Energy (now merged into Devon Energy) is a company in the oil & gas industry. The digital transformation creates many opportunities to improve processes and reduce costs in this vertical. WPX leverages Confluent Platform on Hivecell edge hardware to realize edge processing and replication to the cloud in real-time at scale.

Concept of automation in oil and gas industry

The solution is designed for true real-time decision-making and potential future closed-loop control optimization. WPX conducts edge stream processing to enable real-time decisions at the well site. They also replicate business-relevant data streams produced by machine learning models and analytical preprocessed data from the well site to the cloud, enabling WPX to harness the full power of its real-time events.

Kafka deployments at the edge (i.e., outside a data center) come up more and more in the energy industry, but also in factories, restaurants, retail stores, banks, and hospitals.

Tesla – Kafka-based Data Platform for Trillions of Data Points per Day

Tesla is not just a car maker. Tesla is a tech company writing a lot of innovative and cutting-edge software. They provide an energy infrastructure for cars with their Tesla Superchargers, solar energy production at their Gigafactories, and much more. Processing and analyzing the data from their smart grids and the integration with the rest of the IT backend services in real-time is a key piece of their success:

Tesla Factory

Tesla has built a Kafka-based data platform infrastructure “to support millions of devices and trillions of data points per day”. Tesla showed an interesting history and evolution of their Kafka usage at a Kafka Summit in 2019:

History of Kafka Usage at Tesla

Kafka + OT Middleware (OSIsoft PI, Siemens MindSphere, et al)

A common and very relevant question is the relation between Apache Kafka and traditional OT middleware such as OSIsoft PI or Siemens MindSphere.

TL;DR: Apache Kafka and OT middleware complement each other. Kafka is NOT an IoT platform. OT middleware is not built for the integration and correlation of OT data with the rest of the enterprise IT. Kafka and OT middleware connect to each other very well. Both sides provide integration options, including REST APIs, native Kafka Connect connectors, and more. Hence, in many cases, enterprises leverage and integrate both technologies instead of choosing just one of them.

Apache Kafka and OT Middleware such as OSIsoft PI or Siemens MindSphere

Plenty of blogs, slides, and videos explore in more detail how Apache Kafka and OT middleware complement each other very well.

Slides: Kafka-based Smart Grid and Energy Use Cases and Architectures

A slide deck embedded in the original blog post goes into more detail about this topic.

The Future of Kafka for the Energy Sector and Smart Grid

Kafka is relevant for many use cases when building an elastic and scalable smart grid infrastructure. Beyond the smart grid, many projects utilize Kafka heavily for customer 360, payment processing, and many other use cases. Check out the various “Use Cases and Architectures for Apache Kafka across Industries“. Energy companies can apply many of these use cases, too.

If you have read this far and wonder what “real-time” actually means in the context of Kafka and the OT/IT convergence, please check out the post “Kafka is NOT hard real-time“.

What are your experiences and plans for event streaming in the energy industry? Did you already build a smart grid with Apache Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.


The post Apache Kafka for Smart Grid, Utilities and Energy Production appeared first on Kai Waehner.

]]>
Apache Kafka is NOT Hard Real Time BUT Used Everywhere in Automotive and Industrial IoT https://www.kai-waehner.de/blog/2021/01/04/apache-kafka-is-not-hard-real-time-industrial-iot-embedded-connected-vehicles-automotive/ Mon, 04 Jan 2021 13:14:44 +0000 https://www.kai-waehner.de/?p=2944 Apache Kafka is NOT hard real-time in Industrial IoT or vehicles (such as autonomous cars) but integrates the OT/IT world for near real-time data correlation and analytics in hybrid architectures across factories at the edge, multiple clouds, and over countries.

The post Apache Kafka is NOT Hard Real Time BUT Used Everywhere in Automotive and Industrial IoT appeared first on Kai Waehner.

]]>
Is Apache Kafka really real-time? This is a question I get asked every week. Real-time is a great marketing term to describe how businesses can add value by processing data as fast as possible. Most software and product vendors use it these days, including messaging frameworks (e.g., IBM MQ, RabbitMQ), event streaming platforms (e.g., Apache Kafka, Confluent), data warehouse/analytics vendors (e.g., Spark, Snowflake, Elasticsearch), and security/SIEM products (e.g., Splunk). This blog post explores what “real-time” really means and how Apache Kafka and other messaging frameworks accomplish the mission of providing real-time data processing.

Apache Kafka Hard vs Soft Real Time for Industrial IoT Robots and Connected Vehicles

Definition: What is real-time?

The definition of the term “real-time” is not easy. However, it is essential to define it before you start any discussion about this topic.

In general, real-time computing (sometimes called reactive computing) is the computer science term for hardware and software systems subject to a “real-time constraint”, for example, from event to system response. Real-time programs must guarantee a response within specified time constraints, often referred to as “deadlines”. Real-time processing fails if not completed within a specified deadline relative to an event; deadlines must always be met, regardless of system load.

Hard vs. soft vs. near real-time

Unfortunately, there is more than one “real-time”. The graphic from embedded.com describes it very well:

The Real-Time Spectrum including Soft and Hard Real-Time

Here are a few different nuances of real-time data processing:

  • Real-time: This is the marketing term. It can be anything from zero latency and zero spikes to minutes and beyond.
  • Hard real-time: Missing a deadline is a total system failure. Delays or spikes are not accepted. Hence, the goal is to ensure that all deadlines are met.
  • Soft real-time: The usefulness of a result degrades after its deadline, thereby degrading the system’s quality of service. The goal becomes meeting a certain subset of deadlines to optimize some application-specific criteria. The particular criteria optimized depend on the application.
  • Near real-time: Refers to the time delay introduced, by automated data processing or network transmission, between the occurrence of an event and the use of the processed data. The range goes from microseconds and milliseconds to seconds, minutes, and sometimes even hours or days.

Technical implementation of hard real-time

From a more technical point of view, hard real-time is a synchronous push operation: the caller invokes something and must wait for the return. This cannot be implemented effectively with event distribution; it is rather an API call. Soft and near real-time are asynchronous: the caller propagates an event and continues, while consumers react without affecting the caller’s outcome.
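
A toy Java example makes the distinction tangible. The blocking sensor read stands in for a hard real-time API call, and the in-memory queue stands in for an event log such as Kafka:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class SyncVsAsync {

        // Synchronous style (hard real-time systems work like this): the caller
        // blocks until the result is available, so the deadline covers the call.
        static double readSensorBlocking() {
            return 42.0; // stub; a real-time system must return within a guaranteed deadline
        }

        public static void main(String[] args) throws InterruptedException {
            double value = readSensorBlocking(); // caller waits for the answer

            // Asynchronous style (soft/near real-time, like Kafka): the caller
            // appends an event to a log/queue and moves on; consumers process
            // it on their own schedule without affecting the caller.
            BlockingQueue<Double> eventLog = new LinkedBlockingQueue<>();
            eventLog.put(value);

            Thread consumer = new Thread(() -> {
                try {
                    System.out.println("Consumed event: " + eventLog.take());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            consumer.start();
            consumer.join();
        }
    }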

Hermann Kopetz’s book “Real-Time Systems: Design Principles for Distributed Embedded Applications” is a great resource if you want to dig deeper. The Wikipedia article is also a good, detailed summary with further references.

Always question what is meant by the term “real-time” if the context is not clear yet. While it is not always accurate, it is okay to use the term “real-time” in many cases, especially when you talk to business people.

The use cases in the next sections will make different real-time scenarios more clear.

Use cases for hard real-time

Hard real-time requires a deterministic network with zero latency and no spikes. Common scenarios include embedded systems, field bus and PLCs in manufacturing, cars, robots, etc. Time-Sensitive Networking (TSN) is the right keyword if you want to do more research. TSN is a set of standards under development by the Time-Sensitive Networking task group of the IEEE 802.1 working group.

This is NOT Java, NOT Cloud, and NOT anything else a web developer knows and uses in their daily routine.

Examples of hard real-time

Here are a few examples that are only doable (and safe) with hard real-time:

  • A car engine control system is a hard real-time system because a delayed signal may cause engine failure or damage. This gets even more important with autonomous vehicles.
  • Medical systems, such as heart pacemakers. Even though a pacemaker’s task is simple, because of the potential risk to human life, medical systems like these are typically required to undergo thorough testing and certification, which in turn requires hard real-time computing to offer provable guarantees that a failure is unlikely or impossible.
  • Industrial process controllers, such as a machine on an assembly line. If the machine is delayed, the item on the assembly line could pass beyond the machine’s reach (leaving the product untouched), or the machine or the product could be damaged by activating the robot at the wrong time. If the failure is detected, both cases lead to the assembly line stopping, which slows production. If the failure is not detected, a defective product could make it through production or could cause damage in later production steps.
  • Hard real-time systems are typically found interacting at a low level with physical hardware in embedded systems.

Most hard real-time implementations are proprietary and have a long history. Nevertheless, the industry is getting more and more open. Industry 4.0, Industrial IoT, autonomous driving, smart cities, and similar scenarios are impossible without an open architecture.

Use cases for soft real-time and near real-time

Soft real-time or near real-time is what most people actually talk about when they say “real-time”. The use cases include everything that is not “hard real-time”. End-to-end communication has latency, delays, and spikes. Near real-time can be very fast, but also take a long time. Most use cases across all verticals sit in this category.

Some verticals such as retail or gaming might never have to think about hard real-time at all. Even so, if you dig deeper, retailers often operate production lines, too, and gaming involves game consoles and hardware. Hence, it always depends on the business department.

Examples of soft real-time

Some examples of soft/near real-time use cases:

  • High-frequency trading on financial or energy markets. Messaging and processing typically have to happen in microseconds. This is probably the closest alternative to hard real-time. Only specific proprietary stream processing products such as TIBCO StreamBase or Software AG’s Apama can do this. Honestly, I am not aware of many other use cases where this speed is required and worth the trade-offs (such as high license costs, proprietary systems, and often single-server instead of distributed deployments).
  • Point-to-point message queuing is the most well-known approach to send data from A to B in near real-time. Alternatives include traditional, proprietary products like IBM MQ, and open source frameworks such as RabbitMQ or NATS. The big problem of message queues is that the real added value comes when the data is also used “now” instead of “too late”. Just sending the data to a database does not help. Hence, a near real-time processing framework is a mandatory combination for many use cases.
  • Context-specific data processing and data correlation (often called event streaming or event stream processing) to provide the right information at the right time. This sounds generic but is probably what you require most of the time. No matter what industry or vertical. Apache Kafka is the de facto standard for event streaming, i.e., the combination of messaging, integration, storage, and processing of data. Often, Kafka is combined with dedicated stream processing frameworks such as Apache Flink. Example applications: Fraud detection, omnichannel cross-selling, predictive maintenance, regulatory reporting, or any other digital platform for innovative business models. The presentation “Use Cases and Architectures for Apache Kafka across Industries” covers plenty of examples in more detail.
  • Analytics and reporting with data warehouse, data lake, ETL processes, and machine learning aggregates and analyses huge volumes of data. Near real-time can mean seconds (e.g., indexing into a search engine like Elasticsearch or DWH like Snowflake), minutes (e.g., regulatory reporting in financial services), or even hours (e.g., capacity planning for the next day in a supply chain process). Often, the main goal is to store the processed data at rest for further batch analytics with business intelligence tools or provide a human dashboard for operations monitoring.

Always define your meaning of “real-time”!

As you can see, “near real-time” can mean many different things. It is okay to say “real-time” for these use cases – not just in marketing, but also in business and technical meetings! But make sure to understand your requirements and find the appropriate technologies.

I will focus on Apache Kafka as it established itself as the de facto standard for near real-time processing in the market (aka event streaming). However, Kafka is also often used as a messaging platform, and to ingest data into other analytics tools. Hence, it fits into most of the near real-time use cases.

Kafka for real-time requirements? Yes and no!

Kafka is real-time. But not for everybody’s definition of real-time.  Let’s understand this better…

Kafka is real-time!

Apache Kafka became the de facto standard for reliable data processing at scale in real-time. Most people in the IT world agree with this. Kafka provides capabilities to process trillions of events per day. Each Kafka broker (= server) can process tens of thousands of messages per second. End-to-end latency from producer to consumer can be as low as ~10ms if the hardware and network setup are good enough. Kafka is battle-tested at thousands of companies for hundreds of different use cases. It uses the Apache 2.0 license and provides a huge community and ecosystem. So far so good…

Kafka is not real-time!

However, in the OT world, things are different: Kafka is only soft real-time. Many OT applications require hard real-time. Hence, scenarios around automotive, manufacturing, and smart cities need to make this distinction. Consortiums and standards provide hard real-time frameworks and guidelines on how to integrate with them from the IT side. Two examples:

  • MISRA C is a set of software development guidelines for the C programming language developed by MISRA (Motor Industry Software Reliability Association).
  • ROS-Industrial is an open-source project that extends the advanced capabilities of ROS software to industrially relevant hardware and applications.

Most companies I have talked to in these industries combine hard real-time and soft real-time. Both approaches are complementary and have different trade-offs.

The following section shows how enterprises combine the OT world (cars, machines, PLCs, robots, etc.) with the IT world (analytics, reporting, business applications).

How to combine Kafka with hard real-time applications

Kafka is not hard real-time, but most enterprises combine it with hard real-time applications to correlate the data, integrate with other systems in near real-time, and build innovative new business applications.

Apache Kafka is NOT hard real time for cars and robots

Some notes on the above architecture about the relation between Kafka and the OT world:

  • Hard real-time requires C or even lower-level programming with an assembly language that is designed for exactly one specific computer architecture. If you are “lucky”, you are allowed to use C++. Hard real-time is required in automotive ECUs (electronic control unit), Cobots (collaborative robots), and similar things. Safety and zero latency are key. This is not Java! This is not Kafka!
  • Most integration scenarios and almost all business applications only require near real-time data processing. Java, Golang, Python, and similar programming languages (and tool stacks/frameworks on top of them) are used because they are much simpler and more convenient for most people.
  • Open standards are mandatory for a connected world with Manufacturing 4.0, innovative mobility services, and smart cities. I explored the relationship between proprietary monoliths and open, scalable platforms in the blog post “Apache Kafka as Data Historian – an IIoT / Industry 4.0 Real-Time Data Lake”.
  • Kafka is the perfect tool for integrating the OT and IT world – at scale, reliable, and near real-time. For instance, check out how to build a digital twin with OPC-UA and Kafka or do analytics in a connected car infrastructure with MQTT and Kafka. Sometimes, embedded systems directly integrate with Kafka via Confluent’s C or C++ client, or the REST Proxy for near real-time use cases.
  • Kafka runs everywhere. This includes any data center or cloud but also the edge (e.g., a factory or even a vehicle). Check out various use cases where Kafka is deployed at the edge outside the data center.
  • Hybrid and global Kafka deployments are the new black. Tools are battle-tested and deployed across all verticals and continents.

Example: Kafka for cybersecurity and SIEM in the smart factory

Let’s conclude this post with a specific example of combining hard real-time systems with near real-time processing using Apache Kafka: cybersecurity and SIEM in the smart factory.

Most factories require hard real-time for their machines, PLCs, DCS, robots, etc. Unfortunately, many applications are 10, 20, or 30 years old, or even older. They run on unsecured and unsupported operating systems (Windows XP is still far from going away in factories!).

I have seen a few customers leveraging Apache Kafka as a cybersecurity platform in the middle, i.e., between the monolithic, proprietary legacy systems and the modern IT world.

Apache Kafka for cybersecurity and SIEM in the smart factory for industry 4.0 and Industrial IoT

Kafka monitors all data communication in near real-time to implement access control, detect anomalies, and provide secure communication. This architecture enables the integration with non-connected legacy systems to collect sensor data, but also ensures that no external system gets access to the unsecured machines. Intel is a great public example of building a modern, scalable cyber intelligence platform with Apache Kafka and Confluent Platform.
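
Such monitoring boils down to continuous stream processing. The following hypothetical Kafka Streams sketch in Java flags devices that suddenly send an unusually high number of messages; the topic names and the threshold are illustrative and not taken from the Intel deployment:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.TimeWindows;

    import java.time.Duration;
    import java.util.Properties;

    public class PlcTrafficAnomalyDetector {

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "plc-traffic-anomaly-detector");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            // "plc-traffic" is a hypothetical topic mirroring OT network events, keyed by device ID.
            builder.<String, String>stream("plc-traffic")
                   .groupByKey()
                   .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(10)))
                   .count()
                   .toStream()
                   // A device suddenly sending far more messages than usual may indicate
                   // a misconfigured system or an attack; the threshold is illustrative.
                   .filter((windowedDeviceId, count) -> count > 1000)
                   .map((windowedDeviceId, count) -> KeyValue.pair(windowedDeviceId.key(),
                           "High message rate: " + count + " events in 10s"))
                   .to("security-alerts");

            new KafkaStreams(builder.build(), props).start();
        }
    }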

One common security design pattern in Industrial IoT is the data diode. Implementations often include a hardware/software combination such as the products from Owl Cyber Defense. Another option is the Kafka Connect based Data Diode Connector (Source and Sink) to build a Kafka-native, high-security unidirectional network. In such networks, the network settings do not permit TCP/IP packets, and UDP packets are only allowed in one direction.

Soft real-time is what you need for most use cases!

Hard real-time is critical for some use cases, such as car engines, medical systems, and industrial process controllers. However, most other use cases only require near real-time. Apache Kafka comes into play to build scalable, reliable, near real-time applications and connect to the OT world. The open architecture and backpressure handling of huge volumes from IoT interfaces are two of the key reasons why Kafka is such a good fit in OT/IT architectures.

How do you use Kafka for (near) real-time applications? How is it combined with machines, PLCs, cars, and other hard real-time applications? What are your strategy and timeline? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka is NOT Hard Real Time BUT Used Everywhere in Automotive and Industrial IoT appeared first on Kai Waehner.

]]>