Virta’s Electric Vehicle (EV) Charging Platform with Real-Time Data Streaming: Scalability for Large Charging Businesses
https://www.kai-waehner.de/blog/2025/04/22/virtas-electric-vehicle-ev-charging-platform-with-real-time-data-streaming-scalability-for-large-charging-businesses/ (Tue, 22 Apr 2025)

The rise of Electric Vehicles (EVs) demands a scalable, efficient charging network—but challenges like fluctuating demand, complex billing, and real-time availability updates must be addressed. Virta, a global leader in smart EV charging, is tackling these issues with real-time data streaming. By leveraging Apache Kafka and Confluent Cloud, Virta enhances energy distribution, enables predictive maintenance, and supports dynamic pricing. This approach optimizes operations, improves user experience, and drives sustainability. Discover how real-time data streaming is shaping the future of EV charging and enabling intelligent, scalable infrastructure.

The Electric Vehicle (EV) revolution is here, but scaling charging infrastructure and integrating it with the energy system present challenges: rapid power supply and demand fluctuations, billing complexity, and real-time availability updates. Virta, a global leader in smart EV charging, is leveraging real-time data streaming to optimize operations, improve user experience, and drive sustainability. By integrating Apache Kafka and Confluent Cloud, Virta ensures seamless energy distribution, predictive maintenance, and dynamic pricing for a smarter, greener future. Read how data streaming is transforming EV charging and enabling scalable, intelligent infrastructure.

Electric Vehicle (EV) Charging - Automotive and ESG with Data Streaming at Virta

I spoke with Jussi Ahtikari (Chief AI Officer at Virta) at a HotTopics C-Suite Exchange about Virta’s business model around EV charging networks and how the company leverages data streaming. The following is a summary of this excellent success story about an innovative EV charging platform.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including several success stories around Kafka and Flink to improve ESG.

The Evolution and Challenges of Electric Vehicle (EV) Charging

The global shift towards electric vehicles (EVs) is accelerating, driven by the surge in variable renewable energy (wind, solar) production, the need for sustainable and more cost-efficient transportation solutions, government incentives, and rapid advancements in battery technology. EV charging infrastructure plays a critical role in making this transition successful. It ensures that drivers have access to reliable and efficient charging options while keeping the costs of energy and charging operations in check and the energy system in balance.

The innovation in EV charging goes beyond simply providing power to vehicles. Intelligent charging networks, dynamic pricing models, and energy management solutions are transforming the industry. Sustainability is also a key factor, as efficient energy consumption and integration with the renewable energy system contribute to environmental, social, and governance (ESG) goals.

As user numbers and charged energy volumes grow, the real-time interplay with the energy system, demand fluctuations, complex billing systems, and real-time station availability updates require a scalable and resilient data infrastructure. Delays in processing real-time data can lead to inefficient energy distribution, poor user experience, and lost revenue.

Virta: Innovating the Future of EV Charging

Virta is a digital cloud platform for electric vehicle (EV) charging businesses and a global leader in connecting smart charging infrastructure and EV battery capacity with the renewable energy system via bi-directional charging (V2G) and demand response (V1G).

The digital Virta EV Energy platform provides a comprehensive suite of solutions for charging businesses to launch and manage their own EV charging networks. Virta’s full-service charging platform enables Charging Network and Business Management, Transactions, Pricing, Payments and Invoicing, EV Driver and Fleet Services, Roaming, Energy Management, and Virtual Power Plant services.

Its Charge Point Management System (CPMS) supports over 450 charger models, allowing seamless integration with third-party infrastructure. Virta is the only provider combining a CPMS with an energy flexibility platform.

Virta EV Charging Platform
Source: Virta

Virta Platform Connecting 100,000+ Charging Stations Serving Millions of EV Drivers

The Virta platform is utilised by professional charge point operators (CPOs) and e-mobility service providers (EMPs) across energy, petrol, retail, automotive and real estate industries in 36 countries in Europe and South-East Asia. Virta is headquartered in Helsinki, Finland.

Virta manages real-time data from well over 100,000 EV charging stations, serving millions of EV drivers, and processes approximately 40 GB of real-time data every hour. Including roaming partnerships, the platform offers EV drivers access to more than 620,000 public charging stations in over 60 countries.

With this scale, real-time responsiveness is critical. Each time a charging station sends a signal—for example, when a driver starts charging—the platform must immediately trigger a series of actions:

  • Start billing
  • Update real-time status in mobile apps
  • Notify roaming networks
  • Update metrics and statistics
  • Conduct fraud checks

In the early days of electric mobility, all of these operations could be handled in a monolithic system with tightly coupled, synchronized code. According to Jussi Ahtikari, Chief AI Officer at Virta, this would have made the system “complex, difficult to maintain, and hard to scale” as data volumes grew. The team therefore identified early on the need for a more modular, scalable, and real-time architecture to support its rapid growth and evolving service portfolio.
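To make the fan-out above concrete, the following Java sketch shows how a charging-station event could be published to Kafka so that independent services can react to it. This is a minimal illustration, not Virta’s actual code; the topic name, event fields, and broker address are assumptions:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ChargingEventProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder cluster address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("acks", "all");                            // durability for billing-relevant events
        props.put("enable.idempotence", "true");             // avoid duplicates on retries

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String stationId = "station-4711";               // hypothetical station ID
            String event = "{\"type\":\"SESSION_STARTED\",\"stationId\":\"station-4711\","
                    + "\"driverId\":\"driver-42\",\"timestamp\":1745316000000}";

            // Keying by station ID keeps all events of one charging station
            // in order within a single partition.
            producer.send(new ProducerRecord<>("charging-events", stationId, event),
                    (metadata, exception) -> {
                        if (exception != null) {
                            exception.printStackTrace(); // real code: alerting and retries
                        }
                    });
        }
    }
}
```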

Innovative Industry Partnerships: Virta and Valeo

Virta is also exploring new opportunities in the EV ecosystem through its partnership with Valeo, a leader in automotive and energy solutions. The companies are working on integrating Valeo’s Ineez charging technology with Virta’s CPMS platform to enhance fleet charging, leasing services, and vehicle-to-grid (V2G) capabilities.

Vehicle-to-grid technology enables EVs to act as distributed energy storage, feeding excess power back into the grid during peak demand. This innovation is expected to play a critical role in balancing electricity supply and demand, contributing to cheaper electricity and a more stable, renewables-based energy system.

The Role of Data Streaming in ESG and EV Charging

Sustainability and environmental responsibility are key drivers of ESG initiatives in industries such as energy, transportation, and manufacturing. Data streaming plays a crucial role in achieving ESG goals by enabling real-time monitoring, predictive maintenance, and energy efficiency improvements.

In the EV charging industry, real-time data streaming supports capabilities such as smart energy distribution, predictive maintenance, dynamic pricing, and instant station availability updates.

Foreseeing the growing need for these real-time insights led Virta to adopt a data streaming approach with Confluent.

Virta’s Data Streaming Transformation

To maintain its rapid growth and provide an exceptional customer experience, Virta needed a scalable, real-time data streaming solution. The company turned to Confluent’s data streaming platform (DSP), powered by Apache Kafka, to process millions of messages per hour and ensure seamless operations.

Scaling Challenges and the Need for Real-Time Processing

Virta’s rapid growth to millions of charging events and tens of gigawatt hours of charged energy per month across Europe and South-East Asia resulted in massive volumes of data that need to be processed instantly, something legacy systems based on sequential authorization would have struggled with.

Without real-time updates, large scale charging operations would face issues such as:

  • Unclear station availability
  • Slow transaction processing
  • Inaccurate billing information

Initially, Virta worked with open-source Apache Kafka but found managing high-volume data streams at scale to be increasingly resource-intensive. Therefore, the team sought an enterprise-grade solution that would remove operational complexities while providing robust real-time capabilities.

Deploying A Data Streaming Platform for Scalable EV Charging

Confluent has become the backbone of Virta’s real-time data architecture. With Confluent’s event streaming platform, Virta is able to maintain a modern event-driven microservices architecture. Instead of tightly coupling all business logic into one system, each charging event—such as a driver starting a session—is published as a single, centralized event. Independent microservices subscribe to that event to trigger specific actions like billing, mobile app updates, roaming notifications, fraud detection, and more.
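On the consuming side, each microservice subscribes to the same event stream with its own consumer group, so billing, fraud detection, and app updates all receive every event independently. The sketch below, with assumed topic and service names, shows what one such subscriber might look like:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BillingService {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "billing-service"); // fraud detection would use its own group.id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("charging-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    startBillingFor(record.key(), record.value()); // hypothetical business logic
                }
            }
        }
    }

    private static void startBillingFor(String stationId, String eventJson) {
        System.out.printf("Billing started for %s: %s%n", stationId, eventJson);
    }
}
```

Because consumer groups decouple readers from writers, adding a new service (for example, an audit trail) only means subscribing with a new group ID; no existing producer or consumer changes.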

Here is a diagram of Virta’s cloud-native microservices architecture powered by AWS, Confluent Cloud, Snowflake, Redis, OpenSearch, and other technologies:

Virta Cloud-Native Microservices Architecture for EV Charging Platform powered by AWS, Confluent Cloud, Snowflake, Redis, OpenSearch
Source: Virta

This shift to an event-driven architecture, with the data streaming platform as the central nervous system, has significantly improved scalability, maintainability, and fault isolation. It has also accelerated innovation with fast roll-out times of new services, including audit trails, improved data governance through schemas, and the foundation for AI-powered capabilities—all built on clean, real-time data streams.

Key Benefits of a SaaS Data Streaming Platform for Virta

As a fully managed data streaming platform, Confluent Cloud has eliminated the need for Virta to maintain Kafka clusters manually, allowing its engineering teams to focus on innovation rather than infrastructure management:

  • Elastic scalability: Automatically scales up to handle peak loads, ensuring uninterrupted service.
  • Real-time processing: Supports 45 million messages per hour, enabling immediate updates on charging status and availability.
  • Simplified development: Tools such as Schema Registry and pre-built APIs provide a standardized approach for developers, speeding up feature deployment (see the sketch below).
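As an illustration of the Schema Registry point, the following Java sketch produces an Avro-encoded event whose schema is registered and validated automatically by Confluent’s Avro serializer. The schema, topic, and Schema Registry URL are placeholders, not Virta’s actual setup:

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroChargingEventProducer {

    private static final String SCHEMA = "{"
            + "\"type\":\"record\",\"name\":\"ChargingEvent\","
            + "\"fields\":["
            + "{\"name\":\"stationId\",\"type\":\"string\"},"
            + "{\"name\":\"status\",\"type\":\"string\"}]}";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // The Avro serializer registers/validates the schema against Schema Registry.
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081"); // placeholder URL

        Schema schema = new Schema.Parser().parse(SCHEMA);
        GenericRecord event = new GenericData.Record(schema);
        event.put("stationId", "station-4711");
        event.put("status", "AVAILABLE");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("charging-events-avro", "station-4711", event));
        }
    }
}
```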

Data Streaming Landscape: Spoilt for Choice – Open Source Kafka, Confluent, and many other Vendors

To navigate the evolving data streaming landscape, Virta chose a cloud-native, enterprise-grade platform that balances reliability, scalability, cost-efficiency, and ease of use. While many streaming technologies exist, Confluent offered the right trade-offs between operational simplicity and real-time performance at scale.

Read more about the different data streaming frameworks, platforms and cloud services in the data streaming landscape overview: “The Data Streaming Landscape 2025 with Kafka, Flink, Confluent, Amazon MSK, Cloudera, Event Hubs and Other Platforms”.

Business Impact of a Data Streaming Platform

By leveraging Confluent Cloud as its cloud-native and serverless data streaming platform, Virta has realized significant business benefits:

1. Faster Time to Market

Virta’s teams can now deploy new app features, charge points, and business services more quickly. The company has regained the agility of a startup, rolling out improvements without infrastructure bottlenecks.

2. Instant Updates for Customers and Operators

With real-time data streaming, Virta can update station availability and configuration changes in less than a second. This ensures that customers always have the latest information at their fingertips.

3. Cost Savings through Usage-Based Pricing

Virta’s shift to a usage-based pricing model has optimized its operational expenses. Instead of maintaining excess capacity, the company only pays for the resources it consumes.

4. Future-Ready Infrastructure for Advanced Analytics

Virta is building the future of real-time analytics, predictive maintenance, and smart billing by integrating Confluent with Snowflake’s AI-powered data cloud.

By decoupling data streams with Kafka, Virta ensures data consistency, scalability, and agility—enabling advanced analytics without operational bottlenecks.

Beyond EV Charging: Broader Energy and ESG Use Cases

Virta’s success with real-time data streaming highlights broader applications across the energy and ESG sectors. Similar data-driven solutions are being deployed for:

  • Smart grids: Real-time monitoring of electricity distribution to optimize supply and demand.
  • Renewable energy integration: Managing wind and solar power fluctuations with predictive analytics.
  • Industrial sustainability: Tracking carbon emissions and optimizing resource utilization.

The transition to electric mobility requires more than just an increase in charging stations. The ability to process and act on data in real time is critical to optimizing the use and costs of energy and infrastructure, enhancing user experience, and driving sustainability.

Virta’s usage of a serverless data streaming platform demonstrates the power of real-time data streaming in enabling scalable, efficient, and future-ready EV charging solutions. By eliminating infrastructure constraints, improving responsiveness, and reducing operational costs, Virta is setting new industry standards for innovation in mobility and energy management.

The EV charging market is expected to grow tenfold within the next ten years and, especially with the mass adoption of bi-directional charging (V2G), to integrate seamlessly with the energy system. Real-time data streaming will serve as the cornerstone for this evolution, helping businesses navigate challenges while unlocking new opportunities for sustainability and profitability.

For more data streaming success stories and use cases, make sure to download my free ebook. Please let me know your thoughts, feedback and use cases on LinkedIn and stay in touch via my newsletter.

Tesla Energy Platform – The Power of Data Streaming with Apache Kafka
https://www.kai-waehner.de/blog/2025/02/14/tesla-energy-platform-the-power-of-data-streaming-with-apache-kafka/ (Fri, 14 Feb 2025)

Tesla’s Virtual Power Plant (VPP) turns thousands of home batteries, solar panels, and energy storage systems into a coordinated, intelligent energy network. By leveraging Apache Kafka for event streaming and WebSockets for real-time IoT connectivity, Tesla enables instant energy redistribution, dynamic grid balancing, and automated market participation. This event-driven architecture ensures millisecond-level decision-making, allowing homeowners to optimize energy usage and utilities to stabilize power grids. Tesla’s approach highlights how real-time data streaming and intelligent automation are reshaping the future of decentralized, resilient, and sustainable energy systems.

Tesla’s Virtual Power Plant (VPP) is revolutionizing the energy sector by connecting home batteries, solar panels, and grid-scale storage into a real-time, intelligent energy network. Powered by Apache Kafka for event streaming and WebSockets for last-mile IoT integration, Tesla’s Energy Platform enables real-time energy trading, grid stabilization, and seamless market participation. By leveraging data streaming and automation, Tesla optimizes battery efficiency, prevents blackouts, and allows homeowners to monetize excess energy—all while making renewable energy more reliable and scalable. This software-driven approach showcases the power of real-time data in building the future of sustainable energy.

Tesla Energy Platform - The Power of Data Streaming with Apache Kafka

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases across all industries.

What is a Virtual Power Plant?

A Virtual Power Plant (VPP) is a network of decentralized energy resources—such as home batteries, solar panels, and smart grid systems—that function as a single unit. Unlike a traditional power plant that generates electricity from a centralized location, a VPP aggregates power from many small, distributed sources. This allows energy to be dynamically stored and shared, helping to balance supply and demand in real time.

VPPs are crucial in the shift to renewable energy. The traditional power grid was designed around fossil fuel plants that could easily adjust output. Renewable energy sources like solar and wind are intermittent—they don’t generate power on demand. By connecting thousands of batteries and solar panels in homes and businesses, a VPP can smooth out fluctuations in power generation and consumption. This prevents blackouts, reduces energy costs, and enables homes and businesses to participate in energy markets.

How Tesla’s Virtual Power Plant Fits Its Business Model

Tesla is not just an automaker. It is a sustainable energy company. Tesla’s product ecosystem includes electric vehicles, solar panels, home batteries (Powerwall), grid-scale energy storage (Megapack), and energy management software (Autobidder).

The Tesla Virtual Power Plant (VPP) ties all these elements together. Homeowners with Powerwalls store excess solar power during the day and feed it back to the grid when needed. Tesla’s Autobidder software automatically optimizes energy use and market participation, turning home batteries into revenue-generating assets.

For Tesla, the VPP strengthens its energy business, creating a scalable model that maximizes battery efficiency, stabilizes grids, and expands the role of software in energy markets. Tesla is not just selling batteries; it is selling energy intelligence.

Virtual Energy Platform and ESG (Environmental, Social, and Governance) Goals

Tesla’s energy platform is a perfect example of how data streaming and real-time decision-making align with ESG principles:

  • Environmental Impact: VPPs reduce reliance on fossil fuels by making renewable energy more reliable.
  • Social Benefit: By enabling energy independence, VPPs provide power during outages and extreme weather conditions.
  • Governance & Regulation: VPPs allow consumers to participate in energy markets, fostering decentralized energy ownership.

Tesla’s approach is smart grid innovation at scale: real-time data makes the grid more dynamic, efficient, and resilient.

My article “Green Data, Clean Insights: How Apache Kafka and Flink Power ESG Transformations” covers other real-world data streaming deployments in the energy sector like EON.

Tesla’s Energy Platform: A Network of Connected Home Energy Systems

Tesla’s VPP connects thousands of homes with Powerwalls, solar panels, and grid services. These systems work together to provide electricity on demand, reacting to supply fluctuations in real-time.

Key Functions of Tesla’s VPP:

  1. Energy Storage & Redistribution: Batteries store solar energy during the day and discharge at night or during peak demand.
  2. Grid Stabilization: The VPP balances energy supply and demand to prevent outages and fluctuations.
  3. Market Participation: Homeowners can sell excess power back to the grid, monetizing their batteries.
  4. Disaster Resilience: The VPP provides backup power during blackouts, storms, and grid failures.

This requires real-time data processing at massive scale—something traditional batch-based data architectures cannot handle.

Apache Kafka and Real-Time Data Streaming at Tesla

Tesla operates in many domains—automotive, energy, and AI. Across all these areas, Apache Kafka plays a critical role in enabling real-time data movement and stream processing.

In 2018, Tesla already processed trillions of IoT messages with Apache Kafka:

Tesla Automotive Journey from RabbitMQ to Apache Kafka for IoT Events
Source: Tesla

Tesla leverages stream processing to handle trillions of IoT events daily, using Apache Kafka to ingest, process, and analyze data from its vehicle fleet in real time. By implementing efficient data partitioning, fast and slow data lanes, and scalable infrastructure, Tesla optimizes vehicle performance, predicts failures, and enhances manufacturing efficiency.
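Tesla’s engineering blog describes partitioning and separate fast and slow data lanes. A hedged sketch of how such topics might be provisioned with the Kafka AdminClient follows; the topic names, partition counts, and retention settings are illustrative assumptions, not Tesla’s actual configuration:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class TelemetryTopicSetup {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // "Fast lane": many partitions for low-latency, high-volume telemetry.
            NewTopic fastLane = new NewTopic("vehicle-telemetry-fast", 48, (short) 3);

            // "Slow lane": fewer partitions, longer retention for delayed or replayed data.
            NewTopic slowLane = new NewTopic("vehicle-telemetry-slow", 12, (short) 3)
                    .configs(Map.of("retention.ms", String.valueOf(30L * 24 * 60 * 60 * 1000)));

            admin.createTopics(List.of(fastLane, slowLane)).all().get();
        }
    }
}
```

Producers would then key each record by vehicle or device ID so that all events of one vehicle stay ordered within a single partition, while the overall load spreads across many partitions.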

These strategies demonstrate how real-time data streaming is essential for managing large-scale IoT ecosystems, ensuring low-latency insights while maintaining operational stability. To learn more about these use cases, read Tesla’s blog post “Stream Processing with IoT Data: Challenges, Best Practices, and Techniques”.

The following sections explore Tesla’s innovation for its virtual power plant, as discussed in an excellent presentation at QCon.

Tesla Energy Platform: Architecture of the Virtual Power Plant Powered by Apache Kafka

Tesla’s VPP uses Apache Kafka for:

  1. Telemetry Ingestion: Streaming data from millions of Powerwalls, solar panels, and Megapacks into the cloud.
  2. Command & Control: Sending real-time control commands to batteries and grid services.
  3. Market Participation: Autobidder analyzes real-time data and adjusts energy prices dynamically.

The event-driven architecture allows Tesla to react to energy demand in milliseconds—critical for balancing the grid.

Tesla’s Energy Platform is the software foundation of the VPP. It integrates OT (Operational Technology), IoT (Internet of Things), and IT (Information Technology) to control distributed energy assets.

Tesla Applications Built on the Energy Platform

Tesla’s Energy Platform powers a suite of applications that optimize energy management, market participation, and grid stability through real-time data streaming and automation.

Autobidder

  • Optimizes energy trading in real time.
  • Automatically bids into energy markets.

I wrote about other data streaming success stories for energy trading with Apache Kafka and Flink, including Uniper, re.alto and Powerledger.

Distributed Virtual Power Plant

  • Aggregates thousands of Powerwalls into a single energy asset.
  • Provides grid stabilization and peak load balancing.

If you are interested in other smart grid infrastructures, check out “Apache Kafka for Smart Grid, Utilities and Energy Production“. The article covers how data streaming realizes IT/OT integration and describes some hybrid cloud IoT deployments.

Battery Control (Command & Control)

  • Ensures optimal charging and discharging of batteries.
  • Minimizes costs while maximizing energy efficiency.

Market Participation

  • Allows homeowners and businesses to profit from energy markets.
  • Ensures seamless grid integration of Tesla’s energy products.

Key Components of Tesla’s Energy Platform: Apache Kafka, WebSockets, Akka Streams

The combination of data streaming with Apache Kafka and last-mile IoT integration via WebSockets forms the central nervous system of Tesla’s Energy Platform (a code sketch follows the component list and diagram below):

  1. Apache Kafka (Event Streaming):
    • Streams telemetry data from Powerwalls every second.
    • Ensures durability and reliability of data streams.
    • Allows real-time energy aggregation across thousands of homes.
  2. WebSockets (Last-Mile IoT Integration):
    • Provides low-latency bidirectional communication with Powerwalls.
    • Used to send real-time commands to home batteries.
  3. Pub/Sub (Command & Control):
    • Enables distributed energy resource coordination.
    • Ensures resilient messaging between systems.
  4. Business Logic (Applications & Microservices):
    • Tesla’s services are built with Scala and Python.
    • Uses gRPC & HTTP for inter-service communication.
  5. Digital Twins (Real-Time State Management):
    • Digital models of physical assets ensure real-time decision-making.
    • Tesla uses Akka Streams for stateful event processing.
  6. Kubernetes (Cloud Infrastructure):
    • Ensures scalability and resilience of Tesla’s energy microservices.
Tesla Virtual Power Plant Energy Architecture Using Apache Kafka WebSockets and Akka Streams
Source: Tesla
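How the WebSocket last mile and Kafka might meet can be sketched as a small bridge class. The WebSocket server wiring is omitted, and all names and settings are assumptions rather than Tesla’s implementation; the point is simply that a connection handler forwards each telemetry frame into a keyed Kafka topic:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

/**
 * Bridges last-mile device connections into Kafka. The WebSocket server
 * itself is not shown; a real implementation would call onTelemetry()
 * from its message handler for every frame a device sends.
 */
public class TelemetryBridge {

    private final KafkaProducer<String, String> producer;

    public TelemetryBridge(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("linger.ms", "20"); // small batching window for per-second telemetry
        this.producer = new KafkaProducer<>(props);
    }

    /** Called by the WebSocket layer for every telemetry frame a battery sends. */
    public void onTelemetry(String deviceId, String payloadJson) {
        // Keying by device ID keeps each device's readings ordered in one partition.
        producer.send(new ProducerRecord<>("device-telemetry", deviceId, payloadJson));
    }

    public void close() {
        producer.close();
    }
}
```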

Interesting side note: While most energy companies I have seen rely on Kafka Streams or Apache Flink for stateful event processing, Tesla takes an interesting approach by leveraging Akka Streams (based on Akka’s Actor Model) to manage real-time digital twins of its energy assets. This choice provides fine-grained control over streaming workflows, but unlike Kafka Streams or Flink, Akka lacks widespread community adoption, making it a less common choice for many large-scale energy platforms. Kafka and Flink are a match made in heaven for most data streaming use cases.

Best Practice: Shift Left Architecture with Data Products for High-Volume IoT Data

Tesla leverages several data processing best practices to improve efficiency and consistency:

  • Canonical Kafka Topics: Data is filtered and structured at the source.
  • Consistent Downstream Services: Every consumer gets clean, structured data.
  • Real-Time Aggregation of Thousands of Batteries: A unique challenge that forms the foundation of the virtual power plant.

This data-first approach ensures Tesla’s energy platform can scale to millions of distributed assets.
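A canonical topic in this shift-left style can be built with a few lines of Kafka Streams: read the raw stream, drop malformed readings at the source, normalize the structure, and write the clean result that every downstream consumer shares. The topic names and validation logic below are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class CanonicalTelemetryApp {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "canonical-telemetry");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> raw = builder.stream("raw-telemetry");

        raw
           // Drop malformed or empty readings at the source ("shift left").
           .filter((deviceId, payload) -> payload != null && isValid(payload))
           // Normalize into the canonical structure all consumers share.
           .mapValues(CanonicalTelemetryApp::toCanonicalJson)
           .to("canonical-telemetry");

        new KafkaStreams(builder.build(), props).start();
    }

    private static boolean isValid(String payload) {
        return payload.contains("\"deviceId\""); // placeholder validation rule
    }

    private static String toCanonicalJson(String payload) {
        return payload; // placeholder mapping to the canonical schema
    }
}
```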

Today, many people refer to the Shift Left Architecture when applying these best practices for processing data efficiently and continuously to provide data products in real time and with good quality:

Shift Left Architecture with Data Streaming into Data Lake Warehouse Lakehouse

In Tesla’s Energy Platform, the data comes from IoT interfaces. WebSockets provide the last-mile integration and feed the events into the data streaming platform for continuous processing before the ingestion into the operational and analytical applications.

Tesla’s Energy Vision: How Streaming Data Will Shape Tomorrow’s Power Grids

Tesla’s Virtual Power Plant is not just about batteries—it’s about software, real-time data, and automation.

Why Data Streaming Matters for Tesla’s Energy Platform:

  1. Scalability: Can handle millions of energy devices.
  2. Resilience: Works even when devices go offline.
  3. Real-Time Decision Making: Adjusts energy distribution within milliseconds.
  4. Market Optimization: Autobidder ensures maximum revenue for battery owners.

Tesla’s VPP is a blueprint for the future of energy—one where real-time data streaming and intelligent software optimize renewable energy. By leveraging Apache Kafka, WebSockets, and stream processing, Tesla is redefining how energy is generated, distributed, and consumed.

This is not just an innovation in power generation—it’s an AI-driven energy revolution.

How do you leverage data streaming in the energy and automotive sector? Follow me on LinkedIn or X (formerly Twitter) to stay in touch and discuss. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter. And make sure to download my free book about data streaming use cases across all industries.

A New Era in Dynamic Pricing: Real-Time Data Streaming with Apache Kafka and Flink
https://www.kai-waehner.de/blog/2024/11/14/a-new-era-in-dynamic-pricing-real-time-data-streaming-with-apache-kafka-and-flink/ (Thu, 14 Nov 2024)

In the age of digitization, the concept of pricing is no longer fixed or manual. Instead, companies increasingly use dynamic pricing — a flexible model that adjusts prices based on real-time market changes. This real-time responsiveness gives companies the tools they need to respond instantly to demand, competitor prices, and customer behaviors. This blog post explores the fundamentals of dynamic pricing, its link to data streaming, and real-world examples across different industries such as retail, logistics, gaming and the energy sector.

In the age of digitization, the concept of pricing is no longer fixed or manual. Instead, companies increasingly use dynamic pricing — a flexible model that adjusts prices based on real-time market changes. Data streaming technologies like Apache Kafka and Apache Flink have become integral to enabling this real-time responsiveness, giving companies the tools they need to respond instantly to demand, competitor prices, and customer behaviors. This blog post explores the fundamentals of dynamic pricing, its link to data streaming, and real-world examples of how different industries such as retail, logistics, gaming and the energy sector leverage this powerful approach to get ahead of the competition.

Dynamic Pricing with Data Streaming using Apache Kafka and Flink

What is Dynamic Pricing?

Dynamic pricing is a strategy where prices are adjusted automatically based on real-time data inputs, such as demand, customer behavior, supply levels, and competitor actions. This model allows companies to optimize profitability, boost sales, and better meet customer expectations.

Relevant Industries and Examples

Dynamic pricing has applications across many industries:

  • Retail and eCommerce: Dynamic pricing in eCommerce helps adjust product prices based on stock levels, competitor actions, and customer demand. Companies like Amazon frequently update prices on millions of products, using dynamic pricing to maximize revenue.
  • Transportation and Mobility: Ride-sharing companies like Uber and Grab adjust fares based on real-time demand and traffic conditions. This is commonly known as “surge pricing.”
  • Gaming: Context-specific in-game add-ons or virtual items are offered at varying prices based on player engagement, time spent in-game, and special events or levels.
  • Energy Markets: Dynamic pricing in energy adjusts rates in response to demand fluctuations, energy availability, and wholesale costs. This approach helps to stabilize the grid and manage resources.
  • Sports and Entertainment Ticketing: Ticket prices for events are adjusted based on seat availability, demand, and event timing to allow venues and ticketing platforms to balance occupancy and maximize ticket revenue.
  • Hospitality: Adaptive room rates and promotions in real-time based on demand, seasonality, and guest behavior, using dynamic pricing models.

These industries have adopted dynamic pricing to maintain profitability, manage supply-demand balance, and enhance customer satisfaction through personalized, responsive pricing.

Dynamic pricing relies on up-to-the-minute data on market and customer conditions, making real-time data streaming critical to its success. Traditional batch processing, where data is collected and processed periodically, is insufficient for dynamic pricing. It introduces delays that could mean lost revenue opportunities or suboptimal pricing. This scenario is where data streaming technologies come into play.

  • Apache Kafka serves as the real-time data pipeline, collecting and distributing data streams from diverse sources. For instance, user behaviour on websites, competitor pricing, social media signals, IoT data, and more. Kafka’s capability to handle high throughput and low latency makes it ideal for ingesting large volumes of data continuously.
  • Apache Flink processes the data in real-time, applying complex algorithms to identify pricing opportunities as conditions change. With Flink’s support for stream processing and complex event processing, businesses can apply sophisticated logic to assess and adjust prices based on multiple real-time factors.

Dynamic Pricing with Apache Kafka and Flink in Retail eCommerce

Together, Kafka and Flink create a powerful foundation for dynamic pricing, enabling real-time data ingestion, analysis, and action. This empowers companies to implement pricing models that are not only highly responsive but also resilient and scalable.
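A dynamic pricing core in Flink can be as small as a keyed, windowed aggregation over a demand stream. The following Java sketch (assuming Flink 1.x with the Kafka connector; the topic names and the naive count-based pricing rule are placeholders, not a production strategy) counts demand events per product per minute and emits a price multiplier:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class DynamicPricingJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> demandEvents = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("demand-events")      // assumed: one message per product view/order
                .setGroupId("pricing-engine")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(demandEvents, WatermarkStrategy.noWatermarks(), "demand-events")
           // Assume each message is simply a product ID; real events would be parsed JSON.
           .keyBy(productId -> productId)
           .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
           .process(new ProcessWindowFunction<String, String, String, TimeWindow>() {
               @Override
               public void process(String productId, Context ctx,
                                   Iterable<String> events, Collector<String> out) {
                   long demand = 0;
                   for (String ignored : events) demand++;
                   // Naive placeholder rule: raise price 1% per 100 events in the window.
                   double multiplier = 1.0 + (demand / 100) * 0.01;
                   out.collect(productId + ",priceMultiplier=" + multiplier);
               }
           })
           .print(); // a real job would sink to a "price-updates" Kafka topic instead

        env.execute("dynamic-pricing");
    }
}
```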

Clickstream Analytics in Real-Time with Data Streaming Replacing Batch with Hadoop and Spark

Years ago, companies relied on Hadoop and Spark to run batch-based clickstream analytics. Data engineers ingested logs from websites, online stores, and mobile apps to gather insights. Processing took hours. Therefore, any promotional offer or discount often arrived a day later — by which time the customer may have already made their purchase elsewhere, like on Amazon.

With today’s data streaming platforms like Kafka and Flink, clickstream analytics has evolved to support real-time, context-specific engagement and dynamic pricing. Instead of waiting on delayed insights, businesses can now analyze customer behavior as it happens, instantly adjusting prices and delivering personalized offers in the moment. This dynamic pricing capability allows companies to respond immediately to high-intent customers, presenting tailored prices or promotions when they’re most likely to convert. Dynamic pricing with Kafka and Flink creates a seamless and timely shopping experience that maximizes sales and customer satisfaction.

Here’s how businesses across various sectors are harnessing Kafka and Flink for dynamic pricing.

  • Retail: Hyper-Personalized Promotions and Discounts
  • Logistics and Transportation: Intelligent Tolling
  • Technology: Surge Pricing
  • Energy Markets: Manage Supply-Demand and Stabilize Grid Loads
  • Gaming: Context-Specific In-Game Add-Ons
  • Sports and Entertainment: Optimize Ticket Sales

Learn more about data streaming with Kafka and Flink for dynamic pricing in the following success stories:

AO: Hyper-Personalized Promotions and Discounts (Retail and eCommerce)

AO, a major UK eCommerce retailer, leverages data streaming for dynamic pricing to stay competitive and drive higher customer engagement. By ingesting real-time data on competitor prices, customer demand, and inventory stock levels, AO’s system processes this information instantly to adjust prices in sync with market conditions. This approach allows AO to seize pricing opportunities and align closely with customer expectations. The result is a 30% increase in customer conversion rates.

AO Retail eCommerce Hyper Personalized Online and Mobile Experience

Dynamic pricing has also allowed AO to provide a hyper-personalized shopping experience, delivering relevant product recommendations and timely promotions. This real-time responsiveness has enhanced customer satisfaction and loyalty, as customers receive offers that feel customized to their needs. During high-traffic periods like holiday sales, AO’s dynamic pricing ensures competitiveness and optimizes margins. This drives both profitability and customer retention. The company has applied this real-time approach not just to pricing, but also to other areas like delivery, making operations run more smoothly. The retailer is now much more efficient and provides better customer service.

Quarterhill: Intelligent Tolling (Logistics and Transportation)

Quarterhill, a leader in tolling and intelligent transportation systems, uses Kafka and Flink to implement dynamic toll pricing. Kafka ingests real-time data from traffic sensors and road usage patterns. Flink processes this data to determine congestion levels and calculate the optimal toll based on real-time conditions.

Quarterhill – Intelligent Roadside Enforcement and Compliance

This dynamic pricing strategy allows Quarterhill to manage road congestion effectively, reward off-peak travel, and optimize toll revenues. This system not only improves travel efficiency but also helps regulate traffic flows in high-density areas, providing value both to drivers and the city infrastructure.

Uber, Grab, and FreeNow: Surge Pricing (Technology)

Ride-sharing companies like Uber, Grab, and FreeNow are widely known for their dynamic pricing or “surge pricing” models. With data streaming, these platforms capture data on demand, supply (available drivers), location, and traffic in real time. This data is processed continuously by Apache Flink, Kafka Streams or other stream processing engines to calculate optimal pricing, balancing supply with demand, while considering variables like route distance and current traffic.

Dynamic Surge Pricing at Mobility Service MaaS Freenow with Kafka and Stream Processing
Source FreeNow

Surge pricing enables these companies to provide incentives for drivers to operate in high-demand areas, maintaining service availability and ensuring customer needs are met during peak times. This real-time pricing model improves revenue while optimizing customer satisfaction through prompt service availability.

Uber’s Kappa Architecture is an excellent example of how to build a data pipeline for dynamic pricing and many other use cases with Kafka and Flink:

Kappa Architecture with Apache Kafka at Mobility Service Uber
Source: Uber

2K Games / Take-Two Interactive: Context-Specific In-Game Purchases (Gaming Industry)

In the gaming industry, dynamic pricing is becoming a strategy to improve player engagement and monetize experiences. Many gaming companies use Kafka and Flink to capture real-time data on player interactions, time spent in specific game sections, and in-game events. This data enables companies to offer personalized pricing for in-game items, bonuses, or add-ons, adjusting prices based on the player’s current engagement level and recent activities.

For instance, if players are actively taking part in a particular game event, they may be offered special discounts or dynamic prices on related in-game assets. Thereby, the gaming companies improve conversion rates and player engagement while maximizing revenue.

2K Games, a leading video game publisher and subsidiary of Take-Two Interactive, has shifted from batch to real-time analytics to enhance player engagement across popular franchises like BioShock, NBA 2K, and Borderlands. By leveraging Confluent Cloud as a fully managed data streaming platform, the publisher scales dynamically to handle high traffic, processing up to 3000 MB per second to serve 4 million concurrent users.

2K Games Take Two Interactive - Bridging the Gap And Overcoming Tech Hurdles to Activate Data
Source: 2K Games

Real-time telemetry analytics now allow them to analyze player actions and context instantly, enabling personalized, context-specific promotions and enhancing the gaming experience. Cost efficiencies are achieved through data compression, tiered storage, and reduced data transfer, making real-time engagement both effective and economical.

50hertz: Manage Supply-Demand and Stabilize Grid Loads (Energy Markets)

Dynamic pricing in energy markets is essential for managing supply-demand fluctuations and stabilizing grid loads. With Kafka, energy providers ingest data from smart meters, renewable energy sources, and weather feeds. Flink processes this data in real time, adjusting energy prices based on grid conditions, demand levels, and renewable supply availability.

50Hertz, as a leading electricity transmission system operator, indirectly (!) affects dynamic pricing in the energy market by sharing real-time grid data with partners and energy providers. This allows energy providers and market operators to adjust prices dynamically based on real-time insights into supply-demand fluctuations and grid stability.

To support this, 50Hertz is modernizing its SCADA systems with data streaming technologies to enable real-time data capture and distribution that enhances grid monitoring and responsiveness.

Data Streaming with Apache Kafka and Flink to Modernize SCADA Systems

This real-time pricing approach encourages consumption when renewable energy is abundant and discourages usage during peak times, leading to optimized energy distribution, grid stability, and improved sustainability.

Ticketmaster: Optimize Ticketing Sales (Sports and Entertainment)

In ticketing, dynamic pricing allows for optimized revenue management based on demand and availability. Companies like Ticketmaster use Kafka to collect data on ticket availability, sales velocity, and even social media sentiment surrounding events. Flink processes this data to adjust prices based on real-time market conditions, such as proximity to the event date and current demand.

By dynamically pricing tickets, event organizers can maximize seat occupancy, boost revenue, and respond to last-minute demand surges, ensuring that prices reflect real-time interest while enhancing fan satisfaction.

Real-time inventory data streams allow Ticketmaster to monitor ticket availability, pricing, and demand as they change moment-to-moment. With data streaming through Apache Kafka and Confluent Platform, Ticketmaster tracks sales, venue capacity, and customer behavior in a single, live inventory stream. This enables quick responses, such as adjusting prices for high-demand events or boosting promotions where conversions lag. Teams gain actionable insights to forecast demand accurately and optimize inventory. This approach ensures fans have timely access to tickets. The result is a dynamic, data-driven approach that enhances customer experience and maximizes event success.

Conclusion: Business Value of Dynamic Pricing Built with Data Streaming

Dynamic pricing powered by data streaming with Apache Kafka and Flink brings transformative business value by:

  • Maximizing Revenue and Margins: Real-time price adjustments enable companies to capture value during demand surges, optimize for competitive conditions, and maintain healthy margins.
  • Improving Operational Efficiency: By automating pricing decisions based on real-time data, organizations can reduce manual intervention, speed up reaction times, and allocate resources more effectively.
  • Boosting Customer Satisfaction: Responsive pricing models allow companies to meet customer expectations in real time, leading to improved customer loyalty and engagement.
  • Supporting Sustainability Goals: In energy and transportation, dynamic pricing helps manage resources and reward environmentally friendly behaviors. Examples include off-peak travel and renewable energy usage.
  • Empowering Strategic Decision-Making: Real-time data insights provide business leaders with the information needed to adjust strategies and respond to developing market demands quickly.

Building a dynamic pricing system with Kafka and Flink represents a strategic investment in business agility and competitive resilience. Using data streaming to set prices instantly, businesses can stay ahead of competitors, improve customer service, and become more profitable. Dynamic pricing powered by data streaming is more than just a revenue tool; it’s a vital lever for driving growth, differentiation, and long-term success.

Did you already implement dynamic pricing? What is your data platform and strategy? Do you use Apache Kafka and Flink? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Energy Trading with Apache Kafka and Flink
https://www.kai-waehner.de/blog/2024/06/28/energy-trading-with-apache-kafka-and-flink/ (Fri, 28 Jun 2024)

Energy trading and data streaming are connected because real-time data helps traders make better decisions in the fast-moving energy markets. This data includes things like price changes, supply and demand, smart IoT meters and sensors, and weather, which help traders react quickly and plan effectively. As a result, data streaming with Apache Kafka and Apache Flink makes the market clearer, speeds up information sharing, and improves forecasting and risk management. This blog post explores the use cases and architectures for scalable and reliable real-time energy trading, including real-world deployments from Uniper, re.alto and Powerledger.

Energy Trading with Apache Kafka and Flink at Uniper ReAlto Powerledger

What is Energy Trading?

Energy trading is the process of buying and selling energy commodities in order to manage risk, optimize costs, and ensure the efficient distribution of energy. Commodities traded include:

  • Electricity: Traded in wholesale markets to balance supply and demand.
  • Natural Gas: Bought and sold for heating, electricity generation, and industrial use.
  • Oil: Crude oil and refined products like gasoline and diesel are traded globally.
  • Renewable Energy Certificates (RECs): Represent proof that energy was generated from renewable sources.

Market Participants:

  • Producers: Companies that extract or generate energy.
  • Utilities: Entities that distribute energy to consumers.
  • Industrial Consumers: Large energy users that purchase energy directly.
  • Traders and Financial Institutions: Participants that buy and sell energy contracts for profit or risk management.

Objectives of Energy Trading

The objectives for energy trading are risk management (hedging against price volatility and supply disruptions), cost optimization (securing energy at the best possible prices) and revenue generation (profiting from price differences in different markets).

Market types include:

  • Spot Markets: Immediate delivery and payment of energy commodities.
  • Futures Markets: Contracts to buy or sell a commodity at a future date, helping manage price risks.
  • Over-the-Counter (OTC) Markets: Direct trades between parties, often customized contracts.
  • Exchanges: Platforms like the New York Mercantile Exchange (NYMEX) and Intercontinental Exchange (ICE) where standardized contracts are traded.

Energy trading is subject to extensive regulation to ensure fair practices, prevent market manipulation, and protect consumers.

Data streaming with Apache Kafka and Flink provides a unique combination of capabilities:

  • Real-time messaging at scale for analytical and transactional workloads.
  • Event store for durability, true decoupling and the ability to travel back in time for replayability of events with guaranteed ordering (see the replay sketch below)
  • Data integration with any data source and sink (real-time, near real-time, batch, request-response APIs, files, etc.)
  • Stream processing for stateless and stateful correlations of data for streaming ETL and business applications

Data Streaming Platform with Apache Kafka and Apache Flink for Energy Trading
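The event-store capability in the list above means a trading or risk application can rewind and reprocess history. The following hedged Java sketch, with placeholder topic and timings, uses offsetsForTimes() to replay the last 24 hours of trade events:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class TradeEventReplay {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "risk-replay");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        long replayFrom = Instant.now().minus(Duration.ofHours(24)).toEpochMilli();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            Map<TopicPartition, Long> request = new HashMap<>();
            consumer.partitionsFor("trade-events").forEach(p ->
                    request.put(new TopicPartition(p.topic(), p.partition()), replayFrom));

            consumer.assign(request.keySet());
            // Map the timestamp to concrete offsets and rewind each partition.
            Map<TopicPartition, OffsetAndTimestamp> offsets = consumer.offsetsForTimes(request);
            offsets.forEach((tp, offset) -> {
                if (offset != null) consumer.seek(tp, offset.offset());
            });

            consumer.poll(Duration.ofSeconds(1))
                    .forEach(r -> System.out.println(r.timestamp() + " " + r.value()));
        }
    }
}
```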

Trading Architecture with Apache Kafka

Many trading markets use data streaming with Apache Kafka under the hood to integrate with internal systems, external exchanges and data providers, clearing houses and regulators:

Trading in Financial Services with Data Streaming using Apache Kafka
Source: Confluent

For instance, NASDAQ combines critical stock exchange trading with low-latency streaming analytics. Energy trading is not much different, even though the interfaces and challenges differ a bit because various additional IoT data sources are involved.

Data streaming with Apache Kafka and Apache Flink is highly beneficial for energy trading for several reasons across the end-to-end business process and data pipelines.

Energy Trading Business Processing with Data in Motion

Here is why these technologies are often used in the energy sector:

Real-Time Data Processing

Real-Time Analytics: Energy trading relies on real-time data to make informed decisions. Kafka and Flink can process data streams in real-time, providing immediate insights into market conditions, energy consumption and production levels.

Immediate Response: Real-time processing allows traders to respond instantly to market changes, such as price fluctuations or sudden changes in supply and demand, optimizing trading strategies and mitigating risks.

Scalability and Performance

Scalability: Both Kafka and Flink handle high-throughput data streams. This scalability is crucial for energy markets, which generate vast amounts of data from multiple sources, including sensors, smart meters, and market feeds.

High Performance: Data streaming enables fast data processing and analysis. Kafka ensures low-latency data ingestion, while Flink provides efficient, distributed stream processing.

Fault Tolerance and Reliability

Fault Tolerance: Kafka’s distributed architecture ensures data durability and fault tolerance, essential for the continuous operation of energy trading systems.

Reliability: Flink offers exactly-once processing semantics, ensuring that each piece of data is processed accurately without loss or duplication, which is critical for maintaining data integrity in trading operations.

Integration and Flexibility

Integration Capabilities: Kafka can integrate with various data sources and sinks via Kafka Connect or client APIs for Java, C, C++, Python, JavaScript or REST/HTTP. This makes it versatile for collecting data from different energy systems. Flink can process this data in real-time and output the results to various storage systems or dashboards.

Flexible Data Processing: Flink supports complex event processing, windowed computations, and machine learning, allowing for advanced analytics and predictive modeling in energy trading.

Event-Driven Architecture (EDA)

Event-Driven Processing: Energy trading can benefit from an event-driven architecture where trading decisions and alerts are triggered by specific events, such as market price thresholds or changes in energy production. Kafka and Flink facilitate this approach by efficiently handling event streams.

Energy Trading at Uniper

Uniper is a global energy company headquartered in Germany that focuses on power generation, energy trading, and energy storage solutions, providing electricity, natural gas, and other energy services to customers worldwide.

Uniper - The beating heart of energy
Source: Uniper

Uniper’s Business Value of Data Streaming

Why has Uniper chosen to use the Apache Kafka and Apache Flink ecosystem? If you look at the trade lifecycle in the energy sector, you can probably figure it out yourself:

Energy Trading Lifecycle
Source: Uniper

The underlying process is much more complex than the picture above shows. For instance, pre-trading includes aspects like capacity management. If you trade energy between the Netherlands and Germany, the transportation of the energy needs to be planned while executing the trade. Uniper explains the process in much more detail in the webinar recording below.

Here are Uniper’s benefits of implementing the trade lifecycle with data streaming using Kafka and Flink, as they described them:

Business-driven:

  • Increase of trading volumes
  • More messages per day
  • Faster processing of data

Architecture-driven:

  • Decoupling of applications
  • Faster processing of data – batch vs. streaming data
  • Reusability of data

Uniper’s IT Landscape

Uniper’s enterprise architecture leverages data streaming as the central nervous system between various technical platforms (integrated via Kafka Connect or Apache Camel) and business applications (e.g., algorithmic trading, dispatch and invoicing systems).

Apache Kafka and Flink integrated into the Uniper IT landscape
Source: Uniper

Uniper runs mission-critical workloads through Kafka. Confluent Cloud provides the right scale, elasticity, and SLAs for such use cases. Apache Flink serves ETL use cases for continuous stream processing.

Kafka Connect provides many connectors for direct integration with (non)streaming interfaces. Apache Camel is used for some other protocols that do not fit well into a native Kafka connector. Camel is an integration framework with native Kafka integration.

Fun fact: in case you did not know, I have a history with Apache Camel, too. I worked a lot with this open source framework as an independent consultant and at Talend with its Enterprise Service Bus (ESB) powered by Apache Camel. Hence, my blog has some articles about Apache Camel as well, including “When to use Apache Camel vs. Apache Kafka?”.

The following on-demand webinar recording explores the relation between data streaming and energy trading in more detail. Uniper’s Alex Esseling (Platform & Security Architect, Sales & Trading IT) discusses Apache Kafka and Flink inside energy trading at Uniper:

Energy Trading Webinar with Confluent and Uniper
Source: Confluent

IoT Data for Energy Trading

Energy trading differs a bit from traditional trading on Nasdaq and similar financial markets as IoT data is an important additional data source for several key reasons:

1. Real-Time Market Insights

  • Live Data Feed: IoT devices, such as smart meters and sensors, provide real-time data on energy production, consumption, and grid status, enabling traders to make informed decisions based on the latest market conditions.
  • Demand Forecasting: Accurate demand forecasting relies on real-time consumption data, which IoT devices supply continuously, helping traders anticipate market movements and adjust their strategies accordingly.

2. Enhanced Decision Making

  • Predictive Analytics: IoT data allows for sophisticated predictive analytics, helping traders forecast price trends, identify potential supply disruptions, and optimize trading positions.
  • Risk Management: Continuous monitoring of energy infrastructure through IoT sensors helps in identifying and mitigating risks, such as equipment failures or grid imbalances, which could affect trading decisions.

A typical use case in energy trading might involve:

  1. Data Collection: IoT devices across the energy grid collect data on energy production from renewable sources, consumption patterns in residential and commercial areas, and grid stability metrics.
  2. Data Analysis: This data is streamed and processed in real-time using platforms like Apache Kafka and Flink, enabling immediate analysis and visualization.
  3. Trading Decisions: Traders use the insights derived from this analysis to make informed decisions about buying and selling energy, optimizing their strategies based on current and predicted market conditions.

In summary, IoT data is essential in energy trading for providing real-time insights, enhancing decision-making, optimizing grid operations, ensuring compliance, and integrating renewable energy sources, ultimately leading to a more efficient and responsive energy market.
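The three steps above can be condensed into a consume-decide-produce loop. The sketch below is a deliberately simplified illustration: the topic names, message format, and thresholds are invented for the example, and a real system would use a stream processor and proper risk controls rather than a hard-coded rule:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class TradingSignalService {

    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put("bootstrap.servers", "localhost:9092");
        cProps.put("group.id", "trading-signals");
        cProps.put("key.deserializer", StringDeserializer.class.getName());
        cProps.put("value.deserializer", StringDeserializer.class.getName());

        Properties pProps = new Properties();
        pProps.put("bootstrap.servers", "localhost:9092");
        pProps.put("key.serializer", StringSerializer.class.getName());
        pProps.put("value.serializer", StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {

            consumer.subscribe(List.of("grid-metrics")); // assumed pre-aggregated grid data
            while (true) {
                consumer.poll(Duration.ofMillis(500)).forEach(record -> {
                    // Placeholder format: "renewableShare,spotPrice", e.g. "0.72,38.5"
                    String[] parts = record.value().split(",");
                    double renewableShare = Double.parseDouble(parts[0]);
                    double spotPrice = Double.parseDouble(parts[1]);

                    // Invented rule: buy when renewables are abundant and the price is low.
                    if (renewableShare > 0.6 && spotPrice < 40.0) {
                        producer.send(new ProducerRecord<>("trade-signals", record.key(), "BUY"));
                    }
                });
            }
        }
    }
}
```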

Data Streaming to Ingest IoT Data into Energy Trading

Data streaming with Kafka and Flink is deployed in various edge and hybrid cloud energy use cases.

Data Streaming with Kafka at the Edge and Hybrid Cloud in the Energy Industry

As discussed above, some of the IoT data is very helpful for energy trading, not just for OT and operational workloads. Read more about data streaming in the IoT space in my related blog articles.

Powerledger – Energy Trading with Kafka, MongoDB and Blockchain

Powerledger is another excellent success story for energy trading powered by data streaming with Apache Kafka. The technology company uses blockchain to enable decentralized energy trading. Their platform allows users to trade energy directly with each other, manage renewable energy certificates, and track the origin and movement of energy in real-time.
The platform provides:
  • Tracking, tracing and trading of renewable energy
  • Blockchain-based energy trading platform
  • Facilitating peer-to-peer (P2P) trading of excess electricity from rooftop solar power installations and virtual power plants
  • Traceability with non-fungible tokens (NFTs) representing renewable energy certificates (RECs)

Powerledger uses a decentralised market rather than the conventional unidirectional one. Benefits include reduced customer acquisition costs, increased customer satisfaction, better prices for buyers and sellers (compared with feed-in and supply tariffs), and provision for cross-retailer trading.

Powerledger uses Apache Kafka via Confluent Cloud as a core piece of infrastructure, specifically to ingest data from smart electricity meters and feed it into the trading system.
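
A minimal Java producer sketch for this kind of smart meter ingestion could look as follows. The topic name, meter ID, and JSON payload are hypothetical, and a real Confluent Cloud deployment would additionally configure SASL_SSL credentials:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class MeterReadingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Confluent Cloud would add SASL_SSL and API key settings here
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key = meter ID so all readings of one meter land in the same partition
            producer.send(new ProducerRecord<>("smart-meter-readings", "meter-42",
                    "{\"kwh\": 1.37, \"timestamp\": 1714000000000}"));
        }
    }
}
```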

Wondering why to combine Kafka and Blockchain? Learn more here: “Apache Kafka and Blockchain – Comparison and a Kafka-native Implementation“.

re.alto – Solar Trading: Insights into the Production of Solar Plants

re.alto is a company that provides a digital marketplace for energy data and services. Their platform connects energy providers, consumers, and developers, facilitating the exchange of data and APIs (application programming interfaces) to optimize energy usage and distribution. By enabling seamless access to diverse energy-related data, re.alto supports innovation, enhances energy efficiency, and helps create smarter, more flexible energy systems.

re.alto presented their data streaming use cases at the Data in Motion Tour in Brussels, Belgium:

  • Xenn: Real-time monitoring of energy costs
  • Smart charging: Schedule the charging of an electric vehicle to reduce costs or environmental impact
  • Solar trading: Insights into the production of solar plants

Let’s explore solar trading in more detail. re.alto presented its platform at the Confluent Data in Motion Tour 2024 in Brussels, Belgium. re.alto’s platform provides connectivity and APIs for market pricing data, but also IoT integration with smart meters, grid data, batteries, SCADA systems, etc.:

re.alto Platform with Energy Market and IIoT Connectivity to Smart Meters and SCADA Systems
Source: re.alto

Solar trading includes three steps:

  1. Data collection from inverter vendors such as SMA, FIMER, Huawei, and SolarEdge
  2. Data processing with data streaming, time series analytics, and overvoltage detection (see the sketch after this list)
  3. Providing data via EPEX Spot and APIs / a marketplace
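
Here is the sketch for the overvoltage detection mentioned in step 2, implemented with Kafka Streams. The topic names and the plain-string value format are assumptions, and the 253 V threshold simply reflects the common 230 V +10% tolerance band rather than re.alto's actual logic:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class OvervoltageDetector {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "overvoltage-detector");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Hypothetical topic: key = inverter ID, value = measured voltage as plain string
        KStream<String, String> voltages = builder.stream("inverter-voltage");
        voltages
                // 253 V = 230 V + 10%, a simplified overvoltage condition
                .filter((inverterId, volts) -> Double.parseDouble(volts) > 253.0)
                .to("overvoltage-alerts");

        new KafkaStreams(builder.build(), props).start();
    }
}
```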

Solar Trading with Data Streaming using Apache Kafka and Timeseries Analytics at Energy Company re.alto

Energy Trading Needs Reliable Real-Time Data Feeds and Connectivity

Energy trading requires scalable and reliable real-time data processing. In contrast to trading in financial markets, the energy sector additionally integrates IoT data sources like smart meters and sensors.

Uniper, re.alto and Powerledger are excellent examples of how to build a reliable energy trading platform powered by data streaming.

What does your enterprise architecture for energy trading look like? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Energy Trading with Apache Kafka and Flink appeared first on Kai Waehner.

]]>
Modernizing SCADA Systems and OT/IT Integration with Data Streaming https://www.kai-waehner.de/blog/2023/09/10/modernizing-scada-systems-and-ot-it-integration-with-data-streaming/ Sun, 10 Sep 2023 12:56:13 +0000 https://www.kai-waehner.de/?p=5304 SCADA control systems are a vital component of IT/OT modernization. The old IT/OT infrastructure and SCADA system are monolithic, proprietary, not scalable, and lack open APIs based on standard interfaces. This post explains the modernization of such a system based on the real-life use case of 50Hertz, a transmission system operator for electricity in Germany. A lightboard video is included.

The post Modernizing SCADA Systems and OT/IT Integration with Data Streaming appeared first on Kai Waehner.

]]>
SCADA control systems are a vital component of IT/OT modernization. The old IT/OT infrastructure and SCADA system are monolithic, proprietary, not scalable, and lack open APIs based on standard interfaces. This post explains the modernization of such a system based on the real-life use case of 50Hertz, a transmission system operator for electricity in Germany. Two common business goals drove them: improving the Overall Equipment Effectiveness (OEE) and staying innovative. A lightboard video about the related data streaming enterprise architecture is included.

Modernization of OT IT and SCADA with Data Streaming

The State of Data Streaming for Manufacturing in 2023

The evolution of industrial IoT, manufacturing 4.0, and digitalized B2B and customer relations require modern, open, and scalable information sharing. Data streaming allows integrating and correlating data in real-time at any scale. Trends like software-defined manufacturing and data streaming help modernize and innovate the entire engineering and sales lifecycle.

I have recently presented an overview of trending enterprise architectures in the manufacturing industry and data streaming customer stories from BMW, Mercedes, Michelin, and Siemens. A complete slide deck and on-demand video recording are included:

This blog post explores one of the enterprise architectures and case studies in more detail: Modernization of legacy and proprietary monoliths and SCADA systems to a scalable, open platform with real-time data integration capabilities.

What is a SCADA System? And how does Data Streaming help?

Supervisory control and data acquisition (SCADA) is a control system architecture comprising computers, networked data communications, and graphical user interfaces for high-level supervision of machines and processes. It also covers sensors and other devices, such as programmable logic controllers, which interface with process plants or machinery.

Supervisory control and data acquisition - SCADA

Data streaming helps connect high-volume sensor data from machines, PLCs, robots, and other IoT devices. This is possible in real-time at scale with stream processing. The de facto standard for data streaming is Apache Kafka and its ecosystem, including Kafka Streams and Kafka Connect.

Enterprises leverage Apache Kafka as the next generation of Data Historians. Integrating and pre-processing the events with data streaming is a prerequisite for data correlation with information systems like the MES or ERP (which might run at the edge or, more often, in the cloud).
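
As an illustration of such pre-processing, the following Kafka Streams sketch downsamples raw sensor values to a per-10-second peak before information systems such as an MES or ERP consume them. The topic names and serdes are assumptions for illustration:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class SensorDownsampler {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sensor-downsampler");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        // Hypothetical topic: key = sensor/PLC tag, value = raw measurement
        builder.stream("plc-sensor-raw", Consumed.with(Serdes.String(), Serdes.Double()))
               .groupByKey()
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(10)))
               .reduce(Double::max) // keep only the peak value per 10-second window
               .toStream()
               .map((windowedKey, max) -> KeyValue.pair(windowedKey.key(), max))
               .to("plc-sensor-10s-peaks", Produced.with(Serdes.String(), Serdes.Double()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```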

50hertz: A cloud-native SCADA system built with Apache Kafka

50hertz is a transmission system operator for electricity in Germany. The company secures electricity supply to 18 million people in northern and eastern Germany.

The infrastructure must operate 24 hours, seven days a week. Various shift teams and a mission-critical SCADA infrastructure supervise and control the OT systems.

50hertz next-generation Modular Control Center System (MCCS) leverages a central, scalable, event-based integration platform based on Confluent:

Cloud-native SCADA system built with Apache Kafka at 50hertz
Source: 50hertz

The first four containers include the Supervisory & Control (SCADA), Load Frequency Control (LFC), and Time Series Management & Forecasting applications. Each container can have multiple services/functions that follow the event-based microservices pattern.

50hertz provides central governance for security, protocols, and data schemas (CIM-compliant) between platform containers/modules. The cloud-native 24/7 SCADA system is developed in the cloud and deployed in safety-critical edge environments.

50hertz presented their OT/IT and SCADA modernization leveraging data streaming with Apache Kafka at the Confluent Data in Motion tour 2021. Unfortunately, the on-demand video recording is available only in German. Therefore, in another blog post, I wrote more about the case study: “A cloud-native SCADA System for Industrial IoT built with Apache Kafka“.

Lightboard Video: How Data Streaming Modernizes SCADA and OT/IT

Here is a five-minute lightboard video that describes how data streaming helps with modernizing monolith and proprietary SCADA infrastructure and OT/IT environments:

If you liked this video, make sure to follow the YouTube channel for many more lightboard videos across all industries.

Apache Kafka glues together the old and new OT/IT World

The 50Hertz case study showed how to modernize an existing legacy infrastructure with cloud-native technologies, whether you deploy at the edge or in the public cloud. For more case studies, check out the free “The State of Data Streaming in Manufacturing” on-demand recording or read the related blog post.

A common question in these scenarios is the proper communication and integration protocol when you move away from proprietary legacy PLCs and OT interfaces. MQTT and OPC-UA established themselves as excellent standards with different sweet spots. Data Streaming with Apache Kafka is complementary, not competitive. Learn more by reading “OPC UA, MQTT, and Apache Kafka – The Trinity of Data Streaming in IoT“.

How do you leverage data streaming in your manufacturing use cases? Do you deploy at the edge, in the cloud, or both? Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.

The post Modernizing SCADA Systems and OT/IT Integration with Data Streaming appeared first on Kai Waehner.

]]>
The State of Data Streaming for Energy & Utilities https://www.kai-waehner.de/blog/2023/09/01/the-state-of-data-streaming-for-energy-utilities-in-2023/ Fri, 01 Sep 2023 07:14:02 +0000 https://www.kai-waehner.de/?p=5606 The evolution of utility infrastructure, energy distribution, customer services, and new business models requires real-time end-to-end visibility, reliable and intuitive B2B and B2C communication, and integration with pioneering technologies like 5G for low latency or augmented reality for innovation. I look at trends in the utilities sector to explore how data streaming helps as a business enabler, including customer stories from SunPower, 50hertz, Powerledger, and more. A complete slide deck and on-demand video recording are included.

The post The State of Data Streaming for Energy & Utilities appeared first on Kai Waehner.

]]>
This blog post explores the state of data streaming for the energy and utilities industry. The evolution of utility infrastructure, energy distribution, customer services, and new business models requires real-time end-to-end visibility, reliable and intuitive B2B and B2C communication, and integration with pioneering technologies like 5G for low latency or augmented reality for innovation. Data streaming allows integrating and correlating data in real-time at any scale to improve most workloads in the energy sector.

I look at trends in the utilities sector to explore how data streaming helps as a business enabler, including customer stories from SunPower, 50hertz, Powerledger, and more. A complete slide deck and on-demand video recording are included.

The State of Data Streaming for Energy and Utilities in 2023

The energy & utilities industry is fundamental for a sustainable future. Gartner explores the Top 10 Trends Shaping the Utility Sector in 2023: “In 2023, power and water utilities will continue to face a variety of forces that will challenge their business and operating models and shape their technology investments.

Utility technology leaders must confidently compose the future for their organizations in the midst of uncertainty during this energy transition volatile period — the future that requires your organizations to be both agile and resilient.”

Gartner - Top 10 Trends Shaping the Utility Sector in 2023

From system-centric and large to smaller-scale and distributed

The increased use of digital tools makes the expected structural changes in the energy system possible:

Smart Grid - Energy Industry

Energy AI use cases

Artificial Intelligence (AI) with technologies like Machine Learning (ML) and Generative AI (GenAI) is a hot topic across all industries. Innovation around AI disrupts many business models, tasks, business processes, and labor.

NVIDIA created an excellent diagram showing the various opportunities for AI in the energy & utilities sector. It separates the scenarios by segment: upstream, midstream, downstream, power generation, and power distribution:

AI Use Cases in the Energy sector (Source: NVIDIA)
AI Use Cases in the Energy Sector (Source: NVIDIA)

Cybersecurity: The threat is real!

McKinsey & Company explains that “the cyberthreats facing electric-power and gas companies include the typical threats that plague other industries: data theft, billing fraud, and ransomware. However, several characteristics of the energy sector heighten the risk and impact of cyberthreats against utilities:”

McKinsey - Cybersecurity in Energy & Utilities

Data streaming in the energy & utilities industry

Adopting trends like predictive maintenance, track&trace, proactive sales and marketing, or threat intelligence is only possible if enterprises in the energy sector can provide and correlate information at the right time in the proper context. Real-time, which means using the information in milliseconds, seconds, or minutes, is almost always better than processing data later (whatever later means):

Real-Time with Data Streaming powered by Apache Kafka and Flink

Data streaming combines the power of real-time messaging at any scale with storage for true decoupling, data integration, and data correlation capabilities. Apache Kafka is the de facto standard for data streaming.

Apache Kafka for Smart Grid, Utilities and Energy Production” is a great starting point to learn more about data streaming in the industry, including a few case studies not covered in this blog post – such as

  • EON: Smart grid for energy production and distribution with Apache Kafka
  • Devon Energy: Kafka at the edge for hybrid integration and analytics in the cloud
  • Tesla: Kafka-based data platform for trillions of data points per day

5 Ways Utilities Accomplish More with Real-Time Data

“After creating a collaborative team that merged customer experience and digital capabilities, one North American utility went after a 30 percent reduction in its cost-to-serve customers in some of its core journeys.”

As the Utilities Analytics Institute explains: “Utilities need to ensure that the data they are collecting is high quality, specific to their needs, preemptive in nature, and, most importantly, real-time.” The following five characteristics are crucial to add value with real-time data:

  1. High-Quality Data
  2. Data Specific to Your Needs
  3. Make Your Data Proactive
  4. Data Redundancy
  5. Data is Constantly Changing

Real-Time Data for Smart Meters and Common Practice

Smart meters are a perfect example of increasing business value with real-time data streaming. As Clou Global confirms: “The use of real-time data in smart grids and smart meters is a key enabler of the smart grid“.

Possible use cases include:

  1. Load Forecasting
  2. Fault Detection
  3. Demand Response
  4. Distribution Automation
  5. Smart Pricing

Processing and correlating events from smart meters with stream processing is just one IoT use case. You can leverage “Apache Kafka and Apache Flink for many Industrial IoT and Manufacturing 4.0 use cases“.

And there is so much more if you expand your thinking from upstream through midstream to downstream applications to “transform the global supply chain with data streaming and IoT“.

Cloud adoption in utilities & energy sector

Accenture points out that 84% use Cloud SaaS solutions and 79% use Cloud PaaS Solutions in the energy & utilities market for various reasons:

  • New approach to IT
  • Incremental adoption
  • Improved scalability, efficiency, agility and security
  • Unlock most business value

This is a general statistic, but this applies to all components in the data-driven enterprise, including data streaming. A company does not just move a specific application to the cloud; this would be counter-intuitive from a cost and security perspective. Hence, most companies start with a hybrid architecture and bring more and more workloads to the public cloud.

The energy & utilities industry applies various trends for enterprise architectures for cost, flexibility, security, and latency reasons. The three major topics I see these days at customers are:

  • Global data streaming
  • Edge computing and hybrid cloud integration
  • OT/IT modernization

Let’s look deeper into some enterprise architectures that leverage data streaming for energy & utilities use cases.

Global data streaming across data centers, clouds and the edge

Energy and utilities require data infrastructure everywhere. While most organizations have a cloud-first strategy, there is no way around running some workloads at the edge outside a data center for cost, security, or latency reasons.

Data streaming is available everywhere:

Apache Kafka in the Shipping Industry for Marine, Oil Transport, Vessel Fleet, Shipping Line, Drones

Data synchronization across environments, regions, and clouds is possible with open-source Kafka tools like MirrorMaker. However, this requires additional infrastructure and development/operations efforts. Innovative solutions like Confluent’s Cluster Linking leverage the Kafka protocol for real-time replication. This enables much easier deployments and significantly reduced network traffic.
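
As a sketch of the open-source approach, a minimal MirrorMaker 2 configuration that replicates curated topics unidirectionally from an edge cluster to the cloud could look like this (the cluster aliases, bootstrap addresses, and topic pattern are placeholders):

```properties
# mm2.properties - run with: bin/connect-mirror-maker.sh mm2.properties
clusters = edge, cloud
edge.bootstrap.servers = edge-broker:9092
cloud.bootstrap.servers = cloud-broker:9092

# Replicate only curated topics unidirectionally from edge to cloud
edge->cloud.enabled = true
edge->cloud.topics = curated-.*
cloud->edge.enabled = false
```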

Edge computing and hybrid cloud integration

Kafka deployments look different depending on where they need to run.

Fully managed serverless offerings like Confluent Cloud are highly recommended in the public cloud to focus on business logic with reduced time-to-market and TCO.

In a private cloud, data center or edge environment, most companies deploy on Kubernetes today to provide a similar cloud-native experience.

Kafka can also be deployed on industrial PCs (IPC) and other industrial hardware. Many use cases exist for data streaming at the edge. Sometimes, a single broker (without high availability) is good enough.
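
For such a single-broker edge deployment, a minimal broker configuration sketch could look like the following, here using Kafka's KRaft mode with placeholder paths and ports:

```properties
# server.properties for a single-node edge broker (no high availability)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
inter.broker.listener.name=PLAINTEXT
log.dirs=/var/lib/kafka/data
# Single node, so internal topics cannot be replicated
offsets.topic.replication.factor=1
```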

No matter how you deploy data streaming workloads, a key value is the unidirectional or bidirectional synchronization between clusters. Often, only curated and relevant data is sent to the cloud for cost reasons. Also, command & control patterns can start a business process in the cloud and send events to the edge.

Event Streaming for Energy Production Upstream and Midstream at the Edge with a 5G Campus Network and Kafka

OT/IT modernization with data streaming

The energy sector operates many monolithic, inflexible, and closed software and hardware products. This is changing in this decade. OT/IT modernization and the digital transformation require open APIs, flexible scale, and decoupled applications (from different vendors).

Many companies leverage Apache Kafka to build a postmodern data historian to complement or replace existing expensive OT middleware:

Apache Kafka as open scalable Data Historian for IIoT with MQTT and OPC UA

Just to be clear: Kafka and any other IT software like Spark, Flink, Amazon Kinesis, and so on are NOT hard real-time. They cannot be used for safety-critical use cases with deterministic systems like autonomous driving or robotics. That is the domain of C, Rust, or other embedded software.

However, data streaming connects the OT and IT worlds. As part of that, connectivity with robotic systems, intelligent vehicles, and other IoT devices is the norm for improving logistics, integration with ERP and MES, aftersales, etc.

Learn more about this discussion in the two related articles on this blog.

New customer stories for data streaming in the energy & utilities sector

So much innovation is happening in the energy & utilities sector. Automation and digitalization change how utilities monitor infrastructure, build customer relationships, and create completely new business models.

Most energy service providers use a cloud-first approach to improve time-to-market, increase flexibility, and focus on business logic instead of operating IT infrastructure. And elastic scalability gets even more critical with all the growing networks, 5G workloads, autonomous vehicles, drones, and other innovations.

Here are a few customer stories from worldwide energy & utilities organizations:

  • 50hertz: A grid operator modernization of the legacy, monolithic and proprietary SCADA infrastructure to cloud-native microservices and a real-time data fabric powered by data streaming. More details: A cloud-native SCADA System for Industrial IoT built with Apache Kafka.
  • SunPower: Solar solutions across the globe where 6+ million devices in the field send data to the streaming platform. However, sensor data alone is not valuable! Fundamentals for delivering customer value include measurement ingestion, metadata association, storage, and analytics.
  • aedifion: Efficient management of real estate to operate buildings better and meet environmental, social, and corporate governance (ESG) goals. Secure connectivity and reliable data collection are implemented with Confluent Cloud (replacing the existing MQTT-based pipeline).
  • Ampeers Energy: Decarbonization for real estate. The service provides district management with IoT-based forecasts and optimization, plus local energy usage accounting. The real-time analytics of time-series data is implemented with OPC-UA, Confluent Cloud, and TimescaleDB.
  • Powerledger: Green energy trading with blockchain-based tracking, tracing, and trading of renewable energy from rooftop solar power installations and virtual power plants. Non-fungible tokens (NFTs) represent renewable energy certificates (RECs) in a decentralised rather than the conventional unidirectional market. Confluent Cloud ingests data from smart electricity meters. Learn more: data streaming and blockchain.

Resources to learn more

This blog post is just the starting point. Learn more about data streaming in the energy & utilities industry in the following on-demand webinar recording, the related slide deck, and further resources, including pretty cool lightboard videos about use cases.

On-demand video recording

The video recording explores the energy & utilities industry’s trends and architectures for data streaming. The primary focus is the data streaming case studies. Check out our on-demand recording:

Confluent Video Recording about the Energy Sector

Slides

If you prefer learning from slides, check out the deck used for the above recording:


Case studies and lightboard videos for data streaming in the energy & utilities industry

The state of data streaming for energy & utilities is fascinating. New use cases and case studies come up every month. This includes better data governance across the entire organization, real-time data collection and processing data across hybrid edge and cloud infrastructures, data sharing and B2B partnerships for new business models, and many more scenarios.

We recorded lightboard videos showing the value of data streaming simply and effectively. These five-minute videos explore the business value of data streaming, related architectures, and customer stories. Stay tuned; I will update the links in the next few weeks and publish a separate blog post for each story and lightboard video.

And this is just the beginning. Every month, we will talk about the status of data streaming in a different industry. Manufacturing was the first. Financial services second, then retail, telcos, gaming, and so on… Check out my other blog posts.

Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.

The post The State of Data Streaming for Energy & Utilities appeared first on Kai Waehner.

]]>
Apache Kafka in the Public Sector – Part 4: Energy and Utilities https://www.kai-waehner.de/blog/2021/10/18/apache-kafka-public-sector-part-4-energy-utilities-smart-grid/ Mon, 18 Oct 2021 12:47:01 +0000 https://www.kai-waehner.de/?p=3811 The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores both edges to show how data in motion powered by Apache Kafka adds value for innovative new applications and for modernizing legacy IT infrastructures. This is part 4: Use cases and architectures for energy, utilities, and smart grid infrastructures.

The post Apache Kafka in the Public Sector – Part 4: Energy and Utilities appeared first on Kai Waehner.

]]>
The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores how the public sector leverages data in motion powered by Apache Kafka to add value for innovative new applications and to modernize legacy IT infrastructures. This is part 4: Use cases and architectures for energy, utilities, and smart grid infrastructure.

Apache Kafka for Public Utilities and Energy Sector

Blog series: Apache Kafka in the Public Sector and Government

This blog series explores why many governments and public infrastructure sectors leverage event streaming for various use cases. Learn about real-world deployments and different architectures for Kafka in the public sector:

  1. Life is a Stream of Events
  2. Smart City
  3. Citizen Services
  4. Energy and Utilities (THIS POST)
  5. National Security

Subscribe to my newsletter to get updates immediately after the publication. Besides, I will also update the above list with direct links to this blog series’s posts once published.

As a side note: if you wonder why healthcare is not on the above list, it is because healthcare deserves a blog series of its own. While the government can provide public health care through national healthcare systems, it is part of the private sector in many other cases.

Energy, Utilities, Smart Grid, and the Public Sector

The energy sector differs across countries and even states. Public utilities are subject to public control and regulation, ranging from local community-based groups to statewide government monopolies. Hence, some markets are private businesses, some are entirely controlled by the government, and some are a mix of both. The complex US regulated vs. deregulated electricity market is a good example.

Nevertheless, one thing is clear: the energy sector is changing, no matter whether the government entirely regulates the market or not:

Smart Grid - Energy Industry

Let’s look at a few real-world examples for Apache Kafka in the Energy Sector, its relation to the public sector, and a few possible enterprise architectures.

Kafka Examples for Public Utilities

First of all, I already wrote about data in motion powered by Kafka in the energy sector. I also had a great panel discussion about edge and hybrid architectures, Kafka, and 5G networks in the oil and gas and mining industries.

Let’s now take a look at two more examples:

  • Stadtwerke Leipzig: A government-owned electricity provider
  • Tesla: A private company heavily influenced by the public administration

Stadtwerke Leipzig – Digital Customer Interface for Public Utilities

Stadtwerke Leipzig is a municipal energy utility in central Germany that provides electricity, natural gas, and district heating. They are wholly owned by LVV Leipziger Versorgungs- und Verkehrsgesellschaft, in which the City of Leipzig holds a 100% stake.

Leipziger Stadtwerke built a digital customer interface to connect public utilities, grid operators, the housing industry, end-consumers, and industrial customers:

Apache Kafka at Leipziger Stadtwerke Utilities Energy Public Sector

Unfortunately, the picture is of poor quality and not available in a better version. The essential point is that the long green rectangle in the middle is Apache Kafka. Kafka is the central nervous system that connects edge devices, proprietary protocols, and open standards such as MQTT, OPC-UA, XML, JSON, etc. This way, the OT and IT worlds are connected with a single, scalable real-time pipeline.

Instead of having various data silos, the data is now accessible by any interested consumer in real-time at scale. Hence, this architecture solves one of the biggest challenges in energy infrastructures: getting value out of the massive volumes of OT data. Moreover, the enterprise architecture allows different technologies and brownfield integration. Kafka provides automatic backpressure handling and preprocessing.

Leipziger Stadtwerke combines Kafka with other great technologies to build innovative digital services, for instance, Kunbus edge devices (Raspberry Pi-based industrial hardware with custom Linux) and over-the-air (OTA) updates with Mender.

Tesla – Streaming IoT Data for Innovative Services

Tesla is a private enterprise, not within the public sector. However, living in Germany, I see how closely the company is intertwined with public administration, government, law, etc. The Gigafactory in Berlin is in the press every week. The innovation around electric cars is a widespread public discussion; even German competitors like Volkswagen acknowledge Tesla’s innovative business. As the public sector often does not talk to the public about its projects, I thought Tesla’s Kafka success story was still worth mentioning in this post.

Why?

Well, because Tesla has a considerable energy business (they don’t just sell cars), innovates like few other car and energy companies, and needs to collaborate with governments across the globe regarding legal compliance, charging infrastructure, and other crucial topics.

Tesla processes trillions of messages per day for IoT use cases with Kafka. The data comes from connected cars, power packs, factories, charging stations, etc. Tesla’s Kafka Summit talk shared exciting insights into their Kafka journey:


Tesla using Apache Kafka for IoT and Energy Sector

Hybrid IoT Architecture for the Energy Sector

IT architectures in the public sector look very similar to the private sector. The main difference is the more limited usage of public cloud providers. Nevertheless, most energy infrastructures require a hybrid approach with edge computing outside a data center or cloud.

Let’s take a look at a few example architectures for energy production from upstream and midstream to downstream:

Energy Production and Distribution with a Hybrid Architecture using Kafka

Event Streaming enables data integration and data processing in motion, whether it has to happen at the edge or in the data center/cloud.

Edge Computing with Kafka in Disconnected Offline Mode

At the edge, data is often filtered, preprocessed, and aggregated for latency, security, or cost reasons:

Event Streaming for Energy Production Upstream and Midstream at the Edge with a 5G Campus Network and Kafka

Disconnected data processing at the edge is crucial in many energy and utilities use cases. It has to work even without an internet connection in “offline mode”:

Energy Production at the Disconnected Edge Upstream with Apache Kafka in the Public Sector

The same applies on the consumer side. The point-of-sale (POS) has to run 24/7 for transactional workloads, no matter if there is an internet connection:

 

Edge Processing at the Intelligent Gas Station Downstream with Apache Kafka

I covered edge use cases for Kafka and security implications with Kafka in a zero-trust air-gapped environment in separate posts.

Cybersecurity – The Threat is Real for Public Sector and the Energy Infrastructure

Cybersecurity is crucial everywhere in the public sector, including citizen services, smart city, and mobility services. But in these “convenience use cases”, we “only” talk about data privacy. Yes, this is very important. But in the energy sector, we are talking about safety and human lives at risk. The Colonial Pipeline ransomware attack in May 2021 in the US is just one of many successful attacks in the past few quarters.

National security is a huge topic for the energy sector. Electric utilities can be affected by cyberattacks across the whole value chain. McKinsey has an exciting diagram explaining this:

Cybersecurity The Threat is Real in Public Sector and Energy Infrastructure

 

The discussion around cybersecurity is a primer to the last post of this blog series.

Of course, my general blog series about Apache Kafka for Cybersecurity (Situational Awareness, Threat Intelligence, Forensics, Zero Trust, SIEM/SOAR Modernization) is helpful, too.

Data in Motion for Reliable and Scalable Smart Grid Infrastructure

This post showed a few real-world examples and architectures for data in motion in hybrid architectures in the energy industry. The public sector publishes fewer examples than the private sector. But the architectures look the same, no matter who is responsible.

The private energy sector needs to collaborate with the government and public administration like the public energy sector. The integration and processing of data in motion with Apache Kafka is a game-changer for improving existing processes and building new innovative solutions.

For instance, Tesla is a very innovative private company with cutting-edge business models that are only possible if you collect, aggregate, and leverage data streams from various data sources. Tesla’s new car insurance service is an excellent example of this. The insurance business is backed by data from many IoT sensors and applied in real-time to provide context-specific information. That’s the way to go for the public sector, too.

How do you leverage event streaming in the public sector? Are you working on energy/utility projects or building a smart grid? What technologies and architectures do you use? Which projects have you already worked on or are planning? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka in the Public Sector – Part 4: Energy and Utilities appeared first on Kai Waehner.

]]>
Apache Kafka in the Public Sector – Blog Series about Use Cases and Architectures https://www.kai-waehner.de/blog/2021/10/07/apache-kafka-public-sector-part-1-data-in-motion-use-cases-architectures-examples/ Thu, 07 Oct 2021 14:13:24 +0000 https://www.kai-waehner.de/?p=3790 The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores both edges to show how data in motion powered by Apache Kafka adds value for innovative new applications and for modernizing legacy IT infrastructures. Examples include a broad spectrum of use cases across smart cities, citizen services, energy and utilities, and national security.

The post Apache Kafka in the Public Sector – Blog Series about Use Cases and Architectures appeared first on Kai Waehner.

]]>
The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores how the public sector leverages data in motion powered by Apache Kafka to add value for innovative new applications and to modernize legacy IT infrastructures. Life is a stream of events. Therefore, the examples include a broad spectrum of use cases across smart cities, citizen services, energy and utilities, and national security, deployed across edge, hybrid, and multi-cloud scenarios.

Apache Kafka in the Public Sector and Government for Data in Motion

Blog series: Apache Kafka in the Public Sector and Government

This blog series explores why many governments and public infrastructure sectors leverage event streaming for various use cases. Learn about real-world deployments and different architectures for Kafka in the public sector:

  1. Life is a Stream of Events (THIS POST)
  2. Smart City
  3. Citizen Services
  4. Energy and Utilities
  5. National Security

Subscribe to my newsletter to get updates immediately after the publication. Besides, I will also update the above list with direct links to this blog series’s posts once published.

As a side note: if you wonder why healthcare is not on the above list, it is because healthcare deserves a blog series of its own. While the government can provide public health care through national healthcare systems, it is part of the private sector in many other cases.

The Public Sector is a Broad Spectrum of Use Cases

Real-time Data Beats Slow Data in the Public Sector

I won’t do yet another long introduction about the added value of real-time data. Check out my blog about “Use Cases across Industries for Data in Motion powered by Apache Kafka” to understand the broad spectrum and benefits. The public sector is not different: Real-time data beats slow data in almost every use case! Here are a few examples:

Real time data beats slow data in the public sector

But think about your use cases! How often can you say that getting data late (like in one hour or the following day) is better than getting data when it happens (now, in a few milliseconds or seconds)? Probably not very often.

An important fact is that the added business value comes from correlating the events from different data sources. As an example, let’s look at the processes in a smart city:

Data in Motion in the Public Sector powered by Apache Kafka

The sensor data from the car is only valuable if an application correlates it with data from other vehicles in the traffic planning system. Intelligent parking is only reasonable if it integrates with the overall city planning. Emergency services need to receive an alert in real-time if a crash happens. All of that needs to happen in real-time! It does not matter if the use case is about transactional workloads (usually smaller data sets) or analytical workloads (usually more extensive data sets).

Open API and Partnerships are Mandatory

Governments can build great applications. At least in theory. In practice, they rely on external data from partners and 3rd party applications for many potential use cases:

Data in Motion as Foundation of a Smart City powered by Apache Kafka

Governments and cities need to work with several other stakeholders, including carmakers, suppliers, telcos, mobility services, cloud providers, software providers, etc. Standards and open APIs are mandatory for successful cross-cutting projects. The foundation of such an enterprise architecture is an open, reliable, scalable platform that can process data in real-time. Apache Kafka became the de facto standard for event streaming.

Data Mesh for Sharing Events between Government and 3rd Party Applications and Services

An example that shows the added value of data integration across stakeholders and real-time processing: transportation services. A mobile app needs context. Think about hailing a taxi ride. It doesn’t help you to only see the position of each taxi on the city map in real-time. You want to know the estimated pickup time, the estimated cost, the estimated time of arrival at your destination, the car model that will pick you up, and so much more.

This use case – like many others – is only possible if you integrate and correlate the data from many different interfaces like a mapping service, all taxi drivers, all customers in a city, the weather service, backend analytics services, and much more:

Data in Motion with Kafka across the Public and Private Sector

The left side of the picture shows a dashboard built with a real-time message queue like RabbitMQ. The right side shows data correlation of data from different sources in real-time with an event streaming platform like Apache Kafka.

I hope you agree on the added value of the event streaming platform. Just sending data from A to B in real-time is not enough. Only the data processing in real-time adds true value.

Data in Motion as Paradigm Shift in the Public Sector

Real-time beats slow data. No matter if you think about cutting-edge use cases in national security or modernizing the IT infrastructure in the public administration. Event Streaming is the foundation of this paradigm shift moving towards real-time data processing in the public sector. The upcoming posts of this blog series explore many different use cases and architectures. If you also want to learn more about Apache Kafka offerings on the market, check out my comparison of Apache Kafka products and cloud services.

How do you leverage event streaming in the public sector? What technologies and architectures do you use? What projects did you already work on or are in the planning? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka in the Public Sector – Blog Series about Use Cases and Architectures appeared first on Kai Waehner.

]]>
Panel Discussion about Kafka, Edge, Networking and 5G in Oil and Gas and Mining Industry https://www.kai-waehner.de/blog/2021/08/20/panel-discussion-apache-kafka-edge-networking-5g-oil-and-gas-mining-industry/ Fri, 20 Aug 2021 07:40:43 +0000 https://www.kai-waehner.de/?p=3698 The oil & gas and mining industries require edge computing for low latency and zero trust use cases. Most IT architectures are hybrid with big data analytics in the cloud and safety-critical data processing in disconnected and often air-gapped environments. This blog post shares a panel discussion that explores the challenges, use cases, and hardware/software/network technologies to reduce cost and innovate. A key focus is on the open-source framework Apache Kafka, the de facto standard for processing data in motion at the edge and in the cloud.

The post Panel Discussion about Kafka, Edge, Networking and 5G in Oil and Gas and Mining Industry appeared first on Kai Waehner.

]]>
The oil & gas and mining industries require edge computing for low latency and zero trust use cases. Most IT architectures are hybrid with big data analytics in the cloud and safety-critical data processing in disconnected and often air-gapped environments. This blog post shares a panel discussion that explores the challenges, use cases, and hardware/software/network technologies to reduce cost and innovate. A key focus is on the open-source framework Apache Kafka, the de facto standard for processing data in motion at the edge and in the cloud.

Apache Kafka and Edge Networks in Oil and Gas and Mining

Apache Kafka at the Edge and in Hybrid Cloud

I explored the usage of event streaming at the edge and in hybrid cloud scenarios in lots of detail in the past. Hence, instead of yet another description, check out my related posts to learn about use cases and architectures.

Panel Discussion: Kafka, Network Infrastructure, Edge, and Hybrid Cloud in Oil and Gas

Here is the panel discussion. The conversation includes both the software and the hardware/infrastructure/networking perspective. It explores use cases from the oil & gas and mining industries for processing data in motion, together with technical facts about communication/radio/telco infrastructures. It was a great mix of topics that are heavily related and depend on each other to deploy a project successfully.

Speakers:

  • Andrew Duong (Confluent): Moderator
  • Kai Waehner (Confluent): Expert on hybrid software architectures and data in motion
  • Dion Stevenson (Tait Communications): Expert on hardware and network infrastructure
  • Sohan Domingo (Tait Communications): Expert on hardware and network infrastructure

Now enjoy the discussion and feel free to share any thoughts or feedback:

Kafka in the Energy Sector including Oil and Gas, Mining, Smart Grids

An example architecture for hybrid event streaming in the oil and gas industry can look like the following:

Data in Motion for Energy Production - Upstream Midstream Downstream - at the Edge with Kafka in Oil and Gas and Mining

 

If you want to learn more about event streaming with Apache Kafka in the energy industry (including oil and gas, mining, and smart grids), check out the related blog post on this site.

Notes about the Kafka, Edge, Oil, and Gas, Mining Conversation

If you prefer reading or just listening to a few of the sections, here are some notes about the flow of the panel discussion:

0:00 – 4:20 – Introduction to Tait Communications
4:45 – 7:20 – Introduction to Confluent and a high-level definition of edge & IoT
7:30 – 10:10 – Voice communication discussion about connectivity and the importance of timely context through data so the right response can be determined sooner. No matter where people are or what they are doing, communications at the edge suit the needs of a modern workforce.
10:15 – 12:10 – ML/AI at the edge. Continuous monitoring of all infrastructure and sensors for safety purposes. Event streaming helps send alerts in real-time and also supports post-event analysis. There is a process to get into AI: infrastructure first, then pipelines, then AI, not the other way around.
12:15 – 14:42 – 5G can't solve all problems: security, privacy, and compliance considerations as to where to process the data, and beyond this, cost is also a factor. Considerations for cloud and on-premise.
14:50 – 16:03 – 5G discussion. There are real-world limitations like cell towers. You also need contextual awareness at the edge and decision-making there (local awareness), e.g., gas detection on a vehicle that is disconnected from the backend.
16:15 – 20:10 – Manufacturing & supply chain, radios & communications today, and what is possible in the future. IoT at the edge enables manufacturing optimizations with low latency requirements where cloud-first does not make sense. On the flip side, if it is not safety-critical, or for things like an ERP system, this can be pushed into the cloud.
20:10 – 23:35 – The mining side of things: lacking connectivity and a preference for edge-based usage. Autonomous trucks make decisions at the edge rather than accepting delays of even milliseconds by going to the cloud. Doing it locally at the edge is more efficient in some cases. All sensors on the trucks, temperatures, etc. are collected even whilst disconnected; once the connection is re-established at the base, that data can be uploaded. "Last mile" analytics. Confluent is IT, not OT: we integrate with IT systems, but the OT world is separate.
23:38 – 26:25 – Digital mobile radios and voice communications; with autonomous trucks, you don't have that. This is where Tait's Unified Vehicle comes in: a combination of Digital Mobile Radio (DMR) and LTE, where intelligent algorithms help with failover from DMR to LTE if there are connectivity issues. Voice is still important despite the amount of technology in use and all the data exploration.
27:03 – 31:15 – Where to start with data exploration: start with your requirements. Does solving the problem really need computing at the edge, or can the cloud work? Event streaming at the edge where it makes sense. How customers get started: solve simple use cases first before the more advanced ones (building the foundations, data pipelines, simple rules, and tests). Collaboration with the AWS Wavelength team; the edge makes sense with low latency and security requirements.
31:15 – 32:54 – You need to consider your bandwidth and latency as to whether edge computing makes sense. Driverless cars.
33:15 – 37:49 – Where to go from here with existing customers: how they upgrade, what customers come to Tait for, and the use of video as part of all this for public safety. Health & safety, monitoring driver alertness in NZ. Truck performance, driver performance, and when to take a break. That decision needs to be made as a combination of edge and cloud.
37:50 – 40:55 – Connected vehicles and cars: it is not as hard as it looks. Gas stations with edge computing, loyalty systems, etc., and the importance of after-sales for connected vehicles. GDPR and compliance by aggregating data, as some countries have strict privacy requirements.
41:00 – 44:10 – Joint project with Tait in the law enforcement space. Voice to text, use of metadata, and combining voice + video with event streaming.

Kafka for Next-Generation Edge Computing

The energy industry, including oil & gas and mining, is super interesting from a technical perspective. It requires edge and cloud computing. The upstream, midstream, and downstream supply chain is complex and safety-critical. Processing data in motion with Apache Kafka leveraging various network infrastructures is a great opportunity to innovate and reduce costs across various use cases.

Do you already leverage Apache Kafka for processing data in motion in the oil and gas, mining, or any other industry? What does your (future) edge or hybrid architecture look like? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

 

The post Panel Discussion about Kafka, Edge, Networking and 5G in Oil and Gas and Mining Industry appeared first on Kai Waehner.

]]>
Apache Kafka and MQTT (Part 5 of 5) – Smart City and 5G https://www.kai-waehner.de/blog/2021/03/29/apache-kafka-mqtt-part-5-of-5-smart-city-government-citizen-telco-5g/ Mon, 29 Mar 2021 07:10:02 +0000 https://www.kai-waehner.de/?p=3288 Apache Kafka and MQTT are a perfect combination for many IoT use cases. This blog series covers the pros and cons of both technologies. Various use cases across industries, including connected vehicles, manufacturing, mobility services, and smart city are explored. The examples use different architectures, including lightweight edge scenarios, hybrid integrations, and serverless cloud solutions. This post is part five: Smart City and 5G.

The post Apache Kafka and MQTT (Part 5 of 5) – Smart City and 5G appeared first on Kai Waehner.

]]>
Apache Kafka and MQTT are a perfect combination for many IoT use cases. This blog series covers the pros and cons of both technologies. Various use cases across industries, including connected vehicles, manufacturing, mobility services, and smart city are explored. The examples use different architectures, including lightweight edge scenarios, hybrid integrations, and serverless cloud solutions. This post is part five: Smart City and 5G.

MQTT and Kafka for Smart City and 5G Architectures

Apache Kafka + MQTT Blog Series

The first blog post explores the relation between MQTT and Apache Kafka. Afterward, the other four blog posts discuss various use cases, architectures, and reference deployments.

  • Part 1 – Overview: Relation between Kafka and MQTT, pros and cons, architectures
  • Part 2 – Connected Vehicles: MQTT and Kafka in a private cloud on Kubernetes; use case: remote control and command of a car
  • Part 3 – Manufacturing: MQTT and Kafka at the edge in a smart factory; use case: Bidirectional OT-IT integration with Sparkplug between PLCs, IoT Gateways, Data Historian, MES, ERP, Data Lake, etc.
  • Part 4 – Mobility Services: MQTT and Kafka leveraging serverless cloud infrastructure; use case: Traffic jam prediction service using machine learning
  • Part 5 – Smart City (THIS POST): MQTT at the edge connected to fully-managed Kafka in the public cloud; use case: Intelligent traffic routing by combining and correlating 3rd party services

Subscribe to my newsletter to get updates immediately after the publication. Besides, I will also update the above list with direct links to this blog series’s posts as soon as published.

Use Case: Smart City and 5G

A smart city is an urban area that uses different types of electronic Internet of Things (IoT) sensors to collect data and then uses the insights gained from that data to manage assets, resources, and services efficiently.

A smart city provides many benefits for citizens and city management. Some of the goals are:

  • Improved Pedestrian Safety
  • Improved Vehicle Safety
  • Proactively Engaged First Responders
  • Reduced Traffic Congestion
  • Connected / Autonomous Vehicles
  • Improved Customer Experience
  • Automated Business Processes

I covered the use cases in more detail in the post “Event Streaming with Kafka as Foundation for a Smart City“. For a specific 5G example, check out “Building a Smart Factory with Apache Kafka and 5G Campus Networks“.

Let’s now explore the relation of Kafka and MQTT for smart city use cases.

Architecture: MQTT and Kafka for a Smart City

The following architecture shows an infrastructure deployed at a stadium:

MQTT and Kafka for Smart City and 5G Use Cases

In this example, both MQTT and Kafka are deployed close to the stadium. For instance, AWS Wavelength is an innovative infrastructure option to build low latency 5G use cases. The connected “regular AWS cloud region” is still used for use cases that do not require low latency.

The combination of Kafka and MQTT enables connectivity and real-time data processing for various use cases:

  • Parking information and smart navigation.
  • Location-based shopping and restaurant experiences, including innovative scenarios such as monitoring of queues and geofencing.
  • Integration of loyalty platforms to earn rewards and points.
  • Live information about the game or concert.
  • Lottery drawing experiences while watching a sports game.

The possibilities are endless. Integration with 1st and 3rd party applications will create completely new opportunities to improve the customer experience, increase safety, and improve operational efficiency.
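
One common way to wire up such an MQTT-to-Kafka integration is Kafka Connect with an MQTT source connector. The following configuration sketch assumes Confluent's MQTT source connector; the broker URI, MQTT topic filters, and Kafka topic are hypothetical stadium examples:

```json
{
  "name": "stadium-mqtt-source",
  "config": {
    "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
    "tasks.max": "1",
    "mqtt.server.uri": "tcp://stadium-mqtt-broker:1883",
    "mqtt.topics": "stadium/parking/+,stadium/queues/+",
    "kafka.topic": "stadium-sensor-events"
  }
}
```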

The stadium example is a particular scenario to explore the added value of processing data in motion. Let’s take a look at other real-world examples that leverage MQTT and Kafka.

Example: Cloud-based Traffic Control Systems @Berlex

The Swedish company Berlex designs and manufactures innovative solutions to improve traffic safety.

Berlex provides cloud-based portable traffic signals. Their innovative R6 traffic signal is one of the first mobile traffic signals controlled by a cloud-based service. Berlex’s connected solution allows customers to monitor the new traffic signals on a smartphone, computer, or tablet anytime and from anywhere. MQTT enables real-time information delivery and constant monitoring.

The cloud-based service reduces the time that their customers need to spend in dangerous traffic work zones. The system enables customers to carry out numerous tasks such as checking the battery status of a traffic signal or performing an inspection remotely, with no need for risky and time-consuming on-site intervention.

Each portable R6 traffic signal is equipped with a radar that allows the signal to see traffic. Sensors within the signals publish detailed information on the current status of the signal as MQTT data. The Berlex Connect cloud service captures the continuous stream of MQTT data from each signal and shares the information with the appropriate subscribers.

To prevent interruption of the traffic signal operation, high availability is essential for the system. Berlex customers monitor the real-time information on individual portals with customized user roles that fit their specific use case.

Read the complete case study from HiveMQ for more details about this successful smart city project.

Example: The Life of Citizens as a Stream of Events @ NAV

NAV (the Norwegian Work and Welfare Department) currently distributes more than one-third of the national budget to citizens in Norway or abroad. NAV assists people through all phases of life within work, family, health, retirement, and social security. Events happening throughout a person’s life determine which services NAV provides to them, how it provides them, and when it provides them.

In most countries, each person has to apply for these services, resulting in many tasks handled manually by various caseworkers in the organization. Their access to insight and useful information is limited and often hard to find, causing frustration to both caseworkers and users. By streaming a person’s life events through its Kafka pipelines, NAV revolutionized the way users experience government services and the way its employees work:

NAV (Norwegian Work and Welfare Department)- Life is a Stream of Events with Kafka

NAV and the government as a whole have access to vast amounts of data about citizens, reported by health institutions, employers, various government agencies, or the users themselves. Some data is distributed in large batches, while other data is available on-demand through APIs. The data is ingested into streams using Kafka, the Kafka Streams API, and Java microservices. NAV distributes and acts on events about birth, death, relationships, employment, income, and business processes to vastly improve the user experience, provide real-time insight, and reduce the need to apply for services the government already knows are needed.
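
To illustrate the pattern, here is a minimal Kafka Streams sketch that routes life events by type so downstream services can react, for example, to a birth without scanning all events. The topic names and the naive JSON string matching are simplifying assumptions, not NAV's actual implementation:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Branched;
import org.apache.kafka.streams.kstream.KStream;

public class LifeEventRouter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "life-event-router");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Hypothetical topic: key = citizen ID, value = JSON life event with a "type" field
        KStream<String, String> lifeEvents = builder.stream("life-events");

        lifeEvents.split()
                  .branch((citizenId, json) -> json.contains("\"type\":\"birth\""),
                          Branched.withConsumer(ks -> ks.to("birth-events")))
                  .branch((citizenId, json) -> json.contains("\"type\":\"employment\""),
                          Branched.withConsumer(ks -> ks.to("employment-events")))
                  .defaultBranch(Branched.withConsumer(ks -> ks.to("other-life-events")));

        new KafkaStreams(builder.build(), props).start();
    }
}
```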

NAV chose Confluent Platform to get valuable insights from life and business events. Security is a key concern. Compliance with GDPR is essential for the success of this project.

You can find more details about NAV’s Kafka usage in their Kafka Summit presentation.

Kafka + MQTT = Smart City

In conclusion, Apache Kafka and MQTT are a perfect combination for smart city and 5G use cases. Follow the blog series to learn about use cases such as connected vehicles, manufacturing, mobility services, and smart city. Every blog post also includes real-world deployments from companies across industries. It is key to understand the different architectural options to make the right choice for your project.

What are your experiences and plans in IoT projects? What use case and architecture did you implement? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka and MQTT (Part 5 of 5) – Smart City and 5G appeared first on Kai Waehner.

]]>