Logistics Archives - Kai Waehner
https://www.kai-waehner.de/blog/category/logistics/

How Penske Logistics Transforms Fleet Intelligence with Data Streaming and AI
https://www.kai-waehner.de/blog/2025/06/02/how-penske-logistics-transforms-fleet-intelligence-with-data-streaming-and-ai/ (June 2, 2025)

Real-time visibility has become essential in logistics. As supply chains grow more complex, providers must shift from delayed, batch-based systems to event-driven architectures. Data streaming technologies like Apache Kafka and Apache Flink enable this shift by allowing continuous processing of data from telematics, inventory systems, and customer interactions. Penske Logistics is leading the way—using Confluent’s platform to stream and process 190 million IoT messages daily. This powers predictive maintenance, faster roadside assistance, and higher fleet uptime. The result: smarter operations, improved service, and a scalable foundation for the future of logistics.

Real-time visibility is no longer a competitive advantage in logistics—it’s a business necessity. As global supply chains become more complex and customer expectations rise, logistics providers must respond with agility and precision. That means shifting away from static, delayed data pipelines toward event-driven architectures built around real-time data.

Technologies like Apache Kafka and Apache Flink are at the heart of this transformation. They allow logistics companies to capture, process, and act on streaming data as it’s generated—from vehicle sensors and telematics systems to inventory platforms and customer applications. This enables new use cases in predictive maintenance, live fleet tracking, customer service automation, and much more.

A growing number of companies across the supply chain are embracing this model. Whether it’s real-time shipment tracking, automated compliance reporting, or AI-driven optimization, the ability to stream, process, and route data instantly is proving vital.

One standout example is Penske Logistics—a transportation leader using Confluent’s data streaming platform (DSP) to transform how it operates and delivers value to customers.

How Penske Logistics Transforms Fleet Intelligence with Kafka and AI

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases.

Why Real-Time Data Matters in Logistics and Transportation

Transportation and logistics operate on tighter margins and stricter timelines than almost any other sector. Delays ripple through supply chains, disrupting manufacturing schedules, customer deliveries, and retail inventories. Traditional data integration methods—batch ETL, manual syncing, and siloed systems—simply can’t meet the demands of today’s global logistics networks.

Data streaming enables organizations in the logistics and transportation industry to ingest and process information in real time, while the data is still valuable and actionable. Vehicle diagnostics, route updates, inventory changes, and customer interactions can all be captured and acted upon in real time. This leads to faster decisions, more responsive services, and smarter operations.

Real-time data also lays the foundation for advanced use cases in automation and AI, where outcomes depend on immediate context and up-to-date information. And for logistics providers, it unlocks a powerful competitive edge.

Apache Kafka serves as the backbone for real-time messaging—connecting thousands of data producers and consumers across enterprise systems. Apache Flink adds stateful stream processing to the mix, enabling continuous pattern recognition, enrichment, and complex business logic in real time.
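
To make this concrete, here is a minimal Apache Flink job sketch that consumes telemetry events from a Kafka topic and filters for diagnostic trouble codes. The broker address, topic names, and the simple JSON heuristic are illustrative assumptions, not any vendor’s actual implementation:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TelemetryPipeline {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Consume raw telemetry events from Kafka (topic and brokers are hypothetical)
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("vehicle-telemetry")
                .setGroupId("fleet-intelligence")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "telemetry")
           // Simplified rule: keep only events carrying a diagnostic trouble code
           .filter(event -> event.contains("\"dtc\""))
           .print(); // stand-in for a real sink, e.g., an alerts topic

        env.execute("telemetry-pipeline");
    }
}
```

In a real deployment, the stateless filter would be replaced by stateful pattern recognition and enrichment against reference data, which is exactly where Flink’s state management shines.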

Event-driven Architecture with Data Streaming in Logistics and Transportation using Apache Kafka and Flink

In the logistics industry, this event-driven architecture supports use cases such as:

  • Continuous monitoring of vehicle health and sensor data
  • Proactive maintenance scheduling
  • Real-time fleet tracking and route optimization
  • Integration of telematics, ERP, WMS, and customer systems
  • Instant alerts for service delays or disruptions
  • Predictive analytics for capacity and demand forecasting

This isn’t just theory. Leading logistics organizations are deploying these capabilities at scale.

Data Streaming Success Stories Across the Logistics and Transportation Industry

Many transportation and logistics firms are already using Kafka-based architectures to modernize their operations. A few examples:

  • LKW Walter relies on data streaming to optimize its full truckload (FTL) freight exchanges and enable digital freight matching.
  • Uber Freight leverages real-time telematics, pricing models, and dynamic load assignment across its digital logistics platform.
  • Instacart uses event-driven systems to coordinate live order delivery, matching customer demand with available delivery slots.
  • Maersk incorporates streaming data from containers and ports to enhance shipping visibility and supply chain planning.

These examples show the diversity of value that real-time data brings—across first mile, middle mile, and last mile operations.

An increasing number of companies are using data streaming as the event-driven control tower for their supply chains. It’s not only about real-time insights—it’s also about ensuring consistent data across real-time messaging, HTTP APIs, and batch systems. Learn more in this article: A Real-Time Supply Chain Control Tower powered by Kafka.

Supply Chain Control Tower powered by Data Streaming with Apache Kafka

Penske Logistics: A Leader in Transportation, Fleet Services, and Supply Chain Innovation

Penske Transportation Solutions is one of North America’s most recognizable logistics brands. It provides commercial truck leasing, rental, and fleet maintenance services, operating a fleet of over 400,000 vehicles. Its logistics arm offers freight management, supply chain optimization, and warehousing for enterprise customers.

Penske Logistics
Source: Penske Logistics

But Penske is more than a fleet and logistics company. It’s a data-driven operation where technology plays a central role in service delivery. From vehicle telematics to customer support, Penske is leveraging data streaming and AI to meet growing demands for reliability, transparency, and speed.

Penske’s Data Streaming Success Story

Penske shared its data streaming journey at the Confluent Data in Motion Tour. Sarvant Singh, Vice President of Data and Emerging Solutions at Penske, explains the company’s motivation clearly: “We’re an information-intense business. A lot of information is getting exchanged between our customers, associates, and partners. In our business, vehicle uptime and supply chain visibility are critical.”

This focus on uptime is what drove Penske to adopt a real-time data streaming platform, powered by Confluent. Today, Penske ingests and processes around 190 million IoT messages every day from its vehicles.

Each truck contains hundreds of sensors (and thousands of sub-sensors) that monitor everything from engine performance to braking systems. With this volume of data, traditional architectures fell short. Penske turned to Confluent Cloud to leverage Apache Kafka at scale as a fully-managed, elastic SaaS, eliminating the operational burden and unlocking true real-time capabilities.

By streaming sensor data through Confluent and into a proactive diagnostics engine, Penske can now predict when a vehicle may fail—before the problem arises. Maintenance can be scheduled in advance, roadside breakdowns avoided, and customer deliveries kept on track.
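
Penske’s diagnostics engine itself is proprietary, but the underlying pattern can be sketched with Kafka Streams: continuously evaluate sensor readings against a rule and emit maintenance alerts to a downstream topic. The topic names and the temperature threshold below are hypothetical:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class DiagnosticsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "proactive-diagnostics");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.Double().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Hypothetical topic of engine temperature readings, keyed by vehicle ID
        KStream<String, Double> engineTemp = builder.stream("engine-temperature");
        engineTemp.filter((vehicleId, celsius) -> celsius != null && celsius > 110.0)
                  .mapValues(celsius -> "Schedule maintenance: engine running hot at " + celsius + " C")
                  .to("maintenance-alerts", Produced.with(Serdes.String(), Serdes.String()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```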

This approach has already prevented over 90,000 potential roadside incidents. The business impact is enormous, saving time, money, and reputation.

Other real-time use cases include:

  • Diagnosing issues instantly to dispatch roadside assistance faster
  • Triggering preventive maintenance alerts to avoid unscheduled downtime
  • Automating compliance for IFTA reporting using telematics data
  • Streamlining repair workflows through integration with electronic DVIRs (Driver Vehicle Inspection Reports)

Why Confluent for Apache Kafka?

Managing Kafka in-house was never the goal for Penske. After initially working with a different provider, they transitioned to Confluent Cloud to avoid the complexity and cost of maintaining open-source Kafka themselves.

“We’re not going to put mission-critical applications on an open source tech,” Singh noted. “Enterprise-grade applications require enterprise level support—and Confluent’s business value has been clear.”

Key reasons for choosing Confluent include:

  • The ability to scale rapidly without manual rebalancing
  • Enterprise tooling, including stream governance and connectors
  • Seamless integration with AI and analytics engines
  • Reduced time to market and improved uptime

Data Streaming and AI in Action at Penske

Penske’s investment in AI began in 2015, long before it became a mainstream trend. Early use cases included Erica, a virtual assistant that helps customers manage vehicle reservations. Today, AI is being used to reduce repair times, predict failures, and improve customer service experiences.

By combining real-time data with machine learning, Penske can offer more reliable services and automate decisions that previously required human intervention. AI-enabled diagnostics, proactive maintenance, and conversational assistants are already delivering measurable benefits.

The company is also exploring the role of generative AI. Singh highlighted the potential of technologies like ChatGPT for enterprise applications—but also stressed the importance of controls: “Configuration for risk tolerance is going to be the key. Traceability, explainability, and anomaly detection must be built in.”

Fleet Intelligence in Action: Measurable Business Value Through Data Streaming

For a company operating hundreds of thousands of vehicles, the stakes are high. Penske’s real-time architecture has improved uptime, accelerated response times, and empowered technicians and drivers with better tools.

The business outcomes are clear:

  • Fewer breakdowns and delays
  • Faster resolution of vehicle issues
  • Streamlined operations and reporting
  • Better customer and driver experience
  • Scalable infrastructure for new services, including electric vehicle fleets

With 165,000 vehicles already connected to Confluent and more being added as EV adoption grows, Penske is just getting started.

The Road Ahead: Agentic AI and the Next Evolution of Event-Driven Architecture Powered By Apache Kafka

The future of logistics will be defined by intelligent, real-time systems that coordinate not just vehicles, but entire networks. As Penske scales its edge computing and expands its use of remote sensing and autonomous technologies, the role of data streaming will only increase.

Agentic AI—systems that act autonomously based on real-time context—will require seamless integration of telematics, edge analytics, and cloud intelligence. This demands a resilient, flexible event-driven foundation. I explored the general idea in a dedicated article: How Apache Kafka and Flink Power Event-Driven Agentic AI in Real Time.

Agentic AI with Apache Kafka as Event Broker Combined with MCP and A2A Protocol

Penske’s journey shows that real-time data streaming is not only possible—it’s practical, scalable, and deeply transformative. The combination of a data streaming platform, sensor analytics, and AI allows the company to turn every vehicle into a smart, connected node in a global supply chain.

For logistics providers seeking to modernize, the path is clear. It starts with streaming data—and the possibilities grow from there. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases.

Shift Left Architecture at Siemens: Real-Time Innovation in Manufacturing and Logistics with Data Streaming
https://www.kai-waehner.de/blog/2025/04/11/shift-left-architecture-at-siemens-real-time-innovation-in-manufacturing-and-logistics-with-data-streaming/ (April 11, 2025)

Industrial enterprises face increasing pressure to move faster, automate more, and adapt to constant change—without compromising reliability. Siemens Digital Industries addresses this challenge by combining real-time data streaming, modular design, and Shift Left principles to modernize manufacturing and logistics. This blog outlines how technologies like Apache Kafka, Apache Flink, and Confluent Cloud support scalable, event-driven architectures. A real-world example from Siemens’ Modular Intralogistics Platform illustrates how this approach improves data quality, system responsiveness, and operational agility.

Industrial enterprises are under pressure to modernize. They need to move faster, automate more, and adapt to constant change—without sacrificing reliability or control. Siemens Digital Industries is meeting this challenge head-on by combining software, edge computing, and cloud-native technologies into a new architecture. This blog explores how Siemens is using data streaming, modular design, and Shift Left thinking to enable real-time decision-making, improve data quality, and unlock scalable, reusable data products across manufacturing and logistics operations. A real-world example spanning industrial IoT, intralogistics, and shop floor manufacturing illustrates the architecture and highlights the business value behind this transformation.

Shift Left Architecture at Siemens with Stream Processing using Apache Kafka and Flink

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including customer stories across all industries.

The Data Streaming Use Case Show: Episode #1 – Manufacturing and Automotive

These Siemens success stories are part of The Data Streaming Use Case Show, a new industry webinar series hosted by me.

In the first episode, we focus on the manufacturing and automotive industries. It features:

  • Experts from Siemens Digital Industries and Siemens Healthineers
  • The founder of ‘IoT Use Case’, a content and community platform focused on real-world industrial IoT applications
  • Deep insights into how industrial companies combine OT, IT, cloud, and data streaming with the Shift Left architecture

The Data Streaming Industry Use Case Show by Confluent with Host Kai Waehner

The series explores real-world solutions across industries, showing how leaders turn data into action through open architectures and real-time platforms.

Siemens Digital Industries: Company and Vision

Siemens Digital Industries is the technology and software arm of Siemens AG, focused on advancing industrial automation and digitalization. It empowers manufacturers and machine builders to become more agile, efficient, and resilient through intelligent software and integrated systems.

Its business model bridges the physical and digital worlds—combining operational technology (OT) with modern information technology (IT). From programmable logic controllers to industrial IoT, Siemens delivers end-to-end solutions across industries.

Today, the company is transforming itself into a software- and cloud-driven organization, focusing strongly on edge computing, real-time analytics, and data streaming as key enablers of modern manufacturing.

With edge and cloud working in harmony, Siemens helps industrial enterprises break up monoliths and evolve toward modular, flexible architectures. These software-driven approaches make plants and factories more adaptive, intelligent, and autonomous.

Data Streaming at Industrial Companies

In industrial settings, data is continuously generated by machines, production systems, robots, and logistics processes. But traditional batch-oriented IT systems are not designed to handle this in real time.

To make smarter, faster decisions, companies need to process data as it is generated. That’s where data streaming comes in.

Apache Kafka and Apache Flink enable event-driven architectures. These allow industrial data to flow in real time, from edge to cloud, across hybrid environments.

Event-driven Architecture with Data Streaming using Kafka and Flink in Industrial IoT and Manufacturing

Check out my other blogs about use cases and architecture for manufacturing and Industrial IoT powered by data streaming.

Edge and Hybrid Cloud as a Standard

Modern industrial use cases are increasingly hybrid by design. Machines and controllers produce data at the edge. Decisions must be made close to the source. However, cloud platforms offer powerful compute and AI capabilities.

Industrial IoT Data Streaming Everywhere Edge Hybrid Cloud with Apache Kafka and Flink

Siemens leverages edge devices to capture and preprocess data on-site. Data streaming with Confluent provides Siemens a real-time backbone for integrating this data with cloud-based systems, including Snowflake, SAP, Salesforce, and others.

This hybrid architecture supports low latency, high availability, and full control over data processing and analytics workflows.

The Shift Left Architecture for Industrial IoT

In many industrial architectures, Kafka has traditionally been used to ingest data into analytics platforms like Snowflake or Databricks. Processing, transformation, and enrichment happened late in the data pipeline.

ETL and ELT Data Integration to Data Lake Warehouse Lakehouse in Batch

But Siemens is shifting that model.

The Shift Left Architecture moves processing closer to the source, directly into the streaming layer. Instead of waiting to transform data in a data warehouse, Siemens now applies stream processing in real time, using Confluent Cloud and Kafka topics.
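
Conceptually, shifting left means running validation and transformation as a continuous streaming job between Kafka topics instead of inside the warehouse. A hedged sketch using Flink’s Table API (topic names, fields, and the filter rule are purely illustrative):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ShiftLeftJob {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Source table over the raw Kafka topic (names and fields are assumptions)
        tEnv.executeSql(
            "CREATE TABLE raw_orders (order_id STRING, qty INT, status STRING) " +
            "WITH ('connector' = 'kafka', 'topic' = 'raw-orders', " +
            "'properties.bootstrap.servers' = 'localhost:9092', " +
            "'scan.startup.mode' = 'latest-offset', 'format' = 'json')");

        // Sink table for the curated, ready-to-use data product
        tEnv.executeSql(
            "CREATE TABLE clean_orders (order_id STRING, qty INT) " +
            "WITH ('connector' = 'kafka', 'topic' = 'clean-orders', " +
            "'properties.bootstrap.servers' = 'localhost:9092', 'format' = 'json')");

        // Validate and filter *before* the data ever lands in the warehouse
        tEnv.executeSql(
            "INSERT INTO clean_orders SELECT order_id, qty FROM raw_orders " +
            "WHERE status = 'CONFIRMED' AND qty > 0");
    }
}
```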

Shift Left Architecture with Data Streaming into Data Lake Warehouse Lakehouse

This shift enables faster decision-making, better data quality, and broader reuse of high-quality data across both analytical and operational systems.

For a deeper look at how Shift Left is transforming industrial architectures, read the full article about the Shift Left Architecture with Data Streaming.

Siemens Data Streaming Success Story: Modular Intralogistics Platform

A key example of this new architecture is Siemens’ Modular Intralogistics Platform, used in manufacturing plants for material handling and supply chain optimization. I explored the shift left architecture in our data streaming use case show with Stefan Baer, Senior Key Expert – Data Streaming at Siemens IT.

Traditionally, intralogistics systems were tightly coupled, with rigid integrations between:

  • Enterprise Resource Planning (ERP): Order management, master data
  • Manufacturing Operations Management (MOM): Production scheduling, quality, maintenance
  • Warehouse Execution System (EWM): Inventory, picking, warehouse automation
  • Execution Management System (eMS): Transport control, automated guided vehicle (AGV) orchestration, conveyor logic

The new approach breaks this down into packaged business capabilities—each one modular, orchestrated, and connected through Confluent Cloud.

Key benefits:

  • Real-time orchestration of logistics operations
  • Automated material delivery—no manual reordering required
  • ERP and MOM systems integrated flexibly via Kafka
  • High adaptability through modular components
  • GenAI used for package station load optimization

Stream processing with Apache Flink transforms events in motion. For example, when a production order changes or material shortages occur, the system reacts instantly—adjusting delivery routes, triggering alerts, or rebalancing station loads using AI.

Architecture: Data Products + Shift Left

At the heart of the solution is a combination of data products and stream processing:

  • Kafka Topics serve as real-time interfaces and persistence layer between business domains (see the topic-creation sketch after this list).
  • Confluent Cloud hosts the event streaming infrastructure as a fully-managed service with low latency, elasticity, and critical SLAs.
  • Stream processing with serverless Flink logic enriches and transforms data in motion.
  • Snowflake receives curated, ready-to-use data for analytics.
  • Other operational and analytical downstream consumers—such as GenAI modules or shop floor dashboards—access the same consistent data in real time.
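
A data product’s contract starts with how its Kafka topic is provisioned. The following AdminClient sketch creates a compacted topic so that consumers always see the latest state per key; the topic name, partition count, and replication factor are assumptions for illustration:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class DataProductTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Compacted topic: Kafka retains the latest event per key (e.g., per transport order)
            NewTopic topic = new NewTopic("intralogistics.transport-orders", 6, (short) 3)
                    .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```
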
Siemens Digital Industries - Modular Intralogistics Platform 
Source: Siemens Digital Industries

This reuse of data products ensures consistent semantics, reduces duplication, and simplifies governance.

By processing data earlier in the pipeline, Siemens improves both data quality and system responsiveness. This model replaces brittle, point-to-point integrations with a more sustainable, scalable platform architecture.

Siemens Shift Left Architecture and Data Products with Data Streaming using Apache Kafka and Flink
Source: Siemens Digital Industries

Business Value of Data Streaming and Shift Left at Siemens Digital Industries

The combination of real-time data streaming, modular data products, and Shift Left design principles unlocks significant value:

  • Faster response to dynamic events in production and logistics
  • Improved operational resilience and agility
  • Higher quality data for both analytics and AI
  • Reuse across multiple consumers (analytics, operations, automation)
  • Lower integration costs and easier scaling

This approach is not just technically superior—it supports measurable business outcomes like shorter lead times, lower stock levels, and increased manufacturing throughput.

Siemens Healthineers: Shift Left with IoT, Data Streaming, AI/ML, Confluent and Snowflake in Manufacturing and Healthcare

In a recent blog post, I explored how Siemens Healthineers uses Apache Kafka and Flink to transform both manufacturing and healthcare with a wide range of data streaming use cases. From predictive maintenance to real-time logistics, their approach is a textbook example of how to modernize complex environments with an event-driven architecture and data streaming, even if they don’t explicitly label it “shift left.”

Siemens Healthineers Data Cloud Technology Stack with Apache Kafka and Snowflake
Source: Siemens Healthineers

Their architecture enables proactive decision-making by pushing real-time insights and automation earlier in the process. Examples include telemetry streaming from medical devices, machine integration with SAP and KUKA robots, and logistics event streaming from SAP for faster packaging and delivery. Each use case shows how real-time data—combined with cloud-native platforms like Confluent and Snowflake—improves efficiency, reliability, and responsiveness.

Just like the intralogistics example from Siemens Digital Industries, Healthineers applies shift-left thinking by enabling teams to act on data sooner, reduce latency, and prevent costly delays. This approach enhances not only operational workflows but also outcomes that matter, like patient care and regulatory compliance.

This is shift left in action: embedding intelligence and quality controls early, where they have the greatest impact.

Rethinking Industrial Data Architectures with Data Streaming and Shift Left Architecture

Siemens Digital Industries is demonstrating what’s possible when you rethink the data architecture beyond just analytics in a data lake.

With data streaming leveraging Confluent Cloud, data products for modular software, and a Shift Left approach, Siemens is transforming traditional factories into intelligent, event-driven operations. A data streaming platform based on Apache Kafka is no longer just an ingestion layer. It is a central nervous system for real-time processing and decision-making.

This is not about chasing trends. It’s about building resilient, scalable, and future-proof industrial systems. And it’s just the beginning.

To learn more, watch the on-demand industry use case show with Siemens Digital Industries and Siemens Healthineers or connect with us to explore what data streaming can do for your organization.

Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter. And download my free book about data streaming use cases.

Real-Time Logistics, Shipping, and Transportation with Apache Kafka
https://www.kai-waehner.de/blog/2022/09/29/real-time-logistics-shipping-transportation-with-apache-kafka/ (September 29, 2022)

Logistics, shipping, and transportation require real-time information to build efficient applications and innovative business models. Data streaming enables correlated decisions, recommendations, and alerts. Kafka is everywhere across the industry. This blog post explores several real-world case studies from companies such as USPS, Swiss Post, Austrian Post, DHL, and Hermes. Use cases include cloud-native middleware modernization, track and trace, and predictive routing and ETA planning.

Real Time Logistics Transportation Shipping with Apache Kafka

Logistics and transportation

Logistics is the detailed organization and implementation of a complex operation. It manages the flow of things between the point of origin and the point of consumption to meet the requirements of customers or corporations. The resources managed in logistics may include tangible goods such as materials, equipment, and supplies, as well as food and other consumable items.

Logistics management is the part of supply chain management (SCM) and supply chain engineering that plans, implements, and controls the efficient, effective forward and reverse flow and storage of goods, services, and related information between the point of origin and the point of consumption to meet customers’ requirements.

The evolution of logistics technology

Unity created an excellent overview of the future of logistics and transportation:

Unity - Logistics Technology for Industry 4.0

The diagram shows the critical technical characteristics for innovation: Digitalization, automation, connectivity, and real-time data are must-haves for optimizing logistics and transportation infrastructure.

Data streaming with Apache Kafka in the shipping industry

Real-time data is relevant everywhere in logistics and transportation. Apache Kafka is the de facto standard for real-time data streaming. Kafka works well almost everywhere. Here is an example of enterprise architecture for transporting goods across the globe:

Apache Kafka in the Shipping Industry for Marine, Oil Transport, Vessel Fleet, Shipping Line, Drones

Most companies have a cloud-first strategy. Kafka in the cloud as a fully-managed service enables project teams to focus on building applications and scale elastically depending on their needs. Use cases like big data analytics or a real-time supply chain control tower often run in the cloud today.

On-premise Kafka deployments connect to existing IT infrastructure such as Oracle databases, SAP ERP systems, and other monolithic, often decades-old technology.

The edge either directly connects to the data center or cloud (if the network connection is relatively stable), or operates its own mission-critical edge Kafka cluster (e.g., on a ship) or a single broker (e.g., embedded into a drone) in a semi-connected or air-gapped environment.

Case studies for real-time transportation, shipping, and logistics with Apache Kafka

The following shows several real-world deployments of the logistics, shipping, and transportation industry for real-time data streaming with the broader Kafka ecosystem.

Swiss Post: Decentralized integration using data as an asset across the shipping pipeline

Swiss Post is the national postal service of Switzerland. Data streaming is a fundamental shift in their enterprise architecture. Swiss Post had several motivations:

  • Data as an asset: Management and accessibility of strategic company data
  • New requirements regarding the amount of event throughput (new parcel center, IoT, etc.)
  • Integration is not dependent on a central development team (self-service)
  • Empowering organization and integration skill development
  • Growing demand for real-time event processing (Event-driven architecture)
  • Providing a flexible integration technology stack (no one-size-fits-all)

The Kafka-based integration layer processes small events and large legacy files and images.

The evolution of integration at Swiss Post
Source: Swiss Post

The shift from ETL/ESB integration middleware to event-based and scalable Kafka is an approach many companies use nowadays:

Shift from ETL / ESB to Kafka as Integration Middleware
Source: Swiss Post

DHL: Parcel and letter express service with cloud-native middleware

The German logistics company DHL is a subsidiary of Deutsche Post AG. DHL Express is the market leader for parcel services in Europe.

Like Swiss Post, DHL modernized its integration architecture with data streaming. They complement MQ and ESB with data streaming powered by Kafka and Confluent. Check out the comparison between message queue systems and Apache Kafka to understand why adding Kafka is sometimes a better approach than trying to replace MQ with Kafka outright.

Here is the target future hybrid enterprise architecture of DHL with IBM MQ, Apache Kafka, and Spring Boot applications:

DHL Express integration architecture with IBM MQ, Apache Kafka, and Spring Boot
Source: DHL

This is a very common approach to modernizing middleware infrastructure. Here, the on-premise middleware based on IBM MQ and Oracle WebLogic struggles with the scale, even though we are “only” talking about a few thousand messages per second.

A few more notes about DHL’s middleware migration journey:

  • Migration to a cloud-native Kubernetes Microservices infrastructure
  • Migration to Azure Cloud planned with Cluster Linking
  • Mid-term: Replacement of the legacy ESB.

An interesting side note: DHL processes relatively large messages (70 KB) with Kafka, resulting in hundreds of MB/sec.

Austrian Post: Track & trace parcels in the cloud with Kafka

Austrian Post leverages data streaming to track and trace parcels end-to-end across the delivery routes:

Parcel tracking at Austrian Post
Source: Austrian Post

Austrian Post’s data streaming infrastructure runs on Microsoft Azure. They evaluated three technologies with the following results, in their own words:

  • Azure Event Hubs (fully managed, only the Kafka protocol, not true Kafka, with various limitations): Not flexible enough, limited stream processing, no schema registry.
  • Apache Kafka (open source, self-managed): Way too much hassle.
  • Confluent Cloud on Azure (fully-managed, complete platform): Selected option.

One example use case at Austrian Post concerns problems with ident codes: They are not unique. Instead, they can (and will) be re-used. Shipments can have more than one ident code. Scan events for ident codes need to be added to the correct “digital twin” of a parcel delivery.

Stream processing enables the implementation of such a stateful business process:

Stream processing at Austrian Post for stateful parcel shipping analytics
Source: Austrian Post
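
The implementation details are not public, but the core idea can be sketched with a Kafka Streams stream-table join: maintain a table of which shipment currently owns each ident code, and attach every incoming scan to that shipment’s digital twin. All topic names are hypothetical:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class IdentCodeRouter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "ident-code-router");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // State: the shipment that currently owns each ident code (codes get re-used over time)
        KTable<String, String> activeShipments = builder.table("active-shipments");
        // Raw scan events, keyed by ident code
        KStream<String, String> scans = builder.stream("scan-events");

        // Attach each scan to the digital twin of the currently active shipment
        scans.join(activeShipments, (scan, shipmentId) -> shipmentId + "|" + scan)
             .to("shipment-scan-events");

        new KafkaStreams(builder.build(), props).start();
    }
}
```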

Hermes: Predictive delivery planning with CDC and Kafka

Hermes is another German delivery company. Their goal: Making business decisions more data-driven with real-time analytics. To achieve this goal, Hermes integrates, processes, and correlates data generated by machines, companies, humans, and interactions for predictive delivery planning.

They leverage Change Data Capture (CDC) with HVR and Kafka for real-time delivery and collection services. Databases like MongoDB and Redis provide long-term storage and analytical capabilities:

Real-time logistics and shipping at Hermes with Kafka HVR MongoDB and Redis
Source: Hermes

This is an excellent example of technology and architecture modernization, combining data streaming and various databases.

USPS: Digital representation of all critical assets in Kafka for real-time logistics

USPS (United States Postal Service) is by geography and volume the globe’s largest postal system. They started the Kafka journey in 2016. Today, USPS operates a hybrid multi-cloud environment including real-time replication across regions.

“Kafka processes every event that is important for us,” said USPS CIO Pritha Mehra at Current 2022. Kafka events provide a digital representation of all assets important to USPS, including carrier movement, vehicle movement, trucks, package scans, etc. For instance, USPS processes 900 million scans per day.

Apache Kafka use cases at USPS

One interesting use case was an immediate response to a White House directive in late 2021 to send Covid test kits to every American free of charge. Time-to-market for the project was three weeks (!). USPS processed up to 8.7 million test kits per hour with help from Kafka:

Covid Test Kit Order Flow at USPS

Baader: Real-time logistics for dynamic routing and ETA calculations

BAADER is a worldwide manufacturer of innovative machinery for the food processing industry. They run an IoT-based and data-driven food value chain on Confluent Cloud.

The Kafka-based infrastructure is running as a fully-managed service in the cloud. It provides a single source of truth across factories and regions along the food value chain. Business-critical operations are available 24/7 for tracking, calculations, alerts, etc.:

Food Supply Chain at Baader with Confluent Cloud

MQTT provides connectivity to machines and GPS data from vehicles at the edge. Kafka Connect connectors integrate MQTT and IT systems, such as Elasticsearch, MongoDB, and AWS S3. ksqlDB processes the data in motion continuously.
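
In production this integration runs through Kafka Connect, but the data flow itself is easy to illustrate with a minimal hand-rolled bridge using the Eclipse Paho MQTT client. Broker addresses and topic names are assumptions:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.eclipse.paho.client.mqttv3.MqttClient;

public class MqttToKafkaBridge {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        MqttClient mqtt = new MqttClient("tcp://localhost:1883", "bridge-1");
        mqtt.connect();
        // Forward every machine/GPS reading into a Kafka topic, keyed by the MQTT topic name
        mqtt.subscribe("machines/+/telemetry", (topic, message) ->
                producer.send(new ProducerRecord<>("machine-telemetry", topic,
                        new String(message.getPayload()))));
        // The MQTT client keeps the JVM alive; a real bridge would add error handling and shutdown hooks
    }
}
```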

Check my blog series about Kafka and MQTT for other related IoT use cases and examples.

Shippeo: A Kafka-native transportation platform for logistics providers, shippers, and carriers

Shippeo provides real-time and multimodal transportation visibility for logistics providers, shippers, and carriers. Its software uses automation and artificial intelligence to share real-time insights, enable better collaboration, and unlock a supply chain’s full potential. The platform can give instant access to predictive, real-time information for every delivery.

Shippeo integrates traditional databases (MySQL and PostgreSQL) and cloud-native data warehouses (Snowflake and BigQuery) with Apache Kafka and Debezium:

From MySQL and PostgreSQL to Snowflake and BigQuery with Kafka and Debezium at Shippeo

This is a terrific example of cloud-native enterprise architecture leveraging a “best of breed” approach for data warehousing and analytics. Kafka decouples the analytical workloads from the transactional systems and handles the backpressure for slow consumers.
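
Debezium publishes database changes as JSON envelopes containing before/after row images. Here is a hedged consumer sketch that extracts the after-state for downstream loading, assuming the default JSON envelope and a hypothetical topic name:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ChangeEventConsumer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "warehouse-loader");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        ObjectMapper mapper = new ObjectMapper();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("dbserver1.public.shipments")); // topic name is an assumption
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(1))) {
                    JsonNode envelope = mapper.readTree(rec.value());
                    JsonNode after = envelope.path("payload").path("after"); // row state after the change
                    if (!after.isMissingNode()) {
                        System.out.println("Upsert shipment: " + after);
                    }
                }
            }
        }
    }
}
```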

A real-time locating system (RTLS) built with Apache Kafka

I want to end this blog post with a more concrete example of a Kafka implementation. The following picture shows a multi-purpose Kafka-native real-time locating system (RTLS) for transportation and logistics:

Data Streaming for a Real-Time Locating and Tracking System (RTLS)

The example shows three use cases of how produced events (“P”) are consumed and processed:

  • (“C1”) Real-time alerting on a single event: Monitor assets and people and send an alert to a controller, mobile app, or any other interface if an issue happens (see the consumer sketch after this list).
  • (“C2”) Continuous real-time aggregation of multiple events: Correlate data while it is in motion. Calculate averages, enforce business rules, apply analytic models for predictions on new events, or execute any other business logic.
  • (“C3”) Batch analytics on all historical events: Take all historical data to find insights, e.g., for analyzing past issues, planning future location requirements, or training analytic models.
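
Because each consumer application uses its own consumer group, the three patterns read the same produced events independently and at their own pace. A minimal sketch of the (“C1”) alerting consumer; the topic name and the alert rule are illustrative assumptions:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GeofenceAlertConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "rtls-alerting"); // own group: reads independently of C2/C3
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("asset-positions")); // hypothetical topic of location events
            while (true) {
                for (ConsumerRecord<String, String> position : consumer.poll(Duration.ofMillis(500))) {
                    // Simplified rule: alert whenever an asset reports a restricted zone
                    if (position.value().contains("\"zone\":\"restricted\"")) {
                        System.out.println("ALERT for asset " + position.key() + ": " + position.value());
                    }
                }
            }
        }
    }
}
```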

The Kafka-native RTLS can run in the data center, cloud, or closer to the edge, e.g., in a factory close to the shop floor and production lines. The blog post “Real Time Locating System (RTLS) with Apache Kafka for Transportation and Logistics” explores this use case in more detail.

The logistics and transportation industry requires Kafka-native real-time data streams!

Real-time data beats slow data. That’s true almost everywhere. But logistics, shipping, and transportation cannot build efficient and innovative business models without real-time information and correlated decisions, recommendations, and alerts. Kafka is everywhere in this industry. And it is just getting started.

After writing this blog post, I realized most case studies were from European companies. This is purely coincidental. I assure you that similar companies in the US, Asia, and Australia have built or are building similar enterprise architectures.

If you still want to learn more, check out my other related blog posts.

What role does data streaming play in your logistics and transportation scenarios? Do you run everything around Kafka in the cloud or operate hybrid edge scenarios? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

A Real-Time Supply Chain Control Tower powered by Kafka
https://www.kai-waehner.de/blog/2022/09/23/supply-chain-control-tower-for-end-to-end-visibility-using-apache-kafka/ (September 23, 2022)

A modern supply chain requires just-in-time production, global logistics, and complex manufacturing processes. Intelligent control of the supply network is becoming increasingly demanding, not just because of Covid. At the same time, digitalization is generating exponentially growing data streams along the value chain. This blog post explores a solution that links the control with the data streams leveraging Apache Kafka to provide end-to-end visibility of your supply chain. All the digital information flows into a unified central nervous system enabling comprehensive control and timely response. The idea of the Supply Chain Control Tower becomes a reality: An integrated data cockpit with real-time access to all levels and systems of the supply chain.

Real-Time Supply Chain Control Tower with Apache Kafka

What is a supply chain control tower?

A supply chain is a network of companies and people that are involved in the production and delivery of a product or service. Today, many supply chains are global and involve intra-logistics, widespread enterprise logistics, and B2B data sharing for end-to-end supply chains.

Supply chain management (SCM)

Supply Chain Management (SCM) involves planning and coordinating all the people, processes, and technology involved in creating value for a company. This includes cross-cutting processes, including purchasing/procurement, logistics, operations/manufacturing, and others. Automation, robustness, flexibility, real-time capabilities, and hybrid deployment (edge + cloud) are essential for future success and a prerequisite for end-to-end visibility across the supply chain, regardless of industry.

Monitoring Logistics
Source: Aadini (YouTube Channel)

The challenge with logistics and supply chain is that you have a lot of commercial off-the-shelf applications (ERP, WMS, TMS, DPS, CRM, SRM, etc.) in place that are highly specialized and highly advanced in their function.

Challenges of batch workloads across the supply chain

Batch workloads create many issues in a supply chain. The Covid pandemic showed this:

  • Missing information: Intra-logistics within a distribution or fulfillment center, across buildings and regions, and between business partners.
  • Rising costs: Lower overall equipment effectiveness (OEE). Lower availability. Increased costs for producers and buyers (B2B and B2C).
  • Customer churn: Bad customer experiences and contract disputes as a consequence of failing to meet delivery guarantees and other SLAs.
  • Decreasing revenue: Less production and/or sales means less revenue.

The specialized systems like an ERP, WMS, TMS, DPS, CRM, or SRM are often modernized and real-time or nearly real-time in their operation. However, the integrations between them are often still not real-time. For instance, batch waves in a WMS are being replaced with real-time order allocation processes, but the link between the WMS and the ERP is still batch.

Real-time end-to-end monitoring

A supply chain control tower provides end-to-end visibility and real-time monitoring across the supply chain:

Real-Time Supply Chain Control Tower
Source: Aadini (YouTube Channel)

The control tower helps answer questions such as:

  • What is happening now?
  • Why is this happening?
  • What might happen next?
  • How can we perform better?
  • What if…?

A supply chain control tower combines technology, processes, and people. Check out this tremendous seven-minute YouTube video to get a simple explanation of the supply chain control tower “for dummies”. Most importantly, the evolution of software enables real-time automation instead of just human visual monitoring.

A Kafka-native supply chain control tower

Apache Kafka is the de facto standard for data streaming. It enables real-time data integration and processing at any scale. This is a crucial requirement for building a supply chain control tower.

I blogged about Apache Kafka for Supply Chain Management (SCM) Optimization before. Check it out to learn about real-time applications for logistics, inventory management, track and trace, and other real-time deployments from BMW, Bosch, Baader, and more.

Let’s recap the added value of Kafka for improving the supply chain in one picture of business value across use cases:

Use Cases for Event Streaming with Apache Kafka in the Supply Chain

If you look at the definitions of a supply chain control tower, well, that’s more or less the same 🙂 A supply chain control tower is only possible with real-time data correlation. That’s why Kafka is the perfect fit.

Global data mesh for real-time data sharing

Supply chains are global and rely on the collaboration between independent business units within enterprises, and B2B communication:

Global real-time supply chain data mesh
Source: Aadini (YouTube Channel)

The software industry pitches a new paradigm and architecture pattern these days: The Data Mesh. It allows independent, decoupled domains, each using its own technology, APIs, and data structures. But the data sharing is standardized, compatible, real-time, and reliable. The heart of such a data mesh beats in real time with the power of Apache Kafka:

Data Mesh with Apache Kafka

Kafka-native tools like MirrorMaker or Confluent Cluster Linking enable reliable real-time replication across data centers, clouds, regions, and even between independent companies.

Kafka as the data hub for a real-time supply chain control tower

A supply chain control tower coordinates demand, supply, trading, and logistics across various domains, technologies, APIs, and business processes.

The control tower is not just a Kafka cluster. Kafka underpins the business logic, integrations, and visualizations that aggregate all the data from the various sources and make it more valuable from end to end in real time:

Supply Chain Control Tower powered by Data Streaming with Apache Kafka

The data communication happens in real time, near real time, batch, or request-response, depending on what the source and sink applications support. However, the heart of the enterprise architecture is real-time and scalable. This enables future modernization and replacing legacy batch systems with modern real-time services.

Visualization AND automatic resolutions as core components of a modern control tower

The first goal of a supply chain control tower is end-to-end visibility in real time. Creating real-time dashboards, alerts, and integrations with 3rd-party monitoring tools provides massive value.

However, the pipeline, once built, enables several additional use cases. Data integration and correlation across legacy and modern applications is the significant enabler of innovation in supply chain optimization: Instead of just visualizing what happens in real time, new applications can take automated actions and decisions to solve problems based on real-time information.

Real-world Kafka examples across the supply chain for visibility and automation

Modern supply chains rely on real-time data across suppliers, fulfillment centers, warehouses, stores, and consumers. Companies across verticals use Apache Kafka as a real-time supply chain control tower.

No matter what industry you work in, learn from other companies across verticals. The challenges of improving the supply chain are very similar everywhere.

Postmodern ERP / MES / TMS powered by Apache Kafka

While the supply chain control tower provides end-to-end visibility, each system has similar requirements. Enterprise resource planning (ERP) has existed for many years. It is often monolithic, complex, proprietary, batch, and not scalable. The same is true for MES, TMS, and many other platforms for logistics and supply chain management.

Postmodern ERP represents the next generation of ERP architectures. It is real-time, scalable, and open. A Postmodern ERP combines open source technologies and proprietary standard software. Many solutions are cloud-native or even offered as fully managed SaaS.

Like end-users, software vendors leverage data streaming with Apache Kafka to implement a Postmodern ERP, MES, CRM, SRM, or TMS:

Postmodern ERP with Apache Kafka SAP S4 Hana Oracle XML Web Services MES

Logistics and supply chain management require real-time data

Data streams grow exponentially along the value chain. End-to-end visibility of the supply chain is crucial for optimized business processes. Real-time order management and inventory management are just two examples.

Visualizing and taking action in real time via automated processes either reduces cost and risk or increases revenue and improves the customer experience. A real-time supply chain control tower enables this kind of innovation. The foundation of such a strategic component needs to be real-time, scalable, and reliable. That’s why the data streaming platform Apache Kafka is the perfect fit for building a control tower.

Various success stories across industries prove the value of data streaming across the supply chain. Even software vendors of products for the supply chain like ERP, WMS, TMS, DPS, CRM, or SRM build their next-generation software on top of Apache Kafka.

How do you bring visibility into your supply chain? Did you already build a real-time control tower? What role does data streaming play in these scenarios? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Real-Time Supply Chain with Apache Kafka in the Food and Retail Industry
https://www.kai-waehner.de/blog/2022/02/25/real-time-supply-chain-with-apache-kafka-in-food-retail-industry/ (February 25, 2022)

The supply chain in the food and retail industry is complex, error-prone, and slow. This blog post explores real-world deployments across the end-to-end supply chain powered by data streaming with Apache Kafka to improve business processes with real-time services. The examples include manufacturing, logistics, stores, delivery, restaurants, and other parts of the business. Case studies include Walmart, Albertsons, Instacart, Domino’s Pizza, Migros, and more.

The supply chain in the food and retail industry

The food industry is a complex, global network of diverse businesses that supplies most of the food consumed by the world’s population. It is far beyond the following simplified visualization 🙂

The Food and Retail Industry Simplified

The term food industry covers a series of industrial activities directed at the production, distribution, processing, conversion, preparation, preservation, transport, certification, and packaging of foodstuffs.

Today’s food industry has become highly diversified, with manufacturing ranging from small, traditional, family-run activities that are highly labor-intensive to large, capital-intensive, and highly mechanized industrial processes.

This post explores several case studies with use cases and architectures to improve supply chain and business processes with real-time capabilities.

Most of the following companies leverage Kafka for various use cases, which sometimes overlap and compete. I structured the next sections to pick a specific example from each company. This way, you see a complete real-time supply chain flow using data streaming examples from different industries related to the food and retail business.

Why do the food and retail supply chains need real-time data streaming using Apache Kafka?

Before I start with the real-world examples, let’s discuss why I thought this was the right time for this post to collect a few case studies.

In February 2022, I was in Florida for a week for customer meetings. Let me report on my inconvenient experience as a frequent traveler and customer. Coincidentally, all this happened within one weekend.

A horrible weekend travel experience powered by manual processes and batch systems

  • Issue #1 (hotel): While I canceled a hotel night a week ago, I still got emails offering upgrades from the batch system of the hotel’s booking engine. Conversely, I did not get promotions for upgrading my new booking at the same hotel chain.
  • Issue #2 (clothing store): The point-of-sale (POS) system was down because of a power outage and had no connectivity to the internet. The POS could not process payments without the internet. People left without buying items. I needed the new clothes (as my hotel also did not provide laundry service because of the limited workforce).
  • Issue #3 (restaurant): I went to a steak house to get dinner. Their POS was down, too, because of the same power outage. The waiter could not take orders. The separate software in the kitchen could not receive manual orders from the waiter either.
  • Issue #4 (restaurant): After 30 minutes of waiting, I ordered and had dinner. However, the dessert I chose was not available. The waiter explained that supply chain issues made it unavailable, but the restaurant had no way to update the paper menu or online PDF.
  • Issue #5 (clothing store): After dinner, I went back to the other store to buy my clothes. It worked. The salesperson asked me for my email address to give a 15 percent discount. I wonder why they don’t see my loyalty information already. Maybe via a location-based service. Or at least, I would expect to pay and log in via my mobile app.
  • Issue #6 (hotel): Back in the hotel, I checked out. The hotel did not add my loyalty bonus (a discount on food in the restaurant) as the system only supports batch integration with other applications. Also, the receipt could not display loyalty information and the correct receipt value. I had to accept this, even though this is not even legal under German tax law.

This story shows why the digital transformation is crucial across all parts of the food, retail, and travel industry. I have already covered use cases and architecture for data streaming with Apache Kafka in the Airline, Aviation, and Travel Industry. Hence, this post focuses more on food and retail. But the concepts do not differ across these industries.

As you will see in the following sections, optimizing business processes, automating tasks, and improving customer experience is not complicated with the right technologies.

Digital transformation with automated business processes and real-time services across the supply chain

With this story in mind, I thought it was a good time to share various real-world examples across the food and retail supply chain. Some companies have already started with innovation. They digitalized their business processes and built innovative new real-time services.

The Real-Time Food and Retail Supply Chain powered by Apache Kafka

Food processing machinery for IoT analytics at Baader

BAADER is a worldwide manufacturer of innovative machinery for the food processing industry. They run an IoT-based and data-driven food value chain on Confluent Cloud.

The Kafka-based infrastructure provides a single source of truth across factories and regions along the food value chain. Business-critical operations are available 24/7 for tracking, calculations, alerts, etc.

Food Supply Chain at Baader with Apache Kafka and Confluent Cloud

The event streaming platform runs on Confluent Cloud. Hence, Baader can focus on building innovative business applications. The serverless Kafka infrastructure provides mission-critical SLAs and consumption-based pricing for all required capabilities: messaging, storage, data integration, and data processing.

MQTT provides connectivity to machines and GPS data from vehicles at the edge. Kafka Connect connectors integrate MQTT and other IT systems, such as Elasticsearch, MongoDB, and AWS S3. ksqlDB processes the data in motion continuously.

Check out my blog series about Kafka and MQTT for other related IoT use cases and examples.

Logistics and track & trace across the supply chain at Migros

Migros is Switzerland’s largest retail company, largest supermarket chain, and largest employer. They leverage MQTT and Kafka for real-time visualization and processing of logistics and transportation information.

As Migros explored in a Swiss Kafka meetup, they optimized their supply chain with a single data streaming pipeline, not just for real-time purposes but also for use cases that require replaying all historical events of the whole day.

The goal was to provide “one global optimum instead of four local optima”. Specific logistics use cases include forecasting the truck arrival time and rescheduling truck tours.

Optimized inventory management and replenishment at Walmart

Walmart operates Kafka across its end-to-end supply chain at a massive scale. The VP of Walmart Cloud says, “Walmart is a $500 billion in revenue company, so every second is worth millions of dollars. Having Confluent as our partner has been invaluable. Kafka and Confluent are the backbone of our digital omnichannel transformation and success at Walmart.”

The real-time supply chain includes Walmart’s inventory system. As part of that infrastructure, Walmart built a real-time replenishment system:

Walmart Replenishment System and Real Time Inventory

Here are a few notes from Walmart’s Kafka Summit talk covering this topic.

  • Caters to millions of online and walk-in customers
  • Ensures optimal availability of the needed assortment and timely delivery for online fulfillment
  • Processes 4+ billion messages in 3 hours to generate an order plan for the entire network of Walmart stores, with great accuracy for the ordering decisions made daily
  • Apache Kafka as the data hub and for real-time processing
  • Apache Spark for micro-batches

The business value is enormous from a cost and operations perspective: cycle time reduction, accuracy, speed, reduced complexity, elasticity, scalability, improved resiliency, and reduced cost.

And to be clear: This is just one of many Kafka-powered use cases at Walmart! Check out other Kafka Summit presentations for different use cases and architectures.

Omnichannel order management in restaurants at Domino’s Pizza

Domino’s Pizza is a multinational pizza restaurant chain with ~17,000 stores. They successfully transformed from a traditional pizza company into a data-driven e-commerce organization with a data-first approach and a relentless focus on customer experience.

Domino’s Pizza provides real-time operation views to franchise owners. This includes KPIs, such as order volume by channel and store efficiency metrics across different ordering channels.

Domino’s Pizza mentions the following capabilities and benefits of their real-time data hub to optimize their supply chain:

  • Improve store operational real-time analytics
  • Support global expansion goals via legacy IT modernization
  • Implement more personalized marketing campaigns
  • Real-time single pane of glass

Location-based services and upselling at the point-of-sale at Royal Caribbean

Royal Caribbean is a cruise line. It operates the four largest passenger ships in the world. As of January 2021, the line operates twenty-four ships and has six additional ships on order.

Royal Caribbean implemented one of Kafka’s most famous use cases at the edge. Each cruise ship has a Kafka cluster running locally for use cases, such as payment processing, loyalty information, customer recommendations, etc.:

Swimming Retail Stores at Royal Caribbean with Apache Kafka

Edge computing on the ship is crucial for an efficient supply chain and excellent customer experience at Royal Caribbean:

  • Poor and costly internet connectivity at sea
  • The requirement to do edge computing in real-time for a seamless customer experience and increased revenue
  • Aggregation of all the cruise trips in the cloud for analytics and reporting to improve the customer experience, upsell opportunities, and many other business processes

Hence, a Kafka cluster on each ship enables local processing and reliable, mission-critical workloads. Kafka’s storage guarantees durability, no data loss, and guaranteed ordering of events – even if they are processed later. Only very critical data is sent directly to the cloud (if there is connectivity at all). All other information is replicated to the central Kafka cluster in the cloud when the ship arrives in a harbor for a few hours – a stable, high-bandwidth internet connection is available there before the ship leaves for the next trip.

Learn more about use cases and architecture for Kafka at the edge in the dedicated blog post.

Recommendations and discounts using the loyalty platform at Albertsons

Albertsons is the second-largest American grocery company with 2200+ stores and 290,000+ employees. They also operate pharmacies in many stores.

Albertsons operates Kafka powered by Confluent as the real-time data hub for various use cases, including:

  • Inventory updates from 2,200 stores to the cloud services in near real-time to maintain visibility into the health of the stores and to ensure well-managed out-of-stock and substitution recommendations for digital customers
  • Distributing offers and customer clips to each store in near real-time and improving the store checkout experience
  • Feeding the data in real-time to analytics engines for supply chain order forecasts, warehouse order management, delivery, labor forecasting and management, demand planning, and inventory management
  • Ingesting transactions in near real-time into the data lake for batch workloads, such as generating dashboards for associates and business teams, training data models, and hyper-personalization

As an example, here is how Albertsons data streaming architecture empowers their real-time loyalty platform:

Albertsons Loyalty Use Case in Grocery Stores powered by Apache Kafka

Learn more about Albertsons’ innovative real-time data hub by watching the on-demand webinar I did with two of their leading architects recently.

Payment processing and fraud detection at Grab

Grab is a mobility service in Asia, similar to Uber and Lyft in the United States or FREE NOW (formerly mytaxi) in Europe. The Kafka ecosystem heavily powers all of them. Like FREE NOW, Grab leverages serverless Confluent Cloud to focus on business logic.

As part of this, Grab has built GrabDefence. This internal service provides a mission-critical payment and fraud detection platform. The platform leverages Kafka Streams and Machine Learning for stateful stream processing and real-time fraud detection at scale.

GrabDefence – Fraud Detection with Kafka Streams and Machine Learning in Grab Mobility Service

Grab performs billions of fraud and safety detections daily across millions of transactions. Companies in Southeast Asia lose an estimated 1.6 percent of revenue to fraud. Therefore, Grab’s data science and engineering team built a platform to search for anomalous and suspicious transactions and identify high-risk individuals.

Here is one fraud example: An individual who pretends to be both the driver and passenger and makes cashless payments to get promotions.
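As a toy sketch of how such a pattern could be expressed – this is not GrabDefence’s actual logic, and the event shape and topic names are made up – a Kafka Streams topology can flag this specific pattern with a simple stateless filter:

```java
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class FraudFilter {
    // Hypothetical ride event; field names are illustrative
    public record RideEvent(String driverAccountId, String passengerAccountId, boolean cashless) {}

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "fraud-filter-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        // A JSON serde for RideEvent would be configured here; omitted for brevity
        KStream<String, RideEvent> rides = builder.stream("rides");

        // Flag rides where one account acts as both driver and passenger
        // and pays cashless -- the promotion-abuse pattern described above
        rides.filter((rideId, e) -> e.cashless()
                && e.driverAccountId().equals(e.passengerAccountId()))
             .to("suspicious-rides");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

Real fraud detection combines many such signals with stateful aggregations and model scoring; this filter only illustrates the stream processing mechanics.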

Delivery and pickup at Instacart

Instacart is a grocery delivery and pickup service in the United States and Canada. The service orders groceries from participating retailers, with the shopping being done by a personal shopper.

Instacart requires a platform that enables elastic scale and fast, agile internal adoption of real-time data processing. Hence, Confluent Cloud was chosen to focus on business logic and roll out new features with fast time-to-market. As Instacart said, during the Covid pandemic, they had to “handle ten years’ worth of growth in six weeks”. Very impressive and only possible with cloud-native and serverless data streaming.

Learn more in the recent Kafka Summit interview between Confluent founder Jun Rao and Instacart.

Real-time data streaming with Kafka beats slow data everywhere in the supply chain

This post showed various real-world deployments where real-time information increases operational efficiency, reduces costs, or improves the customer experience.

Whether your business domain cares about manufacturing, logistics, stores, delivery, or customer experience and loyalty, data streaming with the Apache Kafka ecosystem provides the capability to be innovative at any scale. And in the cloud, serverless Kafka makes it even easier to focus on time to market and business value.

How do you optimize your supply chain? Do you leverage data streaming? Is Apache Kafka the tool of choice for real-time data processing? Or what data streaming technologies do you use? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Real-Time Supply Chain with Apache Kafka in the Food and Retail Industry appeared first on Kai Waehner.

]]>
Apache Kafka Landscape for Automotive and Manufacturing https://www.kai-waehner.de/blog/2022/01/12/apache-kafka-landscape-for-automotive-and-manufacturing/ Wed, 12 Jan 2022 12:07:20 +0000 https://www.kai-waehner.de/?p=4124 Apache Kafka is the central nervous system of many applications in various areas related to the automotive and manufacturing industry for processing analytical and transactional data in motion across edge, hybrid, and multi-cloud deployments. This article explores the event streaming landscape for automotive including connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models.

The post Apache Kafka Landscape for Automotive and Manufacturing appeared first on Kai Waehner.

]]>
Before the Covid pandemic, I had the pleasure of visiting “Motor City” Detroit in November 2019. I met with several automotive companies, suppliers, startups, and cloud providers to discuss use cases and architectures around Apache Kafka. A lot has happened. Since then, I have also met several OEMs and suppliers in Europe and Asia. As I finally go back to Detroit this January 2022 to meet customers again, I thought it would be a good time to update the status quo of event streaming and Apache Kafka in the automotive and manufacturing industry.

Today, in 2022, Apache Kafka is the central nervous system of many applications in various areas related to the automotive and manufacturing industry for processing analytical and transactional data in motion across edge, hybrid, and multi-cloud deployments. This article explores the automotive event streaming landscape, including connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models.

Automotive and Manufacturing Landscape for Apache Kafka

The Event Streaming Landscape for Automotive and Manufacturing

Every business domain leverages Event Streaming with Apache Kafka in the automotive and manufacturing industry. Data in motion helps everywhere. The infrastructure and deployment differ depending on the use case and requirements. I have seen everything at carmakers and manufacturers across the globe:

  • Cloud-first strategy with all new business applications in the public cloud deployed and connected across regions and even continents
  • Hybrid integration scenarios between legacy applications in the data center and modern cloud-native services in the public cloud
  • Edge computing in a smart factory for low latency, cost-efficient data processing, and cybersecurity
  • Embedded Kafka brokers in machines and vehicles at the disconnected edge

This spread of use cases is impressive. The following diagram depicts a high-level overview:

Automotive and Manufacturing Landscape for Apache Kafka with Edge and Hybrid Cloud

The following sections describe the automotive and manufacturing landscape for event streaming in more detail:

  • Manufacturing 4.0
  • Supply Chain Optimization
  • Mobility Services
  • New Business Models

If you are mainly interested in real-world Kafka deployments with examples from BMW, Porsche, Audi, Tesla, and other OEMs, check out the article “Real-World Deployments of Kafka in the Automotive Industry“.

If you want to understand why Kafka makes such a difference in automotive and manufacturing, check out the article “Apache Kafka in the Automotive Industry“. This article explores the business motivation for these game-changing concepts of data in motion for the digitalization of the automotive industry.

Before you start reading the below section, I want to clearly emphasize that Kafka is not the silver bullet for every problem. “When NOT to use Apache Kafka?” digs deep into this discussion.

I keep the following sections relatively short to give a high-level overview. Each section contains links to more deep-dive articles about the topics.

Manufacturing 4.0

Industrial IoT (IIoT), also known as Industry 4.0, changes how the shop floor and production lines produce goods. Automation, process efficiency, and a much better Overall Equipment Effectiveness (OEE) enable cost reduction and flexibility in the production process:

Manufacturing and Industrial IoT with Apache Kafka

Smart Factory

A smart factory is not necessarily a newly built facility like a Tesla Gigafactory. Many enterprises install smart technology, such as networked sensors for temperature or vibration measurements, into old factories. Improving the Overall Equipment Effectiveness (OEE) is the primary goal of most use cases. Many scenarios leverage Kafka for continuously processing sensor and telemetry data in motion.

Legacy Modernization with Open APIs and Hybrid Cloud

Factories operate for decades after they are built. Digitalization and the modernization of legacy technologies are among the biggest challenges in IIoT projects. Such an initiative usually includes several tasks.

Continuous Data-driven Engineering and Product Development

Last but not least, an opportunity many people underestimate: Continuous data streaming with Kafka enables new possibilities in software engineering and product development for IoT and automotive projects.

For instance, developing and deploying the “big loop” for machine learning of advanced driver-assistance systems (ADAS) or self-driving functions based on sensor data from the fleet is a new way of software engineering. Tesla’s Kafka-based data platform is a fantastic example. A related use case in engineering is the ingestion of sensor data during and after test drives.

Supply Chain Optimization

Supply chain processes and solutions are very complex. The Covid pandemic showed how only flexible enterprises could survive, stay profitable, and provide a great customer experience, even in disastrous external events.

Here are the top 5 critical challenges of supply chains:

  • Time frames are shorter
  • Rapid change
  • A zoo of technologies and products
  • Historical models are no longer viable
  • Lack of visibility

Only real-time data streaming and correlation solve these supply chain challenges end-to-end across regions and companies:

Supply Chain Optimization in Automotive at the Edge and in the Cloud with Apache Kafka

I covered supply chain optimization with Apache Kafka in a dedicated, detailed blog post. Check it out to learn about real-world supply chain use cases from Bosch, BMW, Walmart, and other companies.

Intra-logistics and Global Distribution Networks

Logistics and supply chains within a factory, distribution center, or store require real-time data integration and processing for the efficient handling of goods and a great customer experience. Batch processes or manual interaction by human workers cannot implement these use cases. Examples include:

Track & Trace and Fleet Management

Real-time logistics is a game-changer for fleet management and track & trace use cases.

  • Commercial motor vehicles such as cars, vans, trucks, specialist vehicles (such as mobile construction machinery), forklifts, and trailers
  • Private vehicles used for work (the ‘grey fleet’)
  • Aviation machinery such as aircraft (planes and helicopters)
  • Ships
  • Rail cars
  • Non-powered assets such as generators, tanks, gearboxes

None of the following aspects is new. The difference is that event streaming allows these tasks to be executed continuously in real-time, acting on new information in motion:

  • Visualization
  • Location-based services
  • Routing and navigation
  • Estimated time of arrival
  • Alerting
  • Proactive recalculation
  • Monitoring of the assets and mechanical components of a vehicle

Most companies have a cloud-first strategy for building such a platform. However, some cases require edge computing – either via a private 5G campus network for low-latency use cases or via embedded Kafka brokers for disconnected data collection and analytics within the vehicles.

Streaming Data Exchange for B2B Collaboration with Partners

Real-time data is not just relevant within a company. OEMs, Tier 1, and Tier 2 suppliers benefit from data streams in the same way. The same is true for car dealerships, end customers, and any other consumer of the data. Hence, a clear trend in the market is the emergence of a Kafka-based streaming data exchange across companies to build a data mesh.

I have often seen this situation in the past: The OEM leverages event streaming. The Tier 1 supplier leverages event streaming. The ERP solution in use is built on Kafka, too. All leverage the capabilities of scalable real-time data streaming. It makes little sense to integrate with partners and software vendors via web service APIs, such as SOAP or HTTP/REST. Instead, a streaming interface is the natural choice for handing streaming data over to partners.

The following example from the automotive industry shows how independent stakeholders (= domains in different enterprises) use a cross-company streaming data exchange:

Streaming Data Exchange with Data Mesh in Motion using Apache Kafka and Cluster Linking

Mobility Services

Every OEM, supplier, or innovative startup in the automotive space thinks about providing a mobility service either on top of the goods they sell or as an independent service.

Most mobility services on your mobile apps used today for business or privately are only possible because of a scalable real-time backbone powered by event streaming:

Mobility Services and Connected Cars with Event Streaming and Apache Kafka

The possibilities for mobility services are endless. A few examples that are already mainstream today:

  • Omnichannel retail and aftersales to buy additional car features online, for instance, more power, seat heating, up-to-date navigation, or self-driving software (okay, the latter is not mainstream yet, but Tesla shows where things are going)
  • Connected Cars for ride-hailing, scooter rental, taxi services, food delivery
  • 3rd-party integration for adding services that a company does not want to build itself

Today’s most successful and widely adopted mobility services are independent of a specific carmaker or supplier.

Examples of prominent Kafka-powered consumer mobility services are Uber and Lyft in the US, Grab in Asia, and FREENOW in Europe. Here Technologies is an excellent example of a B2B mobility service that provides mapping information so that companies can build new applications or improve existing ones on top of it.

A good starting point to learn more is my blog post about Apache Kafka and MQTT for mobility services and transportation.

New Business Models

Access to real-time data enables companies to build entirely new business models on top of their existing products:

New Automotive Business Models enabled by Event Streaming with Apache Kafka

A few examples:

  • Next-generation car rental with an excellent customer experience, context-specific coupons, a loyalty platform, and car rental fleets combined with other services from the carmaker.
  • Reinventing car insurance with driver-specific pricing based on real-time analysis of driving behavior, instead of legacy approaches using statistical models with attributes like driver age or the number of past accidents (see the sketch after this list).
  • Data monetization that enables other companies to build new business models with your car data – for instance, working with a government on a smart city traffic system, or with a mobility service startup to analyze and correlate car data across OEMs.
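To sketch the driving-behavior analysis from the insurance example above – a minimal illustration with made-up topic and field names, not a production pricing engine – a Kafka Streams application could count risky driving events per driver over a rolling window:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

public class DriverRiskScoring {
    // Hypothetical telemetry event; field names are illustrative
    public record DrivingEvent(String driverId, double speedKmh, boolean harshBraking) {}

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "driver-risk-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, DrivingEvent> telemetry = builder.stream("vehicle-telemetry"); // serdes omitted

        // Count harsh-braking events per driver over a rolling day -- one possible
        // input feature for usage-based pricing
        telemetry.filter((key, e) -> e.harshBraking())
                 .groupBy((key, e) -> e.driverId())
                 .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofDays(1)))
                 .count()
                 .toStream()
                 .to("driver-risk-scores"); // windowed-key serde omitted for brevity

        new KafkaStreams(builder.build(), props).start();
    }
}
```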

This evolution is just the beginning of the usage of streaming data. Many customers build their first streaming pipeline for a single use case. Once the platform is there, other business divisions leverage the data for further innovations.

The Data is in Motion in Automotive and Manufacturing

The landscape for Apache Kafka in the automotive and manufacturing industry showed that Apache Kafka is the central nervous system of many applications in various areas for processing analytical and transactional data in motion.

This article explored use cases such as connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models. The possibilities for data in motion are almost endless. The automotive and manufacturing industry is still in the very early stages of leveraging data in motion.

Where do you use Apache Kafka and its ecosystem in the automotive and manufacturing industry? Do you deploy in the public cloud, in your data center, or at the edge outside a data center? What other technologies do you combine with Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka Landscape for Automotive and Manufacturing appeared first on Kai Waehner.

]]>
Omnichannel Retail and Customer 360 in Real Time with Apache Kafka https://www.kai-waehner.de/blog/2021/02/08/omnichannel-retail-customer-360-apache-kafka-edge-cloud-manufacturing-aftersales/ Mon, 08 Feb 2021 11:03:58 +0000 https://www.kai-waehner.de/?p=3090 Event Streaming with Apache Kafka disrupts the retail industry. Walmart's real-time inventory system and Target's omnichannel distribution and logistics are two great examples. This blog post explores a key use case for postmodern retail companies: Real-time omnichannel retail and customer 360.

The post Omnichannel Retail and Customer 360 in Real Time with Apache Kafka appeared first on Kai Waehner.

]]>
Event Streaming with Apache Kafka disrupts the retail industry. Walmart’s real-time inventory system and Target’s omnichannel distribution and logistics are two great examples. This blog post explores a key use case for postmodern retail companies: Real-time omnichannel retail and customer 360.

Omnichannel Retail and Customer 360 with Apache Kafka at the Edge and in the Cloud

Disruption of the Retail Industry with Apache Kafka

Various deployments across the globe leverage event streaming with Apache Kafka for very different use cases. Consequently, Kafka is the right choice, whether you need to optimize the supply chain, disrupt the market with innovative business models, or build a context-specific customer experience.

I discussed the use cases of Apache Kafka in retail in a dedicated blog post: “The Disruption of Retail with Event Streaming and Apache Kafka“. Learn about the real-time inventory system from Walmart, omnichannel distribution and logistics at Target, context-specific customer 360 at AO.com, and much more.

This post explores a specific use case in more detail: Real-time omnichannel retail and customer 360 with Apache Kafka. Learn about a possible architecture to deploy this scenario across the whole supply chain: From design and manufacturing to sales and aftersales. The architecture is very flexible. Any infrastructure (one or multiple cloud providers and/or on-premise data centers, bare metal, containers, Kubernetes) can be used.

‘My Porsche’ – A Global Omnichannel Platform for Customers, Fans, and Enthusiasts

Let’s start with a great example from the automotive industry: ‘My Porsche’ is Porsche’s innovative, modern digital omnichannel platform for maintaining a great relationship with its customers. Porsche can describe it better than me:

The way automotive customers interact with brands has changed, accompanied by a major transformation of customer needs and requirements. Today’s brand experience expands beyond the car and other offline touchpoints to include various digital touchpoints. Automotive customers expect a seamless brand experience across all channels — both offline and online.

My Porsche Digital Service Platform Omnichannel

The ‘Porsche Dev‘ group from Porsche published a few great posts about their architecture. Here is a good overview:

My Porsche Architecture with Apache Kafka

Kafka provides real decoupling between applications. Hence, Kafka became the de facto standard for microservices and Domain-driven Design (DDD) in many companies. It allows building independent and loosely coupled, but scalable, highly available, and reliable applications.

That’s exactly what Porsche describes for its usage of Apache Kafka throughout its supply chain:

“The recent rise of data streaming has opened new possibilities for real-time analytics. At Porsche, data streaming technologies are increasingly applied across a range of contexts, including warranty and sales, manufacturing and supply chain, connected vehicles, and charging stations,” writes Sridhar Mamella (Platform Manager Data Streaming at Porsche).

As you can see in the above architecture, there is no need to have a fight between REST / HTTP and Event Streaming / Kafka enthusiasts! As I explained in detail before, most microservice architectures need Apache Kafka and API Management for REST. HTTP and Kafka complement each other very well!

Last but not least, a great podcast link: The Porsche guys talked about “Apache Kafka and Porsche: Fast Cars and Fast Data” to explain their event streaming platform called Streamzilla.

Omnichannel Retail and Customer 360 with Kafka

After exploring an example, let’s take a look at an omnichannel architecture with Apache Kafka from different points of view: Marketing+Sales, Analytics, and Manufacturing.

The good news first: All business units can use the same Kafka cluster! That’s actually pretty common and a key reason why Kafka is used so much today: Events generated in one domain are consumed for very different use cases by different departments and stakeholders. The real decoupling with Kafka’s storage is quite different from HTTP/REST web services or traditional message queues like RabbitMQ.

Marketing, Sales, and Aftersales (aka Customer 360)

From a marketing and sales perspective, companies need to correlate all the customer actions. No matter how old. No matter if online or offline channels. The following shows such a retail process for selling a car:

Omnichannel Retail with Apache Kafka - Customer 360 Sales and Aftersales

Reporting, Analytics, and Machine Learning

The data science team uses the same events as marketing and sales to create reports for management and to train analytic models for better decisions and recommendations in the future. The new, improved analytic model is then deployed back into the production pipeline of the marketing and sales channels:

Omnichannel Retail with Apache Kafka - Reporting Analytics and Machine Learning

In this example, Kafka is used together with TensorFlow for streaming machine learning. The reporting is done via Rockset’s native Kafka integration supporting ANSI SQL. This allows easy integration with business intelligence tools such as Tableau, Qlik, or Power BI in the same way as with other databases and batch data lakes.
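As a rough sketch of model scoring inside a stream processor – with a toy logistic function standing in for a real TensorFlow model, and made-up topic names – the pattern boils down to a `mapValues` over the event stream:

```java
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class StreamingModelScoring {
    // Stand-in for a trained model: a logistic score over a single feature.
    // A real deployment would call a TensorFlow model loaded in the JVM or
    // served remotely instead of this toy function.
    static double score(double feature) {
        return 1.0 / (1.0 + Math.exp(-feature));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "scoring-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, Double> events = builder.stream("customer-events"); // Double serde assumed

        // Score every event while it is in motion and emit the prediction downstream
        events.mapValues(StreamingModelScoring::score)
              .to("scored-events");

        new KafkaStreams(builder.build(), props).start();
    }
}
```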

Obviously, a key strength of Kafka is that it works together with any other technology. No matter if open source or proprietary. SaaS or self-managed, you choose. And no matter if real-time or batch. In most real-world Kafka deployments, the connectivity to systems changes over time.

Design and Manufacturing

Here is a pretty cool feature of some postmodern omnichannel retail solutions: Customers can monitor the whole manufacturing and delivery process. When buying a car, you can track the manufacturing steps in your mobile app to know the exact status of your future car. When you finally pick up the car, all future data is also stored (for you and the carmaker) for various use cases:

Omnichannel Retail with Apache Kafka - Digital Twin Design Manufacturing

 

Of course, this does not end when you pick up the car. Aftersales will become super important for carmakers in the future. Check out what you can already do with a Tesla today – which is actually more software on wheels than just hardware.

Apache Kafka as Digital Twin for Customer 360 and Omnichannel Aftersales

A key buzzword that has to be mentioned here is the digital twin or digital thread. The automotive industry is the perfect example for building a digital twin of each car. Both the carmaker and the buyer can get huge benefits. Use cases like predictive maintenance or context-specific upselling of new features are now possible.

Apache Kafka is the heart of this digital twin in many cases. Often, projects also combine Kafka with other technologies. For instance, Kafka can be the data integration pipeline into MongoDB, which then stores the data of the cars. Check out the following two posts to learn more about this topic:

Omnichannel Customer 360 with Kafka for Happy Customers and Increased Revenue

In conclusion, Event Streaming with Apache Kafka plays a key role in this evolution of re-inventing the retail business. Walmart, Target, and many other retail companies rely on Apache Kafka and its ecosystem to provide a real-time infrastructure to make the customer happy, increase revenue, and stay competitive in this tough industry. Omnichannel and customer 360 end-to-end in real-time is key for success in a postmodern retail company. No matter if you operate in your own data center or in one or more public clouds.

What are your experiences and plans for event streaming in the retail industry? Did you already build applications with Apache Kafka for omnichannel, aftersales, customer 360? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Omnichannel Retail and Customer 360 in Real Time with Apache Kafka appeared first on Kai Waehner.

]]>
Real Time Locating System (RTLS) with Apache Kafka for Transportation and Logistics https://www.kai-waehner.de/blog/2021/01/07/real-time-locating-system-rtls-apache-kafka-asset-tracking-transportation-logistics/ Thu, 07 Jan 2021 07:49:41 +0000 https://www.kai-waehner.de/?p=2963 Real-Time Locating System (RTLS) enables identifying and tracking the location of assets or people in real-time. This blog post explores the use cases for RTLS, the challenges of existing implementations, and an open, scalable RTLS architecture based on Apache Kafka.

The post Real Time Locating System (RTLS) with Apache Kafka for Transportation and Logistics appeared first on Kai Waehner.

]]>
Real-Time Locating System (RTLS) enables identifying and tracking the location of objects or people in real-time. It is used everywhere in transportation and logistics across industries. A postmodern RTLS requires an open architecture and high scalability. This blog post explores the use cases for RTLS, the challenges of existing implementations, and why more and more RTLS implementations rely on Apache Kafka as an open, scalable, and reliable event streaming platform.

Real-Time Locating / Tracking System (RTLS) with Apache Kafka and Event Streaming

Real-Time Locating / Tracking System (RTLS) in Supply Chain and Logistics

RTLS is a key part of many use cases across verticals. Many manufacturing processes and supply chains rely on good real-time information about assets and people. Other innovative scenarios could not exist without RTLS either – think about ride-sharing, car-sharing, or food delivery.

An RTLS enables identifying and tracking the location of objects or people in real-time. Some examples:

  • Tracking automobiles through an assembly line
  • Locating pallets of merchandise in a warehouse
  • Finding medical equipment in a hospital
  • Tracking tools, machines, and people (where legal) in a construction area

An RTLS has three key goals:

  • Improve safety
  • Control security
  • Optimize processes and productivity

Wireless RTLS tags are attached to objects or worn by people, and in most RTLS, fixed reference points receive wireless signals from the tags to determine their location. However, more and more use cases require outdoor tracking, too. In many cases, a postmodern RTLS combines indoor and outdoor location tracking.
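For intuition on how the location is derived: with three fixed anchors and measured distances, 2D trilateration reduces to a small linear system. A self-contained Java sketch (plain math, independent of any RTLS product):

```java
/**
 * 2D trilateration: estimate a tag's position from distances d1..d3
 * to three fixed reference points (anchors). Subtracting pairs of the
 * circle equations yields a linear system solved via Cramer's rule.
 */
public class Trilateration {
    static double[] locate(double x1, double y1, double d1,
                           double x2, double y2, double d2,
                           double x3, double y3, double d3) {
        double a = 2 * (x2 - x1), b = 2 * (y2 - y1);
        double c = d1 * d1 - d2 * d2 - x1 * x1 + x2 * x2 - y1 * y1 + y2 * y2;
        double d = 2 * (x3 - x2), e = 2 * (y3 - y2);
        double f = d2 * d2 - d3 * d3 - x2 * x2 + x3 * x3 - y2 * y2 + y3 * y3;
        double det = a * e - b * d; // zero if the anchors are collinear
        return new double[]{(c * e - b * f) / det, (a * f - c * d) / det};
    }

    public static void main(String[] args) {
        // Tag at (3, 4) with anchors at (0,0), (10,0), (0,10)
        double[] p = locate(0, 0, 5, 10, 0, Math.sqrt(65), 0, 10, Math.sqrt(45));
        System.out.printf("x=%.1f y=%.1f%n", p[0], p[1]); // prints x=3.0 y=4.0
    }
}
```

Real systems add noise filtering and more anchors, but the principle stays the same.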

Challenges of Today’s Location and Tracking Systems

RTLS have existed for a long time, and plenty of products are available on the market. While they differ in their characteristics and features, most traditional RTLS have at least some of the following technical challenges:

  • Monolithic
  • Proprietary
  • Limited Scalability
  • No Hardware Flexibility
  • Single Purpose Solution
  • Limited Integration Capabilities
  • Limited Tracking Technologies

Many vendors invest in their RTLS systems. Similar to CRM, ERP, and MES systems, many next-generation RTLS products are based on Kafka to solve these challenges. So feel free to check the above characteristics with your favorite vendor and ask how they plan to solve (or have already solved) them.

Many enterprises prefer building their own custom postmodern RTLS. This approach allows an open, flexible solution. Custom RTLS are typically built to include innovative and differentiating features that add business value and optimize the business processes.

A Postmodern RTLS for Multi-Purpose Use Cases and Architectures

From my conversations with customers across industries, I learned that use cases and requirements for RTLS have changed in recent years. In addition to solving the above technical challenges, two key differences establish a postmodern view of how to define an RTLS:

  1. RTLS is not just about location anymore. Applications leverage enhanced metadata such as speed, direction, or spatial orientation (see the event sketch after this list). Data integration and correlation are key for adding business value and improving processes.
  2. The combination of indoors and outdoors via hybrid architectures enables multi-purpose RTLS.
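To make the first point concrete, such an enriched location event might look like the following Java record – the shape is purely illustrative, not a standard schema:

```java
// One possible shape for an enriched location event (all field names
// are illustrative, not a standard schema)
public record EnrichedLocationEvent(
        String assetId,
        long timestampMs,
        double latitude,
        double longitude,
        double speedKmh,        // enhanced metadata beyond pure location:
        double headingDegrees,  // direction of movement
        double pitchDegrees,    // spatial orientation
        String source           // e.g., GPS, UWB, BLE beacon
) {}
```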

Some examples of indoor location tracking: asset tracking and monitoring, non-linear production lines, geofencing for safety (cobots), and distance enforcement (e.g., Covid-19). Outdoor track & trace enables regional or global logistics, routing, and end-to-end monitoring (e.g., construction areas).

A key requirement of modern RTLS is the ability to integrate with different technologies. This includes Location Tracking Technologies such as Radiofrequency (RF), Infrared (IR), RFID, Beacon, Wi-Fi, Bluetooth, UWB, GPS, GSM, 5G, etc. But that’s not all. The RTLS also needs to integrate with the rest of the enterprise reliably in real-time at scale. This includes MES, ERP, APS, CRM, data lakes, and many other applications.

Use Cases for a Postmodern RTLS

Many use cases exist to leverage a postmodern RTLS to improve processes or build innovative new applications that were not possible beforehand. Some examples:

  • Locate and manage assets within a facility, such as finding a misplaced tool cart in a warehouse or medical equipment
  • Notification of new locations, such as an alert if a tool cart improperly has left the facility
  • Combine identity of multiple items placed in a single location, such as on a pallet
  • Locate customers, for example, in a restaurant, for delivery of food or service
  • Maintain proper staffing levels of operational areas, such as ensuring guards are in the proper locations in a correctional facility
  • Quickly and automatically account for all staff after or during an emergency evacuation
  • Automatically track and time stamp the progress of people or assets through a process, such as following a patient’s emergency room wait time, time spent in the operating room, and total time until discharge
  • Clinical-grade locating to support acute care capacity management
  • Replay past events to understand the mass movements of workflows
  • Plan future location requirements
  • Auditing for compliance cases
  • Etc.

Two important notes here:

  1. Many use cases exist for a long time already. But once again: Check out the challenges discussed above. The requirements change regarding scale, flexibility, and other characteristics.
  2. As you can see, most of these use cases do not just require location tracking but also data correlation in real-time. That’s where the optimization or added business value comes from.

Vehicle Tracking System in other Industries

Transportation and logistics are the obvious industries for real-time tracking systems. But industries not traditionally known to use vehicle tracking systems have started to use it in creative ways to improve their processes or businesses. Here are a few examples:

  • The hospitality industry has caught on to this technology to improve customer service. For example, a luxury hotel in Singapore has installed vehicle tracking systems in their limousines to ensure they can welcome their VIPs when they reach the hotel.
  • Vehicle tracking systems used in food delivery vans may alert if the temperature of the refrigerated compartment moves outside of the range of safe food storage temperatures.
  • Car rental companies are also using it to monitor their rental fleets.

The following sections explore an example using the scenario around transportation and logistics with truck delivery. Let’s look at how Apache Kafka and Event Streaming can help implement a postmodern RTLS.

Kafka-native Real-Time Locating / Tracking System (RTLS)

The following picture shows a multi-purpose Kafka-native RTLS for transportation and logistics:

Kafka-native Real-Time Locating and Tracking System (RTLS)

The example shows three use cases of how produced events (“P”) are consumed and processed:

  • (“C1”) Real-time alerting on a single event: Monitor assets and people and send an alert to a controller, mobile app, or any other interface if an issue happens (see the consumer sketch after this list).
  • (“C2”) Continuous real-time aggregation of multiple events: Correlate data while it is in motion. Calculate averages, enforce business rules, apply an analytic model for predictions on new events, or run any other business logic.
  • (“C3”) Batch analytics on all historical events: Take all historical data to find insights, e.g., for analyzing issues of the past, planning future location requirements, or training analytic models.
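A minimal sketch of the first pattern (“C1”) with a plain Kafka consumer – the topic name and alert rule are made up for illustration:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class RtlsAlerting {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "rtls-alerting");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("asset-positions"));
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofMillis(500))) {
                    // Toy rule: alert as soon as a position event reports a restricted zone
                    if (rec.value().contains("\"zone\":\"restricted\"")) {
                        System.out.println("ALERT: asset " + rec.key() + " entered a restricted zone");
                    }
                }
            }
        }
    }
}
```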

The Kafka-native RTLS can run in the data center, cloud, or closer to the edge, e.g., in a factory close to the shop floor and production lines.

Hybrid Kafka Architecture for RTLS and Track & Trace in Transportation and Logistics

One of the benefits of Apache Kafka is the freedom to deploy the infrastructure as needed. On the one end, Kafka can be deployed as a single broker in a vehicle (like a truck or train). On the other end, a global Kafka infrastructure can span multiple cloud providers, regions, countries, or even continents and integrate with tens or hundreds of factories or other edge locations. The reality is often somewhere in the middle. Most enterprises start small and roll it out across locations and countries over time.

The following shows a pretty powerful hybrid architecture for a Kafka-native RTLS:

Postmodern Asset and People Track and Trace APS and RTLS with Apache Kafka and Event Streaming

 

In the above scenario, the hybrid architecture includes:

  • A 5G infrastructure with public telco and private 5G Campus networks
  • Confluent Cloud as a fully managed event streaming platform in the cloud
  • Confluent Platform deployed at the edge in the 5G Campus leveraging AWS Wavelength
  • Real-time integration with assets and people at the edge and in the cloud
  • Real-time integration with enterprise applications such as APS, CRM, or ERP systems
  • Data correlation of edge and cloud data (replicated bi-directionally in real-time with tools such as Confluent’s Cluster Linking or Apache Kafka’s MirrorMaker 2)

This is obviously just one sample architecture. Again, you are totally free to design your own architecture with the components and technologies you need for your use cases.

An RTLS is heavily connected to the whole Supply Chain Management (SCM) process. As Kafka plays a key role in many supply chains, it is also a perfect fit for building real-time asset tracking.

Let’s now move on to two public use cases for location-based transportation and logistics with Kafka-native technologies.

Example: Bosch – Location-based Construction Site Management

The global supplier Bosch runs a track & trace application leveraging Apache Kafka and Confluent Cloud: construction site management that analyzes sensors, machines, and workers.

Use cases include collaborative planning, inventory and asset management, and tracking, managing, and locating tools and equipment anytime and anywhere:

Construction Management and Digital Twin at Bosch with Apache Kafka and Confluent Cloud

The example is close to the hybrid architecture I showed in the last section. The solution spans multiple construction areas in various regions and integrates with the event streaming platform running in the cloud.

Let’s now take a look at another advanced use case for a real-time location service.

Location-Analytics and Geofencing with Kafka and ksqlDB

A geofence is a virtual perimeter around a real-world geographic area and is used for location analytics in real-time. A geofence can be dynamically generated – for instance, a radius around a point location – or a predefined set of boundaries (such as school zones or neighborhood boundaries).

The use of a geofence is called geofencing. One example involves a location-aware device of a location-based service (LBS) user entering or exiting a geofence. This activity could trigger an alert to the device’s user and a message to the geofence operator. Or, in the case of a factory, it could enforce distancing during the Covid-19 pandemic.

Guido Schmutz from Trivadis has done great work on this topic: “Location Analytics and Real-time Geofencing using Apache Kafka and KSQL“. It is actually quite simple to implement with KSQL:

Location-Analytics and Geofencing with Kafka and ksqlDB

These ksqlDB queries create continuous stream processing that analyzes and correlates sensor data in motion in real-time. As ksqlDB is a Kafka-native technology, it is possible to process millions of events per second in a reliable, scalable, and secure way.
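The ksqlDB queries themselves are not reproduced here, but the idea translates directly. The following is an equivalent sketch in Kafka Streams (Java) rather than ksqlDB, checking position events against a radial geofence with the haversine distance – all coordinates, topic names, and the event shape are made up for illustration:

```java
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class GeofenceFilter {
    // Hypothetical position event; field names are illustrative
    public record Position(String assetId, double lat, double lon) {}

    // Haversine great-circle distance in meters
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double r = 6_371_000; // mean earth radius
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * r * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "geofence-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // Radial geofence: 100 m around a made-up point
        double fenceLat = 48.137, fenceLon = 11.575, radiusMeters = 100.0;

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, Position> positions = builder.stream("positions"); // serdes omitted

        // Emit only events inside the fence; a consumer of the output topic
        // can then alert, log, or enforce rules
        positions.filter((key, p) ->
                    distanceMeters(p.lat(), p.lon(), fenceLat, fenceLon) <= radiusMeters)
                 .to("geofence-entries");

        new KafkaStreams(builder.build(), props).start();
    }
}
```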

 

Example: Lyft – Real-Time Map-Matching to Provide Accurate Locations

The ride-sharing giant Lyft shared a great example for location analytics in real-time. Lyft implemented map-matching to track customers based on the GPS information of the mobile app.

Lyft has “two main use cases for map-matching:

  1. At the end of a ride, to compute the distance traveled by a driver to calculate the fare.
  2. In real-time, to provide accurate locations to the ETA team and make dispatch decisions, as well as to display the drivers’ cars on the rider app.”

Lyft Map Matching

As the GPS signal is often weak, Lyft enhances and correlates the data with other data sets to get more accurate information. For instance, Lyft also uses location data from free public Wi-Fi hotspots close to the customer.

This is a great outdoor example of a modern, scalable RTLS. And once again, it shows that the real added value of real-time data is the data correlation. It does not help if you only use real-time messaging and then process the data in batch mode in a data lake.

Open, Scalable, Multi-Purpose, Real-Time RTLS based on Kafka is the New Black

A Real-Time Locating System (RTLS) enables identifying and tracking the location of objects or people in real-time. This is not a new problem. But the requirements have changed…

A postmodern RTLS provides an open architecture and high scalability. For this reason, more and more RTLS implementations rely on Apache Kafka as an open, scalable, and reliable event streaming platform.

Last but not least, if you wonder what the term “real-time” actually means in “RTLS” (no matter if Kafka-based or not), check out the article “Apache Kafka is NOT Hard Real-Time BUT Used Everywhere in Automotive and Industrial IoT” to understand what real-time really means.

What are your experiences with RTLS architectures and applications? Did you already use Apache Kafka? Which approach works best for you? What is your strategy? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Real Time Locating System (RTLS) with Apache Kafka for Transportation and Logistics appeared first on Kai Waehner.

]]>