Data Streaming Meets the SAP Ecosystem and Databricks – Insights from SAP Sapphire Madrid
https://www.kai-waehner.de/blog/2025/05/28/data-streaming-meets-the-sap-ecosystem-and-databricks-insights-from-sap-sapphire-madrid/ – Wed, 28 May 2025

SAP Sapphire 2025 in Madrid brought together global SAP users, partners, and technology leaders to showcase the future of enterprise data strategy. Key themes included SAP’s Business Data Cloud (BDC) vision, Joule for Agentic AI, and the deepening SAP-Databricks partnership. A major topic throughout the event was the increasing need for real-time integration across SAP and non-SAP systems—highlighting the critical role of event-driven architectures and data streaming platforms like Confluent. This blog shares insights on how data streaming enhances SAP ecosystems, supports AI initiatives, and enables industry-specific use cases across transactional and analytical domains.

I had the opportunity to attend SAP Sapphire 2025 in Madrid—an impressive gathering of SAP customers, partners, and technology leaders from around the world. It was a massive event, bringing the global SAP community together to explore the company’s future direction, innovations, and growing ecosystem.

A key highlight was SAP’s deepening integration of Databricks as an OEM partner for AI and analytics within the SAP Business Data Cloud—showing how the ecosystem is evolving toward more open, composable architectures.

At the same time, conversations around Confluent and data streaming highlighted the critical role real-time integration plays in connecting SAP systems (including ERP, MES, DataSphere, Databricks, etc.) with the rest of the enterprise. As always, it was a great place to learn, connect, and discuss where enterprise data architecture is heading—and how technologies like data streaming are enabling that transformation.

Data Streaming with Confluent Meets SAP and Databricks for Agentic AI at Sapphire in Madrid

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, focusing on industry scenarios, success stories, and business value.

SAP’s Vision: Business Data Cloud, Joule, and Strategic Ecosystem Moves

SAP presented a broad and ambitious strategy centered around the SAP Business Data Cloud (BDC), SAP Joule (including its Agentic AI initiative), and strategic collaborations like SAP Databricks, SAP DataSphere, and integrations across multiple cloud platforms. The vision is clear: SAP wants to connect business processes with modern analytics, AI, and automation.

SAP ERP with Business Technology Platform BTP and Joule for Agentic AI in the Cloud
Source: SAP

For those of us working in data streaming and integration, these developments present a major opportunity. Most customers I meet globally use SAP ERP or other products like MES, SuccessFactors, or Ariba. The relevance of real-time data streaming in this space is undeniable—and it’s growing.

Building the Bridge: Event-Driven Architecture + SAP

One of the most exciting things about SAP Sapphire is seeing how event-driven architecture is becoming more relevant—even if the conversations don’t start with “Apache Kafka” or “Data Streaming.” In the SAP ecosystem, discussions often focus on business outcomes first, then architecture second. And that’s exactly how it should be.

Many SAP customers are moving toward hybrid cloud environments, where data lives in SAP systems, Salesforce, Workday, ServiceNow, and more. There’s no longer a belief in a single, unified data model. Master Data Management (MDM) as a one-size-fits-all solution has lost its appeal, simply because the real world is more complex.

This is where data streaming with Apache Kafka, Apache Flink, etc. fits in perfectly. Event streaming enables organizations to connect their SAP solutions with the rest of the enterprise—for real-time integration across operational systems, analytics platforms, AI engines, and more. It supports transactional and analytical use cases equally well and can be tailored to each industry’s needs.
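To make this concrete, here is a minimal sketch of the producer side of such an integration, assuming an extractor or connector has already captured an order change from an SAP system. The topic name, key, and JSON payload are illustrative assumptions, not an official SAP event format:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class SapOrderEventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Hypothetical payload: an order change captured from SAP ERP
            String orderEvent = "{\"orderId\":\"4711\",\"status\":\"CREATED\",\"plant\":\"MAD01\"}";
            // Keying by order ID keeps all events for one order on the same partition, in order
            producer.send(new ProducerRecord<>("sap.erp.orders", "4711", orderEvent));
        } // close() flushes pending records
    }
}
```

Any downstream consumer (Databricks, an AI engine, or a custom microservice) can then subscribe to the same topic independently, without touching the SAP system directly.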

Data Streaming with Confluent as Integration Middleware for SAP ERP DataSphere Joule Databricks with Apache Kafka

In the SAP ecosystem, customers typically don’t look for open source frameworks to assemble their own solutions—they look for a reliable, enterprise-grade platform that just works. That’s why Confluent’s data streaming platform is an excellent fit: it combines the power of Kafka and Flink with the scalability, security, governance, and cloud-native capabilities enterprises expect.

SAP, Databricks, and Confluent – A Triangular Partnership

At the event, I had some great conversations—often literally sitting between leaders from SAP and Databricks. Watching how these two players are evolving—and where Confluent fits into the picture—was eye-opening.

SAP and Databricks are working closely together, especially with the SAP Databricks OEM offering that integrates Databricks into the SAP Business Data Cloud as an embedded AI and analytics engine. SAP DataSphere also plays a central role here, serving as a gateway into SAP’s structured data.

Meanwhile, Databricks is expanding into the operational domain, not just the analytical lakehouse. After acquiring Neon (a Postgres-compatible, cloud-native database), Databricks is expected to announce its own additional transactional OLTP offering soon. This shows how rapidly the company is moving beyond batch analytics into the world of operational workloads—areas where Kafka and event streaming have traditionally provided the backbone.

Enterprise Architecture with Confluent and SAP and Databricks for Analytics and AI

This trend opens up a significant opportunity for data streaming platforms like Confluent to play a central role in modern SAP data architectures. As platforms like Databricks expand their capabilities, the demand for real-time, multi-system integration and cross-platform data sharing continues to grow.

Confluent is uniquely positioned to meet this need—offering not just data movement, but also the ability to process, govern, and enrich data in motion using tools like Apache Flink, along with a broad ecosystem of connectors for transactional systems such as SAP ERP, Oracle databases, and IBM mainframes, as well as cloud services like Snowflake, ServiceNow, or Salesforce.

Data Products, Not Just Pipelines

The term “data product” was mentioned in nearly every conversation—whether from the SAP angle (business semantics and ownership), Databricks (analytics-first), or Confluent (independent, system-agnostic, streaming-native). The key message? Everyone wants real-time, reusable, discoverable data products.

Data Product - The Domain Driven Microservice for Data

This is where an event-driven architecture powered by a data streaming platform shines: Data Streaming connects everything and distributes data to both operational and analytical systems, with governance, durability, and flexibility at the core.

Confluent’s data streaming platform enables the creation of data products from a wide range of enterprise systems, complementing the SAP data products being developed within the SAP Business Data Cloud. The strength of the partnership lies in the ability to combine these assets—bringing together SAP-native data products with real-time, event-driven data products built from non-SAP systems connected through Confluent. This integration creates a unified, scalable foundation for both operational and analytical use cases across the enterprise.

Industry-Specific Use Cases to Explore the Business Value of SAP and Data Streaming

One major takeaway: in the SAP ecosystem, generic messaging around cutting-edge technologies such as Apache Kafka does not work. Success comes from being well-prepared—knowing which SAP systems are involved (ECC, S/4HANA, on-prem, or cloud) and what role they play in the customer’s architecture. The conversations must be use case-driven, often tailored to industries like manufacturing, retail, logistics, or the public sector.

This level of specificity is new to many people working in the technical world of Kafka, Flink, and data streaming. Developers and architects often approach integration from a tool- or framework-centric perspective. However, SAP customers expect business-aligned solutions that address concrete pain points in their domain—whether it’s real-time order tracking in logistics, production analytics in manufacturing, or spend transparency in the public sector.

Understanding the context of SAP’s role in the business process, along with industry regulations, workflows, and legacy system constraints, is key to having meaningful conversations. For the data streaming community, this is a shift in mindset—from building pipelines to solving business problems—and it represents a major opportunity to bring strategic value to enterprise customers.

You’re in luck: I just published a free ebook about data streaming use cases focusing on industry scenarios and business value: “The Ultimate Data Streaming Guide”.

Looking Forward: SAP, Data Streaming, AI, and Open Table Formats

Another theme to watch: data lake and format standardization. All cloud providers and data vendors like Databricks, Confluent or Snowflake are investing heavily in supporting open table formats like Apache Iceberg (alongside Delta Lake at Databricks) to standardize analytical integrations and reduce storage costs significantly.

SAP’s investment in Agentic AI through SAP Joule reflects a broader trend across the enterprise software landscape, with vendors like Salesforce, ServiceNow, and others embedding intelligent agents into their platforms. This creates a significant opportunity for Confluent to serve as the streaming backbone—enabling real-time coordination, integration, and decision-making across these diverse, distributed systems.

An event-driven architecture powered by data streaming is crucial for the success of Agentic AI with SAP Joule, Databricks AI agents, and other operational systems that need to be integrated into the business processes. The strategic partnership between Confluent and Databricks makes it even easier to implement end-to-end AI pipelines across the operational and analytical estates.

SAP Sapphire Madrid was a valuable reminder that data streaming is no longer a niche technology—it’s a foundation for digital transformation. Whether it’s SAP ERP, Databricks AI, or new cloud-native operational systems, a Data Streaming Platform connects them all in real time to enable new business models, better customer experiences, and operational agility.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, focusing on industry scenarios, success stories, and business value.

Replacing Legacy Systems, One Step at a Time with Data Streaming: The Strangler Fig Approach
https://www.kai-waehner.de/blog/2025/03/27/replacing-legacy-systems-one-step-at-a-time-with-data-streaming-the-strangler-fig-approach/ – Thu, 27 Mar 2025

Modernizing legacy systems doesn’t have to mean a risky big-bang rewrite. This blog explores how the Strangler Fig Pattern, when combined with data streaming, enables gradual, low-risk transformation—unlocking real-time capabilities, reducing complexity, and supporting scalable, cloud-native architectures. Discover how leading organizations are using this approach to migrate at their own pace, stay compliant, and enable new business models. Plus, why Reverse ETL falls short and streaming is the future of IT modernization.

Organizations looking to modernize legacy applications often face a high-stakes dilemma: Do they attempt a complete rewrite or find a more gradual, low-risk approach? Enter the Strangler Fig Pattern, a method that systematically replaces legacy components while keeping the existing system running. Unlike the “Big Bang” approach, where companies try to rewrite everything at once, the Strangler Fig Pattern ensures smooth transitions, minimizes disruptions, and allows businesses to modernize at their own pace. Data streaming transforms the Strangler Fig Pattern into a more powerful, scalable, and truly decoupled approach. Let’s explore why this approach is superior to traditional migration strategies and how real-world enterprises like Allianz are leveraging it successfully.

The Strangler Fig Design Pattern - Migration and Replacement of Legacy IT Applications with Data Streaming using Apache Kafka

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming architectures and use cases, including various use cases from the retail industry.

What is the Strangler Fig Design Pattern?

The Strangler Fig Pattern is a gradual modernization approach that allows organizations to replace legacy systems incrementally. The pattern was coined and popularized by Martin Fowler to avoid risky “big bang” system rewrites.

Inspired by the way strangler fig trees grow around and eventually replace their host, this pattern surrounds the old system with new services until the legacy components are no longer needed. By decoupling functionalities and migrating them piece by piece, businesses can minimize disruptions, reduce risk, and ensure a seamless transition to modern architectures.

Strangler Fig Pattern to Integrate, Migrate, Replace

When combined with data streaming, this approach enables real-time synchronization between old and new systems, making the migration even smoother.

Why Strangler Fig is Better than a Big Bang Migration or Rewrite

Many organizations have learned the hard way that rewriting an entire system in one go is risky. A Big Bang migration or rewrite often:

  • Takes years to complete, leading to stale requirements by the time it’s done
  • Disrupts business operations, frustrating customers and teams
  • Requires a high upfront investment with unclear ROI
  • Involves hidden dependencies, making the transition unpredictable

The Strangler Fig Pattern takes a more incremental approach:

  • Allows gradual replacement of legacy components, one service at a time
  • Reduces business risk by keeping critical systems operational during migration
  • Enables continuous feedback loops, ensuring early wins
  • Keeps costs under control, as teams modernize based on priorities

Here is an example from the industrial IoT space of the Strangler Fig Pattern leveraging data streaming to modernize OT middleware:

OT Middleware Integration, Offloading and Replacement with Data Streaming for IoT and IT/OT

If you come from the traditional IT world (banking, retail, etc.) and don’t care about IoT, then you can explore my article about mainframe integration, offloading and replacement with data streaming.

Instead of replacing everything at once, this method surrounds the old system with new components until the legacy system is fully replaced—just like a strangler fig tree growing around its host.

Better Than Reverse ETL: A Migration with Data Consistency across Operational and Analytical Applications

Some companies attempt to work around legacy constraints using Reverse ETL—extracting data from analytical systems and pushing it back into modern operational applications. On paper, this looks like a clever workaround. In reality, Reverse ETL is a fragile, high-maintenance anti-pattern that introduces more complexity than it solves.

Data at Rest and Reverse ETL

Reverse ETL carries several critical flaws:

  • Batch-based by nature: Data remains at rest in analytical lakehouses like Snowflake, Databricks, Google BigQuery, Microsoft Fabric or Amazon Redshift. It is then periodically moved—usually every few hours or once a day—back into operational systems. This results in outdated and stale data, which is dangerous for real-time business processes.
  • Tightly coupled to legacy systems: Reverse ETL pipelines still depend on the availability and structure of the original legacy systems. A schema change, performance issue, or outage upstream can break downstream workflows—just like with traditional ETL.
  • Slow and inefficient: It introduces latency at multiple stages, limiting the ability to react to real-world events at the right moment. Decision-making, personalization, fraud detection, and automation all suffer.
  • Not cost-efficient: Reverse ETL tools often introduce double processing costs—you pay to compute and store the data in the warehouse, then again to extract, transform, and sync it back into operational systems. This increases both financial overhead and operational burden, especially as data volumes scale.

In short, Reverse ETL is a short-term fix for a long-term challenge. It’s a temporary bridge over the widening gap between real-time operational needs and legacy infrastructure.

Many modernization efforts fail because they tightly couple old and new systems, making transitions painful. Data streaming with Apache Kafka and Flink changes the game by enabling real-time, event-driven communication.

Event-driven Architecture for Data Streaming with Apache Kafka and Flink

1. True Decoupling of Old and New Systems

Data streaming using Apache Kafka with its event-driven streaming and persistence layer enables organizations to:

  • Decouple producers (legacy apps) from consumers (modern apps)
  • Process real-time and historical data without direct database dependencies
  • Allow new applications to consume events at their own pace

This avoids dependency on old databases and enables teams to move forward without breaking existing workflows.
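As a minimal sketch of this decoupling, assume the legacy application publishes change events to a hypothetical legacy.claims.events topic. A new service consumes them with its own consumer group, so it keeps its own offset position, progresses at its own pace, and can even replay the full history on first start:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ModernClaimsService {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "claims-service-v2");  // own consumer group = own pace
        props.put("auto.offset.reset", "earliest");  // replay the persisted history on first start
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("legacy.claims.events"));  // hypothetical topic fed by the legacy app
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // The new service processes events without ever querying the legacy database
                    System.out.printf("claim %s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```

Because the topic persists events durably, the same pattern also covers the replication and recovery needs discussed next: the new service can be stopped, fixed, and restarted without losing data.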

2. Real-Time Replication with Persistence

Unlike traditional migration methods, data streaming supports:

  • Real-time synchronization of data between legacy and modern systems
  • Durable persistence to handle retries, reprocessing, and recovery
  • Scalability across multiple environments (on-prem, cloud, hybrid)

This ensures data consistency without downtime, making migrations smooth and reliable.

3. Supports Any Technology and Communication Paradigm

Data streaming’s power lies in its event-driven architecture, which supports any integration style—without compromising scalability or real-time capabilities.

The data product approach using Kafka Topics with data contracts handles:

  • Real-time messaging for low-latency communication and operational systems
  • Batch processing for analytics, reporting and AI/ML model training
  • Request-response for APIs and point-to-point integration with external systems
  • Hybrid integration—syncing legacy databases with cloud apps, uni- or bi-directionally

This flexibility lets organizations modernize at their own pace, using the right communication pattern for each use case while unifying operational and analytical workloads on a single platform.

4. No Time Pressure – Migrate at Your Own Pace

One of the biggest advantages of the Strangler Fig Pattern with data streaming is flexibility in timing.

  • No need for overnight cutovers—migrate module by module
  • Adjust pace based on business needs—modernization can align with other priorities
  • Handle delays without data loss—thanks to durable event storage

This ensures that companies don’t rush into risky migrations but instead execute transitions with confidence.

5. Intelligent Processing in the Data Migration Pipeline

In a Strangler Fig Pattern, it’s not enough to just move data from old to new systems—you also need to transform, validate, and enrich it in motion.

While Apache Kafka provides the backbone for real-time event streaming and durable storage, Apache Flink adds powerful stream processing capabilities that elevate the modernization journey.

With Apache Flink, organizations benefit from:

  • Real-time preprocessing: Clean, filter, and enrich legacy data before it’s consumed by modern systems.
  • Data product migration: Transform old formats into modern, domain-driven event models.
  • Improved data quality: Validate, deduplicate, and standardize data in motion.
  • Reusable business logic: Centralize processing logic across services without embedding it in application code.
  • Unified streaming and batch: Support hybrid workloads through one engine.

Apache Flink allows you to roll out trusted, enriched, and well-governed data products—gradually, without disruption. Together with Kafka, it provides the real-time foundation for a smooth, scalable transition from legacy to modern.
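A minimal Flink sketch of such a pipeline might look as follows, assuming a hypothetical legacy.orders.raw topic with pipe-delimited records and a modern orders.v2 topic as the target data product. The parsing logic is a toy stand-in for real validation and enrichment:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LegacyOrderMigrationJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> legacy = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("legacy.orders.raw")            // hypothetical feed from the old system
                .setGroupId("order-migration")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        KafkaSink<String> modern = KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("orders.v2")             // cleaned, domain-driven data product
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .build();

        env.fromSource(legacy, WatermarkStrategy.noWatermarks(), "legacy-orders")
                .filter(line -> line != null && !line.isBlank())   // drop empty legacy records
                .map(line -> {                                     // toy transformation: pipe-delimited -> JSON
                    String[] f = line.split("\\|");
                    return String.format("{\"orderId\":\"%s\",\"amount\":%s}", f[0], f[1]);
                })
                .sinkTo(modern);

        env.execute("legacy-order-migration");
    }
}
```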

Allianz’s Digital Transformation and Transition to Hybrid Cloud: An IT Modernization Success Story

Allianz, one of the world’s largest insurers, set out to modernize its core insurance systems while maintaining business continuity. A full rewrite was too risky, given the critical nature of insurance claims and regulatory requirements. Instead, Allianz embraced the Strangler Fig Pattern with data streaming.

This approach allows an incremental, low-risk transition. By implementing data streaming with Kafka as an event backbone, Allianz could gradually migrate from legacy mainframes to a modern, scalable cloud architecture. This ensured that new microservices could process real-time claims data, improving speed and efficiency without disrupting existing operations.

Allianz Core Insurance Layer with Data Streaming for Integration Architecture Powered by Confluent using Apache Kafka
Source: Allianz at Data in Motion Tour Frankfurt 2022

To achieve this IT modernization, Allianz leveraged automated microservices, real-time analytics, and event-driven processing to enhance key operations, such as pricing, fraud detection, and customer interactions.

A crucial component of this shift was the Core Insurance Service Layer (CISL), which enabled the decoupling of applications via a data streaming platform to ensure seamless integration across various backend systems.

With the goal of migrating over 75% of applications to the cloud, Allianz significantly increased agility, reduced operational complexity, and minimized technical debt, positioning itself for long-term digital success, as you can read (in German) in the CIO.de article:

As one of the largest insurers in the world, Allianz plans to migrate over 75 percent of its applications to the cloud and modernize its core insurance system.

Event-Driven Innovation: Community and Knowledge Sharing at Allianz

Beyond technology, Allianz recognized that successful modernization also required cultural transformation. To drive internal adoption of event-driven architectures, the company launched Allianz Eventing Day—an initiative to educate teams on the benefits of Kafka-based streaming.

Allianz Eventing Day about Apache Kafka and Data Streaming in Insurance

Hosted in partnership with Confluent, the event brought together Allianz experts and industry leaders to discuss real-world implementations, best practices, and future innovations. This collaborative environment reinforced Allianz’s commitment to continuous learning and agile development, ensuring that data streaming became a core pillar of its IT strategy.

Allianz also extended this engagement to the broader insurance industry, organizing the first Insurance Roundtable on Event-Driven Architectures with top insurers from across Germany and Switzerland. Experts from Allianz, Generali, HDI, Swiss Re, and others exchanged insights on topics like real-time data analytics, API decoupling, and event discovery. The discussions highlighted how streaming technologies drive competitive advantage, allowing insurers to react faster to customer needs and continuously improve their services. By embracing domain-driven design (DDD) and event storming, Allianz ensured that event-driven architectures were not just a technical upgrade, but a fundamental shift in how insurance operates in the digital age.

The Future of IT Modernization and Legacy Migrations with Strangler Fig using Data Streaming

The Strangler Fig Pattern is the most pragmatic approach to modernizing enterprise systems, and data streaming makes it even more powerful.

By decoupling old and new systems, enabling real-time synchronization, and supporting multiple architectures, data streaming with Apache Kafka and Flink provides the flexibility, reliability, and scalability needed for successful migrations and long-term integrations between legacy and cloud-native applications.

As enterprises continue to modernize, this approach ensures that modernization doesn’t become a bottleneck, but a business enabler. If your organization is considering legacy modernization, data streaming is the key to making the transition seamless and low-risk.

Are you ready to transform your legacy systems without the risks of a big bang rewrite? What’s one part of your legacy system you’d “strangle” first—and why? If you could modernize just one workflow with real-time data tomorrow, what would it be?

Subscribe to my newsletter for insights into data streaming and connect with me on LinkedIn to continue the conversation. And download my free book about data streaming architectures, use cases and success stories in the retail industry.

Retail Media with Data Streaming: The Future of Personalized Advertising in Commerce
https://www.kai-waehner.de/blog/2025/03/21/retail-media-with-data-streaming-the-future-of-personalized-advertising-in-commerce/ – Fri, 21 Mar 2025

Retail media is reshaping digital advertising by using first-party data to deliver personalized, timely ads across online and in-store channels. As retailers build retail media networks, they unlock new revenue opportunities while improving ad effectiveness and customer engagement. The key to success lies in real-time data streaming, which enables instant targeting, automated bidding, and precise attribution. Technologies like Apache Kafka and Apache Flink make this possible, helping retailers like Albertsons enhance ad performance and maximize returns. This post explores how real-time streaming is driving the evolution of retail media.

Retail media is transforming advertising by leveraging first-party data to deliver highly targeted, real-time promotions across digital and physical channels. As traditional ad models decline, retailers are monetizing their data through retail media networks, creating additional revenue streams and improving customer engagement. However, success depends on real-time data streaming—enabling instant ad personalization, dynamic bidding, and seamless attribution. Data streaming with Apache Kafka and Apache Flink provides the foundation for this shift, allowing retailers like Albertsons to optimize advertising strategies and drive measurable results. In this post, I explore how real-time streaming is shaping the future of retail media.

Retail Media with Data Streaming using Apache Kafka and Flink

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including various use cases from the retail industry.

What is Retail Media?

Retail media is transforming how brands advertise by leveraging first-party data from retailers to create highly targeted ads within their ecosystems. Instead of relying solely on third-party data from traditional digital advertising platforms, retail media allows companies to reach consumers at the point of purchase—whether online, in-store, or via mobile apps.

Retail media is one of the fastest-growing and most strategic revenue streams for retailers today. It has transformed from a niche digital advertising concept into a multi-billion-dollar industry, changing how retailers monetize their data and engage with brands. Below are the key reasons retail media is crucial for retailers.

Retail Media: Display with Advertisements in the Store

Retailers like Amazon, Walmart, and Albertsons are leading the way in monetizing their digital real estate, offering brands access to sponsored product placements, banner ads, video ads, and personalized promotions based on shopping behavior. This shift has made retail media one of the fastest-growing sectors in digital advertising, expected to exceed $100 billion globally in the coming years.

The Digitalization of Retail Media

Retail media has grown from traditional in-store promotions to a fully digitized, data-driven advertising ecosystem. The rise of e-commerce, mobile apps, and connected devices has enabled retailers to:

  • Collect granular consumer behavior data in real time
  • Offer personalized promotions to drive higher conversion rates
  • Provide advertisers with measurable ROI and closed-loop attribution
  • Leverage AI and machine learning for dynamic ad targeting

By integrating digital advertising with real-time customer data and real-time inventory, retailers can provide contextually relevant promotions across multiple touchpoints. The key to success lies in seamlessly connecting online and offline shopping experiences—a challenge that data streaming with Apache Kafka and Flink helps solve.

Online, Brick-and-Mortar, and Hybrid Retail Media

Retail media strategies vary depending on whether a retailer operates online, in-store, or in a hybrid model:

  • Online-Only Retail Media: Retail giants like Amazon and eBay leverage vast amounts of digital consumer data to offer programmatic ads, sponsored products, and personalized recommendations directly on their websites and apps.
  • Brick-and-Mortar Retail Media: Traditional retailers like Target and Albertsons are integrating digital signage, in-store Wi-Fi promotions, and AI-powered shelf displays to engage customers while shopping in physical stores.
  • Hybrid Retail Media: Retailers like Walmart and Kroger are bridging the gap between digital and physical shopping experiences with omnichannel marketing strategies, personalized mobile app promotions, and AI-powered customer insights that drive both online and in-store purchases.

Omnichannel vs. Unified Commerce in Retail Media

Retailers are moving beyond omnichannel marketing, where customer interactions happen across multiple channels, to unified commerce, where all customer data, inventory, and marketing campaigns are synchronized in real time.

  • Omnichannel: Offers a seamless shopping experience across different platforms but often lacks real-time data integration.
  • Unified Commerce: Uses real-time data streaming to unify customer behavior, inventory management, and personalized advertising for a more cohesive experience.

For example, a unified commerce strategy allows a retailer to adjust promotions in real time based on live inventory levels and a customer’s behavior across web, mobile app, and store.

This level of integration is only possible with real-time data streaming using technologies such as Apache Kafka and Apache Flink.

Retail media networks require real-time data processing at scale to manage millions of customer interactions across online and offline touchpoints. Kafka and Flink provide the foundation for a scalable, event-driven infrastructure that enables retailers to:

  • Process customer behavior in real time: Tracking clicks, searches, and in-store activity instantly
  • Deliver hyper-personalized ads and promotions: AI-driven dynamic ad targeting
  • Optimize inventory and pricing: Aligning promotions with real-time stock levels
  • Measure campaign performance instantly: Providing brands with real-time attribution and insights

Event-Driven Architecture with Data Streaming for Retail Media with Apache Kafka and Flink

With Apache Kafka as the backbone for data streaming and Apache Flink for real-time analytics, retailers can ingest, analyze, and act on consumer data within milliseconds.

Here are a few examples of input data sources, stream processing applications, and outputs for other systems:

Input Data Sources for Retail Media

  1. Customer transaction data (e.g., point-of-sale purchases, online orders)
  2. Website and app interactions (e.g., product views, searches, cart additions)
  3. Loyalty program data (e.g., customer preferences, purchase frequency)
  4. Third-party ad networks (e.g., campaign performance data, audience segments)
  5. In-store sensor and IoT data (e.g., foot traffic, digital shelf interactions)

Stream Processing Applications for Retail Media

  1. Real-time advertisement personalization engine (customizes promotions based on live behavior)
  2. Dynamic pricing optimization (adjusts ad bids and discounts in real-time)
  3. Customer segmentation & targeting (creates audience groups based on behavioral signals)
  4. Fraud detection & clickstream analysis (identifies bot traffic and fraudulent ad clicks)
  5. Omnichannel attribution modeling (correlates ads with online and offline purchases)

Output Systems for Retail Media

  1. Retail media network platforms (e.g., sponsored product listings, display ads)
  2. Programmatic ad exchanges (e.g., Google Ads, The Trade Desk, Amazon DSP)
  3. CRM & marketing automation tools (e.g., Salesforce, Adobe Experience Cloud)
  4. Business intelligence dashboards (e.g., Looker, Power BI, Tableau)
  5. In-store digital signage & kiosks (personalized promotions for physical shoppers)

Real-time data streaming with Kafka and Flink enables critical retail media use cases by processing vast amounts of data from customer interactions, inventory updates and advertising platforms. The ability to analyze and act on data instantly allows retailers to optimize ad placements, enhance personalization, and measure the effectiveness of marketing campaigns with unprecedented accuracy. Below are some of the most impactful retail media applications powered by event-driven architectures.

Personalized In-Store Promotions

Retailers can use real-time customer location data, combined with purchase history and preferences, to deliver highly personalized promotions through mobile apps or digital signage. By incorporating location-based services (LBS), the system detects when a shopper enters a specific section of a store and triggers a targeted discount or special offer. For example, a customer browsing the beverage aisle might receive a notification offering 10% off their favorite soda, increasing the likelihood of an impulse purchase.
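Below is a minimal consume-transform-produce sketch of this idea. The topics store.location.events (value format assumed to be "customerId,aisle") and customer.promotions are hypothetical, and the hard-coded aisle rule stands in for a real decision based on purchase history and preferences:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class InStorePromotionTrigger {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "promotion-trigger");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> locations = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> promotions = new KafkaProducer<>(producerProps)) {
            locations.subscribe(List.of("store.location.events"));   // hypothetical LBS event topic
            while (true) {
                for (ConsumerRecord<String, String> event : locations.poll(Duration.ofMillis(200))) {
                    String[] fields = event.value().split(",");      // assumed format: customerId,aisle
                    if ("beverages".equals(fields[1])) {             // shopper just entered the beverage aisle
                        String offer = String.format(
                                "{\"customerId\":\"%s\",\"offer\":\"10%% off your favorite soda\"}", fields[0]);
                        promotions.send(new ProducerRecord<>("customer.promotions", fields[0], offer));
                    }
                }
            }
        }
    }
}
```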

Dynamic Ad Placement & Bidding

Kafka and Flink power real-time programmatic advertising, enabling retailers to dynamically adjust ad placements and bids based on customer activity and shopping trends. This allows advertisers to serve the most relevant ads at the optimal time, maximizing engagement and conversions. For instance, Walmart Connect continuously analyzes in-store and online behavior to adjust which ads appear on product pages or search results, ensuring brands reach the right shoppers at the right moment.

Inventory-Aware Ad Targeting

Real-time inventory tracking ensures that advertisers only bid on ads for products that are in stock and ready for fulfillment, reducing wasted ad spend and improving customer satisfaction. This integration between retail media networks and inventory systems prevents scenarios where customers click on an ad only to find the item unavailable. For example, if a popular TV model is running low in a specific store, the system can prioritize ads for a similar in-stock product, ensuring a seamless shopping experience.

Fraud Detection & Brand Safety

Retailers must protect their media platforms from click fraud, fake engagement, and suspicious transactions, which can distort performance metrics and drain marketing budgets.

Kafka and Flink enable real-time fraud detection by analyzing patterns in ad clicks, user behavior, and IP addresses to identify bots or fraudulent activity. For example, if an unusual spike in ad impressions originates from a single source, the system can immediately block the traffic, safeguarding advertisers’ investments.
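A sketch of such a detector with Flink could look like the following, assuming a hypothetical ad.clicks topic whose record value starts with the source IP address; the one-minute window and the threshold of 100 clicks are arbitrary illustration values:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class ClickFraudDetectionJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> clicks = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("ad.clicks")                    // hypothetical topic of raw ad click events
                .setGroupId("fraud-detector")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(clicks, WatermarkStrategy.noWatermarks(), "ad-clicks")
                .map(line -> Tuple2.of(line.split(",")[0], 1L))          // assumed format: ip,adId,timestamp
                .returns(Types.TUPLE(Types.STRING, Types.LONG))
                .keyBy(t -> t.f0)                                        // group clicks by source IP
                .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
                .sum(1)                                                  // clicks per IP per minute
                .filter(t -> t.f1 > 100)                                 // arbitrary threshold for a suspicious spike
                .print();                                                // in production: write to an alerts topic

        env.execute("click-fraud-detection");
    }
}
```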

Real-Time Attribution & Measurement

Retail media networks must provide advertisers with instant insights into ad performance by linking online interactions to in-store purchases.

Kafka enables event-driven attribution models, allowing brands to measure how digital ads drive real-world sales. For example, if a customer clicks on an ad for running shoes, visits a store, and buys them later, the platform tracks the conversion in real time, ensuring brands understand the full impact of their campaigns. Solutions like Segment (built on Kafka) provide robust customer data platforms (CDPs) that help retailers unify and analyze customer journeys.

Retail Media as an Advertising Channel for Third-Party Brands

Retailers are increasingly leveraging third-party data sources to bridge the gap between retail media networks and adjacent industries, such as quick-service restaurants (QSRs).

Kafka enables seamless data exchange between grocery stores, delivery apps, and restaurant chains, optimizing cross-industry advertising. For example, a burger chain could dynamically adjust digital menu promotions based on real-time data from a retail partner—if a grocery store’s sales data shows a surge in plant-based meat purchases, the restaurant could prioritize ads for its new vegan burger, ensuring more relevant and effective marketing.

Albertsons’ New Retail Media Strategy Leveraging Data Streaming

One of the most innovative retail media success stories comes from Albertsons. Albertsons is one of the largest supermarket chains in the United States, operating over 2,200 stores under various banners, including Safeway, Vons, and Jewel-Osco, and providing groceries, pharmacy services, and household essentials.

I explored Albertsons in another article about its revamped loyalty platform to retain customers for life. Data streaming is essential and a key strategic part of Albertsons’ enterprise architecture:

Albertsons Retail Enterprise Architecture for Data Streaming powered by Apache Kafka
Source: Albertsons (Confluent Webinar)

When I hosted a webinar with Albertsons around two years ago on their data streaming strategy, retail media was one of the bullet points. But I didn’t realize until now how crucial it would become for retailers:

  • Retail Media Network Expansion: Albertsons has launched its own retail media network, leveraging first-party data to create highly targeted advertising campaigns.
  • Real-Time Personalization: With real-time data streaming, Albertsons can provide personalized promotions based on customer purchase history, in-store behavior, and digital engagement.
  • AI-Powered Insights: Albertsons uses AI and machine learning on top of streaming data pipelines to optimize ad placements, campaign effectiveness, and dynamic pricing strategies.
  • Data Monetization: By offering data-driven advertising solutions, Albertsons is monetizing its shopper data while enhancing the customer experience with relevant, timely promotions.

Business Value of Real-Time Retail Media

Retailers that adopt data streaming with Kafka and Flink for their retail media strategies unlock massive business value:

  • New Revenue Streams: Retail media monetization drives ad sales growth
  • Higher Conversion Rates: Real-time targeting improves customer engagement
  • Better Customer Insights: Streaming analytics enables deep behavioral insights
  • Competitive Advantage: Retailers with real-time personalization outperform rivals
  • Better Customer Experience: Retail media reduces friction and enhances the shopping journey through personalized promotions

The Future of Retail Media is Real-Time and Context-Specific Data Streaming

Retail media is no longer just about placing ads on retailer websites—it’s about delivering real-time, data-driven advertising experiences across every consumer touchpoint.

With Kafka and Flink powering real-time data streaming, retailers can:

  • Unify online and offline shopping experiences
  • Enhance personalization with AI-driven insights
  • Maximize ad revenue with real-time campaign optimization

As retailers like Albertsons, Walmart, and Amazon continue to innovate, the future of retail media will be hyper-personalized, data-driven, and real-time.

How is your organization using real-time data for retail media? Stay ahead of the curve in retail innovation! Subscribe to my newsletter for insights into data streaming and connect with me on LinkedIn to continue the conversation. And download my free book about data streaming use cases and success stories in the retail industry.

CIO Summit: The State of AI and Why Data Streaming is Key for Success
https://www.kai-waehner.de/blog/2025/03/13/cio-summit-the-state-of-ai-and-why-data-streaming-is-key-for-success/ – Thu, 13 Mar 2025

The CIO Summit in Amsterdam provided a valuable perspective on the state of AI adoption across industries. While enthusiasm for AI remains high, organizations are grappling with the challenge of turning potential into tangible business outcomes. Key discussions centered on distinguishing hype from real value, the importance of high-quality and real-time data, and the role of automation in preparing businesses for AI integration. A recurring theme was that AI is not a standalone solution—it must be supported by a strong data foundation, clear ROI objectives, and a strategic approach. As AI continues to evolve toward more autonomous, agentic systems, data streaming will play a critical role in ensuring AI models remain relevant, context-aware, and actionable in real time.

This week, I had the privilege of engaging in insightful conversations at the CIO Summit organized by GDS Group in Amsterdam, Netherlands. The event brought together technology leaders from across Europe and industries such as financial services, manufacturing, energy, gaming, telco, and more. The focus? AI – but with a much-needed reality check. While the potential of AI is undeniable, the hype often outpaces real-world value. Discussions at the summit revolved around how enterprises can move beyond experimentation and truly integrate AI to drive business success.

Learnings from the CIO Summit in Amsterdam by GDS Group

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, industry success stories, and business value.

Key Learnings on the State of AI

The CIO Summit in Amsterdam provided a reality check on AI adoption across industries. While excitement around AI is high, success depends on moving beyond the hype and focusing on real business value. Conversations with technology leaders revealed critical insights about AI’s maturity, challenges, and the key factors driving meaningful impact. Here are the most important takeaways.

AI is Still in Its Early Stages – Beware of the Buzz vs. Value

The AI landscape is evolving rapidly, but many organizations are still in the exploratory phase. Executives recognize the enormous promise of AI but also see challenges in implementation, scaling, and achieving meaningful ROI.

The key takeaway? AI is not a silver bullet. Companies that treat it as just another trendy technology risk wasting resources on hype-driven projects that fail to deliver tangible outcomes.

Generative AI vs. Predictive AI – Understanding the Differences

There was a lot of discussion about Generative AI (GenAI) vs. Predictive AI, two dominant categories that serve very different purposes:

  • Predictive AI analyzes historical and real-time data to forecast trends, detect anomalies, and automate decision-making (e.g., fraud detection, supply chain optimization, predictive maintenance).
  • Generative AI creates new content based on trained data (e.g., text, images, or code), enabling applications like automated customer service, software development, and marketing content generation.

While GenAI has captured headlines, Predictive AI remains the backbone of AI-driven automation in enterprises. CIOs must carefully evaluate where each approach adds real business value.

Good Data Quality is Non-Negotiable

A critical takeaway: AI is only as good as the data that fuels it. Poor data quality leads to inaccurate AI models, bad predictions, and failed implementations.

To build trustworthy and effective AI solutions, organizations need:

✅ Accurate, complete, and well-governed data

✅ Real-time and historical data integration

✅ Continuous data validation and monitoring

Context Matters – AI Needs Real-Time Decision-Making

Many AI use cases rely on real-time decision-making. A machine learning model trained on historical data is useful, but without real-time context, it quickly becomes outdated.

For example, fraud detection systems need to analyze real-time transactions while comparing them to historical behavioral patterns. Similarly, AI-powered supply chain optimization depends on up-to-the-minute logistics data rather than just past trends.

The conclusion? Real-time data streaming is essential to unlocking AI’s full potential.

Automate First, Then Apply AI

One common theme among successful AI adopters: Optimize business processes before adding AI.

Organizations that try to retrofit AI onto inefficient, manual processes often struggle with adoption and ROI. Instead, the best approach is:

1⃣ Automate and optimize workflows using real-time data

2⃣ Apply AI to enhance automation and improve decision-making

By taking this approach, companies ensure that AI is applied where it actually makes a difference.

ROI Matters – AI Must Drive Business Value

CIOs are under pressure to deliver business-driven, NOT tech-driven AI projects. AI initiatives that lack a clear ROI roadmap often stall after pilot phases.

Two early success stories for Generative AI stand out:

  • Customer support – AI chatbots and virtual assistants enhance response times and improve customer experience.
  • Software engineering – AI-powered code generation boosts developer productivity and reduces time to market.

The lesson? Start with AI applications that deliver clear, measurable business impact before expanding into more experimental areas.

Data Streaming and AI – The Perfect Match

At the heart of AI’s success is data streaming. Why? Because modern AI requires a continuous flow of fresh, real-time data to make accurate predictions and generate meaningful insights.

Data streaming not only powers AI with real-time insights but also ensures that AI-driven decisions directly translate into measurable business value:

Business Value of Data Streaming with Apache Kafka and Flink in the free Confluent eBook

Here’s how data streaming powers both Predictive and Generative AI:

Predictive AI + Data Streaming

Predictive AI thrives on timely, high-quality data. Real-time data streaming enables AI models to process and react to events as they happen. Examples include:

✔ Fraud detection: AI analyzes real-time transactions to detect suspicious activity before fraud occurs.

✔ Predictive maintenance: Streaming IoT sensor data allows AI to predict equipment failures before they happen.

✔ Supply chain optimization: AI dynamically adjusts logistics routes based on real-time disruptions.

Here is an example from Capital One about real-time fraud detection and prevention, preventing an average of $150 in fraud per customer per year:

Predictive AI for Fraud Detection and Prevention at Capital One Bank with Data Streaming
Source: Confluent

Generative AI + Data Streaming

Generative AI also benefits from real-time data. Instead of relying on static datasets, streaming data enhances GenAI applications by incorporating the latest information:

✔ AI-powered customer support: Chatbots analyze live customer interactions to generate more relevant responses.

✔ AI-driven marketing content: GenAI adapts promotional messaging in real-time based on customer engagement signals.

✔ Software development acceleration: AI assistants provide real-time code suggestions as developers write code.

In short, without real-time data, AI is limited to outdated insights.
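As a simple illustration, the sketch below keeps a rolling window of the latest events from a hypothetical customer.activity topic and assembles them into prompt context for a chatbot. The actual model call is out of scope and only indicated as a comment:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Properties;

public class RealTimePromptContext {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "genai-context-builder");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Deque<String> recentEvents = new ArrayDeque<>();          // rolling window of latest interactions

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("customer.activity"));     // hypothetical topic of live customer events
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    recentEvents.addLast(record.value());
                    if (recentEvents.size() > 20) {
                        recentEvents.removeFirst();               // keep only the 20 most recent events
                    }
                }
                // Fresh, real-time context instead of a stale batch snapshot
                String context = "Recent customer activity:\n" + String.join("\n", recentEvents);
                // ...pass 'context' plus the customer's question to the GenAI model of your choice
            }
        }
    }
}
```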

Here is an example of GenAI with data streaming in the travel industry from Expedia, where 60% of travelers self-service in chat, saving over 40% of variable agent costs:

Generative AI at Expedia in Travel for Customer Service with Chatbots, GenAI and Data Streaming
Source: Confluent

The Future of AI: Agentic AI and the Role of Data Streaming

As AI evolves, we are moving toward Agentic AI – systems that autonomously take actions, learn from feedback, and adapt in real time.

For example:

✅ AI-driven cybersecurity systems that detect and respond to threats instantly

✅ Autonomous supply chains that dynamically adjust based on demand shifts

✅ Intelligent business operations where AI continuously optimizes workflows

But Agentic AI can only work if it has access to real-time operational AND analytical data. That’s why data streaming is becoming a critical foundation for the next wave of AI innovation.

The Path to AI Success

The CIO Summit reinforced one key message: AI is here to stay, but its success depends on strategy, data quality, and business value – not just hype.

Organizations that:

✅ Focus on AI applications with clear business ROI

✅ Automate before applying AI

✅ Prioritize real-time data streaming

… will be best positioned to drive AI success at scale.

As AI moves towards autonomous decision-making (Agentic AI), data streaming will become even more critical. The ability to process and act on real-time data will separate AI leaders from laggards.

Now the real question: Where is your AI strategy headed? Let’s discuss!

Stay ahead of the curve! Subscribe to my newsletter for insights into data streaming and connect with me on LinkedIn to continue the conversation. And make sure to download my free book focusing on data streaming use cases, industry stories and business value.

Virgin Australia’s Journey with Apache Kafka: Driving Innovation in the Airline Industry
https://www.kai-waehner.de/blog/2025/01/07/virgin-australias-journey-with-apache-kafka-driving-innovation-in-the-airline-industry/ – Tue, 07 Jan 2025

Data streaming with Apache Kafka and Flink is transforming the airline industry, enabling real-time efficiency and exceptional customer experiences. Virgin Australia exemplifies this innovation, using it to modernize its Flight State Engine and overhaul its loyalty program. By embracing event-driven architecture, the airline has improved operational reliability and personalized services, setting a benchmark for aviation digitalization.

Data streaming with Apache Kafka and Flink has revolutionized the aviation industry, enabling airlines and airports to improve efficiency, reliability, and customer experience. The airline Virgin Australia exemplifies how leveraging an event-driven architecture can address operational challenges and drive innovation. This article explores how Virgin Australia successfully implemented data streaming to modernize its flight operations and enhance its loyalty program.

Virgin Australia Journey with Apache Kafka - Innovation in the Airline and Aviation Industry

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch.

Data streaming with Apache Kafka and Flink is revolutionizing aviation by enabling real-time data processing and integration across complex airline and airport ecosystems. Airlines rely on diverse systems for flight tracking, crew scheduling, baggage handling, and passenger services, all of which generate vast volumes of data.

Event-driven Architecture with Data Streaming using Apache Kafka and Flink in Aviation, Airlines, Airports

Kafka’s event-driven architecture ensures seamless communication between these systems, allowing real-time updates and consistent data flows. Flink then processes these event streams, turning raw data into actionable insights.

IT Modernization, Cloud-native Middleware and Analytics with Apache Kafka at Lufthansa

For instance, Lufthansa leverages Apache Kafka as a cloud-native middleware to modernize its data integration and enable real-time analytics. Through its KUSCO platform, Kafka replaces legacy tools like TIBCO EMS, offering scalable, cost-efficient, and seamless data sharing across systems. Kafka also powers Lufthansa’s advanced analytics use cases, including:

  • Anomaly Detection: Real-time alerts using ksqlDB to enhance safety and efficiency.
  • Fleet Management: Machine learning models embedded in Kafka pipelines for real-time operational predictions.
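Lufthansa implements the anomaly detection with ksqlDB; purely for illustration, the same idea expressed with the plain Kafka consumer API might look like the sketch below. The aircraft.sensor.readings topic, the value format, and the temperature threshold are all assumptions:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class SensorAnomalyAlerter {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "anomaly-alerter");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> readings = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> alerts = new KafkaProducer<>(producerProps)) {
            readings.subscribe(List.of("aircraft.sensor.readings"));  // assumed format: "tailNumber,tempCelsius"
            while (true) {
                for (ConsumerRecord<String, String> r : readings.poll(Duration.ofMillis(500))) {
                    String[] fields = r.value().split(",");
                    double temperature = Double.parseDouble(fields[1]);
                    if (temperature > 95.0) {                         // assumed threshold for an alert
                        alerts.send(new ProducerRecord<>("ops.alerts", fields[0],
                                "High temperature reading: " + temperature));
                    }
                }
            }
        }
    }
}
```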

Data Streaming with Apache Kafka at Airlines - Lufthansa Case Study

This shift enables Lufthansa to decouple systems, accelerate innovation, and reduce costs, positioning the airline to meet the demands of a rapidly evolving industry with greater efficiency and agility.

Business Value of Data Streaming at Amsterdam Airport (Schiphol Group)

The business value of data streaming in aviation is immense. Airlines gain operational efficiency by reducing delays and optimizing resource allocation. Real-time insights enhance the passenger experience with timely updates, better baggage handling, and personalized interactions.

Airport modernization and digitalization, with consistent real-time information, is another important trend. This includes data sharing with partners, such as GDS systems and airlines. Schiphol Group (Amsterdam Airport) presented various use cases for data streaming with Apache Kafka and Flink.

Schiphol Airport - Data Integration Platform with Apache Kafka Confluent Cloud 3Scale Splunk Datadog

Scalable platforms like Kafka allow airlines to integrate new technologies, future-proofing their operations in an increasingly competitive industry. By leveraging data streaming, aviation companies are not just keeping pace—they’re redefining what’s possible in airline and airport management.

Virgin Australia: Business Overview and IT Strategy

Founded in 2000, Virgin Australia is a leading airline connecting Australia to key global destinations through domestic and international flights. Known for exceptional service and innovation, the airline serves a diverse range of passengers, from leisure travelers to corporate clients.

Virgin Australia’s IT strategy drives its success, focusing on digital transformation to modernize legacy systems and integrate real-time data solutions. The airline adopts modern technologies such as Apache Kafka and focuses on efficiency to deliver value and innovation in the airline industry.

This enables the airline to optimize operations, enhance on-time performance, and quickly adapt to disruptions. A customer-first approach is central, leveraging data insights to personalize every stage of the passenger journey and build lasting loyalty.

Virgin Australia partnered with Confluent and the IT consulting firm 4impact to implement Apache Kafka for event streaming, ensuring their systems could meet the airline’s evolving demands. The following is a summary of 4impact’s published success stories:

Success Story 1: Real-Time Flight Schedule Updates with the Flight State Engine (FSE)

Virgin Australia’s Flight State Engine (FSE) creates a central, authoritative view of flight status and streams real-time updates to multiple internal and external systems. Initially built on Oracle SOA, the legacy FSE faced significant limitations:

  • High costs and slow implementation of new features.
  • Limited monitoring capabilities.
  • Lack of scalability for additional event-streaming use cases.

The Solution

4impact replatformed the FSE with a Kafka-based architecture, introducing:

  • Modern Event Streaming: Kafka replaced Oracle SOA, enabling real-time, high-throughput updates (see the sketch after this list).
  • Phased Rollout: To minimize disruption, the new FSE ran parallel to the legacy system during implementation.
  • Future-Proofing: Patterns, templates, and blueprints were developed for future event-streaming applications.
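
4impact has not published implementation code, but the core pattern — producing an authoritative flight-state update to a Kafka topic — can be sketched as follows. The topic name, key, and JSON payload are illustrative assumptions:

Java Example (Kafka Producer):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
    // Keying by flight number preserves per-flight ordering for all consumers
    producer.send(new ProducerRecord<>("flight-state", "VA-842",
            "{\"status\":\"DELAYED\",\"estimatedDeparture\":\"2022-11-03T14:25:00+10:00\"}"));
}

Because every downstream system consumes the same topic, the FSE remains the single authoritative source of flight status.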

Key Outcomes

  • The new FSE went live in late 2022, delivering zero outages and exceeding performance expectations.
  • Speed and cost efficiency for adding new features improved significantly.
  • The platform became the foundation for other business units, enabling faster delivery of new services and innovations.
Virgin Australia IT Modernization and Middleware Replacement Oracle SOA to Apache Kafka Confluent
Source: 4impact

By replacing the legacy FSE with Kafka, Virgin Australia ensured real-time reliability and created a scalable event-streaming platform to support future projects.

Success Story 2: Transforming the Virgin Business Rewards Program

Virgin Business Rewards is a loyalty program designed to engage small and medium-sized enterprises (SMEs). Previously, the program relied on manual workflows and siloed systems, leading to:

  • Inefficient processes prone to errors.
  • Delayed updates on reward earnings and redemptions.
  • High costs due to the lack of automated communication between systems like Salesforce, Amadeus, and iFly.

The Solution

To address these challenges, 4impact implemented Kafka to automate the program’s workflows:

  • Event-Driven Architecture: Kafka topics handled asynchronous messaging between systems, avoiding point-to-point integrations.
  • Custom Microservices: Developed to transform messages and interact with APIs on target systems (sketched below).
  • Monitoring and Logging: A centralized mechanism captured business events and system logs, ensuring observability.
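
The published success story contains no code, but a stripped-down version of such a microservice — consuming reward events, transforming them, and calling a target system’s API — might look like this. Topic and group names are assumptions, and syncToTargetSystem() stands in for the actual API integration:

Java Example (Kafka Consumer):

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "rewards-sync-service");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(List.of("reward-events"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            // Transform the event and push it to the target system (e.g., Salesforce or iFly)
            syncToTargetSystem(record.key(), record.value()); // hypothetical helper
        }
    }
}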

Key Outcomes

The new reward and loyalty system went live in Q1 2023, processing thousands of messages daily with a minimal load on endpoint systems.

  • Reward data was synchronized across all systems, eliminating manual intervention.
  • Other business units began exploring Kafka’s potential to leverage data for faster, more cost-effective service enhancements.
Virgin Australia Airline Loyalty Platform Powered by Data Streaming using Apache Kafka
Source: 4impact

With Apache Kafka, Virgin Australia transformed its loyalty program, ensuring real-time updates and creating a scalable platform for future business needs.

IT Modernization with Data Streaming using Apache Kafka: A Blueprint for Innovation in the Airline Industry

Virgin Australia’s success stories illustrate how data streaming with Apache Kafka, implemented with the help of Confluent and 4impact, can address critical challenges in the aviation industry. By replacing legacy systems with modern event-streaming architectures, the airline achieved:

  • Real-Time Reliability: Ensuring up-to-date flight information and seamless customer interactions.
  • Scalability: Creating platforms that support new features and services without high costs or delays.
  • Customer-Centric Solutions: Enhancing loyalty programs and operational efficiency.

The blog post “Customer Loyalty and Rewards Platform with Apache Kafka” explores how enterprises across various industries use Apache Kafka to enhance customer retention and drive revenue growth through real-time data streaming. It presents case studies from companies like Albertsons, Globe Telecom, Virgin Australia, Disney+ Hotstar, and Porsche to show the value of data streaming in improving customer loyalty programs.

Stay ahead of the curve! Subscribe to my newsletter for insights into data streaming and connect with me on LinkedIn to continue the conversation.

The post Virgin Australia’s Journey with Apache Kafka: Driving Innovation in the Airline Industry appeared first on Kai Waehner.

Stateless vs. Stateful Stream Processing with Kafka Streams and Apache Flink https://www.kai-waehner.de/blog/2024/12/27/stateless-vs-stateful-stream-processing-with-kafka-streams-and-apache-flink/ Fri, 27 Dec 2024 08:48:54 +0000 https://www.kai-waehner.de/?p=6857 The rise of stream processing has changed how we handle and act on data. While traditional databases, data lakes, and warehouses are effective for many batch-based use cases, they fall short in scenarios demanding low latency, scalability, and real-time decision-making. This post explores the key concepts of stateless and stateful stream processing, using Kafka Streams and Apache Flink as examples.

The post Stateless vs. Stateful Stream Processing with Kafka Streams and Apache Flink appeared first on Kai Waehner.

In the world of data-driven applications, the rise of stream processing has changed how we handle and act on data. While traditional databases, data lakes, and warehouses are effective for many batch-based use cases, they fall short in scenarios demanding low latency, scalability, and real-time decision-making. This post explores the key concepts of stateless and stateful stream processing, using Kafka Streams and Apache Flink as examples. These principles apply to any stream processing engine, whether open-source or a cloud service. Let’s break down the differences, practical use cases, the relation to AI/ML, and the immense value stream processing offers compared to traditional data-at-rest methods.

Stateless and Stateful Stream Processing with Kafka Streams and Apache Flink

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch.

Rethinking Data Processing: From Static to Dynamic

In traditional systems, data is typically stored first in a database or data lake and queried later for computation. This works well for batch processing tasks, like generating reports or dashboards. The process usually looks something like this:

  1. Store Data: Data arrives and is stored in a database or data lake.
  2. Query & Compute: Applications request data for analysis or processing at a later time with a web service, request-response API or SQL script.

However, this approach fails when you need:

  • Immediate Action: Real-time responses to events, such as fraud detection.
  • Scalability: Handling thousands or millions of events per second.
  • Continuous Insights: Ongoing analysis of data in motion.

Enter stream processing—a paradigm where data is continuously processed as it flows through the system. Instead of waiting to store data first, stream processing engines like Kafka Streams and Apache Flink enable you to act on data instantly as it arrives.

Use Case: Fraud Prevention in Real-Time

The blog post uses a fraud prevention scenario to illustrate the power of stream processing. In this example, transactions from various sources (e.g., credit card payments, mobile app purchases) are monitored in real time.

Fraud Detection and Prevention with Stream Processing in Real-Time

The system flags suspicious activities using three methods:

  1. Stateless Processing: Each transaction is evaluated independently, and high-value payments are flagged immediately.
  2. Stateful Processing: Transactions are analyzed over a time window (e.g., 1 hour) to detect patterns, such as an unusually high number of transactions.
  3. AI Integration: A pre-trained machine learning model is used for real-time fraud detection by predicting the likelihood of fraudulent activity.

This example highlights how stream processing enables instant, scalable, and intelligent fraud detection, something not achievable with traditional batch processing.

To avoid confusion: while I use Kafka Streams for stateless and Apache Flink for stateful in the example, both frameworks are capable of handling both types of processing.

Other Industry Examples of Stream Processing

  • Predictive Maintenance (Industrial IoT): Continuously monitor sensor data to predict equipment failures and schedule proactive maintenance.
  • Real-Time Advertisement (Retail): Deliver personalized ads based on real-time user interactions and behavior patterns.
  • Real-Time Portfolio Monitoring (Finance): Continuously analyze market data and portfolio performance to trigger instant alerts or automated trades during market fluctuations.
  • Supply Chain Optimization (Logistics): Track shipments in real time to optimize routing, reduce delays, and improve efficiency.
  • Condition Monitoring (Healthcare): Analyze patient vitals continuously to detect anomalies and trigger immediate alerts.
  • Network Monitoring (Telecom): Detect outages or performance issues in real time to improve service reliability.

These examples highlight how stream processing drives real-time insights and actions across diverse industries.

What is Stateless Stream Processing?

Stateless stream processing focuses on processing each event independently. In this approach, the system does not need to maintain any context or memory of previous events. Each incoming event is handled in isolation, meaning the logic applied depends solely on the data within that specific event.

This makes stateless processing highly efficient and easy to scale, as it doesn’t require state management or coordination between events. It is ideal for use cases such as filtering, transformations, and simple ETL operations where individual events can be processed with no need for historical data or context.

1. Example: Real-Time Payment Monitoring

Imagine a fraud prevention system that monitors transactions in real time to detect and prevent suspicious activities. Each transaction, whether from a credit card, mobile app, or payment gateway, is evaluated as it occurs. The system checks for anomalies such as unusually high amounts, transactions from unfamiliar locations, or rapid sequences of purchases.

Fraud Detection - Stateless Transaction Monitoring with Kafka Streams

By analyzing these attributes instantly, the system can flag high-risk transactions for further inspection or automatically block them. This real-time evaluation ensures potential fraud is caught immediately, reducing the likelihood of financial loss and enhancing overall security.

You want to flag high-value payments for further inspection. In the following Kafka Streams example:

  • Each transaction is evaluated as it arrives.
  • If the transaction amount exceeds 100 (in your chosen currency), it’s sent to a separate topic for further review.

Java Example (Kafka Streams):

// Read the payments stream and route high-value transactions to a review topic
KStream<String, Payment> payments = builder.stream("payments");

payments.filter((key, payment) -> payment.getAmount() > 100)
        .to("high-risk-payments");

Benefits of Stateless Processing

  • Low Latency: Immediate processing of individual events.
  • Simplicity: No need to track or manage past events.
  • Scalability: Handles large volumes of data efficiently.

This approach is ideal for use cases like filtering, data enrichment, and simple ETL tasks.

What is Stateful Stream Processing?

Stateful stream processing takes it a step further by considering multiple events together. The system maintains state across events, allowing for complex operations like aggregations, joins, and windowed analyses. This means the system can correlate data over a defined period, track patterns, and detect anomalies that emerge across multiple transactions or data points.

2. Example: Fraud Prevention through Continuous Pattern Detection

In fraud prevention, individual transactions may appear normal, but patterns over time can reveal suspicious behavior.

For example, a fraud prevention system might identify suspicious behavior by analyzing all transactions from a specific credit card within a one-hour window, rather than evaluating each transaction in isolation.

Fraud Detection - Stateful Anomaly Detection with Apache Flink SQL

Let’s detect anomalies by analyzing transactions with Apache Flink using Flink SQL. In this example:

  • The system monitors transactions for each credit card within a 1-hour window.
  • If a card is used over 10 times in an hour, it flags potential fraud.

SQL Example (Apache Flink):

SELECT card_number, COUNT(*) AS transaction_count
FROM payments
GROUP BY TUMBLE(transaction_time, INTERVAL '1' HOUR), card_number
HAVING COUNT(*) > 10;

Key Concepts in Stateful Processing

Stateful processing relies on maintaining context across multiple events, enabling the system to perform more sophisticated analyses. Here are the key concepts that make stateful stream processing possible:

  1. Windows: Define a time range to group events (e.g., sliding windows, tumbling windows); a Kafka Streams sketch follows after this list.
  2. State Management: The system remembers past events within the defined window.
  3. Joins: Combine data from multiple sources for enriched analysis.
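
To make these concepts concrete, here is a hedged Kafka Streams equivalent of the Flink SQL query above — counting transactions per card in one-hour tumbling windows. The topic name and Payment type are assumptions, and serde configuration is omitted for brevity:

Java Example (Kafka Streams):

import java.time.Duration;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

StreamsBuilder builder = new StreamsBuilder();
KStream<String, Payment> payments = builder.stream("payments");

// Count transactions per card in 1-hour tumbling windows; Kafka Streams keeps the
// counts in a local, fault-tolerant state store backed by a changelog topic
KTable<Windowed<String>, Long> counts = payments
        .groupBy((key, payment) -> payment.getCardNumber())
        .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofHours(1)))
        .count();

// Emit cards with more than 10 transactions per window as potential fraud
counts.toStream()
        .filter((windowedCard, count) -> count > 10)
        .to("potential-fraud");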

Benefits of Stateful Processing

Stateful processing is essential for advanced use cases like anomaly detection, real-time monitoring, and predictive analytics:

  • Complex Analysis: Detect patterns over time.
  • Event Correlation: Combine events from different sources.
  • Real-Time Decision-Making: Continuous monitoring without reprocessing data.

Bringing AI and Machine Learning into Stream Processing

Stream processing engines like Kafka Streams and Apache Flink also enable real-time AI and machine learning model inference. This allows you to integrate pre-trained models directly into your data processing pipelines.

3. Example: Real-Time Fraud Detection with AI/ML Models

Consider a payment fraud detection system that uses a TensorFlow model for real-time inference. In this system, transactions from various sources — such as credit cards, mobile apps, and payment gateways — are streamed continuously. Each incoming transaction is preprocessed and sent to the TensorFlow model, which evaluates it based on patterns learned during training.

Fraud Detection - Anomaly Detection with Predictive Al ML using Apache Flink Python API

The model analyzes features like transaction amount, location, device ID, and frequency to predict the likelihood of fraud. If the model identifies a high probability of fraud, the system can trigger immediate actions, such as flagging the transaction, blocking it, or alerting security teams. This real-time inference ensures that potential fraud is detected and addressed instantly, reducing risk and enhancing security.

Here is a code example using Apache Flink’s Python API for predictive AI:

Python Example (Apache Flink):

def predict_fraud(payment):
    # Score the transaction with the pre-trained model; > 0.5 is treated as fraud
    prediction = model.predict(payment.features)
    return prediction > 0.5

stream = payments.map(predict_fraud)

Why Combine AI with Stream Processing?

Integrating AI with stream processing unlocks powerful capabilities for real-time decision-making, enabling businesses to respond instantly to data as it flows through their systems. Here are some key benefits of combining AI with stream processing:

  • Real-Time Predictions: Immediate fraud detection and prevention.
  • Automated Decisions: Integrate AI into critical business processes.
  • Scalability: Handle millions of predictions per second.

Apache Kafka and Flink deliver low-latency, scalable, and robust predictions. My article “Real-Time Model Inference with Apache Kafka and Flink for Predictive AI and GenAI” compares remote inference (via APIs) and embedded inference (within the stream processing application).

For large AI models (e.g., generative AI or large language models), inference is often done via remote calls to avoid embedding large models within the stream processor.
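
As an illustration of the remote inference option, the call from within a map function could look like the following sketch. The endpoint follows the TensorFlow Serving REST convention, but the host, model name, and feature vector are assumptions:

Java Example (Remote Model Inference):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

HttpClient client = HttpClient.newHttpClient();

// Hypothetical model-serving endpoint (TensorFlow Serving style REST API)
HttpRequest request = HttpRequest.newBuilder(
        URI.create("http://model-server:8501/v1/models/fraud:predict"))
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString("{\"instances\": [[42.0, 3.0, 1.0]]}"))
    .build();

// The response body contains the model's fraud probability for the transaction features
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

The trade-off: remote calls add network latency per event, while embedded models increase the memory footprint of the stream processor.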

Stateless vs. Stateful Stream Processing: When to Use Each

Choosing between stateless and stateful stream processing depends on the complexity of your use case and whether you need to maintain context across multiple events. The following table outlines the key differences to help you determine the best approach for your specific needs.

Feature             Stateless                  Stateful
Use Case            Simple Filtering, ETL      Aggregations, Joins
Latency             Very Low Latency           Slightly Higher Latency due to State Management
Complexity          Simple Logic               Complex Logic Involving Multiple Events
State Management    Not Required               Required for Context-aware Processing
Scalability         High                       Depends on the Framework

Read my article “Apache Kafka (including Kafka Streams) + Apache Flink = Match Made in Heaven” to learn more about choosing the right stream processing engine for your use case.

And to clarify again: while this article uses Kafka Streams for stateless and Flink for stateful stream processing, both frameworks are capable of handling both types.

Video Recording

Below, I summarize this content as a ten-minute video on my YouTube channel:

Why Stream Processing is a Fundamental Change

Whether stateless or stateful, stream processing with Kafka Streams, Apache Flink, and similar technologies unlocks real-time capabilities that traditional databases simply cannot offer. From simple ETL tasks to complex fraud detection and AI integration, stream processing empowers organizations to build scalable, low-latency applications.

Stream Processing with Apache Kafka Flink SQL Java Python and AI ML

Investing in stream processing means:

  • Faster Innovation: Real-time insights drive competitive advantage.
  • Operational Efficiency: Automate decisions and reduce latency.
  • Scalability: Handle millions of events seamlessly.

Stream processing isn’t just an evolution of data handling—it’s a revolution. If you’re not leveraging it yet, now is the time to explore this powerful paradigm. If you want to learn more, check out my lightboard video exploring the core value of Apache Flink:

Stay ahead of the curve! Subscribe to my newsletter for insights into data streaming and connect with me on LinkedIn to continue the conversation.

The post Stateless vs. Stateful Stream Processing with Kafka Streams and Apache Flink appeared first on Kai Waehner.

Open Standards for Data Lineage: OpenLineage for Batch AND Streaming https://www.kai-waehner.de/blog/2024/05/13/open-standards-for-data-lineage-openlineage-for-batch-and-streaming/ Mon, 13 May 2024 05:20:11 +0000 https://www.kai-waehner.de/?p=6364 One of the greatest wishes of companies is end-to-end visibility in their operational and analytical workflows. Where does data come from? Where does it go? To whom am I giving access to? How can I track data quality issues? The capability to follow the data flow to answer these questions is called data lineage. This blog post explores market trends, efforts to provide an open standard with OpenLineage, and how data governance solutions from vendors such as IBM, Google, Confluent and Collibra help fulfil the enterprise-wide data governance needs of most companies, including data streaming technologies such as Apache Kafka and Flink.

The post Open Standards for Data Lineage: OpenLineage for Batch AND Streaming appeared first on Kai Waehner.

One of the greatest wishes of companies is end-to-end visibility in their operational and analytical workflows. Where does data come from? Where does it go? To whom am I giving access to? How can I track data quality issues? The capability to follow the data flow to answer these questions is called data lineage. This blog post explores market trends, efforts to provide an open standard with OpenLineage, and how data governance solutions from vendors such as IBM, Google, Confluent and Collibra help fulfil the enterprise-wide data governance needs of most companies, including data streaming technologies such as Apache Kafka and Flink.

Data Lineage for Data Streaming with OpenLineage Apache Kafka and Flink

What is Data Governance?

Data governance refers to the overall management of the availability, usability, integrity, and security of data used in an organization. It involves establishing processes, roles, policies, standards, and metrics to ensure that data is properly managed throughout its lifecycle. Data governance aims to ensure that data is accurate, consistent, secure, and compliant with regulatory requirements and organizational policies. It encompasses activities such as data quality management, data security, metadata management, and compliance with data-related regulations and standards.

What is the Business Value of Data Governance?

The business value of data governance is significant and multifaceted:

  1. Improved Data Quality: Data governance ensures that data is accurate, consistent, and reliable, leading to better decision-making, reduced errors, and improved operational efficiency.
  2. Enhanced Regulatory Compliance: By establishing policies and procedures for data management and ensuring compliance with regulations such as GDPR, HIPAA, and CCPA, data governance helps mitigate risks associated with non-compliance, including penalties and reputational damage.
  3. Increased Trust and Confidence: Effective data governance instills trust and confidence in data among stakeholders. It leads to greater adoption of data-driven decision-making and improved collaboration across departments.
  4. Cost Reduction: By reducing data redundancy, eliminating data inconsistencies, and optimizing data storage and maintenance processes, data governance helps organizations minimize costs associated with data management and compliance.
  5. Better Risk Management: Data governance enables organizations to identify, assess, and mitigate risks associated with data management, security, privacy, and compliance, reducing the likelihood and impact of data-related incidents.
  6. Support for Business Initiatives: Data governance provides a foundation for strategic initiatives such as digital transformation, data analytics, and AI/ML projects by ensuring that data is available, accessible, and reliable for analysis and decision-making.
  7. Competitive Advantage: Organizations with robust data governance practices can leverage data more effectively to gain insights, innovate, and respond to market changes quickly, giving them a competitive edge in their industry.

Overall, data governance contributes to improved data quality, compliance, trust, cost efficiency, risk management, and competitiveness, ultimately driving better business outcomes and value creation.

What is Data Lineage?

Data lineage refers to the ability to trace the complete lifecycle of data, from its origin through every transformation and movement across different systems and processes. It provides a detailed understanding of how data is created, modified, and consumed within an organization’s data ecosystem, including information about its source, transformations applied, and destinations.

Data Lineage is an essential component of Data Governance: Understanding data lineage helps organizations ensure data quality, compliance with regulations, and adherence to internal policies by providing visibility into data flows and transformations.

Data Lineage is NOT Event Tracing!

Event tracing and data lineage are different concepts that serve distinct purposes in the realm of data management:

Data Lineage:

  • Data lineage refers to the ability to track and visualize the complete lifecycle of data, from its origin through every transformation and movement across different systems and processes.
  • It provides a detailed understanding of how data is created, modified, and consumed within an organization’s data ecosystem, including information about its source, transformations applied, and destinations.
  • Data lineage focuses on the flow of data and metadata, helping organizations ensure data quality, compliance, and trustworthiness by providing visibility into data flows and transformations.

Event Tracing:

  • Event tracing, also known as distributed tracing, is a technique used in distributed systems to monitor and debug the flow of individual requests or events as they traverse through various components and services.
  • It involves instrumenting applications to generate trace data, which contains information about the path and timing of events as they propagate across different nodes and services.
  • Event tracing is primarily used for performance monitoring, troubleshooting, and root cause analysis in complex distributed systems, helping organizations identify bottlenecks, latency issues, and errors in request processing.

In summary, data lineage focuses on the lifecycle of data within an organization’s data ecosystem, while event tracing is more concerned with monitoring the flow of individual events or requests through distributed systems for troubleshooting and performance analysis.

Here is an example in payments processing: Data lineage would track the path of payment data from initiation to settlement, detailing each step and transformation it undergoes. Meanwhile, event tracing would monitor individual events within the payment system in real-time, capturing the sequence and outcome of actions, such as authentication checks and transaction approvals.

What is the Standard ‘OpenLineage’?

OpenLineage is an open-source project that aims to standardize metadata management for data lineage. It provides a framework for capturing, storing, and sharing metadata related to the lineage of data as it moves through various stages of processing within an organization’s data infrastructure. By providing a common format and APIs for expressing and accessing lineage information, OpenLineage enables interoperability between different data processing systems and tools, facilitating data governance, compliance, and data quality efforts.
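
To give a feel for the standard, a simplified OpenLineage run event looks roughly like the following JSON. The namespaces, job name, and run ID are illustrative; the field structure follows the OpenLineage specification:

{
  "eventType": "COMPLETE",
  "eventTime": "2024-05-13T05:20:11.000Z",
  "run": { "runId": "d46e465b-d358-4d32-83d4-df660ff614dd" },
  "job": { "namespace": "payments", "name": "enrich_transactions" },
  "inputs": [ { "namespace": "kafka://broker:9092", "name": "raw-payments" } ],
  "outputs": [ { "namespace": "kafka://broker:9092", "name": "enriched-payments" } ],
  "producer": "https://github.com/OpenLineage/OpenLineage"
}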

Data Model of OpenLineage for Kafka, Flink, Iceberg, Object Storage
Source: OpenLineage (presented at Kafka Summit London 2024)

OpenLineage is an open platform for the collection and analysis of data lineage. It includes an open standard for lineage data collection, integration libraries for the most common tools, and a metadata repository/reference implementation (Marquez). Many frameworks and tools already support producers/consumers:

OpenLineage Supported Producers and Consumers including Flink, dbt, Airflow, Spark, Marquez, atlan, manta, Snowflake, etc.
Source: OpenLineage (presented at Kafka Summit London 2024)

Data streaming involves the real-time processing and movement of data through a distributed messaging platform. This enables organizations to efficiently ingest, process, and analyze large volumes of data from various sources. By decoupling data producers and consumers, a data streaming platform provides a scalable and fault-tolerant solution for building real-time data pipelines to support use cases such as real-time analytics, event-driven architectures, and data integration.

The de facto standard for data streaming is Apache Kafka, used by over 100,000 organizations. Kafka is not just used for big data; it also supports transactional workloads.

Data Governance Differences with Data Streaming Compared to Data Lake and Data Warehouse?

Implementing data governance and lineage with data streaming presents several differences and challenges compared to data lakes and data warehouses:

  1. Real-Time Nature: Data streaming involves the processing of data in real-time when it is generated, whereas data lakes and data warehouses typically deal with batch processing of historical data. This real-time nature of data streaming requires governance processes and controls that can operate at the speed of streaming data ingestion, processing, and analysis.
  2. Dynamic Data Flow: Data streaming environments are characterized by dynamic and continuous data flows, with data being ingested, processed, and analyzed in near-real-time. This dynamic nature requires data governance mechanisms that can adapt to changing data sources, schemas, and processing pipelines in real-time, ensuring that governance policies are applied consistently across the entire streaming data ecosystem.
  3. Granular Data Lineage: In data streaming, data lineage needs to be tracked at a more granular level compared to data lakes and data warehouses. This is because streaming data often undergoes multiple transformations and enrichments as it moves through streaming pipelines. In some cases, the lineage of each individual data record must be traced to ensure data quality, compliance, and accountability.
  4. Immediate Actionability: Data streaming environments often require immediate actionability of data governance policies and controls to address issues such as data quality issues, security breaches, or compliance violations in real-time. This necessitates the automation of governance processes and the integration of governance controls directly into streaming data processing pipelines, enabling timely detection, notification, and remediation of governance issues.
  5. Scalability and Resilience: Data streaming platforms like Apache Kafka and Apache Flink are designed for scalability and resilience to handle both high volumes of data and transactional workloads with critical SLAs. The platform must ensure continuous stream processing even in the face of failures or disruptions. Data governance mechanisms in streaming environments need to be similarly scalable and resilient to keep pace with the scale and speed of streaming data processing, ensuring consistent governance enforcement across distributed and resilient streaming infrastructure.
  6. Metadata Management Challenges: Data streaming introduces unique challenges for metadata management, as metadata needs to be captured and managed in real-time to provide visibility into streaming data pipelines, schema evolution, and data lineage. This requires specialized tools and techniques for capturing, storing, and querying metadata in streaming environments, enabling stakeholders to understand and analyze the streaming data ecosystem effectively.

In summary, implementing data governance with data streaming requires addressing the unique challenges posed by the real-time nature, dynamic data flow, granular data lineage, immediate actionability, scalability, resilience, and metadata management requirements of streaming data environments. This involves adopting specialized governance processes, controls, tools, and techniques tailored to the characteristics and requirements of data streaming platforms like Apache Kafka and Apache Flink.

Schemas and Data Contracts for Streaming Data

Schemas and data contracts are the foundation of data governance for streaming data. Confluent Schema Registry is available on GitHub. It became the de facto standard for ensuring data quality and governance in Kafka projects across all industries – not just in Confluent projects, but also in the broader community leveraging open source technologies. Schema Registry is available under the Confluent Community License that allows deployment in production scenarios with no licensing costs.
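
To illustrate how lightweight the integration is: wiring a Java producer against Schema Registry is mostly configuration. The URLs below are placeholders:

Java Example (Producer with Schema Registry):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// The Avro serializer registers and validates schemas against Schema Registry on send
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", "http://localhost:8081");

KafkaProducer<String, Object> producer = new KafkaProducer<>(props);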

Confluent Schema Registry for good Data Quality and Governance using Apache Kafka
Source: Confluent

For more details, check out my article “Policy Enforcement and Data Quality for Apache Kafka with Schema Registry“. And here are two great case studies for financial services companies leveraging schemas and group-wide API contracts across the organization for data governance:

Confluent Schema Registry at ING Bank for Data Governance and Quality
Source: ING Bank

Confluent Cloud is an excellent example of a data governance solution for data streaming. The fully managed data streaming platform provides capabilities such as:

  • Data Catalog
  • Data Lineage
  • Stream Sharing
  • Data Portal

The Data Portal combines the capabilities in an intuitive user interface to discover, explore, access and use streaming data products:

Confluent Cloud Data Portal for Data Governance
Source: Confluent

All information and functionality are available in the UI for humans and as an API for integration scenarios.

If you want to learn more about data streaming and data governance in a fun way, check out the free comic ebook “The Data Streaming Revolution: The Force of Kafka + Flink Awakens“:

Comic Ebook - The Data Streaming Revolution: The Force of Kafka + Flink Awakens
Source: Confluent

Data Lineage for Streaming Data

As a core component of data governance, good data lineage is required in data streaming projects for visibility and governance. Today’s market mainly provides two options: custom projects or commercial products/cloud services. But the market is evolving: open standards for data lineage are emerging, and their implementations increasingly integrate data streaming.

Let’s explore an example of a commercial solution and an open standard for streaming data lineage:

  • Cloud service: Data Lineage as part of Confluent Cloud
  • Open standard: OpenLineage’s integration with Apache Flink and Marquez

To move forward with updates to critical applications or answer questions on important subjects like data regulation and compliance, teams need an easy means of comprehending the big picture journey of data in motion. Confluent Cloud provides a solution deeply integrated with Kafka and Flink as part of the fully managed SaaS offering.

Stream Lineage in Confluent Cloud
Source: Confluent

Stream lineage provides a graphical UI of event streams and data relationships with both a bird’s eye view and drill-down magnification for answering questions like:

  • Where did data come from?
  • Where is it going?
  • Where, when, and how was it transformed?

Answers to questions like these allow developers to trust the data they’ve found, and gain the visibility needed to make sure their changes won’t cause any negative or unexpected downstream impact. Developers can learn and decide quickly with live metrics and metadata inspection embedded directly within lineage graphs.

The Confluent documentation goes into much more detail, including examples, tutorials, free cloud credits, etc. Most of the above description is also copied from there.

In recent months, stream processing has gained the particular focus of the OpenLineage community, as described in a dedicated talk at Kafka Summit 2024 in London.

Many useful features for stream processing have been completed or begun in OpenLineage’s implementation, including:

  • A seamless OpenLineage and Apache Flink integration
  • Support for streaming jobs in data catalogs like Marquez, manta, atlan
  • Progress on a built-in lineage API within the Flink codebase

Here is a screenshot from the live demo of the Kafka Summit talk that shows data lineage across Kafka Topics, Flink applications, and other databases with the reference implementation of OpenLineage (Marquez):

Data Lineage for Stream Processing - OpenLineage Integration with Marquez, Kafka and Flink
Source: OpenLineage (presented at Kafka Summit London 2024)

The OpenLineage Flink integration is at an early stage with limitations, like no support for Flink SQL or Table API yet. But this is an important initiative. Cross-platform lineage enables a holistic overview of data flow and its dependencies within organizations. This must include stream processing (which often runs the most critical workloads in an enterprise).

The Need for Enterprise-Wide Data Governance and Data Lineage

Data Governance, including Data Lineage, is an enterprise-wide challenge. OpenLineage is an excellent approach for an open standard to integrate with various data platforms like data streaming platforms, data lakes, data warehouses, lakehouses, and any other business applications.

However, we are still early on this journey. Most companies (have to) build custom solutions today for enterprise-wide governance and lineage of data across various platforms. Short term, most companies leverage purpose-built data governance and lineage features from cloud products like Confluent, Databricks and Snowflake. This makes sense as it creates visibility in the data flows and improves data quality.

Enterprise-wide data governance needs to integrate with all the different data platforms. Most companies have built their own solutions – if they have anything at all (most don’t yet). Dedicated enterprise governance suites like Collibra or Microsoft Purview get adopted more and more to solve these challenges. And software/cloud vendors like Confluent integrate their purpose-built data lineage and governance into these platforms, either via open APIs or via direct and certified integrations.

Balancing Standardization and Innovation with Open Standards and Cloud Services

OpenLineage is a great community initiative to standardize the integration between data platforms and data governance. Hopefully, vendors will adopt such open standards in the future. Today, it is at an early stage, and you will probably integrate via open APIs or certified (proprietary) connectors.

Balancing standardization and innovation is always a trade-off: Finding the right balance between standardization and innovation entails simplicity, flexibility, and diligent review processes, with a focus on addressing real-world pain points and fostering community-driven extensions.

If you want to learn more about open standards for data governance, please watch this expert panel for data lineage where Accenture and Confluent welcomed experts from OpenLineage, Collibra, Google, IBM / Manta, IBM / Egeria, Atlan, and Confluent (actually me).

Data Lineage Panel with Confluent IBM Manta Collibra Google OpenLineage Accenture

How do you implement data governance and lineage? Do you already leverage OpenLineage or other standards? Or are you investing in commercial products? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Open Standards for Data Lineage: OpenLineage for Batch AND Streaming appeared first on Kai Waehner.

When NOT to Use Apache Kafka? (Lightboard Video) https://www.kai-waehner.de/blog/2024/03/26/when-not-to-use-apache-kafka-lightboard-video/ Tue, 26 Mar 2024 06:45:11 +0000 https://www.kai-waehner.de/?p=6262 Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How to qualify Kafka out as it is not the right tool for the job? This blog post contains a lightboard video that gives you a twenty-minute explanation of the DOs and DONTs.

The post When NOT to Use Apache Kafka? (Lightboard Video) appeared first on Kai Waehner.

Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How to qualify Kafka out as it is not the right tool for the job? This blog post contains a lightboard video that gives you a twenty-minute explanation of the DOs and DONTs.

When NOT to Use Apache Kafka?

Disclaimer: This blog post shares a lightboard video to watch an explanation about when NOT to use Apache Kafka. For a much more detailed and technical blog post with various use cases and case studies, check out this blog post from 2022 (which is still valid today – whenever you read it).

What is Apache Kafka, and what is it NOT?

Kafka is often misunderstood. For instance, I still hear way too often that Kafka is a message queue. Part of the reason is that some vendors only pitch it for a specific problem (such as data ingestion into a data lake or data warehouse) to sell their products. So, in short:

Kafka is…

  • a scalable real-time messaging platform to process millions of messages per second.
  • a data streaming platform for massive volumes of big data analytics and small volumes of transactional data processing.
  • a distributed storage layer that provides true decoupling for backpressure handling, support of various communication protocols, and replayability of events with guaranteed ordering.
  • a data integration framework (Kafka Connect) for streaming ETL.
  • a data processing framework (Kafka Streams) for continuous stateless or stateful stream processing.

This combination of characteristics in a single platform makes Kafka unique (and successful).

Kafka is NOT…

  • a proxy for millions of clients (like mobile apps) – but Kafka-native proxies (like REST or MQTT) exist for some use cases.
  • an API Management platform – but these tools are usually complementary and used for the creation, life cycle management, or the monetization of Kafka APIs.
  • a database for complex queries and batch analytics workloads – but good enough for transactional queries and relatively simple aggregations (especially with ksqlDB).
  • an IoT platform with features such as device management – but direct Kafka-native integration with (some) IoT protocols such as MQTT or OPC-UA is possible and the appropriate approach for (some) use cases.
  • a technology for hard real-time applications such as safety-critical or deterministic systems – but that’s true for any other IT framework, too. Embedded systems are different software!

For these reasons, Kafka is complementary, not competitive, to these other technologies. Choose the right tool for the job and combine them!

Lightboard Video: When NOT to use Apache Kafka

The following video explores the key concepts of Apache Kafka. Afterwards, the DOs and DONTs of Kafka show how to complement data streaming with other technologies for analytics, APIs, IoT, and other scenarios.

Data Streaming Vendors and Cloud Services

The research company Forrester defines data streaming platforms as a new software category in a new Forrester Wave. Apache Kafka is the de facto standard used by over 100,000 organizations.

Plenty of vendors offer Kafka platforms and cloud services. Many complementary open source stream processing frameworks like Apache Flink and related cloud offerings emerged. And competitive technologies like Pulsar, Redpanda, or WarpStream try to gain market share leveraging the Kafka protocol. Check out the data streaming landscape of 2024 to summarize existing solutions and market trends. The end of the article gives an outlook on potential new entrants in 2025.

Data Streaming Landscape 2024 around Kafka Flink and Cloud

Apache Kafka is a Data Streaming Platform: Combine it with other Platforms when needed!

By now, over 150,000 organizations use Apache Kafka. The Kafka protocol is the de facto standard for many open source frameworks, commercial products and serverless cloud SaaS offerings.

However, Kafka is not an all-rounder for every use case. Many projects combine Kafka with other technologies, such as databases, data lakes, data warehouses, IoT platforms, and so on. Additionally, Apache Flink is becoming the de facto standard for stream processing (but Kafka Streams is not going away and is the better choice for specific use cases).

Where do you (not) use Apache Kafka? What other technologies do you combine Kafka with? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post When NOT to Use Apache Kafka? (Lightboard Video) appeared first on Kai Waehner.

How the Retailer Intersport uses Apache Kafka as Database with Compacted Topic https://www.kai-waehner.de/blog/2024/01/25/how-the-retailer-intersport-uses-apache-kafka-as-database-with-compacted-topic/ Thu, 25 Jan 2024 04:31:15 +0000 https://www.kai-waehner.de/?p=5760 Compacted Topic is a feature of Apache Kafka to persist and query the latest up-to-date event of a Kafka Topic. The log compaction and key/value search is simple, cost-efficient and scalable. This blog post shows in a success story of Intersport how some use cases store data long term in Kafka with no other database. The retailer requires accurate stock info across the supply chain, including the point of sale (POS) in all international stores.

The post How the Retailer Intersport uses Apache Kafka as Database with Compacted Topic appeared first on Kai Waehner.

Compacted Topic is a feature of Apache Kafka to persist and query the latest up-to-date event of a Kafka Topic. The log compaction and key/value search is simple, cost-efficient and scalable. This blog post shows in a success story of Intersport how some use cases store data long term in Kafka with no other database. The retailer requires accurate stock info across the supply chain, including the point of sale (POS) in all international stores.

How Intersport uses Apache Kafka as Database with Compacted Topic in Retail

What is Intersport?

Intersport International Corporation GmbH, commonly known as Intersport, is headquartered in Bern, Switzerland, but its roots trace back to Austria. Intersport is a global sporting goods retail group that operates a network of stores selling sports equipment, apparel, and related products. It is one of the world’s largest sporting goods retailers and has a presence in many countries around the world.

Intersport stores typically offer a wide range of products for various sports and outdoor activities, including sports clothing, footwear, equipment for sports such as soccer, tennis, skiing, cycling, and more. The company often partners with popular sports brands to offer a variety of products to its customers.

Intersport Wikipedia

Intersport actively promotes sports and physical activity and frequently sponsors sports events and initiatives to encourage people to lead active and healthy lifestyles. The specific products and services offered by Intersport may vary from one location to another, depending on local market demand and trends.

The company automates and innovates continuously with software capabilities like fully automated replenishment, drop shipping, personalized recommendations for customers, and other applications.

How does Intersport leverage Data Streaming with Apache Kafka?

Intersport presented its data streaming success story together with the system integrator DCCS at the Data in Motion Tour 2023 in Vienna, Austria.

Apache Kafka and Compacted Topics in Retail with WMS SAP ERP Cash Register POS
Source: DCCS

Here is a summary about the deployment, use cases, and project lifecycle at Intersport:

  • Apache Kafka as the strategic integration hub powered by fully managed Confluent Cloud
  • Central nervous system to enable data consistency between real-time data and non-real-time data, i.e., batch systems, files, databases, and APIs.
  • Loyalty platform with real-time bonus point system
  • Personalized marketing and hybrid omnichannel customer experience across online and stores
  • Integration with SAP ERP, financial accounting (SAP FI), and third-party B2B systems like bike rental, hundreds of POS, and legacy interfaces like FTP and XML
  • Fast time-to-market because of the fully managed cloud: The pilot project with 100 stores and 200 points of sale (POS) was finished in 6 months. The entire production rollout took only 12 months.
Data Streaming Architecture at Intersport with Apache Kafka KSQL and Schema Registry
Source: DCCS

Is Apache Kafka a Database? No. But…

No, Apache Kafka is NOT a database. Apache Kafka is a distributed streaming platform that is designed for building real-time data pipelines and streaming applications. Users frequently apply it for ingesting, processing, and storing large volumes of event data in real time.

Apache Kafka does not provide the traditional features associated with databases, such as random access to stored data or support for complex queries. If you need a database for storage and retrieval of structured data, you would typically use a database system like MySQL, PostgreSQL, or MongoDB alongside Kafka to address different aspects of your data processing needs.

However, Apache Kafka is a database if you focus on cost-efficient long-term storage and the replayability of historical data. I wrote a long article about the database characteristics of Apache Kafka. Read it to understand when (not) to use Kafka as a database. The emergence of Tiered Storage for Kafka created even more use cases.

In this blog post, I want to focus on one specific feature of Apache Kafka for long-term storage and query functionality: Compacted Topics.

What is a Compacted Topic in Apache Kafka?

Kafka is a distributed event streaming platform, and topics are the primary means of organizing and categorizing data within Kafka. “Compacted Topic” in Apache Kafka refers to a specific type of Kafka Topic configuration that is used to keep only the most recent value for each key within the topic.

Apache Kafka Log Compaction
Source: Apache

In a compacted topic, Kafka ensures that, for each unique key, only the latest message (or event) associated with that key is retained. The system effectively discards older messages with the same key. A Compacted Topic is often used for scenarios where you want to maintain the latest state or record for each key. This can be useful in various applications, such as maintaining the latest user profile information, aggregating statistics, or storing configuration data.

Log Compaction in Kafka with a Compacted Topic
Source: Apache

Here are some key characteristics and use cases for compacted topics in Kafka (a configuration sketch follows the list):

  1. Key-Value Semantics: A compacted topic supports scenarios where you have a key-value data model, and you want to query the most recent value for each unique key.
  2. Log Compaction: Kafka uses a mechanism called “log compaction” to ensure that only the latest message for each key is retained in the topic. Kafka keeps the change history for each key only until compaction runs; older versions of a key’s data are removed once a newer version arrives.
  3. Stateful Processing: Compacted topics are often used in stream processing applications where maintaining the state is important. Stream processing frameworks like Apache Kafka Streams and ksqlDB leverage a compacted topic to perform stateful operations.
  4. Change-Data Capture (CDC): Change-data capture scenarios use compacted topics to track changes to data over time. For example, capturing changes to a database table and storing them in Kafka with the latest version of each record.
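
As referenced above, creating a compacted topic is a matter of setting cleanup.policy=compact. A minimal sketch with Kafka’s AdminClient — topic name, partition count, and replication factor are illustrative:

Java Example (AdminClient):

import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");

try (AdminClient admin = AdminClient.create(props)) {
    // cleanup.policy=compact retains the latest record per key instead of deleting by time
    NewTopic articles = new NewTopic("articles", 6, (short) 3)
            .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT));
    admin.createTopics(List.of(articles)).all().get();
}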

Compacted Topic at Intersport to Store all Retail Articles in Apache Kafka

Intersport stores all articles in Compacted Topics, i.e., without time-based retention. Article records can change several times; topic compaction cleans out outdated records because only the most recent version is relevant.

Master Data Flow at Intersport with Kafka Connect Compacted Topics SQL and REST API
Source: DCCS

Article Data Structure

A model comprises several SKUs as a nested array (a rough sketch follows below the figure):

  • An SKU represents an article with its size and color
  • Every SKU has shop-based prices (purchase price, sales price, list price)
  • Not every SKU is available in every shop
A Compacted Topic for Retail Article in Apache Kafka
Source: DCCS
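
As referenced above, the nested article structure could be modeled roughly like the following Java records. All field names are illustrative assumptions, not Intersport’s actual schema:

import java.util.List;

// Hypothetical shape of the article model described above
record ShopPrice(String shopId, double purchasePrice, double salesPrice, double listPrice) {}
record Sku(String skuId, String size, String color, List<ShopPrice> shopPrices) {}
record ArticleModel(String modelId, String name, List<Sku> skus) {}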

Accurate Stock Information across the Supply Chain

Intersport and DCCS presented their important points and benefits of leveraging Kafka. The central integration hub uses compacted topics for storing and retrieving articles:

  • Customer-facing processes demand real-time data
  • Stock info needs to be accurate
  • Distribute master data to all relevant sub-systems as soon as it changes
  • Scale flexibly under high load (e.g., shopping weekends before Christmas)

Providing the right information at the right time is crucial across the supply chain. Data consistency matters, as not every system is real-time. This is one of the most underestimated sweet spots of Apache Kafka: combining real-time messaging with a persistent event store.
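
In Kafka Streams terms, reading a compacted topic as a table provides exactly this “latest state per key” view. A minimal sketch, assuming the hypothetical articles topic from above:

Java Example (Kafka Streams):

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KTable;

StreamsBuilder builder = new StreamsBuilder();

// A KTable over a compacted topic materializes the latest value per article key and
// can be joined against real-time streams (e.g., stock movements) or queried directly
KTable<String, String> articles = builder.table("articles");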

Log Compaction in Kafka does NOT Replace BUT Complement other Databases

Intersport is an excellent example in the retail industry for persisting information long-term in Kafka Topics leveraging Kafka’s “Compacted Topics” feature. The benefits are simple usage, a cost-efficient event store of the latest up-to-date information, fast key/value queries, and no need for another database. Hence, Kafka can replace a database for some specific scenarios, like storing and querying the inventory of each store at Intersport.

If you want to learn about other use cases and success stories for data streaming with Kafka and Flink in the retail industry, check out the related articles on my blog.

How do you use data streaming with Kafka and Flink? What retail use cases did you implement? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post How the Retailer Intersport uses Apache Kafka as Database with Compacted Topic appeared first on Kai Waehner.

Customer Loyalty and Rewards Platform with Apache Kafka https://www.kai-waehner.de/blog/2024/01/14/customer-loyalty-and-rewards-platform-with-apache-kafka/ Sun, 14 Jan 2024 08:43:26 +0000 https://www.kai-waehner.de/?p=5742 Loyalty and rewards platforms are crucial for customer retention and revenue growth for many enterprises across industries. Apache Kafka provides context-specific real-time data and consistency across all applications and databases for a modern and flexible enterprise architecture. This blog post looks at case studies from Albertsons (retail), Globe Telecom (telco), Virgin Australia (aviation), Disney+ Hotstar (sports and gaming), and Porsche (automotive) to explain the value of data streaming for improving the customer loyalty.

The post Customer Loyalty and Rewards Platform with Apache Kafka appeared first on Kai Waehner.

Loyalty and rewards platforms are crucial for customer retention and revenue growth for many enterprises across industries. Apache Kafka provides context-specific real-time data and consistency across all applications and databases for a modern and flexible enterprise architecture. This blog post looks at case studies from Albertsons (retail), Globe Telecom (telco), Virgin Australia (aviation), Disney+ Hotstar (sports and gaming), and Porsche (automotive) to explain the value of data streaming for improving the customer loyalty.

Real Time Customer Loyalty and Reward Platform with Apache Kafka

What is a Loyalty Platform?

A loyalty platform is a system or software designed to manage and enhance customer loyalty programs for businesses. These programs encourage repeat business, customer retention, and engagement. Loyalty platforms provide tools and features that enable businesses to create and manage various loyalty initiatives.

Key features of a loyalty platform may include:

  1. Points and Rewards System: Customers earn points for making purchases or engaging with the brand, and they can redeem these points for rewards, such as discounts, free products, or other incentives.
  2. Customer Segmentation: Loyalty platforms often allow businesses to segment their customer base to create targeted campaigns and personalized offers based on customer behavior and preferences.
  3. Multi-Channel Integration: Integration with various sales channels, including online stores, mobile apps, and physical stores, ensures a seamless experience for customers and enables businesses to track loyalty across different touchpoints.
  4. Analytics and Reporting: Loyalty platforms provide data and analytics tools to help businesses understand customer behavior, track program effectiveness, and make informed decisions to improve their loyalty initiatives.
  5. Communication Tools: The platform may include features for communicating with customers, such as sending personalized offers, notifications, and other messages to keep them engaged.
  6. User-Friendly Interface: A well-designed interface makes it easy for both businesses and customers to participate in and manage loyalty programs.
  7. Integration with CRM Systems: Integration with Customer Relationship Management (CRM) systems allows businesses to combine loyalty program data with other customer information, providing a more comprehensive view of customer interactions.
  8. Mobile Accessibility: Many loyalty platforms offer mobile apps or mobile-responsive interfaces to allow customers to easily participate in loyalty programs using their smartphones.
  9. Gamification Elements: Some platforms incorporate gamification elements, such as challenges, badges, or tiered levels, to make the loyalty experience more engaging for customers.

Overall, loyalty platforms help businesses build stronger relationships with their customers. They reward and incentivize loyalty, ultimately contributing to increased customer retention and satisfaction.

Data Streaming with Apache Kafka for the Next-Generation Loyalty Platform

Apache Kafka is a distributed data streaming platform for building real-time data pipelines and streaming applications.

Event-driven Architecture for Data Streaming with Apache Kafka and Flink

While it is not the only technology needed to build a loyalty platform, there are several reasons companies incorporate Apache Kafka into the enterprise architecture of a loyalty platform (a producer configuration sketch follows the list):

  1. Real-Time Data Processing: Apache Kafka excels at handling real-time data streams. In a loyalty platform, real-time processing is crucial for activities such as updating customer points, sending instant notifications, and providing timely rewards.
  2. Message Durability: Kafka persists messages on disk, ensuring durability even in the event of a system failure. This feature is important for loyalty platforms, as it helps prevent data loss and ensures the data consistency and integrity of customer transaction records across real-time and non-real-time applications.
  3. Scalability: Loyalty platforms can experience varying levels of user engagement and data volume. Apache Kafka scales horizontally, allowing the platform to handle increased loads by adding more Kafka brokers to the cluster. This scalability is valuable for accommodating growth in the number of users and transactions.
  4. Data Integration: Loyalty platforms often need to integrate with various data sources and systems, such as customer databases, e-commerce platforms, and CRM systems. Kafka’s ability to act as a data integration hub facilitates the seamless flow of data between different components of the loyalty platform and external systems.
  5. Fault Tolerance and Reliability: Apache Kafka provides fault tolerance by replicating data across multiple brokers in a cluster. This ensures that, even if a broker fails, data is not lost, contributing to the reliability of the loyalty platform.
  6. Event-Driven Architecture: Loyalty platforms can benefit from an event-driven architecture, where events trigger actions and updates across the system. Kafka’s publish-subscribe model enables a decoupled, event-driven approach, making it easier to introduce new features or integrations without tightly coupling components.
  7. Event Sourcing: Apache Kafka supports the event sourcing pattern, which is relevant for loyalty platforms where events like customer transactions, point accruals, and redemptions need to be captured and stored as a sequence of immutable events. This can simplify data modeling and auditing. Tiered Storage for Kafka makes replayability even easier, more scalable, and cost-efficient – without needing another data lake.
  8. Analytics and Monitoring: Kafka provides tools for real-time analytics and monitoring, which can be valuable for gaining insights into user behavior, loyalty program effectiveness, and system performance. Integrating Kafka with other analytics platforms allows for a comprehensive view of the loyalty platform’s operations.
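
To make points 2 and 5 concrete, here is a minimal Java producer sketch showing the configuration that delivers durable and idempotent writes. The broker address, topic name, key, and payload are illustrative assumptions, not part of any specific loyalty platform:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class LoyaltyEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker address.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Durability (point 2): wait until all in-sync replicas acknowledge each write.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // Reliability (point 5): the broker de-duplicates producer retries,
        // so a network hiccup cannot accrue the same points twice.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by customer ID keeps all events of one customer in order
            // within a single partition. Topic and payload are illustrative.
            producer.send(new ProducerRecord<>("loyalty-transactions",
                    "customer-4711", "{\"type\":\"PURCHASE\",\"points\":120}"));
        }
    }
}
```

Keying by customer ID is a deliberate design choice: Kafka guarantees ordering per partition, so all loyalty events of one customer are processed in the order they occurred.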

Use Cases for a Kafka-powered Loyalty Platform

While specific implementations may vary, here are some examples of how Apache Kafka can be used in loyalty platforms:

  1. Real-Time Points Accumulation: Kafka can process and handle real-time events related to customer transactions. As users make purchases or engage with the loyalty platform, these events can be captured in Kafka topics. The platform can then process these events in real time with stream processing using Kafka Streams or Apache Flink to update customer points balances and trigger relevant notifications (see the Kafka Streams sketch below).
  2. Event-Driven Rewards and Offers: Kafka’s publish-subscribe model allows for an event-driven approach to managing rewards and offers. When a customer becomes eligible for a reward or offer, the loyalty platform can publish an event to the appropriate Kafka topic. Subscribers, such as notification services or backend processors, can then react to these events in real-time, ensuring timely communication and fulfillment of rewards via push notifications to a mobile app or location-based service.
  3. Cross-Channel Integration: Loyalty platforms often operate across multiple channels, including online stores, mobile apps, and physical stores. Kafka can facilitate the integration of these channels by serving as a central hub for loyalty-related events. For example, customer interactions in different channels can generate events in Kafka, allowing the loyalty platform to maintain a consistent view of customer activity across all touchpoints – no matter if the interface is real-time, batch, or an API.
  4. Customer Engagement Tracking: Kafka can capture and process events related to customer engagement. Events such as logins, clicks, or interactions with loyalty program features can be streamed to Kafka topics. This data can then be used for real-time analytics, allowing the platform to understand customer behavior and adjust loyalty strategies accordingly.
  5. Fault-Tolerant Transaction Processing: Loyalty platforms deal with critical customer transactions, including point redemptions and reward fulfillment. Kafka’s fault-tolerant architecture and Transaction API ensure these transactions are reliably processed even in the face of hardware failures or other issues. This helps maintain the integrity of customer balances and transaction history.
  6. Scalable Data Processing: As the user base and transaction volume of a loyalty platform grows, Kafka’s scalability becomes crucial. Loyalty platforms can leverage Kafka to scale horizontally, distributing the processing load across multiple Kafka brokers to accommodate increased data throughput.
  7. Audit Trail and Compliance: Kafka’s log-based architecture makes it well-suited for maintaining an audit trail of loyalty-related events. This is valuable for compliance purposes, ensuring that all customer interactions and transactions are recorded in a secure and tamper-evident manner. Replayability of historical data in guaranteed order is a common Kafka sweet spot.
  8. Integration with External Systems: Loyalty platforms often need to integrate with external systems, such as payment gateways, CRM systems, or analytics platforms. Kafka can act as a central integration point, enabling seamless communication between the loyalty platform and these external systems. Kafka’s data integration capabilities have significant benefits compared to ETL, ESB and iPaaS tools.

These examples highlight how Apache Kafka plays a pivotal role in building a robust and scalable architecture for loyalty platforms, providing the infrastructure for real-time processing, fault tolerance, and integration with various components.
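
As a concrete illustration of the first use case, here is a minimal Kafka Streams sketch that maintains a continuously updated points balance per customer. The topic names, the application ID, and the assumption that an upstream step already extracted the earned points as a long value are hypothetical:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

public class PointsBalanceApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical application ID and broker address.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "loyalty-points-balance");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Exactly-once processing: a crash or rebalance never double-counts points.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);

        StreamsBuilder builder = new StreamsBuilder();
        // Input: one record per purchase, keyed by customer ID,
        // value = points earned (assumes an upstream step extracted the number).
        KTable<String, Long> balances = builder
                .stream("points-earned", Consumed.with(Serdes.String(), Serdes.Long()))
                .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
                // Continuously updated running balance per customer, backed by a state store.
                .reduce(Long::sum, Materialized.as("points-balance-store"));

        // Output: a changelog stream of balances that notification services,
        // mobile apps, or the CRM can consume.
        balances.toStream().to("points-balance", Produced.with(Serdes.String(), Serdes.Long()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```

The EXACTLY_ONCE_V2 processing guarantee uses Kafka's Transaction API under the hood, which also covers the fault-tolerant transaction processing described in use case 5.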

Example: Real-Time Rewards Platform for Video Streaming built with Apache Kafka

While Apache Kafka offers these advantages, it’s important to note that building a loyalty and reward platform involves various technologies and platforms. Kafka is a scalable real-time data fabric for the architecture of a loyalty platform and complementary to other open-source, commercial or SaaS transactional and analytics applications.

Here is an example of a loyalty and rewards platform built around video streaming platforms like Twitch. Kafka connects to different APIs and interfaces, correlates the information, and ingests the data into downstream applications:

Customer 360, loyalty and rewards with Apache Kafka

Many of these interfaces provide bi-directional communication. For instance, the Salesforce CRM is updated with new information from an influencer, i.e., a video streamer. In the other direction, Twitch subscribers receive notifications like drops or rewards based on information consumed from the Salesforce CRM.

Real World Case Studies Across Industries for Kafka-based Loyalty and Rewards Platforms

Plenty of success stories exist for customer loyalty and reward systems built around Apache Kafka as data fabric and integration hub. Let’s explore some case studies across industries around the world:

  • Retail: Albertsons (United States)
  • Airlines: Virgin Australia
  • Telco: Globe Telecom (Asia)
  • Automotive / Manufacturing: Porsche (Germany)
  • Sports and Gaming: Disney+ Hotstar (Asia)
  • Public Sector: Service NSW (Australia)

Retail: Albertsons – Revamped Loyalty Platform to Retain Customers for Life

Albertsons is the second largest American grocery company with 2200+ stores and 290,000+ employees. “Customers for Life” is the primary goal of its CEO: “We want our customers to interact with us daily […] doubling down on our omnichannel engagement with customers beyond just transactions.”

For this reason, Albertsons chose Apache Kafka as its strategic data integration platform. The cloud-native architecture handles extreme retail scenarios like Christmas or Black Friday:

Albertsons Retail Enterprise Architecture for Data Streaming powered by Apache Kafka

Albertsons use cases include:

  • Scalable workforce management and supply chain modernization
  • Inventory updates from 2200 stores to the cloud services in near real time
  • Distributing offers and customer clips to each store in near real time
  • Feeding the data in real time to forecast engines for supply chain order forecasting, demand planning, and other tasks
  • Ingesting transactions in near real time to data lake for reporting and analytics
  • New retail media network

Here is an example business process of Albertsons revamped loyalty platform. The data streaming platform connects to various applications and databases, then processes and shares the events:

Albertsons Retail Loyalty Platform built with Apache Kafka and Stream Processing

Plenty of other success stories for loyalty platforms built with Kafka exist in the retail world. For instance:

  • Kmart (Australia): The loyalty platform “OnePass” provides customers with a more seamless shopping experience, better personalization, and targeted offers. Instant customer feedback improves the retention rate. Turning paper receipts into valuable digital data with streaming also saves millions of dollars. For instance, better analytics forecasts enable accurate product and ranging options as well as better stock and inventory planning.
  • Intersport (Austria): The central nervous system for real-time data is powered by Confluent Cloud. Its loyalty platform offers a real-time bonus point system (a sketch of the underlying compacted topic pattern follows this list). The data hub shares personalized marketing and an omnichannel customer experience across online channels and stores. Data integration ensures data consistency across the ERP system, financial accounting (SAP FI), 3rd-party B2B services, hundreds of Point-of-Sale (POS) systems, and legacy batch FTP applications.
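
The compacted topic pattern behind such a bonus point system keeps the latest balance per customer key directly in Kafka, as described in the Intersport post referenced above. Below is a minimal sketch of how such a topic could be created with the Java AdminClient; the topic name, partition count, and replication factor are illustrative assumptions:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CompactedBalanceTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Hypothetical broker address.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // cleanup.policy=compact keeps the latest value per key (customer ID),
            // turning the topic into a durable table of current bonus point balances.
            NewTopic topic = new NewTopic("bonus-points-by-customer", 6, (short) 3)
                    .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                                    TopicConfig.CLEANUP_POLICY_COMPACT));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```

With compaction enabled, Kafka retains the most recent value for each key indefinitely, so any consumer can rebuild the current balances by replaying the topic from the beginning – the "Kafka as database" pattern.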

Aviation: Virgin Australia – Workflow Orchestration between Airline, GDS, CRM and Travel Booking

Virgin Australia is an Australia-based airline and a major player in the Australian aviation industry. Virgin Australia provides a range of domestic and international flights, catering to both leisure and business travelers.
The business flyer loyalty program enhances the utilization of Virgin Australia among business travelers and strengthens the company’s connection with businesses.

Velocity Frequent Flyer Loyalty Program of Virgin Australia

Manual execution of these workflows was inefficient, prone to errors, and costly, leading to delayed feedback for business passengers regarding the status of their earned and redeemed rewards.

The event streaming platform coordinates the events and workflows across multiple systems, such as iFly, the Salesforce CRM, and the global distribution system (GDS) Amadeus.

Ensuring data consistency across real-time and batch systems is one of Kafka’s underestimated sweet spots. Virgin Australia ensures that reward earnings and redemptions are kept in sync across all systems.
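
How could such cross-system consistency look in code? A common pattern is Kafka's transactional consume-process-produce loop, where the derived reward event and the consumed offset are committed atomically. The following is a minimal sketch under assumed topic names (flight-bookings, reward-earnings) and a placeholder transformation – not Virgin Australia's actual implementation:

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class RewardsSyncWorker {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        cProps.put(ConsumerConfig.GROUP_ID_CONFIG, "rewards-sync");
        cProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Read only committed data so aborted transactions never leak downstream.
        cProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        cProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        Properties pProps = new Properties();
        pProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        pProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "rewards-sync-1");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            producer.initTransactions();
            consumer.subscribe(List.of("flight-bookings"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                if (records.isEmpty()) continue;
                producer.beginTransaction();
                Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                for (ConsumerRecord<String, String> rec : records) {
                    // Placeholder: derive one reward event per booking. Output records
                    // and input offsets commit atomically in the same transaction.
                    producer.send(new ProducerRecord<>("reward-earnings", rec.key(), rec.value()));
                    offsets.put(new TopicPartition(rec.topic(), rec.partition()),
                                new OffsetAndMetadata(rec.offset() + 1));
                }
                producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                // Production code would also abort the transaction on processing errors.
                producer.commitTransaction();
            }
        }
    }
}
```

If the worker crashes between beginTransaction() and commitTransaction(), the transaction aborts and a restarted instance reprocesses the same bookings, so no reward earning is lost or duplicated.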

Telco: Globe Telecom – Personalized Rewards Points

Globe Telecom provides the largest mobile network in the Philippines and one of the largest fixed-line and broadband networks.

Batch-based processing was holding the company back from accessing the real-time data it needed to drive personalized marketing campaigns. Globe Group made the leap to an event-driven architecture with Confluent Cloud, replacing its batch-based systems with real-time processing capabilities.

Apache Kafka Data Streaming Journey at Globe Telecom

One of Globe Telecom's first use cases was digital rewards: personalized reward points based on customer purchases. In parallel, Globe could prevent fraud in airtime loans. Real-time processing made the lending workflow much easier to operationalize, whereas with batch processing the top-up cash was often already spent again before the data arrived.

Manufacturing / Automotive: Porsche – Digital Service Platform for Customers, Fans, and Enthusiasts

Porsche is a German automotive manufacturer (one of Volkswagen’s subsidiaries) specializing in high-performance sports cars, SUVs, and sedans, known for its iconic designs and a strong emphasis on driving dynamics and engineering excellence.

Providing points or rewards for buying cars is a difficult strategy for car makers. Most people buy or lease a car only every few years. Hence, car makers often focus on providing a great customer experience during the buying process. ‘My Porsche’ is Porsche’s digital service platform for customers, fans, and enthusiasts across multiple channels like the website, mobile app, and the dealership on site:

My Porsche Omnichannel Digital Customer 360 Platform

A customer 360 view across multiple channels is difficult to implement. Porsche has a central data streaming strategy across data centers, clouds, and regions to ensure data consistency across real-time, batch, and request-response APIs. Porsche’s Streamzilla is the automaker’s one-stop shop for all data streaming needs, powered by Apache Kafka.

Streamzilla enables the data-driven company. It is one central platform, providing a unifying data architecture for setting clean standards and enabling cross-boundary data usage with a single source of truth.

Streamzilla is simplified by design: a single source of truth. The pipeline provides transparency about the cluster state while increasing productivity and enhancing fault tolerance and repeatability through automated execution.

Check out Porsche’s Kafka Summit talk or more details about omnichannel customer 360 architectures leveraging Apache Kafka to learn more.

Sports and Gaming: Disney+ Hotstar – Gamification of Live TV Events and Integration of Affiliates

Hotstar (acquired by Disney) is a popular Indian over-the-top (OTT) streaming service that offers a wide range of content, including movies, TV shows, sports, and original programming. “Hotstar Watch N Play” combines gamification and loyalty, creating a win-win-win for Hotstar, users, and affiliates. The feature was introduced to further engage users during live sports events.

Disney Plus Hotstar Watch n Play with Apache Kafka

Here’s a general overview of how gamification of a live sports event works with “Hotstar Watch N Play”:

  1. Live Streaming of Sports: Hotstar provides live streaming of various sports events, focusing particularly on cricket matches, which are immensely popular in India.
  2. Interactive Experience: The “Watch N Play” feature makes the viewing experience more interactive. Users can predict outcomes, answer trivia questions, and take part in polls related to the live sports event they are watching.
  3. Points and Rewards: Users earn points based on the accuracy of their predictions and their participation in the interactive elements. These points can be redeemed for rewards or showcased on leaderboards.
  4. Leaderboards and Social Interaction: Hotstar often incorporates leaderboards to highlight top scorers among users. This adds a competitive element to the experience, encouraging users to compete with friends and other viewers. Users can also share their achievements on social media.
  5. Engagement and Gamification: “Watch N Play” enhances user engagement by adding a gamified layer to the streaming experience. By blending entertainment with interactivity, Hotstar keeps viewers actively involved during live events.

This infrastructure has to run at extreme scale: millions of actions have to be processed every second. No surprise that Disney+ Hotstar chose Kafka as the heart of this infrastructure. Kafka Connect integrates the various data sources and sinks.
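
A per-window leaderboard score is a natural fit for Kafka Streams' windowed aggregations. The following is a minimal sketch, not Hotstar's actual implementation; the prediction-results topic, the "CORRECT" result encoding, and printing to stdout (a real platform would write to a leaderboard topic or store) are illustrative assumptions:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.TimeWindows;

public class LiveLeaderboardApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical application ID and broker address.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "watch-n-play-leaderboard");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("prediction-results", Consumed.with(Serdes.String(), Serdes.String()))
               // Keep only correct predictions, keyed by user ID.
               .filter((userId, result) -> "CORRECT".equals(result))
               .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
               // Score per user for each one-minute slice of the live match.
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
               .count()
               .toStream()
               // windowedUserId.key() is the user; the count is the per-window score.
               .foreach((windowedUserId, score) ->
                       System.out.printf("%s scored %d this minute%n",
                               windowedUserId.key(), score));

        new KafkaStreams(builder.build(), props).start();
    }
}
```

Because the aggregation is partitioned by user ID, the workload scales horizontally across application instances – the property that matters when millions of predictions arrive per second.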

IoT Integration is often also part of such a customer 360 implementation. Use cases include:

  • Live e-sports events, TV, video streaming and news stations
  • Fan engagement
  • Audience communication
  • Entertaining features for Alexa, Google Home or sports-specific hardware

Public Sector: NSW Australia – Single View of the Citizen and Partner Onboarding

The public sector includes various government organizations, agencies, and institutions at the national, regional, and local levels that provide public goods and services to citizens.

Obviously, a loyalty and rewards platform for the public sector must look different (and goes by another name). The enterprise architecture is similar to all the private-sector examples covered above, though the goals and benefits are different.

Service New South Wales (NSW) is an Australian NSW government agency that delivers the best possible customer experience for people who want to apply for a bushfire support grant, get an energy rebate, manage their driver license, or access any of the many other government services and transactions available within the state of New South Wales.

The agency is part of the NSW government’s greater push to become “the world’s most customer-centric government by 2030.”

A single View of a citizen of NSW Australia

Service NSW partners with 70-plus teams and agencies and offers 200 products, delivering about 1,300 different services and transactions. It’s an enormous effort and creates a complex integration problem from a technology point of view.

There are three components to the Apache Kafka data streaming architecture that make the single view of a citizen possible. The first is the onboarding of partners. In order to build that single view of the customer, Service NSW first has to collect information from 70+ product teams. These include all kinds of data sources, some on public networks, some on private ones.

Australian Government - Single View of Customer

Those integration partners include end-user-facing platforms like Salesforce and the interfaces of the apps customers see, such as the Service NSW Mobile App or MyServiceNSW web app.

Apache Kafka = Data Hub for Customer Loyalty

Loyalty points systems and customer rewards are crucial across most industries for long-term customer retention, increased revenue and visibility of a brand. Contextual information at the right time and data consistency across different applications require a reliable and scalable data hub.

Apache Kafka is the de facto standard for data streaming. Modern enterprise architectures leverage Kafka and its ecosystem to provide a better customer experience with accurate and innovative loyalty services.

What does your loyalty platform and rewards system look like? Do you already leverage data streaming in your enterprise architecture? Or even build context-specific recommendations with stream processing technologies like Kafka Streams or Apache Flink? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Customer Loyalty and Rewards Platform with Apache Kafka appeared first on Kai Waehner.

]]>