How Penske Logistics Transforms Fleet Intelligence with Data Streaming and AI
https://www.kai-waehner.de/blog/2025/06/02/how-penske-logistics-transforms-fleet-intelligence-with-data-streaming-and-ai/ – Mon, 02 Jun 2025

Real-time visibility has become essential in logistics. As supply chains grow more complex, providers must shift from delayed, batch-based systems to event-driven architectures. Data streaming technologies like Apache Kafka and Apache Flink enable this shift by allowing continuous processing of data from telematics, inventory systems, and customer interactions. Penske Logistics is leading the way—using Confluent’s platform to stream and process 190 million IoT messages daily. This powers predictive maintenance, faster roadside assistance, and higher fleet uptime. The result: smarter operations, improved service, and a scalable foundation for the future of logistics.

Real-time visibility is no longer just a competitive advantage in logistics—it’s a business necessity. As global supply chains become more complex and customer expectations rise, logistics providers must respond with agility and precision. That means shifting away from static, delayed data pipelines toward event-driven architectures built around real-time data.

Technologies like Apache Kafka and Apache Flink are at the heart of this transformation. They allow logistics companies to capture, process, and act on streaming data as it’s generated—from vehicle sensors and telematics systems to inventory platforms and customer applications. This enables new use cases in predictive maintenance, live fleet tracking, customer service automation, and much more.

A growing number of companies across the supply chain are embracing this model. Whether it’s real-time shipment tracking, automated compliance reporting, or AI-driven optimization, the ability to stream, process, and route data instantly is proving vital.

One standout example is Penske Logistics—a transportation leader using Confluent’s data streaming platform (DSP) to transform how it operates and delivers value to customers.

How Penske Logistics Transforms Fleet Intelligence with Kafka and AI

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases.

Why Real-Time Data Matters in Logistics and Transportation

Transportation and logistics operate on tighter margins and stricter timelines than almost any other sector. Delays ripple through supply chains, disrupting manufacturing schedules, customer deliveries, and retail inventories. Traditional data integration methods—batch ETL, manual syncing, and siloed systems—simply can’t meet the demands of today’s global logistics networks.

Data streaming enables organizations in the logistics and transportation industry to ingest and process information while the data is still fresh and valuable. Vehicle diagnostics, route updates, inventory changes, and customer interactions can all be captured and acted upon in real time. This leads to faster decisions, more responsive services, and smarter operations.

Real-time data also lays the foundation for advanced use cases in automation and AI, where outcomes depend on immediate context and up-to-date information. And for logistics providers, it unlocks a powerful competitive edge.

Apache Kafka serves as the backbone for real-time messaging—connecting thousands of data producers and consumers across enterprise systems. Apache Flink adds stateful stream processing to the mix, enabling continuous pattern recognition, enrichment, and complex business logic in real time.
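
To make this concrete, here is a minimal sketch (not Penske’s implementation) of a telematics gateway publishing a vehicle sensor reading to a Kafka topic with the standard Java producer API. The broker address, topic name, and payload fields are illustrative assumptions.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class VehicleTelemetryProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");              // assumption: local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String vehicleId = "truck-4711";                            // hypothetical vehicle ID
            String event = "{\"vehicleId\":\"truck-4711\",\"engineTempC\":104.5,\"ts\":1717300000000}";

            // Keying by vehicle ID keeps all readings of one truck ordered on the same partition
            producer.send(new ProducerRecord<>("vehicle.telemetry", vehicleId, event));
            producer.flush();
        }
    }
}
```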

Event-driven Architecture with Data Streaming in Logistics and Transportation using Apache Kafka and Flink

In the logistics industry, this event-driven architecture supports use cases such as:

  • Continuous monitoring of vehicle health and sensor data
  • Proactive maintenance scheduling
  • Real-time fleet tracking and route optimization
  • Integration of telematics, ERP, WMS, and customer systems
  • Instant alerts for service delays or disruptions
  • Predictive analytics for capacity and demand forecasting

This isn’t just theory. Leading logistics organizations are deploying these capabilities at scale.

Data Streaming Success Stories Across the Logistics and Transportation Industry

Many transportation and logistics firms are already using Kafka-based architectures to modernize their operations. A few examples:

  • LKW Walter relies on data streaming to optimize its full truck load (FTL) freight exchanges and enable digital freight matching.
  • Uber Freight leverages real-time telematics, pricing models, and dynamic load assignment across its digital logistics platform.
  • Instacart uses event-driven systems to coordinate live order delivery, matching customer demand with available delivery slots.
  • Maersk incorporates streaming data from containers and ports to enhance shipping visibility and supply chain planning.

These examples show the diversity of value that real-time data brings—across first mile, middle mile, and last mile operations.

An increasing number of companies are using data streaming as the event-driven control tower for their supply chains. It’s not only about real-time insights—it’s also about ensuring consistent data across real-time messaging, HTTP APIs, and batch systems. Learn more in this article: A Real-Time Supply Chain Control Tower powered by Kafka.

Supply Chain Control Tower powered by Data Streaming with Apache Kafka

Penske Logistics: A Leader in Transportation, Fleet Services, and Supply Chain Innovation

Penske Transportation Solutions is one of North America’s most recognizable logistics brands. It provides commercial truck leasing, rental, and fleet maintenance services, operating a fleet of over 400,000 vehicles. Its logistics arm offers freight management, supply chain optimization, and warehousing for enterprise customers.

Penske Logistics
Source: Penske Logistics

But Penske is more than a fleet and logistics company. It’s a data-driven operation where technology plays a central role in service delivery. From vehicle telematics to customer support, Penske is leveraging data streaming and AI to meet growing demands for reliability, transparency, and speed.

Penske’s Data Streaming Success Story

Penske shared its data streaming journey at the Confluent Data in Motion Tour. Sarvant Singh, Vice President of Data and Emerging Solutions at Penske, explains the company’s motivation clearly: “We’re an information-intense business. A lot of information is getting exchanged between our customers, associates, and partners. In our business, vehicle uptime and supply chain visibility are critical.”

This focus on uptime is what drove Penske to adopt a real-time data streaming platform, powered by Confluent. Today, Penske ingests and processes around 190 million IoT messages every day from its vehicles.

Each truck contains hundreds of sensors (and thousands of sub-sensors) that monitor everything from engine performance to braking systems. With this volume of data, traditional architectures fell short. Penske turned to Confluent Cloud to leverage Apache Kafka at scale as a fully managed, elastic SaaS, eliminating the operational burden and unlocking true real-time capabilities.

By streaming sensor data through Confluent and into a proactive diagnostics engine, Penske can now predict when a vehicle may fail—before the problem arises. Maintenance can be scheduled in advance, roadside breakdowns avoided, and customer deliveries kept on track.
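
A minimal sketch of the consuming side of such a pipeline, with a naive threshold rule standing in for the real predictive model. Topic name, consumer group, and the threshold logic are assumptions, not Penske’s actual diagnostics.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ProactiveDiagnostics {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");               // assumption: local broker
        props.put("group.id", "diagnostics-engine");                     // hypothetical consumer group
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("vehicle.telemetry"));            // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Placeholder rule: a real system would feed the reading into a trained model
                    if (needsMaintenance(record.value())) {
                        System.out.println("Schedule preventive maintenance for " + record.key());
                    }
                }
            }
        }
    }

    // Crude JSON extraction and threshold, purely for illustration
    private static boolean needsMaintenance(String telemetryJson) {
        int idx = telemetryJson.indexOf("\"engineTempC\":");
        if (idx < 0) return false;
        int start = idx + "\"engineTempC\":".length();
        double temp = Double.parseDouble(telemetryJson.substring(start).split("[,}]")[0]);
        return temp > 100.0;                                             // assumed threshold
    }
}
```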

This approach has already prevented over 90,000 potential roadside incidents. The business impact is enormous, saving time, money, and reputation.

Other real-time use cases include:

  • Diagnosing issues instantly to dispatch roadside assistance faster
  • Triggering preventive maintenance alerts to avoid unscheduled downtime
  • Automating compliance for IFTA reporting using telematics data
  • Streamlining repair workflows through integration with electronic DVIRs (Driver Vehicle Inspection Reports)

Why Confluent for Apache Kafka?

Managing Kafka in-house was never the goal for Penske. After initially working with a different provider, they transitioned to Confluent Cloud to avoid the complexity and cost of maintaining open-source Kafka themselves.

“We’re not going to put mission-critical applications on an open source tech,” Singh noted. “Enterprise-grade applications require enterprise level support—and Confluent’s business value has been clear.”

Key reasons for choosing Confluent include:

  • The ability to scale rapidly without manual rebalancing
  • Enterprise tooling, including stream governance and connectors
  • Seamless integration with AI and analytics engines
  • Reduced time to market and improved uptime

Data Streaming and AI in Action at Penske

Penske’s investment in AI began in 2015, long before it became a mainstream trend. Early use cases included Erica, a virtual assistant that helps customers manage vehicle reservations. Today, AI is being used to reduce repair times, predict failures, and improve customer service experiences.

By combining real-time data with machine learning, Penske can offer more reliable services and automate decisions that previously required human intervention. AI-enabled diagnostics, proactive maintenance, and conversational assistants are already delivering measurable benefits.

The company is also exploring the role of generative AI. Singh highlighted the potential of technologies like ChatGPT for enterprise applications—but also stressed the importance of controls: “Configuration for risk tolerance is going to be the key. Traceability, explainability, and anomaly detection must be built in.”

Fleet Intelligence in Action: Measurable Business Value Through Data Streaming

For a company operating hundreds of thousands of vehicles, the stakes are high. Penske’s real-time architecture has improved uptime, accelerated response times, and empowered technicians and drivers with better tools.

The business outcomes are clear:

  • Fewer breakdowns and delays
  • Faster resolution of vehicle issues
  • Streamlined operations and reporting
  • Better customer and driver experience
  • Scalable infrastructure for new services, including electric vehicle fleets

With 165,000 vehicles already connected to Confluent and more being added as EV adoption grows, Penske is just getting started.

The Road Ahead: Agentic AI and the Next Evolution of Event-Driven Architecture Powered By Apache Kafka

The future of logistics will be defined by intelligent, real-time systems that coordinate not just vehicles, but entire networks. As Penske scales its edge computing and expands its use of remote sensing and autonomous technologies, the role of data streaming will only increase.

Agentic AI—systems that act autonomously based on real-time context—will require seamless integration of telematics, edge analytics, and cloud intelligence. This demands a resilient, flexible event-driven foundation. I explored the general idea in a dedicated article: How Apache Kafka and Flink Power Event-Driven Agentic AI in Real Time.

Agentic AI with Apache Kafka as Event Broker Combined with MCP and A2A Protocol

Penske’s journey shows that real-time data streaming is not only possible—it’s practical, scalable, and deeply transformative. The combination of a data streaming platform, sensor analytics, and AI allows the company to turn every vehicle into a smart, connected node in a global supply chain.

For logistics providers seeking to modernize, the path is clear. It starts with streaming data—and the possibilities grow from there. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases.

Data Streaming Meets the SAP Ecosystem and Databricks – Insights from SAP Sapphire Madrid
https://www.kai-waehner.de/blog/2025/05/28/data-streaming-meets-the-sap-ecosystem-and-databricks-insights-from-sap-sapphire-madrid/ – Wed, 28 May 2025

SAP Sapphire 2025 in Madrid brought together global SAP users, partners, and technology leaders to showcase the future of enterprise data strategy. Key themes included SAP’s Business Data Cloud (BDC) vision, Joule for Agentic AI, and the deepening SAP-Databricks partnership. A major topic throughout the event was the increasing need for real-time integration across SAP and non-SAP systems—highlighting the critical role of event-driven architectures and data streaming platforms like Confluent. This blog shares insights on how data streaming enhances SAP ecosystems, supports AI initiatives, and enables industry-specific use cases across transactional and analytical domains.

I had the opportunity to attend SAP Sapphire 2025 in Madrid—an impressive gathering of SAP customers, partners, and technology leaders from around the world. It was a massive event, bringing the global SAP community together to explore the company’s future direction, innovations, and growing ecosystem.

A key highlight was SAP’s deepening integration of Databricks as an OEM partner for AI and analytics within the SAP Business Data Cloud—showing how the ecosystem is evolving toward more open, composable architectures.

At the same time, conversations around Confluent and data streaming highlighted the critical role real-time integration plays in connecting SAP systems (including ERP, MES, DataSphere, Databricks, etc.) with the rest of the enterprise. As always, it was a great place to learn, connect, and discuss where enterprise data architecture is heading—and how technologies like data streaming are enabling that transformation.

Data Streaming with Confluent Meets SAP and Databricks for Agentic AI at Sapphire in Madrid

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, focusing on industry scenarios, success stories and business value.

SAP’s Vision: Business Data Cloud, Joule, and Strategic Ecosystem Moves

SAP presented a broad and ambitious strategy centered around the SAP Business Data Cloud (BDC), SAP Joule (including its Agentic AI initiative), and strategic collaborations like SAP Databricks, SAP DataSphere, and integrations across multiple cloud platforms. The vision is clear: SAP wants to connect business processes with modern analytics, AI, and automation.

SAP ERP with Business Technology Platform BTP and Joule for Agentic AI in the Cloud
Source: SAP

For those of us working in data streaming and integration, these developments present a major opportunity. Most customers I meet globally use SAP ERP or other SAP products such as MES, SuccessFactors, or Ariba. The relevance of real-time data streaming in this space is undeniable—and it’s growing.

Building the Bridge: Event-Driven Architecture + SAP

One of the most exciting things about SAP Sapphire is seeing how event-driven architecture is becoming more relevant—even if the conversations don’t start with “Apache Kafka” or “Data Streaming.” In the SAP ecosystem, discussions often focus on business outcomes first, then architecture second. And that’s exactly how it should be.

Many SAP customers are moving toward hybrid cloud environments, where data lives in SAP systems, Salesforce, Workday, ServiceNow, and more. There’s no longer a belief in a single, unified data model. Master Data Management (MDM) as a one-size-fits-all solution has lost its appeal, simply because the real world is more complex.

This is where data streaming with Apache Kafka, Apache Flink, etc. fits in perfectly. Event streaming enables organizations to connect their SAP solutions with the rest of the enterprise—for real-time integration across operational systems, analytics platforms, AI engines, and more. It supports transactional and analytical use cases equally well and can be tailored to each industry’s needs.
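
As a hedged illustration of this kind of integration, the Kafka Streams sketch below joins a stream of hypothetical SAP order events with customer master data from a CRM topic and publishes enriched events that both operational and analytical consumers can reuse. All topic names and the payload handling are assumptions, not a reference integration.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

import java.util.Properties;

public class SapOrderEnrichment {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sap-order-enrichment");  // hypothetical app ID
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // assumption: local broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical topics: order events from SAP, customer master data from a CRM system
        KStream<String, String> orders = builder.stream("sap.orders");
        KTable<String, String> customers = builder.table("crm.customers");

        // Enrich each order (keyed by customer ID) with the latest customer record
        orders.join(customers, (order, customer) ->
                        "{\"order\":" + order + ",\"customer\":" + customer + "}")
              .to("orders.enriched");                                            // reusable downstream topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```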

Data Streaming with Confluent as Integration Middleware for SAP ERP DataSphere Joule Databricks with Apache Kafka

In the SAP ecosystem, customers typically don’t look for open source frameworks to assemble their own solutions—they look for a reliable, enterprise-grade platform that just works. That’s why Confluent’s data streaming platform is an excellent fit: it combines the power of Kafka and Flink with the scalability, security, governance, and cloud-native capabilities enterprises expect.

SAP, Databricks, and Confluent – A Triangular Partnership

At the event, I had some great conversations—often literally sitting between leaders from SAP and Databricks. Watching how these two players are evolving—and where Confluent fits into the picture—was eye-opening.

SAP and Databricks are working closely together, especially with the SAP Databricks OEM offering that integrates Databricks into the SAP Business Data Cloud as an embedded AI and analytics engine. SAP DataSphere also plays a central role here, serving as a gateway into SAP’s structured data.

Meanwhile, Databricks is expanding into the operational domain, not just the analytical lakehouse. After acquiring Neon (a Postgres-compatible cloud-native database), Databricks is expected to announce its own additional transactional OLTP solution soon. This shows how rapidly they’re moving beyond batch analytics into the world of operational workloads—areas where Kafka and event streaming have traditionally provided the backbone.

Enterprise Architecture with Confluent and SAP and Databricks for Analytics and AI

This trend opens up a significant opportunity for data streaming platforms like Confluent to play a central role in modern SAP data architectures. As platforms like Databricks expand their capabilities, the demand for real-time, multi-system integration and cross-platform data sharing continues to grow.

Confluent is uniquely positioned to meet this need—offering not just data movement, but also the ability to process, govern, and enrich data in motion using tools like Apache Flink, and a broad ecosystem of connectors covering transactional systems such as SAP ERP, Oracle databases, and IBM mainframes, as well as cloud services like Snowflake, ServiceNow, and Salesforce.

Data Products, Not Just Pipelines

The term “data product” was mentioned in nearly every conversation—whether from the SAP angle (business semantics and ownership), Databricks (analytics-first), or Confluent (independent, system-agnostic, streaming-native). The key message? Everyone wants real-time, reusable, discoverable data products.

Data Product - The Domain Driven Microservice for Data

This is where an event-driven architecture powered by a data streaming platform shines: Data Streaming connects everything and distributes data to both operational and analytical systems, with governance, durability, and flexibility at the core.

Confluent’s data streaming platform enables the creation of data products from a wide range of enterprise systems, complementing the SAP data products being developed within the SAP Business Data Cloud. The strength of the partnership lies in the ability to combine these assets—bringing together SAP-native data products with real-time, event-driven data products built from non-SAP systems connected through Confluent. This integration creates a unified, scalable foundation for both operational and analytical use cases across the enterprise.

Industry-Specific Use Cases to Explore the Business Value of SAP and Data Streaming

One major takeaway: in the SAP ecosystem, generic messaging around cutting-edge technologies such as Apache Kafka does not work. Success comes from being well-prepared—knowing which SAP systems are involved (ECC, S/4HANA, on-prem, or cloud) and what role they play in the customer’s architecture. The conversations must be use case-driven, often tailored to industries like manufacturing, retail, logistics, or the public sector.

This level of specificity is new to many people working in the technical world of Kafka, Flink, and data streaming. Developers and architects often approach integration from a tool- or framework-centric perspective. However, SAP customers expect business-aligned solutions that address concrete pain points in their domain—whether it’s real-time order tracking in logistics, production analytics in manufacturing, or spend transparency in the public sector.

Understanding the context of SAP’s role in the business process, along with industry regulations, workflows, and legacy system constraints, is key to having meaningful conversations. For the data streaming community, this is a shift in mindset—from building pipelines to solving business problems—and it represents a major opportunity to bring strategic value to enterprise customers.

You are lucky: I just published a free ebook about data streaming use cases focusing on industry scenarios and business value: “The Ultimate Data Streaming Guide”.

Looking Forward: SAP, Data Streaming, AI, and Open Table Formats

Another theme to watch: data lake and format standardization. All cloud providers and data vendors like Databricks, Confluent or Snowflake are investing heavily in supporting open table formats like Apache Iceberg (alongside Delta Lake at Databricks) to standardize analytical integrations and reduce storage costs significantly.

SAP’s investment in Agentic AI through SAP Joule reflects a broader trend across the enterprise software landscape, with vendors like Salesforce, ServiceNow, and others embedding intelligent agents into their platforms. This creates a significant opportunity for Confluent to serve as the streaming backbone—enabling real-time coordination, integration, and decision-making across these diverse, distributed systems.

An event-driven architecture powered by data streaming is crucial for the success of Agentic AI with SAP Joule, Databricks AI agents, and other operational systems that need to be integrated into the business processes. The strategic partnership between Confluent and Databricks makes it even easier to implement end-to-end AI pipelines across the operational and analytical estates.

SAP Sapphire Madrid was a valuable reminder that data streaming is no longer a niche technology—it’s a foundation for digital transformation. Whether it’s SAP ERP, Databricks AI, or new cloud-native operational systems, a Data Streaming Platform connects them all in real time to enable new business models, better customer experiences, and operational agility.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, focusing on industry scenarios, success stories and business value.

Agentic AI with the Agent2Agent Protocol (A2A) and MCP using Apache Kafka as Event Broker
https://www.kai-waehner.de/blog/2025/05/26/agentic-ai-with-the-agent2agent-protocol-a2a-and-mcp-using-apache-kafka-as-event-broker/ – Mon, 26 May 2025

Agentic AI is emerging as a powerful pattern for building autonomous, intelligent, and collaborative systems. To move beyond isolated models and task-based automation, enterprises need a scalable integration architecture that supports real-time interaction, coordination, and decision-making across agents and services. This blog explores how the combination of Apache Kafka, Model Context Protocol (MCP), and Google’s Agent2Agent (A2A) protocol forms the foundation for Agentic AI in production. By replacing point-to-point APIs with event-driven communication as the integration layer, enterprises can achieve decoupling, flexibility, and observability—unlocking the full potential of AI agents in modern enterprise environments.

Agentic AI is gaining traction as a design pattern for building more intelligent, autonomous, and collaborative systems. Unlike traditional task-based automation, agentic AI involves intelligent agents that operate independently, make contextual decisions, and collaborate with other agents or systems—across domains, departments, and even enterprises.

In the enterprise world, agentic AI is more than just a technical concept. It represents a shift in how systems interact, learn, and evolve. But unlocking its full potential requires more than AI models and point-to-point APIs—it demands the right integration backbone.

That’s where Apache Kafka as an event broker for true decoupling comes into play, together with two emerging AI standards: Google’s Agent2Agent (A2A) Protocol and Anthropic’s Model Context Protocol (MCP), in an enterprise architecture for Agentic AI.

Agentic AI with Apache Kafka as Event Broker Combined with MCP and A2A Protocol

Inspired by my colleague Sean Falconer’s blog post, Why Google’s Agent2Agent Protocol Needs Apache Kafka, this blog post explores Agentic AI adoption in enterprises and how an event-driven architecture with Apache Kafka fits into the AI architecture.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including various AI examples across industries.

Business Value of Agentic AI in the Enterprise

For enterprises, the promise of agentic AI is compelling:

  • Smarter automation through self-directed, context-aware agents
  • Improved customer experience with faster and more personalized responses
  • Operational efficiency by connecting internal and external systems more intelligently
  • Scalable B2B interactions that span suppliers, partners, and digital ecosystems

But none of this works if systems are coupled by brittle point-to-point APIs, slow batch jobs, or disconnected data pipelines. Autonomous agents need continuous, real-time access to events, shared state, and a common communication fabric that scales across use cases.

Model Context Protocol (MCP) + Agent2Agent (A2A): New Standards for Agentic AI

The Model Context Protocol (MCP), introduced by Anthropic, offers a standardized, model-agnostic interface for context exchange between AI agents and external systems. Whether the interaction is streaming, batch, or API-based, MCP abstracts how agents retrieve inputs, send outputs, and trigger actions across services. This enables real-time coordination between models and tools—improving autonomy, reusability, and interoperability in distributed AI systems.

Model Context Protocol MCP by Anthropic
Source: Anthropic

Google’s Agent2Agent (A2A) protocol complements this by defining how autonomous software agents can interact with one another in a standard way. A2A enables scalable agent-to-agent collaboration—where agents discover each other, share state, and delegate tasks without predefined integrations. It’s foundational for building open, multi-agent ecosystems that work across departments, companies, and platforms.

Agent2Agent A2A Protocol by Google and MCP
Source: Google

Why Apache Kafka Is a Better Fit Than an API (HTTP/REST) for A2A and MCP

Most enterprises today use HTTP-based APIs to connect services—ideal for simple, synchronous request-response interactions.

In contrast, Apache Kafka is a distributed event streaming platform designed for asynchronous, high-throughput, and loosely coupled communication—making it a much better fit for multi-agent (A2A) and agentic AI architectures.

API-Based Integration vs. Kafka-Based Integration:

  • API: Synchronous, blocking | Kafka: Asynchronous, event-driven
  • API: Point-to-point coupling | Kafka: Loose coupling with pub/sub topics
  • API: Hard to scale to many agents | Kafka: Supports multiple consumers natively
  • API: No shared memory | Kafka: Retains and replays event history
  • API: Limited observability | Kafka: Full traceability with schema registry & DLQs

Kafka serves as the decoupling layer. It becomes the place where agents publish their state, subscribe to updates, and communicate changes—independently and asynchronously. This enables multi-agent coordination, resilience, and extensibility.
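
A minimal sketch of that pattern, with all topic names and the processing step as illustrative assumptions: an agent consumes task events from one topic and publishes its resulting state to another, so any other agent or system can react without a direct connection to it.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class AutonomousAgent {

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");       // assumption: local broker
        consumerProps.put("group.id", "fraud-agent");                    // hypothetical agent identity
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {

            consumer.subscribe(List.of("agent.tasks"));                  // hypothetical task topic
            while (true) {
                ConsumerRecords<String, String> tasks = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> task : tasks) {
                    // Placeholder for the agent's own reasoning or model call
                    String result = "{\"taskId\":\"" + task.key() + "\",\"status\":\"DONE\"}";

                    // Publish the outcome so other agents and systems can react asynchronously
                    producer.send(new ProducerRecord<>("agent.status", task.key(), result));
                }
            }
        }
    }
}
```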

MCP + Kafka = Open, Flexible Communication

As the adoption of Agentic AI accelerates, there’s a growing need for scalable communication between AI agents, services, and operational systems. The Model Context Protocol (MCP) is emerging as a standard to structure these interactions—defining how agents access tools, send inputs, and receive results. But a protocol alone doesn’t solve the challenges of integration, scaling, or observability.

This is where Apache Kafka comes in.

By combining MCP with Kafka, agents can interact through a Kafka topic—fully decoupled, asynchronous, and in real time. Instead of direct, synchronous calls between agents and services, all communication happens through Kafka topics, using structured events based on the MCP format.

This model supports a wide range of implementations and tech stacks. For instance:

  • A Python-based AI agent deployed in a SaaS environment
  • A Spring Boot Java microservice running inside a transactional core system
  • A Flink application deployed at the edge performing low-latency stream processing
  • An API gateway translating HTTP requests into MCP-compliant Kafka events

Regardless of where or how an agent is implemented, it can participate in the same event-driven system. Kafka ensures durability, replayability, and scalability. MCP provides the semantic structure for requests and responses.
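
To illustrate the idea, the sketch below publishes a made-up, MCP-inspired request envelope (not the official MCP schema) to a hypothetical request topic, with a correlation ID header so that an asynchronous reply on a response topic can be matched to the original request. All field names, topics, and headers are assumptions.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.UUID;

public class McpStyleRequestPublisher {

    // Illustrative request envelope; the real MCP message format is defined by the protocol spec
    record ToolRequest(String agentId, String tool, String input) {
        String toJson() {
            return "{\"agentId\":\"" + agentId + "\",\"tool\":\"" + tool + "\",\"input\":\"" + input + "\"}";
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");               // assumption: local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ToolRequest request = new ToolRequest("pricing-agent", "get_order_status", "order-42");

            ProducerRecord<String, String> record =
                    new ProducerRecord<>("mcp.requests", request.agentId(), request.toJson());

            // Correlation ID lets the requesting agent match the asynchronous reply on a response topic
            String correlationId = UUID.randomUUID().toString();
            record.headers().add("correlation-id", correlationId.getBytes(StandardCharsets.UTF_8));

            producer.send(record);
        }
    }
}
```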

Agentic AI with Apache Kafka as Event Broker

The result is a highly flexible, loosely coupled architecture for Agentic AI—one that supports real-time processing, cross-system coordination, and long-term observability. This combination is already being explored in early enterprise projects and will be a key building block for agent-based systems moving into production.

Stream Processing as the Agent’s Companion

Stream processing technologies like Apache Flink or Kafka Streams allow agents to:

  • Filter, join, and enrich events in motion
  • Maintain stateful context for decisions (e.g., real-time credit risk)
  • Trigger new downstream actions based on complex event patterns
  • Apply AI directly within the stream processing logic, enabling real-time inference and contextual decision-making with embedded models or external calls to a model server, vector database, or any other AI platform

Agents don’t need to manage all logic themselves. The data streaming platform can pre-process information, enforce policies, and even trigger fallback or compensating workflows—making agents simpler and more focused.
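
As a small example of such pre-processing, the Kafka Streams sketch below keeps stateful context by counting events per key in five-minute windows, which an agent could consume as input for a decision (e.g., a simple velocity check). Topic name, window size, and threshold are assumptions.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.TimeWindows;

import java.time.Duration;
import java.util.Properties;

public class EventVelocityCheck {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-velocity-check");  // hypothetical app ID
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // assumption: local broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Count payment events per account in 5-minute windows (state is fault-tolerant via changelog topics)
        builder.stream("payments")                                               // hypothetical topic
               .groupByKey()
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
               .count()
               .toStream()
               .foreach((windowedAccount, count) -> {
                   if (count > 10) {                                             // illustrative threshold
                       System.out.println("High activity for " + windowedAccount.key() + ": " + count);
                   }
               });

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```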

Technology Flexibility for Agentic AI Design with Data Contracts

One of the biggest advantages of a Kafka-based, event-driven, and decoupled backend for agentic systems is that agents can be implemented in any stack:

  • Languages: Python, Java, Go, etc.
  • Environments: Containers, serverless, JVM apps, SaaS tools
  • Communication styles: Event streaming, REST APIs, scheduled jobs

The Kafka topic is the stable data contract for quality and policy enforcement. Agents can evolve independently, be deployed incrementally, and interoperate without tight dependencies.

Microservices, Data Products, and Reusability – Agentic AI Is Just One Piece of the Puzzle

To be effective, Agentic AI needs to connect seamlessly with existing operational systems and business workflows.

Kafka topics enable the creation of reusable data products that serve multiple consumers—AI agents, dashboards, services, or external partners. This aligns perfectly with data mesh and microservice principles, where ownership, scalability, and interoperability are key.

Agent2Agent Protocol (A2A) and MCP via Apache Kafka as Event Broker for Truly Decoupled Agentic AI

A single stream of enriched order events might be consumed via a single data product by:

  • A fraud detection agent
  • A real-time alerting system
  • An agent triggering SAP workflow updates
  • A lakehouse for reporting and batch analytics

This one-to-many model is the opposite of traditional REST designs and crucial for enabling agentic orchestration at scale.
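
In Kafka terms, this one-to-many distribution falls out of the consumer group model: each group receives the full stream of the shared topic independently. Here is a minimal sketch, with the topic and group names as assumptions.

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.util.List;
import java.util.Properties;

public class OrderEventConsumers {

    // Each consumer group receives the full stream of the shared topic independently
    static KafkaConsumer<String, String> consumerFor(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");       // assumption: local broker
        props.put("group.id", groupId);
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("orders.enriched"));          // hypothetical shared data product topic
        return consumer;
    }

    public static void main(String[] args) {
        // The same events are delivered to both groups without any extra pipeline
        KafkaConsumer<String, String> fraudAgent = consumerFor("fraud-detection-agent");
        KafkaConsumer<String, String> alerting  = consumerFor("real-time-alerting");
        // ... each consumer polls independently at its own pace (poll loops omitted for brevity)
    }
}
```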

Agentic AI Needs Integration with Core Enterprise Systems

Agentic AI is not a standalone trend—it’s becoming an integral part of broader enterprise AI strategies. While this post focuses on architectural foundations like Kafka, MCP, and A2A, it’s important to recognize how this infrastructure complements the evolution of major AI platforms.

Leading vendors such as Databricks, Snowflake, and others are building scalable foundations for machine learning, analytics, and generative AI. These platforms often handle model training and serving. But to bring agentic capabilities into production—especially for real-time, autonomous workflows—they must connect with operational, transactional systems and other agents at runtime. (See also: Confluent + Databricks blog series | Apache Kafka + Snowflake blog series)

This is where Kafka as the event broker becomes essential: it links these analytical backends with AI agents, transactional systems, and streaming pipelines across the enterprise.

At the same time, enterprise application vendors are embedding AI assistants and agents directly into their platforms:

  • SAP Joule / Business AI – Embedded AI for finance, supply chain, and operations
  • Salesforce Einstein / Copilot Studio – Generative AI for CRM and sales automation
  • ServiceNow Now Assist – Predictive automation across IT and employee services
  • Oracle Fusion AI / OCI – ML for ERP, HCM, and procurement
  • Microsoft Copilot – Integrated AI across Dynamics and Power Platform
  • IBM watsonx, Adobe Sensei, Infor Coleman AI – Governed, domain-specific AI agents

Each of these solutions benefits from the same architectural foundation: real-time data access, decoupled integration, and standardized agent communication.

Whether deployed internally or sourced from vendors, agents need reliable event-driven infrastructure to coordinate with each other and with backend systems. Apache Kafka provides this core integration layer—supporting a consistent, scalable, and open foundation for agentic AI across the enterprise.

Agentic AI Requires Decoupling – Apache Kafka Supports A2A and MCP as an Event Broker

To deliver on the promise of agentic AI, enterprises must move beyond point-to-point APIs and batch integrations. They need a shared, event-driven foundation that enables agents (and other enterprise software) to work independently and together—with shared context, consistent data, and scalable interactions.

Apache Kafka provides exactly that. Combined with MCP and A2A for standardized Agentic AI communication, Kafka unlocks the flexibility, resilience, and openness needed for next-generation enterprise AI.

It’s not about picking one agent platform—it’s about giving every agent the same, reliable interface to the rest of the world. Kafka is that interface.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including various AI examples across industries.

Powering Fantasy Sports at Scale: How Dream11 Uses Apache Kafka for Real-Time Gaming
https://www.kai-waehner.de/blog/2025/05/19/powering-fantasy-sports-at-scale-how-dream11-uses-apache-kafka-for-real-time-gaming/ – Mon, 19 May 2025

Fantasy sports has evolved into a data-driven, real-time digital industry with high stakes and massive user engagement. At the heart of this transformation is Dream11, India’s leading fantasy sports platform, which relies on Apache Kafka to deliver instant updates, seamless gameplay, and trustworthy user experiences for over 230 million fans. This blog post explores how Dream11 leverages Kafka to meet extreme traffic demands, scale infrastructure efficiently, and maintain real-time responsiveness—even during the busiest moments of live sports.

Fantasy sports has become one of the most dynamic and data-intensive digital industries of the past decade. What started as a casual game for sports fans has evolved into a massive business, blending real-time analytics, mobile engagement, and personalized gaming experiences. At the center of this transformation is Apache Kafka—a critical enabler for platforms like Dream11, where millions of users expect live scores, instant feedback, and seamless gameplay. This post explores how fantasy sports works, why real-time data is non-negotiable, and how Dream11 has scaled its Kafka infrastructure to handle some of the world’s most demanding user traffic patterns.

Real Time Gaming with Apache Kafka Powers Dream11 Fantasy Sports

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including several success stories around gaming, loyalty platforms, and personalized advertising.

Fantasy Sports: Real-Time Gaming Meets Real-World Sports

Fantasy sports allows users to create virtual teams based on real-life athletes. As matches unfold, players earn points based on the performance of their selected athletes. The better the team performs, the higher the user’s score—and the bigger the prize.
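
As a tiny illustration of the scoring mechanic, the sketch below sums points for a selected team. The point values are invented for the example and are not Dream11’s actual rules.

```java
import java.util.List;

public class FantasyScoring {

    // One real-world performance line for an athlete in a match
    record PlayerPerformance(String player, int runs, int wickets, int catches) {}

    // Hypothetical weights; real fantasy platforms define their own, sport-specific rules
    static int points(PlayerPerformance p) {
        return p.runs() * 1 + p.wickets() * 25 + p.catches() * 8;
    }

    public static void main(String[] args) {
        List<PlayerPerformance> team = List.of(
                new PlayerPerformance("Batter A", 74, 0, 1),
                new PlayerPerformance("Bowler B", 12, 3, 0));

        // The user's total score is simply the sum over the selected athletes
        int total = team.stream().mapToInt(FantasyScoring::points).sum();
        System.out.println("Team score: " + total);   // 82 + 87 = 169
    }
}
```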

Key characteristics of fantasy gaming:

  • Multi-sport experience: Users can play across cricket, football, basketball, and more.
  • Live interaction: Scoring is updated in real time as matches progress.
  • Contests and leagues: Players join public or private contests, often with cash prizes.
  • Peak traffic patterns: Most activity spikes in the minutes before a match begins.

This user behavior creates a unique business and technology challenge. Millions of users make critical decisions at the same time, just before the start of each game. The result: extreme concurrency, massive request volumes, and a hard dependency on data accuracy and low latency.

Real-time infrastructure isn’t optional in this model. It’s fundamental to user trust and business success.

Dream11: A Fantasy Sports Giant with Massive Scale

Founded in India, Dream11 is the largest fantasy sports platform in the country—and one of the biggest globally. With over 230 million users, it dominates fantasy gaming across cricket and 11 other sports. The platform sees traffic that rivals the world’s largest digital services.

Dream11 Mobile App
Source: Dream11

Bipul Karnanit from Dream11 presented a very interesting overview at Current 2025 in Bangalore, India. Here are a few statistics about Dream11’s scale:

  • 230M users
  • 12 sports
  • 12,000 matches/year
  • 44TB data per day
  • 15M+ peak concurrent users
  • 43M+ peak transactions/day

During major events like the IPL, Dream11 experiences hockey-stick traffic curves, where tens of millions of users log in just minutes before a match begins—making lineup changes, joining contests, and waiting for live updates.

This creates a business-critical need for:

  • Low latency
  • Guaranteed data consistency
  • Fault tolerance
  • Real-time analytics and scoring
  • High developer productivity to iterate fast

Apache Kafka at the Heart of Dream11’s Platform

To meet these demands, Dream11 uses Apache Kafka as the foundation of its real-time data infrastructure. Kafka powers the messaging between services that manage user actions, match scores, payouts, leaderboards, and more.

Apache Kafka enables:

  • Event-driven microservices
  • Scalable ingestion and processing of user and game data
  • Loose coupling between systems with data products for operational and analytical consumers
  • High throughput with guaranteed ordering and durability

Event-driven Architecture with Data Streaming for Gaming using Apache Kafka and Flink

Solving Kafka Consumer Challenges at Scale

As the business grew, Dream11’s engineering team encountered challenges with Kafka’s standard consumer APIs, particularly around rebalancing, offset management, and processing guarantees under peak load.

To address these issues, Dream11 built a custom Java-based Kafka consumer library—a foundational component of its internal platform that simplifies Kafka integration across services and boosts developer productivity.

Dream11 Kafka Consumer Library:

  • Purpose: A custom-built Java library designed to handle high-volume Kafka message consumption at Dream11 scale.
  • Key Benefit: Abstracts away low-level Kafka consumer details, simplifying tasks like offset management, error handling, and multi-threading, allowing developers to focus on business logic.
  • Simple Interfaces: Provides easy-to-use interfaces for processing records.
  • Increased Developer Productivity: A standardized library leads to faster development and fewer errors.

This library plays a crucial role in enabling real-time updates and ensuring seamless gameplay—even under the most demanding user scenarios.

For deeper technical insights, including how Dream11 decoupled polling and processing, implemented at-least-once delivery, and improved throughput with custom worker pools, watch the Dream11 engineering session from Current India 2025 presented by Bipul Karnanit.
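
The sketch below is not Dream11’s library, but a generic illustration of the pattern described above: the poll loop hands records to a worker pool, and offsets are committed only after the whole batch has been processed, which yields at-least-once semantics. Topic, group, and pool size are assumptions, and error handling, retries, and dead letter queues are omitted.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WorkerPoolConsumer {

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");           // assumption: local broker
        props.put("group.id", "contest-join-processor");             // hypothetical consumer group
        props.put("enable.auto.commit", "false");                    // commit only after processing
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        ExecutorService workers = Executors.newFixedThreadPool(8);   // processing decoupled from the poll thread

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("user.actions"));              // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> batch = consumer.poll(Duration.ofMillis(200));
                if (batch.isEmpty()) continue;

                List<Callable<Void>> tasks = new ArrayList<>();
                for (ConsumerRecord<String, String> record : batch) {
                    tasks.add(() -> {
                        process(record);                              // business logic runs on the worker pool
                        return null;
                    });
                }

                workers.invokeAll(tasks);                             // block until the whole batch is done
                consumer.commitSync();                                // then commit offsets: at-least-once
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.println("Processing " + record.key());
    }
}
```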

Fantasy Sports, Real-Time Expectations, and Business Value

Dream11’s business success is built on user trust, real-time responsiveness, and high-quality gameplay. With millions of users relying on accurate, timely updates, the platform can’t afford downtime, data loss, or delays.

Data Streaming with Apache Kafka enables Dream11 to:

  • React to user interactions instantly
  • Deliver consistent data across microservices and devices
  • Scale dynamically during live events
  • Streamline the development and deployment of new features

This is not just a backend innovation—it’s a competitive advantage in a space where milliseconds matter and trust is everything.

Dream11’s Kafka Journey: The Backbone of Fantasy Sports at Scale

Fantasy sports is one of the most demanding environments for real-time data platforms. Dream11’s approach—scaling Apache Kafka to serve hundreds of millions of events with precision—is a powerful example of aligning architecture with business needs.

As more industries adopt event-driven systems, Dream11’s journey offers a clear message: Apache Kafka is not just a messaging layer—it’s a strategic platform for building reliable, low-latency digital experiences at scale.

Whether you’re in gaming, finance, telecom, or logistics, there’s much to learn from the way fantasy sports leaders like Dream11 harness data streaming to deliver world-class services.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including several success stories around gaming, loyalty platforms, and personalized advertising.

Databricks and Confluent Leading Data and AI Architectures – What About Snowflake, BigQuery, and Friends?
https://www.kai-waehner.de/blog/2025/05/15/databricks-and-confluent-leading-data-and-ai-architectures-what-about-snowflake-bigquery-and-friends/ – Thu, 15 May 2025

Confluent, Databricks, and Snowflake are trusted by thousands of enterprises to power critical workloads—each with a distinct focus: real-time streaming, large-scale analytics, and governed data sharing. Many customers use them in combination to build flexible, intelligent data architectures. This blog highlights how Erste Bank uses Confluent and Databricks to enable generative AI in customer service, while Siemens combines Confluent and Snowflake to optimize manufacturing and healthcare with a shift-left approach. Together, these examples show how a streaming-first foundation drives speed, scalability, and innovation across industries.

The modern data landscape is shaped by platforms that excel in different but increasingly overlapping domains. Confluent leads in data streaming with enterprise-grade infrastructure for real-time data movement and processing. Databricks and Snowflake dominate the lakehouse and analytics space—each with unique strengths. Databricks is known for scalable AI and machine learning pipelines, while Snowflake stands out for its simplicity, governed data sharing, and performance in cloud-native analytics.

This final blog in the series brings together everything covered so far and highlights how these technologies power real customer innovation. At Erste Bank, Confluent and Databricks are combined to build an event-driven architecture for Generative AI use cases in customer service. At Siemens, Confluent and Snowflake support a shift-left architecture to drive real-time manufacturing insights and medical AI—using streaming data not just for analytics, but also to trigger operational workflows across systems.

Together, these examples show why so many enterprises adopt a multi-platform strategy—with Confluent as the event-driven backbone, and Databricks or Snowflake (or both) as the downstream platforms for analytics, governance, and AI.

Data Streaming Lake Warehouse and Lakehouse with Confluent Databricks Snowflake using Iceberg and Tableflow Delta Lake

About the Confluent and Databricks Blog Series

This article is part of a blog series exploring the growing roles of Confluent and Databricks in modern data and AI architectures.

Future articles will explore how these platforms affect data use in businesses. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to analytical platforms like Databricks and Snowflake.

The Broader Data Streaming and Lakehouse Landscape

The data streaming and lakehouse space continues to expand, with a variety of platforms offering different capabilities for real-time processing, analytics, and storage.

Data Streaming Market

On the data streaming side, Confluent is the leader. Other cloud-native services like Amazon MSK, Azure Event Hubs, and Google Cloud Managed Kafka provide Kafka-compatible offerings, though they vary in protocol support, ecosystem maturity, and operational simplicity. StreamNative, based on Apache Pulsar, competes with the Kafka offerings, while Decodable and DeltaStream leverage Apache Flink for real-time stream processing using a complementary approach. Startups such as AutoMQ and BufStream aim to reimagine Kafka infrastructure for improved scalability and cost efficiency in cloud-native architectures.

The data streaming landscape is growing year by year. Here is the latest overview of the data streaming market:

The Data Streaming Landscape 2025 with Kafka Flink Confluent Amazon MSK Cloudera Event Hubs and Other Platforms

Lakehouse Market

In the lakehouse and analytics platform category, Databricks leads with its cloud-native model combining compute and storage, enabling modern lakehouse architectures. Snowflake is another leading cloud data platform, praised for its ease of use, strong ecosystem, and ability to unify diverse analytical workloads. Microsoft Fabric aims to unify data engineering, real-time analytics, and AI on Azure under one platform. Google BigQuery offers a serverless, scalable solution for large-scale analytics, while platforms like Amazon Redshift, ClickHouse, and Athena serve both traditional and high-performance OLAP use cases.

The Forrester Wave for Lakehouses analyzes the vendor options, showing Databricks, Snowflake, and Google as the leaders. Unfortunately, republishing the Forrester Wave is not permitted, so you need to download it from a vendor.

Confluent + Databricks

This blog series highlights Databricks and Confluent because they represent a powerful combination at the intersection of data streaming and the lakehouse paradigm. Together, they enable real-time, AI-driven architectures that unify operational and analytical workloads across modern enterprise environments.

Each platform in the data streaming and Lakehouse space has distinct advantages, but none offer the same combination of real-time capabilities, open architecture, and end-to-end integration as Confluent and Databricks.

It’s also worth noting that open source remains a big – if not the biggest – competitor to all of these vendors. Many enterprises still rely on open-source data lakes built on Elastic, legacy Hadoop, or open table formats such as Apache Hudi—favoring flexibility and cost control over fully managed services.

Confluent: The Leading Data Streaming Platform (DSP)

Confluent is the enterprise-standard platform for data streaming, built on Apache Kafka and extended for cloud-native, real-time operations at global scale. The data streaming platform (DSP) delivers a complete and unified platform with multiple deployment options to meet diverse needs and budgets:

  • Confluent Cloud – Fully managed, serverless Kafka and Flink service across AWS, Azure, and Google Cloud
  • Confluent Platform – Self-managed software for on-premises, private cloud, or hybrid environments
  • WarpStream – Kafka-compatible, cloud-native infrastructure optimized for BYOC (Bring Your Own Cloud) using low-cost object storage like S3

Together, these options offer cost efficiency and flexibility across a wide range of streaming workloads:

  • Small-volume, mission-critical use cases such as payments or fraud detection, where zero data loss, strict SLAs, and low latency are non-negotiable
  • High-volume, analytics-driven use cases like clickstream processing for real-time personalization and recommendation engines, where throughput and scalability are key
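
As a concrete illustration of how differently these two workload profiles are tuned on the client side, here is a hedged sketch using standard Apache Kafka producer settings; the broker address and the concrete values are assumptions to be adapted per workload.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class WorkloadTunedProducers {

    static Properties base() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumption: local broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        return props;
    }

    public static void main(String[] args) {
        // Profile 1: payments / fraud detection – prioritize zero data loss and ordering
        Properties payments = base();
        payments.put(ProducerConfig.ACKS_CONFIG, "all");                  // wait for all in-sync replicas
        payments.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");   // no duplicates on retry

        // Profile 2: clickstream – prioritize throughput and cost efficiency
        Properties clickstream = base();
        clickstream.put(ProducerConfig.LINGER_MS_CONFIG, "50");           // batch more aggressively
        clickstream.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");   // trade CPU for network/storage

        try (KafkaProducer<String, String> paymentsProducer = new KafkaProducer<>(payments);
             KafkaProducer<String, String> clickProducer = new KafkaProducer<>(clickstream)) {
            // ... send() calls as in the other sketches
        }
    }
}
```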

Confluent supports these use cases with:

  • Cluster Linking for real-time, multi-region and hybrid cloud data movement
  • 100+ prebuilt connectors for seamless integration with enterprise systems and cloud services
  • Apache Flink for rich stream processing at scale
  • Governance and observability with Schema Registry, Stream Catalog, role-based access control, and SLAs
  • Tableflow for native integration with Delta Lake, Apache Iceberg, and modern lakehouse architectures

While other providers offer fragments—such as Amazon MSK for basic Kafka infrastructure or Azure Event Hubs for ingestion—only Confluent delivers a unified, cloud-native data streaming platform with consistent operations, tooling, and security across environments.

Confluent is trusted by over 6,000 enterprises and backed by deep experience in large-scale streaming deployments, hybrid architectures, and Kafka migrations. It combines industry-leading technology with enterprise-grade support, expertise, and consulting services to help organizations turn real-time data into real business outcomes—securely, efficiently, and at any scale.

Databricks: The Leading Lakehouse for AI and Analytics

Databricks is the leading platform for unified analytics, data engineering, and AI—purpose-built to help enterprises turn massive volumes of data into intelligent, real-time decision-making. Positioned as the Data Intelligence Platform, Databricks combines a powerful lakehouse foundation with full-spectrum AI capabilities, making it the platform of choice for modern data teams.

Its core strengths include:

  • Delta Lake + Unity Catalog – A robust foundation for structured, governed, and versioned data at scale
  • Apache Spark – Distributed compute engine for ETL, data preparation, and batch/stream processing
  • MosaicML – End-to-end tooling for efficient model training, fine-tuning, and deployment of custom AI models
  • AI/ML tools for data scientists, ML engineers, and analysts—integrated across the platform
  • Native connectors to BI tools (like Power BI, Tableau) and MLOps platforms for model lifecycle management

Databricks directly competes with Snowflake, especially in the enterprise AI and analytics space. While Snowflake shines with simplicity and governed warehousing, Databricks differentiates by offering a more flexible and performant platform for large-scale model training and advanced AI pipelines.

The platform supports:

  • Batch and (sort of) streaming analytics
  • ML model training and inference on shared data
  • GenAI use cases, including RAG (Retrieval-Augmented Generation) with unstructured and structured sources
  • Data sharing and collaboration across teams and organizations with open formats and native interoperability

Databricks is trusted by thousands of organizations for AI workloads, offering not only powerful infrastructure but also integrated governance, observability, and scalability—whether deployed on a single cloud or across multi-cloud environments.

Combined with Confluent’s real-time data streaming capabilities, Databricks completes the AI-driven enterprise architecture by enabling organizations to analyze, model, and act on high-quality, real-time data at scale.

Stronger Together: A Strategic Alliance for Data and AI with Tableflow and Delta Lake

Confluent and Databricks are not trying to replace each other. Their partnership is strategic and product-driven.

Recent innovation: Tableflow + Delta Lake – this feature enables bi-directional data exchange between Kafka and Delta Lake.

  • Direction 1: Confluent streams → Tableflow → Delta Lake (via Unity Catalog)
  • Direction 2: Databricks insights → Tableflow → Kafka → Flink or other operational systems

This simplifies architecture, reduces cost and latency, and removes the need for Spark jobs to manage streaming data.

Confluent Tableflow for Open Table Format Integration with Databricks Snowflake BigQuery via Apache Iceberg Delta Lake
Source: Confluent

Confluent becomes the operational data backbone for AI and analytics. Databricks becomes the analytics and AI engine fed with data from Confluent.

Where needed, operational or analytical real-time AI predictions can be done within Confluent’s data streaming platform: with embedded or remote model inference, native integration for search with vector databases, and built-in models for common predictive use cases such as forecasting.
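
As a hedged illustration of the remote model inference option mentioned above, the sketch below consumes enriched events, calls a hypothetical model-serving HTTP endpoint, and publishes the prediction to a downstream topic. Endpoint URL, topics, and payload shape are assumptions; managed alternatives such as running the inference inside Flink are not shown.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class StreamingModelInference {

    public static void main(String[] args) throws Exception {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");    // assumption: local broker
        consumerProps.put("group.id", "forecast-inference");          // hypothetical consumer group
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());

        HttpClient http = HttpClient.newHttpClient();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {

            consumer.subscribe(List.of("orders.enriched"));            // hypothetical input topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Remote model inference: POST the event to a (hypothetical) model-serving endpoint
                    HttpRequest request = HttpRequest.newBuilder()
                            .uri(URI.create("http://model-server:8080/predict"))
                            .header("Content-Type", "application/json")
                            .POST(HttpRequest.BodyPublishers.ofString(record.value()))
                            .build();
                    String prediction = http.send(request, HttpResponse.BodyHandlers.ofString()).body();

                    // Publish the score so downstream operational systems can act on it in real time
                    producer.send(new ProducerRecord<>("orders.scored", record.key(), prediction));
                }
            }
        }
    }
}
```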

Erste Bank: Building a Foundation for GenAI with Confluent and Databricks

Erste Group Bank AG, one of the largest financial services providers in Central and Eastern Europe, is leveraging Confluent and Databricks to transform its customer service operations with Generative AI. Recognizing that successful GenAI initiatives require more than just advanced models, Erste Bank first focused on establishing a solid foundation of real-time, consistent, and high-quality data leveraging data streaming and an event-driven architecture.

Using Confluent, Erste Bank connects real-time streams, batch workloads, and request-response APIs across its legacy and cloud-native systems in a decoupled way while ensuring data consistency through Kafka. This architecture ensures that operational and analytical data — whether from core banking platforms, digital channels, mobile apps, or CRM systems — flows reliably and consistently across the enterprise. By integrating event streams, historical data, and API calls into a unified data pipeline, Confluent enables Erste Bank to create a live, trusted digital representation of customer interactions.

With this real-time foundation in place, Erste Bank leverages Databricks as its AI and analytics platform to build and scale GenAI applications. At the Data in Motion Tour 2024 in Frankfurt, Erste Bank presented a pilot project where customer service chatbots consume contextual data flowing through Confluent into Databricks, enabling intelligent, personalized responses. Once a customer request is processed, the chatbot triggers a transaction back through Kafka into the Salesforce CRM, ensuring seamless, automated follow-up actions.

Figure: GenAI chatbot with Confluent and Databricks AI in financial services at Erste Bank (Source: Erste Group Bank AG)

By combining Confluent’s real-time data streaming capabilities with Databricks’ powerful AI infrastructure, Erste Bank is able to:

  • Deliver highly contextual, real-time customer service interactions
  • Automate CRM workflows through real-time event-driven architectures
  • Build a scalable, resilient platform for future AI-driven applications

This architecture positions Erste Bank to continue expanding GenAI use cases across financial services, from customer engagement to operational efficiency, powered by consistent, trusted, and real-time data.

Confluent: The Neutral Streaming Backbone for Any Data Stack

Confluent is not tied to a monolithic compute engine within a cloud provider. This neutrality is a strength:

  • Bridges operational systems (mainframes, SAP) with modern data platforms (AI, lakehouses, etc.)
  • An event-driven architecture built with a data streaming platform feeds multiple lakehouses at once
  • Works across all major cloud providers, including AWS, Azure, and GCP
  • Operates at the edge, on-prem, in the cloud and in hybrid scenarios
  • One size doesn’t fit all – follow best practices from microservices architectures and data mesh to tailor your architecture with purpose-built solutions.

This flexibility makes Confluent the best platform for data distribution—enabling decoupled teams to use the tools and platforms best suited to their needs.

Confluent’s Tableflow also supports Apache Iceberg to enable seamless integration from Kafka into lakehouses beyond Delta Lake and Databricks—such as Snowflake, BigQuery, Amazon Athena, and many other data platforms and analytics engines.

Example: A global enterprise uses Confluent as its central nervous system for data streaming. Customer interaction events flow in real time from web and mobile apps into Confluent. These events are then:

  • Streamed into Databricks once for multiple GenAI and analytics use cases.
  • Written to an operational PostgreSQL database to update order status and customer profiles
  • Pushed into a customer-facing analytics engine like StarTree (powered by Apache Pinot) for live dashboards and real-time customer behavior analytics
  • Shared with Snowflake through a lift-and-shift M&A use case to unify analytics from an acquired company

This setup shows the power of Confluent’s neutrality and flexibility: enabling real-time, multi-directional data sharing across heterogeneous platforms, without coupling compute and storage.
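
A minimal sketch of the producer side of this example is shown below. The topic name and event fields are assumptions; the fan-out to Databricks, PostgreSQL, StarTree, and Snowflake happens downstream via connectors, Tableflow, and stream processing, not in application code.

```python
# Sketch only: writing a customer interaction event once to Kafka.
# Topic name and event fields are illustrative assumptions.
import json
import time
import uuid
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

event = {
    "event_id": str(uuid.uuid4()),
    "user_id": "user-42",
    "event_type": "add_to_cart",
    "channel": "mobile",
    "ts": int(time.time() * 1000),
}

# Keying by user_id keeps all events for a customer in one partition (ordering).
producer.produce("customer.interactions", key=event["user_id"], value=json.dumps(event))
producer.flush()
```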

Snowflake: A Cloud-Native Companion to Confluent – Natively Integrated with Apache Iceberg and Polaris Catalog

Snowflake pairs naturally with Confluent to power modern data architectures. As a cloud-native SaaS from the start, Snowflake has earned broad adoption across industries thanks to its scalability, simplicity, and fully managed experience.

Together, Confluent and Snowflake unlock high-impact use cases:

  • Near real-time ingestion and enrichment: Stream operational data into Snowflake for immediate analytics and action.
  • Unified operational and analytical workloads: Combine Confluent’s Tableflow with Snowflake’s Apache Iceberg support through its open source Polaris catalog to bridge operational and analytical data layers.
  • Shift-left data quality: Improve reliability and reduce costs by validating and shaping data upstream, before it hits storage.

With Confluent as the streaming backbone and Snowflake as the analytics engine, enterprises get a cloud-native stack that’s fast, flexible, and built to scale. Many enterprises use Confluent as the data ingestion platform for Databricks, Snowflake, and other analytical and operational downstream applications.

Shift Left at Siemens: Real-Time Innovation with Confluent and Snowflake

Siemens is a global technology leader operating across industry, infrastructure, mobility, and healthcare. Its portfolio includes industrial automation, digital twins, smart building systems, and advanced medical technologies—delivered through units like Siemens Digital Industries and Siemens Healthineers.

To accelerate innovation and reduce operational costs, Siemens is embracing a shift-left architecture to enrich data early in the pipeline before it reaches Snowflake. This enables reusable, real-time data products in the data streaming platform leveraging an event-driven architecture for data sharing with analytical and operational systems beyond Snowflake.

Siemens Digital Industries applies this model to optimize manufacturing and intralogistics, using streaming ETL to feed real-time dashboards and trigger actions like automated inventory replenishment—while continuing to use Snowflake for historical analysis, reporting, and long-term data modeling.

Figure: Siemens shift-left architecture and data products with data streaming using Apache Kafka and Flink (Source: Siemens Digital Industries)

Siemens Healthineers embeds AI directly in the stream processor to detect anomalies in medical equipment telemetry, improving response time and avoiding costly equipment failures—while leveraging Snowflake to power centralized analytics, compliance reporting, and cross-device trend analysis.

Figure: Machine monitoring and streaming analytics with MQTT, Confluent Kafka, and TensorFlow AI/ML at Siemens Healthineers (Source: Siemens Healthineers)

These success stories are part of The Data Streaming Use Case Show, my new industry webinar series. Learn more about Siemens’ usage of Confluent and Snowflake and watch the video recording about “shift left”.

Open Outlook: Agentic AI with Model Context Protocol (MCP) and Agent2Agent Protocol (A2A)

While data and AI platforms like Databricks and Snowflake play a key role, some Agentic AI projects will likely rely on emerging, independent SaaS platforms and specialized tools. Flexibility and open standards are key for future success.

What better way to close a blog series on Confluent and Databricks (and Snowflake) than by looking ahead to one of the most exciting frontiers in enterprise AI: Agentic AI.

As enterprise AI matures, there is growing interest in bi-directional interfaces between operational systems and AI agents. Google’s Agent2Agent (A2A) protocol reinforces this shift—highlighting how intelligent agents can autonomously communicate, coordinate, and act across distributed systems.

Figure: Agent2Agent Protocol (A2A) and MCP via Apache Kafka as event broker for truly decoupled agentic AI

Confluent + Databricks is an ideal combination to support these emerging Agentic AI patterns, where event-driven agents continuously learn from and act upon streaming data. Models can be embedded directly in Flink for low-latency applications or hosted and orchestrated in Databricks for more complex inference workflows.

The Model Context Protocol (MCP) is gaining traction as a design blueprint for standardized interaction between services, models, and event streams. In this context, Confluent and Databricks are well positioned to lead:

  • Confluent: Event-driven delivery of context, inputs, and actions
  • Databricks: Model hosting, training, inference, and orchestration
  • Jointly: Closed feedback loops between business systems and AI agents

Together with protocols like A2A and MCP, this architecture will shape the next generation of intelligent, real-time enterprise applications.
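
The sketch below illustrates this division of responsibility in its simplest form: Kafka delivers context as events, a hosted model makes the decision, and the decision is published as another event for other agents or systems to act on. Topic names, payloads, and the call_model() placeholder are assumptions; real deployments would layer MCP- and A2A-compliant tooling on top of this pattern.

```python
# Conceptual sketch only: an event-driven agent consuming context from Kafka,
# asking a hosted model for a decision, and publishing that decision as an event.
# Topics, payloads, and call_model() are placeholders, not a real MCP/A2A implementation.
import json
from confluent_kafka import Consumer, Producer

def call_model(context: dict) -> dict:
    """Placeholder for inference against a model hosted, for example, in Databricks."""
    return {"action": "create_ticket", "reason": "anomaly detected", "input": context}

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "ops-agent",
    "auto.offset.reset": "latest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["agent.context"])  # context and inputs arrive as events

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    decision = call_model(json.loads(msg.value()))
    # Publishing the decision keeps agents decoupled: any consumer can react to it.
    producer.produce("agent.actions", value=json.dumps(decision))
    producer.poll(0)
```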

Confluent + Databricks: The Future-Proof Data Stack for AI and Analytics

Databricks and Confluent are not just partners. They are leaders in their respective domains. Together, they enable real-time, intelligent data architectures that support operational excellence and AI innovation.

Other AI and data platforms are part of the landscape, and many bring valuable capabilities. As explored in this blog series, true decoupling through an event-driven architecture with Apache Kafka allows combining any mix of vendors and cloud services. I see many enterprises using Databricks and Snowflake integrated with Confluent. However, the alignment between Confluent and Databricks stands out due to its combination of strengths:

  • Confluent’s category leadership in data streaming, powering thousands of production deployments across industries
  • Databricks’ strong position in the lakehouse and AI space, with broad enterprise adoption for analytics and machine learning
  • Shared product vision and growing engineering and go-to-market alignment across technical and field organizations

For enterprises shaping a long-term data and AI strategy, this combination offers a proven, scalable foundation—bridging real-time operations with analytical depth, without forcing trade-offs between speed, flexibility, or future-readiness.

Stay tuned for deep dives into how these platforms are shaping the future of data-driven enterprises. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to analytical platforms like Databricks and Snowflake.

The post Databricks and Confluent Leading Data and AI Architectures – What About Snowflake, BigQuery, and Friends? appeared first on Kai Waehner.

]]>
Databricks and Confluent in the World of Enterprise Software (with SAP as Example) https://www.kai-waehner.de/blog/2025/05/12/databricks-and-confluent-in-the-world-of-enterprise-software-with-sap-as-example/ Mon, 12 May 2025 11:26:54 +0000 https://www.kai-waehner.de/?p=7824 Enterprise data lives in complex ecosystems—SAP, Oracle, Salesforce, ServiceNow, IBM Mainframes, and more. This article explores how Confluent and Databricks integrate with SAP to bridge operational and analytical workloads in real time. It outlines architectural patterns, trade-offs, and use cases like supply chain optimization, predictive maintenance, and financial reporting, showing how modern data streaming unlocks agility, reuse, and AI-readiness across even the most SAP-centric environments.

The post Databricks and Confluent in the World of Enterprise Software (with SAP as Example) appeared first on Kai Waehner.

]]>
Modern enterprises rely heavily on operational systems like SAP ERP, Oracle, Salesforce, ServiceNow and mainframes to power critical business processes. But unlocking real-time insights and enabling AI at scale requires bridging these systems with modern analytics platforms like Databricks. This blog explores how Confluent’s data streaming platform enables seamless integration between SAP, Databricks, and other systems to support real-time decision-making, AI-driven automation, and agentic AI use cases. It shows how Confluent delivers the real-time backbone needed to build event-driven, future-proof enterprise architectures—supporting everything from inventory optimization and supply chain intelligence to embedded copilots and autonomous agents.

Figure: Enterprise application integration with Confluent and Databricks for Oracle, SAP, Salesforce, ServiceNow, et al.

About the Confluent and Databricks Blog Series

This article is part of a blog series exploring the growing roles of Confluent and Databricks in modern data and AI architectures:

Learn how these platforms will affect data use in businesses in future articles. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to other operational and analytical platforms like SAP and Databricks.

Most Enterprise Data Is Operational

Enterprise software systems generate a constant stream of operational data across a wide range of domains. This includes orders and inventory from SAP ERP systems, often extended with real-time production data from SAP MES. Oracle databases capture transactional data critical to core business operations, while MongoDB contributes operational data—frequently used as a CDC source or, in some cases, as a sink for analytical queries. Customer interactions are tracked in platforms like Salesforce CRM, and financial or account-related events often originate from IBM mainframes. 

Together, these systems form the backbone of enterprise data, requiring seamless integration for real-time intelligence and business agility. This data is often not immediately available for analytics or AI unless it’s integrated into downstream systems.

Confluent is built to ingest and process this kind of operational data in real time. Databricks can then consume it for AI and machine learning, dashboards, or reports. Together, SAP, Confluent and Databricks create a real-time architecture for enterprise decision-making.

SAP Product Landscape for Operational and Analytical Workloads

SAP plays a foundational role in the enterprise data landscape—not just as a source of business data, but as the system of record for core operational processes across finance, supply chain, HR, and manufacturing.

At a high level, the SAP product portfolio today comprises three categories: SAP Business AI, SAP Business Data Cloud (BDC), and SAP Business Applications powered by SAP Business Technology Platform (BTP).

Figure: SAP product portfolio categories (Source: SAP)

To support both operational and analytical needs, SAP offers a portfolio of platforms and tools, while also partnering with best-in-class technologies like Databricks and Confluent.

Operational Workloads (Transactional Systems):

  • SAP S/4HANA – Modern ERP for core business operations
  • SAP ECC – Legacy ERP platform still widely deployed
  • SAP CRM / SCM / SRM – Domain-specific business systems
  • SAP Business One / Business ByDesign – ERP solutions for mid-market and subsidiaries

Analytical Workloads (Data & Analytics Platforms):

  • SAP Datasphere – Unified data fabric to integrate, catalog, and govern SAP and non-SAP data
  • SAP Analytics Cloud (SAC) – Visualization, reporting, and predictive analytics
  • SAP BW/4HANA – Data warehousing and modeling for SAP-centric analytics

SAP Business Data Cloud (BDC)

SAP Business Data Cloud (BDC) is a strategic initiative within SAP Business Technology Platform (BTP) that brings together SAP’s data and analytics capabilities into a unified cloud-native experience. It includes:

  • SAP Datasphere as the data fabric layer, enabling seamless integration of SAP and third-party data
  • SAP Analytics Cloud (SAC) for consuming governed data via dashboards and reports
  • SAP’s partnership with Databricks to allow SAP data to be analyzed alongside non-SAP sources in a lakehouse architecture
  • Real-time integration scenarios enabled through Confluent and Apache Kafka, bringing operational data in motion directly into SAP and Databricks environments

Together, this ecosystem supports real-time, AI-powered, and governed analytics across operational and analytical workloads—making SAP data more accessible, trustworthy, and actionable within modern cloud data architectures.

SAP Databricks OEM: Limited Scope, Full Control by SAP

SAP recently announced an OEM partnership with Databricks, embedding parts of Databricks’ serverless infrastructure into the SAP ecosystem. While this move enables tighter integration and simplified access to AI workloads within SAP, it comes with significant trade-offs. The OEM model is narrowly scoped, optimized primarily for ML and GenAI scenarios on SAP data, and lacks the openness and flexibility of native Databricks.

This integration is not intended for full-scale data engineering. Core capabilities such as workflows, streaming, Delta Live Tables, and external data connections (e.g., Snowflake, S3, MS SQL) are missing. The architecture is based on data at rest and does not embrace event-driven patterns. Compute options are limited to serverless only, with no infrastructure control. Pricing is complex and opaque, with customers often needing to license Databricks separately to unlock full capabilities.

Critically, SAP controls the entire data integration layer through its BDC Data Products, reinforcing a vendor lock-in model. While this may benefit SAP-centric organizations focused on embedded AI, it restricts broader interoperability and long-term architectural flexibility. In contrast, native Databricks, i.e., outside of SAP, offers a fully open, scalable platform with rich data engineering features across diverse environments.

Whichever Databricks option you prefer, this is where Confluent adds value—offering a truly event-driven, decoupled architecture that complements both SAP Datasphere and Databricks, whether used within or outside the SAP OEM framework.

Confluent and SAP Integration

Confluent provides native and third-party connectors to integrate with SAP systems to enable continuous, low-latency data flow across business applications.

Figure: SAP ERP and Confluent data streaming integration access patterns (Source: Confluent)

This powers modern, event-driven use cases that go beyond traditional batch-based integrations:

  • Low-latency access to SAP transactional data
  • Integration with other operational source systems like Salesforce, Oracle, IBM Mainframe, MongoDB, or IoT platforms
  • Synchronization between SAP Datasphere and other data warehouse and analytics platforms such as Snowflake, Google BigQuery, or Databricks
  • Decoupling of applications for modular architecture
  • Data consistency across real-time, batch and request-response APIs
  • Hybrid integration across any edge, on-premise or multi-cloud environments

SAP Datasphere and Confluent

To expand its role in the modern data stack, SAP introduced SAP Datasphere—a cloud-native data management solution designed to extend SAP’s reach into analytics and data integration. Datasphere aims to simplify access to SAP and non-SAP data across hybrid environments.

SAP Datasphere simplifies data access within the SAP ecosystem, but it has key drawbacks when compared to open platforms like Databricks, Snowflake, or Google BigQuery:

  • Closed Ecosystem: Optimized for SAP, but lacks flexibility for non-SAP integrations.
  • No Event Streaming: Focused on data at rest, with limited support for real-time processing or streaming architectures.
  • No Native Stream Processing: Relies on batch methods, adding latency and complexity for hybrid or real-time use cases.

Confluent alleviates these drawbacks and supports this strategy through bi-directional integration with SAP Datasphere. This enables real-time streaming of SAP data into Datasphere and back out to operational or analytical consumers via Apache Kafka. It allows organizations to enrich SAP data, apply real-time processing, and ensure it reaches the right systems in the right format—without waiting for overnight batch jobs or rigid ETL pipelines.

Confluent for Agentic AI with SAP Joule and Databricks

SAP is laying the foundation for agentic AI architectures with a vision centered around Joule—its generative AI copilot—and a tightly integrated data stack that includes SAP Databricks (via OEM), SAP Business Data Cloud (BDC), and a unified knowledge graph. On top of this foundation, SAP is building specialized AI agents for use cases such as customer 360, creditworthiness analysis, supply chain intelligence, and more.

Figure: SAP ERP with Business Technology Platform (BTP) and Joule for agentic AI in the cloud (Source: SAP)

The architecture combines:

  • SAP Joule as the interface layer for generative insights and decision support
  • SAP’s foundational models and domain-specific knowledge graph
  • SAP BDC and SAP Databricks as the data and ML/AI backbone
  • Data from both SAP systems (ERP, CRM, HR, logistics) and non-SAP systems (e.g., clickstream, IoT, partner data, social media), integrated via its partnership with Confluent

But here’s the catch: what happens when agents need to communicate with one another to deliver a workflow? Such agentic systems require continuous, contextual, and event-driven data exchange—not just point-to-point API calls and nightly batch jobs.

This is where Confluent’s data streaming platform comes in as critical infrastructure.

Agentic AI with Apache Kafka as Event Broker

Confluent provides the real-time data streaming platform that connects the operational world of SAP with the analytical and AI-driven world of Databricks, enabling the continuous movement, enrichment, and sharing of data across all layers of the stack.

Figure: Agentic AI with Confluent as event broker for Databricks, SAP, and Oracle

The above is a conceptual view of the architecture. The AI agents on the left side could be built with SAP Joule, Databricks, or any “outside” GenAI framework.

The data streaming platform helps connect the AI agents with the rest of the enterprise architecture, both within SAP and Databricks and beyond:

  • Real-time data integration from non-SAP systems (e.g., mobile apps, IoT devices, mainframes, web logs) into SAP and Databricks
  • True decoupling of services and agents via an event-driven architecture (EDA), replacing brittle RPC or point-to-point API calls
  • Event replay and auditability—critical for traceable AI systems operating in regulated environments
  • Streaming pipelines for feature engineering and inference: stream-based model triggering with low-latency SLAs
  • Support for bi-directional flows: e.g., operational triggers in SAP can be enriched by AI agents running in Databricks and pushed back into SAP via Kafka events

Without Confluent, SAP’s agentic architecture risks becoming a patchwork of stateless services bound by fragile REST endpoints—lacking the real-time responsiveness, observability, and scalability required to truly support next-generation AI orchestration.

Confluent turns the SAP + Databricks vision into a living, breathing ecosystem—where context flows continuously, agents act autonomously, and enterprises can build future-proof AI systems that scale.

Data Streaming Use Cases Across SAP Product Suites

With Confluent, organizations can support a wide range of use cases across SAP product suites, including:

  1. Real-Time Inventory Visibility: Live updates of stock levels across warehouses and stores by streaming material movements from SAP ERP and SAP EWM, enabling faster order fulfillment and reduced stockouts.
  2. Dynamic Pricing and Promotions: Stream sales orders and product availability in real time to trigger pricing adjustments or dynamic discounting via integration with SAP ERP and external commerce platforms.
  3. AI-Powered Supply Chain Optimization: Combine data from SAP ERP, SAP Ariba, and external logistics platforms to power ML models that predict delays, optimize routes, and automate replenishment.
  4. Shop Floor Event Processing: Stream sensor and machine data alongside order data from SAP MES, enabling real-time production monitoring, alerting, and throughput optimization.
  5. Employee Lifecycle Automation: Stream employee events (e.g., onboarding, role changes) from SAP SuccessFactors to downstream IT systems (e.g., Active Directory, badge systems), improving HR operations and compliance.
  6. Order-to-Cash Acceleration: Connect order intake (via web portals or Salesforce) to SAP ERP in real time, enabling faster order validation, invoicing, and cash flow.
  7. Procure-to-Pay Automation: Integrate procurement events from SAP Ariba and supplier portals with ERP and financial systems to streamline approvals and monitor supplier performance continuously.
  8. Customer 360 and CRM Synchronization: Synchronize customer master data and transactions between SAP ERP, SAP CX, and third-party CRMs like Salesforce to enable unified customer views.
  9. Real-Time Financial Reporting: Stream financial transactions from SAP S/4HANA into cloud-based lakehouses or BI tools for near-instant reporting and compliance dashboards.
  10. Cross-System Data Consistency: Ensure consistent master data and business events across SAP and non-SAP environments by treating SAP as a real-time event source—not just a system of record.

Example Use Case and Architecture with SAP, Databricks and Confluent

Consider a manufacturing company using SAP ERP for inventory management and Databricks for predictive maintenance. The combination of SAP Datasphere and Confluent enables seamless data integration from SAP systems, while the addition of Databricks supports advanced AI/ML applications—turning operational data into real-time, predictive insights.

With Confluent as the real-time backbone:

  • Machine telemetry (via MQTT or OPC-UA) and ERP events (e.g., stock levels, work orders) are streamed in real time.
  • Apache Flink enriches and filters the event streams—adding context like equipment metadata or location.
  • Tableflow publishes clean, structured data to Databricks as Delta tables for analytics and ML processing.
  • A predictive model hosted in Databricks detects potential equipment failure before it happens; a Flink application calls the remote model with low latency.
  • The resulting prediction is streamed back to Kafka, triggering an automated work order in SAP via event integration.

Figure: Enterprise architecture with Confluent, SAP, and Databricks for analytics and AI

This bi-directional, event-driven pattern illustrates how Confluent enables seamless, real-time collaboration across SAP, Databricks, and IoT systems—supporting both operational and analytical use cases with a shared architecture.
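
As one illustrative slice of this pipeline, the following PyFlink SQL sketch filters suspicious telemetry and routes it to a dedicated topic that the scoring step and the SAP work-order trigger consume. Topic names, fields, thresholds, and the broker address are assumptions, and the job assumes the Kafka SQL connector is available; on Confluent Cloud the same SQL would run as a managed Flink statement.

```python
# Sketch only: filter suspicious telemetry in motion and route it to an alert topic.
# Topics, fields, thresholds, and broker address are assumptions; the Kafka SQL
# connector must be on the Flink classpath.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TABLE machine_telemetry (
        machine_id STRING,
        temperature DOUBLE,
        vibration DOUBLE,
        event_time TIMESTAMP(3)
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'machine.telemetry',
        'properties.bootstrap.servers' = 'localhost:9092',
        'properties.group.id' = 'telemetry-filter',
        'format' = 'json',
        'scan.startup.mode' = 'latest-offset'
    )
""")

t_env.execute_sql("""
    CREATE TABLE telemetry_alerts (
        machine_id STRING,
        temperature DOUBLE,
        vibration DOUBLE,
        event_time TIMESTAMP(3)
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'machine.telemetry.alerts',
        'properties.bootstrap.servers' = 'localhost:9092',
        'format' = 'json'
    )
""")

# Only out-of-range readings are routed onward to the scoring and work-order steps.
t_env.execute_sql("""
    INSERT INTO telemetry_alerts
    SELECT machine_id, temperature, vibration, event_time
    FROM machine_telemetry
    WHERE temperature > 90.0 OR vibration > 0.8
""")
```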

Going Beyond SAP with Data Streaming

This pattern applies to other enterprise systems:

  • Salesforce: Stream customer interactions for real-time personalization through Salesforce Data Cloud
  • Oracle: Capture transactions via CDC (Change Data Capture)
  • ServiceNow: Monitor incidents and automate operational responses
  • Mainframe: Offload events from legacy applications without rewriting code
  • MongoDB: Sync operational data in real time to support responsive apps
  • Snowflake: Stream enriched operational data into Snowflake for near real-time analytics, dashboards, and data sharing across teams and partners
  • OpenAI (or other GenAI platforms): Feed real-time context into LLMs for AI-assisted recommendations or automation
  • “You name it”: Confluent’s prebuilt connectors and open APIs enable event-driven integration with virtually any enterprise system

Confluent provides the backbone for streaming data across all of these platforms—securely, reliably, and in real time.

The Strategic Value of Event-Based Real-Time Integration with Data Streaming for the Enterprise

Enterprise software platforms are essential. But they are often closed, slow to change, and not designed for analytics or AI.

Confluent provides real-time access to operational data from platforms like SAP. SAP Datasphere and Databricks enable analytics and AI on that data. Together, they support modern, event-driven architectures.

  • Use Confluent for real-time data streaming from SAP and other core systems
  • Use SAP Datasphere and Databricks to build analytics, reports, and AI on that data
  • Use Tableflow to connect the two platforms seamlessly

This modern approach to data integration delivers tangible business value, especially in complex enterprise environments. It enables real-time decision-making by allowing business logic to operate on live data instead of outdated reports. Data products become reusable assets, as a single stream can serve multiple teams and tools simultaneously. By reducing the need for batch layers and redundant processing, the total cost of ownership (TCO) is significantly lowered. The architecture is also future-proof, making it easy to integrate new systems, onboard additional consumers, and scale workflows as business needs evolve.

Beyond SAP: Enabling Agentic AI Across the Enterprise

The same architectural discussion applies across the enterprise software landscape. As vendors embed AI more deeply into their platforms, the effectiveness of these systems increasingly depends on real-time data access, continuous context propagation, and seamless interoperability.

Without an event-driven foundation, AI agents remain limited—trapped in siloed workflows and brittle API chains. Confluent provides the scalable, reliable backbone needed to enable true agentic AI in complex enterprise environments.

Examples of AI solutions driving this evolution include:

  • SAP Joule / Business AI – Context-aware agents and embedded AI across ERP, finance, and supply chain
  • Salesforce Einstein / Copilot Studio – Generative AI for CRM, service, and marketing automation built on top of Salesforce Data Cloud
  • ServiceNow Now Assist – Intelligent workflows and predictive automation in ITSM and Ops
  • Oracle Fusion AI / OCI AI Services – Embedded machine learning in ERP, HCM, and SCM
  • Microsoft Copilot (Dynamics / Power Platform) – AI copilots across business and low-code apps
  • Workday AI – Smart recommendations for finance, workforce, and HR planning
  • Adobe Sensei GenAI – GenAI for content creation and digital experience optimization
  • IBM watsonx – Governed AI foundation for enterprise use cases and data products
  • Infor Coleman AI – Industry-specific AI for supply chain and manufacturing systems
  • All the “traditional” cloud providers and data platforms such as Snowflake with Cortex, Microsoft Fabric, Amazon SageMaker, Amazon Bedrock, and Google Cloud Vertex AI

Each of these platforms benefits from a streaming-first architecture that enables real-time decisions, reusable data, and smarter automation across the business.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to other operational and analytical platforms like SAP and Databricks.

The post Databricks and Confluent in the World of Enterprise Software (with SAP as Example) appeared first on Kai Waehner.

]]>
Shift Left Architecture for AI and Analytics with Confluent and Databricks https://www.kai-waehner.de/blog/2025/05/09/shift-left-architecture-for-ai-and-analytics-with-confluent-and-databricks/ Fri, 09 May 2025 06:03:07 +0000 https://www.kai-waehner.de/?p=7774 Confluent and Databricks enable a modern data architecture that unifies real-time streaming and lakehouse analytics. By combining shift-left principles with the structured layers of the Medallion Architecture, teams can improve data quality, reduce pipeline complexity, and accelerate insights for both operational and analytical workloads. Technologies like Apache Kafka, Flink, and Delta Lake form the backbone of scalable, AI-ready pipelines across cloud and hybrid environments.

The post Shift Left Architecture for AI and Analytics with Confluent and Databricks appeared first on Kai Waehner.

]]>
Modern enterprise architectures are evolving. Traditional batch data pipelines and centralized processing models are being replaced by more flexible, real-time systems. One of the driving concepts behind this change is the Shift Left approach. This blog compares Databricks’ Medallion Architecture with a Shift Left Architecture popularized by Confluent. It explains where each concept fits best—and how they can work together to create a more complete, flexible, and scalable architecture.

Figure: Shift Left Architecture with Confluent data streaming and the Databricks lakehouse Medallion Architecture

About the Confluent and Databricks Blog Series

This article is part of a blog series exploring the growing roles of Confluent and Databricks in modern data and AI architectures:

Learn how these platforms will affect data use in businesses in future articles. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including more details about the shift left architecture with data streaming and lakehouses.

Medallion Architecture: Structured, Proven, but Not Always Optimal

The Medallion Architecture, popularized by Databricks, is a well-known design pattern for organizing and processing data within a lakehouse. It provides structure, modularity, and clarity across the data lifecycle by breaking pipelines into three logical layers:

  • Bronze: Ingest raw data in its original format (often semi-structured or unstructured)
  • Silver: Clean, normalize, and enrich the data for usability
  • Gold: Aggregate and transform the data for reporting, dashboards, and machine learning
Figure: Databricks Medallion Architecture for lakehouse ETL (Source: Databricks)

This layered approach is valuable for teams looking to establish governed and scalable data pipelines. It supports incremental refinement of data and enables multiple consumers to work from well-defined stages.
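
A minimal PySpark sketch of the three layers is shown below, assuming a Databricks notebook with Unity Catalog. Table names and the cleaning and aggregation rules are illustrative only.

```python
# Minimal sketch of Bronze -> Silver -> Gold, assuming a Databricks notebook
# (`spark` is provided) with Unity Catalog. Table names and rules are illustrative.
from pyspark.sql import functions as F

# Bronze: raw events as ingested (e.g., by a connector, Auto Loader, or Tableflow).
bronze = spark.read.table("lakehouse.bronze.orders_raw")

# Silver: de-duplicated, typed, and cleaned for general usability.
silver = (
    bronze
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .where(F.col("order_amount") > 0)
)
silver.write.mode("overwrite").saveAsTable("lakehouse.silver.orders")

# Gold: business-level aggregate for reporting, dashboards, and ML features.
gold = (
    silver
    .groupBy(F.to_date("order_ts").alias("day"), "country")
    .agg(F.sum("order_amount").alias("revenue"), F.count("*").alias("orders"))
)
gold.write.mode("overwrite").saveAsTable("lakehouse.gold.daily_revenue")
```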

Challenges of the Medallion Architecture

The Medallion Architecture also introduces challenges:

  • Pipeline delays: Moving data from Bronze to Gold can take minutes or longer—too slow for operational needs
  • Infrastructure overhead: Each stage typically requires its own compute and storage footprint
  • Redundant processing: Data transformations are often repeated across layers
  • Limited operational use: Data is primarily at rest in object storage; using it for real-time operational systems often requires inefficient reverse ETL pipelines.

For use cases that demand real-time responsiveness and/or critical SLAs—such as fraud detection, personalized recommendations, or IoT alerting—this traditional batch-first model may fall short. In such cases, an event-driven streaming-first architecture, powered by a data streaming platform like Confluent, enables faster, more cost-efficient pipelines by performing validation, enrichment, and even model inference before data reaches the lakehouse.

Importantly, this data streaming approach doesn’t replace the Medallion pattern—it complements it. It allows you to “shift left” critical logic, reducing duplication and latency while still feeding trusted, structured data into Delta Lake or other downstream systems for broader analytics and governance.

In other words, shifting data processing left (i.e., before it hits a data lake or Lakehouse) is especially valuable when the data needs to serve multiple downstream systems—operational and analytical alike—because it avoids duplication, reduces latency, and ensures consistent, high-quality data is available wherever it’s needed.

Shift Left Architecture: Process Earlier, Share Faster

In a Shift Left Architecture, data processing happens earlier—closer to the source, both physically and logically. This often means:

  • Transforming and validating data as it streams in
  • Enriching and filtering in real time
  • Sharing clean, usable data quickly across teams AND different technologies/applications

Figure: Shift Left Architecture with data streaming into the data lake, data warehouse, and lakehouse

This is especially useful for:

  • Reducing time to insight
  • Improving data quality at the source
  • Creating reusable, consistent data products
  • Operational workloads with critical SLAs

How Confluent Enables Shift Left with Databricks

In a Shift Left setup, Apache Kafka provides scalable, low-latency, and truly decoupled ingestion of data across operational and analytical systems, forming the backbone for unified data pipelines.

Schema Registry and data governance policies enforce consistent, validated data across all streams, ensuring high-quality, secure, and compliant data delivery from the very beginning.

Apache Flink enables early data processing — closer to where data is produced. This reduces complexity downstream, improves data quality, and allows real-time decisions and analytics.

Figure: Shift Left Architecture with Confluent, Databricks, and Delta Lake

Data Quality Governance via Data Contracts and Schema Validation

Flink can enforce data contracts by validating incoming records against predefined schemas (e.g., using JSON Schema, Apache Avro or Protobuf with Schema Registry). This ensures structurally valid data continues through the pipeline. In cases where schema violations occur, records can be automatically routed to a Dead Letter Queue (DLQ) for inspection.

Figure: Confluent Schema Registry for data quality, policy enforcement, and governance with Apache Kafka

Additionally, data contracts can enforce policy-based rules at the schema level—such as field-level encryption, masking of sensitive data (PII), type coercion, or enrichment defaults. These controls help maintain compliance and reduce risk before data reaches regulated or shared environments.
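
The following sketch shows the core of this contract enforcement: validate each record, forward the good ones, and route violations to a DLQ. It uses JSON Schema and a plain Python Kafka client for brevity; topic names and the schema are assumptions, and in the architecture described above this logic would run as a Flink job with Schema Registry-managed Avro or Protobuf schemas.

```python
# Sketch only: enforce a data contract and route violations to a dead letter queue.
# Topics and the schema are assumptions; shown with a plain Python client and JSON
# Schema for brevity instead of a Flink job with Schema Registry.
import json
from confluent_kafka import Consumer, Producer
from jsonschema import ValidationError, validate

ORDER_SCHEMA = {
    "type": "object",
    "required": ["order_id", "amount", "currency"],
    "properties": {
        "order_id": {"type": "string"},
        "amount": {"type": "number", "minimum": 0},
        "currency": {"type": "string", "maxLength": 3},
    },
}

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "order-validator",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["orders.raw"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    try:
        order = json.loads(msg.value())
        validate(instance=order, schema=ORDER_SCHEMA)  # enforce the contract
        producer.produce("orders.validated", value=msg.value())
    except (ValueError, ValidationError) as err:
        # Structurally invalid records go to the DLQ for inspection and replay.
        producer.produce("orders.dlq", value=msg.value(), headers={"error": str(err)[:200]})
    producer.poll(0)
```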

Flink can perform the following tasks before data ever lands in a data lake or warehouse:

Filtering and Routing

Events can be filtered based on business rules and routed to the appropriate downstream system or Kafka topic. This allows different consumers to subscribe only to relevant data, optimizing both performance and cost.

Metric Calculation

Use Flink to compute rolling aggregates (e.g., counts, sums, averages, percentiles) over windows of data in motion. This is useful for business metrics, anomaly detection, or feeding real-time dashboards—without waiting for batch jobs.

Real-Time Joins and Enrichment

Flink supports both stream-stream and stream-table joins. This enables real-time enrichment of incoming events with contextual information from reference data (e.g., user profiles, product catalogs, pricing tables), often sourced from Kafka topics, databases, or external APIs.

Figure: Streaming ETL with Apache Flink SQL

By shifting this logic to the beginning of the pipeline, teams can reduce duplication, avoid unnecessary storage and compute costs in downstream systems, and ensure that data products are clean, policy-compliant, and ready for both operational and analytical use—as soon as they are created.

Example: A financial application might use Flink to calculate running balances, detect anomalies, and enrich records with reference data before pushing to Databricks for reporting and training analytic models.
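
A hedged PyFlink SQL sketch of such a pipeline is shown below: per-account one-minute payment totals are computed in motion and published to a result topic that alerting, operational consumers, and Tableflow-fed Databricks tables can all subscribe to. Topics, fields, and connection settings are assumptions, and the Kafka SQL connector is assumed to be on the classpath.

```python
# Sketch only: per-account one-minute payment metrics computed in motion.
# Topics, fields, and connection settings are assumptions; the Kafka SQL connector
# must be on the Flink classpath.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TABLE payments (
        account_id STRING,
        amount DOUBLE,
        event_time TIMESTAMP(3),
        WATERMARK FOR event_time AS event_time - INTERVAL '10' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'payments',
        'properties.bootstrap.servers' = 'localhost:9092',
        'properties.group.id' = 'payment-metrics',
        'format' = 'json',
        'scan.startup.mode' = 'latest-offset'
    )
""")

t_env.execute_sql("""
    CREATE TABLE account_metrics (
        account_id STRING,
        window_end TIMESTAMP(3),
        total_amount DOUBLE,
        tx_count BIGINT
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'payments.metrics.1m',
        'properties.bootstrap.servers' = 'localhost:9092',
        'format' = 'json'
    )
""")

# Tumbling one-minute windows per account; alerting, operational consumers, and
# Tableflow-fed Databricks tables can all subscribe to the result topic.
t_env.execute_sql("""
    INSERT INTO account_metrics
    SELECT account_id, window_end, SUM(amount), COUNT(*)
    FROM TABLE(TUMBLE(TABLE payments, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
    GROUP BY account_id, window_start, window_end
""")
```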

In addition to enhancing data quality and reducing time-to-insight in the lakehouse, this approach also makes data products immediately usable for operational workloads and downstream applications—without building separate pipelines.

Learn more about stateless and stateful stream processing in real-time architectures using Apache Flink in this in-depth blog post.

Combining Shift Left with Medallion Architecture

These architectures are not mutually exclusive. Shift Left is about processing data earlier. Medallion is about organizing data once it arrives.

You can use Shift Left principles to:

  • Pre-process operational data before it enters the Bronze layer
  • Ensure clean, validated data enters Silver with minimal transformation needed
  • Reduce the need for redundant processing steps between layers

Confluent’s Tableflow bridges the two worlds. It converts Kafka streams into Delta tables, integrating cleanly with the Medallion model while supporting real-time flows.

Shift Left with Delta Lake, Iceberg, and Tableflow

Confluent Tableflow makes it easy to publish Kafka streams into Delta Lake or Apache Iceberg formats. These can be discovered and queried inside Databricks via Unity Catalog.

This integration:

  • Simplifies integration, governance and discovery
  • Enables live updates to AI features and dashboards
  • Removes the need to manage Spark streaming jobs

This is a natural bridge between a data streaming platform and the lakehouse.

Figure: Confluent Tableflow to unify operational and analytical workloads with Apache Iceberg and Delta Lake (Source: Confluent)

AI Use Cases for Shift Left with Confluent and Databricks

The Shift Left model benefits both predictive and generative AI:

  • Model training: Real-time data pipelines can stream features to Delta Lake
  • Model inference: In some cases, predictions can happen in Confluent (via Flink) and be pushed back to operational systems instantly
  • Agentic AI: Real-time event-driven architectures are well suited for next-gen, stateful agents

Databricks supports model training and hosting via MosaicML. Confluent can integrate with these models, or run lightweight inference directly from the stream processing application.

Data Warehouse Use Cases for Shift Left with Confluent and Databricks

  • Batch reporting: Continue using Databricks for traditional BI
  • Real-time analytics: Flink or real-time OLAP engines (e.g., Apache Pinot, Apache Druid) may be a better fit for sub-second insights
  • Hybrid: Push raw events into Databricks for historical analysis and use Flink for immediate feedback

Where you do the data processing depends on the use case.

Architecture Benefits Beyond Technology

Shift Left also brings architectural benefits:

  • Cost Reduction: Processing early can lower storage and compute usage
  • Faster Time to Market: Data becomes usable earlier in the pipeline
  • Reusability: Processed streams can be reused and consumed by multiple technologies/applications (not just Databricks teams)
  • Compliance and Governance: Validated data with lineage can be shared with confidence

These are important for strategic enterprise data architectures.

Bringing in New Types of Data

Shift Left with a data streaming platform supports a wider range of data sources:

  • Operational databases (like Oracle, DB2, SQL Server, Postgres, MongoDB)
  • ERP systems (SAP et al)
  • Mainframes and other legacy technologies
  • IoT interfaces (MQTT, OPC-UA, proprietary IIoT gateway, etc.)
  • SaaS platforms (Salesforce, ServiceNow, and so on)
  • Any other system that does not directly fit into the “table-driven analytics perspective” of a Lakehouse

With Confluent, these interfaces can be connected in real time, enriched at the edge or in transit, and delivered to analytics platforms like Databricks.

This expands the scope of what’s possible with AI and analytics.

Shift Left Using ONLY Databricks

A shift-left architecture using only Databricks is possible, too. A Databricks consultant took my Shift Left slide and adjusted it accordingly:

Figure: Shift Left Architecture with Databricks and Delta Lake

Relying solely on Databricks for a “Shift Left Architecture” can work if all workloads are meant to stay within the platform — but it’s a poor fit for many real-world scenarios.

Databricks focuses on ELT, not true ETL, and lacks native support for operational workloads like APIs, low-latency apps, or transactional systems. This forces teams to rely on reverse ETL tools – a clear anti-pattern in the enterprise architecture – just to get data where it’s actually needed. The result: added complexity, latency, and tight coupling.

The Shift Left Architecture is valuable, but in most cases it requires a modular approach, where streaming, operational, and analytical components work together — not a monolithic platform.

That said, shift left principles still apply within Databricks. Processing data as early as possible improves data quality, reduces overall compute cost, and minimizes downstream data engineering effort. For teams that operate fully inside the Databricks ecosystem, shifting left remains a powerful strategy to simplify pipelines and accelerate insight.

Meesho: Scaling a Real-Time Commerce Platform with Confluent and Databricks

Many high-growth digital platforms adopt a shift-left approach out of necessity—not as a buzzword, but to reduce latency, improve data quality, and scale efficiently by processing data closer to the source.

Meesho, one of India’s largest online marketplaces, relies on Confluent and Databricks to power its hyper-growth business model focused on real-time e-commerce. As the company scaled rapidly, supporting millions of small businesses and entrepreneurs, the need for a resilient, scalable, and low-latency data architecture became critical.

To handle massive volumes of operational events — from inventory updates to order management and customer interactions — Meesho turned to Confluent Cloud. By adopting a fully managed data streaming platform using Apache Kafka, Meesho ensures real-time event delivery, improved reliability, and faster application development. Kafka serves as the central nervous system for their event-driven architecture, connecting multiple services and enabling instant, context-driven customer experiences across mobile and web platforms.

Alongside their data streaming architecture, Meesho migrated from Amazon Redshift to Databricks to build a next-generation analytics platform. Databricks’ lakehouse architecture empowers Meesho to unify operational data from Kafka with batch data from other sources, enabling near real-time analytics at scale. This migration not only improved performance and scalability but also significantly reduced costs and operational overhead.

With Confluent managing real-time event processing and ingestion, and Databricks providing powerful, scalable analytics, Meesho is able to:

  • Deliver real-time personalized experiences to customers
  • Optimize operational workflows based on live data
  • Enable faster, data-driven decision-making across business teams

By combining real-time data streaming with advanced lakehouse analytics, Meesho has built a flexible, future-ready data infrastructure to support its mission of democratizing online commerce for millions across India.

Shift Left: Reducing Complexity, Increasing Value for the Lakehouse (and other Operational Systems)

Shift Left is not about replacing Databricks. It’s about preparing better data earlier in the pipeline—closer to the source—and reducing end-to-end complexity.

  • Use Confluent for real-time ingestion, enrichment, and transformation
  • Use Databricks for advanced analytics, reporting, and machine learning
  • Use Tableflow and Delta Lake to govern and route high-quality data to the right consumers

This architecture not only improves data quality for the lakehouse, but also enables the same real-time data products to be reused across multiple downstream systems—including operational, transactional, and AI-powered applications.

The result: increased agility, lower costs, and scalable innovation across the business.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including more details about the shift left architecture with data streaming and lakehouses.

The post Shift Left Architecture for AI and Analytics with Confluent and Databricks appeared first on Kai Waehner.

]]>
Confluent Data Streaming Platform vs. Databricks Data Intelligence Platform for Data Integration and Processing https://www.kai-waehner.de/blog/2025/05/05/confluent-data-streaming-platform-vs-databricks-data-intelligence-platform-for-data-integration-and-processing/ Mon, 05 May 2025 03:47:21 +0000 https://www.kai-waehner.de/?p=7768 This blog explores how Confluent and Databricks address data integration and processing in modern architectures. Confluent provides real-time, event-driven pipelines connecting operational systems, APIs, and batch sources with consistent, governed data flows. Databricks specializes in large-scale batch processing, data enrichment, and AI model development. Together, they offer a unified approach that bridges operational and analytical workloads. Key topics include ingestion patterns, the role of Tableflow, the shift-left architecture for earlier data validation, and real-world examples like Uniper’s energy trading platform powered by Confluent and Databricks.

The post Confluent Data Streaming Platform vs. Databricks Data Intelligence Platform for Data Integration and Processing appeared first on Kai Waehner.

]]>
Many organizations use both Confluent and Databricks. While these platforms serve different primary goals—real-time data streaming vs. analytical processing—there are areas where they overlap. This blog explores how the Confluent Data Streaming Platform (DSP) and the Databricks Data Intelligence Platform handle data integration and processing. It explains their different roles, where they intersect, and when one might be a better fit than the other.

Figure: Confluent and Databricks for data integration and stream processing

About the Confluent and Databricks Blog Series

This article is part of a blog series exploring the growing roles of Confluent and Databricks in modern data and AI architectures:

Learn how these platforms will affect data use in businesses in future articles. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to analytical platforms like Databricks.

Data Integration and Processing: Shared Space, Different Strengths

Confluent is focused on continuous, event-based data movement and processing. It connects to hundreds of real-time and non-real-time data sources and targets. It enables low-latency stream processing using Apache Kafka and Flink, forming the backbone of an event-driven architecture. Databricks, on the other hand, combines data warehousing, analytics, and machine learning on a unified, scalable architecture.

Confluent: Event-Driven Integration Platform

Confluent is increasingly used as modern operational middleware, replacing traditional message queues (MQ) and enterprise service buses (ESB) in many enterprise architectures.

Thanks to its event-driven foundation, it supports not just real-time event streaming but also integration with request/response APIs and batch-based interfaces. This flexibility allows enterprises to standardize on the Kafka protocol as the data hub—bridging asynchronous event streams, synchronous APIs, and legacy systems. The immutable event store and true decoupling of producers and consumers help maintain data consistency across the entire pipeline, regardless of whether data flows in real-time, in scheduled batches or via API calls.

Figure: Batch processing vs. event-driven architecture with continuous data streaming

Databricks: Batch-Driven Analytics and AI Platform

Databricks excels in batch processing and traditional ELT workloads. It is optimized for storing data first and then transforming it within its platform, but it’s not built as a real-time ETL tool for directly connecting to operational systems or handling complex, upstream data mappings.

Databricks enables data transformations at scale, supporting complex joins, aggregations, and data quality checks over large historical datasets. Its Medallion Architecture (Bronze, Silver, Gold layers) provides a structured approach to incrementally refine and enrich raw data for analytics and reporting. The engine is tightly integrated with Delta Lake and Unity Catalog, ensuring governed and high-performance access to curated datasets for data science, BI, and machine learning.

For most use cases, the right choice is simple.

  • Confluent is ideal for building real-time pipelines and unifying operational systems.
  • Databricks is optimized for batch analytics, warehousing, and AI development.

Together, Confluent and Databricks cover both sides of the modern data architecture—streaming and batch, operational and analytical. And Confluent’s Tableflow and a shift-left architecture enable native integration with earlier data validation, simplified pipelines, and faster access to AI-ready data.

Data Ingestion Capabilities

Databricks recently introduced LakeFlow Connect and acquired Arcion to strengthen its capabilities around Change Data Capture (CDC) and data ingestion into Delta Lake. These are good steps toward improving integration, particularly for analytical use cases.

However, Confluent is the industry leader in operational data integration, serving as modern middleware for connecting mainframes, ERP systems, IoT devices, APIs, and edge environments. Many enterprises have already standardized on Confluent to move and process operational data in real time with high reliability and low latency.

Introducing yet another tool—especially for ETL and ingestion—creates unnecessary complexity. It risks a return to Lambda-style architectures, where separate pipelines must be built and maintained for real-time and batch use cases. This increases engineering overhead, inflates cost, and slows time to market.

Figure: Lambda architecture - separate ETL pipelines for real-time and batch processing

In contrast, Confluent supports a Kappa architecture model: a single, unified event-driven data streaming pipeline that powers both operational and analytical workloads. This eliminates duplication, simplifies the data flow, and enables consistent, trusted data delivery from source to sink.

Figure: Kappa architecture - a single data integration pipeline for real-time and batch processing

Confluent for Data Ingestion into Databricks

Confluent’s integration capabilities provide:

  • 100+ enterprise-grade connectors, including SAP, Salesforce, and mainframe systems
  • Native CDC support for Oracle, SQL Server, PostgreSQL, MongoDB, Salesforce, and more
  • Flexible integration via Kafka Clients for any relevant programming language, REST/HTTP, MQTT, JDBC, and other APIs
  • Support for operational sinks (not just analytics platforms)
  • Built-in governance, durability, and replayability

A good example: Confluent’s Oracle CDC Connector uses Oracle’s XStream API and delivers “GoldenGate-level performance”, with guaranteed ordering, high throughput, and minimal latency. This enables real-time delivery of operational data into Kafka, Flink, and downstream systems like Databricks.

Bottom line: Confluent offers the most mature, scalable, and flexible ingestion capabilities into Databricks—especially for real-time operational data. For enterprises already using Confluent as the central nervous system of their architecture, adding another ETL layer specifically for the lakehouse integration with weaker coverage and SLAs only slows progress and increases cost.

Stick with a unified approach—fewer moving parts, faster implementation, and end-to-end consistency.

Real-Time vs. Batch: When to Use Each

Batch ETL is well understood. It works fine when data does not need to be processed immediately—e.g., for end-of-day reports, monthly audits, or historical analysis.

Streaming ETL is best when data must be processed in motion. This enables real-time dashboards, live alerts, or AI features based on the latest information.

Confluent DSP is purpose-built for streaming ETL. Kafka and Flink allow filtering, transformation, enrichment, and routing in real time.

Databricks supports batch ELT natively. Delta Live Tables offers a managed way to build data pipelines on top of Spark, letting you declaratively define how data should be transformed and processed using SQL or Python. Spark Structured Streaming, on the other hand, can handle streaming data in near real time, but it still requires persistent clusters and infrastructure management.

If you’re already invested in Spark, Structured Streaming or Delta Live Tables might be sufficient. But if you’re starting fresh—or looking to simplify your architecture—Confluent’s Tableflow provides a more streamlined, Kafka-native alternative. Tableflow represents Kafka streams as Delta Lake tables. No cluster management. No offset handling. Just discoverable, governed data in Databricks Unity Catalog.
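
For comparison, here is a sketch of the Spark Structured Streaming route mentioned above: reading the Kafka topic yourself and maintaining a Delta table, including the checkpoint and offset management that Tableflow removes. The topic, broker address, schema, table, and paths are assumptions.

```python
# Sketch only: the do-it-yourself Spark Structured Streaming route into Delta.
# Topic, broker address, schema, table, and checkpoint path are assumptions;
# `spark` is the session provided by the Databricks notebook environment.
from pyspark.sql import functions as F

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders.validated")
    .option("startingOffsets", "latest")
    .load()
)

orders = raw.select(
    F.col("key").cast("string").alias("order_key"),
    F.from_json(F.col("value").cast("string"),
                "order_id STRING, amount DOUBLE, currency STRING").alias("o"),
    F.col("timestamp").alias("kafka_ts"),
).select("order_key", "o.*", "kafka_ts")

# The cluster, checkpoint, and offset handling below are exactly what Tableflow removes.
query = (
    orders.writeStream
    .format("delta")
    .option("checkpointLocation", "/Volumes/lakehouse/checkpoints/orders")
    .toTable("lakehouse.bronze.orders")
)
```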

Real-Time and Batch: A Perfect Match at Walmart for Replenishment Forecasting in the Supply Chain

Walmart demonstrates how real-time and batch processing can work together to optimize a large-scale, high-stakes supply chain.

At the heart of this architecture is Apache Kafka, powering Walmart’s real-time inventory management and replenishment system.

Kafka serves as the central data hub, continuously streaming inventory updates, sales transactions, and supply chain events across Walmart’s physical stores and digital channels. This enables real-time replenishment to ensure product availability and timely fulfillment for millions of online and in-store customers.

Batch processing plays an equally important role. Apache Spark processes historical sales, seasonality trends, and external factors in micro-batches to feed forecasting models. These forecasts are used to generate accurate daily order plans across Walmart’s vast store network.

Replenishment Supply Chain Logistics at Walmart Retail with Apache Kafka and Spark
Source: Walmart

This hybrid architecture brings significant operational and business value:

  • Kafka provides not just low latency, but true decoupling between systems, enabling seamless integration across real-time streams, batch pipelines, and request-response APIs—ensuring consistent, reliable data flow across all environments
  • Spark delivers scalable, high-performance analytics to refine predictions and improve long-term planning
  • The result: reduced cycle times, better accuracy, increased scalability and elasticity, improved resiliency, and substantial cost savings

Walmart’s supply chain is just one of many use cases where Kafka powers real-time business processes, decisioning and workflow orchestration at global scale—proof that combining streaming and batch is key to modern data infrastructure.

Apache Flink supports both streaming and batch processing within the same engine. This enables teams to build unified pipelines that handle real-time events and batch-style computations without switching tools or architectures. In Flink, batch is treated as a special case of streaming—where a bounded stream (or a complete window of events) can be processed once all data has arrived.

This approach simplifies operations by avoiding the need for parallel pipelines or separate orchestration layers. It aligns with the principles of the shift-left architecture, allowing earlier processing, validation, and enrichment—closer to the data source. As a result, pipelines are more maintainable, scalable, and responsive.
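A short PyFlink Table API sketch illustrates this unification: the same SQL statement runs in streaming or batch mode simply by switching the environment settings. The datagen source and column names are placeholders for illustration.

from pyflink.table import EnvironmentSettings, TableEnvironment

# Streaming mode processes an unbounded stream continuously;
# batch mode runs the exact same query over bounded input to completion.
settings = EnvironmentSettings.in_streaming_mode()   # or EnvironmentSettings.in_batch_mode()
t_env = TableEnvironment.create(settings)

# Illustrative bounded source; in practice this would be a Kafka or file-backed table.
t_env.execute_sql("""
    CREATE TABLE clicks (
        user_id STRING,
        url STRING
    ) WITH (
        'connector' = 'datagen',
        'number-of-rows' = '1000'
    )
""")

# The aggregation is identical for streaming and batch execution.
t_env.execute_sql(
    "SELECT user_id, COUNT(*) AS clicks FROM clicks GROUP BY user_id"
).print()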

That said, batch processing is not going away—nor should it. For many use cases, batch remains the most practical solution. Examples include:

  • Daily financial reconciliations
  • End-of-day retail reporting
  • Weekly churn model training
  • Monthly compliance and audit jobs

In these cases, latency is not critical, and workloads often involve large volumes of historical data or complex joins across datasets.

This is where Databricks excels—especially with its Delta Lake and Medallion architecture, which structures raw, refined, and curated data layers for high-performance analytics, BI, and AI/ML training.

In summary, Flink offers the flexibility to consolidate streaming and batch pipelines, making it ideal for unified data processing. But when batch is the right choice—especially at scale or with complex transformations—Databricks remains a best-in-class platform. The two technologies are not mutually exclusive. They are complementary parts of a modern data stack.

Streaming CDC and Lakehouse Analytics

Streaming CDC is a key integration pattern. It captures changes from operational databases and pushes them into analytics platforms. But CDC isn’t limited to databases. CDC is just as important for business applications like Salesforce, where capturing customer updates in real time enables faster, more responsive analytics and downstream actions.

Confluent is well suited for this. Kafka Connect and Flink can continuously stream changes. These change events are sent to Databricks as Delta tables using Tableflow. Streaming CDC ensures:

  • Data consistency across operational and analytical workloads leveraging a single data pipeline
  • Reduced ETL / ELT lag
  • Near real-time updates to BI dashboards
  • Timely training of AI/ML models

Streaming CDC also avoids data duplication, reduces latency, and minimizes storage costs.
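For illustration, the following self-managed PyFlink sketch consumes a Debezium-formatted CDC topic as a continuously updated table. The topic, broker, and column names are assumptions, and the example expects the Flink Kafka SQL connector on the classpath; in Confluent Cloud, the same logic would typically be expressed in managed Flink SQL.

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# A CDC changelog topic (Debezium envelope) interpreted as a changing table.
t_env.execute_sql("""
    CREATE TABLE customers (
        id BIGINT,
        email STRING,
        tier STRING
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'cdc.public.customers',
        'properties.bootstrap.servers' = 'broker:9092',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'debezium-json'
    )
""")

# Aggregations over the changelog always reflect the latest state per key,
# without any batch reload of the source database.
t_env.execute_sql(
    "SELECT tier, COUNT(*) AS customers FROM customers GROUP BY tier"
).print()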

Reverse ETL: An (Anti) Pattern to Avoid with Confluent and Databricks

Some architectures push data from data lakes or warehouses back into operational systems using reverse ETL. While this may appear to bridge the analytical and operational worlds, it often leads to increased latency, duplicate logic, and fragile point-to-point workflows. These tools typically reprocess data that was already transformed once, leading to inefficiencies, governance issues, and unclear data lineage.

Reverse ETL is an architectural anti-pattern. It violates the principles of an event-driven system. Rather than reacting to events as they happen, reverse ETL introduces delays and additional moving parts—pushing stale insights back into systems that expect real-time updates.

Data at Rest and Reverse ETL

With the upcoming bidirectional integration of Tableflow with Delta Lake, these issues can be avoided entirely. Insights generated in Databricks—from analytics, machine learning, or rule-based engines—can be pushed directly back into Kafka topics.

This approach removes the need for reverse ETL tools, reduces system complexity, and ensures that both operational and analytical layers operate on a shared, governed, and timely data foundation.

It also brings lineage, schema enforcement, and observability into both directions of data flow—streamlining feedback loops and enabling true event-driven decisioning across the enterprise.

In short: Don’t pull data back into operational systems after the fact. Push insights forward at the speed of events.
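As a simple illustration of pushing insights forward, the sketch below publishes a model score from the analytical layer to a Kafka topic that operational services subscribe to. The topic name, fields, and scoring workflow are hypothetical; with the bidirectional Tableflow integration described above, this hop can be handled by the platform itself rather than by custom code.

import json
from confluent_kafka import Producer

# Illustrative configuration; Confluent Cloud would also require API key/secret and SASL settings.
producer = Producer({"bootstrap.servers": "broker:9092"})

def publish_churn_score(customer_id: str, score: float) -> None:
    # Publish a (hypothetical) churn score so operational systems can react immediately.
    event = {"customer_id": customer_id, "churn_score": score}
    producer.produce(
        topic="customer.churn.scores",
        key=customer_id,
        value=json.dumps(event),
    )

publish_churn_score("C-1001", 0.87)
producer.flush()  # Ensure delivery before the job exits.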

Multi-Cloud and Hybrid Integration with an Event-Driven Architecture

Confluent is designed for distributed, real-time data movement across environments for operational and analytical use cases:

  • On-prem, cloud, and edge
  • Multi-region and multi-cloud
  • Support for SaaS, BYOC, and private networking

Features like Cluster Linking and Schema Registry ensure consistent replication and governance across environments.
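As a small example of how governance stays consistent across environments, the sketch below registers and reads back an Avro schema from a central Schema Registry that on-prem, edge, and cloud clients all share. The URL, credentials, and subject name are placeholders.

from confluent_kafka.schema_registry import Schema, SchemaRegistryClient

# Central Schema Registry shared by all environments (endpoint and credentials are placeholders).
client = SchemaRegistryClient({
    "url": "https://schema-registry.example.com",
    "basic.auth.user.info": "API_KEY:API_SECRET",
})

order_schema = Schema(
    schema_str="""
    {
      "type": "record",
      "name": "Order",
      "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount", "type": "double"}
      ]
    }
    """,
    schema_type="AVRO",
)

# Register the schema (idempotent if unchanged) and read back the latest version
# that producers and consumers in every environment validate against.
schema_id = client.register_schema("orders-value", order_schema)
latest = client.get_latest_version("orders-value")
print(schema_id, latest.version)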

Databricks runs only in the cloud. It supports hybrid access and partner integrations. But the platform is not built for event-driven data distribution across hybrid environments.

In a hybrid architecture, Confluent acts as the bridge. It moves operational data securely and reliably, and Databricks then consumes it for analytics and AI. Here is an example architecture for industrial IoT use cases:

Data Streaming and Lakehouse with Confluent and Databricks for Hybrid Cloud and Industrial IoT

Uniper: Real-Time Energy Trading with Confluent and Databricks

Uniper, a leading international energy company, leverages Confluent and Databricks to modernize its energy trading operations.

Uniper - The beating of energy

I covered the value of data streaming with Apache Kafka and Flink for energy trading in a dedicated blog post already.

Confluent Cloud with Apache Kafka and Apache Flink provides a scalable real-time data streaming foundation for Uniper, enabling efficient ingestion and processing of market data, IoT sensor inputs, and operational events. This setup supports the full trading lifecycle, improving decision-making, risk management, and operational agility.

Apache Kafka and Flink integrated into the Uniper IT landscape

Within its Azure environment, Uniper uses Databricks to empower business users to rapidly build trading decision-support tools and advanced analytics applications. By combining a self-service data platform with scalable processing power, Uniper significantly reduces the lead time for developing data apps—from weeks to just minutes.

To deliver real-time insights to its teams, Uniper also leverages Plotly’s Dash Enterprise, creating interactive dashboards that consolidate live data from Databricks, Kafka, Snowflake, and various databases. This end-to-end integration enables dynamic, collaborative workflows, giving analysts and traders fast, actionable insights that drive smarter, faster trading strategies.

By combining real-time data streaming, advanced analytics, and intuitive visualization, Uniper has built a resilient, flexible data architecture that meets the demands of today’s fast-moving energy markets.

From Ingestion to Insight: Modern Data Integration and Processing for AI with Confluent and Databricks

While both platforms can handle integration and processing, their roles are different:

  • Use Confluent when you need real-time ingestion and processing of operational and analytical workloads, or data delivery across systems and clouds.
  • Use Databricks for AI workloads, analytics and data warehousing.

When used together, Confluent and Databricks form a complete data integration and processing pipeline for AI and analytics:

  1. Confluent ingests and processes operational data in real time.
  2. Tableflow pushes this data into Delta Lake in a discoverable, secure format.
  3. Databricks performs analytics and model development.
  4. Tableflow (bidirectional) pushes insights or AI models back into Kafka for use in operational systems.

This is the foundation for modern data and AI architectures—real-time pipelines feeding intelligent applications.

Stay tuned for deep dives into how these platforms are shaping the future of data-driven enterprises. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to analytical platforms like Databricks.

The post Confluent Data Streaming Platform vs. Databricks Data Intelligence Platform for Data Integration and Processing appeared first on Kai Waehner.

The Past, Present, and Future of Confluent (The Kafka Company) and Databricks (The Spark Company) https://www.kai-waehner.de/blog/2025/05/02/the-past-present-and-future-of-confluent-the-kafka-company-and-databricks-the-spark-company/ Fri, 02 May 2025 07:10:42 +0000 https://www.kai-waehner.de/?p=7755 Confluent and Databricks have redefined modern data architectures, growing beyond their Kafka and Spark roots. Confluent drives real-time operational workloads; Databricks powers analytical and AI-driven applications. As operational and analytical boundaries blur, native integrations like Tableflow and Delta Lake unify streaming and batch processing across hybrid and multi-cloud environments. This blog explores the platforms’ evolution and how, together, they enable enterprises to build scalable, data-driven architectures. The Michelin success story shows how combining real-time data and AI unlocks innovation and resilience.

The post The Past, Present, and Future of Confluent (The Kafka Company) and Databricks (The Spark Company) appeared first on Kai Waehner.

Confluent and Databricks are two of the most influential platforms in modern data architectures. Both have roots in open source. Both focus on enabling organizations to work with data at scale. And both have expanded their mission well beyond their original scope.

Confluent and Databricks are often described as serving different parts of the data architecture—real-time vs. batch, operational vs. analytical, data streaming vs. artificial intelligence (AI). But the lines are not always clear. Confluent can run batch workloads and embed AI. Databricks can handle (near) real-time pipelines. With Flink, Confluent supports both operational and analytical processing. Databricks can run operational workloads, too—if latency, availability, and delivery guarantees meet the project’s requirements. 

This blog explores where these platforms came from, where they are now, how they complement each other in modern enterprise architectures—and why their roles are future-proof in a data- and AI-driven world.

Data Streaming and Lakehouse - Comparison of Confluent with Apache Kafka and Flink and Databricks with Spark

About the Confluent and Databricks Blog Series

This article is part of a blog series exploring the growing roles of Confluent and Databricks in modern data and AI architectures:

Stay tuned for deep dives into how these platforms are shaping the future of data-driven enterprises. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to analytical platforms like Databricks.

Operational vs. Analytical Workloads

Confluent and Databricks were designed for different workloads, but the boundaries are not always strict.

Confluent was built for operational workloads—moving and processing data in real time as it flows through systems. This includes use cases like real-time payments, fraud detection, system monitoring, and streaming pipelines.

Databricks focuses on analytical workloads—enabling large-scale data processing, machine learning, and business intelligence.

That said, there is no clear black and white separation. Confluent, especially with the addition of Apache Flink, can support analytical processing on streaming data. Databricks can handle operational workloads too, provided the SLAs—such as latency, uptime, and delivery guarantees—are sufficient for the use case.

With Tableflow and Delta Lake, both platforms can now be natively connected, allowing real-time operational data to flow into analytical environments, and AI insights to flow back into real-time systems—effectively bridging operational and analytical workloads in a unified architecture.

From Apache Kafka and Spark to (Hybrid) Cloud Platforms: Confluent and Databricks both have strong open source roots—Kafka and Spark, respectively—but have taken different branding paths.

Confluent: From Apache Kafka to a Data Streaming Platform (DSP)

Confluent is well known as “The Kafka Company.” It was founded by the original creators of Apache Kafka over ten years ago. Kafka is now widely adopted for event streaming in over 150,000 organizations worldwide. Confluent operates tens of thousands of clusters with Confluent Cloud across all major cloud providers, as well as in customers’ data centers and edge locations.

But Confluent has become much more than just Kafka. It offers a complete data streaming platform (DSP).

Confluent Data Streaming Platform (DSP) Powered by Apache Kafka and Flink
Source: Confluent

This includes:

  • Apache Kafka as the core messaging and persistence layer
  • Data integration via Kafka Connect for databases and business applications, a REST/HTTP proxy for request-response APIs and clients for all relevant programming languages
  • Stream processing via Apache Flink and Kafka Streams (read more about the past, present and future of stream processing)
  • Tableflow for native integration with lakehouses that support the open table format standard via Delta Lake and Apache Iceberg
  • 24/7 SLAs, security, data governance, disaster recovery – for the most critical workloads companies run
  • Deployment options: Everywhere (not just cloud) – SaaS, on-prem, edge, hybrid, stretched across data centers, multi-cloud, BYOC (bring your own cloud)

Databricks: From Apache Spark to a Data Intelligence Platform

Databricks has followed a similar evolution. Known initially as “The Spark Company,” it is the original force behind Apache Spark. But Databricks no longer emphasizes Spark in its branding. Spark is still there under the hood, but it’s no longer the dominant story.

Today, it positions itself as the Data Intelligence Platform, focused on AI and analytics.

Databricks Data Intelligence Platform and Lakehouse
Source: Databricks

Key components include:

  • Fully cloud-native deployment model—Databricks is now a cloud-only platform providing BYOC and Serverless products
  • Delta Lake and Unity Catalog for table format standardization and governance
  • Model development and AI/ML tools
  • Data warehouse workloads
  • Tools for data scientists and data engineers

Together, Confluent and Databricks meet a wide range of enterprise needs and often complement each other in shared customer environments from the edge to multi-cloud data replication and analytics.

Real-Time vs. Batch Processing

A major point of comparison between Confluent and Databricks lies in how they handle data processing—real-time versus batch—and how they increasingly converge through shared formats and integrations.

Data Processing and Data Sharing “In Motion” vs. “At Rest”

A key difference between the platforms lies in how they process and share data.

Confluent focuses on data in motion—real-time streams that can be filtered, transformed, and shared across systems as they happen.

Databricks focuses on data at rest—data that has landed in a lakehouse, where it can be queried, aggregated, and used for analysis and modeling.

Data Streaming versus Lakehouse

Both platforms offer native capabilities for data sharing. Confluent provides Stream Sharing, which enables secure, real-time sharing of Kafka topics across organizations and environments. Databricks offers Delta Sharing, an open protocol for sharing data from Delta Lake tables with internal and external consumers.
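On the consumption side, Delta Sharing is an open protocol with client libraries; the sketch below uses the delta-sharing Python connector with a hypothetical profile file and share, schema, and table names. Confluent Stream Sharing, by contrast, is granted and consumed through Confluent Cloud and standard Kafka clients, so it needs no dedicated snippet here.

import delta_sharing

# Profile file issued by the data provider (path and names are hypothetical).
profile = "/path/to/provider.share"
table_url = f"{profile}#sales_share.curated.orders"

# Load the shared Delta table into a pandas DataFrame for local analysis.
orders = delta_sharing.load_as_pandas(table_url)
print(orders.head())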

In many enterprise architectures, the two vendors work together. Kafka and Flink handle continuous real-time processing for operational workloads and data ingestion into the lakehouse. Databricks handles AI workloads (model training and some of the model inference), business intelligence (BI), and reporting. Both platforms do data integration: ETL in the case of Confluent, ELT in the case of Databricks.

Many organizations still use Databricks’ Apache Spark Structured Streaming to connect Kafka and Databricks. That’s a valid pattern, especially for teams with Spark expertise.

Flink is available as a serverless offering in Confluent Cloud that can scale down to zero when idle, yet remains highly scalable—even for complex stateful workloads. It supports multiple languages, including Python, Java, and SQL. 

For self-managed environments, Kafka Streams offers a lightweight alternative to running Flink in a self-managed Confluent Platform. But be aware that Kafka Streams is limited to Java and operates as a client library embedded directly within the application. Read my dedicated article to learn about the trade-offs between Apache Flink and Kafka Streams.

Stream and Batch Data Processing with Kafka Streams, Apache Flink and Spark

In short: use what works. If Spark Structured Streaming is already in place and meets your needs, keep it. For new use cases, Apache Flink or Kafka Streams might be the better choice for stream processing workloads. Either way, make sure to understand the concepts and value of stateless and stateful stream processing before defaulting to batch pipelines.

Confluent Tableflow: Unify Operational and Analytic Workloads with Open Table Formats (such as Apache Iceberg and Delta Lake)

Databricks is actively investing in Delta Lake and Unity Catalog to structure, govern, and secure data for analytical applications. The acquisition of Tabular—founded by the original creators of Apache Iceberg—demonstrates Databricks’ commitment to supporting open standards.

Confluent’s Tableflow materializes Kafka streams into Apache Iceberg or Delta Lake tables—automatically, reliably, and efficiently. This native integration between Confluent and Databricks is faster, simpler, and more cost-effective than using a Spark connector or other ETL tools.

Tableflow reads the Kafka segments, validates the schema against Schema Registry, and creates Parquet files and the corresponding table metadata.

Confluent Tableflow Architecture to Integrate Apache Kafka with Iceberg and Delta Lake for Databricks
Source: Confluent

Native stream processing with Apache Flink also plays a growing role. It enables unified real-time and batch stream processing in a single engine. Flink’s ability to “shift left” data processing (closer to the source) supports early validation, enrichment, and transformation. This simplifies the architecture and reduces the need for always-on Spark clusters, which can drive up cost.

These developments highlight how Databricks and Confluent address different but complementary layers of the data ecosystem.

Confluent + Databricks = A Strategic Partnership for Future-Proof AI Architectures

Confluent and Databricks are not competing platforms—they’re complementary. While they serve different core purposes, there are areas where their capabilities overlap. In those cases, it’s less about which is better and more about which fits best for your architecture, team expertise, SLA or latency requirements. The real value comes from understanding how they work together and where you can confidently choose the platform that serves your use case most effectively.

Confluent and Databricks recently deepened their partnership with Tableflow integration with Delta Lake and Unity Catalog. This integration makes real-time Kafka data available inside Databricks as Delta tables. It reduces the need for custom pipelines and enables fast access to trusted operational data.

The architecture supports AI end to end—from ingesting real-time operational data to training and deploying models—all with built-in governance and flexibility. Importantly, data can originate from anywhere: mainframes, on-premise databases, ERP systems, IoT and edge environments or SaaS cloud applications.

With this setup, you can:

  • Feed data from 100+ Confluent sources (Mainframe, Oracle, SAP, Salesforce, IoT, HTTP/REST applications, and so on) into Delta Lake
  • Use Databricks for AI model development and business intelligence
  • Push models back into Kafka and Flink for real-time model inference with critical, operational SLAs and latency

Both directions will be supported. Governance and security metadata flows alongside the data.

Confluent and Databricks Partnership and Bidirectional Integration for AI and Analytics
Source: Confluent

Michelin: Real-Time Data Streaming and AI Innovation with Confluent and Databricks

A great example of how Confluent and Databricks complement each other in practice is Michelin’s digital transformation journey. As one of the world’s largest tire manufacturers, Michelin set out to become a data-first and digital enterprise. To achieve this, the company needed a foundation for real-time operational data movement and a scalable analytical platform to unlock business insights and drive AI initiatives.

Confluent @ Michelin: Real-Time Data Streaming Pipelines

Confluent Cloud plays a critical role at Michelin by powering real-time data pipelines across their global operations. Migrating from self-managed Kafka to Confluent Cloud on Microsoft Azure enabled Michelin to reduce operational complexity by 35%, meet strict 99.99% SLAs, and speed up time to market by up to nine months. Real-time inventory management, order orchestration, and event-driven supply chain processes are now possible thanks to a fully managed data streaming platform.

Databricks @ Michelin: Centralized Lakehouse

Meanwhile, Databricks empowers Michelin to democratize data access across the organization. By building a centralized lakehouse architecture, Michelin enabled business users and IT teams to independently access, analyze, and develop their own analytical use cases—from predicting stock outages to reducing carbon emissions in logistics. With Databricks’ lakehouse capabilities, they scaled to support hundreds of use cases without central bottlenecks, fostering a vibrant community of innovators across the enterprise.

The synergy between Confluent and Databricks at Michelin is clear:

  • Confluent moves operational data in real time, ensuring fresh, trusted information flows across systems (including Databricks).
  • Databricks transforms data into actionable insights, using powerful AI, machine learning, and analytics capabilities.

Confluent + Databricks @ Michelin = Cloud-Native Data-Driven Enterprise

Together, Confluent and Databricks allow Michelin to shift from batch-driven, siloed legacy systems to a cloud-native, real-time, data-driven enterprise—paving the road toward higher agility, efficiency, and customer satisfaction.

As Yves Caseau, Group Chief Digital & Information Officer at Michelin, summarized: “Confluent plays an integral role in accelerating our journey to becoming a data-first and digital business.”

And as Joris Nurit, Head of Data Transformation, added: “Databricks enables our business users to better serve themselves and empowers IT teams to be autonomous.”

The Michelin success story perfectly illustrates how Confluent and Databricks, when used together, bridge operational and analytical workloads to unlock the full value of real-time, AI-powered enterprise architectures.

Confluent and Databricks: Better Together!

Confluent and Databricks are both leaders in different – but connected – layers of the modern data stack.

If you want real-time, event-driven data pipelines, Confluent is the right platform. If you want powerful analytics, AI, and ML, Databricks is a great fit.

Together, they allow enterprises to bridge operational and analytical workloads—and to power AI systems with live, trusted data.

In the next post, I will explore how Confluent’s Data Streaming Platform compares to the Databricks Data Intelligence Platform for data integration and processing.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to analytical platforms like Databricks.

The post The Past, Present, and Future of Confluent (The Kafka Company) and Databricks (The Spark Company) appeared first on Kai Waehner.

Real-Time Data Sharing in the Telco Industry for MVNO Growth and Beyond with Data Streaming https://www.kai-waehner.de/blog/2025/04/30/real-time-data-sharing-in-the-telco-industry-for-mvno-growth-and-beyond-with-data-streaming/ Wed, 30 Apr 2025 07:04:07 +0000 https://www.kai-waehner.de/?p=7786 The telecommunications industry is transforming rapidly as Telcos expand partnerships with MVNOs, IoT platforms, and enterprise customers. Traditional batch-driven architectures can no longer meet the demands for real-time, secure, and flexible data access. This blog explores how real-time data streaming technologies like Apache Kafka and Flink, combined with hybrid cloud architectures, enable Telcos to build trusted, scalable data ecosystems. It covers the key components of a modern data sharing platform, critical use cases across the Telco value chain, and how policy-driven governance and tailored data products drive new business opportunities, operational excellence, and regulatory compliance. Mastering real-time data sharing positions Telcos to turn raw events into strategic advantage faster and more securely than ever before.

The post Real-Time Data Sharing in the Telco Industry for MVNO Growth and Beyond with Data Streaming appeared first on Kai Waehner.

The telecommunications industry is entering a new era. Partnerships with MVNOs, IoT platforms, and enterprise customers demand flexible, secure, and real-time access to network and customer data. Traditional batch-driven architectures are no longer sufficient. Instead, real-time data streaming combined with policy-driven data sharing provides a powerful foundation for building scalable data products for internal and external consumers. A modern Telco must manage data collection, processing, governance, data sharing, and distribution with the same rigor as its core network services. Leading Telcos now operate centralized real-time data streaming platforms to integrate and share network events, subscriber information, billing records, and telemetry from thousands of data sources across the edge and core networks.

Data Sharing for MVNO Growth and Beyond with Data Streaming in the Telco Industry

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including a dedicated chapter about the telco industry.

Data Streaming in the Telco Industry

Telecommunications networks generate vast amounts of data every second. Every call, message, internet session, device interaction, and network event produces valuable information. Historically, much of this data was processed in batches — often hours or even days after it was collected. This delayed model no longer meets the needs of modern Telcos, partners, and customers.

Data streaming transforms how Telcos handle information. Instead of storing and processing data later, it is ingested, processed, and acted upon in real time as it is generated. This enables continuous intelligence across all parts of the network and business.

Learn more about “The Top 20 Problems with Batch Processing (and How to Fix Them with Data Streaming)“.

Business Value of Data Streaming in the Telecom Sector

Key benefits of data streaming for Telcos include:

  • Real-Time Visibility: Immediate insight into network health, customer behavior, fraud attempts, and service performance.
  • Operational Efficiency: Faster detection and resolution of issues reduces downtime, improves customer satisfaction, and lowers operating costs.
  • New Revenue Opportunities: Real-time data enables new services such as dynamic pricing, personalized offers, and proactive customer support.
  • Enhanced Security and Compliance: Immediate anomaly detection and instant auditability support regulatory requirements and protect against cyber threats.

Technologies like Apache Kafka and Apache Flink are now core components of Telco IT architectures. They allow Telcos to integrate massive, distributed data flows from radio access networks (RAN), 5G core systems, IoT ecosystems, billing and support platforms, and customer devices.

Modern Telcos use data streaming to not only improve internal operations but also to deliver trusted, secure, and differentiated services to external partners such as MVNOs, IoT platforms, and enterprise customers.

Learn More about Data Streaming in Telco

Learn more about data streaming in the telecommunications sector:

Data streaming is not an all-purpose tool that solves every problem. Hence, a modern enterprise architecture combines data streaming with purpose-built, telco-specific platforms and SaaS solutions, as well as data lakes, warehouses, and lakehouses like Snowflake or Databricks for analytical workloads.

I have already written about combining data streaming platforms like Confluent with Snowflake and Microsoft Fabric. A blog series about data streaming with Confluent combined with AI and analytics using Databricks follows right after this blog post.

Building a Real-Time Data Sharing Platform in the Telco Industry with Data Streaming

By mastering real-time data streaming, Telcos unlock the ability to share valuable insights securely and efficiently with internal divisions, IoT platforms, and enterprise customers.

Mobile Virtual Network Operators (MVNOs) — companies that offer mobile services without owning their own network infrastructure — are an equally important group of consumers. Because MVNOs deliver niche services, competitive pricing, and tailored customer experiences, real-time data sharing becomes essential to support their growth and enable differentiation in a highly competitive market.

Real-Time Data Sharing Between Organizations Is Necessary in the Telco Industry

A strong real-time data sharing platform in the telco industry integrates multiple types of components and stakeholders, organized into four critical areas:

Data Sources

A real-time data platform aggregates information from a wide range of technical systems across the Telco infrastructure.

  • Radio Access Network (RAN) Metrics: Capture real-time information about signal quality, handovers, and user session performance.
  • 5G Core Network Functions: Manage traffic flows, session lifecycles, and device mobility through UPF, SMF, and AMF components.
  • Operational Support Systems (OSS) and Business Support Systems (BSS): Provide data for service assurance, provisioning, customer management, and billing processes.
  • IoT Devices: Send continuous telemetry data from connected vehicles, industrial assets, healthcare monitors, and consumer electronics.
  • Customer Premises Equipment (CPE): Supply performance and operational data from routers, gateways, modems, and set-top boxes.
  • Billing Events: Stream usage records, real-time charging information, and transaction logs to support accurate billing.
  • Customer Profiles: Update subscription plans, user preferences, device types, and behavioral attributes dynamically.
  • Security Logs: Capture authentication events, threat detections, network access attempts, and audit trail information.

Stream Processing

Stream processing technologies ensure raw events are turned into enriched, actionable data products as they move through the system.

  • Real-Time Data Ingestion: Continuously collect and process events from all sources with low latency and high reliability.
  • Data Aggregation and Enrichment: Transform raw network, billing, and device data into structured, valuable datasets.
  • Actionable Data Products: Create enriched, ready-to-consume information for operational and business use cases across the ecosystem.

Data Governance

Effective governance frameworks guarantee that data sharing is secure, compliant, and aligned with commercial agreements.

  • Policy-Based Access Control: Enforce business, regulatory, and contractual rules on how data is shared internally and externally.
  • Data Protection Techniques: Apply masking, anonymization, and encryption to secure sensitive information at every stage.
  • Compliance Assurance: Meet regulatory requirements like GDPR, CCPA, and telecom-specific standards through real-time monitoring and enforcement.

Data Consumers

Multiple internal and external stakeholders rely on tailored, policy-controlled access to real-time data streams to achieve business outcomes.

  • MVNO Partners: Consume real-time network metrics, subscriber insights, and fraud alerts to offer better customer experiences and safeguard operations.
  • Internal Telco Divisions: Use operational data to improve network uptime, optimize marketing initiatives, and detect revenue leakage early.
  • IoT Platform Services: Rely on device telemetry and mobility data to improve fleet management, predictive maintenance, and automated operations.
  • Enterprise Customers: Integrate real-time network insights and SLA compliance monitoring into private network and corporate IT systems.
  • Regulatory and Compliance Bodies: Access live audit streams, security incident data, and privacy-preserving compliance reports as required by law.

Key Data Products Driving Value for Data Sharing in the Telco Industry

In modern Telco architectures, data products act as the building blocks for a data mesh approach, enabling decentralized ownership, scalable integration with microservices, and direct access for consumers across the business and partner ecosystem.

Data Sharing in Telco with a Data Mesh and Data Products using Data Streaming with Apache Kafka

The right data products accelerate time-to-insight and enable additional revenue streams. Leading Telcos typically offer:

  • Network Quality Metrics: Monitoring service degradation, latency spikes, and coverage gaps continuously.
  • Customer Behavior Analytics: Tracking app usage, mobility patterns, device types, and engagement trends.
  • Fraud and Anomaly Detection Feeds: Capturing unusual usage, SIM swaps, or suspicious roaming activities in real time.
  • Billing and Charging Data Streams: Delivering session records and consumption details instantly to billing systems or MVNO partners.
  • Device Telemetry and Health Data: Providing operational status and error signals from smartphones, CPE, and IoT devices.
  • Subscriber Profile Updates: Streaming changes in service plans, device upgrades, or user preferences.
  • Location-Aware Services Data: Powering geofencing, smart city applications, and targeted marketing efforts.
  • Churn Prediction Models: Scoring customer retention risks based on usage behavior and network experience.
  • Network Capacity and Traffic Forecasts: Helping optimize resource allocation and investment planning.
  • Policy Compliance Monitoring: Ensuring real-time validation of internal and external SLAs, privacy agreements, and regulatory requirements.

These data products can be offered via APIs, secure topics, or integrated into partner platforms for direct consumption.
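As a sketch of how an external consumer such as an MVNO partner might subscribe to one of these data products, the example below reads a shared network quality topic with the confluent-kafka Python client. The topic name, credentials, and message fields are assumptions; in practice, access is granted through Stream Sharing or scoped API keys according to the governance policies described above.

import json
from confluent_kafka import Consumer

# Illustrative consumer configuration for a partner with read access to a shared topic.
consumer = Consumer({
    "bootstrap.servers": "kafka.sharing.telco.example.com:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "PARTNER_API_KEY",
    "sasl.password": "PARTNER_API_SECRET",
    "group.id": "mvno-quality-monitoring",
    "auto.offset.reset": "latest",
})
consumer.subscribe(["telco.network.quality.metrics"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None or msg.error():
            continue
        metric = json.loads(msg.value())
        # React in real time, e.g., alert when latency for a cell exceeds the agreed SLA threshold.
        if metric.get("latency_ms", 0) > 150:
            print(f"SLA alert for cell {metric.get('cell_id')}: {metric['latency_ms']} ms")
finally:
    consumer.close()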

How Each Data Consumer Gains Strategic Value

Real-time data streaming empowers each data consumer within the Telco ecosystem to achieve specific business outcomes, drive operational excellence, and unlock new growth opportunities based on continuous, trusted insights.

Internal Telco Divisions

Real-time insights into network behavior allow proactive incident management and customer support. Marketing teams optimize campaigns based on live subscriber data, while finance teams minimize revenue leakage by tracking billing and usage patterns instantly.

MVNO Partners

Access to live network quality indicators helps MVNOs improve customer satisfaction and loyalty. Real-time fraud monitoring protects against financial losses. Tailored subscriber insights enable MVNOs to offer personalized plans and upsells based on actual usage.

IoT Platform Services

Large-scale telemetry streaming enables better device management, predictive maintenance, and operational automation. Real-time geolocation data improves logistics, fleet management, and smart infrastructure performance. Event-driven alerts help detect and resolve device malfunctions rapidly.

Enterprise Customers

Private 5G networks and managed services depend on live analytics to meet SLA obligations. Enterprises integrate real-time network telemetry into their own systems for smarter decision-making. Data-driven optimizations ensure higher uptime, better resource utilization, and enhanced customer experiences.

Building a Trusted Data Ecosystem for Telcos with Real-Time Streaming and Hybrid Cloud

Real-time data sharing is no longer a luxury for Telcos — it is a competitive necessity. A successful platform must balance openness with control, ensuring that every data exchange respects privacy, governance, and commercial boundaries.

Hybrid cloud architectures play a critical role in this evolution. They enable Telcos to process, govern, and share real-time data across on-premises infrastructure, edge environments, and public clouds seamlessly. By combining the flexibility of cloud-native services with the security and performance of on-premises systems, hybrid cloud ensures that data remains accessible, scalable, cost-efficient and compliant wherever it is needed.

Hybrid 5G Telco Architecture with Data Streaming with AWS Cloud and Confluent Edge and Cloud

By deploying scalable data streaming solutions across a hybrid cloud environment, Telcos enable secure, real-time data sharing with MVNOs, IoT platforms, enterprise customers, and internal business units. This empowers critical use cases such as dynamic quality of service monitoring, real-time fraud detection, customer behavior analytics, predictive maintenance for connected devices, and SLA compliance reporting — all without compromising performance or regulatory requirements.

The future of telecommunications belongs to those who implement real-time data streaming and controlled data sharing — turning raw events into strategic advantage faster, more securely, and more effectively than ever before.

How do you share data in your organization? Do you already leverage data streaming or still operate in batch mode? Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And make sure to download my free book about data streaming use cases.

The post Real-Time Data Sharing in the Telco Industry for MVNO Growth and Beyond with Data Streaming appeared first on Kai Waehner.
