SAP Archives - Kai Waehner
https://www.kai-waehner.de/blog/category/sap/

Data Streaming Meets the SAP Ecosystem and Databricks – Insights from SAP Sapphire Madrid
https://www.kai-waehner.de/blog/2025/05/28/data-streaming-meets-the-sap-ecosystem-and-databricks-insights-from-sap-sapphire-madrid/ (28 May 2025)

SAP Sapphire 2025 in Madrid brought together global SAP users, partners, and technology leaders to showcase the future of enterprise data strategy. Key themes included SAP’s Business Data Cloud (BDC) vision, Joule for Agentic AI, and the deepening SAP-Databricks partnership. A major topic throughout the event was the increasing need for real-time integration across SAP and non-SAP systems—highlighting the critical role of event-driven architectures and data streaming platforms like Confluent. This blog shares insights on how data streaming enhances SAP ecosystems, supports AI initiatives, and enables industry-specific use cases across transactional and analytical domains.

I had the opportunity to attend SAP Sapphire 2025 in Madrid—an impressive gathering of SAP customers, partners, and technology leaders from around the world. It was a massive event, bringing the global SAP community together to explore the company’s future direction, innovations, and growing ecosystem.

A key highlight was SAP’s deepening integration of Databricks as an OEM partner for AI and analytics within the SAP Business Data Cloud—showing how the ecosystem is evolving toward more open, composable architectures.

At the same time, conversations around Confluent and data streaming highlighted the critical role real-time integration plays in connecting SAP systems (including ERP, MES, DataSphere, Databricks, etc.) with the rest of the enterprise. As always, it was a great place to learn, connect, and discuss where enterprise data architecture is heading—and how technologies like data streaming are enabling that transformation.

Data Streaming with Confluent Meets SAP and Databricks for Agentic AI at Sapphire in Madrid

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, focusing on industry scenarios, success stories and business value.

SAP’s Vision: Business Data Cloud, Joule, and Strategic Ecosystem Moves

SAP presented a broad and ambitious strategy centered around the SAP Business Data Cloud (BDC), SAP Joule (including its Agentic AI initiative), and strategic collaborations like SAP Databricks, SAP DataSphere, and integrations across multiple cloud platforms. The vision is clear: SAP wants to connect business processes with modern analytics, AI, and automation.

SAP ERP with Business Technology Platform BTP and Joule for Agentic AI in the Cloud
Source: SAP

For those of us working in data streaming and integration, these developments present a major opportunity. Most customers I meet globally use SAP ERP or other products like MES, SuccessFactors, or Ariba. The relevance of real-time data streaming in this space is undeniable—and it’s growing.

Building the Bridge: Event-Driven Architecture + SAP

One of the most exciting things about SAP Sapphire is seeing how event-driven architecture is becoming more relevant—even if the conversations don’t start with “Apache Kafka” or “Data Streaming.” In the SAP ecosystem, discussions often focus on business outcomes first, then architecture second. And that’s exactly how it should be.

Many SAP customers are moving toward hybrid cloud environments, where data lives in SAP systems, Salesforce, Workday, ServiceNow, and more. There’s no longer a belief in a single, unified data model. Master Data Management (MDM) as a one-size-fits-all solution has lost its appeal, simply because the real world is more complex.

This is where data streaming with Apache Kafka, Apache Flink, etc. fits in perfectly. Event streaming enables organizations to connect their SAP solutions with the rest of the enterprise—for real-time integration across operational systems, analytics platforms, AI engines, and more. It supports transactional and analytical use cases equally well and can be tailored to each industry’s needs.

Data Streaming with Confluent as Integration Middleware for SAP ERP DataSphere Joule Databricks with Apache Kafka
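
To illustrate what such real-time integration looks like at the lowest level, here is a minimal Python sketch that publishes an SAP-style order event to a Kafka topic using the confluent-kafka client. The topic name and event fields are hypothetical placeholders; in practice this data usually flows through a connector or an SAP integration layer rather than a hand-written producer.

```python
import json

from confluent_kafka import Producer

# Minimal sketch: publish one SAP-style order event to a Kafka topic.
# The topic name and event fields are illustrative placeholders only.
producer = Producer({"bootstrap.servers": "localhost:9092"})

order_event = {
    "order_id": "4711",
    "event_type": "ORDER_CHANGED",
    "plant": "DE01",
    "quantity": 25,
}

def delivery_report(err, msg):
    # Confirms delivery or surfaces an error for each produced message.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [{msg.partition()}] at offset {msg.offset()}")

producer.produce(
    "sap.erp.orders",  # hypothetical topic name
    key=order_event["order_id"],
    value=json.dumps(order_event).encode("utf-8"),
    callback=delivery_report,
)
producer.flush()
```

Once an event is on such a topic, the same record can feed SAP DataSphere, Databricks, or any other consumer without additional point-to-point integrations.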

In the SAP ecosystem, customers typically don’t look for open source frameworks to assemble their own solutions—they look for a reliable, enterprise-grade platform that just works. That’s why Confluent’s data streaming platform is an excellent fit: it combines the power of Kafka and Flink with the scalability, security, governance, and cloud-native capabilities enterprises expect.

SAP, Databricks, and Confluent – A Triangular Partnership

At the event, I had some great conversations—often literally sitting between leaders from SAP and Databricks. Watching how these two players are evolving—and where Confluent fits into the picture—was eye-opening.

SAP and Databricks are working closely together, especially with the SAP Databricks OEM offering that integrates Databricks into the SAP Business Data Cloud as an embedded AI and analytics engine. SAP DataSphere also plays a central role here, serving as a gateway into SAP’s structured data.

Meanwhile, Databricks is expanding into the operational domain, not just the analytical lakehouse. After acquiring Neon (a Postgres-compatible cloud-native database), Databricks is expected to announce an additional transactional OLTP solution of its own soon. This shows how rapidly they’re moving beyond batch analytics into the world of operational workloads—areas where Kafka and event streaming have traditionally provided the backbone.

Enterprise Architecture with Confluent and SAP and Databricks for Analytics and AI

This trend opens up a significant opportunity for data streaming platforms like Confluent to play a central role in modern SAP data architectures. As platforms like Databricks expand their capabilities, the demand for real-time, multi-system integration and cross-platform data sharing continues to grow.

Confluent is uniquely positioned to meet this need—offering not just data movement, but also the ability to process, govern, and enrich data in motion using tools like Apache Flink, plus a broad ecosystem of connectors for transactional systems such as SAP ERP, Oracle databases, and IBM mainframes, as well as cloud services like Snowflake, ServiceNow or Salesforce.

Data Products, Not Just Pipelines

The term “data product” was mentioned in nearly every conversation—whether from the SAP angle (business semantics and ownership), Databricks (analytics-first), or Confluent (independent, system-agnostic, streaming-native). The key message? Everyone wants real-time, reusable, discoverable data products.

Data Product - The Domain Driven Microservice for Data

This is where an event-driven architecture powered by a data streaming platform shines: Data Streaming connects everything and distributes data to both operational and analytical systems, with governance, durability, and flexibility at the core.

Confluent’s data streaming platform enables the creation of data products from a wide range of enterprise systems, complementing the SAP data products being developed within the SAP Business Data Cloud. The strength of the partnership lies in the ability to combine these assets—bringing together SAP-native data products with real-time, event-driven data products built from non-SAP systems connected through Confluent. This integration creates a unified, scalable foundation for both operational and analytical use cases across the enterprise.

Industry-Specific Use Cases to Explore the Business Value of SAP and Data Streaming

One major takeaway: in the SAP ecosystem, generic messaging around cutting-edge technologies such as Apache Kafka does not work. Success comes from being well-prepared—knowing which SAP systems are involved (ECC, S/4HANA, on-prem, or cloud) and what role they play in the customer’s architecture. The conversations must be use case-driven, often tailored to industries like manufacturing, retail, logistics, or the public sector.

This level of specificity is new to many people working in the technical world of Kafka, Flink, and data streaming. Developers and architects often approach integration from a tool- or framework-centric perspective. However, SAP customers expect business-aligned solutions that address concrete pain points in their domain—whether it’s real-time order tracking in logistics, production analytics in manufacturing, or spend transparency in the public sector.

Understanding the context of SAP’s role in the business process, along with industry regulations, workflows, and legacy system constraints, is key to having meaningful conversations. For the data streaming community, this is a shift in mindset—from building pipelines to solving business problems—and it represents a major opportunity to bring strategic value to enterprise customers.

You are lucky: I just published a free ebook about data streaming use cases focusing on industry scenarios and business value: “The Ultimate Data Streaming Guide”.

Looking Forward: SAP, Data Streaming, AI, and Open Table Formats

Another theme to watch: data lake and format standardization. All cloud providers and data vendors like Databricks, Confluent or Snowflake are investing heavily in supporting open table formats like Apache Iceberg (alongside Delta Lake at Databricks) to standardize analytical integrations and reduce storage costs significantly.

SAP’s investment in Agentic AI through SAP Joule reflects a broader trend across the enterprise software landscape, with vendors like Salesforce, ServiceNow, and others embedding intelligent agents into their platforms. This creates a significant opportunity for Confluent to serve as the streaming backbone—enabling real-time coordination, integration, and decision-making across these diverse, distributed systems.

An event-driven architecture powered by data streaming is crucial for the success of Agentic AI with SAP Joule, Databricks AI agents, and other operational systems that need to be integrated into the business processes. The strategic partnership between Confluent and Databricks makes it even easier to implement end-to-end AI pipelines across the operational and analytical estates.

SAP Sapphire Madrid was a valuable reminder that data streaming is no longer a niche technology—it’s a foundation for digital transformation. Whether it’s SAP ERP, Databricks AI, or new cloud-native operational systems, a Data Streaming Platform connects them all in real time to enable new business models, better customer experiences, and operational agility.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, focusing on industry scenarios, success stories and business value.

Databricks and Confluent in the World of Enterprise Software (with SAP as Example)
https://www.kai-waehner.de/blog/2025/05/12/databricks-and-confluent-in-the-world-of-enterprise-software-with-sap-as-example/ (12 May 2025)

Enterprise data lives in complex ecosystems—SAP, Oracle, Salesforce, ServiceNow, IBM Mainframes, and more. This article explores how Confluent and Databricks integrate with SAP to bridge operational and analytical workloads in real time. It outlines architectural patterns, trade-offs, and use cases like supply chain optimization, predictive maintenance, and financial reporting, showing how modern data streaming unlocks agility, reuse, and AI-readiness across even the most SAP-centric environments.

Modern enterprises rely heavily on operational systems like SAP ERP, Oracle, Salesforce, ServiceNow and mainframes to power critical business processes. But unlocking real-time insights and enabling AI at scale requires bridging these systems with modern analytics platforms like Databricks. This blog explores how Confluent’s data streaming platform enables seamless integration between SAP, Databricks, and other systems to support real-time decision-making, AI-driven automation, and agentic AI use cases. It shows how Confluent delivers the real-time backbone needed to build event-driven, future-proof enterprise architectures—supporting everything from inventory optimization and supply chain intelligence to embedded copilots and autonomous agents.

Enterprise Application Integration with Confluent and Databricks for Oracle, SAP, Salesforce, ServiceNow et al.

About the Confluent and Databricks Blog Series

This article is part of a blog series exploring the growing roles of Confluent and Databricks in modern data and AI architectures:

Future articles will explore how these platforms affect data use in businesses. Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to other operational and analytical platforms like SAP and Databricks.

Most Enterprise Data Is Operational

Enterprise software systems generate a constant stream of operational data across a wide range of domains. This includes orders and inventory from SAP ERP systems, often extended with real-time production data from SAP MES. Oracle databases capture transactional data critical to core business operations, while MongoDB contributes operational data—frequently used as a CDC source or, in some cases, as a sink for analytical queries. Customer interactions are tracked in platforms like Salesforce CRM, and financial or account-related events often originate from IBM mainframes. 

Together, these systems form the backbone of enterprise data, requiring seamless integration for real-time intelligence and business agility. This data is often not immediately available for analytics or AI unless it’s integrated into downstream systems.

Confluent is built to ingest and process this kind of operational data in real time. Databricks can then consume it for AI and machine learning, dashboards, or reports. Together, SAP, Confluent and Databricks create a real-time architecture for enterprise decision-making.

SAP Product Landscape for Operational and Analytical Workloads

SAP plays a foundational role in the enterprise data landscape—not just as a source of business data, but as the system of record for core operational processes across finance, supply chain, HR, and manufacturing.

At a high level, the SAP product portfolio today has three categories: SAP Business AI, SAP Business Data Cloud (BDC), and SAP Business Applications powered by SAP Business Technology Platform (BTP).

SAP Product Portfolio Categories
Source: SAP

To support both operational and analytical needs, SAP offers a portfolio of platforms and tools, while also partnering with best-in-class technologies like Databricks and Confluent.

Operational Workloads (Transactional Systems):

  • SAP S/4HANA – Modern ERP for core business operations
  • SAP ECC – Legacy ERP platform still widely deployed
  • SAP CRM / SCM / SRM – Domain-specific business systems
  • SAP Business One / Business ByDesign – ERP solutions for mid-market and subsidiaries

Analytical Workloads (Data & Analytics Platforms):

  • SAP Datasphere – Unified data fabric to integrate, catalog, and govern SAP and non-SAP data
  • SAP Analytics Cloud (SAC) – Visualization, reporting, and predictive analytics
  • SAP BW/4HANA – Data warehousing and modeling for SAP-centric analytics

SAP Business Data Cloud (BDC)

SAP Business Data Cloud (BDC) is a strategic initiative within SAP Business Technology Platform (BTP) that brings together SAP’s data and analytics capabilities into a unified cloud-native experience. It includes:

  • SAP Datasphere as the data fabric layer, enabling seamless integration of SAP and third-party data
  • SAP Analytics Cloud (SAC) for consuming governed data via dashboards and reports
  • SAP’s partnership with Databricks to allow SAP data to be analyzed alongside non-SAP sources in a lakehouse architecture
  • Real-time integration scenarios enabled through Confluent and Apache Kafka, bringing operational data in motion directly into SAP and Databricks environments

Together, this ecosystem supports real-time, AI-powered, and governed analytics across operational and analytical workloads—making SAP data more accessible, trustworthy, and actionable within modern cloud data architectures.

SAP Databricks OEM: Limited Scope, Full Control by SAP

SAP recently announced an OEM partnership with Databricks, embedding parts of Databricks’ serverless infrastructure into the SAP ecosystem. While this move enables tighter integration and simplified access to AI workloads within SAP, it comes with significant trade-offs. The OEM model is narrowly scoped, optimized primarily for ML and GenAI scenarios on SAP data, and lacks the openness and flexibility of native Databricks.

This integration is not intended for full-scale data engineering. Core capabilities such as workflows, streaming, Delta Live Tables, and external data connections (e.g., Snowflake, S3, MS SQL) are missing. The architecture is based on data at rest and does not embrace event-driven patterns. Compute options are limited to serverless only, with no infrastructure control. Pricing is complex and opaque, with customers often needing to license Databricks separately to unlock full capabilities.

Critically, SAP controls the entire data integration layer through its BDC Data Products, reinforcing a vendor lock-in model. While this may benefit SAP-centric organizations focused on embedded AI, it restricts broader interoperability and long-term architectural flexibility. In contrast, native Databricks, i.e., outside of SAP, offers a fully open, scalable platform with rich data engineering features across diverse environments.

Whichever Databricks option you prefer, this is where Confluent adds value—offering a truly event-driven, decoupled architecture that complements both SAP Datasphere and Databricks, whether used within or outside the SAP OEM framework.

Confluent and SAP Integration

Confluent provides native and third-party connectors to integrate with SAP systems to enable continuous, low-latency data flow across business applications.

SAP ERP Confluent Data Streaming Integration Access Patterns
Source: Confluent

This powers modern, event-driven use cases that go beyond traditional batch-based integrations:

  • Low-latency access to SAP transactional data
  • Integration with other operational source systems like Salesforce, Oracle, IBM Mainframe, MongoDB, or IoT platforms
  • Synchronization between SAP DataSphere and other data warehouse and analytics platforms such as Snowflake, Google BigQuery or Databricks 
  • Decoupling of applications for modular architecture
  • Data consistency across real-time, batch and request-response APIs
  • Hybrid integration across any edge, on-premise or multi-cloud environments

SAP Datasphere and Confluent

To expand its role in the modern data stack, SAP introduced SAP Datasphere—a cloud-native data management solution designed to extend SAP’s reach into analytics and data integration. Datasphere aims to simplify access to SAP and non-SAP data across hybrid environments.

SAP Datasphere simplifies data access within the SAP ecosystem, but it has key drawbacks when compared to open platforms like Databricks, Snowflake, or Google BigQuery:

  • Closed Ecosystem: Optimized for SAP, but lacks flexibility for non-SAP integrations.
  • No Event Streaming: Focused on data at rest, with limited support for real-time processing or streaming architectures.
  • No Native Stream Processing: Relies on batch methods, adding latency and complexity for hybrid or real-time use cases.

Confluent alleviates these drawbacks and supports this strategy through bi-directional integration with SAP Datasphere. This enables real-time streaming of SAP data into Datasphere and back out to operational or analytical consumers via Apache Kafka. It allows organizations to enrich SAP data, apply real-time processing, and ensure it reaches the right systems in the right format—without waiting for overnight batch jobs or rigid ETL pipelines.
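
The enrichment step described above would typically run in Apache Flink or ksqlDB; the following plain-Python sketch only illustrates the underlying consume-enrich-produce pattern with the confluent-kafka client. Topic names, event fields, and the static plant lookup are hypothetical placeholders, not an SAP or Confluent schema.

```python
import json

from confluent_kafka import Consumer, Producer

# Illustrative consume-enrich-produce loop. In production this logic would
# usually run in Apache Flink or ksqlDB. Topic names are placeholders.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "sap-enrichment",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})

# Hypothetical reference data, e.g. loaded from a master-data topic or table.
plant_metadata = {"DE01": {"region": "EMEA"}, "US10": {"region": "AMER"}}

consumer.subscribe(["sap.erp.material-movements"])
try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        # Enrich the SAP event with context before it reaches Datasphere,
        # Databricks, or other downstream consumers.
        event["region"] = plant_metadata.get(event.get("plant"), {}).get("region", "UNKNOWN")
        producer.produce(
            "sap.material-movements.enriched",
            value=json.dumps(event).encode("utf-8"),
        )
        producer.poll(0)  # serve delivery callbacks
finally:
    producer.flush()
    consumer.close()
```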

Confluent for Agentic AI with SAP Joule and Databricks

SAP is laying the foundation for agentic AI architectures with a vision centered around Joule—its generative AI copilot—and a tightly integrated data stack that includes SAP Databricks (via OEM), SAP Business Data Cloud (BDC), and a unified knowledge graph. On top of this foundation, SAP is building specialized AI agents for use cases such as customer 360, creditworthiness analysis, supply chain intelligence, and more.

SAP ERP with Business Technology Platform BTP and Joule for Agentic AI in the Cloud
Source: SAP

The architecture combines:

  • SAP Joule as the interface layer for generative insights and decision support
  • SAP’s foundational models and domain-specific knowledge graph
  • SAP BDC and SAP Databricks as the data and ML/AI backbone
  • Data from both SAP systems (ERP, CRM, HR, logistics) and non-SAP systems (e.g. clickstream, IoT, partner data, social media), integrated through the partnership with Confluent

But here’s the catch: What happens when agents need to communicate with one another to deliver a workflow? Such agentic systems require continuous, contextual, and event-driven data exchange—not just point-to-point API calls and nightly batch jobs.

This is where Confluent’s data streaming platform comes in as critical infrastructure.

Agentic AI with Apache Kafka as Event Broker

Confluent provides the real-time data streaming platform that connects the operational world of SAP with the analytical and AI-driven world of Databricks, enabling the continuous movement, enrichment, and sharing of data across all layers of the stack.

Agentic AI with Confluent as Event Broker for Databricks SAP and Oracle

The above is a conceptual view of the architecture. The AI agents on the left side could be built with SAP Joule, Databricks, or any “outside” GenAI framework.

The data streaming platform helps connect the AI agents with the rest of the enterprise architecture, both within SAP and Databricks and beyond:

  • Real-time data integration from non-SAP systems (e.g., mobile apps, IoT devices, mainframes, web logs) into SAP and Databricks
  • True decoupling of services and agents via an event-driven architecture (EDA), replacing brittle RPC or point-to-point API calls
  • Event replay and auditability—critical for traceable AI systems operating in regulated environments (see the replay sketch after this list)
  • Streaming pipelines for feature engineering and inference: stream-based model triggering with low-latency SLAs
  • Support for bi-directional flows: e.g., operational triggers in SAP can be enriched by AI agents running in Databricks and pushed back into SAP via Kafka events
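
As a small illustration of the replay point above, the sketch below re-reads an agent-decision topic from a chosen point in time so that earlier AI decisions can be audited. The topic name, consumer group, and single-partition assumption are placeholders for illustration only.

```python
from datetime import datetime, timezone

from confluent_kafka import Consumer, TopicPartition

# Sketch of event replay for audit and traceability: re-read an agent-decision
# topic from a given timestamp. Topic and group names are placeholders.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "agent-audit-replay",
    "enable.auto.commit": False,
})

replay_from = datetime(2025, 5, 1, tzinfo=timezone.utc)
timestamp_ms = int(replay_from.timestamp() * 1000)

# Resolve the offsets that correspond to the replay timestamp (partition 0 only here).
partitions = [TopicPartition("ai.agent.decisions", 0, timestamp_ms)]
offsets = consumer.offsets_for_times(partitions, timeout=10.0)
consumer.assign(offsets)

while True:
    msg = consumer.poll(1.0)
    if msg is None:
        break  # caught up (or timed out) for this simple sketch
    if msg.error():
        continue
    print(msg.timestamp(), msg.key(), msg.value())

consumer.close()
```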

Without Confluent, SAP’s agentic architecture risks becoming a patchwork of stateless services bound by fragile REST endpoints—lacking the real-time responsiveness, observability, and scalability required to truly support next-generation AI orchestration.

Confluent turns the SAP + Databricks vision into a living, breathing ecosystem—where context flows continuously, agents act autonomously, and enterprises can build future-proof AI systems that scale.

Data Streaming Use Cases Across SAP Product Suites

With Confluent, organizations can support a wide range of use cases across SAP product suites, including:

  1. Real-Time Inventory Visibility: Live updates of stock levels across warehouses and stores by streaming material movements from SAP ERP and SAP EWM, enabling faster order fulfillment and reduced stockouts (see the sketch after this list).
  2. Dynamic Pricing and Promotions: Stream sales orders and product availability in real time to trigger pricing adjustments or dynamic discounting via integration with SAP ERP and external commerce platforms.
  3. AI-Powered Supply Chain Optimization: Combine data from SAP ERP, SAP Ariba, and external logistics platforms to power ML models that predict delays, optimize routes, and automate replenishment.
  4. Shop Floor Event Processing: Stream sensor and machine data alongside order data from SAP MES, enabling real-time production monitoring, alerting, and throughput optimization.
  5. Employee Lifecycle Automation: Stream employee events (e.g., onboarding, role changes) from SAP SuccessFactors to downstream IT systems (e.g., Active Directory, badge systems), improving HR operations and compliance.
  6. Order-to-Cash Acceleration: Connect order intake (via web portals or Salesforce) to SAP ERP in real time, enabling faster order validation, invoicing, and cash flow.
  7. Procure-to-Pay Automation: Integrate procurement events from SAP Ariba and supplier portals with ERP and financial systems to streamline approvals and monitor supplier performance continuously.
  8. Customer 360 and CRM Synchronization: Synchronize customer master data and transactions between SAP ERP, SAP CX, and third-party CRMs like Salesforce to enable unified customer views.
  9. Real-Time Financial Reporting: Stream financial transactions from SAP S/4HANA into cloud-based lakehouses or BI tools for near-instant reporting and compliance dashboards.
  10. Cross-System Data Consistency: Ensure consistent master data and business events across SAP and non-SAP environments by treating SAP as a real-time event source—not just a system of record.

Example Use Case and Architecture with SAP, Databricks and Confluent

Consider a manufacturing company using SAP ERP for inventory management and Databricks for predictive maintenance. The combination of SAP Datasphere and Confluent enables seamless data integration from SAP systems, while the addition of Databricks supports advanced AI/ML applications—turning operational data into real-time, predictive insights.

With Confluent as the real-time backbone:

  • Machine telemetry (via MQTT or OPC-UA) and ERP events (e.g., stock levels, work orders) are streamed in real time.
  • Apache Flink enriches and filters the event streams—adding context like equipment metadata or location.
  • Tableflow publishes clean, structured data to Databricks as Delta tables for analytics and ML processing.
  • A predictive model hosted in Databricks detects potential equipment failure before it happens; a Flink application calls the remote model with low latency.
  • The resulting prediction is streamed back to Kafka, triggering an automated work order in SAP via event integration.

Enterprise Architecture with Confluent and SAP and Databricks for Analytics and AI

This bi-directional, event-driven pattern illustrates how Confluent enables seamless, real-time collaboration across SAP, Databricks, and IoT systems—supporting both operational and analytical use cases with a shared architecture.
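
The last step of this flow, turning a prediction into an SAP work-order trigger, can be sketched as a simple consume-and-produce loop. Topic names, payload fields, and the probability threshold below are illustrative assumptions; the real logic would typically run in Flink or inside the SAP integration layer.

```python
import json

from confluent_kafka import Consumer, Producer

# Sketch of the final step of the flow above: consume failure predictions and
# emit a work-order trigger event that an SAP integration can pick up.
# Topic names and payload fields are placeholders.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "maintenance-trigger",
    "auto.offset.reset": "latest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["maintenance.predictions"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        prediction = json.loads(msg.value())
        if prediction.get("failure_probability", 0.0) > 0.8:  # illustrative threshold
            work_order = {
                "equipment_id": prediction["equipment_id"],
                "action": "CREATE_MAINTENANCE_ORDER",
                "reason": "Predicted failure",
            }
            producer.produce(
                "sap.pm.work-order-requests",
                key=work_order["equipment_id"],
                value=json.dumps(work_order).encode("utf-8"),
            )
            producer.poll(0)
finally:
    producer.flush()
    consumer.close()
```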

Going Beyond SAP with Data Streaming

This pattern applies to other enterprise systems:

  • Salesforce: Stream customer interactions for real-time personalization through Salesforce Data Cloud
  • Oracle: Capture transactions via CDC (Change Data Capture)
  • ServiceNow: Monitor incidents and automate operational responses
  • Mainframe: Offload events from legacy applications without rewriting code
  • MongoDB: Sync operational data in real time to support responsive apps
  • Snowflake: Stream enriched operational data into Snowflake for near real-time analytics, dashboards, and data sharing across teams and partners
  • OpenAI (or other GenAI platforms): Feed real-time context into LLMs for AI-assisted recommendations or automation
  • “You name it”: Confluent’s prebuilt connectors and open APIs enable event-driven integration with virtually any enterprise system

Confluent provides the backbone for streaming data across all of these platforms—securely, reliably, and in real time.

Strategic Value for the Enterprise of Event-based Real-Time Integration with Data Streaming

Enterprise software platforms are essential. But they are often closed, slow to change, and not designed for analytics or AI.

Confluent provides real-time access to operational data from platforms like SAP. SAP Datasphere and Databricks enable analytics and AI on that data. Together, they support modern, event-driven architectures.

  • Use Confluent for real-time data streaming from SAP and other core systems
  • Use SAP Datasphere and Databricks to build analytics, reports, and AI on that data
  • Use Tableflow to connect the two platforms seamlessly

This modern approach to data integration delivers tangible business value, especially in complex enterprise environments. It enables real-time decision-making by allowing business logic to operate on live data instead of outdated reports. Data products become reusable assets, as a single stream can serve multiple teams and tools simultaneously. By reducing the need for batch layers and redundant processing, the total cost of ownership (TCO) is significantly lowered. The architecture is also future-proof, making it easy to integrate new systems, onboard additional consumers, and scale workflows as business needs evolve.

Beyond SAP: Enabling Agentic AI Across the Enterprise

The same architectural discussion applies across the enterprise software landscape. As vendors embed AI more deeply into their platforms, the effectiveness of these systems increasingly depends on real-time data access, continuous context propagation, and seamless interoperability.

Without an event-driven foundation, AI agents remain limited—trapped in siloed workflows and brittle API chains. Confluent provides the scalable, reliable backbone needed to enable true agentic AI in complex enterprise environments.

Examples of AI solutions driving this evolution include:

  • SAP Joule / Business AI – Context-aware agents and embedded AI across ERP, finance, and supply chain
  • Salesforce Einstein / Copilot Studio – Generative AI for CRM, service, and marketing automation built on top of Salesforce Data Cloud
  • ServiceNow Now Assist – Intelligent workflows and predictive automation in ITSM and Ops
  • Oracle Fusion AI / OCI AI Services – Embedded machine learning in ERP, HCM, and SCM
  • Microsoft Copilot (Dynamics / Power Platform) – AI copilots across business and low-code apps
  • Workday AI – Smart recommendations for finance, workforce, and HR planning
  • Adobe Sensei GenAI – GenAI for content creation and digital experience optimization
  • IBM watsonx – Governed AI foundation for enterprise use cases and data products
  • Infor Coleman AI – Industry-specific AI for supply chain and manufacturing systems
  • All the “traditional” cloud providers and data platforms such as Snowflake with Cortex, Microsoft Azure Fabric, AWS SageMaker, AWS Bedrock, and GCP Vertex AI

Each of these platforms benefits from a streaming-first architecture that enables real-time decisions, reusable data, and smarter automation across the business.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including technical architectures and the relation to other operational and analytical platforms like SAP and Databricks.

How Siemens Healthineers Leverages Data Streaming with Apache Kafka and Flink in Manufacturing and Healthcare
https://www.kai-waehner.de/blog/2024/12/17/how-siemens-healthineers-leverages-data-streaming-with-apache-kafka-and-flink-in-manufacturing-and-healthcare/ (17 Dec 2024)

Siemens Healthineers, a global leader in medical technology, delivers solutions that improve patient outcomes and empower healthcare professionals. As part of the Siemens AG family, Siemens Healthineers stands out with innovative products, data-driven solutions, and services designed to optimize workflows, improve precision, and enhance efficiency in healthcare systems worldwide. A significant aspect of their technological prowess lies in their use of data streaming to unlock real-time insights and optimize processes. This blog post explores how Siemens Healthineers uses data streaming with Apache Kafka and Flink, their cloud-focused technology stack, and the use cases that drive tangible business value such as real-time logistics, robotics, SAP ERP integration, AI/ML, and more.

Data Streaming with Apache Kafka and Flink in Healthcare and Manufacturing at Siemens Healthineers

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and following me on LinkedIn or X (formerly Twitter) to stay in touch.

Siemens Healthineers: Shaping the Future of Healthcare Technology

Who They Are

Siemens AG, a global powerhouse in industrial manufacturing, energy, and technology, has been a leader in innovation for over 170 years. Known for its groundbreaking contributions across sectors, Siemens combines engineering expertise with digitalization to shape industries worldwide. Within this ecosystem, Siemens Healthineers stands out as a pivotal player in healthcare technology.

Siemens Healthineers Company Overview
Source: Siemens Healthineers

With over 71,000 employees operating in 70+ countries, Siemens Healthineers supports critical clinical decisions in healthcare. Over 90% of leading hospitals worldwide collaborate with them, and their technologies influence over 70% of critical clinical decisions.

Their Vision

Siemens Healthineers focuses on innovation through data and AI, aiming to streamline healthcare delivery. With more than 24,000 technical intellectual property rights, including 15,000 granted patents, their technological foundation enables precision medicine, enhanced diagnostics, and patient-centric solutions.

Smart Logistics and Manufacturing at Siemens
Source: Siemens Healthineers

Siemens Healthineers and Data Streaming for Healthcare and Manufacturing

Siemens is a large conglomerate. I already covered a few data streaming use cases at other Siemens divisions. For instance, the integration project from SAP ERP on-premise to Salesforce CRM in the cloud.

At the Data in Motion Tour 2024 in Frankfurt, Arash Attarzadeh (“Apache Kafka Jedi“) from Siemens Healthineers presented several very interesting success stories that leverage data streaming using Apache Kafka, Flink, Confluent, and its entire ecosystem.

Healthcare and manufacturing processes generate massive volumes of real-time data. Whether it’s monitoring devices on production floors, analyzing telemetry data from hospitals, or optimizing logistics, Siemens Healthineers recognizes that data streaming enables:

  • Real-time insights: Immediate and continuous action on events as they happen.
  • Improved decision-making: Faster and more accurate responses.
  • Cost efficiency: Reduced downtime and optimized operations.

Healthineers Data Cloud

The Siemens Healthineers Data Cloud serves as the backbone of their data strategy. Built on a robust technology stack, it facilitates real-time data ingestion, transformation, and analytics using tools like Confluent Cloud (including Apache Kafka and Flink) and Snowflake.

Siemens Healthineers Data Cloud Technology Stack with Apache Kafka and Snowflake for Healthcare
Source: Siemens Healthineers

This combination of leading SaaS solutions enables seamless integration of streaming data with batch processes and diverse analytics platforms.

Technology Stack: Healthineers Data Cloud

Key Components

  • Confluent Cloud (Apache Kafka): For real-time data ingestion, data integration and stream processing.
  • Snowflake: A centralized warehouse for analytics and reporting.
  • Matillion: Batch ETL processes for structured and semi-structured data.
  • IoT Data Integration: Sensors and PLCs collect data from manufacturing floors, often via MQTT.

Machine Monitoring and Streaming Analytics with MQTT Confluent Kafka and TensorFlow AI ML in Healthcare and Manufacturing
Source: Siemens Healthineers

Many other solutions are critical for some use cases. Siemens Healthineers also uses Databricks, dbt, OPC-UA, and many other systems for the end-to-end data pipelines.

Diverse Data Ingestion

  • Real-Time Streaming: IoT data (sensors, PLCs) is ingested within minutes.
  • Batch Processing: Structured and semi-structured data from SAP systems.
  • Change Data Capture (CDC): Data changes in SAP sources are captured and available in under 30 minutes.

Not every data integration process is or can be real-time. Data consistency is still one of the most underrated capabilities of data streaming. Apache Kafka supports real-time, batch and request-response APIs communicating with each other in a consistent way.
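
For the MQTT-based ingestion mentioned above, a minimal bridge can look like the sketch below, which forwards shop-floor telemetry from an MQTT broker into a Kafka topic. It assumes the paho-mqtt 1.x callback style and uses placeholder broker addresses and topic names; in practice a Kafka Connect MQTT source connector or a broker-level bridge is usually the better choice.

```python
import paho.mqtt.client as mqtt
from confluent_kafka import Producer

# Sketch of an MQTT-to-Kafka bridge for shop-floor telemetry.
# Broker addresses and topic names are placeholders; callback style
# follows paho-mqtt 1.x.
producer = Producer({"bootstrap.servers": "localhost:9092"})

def on_message(client, userdata, message):
    # Forward every MQTT telemetry message to a Kafka topic as-is,
    # keeping the MQTT topic as the Kafka record key.
    producer.produce(
        "iot.machine.telemetry",
        key=message.topic,
        value=message.payload,
    )
    producer.poll(0)

mqtt_client = mqtt.Client()
mqtt_client.on_message = on_message
mqtt_client.connect("mqtt-broker.local", 1883)
mqtt_client.subscribe("plant1/+/telemetry")
mqtt_client.loop_forever()
```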

Use Cases for Data Streaming at Siemens Healthineers

Siemens Healthineers described six different use cases that leverage data streaming together with various other IoT, software and cloud services:

  1. Machine monitoring and predictive maintenance
  2. Data integration layer for analytics
  3. Machine and robot integration
  4. Telemetry data processing for improved diagnostics
  5. Real-time logistics with SAP events for better supply chain efficiency
  6. Track and Trace Orders for improved customer satisfaction and ensured compliance

Let’s take a look at them in the following subsections.

1. Machine Monitoring and Predictive Maintenance in Manufacturing

Goal: To ensure the smooth operation of production devices through predictive maintenance.

Using data streaming, real-time IoT data from drill machines is ingested into Kafka topics, where it’s analyzed to predict maintenance needs. By using a TensorFlow machine learning model for inference with Apache Kafka, Siemens Healthineers can:

  • Reduce machine downtime.
  • Optimize maintenance schedules.
  • Increase productivity in manufacturing CT scanners.

Business Value: Predictive maintenance reduces operational costs and prevents production halts, ensuring timely delivery of critical medical equipment.

2. IQ-Data Intelligence from IoT and SAP to Cloud

Goal: Develop an end-to-end data integration layer for analytics.

Data from various lifecycle phases (e.g., SAP systems, IoT interfaces via MQTT using Mosquitto, external sources) is streamed into a consistent model using stream processing with ksqlDB. The resulting data backend supports the development of MLOps architectures and enables advanced analytics.

AI MLOps with Kafka Stream Processing Qlik Tableau BI at Siemens Healthineers
Source: Siemens Healthineers

Business Value: Streamlined data integration accelerates the development of AI applications, helping data scientists and analysts make quicker, more informed decisions.

3. Machine Integration with SAP and KUKA Robots

Goal: Integrate machine data for analytics and real-time insights.

Data from SAP systems (such as SAP ME and SAP PCO) and machines like KUKA robots is streamed into Snowflake for analytics. MQTT brokers and Apache Kafka manage real-time data ingestion and facilitate predictive analytics.

Siemens Machine Integration with SAP KUKA Jungheinrich Kafka Confluent Cloud Snowflake
Source: Siemens Healthineers

Business Value: Enhanced machine integration improves production quality and supports the shift toward smart manufacturing processes.

4. Digital Healthcare Service Operations using Data Streaming

Goal: Stream telemetry data from Siemens Healthineers products for analytics.

Telemetry data from hospital devices is streamed via WebSockets to Kafka and processed continuously with ksqlDB. Insights are fed back to clients for improved diagnostics.

Business Value: By leveraging real-time device data, Siemens Healthineers enhances the reliability of its medical equipment and improves patient outcomes.

5. Real-Time Logistics with SAP Events and Confluent Cloud

Goal: Stream SAP logistics event data for real-time packaging and shipping updates.

Using Confluent Cloud, Siemens Healthineers reduces delays in packaging and shipping by enabling real-time insights into logistics processes.

SAP Logistics Integration with Apache Kafka for Real-Time Shipping Points
Source: Siemens Healthineers

Business Value: Improved packaging planning reduces delivery times and enhances supply chain efficiency, ensuring faster deployment of medical devices.

6. Track and Trace Orders with Apache Kafka and Snowflake

Goal: Real-time order tracking using streaming data.

Data from Siemens Healthineers orders is streamed into Snowflake using Kafka for real-time monitoring. This enables detailed tracking of orders throughout the supply chain.

Business Value: Enhanced order visibility improves customer satisfaction and ensures compliance with regulatory requirements.

Real-Time Data as a Catalyst for Healthcare and Manufacturing Innovation at Siemens Healthineers

Siemens Healthineers’ innovative use of data streaming exemplifies how real-time insights can drive efficiency, reliability, and innovation in healthcare and manufacturing. By leveraging tools like Confluent (including Apache Kafka and Flink), MQTT and Snowflake, and transitioning some workloads to the cloud, they’ve built a robust infrastructure to handle diverse data streams, improve decision-making, and deliver tangible business outcomes.

From predictive maintenance to enhanced supply chain visibility, the adoption of data streaming unlocks value at every stage of the production and service lifecycle. For Siemens Healthineers, these advancements translate into better patient care, streamlined operations, and a competitive edge in the dynamic healthcare industry.

To learn more about the relationship between these key technologies and their applications in different use cases, explore the articles below:

Do you have similar use cases and architectures like Siemens Healthineers to leverage data streaming with Apache Kafka and Flink in the healthcare and manufacturing sector? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Data Streaming in Healthcare and Pharma: Use Cases and Insights from Cardinal Health
https://www.kai-waehner.de/blog/2024/11/28/data-streaming-in-healthcare-and-pharma-use-cases-cardinal-health/ (28 Nov 2024)

This blog explores Cardinal Health’s journey, showing how its event-driven architecture and data streaming power use cases like supply chain optimization and medical device and equipment management. By integrating Apache Kafka with platforms like Apigee, Dell Boomi and SAP, Cardinal Health sets a benchmark for IT modernization and innovation in the healthcare and pharma sectors.

SAP Datasphere and Apache Kafka as Data Fabric for S/4HANA ERP Integration
https://www.kai-waehner.de/blog/2024/01/03/sap-datasphere-and-apache-kafka-as-data-fabric-for-s4hana-erp-integration/ (03 Jan 2024)

SAP is the leading ERP solution across industries around the world. Data integration with other data platforms, applications, databases, and APIs is one of the hardest challenges in the IT and software landscape. This blog post explores how SAP Datasphere in conjunction with the data streaming platform Apache Kafka enables a reliable, scalable and open data fabric for connecting SAP business objects of ECC and S/4HANA ERP with other real-time, batch, or request-response interfaces.

SAP Datasphere and Apache Kafka as Data Fabric for ERP Integration

What is SAP ERP?

SAP is a German multinational software corporation that develops enterprise software to manage business operations and customer relations. SAP is best known for its ERP (Enterprise Resource Planning) software, which helps organizations integrate and streamline their business processes.

A wide range of industries and companies of all sizes use it. SAP ERP is one of the most widely used ERP solutions globally. Contrary to what many people think, SAP is not a single product. Over the years, SAP has expanded its product portfolio. It includes cloud-based solutions, analytics, database management, and other enterprise software applications.

SAP ECC, S/4HANA, and more ERP Products

SAP offers a range of ERP products that cater to different business needs and industries. Some of the key SAP ERP products include:

  1. SAP S/4HANA: SAP S/4HANA is the flagship ERP suite that represents the next generation of SAP’s ERP solutions. The product is built on the SAP HANA in-memory database and provides a simplified data model, improved user experience, and advanced analytics capabilities. It covers core business functions, such as finance, supply chain, manufacturing, procurement, and more.
  2. SAP ERP Central Component (ECC): ECC is the predecessor to SAP S/4HANA and is still widely used by many organizations. It includes various modules, such as SAP ERP Financials, SAP ERP Human Capital Management (HCM), SAP ERP Operations, and others.
  3. SAP Business ByDesign: This is a cloud-based ERP solution designed for small to medium-sized enterprises (SMEs). It integrates core business functions, such as financials, human resources, procurement, supply chain management, and customer relationship management (CRM).
  4. SAP Business One: Another ERP solution targeted at small and medium-sized businesses, SAP Business One is an integrated suite that covers areas such as accounting, sales, purchasing, inventory, and production.
  5. SAP S/4HANA Cloud: This is a cloud-based version of SAP S/4HANA, offering similar functionalities but with the advantages of cloud deployment, including scalability, accessibility, and reduced infrastructure management.
  6. SAP Business Suite: This is a set of business applications that includes SAP ERP and other related products. It comprises different modules to address various business processes.
  7. SAP All-in-One: This is an industry-specific version of SAP ERP designed for midsize companies. It provides pre-configured industry solutions for sectors such as manufacturing, retail, and healthcare.

This product list might be out of date when you read it. SAP continuously develops its product offerings. Products get new names from time to time, consolidate, or deprecate. In other words, SAP modernization, integration, and migration are usually an ongoing effort that never ends.

What is SAP Datasphere?

SAP Datasphere is the next generation of SAP Data Warehouse Cloud. The platform provides a comprehensive data service that enables data professionals to deliver seamless and scalable access to critical business data.

Datasphere High Level Architecture

SAP Datasphere is a cloud-based product packaged within SAP’s Business Technology Platform (BTP). Datasphere brings together two previously standalone products, SAP Data Intelligence Cloud (DIC) and SAP Data Warehouse Cloud (DWC), into one cloud-native data integration and data management platform. The solution allows SAP customers to ingest, integrate, store, and analyze core SAP ERP data, as well as to share this data with other analytical services and downstream applications.

SAP Datasphere = Cloud Data Warehouse and Analytics Platform

Datasphere is the core part of a new solution, known as Business Data Fabric, to simplify data integration and management involving SAP ERP backend data. A key focus of SAP Datasphere is business intelligence and analytics.

I see Datasphere as similar to Snowflake or Databricks, i.e., a general data warehouse / data lake / lakehouse, but focused on SAP data with deep integration into the SAP ERP ecosystem and surrounding applications.

However, the out-of-the-box availability of SAP ERP data from SAP ECC, S/4HANA, and other SAP apps enables a simple but powerful opportunity for data integration beyond the SAP landscape. No need to use legacy SAP protocols like BAPI or IDoc anymore. Instead, SAP Datasphere provides a unified way to discover, connect, and manage data across different data sources, systems, and landscapes.

Features of SAP Datasphere and Complementary Software Partnerships

The key features of SAP Datasphere include:

  1. Data Connectivity: SAP Datasphere enables organizations to connect to and access data from various sources, whether they are on-premises or in the cloud. It supports integration with different databases, data lakes, and other data repositories.
  2. Data Orchestration: The platform allows organizations to orchestrate data flows and processing across different data environments. This can be essential for managing complex data pipelines and ensuring data consistency and coherence.
  3. Data Governance: SAP Datasphere includes features for data governance, providing tools for managing metadata, ensuring data quality, and enforcing data policies across the distributed landscape.
  4. Unified Data Discovery: The platform offers a unified view of data assets, helping organizations discover and understand the available data resources across their entire landscape.
  5. Multi-Cloud and Edge Support: SAP Datasphere works in multi-cloud and edge computing environments, providing flexibility and scalability for organizations with diverse data storage and processing needs.

This sounds like any other data management platform, doesn’t it?

But the above features are focusing mainly on SAP environments. Therefore, Datasphere has a few strategic software partnerships:

  • Confluent (data streaming)
  • Databricks (data lakehouse)
  • Collibra (data governance)
  • Data Robot (automated machine learning)

This emphasizes the strength of Datasphere around the SAP ecosystem. The other partners connect non-SAP IT infrastructure and applications with SAP environments bidirectionally.

SAP Datasphere = One-Stop-Shop for Multi-Generation SAP ERP Systems

SAP Datasphere is more than just an analytical platform for SAP ERP data.

Datasphere leverages SAP internal tooling to access data directly from SAP systems. It is a complete data integration and analytics solution optimized for collecting and preparing data from all SAP ERP systems of multiple generations. For the first time in their history, SAP is making core ERP data from numerous back-end systems available in a one-stop-shop fashion through Datasphere.

This brings us to the excellent opportunity of combining SAP business objects with Apache Kafka and the rest of the enterprise architecture.

Why Apache Kafka for SAP Integration?

Apache Kafka is a distributed streaming platform that has gained widespread popularity for its ability to handle large-scale, real-time data streaming and event processing. When it comes to SAP integration, there are several reasons organizations choose to use Apache Kafka:

  1. Real-time Data Streaming
    • Apache Kafka is designed for real-time data streaming, making it well-suited for scenarios where timely and continuous data updates are crucial. This is important in SAP environments where real-time integration is essential for various business processes.
  2. Scalability
    • Kafka is highly scalable and can handle large volumes of data and high-throughput requirements. SAP systems often handle massive amounts of data. Kafka’s scalability enables efficient management and processing of this data.
  3. Reliability and Fault Tolerance
    • Kafka is known for its reliability and fault-tolerance features. It ensures data durability and availability, which is essential for critical applications in SAP environments, e.g., in finance or supply chain business processes. Features like rolling upgrades allow continuous operation with zero downtime.
  4. Decoupling Systems
    • Kafka acts as a decoupling layer between producers and consumers. SAP applications and downstream systems can evolve, scale, and fail independently without brittle point-to-point dependencies.
  5. Event-Driven Architecture
    • Kafka supports an event-driven architecture, which aligns well with modern integration patterns. The streaming platform efficiently propagates events, such as changes in SAP data or system events. This enables a more responsive and agile IT landscape. Kafka Connect enables integration with other plain messaging platforms like IBM MQ, TIBCO EMS, or Solace.
  6. Integration with Big Data Ecosystem
    • Kafka integrates natively with the big data and analytics ecosystem, e.g., Databricks, Snowflake, and cloud data lakes, making operational SAP data available for analytics, machine learning, and AI.
  7. Message Retention
    • Kafka stores messages for a configurable period, allowing systems to catch up on missed messages in case of temporary disruptions. This is particularly useful in scenarios where SAP systems may be temporarily offline, unreachable, or cannot handle the throughput. It also helps when transaction cost needs to be reduced by offloading the consumption of downstream applications to a cheaper platform like Kafka. Tiered Storage for Kafka is a significant enabler for a long-term event store of ERP information.
  8. Support for Multiple Protocols
    • The Kafka ecosystem supports various communication protocols (like Kafka, HTTP, File, WebSockets, and more), making it versatile for integration with different systems and technologies. This flexibility is crucial in heterogeneous IT environments, where SAP systems coexist with other technologies.
  9. Open Source Community and Ecosystem
    • Kafka has a vibrant open-source community and a rich ecosystem of connectors and tools. This ecosystem can simplify integration efforts by providing pre-built connectors for SAP systems and other common technologies.
  10. Analytical and Operational Workloads
    • Kafka was initially built for big data analytics use cases. However, most organizations leverage the technology for operational workloads, like orders or payments. Kafka evolved over the years and even introduced a transaction API for exactly-once semantics.
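
The transactions API mentioned in the last item can be sketched as follows: a transactional producer publishes a batch of related ERP events atomically, so consumers configured with read_committed isolation never see a partial result. Topic names and payloads are placeholder assumptions.

```python
import json

from confluent_kafka import Producer

# Sketch of Kafka's transactions API (exactly-once semantics): a batch of
# related ERP events is either published completely or not at all.
# Topic name and payloads are placeholders.
producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "transactional.id": "sap-order-sync-1",
})
producer.init_transactions()

order_events = [
    {"order_id": "4711", "status": "CREATED"},
    {"order_id": "4711", "status": "CREDIT_CHECKED"},
]

producer.begin_transaction()
try:
    for event in order_events:
        producer.produce(
            "sap.erp.order-events",
            key=event["order_id"],
            value=json.dumps(event).encode("utf-8"),
        )
    producer.commit_transaction()
except Exception:
    # Abort so read_committed consumers never see a partial batch.
    producer.abort_transaction()
    raise
```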

An ERP environment should be real-time, scalable, and open. SAP ERP is not just one product or technology, and organizations always combine it with other open source frameworks, proprietary standard software, and SaaS. “Building a Postmodern ERP with Apache Kafka” explores how SAP ERP and other technologies provide the most value together in a flexible, open environment. Many next-generation ERP systems use Kafka under the hood, too, even if you don’t see it because the product is proprietary or SaaS. Event-driven architectures are as helpful for software products as they are for any other software project.

Continuous SAP Migration and Cutover with Kafka

Integration between SAP ERP and other applications is crucial. Another kind of project is the migration and ERP modernization, e.g., from SAP ECC to S/4HANA or the migration between SAP and another software vendor.

A SAP migration project involves moving an SAP system or landscape from one environment to another. This could include moving from an on-premises environment to the cloud, upgrading to a newer version of SAP software, or consolidating multiple SAP instances. The exact steps and considerations for a SAP migration can vary based on the specific migration scenario.

Most SAP ERP migrations these days are from SAP ECC to SAP S/4Hana. These projects usually take years. Apache Kafka can provide valuable help in different SAP integration and migration scenarios.

The combination of real-time capabilities, an event storage for true decoupling and data consistency across real-time and non-real-time systems, and data integration with non-SAP systems and APIs make Kafka the perfect middleware for SAP modernization and ERP migrations.

I covered such a migration via Apache Kafka in a data warehouse modernization story where legacy and modern applications live in parallel for some months or even years until the final cutover is done.

Data Warehouse Offloading, Integration, Cutover, and Replacement with Data Streaming

Until the completion of the S/4Hana migration in the cloud, SAP ECC on-premise continues to exist for years. The hybrid deployment and synchronization capabilities of Kafka make it unique for SAP migration and modernization projects.
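As a rough illustration of such a parallel run, the following Kafka Streams sketch consumes change events from a hypothetical topic fed by the legacy ECC system, maps them to the new data model, and forwards them to a topic consumed by the S/4HANA-side applications. Topic names and the mapping logic are assumptions for illustration only.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class EccToS4HanaBridge {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "ecc-to-s4hana-bridge");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Order events captured from the legacy ECC system (topic name is hypothetical)
        KStream<String, String> legacyOrders = builder.stream("sap-ecc-orders");
        legacyOrders
                .mapValues(EccToS4HanaBridge::toNewDataModel) // translate the legacy payload
                .to("s4hana-orders");                         // consumed by the new S/4HANA-side applications

        new KafkaStreams(builder.build(), props).start();
    }

    private static String toNewDataModel(String legacyPayload) {
        // Placeholder: a real migration maps IDoc/BAPI structures to the new data model here
        return legacyPayload;
    }
}
```

Because the events stay in Kafka, both the legacy and the new world can consume them during the entire cutover period, and the final switch is just a matter of pointing consumers to the right topics.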

Confluent’s Fully Managed SAP Integration and Strategic Partnership

Data streaming defines a new software category. Confluent leads the data streaming industry. It provides a serverless cloud offering on all major public clouds and an offering for self-managed deployments powered by Apache Kafka and Flink. In December 2023, the research company Forrester published “The Forrester Wave™: Streaming Data Platforms, Q4 2023”. Get free access to the report here. The report explores what Confluent and other vendors like AWS, Microsoft, Google, Oracle, and Cloudera provide.

Confluent is now available in the SAP® Store, the online marketplace for SAP and partner offerings. The data streaming platform integrates with SAP Datasphere. The combination delivers a secure, governed solution for accessing SAP data as fully managed data streams for customers.

Datasphere Architecture as part of Business Technology Platform BTP

Confluent provides businesses that use SAP solutions with a cloud-native and complete data streaming platform available everywhere it’s needed – in the cloud, across clouds, on-premises, and hybrid environments. Configured directly within SAP Datasphere, the new Confluent integration allows businesses to:

  • Build real-time applications at a lower cost with fully managed data streams powered by Confluent’s Kora Engine, which reduces the total cost of ownership for Kafka by up to 60%.
  • Move SAP data anywhere it needs to go. Merge with third-party sources in real time via many pre-built connectors, including AWS Redshift, AWS S3, Databricks, Google Cloud BigQuery, MongoDB, and Snowflake paired with a serverless offering for Apache Flink.
  • Maintain strict security, compliance, and governance standards with enterprise-grade data streaming security controls, and the industry’s only fully managed governance suite for Kafka.

Confluent in the SAP PartnerEdge Program

Confluent is a partner in the SAP PartnerEdge program. The SAP PartnerEdge program provides the enablement tools, benefits, and support to facilitate building high-quality, innovative applications focused on specific business needs – quickly and cost-effectively.

Here is an example architecture connecting SAP ERP and non-SAP applications (Flink and Snowflake in this example) with Datasphere and Confluent:

SAP Datasphere and Data Streaming with Apache Kafka and Flink to integrate ERP and Cloud Data Warehouse

Confluent and SAP Datasphere are the perfect combination for building a data fabric for all enterprise data. Many companies already leverage Apache Kafka as the data fabric for AI and machine learning.
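To give an idea of the stream processing part of such an architecture, here is a minimal Apache Flink sketch (Table API embedded in Java) that continuously filters SAP order events from a Kafka topic before the result is handed to a downstream sink such as Snowflake or Databricks. Topic name, schema, and the print sink are hypothetical; a fully managed Flink offering would use its own connector catalog and SQL workspace instead.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SapOrderFilterJob {
    public static void main(String[] args) {
        TableEnvironment tableEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Source: SAP order events streamed into Kafka (topic and schema are hypothetical)
        tableEnv.executeSql(
            "CREATE TABLE sap_orders (order_id STRING, customer_id STRING, amount DOUBLE) " +
            "WITH ('connector' = 'kafka', 'topic' = 'sap-erp-orders', " +
            "  'properties.bootstrap.servers' = 'localhost:9092', " +
            "  'properties.group.id' = 'sap-order-filter', " +
            "  'format' = 'json', 'scan.startup.mode' = 'earliest-offset')");

        // Sink: print connector as a stand-in for a Snowflake/Databricks/warehouse sink connector
        tableEnv.executeSql(
            "CREATE TABLE large_orders (order_id STRING, customer_id STRING, amount DOUBLE) " +
            "WITH ('connector' = 'print')");

        // Continuous query: only forward high-value orders downstream
        tableEnv.executeSql(
            "INSERT INTO large_orders " +
            "SELECT order_id, customer_id, amount FROM sap_orders WHERE amount > 10000");
    }
}
```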

Alternative Integration Options for SAP and Kafka

Is SAP Datasphere the new silver bullet for SAP ERP integration scenarios? No! As you learned in the above sections, Datasphere enables easy access to old and new SAP ERP data objects. However, Datasphere might have some drawbacks, too:

  • New technology: The product had only been available for a few months at the time of writing this blog post in early 2024. It will mature, and features will strengthen.
  • Heavyweight: A direct integration with a proprietary SAP API call, e.g., BAPI, IDoc or the more modern Operational Data Provisioning (ODP) might be easier to implement and more cost-efficient from a TCO perspective for some projects.
  • Vendor lock-in: Choosing a SAP product as middleware and/or analytics platform might not be the right strategy. Many organizations choose a best-of-breed approach for different domains and use cases instead of relying on a single vendor from a technology and licensing perspective.

One solution does not fit all integration use cases. Know the different options and make your evaluation. 

Plenty of other options exist for SAP-Kafka integration. I explored dozens of APIs, tools, and connectors for data integration between SAP ERP and Apache Kafka.

For instance, look at the Confluent Hub and search for SAP Kafka integration. You will find many mature, lightweight, and innovative solutions from various vendors. For instance, INIT, Asapio, Advantco, KaTe, Onibex, and Qlik provide integrations via different open and proprietary SAP interfaces like ODP, OData, REST, BAPI, or IDoc.

SAP Datasphere and Kafka Connect the Entire Enterprise (and Hybrid Cloud)

It was never easier to integrate the SAP ecosystem with the rest of the IT world in an enterprise architecture. SAP Datasphere supports straightforward access to SAP S/4 HANA, SAP BW/4HANA, SAP BW, SAP ECC, and SAP HANA ERP data without the need for complex integration projects. In addition, SAP supports connectivity to Business Warehouse, SAP’s on-premise data warehouse solution.

Apache Kafka enables data consistency across SAP and non-SAP applications across the data center and public cloud. It does not matter if the data source or sink is real-time, near-real-time, batch, file-based, or a request-response API like HTTP/REST. The heart of the data fabric is event-based, scalable, and reliable.

Confluent is the leading vendor of data streaming technologies like Apache Kafka. The strategic partnership and deep product integration between SAP Datasphere and Confluent provides an excellent opportunity for any organization that needs to integrate SAP and the rest of the IT infrastructure.

Some people might tell you that Kafka is great for analytical use cases but not suited for operational, critical use cases (because some folks want to pitch another product for SAP integrations). That’s not accurate. Apache Kafka supports analytical AND transactional workloads. Actually, almost all customers I work with around the world use Confluent for transactional data from the SAP ERP for orders, payments, fraud detection, and similar operational use cases.

How do you integrate with your SAP systems today? Do you already use modern technologies like Apache Kafka? What connectors or solutions do you use? Will you use SAP Datasphere in the future? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post SAP Datasphere and Apache Kafka as Data Fabric for S/4HANA ERP Integration appeared first on Kai Waehner.

Building a Postmodern ERP with Apache Kafka https://www.kai-waehner.de/blog/2020/11/20/postmodern-erp-mes-scm-with-apache-kafka-event-streaming-edge-hybrid-cloud/ Fri, 20 Nov 2020 09:59:59 +0000 https://www.kai-waehner.de/?p=2847 Postmodern ERP represents the next generation of ERP architectures. It is real-time, scalable, and open by using a combination of open source technologies and proprietary standard software. This blog post explores why and how companies, both software vendors and end-users, leverage event streaming with Apache Kafka to implement a Postmodern ERP.

The post Building a Postmodern ERP with Apache Kafka appeared first on Kai Waehner.

Enterprise resource planning (ERP) has existed for many years. It is often monolithic, complex, proprietary, batch, and not scalable. Postmodern ERP represents the next generation of ERP architectures. It is real-time, scalable, and open. A Postmodern ERP uses a combination of open source technologies and proprietary standard software. This blog post explores why and how companies, both software vendors and end-users, leverage event streaming with Apache Kafka to implement a Postmodern ERP.

Postmodern ERP with Apache Kafka

What is ERP (Enterprise Resource Planning)?

Let’s define the term “ERP” first. This is not an easy task, as ERP is used for concepts and various standard software products.

Enterprise resource planning (ERP) is the integrated management of main business processes, often in real-time and mediated by software and technology.

ERP is usually referred to as a category of business management software – typically a suite of integrated applications – that an organization can use to collect, store, manage, and interpret data from many business activities.

ERP provides an integrated and continuously updated view of core business processes using common databases. These systems track business resources – cash, raw materials, production capacity – and the status of business commitments: orders, purchase orders, and payroll. The applications that make up the system share data across the various departments (manufacturing, purchasing, sales, accounting, etc.) that provide the data. ERP facilitates information flow between all business functions and manages connections to outside stakeholders.

It is important to understand that ERP is not just for manufacturing but relevant across various business domains. Supply Chain Management (SCM) is orthogonal to ERP.

ERP is a Zoo of Concepts, Technologies, and Products

An ERP is a key concept and typically uses various products as part of every supply chain where tangible goods are produced. For that reason, an ERP is very complex in most cases. It usually is not just one product, but a zoo of different components and technologies:

SAP ERP System - Zoo of Products including SCM MES CRM PLM WMS LMS

 

Example: SAP ERP – More than a Single Product…

SAP is the leading ERP vendor. I explored SAP, its product portfolio, and integration options for Kafka in a separate blog post: “Kafka SAP Integration – APIs, Tools, Connector, ERP et al”.

Check that out if you want to get deeper into the complexity of a “single product and vendor”. You will be surprised how many technologies and integration options exist to integrate with SAP. SAP’s stack includes plenty of homegrown products like SAP ERP and acquisitions with their own codebase, including Ariba for supplier network, hybris for e-commerce solutions, Concur for travel & expense management, and Qualtrics for experience management. The article “The ERP is Dead. Long live the Distributed Planning System” from the SAP blog goes in a similar direction.

ERP Requirements are Changing…

This is not different for other big vendors. For instance, if you explore the Oracle website, you will also find a confusing product matrix. 🙂

That’s the status quo of most ERP vendors. However, things change due to shifting requirements: Digital Transformation, Cloud, Internet of Things (IoT), Microservices, Big Data, etc. You know what I mean… Requirements for standard software are changing massively.

Every ERP vendor (that wants to survive) is working on a Postmodern ERP these days by upgrading its existing software products or writing a completely new product – that’s often easier. Let’s explore what a Postmodern ERP is in the next section.

Introducing the Postmodern ERP

The term “Postmodern ERP” was coined by Gartner several years ago.

From the Gartner Glossary:

“Postmodern ERP is a technology strategy that automates and links administrative and operational business capabilities (such as finance, HR, purchasing, manufacturing, and distribution) with appropriate levels of integration that balance the benefits of vendor-delivered integration against business flexibility and agility.”

This definition shows the tight relation to other non-Core-ERP systems, the company’s whole supply chain, and partner systems.

The Architecture of a Postmodern ERP

According to Gartner’s definition of the postmodern ERP strategy, legacy, monolithic and highly customized ERP suites, in which all parts are heavily reliant on each other, should sooner or later be replaced by a mixture of both cloud-based and on-premises applications, which are more loosely coupled and can be easily exchanged if needed. Hint: This sounds a lot like Kafka, doesn’t it?

The basic idea is that there should still be a core ERP solution that covers the most important business functions, while other functions are covered by specialist software solutions that merely extend the core ERP.

There is, however, no golden rule as to what business functions should be part of the core ERP and what should be covered by supplementary solutions. According to Gartner, every company must define its own postmodern ERP strategy, based on its internal and external needs, operations, and processes. For example, a company may define that the core ERP solution should cover those business processes that must stay behind the firewall and choose to leave their core ERP on-premises. At the same time, another company may decide to host the core ERP solution in the cloud and move only a few ERP modules as supplementary solutions to on-premises.

Pros and Cons of a Postmodern ERP

SelectHub explores the pros and cons of a Postmodern ERP compared to legacy ERPs:

Pros and Cons of a Postmodern ERP

The pros are pretty obvious and are the main motivation why companies want or need to move away from their legacy ERP system. Software is eating the world. Companies (need to) become more flexible, elastic, and scalable. Applications (need to) become more personalized and context-specific – all of that (needs to be) in real-time. There is no way around a Postmodern ERP and the related supply chain processes to solve these requirements.

The main benefits that companies will gain from implementing a Postmodern ERP strategy are speed and flexibility when reacting to unexpected changes in business processes or on the organizational level. With most applications having a relatively loose connection, it is fairly easy to replace or upgrade them whenever necessary. Companies can also select and combine cloud-based and on-premises solutions that are most suited for their ERP needs.

The cons are more interesting because they need to be solved to deploy a Postmodern ERP successfully. The key downside of a postmodern ERP is that it will most likely lead to an increased number of software vendors that companies will have to manage and pose additional integration challenges for central IT.

Coincidentally, I had similar discussions with customers in the past quarters regularly. More and more companies adopt Apache Kafka to solve these challenges to build a Postmodern ERP and flexible, scalable supply chain processes.

Kafka as the Foundation of a Postmodern ERP

If you follow my blog and presentations, you know that Kafka is used in all areas where an ERP is relevant, for instance, Industrial IoT (IIoT), Supply Chain Management, Edge Analytics, and many other scenarios. Check out “Kafka in Industry 4.0 and Manufacturing” to learn more details about various use cases.

Example: A Postmodern ERP built on top of Kafka

A Postmodern ERP built on top of Apache Kafka is part of this story:

Postmodern ERP with Apache Kafka SAP S4 Hana Oracle XML Web Services MES

This architecture shows a Postmodern ERP with various components. Note that the Core ERP is built on Apache Kafka. Many other systems and applications are integrated.

Each component of the Postmodern ERP has a different integration paradigm:

  • The TMS (Transportation Management System) is a legacy COTS application providing only a legacy XML-based SOAP Web Service interface. The integration is synchronous and not scalable but works for small transactional data sets.
  • The LMS (Labor Management System) is a legacy homegrown application. The integration is implemented via Kafka Connect and a CDC (Change-Data-Capture) connector to push changes from the relational Oracle database in real-time into Kafka.
  • The SRM (Supplier Relationship Management) is a modern application built on top of Kafka itself. Integration with the Core ERP is implemented with Kafka-native replication technologies like MirrorMaker 2, Confluent Replicator, or Confluent Cluster Linking to provide a scalable real-time integration.
  • The MES (Manufacturing Execution System) is an SAP COTS product and part of the SAP S4/Hana product portfolio. The integration options include REST APIs, the Eventing API, and Java APIs. The right choice depends on the use case. Again, read Kafka SAP Integration – APIs, Tools, Connector, ERP et al. to understand how complex the longer explanation is.
  • The CRM (Customer Relationship Management) is Salesforce, a SaaS cloud service, integrated via Kafka Connect and the Confluent connector.
  • Many more integrations to additional internal and external applications are needed in a real-world architecture.

This is a hypothetical implementation of a Postmodern ERP. However, more and more companies implement this architecture for all the discussed benefits. Unfortunately, such modern architecture also includes some challenges. Let’s explore them and discuss how to solve them with Apache Kafka and its ecosystem.

Solving the Challenges of a Postmodern ERP with Kafka

This section covers three main challenges of implementing a Postmodern ERP and how Kafka and its ecosystem help implement this architecture.

I quote the three main challenges from the blog post “Postmodern ERP: Just Another Buzzword?” and then explain how the Kafka ecosystem solved them more or less out-of-the-box.

Issue 1: More Complexity Between Systems!

“Because ERP modules and tools are built to work together, legacy systems can be a lot easier to configure than a postmodern solution composed entirely of best-of-breed solutions. Because postmodern ERP may involve different programs from different vendors, it may be a lot more challenging to integrate. For example, during the buying process, you would need to ask about compatibility with other systems to ensure that the solution that you have in mind would be sufficient.”

First of all, is your existing ERP system easy to integrate? Any ERP system older than five years uses proprietary interfaces (such as BAPI and iDoc in case of SAP) or ugly/complex SOAP web services to integrate with other systems. Even if all the software components come from one single vendor, it was built by different business units or even acquired. The codebases and interfaces speak very different languages and technologies.

So, while a Postmodern ERP requires complex integration between systems, so does any legacy ERP system! Nevertheless:

How Kafka Helps…

Kafka provides an open, scalable, elastic real-time infrastructure for implementing the middleware between your ERP and other systems. More details in the comparison between Kafka and traditional middleware such as ETL and ESB products.

Kafka Connect is a key piece of this architecture. It provides a Kafka-native integration framework.

Additionally, another key reason why Kafka makes these complex integrations successful is that Kafka really decouples systems (in contrast to traditional messaging queues or synchronous SOAP/REST web services):

Domain-Driven Design and Decoupling for your Postmodern ERP with Kafka

The heart of Kafka is real-time and event-based. Additionally, Kafka decouples producers and consumers with its storage capabilities and handles the backpressure and, optionally, the long-term storage of events and data. This way, batch analytics platforms, request-response REST interfaces (e.g., mobile apps), and databases can access the data, too. Learn more about “Domain-driven Design (DDD) for decoupling applications and microservices with Kafka“.
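A simple way to see this decoupling in practice: any number of independent consumer groups can read the same SAP event stream at their own pace, including a batch-oriented consumer that replays retained history. The sketch below uses a hypothetical topic name; adding this consumer requires no change on the producer side or in any other consumer.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SapOrderAnalyticsConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Each consumer group tracks its own offsets - adding this one does not affect any other consumer
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "analytics-backfill");
        // Replay the retained history, e.g., to (re)load a data warehouse or analytics platform
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("sap-erp-orders"));   // hypothetical topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```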

Understanding the relation between event streaming with Kafka and non-streaming APIs (usually HTTP/REST) is also very important in this discussion. Check out “Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies?” for more details.

The integration capabilities and real decoupling provided by Kafka enable handling the integration complexity between systems.

Issue 2: More Difficult Upgrades!

“This con goes hand in hand with the increased complexity between systems. Because of this increased complexity and the fact that the solution isn’t an all-in-one program, making system upgrades can be difficult. When updates occur, your IT team will need to make sure that the relationship between the disparate systems isn’t negatively affected.”

How Kafka Helps…

The issue with upgrades is solved with Kafka out-of-the-box. Remember: Kafka really decouples systems from each other due to its storage capabilities. You can upgrade one system without even informing the other systems and without downtime! Two reasons why this works so well and out-of-the-box:

  1. Kafka is backward compatible. Even if you upgrade the server-side (Kafka brokers, ZooKeeper, Schema Registry, etc.), the other applications and interfaces continue to work without breaking changes. Server-side and client-side can be updated independently. Sometimes an older application is not updated anymore at all because it will be replaced soon. That’s totally fine. An old Kafka client can speak to a newer Kafka broker.
  2. Kafka uses rolling upgrades. The system continues to work without any downtime – 24/7, even for mission-critical workloads like ERP or MES transactions. From the outside, the upgrade will not even be noticed.

Let’s take a look at an example with different components of the Postmodern ERP:

Postmodern ERP - Replication between Kafka and ERP Components

In this case, we see different versions and distributions of Kafka being used:

  • The Tier 1 Supplier uses the fully-managed and serverless Confluent Cloud solution. It automatically upgrades to the latest Kafka release under the hood (this is never a problem due to backward compatibility). The client applications use pretty old versions of Kafka.
  • The Core ERP uses open-source Kafka as it is a homegrown solution, not standard software. The operations and support are handled by the company itself (pretty risky for such a critical system, but totally valid). The Kafka version is relatively new. One client application even uses a Kafka version, which is newer than the server-side, to leverage a new feature in Kafka Streams (Kafka is backward compatible in both directions, so this is not a problem).
  • The MES vendor uses Confluent Platform, which embeds Apache Kafka. The version is up-to-date as the vendor does regular releases and supports rolling upgrades.
  • Integration between the different ERP applications is implemented with Kafka-native replication tools – MirrorMaker 2 and Confluent Cluster Linking, respectively. As discussed in a former section, various other integration options are available, including REST, Kafka Connect, native Kafka clients in any programming language, or any ETL or ESB tool.

Backward compatibility and rolling upgrades make updating systems easy and invisible for integrated systems. Business continuity is guaranteed out-of-the-box.

Issue 3: Lack of Access When Offline

“When you implement a cloud-based software, you need to account for the fact that you won’t be able to access it when you are offline. Many legacy ERP systems offer on-premise solutions, albeit with a high installation cost. However, this software is available offline. For cloud ERP solutions, you are reliant on the internet to access all of your data. Depending on your specific business needs, this may be a dealbreaker.”

How Kafka Helps…

Hybrid architectures are the new black. Local processing on-premise is required in most use cases. It is okay to build the next generation ERP in the cloud. But the integration between cloud and on-premise/edge is key for success. A great example is Mojix, a Kafka-native cloud platform for real-time retail & supply chain IoT processing with Confluent Cloud.

When tangible goods are produced and sold, some processing needs to happen on-premise (e.g., in a factory) or even closer to the edge (e.g., in a restaurant or retail store). No access to your data is a dealbreaker. No capability of local processing is a dealbreaker. Latency and cost for cloud-only can be another deal-breaker.

Kafka works well on-premise and at the edge. Plenty of examples exist. Including Kafka-native bi-directional real-time replication between on-premise / edge and the cloud.

I covered these topics so often already; therefore, I just share a few links to read:

I specifically recommend the latter link. It covers hybrid architectures where processing at the edge (i.e. outside the data center) is key and required even offline, like in the following example running Kafka in a factory (including the server-side):

Edge Computing with Kafka in Manufacturing and Industry 4.0 MES ERP

The hybrid integration capabilities of Kafka and its ecosystem solve the issue of lacking access when offline.

Kafka and Event Streaming as Foundation for a Postmodern ERP Infrastructure

Postmodern ERP represents the next generation of ERP architectures. It is real-time, scalable, and open by using a combination of open source technologies and proprietary standard software. This blog post explored how software vendors and end-users leverage event streaming with Apache Kafka to implement a Postmodern ERP.

What are your experiences with ERP systems? Did you already implement a Postmodern ERP architecture? Which approach works best for you? What is your strategy? Let’s connect on LinkedIn and discuss it!

The post Building a Postmodern ERP with Apache Kafka appeared first on Kai Waehner.

Kafka SAP Integration – APIs, Tools, Connector, ERP et al https://www.kai-waehner.de/blog/2020/08/25/kafka-sap-integration-alternatives-connectors-erp-r3-ecc-s4-hana-soap-rest-http-web-service-api-sdk-java/ Tue, 25 Aug 2020 13:58:42 +0000 https://www.kai-waehner.de/?p=2623 A question I get every week from customers across the globe: How can I integrate my SAP system…

The post Kafka SAP Integration – APIs, Tools, Connector, ERP et al appeared first on Kai Waehner.

A question I get every week from customers across the globe: How can I integrate my SAP system with Apache Kafka? This post explores various alternatives, including connectors, 3rd party tools, custom glue code, and trade-offs between the different options.

After exploring what SAP is, I will discuss several integration options between Apache Kafka and SAP systems:

  • Traditional middleware (ETL/ESB)
  • Web services (SOAP/REST)
  • 3rd party turnkey solutions
  • Kafka-native connectivity with Kafka Connect
  • Custom glue code using SAP SDKs

Disclaimer before you read on:

I am not an SAP expert. It is tough to stay up-to-date with the vast and complex ecosystem of SAP products, (re-)brands, versions, services, SDKs, and APIs. I am sorry if some of the below information is not 100% accurate or is outdated. Always double-check on the SAP website (if the links from Google still work – I had some issues with some pages “no longer available” while researching for this blog post). If you see any inaccurate or missing information, please let me know, and I will update the blog post.

What is SAP?

SAP is a German multinational software corporation that makes enterprise software to manage business operations and customer relations. In 2019, SAP had revenue of €27.553 billion, a net income of €3.387 billion, and ~100,000 employees.

It is quite interesting: Nobody asks how to integrate with IBM or Oracle. Instead, people more specifically ask how to integrate with IBM MQ, IBM DB2, IBM Mainframe (still very ambiguous), or any other of the 100s of IBM products.

For SAP, people ask: How can I integrate with SAP? Let’s clarify what SAP is before exploring integration options.

The company is primarily known for its ERP software. But if you check out the official “What is SAP?” page, you find out that SAP offers solutions across a wide range of areas:

  • ERP and Finance
  • CRM and Customer Experience
  • Network and Spend Management
  • Digital Supply Chain
  • HR and People Engagement
  • Experience Management
  • Business Technology Platform
  • Digital Transformation
  • Small and Midsize Enterprises
  • Industry Solutions

SAP’s Software Portfolio

SAP’s stack includes homegrown products like SAP ERP and acquisitions with their own codebase, including Ariba for supplier network, hybris for e-commerce solutions, Concur for travel & expense management, and Qualtrics for experience management.

Even if you talk about SAP ERP, the situation is still not that easy. Most companies still run SAP ERP Central Component (ECC, formerly called SAP R/3), SAP’s sophisticated (and aged) ERP product. ECC runs on a third-party relational database from Oracle, IBM, or Microsoft, while HANA is SAP’s in-memory database. The new ERP product is SAP S4/Hana (no, this is not just the famous in-memory database). Oh, and there is SAP S4/Hana Cloud. And before you wonder: No, this is not the same feature set as the on-premise version!

Various interfaces exist depending on your product. An interface can be an (awful) proprietary technology like BAPI or iDoc, an (okayish) standards-based web service API using SOAP or REST / HTTP, a (non-scalable) JDBC database connection, or, if you are lucky, even a (scalable and real-time) Event / Messaging API. The article “The ERP is Dead. Long live the Distributed Planning System” from the SAP blog describes the situation very well.

And sorry, we are still not done yet. Even if you talk about ERP systems, this can mean anything from a zoo of products or components, depending on who you are talking to:

SAP ERP System - Zoo of Products including MES CRM PLM WMS LMS

So, before you want to discuss the integration of your SAP product with Kafka, please please please find out the product, version, and deployment infrastructure of your SAP components.

Different Integration Options between Kafka and SAP

After this introduction, you hopefully understand that there is no silver bullet for SAP integration. The following will explore different integration options between Kafka and SAP and their trade-offs. The main focus is on SAP ERP (old ECC and new S4/Hana), but the overview is more generic, including integration capabilities with other components and products.

SAP Integration with Apache Kafka - R3 ERP S4 Hana Ariba Concur BAPI iDoc REST SOAP Web Services Java

Also, keep in mind that you typically need or want to integrate with a function or service. Direct integration with the data object does not make much sense in most cases, as you would have to re-implement the mapping and denormalization between the data objects. Especially for source integration, i.e., building pipelines from SAP to Kafka. In the case of SAP ERP, you typically integrate with RFC/BAPI/iDoc or any other web service interface for this reason.

Traditional Middleware (ETL / ESB) for SAP Integration

Integration tools exist just for the sake of integrating different sources and sinks:

  • Extract-Transform-Load (ETL) for batch integration, like Informatica, Talend or SAP NetWeaver Process Integration
  • Enterprise Service Bus (ESB) for integration via web services and messaging, like TIBCO BusinessWorks or Software AG webMethods
  • Integration Platform as a Service (iPaaS) for cloud-native integration, similar to ETL/ESB tools, but provided as a fully managed service, such as Boomi,  Mulesoft, or SAP Cloud Integration (and some cloud-washed products from legacy middleware vendors).

Most traditional middleware products were built to integrate with complex, proprietary systems from the last 20+ years, such as IBM Mainframe, EDIFACT, and – guess what – ERP systems like SAP ECC. In the meantime, all of them also have a Kafka connector. There are plenty of good reasons why many companies chose Kafka as a modern integration platform instead of traditional legacy middleware.

Most traditional ETL and ESB tools provide SAP connectivity. SAP Cloud Platform Integration (SAP CPI) is SAP’s own “modern” middleware solution. CPI includes a Kafka adapter to send and receive Kafka messages.

Pros:
  • In place: Typically already in place, no new project is required.
  • Maturity: Built over the years (because of the complexity), running in production for a long time already
  • Tooling: Visual coding for the integration (required because of the complexity), directly map iDoc / BAPI / Hana / SOAP schemas to other data structures
  • Integration: Not just connectors to the legacy systems but also Kafka for producing and consuming messages (due to market pressure)
Cons:
  • Legacy: Products are as old as the source and sink systems.
  • Scalability: Monolithic, inflexible architecture
  • Tight coupling: Integration has to be developed and runs on the middleware, no real decoupling and domain-driven design DDD like in Kafka
  • Licensing: High-cost per server, often already planned to be replaced (e.g., you can replace 100+ IBM MQ or TIBCO EMS servers with a single Kafka cluster)
  • Point-to-point: No streaming architecture, most integrations are based on web services (even if the core under the hood is based on a messaging system)
TL;DR:

Traditional integration tools are mature and have great tooling, but limited scalability/flexibility and high licensing cost. Often a quick win as it is already running, and you just need to add the Kafka connector.

Custom Glue Code for Kafka Integration using SAP SDKs

Writing your custom integration between SAP systems and Kafka is a viable option. This glue code typically leverages the same SDKs as 3rd party tools use under the hood:

  • Legacy: SAP NetWeaver RFC SDK – a C/C++ interface for connecting to SAP systems from release R/3 4.6C up to today’s SAP S/4HANA systems.
  • Legacy: SAP Java Connector (SAP JCo) – the famous JCO.jar library – is a Java SDK for integration with SAP ECC / ERP (this is just a wrapper around the C/C++ SDK). See the minimal sketch after this list for how such glue code can look.
  • Legacy: SAP ACO is an integrated ABAP component that is designed to consume RFC Services on remote ABAP systems.
  • Legacy: SAP ABAP TCP Push Channel if you are forced to use custom ABAP code and need or want to use TCP instead of the Confluent REST Proxy for HTTP communication.
  • Legacy: JMS Adapter to integrate via the standard messaging protocol. Great option (if you get it running and working for your use case and functions). For instance, JMS integration can be done via SAP PI.
  • Modern: SAP Cloud SDK allows developing applications with Java or JavaScript that communicate with SAP solutions and services such as SAP S/4 Hana Cloud, SAP SuccessFactors, and others (the term ‘Cloud’ actually means ‘Cloud-native’ in this case, i.e., this SDK also works with SAP’s on-premise products).
  • Modern: SAP Cloud Platform Enterprise Messaging: S4/Hana provides an asynchronous messaging interface (running on Solace on CloudFoundry under the hood). Different messaging standards are supported, including AMQP 1.0 and JMS (depending on the specific product you look at). Some examples demonstrate how to connect via the Java Client using the JMS API.
  • Modern: SAP ODP (Operational Data Provisioning): Technical infrastructure for operational analytics, and data extraction + replication. Some kind of CDC (Change Data Capture) with out-of-the-box support for various SAP products, including SAP BW, SAP BW/4HANA, SAP Data Services, and SAP HANA Smart Data Integration. ODP is not just for SAP interfaces but also integrates with 3rd party technologies (via a custom connector, not out-of-the-box) such as HDFS or Kafka.
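As one example of such glue code built on the legacy SDKs listed above, the sketch below calls a BAPI through SAP JCo and forwards the result rows to Kafka. It assumes a JCo destination configured outside of the code; the BAPI, parameter, and field names are illustrative and must be adapted to the actual function module.

```java
import java.util.Properties;
import com.sap.conn.jco.JCoDestination;
import com.sap.conn.jco.JCoDestinationManager;
import com.sap.conn.jco.JCoException;
import com.sap.conn.jco.JCoFunction;
import com.sap.conn.jco.JCoTable;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BapiToKafka {
    public static void main(String[] args) throws JCoException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // "ERP_DEST" refers to a JCo destination configured outside of this code (host, client, user, ...)
        JCoDestination destination = JCoDestinationManager.getDestination("ERP_DEST");
        JCoFunction function = destination.getRepository().getFunction("BAPI_SALESORDER_GETLIST");
        function.getImportParameterList().setValue("CUSTOMER_NUMBER", "0000001234"); // illustrative input
        function.execute(destination);

        JCoTable orders = function.getTableParameterList().getTable("SALES_ORDERS"); // illustrative table name
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < orders.getNumRows(); i++) {
                orders.setRow(i);
                // Illustrative field names; a real mapping covers the full order structure
                String value = String.format("{\"orderId\":\"%s\",\"material\":\"%s\"}",
                        orders.getString("SD_DOC"), orders.getString("MATERIAL"));
                producer.send(new ProducerRecord<>("sap-sales-orders", orders.getString("SD_DOC"), value));
            }
        }
    }
}
```

This is exactly the kind of code you own, maintain, and support yourself – which is the main trade-off listed in the cons below.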
Pros:
  • Flexibility: Custom coding allows you to implement precisely what you need.
Cons:
  • Maintenance: No vendor support – develop, maintain, operate, support by yourself.
  • Point-to-point: No streaming architecture, most integrations are based on web services (even if the core under the hood is based on a messaging system).
TL;DR:

“Build vs. Buy” always has trade-offs. I have only seen custom glue code for SAP integration in the field if no solution from a vendor was available and affordable. SAP Cloud Platform Enterprise Messaging is a possible integration pattern for Kafka, but it also adds yet another messaging layer to the architecture.
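For the modern messaging path via SAP Cloud Platform Enterprise Messaging, a small bridge between the JMS API and Kafka could look like the following sketch. It assumes a vendor-provided JMS ConnectionFactory from the SAP messaging client; the queue and topic names are placeholders.

```java
import java.util.Properties;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SapMessagingToKafkaBridge {
    // The ConnectionFactory is obtained from the SAP Enterprise Messaging client or JNDI (vendor-specific setup)
    public static void start(ConnectionFactory connectionFactory) throws JMSException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        Connection connection = connectionFactory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createQueue("salesorder-events")); // placeholder queue
        consumer.setMessageListener(message -> {
            try {
                String payload = ((TextMessage) message).getText();
                producer.send(new ProducerRecord<>("sap-salesorder-events", payload)); // placeholder topic
            } catch (JMSException e) {
                throw new RuntimeException("Failed to forward SAP event to Kafka", e);
            }
        });
        connection.start();
    }
}
```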

 

SOAP / REST Web Services for SAP Integration

The last 15 years brought us web services for building a Service-oriented Architecture (SOA) to integrate applications. A web service typically uses SOAP or REST / HTTP as technology. I will not start yet another FUD war here. Both have their use cases and trade-offs.

Pros:
  • Standards-based: Different SDKs, products, and services talk the same language (at least in theory; true for HTTP, not so true for SOAP); most middleware tools have proper support for building HTTP services.
  • Simplicity (HTTP): Well-understood, supported by most programming languages and APIs, established for many use cases – middleware is just yet another one.
Cons:
  • Point-to-point: No streaming architecture, most integrations are based on web services (even if the core under the hood is based on a messaging system).
  • Tight coupling: Integration has to be developed and runs on the middleware, no real decoupling, and domain-driven design DDD like in Kafka.
  • Complexity (SOAP): SOAP/WSDL is just the tip of the iceberg! Check out the list of WS-* standards to understand why this is often called the “WS star hell”. The AXIS framework (Apache extensible Interaction System) is one example of SAP’s SOAP integration using an open framework. While the Apache project was last updated in 2006, SAP still recommends using this interface in 2020.
  • Missing features (REST / HTTP): Representational state transfer (REST) is a concept, but most people mean synchronous HTTP communication. Most middleware tools (and most other applications) implement only a small fraction of the full standard. HTTP is an excellent standard, but all the tooling and features need to be built on top of it.
  • Only indirect support: Several SAP products do not provide open interfaces. While using SOAP or HTTP under the hood, you are forced to use the licensed tooling to create web services. For instance, SAP Business Connector (restricted license version of webMethods Integration Server), SAP NetWeaver Process Integration (PI), SAP Process Orchestration (PO), Cloud Platform Integration (CPI), or SAP Cloud Integration.
TL;DR:

SOAP and REST web services work well for point-to-point communication and have good tool support. Both have their trade-offs, make sure to choose the right one – if your SAP product provides both interfaces. Unfortunately, you will often not have a choice. Even worse: You cannot use any tool but are forced to use the right licensed SAP tool or wrapper interface. Large scale, high volume, and continuous processing of data are not ideal requirements for these (legacy) integration products.

For direct HTTP(S) communication with Kafka, Confluent REST Proxy is an excellent option for producing, consuming, and administrating from any Kafka client (including custom SAP applications). For instance, SAP Cloud Platform Integration (CPI) can use this integration pattern to integrate between SAP and Kafka.
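For illustration, producing an event via the Confluent REST Proxy is a plain HTTP call, which any SAP-side component or integration tool that speaks HTTP can issue. The sketch below uses the REST Proxy v2 produce API from Java; host, topic, and payload are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestProxyProduceExample {
    public static void main(String[] args) throws Exception {
        // One JSON record wrapped in the REST Proxy v2 envelope (payload is a placeholder)
        String body = "{\"records\":[{\"key\":\"4711\",\"value\":{\"orderId\":\"4711\",\"status\":\"CREATED\"}}]}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://rest-proxy:8082/topics/sap-erp-orders"))   // placeholder host and topic
                .header("Content-Type", "application/vnd.kafka.json.v2+json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```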

SAP-specific 3rd Party Tools for Kafka

SAP integration is a huge market globally. SAP provides several tools for data integration (some legacy, some modern – honestly, I don’t have a full overview of their complex product and API portfolio). Additionally, plenty of other software vendors have built specific integration software for SAP systems.

A few examples I have seen in the field recently:

Examples:

These are just a few examples. Many more exist for on-premise, cloud, and hybrid integration with different SAP products and interfaces.

Some of these tools are natively integrated into SAP’s integration tools instead of being completely independent runtimes. This can be good or bad. An advantage of this approach is that you can leverage the SAP-native features for complex iDoc / BAPI mappings and the integrated 3rd party connector for Kafka communication.

Pros:
  • Turnkey solution: Built for SAP integration, often combined with other additional helpful features beyond just doing the connectivity, more lightweight than traditional generic middleware.
  • Focus: Many 3rd party solutions focus on a few specific use cases and/or products and technologies. It is much harder to integrate with “SAP in general” than focusing on a particular niche, e.g., Human Resources processes and related HTTP interfaces.
  • Maturity: Built over the years
  • Tooling: Visual coding for the integration (required because of the complexity), directly map iDoc / BAPI / Hana / SOAP schemas to other data structures
  • Integration: Not just connectors to the legacy systems but also modern technologies such as Kafka
Cons:
  • Scalability: Often monolithic, inflexible architecture (but focusing on SAP integration only, therefore often “okayish”)
  • Tight coupling: Integration has to be developed on and runs within the tool; however, it is separated from other middleware, so decoupling and domain-driven design (DDD) are still possible in conjunction with Kafka
  • Licensing: Moderate cost per server (typically cheaper than the traditional generic middleware)
  • Point-to-point: No streaming architecture, most integrations are based on web services (even if the core under the hood is based on a messaging system)
TL;DR:

A turnkey solution is an excellent choice in many scenarios. I see this pattern of combining Kafka with a dedicated 3rd party solution for SAP integration very often. I like it as the architecture is still decoupled, but no vast efforts are required for doing a (complex) SAP integration. And there is still hope that SAP itself releases a nice Kafka-native integration platform. 🙂

Kafka-native SAP Integration with Kafka Connect

Kafka Connect, an open-source component of Apache Kafka, is a framework for connecting Kafka with external systems such as databases, key-value stores, search indexes, and file systems.

Kafka Connect connectors are available for SAP ERP databases.

Pros:
  • Kafka-native: Kafka under the hood, providing real-time processing for high volumes of data with high scalability and reliability.
  • Simplicity: Just one infrastructure for messaging and data integration, much easier to develop, test, operate, scale, and license than using different frameworks or products (e.g., Kafka for messaging plus an ESB for data integration).
  • Real decoupling: Kafka’s architecture uses smart endpoints and dumb pipes by design, one of the key design principles of microservices. Not just for the applications, but also for the integration components. Leverage all the benefits of a domain-driven architecture for your Kafka-native middleware.
  • Custom connectors: Kafka Connect provides an open template. If no connector is available, you (or your favorite system integrator or Kafka-vendor) can build an SAP-specific connector once, and you can roll it out everywhere.
Cons:
  • Only database connectors: No connectors beyond the native JDBC database integration are available at the time of writing this.
  • Anti-pattern of direct database access: In most cases, you want or need to integrate with a function or service, not with the data objects. In most cases, you don’t even get direct access from the database admin anyway.
  • Efforts: Build your own SAP-native (i.e., non-JDBC) connector or ask (and pay) your favorite SI or Kafka vendor.

UPDATE January 2021: A Kafka-native integration is available with INIT’s ODP connector (as discussed in section “3rd party tools”). It eliminates the above cons and might be a great 3rd party option for some use cases.

TL;DR:

Kafka Connect is a great framework and used in most Kafka architectures for various good reasons. For SAP integration, the situation is different because no connectors are available (beyond direct database access). It took 3rd party vendors many years to implement RFC/BAPI/iDoc integration with their tools. Such an implementation will probably not happen again for Kafka because it is very complex, and these proprietary legacy interfaces are dying anyway. The situation is different for modern SAP interfaces: Some 3rd party providers leverage Kafka Connect for their product. For instance, INIT Software’s Kafka Connect ODP connector.
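To make the database-level option concrete, the following sketch registers a generic JDBC source connector against an SAP HANA database via the Kafka Connect REST API. Connection details, the SQL query, and the topic prefix are placeholders (and the SAP HANA JDBC driver has to be available on the Connect worker); as discussed above, function- or service-level integration is usually preferable to direct table access.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterSapJdbcSource {
    public static void main(String[] args) throws Exception {
        // Hypothetical connector config: a generic Confluent JDBC source reading an SAP HANA table.
        // Connection details, query, and topic prefix are placeholders.
        String config = """
            {
              "name": "sap-hana-orders-source",
              "config": {
                "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                "connection.url": "jdbc:sap://sap-hana-host:30015/",
                "connection.user": "INTEGRATION_USER",
                "connection.password": "********",
                "mode": "incrementing",
                "incrementing.column.name": "ORDER_ID",
                "query": "SELECT ORDER_ID, CUSTOMER_ID, AMOUNT FROM SAPSR3.ORDERS",
                "topic.prefix": "sap-hana-orders",
                "poll.interval.ms": "5000"
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors"))   // Kafka Connect REST endpoint
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```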

A Kafka Connect connector for SAP Cloud Platform Enterprise Messaging using its Java client would be a feasible and best option. I assume we will see such a connector on the market sooner or later.

Embedded Kafka in SAP Products

We have seen various integration options between SAP and Kafka. Unfortunately, all of them are based on the principle of “data at rest”, in contrast to Kafka processing “data in motion”. The closest fit so far is the integration via SAP Cloud Platform Enterprise Messaging because you can at least leverage an asynchronous messaging API.

The real added value comes when Kafka is leveraged not just for real-time messaging but for event streaming. Kafka provides a combination of messaging, data integration, data processing, and real decoupling using its distributed storage infrastructure.

Native Event Streaming with Kafka in SAP Products

Interestingly, some of SAP’s acquisitions leverage Kafka under the hood for event streaming. Two public examples:

Obviously, people are also waiting for the Kafka-native SAP S4/Hana interface so that they can leverage events in real-time for processing data in motion and correlate real-time and historical data together. A native Kafka integration with SAP S4/Hana should be the next step for SAP! HERE Technologies provides a great example of how to provide a Kafka-native interface (and an alternative REST option) for their product.

Having said this, current SAP blogs (from mid-2019) still talk about replacing the 20+ years old BAPI and RFC integration style with SOAP and OData (Open Data Protocol, an open protocol that allows the creation and consumption of queryable and interoperable REST APIs) in SAP S/4HANA Public Cloud.

My personal feeling and hope are that a native Kafka interface is just a matter of time as the market demand is everywhere across the globe (I talk to many customers in EMEA, US, and APAC), and even several non-S4/Hana SAP products use Kafka internally.

I have also seen a two-fold approach from some other vendors: Provide a Kafka-native interface to the outside world first (in SAP terms, you could, e.g., provide a Kafka interface on top of BAPIs). At a later point, reengineer the internal architecture away from the non-scalable technology to Kafka under the hood (in SAP terms, you could replace RFC / BAPI functions with a more scalable Kafka-native version – even using the same API interface and message structure).

Native Streaming Replication between Products, Departments, and Companies

Native Kafka integration does not just happen within a product or company. A widespread trend I see on the market in different industries is to integrate with partners via Kafka-native streaming replication instead of REST APIs:

Cross Company Streaming Kafka API Integration with MirrorMaker and Cluster Linking

Think about it: If you use Kafka in different application infrastructures, but the interface is just a web service or database, then all the benefits might go away because scalability and/or real-time data correlation capabilities go away.

More and more vendors of standard software use Kafka as the backbone of their internal architecture. If the interface between products (say, SAP’s ERP system, SAP’s MES system, and the SCM application of an OEM customer) is just a SOAP or REST API, then this does not scale and perform well for the requirements of digital transformation and Industry 4.0 use cases.

Hence, more and more companies leverage Kafka not just internally but also between departments or even different companies. Streaming replication between companies is possible with tools like MirrorMaker 2.0 or Confluent Replicator. Or you use the much simpler Cluster Linking from Confluent, which enables integration between hybrid, multi-cloud, or 3rd party integration using the Kafka protocol under the hood.

SAP + Apache Kafka = The Future for ERP et al

There is huge demand across the globe to integrate SAP applications with Apache Kafka for real-time messaging, data integration, and data processing at scale. The demand is true for SAP ERP (ECC and S4/Hana) but also for most other products from the vast SAP portfolio.

Kafka is deployed in many modern and innovative use cases for supply chain management, manufacturing, customer experience, and so on. Edge, hybrid, and multi-cloud Kafka deployments are the norm, not the exception.

Kafka integrates with SAP systems well. Different integration options are available via SAP SDKs and 3rd party products for proprietary interfaces, open standards, and modern messaging and event streaming concepts. Choose the right option for your need and get started with Kafka SAP integration… 

If you want to modernize your existing ERP infrastructure (no matter if SAP or any other vendor), also check out the article “Building a Postmodern ERP with Apache Kafka“.

What are your experiences with SAP Kafka integration? How did it work? What challenges did you face and how did you or do you plan to solve this? What is your strategy? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Kafka SAP Integration – APIs, Tools, Connector, ERP et al appeared first on Kai Waehner.
