Cybersecurity Archives - Kai Waehner
https://www.kai-waehner.de/blog/category/cybersecurity/

How Data Streaming with Apache Kafka and Flink Drives the Top 10 Innovations in FinServ
https://www.kai-waehner.de/blog/2025/02/09/how-data-streaming-with-apache-kafka-and-flink-drives-the-top-10-innovations-in-finserv/ (9 February 2025)

The financial industry is rapidly shifting toward real-time, intelligent, and seamlessly integrated services. From IoT payments and AI-driven banking to embedded finance and RegTech, financial institutions must process vast amounts of data instantly and securely. Data streaming with Apache Kafka and Apache Flink provides the backbone for real-time payments, fraud detection, personalized financial insights, and compliance automation. This blog post explores the top 10 emerging financial technologies and how data streaming enables them, helping banks, fintechs, and central institutions stay ahead in the future of finance.

The FinServ industry is undergoing a major transformation, driven by emerging technologies that enhance efficiency, security, and customer experience. At the heart of these innovations is real-time data streaming, enabled by Apache Kafka and Apache Flink. These technologies allow financial institutions to process and analyze data instantly to make finance smarter, more secure, and more accessible. This blog post explores the top 10 emerging financial technologies and how data streaming plays a critical role in making them a reality.

Top 10 Real Time Innovations in FinServ with Data Streaming using Apache Kafka and Flink

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter, and follow me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases across all industries.

Data Streaming in the FinServ Industry

This article builds on FinTechMagazine.com’s “Top 10 Emerging Technologies in Finance” by mapping each of these innovations to real-time data streaming concepts, possibilities, and real-world success stories.

Event-driven Architecture with Data Streaming using Apache Kafka and Flink in Financial Services

By leveraging Apache Kafka and Apache Flink, financial institutions can process transactions instantly, detect fraud proactively, and enhance customer experiences with real-time insights. Each emerging technology—whether IoT payment networks, AI-powered banking, or embedded finance—relies on the ability to stream, analyze, and act on data in real time, making data streaming a foundational enabler of the future of finance.

10. IoT Payment Networks: Real-time Processing for Seamless Payments

IoT payment networks enable automated, contactless transactions using connected devices like smartwatches, cars, and home appliances. Whether it’s a fridge restocking groceries or a car paying for tolls, these interactions generate massive real-time data streams that must be processed instantly and securely.

  • Fraud Detection in Milliseconds – Flink analyzes streaming transaction data to detect anomalies, flagging fraudulent activity before payments are approved (a minimal sketch of this pattern follows this list).
  • Reliable Connectivity – Kafka ensures payment events from IoT devices are securely transmitted and processed, preventing dropped or duplicate transactions.
  • Dynamic Pricing & Offers – Flink processes sensor and market data to adjust prices dynamically (e.g., surge pricing for EV charging stations) and deliver real-time personalized discounts.
  • Edge Processing for Low-Latency Payments – Kafka enables local transaction validation on IoT devices, reducing lag in autonomous vehicle payments and retail checkout systems.
  • Compliance & Security – Streaming pipelines support real-time monitoring, encryption, and anomaly detection, ensuring IoT payments meet financial regulations like PSD2 and PCI DSS.
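To make the fraud-detection bullet above more concrete, here is a rough sketch of such an anomaly check. It uses Kafka Streams instead of Flink purely for brevity; the same logic maps directly to Flink SQL or the DataStream API. The topic names, the 10-second window, and the threshold of five payments per device are illustrative assumptions, not details from a real deployment.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.*;

import java.time.Duration;
import java.util.Properties;

public class IotPaymentAnomalyCheck {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Payment events keyed by deviceId (value = JSON payload of the payment)
        KStream<String, String> payments = builder.stream(
                "iot-payments", Consumed.with(Serdes.String(), Serdes.String()));

        // Count payments per device in short tumbling windows ...
        payments.groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
                .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(10)))
                .count()
                .toStream()
                // ... and flag devices that exceed a simple threshold
                .filter((windowedDeviceId, count) -> count > 5)
                .map((windowedDeviceId, count) -> KeyValue.pair(
                        windowedDeviceId.key(),
                        "{\"deviceId\":\"" + windowedDeviceId.key() + "\",\"paymentsIn10s\":" + count + "}"))
                .to("payment-fraud-alerts", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "iot-payment-anomaly-check");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        new KafkaStreams(builder.build(), props).start();
    }
}
```

In production, a simple threshold would be replaced by model-based scoring, but the streaming mechanics (keyed grouping, windowing, and publishing alerts to a dedicated topic) stay the same.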

In financial services, don’t make the mistake of only looking inward for lessons—other industries have been solving similar challenges for years. Consumer IoT and Apache Kafka have long been used together in sectors like retail, where real-time data integration is critical for unified commerce, rewards programs, social selling, and many other use cases.

9. Voice-First Banking: Turning Conversations into Transactions

Voice-first banking enables customers to interact with financial services using smart speakers, virtual assistants, and mobile voice recognition. Whether checking an account balance, making a payment, or applying for a loan, these interactions require instant access to multiple backend systems—from core banking and CRM to fraud detection and credit scoring systems.

To make voice banking seamless, fast, and secure, banks must integrate real-time data streaming between AI-powered voice assistants and backend financial systems. This is where Apache Kafka and Apache Flink come in.

  • Seamless Integration Across Banking Systems – Voice assistants need real-time access to core banking (account balances, transactions), CRM (customer history), risk systems (fraud checks), and AI analytics. Kafka acts as a high-speed messaging and integration layer (aka ESB/middleware), ensuring that voice requests are instantly routed to the right backend services (including legacy technologies, such as mainframe) and responses are processed in milliseconds.
  • Instant Voice Query Processing – When a customer asks, “What’s my balance?”, Flink streams real-time transaction data from Kafka to retrieve the latest balance, rather than relying on outdated batch data (see the materialized-view sketch after this list).
  • Secure Authentication & Fraud Detection – Streaming pipelines analyze voice patterns in real time to detect fraud and trigger multi-factor authentication (MFA) if needed.
  • Personalized & Context-Aware Banking and Advertising – Flink continuously enriches customer profiles by analyzing past transactions, spending habits, and preferences—allowing the system to offer real-time financial insights (e.g., suggesting a savings plan based on spending trends).
  • Asynchronous Processing for Long-Running Requests – For complex tasks like loan applications, Kafka handles asynchronous processing—initiating background workflows across multiple systems while keeping the customer engaged.
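As a rough illustration of such an always-up-to-date balance view, the following Kafka Streams snippet maintains a running balance per account that a voice-assistant backend could consume or query. Kafka Streams is again used instead of Flink only for brevity, and the topic names as well as the amount-in-cents encoding are illustrative assumptions.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.*;

public class AccountBalanceView {

    public static StreamsBuilder buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Each record: key = accountId, value = signed transaction amount in cents
        KStream<String, Long> transactions = builder.stream(
                "account-transactions", Consumed.with(Serdes.String(), Serdes.Long()));

        // Continuously updated balance per account, backed by a changelog topic
        KTable<String, Long> balances = transactions
                .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
                .reduce(Long::sum, Materialized.with(Serdes.String(), Serdes.Long()));

        // The voice assistant's backend can consume this topic (or query the state store
        // via interactive queries) to answer "What's my balance?" with the latest value.
        balances.toStream().to("account-balances", Produced.with(Serdes.String(), Serdes.Long()));
        return builder;
    }
}
```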

For instance, Northwestern Mutual presented at Kafka Summit how the bank leverages Apache Kafka as a database for real-time transaction processing.

8. Autonomous Finance Platforms: AI-Driven Financial Decision Making

Autonomous finance platforms use AI, machine learning, and multi-agent systems to optimize savings, investments, and budgeting for consumers. These platforms act as digital financial advisors to make real-time decisions based on market data, user spending habits, and risk models.

  • Multi-Agent AI System Coordination – Autonomous finance platforms use multiple AI agents to handle different aspects of financial decision-making (e.g., portfolio optimization, credit assessment, fraud detection). Kafka streams data between these AI agents, ensuring they can collaborate in real time to refine investment and savings strategies.
  • Streaming Market Data Integration – Kafka ingests live stock prices, interest rates, and macroeconomic data, making it instantly available for AI models to adjust financial strategies.
  • Real-Time Customer Insights – Flink continuously analyzes customer transactions and spending behavior to enable AI-driven recommendations (e.g., automatically moving surplus funds into an interest-bearing account).
  • Predictive Portfolio Management – By combining real-time stock market data with AI-driven risk models, Flink helps adjust portfolio allocations based on current trends, ensuring maximum returns while minimizing exposure.
  • Automated Risk Mitigation – Autonomous finance systems must react instantly to market shifts. Flink’s real-time monitoring detects economic downturns or sudden market crashes, triggering immediate adjustments to investment portfolios or loan interest rates.
  • Event-Driven Financial Automation – Kafka enables real-time triggers (e.g., an AI agent detecting high inflation can automatically adjust a savings strategy).

7. RegTech 3.0: Automating Compliance and Risk Monitoring

RegTech is modernizing compliance by replacing slow batch audits with continuous real-time monitoring, automated reporting, and proactive fraud detection.

Financial institutions need instant insights into transactions, risk exposure, and regulatory changes—Kafka and Flink make this possible by streaming, analyzing, and automating compliance at scale.

  • Continuous Transaction Monitoring – Kafka streams every transaction in real time, enabling Flink to detect fraud, money laundering, or unusual patterns instantly—ensuring compliance with AML and KYC regulations.
  • Automated Regulatory Reporting – Flink processes compliance events as they happen, ensuring regulatory bodies receive up-to-date reports without delays. Kafka integrates compliance data across banking systems for audit-ready records.
  • Real-Time Fraud Prevention – Flink analyzes transaction behavior in milliseconds, detecting anomalies and triggering security actions like transaction blocking or multi-factor authentication.
  • Event-Driven Compliance Alerts – Kafka ensures instant alerts when regulations change, allowing banks to adapt in real time instead of relying on manual updates.
  • Proactive Risk Management – By analyzing live risk factors across transactions, users, and markets, Flink helps financial institutions identify and prevent compliance violations before they occur.

Continuous Regulatory Reporting and Compliance in FinServ with Data Streaming using Kafka and Flink

For example, KOR leverages data streaming to revolutionize compliance and regulatory reporting in the derivatives market by enabling on-demand historical reporting and real-time insights that were previously difficult to achieve with traditional batch processing. By using Kafka as a persistent state store, KOR ensures an immutable log of data that allows regulators to track changes over time, reconcile historical corrections, and meet compliance requirements more efficiently than legacy ETL-based big data systems. Read the entire KOR success story in my ebook.

6. Central Bank Digital Currencies (CBDC): The Future of Government-Backed Digital Money

Central Bank Digital Currencies (CBDC) are digital versions of national currencies, designed to enable faster, more secure, and highly scalable financial transactions.

Unlike cryptocurrencies, CBDCs are government-backed, meaning they require robust, real-time infrastructure capable of handling millions of transactions per second. They also need instant settlement, fraud detection, and cross-border interoperability—all of which depend on real-time data streaming.

  • Instant Settlement – Kafka ensures that CBDC transactions are processed and confirmed in real time, eliminating delays in digital payments. This allows central banks to enable 24/7 instant transactions, even in cross-border scenarios.
  • Scalability for Nationwide Adoption – Flink dynamically processes millions of transactions per second, ensuring that a CBDC system can handle high demand without bottlenecks or downtime.
  • Cross-Border Payments & Exchange Rate Optimization – Flink analyzes foreign exchange markets in real time and ensures optimized B2B data exchange for currency conversion and detecting suspicious cross-border activities for fraud prevention.
  • Regulatory Monitoring & Compliance – Kafka continuously streams transaction data to regulatory bodies. This ensures governments have real-time visibility into the movement of digital currencies.

At Kafka Summit Bangalore 2024, Mindgate Solutions presented its successful integration of Central Bank Digital Currency (CBDC) into banking apps, leveraging real-time data streaming to enable seamless digital payments. Mindgate utilized Kafka-based microservices architecture to ensure scalability, security, and reliability, reinforcing its leadership in India’s real-time payments ecosystem while processing over 8 billion transactions per month.

5. Green Fintech Infrastructure: Sustainability and ESG in Finance

Green fintech focuses on tracking carbon footprints, ESG (Environmental, Social, and Governance) investments, and climate risks in real time.

As financial institutions shift towards sustainable investment strategies, they need accurate, real-time data on environmental impact, regulatory compliance, and green investment opportunities.

  • Real-Time Carbon Tracking – Kafka streams emissions and sustainability data from supply chains to enable instant carbon footprint analysis.
  • Automated ESG Compliance – Flink analyzes sustainability reports and investment portfolios, automatically flagging non-compliant companies or assets.
  • Green Investment Insights – Real-time analytics match investors with eco-friendly projects, funds, and companies, helping financial institutions promote sustainable investments.

Event-Driven Architecture for Continuous ESG Optimization

More details about optimizing the ESG footprint with data streaming: “Green Data, Clean Insights: How Kafka and Flink Power ESG Transformations”.

4. AI-Powered Personalized Banking: Hyper-Personalized Customer Experiences

AI-driven banking solutions are transforming how customers interact with financial institutions to provide real-time insights, spending recommendations, and fraud alerts based on user behavior.

  • Real-Time Spending Analysis – Flink continuously processes live transaction data, identifying spending patterns to provide instant budgeting recommendations.
  • Personalized Alerts & Recommendations – Kafka streams transaction events to banking apps, notifying users of unusual spending, low balances, or savings opportunities.
  • Automated Financial Planning – Flink enables AI-driven financial assistance, helping users optimize savings, credit usage, and investments based on real-time insights.

Personalized Omnichannel Customer Experience in FinServ with Data Streaming using Kafka and Flink

A good example is how Erste Group Bank modernized its mobile banking experience with a hyper-personalized approach to ensure that customers receive tailored financial insights while prioritizing data consistency over real-time updates. By offloading data from expensive mainframes to a cloud-native, microservices-driven architecture, Erste Group Bank reduced costs, maintained compliance, and improved operational efficiency—ensuring a seamless flow of consistent, high-quality data across its legacy and modern banking applications. Read the entire Erste Group Bank success story in my ebook.

3. Decentralized Identity Solutions: Secure Identity Without Central Authorities

Decentralized identity solutions allow users to control their personal data, eliminating the need for centralized databases that are vulnerable to hacks. These systems use blockchain and zero-knowledge proofs for secure, passwordless authentication, but require real-time verification and fraud prevention measures.

  • Cybersecurity in Real Time – Kafka streams biometric and identity verification data to fraud detection engines, ensuring instant risk analysis.
  • Passwordless Authentication – Kafka integrates blockchain and biometric authentication to enable real-time identity validation without traditional passwords.
  • Secure KYC (Know Your Customer) Processing – Flink processes identity verification requests instantly, ensuring faster onboarding and fraud-proof financial transactions.

2. Quantum-Resistant Cryptography: Securing Financial Data in the Quantum Era

Quantum computing poses a major risk to traditional encryption methods, requiring financial institutions to adopt post-quantum cryptography to secure sensitive financial transactions and user data.

  • Scalable Cryptographic Upgrades – Streaming data pipelines allow banks to deploy cryptographic updates instantly, ensuring financial systems remain secure without downtime.
  • Threat Detection & Security Analysis – Flink analyzes live transaction patterns to identify potential vulnerabilities in encryption algorithms before they are exploited.

Nobody knows where quantum computing will go. Frankly, this is the only one of the top 10 finance innovations where I am not sure how much data streaming will be able to help, or whether completely new paradigms will emerge.

1. Embedded Finance: Banking Services in Every Digital Experience

Embedded finance integrates banking, payments, lending, and insurance into non-financial platforms, allowing companies like Uber, Shopify, and Apple to offer seamless financial services within their ecosystems.

To function smoothly, embedded finance requires real-time data integration between payment processors, credit scoring systems, fraud detection tools, and regulatory bodies.

  • Instant Payments & Transactions – Kafka streams payment data in real time, enabling seamless in-app purchases and instant money transfers.
  • Real-Time Credit Scoring & Lending – Flink analyzes transaction histories to provide instant credit approvals for loans and BNPL (Buy Now, Pay Later) services.
  • Fraud Prevention & Compliance – Streaming analytics detect suspicious behavior in real time, ensuring secure embedded financial transactions.

Tech giants like Uber and Shopify have embedded financial services directly into their platforms using event-driven architectures powered by Kafka, enabling real-time payments, lending, and fraud detection. By integrating finance seamlessly into their ecosystems, these companies enhance customer experience, create new revenue streams, and redefine how consumers interact with financial services.

Just like Uber and Shopify use event-driven architectures for real-time payments and financial services, Stripe and many similar FinTech companies power embedded finance for businesses by providing seamless, scalable payment infrastructure. To ensure six-nines (99.9999%) availability, Stripe relies on Apache Kafka as its financial source of truth to enable ultra-reliable transaction processing and real-time financial insights.

The Future of FinServ Is Real-Time: Are You Ready for Data Streaming?

The future of finance is real-time, intelligent, and seamlessly integrated into digital ecosystems. The ability to process massive amounts of financial data instantly is no longer optional—it’s a competitive necessity for operational and analytical use cases.

Data streaming with Apache Kafka and Apache Flink provides the foundation for scalability, security, and real-time analytics that modern financial services demand. By embracing data streaming, financial institutions can deliver:

  • Faster transactions
  • Proactive fraud prevention
  • Better customer experiences
  • Regulatory compliance

Finance is evolving from batch processing to real-time intelligence—and the companies that adopt streaming-first architectures will lead the industry into the future.

How do you leverage data streaming with Kafka and Flink in financial services? Let’s discuss on LinkedIn or X (formerly Twitter). Also, join the data streaming community and stay informed about new blog posts by subscribing to my newsletter. And make sure to download my free book about data streaming use cases across all industries.

The Role of Data Streaming in McAfee’s Cybersecurity Evolution
https://www.kai-waehner.de/blog/2025/01/27/the-role-of-data-streaming-in-mcafees-cybersecurity-evolution/ (27 January 2025)

In today’s digital landscape, cybersecurity faces mounting challenges from sophisticated threats like ransomware, phishing, and supply chain attacks. Traditional defenses like antivirus software are no longer sufficient, prompting the adoption of real-time, event-driven architectures powered by data streaming technologies like Apache Kafka and Flink. These platforms enable real-time threat detection, prevention, and response by processing massive amounts of security data from endpoints and systems. A success story from McAfee highlights how transitioning to an event-driven architecture with Kafka in Confluent Cloud has enhanced scalability, operational efficiency, and real-time protection for millions of devices. As cybersecurity threats evolve, data streaming proves essential for organizations aiming to secure their digital assets and maintain trust in an interconnected world.

In today’s digital age, cybersecurity is more vital than ever. Businesses and individuals face escalating threats such as malware, ransomware, phishing attacks, and identity theft. Combatting these challenges requires cutting-edge solutions that protect computers, networks, and devices. Beyond safeguarding digital assets, modern cybersecurity tools ensure compliance, privacy, and trust in an increasingly interconnected world.

As threats grow more sophisticated, the technologies powering cybersecurity solutions must advance to stay ahead. Data streaming technologies like Apache Kafka and Apache Flink have become foundational in this evolution, enabling real-time threat detection, prevention, and rapid response. These tools transform cybersecurity from static defenses to dynamic systems capable of identifying and neutralizing threats as they occur.

A notable example is McAfee, a global leader in cybersecurity, which has embraced data streaming to revolutionize its operations. By transitioning to an event-driven architecture powered by Apache Kafka, McAfee processes massive amounts of real-time data from millions of endpoints, ensuring instant threat identification and mitigation. This integration has enhanced scalability, reduced infrastructure complexity, and accelerated innovation, setting a benchmark for the cybersecurity industry.

Real-time data streaming is not just an advantage—it’s now a necessity for organizations aiming to safeguard digital environments against ever-evolving threats.

Data Streaming with Apache Kafka and Flink as Backbone for Real Time Cybersecurity at McAfee

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter, and follow me on LinkedIn or X (formerly Twitter) to stay in touch.

Antivirus is NOT Enough: Supply Chain Attack

A supply chain attack occurs when attackers exploit vulnerabilities in an organization’s supply chain, targeting weaker links such as vendors or service providers to indirectly infiltrate the target.

For example, an attacker compromises Vendor 1, a software provider, by injecting malicious code into their product. Vendor 2, a service provider using Vendor 1’s software, becomes infected. The attacker then leverages Vendor 2’s connection to the Enterprise to access sensitive systems, even though Vendor 1 has no direct interaction with the enterprise.

The Anatomy of a Supply Chain Attack in Cybersecurity

Traditional antivirus software is insufficient to prevent such complex, multi-layered attacks. Ransomware often plays a role in supply chain attacks, as attackers use it to encrypt data or disrupt operations across compromised systems.

Modern solutions focus on real-time monitoring and event-driven architecture to detect and mitigate risks across the supply chain. These solutions utilize behavioral analytics, zero trust policies, and proactive threat intelligence to identify and stop anomalies before they escalate.

By providing end-to-end visibility, they protect organizations from cascading vulnerabilities that traditional endpoint security cannot address. In today’s interconnected world, comprehensive supply chain security is critical to safeguarding enterprises.

The Role of Data Streaming in Cybersecurity

Cybersecurity platforms must rely on real-time data for detecting and mitigating threats. Data streaming provides a backbone for processing massive amounts of security event data as it happens, ensuring swift and effective responses. My blog series on Kafka and cybersecurity takes a deep dive into these use cases.

Cybersecurity for Situational Awareness and Threat Intelligence in Smart Buildings and Smart City

To summarize:

  • Data Collection: A data streaming platform powered by Apache Kafka collects logs, telemetry, and other data from devices and applications in real time.
  • Data Processing: Stream processing frameworks like Kafka Streams and Apache Flink continuously process this data with low latency at scale for analytics, identifying anomalies or potential threats.
  • Actionable Insights: The processed data feeds into Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) systems, enabling automated responses and better decision-making.

This approach transforms static, batch-driven cybersecurity operations into dynamic, real-time processes.
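As a simple illustration of this collect-process-act flow, the sketch below consumes raw security events, applies a trivial rule, and forwards suspicious events to a topic that a SIEM or SOAR system could subscribe to. The topic names, the severity-based rule, and the JSON layout are illustrative assumptions; a real deployment would use behavioral analytics or ML scoring instead of a string match.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class SecurityEventFilter {

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "security-event-filter");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {

            consumer.subscribe(List.of("endpoint-security-events"));
            while (true) {
                for (ConsumerRecord<String, String> event : consumer.poll(Duration.ofMillis(500))) {
                    // Placeholder rule: real systems would apply behavioral analytics or ML scoring here
                    if (event.value().contains("\"severity\":\"HIGH\"")) {
                        producer.send(new ProducerRecord<>("siem-alerts", event.key(), event.value()));
                    }
                }
            }
        }
    }
}
```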

McAfee: A Real-World Data Streaming Success Story

McAfee is a global leader in cybersecurity, providing software solutions that protect computers, networks, and devices. Founded in 1987, the company has evolved from traditional antivirus software to a comprehensive suite of products focused on threat prevention, identity protection, and data security.

McAfee Antivirus and Cybersecurity Solutions
Source: McAfee

McAfee’s products cater to both individual consumers and enterprises, offering real-time protection through partnerships with global internet service providers (ISPs) and telecom operators.

Mahesh Tyagarajan (VP, Platform Engineering and Architecture at McAfee) spoke with Confluent and Forrester about McAfee’s transition from a monolith to event-driven microservices leveraging Apache Kafka in Confluent Cloud.

Data Streaming at McAfee with Apache Kafka Leveraging Confluent Cloud

As cyber threats have grown more complex, McAfee’s reliance on real-time data streaming has become essential. The company transitioned from a monolithic architecture to a microservices-based ecosystem with the help of Confluent Cloud, powered by Apache Kafka. The fully managed data streaming platform simplified infrastructure management, boosted scalability, and accelerated feature delivery for McAfee.

Use Cases for Data Streaming

  1. Real-Time Threat Detection: McAfee processes security events from millions of endpoints, ensuring immediate identification of malware or phishing attempts.
  2. Subscription Management: Data streaming supports real-time customer notifications, updates, and billing processes.
  3. Analytics and Reporting: McAfee integrates real-time data streams into analytics systems, providing insights into user behavior, threat patterns, and operational efficiency.

Transition to an Event-Driven Architecture and Microservices

By moving to an event-driven architecture with Kafka using Confluent Cloud, McAfee:

  • Standardized its data streaming infrastructure.
  • Decoupled systems using microservices, enabling scalability and resilience.
  • Improved developer productivity by reducing infrastructure management overhead.

This transition to data streaming with a fully managed, complete and secure cloud service empowered McAfee to handle high data ingestion volumes, manage hundreds of millions of devices, and deliver new features faster.

Business Value of Data Streaming

The adoption of data streaming delivered significant business benefits:

  • Improved Customer Experience: Real-time threat detection and personalized updates enhance trust and satisfaction.
  • Operational Efficiency: Automation and reduced infrastructure complexity save time and resources.
  • Scalability: McAfee can now support a growing number of devices and data sources without compromising performance.

Data Streaming as the Backbone of an Event-Driven Cybersecurity Evolution in the Cloud

McAfee’s journey showcases the transformative potential of data streaming in cybersecurity. By leveraging Apache Kafka as fully managed cloud service as the backbone of an event-driven microservices architecture, the company has enhanced its ability to detect threats, respond in real time, and deliver exceptional customer experiences.

For organizations looking to stay ahead in the cybersecurity race, investing in real-time data streaming technologies is not just an option—it’s a necessity. To learn more about how data streaming can revolutionize cybersecurity, explore my cybersecurity blog series and follow me for updates on LinkedIn or X (formerly Twitter).

The State of Data Streaming for the Public Sector
https://www.kai-waehner.de/blog/2023/08/02/the-state-of-data-streaming-for-the-public-sector-in-2023/ (2 August 2023)

This blog post explores the state of data streaming for the public sector and government. Data streaming provides consistency across all layers and allows integrating and correlating data in real-time at any scale. I look at public sector trends to explore how Apache Kafka helps as a business enabler, including case studies from the US Department of Defense (DoD), NASA, Deutsche Bahn (German Railway), and others. A complete slide deck and on-demand video recording are included.

This blog post explores the state of data streaming for the public sector. The evolution of government digitalization, citizen expectations, and cybersecurity risks requires optimized end-to-end visibility into information, comfortable mobile apps, and integration with legacy platforms like mainframe in conjunction with pioneering technologies like social media. Data streaming provides consistency across all layers and allows integrating and correlating data in real-time at any scale. I look at public sector trends and explore how data streaming with Apache Kafka acts as a business enabler, including customer stories from the US Department of Defense (DoD), NASA, Deutsche Bahn (German Railway), and others. A complete slide deck and on-demand video recording are included.

The State of Data Streaming for the Public Sector in 2023

The public sector covers so many different areas. Examples include defense, law enforcement, national security, healthcare, public administration, police, judiciary, finance and tax, research, aerospace, agriculture, etc. Many of these terms and sectors overlap, and many of the use cases apply across several of them.

Several disruptive trends impact innovation in the public sector to automate processes, provide a better experience for citizens, and strengthen cybersecurity defense tactics.

The two critical pillars across departments in the public sector are IT modernization and data-driven applications.

IT modernization in the government

The research company Gartner identified the following technology trends for the government to accelerate the digital transformation as they prepare for post-digital government:

Gartner Top Technology Trends in Government for 2023

These trends do not differ much from those in private-sector industries like banking or insurance. Data consistency across monolithic legacy infrastructure and cloud-native applications matters.

Accelerating data maturity in the public sector

The public sector is often still slow to innovate, yet time-to-market is crucial. IT modernization requires up-to-date technologies and development principles. Data sharing across applications, departments, or states requires a data-driven enterprise architecture.

McKinsey & Company says “Government entities have created real-time pandemic dashboards, conducted geospatial mapping for drawing new public transportation routes, and analyzed public sentiment to inform economic recovery investment.

While many of these examples were born out of necessity, public-sector agencies are now embracing the impact that data-driven decision making can have on residents, employees, and other agencies. Embedding data and analytics at the core of operations can help optimize government resources by targeting them more effectively and enable civil servants to focus their efforts on activities that deliver the greatest results.”

McKinsey and Company - Accelerating Data Maturity in the Government

AI and Machine Learning help with automation. Chatbots and other conversational AI improve the total experience of citizens and public sector employees.

Data streaming in the government and public sector

Real-time data beats slow data in almost all use cases. No matter which agency or department you look at in the government and public sector:

Real-Time Data Streaming in the Government and Public Sector

Data streaming combines the power of real-time messaging at any scale with storage for true decoupling, data integration, and data correlation capabilities. Apache Kafka is the de facto standard for data streaming.
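One concrete way to see the "storage for true decoupling" aspect is the topic configuration itself: with a generous retention period, any new consumer (another agency, an analytics team, an audit job) can replay past events independently of the producer. The snippet below creates such a topic with the Kafka AdminClient; the topic name, partition and replication counts, and the 30-day retention are illustrative assumptions.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class CitizenEventsTopicSetup {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 30 days of retained history lets new consumers replay past events
            // independently of the original producer - true decoupling.
            NewTopic topic = new NewTopic("citizen-service-events", 6, (short) 3)
                    .configs(Map.of("retention.ms", String.valueOf(30L * 24 * 60 * 60 * 1000)));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```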

Check out the below links for a broad spectrum of examples and best practices. Additionally, here are a few new customer stories from the last months.

New customer stories for data streaming in the public sector and government

So much innovation is happening worldwide, even in the “slow” public sector. Automation and digitalization change how we search and buy products and services, communicate with partners and customers, provide hybrid shopping models, and more.

More and more governments and non-profit organizations use a cloud-first approach to improve time-to-market, increase flexibility, and focus on business logic instead of operating IT infrastructure.

Here are a few customer stories from worldwide organizations in the public sector and government:

  • University of California, San Diego: Integration Platform as a Service (iPaaS) as “Swiss army knife” of integration.
  • U.S. Citizenship and Immigration Services (USCIS): Real-time inter-agency data sharing.
  • Deutsche Bahn (German Railway): Customer data platform for real-time notification about delays and cancellations, plus B2B integration with Google Maps.
  • NASA: General Coordinates Network (GCN) for multi-messenger astronomy alerts between space- and ground-based observatories, physics experiments, and thousands of astronomers worldwide.
  • US Department of Defense (DOD): Joint All Domain Command and Control (JADC2), a strategic warfighting concept that connects the data sensors, shooters, and related communications devices of all U.S. military services. DOD uses the ride-sharing service Uber as an analogy to describe its desired end state for JADC2 leveraging data streaming.

Resources to learn more

This blog post is just the starting point.

I wrote a blog series exploring why many governments and public infrastructure sectors leverage data streaming for various use cases. Learn about real-world deployments and different architectures for data streaming with Apache Kafka in the public sector:

  1. Life is a Stream of Events
  2. Smart City
  3. Citizen Services
  4. Energy and Utilities
  5. National Security

Learn more about data streaming for the government and public sector in the following on-demand webinar recording, the related slide deck, and further resources, including pretty cool lightboard videos about use cases. I presented with my colleague Will La Forest, SME for the public sector and government.

On-demand video recording

The video recording explores public sector trends and architectures for data streaming leveraging Apache Kafka and other modern and cloud-native technologies. The primary focus is the data streaming case studies. Check out our on-demand recording:

The State of Data Streaming for Public Sector and Government in 2023

Slides

If you prefer learning from slides, check out the deck used for the above recording:


Case studies and lightboard videos for data streaming in the public sector and government

The state of data streaming for the public sector in 2023 is fascinating. New use cases and case studies come up every month. Mission-critical deployments at governments in the United States and Germany prove the maturity of data streaming concerning security and data privacy. The success stories prove better data governance across the entire organization, secure data collection and processing in real-time, data sharing and cross-agency partnerships with Open APIs for new business models, and many more scenarios.

We recorded lightboard videos showing the value of data streaming simply and effectively. These five-minute videos explore the business value of data streaming, related architectures, and customer stories. Stay tuned; I will update the links in the next few weeks and publish a separate blog post for each story and lightboard video.

And this is just the beginning. Every month, we will talk about the status of data streaming in a different industry. Manufacturing was the first. Financial services second, then retail, telcos, gaming, and so on…

Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.

Apache Kafka in Crypto and FinServ for Cybersecurity and Fraud Detection
https://www.kai-waehner.de/blog/2022/04/29/apache-kafka-crypto-finserv-cybersecurity-fraud-detection-real-time/ (29 April 2022)

The insane growth of the crypto and fintech market brings many unknown risks and successful cyberattacks to steal money and crypto coins. This post explores how data streaming with the Apache Kafka ecosystem enables real-time situational awareness and threat intelligence to detect and prevent hacks, money loss, and data breaches. Enterprises stay compliant with the law and keep customers happy in any innovative Fintech or Crypto application.

Cybersecurity in Crypto and FinTech with Data Streaming and Apache Kafka

The Insane Growth of Crypto and FinTech Markets

The crypto and fintech markets are growing like crazy. Not every new crypto coin or blockchain is successful. Only a few fintechs like Robinhood in the US or Trade Republic in Europe are successful. In recent months, the crypto market has been a bear market (I am writing this in April 2022).

Nevertheless, the overall global interest, investment, and growth in this market are unbelievable. Here is just one of many impressive statistics:

One in 5 US adults has invested in, traded or used cryptocurrency like Bitcoin or Ethereum in 2022

This survey came from NBC News, but you can find similar numbers in many other news portals across the globe.

The Threat is Real: Data Breaches, Hacks, Stolen Crypto!

With the growth of cryptocurrencies, blockchains, and crypto/NFT markets, in conjunction with very intuitive crypto trading mobile apps and popular “normal” trading apps adding crypto support, cyberattacks are more dangerous than ever before.

Let’s look at two of the many recent successful cyberattacks against crypto markets to steal cryptocurrencies and explain why any crypto marketplace or trading app can be the next victim.

Supply Chain Attacks for Cyberattacks

While it feels safer to trust a well-known crypto marketplace (say, Binance, Coinbase, or Crypto.com), appearances are deceiving. Many successful cyberattacks these days in the crypto and non-crypto world happen via supply chain attacks:

Supply Chain Attack for Data Breaches in Cybersecurity

A supply chain attack means even if your infrastructure and applications are secure, attackers still get in via your certified B2B partners (like your CRM system or 3rd party payment integration). If your software or hardware partner gets hacked, the attacker gains access to you.

Hence, a continuous internal cybersecurity strategy with real-time data processing and a zero-trust approach is the only suitable option to provide your customers with a trustworthy and secure environment.

Examples of Successful Crypto Cyberattacks

There are so many successful hacks in the crypto space. Many don’t even make it into the prominent newspapers, even though coins worth millions of dollars are usually stolen.

Let’s look at two examples of successful supply chain attacks:

  • Hubspot CRM was hacked. Consequently, the crypto companies BlockFi, Swan Bitcoin, and Pantera had to advise users on how to stay safe. (source: Crypto News)
  • A MailChimp “insider” had carried out the phishing attack by sending malicious links to users of the multimedia platform. This included a successful phishing attack to steal funds stored in Trezor, a popular cryptocurrency wallet company. (source: Crypto Potato)

Obviously, this is not just a problem for crypto and fintech enterprises. Any other customer of hacked software needs to act the same way. For context, I chose crypto companies in the above examples.

Cybersecurity: Situational Awareness and Threat Intelligence with Apache Kafka

Cybersecurity in real-time is mandatory to fight successfully against cyberattacks. I wrote a blog series about how data streaming with Apache Kafka helps secure any infrastructure. Learn about use cases,  architectures, and reference deployments for Kafka in the cybersecurity space:

Cybersecurity with Apache Kafka for Crypto Markets

Many crypto markets today use data streaming with Apache Kafka for various use cases. If done right, Kafka provides a secure, tamper-proof, encrypted data hub for processing events in real-time and for doing analytics of historical events with one scalable infrastructure:

Kafka for data processing in the crypto world
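On the “secure, tamper-proof, encrypted” point, the snippet below shows what a typical client-side security configuration for an authenticated, TLS-encrypted Kafka connection can look like. The broker address, principal, and secret are placeholders, and SASL/PLAIN is only one of several supported mechanisms; broker-side authorization (ACLs or role-based access control) and audit logging complement such a client configuration.

```java
import java.util.Properties;

public class SecureKafkaClientConfig {

    public static Properties secureClientProperties() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker.example.com:9093");   // placeholder broker address
        props.put("security.protocol", "SASL_SSL");                   // TLS encryption in transit
        props.put("sasl.mechanism", "PLAIN");                         // alternatives: SCRAM-SHA-512, OAUTHBEARER, GSSAPI
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"trading-app\" password=\"<secret>\";");
        // Verify the broker's TLS certificate hostname to prevent man-in-the-middle attacks
        props.put("ssl.endpoint.identification.algorithm", "https");
        return props;
    }
}
```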

If you want to learn more about “Kafka and Crypto” use cases, architectures, and success stories, check out this blog: Apache Kafka as Data Hub for Crypto, DeFi, NFT, Metaverse – Beyond the Buzz.

Kafka Architecture for Real-Time Cybersecurity in a Crypto Infrastructure

Let’s now look at a concrete example for integrating, correlating, and applying transactional and analytical information in a crypto environment with the power of the Kafka ecosystem. Here is the overview:

Real-Time Cyber Analytics in the Crypto Backbone with Kafka

Data Producers from Blockchains, Crypto Markets, and the CRM system

Data comes from various sources:

  • Back-end applications include internal payment processors, fraud applications, customer platforms, and loyalty apps.
  • Third-party crypto and trading marketplaces like Coinbase, Binance, and Robinhood and direct transaction data from blockchains like Bitcoin or Ethereum.
  • External data and customer SaaS such as Salesforce or Snowflake.

The data includes business information, transactional workloads, and technical logs at different volumes and is integrated via various technologies, communication paradigms, and APIs:

Data Producers from Blockchains, Crypto Markets, and the CRM system

Streaming ETL at any scale is a vital strength of the Kafka ecosystem and is often the first choice in data integration, ETL, and iPaaS evaluations. Combining transactional and analytical workloads within Kafka as the event data hub is also widespread.

Real-Time Data Processing for Crypto Threat Intelligence with Machine Learning

The key benefit is not sending data from A to B in real-time but correlating the data from different sources. This enables detecting suspicious events that might be the consequence of a cyberattack:

Real-Time Data Processing for Crypto Threat Intelligence

AI and Machine Learning help build more advanced use cases and are very common in the Kafka world.

Data Consumers for Alerting and Regulatory Reporting

Real-time situational awareness and threat intelligence are the most crucial application of data streaming in the cybersecurity space. Additionally, many other data sinks consume the data, for instance, for compliance, regulatory reporting, and batch analytics in a data lake or lakehouse:

Data Consumers for Alerting and Regulatory Reporting with Kafka

Kafka enables a Kappa architecture that simplifies real-time AND batch architectures compared to the much more complex and costly Lambda architecture.
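One Kappa-style property is easy to illustrate: because Kafka retains the event log, a reporting or backfill job can reprocess history with the same code path that serves the real-time consumers, simply by starting a fresh consumer group at the earliest retained offset. A minimal sketch follows; the topic name and group id are illustrative assumptions.

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ComplianceReplay {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "regulatory-report-replay");   // new group, so no committed offsets
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");           // start at the beginning of the retained log
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("crypto-transactions"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    // Feed the same logic used by the real-time path, e.g. to rebuild a compliance report
                    System.out.printf("replaying offset %d: %s%n", record.offset(), record.value());
                }
            }
        }
    }
}
```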

Data Streaming with Kafka to Fight Cyberattacks in the Crypto and FinTech Space

Supply chain attacks require not just a secure environment but continuous threat intelligence. Data streaming with the Apache Kafka ecosystem builds the foundation. The example architecture showed how to integrate internal systems, external blockchains, and crypto markets to correlate data in motion.

Kafka is not a silver bullet but the backbone to provide a scalable real-time data hub for your mission-critical cybersecurity infrastructure. If you deploy cloud-native applications (like most fintech and crypto companies), check out serverless data architectures around Kafka and Data Lakes and compare Kafka alternatives in the cloud, like Amazon MSK, Confluent Cloud, or Azure Event Hubs.

How do you use Apache Kafka with cryptocurrencies, blockchain, or other fintech applications? Do you deploy in the public cloud and leverage a serverless Kafka SaaS offering? What other technologies do you combine with Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

Apache Kafka Landscape for Automotive and Manufacturing
https://www.kai-waehner.de/blog/2022/01/12/apache-kafka-landscape-for-automotive-and-manufacturing/ (12 January 2022)

Apache Kafka is the central nervous system of many applications in various areas related to the automotive and manufacturing industry for processing analytical and transactional data in motion across edge, hybrid, and multi-cloud deployments. This article explores the event streaming landscape for automotive, including connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models.

Before the Covid pandemic, I had the pleasure of visiting “Motor City” Detroit in November 2019. I met with several automotive companies, suppliers, startups, and cloud providers to discuss use cases and architectures around Apache Kafka. A lot has happened. Since then, I have also met several OEMs and suppliers in Europe and Asia. As I finally go back to Detroit this January 2022 to meet customers again, I thought it would be a good time to update the status quo of event streaming and Apache Kafka in the automotive and manufacturing industry.

Today, in 2022, Apache Kafka is the central nervous system of many applications in various areas related to the automotive and manufacturing industry for processing analytical and transactional data in motion across edge, hybrid, and multi-cloud deployments. This article explores the automotive event streaming landscape, including connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models.

Automotive and Manufacturing Landscape for Apache Kafka

The Event Streaming Landscape for Automotive and Manufacturing

Every business domain leverages Event Streaming with Apache Kafka in the automotive and manufacturing industry. Data in motion helps everywhere. The infrastructure and deployment differ depending on the use case and requirements. I have seen everything at carmakers and manufacturers across the globe:

  • Cloud-first strategy with all new business applications in the public cloud deployed and connected across regions and even continents
  • Hybrid integration scenarios between legacy applications in the data center and modern cloud-native services in the public cloud
  • Edge computing in a smart factory for low latency, cost-efficient data processing, and cybersecurity
  • Embedded Kafka brokers in machines and vehicles at the disconnected edge

This spread of use cases is impressive. The following diagram depicts a high-level overview:

Automotive and Manufacturing Landscape for Apache Kafka with Edge and Hybrid Cloud

The following sections describe the automotive and manufacturing landscape for event streaming in more detail:

  • Manufacturing 4.0
  • Supply Chain Optimization
  • Mobility Services
  • New Business Models

If you are mainly interested in real-world Kafka deployments with examples from BMW, Porsche, Audi, Tesla, and other OEMs, check out the article “Real-World Deployments of Kafka in the Automotive Industry“.

If you want to understand why Kafka makes such a difference in automotive and manufacturing, check out the article “Apache Kafka in the Automotive Industry“. This article explores the business motivation for these game-changing concepts of data in motion for the digitalization of the automotive industry.

Before you start reading the below section, I want to clearly emphasize that Kafka is not the silver bullet for every problem. “When NOT to use Apache Kafka?” digs deep into this discussion.

I keep the following sections relatively short to give a high-level overview. Each section contains links to more deep-dive articles about the topics.

Manufacturing 4.0

Industrial IoT (IIoT), also known as Industry 4.0, changes how the shop floor and production lines produce goods. Automation, process efficiency, and a much better Overall Equipment Effectiveness (OEE) enable cost reduction and flexibility in the production process:

Manufacturing and Industrial IoT with Apache Kafka

Smart Factory

A smart factory is not necessarily a newly built building like a Tesla Gigafactory. Many enterprises install smart technology like networked sensors for temperature or vibration measurements into old factories. Improving the Overall Equipment Effectiveness (OEE) is the primary goal of most use cases. Many scenarios leverage Kafka for continuously processing sensor and telemetry data in motion.
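As a small illustration of such sensor ingestion, the following producer publishes a temperature and vibration reading per machine. The broker address, topic name, and JSON layout are illustrative assumptions; in practice, the payload would typically be Avro or Protobuf with a schema registry, and readings would be sent continuously rather than once.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class MachineTelemetryProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "edge-broker:9092");   // placeholder edge cluster
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");   // do not lose measurements on broker failover

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String machineId = "press-line-7";
            String reading = String.format(
                    "{\"machineId\":\"%s\",\"temperatureC\":74.2,\"vibrationMm\":0.31,\"ts\":%d}",
                    machineId, System.currentTimeMillis());
            // Key by machine id so all readings of one machine stay ordered within one partition
            producer.send(new ProducerRecord<>("machine-telemetry", machineId, reading));
            producer.flush();
        }
    }
}
```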

Legacy Modernization with Open APIs and Hybrid Cloud

Factories exist for decades after they are built. Digitalization and the modernization of legacy technologies are some of the biggest challenges in IIoT projects. Such an initiative usually includes several tasks:

Continuous Data-driven Engineering and Product Development

Last but not least, an opportunity many people underestimate: Continuous data streaming with Kafka enables new possibilities in software engineering and product development for IoT and automotive projects.

For instance, developing and deploying the “big loop” for machine learning of advanced driver-assistance systems (ADAS) or self-driving functions based on sensor data from the fleet is a new way of software engineering. Tesla’s Kafka-based data platform is a fantastic example. A related use case in engineering is the ingest of sensor data during and after test drives.

Supply Chain Optimization

Supply chain processes and solutions are very complex. The Covid pandemic showed how only flexible enterprises could survive, stay profitable, and provide a great customer experience, even in disastrous external events.

Here are the top 5 critical challenges of supply chains:

  • Time Frames are Shorter
  • Rapid Change
  • Zoo of Technologies and Products
  • Historical Models are No Longer Viable
  • Lack of Visibility

Only real-time data streaming and correlation solve these supply chain challenges end-to-end across regions and companies:

Supply Chain Optimization in Automotive at the Edge and in the Cloud with Apache Kafka

In a separate, detailed blog post, I covered Supply Chain Optimization (SCM) with Apache Kafka. Check it out to learn about real-world supply chain use cases from Bosch, BMW, Walmart, and other companies.

Intra-logistics and Global Distribution Networks

Logistics and supply chains within a factory, distribution center, or store require real-time data integration and processing to handle goods efficiently and provide a great customer experience. Batch processes or manual interaction by human workers cannot implement these use cases. Examples include:

Track & Trace and Fleet Management

Real-time logistics is a game-changer for fleet management and track & trace use cases.

  • Commercial motor vehicles such as cars, vans, trucks, specialist vehicles (such as mobile construction machinery), forklifts, and trailers
  • Private vehicles used for work (the ‘grey fleet’)
  • Aviation machinery such as aircraft (planes and helicopters)
  • Ships
  • Rail cars
  • Non-powered assets such as generators, tanks, gearboxes

All the following aspects are not new. The difference is that event streaming allows these tasks to run continuously in real time to act on new information in motion (a small enrichment sketch follows the list):

  • Visualization
  • Location-based services
  • Routing and navigation
  • Estimated time of arrival
  • Alerting
  • Proactive recalculation
  • Monitoring of the assets and mechanical components of a vehicle
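The sketch below illustrates the enrichment step behind such fleet features: continuous vehicle positions are joined with each vehicle's current assignment so that a downstream step can recompute estimated arrival times or raise alerts. Kafka Streams is used for brevity, and the topic names and naive JSON handling are illustrative assumptions.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.*;

public class FleetPositionEnrichment {

    public static StreamsBuilder buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Latest assignment per vehicle (key = vehicleId), e.g. destination and planned route
        KTable<String, String> assignments = builder.table(
                "vehicle-assignments", Consumed.with(Serdes.String(), Serdes.String()));

        // Continuous GPS updates per vehicle (key = vehicleId)
        KStream<String, String> positions = builder.stream(
                "vehicle-positions", Consumed.with(Serdes.String(), Serdes.String()));

        // Enrich every position event with the vehicle's current assignment; downstream
        // processors can recompute the estimated time of arrival or trigger geofence alerts.
        positions.join(assignments, (position, assignment) ->
                        "{\"position\":" + position + ",\"assignment\":" + assignment + "}")
                .to("vehicle-positions-enriched", Produced.with(Serdes.String(), Serdes.String()));

        return builder;
    }
}
```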

Most companies have a cloud-first strategy for building such a platform. However, some cases require edge computing either via local 5G location for low latency use cases or embedded Kafka brokers for disconnected data collection and analytics within the vehicles.

Streaming Data Exchange for B2B Collaboration with Partners

Real-time data is not just relevant within a company. OEMs and Tier 1 and Tier 2 suppliers benefit in the same way from data streams. The same is true for car dealerships, end customers, and any other consumer of the data. Hence, a clear trend in the market is the emergence of a Kafka-based streaming data exchange across companies to build a data mesh.

I have often seen this situation in the past: The OEM leverages event streaming. The Tier 1 supplier leverages event streaming. The ERP solution in use is built on Kafka, too. All leverage the capabilities of scalable real-time data streaming. It makes little sense to integrate with partners and software vendors via web service APIs, such as SOAP or HTTP/REST. Instead, a streaming interface is a natural choice to hand streaming data to partners.

The following example from the automotive industry shows how independent stakeholders (= domains in different enterprises) use a cross-company streaming data exchange:

Streaming Data Exchange with Data Mesh in Motion using Apache Kafka and Cluster Linking

Mobility Services

Every OEM, supplier, or innovative startup in the automotive space thinks about providing a mobility service either on top of the goods they sell or as an independent service.

Most mobility services on your mobile apps used today for business or privately are only possible because of a scalable real-time backbone powered by event streaming:

Mobility Services and Connected Cars with Event Streaming and Apache Kafka

The possibilities for mobility services are endless. A few examples that are mainstream today already:

  • Omnichannel retail and aftersales to buy additional car features online, for instance, more power, seat heater, up-to-date navigation, self-driving software (okay, the latter one is not mainstream yet, but Tesla shows where it goes)
  • Connected Cars for ride-hailing, scooter rental, taxi services, food delivery
  • 3rd party integration for adding services that a company does not want to build by themselves

Today’s most successful and widely adopted mobility services are independent of a specific carmaker or supplier.

Examples of prominent Kafka-powered consumer mobility services are Uber and Lyft in the US, Grab in Asia, and FREENOW in Europe. Here Technologies is an excellent example of a B2B mobility service, providing mapping information so that companies can build new applications or improve existing ones on top of it.

A good starting point to learn more is my blog post about Apache Kafka and MQTT for mobility services and transportation.

New Business Models

The access to real-time data enables companies to build entirely new business models on top of their existing products:

New Automotive Business Models enabled by Event Streaming with Apache Kafka

A few examples:

  • Next-generation car rental with an excellent customer experience, context-specific coupons, a loyalty platform, and rental fleets combined with other services from the carmaker.
  • Reinventing car insurance with driver-specific pricing based on real-time analysis of each driver’s behavior, instead of legacy approaches using statistical models with attributes like driver age, number of past accidents, etc.
  • Becoming a data provider for monetization, enabling other companies to build new business models with your car data – for instance, working with a government to build a smart city traffic system, or with a mobility service startup to analyze and correlate car data across OEMs.

This evolution is just the beginning of the usage of streaming data. I have seen many customers build a first streaming pipeline for one use case. However, new business divisions will leverage the data for innovations when the platform is there.

The Data is in Motion in Automotive and Manufacturing

The landscape for Apache Kafka in the automotive and manufacturing industry showed that Kafka is the central nervous system of many applications across various areas, processing analytical and transactional data in motion.

This article explored use cases such as connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models. The possibilities for data in motion are almost endless. The automotive and manufacturing industry is still in the very early stages of leveraging data in motion.

Where do you use Apache Kafka and its ecosystem in the automotive and manufacturing industry? Do you deploy in the public cloud, in your data center, or at the edge outside a data center? What other technologies do you combine with Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka Landscape for Automotive and Manufacturing appeared first on Kai Waehner.

]]>
IoT Analytics with Kafka for Real Estate and Smart Building https://www.kai-waehner.de/blog/2021/11/25/iot-analytics-apache-kafka-smart-building-real-estate-smart-city/ Thu, 25 Nov 2021 13:59:25 +0000 https://www.kai-waehner.de/?p=3973 This blog post explores how event streaming with Apache Kafka enables IoT analytics for cost savings, better consumer experience, and reduced risk in real estate and smart buildings. Examples include improved real estate maintenance and operations, smarter energy consumption, optimized space usage, better employee experience, and better defense against cyber attacks.

The post IoT Analytics with Kafka for Real Estate and Smart Building appeared first on Kai Waehner.

]]>
Smart building and real estate generate enormous opportunities for governments and private enterprises in the smart city sector. This blog post explores how event streaming with Apache Kafka enables IoT analytics for cost savings, better consumer experience, and reduced risk. Examples include improved real estate maintenance and operations, smarter energy consumption, optimized space usage, better employee experience, and better defense against cyber attacks.

Apache Kafka Smart Building Real Estate Smart City Energy Consumption IoT Analytics

This post results from many customer conversations in this space, inspired by the article “5 Examples of IoT and Analytics at Work in Real Estate” from IT Business Edge.

Data in Motion for Smart City and Real Estate

A smart city is an urban area that uses different electronic Internet of Things (IoT) sensors to collect data and then uses the insights gained from that data to efficiently manage assets, resources, and services. Apache Kafka fits into the smart city architecture as the backbone for real-time streaming data integration and processing. Kafka is the de facto standard for Event Streaming.

The Government-owned Event Streaming platform from the Ohio Department of Transportation (ODOT) is a great example. Many smart city architectures are hybrid and require the combination of various technologies and communication paradigms like data streaming, fire-and-forget, and request-response. For instance, Kafka and MQTT enable the last-mile integration and data correlation of IoT data in real-time at scale.

Event Streaming is possible everywhere, in the traditional data center, the public cloud, or at the edge (outside a data center):

Smart City with Smart Buildings and Real Estate leveraging Apache Kafka

IoT Analytics Use Cases for Event Streaming with Smart Building and Real Estate

Real estate and buildings are crucial components of a smart city. This post explores various use cases for IoT analytics with event streaming to improve the citizen experience and reduce maintenance and operations costs using smart buildings.

The following sections explore these use cases:

  • Optimized Space Usage within a Smart Building
  • Predictive Analytics and Preventative Maintenance
  • Smart Home Energy Consumption
  • Real Estate Maintenance and Operations
  • Employee Experience in a Smart Building
  • Cybersecurity for Situational Awareness and Threat Intelligence

Optimized Space Usage within a Smart Building

Optimized space usage within buildings is crucial from an economic perspective. It enables sizing space according to rental demand and reducing building maintenance costs.

A few examples for data processing related to space optimization:

  • Count people entering and leaving the premises with real-time alerting
  • Track the walking behavior of visitors with continuous real-time aggregation of various data sources
  • Adjust space usage during an event to optimize the customer experience; e.g., rearranging the chairs, tables, signs, etc. in a conference ballroom during the conference (as the next conference will have different people, requirements, and challenges)
  • Plan future building, room, or location constructions with batch analytics on historical information

Optimized Space Usage within Smart Buildings

Predictive Analytics and Preventative Maintenance

Predictive analytics and preventative maintenance require real-time data processing. The monitoring of critical building assets and equipment such as air conditioning, elevators, and lighting prevents breakdowns and improves efficiency:

Predictive Analytics and Preventative Maintenance in a Smart Building using Kafka Streams and ksqlDB

Continuous data processing is possible either in a stateless or a stateful way. Here are two examples (a minimal sketch of the stateless variant follows the list):

  • Stateful preventive maintenance: Continuous tilt and shock detection by calculating a rolling average value
  • Stateless condition monitoring: Detecting temperature and humidity spikes by filtering above-threshold events
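
For illustration, here is a minimal sketch of the stateless variant in Python with the confluent-kafka client. The topic names, the JSON payload layout, and the threshold are assumptions for this example; in practice, the same logic is often expressed with Kafka Streams or ksqlDB as described above.

```python
# Minimal sketch: stateless condition monitoring that forwards above-threshold readings.
# Topic names, payload layout, and threshold are assumptions for illustration.
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "condition-monitoring",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["building-sensors"])  # hypothetical topic with {"room": "A1", "temperature_c": 31.5}

TEMPERATURE_THRESHOLD_C = 30.0  # assumed alerting threshold

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    # Stateless filter: only above-threshold readings become alerts.
    if event.get("temperature_c", 0.0) > TEMPERATURE_THRESHOLD_C:
        producer.produce("temperature-alerts", json.dumps(event).encode("utf-8"))
        producer.poll(0)  # serve delivery callbacks
```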

My blog post about “Streaming Analytics for Condition Monitoring and Predictive Maintenance with Event Streaming and Apache Kafka” goes into more detail. A Kafka-native Digital Twin plays a key role in some IoT projects, too.

Smart Home Energy Consumption

The energy industry is going through significant change. The increased use of digital tools supports the expected structural changes in the energy system to become green and less wasteful.

Smart energy consumption is a powerful and reasonable approach to reduce waste and save costs. Monitoring energy consumption in real-time enables the improvement of current business usage patterns:

Smart Home Energy Consumption with Kafka and Event Streaming

A few examples that require real-time data integration and data processing for sensor analytics (see the sketch after this list):

  • Analyze malfunctioning equipment for its excessive energy use
  • Turn lights on and off automatically in a context-driven way instead of relying on a time-based configuration
  • Monitor air conditioning for overloads
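
As a sketch of what such sensor analytics can look like, the following Python snippet keeps a rolling average of energy readings per device and flags excessive consumption. Topic name, payload structure, window size, and threshold factor are assumptions for illustration.

```python
# Minimal sketch: stateful energy anomaly detection with a per-device rolling average.
import json
from collections import defaultdict, deque
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "energy-monitoring",
    "auto.offset.reset": "latest",
})
consumer.subscribe(["energy-readings"])  # hypothetical topic with {"device": "hvac-1", "watts": 1450}

WINDOW_SIZE = 100      # number of readings per device to average (assumption)
EXCESS_FACTOR = 1.5    # flag readings 50% above the rolling average (assumption)
history = defaultdict(lambda: deque(maxlen=WINDOW_SIZE))

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    reading = json.loads(msg.value())
    device, watts = reading["device"], float(reading["watts"])
    window = history[device]
    if len(window) == WINDOW_SIZE and watts > EXCESS_FACTOR * (sum(window) / WINDOW_SIZE):
        print(f"Excessive energy use on {device}: {watts} W")  # or produce to an alert topic
    window.append(watts)
```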

Real Estate Maintenance and Operations

The maintenance and operations of buildings and real estate require on-site and remote work. Hence, the public administration can perform administrative tasks and data analytics in a remote data center or cloud that aggregates information across locations. On the other hand, some use cases require edge computing for real-time monitoring and analytics:

Real Estate Maintenance and Operations

It always depends on the point of view. A manager for a smart building might work on-site, while a regional manager monitors all facilities in a region. A global manager oversees many regional managers. Technology needs to support the needs of all stakeholders. All of them can do a better job with real-time information and real-time applications.

Employee Experience in a Smart Building

Satisfied employees are crucial for a successful smart city and real estate strategy. Real-time applications can help here, too:

Employee Experience

A few examples to improve the experience of the employees:

  • Ambiance: Adjust noise and light level to reduce distractions
  • Air quality: Control air to enhance morale and productivity
  • Feedback Device: Improve layout, equipment, and office supplies

Cybersecurity for Situational Awareness and Threat Intelligence

Continuous data correlation became essential to defend against cyber attacks. Monitoring, alerting, and proactive actions are only possible if data integration and data correlation happen reliably in real-time and at scale:

Cybersecurity for Situational Awareness and Threat Intelligence in Smart Buildings and Smart City

Plenty of use cases require event streaming as the scalable real-time backbone for cybersecurity. Kafka’s cybersecurity examples include situational awareness, threat intelligence, forensics, air-gapped and zero trust environments, and SIEM / SOAR modernization.

Smart City, Real Estate, and Smart Building require Real-Time IoT Analytics

Plenty of use cases exist to add business value to real estate and smart buildings. Data-driven correlation and analytics with data from any IoT interface in real-time is a game-changer to improve the consumer experience, save costs, and reduce risks.

Apache Kafka is the de-facto standard for event streaming. No matter if you are on-premise, in the public cloud, at the edge, or in a hybrid scenario, evaluate and compare the available Kafka offerings on the market to start your project the right way.

How do you optimize data usage in real estate and smart buildings? What technologies and architectures do you use? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post IoT Analytics with Kafka for Real Estate and Smart Building appeared first on Kai Waehner.

]]>
Apache Kafka in the Public Sector – Part 5: National Security and Defense https://www.kai-waehner.de/blog/2021/10/20/apache-kafka-public-sector-part-5-national-security-cybersecurity-defense-military/ Wed, 20 Oct 2021 06:01:21 +0000 https://www.kai-waehner.de/?p=3814 The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores both edges to show how data in motion powered by Apache Kafka adds value for innovative new applications and modernizing legacy IT infrastructures. This is part 5: Use cases and architectures for national security, cybersecurity, defense, and military.

The post Apache Kafka in the Public Sector – Part 5: National Security and Defense appeared first on Kai Waehner.

]]>
The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores how the public sector leverages data in motion powered by Apache Kafka to add value for innovative new applications and to modernize legacy IT infrastructures. This is part 5: Use cases and architectures for national security, cybersecurity, defense, and military.

Kafka for National Security and Defense in the Public Sector

Blog series: Apache Kafka in the Public Sector and Government

This blog series explores why many governments and public infrastructure sectors leverage event streaming for various use cases. Learn about real-world deployments and different architectures for Kafka in the public sector:

  1. Life is a Stream of Events
  2. Smart City
  3. Citizen Services
  4. Energy and Utilities
  5. National Security (THIS POST)

Subscribe to my newsletter to get updates immediately after publication. I will also update the above list with direct links to the posts of this blog series once they are published.

As a side note, in case you wonder why healthcare is not on the above list: Healthcare is another blog series on its own. While the government can provide public health care through national healthcare systems, it is part of the private sector in many other cases.

National Security and Defense

National security or national defense is the security and defense of a nation-state, including its citizens, economy, and institutions. It is a duty of government. Originally conceived as protection against military attack, national security is now widely understood to include non-military dimensions, including security from terrorism, minimization of crime, economic security, energy security, environmental security, food security, cyber-security, etc. Similarly, national security risks include, in addition to the actions of other nation-states, actions by violent non-state actors, narcotic cartels, and multinational corporations, as well as the effects of natural disasters.

Cybersecurity – The Threat is REAL!

Cybersecurity has become a real threat due to the ongoing digital transformation. Networking, communication, connectivity, open standards, and “always-on” principles provide significant benefits and innovation, but also new cyber threats. The Colonial Pipeline ransomware attack in the US in May 2021 is just one of many successful attacks in the past few quarters. There is no real doubt that the number of attacks will go up significantly in the following months and years.

Supply chain attacks make the threat even bigger. Even if your software is secure, a single loophole in a tiny 3rd party component can provide an attack surface into the whole company:

Supply Chain Attack

Defeating Cybersecurity with Apache Kafka

Threat detection, incident management, and proactive or predictive countermeasures are only possible with real-time data correlation and processing.

I won’t repeat myself. You can read all the details in my separate blog series about the success of event streaming with Apache Kafka to provide real-time cybersecurity at scale:

TL;DR: Data in motion HAS TO BE the backbone of cybersecurity infrastructure:

Data in Motion with Apache Kafka is the Backbone for Cybersecurity

National Security powered by Apache Kafka

While I work a lot with customers from the government and public administration, success stories are scarce. All of this is even more true for national security areas. Hence, contrary to the other posts of this blog series, I can only talk about a use case and architecture without giving a concrete example from the real world. Sorry for that 🙂

Nevertheless, I have a great example to share: Confluent presented an edge and hybrid demo for smart soldiers together with a partner at AUSA 2021, an annual event of the Association of the United States Army.

Edge Computing (Soldier) and Replication to the Data Center (Command Post)

The enterprise architecture looks very similar to other Kafka edge deployments and hybrid architectures from a high level. Hence, my infrastructure checklist for Kafka at the Edge applies to national security use cases, too.

The following diagram shows our national security demo use case and architecture:

Kafka at the Edge and Hybrid for National Security and Military

Kudos to my colleagues Jeffrey Needham and Michael Peacock, who built the demo leveraging the Kafka ecosystem for national security.

So, what’s happening in this use case? The following list walks through the data flow; a small ingestion sketch follows it.

  • Soldiers wear a small compute “thing” running the whole “lightweight” single broker Kafka infrastructure.
  • An MQTT client consumes sensor data from the environment and other remote locations.
  • The collected sensor data is processed continuously with Kafka-native stream processing (Kafka Streams or ksqlDB). The Kafka broker stores the events and truly decouples producers and consumers.
  • Confluent Cluster Linking replicates relevant curated information to the command post in real-time (as long as internet connectivity is available).
  • The command post is a small data center running a mission-critical Kafka cluster. The aggregation of all the data from all soldiers and other interfaces (vehicles, weather information, etc.) is correlated in real-time to provide situational awareness.
  • Relevant data is replicated from command posts via Cluster Linking to a remote location (like a military data center). Here, analytics and other traditional IT workloads use the information for different use cases.

TL;DR: The project demonstrates vast benefits. The open infrastructure leverages the same components and technologies across the edge (soldiers), small on-site data centers (command posts), and large remote data centers. Reliable data integration and processing provide the capabilities for end-to-end, real-time situational awareness in a national security scenario across the edge and remote locations.

NASA enables real-time data from Mars

Let’s end this blog series with one more exciting use case from the public sector. It is not directly related to national security. But who knows, maybe this will be relevant for attack scenarios in the future when the aliens attack.

The National Aeronautics and Space Administration (NASA) is an independent agency of the U.S. federal government responsible for the civilian space program, as well as aeronautics and space research.

NASA enables real-time data from Mars with the help of Apache Kafka. Real-time data extends into the far frontiers via its Deep Space Network (DSN). Data grows exponentially from spacecraft and other systems. Real-time data enables NASA to provide responsive citizen engagement, real-time situational awareness, anomaly detection, event-driven missions, and security operations. The global fabric of Kafka clusters allows for real-time data sharing, event streaming, and combinations of both real-time and historical data.

NASA Real Time Data from Mars with Apache Kafka

More details about NASA’s Kafka usage are available in a great article from the Federal News Network.

Data in Motion for National Security and Defense

National security is relevant across all areas in the public sector. This post showed an example from the military. However, situational awareness in real-time is needed everywhere, as the Colonial Pipeline attack and many other ransomware stories proved in the last months.

Event Streaming with Apache Kafka provides the unique capability of using a single technology across the edge and hybrid cloud architectures for real-time data integration and processing for National Security. Even disconnected or air-gapped environments are supported. Learn more about data in motion for cybersecurity in my dedicated blog series about Apache Kafka for Cybersecurity across Industries.

How do you leverage event streaming in the public sector? Are you working on any national security or cybersecurity projects? What technologies and architectures do you use? What projects have you already worked on, or which are in the planning stage? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka in the Public Sector – Part 5: National Security and Defense appeared first on Kai Waehner.

]]>
Apache Kafka in the Public Sector – Blog Series about Use Cases and Architectures https://www.kai-waehner.de/blog/2021/10/07/apache-kafka-public-sector-part-1-data-in-motion-use-cases-architectures-examples/ Thu, 07 Oct 2021 14:13:24 +0000 https://www.kai-waehner.de/?p=3790 The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores both edges to show how data in motion powered by Apache Kafka adds value for innovative new applications and modernizing legacy IT infrastructures. Examples include a broad spectrum of use cases across smart cities, citizen services, energy and utilities, and national security.

The post Apache Kafka in the Public Sector – Blog Series about Use Cases and Architectures appeared first on Kai Waehner.

]]>
The public sector includes many different areas. Some groups, like the military, leverage cutting-edge technology. Others, like the public administration, are years or even decades behind. This blog series explores how the public sector leverages data in motion powered by Apache Kafka to add value for innovative new applications and to modernize legacy IT infrastructures. Life is a stream of events. Therefore, examples include a broad spectrum of use cases across smart cities, citizen services, energy and utilities, and national security, deployed across edge, hybrid, and multi-cloud scenarios.

Apache Kafka in the Public Sector and Government for Data in Motion

Blog series: Apache Kafka in the Public Sector and Government

This blog series explores why many governments and public infrastructure sectors leverage event streaming for various use cases. Learn about real-world deployments and different architectures for Kafka in the public sector:

  1. Life is a Stream of Events (THIS POST)
  2. Smart City
  3. Citizen Services
  4. Energy and Utilities
  5. National Security

Subscribe to my newsletter to get updates immediately after publication. I will also update the above list with direct links to the posts of this blog series once they are published.

As a side note, in case you wonder why healthcare is not on the above list: Healthcare is another blog series on its own. While the government can provide public health care through national healthcare systems, it is part of the private sector in many other cases.

The Public Sector is a Broad Spectrum of Use Cases

Real-time Data Beats Slow Data in the Public Sector

I won’t do yet another long introduction about the added value of real-time data. Check out my blog about “Use Cases across Industries for Data in Motion powered by Apache Kafka” to understand the broad spectrum and benefits. The public sector is no different: Real-time data beats slow data in almost every use case! Here are a few examples:

Real time data beats slow data in the public sector

But think about your use cases! How often can you say that getting data late (like in one hour or the following day) is better than getting data when it happens (now, in a few milliseconds or seconds)? Probably not very often.

An important fact is that the added business value comes from correlating the events from different data sources. As an example, let’s look at the processes in a smart city:

Data in Motion in the Public Sector powered by Apache Kafka

The sensor data from the car is only valuable if an application correlates it with data from other vehicles in the traffic planning system. Intelligent parking is only reasonable if it integrates with the overall city planning. Emergency services need to receive an alert in real-time if a crash happens. All of that needs to happen in real-time! It does not matter if the use case is about transactional workloads (usually smaller data sets) or analytical workloads (usually larger data sets).

Open API and Partnerships are Mandatory

Governments can build great applications. At least in theory. In practice, they rely on external data from partners and 3rd party applications for many potential use cases:

Data in Motion as Foundation of a Smart City powered by Apache Kafka

Governments and cities need to work with several other stakeholders, including carmakers, suppliers, telcos, mobility services, cloud providers, software providers, etc. Standards and open APIs are mandatory for successful cross-cutting projects. The foundation of such an enterprise architecture is an open, reliable, scalable platform that can process data in real-time. Apache Kafka became the de facto standard for event streaming.

Data Mesh for Sharing Events between Government and 3rd Party Applications and Services

An example that shows the added value of data integration across stakeholders and processing the data in real-time: transportation services. A mobile app needs context. Think about hailing a taxi ride. It doesn’t help you if you only see the position of each taxi on the city map in real-time. You want to know the estimated time until pickup, the estimated cost, the estimated time of arrival at your destination, the car model that will pick you up, and so much more.

This use case – like many others – is only possible if you integrate and correlate the data from many different interfaces like a mapping service, all taxi drivers, all customers in a city, the weather service, backend analytics services, and much more:

Data in Motion with Kafka across the Public and Private Sector

The left side of the picture shows a dashboard built with a real-time message queue like RabbitMQ. The right side shows the correlation of data from different sources in real-time with an event streaming platform like Apache Kafka.

I hope you agree on the added value of the event streaming platform. Just sending data from A to B in real-time is not enough. Only the data processing in real-time adds true value.

Data in Motion as Paradigm Shift in the Public Sector

Real-time data beats slow data. No matter if you think about cutting-edge use cases in national security or modernizing the IT infrastructure in the public administration. Event Streaming is the foundation of this paradigm shift moving towards real-time data processing in the public sector. The upcoming posts of this blog series explore many different use cases and architectures. If you also want to learn more about Apache Kafka offerings on the market, check out my comparison of Apache Kafka products and cloud services.

How do you leverage event streaming in the public sector? What technologies and architectures do you use? What projects have you already worked on, or which are in the planning stage? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka in the Public Sector – Blog Series about Use Cases and Architectures appeared first on Kai Waehner.

]]>
Kafka for Cybersecurity (Part 5 of 6) – Zero Trust and Air-Gapped Environments https://www.kai-waehner.de/blog/2021/08/02/kafka-cybersecurity-siem-soar-part-5-of-6-zero-trust-air-gapped-edge-unidirectional-gateway-data-diode/ Mon, 02 Aug 2021 07:42:49 +0000 https://www.kai-waehner.de/?p=3612 This blog series explores use cases and architectures for Apache Kafka in the cybersecurity space, including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments, and SIEM / SOAR modernization. This post is part five: Zero trust and air-gapped environments.

The post Kafka for Cybersecurity (Part 5 of 6) – Zero Trust and Air-Gapped Environments appeared first on Kai Waehner.

]]>
Apache Kafka became the de facto standard for processing data in motion across enterprises and industries. Cybersecurity is a key success factor across all use cases. Kafka is not just used as a backbone and source of truth for data. It also monitors, correlates, and proactively acts on events from various real-time and batch data sources to detect anomalies and respond to incidents. This blog series explores use cases and architectures for Kafka in the cybersecurity space, including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments, and SIEM / SOAR modernization. This post is part five: Air-gapped and Zero Trust Environments.

Unidirectional Kafka Gateway for Zero Trust and Air Gapped Environments

Blog series: Apache Kafka for Cybersecurity

This blog series explores why security features such as RBAC, encryption, and audit logs are only the foundation of a secure event streaming infrastructure. Learn about use cases,  architectures, and reference deployments for Kafka in the cybersecurity space:

Subscribe to my newsletter to get updates immediately after publication. I will also update the above list with direct links to the posts of this blog series as soon as they are published.

Zero Trust – A Firewall is NOT Good Enough

The zero trust security model (also zero trust architecture or zero trust network architecture, sometimes known as perimeterless security) describes an approach to the design and implementation of IT systems. The main concept behind zero trust is that devices and other interfaces should not be trusted by default, even if they are connected to a managed corporate network such as the corporate LAN and even if they were previously verified.

In most modern enterprise environments, corporate networks consist of many interconnected segments, cloud-based services and infrastructure, connections to remote and mobile environments, and increasing connections to non-conventional IT, such as IoT devices. The once traditional approach of trusting devices within a notional corporate perimeter, or devices connected to it via a VPN, makes less sense in such highly diverse and distributed environments. Instead, the zero trust approach advocates mutual authentication, including checking the identity and integrity of devices without respect to location, and providing access to applications and services based on the confidence of device identity and device health in combination with user authentication.

Technical Requirements for Zero Trust Environments

Authentication and authorization are key success factors for secure software. Therefore, enterprise-ready software requires features such as role-based access control (RBAC), Active Directory / LDAP integration, audit logs, end-to-end encryption in motion and at rest, bring your own key (BYOK), and more. But that is just one piece of the puzzle!

Additionally, zero trust environments require:

  • Protection of everything, not just firewalls and computing assets
  • Threat intelligence that includes the cyber network and human intelligence
  • Safe IT/OT integration at industrial sites
  • Unidirectional communication enforced through hardware and/or software-based security solutions
  • Replica servers instead of direct access
  • Surveillance for safety and theft protection

Obviously, the implementation of all these requirements is tough and expensive. Hence, some environments might accept less secure infrastructure. A risk assessment, regulations, and due diligence will tell you how much investment into the security is required.

For this reason, many so-called zero trust environments are not as secure as technically possible. A firewall in combination with RBAC, BYOK, and other security features is good enough for some teams. It is always a trade-off. But especially safety-critical infrastructure in air-gapped environments requires a true zero trust architecture.

Air-Gapped Environments and OT/IT Integration

A truly air-gapped environment requires the strongest security implementation. These infrastructures require a zero trust architecture. Relevant examples include power plants, oil rigs, water utilities, railway networks, manufacturing, airplanes (between flight control units and in-flight entertainment systems), and more.

From a technical perspective, zero trust includes unidirectional communication and strict surveillance of people and things entering/leaving the site.

A lot of the content from this section is from my notes of the material from Waterfall Security, a leading provider of unidirectional hardware gateways.

Unidirectional Gateway (= Unidirectional Network = Data Diode)

A zero trust infrastructure requires a robust network segmentation between IT and OT networks. A firewall is not enough. More on this discussion later.

A unidirectional gateway (also referred to as a unidirectional network or data diode) is a network appliance or device that allows data to travel in only one direction.

After years of development, unidirectional networks have evolved. Originally, they were only network appliances or devices that allowed raw data to travel in one direction, used to guarantee information security and to protect critical digital systems, such as industrial control systems, from inbound cyber attacks. Today, they are combinations of hardware and software running in proxy computers in the source and destination networks.

The hardware permits data to flow from one network to another but is physically unable to send any information back into the source network. The software replicates databases and emulates protocol servers and devices.

Server replication software replicates databases and other servers from industrial networks to enterprise networks through unidirectional gateway hardware. Users and other applications interact naturally with the replica servers in normal enterprise IT environments (e.g., replication of an OT historian database to an IT replica for business users).

The product solutions from vendors differ a lot. The hardware gateways from some vendors have limited or no software support. Oh, and no surprise, the branding differs, too. Some vendors sell data diodes, others sell unidirectional gateways, and others tell you that unidirectional gateways are better than data diodes. I stick with the term unidirectional gateway in this post but mean all the vendors.

Use Cases for Unidirectional Gateways

The main reason why enterprises and governments take on the cost and effort of deploying unidirectional gateways is the guarantee of a secure, hardware-based OT/IT bridge.

Here are some use cases for unidirectional gateways:

  • Database replication
  • File transfer
  • Cloud integration
  • Device emulation and data sniffing
  • Remote diagnostics and maintenance
  • Real-time monitoring of safety-critical networks
  • Scheduled application and operating system updates

Various architecture options enable enforcing unidirectional communication. This does not mean that you cannot communicate in both directions. Different patterns exist, such as switching the direction for some time or deploying two unidirectional gateways (one in each direction).

The book Secure Operations Technology from Andrew Ginter is a great resource to learn about the use cases, challenges, best practices, and various architectures for zero trust environments powered by unidirectional gateways.

Firewall (= Software) vs. Unidirectional Gateway (Hardware)

When I talk to customers, most people say that their infrastructure is secure because they use a firewall plus infrastructure- and application-specific configuration such as RBAC, encryption, BYOK, etc. This is actually good enough for many use cases because the risk assessment compares the cost to the risk.

Unfortunately, there is no such thing as a “unidirectional firewall”. All TCP and other connections through firewalls are intrinsically bidirectional. The full-duplex SYN/ACK nature is easily exploited. This is why UDP is preferred for cyber data. Cyber data is not transactional data, so losing bits of it, not having it all, or not getting it in the right order at the right time really doesn’t matter. Threat intelligence is about pattern recognition, not about recognizing revenue.

Therefore, requirements in an air-gapped zero trust environment are different: Instead of discussing data loss or guaranteed ordering, the main goal is unidirectional communication and no technical ability to send data from the outside in.

A firewall is just software. Waterfall’s whitepaper describes the risks of a firewall compared to a hardware-based unidirectional gateway:

Firewall vs Unidirectional Gateway

A firewall is not good enough for some safety scenarios, as you can see in the above table.

With this long introduction out of the way, let’s discuss how Apache Kafka fits into air-gapped environments and unidirectional zero trust communication.

Kafka in Zero Trust Edge Environments

Event streaming with Apache Kafka at the edge is not cutting edge anymore. It is a common approach to providing the same open, flexible, and scalable architecture at the edge as in the cloud or data center.

Possible locations for a Kafka edge deployment include retail stores, cell towers, trains, small factories, restaurants, etc. I already discussed the concepts and architectures in detail in the past: “Apache Kafka is the New Black at the Edge” and “Use Cases and Architectures for Kafka at the Edge“.

Kafka in an Air-Gapped Environment

Data in Motion in air-gapped and zero trust environments requires the self-managed deployment of the Kafka infrastructure. This ensures disconnected local event streaming without the need for internet communication. Ideally, engineering teams leverage provisioning and operations tools for Kafka, such as Confluent’s open-source Ansible playbooks and installer or the Confluent Operator for Kubernetes.

Some environments are truly air-gapped and disconnected. However, in many cases, at least a unidirectional replication of some information to the outside world is required.

Hybrid Architecture with Air-Gapped Edge and Cloud Kafka

I wrote a lot about deploying Kafka in hybrid architectures. Check out the following good summary of Kafka deployment patterns: “Architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments”.

Some Confluent customers deploy Confluent in front of other air-gapped edge infrastructure to provide a secure intermediary (on Linux) between the existing (Windows) hardware and modern (Linux) infrastructure:

Apache Kafka Deployments in Air-Gapped and Zero Trust Environments

Kafka-native tools for bidirectional replication between the air-gapped environments and the public data center or cloud include MirrorMaker 2, Confluent Replicator, and Confluent Cluster Linking. The latter is my recommended option as it uses the native Kafka protocol for replication. Hence, it does not require separate infrastructure via Kafka Connect like the other two options.

Initiation from the Air-Gapped Kafka and Replication to the Cloud

In many zero trust environments, only the secured site can initiate the communication to remote sites. Not the other way round. This makes sense. It is much harder for external attackers to get communication running as the firewall configuration is very restrictive. However, it is challenging to implement in some scenarios. Not all tools support this requirement.

Confluent provides a feature called “Source Connection Origination” that solves exactly this problem. The source application always initiates communication from the secure on-prem site. You don’t need to get an exception from your network security team!

Hence, data is sent both ways between Kafka clusters without opening up any on-prem firewalls:

Hybrid Kafka Architecture - Initiation from On-Prem in Zero Trust Environments

If you need even better security, then a unidirectional Kafka gateway is the way to go.

Kafka-native Unidirectional Data Diode

A hardware-based unidirectional gateway is the foundation of true zero trust infrastructures. In the past, the software running on top focused on safely replicating files and databases from the edge site to the remote site.

Air-gapped safety environments have the same challenges as any other data center:

  • Higher volumes of events with the need for elastic scale
  • Demand for real-time capabilities and flexible deployment of applications
  • Processing data in motion instead of just storing data at rest for batch processing

Hence, Kafka is obviously the right choice for building a software-based unidirectional gateway for zero trust security architectures. This enables streaming data from industrial networks to enterprise networks.

Confluent Data Diode for Streaming OT/IT Replication

Confluent Data Diode combines UDP-based Kafka Connect source and sink connectors for high volume streaming replication between OT and IT. Like any other software for unidirectional gateways, it should run over a one-way hardware interface (Ethernet cable, OWL Cyber, Waterfall WF-500, or any other gateway).

The open architecture of the Confluent Data Diode enables the implementation of additional data processing scenarios such as filtering, anomaly detection, analytics, receiving upstream traffic, etc. The blog post “Apache Kafka as Modern Data Historian” explores the capabilities and relation to traditional historians in more detail.

The following diagram shows the architecture of the Confluent Data Diode:

Confluent Data Diode for Unidirectional Replication in Zero Trust Kafka Infratstructures

The Data Diode connector serves a similar purpose to Confluent Replicator; however, the big difference is that the Data Diode connector works over UDP, while Confluent Replicator requires TCP/IP.

The Data Diode connector is meant to be used in a high-security unidirectional network. The network settings do not permit TCP/IP packets in such networks, and UDP packets are only allowed in one direction. The sink connector serializes one or more Kafka records into a datagram packet and sends it to a remote server running the Data Diode Source Connector. The sink connector must be installed in the source Kafka cluster.
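
To make the one-way UDP pattern more tangible, here is a conceptual Python sketch of the egress side: it consumes records from the OT-side Kafka cluster and forwards them as fire-and-forget UDP datagrams. This is not the Confluent Data Diode connector itself, just an illustration of the pattern; topic name and receiver address are assumptions.

```python
# Conceptual sketch of the one-way UDP pattern (NOT the Confluent Data Diode connector):
# read records from the OT-side Kafka cluster and push them across a unidirectional link.
import socket
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # OT-side Kafka cluster
    "group.id": "udp-egress",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["ot.telemetry"])  # hypothetical OT topic

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
REMOTE = ("10.0.0.42", 9999)  # IT-side receiver behind the unidirectional gateway (assumption)

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error() or msg.value() is None:
        continue
    # Fire-and-forget: UDP never sends anything back into the OT network.
    sock.sendto(msg.value(), REMOTE)
```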

Example: Border Protection with Kafka for Security and Surveillance

Let’s take a look at an example I have seen at various customers in similar ways: Security and video surveillance in real-time at the edge (like at the border control):

Border Protection with Kafka for Cyber Security and Surveillance

The example is generic. The architecture is very flexible. Deployment is possible in

  • Zero trust environments using unidirectional gateways and Confluent’s Data Diode
  • Less safety-critical (but still critical) environments secured with firewalls, VPN, etc. leveraging the source connection initiation to replicate events in real-time from the edge to the remote data center or cloud
  • Disconnected edge environments that operate completely offline (or maybe replicate data in a manual process from time to time, e.g., using a USB stick).

Use cases include data integration at the edge, data processing (such as image recognition, data correlation, threat detection, and alerting in real-time), and replication to the remote data center.

Kafka Enables Disconnected Edge Computing and Unidirectional Zero Trust Replication

The firewall is NOT a secure solution for zero trust environments. That might be the key lesson learned for many people on the IT side. Frankly, I had never heard about hardware-based unidirectional gateways until a few months ago when we had OT/IT conversations with some customers.

The Confluent Data Diode is an exciting solution to modernize the OT/IT integration in a zero trust environment via streaming replication at scale. This is the next step after many companies already run Kafka at the disconnected edge or in hybrid scenarios using “traditional Kafka replication methods” such as MirrorMaker or Confluent Replicator.

Do you already deploy Kafka in air-gapped and zero trust environments? How do you solve the security questions? Do you use a hardware-based unidirectional gateway? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Kafka for Cybersecurity (Part 5 of 6) – Zero Trust and Air-Gapped Environments appeared first on Kai Waehner.

]]>
Kafka for Cybersecurity (Part 4 of 6) – Digital Forensics https://www.kai-waehner.de/blog/2021/07/23/kafka-cybersecurity-siem-soar-part-4-of-6-digital-forensics/ Fri, 23 Jul 2021 10:22:07 +0000 https://www.kai-waehner.de/?p=3582 This blog series explores use cases and architectures for Apache Kafka in the cybersecurity space, including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments, and SIEM / SOAR modernization. This post is part four: Digital Forensics.

The post Kafka for Cybersecurity (Part 4 of 6) – Digital Forensics appeared first on Kai Waehner.

]]>
Apache Kafka became the de facto standard for processing data in motion across enterprises and industries. Cybersecurity is a key success factor across all use cases. Kafka is not just used as a backbone and source of truth for data. It also monitors, correlates, and proactively acts on events from various real-time and batch data sources to detect anomalies and respond to incidents. This blog series explores use cases and architectures for Kafka in the cybersecurity space, including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments, and SIEM / SOAR modernization. This post is part four: Digital Forensics.

Apache Kafka and Tiered Storage for Digital Forensics and Cyber Security

Blog series: Apache Kafka for Cybersecurity

This blog series explores why security features such as RBAC, encryption, and audit logs are only the foundation of a secure event streaming infrastructure. Learn about use cases,  architectures, and reference deployments for Kafka in the cybersecurity space:

Subscribe to my newsletter to get updates immediately after publication. I will also update the above list with direct links to the posts of this blog series as soon as they are published.

Digital Forensics

Let’s start with the definition of the term “Digital Forensics”. In the IT world, we can define it as analytics of historical data sets to find insights. More specifically, digital forensics means:

  • Application of science to criminal and civil laws, mainly during a criminal investigation.
  • It is applied to internal corporate investigations in the private sector or, more generally, to intrusion investigations in the public and private sector (a specialist probe into the nature and extent of an unauthorized network intrusion).
  • Forensic scientists collect, preserve, and analyze scientific evidence during the course of investigating digital media in a forensically sound manner.
  • Identify, preserve, recover, analyze and present facts and opinions about digital information.

The technical aspect is divided into several sub-branches relating to the type of digital devices involved: Computer forensics, network forensics, forensic data analysis, and mobile device forensics.

A digital forensic investigation commonly consists of three stages: acquisition, analysis, and reporting. The final goal is to reconstruct digital events. Let’s see what role Kafka and its ecosystem play here.

Digital Forensics with Kafka’s Long Term Storage and Replayability

Kafka stores data in its distributed commit log. The log is durable and persists events on the disk with guaranteed order. The replication mechanism guarantees no data loss even if a node goes down. Exactly-once semantics (EOS) and other features enable transactional workloads. Hence, more and more deployments leverage Kafka as a database for long-term storage.

Forensics on Historical Events in the Kafka Log

The ordered historical events enable Kafka consumers to do digital forensics:

  • Capture the complete attack vector
  • Playback of an attack for the training of humans or machines
  • Create threat surface simulations
  • Compliance / regulatory processing
  • Etc.

Digital Forensics on Historical Events from the Persistent Kafka Log

The forensics consumption is typically a batch process to consume all events from a specific timeframe. As all consumers are truly decoupled from each other, the “normal processing” can still happen in real-time. There is no performance impact thanks to Kafka’s decoupling concepts, which also enable a domain-driven design (DDD). The forensics teams use different tools to connect to Kafka. For instance, data scientists usually use the Kafka Python client to consume historical data.
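
A minimal sketch of such a forensic replay with the Kafka Python client could look like this: resolve the offsets for the start of the investigated timeframe and consume until the end timestamp is reached. The topic name, the timeframe, and the analyze() function are assumptions for illustration.

```python
# Minimal sketch: forensic replay of a topic between two timestamps with confluent-kafka.
from datetime import datetime, timezone
from confluent_kafka import Consumer, TopicPartition

START = int(datetime(2021, 7, 1, tzinfo=timezone.utc).timestamp() * 1000)  # assumed timeframe
END = int(datetime(2021, 7, 2, tzinfo=timezone.utc).timestamp() * 1000)

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "forensics-replay",
    "enable.auto.commit": False,
})

# Find the offsets that correspond to the start of the investigated timeframe.
metadata = consumer.list_topics("security-events")  # hypothetical topic
partitions = [TopicPartition("security-events", p, START)
              for p in metadata.topics["security-events"].partitions]
consumer.assign(consumer.offsets_for_times(partitions))

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    if msg.timestamp()[1] > END:
        break  # past the timeframe (simplified stop condition for a single partition)
    analyze(msg)  # hypothetical forensic analysis function
```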

Challenges with Long-Term Storage in Kafka

Storing data long-term in Kafka has been possible since the beginning. Each Kafka topic gets a retention time. Many use cases use a retention time of a few hours or days as the data is only processed and stored in another system (like a database or data warehouse). However, more and more projects use a retention time of a few years or even -1 (= forever) for some Kafka topics (e.g., due to compliance reasons or to store transactional data).
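
Setting such a long retention time is a per-topic configuration. The following sketch uses the Python AdminClient to set retention.ms to -1 on an existing topic; the topic name is an assumption for illustration.

```python
# Minimal sketch: set "infinite" retention on an existing topic with the Python AdminClient.
from confluent_kafka.admin import AdminClient, ConfigResource

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

resource = ConfigResource(ConfigResource.Type.TOPIC, "security-events")  # hypothetical topic
resource.set_config("retention.ms", "-1")  # -1 = keep events forever

# alter_configs returns a future per resource; wait for the change to be applied.
# (Note: newer client versions also offer an incremental variant of this call.)
for res, future in admin.alter_configs([resource]).items():
    future.result()
    print(f"Updated retention for {res}")
```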

The drawback of using Kafka for forensics is the huge volume of historical data and the related high cost and scalability issues. This gets pretty expensive as Kafka uses regular HDDs or SSDs as the disk storage. Additionally, data rebalancing between brokers (e.g., if a new broker is added to a cluster) takes a long time for huge data sets. Hence, rebalancing can take hours and impact scalability and reliability.

But there is a solution to these challenges: Tiered Storage.

Tiered Storage for Apache Kafka via KIP-405

Tiered Storage for Kafka separates compute and storage. This solves both problems described above:

  • Significant cost reduction by using a much cheaper storage system.
  • Much better scalability and elasticity as rebalancing is only needed for the brokers (that only store the small hot data sets)

KIP-405 is the assigned open-source task that describes the plan and process for adding Tiered Storage to Apache Kafka. Confluent is actively working on this with the open-source community. Uber is leading the initiative for this KIP and works on HDFS integration. Check out Uber’s Kafka Summit APAC talk about Tiered Storage for more details.

Confluent Tiered Storage for Kafka

Confluent Tiered Storage has been generally available for quite some time in Confluent Platform and is used under the hood in Confluent Cloud in thousands of Kafka clusters. Certified object stores include cloud object stores such as AWS S3 or Google Cloud Storage and on-premise object storage such as Pure Storage FlashBlade.

The architecture of Confluent Tiered Storage looks like this:

Confluent Tiered Storage for Kafka for Digital Forensics of Historical Data

Benefits of Confluent Tiered Storage for Kafka include:

  • Store data forever in a cost-efficient way using your favorite object storage (cloud and on-premise)
  • The separation between computing and storage (hot storage attached to the brokers and cold storage via the cheap object store)
  • Easy scale up/down as only the hot storage requires rebalancing – most deployments only store the last few hours in hot storage
  • No breaking code changes in Kafka clients as it is the same regular Kafka API as before
  • Battle-tested in Confluent Cloud in thousands of Kafka clusters
  • No impact on performance for real-time consumers as these consume from page cache/memory anyway, not from the hot or cold storage

As you can see, Tiered Storage is a huge benefit to provide long-term storage for massive volumes of data. This allows rethinking your data lake strategy.

True Decoupling for SIEM, SOAR, and other Kafka Consumers

Kafka’s Distributed Commit Log captures the running history of signals. This

  • enables true decoupling and domain-driven design
  • absorbs velocity and volume to protect and stabilize slow consumers
  • allows organic truncation via the right retention time per Kafka topic

Various producers continuously ingest new events into Kafka without knowing or caring about slow consumers. Kafka handles the backpressure. Different consumer applications consume the data at their own speed and with their own communication paradigm:

Kafka Distributed Commit Log Captures the Running History of Signals for Decoupling between SIEM SOAR Splunk Elasticsearch Zeek

Affordability at Scale for Real-Time and Replay

The Role of AI and Machine Learning in Digital Forensics

Digital Forensics is all about collecting, analyzing, and acting on historical events. SIEM / SOAR and other cybersecurity applications are great for many use cases. However, they are often not real-time and do not cover all scenarios. In an ideal world, you can act in real-time or even in a predictive way to prevent threats.

In the meantime, Kafka plays a huge role in AI / Machine Learning / Deep Learning infrastructures. A good primer to this topic is the post “Machine Learning and Real-Time Analytics in Apache Kafka Applications“. To be clear: Kafka and Machine Learning are different concepts and technologies. However, they are complementary and a great combination to build scalable real-time infrastructures for predicting attacks and other cyber-related activities.

The following sections show how machine learning and Kafka can be combined for model scoring and/or model training in forensics use cases.

Model Deployment with ksqlDB and TensorFlow

Analytics models enable predictions in real-time if they are deployed to a real-time scoring application. Kafka natively supports embedding models for real-time predictions at scale:

Kafka ksqlDB and TensorFlow for Digital Forensics and Cybersecurity

This example uses a trained TensorFlow model. A ksqlDB UDF embeds the model. Of course, Kafka can be combined with any AI technology. An analytic model is just a binary. No matter if you train it with an open-source framework, a cloud service, or a proprietary analytics suite.
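
The same embedded-model pattern can also be sketched in plain Python instead of a ksqlDB UDF: load the trained model once and score every event from the stream. Topic names, the feature layout, and the model path are assumptions for illustration.

```python
# Minimal sketch: embed a trained model in a Kafka consumer for real-time scoring.
import json
import numpy as np
import tensorflow as tf
from confluent_kafka import Consumer, Producer

model = tf.keras.models.load_model("threat_detection_model")  # hypothetical pre-trained model

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "model-scoring",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["network-events"])  # hypothetical topic

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    features = np.array([[event["bytes_sent"], event["bytes_received"], event["duration_ms"]]])
    score = float(model.predict(features, verbose=0)[0][0])  # anomaly probability
    if score > 0.9:  # assumed alerting threshold
        producer.produce("threat-alerts", json.dumps({**event, "score": score}).encode("utf-8"))
        producer.poll(0)
```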

Another option is to leverage a streaming model server to connect a deployed model to another streaming application via the Kafka protocol. Various model servers already provide a Kafka-native interface in addition to RPC interfaces such as HTTP or gRPC.

Kafka-native Model Training with TensorFlow I/O

Embedding a model into a Kafka application for low latency scoring and decoupling is an obvious approach. However, in the meantime, more and more companies also train models via direct consumption from the Kafka log:

The Role of AI and Machine Learning for Forensics Model Training with Kafka and TensorFlow IO

Many AI products provide a native Kafka interface. For instance, TensorFlow I/O offers a Kafka plugin. There is no need for another data lake just for model training! The model training itself is still a batch job in most cases. That’s the beauty of Kafka: The heart is real-time, durable, and scalable. But the consumer can be anything: Real-time, near real-time, batch, request-response. Kafka truly decouples all consumers and all producers from each other.
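
Conceptually, training from the Kafka log means consuming historical labeled events into batches and feeding them to the model, which is what the TensorFlow I/O Kafka plugin does natively. The following simplified Python sketch shows the idea with a plain consumer; topic name, feature layout, and model architecture are assumptions for illustration.

```python
# Minimal sketch: train a model directly from historical events in the Kafka log.
import json
import numpy as np
import tensorflow as tf
from confluent_kafka import Consumer

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(3,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "model-training",
    "auto.offset.reset": "earliest",  # read the full history from the log
})
consumer.subscribe(["labeled-security-events"])  # hypothetical topic

features, labels = [], []
while len(features) < 100_000:  # size of the training set (assumption)
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    features.append([event["bytes_sent"], event["bytes_received"], event["duration_ms"]])
    labels.append(event["is_threat"])

model.fit(np.array(features), np.array(labels), epochs=5, batch_size=256)
model.save("threat_detection_model")
```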

We have built a demo project on Github that shows the native integration between Kafka and TensorFlow for model training and model scoring.

Kafka and Tiered Storage as Backbone for Forensics

Digital Forensics collects and analyzes historical digital information to find and present facts about criminal actions. The insights help to reconstruct digital events, find the threat actors, and build better situational awareness and threat detection in the future. This post showed what role Apache Kafka and its ecosystem play in digital forensics.

Often, Kafka is the integration pipeline that handles the backpressure for slow consumers such as SIEM / SOAR products. Additionally, the concept of Tiered Storage for Kafka enables long-term storage and digital forensics use cases. This can include Kafka-native model training. All these use cases are possible in parallel to any unrelated real-time analytics workloads, as Kafka truly decouples all producers and consumers from each other.

Do you use Kafka for forensics or any other long-term storage use cases? Does the architecture leverage Tiered Storage for Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

 

The post Kafka for Cybersecurity (Part 4 of 6) – Digital Forensics appeared first on Kai Waehner.

]]>