Threat Intelligence Archives - Kai Waehner

The Role of Data Streaming in McAfee’s Cybersecurity Evolution

Kai Waehner — Mon, 27 Jan 2025 07:33:30 +0000

In today’s digital age, cybersecurity is more vital than ever. Businesses and individuals face escalating threats such as malware, ransomware, phishing attacks, and identity theft. Combatting these challenges requires cutting-edge solutions that protect computers, networks, and devices. Beyond safeguarding digital assets, modern cybersecurity tools ensure compliance, privacy, and trust in an increasingly interconnected world.

As threats grow more sophisticated, the technologies powering cybersecurity solutions must advance to stay ahead. Data streaming technologies like Apache Kafka and Apache Flink have become foundational in this evolution, enabling real-time threat detection, prevention, and rapid response. These tools transform cybersecurity from static defenses to dynamic systems capable of identifying and neutralizing threats as they occur.

A notable example is McAfee, a global leader in cybersecurity, which has embraced data streaming to revolutionize its operations. By transitioning to an event-driven architecture powered by Apache Kafka, McAfee processes massive amounts of real-time data from millions of endpoints, ensuring instant threat identification and mitigation. This integration has enhanced scalability, reduced infrastructure complexity, and accelerated innovation, setting a benchmark for the cybersecurity industry.

Real-time data streaming is not just an advantage—it’s now a necessity for organizations aiming to safeguard digital environments against ever-evolving threats.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch.

Antivirus is NOT Enough: Supply Chain Attack

A supply chain attack occurs when attackers exploit vulnerabilities in an organization’s supply chain, targeting weaker links such as vendors or service providers to indirectly infiltrate the target.

For example, an attacker compromises Vendor 1, a software provider, by injecting malicious code into their product. Vendor 2, a service provider using Vendor 1’s software, becomes infected. The attacker then leverages Vendor 2’s connection to the Enterprise to access sensitive systems, even though Vendor 1 has no direct interaction with the enterprise.

Traditional antivirus software is insufficient to prevent such complex, multi-layered attacks. Ransomware often plays a role in supply chain attacks, as attackers use it to encrypt data or disrupt operations across compromised systems.

Modern solutions focus on real-time monitoring and event-driven architecture to detect and mitigate risks across the supply chain. These solutions utilize behavioral analytics, zero trust policies, and proactive threat intelligence to identify and stop anomalies before they escalate.

By providing end-to-end visibility, they protect organizations from cascading vulnerabilities that traditional endpoint security cannot address. In today’s interconnected world, comprehensive supply chain security is critical to safeguarding enterprises.

The Role of Data Streaming in Cybersecurity

Cybersecurity platforms must rely on real-time data for detecting and mitigating threats. Data streaming provides a backbone for processing massive amounts of security event data as it happens, ensuring swift and effective responses. My blog series on Kafka and cybersecurity looks deeply into these use cases.

To summarize:

Data Collection: A data streaming platforms powered by Apache Kafka collect logs, telemetry, and other data from devices and applications in real time.
Data Processing: Stream processing frameworks like Kafka Streams and Apache Flink continuously process this data with low latency at scale for analytics, identifying anomalies or potential threats.
Actionable Insights: The processed data feeds into Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) systems, enabling automated responses and better decision-making.

This approach transforms static, batch-driven cybersecurity operations into dynamic, real-time processes.

McAfee: A Real-World Data Streaming Success Story

McAfee is a global leader in cybersecurity, providing software solutions that protect computers, networks, and devices. Founded in 1987, the company has evolved from traditional antivirus software to a comprehensive suite of products focused on threat prevention, identity protection, and data security.

Source: McAfee

McAfee’s products cater to both individual consumers and enterprises, offering real-time protection through partnerships with global integrated service providers (ISPs) and telecom operators.

Mahesh Tyagarajan (VP, Platform Engineering and Architecture at McAfee) spoke with Confluent and Forrester about McAfee’s transition from a monolith to event-driven Microservices leveraging Apache Kafka in Confluent Cloud.

Data Streaming at McAfee with Apache Kafka Leveraging Confluent Cloud

As cyber threats have grown more complex, McAfee’s reliance on real-time data streaming has become essential. The company transitioned from a monolithic architecture to a microservices-based ecosystem with the help of Confluent Cloud, powered by Apache Kafka. The fully managed data streaming platform simplified infrastructure management, boosted scalability, and accelerated feature delivery for McAfee

Use Cases for Data Streaming

Real-Time Threat Detection: McAfee processes security events from millions of endpoints, ensuring immediate identification of malware or phishing attempts.
Subscription Management: Data streaming supports real-time customer notifications, updates, and billing processes.
Analytics and Reporting: McAfee integrates real-time data streams into analytics systems, providing insights into user behavior, threat patterns, and operational efficiency.

Transition to an Event-Driven Architecture and Microservices

By moving to an event-driven architecture with Kafka using Confluent Cloud, McAfee:

Standardized its data streaming infrastructure.
Decoupled systems using microservices, enabling scalability and resilience.
Improved developer productivity by reducing infrastructure management overhead.

This transition to data streaming with a fully managed, complete and secure cloud service empowered McAfee to handle high data ingestion volumes, manage hundreds of millions of devices, and deliver new features faster.

Business Value of Data Streaming

The adoption of data streaming delivered significant business benefits:

Improved Customer Experience: Real-time threat detection and personalized updates enhance trust and satisfaction.
Operational Efficiency: Automation and reduced infrastructure complexity save time and resources.
Scalability: McAfee can now support a growing number of devices and data sources without compromising performance.

Data Streaming as the Backbone of an Event-Driven Cybersecurity Evolution in the Cloud

McAfee’s journey showcases the transformative potential of data streaming in cybersecurity. By leveraging Apache Kafka as fully managed cloud service as the backbone of an event-driven microservices architecture, the company has enhanced its ability to detect threats, respond in real time, and deliver exceptional customer experiences.

For organizations looking to stay ahead in the cybersecurity race, investing in real-time data streaming technologies is not just an option—it’s a necessity. To learn more about how data streaming can revolutionize cybersecurity, explore my cybersecurity blog series and follow me for updates on LinkedIn or X (formerly Twitter).

The post The Role of Data Streaming in McAfee’s Cybersecurity Evolution appeared first on Kai Waehner.

Kafka for Cybersecurity (Part 4 of 6) – Digital Forensics

Kai Waehner — Fri, 23 Jul 2021 10:22:07 +0000

Apache Kafka became the de facto standard for processing data in motion across enterprises and industries. Cybersecurity is a key success factor across all use cases. Kafka is not just used as a backbone and source of truth for data. It also monitors, correlates, and proactively acts on events from various real-time and batch data sources to detect anomalies and respond to incidents. This blog series explores use cases and architectures for Kafka in the cybersecurity space, including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments, and SIEM / SOAR modernization. This post is part four: Digital Forensics.

Blog series: Apache Kafka for Cybersecurity

This blog series explores why security features such as RBAC, encryption, and audit logs are only the foundation of a secure event streaming infrastructure. Learn about use cases, architectures, and reference deployments for Kafka in the cybersecurity space:

Part 1: Data in Motion as cybersecurity backbone
Part 2: Situational awareness
Part 3: Threat intelligence
Part 4 (THIS POST): Forensics
Part 5: Air-gapped and zero trust environments
Part 6: SIEM / SOAR modernization

Subscribe to my newsletter to get updates immediately after the publication. Besides, I will also update the above list with direct links to this blog series’s posts as soon as published.

Digital Forensics

Let’s start with the definition of the term “Digital Forensics”. In the IT world, we can define it as analytics of historical data sets to find insights. More specifically, digital forensics means:

Application of science to criminal and civil laws, mainly during a criminal investigation.
It is applied to internal corporate investigations in the private sector or, more generally, to intrusion investigations in the public and private sector (a specialist probe into the nature and extent of an unauthorized network intrusion).
Forensic scientists collect, preserve, and analyze scientific evidence during the course of investigating digital media in a forensically sound manner.
Identify, preserve, recover, analyze and present facts and opinions about digital information.

The technical aspect is divided into several sub-branches relating to the type of digital devices involved: Computer forensics, network forensics, forensic data analysis, and mobile device forensics.

A digital forensic investigation commonly consists of three stages: acquisition, analysis, and reporting. The final goal is to reconstruct digital events. Let’s see what role Kafka and its ecosystem play here.

Digital Forensics with Kafka’s Long Term Storage and Replayability

Kafka stores data in its distributed commit log. The log is durable and persists events on the disk with guaranteed order. The replication mechanism guarantees no data loss even if a node goes down. Exactly-once semantics (EOS) and other features enable transactional workloads. Hence, more and more deployments leverage Kafka as a database for long-term storage.

Forensics on Historical Events in the Kafka Log

The ordered historical events enable Kafka consumers to do digital forensics:

Capture the complete attack vector
Playback of an attack for the training of humans or machines
Create threat surface simulations
Compliance / regulatory processing
Etc.

The forensics consumption is typically a batch process to consume all events from a specific timeframe. As all consumers are truly decoupled from each other, the “normal processing” can still happen in real-time. There is no performance impact due to the concepts of Kafka’s decoupling to enable a domain-driven design (DDD). The forensics teams use different tools to connect to Kafka. For instance, data scientists usually use the Kafka Python client to consume historical data.

Challenges with Long-Term Storage in Kafka

Storing data long-term in Kafka is possible since the beginning. Each Kafka topic gets a retention time. Many use cases use a retention time of a few hours or days as the data is only processed and stored in another system (like a database or data warehouse). However, more and more projects use a retention time of a few years or even -1 (= forever) for some Kafka topics (e.g., due to compliance reasons or to store transactional data).

The drawback of using Kafka for forensics is the huge volume of historical data and its related high cost and scalability issues. This gets pretty expensive as Kafka uses regular HDDs or SDDS as the disk storage. Additionally, data rebalancing between brokers (e.g., if a new broker is added to a cluster) takes a long time for huge volumes of data sets. Hence, rebalancing takes hours can impact scalability and reliability.

But there is a solution to these challenges: Tiered Storage.

Tiered Storage for Apache Kafka via KIP-405

Tiered Storage for Kafka separates compute and storage. This solves both problems described above:

Significant cost reduction by using a much cheaper storage system.
Much better scalability and elasticity as rebalancing is only needed for the brokers (that only store the small hot data sets)

KIP-405 is the assigned open-source task that describes the plan and process for adding Tiered Storage to Apache Kafka. Confluent is actively working on this with the open-source community. Uber is leading the initiative for this KIP and works on HDFS integration. Check out Uber’s Kafka Summit APAC talk about Tiered Storage for more details.

Confluent Tiered Storage for Kafka

Confluent Tiered Storage is generally available for quite some time in Confluent Platform and used under the hood in Confluent Cloud in thousands of Kafka clusters. Certified object stores include cloud object stores such as AWS S3 or Google Cloud Storage and on-premise object storage such as Pure Storage FlashBlade.

The architecture of Confluent Tiered Storage looks like this:

Benefits of Confluent Tiered Storage for Kafka include:

Store data forever in a cost-efficient way using your favorite object storage (cloud and on-premise)
The separation between computing and storage (hot storage attached to the brokers and cold storage via the cheap object store)
Easy scale up/down as only the hot storage requires rebalancing – most deployments only store the last few hours in hot storage
No breaking code changes in Kafka clients as it is the same regular Kafka API as before
Battle-tested in Confluent Cloud in thousands of Kafka clusters
No impact on performance for real-time consumers as these consume from page cache/memory anyway, not from the hot or cold storage

As you can see, Tiered Storage is a huge benefit to provide long-term storage for massive volumes of data. This allows rethinking your data lake strategy.

True Decoupling for SIEM, SOAR, and other Kafka Consumers

Kafka’s Distributed Commit Log captures the running history of signals. This

enables true decoupling and domain-driven design
absorbs velocity and volume to protect and stabilize slow consumers
allows organic truncation via the right retention time per Kafka topic

Various producers continuously ingest new events into Kafka without knowing or caring about slow consumers. Kafka handles the backpressure. Different consumer applications use their own capable speed and communication paradigm for data ingestion:

Affordability at Scale for Real-Time and Replay

Most digital forensics and cybersecurity solutions use some batch style for analytics. This is expensive from cost but often even more important performance perspective.

A Kafka-native streaming application can process data much faster for near real-time analytics. For instance, a simple KSQL window query is used to sift through the data by playing back the topic. This will be much faster than stuffing into an index of a SIEM and then running a query.

Having tiered and then creating a filtered and aggregated “forensic” topic means the flight data recorder is always there. Many forensics use cases don’t have to needless stash data in a batch-based sink such as Spark or a SIEM.

In summary, a Kafka cluster powered by Tiered Storage is an affordable solution for both real-time analytics and digital forensics at scale.

Distributed Digital Forensics at Scale with Kafka and Spark

The paper “Digital Forensics Compute Cluster (DFORC2): A High Speed Distributed Computing Capability for Digital Forensics” is a great example. The infrastructure processes big data at scale for forensics use cases.

The foundation of the project is the digital forensics platform Autopsy. Autopsy is computer software that makes it simpler to deploy many of the open-source programs and plugins used in The Sleuth Kit. The graphical user interface displays the results from the forensic search of the underlying volume. Using the GUI makes it easier for investigators to flag pertinent sections of data.

The team extended Autopsy with a Kafka and Spark integration to add distributed compute power for data processing:

The paper was published in 2017. Today, several distributed workloads could also leverage only the Kafka ecosystem including Kafka Streams and ksqlDB. The advantages would be the single infrastructure instead of two separate distributed clusters and providing better real-time capabilities. Having said that, some big data batch workloads like map-reduce or shuffling are still a great fit for Spark jobs.

In the case of Autopsy, a tool to analyze and process large volumes of files in batch, Spark is a good choice. But similarly, a lot of file processing can be done in a stream processing application; as long as the files are processed independently and not shuffled together.

The Role of AI and Machine Learning in Digital Forensics

Digital Forensics is all about collecting, analyzing, and acting on historical events. SIEM / SOAR and other cybersecurity applications are great for many use cases. However, they are often not real-time and do not cover all scenarios. In an ideal world, you can act in real-time or even in a predictive way to prevent threats.

In the meantime, Kafka plays a huge role in AI / Machine Learning / Deep Learning infrastructures. A good primer to this topic is the post “Machine Learning and Real-Time Analytics in Apache Kafka Applications“. To be clear: Kafka and Machine Learning are different concepts and technologies. However, they are complementary and a great combination to build scalable real-time infrastructures for predicting attacks and other cyber-related activities.

The following sections show how machine learning and Kafka can be combined for model scoring and/or model training in forensics use cases.

Model Deployment with ksqlDB and TensorFlow

Analytics models enable predictions in real-time if they are deployed to a real-time scoring application. Kafka natively supports embedding models for real-time predictions at scale:

This example uses a trained TensorFlow model. A ksqlDB UDF embeds the model. Of course, Kafka can be combined with any AI technology. An analytic model is just a binary. No matter if you train it with an open-source framework, a cloud service, or a proprietary analytics suite.

Another option is to leverage a streaming model server to connect a deployed model to another streaming application via the Kafka protocol. Various model servers already provide a Kafka-native interface in addition to RPC interfaces such as HTTP or gRPC.

Kafka-native Model Training with TensorFlow I/O

Embedding a model into a Kafka application for low latency scoring and decoupling is an obvious approach. However, in the meantime, more and more companies also train models via direct consumption from the Kafka log:

Many AI products provide a native Kafka interface. For instance, TensorFlow I/O offers a Kafka plugin. There is no need for another data lake just for model training! The model training itself is still a batch job in most cases. That’s the beauty of Kafka: The heart is real-time, durable, and scalable. But the consumer can be anything: Real-time, near real-time, batch, request-response. Kafka truly decouples all consumers and all producers from each other.

We have built a demo project on Github that shows the native integration between Kafka and TensorFlow for model training and model scoring.

Kafka and Tiered Storage as Backbone for Forensics

Digital Forensics collects and analyzes historical digital information to find and present facts about criminal actions. The insights help to reconstruct digital events, find the threat actors, and build better situational awareness and threat detection in the future. This post showed what role Apache Kafka and its ecosystem play in digital forensics.

Often, Kafka is the integration pipeline that handles the backpressure for slow consumers such as SIEM / SOAR products. Additionally, the concept of Tiered Storage for Kafka enables long-term storage and digital forensics use cases. This can include Kafka-native model training. All these use cases are possible parallel to any unrelated real-time analytics workloads as Kafka truly decouples all producers and consumers from each other.

Do you use Kafka for forensics or any other long-term storage use cases? Does the architecture leverage Tiered Storage for Kafka? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Kafka for Cybersecurity (Part 4 of 6) – Digital Forensics appeared first on Kai Waehner.

Kafka for Cybersecurity (Part 3 of 6) – Cyber Threat Intelligence

Kai Waehner — Thu, 15 Jul 2021 06:39:35 +0000

Blog series: Apache Kafka for Cybersecurity

Part 1: Data in Motion as cybersecurity backbone
Part 2: Situational awareness
Part 3 (THIS POST): Threat intelligence
Part 4: Forensics
Part 5: Air-gapped and zero trust environments
Part 6: SIEM / SOAR modernization

Subscribe to my newsletter to get updates immediately after the publication. Besides, I will also update the above list with direct links to this blog series’s posts as soon as published.

Cyber Threat Intelligence

Threat intelligence, or cyber threat intelligence, reduces harm by improving decision-making before, during, and after cybersecurity incidents reducing operational mean time to recovery, and reducing adversary dwell time for information technology environments.

Threat intelligence is evidence-based knowledge, including context, mechanisms, indicators, implications, and action-oriented advice about an existing or emerging menace or hazard to assets. This intelligence can be used to inform decisions regarding the subject’s response to that menace or hazard.

Threat intelligence solutions gather raw data about emerging or existing threat actors & threats from various sources. This data is then analyzed and filtered to produce threat intel feeds and management reports that contain information that automated security control solutions can use.

Threat intelligence keeps organizations informed of the risks of advanced persistent threats, zero-day threats and exploits, and how to protect against them.

Situational Awareness is Not Enough…

… but the foundation to collect and pre-process data in real-time at scale. Only real-time situational awareness enables real-time threat intelligence to provide huge benefits to the enterprise:

Mitigate harmful events in cyberspace
Proactive cybersecurity posture that is predictive, not just reactive
Bolster overall risk management policies
Improved detection of threats
Better decision-making during and following the detection of a cyber intrusion

In summary, threat intelligence allows to:

See the whole board. And see it more quickly.
See around corners.
See the enemy before they see you.

Threat Intelligence for Prevention or Mitigation across the Cyber Kill Chain

Threat intelligence is the knowledge that allows you to prevent or mitigate cyberattacks. It covers all the phases of the so-called “Cyber Kill Chain“:

Threat intelligence provides several benefits:

Empowers organizations to develop a proactive cybersecurity posture and to bolster overall risk management policies
Drives momentum toward a cybersecurity posture that is predictive, not just reactive
Enables improved detection of threats
Informs better decision-making during and following the detection of a cyber intrusion

Transactional Data vs. Analytics Data

Most use cases around data-in-motion are about all the data. This is true for all transactional use cases and even for many analytical use cases. Each event is valuable: A sale, an order, a payment, an alert from a machine, etc.

However, data is often full of noise. As I discussed earlier in this blog series, the goal in the cybersecurity space is to find the needle in the haystack and to reduce false-positive alerts.

SIEM, SOAR, OT, and ICS are almost always analytic processing regimes, BUT knowing when they are not is important. Kafka can configure topics to be tuned for transactions or analytics. That is unprecedented in the history of data processing. Threat intelligence (= awareness-in-motion) assumes the PATTERN is valuable, not the data.

Analytics in Motion powered by Kafka Streams / ksqlDB

As you can hopefully imagine from the above requirements and characteristics, event streaming with Apache Kafka and its streaming analytics ecosystem is a perfect fit for the technical infrastructure for threat intelligence.

Threat detection makes sense of the signal and the noise of the data by continuously processing signatures. This enables to detect, contain and neutralize threats proactively:

Analytics can be many things in such a scenario:

Simple business logic such as stateless filtering or stateful aggregations
Complex business rules with custom code or with an integrated separate rules engine like Drools
Machine Learning / Deep Learning by embedding any analytic model (TensorFlow, H2O, DataRobot, whatever) into the Kafka streaming application
Information from external systems applied in real-time in the right context

On a high level, the advantages of using Kafka Streams or ksqlDB for threat intelligence can be described as follows:

A single scalable and reliable real-time infrastructure for end-to-end data integration and data processing
Flexibility to write custom rules and embed other rules engines, frameworks, or trained models
Integration with other threat detection systems like IDS, SIEM, SOAR

The business logic for cyber threat detection looks different for every use case. Known attack patterns like MITRE ATT&ACK help with the implementation. However, situational awareness and threat detection also need to detect unknown anomalies.

Let’s now take a look at a concrete example.

Intel’s Cyber Intelligence Platform

Let me quote Intel themselves:

“As cyber threats continuously grow in sophistication and frequency, companies need to quickly acclimate to effectively detect, respond, and protect their environments. At Intel, we’ve addressed this need by implementing a modern, scalable Cyber Intelligence Platform (CIP) based on Splunk and Confluent. We believe that CIP positions us for the best defense against cyber threats well into the future.

Our CIP ingests tens of terabytes of data each day and transforms it into actionable insights through streams processing, context-smart applications, and advanced analytics techniques. Kafka serves as a massive data pipeline within the platform. It provides us the ability to operate on data in-stream, enabling us to reduce Mean Time to Detect (MTTD) and Mean Time to Respond (MTTR). Faster detection and response ultimately leads to better prevention.”

Let’s explore Intel’s CIP for threat intelligence in more detail.

Detecting Vulnerabilities with Stream Processing

Intel’s CIP leverages the whole Kafka ecosystem provided by Confluent:

Ingestion: Kafka producer clients for various sources such as databases, scanning engines, IP address management, asset management inventory, etc.
Streaming analytics: Kafka Streams for filtering vulnerabilities by business unit, joining asset ownership with vulnerable assets, etc.
Egress: Kafka Connect sink connectors for data lakes, IT partners, other business units, SIEM, SOAR, etc.
High availability: Multi-Region Clusters (MRC) for high availability across regions
And much more…

Here is a high-level architecture:

Intel’s Kafka Maturity Timeline

Building a cybersecurity infrastructure is not a big bang. A step-by-step approach starts with integrating the first sources and sinks, some simple stream processing, and deployment as a pilot project. Over time, more and more data sources and sinks are added, the business logic gets more powerful, and the scale increases.

Intel’s Kafka maturity timeline shows their learning curve:

Kafka Benefits to Intel

Intel describes their benefits for leveraging event streaming as follows:

Economies of scale
Operate on data in the stream
Reduce technical debt and downstream costs
Generates contextually rich data
Global scale and reach
Always on
Modern architecture with a thriving community
Kafka leadership through Confluent expertise

That’s pretty much the same reasons I use in many of my other blog posts to explain the rise of data in motion powered by Apache Kafka across industries and use cases…

For more intel on Intel’s Cyber Intelligence Platform powered by Confluent and Splunk, check out their whitepaper and Kafka Summit talk.

Scalable Real-time Cyber Threat Intelligence with Kafka

Kafka is not just used as a backbone and source of truth for data. It also monitors, correlates, and proactively acts on events from various real-time and batch data sources to implement cyber threat intelligence.

The Cyber Intelligence Platform from Intel is a great example of a Kafka-powered cybersecurity solution. It leverages the whole Kafka ecosystem to build a scalable and reliable real-time integration and processing layer. The streaming analytics logic depends on the use case. It can cover simple business logic but also external rules engines or analytic models.

How do you fight against cybersecurity risks? What technologies and architectures do you use to implement cyber threat intelligence? How does Kafka complement other tools? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Kafka for Cybersecurity (Part 3 of 6) – Cyber Threat Intelligence appeared first on Kai Waehner.

Kafka for Cybersecurity (Part 1 of 6) – Data in Motion as Backbone

Kai Waehner — Fri, 02 Jul 2021 11:49:00 +0000

Blog series: Apache Kafka for Cybersecurity

Part 1 (THIS POST): Data in Motion as cybersecurity backbone
Part 2: Situational awareness
Part 3: Threat intelligence
Part 4: Forensics
Part 5: Air-gapped and zero trust environments
Part 6: SIEM / SOAR modernization

Why should you care about Cybersecurity?

Cybersecurity is the protection of computer systems and networks from information disclosure, theft of, or damage to their hardware, software, or electronic data, as well as from the disruption or misdirection of the services they provide.

The field is becoming increasingly significant due to the increased reliance on computer systems, the internet, and wireless network standards such as Bluetooth and Wi-Fi, and due to the growth of “smart” devices, including smartphones, televisions, and the various devices that constitute the “Internet of Things”. Owing to its complexity, cybersecurity is also one of the major challenges in the contemporary world in terms of politics and technology.

Various actors can be involved in cybersecurity attacks. This includes web scraping, hackers, criminals, terrorists, state-sponsored and state-initiated actors.

Examples of recent successful attacks

Most successful attacks have a financial and brand impact. However, it depends on the organization and the kind of attack or data breach. Here are a few recent examples quoted from news articles that you have probably heard of in the tv, newspaper, or internet:

533 million Facebook users’ phone numbers and personal data have been leaked online
500 Million LinkedIn Users’ Data Were Allegedly Hacked A Tale O
A Tale of Two Hacks: From SolarWinds to Microsoft Exchange Ransomware
Ransomware attack shuts down biggest U.S. gasoline pipeline

Privacy, safety, and cost are huge factors in these successful attacks. The craziest part is that most of the successful cyber attacks don’t even get public as companies prefer to keep them secret.

Supply Chain Attacks

You are not even safe if your own infrastructure is secure. A supply chain attack is a cyber-attack that seeks to damage an organization by targeting less-secure elements in the supply chain.

A supply chain attack can occur in any industry, from the financial, oil, or government sectors. For instance, cybercriminals tamper with the manufacturing process by installing a rootkit or hardware-based spying components.

The supply chain involves hardware, software, and humans.

Norton Rose Fulbright shows how a supply chain attack looks like:

A well-known example of a supply chain attack is the Experian breach, where the data of millions of T-Mobile customers was exposed by the world’s biggest consumer credit monitoring firm.

The SolarWinds breach is another famous example. SolarWinds’ network management system has over 300,000 customers. Many of them are heavy hitters, much of the US Federal government, including the Department of Defense, 425 of the US Fortune 500, and many customers worldwide.

Impact of Cybersecurity attacks

“It takes 20 years to build a reputation and few minutes of cyber-incident to ruin it.” (Stephane Nappo)

Security attacks are exploding with high costs. The average cost of a data breach is $3.86 MILLION. It takes 280 DAYS on average time to identify and contain a breach. Additionally, the brand impact is huge but very hard to quantify.

Cybersecurity as a key piece of the security landscape

As you can see in the above examples: The threat is real!

The digital transformation requires IT capabilities, even in the OT world. Internet of Things (IoT), Industrial IoT (IIoT), Industry 4.0 (I40), connected vehicles, smart city, social networks, and similar game-changing trends are only possible with modern technology: Networking, communication, connectivity, open standards, ”always-on”, billions of devices, and so on.

The security landscape gets more and more important. And Cybersecurity is a key piece of that:

The other security components are very relevant and complimentary, of course. But let’s also talk about some challenges:

Access control: Complex and error-prone
Encryption: Very important in most cases, sometimes not needed (e.g., in DMZ / air-gapped environments)
Hardware security: No help against insiders
OT security: Avoid risk (change operations) vs. transfer some risk (buy insurance)

Therefore, in addition to the above security factors, each organization requires good cybersecurity infrastructure (including SIEM and SOAR components).

Continuous real-time data correlation across various data sources is mandatory for a good cybersecurity strategy to have a holistic view and understanding of all the events and potential abuses that are taking place. This is a combination of data collection of different activities happening on critical networks followed by data correlation in real-time (so-called stream processing/streaming analytics).

Cybersecurity challenges – The threat is real!

Plenty of different attacks exist: Stealing intellectual property (IP), Denial-of-service attacks (DDoS), ransomware, wiperware, and so on. WannaCry, NotPetya, and SolarWinds are a few famous examples of successful and impactful cyberattacks. The damage is often billions of dollars.

The key challenge for cybersecurity experts: Find the Needle(s) in the Haystack.

Systems need to detect true positives in real-time automatically. This includes capabilities such as:

Threat detection
Intrusion prevention
Anomaly detection
Compliance auditing
Proactive response

The haystack is typically huge, i.e., massive volumes of data. Often, it is not just one haystack but many. Hence, a key task is to reduce false positives. This is possible via:

Automation
Process big volumes of data in real-time
Integration of all sources
No ‘ignore’ on certain events
Creation of filters and correlated event rules
Improve signal-to-noise ratio (SNR)
Correlate a “collection of needles” into a “signature needle”

TL;DR: Correlate massive volumes of data and act in real-time to threats. That’s why Kafka comes into play…

Kafka and Cybersecurity

Real-time data beats slow data. That’s true (almost) everywhere and the main reason for Kafka’s success and huge adoption across use cases and industries. Real-time sensor diagnostics and track&trace in transportation, instant payment and trade processing in banking, real-time inventory and personalized offers in retail, that’s just a few of the examples.

But: Real-time data beats slow data in the (cyber)security space, too. Actually, it is even more critical here. A few examples:

Security: Access control and encryption, regulatory compliance, rules engine, security monitoring, surveillance.
Cybersecurity: Risk classification, threat detection, intrusion detection, incident response, fraud detection

The main goal in cybersecurity is to reduce the Mean Time to Detect (MTTD) and Mean Time to Respond (MTTR). Faster detection and response ultimately leads to better prevention. For this reason, modern enterprise architectures adopt Apache Kafka and its ecosystem as the backbone for cybersecurity:

Kafka and its ecosystem provide low-latency performance at high throughput in conjunction with data integration and data processing capabilities. That’s exactly what you need for most cybersecurity systems.

Having said this, let’s be clear:

Kafka was NOT built for Cybersecurity

From a technology perspective, cybersecurity includes the following features, products, and services:

SIEM / SOAR
Situational Awareness
Operational Awareness
Intrusion Detection
Signals and Noise
Signature Detection
Incident Response
Threat Hunting & Intelligence
Vulnerability Management
Digital Forensics
…

Kafka is an event streaming platform built to process data in motion, not to solve cybersecurity issues. So, why are we talking about Kafka in this context then?

Most existing cybersecurity platforms contain the same characteristics: Batch, proprietary, inflexible, not scalable, expensive. Think about this with your favorite SIEM in mind.

Kafka as the backbone for Cybersecurity

Kafka has different characteristics: Real-time, open, flexible, scalable, cost-efficient. Hence, Kafka is the ideal backbone for a next-generation cybersecurity infrastructure. This enables the following capabilities:

Integrate with all legacy and modern interfaces
Record, filter, curate a broad set of traffic streams
Let analytic sinks consume just the right amount of data
Drastically reduce the complexity of the enterprise architectures
Drastically reduce the cost of SIEM / SOAR deployments
Add new analytics engines
Add stream-speed detection and response at scale in real-time
Add mission-critical (non-) security-related applications

Kafka complements SIEM/SOAR and other Cybersecurity and Network Monitoring Tools

Every enterprise is different… Flexibility is key for your cybersecurity initiative! That’s why I see many customers adopting Confluent as an independent but enterprise-grade and hybrid foundation for the cybersecurity enterprise architecture.

Kafka or Confluent do not replace but complement other security products such as IBM QRadar, HP ArcSight, or Splunk. The same is true for other network security monitoring and cybersecurity tools.

High Velocity and (ridiculous) volume of Netflow / PCAP data are processed via tools such as Zeek / Corelight. Open-source frameworks such as TensorFlow or AutoML products such as DataRobot provide modern analytics (machine learning/deep learning) to enhance the intrusion detection systems (IDS) to respond to incidents proactively.

Kafka provides a flexible and scalable real-time backplane for the cybersecurity platform. Its storage capabilities truly decouple different systems. For instance, Zeek handles the incoming ridiculous volume of PCAP data before Kafka handles the backpressure for slow batch consumers such as Splunk or Elasticsearch:

In the real world, there is NOT just one SIEM, SOAR, or IDS in the enterprise architecture. Different applications solve different problems. Kafka shines due to its true decoupling while still providing real-time consumption for consumers that can handle it.

Cybersecurity is required everywhere

Running a cybersecurity suite in one location is not sufficient. Of course, it depends on the enterprise architecture, use cases, and many other factors. But most companies have hybrid deployments, multi-cloud, and/or edge scenarios.

Therefore, many companies choose Confluent to deploy event streaming everywhere, including uni- or bi-directional integration, edge aggregation setups, and air-gapped environments. This way, one open and flexible template architecture enables streaming applications and real-time cybersecurity everywhere.

Here is an example of an end-to-end cybersecurity infrastructure leveraging serverless Kafka in the cloud (Confluent Cloud) and self-managed, cloud-native Kafka (Confluent Platform) on-premise, connected Kafka at the edge (a Confluent cluster on the ships), and disconnected edge (a single broker in the drone):

Let’s now take a look at a real-world example for a Kafka-powered cybersecurity platform.

Crowdstrike’s Kafka backbone for cybersecurity

Crowdstrike is a cybersecurity cloud solution for endpoint security, threat intelligence, and cyberattack response services. Kafka is the backbone of their infrastructure. They ingest ~5 trillion events per week into the cloud platform.

A cybersecurity platform needs to be up and responsive 24/7. It must be available, operational, reliable, and maintainable all the time. Crowdstrike defined four critical roles for operating their streaming data infrastructure: Observability, Availability, Operability, Data Quality.

Check out Crowdstrike’s tech blog to learn more about their Kafka infrastructure.

Kafka is (not) all you need for cybersecurity

This introductory post explored the basics of cybersecurity and how it relates respectively why it requires data in motion powered by Apache Kafka. The rest of the series will go deeper into specific topics that partly rely on each other.

Threat intelligence is only possible with situational awareness. Forensics is complementary. Deployments differ depending on security, safety, and compliance requirements.

I will also give a few more concrete Kafka-powered examples and discuss a few success stories for some of these topics. Last but not least, I will show different reference architectures where Kafka complements existing tools such as Zeek or Splunk within the enterprise architecture.

How do you solve cybersecurity risks? What technologies and architectures do you use? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Kafka for Cybersecurity (Part 1 of 6) – Data in Motion as Backbone appeared first on Kai Waehner.