KSQL Deep Dive – The Open Source Streaming SQL Engine for Apache Kafka

I gave a workshop at Kafka Meetup Tel Aviv in May 2018: “KSQL Deep Dive – The Open Source Streaming SQL Engine for Apache Kafka”.

Here are the agenda, slides and video recording.

KSQL – The Open Source Streaming SQL Engine for Apache Kafka

KSQL is the open-source, Apache 2.0 licensed streaming SQL engine on top of Apache Kafka. It aims to simplify stream processing and make it available to everyone. Even though it is simple to use, KSQL is built for mission-critical and scalable production deployments (using Kafka Streams under the hood).
Benefits of using KSQL include: no coding required; no additional analytics cluster needed; streams and tables as first-class constructs; and access to the rich Kafka ecosystem. This session introduces the concepts and architecture of KSQL. Use cases such as streaming ETL, real-time stream monitoring, and anomaly detection are discussed. A live demo shows how to set up and use KSQL quickly and easily on top of your Kafka ecosystem.
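
To give a feel for "streams and tables as first-class constructs", here is a minimal KSQL sketch of a simple anomaly-detection style query. It is illustrative only and not taken from the workshop: the topic name `pageviews`, the column names, the table name `suspicious_users`, and the threshold of 100 are all assumptions made up for this example.

```sql
-- Hypothetical example: topic, columns, and threshold are assumptions.

-- Register an existing Kafka topic as a stream of JSON events.
CREATE STREAM pageviews (viewtime BIGINT, userid VARCHAR, pageid VARCHAR)
  WITH (KAFKA_TOPIC='pageviews', VALUE_FORMAT='JSON');

-- Continuously maintain a table of users with more than 100 page views
-- within any one-minute tumbling window.
CREATE TABLE suspicious_users AS
  SELECT userid, COUNT(*) AS views
  FROM pageviews
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY userid
  HAVING COUNT(*) > 100;
```

The first statement only declares a schema over data that is already in Kafka. The second statement starts a continuous query whose result is itself written back to a Kafka topic, so no separate analytics cluster is involved.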

If you want to get started, try out the KSQL quick start guide. It gets you started in about 10 minutes, either locally on your laptop or in a Docker environment.

Agenda

  1. Apache Kafka Ecosystem
  2. Kafka Streams as Foundation for KSQL
  3. Motivation for KSQL
  4. KSQL Concepts
  5. Live Demo #1 – Intro to KSQL
  6. KSQL Architecture
  7. Live Demo #2 – Clickstream Analysis
  8. Building a User Defined Function (Example: Machine Learning)
  9. Getting Started

Slides

The slides are hosted on www.slideshare.net.

Video Recording

There was a YouTube live stream. Unfortunately, we had some technical problems, so the audio of the first half is not very good. Sorry for that. I still want to share the recording; the second half has good sound quality.

I look forward to your feedback. Please also feel free to ask questions in the Confluent Slack community (where you can also get help from the KSQL engineers), or create GitHub tickets if you run into problems or want to contribute to this great open source project.

Kai Waehner

bridging the gap between technical innovation and business value for real-time data streaming, processing and analytics
