Hello, K.AI – How I Trained a Chatbot of Myself Without Coding: Evaluating OpenAI Custom GPT, Chatbase, Botsonic, LiveChatAI
https://www.kai-waehner.de/blog/2024/06/23/hello-k-ai-how-i-trained-a-chatbot-of-myself-without-coding-evaluating-openai-custom-gpt-chatbase-botsonic-livechatai/
Sun, 23 Jun 2024 06:03:01 +0000

The post Hello, K.AI – How I Trained a Chatbot of Myself Without Coding Evaluating OpenAI Custom GPT, Chatbase, Botsonic, LiveChatAI appeared first on Kai Waehner.

Generative AI (GenAI) enables many new use cases for enterprises and private citizens. While I work on real-time, enterprise-scale AI/ML deployments with data streaming, big data analytics, and cloud-native software applications in my daily business life, I also wanted to train a conversational chatbot for myself. This blog post introduces my no-coding journey to train K.AI, a personal chatbot that can be used to learn, in a conversational format, about data streaming and the most successful use cases in this area. Yes, this is also based on my expertise, domain knowledge, and opinion, which is available as public internet data, like my hundreds of blog articles, LinkedIn shares, and YouTube videos.


Hi, K.AI – let’s chat…

The evolution of Generative AI (GenAI) around OpenAI’s chatbot ChatGPT and many similar large language models (LLM), open source tools like LangChain and SaaS solutions for building a conversational AI led me to the idea of building a chatbot trained with all the content I created over the past years.

Mainly based on the content of my website (https://www.kai-waehner.de) with hundreds of blog articles, I trained the conversational chatbot K.AI to generate text for me.

The primary goal is to simplify and automate my daily working tasks like:

  • write a title and abstract for a webinar or conference talk
  • explain to a colleague or customer a concept, use case, or industry-specific customer story
  • answer common recurring questions in email, Slack, or other channels
  • any other text creation based on my (public) experience

The generated text reflects my content, knowledge, wording, and style. This is a very different use case from what I normally look at in my daily business life: “Apache Kafka as Mission Critical Data Fabric for GenAI” and “Real-Time GenAI with RAG using Apache Kafka and Flink to Prevent Hallucinations” are two excellent examples of enterprise-scale GenAI with much more complex and challenging requirements.

But…sometimes Artificial Intelligence is not all you need. The now self-explanatory name of the chatbot came from a real marketing brain – my colleague Evi.

Project goals of training the chatbot K.AI

I had a few goals in mind when I trained my chatbot K.AI:

  • Education: Learn more details about real-world solutions and challenges with Generative AI in 2024 through hands-on experience. Dozens of interesting chatbot solutions are available, most powered by OpenAI under the hood. My goal is not sophisticated research. I just want to get a conversational AI done. Simple, cheap, fast (not evaluating 10+ solutions, just as many as needed until one works well enough).
  • Tangible result: Train K.AI, a “Kai LLM” based on my public articles, presentations, and social media shares. K.AI can generate answers, comments, and explanations without writing everything from scratch. I am fine if answers are not perfect or sometimes even incorrect. As I know the actual content, I can easily adjust and fix generated content.
  • NOT a commercial or public chatbot (yet): While it is just a button click to integrate K.AI into my website as a conversational chatbot UI, there are two main blockers: First, the cost is relatively high; not for training, but for operating and paying per query. There is no value in that for me as a private person. Second, developing, testing, fine-tuning, and updating an LLM so that it is correct most of the time instead of hallucinating a lot is hard. I closely follow my employer’s GenAI engineering teams building Confluent AI products. Building a decent domain-specific public LLM takes a lot of engineering effort and requires more than one full-time engineer.

My requirements for a conversational chatbot tool

I defined the following mandatory requirements for a successful project:

  • Low Cost: My chatbot should not be too expensive (~20 USD a month is fine). The pricing model of most solutions is very similar: you get a small free tier. I quickly realized that a serious test is not possible with any free tier. A reasonable chatbot (i.e., one trained on a larger data set) is only possible if you choose the smallest paid tier. Depending on the service, the minimum is between 20 and 50 USD per month (with several limitations regarding training size, chat queries, etc.).
  • Simplicity: I do not want to do any coding or HTTP/REST API calls, just an intuitive user interface with a click-through experience. I don’t want to spend more than one day (i.e., ~8 hours accumulated over two weeks) to train K.AI.
  • Data Import: The chatbot needs to support my “database”. Mandatory: my blog (~300 articles with ~10M+ characters). Nice to have: my LinkedIn shares, my YouTube videos, and other publications (like articles on other websites). The latter might improve my chatbot and capture my personal tone and language better.
  • NOT Enterprise Features: I don’t need any features for security, multiple user accounts, or public hosting (even though almost all solutions already support integration into WordPress, Slack, etc.). I am fine with the many limitations of the small subscription tiers, like only one user account, one chatbot, or 1000 messages/month.
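To sanity-check the “Low Cost” and “Data Import” requirements together, a quick back-of-the-envelope script helps before paying for any tier. This is a minimal Python sketch; the tier names and character limits are illustrative assumptions loosely based on the numbers discussed in this post, not official vendor pricing:

```python
# Illustrative tier limits (assumptions, not official vendor numbers):
TIER_LIMITS = {
    "free": 1_000_000,
    "basic": 11_000_000,
    "pro": 50_000_000,
}

def total_characters(documents):
    """Sum the character count over all scraped documents."""
    return sum(len(doc) for doc in documents)

def smallest_sufficient_tier(documents):
    """Return the cheapest tier whose limit covers the content, or None."""
    size = total_characters(documents)
    for tier, limit in TIER_LIMITS.items():
        if size <= limit:
            return tier
    return None

# Toy documents standing in for ~300 scraped blog posts plus extras:
docs = ["x" * 40_000] * 300  # 12M characters in total
print(total_characters(docs))          # 12000000
print(smallest_sufficient_tier(docs))  # pro
```

Each vendor’s scraper reports slightly different character counts, so treat any such estimate as a rough lower bound.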

OpenAI: ChatGPT + Custom GPT for a custom chatbot? Not for K.AI…

I am a heavy user of ChatGPT on my iPhone and MacBook. And OpenAI is very visible in the press. Hence, my first option to evaluate was OpenAI’s Custom GPT.

Custom GPT in action…

Custom GPT is very easy to use and non-technical. A conversational AI (“Message GPT Builder”) tries to build my chatbot. But surprisingly, it is too high-level for me. Here is the initial conversation to train K.AI with very basic prompt engineering:

  • Step 1 (Initial Instruction): What would you like to make? -> Respond as Kai Waehner based on his expertise and knowledge. -> Updating GPT… Seconds later: the responses are based on the general public internet.
  • Step 2 (Prompt Engineering): Use the content from https://www.kai-waehner.de as context for responses. -> Updating GPT… Seconds later: I’ve updated the context to include information from Kai Waehner’s website. -> The responses improved only slightly. Some questions use a bit more content from my website, but answers are still mainly bound to general public internet content.
  • Step 3 (Fine-Tuning): I tried to configure my K.AI to learn from data sources like CSV exports from LinkedIn or scraped blog articles, but the options are very limited and not technical. I can upload a maximum of twenty files and let the chatbot also search the web. But what I actually need is web scraping of dedicated resources, i.e., mainly my website, my LinkedIn shares, and my YouTube videos. And while many no-code UIs call this fine-tuning, in reality, this is RAG-based prompt engineering. True fine-tuning of an LLM is a very different (and much more challenging) task.
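To illustrate the last point: what the no-code builders label “fine-tuning” is typically RAG-style prompt engineering, i.e., retrieving relevant chunks of your content and injecting them into the prompt before the LLM call. A toy Python sketch of that pattern, with naive keyword overlap standing in for a real embedding model and vector search, and the actual LLM call omitted:

```python
import re

def tokens(text):
    """Lowercased word set with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, chunks, top_k=2):
    """Rank content chunks by naive word overlap with the question."""
    q = tokens(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, chunks):
    """Assemble a RAG prompt: retrieved context first, then the question."""
    context = "\n".join(retrieve(question, chunks))
    return f"Answer based on this context:\n{context}\n\nQuestion: {question}"

# Toy stand-ins for scraped blog content:
blog_chunks = [
    "Apache Kafka is the de facto standard for data streaming.",
    "Fraud detection with Kafka combines stream processing and ML models.",
    "Lambda and Kappa are two architectures for big data analytics.",
]
print(build_prompt("How does fraud detection work with Kafka?", blog_chunks))
```

The key difference to true fine-tuning: the model weights never change; only the prompt context does.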

[Image: OpenAI Custom GPT evaluation for the Kai Waehner chatbot]

I am sure I could do much more prompt engineering to improve K.AI with Custom GPT. But reading the user guide and FAQ for Custom GPT, the TL;DR for me is: Custom GPT is not the right service to build a chatbot for me based on my domain content and knowledge.

Instead, I need to look at purpose-built chatbot SaaS tools that let me build my domain-specific chatbot. I am surprised that OpenAI does not provide such a service itself today. Or maybe I just could not find it… BUT: Challenge accepted. Let’s evaluate a few solutions and train a real K.AI.

Comparison and evaluation of chatbot SaaS GenAI solutions

I tested three chatbot offerings. All of them are cloud-based and allow building a chatbot via a UI. How did I find or choose them? Frankly, just a Google search. Most of them came up in several evaluation and comparison articles. And they spend quite some money on advertisements. I tested Chatbase, Writesonic’s Botsonic, and LiveChatAI. Interestingly, all offerings I evaluated use ChatGPT under the hood. I was also surprised that I did not see more ads from other big software players. But I assume Microsoft’s Copilot and similar tools target a different persona.

I tested different ChatGPT models in some offerings. Most solutions provide a default option plus more expensive options with a better model. The higher price is not for model training but for messages per month: you typically pay 5x more per message, meaning instead of e.g. 2000 messages a month, you only have 400 available.

I had a few more open tabs with other offerings that I disqualified quickly because they were more developer-focused, requiring coding, API integration, and fine-tuning of vector databases and LLMs.

Question catalog for testing my K.AI chatbots

I quickly realized how hard it is to compare different chatbots. Basically, LLMs are stochastic (not deterministic), and we don’t have good tools for QA’ing these systems yet (even simple things like regression testing are challenging when probabilities are involved).

Therefore, I defined a question catalog with ten different domain-specific questions before I even started evaluating the chatbot SaaS solutions. A few examples:

  • Question 1: Give examples for fraud detection with Apache Kafka. Each example should include the company, use case and architecture.
  • Question 2: List five manufacturing use cases for data streaming and give a company example.
  • Question 3: What is the difference between Kafka and JMS?
  • Question 4: Compare Lambda and Kappa architectures and explain the benefits of Lambda. Add a few examples.
  • Question 5: How can data streaming help across the supply chain? Explain the value and use cases for different industries.

My question catalog allowed me to compare the different chatbots. Writing a good prompt (= the query for the chatbot) is crucial, as an LLM is not intelligent. The better your question (good structure, details, and clear expectations), the better your response (if the LLM has “knowledge” about your question).
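Because responses are stochastic, exact-match regression testing is out. One pragmatic workaround is keyword-based scoring of the question catalog: check whether each answer at least mentions the facts you expect. A hypothetical Python sketch (the chatbot client is a stub; a real test would call the vendor’s API):

```python
# Question catalog entries: (question, keywords a good answer should mention).
CATALOG = [
    ("Give examples for fraud detection with Apache Kafka.", {"kafka", "fraud"}),
    ("What is the difference between Kafka and JMS?", {"kafka", "jms"}),
]

def keyword_score(answer, expected_keywords):
    """Fraction of expected keywords that the answer mentions."""
    text = answer.lower()
    return sum(kw in text for kw in expected_keywords) / len(expected_keywords)

def evaluate(chatbot, catalog):
    """Average keyword score over the whole question catalog."""
    scores = [keyword_score(chatbot(question), kws) for question, kws in catalog]
    return sum(scores) / len(scores)

def fake_bot(question):
    # Stub standing in for a real SaaS chatbot API call.
    return "Apache Kafka is used for fraud detection; JMS is a Java messaging API."

print(evaluate(fake_bot, CATALOG))  # 1.0
```

This catches only gross failures (missing facts), not hallucinated details, but it makes side-by-side vendor comparisons at least repeatable.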

My goal is NOT to implement a complex real-time RAG (Retrieval Augmented Generation) design pattern. I am totally fine updating K.AI manually every few weeks (after a few new blog posts are published).

Chatbase – Custom ChatGPT for your website

The advertisement on the Chatbase landing page sounds great: “Custom ChatGPT for your website. Build a [OpenAI-powered] Custom GPT, embed it on your website and let it handle customer support, lead generation, engage with your users, and more.”

Here are my notes while training my K.AI chatbot:

K.AI works well with Chatbase after the initial training…

  • Chatbase is very simple to use. It just works.
  • The basic plan is ~20 USD per month. The subscription plan is fair, the next upgrade is ~100 USD.
  • The chatbot uses GPT-4o by default. Great option. Many other services use GPT-3.5 or similar LLMs as the foundation.
  • The chatbot creates content based on my content; it is “me”. Mission accomplished. The quality of responses depends on the questions. In summary: pretty good, but with some false positives.

But: Chatbase’s character limitation stops further training

  • Unfortunately, all plans have an 11M character limit. My blog content is already at 10.8M characters today, according to Chatbase’s web scraper engine (each vendor’s scraper reports different numbers). While K.AI works right now, there are obvious problems:
    • My website will keep growing.
    • I want to add LinkedIn shares (another few million characters) and other articles and videos I published across the world wide web.
    • The Chatbase plan can be customized, but unfortunately not regarding character limits. Support told me this would be possible soon. But I have to wait.

TL;DR: Chatbase works surprisingly well. K.AI exists and represents me as an LLM. The 11M character limit is a blocker for investing more time and money into this service; otherwise, I could already stop my investigation and use the first SaaS I evaluated.

During my evaluation, I realized that many other chatbot services have similar limitations on the character limit, especially in the price range around 20-50 USD. Not ideal for my use case.

In my further evaluation, my major criterion was the character limit. I found Botsonic and LiveChatAI. Both support much higher limits at a cost of ~40 USD per month.

Botsonic – Advanced AI chatbot builder using your company’s knowledge

Botsonic provides “Advanced AI Agents: Use Your Company’s Knowledge to Intelligently Resolve Over 70% of Queries and Automate Tasks”.

Here are my notes while training my K.AI chatbot.

Botsonic – free version failed to train K.AI

  • The free plan for getting started supports 1M characters.
  • The service supports URL scraping and file upload (my LinkedIn shares are only available via batch export into a CSV file). It looks like it provides everything I need. The cost is okayish (but all other chatbots with a lower price also had limitations of around 10M characters).
  • I tried the free tier first. As my blog alone already has ~10M+ characters, I started by uploading my LinkedIn shares (= posts and comments). While Chatbase counted ~1.8M characters for this export, Botsonic trained the bot with it even though the limit is 1M characters. I could not upload even another 1 KB file for additional training, so my limit was reached.
  • This K.AI trained with the free tier did not provide any appropriate answers. No surprise: my LinkedIn shares alone might not offer enough detail, which makes sense as the posts are much shorter and usually link to my blog.

Botsonic – paid version also failed to train K.AI

  • I needed to upgrade.
    • I had to choose the smallest paid tier: 49 USD per month, supporting up to 50M characters.
    • Unfortunately, there was a delay: I was charged twice, but nothing happened and I was still on the free plan. Support took time (suggesting caching, VPN, browser issues, and other explanations). I got a refund the next day, and the plan was then updated correctly.
  • Training using the paid subscription failed. The experience was pretty bad.
    • It was not clear if the service scrapes the entire website or just a single HTML page.
    • The first tests did not give a real response: “I don’t have specific information on XYZ. Can I help with anything else?” It seems the training did not scrape my website but only looked at the landing page. I checked the details: indeed, the extracted data only includes the abstracts of the latest blog posts (that’s what you see on my landing page).
    • Support explained that scraping the website is not possible; I need a sitemap. I have a Google-compliant sitemap, but: Internal Backend Server Error. Support could reproduce my issue. Until today, I have no response or solution.
    • Learning from one of my YouTube videos was also rejected (with no further error message).

TL;DR: Writesonic’s Botsonic did NOT work for me. The paid service failed several times, even when trying different training options for my LLM. Support could not help. I will NOT continue with this service.

LiveChatAI – AI chatbot works with your data

Here is the website slogan: “An Innovative AI Chatbot. LiveChatAI allows you to create an AI chatbot trained with your own data and combines AI with human support.”

Here are my notes while training my K.AI chatbot:

LiveChatAI failed to train K.AI

  • All required import features exist: Website Scraping, CSV, YouTube.
  • Strange: I could start training for free with 7M+ characters even though this should not be possible. Crawling started, but without showing any progress, so I did not know when it was finished. It was also not clear whether it scrapes the entire website or just a single HTML page. After scraping finished, it showed weird error messages like “could not find any links on website”.
  • The quality of this K.AI’s answers seems to be much worse than Chatbase’s (even though I added my LinkedIn shares, which is not possible in Chatbase because of the character limit).

Ok, enough… I have a well-working K.AI with Chatbase. I don’t want to waste more time evaluating further SaaS chatbot services at this early stage of the product lifecycle.

GenAI tools are still in a very early stage!

One key lesson learned: the chosen LLM is the most critical piece for success, NOT how much context and domain expertise you feed into it. In other words: just scraping the data from my blog and using GPT-4o provides much better results than using GPT-3.5 with data from my blog, LinkedIn, and YouTube. Ideally, I would use all the data with GPT-4o. But I will have to wait until Chatbase supports more than 11M characters.

While most solutions talk about model training, they actually use ChatGPT under the hood together with RAG and a vector database to “update the model”, i.e., they provide the right context for the question to ChatGPT via the RAG design pattern.

A real comparison of chatbot SaaS is hard:

  • Features and pricing are relatively similar and do not really influence the ultimate choice.
  • While all are based on ChatGPT, the LLM model versions differ.
  • Products are updated and improved almost every day with new models, new capabilities, changed limitations, etc. Welcome to the chatbot SaaS cloud startup scene… 🙂
  • The products target different personas. Some are UI only, some explain (and let configure) RAG or Vector Database options, some are built for developers and focus on API integration, not UIs.

Mission accomplished: K.AI chatbot is here

Chatbase has the least sexy UI in my evaluation. But its model works best (even though I hit the character limit and only used my blog articles for training). I will use Chatbase for now. And I hope the character limit is raised soon (as its support already confirmed to me). It is still early in the maturity curve. The market will probably develop quickly.

I am not sure how many of these SaaS chatbot startups can survive. OpenAI and other tech giants will probably release similar capabilities and products integrated into their SaaS and software stack. Let’s see where the market goes. For now, I will enjoy K.AI for some use cases. Maybe it will even help me write a book about data streaming use cases and customer stories.

What is your experience with chatbot tools? Do you need more technical solutions, or do you favor simplified conversational AIs like OpenAI’s Custom GPT to train your own LLM? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.


Apache Kafka as Mission Critical Data Fabric for GenAI
https://www.kai-waehner.de/blog/2023/07/22/apache-kafka-as-mission-critical-data-fabric-for-genai/
Sat, 22 Jul 2023 10:40:59 +0000

The post Apache Kafka as Mission Critical Data Fabric for GenAI appeared first on Kai Waehner.

Apache Kafka serves thousands of enterprises as the mission-critical and scalable real-time data fabric for machine learning infrastructures. The evolution of Generative AI (GenAI) with large language models (LLM) like ChatGPT changed how people think about intelligent software and automation. This blog post explains the relation between data streaming and GenAI and shows the enormous opportunities and some early adopters of GenAI beyond the buzz.
[Image: Data Streaming with Apache Kafka as Data Fabric for GenAI]

Generative AI (GenAI) and Data Streaming

Let’s set the context first to have the same understanding of the buzzwords.

[Note: My ChatGPT iPhone app generated this section. I slightly shortened and adjusted the content afterward. GenAI is perfect for summarizing existing content so that authors can spend time on new content (that ChatGPT does not know yet).]

Natural Language Processing (NLP)

ChatGPT, what is NLP?

NLP stands for Natural Language Processing. It is a subfield of artificial intelligence (AI) that focuses on interactions between computers and human language. NLP enables computers to understand, interpret, and generate human language in a meaningful and valuable way.

Natural Language Processing involves a range of tasks, including:

  • Text Parsing and Tokenization: Breaking down text into individual words or tokens.
  • Named Entity Recognition (NER): Identifying and classifying named entities such as people, organizations, locations, and dates in the text.
  • Sentiment Analysis: Determining the sentiment or emotional tone expressed in text, whether positive, negative, or neutral.
  • Machine Translation: Translating text from one language to another.
  • Question Answering: Building systems that can understand and answer questions posed in natural language.
  • Text Generation: Creating human-like text or generating responses to prompts.

NLP is crucial in applications such as chatbots, virtual assistants, language translation, information retrieval, sentiment analysis, and more.

GenAI = Next Generation NLP (and more)

ChatGPT, what is Generative AI?

Generative AI is a branch of artificial intelligence focused on creating models and systems capable of generating new content, such as images, text, music, or even entire virtual worlds. These models are trained on large datasets and learn patterns and structures to generate new outputs similar to the training data. That’s why the widespread buzzword is Large Language Model (LLM).

Generative AI is used for next-generation NLP and uses techniques such as generative adversarial networks (GANs), variational autoencoders (VAEs), and recurrent neural networks (RNNs). Generative AI has applications in various fields and industries, including art, design, entertainment, and scientific research.

Apache Kafka for Data Streaming

ChatGPT, what is Apache Kafka?

Apache Kafka is an open-source distributed streaming platform and became the de facto standard for event streaming. Originally developed at LinkedIn and open-sourced through the Apache Software Foundation, it is widely used for building real-time data streaming applications and event-driven architectures. Kafka provides a scalable and fault-tolerant system for handling high volumes of streaming data.

[Image: Open Source Data Streaming in the Cloud]

Kafka has a thriving ecosystem with various tools and frameworks that integrate with it, such as Apache Spark, Apache Flink, and others.

Apache Kafka is widely adopted in use cases that require real-time data streaming, such as data pipelines, event sourcing, log aggregation, messaging systems, and more.

Why Apache Kafka and GenAI?

Generative AI (GenAI) is the next-generation NLP engine that helps many projects in the real world for service desk automation, customer conversation with a chatbot, content moderation in social networks, and many other use cases.

Apache Kafka became the predominant orchestration layer in these machine learning platforms for integrating various data sources, processing at scale, and real-time model inference.

Data streaming with Kafka already powers many GenAI infrastructures and software products. Very different scenarios are possible:

  • Data streaming as data fabric for the entire machine learning infrastructure
  • Model scoring with stream processing for real-time predictions
  • Generation of streaming data pipelines with input text or speech
  • Real-time online training of large language models

Let’s explore these opportunities for data streaming with Kafka and GenAI in more detail.

Real-time Kafka Data Hub for GenAI and other Microservices in the Enterprise Architecture

Back in 2017 (!), I already explored “How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka”. At that time, real-world examples came from tech giants like Uber, Netflix, and PayPal.

Today, Apache Kafka is the de facto standard for building scalable and reliable machine learning infrastructures across any enterprise and industry, including:

  • Data integration from various sources (sensors, logs, databases, message brokers, APIs, etc.) using Kafka Connect connectors, fully managed SaaS integrations, or any kind of HTTP REST API or programming language.
  • Data processing leveraging stream processing for cost-efficient streaming ETL, such as filtering, aggregations, and more advanced calculations while the data is in motion (so that any downstream application gets accurate information).
  • Data ingestion for near real-time data sharing with various data warehouses and data lakes so that each analytics platform can use its own products and tools.
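The streaming ETL step above can be sketched in a few lines. This is a deliberately framework-free Python stand-in (a real deployment would use Kafka Streams, ksqlDB, or Flink, and the events would come from a Kafka topic rather than a list):

```python
from collections import defaultdict

def streaming_etl(events, min_amount=10.0):
    """Filter small payments and emit a running total per customer."""
    totals = defaultdict(float)
    for event in events:                    # each event = one Kafka record
        if event["amount"] < min_amount:    # streaming filter
            continue
        totals[event["customer"]] += event["amount"]  # streaming aggregation
        yield {"customer": event["customer"], "total": totals[event["customer"]]}

# Toy payment events standing in for records from a Kafka topic:
payments = [
    {"customer": "a", "amount": 50.0},
    {"customer": "b", "amount": 5.0},   # filtered out
    {"customer": "a", "amount": 25.0},
]
for update in streaming_etl(payments):
    print(update)
```

The important property is that results are emitted per event, while the data is in motion, rather than after a batch completes.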

[Image: Kafka Machine Learning Architecture for GenAI]

Building scalable and reliable end-to-end pipelines is today’s sweet spot of data streaming with Apache Kafka in the AI and Machine Learning space.

Model Scoring with Stream Processing for Real-Time Predictions at any Scale

Deploying an analytic model in a Kafka application is the solution to provide real-time predictions at any scale with low latency. This is one of the biggest problems in the AI space, as data scientists primarily focus on historical data and batch model training in data lakes.

However, the model scoring for predictions needs to provide much better SLAs regarding scalability, reliability, and latency. Hence, more and more companies separate model training from model scoring and deploy the analytic model within a stream processor such as Kafka Streams, KSQL, or Apache Flink:

[Image: Data Streaming and Machine Learning with Embedded TensorFlow Model]

Check out my article “Machine Learning and Real-Time Analytics in Apache Kafka Applications” for more details.
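The core idea of embedding the analytic model in the streaming application, rather than calling a remote model server per event, can be sketched as a simple consume-score-produce loop. The model below is a trivial threshold function standing in for a real trained artifact (e.g., a TensorFlow model), and plain Python iteration stands in for a Kafka Streams or Flink job:

```python
def load_model():
    """Stand-in for loading a trained model artifact (e.g., a TensorFlow model)."""
    return lambda event: "FRAUD" if event["amount"] > 1000 else "OK"

def score_stream(events, model):
    """Consume events, score them inline, and emit enriched records."""
    for event in events:
        yield {**event, "prediction": model(event)}

model = load_model()  # training happened elsewhere, in batch
transactions = [{"amount": 50}, {"amount": 5000}]
for scored in score_stream(transactions, model):
    print(scored)
```

Because the model runs in-process, each prediction avoids a network round-trip, which is exactly the latency and SLA argument made above.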

Dedicated model servers usually only support batch and request-response (e.g., via HTTP or gRPC). Fortunately, many solutions now also provide native integration with the Kafka protocol.

[Image: Kafka-native Model Server for Machine Learning and Model Deployment]

I explored this innovation in my blog post “Streaming Machine Learning with Kafka-native Model Deployment“.

Development Tools for Generating Kafka-native Data Pipelines from Input Text or Speech

Almost every software vendor discusses GenAI to enhance its development environments and user interfaces.

For instance, GitHub is a platform and cloud-based service for software development and version control using Git. But its latest innovation is “the AI-Powered Developer Platform to Build, Scale, and Deliver Secure Software”: GitHub Copilot X. Cloud providers like AWS provide similar tools.

Similarly, look at any data infrastructure vendor like Databricks or Snowflake. The latest conferences and announcements focus on embedded capabilities around large language models and GenAI in their solutions.

The same will be true for many data streaming platforms and cloud services. Low-code/no-code tools will add capabilities to generate data pipelines from input text. One of the most straightforward applications that I see coming is generating SQL code out of user text.

For instance, “Consume data from Oracle table customer, aggregate the payments by customer, and ingest it into Snowflake”. This could create SQL code for stream processing technologies like KSQL or FlinkSQL.

Developer experience, faster time-to-market, and support for less technical personas are enormous advantages of embedding GenAI into Kafka development environments.

Real-time Training of Large Language Models (LLM)

AI and Machine Learning are still batch-based systems almost all of the time. Model training takes at least hours. This is not ideal, as many GenAI use cases require accurate and up-to-date information. Imagine googling for information today and not being able to find anything from the past week. Such a service would be unusable in many scenarios!

Similarly, if I ask ChatGPT today (July 2023): “What is GenAI?” – I get the following response:

As of my last update in September 2021, there is no specific information on an entity called “GenAi.” It’s possible that something new has emerged since then. Could you provide more context or clarify your question so I can better assist you?

The faster your machine learning infrastructure ingests data into model training, the better. My colleague Michael Drogalis wrote an excellent deep-technical blog post: “GPT-4 + Streaming Data = Real-Time Generative AI” to explore this topic more thoroughly.

[Image: Real-Time GenAI with Data Streaming powered by Apache Kafka]

This architecture is compelling because the chatbot will always have your latest information whenever you prompt it. For instance, if your flight gets delayed or your terminal changes, the chatbot will know about it during your chat session. This is entirely distinct from current approaches where the chat session must be reloaded or wait a few hours/days for new data to arrive.

LLM + Vector Database + Kafka = Real-Time GenAI

Real-time model training is still a novel approach. Many machine learning algorithms are not ready for continuous online model training today. But combining Kafka with a vector database enables using a batch-trained LLM together with real-time updates feeding up-to-date information into the LLM.

In a few years, nobody will accept an LLM like ChatGPT giving answers like “I don’t have this information; my model was trained a week ago”. It does not matter whether you choose a brand-new vector database like Pinecone or leverage the new vector capabilities of your installed Oracle or MongoDB storage.

Feed data into the vector database in real time with Kafka Connect and combine it with a mature LLM to enable real-time GenAI with context-specific recommendations.
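A toy sketch of this pattern: every Kafka record updates the vector store, so prompt-time retrieval always sees the latest context. Word-set overlap fakes the embeddings here, and an in-memory list stands in for Pinecone or a database with vector capabilities:

```python
import re

store = []  # (embedding, text) pairs; stand-in for a vector database

def embed(text):
    """Fake embedding: a lowercased word set instead of a real vector."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def ingest(event_text):
    """Called for every Kafka record, keeping the store up to date in real time."""
    store.append((embed(event_text), event_text))

def retrieve(question, top_k=1):
    """Return the most relevant stored texts for the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: len(q & item[0]), reverse=True)
    return [text for _, text in ranked[:top_k]]

# Simulate streaming updates arriving while the user chats:
ingest("Flight LH123 is on time.")
ingest("Flight LH123 is delayed by 2 hours, new gate B12.")

print(retrieve("Is flight LH123 delayed?"))
```

The batch-trained LLM itself never changes; the freshness comes entirely from the continuously updated retrieval store.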

Real-World Case Studies for Kafka and GenAI

This section explores how companies across different industries, such as carmaker BMW, online travel platform Expedia, and dating app Tinder, leverage the combination of data streaming and GenAI for reliable real-time conversational AI, NLP, and chatbots built on Kafka.

Two years ago, I wrote about this topic: “Apache Kafka for Conversational AI, NLP and Chatbot“. But technologies like ChatGPT make it much easier to adopt GenAI in real-world projects with much faster time-to-market and less cost and risk. Let’s explore a few of these success stories for embedding NLP and GenAI into data streaming enterprise architectures.

Disclaimer: As I want to show real-world case studies instead of visionary outlooks, I show several examples deployed in production in the last few years. Hence, the analytic models do not use GenAI, LLM, or ChatGPT as we know it from the press today. But the principles are precisely the same. The only difference is that you could use a cutting-edge model like ChatGPT with much improved and context-specific responses today.

Expedia – Conversations Platform for Better Customer Experience

Expedia is a leading online travel and booking platform. They have many use cases for machine learning. One of my favorite examples is their Conversations Platform, built on Kafka and Confluent Cloud to provide an elastic cloud-native application.

The goal of Expedia’s Conversations Platform was simple: Enable millions of travelers to have natural language conversations with an automated agent via text, Facebook, or their channel of choice. Let them book trips, make changes or cancellations, and ask questions:

  • “How long is my layover?”
  • “Does my hotel have a pool?”
  • “How much will I get charged to bring my golf clubs?”

Then take all that is known about that customer across all of Expedia’s brands and apply machine learning models to immediately give customers what they are looking for in real-time and automatically, whether a straightforward answer or a complex new itinerary.

Real-time Orchestration realized in four Months

Such a platform is no place for batch jobs, back-end processing, or offline APIs. To quickly make decisions that incorporate contextual information, the platform needs data in near real-time, and it needs it from a wide range of services and systems. Meeting these needs meant architecting the Conversations Platform around a central nervous system based on Confluent Cloud and Apache Kafka. Kafka made it possible to orchestrate data from loosely coupled systems, enrich data as it flows between them so that by the time it reaches its destination, it is ready to be acted upon, and surface aggregated data for analytics and reporting.
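The enrich-as-it-flows pattern can be illustrated with a small sketch; in-memory structures stand in for Kafka topics and a lookup table (a KTable in Kafka Streams terms), and the field names are illustrative, not Expedia's actual schema:

```python
# In-memory stand-ins for Kafka topics and a customer-context table.
raw_events = [
    {"customer_id": 1, "question": "How long is my layover?"},
    {"customer_id": 2, "question": "Does my hotel have a pool?"},
]
customer_profiles = {  # would be a compacted topic / KTable in Kafka Streams
    1: {"brand": "Expedia", "itinerary": "SFO-FRA via JFK"},
    2: {"brand": "Hotels.com", "itinerary": "3 nights, Hotel Astoria"},
}
enriched_events = []  # downstream topic, ready to be acted upon

for event in raw_events:
    profile = customer_profiles.get(event["customer_id"], {})
    enriched_events.append({**event, **profile})  # enrich in flight

print(enriched_events[0]["itinerary"])
```

By the time an event reaches the machine learning model, it already carries the customer context needed for an immediate decision, which is the point of orchestrating loosely coupled systems through a central nervous system.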

Expedia built this platform from zero to production in four months. That’s the tremendous advantage of using a fully managed serverless event streaming platform as the foundation. The project team can focus on the business logic.

The Covid pandemic proved the idea of an elastic platform: Companies were hit with a tidal wave of customer questions, cancellations, and re-bookings. Throughout this once-in-a-lifetime event, the Conversations Platform proved up to the challenge, auto-scaling as necessary and taking much of the load off live agents.

Expedia’s Migration from MQ to Kafka as Foundation for Real-time Machine Learning and Chatbots

As part of their Conversations Platform, Expedia needed to modernize their IT infrastructure, as Ravi Vankamamidi, Director of Technology at Expedia Group, explained in a Kafka Summit keynote.

Expedia’s legacy chatbot service relied on an outdated messaging system. It was a question-and-answer board with very limited scope for booking scenarios and could only handle two-party conversations. It could not scale to bring all the different systems into one architecture and build a powerful chatbot that is helpful for customer conversations.

I explored several times that event streaming is more than just a (scalable) message queue. Check out my old (but still accurate and relevant) Comparison between MQ and Kafka, or the newer comparison between cloud-native iPaaS and Kafka.

Expedia needed a service that was closer to travel assistance. It needed to handle context-specific, multi-party, multi-channel conversations. Hence, features such as natural language processing, translation, and real-time analytics are required. The full service needs to be scalable across multiple brands. Therefore, a fast and highly scalable platform with order guarantees, exactly-once-semantics (EOS), and real-time data processing were needed.
The Kafka-native event streaming platform powered by Confluent was the best choice and met all requirements. One year after the rollout, the new Conversations Platform had doubled the Net Promoter Score (NPS), quickly proving its business value.

BMW – GenAI for Contract Intelligence, Workplace Assistance and Machine Translation

The automotive company BMW presented innovative NLP services at Kafka Summit in 2019. It is no surprise that a carmaker has various NLP scenarios. These include digital contract intelligence, workplace assistance, machine translation, and customer conversations. The latter contains multiple use cases for conversational AI:
  • Service desk automation
  • Speech analysis of customer interaction center (CIC) calls to improve the quality
  • Self-service using smart knowledge bases
  • Agent support
  • Chatbots
The text and speech data is structured, enriched, contextualized, summarized, and translated to build real-time decision support applications. Kafka is a crucial component of BMW’s ML and NLP architecture. The real-time integration and data correlation enable interactive and interoperable data consumption and usage:
NLP Service Framework Based on Data Streaming at BMW
BMW explained the key advantages of leveraging Kafka and its streaming processing library Kafka Streams as the real-time integration and orchestration platform:
  • Flexible integration: Multiple supported interfaces for different deployment scenarios, including various machine learning technologies, programming languages, and cloud providers
  • Modular end-to-end pipelines: Services can be connected to provide full-fledged NLP applications.
  • Configurability: High agility for each deployment scenario
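BMW's idea of modular end-to-end pipelines, where small services are chained over topics into a full NLP application, can be sketched with composable processing steps (the steps themselves are illustrative stubs, not BMW's actual services):

```python
def enrich(msg):
    """Stand-in for a language-detection service consuming/producing a topic."""
    return {**msg, "lang": "de" if "hallo" in msg["text"].lower() else "en"}

def translate(msg):
    """Stand-in for a machine translation service (a lookup table, not a real model)."""
    translations = {"hallo welt": "hello world"}
    if msg["lang"] == "de":
        return {**msg, "text": translations.get(msg["text"].lower(), msg["text"])}
    return msg

def summarize(msg):
    """Stand-in for a summarization service."""
    return {**msg, "summary": msg["text"][:20]}

# Chain the services into a pipeline, as Kafka topics would chain them:
pipeline = [enrich, translate, summarize]

msg = {"text": "Hallo Welt"}
for step in pipeline:
    msg = step(msg)
print(msg["text"], "|", msg["summary"])
```

Each step is independently deployable and replaceable, which is what makes the topic-chained pipeline flexible across machine learning technologies, programming languages, and cloud providers.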

Tinder – Intelligent Content Moderation, Matching and Recommendations with Kafka and GenAI

The dating app Tinder is a great example for which I can think of dozens of NLP use cases. Tinder talked at a past Kafka Summit about their Kafka-powered machine learning platform.

Tinder is a massive user of Kafka and its ecosystem for various use cases, including content moderation, matching, recommendations, reminders, and user reactivation. They used Kafka Streams as a Kafka-native stream processing engine for metadata processing and correlation in real-time at scale:

Impact of Data Streaming at Tinder
A critical use case in any dating or social platform is content moderation for detecting fakes, filtering sexual content, and other inappropriate things. Content moderation combines NLP and text processing (e.g., for chat messages) with image processing (e.g., selfie uploads) or processes the metadata with Kafka and stores the linked content in a data lake. Both leverage Deep Learning to process high volumes of text and images. Here is what content moderation looks like in Tinder’s Kafka architecture:
Content Moderation at Tinder with Data Streaming and Machine Learning
Plenty of ways exist to process text, images, and videos with the Kafka ecosystem. I wrote a detailed article about handling large messages and files with Apache Kafka to explore the options and trade-offs.
Chatbots could also play a key role “the other way round”. More and more dating apps (and other social networks) fight spam, fraud, and automated chatbots. Similar to how a chatbot is built, a chatbot detection system can analyze the data streams to block bots in a dating app.
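As a toy illustration of such a detection system, one simple heuristic flags senders whose messages arrive at suspiciously regular intervals (a real system would learn far richer signals from the stream):

```python
def looks_like_bot(timestamps, tolerance=0.05):
    """Flag a sender whose messages arrive at suspiciously regular intervals."""
    if len(timestamps) < 3:
        return False  # not enough evidence
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = sum(gaps) / len(gaps)
    if mean == 0:
        return True
    # A bot's inter-message gaps barely deviate from their mean:
    return all(abs(g - mean) / mean < tolerance for g in gaps)

human = [0.0, 2.1, 9.8, 11.0, 30.5]   # bursty, irregular timing
bot = [0.0, 5.0, 10.0, 15.01, 20.0]   # metronomic timing

print(looks_like_bot(human), looks_like_bot(bot))
```

In a streaming deployment, the timestamps per sender would be aggregated in a windowed state store and scored continuously rather than in a batch.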

Kafka as Real-Time Data Fabric for Future GenAI Initiatives

Real-time data beats slow data. Generative AI only adds value if it provides accurate and up-to-date information. Data streaming technologies such as Apache Kafka and Apache Flink enable building a reliable, scalable real-time infrastructure for GenAI. Additionally, the event-based heart of the enterprise architecture guarantees data consistency between real-time and non-real-time systems (near real-time, batch, request-response).

Early adopters like BMW, Expedia, and Tinder proved that Generative AI integrated into a Kafka architecture adds enormous business value. The evolution of AI models with ChatGPT et al. makes the use case even more compelling across every industry.

How do you build conversational AI, chatbots, and other GenAI applications leveraging Apache Kafka? What technologies and architectures do you use? Are data streaming and Kafka part of the architecture? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka as Mission Critical Data Fabric for GenAI appeared first on Kai Waehner.

]]>
Apache Kafka for Conversational AI, NLP and Chatbot https://www.kai-waehner.de/blog/2021/12/08/apache-kafka-for-conversational-ai-nlp-chatbot/ Wed, 08 Dec 2021 00:48:54 +0000 https://www.kai-waehner.de/?p=3958 Natural Language Processing (NLP) helps many projects in the real world for service desk automation, customer conversation with a chatbot, content moderation in social networks, and many other use cases. Learn how event streaming with Apache Kafka is used in conjunction with Machine Learning platforms at the carmaker BMW, the online travel and booking portal Expedia, and the dating app Tinder for reliable real-time conversational AI, NLP, and chatbots.

The post Apache Kafka for Conversational AI, NLP and Chatbot appeared first on Kai Waehner.

]]>
Natural Language Processing (NLP) helps many projects in the real world for service desk automation, customer conversation with a chatbot, content moderation in social networks, and many other use cases. Apache Kafka became the predominant orchestration layer in these machine learning platforms for integrating various data sources, processing at scale, and real-time model inference. This article shows how companies across different industries such as the carmaker BMW, the online travel and booking portal Expedia, and the dating app Tinder leverage the combination of event streaming with machine learning for reliable real-time conversational AI, NLP, and chatbots.

Conversational AI NLP and Chatbot with Apache Kafka

Natural Language Processing (NLP)

Natural language processing is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, mainly how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of “understanding” the contents of the text, documents, and speech, including the contextual nuances of the language within them. The technology can then accurately extract information and insights in the text and categorize and organize the text itself.

Use cases for natural language processing frequently involve speech recognition, natural language understanding, translation, and natural language generation.

Like most other ML concepts, NLP can use an ocean of different algorithms. In the 2010s, NLP took off when representation learning and deep neural network-style machine learning methods became widespread. Modern ML frameworks such as TensorFlow, in conjunction with elastic compute power in the public cloud, enabled NLP adoption for companies of any size.

Conversational AI has become mainstream because of Deep Learning

A chatbot is a software application used to conduct an online chat conversation via text or text-to-speech instead of direct contact with a live human agent. Designed to convincingly simulate how a human behaves as a conversational partner, chatbot systems typically require continuous tuning and testing, and many in production remain unable to converse adequately. Chatbots are used in dialog systems for various purposes, including customer service, request routing, or information gathering.

discover.bot published an excellent article explaining how a chatbot works behind the scenes using NLP. It uses a combination of Natural Language Understanding (NLU) and Natural Language Generation (NLG):

Chatbot Architecture with NLU and NLG
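A minimal rule-based sketch illustrates the NLU + NLG split (real chatbots use trained intent classifiers and learned generation; the keyword rules and templates here are purely illustrative):

```python
import re

# NLU: map a user utterance to an intent (a trained classifier in practice).
INTENT_KEYWORDS = {
    "book_trip": ["book", "flight", "reserve"],
    "cancel": ["cancel", "refund"],
    "faq_pool": ["pool"],
}

def understand(utterance: str) -> str:
    words = re.findall(r"[a-z']+", utterance.lower())
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in words for k in keywords):
            return intent
    return "fallback"

# NLG: turn the detected intent into a response (templates stand in for generation).
TEMPLATES = {
    "book_trip": "Sure, where would you like to go?",
    "cancel": "I can help with that cancellation.",
    "faq_pool": "Yes, your hotel has a pool.",
    "fallback": "Let me connect you to an agent.",
}

def respond(utterance: str) -> str:
    return TEMPLATES[understand(utterance)]

print(respond("Does my hotel have a pool?"))
```

Training replaces the keyword table with a learned model; scoring is the `respond` path, executed for every incoming message in the dialog.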

Like any machine learning/deep learning application, a chatbot requires model training (= teaching a chatbot) and model scoring (=applying the chatbot in a dialog with a human). Therefore, building a chatbot is a machine learning problem with related tools, APIs, and cloud services. How does event streaming with Kafka fit into this story?

Machine Learning / NLP and Kafka – An Impedance Mismatch

I wrote about Machine Learning and Kafka a lot in the past. TL;DR: Machine Learning requires data integration and processing at scale, and model predictions often require reliable and robust real-time applications. That’s where Kafka and its ecosystem fit into the story:

Kafka Machine Learning Architecture for Java Python Kafka Connect

For more details, check out my past articles on Kafka and machine learning. They describe different workarounds to solve the impedance mismatch. A few examples:

  • Some teams let data scientists deploy Python in Docker containers in production and integrate these containers with other applications and programming platforms like Java, Go, or .NET / C++.
  • A few projects use Faust as a Kafka-native streaming Python library (with several limitations compared to Kafka Streams or ksqlDB).
  • Embedding NLP models (trained with any machine learning framework using any programming language, including Python) into a native Kafka application for robust model scoring is a well-known option.
  • Last but not least, many model servers added Kafka-native streaming interfaces directly using the Kafka protocol as an alternative to RPC communication via HTTP or gRPC.

Let’s now look at a few real-world examples for machine learning, NLP, chatbots, and the Kafka ecosystem in companies such as BMW, Expedia, and Tinder.

BMW – Kafka as Orchestration Layer for NLP and Chatbots

The automotive company BMW presented innovative NLP services at Kafka Summit as early as 2019. It is no surprise that a carmaker has various NLP scenarios. These include digital contract intelligence, workplace assistant, machine translation, and customer conversations. The latter contains multiple use cases for conversational AI:
  • Service desk automation
  • Speech analysis of customer interaction center (CIC) calls to improve the quality
  • Self-service using smart knowledge bases
  • Agent support
  • Chatbots
The text and speech data is structured, enriched, contextualized, summarized, and translated to build real-time decision support applications. Kafka is a crucial component of BMW’s ML and NLP architecture. The real-time integration and data correlation enable interactive and interoperable data consumption and usage:
NLP Service Framework Based on Kafka at BMW
BMW explained the key advantages of leveraging Kafka and its streaming processing library Kafka Streams as the real-time integration and orchestration platform:
  • Flexible integration: Multiple supported interfaces for different deployment scenarios, including various machine learning technologies, programming languages, and cloud providers
  • Modular end-to-end pipelines: Services can be connected to provide full-fledged NLP applications.
  • Configurability: High agility for each deployment scenario

Expedia – Conversations Platform powered by Cloud-native Kafka

Expedia is a leading online travel and booking platform. They have many use cases for machine learning. One of my favorite examples is their Conversations Platform, built on Kafka and Confluent Cloud to provide an elastic cloud-native application.

The goal of Expedia’s Conversations Platform was simple: Enable millions of travelers to have natural language conversations with an automated agent via text, Facebook, or their channel of choice. Let them book trips, make changes or cancellations, and ask questions:

  • “How long is my layover?”
  • “Does my hotel have a pool?”
  • “How much will I get charged if I want to bring my golf clubs?”

Then take all that is known about that customer across all of Expedia’s brands and apply machine learning models to give customers what they are looking for immediately in real-time and automatically, whether a straightforward answer or a complex new itinerary.

Real-time Orchestration realized in four Months

Such a platform is no place for batch jobs, back-end processing, or offline APIs. To quickly make decisions that incorporate contextual information, the platform needs data in near real-time, and it needs it from a wide range of services and systems. Meeting these needs meant architecting the Conversations Platform around a central nervous system based on Confluent Cloud and Apache Kafka. Kafka made it possible to orchestrate data from loosely coupled systems, enrich data as it flows between them so that by the time it reaches its destination, it is ready to be acted upon, and surface aggregated data for analytics and reporting.

Expedia built this platform from zero to production in four months. That’s the tremendous advantage of using a fully managed serverless event streaming platform as the foundation. The project team can focus on the business logic.

The Covid pandemic proved the idea of an elastic platform: Companies were hit with a tidal wave of customer questions, cancellations, and re-bookings. Throughout this once-in-a-lifetime event, the Conversations Platform proved up to the challenge, auto-scaling as necessary and taking much of the load off live agents.

Expedia’s Migration from MQ to Kafka as Foundation for Real-time Machine Learning and Chatbots

As part of their Conversations Platform, Expedia needed to modernize their IT infrastructure, as Ravi Vankamamidi, Director of Technology at Expedia Group, explained in a Kafka Summit keynote.

Expedia’s legacy chatbot service relied on an outdated messaging system. It was a question-and-answer board with very limited scope for booking scenarios and could only handle two-party conversations. It could not scale to bring all the different systems into one architecture and build a powerful chatbot that is helpful for customer conversations.

I explored several times that event streaming is more than just a (scalable) message queue. Check out my old (but still accurate and relevant) Comparison between MQ and Kafka, or the newer comparison between cloud-native iPaaS and Kafka.

Expedia needed a service that was closer to travel assistance. It needed to handle context-specific, multi-party, multi-channel conversations. Hence, features such as natural language processing, translation, and real-time analytics are required. The full service needs to be scalable across multiple brands. Therefore, a fast and highly scalable platform with order guarantees, exactly-once-semantics (EOS), and real-time data processing were needed.
The Kafka-native event streaming platform powered by Confluent was the best choice and met all requirements. One year after the rollout, the new Conversations Platform had doubled the Net Promoter Score (NPS), quickly proving its business value.

Tinder – Content Moderation with Kafka Streams and ML

The dating app Tinder is a great example for which I can think of dozens of NLP use cases. Tinder talked at a past Kafka Summit about their Kafka-powered machine learning platform.

Tinder is a massive user of Kafka and its ecosystem for various use cases, including content moderation, matching, recommendations, reminders, and user reactivation. They used Kafka Streams as a Kafka-native stream processing engine for metadata processing and correlation in real-time at scale:

Impact of Apache Kafka at Tinder

A critical use case in any dating or social platform is content moderation for detecting fakes, filtering sexual content, and other inappropriate things. Content moderation combines NLP and text processing (e.g., for chat messages) with image processing (e.g., selfie uploads) or processes the metadata with Kafka and stores the linked content in a data lake. Both leverage Deep Learning to process high volumes of text and images. Here is what content moderation looks like in Tinder’s Kafka architecture:
Content Moderation at Tinder with Kafka and Machine Learning
Plenty of ways exist to process text, images, and videos with the Kafka ecosystem. I wrote a detailed article about handling large messages and files with Apache Kafka to explore the options and trade-offs.
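The moderation routing described above, with inline text checks and large media handled by reference, can be sketched as follows (the denylist and the `s3_key` field are illustrative, not Tinder's actual schema):

```python
BLOCKED_TERMS = {"spam-link.example", "send money"}  # illustrative denylist

def moderate(event):
    """Route an event: text is checked inline; large media goes to the data lake
    for asynchronous deep-learning analysis, only metadata stays in the stream."""
    if event["type"] == "chat":
        flagged = any(term in event["text"].lower() for term in BLOCKED_TERMS)
        return {"id": event["id"], "action": "block" if flagged else "pass"}
    if event["type"] == "image":
        # Metadata travels through Kafka; the payload stays in object storage.
        return {"id": event["id"], "action": "review", "ref": event["s3_key"]}

stream = [
    {"id": 1, "type": "chat", "text": "Hi, how are you?"},
    {"id": 2, "type": "chat", "text": "Click spam-link.example now"},
    {"id": 3, "type": "image", "s3_key": "selfies/3.jpg"},
]
decisions = [moderate(e) for e in stream]
print([d["action"] for d in decisions])
```

Keeping only references to large media in the stream is one of the trade-offs the linked article on handling large messages and files with Kafka explores in depth.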
Chatbots could also play a key role “the other way round”. More and more dating apps (and other social networks) fight against spam, fraud, and automated chatbots. Like building a chatbot, a chatbot detection system can analyze the data streams to block a dating app’s chatbot.
Let’s now explore how a developer can build a Kafka-native NLP application.

Telegram Bot API – ML in a Streaming App

Many project teams build their own chatbot or other NLP services. Unfortunately, this is a considerable effort and often not cost-efficient. A simplified and more cost-efficient alternative is integrating an NLP or chatbot API as a service. My colleague Robin Moffatt wrote a great post about building a Telegram-powered chatbot with Kafka and ksqlDB, where the chatbot API is integrated into the real-time Kafka application. This way, like in the Expedia example above, the NLP application integrates in real-time, in a truly decoupled fashion, with other applications in the enterprise architecture.
Telegram Chatbot integrated with ksqlDB
This example uses Telegram, a messaging platform similar in concept to WhatsApp or Facebook Messenger. Telegram offers a nice Bot API for integration via REST. While the example uses Telegram, the same approach works just fine with a bot on your platform of choice (Slack, etc.) or within a standalone application that looks up state populated and maintained from a stream of events in Kafka.
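To make the integration concrete: sending a reply through the Telegram Bot API is a single HTTP call to its `sendMessage` method. The sketch below only builds the request; a real application would POST it from inside the Kafka application (the token and chat id are placeholders):

```python
from urllib.parse import urlencode

TELEGRAM_API = "https://api.telegram.org"

def build_send_message(token: str, chat_id: int, text: str):
    """Build the URL and form payload for the Bot API's sendMessage method."""
    url = f"{TELEGRAM_API}/bot{token}/sendMessage"
    payload = {"chat_id": chat_id, "text": text}
    return url, urlencode(payload)

url, body = build_send_message("123:ABC", 42, "Your current balance is 50")
print(url)
print(body)
```

In the ksqlDB example, the text for this payload comes from a materialized view over the event stream, so the bot always answers with the latest state.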

Reddit Text Processing – NLP with Streaming Model Predictions

The drawback of the above Telegram example is the REST integration with the chatbot. Remote procedure calls (RPC) are still predominant in machine learning. However, RPC is an anti-pattern in the event streaming world, as it creates challenges concerning robustness, latency, scalability, and error handling. RPC integration is fine for many use cases, but with Kafka-native options available, there is often no need for RPC calls at all.

Kafka-native streaming machine learning is an alternative for model predictions. The two deployment options are embedding an analytic model into the Kafka application or using a model server that supports event streaming besides RPC (HTTP/gRPC) calls. I wrote a detailed article with the pros and cons of both approaches for model predictions using Kafka applications.

The following shows an example of a Kafka-native integration between a model server and other applications using the Kafka protocol:

Kafka-native Machine Learning Model Server Seldon

The Seldon model server is an example that already supports the Kafka interface. The Seldon team demoed how they train and deploy a machine learning model leveraging a scalable stream processing architecture for an automated text prediction use case. They use Scikit-learn (Sklearn) and SpaCy to train an ML model from the Reddit Content Moderation dataset and deploy that model using Seldon Core for real-time processing of text data from Kafka streams:

Seldon Model Server with Kafka Support using Python scikit-learn and SpaCy
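The Kafka-native model server pattern, with prediction requests on one topic and predictions on another instead of RPC, can be sketched with in-memory queues standing in for the two topics (the topic names and the toy model are illustrative, not Seldon's actual deployment):

```python
from collections import deque

request_topic = deque()    # stands in for e.g. a "text.requests" topic
response_topic = deque()   # stands in for e.g. a "text.predictions" topic

def toy_model(text: str) -> str:
    """Stand-in for a text-moderation model served by Seldon or similar."""
    return "toxic" if "idiot" in text.lower() else "ok"

# Producer side: applications publish prediction requests.
request_topic.append({"id": 7, "text": "you are an idiot"})
request_topic.append({"id": 8, "text": "nice post, thanks"})

# Model server side: consume requests, produce predictions.
while request_topic:
    req = request_topic.popleft()
    response_topic.append({"id": req["id"], "label": toy_model(req["text"])})

print([r["label"] for r in response_topic])
```

Unlike RPC, the topics buffer load spikes and decouple the model server's availability from the producers, which is exactly the robustness argument made above.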

Kafka-native NLP to build the next Conversational AI and Chatbot

Apache Kafka became the de facto standard for event streaming. One pillar of Kafka use cases is ML platforms, covering NLP-related concepts such as conversational AI, chatbots, and speech translation for improving and automating service desks, content moderation, and plenty of other use cases.

A Kafka-based orchestration and integration layer provides true decoupling in a scalable real-time platform. The benefits for ML platforms include back-pressure handling, pre-processing, and aggregations at any scale in real-time. Another benefit is the capability to connect different communication paradigms and technologies. The examples from BMW, Expedia, and Tinder showed what Kafka-based NLP infrastructure can look like.

How do you build conversational AI, chatbots, and other NLP applications? What technologies and architectures do you use? Are event streaming and Kafka part of the architecture? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.

The post Apache Kafka for Conversational AI, NLP and Chatbot appeared first on Kai Waehner.

]]>