Optimizing Gen AI Chatbots Using Data Analysis

It all started with a promise: “Let’s launch a generative AI chatbot—give it a few weeks, and it’ll be live for our users.”
And it was. At first, everything worked. Conversations flowed, responses made sense, and everyone was impressed. But then the silence came. Cut-off replies. Answers that sounded natural but meant nothing. Users dropped off after just two messages.
What felt like magic started to sound like noise.
That’s when I realized that building a chatbot isn’t the hard part; the hard part is listening to what it’s not saying. That’s where this project began: using data, language, and analysis to understand, debug, and improve what hides in the quiet gaps of an AI conversation.
The analysis focused on three key areas:
- Funnel performance and user behavior.
- Qualitative issues like hallucinations or interruptions.
- Message and session-level metrics for continuous monitoring.
Why do we feel the need to optimize?
While generative models can simulate natural conversation, they are not perfect. Without regular monitoring, issues like hallucinated responses, API interruptions or incomplete user journeys can quickly degrade the user experience and reduce engagement. To address this, we needed a way to detect these issues early and generate actionable insights.
Interactions Data
session_id | Time | Speaker | Message |
---|---|---|---|
34657XXXXXX | 09:00 | AI | 👋 Hello! Welcome, here you can purchase your private dental insurance. Would you like to learn about our plans, or do you already have one and need help? |
34657XXXXXX | 09:01 | User | Hi, I’d like to know what plans you offer. |
34657XXXXXX | 09:01 | AI | Sure 😊 We have 3 main plans: 1. Basic Plan – $9.99/month (…) 2. Plus Plan – $19.99/month (…) 3. Premium Plan – $29.99/month (…) Would you like more details about one of them? |
34657XXXXXX | 09:03 | User | I’m interested in the Plus Plan. Does it have a minimum term? |
34657XXXXXX | 09:03 | AI | Great choice 👍 The Plus Plan has no minimum term. You can cancel anytime without penalty. Coverage starts 48 hours after subscription. Would you like to sign up now? |
34657XXXXXX | 09:04 | User | Yes, please. |
Notes on the session-level fields used in the next section: sale = 1 means a successful sale; total_user_msgs is the number of messages sent by the human user.
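As a minimal sketch, the session-level fields described in the note above could be derived from message rows like the sample conversation. It assumes HH:MM timestamps that stay within a single day, and the function name `session_metrics` is hypothetical rather than part of the original pipeline.

```python
from datetime import datetime

# Message-level rows mirroring the sample conversation: (session_id, time, speaker).
# Message text is left out because these two metrics do not need it.
messages = [
    ("34657XXXXXX", "09:00", "AI"),
    ("34657XXXXXX", "09:01", "User"),
    ("34657XXXXXX", "09:01", "AI"),
    ("34657XXXXXX", "09:03", "User"),
    ("34657XXXXXX", "09:03", "AI"),
    ("34657XXXXXX", "09:04", "User"),
]


def session_metrics(rows):
    """Aggregate message rows into one metrics dict per session_id."""
    sessions = {}
    for session_id, time_str, speaker in rows:
        ts = datetime.strptime(time_str, "%H:%M")
        s = sessions.setdefault(
            session_id, {"first": ts, "last": ts, "total_user_msgs": 0}
        )
        # Track the earliest and latest timestamps seen for the session.
        s["first"] = min(s["first"], ts)
        s["last"] = max(s["last"], ts)
        # Only messages from the human user count toward total_user_msgs.
        if speaker == "User":
            s["total_user_msgs"] += 1
    return {
        sid: {
            "chat_duration_minutes": (s["last"] - s["first"]).total_seconds() / 60,
            "total_user_msgs": s["total_user_msgs"],
        }
        for sid, s in sessions.items()
    }


metrics = session_metrics(messages)
print(metrics)
```

For the sample conversation this yields a 4-minute session with 3 user messages. A production version would need full timestamps to handle sessions that cross midnight.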
Funnel Metrics: Understanding the User Journey
One of the first steps was analyzing how users interact with the chatbot from the beginning of the session to the end. Some key metrics:
session_id | sale | chat_duration_minutes | total_user_msgs |
---|---|---|---|
3460XXXXXX | 0 | 43.1 | 14 |
3463XXXXXX | 1 | 16.4 | 12 |
3460XXXXXX | 0 | 90.8 | 22 |
3470XXXXXX | 0 | 1.6 | 3 |
3468XXXXXX | 0 | 91.5 | 23 |
3464XXXXXX | 1 | 10.5 | 15 |
3461XXXXXX | 1 | 13.3 | 16 |
3467XXXXXX | 0 | 0.9 | 5 |
3471XXXXXX | 1 | 10.2 | 18 |
3465XXXXXX | 0 | 2.6 | 4 |
Analyzing these sessions reveals clear patterns. Successful sales (sale = 1) averaged 12.6 minutes and 15.25 user messages, showing efficient, engaged chats. Non-sales averaged 38.4 minutes and about 11.8 user messages, indicating that longer conversations do not equal success. In fact, the two longest sessions (over 90 minutes) didn’t lead to a sale, likely reflecting user frustration or ineffective flows. Very short sessions (under 3 minutes and at most 5 messages in this sample) also ended without a sale, suggesting drop-offs or poor routing.
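The per-outcome averages can be recomputed directly from the ten sessions in the table. The snippet below is an illustrative sketch using only the Python standard library; the helper name `summarize` is an assumption, not the original analysis code.

```python
from statistics import mean

# (session_id, sale, chat_duration_minutes, total_user_msgs), copied from the table.
sessions = [
    ("3460XXXXXX", 0, 43.1, 14),
    ("3463XXXXXX", 1, 16.4, 12),
    ("3460XXXXXX", 0, 90.8, 22),
    ("3470XXXXXX", 0, 1.6, 3),
    ("3468XXXXXX", 0, 91.5, 23),
    ("3464XXXXXX", 1, 10.5, 15),
    ("3461XXXXXX", 1, 13.3, 16),
    ("3467XXXXXX", 0, 0.9, 5),
    ("3471XXXXXX", 1, 10.2, 18),
    ("3465XXXXXX", 0, 2.6, 4),
]


def summarize(rows, sale_flag):
    """Average duration and user messages for sessions with the given sale flag."""
    subset = [r for r in rows if r[1] == sale_flag]
    return {
        "sessions": len(subset),
        "avg_minutes": round(mean(r[2] for r in subset), 2),
        "avg_user_msgs": round(mean(r[3] for r in subset), 2),
    }


sales = summarize(sessions, 1)
non_sales = summarize(sessions, 0)
print("sales:", sales)
print("non-sales:", non_sales)
```

Splitting by the `sale` flag before averaging keeps the two groups comparable, which matters here because a few very long non-sale sessions dominate the overall mean.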
🧮 Daily Monitoring Overview
Date | Users | Sales | Conversion Rate (%) | Avg. Conversation Time (min) | Avg. Messages per User | Min Msgs | Max Msgs |
---|---|---|---|---|---|---|---|
01/07/2025 | 45 | 10 | 22.2 | 8.7 | 9.3 | 1 | 21 |
02/07/2025 | 53 | 9 | 17.0 | 15.39 | 8.0 | 1 | 19 |
03/07/2025 | 2 | 0 | 0.0 | 2.7 | 9.0 | 7 | 11 |
04/07/2025 | 26 | 6 | 23.1 | 10.11 | 11.8 | 1 | 24 |
05/07/2025 | 53 | 6 | 11.3 | 19.36 | 10.8 | 1 | 22 |
06/07/2025 | 41 | 8 | 19.5 | 16.15 | 11.9 | 1 | 30 |
July 4th stands out with the highest conversion rate (23.1%) despite only 26 users. Sessions were well balanced: users sent nearly 12 messages per chat, but conversations weren’t overly long (about 10 minutes on average), indicating a smooth and effective flow. In contrast, July 5th had the same number of users as July 2nd (53) but showed a much lower conversion rate (11.3%) and longer chats (over 19 minutes), signaling inefficiencies or friction preventing users from completing the funnel.
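The conversion rates in the daily table follow from sales divided by users. A small sketch of that calculation is shown here, assuming dates are written day/month/year as in the table; `conversion_rate` is a hypothetical helper name.

```python
# (date, users, sales) from the daily monitoring table; dates are day/month/year.
daily = [
    ("01/07/2025", 45, 10),
    ("02/07/2025", 53, 9),
    ("03/07/2025", 2, 0),
    ("04/07/2025", 26, 6),
    ("05/07/2025", 53, 6),
    ("06/07/2025", 41, 8),
]


def conversion_rate(users, sales):
    """Sales as a percentage of users, rounded to one decimal; 0.0 if no users."""
    return round(100 * sales / users, 1) if users else 0.0


rates = {date: conversion_rate(users, sales) for date, users, sales in daily}
best_day = max(rates, key=rates.get)

for date, rate in rates.items():
    print(f"{date}: {rate}%")
print("highest conversion:", best_day)
```

With so few users on some days, a single extra sale moves the rate by several points, so day-to-day differences are best read alongside the user counts.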
Analytics also revealed technical glitches: July 3rd had only 2 users, suggesting a likely traffic or API failure rather than low demand. And when a session hits the maximum of 30 messages without converting (as on July 6th), it’s a clear signal of a broken flow or a user frustration loop.
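Both checks can be automated with simple rules. The sketch below is one possible approach, not the monitoring actually used: the 25% threshold is an assumed value, treating 30 messages as a hard cap is an assumption about the text, and the example session rows are hypothetical.

```python
from statistics import median

MAX_MSGS = 30             # treated here as a per-session message cap (assumption)
LOW_TRAFFIC_RATIO = 0.25  # assumed threshold: below 25% of the median daily users

# Daily user counts from the monitoring table; dates are day/month/year.
daily_users = {
    "01/07/2025": 45,
    "02/07/2025": 53,
    "03/07/2025": 2,
    "04/07/2025": 26,
    "05/07/2025": 53,
    "06/07/2025": 41,
}


def low_traffic_days(users_by_day, ratio=LOW_TRAFFIC_RATIO):
    """Days whose user count falls below a fraction of the median day."""
    typical = median(users_by_day.values())
    return [day for day, users in users_by_day.items() if users < ratio * typical]


def stuck_sessions(sessions, cap=MAX_MSGS):
    """IDs of sessions that reached the message cap without a sale."""
    return [
        s["session_id"]
        for s in sessions
        if s["total_user_msgs"] >= cap and s["sale"] == 0
    ]


# Hypothetical session rows for illustration only; not taken from the article's data.
example_sessions = [
    {"session_id": "A", "sale": 0, "total_user_msgs": 30},
    {"session_id": "B", "sale": 1, "total_user_msgs": 30},
    {"session_id": "C", "sale": 0, "total_user_msgs": 12},
]

flagged_days = low_traffic_days(daily_users)
flagged_sessions = stuck_sessions(example_sessions)
print("low-traffic days:", flagged_days)
print("stuck sessions:", flagged_sessions)
```

Using the median rather than the mean as the baseline keeps one outage day from dragging down the threshold it is compared against.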
By tracking message volume, session time, and conversions daily, you can catch these issues early, test hypotheses, and continuously improve your chatbot’s ability to not just talk, but sell.