Optimizing Gen AI Chatbots Using Data Analysis

It all started with a promise: “Let’s launch a generative AI chatbot—give it a few weeks, and it’ll be live for our users.”

And it was. At first, everything worked. Conversations flowed, responses made sense, and everyone was impressed. But then the silence came. Cut-off replies. Answers that sounded natural but meant nothing. Users dropped off after just two messages.

What felt like magic started to sound like noise.

That’s when I realized that building a chatbot isn’t the hard part; the hard part is listening to what it’s not saying. That’s where this project began: using data, language, and analysis to understand, debug, and improve what hides in the quiet gaps of an AI conversation.

The analysis focused on three key areas:

Funnel performance and user behavior.
Qualitative issues like hallucinations or interruptions.
Message and session-level metrics for continuous monitoring.

Why do we feel the need to optimize?

While generative models can simulate natural conversation, they are not perfect. Without regular monitoring, issues like hallucinated responses, API interruptions or incomplete user journeys can quickly degrade the user experience and reduce engagement. To address this, we needed a way to detect these issues early and generate actionable insights.

Interaction Data

session_id	Time	Speaker	Message
34657XXXXXX	09:00	AI	👋 Hello! Welcome, here you can purchase your private dental insurance. Would you like to learn about our plans, or do you already have one and need help?
34657XXXXXX	09:01	User	Hi, I’d like to know what plans you offer.
34657XXXXXX	09:01	AI	Sure 😊 We have 3 main plans: 1. Basic Plan – $9.99/month (…) 2. Plus Plan – $19.99/month (…) 3. Premium Plan – $29.99/month (…) Would you like more details about one of them?
34657XXXXXX	09:03	User	I’m interested in the Plus Plan. Does it have a minimum term?
34657XXXXXX	09:03	AI	Great choice 👍 The Plus Plan has no minimum term. You can cancel anytime without penalty. Coverage starts 48 hours after subscription. Would you like to sign up now?
34657XXXXXX	09:04	User	Yes, please.

*Notes for the session-level table below: sale = 1 means a successful sale; total_user_msgs is the number of messages sent by the human user.
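To make the link between the raw message log and the session-level metrics concrete, here is a minimal sketch of how chat duration and user message counts could be derived. It is not the pipeline used in the project: the function name `session_metrics` is hypothetical, the message texts are abbreviated from the sample conversation, and it assumes timestamps are `HH:MM` strings within a single day.

```python
from datetime import datetime

# Message log rows: (session_id, time "HH:MM", speaker, message).
# Texts are abbreviated from the sample conversation above.
messages = [
    ("34657XXXXXX", "09:00", "AI", "Hello! Welcome..."),
    ("34657XXXXXX", "09:01", "User", "Hi, I'd like to know what plans you offer."),
    ("34657XXXXXX", "09:01", "AI", "Sure, we have 3 main plans..."),
    ("34657XXXXXX", "09:03", "User", "I'm interested in the Plus Plan..."),
    ("34657XXXXXX", "09:03", "AI", "Great choice..."),
    ("34657XXXXXX", "09:04", "User", "Yes, please."),
]


def session_metrics(rows):
    """Aggregate message rows into one metrics dict per session.

    Assumption: all messages of a session fall on the same day, so
    comparing HH:MM values is enough to get the duration.
    """
    sessions = {}
    for session_id, time_str, speaker, _text in rows:
        ts = datetime.strptime(time_str, "%H:%M")
        s = sessions.setdefault(
            session_id, {"first": ts, "last": ts, "total_user_msgs": 0}
        )
        s["first"] = min(s["first"], ts)
        s["last"] = max(s["last"], ts)
        if speaker == "User":
            s["total_user_msgs"] += 1
    return {
        sid: {
            "chat_duration_minutes": (s["last"] - s["first"]).total_seconds() / 60,
            "total_user_msgs": s["total_user_msgs"],
        }
        for sid, s in sessions.items()
    }


metrics = session_metrics(messages)
print(metrics)
```

With full timestamps (date and seconds) the same aggregation would give finer durations; minute resolution is only what the sample log shows.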

Funnel Metrics: Understanding the User Journey

One of the first steps was analyzing how users interact with the chatbot from the beginning of the session to the end. Some key metrics:

session_id	sale	chat_duration_minutes	total_user_msgs
3460XXXXXX	0	43.1	14
3463XXXXXX	1	16.4	12
3460XXXXXX	0	90.8	22
3470XXXXXX	0	1.6	3
3468XXXXXX	0	91.5	23
3464XXXXXX	1	10.5	15
3461XXXXXX	1	13.3	16
3467XXXXXX	0	0.9	5
3471XXXXXX	1	10.2	18
3465XXXXXX	0	2.6	4

Analyzing these sessions reveals clear patterns. Successful sales (sale = 1) averaged 12.6 minutes and 15.25 user messages, which points to efficient, engaged chats. Non-sales averaged about 38.4 minutes and 11.8 user messages, so longer conversations do not equal success. In fact, the two longest sessions (over 90 minutes) ended without a sale, likely reflecting user frustration or ineffective flows. At the other extreme, the three shortest sessions (under 3 minutes, with 3 to 5 user messages) also ended without a sale, suggesting drop-offs or poor routing.
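The comparison between sales and non-sales can be recomputed directly from the ten rows in the table. The sketch below uses only the Python standard library; the function name `averages_by_outcome` is hypothetical, and the rows are copied from the table above.

```python
from statistics import mean

# (sale, chat_duration_minutes, total_user_msgs), copied from the table above.
sessions = [
    (0, 43.1, 14), (1, 16.4, 12), (0, 90.8, 22), (0, 1.6, 3), (0, 91.5, 23),
    (1, 10.5, 15), (1, 13.3, 16), (0, 0.9, 5), (1, 10.2, 18), (0, 2.6, 4),
]


def averages_by_outcome(rows):
    """Average duration and user message count, split by sale flag (0 or 1)."""
    result = {}
    for flag in (0, 1):
        group = [r for r in rows if r[0] == flag]
        result[flag] = {
            "sessions": len(group),
            "avg_minutes": round(mean(r[1] for r in group), 2),
            "avg_user_msgs": round(mean(r[2] for r in group), 2),
        }
    return result


summary = averages_by_outcome(sessions)
for flag, stats in summary.items():
    print("sale =", flag, stats)
```

With only ten sessions, these averages are easily skewed by a single 90-minute chat, so on a larger dataset it may be worth looking at medians as well.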

🧮 Daily Monitoring Overview

Date	Users	Sales	Conversion Rate (%)	Avg. Conversation Time (min)	Avg. Messages per User	Min Msgs	Max Msgs
01/07/2025	45	10	22.2	8.7	9.3	1	21
02/07/2025	53	9	17.0	15.39	8.0	1	19
03/07/2025	2	0	0.0	2.7	9.0	7	11
04/07/2025	26	6	23.1	10.11	11.8	1	24
05/07/2025	53	6	11.3	19.36	10.8	1	22
06/07/2025	41	8	19.5	16.15	11.9	1	30

July 4th stands out with the highest conversion rate (23.1%) despite only 26 users. Sessions were well-balanced—users sent nearly 12 messages per chat, but conversations weren’t overly long, indicating a smooth and effective flow. In contrast, July 5th had the same number of users as July 2nd (53) but showed a much lower conversion rate (11.3%) and longer chats (over 19 minutes), signaling inefficiencies or friction preventing users from completing the funnel.
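The conversion rate column is simply sales divided by users, expressed as a percentage. As a small sketch (the helper name `conversion_rate` is hypothetical, and the counts are copied from the daily table), it can be recomputed and ranked like this:

```python
# (date, users, sales), copied from the daily table above (dates are DD/MM/YYYY).
daily = [
    ("01/07/2025", 45, 10),
    ("02/07/2025", 53, 9),
    ("03/07/2025", 2, 0),
    ("04/07/2025", 26, 6),
    ("05/07/2025", 53, 6),
    ("06/07/2025", 41, 8),
]


def conversion_rate(users, sales):
    """Percentage of users who completed a sale; 0.0 when there were no users."""
    return round(100 * sales / users, 1) if users else 0.0


rates = {date: conversion_rate(users, sales) for date, users, sales in daily}
best_day = max(rates, key=rates.get)

for date, rate in rates.items():
    print(date, rate)
print("highest conversion:", best_day)
```

Guarding against zero users matters in practice: a day with an outage may log no sessions at all, and the report should still run.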

Analytics also revealed technical glitches: July 3rd had only 2 users, suggesting a likely traffic or API failure rather than low demand. And when a session reaches 30 messages without converting (30 was the highest count observed, on July 6th), it’s a strong signal of a broken flow or a user frustration loop.
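Both kinds of signal can be turned into simple automated checks. The sketch below is one possible approach, not the monitoring used in the project: the function names and both thresholds (25% of the median daily users, 20 user messages) are assumptions chosen for illustration and would need tuning on real traffic. The session rows are a subset of the session-level table from the funnel section.

```python
from statistics import median

# Daily user counts, copied from the daily table above (dates are DD/MM/YYYY).
daily_users = {
    "01/07/2025": 45, "02/07/2025": 53, "03/07/2025": 2,
    "04/07/2025": 26, "05/07/2025": 53, "06/07/2025": 41,
}

# A few rows from the session-level table in the funnel section.
sessions = [
    {"sale": 0, "total_user_msgs": 14},
    {"sale": 1, "total_user_msgs": 12},
    {"sale": 0, "total_user_msgs": 22},
    {"sale": 0, "total_user_msgs": 23},
    {"sale": 1, "total_user_msgs": 18},
]


def low_traffic_days(users_by_day, fraction=0.25):
    """Days whose user count is below `fraction` of the median day.

    The 0.25 default is an illustrative assumption, not a tuned value.
    """
    baseline = median(users_by_day.values())
    return [day for day, users in users_by_day.items() if users < fraction * baseline]


def stuck_sessions(rows, msg_threshold=20):
    """Sessions with at least `msg_threshold` user messages and no sale.

    The threshold of 20 is an illustrative assumption.
    """
    return [r for r in rows if r["sale"] == 0 and r["total_user_msgs"] >= msg_threshold]


flagged_days = low_traffic_days(daily_users)
flagged_sessions = stuck_sessions(sessions)
print("possible outage days:", flagged_days)
print("possible frustration loops:", flagged_sessions)
```

Using the median rather than the mean as the baseline keeps a single outage day from dragging the threshold down with it.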

By tracking message volume, session time, and conversions daily, you can catch these issues early, test hypotheses, and continuously improve your chatbot’s ability to not just talk, but sell.